Choosing An XMPP Server

26Aug08

Choosing an XMPP server is a big decision.  Should you go with the popular one or the one written in the most popular language?  Perhaps you don’t plan to become a systems administrator and you need one which is easy to set up and maintain.  Unfortunately for people making this important choice, there is not much guidance published beyond feature comparisons.  What follows is an account of our decision-making process on XMPP servers – how we came to pick jabberd2 originally, and how we switched to ejabberd.

A Brief History Of XMPP Servers

XMPP began as Jabber and had only one server, jabberd.  As popularity of the protocol grew, more servers appeared, and now there are half a dozen major contenders, both commercial and open source.

The main players these days seem to be:

  • jabberd – The original server.  It started in C, but now appears to be a mix of C and C++ code.  The main user for years was jabber.org itself.  The code may have changed substantially in the last couple of years, but I remember it being rather crufty.  Matthias Wimmer has been maintainer at least since 2004, and he continues to maintain it today, although it does go through some long periods of inactivity.
  • jabberd 2.x – A rewrite of jabberd, originally by Rob Norris.  The code is pure C, modular, and fairly easy to understand.  Chesspark picked this to start with in 2006.  Another notable user of jabberd 2.x is Meebo.  Unfortunately it was abandoned by its original maintainers some time ago.  Tomas Sterna stepped up to the plate and took over the project, and it is now actively maintained again.
  • ejabberd – An XMPP server written in Erlang which claims to be quite scalable.  Erlang is the language created decades ago by Ericsson to power their telephone switches.  It has many features that make it well suited for XMPP servers.  ejabberd has been around and active since early 2005, and is supported officially by ProcessOne.  It also has a growing developer community.  Jabber.org switched from jabberd to ejabberd some time ago, and continues to run ejabberd today.
  • Openfire – A Java server written by Jive Software.  This was formerly a commercial product called Wildfire.  I have no personal experience with it, but it appears to have an active community.
  • Tigase – Another Java server that started in late 2004.  It is actively maintained.  I have no personal experience with Tigase other than meeting Artur at the recent XMPP Summit, but the Seesmic folks speak very highly of the project.

This list is incomplete.  Notable omissions include Google Talk (not publicly available), Jabber XCP (the commercial offering from Jabber.com), and djabberd (Danga’s jabber server written in Perl).

Chesspark’s Initial Decision

Chesspark chose jabberd2 as its XMPP server about 3 years ago.  I recall being impressed with the clean and modular code base as well as the ability to change its SQL queries right in the configuration file.  It also supported PostgreSQL, which was the RDBMS we preferred.  I don’t think that Tigase or ejabberd were considered; they were likely too young at the time.  Jabberd was the only other real choice, but we were not impressed with the code.

One major factor which influenced our decision was code readability and maintainability.  We wanted an XMPP server that we could patch ourselves if needed, and we didn’t want to be stuck in case the project was abandoned down the road.  This turned out to be a wise decision – the jabberd 2.x project was unmaintained for a long period while we used it.  Over the last few years we’ve made patches and assisted others with patches as best we can.

Jabberd 2.x Disappointment

Over time, Chesspark’s user base got larger as the site became more popular, and jabberd 2.x’s warts began to show.  Here are the main ones in the approximate order we discovered them.

  • Database transactions abused.  Jabberd 2.x does no queries out of the box that require the use of database transactions.  By default it is configured to do every query in a transaction, even if it is a simple SELECT.  This is a common problem with many libraries we’ve used at Chesspark.  Thankfully, jabberd 2.x can be configured to turn this off.  Normally this would not affect anything, but the small amount of overhead caused can add up fast when jabberd 2.x does lots of queries, which it certainly does.
  • Memory leaks. There are several memory leaks that persist to this day.  Even with a small userbase like Chesspark has, this forces us to restart the server about once a week.  As memory usage climbs, the server latency gets higher.  Our attempts to find these leaks have been unsuccessful to date.
  • Non-blocking design inconsistent. The server uses a non-blocking design common to scalable daemons; this is great.  Unfortunately, all database calls use the blocking database API and a single database connection.  This means that even with light load, packet latencies can be quite high if the database isn’t ridiculously fast.
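To make the last point concrete, here is a minimal Python sketch of the same pattern.  It is not jabberd 2.x code: the handler, the 20 queued packets, and the 10 ms query cost are all assumptions chosen for illustration.  It only shows why one blocking call inside an otherwise non-blocking event loop makes every later packet wait its turn.

```python
import asyncio
import time

DB_DELAY = 0.01  # assumed cost of one database query, in seconds


def blocking_query():
    """Stand-in for a blocking database call: the whole event loop stalls."""
    time.sleep(DB_DELAY)


async def nonblocking_query():
    """Stand-in for a non-blocking call: other tasks run while it waits."""
    await asyncio.sleep(DB_DELAY)


async def handle_packet(blocking):
    """Hypothetical packet handler that needs one query per packet."""
    if blocking:
        blocking_query()
    else:
        await nonblocking_query()


async def probe_latency(blocking, background_packets=20):
    """Queue background packets, then time one more packet behind them."""
    start = time.monotonic()
    load = [asyncio.create_task(handle_packet(blocking))
            for _ in range(background_packets)]
    probe = asyncio.create_task(handle_packet(blocking))
    await probe  # the probe runs only after the packets queued before it
    elapsed = time.monotonic() - start
    await asyncio.gather(*load)
    return elapsed


if __name__ == "__main__":
    print("blocking:     %.3f s" % asyncio.run(probe_latency(True)))
    print("non-blocking: %.3f s" % asyncio.run(probe_latency(False)))
```

With these assumed numbers, the blocking probe has to wait for every query queued ahead of it (roughly 21 times DB_DELAY), while the non-blocking probe stays close to a single DB_DELAY.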

For us, the show stopper was latency.  Games depend on near real-time performance, and latency destroys the user experience.  We generated test load which logged in a few hundred users, did roster operations, logged out, and repeated.  We then measured latency of chess moves on the same server.  We were shocked when we saw the numbers – over 3 seconds of lag between an IQ query and its response.
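For anyone who wants to take a similar measurement, the bookkeeping is simple, because an IQ response carries the same id as the query it answers.  The sketch below is not our actual test harness; the class and method names are hypothetical, and the XMPP connection that would call them is left out.

```python
import time


class IqLatencyTimer:
    """Match IQ responses to their queries by id and record round trips."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._pending = {}  # stanza id -> time the query was sent
        self.samples = []   # completed round-trip times, in seconds

    def sent(self, stanza_id):
        """Call when an IQ query with this id goes out."""
        self._pending[stanza_id] = self._clock()

    def received(self, stanza_id):
        """Call when a response arrives; returns its latency, or None."""
        started = self._pending.pop(stanza_id, None)
        if started is None:
            return None  # unknown id or duplicate response
        latency = self._clock() - started
        self.samples.append(latency)
        return latency

    def worst(self):
        """Largest round trip seen so far, or None if there are no samples."""
        return max(self.samples) if self.samples else None
```

A client would call sent() just before writing each query and received() from its stanza handler.  Running this against an idle server first gives a baseline to compare with the numbers under load.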

All of these things could theoretically be fixed, and I hope that they are fixed eventually.  Diversity in server choices is a feature of XMPP – the more the merrier.

Finding A New Server

Once we decided to abandon jabberd2, we needed to find a new server.

Feature-wise, all the current servers support the stack we need – authentication with TLS and SASL, ability to use PostgreSQL as a backend, private XML storage, external components, and privacy lists.  Years ago, some of these features were hard to come by, but today they are common.  ejabberd, Openfire, and Tigase have pubsub and BOSH support as well, neither of which was available in any open source server when we started.

We knew right off the bat that we didn’t want to be writing C.  While we have a lot of C experience, we like to reserve C for the few times it is actually needed and spend the rest of our time in more productive and higher level languages.  This removed jabberd from the list.

From here the language choice is Erlang or Java.  Erlang is a dynamically typed, functional language – quite a radical departure from the norm for most C and Java hackers.  We work a lot in Python, so Erlang was the closer fit.  Many people make the decision to work in Java, and from there they will need to pick between Tigase and Openfire.

One thing to note is that some people seem scared away by the Erlang language.  Don’t be one of these people.  Erlang is well documented and pretty easy to learn.  We knew nothing about Erlang a few months ago.  That did not slow us down too much when we needed to write ejabberd modules or make changes in ejabberd internals.  Even without knowing Erlang, we were able to write extensions to ejabberd much faster than for jabberd2.

The last part of our decision was to test server latency with ejabberd.  We ran the same test that we ran on jabberd2, and ejabberd didn’t flinch.  The measured latency at idle was half that of jabberd2 in some cases, and there was very little change even as we pounded the database to levels that would have made jabberd2 cry.

Life With Ejabberd

ejabberd is not perfect; no server is.  Here’s a list of our current gripes:

  • Memory hog – Erlang uses a lot of memory for basic string handling since a string is represented as a list of integers.  There also seems to be a bug in TLS that causes TLS connections to use quite a bit more memory than non-TLS connections.  These add up to quite a bit of memory usage.  For Chesspark, we use over a gig of RAM for a few hundred connections.  Jabber.org uses about 2.7GB of RAM for its 10k+ connections.  I’m not sure what accounts for the discrepancy between these numbers; we are still looking. I expect the TLS memory issue to be solved soon, and the ProcessOne folks told me that they were going to switch the string handling to use Erlang binaries, which are more memory efficient.
  • Lots of database queries – As with jabberd2, ejabberd does an enormous number of database queries.  With mod_privacy enabled, two roster-related queries are done for every packet sent.  ejabberd also uses the database inappropriately, with idioms like SELECT, DELETE, INSERT which can lead to race conditions.  Thankfully, this does not seem to be a big problem with a correctly tuned database, as ejabberd doesn’t block on the queries.
  • Lack of comments in the code – The code is often quite clear, but comments would be helpful.  There is some basic, but very helpful, developer documentation, but the code contains virtually no comments.  Luckily, this is not as bad as it seems, because many of the idioms in the server are really Erlang and OTP idioms, so reading up a bit on Erlang and OTP answers a lot of questions.  I’m also not sure other servers are better.  Jabberd2 probably had more comments, but it also had less documentation in general; I’ve found working with ejabberd to be easier.
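The race in the SELECT, DELETE, INSERT idiom is easy to reproduce.  The sketch below uses Python’s built-in sqlite3 module with made-up table names (this is not ejabberd’s schema), and it replays the two writers’ steps in interleaved order on one connection so the outcome is deterministic.  The single-statement alternative shown is SQLite’s INSERT OR REPLACE; PostgreSQL only gained a comparable INSERT ... ON CONFLICT in version 9.5.

```python
import sqlite3


def interleaved_delete_insert(conn):
    """Replay two writers whose DELETE and INSERT steps interleave."""
    # Hypothetical table with no uniqueness constraint on username.
    conn.execute("CREATE TABLE loose (username TEXT, data TEXT)")
    for _writer in ("A", "B"):  # both writers clear the old row first...
        conn.execute("DELETE FROM loose WHERE username = ?", ("alice",))
    for writer in ("A", "B"):   # ...then both insert their new row
        conn.execute("INSERT INTO loose VALUES (?, ?)",
                     ("alice", "from " + writer))
    return conn.execute("SELECT COUNT(*) FROM loose").fetchone()[0]


def single_statement_replace(conn):
    """Same two writers, but each write is one atomic statement."""
    conn.execute(
        "CREATE TABLE strict (username TEXT PRIMARY KEY, data TEXT)")
    for writer in ("A", "B"):
        conn.execute("INSERT OR REPLACE INTO strict VALUES (?, ?)",
                     ("alice", "from " + writer))
    return conn.execute("SELECT COUNT(*) FROM strict").fetchone()[0]


if __name__ == "__main__":
    print(interleaved_delete_insert(sqlite3.connect(":memory:")))
    print(single_statement_replace(sqlite3.connect(":memory:")))
```

The interleaved version leaves two rows for the same user, while the single-statement version leaves one.  Adding a primary key to the first table would turn the silent duplicate into a constraint error on the second INSERT, which is at least visible.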

There are many excellent things about ejabberd that make up for these and other shortcomings, and have made us very happy we made the switch.

  • Hot code loading – After we write a new ejabberd module, we can deploy it in production without pausing or restarting the server.  We can also redeploy it later if we find a bug.  We have even redeployed core server pieces this way with success.  This can even be done to some degree right in the Web interface.
  • Live console – It is possible to open an Erlang shell inside the running ejabberd node.  This makes it really easy to poke around to see which processes are running, how much memory they are using, and which internal database tables are getting full.  Java, C, Python, and Ruby have nothing quite like this, although Twisted Python’s manhole is similar.
  • Very low CPU usage – You’d think a C based XMPP server would be a winner in this area, but that is not always true.  ejabberd uses very little CPU, except when things go wrong.   Chesspark’s XMPP server is sitting around a load of 0.1 to 0.2; Jabber.org sits at 0.1.  This is exactly what you want.  An XMPP server should be I/O bound, not CPU bound.

So far we’re pretty happy.  How did you pick your XMPP server?



13 Responses to “Choosing An XMPP Server”

  1. Peter Petrov

    Hi Jack. I’m working on a project whose architecture turned out very similar to ChessPark. But we chose Ejabberd right from the start. The only small problem we had with it so far is its built-in PubSub module, which doesn’t conform to the latest XEP.

  2. metajack

    @Peter: Have you tried Ralph Meijer’s Idavoll project? We use that for the pubsub based Chesspark services (there wasn’t a jabberd2 pubsub, and still isn’t) right now and it works well with ejabberd. You can’t do PEP that way, but for everything else it works. Can you tell us more about this project?

  3. This article is what I needed and wanted to read, so thanks a lot! I’m currently defining the architecture we are going to use, and I’m being blocked by not being able to compile jabberd14-1.6.1.1 on CentOS. Pretty basic stuff 😀

    If you don’t mind, what OS/Linux distro are you using? Also, are you still using Punjab for BOSH, or ejabberd’s own implementation?

    I feel like there is not a defined place on the web to discuss these kinds of issues/topics (other than your blog hehe), the main topic being creating XMPP based web projects. If there is one, please tell me.

  4. Peter Petrov

    @Jack: Yes, we experimented with the latest Idavoll from SVN (v0.8.x). Unfortunately it too didn’t understand properly the latest XEP. So we decided, for now, to implement the older syntax which should work with Ejabberd’s PubSub.

    As for the project, it’s a site which broadcasts major chess tournaments and matches (with some extra goodies).

  5. Peter Petrov

    Btw, you should probably add DJabberd to the list of servers. After all it powers LiveJournal, which is the second or third largest XMPP service in the world.

  6. Oddly enough, I was looking for the article yesterday!

    I ended up choosing Openfire, because we already use Jive Software’s Clearspace Community platform, and Openfire offers some integration with the Clearspace system. The Clearspace system is quite CPU intensive, so I’ll be keeping a close eye on Openfire’s CPU usage. If it gets too high, I’ll likely switch to ejabberd and forgo the integration features. Ultimately, performance is our key concern with a growing userbase of 50k+.

    Thanks for the great blogs and libstophe.js.

  7. metajack

    @Favio: We use Debian stable running on Amazon EC2 instances for everything but DNS and E-Mail. Those run on more traditionally hosted computers, also on Debian stable.

    We use Punjab at Chesspark quite heavily. See Tofu’s write up of our BOSH set up here: http://thetofu.livejournal.com/71339.html

    Punjab works excellently with Google Talk, ejabberd, and every other XMPP server we’ve tested. We might be able to gain a little efficiency by using ejabberd’s built in BOSH service, and we’ll be testing this in the next weeks.

    As for discussing XMPP on the Web, it’s quite new. Perhaps we can talk Peter Saint-Andre into making an official list for it.

  8. metajack

    @Peter: I’m quite surprised that Idavoll didn’t support the latest XEP version since Ralph is one of the co-authors. Did you report this to Ralph? If you haven’t been able to get ahold of him, please E-mail me the relevant information and I will see that he gets it.

    I’m quite surprised to hear another chess site is using XMPP! Can you give me more information?

    I mentioned djabberd in the post, but I thought it had been abandoned. The only thing I remember about it was that Chesspark’s clients did not work with it for some reason, but it has been years since we tested this.

    @Keith: You’re very welcome. It sounds like you’ve made the reasonable choice considering the other technology investments you have. Do you have 50k+ simultaneous connections, or just that many accounts? Let me know how the testing goes; I’m always interested in XMPP server performance reports.

  9. Hello Jack,

    Glad you made the switch.
    Just for the record, how did you proceed with the migration?
    Did you find it easy enough ?


    Mickaël Rémond

  10. metajack

    @Mickaël: I just wrote up the details of the migration here: https://metajack.wordpress.com/2008/08/27/migrating-to-ejabberd-the-gory-details/

    I’m quite interested in your feedback.

  11. Hey Jack, I’m always happy to create another @xmpp.org list — shall this one be xmpp-on-the-web@xmpp.org or something catchier like hybrids@xmpp.org? I agree that we have a lot of work to do in the area of integrating HTTP and XMPP technologies…

  12. matteblack

    Metajack
    Thanks for this clarity.

    I am not a programmer, but trying to find candidates for my devs to examine.

    I just looked at Palaver and Speeqe. Is Palaver built on top of ejabberd, and Speeqe on top of that?

    Confused. Can’t find much about it, and the wiki is a bit confusing as to what’s what.
    Can you clarify this?


  1. 1 Migrating To Ejabberd: The Gory Details « metajack