The Beijinger is a community website targeted at English-speaking expats in Beijing, China. It is the online counterpart to The Beijinger magazine, a monthly print publication and the leading English-language community magazine in Beijing. The bulk of the development work was done by Jeff Warrington on behalf of The Beijinger.

The Beijinger is built around free classifieds, a user-driven section of the website. Other primary content areas are a mix of user-created content, such as forums and events, and content provided by the staff of The Beijinger, such as a blog and directory. Users can comment on most types of content and, in the case of directory listings, can add reviews of places in the directory. Other areas include a weekly podcast and a photo gallery.

Some quick statistics about the site: 49,300+ users (about 25,000 active in the last 6 months), 325,000+ nodes (about 33,000 currently published - most of the unpublished nodes are expired classified ads), over 2.5 million page views a month.

Note: Transfer speeds from China to Europe and North America are not great as China's pipe to the outside world is not large. Speeds within China are quite fast for a fairly graphics-heavy site so speeds outside of China should not be taken as representative of the experience of the users of The Beijinger within China.

Background

The previous incarnation of TheBeijinger.com was built around a Frankenstein mishmash of off-the-shelf PHP-based products: the ubiquitous phpBB for the forum, b2evolution for the blogging component, the commercial product Geoclassifieds for the classifieds, Coppermine for the image gallery and a custom directory created with the Symfony PHP framework. The obvious problems of maintaining multiple products, each with a different administrative system and a different user database, not to mention trying to provide a common theme for each of these pieces, led us to conclude that we had to build a new version of the website using a single CMS.

The search for a CMS was fairly short, as there are not many CMSs that can give you most of what we needed out of the box while at the same time being flexible enough to allow for substantial modifications and enhancements, both through contributed modules and through custom module development. At the time that work began on the site, Drupal was just releasing 4.7. Work progressed slowly due to my initial unfamiliarity with Drupal and limited overall time. As Drupal 4.7 turned into Drupal 5, it became clear that Drupal 5 represented a substantial upgrade in functionality, so the project was shifted to Drupal 5, which is what the current site is built on.

Migration

With two separate user systems for the classifieds and the forum and a substantial amount of legacy ad and forum post content, migration was never going to be easy. Add to that migrating blog posts from b2evolution, image galleries from Coppermine and linking up the directory listings from the Symfony-based internal directory database used by the editors of the print version of The Beijinger, and it was clear that data migration would loom large in the plan for a new website.

I decided on XML as my data interchange format and proceeded to write a series of PHP scripts to generate XML files representing each component to be migrated: users, forum posts, classified ads, blog posts, blog categories, forum categories, etc. Considerable time was spent (and wasted) getting all the content into UTF-8 encoding, as unfortunately not all the legacy pieces were set up and stored as UTF-8.
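As a rough illustration of the idea (not the actual PHP scripts used for the migration), a minimal Python sketch of exporting legacy rows to UTF-8 XML might look like the following. The row layout, the element names and the Latin-1 fallback encoding are all assumptions made for the example; the real legacy data may well have used other encodings.

```python
import xml.etree.ElementTree as ET


def to_text(raw: bytes) -> str:
    """Decode legacy bytes: try UTF-8 first, then fall back to Latin-1.

    The Latin-1 fallback is an assumption for this sketch; substitute
    whatever encoding the legacy tables were actually stored in.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def export_users(rows) -> bytes:
    """Build a <users> document from (uid, name_bytes, mail_bytes) tuples."""
    root = ET.Element("users")
    for uid, name, mail in rows:
        user = ET.SubElement(root, "user", id=str(uid))
        ET.SubElement(user, "name").text = to_text(name)
        ET.SubElement(user, "mail").text = to_text(mail)
    # Serialize as UTF-8 bytes with an explicit XML declaration.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Hypothetical legacy rows: one stored as UTF-8, one as Latin-1.
legacy_rows = [
    (1, "李雷".encode("utf-8"), b"lilei@example.com"),
    (2, "José".encode("latin-1"), b"jose@example.com"),
]

xml_bytes = export_users(legacy_rows)
print(xml_bytes.decode("utf-8"))
```

Normalizing the encoding at export time means the import side only ever has to deal with one encoding, whichever tool ends up reading the files.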

When I searched for Drupal import tools I found that, while there were a few options, none of them was able to provide all that was needed. The one with the most promise at the time was the ImportExportAPI module. It's quite a complex module and it took a fair amount of time just to get my head around it, but it did prove useful for relatively easy import of forum and blog content. Once I became more familiar with Drupal and started using CCK types, I switched to writing my own custom modules to handle the importing of most of the other content.

Modules

With Drupal's core modules one can have a basic functional site up in no time. Of course it doesn't take long before you turn to contrib modules for additional functionality and/or to add the one little thing you must have. We've been no different; if anything, we've splurged on contrib modules to the point where we might have too many, but there are just so many nice things that can be had with the right module. I find that as long as you run a PHP opcode accelerator such as APC, XCache or eAccelerator, adding a few more contrib modules won't hurt too much, all else being equal. In our case we've made use of eAccelerator.

Contrib

The modules that no one can live without:

Other modules we found necessary and/or useful:

Custom

In addition to a number of 'glue' modules we wrote a number of custom modules to implement site areas such as classifieds, events, directory listings and an enhanced forum. At this time some of the custom work might be useful to other users and so if I can generalize the code enough I hope to give back some of the module work to the Drupal community. Some modules created for this project that have been added to Drupal include Node Extended Stats, Email Confirm, Ignore User. Other modules that I took over or acted as co-maintainer include OpenX (Openads), Troll, Word Filter, Account Reminder, Activity.

Search & Caching

Core Drupal has known performance and functionality limitations with both the database caching layer and the core search module. In order to reduce the load on our database server we made use of both the Memcache module and the Apachesolr module.

The Memcache module allows for the replacement of database caching with a cache served by Memcache. Memcache is fast on its own, and once the reduced load on the database is factored in, the overall performance impact can be quite substantial. Using Memcache in Drupal 5 requires patching core, but the impact is minimal and Drupal 6 already contains the changes in the patch, so it's future-proof.
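The effect on database load can be sketched with a small cache-aside example in Python. This is an illustration of the general pattern only, not the Memcache module's code: `CountingDatabase` and `MemoryCache` are hypothetical in-process stand-ins for the database and a Memcache client, and expiry is left out for brevity.

```python
class CountingDatabase:
    """Stand-in for the database; counts how many lookups reach it."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def load(self, key):
        self.queries += 1
        return self.rows[key]


class MemoryCache:
    """Stand-in for a Memcache client with simple get/set semantics."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def cached_load(cache, db, key):
    """Cache-aside lookup: hit the database only on a cache miss."""
    value = cache.get(key)
    if value is None:
        value = db.load(key)
        cache.set(key, value)
    return value


db = CountingDatabase({"menu:primary": "<ul>...</ul>"})
cache = MemoryCache()

# One hundred page views asking for the same cached item.
for _ in range(100):
    cached_load(cache, db, "menu:primary")

print(db.queries)  # only the first request reaches the database
```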

To provide better search results and reduce database usage at the same time, we opted to replace Drupal's core search module with the Apachesolr module, which integrates Drupal with the Apache Solr search platform. The Apachesolr project was and still is undergoing rapid development, so tracking the module for use in a production environment was challenging, but we went live with what I felt to be a relatively stable and full-featured release of the module. We liked the faceted search feature of Apachesolr (sample search), and once Apachesolr reaches a stable 1.0 release we envision using faceted search to a greater degree, as it can replace the navigation structures currently in place and let users navigate to exactly the content they want.
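For readers unfamiliar with faceting, a minimal Python sketch of how a faceted query to Solr's select handler can be put together is shown below. The parameters `q`, `fq`, `wt`, `facet`, `facet.field` and `facet.mincount` are standard Solr request parameters, but the base URL and the field names (`type`, `neighbourhood`) are assumptions for the example and are not the fields the Apachesolr module actually indexes.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def solr_facet_url(base_url, keywords, facet_fields, filters=None):
    """Return a Solr select URL that requests facet counts on the given fields."""
    params = [
        ("q", keywords),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet values with no matching documents
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    # Each facet the user has already clicked becomes a filter query.
    params += [("fq", f"{field}:{value}") for field, value in (filters or {}).items()]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical Solr location and field names.
url = solr_facet_url(
    "http://localhost:8983/solr",
    "apartment",
    facet_fields=["type", "neighbourhood"],
    filters={"type": "classified"},
)
print(url)
```

Each facet value the user clicks is added as another `fq` filter, which is how faceting can stand in for a fixed navigation structure. Filter values containing spaces or special characters would need quoting, which this sketch does not handle.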

Design

The theme for The Beijinger was designed by Raincity Studios out of their Shanghai office. Thanks go out to Jacob Redding (jredding) for connecting us with Raincity. Special thanks go out to one of the first Chinese designers hired in the Shanghai office, Stinga Zhou, who did most of the CSS work.

Running a Drupal site

Prior to relaunching our site I attempted to do as much load testing as possible, but without a testing environment similar to our production environment, and with little experience in proper load testing, I was unable to really test whether the site would be OK after we switched. Rather than delay the launch once again, we went ahead and launched. What we found very quickly was that Drupal put a much heavier load on the database than anticipated. At the time we launched, our site ran on a single colocated Dell server purchased at the tail end of 2004. The machine was and is capable but not a speed demon by any measure, and with 3 GB of RAM we were not terribly short of memory but certainly could have used more.

Recognizing that the resource needs of a database and a web server are not the same, we opted to split out the database to a separate virtual server. Our hosting partner kitted us out with a CentOS virtual server with 4 GB of RAM and 2 CPUs. Backed by a high-performance, large-capacity SAN optimized for disk I/O, the VPS immediately brought the load down to manageable levels. The web server, being CPU bound, is still not as fast as it could be given that it's oldish hardware, but it is acceptable for now.

On the web server end, additional measures were taken to improve site performance. The biggest improvement came from using the high-performance web server Nginx. Nginx uses very few resources and provides exceptional performance for static file serving, much as lighttpd does. Nginx can also run PHP via FastCGI. We moved some of our other low-traffic non-Drupal sites to Nginx, as well as our OpenX server. To further optimize performance using Nginx, we followed the suggestions in the article "Using Lighttpd as a static file server for Drupal" and moved as much of our static content serving as possible to Nginx. If I could find the time to go through all of our other sites, I would actually prefer to jettison Apache altogether and use Nginx exclusively. For info on using Nginx with Drupal I found this article helpful: "Nginx, Fastcgi, PHP, rewrite config for Drupal". Additional performance improvements came from following the advice on Yahoo's Developer Network website.

On the database side, it became clear within a short period of time that sticking with MyISAM tables in MySQL was a losing proposition. The table-level locking of MyISAM tables, as opposed to row-level locking in InnoDB, resulted in high lock times, negating any perceived performance advantage of MyISAM over InnoDB. After looking into InnoDB more closely, I found that InnoDB has the advantage of allowing both index data and actual table data to be stored in memory, whereas MyISAM to my knowledge does not load table data into memory. After converting all tables to InnoDB, the performance impact was felt immediately on some of the more complex queries running on the forum, and locks on heavily used tables such as user, node, node_revisions and term_node went away quickly.
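The conversion itself comes down to one `ALTER TABLE ... ENGINE=InnoDB` statement per table. A minimal Python sketch that generates those statements is shown below; the table-to-engine mapping is hypothetical sample data (in practice it could be read from the `ENGINE` column of `information_schema.TABLES`), and this is not necessarily how the conversion was carried out on the site.

```python
def innodb_conversion_sql(table_engines):
    """Return ALTER TABLE statements for every table not already on InnoDB."""
    return [
        f"ALTER TABLE `{name}` ENGINE=InnoDB;"
        for name, engine in sorted(table_engines.items())
        if engine.lower() != "innodb"
    ]


# Hypothetical snapshot of table name -> current storage engine.
current_engines = {
    "node": "MyISAM",
    "node_revisions": "MyISAM",
    "term_node": "MyISAM",
    "sessions": "InnoDB",  # already converted, so it is skipped
}

statements = innodb_conversion_sql(current_engines)
for statement in statements:
    print(statement)
```

Each `ALTER TABLE` rebuilds the table, so on a live site with large tables it is worth running the statements during a quiet period.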

It's taken some time to tweak the configuration of MySQL with InnoDB to get to where we want and there is still some profiling to be done but the difference in database performance now vs at launch time is night and day.

On a side note we made use of the Percona releases of MySQL containing their optimization patches and nice additional features. For anyone looking to improve their knowledge of MySQL I highly recommend reading the Percona MySQL performance blog.

Site Credits

I wish this list were longer because it would mean that my team was bigger :) Unfortunately I (jaydub on Drupal.org) had to act as Lead Developer, Sysadmin, DBA, Network Admin and even at times the themer. While it would have been nice to have had a larger team, I certainly learned a lot more Drupal than I would have otherwise. Joining me on the team and driving the project to launch was Drupal Association Events Coordinator Jacob Redding, who came to live in China and found himself pulled away from Chinese class and into Drupal community work in Beijing and Shanghai. Jacob came on board at the end of 2007 and acted as project manager. He provided invaluable guidance on how to move the project forward and helped me deepen my skills in Drupal. As stated in the design section, Raincity Studios gets credit for the site theme. Our hosting partner Candis and all-around genius Richard Ford have worked hard to provide us with solid hosting in Beijing.

Comments

Michelle’s picture

Curious why you wrote a custom module to do the forums?

Michelle

--------------------------------------
See my Drupal articles and tutorials or come check out life in the Coulee Region.

jaydub’s picture

I did not bypass the core forum module completely but rather wrote a forum extender module to add features such as 'Unanswered Posts' and such.

The previous forum was phpBB and the boss was very into phpBB and its feature set (itself extended by a number of phpBB add-ons), so there was a lot of pressure to fill in the gaps in Drupal's forum implementation with the phpBB features that Drupal was missing.

This work preceded the Advanced Forum module and was more focused on adding backend functionality than on theming-only changes. To see these additional forum pages on the site you'd need to create an account and visit the forums.

Some things required some core hacking, such as letting forum moderators moderate comments only in the forums they were assigned to, since there is no real comment access infrastructure in Drupal 5 or 6 on the level of node access. You'll find many requests for this in the Forum Access module queue as well as in other modules' queues, since the only out-of-the-box solution in core Drupal is to give users the administer comments permission, which gives them permission to admin -all- comments.

I know you've been at the center of the ongoing work on the Drupal forum functionality. There's still a lot to do to pry people away from the likes of phpBB but after trying to extend phpBB I'd do anything to avoid that in the future!

Michelle’s picture

The unanswered posts feature is something that drupal.org itself could really use. Would be a nice addition to the core forum blocks.

Moderation is really a sticky point. Hopefully that will be fixed in D7 if we get the forum posts as nodes in.

Definitely a lot more work to be done. I've got 1.0 just about ready but there's a whole slew of features waiting for me to open up the 2.x branch. LOL

Thanks for the explanation,

Michelle

--------------------------------------
See my Drupal articles and tutorials or come check out life in the Coulee Region.

libo217005’s picture

gooooooooooooooooood

-------------
http://book.drupaluser.cn supply drupal books

http://bbs.drupaluser.cn drupal bbs website

droople’s picture

Very good site, brilliant write up.

Maybe one or two thumbnails can be added.

I think it should be promoted to front page.

While at it, may I ask how you set up your review system?

Thank you

jaydub’s picture

There are probably better ways to do it now especially with Fivestar for comments but the solution we used for reviews of our Directory Listing nodes was based on a tutorial from Drupal guru Michelle.

http://shellmultimedia.com/tutorials/rate-review-one-step

droople’s picture

it's a 2007 tutorial.

Cheers

dontai’s picture

A very cool site, and I didn't even know it was Drupal.

How did you do your news links, where you can mouseover the link and it gives you a truncated version of the article? What module did you use?

Thanks in advance.