DrupalCon London 2011: SLICK DATA SHARDING: HOW TO DEVELOP SCALABLE DATA APPLICATIONS

Presented by
Tobby Hagler
High-traffic websites that capture a lot of data from users often encounter performance problems when database input becomes a bottleneck. High-volume user-submitted content (comments, ratings, form submissions, etc.) is typically stored in a single (master) database, which creates problems not only for scaling but also for replication and useful backups. It becomes important to be able to write this kind of data to secondary storage locations. I'll cover how to write successfully to different databases (MySQL and MongoDB) while still using Drupal's APIs, along with the pitfalls and successes.
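
As a rough illustration of the idea (not code from the session), the sketch below routes each kind of record to its own storage back end, so that high-volume writes such as comments and ratings bypass the primary database. It is written in Python for brevity rather than as Drupal code, and everything in it is an assumption made for the example: the `InMemoryStore` and `ShardRouter` classes, the store names, and the record kinds are all hypothetical, and the in-memory stores merely stand in for real MySQL or MongoDB connections.

```python
from collections import defaultdict


class InMemoryStore:
    """Hypothetical stand-in for a real database connection (MySQL, MongoDB, ...)."""

    def __init__(self, name):
        self.name = name
        self.rows = defaultdict(list)  # record kind -> list of stored records

    def insert(self, kind, record):
        self.rows[kind].append(record)


class ShardRouter:
    """Sends each kind of record to the store configured for it."""

    def __init__(self, default_store):
        self.default_store = default_store
        self.routes = {}  # record kind -> store

    def route(self, kind, store):
        """Register a dedicated store for one kind of record."""
        self.routes[kind] = store

    def store_for(self, kind):
        """Return the dedicated store, or the default one if none is registered."""
        return self.routes.get(kind, self.default_store)

    def write(self, kind, record):
        """Insert the record into its store and return that store's name."""
        store = self.store_for(kind)
        store.insert(kind, record)
        return store.name


# Assumed layout: core content stays on the primary database, while
# high-volume user input goes to secondary storage.
primary = InMemoryStore("primary")
secondary = InMemoryStore("secondary")
documents = InMemoryStore("documents")

router = ShardRouter(default_store=primary)
router.route("comment", secondary)
router.route("rating", documents)

router.write("node", {"title": "Hello"})
router.write("comment", {"nid": 1, "body": "Nice post"})
router.write("rating", {"nid": 1, "value": 5})
```

Keeping the routing decision in one place means the calling code never names a specific database, which is roughly what makes it possible to move a record kind to different storage later without touching every caller.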

In addition to scalability, data sharding provides other capabilities. Applications may be developed using Node.js or other technologies but still need access to the same data. With smart data sharding, this becomes possible and even easy.

Intended audience
Developers looking to build large-volume sites who haven't built massive-scale sites before. Developers looking to build complex applications that need to integrate with Drupal but won't necessarily need to live in Drupal.

Questions answered by this session
What is data sharding, and how does it apply to Drupal sites?

Why MongoDB and how can I use it?

How can I use a secondary MySQL database (or database cluster)?

But Drupal's APIs give me the functionality I need. How can I do all of this without reinventing the wheel?

Besides scale, what other advantages do these techniques have?