DrupalCon Dublin 2016: Faster and More Scalable than Memcache/Redis: Tiered Storage with LCache

DrupalCon Dublin 2016

Video Description

Scalable caching in Drupal is broken. Once cache access saturates a network link, the main options are Memcache sharding (which has broken coherency during and after network splits) and Redis clustering (immature in multi-master and as complex as MySQL replication in master/replica modes).

We can do better. We can have better performance, scale, and operational simplicity. We just need to take a lesson from multicore processor architectures and their use of L1/L2 caches. Drupal doesn't even need full-scale coherency management; it just needs the cache writes on an earlier request to be guaranteed readable on a later request.

Inspired by how processors handle core-local caches and the design of Pantheon's Valhalla file system, I've written a library and module called LCache with the following properties:

Uses the database for canonical cache storage and coherency management. This is the L2.
Opportunistically uses APCu (which is local to each PHP-FPM pool and absent for CLI) as an L1.
Before Drupal bootstraps, it freshens APCu (if present) using L2's causality and sequencing information.
Falls back to just using the database cache if APCu isn't present or functional.
There is nothing extra to deploy or configure, just enabling the APCu extension for PHP.
Support for tag-based invalidation.
Initial results show 20% less page-generation time than with datacenter-local (but not host-local) Redis access. This is primarily because there's only one round-trip to the cache for each request, and that request only returns data for cache updates.

In this presentation, I will cover:

The coherency protocol, including some discussion of vector clocks
Benchmarks
The unit testing model for the cache coherency implementation
How fallback to pure database caching works
Why core should adopt this as the default cache implementation in, say, Drupal 8.3 or 8.4.