
Commit 2d1f43e (parent 4b721b3)

Final draft of case study for landing page caching

2 files changed: 76 additions & 13 deletions

diskcache/recipes.py (4 additions & 4 deletions)
@@ -321,10 +321,10 @@ def wrapper(*args, **kwargs):
 def memoize_stampede(cache, expire, name=None, typed=False, tag=None, beta=1):
     """Memoizing cache decorator with cache stampede protection.
 
-    Cache stampedes are a type of cascading failure that can occur when
-    parallel computing systems using memoization come under heavy load. This
-    behaviour is sometimes also called dog-piling, cache miss storm, cache
-    choking, or the thundering herd problem.
+    Cache stampedes are a type of system overload that can occur when parallel
+    computing systems using memoization come under heavy load. This behaviour
+    is sometimes also called dog-piling, cache miss storm, cache choking, or
+    the thundering herd problem.
 
     The memoization decorator implements cache stampede protection through
     early recomputation. Early recomputation of function results will occur
docs/case-study-landing-page-caching.rst (72 additions & 9 deletions)
@@ -2,17 +2,21 @@ Case Study: Landing Page Caching
 ================================
 
 :doc:`DiskCache <index>` version 4 added recipes for cache stampede mitigation.
-Let's look at how that applies to landing page caching.
+Cache stampedes are a type of system overload that can occur when parallel
+computing systems using memoization come under heavy load. This behaviour is
+sometimes also called dog-piling, cache miss storm, cache choking, or the
+thundering herd problem. Let's look at how that applies to landing page
+caching.
 
 >>> import time
 >>> def generate_landing_page():
 ...     time.sleep(0.2)  # Work really hard.
 ...     # Return HTML response.
 
 Imagine a website under heavy load with a function used to generate the landing
-page. There are two processes each with five threads for a total of ten
-concurrent workers. Also assume that generating the landing page takes about
-two hundred milliseconds.
+page. There are five processes each with two threads for a total of ten
+concurrent workers. The landing page is loaded constantly and takes about two
+hundred milliseconds to generate.
 
 .. image:: _static/no-caching.png

@@ -26,15 +30,15 @@ regenerates the page with a consistently slow latency.
 ... def generate_landing_page():
 ...     time.sleep(0.2)
 
-With traditional caching, the result of generating the landing page can be
-memoized for one second. After each second, the cached HTML expires and all ten
-workers rush to regenerate the result.
+Assume the result of generating the landing page can be memoized for one
+second. Memoization supports a traditional caching strategy. After each second,
+the cached HTML expires and all ten workers rush to regenerate the result.
 
 .. image:: _static/traditional-caching.png
 
 There is a huge improvement in average latency now but some requests experience
 worse latency than before due to the added overhead of caching. The cache
-stampede is visible now as the spikes in the concurrency graph. If generating
+stampede is visible too as the spikes in the concurrency graph. If generating
 the landing page requires significant resources then the spikes may be
 prohibitive.
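The stampede described in this hunk is easy to reproduce with plain TTL memoization and no protection at all. A standard-library sketch, independent of DiskCache (cache dict, key, and timings are illustrative):

```python
import threading
import time

cache = {}          # naive TTL memo: key -> (value, expiry)
recomputes = []     # one entry per regeneration (list.append is atomic in CPython)
barrier = threading.Barrier(10)

def naive_cached_page(ttl=1.0):
    """TTL memoization with no stampede protection (illustrative)."""
    value, expiry = cache.get('landing', (None, 0.0))
    if time.time() < expiry:
        return value
    # Every worker that observes a missing or expired entry regenerates
    # the page itself -- nothing stops them from piling on.
    recomputes.append(threading.get_ident())
    time.sleep(0.2)  # simulate expensive page generation
    value = '<html>landing page</html>'
    cache['landing'] = (value, time.time() + ttl)
    return value

def worker():
    barrier.wait()   # release all ten workers at once, like a traffic burst
    naive_cached_page()

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(recomputes))  # typically 10: every worker rushed to regenerate
```

All ten workers see the expired entry during the 200 ms the first one spends regenerating, so the work is done ten times over; this is the spike visible in the concurrency graph.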

@@ -48,12 +52,71 @@ synchronize generating the landing page::
 ...     time.sleep(0.2)
 
 The double-checked locking uses two memoization decorators to optimistically
-look up the cache result before locking.
+look up the cached result before locking. With `expire` set to zero, the
+cache's get-operation is performed but the set-operation is skipped. Only the
+inner-nested memoize decorator will update the cache.
 
 .. image:: _static/synchronized-locking.png
 
+The number of concurrent workers is now greatly improved. Rather than having
+ten workers all attempt to generate the same result, a single worker generates
+the result and the other nine benefit. The maximum latency has increased,
+however, as three layers of caching and locking wrap the function.
+
+Ideally, the system would anticipate the pending expiration of the cached item
+and would recompute the result in a separate thread of execution. Coordinating
+recomputation would be a function of the number of workers, the expiration
+time, and the duration of computation. Fortunately, Vattani et al. published
+the solution in "Optimal Probabilistic Cache Stampede Prevention" in 2015.
+
+>>> @dc.memoize_stampede(cache, expire=1)
+... def generate_landing_page():
+...     time.sleep(0.2)
+
+Early probabilistic recomputation uses a random number generator to simulate a
+cache miss prior to expiration. The new result is then computed in a separate
+thread while the cached result is returned to the caller. When the cache item
+is missing, the result is computed and cached synchronously.
+
 .. image:: _static/early-recomputation.png
 
+The latency is now at its theoretical best. An initial warmup execution takes
+two hundred milliseconds and the remaining calls all return immediately from
+the cache. Behind the scenes, separate threads of execution are recomputing
+the result on behalf of workers and updating the cache. The concurrency graph
+shows a nearly constant stream of workers recomputing the function's result.
+
+>>> @dc.memoize_stampede(cache, expire=1, beta=0.5)
+... def generate_landing_page():
+...     time.sleep(0.2)
+
+Vattani described an additional parameter, :math:`\beta`, which can be used to
+tune the eagerness of recomputation. As the number and frequency of concurrent
+worker calls increase, eagerness can be lessened by decreasing the
+:math:`\beta` parameter. The default value of :math:`\beta` is one; above, it
+is set to one half.
+
 .. image:: _static/early-recomputation-05.png
 
+Latency is still at its theoretical best while the worker load has decreased
+significantly. The likelihood of a simulated cache miss is now half what it
+was before. The value was determined through experimentation.
+
+>>> @dc.memoize_stampede(cache, expire=1, beta=0.3)
+... def generate_landing_page():
+...     time.sleep(0.2)
+
+Let's see what happens when :math:`\beta` is set too low.
+
 .. image:: _static/early-recomputation-03.png
+
+When set too low, the cache item expires before a new value is recomputed. The
+real cache miss then causes the workers to synchronously recompute the landing
+page and cache the result. With no barrier in place, eleven workers cause a
+cache stampede: ten synchronous workers plus one in a background thread. The
+best way to customize :math:`\beta` is through experimentation; otherwise, the
+default is very reasonable.
+
+:doc:`DiskCache <index>` provides data types and recipes for memoization and
+mitigation of cache stampedes. The decorators provided are composable for a
+variety of scenarios. The best way to get started is with the :doc:`tutorial`.
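The double-checked locking pattern in the hunk above can be sketched without DiskCache, using only the standard library. The cache dict, key, and timings here are illustrative; DiskCache implements the same idea with its memoize and lock recipes:

```python
import threading
import time

cache = {}          # key -> (value, expiry)
lock = threading.Lock()
computations = []   # records each regeneration, for the demo below

def get_landing_page(ttl=1.0):
    """Double-checked locking: an optimistic lock-free read, then a second
    check under the lock so only one worker regenerates the page."""
    value, expiry = cache.get('landing', (None, 0.0))
    if time.time() < expiry:
        return value            # fast path: fresh value, no lock taken
    with lock:
        value, expiry = cache.get('landing', (None, 0.0))
        if time.time() < expiry:
            return value        # another worker refreshed it while we waited
        computations.append(1)
        time.sleep(0.2)         # simulate expensive page generation
        value = '<html>landing page</html>'
        cache['landing'] = (value, time.time() + ttl)
        return value

threads = [threading.Thread(target=get_landing_page) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(computations))        # -> 1: one worker generated, the rest reused it
```

The second check inside the lock is what prevents the stampede: workers queued behind the lock find the freshly cached value and return without recomputing, at the cost of the extra latency the case study observes.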
