Living on the Edge: Lazy Static Sites with Modern CDNs and Lambda

15 October 2019

Static sites have surged in popularity and mindshare for all the right reasons: they're easy to deploy, they're unencumbered by the operational and performance complexities of databases and servers, and their pages can be cached in a CDN available as close to your users as possible. However, there is a catch: static sites pay speed and complexity costs at build and deploy time. Moreover, each popular static site tool provides a different API surface for this build step. Gatsby requires you to learn GraphQL to incorporate data in an opinionated way, whereas react-static consolidates all data fetching in a central place. Both tools require build steps that run on the order of minutes, not seconds. Even worse, you have to evict the entire CDN cache when content changes, as neither system tracks which pages need purging.

Websites like the New York Times can't wait for a twenty-minute build to deliver breaking stories. A CDN-based service like unpkg can't build a static filesystem for every possible npm package without exorbitant storage and compute costs. Yet these services stay fast in spite of the pitfalls of dynamic sites. How?

CDNs: Old and Busted

CDNs (Content Delivery Networks) are magic, and it's a crime that we don't talk about them more. A CDN consists of data centers, called edge nodes, distributed across the globe. A site owner can ask the CDN to cache pages, code, and assets, and the CDN distributes the cached files to each of its edge nodes. When a user requests a file, the edge node closest to that user serves it, sparing the user a round trip to the origin server. Edge-cached requests are fast (think 10-50ms) and reduce bandwidth, load, and cost on the origin server.

Why don't we talk about the power of CDNs more? Perhaps it's because the oldest big players in the space, Akamai and CloudFront, limit our collective imagination. Akamai has an opaque "call sales" pricing model, relies on expensive long-term contracts, and has sparse documentation. CloudFront from AWS is cheap, but cache purges are expensive, coupled to URLs, and take anywhere from five to 40 minutes to fully propagate to all of the edge nodes. Configuration changes on both services can take up to an hour, making iterative development and transient environments (like PR preview URLs) impossible or impractical.

Beyond the shortcomings of the old guard, companies tend to treat the CDN as a cost center, restricting provisioning and configuration to siloed teams. This reduces transparency, raises the bus factor, and makes DevOps harder. No wonder people treat their CDN provider as a risk instead of an asset!

CDNs: New Hotness

CDNs can be so much more than "that thing I cache my JavaScript with after I bug the right person in the right JIRA ticket." A new generation of CDN providers bet their business models on that possibility, and they're winning. The most prominent are CloudFlare and Fastly, each with shared and unique superpowers.

Both services support instant cache purges and near-instant (1-15 second) configuration changes. They're automation-friendly and can be provisioned in CI/CD pipelines without a long wait. They both support tag-based cache purges, a method of decoupling what to purge from what URL the content has. Most exciting is their support for edge computing (running code on edge nodes instead of on an origin server). Workers, CloudFlare's edge computing platform, lets you run JavaScript on the edge for A/B testing, canary deploys, or even entire APIs. Fastly has VCL, which covers some of these scenarios. In the near future it is also launching Terrarium, a WASM-based edge compute service that should match the flexibility and performance of CloudFlare Workers.

Fastly includes features especially useful for dynamic content-driven websites. "Soft purges" mark a page as stale without deleting it from the edge cache. This enables the use of "stale" headers, which permit the edge cache to serve old content for brief periods while it fetches new content in the background or masks an error response from the origin.
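To make the stale windows concrete, here is a minimal TypeScript sketch of the decision an edge node might make for a single cached object. It is a simplified model, not Fastly's implementation: the `decide` function, its types, and the example TTLs are hypothetical names and values chosen for illustration.

```typescript
// Simplified, hypothetical model of an edge cache decision. Not Fastly's actual logic.
interface CachePolicy {
  maxAge: number;               // seconds an object counts as fresh
  staleWhileRevalidate: number; // seconds stale content may be served during a background refetch
  staleIfError: number;         // seconds stale content may be served when the origin fails
}

interface CachedObject {
  ageSeconds: number;            // time since the object was stored at the edge
  softPurgedSecondsAgo?: number; // set when a soft purge marked the object stale
}

type Decision =
  | "serve-fresh"
  | "serve-stale-and-revalidate"
  | "serve-stale-on-error"
  | "go-to-origin";

function decide(obj: CachedObject, policy: CachePolicy, originHealthy: boolean): Decision {
  const softPurged = obj.softPurgedSecondsAgo !== undefined;
  if (!softPurged && obj.ageSeconds < policy.maxAge) {
    return "serve-fresh";
  }
  // Time spent stale: measured from natural expiry or from the soft purge,
  // whichever gives the longer elapsed time.
  const staleFor = Math.max(obj.ageSeconds - policy.maxAge, obj.softPurgedSecondsAgo ?? 0);
  if (originHealthy && staleFor < policy.staleWhileRevalidate) {
    return "serve-stale-and-revalidate"; // user gets old content, refetch happens in the background
  }
  if (!originHealthy && staleFor < policy.staleIfError) {
    return "serve-stale-on-error"; // old content masks the origin failure
  }
  return "go-to-origin"; // outside every stale window: the user waits on the origin
}

// Example policy: fresh for 15 minutes, 30 seconds of revalidation grace, one day of error grace.
const examplePolicy: CachePolicy = { maxAge: 900, staleWhileRevalidate: 30, staleIfError: 86400 };
```

In this model a soft purge does not evict anything. It only moves the object into the stale windows early, so the next reader still gets an instant response while the edge refetches.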

As fascinating as they are, modern CDNs can't do everything yet. Dynamic sites still need servers. What's the minimum viable ops we need to fill the gap?

CDN + Lambda: A Match Made in Heaven

We're big fans of Lambda and the accompanying Serverless ecosystem at Formidable. Lambda removes entire categories of ops work that, as consultants, we'd rather delegate to AWS: autoscaling, load balancing, patching, Node event loop tuning—the "boring" stuff that isn't specific to an app's business needs and wakes us up in the middle of the night.

Of course, using Lambda requires trade-offs. Bandwidth and compute time cost more in this model. Cold starts, not present in VM or container systems, increase response latency, especially when facing bursts of traffic. Until recently, cold start times made Lambda in a VPC borderline unusable, with cold starts of up to 15 seconds! How do we work around these limitations?

Put a CDN on it!

By setting cache headers on a Lambda response, we can cache the response on the edge network. If we cache this response for 15 minutes, we won't hit the origin Lambda for that URL at all during those 15 minutes. The majority of our users will get a cache hit, so they never experience the latency of warm or cold starts from Lambda. Furthermore, we don't pay any Lambda or API Gateway costs for those cache hits, and serving from the cache is cheaper than running more compute. Finally, we can use stale headers to mitigate the impact of cold starts once a cached response expires. Instead of making the user wait on the origin, the CDN serves the last cached response while fetching the new Lambda response in the background. If you can tolerate even 30 seconds of stale content, you can eliminate cold starts for the majority of your users. To get wild, we can warm a few Lambda instances in the background, turning the remaining cache misses into warm starts. Since bursts of traffic will hit the cache instead of Lambda, we don't need to keep thousands of Lambda instances warm!
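As a rough illustration, a Lambda behind API Gateway could attach these cache headers as in the TypeScript sketch below. The `ProxyResult` type is a local stand-in for the API Gateway proxy response shape, and `cacheControl`, `handler`, and the TTL values (including the one-day `stale-if-error` window and `max-age=0` for browsers) are assumptions made for this example rather than settings from a real deployment. The directives themselves (`s-maxage`, `stale-while-revalidate`, `stale-if-error`) are standard HTTP cache directives.

```typescript
// Local stand-in for the API Gateway proxy response shape.
interface ProxyResult {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

interface EdgeCacheOptions {
  edgeTtl: number;              // seconds the CDN treats the response as fresh
  staleWhileRevalidate: number; // seconds the CDN may serve stale while refetching
  staleIfError: number;         // seconds the CDN may serve stale if the origin errors
}

// Builds a Cache-Control value aimed at shared caches such as a CDN.
function cacheControl(opts: EdgeCacheOptions): string {
  return [
    "public",
    "max-age=0", // assumption: browsers revalidate so edge purges show up right away
    `s-maxage=${opts.edgeTtl}`,
    `stale-while-revalidate=${opts.staleWhileRevalidate}`,
    `stale-if-error=${opts.staleIfError}`,
  ].join(", ");
}

// Hypothetical handler: the body is a placeholder for a server-rendered page.
async function handler(): Promise<ProxyResult> {
  const html = "<h1>Hello from Lambda</h1>";
  return {
    statusCode: 200,
    headers: {
      "Content-Type": "text/html; charset=utf-8",
      "Cache-Control": cacheControl({
        edgeTtl: 15 * 60,            // cache at the edge for 15 minutes
        staleWhileRevalidate: 30,    // tolerate 30 seconds of stale content
        staleIfError: 24 * 60 * 60,  // assumed value: mask origin errors for a day
      }),
    },
    body: html,
  };
}
```

Keeping the header logic in a small pure function makes it easy to unit test and to vary TTLs per route.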

We call this combination of a CDN and Lambda a "lazy static site": it combines the performance of static sites with on-demand purging and regeneration of dynamic content. unpkg has a similar model.

Case Study

We first implemented this architecture for a client running a high-traffic content-driven site. The client, looking to modernize its stack, enlisted us to migrate its WordPress site on a specialized service provider to a Next.js and GraphQL stack on Lambda.

The baseline performance of server-rendered React leaves much to be desired. Combined with Lambda cold starts, we couldn't meet our standards for key performance metrics like time to first byte. Compounding our problems, our GraphQL Lambda had to live in a VPC to talk to a database—hello 15-second cold starts!

To mitigate our latency issues, we developed our lazy static site strategy—keep a few Lambdas warm, cache everything at the edge. First, we migrated from the legacy provider's "fake" CDN (a few EC2 instances in a few AWS regions) to Fastly. To prevent users from hitting cold starts (or even just slow origin requests), we used the stale-while-revalidate header in Fastly to serve stale articles and reviews for 30 seconds while fetching the new content from the origin in the background to serve to subsequent readers. As a bonus, we used Fastly's stale-if-error to serve stale pages when hitting an error from the origin. This header masked a production WordPress outage from users for several hours!

Intelligent caching isn't enough to keep a fast-changing site running smoothly; you also have to tackle that famous hard problem of computer science, cache invalidation. Our client's editorial team runs the show, and they need powerful tools to ensure that important edits and updates propagate immediately. This isn't possible (or cheap) with CloudFront, but with Fastly, it's free and instant. We wrote a plugin for their WordPress instance that sends change events to AWS. A Lambda responds to the events and purges the changed articles or reviews from the edge cache.
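The purge side of this flow can be sketched in TypeScript as follows. This is not the client's code: the `ContentChangeEvent` shape, the `type-id` key format, and the function names are hypothetical. The endpoint and the `Fastly-Key` and `Fastly-Soft-Purge` headers follow Fastly's public purge API as we understand it, so check the current API reference before relying on them. The HTTP call is injected as `send` so the sketch stays self-contained.

```typescript
// Hypothetical change event emitted by the CMS plugin.
interface ContentChangeEvent {
  type: "article" | "review";
  id: number;
}

interface FastlyConfig {
  serviceId: string;
  apiToken: string;
}

interface PurgeRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
}

// Describes the HTTP request that purges everything tagged with this item's key.
function buildPurgeRequest(event: ContentChangeEvent, config: FastlyConfig, soft = true): PurgeRequest {
  const key = `${event.type}-${event.id}`; // assumed tag format, e.g. "review-12345"
  const headers: Record<string, string> = {
    "Fastly-Key": config.apiToken,
    Accept: "application/json",
  };
  if (soft) {
    // Soft purge: mark the content stale instead of evicting it.
    headers["Fastly-Soft-Purge"] = "1";
  }
  return {
    method: "POST",
    url: `https://api.fastly.com/service/${encodeURIComponent(config.serviceId)}/purge/${encodeURIComponent(key)}`,
    headers,
  };
}

// Hypothetical Lambda entry point. `send` performs the HTTP call and resolves
// with the response status code.
async function onContentChange(
  event: ContentChangeEvent,
  config: FastlyConfig,
  send: (req: PurgeRequest) => Promise<number>
): Promise<void> {
  const status = await send(buildPurgeRequest(event, config));
  if (status < 200 || status >= 300) {
    throw new Error(`Purge failed with status ${status}`);
  }
}
```

Throwing on a non-2xx status lets the Lambda's retry behavior take over, so a failed purge does not silently leave old content at the edge.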

Besides caching React SSR, we cached GraphQL responses to make client-side SPA navigation just as fast as server rendering. This presented a new challenge: caching and purging a GraphQL response is hard. Either it's a POST request and you can't cache it at all, or it's a GET request and your cache key becomes whitespace-sensitive (the same GraphQL query can have different URLs). Using Fastly's surrogate keys header, we "tagged" GraphQL responses by category and ID, e.g. review-12345. Instead of purging the GraphQL request by URL, we purged by providing Fastly with these tags. We could purge by specific article (blogPost-54321) or for entire categories of content (e.g. reviews). Decoupling "what to purge" from "what the URL is" cracked the code of reliably caching and purging GraphQL responses from the edge.
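Tagging a response might look like the TypeScript sketch below, which builds a `Surrogate-Key` header value (a space-separated list of keys) from the entities a GraphQL response contains. The `Entity` type, the helper names, and the convention of forming the category key by appending "s" to the type are assumptions for illustration; the source only tells us that keys looked like `review-12345` and that whole categories such as `reviews` could be purged.

```typescript
// Hypothetical description of one entity included in a GraphQL response.
interface Entity {
  type: string; // e.g. "review" or "blogPost"
  id: number;
}

// Collects one key per entity plus one per category, without duplicates.
function surrogateKeys(entities: Entity[]): string[] {
  const keys: string[] = [];
  const add = (key: string): void => {
    if (keys.indexOf(key) === -1) {
      keys.push(key);
    }
  };
  for (const entity of entities) {
    add(`${entity.type}-${entity.id}`); // purge one item, e.g. "review-12345"
    add(`${entity.type}s`);             // assumed category key, e.g. "reviews"
  }
  return keys;
}

// The Surrogate-Key header carries the keys as a space-separated list.
function surrogateKeyHeader(entities: Entity[]): Record<string, string> {
  return { "Surrogate-Key": surrogateKeys(entities).join(" ") };
}

// Example: a response containing two reviews and one blog post.
const exampleHeader = surrogateKeyHeader([
  { type: "review", id: 12345 },
  { type: "review", id: 678 },
  { type: "blogPost", id: 54321 },
]);
// exampleHeader["Surrogate-Key"] is
// "review-12345 reviews review-678 blogPost-54321 blogPosts"
```

Because the keys describe content rather than URLs, two differently formatted GET URLs for the same query end up with the same tags and are purged together.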

We gambled on granular, event-based purging and won, making both users and editors happy. A modern CDN gave us the right cards.

OSS Example: Badges

Lazy static sites are too fun not to share, so we built an open-source service that uses much of the same architecture: Badges. Badges provides badges you may not find elsewhere—enhanced Travis build status, Sauce Labs browser matrices, and matrices for the Sauce Labs run of the latest Travis build. It also provides gzip size badges for npm packages. We would love your input and PRs for new badges!

Badges uses Fastly's advanced caching features like stale-while-revalidate, stale-if-error, and origin shielding to keep responses fast and reduce trips to the Lambda origin and upstream APIs. As a bonus, Badges includes Terraform modules and scripts for continuous integration and deployment with Github, CodeBuild, and CodePipeline!


For a list of available badge endpoints and how to use them, see the instructions here.

Preact master Travis build status:

Inferno Sauce browser matrix for latest Travis build:

Immer bundle size:

The Future is Edgy

Lazy static sites give you the performance of a static site with the power of a dynamic one. It's not a silver bullet—personalized content is hard to cache effectively and securely. You can mitigate this by moving personalization to client-side rendering, but this doesn't work for content that is 100% personalized, like a shopping cart page. For these problems, you may need to invest more in the performance and latency of your origin server while evaluating the tradeoffs of your compute strategy (e.g. Lambda vs. Kubernetes vs. VMs).

However, modern CDNs innovate at a breakneck pace, and they're moving computing out from the regional cloud into the edge. Providers now even experiment with pushing databases (FaunaDB, Workers KV), authorization, and A/B testing into global CDN networks.

We see a bright future where all services live as close to their users as possible. Bring your sunglasses.

Thanks to Brian Beck for donating the original badge-matrix service!

Check out more of Tyler's blog posts