API Caching on Fastly
by Kevin Sheurs on 05/28/14
We are moving towards a service-oriented architecture to help us grow and scale out the VHX platform. Given the nature of our releases, we tend to see large and sudden increases in traffic to various resources (site visits, video embeds, purchasing, streaming, apps, etc.). We're focused on putting our API at the core of this architecture, with its performance and availability as our top priorities. This post digs into how we use Fastly and its direct Varnish Configuration Language (VCL) access to cache and accelerate our API.
Varnish and HTTP Caching
Varnish is an HTTP reverse proxy cache that is at the core of Fastly's service. When a request comes in to our API, it first goes through Varnish, which looks the item up in its cache. If the item is not found, Varnish passes the request to the backend (our API application), puts the response in cache for subsequent requests, and finally returns the response to the client. The rules around looking up requests and caching responses are based on standard HTTP headers. Generally, by default, HEAD and GET requests are cached and POST, PUT, and DELETE requests are not. Many other factors help determine the correct caching policy (like Set-Cookie, Authorization, etc.), but let's look at some specific cache-related headers.
The HTTP Cache-Control header is used to specify directives on how clients should cache a response. With the following header, for example, clients would cache the response for 1 hour:
response.headers['Cache-Control'] = 'public, max-age=3600'
Since we have both an intermediate proxy (Fastly/Varnish) and the originating client (browser or library), Cache-Control would be respected by both. This can become problematic because our dynamic content would now be cached in two places, and we can only invalidate one of them (the proxy), giving us only partial cache invalidation control.
Fastly supports another caching header that addresses precisely this: Surrogate-Control. By setting our Cache-Control header to no-cache and Surrogate-Control to max-age=3600, we tell originating clients not to cache responses while Fastly still caches them for an hour. Perfect!
response.headers.merge!( 'Cache-Control' => 'public, no-cache, no-store, must-revalidate', 'Surrogate-Control' => 'max-age=3600' )
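To make the split between the two cache tiers concrete, here is a rough Ruby sketch of how each tier might read those headers. The cache_ttl function and the :edge and :browser tier names are hypothetical, and the parsing is heavily simplified: real caches also handle directives such as s-maxage, Expires, and private.

```ruby
# Returns how many seconds a cache tier would keep a response (0 = do not cache).
# tier is :edge (a surrogate such as Fastly) or :browser.
# Hypothetical helper with simplified parsing, for illustration only.
def cache_ttl(headers, tier)
  surrogate = headers['Surrogate-Control']

  # A surrogate prefers Surrogate-Control when the origin sends it.
  if tier == :edge && surrogate
    return surrogate[/max-age=(\d+)/, 1].to_i
  end

  # Everyone else (and an edge without Surrogate-Control) uses Cache-Control.
  cache_control = headers['Cache-Control'].to_s
  return 0 if cache_control =~ /no-cache|no-store/

  cache_control[/max-age=(\d+)/, 1].to_i
end

headers = {
  'Cache-Control'     => 'public, no-cache, no-store, must-revalidate',
  'Surrogate-Control' => 'max-age=3600'
}

puts cache_ttl(headers, :edge)
puts cache_ttl(headers, :browser)
```

With the headers from this post, the edge tier gets a one-hour lifetime while the browser tier gets none, so the only cached copy lives somewhere we can purge.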
Invalidation
As data changes in our system, the API needs to reflect that immediately. Fastly has great instant invalidation capabilities via the HTTP PURGE method. Upon a database write in our system, we queue up purges to be sent to Fastly (see their GitHub for some useful libraries). In its simplest form, you can do something like the following to purge a resource:
curl -X PURGE "https://api.vhx.tv/packages/1"
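The same request can be issued from Ruby with only the standard library. This is a minimal sketch rather than the code we run in production: build_purge_request and purge are hypothetical helper names, and it assumes the service accepts unauthenticated purges, which depends on how the Fastly service is configured.

```ruby
require 'net/http'
require 'uri'

# Builds (but does not send) a request using the non-standard PURGE verb.
# build_purge_request is a hypothetical helper name.
def build_purge_request(url)
  uri = URI.parse(url)
  # Arguments: method name, has request body?, expects response body?, path
  Net::HTTPGenericRequest.new('PURGE', false, true, uri.request_uri)
end

# Sends the purge over the network. Assumes no extra authentication
# headers are required, which may not hold for every service.
def purge(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_purge_request(url))
  end
end

request = build_purge_request('https://api.vhx.tv/packages/1')
puts "#{request.method} #{request.path}"
```

Keeping request construction separate from sending makes it easy to push purge calls onto a background queue after a database write.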
In addition to the Surrogate-Control header, there is also the Surrogate-Key header. It lets your backend response group content dependencies together under one or more space-separated keys. For example, on an API response that includes both site and package data, our cache headers would be:
response.headers.merge!( 'Cache-Control' => 'public, no-cache, no-store, must-revalidate', 'Surrogate-Control' => 'max-age=3600', 'Surrogate-Key' => 'sites/1 packages/1' )
Now when package #1 in our system changes, we can issue a single "purge by key" command via the Fastly API for packages/1 that will invalidate all items in cache that depend on that data. It is basically a wildcard-like approach to cache invalidation. Powerful stuff!
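A small in-memory model shows why this works. The sketch below is not Fastly's implementation; KeyedCache and its methods are invented for illustration, and the JSON bodies are placeholders. It simply indexes cached URLs by each key in the Surrogate-Key header so that one key can drop every dependent response.

```ruby
require 'set'

# Toy cache that indexes stored URLs by surrogate key.
# KeyedCache is an illustrative name, not part of any Fastly library.
class KeyedCache
  def initialize
    @responses = {}                                    # url => body
    @urls_by_key = Hash.new { |h, k| h[k] = Set.new }  # surrogate key => urls
  end

  # surrogate_key_header is the space-separated Surrogate-Key value.
  def store(url, body, surrogate_key_header)
    @responses[url] = body
    surrogate_key_header.split.each { |key| @urls_by_key[key] << url }
  end

  def cached?(url)
    @responses.key?(url)
  end

  # Drops every cached response tagged with the key; returns the purged URLs.
  def purge_key(key)
    urls = @urls_by_key.delete(key) || Set.new
    urls.each { |url| @responses.delete(url) }
    urls.to_a
  end
end

cache = KeyedCache.new
cache.store('/sites/1',    '{"site": 1}',    'sites/1 packages/1')
cache.store('/packages/1', '{"package": 1}', 'packages/1')
cache.store('/sites/2',    '{"site": 2}',    'sites/2')

cache.purge_key('packages/1')      # removes /sites/1 and /packages/1
puts cache.cached?('/sites/2')     # unrelated content stays cached
```

The application only needs to know which records went into a response, not which URLs happen to be cached.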
VCL and Varying
VCL is the language in Varnish that lets you have low-level control of request handling and caching policies. You can set or change request and response headers on the fly, add specific logic to "pass" a request to the backend or do a "lookup" in the Varnish cache, route incoming requests to different backends based on certain criteria, and so on.
One way we use this is to inspect an incoming request for the VHX session cookie. For the most part, our API identifies applications and users via OAuth2, but a few endpoints need to return logged-in content based on the session. With the following VCL, we can easily pass logged-in requests through to our backend and continue to serve cached logged-out requests via Fastly:
if (req.http.cookie ~ "_session") { return(pass); }
Vary is another cool HTTP header that Varnish respects. It lets you cache variations of content under the same "hash" or cache key. Some of the content served via our API is geographically sensitive: a user from Australia may see different content from a user in the United Kingdom.
With the following VCL and HTTP headers, we can properly vary our dynamic content, which gives us both the benefit of caching and the ease of invalidating every variation with a single purge command.
VCL:
sub vcl_recv { set req.http.X-Geo-Country-Code = geoip.country_code; }
HTTP headers:
response.headers.merge!( 'X-Geo-Country-Code' => 'AU', 'Vary' => 'Accept-Encoding, X-Geo-Country-Code' )
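The effect of Vary can be modelled with a small Ruby sketch. This is a simplification under stated assumptions, not Varnish's actual storage layout: VaryingCache and its methods are hypothetical, and header names are matched exactly rather than case-insensitively. Each URL keeps one entry, and the variants inside it are keyed by the request header values named in Vary.

```ruby
# Toy cache where each URL holds several variants, selected by the request
# headers listed in the response's Vary header. Illustrative names only.
class VaryingCache
  def initialize
    @entries = {}  # url => { names: [header names], variants: { values => body } }
  end

  def store(url, request_headers, vary_header, body)
    names = vary_header.split(',').map(&:strip)
    entry = (@entries[url] ||= { names: names, variants: {} })
    entry[:variants][values_for(names, request_headers)] = body
  end

  def lookup(url, request_headers)
    entry = @entries[url]
    return nil unless entry

    entry[:variants][values_for(entry[:names], request_headers)]
  end

  # Purging the URL removes every variant at once; returns true if anything was cached.
  def purge(url)
    !@entries.delete(url).nil?
  end

  private

  def values_for(names, request_headers)
    names.map { |name| request_headers[name].to_s }
  end
end

vary = 'Accept-Encoding, X-Geo-Country-Code'
au = { 'Accept-Encoding' => 'gzip', 'X-Geo-Country-Code' => 'AU' }
gb = { 'Accept-Encoding' => 'gzip', 'X-Geo-Country-Code' => 'GB' }

cache = VaryingCache.new
cache.store('/packages/1', au, vary, 'AU catalog')
cache.store('/packages/1', gb, vary, 'GB catalog')

puts cache.lookup('/packages/1', au)
cache.purge('/packages/1')   # both country variants are gone
```

Because the variants hang off a single URL entry, one purge is enough regardless of how many countries have been cached.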
We look forward to iterating on our API and service-oriented architecture and finding more ways Fastly and Varnish can help accelerate our performance!