composites, caching, and freshness
as part of another project i am working on, i had a chance to review some basic Web-caching challenges.
typical resource definition
typically, my REST-ful implementations for public (non-authenticated) resources support a POST factory along with PUT, DELETE, GET. something like this:
Method | URI | Response | Comments |
---|---|---|---|
POST | /customers/ | 302 Found | Creates new resource and generates new {id} value |
PUT | /customers/{id} | 200 OK | Updates existing resource |
DELETE | /customers/{id} | 204 No Content | Removes existing resource |
GET | /customers/ | 200 OK | Returns a list of customer resources |
GET | /customers/{id} | 200 OK | Returns a single existing resource |
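the table above could be sketched as a small in-memory store. this is only an illustration: the `CustomerStore` class, its method names, the integer id scheme, the `Location` header, and the 404 responses for unknown ids are all assumptions, not the actual implementation.

```python
import itertools


class CustomerStore:
    """Hypothetical in-memory stand-in for the /customers/ resources."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)  # assumed id scheme: 1, 2, 3, ...

    def post(self, data):
        # POST /customers/ : create a resource and generate its {id}
        new_id = next(self._ids)
        self._items[new_id] = dict(data)
        return 302, {"Location": f"/customers/{new_id}"}, None

    def put(self, item_id, data):
        # PUT /customers/{id} : update an existing resource
        if item_id not in self._items:
            return 404, {}, None  # assumption: unknown ids are rejected
        self._items[item_id] = dict(data)
        return 200, {}, dict(self._items[item_id])

    def delete(self, item_id):
        # DELETE /customers/{id} : remove an existing resource
        if item_id not in self._items:
            return 404, {}, None
        del self._items[item_id]
        return 204, {}, None

    def get_list(self):
        # GET /customers/ : the composite, built from every stored resource
        items = [dict(v, id=k) for k, v in sorted(self._items.items())]
        return 200, {}, items

    def get_one(self, item_id):
        # GET /customers/{id} : a single existing resource
        if item_id not in self._items:
            return 404, {}, None
        return 200, {}, dict(self._items[item_id], id=item_id)
```

note that `get_list` is rebuilt from the individual items on every call, which is exactly why any POST, PUT, or DELETE changes what it returns.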
this is nothing too fancy or amazing. but it does expose a rather common challenge regarding caching:
"If you are using caching intermediaries, what happens to the cached list of customers (GET /customers/) after a PUT, POST, or DELETE?"
the problem is that the /customers/ resource might now have stale data: maybe too many resources in the list (due to deletes), or one that has since been edited, etc.
this is esp. true for widely-used sites that have dynamic data. basically, this problem occurs because the resource that results from GET /customers/ is a composite resource. it's made up of one or more existing resources in the system. if you're not careful, composites will give you headaches.
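here is a toy simulation of that staleness. the `ToyCache` class is a hypothetical intermediary that only understands a max-age value, and the clock is passed in explicitly as `now` (seconds) to keep the example deterministic; neither reflects how a real proxy is built.

```python
class ToyCache:
    """Hypothetical caching intermediary that honors only max-age."""

    def __init__(self, origin, max_age):
        self.origin = origin      # callable returning the current list
        self.max_age = max_age    # freshness lifetime in seconds
        self._body = None
        self._stored_at = None

    def get(self, now):
        # Serve the stored copy while it is still fresh, otherwise refetch.
        fresh = (
            self._stored_at is not None
            and now - self._stored_at < self.max_age
        )
        if not fresh:
            self._body = list(self.origin())
            self._stored_at = now
        return list(self._body)


customers = {1: "ann", 2: "bob"}
cache = ToyCache(lambda: sorted(customers.values()), max_age=3600)

first = cache.get(now=0)      # fetched from the origin
del customers[2]              # DELETE /customers/2 goes straight to the origin
stale = cache.get(now=60)     # still answered from the cache
later = cache.get(now=3600)   # max-age has elapsed, so the cache refetches
```

in this sketch `stale` still contains the deleted customer, and the list only catches up once the freshness lifetime runs out.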
fun w/ caching headers
there are a number of ways to keep your composites fresh using the Cache-Control HTTP header. here are some common examples:
- ETag: {#etag}
  Cache-Control: public, max-age=3600 - this is the minimum caching set. provide the (strong) ETag and tell any intermediaries that this is a publicly cache-able resource. the problem here is that the composite resource (customer list) will fall out of sync, since intermediaries have been told they can keep it for up to one hour
- ETag: {#etag}
  Cache-Control: no-cache - in this case intermediaries may still store the item, but must check back w/ the origin server before every reuse (no-store is the directive that prevents caching altogether). not ideal, but acceptable for small-ish implementations
- ETag: {#etag}
  Cache-Control: public, must-revalidate - this is better. intermediaries are now instructed to re-validate the resource w/ the origin server once their copy goes stale, instead of serving it anyway. add max-age=0 if you want that check on every request.
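to make re-validation concrete, here is a minimal sketch of how an origin server might answer a conditional GET for the composite. the function names, the hash-based ETag scheme, and the single-value `If-None-Match` comparison are assumptions for illustration; a real server also has to handle lists of ETags and weak validators. max-age=0 is included so the stored copy is immediately stale and gets re-validated on every request.

```python
import hashlib
import json


def composite_etag(customers):
    """Derive a strong ETag from the full list representation (assumed scheme)."""
    body = json.dumps(customers, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def get_customers(customers, if_none_match=None):
    """Answer GET /customers/, honoring a single-value If-None-Match header."""
    etag = composite_etag(customers)
    headers = {
        "ETag": etag,
        "Cache-Control": "public, max-age=0, must-revalidate",
    }
    if if_none_match == etag:
        # The cached copy is still current: no body, just confirm it.
        return 304, headers, None
    return 200, headers, customers
```

because the ETag is computed over the whole list, any POST, PUT, or DELETE that changes a member also changes the composite's ETag, so the next re-validation returns a fresh 200 instead of a 304.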
there are a number of other possible caching directives available (see the link above for details). the point here is you have lots of options when it comes to aligning your resource 'freshness' requirements w/ your bandwidth and request-load requirements.