
Results caching #4460

@mkleczek


Problem

Response caching can be a very effective way to shorten response times (sometimes significantly) and, more importantly, to take load off the database server.

Currently PostgREST does not provide any way to cache responses. In line with the philosophy of "do one thing and do it well" response caching is supposed to be handled by either:

  • upper layers (i.e. a caching proxy), similar to e.g. TLS. To support this, users can set appropriate response headers in pre-request or RPC functions.
  • "lower-middle" layers (i.e. query result caching by a smart connection pooler)
  • lower layers (i.e. adding RAM to the database server and using it as either shared buffers or OS page cache)

Unfortunately, JWTs make it practically impossible for caching proxies to cache anything:

  • RFC 7234 forbids a shared cache from reusing a stored response for a request that carries an Authorization header, unless the response explicitly allows it with a Cache-Control directive such as public, must-revalidate or s-maxage: https://datatracker.ietf.org/doc/html/rfc7234#section-3.2
  • Even if the Authorization header value is included in the caching key, the cache is not effective because JWTs are short-lived: every newly issued token produces a new key. It would be much better to have some claims (such as role) be part of the caching key, so that requests with different JWTs representing the same user could be served the same cached response. But that would require proxies to be able to validate JWTs and to understand the same claims that PostgREST understands.
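To illustrate the second point, here is a minimal sketch of a claim-based caching key. It is not existing PostgREST code: `cache_key` and its `key_claims` parameter are hypothetical names, and the claims are assumed to come from a JWT that has already been validated.

```python
import hashlib
import json


def cache_key(method, path, query, claims, key_claims=("role",)):
    """Build a cache key from the request and selected JWT claims.

    `claims` must come from an already validated token (assumption).
    `key_claims` is a hypothetical setting naming the claims that matter.
    """
    # Only the selected claims take part in the key. Per-token values such
    # as exp, iat or jti are ignored, so a refreshed token maps to the same key.
    selected = {name: claims.get(name) for name in sorted(key_claims)}
    material = json.dumps(
        {"method": method, "path": path, "query": query, "claims": selected},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Two tokens for the same role with different expiry times then share one key, while a different role gets its own. If the response depends on further claims (for example through row level security policies), those would have to be listed in `key_claims` as well.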

Caching by "lower-middle" layers is problematic because SQL (and especially the PostgreSQL dialect) is a very rich language whose semantics are not trivial for any intermediary to handle properly.
The HTTP/REST layer is much better suited for caching decisions, as HTTP methods have well-defined and narrow semantics (which PostgREST already relies on: HEAD/GET methods are executed in read-only transactions).

Caching by lower layers (i.e. "below" the DBMS) is very effective in hiding IO latency but does not allow caching compute-intensive results (e.g. joins, sorting, aggregates).

Solution

Taking the above discussion into account, it looks like the best place to handle response caching is PostgREST - it has all the knowledge required to make proper caching decisions.

The first version should implement an in-memory cache of responses using the SIEVE cache implementation we already use for JWT caching.
In the future we can think about implementing two-level caching, allowing users to configure an external caching solution (such as Memcached or Redis) in addition to the in-memory cache.
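For readers not familiar with SIEVE, here is a minimal sketch of the eviction policy. It is an illustration only and not the implementation PostgREST uses for JWT caching; the class and method names are made up. Entries sit in a queue from newest (head) to oldest (tail), a hit only sets a "visited" flag, and on eviction a hand moves from the tail towards the head, clearing flags until it finds an unvisited entry.

```python
class _Node:
    __slots__ = ("key", "value", "visited", "newer", "older")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.visited = False
        self.newer = None  # neighbour towards the head
        self.older = None  # neighbour towards the tail


class SieveCache:
    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._nodes = {}
        self._head = self._tail = self._hand = None

    def __contains__(self, key):
        return key in self._nodes

    def __len__(self):
        return len(self._nodes)

    def get(self, key, default=None):
        node = self._nodes.get(key)
        if node is None:
            return default
        node.visited = True  # a hit only flips a flag, nothing is moved
        return node.value

    def put(self, key, value):
        node = self._nodes.get(key)
        if node is not None:  # overwrite is treated as an access here
            node.value, node.visited = value, True
            return
        if len(self._nodes) >= self.capacity:
            self._evict()
        node = _Node(key, value)
        node.older = self._head  # new entries always enter at the head
        if self._head is not None:
            self._head.newer = node
        self._head = node
        if self._tail is None:
            self._tail = node
        self._nodes[key] = node

    def _evict(self):
        node = self._hand or self._tail
        while node.visited:  # give visited entries a second chance
            node.visited = False
            node = node.newer or self._tail  # wrap around at the head
        self._hand = node.newer  # the hand keeps its position for next time
        if node.newer is not None:
            node.newer.older = node.older
        else:
            self._head = node.older
        if node.older is not None:
            node.older.newer = node.newer
        else:
            self._tail = node.newer
        del self._nodes[node.key]
```

With a capacity of 3, inserting a, b, c, reading a and then inserting d evicts b: the hand starts at a (the oldest entry), clears its flag and moves on to b, which was never read.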

We could also implement cache invalidation based on LISTEN/NOTIFY since we are already listening on pgrst channel.
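A rough sketch of how notification-driven invalidation could look, to make the idea concrete. Everything here is an assumption: the `invalidate cache [relation]` payload does not exist today (the pgrst channel currently understands payloads such as `reload schema`), and `TaggedCache` and `handle_notification` are hypothetical names. The sketch assumes each cached response is tagged with the relations it was read from.

```python
from collections import defaultdict


class TaggedCache:
    """Response cache whose entries are tagged with the relations they read."""

    def __init__(self):
        self._entries = {}
        self._keys_by_relation = defaultdict(set)

    def put(self, key, response, relations):
        self._entries[key] = response
        for relation in relations:
            self._keys_by_relation[relation].add(key)

    def get(self, key):
        return self._entries.get(key)

    def invalidate(self, relation):
        # Drop every response that was computed from the given relation.
        for key in self._keys_by_relation.pop(relation, set()):
            self._entries.pop(key, None)

    def clear(self):
        self._entries.clear()
        self._keys_by_relation.clear()


def handle_notification(cache, payload):
    """Apply a hypothetical 'invalidate cache [relation]' payload.

    Returns True when the payload was a cache message, False otherwise
    (so that other payloads can be handled elsewhere).
    """
    words = payload.split()
    if words[:2] != ["invalidate", "cache"]:
        return False
    if len(words) > 2:
        cache.invalidate(words[2])
    else:
        cache.clear()  # no relation given: drop everything
    return True
```

A trigger on a table could then send something like `NOTIFY pgrst, 'invalidate cache public.items'` after writes. The payload format and the per-relation tagging are only one possible design.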

We need to think about a way for users to provide caching directives in their responses. Besides the standard caching headers (i.e. Cache-Control and related headers), we must also give users some way to provide analogous information about which JWT claims affect caching decisions.
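One possible shape for this, purely as a starting point for discussion: read the lifetime from the standard Cache-Control header and the relevant claims from a separate response header. The `Cache-Key-Claims` header name and the `caching_policy` function are invented for this sketch and are not part of any standard or of PostgREST.

```python
def caching_policy(headers):
    """Return (ttl_seconds, claim_names), or None if the response must not be cached.

    `headers` is a mapping of response header names to values.
    'Cache-Key-Claims' is a hypothetical header listing the JWT claims
    that the response depends on.
    """
    normalized = {name.lower(): value for name, value in headers.items()}

    # Split Cache-Control into a directive -> value mapping.
    directives = {}
    for part in normalized.get("cache-control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Be conservative: these directives all disable this (shared) cache.
    if any(d in directives for d in ("no-store", "no-cache", "private")):
        return None

    # s-maxage targets shared caches and takes precedence over max-age.
    ttl = directives.get("s-maxage") or directives.get("max-age")
    if ttl is None or not ttl.isdigit() or int(ttl) == 0:
        return None

    claims = tuple(
        claim.strip()
        for claim in normalized.get("cache-key-claims", "").split(",")
        if claim.strip()
    )
    return int(ttl), claims
```

A function response with `Cache-Control: public, max-age=60` and `Cache-Key-Claims: role` would then be cached for 60 seconds per role, and a response without an explicit lifetime would not be cached at all.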

Related issues

#133 Implement Vary header. While important, it does not address this issue, as there is no way to specify a JWT claim in the Vary header. There is also the No-Vary-Search header (marked as experimental on MDN).
#1089 Shortcomings in HTTP compliance. Contains discussion about the Range and Content-Range headers and their impact on caching.


Labels: idea (needs discussion to become an enhancement, not ready for implementation), perf
