-
-
Notifications
You must be signed in to change notification settings - Fork 651
cache: fix excessive caching and some other lack of caching #4335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
cache: fix excessive caching and some other lack of caching #4335
Conversation
307 responses must not be cached if they do not have an explicit caching directive. Final responses with an explicit caching directive can be cached even if their status code is not heuristically cacheable status codes. fix part of nodejs#4334
// Allow caching for status codes 200 and 307 (original behavior) | ||
// Also allow caching for other status codes that are heuristically cacheable | ||
// when they have explicit cache directives | ||
if (statusCode !== 200 && statusCode !== 307 && !HEURISTICALLY_CACHEABLE_STATUS_CODES.includes(statusCode)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a heuristically cacheable status code matters only for responses without an explicit caching directive. It is not a requirement for responses with an explicit caching directive, see Storing Responses in Caches.
This previous code also allows to cache 307 responses without an explicit caching directive if the cacheByDefault
option is used. But 307 is not heuristically cacheable, and so, it should not be cached if its response lacks explicit caching directive.
test/interceptors/cache.js
Outdated
@@ -1453,7 +1456,7 @@ describe('Cache Interceptor', () => { | |||
res.end(body) | |||
}).listen(0) | |||
|
|||
const client = new Client(`http://localhost:${server.address().port}`) | |||
const client = new Client(`http://localhost:${server.address().port}`, { maxRedirections: 0 }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the test includes redirect responses (furthermore, invalid ones), explicitly disable following redirections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Max redirections is not supported anymore, you'd require to use the interceptor instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand it well, it means that redirection are not followed by default, an interceptor has to be added for that. Fine for the test then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed this option.
@@ -1489,47 +1492,55 @@ describe('Cache Interceptor', () => { | |||
}) | |||
} | |||
|
|||
test('does not cache non-heuristically cacheable error status codes', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RFC allows to cache non-heuristically cacheable error responses if they have an explicit caching directive.
So, I change this test to check we do not cache non-heuristically cacheable response not having a caching directive.
finish fixing nodejs#4334
26fe391
to
00d1d45
Compare
// https://datatracker.ietf.org/doc/html/rfc9111#section-3 | ||
// This list should not grow beyond 206 and 304 unless the RFC is updated | ||
// by a newer one including more. Please introduce another list if | ||
// implementing caching of responses with the 'must-understand' directive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently responses with the must-understand
directive may be cached without further checks by the current implementation, if they do not follow the RFC recommendation which states they should also includes a no-store
directive (that has to be ignored by caches supporting must-understand
).
So, I think the current implementation is good enough for now, about that directive. (The current implementation appears to me as just reading this directive and flagging it, but not doing anything of it beyond that.)
}) | ||
} | ||
|
||
test('discriminates caching of range requests, or does not cache them', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prior the second commit of this PR, this new test was failing.
test/interceptors/cache.js
Outdated
} | ||
}) | ||
|
||
test('discriminates caching of conditionnal requests, or does not cache them', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prior the second commit of this PR but after the first one, this new test was failing too.
(Prior the first one, it was not, since non heuristically cacheable statuses were not cached, and 304 is not heuristically cacheable.)
} | ||
|
||
// Third request with different since should hit the origin too | ||
request.headers['if-modified-since'] = new Date(0).toUTCString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have checked this third request case before the fix by moving it as the second request, and it was failing too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@fredericDelaporte could you check if there is any docs that needs updating? |
I have put a "skipped" in the "Documented" item because I considered no documentation update was required. (Nothing documented about cache has changed, and the change is bug fixes and a slight improvement. I have also checked no mention of 307 being specifically cached were there.) Maybe I should have instead ticked it for meaning I have done these checks? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but CI seems red for cache
lib/handler/cache-handler.js
Outdated
} | ||
// Responses with neither status codes that are heuristically cacheable, nor explicit caching directives, | ||
// are not cacheable. | ||
if (!HEURISTICALLY_CACHEABLE_STATUS_CODES.includes(statusCode) && !cacheControlDirectives && !resHeaders['expires']) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cacheControlDirectives
is never undefined or null but just { }
if there is no directive. So, that is a bug here.
But I have not seen yet why the tests still succeed on my setup.
Edit: On my setup the test response has a date which causes the cacheByDefault
directive to be ignored for determining the freshness.
// Responses with neither status codes that are heuristically cacheable, nor "explicit enough" caching | ||
// directives, are not cacheable. "Explicit enough": see https://www.rfc-editor.org/rfc/rfc9111.html#section-3 | ||
if (!HEURISTICALLY_CACHEABLE_STATUS_CODES.includes(statusCode) && !resHeaders['expires'] && | ||
!cacheControlDirectives.public && | ||
cacheControlDirectives['max-age'] === undefined && | ||
// RFC 9111: a private response directive, if the cache is not shared | ||
!(cacheControlDirectives.private && cacheType === 'private') && | ||
!(cacheControlDirectives['s-maxage'] !== undefined && cacheType === 'shared') | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply testing there is a cache-control
directive is not enough. It is approximate, and thinking about it again, it may allow to cache cases that should not.
But the complete test is not straightforward.
Especially I thought writing !(cacheControlDirectives.private === true && cacheType === 'private')
instead of !(cacheControlDirectives.private && cacheType === 'private')
, because a qualified private
directive allow caching by shared caches if they strip the headers listed by private
.
But the RFC wording, "a private response directive, if the cache is not shared", can be understood as private directive, qualified or not, should allow caching (in the absence of the other alternatives) only for private caches.
lib/handler/cache-handler.js
Outdated
if (cacheByDefault) { | ||
// When cacheByDefault is truthy, the cache interceptor does no more mandate an explicit expiry prior to | ||
// reaching here. We should then check the full condition allowing to consider caching responses. | ||
// Responses with neither status codes that are heuristically cacheable, nor "explicit enough" caching | ||
// directives, are not cacheable. "Explicit enough": see https://www.rfc-editor.org/rfc/rfc9111.html#section-3 | ||
if (!HEURISTICALLY_CACHEABLE_STATUS_CODES.includes(statusCode) && !resHeaders['expires'] && | ||
!cacheControlDirectives.public && | ||
cacheControlDirectives['max-age'] === undefined && | ||
// RFC 9111: a private response directive, if the cache is not shared | ||
!(cacheControlDirectives.private && cacheType === 'private') && | ||
!(cacheControlDirectives['s-maxage'] !== undefined && cacheType === 'shared') | ||
) { | ||
return false | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I considered the subject, I was not expecting it to turn that way because I was not aware of the cacheByDefault
option.
Do to the code preceding the call to canCacheResponse, which actually does already some cacheability checks, we need that big test only when cacheByDefault is truthy.
Maybe that code should be moved into canCacheResponse
instead of staying out of it, allowing to have a better picture of how the cache interceptor checks cacheability.
The remaining CI failures seem to be a flaky websockets test. (There is a flaky cache test too which has blown on my setup once, name "revalidates the request when the response is stale".) |
lib/handler/cache-handler.js
Outdated
// When cacheByDefault is truthy, the cache interceptor does no more mandate an explicit expiry or | ||
// heuristically cacheability prior to reaching here. We should then check the full condition allowing | ||
// to consider caching responses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is wrong, even when cacheByDefault is not defined or falsy, an explicit expiry is not mandated by the current implementation. (Also see #4338.) This shortcut has to be removed.
This reverts commit 5cf41d4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
307 responses must not be cached if they do not have an explicit caching directive.
Final responses with an explicit caching directive can be cached even if their status code is not heuristically cacheable status codes.
206 and 304 semantics are not handled by the cache and as such must not be cached.
This relates to...
fix #4334.
Changes
Features
N/A
Bug Fixes
Breaking Changes and Deprecations
N/A
Status