Policy for "Back/Forward Cache" (aka "bfcache" or "Page Cache") #16359
I'd prefer to assume absence because the feature isn't standardized, but I don't know if/how it can be disabled in the browsers we support. |
In principle, the bfcache is just an optimisation that's only sorta observable. I think @annevk was planning on updating how HTML sorta acknowledges the existence of bfcaches in plenty of browsers to actually be more accurate (because with Blink implementing a bfcache we'll then have every major browser having one). We'll want to run some tests with the bfcache enabled (though it's a bit awkward, like testing any caching behaviour, given eviction is implementation defined). Are there any we want to run with it disabled? Testing behaviour that isn't actually in the shipped configuration isn't that useful. |
It is standardized, no? It's the |
It's hard to discuss observability in relative terms. More concretely: the feature directly undermines the expectations of the test I referenced above. That test sets
These may not happen if the UA is using a bfcache. Safari 12.0 fails for this reason. My first suggestion is that we document an assumption: " My second suggestion is that we document as assumption: "The browser does not implement a so-called 'bfcache'" and take steps to ensure it is disabled in each browser. Although I initially favored the second suggestion, @annevk's input that there is spec prose governing the behavior makes the first suggestion seem more appropriate. |
I don't understand what documenting an assumption means. |
@annevk I'm referring to the "Assumptions" document in the project's guide for writing tests. For the purposes of consistency, I think we have to instruct test authors to assume that navigation will not occur because the presence of the Document in the session history entry (and thus the possibility of a navigation) is at the discretion of the UA https://html.spec.whatwg.org/multipage/history.html#the-session-history-of-browsing-contexts |
That document reads more to me like "expected configuration", not "assumptions". Whether |
@annevk Essentially, I want to update the referenced test to set The reason I've raised this in terms of WPT policy is that the flaky test seems like a symptom of a larger problem: the bfcache behavior can subvert the expectations of test authors. I agree that authors should not assume a cache is in use. They also shouldn't assume a cache isn't in use. The recommendation I want to make is "don't use Maybe the "Assumptions" document isn't the best place to put this, but I still feel that this needs to be said somewhere. Simply fixing the test seems inconsistent--we'd be acknowledging the use of those APIs as a bad practice without telling other folks to avoid it. |
So looking at the test I think it's actually testing bfcache behavior (it's affected by WebSocket), so I'd be opposed to changing it. This might show a bug in Safari. |
Can you elaborate on what text you feel the test is verifying? |
Well, that |
Ahah, the unloading document cleanup steps:
However, in that test, the document is being unloaded from the WebSocket's Since the salvageable state is not set to false, the presence of the Document in the session history remains at the discretion of the UA. If that's right, then Safari's flaky behavior is permissible. |
Interesting, though that would mean it's showing a Firefox bug as for some reason Firefox does seem to set salvageable to false, right? |
Well, Firefox does seem to discard the document, but I don't know if that's because it has incorrectly set salvageable to false. Isn't it also possible that it has removed the document from the session history (or informally, "cleared the bf cache")? Would it be violating any spec text for doing that? |
Perhaps, but it's suspect that it always does so and never fails the test. That makes it much more deterministic than it ought to be, no? |
Definitely hinky. I've reported it on Mozilla's bug tracker, but the validity of this test is still questionable. As long as the BF cache might be active, I don't think we can assert any specific behavior. |
Well, we want to test salvageable given it's going to be implemented everywhere and given that I suspect we actually want to assume that it works within WPT. And I wouldn't really want to rewrite existing tests that might be testing it so they're no longer testing it (and might not be testing anything relevant anymore). |
Although I'd be reluctant to maintain tests that are expected to be flaky in conforming runtimes, that seems like a theoretical risk at this point. Safari behaves inconsistently on this test, but the reason is out of scope of this test and possibly of the bfcache. The problem is that I've created a generic test case for this and submitted via gh-16478. Would you mind taking a look? |
Then the implementation is not conforming to the spec. The spec requires to set the salvageable state to false on the document on navigation if it has any open https://html.spec.whatwg.org/multipage/browsing-the-web.html#unloading-documents:make-disappear This is exactly what the test is intending to test. (Edit: I see now that this was already pointed out earlier.) |
No, wait,
I'm not sure how to make it not flaky, short of running the same test several times and asserting that it got the "bfcache" behavior at least once. |
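The "run it several times and require the bfcache behavior at least once" idea can be sketched as a small aggregation step. This is only an illustration: `summarizeRuns` and its input shape are hypothetical names, not part of testharness.js or any existing WPT helper, and how each run's observation is collected is left open.

```javascript
// Hypothetical helper (not a WPT API): summarize repeated runs of one test.
// `persistedPerRun` is an assumed array with one boolean per run, true when
// that run observed the page being restored from the bfcache.
function summarizeRuns(persistedPerRun) {
  const restoredRuns = persistedPerRun.filter(Boolean).length;
  return {
    totalRuns: persistedPerRun.length,
    restoredRuns,
    // The test would assert on this flag instead of on any single run.
    observedAtLeastOnce: restoredRuns > 0,
  };
}
```

A test built this way tolerates runs where the UA declines to cache, but it still reports a problem if the bfcache path is never exercised.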
See also whatwg/html#1931 |
In chromium we would like to start writing more WPT tests for bfcache. We had to fix a few of our own internal web tests. It looks like none that needed fixing were WPT. We also expect some of these tests to pass/fail depending on whether bfcache is enabled or not. I don't see a way to reliably test without a way to specify that a test needs
I can still imagine a situation where because of implementation differences, "aggressively use bfcache" could still cause problems since it has no definition but would still be better than the current state of not being able to test at all. The alternative of writing tests to detect whether bfcaching has happened and pass in either case is bad. While it makes everything green, you have no guarantee that you tested anything. Are there any other alternatives? |
So we could add some mechanism for wpt to opt in to different bfcache behaviours. The most obvious way would be to provide a testdriver API to change the bfcache behaviour for a particular browsing context (or, less ideally, for the whole browser). That depends on this being something we can opt into at runtime. An alternative would be something like a meta element that could map down onto a pref set at browser start time. If there's something that sounds good to implementors here I can help write an RFC and get it implemented in wpt. |
Explicitly opt-in to always enable bfcache wouldn't work in cases where one wants to test whether use of some feature But maybe we could start with something simple. A way to reliably disable bfcache - that would be the first step. |
I'm imagining e.g. a test that ensures that e.g. we don't expose location sensors while in BFCache. A browser could just disable BFCache if location sensors are used and so the test would never be able to trigger bfcaching. It seems like Being able to reliably disable would definitely be a start. whatwg/html#5744 already exists but maybe using a header is not suitable and a test-only JS API would be better. |
I understand how an opt-out could be useful for web application developers. They can't make any assumptions about the contents of their users' history, so the history API is the only way they can reliably revisit previously-viewed documents. Is that relevant in WPT, though? WPT test authors have total knowledge of the navigation history, so they can always trigger the desired navigation with another mechanism (e.g. the If so, it seems like we could address the problem in WPT with documentation only. For example:
|
There are tests that use the history API that will pass or fail depending on whether BFCache is enabled. E.g. (and more here I don't know if this is not tested in WPT for a good reason or if it's just something that we should migrate from Chromium to WPT but haven't done yet. These cases are rare but they do exist, they mostly seem to involve the interaction between redirects/posts and the history API. Is there another way around the problem that I'm not seeing? |
So I don't think it's possible for vendors to convert PRECONDITION_FAILED into failure for specific tests, or at least the infrastructure for saying "this precondition ought not to fail in my implementation" doesn't exist and seems hard to maintain. AFAICT the situation with bfcache is:
So whilst I understand the desire to treat this as an optimisation and not over-specify behaviour, I'm not convinced that's in the best interests of the platform; it seems like there should just be an agreed set of rules that can be tested, even if we have to agree on some additional test-only rule about eviction policy (i.e. a way to enforce the invariant the eviction doesn't happen within a single test so any page that's in bfcache at any point during a test remains so throughout the test). |
Chrome infrastructure allows us to expect a range of outcomes for a test, so that when others check in tests for features that are unsupported in chrome we can just set the expectation appropriately and make it pass in the future. What do others do? More generally, if you cannot assert that a test passes for real and doesn't just give
I'm not sure how this would work. E.g. there are some sensors that should not collect data while a page is in BFCache. In Chrome's impl we do not cache if they are enabled (we will eventually make it so they stop collecting and things work but we have a long list of things to fix). So now imagine a test to ensure that no sensor data is collected while in BFCache. In Chrome, if we just force the page into the cache without fixing the sensor code, we will collect the data and the test will see that data was collected and fail but the test is a privacy test and Chrome's privacy is correct. Do you have examples of a test you could write using your proposal? |
My point about
It seems like there are two different things here. One is a test that no sensor data is collected once you navigate away from the page. That would be the privacy test and wouldn't depend on the bfcache implementation. The other is that enabling sensors doesn't exclude a page from being put in the bfcache. Chrome today would pass the former and fail the latter. That seems totally reasonable. The situation where we basically say "implementations can refuse to bfcache a page for any reason during tests" seems like it's going to make all the "real" tests implementation-specific, even when there's a commonly agreed set of behaviour that everyone is aiming to support and that could reasonably be tested in a cross-browser fashion. |
I don't understand your reply about That is exactly how Am I missing something about your process? Is it not possible to check in an expectation so that nobody ever has to think about it again unless it changes?
It doesn't seem reasonable to me to fail a WPT if we're spec compliant. Here's what happens in some imaginary circumstances where we have a test for bfcache+gps under both schemes With
In both cases, privacy problems to go undetected. With
API
Finally, on the idea of an API to force caching, that seems quite problematic, we have code that evicts pages from the cache and expects them to be evicted, dropping out of IPC handlers half-way, knowing that the page is doomed. Keeping that page around and pulling it back out of cache could lead to crashes. As far as I can see, using |
Is Would it make sense for the API for this be in terms of https://github.com/WICG/app-history ? |
I agree the force caching idea doesn't work. Consider it withdrawn. But nothing I've said recently depends on that. Fundamentally I think the question here is "should bfcache be specced in a way that ensures UAs behave consistently, subject to reasonable limits" (c.f. implementation defined limits in the HTML spec). Since the behaviour of the feature is observable from content, the obvious answer is "yes". But since it's regarded as an optimisation, the traditional answer is "no". If we don't spec the feature in enough detail that we can agree what tests should do, it seems unlikely that we're going to end up with a useful set of shared tests. Implementors will end up writing product-specific tests that require whatever behaviour they happen to implement and largely ignore the "shared" tests which will most likely just reflect the decisions of whichever implementor happens to write them. Whether the status in the case of fundamental disagreement over the behaviour of the feature is |
I agree with the goal of sharing tests even on optional behaviour and 1 navigate forward 1 enable feature I don't see what goes wrong with the above and I don't see how it interferes with fully speccing BFCache or with having a battery of shared tests. |
I agree that if the constraints are a) it must never be required by the specification to put a page in the bfcache (even under test conditions) and b) we must never display FAIL for something that's technically allowed, your solution satisfies the constraints and will allow producing a shared testsuite. I just also believe that the social dynamics of accepting those constraints will tend to cause people to rely more on browser-specific tests that encode the actual rules they implement in a clear PASS/FAIL format and pay less attention to the shared tests. If we agreed that any bfcache implementation must follow explicit rules, with some additional constraints imposed on the test scenario specifically (e.g. rules around not having time-based eviction during a test), we'd end up with something more testable and more consistent between browsers; thus helping authors. We could make the tests clearly PASS/FAIL and a browser with no implementation at all would simply accept that they FAIL these tests. That's not so unusual; e.g. there are Blink-only features for which other browsers FAIL the tests not because the implementation is wrong, but because they object in principle to the existence of the feature. It's possible that I'm wrong about this of course, but in general getting people to pay attention to test failures from tests they didn't write is even harder than getting them to write tests in the first place. It would be useful to get some input from @smaug---- and whoever works on this on the WebKit side on what would be most likely to make the testsuite useful for their implementation work. |
cc @cdumez |
back-forward-cache/resources/helper.sub.js: Helper for A->B->A navigation scenarios. BFCache state is detected by observing `pageshow` events. We might want to use more explicit APIs like `isPreviousPageInBFCache` discussed at #16359 in the future / in more complicated scenarios, but so far `pageshow`-based detection seems to work without hurting ergonomics. back-forward-cache/resources/events.html: The file that loads `helper.sub.js` and contains test logic and assertions. We navigate to `back-forward-cache/resources/back.html` and then back-navigate to `back-forward-cache/resources/events.html`. In the case of BFCache is not used, two async_test objects are created: - The first one during the initial navigation, which never completes, and - The second one during the back navigation, which will execute test assertions and then complete. This doesn't affect the behavior (the first one seems just ignored), but might look awkward. back-forward-cache/events.html: The main test Document that opens `back-forward-cache/resources/events.html` using `window.open()` with 'noopener' option. This is because (at least) Chromium requires top-level navigations to trigger BFCache and thus `back-forward-cache/resources/events.html` is navigated away during the test, but the WPT test infrastructure doesn't support navigating the main test Document. testharness.js: `fetch_tests_from_prefixed_local_storage()` is introduced to communicate test results from `back-forward-cache/resources/events.html` in a similar way to `fetch_tests_from_window()`. PrefixedLocalStorage.js: Some basic utility methods for `helper.sub.js` are added. Bug: 1107415 Change-Id: I034f9f5376dc3f9f32ca0b936dbd06e458c9160b
back-forward-cache/resources/helper.sub.js: Helper for A->B->A navigation scenarios. BFCache state is detected by observing `pageshow` events. We might want to use more explicit APIs like `isPreviousPageInBFCache` discussed at #16359 in the future / in more complicated scenarios, but so far `pageshow`-based detection seems to work without hurting ergonomics. back-forward-cache/resources/events.html: The file that loads `helper.sub.js` and contains test logic and assertions. We navigate to `back-forward-cache/resources/back.html` and then back-navigate to `back-forward-cache/resources/events.html`. In the case of BFCache is not used, two async_test objects are created: - The first one during the initial navigation, which never completes, and - The second one during the back navigation, which will execute test assertions and then complete. This doesn't affect the behavior (the first one seems just ignored), but might look awkward. back-forward-cache/events.html: The main test Document that opens `back-forward-cache/resources/events.html` using `window.open()` with 'noopener' option. This is because (at least) Chromium requires top-level navigations to trigger BFCache and thus `back-forward-cache/resources/events.html` is navigated away during the test, but the WPT test infrastructure doesn't support navigating the main test Document. testharness.js: `fetch_tests_from_prefixed_local_storage()` is introduced to communicate test results from `back-forward-cache/resources/events.html` in a similar way to `fetch_tests_from_window()`. PrefixedLocalStorage.js: Some basic utility methods for `helper.sub.js` are added. Design doc: https://docs.google.com/document/d/1p3G-qNYMTHf5LU9hykaXcYtJ0k3wYOwcdVKGeps6EkU/edit?usp=sharing Bug: 1107415 Change-Id: I034f9f5376dc3f9f32ca0b936dbd06e458c9160b
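For readers unfamiliar with `pageshow`-based detection, a minimal sketch follows. It is not the code in `helper.sub.js`; the function name `wasRestoredFromBFCache` is made up for illustration. It relies only on the standard `persisted` flag of the `pageshow` event, which is true when the document is restored from the back/forward cache.

```javascript
// Illustrative only; the actual helper in the PR may work differently.
// Returns true when a `pageshow` event reports that the document was
// restored from the back/forward cache rather than loaded afresh.
function wasRestoredFromBFCache(event) {
  return event.type === "pageshow" && event.persisted === true;
}

// Register the listener only when a browser-like global is available,
// so the snippet can also be loaded outside a browser.
if (typeof window !== "undefined" && typeof window.addEventListener === "function") {
  window.addEventListener("pageshow", (event) => {
    console.log(
      wasRestoredFromBFCache(event) ? "restored from bfcache" : "loaded normally"
    );
  });
}
```

On the initial load `persisted` is false, so in an A->B->A scenario only the `pageshow` fired after the back navigation can report a bfcache restore.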
I created a PR #28950 that shows a basic framework for BFCache WPTs (especially for simple scenarios). What do you think? If there are no objections, I'd like to merge the PR and upload other WPT PRs for more individual BFCache-related issues based on the PR. |
I'm very happy to see progress here. Since the PR adds APIs to testharness.js I think it needs to go through the RFC process to ensure that we get cross-vendor feedback on the API design. |
I'm a bit disappointed by the tests since they don't guarantee that the page gets put into bfcache (e.g. using testdriver or something similar to force it). So I'm not sure how I would ever run the tests locally to debug bfcache; I expect I would get "precondition failed" most of the time. I would just have to hope that the browser decides to cache, I guess, and try to do all my debugging in those runs? |
What debugging are you thinking about? Why do you expect to get precondition failed most of the time? We have web tests in chrome that expect bfcache to work every time and to my knowledge they are not flaky. If we force caching when the browser has a reason to not cache then we're asking for browser crashes. |
Well, for example, if I'm writing a test to see what happens while in bfcache (to test, e.g., a fully active condition), it's not very useful to me if sometimes I'm in bfcache and sometimes I'm not. That just makes the debugging experience very frustrating. If bfcache works every time, then maybe the tests can assert that, instead of using the precondition failed system. |
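The two reporting policies being debated here can be compared with a small sketch. Everything in it is hypothetical: `classifyOutcome`, `restoredFromBFCache`, `assertionsPassed`, and `requireBFCache` are illustrative names rather than testharness.js APIs, and the status strings simply mirror the labels used in this thread.

```javascript
// Hypothetical sketch of how one run could be reported under each policy.
// `run.restoredFromBFCache`: whether the page actually came back from bfcache.
// `run.assertionsPassed`: whether the test's own assertions held.
// `policy.requireBFCache`: true means "bfcache must be used" (strict policy);
// false means a run without bfcache is reported as a failed precondition.
function classifyOutcome(run, policy) {
  if (!run.restoredFromBFCache) {
    return policy.requireBFCache ? "FAIL" : "PRECONDITION_FAILED";
  }
  return run.assertionsPassed ? "PASS" : "FAIL";
}
```

Under the strict policy a browser that declines to cache shows FAIL, which is easy to notice but penalizes behavior the spec permits. Under the lenient policy the same run shows PRECONDITION_FAILED, which is spec-friendly but can hide the fact that nothing was tested.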
@domenic, are you concerned that browsers will most likely not cache the page when running web tests? I guess adding @jgraham thank you! So I guess now @hiroshige-g just needs to make a RFC explaining |
"Most likely" is overstating it. It's more that I have no confidence as to whether they will or not; it's not that I have confidence they usually won't. Maybe they cache when my system has plenty of extra memory? Maybe they cache based on previously observed traffic patterns? It'd be a nightmare to debug such situations...
Yeah, I think so. It'd give me some reassurance that, if a browser has the capability to bfcache the page, then it's trying to do so, and I don't have to worry about some heuristic I might be tripping the wrong direction. |
These are all concerns but any browser that wants to make caching non-deterministic should disable that when running WPTs. In chrome the only thing we consistently tweak like that in tests is I don't think we should introduce something to say "make bfcache deterministic" at the start of every bfcache test, that should just be the default for all WPT tests. One problem is that there is no agreed cache-size for WPT tests. We should probably try to agree on something so that nobody writes tests that are doomed to hit |
This kind of debugging against heuristics might be more likely in the wild rather than in writing tests, so web developers might have experiences/opinions... not sure though. As for testing, so far I haven't encountered flakiness around BFCache eligibility, and |
I don't think this is realistic. I run WPTs in Chrome Canary with no flags, or just the usual Chrome stable that I run. The point of WPTs is to test browsers as they are, not browsers in a special mode. Indeed, from what I understand wpt.fyi (and the CI for the WPT repository) run in a no-flags mode. If a test requires a browser to be in a special mode, then it should communicate that to the browser in some way, e.g. using WebDriver. |
BFCache is observable, including outside of tests intended for BFCache. We had to rewrite some of chrome's pre-existing web tests to cope with BFCache. Having observable behaviour controlled by heuristics means that all tests that navigate are potentially subject to this problem (whether they know it or not). So providing an API won't fix everything but I do see the benefit of not running in a special mode. I think we can wait and see if this problem ever occurs before trying to fix it. Note, chrome's dev-tools will tell you whether/why a navigation was cached, so that will help. |
This problem has already occurred: some of the wpt.fyi preview results generated by CI on the above PR fail in Chrome (and some succeed), since the wpt.fyi CI does not know to run Chrome in a special mode. |
I think the only special thing we do right now is disabling the timeout and that should be irrelevant. It's more likely that that bot is in an group that just doesn't have BFCache at all. No beta or stable desktop chrome has it yet, so it is essentially like any unreleased feature right now. |
The bot runs with experimental web platform features so I don't think that's what causes it. I suspect it's the timeout; it runs many tests in a row and it'd be easy to trigger such a timeout, IIUC? |
OK, so if they're enabling BFCache as an experimental feature then I imagine it's got the default timeout (15s), whereas real users who are in our rollout groups get 180s. Regardless of these tests, we should update the default timeout to 180s so that enabling the feature gets people a useful experience and also, in the long term, drop the need to send a timeout in our configs. |
It's not really true that we run browsers in a totally unmodified state from what ships; it's pretty common to have to set some prefs to get reliable test behaviour. So I don't think that it's totally unreasonable to supply a flag/pref/etc. that disables bfcache eviction based on heuristics. I also think it would be quite reasonable to have an API (in testdriver or as a test-only API in the DOM) to clear the bfcache, if that helps get a clean slate (I think we typically run each test in a fresh top-level-navigable already, so it might not make much difference). |
IIUC BFCache is not included in the experimental web platform features (see https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/platform/runtime_enabled_features.json5;l=246?ss=chromium) and the failures are expected for me (due to lack of |
Firefox and Safari implement a cache which drastically alters the behavior of `history.back` and `history.forward`. Chrome is experimenting with a similar feature.

The optimized behavior is at odds with at least one test in WPT: `websockets/unload-a-document/002.html` (latest results from wpt.fyi). It's difficult to say how many more tests are affected by this. There appears to be just 80 references to `history.back` and `history.forward`, and that may be a fair upper bound.

Judging only from the issues where "bfcache" has been mentioned, it seems as though we're willing to accept tests concerning this behavior as long as they do not fail in non-supporting browsers. I can't find any discussion about the policy itself, hence this issue.

It doesn't seem like we can support this optionally since contributors need to know what will happen if they invoke `history.back` in a test.

If we assume it is present, we'll need to update the relevant tests and instruct authors to circumvent the behavior when it's unwanted. If we assume it is absent, we should include instructions/automation on disabling it whenever possible. In either case, we should make mention of this in the documentation on the infrastructure's assumptions.