Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy for "Back/Forward Cache" (aka "bfcache" or "Page Cache") #16359

Open
jugglinmike opened this issue Apr 15, 2019 · 59 comments
Open

Policy for "Back/Forward Cache" (aka "bfcache" or "Page Cache") #16359

jugglinmike opened this issue Apr 15, 2019 · 59 comments

Comments

@jugglinmike
Copy link
Contributor

Firefox and Safari implement a cache which drastically alters the behavior of history.back and history.forward. Chrome is experimenting with a similar feature.

The optimized behavior is at odds with at least one test in WPT: websockets/unload-a-document/002.html (latest results from wpt.fyi). It's difficult to say how many more tests are affected by this. There appears to be just 80 references to history.back and history.forward, and that may be a fair upper bound.

Judging only from the issues where "bfcache" has been mentioned, it seems as though we're willing to accept tests concerning this behavior as long as they do not fail in non-supporting browsers. I can't find any discussion about the policy itself, hence this issue.

It doesn't seem like we can support this optionally since contributors need to know what will happen if they invoke history.back in a test.

If we assume it is present, we'll need to update the relevant tests and instruct authors to circumvent the behavior when it's unwanted. If we assume it is absent, we should include instructions/automation on disabling it whenever possible. In either case, we should make mention of this in the documentation on the infrastructure's assumptions.

@jugglinmike
Copy link
Contributor Author

I'd prefer to assume absence because the feature isn't standardized, but I don't know if/how it can be disabled in the browsers we support.

@gsnedders
Copy link
Member

In principle, the bfcache is just an optimisation that's only sorta observable. I think @annevk was planning on updating how HTML sorta acknowledges the existence of bfcaches in plenty of browsers to actually be more accurate (because with Blink implementing a bfcache we'll then have every major browser having one).

We'll want to run some tests with the bfcache enabled (though it's a bit awkward, like testing any caching behaviour, given eviction is implementation defined). Are there any we want to run with it disabled? Testing behaviour that isn't actually in the shipped configuration isn't that useful.

@annevk
Copy link
Member

annevk commented Apr 16, 2019

It is standardized, no? It's the pagehide / pageshow events, the salvageable state on documents, session history entries being able to point directly to Document objects, the fully active concept, etc. It might not be great, but it's better than a lot of things.

@jugglinmike
Copy link
Contributor Author

In principle, the bfcache is just an optimisation that's only sorta observable.

It's hard to discuss observability in relative terms. More concretely: the feature directly undermines the expectations of the test I referenced above. That test sets window.location to trigger navigation to another document. The new document invokes history.back() and expects the following:

  • the scripts in the first document are evaluated a second time
  • the timers created from the prior evaluation have been destroyed

These may not happen if the UA is using a bfcache. Safari 12.0 fails for this reason.

My first suggestion is that we document an assumption: "history.back() and history.forward() do not trigger a navigation" and update tests that assume otherwise to use alternate methods (e.g. setting window.location).

My second suggestion is that we document as assumption: "The browser does not implement a so-called 'bfcache'" and take steps to ensure it is disabled in each browser.

Although I initially favored the second suggestion, @annevk's input that there is spec prose governing the behavior makes the first suggestion seem more appropriate.

@annevk
Copy link
Member

annevk commented Apr 17, 2019

I don't understand what documenting an assumption means. history.back() may trigger a navigation if there's no Document object on the prior session history entry, as per https://html.spec.whatwg.org/multipage/browsing-the-web.html#traverse-the-history.

@jugglinmike
Copy link
Contributor Author

@annevk I'm referring to the "Assumptions" document in the project's guide for writing tests. For the purposes of consistency, I think we have to instruct test authors to assume that navigation will not occur because the presence of the Document in the session history entry (and thus the possibility of a navigation) is at the discretion of the UA https://html.spec.whatwg.org/multipage/history.html#the-session-history-of-browsing-contexts

@annevk
Copy link
Member

annevk commented Apr 18, 2019

That document reads more to me like "expected configuration", not "assumptions". Whether back() navigates or swaps with an existing Document object is quite different from that. It's much more of a cache thing which we shouldn't expect anything of in general, but might be okay within a set of delineated tests.

@jugglinmike
Copy link
Contributor Author

@annevk Essentially, I want to update the referenced test to set window.location instead of invoking history.back. That alone will make the test behave consistently in Safari, but future tests could just as easily make the same mistake.

The reason I've raised this in terms of WPT policy is that the flaky test seems like a symptom of a larger problem: the bfcache behavior can subvert the expectations of test authors. I agree that authors should not assume a cache is in use. They also shouldn't assume a cache isn't in use. The recommendation I want to make is "don't use history.back or history.forward for navigation because we can make no guarantees about how the browser will respond."

Maybe the "Assumptions" document isn't the best place to put this, but I still feel that this needs to be said somewhere. Simply fixing the test seems inconsistent--we'd be acknowledging the use of those APIs as a bad practice without telling other folks to avoid it.

@annevk
Copy link
Member

annevk commented Apr 22, 2019

So looking at the test I think it's actually testing bfcache behavior (it's affected by WebSocket), so I'd be opposed to changing it. This might show a bug in Safari.

@jugglinmike
Copy link
Contributor Author

Can you elaborate on what text you feel the test is verifying?

@annevk
Copy link
Member

annevk commented Apr 23, 2019

Well, that WebSocket causes unsalvageable to be set and that therefore going back cannot ever go to a cached ddocument.

@jugglinmike
Copy link
Contributor Author

Ahah, the unloading document cleanup steps:

  1. For each WebSocket object webSocket whose relevant global object is window, make disappear webSocket.

    If this affected any WebSocket objects, then set document's salvageable state to false.

However, in that test, the document is being unloaded from the WebSocket's close event handler, which is nested in the open handler. In other words, the WebSocket connection has been established and the closing handshake has been started, so "make disappear a WebSocket" does nothing.

Since the salvageable state is not set to false, the presence of the Document in the session history remains at the discretion of the UA. If that's right, then Safari's flaky behavior is permissible.

@annevk
Copy link
Member

annevk commented Apr 24, 2019

Interesting, though that would mean it's showing a Firefox bug as for some reason Firefox does seem to set salvageable to false, right?

@jugglinmike
Copy link
Contributor Author

Well, Firefox does seem to discard the document, but I don't know if that's because it has incorrectly set salvageable to false. Isn't it also possible that it has removed the document from the session history (or informally, "cleared the bf cache")? Would it be violating any spec text for doing that?

@annevk
Copy link
Member

annevk commented Apr 24, 2019

Perhaps, but it's suspect that it always does so and never fails the test. That makes it much more deterministic than it ought to be, no?

@jugglinmike
Copy link
Contributor Author

Definitely hinky. I've reported it on Mozilla's bug tracker, but the validity of this test is still questionable. As long as the BF cache might be active, I don't think we can assert any specific behavior.

@annevk
Copy link
Member

annevk commented Apr 24, 2019

Well, we want to test salvageable given it's going to be implemented everywhere and given that I suspect we actually want to assume that it works within WPT. And I wouldn't really want to rewrite existing tests that might be testing it so they're no longer testing it (and might not be testing anything relevant anymore).

@jugglinmike
Copy link
Contributor Author

Although I'd be reluctant to maintain tests that are expected to be flaky in conforming runtimes, that seems like a theoretical risk at this point. Safari behaves inconsistently on this test, but the reason is out of scope of this test and possibly of the bfcache. The problem is that sessionStorage occasionally resolves to null following history.back(), even when there is no WebSocket in play (other global APIs are also affected).

I've created a generic test case for this and submitted via gh-16478. Would you mind taking a look?

@zcorpan
Copy link
Member

zcorpan commented Apr 25, 2019

These may not happen if the UA is using a bfcache.

Then the implementation is not conforming to the spec. The spec requires to set the salvageable state to false on the document on navigation if it has any open WebSockets.

https://html.spec.whatwg.org/multipage/browsing-the-web.html#unloading-documents:make-disappear

This is exactly what the test is intending to test.

(Edit: I see now that this was already pointed out earlier.)

@zcorpan
Copy link
Member

zcorpan commented Apr 25, 2019

No, wait,

  • 001.html is testing navigation with an open websocket, so expects "no bfcache"
  • 002.html is testing navigation after the websocket has been closed, so expects "bfcache" with the exception of "this test can fail if the document is unloaded on navigation e.g. due to OOM". Yes, this is inevitably flaky for browsers that have bfcache but can sometimes evict entries from the bfcache while running the test.

I'm not sure how to make it not flaky, short of running the same test several times and asserting that it got the "bfcache" behavior at least once.

@jugglinmike
Copy link
Contributor Author

See also whatwg/html#1931

@fergald
Copy link
Contributor

fergald commented Jan 21, 2021

In chromium we would like to start writing more WPT tests for bfcache. We had to fix a few of our own internal web tests. It looks like none that needed fixing were WPT. We also expect some of these tests to pass/fail depending on whether bfcache is enabled or not. I don't see a way to reliably test without a way to specify that a test needs

  • don't use bfcache
  • aggressively use bfcache
  • either is fine

I can still imagine a situation where because of implementation differences, "aggressively use bfcache" could still cause problems since it has no definition but would still be better than the current state of not being able to test at all.

The alternative of writing tests to detect whether bfcaching has happened and pass in either case is bad. While it makes everything green, you have no guarantee that you tested anything.

Are there any other alternatives?

@jgraham
Copy link
Contributor

jgraham commented Jan 21, 2021

So we could add some mechaism for wpt to opt in to different bfcache behaviours. The most obvious way would be to provide a testdriver API to change the bfcache behaviour for a particular browsing context (or, less ideally, for the whole browser). That depends on this being something we can opt into at runtime. An alternative would be something like a meta element that could map down onto a pref set at browser start time. If there's something that sounds good to implementors here I can help write an RFC and get it implemented in wpt.

@smaug----
Copy link
Contributor

smaug---- commented Jan 21, 2021

Explicitly opt-in to always enable bfcache wouldn't work in cases where one wants to test whether use of some feature
disables bfcache. (implementation might not just work correctly if certain feature is used while the page is in bfcache). And those features unfortunately vary between browsers, and I expect also between browser versions.
(Some new API implementation might initially require that bfcache gets disabled, but then later that particular implementation is improved and can be used with bfcache.)

But maybe we could start with something simple. A way to reliably disable bfcache - that would be the first step.
And all the implementations supporting bfcache should have some common cases when bfcache is enabled
(don't use *unload event listeners, don't use no-store, use noopener, be the only top level browsing context in the browsing context group)

@fergald
Copy link
Contributor

fergald commented Jan 22, 2021

@smaug----

Explicitly opt-in to always enable bfcache wouldn't work in cases where one wants to test whether use of some feature
disables bfcache. (implementation might not just work correctly if certain feature is used while the page is in bfcache). And those features unfortunately vary between browsers, and I expect also between browser versions.

I'm imagining e.g. a test that ensures that e.g. we don't expose location sensors while in BFCache. A browser could just disable BFCache if location sensors are used and so the test would never be able to trigger bfcaching. It seems like PRECONDITION_FAILED would be the right outcome for that test.

Being able to reliably disable would definitely be a start. whatwg/html#5744 already exists but maybe using a header is not suitable and a test-only JS API would be better.

@jugglinmike
Copy link
Contributor Author

I understand how an opt-out could be useful for web application developers. They can't make any assumptions about the contents of their users' history, so the history API is the only way they can reliably revisit previously-viewed documents.

Is that relevant in WPT, though? WPT test authors have total knowledge of the navigation history, so they can always trigger the desired navigation with another mechanism (e.g. the location API). If the bfcache would interfere with their test, can't they circumvent it by choosing some other mechanism?

If so, it seems like we could address the problem in WPT with documentation only. For example:

Do not use the history API unless explicitly testing that feature. The user-agent's behavior for the history API is intentionally underspecified in ways that may interfere with other conformance tests.

@fergald
Copy link
Contributor

fergald commented Jan 26, 2021

There are tests that use the history API that will pass or fail depending on whether BFCache is enabled. E.g.

https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/web_tests/http/tests/history/cross-origin-redirect-on-back.html

(and more here

I don't know if this is not tested in WPT for a good reason or if it's just something that we should migrate from Chromium to WPT but haven't done yet.

These cases are rare but they do exist, they mostly seem to involve the interaction between redirects/posts and the history API.

Is there another way around the problem that I'm not seeing?

@jgraham
Copy link
Contributor

jgraham commented Apr 8, 2021

So I don't think it's possible for vendors to convert PRECONDITION_FAILED into failure for specific tests, or at least the infrastructure for saying "this precondition ought not to fail in my implementation" doesn't exist and seems hard to maintain.

AFAICT the situation with bfcache is:

  • Presence in the cache is observable and bfcache behaviour can cause compat issues.
  • Vendors are largely trying to align on the same model.

So whilst I understand the desire to treat this as an optimisation and not over-specify behaviour, I'm not convinced that's in the best interests of the platform; it seems like there should just be an agreed set of rules that can be tested, even if we have to agree on some additional test-only rule about eviction policy (i.e. a way to enforce the invariant the eviction doesn't happen within a single test so any page that's in bfcache at any point during a test remains so throughout the test).

@fergald
Copy link
Contributor

fergald commented Apr 8, 2021

Chrome infrastructure allows us to expect a range of outcomes for a test, so that when others check in tests for features that are unsupported in chrome we can just set the expectation appropriately and make it pass in the future. What do others do?

More generally, if you cannot assert that that a tests passes for real and doesn't just give PRECONDITION_FAILED how can you know that a test is being run and not just being skipped? E.g. ​this test uses assert_implements_optional which results in PRECONDITION_FAILED. Are you saying I could break the code that calculates support scaling, causing it to result in PRECONDITION_FAILED all of the time and nobody would notice?

So whilst I understand the desire to treat this as an optimisation and not over-specify behaviour, I'm not convinced that's in the best interests of the platform; it seems like there should just be an agreed set of rules that can be tested, even if we have to agree on some additional test-only rule about eviction policy (i.e. a way to enforce the invariant the eviction doesn't happen within a single test so any page that's in bfcache at any point during a test remains so throughout the test).

I'm not sure how this would work. E.g. there are some sensors that should not collect data while a page is in BFCache. In Chrome's impl we do not cache if they are enabled (we will eventually make it so they stop collecting and things work but we have a long list of things to fix). So now imagine a test to ensure that no sensor data is collected while in BFCache. In Chrome, if we just force the page into the cache without fixing the sensor code, we will collect the data and the test will see that data was collected and fail but the test is a privacy test and Chrome's privacy is correct.

Do you have examples of a test you could write using your proposal?

@jgraham
Copy link
Contributor

jgraham commented Apr 8, 2021

My point about PRECONDITION_FAILED is that although we can set a test to expect PRECONDITION_FAILED and so notice if the test later becomes some other status, including PASS, there isn't a good way to know if a precondition failed for expected reasons ("oh we aren't expecting to support this case") or for bad reasons ("we should support this case but don't because of a bug"), other than manually examining each instance like you would for any other failure. So it only really works to help developers understand that what happened is expected when that manual check is trivial ("oh this is a whole optional feature we don't support") rather than the case here where getting PRECONDITION_FAILED might indicate that you're not caching a thing you expect to cache or might indicate that you made intentionally different choices.

In Chrome's impl we do not cache if they are enabled (we will eventually make it so they stop collecting and things work but we have a long list of things to fix). So now imagine a test to ensure that no sensor data is collected while in BFCache.

It seems like there are two different things here. One is a test that no sensor data is collected once you navigate away from the page. That would be the privacy test and wouldn't depend on the bfcache implementation. The other is that enabling sensors doesn't exclude a page from being put in the bfcache. Chrome today would pass the former and fail the latter. That seems totally reasonable.

The situation where we basically say "implementations can refuse to bfcache a page for any reason during tests" seems like it's going to make all the "real" tests implementation-specific, even when there's a commonly agreed set of behaviour that everyone is aiming to support and that could reasonably be tested in a cross-browser fashion.

@fergald
Copy link
Contributor

fergald commented Apr 8, 2021

I don't understand your reply about PRECONDITION_FAILED. We create a bfcache-gps-sensor-test we expect it to be PRECONDITION_FAILED, we check in that expectation (with a comment as to why) and we're done, it continues to yield the same result until some day when you make GPS+BFCache work, then it starts passing and you delete the unusual expectation. There's no ongoing overhead where people have to puzzle over why why it's PRECONDITION_FAILED.

That is exactly how PRECONDITION_FAILED tests work right now until someone implements the feature. It seems entirely in line with the expected usage, when a feature (e.g. bfcache-with-gps enabled) is missing, don't run the test.

Am I missing something about your process? Is it not possible to check in an expectation so that nobody ever has to think about it again unless it changes?

It seems like there are two different things here. One is a test that no sensor data is collected once you navigate away from the page. That would be the privacy test and wouldn't depend on the bfcache implementation. The other is that enabling sensors doesn't exclude a page from being put in the bfcache. Chrome today would pass the former and fail the latter. That seems totally reasonable.

It doesn't seem reasonable to me to fail a WPT if we're spec compliant.

Here's what happens in some imaginary circumstances where we have a test for bfcache+gps under both schemes

With PASS/FAIL:

  • In-spec browsers have fails
  • If we write the test to require bfcache and then test privacy, a browser that doesn't support bfcache+gps gets no new signal if it somehow starts unexpectedly allowing bfcache+gps - we continue to get a FAIL but now it's because we are leaking GPS data in bfcache but the checked-in expectation is FAIL so nothing alerts
  • If we write the test to only test privacy (and pass with/without bfcache) then a browser that does support bfcache+gps can lose coverage if something else causes it to stop using bfcache during this test

In both cases, privacy problems to go undetected.

With PRECONDITION_FAILED/PASS/FAIL:

  • In-spec browsers have no fails
  • A browser that unexpectedly switches from supporting bfache+gps to not supporting it (or vice versa) will not match expectations and so will alert.

API

Finally, on the idea of an API to force caching, that seems quite problematic, we have code that evicts pages from the cache and expects them to be evicted, dropping out of IPC handlers half-way, knowing that the page is doomed. Keeping that page around and pulling it back out of cache could lead to crashes.

As far as I can see, using PRECONDITION_FAILED covers everything and has no downsides as long as everyone has the ability to check in expectations.

@zcorpan
Copy link
Member

zcorpan commented Apr 8, 2021

Is isPreviousPageInBFCache sufficiently expressive? Which page is the "previous" page? Considering that tests may want to exercise edge cases that are currently not interoperable, e.g. whatwg/html#6483

Would it make sense for the API for this be in terms of https://github.com/WICG/app-history ?

cc @jakearchibald @domenic

@jgraham
Copy link
Contributor

jgraham commented Apr 8, 2021

I agree the force caching idea doesn't work. Consider it withdrawn. But nothing I've said recently depends on that.

Fundamentally I think the question here here is "should bfcache be specced in a way that ensures UAs behave consistently, subject to reasonable limits" (c.f. implementation defined limits in the HTML spec). Since the behaviour of the feature is observable from content, the obvious answer is "yes". But since it's regarded as an optimisation, the traditional answer is "no".

If we don't spec the feature in enough detail that we can agree what tests should do, it seems unlikely that we're going to end up with a useful set of shared tests. Implementors will end up writing product-specific tests that require whatever behaviour they happen to implement and largely ignore the "shared" tests which will most likely just reflect the decisions of whichever implementor happens to write them. Whether the status in the case of fundamental disagreement over the behaviour of the feature is PASS, FAIL or PRECONDITION_FAILED doesn't change much here.

@fergald
Copy link
Contributor

fergald commented Apr 10, 2021

I agree with the goal of sharing tests even on optional behaviour and PRECONDITION_FAILED is exactly how to do that.

1 navigate forward
2 assert_implements_optional(previousPageIsInBFCache(), "BFCache is not supported")
3 navigate back
4 assert(pageshow.persisted)

1 enable feature
2 navigate forward
3 assert_implements_optional(previousPageIsInBFCache(), "BFCache is not supported with $feature enabled")
4 navigate back
5 assert(pageshow.persisted)
6 assert that feature didn't do anything terrible while in BFCache (maybe there's nothing to test here for some features)

I don't see what goes wrong with the above and I don't see how it interferes with fully speccing BFCache or with having a battery of shared tests.

@jgraham
Copy link
Contributor

jgraham commented Apr 12, 2021

I agree that if the constraints are a) it must never be required by the specification to put a page in the bfcache (even under test conditions) and b) we must never display FAIL for something that's technically allowed, your solution satisifes the constraints and will allow producing a shared testsuite. I just also believe that the social dynamics of accpeting those constraints will tend to cause people to rely more on browser-specific tests that encode the actual rules they implement in a clear PASS/FAIL format and pay less attention to the shared tests.

If we agreed that any bfcache implementation must follow explicit rules, with some additional constraints imposed on the test scenario specifically (e.g. rules around not having time-based eviction during a test), we'd end up with something more testable and more consistent between browsers; thus helping authors. We could make the tests clearly PASS/FAIL and a browser with no implementation at all would simply accept that they FAIL these tests. That's not so unusual; e.g. there are Blink-only features for which other browsers FAIL the tests not because the implementation is wrong, but because they object in principle to the existence of the feature.

It's possible that I'm wrong about this of course, but in general getting people to pay attention to test failures from tests they didn't write is even harder than getting them to write tests in the first place. It would be useful to get some input from @smaug---- and whoevr works on this on the WebKit side on what would be most likely to make the testsuite useful for their implementation work.

@gsnedders
Copy link
Member

cc @cdumez

chromium-wpt-export-bot pushed a commit that referenced this issue May 24, 2021
back-forward-cache/resources/helper.sub.js:
Helper for A->B->A navigation scenarios.
BFCache state is detected by observing `pageshow` events.
We might want to use more explicit APIs like
`isPreviousPageInBFCache` discussed at
#16359
in the future / in more complicated scenarios, but so far
`pageshow`-based detection seems to work without hurting ergonomics.

back-forward-cache/resources/events.html:
The file that loads `helper.sub.js` and
contains test logic and assertions.
We navigate to `back-forward-cache/resources/back.html` and then
back-navigate to `back-forward-cache/resources/events.html`.
In the case of BFCache is not used, two async_test objects are
created:
- The first one during the initial navigation,
  which never completes, and
- The second one during the back navigation, which will execute
  test assertions and then complete.
This doesn't affect the behavior (the first one seems just ignored),
but might look awkward.

back-forward-cache/events.html:
The main test Document that opens
`back-forward-cache/resources/events.html` using
`window.open()` with 'noopener' option.
This is because (at least) Chromium requires top-level navigations
to trigger BFCache and thus `back-forward-cache/resources/events.html`
is navigated away during the test, but the WPT test infrastructure
doesn't support navigating the main test Document.

testharness.js:
`fetch_tests_from_prefixed_local_storage()` is introduced
to communicate test results from
`back-forward-cache/resources/events.html`
in a similar way to `fetch_tests_from_window()`.

PrefixedLocalStorage.js:
Some basic utility methods for `helper.sub.js` are added.

Bug: 1107415
Change-Id: I034f9f5376dc3f9f32ca0b936dbd06e458c9160b
chromium-wpt-export-bot pushed a commit that referenced this issue May 24, 2021
back-forward-cache/resources/helper.sub.js:
Helper for A->B->A navigation scenarios.
BFCache state is detected by observing `pageshow` events.
We might want to use more explicit APIs like
`isPreviousPageInBFCache` discussed at
#16359
in the future / in more complicated scenarios, but so far
`pageshow`-based detection seems to work without hurting ergonomics.

back-forward-cache/resources/events.html:
The file that loads `helper.sub.js` and
contains test logic and assertions.
We navigate to `back-forward-cache/resources/back.html` and then
back-navigate to `back-forward-cache/resources/events.html`.
In the case of BFCache is not used, two async_test objects are
created:
- The first one during the initial navigation,
  which never completes, and
- The second one during the back navigation, which will execute
  test assertions and then complete.
This doesn't affect the behavior (the first one seems just ignored),
but might look awkward.

back-forward-cache/events.html:
The main test Document that opens
`back-forward-cache/resources/events.html` using
`window.open()` with 'noopener' option.
This is because (at least) Chromium requires top-level navigations
to trigger BFCache and thus `back-forward-cache/resources/events.html`
is navigated away during the test, but the WPT test infrastructure
doesn't support navigating the main test Document.

testharness.js:
`fetch_tests_from_prefixed_local_storage()` is introduced
to communicate test results from
`back-forward-cache/resources/events.html`
in a similar way to `fetch_tests_from_window()`.

PrefixedLocalStorage.js:
Some basic utility methods for `helper.sub.js` are added.

Design doc:
https://docs.google.com/document/d/1p3G-qNYMTHf5LU9hykaXcYtJ0k3wYOwcdVKGeps6EkU/edit?usp=sharing

Bug: 1107415
Change-Id: I034f9f5376dc3f9f32ca0b936dbd06e458c9160b
@hiroshige-g
Copy link
Contributor

I created a PR #28950 that shows a basic framework for BFCache WPTs (especially for simple scenarios). What do you think? If there are no objections, I'd like to merge the PR and upload other WPT PRs for more individual BFCache-related issues based on the PR.

@jgraham
Copy link
Contributor

jgraham commented Jun 2, 2021

I'm very happy to see progress here.

Since the PR adds APIs to testharness.js I think it needs to go through the RFC process to ensure that we get cross-vendor feedback on the API design.

@domenic
Copy link
Member

domenic commented Jun 2, 2021

I'm a bit disappointed by the tests since they don't guarantee that the page gets put into bfcache (e.g. using testdriver or something similar to force it). So I'm not sure how I would ever run the tests locally to debug bfcache; I expect I would get "precondition failed" most of the time. I would just have to hope that the browser decides to cache, I guess, and try to do all my debugging in those runs?

@fergald
Copy link
Contributor

fergald commented Jun 2, 2021

What debugging are you thinking about? Why do you expect to get precondition failed most of the time? We have web tests in chrome that expect bfcache to work every time and to my knowledge they are not flaky.

If we force caching when the browser has a reason to not cache then we're asking for browser crashes.

@domenic
Copy link
Member

domenic commented Jun 2, 2021

Well, for example, if I'm writing a test to see what happens while in bfcache (to test, e.g., a fully active condition), it's not very useful to me if sometimes I'm in bfcache and sometimes I'm not. That just makes the debugging experience very frustrating.

If bfcache works every time, then maybe the tests can assert that, instead of using the precondition failed system.

@rakina
Copy link
Contributor

rakina commented Jun 2, 2021

@domenic, are you concerned that browsers will most likely not cache the page when running web tests? I guess adding enableBFCacheForThisPage as a way to explicitly say to the browser that we'd prefer bfcache to be used wouldn't hurt. It's still possible for the browser to not cache and cause the test to do a PRECONDITION_FAILED, but hopefully less so. Do you think that will help?

@jgraham thank you! So I guess now @hiroshige-g just needs to make a RFC explaining create_remote_prefixed_local_storage and fetch_tests_from_prefixed_local_storage (and maybe the additions to PrefixedLocalStorage.js?)

@domenic
Copy link
Member

domenic commented Jun 2, 2021

are you concerned that browsers will most likely not cache the page when running web tests?

"Most likely" is overstating it. It's more that I have no confidence as to whether they will or not; it's not that I have confidence they usually won't. Maybe they cache when my system has plenty of extra memory? Maybe they cache based on previously observed traffic patterns? It'd be a nightmare to debug such situations...

Do you think that will help?

Yeah, I think so. It'd give me some reassurance that, if a browser has the capability to bfcache the page, then it's trying to do so, and I don't have to worry about some heuristic I might be tripping the wrong direction.

@fergald
Copy link
Contributor

fergald commented Jun 3, 2021

"Most likely" is overstating it. It's more that I have no confidence as to whether they will or not; it's not that I have confidence they usually won't. Maybe they cache when my system has plenty of extra memory? Maybe they cache based on previously observed traffic patterns? It'd be a nightmare to debug such situations...

These are all concerns but any browser that wants to make caching non-deterministic should disable that when running WPTs. In chrome the only thing we consistently tweak like that in tests is --enable-features=BackForwardCacheNoTimeEviction but if we were to add a lot of other heuristics for not-caching then we would also turn them off when running tests. If caching fails consistently or flakily due to some non-heuristic reason then the test or the browser has a bug.

I don't think we should introduce something to say "make bfcache deterministic" at the start of every bfcache test, that should just be the default for all WPT tests.

One problem is that there is no agreed cache-size for WPT tests. We should probably try to agree on something so that nobody writes tests that are doomed to hit PRECONDITION_FAILED for that reason.

@hiroshige-g
Copy link
Contributor

This kind of debugging against heuristics might be more likely in the wild rather than in writing tests, so web developers might have experiences/opinions... not sure though.

As for testing, so far I haven't encountered flakiness around BFCache eligibility, and disableBFCacheForThisPage is probably easily polyfillable, so I expect the impact on test framework design would be limited.

@domenic
Copy link
Member

domenic commented Jun 3, 2021

I don't think we should introduce something to say "make bfcache deterministic" at the start of every bfcache test, that should just be the default for all WPT tests.

I don't think this is realistic. I run WPTs in Chrome Canary with no flags, or just the usual Chrome stable that I run. The point of WPTs is to test browsers as they are, not browsers in a special mode. Indeed, from what I understand wpt.fyi (and the CI for the WPT repository) run in a no-flags mode.

If a test requires a browser to be in a special mode, then it should communicate that to the browser in some way, e.g. using WebDriver.

@fergald
Copy link
Contributor

fergald commented Jun 4, 2021

I don't think this is realistic. I run WPTs in Chrome Canary with no flags, or just the usual Chrome stable that I run. The point of WPTs is to test browsers as they are, not browsers in a special mode. Indeed, from what I understand wpt.fyi (and the CI for the WPT repository) run in a no-flags mode.

BFCache is observable, including outside of tests intended for BFCache. We had to rewrite some of chrome's pre-existing web tests to cope with BFCache. Having observable behaviour controlled by heuristics means that all tests that navigate are potentially subject to this problem (whether they know it or not).

So providing an API won't fix everything but I do see the benefit of not running in a special mode. I think we can wait and see if this problem ever occurs before trying to fix it.

Note, chrome's dev-tools will tell you whether/why a navigation was cached, so that will help.

@domenic
Copy link
Member

domenic commented Jun 4, 2021

This problem has already occured: some of the wpt.fyi preview results generated by CI on the above PR fail in Chrome (and some succeed), since the wpt.fyi CI does not know to run Chrome in a special mode.

@fergald
Copy link
Contributor

fergald commented Jun 4, 2021

This problem has already occured: some of the wpt.fyi preview results generated by CI on the above PR fail in Chrome (and some succeed), since the wpt.fyi CI does not know to run Chrome in a special mode.

I think the only special thing we do right now is disabling the timeout and that should be irrelevant.

It's more likely that that bot is in an group that just doesn't have BFCache at all. No beta or stable desktop chrome has it yet, so it is essentially like any unreleased feature right now.

@domenic
Copy link
Member

domenic commented Jun 4, 2021

The bot runs with experimental web platform features so I don't think that's what causes it. I suspect it's the timeout; it runs many tests in a row and it'd be easy to trigger such a timeout, IIUC?

@fergald
Copy link
Contributor

fergald commented Jun 4, 2021

OK, so if they're enabling BFCache as an experimental feature then I imagine it's got the default timeout (15s whereas real users who are in our rollout groups get 180s.

Regardless of these tests, we should update the default timeout to 180s so that enabling the feature gets people a useful experience and also in the long-term, drop the need to send a timeout in our configs.

@jgraham
Copy link
Contributor

jgraham commented Jun 4, 2021

It's not really true that we run browsers in a totally unmodified state from what ships; it's pretty common to have to set some prefs to get reliable test behaviour. So I don't think that it's totally unreasonable to supply a flag/pref/etc. that disables bfcache eviction based on heuristics. I also think it would be quite reasonable to have an API (in testdriver or as a test-only API in the DOM) to clear the bfcache, if that helps get a clean slate (I think we typically run each test in a fresh top-level-navigable already, so it might not make much difference).

@hiroshige-g
Copy link
Contributor

The bot runs with experimental web platform features so I don't think that's what causes it. I suspect it's the timeout; it runs many tests in a row and it'd be easy to trigger such a timeout, IIUC?

IIUC BFCache is not included in the experimental web platform features (see https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/platform/runtime_enabled_features.json5;l=246?ss=chromium) and the failures are expected for me (due to lack of --enable-features=BackForwardCache flags). The test are passing on virtual/bfcache tests (which sets --enable-features=BackForwardCache) on Chromium bots and fails on non-virtual tests (e.g. https://ci.chromium.org/ui/p/chromium/builders/try/linux-rel/701591/overview).

chromium-wpt-export-bot pushed a commit that referenced this issue Jul 9, 2021
back-forward-cache/resources/helper.sub.js:
Helper for A->B->A navigation scenarios.
BFCache state is detected by observing `pageshow` events.
We might want to use more explicit APIs like
`isPreviousPageInBFCache` discussed at
#16359
in the future / in more complicated scenarios, but so far
`pageshow`-based detection seems to work without hurting ergonomics.

back-forward-cache/resources/events.html:
The file that loads `helper.sub.js` and
contains test logic and assertions.
We navigate to `back-forward-cache/resources/back.html` and then
back-navigate to `back-forward-cache/resources/events.html`.
In the case of BFCache is not used, two async_test objects are
created:
- The first one during the initial navigation,
  which never completes, and
- The second one during the back navigation, which will execute
  test assertions and then complete.
This doesn't affect the behavior (the first one seems just ignored),
but might look awkward.

back-forward-cache/events.html:
The main test Document that opens
`back-forward-cache/resources/events.html` using
`window.open()` with 'noopener' option.
This is because (at least) Chromium requires top-level navigations
to trigger BFCache and thus `back-forward-cache/resources/events.html`
is navigated away during the test, but the WPT test infrastructure
doesn't support navigating the main test Document.

testharness.js:
`fetch_tests_from_prefixed_local_storage()` is introduced
to communicate test results from
`back-forward-cache/resources/events.html`
in a similar way to `fetch_tests_from_window()`.

PrefixedLocalStorage.js:
Some basic utility methods for `helper.sub.js` are added.

Design doc:
https://docs.google.com/document/d/1p3G-qNYMTHf5LU9hykaXcYtJ0k3wYOwcdVKGeps6EkU/edit?usp=sharing

Bug: 1107415
Change-Id: I034f9f5376dc3f9f32ca0b936dbd06e458c9160b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants