New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Http cache: introduce cache entries, prevent cache misses when resource will be fetched #24203
Conversation
Heads up! This PR modifies the following files:
|
7986748
to
d6ea5dc
Compare
d6ea5dc
to
c4f6a6b
Compare
@bors-servo try=wpt |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
💔 Test failed - linux-rel-css |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The big comment is that this looks like it's structured differently from the spec, which (IIRC) leaves all caching to network_or_cache_fetch
.
Also, it would be nice if a GET
could join on to a GET
in progress, rather than waiting for all its payload to arrive. That might be a big change though.
@@ -224,6 +225,13 @@ pub fn main_fetch( | |||
// Step 11. | |||
// Not applicable: see fetch_async. | |||
|
|||
// Get the cache entry corresponding to the url, after potential hsts switch. | |||
let cache_entry = if let Ok(mut http_cache) = context.state.http_cache.write() { | |||
http_cache.get_entry(&request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, is this spec-complaint? Shouldn't anything that touches the cache be in http_network_or_cache_fetch
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly-speaking, I think it's ok, since at this point we're not actually running any specified cache related logic, we're just getting the entry corresponding to the current request, later to be used to check the cache. So basically the "entry" is the interface to the "cache" from the spec perspective, and it will be used later in http_network_or_cache_fetch
.
I think the spec in this case is not really providing the full algorithm, rather describing the outcome from a black-box perspective.
For example, see the diagram for the state-machine of the Chromium cache, little of which shows up in the spec: https://www.chromium.org/developers/design-documents/network-stack/http-cache
Same thing goes for the Gecko cache: https://developer.mozilla.org/en-US/docs/Mozilla/HTTP_cache#Implementation (more specifically, see their CacheEntry
description).
If anything, ours matches the spec most closely, and yet ours seems to be the most naive implementation of them all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still prefer a solution that kept all the caching logic inside http_network_or_cache_fetch
, I find it confusing moving some bits of caching out. What's the motivation for doing that?
components/net/http_cache.rs
Outdated
response: response, | ||
needs_validation: has_expired, | ||
}; | ||
if cached_resource.aborted.load(Ordering::Relaxed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ordering::Relaxed
? You are a brave soul!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other parts of this workflow also use Relaxed
, and we could change that in this PR as well in another commit, see
servo/components/net/fetch/methods.rs
Line 480 in fb5a121
response.aborted.store(true, Ordering::Relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #24203 (comment)
components/net/http_cache.rs
Outdated
response: response, | ||
needs_validation: has_expired, | ||
}; | ||
if cached_resource.aborted.load(Ordering::Relaxed) { | ||
return None; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we construct a new CachedResponse
at this point and return it, rather than returning None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used inside a while
loop at https://github.com/servo/servo/pull/24203/files#diff-e980a3e3bf7e4cb8a8d3dda60ce7a408R770
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I realize this change was left over from #22813, and I'm not sure if its' really necessary. Basically there is no need to make this return an Option<CacheResource>
, based on the check for aborted
, or at least it's not solving any known problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second-thoughts, I think this is a change that makes sense, since we shouldn't construct an aborted response, and I should have extracted it from the pervious PR at the time.
So I switched to release/acquire for that flow.
@@ -563,6 +580,132 @@ impl HttpCache { | |||
} | |||
} | |||
|
|||
/// Get the cache entry corresponding to this request | |||
pub fn get_entry(&mut self, request: &Request) -> Option<HttpCacheEntry> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly annoyingly that means we need a write lock just to read the cache :( We may as well make it a Mutex
rather than a RWLock
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can change that to a mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we just keep it a RwLock
for now, since everything else on HttpState
uses one?
servo/components/net/http_loader.rs
Line 64 in 9138d1d
pub struct HttpState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, bit of a waste, but not a huge problem.
components/net/http_cache.rs
Outdated
let entry = self | ||
.entries | ||
.entry(entry_key.clone()) | ||
.or_insert(HttpCacheEntry::new(entry_key)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting that invalidation can end up inserting an entry into the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we actually want to return if there is no corresponding entry :)
components/net/http_cache.rs
Outdated
/// if a pending network fetch has already begun, | ||
/// and a pending response has been stored in the cache, | ||
/// or if there is no pending network fetch. | ||
is_ready_to_construct: Arc<(Mutex<bool>, Condvar)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use futures rather than a condvar here? Essentially we're building a Future<Option<CachedResource>>
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would require moving all of fetch to futures/tasks, since a fetch currently runs on a thread, and in such a setup its more reliable to do this with condvars and locks.
There is an issue at #22813
Ok lots of |
On that, with the current changes, there are in fact two separate "wait" workflows:
If by "joining" you mean actually sending the intermediary steps on the |
@bors-servo try=wpt |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
💔 Test failed - linux-rel-css |
@bors-servo try=wpt |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
💔 Test failed - linux-rel-css |
@bors-servo try=wpt (100ms timeout) |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
☀️ Test successful - linux-rel-css, linux-rel-wpt, status-taskcluster |
474dcf3
to
589c09a
Compare
@bors-servo try=wpt (increase timeout from 100 to 500ms) |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
Ok now its really ready for another round of review @asajeffrey |
Oh One way we could deal with that is change how prefetching is done, so that rather than having it just treated like a regular cache access, we treat the first request for the resource as being a cache hit, even if the resource didn't have cache metadata? This does worry me a bit, as it's not clear whether that's spec compiliant, but it would get this test to pass, and would improve pref in cases where we prefetched a resource that turned out to have no cache metadata. I'll read over the latest version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'm still confused about why we're moving caching logic out of http_network_or_cache_fetch
, and there's still the question about how to treat the first fetch of a cached prefetch, but otherwise LGTM.
@@ -224,6 +225,13 @@ pub fn main_fetch( | |||
// Step 11. | |||
// Not applicable: see fetch_async. | |||
|
|||
// Get the cache entry corresponding to the url, after potential hsts switch. | |||
let cache_entry = if let Ok(mut http_cache) = context.state.http_cache.write() { | |||
http_cache.get_entry(&request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still prefer a solution that kept all the caching logic inside http_network_or_cache_fetch
, I find it confusing moving some bits of caching out. What's the motivation for doing that?
@@ -563,6 +580,132 @@ impl HttpCache { | |||
} | |||
} | |||
|
|||
/// Get the cache entry corresponding to this request | |||
pub fn get_entry(&mut self, request: &Request) -> Option<HttpCacheEntry> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, bit of a waste, but not a huge problem.
I understand your concern, and I think the proposed changes actually fit in with your thinking, for the following reasons:
In terms of the motivations of the proposed approach, they are:
Please let me know what you think. |
87e6be1
to
a672080
Compare
a672080
to
eeb3a4e
Compare
OK, we appear to be at a bit of a stalemate here, about whether all caching logic should be in |
In favor of #24318 |
FIX #24166
./mach build -d
does not report any errors./mach test-tidy
does not report any errorsThis change is