[Taskcluster] Switch to repeat-only for stability checks #25149
Merged
We frequently hear from test authors that they find it frustrating when
the stability checks turn up failures that they cannot reproduce. One
common case comes from the fact that stability checks run the same test
repeatedly without restarting the browser (the AAABBBCCC behavior). This
is unlike any other way we run tests, and can cause some tests to
consistently appear flaky due to global state (e.g. the
origin-isolation/ tests).
To fix this, switch the stability checks to only use 'repeat-restart'
flake detection (previously we used both 'repeat-loop' and
'repeat-restart'). This mode runs the whole set of tests, then restarts
the browser and runs the set again, aka ABC[restart]ABC[restart]ABC. The
hope is that we will not lose too much flake coverage, but will reduce
the number of non-addressable flakes that are reported.
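
To make the difference between the two orderings concrete, here is a minimal Python sketch. It is illustrative only and is not the wptrunner implementation: the function names `repeat_loop_order` and `repeat_restart_order` and the `RESTART` marker are hypothetical, and it models only the order in which tests execute, not how the browser is actually restarted.

```python
from typing import List

# Hypothetical marker standing in for a browser restart between runs.
RESTART = "[restart]"


def repeat_loop_order(tests: List[str], repeats: int) -> List[str]:
    """Model of the AAABBBCCC behavior: each test is run `repeats`
    times back to back in the same browser before moving on."""
    return [test for test in tests for _ in range(repeats)]


def repeat_restart_order(tests: List[str], repeats: int) -> List[str]:
    """Model of the ABC[restart]ABC behavior: the whole set is run
    once, the browser is restarted, and the set is run again."""
    order: List[str] = []
    for run in range(repeats):
        if run > 0:
            order.append(RESTART)  # restart only between full runs
        order.extend(tests)
    return order


if __name__ == "__main__":
    tests = ["A", "B", "C"]
    print("".join(repeat_loop_order(tests, 3)))
    print("".join(repeat_restart_order(tests, 3)))
```

In the first ordering, the second and third runs of a test see whatever global state its first run left behind in the same browser; in the second ordering, each repetition of the set starts from a fresh browser.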
This also makes it more feasible to implement a timeout-avoiding
mechanism for the stability checks; see
https://docs.google.com/document/d/1dAlCSHUQldtgWDDTrGJR-ksm19FZZ3k8ppqc5-kSwIk/edit#