New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automating browsers using Chrome DevTools Protocol #5
Comments
As active participants in the Browser Tools and Testing Working Group, the following folks may have more to add on the topic: @AutomatedTester, @andreastt, @burg, @shs96c, and @whimboo. |
In general I think adding additional executors to wptrunner is fine, so at a high level I don't have any objections. It is of course the responsibility of the people adding the executor to maintain it over the long term. Having said that I think there are some considerations that should be made by anyone planning to add an executor. In terms of API surface the addition of testdriver means we are exposing more, and more complex, interaction APIs to tests, following the rule that any such interaction must be defined in a web standard. At present the only web standard which defines these generic actions is WebDriver, so any non-WebDriver implementation needs to replicate WebDriver semantics. For the marionette executor this is trivial because it implments WebDriver command semantics directly. For CDP that isn't true, so there is more complexity required in the executor. In the current implementation that complexity goes in the ChromeDriver layer, so in effect the proposal here is to rewrite a subset of ChromeDriver in Python so that changes no longer need to be coordinated with the ChromeDriver maintainers. Again, I have no objection if it's easier to take on the extra work, but you should not assume it will be trivial. I also think the benefits in terms of the possible standardisation of CDP are overstated. To make my position clear; I believe that allowing developers to test websites in multiple browsers is an essential part of facilitating a healthy browser ecosystem, and the recent trend to test using Chrome-specific tooling is harmful and must be reversed. I think the superior usability of a duplex remote control protocol is understood to be one of the primary drivers of that change, and standardising CDP looks like the most plausible way of filling that technical gap. But until the standard exists we can't let people writing tests use CDP-only features. Given how long the process of standardisation typically takes, I suspect we are more than a year away from having a standards-track document of sufficent quality to use as the basis for any testdriver features. Of course it is also possible that the political benefits you allude to are a real short term gain from having this code. It is less clear to me why having a CDP implemenation in wpt specifically changes the dynamic there, but it is definitely possible that you're correct. So, again, I have no objections to this change. If people want to add and maintain an executor based on any remote protocol or other way of interacting with the browser, that's fine, as long as they maintain it and as long as it isn't later used as an argument for changing the rules around what's permitted in the standardised testdriver API. |
I'm unconvinced that this is true. As long as only Chrome is using it, it doesn't really promote standardisation. There's been enough concerns about about the willingness of the Chrome team to standardise the protocol (which seems to have been marred for years in internal Google politics), as well as a fair number of concerns around the design of the protocol. While it seems increasingly likely we're going to have ever more desire to standardise CDP with it slowly getting adopted across browsers, as long as it is unclear whether the Chrome team (and the devtools team especially) will be in favour of standardisation and be willing to make any changes from the current protocol in their implementation. Aside from this, I essentially agree with @jgraham's response above, though to add:
Given both ExecutorCDP and ChromeDriver will be/are maintained by the Chrome team, is this not something that can be dealt with at an organisational level within the Chrome team rather than (essentially) multiple parts of the Chrome team maintaining near identical code? That seems very wasteful, but ultimately that's a concern for the Chrome team and not for WPT. |
It also seems premature to me to use CDP in WPT. |
This distinction reduces the cohesiveness of the group maintaining WPT infrastructure. It's at odds with a "team" mentality which I am assuming is desirable. It also risks code rot given that not all contributors are employed to maintain WPT. Given the choice, I would much rather see commitment from the entire group on these RFCs, even if that delays consensus.
Concerns around standardization seem to apply to Chrome's Mojo API as well, but as the proposal probably makes clear, I wasn't able to reconcile that. The first Mojo references were accepted before this group formed, so maybe it's simply grandfathered in from a time with different standards. Or is there some difference between the technologies that I'm missing? |
Is there an important use case for WPT that WebDriver does not address? I want to hear about it. I don't know what you mean by "more expressive tests".
It sounds like the easiest way to fix this situation is for Chromium or its contributors to implement WebDriver support properly. After over a year of talk, there are still no spec drafts nor independent test suites for CDP, so I have no reason to believe there is any upside to this proposal. |
I am not claiming that if this lands you will personally be on the hook for all future patches. But throwing code over the wall and expecting other people to block changes on maintaining it whilst not contributing towards that maintainance is also problematic, even if there is initial agreement that the code can land. To solve this I think such a feature should come with a clear ownership story, which in this case would probably mean that Google's ecosystem-infra team would agree to be primarily responsible for keeping it working. I consider the current situation with executormarionette.py to be analogous, with Mozilla committing to keeping it working. Concretely, to define the scope of that commitment, I would expect that people making small changes would fix all executors, and would be held to that in code review. On the other hand, I would not expect that large changes requiring substantial per-executor work would necessarily be blocked on fixing every executor (although such changes will need an RFC so there will be some oppertunity to object to things that are hard to implement for whatever reason), and I would not expect third parties to scramble to fix problems caused by external changes breaking an executor (e.g. if the underlying protocol changes). For marionette we avoid the latter problem by running it in mozilla-central CI, so we are confident that kind of breakage is as rare as possible.
I thought I was told that Mojo was an unfortunate implementation detail required to enable spec-compliant testing APIs for some standards in Blink. If that isn't the case I agree we should consider ripping out tests that depend on it. Practically this has primarilly affected testing of Blink-only features, so the test compatibility issue hasn't arisen in practice yet. That's very different to the testdriver situation where most APIs are general purpose. |
In those meeting minutes, @shs96c (credited as "simonstewart") said:
@reillyeon (credited as "reillyg") had a slightly more concrete use case:
|
Maybe we could ask for commitment from Chrome to use this CDPExecutor in their infrastructure. That would give us a lot more confidence in the continued integration. @LukeZielinski recently gave a presentation about their efforts to reduce the amount of internal infrastructure Google uses to run WPT; this could be a part of that effort. Do folks at Mozilla use ExecutorMarionette to test Firefox? |
Given part of the motivation is being able to test current It's already the case that application-specific executors are essentially expected to be maintained by those with an interest in the application; the Marionette executor is essentially maintained by @jgraham, the Servo executor is essentially maintained by @jdm, etc. As @jgraham says, we can and do land code that doesn't work cross-executor from day one (an obvious recent example is Perhaps we should make it possible to use the WebDriver executor for all browsers (which support it!) and then we have that as a baseline, and it should, for the foreseeable future, be possible to run all tests against all browsers regardless of executor up-to-dateness?
Yes. |
If CDP is easier to run against content_shell than WebDriver, such that Chrome could switch to using wptrunner internally with the former and not the latter, that would be an excellent reason to have the CDP executor, independent of any other considerations.
This is sort of scope creep, but without actually running that configuration in CI it might well break, but we certainly want to keep using the preferred executors for master runs. So there are at least tradeoffs in doing this (though it may nevertheless be a good idea). |
Although there certainly are things that work better in a bidirectional protocol like CDP than a command-response protocol like WebDriver, note again that it wouldn't be possible to add such features to the testing API until such a time as they were covered by a web standard (in practice, a web standard with vendor buy-in since the RFC process would presumably reject any extensions that couldn't be implemented in multiple browsers). |
With #1 (the process itself) resolved yesterday, let's act as if this RFC was filed yesterday and give at least until Feb 14 before making a decision on this. It is my team (Ecosystem Infra) which has asked for this prototype and now its integration into WPT, so I am of course very supportive of this. What is the point of this? We have identified the difficulty of adding test automation capabilities as a problem for increased use of web-platform-tests in Chromium, affecting a large fraction of new features which are "hard to test" for one reason or another. @LukeZielinski has some diagrams he might be able to share of just how many layers and steps are involved. We're looking at many options for decreasing the overall complexity, perhaps most importantly using wptrunner in Chromium to avoid implementing testdriver.js internals twice. This RFC is another prong of investigation, to see what happens if we remove the WebDriver layer from the puzzle and use CDP directly. Is it faster or slower? Less work to add new features? We don't know yet, and this is an experiment towards getting some answers. It's unknown which of ExecutorWebDriver and ExecutorCDP we'll want to use in which contexts long term. For any capabilities we add to testdriver.js, the behavior has to be well defined and have tests, so that anyone can implement it without reverse engineering. This is the same bar as for web-exposed features. Currently the only way to define vendor-neutral remote debugging commands is via the WebDriver spec or WebDriver extension specs like https://w3c.github.io/permissions/#automation. This will remain the case until there's an alternative. To address some of the specific feedback:
@burg, the type:untestable label tracks things we can't currently test, plenty of which could be solved by new WebDriver endpoints. Some cases like reading console output might be awkward to express using WebDriver, but none impossible because polling can always emulate bidirectional communication.
@jgraham, yes, our team and in particular @LukeZielinski will be responsible for the future of ExecutorCDP.
@jugglinmike, I think we'd like to try full Chrome runs with both ExecutorWebDriver and ExecutorCDP in both Taskcluster (one-off) and Chrome's infrastructure, to see how they compare.
@youennf, given the above, and the precedence of ExecutorMarionette, do you still feel this is premature? |
Thanks @foolip for the great explanation.
The diagrams Philip refers to can be found near the top of the design doc to use WPT in Chromium CI |
That doc isn't public. |
Apologies, should be fixed now. |
The statement "By adopting it today, WPT can promote discussion around standardization and attract more implementers to the process." has the process of standardisation exactly backwards. We only landed support for the webdriver protocol in WPT once the standardisation process was effectively complete, and there was support implemented in multiple browsers. If the primary driver for this effort is to support standardisation of the protocol, then I'd suggest a better approach would be to put forward an initial draft spec for discussion. If the problem is that the chromedriver lags behind the implementation of chrome, I fail to see how that is a problem the W3C needs to overcome, and it would seem to support the risk @jugglinmike highlighted: "it may encourage a major consumer/contributor (Chromium) to bypass the standardization process for test automation features." If the problem is "more expressive tests", then I would be interested in seeing those cases where the webdriver protocol is incapable of expressing the desired behaviour, and understanding which specs are being prevented from adding tests by this. I acknowledge that a bi-directional protocol would be beneficial, and I'm very much in favour of standardising such a protocol. As it stands, the CDP makes a good starting point, but, to my eyes, the proper starting point would be an ED of that spec. |
I fear that the framing of generally "automating browsers" and mention of CDP standardization has made it seem like this proposal goes much further than it actually does. The proposal is to add a product-specific executor similar to executormarionette or executorservo and see if we find it provides any upside in terms of speed or ease of extensibility. At this point we're not suggesting using it by default anywhere, ceasing to maintain ExecutorWebDriver for Chrome, or striving to make ExecutorCDP for any browser other than Chrome. I certainly also agree that having a shared bi-directional protocol would be good, but I don't consider promoting that discussion part of the motivation for this RFC. Before we go there we should see if there are any benefits for even a single browsers. For the time being, WebDriver and its extension specs are the only mechanism we have for defining cross-browser remote control capabilities. Any addition to testdriver.js will go through the RFC process now, and I trust it would not go unnoticed if capabilities not backed by WebDriver are proposed. |
I think to be accepted, the current RFC draft needs to be heavily rewritten if the goal is just to add a product-specific executor, and not positioning it as a longer-term project to supplant at least the the WebDriver executor. |
Maybe |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@jugglinmike following #9, would you mind doing this as a PR? |
FTR, my opinion is that if this is an internal wptrunner change that the Chrome team want and will maintain there isn't a need to write an RFC because it becomes an implementation detail rather than a change in an externally-facing API. Of course I don't want to stop you writing up an RFC if you believe it's helpful to do so. |
@jgraham there would have to be either a new chrome_cdp product or a command line argument to change the executor, do you think that's internal enough to skip the RFC process? There might also be new third-party dependencies, which haven't been spelled out completely in this issue but should be. |
@jgraham is OOO for the rest of this week, FWIW. I think we should move towards being able to use command line args to change the executor, and given in the Chrome case we're likely to have both executors around for a long time we should definitely use an argument for this. Third-party dependencies I think should only be a concern for the RFC process if it affects what's needed for |
Yes. I think everything proposed here counts as internal in the sense that it doesn't produce new requirements or obligations on people not using the proposed code. I think this RFC can be dropped and worked on as a normal issue. However any proposal to add test-accessible features which depend on features of CDP would require an RFC and would almost certainly be rejected at this time, on the basis that there isn't a clear multi-vendor standard for those features. I know no one is proposing such features at present, but I just want to make it very clear that accepting this proposal (or, rather closing it as not requiring an RFC) doesn't create precedent for any future proposals that may depend on it. |
Thanks @jgraham, that all sounds good and in line with my thinking. @jugglinmike, I'll go ahead and close this issue, and the rest can be handled in regular code review. |
Summary
Introduction of an integration with the Chrome browser built on the proprietary Chrome DevTools Protocol ("CDP"). This will be an alternative to the existing WebDriver integration, enabled via explicit opt-in.
Motivation
Although the WebDriver standard is defined to empower web developers to automate browsers, its unidirectional processing model limits the kind of interaction it can simulate. In 2018, the Browser Tools and Testing Working Group acknowledged that a bi-directional protocol extension is desirable and in-scope.
CDP is a functional implementation of such a protocol. By adopting it today, WPT can promote discussion around standardization and attract more implementers to the process. In the longer term, WPT will enjoy the same benefits that motivate the bidirectional protocol: more expressive tests.
In more immediate terms, this will give WPT more control when testing experimental releases of Chrome. That's because Chromedriver occasionally lags behind Chrome; WPT can't test such releases.
Risks
This new code will increase WPT's maintenance burden in two ways:
Separately, it may encourage a major consumer/contributor (Chromium) to bypass the standardization process for test automation features.
Details
This was first proposed by @gsnedders in February of 2018.
The new code will be maintained within the WPT repository itself. It will be subject to the same criteria for code quality and correctness, and the automated verification procedures (as provided by, e.g. TravisCI) will form part of the acceptance criteria for subsequent patches. It will necessarily involve some additional third-party code. Whatever libraries are selected will be compatible with WPT's BSD-style license.
A proof-of-concept was built in the fall of 2018. The source code is available online as is a presentation delivered to review technical considerations.
We expect the support of the Chrome DevTools Protocol maintainers in finalizing the implementation.
Regarding the second risk (bypassing testing standards), WPT already supports such practices today. We maintain tests which reference the Chromium-developed "Mojo" JavaScript interface for standards like Web Bluetooth, Generic Sensors, WebUSB, and WebXR.
The text was updated successfully, but these errors were encountered: