Add video recording to NixOS VM tests #41165
Conversation
Success on x86_64-linux (full log). Attempted: qemu_test
Failure on x86_64-darwin (full log). Attempted: qemu_test
Related: #33299
Failure on aarch64-linux (full log). Attempted: qemu_test
@dotlambda: This is basically a follow-up to that PR and what I suggested back then. Having a custom UI module in QEMU also has the advantage that we can watch regions for changes, which is particularly useful for making tests with keyboard/mouse input more reliable. Right now we handle input blindly, based for example on the assumption that a particular application has started, but we don't actually know whether an input field has focus or even whether the application is accepting input at all (we only check whether there is an X window, which could be empty). With access to frame deltas we can simply watch for a certain region to change, press a key, wait for another change (and if it doesn't change, repeat the keypress), press the next key, and so on.
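The watch-press-repeat idea described above can be sketched as a small retry loop. All names here are hypothetical, standing in for the planned frame-delta API rather than any existing QEMU or test-driver function:

```c
#include <stdbool.h>

/* Hypothetical callbacks standing in for the planned frame-delta API. */
typedef void (*press_key_fn)(const char *key);
typedef bool (*wait_change_fn)(void); /* true if the watched region changed in time */

/* Press `key`, wait for the watched screen region to change, and repeat
 * the keypress (up to max_tries) if no change is observed. */
bool press_until_change(const char *key, press_key_fn press,
                        wait_change_fn wait_change, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        press(key);
        if (wait_change())
            return true; /* the region updated, so the input was accepted */
    }
    return false; /* no visible reaction after max_tries keypresses */
}
```

This makes the retry logic explicit: a lost keypress is detected by the absence of a frame delta and simply repeated, instead of the test blindly assuming the input arrived.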
There is, however, still an issue with the videos, because not all frames are written to the intermediate file. The number of frames received from QEMU is correct, but the UI module doesn't seem to get all the frame updates from the console. I'm adding a WIP label until I've resolved that.
Nice!
Needed to fix up the first commit because this would break QEMU on non-
@GrahamcOfBorg build qemu_test
Failure on x86_64-darwin (full log). Attempted: qemu_test
Success on x86_64-linux (full log). Attempted: qemu_test
Success on aarch64-linux (full log). Attempted: qemu_test
@GrahamcOfBorg build qemu_test
Success on x86_64-linux (full log). Attempted: qemu_test
Failure on x86_64-darwin (full log). Attempted: qemu_test
Success on aarch64-linux (full log). Attempted: qemu_test
    av_dict_free(&opt);

    return ecode;
}
IMHO the Nixpkgs repo is really not the place to store large C source files.
I could also put them into a separate repository if you'd prefer that.
While this is cool, it significantly increases the complexity of our testing infrastructure, in particular adding a custom version of QEMU that might be a PITA to maintain. How large are the video files generated by this for a NixOS release? Our storage costs are already pretty high...
@edolstra: The intermediate video files are up to 2 MB (they only contain compressed frame deltas, although lossless) and the encoded videos in WebM format are around 2-5 MB. If that's too much we could alternatively just not encode them and only make the intermediate format available so that videos can be encoded on a per-case basis.
I marked this as stale due to inactivity.
I think it would be a shame to let this PR rot.
Right now this is a module that encodes every frame delta from QEMU into an intermediate format, so that we can use FFmpeg to encode it into a proper video. The end goal is to have videos of the graphical machine output for the NixOS tests, which is especially useful in tests involving X, where we're basically blind whenever we run into a race condition or a loaded Hydra node causes the test to fail.

The reason I picked the approach of using an intermediate format and encoding it properly later is based on my benchmarks of some seemingly simple approaches I tried before. First I searched the web for existing solutions and found a few, but they weren't really suitable:

* Use the screendump QMP command to collect frames from the VM, which works to some degree but misses frames.
* Enable SPICE[1] and capture video from the server, which I actually tried to implement before the next option. However, existing solutions for capturing video off a SPICE server are rare, and when testing with my own PoC implementation I got frame drops as well and didn't manage to capture early boot.
* Try a patch[2] from the QEMU development mailing list, which adds an HMP command to capture and encode directly to a video. This was the slowest option of all and even led to test failures, because we got a timeout during VM startup.
* Capturing video using VNC and VncProxy[3] gave results similar to SPICE.

So I dug through the QEMU code base and found out that UI modules get frame deltas from Pixman, which is perfect for us, because we're not losing frames and we also get direct access to pixel data. It is fast as well; I couldn't even properly benchmark the overhead, as tests usually vary in speed by a few seconds anyway. Before actually writing our own intermediate format, I tried to find an existing format that would be suitable for us.
The requirements for this format are support for different frame sizes and variable frame rates, and it needs to be very fast to encode. When I asked in the #ffmpeg channel on freenode, the best existing format for these requirements turned out to be NUT[4] (thanks to "furq" for the suggestion). However, while reading the format specification I came to the realisation that our requirements are so simple that even NUT is complicated in comparison, which is why I wrote our own format. The specification is as follows:

The first byte (the opcode) is either an 'S' (0x53, for "switch") or a 'U' (0x55, for "update") and determines the format of the following data.

A "switch" is a surface change, e.g. a resize of the display. The data following the opcode are the dimensions (width and height, both unsigned 32 bit integers), format (unsigned 8 bit integer) and bytes per pixel (unsigned 8 bit integer, currently either 2 or 4) of the surface.

An "update" is a portion of the region that has changed since the last update. It is followed by the X, Y, width and height coordinates of the updated region (all unsigned 32 bit integers), the absolute time (unsigned 64 bit integer) and the raw frame data. Note that we don't provide a length here, because we can infer it from the bytes per pixel of the last "switch" packet and the coordinates.

All of the data is in the native endianness of the host processor architecture, which is not a problem because encoding of the final video takes place on the same architecture. All of the data is also gzip compressed, so that we don't accumulate gigabytes of frame data during test runs.

I also moved the qemu_test expression out of the default.nix of the main qemu expression, so that when we improve this we don't accidentally break stuff for users of the normal QEMU.
[1]: https://www.spice-space.org/
[2]: https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg00865.html
[3]: https://github.com/amitbet/vncproxy
[4]: https://ffmpeg.org/~michael/nut.txt

Signed-off-by: aszlig <aszlig@nix.build>
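The packet layout described in this commit message can be sketched as a small writer. This is a sketch only: the gzip layer and the QEMU integration are omitted, and the function names are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OP_SWITCH 0x53 /* 'S': surface change */
#define OP_UPDATE 0x55 /* 'U': region update  */

/* Append a "switch" packet (new surface geometry) to buf; returns bytes written.
 * Fields are stored in the host's native endianness, per the spec above. */
size_t emit_switch(uint8_t *buf, uint32_t width, uint32_t height,
                   uint8_t format, uint8_t bpp)
{
    size_t off = 0;
    buf[off++] = OP_SWITCH;
    memcpy(buf + off, &width,  sizeof width);  off += sizeof width;
    memcpy(buf + off, &height, sizeof height); off += sizeof height;
    buf[off++] = format;
    buf[off++] = bpp;
    return off;
}

/* Append an "update" packet header; the caller copies w * h * bpp raw
 * pixel bytes right after it (no explicit length field is stored). */
size_t emit_update(uint8_t *buf, uint32_t x, uint32_t y,
                   uint32_t w, uint32_t h, uint64_t time_abs)
{
    size_t off = 0;
    buf[off++] = OP_UPDATE;
    memcpy(buf + off, &x, sizeof x); off += sizeof x;
    memcpy(buf + off, &y, sizeof y); off += sizeof y;
    memcpy(buf + off, &w, sizeof w); off += sizeof w;
    memcpy(buf + off, &h, sizeof h); off += sizeof h;
    memcpy(buf + off, &time_abs, sizeof time_abs); off += sizeof time_abs;
    return off;
}

/* Length of the pixel payload following an update, inferred from the
 * bytes-per-pixel of the most recent "switch" packet. */
size_t update_payload_len(uint32_t w, uint32_t h, uint8_t bpp)
{
    return (size_t)w * h * bpp;
}
```

Because the payload length is derived rather than stored, a reader must track the last "switch" packet's bytes-per-pixel before it can consume an "update".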
This tool converts the intermediate format that we now get from QEMU when running with --nixos-test outfile into any video format that can be played by most video players, particularly web browsers when watching the videos on Hydra. The video format to be used is determined by the output file name, so it's easy to switch to a format other than WebM (the format this was designed for) by simply changing the file name. I also have to give credit to "kepstin" from the #ffmpeg channel on freenode for helping with the rescaling of the presentation timestamps. Signed-off-by: aszlig <aszlig@nix.build>
This now encodes the raw video frames for every single VM test in our internal format. Due to compression the average video file size is about 3 MB, so I think it's a good idea to enable it by default so we can see post-mortem what went wrong. Instead of killing the machines with SIGKILL during cleanup, we now kill them with SIGTERM, because otherwise the atexit() handler doesn't run and we get a video file without the gzip buffer flushed. I'm also adding a helper attribute called 'videos' to every test, so that all these videos can be encoded on Hydra. The derivation propagates the build products from the actual test runner and also adds the videos to the Hydra build products. Signed-off-by: aszlig <aszlig@nix.build>
This is mainly for Hydra, so that the videos show up in the build products. Encoding to WebM takes a while, especially for long test runs, but it also helps when debugging tests without the need to manually run the encoding process on the actual output path of the test runner's videos. Signed-off-by: aszlig <aszlig@nix.build>
When writing header_size - 1 bytes, we shouldn't check whether header_size bytes have been written. I also added an exit(1) to make sure this is fatal, because otherwise issues such as this one go completely unnoticed. Signed-off-by: aszlig <aszlig@nix.build>
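A hypothetical reconstruction of the fixed check (the function name and the exact header layout are assumptions, not the module's actual code): when only header_size - 1 bytes go through this write() call, the short-write comparison has to be against header_size - 1, and a failure is made fatal so it cannot go unnoticed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write the header minus its first byte (assumed to have been written
 * separately) and make any short write or error fatal. */
void write_header_rest(int fd, const unsigned char *header, size_t header_size)
{
    ssize_t written = write(fd, header + 1, header_size - 1);
    /* The bug was comparing against header_size here instead of
     * header_size - 1, so every write looked like a short write
     * (or, without the fatal exit, the error was silently ignored). */
    if (written < 0 || (size_t)written != header_size - 1) {
        perror("short write on video header");
        exit(1); /* fatal, so the corruption doesn't go unnoticed */
    }
}
```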
Some tests, such as runInMachine, neither have a .videos attribute nor actually need to record videos. However, right now they still do record videos, so we need to fix that soon. Signed-off-by: aszlig <aszlig@nix.build>
It is actually quite common, whenever video frames are written, that a signal interrupts a call to write(). The first option that came to mind was to use sigprocmask() to make sure we don't get interrupted while writing the frame data, but this would also introduce some overhead. So instead, we now just ignore a frame update/switch whenever we reach the end of the file; we avoid that overhead and only lose one frame at the end of the video stream, because we do flush the gzip buffer in the atexit() handler. Signed-off-by: aszlig <aszlig@nix.build>
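The trade-off described here can be sketched as a write helper (a hypothetical helper, not the actual module code): on EINTR the frame is simply dropped instead of blocking signals around the write with sigprocmask(), at the cost of at worst one lost frame near the end of the stream.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write `len` bytes of frame data. Returns 0 if everything was written,
 * -1 on interruption or error, in which case the caller drops the
 * remainder of the current frame instead of retrying. */
int write_frame_chunk(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0)
            return -1; /* EINTR or a real error: give up on this frame */
        p += n;       /* partial write: continue with the rest */
        len -= (size_t)n;
    }
    return 0;
}
```

The more common alternative is to retry on EINTR; the commit message explains why dropping the frame was chosen instead (no signal-masking overhead, and the gzip buffer is still flushed at exit).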
When building with Nix, there is a NIX_BUILD_CORES environment variable set either to the default (which should be equal to av_cpu_count()) or to a user-set value, so let's respect that. If the value is 0 or unset (i.e. not inside a Nix build), we use av_cpu_count() instead. Signed-off-by: aszlig <aszlig@nix.build>
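The core-count logic amounts to a small lookup, sketched below. To keep the sketch free of the FFmpeg dependency, the av_cpu_count() fallback is passed in by the caller; the function name is made up:

```c
#include <stdlib.h>

/* Respect NIX_BUILD_CORES when set to a positive value; if it is 0 or
 * unset (i.e. not inside a Nix build), fall back to a CPU-count probe
 * (av_cpu_count() in the real tool, a caller-supplied value here). */
int encoder_thread_count(int fallback_cpu_count)
{
    const char *env = getenv("NIX_BUILD_CORES");
    if (env != NULL) {
        int n = atoi(env); /* non-numeric values parse to 0 and fall through */
        if (n > 0)
            return n;
    }
    return fallback_cpu_count;
}
```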
If we only want to run tests without actually encoding the videos, it doesn't really make sense to pull in the FFmpeg dependency. When debugging tests, we can still either use the encoder manually or simply use the .videos attribute to get the encoded videos (which of course will then pull in the FFmpeg dependency). Signed-off-by: aszlig <aszlig@nix.build>
Force-pushed from 0d0f82e to 7566b3e.
Rebased against current master.
As much as I appreciate the work put into this, I still don't think patching qemu downstream is the right way forward. Quoting my original comment from three years ago:
Did you try approaching upstream qemu with the problems you're facing? I'm not sure if we do ourselves much of a favor shipping our own custom fork of qemu, and tooling to produce a somewhat ffmpeg-specific intermediate format etc. We certainly can't be the only ones facing this. I'd rather see this somewhere inside or alongside qemu, or spice being fixed to not drop frames (if the underlying architecture allows this).
Please approach upstream to get either this or another reliable way of video recording merged. I don't think this is suitable to ship in nixpkgs only.
I'd like to avoid putting such a module upstream, since it's currently too Nix-specific and I also have plans to use this same module to add UI test functionality (e.g. selecting/matching regions for changes or constraining OCR). Furthermore, upstream already rejected a similar implementation (although that one used FFmpeg, while ours does not).
Addendum: Since QEMU 7.0 there is a new D-Bus display, which as far as I can see calls a D-Bus listener for every frame update. This is a lot closer to what we want, but it seems to always write the full frame image even if only a subset has changed. That's not a show-stopper though, since we can easily work around this limitation if we even need to. I'll experiment with it a bit to see whether it's a viable option or has similar drawbacks to SPICE/VNC.
Sometimes, it's a bit difficult to debug VM tests post-mortem, especially when GUIs are involved. So this adds a new nixos-test QEMU UI module to qemu_test, which writes raw video frame deltas to an intermediate file (the reasons for this are detailed in the commit message of db8e70f), so that it can be encoded to a video file at a later stage.

Encoding of the video is done using a helper tool called nixos-test-encode-video, which encodes the intermediate format into a more commonly recognized one determined by the file name given (for example, nixos-test-encode-video foo.video bar.webm will encode the intermediate foo.video into bar.webm using a WebM container with VP9).

NixOS VM test runners now have another attribute called videos, which will gather all the build products from the normal test runner, encode the videos to WebM and add them to the build products. This could see a bit of improvement in Hydra so that those videos are displayed directly in the browser instead of offered as a download.

I've also added usage of the .videos attribute in release.nix, so that this is done by default but doesn't increase test run time when building the tests directly without going through release.nix.

A Hydra jobset of this branch is available at https://headcounter.org/hydra/jobset/aszlig/nixos-tests.
Partial example GIF from the enlightenment test:
Converted back to draft since there are still a few issues to solve: