New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
virtualbox: 5.2.28 -> 6.0.6 #60943
virtualbox: 5.2.28 -> 6.0.6 #60943
Conversation
@ambrop72 thanks a lot for your work, especially for upstreaming the patches! However, there's already a WIP PR at #53120 - which is blocked on getting the virtualbox VM tests to work again. I tried running the VM test based on your PR, however, they fail with a different error than there:
|
Don't know why I didn't find that, I remember searching. Comparison:
How does one run the VM test? |
No worries! Thanks a lot for taking the time! I hope not too much effort was duplicated… Given patches were already sent to upstream as used here, it might makes sense to continue in this PR. Could you incorporate the other changes from #53120 in here, and copy over the PR checkboxes?
Oracle mentions 32-bit support as being discontinued:
I'm not sure if they actively dropped any code, but given even they don't test it, it probably makes sense to drop support on our side, too.
nixos vm tests can be run via As explained in #53120, I guess the problem is somehow related due to the tests being run inside a qemu vm, but we can't really run tests elsewhere currently (see #5241 for that discussion)… |
I've added all relevant changes from #53120 to this PR. I don't have the same error with the NixOS tests but a guru medidation from virtualbox in the kvm guest:
which is then followed by tons of hex values. This error is the same as encountered in #53120. I have then found this, which I believe to be about the same bug in VirtualBox: https://forums.virtualbox.org/viewtopic.php?f=6&t=91974. In that case they got an almost identical guru meditation, with VirtualBox using SW virtualization (like these NixOS tests do, that's why they must run a 32-bit guest) which happens when a program opens a serial port (but with windows host and guest). That the last kernel message before our crash is about initializing the serial port suggests this is the same problem. I have then tried enabling nested virtualization in KVM:
That actually went surprisingly well, it does seem like VirtualBox is successfully using hardware virtualization within KVM (but I would have to confirm by looking at the VirtualBox machine log). But there is an error with mounting vboxsf in the guest initrd:
This needs to be debugged, I think it's just a regression due to the VirtualBox update. If this works out then at least the tests will work with some manual intervention (enabling nested KVM on Hydra builders is probably not a smart idea because the feature is experimental). |
ea7fa57
to
c12e12a
Compare
With nested KVM I have gotten several tests (probably all others) to work except simple-gui. It is timing out wating for the VM to start. Maybe sending just an enter press no longer works to start the VM. The output says screenshots were made but since the build fails I think they are lost. Build output. |
0c68d98
to
c682682
Compare
With the current state of this PR, the tests pass for me. The precondition is enabling nested KVM on the build machine, by adding Needed changes were:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some nitpicks
sha256 = "0hjx99pg40wqyggnrpylrp5zngva4xrnk7r90i0ynrqc7n84g9pn"; | ||
}) | ||
# https://www.virtualbox.org/ticket/18620 | ||
./fix_kbuild.patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead shipping patches via nixpkgs, can we use fetchpatch here and fetch from their issue tracker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're very small and we have other patches right here in nixpkgs, so I think it doesn't make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nixpkgs policy is to not include patches when possible. Many of the patches in Nixpkgs are added under the same "well there are other patches in here...", but it is generally wrong. When possible (like in most of these cases) the patches should be included via fetchpatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, I'm grateful of your work here -- this PR is in great shape, despite these patches being checked in. In the future, please fetch patches (and @flokli please be a bit more insistent :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, since I managed to not say it in the first two messages (d'oh):
Thank you so much for your work in Nixpkgs, patching, and working with upstream to get this going. I really appreciate it, it makes a big difference to NixOS!
|
||
# Build kernel modules | ||
export INSTALL_MOD_PATH=$out | ||
configurePhase = "true"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doConfigure = "false";
Awesome work! I'm not sure if we can get tests requiring nested kvm support enabled on hydra soon (might need a new feature flag), but this at least makes it possible for maintainers to run the tests manually. Can we switch between "hardware virtualization, 64 bit guest" and "software virtualization, 32 bit guest" via some argument, similar to Did you open an issue in upstreams issue tracker about the broken software virtualization support? |
nixos/tests/virtualbox.nix
Outdated
@@ -29,6 +45,10 @@ let | |||
"${pkgs.dbus.daemon}/bin/dbus-daemon" --fork \ | |||
--config-file="${pkgs.dbus.daemon}/share/dbus-1/system.conf" | |||
|
|||
# Some programs (e.g. VBoxService, systemd-detect-virt) load libdbus | |||
# using dlopen(), make sure that works. | |||
export LD_LIBRARY_PATH=${pkgs.dbus.lib}/lib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be debugged further, whether it's broken on NixOS too (and whether we need to add dbus to rpath)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured that when I refactored the additions package to not skip the fixup phase which includes RUNPATH stripping, the dbus path got stripped out since there was no direct dependency on dbus. I will fix this in a new commit by adding the dbus path in postFixup
(and removing dbus from the first round of patchelf because none of the binaries directly depend on dbus). This leaves VBoxClient
and VBoxService
with dbus in RUNPATH (as found by grepping binaries for libdbus-1.so
). So I think that explains why VBoxService
broke in the initrd.
As far as systemd-detect-virt
is concerned, I suspect what happened is that because mount.vboxsf
previously ended up with a bunch of unnecessary RUNPATH
entries, copy_bin_and_libs "${guestAdditions}/bin/mount.vboxsf"
resulted in libdbus being copied to /lib (or something equivalent, not sure what exactly it does) and found by systemd-detect-virt
. What's interesting is that just running systemd-detect-virt
in NixOS does not result in a libdbus error and strace shows that it does not load libdbus (but I see no point in investigating why).
So I will remove the LD_LIBRARY_PATH
setting in the tests from its current place and add it just for the systemd-detect-virt
call. It would not work without this now because because mount.vboxsf
no longer has dbus in RUNPATH (it doesn't need it) and we don't copy_bin_and_libs
anything that does.
I added another commit with the latest fixes. Tests still pass. Is there anything else to do before it can be merged? |
I was mistaken about |
Should I merge these commits all into one, so they can easily be cherry-picked by anyone who wants it on release versions? |
I will add a flag to control whether HW- or SW-virt is used, and what bitness guest is used, independently, with an assert that SW-virt requires 32-bit guest. |
I will try to reproduce the SW virt problem, but I am not sure I will be successful. |
I'm a bit confused with the github temporary anomaly, but yes please - we probably might want to cherry-pick this to 19.03 too, or should at least have the option to. |
Amazing work @ambrop72, thanks for digging into this further. 🍺 |
I definitely owe @ambrop72 a 🍺 at NixCon ;-)
|
Quite some fixing was needed to get this to work. Changes in VirtualBox and additions: - VirtualBox is no longer officially supported on 32-bit hosts so i686-linux is removed from platforms for VirtualBox and the extension pack. 32-bit additions still work. - There was a refactoring of kernel module makefiles and two resulting bugs affected us which had to be patched. These bugs were reported to the bug tracker (see comments near patches). - The Qt5X11Extras makefile patch broke. Fixed it to apply again, making the libraries logic simpler and more correct (it just uses a different base path instead of always linking to Qt5X11Extras). - Added a patch to remove "test1" and "test2" kernel messages due to forgotten debugging code. - virtualbox-host NixOS module: the VirtualBoxVM executable should be setuid not VirtualBox. This matches how the official installer sets it up. - Additions: replaced a for loop for installing kernel modules with just a "make install", which seems to work without any of the things done in the previous code. - Additions: The package defined buildCommand which resulted in phases not running, including RUNPATH stripping in fixupPhase, and installPhase was defined which was not even run. Fixed this by refactoring using phases. Had to set dontStrip otherwise binaries were broken by stripping. The libdbus path had to be added later in fixupPhase because it is used via dlopen not directly linked. - Additions: Added zlib and libc to patchelf, otherwise runtime library errors result from some binaries. For some reason the missing libc only manifested itself for mount.vboxsf when included in the initrd. Changes in nixos/tests/virtualbox: - Update the simple-gui test to send the right keys to start the VM. With VirtualBox 5 it was enough to just send "return", but with 6 the Tools thing may be selected by default. Send "home" to reliably select Tools, "down" to move to the VM and "return" to start it. - Disable the VirtualBox UART by default because it causes a crash due to a regression in VirtualBox (specific to software virtualization and serial port usage). It can still be enabled using an option but there is an assert that KVM nested virtualization is enabled, which works around the problem (see below). - Add an option to enable nested KVM virtualization, allowing VirtualBox to use hardware virtualization. This works around the UART problem and also allows using 64-bit guests, but requires a kernel module parameter. - Add an option to run 64-bit guests. Tested that the tests pass with that. As mentioned this requires KVM nested virtualization.
Filed the VirtualBux bug for the guru meditation: https://www.virtualbox.org/ticket/18632 I pushed all the changes as a single commit. The tests now run with SW virtualization as before but disable the virtual UART to work around the guru meditation (this suppresses all console output from guests). There are now flags to enable nested KVM, to turn the UART back on and to run 64-bit guests (the latter two require nested KVM). 64-bit guests in fact work (all that is different is the machine Let's get this merged now. |
Manually ran unfree tests too, and spinned up a real VM, looks good :-) Let's merge this 🚀! Again, thanks for all the great work done here! |
The VirtualBox serial port bug will be fixed in the next maintenance release! https://www.virtualbox.org/ticket/18632#comment:1 |
awesome! I also see some of your patches being mainlined: flokli/virtualbox@e9a91b9 :-) |
Quite some fixing was needed to get this to work. See the discussion and comments in commits and comments.
Things done
Tested host and guest (including desktop) on 19.03, no breakage observed.
sandbox
innix.conf
on non-NixOS)nix-shell -p nix-review --run "nix-review wip"
./result/bin/
)nix path-info -S
before and after)