Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make-disk-image: change to be less VM-centric #24964

Merged
merged 1 commit into from Apr 24, 2017

Conversation

copumpkin
Copy link
Member

@copumpkin copumpkin commented Apr 17, 2017

This changes much of the make-disk-image.nix logic (and thus most NixOS image building) to use LKL to set up the target directory structure rather than a Linux VM. The only work we still do in a VM is less IO-heavy stuff that while still time-consuming, is less of the overall load. The goal is to kill more of that stuff, but that will require deeper changes to NixOS activation scripts and switch-to-configuration.pl, and I don't want to bite off too much at once.

This is a new version of #21943, refactored to take advantage of the new nixos-prepare-root logic I introduce in #23026. There's still a couple of WIP things here but I wanted to get this out there to show what I've been talking about in the other PR.

Motivation for this change

Building disk images with VMs is miserable in EC2 and other cloud virtualization providers, and doesn't work at all in local VMWare due to some shady nested virtualization bugs.

Things done
  • Tested using sandboxing
    (nix.useSandbox on NixOS,
    or option build-use-sandbox in nix.conf
    on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • Linux
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@mention-bot
Copy link

@copumpkin, thanks for your PR! By analyzing the history of the files in this pull request, we identified @domenkozar, @edolstra and @obadz to be potential reviewers.

@copumpkin copumpkin changed the base branch from copumpkin/nixos-install-refactor to master April 17, 2017 13:52
@domenkozar
Copy link
Member

How long does the build take on EC2 with this PR?

@copumpkin
Copy link
Member Author

copumpkin commented Apr 17, 2017 via email

@domenkozar
Copy link
Member

I just had a 3 hour deploy where AMI handling was ~80min so this is highly appreciated!!

@copumpkin
Copy link
Member Author

copumpkin commented Apr 17, 2017 via email

@7c6f434c
Copy link
Member

Hm, tools for cheap image builds sound like something useful for spinoffs, too…

pkgs.vmTools.runInLinuxVM (
let
# Copied from https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/installer/cd-dvd/channel.nix
channelSources = pkgs.runCommand "nixos-${config.system.nixosVersion}" {} ''
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This chunk might not actually be necessary...

@copumpkin
Copy link
Member Author

@7c6f434c yes, I have a larger plan for all sorts of images that I start talking about a bit here: #23052

@copumpkin copumpkin added this to Improve speed & portability in Amazing images Apr 17, 2017
@copumpkin
Copy link
Member Author

Pushed a few more changes that I think take this out of WIP land.

I also ran a few tests for speed in a loop just to get a sense for how this performs.

On (the same across all tries) physical hardware (with VT-X etc.):

Before I merged nixos-prepare-root (in case that affected it negatively):

real	3m39.639s
real	3m38.517s
real	3m40.729s
(I got sick of waiting)

This PR:

real	0m22.107s
real	0m21.942s
real	0m22.242s
real	0m22.511s
real	0m22.127s
real	0m22.318s
real	0m22.215s
real	0m22.328s
real	0m22.200s
real	0m22.235s

On EC2:

I don't have exact measurements handy but my EC2 image builds were in the 25-30 minute range and are now in the 30s-1m range.

@copumpkin copumpkin changed the title [WIP] make-disk-image: change to be less VM-centric make-disk-image: change to be less VM-centric Apr 17, 2017
@copumpkin copumpkin requested a review from cstrahan April 17, 2017 20:47
exportReferencesGraph = [ "closure" metaClosure ];
postVM = ''
${if format == "raw" then ''
mv $diskImage $out/nixos.img
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks nixos/modules/virtualisation/virtualbox-image.nix which relies on $diskImage being set.
The former code had diskImage=$out/nixos.${if format == "qcow2" then "qcow2" else "img"} in PreVM, which makes much sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! What's the best way to test that? Is there a good VM test or a nix-build invocation I can run to make sure it works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I've been able to repro with this:

nix-build nixos/ --no-out-link -A config.system.build.virtualBoxOVA --arg configuration "{ imports = [ ./nixos/modules/virtualisation/virtualbox-image.nix ]; }"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@layus I don't think I can replicate that exact logic sensibly here, but I'll make it work for downstream consumers

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I can build an OVA fine in the latest push. Takes about a minute to build on my machine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tested that novaImage works fine, which seems to be the only other thing that uses make-disk-image.nix. An upcoming PR will consolidate all the other image builders (GCE, Azure, etc.) to use this script (making adjustments here to accommodate variations)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you figured everything yourself. Nice :-).

Some years ago, I tried to build a VBox image using Nix{,OS}. It took one hour to make a trivial image. Now I took me about 8min, download included. There are still three copies where one would be optimal, but it's already way faster.

done
# Set up core system link, GRUB, etc.
NIXOS_INSTALL_BOOTLOADER=1 chroot $mountPoint /nix/var/nix/profiles/system/bin/switch-to-configuration boot
chroot $mountPoint /nix/var/nix/profiles/system/activate
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I be calling the activation script here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As opposed to not calling activate at all? Was the activation script called in the previous flow?

@copumpkin
Copy link
Member Author

@shlevy mind taking a look?

@shlevy
Copy link
Member

shlevy commented Apr 18, 2017

@copumpkin probably not until this weekend, sorry

@copumpkin
Copy link
Member Author

@volth I'd certainly be open to that, but probably not as part of this ticket, which is purely trying to refactor the implementation of the existing interface. Perhaps make a ticket talking about how you might use that? The thought of having builds not managed by Nix blows my mind so I can't imagine it 😄

@copumpkin copumpkin force-pushed the make-disk-image-refactor branch 2 times, most recently from d4516ea to 7f8f168 Compare April 18, 2017 18:55
@copumpkin
Copy link
Member Author

It turns out I actually broke the old make-disk-image.nix in my last nixos-prepare-root PR. I'd rather not put effort into fixing it if we're going to replace it with this in the next day or two. Does anyone expect this PR to be particularly contentious?

@cstrahan
Copy link
Contributor

Does anyone expect this PR to be particularly contentious?

Not I, FWIW.

Copy link
Contributor

@cstrahan cstrahan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the (open?) question regarding /nix/var/nix/profiles/system/activate, this looks good to me.

I think it's reasonable to refactor the filterFn and other channel bits in a follow up.

@copumpkin
Copy link
Member Author

The old thing didn't call activate, nope. I'm not sure why I started doing that or whether it's the correct thing to do. It also forced me to start doing umount -R, which might suggest I'm doing the wrong thing.

@cstrahan
Copy link
Contributor

@copumpkin Perhaps it's best to keep the activate part as it was, then. Also, I could be wrong, but I thought there was a use case in the context of AMIs that went something like this:

  • Create an image with a particular configuration
  • When it first boots, the activation script reads some info from the environment, and possibly pulls down a new configuration.nix, which it then switches to

That's essentially what one of your old PRs did (which I think I contributed a bit to, IIRC), right? My memory is a little foggy here.

I would think calling activate would make it difficult (or impossible) to do such first-boot setup.

@copumpkin
Copy link
Member Author

copumpkin commented Apr 21, 2017 via email

@shlevy
Copy link
Member

shlevy commented Apr 23, 2017

I'm not really all that familiar with this code and dont' really have an opinion either way. Fine by me if it works.

@cstrahan
Copy link
Contributor

🚢 👍

@copumpkin
Copy link
Member Author

copumpkin commented Apr 23, 2017 via email

This changes much of the make-disk-image.nix logic (and thus most NixOS
image building) to use LKL to set up the target directory structure rather
than a Linux VM. The only work we still do in a VM is less IO-heavy stuff
that while still time-consuming, is less of the overall load. The goal is
to kill more of that stuff, but that will require deeper changes to NixOS
activation scripts and switch-to-configuration.pl, and I don't want to
bite off too much at once.
@copumpkin
Copy link
Member Author

Okay, removed the call to activate, so I'll merge now. Going to be making a few more commits in next few days adding some tests and switching some of the other image builders to use this.

@copumpkin copumpkin merged commit 7025fc6 into NixOS:master Apr 24, 2017
@domenkozar
Copy link
Member

I'll test this shortly. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Amazing images
Improve speed & portability
Development

Successfully merging this pull request may close these issues.

None yet

7 participants