nixos/amazonImageZfs: init #106574

Merged
merged 3 commits into from Aug 25, 2021

Member

@grahamc grahamc commented Dec 10, 2020

Motivation for this change

Create an AWS AMI which uses ZFS as its root filesystem. Submitting a PR for initial review. I'm expecting significant changes to this before merging. A big thanks to Clever's examples and help with this.

This PR was implemented for Flox.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

Member Author

@grahamc grahamc left a comment

Just calling out weird things about what is going on here for purposes of discussion.

nixos/maintainers/scripts/ec2/amazon-image-zfs.nix (outdated, resolved)
nixos/lib/make-zfs-image.nix (outdated, resolved)
nixos/modules/virtualisation/amazon-image.nix (outdated, resolved)
nixos/release.nix (resolved)
Member Author

grahamc commented Dec 17, 2020

@nshalman and I talked about this PR this morning, and we decided to try making 2 disk images: one tiny one for a boot pool, and a second one for the root, to make zpool auto-expand work properly.

Member Author

grahamc commented Mar 3, 2021

It appears the disk images are now fat-allocated for ext4, but not for ZFS.

@samueldr samueldr mentioned this pull request Apr 25, 2021
Contributor

numinit commented Apr 25, 2021

@grahamc Hmmmm. I like this idea. :-)

So, one setup I've been successfully running with (after using the local/system/user mounts for a while)... instead of a separate boot pool and root pool, I've been just using an ESP, swap, and zpool, in that order. GRUB boot on MBR drives (e.g. Amazon, prgmr, etc) is achieved by using a hybrid MBR and storing the kernel in the ESP, so a boot pool ends up not being necessary.

Contributor

numinit commented Apr 25, 2021

To add on... I see a path forward where make-disk-image.nix can still do most of the work here so everyone using it downstream (e.g. nixos-generators) can benefit, and the images will still only require one disk rather than two.

The main issues I saw there were (a) `cptofs`, which is used for copying the closure, doesn't support ZFS and (b) even if it did, the zpool would have to be created inside the VM (since you need to be root with ZFS modules loaded to do that).

But maybe if the VM instantiation in make-disk-image.nix worked a little more like it does here, then the ext4 FS (if the image is being generated with ext4) could be created inside the VM too, and the process for ZFS would be identical.

Contributor

numinit commented May 2, 2021

#121532 may be a good prereq for this. Going to take a stab at getting a ZFS image building next based on this PR.

Member Author

grahamc commented Aug 19, 2021

> instead of a separate boot pool and root pool, I've been just using an ESP, swap, and zpool, in that order.

My first commit used this approach, however it has a fairly troublesome flaw: autoexpand doesn't work on partitions, and for this image I'd like for autoexpand to work correctly. My first commit created a very ugly hack to make it work, but I don't feel comfortable merging like that.

Using two disks, while somewhat inconvenient for our existing tooling, fixes this issue.
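
As a rough illustration of the whole-disk case, the sketch below enables `autoexpand` and then asks ZFS to grow onto the device. It is not taken from this PR or from #135568. The function name `expand_pool` and the `ZPOOL` override variable are hypothetical, and the pool and device names in the usage line are placeholders. If the vdev were a partition, the partition itself would have to be grown first, which is the complication described above.

```sh
# Allow the zpool binary to be overridden (hypothetical knob, handy for dry runs).
ZPOOL=${ZPOOL:-zpool}

# expand_pool POOL DEVICE
# Turn on the pool's autoexpand property, then request that DEVICE be expanded
# to use all space now available on it. Assumes DEVICE is a whole disk that
# has already been resized (for example, an enlarged EBS volume).
expand_pool() {
    pool=$1
    device=$2
    "$ZPOOL" set autoexpand=on "$pool" &&
        "$ZPOOL" online -e "$pool" "$device"
}
```

A call might look like `expand_pool tank /dev/xvdb`, run as root on a machine with the ZFS modules loaded.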

@grahamc grahamc marked this pull request as ready for review August 19, 2021 19:44
@grahamc grahamc requested a review from tomberek August 19, 2021 19:45
nixos/maintainers/scripts/ec2/amazon-image.nix (outdated, resolved)
nixos/modules/virtualisation/amazon-image.nix (outdated, resolved)
nixos/maintainers/scripts/ec2/amazon-image-zfs.nix (outdated, resolved)
nixos/maintainers/scripts/ec2/amazon-image-zfs.nix (outdated, resolved)
nixos/modules/virtualisation/amazon-image.nix (outdated, resolved)
nixos/release.nix (resolved)
@grahamc grahamc marked this pull request as draft August 20, 2021 18:16
@grahamc grahamc force-pushed the amazon-image-zfs branch 2 times, most recently from bac56d4 to 104535e, August 23, 2021 15:46
Member Author

grahamc commented Aug 24, 2021

This PR includes #135568 to auto-expand the pool on startup.

@grahamc grahamc marked this pull request as ready for review August 24, 2021 15:22
@grahamc grahamc force-pushed the amazon-image-zfs branch 2 times, most recently from d62f1d8 to 2012c84, August 24, 2021 20:02
Member Author

grahamc commented Aug 24, 2021

Other than some commit squashing, I think this PR is ready to go.

Member Author

grahamc commented Aug 25, 2021

I've pushed a cleaned up history with what I think are decent commit messages about the code and the why's. I've also extended the image-info.json to have a structure which supports arbitrary collections of disks.

This is a private interface for internal NixOS use. It is similar
to `make-disk-image` except it is much more opinionated about what
kind of disk image it'll make.

Specifically, it will always create *two* disks:

1. a `boot` disk formatted with FAT in a hybrid GPT mode.
2. a `root` disk which is completely owned by a single zpool.

The partitioning and FAT decisions should make the resulting images
bootable under EFI or BIOS, with systemd-boot or grub.

The root disk's zpool options are highly customizable, including
fully customizable datasets and their options.

Because the boot disk and partition are highly opinionated, it is
expected that the `boot` disk will be mounted at `/boot`. It is
always labeled ESP even on BIOS boot systems.

In order for the datasets to be mounted properly, the `datasets`
passed in to `make-zfs-image` are turned into NixOS configuration
stored at `/etc/nixos/configuration.nix` inside the VM.

NOTE: The function accepts a system configuration in the `config`
argument. The *caller* must manually configure the system
in `config` to have each specified `dataset` be represented
by a corresponding `fileSystems` entry.

One way to test the resulting images is with qemu:

```sh
boot=$(find ./result/ -name '*.boot.*');
root=$(find ./result/ -name '*.root.*');

echo '`Ctrl-a h` to get help on the monitor';
echo '`Ctrl-a x` to exit';

qemu-kvm \
    -nographic \
    -cpu max \
    -m 16G \
    -drive file=$boot,snapshot=on,index=0,media=disk \
    -drive file=$root,snapshot=on,index=1,media=disk \
    -boot c \
    -net user \
    -net nic \
    -msg timestamp=on
```
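
The `find` calls in the snippet above yield an empty or multi-line value when the build output does not contain exactly one matching image, so a small guard before launching qemu may help. This is only a sketch: `find_image` is a hypothetical helper and not part of this PR.

```sh
# find_image DIR PATTERN
# Print the single file under DIR whose name matches PATTERN.
# Fails with a message on stderr if there are zero or several matches.
find_image() {
    dir=$1
    pattern=$2
    matches=$(find "$dir" -name "$pattern")
    # Count non-empty lines; grep exits non-zero on a count of 0, hence `|| true`.
    count=$(printf '%s\n' "$matches" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one match for $pattern in $dir, found $count" >&2
        return 1
    fi
    printf '%s\n' "$matches"
}
```

It could be used as `boot=$(find_image ./result/ '*.boot.*') || exit 1`, and likewise for `root`. The trailing slash on `./result/` is kept from the original commands so that `find` descends through the `result` symlink.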

Introduce an AWS EC2 AMI which supports aarch64 and x86_64 with a ZFS
root.

This uses `make-zfs-image`, which implies two EBS volumes are needed
inside EC2: one for boot, one for root. It should not matter which
is identified as `xvda` and which as `xvdb`, though I have always
uploaded `boot` as `xvda`.

Having a disks object with a dictionary of all the disks and their
properties makes it easier to process multi-disk images.

Note the rename of `label` to `system_label` is because `$label`
is something of a special token to jq.
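
To make the `disks` dictionary concrete, here is a minimal sketch of building and reading such a document with jq. It is not the code from this PR. The helper names `write_image_info` and `list_disks`, the `file` property, and the output key `label` are assumptions for illustration; only the `disks` object and the `$system_label` variable name come from the note above. `label` is a reserved word in jq (it introduces `label $name | ... break $name`), which is presumably why a variable called `$label` is awkward.

```sh
# write_image_info SYSTEM_LABEL BOOT_FILE ROOT_FILE
# Emit a small JSON document with one entry per disk under `.disks`.
# The jq variable is named $system_label rather than $label.
write_image_info() {
    jq -n \
        --arg system_label "$1" \
        --arg boot "$2" \
        --arg root "$3" \
        '{
            "label": $system_label,
            "disks": {
                "boot": { "file": $boot },
                "root": { "file": $root }
            }
        }'
}

# list_disks
# Read such a document on stdin and print "name<TAB>file" for every disk,
# however many disks the dictionary contains.
list_disks() {
    jq -r '.disks | to_entries[] | "\(.key)\t\(.value.file)"'
}
```

Because `list_disks` iterates over `to_entries`, a consumer does not need to know in advance whether an image has one disk or several.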
Contributor

@tomberek tomberek left a comment

Examined example AMI. Performed as expected.

@grahamc grahamc merged commit 9ea7f44 into NixOS:master Aug 25, 2021
@grahamc grahamc deleted the amazon-image-zfs branch August 25, 2021 16:08
@github-actions
Contributor

Successfully created backport PR #137665 for release-21.05.

9 participants