nixos/digital-ocean-image: init #58464
150c3ac to f7991b8
Can you share expected example user data?
@colemickens if you mean username/password: it's just nixos/nixos. It should be able to sudo without a password. I got into a console and looked at the systemd journal. I can't copy-paste from the console so I apologize for the screenshot of text instead of copy-paste: It looks like cloud-init isn't very robust and doesn't like having "nixos" as the distro. I'll look into that more tomorrow.
@eamsden I meant example user-data you're passing into the Droplet config, or are you not specifying any and just expecting to see the SSH keys? The problem isn't there. If you scroll further, you see that the network portion of the "vendor-data" config fails to apply. It then doesn't process the user-data or ssh keys, I suspect. But this just makes me question all the more if it's worth the fun of |
I may have been looking at the wrong log; this is the most recent
I added the template file it was complaining about (see 37d3a52) and now it's not able to write /etc/hosts because NixOS makes it read-only. Is there an idiomatic way to override that? Log:
I don't know. I'm still stuck on the last line of the log, which remains the same: To be honest, the time already spent reading cloud-init logs reminded me why I've gone to lengths to avoid this in the past, especially on NixOS where its only real utility is better served with a few lines of bash. In case it's useful -- I did write up a nixos/maintainers script that will auto-build and upload the image to Digital Ocean. I got the iteration cycle down to a few minutes. I also opened another PR that reduces the size of the image by about 230MB by changing the
@colemickens Perhaps I should go ahead and build a utility that can parse the Digital Ocean metadata and set things properly. The only thing that wouldn't work would be resetting the root password, since (for obvious reasons) they don't expose that in the metadata, but it isn't too hard to do from the nix config anyway if you really want a root password set.
Oh, I just realized this is a draft PR. So, my comments might not be relevant at all!
@nlewo the stuff you mentioned in the review is to help me debug the nixos image on digital ocean, and will most certainly be removed before the draft tag comes off the PR
It looks like the best approach is going to be a very optional module that can load bits and pieces of configuration from the config drive. Digital Ocean uses an OpenStack config drive so maybe we should just have a generic OpenStack service module? This would let us set (optionally according to the NixOS config)
from the Digital Ocean metadata. Further, we could do as the Amazon AMI image does and allow loading of a NixOS module from the user data, so users of the NixOS image could quickly spin up a custom NixOS system directly from a cloud console. The goal with this is not to enforce compliance with the host provider's metadata, but to make starting with NixOS in a Digital Ocean droplet as painless as possible, while also not interfering with the operation of e.g. NixOps or other deployment tools.
Note that you can just query the metadata service directly: https://developers.digitalocean.com/documentation/metadata/ You could put this in a systemd unit that starts up at boot:
and:
This is similar to what the NixOS amazon, GCE and Azure images do |
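As a rough illustration of the kind of script such a boot-time unit could run, here is a minimal sketch and not the code from this PR. It assumes the link-local metadata address used elsewhere in this thread and the `public-keys` metadata key; the helper names `fetch_metadata` and `write_authorized_keys` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only. The base URL is an assumption taken from the address used in this thread.
METADATA_URL="${METADATA_URL:-http://169.254.169.254/metadata/v1}"

# fetch_metadata KEY: print the value of one metadata key (hypothetical helper).
# --retry-connrefused only has an effect together with --retry.
fetch_metadata() {
  curl --silent --fail --retry 5 --retry-connrefused "$METADATA_URL/$1"
}

# write_authorized_keys DIR: store the droplet's public keys in DIR/authorized_keys.
# The keys are written to a temporary file first, so a failed fetch never
# truncates an existing authorized_keys file.
write_authorized_keys() {
  local dir="$1" tmp
  mkdir -p "$dir" || return 1
  chmod 700 "$dir"
  tmp=$(mktemp "$dir/authorized_keys.XXXXXX") || return 1
  if fetch_metadata public-keys > "$tmp"; then
    chmod 600 "$tmp"
    mv "$tmp" "$dir/authorized_keys"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

A unit script would then call something like `write_authorized_keys /root/.ssh` once networking is available.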
@arianvp I am working on something similar to this. The issue is that we cannot depend on networking being up when we start to configure the system. Fortunately, there is a config drive available in a standard location, with the same JSON metadata available. I'm working on making sure the instance can configure its networking etc from this.
Why can't we depend on the network being up? This is how the coreos digital ocean image works too.
Interesting. Thank you! It wasn't clear from the Digital Ocean documentation that we could depend on DHCP being enabled and configure networking that way. Since it appears that we can, I'll go with your suggestion as soon as I get time to hack on it again.
I was able to successfully boot an image, set up ssh, and hostname, with this config: Feel free to take inspiration from it, or else I'm also happy to provide a patchset myself on top of this PR |
@arianvp I'm certainly not going to complain if someone else does work. :) That said, my plan this week is to take the config you wrote and wrap the systemd services in options, so a user can turn them off in an updated system config, and add something similar to the Amazon/GCE setup where a user can put a NixOS configuration module in the user data and have the machine rebuild itself to that config on startup. I'd also like to make sure the default config is present on the system so that the user can use /etc/nixos/configuration.nix to configure the system without having to manually replicate everything the image does for Digital Ocean compatibility. |
Yeah, good idea. We should probably disable the hostname fetching when someone sets the hostname in the NixOS config too, so that nixos-rebuild doesn't race with the metadata daemon.
@arianvp I'm testing an image now and then I'll push, then I have to go to $DAYJOB. Some things I'd like to do before I take off the draft label.
@arianvp (In the latest push I did make systemd set the hostname from metadata only if it is not set in the NixOS config)
A bit of hunting reveals that cloud-utils is pulled in by nixos/modules/system/boot/grow-partition.nix, via the option boot.growPartition. That seemingly innocuous option pulls in cloud-utils, but just for the |
0812e66 to 79a5a96
I'm hoping the bloat will be fixed separately when #58471 is merged, so I'm going to make this into a real PR now. @arianvp @colemickens do either of you want to be in the maintainers list for this image? Or should I leave it as just me?
You can add me to the maintainers list. I'll be actively using this module so I don't mind maintaining it. Good job on the size hunting.
… exists before writing it from user data
@arianvp I wasn't able to find you in https://github.com/NixOS/nixpkgs/blob/master/maintainers/maintainer-list.nix. Am I searching the wrong handle perhaps? I was grepping for 'arianvp'.
@eamsden I don't maintain any packages so far, so I'm not in that list yet =)
set -e
TEMPDIR=$(mktemp -d)
curl --retry-connrefused http://169.254.169.254/metadata/v1/vendor-data | munpack -C $TEMPDIR
$TEMPDIR/entropy-seed
There is an error sometimes when this script runs, which causes issues in config switching when importing this module.
Is it the curl part that is failing, or is it the $TEMPDIR/entropy-seed part that is failing?
I think it is the curl not getting data, which is causing munpack to fail. But I'll have to try it again after work.
It didn't seem to be stopping the base system from coming up from the image, but if I build a system closure that imports digital-ocean-config.nix (I was trying this for my website) and I don't turn off the entropy seed fetcher, it gives an error when switching to it.
Actually this is a known issue with munpack, since 2005 at least! https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=211472
That bug report from 2005 is talking about the same version of munpack packaged by nixpkgs now. I think we need an alternate solution, if we are going to grab the entropy data at all.
For what it's worth, I'm using the following variant:
curl http://169.254.169.254/metadata/v1/vendor-data | munpack -tC $TEMPDIR
ENTROPY_SEED=$(grep -rl "DigitalOcean Entropy Seed script" $TEMPDIR)
${pkgs.runtimeShell} $ENTROPY_SEED
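A slightly more defensive version of this idea might look like the sketch below. It is an illustration under assumptions, not the module's code: it downloads to a file first so that munpack never runs on an empty or failed download, and it treats a missing seed script as non-fatal. The function names and the `RUNTIME_SHELL` variable are hypothetical; the URL and the marker string are the ones quoted in this thread. Note that curl's `--retry-connrefused` only takes effect when combined with `--retry`.

```shell
#!/usr/bin/env bash
# Sketch only. URL and marker string are taken from the comments in this thread.
VENDOR_DATA_URL="${VENDOR_DATA_URL:-http://169.254.169.254/metadata/v1/vendor-data}"

# unpack_vendor_data DIR: download the vendor data and unpack its MIME parts into DIR.
# Returns non-zero, without calling munpack, if the download fails or is empty.
unpack_vendor_data() {
  local dir="$1" raw status=0
  raw=$(mktemp) || return 1
  if ! curl --silent --fail --retry 5 --retry-connrefused \
        --output "$raw" "$VENDOR_DATA_URL" || [ ! -s "$raw" ]; then
    rm -f "$raw"
    return 1
  fi
  munpack -tC "$dir" < "$raw" || status=$?
  rm -f "$raw"
  return "$status"
}

# find_entropy_seed DIR: print the first unpacked part carrying the marker comment.
find_entropy_seed() {
  grep -rl "DigitalOcean Entropy Seed script" "$1" | head -n 1 || true
}

# run_entropy_seed DIR: run the seed script if one was found; otherwise skip quietly.
run_entropy_seed() {
  local seed
  seed=$(find_entropy_seed "$1")
  if [ -z "$seed" ]; then
    echo "no entropy seed script found, skipping" >&2
    return 0
  fi
  "${RUNTIME_SHELL:-sh}" "$seed"
}
```

Keeping the raw download outside the unpack directory matters here, because the grep would otherwise also match the undecoded MIME file.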
echo "attempting to fetch configuration from Digital Ocean user data..."
export HOME=/root
export NIX_PATH=/nix/var/nix/profiles/per-user/root/channels/nixos:nixos-config=/etc/nixos/configuration.nix:/nix/var/nix/profiles/per-user/root/channels
userData=$(mktemp)
I copied this from the Amazon init module and changed obvious things, but I haven't tested it yet.
fi
echo "setting configuration from Digital Ocean user data"
cp "$userData" /etc/nixos/configuration.nix
I'm not entirely happy about this part. I'd like to find a way to make sure that digital-ocean-config is in the module list without having to explicitly include it. That way if a user provides a config in the user data they can assume still that the digital ocean stuff is in place, and disable it via the options if necessary.
cc @arianvp thoughts?
Usually the pattern that people are used to is:
# configuration.nix
{ imports = [ ./hardware-configuration.nix ];
}
right? At least, that is the standard /etc/nixos/configuration.nix that is being generated.
Then just put the DO-specific things in /etc/nixos/hardware-configuration.nix
It's not ideal because if people forget, the config won't work... but it is the same as on their laptops.
The other thing we could do is:
cp $userData /etc/nixos/do-userdata.nix
And then hardcode /etc/nixos/configuration.nix to be:
{ imports = [./do-userdata.nix <nixpkgs....blah../do-config.nix> ]; }
Then people don't have to remember to include the hardware-configuration manually.
I was thinking along those lines too. But maybe we should stick to what the AWS/GCE images do, which is to put the user data in /etc/nixos/configuration.nix, and then make an issue to discuss this. Or we could just lead the way in doing things in a bit nicer way.
Actually it would be nice to get @infinisil's take as the code owner on this question.
What @arianvp suggested sounds much better, I don't think there's much to discuss there. I'd just go ahead and do it this way, which can serve as an example for others.
@eamsden shall we go with my suggestion? I'd love to see this merged. Been using it for quite a while already on my own nixpkgs fork
@arianvp Yeah go with your suggestion. I too really want to see this merged
…sh keys when users.mutableUsers is disabled
A few notes from trying this out.
Besides these minor issues it seems to be working pretty well though!
};
};

/* Fetch the ssh keys for root from Digital Ocean */
Can you reuse nixos/modules/virtualisation/{openstack-config.nix,ec2-data,amazon-init.nix} here? There's already a lot of metadata server fetching and applying…
No, not really. They are all very specific to their platforms and I'm not sure how I could get a lot of code reuse out of them.
The openstack-init from openstack-config.nix uses ec2-metadata-fetcher.nix, which creates a service fetching metadata, which is picked up by the service defined in ec2-data.nix (which is imported).
This looks pretty similar to what we're trying to do here (and what we should do in brightbox-image.nix too, btw.)
I made more than one attempt to use openstack to fetch metadata, and eventually settled on just using DigitalOcean's documented metadata service.
I've been using this for the past month or so and would like to see or help it get pushed to completion if possible! So, I'm using this review to organize what I believe are the remaining three issues that have been brought up regarding this PR, and I think it should be in a pretty good shape once they're addressed?
script = ''
  set -e
  TEMPDIR=$(mktemp -d)
  curl --retry-connrefused http://169.254.169.254/metadata/v1/vendor-data | munpack -C $TEMPDIR
As has been noted, this fails due to issues with munpack being unable to name the files properly; we can avoid this with the -t argument and find the file by its header comment:
- curl --retry-connrefused http://169.254.169.254/metadata/v1/vendor-data | munpack -C $TEMPDIR
+ curl --retry-connrefused http://169.254.169.254/metadata/v1/vendor-data | munpack -tC $TEMPDIR
+ ENTROPY_SEED=$(grep -rl "DigitalOcean Entropy Seed script" $TEMPDIR)
+ ${pkgs.runtimeShell} $ENTROPY_SEED
(also remove the following line that executes $TEMPDIR/entropy-seed)
initrd.kernelModules = [ "virtio_scsi" ];
kernelModules = [ "virtio_pci" "virtio_net" ];
loader = {
  grub.device = "/dev/sda";
- grub.device = "/dev/sda";
+ grub.device = "/dev/vda";
All of my droplets have vda rather than sda so this is required for switch to succeed when installing bootloader updates. Not sure if there are machines where sda exists on DO or if this just wasn't tested?
it probably just wasn't tested
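One way to check which disk name a given droplet exposes before settling on a grub.device value is a small probe like the sketch below. It is only an illustration: the helper name `detect_boot_disk` is hypothetical, and it considers just the two names discussed here (vda and sda), preferring vda.

```shell
#!/usr/bin/env bash
# detect_boot_disk [DEV_DIR]: print /dev/vda or /dev/sda, whichever exists first
# under DEV_DIR (default /dev). Uses -e rather than -b so the function can also
# be exercised against a plain directory; on a real system -b would be stricter.
detect_boot_disk() {
  local dev_dir="${1:-/dev}" name
  for name in vda sda; do
    if [ -e "$dev_dir/$name" ]; then
      printf '/dev/%s\n' "$name"
      return 0
    fi
  done
  return 1
}
```

Running `detect_boot_disk` on a droplet and comparing its output with the configured grub.device would catch the mismatch described above before a switch fails.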
fi
echo "setting configuration from Digital Ocean user data"
cp "$userData" /etc/nixos/configuration.nix
As was previously suggested by arianvp:
- cp "$userData" /etc/nixos/configuration.nix
+ cp "$userData" /etc/nixos/do-userdata.nix
+ echo '{ modulesPath, ... }: {
+   imports = [
+     ./do-userdata.nix
+     (modulesPath + "/virtualisation/digital-ocean-config.nix")
+   ];
+ }' > /etc/nixos/configuration.nix
(not necessarily suggesting that the template config be inlined like that but within the constraints of github review ui...)
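For illustration, the same idea can be written as a small function with the template in a heredoc instead of an inline echo. This is a sketch under assumptions, not the code that was merged: the function name `install_user_data` is hypothetical, and two behaviours are my own additions rather than part of the suggestion, namely skipping empty user data and leaving an existing configuration.nix untouched.

```shell
#!/usr/bin/env bash
# install_user_data USERDATA [ETC_DIR]: copy the fetched user data to
# ETC_DIR/do-userdata.nix and generate a wrapper configuration.nix that imports
# both the user data and the Digital Ocean module.
install_user_data() {
  local user_data="$1" etc_dir="${2:-/etc/nixos}"
  # Assumption: empty user data means there is nothing to install.
  if [ ! -s "$user_data" ]; then
    echo "no user data found, leaving configuration alone" >&2
    return 0
  fi
  mkdir -p "$etc_dir" || return 1
  cp "$user_data" "$etc_dir/do-userdata.nix" || return 1
  # Assumption: never overwrite a configuration.nix that already exists.
  if [ ! -e "$etc_dir/configuration.nix" ]; then
    cat > "$etc_dir/configuration.nix" <<'EOF'
{ modulesPath, ... }: {
  imports = [
    ./do-userdata.nix
    (modulesPath + "/virtualisation/digital-ocean-config.nix")
  ];
}
EOF
  fi
}
```

With this layout a user who supplies a module in the user data still gets the Digital Ocean settings, and can turn individual pieces off through the module's options.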
@arcnmx would you want to make a PR based on this PR with your suggested changes? I think this will then be ready to merge
@arianvp sorry was away for a bit but can do!
Closing in favor of #66978
Motivation for this change
A just-works NixOS image that can be uploaded to Digital Ocean, hopefully to be used as a basis for better Digital Ocean support in NixOps.