machinectl compliant NixOS installation #67232

Merged
merged 5 commits into from Sep 25, 2019

ck3d
Contributor

@ck3d ck3d commented Aug 22, 2019

  • Patched nixos-install to produce a systemd-nspawn compliant NixOS installation.
  • Patched the NixOS option boot.isContainer to cover the incompatibility of resolvconf and networkd.
  • Patched the NixOS activation script "var" to deactivate immutability of /var/empty in a container.
Motivation for this change

Simple installation of NixOS into a systemd-controlled container via machinectl.
Used networkd to get a working network setup without additional configuration.

see also #9884 and #35364

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
Notify maintainers

cc @nbarbey

@ck3d ck3d changed the title from "container config: better default in case of resolved" to "added nixos test for machinectl" Aug 22, 2019
@markuskowa
Member

markuskowa commented Aug 23, 2019

@GrahamcOfBorg test systemd-machinectl
@GrahamcOfBorg build nixosTests.installer.simple

@ck3d ck3d changed the title from "added nixos test for machinectl" to "machinectl complient NixOS installation" Aug 23, 2019
@ck3d ck3d force-pushed the container-useHostResolvConf branch 2 times, most recently from 94efb79 to 012ed77 August 28, 2019 05:13
Member

@markuskowa markuskowa left a comment

The simple installer test runs locally.

@ck3d ck3d force-pushed the container-useHostResolvConf branch from 012ed77 to dba2d58 August 28, 2019 16:54
@markuskowa markuskowa changed the title from "machinectl complient NixOS installation" to "machinectl compliant NixOS installation" Aug 28, 2019
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review-may-2019/3032/49

name = "systemd-machinectl";

machine = { lib, ... }: {
# use networkd to optain systemd network setup
Contributor

Suggested change
- # use networkd to optain systemd network setup
+ # use networkd to obtain systemd network setup

Contributor Author

Thanks for reviewing; I updated the PR.

@flokli
Contributor

flokli commented Sep 2, 2019

cc @arianvp

fi
nixos-enter --root "$mountPoint" -- /run/current-system/bin/switch-to-configuration boot
Member

Why was this moved out of the if? Why do we need to run switch-to-configuration boot?

Contributor Author

We have to ensure that /sbin/init is available; otherwise machinectl does not know by default what to start.
The NixOS option boot.loader.initScript.enable sets up system.build.installBootLoader, which creates /sbin/init.
I called switch-to-configuration to get that script executed.

Do you see a better way to solve this issue?

Member

No, this seems fine to me.

Member

Hm, why not simply not pass the --no-bootloader flag? Seems to me that you do want to install the boot loader (i.e. the creation of /sbin/init) here.

Member

@arianvp arianvp Oct 7, 2019

Because a machinectl container directory will not have a /boot partition for a bootloader to be installed to, but it will need an /sbin/init to be picked up by systemd-nspawn --boot --directory <path-to-container>.

Member

/boot should be an implementation detail of certain boot loaders (namely GRUB and UEFI), but the initScript boot loader shouldn't require that.
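
The requirement discussed in this thread, that the container directory must contain /sbin/init before systemd-nspawn --boot can start it, can be checked from the host with a small shell sketch. The function name has_sbin_init is hypothetical and not part of this PR; it only tests for the path and makes no claim about how nixos-install creates it.

```shell
# Hypothetical helper (not part of this PR): succeeds if ROOT/sbin/init exists.
has_sbin_init() {
    root=$1
    # -L is checked as well because /sbin/init may be a symlink with an
    # absolute target (for example /init) that only resolves inside the
    # container, so -e alone could report it as missing from the host.
    [ -e "$root/sbin/init" ] || [ -L "$root/sbin/init" ]
}
```

For example, has_sbin_init /var/lib/machines/nixos || echo "no /sbin/init in container root" could be run before trying to boot the directory.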

@arianvp
Member

arianvp commented Sep 12, 2019

Looks good to me!

networking.useNetworkd = true;

# systemd-nspawn expects /sbin/init
boot.loader.initScript.enable = true;
Member

(Not important for merging this) do we perhaps want to make a ./modules/profiles/machinectl.nix that enables this initScript.enable option?

Member

This is a good idea but I would propose nspawn-container.nix as a profile name. This is more general. machinectl is only the tool that handles basic nspawn container functionality.

@lheckemann lheckemann added this to the 20.03 milestone Sep 12, 2019
@lheckemann lheckemann modified the milestones: 20.03, 19.09 Sep 13, 2019
@fpletz fpletz self-assigned this Sep 13, 2019
Avoid assertion in nixos/modules/system/boot/resolved.nix
if service systemd-resolved is enabled.
The activation script is needed to get the missing files in etc/ created.
Needed for container managers like systemd-nspawn.
@fpletz
Member

fpletz commented Sep 13, 2019

Rebased onto current master and squashed some of the fixup commits. Doing a final test run now and will add a short paragraph in the release notes.

Thanks! 🎉

@fpletz
Member

fpletz commented Sep 13, 2019

Please do not merge this yet. There are still some issues with the functionality in this PR, probably due to the recent systemd bump.

In the generated container, some services seem to fail for very weird reasons:

# systemctl -M my-container --failed
  UNIT                                   LOAD   ACTIVE SUB    DESCRIPTION
● nscd.service                           loaded failed failed Name Service Cache Daemon
● systemd-journal-catalog-update.service loaded failed failed Rebuild Journal Catalog
● systemd-logind.service                 loaded failed failed Login Service
● systemd-networkd-wait-online.service   loaded failed failed Wait for Network to be Configured

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

4 loaded units listed.
# systemctl -M my-container status systemd-logind.service
● systemd-logind.service - Login Service
   Loaded: loaded (/nix/store/4vw3gb6dk116y38vwgjv3ymq5mfdcfli-systemd-243/example/systemd/system/systemd-logind.service; enabled; vendor preset: enabled)
  Drop-In: /nix/store/w5v9r1cl65qimpf6jggbaivgl1cgrapb-system-units/systemd-logind.service.d
           └─overrides.conf
   Active: failed (Result: exit-code) since Fri 2019-09-13 18:47:21 CEST; 11min ago
     Docs: man:systemd-logind.service(8)
           man:logind.conf(5)
           https://www.freedesktop.org/wiki/Software/systemd/logind
           https://www.freedesktop.org/wiki/Software/systemd/multiseat
  Process: 242 ExecStartPre=/sbin/modprobe -abq drm (code=exited, status=238/STATE_DIRECTORY)
  Process: 243 ExecStart=/nix/store/4vw3gb6dk116y38vwgjv3ymq5mfdcfli-systemd-243/lib/systemd/systemd-logind (code=exited, status=238/STATE_DIRECTORY)
 Main PID: 243 (code=exited, status=238/STATE_DIRECTORY)

Note the /sbin/modprobe -abq drm that failed.

Additionally, machinectl shell does not work for me.

@arianvp
Member

arianvp commented Sep 13, 2019

This seems to be similar to systemd/systemd#7605

status=238/STATE_DIRECTORY is thrown if systemd cannot create the StateDirectory= directory.

This kind of stuff should only happen if there is something funky going on with /var/lib (e.g. when timesyncd migrated away from DynamicUser but forgot to implement downgrade code, as linked above)

@fpletz
Member

fpletz commented Sep 14, 2019

The StateDirectory failure was caused by me because I had to run nixos-install twice. I started the machine in between those invocations, which chowned the files to the subuid/gid range, but the second nixos-install recreated some directories such as /var/lib owned by host uid 0.

Nevertheless, nscd won't start because machinectl enables user namespaces by default, which breaks DynamicUser support; see the main issue #57083 and the WIP PR #67336.
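
The ownership mismatch described in this comment (a tree shifted to a subuid range while /var/lib was recreated as host uid 0) could be spotted with a sketch like the following. It assumes GNU stat and find, and it assumes that the owner of the container root directory is the uid the rest of the tree is expected to have; list_mismatched_owners is a hypothetical name.

```shell
# Hypothetical helper: print ROOT/var/lib and its direct children whose
# owner uid differs from the owner uid of ROOT itself.
list_mismatched_owners() {
    root=$1
    # Assumption: the container root carries the expected (shifted) uid.
    expected=$(stat -c %u "$root") || return 1
    # -maxdepth 1 keeps the scan to /var/lib and its immediate entries.
    find "$root/var/lib" -maxdepth 1 ! -uid "$expected"
}
```

Empty output means no mismatch was found at that level; any printed path is a candidate for the status=238/STATE_DIRECTORY failure shown earlier.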

@fpletz fpletz modified the milestones: 19.09, 20.03 Sep 14, 2019
@arianvp
Member

arianvp commented Sep 25, 2019

@fpletz what happens if you run this as sudo systemd-nspawn -bD /var/lib/machines/nixos? If this works, then I'm happy merging this as is, since the bug that seems to be stopping us is an upstream systemd issue, not something that is inherently wrong with the image.

@fpletz
Member

fpletz commented Sep 25, 2019

@arianvp This will work (already tried it) but the attached documentation is misleading. Since we're not targeting 19.09 anymore this seems fine to me but we have to ensure we fix the documentation if this isn't fixed in systemd before 20.03.

@ck3d
Contributor Author

ck3d commented Sep 29, 2019

Yes, I will. I am preparing a PR.

vcunat added a commit that referenced this pull request Oct 7, 2019
This reverts commit 66967ec, reversing
changes made to fb6595e.
Fixes #70442; discussion: #70027
@maxbrieiev

So this was reverted.
Is there any way to run NixOS in a container with machinectl/systemd-nspawn on a non-NixOS host?

@minijackson
Member

minijackson commented Dec 25, 2021

@maxbrieiev The current instructions I have are:

cp /var/lib/machines/$MACHINE_NAME/nix/store/*-etc-os-release /var/lib/machines/$MACHINE_NAME/etc/os-release
mkdir /var/lib/machines/$MACHINE_NAME/sbin
ln -s /init /var/lib/machines/$MACHINE_NAME/sbin/init
chattr -i /var/lib/machines/$MACHINE_NAME/var/empty
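
The same steps can be wrapped in a function that takes the container root as an argument. This is only a sketch under assumptions: prepare_nspawn_root is a hypothetical name, the check for exactly one *-etc-os-release match and the removal of a pre-existing etc/os-release are additions not present in the instructions above, and the chattr step is allowed to fail because it needs root and a filesystem that supports the immutable flag.

```shell
# Hypothetical wrapper around the manual steps; ROOT is the container
# directory, e.g. /var/lib/machines/$MACHINE_NAME.
prepare_nspawn_root() {
    root=$1
    # Expand the glob into the positional parameters and insist on exactly
    # one match, since cp to a single file target cannot take several sources.
    set -- "$root"/nix/store/*-etc-os-release
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "expected exactly one *-etc-os-release under $root/nix/store" >&2
        return 1
    fi
    mkdir -p "$root/etc" "$root/sbin"
    # Assumption: replacing an existing file or dangling symlink is wanted.
    rm -f "$root/etc/os-release"
    cp "$1" "$root/etc/os-release"
    # systemd-nspawn expects /sbin/init; point it at /init inside the container.
    ln -sfn /init "$root/sbin/init"
    # Clear the immutable flag on /var/empty; warn instead of aborting.
    if command -v chattr >/dev/null 2>&1; then
        chattr -i "$root/var/empty" 2>/dev/null ||
            echo "warning: could not clear immutable flag on $root/var/empty" >&2
    fi
}
```

It would be called as prepare_nspawn_root "/var/lib/machines/$MACHINE_NAME", typically as root.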
