RFC: Use systemd-tmpfiles to manage /etc symlinks #47453

Closed
wants to merge 1 commit into from


jameysharp
Contributor

I have only minimally tested this, so please don't merge it without careful review! I'm submitting this pull request primarily to ask for comments on whether this is a good direction.

Motivation for this change

This eliminates a Perl script from NixOS activation; most of the necessary functionality is already in systemd.

I suspect that most of the work that's currently done in the activation script could be done with systemd-tmpfiles instead. Eventually I hope it might be possible to eliminate the activation script entirely, as well as most or ideally all of stage-2-init.sh, which duplicates a lot of functionality that systemd has built-in.

I set off down this road because I needed to build a read-only (squashfs) NixOS image and there are an awful lot of assumptions about writable filesystems which are difficult to reason about when they're scattered across all these impure shell and Perl scripts.

But coming back down to earth, I thought eliminating setup-etc.pl might be a good first step that doesn't impact much of the rest of the boot process.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@Mic92
Member

Mic92 commented Sep 28, 2018

I like the fact that this is much less code.


# finally, create new files in /etc
SYSTEMD_LOG_LEVEL=debug ${config.systemd.package}/bin/systemd-tmpfiles --create ${etc-tmpfiles}
ln -sf ${etc-tmpfiles} /etc/.created

Does this no longer use /etc/static?


How does it make the process mostly atomic?

@symphorien
Member

Does systemd-tmpfiles work even with an empty /etc and without systemd running?

@Mic92
Member

Mic92 commented Sep 28, 2018

I am pretty sure it does. It is designed for re-provisioning systems, e.g. after rm -rf /etc && systemctl reboot.

https://github.com/systemd/systemd/blob/master/src/tmpfiles/tmpfiles.c

@Mic92
Member

Mic92 commented Sep 28, 2018

@arianvp
Member

arianvp commented Sep 28, 2018

This is very nice and I think it's a good first step. Last week I did some research of my own into replacing stage-2-init, for exactly the same reason: I want to create immutable images reliably. Perhaps we can create a tracking issue for that goal where I can put all these brainstorm ideas? For example, systemd can do things like expanding and creating file systems, which we currently do ourselves in the activation script. (https://www.freedesktop.org/software/systemd/man/systemd-makefs@.service.html)

@jameysharp
Contributor Author

@Mic92 You're right, this branch does not try to match the level of pseudo-atomicity of the current /etc/static approach, primarily because that would have been harder to do and I just wanted a simple proof-of-concept.

Here's an outline for a more aggressive change that would allow atomic update of everything on the system at once:

When building a NixOS configuration, generate a squashfs image that contains all the files that need to be placed anywhere on the filesystem when activating the configuration, including /etc and /run/wrappers/bin. The squashfs image can represent permissions and ownership as needed, using the mksquashfs -pf option.

Activation then proceeds in three steps:

  1. For every non-directory p in the new image, ensure symlinks exist from /<p> to /run/filesystem-overlay/<p>. If p doesn't exist in the previous running image, attempts to follow the new symlink will fail, which is "almost atomic".
  2. Atomically update /run/filesystem-overlay to refer to the new squashfs image. (This might mean an atomic symlink, or I don't know, maybe some trick with mount --move?)
  3. Delete all symlinks whose targets are in /run/filesystem-overlay but which no longer exist, as well as their now-empty parent directories. Since the link targets disappeared atomically in step two, this step is also "almost atomic".
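
For illustration only, the "atomic symlink" option mentioned in step 2 could be sketched as follows. This is not code from the pull request: the function name atomic_symlink and the temporary-name scheme are hypothetical. It relies on rename(2) semantics on a POSIX filesystem, where renaming over an existing name replaces it in one step, so readers see either the old link or the new one.

```python
import os


def atomic_symlink(target, link_path):
    """Point link_path at target without a window where link_path is missing.

    Hypothetical helper: the new symlink is created under a temporary name
    in the same directory, then renamed over the final name.
    """
    tmp_path = f"{link_path}.tmp.{os.getpid()}"
    try:
        os.symlink(target, tmp_path)
    except FileExistsError:
        # A stale temporary link from an interrupted run; replace it.
        os.unlink(tmp_path)
        os.symlink(target, tmp_path)
    # os.replace acts on the symlink itself and does not follow it.
    os.replace(tmp_path, link_path)
```

A caller would use it as atomic_symlink("/path/to/new-image", "/run/filesystem-overlay"), where the image path is a placeholder. The mount --move alternative raised in step 2 is not covered by this sketch.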

The actions needed for steps 1 and 3 can be computed before applying any changes: As step 0, generate a tmpfiles.d snippet that can be applied first with systemd-tmpfiles --create to implement step 1, and then again with systemd-tmpfiles --remove to implement step 3. That keeps the "almost atomic" sections as short as possible.
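
As a rough sketch of step 0, the snippet could be computed from the lists of non-directory paths in the previous and new images. Everything here is an assumption made for illustration: the function name tmpfiles_snippet, the choice of "L+" lines (create or replace a symlink) for step 1, and the choice of "r" lines (remove a path when systemd-tmpfiles runs with --remove) for step 3. Paths containing whitespace would need escaping, which this sketch does not handle, and removing now-empty parent directories is left out.

```python
from pathlib import PurePosixPath

# Overlay location as named in the outline above.
OVERLAY_ROOT = PurePosixPath("/run/filesystem-overlay")


def tmpfiles_snippet(old_paths, new_paths):
    """Return tmpfiles.d lines for moving from old_paths to new_paths.

    Both arguments are iterables of non-directory paths relative to /,
    for example "etc/hosts". Hypothetical helper, not the PR's code.
    """
    new_rel = {p.lstrip("/") for p in new_paths}
    old_rel = {p.lstrip("/") for p in old_paths}
    lines = []
    # Step 1: make sure every path in the new image has a symlink.
    for rel in sorted(new_rel):
        lines.append(f"L+ /{rel} - - - - {OVERLAY_ROOT / rel}")
    # Step 3: paths only in the old image are removed by --remove.
    for rel in sorted(old_rel - new_rel):
        lines.append(f"r /{rel}")
    return lines
```

With old paths etc/hosts and etc/old.conf and new paths etc/hosts and etc/passwd, this yields two "L+" lines for /etc/hosts and /etc/passwd and one "r" line for /etc/old.conf. Because the two line types are acted on by different invocations, a single file could in principle serve both the --create and the --remove pass.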

When building a read-only NixOS system image, the above squashfs can just be made part of the final image, skipping all the symlink-management steps.

And @arianvp, I am certainly in favor of a tracking issue for these kinds of ideas. I have NixOS modules locally here for building a read-only NixOS root filesystem in a squashfs, and also for doing the stage-1 initrd using systemd instead of heaps of shell scripts; both work in my limited use cases, but I'm sure they have major flaws for use cases I haven't considered, so I figured I'd sort out things like this issue before trying to upstream that stuff. I'm happy to chat about what I did so far though!

@jameysharp
Contributor Author

Folks interested in this pull request might also be interested in #47563 which I just filed, which is a proof-of-concept for getting rid of some of the shell scripting during stage-2 and activation.

@Mic92
Member

Mic92 commented Sep 30, 2018

From your description, it seems similar to how the current /etc/static is built, except that it does not need a special filesystem. Therefore I do not think squashfs would provide us a benefit. Since you are building read-only systems, you might also be interested in the configuration of the aarch64 builder written by @grahamc, which uses overlayfs instead. Also, I am not sure if everything was non-persistent.

@jameysharp
Contributor Author

I'm still working on related ideas, but I'm abandoning this particular approach for various reasons, including @Mic92's comments.

@jameysharp jameysharp closed this Jul 8, 2019
@jameysharp jameysharp deleted the etc-tmpfiles branch July 8, 2019 17:23
@peterhoeg
Member

I'm still working on related ideas

Any chance you can share something about this? I, too, am not particularly hooked on the current situation (see #47898 as well).

@jameysharp
Contributor Author

I just pushed a WIP branch which I'm not quite ready to turn into a pull request: https://github.com/jameysharp/nixpkgs/tree/deactivation

It gets rid of a few activation scripts for various services, but the big deal is that it can precompute /etc/passwd and /etc/group, rather than generating them with a Perl script during activation, subject to some constraints.
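
To make the idea of precomputing concrete, here is a minimal sketch of rendering passwd content from a fully static user list. It does not reflect the code on the linked branch, and the constraint it assumes (every user has a uid and gid fixed at build time) is a guess at what such constraints might look like. The User type and render_passwd are hypothetical names; only the seven-field passwd line format is standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical record; all fields must be known at build time.
    name: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def render_passwd(users):
    """Render passwd(5) content: name:password:UID:GID:GECOS:home:shell."""
    seen_uids = set()
    lines = []
    for u in sorted(users, key=lambda u: u.uid):
        if u.uid in seen_uids:
            raise ValueError(f"duplicate uid {u.uid}")
        seen_uids.add(u.uid)
        for field in (u.name, u.gecos, u.home, u.shell):
            if ":" in field or "\n" in field:
                raise ValueError(f"invalid character in field {field!r}")
        # "x" means the password hash lives elsewhere (e.g. shadow).
        lines.append(f"{u.name}:x:{u.uid}:{u.gid}:{u.gecos}:{u.home}:{u.shell}")
    return "\n".join(lines) + "\n"
```

A group file could be rendered the same way from a static list of groups and members.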

@peterhoeg
Member

the big deal is it can precompute /etc/passwd and group

very nice!

6 participants