Comparing changes
base repository: NixOS/nixpkgs
base: 673c8193cdb1
head repository: NixOS/nixpkgs
compare: dcf40f7c24ee
- 9 commits
- 6 files changed
- 1 contributor
Commits on Mar 14, 2019
nixos: Add 'chroot' options to systemd.services
Currently, if you want to properly chroot a systemd service, you could do it using BindReadOnlyPaths=/nix/store (which is not what I'd call "properly", because the whole store is still accessible) or use a separate derivation that gathers the runtime closure of the service you want to chroot.

The former is the easier method and there is also a method directly offered by systemd, called ProtectSystem, which still leaves the whole store accessible. The latter however is a bit more involved, because you need to bind-mount each store path of the runtime closure of the service you want to chroot.

This can be achieved using pkgs.closureInfo and a small derivation that packs everything into a systemd unit, which later can be added to systemd.packages. That's also what I did several times[1][2] in the past.

However, this process got a bit tedious, so I decided that it would be generally useful for NixOS, so this very implementation was born.

Now if you want to chroot a systemd service, all you need to do is:

    {
      systemd.services.yourservice = {
        description = "My Shiny Service";
        wantedBy = [ "multi-user.target" ];

        chroot.enable = true;
        serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
      };
    }

If more than the dependencies for the ExecStart* and ExecStop* (which btw. also includes "script" and {pre,post}Start) need to be in the chroot, it can be specified using the chroot.packages option. By default (which uses the "full-apivfs"[3] confinement mode), a user namespace is set up as well and /proc, /sys and /dev are mounted appropriately.

In addition - and by default - a /bin/sh executable is provided as well, which is useful for most programs that use the system() C library call to execute commands via shell. The shell providing /bin/sh is dash instead of the default in NixOS (which is bash), because it's way more lightweight and after all we're chrooting because we want to lower the attack surface and it should be only used for "/bin/sh -c something".

Prior to submitting this here, I did a first implementation of this outside[4] of nixpkgs, which duplicated the "pathSafeName" functionality from systemd-lib.nix, just because it's only a single line.

However, I decided to just re-use the one from systemd here and subsequently made it available when importing systemd-lib.nix, so that the systemd-chroot implementation also benefits from fixes to that functionality (which is now a proper function).

Unfortunately, we do have a few limitations as well. The first being that DynamicUser doesn't work in conjunction with tmpfs, because it already sets up a tmpfs in a different path and simply ignores the one we define. We could probably solve this by detecting it and try to bind-mount our paths to that different path whenever DynamicUser is enabled.

The second limitation/issue is that RootDirectoryStartOnly doesn't work right now, because it only affects the RootDirectory option and not the individual bind mounts or our tmpfs. It would be helpful if systemd would have a way to disable specific bind mounts as well or at least have some way to ignore failures for the bind mounts/tmpfs setup.

Another quirk we do have right now is that systemd tries to create a /usr directory within the chroot, which subsequently fails. Fortunately, this is just an ugly error and not a hard failure.

[1]: https://github.com/headcounter/shabitica/blob/3bb01728a0237ad5e7/default.nix#L43-L62
[2]: https://github.com/aszlig/avonc/blob/dedf29e092481a33dc/nextcloud.nix#L103-L124
[3]: The reason this is called "full-apivfs" instead of just "full" is to make room for a *real* "full" confinement mode, which is more restrictive even.
[4]: https://github.com/aszlig/avonc/blob/92a20bece4df54625e/systemd-chroot.nix

Signed-off-by: aszlig <aszlig@nix.build>
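For comparison, the manual approach this commit message describes (pkgs.closureInfo plus a small derivation added to systemd.packages) might look roughly like the sketch below. It is not the code from the linked repositories: `pkgs.myservice` and the unit name are the placeholders from the example above, and the RootDirectory and TemporaryFileSystem values are assumptions chosen for illustration.

```nix
# Sketch only: pkgs.myservice and "myservice" are placeholders, and the
# RootDirectory/TemporaryFileSystem values are assumptions for illustration.
{ pkgs, ... }:

let
  # closureInfo lists the runtime closure of rootPaths, one store path per
  # line, in $out/store-paths.
  closure = pkgs.closureInfo { rootPaths = [ pkgs.myservice ]; };

  # Small derivation that packs one BindReadOnlyPaths= line per store path
  # into a systemd unit file.
  confinedUnit = pkgs.runCommand "myservice-confined-unit" { } ''
    mkdir -p "$out/lib/systemd/system"
    unit="$out/lib/systemd/system/myservice.service"
    {
      echo '[Service]'
      echo 'ExecStart=${pkgs.myservice}/bin/myservice'
      echo 'RootDirectory=/var/empty'
      echo 'TemporaryFileSystem=/'
      while read -r storePath; do
        echo "BindReadOnlyPaths=$storePath"
      done < ${closure}/store-paths
    } > "$unit"
  '';
in {
  # Make the generated unit known to systemd.
  systemd.packages = [ confinedUnit ];

  # Units coming from systemd.packages are not started automatically, so the
  # target is declared on the NixOS side.
  systemd.services.myservice.wantedBy = [ "multi-user.target" ];
}
```

Every confined service needs its own copy of this boilerplate, which is the tedium the new option is meant to remove.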
Commit ac64ce9
nixos/systemd-chroot: Rename chroot to confinement
Quoting @edolstra from [1]:

    I don't really like the name "chroot", something like "confine[ment]" or "restrict" seems better. Conceptually we're not providing a completely different filesystem tree but a restricted view of the same tree.

I already used "confinement" as a sub-option and I do agree that "chroot" sounds a bit too specific (especially because not *only* chroot is involved).

So this changes the module name and its option to use "confinement" instead of "chroot" and also renames the "chroot.confinement" to "confinement.mode".

[1]: #57519 (comment)

Signed-off-by: aszlig <aszlig@nix.build>
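After this rename, a service definition would use the new option names. The following is a minimal sketch, assuming `pkgs.myservice` as a placeholder package and using the "chroot-only" mode that a later commit message in this series refers to:

```nix
# Sketch with the renamed options; pkgs.myservice is a placeholder package.
{ pkgs, ... }:

{
  systemd.services.myservice = {
    description = "My Shiny Service";
    wantedBy = [ "multi-user.target" ];

    # Formerly chroot.enable.
    confinement.enable = true;

    # Formerly chroot.confinement. "full-apivfs" is described as the default;
    # "chroot-only" is expected to leave out /proc, /sys and /dev.
    confinement.mode = "chroot-only";

    serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
  };
}
```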
Commit 0ba48f4
nixos/confinement: Allow to configure /bin/sh
Another thing requested by @edolstra in [1]:

    We should not provide a different /bin/sh in the chroot, that's just asking for confusion and random shell script breakage. It should be the same shell (i.e. bash) as in a regular environment.

While I personally would go as far as to have a very restricted shell that is not even a shell and basically *only* allows "/bin/sh -c" with only *very* minimal parsing of shell syntax, I do agree that people expect /bin/sh to be bash (or the one configured by environment.binsh) on NixOS.

So this should make both others and me happy in that I could just use confinement.binSh = "${pkgs.dash}/bin/dash" for the services I confine.

[1]: #57519 (comment)

Signed-off-by: aszlig <aszlig@nix.build>
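In a full service definition, the per-service override mentioned in this commit message could be written as in the sketch below. `pkgs.myservice` is a placeholder; the binSh value is the one quoted in the message.

```nix
# Sketch: opt a single confined service back into dash as /bin/sh.
# pkgs.myservice is a placeholder package.
{ pkgs, ... }:

{
  systemd.services.myservice = {
    wantedBy = [ "multi-user.target" ];
    confinement.enable = true;

    # Without this line, /bin/sh inside the confinement is expected to be
    # the regular NixOS shell.
    confinement.binSh = "${pkgs.dash}/bin/dash";

    serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
  };
}
```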
Commit 46f7dd4
nixos/confinement: Allow to include the full unit
From @edolstra at [1]:

    BTW we probably should take the closure of the whole unit rather than just the exec commands, to handle things like Environment variables.

With this commit, there is now a "fullUnit" option, which can be enabled to include the full closure of the service unit into the chroot.

However, I did not enable this by default, because I do disagree here and *especially* things like environment variables or environment files shouldn't be in the closure of the chroot.

For example if you have something like:

    { pkgs, ... }:

    {
      systemd.services.foobar = {
        serviceConfig.EnvironmentFile = pkgs.writeText "secrets" ''
          user=admin
          password=abcdefg
        '';
      };
    }

We really do not want the *file* to end up in the chroot, but rather just the environment variables to be exported.

Another thing is that this makes it less predictable what actually will end up in the chroot, because we have a "globalEnvironment" option that will get merged in as well, so users adding stuff to that option will also make it available in confined units.

I also added a big fat warning about that in the description of the fullUnit option.

[1]: #57519 (comment)

Signed-off-by: aszlig <aszlig@nix.build>
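A case where enabling fullUnit may be wanted is an environment variable that points at a store path which the exec commands themselves do not reference. The sketch below assumes `pkgs.myservice` as a placeholder and a made-up variable name `HELLO_BIN`; `pkgs.hello` only stands in for any extra store path.

```nix
# Sketch: opting in to the closure of the whole unit.
# pkgs.myservice and HELLO_BIN are hypothetical names for illustration.
{ pkgs, ... }:

{
  systemd.services.foobar = {
    confinement.enable = true;

    # Off by default according to the commit message. With it enabled, store
    # paths referenced anywhere in the unit are expected to be included, not
    # only those referenced by the exec commands.
    confinement.fullUnit = true;

    # Referenced only through the environment, not through ExecStart.
    environment.HELLO_BIN = "${pkgs.hello}/bin/hello";

    serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
  };
}
```

As the message warns, this also pulls in whatever other store paths the unit happens to reference, so it trades predictability for convenience.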
Commit 9e9af4f
Commits on Mar 15, 2019
nixos/confinement: Explicitly set serviceConfig
My implementation was relying on PrivateDevices, PrivateTmp, PrivateUsers and others to be false by default if chroot-only mode is used.

However there is an ongoing effort[1] to change these defaults, which then will actually increase the attack surface in chroot-only mode, because it is expected that there is no /dev, /sys or /proc.

If for example PrivateDevices is enabled by default, there suddenly will be a mounted /dev in the chroot and we wouldn't detect it.

Fortunately, our tests cover that, but I'm preparing for this anyway so that we have a smoother transition without the need to fix our implementation again.

Thanks to @infinisil for the heads-up.

[1]: #14645

Signed-off-by: aszlig <aszlig@nix.build>
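The idea of pinning these settings instead of relying on systemd's defaults can be sketched as follows. This is an illustration of the technique, not the module's actual code; the service name `myservice` and the helper `wantsAPIVFS` are hypothetical, and only the three options named in the message are shown.

```nix
# Sketch of the technique, not the module's code: derive each setting from
# the confinement mode so that a changed default cannot silently add /dev,
# /sys or /proc. "myservice" and wantsAPIVFS are hypothetical names.
{ config, lib, ... }:

let
  cfg = config.systemd.services.myservice;
  wantsAPIVFS = cfg.confinement.mode == "full-apivfs";
in {
  systemd.services.myservice.serviceConfig = {
    # Explicitly false in chroot-only mode, explicitly true otherwise.
    PrivateDevices = lib.mkDefault wantsAPIVFS;
    PrivateTmp = lib.mkDefault wantsAPIVFS;
    PrivateUsers = lib.mkDefault wantsAPIVFS;
  };
}
```

Using lib.mkDefault here is a design choice of the sketch: it keeps the values explicit while still letting an individual service override them.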
Commit d13ad38
Commits on Mar 27, 2019
nixos/confinement: Remove handling for StartOnly
Noted by @infinisil on IRC:

    infinisil: Question regarding the confinement PR
    infinisil: On line 136 you do different things depending on RootDirectoryStartOnly
    infinisil: But on line 157 you have an assertion that disallows that option being true
    infinisil: Is there a reason behind this or am I missing something

I originally left this in so that once systemd supports that, we can just flip a switch and remove the assertion and thus support RootDirectoryStartOnly for our confinement module.

However, this doesn't seem to be on the roadmap for systemd in the foreseeable future, so I'll just remove this, especially because it's very easy to add it again, once it is supported.

Signed-off-by: aszlig <aszlig@nix.build>
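The assertion that stays in place after this commit could be expressed along the lines of the sketch below. It is a simplified illustration rather than the module's code, and the wording of the message string is made up.

```nix
# Sketch: reject RootDirectoryStartOnly for every confined service.
# Simplified illustration; the message text is hypothetical.
{ config, lib, ... }:

{
  assertions = lib.mapAttrsToList (name: cfg: {
    # "or false" covers services that do not set the option at all.
    assertion = !(cfg.confinement.enable
      && (cfg.serviceConfig.RootDirectoryStartOnly or false));
    message = "systemd.services.${name}: RootDirectoryStartOnly is not "
      + "supported in conjunction with confinement.enable.";
  }) config.systemd.services;
}
```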
Commit 861a1ce
nixos/confinement: Use PrivateMounts option
So far we had MountFlags = "private", but as @infinisil has correctly noticed, there is a dedicated PrivateMounts option, which does exactly that and is better integrated than providing raw mount flags.

When checking for the reason why I used MountFlags instead of PrivateMounts, I found that at the time I wrote the initial version of this module (Mar 12 06:15:58 2018 +0100) the PrivateMounts option didn't exist yet; it was added to systemd on Jun 13 08:20:18 2018 +0200.

Signed-off-by: aszlig <aszlig@nix.build>
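Expressed as a NixOS service fragment, the change described here amounts to swapping one serviceConfig entry for another. The service name `myservice` is a placeholder:

```nix
# Sketch of the swap for a placeholder service "myservice".
{
  systemd.services.myservice.serviceConfig = {
    # Previously: MountFlags = "private";
    PrivateMounts = true;
  };
}
```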
Commit 52299bc
nixos/release-notes: Add entry about confinement
First of all, the reason I added this to the "highlights" section is that we want users to be aware of these options, because in the end we really want to decrease the attack surface of NixOS services and this is a step towards improving that situation.

The reason why I'm adding this to the changelog of the NixOS 19.03 release instead of 19.09 is that it makes backporting services that use these options easier. Doing the backport of the confinement module after the official release would mean that it's not part of the release announcement and potentially could fly under the radar of most users.

These options and the whole module also do not change anything in existing services or affect other modules, so they're purely optional.

Adding this "last minute" to the 19.03 release doesn't hurt and is probably a good preparation for the next months where we hopefully confine as many services as we can :-)

I also have asked @samueldr and @lheckemann, whether they're okay with the inclusion in 19.03. While so far only @samueldr has accepted the change, we can still move the changelog entry to the NixOS 19.09 release notes in case @lheckemann rejects it.

Signed-off-by: aszlig <aszlig@nix.build>
Commit ada3239
Commits on Mar 29, 2019
Merge pull request #57519 (systemd-confinement)
Currently if you want to properly chroot a systemd service, you could do it using BindReadOnlyPaths=/nix/store or use a separate derivation which gathers the runtime closure of the service you want to chroot.

The former is the easier method and there is also a method directly offered by systemd, called ProtectSystem, which still leaves the whole store accessible. The latter however is a bit more involved, because you need to bind-mount each store path of the runtime closure of the service you want to chroot.

This can be achieved using pkgs.closureInfo and a small derivation that packs everything into a systemd unit, which later can be added to systemd.packages.

However, this process is a bit tedious, so the changes here implement this in a more generic way.

Now if you want to chroot a systemd service, all you need to do is:

    {
      systemd.services.myservice = {
        description = "My Shiny Service";
        wantedBy = [ "multi-user.target" ];

        confinement.enable = true;
        serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
      };
    }

If more than the dependencies for the ExecStart* and ExecStop* (which btw. also includes script and {pre,post}Start) need to be in the chroot, it can be specified using the confinement.packages option. By default (which uses the full-apivfs confinement mode), a user namespace is set up as well and /proc, /sys and /dev are mounted appropriately.

In addition - and by default - a /bin/sh executable is provided, which is useful for most programs that use the system() C library call to execute commands via shell.

Unfortunately, there are a few limitations at the moment. The first being that DynamicUser doesn't work in conjunction with tmpfs, because systemd seems to ignore the TemporaryFileSystem option if DynamicUser is enabled. I started implementing a workaround to do this, but I decided to not include it as part of this pull request, because it needs a lot more testing to ensure it's consistent with the behaviour without DynamicUser.

The second limitation/issue is that RootDirectoryStartOnly doesn't work right now, because it only affects the RootDirectory option and doesn't include/exclude the individual bind mounts or the tmpfs.

A quirk we do have right now is that systemd tries to create a /usr directory within the chroot, which subsequently fails. Fortunately, this is just an ugly error and not a hard failure.

The changes also come with a changelog entry for NixOS 19.03, which is why I asked for a vote of the NixOS 19.03 stable maintainers whether to include it (I admit it's a bit late a few days before official release, sorry for that):

@samueldr:

    Via pull request comment[1]:

        +1 for backporting as this only enhances the feature set of nixos, and does not (at a glance) change existing behaviours.

    Via IRC:

        new feature: -1, tests +1, we're at zero, self-contained, with no global effects without actively using it, +1, I think it's good

@lheckemann:

    Via pull request comment[2]:

        I'm neutral on backporting. On the one hand, as @samueldr says, this doesn't change any existing functionality. On the other hand, it's a new feature and we're well past the feature freeze, which AFAIU is intended so that new, potentially buggy features aren't introduced in the "stabilisation period". It is a cool feature though? :)

A few other people on IRC didn't have opposition either against late inclusion into NixOS 19.03:

@edolstra: "I'm not against it"
@infinisil: "+1 from me as well"
@grahamc: "IMO its up to the RMs"

So that makes +1 from @samueldr, 0 from @lheckemann, 0 from @edolstra and +1 from @infinisil (even though he's not a release manager) and no opposition from anyone, which is the reason why I'm merging this right now.

I also would like to thank @infinisil, @edolstra and @danbst for their reviews.

[1]: #57519 (comment)
[2]: #57519 (comment)
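The confinement.packages option mentioned in this merge message could be used as in the following sketch. It assumes the option takes a list of packages; `pkgs.myservice` is a placeholder, and coreutils and curl are arbitrary examples of tools the service might call at runtime without referencing them in its exec commands.

```nix
# Sketch: add extra store paths to the confined service.
# pkgs.myservice is a placeholder; the extra packages are arbitrary examples.
{ pkgs, ... }:

{
  systemd.services.myservice = {
    description = "My Shiny Service";
    wantedBy = [ "multi-user.target" ];

    confinement.enable = true;

    # Packages that are needed at runtime but are not part of the closure of
    # the ExecStart*/ExecStop* commands.
    confinement.packages = [ pkgs.coreutils pkgs.curl ];

    serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
  };
}
```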
Commit dcf40f7