-
-
Notifications
You must be signed in to change notification settings - Fork 15.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nixos/systemd-sandbox: A generic sandboxing module #87661
Conversation
2 is the highest amount of sandboxing this module supports right now. | ||
''; | ||
default = 0; | ||
type = enum [ 0 1 2 ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think more descriptive strings would be better - something like
enum [ "unsafe" "basic" "full" ];
What happens when you set it to default 1 or 2 btw? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the reason for the numbers is forward compatibility. As I have seen over the past systemd releases, more confinement options are added over time (for example ProtectKernelLogs
and ProtectClock
). Enabling them for all services which have confinement enabled would most likely break some of them. This way, we can add more options (and numeric values) in the future without having to do someting like [ "unsafe" "basic" "full" "fuller" "fullest" "full2023" "full_newer" ]
I also thought about setting some of the options by default (like PrivateTmp
since postgres has its socket in /run now), but I suspect a lot of stuff to break and I cannot test all nixos modules ;) The first that comes to mind is Docker which really needs cgroups, for example.
Also, this affects ALL units, including systemd-shipped units like udev and this PR would most likely break all NixOS systems on the unstable channel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that makes sense. I will play with it in VM.
How to replace
to
|
@Izorkin Since I use |
This can be enabled using `systemd.services.<name>.sandbox` and supports different levels of confinement, allowing for new new confinement options to be added in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this abstraction goes a bit too far. The options provided by systemd are already quite abstract and it is not quite clear what should go in each level. What if I have for example a python program that could run with all sandboxing options except the MemoryDenyWriteExecute
one. I rather would prefer to have these options in every service set since they are quite service specific.
FWIW, I think the top tier could use splitting up; I support some kind of abstraction for hardening because I don't think individually setting 10+ options on each systemd service is going to scale, especially as new hardening options are introduced over time, and the status quo is very bad:
But I do think it needs some careful consideration as to exactly what interface to expose. I think that rather than monotonic levels of sandboxing, several discrete named categories of sandboxing, plus some convenience bundles that contain common combinations, would be the best approach. Of course, some thought needs to be put in to avoid a proliferation of categories that don't reduce the complexity over the raw systemd options, but e.g. "protect the kernel stuff and devices", "reduce syscall attack surface", and "isolate filesystem with just the things I need" are viable categories that I think would reduce the complexity in hardening significantly. Some options you haven't included that #83474 mentions / By the way, it would be good to integrate the existing |
An alternative to this module would be to set some hardening flags by default and only unset them in the services that actually need them. This will also remove the over all bloat of flags we need to set in every module. |
@Mic92 this is actually what this module does. It does {
sandbox = 2;
serviceConfig.ProtectControlGroups = false;
} |
I also think For example For some other missing options though (mostly sandbox 1, ProtectHostname or RestrictRealtime really make sense to have always set by default... Even RestrictAddressFamilies...), I like the idea but I think it will be tricky.. Maybe a 3-steps process, like:
cc @aszlig for further opinion on this, I'm just a user of his superb confinement module :) |
I mean we could enable some sane default options all systemd services without having to enable |
I think this is a great idea, but imo there should only be two possibilities for the option:
I think we should only use intermediate options for backwards compatibility. So in addition to So currently the type should just be Regarding turning on sandboxing by default, I think it should be done with these steps (with at least a NixOS release between them)
This makes the option essentially mandatory. If you choose to ignore the warning, the service might break in the future since the default changes. And if you don't ignore the warning and set the value, you either fix it to a backwards compatible number like |
I am very much a fan of sandboxing, but IMHO sandboxing needs to happen on a granular, and per-service level. Also see the discussion around #93305 (comment). I don't think we should have a generic sandboxing module that hides/groups specific sandboxing properties behind a number, and potentially breaks services in the feature. "Not breaking existing services by being very explicit with what to sandbox" is also the route systemd went with each sandboxing option, for similar reasons. IMHO, we should keep sandboxing and isolating more services, writing release notes on our way (so people know how to migrate / adapt), but I don't think a generic sandboxing module is the right approach. |
This PR's module does not remove granularity and per-service config in any way. It merely is a way to conveniently set a lot of security-related settings to safer defaults. The user can still opt-out of each one and for each service separately.
By combining these settings behind a number we make security easy: Instead of having to remember each and every one of the security settings you might want to enable, you can just set a single option which configures safe defaults. If the service doesn't work you can still opt-out of the settings that break it. This is a good thing, not a bad thing. In fact, this is kind of what NixOS is based on: Being able to just declare some
Whether this setting breaks services in the future will be clearly specified, and it comes with reasonable tradeoffs:
I agree that we may not be able to change the default of this option in the end, but this is a separate matter from this PR and doesn't need to influence the decision of merging it.
As seen from the current state of NixOS services, this obviously doesn't work. Most services don't use any security-related settings and are vastly under-restricted, We can't expect module authors to know all systemd related security options and hope they enable them all as far as possible. Every time I write a service module I think to myself that while I could look into enabling all the security related settings, it would be a long and tedious process, and it would turn my 4-line service definition into a 30-line service definition, reducing readability and simplicity. So I don't do it. Similarly, as a NixOS module reviewer, I could ask people to do the same, but who am I to tell people to do something I wouldn't even do myself. And what if they only add 3 security-related settings because they thought it was enough? Should I tell them to look into enabling all available ones? As a NixOS module reviewer, this just won't work well. The alternative with this PR is to tell people to set In essence, this is about being able to default to a secure set of settings for convenience. |
@infinisil the most ideal spot for When you take in mind the perspective of a software project that has been around for 30+ years we have to keep a couple things in mind:
Sure, we want to harden services, but we don't need to do it overnight with a switch. We need to carefully review (ourselves, and have upstream do the same) what we turn on so we stop breaking things without realizing we have done so. |
I think generally self-sandboxing is something more upstream projects should do regardless of The last point about breaking things is btw. exactly why I didn't start to add the the Personally, my approach is to set these settings on a per-service and per-machine basis for the services that I care about. For example I don't care about services like getty (which I usually disable anyway), the Nix daemon or any other service that's not exposed via network and confine all the services I do care about in such a way that they don't have access to anything else, including the services I don't care about :-) Nevertheless, I do however still think that having a single option to restrict a service as much as possible is useful, maybe even on a global level, since whitelisting the features you actually care about is much easier than doing it the other way around. |
My main concern with "this is too complicated for a single switch" is that it has the potential to result in perfection being the enemy of "good enough" in many cases. As for the options themselves and what such a scale would look like I believe is useful research and exploration in itself. Abandoning an idea because we know that it's likely an 80/20 solution to improve the status quo while waiting for upstream to eventually do all the "Right Things"™ feels premature to me. There is real value in exploring this space and I thought the PR was an excellent start to this conversation. |
Yes agreed, hardening from upstream is nice. But this doesn't conflict with this PR at all. Just because upstream doesn't harden doesn't mean we can still do so in NixOS.
Of course software runs fine if they have all the privileges, but that's not really a good thing. But in any case, this also doesn't conflict with this PR. Unless
Again, this PR won't break anything*. This only allows opt-in to secure defaults. *: Actually one exception is when an upstream update needs new privileges. If the |
This PR is problematic, as it provides a huge footgun. When enabled, it provides two handwavy sandboxing options (quoting from the description):
This is mostly tooling for other NixOS module authors. If you were to write a new module, you would probably pick What would then happen is that more and more module authors would simply set In the future, when there's more sandboxing options, people will be afraid to touch the sandboxing module itself, and what is enabled by Again, I think sandboxing should be a conscious and fine-grained decision per service, ideally already done by upstream, or the downstream nixpkgs maintainers. I don't think a coarse "sandboxing shotgun" is the right way here. Also, hiding sandboxing options by an arbitrary number will make comparing with upstream systemd units, if they appear, much harder. |
@flokli As I've argued in #87661 (comment), I don't think the current sandboxing levels of the PR should be used. I want levels
So there is a clear contract here:
And actually, we can even drop the |
@infinisil then we still end up with module authors using The only difference would be "we can't change the meaning of I still don't see the harm in duplicating some of the sandboxing directives in each module. In fact I'd argue having them easily diff-able with what upstream provides is an advantage :-) |
People don't need to clearly know what If people don't want to see a And I already explained the problem with requiring people to add each setting manually in #87661 (comment). From a quick grep through nixpkgs for |
I'm still trying to understand why this is a burning problem. I don't see it as a problem, I see it as a room for growth in the GNU/Linux ecosystem 😄. Blind application of I can imagine a future where |
Oh, nice to see the conversation is back again. One thing to note is that this PR is a variation of the module we use in our downstream repository. The idea when we sandbox services downstream is to just set @aanderse One of the problems (and also opportunities) nixpkgs has over other distros is that we mostly ship our own unit files instead of relying on random |
Oh also the main motiviation for the original module was ajs124 (not pinging here) who was sick of me copying the same chunks of text from my Joplin notebook. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/hardening-systemd-services/17147/12 |
This can be enabled using
systemd.services.<name>.sandbox
and supportsdifferent levels of confinement, allowing for new new confinement
options to be added in the future.
We have been using a similar module for about half a year with extreme success, being able to sandbox stuff quickly and efficently.
cc @ajs124 As you
also wrote parts of the module iircwanted me to write the module back thencc @Izorkin Maybe this can make #85567 more generic
cc @flokli @Mic92 @Ma27 because you might care about this
Motivation for this change
Things done
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)