New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
auto-luks: require new mountPoint
option
#1156
Conversation
nix/auto-luks.nix
Outdated
@@ -67,6 +67,24 @@ with utils; | |||
luksFormat</command>. | |||
''; | |||
}; | |||
|
|||
mountPoint = mkOption { | |||
default = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type is null or string.
default = false; | ||
type = types.nullOr types.string; | ||
description = '' | ||
This option is required so the autoLuks module knows where the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If required, why allow "null"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was meant as an escape hatch for people that know what they are doing. There might be cases where it is being used without using it for a filesystem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the type signature make it explicit that I want either a str
or the string "none". While technically not required I think making the "none"
case explicit in the type serves as a good way to document the existence of it.
Maybe we should just remove the auto-LUKS feature. I doubt anybody is using it. |
With the upcoming systemd v242 change in nixpkgs there will be a breaking change to how auto-luks used to work. We are going to remove a commit from the systemd fork that was added for auto-luks to work. The auto-luks feature will continue to work but another mount option is required to mark those `fileSystems` as not local. All mount points that carry the `_netdev` mount option are not part of the `local-fs.target` and thus not blocking the start of things like the sshd daemon. Adding this option is (so far) the least intrusive way to ensure that systems will not break. Every user of the feature will be required to add only one more value to it's configuration. Trying to automatically detect which mount points are mounted from the unlocked block devices is not really feasible in all/most cases. There might be another layer of indirection in between the actual (luks) device and the mount point. Good examples that will make it hard for us to guess are things like RAID, LVM, ZFS, …. Dropping the systemd patch removes a bunch of issues and race conditions in regards to creating `/run/<service>` and `/var/lib/<service>` directories. Those paths are usually set up using the `systemd-tempfiles.service` which is (again) part of `sysinit.target`. This is also part of the ongoing effort to remove large a portion of our `PreStart` scripts that just create directories. In the long run there should be less custom scripting required to get a service with (temporary and persistent) state up and running.
I addressed the review comments but In general I agree with @edolstra that we might just want to remove the feature. It will still have to be somewhat synchronized with the NixOS releases so people are not creating unbootable systems by using a newer NixOS version from a very old NixOps. |
On a call with Andi I suggested we break autoluks as soon as possible in 19.03 stable, to try and get users to come out of the woodwork. If nobody comes out, we delete it. |
@AmineChikhaoui Lets say we keep auto-luks but merge this PR. Would the additional attributes be okay / feasible for your deployments? |
@andir I'll double check and get back. I just wanted to reply that there are systems using this first, will read the details later. |
What's the status of this? Does the inifinite recursion still occur? |
As far as I remember I tried to reproduce this and then ran into issues reproducing it. I do not have an AWS account or similar to actually test this properly. On the libvirt backend I wasn't able to reproduce. |
Well, this PR was meant to improve UX for people migrating from 19.03 to 19.09, alerting them for the missing With nixpkgs now having branched off to 20.03 (and most people migrated to 19.09 at least), people did run into this, and this PR has probably become less useful. If nobody can test this, we should probably just drop the idea, instead of just keeping this PR open. |
I'll try this again this week. Though I don't think it needs an AWS account to reproduce as the infinite recursion is an eval error. |
Hello! Thank you for this PR. In the past several months, some major changes have taken place in
This is all accumulating in to what I hope will be a NixOps 2.0 My hope is that by adding types and more thorough automated testing, However, because of the major changes, it has become likely that this If you would like to see this merge, please bring it up to date with Thank you again for the work you've done here, I am sorry to be Graham |
On 12:36 26.03.20, Graham Christensen wrote:
Closed #1156.
Thank you for putting in the effort on NixOps. Appreciate it. It really
needs that love :)
I'll probably not re-open this (Nit-pick: I would have to create a new
PR for that and not just open it again). This has been done to help those that
use NixOps with the obscure auto-luks feature. I failed to reproduce the
issues and the tests didn't cover the feature… Nobody has really cared
in a while about it.
*Someone* should do the judgement call of retiring that feature if
nobody can fix it. It should be broken since a few releases anyway.
Currently we seem to be deadlocked on someone using it but not requiring
the fix.
Maybe it is worth having it downstream if your deployment depends on it.
|
It was originally moved because of nixops autoLuks feature which has been unsupported for a while. See: * NixOS#62211 * NixOS/nixops#1156 (comment) systemd-tmpfiles-setup-dev.service needs to run very early (even before udev runs) because udev rules assume static device nodes already exist even before udev is started. If these static device nodes do not exist; systemd might have trouble mounting filesystems that require static device nodes (like loopfs and btrfs).
It was originally moved because of nixops autoLuks feature which has been unsupported for a while. See: * #62211 * NixOS/nixops#1156 (comment) systemd-tmpfiles-setup-dev.service needs to run very early (even before udev runs) because udev rules assume static device nodes already exist even before udev is started. If these static device nodes do not exist; systemd might have trouble mounting filesystems that require static device nodes (like loopfs and btrfs). (cherry picked from commit d4e4d27)
With the upcoming systemd v242 change (NixOS/nixpkgs#61321) in nixpkgs there will be a breaking change to how auto-luks used to work. We are going to remove a commit from the systemd fork (NixOS/systemd@ce79214) that was added for auto-luks to work. The
auto-luks feature will continue to work but another mount option is required to mark those
fileSystems
as not local. All mount points that carry the_netdev
mount option are not part of thelocal-fs.target
and thus not blocking the start of things like the sshd daemon.Adding this option is (so far) the least intrusive way to ensure that systems will not break. Every user of the feature will be required to add only one more value to it's configuration.
Trying to automatically detect which mount points are mounted from the unlocked block devices is not really feasible in all/most cases. There might be another layer of indirection in between the actual (luks) device and the mount point. Good examples that will make it hard for us to guess are things like RAID, LVM, ZFS, ….
Dropping the systemd patch removes a bunch of issues and race conditions in regards to creating
/run/<service>
and/var/lib/<service>
directories (see NixOS/nixpkgs#47550, NixOS/nixpkgs#53852, NixOS/nixpkgs#53852). Those paths are usually set up using thesystemd-tempfiles.service
which is (again) part ofsysinit.target
.This is also part of the ongoing effort to remove large a portion of our
PreStart
scripts that just create directories. In the long run there should be less custom scripting required to get a service with (temporary and persistent) state up and running.For all the current version of NixOS this should be mostly a No-Op. In a local test against libvirtd there was no observable difference in behavior between 19.03 and the systemd PR.
Ideally we would have a way to warn users that deploy a more recent NixOS that they must use a newer NixOps version. It is still not obvious how to do that in an elegant way. Opinions are very welcome. Not updating systemd or keeping the broken sysinit behaviour isn't really an option in the long run.
This change is