Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sanoid: add package, NixOS module and test #72060

Merged
merged 2 commits into from Feb 10, 2020

Conversation

lopsided98
Copy link
Contributor

Motivation for this change

Adds sanoid, a ZFS snapshotting and synchronization tool.

I have been using a version of this package and module locally for over a year, and just cleaned it up before submitting it upstream. I also just added a test that takes snapshots with sanoid and then uses syncoid to transfer them to another machine.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
Notify maintainers

cc @deliciouslytyped @Nekroze

@lopsided98

This comment has been minimized.

@lopsided98
Copy link
Contributor Author

@GrahamcOfBorg test sanoid

Copy link
Member

@infinisil infinisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I might try this out, currently using znapzend, but it's a bit problematic occasionally

nixos/modules/services/backup/sanoid.nix Outdated Show resolved Hide resolved
nixos/modules/services/backup/sanoid.nix Outdated Show resolved Hide resolved
};

extraConfig = mkOption {
description = "Extra configuration for this template/dataset";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I'm describing in NixOS/rfcs#42, introducing such extraConfig options has problems, so I'd rather have this PR use the approach described there. This means:

  • Changing this to settings = mkOption { type = types.attrsOf (types.nullOr (types.oneOf [ int bool str ])); ... }
  • Setting all other options in the config section of the (sub)module like settings.hourly = mkDefault cfg.hourly; instead
  • Generating the config file only from the cfg.settings values. You might be able to use lib.generators.toINI for this

If you want to I could implement this for you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented this; let me know if it looks good to you.

datasetOptions = commonOptions // {
useTemplate = mkOption {
description = "Names of the templates to use for this dataset";
type = types.nullOr (types.listOf (types.enum (attrNames cfg.templates)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the manual you should override this options description to note that only values in templates can be used, because while building docs templates is just gonna be empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this also one of those situations where changing templates will cause unintended manual rebuilds?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah probably

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to cause manual rebuilds as far as I can tell, but I overrode the type description anyway so that it is more useful than list of one of s.

nixos/modules/services/backup/sanoid.nix Outdated Show resolved Hide resolved
nixos/modules/services/backup/sanoid.nix Outdated Show resolved Hide resolved
nixos/modules/services/backup/syncoid.nix Outdated Show resolved Hide resolved
'';
};

defaultArguments = mkOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about removing this option and useDefaultArguments and instead putting the default arguments as default = [ ... ] into extraArguments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name was not a good description of what that the option does. I changed it to commonArgs (it seems like Args is more common than Arguments in nixpkgs). I created that option because I wanted a convenient shortcut to apply arguments to all (or most, since you can disable it for a specific command with useCommonArgs = false) of the commands, in addition to those configured in extraArgs. For example, I use it to add --no-privilege-elevation and --no-sync-snap in my configuration.

nixos/modules/services/backup/syncoid.nix Show resolved Hide resolved
nixos/modules/services/backup/syncoid.nix Outdated Show resolved Hide resolved
Comment on lines 156 to 209
systemd.services.sanoid = {
description = "Sanoid snapshot service";
serviceConfig.ExecStart = "${pkgs.sanoid}/bin/sanoid --cron --configdir=${configDir} ${cfg.extraArgs}";
after = [ "zfs.target" ];
startAt = cfg.interval;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sanoid should run as its own user, not root. Also consider passing --verbose

Copy link
Member

@tilpner tilpner Oct 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need environment.TZ = "UTC";?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a user option, but left it as root by default because AFAIK sanoid cannot run as another user unless ZFS privilege delegation is configured.

I'd rather not pass --verbose unconditionally in case someone doesn't want it for some reason. I added it to the example for extraArgs though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you need something like /run/booted-system/sw/bin/zfs allow sanoid snapshot,destroy,mount,send,receive ${lib.escapeShellArg pool} to allow sanoid to act on that pool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this module should automatically make changes like that to users' pools.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm willing to implement this, but I'm not sure what approach to take. zfs allow needs to run as root, but PermissionsStartOnly is deprecated and NixOS doesn't support the command prefixes that replace it, so I don't see a way to setup privilege delegation in preStart. The only other way I can see is to create another service that gets started before sanoid.service, but that seems hacky.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT the prefixes can be used, just not with pretty abstractions like script/preStart/preStop/etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that sanoid cannot currently run as non-root. It hardcodes the path to the cache file as /var/cache/sanoidsnapshots.txt, which cannot be created as an unprivileged user. This could be worked around with tmpfiles or another ExecStartPre command. The real problem is that it puts lock files directly in /var/run. In this case, sanoid needs to create and delete these files itself, so I don't see a way to work around this.

I am working on patches to fix these issues, which I will submit upstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --run-dir and --cache-dir options have been accepted upstream. There is no new release yet, so I added them as patches.

pkgs/tools/backup/sanoid/default.nix Outdated Show resolved Hide resolved
@lopsided98 lopsided98 force-pushed the sanoid-init branch 4 times, most recently from 7d9a60c to f5d9349 Compare December 4, 2019 19:11
Copy link
Member

@infinisil infinisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good! I have some final comments, but I'm happy to merge once those are addressed.

nixos/modules/services/backup/sanoid.nix Show resolved Hide resolved
nixos/modules/services/backup/syncoid.nix Show resolved Hide resolved
nixos/modules/services/backup/syncoid.nix Show resolved Hide resolved
nixos/modules/services/backup/syncoid.nix Outdated Show resolved Hide resolved
nixos/modules/services/backup/syncoid.nix Outdated Show resolved Hide resolved
pkgs/tools/backup/sanoid/default.nix Show resolved Hide resolved
@lopsided98 lopsided98 force-pushed the sanoid-init branch 3 times, most recently from aaaadce to b6a9ae3 Compare January 22, 2020 20:07
@lopsided98 lopsided98 force-pushed the sanoid-init branch 2 times, most recently from ee4dce2 to 4e40ca7 Compare January 22, 2020 20:16
ExecStartPre = map (pool: lib.escapeShellArgs [
"+/run/booted-system/sw/bin/zfs" "allow"
cfg.user "snapshot,destroy" pool
]) pools;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! What's the status on being able to use DynamicUser now? I see you're including patches for --cache-dir and --run-dir, yet those options aren't passed anywhere. Also I see zfs allow here but not in syncoid. The ideal situation would be to get rid of all user and group options and switch to DynamicUser. Combining this with zfs allow also has the benefit of not needing a user with permanent permissions, since the service user will be only temporary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--cache-dir and --run-dir are not passed because they default to the directories that systemd uses, although I could pass them to be explicit.

I did some testing with DynamicUser and it seems to work. I remove the privilege delegations in ExecStopPost, which should result in them being correctly cleaned up even if the service fails. My only concern is that if the service is interrupted (ie. by power failure, kernel panic) then the permissions won't be cleaned up. If systemd were to reuse this UID for another DynamicUser service, then that service would gain unintended ZFS permissions, which seems like a (admittedly rather hard to exploit) security vulnerability.

It should be possible to perform automatic privilege delegation for syncoid as well, although it is somewhat more complicated to extract the pool name and determine whether a target is local or remote.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see the removal of privileges, is that change missing perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't publish the DynamicUser change. I have added it now in a separate commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, sanoid is looking great now. Does zfs allow also work for syncoid? If so there wouldn't be a need for a user option (unless you kept it because of a certain use case?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You aren't concerned about the cleanup issue I mentioned?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, forgot about that. I guess we can delegate this as potential future work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand your comment correctly, you are suggesting we forgo DynamicUser for now (that would be my choice). I am planning to implement automatic privilege delegation for syncoid (without DynamicUser), I've just been busy recently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow, despite the discussion here, the final version of the PR that got merged uses DynamicUser. While working on some other issues, I discovered that at some point sanoid failed to clean up its permissions, as show below:

---- Permissions on root ---------------------------------------------
Local+Descendent permissions:
	user syncoid destroy,hold,send,snapshot
	user  destroy,mount,snapshot

Besides the possible security issue, it also appears to be currently impossible to remove these permissions until openzfs/zfs#10280 makes it into a release.

@Frostman
Copy link
Member

Frostman commented Feb 4, 2020

I'm looking to switch to sanoid/syncoid on my NAS. So, I've tested the package on my system - it works correctly. I didn't yet have a chance to try the module.

Copy link
Member

@infinisil infinisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I'll do some final testing, will merge after that, just before the 20.03 feature freeze :)

@infinisil
Copy link
Member

I took the liberty to improve the options a bit:

  • I replaced null as the default value with whatever upstream uses. Not having a default value was very confusing when trying to use this, should I set the option or not? What value is used if I don't? Where can I find the default then? Upstream's defaults shouldn't drastically over time, so I don't think it's a problem to incorporate them here.
  • For free-form settings I added links to upstream to know what can be set there.
  • Some other minor changes

With these changes this looks good to me, are they also fine with you @lopsided98?

@lopsided98
Copy link
Contributor Author

That looks good to me.

@infinisil infinisil merged commit 637bb9f into NixOS:master Feb 10, 2020
@infinisil
Copy link
Member

Merging then! 🚀

@lopsided98
Copy link
Contributor Author

See #79759

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants