nixos/syncoid: split in multiple systemd services and harden them #98455

Merged on Jul 24, 2021 (1 commit)

@ju1m
Contributor

ju1m commented Sep 22, 2020

Motivation for this change

Currently, all syncoid commands are listed in a single script inside a single systemd service, so if one command fails, none of the following commands are run.
Ping @lopsided98

Things done
  • Put each command into a dedicated systemd service.
  • Add interval option per command, defaulting to the global interval.
  • Hardening of the service:
$ systemd-analyze security syncoid-losurdo-home-julm-work.service
  NAME                                                        DESCRIPTION                                                                    EXPOSURE
✗ PrivateNetwork=                                             Service has access to the host's network                                            0.5
✓ User=/DynamicUser=                                          Service runs under a static non-root user identity                                     
✓ CapabilityBoundingSet=~CAP_SET(UID|GID|PCAP)                Service cannot change UID/GID identities/capabilities                                  
✓ CapabilityBoundingSet=~CAP_SYS_ADMIN                        Service has no administrator privileges                                                
✓ CapabilityBoundingSet=~CAP_SYS_PTRACE                       Service has no ptrace() debugging abilities                                            
✗ RestrictAddressFamilies=~AF_(INET|INET6)                    Service may allocate Internet sockets                                               0.3
✓ RestrictNamespaces=~CLONE_NEWUSER                           Service cannot create user namespaces                                                  
✓ RestrictAddressFamilies=~…                                  Service cannot allocate exotic sockets                                                 
✓ CapabilityBoundingSet=~CAP_(CHOWN|FSETID|SETFCAP)           Service cannot change file ownership/access mode/capabilities                          
✓ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)         Service cannot override UNIX file/IPC permission checks                                
✓ CapabilityBoundingSet=~CAP_NET_ADMIN                        Service has no network configuration privileges                                        
✓ CapabilityBoundingSet=~CAP_SYS_MODULE                       Service cannot load kernel modules                                                     
✓ CapabilityBoundingSet=~CAP_SYS_RAWIO                        Service has no raw I/O access                                                          
✓ CapabilityBoundingSet=~CAP_SYS_TIME                         Service processes cannot change the system clock                                       
✗ DeviceAllow=                                                Service has a device ACL with some special devices                                  0.1
✗ IPAddressDeny=                                              Service does not define an IP address allow list                                    0.2
✓ KeyringMode=                                                Service doesn't share key material with other services                                 
✓ NoNewPrivileges=                                            Service processes cannot acquire new privileges                                        
✓ NotifyAccess=                                               Service child processes cannot alter service state                                     
✓ PrivateDevices=                                             Service has no access to hardware devices                                              
✓ PrivateMounts=                                              Service cannot install system mounts                                                   
✓ PrivateTmp=                                                 Service has no access to other software's temporary files                              
✓ PrivateUsers=                                               Service does not have access to other users                                            
✓ ProtectClock=                                               Service cannot write to the hardware clock or system clock                             
✓ ProtectControlGroups=                                       Service cannot modify the control group file system                                    
✓ ProtectHome=                                                Service has no access to home directories                                              
✓ ProtectKernelLogs=                                          Service cannot read from or write to the kernel log ring buffer                        
✓ ProtectKernelModules=                                       Service cannot load or read kernel modules                                             
✓ ProtectKernelTunables=                                      Service cannot alter kernel tunables (/proc/sys, …)                                    
✓ ProtectSystem=                                              Service has strict read-only access to the OS file hierarchy                           
✓ RestrictAddressFamilies=~AF_PACKET                          Service cannot allocate packet sockets                                                 
✓ RestrictSUIDSGID=                                           SUID/SGID file creation by service is restricted                                       
✓ SystemCallArchitectures=                                    Service may execute system calls only with native ABI                                  
✓ SystemCallFilter=~@clock                                    System call allow list defined for service, and @clock is not included                 
✓ SystemCallFilter=~@debug                                    System call allow list defined for service, and @debug is not included                 
✓ SystemCallFilter=~@module                                   System call allow list defined for service, and @module is not included                
✓ SystemCallFilter=~@mount                                    System call allow list defined for service, and @mount is not included                 
✓ SystemCallFilter=~@raw-io                                   System call allow list defined for service, and @raw-io is not included                
✓ SystemCallFilter=~@reboot                                   System call allow list defined for service, and @reboot is not included                
✓ SystemCallFilter=~@swap                                     System call allow list defined for service, and @swap is not included                  
✓ SystemCallFilter=~@privileged                               System call allow list defined for service, and @privileged is not included            
✓ SystemCallFilter=~@resources                                System call allow list defined for service, and @resources is not included             
✓ AmbientCapabilities=                                        Service process does not receive ambient capabilities                                  
✓ CapabilityBoundingSet=~CAP_AUDIT_*                          Service has no audit subsystem access                                                  
✓ CapabilityBoundingSet=~CAP_KILL                             Service cannot send UNIX signals to arbitrary processes                                
✓ CapabilityBoundingSet=~CAP_MKNOD                            Service cannot create device nodes                                                     
✓ CapabilityBoundingSet=~CAP_NET_(BIND_SERVICE|BROADCAST|RAW) Service has no elevated networking privileges                                          
✓ CapabilityBoundingSet=~CAP_SYSLOG                           Service has no access to kernel logging                                                
✓ CapabilityBoundingSet=~CAP_SYS_(NICE|RESOURCE)              Service has no privileges to change resource use parameters                            
✓ RestrictNamespaces=~CLONE_NEWCGROUP                         Service cannot create cgroup namespaces                                                
✓ RestrictNamespaces=~CLONE_NEWIPC                            Service cannot create IPC namespaces                                                   
✓ RestrictNamespaces=~CLONE_NEWNET                            Service cannot create network namespaces                                               
✓ RestrictNamespaces=~CLONE_NEWNS                             Service cannot create file system namespaces                                           
✓ RestrictNamespaces=~CLONE_NEWPID                            Service cannot create process namespaces                                               
✓ RestrictRealtime=                                           Service realtime scheduling access is restricted                                       
✓ SystemCallFilter=~@cpu-emulation                            System call allow list defined for service, and @cpu-emulation is not included         
✓ SystemCallFilter=~@obsolete                                 System call allow list defined for service, and @obsolete is not included              
✓ RestrictAddressFamilies=~AF_NETLINK                         Service cannot allocate netlink sockets                                                
✓ RootDirectory=/RootImage=                                   Service has its own root directory/image                                               
✓ SupplementaryGroups=                                        Service has no supplementary groups                                                    
✓ CapabilityBoundingSet=~CAP_MAC_*                            Service cannot adjust SMACK MAC                                                        
✓ CapabilityBoundingSet=~CAP_SYS_BOOT                         Service cannot issue reboot()                                                          
✓ Delegate=                                                   Service does not maintain its own delegated control group subtree                      
✓ LockPersonality=                                            Service cannot change ABI personality                                                  
✓ MemoryDenyWriteExecute=                                     Service cannot create writable executable memory mappings                              
✓ RemoveIPC=                                                  Service user cannot leave SysV IPC objects around                                      
✓ RestrictNamespaces=~CLONE_NEWUTS                            Service cannot create hostname namespaces                                              
✓ UMask=                                                      Files created by service are accessible only by service's own user by default          
✓ CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE                  Service cannot mark files immutable                                                    
✓ CapabilityBoundingSet=~CAP_IPC_LOCK                         Service cannot lock memory into RAM                                                    
✓ CapabilityBoundingSet=~CAP_SYS_CHROOT                       Service cannot issue chroot()                                                          
✓ ProtectHostname=                                            Service cannot change system host/domainname                                           
✓ CapabilityBoundingSet=~CAP_BLOCK_SUSPEND                    Service cannot establish wake locks                                                    
✓ CapabilityBoundingSet=~CAP_LEASE                            Service cannot create file leases                                                      
✓ CapabilityBoundingSet=~CAP_SYS_PACCT                        Service cannot use acct()                                     
✓ CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG                   Service cannot issue vhangup()                                                         
✓ CapabilityBoundingSet=~CAP_WAKE_ALARM                       Service cannot program timers that wake up the system                                  
✗ RestrictAddressFamilies=~AF_UNIX                            Service may allocate local sockets                                                  0.1

→ Overall exposure level for syncoid-losurdo-home-julm-work.service: 1.0 OK 🙂
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
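
For illustration, the per-command services and intervals described above might be configured roughly as follows. This is only a sketch: the dataset names and targets are hypothetical, and the per-command `interval` option is assumed from the description above, so check the merged module before relying on it.

```nix
{
  services.syncoid = {
    enable = true;
    # Global default interval (a systemd calendar expression).
    interval = "hourly";

    commands = {
      # Hypothetical datasets; each command is meant to get its own systemd service.
      "rpool/home" = {
        target = "backup/home";
      };
      "rpool/var/mail" = {
        target = "backup/var/mail";
        # Assumed per-command override of the global interval, as described in this PR.
        interval = "daily";
      };
    };
  };
}
```

With one service per command, a failure while syncing one dataset should no longer prevent the other dataset from being synced.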

@ju1m
Contributor Author

ju1m commented Oct 27, 2020

This PR was broken by 733acfa. It is now rebased against current master, with modifications to support the zfs allow calls from 733acfa.

Contributor

lopsided98 left a comment

I think this is a good idea. I've tested this pretty thoroughly in my setup, and it works with the changes I suggested.

This is the modified version of this PR I have been testing: lopsided98@0894223

]) (localPoolName c.target);
serviceConfig.User = cfg.user;
serviceConfig.Group = cfg.group;
serviceConfig.Restart = mkDefault "on-failure";
Contributor

I don't think this is a good default. With the default StartLimit settings, persistent failures cause the service to be restarted in a loop forever and never marked as failed. This prevents me from getting a notification that my backups are failing. If users want restarting they can easily configure it themselves with StartLimit settings appropriate for their use case.

Contributor Author

OK, I've removed Restart= and RestartSec=, but I don't have enough monitoring experience to know whether this is suitable as is.
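
For readers who do want automatic restarts, a configuration along these lines is one possibility. It is a sketch, not part of this PR: it assumes that the `services.syncoid.service` option mentioned later in this thread accepts ordinary `systemd.services.<name>` attributes, and the limits are arbitrary example values.

```nix
{
  services.syncoid.service = {
    serviceConfig = {
      # Retry a failed sync after a pause instead of immediately.
      Restart = "on-failure";
      RestartSec = "5min";
    };
    unitConfig = {
      # Stop retrying and mark the unit as failed once it has been
      # started 3 times within one hour.
      StartLimitIntervalSec = "1h";
      StartLimitBurst = 3;
    };
  };
}
```

Bounding the restarts this way keeps persistent failures visible to monitoring, which is the concern raised above.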

nixos/modules/services/backup/syncoid.nix (outdated)
# For syncoid to be able to create /var/lib/syncoid/.ssh/
# and to use custom ssh_config or known_hosts.
home = "/var/lib/syncoid";
createHome = true;
Contributor

I'd prefer StateDirectory = "syncoid" in the systemd service. I think it is the more idiomatic way of doing it.

For my use case, I actually preferred not having a stateful known_hosts file because it makes it harder to change host keys (I normally manage them declaratively, but now I would also have to edit /var/lib/syncoid/.ssh/known_hosts manually). I'm fine with this change though if it helps your use case.

Contributor Author

ju1m commented Nov 25, 2020

Sure, I've moved to StateDirectory= which ensures correct permissions and automatically binds the directory in RootDirectory=.
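
As a rough illustration of the StateDirectory= approach (not the exact code of this PR), a unit fragment could look like the following. The unit name `syncoid-example` and the user and group names are hypothetical.

```nix
{
  # "syncoid-example" is a hypothetical unit name used only for illustration.
  systemd.services."syncoid-example".serviceConfig = {
    User = "syncoid";
    Group = "syncoid";
    # systemd creates /var/lib/syncoid, owned by User=/Group=,
    # before the unit starts.
    StateDirectory = "syncoid";
    StateDirectoryMode = "0700";
  };
}
```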

Concerning the use case of /var/lib/syncoid/.ssh/, when generating a /etc/ssh/ssh_known_hosts with:

programs.ssh.knownHosts = {
  mermet = {
    hostNames = [ "mermet" "mermet.sourcephile.fr" ];
    publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFvKN2sIpH782MFjaOpcty1Hs/T/TPNJpXI08H3O3oxl";
  };
};

and using this hostname in:

services.syncoid.commands."backup@mermet.sourcephile.fr:rpool/var/mail".target =
  "losurdo/backup/mermet/var/mail";

syncoid now writes a /var/lib/syncoid/.ssh/known_hosts specifying the IP address of the hostname:

80.67.180.129 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFvKN2sIpH782MFjaOpcty1Hs/T/TPNJpXI08H3O3oxl

instead of issuing the following warnings:

losurdo syncoid[19440]: Could not create directory '/var/empty/.ssh'.
losurdo syncoid[19440]: Failed to add the ED25519 host key for IP address '80.67.180.129' to the list of known hosts (/var/empty/.ssh/known_hosts).

Hence, if that IP address is dynamic, having such a stateful known_hosts may be useful, though there is probably some --sshoption that can help.
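
One candidate for such an option is sketched below. It is untested here and rests on two assumptions: that the module's `commonArgs` option passes extra arguments to every syncoid command, and that ssh's standard `CheckHostIP` setting is what triggers the per-IP known_hosts entry.

```nix
{
  services.syncoid.commonArgs = [
    # Assumption: ask ssh not to check or record the host's IP address
    # in known_hosts, so only the host name entry is used.
    "--sshoption=CheckHostIP=no"
  ];
}
```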

"+/run/booted-system/sw/bin/zfs" "allow"
cfg.user "create,mount,receive,rollback" pool
]) (localPoolName c.target);
serviceConfig.User = cfg.user;
Contributor

We need to set PrivateTmp = true, otherwise different commands that have the same destination host and run in parallel end up using the same SSH control socket. When one command finishes, it kills the connection of any others using the same socket.

Contributor Author

Indeed, see also jimsalterjrs/sanoid#532 (comment).
I've set PrivateTmp=true and also taken the time to find working values for the security options, based on systemd-analyze security syncoid-$NAME.service and on earlier experience such as #98904.
As always, it might still be too much hardening. Please bear with me.
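
To give a sense of what this kind of hardening looks like in a NixOS module, here is an illustrative fragment. The unit name is hypothetical and the selection of options is only an example drawn from the systemd-analyze output in the description, not the exact set chosen in this PR.

```nix
{
  # "syncoid-example" is a hypothetical unit name; the values are illustrative.
  systemd.services."syncoid-example".serviceConfig = {
    # Private /tmp per service, so parallel commands cannot end up
    # sharing the same SSH control socket.
    PrivateTmp = true;
    NoNewPrivileges = true;
    ProtectHome = true;
    ProtectSystem = "strict";
    ProtectKernelModules = true;
    ProtectKernelTunables = true;
    RestrictSUIDSGID = true;
    LockPersonality = true;
    MemoryDenyWriteExecute = true;
  };
}
```

Running systemd-analyze security on the resulting unit, as done in the description, shows which of these settings are in effect.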

Contributor

The hardening doesn't cause any problems for me, although as always it is definitely an increased maintenance burden that requires in-depth systemd knowledge to understand. I'd like to get someone else's opinion, although it seems to be nearly impossible to find reviewers for sanoid/syncoid PRs for some reason.

Contributor

I'm trying out sanoid/syncoid, hopefully I'll have the experience in a few weeks to have actual opinions. I'll try to remember this (and other PRs) -- but feel free to ping me if it's still stuck in a while.

Contributor

Thanks, it would be great to get this merged. Make sure you use #83904, otherwise sanoid templates won't work right.

Contributor

lopsided98 left a comment

The NixOS test needs to be updated to account for the changed syncoid service name.

ju1m changed the title from "nixos/syncoid: split in multiple systemd services" to "nixos/syncoid: split in multiple systemd services and harden them" on Nov 25, 2020
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/385

@ju1m
Contributor Author

ju1m commented May 10, 2021

Hardened with ProcSubset=pid and ProtectProc=invisible as I've done elsewhere.

@ju1m
Contributor Author

ju1m commented Jun 15, 2021

Rebased against latest master, resolving a merge conflict with d87903a.

Contributor

lopsided98 left a comment

@lheckemann I've been running the latest version of this PR since last night without problems, so I'm good with merging this now if you want, as we discussed on IRC.

@etu
Contributor

etu commented Jul 24, 2021

@ju1m

I've tested this, it seems to work, but some of the locking down seems to cause warnings (not errors though).

Jul 24 08:42:26 fenchurch zed[30343]: eid=179 class=history_event pool='zroot'
Jul 24 08:42:27 fenchurch syncoid[30341]: WARN: lzop not available on source ssh:-S /tmp/syncoid-root@vps05.elis.nu-1627108946 root@vps05.elis.nu- sync will continue without compression.
Jul 24 08:42:27 fenchurch syncoid[30341]: WARN: mbuffer not available on source ssh:-S /tmp/syncoid-root@vps05.elis.nu-1627108946 root@vps05.elis.nu - sync will continue without source buffering.
Jul 24 08:42:28 fenchurch syncoid[30383]: Error: /proc must be mounted
Jul 24 08:42:28 fenchurch syncoid[30383]:   To mount /proc at boot you need an /etc/fstab line like:
Jul 24 08:42:28 fenchurch syncoid[30383]:       proc   /proc   proc    defaults
Jul 24 08:42:28 fenchurch syncoid[30383]:   In the meantime, run "mount proc /proc -t proc"
Jul 24 08:42:28 fenchurch syncoid[30341]: NEWEST SNAPSHOT: autosnap_2021-07-24_06:30:07_frequently
Jul 24 08:42:28 fenchurch syncoid[30393]: Error: /proc must be mounted
Jul 24 08:42:28 fenchurch syncoid[30393]:   To mount /proc at boot you need an /etc/fstab line like:
Jul 24 08:42:28 fenchurch syncoid[30393]:       proc   /proc   proc    defaults
Jul 24 08:42:28 fenchurch syncoid[30393]:   In the meantime, run "mount proc /proc -t proc"

This is not the case when pushing the snapshots to a remote host, only when pulling snapshots from a remote host.

@ju1m
Contributor Author

ju1m commented Jul 24, 2021

@etu, oh, right, this is an important warning: syncoid calls ps to detect whether a zfs receive is already running on the receiving dataset. The error is caused by my recent addition of ProcSubset=pid:

$ sudo systemd-run --pipe -p DynamicUser=true -p ProcSubset=pid ps -Ao args=
Running as unit: run-u10883.service
Error: /proc must be mounted
  To mount /proc at boot you need an /etc/fstab line like:
      proc   /proc   proc    defaults
  In the meantime, run "mount proc /proc -t proc"

I am also relaxing ProtectProc=invisible back to ProtectProc=default to let ps have access to all the processes.
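
Expressed as a NixOS unit fragment, the relaxed settings would look roughly like this. The unit name is hypothetical, and this is a sketch of the idea rather than the exact change pushed to the PR.

```nix
{
  # "syncoid-example" is a hypothetical unit name used only for illustration.
  systemd.services."syncoid-example".serviceConfig = {
    # "all" is systemd's default; "pid" hides the non-process parts of
    # /proc, which makes ps fail as shown above.
    ProcSubset = "all";
    # "default" adds no extra restriction on seeing other processes in /proc.
    ProtectProc = "default";
  };
}
```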
Thanks!

etu merged commit 6984e68 into NixOS:master on Jul 24, 2021
@phryneas
Member

Could it be that this has hardened the service so much that it no longer has access to the ssh key? I ended up moving the ssh key into /etc, which seems to work, but with the key anywhere else it just fails.

@ju1m
Contributor Author

ju1m commented Feb 19, 2022

@phryneas yes, this PR chroots syncoid into a temporary RootDirectory=, so the key is only accessible to syncoid if it lives in one of the following directories (or one of their sub-directories): [ builtins.storeDir "/etc" "/run" ]. If you need to use a custom directory, you can add it to config.services.syncoid.service.serviceConfig.BindReadOnlyPaths.

Note that #147559 (awaiting reviews) uses the new systemd option LoadCredential= which takes care to mount the key in the RootDirectory=.

@phryneas
Member

Thank you @ju1m, config.services.syncoid.service.serviceConfig.BindReadOnlyPaths = [ "/var/lib/remote_backup/.ssh" ]; was the missing puzzle piece for me to make things work - and I learned something new about systemd (and how I just can add extra things to systemd services - I didn't know I could just add stuff to config.services.syncoid.service.serviceConfig).

I'm going to subscribe to that future PR.
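
Put together as a module fragment, the fix discussed above might look like the following sketch. The key file name is hypothetical, and the use of the `sshKey` option is an assumption about the module rather than something stated in this thread.

```nix
{
  services.syncoid = {
    enable = true;
    # Hypothetical key file inside the directory mentioned above.
    sshKey = "/var/lib/remote_backup/.ssh/id_ed25519";
    # Make the key's directory visible (read-only) inside the
    # RootDirectory= of every generated syncoid service.
    service.serviceConfig.BindReadOnlyPaths = [ "/var/lib/remote_backup/.ssh" ];
  };
}
```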

@ju1m
Contributor Author

ju1m commented Feb 20, 2022

@phryneas, note that you would usually reach serviceConfig through something like config.systemd.services.${serviceName}.serviceConfig, but because syncoid generates many systemd services, the convenience option services.syncoid.service makes it easy to set common settings for all of them.
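
The difference between the two routes can be sketched as follows. The unit name `syncoid-rpool-home` is hypothetical (the real name is derived from the command name), and the two settings are arbitrary examples chosen so that they do not conflict with each other.

```nix
{
  # Applied to every generated syncoid-*.service.
  services.syncoid.service.serviceConfig.IOSchedulingClass = "idle";

  # Applied to one generated unit only; the unit name here is hypothetical.
  systemd.services."syncoid-rpool-home".serviceConfig.Nice = 19;
}
```

If the same key is set through both routes with different values, the NixOS module system will typically report a conflict unless one definition uses lib.mkForce.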
