Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/sshd: added missing systemd controls #34331

Closed
wants to merge 1 commit into from

Conversation

netixx
Copy link
Contributor

@netixx netixx commented Jan 27, 2018

Added after= and requires= systemd order options.
Without the two options, sshd fails to start if
ip binding is configured in ssh_config (because
ip address is not defined when sshd starts)

This fixes issue #30903 for sshd service.

Motivation for this change

Without this change, sshd fails to start when ip binding is configured in sshd.

Things done

Tested in my personal nixos setup (sshd wouldn't start without this).

  • Tested using sandboxing (nix.useSandbox on NixOS, or option build-use-sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@aneeshusa
Copy link
Contributor

cc me as openssh maintainer
@netixx can you provide more details about your sshd setup? My main questions are if you are using socket activation or not, and what you have configured your listenAddresses as. If we are using socket activation, I believe we shouldn't need this. If not, I believe we still shouldn't need this if binding to a loopback (localhost) address or 0.0.0.0, but I can see this being needed if specific interfaces are configured.
I'll try to do some more research on my end as well.

@netixx
Copy link
Contributor Author

netixx commented Jan 28, 2018

My setup is the following :

services.openssh.listenAddresses = [
      {
        addr = "X.X.X.X";
        port = 22;
      }
];
networking.interfaces = [
    {
      name = "LAN";
      ip4 = [
        {
          address = "X.X.X.X";
          prefixLength = 24;
        }
      ];
    }
];
networking.useNetworkd = true;

In my setup, X.X.X.X is an ip address assigned either on a bridge or a physical interface.
Im have also configured networking.useNetworkd = true; (which may change when the IP addresses are available for binding). I am not using socket activation.

I agree with your comments on socket activation. Also when I remove the listenAddresses parameters (sshd binds by default to any/0.0.0.0), it was starting as expected. For loopback addresses, I am not familiar with how the 127.0.0.0/8 loopback is brought up on linux (before systemd ?), but for other loopback interface (through this example https://www.aangelis.gr/blog/2016/04/multiple-loopback-ips-in-arch-linux), sshd might try to bind before it is available.

What first struck me as strange is the fact that the after= parameter is set in the current nixos implementation, but the required= or wants= is not set. They say in the manual https://www.freedesktop.org/software/systemd/man/systemd.unit.html that

It is a common pattern to include a unit name in both the After= and Requires= options, in which case the unit listed will be started before the unit that is configured with these options

which is why I added the requires= dependency in my patch but at the same time

After= ensures that the configured unit is started after the listed unit finished starting up

so it seems after should be enough for what we want to do. However (and I just noticed now), the target referenced in the current nixos implementation is network.target, which, according to the systemd manual, is not relevant for IP address binding.

Also, from a theoretical standpoint, we could consider that sshd is useless without network, and so the requires dependency make sense.

If we look at the systemd unit for sshd in ArchLinux, they only included After=network-online.target. I have also read (in not very reliable sources), that good practice is to use After=network.target network-online.target to retain compatibility with older systems (but I think it does not apply to nixos).

@fpletz
Copy link
Member

fpletz commented Jan 28, 2018

I've been testing networkd in the last few weeks and have a problem with this change and the way networkd is currently integrated into NixOS. If general DHCP (networking.useDHCP) is enabled, all interfaces are matched in the main network unit and will be configured with DHCP. If only one interface can't be configured via DHCP network-online.target will never be reached. This is even the case when one interface is configured statically.

I think we should rather fix our networkd integration than adding this change prematurely and thus possibly break sshd for regular networkd users.

EDIT: This is also the case if networkd is not used. If there is no DHCP available even though some interfaces are statically configured => no network-online.target => no sshd.

@netixx
Copy link
Contributor Author

netixx commented Jan 31, 2018

@fpletz, it's true the the network-online.target can be troublesome, I have experienced failures of the network-online.target (see #30904) which broke my system, but I had to fix it because the other services (say unbound), would also fail. Probably, the dhcp should timeout and the system should always reach the network-online.target (I think this behaviour is already implemented in systemd-networkd) ?

If I understand the manual correctly, the after= wil start the service event if network-online.target fails:

Most importantly, for service units start-up is considered completed for the purpose of Before=/After= when all its configured start-up commands have been invoked and they either failed or reported start-up success.

I propose 3 solutions (assuming the above assumption is correct):

  1. Use After=network-online.target. And conditionally change the Requires= dependency: If address binding is configured, then use Requires=network-online.target, else, don't configure the Requires= dependency.
  2. Use After=network-online.target. And use the Wants=network-online.target dependency. What this will do is stop sshd if network is stopped, but it will start sshd even if network-online.target fails.
  3. Use After=network-online.target, and don't use either Requires= nor Wants=. After discussing this issue, I think this the best option (it seems to be the choice that ArchLinux made, for example).

In the end, I think we need to implement at least one of these options, because, in the current setup, the sshd will fail to start if address binding is configured (which caused me quite a bit of trouble when I upgraded to 17.09 and had to revert to the serial console).

What do you think is the best course of action ?

@grahamc
Copy link
Member

grahamc commented Jan 31, 2018

@GrahamcOfBorg test openssh

Copy link

@GrahamcOfBorg GrahamcOfBorg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Success on aarch64-linux (full log)

Partial log (click to expand)

2 out of 2 tests succeeded
server_lazy# [   36.931141] systemd[981]: Reached target Shutdown.
test script finished in 38.92s
cleaning up
killing client (pid 627)
killing server_lazy (pid 638)
killing server (pid 651)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/z4b81hf2svad6h4i5fkyqh2mnxzbhhz9-vm-test-run-openssh

Copy link

@GrahamcOfBorg GrahamcOfBorg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failure on x86_64-linux (full log)

Partial log (click to expand)

cannot build derivation ‘/nix/store/xrga9zij6k202vc0w7c05r5m0ga83y2l-closure-info.drv’: 1 dependencies couldn't be built
cannot build derivation ‘/nix/store/0xl99baq75h8qia3g3qbcnqfsa9xk9ns-run-nixos-vm.drv’: 2 dependencies couldn't be built
cannot build derivation ‘/nix/store/2kndnc9qhblkxlamnr82c66p8nq3msrk-run-nixos-vm.drv’: 2 dependencies couldn't be built
cannot build derivation ‘/nix/store/9yapbrbrpir6wi9m2ixxwjmfjksvx7bm-run-nixos-vm.drv’: 2 dependencies couldn't be built
cannot build derivation ‘/nix/store/mbs84zwv9v96ibfra526jfqqvy2d25wa-nixos-vm.drv’: 2 dependencies couldn't be built
cannot build derivation ‘/nix/store/nia8rwa6gajgizf6d0z51sj2840jn7qk-nixos-vm.drv’: 2 dependencies couldn't be built
cannot build derivation ‘/nix/store/rgbgq7k1j0kg2f177gm5hsznzmgc24x0-nixos-vm.drv’: 2 dependencies couldn't be built
cannot build derivation ‘/nix/store/7jqcizy5vjnfvd2jfyx9qnlmgr6arq1h-nixos-test-driver-openssh.drv’: 3 dependencies couldn't be built
cannot build derivation ‘/nix/store/2c7rsnr5djw8bw5wgyg6yrm15bpl5xx0-vm-test-run-openssh.drv’: 1 dependencies couldn't be built
error: build of ‘/nix/store/2c7rsnr5djw8bw5wgyg6yrm15bpl5xx0-vm-test-run-openssh.drv’ failed

@netixx
Copy link
Contributor Author

netixx commented Feb 7, 2018

I could not access full logs on the x86_64-linux but I assume since the aarch is working (and this change is not a program change) the problem is elsewhere ?

Added after= and requires= systemd order options.
Without the two options, sshd fails to start if
ip binding is configured in ssh_config (because
ip address is not defined when sshd starts)

This fixes issue NixOS#30903 for sshd service.
@netixx
Copy link
Contributor Author

netixx commented Feb 27, 2018

I will close this, since I think using startWhenNeeded and/or firewall rules is a better option than bindAddr.

@netixx netixx closed this Feb 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants