New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hetzner: Don't use localCommands, set interface device of default gateway #1032
Conversation
Good stuff! |
I see one last issue in
|
nix/hetzner-bootstrap.nix
Outdated
@@ -5,7 +5,6 @@ let | |||
|
|||
nixpart = python2Packages.nixpart0.override { | |||
useNixUdev = false; | |||
udevSoMajor = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's actually something I wanted to break up into a separate commit, but it was left in that stash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
nixops/backends/hetzner.py
Outdated
gw_ip_cmd = default_gw_cmd + " | cut -d' ' -f1" | ||
gw_dev_cmd = default_gw_cmd + " | cut -d' ' -f3" | ||
gw_ip = self.run_command(gw_ip_cmd, capture_stdout=True).strip() | ||
gw_dev = self.run_command(gw_dev_cmd, capture_stdout=True).strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably would make sense to use only one single command using spli
t instead of the cut
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tests/hetzner-backend/default.nix
Outdated
@@ -1,7 +1,7 @@ | |||
{ system, nixops }: | |||
|
|||
with import <nixpkgs/nixos/lib/testing.nix> { inherit system; }; | |||
with import <nixpkgs/nixos/lib/qemu-flags.nix>; | |||
with import <nixpkgs/nixos/lib/qemu-flags.nix> { inherit pkgs; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here (different commit)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@nh2: Did you test whether this works on a server with additional subnets? |
@aszlig I think the
that we're generating now is unnecessary. It gets placed into
This is because |
@nh2: Also, what's missing here (and what I wanted to implement when testing this) is a migration function that migrates the old |
I've pushed a fix for that: Only generate the route if it's not for the device that's the default gateway. |
…eway. Should help #833. Original localCommands patch by aszlig. Fixups and interface device stuff from me. Note that this new way of doing it without localCommands is backwards incompatible with NixOS < 18.03, see: https://nixos.org/nixos/manual/release-notes.html#sec-release-18.03-notable-changes As described on https://wiki.hetzner.de/index.php/Netzkonfiguration_Debian/en#IPv4 Hetzner requires that even traffic to the local subnet goes via the gateway. NixOps already had code for that (set in `localCommands`), but it wasn't enough: The ip route replace default via "GATEWAY_IP" proto static generated by `network-setup.service` in NixOS's `network-interface-scripted.nix` fails with RTNETLINK answers: Network is unreachable because all routes added so far had `via "GATEWAY_IP"`, but the kernel didn't know how to actually reach GATEWAY_IP because there was no static route for that IP, see. https://serverfault.com/questions/581159/unable-to-add-a-static-route-sioaddrt-network-is-unreachable/581162#581162 This commit adds an explicit static route to the default gateway, on the relevant network device. This allows us to generate the following instead: ip route replace GATEWAY_IP dev THE_DEVICE proto static ip route replace default via "GATEWAY_IP" dev THE_DEVICE proto static so that the kernel knows how to reach the gateway. An example of what routes look like in `ip route` after deploying and 18.03 machine, where 1.2.3.4 is the server's IP and 1.2.3.1 is the gateway: default via 1.2.3.1 dev eth0 src 1.2.3.4 metric 202 1.2.3.0/27 dev eth0 proto kernel scope link src 1.2.3.4 metric 202 1.2.3.1 dev eth0 proto static scope link For IPv6, the link-local `fe80::1` is the gateway, as described on https://wiki.hetzner.de/index.php/Netzkonfiguration_Debian/en#IPv6 Signed-off-by: Niklas Hambüchen <mail@nh2.me>
87f6171
to
bb9fe12
Compare
@aszlig I've written an ad-hoc one that goes via Perhaps the logic in there is useful to some extent for making the proper migration script. https://gist.github.com/nh2/7707e0069df1d4bebc6572c40c85b673 |
@nh2 about IPv4, can we get NixOps and NixOS networking to configure something like this instead?
We arent supposed to reach anything on the same L2 except the gateway anyhow, this avoids having to configure a separate route for whatever netmask might be set - I saw both |
@flokli In general I agree that there's no point not to set netmask I considered doing that, but then opted to first make the minimal change in this PR that fixes the issue. I would recommend implementing the |
I was suggesting the /32 for this PR, as we started discussing about migrations, and then we'd probably need to worry about that a second time, when changing to the /32.
I run a such a network configuration on some Hetzner servers for quite a while now and consider it much less flaky.
|
@flokli This PR doesn't do migrations, I'd probably add that with a separate PR as well. I care first about that new Hetzner deployments work, and second that old ones can be migrated. I am hesitant to auto-migrate nixops Hetzner users from There must be some reason why Hetzner doens't just set it up as Again, I'm all for the |
Same should apply for not auto-migrating existing deployments without asking back ;-)
Additional IPv4 adresses are routed via the main IPv4 adress, so you can either add it with netmask /32 on any interface, or route it somewhere else internally (for example, as a peer route via a tunnel link)
Fine, so let's merge this, and add some migration assistance and the |
This is based on my initial (untested) patch[1] to get rid of the use of networking.localCommands and instead uses the appropriate attributes in networking.interfaces, networking.defaultGateway and networking.defaultGateway6 instead. The reason why I used localCommands in the first place was that we didn't have the necessary attributes in NixOS at the time I was writing the implementation. What is still missing is the migration from the old physical spec to the new one, but we can still do that later. The commit I'm merging here is this patch plus fixes and actual testing by @nh2 (thank you!). [1]: https://gist.github.com/aszlig/84e9bf4c8c2a06e77e43d64f89a5c2fc
@flokli: I tested this with a new server just now and routing to the same subnet doesn't work as it's missing the gateway. So I guess I'll add the |
@flokli: Actually, never mind about the |
This was already suggested by @flokli in NixOS#1032 and it basically makes sure that all traffic is sent to the gateway instead of being sent directly. The Hetzner Wiki at [1] describes configuration of Debian to include a "peertopeer" directive in /etc/network/interfaces, which roughly translates to the command "ip address add ... peer A.B.C.D". However, after testing with and without the peer keyword of "ip address" I didn't notice any difference in behaviour in comparison to setting a plain /32 prefix length as we do now. Apart from making configuration less complicated, this also gets rid of a bunch of code we now no longer need, eg. calculating subnet masks or getting the real prefix length. Tested on a newly deployed PX61-NVMe. [1]: https://wiki.hetzner.de/index.php/Netzkonfiguration_Debian/en#Dedicated_Servers Signed-off-by: aszlig <aszlig@nix.build>
Should help #833.
Original localCommands patch by aszlig.
Fixups and interface device stuff from me.
Note that this new way of doing it without localCommands is backwards
incompatible with NixOS < 18.03, see:
https://nixos.org/nixos/manual/release-notes.html#sec-release-18.03-notable-changes
As described on
Hetzner requires that even traffic to the local subnet goes via the gateway.
NixOps already had code for that (set in
localCommands
), but it wasn't enough:The
generated by
network-setup.service
in NixOS'snetwork-interface-scripted.nix
fails with
because all routes added so far had
via "GATEWAY_IP"
, butthe kernel didn't know how to actually reach GATEWAY_IP because
there was no static route for that IP, see.
This commit adds an explicit static route to the default gateway, on the
relevant network device.
This allows us to generate the following instead:
so that the kernel knows how to reach the gateway.
An example of what routes look like in
ip route
after deploying and 18.03machine, where 1.2.3.4 is the server's IP and 1.2.3.1 is the gateway:
For IPv6, the link-local
fe80::1
is the gateway, as described on