-
-
Notifications
You must be signed in to change notification settings - Fork 15.5k
wireguard: 0.0.20190406 -> 0.0.20190531 and Change peers without tearing down the interface, handle DNS failures better #62325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Before, changing any peers caused the entire WireGuard interface to be torn down and rebuilt. By configuring each peer in a separate service we're able to only restart the affected peers. Adding each peer individually also means individual peer configurations can fail, but the overall interface and all other peers will still be added. A WireGuard peer's internal identifier is its public key. This means it is the only reliable identifier to use for the systemd service.
@@ -265,25 +316,8 @@ let | |||
wg set ${name} private-key ${privKey} ${ | |||
optionalString (values.listenPort != null) " listen-port ${toString values.listenPort}"} | |||
|
|||
${concatMapStringsSep "\n" (peer: | |||
assert (peer.presharedKeyFile == null) || (peer.presharedKey == null); # at most one of the two must be set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assert has been moved down in to the assertions, which provides a reasonable error message than "assertion failed" :P
@GrahamcOfBorg build wireguard-tools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Also switched over my system to this PR.
We probably either want to revert ced7cfc in 19.03, or backport this PR to 19.03. I guess doing the former makes more sense. |
I think backporting all wireguard snapshots makes more sense. This is what I used to do in the past. |
I wasn't referring to the backporting of wireguard snapshots, but on the change of retry behaviour introduced in the mentioned commit, which we should revert on 19.03, too.
|
Sorry guys, you might want to bump again to 0.0.20190601. 0531 has some issues on Intel Atom. |
This reverts commit ced7cfc. See: NixOS#62325
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let | ||
keyToUnitName = replaceChars | ||
[ "/" "-" " " "+" "=" ] | ||
[ "-" "\\x2d" "\\x20" "\\x2b" "\\x3d" ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason you didn't use utils.escapeSystemdPath
here?
Unfortunately that seems to prevent to start client interfaces properly. Having it applied, it did properly start the server interface, but several client interfaces do show up as started as well but there's no ping/connection possible.
And on the server side:
That was before I run Edit: It seems the problem is with the generated unit file. If it's a wg client, there doesn't seem any reference to the wg server in it: /nix/store/4gjzpinpamy8zs0xdbnmdafhvd7hrfpm-unit-script-wireguard-wg_hb-start
On a different machine, that isn't upgraded yet to latest code, the generated unit file looks like this:
So, the difference is that |
It actually seems the two commands
|
It seem the peer unit files are properly created - but not called by the current unit files:
results in:
Looking at
Looking at the ExecStart file
but as said before, two instructions for the peer (in this case the server peer) are missing. However, in
Looking there at the ExecStart file
So that's the missing part. When running those two commands, the tunnel now works. But no idea where the peer unit file also should be called from |
I can't reproduce your bug report, @sjau. Please ping me on IRC when you're able to debug further. |
FWIW here is the test case I wrote to try and reproduce: import ../make-test.nix ({ lib, pkgs, ...} : let
globalPeers = [
{
idx = 1;
private = "QHWxUiMMnCNdYlUUowXOhwA9dqFj0mPW+hlrfL8+33U=";
public = "vcLLUIzLkTuv60vV7+nMsWUDSh+R+g+YR2Jn+Kjlc1g=";
}
{
idx = 2;
private = "yEhxYq3jxZMXMBCofjGNG+GomUmrwVUHvWvuFRWC2HU=";
public = "AMYt7rajfDZfupU8+m/uQ8C+1R++zBrI+Z8wuQylXio=";
}
{
idx = 3;
private = "mDXJoAJmFaERpaP6Wvll/FqoGTxZA0CgnL68dFs4XGM=";
public = "UWBRfv+VsUOLIi5iD++Zurl4wx+S+A6r39flcI3jyhA=";
}
{
idx = 4;
private = "gDINget094ZZNd2AV1lFxi3wOxbrtG74oZ/7FOEY30Q=";
public = "/aqhS+qrcRMQWdCQjnnc++hggRqU+4s26EEVlnsFcTA=";
}
{
idx = 5;
private = "SBil1s3P9Uam6zJTdm/GOjgozaoVPha1ThS49OGbO0E=";
public = "+l2Q+6Ap7y2MavVR1SR+qMQQUSp4shwNeFjcodhr+lw=";
}
];
peersFor = self: lib.remove self globalPeers;
in {
name = "wireguard-interesting-keys";
meta = with pkgs.stdenv.lib.maintainers; {
maintainers = [ ma27 grahamc ];
};
nodes = lib.listToAttrs (map (self: {
name = "peer${toString self.idx}";
value = {
networking.useDHCP = false;
networking.interfaces.eth1 = {
ipv4.addresses = lib.singleton {
address = "192.168.0.${toString self.idx}";
prefixLength = 24;
};
};
networking.wireguard.interfaces.wg0 = {
ips = [ "10.10.10.${toString self.idx}/24" ];
listenPort = 12345;
privateKey = self.private;
peers = map (peer: {
allowedIPs = [ "10.10.10.${toString peer.idx}/32" ];
endpoint = "192.168.0.${toString peer.idx}:12345";
publicKey = peer.public;
persistentKeepalive = 1;
}) (peersFor self);
};
};
}) globalPeers);
testScript = ''
startAll;
# Wait for wireguard-wg0 to be up
${lib.concatMapStringsSep "\n" (self: "$peer${toString self.idx}->waitForUnit(\"wireguard-wg0\");") globalPeers}
${lib.concatMapStringsSep "\n" (self: "print $peer${toString self.idx}->succeed(\"wg\");") globalPeers}
# Ping all the peers
${lib.concatStringsSep "\n"
(lib.flatten (map
(self: map
(peer: "$peer${toString self.idx}->succeed(\"ping -c1 10.10.10.${toString peer.idx}\");")
(peersFor self)
)
globalPeers
))}
'';
}) My suspicion was your system wasn't working because you had keys started with a |
On latest nixos unstable it still fails to properly start the wg peers. And use domain names instead of IP addresses and it will very likely fail for you also becuase of missing dns resolution during boot up. Your test case just won't cover real bootup missing dns. Simple solution would probably be to change the wg peer type from oneshot to simple with retry on fail. At least when this was the case before, everything worked fine. |
@sjau can you open an issue with some more detail on how this fails (like log output of the failing units)? |
Motivation for this change
In #30459 @sjau reported a problem with starting up a system with WireGuard where peers used DNS. On boot, the system would start the WireGuard interface, and attempt to add a peer by hostname. This would fail because the system wasn't able to resolve domains yet.
In #61971 sjau fixed this bug by having the wg startup unit restart until it worked, but this caused @zx2c4 great heartache.
Relatedly, having the entire interface and all peers disappear because one peer failed to resolve is not a nice or acceptable failure mode. This is related to another problem: changing a single peer causes the interface to go away and be restarted from scratch. This isn't necessary, either :)
cc @Mic92
Things done
sandbox
innix.conf
on non-NixOS)nix-shell -p nix-review --run "nix-review wip"
./result/bin/
)nix path-info -S
before and after)