Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/networkd: improved behavior with networking.enableIPv6=false #35841

Closed
wants to merge 1 commit into from

Conversation

netixx
Copy link
Contributor

@netixx netixx commented Feb 27, 2018

Minor refactoring to set correct options when ipv6 is disabled,
namely:

  • Configure DHCP=ipv4
  • deactivate linklocal addresses configuration and router advertisment
    listen

In the process, a bug was fixed in the DHCP option declaration which
excpected "v4" or "v6" whereas upstream documentation says it should
be "ipv6" or "ipv4".

Also, improved the system by not generating 99-main if no global options
(networking.useDHCP, defaultGateways, search/domains) are configured,
this prevent breaking interfaces such as tunnels when it is not needed
(otherwise interfaces end up in the "configuring" state, see #30904).

Motivation for this change
  • Prevents networkd controlled interfaces from hanging in the configuring state when the current network environment is not IPv6 (Issue [NixOS] systemd-networkd-wait-online.service fails with bridge interfaces configured #30904).
  • Fixes configuration token DHCP which should have value "ipv4" or "ipv6" (and not "v4" or "v6") systemd networkd docs
  • Remove "99-main" catch-all if it is not required by a global setting (such as networking.useDHCP). This prevents networkd from trying to control unspecified interfaces (such as tunnel interfaces) when it is not needed (the 99-main would be empty and apply no configuration).
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option build-use-sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@netixx
Copy link
Contributor Author

netixx commented Feb 27, 2018

Upon further testing of this PR (namely vlans), I encountered a problem :

error: The unique option `systemd.network.networks.40-pLAN.networkConfig.IPv6AcceptRA' is defined multiple times, in `/var/nixpkgs/nixos/modules/tasks/network-interfaces-systemd.nix' and `/var/nixpkgs/nixos/modules/tasks/network-interfaces-systemd.nix'.

I think this is because the attribute systemd.network.networks.40-pLAN is called multiple times (by the vlan interface and the parent interface for example).

I think this is because if forgot the "override" for both IPv6 related options in genericConfig (which anyways should be there).

I tried to add override, but then I had the following errors:

cannot coerce a set to a string, at /var/nixpkgs/nixos/modules/system/boot/systemd-lib.nix:65:8

which as I understand it, is due to the fact that mkOverride returns a set.

I have reached the limit of my knowledge about this and I could use the help of someone more versed in the matter to help me.

Thank you.

} // optionalAttrs (domains != [ ]) {
domains = override domains;
} // optionalAttrs (! cfg.enableIPv6) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the default value of enableIPv6?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value for networking.enableIPv6 is true.

@Mic92
Copy link
Member

Mic92 commented Mar 1, 2018

cc @andir

@netixx
Copy link
Contributor Author

netixx commented Mar 1, 2018

I think patching genericNerwork is best because it applies ipv6 de-activation everywhere it's needed, but that what is giving me trouble... Maybe there is a better way ?

if useDHCP == true || useDHCP == null
then
if cfg.enableIPv6
then "both"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am (and never was, so far) happy with how this worked.

Let me quote the DHCP= option from systemd.network:

    DHCP=
       Enables DHCPv4 and/or DHCPv6 client support. Accepts "yes", "no",
       "ipv4", or "ipv6".

       Note that DHCPv6 will by default be triggered by Router Advertisement,
       if that is enabled, regardless of this parameter. By enabling DHCPv6
       support explicitly, the DHCPv6 client will be started regardless of the
       presence of routers on the link, or what flags the routers pass. See
       "IPv6AcceptRA=".

The painpoint for me is the fact that we will enable DHCPv6. DHCPv6 is not the default way of configuring IPv6. It will not hurt (most people, anyway).

I would prefer having an option to specify the IPv6 DHCP configuration on it's own.

What do you think of having useDHCP which enables two (new) options: useDHCPv4 & useDHCPv6? Whoever cares about such details (like me) can then use the specific protocol switch to configure it.

Or, maybe just allow useDHCP to be an types.nullOr enum [ "off" "ipv6" "ipv4" "both" ];?

Opinions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good observation indeed. I think I got confused because both is not a valid option....

My goad writing this was to only not enable DHCPv6 if it was not strictly needed. What I needed in my case, was to use DHCP=ipv4 so that there was that the interface would get stuck in the configuring state.

I don't think the use of networking.enableIPv6 = false is mainstream, the only reason I used it is to work around problems like coreos/bugs#1419.

} // optionalAttrs (! cfg.enableIPv6) {
# if IPv6 is disabled, we shoud define additionnal attributes
# see https://github.com/coreos/bugs/issues/1419
networkConfig.LinkLocalAddressing = "no";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of this... Do you really have to do this?

The way I read the bug report is:
If you disable IPv6 (which would only be configured by SLAAC anyway, so your network already is opted-in?) then booting up causes issues?
Why would you disable IPv6 then? I mean there is no good reason to NOT support IPv6 in 2018... Yes corporate networks and stuff but I am not sorry.. I has been many years to upgrade your legacy IT…

Copy link
Contributor Author

@netixx netixx Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use my nixos box in a home gateway scenario. My provider does not provide a static v6 address, which is why I didn't configure IPv6 in my home (I was waiting for time to devise a solution for this).

Anyways, I think if ipv6 is actually disabled (cfg.enableIPv6 = false), then we should indeed disable those options, for coherence sake, and to avoid interfaces getting stuck in the configuring stage (and thus the network-online.target failing). If the user wants to disable IPv6, we should respect this everywhere we ca,.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. If someone decides to disable IPv6 we should honor that.

Are you able to write a small NixOS test to verify that the behavior is as expected? (Both with IPv6 disabled/enabled)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll think about a test, do you have any pointers for writing one ?

Also, I need to fix the error: The unique option systemd.network.networks.40-pLAN.networkConfig.IPv6AcceptRA' is defined multiple times, in /var/nixpkgs/nixos/modules/tasks/network-interfaces-systemd.nix' and /var/nixpkgs/nixos/modules/tasks/network-interfaces-systemd.nix'.
` error I am having with the current implementation (due to the fact that, for vlans, the underlying physical interface is referred to multiple times). Do you know anyone would could help me fix this ?

Thanks

}
(mkIf (domains != [ ] || defaultGateways != [ ] || cfg.useDHCP) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean we won't create a default network (read: we won't do ANY configuration) without DHCP, search domain or an Gateway?

Out of my hea: If I use SLAAC with IPv6 I MUST set useDHCP or else I do not have a .network file (and thus no networking)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I had, is that this 99-main network file was messing my setup (again, without IPv6). For example, my tunnel interfaces (from openvpn) where rendered unusable because networkd was messing with them. Also, I had the configuring problems as well.

To cover the problem of ipv6, maybe we add cfg.enableIPv6 to the conditionnal ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are gaining a lot of complexity in the networking configuration. If you think that would solve the issue go for it.

I would like to see tests for these kind of things so we can verify that it is working as intended vs. having humans parse nix expressions :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't like this implementation very much.

We can use the following to prevent networkd from messing with all the interfaces :

[Link]
Unmanaged=yes

What do you think about the following solution: we add a global parameter networking.acceptRA, if networking.useDHCP=false and networking.enableIPv6=false and networking.acceptRA=false we add the unmanaged flag in the "99-main" unit.

I think this is simpler, and enables use of domains and defaultGateway without messing with interfaces (e.g. tunnels) when it's not needed. I will test this if you think this is better.

@Mic92
Copy link
Member

Mic92 commented Mar 1, 2018

Please excuse the passive-aggressive tone of @andir. He is an ipv6 advocate and has a bad day, if he has to disable ipv6 somewhere :).

@netixx
Copy link
Contributor Author

netixx commented Mar 1, 2018

First, thank you for taking a look at this.

I don't mind that much :), the points are very interesting, I would say I am also an advocate of IPv6 as a principle, but I have little knowledge of it (and my ISP is not making it easy...), which shows in this PR.

Whatever we decide on the subject of adding or not those lines, I would be interested in understanding why I got this error when using vlans (to improve my knowledge of the nix language).

])
(assertValueOneOf "DHCP" ["both" "none" "v4" "v6"])
(assertValueOneOf "DHCP" ["both" "none" "ipv4" "ipv6"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also change both to yes and none to no as it was introduced in systemd/systemd@cb9fc36. However according to this code the old style should still work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done :)

@matthewbauer
Copy link
Member

Hey all. Checking in on the status of this. Any objections to merging?

} // optionalAttrs (domains != [ ]) {
domains = override domains;
} // optionalAttrs (! cfg.enableIPv6) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is still causing problems when using vlans (because the property is merged twice, see other comments)

@netixx
Copy link
Contributor Author

netixx commented Apr 29, 2018

There is still a problem when using vlans (see above comments).

@flokli
Copy link
Contributor

flokli commented May 12, 2018

Also bumped into the wrong DHCP options problem, and some missing fields as well while trying to mirror my setup.

I think it makes sense to split the changes proposed here into more granular commits:

  • fixing values for DHCP
  • adding values for IPv6AcceptRA, LinkLocalAddressing (+IPv6Token)
  • "disable ipv6" logic

@netixx Could you split your commit like this? This will probably make a review a bit easier, and allow to cherry-pick some of the fixes.

In addition, should probably decide on what to do with other fields and sections not currently supported by this module in the longer run.
IMHO, having all these assertions in here makes the module far more complex than it has to be.

Maybe it would even make sense to not duplicate networkd's parsing logic, or have some automated process to write these assertions, instead of manually trying to keep them in sync.

@flokli flokli mentioned this pull request May 15, 2018
8 tasks
@flokli
Copy link
Contributor

flokli commented May 15, 2018

@netixx I copied over your DHCP value fixes plus some of the field additions to #40573. Please comment! :-)

I don't really understand the vlan problem described above.
Do you have some example configuration you can point me at?

@netixx
Copy link
Contributor Author

netixx commented Jun 3, 2018

Thank you for including the changes I made here for the assertions !

For the vlan part:

In genericConfig, I (conditionally) define the sub-attribute networkConfig.LinkLocalAddressing = "no";. When using an interface (say eth0) and configuring an address (using networkd), it works as expected. But when I define a vlan on eth0 (say eth0.100), and assign an address to eth0.100, then I get an error (.networkConfig.IPv6AcceptRA' is defined multiple times)

This is because vlans have a dependency on the physical interface that triggers genericNetwork on eth0. Since it is already used by the configuration of the address on eth0, it triggers a conflict. What I don't understand is why this attribute is not merged by nix ?

Minor refactoring to set correct options when ipv6 is disabled,
namely:
- Configure DHCP=ipv4
- deactivate linklocal addresses configuration and router advertisment
  listen

Also, improved the system by not generating 99-main if no global options
(networking.useDHCP, defaultGateways, search/domains) are configured,
this prevent breaking interfaces such as tunnels when it is not needed
(otherwise interfaces end up in the "configuring" state, see NixOS#30904).
@netixx
Copy link
Contributor Author

netixx commented Oct 8, 2018

I am closing this since:

  • I don't think it will be merged, given what the community thinks
  • IPv4 only deployments should not be the norm in the future
  • Changes about RA have been merged in another PR
  • The duplicate definition error could not be solved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants