Describe the bug
Using networking.useDHCP is deprecated since e862dd637350ddd1812a6c1fb5811c6464e74ff5.
But:
It seems like a step back to me, having to list machine specific network interfaces in configuration.nix instead of being able to say "use DHCP (or not) for any interface you see". Also ref. https://github.com/NixOS/nixpkgs/issues/73595.
I think/hope networking.useDHCP = true can be mapped to networkd with something like this (untested):
# /etc/systemd/network/10-nixos-dhcp.network
[Match]
Name=*
[Network]
DHCP=yes
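For comparison, the same catch-all unit could be expressed in NixOS configuration through the systemd.network module. This is only a sketch and equally untested: the unit name "10-nixos-dhcp" simply mirrors the file name above, and it assumes networkd has been enabled via networking.useNetworkd.

```nix
{
  # Assumption: networkd is the active networking backend.
  networking.useNetworkd = true;

  # NixOS renders this as a .network unit under /etc/systemd/network/.
  systemd.network.networks."10-nixos-dhcp" = {
    # Matches every interface, which includes lo and virtual interfaces.
    matchConfig.Name = "*";
    # Request both DHCPv4 and DHCPv6.
    networkConfig.DHCP = "yes";
  };
}
```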
CC @globin.
@florianjacob: Why the thumbs down? I'm curious why using networking.useDHCP is suddenly a bad idea. Because of networkd somehow? Please explain.
About a year ago I banged my head against the global useDHCP and related global options, which caused a lot of problems with networkd and are / were enforced through a 99-main.network which matches all interfaces, like your code snippet. I can't explain the problems off the top of my head anymore though, as I have since disabled all of that stuff and configure systemd-networkd directly, without networking.interfaces, because it works / worked so badly with systemd-networkd. The deprecation itself is exactly because the 99-main.network catchall is removed in 20.03: https://github.com/NixOS/nixpkgs/blob/50295a12011334743defec979aff1d1789600f58/nixos/doc/manual/release-notes/rl-2003.xml#L130
If I remember correctly, one main cause is that systemd-networkd only applies the first network file that matches and ignores all others, which doesn't fit how the networking.interfaces module / global options are designed.
(Thumbs down just because I remember it's a good idea not to have that, thumbs up for documenting and explaining why that decision was made and what the problems were.)
@florianjacob: Thank you. I read the linked issues and have a better understanding of the problem now.
If some interfaces must be excluded from networkd control, and a whitelist is preferred, how about this:
networking.useDHCP = [ "en*" "wl*" ];
This could be the new default and have low priority so that per interface settings win. This should be machine agnostic AFAICT.
Well, if the above whitelist works, there is actually no need to change the API: useDHCP must simply change from matching all interfaces to the ones starting with "en" and "wl".
IMO there should be a way to override the whitelist: the boolean value true would use the default whitelist, while a list would override it.
Also, some machines have interfaces renamed to eth0, eth1, and so on, so just relying on "en" might not work; it should be "e".
I remember now I even have a setup with "wan0" and "lan0" interfaces (via udev rule), so even "e*" is not correct/enough.
Is there a way to match _real_ hardware interfaces?
In NetworkManager at least there seems to be the "hw" flag, but I suppose this flag could be recreated with the right values from /proc/.
The only problem would be that it'll likely have to happen at runtime, since the config could get built on any machine.
How is the live CD going to be configured without a global networking.useDHCP? One cannot know the names of the network interfaces beforehand, so there must be some wildcard/global match in NixOS _somewhere_. If the live CD can be made to work with networkd (I guess that's the plan), surely we can make the installed NixOS too?
@bjornfor wouldn't a "*" work for the live iso?
I think the point was to not match certain interfaces, like the loopback interface ('lo'). But I didn't pick up all the details of the above linked issues, so I could be wrong.
So there are two related problems at play here.
If there is no carrier or no DHCP response, the interfaces will stay in the configuring state and will delay network-online.target until either all interfaces are configured or a timeout is reached. Then network-online.target fails, and with it all services depending on it, even if one link has a configured and working connection.
This is not something we want to happen to new users or on NixOS upgrades that might switch to networkd by default.
Note that most other distributions I'm familiar with don't do DHCP on all interfaces by default but have their installer generate a sensible networking config by some kind of autodetection and asking the user. We're just using dhcpcd cleverly. I think this behaviour only makes sense on install media, where one cannot assume anything about the target. But on install media, IMHO, we should rather let the user just use NetworkManager and nmtui for ad-hoc configuration.
Specifically, I think the networking configuration should rather be a conscious decision by the user, either statically via configuration or dynamically via tools like NetworkManager. Users can still configure dhcpcd or networkd explicitly to run DHCP on whitelisted/blacklisted interfaces they deem useful for the job. I was also thinking of allowing wildcards/globs for networking.interfaces.<ifname> to simplify this, since both our DHCP clients would support it.
There is a way though to exclude interfaces from the network-online.target status checks: Network units can set RequiredForOnline=false. But setting this for the catchall DHCP networks would also break network-online.target for services that really rely on a working internet connection on start.
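As an illustration, this is roughly how a DHCP network could opt out of the online check in NixOS. It is a sketch under assumptions: the unit name "99-dhcp-fallback" and the interface globs are made up for the example, and RequiredForOnline= is set through linkConfig because it lives in the [Link] section of a .network unit.

```nix
{
  systemd.network.networks."99-dhcp-fallback" = {
    # Hypothetical selection of interfaces; adjust to the machine.
    matchConfig.Name = "en* wl*";
    networkConfig.DHCP = "yes";
    # Do not let these links hold back (or satisfy) network-online.target.
    linkConfig.RequiredForOnline = "no";
  };
}
```

The trade-off is the one described above: if every DHCP link is marked this way, network-online.target no longer says anything about whether a usable connection exists.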
After researching for my response here again, I noticed that systemd-networkd-wait-online now has an --any option, which would in principle result in the same behaviour we have for dhcpcd. Except that if we couple static configurations with DHCP on all (other) interfaces, the one statically configured interface without a default route would activate network-online.target, which is also not the behaviour we want, strictly speaking.
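Later NixOS releases expose this flag as an option of the systemd.network.wait-online module. A minimal sketch, assuming a NixOS release recent enough to have the option:

```nix
{
  # Consider the network online as soon as any one link is online,
  # i.e. run systemd-networkd-wait-online with --any.
  systemd.network.wait-online.anyInterface = true;
}
```

The caveat above still applies: a statically configured link without a default route would satisfy the check just as well as a link with working upstream connectivity.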
Also note that this way we have a kind of race condition with both dhcpcd and networkd anyway because acquiring an IP via DHCP on one interface does not necessarily mean we also get a default route that might come from another interface.
Matching for en* wl* ww* would potentially be enough, but only if predictable interface names are enabled (see man systemd.net-naming-scheme). If predictable interface names are disabled, we cannot assume anything, since the interface names are defined by the kernel/drivers and could be something like usb0 for USB network cards.
Furthermore, udev exposes the DEVTYPE property, which can be accessed via networkd units for matching via Type=. This would be ideal because we could match for ethernet and wifi cards individually. After looking at some hardware, this property is unfortunately not set on some hardware even though the interface is from a physical network card. Not sure if this is a kernel, driver or udev problem.
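The two selectors discussed here could look roughly like this in NixOS configuration. This is a sketch, not a tested setup: the unit names are invented, the name globs only make sense with predictable interface names enabled, and matching on Type= inherits the caveat that some hardware does not set DEVTYPE.

```nix
{
  # Select by predictable-name prefix (Ethernet, WLAN, WWAN).
  systemd.network.networks."80-dhcp-by-name" = {
    matchConfig.Name = "en* wl* ww*";
    networkConfig.DHCP = "yes";
  };

  # Select wifi cards by device type instead of by name.
  systemd.network.networks."81-dhcp-wifi-by-type" = {
    matchConfig.Type = "wlan";
    networkConfig.DHCP = "yes";
  };
}
```

Since networkd applies only the first matching .network file, an interface such as wlp3s0 would be handled by the first unit and never reach the second.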
But: Even though we might have a sensible selector that works with predictable interface names enabled, we have not yet solved the first problem.
It was more sensible for us to remove networking.useDHCP because we aren't sure how to implement a correct solution in networkd via either config or code. Moreover, though our current implementation with dhcpcd is working well for most cases, it is also a source of trouble for others, and it has bugs. And it is enabled by default!
@bjornfor Does this explain our rationale in a way that makes sense to you? What do you think?
Closing, as there has been no further reaction and I think @fpletz's comment is an adequate answer to the issue. Feel free to reopen if there are further questions!
@globin: My lack of response was mostly due to lack of time, not because I think this issue is not relevant anymore. In fact, I don't have a lot of time now either, so sorry for being brief.
@fpletz: Thank you for the detailed post. Here is my response, as an end user who doesn't know all the details:
It was more sensible for us to remove networking.useDHCP because we aren't sure how to implement a correct solution in networkd via either config or code.
That sounds like a perfectly good reason for why things are like they are with networkd, but IMHO not so much for deprecating networking.useDHCP. It sounds like there are issues with the networkd integration in NixOS (and some upstream projects?), not the idea of networking.useDHCP itself, and that networkd is not ready yet to be the default NixOS networking backend.
The move to networkd feels kind of rushed, ref. this issue and https://github.com/NixOS/nixpkgs/issues/73595.
@bjornfor Sorry that I didn't make my point clear enough and that I was stressing the move to networkd too much.
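networking.useDHCP should be removed because it's currently
- buggy (see the edge cases I described)
- does not do what its documentation states ("Whether to use DHCP to obtain an IP address and other configuration for all network interfaces that are not manually configured.") because the dhcpcd blacklist will still be applied silently.
If you still disagree about the removal, please come up with a sensible implementation instead. We can then also use that logic with networkd.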
networking.useDHCP should be removed because it's currently
- buggy (see the edge cases I described)
I only saw bugs / edge cases mentioned for the combination of networkd and networking.useDHCP. For networking.useDHCP alone, what's the problem?
- does not do what its documentation states ("Whether to use DHCP to obtain an IP address and other configuration for all network interfaces that are not manually configured.") because the dhcpcd blacklist will still be applied silently.
Do you mean the networking.dhcpcd.denyInterfaces option + hardcoded list of ignored interfaces from nixos/modules/services/networking/dhcpcd.nix (lo peth* vif* tap* tun* virbr* vnet* vboxnet* sit*)? I guess I always assumed the option was about _hardware_ interfaces, so I don't feel bad when now seeing that list of blacklisted interfaces. We can add in the word "hardware" before "network interfaces" too, to make the docstring more accurate. Does the current implementation cause any problems?
I don't see a reason to remove networking.useDHCP. It's an "abstract" option not tied to any particular implementation. Whether it enables dhcpcd or systemd's DHCP client is an implementation detail.
When https://github.com/NixOS/nixpkgs/issues/73595 gets fixed, I guess the plan is to run nixos-generate-config when adding/removing network interfaces? (Well, not my plan, but it seems we're heading that way.)
I tried nixos-generate-config on my machine and got this:
networking.useDHCP = false;
networking.interfaces.docker0.useDHCP = true; # wrong
networking.interfaces.enp2s0.useDHCP = true;
networking.interfaces.tun0.useDHCP = true; # wrong
networking.interfaces.vboxnet0.useDHCP = true; # wrong
networking.interfaces.wlp3s0.useDHCP = true;
So the thinking is that networking.useDHCP should be removed because it has a (hidden) blacklist, whereas without a blacklist you get that behaviour like above? I don't think that's an improvement.
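For this machine, a hand-corrected version would presumably keep only the two physical interfaces. A sketch, assuming docker0, tun0 and vboxnet0 are managed by their own tools, as the "# wrong" markers above suggest:

```nix
{
  networking.useDHCP = false;
  # Physical Ethernet and wifi interfaces only; the virtual interfaces
  # from the generated output are deliberately left out.
  networking.interfaces.enp2s0.useDHCP = true;
  networking.interfaces.wlp3s0.useDHCP = true;
}
```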
If there is no carrier or no DHCP response, the interfaces will stay in the configuring state and will delay network-online.target until either all interfaces are configured or a timeout is reached. Then network-online.target fails and thus all services depending on it, even if one link has a configured and working connection.
@fpletz isn't the normal behavior that the system tries to get an IP via DHCP and, when it doesn't get one, assigns itself a link local address?
Link local addresses allow machines to automatically have an IP address on a network if they haven't been manually configured or automatically configured by a special server on the network (DHCP). Before an address is chosen from that range, the machine sends out a special message (using ARP which stands for address resolution protocol) to the machines on the network around it (assuming that they also haven't been assigned an address manually or automatically) to find out if 169.254.1.1 is free. If it is, then the machine assigns that address to its network card. If that address is already in use by another machine on the same network, then it tries the next IP 169.254.1.2 and so on, until it finds a free address.
Source: https://serverfault.com/a/118329
So, can we get that behavior with networkd?
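networkd has a LinkLocalAddressing= setting in the [Network] section that may come close. A hedged sketch with a made-up unit name and interface glob; whether the IPv4 link-local address acts strictly as a fallback after DHCP fails, or is assigned alongside DHCP, depends on the systemd version and has not been verified here:

```nix
{
  systemd.network.networks."85-dhcp-with-ipv4ll" = {
    # Hypothetical match; any selector would do.
    matchConfig.Name = "en*";
    networkConfig = {
      DHCP = "yes";
      # Enable IPv4 link-local (169.254.0.0/16) addressing on the link.
      LinkLocalAddressing = "ipv4";
    };
  };
}
```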
I'm always for sane defaults. Do what the user expects. So we can implement a blacklist logic for interfaces that are configured automatically by other programs, like docker0, tun0, vboxnet0.
Can someone ask the systemd developers whether the features we need are supported now, or whether they will ever implement them? With that information, we can make an informed decision on how to proceed here to finish the release.
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/networking-usenetworkd-and-usedhcp/4352/2
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/networking-usenetworkd-and-usedhcp/4352/3
@fpletz When you mention network manager, are you saying that the global useDHCP = true isn't needed when network manager is used?