Describe the bug
It takes ages for the routing table to come up
See https://github.com/NixOS/nixpkgs/issues/49534#issuecomment-528516916 for original thread.
@johnalotoski I have moved the issue here.
Thanks very much @arianvp!
@johannesloetzsch a few more questions:
First of all, could you run the following command to get me some extra debug info:
find /sys/class/net | xargs -n1 udevadm -d test-builtin net_setup_link
Second of all, are you seeing this on all packet instances with bonded interfaces or just a particular instance class?
I'm wondering if we're running into this: https://wiki.debian.org/Bonding#udev_renaming_issue
Also, could you paste the output of:
echo "Slaves = $(cat /sys/class/net/bond0/bonding/slaves)"
echo "Primary = $(cat /sys/class/net/bond0/bonding/primary)"
echo "Active Slave = $(cat /sys/class/net/bond0/bonding/active_slave)"
I'll get some Packet boxes running NixOS to debug this, so it's not too important that you answer quickly. What I would like to know, though, is which instance types you're experiencing this issue on; that will help me a lot.
Hi @arianvp, the output of the requested commands is included in the zip file: 2019-10-10-packet-bonding-testing.zip
First, I ran the commands when the machine did not have the problem commit and the bonded network was behaving as expected (no arp issue). Then I applied the problem commit as illustrated previously in the asciinema video, rebooted, verified the network arp problem was back, and then ran the commands again.
I've seen the issue on both c2.medium.x86 and c1.small.x86 servers in EWR1 and AMS1. I haven't tested other types or regions beyond that, but in my experience the problem is consistent and easily reproducible. When spinning up a c2.medium.x86 you may hit a separate, occasional failure during provisioning: a block device that gets formatted for booting is unexpectedly renamed on the next reboot, the boot device label is no longer found, and provisioning fails. The c1.small.x86 doesn't have this extra issue and provisions and boots a little faster, so you may want to try reproducing with that type. I just provisioned a c1.small.x86 at the tip of nixpkgs master (18faa091c6ce6eb78cdb0d4a9399aa835ff2d9f1) and confirmed the issue is still there. Thanks!
From IRC:
6:36 <srhb> Alright, bisect done. Looks like bonds were broken in either 8c7e588362e708ade5e782c09dbdf84d06ab4254 or 1f03f6fc43a6f71b8204adf6cd02fb3685261add with the first being more likely, since that's the systemd -> 242 upgrade.
06:51 <srhb> I'll write down some notes on what's going on later, but the short story is that bonds apparently make up (random?) MAC addresses and then set the bonded devices hwaddr to the same, when the default behaviour _should_ be to set the bond MAC to that of the first device and _then_ set the same hwaddr for the remaining devices.
06:51 <srhb> I guess I'll bisect systemd too, if it's not too hard..
https://github.com/systemd/systemd/blob/master/NEWS#L705-L713 is our suspect at the moment.
systemd v242 assigns MAC addresses to bonds based on the hash of the interface name and the machine ID. Before, the bond0 MAC was the MAC of the lowest interface.
(also: https://www.cisco.com/c/en/us/support/docs/smb/switches/cisco-small-business-500-series-stackable-managed-switches/smb3088-link-aggregation-control-protocol-lacp-configuration-on-sx50.html).
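For reference, the v242 behaviour can also be switched off per device with a .link drop-in instead of pinning a literal MAC. This is a hypothetical sketch, not something from this thread or from the Packet images: the file name and the environment.etc delivery are my assumptions, but MACAddressPolicy=none is the documented setting for keeping the kernel-assigned MAC, which should give back the bonding driver's old take-the-first-slave's-MAC behaviour. udev reads .link files from /etc/systemd/network whether or not networkd manages the interface.

{
  # Hypothetical override, not part of the Packet-generated config:
  # stop udev from applying the new "persistent" MAC policy to bond0,
  # so the kernel-assigned MAC from the bonding driver is kept.
  environment.etc."systemd/network/10-bond0.link".text = ''
    [Match]
    OriginalName=bond0

    [Link]
    MACAddressPolicy=none
  '';
}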
So, the Packet machine would have the following metadata describing bond0:
"interfaces": [
  {
    "name": "eth0",
    "mac": "50:6b:4b:44:00:ea",
    "bond": "bond0"
  },
  {
    "name": "eth1",
    "mac": "50:6b:4b:44:00:eb",
    "bond": "bond0"
  }
],
and then it'd come up with its own MACs out of thin air:
[root@c2-medium-provtest:~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp5s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether 06:eb:e5:f0:53:2d brd ff:ff:ff:ff:ff:ff
3: enp5s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether 06:eb:e5:f0:53:2d brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 06:eb:e5:f0:53:2d brd ff:ff:ff:ff:ff:ff
note it is now using the MAC 06:eb:e5:f0:53:2d, which is not the MAC of either physical interface.
We think this confuses the upstream switch until the LACP timeout occurs, about 30 minutes later. I haven't confirmed this with Packet.
Manually taking the lower MAC (50:6b:4b:44:00:ea), we can add it to configuration.nix:
networking.interfaces.bond0.macAddress = "50:6b:4b:44:00:ea";
and the MAC will stay consistent and networking will work immediately.
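For context, here is a minimal configuration.nix sketch of where that pin sits relative to the bond definition. The slave interface names are taken from the ip link output above; the bond options are illustrative of a typical LACP setup and are not copied from the actual Packet-generated configuration, which already defines the bond for you. Only the macAddress line is the workaround.

{
  # Illustrative bond definition: interface names from the ip link output
  # above, bond options assumed for a typical LACP (802.3ad) setup.
  networking.bonds.bond0 = {
    interfaces = [ "enp5s0f0" "enp5s0f1" ];
    driverOptions = {
      mode = "802.3ad";   # LACP, matching the switch-side aggregate
      miimon = "100";
    };
  };

  # The workaround: pin the bond MAC to the lower physical MAC so the
  # switch keeps seeing the address it formed the LACP aggregate with.
  networking.interfaces.bond0.macAddress = "50:6b:4b:44:00:ea";
}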
The reason we were the first ones to find this, and why it took so long to debug, is that no other OS available on Packet offers systemd v242. Now that Ubuntu 19.10 will, this will have saved Packet a lot of hours of debugging :)
A huge thank-you to @srhb, @andir, @mmlb, @arianvp, @johnalotoski, @disassembler, and certainly others through this nightmare.
Awesome work all! What a great technical community!
Leaving a note for my future self.
Ubuntu 19.10 uses https://netplan.io, a YAML DSL on top of networkd. At first glance it offers no control over setting MAC addresses for bond devices, so people who run Ubuntu 19.10 with a bonded network are going to run into similar issues.
Has the fix for this been backported to 19.09?
The "fix" for now is hardcoding the MAC address of the bond to what it was before the systemd bump.
I think it's more a problem with the Packet networking environment than wrong behaviour of NixOS / systemd, and with more and more distributions switching to a later systemd version, it's not something limited to NixOS at all.
It'd be good to get some feedback from Packet about how they intend to address this.
In theory the script that generates the Packet images is fixed. Whether the 19.09 images on Packet have been updated is something we should check with @grahamc.
https://github.com/grahamc/packet-nixos is described as "The Official NixOS install images for Packet.net", so I assume their images are built with this configuration.
However, even if they are built that way, having to carry around packet-specific networking quirks is unfortunate - I hope the underlying problems get fixed.
People configuring their bridges with a recent enough networkd on other distros will run into the same problems.
@grahamc, perhaps the real question here is what should users with existing Packet machines do to upgrade? I assume it's necessary to update the configuration in /etc/nixos/packet somehow?
For the record, the workaround given earlier does not work on my c1.small.x86 instance. The timeouts persist. I have tried using the MAC addresses of both interfaces; neither works.
@bgamari did you reach out to Packet? I assume this will become a problem sooner or later, and we should fix the problem at the source…