Describe the bug
It takes ages for the routing table to come up
See https://github.com/NixOS/nixpkgs/issues/49534#issuecomment-528516916 for original thread.
@johnalotoski I have moved the issue here.
Thanks very much @arianvp!
@johannesloetzsch a few more questions:
First of all, could you run the following command to get me some extra debug info:
find /sys/class/net | xargs -n1 udevadm -d test-builtin net_setup_link
Second of all, are you seeing this on all packet instances with bonded interfaces or just a particular instance class?
I'm wondering if we're running into this: https://wiki.debian.org/Bonding#udev_renaming_issue
Also, could you paste the output of:
echo "Slaves = $(cat /sys/class/net/bond0/bonding/slaves)"
echo "Primary = $(cat /sys/class/net/bond0/bonding/primary)"
echo "Active Slave = $(cat /sys/class/net/bond0/bonding/active_slave)"
I'll get some Packet boxes running NixOS to debug this, so it's not too important that you answer quickly. What I would like to know, though, is which instance types you're experiencing this issue on; that will help me a lot.
Hi @arianvp, the output of the requested commands is included in the zip file: 2019-10-10-packet-bonding-testing.zip
First, I ran the commands when the machine did not have the problem commit and the bonded network was behaving as expected (no arp issue). Then I applied the problem commit as illustrated previously in the asciinema video, rebooted, verified the network arp problem was back, and then ran the commands again.
I've seen the issue on both c2.medium.x86 and c1.small.x86 servers in EWR1 and AMS1. I haven't tested other types or regions beyond that, but in my experience the problem is consistent and easily reproducible. When spinning up a c2.medium.x86 you may hit a separate, occasional failure during provisioning: a block device that gets formatted for booting is unexpectedly renamed on the next reboot, the boot device label is no longer found, and provisioning fails. The c1.small.x86 doesn't have this extra issue and provisions and boots a little faster, so you may want to try reproducing with that type. I just provisioned a c1.small.x86 at the tip of nixpkgs master (18faa091c6ce6eb78cdb0d4a9399aa835ff2d9f1) and confirmed the issue is still there. Thanks!
From IRC:
6:36 <srhb> Alright, bisect done. Looks like bonds were broken in either 8c7e588362e708ade5e782c09dbdf84d06ab4254 or 1f03f6fc43a6f71b8204adf6cd02fb3685261add with the first being more likely, since that's the systemd -> 242 upgrade.
06:51 <srhb> I'll write down some notes on what's going on later, but the short story is that bonds apparently make up (random?) MAC addresses and then set the bonded devices hwaddr to the same, when the default behaviour _should_ be to set the bond MAC to that of the first device and _then_ set the same hwaddr for the remaining devices.
06:51 <srhb> I guess I'll bisect systemd too, if it's not too hard..
https://github.com/systemd/systemd/blob/master/NEWS#L705-L713 is our suspect at the moment.
systemd v242 assigns MAC addresses to bonds based on the hash of the interface name and the machine ID. Before, the bond0 MAC was the MAC of the lowest interface.
(also: https://www.cisco.com/c/en/us/support/docs/smb/switches/cisco-small-business-500-series-stackable-managed-switches/smb3088-link-aggregation-control-protocol-lacp-configuration-on-sx50.html).
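For reference, the v242 behaviour can also be switched off per device with a .link drop-in instead of pinning a literal MAC. This is a hypothetical sketch, not something from this thread or from the Packet images: the file name and the environment.etc delivery are my assumptions, but MACAddressPolicy=none is the documented setting for keeping the kernel-assigned MAC, which should give back the bonding driver's old take-the-first-slave's-MAC behaviour. udev reads .link files from /etc/systemd/network whether or not networkd manages the interface.

{
  # Hypothetical override, not part of the Packet-generated config:
  # stop udev from applying the new "persistent" MAC policy to bond0,
  # so the kernel-assigned MAC from the bonding driver is kept.
  environment.etc."systemd/network/10-bond0.link".text = ''
    [Match]
    OriginalName=bond0

    [Link]
    MACAddressPolicy=none
  '';
}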
So, the Packet machine would have the following metadata describing bond0:
"interfaces": [
  {
    "name": "eth0",
    "mac": "50:6b:4b:44:00:ea",
    "bond": "bond0"
  },
  {
    "name": "eth1",
    "mac": "50:6b:4b:44:00:eb",
    "bond": "bond0"
  }
],
and then it'd come up with its own MACs out of thin air:
[root@c2-medium-provtest:~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp5s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether 06:eb:e5:f0:53:2d brd ff:ff:ff:ff:ff:ff
3: enp5s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether 06:eb:e5:f0:53:2d brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 06:eb:e5:f0:53:2d brd ff:ff:ff:ff:ff:ff
note it is now using the MAC 06:eb:e5:f0:53:2d, which is not the MAC of either physical interface.
We think this confuses the upstream switch until the LACP timeout occurs, about 30 minutes later. I haven't confirmed this with Packet.
Manually taking the lower MAC (50:6b:4b:44:00:ea), we can add it to configuration.nix:
networking.interfaces.bond0.macAddress = "50:6b:4b:44:00:ea";
and the MAC will stay consistent and networking will work immediately.
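For context, here is a minimal configuration.nix sketch of where that pin sits relative to the bond definition. The slave interface names are taken from the ip link output above; the bond options are illustrative of a typical LACP setup and are not copied from the actual Packet-generated configuration, which already defines the bond for you. Only the macAddress line is the workaround.

{
  # Illustrative bond definition: interface names from the ip link output
  # above, bond options assumed for a typical LACP (802.3ad) setup.
  networking.bonds.bond0 = {
    interfaces = [ "enp5s0f0" "enp5s0f1" ];
    driverOptions = {
      mode = "802.3ad";   # LACP, matching the switch-side aggregate
      miimon = "100";
    };
  };

  # The workaround: pin the bond MAC to the lower physical MAC so the
  # switch keeps seeing the address it formed the LACP aggregate with.
  networking.interfaces.bond0.macAddress = "50:6b:4b:44:00:ea";
}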
The reason we were the first ones to find this, and why it took so long to debug, is that no other OS available on Packet offers systemd v242. Now that Ubuntu 19.10 will, this will have saved Packet a lot of hours of debugging :)
A huge thank-you to @srhb, @andir, @mmlb, @arianvp, @johnalotoski, @disassembler, and certainly others through this nightmare.
Awesome work all! What a great technical community!
Leaving a note for my future self.
Ubuntu 19.10 uses https://netplan.io, a YAML DSL on top of networkd. At first glance it offers no control over setting MAC addresses for bond devices, so people who run Ubuntu 19.10 with a bonded network are going to run into similar issues.
Has the fix for this been backported to 19.09?
The "fix" for now is hardcoding the MAC address of the bond to what it was before the systemd bump.
I think it's more a problem with the Packet networking environment than wrong behaviour of NixOS / systemd, and with more and more distributions switching to a later systemd version, it's not something limited to NixOS at all.
It'd be good to get some feedback from Packet about how they intend to address this.
In theory the script that generates the Packet images is fixed. Whether the 19.09 images on Packet have been updated is something we should check with @grahamc.
https://github.com/grahamc/packet-nixos is described as "The Official NixOS install images for Packet.net", so I assume their images are built with this configuration.
However, even if they are built that way, having to carry around packet-specific networking quirks is unfortunate - I hope the underlying problems get fixed.
People configuring their bridges with a recent enough networkd on other distros will run into the same problems.
@grahamc, perhaps the real question here is what should users with existing Packet machines do to upgrade? I assume it's necessary to update the configuration in /etc/nixos/packet somehow?
For the record, the workaround given earlier does not work on my c1.small.x86 instance. The timeouts persist. I have tried using the MAC addresses of both interfaces; neither works.
@bgamari did you reach out to Packet? I assume this will become a problem sooner or later, and we should fix the problem at the source…