Minikube: systemd-networkd-wait-online.service delays boot for 2 minutes

Created on 23 Mar 2017 · 4 Comments · Source: kubernetes/minikube

Is this a BUG REPORT or FEATURE REQUEST? : Bug report

Minikube version: v0.17.1

Environment:

  • OS: Fedora 25 (kernel 4.9.14)
  • VM Driver: VirtualBox 5.1.18
  • ISO version: .minikube/cache/iso/minikube-v1.0.7.iso
  • systemd version: 231

What happened:

minikube start takes >2min to finish.

What you expected to happen:

minikube start takes less time, since networking is available.

How to reproduce it:
Just run minikube start and wait

Anything else we need to know:

It seems most of the time is spent on systemd-networkd-wait-online.service:

# systemd-analyze blame
       2min 84ms systemd-networkd-wait-online.service
          1.157s docker.service
           385ms systemd-journal-flush.service
           296ms sshd.service
           273ms systemd-udev-trigger.service
           258ms localkube.service
           203ms minikube-automount.service
           189ms systemd-tmpfiles-setup-dev.service
           113ms vboxservice.service
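
To pick the outliers out of a longer blame listing, the durations can be converted to seconds and filtered. This is a minimal sketch; the function name `slow_units` and the 10-second default threshold are illustrative assumptions, not part of minikube or systemd.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `systemd-analyze blame` output on stdin and print
# "<seconds> <unit>" for every unit that took at least MIN_SECONDS to start.
slow_units() {
  local min_seconds="${1:-10}"
  awk -v min="$min_seconds" '
    NF >= 2 {
      total = 0
      # Every field except the last is a duration token such as 2min, 1.157s or 84ms.
      for (i = 1; i < NF; i++) {
        tok = $i
        if (tok ~ /^[0-9.]+ms$/)       { sub(/ms$/, "", tok);  total += tok / 1000 }
        else if (tok ~ /^[0-9.]+min$/) { sub(/min$/, "", tok); total += tok * 60 }
        else if (tok ~ /^[0-9.]+h$/)   { sub(/h$/, "", tok);   total += tok * 3600 }
        else if (tok ~ /^[0-9.]+s$/)   { sub(/s$/, "", tok);   total += tok }
      }
      if (total >= min) printf "%.3f %s\n", total, $NF
    }'
}

# Usage inside the VM:
#   systemd-analyze blame | slow_units 10
```

With the listing above, only systemd-networkd-wait-online.service crosses a 10-second threshold.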

journal entries:

Mar 23 00:53:51 minikube systemd[1]: Starting Wait for Network to be Configured...
Mar 23 00:53:51 minikube systemd-networkd-wait-online[3242]: ignoring: lo
Mar 23 00:53:53 minikube systemd-networkd[2384]: eth0: Configured
Mar 23 00:53:53 minikube systemd-networkd-wait-online[3242]: ignoring: lo
Mar 23 00:55:39 minikube kernel: NFSD: Unable to end grace period: -110
Mar 23 00:55:51 minikube systemd-networkd-wait-online[3242]: Event loop failed: Connection timed out
Mar 23 00:55:51 minikube systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Mar 23 00:55:51 minikube systemd[1]: Failed to start Wait for Network to be Configured.
Mar 23 00:55:51 minikube systemd[1]: systemd-networkd-wait-online.service: Unit entered failed state.
Mar 23 00:55:51 minikube systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Mar 23 00:55:51 minikube systemd[1]: Reached target Network is Online.
Mar 23 00:55:51 minikube systemd[1]: Starting Localkube...
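
The gap between the "Starting Wait for Network to be Configured" entry and the "Main process exited" entry can be measured directly from the journal text. The sketch below does that; the function name `wait_online_seconds` is an illustrative assumption, and the parsing assumes the short journal format shown above with both entries on the same day.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read journal lines on stdin and print how many seconds
# passed between systemd-networkd-wait-online starting and its process exiting.
# Assumes "Mon DD HH:MM:SS host ..." lines and no midnight rollover.
wait_online_seconds() {
  awk '
    function to_seconds(hms,   p) {
      split(hms, p, ":")
      return p[1] * 3600 + p[2] * 60 + p[3]
    }
    /Starting Wait for Network to be Configured/ { start = to_seconds($3) }
    /systemd-networkd-wait-online.service: Main process exited/ {
      stop = to_seconds($3)
      print stop - start
    }'
}

# Usage inside the VM:
#   journalctl -b --no-pager | wait_online_seconds
```

For the entries above this gives 120 seconds, which matches the documented default of the tool's --timeout option, so the service appears to run until its own timeout expires rather than fail early.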

The timeout could point to a network issue. However, 8.8.8.8 can be pinged as soon as the VM has booted.

Running /lib/systemd/systemd-networkd-wait-online manually also hangs for 2 minutes after the system is fully booted:

# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:d4:66:54 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:ef:dd:86 brd ff:ff:ff:ff:ff:ff
4: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1
    link/sit 0.0.0.0 brd 0.0.0.0
6: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
    link/ether 02:42:8e:c2:b0:34 brd ff:ff:ff:ff:ff:ff
8: veth42d9e5e@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default 
    link/ether 6e:1e:e0:72:34:41 brd ff:ff:ff:ff:ff:ff link-netnsid 0
10: veth7866268@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default 
    link/ether d2:3b:bb:9f:ad:78 brd ff:ff:ff:ff:ff:ff link-netnsid 1

# time /lib/systemd/systemd-networkd-wait-online
ignoring: lo
Event loop failed: Connection timed out

real    2m0.244s
user    0m0.000s
sys     0m0.001s
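
The ip link output only shows link-layer state, not whether systemd-networkd considers a link configured. A hedged way to check the latter is the SETUP column of networkctl. The sketch below lists links still reported as "configuring" or "pending"; the function name `unsettled_links` is an illustrative assumption, and the column layout (IDX LINK TYPE OPERATIONAL SETUP) is assumed to match the systemd version in the ISO.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `networkctl list --no-legend` output on stdin and
# print the names of links whose SETUP state is still "configuring" or "pending".
# Assumes the columns are: IDX LINK TYPE OPERATIONAL SETUP.
unsettled_links() {
  awk '$1 ~ /^[0-9]+$/ && ($5 == "configuring" || $5 == "pending") { print $2 }'
}

# Usage inside the VM:
#   networkctl list --no-legend | unsettled_links
```

A link that stays in this list is a plausible candidate for what systemd-networkd-wait-online is waiting on.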

The hang could be related to the systemd issue systemd/systemd/issues/5154.

kind/bug

All 4 comments

Thanks for the detailed issue. We might be able to decouple localkube from waiting on this systemd network unit anyway. That might be worth looking into.
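
In the journal above, localkube only starts once network-online.target is reached, and systemd-networkd-wait-online.service is what holds that target back. A minimal sketch for checking whether a unit is tied to the target follows; the function name `waits_for_network_online` is an illustrative assumption, and the actual contents of localkube.service are not shown in this issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a unit file on stdin and succeed (exit 0) if any
# After=, Wants= or Requires= line names network-online.target.
waits_for_network_online() {
  grep -Eq '^(After|Wants|Requires)=(.* )?network-online\.target( .*)?$'
}

# Usage inside the VM:
#   systemctl cat localkube.service | waits_for_network_online \
#     && echo "localkube is ordered after or pulls in network-online.target"
```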

Right now, we have two code paths that support systemd and non-systemd because of legacy reasons. We should just delete the non-systemd code, since we don't plan on supporting those images anymore.

After investigating a bit, the same issue occurs with KVM.
I found a bug in the systemd issue tracker that seems to be related.
I managed to get the service to start by adding --ignore=eth1 to /lib/systemd/system/systemd-networkd-wait-online.service:

[Service]
Type=oneshot
ExecStart=/lib/systemd/systemd-networkd-wait-online --ignore=eth1
RemainAfterExit=yes

This wait may not be needed at all: if we are able to SSH into the machine, we can assume that the bridge is up.
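
The same --ignore=eth1 change can be tried without editing the packaged unit file, by placing a systemd drop-in next to it. This is only a sketch: the function name `write_wait_online_dropin` and the drop-in file name are illustrative assumptions, and it has not been tested against the minikube ISO.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a drop-in that makes
# systemd-networkd-wait-online ignore one interface.
#   $1: unit directory (default /etc/systemd/system)
#   $2: interface to ignore (default eth1)
write_wait_online_dropin() {
  local unit_dir="${1:-/etc/systemd/system}"
  local iface="${2:-eth1}"
  local dropin_dir="$unit_dir/systemd-networkd-wait-online.service.d"
  mkdir -p "$dropin_dir"
  cat > "$dropin_dir/ignore-$iface.conf" <<EOF
[Service]
# An empty ExecStart= clears the command inherited from the packaged unit.
ExecStart=
ExecStart=/lib/systemd/systemd-networkd-wait-online --ignore=$iface
EOF
}

# Usage inside the VM (as root), followed by a reload so systemd sees the file:
#   write_wait_online_dropin /etc/systemd/system eth1
#   systemctl daemon-reload
```

The empty ExecStart= line matters because the unit is Type=oneshot, where additional ExecStart= lines are appended rather than replacing the original command.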

Closing. We have removed this dependency in #1298 and it seems to have reduced startup time noticeably.
https://github.com/kubernetes/minikube/pull/1298#issuecomment-293311777
Thanks @gtirloni for investigating this.

$ time ./out/minikube start
Starting local Kubernetes v1.6.0 cluster...
Starting VM...
SSH-ing files into VM...
Setting up certs...
Starting cluster components...
Connecting to cluster...
Setting up kubeconfig...
Kubectl is now configured to use the cluster.

real    0m35.373s
user    0m1.653s
sys     0m0.236s

Snappy :+1: Thank you!
