This happens when I log in to the container and can't quit with Ctrl-C.
My system is Ubuntu 12.04, kernel 3.8.0-25-generic.
docker version:
root@wutq-docker:~# docker version
Client version: 0.10.0
Client API version: 1.10
Go version (client): go1.2.1
Git commit (client): dc9c28f
Server version: 0.10.0
Server API version: 1.10
Git commit (server): dc9c28f
Go version (server): go1.2.1
Last stable version: 0.10.0
I have used the script https://raw.githubusercontent.com/dotcloud/docker/master/contrib/check-config.sh to check my configuration, and everything looks fine.
I watched the syslog and found these messages:
May 6 11:30:33 wutq-docker kernel: [62365.889369] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:30:44 wutq-docker kernel: [62376.108277] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:30:54 wutq-docker kernel: [62386.327156] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:02 wutq-docker kernel: [62394.423920] INFO: task docker:1024 blocked for more than 120 seconds.
May 6 11:31:02 wutq-docker kernel: [62394.424175] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 6 11:31:02 wutq-docker kernel: [62394.424505] docker D 0000000000000001 0 1024 1 0x00000004
May 6 11:31:02 wutq-docker kernel: [62394.424511] ffff880077793cb0 0000000000000082 ffffffffffffff04 ffffffff816df509
May 6 11:31:02 wutq-docker kernel: [62394.424517] ffff880077793fd8 ffff880077793fd8 ffff880077793fd8 0000000000013f40
May 6 11:31:02 wutq-docker kernel: [62394.424521] ffff88007c461740 ffff880076b1dd00 000080d081f06880 ffffffff81cbbda0
May 6 11:31:02 wutq-docker kernel: [62394.424526] Call Trace:
May 6 11:31:02 wutq-docker kernel: [62394.424668] [<ffffffff816df509>] ? __slab_alloc+0x28a/0x2b2
May 6 11:31:02 wutq-docker kernel: [62394.424700] [<ffffffff816f1849>] schedule+0x29/0x70
May 6 11:31:02 wutq-docker kernel: [62394.424705] [<ffffffff816f1afe>] schedule_preempt_disabled+0xe/0x10
May 6 11:31:02 wutq-docker kernel: [62394.424710] [<ffffffff816f0777>] __mutex_lock_slowpath+0xd7/0x150
May 6 11:31:02 wutq-docker kernel: [62394.424715] [<ffffffff815dc809>] ? copy_net_ns+0x69/0x130
May 6 11:31:02 wutq-docker kernel: [62394.424719] [<ffffffff815dc0b1>] ? net_alloc_generic+0x21/0x30
May 6 11:31:02 wutq-docker kernel: [62394.424724] [<ffffffff816f038a>] mutex_lock+0x2a/0x50
May 6 11:31:02 wutq-docker kernel: [62394.424727] [<ffffffff815dc82c>] copy_net_ns+0x8c/0x130
May 6 11:31:02 wutq-docker kernel: [62394.424733] [<ffffffff81084851>] create_new_namespaces+0x101/0x1b0
May 6 11:31:02 wutq-docker kernel: [62394.424737] [<ffffffff81084a33>] copy_namespaces+0xa3/0xe0
May 6 11:31:02 wutq-docker kernel: [62394.424742] [<ffffffff81057a60>] ? dup_mm+0x140/0x240
May 6 11:31:02 wutq-docker kernel: [62394.424746] [<ffffffff81058294>] copy_process.part.22+0x6f4/0xe60
May 6 11:31:02 wutq-docker kernel: [62394.424752] [<ffffffff812da406>] ? security_file_alloc+0x16/0x20
May 6 11:31:02 wutq-docker kernel: [62394.424758] [<ffffffff8119d118>] ? get_empty_filp+0x88/0x180
May 6 11:31:02 wutq-docker kernel: [62394.424762] [<ffffffff81058a80>] copy_process+0x80/0x90
May 6 11:31:02 wutq-docker kernel: [62394.424766] [<ffffffff81058b7c>] do_fork+0x9c/0x230
May 6 11:31:02 wutq-docker kernel: [62394.424769] [<ffffffff816f277e>] ? _raw_spin_lock+0xe/0x20
May 6 11:31:02 wutq-docker kernel: [62394.424774] [<ffffffff811b9185>] ? __fd_install+0x55/0x70
May 6 11:31:02 wutq-docker kernel: [62394.424777] [<ffffffff81058d96>] sys_clone+0x16/0x20
May 6 11:31:02 wutq-docker kernel: [62394.424782] [<ffffffff816fb939>] stub_clone+0x69/0x90
May 6 11:31:02 wutq-docker kernel: [62394.424786] [<ffffffff816fb5dd>] ? system_call_fastpath+0x1a/0x1f
May 6 11:31:04 wutq-docker kernel: [62396.466223] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:14 wutq-docker kernel: [62406.689132] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:25 wutq-docker kernel: [62416.908036] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:35 wutq-docker kernel: [62427.126927] unregister_netdevice: waiting for lo to become free. Usage count = 3
May 6 11:31:45 wutq-docker kernel: [62437.345860] unregister_netdevice: waiting for lo to become free. Usage count = 3
After this happened, I opened another terminal, killed the process, and then restarted docker, but it hung.
I rebooted the host, and it still displayed those messages for several minutes during shutdown.
I'm seeing a very similar issue for eth0. Ubuntu 12.04 also.
I have to power cycle the machine. From /var/log/kern.log:
May 22 19:26:08 box kernel: [596765.670275] device veth5070 entered promiscuous mode
May 22 19:26:08 box kernel: [596765.680630] IPv6: ADDRCONF(NETDEV_UP): veth5070: link is not ready
May 22 19:26:08 box kernel: [596765.700561] IPv6: ADDRCONF(NETDEV_CHANGE): veth5070: link becomes ready
May 22 19:26:08 box kernel: [596765.700628] docker0: port 7(veth5070) entered forwarding state
May 22 19:26:08 box kernel: [596765.700638] docker0: port 7(veth5070) entered forwarding state
May 22 19:26:19 box kernel: [596777.386084] [FW DBLOCK] IN=docker0 OUT= PHYSIN=veth5070 MAC=56:84:7a:fe:97:99:9e:df:a7:3f:23:42:08:00 SRC=172.17.0.8 DST=172.17.42.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=170 DF PROTO=TCP SPT=51615 DPT=13162 WINDOW=14600 RES=0x00 SYN URGP=0
May 22 19:26:21 box kernel: [596779.371993] [FW DBLOCK] IN=docker0 OUT= PHYSIN=veth5070 MAC=56:84:7a:fe:97:99:9e:df:a7:3f:23:42:08:00 SRC=172.17.0.8 DST=172.17.42.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=549 DF PROTO=TCP SPT=46878 DPT=12518 WINDOW=14600 RES=0x00 SYN URGP=0
May 22 19:26:23 box kernel: [596780.704031] docker0: port 7(veth5070) entered forwarding state
May 22 19:27:13 box kernel: [596831.359999] docker0: port 7(veth5070) entered disabled state
May 22 19:27:13 box kernel: [596831.361329] device veth5070 left promiscuous mode
May 22 19:27:13 box kernel: [596831.361333] docker0: port 7(veth5070) entered disabled state
May 22 19:27:24 box kernel: [596841.516039] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
May 22 19:27:34 box kernel: [596851.756060] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
May 22 19:27:44 box kernel: [596861.772101] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
Hey, this just started happening for me as well.
Docker version:
Client version: 0.11.1
Client API version: 1.11
Go version (client): go1.2.1
Git commit (client): fb99f99
Server version: 0.11.1
Server API version: 1.11
Git commit (server): fb99f99
Go version (server): go1.2.1
Last stable version: 0.11.1
Kernel log: http://pastebin.com/TubCy1tG
System details:
Running Ubuntu 14.04 LTS with a patched kernel (3.14.3-rt4). I have yet to see it happen with the default linux-3.13.0-27-generic kernel. What's funny, though, is that when this happens, all my terminal windows freeze, letting me type a few characters at most before that. The same fate befalls any new ones I open, too, and I end up needing to power cycle my poor laptop just like the good doctor above. For the record, I'm running fish shell in urxvt or xterm in xmonad. Haven't checked if it affects plain bash.
This might be relevant:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1065434#yui_3_10_3_1_1401948176063_2050
Copying a fairly large amount of data over the network inside a container
and then exiting the container can trigger a missing decrement in the per
cpu reference count on a network device.
Sure enough, one of the times this happened for me was right after apt-getting a package with a ton of dependencies.
Upgrading from Ubuntu 12.04.3 to 14.04 fixed this for me without any other changes.
I experience this on RHEL7, 3.10.0-123.4.2.el7.x86_64
I've noticed the same thing happening with my VirtualBox virtual network interfaces when I'm running 3.14-rt4. It's supposed to be fixed in vanilla 3.13 or something.
@egasimus Same here - I pulled in hundreds of MB of data before killing the container, then got this error.
I upgraded to Debian kernel 3.14 and the problem appears to have gone away. It looks like the problem existed in some kernels < 3.5, was fixed in 3.5, regressed in 3.6, and was patched in something between 3.12 and 3.14. https://bugzilla.redhat.com/show_bug.cgi?id=880394
@spiffytech Do you have any idea where I can report this regarding the realtime kernel flavour? I think they're only releasing a RT patch for every other version, and would really hate to see 3.16-rt come out with this still broken. :/
EDIT: Filed it at kernel.org.
I'm getting this on Ubuntu 14.10 running a 3.18.1 kernel. The kernel log shows:
Dec 21 22:49:31 inotmac kernel: [15225.866600] unregister_netdevice: waiting for lo to become free. Usage count = 2
Dec 21 22:49:40 inotmac kernel: [15235.179263] INFO: task docker:19599 blocked for more than 120 seconds.
Dec 21 22:49:40 inotmac kernel: [15235.179268] Tainted: G OE 3.18.1-031801-generic #201412170637
Dec 21 22:49:40 inotmac kernel: [15235.179269] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 21 22:49:40 inotmac kernel: [15235.179271] docker D 0000000000000001 0 19599 1 0x00000000
Dec 21 22:49:40 inotmac kernel: [15235.179275] ffff8802082abcc0 0000000000000086 ffff880235c3b700 00000000ffffffff
Dec 21 22:49:40 inotmac kernel: [15235.179277] ffff8802082abfd8 0000000000013640 ffff8800288f2300 0000000000013640
Dec 21 22:49:40 inotmac kernel: [15235.179280] ffff880232cf0000 ffff8801a467c600 ffffffff81f9d4b8 ffffffff81cd9c60
Dec 21 22:49:40 inotmac kernel: [15235.179282] Call Trace:
Dec 21 22:49:40 inotmac kernel: [15235.179289] [<ffffffff817af549>] schedule+0x29/0x70
Dec 21 22:49:40 inotmac kernel: [15235.179292] [<ffffffff817af88e>] schedule_preempt_disabled+0xe/0x10
Dec 21 22:49:40 inotmac kernel: [15235.179296] [<ffffffff817b1545>] __mutex_lock_slowpath+0x95/0x100
Dec 21 22:49:40 inotmac kernel: [15235.179299] [<ffffffff8168d5c9>] ? copy_net_ns+0x69/0x150
Dec 21 22:49:40 inotmac kernel: [15235.179302] [<ffffffff817b15d3>] mutex_lock+0x23/0x37
Dec 21 22:49:40 inotmac kernel: [15235.179305] [<ffffffff8168d5f8>] copy_net_ns+0x98/0x150
Dec 21 22:49:40 inotmac kernel: [15235.179308] [<ffffffff810941f1>] create_new_namespaces+0x101/0x1b0
Dec 21 22:49:40 inotmac kernel: [15235.179311] [<ffffffff8109432b>] copy_namespaces+0x8b/0xa0
Dec 21 22:49:40 inotmac kernel: [15235.179315] [<ffffffff81073458>] copy_process.part.28+0x828/0xed0
Dec 21 22:49:40 inotmac kernel: [15235.179318] [<ffffffff811f157f>] ? get_empty_filp+0xcf/0x1c0
Dec 21 22:49:40 inotmac kernel: [15235.179320] [<ffffffff81073b80>] copy_process+0x80/0x90
Dec 21 22:49:40 inotmac kernel: [15235.179323] [<ffffffff81073ca2>] do_fork+0x62/0x280
Dec 21 22:49:40 inotmac kernel: [15235.179326] [<ffffffff8120cfc0>] ? get_unused_fd_flags+0x30/0x40
Dec 21 22:49:40 inotmac kernel: [15235.179329] [<ffffffff8120d028>] ? __fd_install+0x58/0x70
Dec 21 22:49:40 inotmac kernel: [15235.179331] [<ffffffff81073f46>] SyS_clone+0x16/0x20
Dec 21 22:49:40 inotmac kernel: [15235.179334] [<ffffffff817b3ab9>] stub_clone+0x69/0x90
Dec 21 22:49:40 inotmac kernel: [15235.179336] [<ffffffff817b376d>] ? system_call_fastpath+0x16/0x1b
Dec 21 22:49:41 inotmac kernel: [15235.950976] unregister_netdevice: waiting for lo to become free. Usage count = 2
Dec 21 22:49:51 inotmac kernel: [15246.059346] unregister_netdevice: waiting for lo to become free. Usage count = 2
I'll send docker version/info
once the system isn't frozen anymore :)
We're seeing this issue as well. Ubuntu 14.04, 3.13.0-37-generic
On Ubuntu 14.04 server, my team has found that downgrading from 3.13.0-40-generic to 3.13.0-32-generic "resolves" the issue. Given @sbward's observation, that would put the regression after 3.13.0-32-generic and before (or including) 3.13.0-37-generic.
I'll add that, in our case, we sometimes see a _negative_ usage count.
FWIW we hit this bug running lxc on the trusty kernel (3.13.0-40-generic #69-Ubuntu). The message appears in dmesg, followed by this stack trace:
[27211131.602869] INFO: task lxc-start:26342 blocked for more than 120 seconds.
[27211131.602874] Not tainted 3.13.0-40-generic #69-Ubuntu
[27211131.602877] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27211131.602881] lxc-start D 0000000000000001 0 26342 1 0x00000080
[27211131.602883] ffff88000d001d40 0000000000000282 ffff88001aa21800 ffff88000d001fd8
[27211131.602886] 0000000000014480 0000000000014480 ffff88001aa21800 ffffffff81cdb760
[27211131.602888] ffffffff81cdb764 ffff88001aa21800 00000000ffffffff ffffffff81cdb768
[27211131.602891] Call Trace:
[27211131.602894] [<ffffffff81723b69>] schedule_preempt_disabled+0x29/0x70
[27211131.602897] [<ffffffff817259d5>] __mutex_lock_slowpath+0x135/0x1b0
[27211131.602900] [<ffffffff811a2679>] ? __kmalloc+0x1e9/0x230
[27211131.602903] [<ffffffff81725a6f>] mutex_lock+0x1f/0x2f
[27211131.602905] [<ffffffff8161c2c1>] copy_net_ns+0x71/0x130
[27211131.602908] [<ffffffff8108f889>] create_new_namespaces+0xf9/0x180
[27211131.602910] [<ffffffff8108f983>] copy_namespaces+0x73/0xa0
[27211131.602912] [<ffffffff81065b16>] copy_process.part.26+0x9a6/0x16b0
[27211131.602915] [<ffffffff810669f5>] do_fork+0xd5/0x340
[27211131.602917] [<ffffffff810c8e8d>] ? call_rcu_sched+0x1d/0x20
[27211131.602919] [<ffffffff81066ce6>] SyS_clone+0x16/0x20
[27211131.602921] [<ffffffff81730089>] stub_clone+0x69/0x90
[27211131.602923] [<ffffffff8172fd2d>] ? system_call_fastpath+0x1a/0x1f
Ran into this on Ubuntu 14.04 and Debian jessie w/ kernel 3.16.x.
Docker command:
docker run -t -i -v /data/sitespeed.io:/sitespeed.io/results company/dockerfiles:sitespeed.io-latest --name "Superbrowse"
This seems like a pretty bad issue...
@jbalonso even with 3.13.0-32-generic I get the error after only a few successful runs :sob:
@MrMMorris could you share a reproducer script using publicly available images?
Everyone who's seeing this error on their system is running a package of the Linux kernel on their distribution that's far too old and lacks the fixes for this particular problem.
If you run into this problem, make sure you run apt-get update && apt-get dist-upgrade -y
and reboot your system. If you're on Digital Ocean, you also need to select the kernel version which was just installed during the update because they don't use the latest kernel automatically (see https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/2814988-give-option-to-use-the-droplet-s-own-bootloader).
CentOS/RHEL/Fedora/Scientific Linux users need to keep their systems updated using yum update
and reboot after installing the updates.
When reporting this problem, please make sure your system is fully patched and up to date with the latest stable updates (no manually installed experimental/testing/alpha/beta/rc packages) provided by your distribution's vendor.
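For clarity, the suggestion above boils down to something like this (a rough sketch; adjust to your distribution and your reboot policy):

```
# Debian/Ubuntu
sudo apt-get update && sudo apt-get dist-upgrade -y && sudo reboot

# CentOS/RHEL/Fedora/Scientific Linux
sudo yum update -y && sudo reboot
```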
@unclejack
I ran apt-get update && apt-get dist-upgrade -y
Ubuntu 14.04, 3.13.0-46-generic
Still get the error after only one docker run.
I can create an AMI for reproducing if needed.
@MrMMorris Thank you for confirming it's still a problem with the latest kernel package on Ubuntu 14.04.
Anything else I can do to help, let me know! :smile:
@MrMMorris if you can provide a reproducer, there is a bug open for Ubuntu and it would be much appreciated: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152
@rsampaio if I have time today, I will definitely get that for you!
This problem also appears on 3.16(.7) on both Debian 7 and Debian 8: https://github.com/docker/docker/issues/9605#issuecomment-85025729. Rebooting the server is the only way to fix this for now.
Seeing this issue on RHEL 6.6 with kernel 2.6.32-504.8.1.el6.x86_64 when starting some docker containers (not all containers)
_kernel:unregister_netdevice: waiting for lo to become free. Usage count = -1_
Again, rebooting the server seems to be the only solution at this time
Also seeing this on CoreOS (647.0.0) with kernel 3.19.3.
Rebooting is also the only solution I have found.
Tested Debian jessie with sid's kernel (4.0.2) - the problem remains.
Anyone seeing this issue running non-ubuntu containers?
Yes. Debian ones.
This is a kernel issue, not an image-related issue. Switching from one image to another won't improve this problem or make it worse.
Experiencing this issue on Debian Jessie on a BeagleBone Black running the 4.1.2-bone12 kernel.
Experiencing this after switching from 4.1.2 to 4.2-rc2 (using a git build of 1.8.0).
Deleting /var/lib/docker/* doesn't solve the problem.
Switching back to 4.1.2 solves the problem.
Also, VirtualBox has the same issue and there's a patch for v5.0.0 (backported to v4) which supposedly does something in the kernel driver part; it may be worth looking at to understand the problem.
This is the fix in the VirtualBox: https://www.virtualbox.org/attachment/ticket/12264/diff_unregister_netdev
They don't actually modify the kernel, just their kernel module.
Also having this issue with 4.2-rc2:
unregister_netdevice: waiting for vethf1738d3 to become free. Usage count = 1
Just compiled 4.2-rc3; it seems to work again.
@nazar-pc Thanks for the info. I just hit it with 4.1.3 and was pretty upset.
@techniq same here, pretty bad kernel bug. I wonder if we should report it for backporting to the 4.1 tree.
Linux docker13 3.19.0-22-generic #22-Ubuntu SMP Tue Jun 16 17:15:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Kernel from Ubuntu 15.04, same issue
I saw it with 4.2-rc3 as well. There is clearly more than one bug around device leakage :) I can reproduce this on any kernel >= 4.1 under high load.
I just had this problem too. Ubuntu 3.13.0-57-generic, provisioned via tutum. Unfortunately it fills up the kern.log and syslog and crashes the machine. It happens on the database machine (dockerized postgres), so it brings down the whole system...
Joining the chorus of "me too"'s, I am seeing this problem on a cloudstack VM running RancherOS (a minimal OS) 0.3.3 while pulling docker images from a local private docker repo. It's happening every ten seconds, not sure if that means anything or not.
Also having this issue with 4.2-rc7
Any news on this? Which kernel should we use? It keeps happening even with a fully up-to-date kernel (3.19.0-26 on Ubuntu 14.04).
We got this problem too. It happens after we configured userland-proxy=false. We're using monitoring scripts that spawn a new docker container to execute nagios plugin commands every minute. What I'm seeing in the process tree is that it gets stuck on the docker rm command, and there are a lot of errors in the kern.log file:
Sep 24 03:53:13 prod-service-05 kernel: [ 1920.544106] unregister_netdevice: waiting for lo to become free. Usage count = 2
Sep 24 03:53:13 prod-service-05 kernel: [ 1921.008076] unregister_netdevice: waiting for vethb6bf4db to become free. Usage count = 1
Sep 24 03:53:23 prod-service-05 kernel: [ 1930.676078] unregister_netdevice: waiting for lo to become free. Usage count = 2
Sep 24 03:53:23 prod-service-05 kernel: [ 1931.140074] unregister_netdevice: waiting for vethb6bf4db to become free. Usage count = 1
Sep 24 03:53:33 prod-service-05 kernel: [ 1940.820078] unregister_netdevice: waiting for lo to become free. Usage count = 2
This is our system information
ubuntu@prod-service-02:~$ docker version
Client:
Version: 1.8.2
API version: 1.20
Go version: go1.4.2
Git commit: 0a8c2e3
Built: Thu Sep 10 19:19:00 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.8.2
API version: 1.20
Go version: go1.4.2
Git commit: 0a8c2e3
Built: Thu Sep 10 19:19:00 UTC 2015
OS/Arch: linux/amd64
ubuntu@prod-service-02:~$ docker info
Containers: 2
Images: 52
Storage Driver: overlay
Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: gelf
Kernel Version: 4.0.9-040009-generic
Operating System: Ubuntu 14.04.3 LTS
CPUs: 4
Total Memory: 7.304 GiB
Name: prod-service-02
ID: NOIK:LVBV:HFB4:GZ2Y:Q74F:Q4WW:ZE22:MDE7:7EBW:XS42:ZK4G:XNTB
WARNING: No swap limit support
Labels:
provider=generic
Update: Although https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152 says it was already fixed on 2015-08-17, I tried kernel v3.19.8-ckt6-vivid (built on 02-Sep-2015) and even v4.2.1-unstable (built on 21-Sep-2015) and still have the problem.
I've just hit the problem again using 3.19.0-28-generic, so the latest Ubuntu kernel is not safe.
Yup, seems like --userland-proxy=false isn't the best option now with older kernels :(
No. I tried --userland-proxy=false with the 3.19, 4.0, and 4.2 kernel versions and the problem still happens.
I am using the userland proxy without iptables (--iptables=false) and seeing this at least once per day. Sadly, the only workaround was a watchdog that hard-resets the server using the SysRq technique.
My systems run some containers that are heavy stdout/stderr writers, which, as others reported, may trigger the bug.
$ docker info
Containers: 15
Images: 148
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 178
Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.19.0-26-generic
Operating System: Ubuntu 14.04.3 LTS
CPUs: 12
Total Memory: 62.89 GiB
Name: **
ID: 2ALJ:YTUH:QCNX:FPEO:YBG4:ZTL4:2EYK:AV7D:FN7C:IVNU:UWBL:YYZ5
$ docker version
Client version: 1.7.0
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): 0baf609
OS/Arch (client): linux/amd64
Server version: 1.7.0
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): 0baf609
OS/Arch (server): linux/amd64
Unfortunately, I'm in the same situation; today a production server failed 3 times on this error, and the only way to handle it is to use some magic SysRq commands.
bump
I'm still seeing this on the latest Debian jessie using kernel 4.2.0.
Same problem here. All of a sudden, three of my AWS servers went down and the logs were yelling "unregister_netdevice: waiting for lo to become free. Usage count = 1".
Ubuntu: 14.04
Kernel version: 3.13.0-63-generic
Docker: 1.7.1
Syslog
Is there a safe-to-use kernel version?
The issue also happens with kernel 4.2 on Ubuntu 15.10.
Happened on CoreOS:
Images: 1174
Storage Driver: overlay
Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.7-coreos
Operating System: CoreOS 766.4.0
@killme2008 This is the kernel bug I mentioned last time.
You should probably give it a try with this patch applied on top of your kernel: http://www.spinics.net/lists/netdev/msg351337.html ("packet: race condition in packet_bind"), or wait for the backport in the -stable tree; it will come sooner or later.
:+1: Great news!
Hey everyone, good news!
Since my last comment here (at the time of writing, 17 days ago) I haven't gotten these errors again. My servers (about 30 of them) were running Ubuntu 14.04 with some outdated packages.
After a full system upgrade including docker-engine (from 1.7.1 to 1.8.3) plus a kernel upgrade to the latest version available in Ubuntu's repo, my servers have been running without any occurrences.
:8ball:
Happened on 3 of our AWS instances today also:
Client:
Version: 1.8.2
API version: 1.20
Go version: go1.4.2
Git commit: 0a8c2e3
Built: Thu Sep 10 19:19:00 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.8.2
API version: 1.20
Go version: go1.4.2
Git commit: 0a8c2e3
Built: Thu Sep 10 19:19:00 UTC 2015
OS/Arch: linux/amd64
Containers: 45
Images: 423
Storage Driver: devicemapper
Pool Name: docker-202:1-527948-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: extfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 22.79 GB
Data Space Total: 107.4 GB
Data Space Available: 84.58 GB
Metadata Space Used: 35.58 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.112 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.77 (2012-10-15)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.13.0-49-generic
Operating System: Ubuntu 14.04.2 LTS
CPUs: 8
Total Memory: 60 GiB
Name: ip-10-0-1-36
ID: HEZG:TBTM:V4LN:IU7U:P55N:HNVH:XXOP:RMUX:JNWH:DSJP:3OA4:MGO5
WARNING: No swap limit support
I'm having the same problem on Ubuntu 14.04, with all packages up to date and the latest linux-generic-lts-vivid kernel:
$ docker version
Client:
Version: 1.9.0
API version: 1.21
Go version: go1.4.2
Git commit: 76d6bc9
Built: Tue Nov 3 17:43:42 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.9.0
API version: 1.21
Go version: go1.4.2
Git commit: 76d6bc9
Built: Tue Nov 3 17:43:42 UTC 2015
OS/Arch: linux/amd64
$ docker info
Containers: 14
Images: 123
Server Version: 1.9.0
Storage Driver: aufs
Root Dir: /mnt/docker-images/aufs
Backing Filesystem: extfs
Dirs: 151
Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.19.0-32-generic
Operating System: Ubuntu 14.04.3 LTS
CPUs: 8
Total Memory: 29.45 GiB
Name: ip-172-31-35-202
ID: 3B7E:5DJL:S4IB:KUCL:6UKN:64OF:WCLO:JKGK:4OI2:I2R6:63EY:WATN
WARNING: No swap limit support
I had it with the latest linux-image-generic (3.13.0-67-generic) as well.
Having the same issues here on RancherOS.
Still happening on Fedora 22 (updated)...
I can get rid of the messages if I restart docker (systemctl restart docker); the message appears again about 3-4 times and then stops.
The same error hit me on CoreOS.
Version of CoreOS:
core@core-1-94 ~ $ cat /etc/os-release
NAME=CoreOS
ID=coreos
VERSION=766.5.0
VERSION_ID=766.5.0
BUILD_ID=
PRETTY_NAME="CoreOS 766.5.0"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
docker version:
core@core-1-94 ~ $ docker version
Client version: 1.7.1
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): df2f73d-dirty
OS/Arch (client): linux/amd64
Server version: 1.7.1
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): df2f73d-dirty
OS/Arch (server): linux/amd64
core@core-1-94 ~ $ uname -a Linux core-1-94 4.1.7-coreos-r1 #2 SMP Thu Nov 5 02:10:23 UTC 2015 x86_64 Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz GenuineIntel GNU/Linux
system log:
Dec 07 16:26:54 core-1-94 kernel: unregister_netdevice: waiting for veth775ea53 to become free. Usage count = 1
Dec 07 16:26:54 core-1-94 kernel: unregister_netdevice: waiting for lo to become free. Usage count = 2
Dec 07 16:26:55 core-1-94 sdnotify-proxy[1203]: I1207 08:26:55.930559 00001 vxlan.go:340] Ignoring not a miss: 4e:5c:47:2f:9a:85, 10.244.97.10
Dec 07 16:26:59 core-1-94 dockerd[1269]: time="2015-12-07T16:26:59.448438648+08:00" level=info msg="GET /version"
Dec 07 16:27:01 core-1-94 sdnotify-proxy[1203]: I1207 08:27:01.050588 00001 vxlan.go:340] Ignoring not a miss: 5a:b1:f7:e9:7d:d0, 10.244.34.8
Dec 07 16:27:02 core-1-94 dockerd[1269]: time="2015-12-07T16:27:02.398020120+08:00" level=info msg="GET /version"
Dec 07 16:27:02 core-1-94 dockerd[1269]: time="2015-12-07T16:27:02.398316249+08:00" level=info msg="GET /version"
Dec 07 16:27:04 core-1-94 dockerd[1269]: time="2015-12-07T16:27:04.449317389+08:00" level=info msg="GET /version"
Dec 07 16:27:04 core-1-94 kernel: unregister_netdevice: waiting for veth775ea53 to become free. Usage count = 1
Dec 07 16:27:04 core-1-94 kernel: unregister_netdevice: waiting for lo to become free. Usage count = 2
Dec 07 16:27:06 core-1-94 sdnotify-proxy[1203]: I1207 08:27:06.106573 00001 vxlan.go:340] Ignoring not a miss: a6:38:ac:79:93:f5, 10.244.47.24
Dec 07 16:27:09 core-1-94 dockerd[1269]: time="2015-12-07T16:27:09.449944048+08:00" level=info msg="GET /version"
Dec 07 16:27:11 core-1-94 sdnotify-proxy[1203]: I1207 08:27:11.162578 00001 vxlan.go:340] Ignoring not a miss: 0e:f0:6f:f4:69:57, 10.244.71.24
Dec 07 16:27:12 core-1-94 dockerd[1269]: time="2015-12-07T16:27:12.502991197+08:00" level=info msg="GET /version"
Dec 07 16:27:12 core-1-94 dockerd[1269]: time="2015-12-07T16:27:12.503411160+08:00" level=info msg="GET /version"
Dec 07 16:27:14 core-1-94 dockerd[1269]: time="2015-12-07T16:27:14.450646841+08:00" level=info msg="GET /version"
Dec 07 16:27:14 core-1-94 kernel: unregister_netdevice: waiting for veth775ea53 to become free. Usage count = 1
Dec 07 16:27:14 core-1-94 kernel: unregister_netdevice: waiting for lo to become free. Usage count = 2
Dec 07 16:27:16 core-1-94 sdnotify-proxy[1203]: I1207 08:27:16.282556 00001 vxlan.go:340] Ignoring not a miss: a6:62:77:31:ef:68, 10.244.13.6
Dec 07 16:27:19 core-1-94 dockerd[1269]: time="2015-12-07T16:27:19.451486277+08:00" level=info msg="GET /version"
Dec 07 16:27:21 core-1-94 sdnotify-proxy[1203]: I1207 08:27:21.402559 00001 vxlan.go:340] Ignoring not a miss: 92:c4:66:52:cd:bb, 10.244.24.7
Dec 07 16:27:22 core-1-94 dockerd[1269]: time="2015-12-07T16:27:22.575446889+08:00" level=info msg="GET /version"
Dec 07 16:27:22 core-1-94 dockerd[1269]: time="2015-12-07T16:27:22.575838302+08:00" level=info msg="GET /version"
Dec 07 16:27:24 core-1-94 dockerd[1269]: time="2015-12-07T16:27:24.452320364+08:00" level=info msg="GET /version"
Dec 07 16:27:24 core-1-94 kernel: unregister_netdevice: waiting for veth775ea53 to become free. Usage count = 1
Dec 07 16:27:24 core-1-94 kernel: unregister_netdevice: waiting for lo to become free. Usage count = 2
Dec 07 16:27:26 core-1-94 sdnotify-proxy[1203]: I1207 08:27:26.394569 00001 vxlan.go:340] Ignoring not a miss: 6a:f7:bf:ec:03:50, 10.244.87.8
Dec 07 16:27:29 core-1-94 dockerd[1269]: time="2015-12-07T16:27:29.453171649+08:00" level=info msg="GET /version"
Dec 07 16:27:29 core-1-94 systemd[1]: Starting Generate /run/coreos/motd...
Dec 07 16:27:29 core-1-94 systemd[1]: Started Generate /run/coreos/motd.
Dec 07 16:27:32 core-1-94 dockerd[1269]: time="2015-12-07T16:27:32.671592437+08:00" level=info msg="GET /version"
Dec 07 16:27:32 core-1-94 dockerd[1269]: time="2015-12-07T16:27:32.671841436+08:00" level=info msg="GET /version"
Dec 07 16:27:33 core-1-94 sdnotify-proxy[1203]: I1207 08:27:33.562534 00001 vxlan.go:340] Ignoring not a miss: 22:b4:62:d6:25:b9, 10.244.68.8
Dec 07 16:27:34 core-1-94 dockerd[1269]: time="2015-12-07T16:27:34.453953162+08:00" level=info msg="GET /version"
Dec 07 16:27:34 core-1-94 kernel: unregister_netdevice: waiting for veth775ea53 to become free. Usage count = 1
Dec 07 16:27:35 core-1-94 kernel: unregister_netdevice: waiting for lo to become free. Usage count = 2
happy birthday, bloody issue =)
6 May 2014
same thing here. Just rebooting. Latest docker version. Ubuntu 14.04.
@samvignoli this has been identified as a kernel issue, so unfortunately not something that can be fixed in docker
@thaJeztah Have you got a link to the bug tracker for the kernel issue?
Or perhaps a pointer to which kernels are affected?
Keen to get this resolved in our environment.
@Rucknar sorry, I don't (perhaps there's one in this discussion, I haven't read back all comments)
Linux atlas2 3.19.0-33-generic #38~14.04.1-Ubuntu SMP Fri Nov 6 18:17:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
@Rucknar if you scroll up a bit you will see the link to the patch: http://www.spinics.net/lists/netdev/msg351337.html. It is now in Linux master; I guess it will land in Linux 4.4. Maybe someone has already backported it to previous versions, but I'm not sure.
Thanks all, will look at what's required in upgrading.
FWIW I backported the last patch mentioned here to the Ubuntu 3.19 kernel and I also tested on a 4.2 kernel, both unsuccessfully. The problem is still present even on the 4.4-rc3 net-next branch at this point.
@rsampaio How did you test that? I cannot reliably trigger this fault using docker, actually, on any kernel. It just happens sometimes.
@fxposter we also can't reproduce the problem outside production, so I had to boot a few instances with the patched kernel in production; it happens so frequently that I can find out whether a kernel is affected within 24h of production load.
Sometimes we can fix it with a very unusual workaround: we move the container directories away from /var/lib/docker/aufs/mnt.
With that... MAYBE we can 'service docker restart' and move the directories back.
Otherwise... only rebooting.
@rsampaio are you talking about heroku production now? How do you avoid this problem, since all your business is built around containers, etc.?
@rsampaio do you use --userland-proxy=false or just a high number of created containers? I can reproduce it fairly easily with --userland-proxy=false, and with some load even without it :)
@LK4D4 I believe it is just a high number of created/destroyed containers, especially containers doing a lot of outbound traffic. We also use LXC instead of docker, but the bug is exactly the same as the one described here. I can try to reproduce using your method if it is easy to describe and/or does not involve production load; the idea is to get a crashdump and _maybe_ find more hints about what exactly triggers this bug.
@rsampaio I can reproduce with prolonged usage of https://github.com/crosbymichael/docker-stress
Have there been any updates/proposals for getting this fixed?
@joshrendek it's a kernel bug. Looks like even newly released kernel 4.4 does not fix it, so there is at least one more race condition somewhere :)
kernel bug
=)
@samvignoli could you keep your comments constructive? Feel free to open a PR if you have ways to fix this issue.
Was this bug already reported upstream (kernel mailing list)?
Sure has been. The first comment references this bug as well: https://bugzilla.kernel.org/show_bug.cgi?id=81211
It's been open since 2014, with no comments from anyone who works on it, other than to say it's most likely an application using it incorrectly.
Thanks for the link, Justin! I'll troll Linus =)
kind regards. =* :heart:
@samvignoli please don't do this, it doesn't help anyone.
Can somebody reproduce this in a small VM image?
Maybe I can get my hands dirty with gdb and lots of kprintf.
bug still open.
OS: CentOS 7.2
kernel: 4.4.2 elrepo kernel-ml
docker: 1.10.2
fs: overlayfs with xfs
log:
Message from syslogd@host118 at Feb 29 14:52:47 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
[root@host118 ~]# uname -a
Linux host118 4.4.2-1.el7.elrepo.x86_64 #1 SMP Thu Feb 18 10:20:19 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@host118 ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@host118 ~]# lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.2.1511 (Core)
Release: 7.2.1511
Codename: Core
[root@host118 ~]# docker info
Containers: 5
Running: 2
Paused: 0
Stopped: 3
Images: 154
Server Version: 1.10.2
Storage Driver: overlay
Backing Filesystem: xfs
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 4.4.2-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.858 GiB
Name: host118
ID: 2NW7:Y54E:AHTO:AVDR:S2XZ:BGMC:ZO4I:BCAG:6RKW:KITO:KRM2:DQIZ
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
This log shows up when running the sameersbn/docker-gitlab docker image:
wget https://raw.githubusercontent.com/sameersbn/docker-gitlab/master/docker-compose.yml
docker-compose up
I may just be getting lucky - but after applying these sysctl settings the occurrence of this happening has gone way down.
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 600
net.ipv4.tcp_tw_reuse = 1
net.netfilter.nf_conntrack_generic_timeout = 120
net.netfilter.nf_conntrack_max = 1555600000
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_established = 300
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
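For anyone who wants to try the same settings, a rough sketch of how I'd persist and load them (the file name is arbitrary, and the values are simply a subset of the ones listed above, not a recommendation):

```
# Drop the overrides into a sysctl.d file and reload; requires root.
sudo tee /etc/sysctl.d/90-conntrack-tuning.conf >/dev/null <<'EOF'
net.ipv4.tcp_tw_reuse = 1
net.netfilter.nf_conntrack_max = 1555600000
net.netfilter.nf_conntrack_tcp_timeout_established = 300
EOF
sudo sysctl --system   # re-applies every file under /etc/sysctl.d
```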
@joshrendek what's the motivation behind these settings?
@kmike this was to fix some other conntrack issues (iptables getting full) that we were experiencing; it seems to have done something with regard to my original issue as a side effect.
Could you show the before/after so we can see what actually changed? Are you willing to binary-search these settings and see if there's a smaller set?
I'm using CoreOS Stable (899.13.0) in a Compute Engine VM. This error occurs every time I start the server with the following flag set to 0 (the default). I have tested back and forth several times, and with IPv6 disabled I can start all the containers on the node without any error:
$ cat /etc/sysctl.d/10-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
I use the gcloud container to download from GCR, so maybe the problem is IPv6 + downloading MBs of images + closing the containers quickly.
Docker version for reference:
Client:
Version: 1.9.1
API version: 1.21
Go version: go1.4.3
Git commit: 9894698
Built:
OS/Arch: linux/amd64
Server:
Version: 1.9.1
API version: 1.21
Go version: go1.4.3
Git commit: 9894698
Built:
OS/Arch: linux/amd64
I have also tested the sysctl flags mentioned earlier in this issue, but some already have that value and the rest didn't seem to change anything related to this error:
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 600 -----> not found in CoreOS
net.ipv4.tcp_tw_reuse = 1 -----> default: 0
net.netfilter.nf_conntrack_generic_timeout = 120 -----> default: 600
net.netfilter.nf_conntrack_max = 1555600000 -----> default: 65536
net.netfilter.nf_conntrack_tcp_timeout_close = 10 -> already: 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60 -> already: 60
net.netfilter.nf_conntrack_tcp_timeout_established = 300 -----> default: 432000
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120 -> already: 120
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30 -> already: 30
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300 -> already: 300
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60 -> already: 60
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120 -> already: 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120 -> already: 120
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300 -> already: 300
I'm still seeing the issue when I set net.ipv6.conf.all.disable_ipv6=1.
The docker stress tool can produce the issue very easily.
https://github.com/crosbymichael/docker-stress
This is the binary I built for the tool above.
https://storage.googleapis.com/donny/main
https://storage.googleapis.com/donny/stress.json
Once we see the log "unregister_netdevice: waiting for veth6c3b8b0 to become free. Usage count", docker hangs. I think this is a kernel issue triggered by docker. It happens only when the docker userland proxy is off (--userland-proxy=false).
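For reference, a sketch of how the userland proxy gets turned off so others can compare with and without it (assumptions: on Docker 1.12+ the daemon binary is dockerd, on older versions it's docker daemon; the daemon.json path below assumes a systemd-based install and a Docker version that reads that file):

```
# Either pass the flag to the daemon directly:
dockerd --userland-proxy=false

# Or set it in the daemon config file and restart the daemon:
echo '{ "userland-proxy": false }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
```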
I've had this happen with and without userland proxy enabled, so I wouldn't say only when it is off.
It could be that it makes the situation worse; I know we once tried to make --userland-proxy=false
the default, but reverted that because there were side-effects https://github.com/docker/docker/issues/14856
I've seen the error once more since yesterday, so clearly disabling IPv6 is not a fix; but without the flag I can't even start all the containers on the server without trashing docker.
Running into this on CoreOS 1010.1.0 with kubernetes 1.2.2 and docker 1.10.3
Kubernetes added a flag to kubelet (I'm on mobile, so I can't look it up) for hairpin mode. Change it to "promiscuous bridge" or whatever the valid value is. We have not seen this error since making that change.
@bprashanh please confirm or refute?
Getting this on AWS running Linux 4.4.5-15.26.amzn1.x86_64 with Docker version 1.9.1, build a34a1d5/1.9.1.
Ruby 2.3.0 with an Alpine image is running inside the container, causing this:
kernel:[58551.548114] unregister_netdevice: waiting for lo to become free. Usage count = 1
Any fix for this?
Saw this for the first time on Linux 3.19.0-18-generic #18~14.04.1-Ubuntu SMP Wed May 20 09:38:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux.
A couple of reboots fixed it.
@MrMMorris Fixed as in you're certain the problem has gone away for good, or in that you're not experiencing it again just yet? Could be a race condition...
It's pretty clear that this is a race in the kernel, losing a refcount somewhere. This is a REALLY hard-to-track bug, but as far as we can tell it still exists.
Yup. I tried CoreOS 1032.0.0 with kernel 4.5, and the issue still exists.
I encountered this again on CoreOS 1010.1.0 with kernel 4.5.0 yesterday, it had been after several containers were started and killed in rapid succession.
I've got this error.
Docker Version: 1.9.1
Kernel Version: 4.4.8-20.46.amzn1.x86_64
Operating System: Amazon Linux AMI 2016.03
@sirlatrom not fixed. Seeing this again 😭 Required multiple reboots to resolve.
Currently running 3.19.0-18-generic. Will try upgrading to the latest.
same here! :cry:
@samvignoli your comments are not constructive. Please stop posting.
sorry, forgot the thumbs up function.
Reproduced in Fedora Server 23 - 4.2.5-300.fc23.x86_64. I cannot restart the Docker service; I can only reboot the node.
Same issue on Fedora 24, kernel 4.5.2-302.fc24.x86_64. It didn't cause any hangs, but it spams the log file.
@hapylestat Can you try systemctl restart docker? This caused it all to hang for me.
Thanks
This is happening to my (CoreOS, EC2) machines quite frequently. In case it's at all helpful, here are all the logs related to the stuck veth device in one instance of this bug.
$ journalctl | grep veth96110d9
May 14 16:40:27 ip-10-100-37-14.eu-west-1.compute.internal systemd-udevd[4189]: Could not generate persistent MAC address for veth96110d9: No such file or directory
May 14 16:40:27 ip-10-100-37-14.eu-west-1.compute.internal kernel: IPv6: ADDRCONF(NETDEV_UP): veth96110d9: link is not ready
May 14 16:40:27 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Configured
May 14 16:40:27 ip-10-100-37-14.eu-west-1.compute.internal kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth96110d9: link becomes ready
May 14 16:40:27 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Gained carrier
May 14 16:40:27 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Lost carrier
May 14 16:40:27 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Removing non-existent address: fe80::98f4:98ff:fea2:d83b/64 (valid for ever)
May 14 16:40:32 ip-10-100-37-14.eu-west-1.compute.internal kernel: eth0: renamed from veth96110d9
May 14 16:53:45 ip-10-100-37-14.eu-west-1.compute.internal kernel: veth96110d9: renamed from eth0
May 14 16:53:45 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Configured
May 14 16:53:45 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Gained carrier
May 14 16:53:45 ip-10-100-37-14.eu-west-1.compute.internal kernel: IPv6: veth96110d9: IPv6 duplicate address fe80::42:aff:fee0:571a detected!
May 14 16:53:45 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Lost carrier
May 14 16:53:45 ip-10-100-37-14.eu-west-1.compute.internal systemd-networkd[665]: veth96110d9: Removing non-existent address: fe80::42:aff:fee0:571a/64 (valid for ever)
May 14 16:53:55 ip-10-100-37-14.eu-west-1.compute.internal kernel: unregister_netdevice: waiting for veth96110d9 to become free. Usage count = 1
May 14 16:54:05 ip-10-100-37-14.eu-west-1.compute.internal kernel: unregister_netdevice: waiting for veth96110d9 to become free. Usage count = 1
May 14 16:54:15 ip-10-100-37-14.eu-west-1.compute.internal kernel: unregister_netdevice: waiting for veth96110d9 to become free. Usage count = 1
May 14 16:54:25 ip-10-100-37-14.eu-west-1.compute.internal kernel: unregister_netdevice: waiting for veth96110d9 to become free. Usage count = 1
May 14 16:54:35 ip-10-100-37-14.eu-west-1.compute.internal kernel: unregister_netdevice: waiting for veth96110d9 to become free. Usage count = 1
This seems to happen when I remove many containers at once (in my case, when I delete k8s pods en masse).
For those saying a reboot fixed it: did you reboot, or stop/start the machines? On physical machines I had to use a remote power reset to get the machine to come back up.
@joshrendek, I had to use iLO's cold boot (i.e. a physical power cycle).
@joshrendek I now have a script which watches for this and does reboot -f when it happens 😢.
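Not a fix, but for anyone curious, such a watchdog could look roughly like this (a sketch, not the exact script above; the match string is the kernel message from this thread, and the reset methods are the reboot -f / SysRq approaches already mentioned here):

```
#!/bin/bash
# Hard-reboot the host as soon as the kernel starts complaining about a
# stuck netdevice. Run as root, e.g. from a systemd unit.
journalctl -kf | while read -r line; do
  if echo "$line" | grep -q 'unregister_netdevice: waiting for'; then
    logger -t netdev-watchdog "stuck netdevice detected, forcing reboot"
    reboot -f          # or: echo b > /proc/sysrq-trigger for an even harder reset
  fi
done
```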
Might have found the issue (or just got lucky). I have moved the Docker graph dir from an XFS partitioned disk over to an EXT4 partitioned disk and I cannot reproduce the issue (as well as solving a load of other XFS bugs I was getting). I remember @vbatts saying that XFS isn't supported yet.
I have tried to provoke it by running build, run, stop, and delete in an infinite loop on various images, creating about 10 containers each cycle, for the last few hours (something along the lines of the loop sketched below).
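A sketch of that kind of loop (the image name, Dockerfile path, and counts are placeholders):

```
# Churn ~10 containers per cycle: build, run, stop, remove, repeat forever.
while true; do
  docker build -t churn-test .
  for i in $(seq 1 10); do
    docker run -d --name "churn-$i" churn-test sleep 5
  done
  for i in $(seq 1 10); do
    docker stop "churn-$i" && docker rm "churn-$i"
  done
done
```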
@joedborg what graphdriver are you using? Devicemapper? Overlay?
@thaJeztah Good point, I should have mentioned that. I'm using Overlay driver with (now) EXT4 backing FS.
I used to use devicemapper (because I'm using Fedora Server), but I had tons of pain (as I believe many do), especially with leaks where the mapper would not return space to the pool once a container had been deleted.
If it helps, I'm on Docker 1.11.1 and Kernel 4.2.5-300.fc23.x86_64.
@joedborg interesting, because the RHEL docs mentioned that only EXT4 is supported on RHEL/CentOS 7.1, and only XFS on RHEL/CentOS 7.2. I'd have expected XFS to work on newer versions then
@thaJeztah ah, that's odd. I'm trying to think of other things that it might be. I've re-read from the top and it seems some people are running the same config. The only other thing that's different is that the XFS disk is a spindle and the EXT4 is an SSD. I will keep soak testing in the meantime. I've also moved prod over to use the same setup, so either way we'll have an answer before long. However, it was doing it on almost every stop before, so it's certainly better.
@joedborg well, it's useful information indeed
Same error here, from kernel 4.2 to 4.5, same docker version.
BTW, I'm running several VirtualBox machines on the same box at the same time.
$ docker version
Client:
Version: 1.8.3
API version: 1.20
Go version: go1.4.2
Git commit: f4bf5c7
Built: Mon Oct 12 05:27:08 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.8.3
API version: 1.20
Go version: go1.4.2
Git commit: f4bf5c7
Built: Mon Oct 12 05:27:08 UTC 2015
OS/Arch: linux/amd64
$ docker info
Containers: 3
Images: 461
Storage Driver: devicemapper
Pool Name: docker-253:7-1310721-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: extfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 18.08 GB
Data Space Total: 107.4 GB
Data Space Available: 18.37 GB
Metadata Space Used: 26.8 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.121 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.90 (2014-09-01)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.5.0-0.bpo.1-amd64
Operating System: Debian GNU/Linux 8 (jessie)
CPUs: 4
Total Memory: 15.56 GiB
Name: tungsten
ID: HJX5:TKIH:TF4G:JCQA:MHQB:YYUD:DHBL:53M7:ZRY2:OCIE:FHY7:NLP6
I am experiencing this issue using the overlay
graph driver, with the directory on an ext4
FS. So I don't think xfs
is the problem 😢
@obeattie Yeah, it seems people are getting it on devicemapper too. Touch wood, I have not had the issue again since switching. As mentioned, I did also swap the physical disk. This is going to be an interesting one!
This problem does not correlate with the filesystem in any way. I have seen this problem with zfs, overlayfs, devicemapper, btrfs and aufs. Also with or without swap. It is not even limited to docker; I hit the same bug with lxc too. The only workaround I currently see is not to stop containers concurrently.
If it helps, I am getting the same error message on the latest EC2 instance backed by the AWS AMI. docker version shows:
Client:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5/1.9.1
Built:
OS/Arch: linux/amd64
Server:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5/1.9.1
Built:
OS/Arch: linux/amd64
Just hopping on board. I'm seeing the same behavior on the latest Amazon EC2 instance. After some period of time, the container just tips over and becomes unresponsive.
$ docker info
Containers: 2
Images: 31
Server Version: 1.9.1
Storage Driver: devicemapper
Pool Name: docker-202:1-263705-pool
Pool Blocksize: 65.54 kB
Base Device Size: 107.4 GB
Backing Filesystem:
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 1.199 GB
Data Space Total: 107.4 GB
Data Space Available: 5.754 GB
Metadata Space Used: 2.335 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.145 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.4.10-22.54.amzn1.x86_64
Operating System: Amazon Linux AMI 2016.03
CPUs: 1
Total Memory: 995.4 MiB
Name: [redacted]
ID: OB7A:Q6RX:ZRMK:4R5H:ZUQY:BBNK:BJNN:OWKS:FNU4:7NI2:AKRT:5SEP
$ docker version
Client:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5/1.9.1
Built:
OS/Arch: linux/amd64
Server:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5/1.9.1
Built:
OS/Arch: linux/amd64
Same as the above comments; also running on EC2, in this case via Elastic Beanstalk, using 64-bit Amazon Linux 2016.03 v2.1.0 running Docker 1.9.1.
Somewhat anecdotal at this time, but I recently tried upgrading from the 4.2.0 to the 4.5.5 kernel on around 18 servers as a test, and this issue became considerably worse (from multiple days down to no more than 4 hours between occurrences).
This was on Debian 8
Exact same setup as @jonpaul and @g0ddard
Looking to see how we might be able to mitigate this bug.
First thing (which may or may not work out, it's risky) is to keep the API available in cases where this occurs: #23178
Hello. I've also been bitten by this bug...
Jun 08 17:30:40 node-0-vm kernel: unregister_netdevice: waiting for veth846b1dc to become free. Usage count = 1
I'm using Kubernetes 1.2.4 on CoreOS Beta, Flannel, and running on Azure. Is there some way to help debug this issue? The kernel bug thread seems dead. Some people report that disabling IPv6 in the kernel, using --userland-proxy=true, or using aufs instead of overlay storage helps, while others report it does not... It's a bit confusing.
Like @justin8 I also noticed this after upgrading my Fedora 23 system to kernel 4.5.5; the issue remains with kernel 4.5.6.
We encountered this bug when the container was hitting its memory limit. Unsure if it's related or not.
same issue here
# docker version
Client:
Version: 1.9.1
API version: 1.21
Go version: go1.4.3
Git commit: a34a1d5
Built: Fri Nov 20 17:56:04 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.9.1
API version: 1.21
Go version: go1.4.3
Git commit: a34a1d5
Built: Fri Nov 20 17:56:04 UTC 2015
OS/Arch: linux/amd64
# docker info
Containers: 213
Images: 1232
Server Version: 1.9.1
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 1667
Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.19.0-5-exton
Operating System: Debian GNU/Linux 7 (wheezy)
CPUs: 4
Total Memory: 21.58 GiB
Name: [redacted]
Message from syslogd@[redacted] at Jun 24 10:07:54 ...
kernel:[1716405.486669] unregister_netdevice: waiting for lo to become free. Usage count = 2
Message from syslogd@[redacted] at Jun 24 10:07:56 ...
kernel:[1716407.146691] unregister_netdevice: waiting for veth06216c2 to become free. Usage count = 1
centos7.2
docker 1.10.3
the same problem
I have a "one liner" that will eventually reproduce this issue for me on an EC2 (m4.large) running CoreOS 1068.3.0 with the 4.6.3 kernel (so very recent). For me, it takes about 300 iterations but YMMV.
Linux ip-172-31-58-11.ec2.internal 4.6.3-coreos #2 SMP Sat Jun 25 00:59:14 UTC 2016 x86_64 Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz GenuineIntel GNU/Linux
CoreOS beta (1068.3.0)
Docker version 1.10.3, build 3cd164c
A few hundred iterations of the loop here will eventually hang dockerd and the kernel will be emitting error messages like
kernel: unregister_netdevice: waiting for veth8c7d525 to become free. Usage count = 1
The reproducer loop is
i=0; while echo $i && docker run --rm -p 8080 busybox /bin/true && docker ps; do sleep 0.05; ((i+=1)); done
EDITS: userland-proxy=false
@btalbot's script, above, doesn't reproduce the issue for me on Fedora 23 after several thousand iterations.
$ docker --version
Docker version 1.10.3, build f476348/1.10.3
$ docker info
Containers: 3
Running: 0
Paused: 0
Stopped: 3
Images: 42
Server Version: 1.10.3
Storage Driver: devicemapper
Pool Name: docker_vg-docker--pool
Pool Blocksize: 524.3 kB
Base Device Size: 107.4 GB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 17.69 GB
Data Space Total: 73.67 GB
Data Space Available: 55.99 GB
Metadata Space Used: 5.329 MB
Metadata Space Total: 130 MB
Metadata Space Available: 124.7 MB
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.109 (2015-09-22)
Execution Driver: native-0.2
Logging Driver: journald
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 4.5.7-200.fc23.x86_64
Operating System: Fedora 23 (Workstation Edition)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 0
CPUs: 4
Total Memory: 15.56 GiB
Name: <hostname>
ID: TOKW:AWJF:3VZU:55QA:V3KD:ZCA6:4XWW:JBY2:2Q5C:3S65:3ZXV:XRXG
Registries: docker.io (secure)
This problem happens quite frequently on my Kubernetes cluster, however I can't reproduce it reliably with the stressers or @btalbot's one liner. I've tried running it on two Azure VMs with CoreOS 1068.3.0.
First VM was a Standard_D1_v2 (3.5GB Ram, 1 core) - the script did > 3000 iterations.
Second VM was a Standard_DS15_v2 (140GB Ram, 20 cores) - the script did > 7600 iterations.
I've updated my previous comment (https://github.com/docker/docker/issues/5618#issuecomment-229545933) to include that I can only reproduce this when userland-proxy=false.
It reproduces for me on EC2 t2.micro (single-core) VMs as well as m4.large (multi-core), both using HVM. I haven't seen it happen using VirtualBox on my laptop yet, though, no matter the setting of userland-proxy.
We have encountered this bug while using Flannel with hairpin-veth enabled on a Kubernetes cluster (using the iptables proxy). This bug was happening only when we ran and stopped too many containers. We switched to using the cbr0 bridge network and promiscuous-bridge hairpin mode and have never seen it again.
Actually, it is easy to reproduce this bug if you are using hairpin-veth: just start this job with 100 containers on Kubernetes (a rough equivalent is sketched below).
On 01/07/2016 08:01, manoj0077 wrote:
@btalbot https://github.com/btalbot so with 1.12 we can restart dockerd without affecting running containers. So would a dockerd restart help here in this case?
AFAICT, even with 1.12, docker container processes are still children of the docker daemon.
@sercand how did you set promiscuous-bridge hairpin mode? I can't see any documentation from docker about that, or perhaps they are using a different name
Is there some official word from Docker 🐳 on when this might be looked at? This is the second most-commented open issue; it is very severe (necessitating a host restart); it is reproducible; and I don't see any real progress toward pinning down the root cause or fixing it 😞.
This seems most likely to be a kernel issue, but the ticket on Bugzilla has been stagnant for months. Would it be helpful to post our test cases there?
@justin8 I think those are Kubelet flags: --configure-cbr0 and --hairpin-mode
@sercand I also use Flannel. Is there any disadvantage in using --hairpin-mode=promiscuous-bridge?
@obeattie I agree. :(
FTR I managed to replicate the problem using @sercand's stresser job on a test Kubernetes cluster that I set up; it also uses flannel and hairpin-veth.
@sercand Could you please detail the steps to begin using promiscuous-bridge? I added the flag --configure-cbr0=true to the node's kubelet but it complains: ConfigureCBR0 requested, but PodCIDR not set. Will not configure CBR0 right now. I thought this PodCIDR was supposed to come from the master? Thanks.
EDIT: It seems I needed to add --allocate-node-cidrs=true --cluster-cidr=10.2.0.0/16 to the controller manager config, but since I don't have a cloud provider (Azure), the routes probably won't work.
@justin8 I have followed this doc.
@edevil From the documentation, hairpin-mode is for "This allows endpoints of a Service to loadbalance back to themselves if they should try to access their own Service". By the way, my cluster runs on Azure and it was not an easy task to achieve.
@sercand According to the doc, if we use --allocate-node-cidrs=true on the controller manager we're supposed to use a cloud provider in order for it to set up the routes. Since there is no Kubernetes cloud provider for Azure, didn't you have problems? Do you set up the routes manually? Thanks.
@edevil I use terraform to create the routes. You can find it at this repo. I quickly created this configuration and tested it only once; I hope it is enough to show the basic logic behind it.
@morvans @btalbot did you get a chance to try with 1.12 ...?
I can confirm that after moving away from hairpin-veth and using the cbr0 bridge, I cannot reproduce the problem anymore.
Just in case: is anyone seeing this issue on bare metal? We've seen it when testing a Rancher cluster in our VMware lab, but never on a real bare-metal deployment.
Yes, this issue happens on bare metal for any kernel >= 4.3. We have seen it on a lot of different machines and hardware configurations. The only solution for us was to use kernel 4.2.
It definitely still happens on 4.2, but it is an order of magnitude more frequent on anything newer. I've been testing each major release to see if it's any better, and nothing yet.
Happens on CoreOS alpha 1097.0.0 also.
Kernel: 4.6.3
Docker: 1.11.2
I get same issue.
Docker: 1.11.2
Kernel: 4.4.8-boot2docker.
Host: Docker-machine with VMWare Fusion driver on OS X.
Any suggested workarounds?
It would be really helpful if those of you who can reproduce the issue reliably in an environment where a crashdump is possible (i.e. not EC2) could share that crashdump file. More information about how to enable kdump on Ubuntu Trusty can be found here, and these are the crash options you need to enable once kdump is ready to generate a crashdump:
echo 1 > /proc/sys/kernel/hung_task_panic # panic when hung task is detected
echo 1 > /proc/sys/kernel/panic_on_io_nmi # panic on NMIs from I/O
echo 1 > /proc/sys/kernel/panic_on_oops # panic on oops or kernel bug detection
echo 1 > /proc/sys/kernel/panic_on_unrecovered_nmi # panic on NMIs from memory or unknown
echo 1 > /proc/sys/kernel/softlockup_panic # panic when soft lockups are detected
echo 1 > /proc/sys/vm/panic_on_oom # panic when out-of-memory happens
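If you want these settings to survive a reboot, a minimal sketch of the equivalent sysctl configuration (the file name and location are illustrative, not something prescribed in this thread):

```
# /etc/sysctl.d/90-crash-on-hang.conf -- persist the panic knobs from the echo commands above
kernel.hung_task_panic = 1
kernel.panic_on_io_nmi = 1
kernel.panic_on_oops = 1
kernel.panic_on_unrecovered_nmi = 1
kernel.softlockup_panic = 1
vm.panic_on_oom = 1
```

Apply it without rebooting with sudo sysctl --system.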
The crashdump can really help kernel developers find out more about what is causing the reference leak, but keep in mind that a crashdump also includes a memory dump of your host and may contain sensitive information.
...sensitive information.
:o
I am running into the same issue.
Jul 13 10:48:34 kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Linux 4.6.3-1.el7.elrepo.x86_64
Docker: 1.11.2
Same issue:
Ubuntu 14.04.4 LTS (GNU/Linux 3.19.0-25-generic x86_64)
Docker version: 1.10.3
Just happened directly on the terminal screen:
Message from syslogd@svn at Jul 26 21:47:38 ...
kernel:[492821.492101] unregister_netdevice: waiting for lo to become free. Usage count = 2
Message from syslogd@svn at Jul 26 21:47:48 ...
kernel:[492831.736107] unregister_netdevice: waiting for lo to become free. Usage count = 2
Message from syslogd@svn at Jul 26 21:47:58 ...
kernel:[492841.984110] unregister_netdevice: waiting for lo to become free. Usage count = 2
system is
Linux svn.da.com.ar 4.4.14-24.50.amzn1.x86_64 #1 SMP Fri Jun 24 19:56:04 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Same problem
OS: Amazon Linux AMI release 2016.03
Docker: 1.9.1
Here also:
Linux 4.4.14-24.50.amzn1.x86_64 x86_64
Docker version 1.11.2, build b9f10c9/1.11.2
I'm seeing the same issue on EC2:
Docker version 1.11.2, build b9f10c9/1.11.2
NAME="Amazon Linux AMI"
VERSION="2016.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2016.03"
PRETTY_NAME="Amazon Linux AMI 2016.03"
CPE_NAME="cpe:/o:amazon:linux:2016.03:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
kernel:[154350.108043] unregister_netdevice: waiting for lo to become free. Usage count = 1
(printed on all my PTYs, plus the terminal bell, when this happens)
"simply" Debian Jessie + backports:
Linux 4.6.0-0.bpo.1-amd64 #1 SMP Debian 4.6.1-1~bpo8+1 (2016-06-14) x86_64 GNU/Linux
Docker version 1.12.0, build 8eab29e
Hello,
When I try to replicate the issue in a controlled environment by creating and destroying new images, I cannot reproduce it.
The issue was raised on one of the servers running docker 1.9.1.
docker info | egrep "Version|Driver"
Server Version: 1.9.1
Storage Driver: devicemapper
Library Version: 1.02.93 (2015-01-30)
Execution Driver: native-0.2
Logging Driver: gelf
Kernel Version: 4.5.0-coreos-r1
So far I have concurrently launched 17753 containers, generating traffic to the internet, without leaking any of the veth* interfaces. Can someone paste instructions to consistently reproduce the issue?
@pegerto Should be pretty easy to trigger if you have --userland-proxy=false and spin up a bunch of containers concurrently. I do this using https://github.com/crosbymichael/docker-stress
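For anyone who wants to try that, here is a minimal sketch of one way to wire it up on a systemd host. The drop-in path is illustrative, the dockerd ExecStart applies to Docker 1.12+ (older daemons take the same flag on their own command line), and docker-stress has to be built or downloaded from the repo above first:

```
# Add --userland-proxy=false to the daemon via a systemd drop-in (adjust ExecStart to your install)
sudo mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/docker.service.d/userland-proxy.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --userland-proxy=false
EOF
sudo systemctl daemon-reload && sudo systemctl restart docker

# Hammer the daemon with short-lived containers; the -c flag matches invocations quoted later in this thread
./docker-stress -c 100
```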
Thanks @cpuguy83. Configuring the daemon with --userland-proxy=false, I can easily reproduce the issue. Thank you. We can also see this issue affecting daemons that don't run with this configuration.
I see a kernel dump at the netfilter hook introduced by the netns segregation in >= 4.3; any thoughts on why the issue seems worse when the route occurs at 127/8?
Thanks
Seeing this issue as well. CoreOS 1068.8.0, Docker 1.10.3, kernel 4.6.3. I pulled some of the system logs if anybody is interested.
Just got multiple ...
unregister_netdevice: waiting for lo to become free. Usage count = 1
... on 2 VMs and on my bare-metal laptop, all running Ubuntu 16.04 and the latest kernels (4.4.0-3[456]).
The result is everything hangs and requires a hard reboot.
I hadn't experienced this before last week, and I think one of the VMs was on 1.11.3 while the others were all on 1.12.0.
@RRAlex This is not specific to any docker version. If you are using --userland-proxy=false in the daemon options... OR (from what I understand) you are using Kubernetes, you will likely hit this issue.
The reason is that the --userland-proxy=false option enables hairpin NAT on the bridge interface... and this is something that Kubernetes also sets when it sets up the networking for its containers.
Seeing this on a BYO node using Docker Cloud (and Docker Cloud agent).
Saw this today, once (out of about 25 tries) on current Amazon ECS AMIs, running vanilla debian:jessie with a command that apt-get updates, installs pbzip2, then runs it (simple multithreaded CPU test).
@edevil
Most of the people here describe hitting this situation while using Docker to start/stop containers, but I got exactly the same situation without Docker, on Debian:
No way to recover except a hard reset of the machine.
So please, in your investigations to pinpoint/solve this issue, do not focus on Docker alone. It is obviously a generic issue with fast stops/starts of containers, whether through Docker or through plain "lxc" commands.
I think this is a problem in the Linux kernel.
I hit this problem when I had 3 chroots (in fact pbuilder) running under very heavy load.
My hardware is Loongson 3A (a mips64el machine with a 3.16 kernel).
When I tried to ssh into it, I hit this problem.
So this problem may not only be about docker or lxc; it even shows up with plain chroot.
Docker version 1.11.2.
kernel:[3406028.998789] unregister_netdevice: waiting for lo to become free. Usage count = 1
cat /etc/os-release
NAME=openSUSE
VERSION="Tumbleweed"
VERSION_ID="20160417"
PRETTY_NAME="openSUSE Tumbleweed (20160417) (x86_64)"
ID=opensuse
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:opensuse:20160417"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"
ID_LIKE="suse"
uname -a
Linux centre 4.5.0-3-default #1 SMP PREEMPT Mon Mar 28 07:27:57 UTC 2016 (8cf0ce6) x86_64 x86_64 x86_64 GNU/Linux
Bare metal.
We had the issue lately on bare metal (dedicated on ovh) with kernel 4.6.x and docker 1.11.2.
After reading comments here and trying multiple workarounds, we downgraded our kernel to the latest version of the 3.14 branch (3.14.74) and upgraded docker to 1.12.0 to avoid https://github.com/docker/libnetwork/issues/1189 and everything seems to be alright for now.
I hope this can help.
All, I think you don't need to post any more reports about Docker or chroot; it's all about the Linux kernel.
So please, could someone who is able to debug the kernel step up and look at the parts where it disables virtual network interfaces for containers? Maybe there is a race condition when a previous container stop has not yet entirely disabled/cleaned up its virtual interface before the next container stop is requested.
@rdelangh I don't think that issue is necessarily related to the kernel.
On Fedora 24, I can't reproduce the issue with Docker 1.10.3 from the Fedora repos, only with Docker 1.12.1 from Docker's own repos.
Both tests were conducted with kernel 4.6.7-300.fc24.x86_64.
Seeing this issue as well on CoreOS 1068.10.0, Docker 1.10.3, kernel 4.6.3.
kernel: unregister_netdevice: waiting for veth09b49a3 to become free. Usage count = 1
Using Kubernetes 1.3.4 on CoreOS 1068.9.0 stable on EC2, docker 1.10.3 I see this problem.
unregister_netdevice: waiting for veth5ce9806 to become free. Usage count = 1
unregister_netdevice: waiting for veth5ce9806 to become free. Usage count = 1
unregister_netdevice: waiting for veth5ce9806 to become free. Usage count = 1
...
uname -a
Linux <redacted> 4.6.3-coreos #2 SMP Fri Aug 5 04:51:16 UTC 2016 x86_64 Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz GenuineIntel GNU/Linux
Seeing this issue as well on Ubuntu 16.04, Docker 1.12.1, kernel 4.4.0-34-generic
waiting for lo to become free. Usage count = 1
$ time docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
...
real 4m40.943s
user 0m0.012s
sys 0m0.004s
For those using Kubernetes <= 1.3.4 you can exploit this issue: https://github.com/kubernetes/kubernetes/issues/30899 to reproduce this problem. I ran a small cluster with 1 Controller (m4.large) and 2 Workers (m4.large) on CoreOS 1068.10.
From there you can create 2 ReplicationControllers (I called them hello and hello1) based on this: http://pastebin.com/mAtPTrXH . Make sure to change the names and labels to be different.
Then, create 2 deployments matching the same names/labels as the above, based on this: http://pastebin.com/SAnwLnCw .
_As soon as you create the deployments, you'll get a crazy amount of spam containers_.
If you leave it on for a while (several minutes), you'll see a lot of stuff trying to terminate/create. You can delete the deployments and let things stabilize. You should see a good handful of pods stuck in Terminating and ContainerCreating. If you ssh into the nodes, check dmesg and docker ps to see if the above symptoms are apparent.
In my instance it took me about 5 minutes of letting this freak out before seeing the issue. I plan on making the changes that @sercand and @edevil were toying with and see if this works for me in this case.
@edevil After looking at your linked commit, am I correct that you disabled/removed Flannel in your environment altogether in favor of the cbr0 bridge created by Kubernetes to get past this issue? I'm thinking you would not be able to use them in tandem, because flannel wants to use docker0 while your internal networking would be working on cbr0, correct?
@alph486 That's correct, I stopped using flannel. I use the bridge and set up the routes for the pod network.
@alph486 Flannel doesn't want to use docker0. That's just the default bridge for docker, which you can override with the --bridge=cbr0 docker option. On CoreOS you would have to override the docker systemd unit.
The Kubelet flag --experimental-flannel-overlay can read the flannel configuration and configure the docker bridge cbr0 with the flannel CIDR.
It will also enable promiscuous mode instead of veth-hairpin, which seems to be the issue.
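A sketch of what such a CoreOS override could look like, assuming the stock docker.service picks up $DOCKER_OPTS (the drop-in name is illustrative):

```
# /etc/systemd/system/docker.service.d/20-bridge.conf
[Service]
Environment="DOCKER_OPTS=--bridge=cbr0"
```

followed by a systemctl daemon-reload and a restart of docker.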
Thanks @dadux for the input. If K8s will pick up the cbr0 interface that has already been bootstrapped by the overridden unit, we could be in business with that solution; I'll try it.
According to the docs, promiscuous-bridge appears to be the default value for --hairpin-mode in kubelet v1.3.4+. I'm still seeing the issue with this, so I'm not entirely sure that's the whole solution.
I've not been able to reproduce the issue again after switching to the kubenet network plugin (which is set to replace --configure-cbr0). I'm avoiding the flannel-overlay option due to the uncertainty of its future (it seems to be tied to --configure-cbr0).
If your docker daemon uses the docker0 bridge, setting --hairpin-mode=promiscuous-bridge will have no effect, as the kubelet will try to configure the non-existent bridge cbr0.
For CoreOS, my workaround to mirror the Kubernetes behaviour but still use flannel:
- Set the docker0 interface to promiscuous mode. (Surely there's a more elegant way to do this?):
  - name: docker.service
    command: start
    drop-ins:
      - name: 30-Set-Promiscuous-Mode.conf
        content: |
          [Service]
          ExecStartPost=/usr/bin/sleep 5
          ExecStartPost=/usr/bin/ip link set docker0 promisc on
- Run the kubelet with --hairpin-mode=none
You can check if hairpin is enabled for your interfaces with
brctl showstp docker0
or
for f in /sys/devices/virtual/net/*/brport/hairpin_mode; do cat $f; done
I think my colleague has fixed this recently: http://www.spinics.net/lists/netdev/msg393441.html. We encountered this problem in our environment, and after we found the issue and applied this fix, we never hit the problem again. Anyone who has encountered this problem, could you try this patch and see if it happens again? From our analysis it is related to IPv6, so you can also try disabling IPv6 in docker with --ipv6=false when starting the docker daemon.
@coolljt0725 Maybe I'm wrong, but ipv6 is disabled by default in docker and I've just reproduced the problem via docker-stress with "--ipv6=false" option (which is the default anyway). Haven't tried your patch yet.
@dadux Thank you for your help. On Kubernetes 1.3.4, CoreOS 1068 Stable, Docker 1.10.3, with Flannel as the networking layer, I have fixed the problem by making the following changes in my CoreOS units:
- name: docker.service
  drop-ins:
    - name: 30-Set-Promiscuous-Mode.conf
      content: |
        [Service]
        ExecStartPost=/usr/bin/sleep 5
        ExecStartPost=/usr/bin/ip link set docker0 promisc on
Added the following to kubelet.service: --hairpin-mode=none
What effect do these changes to Docker/Kubernetes have with regard to how the OS handles interfaces for containers?
I must stress that this is an issue of wrong OS behaviour, not Docker or Kubernetes, because we (and some other people in this thread) are not running Docker or Kubernetes at all, but still encounter exactly the same situation when stopping LXC containers quickly one after the other.
@rdelangh You are correct. However, this issue was created in the Docker project to track the behavior as it pertains to Docker. There are other issues mentioned in this thread tracking it as an OS problem, a K8s problem, and CoreOS problem. If you have found the issue in LXC or something else, highly recommend you start a thread there and link here to raise awareness around the issue.
When those using Docker google for this error they will likely land here. So, it makes sense that we post workarounds to this issue here so that until the underlying problems are fixed, people can move forward.
What effect do these changes to Docker/Kubernetes have with regard to how the OS handles interfaces for containers?
- The docker change in my post allows the Kubernetes stack to interrogate docker and make sure the platform is healthy when the issue occurs.
- The hairpin-mode change essentially tells K8s to use the docker0 bridge as is, so it will not try to use "kernel land" networking and "hairpin veth", which is where the problem begins in the Docker execution path.
It's a workaround for this issue when using K8s and Docker.
coolljt0725's colleague's patch has been queued for stable, so hopefully it'll be backported into distros soon enough. (David Miller's post: http://www.spinics.net/lists/netdev/msg393688.html)
Not sure where that commit is, though, or whether we should send it to Ubuntu, RH, etc. to help them track and backport it?
Going to show up here at some point I guess:
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/tree/net/ipv6/addrconf.c
EDIT: seems to be present here: https://github.com/torvalds/linux/blob/master/net/ipv6/addrconf.c
Thank you to coolljt0725 and co (and everybody in this thread). Since many people will be unable to update to a kernel with the ipv6 patch for some time (currently, that's everyone), I've managed to squash this bug after trying many of the suggestions from this thread. I want to make a full post to follow up on what did and did not work, so that nobody else has to go through the trouble I've seen.
TL;DR: disable ipv6 in the Linux boot params and reboot. On CoreOS this means giving /usr/share/oem/grub.cfg the contents set linux_append="ipv6.disable=1" and then rebooting. A more general-purpose suggestion that should work on centos/ubuntu/debian/$linuxes may be found here.
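(For non-CoreOS distros, a minimal sketch of the usual grub route, assuming a stock /etc/default/grub:)

```
# Append ipv6.disable=1 to the kernel command line, then regenerate the grub config and reboot
sudo sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"/GRUB_CMDLINE_LINUX="\1 ipv6.disable=1"/' /etc/default/grub
sudo update-grub     # on CentOS/RHEL: sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot
```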
Things that did not work: flags to dockerd, individually and in certain combinations (since none of them seemed to work, I wasn't too scientific about trying any and all combinations):
--ipv6=false
--iptables=false
--ip-forward=false
--icc=false
--ip-masq=false
--userland-proxy=false
Interestingly, --ipv6=false doesn't really seem to do anything -- this was quite perplexing, as containers still received inet6 addresses with this flag.
--userland-proxy=false sets hairpin mode and wasn't really expected to work. In conjunction with setting docker0 to promisc mode I had some hope, but this did not resolve the issue either. There is a mention of a fix to --userland-proxy=false here; it may be upstream soon and is worth another shot. It would be nice to turn this off regardless of the bug noted in this issue, for performance, but unfortunately it has yet another bug at this time.
too long; did read: disable ipv6 in your grub settings. reboot. profit.
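As a quick sanity check for the --ipv6=false observation above, you can look inside a throwaway container (busybox here is just a convenient example image):

```
# If IPv6 really were off for containers, this would print nothing
docker run --rm busybox ip addr show dev eth0 | grep inet6
```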
Faced this issue on CentOS 7.2 (3.10.0-327.28.3.el7.x86_64) and Docker 1.12.1 (w/o k8s). The problem arises when network traffic increases.
Booting kernel with ipv6 disabled (as per previous advice) didn't help.
But turning the docker0 interface into promisc mode has fixed this. I used the systemd drop-in by @dadux (thank you!) - it seems to be working well now.
@rdallman Deactivating ipv6 via grub does not prevent unregister_netdevice for me on either Ubuntu 16.04 (kernel 4.4.0-36-generic) or 14.04 (kernel 3.13.0-95-generic), regardless of the --userland-proxy setting (either true or false).
Ooooh, that's cool that the patch was queued for stable.
Ping @aboch about the problem that --ipv6=false does nothing.
@trifle Sorry :( Thanks for posting the info. We have yet to run into issues after a few days of testing, but will report back if we do. We're running CoreOS 1122.2 (kernel 4.7.0). Setting docker0 to promisc mode seems to fix this for some people (no luck for us).
@RRAlex Do you know if anyone has reached out to the Ubuntu kernel team regarding a backport? We have a large production Docker deployment on an Ubuntu cluster that's affected by the bug.
Ubuntu kernel team mailing list:
https://lists.ubuntu.com/archives/kernel-team/2016-September/thread.html
Patch for the stable kernel:
https://github.com/torvalds/linux/commit/751eb6b6042a596b0080967c1a529a9fe98dac1d
Ubuntu kernel commit log:
http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/log/?h=master-next
(Patch is not there yet)
@leonsp I tried contacting them on what seems to be the related issue:
https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1403152
If you look at the last (#79) reply, someone built a kernel for Xenial with that patch:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152
Not sure when it is going into the main Ubuntu kernel tree, though, nor what this person's relation to Ubuntu is and whether that'll help...
I also can't find the mentioned commits from that thread in the Ubuntu kernel commit log.
@RRAlex The mentioned commits are on ddstreet's branch ~ddstreet/+git/linux:lp1403152-xenial; here is the log: https://code.launchpad.net/~ddstreet/+git/linux/+ref/lp1403152-xenial
So anyone with this issue on Ubuntu 16.04 can give it a try: https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152
Possibly @sforshee knows (for the Ubuntu kernel)
I've finally managed to test the "ipv6.disable=1" solution. In addition to that, I've upgraded to a 4.7.2 kernel on my Debian 8.
After the kernel upgrade and enabling "ipv6.disable=1" in the kernel parameters, I still managed to catch the "waiting for lo" issue on a real workload, even without the "--userland-proxy=false" flag for the docker daemon. The good news is that after specifying "--userland-proxy=false" and trying to reproduce the issue with "docker-stress", I can no longer do so. But I am pretty sure it will arise again regardless of the "--userland-proxy" value.
So from what I see, ipv6 is definitely involved in this issue, because docker-stress is now no longer able to catch it. The bad news is that the issue is actually still there (i.e. it's only partially fixed).
Will compile the latest 4.8rc7 later to test more.
@twang2218 @coolljt0725
Hmmm.. so I just tried the Ubuntu xenial 4.4.0-36 kernel with the patch backported from ddstreet's ppa:
$ uname -a
Linux paul-laptop 4.4.0-36-generic #55hf1403152v20160916b1-Ubuntu SMP Fri Sep 16 19:13:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Unfortunately, this does not seem to solve the problem for me. Note that I'm also running with "ipv6.disable=1". Are we looking at multiple unrelated causes with the same outcome? Many of the comments in this thread seem to suggest so.
I don't know too much about these, but I know we've had bugs like this before. As I understand it, reference counts to any network device end up getting transferred to lo when a network namespace is being cleaned up, so "waiting for lo to become free" means there's a reference count leak for some net device but not necessarily for lo directly. That makes these a bear to track down, because by the time you know there was a leak you don't know what device it was associated with.
I haven't read back through all the comments, but if someone can give me a reliable reproducer on Ubuntu I'll take a look at it and see if I can figure anything out.
@sforshee it's not always easy to reproduce, but there was a patch created (that at least fixes some of the cases reported here); http://www.spinics.net/lists/netdev/msg393441.html. That was accepted upstream https://github.com/torvalds/linux/commit/751eb6b6042a596b0080967c1a529a9fe98dac1d
@thaJeztah ah, I see the question you were directing me at now.
So the patch is in the upstream 4.4 stable queue; for 16.04 it's likely to be included not in the next kernel SRU (which is already in progress) but in the one after that, about 5-6 weeks from now. If it is needed in 14.04 too, please let me know so that it can be backported.
@sforshee Basically, earlier (before that patch) it could be reproduced by enabling ipv6 in the kernel (usually enabled by default), adding "--userland-proxy=false" to the docker daemon flags, and then running, for example, docker-stress -c 100 (docker-stress is from here: https://github.com/crosbymichael/docker-stress).
@fxposter thanks. If there's a fix for that one though all I really need to worry about is getting that fix into the Ubuntu kernel. I can also help look into other leaks that aren't fixed by that patch.
I'm having this issue too. I'm running docker inside a rancherOS box from AWS. Actually, it happens randomly after setting up a rancher cluster (3 hosts) and running a small application in it.
Same here on Fedora 24. It happens randomly; it can be fine for a week, then I get one every 10 hours:
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Experiencing this on CentOS 7 running kernel 3.10.0-327.36.1.el7 and docker 1.12.1.
Downgrading to kernel 3.10.0-327.18.2.el7 while remaining on docker 1.12.1 seems to have stabilized the system.
I'm also seeing this:
Docker version 1.11.2
Ubuntu 16.04.1 4.4.0-38-generic
ipv6 disabled (grub)
Just had this problem without --userland-proxy=false (sic!) on a server with kernel 4.8.0-rc7, which includes the ipv6 patch (sic!!). So maybe it fixes some of the problems, but definitely not all of them.
Does anyone know how this can be debugged at all?
We discovered that this only occurs on our setup when we (almost) run out of free memory.
@fxposter It would be useful to find a minimal reproduction case, which is kinda hard :/ Then we could use ftrace to at least find the code paths.
Happening on CoreOS 1081.5.0 (4.6.3-coreos)
Linux blade08 4.6.3-coreos #2 SMP Sat Jul 16 22:51:51 UTC 2016 x86_64 Intel(R) Xeon(R) CPU X5650 @ 2.67GHz GenuineIntel GNU/Linux
@LK4D4 Unfortunately it's no longer possible to reproduce it via docker-stress (at least I could not). I will try to mimic our previous setup with webkits (which triggered this problem quite a bit more often than I would like).
@fxposter That patch only fixes some of the problems (in our environment, we never encounter it anymore with that patch), not all of them, so I'll let my colleague keep looking into this issue. If you have any way to reproduce this, please let me know, thanks :)
The fix is in 4.4.22 stable https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.22
I posted a request for Redhat to apply this patch to Fedora 24.
4.4.0-42 is still broken for sure...
I mentioned it here for Ubuntu, but maybe someone has a better idea:
https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1403152
I'm also seeing this, Docker version 1.11.2, build b9f10c9/1.11.2, 64bit Amazon Linux 2016.03 v2.1.6.
Still happening: docker 1.12.2, Armbian Linux kernel 4.8.4, ipv6.disable=1 in bootargs.
How do I fix this bug? I hit it every day.
@woshihaoren Don't use --userland-proxy=false
To clarify - we faced it with userland-proxy disabled too
Getting this on Amazon Linux AMI 2016.9:
$ uname -a
Linux 4.4.23-31.54.amzn1.x86_64 #1 SMP
Docker version:
```
Client:
Version: 1.11.2
API version: 1.23
Go version: go1.5.3
Git commit: b9f10c9/1.11.2
Built:
OS/Arch: linux/amd64
Server:
Version: 1.11.2
API version: 1.23
Go version: go1.5.3
Git commit: b9f10c9/1.11.2
Built:
OS/Arch: linux/amd64
```
CentOS 7, kernel 4.4.30, again~~~~
CoreOS 1185.3.0, 4.7.3-coreos-r2, Docker 1.11.2
Reproducible by just running 10-20 debian:jessie containers with "apt-get update" as the command.
CoreOS stable is currently still hit. The fix for the 4.7 series is in 4.7.5: https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.7.5
commit 4e1b3aa898ea93ec10e48c06f0e511de37c35b2d
Author: Wei Yongjun <[email protected]>
Date: Mon Sep 5 16:06:31 2016 +0800
ipv6: addrconf: fix dev refcont leak when DAD failed
TL;DR - There are no solutions in this post, but I do list what I've chased so far and my current working theories. I'm hoping other folks who are also chasing this might find some info here helpful as we run this thing down.
@koendc Thanks for posting the patch that was introduced into 4.7.5. I back ported the 4e1b3aa898ea93ec10e48c06f0e511de37c35b2d (upstream 751eb6b6042a596b0080967c1a529a9fe98dac1d) patch to my 4.5.5 setup [1] and was able to easily reproduce the unregister_netdevice problem. It is possible that other changes in the 4.7.x kernel work together with the provided patch to resolve this issue, but I have not yet confirmed that, so we shouldn't lose all hope yet. I'm testing with 4.5.5 because I have a reproducible test case to cause the problem, discussed in [2].
Other things I've confirmed based on testing:
Next steps:
IPv6: eth0: IPv6 duplicate address <blah> detected errors show up in the logs. Might be another red herring, but I want to try exercising ipv6 disabling to see if there is a correlation.
[1] My full setup is a GCE virt running a slightly customized Debian kernel based on 4.5.5. Docker version 1.8.3, build f4bf5c7 is running on top of that.
[2] Test case information: I have 20 parallel processes, each of which starts a Node.js hello world server inside of a docker container. Instead of returning hello world, the Node.js server returns 1 MB of random text. The parent process is in a tight loop that starts the container, curls to retrieve the 1 MB of data, and stops the container. Using this setup, I can consistently reproduce the problem in 4-90s. Using this same setup on a physical host or inside of VirtualBox does not reproduce the problem, despite varying items that alter mean time to reproduction on the GCE box. Variables I've been playing with: number of concurrent test processes, size of payload transferred, and quantity of curl calls. The first two variables are definitely correlated, though I think it's likely just a matter of adjusting the variables to find a reasonable saturation point for the virt.
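A rough sketch of that loop, for anyone who wants to adapt it; the image name and container port are placeholders, not the author's actual harness:

```
#!/usr/bin/env bash
# Spawn N workers; each starts a web-server container, fetches ~1 MB from it, and removes it, in a tight loop.
WORKERS=20
worker() {
  while true; do
    cid=$(docker run -d -P some-nodejs-1mb-server)      # placeholder image that serves ~1 MB of text
    port=$(docker port "$cid" 8080 | cut -d: -f2)        # placeholder container port
    curl -s "http://localhost:${port}/" > /dev/null
    docker rm -f "$cid" > /dev/null
  done
}
for _ in $(seq "$WORKERS"); do worker & done
wait
```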
I too am having this error.
I see it repeated 3 times after deploying a container.
Description
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Steps to reproduce the issue:
docker run -d --network=anetwork --name aname -p 9999:80 aimagename
Describe the results you received:
Just get the error repeated 3 times.
Describe the results you expected:
No error
Additional information you deem important (e.g. issue happens only occasionally):
Just started happening after this weekend.
Output of docker version
:
docker --version
Docker version 1.12.3, build 6b644ec
Output of docker info
:
docker info
Containers: 10
Running: 9
Paused: 0
Stopped: 1
Images: 16
Server Version: 1.12.3
Storage Driver: overlay2
Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: overlay null host bridge
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.8.4-200.fc24.x86_64
Operating System: Fedora 24 (Server Edition)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 15.67 GiB
Name: docker-overlayfs
ID: AHY3:COIU:QQDG:KZ7S:AUBY:SJO7:AHNB:3JLM:A7RN:57CQ:G56Y:YEVU
Docker Root Dir: /docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
127.0.0.0/8
Additional environment details (AWS, VirtualBox, physical, etc.):
Virtual machine:
Fedora 24
OverlayFS2 on ext3
Separate drive allocated for docker use 24 gigs.
16 gigs of ram.
Docker PS
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5664a10de50b 7f01d324a3cb "/bin/sh -c 'apk --no" 11 minutes ago Exited (1) 10 minutes ago pensive_brattain
3727b3e57e2f paa-api "/bin/sh -c /run.sh" 10 days ago Up 10 days 0.0.0.0:8080->80/tcp paa-api
43cfe7eae9cf paa-ui "nginx -g 'daemon off" 10 days ago Up 10 days 0.0.0.0:80->80/tcp, 443/tcp paa-ui
345eaab3b289 sentry "/entrypoint.sh run w" 11 days ago Up 11 days 0.0.0.0:8282->9000/tcp my-sentry
32e555609cd2 sentry "/entrypoint.sh run w" 11 days ago Up 11 days 9000/tcp sentry-worker-1
a411d09d7f98 sentry "/entrypoint.sh run c" 11 days ago Up 11 days 9000/tcp sentry-cron
7ea48b27eb85 postgres "/docker-entrypoint.s" 11 days ago Up 11 days 5432/tcp sentry-postgres
116ad8850bb1 redis "docker-entrypoint.sh" 11 days ago Up 11 days 6379/tcp sentry-redis
35ee0c906a03 uifd/ui-for-docker "/ui-for-docker" 11 days ago Up 11 days 0.0.0.0:9000->9000/tcp docker-ui
111ad12b877f elasticsearch "/docker-entrypoint.s" 11 days ago Up 11 days 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp paa-elastic
Docker images
docker images -a
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 7f01d324a3cb 12 minutes ago 88.51 MB
<none> <none> 1a6a12354032 12 minutes ago 88.51 MB
debian jessie 73e72bf822ca 6 days ago 123 MB
paa-api latest 6da68e510175 10 days ago 116.9 MB
<none> <none> 4c56476ba36d 10 days ago 116.9 MB
<none> <none> 3ea3bff63c7b 10 days ago 116.8 MB
<none> <none> 05d6d5078f8a 10 days ago 88.51 MB
<none> <none> 30f0e6001f1e 10 days ago 88.51 MB
paa-ui latest af8ff5acc85a 10 days ago 188.1 MB
elasticsearch latest 5a62a28797b3 12 days ago 350.1 MB
sentry latest 9ebeda6520cd 13 days ago 493.7 MB
redis latest 74b99a81add5 13 days ago 182.9 MB
python alpine 8dd7712cca84 13 days ago 88.51 MB
postgres latest 0267f82ab721 13 days ago 264.8 MB
nginx latest e43d811ce2f4 3 weeks ago 181.5 MB
uifd/ui-for-docker latest 965940f98fa5 9 weeks ago 8.096 MB
Docker Volume Ls
DRIVER VOLUME NAME
local 3bc848cdd4325c7422284f6898a7d10edf8f0554d6ba8244c75e876ced567261
local 6575dad920ec453ca61bd5052cae1b7e80197475b14955115ba69e8c1752cf18
local bf73a21a2f42ea47ce472e55ab474041d4aeaa7bdb564049858d31b538bad47b
local c1bf0761e8d819075e8e2427c29fec657c9ce26bc9c849548e10d64eec69e76d
local e056bce5ae34f4066d05870365dcf22e84cbde8d5bd49217e3476439d909fe44
DF -H
df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 7.9G 0 7.9G 0% /dev
tmpfs 7.9G 0 7.9G 0% /dev/shm
tmpfs 7.9G 1.3M 7.9G 1% /run
tmpfs 7.9G 0 7.9G 0% /sys/fs/cgroup
/dev/mapper/fedora-root 11G 1.6G 8.7G 16% /
tmpfs 7.9G 8.0K 7.9G 1% /tmp
/dev/sda1 477M 130M 319M 29% /boot
/dev/sdb1 24G 1.6G 21G 7% /docker
overlay 24G 1.6G 21G 7% /docker/overlay2/5591cfec27842815f5278112edb3197e9d7d5ab508a97c3070fb1a149d28f9f0/merged
shm 64M 0 64M 0% /docker/containers/35ee0c906a03422e1b015c967548582eb5ca3195b3ffdd040bb80df9bb77cd32/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/73e795866566e845f09042d9f7e491e8c3ac59ebd7f5bc0ee4715d0f08a12b7b/merged
shm 64M 4.0K 64M 1% /docker/containers/7ea48b27eb854e769886f3b662c2031cf74f3c6f77320a570d2bfa237aef9d2b/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/fad7f3b483bc48b83c3a729368124aaaf5fdd7751fe0a383171b8966959ac966/merged
shm 64M 0 64M 0% /docker/containers/116ad8850bb1c74d1a33b6416e1b99775ef40aa13fc098790b7e4ea07e3e6075/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/456c40bc86852c9f9c9ac737741b57d30f2167882f15b32ac25f42048648d945/merged
shm 64M 0 64M 0% /docker/containers/a411d09d7f98e1456a454a399fb68472f5129df6c3bd0b73f59236e6f1e55e74/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/3ee2b1b978b048f4d80302eec129e7163a025c7bb8e832a29567b64f5d15baa0/merged
shm 64M 0 64M 0% /docker/containers/32e555609cd2c77a1a8efc45298d55224f15988197ef47411a90904cf3e13910/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/3e1cdabc2ae422a84b1d4106af1dde0cd670392bbe8a9d8f338909a926026b73/merged
shm 64M 0 64M 0% /docker/containers/345eaab3b289794154af864e1d14b774cb8b8beac8864761ac84051416c7761b/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/6bfc33084abe688af9c1a704a0daba496bee7746052103ef975c76d2c74d6455/merged
shm 64M 0 64M 0% /docker/containers/111ad12b877f4d4d8b3ab4b44b06f645acf89b983580e93d441305dcc7926671/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/0b454336447a39d06966adedf4dc4abed6405212107a2f8f326072ae5fb58b3d/merged
shm 64M 0 64M 0% /docker/containers/43cfe7eae9cf310d64c6fe0f133152067d88f8d9242e48289148daebd9cb713d/shm
overlay 24G 1.6G 21G 7% /docker/overlay2/0d8bba910f1f5e928a8c1e5d02cc55b6fe7bd7cd5c4d23d4abc6f361ff5043ac/merged
shm 64M 0 64M 0% /docker/containers/3727b3e57e2f5c3b7879f
DF -i
df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
devtmpfs 2051100 411 2050689 1% /dev
tmpfs 2054171 1 2054170 1% /dev/shm
tmpfs 2054171 735 2053436 1% /run
tmpfs 2054171 16 2054155 1% /sys/fs/cgroup
/dev/mapper/fedora-root 5402624 53183 5349441 1% /
tmpfs 2054171 8 2054163 1% /tmp
/dev/sda1 128016 350 127666 1% /boot
/dev/sdb1 1572864 72477 1500387 5% /docker
overlay 1572864 72477 1500387 5% /docker/overlay2/5591cfec27842815f5278112edb3197e9d7d5ab508a97c3070fb1a149d28f9f0/merged
shm 2054171 1 2054170 1% /docker/containers/35ee0c906a03422e1b015c967548582eb5ca3195b3ffdd040bb80df9bb77cd32/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/73e795866566e845f09042d9f7e491e8c3ac59ebd7f5bc0ee4715d0f08a12b7b/merged
shm 2054171 2 2054169 1% /docker/containers/7ea48b27eb854e769886f3b662c2031cf74f3c6f77320a570d2bfa237aef9d2b/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/fad7f3b483bc48b83c3a729368124aaaf5fdd7751fe0a383171b8966959ac966/merged
shm 2054171 1 2054170 1% /docker/containers/116ad8850bb1c74d1a33b6416e1b99775ef40aa13fc098790b7e4ea07e3e6075/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/456c40bc86852c9f9c9ac737741b57d30f2167882f15b32ac25f42048648d945/merged
shm 2054171 1 2054170 1% /docker/containers/a411d09d7f98e1456a454a399fb68472f5129df6c3bd0b73f59236e6f1e55e74/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/3ee2b1b978b048f4d80302eec129e7163a025c7bb8e832a29567b64f5d15baa0/merged
shm 2054171 1 2054170 1% /docker/containers/32e555609cd2c77a1a8efc45298d55224f15988197ef47411a90904cf3e13910/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/3e1cdabc2ae422a84b1d4106af1dde0cd670392bbe8a9d8f338909a926026b73/merged
shm 2054171 1 2054170 1% /docker/containers/345eaab3b289794154af864e1d14b774cb8b8beac8864761ac84051416c7761b/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/6bfc33084abe688af9c1a704a0daba496bee7746052103ef975c76d2c74d6455/merged
shm 2054171 1 2054170 1% /docker/containers/111ad12b877f4d4d8b3ab4b44b06f645acf89b983580e93d441305dcc7926671/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/0b454336447a39d06966adedf4dc4abed6405212107a2f8f326072ae5fb58b3d/merged
shm 2054171 1 2054170 1% /docker/containers/43cfe7eae9cf310d64c6fe0f133152067d88f8d9242e48289148daebd9cb713d/shm
overlay 1572864 72477 1500387 5% /docker/overlay2/0d8bba910f1f5e928a8c1e5d02cc55b6fe7bd7cd5c4d23d4abc6f361ff5043ac/merged
shm 2054171 1 2054170 1% /docker/containers/3727b3e57e2f5c3b7879f23deb3b023d10c0b766fe83e21dd389c71021af371f/shm
tmpfs 2054171 5 2054166 1% /run/user/0
Free -lmh
free -lmh
total used free shared buff/cache available
Mem: 15G 3.0G 10G 19M 2.7G 12G
Low: 15G 5.6G 10G
High: 0B 0B 0B
Swap: 1.2G 0B 1.2G
For any of those interested, we (Travis CI) are rolling out an upgrade to v4.8.7 on Ubuntu 14.04. Our load tests showed no occurrences of the error described here. Previously, we were running linux-image-generic-lts-xenial on Ubuntu 14.04. I'm planning to get a blog post published in the near future describing more of the details.
UPDATE: I should have mentioned that we are running this docker stack:
Client:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: 6b644ec
Built: Wed Oct 26 21:44:32 2016
OS/Arch: linux/amd64
Server:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: 6b644ec
Built: Wed Oct 26 21:44:32 2016
OS/Arch: linux/amd64
UPDATE: We are _still_ seeing this error in production on Ubuntu Trusty + kernel v4.8.7. We don't yet know why these errors disappeared in staging load tests that previously reproduced the error, yet the error rate in production is effectively the same. Onward and upward. We have disabled "automatic implosion" based on this error given the high rate of instance turnover.
Also seen on CentOS 7:
Message from syslogd@c31392666b98e49f6ace8ed65be337210-node1 at Nov 17 17:28:07 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@c31392666b98e49f6ace8ed65be337210-node1 at Nov 17 17:32:47 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@c31392666b98e49f6ace8ed65be337210-node1 at Nov 17 17:37:32 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@c31392666b98e49f6ace8ed65be337210-node1 at Nov 17 17:37:42 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
[root@c31392666b98e49f6ace8ed65be337210-node1 ~]# docker info
Containers: 19
Running: 15
Paused: 0
Stopped: 4
Images: 23
Server Version: 1.11.2.1
Storage Driver: overlay
Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local nas acd ossfs
Network: vpc bridge null host
Kernel Version: 4.4.6-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.795 GiB
Name: c31392666b98e49f6ace8ed65be337210-node1
ID: WUWS:FDP5:TNR6:EE5B:I2KI:O4IT:TQWF:4U42:5327:7I5K:ATGT:73KM
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Cluster store: etcd://test.com:2379
Cluster advertise: 192.168.0.2:2376
Same thing happening here with a DigitalOcean VPS on Debian testing:
# journalctl -p0 | tail -15
Nov 19 12:02:55 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 12:03:05 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 12:17:44 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 12:48:15 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 13:33:08 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 14:03:04 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 14:03:14 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 14:17:59 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 15:03:02 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 15:18:13 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 15:32:44 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 16:03:13 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 16:47:43 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 17:17:46 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Nov 19 17:17:56 hostname kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
System
$ apt list --installed 'linux-image*'
Listing... Done
linux-image-3.16.0-4-amd64/now 3.16.36-1+deb8u2 amd64 [installed,local]
linux-image-4.8.0-1-amd64/testing,now 4.8.5-1 amd64 [installed,automatic]
linux-image-amd64/testing,now 4.8+76 amd64 [installed]
$ apt list --installed 'docker*'
Listing... Done
docker-engine/debian-stretch,now 1.12.3-0~stretch amd64 [installed]
N: There are 22 additional versions. Please use the '-a' switch to see them.
$ uname -a
Linux hostname 4.8.0-1-amd64 #1 SMP Debian 4.8.5-1 (2016-10-28) x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux testing (stretch)
Release: testing
Codename: stretch
$ docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 42
Server Version: 1.12.3
Storage Driver: devicemapper
Pool Name: docker-254:1-132765-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: ext4
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 435 MB
Data Space Total: 107.4 GB
Data Space Available: 16.96 GB
Metadata Space Used: 1.356 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.146 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.136 (2016-11-05)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null host bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.8.0-1-amd64
Operating System: Debian GNU/Linux stretch/sid
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 996.4 MiB
Name: hostname
ID: <redacted>
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0b54ed86ba70 squid/production "/usr/sbin/squid -N" 29 hours ago Up 29 hours 0.0.0.0:8080-8081->8080-8081/tcp squid
$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether de:ad:be:ff:ff:ff brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether de:ad:be:ff:ff:ff brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether de:ad:be:ff:ff:ff brd ff:ff:ff:ff:ff:ff
234: veth64d2a77@if233: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether de:ad:be:ff:ff:ff brd ff:ff:ff:ff:ff:ff link-netnsid 1
# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 0.0.0.0
inet6 dead::beef:dead:beef:ffff prefixlen 64 scopeid 0x20<link>
ether de:ad:be:ef:ff:ff txqueuelen 0 (Ethernet)
RX packets 3095526 bytes 1811946213 (1.6 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2642391 bytes 1886180372 (1.7 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 123.45.67.89 netmask 255.255.240.0 broadcast 123.45.67.89
inet6 dead::beef:dead:beef:ffff prefixlen 64 scopeid 0x0<global>
inet6 dead::beef:dead:beef:ffff prefixlen 64 scopeid 0x20<link>
ether dead::beef:dead:beef:ffff txqueuelen 1000 (Ethernet)
RX packets 3014258 bytes 2087556505 (1.9 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 3453430 bytes 1992544469 (1.8 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1 (Local Loopback)
RX packets 178 bytes 15081 (14.7 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 178 bytes 15081 (14.7 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
veth64d2a77: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 dead::beef:dead:beef:ffff prefixlen 64 scopeid 0x20<link>
ether d2:00:ac:07:c8:45 txqueuelen 0 (Ethernet)
RX packets 1259405 bytes 818486790 (780.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1103375 bytes 817423202 (779.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
I've been testing 4.8.8 in a tight loop (see [2] from my earlier comment for the test case) non-stop for the last 4 days. So far, so good.
Facts
Suppositions
@meatballhat pointed out that their production servers experienced the problem while running 4.8.7. This leaves us with two possibilities:
Can we get a few folks to try 4.8.8 to see if they are able to reproduce this problem?
@reshen I'll get us updated to 4.8.8 and report back :+1: Thanks much for your research!
@reshen Excellent research. So far I've also not been able to reproduce the problem using Linux 4.8.8 on Xubuntu 16.04.
I've been using the Ubuntu mainline kernel builds. I do not have a well defined test case, but I could consistently reproduce the problem before by starting and stopping the set of docker containers I work with.
To test Linux 4.8.8, the easiest route for me was to switch from aufs to overlay2 as the storage driver, as the mainline kernel builds do not include aufs. I don't think this will influence the test, but it should be noted.
In the past I've tested Linux 4.4.4 with 751eb6b6 backported by Dan Streetman; this did not seem to reduce the problem for me. It will be interesting to see whether backporting the two patches you noted (5086cadf and 6fff1319) can give the same result as 4.8.8.
Ubuntu 16.04 with 4.4.0-47 was still affected... trying 4.4.0-49 now, will report later.
edit 2016-11-28: -49 is still showing that log line in dmesg.
Experienced this on Fedora 25 (kernel 4.8.8) and Docker 1.12.3
FYI: we've been running Linux 4.8.8 in conjunction with Docker v1.12.3 on a single production host. Uptime is presently at 5.5 days and the machine remains stable.
We occasionally see a handful of unregister_netdevice: waiting for lo to become free. Usage count = 1 messages in syslog, but unlike before, the kernel does not crash and the message goes away. I suspect that one of the other changes introduced either in the kernel or in Docker detects this condition and now recovers from it. For us, this makes the message annoying but no longer a critical bug.
I'm hoping some other folks can confirm the above on their production fleets.
@gtirloni - can you clarify if your 4.8.8/1.12.3 machine crashed or if you just saw the message?
Thank you, in advance, to everyone who has been working on reproducing/providing useful information to triangulate this thing.
We delete the counterpart of the veth interface (docker0) after starting docker, and then restart docker, when we provision the host using Ansible. The problem hasn't occurred since.
I'm also getting this same error on a Raspberry Pi 2 running Raspbian with Docker.
Kernel info
Linux rpi2 4.4.32-v7+ #924 SMP Tue Nov 15 18:11:28 GMT 2016 armv7l GNU/Linux
Docker Info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 9
Server Version: 1.12.3
Storage Driver: overlay
Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null host bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options:
Kernel Version: 4.4.32-v7+
Operating System: Raspbian GNU/Linux 8 (jessie)
OSType: linux
Architecture: armv7l
CPUs: 4
Total Memory: 925.5 MiB
Name: rpi2
ID: 24DC:RFX7:D3GZ:YZXF:GYVY:NXX3:MXD3:EMLC:JPLN:7I6I:QVQ7:M3NX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpuset support
Insecure Registries:
127.0.0.0/8
It happened after creating a container which needed around ~50Mb of downloaded programs installed.
Only a reboot would let me use the machine again
I am actually seeing this on Amazon Linux in an ECS cluster - the message occasionally throws but it doesn't lock up, like reshen's seeing now. Docker 1.11.2. Uname reports "4.4.14-24.50.amzn1.x86_64" as the version.
@reshen I'm going to build 4.8.8 this weekend on my laptop and see if that fixes it for me!
--
Keifer Furzland
http://kfrz.work
I was also able to reproduce this issue using https://github.com/crosbymichael/docker-stress on a Kubernetes worker node running CoreOS Stable 1185.3.0.
Running docker_stress_linux_amd64 -k 3s -c 5 --containers 1000 (5 concurrent workers creating/deleting containers, max container lifetime = 3s, create up to 1000 containers) on an m4.large instance on AWS would leave the Docker daemon unresponsive after about three minutes.
Upgraded to CoreOS Beta 1235.1.0 and I haven't been able to reproduce it (neither the unresponsiveness nor the unregister_netdevice message in the kernel logs). Whereas running 5 concurrent docker_stress workers would kill CoreOS Stable after a few minutes, I was able to run with 10 and 15 concurrent workers until test completion using CoreOS Beta.
CoreOS releases in "channels" so it's not possible to upgrade the kernel in isolation. Here are the major differences between stable and beta:
Seeing this issue on Amazon Elastic Beanstalk running 4.4.23-31.54.amzn1.x86_64
Just happened on CoreOS Stable 1185.5.0, Docker 1.12.2.
After a reboot everything is fine.
Update: the hung Docker daemon issue has struck again on a host running CoreOS Beta 1235.1.0 with Docker v1.12.3, and Linux kernel v4.8.6. 😢
1.12.4 and 1.13 should, in theory, not freeze up when this kernel issue is hit.
The docker daemon freezes because it is waiting for a netlink message back from the kernel (which will never come) while holding the lock on the container object.
1.12.4 and 1.13 set a timeout on this netlink request to at least release the container lock.
This does __not__ fix the issue, but at least (hopefully) does not freeze the whole daemon.
You will likely not be able to spin up new containers, and similarly probably will not be able to tear them down since it seems like all interactions with netlink stall once this issue is hit.
@cpuguy83 FWIW, any running containers continue to run without issue AFAIK when the daemon is hung. Indeed, it's the starting and stopping of containers that is noticeable (especially running on Kubernetes, as we are).
This does not fix the issue, but at least (hopefully) does not freeze the whole daemon.
The one upside of the whole daemon being frozen is that it's easy to detect: Kubernetes can evict the node, maybe even reboot it automatically. If the daemon just keeps running, would it still be possible to easily tell that the kernel issue has appeared at all?
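One low-tech answer is to watch the kernel log for the message and alert on it, so a quietly degraded node still gets noticed; a sketch, with the alert action left as a placeholder:

```
#!/usr/bin/env bash
# Follow the kernel ring buffer and flag the refcount-leak message as soon as it shows up.
journalctl -kf | while read -r line; do
  case "$line" in
    *"unregister_netdevice: waiting for"*)
      echo "netdevice refcount leak detected on $(hostname): $line" >&2
      # placeholder: push to your alerting system or cordon the node here
      ;;
  esac
done
```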
@seanknox I could provide you with a custom CoreOS 1248.1.0 AMI with patched Docker (CoreOS Docker 1.12.3 + Upstream 1.12.4-rc1 Patches). It has fixed hangups every couple of hours on my CoreOS/K8s clusters. Just ping me with your AWS Account-ID on the Deis Slack.
We are in a lot of pain with this issue on our CoreOS cluster. Could anyone say when it will finally be fixed? We dream of the moment when we can sleep at night.
@DenisIzmaylov If you don't set --userland-proxy=false, then generally you should not run into this issue.
But otherwise this is a bug in the kernel, possibly multiple kernel bugs, that some say is resolved in 4.8 and others say is not. For some, disabling ipv6 seems to fix it; for others it doesn't (hence it's probably multiple issues, or at least has multiple causes).
I've seen this issue within hours on high-load systems both with and without --userland-proxy=false.
Confirmed we are still seeing unregister_netdevice errors on kernel 4.8.12. It takes about 5 days to trigger. Only a reboot of the system seems to recover from the issue. Stopping Docker seems to hang indefinitely.
Have not tried the disable ipv6 trick for kernel boot yet.
Containers: 17
Running: 14
Paused: 0
Stopped: 3
Images: 121
Server Version: 1.10.3
Storage Driver: overlay
Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 4.8.12-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 62.86 GiB
Name: **REDACTED***
ID: **REDACTED***
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
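In case it helps anyone who wants to try that: on a RHEL/CentOS 7 host, disabling IPv6 at boot (rather than via sysctl) is usually done roughly like this (a sketch; the grub.cfg path may differ on UEFI systems):
```
# Append ipv6.disable=1 to GRUB_CMDLINE_LINUX in /etc/default/grub:
sudo sed -i 's/^GRUB_CMDLINE_LINUX="/&ipv6.disable=1 /' /etc/default/grub
# Regenerate the grub config and reboot:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot
# After rebooting, verify the parameter took effect:
grep -o 'ipv6.disable=1' /proc/cmdline
```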
Would be awesome if someone can try this with 1.12.5, which should timeout on the stuck netlink request now instead of just hanging Docker.
@cpuguy83 however, system is still unusable in that state :)
@LK4D4 Oh, totally, just want to see those timeouts ;)
Getting this issue on CentOS 7:
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Linux foo 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
docker-engine-1.12.5-1.el7.centos.x86_64
This is affecting my CI builds, which run inside Docker containers and appear to die suddenly when this console message appears. Is there a fix or a workaround? Thanks!
@cpuguy83 Docker doesn't hang for me when this error occurs, but the containers get killed, which in my situation is breaking my Jenkins/CI jobs.
So I've been running Docker on a CentOS 7 machine for a while (11 months?) without issue. Today I decided to give the TCP listening daemon a try (added the TCP listening address to /etc/sysconfig/docker) and just got this error.
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
so my usage count is not 3.
Containers: 4
Running: 3
Paused: 0
Stopped: 1
Images: 67
Server Version: 1.10.3
Storage Driver: btrfs
Build Version: Btrfs v4.4.1
Library Version: 101
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 3.10.0-514.2.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 24
Total Memory: 39.12 GiB
Name: aimes-web-encoder
ID: QK5Q:JCMA:ATGR:ND6W:YOT4:PZ7G:DBV5:PR26:YZQL:INRU:HAUC:CQ6B
Registries: docker.io (secure)
3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Client:
Version: 1.10.3
API version: 1.22
Package version: docker-common-1.10.3-59.el7.centos.x86_64
Go version: go1.6.3
Git commit: 3999ccb-unsupported
Built: Thu Dec 15 17:24:43 2016
OS/Arch: linux/amd64
Server:
Version: 1.10.3
API version: 1.22
Package version: docker-common-1.10.3-59.el7.centos.x86_64
Go version: go1.6.3
Git commit: 3999ccb-unsupported
Built: Thu Dec 15 17:24:43 2016
OS/Arch: linux/amd64
I can confirm @aamerik. I am seeing the same issue on the same kernel version. No recent major changes on the system, seeing this issue since today.
I saw the same kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
message on my CentOS 7 machine running a docker image of Jenkins. The CentOS 7 machine I was using was current with all the latest CentOS 7 patches as of approximately 20 Dec 2016.
Since the most recent references here seem to be CentOS based, I'll switch my execution host to a Ubuntu or a Debian machine.
I am running Docker version 1.12.5, build 7392c3b
on that CentOS 7 machine. Docker did not hang, but the Jenkins process I was running in Docker was killed when that message appeared.
Thanks so much for Docker! I use it all the time, and am deeply grateful for your work on it!
I'm seeing the same issue when using Jenkins with Docker on a Linux 4.8.15 machine.
Did anyone arrive at a fix procedure for RancherOS?
AFAICT, this is a locking issue in the network namespaces subsystem of the Linux kernel. This bug was reported over a year ago, with no reply: https://bugzilla.kernel.org/show_bug.cgi?id=97811 There has been some work on this (see here: http://www.spinics.net/lists/netdev/msg351337.html) but it seems it's not a complete fix.
I've tried pinging the network subsystem maintainer directly, with no response. FWIW, I can reproduce the issue in a matter of minutes.
Smyte will pay $5000 USD for the resolution of this issue. Sounds like I need to talk to someone who works on the kernel?
@petehunt I believe there are multiple issues at play causing this error.
We deployed kernel 4.8.8 as @reshen suggested and, while uptime seems a bit better, we still continue to see this issue in production.
Trying to deploy Mesosphere from a bootstrap node. All nodes are CentOS 7.2 minimal with all updates applied. The bootstrap node is showing the error as noted above by others:
Message from syslogd@command01 at Jan 16 02:30:24 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@command01 at Jan 16 02:30:34 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@command01 at Jan 16 02:30:44 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
uname -r:
3.10.0-514.2.2.el7.x86_64
docker -v:
Docker version 1.11.2, build b9f10c9
I can confirm a reboot silences the messages, but the minute I deploy Mesosphere again, the messages start every now and then. Mesosphere is quite a large deployment; maybe those trying to recreate the error can use the installer to reproduce it. It does take a few minutes before the error shows up after running the first script switch (--genconf, which is the first step).
We've hit this also. However, the error messages in our case mention the device eth0, not lo. My error is this:
kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1
I'm assuming that errors mentioning eth0 instead of lo have the same root cause as this issue. If not, we should open a new ticket regarding the eth0 errors.
OPTIONS=" -H unix:///var/run/docker.sock --ip-forward=true --iptables=true --ip-masq=true --log-driver json-file --log-opt max-size=25m --log-opt max-file=2"
We've hit this also.
Error: unregister_netdevice: waiting for lo to become free. Usage count = 1
OS: CentOS Linux release 7.3.1611 (Core)
Kernel 3.10.0-514.2.2.el7.x86_64
Docker version: 1.13.0-cs1-rc1
Docker options:
{
"disable-legacy-registry": true,
"icc":true,
"insecure-registries":[],
"ipv6":false,
"iptables":true,
"storage-driver": "devicemapper",
"storage-opts": [
"dm.thinpooldev=/dev/mapper/docker_vg-thinpool",
"dm.use_deferred_removal=true",
"dm.use_deferred_deletion=true"
],
"userland-proxy": false
}
I have this on two CentOS systems, latest updates on at least one of them.
$ uname -r
3.10.0-514.2.2.el7.x86_64
$ docker -v
Docker version 1.12.6, build 78d1802
Hey, for everyone affected by this issue on RHEL or CentOS, I've backported the commit from the mainline kernels (torvalds/linux@751eb6b6042a596b0080967c1a529a9fe98dac1d) that fixes the race condition in the IPV6 IFP refcount to 3.10.x kernels used in enterprise distributions. This should fix this issue.
You can find the bug report with working patch here:
If you are interested in testing it and have a RHEL 7 or CentOS 7 system, I have already compiled the latest CentOS 7.3 3.10.0-514.6.1.el7.x86_64 kernel with the patch. Reply to the CentOS bugtracker thread and I can send you a link to the build.
Note: there may be another issue causing a refcount leak but this should fix the error message for many of you.
@stefanlasiewski @henryiii @jsoler
I'll be trying out a build also adding this fix: http://www.spinics.net/lists/netdev/msg351337.html later tonight.
@iamthebot does that mean that if one disables IPv6 it should fix the issue too, even without the patch you just backported?
@redbaron only if that is the issue that you are hitting. I really think there are multiple kernel issues being hit here.
@redbaron maybe. #20569 seems to indicate fully disabling IPV6 is difficult.
So to clarify a bit what's happening under the hood to generate this message: the kernel maintains a running count of whether a device is in use before removing it from a namespace, unregistering it, deactivating it, etc. If for some reason there's a dangling reference to a device, then you're going to see that error message, since the device can't be unregistered while something else is using it.
The fixes I've seen so far:
I think there's still another race condition when switching namespaces (this seems to happen after creating a bunch of new containers) but I'll need to replicate the issue reliably in order to hunt it down and write a patch.
Does anyone have a minimal procedure for consistently reproducing this? Seemed to happen randomly on our systems.
@iamthebot it's not really straightforward, but I think we can provide you with a test environment that can reliably reproduce this. Email me ([email protected]) and we can arrange the details.
Still experiencing this under heavy load on Docker version 1.12.6, build 7392c3b/1.12.6 on 4.4.39-34.54.amzn1.x86_64 AWS Linux AMI.
I have 9 docker hosts, all nearly identical, and only experience this on some of them. It may be coincidence, but one thing in common I've noticed is that I only seem to have this problem when running containers that do not handle SIGINT. When I docker stop these containers, it hangs for 10s and then kills the container ungracefully.
It takes several days before the issue presents itself, and it seems to show up randomly, not just immediately after running docker stop. This is mostly anecdotal, but maybe it will help someone.
I have upgraded all my docker nodes to kernel 3.10.0-514.6.1.el7.x86_64 on CentOS 7.3, as @iamthebot mentioned, but I still get the same errors:
Jan 26 13:52:49 XXX kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@XXX at Jan 26 13:52:49 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
@jsoler just to be clear, did you apply the patch in the bug tracker thread before building the kernel? Or are you using a stock kernel? Also try applying this one (patch should work on older kernels).
Shoot me an email ([email protected]) and I can send you a link to a pre-built kernel. @vitherman I unfortunately don't have a lot of time to look into this (looks like some instrumentation will need to be compiled in to catch this bug) but I've escalated the issue with Red Hat support so their kernel team will take a look.
@ckeeney I can confirm this behavior. We have a dockerized Node application which caused said error on the host system when it was shut down. After implementing a function within the Node.js application that catches SIGINT and SIGTERM to gracefully shut down the application, the error hasn't occurred again.
Which kind of makes sense; the Node application uses the virtual interface Docker creates. When Node doesn't get shut down properly, the device hangs and the host system can't unregister it, even though the Docker container has successfully been stopped.
here is an example code snippet:
function shutdown() {
logger.log('info', 'Graceful shutdown.');
httpServer.close();
if (httpsServer) {
httpsServer.close();
}
process.exit();
}
process.on('SIGINT', shutdown);
process.on('SIGTERM', shutdown);
@michael-niemand is there a different signal that is properly handled by Node by default for a clean shutdown? (You can specify the STOPSIGNAL in the image, or on docker run through the --stop-signal flag.)
@thaJeztah for a good explanation of the problem, and workaround, see nodejs/node-v0.x-archive#9131#issuecomment-72900581
@ckeeney I'm aware of that (i.e., processes running as PID1 may not handle SIGINT or SIGTERM). For that reason, I was wondering if specifying a different stop-signal would do a clean shutdown even if running as PID1.
Alternatively, docker 1.13 adds an --init option (pull request: https://github.com/docker/docker/pull/26061) that inserts an init in the container; in that case, Node is not running as PID1, which may help in cases where you cannot update the application.
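For illustration, a hedged sketch of both approaches (the image name my-node-app is a placeholder):
```
# Per-container stop signal (or bake STOPSIGNAL SIGTERM into the Dockerfile):
docker run --stop-signal SIGTERM my-node-app

# Docker 1.13+: run a small init as PID 1 so the Node process is not PID 1:
docker run --init --rm my-node-app
```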
@iamthebot I have built kernel version 3.10.0-514.el7 with your patch integrated, but I get the same error. I am not sure whether I built the CentOS kernel package correctly, though. Could you share your kernel package so I can test it?
Thanks
I have/had been dealing with this bug for almost a year now. I use CoreOS with PXE boot, I disabled ipv6 in the pxeboot config and I haven't seen this issue once since then.
Well, my environment has IPv6 disabled with this sysctl configuration:
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
but I still get the error
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
@jsoler right, I was doing that too, and it still happened. Only once I did it at the PXE level did it stop.
label coreos
menu label CoreOS
kernel coreos/coreos_production_pxe.vmlinuz
append initrd=coreos/coreos_production_pxe_image.cpio.gz ipv6.disable=1 cloud-config-url=http://...
Just an observation - there seem to be different problems at play (that has been said before).
Some have noted logs alternating between the variants above, while others only ever see one of them.
There is also a similar bug logged against Ubuntu. On that one, they seem to find that NFS is the problem.
@etlweather I believe that in fact the only common denominator is, well, a net device not being able to be unregistered by the kernel, as the error message says. However, the reasons _why_ are somewhat different. For us it definitely was the mentioned docker/node issue (veth). For eth0 or lo the cause is most likely something completely different.
Still happens with 4.9.0-0.bpo.1-amd64 on Debian jessie with Docker 1.13.1. Is there any kernel/OS combination which is stable?
This might not be a purely docker issue - I'm getting it on a Proxmox server where I'm only running vanilla LXC containers (ubuntu 16.04).
@darth-veitcher it's a kernel issue
@thaJeztah agreed thanks. Was going to try and install 4.9.9 tonight from mainline and see if that fixes matters.
I'm getting it running Docker 1.13.1 on a Debian with kernel 4.9.9-040909.
Yes upgrading kernel on Proxmox to latest 4.9.9 didn't resolve the error. Strange as it's just appeared after a year without issues.
There might be something in a previous statement further up in the thread about it being linked to either NFS or CIFS shares mounted.
I have a bugzilla ticket open with Redhat about this.
Some developments:
Red Hat put the IPV6 refcount leak patches from mainline on QA, looks like they're queued up for RHEL 7.4 and may be backported to 7.3. Should be on CentOS-plus soon too. Note: This patch only fixes issues in SOME cases. If you have a 4.x kernel it's a moot point since they're already there.
This is definitely a race condition in the kernel from what I can tell, which makes it really annoying to find. I've taken a snapshot of the current mainline kernel and am working on instrumenting the various calls starting with the IPV6 subsystem. The issue is definitely reproducible now: looks like all you have to do is create a bunch of containers, push a ton of network traffic from them, crash the program inside the containers, and remove them. Doing this over and over triggers the issue in minutes, tops on a physical 4-core workstation.
Unfortunately, I don't have a lot of time to work on this: if there are kernel developers here who are willing to collaborate on instrumenting the necessary pieces I think we can set up a fork and start work on hunting this down step by step.
@iamthebot , is it reproducible on a qemu-kvm setup?
@iamthebot I have tried to repro this several times with different kernels. Somewhere above it was mentioned that using docker-stress -c 100 with userland-proxy set to false would trigger it, but I had no luck.
If you have a more reliable repro (even if it takes a long time to trigger) I can try and take a look
We encounter the same difficulty in our production and staging environments. We are going to upgrade to Docker 1.13 and Linux kernel 4.9 soon, but as others have already mentioned, these versions are also affected.
$ docker -v
Docker version 1.12.3, build 6b644ec
$ uname -a
Linux 4.7.0-0.bpo.1-amd64 #1 SMP Debian 4.7.8-1~bpo8+1 (2016-10-19) x86_64 GNU/Linux
I'm experiencing this issue pretty regularly on my dev system, always while shutting down containers.
General info
→ uname -a
Linux miriam 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
→ cat /etc/redhat-release
Red Hat Enterprise Linux Workstation release 7.3 (Maipo)
→ docker -v
Docker version 1.13.0, build 49bf474
→ docker-compose -v
docker-compose version 1.10.0, build 4bd6f1a
→ docker info
Containers: 11
Running: 0
Paused: 0
Stopped: 11
Images: 143
Server Version: 1.13.0
Storage Driver: overlay
Backing Filesystem: xfs
Supports d_type: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 2f7393a47307a16f8cee44a37b262e8b81021e3e
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-514.6.1.el7.x86_64
Operating System: Red Hat Enterprise Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.19 GiB
Name: miriam
ID: QU56:66KP:C37M:LHXT:4ZMX:3DOB:2RUD:F2RR:JMNV:QCGZ:ZLWQ:6UO5
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 16
Goroutines: 25
System Time: 2017-02-15T10:47:09.010477057-06:00
EventsListeners: 0
Http Proxy: http://xxxxxxxxxxxxxxxxxxxx:80
Https Proxy: http://xxxxxxxxxxxxxxxxxxxx:80
No Proxy: xxxxxxxxxxxxxxxxxxxx
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Docker daemon log
DEBU[70855] Calling DELETE /v1.22/containers/9b3d01076f3b6a1373729e770a9b1b4e878c2e4be5e27376d24f21ffead6792f?force=False&link=False&v=False
DEBU[70855] Calling DELETE /v1.22/containers/38446ddb58bc1148ea2fd394c5c14618198bcfca114dae5998a5026152da7848?force=False&link=False&v=False
DEBU[70855] Calling DELETE /v1.22/containers/e0d31b24ea4d4649aec766c7ceb5270e79f5a74d60976e5894d767c0fb2af47a?force=False&link=False&v=False
DEBU[70855] Calling DELETE /v1.22/networks/test_default
DEBU[70855] Firewalld passthrough: ipv4, [-t nat -C POSTROUTING -s 172.19.0.0/16 ! -o br-ee4e6fb1c772 -j MASQUERADE]
DEBU[70855] Firewalld passthrough: ipv4, [-t nat -D POSTROUTING -s 172.19.0.0/16 ! -o br-ee4e6fb1c772 -j MASQUERADE]
DEBU[70855] Firewalld passthrough: ipv4, [-t nat -C DOCKER -i br-ee4e6fb1c772 -j RETURN]
DEBU[70855] Firewalld passthrough: ipv4, [-t nat -D DOCKER -i br-ee4e6fb1c772 -j RETURN]
DEBU[70855] Firewalld passthrough: ipv4, [-t filter -C FORWARD -i br-ee4e6fb1c772 -o br-ee4e6fb1c772 -j ACCEPT]
DEBU[70855] Firewalld passthrough: ipv4, [-D FORWARD -i br-ee4e6fb1c772 -o br-ee4e6fb1c772 -j ACCEPT]
DEBU[70855] Firewalld passthrough: ipv4, [-t filter -C FORWARD -i br-ee4e6fb1c772 ! -o br-ee4e6fb1c772 -j ACCEPT]
DEBU[70855] Firewalld passthrough: ipv4, [-D FORWARD -i br-ee4e6fb1c772 ! -o br-ee4e6fb1c772 -j ACCEPT]
DEBU[70855] Firewalld passthrough: ipv4, [-t filter -C FORWARD -o br-ee4e6fb1c772 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]
DEBU[70855] Firewalld passthrough: ipv4, [-D FORWARD -o br-ee4e6fb1c772 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]
DEBU[70856] Firewalld passthrough: ipv4, [-t filter -C FORWARD -o br-ee4e6fb1c772 -j DOCKER]
DEBU[70856] Firewalld passthrough: ipv4, [-t filter -C FORWARD -o br-ee4e6fb1c772 -j DOCKER]
DEBU[70856] Firewalld passthrough: ipv4, [-D FORWARD -o br-ee4e6fb1c772 -j DOCKER]
DEBU[70856] Firewalld passthrough: ipv4, [-t filter -C DOCKER-ISOLATION -i br-ee4e6fb1c772 -o docker0 -j DROP]
DEBU[70856] Firewalld passthrough: ipv4, [-D DOCKER-ISOLATION -i br-ee4e6fb1c772 -o docker0 -j DROP]
DEBU[70856] Firewalld passthrough: ipv4, [-t filter -C DOCKER-ISOLATION -i docker0 -o br-ee4e6fb1c772 -j DROP]
DEBU[70856] Firewalld passthrough: ipv4, [-D DOCKER-ISOLATION -i docker0 -o br-ee4e6fb1c772 -j DROP]
DEBU[70856] Firewalld passthrough: ipv4, [-t filter -C DOCKER-ISOLATION -i br-ee4e6fb1c772 -o br-b2210b5a8b9e -j DROP]
DEBU[70856] Firewalld passthrough: ipv4, [-D DOCKER-ISOLATION -i br-ee4e6fb1c772 -o br-b2210b5a8b9e -j DROP]
DEBU[70856] Firewalld passthrough: ipv4, [-t filter -C DOCKER-ISOLATION -i br-b2210b5a8b9e -o br-ee4e6fb1c772 -j DROP]
DEBU[70856] Firewalld passthrough: ipv4, [-D DOCKER-ISOLATION -i br-b2210b5a8b9e -o br-ee4e6fb1c772 -j DROP]
DEBU[70856] releasing IPv4 pools from network test_default (ee4e6fb1c772154fa35ad8d2c032299375bc2d7756b595200f089c2fbcc39834)
DEBU[70856] ReleaseAddress(LocalDefault/172.19.0.0/16, 172.19.0.1)
DEBU[70856] ReleasePool(LocalDefault/172.19.0.0/16)
Message from syslogd@miriam at Feb 15 10:20:52 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
@r-BenDoan if you try to stop a container but it doesn't respond to SIGINT, docker will wait 10 seconds and then kill the container ungracefully. I encountered that behavior in my nodejs containers until I added signal handling. If you see a container taking 10s to stop, it likely isn't handling signals and is more likely to trigger this issue.
Make sure your containers can stop gracefully.
While I'm not the one who is fixing this issue, not being much into Linux Kernel dev, I think I am right in saying that the "me too" comments aren't that helpful. By this I mean, just saying "I have this problem too, with Kernel vx.x and Docker 1.x" does not bring anything new to the discussion.
However, I would suggest that "me too" comments which describe more the environment and method to reproduce would be of great value.
When reading all the comments, it is clear that there are a few problems - as I posted earlier, some with vethXYZ, some with eth0 and others with lo0. This suggests that they could be caused by different problems. So just saying "me too" without a full description of the error and environment may mislead people.
Also, when describing the environment, giving the Kernel and Docker version is not sufficient. Per the thread, there seems to be a few factors such as ipv6 enabled or not. NodeJS not responding to SIGINT (or other containers, not bashing on NodeJS here).
So describing what the workload on the environment is would be useful. Also, this occurs when a container is being shut down, therefore I would also suggest to the people experiencing this issue to pay attention to which container is being stopped when the problem rears its ugly head.
While it seems the problem is in the Kernel having a race condition - identifying the trigger will be of tremendous help to those who will fix the issue. And it can even give the affected users an immediate solution such as implementing a signal handler in a NodeJS application (I don't know myself that this prevents the issue from triggering, but it seems so per earlier comments of others).
FWIW kubernetes has correlated this completely to veth "hairpin mode" and has stopped using that feature completely. We have not experienced this problem at all, across tens of thousands of production machines and vastly more test runs, since changing.
Until this is fixed, abandon ship. Find a different solution :(
Yep, we are moving to gke and no longer seeing this issue (so no more bug bounty from us :))
Just had the error again. I was trying to fix a node.js application which uses sockets, and therefore scaled the application often. The node.js app was built on top of https://github.com/deployd/deployd. I hope this provides some more info. What was also interesting is that both servers inside my swarm displayed the unregister_netdevice error simultaneously after I removed the service via docker service rm. The container was scaled to 4, so two containers were running on each machine.
edit: Happened again! Working on the same node.js app. The last 3 or 4 days I haven't directly worked on that node.js application and it never occurred.
edit2: will try to add a signal handler to the nodejs app. Let's see if that helps...
I just ran in to this error, after using docker-py to publish a new instance to EC. However, I was able to exit with ctrl+C, and haven't seen it since (now that most of the images are building more quickly from the cache)
```{"status":"Pushed","progressDetail":{},"id":"c0962ea0b9bc"}
{"status":"stage: digest: sha256:f5c476a306f5c2558cb7c4a2fd252b5b186b65da22c8286208e496b3ce685de8 size: 5737"}
{"progressDetail":{},"aux":{"Tag":"stage","Digest":"sha256:f5c476a306f5c2558cb7c4a2fd252b5b186b65da22c8286208e496b3ce685de8","Size":5737}}
Docker image published successfully
Message from syslogd@ip-172-31-31-68 at Feb 16 19:49:16 ...
kernel:[1611081.976079] unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@ip-172-31-31-68 at Feb 16 19:49:27 ...
kernel:[1611092.220067] unregister_netdevice: waiting for lo to become free. Usage count = 1
[1]+ Stopped ./image-publish.py
[root@ip-172-31-xx-xx image-publish]# ^C
[root@ip-172-31-xx-xx image-publish]#
@thockin is this setting --hairpin-mode=none on the kubelets?
=none breaks containers that get NAT'ed back to themselves. We use promiscuous-bridge by default.
@thockin which containers might want to access themselves via Service ClusterIP ?
It turns out to be more common than I thought, and when we broke it, lots of people complained.
I think I know why some dockerized nodejs apps could cause this issue. Node uses keep-alive connections by default. When server.close() is used, the server doesn't accept new connections, but currently active connections like websockets or HTTP keep-alive connections are still maintained. When the dockerized app is also scaled to n, this could result in waiting for lo to become free, because when the app is forced to terminate, lo was never freed. When docker redistributes this app to another node, or the app is scaled down, docker sends a signal to the app that it should shut down. The app can listen for this signal and react. When the app isn't shut down after some seconds, docker terminates it without hesitation. I added signal handlers and found out that when using server.close() the server isn't fully terminated but "only" stops accepting new connections (see https://github.com/nodejs/node/issues/2642). So we need to make sure that open connections like websockets or HTTP keep-alive are also closed.
How to handle websockets:
The nodejs app emits closeSockets to all websockets when a shutdown signal is received. The client listens for this closeSockets event and calls sockets.disconnect() and, shortly after, sockets.connect(). Remember that server.close() was called, so this instance doesn't accept new requests. When other instances of this dockerized app are running, the load balancer inside docker will eventually pick an instance which isn't shutting down, and a successful connection is established. The instance which should shut down won't have open websocket connections.
var gracefulTermination = function(){
//we don't want to kill everything without telling the clients that this instance stops
//server.close() sets the server to a state on which he doesn't allow new connections
//but the old connections (websockets) are still open and can be used
server.close(function(){
// this method is called when the server terminates
console.log('close bknd');
process.exit();
});
//iterate through all open websockets and emit 'closeSockets' to the clients.
//Clients will then call disconnect() and connect() on their site to establish new connections
//to other instances of this scaled app
Object.keys(server.socketIoObj.sockets.sockets).forEach(function(id) {
console.log("WebSocket ID:",id, " will be closed from the client.")
server.socketIoObj.to(id).emit('closeSockets');
});
};
process.on( "SIGINT", function() {
console.log('CLOSING [SIGINT]');
gracefulTermination();
});
...
How to handle keep-alive HTTP connections:
Currently I don't know how this can be done perfectly. The easiest way is to disable keep-alive.
app.use(function (req, res, next) {
res.setHeader('Connection', 'close');
next();
});
Another possibility is to set the keep-alive timeout to a very low number. For example 0.5 seconds.
app.use(function (req, res, next) {
res.setTimeout(500);
next();
});
Hope this could help others :)
I've got the same issues. Attached are all of the logs produced by the ecs-logs-collector script.
Much appreciated for any help :)
I've got same issues.
Docker version 1.13.1, build 092cba3
Linux debian 4.8.6-x86_64-linode78
Linux backup 4.6.0-040600-generic #201606100558 SMP Fri Jun 10 10:01:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Server Version: 1.13.1
Same issue. I'm using mount in a privileged container. After 4-5 runs it freezes. I also have the same issue with the latest standard kernel for 16.04.
Everyone, @etlweather is spot-on. Only post a "me too" if you have a reliable way of reproducing the issue. In that case, detail your procedure. A docker and kernel version isn't enough and we get lots of notifications about it. The simpler your reproduction procedure, the better.
@rneugeba @redbaron Unfortunately the current "repro" I have is very hardware specific (all but proving this is a race condition). I haven't tried getting a QEMU repro but that's definitely the next step so multiple people can actually work on this and get the expected result (ideally in 1 CPU core setup). If someone already has one, please shoot me an email (it's on my profile). I'll thoroughly test it and post it here.
We're getting this in GCE pretty frequently. Docker freezes and the machine hangs on reboot.
[782935.982038] unregister_netdevice: waiting for vethecf4912 to become free. Usage count = 17
The container is running a go application, and has hairpin nat configured.
Docker:
matthew@worker-1:~$ docker version
Client:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:38:45 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:38:45 2017
OS/Arch: linux/amd64
Ubuntu 16.04 LTS,
matthew@worker-1:~$ uname -a
Linux worker-1 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Does anyone have a suggested workaround for this? I tried enabling --userland-proxy=true and Docker still hangs after a while. It appears Kubernetes has a solution, from what @thockin wrote above, but it's not clear what --hairpin-mode=promiscuous-bridge does exactly, or how to configure that on a plain-jane Ubuntu 16.x Docker install.
I can make this happen reliably when running Proxmox and using containers. Specifically, if I have moved a considerable amount of data or moved really any amount of data very recently, shutting down or hard stopping the container will produce this error. I've seen it most often when I am using containers that mount my NAS within, but that might be a coincidence.
# uname -a
Linux proxmox01 4.4.40-1-pve #1 SMP PVE 4.4.40-82 (Thu, 23 Feb 2017 15:14:06 +0100) x86_64 GNU/Linux
# cat /etc/debian_version
8.7
And from within Proxmox:
proxmox-ve: 4.4-82 (running kernel: 4.4.40-1-pve)
pve-manager: 4.4-12 (running version: 4.4-12/e71b7a74)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.40-1-pve: 4.4.40-82
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-109
pve-firmware: 1.1-10
libpve-common-perl: 4.0-92
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-1
pve-docs: 4.4-3
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-94
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-3
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
It's worth noting that Docker is not installed on this system and never has been. I'm happy to provide any data the community needs to troubleshoot this issue, just tell me what commands to run.
I am able to reproduce this on CentOS 7.3 running as a swarm worker node running DTR with a mounted NFS volume.
The issue being discussed here is a kernel bug and has not yet been fixed. There are a number of options that may help for _some_ situations, but not for all (it's most likely a combination of issues that trigger the same error)
"I have this too" does not help resolving the bug. only leave a comment if you have information that may help resolve the issue (in which case; providing a patch to the kernel upstream may be the best step).
If you want to let know you have this issue too use the "thumbs up" button in the top description:
Every comment here sends an e-mail / notification to over 3000 people. I don't want to lock the conversation on this issue, because it's not resolved yet, but I may be forced to if you ignore this.
Thanks!
That's all well and good, but what _exactly_ are the options that help? This problem is causing us issues in production, so I'd like to do whatever workarounds are necessary to get around the kernel bug.
If someone from Docker has time to try the Kubernetes workaround, please let me know and we can point you at it. I am unable to extract the changes and patch them into Docker myself, right now.
@thockin thanks. I was following the PR/issue in Kubernetes with the hairpin-mode workaround, but during the many back-and-forths I lost track of whether the workaround in fact gets rid of this issue (as I understand it, there are different scenarios that cause the ref-count inconsistency in the kernel).
If you can point me to the PR that you believe addresses the issue in K8s, I will work to get this patched in docker, at least for the case of turning userland-proxy off by default. (And we can test it using the docker-stress reproduction steps.)
I'm not sure I have a single PR, but you can look at current state. Start
here:
Hey all, just to be clear, all the "kubernetes workaround" does is enable promiscuous mode on the underlying bridge. You can achieve the same thing with ip link set <bridgename> promisc on using iproute2. It decreases the probability of running into the bug but may not eliminate it altogether.
Now, in theory this shouldn't work... but for some reason promiscuous mode seems to make the device teardown just slow enough that you don't get a race to decrement the ref counter. Perhaps one of the Kubernetes contributors can chime in here if they're on this thread.
I can verify the workaround (NOT FIX) works using my environment-specific repro. I can't really verify it helps if you're using the IPVLAN or MACVLAN drivers (we use macvlan in prod) because it seems very difficult to get those setups to produce this bug. Can anyone else with a repro attempt to verify the workaround?
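A hedged sketch of applying that workaround by hand on a host using the default bridge (substitute your bridge name if it differs; this is a mitigation, not a fix, and it does not persist across reboots):
```
# Put the Docker bridge into promiscuous mode
sudo ip link set docker0 promisc on
# Verify the PROMISC flag now appears in the interface flags
ip link show docker0
```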
Hi all, I've been trying to debug the kernel issue in an email chain on the "netdev" mailing list, so I just wanted to post some findings here:
https://www.spinics.net/lists/netdev/msg416310.html
The issue we are seeing is unregister_netdevice: waiting for lo to become free. Usage count = 1 during container shutdown. When I inspect the container's network namespace, it seems like the eth0 device has already been deleted and only the lo device is left. And there is another structure holding a reference to that device.
After some digging, it turns out the "thing" holding the reference is one of the "routing cache" entries (struct dst_entry), and something is preventing that particular dst_entry from being gc'ed (its reference count is larger than 0). So I logged every dst_hold() (which increments the dst_entry reference count by 1) and every dst_release() (which decrements it by 1), and there are indeed more dst_hold() calls than dst_release() calls.
Here are the logs, attached: kern.log.zip
Summary:
- the lo interface was renamed to lodebug for ease of grepping
- the reference count of the dst_entry starts at 1
- the reference count of the dst_entry (which is holding the reference for lo) at the end is 19
- there are 258041 total dst_hold() calls and 258023 total dst_release() calls
- among the dst_hold() calls, there are 88034 from udp_sk_rx_dst_set() (which then calls dst_hold()), 152536 from inet_sk_rx_dst_set(), and 17471 from __sk_add_backlog()
- among the dst_release() calls, there are 240551 from inet_sock_destruct() and 17472 from refdst_drop()

There are more udp_sk_rx_dst_set() and inet_sk_rx_dst_set() calls in total than inet_sock_destruct() calls, so I suspect some sockets are in a "limbo" state, with something preventing them from being destroyed.
UPDATE:
It turns out sockets (struct sock) are created and destroyed correctly, but for some of the TCP sockets inet_sk_rx_dst_set() is called multiple times on the same dst, while there is only one corresponding inet_sock_destruct() to release the reference to the dst.
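For anyone who wants to inspect a stuck container's namespace the same way, a hedged sketch (it assumes Docker keeps its netns bind mounts under /var/run/docker/netns and that nsenter is installed; <id> is a placeholder):
```
# List the network namespaces Docker still has bind-mounted:
ls /var/run/docker/netns/
# Enter one and list its interfaces; on an affected namespace you would
# typically see only lo left, as described above:
sudo nsenter --net=/var/run/docker/netns/<id> ip -d link show
```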
Here is the CentOS 7.3 workaround that fixed it for me:
yum --enablerepo=centosplus install kernel-plus
egrep ^menuentry /etc/grub2.cfg | cut -f 2 -d \'
grub2-set-default 0
reboot
Here is the patch that solves it:
https://bugs.centos.org/view.php?id=12711&nbn=1
UPDATE: This turned out not to solve the problem permanently. It showed up again several hours later with the following wall message:
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
@adrianotto - to clarify: does the CentOS kernel patch resolve this? Just curious whether you meant that both your workaround and the referenced kernel patch failed to resolve this permanently?
@stayclassychicago @adrianotto That patch only addresses one of the race conditions that can trigger the "usage count" issue in the kernel. It's just my backported fix from something in the 4.x kernels already. It may solve your problems so it's worth a shot.
@stayclassychicago before I tried the 3.10.0-514.10.2.el7.centos.plus.x86_64 kernel I was getting kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1 very regularly, nearly every time I ran a container with docker run --rm ... when the container exited. After the kernel upgrade and reboot it completely stopped for many hours, and then came back again. Now half the time I delete containers it works properly, where it used to error every time. I don't know for sure if the new kernel is helping, but it doesn't hurt.
Looks like it is very easy to reproduce when there is a LACP bonding interface on the machine. We have a 3 node swarm cluster, all 3 with a configured LACP bonding interface, and this issue basically doesn't allow us to work with the cluster. We have to restart nodes every 15-20 minutes.
Confirmed - as soon as I removed LACP bonding from the interfaces (those were used as main interfaces), everything is working fine for more than 12 hours. Used to break every ~30 minutes.
This is reproducible on Linux containerhost1 4.9.0-0.bpo.2-amd64 #1 SMP Debian 4.9.18-1~bpo8+1 (2017-04-10) x86_64 GNU/Linux with Docker version 17.04.0-ce, build 4845c56, running in privileged mode when we have CIFS mounts open. When the container stops with mounts open, Docker gets unresponsive and we get the kernel:[ 1129.675495] unregister_netdevice: waiting for lo to become free. Usage count = 1 error.
Ubuntu 16.04 (kernel 4.4.0-78-generic) still has the issue. And when it happens, any application that tries to create a new network namespace through the clone syscall will get stuck:
[ 3720.752954] [<ffffffff8183c8f5>] schedule+0x35/0x80
[ 3720.752957] [<ffffffff8183cb9e>] schedule_preempt_disabled+0xe/0x10
[ 3720.752961] [<ffffffff8183e7d9>] __mutex_lock_slowpath+0xb9/0x130
[ 3720.752964] [<ffffffff8183e86f>] mutex_lock+0x1f/0x30
[ 3720.752968] [<ffffffff8172ba2e>] copy_net_ns+0x6e/0x120
[ 3720.752972] [<ffffffff810a169b>] create_new_namespaces+0x11b/0x1d0
[ 3720.752975] [<ffffffff810a17bd>] copy_namespaces+0x6d/0xa0
[ 3720.752980] [<ffffffff8107f1d5>] copy_process+0x905/0x1b70
[ 3720.752984] [<ffffffff810805d0>] _do_fork+0x80/0x360
[ 3720.752988] [<ffffffff81080959>] SyS_clone+0x19/0x20
[ 3720.752992] [<ffffffff81840a32>] entry_SYSCALL_64_fastpath+0x16/0x71
The only solution is to hard reset the machine.
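As a quick way to check whether a host is already wedged in that state, creating a fresh network namespace from the shell should return immediately on a healthy machine and block forever on an affected one (a sketch, assuming util-linux's unshare and iproute2 are installed):
```
# Blocks indefinitely on an affected host instead of returning right away:
sudo unshare --net true
# Equivalent probe with iproute2:
sudo ip netns add probe && sudo ip netns del probe
```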
I hit this issue when mounting an NFS volume in a privileged container and then restarting the container.
It seems to me this issue never happened on RHEL 7 with the same procedure.
$ docker version
Client:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-6.gitae7d637.fc25.x86_64
Go version: go1.7.4
Git commit: ae7d637/1.12.6
Built: Mon Jan 30 16:15:28 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-6.gitae7d637.fc25.x86_64
Go version: go1.7.4
Git commit: ae7d637/1.12.6
Built: Mon Jan 30 16:15:28 2017
OS/Arch: linux/amd64
Red Hat claims to have an instance of this bug fixed as of kernel-3.10.0-514.21.1.el7 release. I suppose they will upstream the fix as soon as possible and rebase to 4.12. This package is already available on CentOS 7 as well.
Documentation related to the fix (RHN access needed):
https://access.redhat.com/articles/3034221
https://bugzilla.redhat.com/show_bug.cgi?id=1436588
From the article:
"In case of a duplicate IPv6 address or an issue with setting an address, a race condition occurred. This race condition sometimes caused address reference counting leak. Consequently, attempts to unregister a network device failed with the following error message: "unregister_netdevice: waiting for to become free. Usage count = 1". With this update, the underlying source code has been fixed, and network devices now unregister as expected in the described situation."
I already deployed this fix in all systems of our PaaS pool, and there's been already 2 days without the bug being hit. Earlier, we've had at least one system being frozen per day. I will report here if we hit the bug again.
I have kernel version 3.10.0-514.21.1.el7.x86_64, and I still have the same symptom.
Message from syslogd@docker at May 26 22:02:26 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
# uname -a
Linux docker 3.10.0-514.21.1.el7.x86_64 #1 SMP Thu May 25 17:04:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
# uptime
22:03:10 up 35 min, 3 users, load average: 0.16, 0.07, 0.06
@adrianotto Apparently, there are multiple ways to hit this issue. How did you reproduced your particular instance of this bug?
@bcdonadio If you look at https://git.centos.org/commitdiff/rpms!kernel.git/b777aca52781bc9b15328e8798726608933ceded - you will see that the https://bugzilla.redhat.com/show_bug.cgi?id=1436588 bug is "fixed" by this change:
+- [net] ipv6: addrconf: fix dev refcont leak when DAD failed (Hangbin Liu) [1436588 1416105]
Which is in the upstream kernel since 4.8, I believe (https://github.com/torvalds/linux/commit/751eb6b6042a596b0080967c1a529a9fe98dac1d). And 4.9 and 4.10 have this bug present, so RedHat just backported some of the fixes from upstream, which probably fix some problems, but definitely not all of them.
@bcdonadio I can reproduce the bug on my system by running this test script once per hour from cron:
#!/bin/sh
TAG=`date +%F_%H_%M_%S_UTC`
docker pull centos:centos6
docker run --rm adrianotto/centos6 yum check-update -q > package_updates.txt
LINES=`wc -l < package_updates.txt`
if [ $LINES -eq 0 ] ; then
rm -f package_updates.txt
echo "No packages need to be updated"
exit 0
fi
docker run --rm adrianotto/centos6 rpm -a -q > old_packages.txt
docker build -t temp:$TAG .
docker run --rm temp:$TAG rpm -a -q > new_packages.txt
docker rmi temp:$TAG
This script is just producing a package list using an image in the Docker registry, and another using one that's built locally so I can compare them. The Dockerfile is just this:
FROM centos:centos6
MAINTAINER Adrian Otto
RUN yum clean all && yum update -y && yum clean all
2-4 minutes later syslog gets this message:
Message from syslogd@docker at May 27 16:51:55 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 0
The last occurrence happened a few minutes after I ran the script manually. My guess is that the error condition is raised once some timeout elapses after the container delete is attempted.
I'm certain the error condition is intermittent, because the script above runs as a cron job at :00 past each hour. Here is a sample of the error output that syslog recorded:
May 26 01:02:44 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 02:02:22 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 02:02:32 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 03:02:18 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 03:02:28 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 03:02:38 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 04:03:14 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 05:02:25 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 05:02:35 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 06:03:31 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 06:03:41 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 06:03:51 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 06:04:02 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
May 26 09:03:04 docker kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
So it happens somewhere in the range of 2 to 4 minutes after the containers run and exit and are deleted by docker because of the --rm flag. Also notice from the log above that there is not an error for every container that's run/deleted, but it's pretty consistent.
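For completeness, the hourly scheduling described above would look roughly like this as a cron entry (the script path and file name here are hypothetical):
```
# /etc/cron.d/docker-image-check  (hypothetical file; adjust the script path)
0 * * * * root /usr/local/bin/docker-image-check.sh >> /var/log/docker-image-check.log 2>&1
```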
Would it be possible for someone to see if this patch improves things?
https://patchwork.ozlabs.org/patch/768291/
@hlrichardson This actually looks like it! I will try to backport it to our 3.16 kernel or upgrade specific servers and compile kernel 4.9 with this patch tomorrow, we'll see how it goes.
Though, after checking the commit this patch references (https://github.com/torvalds/linux/commit/0c1d70af924b966cc71e9e48920b2b635441aa50) - it was committed in 4.6 kernel, while the problem was there even before :(
Ah, so perhaps not related, unless there are multiple causes (unfortunately there are many ways this type of bug can be triggered, so that is a possibility).
We personally hit at least multiple issues here - in some of them the "unregister_netdevice" logs just disappear after some period of time and the docker containers are able to work fine, while in others all containers get stuck and the server needs to be rebooted.
Actually, we don't use vxlan on the servers that get these issues - we use simple bridges and port forwarding (it happens regardless of the userland-proxy setting).
OK, if you're not using vxlan tunnels it definitely won't help.
BTW, if you see a single instance of the "unregister_netdevice" message when a network namespace is deleted (container exit), it should be considered a normal situation, in which something referencing a netdevice was cleaned up more or less at the same time the namespace was being deleted.
The more serious case is where this message is repeated every 10 seconds and never ceases: in this case a global lock is held forever, and since this lock has to be acquired whenever a network namespace is added or deleted, any attempt to create or delete a network namespace also hangs forever.
If you have a fairly painless way to reproduce the second type of problem, I'd be interested in taking a look.
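A simple way to tell the two cases apart is to watch the kernel log and see whether the message keeps repeating every ~10 seconds for the same device (a sketch; use the second command on hosts without journald):
```
# Follow kernel messages and filter for the warning:
journalctl -kf | grep unregister_netdevice
# Or, on systems that log to a file:
tail -f /var/log/kern.log | grep unregister_netdevice
```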
@hlrichardson We're seeing the 2nd case you mention above on a bunch of our servers, i.e. the message repeated every 10 seconds. What info do you want me to share?
Seeing this on Fedora 25 while testing and building centos:7 containers while using yum. Yum failed to finish downloading the package database and hung indefinitely because the network stopped working in a weird way.
Hi guys,
There is a potential patch for the kernel bug (or at least one of the bugs) in the Linux net-dev mailing list:
https://www.spinics.net/lists/netdev/msg442211.html
It's merged in net tree, queued for stable tree.
According to https://github.com/torvalds/linux/commit/d747a7a51b00984127a88113cdbbc26f91e9d815 - it is in 4.12 (which was released yesterday)!
@fxposter @kevinxucs I'll try backporting this to the current CentOS kernel tomorrow.
I'm running 4.12 (from http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/) and I still hit this, so torvalds/linux@d747a7a must not be the complete fix.
$ uname -r
4.12.0-041200-generic
Ryan, do you have a reliable way to reproduce?
@justincormack Unfortunately I don't have a minimal example that I can share, but we have a test suite that creates and destroys a lot of containers, and I usually run into this issue (hanging docker commands, a lot of waiting for lo to become free in syslog) after only a few iterations.
@campbellr I've been trying to repro this now three times and spent a good part of this week on it with little luck. I managed to get the waiting for lo to become free messages a couple of times, but without crashes/hangs afterwards. I'm trying to reduce the test case to just creating network namespaces and veth interfaces.
In your test suite:
- do your containers have a lot of network activity? If so, which direction is predominant?
- What sort of machine are you running this on (number of cores, is it a VM, etc)?
- Do you create a lot of containers concurrently?
- Do your containers exit normally or do they crash?
Even partial answers to the above may help to narrow it down...
thanks
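For anyone attempting the same reduction, a minimal namespace/veth churn loop might look like the sketch below (an illustration only, run as root; not the exact test case used in this thread):
```
#!/bin/sh
# Repeatedly create and destroy a network namespace with a veth pair.
for i in $(seq 1 500); do
    ip netns add stress$i
    ip link add vh$i type veth peer name vc$i
    ip link set vc$i netns stress$i
    ip netns exec stress$i ip link set lo up
    ip netns exec stress$i ip link set vc$i up
    ip link set vh$i up
    ip netns del stress$i    # deleting the namespace also tears down the veth pair
done
```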
@rn Docker won't hang anymore as it sets a timeout on the netlink request that would normally hang. But you wouldn't be able to start new containers (or restart existing ones), likely container cleanup on stop would be weird as well.
I haven't had a chance to test on 4.12 yet, but I could reproduce reliably on the KVM instances at Vultr. I'm running swarm, and my headless Chrome workers cause the problems when they fail health checks or crash regularly. Of course, at this point I've tracked down all the crashers and handle network errors cleanly, etc., so I'm seeing waiting for lo to become free, but not often enough to hang things, for a couple of weeks now.
So it seems like the things that help reproduce are more complex networking scenarios combined with large amounts of traffic into the containers, constant container recycling and kvm.
@rn I managed to narrow this down to a specific container in our test suite, and was able to reproduce with the following steps:
After 3 or 4 iterations of this I end up getting waiting for lo to become free, and on the next iteration docker run fails with docker: Error response from daemon: containerd: container did not start before the specified timeout.
> do your containers have a lot of network activity? If so, which direction is predominant?

A pretty small amount. In the steps mentioned above, the http request is a small amount of json, and the response is a binary blob that's around 10MB.

> What sort of machine are you running this on (number of cores, is it a VM, etc)?

This is on a 4-core desktop machine (no VM).

> Do you create a lot of containers concurrently?

No, everything is done serially.

> Do your containers exit normally or do they crash?

They're stopped with docker stop
- start container (an internal tornado-based web service -- im trying to extract out a minimal example that still hits this)
- make a request to web service running in container
- wait for response
- kill container
I spent some time stripping the container down and it turns out that the web service had nothing to do with the bug. What seems to trigger this in my case is mounting an NFS share inside a container (running with --privileged).
On my desktop, i can reliably reproduce simply running the following a few times:
$ docker run -it --rm --privileged alpine:latest /bin/mount -o nolock -o async -o vers=3 <my-nfs-server>:/export/foo /mnt
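In case it helps others reproduce, here is a rough sketch of a loop around that command; the NFS server is the same placeholder as above and the iteration count is arbitrary:
# Replace <my-nfs-server> before running; watch dmesg for the message after each run.
for i in $(seq 1 10); do
    echo "--- iteration $i ---"
    docker run -it --rm --privileged alpine:latest \
        /bin/mount -o nolock -o async -o vers=3 <my-nfs-server>:/export/foo /mnt
    dmesg | tail -n 5 | grep 'waiting for lo to become free'
    sleep 1
done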
Kubernetes users: I opened an issue on kops to release the next Kubernetes AMI with kernel version 4.12. Feel free to check it out: https://github.com/kubernetes/kops/issues/2901
I also hit this on centos 7.3 with host kernel 3.10.0-514.6.1.el7.x86_64 and docker-ce-17.06.0.ce-1.el7.centos.x86_64.
@FrankYu that's not helpful. To participate usefully in this thread, please provide an exact way to reproduce this issue, and please test on a modern kernel. 3.10 was released four years ago; we are discussing whether this is fixed, fully or partially, in a release from four days ago.
@danielgusmao our RancherOS and AWS ECS AMI Linux OSes already have that 'fix' in place (it was likely the default) and it does not resolve the issue for us. We still see the message show up in logs all the time. Likely the only hope is that the kernel patch gets backported widely, though I searched around and can't see any evidence of serious progress towards that yet in the RedHat/CentOS/AWS Linux bugzillas and forums.
To be clear, the message itself is benign, it's the kernel crash after the messages reported by the OP which is not.
The comment in the code, where this message is coming from, explains what's happening. Basically every user (such as the IP stack) of a network device (such as the end of a veth pair inside a container) increments a reference count in the network device structure when it is using the network device. When the device is removed (e.g. when the container is removed) each user is notified so that they can do some cleanup (e.g. closing open sockets) before decrementing the reference count. Because this cleanup can take some time, especially under heavy load (lots of interfaces, a lot of connections, etc.), the kernel may print the message here once in a while.
If a user of a network device never decrements the reference count, some other part of the kernel will determine that the task waiting for the cleanup is stuck and it will crash. It is only this crash which indicates a kernel bug (some user, via some code path, did not decrement the reference count). There have been several such bugs and they have been fixed in modern kernels (and possibly back-ported to older ones). I have written quite a few stress tests (and continue writing them) to trigger such crashes, but have not been able to reproduce them on modern kernels (I do, however, see the above message).
Please only report on this issue if your kernel actually crashes, and then we would be very interested in:
- kernel version (output of uname -r)
- Linux distribution/version
- Are you on the latest kernel version of your Linux vendor?
- Network setup (bridge, overlay, IPv4, IPv6, etc)
- Description of the workload (what type of containers, what type of network load, etc)
- And ideally a simple reproduction
Thanks
[ @thaJeztah could you change the title to something like kernel crash after "unregister_netdevice: waiting for lo to become free. Usage count = 3" to make it more explicit ]
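For anyone trying to tell the benign message apart from an actual crash, a small sketch; the grep patterns are assumptions based on the log lines quoted in this thread:
# Follow the kernel log and flag both the benign message and the hung-task /
# panic reports that indicate the real bug.
dmesg -w | grep -E --line-buffered \
    'unregister_netdevice: waiting for|blocked for more than|Kernel panic'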
Should be fixed in kernel 4.12 or later. Please check. https://access.redhat.com/solutions/3105941
and link to patch https://github.com/torvalds/linux/commit/d747a7a51b00984127a88113cdbbc26f91e9d815
@drweber you will also find this patch in upcoming stable releases (for now 4.11.12, 4.9.39, 4.4.78, 3.18.62)
@rn
If a user of a network device never decrements the reference count, some other part of the kernel will determine that the task waiting for the cleanup is stuck and it will crash. It is only this crash which indicates a kernel bug (some user, via some code path, did not decrement the reference count). There have been several such bugs and they have been fixed in modern kernels (and possibly back-ported to older ones). I have written quite a few stress tests (and continue writing them) to trigger such crashes, but have not been able to reproduce them on modern kernels (I do, however, see the above message).
Please only report on this issue if your kernel actually crashes ...
We are having a slightly different issue in our environment that I am hoping to get some clarification on (kernel 3.16.0-77-generic, Ubuntu 14.04, docker 1.12.3-0~trusty). We have thousands of hosts running docker, 2-3 containers per host, and we are seeing this on < 1% of the total hosts running docker.
We actually never see the kernel crash, but instead (like the original reporters, as far as I can tell) the dockerd process is defunct. Upstart (using the /etc/init/docker.conf job from the upstream package) will not start a new dockerd process because it thinks it is already running (start: Job is already running: docker), and attempting to stop the upstart job also fails (docker start/killed, process <pid of defunct process>).
$ ps -ely
S UID PID PPID C PRI NI RSS SZ WCHAN TTY TIME CMD
...
Z 0 28107 1 0 80 0 0 0 - ? 00:18:05 dockerd <defunct>
Since we mostly run with bridge networking (on a custom bridge device), in dmesg we see a slightly different message referring to the virtual interface:
[7895942.484851] unregister_netdevice: waiting for vethb40dfbc to become free. Usage count = 1
[7895952.564852] unregister_netdevice: waiting for vethb40dfbc to become free. Usage count = 1
[7895962.656984] unregister_netdevice: waiting for vethb40dfbc to become free. Usage count = 1
Because upstart seems to refuse to restart dockerd or recognize that the previously running process is a zombie, the only solution we have found is to reboot the host.
While our outcome seems different (the kernel does not crash) the root cause sounds the same or similar. Is this not the same issue then? Is there any known workaround or way to have the docker upstart job become runnable again when this occurs?
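In case it helps others hitting the same state, a quick diagnostic sketch for the upstart situation described above, assuming a stock Ubuntu 14.04 upstart setup:
# Show the upstart job state; in the broken state this prints something like
# "docker start/killed, process <pid of defunct process>"
status docker
# Confirm that the remaining dockerd process is a zombie (state Z, "<defunct>")
ps -ely | grep dockerd | grep '<defunct>'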
@campbellr I can reproduce this issue with your approach on kernel 4.12.2-1.
BTW, if I unmount the NFS storage before the container is stopped, this issue will not happen.
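Based on that observation, a possible stopgap is to unmount the NFS share inside the container before stopping it; a sketch only, where the container name and mount point are placeholders:
# Unmount the share first so the netns teardown does not race with the NFS unmount.
docker exec my-nfs-container umount /mnt
docker stop my-nfs-container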
same problem.
[root@docker1 ~]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
[root@docker1 ~]# uname -r
3.10.0-514.26.2.el7.x86_64
[root@docker1 ~]# docker version
Client:
Version: 1.12.6
API version: 1.24
Package version: docker-1.12.6-32.git88a4867.el7.centos.x86_64
Go version: go1.7.4
Git commit: 88a4867/1.12.6
Built: Mon Jul 3 16:02:02 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Package version: docker-1.12.6-32.git88a4867.el7.centos.x86_64
Go version: go1.7.4
Git commit: 88a4867/1.12.6
Built: Mon Jul 3 16:02:02 2017
OS/Arch: linux/amd64
Hi,
I've just created 2 repos https://github.com/piec/docker-samba-loop and https://github.com/piec/docker-nfs-loop that contain the necessary setup in order to reproduce this bug
My results:
- I reproduce the bug with docker-samba-loop in a few iterations (<10). I can't reproduce it with docker-nfs-loop.
- Reproduced again with docker-samba-loop; didn't try docker-nfs-loop.
Hope this helps
Cheers
A workaround is to use --net=host in my case. But it's not always an acceptable solution.
@piec, many thanks for the details. I have a few more questions for you at the end of this very long comment.
Using the SMB setup I was able to reproduce a number of issues with different kernels. I've tried this with the NFS setup as well, but no dice.
All tests are run with docker 17.06.1-ce on HyperKit with a VM configured with 2 vCPUs and 2GB of memory (via Docker for Mac, but that should not matter). I'm using LinuxKit kernels, because I can easily swap them out.
I modified your Dockerfile slightly, adding a call to date as the first command executed, and also added a call to date before and after the docker run for the client.
With 4.9.39 (latest 4.9.x stable kernel) I get a kernel crash:
# while true; do date; docker run -it --rm --name client-smb --cap-add=SYS_ADMIN --cap-add DAC_READ_SEARCH --link samba:samba client-smb:1; date; sleep 1; done
Thu 27 Jul 2017 14:12:51 BST
+ date
Thu Jul 27 13:12:52 UTC 2017
+ mount.cifs //172.17.0.2/public /mnt/ -o vers=3.0,user=nobody,password=
+ date
Thu Jul 27 13:12:52 UTC 2017
+ ls -la /mnt
total 1028
drwxr-xr-x 2 root root 0 Jul 27 10:11 .
drwxr-xr-x 1 root root 4096 Jul 27 13:12 ..
-rwxr-xr-x 1 root root 3 Jul 27 10:11 bla
+ umount /mnt
+ echo umount ok
umount ok
Thu 27 Jul 2017 14:12:52 BST
Thu 27 Jul 2017 14:12:53 BST
---> First iteration succeeds and then hangs on the docker run
and in dmesg:
[ 268.347598] BUG: unable to handle kernel paging request at 0000000100000015
[ 268.348072] IP: [<ffffffff8c64ea95>] sk_filter_uncharge+0x5/0x31
[ 268.348411] PGD 0 [ 268.348517]
[ 268.348614] Oops: 0000 [#1] SMP
[ 268.348789] Modules linked in:
[ 268.348971] CPU: 1 PID: 2221 Comm: vsudd Not tainted 4.9.39-linuxkit #1
[ 268.349330] Hardware name: BHYVE, BIOS 1.00 03/14/2014
[ 268.349620] task: ffff8b6ab8eb5100 task.stack: ffffa015c113c000
[ 268.349995] RIP: 0010:[<ffffffff8c64ea95>] [<ffffffff8c64ea95>] sk_filter_uncharge+0x5/0x31
[ 268.350509] RSP: 0018:ffffa015c113fe10 EFLAGS: 00010202
[ 268.350818] RAX: 0000000000000000 RBX: ffff8b6ab7eee6a8 RCX: 0000000000000006
[ 268.351231] RDX: 00000000ffffffff RSI: 00000000fffffffd RDI: ffff8b6ab7eee400
[ 268.351636] RBP: ffff8b6ab7eee400 R08: 0000000000000000 R09: 0000000000000000
[ 268.352022] R10: ffffa015c101fcb0 R11: 0000000000000000 R12: 0000000000000000
[ 268.352409] R13: ffff8b6ab7eee4a8 R14: ffff8b6ab7f8e340 R15: 0000000000000000
[ 268.352796] FS: 00007f03f62e3eb0(0000) GS:ffff8b6abc700000(0000) knlGS:0000000000000000
[ 268.353234] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 268.353546] CR2: 0000000100000015 CR3: 00000000782d2000 CR4: 00000000000406a0
[ 268.353961] Stack:
[ 268.354106] ffffffff8c625054 ffff8b6ab7eee400 ffffa015c113fe88 0000000000000000
[ 268.354526] ffffffff8c74ed96 01000008bc718980 0000000000000000 0000000000000000
[ 268.354965] de66927a28223151 ffff8b6ab4443a40 ffffa015c101fcb0 ffff8b6ab4443a70
[ 268.355384] Call Trace:
[ 268.355523] [<ffffffff8c625054>] ? __sk_destruct+0x35/0x133
[ 268.355822] [<ffffffff8c74ed96>] ? unix_release_sock+0x1df/0x212
[ 268.356164] [<ffffffff8c74ede2>] ? unix_release+0x19/0x25
[ 268.356454] [<ffffffff8c62034c>] ? sock_release+0x1a/0x6c
[ 268.356742] [<ffffffff8c6203ac>] ? sock_close+0xe/0x11
[ 268.357019] [<ffffffff8c1f8710>] ? __fput+0xdd/0x17b
[ 268.357288] [<ffffffff8c0f604d>] ? task_work_run+0x64/0x7a
[ 268.357583] [<ffffffff8c003285>] ? prepare_exit_to_usermode+0x7d/0xa4
[ 268.357925] [<ffffffff8c7d2884>] ? entry_SYSCALL_64_fastpath+0xa7/0xa9
[ 268.358268] Code: 08 4c 89 e7 e8 fb f8 ff ff 48 3d 00 f0 ff ff 77 06 48 89 45 00 31 c0 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 44 00 00 <48> 8b 46 18 8b 40 04 48 8d 04 c5 28 00 00 00 f0 29 87 24 01 00
[ 268.359776] RIP [<ffffffff8c64ea95>] sk_filter_uncharge+0x5/0x31
[ 268.360118] RSP <ffffa015c113fe10>
[ 268.360311] CR2: 0000000100000015
[ 268.360550] ---[ end trace 4a7830b42d5acfb3 ]---
[ 268.360861] Kernel panic - not syncing: Fatal exception
[ 268.361217] Kernel Offset: 0xb000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 268.361789] Rebooting in 120 seconds..
Sometimes I see several iterations of what the 4.11.12 kernel does, including the unregister_netdevice messages (see below), and then get the kernel crash above. Sometimes I see slight variations of the crash, like:
[ 715.926694] BUG: unable to handle kernel paging request at 00000000fffffdc9
[ 715.927380] IP: [<ffffffff8664ea95>] sk_filter_uncharge+0x5/0x31
[ 715.927868] PGD 0 [ 715.928022]
[ 715.928174] Oops: 0000 [#1] SMP
[ 715.928424] Modules linked in:
[ 715.928703] CPU: 0 PID: 2665 Comm: runc:[0:PARENT] Not tainted 4.9.39-linuxkit #1
[ 715.929321] Hardware name: BHYVE, BIOS 1.00 03/14/2014
[ 715.929765] task: ffff931538ef4140 task.stack: ffffbcbbc0214000
[ 715.930279] RIP: 0010:[<ffffffff8664ea95>] [<ffffffff8664ea95>] sk_filter_uncharge+0x5/0x31
[ 715.931043] RSP: 0018:ffffbcbbc0217be0 EFLAGS: 00010206
[ 715.931487] RAX: 0000000000000000 RBX: ffff931532a662a8 RCX: 0000000000000006
[ 715.932043] RDX: 00000000ffffffff RSI: 00000000fffffdb1 RDI: ffff931532a66000
[ 715.932612] RBP: ffff931532a66000 R08: 0000000000000000 R09: 0000000000000000
[ 715.933181] R10: ffff9315394f2990 R11: 000000000001bb68 R12: ffff931532a66000
[ 715.933725] R13: ffff9315328060a8 R14: ffff931532a66340 R15: 0000000000000000
[ 715.934258] FS: 0000000000000000(0000) GS:ffff93153c600000(0000) knlGS:0000000000000000
[ 715.934857] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 715.935286] CR2: 00000000fffffdc9 CR3: 0000000052c09000 CR4: 00000000000406b0
[ 715.935822] Stack:
[ 715.935974] ffffffff86625054 ffff931532806000 ffffbcbbc0217c58 ffff931532a66000
[ 715.936560] ffffffff8674ed37 0100000800000282 0000000000000000 0000000000000000
[ 715.937173] 5de0b9a3a313c00b ffff9315346f5080 ffff9315394f2990 ffff9315346f50b0
[ 715.937751] Call Trace:
[ 715.937982] [<ffffffff86625054>] ? __sk_destruct+0x35/0x133
[ 715.938608] [<ffffffff8674ed37>] ? unix_release_sock+0x180/0x212
[ 715.939130] [<ffffffff8674ede2>] ? unix_release+0x19/0x25
[ 715.939517] [<ffffffff8662034c>] ? sock_release+0x1a/0x6c
[ 715.939907] [<ffffffff866203ac>] ? sock_close+0xe/0x11
[ 715.940277] [<ffffffff861f8710>] ? __fput+0xdd/0x17b
[ 715.940635] [<ffffffff860f604d>] ? task_work_run+0x64/0x7a
[ 715.941072] [<ffffffff860e148a>] ? do_exit+0x42a/0x8e0
[ 715.941472] [<ffffffff8674edfa>] ? scm_destroy+0xc/0x25
[ 715.941880] [<ffffffff867504e0>] ? unix_stream_sendmsg+0x2dd/0x30b
[ 715.942357] [<ffffffff860e19aa>] ? do_group_exit+0x3c/0x9d
[ 715.942780] [<ffffffff860eac41>] ? get_signal+0x45d/0x4e2
[ 715.943210] [<ffffffff86621640>] ? sock_sendmsg+0x2d/0x3c
[ 715.943618] [<ffffffff8602055a>] ? do_signal+0x36/0x4c9
[ 715.944017] [<ffffffff861f64c7>] ? __vfs_write+0x8f/0xcc
[ 715.944416] [<ffffffff861f7100>] ? vfs_write+0xbb/0xc7
[ 715.944809] [<ffffffff8600326c>] ? prepare_exit_to_usermode+0x64/0xa4
[ 715.945295] [<ffffffff867d2884>] ? entry_SYSCALL_64_fastpath+0xa7/0xa9
[ 715.945789] Code: 08 4c 89 e7 e8 fb f8 ff ff 48 3d 00 f0 ff ff 77 06 48 89 45 00 31 c0 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 44 00 00 <48> 8b 46 18 8b 40 04 48 8d 04 c5 28 00 00 00 f0 29 87 24 01 00
[ 715.947701] RIP [<ffffffff8664ea95>] sk_filter_uncharge+0x5/0x31
[ 715.948112] RSP <ffffbcbbc0217be0>
[ 715.948292] CR2: 00000000fffffdc9
[ 715.948467] ---[ end trace 2d69bea56725fd5f ]---
[ 715.948722] Kernel panic - not syncing: Fatal exception
[ 715.949059] Kernel Offset: 0x5000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 715.949595] Rebooting in 120 seconds..
The crashes are in the unix domain socket code and similar/identical to what is reported here, though with this new test case it is much easier to reproduce.
With 4.11.12 (which is the latest stable in the 4.11 series) I see no crashes, but it is really slow (annotations inline with --->):
# while true; do date; docker run -it --rm --name client-smb --cap-add=SYS_ADMIN --cap-add DAC_READ_SEARCH --link samba:samba client-smb:1; date; sleep 1; done
Thu 27 Jul 2017 13:48:04 BST
+ date
Thu Jul 27 12:48:05 UTC 2017
+ mount.cifs //172.17.0.2/public /mnt/ -o vers=3.0,user=nobody,password=
+ date
Thu Jul 27 12:48:05 UTC 2017
+ ls -la /mnt
total 1028
drwxr-xr-x 2 root root 0 Jul 27 10:11 .
drwxr-xr-x 1 root root 4096 Jul 27 12:48 ..
-rwxr-xr-x 1 root root 3 Jul 27 10:11 bla
+ umount /mnt
+ echo umount ok
umount ok
Thu 27 Jul 2017 13:48:05 BST
---> First iteration takes one second
Thu 27 Jul 2017 13:48:06 BST
docker: Error response from daemon: containerd: container did not start before the specified timeout.
Thu 27 Jul 2017 13:50:07 BST
---> Second iteration fails after 2 minutes with dockerd unable to start the container
Thu 27 Jul 2017 13:50:08 BST
+ date
Thu Jul 27 12:51:52 UTC 2017
+ mount.cifs //172.17.0.2/public /mnt/ -o vers=3.0,user=nobody,password=
+ date
Thu Jul 27 12:51:53 UTC 2017
+ ls -la /mnt
total 1028
drwxr-xr-x 2 root root 0 Jul 27 10:11 .
drwxr-xr-x 1 root root 4096 Jul 27 12:50 ..
-rwxr-xr-x 1 root root 3 Jul 27 10:11 bla
+ umount /mnt
+ echo umount ok
umount ok
Thu 27 Jul 2017 13:51:53 BST
---> Third iteration succeeds, BUT it takes almost 2 minutes between docker run and the container running
Thu 27 Jul 2017 13:51:54 BST
docker: Error response from daemon: containerd: container did not start before the specified timeout.
Thu 27 Jul 2017 13:53:55 BST
---> Fourth iteration fails after two minutes
Thu 27 Jul 2017 13:53:56 BST
+ date
Thu Jul 27 12:55:37 UTC 2017
+ mount.cifs //172.17.0.2/public /mnt/ -o vers=3.0,user=nobody,password=
+ date
Thu Jul 27 12:55:37 UTC 2017
+ ls -la /mnt
total 1028
drwxr-xr-x 2 root root 0 Jul 27 10:11 .
drwxr-xr-x 1 root root 4096 Jul 27 12:53 ..
-rwxr-xr-x 1 root root 3 Jul 27 10:11 bla
+ umount /mnt
+ echo umount ok
umount ok
Thu 27 Jul 2017 13:55:38 BST
---> Fifth iteration succeeds, but almost 2 minutes between docker run and the container executing
I had this running for an hour or so with the same pattern repeating, but no kernel crash.
In the kernel logs I see lots of:
[ 84.940380] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 95.082151] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 105.253289] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 115.477095] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 125.627059] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 135.789298] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 145.969455] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 156.101126] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 166.303333] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 176.445791] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 186.675958] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 196.870265] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 206.998238] unregister_netdevice: waiting for lo to become free. Usage count = 1
[...]
That is a message every ten seconds.
Since this does not cause the hung task detection to kick in even after an hour, I suspect that with 4.11.12 the reference count eventually gets decremented and the device gets freed but, judging by the intervals at which I can run containers, it might take up to 4 minutes!
The kernel crash in the OP indicated that the kernel crashed because a hung task was detected. I have not seen this crash in my testing, so I changed the sysctl settings related to hung task detection:
# sysctl -a | grep kernel.hung_task
kernel.hung_task_check_count = 4194304
kernel.hung_task_panic = 0
kernel.hung_task_timeout_secs = 120
kernel.hung_task_warnings = 10
# sysctl -w kernel.hung_task_timeout_secs=60
# sysctl -w kernel.hung_task_panic=1
This reduces the timeout to 60 seconds and panics the kernel if a hung task is detected. Since it takes around 2 minutes before dockerd complains that containerd did not start, reducing the hung task detection to 60s ought to trigger a kernel panic if a single task were hung. Alas, there was no crash in the logs.
Next, I increased the sleep after each docker run to 5 minutes to see if the messages are continuous. In this case all docker runs seem to work, which is kind of expected since, from the previous experiments, a docker run would work every 4 minutes or so.
---> This is after the first run
[ 281.406660] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 291.455945] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 301.721340] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 311.988572] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 322.258805] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 332.527383] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 342.796511] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 353.059499] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 363.327472] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 373.365562] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 383.635923] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 393.684949] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 403.950186] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 414.221779] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 424.490110] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 434.754925] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 445.022243] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 455.292106] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 465.557462] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 475.826946] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 486.097833] unregister_netdevice: waiting for lo to become free. Usage count = 1
---> 200+ seconds of messages and then nothing for almost 400 seconds
[ 883.924399] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 893.975810] unregister_netdevice: waiting for lo to become free. Usage count = 1
...
[ 1088.624065] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 1098.891297] unregister_netdevice: waiting for lo to become free. Usage count = 1
---> 200+ seconds of messages and then a gap of 90 seconds
[ 1185.119327] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 1195.387962] unregister_netdevice: waiting for lo to become free. Usage count = 1
...
[ 1390.040035] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 1400.307359] unregister_netdevice: waiting for lo to become free. Usage count = 1
---> 200+ seconds of messages and then a gap of 80+ seconds
[ 1486.325724] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 1496.591715] unregister_netdevice: waiting for lo to become free. Usage count = 1
...
[ 1680.987216] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 1691.255068] unregister_netdevice: waiting for lo to become free. Usage count = 1
---> 200+ seconds of messages and then a gap of 90+ seconds
[ 1787.547334] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 1797.819703] unregister_netdevice: waiting for lo to become free. Usage count = 1
It looks like we are getting around 200 seconds worth of unregister_netdevice messages on almost every docker run (except for the second one). I suspect during that time we can't start new containers (as indicated by Experiment 2). It's curious that the hung task detection is not kicking in, presumably because no task is hung.
This is reverting back to a 1s sleep in between docker runs.
We have another kernel which enabled a bunch of additional debug options, such as LOCKDEP, RCU_TRACE, LOCKUP_DETECTOR and a few more.
Running the repro with these debug options enabled on the 4.11.12 kernel did not trigger anything.
Ditto for the 4.9.39 kernel, where the normal kernel crashes. The debug options change the timing slightly, so this may be an additional clue that the crash in the unix domain socket code is due to a race.
strace on the various containerd processes is not helpful (it usually isn't, because it's written in Go). Lots of long stalls in futex(...FUTEX_WAIT...) without any information on where or why.
Some poking around with sysrq:
Increase verbosity:
echo 9 > /proc/sysrq-trigger
Stack trace from all CPUs:
echo l > /proc/sysrq-trigger
[ 1034.298202] sysrq: SysRq : Show backtrace of all active CPUs
[ 1034.298738] NMI backtrace for cpu 1
[ 1034.299073] CPU: 1 PID: 2235 Comm: sh Tainted: G B 4.11.12-linuxkit #1
[ 1034.299818] Hardware name: BHYVE, BIOS 1.00 03/14/2014
[ 1034.300286] Call Trace:
[ 1034.300517] dump_stack+0x82/0xb8
[ 1034.300827] nmi_cpu_backtrace+0x75/0x87
[ 1034.301200] ? irq_force_complete_move+0xf1/0xf1
[ 1034.301633] nmi_trigger_cpumask_backtrace+0x6e/0xfd
[ 1034.302097] arch_trigger_cpumask_backtrace+0x19/0x1b
[ 1034.302560] ? arch_trigger_cpumask_backtrace+0x19/0x1b
[ 1034.302989] sysrq_handle_showallcpus+0x17/0x19
[ 1034.303438] __handle_sysrq+0xe4/0x172
[ 1034.303826] write_sysrq_trigger+0x47/0x4f
[ 1034.304210] proc_reg_write+0x5d/0x76
[ 1034.304507] __vfs_write+0x35/0xc8
[ 1034.304773] ? rcu_sync_lockdep_assert+0x12/0x52
[ 1034.305132] ? __sb_start_write+0x152/0x189
[ 1034.305458] ? file_start_write+0x27/0x29
[ 1034.305770] vfs_write+0xda/0x100
[ 1034.306029] SyS_write+0x5f/0xa3
[ 1034.306283] entry_SYSCALL_64_fastpath+0x1f/0xc2
[ 1034.306638] RIP: 0033:0x7fa4810488a9
[ 1034.306976] RSP: 002b:00007fffd3a29828 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1034.307567] RAX: ffffffffffffffda RBX: 000000c6b523a020 RCX: 00007fa4810488a9
[ 1034.308101] RDX: 0000000000000002 RSI: 000000c6b5239d00 RDI: 0000000000000001
[ 1034.308635] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1034.309169] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 1034.309700] R13: 0000000000000000 R14: 00007fffd3a29988 R15: 00007fa481280ee0
[ 1034.310334] Sending NMI from CPU 1 to CPUs 0:
[ 1034.310710] NMI backtrace for cpu 0 skipped: idling at pc 0xffffffffa0922756
Nothing here, CPU1 is idle, CPU0 is handling the sysrq.
Show blocked tasks (twice)
echo w > /proc/sysrq-trigger
[ 467.167062] sysrq: SysRq : Show Blocked State
[ 467.167731] task PC stack pid father
[ 467.168580] kworker/u4:6 D 0 293 2 0x00000000
[ 467.169096] Workqueue: netns cleanup_net
[ 467.169487] Call Trace:
[ 467.169732] __schedule+0x582/0x701
[ 467.170073] schedule+0x89/0x9a
[ 467.170338] schedule_timeout+0xbf/0xff
[ 467.170666] ? del_timer_sync+0xc1/0xc1
[ 467.171011] schedule_timeout_uninterruptible+0x2a/0x2c
[ 467.171422] ? schedule_timeout_uninterruptible+0x2a/0x2c
[ 467.171866] msleep+0x1e/0x22
[ 467.172155] netdev_run_todo+0x173/0x2c4
[ 467.172499] rtnl_unlock+0xe/0x10
[ 467.172770] default_device_exit_batch+0x13c/0x15f
[ 467.173226] ? __wake_up_sync+0x12/0x12
[ 467.173550] ops_exit_list+0x29/0x53
[ 467.173850] cleanup_net+0x1a8/0x261
[ 467.174153] process_one_work+0x276/0x4fb
[ 467.174487] worker_thread+0x1eb/0x2ca
[ 467.174800] ? rescuer_thread+0x2d9/0x2d9
[ 467.175136] kthread+0x106/0x10e
[ 467.175406] ? __list_del_entry+0x22/0x22
[ 467.175737] ret_from_fork+0x2a/0x40
[ 467.176167] runc:[1:CHILD] D 0 2609 2606 0x00000000
[ 467.176636] Call Trace:
[ 467.176849] __schedule+0x582/0x701
[ 467.177152] schedule+0x89/0x9a
[ 467.177451] schedule_preempt_disabled+0x15/0x1e
[ 467.177827] __mutex_lock+0x2a0/0x3ef
[ 467.178133] ? copy_net_ns+0xbb/0x17c
[ 467.178456] mutex_lock_killable_nested+0x1b/0x1d
[ 467.179068] ? mutex_lock_killable_nested+0x1b/0x1d
[ 467.179489] copy_net_ns+0xbb/0x17c
[ 467.179798] create_new_namespaces+0x12b/0x19b
[ 467.180151] unshare_nsproxy_namespaces+0x8f/0xaf
[ 467.180569] SyS_unshare+0x17b/0x302
[ 467.180925] entry_SYSCALL_64_fastpath+0x1f/0xc2
[ 467.181303] RIP: 0033:0x737b97
[ 467.181559] RSP: 002b:00007fff1965ab18 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
[ 467.182182] RAX: ffffffffffffffda RBX: 0000000002277bd8 RCX: 0000000000737b97
[ 467.182805] RDX: 0000000000000000 RSI: 0000000000867a0f RDI: 000000006c020000
[ 467.183368] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 467.184014] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 467.184639] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 477.286653] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 487.457828] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 497.659654] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 507.831614] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 518.030241] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 528.232963] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 538.412263] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 548.583610] unregister_netdevice: waiting for lo to become free. Usage count = 1
echo w > /proc/sysrq-trigger
[ 553.969592] sysrq: SysRq : Show Blocked State
[ 553.970411] task PC stack pid father
[ 553.971208] kworker/u4:6 D 0 293 2 0x00000000
[ 553.971686] Workqueue: netns cleanup_net
[ 553.972058] Call Trace:
[ 553.972305] __schedule+0x582/0x701
[ 553.972690] schedule+0x89/0x9a
[ 553.973039] schedule_timeout+0xbf/0xff
[ 553.973462] ? del_timer_sync+0xc1/0xc1
[ 553.973890] schedule_timeout_uninterruptible+0x2a/0x2c
[ 553.974706] ? schedule_timeout_uninterruptible+0x2a/0x2c
[ 553.975244] msleep+0x1e/0x22
[ 553.975539] netdev_run_todo+0x173/0x2c4
[ 553.975950] rtnl_unlock+0xe/0x10
[ 553.976303] default_device_exit_batch+0x13c/0x15f
[ 553.976725] ? __wake_up_sync+0x12/0x12
[ 553.977121] ops_exit_list+0x29/0x53
[ 553.977501] cleanup_net+0x1a8/0x261
[ 553.977869] process_one_work+0x276/0x4fb
[ 553.978245] worker_thread+0x1eb/0x2ca
[ 553.978578] ? rescuer_thread+0x2d9/0x2d9
[ 553.978933] kthread+0x106/0x10e
[ 553.979283] ? __list_del_entry+0x22/0x22
[ 553.979774] ret_from_fork+0x2a/0x40
[ 553.980244] runc:[1:CHILD] D 0 2609 2606 0x00000000
[ 553.980728] Call Trace:
[ 553.980949] __schedule+0x582/0x701
[ 553.981254] schedule+0x89/0x9a
[ 553.981533] schedule_preempt_disabled+0x15/0x1e
[ 553.981917] __mutex_lock+0x2a0/0x3ef
[ 553.982220] ? copy_net_ns+0xbb/0x17c
[ 553.982524] mutex_lock_killable_nested+0x1b/0x1d
[ 553.982909] ? mutex_lock_killable_nested+0x1b/0x1d
[ 553.983311] copy_net_ns+0xbb/0x17c
[ 553.983606] create_new_namespaces+0x12b/0x19b
[ 553.983977] unshare_nsproxy_namespaces+0x8f/0xaf
[ 553.984363] SyS_unshare+0x17b/0x302
[ 553.984663] entry_SYSCALL_64_fastpath+0x1f/0xc2
[ 553.985080] RIP: 0033:0x737b97
[ 553.985306] RSP: 002b:00007fff1965ab18 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
[ 553.985861] RAX: ffffffffffffffda RBX: 0000000002277bd8 RCX: 0000000000737b97
[ 553.986383] RDX: 0000000000000000 RSI: 0000000000867a0f RDI: 000000006c020000
[ 553.986811] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 553.987182] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 553.987551] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
/ # [ 558.844761] unregister_netdevice: waiting for lo to become free. Usage count = 1
This shows that both the netns and cleanup_net work queues are busy. I found a somewhat related issue quite a while back here, but this time the cleanup_net workqueue is in a different state.
The unregister_netdev messages seem unrelated to the recent fix (which is in both 4.9.39 and 4.11.12). This may be because the cleanup_net work queue is not progressing and thus the message is printed.
The repro goes through runc; maybe I should try containerd directly.
I will dig a bit more and then send a summary to netdev.
@piec do you have console access, and can you see if there is anything in terms of a crash dump, or do you also just see huge delays as I do? If you have a crash dump, I'd be very interested in seeing it. Also, are you running on bare metal or in a VM? What's your configuration in terms of CPUs and memory?
@rn thanks for the investigations!
I'm running on a baremetal desktop PC so I have access to everything. It's an i7-4790K + 32 GiB.
Currently I'm running on an up-to-date Arch Linux + kernel from the testing repo (4.12.3-1-ARCH)
In my case everything behaves as you describe in your Experiment 2 (4.11.12 kernel):
The unregister_netdevice: waiting for lo to become free. Usage count = 1 message appears repeatedly if I try to run any new container during the 4+ minute delay after the client-smb container has exited, and it only appears if I run a new container in that 4-minute window. Running a new container after these 4 minutes behaves "normally". So I suppose there's an issue somewhere in the cleanup process of the smb-client container, related to network interfaces.
There is actually a much simpler repro of this issue (which, BTW, is not the original issue).
This script just starts an SMB server on the host, then creates a network namespace with a veth pair, executes mount; ls; unmount in the network namespace, and then removes the network namespace.
apk add --no-cache iproute2 samba samba-common-tools cifs-utils
# SMB server setup
cat <<EOF > /etc/samba/smb.conf
[global]
workgroup = WORKGROUP
netbios name = FOO
passdb backend = tdbsam
security = user
guest account = nobody
strict locking = no
min protocol = SMB2
[public]
path = /share
browsable = yes
read only = no
guest ok = yes
browseable = yes
create mask = 777
EOF
adduser -D -G nobody nobody && smbpasswd -a -n nobody
mkdir /share && chmod ugo+rwx /share && touch /share/foo
chown -R nobody.nobody /share
# Bring up a veth pair
ip link add hdev type veth peer name nsdev
ip addr add 10.0.0.1/24 dev hdev
ip link set hdev up
# Start SMB server and sleep for it to serve
smbd -D; sleep 5
# Client setup
ip netns add client-ns
ip link set nsdev netns client-ns
ip netns exec client-ns ip addr add 10.0.0.2/24 dev nsdev
ip netns exec client-ns ip link set lo up
ip netns exec client-ns ip link set nsdev up
sleep 1 # wait for the devices to come up
# Execute (mount, ls, unmount) in the network namespace and a new mount namespace
ip netns exec client-ns unshare --mount \
/bin/sh -c 'mount.cifs //10.0.0.1/public /mnt -o vers=3.0,guest; ls /mnt; umount /mnt'
# Delete the client network namespace.
ip netns del client-ns
# Create a new network namespace
# This will stall for up to 200s
ip netns add new-netns
Note that adding a simple sleep 1 after the unmount, either when executing in the namespace or before deleting the network namespace, avoids the stall entirely when creating the new namespace. A sleep after the old namespace is deleted does not reduce the stalling.
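For completeness, this is roughly what that workaround looks like applied to the tail of the script above (same setup and addresses as the script; the only change is the added sleep 1):
# Same client step as above, but sleep briefly after the unmount before teardown.
ip netns exec client-ns unshare --mount \
    /bin/sh -c 'mount.cifs //10.0.0.1/public /mnt -o vers=3.0,guest; ls /mnt; umount /mnt; sleep 1'
# Deleting the old namespace and creating a new one no longer stalls.
ip netns del client-ns
ip netns add new-netns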
@piec I also tested this with your repro and a sleep 1 in the Dockerfile after the unmount, and everything works as expected: no stalling, no unregister_netdev messages.
I'll write this up now and send to netdev@vger
Excellent
I confirm that a sleep after unmounting fixes the stalling and the unregister_netdev messages in my setup as well.
Don't you think umount generates an asynchronous action relative to its netns, which will block and eventually time out if the netns is removed before that action finishes? A sleep after the umount would let this stuff finish before the netns is removed.
But that's just a hypothesis.
I tried without the unmount, same difference. It's the deletion of the network namespace. That (and the removal of the mount namespace) will trigger the unmount anyway.
Ah ok
By the way I reproduced the issue by mistake (while developing) on another machine with smb again. It's an Ubuntu 16.04 PC, Linux 4.4.0-77-generic. And there's a hung task backtrace which might be interesting. No crash and same ~4 minutes delay.
[6409720.564230] device vethff6396b entered promiscuous mode
[6409720.564415] IPv6: ADDRCONF(NETDEV_UP): vethff6396b: link is not ready
[6409723.844595] unregister_netdevice: waiting for lo to become free. Usage count = 1
[6409726.812872] INFO: task exe:17732 blocked for more than 120 seconds.
[6409726.812918] Tainted: P O 4.4.0-77-generic #98-Ubuntu
[6409726.812959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[6409726.813007] exe D ffff8809952bbcb8 0 17732 1 0x00000000
[6409726.813013] ffff8809952bbcb8 ffffffff821d9a20 ffff88103856c600 ffff880ffae2d400
[6409726.813018] ffff8809952bc000 ffffffff81ef7724 ffff880ffae2d400 00000000ffffffff
[6409726.813021] ffffffff81ef7728 ffff8809952bbcd0 ffffffff81837845 ffffffff81ef7720
[6409726.813025] Call Trace:
[6409726.813036] [<ffffffff81837845>] schedule+0x35/0x80
[6409726.813040] [<ffffffff81837aee>] schedule_preempt_disabled+0xe/0x10
[6409726.813044] [<ffffffff81839729>] __mutex_lock_slowpath+0xb9/0x130
[6409726.813048] [<ffffffff818397bf>] mutex_lock+0x1f/0x30
[6409726.813053] [<ffffffff81726a2e>] copy_net_ns+0x6e/0x120
[6409726.813059] [<ffffffff810a168b>] create_new_namespaces+0x11b/0x1d0
[6409726.813062] [<ffffffff810a17ad>] copy_namespaces+0x6d/0xa0
[6409726.813068] [<ffffffff8107f1d5>] copy_process+0x905/0x1b70
[6409726.813073] [<ffffffff810805d0>] _do_fork+0x80/0x360
[6409726.813077] [<ffffffff81080959>] SyS_clone+0x19/0x20
[6409726.813081] [<ffffffff8183b972>] entry_SYSCALL_64_fastpath+0x16/0x71
[6409733.941041] unregister_netdevice: waiting for lo to become free. Usage count = 1
[6409744.021494] unregister_netdevice: waiting for lo to become free. Usage count = 1
The netdev@vger thread is here https://www.mail-archive.com/[email protected]/msg179703.html if anyone wants to follow progress.
@piec yes, that's expected.
I also ran into this bug and was able to reproduce the Oopses with the docker-samba-loop method from @piec on the Ubuntu kernel images:
I added my findings to the Ubuntu bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407 and https://github.com/fho/docker-samba-loop
@fho thanks. You actually don't need docker at all to repro, just running the samba client in a network namespace will do the trick as per https://github.com/moby/moby/issues/5618#issuecomment-318681443
@rn thanks for the info. I haven't tried that way yet.
The recent posts here and on the netdev mailing list seem to be only about kernel stalls.
I'm having kernel crashes also with kernel 4.11 and 4.12.
I'm seeing an issue very similar to this (as detailed in #35068). We basically run a two-node swarm, which runs a single service with 4 replicas using a spread placement strategy.
In each of these service containers we mount the host docker.sock as a volume, and from within the container we execute docker run commands with a max concurrency of 4 per container. This results in up to 4 containers being created concurrently and immediately removed afterwards via --rm.
Additional kernel logs and examples on ARMv7 shown in the above reference.
The ip6_route_dev_notify panic is a serious problem for us.
Looking at this a bit more, I think this is definitely NOT the same bug as:
I think this is an issue upstream in the kernel with the ipv6 layer.
This information might be relevant.
We are able to reproduce the problem with _unregister_netdevice: waiting for lo to become free. Usage count = 1_ on a 4.14.0-rc3 kernel with _CONFIG_PREEMPT_NONE=y_, running on only one CPU, with the following kernel boot options:
BOOT_IMAGE=/boot/vmlinuz-4.14.0-rc3 root=/dev/mapper/vg0-root ro quiet vsyscall=emulate nosmp
Once we hit this state, it stays in this state and a reboot is needed. No more containers can be spawned. We reproduce it by running images doing ipsec/openvpn connections plus downloading a small file inside the tunnels. Then the instances exit (usually they run < 10s). We run tens of such containers a minute on one machine. With the abovementioned settings (only 1 CPU), the machine hits it in ~2 hours.
Another reproducer with the same kernel, but without limiting the number of CPUs, is to just run iperf in UDP mode for 3 seconds inside the container (so there is no TCP communication at all). If we run 10 such containers in parallel, wait for all of them to finish, and do it again, we hit the trouble in less than 10 minutes (on a 40-core machine).
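A sketch of that second reproducer; the image name and iperf server address are placeholders rather than details from the original report, and all we know is 10 parallel containers, UDP iperf for 3 seconds, repeated:
# IPERF_IMAGE and IPERF_SERVER are placeholders: any image with iperf and a reachable server.
while true; do
    for i in $(seq 1 10); do
        docker run --rm "$IPERF_IMAGE" iperf -u -c "$IPERF_SERVER" -t 3 &
    done
    wait   # wait for all 10 containers to exit before the next round
done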
In both of our reproducers, we added "ip route flush table all; ifconfig
Hi,
Just to add to the fire we are also seeing this problem, as requested here are the following...
Kernel Version: Linux exe-v3-worker 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u5 (2017-09-19) x86_64 GNU/Linux
Linux distribution/version: Debian 9.1 (with all packages up to date)
Are you on the latest kernel version of your Linux vendor? Yes
Network setup (bridge, overlay, IPv4, IPv6, etc): IPv4 only, NATed as per default Docker setup
Description of the workload (what type of containers, what type of network load, etc): Very short lived containers (from a few seconds to a few minutes) running scripts before exiting.
And ideally a simple reproduction:
kernel: [617624.412100] unregister_netdevice: waiting for lo to become free. Usage count = 1
Couldn't kill the old container or start new ones on the affected nodes; had to reboot to restore functionality.
Hopefully we find a root cause / patch soon.
Best Regards,
robputt796
@campbellr agreed that it seems to have something to do with network storage. I'm using ceph krbd as a persistent volume in Kubernetes, and I can reproduce the situation after a long-running container crashes.
The issue was assigned 10 days ago and is a work in progress; you can see more insight into what's going on here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Hopefully Dan Streetman finds out how to fix it
Turns out that the Oops is caused by a kernel bug which has been fixed by commit 76da0704507bbc51875013f6557877ab308cfd0a:
ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76da0704507bbc51875013f6557877ab308cfd0a
(It merely fixes the kernel panic, not the "kernel:unregister_netdevice: waiting for lo to become free. Usage count = 2" issue.)
(repeating this here again, because GitHub is hiding old comments)
The issue being discussed here is a kernel bug and has not yet been fully fixed. Some patches went in the kernel that fix _some_ occurrences of this issue, but others are not yet resolved.
There are a number of options that may help for _some_ situations, but not for all (again; it's most likely a combination of issues that trigger the same error)
"I have this too" does not help resolving the bug. only leave a comment if you have information that may help resolve the issue (in which case; providing a patch to the kernel upstream may be the best step).
If you want to let know you have this issue too use the "thumbs up" button in the top description:
Every comment here sends an e-mail / notification to over 3000 people I don't want to lock the conversation on this issue, because it's not resolved yet, but may be forced to if you ignore this.
I will be removing comments that don't add useful information in order to (slightly) shorten the thread
Thanks!
I believe I've fixed this issue, at least when caused by a kernel TCP socket connection. Test kernels for Ubuntu are available and I would love feedback if they help/fix this for anyone here. Patch is submitted upstream; more details are in LP bug:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/46
Sorry to spoil the celebration, but we were able to reproduce the issue. We are now working with @ddstreet on it at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/ .
Are there no workarounds?
Use host networking (which destroys much of the value of containers, but there you go).
@pumba-lt We had this issue about 1.5 years ago; about 1 year ago I disabled ipv6 at the kernel level (not via sysctl) and haven't had the issue once. Running a cluster of 48 blades.
Normally in: /etc/default/grub
GRUB_CMDLINE_LINUX="xxxxx ipv6.disable=1"
However, I use PXE boot, so my PXE config has:
DEFAULT menu.c32
prompt 0
timeout 50
MENU TITLE PXE Boot
label coreos
menu label CoreOS
kernel mykernel
append initrd=myimage ipv6.disable=1 elevator=deadline cloud-config-url=myurl
I assure you, you will not see this issue again.
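For non-PXE, grub-based setups, the boot parameter mentioned above is typically applied like this (a sketch; the config regeneration command depends on the distribution):
# Add ipv6.disable=1 to the kernel command line in /etc/default/grub, e.g.:
#   GRUB_CMDLINE_LINUX="... ipv6.disable=1"
# then regenerate the grub config and reboot:
sudo update-grub                                  # Debian/Ubuntu
sudo grub2-mkconfig -o /boot/grub2/grub.cfg       # RHEL/CentOS
sudo reboot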
Everyone please understand this is a common SYMPTOM that has many causes. What has worked for you to avoid this may not work for someone else.
I can confirm our issues were solved after disabling IPv6 at boot (from grub's config file). We had numerous issues in a 7-node cluster; it runs smoothly now.
I don't remember where I found the solution, or whether I found it myself; anyway, thanks @qrpike for suggesting this to others :) !!
https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.114
commit edaafa805e0f9d09560a4892790b8e19cab8bf09
Author: Dan Streetman ddstreet@ieee.org
Date: Thu Jan 18 16:14:26 2018 -0500
net: tcp: close sock if net namespace is exiting
[ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ]
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open. However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open. In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence. The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Using SCTP in netns could also trigger this, fixes in 4.16-rc1:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=957d761cf91cdbb175ad7d8f5472336a4d54dbf2
still happend "unregister_netdevice: waiting for eth0 to become free. Usage count = 1" although I‘v upgraded kernel version to 4.4.118, and docker version 17.09.1-ce ,maybe I should try disable ipv6 at the kernel level . Hope it cloud work.
@wuming5569 please let me know if it worked for you with that version of linux
@wuming5569 maybe; upgrading to kernel 4.4.114 fixes "unregister_netdevice: waiting for lo to become free. Usage count = 1", but not "unregister_netdevice: waiting for eth0 to become free. Usage count = 1".
I tested this in production.
@ddstreet this is feedback, any help?
@wuming5569 as mentioned above, the messages themselves are benign but they may eventually lead to the kernel hanging. Does your kernel hang, and if so, what is your network pattern, i.e. what type of networking do your containers do?
Experienced same issue on CentOS. My kernel is 3.10.0-693.17.1.el7.x86_64. But, I didn't get similar stack trace in syslog.
Same on Centos7 kernel 3.10.0-514.21.1.el7.x86_64 and docker 18.03.0-ce
@danielefranceschi I recommend you upgrade to the latest CentOS kernel (at least 3.10.0-693). It won't solve the issue, but it seems to be much less frequent. In kernels 3.10.0-327 and 3.10.0-514 we were seeing the stack trace, but from memory I don't think we've seen any of those on 3.10.0-693.
@alexhexabeam 3.10.0-693 seems to work flawlessly, tnx :)
Same on CentOS7 kernel 4.16.0-1.el7.elrepo.x86_64 and docker 18.03.0-ce
It worked for weeks before the crash, and when I tried to bring it back up it was completely stuck.
The problem also happened with kernel 3.10.0-693.21.1.el7
I can confirm it also happens on:
Linux 3.10.0-693.17.1.el7.x86_64
Red Hat Enterprise Linux Server release 7.4 (Maipo)
I can reproduce it by doing "service docker restart" while having a certain amount of load.
@wuming5569 have you fixed this issue? What's your network type? We have been struggling with this issue for weeks.
Do you have a WeChat account?
4admin2root, given the fix you mentioned, https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.114, is it safe to disable the userland proxy for the docker daemon if a suitably recent kernel is installed? It is not very clear from
https://github.com/moby/moby/issues/8356
https://github.com/moby/moby/issues/11185
since both are older than the kernel fix.
Thank you
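For reference, the userland proxy is a daemon-level setting; a minimal sketch of turning it off follows (whether that is safe with respect to this bug is exactly the open question above):
# Put "userland-proxy": false into the daemon config and restart docker.
# This overwrites an existing /etc/docker/daemon.json, so merge by hand if you already have one.
echo '{ "userland-proxy": false }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker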
We have been struggling with this issue for weeks.
Linux 3.10.0-693.17.1.el7.x86_64
CentOS Linux release 7.4.1708 (Core)
Can anyone confirm whether the latest 4.14 kernel has this issue? It seems it does not; no one on the Internet seems to have reported this issue with the 4.14 kernel.
I see this in 4.15.15-1 kernel, Centos7
Looking at the change logs, https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.15.8 has a fix for SCTP, but not TCP. So you may want to try the latest 4.14.
we have now upgraded to 4.16.13. Observing. This bug was hitting us on a one node only approx once per week.
Did you disable ipv6 in grub boot params or sysctl? Only boot params will work. Sysctl will not fix it.
On June 4, 2018, Sergey Pronin wrote:
even 4.15.18 does not help with this bug; disabling ipv6 does not help as well
for me, most of the time the bug shows up after redeploying the same project/network again
@qrpike you are right, we tried only sysctl. Let me try with grub. Thanks!
4.9.88 Debian kernel. Reproducible.
In my case disabling ipv6 didn't make any difference.
@spronin-aurea Did disabling ipv6 at boot loader help?
@qrpike can you tell us about the nodes you are using if disabling ipv6 helped in your case? Kernel version, k8s version, CNI, docker version etc.
@komljen I have been using CoreOS for the past 2 years (since ~version 1000) without a single incident. I haven't tried it recently, but if I do not disable ipv6 the bug happens.
On my side, I'm using CoreOS too, ipv6 disabled with grub and still getting the issue
@deimosfr I'm currently using PXE boot for all my nodes:
DEFAULT menu.c32
prompt 0
timeout 50
MENU TITLE PXE Boot Blade 1
label coreos
menu label CoreOS ( blade 1 )
kernel coreos/coreos_production_pxe.vmlinuz
append initrd=coreos/coreos_production_pxe_image.cpio.gz ipv6.disable=1 net.ifnames=1 biosdevname=0 elevator=deadline cloud-config-url=http://HOST_PRIV_IP:8888/coreos-cloud-config.yml?host=1 root=LABEL=ROOT rootflags=noatime,discard,rw,seclabel,nodiratime
However, my main node that is the PXE host is also CoreOS and boots from disk, and does not have the issue either.
What kernel versions you guys are running?
The ones I got the issue were on 4.14.32-coreos and before. I do not encounter this issue yet on 4.14.42-coreos
Centos 7.5 with 4.17.3-1 kernel, still got the issue.
Env :
kubernetes 1.10.4
Docker 13.1
with Flannel network plugin.
Log :
[ 89.790907] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 89.798523] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 89.799623] cni0: port 8(vethb8a93c6f) entered blocking state
[ 89.800547] cni0: port 8(vethb8a93c6f) entered disabled state
[ 89.801471] device vethb8a93c6f entered promiscuous mode
[ 89.802323] cni0: port 8(vethb8a93c6f) entered blocking state
[ 89.803200] cni0: port 8(vethb8a93c6f) entered forwarding state
kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1.
Now:
The node IP is reachable, but no network services (like ssh) can be used...
The symptoms here are similar to a lot of reports in various other places, all having to do with network namespaces. Could the people running into this please see if unshare -n hangs, and if so, from another terminal, do cat /proc/$pid/stack of the unshare process to see if it hangs in copy_net_ns()? This seems to be a common denominator for many of the issues, including some backtraces found here. Between 4.16 and 4.18 there have been a number of patches by Kirill Tkhai refactoring the involved locking a lot. The affected distro/kernel package maintainers should probably look into applying/backporting them to stable kernels and see if that helps.
See also: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779678
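A concrete version of that check (a sketch; run as root):
# In one terminal: this returns immediately on a healthy system and hangs
# when the netns cleanup is stuck.
unshare -n true
# In another terminal: find the hung unshare and dump its kernel stack.
pid=$(pgrep -f 'unshare -n' | head -n 1)
cat /proc/$pid/stack    # look for copy_net_ns() near the top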
@Blub
sudo cat /proc/122355/stack
[<ffffffff8157f6e2>] copy_net_ns+0xa2/0x180
[<ffffffff810b7519>] create_new_namespaces+0xf9/0x180
[<ffffffff810b775a>] unshare_nsproxy_namespaces+0x5a/0xc0
[<ffffffff81088983>] SyS_unshare+0x193/0x300
[<ffffffff816b8c6b>] tracesys+0x97/0xbd
[<ffffffffffffffff>] 0xffffffffffffffff
Given the locking changes in 4.18 it would be good to test the current 4.18rc, especially if you can trigger it more or less reliably, as from what I've seen there are many people where changing kernel versions also changed the likelihood of this happening a lot.
I had this issue with Kubernetes; after switching to the latest CoreOS stable release (1745.7.0) the issue is gone:
same issue on CentOS 7
@Blub Seeing the same on CoreOS 1688.5.3, kernel 4.14.32
ip-10-72-101-86 core # cat /proc/59515/stack
[<ffffffff9a4df14e>] copy_net_ns+0xae/0x200
[<ffffffff9a09519c>] create_new_namespaces+0x11c/0x1b0
[<ffffffff9a0953a9>] unshare_nsproxy_namespaces+0x59/0xb0
[<ffffffff9a07418d>] SyS_unshare+0x1ed/0x3b0
[<ffffffff9a003977>] do_syscall_64+0x67/0x120
[<ffffffff9a800081>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<ffffffffffffffff>] 0xffffffffffffffff
In theory there may be one or more other traces somewhere containing one of the functions from net_namespace.c that lock the net_mutex (cleanup_net, net_ns_barrier, net_ns_init, {,un}register_pernet_{subsys,device}). For stable kernels it would of course be much easier if there was one particular thing deadlocking in a way that could be fixed, rather than backporting all the locking changes from 4.18. But so far I haven't seen a trace leading to the root cause. I don't know if it'll help, but maybe other /proc/*/stacks with the above functions are visible when the issue appears?
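Something like this (a sketch; needs root) can be used to look for other tasks stuck in those functions when the issue appears:
# Print the kernel stacks of all tasks that mention the net_namespace.c functions above.
for s in /proc/[0-9]*/stack; do
    if grep -qE 'copy_net_ns|cleanup_net|net_ns_barrier|register_pernet' "$s" 2>/dev/null; then
        echo "== $s =="
        cat "$s"
    fi
done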
Same issue! My env is Debian 8.
RHEL, SWARM, 18.03.0-ce
Manually starting a container on a manager node:
sudo docker run -it -v /import:/temp/eximport -v /home/myUser:/temp/exhome docker.repo.myHost/fedora:23 /bin/bash
After some time doing nothing:
[root@8a9857c25919 myDir]#
Message from syslogd@se1-shub-t002 at Jul 19 11:56:03 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
After minutes I am back on the console of the manager node and the started container is not running any longer.
Does this describe the same issue or is this another "problem suite"?
THX in advance!
UPDATE
This also happens directly on the ssh console (on the swarm manager bash).
UPDATE
Host machine (one manager node in the swarm):
Linux [MACHINENNAME] 3.10.0-514.2.2.el7.x86_64 #1 SMP Wed Nov 16 13:15:13 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
If this does not clear up after some time, then this is a different problem.
Same on CentOS7.5 kernel 3.10.0-693.el7.x86_64 and docker 1.13.1
The same problem OEL 7.5
uname -a
4.1.12-124.16.1.el7uek.x86_64 #2 SMP Mon Jun 11 20:09:51 PDT 2018 x86_64 x86_64 x86_64 GNU/Linux
docker info
Containers: 9
Running: 5
Paused: 0
Stopped: 4
Images: 6
Server Version: 17.06.2-ol
dmesg
[2238374.718889] unregister_netdevice: waiting for lo to become free. Usage count = 1
[2238384.762813] unregister_netdevice: waiting for lo to become free. Usage count = 1
[2238392.792585] eth0: renamed from vethbed6d59
(repeating this https://github.com/moby/moby/issues/5618#issuecomment-351942943 here again, because GitHub is hiding old comments)
The issue being discussed here is a kernel bug and has not yet been fully fixed. Some patches went in the kernel that fix _some_ occurrences of this issue, but others are not yet resolved.
There are a number of options that may help for _some_ situations, but not for all (again; it's most likely a combination of issues that trigger the same error)
It's the kernel crash _after_ the messages that's a bug (see below).
"I have this too" does not help resolving the bug. only leave a comment if you have information that may help resolve the issue (in which case; providing a patch to the kernel upstream may be the best step).
If you want to let us know you have this issue too, use the "thumbs up" button in the top description:
Every comment here sends an e-mail / notification to over 3000 people. I don't want to lock the conversation on this issue, because it's not resolved yet, but I may be forced to if you ignore this.
I will be removing comments that don't add useful information in order to (slightly) shorten the thread
To be clear: the message itself is benign; it's the kernel crash after the messages reported by the OP which is not.
The comment in the code where this message comes from explains what's happening. Basically, every user of a network device (such as the IP stack, or the end of the veth pair inside a container) increments a reference count in the network device structure when it is using the device. When the device is removed (e.g. when the container is removed), each user is notified so that it can do some cleanup (e.g. closing open sockets) before decrementing the reference count. Because this cleanup can take some time, especially under heavy load (lots of interfaces, a lot of connections, etc.), the kernel may print the message here once in a while. If a user of the network device never decrements the reference count, some other part of the kernel will determine that the task waiting for the cleanup is stuck and it will crash. It is only this crash which indicates a kernel bug (some user, via some code path, did not decrement the reference count). There have been several such bugs and they have been fixed in modern kernels (and possibly backported to older ones). I have written quite a few stress tests (and continue writing them) to trigger such crashes, but have not been able to reproduce them on modern kernels (I do, however, see the above message).
*Please only report on this issue if your kernel actually crashes*, and then we would be very interested in:
- kernel version (output of uname -r)
- Linux distribution/version
- Are you on the latest kernel version of your Linux vendor?
- Network setup (bridge, overlay, IPv4, IPv6, etc)
- Description of the workload (what type of containers, what type of network load, etc)
- And ideally a simple reproduction
Thanks!
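Not something the maintainers asked for verbatim, but a small sketch of how the details above could be collected in one go (assumes a distro with /etc/os-release and that dmesg is readable; netdev-report.txt is just an example file name):
# collect the requested diagnostics into a single report
{
  echo "== kernel ==";  uname -r
  echo "== distro ==";  cat /etc/os-release
  echo "== docker ==";  docker version
  echo "== networks =="; docker network ls
  echo "== recent unregister_netdevice messages =="
  dmesg | grep -i 'unregister_netdevice' | tail -n 20
} > netdev-report.txt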
Are you guys running Docker under any limits? Like ulimits, cgroups, etc.?
Newer systemd has a default limit even if you didn't set one. I set things to unlimited and the issue hasn't occurred since (I've been watching for 31 days).
I had the same issue in many environments and my solution was to stop the firewall. It has not happened again, for now.
Rhel 7.5 - 3.10.0-862.3.2.el7.x86_64
Docker 1.13
@dElogics What version of systemd is considered "newer"? Is this default limit enabled in the CentOS 7.5 systemd?
Also, when you ask if we're running docker under any limits, do you mean the docker daemon, or the individual containers?
The Docker daemon. The systemd version as in Debian 9 (232-25).
Not sure about RHEL, but I've personally seen this issue on RHEL too. I set LimitNOFILE=1048576, LimitNPROC=infinity, LimitCORE=infinity, and TasksMax=infinity (see the sketch below).
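For anyone who wants to try the same limits, here is a minimal sketch using a systemd drop-in rather than editing the unit directly (assumes systemd 226+ for TasksMax; the file name limits.conf is arbitrary):
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/limits.conf >/dev/null <<'EOF'
[Service]
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker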
kernel: unregister_netdevice: waiting for eth0 to become free. Usage count = 3
kernel 4.4.146-1.el7.elrepo.x86_64
linux version CentOS Linux release 7.4.1708 (Core)
bridge mode
I had the same issue, what can I do?
Same issue:
CentOS Linux release 7.5.1804 (Core)
Docker version 18.06.1-ce, build e68fc7a
Kernel Version: 3.10.0-693.el7.x86_64
I've met a similar issue here...
Are there any moves I could perform right now? Please help me out...
CentOS 7.0.1406
[root@zjsm-slavexx etc]# uname -a
Linux zjsm-slave08 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@zjsm-slavexx etc]# cat /etc/centos-release
CentOS Linux release 7.0.1406 (Core)
Docker information:
[root@zjsm-slavexx ~]# docker version
Client:
Version: 17.04.0-ce
API version: 1.28
Go version: go1.7.5
Git commit: 4845c56
Built: Mon Apr 3 18:01:50 2017
OS/Arch: linux/amd64
Server:
Version: 17.04.0-ce
API version: 1.28 (minimum version 1.12)
Go version: go1.7.5
Git commit: 4845c56
Built: Mon Apr 3 18:01:50 2017
OS/Arch: linux/amd64
Experimental: false
CentOS Linux release 7.2.1511 kernel: 3.10.0-327.el7.x86_64
same problem
I've experienced this issue.
Ubuntu 16.04.3 LTS
Kernel 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Docker version:
Client:
Version: 17.09.0-ce
API version: 1.32
Go version: go1.8.3
Git commit: afdb6d4
Built: Tue Sep 26 22:42:18 2017
OS/Arch: linux/amd64
Server:
Version: 17.09.0-ce
API version: 1.32 (minimum version 1.12)
Go version: go1.8.3
Git commit: afdb6d4
Built: Tue Sep 26 22:40:56 2017
OS/Arch: linux/amd64
Experimental: false
@thaJeztah, perhaps you should add your comment to the top of the original post, as people are still ignoring it.
$ docker network ls
NETWORK ID NAME DRIVER SCOPE
b3fc47abfff2 bridge bridge local
f9474559ede8 dockerfile_cluster_net bridge local
ef999de68a96 host host local
e7b41d23674c none null local
$ docker network rm f9474559ede8
fixed it.
@hzbd You mean deleting the user-defined bridge network? Have you tried to dig further to find out why? Please let me know if you did. I'd really appreciate it.
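In case someone wants to dig into this before deleting a network, a sketch of what could be checked (the network ID f9474559ede8 is taken from the listing above; this is not a fix, just inspection):
# show which containers are still attached to the suspect user-defined bridge
docker network inspect f9474559ede8 --format '{{json .Containers}}'
# list the veth and bridge interfaces the kernel still knows about
ip -o link show type veth
ip -o link show type bridge
# only remove the network once nothing is attached to it
docker network rm f9474559ede8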
Waiting to be fixed
Are you guys running Docker under any limits? Like ulimits, cgroups, etc.?
Newer systemd has a default limit even if you didn't set one. I set things to unlimited and the issue hasn't occurred since (I've been watching for 31 days).
OK, this bug still occurs, but the probability has been reduced.
I think if the containers are gracefully stopped (PID 1 exit()s), then this bug will not bother us.
@dElogics thanks for letting us know, could you please show us what commands you ran to set these systemd limits to unlimited? I'd like to try that too.
You have to modify the systemd unit of Docker. The systemd unit I use (only the relevant parts):
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service flannel.service
Wants=network-online.target
Requires=docker.socket
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
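A quick sketch to verify that the limits from the unit above actually took effect (assumes the service is named docker and dockerd is running):
systemctl show docker --property=LimitNOFILE,LimitNPROC,LimitCORE,TasksMax
cat /proc/$(pidof dockerd)/limits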
Has anyone had this issue on kernel 4.15 or newer?
This Dan Streetman fix (https://github.com/torvalds/linux/commit/4ee806d51176ba7b8ff1efd81f271d7252e03a1d) is first included in kernel 4.15, and it seems that, at least for some people, it is not happening anymore since they upgraded to 4.16 (https://github.com/kubernetes/kubernetes/issues/64743#issuecomment-436839647).
Has anyone tried it out?
@victorgp We still experience the issue with the 4.15 kernel. We will report here when we have tested with 4.16 kernel (hopefully in a few weeks).
We have used kernel version 4.14.62 for a few months; this issue disappeared.
To add to my previous resolutions: gracefully stopping containers (ones which respond to SIGTERM) never triggers this.
Also try running the containers in the host network namespace (if that's acceptable for you), which fully resolves the issue.
@dElogics What do you mean by "host namespace"? Is it simply --privileged?
No, it means --network=host
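For clarity, a minimal sketch of what that looks like (image and container name are just placeholders); with --network=host the container shares the host's network stack, so no veth pair is created or torn down:
docker run -d --name my-service --network=host nginx:alpine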
Since upgrading from kernel 4.4.0 to 4.15.0 and docker 1.11.2 to 18.09 the issue disappeared.
In a sizeable fleet of VMs acting as docker hosts we had this issue appearing multiple times a day (with our Docker use-case).
45 days in and we are no longer seeing this.
For posterity, a stack trace of a hung Docker 1.11.2 w/ printk's showing unregister_netdevice: waiting for vethXXXXX (similar to what we were always seeing in our fleet, in hundreds of VMs) can be found at http://paste.ubuntu.com/p/6RgkpX352J/ (the interesting Container ref is 0xc820001980).
goroutine 8809 [syscall, 542 minutes, locked to thread]:
syscall.Syscall6(0x2c, 0xd, 0xc822f3d200, 0x20, 0x0, 0xc822f3d1d4, 0xc, 0x20, 0xc82435fda0, 0x10)
/usr/local/go/src/syscall/asm_linux_amd64.s:44 +0x5
syscall.sendto(0xd, 0xc822f3d200, 0x20, 0x20, 0x0, 0xc822f3d1d4, 0xc80000000c, 0x0, 0x0)
/usr/local/go/src/syscall/zsyscall_linux_amd64.go:1729 +0x8c
syscall.Sendto(0xd, 0xc822f3d200, 0x20, 0x20, 0x0, 0x7faba31bded8, 0xc822f3d1c8, 0x0, 0x0)
/usr/local/go/src/syscall/syscall_unix.go:258 +0xaf
github.com/vishvananda/netlink/nl.(*NetlinkSocket).Send(0xc822f3d1c0, 0xc82435fda0, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/vishvananda/netlink/nl/nl_linux.go:333 +0xd4
github.com/vishvananda/netlink/nl.(*NetlinkRequest).Execute(0xc82435fda0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/vishvananda/netlink/nl/nl_linux.go:215 +0x111
github.com/vishvananda/netlink.LinkDel(0x7fab9c2b15d8, 0xc825ef2240, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/vishvananda/netlink/link_linux.go:615 +0x16b
github.com/docker/libnetwork/drivers/bridge.(*driver).DeleteEndpoint(0xc8204aac30, 0xc8203ae780, 0x40, 0xc826e7b800, 0x40, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/docker/libnetwork/drivers/bridge/bridge.go:1060 +0x5cf
github.com/docker/libnetwork.(*endpoint).deleteEndpoint(0xc822945b00, 0xc82001ac00, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/docker/libnetwork/endpoint.go:760 +0x261
github.com/docker/libnetwork.(*endpoint).Delete(0xc822945b00, 0x7fab9c2b0a00, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/docker/libnetwork/endpoint.go:735 +0xbcb
github.com/docker/libnetwork.(*sandbox).delete(0xc8226bc780, 0xc8229f0600, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/docker/libnetwork/sandbox.go:217 +0xd3f
github.com/docker/libnetwork.(*sandbox).Delete(0xc8226bc780, 0x0, 0x0)
/usr/src/docker/vendor/src/github.com/docker/libnetwork/sandbox.go:175 +0x32
github.com/docker/docker/daemon.(*Daemon).releaseNetwork(0xc820001980, 0xc820e23a40)
/usr/src/docker/.gopath/src/github.com/docker/docker/daemon/container_operations.go:732 +0x4f1
github.com/docker/docker/daemon.(*Daemon).Cleanup(0xc820001980, 0xc820e23a40)
/usr/src/docker/.gopath/src/github.com/docker/docker/daemon/start.go:163 +0x62
github.com/docker/docker/daemon.(*Daemon).StateChanged(0xc820001980, 0xc825f9fac0, 0x40, 0xc824155b50, 0x4, 0x8900000000, 0x0, 0x0, 0x0, 0x0, ...)
/usr/src/docker/.gopath/src/github.com/docker/docker/daemon/monitor.go:39 +0x60a
github.com/docker/docker/libcontainerd.(*container).handleEvent.func2()
/usr/src/docker/.gopath/src/github.com/docker/docker/libcontainerd/container_linux.go:177 +0xa5
github.com/docker/docker/libcontainerd.(*queue).append.func1(0xc820073c01, 0xc820f9a2a0, 0xc821f3de20, 0xc822ddf9e0)
/usr/src/docker/.gopath/src/github.com/docker/docker/libcontainerd/queue_linux.go:26 +0x47
created by github.com/docker/docker/libcontainerd.(*queue).append
/usr/src/docker/.gopath/src/github.com/docker/docker/libcontainerd/queue_linux.go:28 +0x1da
From that we can observe that it hung in https://github.com/moby/moby/blob/v1.11.2/daemon/container_operations.go#L732
which points us to https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/docker/libnetwork/sandbox.go#L175
And
https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/docker/libnetwork/endpoint.go#L760
Which goes into libnetwork bridge driver (check the awesome description)
https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/docker/libnetwork/drivers/bridge/bridge.go#L1057-L1061
Moving to netlink
https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/vishvananda/netlink/link_linux.go#L601-L617
https://github.com/moby/moby/blob/v1.11.2//vendor/src/github.com/vishvananda/netlink/nl/nl_linux.go#L215
And ultimately, in that netlink socket, it calls https://github.com/moby/moby/blob/v1.11.2/vendor/src/github.com/vishvananda/netlink/nl/nl_linux.go#L333
We feel that the bug generally happens when stopping a container: because SKBs are still referenced in the netns, the veth is not released, and Docker then issues a Kill to that container after 15s. The Docker daemon does not handle this situation gracefully, but ultimately the bug is in the kernel. We believe that https://github.com/torvalds/linux/commit/4ee806d51176ba7b8ff1efd81f271d7252e03a1d (accepted in 4.15 upstream) and the commits linked to it (there are several) act as a mitigation.
In general, that part of the kernel is not a pretty place.
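A rough sketch for checking whether a host is at least on a kernel where that mitigation landed upstream (4.15); note that distro kernels may backport the fix to older versions, so this is only a heuristic:
required="4.15"
current="$(uname -r | cut -d- -f1)"
if [ "$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)" = "$required" ]; then
  echo "kernel $current is >= $required: the upstream mitigation should be included"
else
  echo "kernel $current is < $required: the mitigation is only present if your distro backported it"
fi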
For what it's worth... we upgraded the RHEL Linux kernel from 3.10.0 to 4.17.11 (running a Kubernetes cluster on it). Before upgrading, this bug was occurring several times a day on different servers. We have been running with the upgrade for three weeks now and the bug has occurred only once. So, roughly said, it is reduced by 99%.
@marckamerbeek You updated RHEL Kernel to a community Kernel? Then it is no longer supported.
@Beatlor CentOS user can do like this.
centos 7.2 still has this problem: kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
@Beatlor RHEL did not help us at all. A stable production environment is more important than some worthless support contract. We are still running very stable now on 4.17.11. No big issues anymore.
Yes, I also did not have this problem after upgrading the kernel to 4.17.0-1.el7.elrepo.x86_64. I tried this before (4.4.x, 4.8, 4.14...) and it had failed. It seems that the problem will not occur again on 4.17+ kernels.
centos 7.2 still has this problem: kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
You can try to upgrade to the latest 4.19+ kernel.
Just wait for a few months and someone will come up complaining about the 4.19 kernel too. Just history repeating itself.
Hey everyone, good news !
Since my last comment here (at the time of writing, 17 days ago) I haven't gotten these errors again. My servers (about 30 of them) were running Ubuntu 14.04 with some outdated packages.
After a full system upgrade including docker-engine (from 1.7.1 to 1.8.3) plus a kernel upgrade to the latest version available in Ubuntu's repo, my servers are running without any occurrences.
🎱
Which kernel version did you upgrade to?
maybe related to this https://github.com/torvalds/linux/commit/f186ce61bb8235d80068c390dc2aad7ca427a4c2
Here's an attempt to summarise this issue, from the comments of this issue, https://github.com/kubernetes/kubernetes/issues/70427, https://github.com/kubernetes/kubernetes/issues/64743, and https://access.redhat.com/solutions/3659011
I challenge https://github.com/kubernetes/kubernetes/issues/70427#issuecomment-470681000, as we haven't been seeing this with thousands of VMs on 4.15.0, whilst we were seeing it dozens of times daily on 4.4.0. Are there more reports of it on 4.15.0?
I'm seeing this issue with one of my machines running Docker on Debian 9 Stretch (4.9.0-8-amd64). I experience this issue with a tunnel created within the Docker container via Docker Gen, and it generates a kernel panic:
Message from syslogd@xxxx at Apr 29 15:55:41 ...
kernel:[719739.507961] unregister_netdevice: waiting for tf-xxxxxxxx to become free. Usage count = 1
Here's our Docker information:
Client:
Version: 18.09.3
API version: 1.39
Go version: go1.10.8
Git commit: 774a1f4
Built: Thu Feb 28 06:34:04 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.3
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 774a1f4
Built: Thu Feb 28 05:59:55 2019
OS/Arch: linux/amd64
Experimental: false
Does anybody know if there's a temporary fix to this without restarting the entire machine? We'd really prefer not having to restart the entire machine when we experience this issue.
Somewhat off-topic: we cannot suppress the kernel panic messages within the terminal either. I've tried dmesg -D and dmesg -n 1. However, no luck. Is there a way to suppress these types of kernel panic messages from within the terminal? It's annoying trying to type commands and having that message pop up every 10 seconds or so.
Thanks.
Are these vanilla kernels or heavily patched by distros with backported fixes?
@pmoust I do see this on ubuntu 4.15.0-32 once a week or so. definitely much better since 4.4.0
@iavael i'll attempt to list distro info in the summary if the reference provides it.
Has anyone seen this bug with 4.19?
https://github.com/kubernetes/kubernetes/issues/64743#issuecomment-451351435
https://github.com/kubernetes/kubernetes/issues/64743#issuecomment-461772385
This information may be helpful to you.
@tankywoo @drpancake @egasimus @csabahenk @spiffytech @ibuildthecloud @sbward @jbalonso @rsampaio @MrMMorris @rsampaio @unclejack @chrisjstevenson @popsikle @fxposter @scher200 @victorgp @jstangroome @Xuexiang825 @dElogics @Nowaker @pmoust @marckamerbeek @Beatlor @warmchang @Jovons @247687009 @jwongz @tao12345666333 @clkao Please look at this https://pingcap.com/blog/try-to-fix-two-linux-kernel-bugs-while-testing-tidb-operator-in-k8s/
I followed the documentation, but I still get an error.
[root@node1 ~]# kpatch list
Loaded patch modules:
livepatch_route [enabled]
Installed patch modules:
[root@node1 ~]#
Message from syslogd@node1 at May 7 15:59:11 ...
kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1
That message itself is not the bug; it's the kernel crashing afterwards; https://github.com/moby/moby/issues/5618#issuecomment-407751991
I followed the documentation, but I still get an error.
[root@node1 ~]# kpatch list Loaded patch modules: livepatch_route [enabled] Installed patch modules: [root@node1 ~]# Message from syslogd@node1 at May 7 15:59:11 ... kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1
After rebooting, it's OK.
@vincent927 BTW, you should put livepatch_route.ko into /var/lib/kpatch/$(uname -r); when kpatch.service is enabled, the .ko can be auto-loaded after reboot.
We got this at our company suddenly today in several kubernetes clusters.
uname -a:
Linux ip-10-47-17-58 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64 GNU/Linux
docker version:
Client:
Version: 18.09.5
API version: 1.39
Go version: go1.10.8
Git commit: e8ff056dbc
Built: Thu Apr 11 04:44:28 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.2
API version: 1.39 (minimum version 1.12)
Go version: go1.10.6
Git commit: 6247962
Built: Sun Feb 10 03:42:13 2019
OS/Arch: linux/amd64
Experimental: false
kubectl version (server):
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:28:14Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
We don't know the cause yet; we've been running these versions of the above software for several months with no issues. I am just commenting to add to the list of "versions of software that experience this bug" for now.
@ethercflow I've read that, but since we run Debian at my company it's not as straightforward for us to implement the fix in that post.
@ethercflow @2rs2ts we are also running debian. I have encountered a lot of issues trying to get kpatch-build to work. If I manage to find a workaround I'll keep you posted. In any case, does anybody have any other solution? Is it kernel version 4.15 or 4.19 that mitigates the problem? I have been trying to find the answer for the past week and still have not managed to.
@commixon Our experience is still the same as reported in https://github.com/moby/moby/issues/5618#issuecomment-455800975: across a fleet of a thousand VMs there has been no re-occurrence of the issue with 4.15.0 on the generic, AWS-optimised and GCP-optimised flavors of the kernels provided by Canonical. A limited test on vanilla 4.15.0 did not show any of those issues either, but it was not tested at scale.
Thanks a lot @pmoust . Will try them out. In any case I'll also try to patch kpatch to work with Debian (as a side project) and post updates here for anyone interested.
You may upgrade to 4.19. It's in the backports.
BTW it's been a year for us here. ;)
We actually tried the 4.19 in the backports but it had some major regressions in other areas (the EC2 instances would just randomly reboot and then networking would be broken upon startup.) Guess we'll have to deal with this until the next stable.
@2rs2ts For the past 4 days we have been using 4.19 from backports (in EC2) and we have not seen any problems at all. The kernel crash issue has not appeared at all and everything else seems fine as well. I don't believe it makes any difference, but we based our Debian image on the one provided by kops (https://github.com/kubernetes/kops/blob/master/docs/images.md#debian). We updated the kernel in this image and not the stock Debian.
Friends, I have been using the 4.19 kernel for stable operation for half a year. I hope that you can enjoy stability as well.
I have a container with ports 80 and 443 open; every 2 weeks, access to the container's ports 80 and 443 from another computer is denied.
CentOS 7.3, kernel version:
Linux browser1 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@browser1 ~]# docker version
Client:
Version: 18.06.3-ce
API version: 1.38
Go version: go1.10.4
Git commit: d7080c1
Built: Wed Feb 20 02:24:22 2019
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 18.06.3-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: d7080c1
Built: Wed Feb 20 02:25:33 2019
OS/Arch: linux/amd64
Experimental: false
[root@browser1 ~]#
dmesg:
[1063959.636785] unregister_netdevice: waiting for lo to become free. Usage count = 1
[1071340.887512] br-af29e1edc1b8: port 5(vethc2ac4f8) entered disabled state
[1071340.891753] br-af29e1edc1b8: port 5(vethc2ac4f8) entered disabled state
[1071340.895118] device vethc2ac4f8 left promiscuous mode
[1071340.895138] br-af29e1edc1b8: port 5(vethc2ac4f8) entered disabled state
[1071340.990505] device veth5e4f161 entered promiscuous mode
[1071340.990897] IPv6: ADDRCONF(NETDEV_UP): veth5e4f161: link is not ready
[1071340.990904] br-af29e1edc1b8: port 5(veth5e4f161) entered forwarding state
[1071340.990924] br-af29e1edc1b8: port 5(veth5e4f161) entered forwarding state
[1071341.231405] IPv6: ADDRCONF(NETDEV_CHANGE): veth5e4f161: link becomes ready
[1071355.991701] br-af29e1edc1b8: port 5(veth5e4f161) entered forwarding state
[1071551.533907] br-af29e1edc1b8: port 5(veth5e4f161) entered disabled state
[1071551.537564] br-af29e1edc1b8: port 5(veth5e4f161) entered disabled state
[1071551.540295] device veth5e4f161 left promiscuous mode
[1071551.540313] br-af29e1edc1b8: port 5(veth5e4f161) entered disabled state
[1071551.570924] device veth8fd3a0a entered promiscuous mode
[1071551.571550] IPv6: ADDRCONF(NETDEV_UP): veth8fd3a0a: link is not ready
[1071551.571556] br-af29e1edc1b8: port 5(veth8fd3a0a) entered forwarding state
[1071551.571582] br-af29e1edc1b8: port 5(veth8fd3a0a) entered forwarding state
[1071551.841656] IPv6: ADDRCONF(NETDEV_CHANGE): veth8fd3a0a: link becomes ready
[1071566.613998] br-af29e1edc1b8: port 5(veth8fd3a0a) entered forwarding state
[1071923.465082] br-af29e1edc1b8: port 5(veth8fd3a0a) entered disabled state
[1071923.470215] br-af29e1edc1b8: port 5(veth8fd3a0a) entered disabled state
[1071923.472888] device veth8fd3a0a left promiscuous mode
[1071923.472904] br-af29e1edc1b8: port 5(veth8fd3a0a) entered disabled state
[1071923.505580] device veth9e693ae entered promiscuous mode
[1071923.505919] IPv6: ADDRCONF(NETDEV_UP): veth9e693ae: link is not ready
[1071923.505925] br-af29e1edc1b8: port 5(veth9e693ae) entered forwarding state
[1071923.505944] br-af29e1edc1b8: port 5(veth9e693ae) entered forwarding state
[1071923.781658] IPv6: ADDRCONF(NETDEV_CHANGE): veth9e693ae: link becomes ready
[1071938.515044] br-af29e1edc1b8: port 5(veth9e693ae) entered forwarding state
Has anyone seen this bug with 4.19?
Yes. I have the issue on kernel 4.19.4-1.el7.elrepo.x86_64
Hello,
I am also seeing this error. Do we have any solution for this issue? Kernel 3.10.0-514.26.2.el7.x86_64
[username@ip-10-1-4-64 ~]$
Message from syslogd@ip-10-1-4-64 at Jul 19 10:50:01 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
Message from syslogd@ip-10-1-4-64 at Jul 19 10:50:48 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
This issue is still happening :( Any updates/ideas on how to fix it?
Happening on Debian Stretch. I was trying to update my Jenkins container via Ansible when this happened.
This issue has been solved by this commit:
https://github.com/torvalds/linux/commit/ee60ad219f5c7c4fb2f047f88037770063ef785f
Using kpatch:
curl -SOL https://raw.githubusercontent.com/Aleishus/kdt/master/kpatchs/route.patch
kpatch-build -t vmlinux route.patch
UNAME=$(uname -r)   # the kernel release the patch was built for
mkdir -p /var/lib/kpatch/${UNAME}
cp -a livepatch-route.ko /var/lib/kpatch/${UNAME}
systemctl restart kpatch
kpatch list
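A small sketch to check that the live patch is actually active after the steps above (assumes a kernel built with CONFIG_LIVEPATCH; livepatch_route is the module name shown by kpatch list earlier in the thread):
kpatch list
# loaded live patches are also visible in sysfs
ls /sys/kernel/livepatch/
cat /sys/kernel/livepatch/livepatch_route/enabled 2>/dev/null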
That fix (torvalds/linux@ee60ad2) must be in 4.19.30 onwards.
I am not sure torvalds/linux@ee60ad2 is the definitive fix for it; we've seen this in 4.4.0 AFAIR, whereas https://github.com/torvalds/linux/commit/deed49df7390d5239024199e249190328f1651e7 was only added in 4.5.0.
We've reproduced the same bug using a diagnostic kernel that had delays artificially inserted to make PMTU discovery exception routes hit this window.
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a0163c5..6b9e7ee 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -133,6 +133,8 @@
static int ip_min_valid_pmtu __read_mostly = IPV4_MIN_MTU;
+static int ref_leak_test;
+
/*
* Interface to generic destination cache.
*/
@@ -1599,6 +1601,9 @@ static void ip_del_fnhe(struct fib_nh *nh, __be32 daddr)
fnhe = rcu_dereference_protected(*fnhe_p, lockdep_is_held(&fnhe_lock));
while (fnhe) {
if (fnhe->fnhe_daddr == daddr) {
+ if (ref_leak_test)
+ pr_info("XXX pid: %d, %s: fib_nh:%p, fnhe:%p, daddr:%x\n",
+ current->pid, __func__, nh, fnhe, daddr);
rcu_assign_pointer(*fnhe_p, rcu_dereference_protected(
fnhe->fnhe_next, lockdep_is_held(&fnhe_lock)));
fnhe_flush_routes(fnhe);
@@ -2145,10 +2150,14 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
fnhe = find_exception(nh, fl4->daddr);
if (fnhe) {
+ if (ref_leak_test)
+ pr_info("XXX pid: %d, found fnhe :%p\n", current->pid, fnhe);
prth = &fnhe->fnhe_rth_output;
rth = rcu_dereference(*prth);
if (rth && rth->dst.expires &&
    time_after(jiffies, rth->dst.expires)) {
+ if (ref_leak_test)
+ pr_info("eXX pid: %d, del fnhe :%p\n", current->pid, fnhe);
ip_del_fnhe(nh, fl4->daddr);
fnhe = NULL;
} else {
@@ -2204,6 +2213,14 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
#endif
}
+ if (fnhe && ref_leak_test) {
+ unsigned long time_out;
+
+ time_out = jiffies + ref_leak_test;
+ while (time_before(jiffies, time_out))
+ cpu_relax();
+ pr_info("XXX pid: %d, reuse fnhe :%p\n", current->pid, fnhe);
+ }
rt_set_nexthop(rth, fl4->daddr, res, fnhe, fi, type, 0);
if (lwtunnel_output_redirect(rth->dst.lwtstate))
rth->dst.output = lwtunnel_output;
@@ -2733,6 +2750,13 @@ static int ipv4_sysctl_rtcache_flush(struct ctl_table *__ctl, int write,
.proc_handler = proc_dointvec,
},
{
+ .procname = "ref_leak_test",
+ .data = &ref_leak_test,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
+ {
.procname = "max_size",
.data = &ip_rt_max_size,
.maxlen = sizeof(int),
ref_leak_test_begin.sh:
#!/bin/bash
# constructing a basic network with netns
# client <-->gateway <--> server
ip netns add svr
ip netns add gw
ip netns add cli
ip netns exec gw sysctl net.ipv4.ip_forward=1
ip link add svr-veth type veth peer name svrgw-veth
ip link add cli-veth type veth peer name cligw-veth
ip link set svr-veth netns svr
ip link set svrgw-veth netns gw
ip link set cligw-veth netns gw
ip link set cli-veth netns cli
ip netns exec svr ifconfig svr-veth 192.168.123.1
ip netns exec gw ifconfig svrgw-veth 192.168.123.254
ip netns exec gw ifconfig cligw-veth 10.0.123.254
ip netns exec cli ifconfig cli-veth 10.0.123.1
ip netns exec cli route add default gw 10.0.123.254
ip netns exec svr route add default gw 192.168.123.254
# constructing concurrent access scenarios with netperf
nohup ip netns exec svr netserver -L 192.168.123.1
nohup ip netns exec cli netperf -H 192.168.123.1 -l 300 &
nohup ip netns exec cli netperf -H 192.168.123.1 -l 300 &
nohup ip netns exec cli netperf -H 192.168.123.1 -l 300 &
nohup ip netns exec cli netperf -H 192.168.123.1 -l 300 &
# Add delay
echo 3000 > /proc/sys/net/ipv4/route/ref_leak_test
# making PMTU discovery exception routes
echo 1 > /proc/sys/net/ipv4/route/mtu_expires
for((i=1;i<=60;i++));
do
for j in 1400 1300 1100 1000
do
echo "set mtu to "$j;
ip netns exec svr ifconfig svr-veth mtu $j;
ip netns exec cli ifconfig cli-veth mtu $j;
ip netns exec gw ifconfig svrgw-veth mtu $j;
ip netns exec gw ifconfig cligw-veth mtu $j;
sleep 2;
done
done
ref_leak_test_end.sh:
#!/bin/bash
echo 0 > /proc/sys/net/ipv4/route/ref_leak_test
pkill netserver
pkill netperf
ip netns exec cli ifconfig cli-veth down
ip netns exec gw ifconfig svrgw-veth down
ip netns exec gw ifconfig cligw-veth down
ip netns exec svr ifconfig svr-veth down
ip netns del svr
ip netns del gw
ip netns del cli
The test process: run ref_leak_test_begin.sh, then ref_leak_test_end.sh:
[root@iZuf6h1kfgutxc3el68z2lZ test]# bash ref_leak_test_begin.sh
net.ipv4.ip_forward = 1
nohup: ignoring input and appending output to ‘nohup.out’
nohup: set mtu to 1400
appending output to ‘nohup.out’
nohup: appending output to ‘nohup.out’
nohup: appending output to ‘nohup.out’
nohup: appending output to ‘nohup.out’
set mtu to 1300
set mtu to 1100
set mtu to 1000
set mtu to 1400
set mtu to 1300
set mtu to 1100
^C
[root@iZuf6h1kfgutxc3el68z2lZ test]# bash ref_leak_test_end.sh
[root@iZuf6h1kfgutxc3el68z2lZ test]#
Message from syslogd@iZuf6h1kfgutxc3el68z2lZ at Nov 4 20:29:43 ...
kernel:unregister_netdevice: waiting for cli-veth to become free. Usage count = 1
After some testing, torvalds/linux@ee60ad2 can indeed fix this bug.
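For anyone re-running the reproduction above, a tiny sketch for watching whether the symptom shows up while the scripts run (it only counts kernel messages, nothing more):
# a steadily growing count means the refcount leak was hit
watch -n 5 'dmesg | grep -c "unregister_netdevice: waiting for"'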
Has anyone seen this bug with 4.19?
same
Yes, on Debian! Is there any way to suppress it?
Found out my Docker logs are also being spammed. Kernel 5.4.0, Docker 19.03.8:
Mar 21 18:46:14 host.mysite.com dockerd[16544]: time="2020-03-21T18:46:14.127275161Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Mar 21 18:45:13 host.mysite.com dockerd[16544]: time="2020-03-21T18:45:13.642050333Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Mar 21 18:44:13 host.mysite.com dockerd[16544]: time="2020-03-21T18:44:13.161364216Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Mar 21 18:43:12 host.mysite.com dockerd[16544]: time="2020-03-21T18:43:12.714725302Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
I finally found out how to suppress these messages, btw. Following a question on StackExchange, I commented out this line in /etc/rsyslog.conf:
# Everybody gets emergency messages
#*.emerg :omusrmsg:*
Very nuclear option, but at least now my system is usable again!
@steelcowboy You can configure rsyslog to discard only those annoying messages instead of all emergency messages, which is more desirable.
I wrote the following into /etc/rsyslog.d/40-unregister-netdevice.conf and restarted rsyslog (systemctl restart rsyslog).
# match frequent but irrelevant emergency messages generated by Docker when transferring large amounts of data through the network
:msg,contains,"unregister_netdevice: waiting for lo to become free. Usage count = 1" /dev/null
# discard matching messages
& stop
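To apply the filter above, something like this should do (rsyslogd -N1 only validates the configuration; the file name is the one chosen above):
sudo rsyslogd -N1
sudo systemctl restart rsyslog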
Any news here?
Most helpful comment
(repeating this https://github.com/moby/moby/issues/5618#issuecomment-351942943 here again, because GitHub is hiding old comments)
If you are arriving here
The issue being discussed here is a kernel bug and has not yet been fully fixed. Some patches went in the kernel that fix _some_ occurrences of this issue, but others are not yet resolved.
There are a number of options that may help for _some_ situations, but not for all (again; it's most likely a combination of issues that trigger the same error)
The "unregister_netdevice: waiting for lo to become free" error itself is not the bug
It's the kernel crash _after_ that message that is the bug (see below)
Do not leave "I have this too" comments
"I have this too" does not help resolving the bug. only leave a comment if you have information that may help resolve the issue (in which case; providing a patch to the kernel upstream may be the best step).
If you want to let us know you have this issue too, use the "thumbs up" button in the top description:

If you want to stay informed on updates use the _subscribe button_.
Every comment here sends an e-mail / notification to over 3000 people. I don't want to lock the conversation on this issue, because it's not resolved yet, but I may be forced to if you ignore this.
I will be removing comments that don't add useful information in order to (slightly) shorten the thread
If you want to help resolve this issue
Thanks!