Lxd: Container won't start, monitor is hung.

Created on 17 Apr 2018  ยท  45Comments  ยท  Source: lxc/lxd

Required information

  • Distribution: Ubuntu
  • Distribution version: 16.04.04
  • The output of "lxc info" or if that fails:
    ```
    $ lxc info
    config:
    core.https_address: '[::]:8443'
    core.trust_password: true
    api_extensions:

    • storage_zfs_remove_snapshots

    • container_host_shutdown_timeout

    • container_stop_priority

    • container_syscall_filtering

    • auth_pki

    • container_last_used_at

    • etag

    • patch

    • usb_devices

    • https_allowed_credentials

    • image_compression_algorithm

    • directory_manipulation

    • container_cpu_time

    • storage_zfs_use_refquota

    • storage_lvm_mount_options

    • network

    • profile_usedby

    • container_push

    • container_exec_recording

    • certificate_update

    • container_exec_signal_handling

    • gpu_devices

    • container_image_properties

    • migration_progress

    • id_map

    • network_firewall_filtering

    • network_routes

    • storage

    • file_delete

    • file_append

    • network_dhcp_expiry

    • storage_lvm_vg_rename

    • storage_lvm_thinpool_rename

    • network_vlan

    • image_create_aliases

    • container_stateless_copy

    • container_only_migration

    • storage_zfs_clone_copy

    • unix_device_rename

    • storage_lvm_use_thinpool

    • storage_rsync_bwlimit

    • network_vxlan_interface

    • storage_btrfs_mount_options

    • entity_description

    • image_force_refresh

    • storage_lvm_lv_resizing

    • id_map_base

    • file_symlinks

    • container_push_target

    • network_vlan_physical

    • storage_images_delete

    • container_edit_metadata

    • container_snapshot_stateful_migration

    • storage_driver_ceph

    • storage_ceph_user_name

    • resource_limits

    • storage_volatile_initial_source

    • storage_ceph_force_osd_reuse

    • storage_block_filesystem_btrfs

    • resources

    • kernel_limits

    • storage_api_volume_rename

    • macaroon_authentication

    • network_sriov

    • console

    • restrict_devlxd

    • migration_pre_copy

    • infiniband

    • maas_network

    • devlxd_events

    • proxy

    • network_dhcp_gateway

    • file_get_symlink

    • network_leases

    • unix_device_hotplug

    • storage_api_local_volume_handling

    • operation_description

    • clustering

    • event_lifecycle

    • storage_api_remote_volume_handling

    • nvidia_runtime

      api_status: stable

      api_version: "1.0"

      auth: trusted

      public: false

      auth_methods:

    • tls

      environment:

      addresses:

    • 192.168.1.1:8443

    • '[2604:6000:6342:::::]:8443'

    • '[2604:6000:6342:::::]:8443'

    • '[2604:6000:6342:::::]:8443'

    • 192.168.122.1:8443

      architectures:

    • x86_64

    • i686

      certificate: |

      -----BEGIN CERTIFICATE-----

      MIIFRDCCAyygAwIBAgIRAJOZkpRrAKHaoGlOrtgqFL4wDQYJKoZIhvcNAQELBQAw

      NTEcMBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEVMBMGA1UEAwwMcm9vdEBO

      ZXBoZWxlMB4XDTE3MTAwNDIwMzU0OVoXDTI3MTAwMjIwMzU0OVowNTEcMBoGA1UE

      ChMTbGludXhjb250YWluZXJzLm9yZzEVMBMGA1UEAwwMcm9vdEBOZXBoZWxlMIIC

      IjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAt75i6L7oq8xyuD3sdhc/NXAN

      d+aeRbbxThGWURJUWxRY9ZDz/gPZr5hCfSPUlnvdL6ieWIqs2otdN3di84yKik3o

      EGS5QUJAa3yofa6HkBCUpyUXEFNGG4k8aiuLBv1R46Xgj/Zv8sqxPuKcrU5Y/Ckc

      KhHTq0o7xosU/XStq4SpL0Wh0prxEzj6P9luqCLm2+80L5Ri/uRVvOBtyjJqXbPz

      N3qnYHBxxHtZo4ipH/qWesajglAKdyZnFRnbdhJbRmmAkHJ3W7LG5L80WcGPKY1i

      zq3BfqRbKpwZ8wB2VFs6QnSu5wQKiIrUHOTutvHORFnGvymWuNL9eg+XkEc59RBK

      QCu6X+h2CT1BDxzGbUSdaUwQ2NFg/wllNbJDomt6mVxCAoEmZU+vnINqDzCXzmpR

      HeqLnu1Nq4IscXn9cJ+Vlvm5Kqd69XTsYIkt0R1k83CtMN6FxCGCZg1Jm9hKWFdr

      2ib/g6xmUleT/uI4ElQzYpuss6s959/eamA1REFoj26v8MWIQp2xnWQ34imk/QNV

      /HdHvGXpTIS06AOfuT9Jc0k3zjtbzJ/psjeCAiETKozm5vxJ4VB396OJb2eOJhsw

      4ULKIunb10M2AJ2UOo7JVv3YZrRyLJPKRUUew0CY1uTHEjZhyw4eXtxH2K0+OEY0

      FgaX/wEs23MIGex5VMMCAwEAAaNPME0wDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQM

      MAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwGAYDVR0RBBEwD4IHTmVwaGVsZYcE

      wKgBAjANBgkqhkiG9w0BAQsFAAOCAgEAhQpGVhgiBVbs3bS4lpyThPrjMBoLvCh2

      6lCXO4YkHO9bcJ3G358LTgLALjs3j1bzk5qHk+Z63Tdsgr0oDCcDLX9HPHkkYQUf

      mihO6o1mINOcQ6jjz0y7KGfxD/GexcN7/WH2htbO1VF2p6oBdNsxi0G/lur5ASwt

      dJrDRQi7ifP00/DUWKNgZoLQOG5z/hfsjvocLZwmwGMqgY+xW5Gh7vCfljiCPL+x

      tKZOpKaZ6FxwY7OlsW802P27sCftRrwn0rvw6jsyY2KHo+LwQxlVHmJtxnT2diZn

      NQXFhBT921gzX1kkJdP26JHDlJhv0iL7A3JHt8QzAdxXNfBIQ/Dwu5fILjFbncyS

      8y1ulaLQd/fybi1dSaJgdHMnojMQV3c7mydUjB1j21+VlaW14rRmSwCnr3cZtg3m

      OndA7luHKvvthekvTyiI+ioDsDsO1DFc6YcljI5nVZkjeBTBJF1ornRnb+TP+XQl

      PlG3uSNG0UGoDm7y/wv8WZUwGU+BFiTIsSPyEeYWY92ePYgoLHy1TkaUp17GwBJI

      XdARg7L/kwy0pwgLPSiSkSoH1lXDQmSf7RFzrpaUkEcKG7n/tXPR0yCYRy5L1WPM

      ESsFJryCN1MW1ftb2Xu0+K15AHvYaAlQbXhTPKR4vPwiWFN9fZiIFXOhhhQoVMOc

      wUPmTgDyfNA=

      -----END CERTIFICATE-----

      certificate_fingerprint: 203defdb53eb4eb5d574fb581e619836f89525e6fed7b29689106fb0cac44010

      driver: lxc

      driver_version: 3.0.0

      kernel: Linux

      kernel_architecture: x86_64

      kernel_version: 4.16.0-041600-generic

      server: lxd

      server_pid: 2635

      server_version: 3.0.0

      storage: btrfs

      storage_version: "4.4"

      server_clustered: false

      server_name: Nephele

# Issue description

Certain container fails to start
Full system reboot fixes it, but restarting lxd alone does not seem to
# Steps to reproduce

 ```
$lxc start container
 --hangs, ctrl-c
$lxc list container
 --container|error|etc
$lxc info container --show-log
 --error: monitor is hung
$sudo kill -9 [[[relevant lxc monitor]]]
$lxc start container
 --hangs, ctrl-c
$lxc info container --show-log
 --error:monitor is hung
$sudo kill -9 [[[relevant lxc monitor]]]
$snap restart lxd

---rinse and repeat, even after $snap restart lxd

I don't need this container, so I won't reset my system until we can figure this out
Let me know if you need any more logs etc

Most helpful comment

The problem is that net_mutex is exclusive, and it's taken for both copy_net_ns() and cleanup_net(). The second function makes many heavy sleepable actions under net_mutex such as taking rtnl_lock() and rcu synchronization. So, it's impossible to create new net before cleanup_net() is finished.

I've submitted patches to change the locking of these functions, solving this problem. They will be in 4.17, while the kernel in the issue is 4.16.

All 45 comments

Can you strace the monitor process, please?

Can you run lxc monitor --pretty --type logging in a separate terminal as you start the container? that may give us some pointer as to what LXD's doing.

$ lxc monitor --pretty --type logging
DBUG[04-17|21:22:54] New event listener: ae5db3fc-cf15-4fd6-a326-bcd5c5a36138
DBUG[04-17|21:22:59] handling                                 ip=@ method=GET url=/1.0
DBUG[04-17|21:23:00] handling                                 ip=@ method=GET url=/1.0/containers/Downloader
DBUG[04-17|21:23:00] handling                                 ip=@ method=GET url=/1.0/events
DBUG[04-17|21:23:00] New event listener: e2cbadaa-d449-4e31-b018-0e6f252fdca5
DBUG[04-17|21:23:00] handling                                 ip=@ method=PUT url=/1.0/containers/Downloader/state
DBUG[04-17|21:23:00] New task operation: 83f17b29-da58-4bfc-a562-a3dbe8c19d6a
DBUG[04-17|21:23:00] Started task operation: 83f17b29-da58-4bfc-a562-a3dbe8c19d6a
DBUG[04-17|21:23:00] handling                                 ip=@ method=GET url=/1.0/operations/83f17b29-da58-4bfc-a562-a3dbe8c19d6a
DBUG[04-17|21:23:00] Initializing a BTRFS driver.
DBUG[04-17|21:23:00] Mounting BTRFS storage volume for container "Downloader" on storage pool "default".
DBUG[04-17|21:23:00] Mounting BTRFS storage pool "default".
DBUG[04-17|21:23:00] Mounted BTRFS storage volume for container "Downloader" on storage pool "default".
DBUG[04-17|21:23:00] Mounting BTRFS storage pool "default".
DBUG[04-17|21:23:00] Mounting BTRFS storage volume for container "Downloader" on storage pool "default".
DBUG[04-17|21:23:00] Mounted BTRFS storage volume for container "Downloader" on storage pool "default".
INFO[04-17|21:23:00] Starting container                       action=start created="2017-10-05 02:21:09 +0000 UTC" ephemeral=false name=Downloader stateful=false used="2018-04-17 20:04:06 +0000 UTC"
DBUG[04-17|21:23:01] handling                                 ip=@ method=GET url=/1.0
DBUG[04-17|21:23:01] handling                                 ip=@ method=GET url=/internal/containers/9/onstart
DBUG[04-17|21:23:01] Mounting BTRFS storage volume for container "Downloader" on storage pool "default".
DBUG[04-17|21:23:01] Initializing a BTRFS driver.
DBUG[04-17|21:23:01] Mounting BTRFS storage pool "default".
DBUG[04-17|21:23:01] Mounted BTRFS storage volume for container "Downloader" on storage pool "default".
DBUG[04-17|21:23:01] Scheduler: container Downloader started: re-balancing

after lxc info/monitor is hung the Dbug gives

DBUG[04-17|21:25:24] handling                                 ip=@ method=GET url=/1.0
DBUG[04-17|21:25:24] Disconnected event listener: e2cbadaa-d449-4e31-b018-0e6f252fdca5
DBUG[04-17|21:25:24] handling                                 ip=@ method=GET url=/1.0/containers/Downloader

Can you strace the monitor process, please?

Not really, the specific monitor process doesn't exist until I attempt to start the container and then it gives no output.

After I kill the lxc-monitor the dbug shows

EROR[04-17|21:32:28] balance: Unable to set cpuset            err="Failed to set cgroup cpuset.cpus=\"0,1\": setting cgroup item for the container failed" name=Downloader value=0,1
EROR[04-17|21:32:28] Failed starting container                action=start created="2017-10-05 02:21:09 +0000 UTC" ephemeral=false name=Downloader stateful=false used="2018-04-17 20:04:06 +0000 UTC"
DBUG[04-17|21:32:28] Failure for task operation: 83f17b29-da58-4bfc-a562-a3dbe8c19d6a: Failed to run: /snap/lxd/current/bin/lxd forkstart Downloader /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/Downloader/lxc.conf:
EROR[04-17|21:32:28] Failed to apply network priority         container=Downloader err="Failed to set network device priority: Can't set cgroups on a stopped container"

Oh, I thought we were talking about [lxc monitor]. Nm, can you show the container's config located under:

/var/log/lxd/Downloader/lxc.conf

please?

$ cat /var/log/lxd/Downloader/lxc.conf
cat: /var/log/lxd/Downloader/lxc.conf: No such file or directory

$ lxc config show Downloader
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Ubuntu xenial amd64 (20171004_03:49)
  image.os: Ubuntu
  image.release: xenial
  image.serial: "20171004_03:49"
  limits.cpu.allowance: 10%
  limits.cpu.priority: "1"
  limits.network.priority: "1"
  linux.kernel_modules: ip_tables, ip6_tables
  security.privileged: "true"
  volatile.base_image: 369f84766c53e842cb0239d51c4b78743bd9d4b48dbbfde2020c1e6d9604760f
  volatile.eth0.hwaddr: 00:16:3e:8c:92:4c
  volatile.idmap.base: "0"
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
devices:
  hello:
    path: /data/hello
    source: /data/hello
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Hm, you're using the snap. What's the correct path to show the container's config for the snap, @stgraber?

Also, please show:

  • cat /proc/1/cgroup
  • findmnt | grep cgroup

Not sure if timing matters for this- while the container is trying to start or whenever -- the container is not trying to start during this output

$ cat /proc/1/cgroup
12:memory:/init.scope
11:freezer:/
10:blkio:/init.scope
9:hugetlb:/
8:cpuset:/
7:devices:/init.scope
6:rdma:/
5:cpu,cpuacct:/init.scope
4:pids:/init.scope
3:perf_event:/
2:net_cls,net_prio:/
1:name=systemd:/init.scope

$ findmnt | grep cgroup
โ”‚ โ”œโ”€/sys/fs/cgroup                    tmpfs                  tmpfs       rw,mode=755
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/systemd          cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/net_cls,net_prio cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,net_cls,net_prio
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/perf_event       cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/pids             cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,pids,release_agent=/run/cgmanager/agents/cgm-release-agent.pids
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/cpu,cpuacct      cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,cpu,cpuacct
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/rdma             cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,rdma,release_agent=/run/cgmanager/agents/cgm-release-agent.rdma
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/devices          cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,devices
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/cpuset           cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,cpuset,clone_children
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/hugetlb          cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/blkio            cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,blkio
โ”‚ โ”‚ โ”œโ”€/sys/fs/cgroup/freezer          cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,freezer
โ”‚ โ”‚ โ””โ”€/sys/fs/cgroup/memory           cgroup                 cgroup      rw,nosuid,nodev,noexec,relatime,memory

/var/snap/lxd/common/lxd/logs/Downloader/lxc.conf and /var/snap/lxd/common/lxd/logs/Downloader/lxc.log may be useful

$ cat /var/snap/lxd/common/lxd/logs/Downloader/lxc.conf
lxc.log.file = /var/snap/lxd/common/lxd/logs/Downloader/lxc.log
lxc.log.level = warn
lxc.console.buffer.size = auto
lxc.console.size = auto
lxc.console.logfile = /var/snap/lxd/common/lxd/logs/Downloader/console.log
lxc.cap.drop = sys_time sys_module sys_rawio
lxc.mount.auto = proc:mixed sys:mixed
lxc.autodev = 1
lxc.pty.max = 1024
lxc.mount.entry = mqueue dev/mqueue mqueue rw,relatime,create=dir,optional
lxc.mount.entry = /dev/fuse dev/fuse none bind,create=file,optional
lxc.mount.entry = /dev/net/tun dev/net/tun none bind,create=file,optional
lxc.mount.entry = /proc/sys/fs/binfmt_misc proc/sys/fs/binfmt_misc none rbind,create=dir,optional
lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none rbind,create=dir,optional
lxc.mount.entry = /sys/fs/pstore sys/fs/pstore none rbind,create=dir,optional
lxc.mount.entry = /sys/kernel/debug sys/kernel/debug none rbind,create=dir,optional
lxc.mount.entry = /sys/kernel/security sys/kernel/security none rbind,create=dir,optional
lxc.include = /snap/lxd/current/lxc/config//common.conf.d/
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = b *:* m
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 1:7 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 10:229 rwm
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.arch = linux64
lxc.hook.pre-start = /snap/lxd/current/bin/lxd callhook /var/snap/lxd/common/lxd 9 start
lxc.hook.post-stop = /snap/lxd/current/bin/lxd callhook /var/snap/lxd/common/lxd 9 stop
lxc.tty.max = 0
lxc.uts.name = Downloader
lxc.mount.entry = /var/snap/lxd/common/lxd/devlxd dev/lxd none bind,create=dir 0 0
lxc.apparmor.profile = lxd-Downloader_</var/snap/lxd/common/lxd>//&:lxd-Downloader_<var-snap-lxd-common-lxd>:
lxc.seccomp.profile = /var/snap/lxd/common/lxd/security/seccomp/Downloader
lxc.environment = http_proxy=
lxc.cgroup.cpu.shares = 115
lxc.cgroup.cpu.cfs_period_us = 100000
lxc.rootfs.path = dir:/var/snap/lxd/common/lxd/containers/Downloader/rootfs
lxc.mount.entry = /var/snap/lxd/common/lxd/devices/Downloader/disk.hello.data-hello data/Downloads none bind,create=dir
lxc.net.0.type = veth
lxc.net.0.flags = up
lxc.net.0.link = br0
lxc.net.0.hwaddr = 00:16:3e:8c:92:4c
lxc.net.0.name = eth0
lxc.mount.entry = /var/snap/lxd/common/lxd/shmounts/Downloader dev/.lxd-mounts none bind,create=dir 0 0

$ cat /var/snap/lxd/common/lxd/logs/Downloader/lxc.log
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-1" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-1"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-2" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-2"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-3" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-3"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-4" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-4"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-5" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-5"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-6" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-6"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-7" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-7"
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1739 - The cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-8" already existed
lxc 20180417213257.375 ERROR    lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1826 - Failed to create cgroup "/sys/fs/cgroup/systemd//lxc/Downloader-8"
lxc 20180417213415.851 ERROR    lxc_container - lxccontainer.c:wait_on_daemonized_start:815 - No such file or directory - Failed to receive the container state
lxc 20180417213415.851 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_cgroup"
lxc 20180417213415.851 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_cgroup"
lxc 20180417213415.851 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_cgroup"
lxc 20180417213415.851 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_cgroup"
lxc 20180417213415.852 WARN     lxc_commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_cgroup"

snap info lxd?

$ snap info lxd
name:      lxd
summary:   System container manager and API
publisher: canonical
contact:   https://github.com/lxc/lxd/issues
license:   unknown
description: |
  LXD is a container manager for system containers.

  It offers a REST API to remotely manage containers over the network, using an image based workflow
  and with support for live migration.

  Images are available for all Ubuntu releases and architectures as well as for a wide number of other
  Linux distributions.

  LXD containers are lightweight, secure by default and a great alternative to virtual machines.
commands:
  - lxd.benchmark
  - lxd.check-kernel
  - lxd.database
  - lxd.lxc
  - lxd
  - lxd.migrate
services:
  lxd.daemon: simple, enabled, inactive
snap-id:   J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking:  stable
refreshed: 2018-04-09T19:13:08-04:00
installed:       3.0.0       (6643) 56MB -
channels:                           
  stable:        3.0.0       (6643) 56MB -
  candidate:     3.0.0       (6729) 56MB -
  beta:          โ†‘                       
  edge:          git-7c74528 (6747) 56MB -
  2.0/stable:    2.0.11      (6627) 27MB -
  2.0/candidate: 2.0.11      (6627) 27MB -
  2.0/beta:      โ†‘                       
  2.0/edge:      git-d71807e (6630) 25MB -
  3.0/stable:    3.0.0       (6649) 56MB -
  3.0/candidate: 3.0.0       (6730) 56MB -
  3.0/beta:      โ†‘                       
  3.0/edge:      git-bd47a49 (6716) 56MB -

Anything suspicious looking in dmesg?

So one unlikely thing could be that something inside the container might create subcgroups and sets cpuset values. Then LXD won't be able to change them anymore because in cgroup v1 they don't apply recursively unless a specific flag is set that has been added to the kernel just recently.

@brauner we're not seeing any cgroup related failure though. The failure is:

lxc 20180417213415.851 ERROR    lxc_container - lxccontainer.c:wait_on_daemonized_start:815 - No such file or directory - Failed to receive the container state

@19wolf can you post ps fauxww when lxc start is hanging too?

@stgraber that's what I was going by

EROR[04-17|21:32:28] balance: Unable to set cpuset            err="Failed to set cgroup cpuset.cpus=\"0,1\": setting cgroup item for the container failed" name=Downloader value=0,1

lxc 20180417213415.851 ERROR lxc_container - lxccontainer.c:wait_on_daemonized_start:815 - No such file or directory - Failed to receive the container state

Doesn't really tell you anything. It just means that the socket was closed because the container failed to start.

dmesg output from only after attempting to launch

245357.391787] ACPI Error: SMBus/IPMI/GenericSerialBus write requires Buffer of length 66, found length 32 (20180105/exfield-427)
[245357.394708] No Local Variables are initialized for Method [_PMM]
[245357.394709] No Arguments are initialized for method [_PMM]
[245357.394716] ACPI Error: Method parse/execution failed \_SB.PMI0._PMM, AE_AML_BUFFER_LIMIT (20180105/psparse-550)
[245357.396685] ACPI Error: AE_AML_BUFFER_LIMIT, Evaluating _PMM (20180105/power_meter-338)
[245415.545209] audit: type=1400 audit(1524004327.881:284): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxd-Downloader_</var/snap/lxd/common/lxd>" pid=49828 comm="apparmor_parser"
[245415.590049] br0: port 20(vethX87JPV) entered blocking state
[245415.590052] br0: port 20(vethX87JPV) entered disabled state
[245415.590222] device vethX87JPV entered promiscuous mode
[245415.590411] IPv6: ADDRCONF(NETDEV_UP): vethX87JPV: link is not ready

I think this is the relevant output? The only things since 12pm today

$ps fauxww
nephele  50472  0.0  0.0  27332  2932 ?        Ss   18:35   0:00 SCREEN lxc start Downloader
nephele  50473  1.0  0.0  64620 11252 pts/1    Ssl+ 18:35   0:00  \_ lxc start Downloader
root     50502  0.0  0.0 122500  6808 ?        Ds   18:35   0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers Downloader

stuck in I/O wait, that's no good.

Can you post cat /proc/50502/stack?

@brauner The failed rebalance is expected, it was hanging on the container monitor process, once the monitor died, it continued but the container isn't running so the cgroups can't be updated. The question is why isn't the container running :)

That's why I was asking for the strace of the [lxc monitor] before. :)

@brauner strace won't help you much on a process stuck in I/O wait though, it'll just hang strace too :)

I couldn't have known that it was stuck in Ds then though. :)

Exciting :)

I killed the monitor so I ran it again, hence different pid

$sudo cat /proc/52048/stack
[<0>] copy_net_ns+0xa6/0x230
[<0>] create_new_namespaces+0x11f/0x1d0
[<0>] copy_namespaces+0x73/0xa0
[<0>] copy_process.part.34+0x949/0x1ab0
[<0>] _do_fork+0xcf/0x3b0
[<0>] SyS_clone+0x19/0x20
[<0>] do_syscall_64+0x73/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0xffffffffffffffff

Ok, this is a bug that @Blub and I have seen multiple times.

I think that @Blub has already sent something to lkml but he's currently on vacation, I think.

See also similar and in earlier kernels (just for me an @Blub): https://github.com/lxc/lxc/issues/2141

So do we have the information we need? If so I'd like to restart my system and get the container back (it's not necessary, but I would like it online).

It seems to fail a few times a month (more specifically, an app running in the container fails and it's easiest to just restart the container, but about every 3rd time I do, it doesn't come back up)

Oh yeah, it's that kernel bug... though what's weird is that @19wolf is reporting a 4.16 kernel here so that should have the fix.

The kernel networking code is currently pretty heavily refactored. A bunch of operations have been marked as async that weren't before so the regression potential is there.

  • Anything exciting about br0?
  • Does it only happen with privileged containers?

I think the last time someone touched that code significantly it was Kirill from the Virtuozzo guys. I'll ping him tomorrow and see what he thinks.

  • br0 has a lot of veths, but I think that's normal for me.
  • I just tested it, seems anything I start/restart will hang, privileged or not.

Hi @tkhai, I'm just gonna rope you in here for a second. Seems we're running into each other quite often recently. :) We're seeing an issue with copy_net_ns() that we've seen before and I think you're one of the last persons to touch that code. Any idea what's going on?:

$sudo cat /proc/52048/stack
[<0>] copy_net_ns+0xa6/0x230
[<0>] create_new_namespaces+0x11f/0x1d0
[<0>] copy_namespaces+0x73/0xa0
[<0>] copy_process.part.34+0x949/0x1ab0
[<0>] _do_fork+0xcf/0x3b0
[<0>] SyS_clone+0x19/0x20
[<0>] do_syscall_64+0x73/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0xffffffffffffffff

The problem is that net_mutex is exclusive, and it's taken for both copy_net_ns() and cleanup_net(). The second function makes many heavy sleepable actions under net_mutex such as taking rtnl_lock() and rcu synchronization. So, it's impossible to create new net before cleanup_net() is finished.

I've submitted patches to change the locking of these functions, solving this problem. They will be in 4.17, while the kernel in the issue is 4.16.

@tkhai, cool! Thanks for stopping by! :)

Closing as this is a kernel issue which has already been resolved upstream.
A temporary workaround may be to try another kernel version while waiting for the fix to make it to your current kernel. Or if it's your own kernel, then cherry-pick the fix.

I'm updating to 4.17-rc1, hopefully the issue doesn't recur

Hiya, sorry to dig up old thread but this looks like the same issue I'm having with a nested container...

Difference is I'm trying 4.15.0-42-generic kernel

Is this something still an issue on the latest kernels?

Cheers,
Jon

image

image

image

image

image

root 22999 0.0 0.0 297888 7628 ? Ds Nov27 0:00 [lxc monitor] /var/snap/lxd/common/lxd/containers telegraf1 root@gns3vm:/home/jclayton# cat /proc/22999/stack [<0>] copy_net_ns+0xab/0x230 [<0>] create_new_namespaces+0x11b/0x1e0 [<0>] copy_namespaces+0x6d/0xa0 [<0>] copy_process.part.35+0x941/0x1ad0 [<0>] _do_fork+0xdf/0x3f0 [<0>] SyS_clone+0x19/0x20 [<0>] do_syscall_64+0x73/0x130 [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [<0>] 0xffffffffffffffff root@gns3vm:/home/jclayton# uname -r 4.15.0-42-generic

Is this something still an issue on the latest kernels?

Doesn't seem like it, but latest kernel is 4.19.

"I've submitted patches to change the locking of these functions, solving this problem. They will be in 4.17, while the kernel in the issue is 4.16."

Unless your kernel has the patches backported, you could still hit the same issue on 4.15.

Ahh right fair enough, I thought I was on a super new kernel.

I'll try and update.

Thanks,
Jon.

Was this page helpful?
0 / 5 - 0 ratings