Lxd: Arch LXD (host + client): container not getting ipv4 address

Created on 4 Dec 2017 · 14 Comments · Source: lxc/lxd

Required information

Distribution: Arch Linux 4.13.16.1-hardened (installed as VirtualBox VM)
Distribution version: Latest
Cross-posted on arch linux forums as well: https://bbs.archlinux.org/viewtopic.php?pid=1752990#p1752990
The output of "lxc info":

config: {}
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    -----------
    -----END CERTIFICATE-----
  certificate_fingerprint: fdc630712d08eff1508eaf7cc281cea768ece662c75639fbeb6b4dfa9badda42
  driver: lxc
  driver_version: 2.1.1
  kernel: Linux
  kernel_architecture: x86_64
  kernel_version: 4.13.16-1-hardened
  server: lxd
  server_pid: 194
  server_version: "2.20"
  storage: btrfs
  storage_version: "4.13"

Issue description

I installed an Arch VirtualBox VM and started both an Ubuntu and an Arch container. The Ubuntu container gets an IPv4 address; the Arch container doesn't. The host seems to behave well: it hands out IPs, the containers start OK, and unprivileged mode works. The Arch container starts correctly but only gets an IPv6 address, not an IPv4 one. dhcpcd reports that it finds no valid interfaces (output below). As far as I can tell, LXD's networking is working fine, so I'm at a loss where to look for answers. I suspect the Arch Linux image is not behaving correctly. To be clear: my goal is to get networking working in the Arch container, i.e. to get an IPv4 address and be able to reach the outside network.

[root@testcontainer ~]# dhcpcd       
dev: loaded udev
no valid interfaces found
no interfaces have a carrier
forked to background, child pid 22431

Steps to reproduce

1. Install Arch on VirtualBox (linux-hardened to enable unprivileged containers).
2. Configure LXD / LXC using the AUR repository.
3. Launch an Arch container (lxc launch images:archlinux/current/amd64) and an Ubuntu container (lxc launch ubuntu:16.04).
4. The Ubuntu container gets an IPv4 address; the Arch container doesn't.

Information to attach

arch testcontainer config

[user@host ~]$ lxc config show testcontainer
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Archlinux current amd64 (20171203_01:27)
  image.os: Archlinux
  image.release: current
  image.serial: "20171203_01:27"
  volatile.base_image: 36aaa2701180327bf39efc7c70958be877d1fbef5f4762b9ebfefd9515ea847f
  volatile.eth0.hwaddr: 00:16:3e:ae:fb:95
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.last_state.power: RUNNING
devices: {}
ephemeral: false
profiles:
- default
stateful: false
description: ""

LXD uses the lxdbr0 bridge, which is used by the two containers (Arch and Ubuntu)

[user@host ~]$ lxc network list
+--------+----------+---------+-------------+---------+
|  NAME  |   TYPE   | MANAGED | DESCRIPTION | USED BY |
+--------+----------+---------+-------------+---------+
| enp0s3 | physical | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| lxcbr0 | bridge   | NO      |             | 0       |
+--------+----------+---------+-------------+---------+
| lxdbr0 | bridge   | YES     |             | 2       |
+--------+----------+---------+-------------+---------+

configuration of the arch container

[user@host ~]$ lxc info testcontainer
Name: testcontainer
Remote: unix://
Architecture: x86_64
Created: 2017/12/03 21:13 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 609
Ips:
  eth0: inet6   fd42:4c81:1f9c:71a5:216:3eff:feae:fb95  veth7XH4QA
  eth0: inet6   fe80::216:3eff:feae:fb95    veth7XH4QA
  lo:   inet    127.0.0.1
  lo:   inet6   ::1
Resources:
  Processes: 12
  CPU usage:
    CPU usage (in seconds): 240
  Memory usage:
    Memory (current): 237.16MB
    Memory (peak): 278.21MB
  Network usage:
    eth0:
      Bytes received: 400.45kB
      Bytes sent: 766B
      Packets received: 1076
      Packets sent: 9
    lo:
      Bytes received: 0B
      Bytes sent: 0B
      Packets received: 0
      Packets sent: 0

Ip addr output for the arch container

[root@testcontainer ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
4: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:ae:fb:95 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fd42:4c81:1f9c:71a5:216:3eff:feae:fb95/64 scope global dynamic mngtmpaddr 
       valid_lft 3579sec preferred_lft 3579sec
    inet6 fe80::216:3eff:feae:fb95/64 scope link 
       valid_lft forever preferred_lft forever
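
The missing address can also be checked from a script. This is only a minimal sketch, assuming the `ip addr` output format shown above; `has_ipv4` is a hypothetical helper name, not part of LXD or iproute2.

```shell
#!/bin/sh
# has_ipv4: hypothetical helper. Reads `ip addr` output on stdin and
# succeeds if at least one non-loopback IPv4 ("inet") address is present.
has_ipv4() {
    # Keep "inet " lines only (this skips "inet6"), then look for any
    # line that is not in the 127.0.0.0/8 loopback range.
    grep -E '^[[:space:]]*inet ' | grep -qv ' 127\.'
}

# Usage inside the container (eth0 is the interface name from this report):
#   ip addr show dev eth0 | has_ipv4 && echo "IPv4 assigned" || echo "no IPv4 address"
```

With the `ip addr` output pasted above, the helper would report no IPv4 address, since the only `inet` line belongs to the loopback interface.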

Source

joachimnielandt

All 14 comments

Does ArchLinux still provide dhclient? If so, can you try and reproduce the same error with dhclient inside the container?

brauner on 4 Dec 2017

Dhclient is not available on the container. Looking around for it did set me on the path to some failed services inside the containers:

systemd-networkd.service
systemd-resolved.service
systemd-networkd.socket

Not sure if networkctl should be running by default, but here's the output of it (saw another question somewhere refer to it):

[root@testcontainer ~]# networkctl status
WARNING: systemd-networkd is not running, output will be incomplete.

●        State: n/a
       Address: fd42:4c81:1f9c:71a5:216:3eff:feae:fb95 on eth0
                fe80::216:3eff:feae:fb95 on eth0
       Gateway: fe80::3cc7:fbff:feb3:7373 on eth0
[root@testcontainer ~]# networkctl
WARNING: systemd-networkd is not running, output will be incomplete.

IDX LINK             TYPE               OPERATIONAL SETUP     
  1 lo               loopback           n/a         unmanaged 
  4 eth0             ether              n/a         unmanaged 

2 links listed.

From what I've read, you would normally add a profile under /etc/netctl/ and then enable it (for example, a static Ethernet connection on eth0). But I'm not sure that's the right course to take here, as eth0 is already getting an IPv6 address from somewhere (I'm guessing from the LXD system outside the container).

joachimnielandt on 4 Dec 2017

I'm also seeing some errors in journalctl (in the container), but not sure whether they're related:

Dec 04 17:02:49 testcontainer systemd[1]: [email protected]: Failed to set invocation ID on control group /system.slice/system-getty.slice/[email protected], ignoring: Operation not permitted
Dec 04 17:02:49 testcontainer systemd[1]: Started Getty on lxc/tty2.
-- Subject: Unit [email protected] has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit [email protected] has finished starting up.
-- 
-- The start-up result is RESULT.
Dec 04 17:02:49 testcontainer agetty[25846]: /dev/lxc/tty2: cannot open as standard input: No such file or directory
Dec 04 17:02:49 testcontainer agetty[25845]: /dev/lxc/tty5: cannot open as standard input: No such file or directory

joachimnielandt on 4 Dec 2017

Ok, managed to dig a bit further w.r.t. the failed services. I'm still assuming the networkd service is responsible for getting the ipv4 address? I am testing an arch container on an Ubuntu laptop as well (lxd 2.20) which exhibits the same behaviour:

[root@testcontainer ~]# systemctl status systemd-networkd
● systemd-networkd.service - Network Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2017-12-04 17:01:07 UTC; 2h 7min ago
     Docs: man:systemd-networkd.service(8)
  Process: 25769 ExecStart=/usr/lib/systemd/systemd-networkd (code=exited, status=237/KEYRING)
 Main PID: 25769 (code=exited, status=237/KEYRING)

Dec 04 17:01:07 testcontainer systemd[1]: Failed to start Network Service.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Service has no hold-off time, scheduling restart.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 5.
Dec 04 17:01:07 testcontainer systemd[1]: Stopped Network Service.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Start request repeated too quickly.
Dec 04 17:01:07 testcontainer systemd[1]: systemd-networkd.service: Failed with result 'exit-code'.
Dec 04 17:01:07 testcontainer systemd[1]: Failed to start Network Service.
[root@testcontainer ~]# journalctl _PID=25769
-- Logs begin at Sun 2017-12-03 21:13:32 UTC, end at Mon 2017-12-04 19:08:34 UTC. --
Dec 04 17:01:07 testcontainer systemd[25769]: systemd-networkd.service: Failed to change ownership of session keyring: Permission denied
Dec 04 17:01:07 testcontainer systemd[25769]: systemd-networkd.service: Failed to set up kernel keyring: Permission denied
Dec 04 17:01:07 testcontainer systemd[25769]: systemd-networkd.service: Failed at step KEYRING spawning /usr/lib/systemd/systemd-networkd: Permission denied

The same error crops up with the resolved service.
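
The failure signature above (exit status 237/KEYRING) can be searched for across units. A minimal sketch, assuming `systemctl status` output in the format shown above; `is_keyring_failure` is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# is_keyring_failure: hypothetical helper. Reads `systemctl status` output
# on stdin and succeeds if the unit exited at the KEYRING step (status 237).
is_keyring_failure() {
    grep -q 'status=237/KEYRING'
}

# Usage inside the container, for the two services named in this thread:
#   for unit in systemd-networkd systemd-resolved; do
#       systemctl status "$unit" 2>&1 | is_keyring_failure \
#           && echo "$unit: failed while setting up the kernel keyring"
#   done
```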

joachimnielandt on 4 Dec 2017

lxc profile set default security.syscalls.blacklist "keyctl errno 38"

And then restart your containers, that should take care of that.

The reason is that the networkd systemd unit somehow makes use of the kernel keyring, which doesn't work inside unprivileged containers right now. The line above makes the keyctl system call return errno 38 (ENOSYS, "function not implemented"), which is enough of a workaround to get things going again.
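
The steps above can be sketched as a small script. This is an illustration rather than an official procedure: the profile and container names are the ones used in this report, and `APPLY` is a hypothetical switch introduced here so that the script only prints the commands unless told otherwise.

```shell
#!/bin/sh
# Sketch: print (and optionally run) the workaround steps from this thread.
# PROFILE and CONTAINER default to the names used in this report.
# APPLY is a hypothetical switch: the commands are executed only if APPLY=1.
PROFILE="${PROFILE:-default}"
CONTAINER="${CONTAINER:-testcontainer}"

step() {
    echo "+ $*"                                 # show the command
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi   # run it only when APPLY=1
}

apply_keyctl_workaround() {
    # Make keyctl fail with errno 38 in containers that use this profile.
    step lxc profile set "$PROFILE" security.syscalls.blacklist "keyctl errno 38"
    # The thread says to restart the containers afterwards.
    step lxc restart "$CONTAINER"
    # List the container; the IPv4 address may take a few seconds to show up.
    step lxc list "$CONTAINER"
}

apply_keyctl_workaround
```

Running it as-is is a dry run; `APPLY=1 sh script.sh` (with a script name of your choosing) would execute the three `lxc` commands.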

stgraber on 5 Dec 2017

👍5

Magical, Stéphane, the container now has a proper ipv4 address without any extra effort. Thanks!

joachimnielandt on 5 Dec 2017

I was running into this problem and I realized that this fix will only work on the feature branch and not the stable branch (for now). Commenting so other people don't get stuck like I did.

fizzy123 on 20 Dec 2017

👍5

Hi,
I am having the same problem trying to start a vanilla Fedora 28 container on a freshly installed Ubuntu 18.04 host. By the way, these are privileged containers started by root. Luckily the fix with the syscall three comments above works fine (with lxc 3.0.1 from Ubuntu 18.04).
I wonder, however, if there is a way to get this working out of the box, since I find the lack of interoperability between two of the major Linux distributions unfortunate. I don't know where a proper solution would lie. Should lxd have more permissive defaults for that syscall? Should networkd be more resilient to failures opening the keyring? What do you think?
Thanks a lot!

cquike on 15 Jul 2018

We've suggested a fix for systemd in the past but it didn't get included... Either that needs to be included or someone has to figure out unprivileged keyring use at the kernel level.

stgraber on 15 Jul 2018

Thank you very much for the info. Do you have the link to the proposed systemd change?

cquike on 15 Jul 2018

@brauner might

stgraber on 15 Jul 2018

@xnox might :)

brauner on 16 Jul 2018

xnox did!

@cquike

https://github.com/systemd/systemd/commit/e64c2d0b5fbd8ab75d8f73f5820696ee15c8c6f0#diff-ef9f64675d767bfa6e9c264105226575

https://github.com/systemd/systemd/issues/7655

It's part of v239
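
To see whether a container's systemd already includes that release, the version can be compared against 239. A minimal sketch, assuming the version is a plain integer and that the linked change is what removes the need for the LXD-side workaround; `systemd_has_keyring_fix` is a hypothetical helper name.

```shell
#!/bin/sh
# systemd_has_keyring_fix: hypothetical helper. Succeeds if the given
# systemd version number is 239 or newer (the release named in this thread).
systemd_has_keyring_fix() {
    [ "$1" -ge 239 ]
}

# Usage inside the container. The first line of `systemctl --version`
# starts with "systemd <version>", so the second field is the number:
#   ver=$(systemctl --version | awk 'NR==1 {print $2}')
#   systemd_has_keyring_fix "$ver" && echo "v239 or newer" \
#       || echo "older than v239, the workaround may still be needed"
```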