Running Docker inside LXC on an Ubuntu Xenial image results in VFS being chosen as the default storage driver. VFS results in significant disk space usage and slow performance.
/proc/filesystems (a prerequisite step in the docker docs): https://docs.docker.com/engine/userguide/storagedriver/selectadriver/
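As a rough illustration of that prerequisite check, the sketch below lists which of the filesystems discussed in this thread the kernel reports in /proc/filesystems. The function name and the optional file argument are assumptions made for this example; they come from neither Docker nor LXD.

```shell
# List which Docker-relevant filesystems the kernel reports as available.
# The file argument defaults to /proc/filesystems; passing another path is
# only meant for trying the function out on sample data.
supported_docker_filesystems() {
    fs_file="${1:-/proc/filesystems}"
    for fs in overlay aufs btrfs zfs; do
        # Each line is "[nodev]<TAB>name", so the name is the last field.
        if awk -v want="$fs" '$NF == want { found = 1 } END { exit !found }' "$fs_file"; then
            echo "$fs"
        fi
    done
}
```

Run inside the container, `supported_docker_filesystems` prints one name per line. A filesystem being listed only means the kernel knows it; it does not guarantee Docker can use it in a nested setup.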
Ubuntu 16.04.1 LTS
lxd --version
2.0.3
lxc info:
driver: lxc
driverversion: 2.0.3
kernel: Linux
kernelarchitecture: x86_64
kernelversion: 4.4.0-34-generic
server: lxd
serverpid: 2262
serverversion: 2.0.3
storage: zfs
storageversion: "5
lxc launch ubuntu-daily:16.04 docker -p default -p docker
lxc exec docker -- apt update
lxc exec docker -- apt dist-upgrade -y
lxc exec docker -- apt install docker.io -y
lxc exec docker -- docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 1.11.2
Storage Driver: vfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null host bridge
Kernel Version: 4.4.0-34-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 4 GiB
Name: docker
ID: LFOL:35Y6:DCVV:5P5P:GQKW:MBV6:LOYV:3XWS:LC67:R46I:2JVJ:E6GV
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
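To check for this condition without reading the whole report, the storage driver line can be pulled out of the `docker info` text. This is a minimal sketch; both function names are made up for this example.

```shell
# Print the storage driver named in `docker info` text read from stdin.
storage_driver_from_info() {
    sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p' | head -n 1
}

# Return non-zero and print a warning when the driver is vfs, which has no
# copy-on-write support and stores every layer as a full copy.
warn_if_vfs() {
    driver="$(storage_driver_from_info)"
    if [ "$driver" = "vfs" ]; then
        echo "warning: Docker is using vfs"
        return 1
    fi
    echo "ok: Docker is using $driver"
}
```

With the container from this report, usage would look like `lxc exec docker -- docker info | warn_if_vfs`.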
So you're right that on zfs, this is a bit of a problem with overlay not working. zfs nesting isn't possible. aufs may be fine though it may also have the same problem as overlay.
To have LXD load the aufs driver for you, you can do:
lxc profile edit docker
And then add the aufs module to the linux.kernel_modules line next to the overlay one.
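For a non-interactive version of that edit, something like the following might work. It is a sketch, not a tested procedure: `append_module` and `add_module_to_profile` are hypothetical helpers, and it assumes the profile already stores its modules as a comma-separated list under linux.kernel_modules.

```shell
# Append a module to a comma-separated list unless it is already present.
# Spaces around the commas are dropped before comparing.
append_module() {
    list="$(printf '%s' "$1" | tr -d ' ')"
    module="$2"
    case ",$list," in
        *",$module,"*) printf '%s\n' "$list" ;;
        ",,")          printf '%s\n' "$module" ;;
        *)             printf '%s\n' "$list,$module" ;;
    esac
}

# Read the current value from a profile and write back the extended list.
# Not called here, since it needs a running LXD daemon.
add_module_to_profile() {
    profile="$1"
    module="$2"
    current="$(lxc profile get "$profile" linux.kernel_modules)"
    lxc profile set "$profile" linux.kernel_modules "$(append_module "$current" "$module")"
}
```

Calling `add_module_to_profile docker aufs` on the host would then have the same effect as the manual edit. The module still has to exist for the host kernel.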
Other than that, there's very little LXD itself can do about it. We can't list out-of-tree drivers like aufs in a profile, as that'd break on all distros that don't ship it. And as much as we'd love to have zfs nesting, there's no such thing right now (nor is it being actively worked on by the zfsonlinux folks).
I'm going to close this issue as there's nothing actionable for us to do.
To share back my experience.
I ended up setting up my LXD host to use BTRFS rather than ZFS. Using BTRFS means that I can't set disk limits per LXD container through LXD, but for my use case that is OK.
With BTRFS on the host, the LXD container is using BTRFS and as such, the Docker container was able to run with BTRFS.
It is also worth calling out that I needed to set "user_subvol_rm_allowed" as a mount option on my BTRFS mount on my host.
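One way to confirm that the option is active is to look at the options column for the mount. The sketch below assumes the usual /proc/mounts layout (device, mount point, type, options); the function name and the mount point used in the usage note are placeholders.

```shell
# Succeed if the btrfs mount at the given mount point carries the
# user_subvol_rm_allowed option. The mounts file defaults to /proc/mounts.
has_user_subvol_rm_allowed() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mount_point" '
        $2 == mp && $3 == "btrfs" {
            # Options are a comma-separated list in the fourth field.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "user_subvol_rm_allowed") found = 1
        }
        END { exit !found }
    ' "$mounts_file"
}
```

For example, `has_user_subvol_rm_allowed /var/lib/lxd || echo "option missing"` on the host, where /var/lib/lxd stands in for wherever the BTRFS filesystem is mounted. If it is missing, adding user_subvol_rm_allowed to the options of the corresponding fstab entry is the usual place to set it.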
Some good discussion in the blog posts and comments below:
https://www.stgraber.org/2016/04/14/lxd-2-0-lxd-in-lxd-812/
https://www.stgraber.org/2016/04/13/lxd-2-0-docker-in-lxd-712/
Quotas actually work with btrfs, they're just not visible in "df" output as they would be for zfs.
I am using Proxmox which uses LXC for containers, and noticed that Docker is extremely slow inside my containers.
I also noticed that Proxmox uses RAW QEMU image files to store LXC filesystems.
Do you think it would be possible to format the image using BTRFS somehow, without changing the host? I've been searching for info on this for hours. Also posted in the Proxmox forum but this seems semi-related to what you guys are doing in here.
If proxmox uses raw qemu disks for LXC containers, then it should be possible, so long as you can tell proxmox to format that raw disk using btrfs rather than ext4.
That's what I am thinking. I guess I just need to figure out how to specify the filesystem type. Or see if I can convert an existing image somehow.
@stgraber I would use btrfs for LXD but it is a bit slower than ZFS and most of all the quotas are escapable.
So I would like to stick to ZFS but Docker should run with a fast storage driver inside the containers which seems to be impossible with ZFS.
What is the best way to proceed?
Can we install a fast storage driver with ZFS backend?
Should I use btrfs and hope the quotas get fixed?
Thanks in advance!
@michacassola One thought would be to set up two storage pools in LXD, a ZFS one for your containers and a btrfs one for Docker, then create a volume on the btrfs one and attach that to /var/lib/docker or wherever docker writes its stuff?
That'd have the rest of the container be under ZFS' control and quotas and only have Docker be on btrfs, using a separate, possibly smaller storage pool.
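A possible shape for the two-pool setup is sketched below. It prints the commands instead of running them so they can be reviewed first. The pool, volume, container, and device names are all placeholders, and it assumes an LXD version that has the `lxc storage` commands.

```shell
# Print (rather than run) the LXD commands for putting /var/lib/docker on a
# btrfs volume while the container itself stays on its existing ZFS pool.
# All three arguments are hypothetical names chosen by the caller.
plan_docker_volume() {
    pool="$1"        # name for the new btrfs pool
    volume="$2"      # name for the volume that will hold Docker's data
    container="$3"   # existing container

    echo "lxc storage create $pool btrfs"
    echo "lxc storage volume create $pool $volume"
    # "docker" is the device name the volume gets inside the container config.
    echo "lxc storage volume attach $pool $volume $container docker /var/lib/docker"
}
```

`plan_docker_volume docker-pool docker-data my-container` prints three lines that can be pasted into a shell on the host once they look right. Docker inside the container should be stopped before its data directory is replaced.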
Another alternative, but this time mostly outside of LXD itself would be to use a ZFS volume, format that as btrfs and then have it mounted on /var/lib/docker inside the container.
Something kinda like (untested):
zfs create -V 20GB my-pool/docker/blah
mkfs.btrfs /dev/zvol/my-pool/docker/blah
lxc config device add my-container docker disk source=/dev/zvol/my-pool/docker/blah path=/var/lib/docker
What about LVM as a backend, would there be a more straightforward solution to have:
?
PS: Is there updated info for https://lxd.readthedocs.io/en/stable-2.0/storage-backends/#feature-comparison ?
Nope because LVM is global to the system so can't/shouldn't be exposed to Docker inside the container.
If using LVM, the Docker container would effectively be backed by ext4 or xfs, restricting your options to aufs/overlay2.
But last I checked ext4 and xfs performance should be great for Docker with Overlay2 (or anything else)?
So programs and Docker Containers should run fine/fast inside the Linux Container, right?
But the quota question remains. Last I fiddled with LVM I could set a per volume size. So why does LXD not support a fixed size for a Container on a LVM pool?
That's because the container's image is the LV and the container is then a snapshot of that LV, so it inherits its size. You should be able to grow it after the fact, though. I haven't played with our LVM storage in a while; @brauner may remember better :)
Hm, please let me know if I am completely off now:
If a container would get its own LV it could be easily integrated to include a specific size to set the LV to. That is what I meant before.
I certainly would not mind the bit of extra storage needed to separate the container LV from the image LV. I would also be able to live with the extra time needed to clone the LV over just snapshotting.
In this way we would get quota with LVM, which outweighs the downside in my opinion.
@brauner mentioned something here: https://github.com/lxc/lxd/pull/3285
by using the new volatile key we can set volatile.apply_quota: 10GB and then on next container start make sure that the quota (or resize in the case of lvm) is applied.
So I guess I will have to set the standard size to the minimum:
sudo lxc storage set lvmpool volume.size 10GB
And once the Container is created I do: sudo lxc config set lvmcontainer volatile.apply_quota 50GB
And restart: sudo lxc restart lvmcontainer
Then my container's LV should have the new size of 50GB.
Please confirm.
@michacassola, please be aware that LVM and filesystems on top of it are fickle little beasts. It might work fine, it might not. Resizing filesystems is not a very reliable thing. But in theory it should work.
@brauner Thanks!
What about my suggestion to separate the image and container LVs? Then we could set the LV size while creating the container.
Also, why can't we get AUFS working on a ZFS backend again? I tried adding the kernel module to the container but it didn't help.
I tried using LVM but docker still uses VFS as the storage driver. How come??
So you're right that on zfs, this is a bit of a problem with overlay not working. zfs nesting isn't possible. aufs may be fine though it may also have the same problem as overlay.
To have LXD load the aufs driver for you, you can do:
lxc profile edit docker
And then add the aufs module to the linux.kernel_modules line next to the overlay one.
I installed aufs-tools in the lxd host and added aufs module to the linux.kernel_modules list in the lxd profile.
But this doesn't seem to work.
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.797526296+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.v1
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.805279765+02:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.aufs" error="modprobe aufs failed: "modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-165-generic/modules.dep.bin'\nmodprobe: FATAL: Module aufs not found in directory /lib/modules/4.4.0-165-generic\n": exit status 1"
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.805319608+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.805407692+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.805567012+02:00" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.808105987+02:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.zfs" error="exec: "zfs": executable file not found in $PATH: "zfs zfs list -Hp -o name,origin,used,available,mountpoint,compression,type,volsize,quota,referenced,written,logicalused,usedbydataset lxd/containers/tc-ely-docker" => "
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.808156999+02:00" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.808207908+02:00" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.808220127+02:00" level=warning msg="could not use snapshotter aufs in metadata plugin" error="modprobe aufs failed: "modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-165-generic/modules.dep.bin'\nmodprobe: FATAL: Module aufs not found in directory /lib/modules/4.4.0-165-generic\n": exit status 1"
Oct 16 11:01:47 tc-ely-docker containerd[177]: time="2019-10-16T11:01:47.808234896+02:00" level=warning msg="could not use snapshotter zfs in metadata plugin" error="exec: "zfs": executable file not found in $PATH: "zfs zfs list -Hp -
Oct 16 11:01:52 tc-ely-docker dockerd[437]: time="2019-10-16T11:01:52.430980132+02:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4207a1e00, READY" module=grpc
Oct 16 11:01:52 tc-ely-docker dockerd[437]: time="2019-10-16T11:01:52.432050494+02:00" level=error msg="AUFS cannot be used in non-init user namespace" storage-driver=aufs
Am I doing something wrong?
Any hints?
Hm, I tried the same again, but without changing /etc/docker/daemon.json and this time it worked!
Magic!
Thanks @stgraber .
No, the difference when I got it working was that the container had security.privileged=true.
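That matches the "AUFS cannot be used in non-init user namespace" error in the log above: an unprivileged LXD container runs in its own user namespace, while a privileged one does not. A quick way to see which case applies is to read the UID map. The sketch below relies on the initial namespace mapping the whole ID range onto itself; the function name is made up for this example.

```shell
# Succeed if the given uid_map describes a non-initial user namespace.
# The initial namespace has the single identity mapping "0 0 4294967295".
in_user_namespace() {
    uid_map="${1:-/proc/self/uid_map}"
    # A missing file means the kernel has no user namespace support.
    [ -r "$uid_map" ] || return 1
    # Only the first mapping line matters for this check.
    read -r inside outside count < "$uid_map" || true
    if [ "$inside $outside $count" = "0 0 4294967295" ]; then
        return 1
    fi
    return 0
}
```

Inside the container, `in_user_namespace && echo "unprivileged: aufs will be refused"` gives a quick answer before touching the Docker configuration.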
@stgraber do you know of another way besides the reduced security?
This suggestion maybe https://github.com/lxc/lxd/issues/2305#issuecomment-391113595
No, the suggestion from https://github.com/lxc/lxd/issues/2305#issuecomment-391113595 results in this issue: https://stackoverflow.com/questions/45731683/docker-pull-operation-not-permitted
@andersruneson I think the reason is this commit https://github.com/moby/moby/commit/2a71f28a4e1167dee32aa16ddbc819c9d9e77f71
But for a deeper explanation I have no idea either
I was able to get the zfs backend working inside a zfs-backed LXD container without any obvious ill-effects...
The LXD container must be privileged, then simply having the device:
dev-zfs:
  mode: "0666"
  path: /dev/zfs
  type: unix-char
Will get things going - docker creates its new datasets relative to the LXD container's root dataset:
Server:
....
Storage Driver: zfs
Zpool: zpool
Zpool Health: ONLINE
Parent Dataset: zpool/lxd/containers/docker-test
Space Used By Parent: xxx
Space Available: xxx
Parent Quota: no
Compression: lz4
However, I think it's better to create a dedicated dataset for docker. It needs to be mounted somewhere on the host (I'm using /data/docker for this example), and then passed through to the LXD container. Then LXD's ZFS tree doesn't get messed up by docker:
var-lib-docker:
  path: /var/lib/docker
  source: /data/docker
  type: disk
I'm not sure what potential there is for things to go horribly wrong with this setup... but I like it :)
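For reference, the same devices could probably be added from the command line rather than by editing the container config. The sketch below only prints the commands. The dataset and container names are placeholders, and creating the dataset with a mountpoint of /data/docker is an assumption about how the host side was prepared.

```shell
# Print (rather than run) the host-side commands for the dedicated-dataset
# setup. The dataset and container names are hypothetical placeholders.
plan_zfs_docker_passthrough() {
    dataset="$1"     # e.g. zpool/docker
    container="$2"   # e.g. docker-test

    # Dedicated dataset for Docker, mounted on the host.
    echo "zfs create -o mountpoint=/data/docker $dataset"
    # The container has to be privileged for this approach.
    echo "lxc config set $container security.privileged true"
    # Expose /dev/zfs and the dataset's mount point to the container.
    echo "lxc config device add $container dev-zfs unix-char path=/dev/zfs mode=0666"
    echo "lxc config device add $container var-lib-docker disk source=/data/docker path=/var/lib/docker"
}
```

`plan_zfs_docker_passthrough zpool/docker docker-test` prints four commands to review and run on the host. The same caveat as above applies: this relies on a privileged container.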
The latter is very much preferred especially from a security standpoint :)
You can do it all through LXD by creating a second storage pool using any of the supported backends except for ZFS.