Lxd: Image download optimisation

Created on 21 Jan 2017 · 6 Comments · Source: lxc/lxd

Dear LXD developers,

How hard do you think it would be to implement some kind of differential scheme for official images (i.e. layers, similar to Docker) when using ZFS as a storage backend? I'm using LXD v2.7 on Ubuntu 16.10 ...

I was really surprised to discover that, with the auto-update option turned on, LXD just moved the now "old" images to the /deleted dataset (because I still have running containers created from those images) and then downloaded the new images in full, instead of only the differences from the previous images :-(

The situation is that I currently have 8 distributions pulled locally (Alpine, Arch, CentOS, Debian, Fedora, Gentoo, openSUSE and Ubuntu), and I use them frequently to create many new containers, so every run of the auto-update process can consume 4+ GB.

As you can imagine, the ZFS pool will run out of disk space fairly quickly: old images won't be deleted because there are still running containers based on them.

I really hoped that LXD would take full advantage of ZFS snapshot / clone functionality, like Docker does.

Also, I'd be interested in implementing this if you can guide me a little on where to start (I've already made some private fixes to LXD to get it running on distributions other than Ubuntu, but just haven't had the time to contribute them yet...)

Thanks.

Documentation Feature

All 6 comments

Have you been able to get LXD running on Debian?

ZFS is only one of the supported backends and the only one with the annoying limitation that old images must be kept around so long as they have downstream references.
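To make the "downstream references" limitation concrete, here is a minimal Python sketch that groups clones by the dataset they were cloned from. It assumes tab-separated input in the shape printed by `zfs list -H -o name,origin`; the pool, image, and container names, as well as the `@readonly` snapshot name, are hypothetical and chosen only for illustration.

```python
def referenced_origins(zfs_list_output: str) -> dict[str, list[str]]:
    """Map each origin dataset to the clones that still depend on it.

    Expects one "name<TAB>origin" pair per line, where origin is "-"
    for datasets that are not clones.
    """
    deps: dict[str, list[str]] = {}
    for line in zfs_list_output.splitlines():
        if not line.strip():
            continue
        name, origin = line.split("\t")
        if origin == "-":
            continue  # not a clone, so it pins nothing
        base = origin.split("@", 1)[0]  # strip the snapshot suffix
        deps.setdefault(base, []).append(name)
    return deps


# Hypothetical listing: two containers cloned from an outdated image.
sample = "\n".join([
    "tank/images/old-ubuntu\t-",
    "tank/images/new-ubuntu\t-",
    "tank/containers/web1\ttank/images/old-ubuntu@readonly",
    "tank/containers/web2\ttank/images/old-ubuntu@readonly",
])

deps = referenced_origins(sample)
```

In this sample, `tank/images/old-ubuntu` cannot be destroyed while `web1` and `web2` exist, whereas `tank/images/new-ubuntu` has no dependants yet.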

We may consider using a standard binary delta mechanism in the future to reduce network usage for the daily generated images. It's something we'd need to think through carefully on the server side, though, as we don't want to increase disk usage on the server too much; that could negatively impact mirroring and caching.

Layering on top of a previous ZFS image as opposed to keeping everything as separate filesystems may be worth considering at some point, at least as an option. This however does come with the obvious cost that you will never be able to remove the base image, which for rolling distributions would likely mean more disk usage than the current scheme.

This would also need much more careful tracking of images on LXD's part than is done today. The storage rework currently in progress would make such tracking easier but will also add complexity: images will have to be present in a variety of different pools based on container usage, so the layering would have to be done on a per-pool basis, with the required snapshots transferred through send/receive as needed.

So we're moving forward on this one with xdelta3 diffs of images. This will allow the daily images to refresh with a very minimal download. The rest of the LXD behavior remains unchanged, so this is purely a download time and bandwidth optimization.
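As a rough illustration of the client-side decision this implies, the Python sketch below picks a delta when a matching base image is already present locally and falls back to the full image otherwise. The `ImageVersion` and `Delta` types, their fields, and the sizes are assumptions made up for this example; they are not LXD's actual metadata format or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Delta:
    base_fingerprint: str  # image the delta must be applied against
    size: int              # download size in bytes


@dataclass(frozen=True)
class ImageVersion:
    fingerprint: str
    full_size: int                  # size of the full image in bytes
    deltas: tuple[Delta, ...] = ()  # deltas published for this version


def plan_download(target: ImageVersion,
                  local_fingerprints: set[str]) -> tuple[str, Optional[str], int]:
    """Return (kind, base_fingerprint, bytes_to_download)."""
    usable = [
        d for d in target.deltas
        if d.base_fingerprint in local_fingerprints and d.size < target.full_size
    ]
    if not usable:
        # No local base to patch against: download the whole image.
        return ("full", None, target.full_size)
    best = min(usable, key=lambda d: d.size)
    return ("delta", best.base_fingerprint, best.size)


# Hypothetical numbers: a 150 MB image with a 5 MB delta from "aaa".
new_image = ImageVersion("bbb", 150_000_000, (Delta("aaa", 5_000_000),))
with_base = plan_download(new_image, {"aaa"})
without_base = plan_download(new_image, set())
```

Once a delta has been downloaded, it could be applied with the xdelta3 command-line tool, for example `xdelta3 -d -s old.squashfs image.delta new.squashfs` (the file names here are placeholders).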

https://images.linuxcontainers.org is busy building deltas now. Over the next 3 days we should have all images available as squashfs, along with binary deltas for those squashfs files.

The main difficulty on the server side is the design of the stream metadata to expose those diffs. Once we figure out something which works for us and doesn't break older LXD, we'll expose the metadata and start working on LXD support for it, which should be reasonably straightforward.

Simplestreams support has now been added for this, so we can work on the client side.

Is it possible that this same process of only transferring image diffs could be incorporated into LXD when transferring containers/images to an LXD remote? (e.g. between a development and production environment with limited bandwidth).

This should have been closed by #3675. @stgraber, if there is still something outstanding please re-open.
