Using thinly provisioned ceph block devices as container rootfs would be awesome, and another step forward for container migration thanks to ceph's distributed nature.
Ceph also has full openstack Swift and glance support, making it a good fit for lxd.
A ceph feature would be really appreciated.
I posted this on #2875, but I didn't realise that issue was closed.
I believe ceph is the most popular storage backend used in openstack, for cinder and glance.
If nova-lxd wants to become a first class citizen in openstack, I would say that ceph support for lxd is a must!
+1 from me :)
@jocado, check out https://github.com/OpenNebula/addon-lxdone. It's a project I'm currently working on, something like nova-lxd but for OpenNebula instead of OpenStack. It supports Ceph as a storage backend, for rootfs and extra devices. The catch is that LXD is not aware it's using Ceph; it's a workaround.
In my opinion, if nova-lxd is meant to be more than a playground, then support for ceph volumes is a must, at least for cinder volume attachments.
We have ceph serving cinder and glance.
I recently deployed a compute node with ubuntu (16.04) and decided to try nova-lxd on it; the rest of our OpenStack infra is centos7 and the other compute nodes are qemu/kvm. I was disappointed when I realized I could not use ceph to attach volumes or to boot from ceph block devices.
I understand that quite a large number of OpenStack deployments use ceph, so I had guessed that supporting it in all (or most) nova drivers would be a high priority.
I can't evaluate the best strategy to do that, and I don't mean this as negative criticism; I just want to help push this into the light of day.
@mariojmdavid yes, though this has nothing to do with LXD's own ceph storage backend. To be able to attach OpenStack cinder volumes or similar OpenStack managed (as opposed to LXD managed) volumes, what you need is support for ceph rbd and mounts in nova-lxd itself rather than any change in LXD.
I believe the nova-lxd team is actively working on this for the reasons you described.
I assume I won't be able to start working on this properly until 25th May.
These are steps that LXD will assume have been taken by the user. LXD will not take care of these since most of them actually require sensitive and often interactive modifications to the underlying system:
In general LXD will not be concerned with any ceph-deploy setup steps! Specifically, LXD will not be concerned with creating ceph clusters, deploying and configuring new nodes. LXD will only make use of existing clusters and nodes.
The LXD daemon will thus start interacting with ceph on the osd pool level:
ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] [crush-ruleset-name] [expected-num-objects]
rbd create --size {megabytes} {pool-name}/{image-name}
rbd create --pool {pool-name} {rbd-name} --size
rbd map --pool {pool-name} {rbd-name}
mkfs.{type} /dev/rbd/{pool-name}/{rbd-name}
Not using any ceph-deploy level administration will enable LXD to run on any arbitrary node in the ceph cluster and still create storage pools and volumes.
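The command sequence above can be sketched as a dry-run shell function that only prints the commands in order, without touching a cluster. The function name `lxd_ceph_plan` and its argument order are hypothetical and chosen purely for illustration; the commands are the ones listed above, with `--cluster` added so a specific cluster can be addressed.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the commands LXD would issue,
# from osd pool creation down to mkfs on the mapped rbd device.
# lxd_ceph_plan is a hypothetical helper, not part of LXD.
lxd_ceph_plan() {
    local cluster="${1:-ceph}"   # ceph.cluster_name, default "ceph"
    local pool="$2"              # osd pool name
    local pg_num="${3:-100}"     # ceph.osd.pg_num, default 100
    local image="$4"             # rbd image name
    local size_mb="$5"           # image size in megabytes
    local fs="${6:-xfs}"         # block.filesystem, default "xfs"

    echo "ceph --cluster ${cluster} osd pool create ${pool} ${pg_num}"
    echo "rbd --cluster ${cluster} create --size ${size_mb} ${pool}/${image}"
    echo "rbd --cluster ${cluster} map --pool ${pool} ${image}"
    echo "mkfs.${fs} /dev/rbd/${pool}/${image}"
}

# Example: a 10GB xfs volume named "container1" in osd pool "lxd".
lxd_ceph_plan ceph lxd 100 container1 10240 xfs
```

Printing the plan instead of running it keeps the sketch safe to try on a machine without ceph installed; none of the steps involve ceph-deploy.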
Applicable configuration keys that already exist:
volume.block.filesystem (default=”xfs”)
volume.block.mount_options (default=”discard”)
volume.size (default=10GB)
ceph.cluster_name (string, default=”ceph”, the default value is identical to ceph’s)
ceph.osd.pool_name
ceph.osd.pg_num (int, default=”100”, number of placement groups)
ceph.sparse (bool, default=”true”, whether space-efficient but dependency-introducing copies should be used. Similar to the “zfs.clone_copy” property.)
block.mount_options (string, default=”discard”, Mount options for block devices)
block.filesystem (string, default=”xfs”, Filesystem to use for this volume)
ubuntu@xenial:~$ lxc storage create pool1 ceph
This will issue the following commands behind the scenes:
ubuntu@nuturo:~$ ceph osd pool create --cluster ceph lxd 100
pool 'lxd' created
ubuntu@nuturo:~$ ceph osd lspools
0 rbd,1 .rgw.root,2 default.rgw.control,3 default.rgw.data.root,4 default.rgw.gc,5 default.rgw.log,6 lxd,
ubuntu@xenial:~$ lxc storage create pool1 ceph ceph.osd.pool_name=foo
ubuntu@xenial:~$ lxc storage create pool1 ceph source=foo
Combinations of the “source” property and “ceph.osd.pool_name” currently do not make sense. This behavior is different from the zfs driver, where “lxc storage create pool1 zfs source=/dev/sdX zfs.pool_name=foo” means to create a new zfs pool named “foo” on block device /dev/sdX. However, as mentioned above, LXD is not concerned with setting up block devices or storage nodes for ceph itself. This is up to the user.
Notes on LXD internals
The “source” property for ceph storage pools will be set to the name of the osd pool if we have created it or been given its name (e.g. “ceph”). Note that we always record the name of the cluster in “ceph.cluster_name”; the default value for ceph.cluster_name will be “ceph”. If we are given an existing pool whose name is ambiguous, or the name of an osd pool that doesn't exist in the default “ceph” cluster, users need to give us the name of the cluster it belongs to.
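A minimal sketch of how these properties might be resolved, assuming two things that the text does not state outright: that combining “source” and “ceph.osd.pool_name” is rejected with an error, and that the osd pool name falls back to “lxd” when neither is given (read off the example command output above). The helper name `resolve_osd_pool` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide which osd pool and cluster a
# "lxc storage create <pool> ceph ..." call refers to.
# Arguments: <source> <ceph.osd.pool_name> <ceph.cluster_name>
# (pass an empty string for any property that was not set).
resolve_osd_pool() {
    local src="$1"
    local pool_name="$2"
    local cluster="${3:-ceph}"   # ceph.cluster_name defaults to "ceph"

    # Assumption: setting both properties is treated as an error.
    if [ -n "$src" ] && [ -n "$pool_name" ]; then
        echo "error: source and ceph.osd.pool_name cannot be combined" >&2
        return 1
    fi

    # Assumption: "lxd" is the fallback osd pool name.
    local pool="${src:-${pool_name:-lxd}}"

    # "source" always ends up holding the osd pool name, and the
    # cluster name is always recorded.
    echo "source=${pool} ceph.cluster_name=${cluster}"
}

resolve_osd_pool "" "" ""      # lxc storage create pool1 ceph
resolve_osd_pool "" "foo" ""   # ... ceph.osd.pool_name=foo
resolve_osd_pool "foo" "" ""   # ... source=foo
```

Whichever way the pool name arrives, the stored “source” value is the osd pool name, which matches the note that both forms of the create command refer to the same thing.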
Notes on ceph internals
On a new cluster only the “rbd” pool exists
You can have multiple clusters on the same host. In order to address a specific cluster the “--cluster” argument must be passed to all “ceph osd” and “rbd” commands.
rbd images are conceptually identical to lvm logical volumes and provide at least feature parity with them
rbd images support snapshotting in case the origin has been marked as “protected”. The “protected” property is an explicit marker set on a given rbd image to mark it as having dependent datasets. In contrast to (non-thinpool) logical volumes and zfs clones where this is implicitly enforced behavior, rbd explicitly requires you to enable this behavior.
rbd images seem to support full copy behavior. Issuing “rbd cp lxd/bla lxd/bla-copy” seems to create a new rbd image with identical contents without introducing dependencies between images. We should thus give ceph pools a Boolean property ceph.sparse which lets users choose whether they want space-efficient but dependency-introducing copies or not.
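As a rough illustration of the two copy modes a ceph.sparse property would choose between, the dry-run sketch below prints either a snapshot, protect and clone sequence (space-efficient, but the copy depends on the protected snapshot) or a plain “rbd cp” (independent full copy). The helper name `rbd_copy_plan` and the snapshot naming scheme are hypothetical; the rbd subcommands are the standard ones, and cloning assumes images that support layering.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the rbd commands for copying an image.
# rbd_copy_plan and the "copy-<dst>" snapshot name are hypothetical.
# Arguments: <cluster> <pool> <src-image> <dst-image> [sparse=true|false]
rbd_copy_plan() {
    local cluster="$1"
    local pool="$2"
    local src="$3"
    local dst="$4"
    local sparse="${5:-true}"    # mirrors ceph.sparse, default "true"

    if [ "$sparse" = "true" ]; then
        # Sparse copy: the clone depends on a protected snapshot.
        local snap="${pool}/${src}@copy-${dst}"
        echo "rbd --cluster ${cluster} snap create ${snap}"
        echo "rbd --cluster ${cluster} snap protect ${snap}"
        echo "rbd --cluster ${cluster} clone ${snap} ${pool}/${dst}"
    else
        # Full copy: no dependency between the two images.
        echo "rbd --cluster ${cluster} cp ${pool}/${src} ${pool}/${dst}"
    fi
}

rbd_copy_plan ceph lxd bla bla-copy true
rbd_copy_plan ceph lxd bla bla-copy false
```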
When you try to create an already existing osd pool via
ceph --cluster {cluster-name} osd pool create {pool-name} {pg-num}
then ceph will give a warning that the given osd pool already exists but will still exit with 0.
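Because the exit status cannot tell a fresh pool from a pre-existing one, one possible approach is to check the pool list before creating. The sketch below parses text in the comma-separated “id name” form shown in the “ceph osd lspools” output earlier; other ceph versions may print the list differently, so the format is an assumption. The helper name `osd_pool_exists` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if <pool-name> appears in lspools output.
# Assumes the "0 rbd,1 .rgw.root,...,6 lxd," format shown earlier.
# Arguments: <pool-name> <lspools-output>
osd_pool_exists() {
    local wanted="$1"
    local lspools="$2"
    local entry
    local IFS=','                # entries are comma separated

    for entry in $lspools; do
        # Each entry is "<id> <name>"; drop the id and compare names.
        if [ "${entry#* }" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Sample taken from the lspools output shown earlier. Against a real
# cluster the second argument would be "$(ceph --cluster ceph osd lspools)".
sample='0 rbd,1 .rgw.root,2 default.rgw.control,3 default.rgw.data.root,4 default.rgw.gc,5 default.rgw.log,6 lxd,'

if osd_pool_exists lxd "$sample"; then
    echo "osd pool lxd already exists"
else
    echo "osd pool lxd would be created"
fi
```

Comparing whole names rather than substrings avoids treating “rgw” as a match for “.rgw.root”.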