Lxd: Incremental copies of containers

Created on 17 May 2017 · 14 Comments · Source: lxc/lxd

Feature request:

I would like the ability to do incremental copies of a container. This feature would shorten the downtime when copying or moving containers, either locally or between servers, especially if live migration is not available or does not work. This can be done manually today via a multi-stage rsync, but building the feature into the product would make great sense, especially from a usability point of view.

For example:
lxc copy server1:<container_a> <container_b>
...or...
lxc copy server1:<container_a> server2:<container_b>

Next, we run an incremental copy:
lxc copy --incremental server1:<container_a> <container_b>
...or...
lxc copy --incremental server1:<container_a> server2:<container_b>

The logic is pretty straightforward:

  • If the target container does not exist, perform copy (just like today)
  • If the target container exists and is powered on, disallow copy (just like today)
  • If the target container exists and is powered off, sync contents of container
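
The three rules above can be sketched as a small decision function. This is only an illustration of the requested behavior, not LXD code: the names `Action` and `plan_copy` and the state labels "running" and "stopped" are hypothetical.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    FULL_COPY = "full copy"
    REFUSE = "refuse"
    INCREMENTAL_SYNC = "incremental sync"


def plan_copy(target_state: Optional[str]) -> Action:
    """Pick an action from the state of the target container.

    target_state is None when the target does not exist, otherwise
    "running" or "stopped" (hypothetical labels for this sketch).
    """
    if target_state is None:
        # No target yet: behave like a normal copy.
        return Action.FULL_COPY
    if target_state == "running":
        # Never overwrite a container that is powered on.
        return Action.REFUSE
    if target_state == "stopped":
        # Target exists and is powered off: sync only the differences.
        return Action.INCREMENTAL_SYNC
    raise ValueError(f"unknown target state: {target_state!r}")
```

Keeping the decision separate from the transfer itself makes it easy to check each of the three cases in isolation.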
Documentation Feature

All 14 comments

I took some time to think through this and I think this is a nice feature to have. Supporting this should be straightforward. In case the two containers are on the same storage backend it should be possible to use driver specific features. For example, {btrfs,zfs} {send,receive}.
@stgraber, what do you think?

Thanks Christian.

@stgraber: any comment on the proposed enhancement?

On May 18, 2017, at 11:23 AM, Christian Brauner notifications@github.com wrote:

I took some time to think through this and I think this is a nice feature to have. Supporting this should be straightforward. In case the two containers are on the same storage backend it should be possible to use driver specific features. For example, {btrfs,zfs} {send,receive}.
@stgraber, what do you think?


I'm fine with this, though I'd recommend we use "--refresh" as the argument for this and obviously fail when attempting to copy over an existing container without --refresh passed.

This may get a bit trickier when the source container has a bunch of snapshots, but I'll let @brauner think about those corner cases :)

I need this feature too.
Right now I'm directly using zfs send/receive to keep a replica of my containers between hosts.
But this conflicts with LXD snapshots, so you need to be very careful when implementing it.
I think that LXD needs a native solution to implement HA.

Regards

This is just a "me too". But in my case, if lxc move could use either a cache or a diff mechanism, that would be great. Some random ideas (pardon me, I'm not too familiar with the internals):

  1. Use snapshotting to avoid copying the base layer, this would fix my issue.
  2. Make a more complicated block storage protocol that allows both ends to communicate: "I need to send a block with this SHA256, do you have it?"
  3. Something else?
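
To make idea 2 concrete, here is a minimal in-memory sketch of a "do you have this SHA256?" exchange. It is not an existing LXD protocol: `Receiver`, `transfer`, and the 4096-byte block size are assumptions made only for illustration, and a real protocol would run over the network rather than through direct method calls.

```python
import hashlib
from typing import Dict, Iterable, List, Set

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


class Receiver:
    """Destination side: stores blocks keyed by their SHA256 digest."""

    def __init__(self, existing: bytes = b"") -> None:
        self.store: Dict[str, bytes] = {
            digest(b): b for b in split_blocks(existing)
        }

    def missing(self, digests: Iterable[str]) -> Set[str]:
        """Answer "do you have it?" for a batch of digests."""
        return {d for d in digests if d not in self.store}

    def put(self, block: bytes) -> None:
        self.store[digest(block)] = block

    def assemble(self, digests: Iterable[str]) -> bytes:
        return b"".join(self.store[d] for d in digests)


def transfer(data: bytes, receiver: Receiver) -> int:
    """Send only the blocks the receiver lacks; return the bytes sent."""
    blocks = split_blocks(data)
    digests = [digest(b) for b in blocks]
    wanted = receiver.missing(digests)
    sent = 0
    for block, d in zip(blocks, digests):
        if d in wanted:
            receiver.put(block)
            wanted.discard(d)  # do not resend identical blocks
            sent += len(block)
    return sent
```

If the destination already holds an older copy whose blocks mostly match, only the changed blocks cross the wire, which is the saving the comment is after.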

Just a quick update on this to clarify what we're expecting to do here.

The goal is to allow running "lxc copy src:c1 dst:c1 --refresh" and have LXD update the existing destination container from the source, using an incremental (rsync) storage code path to avoid a full transfer.

A few notes:

  • If the destination doesn't exist, "--refresh" should fail
  • The migration sink should be set up through a new refresh option in PUT /1.0/containers/NAME
  • The migration source should be set up through the standard POST /1.0/containers/NAME
  • By default, we should sync the snapshot list between the source and destination: transfer any new snapshots and remove any that no longer exist on the source
  • We should also honor --container-only, in which case we don't transfer any snapshots, nor do we attempt to remove any extra snapshots on the target
  • This feature must work with all 3 migration modes (pull, push and relay)
  • This would always work in stateless mode; we don't allow refreshing container runtime state
  • The normal "lxc copy" config overrides would still work and would apply to the container config we send in PUT
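
The snapshot-related notes above can be sketched as a small planning function. This is a hedged illustration, not the LXD implementation: `RefreshPlan` and `plan_refresh` are hypothetical names, and snapshots are compared by name only, which a real implementation may well need to refine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RefreshPlan:
    transfer: List[str]  # snapshots to send to the destination
    remove: List[str]    # snapshots to delete on the destination


def plan_refresh(
    source_snaps: List[str],
    dest_snaps: Optional[List[str]],
    container_only: bool = False,
) -> RefreshPlan:
    """Work out which snapshots a refresh would transfer and remove.

    dest_snaps is None when the destination container does not exist.
    """
    if dest_snaps is None:
        # --refresh only makes sense against an existing container.
        raise LookupError("destination container does not exist")
    if container_only:
        # Leave the destination's snapshots completely untouched.
        return RefreshPlan(transfer=[], remove=[])
    src, dst = set(source_snaps), set(dest_snaps)
    return RefreshPlan(
        transfer=[s for s in source_snaps if s not in dst],
        remove=[s for s in dest_snaps if s not in src],
    )
```

For example, with source snapshots `snap0`, `snap1`, `snap2` and destination snapshots `snap0`, `old`, the plan transfers `snap1` and `snap2` and removes `old`; with `container_only=True` both lists stay empty.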

Some humble opinions:

  • with zfs and btrfs, native send/receive should be used.
  • add a --force option to replace the destination if it was used and has changes, like zfs receive -F

From what I have seen on multiple servers, btrfs send/receive is very slow compared to standard rsync. In fact, it seems to be 2-3 times slower even on servers with SSDs. The disk throughput is lower (fewer MB/sec) and the CPU usage is higher.

I would prefer the rsync option when using the "--container-only" mode Stephane mentioned earlier, and the appropriate send/receive tools if the snapshots have to be sync'd between servers.

I would also ask for a “quick-init” option that would simply create an empty container on the remote server then use rsync to copy the data.

My $0.02.

-Ron

On Nov 4, 2017, at 10:10 AM, Christian Pradelli notifications@github.com wrote:

Some humble opinions:

  • with zfs and btrfs, native send/receive should be used.
  • add a --force option to replace the destination if it was used and has changes, like zfs receive -F

@kattunga the --refresh mentioned above is effectively your --force.

We indeed should attempt to use send/receive whenever possible, and it should actually work in this case: we can send/receive any new snapshot fine, and for the main dataset we can create a new temporary snapshot and transfer that.
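
One way to picture that ordering is the sketch below, which lists the sends as (base, target) pairs, each one incremental from the previous snapshot. It is an assumption-laden illustration rather than LXD's actual code path: `plan_sends` and the temporary snapshot name `refresh-tmp` are hypothetical, and it assumes the destination's snapshots are an older prefix of the source's history.

```python
from typing import List, Optional, Tuple

TEMP_SNAPSHOT = "refresh-tmp"  # hypothetical name for the temporary snapshot


def plan_sends(
    source_snaps: List[str], dest_snaps: List[str]
) -> List[Tuple[Optional[str], str]]:
    """Return (base, target) pairs in the order they would be sent.

    source_snaps must be ordered oldest first. A base of None means a
    full send; otherwise the send is incremental from that snapshot.
    """
    dest = set(dest_snaps)
    base: Optional[str] = None
    steps: List[Tuple[Optional[str], str]] = []
    for snap in source_snaps:
        if snap in dest:
            # Already on the destination: usable as an incremental base.
            base = snap
            continue
        steps.append((base, snap))
        base = snap
    # Finally, capture the main dataset's current state in a temporary
    # snapshot and send it on top of the newest snapshot.
    steps.append((base, TEMP_SNAPSHOT))
    return steps
```

With source snapshots `s0`, `s1`, `s2` and only `s0` on the destination, the plan sends `s1` on top of `s0`, `s2` on top of `s1`, and then the temporary snapshot on top of `s2`.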

Was this supposed to be in 3.0? If it was, what's the new milestone?

It originally was, but in the end it couldn't make it as we had to focus on clustering.
No clear milestone right now; our rough plan is to have this in by the end of October.

Which milestone is this for? Still points to 3.7.

3.7 has it
