This is a placeholder issue for us to think about and implement a better, simpler backup mechanism for those who want offline backups of their containers.
I have been doing this with an adapted script from https://github.com/vistoyn/backup-tools where you essentially dump the container config and then tar gzip it along with the rootfs, excluding proc sys and tmp.
I've also written a number of CLI tools to restore existing containers from such backups, or to provision new ones (fixing up the network config etc.). However, this is not very generalizable.
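For readers who have not seen the tar-based approach described above, here is a minimal sketch of it. It is not the script from the linked repository; the function name `backup_container` and the idea of passing the rootfs path as an argument are assumptions made for illustration, and archiving a live rootfs this way is not atomic.

```shell
#!/bin/sh
# Sketch only: dump the container config, then tar+gzip the rootfs
# while leaving out the contents of proc, sys and tmp.

# backup_container NAME ROOTFS_DIR OUT_DIR  (hypothetical helper)
# Writes OUT_DIR/NAME.yaml and OUT_DIR/NAME.tar.gz.
backup_container() {
    name=$1 rootfs=$2 out=$3
    mkdir -p "$out" || return 1
    # Save the container configuration so it can be re-applied on restore.
    lxc config show "$name" > "$out/$name.yaml" || return 1
    # Archive the rootfs. The patterns exclude what is inside proc, sys and
    # tmp but keep the (empty) directories themselves as mount points.
    tar -C "$rootfs" --numeric-owner \
        --exclude='./proc/*' --exclude='./sys/*' --exclude='./tmp/*' \
        -czf "$out/$name.tar.gz" .
}
```

Usage would look like `backup_container web1 /path/to/web1/rootfs /backup`, where the rootfs path depends on your storage backend.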
Call me old-fashioned, but I simply do not trust the snapshot functionality as-is and backup chains in general. I've lived one too many horror stories on that front. So in my mind, a wholesale portable snapshot is the way to go.
It would be neat if this was baked-in, as it's a strong "selling point" that you can do live snapshots of containers in this way, and it would make newcomers more at ease doing this stuff as it takes a bit of research and trial and error to get this working as-is.
So yeah, huzzah!
One thing that could be improved: as-is I have to import my tarball as an image, then do lxc init, and then nuke the temporary image again. Which seems a bit ... indirect.
I follow this topic with high interest. I've tried the different methods of backing up lxd containers described in the lxd backup strategy, and for now I'm sticking with doing this on the filesystem level (Btrfs). I know it's not 100% reliable and maybe not the "right way", but for lack of other options it is my best choice.
@Kramerican, be great if you can post your cli restore tools, in a zip file.
I'm in the same situation, where I have 100s of LXD containers running across many machines + require a mechanism to restore a container on any machine, any time + have the container work.
This was very easy with LXC + ridiculously difficult with LXD.
I'll take a look at the above link.
Thanks for posting it.
@davidfavor our tools are very much baked into our management tools, so it would be a bit difficult and time consuming to split out the relevant parts for you. However, feel free to shoot me any specific questions you might have and I'll try to help the best I can.
I have encountered an issue in connection with this recently, which is a bit of a catch-22 for me.
As mentioned, I want portable off-site backups of containers - that's why I don't use snapshotting.
This has worked fine, until we had a container with hundreds of thousands of files on an HDD setup. This container would take about 10 hours to "snapshot" instead of the usual couple of minutes.
This is clearly not acceptable in our setup, so I am now investigating alternatives. I have looked at something like https://zpool.org/zfs-snapshots-and-remote-replication/ but we are unfortunately limited by our remote storage, which is a dumb FTP server.
The ultimate scenario would be the ability to create portable snapshots of containers quickly and copy them off-site for storage, like you can with so many other traditional vm solutions.
AFAIK you can dump ZFS snapshots and upload them to the ftp server.
@tomposmiko got anything you can link to on this?
zfs send blah@snapshot > file.img
Then when you want to restore: zfs receive blah < file.img
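Combined with the dumb-FTP constraint mentioned earlier, the two commands above could be wrapped roughly as follows. This is a sketch, not a tested workflow: the function names, the `.part` suffix and the use of `curl -n` (credentials from `~/.netrc`) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: dump a ZFS snapshot to a file, upload it, restore it later.

# dump_snapshot DATASET SNAPNAME OUTFILE  (hypothetical helper)
dump_snapshot() {
    dataset=$1 snap=$2 outfile=$3
    zfs snapshot "$dataset@$snap" || return 1
    # Write under a temporary name and rename only if zfs send succeeded,
    # so an aborted dump is never mistaken for a finished backup.
    zfs send "$dataset@$snap" > "$outfile.part" || return 1
    mv "$outfile.part" "$outfile"
}

# upload_ftp FILE FTP_URL  (hypothetical helper)
# If FTP_URL ends in "/", curl appends the local file name.
upload_ftp() {
    curl -n -T "$1" "$2"
}

# restore_snapshot DATASET INFILE  (hypothetical helper)
restore_snapshot() {
    zfs receive "$1" < "$2"
}
```

For example: `dump_snapshot lxd/containers/blah nightly /backup/blah.img && upload_ftp /backup/blah.img ftp://backup.example/lxd/` (the host name is a placeholder).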
@stgraber I have stumbled on this myself these last few days, and have had wonderfully mixed, but always far superior results with zfs send / receive. Mixed in the sense that I have a problem with zfs send fluctuating wildly in speed, but I've reached out to the zfs mailing list to diagnose this.
The remainder here is probably a bit off topic, but I'll leave it here for your comments and if others might find this useful ... :
It is now clear to me that it is crazy to consider descending into the filesystem with tar and making tarballs if your objective is snapshots of running containers. That is simply not the correct tool for the job when you have zfs at your disposal.
I have now updated all my snapshotting and restoration workflows with regards to my portable backups to use zfs send / receive. This is not on production yet however, as I am doing things a bit gung-ho with regards to restoration, let me explain:
Rolling back a running container, I do something like: stop container, destroy original pool (and snaps) and then rename pool to minimize downtime
pigz -c -d /backup/testsnap.gz | zfs recv -F lxd/containers/temppool
lxc stop mycontainer
zfs destroy -r lxd/containers/mycontainer
zfs rename lxd/containers/temppool lxd/containers/mycontainer
lxc start mycontainer
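A slightly more cautious variation of the rollback above, as a sketch: it checks the archive first and keeps the old dataset until the container has started from the new one. This is my own variation, not something confirmed by the LXD developers in this thread; the function name and the `-incoming` / `-old` dataset suffixes are hypothetical.

```shell
#!/bin/sh
# Sketch only: roll a container back to a compressed zfs send stream.

# rollback_container NAME ARCHIVE [PARENT_DATASET]  (hypothetical helper)
rollback_container() {
    name=$1 archive=$2 pool=${3:-lxd/containers}
    # Check the compressed file for corruption before touching anything.
    pigz -t "$archive" || return 1
    # Receive into a temporary dataset while the container is still running.
    pigz -c -d "$archive" | zfs recv -F "$pool/$name-incoming" || return 1
    lxc stop "$name" || return 1
    # Move the current dataset aside instead of destroying it right away.
    zfs rename "$pool/$name" "$pool/$name-old" || return 1
    zfs rename "$pool/$name-incoming" "$pool/$name" || return 1
    if lxc start "$name"; then
        # The container came up, so the previous dataset can go.
        zfs destroy -r "$pool/$name-old"
    else
        echo "start failed; previous dataset kept as $pool/$name-old" >&2
        return 1
    fi
}
```

Called as `rollback_container mycontainer /backup/testsnap.gz`, downtime is still limited to the stop, two renames and the start.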
Provisioning a new container based on a snapshotted dataset: Init "empty" ubuntu container, Destroy old storage dataset, Import new dataset, start the container, update network config and whatever else needs done
lxc init ubuntu:xenial testnewcontainer --profile webdock
zfs destroy -r lxd/containers/testnewcontainer
pigz -c -d /backup/testsnap.gz | zfs recv -F lxd/containers/testnewcontainer
lxc start testnewcontainer
... Update network config and whatever else needs doing
This approach is something I came up with as a test, just to see if it would work, and for now this seems to work just nicely. I have not seen anybody doing things this way however (not that I could find), which leads me to suspect there is possibly more to this kettle of fish. Is this a crazy way of doing it? i.e. just replacing the dataset wholesale like I do here? I can't think of anything theoretical which would be wrong with this, but then again I'm not well versed in the internals of LXD.
What you're doing should be fine and I in fact intend for us to support something like this natively with #3730. Effectively letting you export a container (and its snapshots) as a tarball which either contains the raw fs tree (good old tarball) or contains the storage backend optimized format (result of zfs send, btrfs send, ...).
What format you'd export as would therefore depend on what you expect to restore your container on.
@stgraber Thank you for setting my mind at ease, and awesome to hear this will be supported natively so I can clean up my scripts a bit :smiley:
Regarding ZFS... I've had to fall back to using ext4, because ZFS speed fluctuates wildly + the client sites I host in LXD containers are all high traffic sites... so speed fluctuation makes ZFS a no-go.
@Kramerican, when you do the LXD restore, pass along how you handle regenerating /dev + /sys + /proc as this seems like the only sticking point in the restore process.
@stgraber, maybe you can comment on this also.
Looking at https://github.com/vistoyn/backup-tools looks like this person's approach is to lxc init $container which is a bit simplistic. Likely taking this approach requires init'ing the container using an exact match for the image type used for the running container, which was backed up.
This approach ensures correct creation of /dev + /proc + /sys + seems like huge overkill.
If there's a better/faster/lighter-weight approach, I'd love any suggestions.
Thanks.
@davidfavor I am no longer doing the tar'ing of the container. I had no problems with the method you refer to, except for the obvious issues of speed and impracticality vs. zfs
The speed issues you have experienced with zfs: Is that when doing zfs send/receive, or just disk i/o in general?
I had tremendously varied results with zfs send/receive up until I sparred with the nerds on the zfs mailing list. From that I gathered a bunch of very excellent suggestions for optimizations suitable for LXD containers. I have been meaning to do a writeup of this and post...somewhere, when I have time.
In essence: Containers suffer from fragmentation of data. This slows down disk/io. Especially webservers with heavy db access or great deletion/creation of files. Here you can essentially "defrag" the datapool with a couple of simple commands, and mitigate this fragmentation with some zfs settings.
If you are seeing bad disk i/o in general, I do not have any solutions for you, except for a few optimization settings for zfs: make sure compression is on, do zfs set xattr=sa lxd, possibly zfs set sync=disabled lxd and possibly zfs set atime=off lxd, as well as tweaking the amount of RAM zfs can use.
PS: do not use these settings without reading up on what they do, as they can result in data loss in certain scenarios.
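The settings listed above could be applied along these lines. Treat it as a sketch: the function name and the `ALLOW_SYNC_DISABLED` switch are hypothetical, added so that the risky `sync=disabled` setting has to be opted into explicitly. The RAM tuning is left out because the right value depends entirely on the host.

```shell
#!/bin/sh
# Sketch only: show current values, then apply the tuning discussed above.

# tune_pool POOL  (hypothetical helper)
tune_pool() {
    pool=$1
    # Print the current values first so the change can be reverted by hand.
    zfs get -H -o property,value compression,xattr,sync,atime "$pool"
    zfs set compression=on "$pool" &&
        zfs set xattr=sa "$pool" &&
        zfs set atime=off "$pool" || return 1
    # sync=disabled can lose recently written data on a crash or power
    # failure, so it is only applied when explicitly requested.
    if [ "${ALLOW_SYNC_DISABLED:-0}" = 1 ]; then
        zfs set sync=disabled "$pool" || return 1
    fi
}
```

`tune_pool lxd` applies the three safer settings; set `ALLOW_SYNC_DISABLED=1` beforehand only after reading up on what it does.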
I'd suspect something other than zfs is giving you trouble maybe ...
Just a few design notes on this:
We'll keep client support for this pretty limited at the beginning with just two extra top-level commands:
lxc export [<remote>:]<container> [target] [--container-only] [--optimized-storage]
lxc import [<remote>:] <backup file>
So export would effectively ask LXD to create a backup, setting a short expiry for it, then retrieve it once made and delete the backup. Import would read a backup file and POST it to LXD to have it restored.
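Going by the proposed syntax above, a nightly backup wrapper might look something like this. It is a sketch against the design as stated in this comment, so the final command may differ; the function names and the file naming scheme are hypothetical.

```shell
#!/bin/sh
# Sketch only: thin wrappers around the proposed export/import commands.

# export_container NAME DIR STAMP  (hypothetical helper)
# Produces DIR/NAME-STAMP.tar.gz in the storage-backend-optimized format.
export_container() {
    name=$1 dir=$2 stamp=$3
    lxc export "$name" "$dir/$name-$stamp.tar.gz" --optimized-storage
}

# import_container BACKUP_FILE  (hypothetical helper)
import_container() {
    lxc import "$1"
}
```

For instance `export_container web1 /backup "$(date +%Y%m%d)"`. Dropping `--optimized-storage` should give the plain tarball format, which is the one to pick if the restore target may use a different storage backend.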
@brauner @monstermunchkin assigning this to the two of you.
@monstermunchkin you can try to do all the API addition bits, structs, API extension, ... and then sync with @brauner for the actual generation and consumption of the tarball.
We are excited to see this feature coming.
For the last two years we have been using our own scripts, running something like this:
zfs send $zfssrc/$container@snapshot-$id | lz4 > $zfslcl/$container-$id.zfs.lz4
We also use the zfs copy to another lxd daemon as a "stand-by" backup.
For both we would love to see an incremental (https://github.com/lxc/lxd/issues/3326) option, as the backup load on the network is slowly hitting a critical size for us.
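Until something like that exists natively, an incremental variant of the command above can be sketched with `zfs send -i`, which sends only the changes between two snapshots. The function name and the file naming are hypothetical; the stream is compressed with lz4 as in the original command.

```shell
#!/bin/sh
# Sketch only: send just the delta between two snapshots of a dataset.

# incremental_backup DATASET PREV_SNAP NEW_SNAP OUT_DIR  (hypothetical helper)
incremental_backup() {
    dataset=$1 prev=$2 new=$3 outdir=$4
    zfs snapshot "$dataset@$new" || return 1
    file="$outdir/$(basename "$dataset")-$prev-to-$new.zfs.lz4"
    # -i sends only the blocks that changed between @prev and @new.
    zfs send -i "$dataset@$prev" "$dataset@$new" | lz4 > "$file"
}
```

Note that an incremental stream can only be received onto a dataset that already holds the `PREV_SNAP` snapshot, so the full stream and every increment after it have to be kept and restored in order.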
Same here: we have been using https://github.com/digint/btrbk (almost) since the beginning. Works really well with incremental backups.
Michael Luggen wrote:
> We are excited to see this feature coming.
> For the last two years we have been using our own scripts, running something like this:
> zfs send $zfssrc/$container@snapshot-$id | lz4 > $zfslcl/$container-$id.zfs.lz4
> We also use the zfs copy to another lxd daemon as a "stand-by" backup.
> For both we would love to see an incremental (#3326 https://github.com/lxc/lxd/issues/3326) option, as the backup load on the network is slowly hitting a critical size for us.
This will fail if any database subsystem is running, like MariaDB/MySQL.
Snapshotting disk data fails to capture memory buffers, so when this type of
snapshot is restored, it fails with corrupt database files.
With MariaDB, sometimes this corruption can be recovered + data is still lost.
Many times file corruption cannot be recovered + error messages are emitted
each time MariaDB/MySQL is started.
With MySQL (shudder) rarely can any type of corruption be recovered.
The best way to do this is just to do a...
1) lxc stop container
2) lxc copy container in background
3) restart container
At this point the container copy will run till it finishes.
Then stop mysqld on both servers + do a 2nd rsync /var/lib/mysql from
original container to copied container.
Then restart mysqld on both servers.
@davidfavor
Ha, good to know: Backup is easy, Restore is hard (-:
Unfortunately we can't stop our machines for the backup. Is there any way to instruct the database to save to disk before a backup?
Michael Luggen wrote:
> @davidfavor
> Ha, good to know: Backup is easy, Restore is hard (-:
> Unfortunately we can't stop our machines for the backup. Is there any way to instruct the database to save to disk before a backup?
Two things to consider.
1) Doing a mysql stop + rsync only produces a few seconds of outage.
This is the fastest way to create a 100% backup which will work
when restored.
2) You can use xtrabackup (apt-get -y install percona-xtrabackup-24)
to clone database to another directory + then move that directory
to your copied container.
This allows 100% database uptime + the trade off is time required
to run xtrabackup.
If the site has a small database, no problem.
If you're looking at many GB or TB of data, then using the rsync
trick is the only way to handle this.
The way I do this is to place a friendly maintenance mode message
on site for the 5-10 seconds usually required for this operation.
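Option 2 above (xtrabackup) could be scripted roughly like this. It is a sketch assuming percona-xtrabackup 2.4 is installed inside the container and that xtrabackup can find its database credentials in the container's MySQL client configuration; the function name and target path are hypothetical.

```shell
#!/bin/sh
# Sketch only: hot copy of the database inside a running container.

# hot_db_backup CONTAINER TARGET_DIR  (hypothetical helper)
# TARGET_DIR is a path inside the container.
hot_db_backup() {
    name=$1 target=$2
    # Copy the data files while the database server keeps running.
    lxc exec "$name" -- xtrabackup --backup --target-dir="$target" || return 1
    # Apply the redo log so the copied files are consistent.
    lxc exec "$name" -- xtrabackup --prepare --target-dir="$target"
}
```

After `hot_db_backup db1 /var/backups/xtra`, the prepared directory is what would be moved into the copied container's data directory, with mysqld stopped there.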
@davidfavor We've been doing, on our ZFS backed systems
/sbin/zfs snapshot lxd/containers/$LXD_NAME@$BACKUP_TYPE
and then
/sbin/zfs send lxd/containers/$LXD_NAME@$BACKUP_TYPE | /usr/bin/mbuffer -q -m 500M | /usr/bin/pigz -c -p 6 | /usr/bin/mbuffer -q -m 500M > $LXD_BACKUP/$LXD_SNAPSHOT.gz
For 7-8 months now, on 300+ containers running MariaDB without noticing a single instance of corrupted database data.
Granted, the vast majority of these sites are not busy websites, so maybe we didn't notice any potential drop of data? At the very least, we have seen no corruption of data or problems arising on the database side.
Maybe we have just consistently been getting lucky? :)
Edit: We have restored many a webserver from snapshots like these. I don't have a precise count, but including migrations internally on our network it's in the hundreds. Again, we have never had a server not come up or had problems due to corrupt MariaDB data
Second edit: If you are wondering what mbuffer is for, it's due to us streaming from one disk to another on the system, typically from an SSD to a HDD - so mbuffer is there to optimize the transfer by "borrowing" some RAM, in case pigz lags behind or the HDD for some reason sees high i/o and can't receive the stream quick enough.
Kramerican wrote:
> @davidfavor We've been doing, on our ZFS backed systems
> /sbin/zfs snapshot lxd/containers/$LXD_NAME@$BACKUP_TYPE
> and then
> /sbin/zfs send lxd/containers/$LXD_NAME@$BACKUP_TYPE | /usr/bin/mbuffer -q -m 500M | /usr/bin/pigz -c -p 6 | /usr/bin/mbuffer -q -m 500M > $LXD_BACKUP/$LXD_SNAPSHOT.gz
> For 1 year+ on 300+ containers running MariaDB without noticing a single instance of corrupted database data.
> Granted, the vast majority of these sites are not busy websites, so maybe we didn't notice any potential drop of data? At the very least, we have seen no corruption of data or problems arising on the database side.
> Maybe we have just consistently been getting lucky? :)
Some of the sites I host run 50K-100K reqs/minute, so tend to corrupt.
The only reason you don't see any corruption is...
1) Your log level is too low to see corruption messages.
2) Database updates are so infrequent flushes occur before backups run.
In general, trying to do a snapshot will create corruption under many
circumstances... so...
If you're running hobby sites, you can ignore corruption.
If you're running money/production sites, best to rsync /var/lib/mysql,
as described previously in this thread.
@davidfavor Maybe this doesn't need to be that complicated? Looking at this:
https://serverfault.com/questions/805257/backing-up-a-mysql-database-via-zfs-snapshots
There may be some configuration / commands which can be used to make sure memory buffers are flushed when doing the snapshot. I'm going to monkey around with this a bit and see what I find. If you've already been down this path and failed, it would be much appreciated to know :)
Edit: From further reading: I am using InnoDB (the default for MariaDB), and the innodb_flush_log_at_trx_commit setting is 1 by default. So as far as I can see there should be no chance of any major badness happening, and no need for any FLUSH TABLES ... stuff before snapshotting. I guess MariaDB might emit some messages about recovering the last transaction commit, but as far as I understand it, it's all good :)
If I have misunderstood something here, or you have more practical experience with this @davidfavor I would love to know.
Second edit: (I should really complete my research before writing replies...) From further reading, there may be some chance of the flush not completing, in which case that last commit will be missing from the snapshot. However, just because a snapshot didn't grab the very last commit, I don't really see how that could corrupt the database. We're not talking about a power outage here, where the contents of memory are lost. There'd just be some data that didn't make it into that snapshot, but neither will any of the data that comes after (this is the InnoDB auto-recovery magic). From my current understanding of things, and this syncs with my practical experience so far, there should be no issue here. I can't speak for MySQL specifically, but if you're using InnoDB you should be good.
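To make the assumption behind this reasoning explicit, a snapshot script could check the setting before snapshotting. This is only a sketch of that idea, not a guarantee of consistency: the function name is hypothetical, and it assumes the `mysql` client inside the container can connect without extra credentials.

```shell
#!/bin/sh
# Sketch only: warn if InnoDB is not flushing its log on every commit,
# then take the ZFS snapshot of the container dataset.

# snapshot_with_durability_check NAME SNAP [PARENT_DATASET]  (hypothetical)
snapshot_with_durability_check() {
    name=$1 snap=$2 pool=${3:-lxd/containers}
    # -N drops the column header and -B gives plain output, so only the
    # value itself is printed.
    val=$(lxc exec "$name" -- mysql -N -B \
        -e 'SELECT @@innodb_flush_log_at_trx_commit') || return 1
    if [ "$val" != "1" ]; then
        echo "warning: innodb_flush_log_at_trx_commit=$val;" \
            "recent commits may be missing from the snapshot" >&2
    fi
    zfs snapshot "$pool/$name@$snap"
}
```

With a value of 1 the redo log is flushed at each commit, which is the case the argument above relies on.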
So from reading this @stgraber:
Will the new feature of more native backups address the mentioned database troubles?
Also, more importantly:
For larger containers, will there be an incremental option?
@michacassola no, this feature lets you extract the filesystem as it currently is into a tarball. If that filesystem is inconsistent because your database server hasn't been told to write everything, then it will be inconsistent. Database servers should usually be special cased in backups, either by turning them off, by doing a separate dump of the databases or by syncing with a remote DB server.
For incremental type backups, #3326 is what we'll be looking at.
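The "separate dump of the databases" option mentioned above could look something like this for MariaDB/MySQL. It is a sketch: the function names are hypothetical, it assumes `mysqldump` and `mysql` inside the container can connect without extra credentials, and `--single-transaction` only gives a consistent view for transactional tables such as InnoDB.

```shell
#!/bin/sh
# Sketch only: logical database dump taken alongside the container backup.

# dump_databases CONTAINER OUTFILE  (hypothetical helper)
dump_databases() {
    name=$1 outfile=$2
    # Write to a temporary name so a failed dump never looks complete.
    lxc exec "$name" -- mysqldump --single-transaction --all-databases \
        > "$outfile.part" || return 1
    mv "$outfile.part" "$outfile"
}

# restore_databases CONTAINER INFILE  (hypothetical helper)
restore_databases() {
    lxc exec "$1" -- mysql < "$2"
}
```

The dump file is stored next to the container backup and replayed into the restored container with `restore_databases`.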
Thank you @stgraber !
I don't have any large sites right now but it's always better to know in case I will.
@davidfavor
> The best way to do this is just to do a...
> 1) lxc stop container
> 2) lxc copy container in background
> 3) restart container
> At this point the container copy will run till it finishes.
> Then stop mysqld on both servers + do a 2nd rsync /var/lib/mysql from original container to copied container.
> Then restart mysqld on both servers.
I don't get it.
Why not stop mysql, then stop the container, then copy it and restart it again? Would the copy then not be corruption free? And you would save a step? Please enlighten me.
Also, why is a dump made by mysql itself not an option? One could copy the container, take a mysql dump, and when restoring put the dump back into the restored container.
Thanks in advance! :)
@michacassola Did you see my comments? If you are using MariaDB/InnoDB you have no need to worry. Just snapshot and off you go ...
@Kramerican Yes of course I did. But with the many edits I was not sure what was the end result. But thanks for making that clear!
But I will have to set innodb_flush_log_at_trx_commit to 1 or 2.
And I guess, @stgraber, if the log buffer is continually flushed to disk, the future LXD native backup function will work just as well as a ZFS snapshot would?
Excellent, thank you for the fix.
On 15-May-2018, at 3:45 PM, Christian Brauner <notifications@github.com> wrote:
Closed #3730 (https://github.com/lxc/lxd/issues/3730).
@Kramerican - ZFS + BTRFS backing stores are incredibly slow compared to raw EXT4, so EXT4 is really the only option for high-IOPS databases.
This entire problem comes back to many existing tickets which have requested:
1) Pre + Post hooks for lxc copy, so database start/stop can be integrated into copy operations.
2) Allow passing rsync exclude lists, to optimize copy operations.
If these two features existed, working with large data containers would be much simpler.
+1 for incremental backups
