Kops: etcd backups

Created on 14 Feb 2017 · 31 comments · Source: kubernetes/kops

We should use commands similar to this:

etcd2: etcdctl backup --data-dir=foo --backup-dir=bar, and then push the result up to storage.
etcd3: the command is a bit different: etcdctl --endpoints=127.0.0.1:2379 snapshot save /backup/dir/snapshot.db
Also, for etcd3 you need to set an env var: ETCDCTL_API=3
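As a rough sketch of the two flavors (all paths and the bucket name here are placeholders, not values from the issue):

```shell
#!/bin/sh
# Sketch of the two backup flavors described above.
# Data dirs, snapshot paths, and the S3 bucket are assumptions.

backup_etcd2() {
  # etcd2: copies the data dir into a backup dir, then syncs it to storage
  etcdctl backup --data-dir="$1" --backup-dir="$2"
  aws s3 sync "$2" "$3"
}

backup_etcd3() {
  # etcd3: the v3 API needs ETCDCTL_API=3 and uses `snapshot save`
  ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 snapshot save "$1"
  aws s3 cp "$1" "$2"
}

# Example invocation (commented out; requires a running etcd):
# backup_etcd3 /tmp/snapshot.db s3://my-bucket/etcd-backups/snapshot.db
```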

Label: lifecycle/rotten

Most helpful comment

@chrislovecnm Is there any documentation available yet?

From what I can see one could use the following config:

  etcdClusters:
  - backups:
      backupStore: s3://bucket/cluster.example.com/backups/etcd/main/
    etcdMembers:
    - instanceGroup: master-us-west-2a
      name: a
    name: main
  - backups:
      backupStore: s3://bucket/cluster.example.com/backups/etcd/events/
    etcdMembers:
    - instanceGroup: master-us-west-2a
      name: a
    name: events

Can an existing cluster be updated with that? And does it need anything else? How often does it backup per default?

All 31 comments

This isn't really a kops-specific thing either, if we can do this in a non-kops-specific way.

@justinsb do you want to create a command under kops? If so, I think there should also be a command to restore the backup.

@justinsb, I'd be happy to help with this. Do you have a particular design in mind for it? I'm wondering if we want the ability to schedule a regular backup and upload to S3/Cloud Storage?

@robinpercy can the etcd operator be set up for just backups? We should have an addon for this; someone must have an operator. We should have update and rolling-update call the operator as well, when the masters are touched.

@chrislovecnm good call. I'll have a look at what the operator (and others) provide. +1 for calling it before rolling updates.

Looking at the operator, it appears you need to create the etcd cluster via the operator; we would need to dig through the code. We should just make this simple: a cronjob or a pod that calls backup and stores the data on a PVC. An optional addon.

Figured I would chime in here and provide another data point, as I created some backup/restore tooling for etcd3 prior to using kops to manage our k8s clusters.

(none of this applies to etcd2).

The Setup

Similar to kops, we ran a sidecar program that was responsible for bootstrapping the etcd servers, however it would continue running after etcd had started to perform periodic backups.

Backup

Every 10 minutes the sidecar would create a backup via the Snapshot method on the etcd client, and push the resulting data to a versioned S3 bucket.

We ran this every 10 minutes, and set a lifecycle policy to expire old versions of the backups.
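The loop described above might look roughly like this (bucket name, endpoint, and directories are assumptions, not the poster's actual values):

```shell
#!/bin/sh
# Sketch of the 10-minute snapshot-and-upload loop described above.
# BACKUP_DIR, the endpoint, and the S3 prefix are placeholders.

BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"
S3_PREFIX="${S3_PREFIX:-s3://my-bucket/etcd-backups}"

backup_once() {
  snap="$BACKUP_DIR/snapshot-$(date +%Y%m%d%H%M%S).db"
  mkdir -p "$BACKUP_DIR"
  # etcd3 snapshot via the v3 API
  ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 snapshot save "$snap"
  # push to a versioned bucket; a lifecycle policy expires old versions
  aws s3 cp "$snap" "$S3_PREFIX/$(basename "$snap")"
}

# The actual loop (commented out so the sketch can be sourced safely):
# while true; do backup_once; sleep 600; done
```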

Restore

Unfortunately the Etcd client doesn't provide a way to restore without using the CLI.

For our purposes we used the code from the snapshot_command.go source as a guideline, but we could have also done it via shelling out to the CLI tool instead.

On instance startup, the sidecar would check if there was an available backup in S3, and if so restore it to the filesystem before starting Etcd.
If there wasn't a backup, it would start in a new cluster configuration.
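The startup decision described here could be sketched as follows (the bucket, object name, and data dir are placeholders; the restore mirrors what snapshot_command.go does via etcdctl):

```shell
#!/bin/sh
# Sketch of the start-up logic: restore from S3 if a backup exists,
# otherwise bootstrap a new cluster. Bucket and paths are assumptions.

fetch_backup() {
  # succeeds (and writes "$1") only if a backup object exists in S3
  aws s3 cp "s3://my-bucket/etcd-backups/latest.db" "$1" 2>/dev/null
}

start_etcd() {
  snap="${TMPDIR:-/tmp}/restore.db"
  if fetch_backup "$snap"; then
    # rebuild the data dir from the snapshot before starting etcd
    ETCDCTL_API=3 etcdctl snapshot restore "$snap" --data-dir=/var/etcd/data
    echo restore
  else
    # no backup found: start with a fresh cluster configuration
    echo new-cluster
  fi
}
```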

After being sidetracked for the last couple of weeks, here's what I'm thinking. For now I'm targeting etcd2, since it's the only option supported by kops, but I think this will be easily adapted (and even simplified for etcd3).

General Approach

  • Install a privileged pod on each master
  • the pod contains 2 containers and an emptyDir volume
  • container 1 is responsible for backing up the "local" etcd data dir to a known location on the emptyDir volume
  • container 2 is responsible for watching that known location and doing something with the backup
  • we will provide a reasonable default implementation of container 2 that ships the etcd backup to S3
  • users can easily customize this behaviour by overriding the container 2 image.

Key considerations:

  • Backups require access to the filesystem in etcd2, they can't be done remotely
  • We don't want to take backups off of minority members during a partition (thus only backing up the leader)
  • Offsite storage mechanisms will be use-case specific and should be easily customized

I'm currently not clear on what's possible using the add-ons mechanism. In particular, can an add-on be defined in a way that reads the master count before deployment (e.g. will set replicas to 1 or 3 depending on cluster config)? Or is this better done with static manifests on the masters?
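A rough sketch of the pod described above; the image names are placeholders, and the second image is the piece users would override for custom offsite storage:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-backup
spec:
  hostNetwork: true
  volumes:
  - name: backups
    emptyDir: {}
  containers:
  - name: backup              # container 1: writes etcdctl backups to the shared volume
    image: example/etcd-backup:latest       # placeholder image
    volumeMounts:
    - name: backups
      mountPath: /backups
  - name: shipper             # container 2: watches /backups and uploads to S3
    image: example/etcd-backup-s3:latest    # placeholder; override for custom storage
    volumeMounts:
    - name: backups
      mountPath: /backups
```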

@justinsb @chrislovecnm what do you think of the above design?

Having this as an addon is fine, or we could deploy it as a sidecar with etcd. Do we need two containers? A single loop seems reasonable. Should we back up to a PVC or S3? I am thinking a PVC is better for IAM perms; we do not need yet another bucket.

Do we need backups or can we do snapshots? https://github.com/kubernetes-incubator/external-storage/tree/master/snapshot

How do we monitor that backups are occurring?

@chrislovecnm:

So, the problem I see with a PVC is that we'd need one for each backup pod, and we end up with backups spread across each (based on whichever pod happens to be beside the leader at any given time). The two-container approach is just there so users can easily override that "upload" behavior however they want.

I've avoided the snapshot route due to the conversation here:
https://github.com/kubernetes/kubernetes/issues/40027#issuecomment-288930752 and related kops issue: https://github.com/kubernetes/kops/issues/1506. In my experience etcd2 hasn't been very resilient to minor deltas between member data stores. The etcdctl backup approach does make for a more tedious restore process, but it seems to be the only one endorsed by CoreOS (for etcd2).

I'm open to suggestions about monitoring. I was thinking of exposing a prometheus-compatible endpoint on each backup pod that includes the timestamp, size and location of the latest backup. Then users can collect that however they like.

@edulop91 we would love your help with this. I think a separate controller would be best. Protokube is not HA-aware, and would need to be if we put it in protokube.

The benefit of putting it in protokube is having a restore that could be triggered w/o k8s running.

Let's just start iterating and make it awesome through more iterations.

In case it is useful: kube-aws does what you're discussing using systemd and a handy etcdadm script crafted by @mumoshu and friends. It takes 1-minute backups to S3, automatically 'resets' failed etcd nodes, and automatically recovers a failed etcd cluster from the S3 backup. The differences that might not fit your use case are that kube-aws uses etcd3 and dedicated etcd instances, but the logic might still be relevant.

https://github.com/kubernetes-incubator/kube-aws/blob/master/docs/advanced-topics/etcd-backup-and-restore.md

We merged the alpha version of using kopeio etcd manager. This will be available in kops 1.9.

JFYI, in kube-aws an etcd node gets a reset only when its data seems broken.

When the etcd node was terminated by any transient issue, it just comes back with the same identity (same EBS volume and Elastic IP) and continues the job as usual. No reset in this case.

I find this simpler and more reliable, as we don't need to manipulate etcd cluster membership at all.

automatically 'resets' failed etcd nodes, and automatically recovers a failed etcd cluster from the S3 backup

It was a complement to this info!
Not sure whether kops or etcd-manager does things differently, but hope this helps anyway.

@chrislovecnm Is there any documentation available yet?

From what I can see one could use the following config:

  etcdClusters:
  - backups:
      backupStore: s3://bucket/cluster.example.com/backups/etcd/main/
    etcdMembers:
    - instanceGroup: master-us-west-2a
      name: a
    name: main
  - backups:
      backupStore: s3://bucket/cluster.example.com/backups/etcd/events/
    etcdMembers:
    - instanceGroup: master-us-west-2a
      name: a
    name: events

Can an existing cluster be updated with that? And does it need anything else? How often does it backup per default?
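One hypothetical way to roll such a config into an existing cluster would be the usual kops edit/update/rolling-update flow; the cluster name below is a placeholder and this is a sketch, not a confirmed answer to the question:

```shell
# Hypothetical flow for applying a backupStore config to an existing
# cluster; `kops edit` is where the backups/backupStore stanza would be added.

apply_etcd_backup_config() {
  cluster="$1"
  kops edit cluster "$cluster"                    # add backups.backupStore to etcdClusters
  kops update cluster "$cluster" --yes            # apply the spec change
  kops rolling-update cluster "$cluster" --yes    # restart masters so it takes effect
}

# Example (commented out; requires kops and cluster state configured):
# apply_etcd_backup_config cluster.example.com
```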

Backups are being collected every 5 minutes and stored to s3.

Unfortunately my current tests show that only the etcd main cluster is backed up.

I'd rather suggest creating an add-on to enable Ark than snapshotting to S3. It's way easier to manage backups using Ark.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

+1

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


where the hell is this documented?
my main etcd db is being snapshotted to S3, but the events one is not... I have no idea why.

I agree with @marekaf. This should be reopened. It's not complete, as there is no documentation on this functionality.

/remove-lifecycle rotten

I've applied the etcdClusters.backups.backupStore configs, and I can see both main and events for kops 1.11.1 in the S3 bucket. But I only see a JSON file with this content:

{
  "memberCount": 1,
  "etcdVersion": "3.2.24"
}

While my etcd version defined in the master's user-data is

etcdClusters:
  events:
    image: k8s.gcr.io/etcd:2.2.1
    version: 2.2.1

In both events.yaml and main.yaml I can see the command has changed, but there is no tar.gz in the bucket yet. Here is an example of my main.yaml command section:

  - command:
    - /bin/sh
    - -c
    - mkfifo /tmp/pipe; (tee -a /var/log/etcd.log < /tmp/pipe & ) ; exec /etcd-manager
      --backup-store=s3://bucket/cluster.example.com/backups/etcd/main/
      --client-urls=https://__name__:4001 --cluster-name=etcd --containerized=true
      --dns-suffix=.internal.cluster.example.com --etcd-insecure=false --grpc-port=3996
      --insecure=false --peer-urls=https://__name__:2380 --quarantine-client-urls=https://__name__:3994
      --v=6 --volume-name-tag=k8s.io/etcd/main --volume-provider=aws --volume-tag=k8s.io/etcd/main
      --volume-tag=k8s.io/role/master=1 --volume-tag=kubernetes.io/cluster/cluster.example.com=owned
      > /tmp/pipe 2>&1
    image: kopeio/etcd-manager:3.0.20190516
    name: etcd-manager

But the question remains (besides the lack of documentation about this feature): how do we restore the backup?

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
