Velero: Increase visibility of what's inside a backup

Created on 20 Mar 2018 · 12 Comments · Source: vmware-tanzu/velero

I might have missed something, but from what I found in the docs, the way to look inside a backup is to run ark backup describe <name>
I think the information this gives me is pretty limited.

I'd expect to see the Kubernetes resources that have been backed up. Something like this:

$ ark backup describe wp
Name:         wp
Namespace:    heptio-ark
Labels:       <none>
Annotations:  <none>

Namespaces:
  Included:  *
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Phase:  New

Backup Format Version:  0

Expiration:  0001-01-01 00:00:00 +0000 UTC

Validation errors:  <none>

Persistent Volumes: <none included>

Resources:
  Deployments:
  - database
  - wordpress
  Services:
  - database (3306/TCP)
  - wordpress (80:31229/TCP)
  Secrets:
  - database-root-password
  - database-user-password
  ConfigMaps:
  - database

This would really help me keep track of backups and what to restore at what time.

Labels: Enhancement/User, Help wanted, P2 - Long-term important

All 12 comments

We'd need to figure out how to do this without blowing up etcd. Etcd values are capped at 1.5MB, so if you're backing up an entire cluster, I don't think we can store all of that information on the backup object itself. We could include a manifest in object storage, and have describe download the manifest and display it.
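To make the manifest idea concrete, here is a minimal sketch of what such a manifest could look like: a map from resource kind to the "namespace/name" keys of the backed-up items, serialized as gzip-compressed JSON so it can live next to the backup tarball in object storage instead of on the backup object in etcd. The ResourceList type, its Add and Encode methods, and the Decode function are hypothetical names chosen for illustration; they are not taken from the ark codebase.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"sort"
)

// ResourceList maps a resource kind to the "namespace/name" keys of the
// items backed up for that kind. The layout is hypothetical.
type ResourceList map[string][]string

// Add records one backed-up item. Cluster-scoped items pass an empty namespace.
func (r ResourceList) Add(kind, namespace, name string) {
	key := name
	if namespace != "" {
		key = namespace + "/" + name
	}
	r[kind] = append(r[kind], key)
}

// Encode sorts the items of each kind and returns gzip-compressed JSON,
// ready to be uploaded as a single object.
func (r ResourceList) Encode() ([]byte, error) {
	for _, items := range r {
		sort.Strings(items)
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(r); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Decode reverses Encode. A describe-style command could call it on the
// downloaded object.
func Decode(data []byte) (ResourceList, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var out ResourceList
	if err := json.NewDecoder(zr).Decode(&out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	list := ResourceList{}
	list.Add("Deployment", "default", "wordpress")
	list.Add("Deployment", "default", "database")
	list.Add("Namespace", "", "default")

	data, err := list.Encode()
	if err != nil {
		panic(err)
	}
	decoded, err := Decode(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d kinds in %d compressed bytes\n", len(decoded), len(data))
}
```

Storing only kinds, namespaces and names keeps the object small even for a whole-cluster backup, and the size of the backup object in etcd does not grow with the number of resources.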

@ncdc that makes sense. How do I start hacking on this? I've been trying to read through the code, but I'm not super familiar with how it is structured. Where should I start looking? Also, would love if there is an architecture doc that'd guide me through this.

If you just want to hack, you could either add a method to the BackupService interface:

https://github.com/heptio/ark/blob/master/pkg/cloudprovider/backup_service.go#L38-L65

or add yet another arg to the UploadBackup method.

This is where we call UploadBackup:

https://github.com/heptio/ark/blob/master/pkg/controller/backup_controller.go#L371

We do plan on creating some architecture docs. Just not done yet...

It would be great if there was more information about each backup in metadata stored in the bucket (not in etcd), e.g., the contents of the k8s manifests and file data from snapshotted volumes (file names, dates, and hashes).

@robbyt we are planning on adding metadata in the bucket about what's in it (resource types + namespaces + names). I doubt we'll look at file metadata in the short term, though. That's significantly more work.

I think we should be careful about dumping too much information into the output of the describe command. I might propose spinning out a new CLI command, inspect, that allows the user to see the contents of a particular backup.

I like the idea of inspect, or we could add a flag to describe to show it.

@skriss for 1.1 are we just interested in showing the resource kinds, namespaces and names? Perhaps we can create a separate issue for an inspect command?

kinds, namespaces and names are a good place to start. Since this issue was originally created, we've added a --details flag to the velero backup describe command - so one option is to show this new info only when that flag is specified, rather than creating an entirely separate command.

Sounds good, it sounded like the discussion here was that an inspect command could show the full k8s manifests that were backed up, but certainly using --details for kinds, namespaces and names makes sense.

Yeah, starting with showing kinds/namespaces/names for velero backup describe --details makes sense. Since we already have velero backup download, I'm not sure there's much additional utility in an additional CLI command that displays the contents of the files in the tarball.

Splitting this work up into 2/3 PRs:

  • [x] store backup resource list in object storage (#1709)
  • [x] DownloadRequest changes to fetch backup resource list from object storage (#1714)
  • [ ] CLI changes to display backup resource list in velero backup describe --details (#1714)

Might combine the last 2 depending on length
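For the CLI step, a minimal sketch of the rendering could look like the following. It assumes the resource list has already been fetched and decoded into a map from kind to "namespace/name" entries; FormatResourceList and the exact output layout are illustrative assumptions, not the final describe output.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// FormatResourceList renders a kind -> items map as an indented, sorted
// listing suitable for appending to describe-style output.
// The name and the layout are hypothetical.
func FormatResourceList(list map[string][]string) string {
	if len(list) == 0 {
		return "Resource List:  <none>\n"
	}

	// Sort kinds so the output is stable between runs.
	kinds := make([]string, 0, len(list))
	for kind := range list {
		kinds = append(kinds, kind)
	}
	sort.Strings(kinds)

	var b strings.Builder
	b.WriteString("Resource List:\n")
	for _, kind := range kinds {
		fmt.Fprintf(&b, "  %s:\n", kind)
		// Copy before sorting so the caller's slice is left untouched.
		items := append([]string(nil), list[kind]...)
		sort.Strings(items)
		for _, item := range items {
			fmt.Fprintf(&b, "    - %s\n", item)
		}
	}
	return b.String()
}

func main() {
	fmt.Print(FormatResourceList(map[string][]string{
		"v1/Service":         {"default/wordpress", "default/database"},
		"apps/v1/Deployment": {"default/wordpress", "default/database"},
	}))
}
```

Printing this section only when --details is set keeps the default describe output short for large backups.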
