What steps did you take and what happened:
I installed Velero + restic on a Scaleway Kubernetes cluster. After trying some simple backups, I added persistent volume backup for pods in a namespace.
I created the backup:
velero backup create gitea --include-namespaces=git
No bucket is created, and the backup stays stuck in "InProgress":
velero backup get
NAME STATUS CREATED EXPIRES STORAGE LOCATION SELECTOR
gitea InProgress 0001-01-01 00:00:00 +0000 UTC 29d default <none>
Note that other backups, without persistent volume backup, work well: buckets are created and backups are stored.
What did you expect to happen:
Backup of the namespace + volumes in my bucket.
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml

Name: gitea
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: <none>
Phase: InProgress
Namespaces:
  Included: git
  Excluded: <none>

Resources:
  Included: *
  Excluded: <none>
  Cluster-scoped: auto

Label selector: <none>

Storage Location: default

Snapshot PVs: auto

TTL: 720h0m0s

Hooks: <none>

Backup Format Version: 1

Started: <n/a>
Completed: <n/a>

Expiration: 2019-04-03 12:15:09 +0200 CEST

Validation errors: <none>

Persistent Volumes: <none included>
velero backup logs <backupname>

velero backup logs gitea
An error occurred: request failed: <?xml version='1.0' encoding='UTF-8'?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><RequestId>tx26c87a0b00024765b7a2f-005c7cfe63</RequestId><Key>backups/gitea/gitea-logs.gz</Key></Error>
Anything else you would like to add:
Environment:
velero version: v0.11.0
/etc/os-release: CentOS 7.4 on the server, Fedora 29 for the client

OK, found the problem, and I cannot fix it.
velero restic repo get -o yaml
message: |-
  error running command=restic init --repo=s3:https://s3.nl-ams.scw.cloud/velero/restic/git --password-file=/tmp/velero-restic-credentials-git393484349 --cache-dir=/scratch/.cache/restic, stdout=, stderr=Fatal: create repository at s3:https://s3.nl-ams.scw.cloud/velero/restic/git failed: client.BucketExists: The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'nl-ams'
It's the same problem I had with the "mc" command I tried before; I needed to use the "s3v2" API version to avoid it.
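For reference, the mc workaround looked roughly like this (a sketch; the alias name "scw" and the keys are placeholders, not from my actual setup):

```
# Forcing the older S3 signature version avoids the us-east-1 signing problem.
mc alias set scw https://s3.nl-ams.scw.cloud <ACCESS_KEY> <SECRET_KEY> --api S3v2
mc ls scw/velero
```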
Is there a way to fix this?
So it seems that "restic" doesn't use the provided region parameter.
To be sure that everything is OK, here is my BackupStorageLocation:
```
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: velero
  config:
    region: nl-ams
    s3ForcePathStyle: "true"
    s3Url: https://s3.nl-ams.scw.cloud
```
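For what it's worth, the credentials, bucket, and region can be confirmed outside of restic (assuming the aws CLI is fed the same Scaleway keys):

```
# If this listing succeeds, the BSL config and credentials are fine,
# and restic itself is the culprit.
aws s3 ls s3://velero --endpoint-url https://s3.nl-ams.scw.cloud --region nl-ams
```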
Hmm. I'm guessing this is an issue within restic itself. Could you try manually creating a restic repo in your object storage? The restic docs should explain how to do that. Let me know if that does or does not work.
You're right:
restic -r s3:https://s3.nl-ams.scw.cloud/restic init
Fatal: create repository at s3:https://s3.nl-ams.scw.cloud/restic failed: client.BucketExists: The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'nl-ams'
I found this bug report: https://github.com/restic/restic/issues/2023, where it is explained that we can use restic's "rclone" backend with a preexisting bucket. I don't know if it could be used with Velero.
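Something like the following is what that restic issue describes (a sketch; the rclone remote name "scw" and the repo path are assumptions):

```
# Configure an rclone remote for Scaleway, then point restic at the
# pre-existing bucket through restic's rclone backend.
rclone config create scw s3 provider=Scaleway endpoint=s3.nl-ams.scw.cloud region=nl-ams
restic -r rclone:scw:velero/restic/git init
```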
Hello there,
I am currently facing the same issue with Scaleway.
@metal3d did you find a workaround for this issue?
Same here. @metal3d have you found a workaround in the meantime? And you, @Hyrsham?
Hello people,
I've been able to set up Velero to use Scaleway's Kubernetes & Object Storage in the following way:
- create a bucket (velero-test-newton in the fr-par region)
- brew install velero
- velero install:

```
velero install \
  --provider velero.io/aws \
  --bucket velero-test-newton \
  --plugins velero/velero-plugin-for-aws:v1.0.0 \
  --backup-location-config s3Url=https://s3.fr-par.scw.cloud,region=fr-par \
  --use-volume-snapshots=false \
  --secret-file=./credentials \
  --kubeconfig ./kubeconfig-k8s-serene-bardeen.yaml
```
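For completeness, the ./credentials file referenced above follows the AWS shared-credentials format that the plugin expects (the key names below are placeholders):

```
# ./credentials — consumed by velero install via --secret-file
[default]
aws_access_key_id = <SCW_ACCESS_KEY>
aws_secret_access_key = <SCW_SECRET_KEY>
```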
- create an example app in Kubernetes
```
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-logs
  namespace: nginx-example
  labels:
    app: nginx
spec:
  storageClassName: do-block-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
  namespace: nginx-example
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
        - name: nginx-logs
          persistentVolumeClaim:
            claimName: nginx-logs
      containers:
        - image: nginx:stable
          name: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/var/log/nginx"
              name: nginx-logs
              readOnly: false
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: nginx-svc
  namespace: nginx-example
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
```
- `kubectl apply -f ./nginx-example --kubeconfig kubeconfig-k8s-serene-bardeen.yaml`
- `velero backup create nginx-backup --selector app=nginx --kubeconfig kubeconfig-k8s-serene-bardeen.yaml`
- `velero backup describe nginx-backup --kubeconfig kubeconfig-k8s-serene-bardeen.yaml`
Gives
```
Name: nginx-backup
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: <none>
Phase: Completed
Namespaces:
  Included: *
  Excluded: <none>

Resources:
  Included: *
  Excluded: <none>
  Cluster-scoped: auto

Label selector: app=nginx

Storage Location: default

Snapshot PVs: auto

TTL: 720h0m0s

Hooks: <none>

Backup Format Version: 1

Started: 2020-01-16 13:53:37 +0100 CET
Completed: 2020-01-16 13:53:40 +0100 CET

Expiration: 2020-02-15 13:53:37 +0100 CET

Persistent Volumes: <none included>
```
Hi @newtoncorp, if you do a describe with --details, does it show that the volume has been backed up correctly? And how come the storage class in the PVC is DigitalOcean's if you are on Scaleway's managed Kubernetes? I installed Velero exactly the same way, but Restic wouldn't restore PVCs from Scaleway Object Storage. I haven't tried backups, because I was just restoring to a new cluster after migrating the backups from Exoscale to Scaleway. Because Scaleway doesn't seem so S3-compatible, and because I need versioning (which Exoscale doesn't support), I switched to DigitalOcean Spaces and all is good. A shame though, because Scaleway offers 75GB of free storage and is very cheap...
Can you clarify how you got backups and restores working with Scaleway?
Hello again,
I made a mistake while testing (as I don't really know Kubernetes): I removed the storage class, since the default one on Scaleway's Kubernetes is good. So the pod is running now.
16:58:44 ~/Work/tutorials/velero » kubectl get pods --all-namespaces --kubeconfig kubeconfig-k8s-serene-bardeen.yaml | grep nginx-example
nginx-example nginx-deploy-694c85cdc8-25jk9 1/1 Running 0 89s
------------------------------------------------------------
17:00:42 ~/Work/tutorials/velero » velero backup create nginx-backup-scw --selector app=nginx --kubeconfig kubeconfig-k8s-serene-bardeen.yaml
Backup request "nginx-backup-scw" submitted successfully.
Run `velero backup describe nginx-backup-scw` or `velero backup logs nginx-backup-scw` for more details.
------------------------------------------------------------
17:00:49 ~/Work/tutorials/velero » velero backup describe nginx-backup-scw --details --kubeconfig kubeconfig-k8s-serene-bardeen.yaml
Name: nginx-backup-scw
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: <none>
Phase: Completed
Namespaces:
  Included: *
  Excluded: <none>

Resources:
  Included: *
  Excluded: <none>
  Cluster-scoped: auto

Label selector: app=nginx

Storage Location: default

Snapshot PVs: auto

TTL: 720h0m0s

Hooks: <none>

Backup Format Version: 1

Started: 2020-01-16 17:00:50 +0100 CET
Completed: 2020-01-16 17:00:53 +0100 CET

Expiration: 2020-02-15 17:00:50 +0100 CET

Resource List:
  apiextensions.k8s.io/v1beta1/CustomResourceDefinition:
    - ciliumidentities.cilium.io
  apps/v1/Deployment:
    - nginx-example/nginx-deploy
  apps/v1/ReplicaSet:
    - nginx-example/nginx-deploy-694c85cdc8
  cilium.io/v2/CiliumIdentity:
    - 17697
  v1/Endpoints:
    - nginx-example/nginx-svc
  v1/Namespace:
    - nginx-example
  v1/PersistentVolume:
    - pvc-26ec25fe-6928-425c-b940-6cb055e90fe0
  v1/PersistentVolumeClaim:
    - nginx-example/nginx-logs
  v1/Pod:
    - nginx-example/nginx-deploy-694c85cdc8-25jk9
  v1/Service:
    - nginx-example/nginx-svc

Persistent Volumes: <none included>
------------------------------------------------------------
17:01:24 ~/Work/tutorials/velero » kubectl delete -f ./nginx-example --kubeconfig kubeconfig-k8s-serene-bardeen.yaml
namespace "nginx-example" deleted
persistentvolumeclaim "nginx-logs" deleted
deployment.apps "nginx-deploy" deleted
service "nginx-svc" deleted
------------------------------------------------------------
17:02:01 ~/Work/tutorials/velero » velero restore create toto-scw --from-backup nginx-backup-scw --kubeconfig kubeconfig-k8s-serene-bardeen.yaml
Restore request "toto-scw" submitted successfully.
Run `velero restore describe toto-scw` or `velero restore logs toto-scw` for more details.
------------------------------------------------------------
17:02:40 ~/Work/tutorials/velero » velero restore describe toto-scw --kubeconfig kubeconfig-k8s-serene-bardeen.yaml --details
Name: toto-scw
Namespace: velero
Labels: <none>
Annotations: <none>
Phase: Completed
Backup: nginx-backup-scw
Namespaces:
  Included: *
  Excluded: <none>

Resources:
  Included: *
  Excluded: nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped: auto

Namespace mappings: <none>

Label selector: <none>

Restore PVs: auto
------------------------------------------------------------
17:02:54 ~/Work/tutorials/velero » kubectl get pods --all-namespaces --kubeconfig kubeconfig-k8s-serene-bardeen.yaml | grep nginx-example
nginx-example nginx-deploy-694c85cdc8-25jk9 1/1 Running 0 24s
------------------------------------------------------------
17:05:50 ~/Work/tutorials/velero » aws s3 ls --recursive s3://velero-test-newton
2020-01-16 17:00:53 2535 backups/nginx-backup-scw/nginx-backup-scw-logs.gz
2020-01-16 17:00:53 29 backups/nginx-backup-scw/nginx-backup-scw-podvolumebackups.json.gz
2020-01-16 17:00:53 286 backups/nginx-backup-scw/nginx-backup-scw-resource-list.json.gz
2020-01-16 17:00:53 29 backups/nginx-backup-scw/nginx-backup-scw-volumesnapshots.json.gz
2020-01-16 17:00:53 3705 backups/nginx-backup-scw/nginx-backup-scw.tar.gz
2020-01-16 17:00:53 873 backups/nginx-backup-scw/velero-backup.json
2020-01-16 13:53:40 2272 backups/nginx-backup/nginx-backup-logs.gz
2020-01-16 13:53:40 29 backups/nginx-backup/nginx-backup-podvolumebackups.json.gz
2020-01-16 13:53:40 177 backups/nginx-backup/nginx-backup-resource-list.json.gz
2020-01-16 13:53:40 29 backups/nginx-backup/nginx-backup-volumesnapshots.json.gz
2020-01-16 13:53:40 2509 backups/nginx-backup/nginx-backup.tar.gz
2020-01-16 13:53:40 865 backups/nginx-backup/velero-backup.json
2020-01-16 17:02:42 1139 restores/toto-scw/restore-toto-scw-logs.gz
2020-01-16 17:02:42 49 restores/toto-scw/restore-toto-scw-results.gz
2020-01-16 16:48:20 900 restores/toto/restore-toto-logs.gz
2020-01-16 16:48:20 200 restores/toto/restore-toto-results.gz
2020-01-16 16:52:09 816 restores/toto2/restore-toto2-logs.gz
2020-01-16 16:52:09 49 restores/toto2/restore-toto2-results.gz
2020-01-16 16:52:56 813 restores/toto3/restore-toto3-logs.gz
2020-01-16 16:52:57 49 restores/toto3/restore-toto3-results.gz
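Side note: for the `aws s3 ls` above to list the Scaleway bucket rather than AWS itself, the endpoint has to be overridden, either in the CLI profile or explicitly:

```
# Explicit equivalent of the listing above (fr-par, per this setup).
aws s3 ls --recursive s3://velero-test-newton \
  --endpoint-url https://s3.fr-par.scw.cloud --region fr-par
```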
Maybe something you are missing is a PVC translation? I can ask someone from the Kubernetes team to answer if you need more help.
The issue is that the restic version is not the latest one. Velero uses 0.9.5, but the fix for using a custom region arrived in restic 0.9.6. The only solution is to update Velero to use a recent restic release. cc @metal3d @vitobotta
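A quick way to check which restic a given deployment actually ships (assuming the default velero namespace and that the restic binary is bundled in the Velero server image, as in a stock install):

```
# Print the bundled restic version straight from the running pod
# (kubectl exec accepts deploy/<name> on reasonably recent kubectl).
kubectl -n velero exec deploy/velero -- restic version
```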
@newtoncorp In my case I only tried restoring and not backing up. Perhaps the problem is mainly with restores, I have no idea. @Sh4d1 For now I have switched back to DigitalOcean Spaces since their API is very compatible with S3 and I have no problems with backups/restores, CORS/direct uploads, versioning... Scaleway is cheaper though so I will try it again when Velero/Restic is upgraded.
@vitobotta what issue exactly do you have when restoring? And on what type of cluster (managed or not, and where) ?
@Sh4d1 It was giving me two errors: a weird "CPU something" one, which I never got with other storage backends, and another about the region being wrong or something like that. But the settings were definitely correct.
Tested with restic 0.9.6 and it works:
Restic Backups:
  Completed:
    nginx-example/nginx-deploy-694c85cdc8-nn55h: nginx-logs
Hi @Sh4d1 cool, I'm going to try again. What should I do with the current version of Velero to try this?
@vitobotta you can either use the master version of the container image or wait for a new release.
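For anyone impatient, switching a stock install to the master build should look roughly like this (a sketch; the image tag and container names are assumptions based on the default install):

```
# Repoint both the server deployment and the restic daemonset
# at the master-built image.
kubectl -n velero set image deployment/velero velero=velero/velero:master
kubectl -n velero set image daemonset/restic restic=velero/velero:master
```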
@nrb any idea when a new release is going to be cut?
@Sh4d1 Cool, will give it a try, thanks! :)
@Sh4d1 We're aiming for v1.3 at the end of March, and a v1.2.1 to address CRD restoration issues sometime this week. @skriss do you think we could include the restic version bump in 1.2.1?
We could consider it since it's just a patch version; I can take a look at the release notes and see if it looks like there's anything risky for us.
@Sh4d1 Awesome, with master I was able to restore just fine 👍
Yay 🎉 and with a little bit of luck it'll end up in 1.2.1 😁
@Sh4d1 I am having problems with backups though :(
Fatal: invalid id \"fefb274a\": no matching ID found\n: unable to find summary in restic backup command output
What can cause this? I migrated the bucket from DO to Scaleway; with DO I didn't have this problem. Any idea?
Hmm good question 🤔 @newtoncorp any idea? 😅
Could be caused by https://github.com/restic/restic/issues/2389, apparently.
cf https://github.com/vmware-tanzu/velero/blob/master/pkg/restic/exec_commands.go#L164
Nope, I think I read too fast :sweat_smile:
@Sh4d1 I'm having problems with stale locks as well. Argh, restic unlock doesn't work.
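For reference, a manual unlock against a Velero-managed repo looks roughly like this (a sketch under assumptions: Velero's default restic layout of <bucket>/restic/<namespace>, and the repo password kept in the velero-restic-credentials secret of a default install):

```
# Pull the repo password Velero uses, then remove all locks
# (secret name/key and the harbor namespace are assumptions).
export RESTIC_PASSWORD=$(kubectl -n velero get secret velero-restic-credentials \
  -o jsonpath='{.data.repository-password}' | base64 -d)
export AWS_ACCESS_KEY_ID=<SCW_ACCESS_KEY> AWS_SECRET_ACCESS_KEY=<SCW_SECRET_KEY>
restic -r s3:https://s3.fr-par.scw.cloud/<bucket>/restic/harbor unlock --remove-all
```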
@vitobotta only on Scaleway's S3?
How did you migrate the bucket? I had no problem when backing up directly to Scaleway's S3 🤔
I managed to remove the lock on one repo. Now I am trying to back up one thing at a time. I migrated the backups with rclone.
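(For context, that kind of migration is typically a single rclone sync; the remote and bucket names below are made up:)

```
# Mirror the whole Velero bucket layout from the old provider to Scaleway.
rclone sync do-spaces:old-velero-bucket scw:new-velero-bucket --progress
```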
I managed to back up redis and drone, but it's stuck backing up harbor for some reason with no errors...
Hmm, weird... I'm really not an object storage expert, so I can't really help, I think :(
Maybe worth opening another issue with some details though :)
I see the problem. Velero says that the volume is being backed up with restic, but in the restic logs there is no sign of this backup. And apparently I can't delete it because it's in progress.
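A way to cross-check the two sides, assuming the default velero namespace and the stock restic daemonset labels:

```
# What Velero thinks is in flight vs. what restic is actually logging.
kubectl -n velero get podvolumebackups
kubectl -n velero logs -l name=restic --tail=100
```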
I restarted the pods and the backup after deleting the previous one, and I get this in the restic logs:
restic-cl22h restic time="2020-01-21T23:20:36Z" level=info msg="No parent snapshot found for PVC, not using --parent flag for this backup" backup=velero/test-harbor controller=pod-volume-backup logSource="pkg/controller/pod_volume_backup_controller.go:253" name=test-harbor-z6xfd namespace=velero
restic-cl22h restic panic: runtime error: slice bounds out of range
restic-cl22h restic
restic-cl22h restic goroutine 94 [running]:
restic-cl22h restic github.com/vmware-tanzu/velero/pkg/restic.getLastLine(0xc000770000, 0x0, 0x200, 0x0, 0xc0000aa001, 0x0)
restic-cl22h restic /go/src/github.com/vmware-tanzu/velero/pkg/restic/exec_commands.go:157 +0xcf
restic-cl22h restic github.com/vmware-tanzu/velero/pkg/restic.RunBackup.func1(0xc0006aae10, 0x1f999e0, 0xc00063e640, 0xc0006aadb0, 0xc000702480)
restic-cl22h restic /go/src/github.com/vmware-tanzu/velero/pkg/restic/exec_commands.go:99 +0x129
restic-cl22h restic created by github.com/vmware-tanzu/velero/pkg/restic.RunBackup
restic-cl22h restic /go/src/github.com/vmware-tanzu/velero/pkg/restic/exec_commands.go:94 +0x153
You'd better open up a new issue with this problem, I think, @vitobotta :stuck_out_tongue:
OK...
@vitobotta I'll try to take a look this week though :)
Thanks. I am thinking of trying with an empty bucket on Scaleway. I have migrated everything to this new cluster and don't want to repeat it all once again...
@Sh4d1 Everything worked just fine and I was able to do a full backup (including 6 volumes) with an empty bucket 👍 Apparently it didn't like the existing backups for some reason :D
@vitobotta ah, perfect, glad to hear it 😄