What steps did you take and what happened:
I created a backup that included some PVCs backed up using restic (with the backup.velero.io/backup-volumes annotation). The backup completed successfully. I was using S3. These PVCs are bound to normal EBS PVs, so I know I could use snapshots, but I was trying out restic.
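For reference, a sketch of how the volumes were annotated (not my exact manifests; the pod and volume names match the backup output below), with the annotation value being the comma-separated volume names:
# annotate the 'data' volume of this pod for restic backup (normally set in the pod template)
kubectl -n sgs annotate pod etcd-0-79894c7d66-jv9hz backup.velero.io/backup-volumes=data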
I then deleted the cluster, set up a new kops cluster, installed Velero using the same manifests, and created a restore.
The restore then gets stuck pending.
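For reference, the restore was created with something like:
velero restore create --from-backup bts3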
What did you expect to happen:
Everything to restore.
The output of the following commands will help us better understand what's going on:
kubectl logs deployment/velero -n velero
<snip>
time="2019-03-14T01:55:43Z" level=info msg="Restoring cluster level resource 'persistentvolumes' from: /tmp/725321603/resources/persistentvolumes/cluster" backup=bts3 logSource="pkg/restore/restore.go:696" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="Getting client for /v1, Kind=PersistentVolume" backup=bts3 logSource="pkg/restore/restore.go:754" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-50d4544c-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="Attempting to restore PersistentVolume: pvc-50d4544c-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:43Z" level=info msg="error restoring pvc-50d4544c-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="successfully restored persistent volume from snapshot" backup=bts3 logSource="pkg/restore/restore.go:1166" persistentVolume=pvc-510b74ed-4526-11e9-9f1f-0aa34d933732 providerSnapshotID=snap-094014b90adb832dc restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-510b74ed-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="error restoring pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="error restoring pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="No snapshot found for persistent volume" backup=bts3 logSource="pkg/restore/restore.go:1148" persistentVolume=pvc-54284933-4526-11e9-9f1f-0aa34d933732 restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolume: pvc-54284933-4526-11e9-9f1f-0aa34d933732" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="error restoring pvc-54284933-4526-11e9-9f1f-0aa34d933732: <nil>" backup=bts3 logSource="pkg/restore/restore.go:964" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Restoring resource 'persistentvolumeclaims' into namespace 'sgs' from: /tmp/725321603/resources/persistentvolumeclaims/namespaces/sgs" backup=bts3 logSource="pkg/restore/restore.go:694" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Getting client for /v1, Kind=PersistentVolumeClaim" backup=bts3 logSource="pkg/restore/restore.go:754" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: etcd-pv-0" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: etcd-pv-1" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: etcd-pv-2" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: postgres-pv-0" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:55:44Z" level=info msg="Attempting to restore PersistentVolumeClaim: postgres-pv-1" backup=bts3 logSource="pkg/restore/restore.go:903" restore=velero/bts3-20190313185509
time="2019-03-14T01:56:43Z" level=warning msg="Timeout reached waiting for persistent volume pvc-50d4544c-4526-11e9-9f1f-0aa34d933732 to become ready" backup=bts3 logSource="pkg/restore/restore.go:830" restore=velero/bts3-20190313185509
<snip>
velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
Name: bts3
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: <none>
Phase: Completed
Namespaces:
Included: *
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Snapshot PVs: auto
TTL: 720h0m0s
Hooks: <none>
Backup Format Version: 1
Started: 2019-03-12 18:48:59 -0700 PDT
Completed: 2019-03-12 18:51:17 -0700 PDT
Expiration: 2019-04-11 18:48:59 -0700 PDT
Validation errors: <none>
Persistent Volumes:
pvc-510b74ed-4526-11e9-9f1f-0aa34d933732:
Snapshot ID: snap-094014b90adb832dc
Type: gp2
Availability Zone: us-west-2c
IOPS: <N/A>
Restic Backups:
New:
sgs/etcd-0-79894c7d66-jv9hz: data, data, data
sgs/etcd-1-74d4fdd498-9wdxw: data, data, data
sgs/etcd-2-bbdb44bdd-xlb27: data, data, data
sgs/postgres-0-fd9f5dfd4-fql6b: data, data
sgs/postgres-0-fd9f5dfd4-h6mks: data
Anything else you would like to add:
The restore also hangs, so there are no logs for the restore itself, only what I extracted from the velero pod.
Also, this is the output of kubectl get pvc:
sgs etcd-pv-0 Lost pvc-53b5ffce-4526-11e9-9f1f-0aa34d933732 0 gp2-us-west-2c 8m
sgs etcd-pv-1 Lost pvc-53f0ace1-4526-11e9-9f1f-0aa34d933732 0 gp2-us-west-2c 8m
sgs etcd-pv-2 Lost pvc-54284933-4526-11e9-9f1f-0aa34d933732 0 gp2-us-west-2c 8m
sgs postgres-pv-0 Lost pvc-50d4544c-4526-11e9-9f1f-0aa34d933732 0 gp2-us-west-2c 8m
sgs postgres-pv-1 Bound pvc-510b74ed-4526-11e9-9f1f-0aa34d933732 100Gi RWO gp2-us-west-2c 8m
postgres-pv-1 is a volume I forgot to set the annotation on, so it used a normal snapshot. The other PVCs had the annotation set.
Environment:
velero version:
kubectl version:
/etc/os-release:

It doesn't look like any of your restic backups ran. Can you confirm you deployed the restic daemonset, and that the pods are running?
kubectl -n velero get daemonsets
kubectl -n velero get pods
The restic pods were running, one per node. How can you tell the restic backups didn't run? The backup phase was Completed and there were no errors, so I assumed they all ran successfully. Or do you mean the restores?
from your velero backup describe output:
Restic Backups:
New:
sgs/etcd-0-79894c7d66-jv9hz: data, data, data
sgs/etcd-1-74d4fdd498-9wdxw: data, data, data
sgs/etcd-2-bbdb44bdd-xlb27: data, data, data
sgs/postgres-0-fd9f5dfd4-fql6b: data, data
sgs/postgres-0-fd9f5dfd4-h6mks: data
If they had run, they would show as Completed. The fact that they're still in the New phase implies they weren't even picked up by the relevant controller. Do you still have the setup running? I'd want to see the pods, and the logs from the restic pods.
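If you do spin it back up, the underlying PodVolumeBackup custom resources would also be worth a look -- something like this (assuming they carry the velero.io/backup-name label, which they should):
kubectl -n velero get podvolumebackups -l velero.io/backup-name=bts3 -o yaml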
I tore it down, but happy to recreate it for you if that would help. Give me a few minutes
Huh, when trying to recreate it I didn't get the same result. Looks like the restic backups say Completed this time:
Name: bts5
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: <none>
Phase: Completed
Namespaces:
Included: *
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Snapshot PVs: auto
TTL: 720h0m0s
Hooks: <none>
Backup Format Version: 1
Started: 2019-03-14 11:32:52 -0700 PDT
Completed: 2019-03-14 11:35:10 -0700 PDT
Expiration: 2019-04-13 11:32:52 -0700 PDT
Validation errors: <none>
Persistent Volumes: <none included>
Restic Backups:
Completed:
sgs/etcd-0-79894c7d66-lzgmq: data
sgs/etcd-1-74d4fdd498-lrt8h: data
sgs/etcd-2-bbdb44bdd-75fhl: data
sgs/postgres-0-fd9f5dfd4-29kss: data
sgs/postgres-1-5dc54b4dc6-85q8t: data
I don't recall doing anything different, but now I know to check for that. If this restores successfully I'll close the issue
OK sounds good - this one looks right!
The restore appears to have hung on:
time="2019-03-14T18:54:34Z" level=info msg="Restoring resource 'volumesnapshotlocations.velero.io' into namespace 'velero' from: /tmp/755586717/resources/volumesnapshotlocations.velero.io/namespaces/velero" backup=bts5 logSource="pkg/restore/restore.go:694" restore=velero/bts5-20190314115236
time="2019-03-14T18:54:34Z" level=info msg="Getting client for velero.io/v1, Kind=VolumeSnapshotLocation" backup=bts5 logSource="pkg/restore/restore.go:754" restore=velero/bts5-20190314115236
time="2019-03-14T18:54:34Z" level=info msg="Attempting to restore VolumeSnapshotLocation: aws-default" backup=bts5 logSource="pkg/restore/restore.go:903" restore=velero/bts5-20190314115236
time="2019-03-14T18:54:34Z" level=info msg="Skipping restore of VolumeSnapshotLocation: aws-default because it already exists in the cluster and is unchanged from the backed up version" backup=bts5 logSource="pkg/restore/restore.go:926" restore=velero/bts5-20190314115236
time="2019-03-14T18:54:35Z" level=error msg="No ready channel found for repository velero/sgs-default-flllc" logSource="pkg/restic/repository_ensurer.go:78"
Full log output:
https://gist.github.com/immesys/f2fc1d0efa5bfcfefcd213c75c411e0b
The restic pods all say
time="2019-03-14T18:48:58Z" level=info msg="Setting log-level to INFO"
time="2019-03-14T18:48:58Z" level=info msg="Starting Velero restic server ab2fc65c02ecdcb9bb1690e3afe67a3b3196aebc" logSource="pkg/cmd/cli/restic/server.go:57"
time="2019-03-14T18:48:58Z" level=info msg="Starting controllers" logSource="pkg/cmd/cli/restic/server.go:145"
time="2019-03-14T18:48:58Z" level=info msg="Controllers started successfully" logSource="pkg/cmd/cli/restic/server.go:186"
time="2019-03-14T18:48:58Z" level=info msg="Starting controller" controller=pod-volume-backup logSource="pkg/controller/generic_controller.go:76"
time="2019-03-14T18:48:58Z" level=info msg="Waiting for caches to sync" controller=pod-volume-backup logSource="pkg/controller/generic_controller.go:79"
time="2019-03-14T18:48:58Z" level=info msg="Starting controller" controller=pod-volume-restore logSource="pkg/controller/generic_controller.go:76"
time="2019-03-14T18:48:58Z" level=info msg="Waiting for caches to sync" controller=pod-volume-restore logSource="pkg/controller/generic_controller.go:79"
time="2019-03-14T18:48:59Z" level=info msg="Caches are synced" controller=pod-volume-backup logSource="pkg/controller/generic_controller.go:83"
time="2019-03-14T18:48:59Z" level=info msg="Caches are synced" controller=pod-volume-restore logSource="pkg/controller/generic_controller.go:83"
what does velero restore describe NAME --details show?
Name: bts5-20190314115236
Namespace: velero
Labels: <none>
Annotations: <none>
Backup: bts5
Namespaces:
Included: *
Excluded: <none>
Resources:
Included: *
Excluded: nodes, events, events.events.k8s.io, backups.ark.heptio.com, backups.velero.io, restores.ark.heptio.com, restores.velero.io
Cluster-scoped: auto
Namespace mappings: <none>
Label selector: <none>
Restore PVs: auto
Phase: InProgress
Validation errors: <none>
Warnings: <none>
Errors: <none>
Restic Restores:
New:
sgs/etcd-0-79894c7d66-lzgmq: data
sgs/etcd-1-74d4fdd498-lrt8h: data
sgs/etcd-2-bbdb44bdd-75fhl: data
sgs/postgres-0-fd9f5dfd4-29kss: data
sgs/postgres-1-5dc54b4dc6-85q8t: data
can you describe a couple of these pods, e.g. kubectl -n sgs describe pod etcd-0-79894c7d66-lzgmq?
etcd-0
Name: etcd-0-79894c7d66-lzgmq
Namespace: sgs
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: cluster=etcd
name=etcd-0
pod-template-hash=3545073822
velero-restore=bts5-20190314115236
velero.io/backup-name=bts5
velero.io/restore-name=bts5-20190314115236
Annotations: backup.velero.io/backup-volumes: data
snapshot.velero.io/data: 4f564002
Status: Pending
IP:
Controlled By: ReplicaSet/etcd-0-79894c7d66
Init Containers:
restic-wait:
Image: gcr.io/heptio-images/velero-restic-restore-helper:v0.11.0
Port: <none>
Host Port: <none>
Args:
55541829-468a-11e9-8017-0a17ae9bd8c6
Environment:
POD_NAMESPACE: sgs (v1:metadata.namespace)
POD_NAME: etcd-0-79894c7d66-lzgmq (v1:metadata.name)
Mounts:
/restores/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ncw9q (ro)
Containers:
etcd:
Image: quay.io/coreos/etcd:v3.3.11
Ports: 2379/TCP, 2380/TCP
Host Ports: 0/TCP, 0/TCP
Command:
/usr/local/bin/etcd
--name
etcd-0
--data-dir
/data
--initial-advertise-peer-urls
http://etcd-0:2380
--listen-peer-urls
http://0.0.0.0:2380
--listen-client-urls
http://0.0.0.0:2379
--advertise-client-urls
http://etcd-0:2379
--initial-cluster
etcd-0=http://etcd-0:2380,etcd-1=http://etcd-1:2380,etcd-2=http://etcd-2:2380
--initial-cluster-state
new
Environment:
ETCD_AUTO_COMPACTION_RETENTION: 1
ETCD_QUOTA_BACKEND_BYTES: 8589934592
Mounts:
/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ncw9q (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: etcd-pv-0
ReadOnly: false
default-token-ncw9q:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-ncw9q
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 10m (x22 over 11m) default-scheduler could not find v1.PersistentVolume "pvc-44e44335-4686-11e9-a0b6-0ae02beae008" (repeated 3 times)
PVCs:
kubectl -n sgs get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
etcd-pv-0 Lost pvc-44e44335-4686-11e9-a0b6-0ae02beae008 0 gp2-us-west-2c 12m
etcd-pv-1 Lost pvc-4519f4be-4686-11e9-a0b6-0ae02beae008 0 gp2-us-west-2c 12m
etcd-pv-2 Lost pvc-45467a0f-4686-11e9-a0b6-0ae02beae008 0 gp2-us-west-2c 12m
postgres-pv-0 Lost pvc-48549731-4686-11e9-a0b6-0ae02beae008 0 gp2-us-west-2c 12m
postgres-pv-1 Lost pvc-4888fac7-4686-11e9-a0b6-0ae02beae008 0 gp2-us-west-2c 12m
Similar for postgres:
Name: postgres-0-fd9f5dfd4-29kss
Namespace: sgs
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: cluster=postgres
name=postgres-0
pod-template-hash=985918980
role=master
velero-restore=bts5-20190314115236
velero.io/backup-name=bts5
velero.io/restore-name=bts5-20190314115236
Annotations: backup.velero.io/backup-volumes: data
snapshot.velero.io/data: b20d51b3
Status: Pending
IP:
Controlled By: ReplicaSet/postgres-0-fd9f5dfd4
Init Containers:
restic-wait:
Image: gcr.io/heptio-images/velero-restic-restore-helper:v0.11.0
Port: <none>
Host Port: <none>
Args:
55541829-468a-11e9-8017-0a17ae9bd8c6
Environment:
POD_NAMESPACE: sgs (v1:metadata.namespace)
POD_NAME: postgres-0-fd9f5dfd4-29kss (v1:metadata.name)
Mounts:
/restores/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ncw9q (ro)
Containers:
postgres:
Image: quay.io/pingthingsio/postgres:v5.1.15
Port: 5432/TCP
Host Port: 0/TCP
Environment:
POSTGRES_PASSWORD: <set to the key 'pgpassword' in secret 'postgres-root-password'> Optional: false
POSTGRES_RPASSWORD: <set to the key 'pgpassword' in secret 'postgres-repl-password'> Optional: false
ROLE: master
Mounts:
/var/lib/postgresql/backups from data (rw)
/var/lib/postgresql/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ncw9q (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: postgres-pv-0
ReadOnly: false
default-token-ncw9q:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-ncw9q
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 11m (x23 over 12m) default-scheduler could not find v1.PersistentVolume "pvc-48549731-4686-11e9-a0b6-0ae02beae008" (repeated 3 times)
kubectl describe pv pvc-44e44335-4686-11e9-a0b6-0ae02beae008?
michael@shadowpad:~/w/devops/bts$ kubectl describe pv pvc-44e44335-4686-11e9-a0b6-0ae02beae008
Error from server (NotFound): persistentvolumes "pvc-44e44335-4686-11e9-a0b6-0ae02beae008" not found
michael@shadowpad:~/w/devops/bts$ kubectl get pv --all-namespaces
No resources found.
Do these PVs have a reclaim policy of Retain?
I can check what they were, but to be clear, I am restoring the backup into a completely different cluster (I did a kops delete, kops create)
Yes the reclaim policy was Retain
OK... so I'm pretty sure you're running into https://github.com/heptio/velero/issues/1151 :( https://github.com/heptio/velero/issues/1151#issuecomment-463849604 is a proposal for how to fix this case. Unfortunately it's not implemented yet.
So, if I just use snapshots instead of Restic I should avoid that issue for the time being?
yeah, snapshots should work fine in this scenario
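(i.e. drop the backup.velero.io/backup-volumes annotations from your pod templates so the EBS volumes get snapshotted instead -- roughly velero backup create <backupname> --snapshot-volumes, though snapshotting is already the default when the annotation isn't present)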
Great. Thank you for your help. I'll follow #1151 and wait for a resolution. The restic backup is ideal because I can restore in a different region (AWS main to AWS govcloud) but snapshots will do for now.
I got into a similar situation where I was trying to do restores that included restic volumes. All of the restores were getting stuck in STATUS=New. The way I resolved it was to delete the velero pod, so the velero deployment would recreate it. Then, the newly created velero pod started picking up the restores and their STATUS changed to InProgress.
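(e.g. kubectl -n velero delete pod <velero-pod-name> and let the deployment recreate it, or with a newer kubectl, kubectl -n velero rollout restart deployment/velero)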
Same issue as @ngealy (Velero 1.1.0).
When I create a restore, I need to delete the velero pod.
Not really convenient.
Is there a fix planned for this?
Thanks
@Stolr I would need to see some logs/etc. to be able to diagnose what your issue is. This is not typical.
@skriss I am facing exactly the same situation described by @Stolr and @ngealy, and the worst part is that Velero doesn't have any interesting logs for the restore, even running in debug mode...
$ velero restore describe jenkins --details
Name: jenkins
Namespace: velero
Labels: <none>
Annotations: <none>
Phase:
Backup: jenkins
Namespaces:
Included: all namespaces found in the backup
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Namespace mappings: <none>
Label selector: <none>
Restore PVs: auto
$ velero restore get
NAME BACKUP STATUS ERRORS WARNINGS CREATED SELECTOR
jenkins jenkins New 0 0 2020-08-18 11:39:33 +0200 CEST <none>
$ kubectl -n velero logs velero-5d599c5ccd-qbzgm |grep jenkins
time="2020-08-18T09:33:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T09:33:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:33:56Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins119613165 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T09:33:56Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:33:56Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:38:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T09:38:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:39:02Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins738132903 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T09:39:02Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:39:02Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:43:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T09:43:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:43:56Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins121428593 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T09:43:56Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:43:56Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:48:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T09:48:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:48:57Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins108811211 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T09:48:57Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:48:57Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:53:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T09:53:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:53:56Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins471339061 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T09:53:56Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:53:56Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:58:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T09:58:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:58:56Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins156838959 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T09:58:56Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T09:58:56Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:03:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T10:03:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:03:56Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins350813753 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T10:03:56Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:03:56Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:08:56Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T10:08:56Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:08:57Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins982469654 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T10:08:57Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:08:57Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:13:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T10:13:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:13:57Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins157038205 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T10:13:57Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:13:57Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:18:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T10:18:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:18:56Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins014176695 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T10:18:56Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:18:56Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:23:55Z" level=debug msg="Running processQueueItem" controller=restic-repository key=velero/jenkins-default-8zbcw logSource="pkg/controller/restic_repository_controller.go:110"
time="2020-08-18T10:23:55Z" level=debug msg="Checking repository for stale locks" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:139" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:23:56Z" level=debug msg="Ran restic command" command="restic unlock --repo=s3:s3-eu-west-1.amazonaws.com/nie-backup-ict-prod/restic/jenkins --password-file=/tmp/velero-restic-credentials-jenkins239810817 --cache-dir=/scratch/.cache/restic" logSource="pkg/restic/repository_manager.go:291" repository=jenkins stderr= stdout="successfully removed locks\n"
time="2020-08-18T10:23:56Z" level=debug msg=resticRepositoryController.runMaintenanceIfDue controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:216" name=jenkins-default-8zbcw namespace=velero
time="2020-08-18T10:23:56Z" level=debug msg="not due for maintenance" controller=restic-repository logSource="pkg/controller/restic_repository_controller.go:221" name=jenkins-default-8zbcw namespace=velero
Perhaps this is because, a few hours before creating this restore, another restore with the same name and the same source backup was created, and it finished successfully without any errors.
The only difference is that the old restore went into a different namespace via the --namespace-mappings flag (just for validation purposes). After it was done and validated, I deleted the restore in Velero, plus the restored namespace and the application namespace. Then I created a new restore with the same settings as the old one, but without the --namespace-mappings flag, and it got stuck in Status: New. Only after restarting Velero's pod did the restore start. The commands are sketched below.
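For reference, something along these lines -- the mapped namespace here is just a placeholder:
# first restore, into a validation namespace, which worked
velero restore create jenkins --from-backup jenkins --namespace-mappings jenkins:jenkins-validation
# second restore, same settings minus the mapping, which got stuck in New
velero restore create jenkins --from-backup jenkins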