What steps did you take and what happened:
volumeBindingMode: Immediate.
What's happening then:
Restore object is stuck forever (or for a very long time) in the InProgress state.
PodVolumeRestore object gets created, but its state is never updated.
Restic logs do not say anything at the info log level.
What did you expect to happen:
Data in PV should be restored from backup.
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
velero backup logs <backupname>
velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
velero restore logs <restorename>
Anything else you would like to add:
As a workaround, one can create a copy of the storage class in use and rely on the https://velero.io/docs/v1.5/restore-reference/#changing-pvpvc-storage-classes feature to restore into the modified copy, which has volumeBindingMode: Immediate.
Volumes are being provisioned using https://github.com/hetznercloud/csi-driver.
Restoring volumes with volumeBindingMode: Immediate works well.
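The workaround described above could look roughly like the following pair of manifests. This is a sketch, not a tested configuration: the original class name hcloud-volumes and the provisioner csi.hetzner.cloud are assumptions about a default hetznercloud/csi-driver install, and the names hcloud-volumes-immediate and change-storage-class-config are hypothetical. The ConfigMap labels follow the storage class mapping format described in the Velero restore reference linked above.

```yaml
# Copy of the original StorageClass with only the binding mode changed.
# Assumption: the original class is named "hcloud-volumes" and uses the
# "csi.hetzner.cloud" provisioner; copy any other fields (parameters,
# reclaimPolicy, ...) from your actual class.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hcloud-volumes-immediate   # hypothetical name
provisioner: csi.hetzner.cloud
volumeBindingMode: Immediate
---
# Mapping consumed by Velero's change-storage-class restore item action.
# Each data entry has the form <old-storage-class>: <new-storage-class>.
apiVersion: v1
kind: ConfigMap
metadata:
  name: change-storage-class-config   # hypothetical name
  namespace: velero
  labels:
    velero.io/plugin-config: ""
    velero.io/change-storage-class: RestoreItemAction
data:
  hcloud-volumes: hcloud-volumes-immediate
```

With both objects applied before running the restore, restored PVCs that referenced the original class should be rewritten to the Immediate copy, so the volume is bound without waiting for a consuming pod.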
Environment:
Velero version (use velero version): Tried 1.4.2 and 1.5.1 with the same result
Velero features (use velero client config get features): features: <NOT SET>
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"archive", BuildDate:"2020-09-18T18:46:38Z",
GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z",
GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
OS (e.g. from /etc/os-release): Flatcar stable
Thanks for this report. I think I see the issue, and it's an order of operations one - Velero is trying to recreate the PV, PVC, and Pod (in that order), but when in a WaitForFirstConsumer binding mode, this isn't sufficient.
I'm going to log this as a high priority bug, because it's not a unique use case, but I don't have an answer for it at the moment.
We've been looking at this as well for other use cases. A long-term solution would be to use the proposed Data Populators (https://github.com/kubernetes/enhancements/issues/1495), but this will require changes in how Restic is handled.