Describe the problem/challenge you have
Hello and thank you for this very useful product. I tried to restore a backup made for a namespace with strict resource quotas and found a lot of errors like this:
... ingresses.extensions "NNNNN" is forbidden: exceeded quota: myns-general, requested: count/ingresses.extensions=1, used: count/ingresses.extensions=40, limited: count/ingresses.extensions=40
...
... replicasets.apps "app-5c6c447d48" is forbidden: exceeded quota: myns-general, requested: count/replicasets.apps=1, used: count/replicasets.apps=300, limited: count/replicasets.apps=300
... services "myapp-web" is forbidden: exceeded quota: myns-general, requested: count/services=1, used: count/services=80, limited: count/services=80
Describe the solution you'd like
A possible (but primitive and not guaranteed) solution would be to have a flag like "--pause-between-creation" that makes sure a previous resource is deployed before the next deploy starts to avoid hitting quotas.
Do you know any better ways to restore backups in such conditions without doubling quotas for all namespaces?
Environment:
velero version: 1.2.0
kubectl version: 1.12.10
hey @KIVagant, sorry this slipped through the cracks.
Can you clarify why the quotas are being exceeded on restore? Are the ResourceQuotas themselves part of the backup/restore?
Hello, @skriss
I configured an automatic backup in a cluster that has Resource Quotas. Some of them limit the amount of deployments or replicasets etc. Then I tried to restore the cluster from a backup and Velero quickly hit all the limits for most of the resources.
I guess my question is - if the # of resources weren't exceeding the quotas at backup time, then what made them exceed the quotas during restore? Are you creating multiple copies of resources?
That's a good question. First, one of the quotas it reached is "count/replicasets.apps". To restore from a backup, Velero will probably try to start new replicas for existing deployments. If a namespace allows at most 1000 replicas and 900 of them are already in use, then restoring from a backup needs another 900 available. That conflicts with the idea of limiting them, because doubling the quota makes no sense.
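To make the arithmetic concrete, here is a minimal Python sketch of the check as I understand it: a create request is rejected when used plus requested goes over the hard limit. The function name `exceeds_quota` is made up for illustration; this is not Velero or Kubernetes code.

```python
def exceeds_quota(used: int, requested: int, limit: int) -> bool:
    """Return True if creating `requested` more objects would go over `limit`."""
    return used + requested > limit


# Numbers from the ingress error above: 40 used, limit 40, 1 requested.
print(exceeds_quota(used=40, requested=1, limit=40))       # True

# The hypothetical case from this comment: 900 in use, 900 more needed, limit 1000.
print(exceeds_quota(used=900, requested=900, limit=1000))  # True
```

In both cases the request is refused, which matches the "exceeded quota" errors in the restore log.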
Another quota that was reached is count/services, and here I don't really understand why. Service names must be the same... I'll try to re-test this part.
Hmm, yeah I'm also not clear on why something like a service quota would be exceeded. I think tracking this down might be helpful - let me know if you can figure out what might be going on there!
If this is an ongoing issue, you could look at writing a custom RestoreItemAction plugin that would allow you to modify the quota values during a restore (e.g. to double them). However, you'd probably want some post-restore process to then turn them back down to more appropriate values.
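As a rough illustration of the transformation such a plugin might apply, here is a Python sketch that doubles the object-count limits in a ResourceQuota manifest. A real RestoreItemAction plugin would be written against Velero's plugin interface, which this sketch does not use; the function name `double_count_quotas` and the sample manifest are assumptions, with only the quota name and keys taken from the errors above.

```python
import copy


def double_count_quotas(resource_quota: dict) -> dict:
    """Return a copy of the manifest with every integer `count/...` limit doubled."""
    updated = copy.deepcopy(resource_quota)  # leave the caller's manifest untouched
    hard = updated.get("spec", {}).get("hard", {})
    for key, value in hard.items():
        # Only touch plain object counts; leave quantities such as "10Gi" alone.
        if key.startswith("count/") and str(value).isdigit():
            hard[key] = str(int(value) * 2)
    return updated


# Hypothetical manifest built from the limits reported in the errors.
quota = {
    "kind": "ResourceQuota",
    "metadata": {"name": "myns-general"},
    "spec": {"hard": {"count/replicasets.apps": "300", "count/services": "80"}},
}
print(double_count_quotas(quota)["spec"]["hard"])
# {'count/replicasets.apps': '600', 'count/services': '160'}
```

The same function with the factor inverted could serve as the post-restore step that turns the limits back down.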
You could also consider excluding the quotas from your first restore, via the --exclude-resources flag, and only restore the quotas after everything else has been restored.
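A minimal sketch of that two-step restore, written as a Python helper that only builds and prints the commands. The backup name `my-backup` and the restore names are hypothetical, and the flag spellings (`--exclude-resources`, `--include-resources`) are how I believe the Velero CLI names them; check `velero restore create --help` for your version.

```python
import shlex
from typing import List


def two_phase_restore_commands(backup: str) -> List[List[str]]:
    """Build two `velero restore create` commands: all but quotas, then quotas only."""
    base = ["velero", "restore", "create"]
    return [
        # Phase 1: restore everything except the ResourceQuota objects.
        base + [f"{backup}-no-quotas", "--from-backup", backup,
                "--exclude-resources", "resourcequotas"],
        # Phase 2: once phase 1 has finished, restore only the quotas.
        base + [f"{backup}-quotas-only", "--from-backup", backup,
                "--include-resources", "resourcequotas"],
    ]


for cmd in two_phase_restore_commands("my-backup"):
    print(shlex.join(cmd))
```

The second command should only be run after the first restore has completed; otherwise the quotas could be back in place while other resources are still being created.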
Thank you for the recommendations. It sounds a little complicated but still possible. I will try to test the backup/restore process again and will report back with the results.
@KIVagant 👋 checking in to see if you had a chance to follow Steve's suggestions?
Sorry, my plate is full, I will eventually check this but cannot say when :(
:+1: I think we'll close this one out -- feel free to reach out again as needed, thanks!