Velero: Pull secret associated with service account is not being restored

Created on 26 Jun 2019 · 6 Comments · Source: vmware-tanzu/velero

What steps did you take and what happened:
I'm running into a restore problem with service accounts and secrets. This is an OpenShift 4.x environment, where 'create namespace' automatically creates the builder, default, and deployer SAs, and 'create serviceaccount' automatically creates a dockercfg secret and two token secrets. My backup was taken from a namespace that has a fourth, non-default service account (with its associated dockercfg and token secrets). When I restore into a different cluster where this namespace does not yet exist, the restored namespace has the expected default SAs with their expected secrets (with different generated names than those in the backup). The non-default SA is also created with the expected token secret under a new generated name. However, it has no dockercfg secret. kubectl describe sa jenkins shows the old dockercfg secret reference from the backup, but reports that the secret is not found in the cluster:

Image pull secrets:  jenkins-dockercfg-89x6k (not found)
Mountable secrets:   jenkins-dockercfg-89x6k (not found)
                     jenkins-token-rz7sz

Velero logs indicate that a restore was attempted on this jenkins-dockercfg secret with no error reported. I have two other secrets from the backup that are not associated with serviceaccounts that are being restored just fine. I'm not sure where to look for the source of the failure. The missing pull secret is preventing pods in this restore from getting their images.

(using the v1.0.0 codebase)

What did you expect to happen:
I would have expected that either the service account creation or the secret restore would have created this secret, but neither seems to have done it.

The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
logs, describe, etc: https://gist.github.com/sseago/5409b61535b3e606c57a5d4d42f2beae

Environment:

  • Velero version (use velero version):
    Velero is in our fusor fork, but it's essentially the 1.0.0 codebase with a custom Dockerfile. Also, I changed Debug statements to Info in service_account_action temporarily.

  • Kubernetes version (use kubectl version): 1.12
    Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.4+0ba401e", GitCommit:"0ba401e", GitTreeState:"clean", BuildDate:"2019-03-31T22:28:12Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes installer & version:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

All 6 comments

Also, if I create the namespace and jenkins service account prior to restore, everything works fine since manually creating the account also creates the dockercfg secret. After restore, the service account actually has references to both secrets (the one from the backup that isn't restored, and the one created by the manual account creation), but things are functional. Somehow creating the service account via velero is different than creating it on the CLI, and this secret isn't being created when the restore creates the service account.

Also, I've realized that it is the secret that should have been created automatically by the service account creation that I want -- restoring the secret from the backup would be useless.

On further investigation, it seems that the problem stems from the service account carrying now-invalid secret references. I did another restore today, and immediately post-restore I see this for the SA secrets:

Image pull secrets:  jenkins-dockercfg-89x6k (not found)
Mountable secrets:   jenkins-dockercfg-89x6k (not found)
                     jenkins-token-5fcxn

However, if I edit the resource and remove the jenkins-dockercfg secret references from both imagePullSecrets and secrets, then a new dockercfg secret is created automatically and referenced in both places:

Image pull secrets:  jenkins-dockercfg-74b5t
Mountable secrets:   jenkins-token-5fcxn
                     jenkins-dockercfg-74b5t

So assuming that the -dockercfg secret is something that's added by OpenShift (and isn't there in a standard kube environment), it looks like the best answer would be a plugin which removes these on restore. I'll test that next.

Yeah, so on restore Velero has code to remove default token secret refs from service accounts, since we expect new ones to be automatically created during the restore. We're basically string-matching on <service account name>-token-. So I think that would match the behavior you're observing.
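To make the string-matching concrete, here is a minimal sketch of that kind of prefix filter. It is not Velero's actual code: `ObjectRef` and `dropGeneratedTokenRefs` are hypothetical names chosen for illustration, and the struct is a stand-in for the real secret reference type.

```go
package main

import "strings"

// ObjectRef is a hypothetical stand-in for a secret reference on a
// service account (only the name matters for this sketch).
type ObjectRef struct {
	Name string
}

// dropGeneratedTokenRefs returns refs without any entry whose name starts
// with "<saName>-token-". Everything else, including a "-dockercfg-"
// reference, is kept unchanged.
func dropGeneratedTokenRefs(saName string, refs []ObjectRef) []ObjectRef {
	prefix := saName + "-token-"
	kept := make([]ObjectRef, 0, len(refs))
	for _, ref := range refs {
		if strings.HasPrefix(ref.Name, prefix) {
			continue // generated token secret: expect the cluster to create a new one
		}
		kept = append(kept, ref)
	}
	return kept
}
```

Applied to the references reported above, a filter like this would drop jenkins-token-rz7sz but leave jenkins-dockercfg-89x6k in place, which is consistent with the stale reference seen after restore.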

I did try modifying that code to also remove <service account name>-dockercfg-, but for whatever reason, at the point where this code ran, serviceAccount.secrets did not contain that reference. It could be that at this point it only shows up in imagePullSecrets and not in secrets, even though post-restore it shows up in both. In any case, since the -dockercfg secret seems to be an OpenShift addition, a plugin is probably the right place to remove it, following ServiceAccountAction.Execute (from service_account_action.go) as a model.
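The core of such a plugin might look like the sketch below. It is an assumption-laden illustration, not the plugin that was eventually written: it uses only the standard library, treats the service account as a plain decoded-JSON map, and the function name `stripDockercfgRefs` is hypothetical. Wiring it into Velero's restore item action interface is left out. It checks both secrets and imagePullSecrets, since the reference may appear in either list.

```go
package main

import "strings"

// stripDockercfgRefs removes "<name>-dockercfg-*" entries from both the
// "secrets" and "imagePullSecrets" lists of a service account held as
// unstructured content (the shape of decoded JSON). It modifies sa in place.
func stripDockercfgRefs(sa map[string]interface{}) {
	meta, _ := sa["metadata"].(map[string]interface{})
	name, _ := meta["name"].(string)
	if name == "" {
		return // no name to match against; leave the object untouched
	}
	prefix := name + "-dockercfg-"

	for _, field := range []string{"secrets", "imagePullSecrets"} {
		refs, ok := sa[field].([]interface{})
		if !ok {
			continue // field absent or not a list
		}
		kept := make([]interface{}, 0, len(refs))
		for _, r := range refs {
			ref, _ := r.(map[string]interface{})
			refName, _ := ref["name"].(string)
			if strings.HasPrefix(refName, prefix) {
				continue // stale generated pull secret from the source cluster
			}
			kept = append(kept, r)
		}
		sa[field] = kept
	}
}
```

With the stale references gone before the service account is created, the cluster should generate a fresh dockercfg secret, matching what was observed when the references were removed by hand.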

Looks like a plugin modeled after service_account_action.go to remove the pull secret from Secrets and ImagePullSecrets resolves this on our end. Closing this issue. Thanks for the help.
