What steps did you take and what happened:
During restore, Velero skips restoring a service if the service already exists in the namespace. However, for nodePort services that require node ports to be preserved on restore, restores always fail if the same nodePort service already exists in the target namespace. It appears that during the restore Velero tries to create the service in the target namespace, but the API server rejects the operation because the node port to be restored is already allocated to the preexisting service. The steps below reproduce this issue:
$ kubectl create ns nginx
$ cat <<EOF | kubectl apply -n nginx -f -
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    nodePort: 32555
  selector:
    app: nginx
  type: NodePort
EOF
$ kubectl create deployment nginx --image=nginx -n nginx
$ velero backup create nginx-backup --include-namespaces nginx
$ kubectl delete deployment nginx -n nginx
$ velero restore create --from-backup nginx-backup
Restores fail consistently with the error "provided port is already allocated":
{"errors":{"namespaces":{"nginx":["error restoring services/nginx/nginx: Service \"nginx\" is invalid: spec.ports[0].nodePort: Invalid value: 32555: provided port is already allocated"]}},"warnings":{"namespaces":{"nginx":["could not restore, endpoints \"nginx\" already exists. Warning: the in-cluster version is different than the backed-up version."]}}}
What did you expect to happen:
Velero should skip restoring the nodePort service if the service already exists in the target namespace. This keeps restore behavior consistent with how Velero handles other objects it finds already present in the namespace.
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
- kubectl logs deployment/velero -n velero
- velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
- velero backup logs <backupname>
- velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
- velero restore logs <restorename>

Anything else you would like to add:
This issue is closely related to #1950.
Environment:
- Velero version (use velero version): v1.2.0
- Velero features (use velero client config get features):
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release):

I think the issue is at this line. We expect the API server to say that the object already exists if it looks the same, but in the case of NodePorts that are re-using the same port, it gives us a different error - that it's invalid.
I confirmed this by printing out the error, and this is what I saw:
restoreErr was: &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"Status", APIVersion:"v1"}, ListMeta:v1.ListMeta{SelfLink:"", ResourceVersion:"", Continue:"", RemainingItemCount:(*int64)(nil)}, Status:"Failure", Message:"Service \"nginx\" is invalid: spec.ports[0].nodePort: Invalid value: 32555: provided port is already allocated", Reason:"Invalid", Details:(*v1.StatusDetails)(0xc0007c1680), Code:422}}

time="2020-03-04T20:50:34-05:00" level=info msg="error restoring nginx: Service \"nginx\" is invalid: spec.ports[0].nodePort: Invalid value: 32555: provided port is already allocated" logSource="pkg/restore/restore.go:1200" restore=velero/nginx-backup-20200304205033
Digging into the StatusDetails a little more with the following patch:
if apierrors.IsInvalid(restoreErr) {
	// Print the StatusError's Reason and Details to see exactly what the API server returned.
	statusErr := restoreErr.(*apierrors.StatusError)
	fmt.Printf("Err Reason: %#v: ", statusErr.Status().Reason)
	fmt.Printf("Err Details: %#v", statusErr.Status().Details)
}
Err Reason: "Invalid": Err Details: &v1.StatusDetails{Name:"nginx", Group:"", Kind:"Service", UID:"", Causes:[]v1.StatusCause{v1.StatusCause{Type:"FieldValueInvalid", Message:"Invalid value: 32555: provided port is already allocated", Field:"spec.ports[0].nodePort"}}, RetryAfterSeconds:0}time="2020-03-04T20:57:21-05:00" level=info msg="error restoring nginx: Service \"nginx\" is invalid: spec.ports[0].nodePort: Invalid value: 32555: provided port is already allocated" logSource="pkg/restore/restore.go:1204" restore=velero/nginx-backup-20200304205719
So, to fix this for nodePort services, I think we could do the following, which I have not yet tested.
- In addition to checking for apierrors.AlreadyExists, check for apierrors.IsInvalid.
- Check the restoreError.Status().Details.Causes slice for nodePort in Field and already allocated in Message.
- If the Causes match this case, then the Service is the same and log it as such instead of erroring.

I'm not 100% happy with this as it relies on a text string in the error message that could change, but I haven't yet checked on the API server side to see what exactly it generates, so maybe there's a better solution.
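For illustration, here is a minimal, untested sketch of that check, based on the Details.Causes output shown above. The helper name isNodePortAlreadyAllocated is invented for this example and is not existing Velero code:

```go
package restore

import (
	"strings"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// isNodePortAlreadyAllocated is a hypothetical helper (not in Velero today).
// It reports whether an Invalid error from the API server is specifically the
// "nodePort ... provided port is already allocated" conflict, so the restore
// could log the Service as already existing instead of failing.
func isNodePortAlreadyAllocated(restoreErr error) bool {
	if !apierrors.IsInvalid(restoreErr) {
		return false
	}
	statusErr, ok := restoreErr.(*apierrors.StatusError)
	if !ok || statusErr.Status().Details == nil {
		return false
	}
	for _, cause := range statusErr.Status().Details.Causes {
		// Fragile, as noted above: this matches on the error text itself.
		if strings.Contains(cause.Field, "nodePort") && strings.Contains(cause.Message, "already allocated") {
			return true
		}
	}
	return false
}
```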
More generally, chasing down each case like Service Accounts and now Services in the core restore code doesn't seem sustainable. I'm wondering if some sort of RestoreConflict plugin type might make sense, which would allow us to specify criteria by which instances of a given type would be considered the same. That's a very rough idea at the moment and would require some more design work.
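To make the rough idea slightly more concrete, a plugin contract along these lines might work. The RestoreConflictResolver name and its methods are invented purely for illustration; no such plugin type exists in Velero:

```go
package restore

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// RestoreConflictResolver is purely illustrative. The idea is that a
// per-resource plugin could decide whether an error returned on create really
// means "an equivalent object already exists".
type RestoreConflictResolver interface {
	// AppliesTo reports whether this resolver handles the given resource type.
	AppliesTo(gr schema.GroupResource) bool

	// IsSameObject inspects the create error and the backed-up object, and
	// returns true if the conflict should be logged as "already exists"
	// rather than counted as a restore error.
	IsSameObject(createErr error, desired *unstructured.Unstructured) (bool, error)
}
```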
@skriss and @ashish-amarnath does my rough algorithm seem workable to you? I'll try to track down what the API server does to return the invalid error to see if there's something more reliable than string matching that we can do.
@ajohnstone and @jenting I'm tagging you too as you both commented on the other issue, and I'm able to reproduce, with detailed error structs.
Ok, so the error seems to originate in the port allocator, which is pretty core to Kubernetes. The error is passed directly through the REST API if the port is already allocated, which is why this doesn't show up as AlreadyExists.
Looking around here, I see it uses a RangeRegistry, which is similar to how IP address allocation is exposed. My guess is that objects whose allocated IPs are preserved on restore would hit similar errors, although in my experience those have been more dynamic. NodePorts have been a much more frequent problem for Velero users.
@nrb that approach sounds reasonable, though I agree that adding lots of special cases here is not going to be sustainable.
Another option we could consider is trying to Get the resource before trying to restore it, and using that to determine whether the resource already exists. That means an additional API call per resource we're restoring, though, so it potentially has some runtime impact.
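A rough, untested sketch of that Get-before-Create flow using the dynamic client; the function name and signature are invented for illustration and are not Velero's actual restore code:

```go
package restore

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/dynamic"
)

// restoreItem sketches the Get-before-Create idea: if the object is already in
// the cluster, skip it (optionally after comparing it to the backed-up copy)
// instead of relying on the Create call to fail with AlreadyExists.
func restoreItem(ctx context.Context, client dynamic.ResourceInterface, obj *unstructured.Unstructured) error {
	existing, err := client.Get(ctx, obj.GetName(), metav1.GetOptions{})
	if err == nil {
		// Already present in the target namespace; log and skip, or diff
		// existing vs. obj to decide whether to warn about drift.
		_ = existing
		return nil
	}
	if !apierrors.IsNotFound(err) {
		return err // a real error, not just "doesn't exist yet"
	}
	_, err = client.Create(ctx, obj, metav1.CreateOptions{})
	return err
}
```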
@nrb @skriss How about trying a GET only on the services iff we find the last-applied-config annotation on the service during restore? This way we only make the API call if we're restoring a service that needs node port preservation. If a service does not have the last-applied-config annotation, we simply delete the node ports as we're doing currently and let the API server auto-assign the node ports.
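Something like the following check could gate the extra GET. The helper name is invented for illustration; the annotation key is the standard kubectl one, which Velero already looks at to decide whether node ports were set explicitly:

```go
package restore

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// The kubectl last-applied-configuration annotation key.
const lastAppliedConfigAnnotation = "kubectl.kubernetes.io/last-applied-configuration"

// shouldCheckForExistingService is a hypothetical helper: only pay for the
// extra GET when the Service carries the last-applied-config annotation,
// i.e. when its node ports should be preserved. Otherwise, clear the node
// ports as today and let the API server assign new ones.
func shouldCheckForExistingService(svc *unstructured.Unstructured) bool {
	_, ok := svc.GetAnnotations()[lastAppliedConfigAnnotation]
	return ok
}
```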
> How about trying a GET only on the services iff we find the last-applied-config annotation on the service during restore?

That works for Services applied by kubectl, but I think it still has the problem of adding special cases as they come up.
Reworking the entire logic to be based on a Get, rather than doing a Create and expecting an AlreadyExists error, may be useful in more cases. It may also serve us better for things like trying to overwrite items that differ from an existing backup, because then we'd have much more information about the exact differences rather than just knowing that the item already exists.
The problem is still present in 1.5.2.