What steps did you take and what happened:
I backed up my namespace, which contains the Prometheus Operator and a Prometheus instance.
Then I deleted Prometheus with the following script:
helm delete --purge prom-op
kubectl delete crd/alertmanagers.monitoring.coreos.com
kubectl delete crd/podmonitors.monitoring.coreos.com
kubectl delete crd/prometheuses.monitoring.coreos.com
kubectl delete crd/prometheusrules.monitoring.coreos.com
kubectl delete crd/servicemonitors.monitoring.coreos.com
kubectl delete namespace prom-op
Then I restored the backup. The restored Prometheus seems to work, but Velero reported errors.
What did you expect to happen:
Restore without errors.
The output of the following commands will help us better understand what's going on:
$ velero restore logs prom-op-20200325220001 | grep error
time="2020-03-25T21:00:02Z" level=info msg="error restoring alertmanagers.monitoring.coreos.com: CustomResourceDefinition.apiextensions.k8s.io \"alertmanagers.monitoring.coreos.com\" is invalid: spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1\", Served:true, Storage:true, Schema:(*apiextensions.CustomResourceValidation)(0xc02adb9f08), Subresources:(*apiextensions.CustomResourceSubresources)(nil), AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: per-version schemas may not all be set to identical values (top-level validation should be used instead)" logSource="pkg/restore/restore.go:1199" restore=velero/prom-op-20200325220001
time="2020-03-25T21:00:02Z" level=info msg="error restoring podmonitors.monitoring.coreos.com: CustomResourceDefinition.apiextensions.k8s.io \"podmonitors.monitoring.coreos.com\" is invalid: spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1\", Served:true, Storage:true, Schema:(*apiextensions.CustomResourceValidation)(0xc01142e2e0), Subresources:(*apiextensions.CustomResourceSubresources)(nil), AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: per-version schemas may not all be set to identical values (top-level validation should be used instead)" logSource="pkg/restore/restore.go:1199" restore=velero/prom-op-20200325220001
time="2020-03-25T21:00:02Z" level=info msg="error restoring prometheuses.monitoring.coreos.com: CustomResourceDefinition.apiextensions.k8s.io \"prometheuses.monitoring.coreos.com\" is invalid: spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1\", Served:true, Storage:true, Schema:(*apiextensions.CustomResourceValidation)(0xc03610a5a0), Subresources:(*apiextensions.CustomResourceSubresources)(nil), AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: per-version schemas may not all be set to identical values (top-level validation should be used instead)" logSource="pkg/restore/restore.go:1199" restore=velero/prom-op-20200325220001
time="2020-03-25T21:00:02Z" level=info msg="error restoring prometheusrules.monitoring.coreos.com: CustomResourceDefinition.apiextensions.k8s.io \"prometheusrules.monitoring.coreos.com\" is invalid: spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1\", Served:true, Storage:true, Schema:(*apiextensions.CustomResourceValidation)(0xc023e28260), Subresources:(*apiextensions.CustomResourceSubresources)(nil), AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: per-version schemas may not all be set to identical values (top-level validation should be used instead)" logSource="pkg/restore/restore.go:1199" restore=velero/prom-op-20200325220001
time="2020-03-25T21:00:02Z" level=info msg="error restoring servicemonitors.monitoring.coreos.com: CustomResourceDefinition.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" is invalid: spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1\", Served:true, Storage:true, Schema:(*apiextensions.CustomResourceValidation)(0xc00ed731b8), Subresources:(*apiextensions.CustomResourceSubresources)(nil), AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: per-version schemas may not all be set to identical values (top-level validation should be used instead)" logSource="pkg/restore/restore.go:1199" restore=velero/prom-op-20200325220001
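Because each of these lines is very long, it can help to reduce them to just the names of the affected resources. The following is a minimal sketch, not part of Velero: `failed_resources` is a hypothetical helper name, and it assumes the log lines keep the `msg="error restoring <name>: ..."` shape shown above.

```shell
# Hypothetical helper (not a Velero command): reads restore log lines on
# stdin and prints the name of every resource that logged an
# "error restoring" message, one per line, without duplicates.
failed_resources() {
  sed -nE 's/.*msg="error restoring ([^:]+): .*/\1/p' | sort -u
}

# Usage with the restore name from this report:
#   velero restore logs prom-op-20200325220001 | failed_resources
```

Lines that do not contain an "error restoring" message are dropped, so the whole log can be piped in without a separate grep.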
Velero reports errors when restoring the CRDs, but they are present afterwards:
$ kubectl get crd -A | grep coreos
alertmanagers.monitoring.coreos.com 2020-03-26T07:35:12Z
podmonitors.monitoring.coreos.com 2020-03-26T07:35:13Z
prometheuses.monitoring.coreos.com 2020-03-26T07:35:12Z
prometheusrules.monitoring.coreos.com 2020-03-26T07:35:13Z
servicemonitors.monitoring.coreos.com 2020-03-26T07:35:12Z
kubectl logs deployment/velero -n velero
https://termbin.com/8jky
velero restore describe prom-op-20200325220001
https://termbin.com/4jlz
velero restore logs prom-op-20200325220001
https://termbin.com/xbiu
Environment:
velero version:
Client:
Version: v1.3.1
Git commit: -
Server:
Version: v1.3.1
velero client config get features:
features: <NOT SET>
kubectl version:
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-13T18:08:14Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-beta.0", GitCommit:"e7f962ba86f4ce7033828210ca3556393c377bcc", GitTreeState:"clean", BuildDate:"2020-01-15T08:18:29Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes installer & version:
Manually installed with kubeadm. Version 1.16.
Cloud provider or hardware configuration:
Physical hardware + KVM VMs.
OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
@nrb any quick reaction to this? Might be related to the api group versioning mess.
Quick reaction is that it's API server validation related, not Velero's; it's something to do with CRD schema validation that wants to be more strict.
Prometheus hasn't conformed to the new CRD schema standard yet.
We likely need to make this particular error not fail the backup...
Hi, I have similar errors with the restore of elasticsearch+kibana. Should I put the logs here or open a new issue?
UPDATE: I've created another issue, it seems cleaner: #2383
I've assigned this to myself, I think the problems here are the same as those in #2383. I'm going to grab some of the prometheus CRDs to include in my test data for #2478.
@TomaszKlosinski Do you know what version of the Prometheus operator you were using? https://github.com/coreos/prometheus-operator/blob/master/bundle-v1beta1-crd.yaml seems to be working correctly (no errors on restore) in my tests with both Velero master and v1.3.2.
@nrb, I installed it with helm: stable/prometheus-operator --version 8.3.1
Thanks!