RKE version:
rke version v0.2.2
(same for latest version v0.2.7)
Docker version: (docker version,docker info preferred)
docker version
Client:
Version: 18.09.7
API version: 1.39
Go version: go1.10.8
Git commit: 2d0083d
Built: Thu Jun 27 17:56:06 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.7
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 2d0083d
Built: Thu Jun 27 17:26:28 2019
OS/Arch: linux/amd64
Experimental: false
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.5 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.5"
PRETTY_NAME="Red Hat"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.5:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.5"
uname -r
3.10.0-862.el7.x86_64
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
Bare-metal
cluster.yml file:
Steps to Reproduce:
Install the cluster using the rke installer:
rke up --config=rke-config.yaml
Take a backup of etcd:
rke etcd snapshot-save --config=rke-config.yml --name=testing_etcd_backup
Restore rke without an S3 bucket
Results:
rke etcd snapshot-restore --config=rke-config.yml --name=testing_etcd_backup
INFO[0000] Restoring etcd snapshot testing_etcd_backup
INFO[0000] Successfully Deployed state file at [./rke-config.rkestate]
INFO[0000] [dialer] Setup tunnel for host [XXXX.local]
INFO[0000] [dialer] Setup tunnel for host [XXXX.local]
INFO[0000] [dialer] Setup tunnel for host [XXXX.local]
FATA[0006] failed to prepare backup: restoring S3 backups with no cluster level S3 configuration is not supported
It also fails for an auto-generated name like 2019-08-07T10:26:09Z_etcd.
I think the issue is with the IsLocalSnapshot function; it returns false for both of these names.
https://github.com/rancher/rke/blob/master/cluster/etcd.go
func IsLocalSnapshot(name string) bool {
	// name is fmt.Sprintf("%s-%s%s-", cluster.Name, typeFlag, providerFlag)
	// typeFlag = "r": recurring
	// typeFlag = "m": manual
	//
	// providerFlag = "l" local
	// providerFlag = "s" s3
	re := regexp.MustCompile("^c-[a-z0-9].*?-.l-")
	return re.MatchString(name)
}
When I renamed the backup file to c-20190706-.l- it worked, so the regex seems to be wrong for user-chosen names.
It used to work with v0.13.
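To make the regex behaviour concrete, here is a minimal sketch that runs the pattern quoted above against the names mentioned in this report. The last two names are hypothetical examples following the "c-<cluster>-<type><provider>-" convention described in the code comment; they do not come from a real cluster.

```go
package main

import (
	"fmt"
	"regexp"
)

// localSnapshotRE is the pattern quoted from cluster/etcd.go in this issue.
var localSnapshotRE = regexp.MustCompile("^c-[a-z0-9].*?-.l-")

// IsLocalSnapshot mirrors the quoted function: it reports whether name
// starts with a "c-<cluster>-<type>l-" style prefix.
func IsLocalSnapshot(name string) bool {
	return localSnapshotRE.MatchString(name)
}

func main() {
	names := []string{
		"testing_etcd_backup",       // name passed to --name in this report
		"2019-08-07T10:26:09Z_etcd", // auto-generated name from this report
		"c-20190706-.l-",            // renamed file that restored successfully
		"c-abc12-ml-example",        // hypothetical: manual snapshot, local provider
		"c-abc12-ms-example",        // hypothetical: manual snapshot, s3 provider
	}
	for _, n := range names {
		fmt.Printf("%-28s local=%v\n", n, IsLocalSnapshot(n))
	}
}
```

Only the third and fourth names match. So the function does not return false unconditionally, but any name without the "c-...-.l-" prefix is treated as non-local, which is consistent with the S3 error above.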
Having the same issue. rke version v0.2.7
I have the same problem. v0.2.2
Works for me with version v0.2.8
Can you provide details of rke-config.yaml?
I was wrong. v0.2.8 doesn't fix the problem. The cause of my error was a custom hyperkube image "sdevd/hyperkube:v1.14.1-rancher1-zfs". With the original image "rancher/hyperkube:v1.13.5-rancher1" etcd backups work fine, and they also work when the image is set to "rancher/hyperkube:v1.14.5-rancher1". This is probably "work as defined". I should generate a new config with rke.
My cluster.yml
cluster_name: mycluster
nodes:
- address: rke-node-1
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: /home/rke/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels:
    node-role.kubernetes.io/node:
    # change zone label for SSO Storage class:
    failure-domain.beta.kubernetes.io/zone: nova
- address: rke-node-3
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: /home/rke/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels:
    node-role.kubernetes.io/node:
    # change zone label for SSO Storage class:
    failure-domain.beta.kubernetes.io/zone: nova
- address: rke-node-2
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: /home/rke/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels:
    node-role.kubernetes.io/node:
    # change zone label for SSO Storage class:
    failure-domain.beta.kubernetes.io/zone: nova
- address: rke-master-3
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: /home/rke/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels:
    node-role.kubernetes.io/master: true
- address: rke-master-2
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: /home/rke/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels:
    node-role.kubernetes.io/master: true
- address: rke-master-1
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: /home/rke/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels:
    node-role.kubernetes.io/master: true
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: "/etcdcluster"
    snapshot: null
    retention: ""
    creation: ""
    backup_config:
      enabled: true
      interval_hours: 6
      retention: 30
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args: {}
    # cloud-provider: "external"
    extra_binds: []
    extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: canal
  options: {}
authentication:
  strategy: x509
  sans: []
  webhook: null
addons: ""
addons_include: []
system_images:
  etcd: rancher/coreos-etcd:v3.2.24-rancher1
  alpine: rancher/rke-tools:v0.1.27
  nginx_proxy: rancher/rke-tools:v0.1.27
  cert_downloader: rancher/rke-tools:v0.1.27
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.27
  kubedns: rancher/k8s-dns-kube-dns:1.15.0
  dnsmasq: rancher/k8s-dns-dnsmasq-nanny:1.15.0
  kubedns_sidecar: rancher/k8s-dns-sidecar:1.15.0
  kubedns_autoscaler: rancher/cluster-proportional-autoscaler:1.0.0
  coredns: coredns/coredns:1.2.6
  coredns_autoscaler: rancher/cluster-proportional-autoscaler:1.0.0
  kubernetes: sdevd/hyperkube:v1.14.1-rancher1-zfs
  flannel: rancher/coreos-flannel:v0.10.0-rancher1
  flannel_cni: rancher/flannel-cni:v0.3.0-rancher1
  calico_node: rancher/calico-node:v3.4.0
  calico_cni: rancher/calico-cni:v3.4.0
  calico_controllers: ""
  calico_ctl: rancher/calico-ctl:v2.0.0
  canal_node: rancher/calico-node:v3.4.0
  canal_cni: rancher/calico-cni:v3.4.0
  canal_flannel: rancher/coreos-flannel:v0.10.0
  weave_node: weaveworks/weave-kube:2.5.0
  weave_cni: weaveworks/weave-npc:2.5.0
  pod_infra_container: rancher/pause:3.1
  ingress: rancher/nginx-ingress-controller:0.21.0-rancher3
  ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.4-rancher1
  metrics_server: rancher/metrics-server:v0.3.1
ssh_key_path: /home/rke/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries: []
ingress:
  provider: none
  options: {}
  node_selector: {}
  extra_args: {}
cloud_provider:
  name: openstack
  openstackCloudProvider:
    global:
      username: XXX
      password: XXX
      auth-url: XXX
      tenant-id: XXX
      tenant-name: XXX
      region: XXX
      domain-name: XXX
prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
restore:
  restore: false
  snapshot_name: ""
dns: null
I generated a new config and upgraded the cluster with rke 0.2.8, and I got the same error "restoring S3 backups with no cluster level S3 configuration is not supported" again after a restore. With the etcd backup config enabled, rke tries to get the backup from S3.
backup_config:
  interval_hours: 6
  retention: 30
Without the backup config the restore works fine:
backup_config: null
Tested commands:
rke etcd snapshot-save --name test-snapshot-040919-4 --config /etc/rke/cluster.yml
rke etcd snapshot-restore --name test-snapshot-040919-4 --config /etc/rke/cluster.yml
Available as of RKE v1.1.0-rc5
I was able to reproduce this in rke v0.2.7 as originally reported. Sample YAML:
nodes:
- address: x.x.x.x
  user: root
  role: [controlplane,worker,etcd]
services:
  etcd:
    backup_config:
      interval_hours: 6
      retention: 30
    snapshot: true
    creation: 6h
    retention: 24h
Steps:
$ rke up --config ./cluster.yml --ssh-agent-auth
$ rke etcd snapshot-save --name rke-snapshot-test-1 --config ./cluster.yml --ssh-agent-auth
$ rke etcd snapshot-restore --name rke-snapshot-test-1 --config ./cluster.yml --ssh-agent-auth
...
FATA[0008] failed to prepare backup: restoring S3 backups with no cluster level S3 configuration is not supported
With rke 1.1.0-rc5 I was able to restore the above backup without issues, because the backup source is now determined by which YAML keys are present under services.etcd.
Scenarios tested:
backup_config (like the above sample yaml)
backup_config (defaults to local)
backup_config and s3backupconfig correct settings (save and restore to s3).
s3backupconfig (defaults to local)
s3backupconfig correct but with wrong backup name during restore.
s3backupconfig.region still backups to the correct S3 bucket/folder.
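As a rough illustration of "backup source determined by the presence of the yaml keys under services.etcd", the sketch below picks a source from a simplified config. The type names, field names and the backupSource helper are hypothetical stand-ins made up for this example; they are not RKE's actual types or code, and the real logic may differ.

```go
package main

import "fmt"

// S3BackupConfig and BackupConfig are simplified, hypothetical stand-ins for
// the keys under services.etcd.backup_config. They are NOT RKE's real types.
type S3BackupConfig struct {
	BucketName string
	Endpoint   string
}

type BackupConfig struct {
	IntervalHours int
	Retention     int
	S3            *S3BackupConfig // nil when s3backupconfig is absent
}

// backupSource returns "s3" only when an s3backupconfig block is present,
// and "local" otherwise, including when backup_config itself is absent.
func backupSource(bc *BackupConfig) string {
	if bc != nil && bc.S3 != nil {
		return "s3"
	}
	return "local"
}

func main() {
	// backup_config: null
	fmt.Println(backupSource(nil))
	// backup_config with interval_hours and retention only
	fmt.Println(backupSource(&BackupConfig{IntervalHours: 6, Retention: 30}))
	// backup_config with an s3backupconfig block (bucket name is made up)
	fmt.Println(backupSource(&BackupConfig{S3: &S3BackupConfig{BucketName: "example-bucket"}}))
}
```

Under this assumption the snapshot name no longer matters for choosing the source, which would explain why a name like rke-snapshot-test-1 restores from local disk.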
Restore works with no issues. The steps are different, though:
1) Bring up Kubernetes with rke up.
2) Once that is done, prune the cluster, but do not remove it:
docker rm -vf $(docker ps -aq)
docker rmi -f $(docker images -aq)
docker volume prune -f
3) You need only one snapshot, but it must come from the master etcd node; put it in /opt/rke/etcd-snapshots.
4) Copy the rancher-cluster.rkestate and kube_config_rancher-cluster.yml files of the cluster that you backed up to the folder from which you run the restore.
5) Run the command rke etcd snapshot-restore and it will re-deploy the cluster.