Hi!
The latest version of Kubernetes available on EKS is 1.13. The Cluster Autoscaler docs recommend using an autoscaler version that matches the Kubernetes version in use, so EKS users will be using the examples from this branch:
However, in those examples, the host path for the SSL certs is set to:
volumes:
  - name: ssl-certs
    hostPath:
      path: "/etc/ssl/certs/ca-certificates.crt"
...when it should be /etc/ssl/certs/ca-bundle.crt.
This results in:
Error: failed to start container "cluster-autoscaler":
Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402:container init caused \"rootfs_linux.go:58: mounting \\\"/etc/ssl/certs/ca-certificates.crt\\\" to rootfs \\\"/var/lib/docker/overlay2/da2f555b196e50cd870a45f7c1eb8011351fa95cbb148e33ad7dd4a0fc4f3689/merged\\\" at \\\"/var/lib/docker/overlay2/da2f555b196e50cd870a45f7c1eb8011351fa95cbb148e33ad7dd4a0fc4f3689/merged/etc/ssl/certs/ca-certificates.crt\\\" caused \\\"not a directory\\\"\"": unknown:
Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
This appears to be fixed on master; however, I imagine many people will be using the release branch versions instead.
This can be fixed by overriding the sslCertHostPath value on the release.
Mine looks like:
autoDiscovery:
  clusterName: MyCluster
rbac:
  create: true
sslCertHostPath: /etc/ssl/certs/ca-bundle.crt
podAnnotations:
  iam.amazonaws.com/role: k8s-autoscaler
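For anyone applying the example manifest directly instead of going through the chart, the equivalent change would be to the `volumes` entry of the Deployment. This is only a sketch of that fragment, assuming an EKS (Amazon Linux) node where the bundle lives at the path reported above. The `type: File` line is my addition and is optional: it asks the kubelet to check that the host path exists and is a regular file, which should surface a wrong path as a mount failure instead of the runtime error quoted earlier.

```yaml
# Sketch only: a fragment of the Deployment's pod spec, not a complete manifest.
volumes:
  - name: ssl-certs
    hostPath:
      # CA bundle location on the node, as reported in this issue for EKS.
      path: /etc/ssl/certs/ca-bundle.crt
      # Optional (not in the original example): require that the path
      # already exists on the node as a regular file.
      type: File
```

On nodes that use a different distro the bundle may live elsewhere, so the path should be checked on the node before relying on this.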
Please check common notes and Gotchas here.
https://github.com/kubernetes/autoscaler/tree/cluster-autoscaler-1.13.5/cluster-autoscaler/cloudprovider/aws#common-notes-and-gotchas
I plan to update the default value to be EKS-friendly, since more users are moving from kops to EKS now.
I updated the docs in the 1.12-1.14 branches. I will close this issue; feel free to reopen it.
/close
@Jeffwan: Closing this issue.
This occurred again, e.g. in this file: https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-1.16.5/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
@xmik We do use the CA bundle in hostPath: https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-1.16.5/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml#L157
Can you check if it's working?
I tested that particular file and saw this error:
$ kubectl describe -n kube-system pod/cluster-autoscaler-8b46dddf5-8xkns
Warning Failed 26s kubelet, ip-172-20-62-245.eu-west-1.compute.internal Error: failed to start container "cluster-autoscaler": Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/etc/ssl/certs/ca-bundle.crt\\\" to rootfs \\\"/var/lib/docker/overlay2/0f99e6ea6c9a9f8dedf87b977b72ecff7b56e0e6e439fcd8f9275bda3dbcabe6/merged\\\" at \\\"/var/lib/docker/overlay2/0f99e6ea6c9a9f8dedf87b977b72ecff7b56e0e6e439fcd8f9275bda3dbcabe6/merged/etc/ssl/certs/ca-certificates.crt\\\" caused \\\"not a directory\\\"\"": unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
Warning BackOff 12s (x2 over 42s) kubelet, ip-172-20-62-245.eu-west-1.compute.internal Back-off restarting failed container
After replacing:
volumeMounts:
  - name: ssl-certs
    mountPath: /etc/ssl/certs/ca-certificates.crt
    readOnly: true
with
volumeMounts:
  - name: ssl-certs
    mountPath: /etc/ssl/certs/ca-bundle.crt
    readOnly: true
the error was gone.
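For readers following along: two separate paths are involved here, and they do not have to match. `hostPath.path` under `volumes` is the file on the node, while `mountPath` under `volumeMounts` is where that file appears inside the container. The fragment below is only a sketch of how the two relate, using the host-side bundle path mentioned for the 1.16.5 example and the container-side path from the original `volumeMounts`. It is not a verified fix for the error above, and the host path is an assumption that depends on the node's OS.

```yaml
# Sketch only: fragments of the Deployment's pod spec, not a complete manifest.
containers:
  - name: cluster-autoscaler
    volumeMounts:
      - name: ssl-certs
        # Container side: where the file shows up inside the container.
        mountPath: /etc/ssl/certs/ca-certificates.crt
        readOnly: true
volumes:
  - name: ssl-certs
    hostPath:
      # Node side: must exist on the node as a file. Assumed here to be
      # the Amazon Linux location; other distros may differ.
      path: /etc/ssl/certs/ca-bundle.crt
```

The "not a directory" message is about a file/directory mismatch between these two sides, so it is worth confirming on the node itself that the host path really is a file.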
@xmik besides volumeMounts, can you share your volumes as well?
That is the only thing I changed. I deployed the autoscaler on a fresh test k8s cluster (on AWS). The whole file that worked:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: k8s.gcr.io/cluster-autoscaler:v1.12.3
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/${CLUSTER_NAME}
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-bundle.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"