Kops: rootVolumeSize value not respected for spot node instance group with M5 instance

Created on 24 Jan 2018 · 17 comments · Source: kubernetes/kops

  1. kops version: Version 1.8.0 (git-5099bc5)

  2. kubectl version: v1.8.7

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.7", GitCommit:"b30876a5539f09684ff9fde266fda10b37738c9c", GitTreeState:"clean", BuildDate:"2018-01-16T21:59:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.7", GitCommit:"b30876a5539f09684ff9fde266fda10b37738c9c", GitTreeState:"clean", BuildDate:"2018-01-16T21:52:38Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  3. cloud provider: AWS

  4. What commands did you run? What is the simplest way to reproduce this issue?
    Created an instance group with the following configuration (applied as sketched after the manifest).

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: example.com
  name: spot-nodes
spec:
  additionalSecurityGroups:
  - sg-1e310660
  image: kope.io/k8s-1.8-debian-jessie-amd64-hvm-ebs-2017-12-02
  machineType: m5.4xlarge
  maxPrice: "0.4"
  maxSize: 4
  minSize: 4
  nodeLabels:
    spot: "true"
  role: Node
  rootVolumeSize: 100
  subnets:
  - us-east-1a
  taints:
  - spot=true:NoSchedule
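
A sketch of the commands used to apply it, assuming the manifest above was saved as spot-nodes.yaml (the file name is illustrative):

# Register the instance group from the manifest
kops create -f spot-nodes.yaml

# Preview and apply the change to AWS
kops update cluster --name example.com --yes
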
  5. What happened after the commands executed?
    The ASG was created with a 100 GB root volume, but only an 8 GB root partition is used:
~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             10M     0   10M   0% /dev
tmpfs            13G  5.0M   13G   1% /run
/dev/nvme0n1p1  7.4G  5.6G  1.5G  80% /
tmpfs            31G     0   31G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            31G     0   31G   0% /sys/fs/cgroup

~$ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme0n1     259:0    0  100G  0 disk
└─nvme0n1p1 259:1    0    8G  0 part /
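
For reference, the unused space can be reclaimed in place on an affected node. A minimal sketch, assuming an ext4 root filesystem and that growpart (from cloud-guest-utils) is available on the image:

# Grow partition 1 of the NVMe root disk to fill the 100G device
sudo growpart /dev/nvme0n1 1

# Resize the mounted ext4 filesystem online to fill the partition
sudo resize2fs /dev/nvme0n1p1

# Verify the root filesystem now reports ~100G
df -h /
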
  6. What did you expect to happen?
    The node using the full 100 GB capacity of the root volume.

  7. cluster manifest:

$ kops get --name $NAME -oyaml
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: 2018-01-17T11:04:19Z
  name: example.com
spec:
  api:
    dns: {}
  authorization:
    alwaysAllow: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://my-bucket/example.com
  dnsZone: example.com
  etcdClusters:
  - enableEtcdTLS: true
    etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
    name: main
    version: 3.0.17
  - enableEtcdTLS: true
    etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
    name: events
    version: 3.0.17
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeAPIServer:
    runtimeConfig:
      admissionregistration.k8s.io/v1alpha1: "true"
  kubernetesApiAccess:
  - 10.2.0.0/16
  kubernetesVersion: 1.8.7
  masterPublicName: api.example.com
  networkCIDR: 10.2.0.0/16
  networkID: vpc-xxx
  networking:
    kubenet: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 10.2.0.0/16
  sshKeyName: xxx
  subnets:
  - cidr: 10.2.48.0/20
    name: us-east-1a
    type: Public
    zone: us-east-1a
  topology:
    dns:
      type: Public
    masters: public
    nodes: public

---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2018-01-17T11:04:19Z
  labels:
    kops.k8s.io/cluster: example.com
  name: master-us-east-1a
spec:
  image: kope.io/k8s-1.8-debian-jessie-amd64-hvm-ebs-2017-12-02
  machineType: t2.large
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1a
  role: Master
  subnets:
  - us-east-1a

---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2018-01-17T11:04:19Z
  labels:
    kops.k8s.io/cluster: example.com
  name: nodes
spec:
  additionalSecurityGroups:
  - sg-1e310660
  image: kope.io/k8s-1.8-debian-jessie-amd64-hvm-ebs-2017-12-02
  machineType: m4.4xlarge
  maxSize: 10
  minSize: 10
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  rootVolumeSize: 200
  subnets:
  - us-east-1a

---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2018-01-23T09:06:04Z
  labels:
    kops.k8s.io/cluster: example.com
  name: spot-nodes
spec:
  additionalSecurityGroups:
  - sg-1e310660
  image: kope.io/k8s-1.8-debian-jessie-amd64-hvm-ebs-2017-12-02
  machineType: m5.4xlarge
  maxPrice: "0.4"
  maxSize: 4
  minSize: 4
  nodeLabels:
    spot: "true"
  role: Node
  rootVolumeSize: 100
  subnets:
  - us-east-1a
  taints:
  - spot=true:NoSchedule
  8. Please run the commands with the most verbose logging by adding the -v 10 flag.
    Paste the logs into this report, or in a gist and provide the gist link here.

  9. Anything else do we need to know?
    The other node instance group has a root partition the same size as its volume:

~$ lsblk
NAME    MAJ:MIN   RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0      0  200G  0 disk
└─xvda1 202:1      0  200G  0 part /

All 17 comments

I just came across the exact same issue: the disk is created with the correct size, but the root partition defaults to 8 GB instead. Pretty much the same config as above.

I will go ahead and assume it is a Debian jessie issue: https://github.com/kubernetes/kops/blob/master/docs/releases/1.8-NOTES.md

New AWS instance types: P3, C5, M5, H1. Please note that NVME volumes are not supported on the default jessie image, so masters will not boot on M5 and C5 instance types unless a stretch image is chosen (change stretch to jessie in the image name). Also note that kubernetes will not support mounting persistent volumes on NVME instances until Kubernetes 1.9.

This should affect masters and nodes.

From the statement in the release notes, I assumed only masters were affected, since it says:

...NVME volumes are not supported on the default jessie image, so masters will not boot on M5 and C5 instance types...

Also seeing this issue on our new M5 nodes.

@ApsOps does it work with stretch?

@chrislovecnm yeah, looks like it works with the stretch image. There's an extra 1 MB partition (of unusable space, I think), though.

~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             31G     0   31G   0% /dev
tmpfs           6.2G  4.5M  6.2G   1% /run
/dev/nvme0n1p2   94G  3.1G   87G   4% /
tmpfs            31G     0   31G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            31G     0   31G   0% /sys/fs/cgroup

~$ lsblk
NAME        MAJ:MIN RM    SIZE RO TYPE MOUNTPOINT
nvme0n1     259:0    0    100G  0 disk
├─nvme0n1p1 259:1    0 1007.5K  0 part
└─nvme0n1p2 259:2    0    100G  0 part /

We ran into the same issue: the cluster worked fine until we started a few pods containing large images, which then failed to deploy with "No space left on device" errors.
Switching from m5.xlarge to m4.xlarge instances seems to fix it for now, but since older instance types are getting more expensive, I hope this issue can be fixed soon.

Any news on this?

@fredsted It works with the stretch image.

Ah, sorry. I replaced the image name in my ig and it seems to work.

Closing, since NVMe volumes (C5, M5, etc. instance types) are supported in the Debian stretch image. Please reopen if you still face any issues.

Which stretch image are you using?

I'm using kope.io/k8s-1.8-debian-stretch-amd64-hvm-ebs-2018-02-08 with m5.xlarge instances.
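
For anyone hitting this, a sketch of that image swap, assuming the instance group and cluster names from this report:

# Edit the instance group and change the image line, e.g. to
#   image: kope.io/k8s-1.8-debian-stretch-amd64-hvm-ebs-2018-02-08
kops edit ig spot-nodes --name example.com

# Apply the change and roll the instances onto the new image
kops update cluster --name example.com --yes
kops rolling-update cluster --name example.com --instance-group spot-nodes --yes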

This is happening for me as well.

Is there any resolution to this? I'm having this issue too.

@cheynewallace check the previous comments. It works with debian-stretch images.

I found this: https://github.com/kubernetes/kops/blob/master/channels/stable

Does this mean I need to be on kubernetesVersion: ">=1.11.0" before switching to the image that supports NVMe (stretch)?

How did you find the compatible image name? Not sure I'm looking in the right place.

@richstokes For AWS, I found a complete list here: https://eu-central-1.console.aws.amazon.com/ec2/v2/home?region=eu-central-1#Images:visibility=public-images;ownerAlias=383156758163;sort=name
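
The same list can also be queried from the AWS CLI; a sketch, assuming a configured CLI (383156758163 is the kope.io owner ID from the console link above):

# List public kope.io Debian stretch images for Kubernetes 1.8, oldest first
aws ec2 describe-images \
  --owners 383156758163 \
  --filters "Name=name,Values=k8s-1.8-debian-stretch*" \
  --query 'sort_by(Images, &CreationDate)[].Name' \
  --output text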
