Amazon-vpc-cni-k8s: ENI in Secondary VPC CIDR not getting created

Created on 26 Jan 2019 · 18 Comments · Source: aws/amazon-vpc-cni-k8s

Region: us-east-1
AMI : ami-0c24db5df6badc35a
CNI : 1.3

Instance IAM role has arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy

Primary VPC CIDR : 192.168.0.0/16
Secondary VPC CIDR : 100.64.0.0/16

EC2 Instances Subnet CIDR : 192.168.0.0/18
Expecting CNI to be using secondary CIDR subnet range : 100.64.0.0/22

Both the above subnets have the route to 0.0.0.0/0 via NAT gateway.
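
As a quick sanity check on the addressing above, the following minimal Python sketch (standard-library ipaddress only) reports which VPC CIDR contains a given subnet. The helper name owning_vpc_cidr is made up for illustration; the CIDR values are the ones quoted in this report.

```python
import ipaddress

# VPC CIDRs as listed in the report.
VPC_CIDRS = [
    ipaddress.ip_network("192.168.0.0/16"),  # primary
    ipaddress.ip_network("100.64.0.0/16"),   # secondary
]


def owning_vpc_cidr(subnet, vpc_cidrs=VPC_CIDRS):
    """Return the VPC CIDR that fully contains `subnet`, or None."""
    net = ipaddress.ip_network(subnet)
    for cidr in vpc_cidrs:
        if net.subnet_of(cidr):
            return cidr
    return None


if __name__ == "__main__":
    for subnet in ("192.168.0.0/18", "100.64.0.0/22"):
        print(subnet, "->", owning_vpc_cidr(subnet))
```

A subnet that returns None here would not belong to either CIDR associated with the VPC.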

Upgraded to the 1.3 plugin following https://docs.aws.amazon.com/eks/latest/userguide/cni-upgrades.html
Added the following to the aws-node DaemonSet:

- name: AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG
  value: "true"
- name: AWS_VPC_K8S_CNI_EXTERNALSNAT
  value: "true"
- name: ENI_CONFIG_LABEL_DEF
  value: failure-domain.beta.kubernetes.io/zone

Passed --use-max-pods=true during instance bootstrap
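
For context on the max-pods setting: the formula given in the EKS documentation is ENIs x (IPs per ENI - 1) + 2, and with custom networking the primary ENI is not used for pods, so one ENI drops out of the count. A rough Python sketch of that arithmetic follows. The function name and the ENI count of 4 are illustrative assumptions (the report does not state the instance type), and 15 addresses per ENI is only inferred from maxAddrsPerENI = 14 in the ipamd log, assuming that value counts secondary addresses.

```python
def max_pods(enis, ips_per_eni, custom_networking=False):
    """Estimate the pod limit for a node.

    Each ENI loses one address to its own primary IP. The +2 accounts for
    host-network pods such as aws-node and kube-proxy.
    With custom networking the primary ENI carries no pod IPs.
    """
    usable_enis = enis - 1 if custom_networking else enis
    return usable_enis * (ips_per_eni - 1) + 2


if __name__ == "__main__":
    # Hypothetical instance: 4 ENIs, 15 addresses per ENI.
    print("default networking:", max_pods(4, 15))
    print("custom networking: ", max_pods(4, 15, custom_networking=True))
```

If the kubelet limit is left at the default-networking value while custom networking is enabled, more pods could be scheduled than the node has addresses for.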

Created ENIConfigs named after the availability zones (e.g. us-east-1a), with the subnet value corresponding to 100.64.0.0/22.
Terminated the instance in the ASG to get a new one.

The ENI in the secondary CIDR range does not get created.
Logged in to the instance to look at the ipamd logs and found the following:

2019-01-26T03:06:34Z [INFO]  Setting myENI to: default
2019-01-26T03:06:36Z [INFO] Handle ENIConfig Add/Update:  us-east-1a, [sg-0e136ced130dcae47], subnet-05ed4d82a4ed8a2cc
2019-01-26T03:06:36Z [INFO] Handle ENIConfig Add/Update:  us-east-1b, [sg-0e136ced130dcae47], subnet-0bf32cbc60e246a59
2019-01-26T03:06:36Z [INFO] Handle corev1.Node: ip-192-168-104-171.ec2.internal, map[node.alpha.kubernetes.io/ttl:0 volumes.kubernetes.io/controller-managed-attach-detach:true]
2019-01-26T03:06:36Z [INFO]  Setting myENI to: default
2019-01-26T03:06:36Z [INFO] Handle corev1.Node: ip-192-168-31-146.ec2.internal, map[node.alpha.kubernetes.io/ttl:0 volumes.kubernetes.io/controller-managed-attach-detach:true]
2019-01-26T03:06:37Z [DEBUG] Skip the primary ENI for need IP check
2019-01-26T03:06:37Z [DEBUG] IP pool stats: total = 0, used = 0, c.currentMaxAddrsPerENI = 14, c.maxAddrsPerENI = 14
2019-01-26T03:06:37Z [DEBUG] Start increasing IP Pool size
2019-01-26T03:06:37Z [ERROR] Failed to get pod ENI config
2019-01-26T03:06:37Z [DEBUG] Reconciling ENI/IP pool info...
2019-01-26T03:06:37Z [DEBUG] Total number of interfaces found: 1 
2019-01-26T03:06:37Z [DEBUG] Found eni mac address : 02:ec:54:25:ed:82
2019-01-26T03:06:37Z [DEBUG] Using device number 0 for primary eni: eni-0ec34990890e8c0f4
2019-01-26T03:06:37Z [DEBUG] Found eni: eni-0ec34990890e8c0f4, mac 02:ec:54:25:ed:82, device 0
2019-01-26T03:06:37Z [DEBUG] Found cidr 192.168.64.0/18 for eni 02:ec:54:25:ed:82
2019-01-26T03:06:37Z [DEBUG] Found ip addresses [192.168.104.171] on eni 02:ec:54:25:ed:82
2019-01-26T03:06:37Z [DEBUG] Reconcile existing ENI eni-0ec34990890e8c0f4 IP pool
2019-01-26T03:06:37Z [DEBUG] Reconcile and skip primary IP 192.168.104.171 on eni eni-0ec34990890e8c0f4
2019-01-26T03:06:37Z [DEBUG] Successfully Reconciled ENI/IP pool
2019-01-26T03:06:41Z [INFO] Handle ENIConfig Add/Update:  us-east-1a, [sg-0e136ced130dcae47], subnet-05ed4d82a4ed8a2cc
2019-01-26T03:06:41Z [INFO] Handle ENIConfig Add/Update:  us-east-1b, [sg-0e136ced130dcae47], subnet-0bf32cbc60e246a59
2019-01-26T03:06:41Z [INFO] Handle corev1.Node: ip-192-168-104-171.ec2.internal, map[node.alpha.kubernetes.io/ttl:0 volumes.kubernetes.io/controller-managed-attach-detach:true]
2019-01-26T03:06:41Z [INFO]  Setting myENI to: default

If I manually create an ENI in 100.64.0.0/22 and attach it to the instance, everything works fine.
Why is ipamd not able to create the ENI in the secondary VPC CIDR?

All 18 comments

@liwenwu-amazon, I would appreciate any input or insight on the cause and a fix.

I got bitten by the same thing.

While ENI_CONFIG_LABEL_DEF is in the documentation, it is actually not a valid env var for 1.3.

In order to get it working, you need to compile from master. Hopefully they will release a new version soon.

Same issue here. Followed this guide: https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html.
ENIs are not created, but they are used once attached manually.

I'm seeing a similar thing, except that after I manually attach an ENI to the instance it still fails to allocate IPs to pods. It does, however, allocate secondary IPs to the ENI in the CIDR range of the ENIConfig.

ENIConfig

apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1b
spec:
  subnet: subnet-03e2ea20c351349bd
  securityGroups:
  - sg-0d0e1b63640f61d44

2019-03-14T22:09:01Z [INFO]  Setting myENI to: default
2019-03-14T22:09:01Z [INFO] Handle ENIConfig Add/Update:  us-east-1f, [sg-0d0e1b63640f61d44], subnet-0aea9b713df3d10c9
2019-03-14T22:09:01Z [INFO] Handle ENIConfig Add/Update:  us-east-1a, [sg-0d0e1b63640f61d44], subnet-012e2ee9c1105e548
2019-03-14T22:09:01Z [INFO] Handle ENIConfig Add/Update:  us-east-1b, [sg-0d0e1b63640f61d44], subnet-03e2ea20c351349bd
2019-03-14T22:09:01Z [INFO] Handle ENIConfig Add/Update:  us-east-1c, [sg-0d0e1b63640f61d44], subnet-0db2de31c00afa5f5
2019-03-14T22:09:01Z [INFO] Handle ENIConfig Add/Update:  us-east-1d, [sg-0d0e1b63640f61d44], subnet-07fa92c5cbd3a9697
2019-03-14T22:09:01Z [INFO] Handle ENIConfig Add/Update:  us-east-1e, [sg-0d0e1b63640f61d44], subnet-07027511c62a51caa
2019-03-14T22:09:02Z [DEBUG] Skip the primary ENI for need IP check
2019-03-14T22:09:02Z [DEBUG] Using WARM-ENI-TARGET 2
2019-03-14T22:09:02Z [DEBUG] IP pool stats: total = 9, used = 0, c.currentMaxAddrsPerENI = 9, c.maxAddrsPerENI = 9
2019-03-14T22:09:02Z [DEBUG] Start increasing IP Pool size
2019-03-14T22:09:02Z [ERROR] Failed to get pod ENI config
2019-03-14T22:09:06Z [INFO] Handle corev1.Node: ip-192-168-63-91.ec2.internal, map[node.alpha.kubernetes.io/ttl:0 volumes.kubernetes.io/controller-managed-attach-detach:true]

Like @anshrma I am also using ENI_CONFIG_LABEL_DEF and I have set it to failure-domain.beta.kubernetes.io/zone
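
The lookup this setting is meant to drive can be modelled roughly as follows. This Python sketch is a simplified illustration, not ipamd's actual code: the function name and the fallback logic are assumptions based on the "Setting myENI to: default" log lines and the documented purpose of ENI_CONFIG_LABEL_DEF.

```python
DEFAULT_ENICONFIG = "default"


def select_eniconfig_name(node_labels, label_key=None):
    """Pick the ENIConfig name for a node.

    label_key models ENI_CONFIG_LABEL_DEF. Passing None models a build
    that ignores the variable. Falls back to "default" when the label
    is missing or no key is configured.
    """
    if label_key and node_labels.get(label_key):
        return node_labels[label_key]
    return DEFAULT_ENICONFIG


if __name__ == "__main__":
    labels = {"failure-domain.beta.kubernetes.io/zone": "us-east-1b"}
    # Build that honours the label key.
    print(select_eniconfig_name(labels, "failure-domain.beta.kubernetes.io/zone"))
    # Build that ignores it.
    print(select_eniconfig_name(labels))
```

Under this model, a build that ignores the variable would look for an ENIConfig named default. If none exists, that would be consistent with the "Failed to get pod ENI config" error in the logs.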

@jicowan Just found out that it's simply an odd state of the CNI release we're using by default.
I assume you're using release 1.3.2 (the newest one).
1.3.2: https://github.com/aws/amazon-vpc-cni-k8s/blob/v1.3.2/pkg/eniconfig/eniconfig.go
current master: https://github.com/aws/amazon-vpc-cni-k8s/blob/master/pkg/eniconfig/eniconfig.go

The 1.3.2 release has no mention of selecting an ENIConfig via labels, although that has been in the master branch since January 12 at the latest.

As @dadux already said, it's simply not implemented.
I built it myself two days ago: https://cloud.docker.com/repository/docker/tilmankrauss/amazon-k8s-cni. It is just the corresponding Docker image for the master branch.

Once replaced in the DaemonSet, it works like a charm :).

Just patch your aws-node:

kubectl patch -n kube-system daemonset aws-node --patch '
spec:
  template:
    spec:
      containers:
      - name: aws-node
        image: tilmankrauss/amazon-k8s-cni:v1.3.0-91-g1448130d
        env:
        - name: AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG
          value: "true"
        - name: ENI_CONFIG_LABEL_DEF
          value: "failure-domain.beta.kubernetes.io/zone"
        - name: AWS_VPC_K8S_CNI_EXTERNALSNAT
          value: "true"
'

Please note that the Docker image is just a build from an arbitrary state of the master branch, nothing like a release ;).

@till-krauss Thanks! Updating the image fixed the issue I was having. I wonder when these changes will make it into an official release.

Did this make it into an official release yet? Tried with 1.4 and the network config did not apply for me. I made sure to have an ENIConfig that was named with the availability zone (us-east-1a for example). Is that format incorrect or is this feature still not officially released despite being in the docs?

The config looks good for 1.4.0 and higher. I did not check the whole file, but the label definition is at least mentioned there, in contrast to the last version I tried.
Are you sure it's configured correctly?

I applied two ENIConfigs with the names of my availability zones (us-east-1a and us-east-1b). They contain the proper subnet and security groups for each.

Applied the following lines as needed in the CNI DaemonSet:

        - name: AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG
          value: "true"
        - name: ENI_CONFIG_LABEL_DEF
          value: "failure-domain.beta.kubernetes.io/zone"

Fired up new nodes that would pick up this configuration, but no IPs get picked up. Doesn't even seem like the ENI is getting provisioned.

Looks good, can you put the log of the daemonset in here?

I've got to imagine you mean something else or I have something not set right for this, because this is the log of the aws-node pod on the node in question. Doesn't have much behind it.

=====Starting installing AWS-CNI =========
=====Starting amazon-k8s-agent ===========
time="2019-05-29T15:51:37Z" level=error msg="failed to initialize service object for operator metrics: OPERATOR_NAME must be set"

Can you include the ENIConfig you're trying to apply to the node?

Was also getting this error. No logs to give a clue as to why.

I disabled AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG and now it works. It uses the same ENI and subnet as the worker node, which is what I want, and this problem went away.

Everyone having this problem can just set AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=false.

You don't need to create ENIConfig resources, set the k8s.amazonaws.com/eniConfig node label, or set ENI_CONFIG_LABEL_DEF on aws-node.

The default CNI behaviour when AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=false is to create pods on the same network as the worker ENI.

I don't get why you need to duplicate the workers' network information in an ENIConfig; that must be a different use case than mine. But if you followed the https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html guide, then you already created workers with the desired subnet and SG.

Is this still an issue with v1.5 or later?

I am facing this issue currently. I deployed a brand new cluster (EKS 1.14, aws-node 1.5.3), then followed the instructions to set up 3 worker nodes (1 per AZ) along with ENIConfigs etc. The new workers don't get any secondary ENIs attached. As a result, even the CoreDNS container is failing to start.

Details:

Warning  FailedCreatePodSandBox  3m (x562 over 13m)  kubelet, ip-192-168-14-162.ec2.internal  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f0db1e906ef05178c876fe9dfd91bbe8eced0a02ba3e14eb7bd296711d7538a4" network for pod "coredns-cc8fc4797-j22jk": NetworkPlugin cni failed to set up pod "coredns-cc8fc4797-j22jk_kube-system" network: add cmd: failed to assign an IP address to container

This is the ENIConfig CRD applied:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: eniconfigs.crd.k8s.amazonaws.com
spec:
  scope: Cluster
  group: crd.k8s.amazonaws.com
  version: v1alpha1
  names:
    plural: eniconfigs
    singular: eniconfig
    kind: ENIConfig

And here is a sample ENIConfig:

```
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1c
spec:
  securityGroups:
  - sg-045f292130d5952da
  - sg-08d673aea1e0624b3
  subnet: subnet-0683458b533af6db7

Running pods:

$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system aws-node-4jk6d 1/1 Running 1 26m
kube-system aws-node-ccthb 1/1 Running 1 26m
kube-system aws-node-g4mt7 1/1 Running 1 34m
kube-system coredns-cc8fc4797-j22jk 0/1 ContainerCreating 0 15m
kube-system coredns-cc8fc4797-sn64f 0/1 ContainerCreating 0 15m
kube-system kube-proxy-864jf 1/1 Running 0 34m
kube-system kube-proxy-rbccj 1/1 Running 0 26m
kube-system kube-proxy-rzvhx 1/1 Running 0 26m

ipamd.log

2019-09-26T01:30:52.147Z [INFO] Received AddNetwork for NS /proc/21434/ns/net, Pod coredns-cc8fc4797-j22jk, NameSpace kube-system, Container 812bc32489491af1134d48ee05a78c5e99f1cfa35dacea1f9284152d58aabe7e, ifname eth0
2019-09-26T01:30:52.148Z [DEBUG] AssignIPv4Address: IP address pool stats: total: 0, assigned 0
2019-09-26T01:30:52.148Z [DEBUG] AssignPodIPv4Address: Skip ENI eni-06196a0021ff36319 that does not have available addresses
2019-09-26T01:30:52.148Z [ERROR] DataStore has no available IP addresses
2019-09-26T01:30:52.148Z [DEBUG] VPC CIDR 192.168.0.0/16
2019-09-26T01:30:52.148Z [DEBUG] VPC CIDR 100.64.0.0/16
2019-09-26T01:30:52.148Z [INFO] Send AddNetworkReply: IPv4Addr , DeviceNumber: 0, err: assignPodIPv4AddressUnsafe: no available IP addresses
2019-09-26T01:30:52.160Z [DEBUG] IP pool stats: total = 0, used = 0, c.maxIPsPerENI = 5
2019-09-26T01:30:52.160Z [DEBUG] IP pool is too low: available (0) < ENI target (1) * addrsPerENI (5)
2019-09-26T01:30:52.160Z [DEBUG] Starting to increase IP pool size
2019-09-26T01:30:52.160Z [DEBUG] Skip the primary ENI for need IP check
2019-09-26T01:30:52.160Z [ERROR] Failed to get pod ENI config
2019-09-26T01:30:52.160Z [DEBUG] Successfully increased IP pool
2019-09-26T01:30:52.160Z [DEBUG] IP pool stats: total = 0, used = 0, c.maxIPsPerENI = 5
2019-09-26T01:30:52.160Z [DEBUG] IP pool stats: total = 0, used = 0, c.maxIPsPerENI = 5
2019-09-26T01:30:52.160Z [DEBUG] Its NOT possible to remove extra ENIs because available (0) <= ENI target (1) * addrsPerENI (5):
2019-09-26T01:30:52.166Z [INFO] Received DelNetwork for IP , Pod coredns-cc8fc4797-j22jk, Namespace kube-system, Container 812bc32489491af1134d48ee05a78c5e99f1cfa35dacea1f9284152d58aabe7e
2019-09-26T01:30:52.166Z [DEBUG] UnassignPodIPv4Address: IP address pool stats: total:0, assigned 0, pod(Name: coredns-cc8fc4797-j22jk, Namespace: kube-system, Container 812bc32489491af1134d48ee05a78c5e99f1cfa35dacea1f9284152d58aabe7e)
2019-09-26T01:30:52.166Z [WARN] UnassignPodIPv4Address: Failed to find pod coredns-cc8fc4797-j22jk namespace kube-system Container 812bc32489491af1134d48ee05a78c5e99f1cfa35dacea1f9284152d58aabe7e
2019-09-26T01:30:52.166Z [DEBUG] UnassignPodIPv4Address: IP address pool stats: total:0, assigned 0, pod(Name: coredns-cc8fc4797-j22jk, Namespace: kube-system, Container )
2019-09-26T01:30:52.166Z [WARN] UnassignPodIPv4Address: Failed to find pod coredns-cc8fc4797-j22jk namespace kube-system Container
2019-09-26T01:30:52.166Z [INFO] Send DelNetworkReply: IPv4Addr , DeviceNumber: 0, err: datastore: unknown pod
2019-09-26T01:30:52.706Z [INFO] Received DelNetwork for IP , Pod coredns-cc8fc4797-j22jk, Namespace kube-system, Container 812bc32489491af1134d48ee05a78c5e99f1cfa35dacea1f9284152d58aabe7e

**Update**
I managed to fix this by forcing the WARM_ENI_TARGET

kubectl set env daemonset aws-node -n kube-system WARM_ENI_TARGET=2

Now:

$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system aws-node-hkw4g 1/1 Running 0 36s
kube-system aws-node-z8c97 1/1 Running 0 105s
kube-system aws-node-zs4qv 1/1 Running 0 69s
kube-system coredns-cc8fc4797-j22jk 1/1 Running 0 43m
kube-system coredns-cc8fc4797-sn64f 1/1 Running 0 43m
kube-system kube-proxy-864jf 1/1 Running 0 61m
kube-system kube-proxy-rbccj 1/1 Running 0 53m
kube-system kube-proxy-rzvhx 1/1 Running 0 53m

coredns pod details:

$ kubectl describe pod coredns-cc8fc4797-j22jk -n kube-system
Name: coredns-cc8fc4797-j22jk
Namespace: kube-system
Priority: 2000001000
PriorityClassName: system-node-critical
Node: ip-192-168-14-162.ec2.internal/192.168.14.162
Start Time: Wed, 25 Sep 2019 20:59:16 -0400
Labels: eks.amazonaws.com/component=coredns
k8s-app=kube-dns
pod-template-hash=cc8fc4797
Annotations: kubernetes.io/psp=eks.privileged
Status: Running
IP: 100.64.168.210
Controlled By: ReplicaSet/coredns-cc8fc4797
```

I had to terminate my EC2 instance to get it to use the secondary CIDR ENIConfig.

@srijitm Have you seen this issue again? The interesting log line is "Failed to get pod ENI config", which means that at start-up ipamd could not find the ENIConfig, and that is why it could not attach any secondary ENI or add IPs to the pool. When was the config applied to the cluster?
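
To see when that error occurs relative to the ENIConfig being registered, a small Python sketch like the one below can pull the relevant lines out of an ipamd log. The function name and the choice of markers are illustrative assumptions; the marker strings themselves are copied from the logs in this thread.

```python
# Substrings taken from the ipamd log excerpts quoted in this thread.
MARKERS = (
    "Failed to get pod ENI config",
    "Handle ENIConfig Add/Update",
    "Setting myENI to",
)


def eniconfig_events(log_text):
    """Return (timestamp, marker) pairs for ENIConfig-related log lines."""
    events = []
    for line in log_text.splitlines():
        for marker in MARKERS:
            if marker in line:
                # The timestamp is the first space-separated field.
                timestamp = line.split(" ", 1)[0]
                events.append((timestamp, marker))
                break
    return events


if __name__ == "__main__":
    sample = (
        "2019-09-26T01:30:52.160Z [DEBUG] Skip the primary ENI for need IP check\n"
        "2019-09-26T01:30:52.160Z [ERROR] Failed to get pod ENI config\n"
    )
    for timestamp, marker in eniconfig_events(sample):
        print(timestamp, marker)
```

An error entry with no earlier "Handle ENIConfig Add/Update" entry for the node's zone would suggest the ENIConfig was not yet known to ipamd at that point.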
