Cilium: No rev-NAT xlation for service replies in CNI chaining

Created on 3 Sep 2019 · 4 Comments · Source: cilium/cilium

A user reported that a pod was failing to connect to the kube-apiserver running on another node. They are using Cilium with the AWS VPC CNI via the recently introduced CNI chaining mode (--cni-chaining-mode=aws-cni).

The relevant tcpdump output from the pod netns:

10:23:15.981543 IP 10.75.39.76.53096 > 100.64.0.1.443: Flags [S], seq 3559369392, win 26883, options [mss 8961,sackOK,TS val 25480963 ecr 0,nop,wscale 7], length 0
10:23:15.981767 IP 10.75.40.126.443 > 10.75.39.76.53096: Flags [S.], seq 3086623939, ack 3559369393, win 26847, options [mss 8961,sackOK,TS val 1629780917 ecr 25480963,nop,wscale 7], length 0
10:23:15.981782 IP 10.75.39.76.53096 > 10.75.40.126.443: Flags [R], seq 3559369393, win 0, length 0
  • 100.64.0.1:443 - ClusterIP of the kube-apiserver service.
  • 10.75.39.76 - the client pod IP addr.
  • 10.75.40.126 - the node IP addr which runs the kube-apiserver.

The relevant output from the cilium bpf lb list:

100.64.0.1:443         0.0.0.0:0 (140)            
                       10.75.28.100:443 (140)     
                       10.75.40.126:443 (140)     
                       10.75.48.63:443 (140)    

We can see from the dump that the SYN-ACK reply from the server was not rev-NAT translated, and it was therefore dropped by the client (RST).
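Conceptually, the service datapath must perform two translations: on the forward path it rewrites the ClusterIP to a chosen backend and records reverse-NAT state, and on the reply path it rewrites the backend's address back to the ClusterIP. A minimal Python sketch of that bookkeeping, using the addresses from the dump above (illustrative only; the map names and backend selection are not Cilium's actual BPF implementation):

```python
# Service map as shown by `cilium bpf lb list`: frontend -> backends.
SERVICES = {
    ("100.64.0.1", 443): [
        ("10.75.28.100", 443),
        ("10.75.40.126", 443),
        ("10.75.48.63", 443),
    ],
}

# Per-connection reverse-NAT state: (client, backend) -> original frontend.
revnat = {}

def forward(client, frontend):
    """Client -> service: pick a backend and record reverse-NAT state."""
    backends = SERVICES[frontend]
    backend = backends[hash(client) % len(backends)]  # simplistic selection
    revnat[(client, backend)] = frontend
    return backend  # packet destination is rewritten to the backend

def reply(backend, client):
    """Backend -> client: rewrite the reply source back to the frontend.

    If this lookup is skipped (the bug in this issue), the reply reaches
    the client with the backend's address as its source; the client's TCP
    stack finds no socket connected to that address and sends a RST.
    """
    frontend = revnat.get((client, backend))
    return frontend if frontend is not None else backend

client = ("10.75.39.76", 53096)
backend = forward(client, ("100.64.0.1", 443))
print(reply(backend, client))  # -> ('100.64.0.1', 443), the ClusterIP
```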

Regarding Cilium version:

We upgraded from 1.6.0-rc2 to 1.6.0. The problem did not immediately present itself; I can't say for certain, but it may have already occurred while we were on 1.6.0-rc2 without us noticing.

We went to 1.6.0 on the 21st of August; we only noticed this issue on Friday.

Labels: area/datapath, kind/bug, needs/triage


All 4 comments

As a temporary workaround, users can disable BPF services by setting disable-k8s-services: "true" in the cilium-agent ConfigMap.
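The workaround amounts to adding one key to the agent configuration; a sketch of the relevant ConfigMap fragment, assuming the standard install (ConfigMap `cilium-config` in `kube-system`), after which the cilium-agent pods must be restarted to pick up the change:

```
# Fragment of the cilium-config ConfigMap (name and namespace assume a
# standard install); restart the cilium-agent pods after editing.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  disable-k8s-services: "true"   # fall back to kube-proxy service handling
```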

The issue seems to be fixed in 1.6.1. The following test reproduces the problem on 1.6.0, but not on 1.6.1:

The performed test:

  • install Cilium on EKS following https://cilium.readthedocs.io/en/stable/gettingstarted/k8s-install-eks/
  • apply the CNP [1]
  • run the connectivity-check, which curls pods via their ClusterIP:
    kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.6.1/examples/kubernetes/connectivity-check/connectivity-check.yaml
  • observe policy drops in cilium monitor:

    ```
    xx drop (Policy denied (L3)) flow 0xb9488e80 to endpoint 74, identity 2->52214: 10.100.242.102:80 -> 192.168.17.28:38288 tcp SYN, ACK
    xx drop (Policy denied (L3)) flow 0xfd0bcfb1 to endpoint 74, identity 2->52214: 10.100.242.102:80 -> 192.168.17.28:38216 tcp SYN, ACK
    xx drop (Policy denied (L3)) flow 0x6e5a83f5 to endpoint 74, identity 2->52214: 10.100.242.102:80 -> 192.168.17.28:38216 tcp SYN, ACK
    xx drop (Policy denied (L3)) flow 0x2dfd5573 to endpoint 74, identity 2->52214: 10.100.242.102:80 -> 192.168.17.28:38288 tcp SYN, ACK
    xx drop (Policy denied (L3)) flow 0x3d522755 to endpoint 74, identity 2->52214: 10.100.242.102:80 -> 192.168.17.28:38288 tcp SYN, ACK
    xx drop (Policy denied (L3)) flow 0xdc405421 to endpoint 74, identity 2->52214: 10.100.242.102:80 -> 192.168.17.28:38288 tcp SYN, ACK
    xx drop (Policy denied (L3)) flow 0x473a8fdc to endpoint 74, identity 2->52214: 10.100.242.102:80 -> 192.168.17.28:38288 tcp SYN, ACK
    ```

  • contacting a pod by its podIP works

[1]

```
kind: CiliumNetworkPolicy
apiVersion: cilium.io/v2
metadata:
  name: all-within-namespace
specs:
- endpointSelector:
    matchLabels: {}
  egress:
  - toEndpoints:
    - matchLabels: {}
- endpointSelector:
    matchLabels: {}
  ingress:
  - fromEndpoints:
    - matchLabels: {}
- endpointSelector:
    matchLabels: {}
  egress:
  - toEndpoints:
    - matchLabels:
        'k8s:io.kubernetes.pod.namespace': kube-system
    toPorts:
    - ports:
      - port: '53'
        protocol: ANY
      rules:
        dns:
        - matchPattern: '*'
```










Thanks for the confirmation @genbit. Looks like it was addressed by #8978.

Note to any users who hit this: after upgrading, application pods that were deployed on v1.6.0 must be restarted (re-joined to the CNI) for the fix to take effect.

