Containers-roadmap: [EKS] [request]: allow to configure ipvs kube-proxy mode

Created on 30 Jan 2019  ·  17 Comments  ·  Source: aws/containers-roadmap

Tell us about your request
It would be nice to be able to switch kube-proxy from the default iptables mode to IPVS mode.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Better load balancing across all pods covered by a particular service.

EKS Proposed

All 17 comments

Any update on this one?

Hi - we haven't adopted this in EKS as it's GA but not the default for Kubernetes yet. We are researching being able to enable this with EKS.

I'd like to have this, sure, but I'd prefer to wait until it is the k8s default. I guess it could be an experimental option, but I am not sure AWS wants to support both modes 😄

Even in k8s 1.14 IPVS is not the default (I think?), and people are still smoothing over the rough edges of IPVS for the things that it breaks or doesn't support, e.g.:

  • IPVS: "ExternalTrafficPolicy: Local" now works with LoadBalancer services using loadBalancerIP (#72432, @lbernail)
  • kube-proxy in IPVS mode will stop initiating connections to terminating pods for services with sessionAffinity set. (#71834, @lbernail)
  • Support graceful termination with IPVS when deleting a service (#71895, @lbernail)
    Fixes issue with cleaning up stale NFS subpath mounts (#71804, @msau42)
  • UDP connections now support graceful termination in IPVS mode (#71515, @lbernail)

For what it's worth, we have been running IPVS on large clusters (managed by ourselves) for about a year without problem except for graceful termination: we kept using a version without it because it was a bit unstable. We have fixed a few issues with graceful termination in the last few months and it seems almost ready now.

A few important PRs haven't been merged yet but should be soon:

  • "Do not delete existing VS and RS when starting" #75283
  • "Add flag to enable strict ARP" #75295 (probably not relevant to EKS but I do not remember if the host veth interface has an IP)
  • "add module 'nf_conntrack' in ipvs prerequisite check" #70325 (support of 4.19 kernels, merged but not cherry-picked yet in 1.11 and 1.12 but soon)

If you test it and find any issue, let us know

Not sure if this should be mentioned here or if it should comprise a separate request - would it make sense to optionally allow disabling the deployment of kube-proxy altogether?

The justification would be that kube-proxy itself is not a strict requirement with certain configurations. It would mirror the steps kubeadm took to make kube-proxy's deployment optional (for similar reasons).

@arzarif I think that's a topic that warrants a separate thread. Can you open a new issue?

Hi, could you please let me know whether EKS now allows switching from iptables to IPVS as described in this blog post, or whether the work is still in progress?

https://medium.com/@jeremy.i.cowan/the-problem-with-kube-proxy-enabling-ipvs-on-eks-169ac22e237e

Hi, is this coming anytime soon?

Nice blog @SankarGopal77 !

Hi, is there any progress?

+1 on this issue

Hi, could you please let me know whether EKS now allows switching from iptables to IPVS as described in this blog post, or whether the work is still in progress?

https://medium.com/@jeremy.i.cowan/the-problem-with-kube-proxy-enabling-ipvs-on-eks-169ac22e237e

As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:

#cloud-config
packages:
 - ipvsadm
runcmd:
 - sudo modprobe ip_vs 
 - sudo modprobe ip_vs_rr
 - sudo modprobe ip_vs_wrr 
 - sudo modprobe ip_vs_sh
 - sudo modprobe nf_conntrack_ipv4
 - /var/lib/cloud/scripts/per-instance/bootstrap.al2.sh
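After a node boots with this user data, it may be worth confirming that the modules were actually loaded. The following is a minimal sketch, not part of the article: `missing_ipvs_modules` is a hypothetical helper that reads `lsmod`-style output and prints any module from the list above that is absent. It assumes that `nf_conntrack` is an acceptable substitute for `nf_conntrack_ipv4`, which matches the 4.19 kernel change mentioned earlier in this thread but may not hold for older kube-proxy builds.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): reads `lsmod`-style output on
# stdin and prints each required IPVS module that is not listed.
# Assumption: on 4.19+ kernels nf_conntrack_ipv4 no longer exists as a
# separate module, so plain nf_conntrack is accepted in its place.
missing_ipvs_modules() {
    # First column of lsmod output is the module name.
    loaded=$(awk '{print $1}')
    for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh; do
        printf '%s\n' "$loaded" | grep -qx "$mod" || echo "$mod"
    done
    if ! printf '%s\n' "$loaded" | grep -qxE 'nf_conntrack(_ipv4)?'; then
        echo "nf_conntrack"
    fi
}

# Usage on a worker node:
#   lsmod | missing_ipvs_modules
```

An empty result means every module in the list was found; any printed name is a module that still needs `modprobe`.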

And in the kube-proxy DaemonSet:

containers:
  - command:
    - /bin/sh
    - -c
    - kube-proxy --v=2 --kubeconfig=/var/lib/kube-proxy/kubeconfig --proxy-mode=ipvs --ipvs-scheduler=sed
    image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.13.8
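Setting the flag does not guarantee that IPVS is in use, since kube-proxy versions from this period can fall back to iptables when the prerequisites are missing. A rough way to check, assuming kube-proxy serves its metrics on the default `127.0.0.1:10249` and that `curl` is available on the node, is to read the `/proxyMode` endpoint. `is_ipvs_mode` below is a hypothetical helper that only compares the reported string.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the mode string on stdin is "ipvs".
is_ipvs_mode() {
    [ "$(cat)" = "ipvs" ]
}

# On a worker node (assumes curl and the default metrics bind address):
#   curl -s http://127.0.0.1:10249/proxyMode | is_ipvs_mode && echo "IPVS active"
#
# Listing the virtual servers kube-proxy has programmed
# (ipvsadm is installed by the cloud-config above):
#   sudo ipvsadm -Ln
```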

I believe the opened issue is suggesting IPVS as the default instead of iptables, since iptables can cause contention on large-scale production systems.

We are waiting for IPVS. It's quite clear that this is important, but we did not get any reply or support from the EKS team. That's sad for your customers.

Is there any progress? We also need to use IPVS.

FYI: In our case, we enable the IPVS modules directly via the UserData setting on top of the latest AMI.

I've submitted this to the EKS AMI team. If it can be included in the base EKS AMI it'd be a matter of enabling it.

https://github.com/awslabs/amazon-eks-ami/issues/546

As mentioned in the article, this is already supported by replacing the following properties in each worker node's user-data script:

In my case this was not sufficient, as the kube-proxy-config ConfigMap in kube-system took priority over the command-line arguments.

apiVersion: v1
data:
  config: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /var/lib/kube-proxy/kubeconfig
      qps: 5
    clusterCIDR: ""
    configSyncPeriod: 15m0s
    conntrack:
      # max: 0 <-- I commented out this line because it causes a syntax error when kube-proxy loads; maybe caused by an upgrade to 1.18?
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: "lc" # <-- set to desired scheduler
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: "ipvs" # <-- modified from "iptables"
    nodePortAddresses: null
    oomScoreAdj: -998
    portRange: ""
    udpIdleTimeout: 250ms
kind: ConfigMap
metadata:
  labels:
    eks.amazonaws.com/component: kube-proxy
    k8s-app: kube-proxy
  name: kube-proxy-config
  namespace: kube-system
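
One way to apply the same change without hand-editing is sketched below. It is only an illustration: `set_ipvs_mode` is a hypothetical filter that rewrites a `mode: "iptables"` line to `mode: "ipvs"` in text read from stdin. The usage lines assume that the DaemonSet is named `kube-proxy`, that `kubectl get -o yaml` returns the embedded config as a block scalar as shown above, and that the installed kubectl supports `rollout restart`. The scheduler line is left untouched and still has to be set by hand.

```shell
#!/bin/sh
# Hypothetical filter: rewrites the kube-proxy `mode` line from iptables to
# ipvs, preserving indentation. All other lines pass through unchanged.
set_ipvs_mode() {
    sed -E 's/^([[:space:]]*mode:[[:space:]]*)"?iptables"?[[:space:]]*$/\1"ipvs"/'
}

# Usage against a cluster (names other than kube-proxy-config are assumptions):
#   kubectl -n kube-system get configmap kube-proxy-config -o yaml \
#     | set_ipvs_mode \
#     | kubectl apply -f -
#   # kube-proxy reads its config at startup, so restart the pods:
#   kubectl -n kube-system rollout restart daemonset kube-proxy
```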