Ingress-nginx: [Failed CI] IPVS proxy mode ingress controller can't create loadbalancer for ingress

Created on 30 Mar 2018  ·  9 Comments  ·  Source: kubernetes/ingress-nginx

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:
In a k8s cluster with IPVS proxy mode, when you create an ingress, the load balancer for that ingress is not created automatically, which causes the IPVS CI job "gci-gce-ipvs" to fail, link: http://k8s-testgrid.appspot.com/sig-network-gce#gci-gce-ipvs

The failing case is Loadbalancing: L7 [Slow] Nginx should conform to Ingress spec, with logs:

Mar 30 00:39:51.870: INFO: stdout: "service \"echoheadersx\" created\nservice \"echoheadersy\" created\n"
Mar 30 00:39:51.870: INFO: Parsing ingress from test/e2e/testing-manifests/ingress/http/ing.yaml
Mar 30 00:39:51.873: INFO: creating echomap ingress
STEP: waiting for urls on basic HTTP ingress
Mar 30 00:39:51.905: INFO: Waiting for Ingress e2e-tests-ingress-n6smw/echomap to acquire IP, error: <nil>, ipOrNameList: []
Mar 30 00:40:01.909: INFO: Waiting for Ingress e2e-tests-ingress-n6smw/echomap to acquire IP, error: <nil>, ipOrNameList: []
Mar 30 00:40:11.909: INFO: Waiting for Ingress e2e-tests-ingress-n6smw/echomap to acquire IP, error: <nil>, ipOrNameList: []
Mar 30 00:40:21.914: INFO: Waiting for Ingress e2e-tests-ingress-n6smw/echomap to acquire IP, error: <nil>, ipOrNameList: []
Mar 30 00:40:31.914: INFO: Waiting for Ingress e2e-tests-ingress-n6smw/echomap to acquire IP, error: <nil>, ipOrNameList: []

Timed out while waiting for the ingress to get an address.

I'm not sure why, but I can see the nginx-ingress-controller pod keeps restarting:

# k get pod 
NAME                             READY     STATUS             RESTARTS   AGE
echoheaders-cxp2d                1/1       Running            0          2h
nginx-ingress-controller-5prlf   0/1       CrashLoopBackOff   34         2h

And with these container logs:

I0330 12:21:03.732302       5 launch.go:92] &{NGINX 0.9.0-beta.1 git-910b706 https://github.com/bprashanth/ingress.git}
I0330 12:21:03.732908       5 launch.go:221] Creating API server client for https://10.0.0.1:443
I0330 12:21:03.733867       5 nginx.go:109] starting NGINX process...
F0330 12:21:03.762489       5 launch.go:109] no service with name kube-system/default-http-backend found: services "default-http-backend" is forbidden: User "system:serviceaccount:default:default" cannot get services in the namespace "kube-system"

AFAIK, the ingress IP has nothing to do with kube-proxy, but when I change the proxy mode to iptables, the ingress gets its IP within a few seconds, although the ingress controller is still in CrashLoopBackOff.
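For completeness, one way to watch for the address is to query the ingress object directly; a minimal sketch, assuming the echomap ingress and namespace from the e2e logs above:

# kubectl get ingress echomap -n e2e-tests-ingress-n6smw
(in iptables mode the ADDRESS column is populated within seconds; in ipvs mode it stays empty while the controller crash-loops)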

Is there any conflict between the IPVS proxy rules and the ingress controller? That seems possible.

What you expected to happen:
The ingress gets its IP in IPVS proxy mode.

How to reproduce it (as minimally and precisely as possible):

  1. Build a cluster in a GCE environment with proxy mode IPVS (see the sketch after this list).
  2. Create nginx-ingress-controller.
  3. Create ingress with backends.
  4. Wait for the ingress IP. T_T
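A minimal sketch of step 1, assuming the GCE cluster scripts honor the KUBE_PROXY_MODE environment variable (this variable name is an assumption, not something confirmed in this issue):

# from a kubernetes source checkout
export KUBERNETES_PROVIDER=gce
export KUBE_PROXY_MODE=ipvs        # assumed knob selecting the kube-proxy mode
./cluster/kube-up.sh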

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): master
  • Cloud provider or hardware configuration: gce
  • OS (e.g. from /etc/os-release): Ubuntu16.04.3
  • Kernel (e.g. uname -a): 4.13.0-1008-gcp #11-Ubuntu SMP Thu Jan 25 11:08:44 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: cluster/kube-up
  • Others:

All 9 comments

@Lion-Wei first, please update the ingress controller version.

no service with name kube-system/default-http-backend found: services "default-http-backend" is forbidden: User "system:serviceaccount:default:default" cannot get services in the namespace "kube-system"

That error means you have RBAC (or any other alternative) enabled and you forgot to add permissions.
https://github.com/kubernetes/ingress-nginx/blob/master/deploy/rbac.md
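For reference, a minimal sketch of the kind of permission that is missing. This is not the exact manifest from the deploy docs; the Role name is made up, and the subject matches the service account named in the error above:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  # hypothetical name; grants read access to Services in kube-system
  name: ingress-read-default-backend
  namespace: kube-system
rules:
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ingress-read-default-backend
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ingress-read-default-backend
subjects:
- kind: ServiceAccount
  name: default        # the "system:serviceaccount:default:default" user from the error
  namespace: default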

Closing. Please reopen if the issue persists after adding permissions to the ingress controller

@aledbf Thanks, but once I change the proxy mode to iptables, the problem is solved. So I think this problem may have something to do with kube-proxy.

In IPVS mode, the nginx-ingress-controller has these logs:

# k logs -n e2e-tests-ingress-6lx2c nginx-ingress-controller-k58fz
I0331 06:31:53.740167       5 launch.go:92] &{NGINX 0.9.0-beta.1 git-910b706 https://github.com/bprashanth/ingress.git}
I0331 06:31:53.740557       5 launch.go:221] Creating API server client for https://10.0.0.1:443
I0331 06:31:53.741319       5 nginx.go:109] starting NGINX process...
F0331 06:31:53.755221       5 launch.go:109] no service with name kube-system/default-http-backend found: Get https://10.0.0.1:443/api/v1/namespaces/kube-system/services/default-http-backend: dial tcp 10.0.0.1:443: getsockopt: connection refused

Seems like a network issue.

I found this may be related to the KUBE-HOSTPORTS iptables rules. The ingress controller needs to validate the default backend service by accessing https://10.0.0.1:443..., but the iptables rules look like this:

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */
KUBE-HOSTPORTS  all  --  0.0.0.0/0            0.0.0.0/0            /* kube hostport portals */ ADDRTYPE match dst-type LOCAL

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */
KUBE-HOSTPORTS  all  --  0.0.0.0/0            0.0.0.0/0            /* kube hostport portals */ ADDRTYPE match dst-type LOCAL

Chain KUBE-HOSTPORTS (2 references)
target     prot opt source               destination
KUBE-HP-3NMCVHXCDLOPPG3S  tcp  --  0.0.0.0/0            0.0.0.0/0            /* nginx-ingress-controller-jgrbx_default hostport 443 */ tcp dpt:443
KUBE-HP-ONDCSRUXLK6MON4H  tcp  --  0.0.0.0/0            0.0.0.0/0            /* nginx-ingress-controller-jgrbx_default hostport 80 */ tcp dpt:80

Chain KUBE-HP-3NMCVHXCDLOPPG3S (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  10.64.1.47           0.0.0.0/0            /* nginx-ingress-controller-jgrbx_default hostport 443 */
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            /* nginx-ingress-controller-jgrbx_default hostport 443 */ tcp to:10.64.1.47:443

Chain KUBE-HP-ONDCSRUXLK6MON4H (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  10.64.1.47           0.0.0.0/0            /* nginx-ingress-controller-jgrbx_default hostport 80 */
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            /* nginx-ingress-controller-jgrbx_default hostport 80 */ tcp to:10.64.1.47:80

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  -- !10.64.0.0/14         0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
KUBE-MARK-MASQ  tcp  --  0.0.0.0/0            0.0.0.0/0            tcp match-set KUBE-NODE-PORT-TCP dst

That means any access to port 443 is DNATed directly to the nginx-ingress-controller.

In iptables proxy mode, the DNAT is done in the KUBE-SERVICES chain, so access to https://10.0.0.1:443... still works.
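A hedged way to check that interpretation on a node (standard iptables/ipvsadm invocations, not output taken from this issue):

# ip addr show kube-ipvs0
(in ipvs mode kube-proxy binds cluster IPs to this dummy interface, which makes 10.0.0.1 a LOCAL address, so the ADDRTYPE match in the hostport rules can apply to it)
# iptables -t nat -L OUTPUT -n --line-numbers
(rule order for locally generated traffic; shows whether KUBE-HOSTPORTS sees dpt:443 before anything rewrites 10.0.0.1:443)
# ipvsadm -Ln -t 10.0.0.1:443
(the ipvs virtual server that should be handling the apiserver cluster IP)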

This is my conclusion, but I'm not familiar with the ingress controller. Does this make sense? @aledbf

This is my conclusion, but I'm not familiar with the ingress controller. Does this make sense? @aledbf

No :)

Please update the version of the ingress controller. The version you are using is really old: https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-ingress-controller-0.9-beta.1
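For reference, a hedged way to confirm which image a pod is actually running (pod name taken from the kubectl output earlier in this issue):

# kubectl get pod nginx-ingress-controller-5prlf -o jsonpath='{.spec.containers[0].image}'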

@aledbf Finally, I found it's an IPVS issue. ╯﹏╰

Sorry for the disturbance, and thanks for your response and advice.

I think this issue can be closed. : )

@Lion-Wei What sort of IPVS issue? Our ingress-nginx is not playing nicely with it enabled. Removing the associated netpol inappropriately clears it up.

@Lion-Wei Repeating the question above

What sort of IPVS issue? Our ingress-nginx is not playing nicely with it enabled.

I don't see an option where I can remove the network policy mentioned

On a kubespray-deployed cluster that I wanted to test the IPVS configuration with, inventory/group_vars/k8s-cluster/k8s-cluster.yml has the proxy mode set:

kube_proxy_mode: ipvs

On the IPVS cluster, kube-proxy is listening on ports 80 and 443:

netstat -tunlp|grep proxy
tcp        0      0 10.144.104.83:80        0.0.0.0:*               LISTEN      19455/kube-proxy
tcp        0      0 10.144.104.83:443       0.0.0.0:*               LISTEN      19455/kube-proxy
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      19455/kube-proxy
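For what it's worth, a hedged set of checks to see what kube-proxy is holding in ipvs mode (assuming ipvsadm is installed on the node; which Service causes the listeners on 80/443 is not shown in this thread):

# ipvsadm -Ln                       (virtual servers kube-proxy has programmed)
# ip addr show kube-ipvs0           (addresses kube-proxy bound to its dummy interface)
# ss -tlnp | grep kube-proxy        (sockets kube-proxy itself is holding open)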

The nginx ingress is crash-looping:

bind() to 0.0.0.0:443 failed (98: Address already in use)

On a cluster deployed the same way, except with proxy mode == iptables:

netstat -tunlp|grep 443
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      5367/nginx: master
tcp6       0      0 :::443                  :::*                    LISTEN      5367/nginx: master
netstat -tunlp|grep proxy
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      28827/kube-proxy

Is there an option I missed to keep kube-proxy from acquiring ports 80 and 443, or is there an option needed to support running an ingress with IPVS that I've overlooked?
