Ingress-nginx: Ingress controller not starting

Created on 27 Aug 2018 · 28 Comments · Source: kubernetes/ingress-nginx

NGINX Ingress controller version: 0.18.0

Kubernetes version (use kubectl version): v1.11.1

Environment:

  • Cloud provider or hardware configuration: HW
  • OS (e.g. from /etc/os-release): Ubuntu 18.04.1
  • Kernel (e.g. uname -a): 4.15.0-29

What happened:
After upgrading the controller from 0.17.1 to 0.18.0, it fails to start and writes the following to the logs:

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.18.0
  Build:      git-7b20058
  Repository: https://github.com/kubernetes/ingress-nginx.git
-------------------------------------------------------------------------------

nginx version: nginx/1.15.2
W0827 09:10:13.940665       9 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0827 09:10:13.940907       9 main.go:191] Creating API client for https://10.96.0.1:443
I0827 09:10:13.982783       9 main.go:235] Running in Kubernetes cluster version v1.11 (v1.11.1) - git (clean) commit b1b29978270dc22fecc592ac55d903350454310a - platform linux/amd64
I0827 09:10:14.002798       9 main.go:100] Validated ingress-nginx/default-http-backend as the default backend.
I0827 09:10:14.153962       9 nginx.go:255] Starting NGINX Ingress controller
 ResourceVersion:"2636042", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress kube-system/dashboard
I0827 09:10:15.263295       9 backend_ssl.go:68] Adding Secret "kube-system/wild" to the local store
I0827 09:10:15.263483       9 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"consent", UID:"250ae80d-9aee-11e8-8514-a4bf012da507", APIVersion:"extensions/v1beta1", ResourceVersion:"2636043", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress default/consent
I0827 09:10:15.264786       9 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"static-content", UID:"fe34ea2d-a52d-11e8-8514-a4bf012da507", APIVersion:"extensions/v1beta1", ResourceVersion:"3058967", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress default/static-content
I0827 09:10:15.354651       9 nginx.go:684] Starting TLS proxy for SSL Passthrough
F0827 09:10:15.354735       9 nginx.go:696] listen tcp :443: bind: permission denied

How to reproduce it (as minimally and precisely as possible):
Config:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ingress-nginx
  template:
    metadata:
      labels:
        app: ingress-nginx
      annotations:
        prometheus.io/port: '10254'
        prometheus.io/scrape: 'true'
    spec:
      serviceAccountName: nginx-ingress-serviceaccount
      containers:
        - name: nginx-ingress-controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.18.0
          args:
            - /nginx-ingress-controller
            - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx
            - --annotations-prefix=nginx.ingress.kubernetes.io
            - --enable-ssl-passthrough
          securityContext:
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
            # www-data -> 33
            runAsUser: 33
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app: ingress-nginx
data:
  client-body-buffer-size: 32M
  hsts: "true"
  proxy-body-size: 1G
  proxy-buffering: "off"
  proxy-read-timeout: "600"
  proxy-send-timeout: "600"
  server-tokens: "false"
  ssl-redirect: "false"
  upstream-keepalive-connections: "50"
  use-proxy-protocol: "false"

Anything else we need to know:

kind/bug

All 28 comments

I've been getting a similar issue though it seems to only be with TCP services.

[nginx-ingress-public-controller-6d876f78f7-8vkkt] Error: exit status 1
[nginx-ingress-public-controller-6d876f78f7-8vkkt] nginx: the configuration file /tmp/nginx-cfg188813845 syntax is ok
[nginx-ingress-public-controller-6d876f78f7-8vkkt] 2018/08/28 17:44:40 [emerg] 124#124: bind() to 0.0.0.0:23 failed (2: No such file or directory)
[nginx-ingress-public-controller-6d876f78f7-8vkkt] nginx: [emerg] bind() to 0.0.0.0:23 failed (2: No such file or directory)
[nginx-ingress-public-controller-6d876f78f7-8vkkt] nginx: configuration file /tmp/nginx-cfg188813845 test failed

I rolled back to 0.17.1 and it's working correctly again. I'm not sure what changed between 0.17.1 and 0.18.0.

I am also seeing the bind error with 0.18.0 on GKE. Rolling back to 0.17.1 fixes it.

I tried updating from 0.17.1 to 0.19.0: same result.
On another cluster, however, I successfully updated from 0.13.0 to 0.19.0.

@foxsa From your logs [1], it seems this could be related to SSL passthrough. Does the other cluster you just mentioned also have it enabled?

[1]

I0827 09:10:15.354651       9 nginx.go:684] Starting TLS proxy for SSL Passthrough
F0827 09:10:15.354735       9 nginx.go:696] listen tcp :443: bind: permission denied

@Foxsa @rochacon

I also have SSL pass-through enabled, so maybe that is the common factor...

Confirmed for me as well on 0.18.0 and 0.19.0 when ssl-passthrough is enabled (it was my only option for TLS termination at the pod level when using gRPC).

Same issue:

NGINX Ingress controller
  Release:    0.19.0
  Build:      git-05025d6
I0903 03:08:42.221794       9 backend_ssl.go:68] Adding Secret "kube-system/ingress-nginx-tls-certs" to the local store
I0903 03:08:42.317294       9 nginx.go:686] Starting TLS proxy for SSL Passthrough
F0903 03:08:42.317441       9 nginx.go:698] listen tcp :443: bind: permission denied

@rochacon Yes, SSL pass-through is disabled on that other cluster, so that could be the cause.

I can confirm this is an issue when SSL pass-through is enabled. The error is related to the securityContext settings, where we use runAsUser: 33. To run as that user, we use authbind, which works fine for binaries not built with Go. To fix this issue, I am planning to remove the Go proxy that enables SSL pass-through and use the SNI preread feature from NGINX instead.

Edit: as a workaround, using runAsUser: 0 and adding the flag --enable-dynamic-certificates should fix this error (using 0.19.0)
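
For readers applying the workaround above, here is a minimal sketch of how it might look in the container spec of the Deployment posted in the issue. It is a fragment, not a complete manifest. The 0.19.0 image tag is an assumption, built from the image path in the original manifest and the "using 0.19.0" note; everything else is copied from that manifest apart from runAsUser and the added flag.

```yaml
# Fragment of the Deployment's pod template (spec.template.spec), not a full manifest.
containers:
  - name: nginx-ingress-controller
    # Assumed tag: the workaround was stated for 0.19.0
    image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.19.0
    args:
      - /nginx-ingress-controller
      - --enable-ssl-passthrough
      # Flag named in the workaround
      - --enable-dynamic-certificates
      # Keep the remaining args from the original manifest here
    securityContext:
      capabilities:
        drop:
          - ALL
        add:
          - NET_BIND_SERVICE
      # Workaround: run as root (0) instead of www-data (33)
      runAsUser: 0
```

Whether this is enough may depend on the environment: later replies in this thread report a different permission error after trying it.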

I tried this workaround, but I am getting:
Error generating self-signed certificate: could not create temp pem file /etc/ingress-controller/ssl/default-fake-certificate.pem: open /etc/ingress-controller/ssl/default-fake-certificate.pem993917147: permission denied

running with:

        args:
          - /nginx-ingress-controller
          - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
          - --configmap=$(POD_NAMESPACE)/nginx-configuration
          - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
          - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
          - --publish-service=$(POD_NAMESPACE)/ingress-nginx
          - --annotations-prefix=nginx.ingress.kubernetes.io
          - --enable-ssl-passthrough
          - --enable-dynamic-certificates
          - --enable-dynamic-configuration
          - --enable-ssl-chain-completion=false
          - --update-status

@aledbf I am using

helm install stable/nginx-ingress --name ingress-nginx --namespace ingress-nginx --set controller.extraArgs.enable-ssl-passthrough="" --set controller.extraArgs.enable-dynamic-certificates=""

how do I set runAsUser?

To all users affected by this issue: please use quay.io/kubernetes-ingress-controller/nginx-ingress-controller-amd64:fix-tcp-udp

I can't use that tag:

helm install stable/nginx-ingress --name ingress-nginx --namespace ingress-nginx --set controller.extraArgs.enable-ssl-passthrough="" --set controller.image.repository="quay.io/kubernetes-ingress-controller/nginx-ingress-controller-amd64" --set controller.image.tag="fix-tcp-udp"
Error: render error in "nginx-ingress/templates/controller-deployment.yaml": template: nginx-ingress/templates/controller-deployment.yaml:51:22: executing "nginx-ingress/templates/controller-deployment.yaml" at <semverCompare ">=0.9...>: error calling semverCompare: Invalid Semantic Version

The tag fix-tcp-udp is not usable from helm. A better workaround seems to be running helm install with --version 0.17.1 to select the previous version of the nginx-ingress chart.

@stephen-dahl @GeertJohan That's on purpose. This image is just a temporary workaround until the next release. Install the helm chart and then use kubectl set image to switch to the fix-tcp-udp image.

Will that fix be ported to a mainstream release, like 0.19.1?

@OrlinVasilev Already done in #3038 (part of 0.20).

@aledbf Where can I find the release date of 0.20?

@OrlinVasilev We release every three or four weeks. The next release should be in a week or two.
Please check https://github.com/kubernetes/ingress-nginx/projects/27

@stephen-dahl @GeertJohan That's on purpose. This image is just a temporary workaround until the next release. Install the helm chart and then use kubectl set image to switch to the fix-tcp-udp image.

Should anyone need the command:

kubectl -n kube-system set image deployments/nginx-ingress-controller *=quay.io/kubernetes-ingress-controller/nginx-ingress-controller-amd64:fix-tcp-udp

I don't have the problem in 0.20 anymore.

I am having this issue with version 0.22. Is it back? 0.19 works all right.

Strange, for me it worked fine in 0.22.0, but I started getting this issue in 0.23.0.

Logs:

E0326 17:13:41.800364       6 controller.go:184] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/03/26 17:13:41 [notice] 36#36: ModSecurity-nginx v1.0.0
nginx: the configuration file /tmp/nginx-cfg292931438 syntax is ok
2019/03/26 17:13:41 [emerg] 36#36: bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: configuration file /tmp/nginx-cfg292931438 test failed

EDIT: My issue is 99.9% related to #3858.

EDIT 2: I tried running as root (runAsUser: 0), but that just gave me another issue that's mentioned in #3589, which I can't seem to figure out:

F0326 17:24:10.760302       6 main.go:116] Error generating self-signed certificate: could not create temp pem file /etc/ingress-controller/ssl/default-fake-certificate.pem: open /etc/ingress-controller/ssl/default-fake-certificate.pem255531222: permission denied

I am also having the same issue with the latest version of nginx-ingress in the stable repo of helm. @anton-johansson's workaround of using 0.22.0 works.

$ helm install --wait --name nginx-ci-test stable/nginx-ingress
Error: release nginx-ci-test failed: timed out waiting for the condition

Pod logs were:

2019/04/04 18:48:54 [emerg] 131#131: bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: configuration file /tmp/nginx-cfg361707631 test failed

Same here. I see a similar issue on Ubuntu 18.04.2 with nginx-ingress version 0.23.0 [git-be1329b22] from the stable repo of helm. I tried 0.22.0: same issue. I also tried the latest, 0.24.1: same issue.
The Docker engine is configured with overlay2 and uses extfs.

kube-system nginx-ingress-controller-f44c6f76b-q7srb 0/1 CrashLoopBackOff 9 16m

Logs:

E0510 23:46:11.181961 6 controller.go:184] Unexpected failure reloading the backend:

Error: exit status 1
2019/05/10 23:46:11 [notice] 42#42: ModSecurity-nginx v1.0.0
nginx: the configuration file /tmp/nginx-cfg596781409 syntax is ok
2019/05/10 23:46:11 [emerg] 42#42: bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
nginx: configuration file /tmp/nginx-cfg596781409 test failed

Any suggestions?

I had this error as well when using --enable-ssl-passthrough in 0.22.0. Upgrading to 0.24.1 solved my issue.

In my case, I had to add these two parameters to the deployment to get it to work:

--enable-ssl-chain-completion=false
--enable-ssl-passthrough

In my case, I had to add these two parameters to the deployment to get it to work:

--enable-ssl-chain-completion=false
--enable-ssl-passthrough

I am using Calico as the pod network on 1.19.0, and I observed that before I passed the flags you pointed out, the controller used to restart. Now it runs flawlessly.

Thanks for the tip. :)
