Hello,
My ingress controller suddenly stopped working. This is the message I get. I deployed it in the past following exactly the instructions here: https://kubernetes.github.io/ingress-nginx/deploy/
Everything was working, but after I restarted Kubernetes and Docker it doesn't work anymore. I tried to redeploy it, but the problem persists. I am running on CentOS 7.
healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
@thzois please use the issue template.
Please post the ingress controller pod logs to see exactly what's happening.
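For reference, a minimal sketch of the commands that collect that information; the ingress-nginx namespace and the label selector are assumptions based on the default deploy manifests, so adjust them to your installation:
$ kubectl -n ingress-nginx get pods -l app.kubernetes.io/name=ingress-nginx
$ kubectl -n ingress-nginx describe pod <controller-pod>
$ kubectl -n ingress-nginx logs <controller-pod> --previous
The --previous flag shows the log of the last crashed container, which is usually where the interesting error is.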
I'm seeing exactly the same error in a GKE cluster
@rimusz can you post the ingress controller pod log and the describe pod output?
sure
$ kubectl describe pod gcstg-use1-nginx-ingress-controller-dn28b
Name: gcstg-use1-nginx-ingress-controller-dn28b
Namespace: gcstg-use1
Priority: 0
PriorityClassName: <none>
Node: gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r/192.168.21.16
Start Time: Wed, 17 Apr 2019 14:57:21 +0300
Labels: app=nginx-ingress
component=controller
controller-revision-hash=3331872658
pod-template-generation=1
release=gcstg-use1-nginx-ingress
Annotations: <none>
Status: Running
IP: 10.96.4.94
Controlled By: DaemonSet/gcstg-use1-nginx-ingress-controller
Containers:
nginx-ingress-controller:
Container ID: docker://fd519c290a450324b8b973856527ab5a1e6da7ae7ea4c02d9f69e31ea75dc35f
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1
Image ID: docker-pullable://docker.jfrog.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:76861d167e4e3db18f2672fd3435396aaa898ddf4d1128375d7c93b91c59f87f
Ports: 80/TCP, 443/TCP, 18080/TCP, 10254/TCP
Host Ports: 80/TCP, 443/TCP, 18080/TCP, 0/TCP
Args:
/nginx-ingress-controller
--default-backend-service=gcstg-use1/gcstg-use1-nginx-ingress-default-backend
--election-id=ingress-controller-leader
--ingress-class=nginx
--configmap=gcstg-use1/gcstg-use1-nginx-ingress-controller
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 17 Apr 2019 15:14:05 +0300
Finished: Wed, 17 Apr 2019 15:14:54 +0300
Ready: False
Restart Count: 9
Liveness: http-get http://:10254/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
Readiness: http-get http://:10254/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
Environment:
POD_NAME: gcstg-use1-nginx-ingress-controller-dn28b (v1:metadata.name)
POD_NAMESPACE: gcstg-use1 (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from gcstg-use1-nginx-ingress-token-dzmdd (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
gcstg-use1-nginx-ingress-token-dzmdd:
Type: Secret (a volume populated by a Secret)
SecretName: gcstg-use1-nginx-ingress-token-dzmdd
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 16m (x8 over 17m) kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r Readiness probe failed: HTTP probe failed with statuscode: 500
Normal Pulled 16m (x3 over 17m) kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r Container image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1" already present on machine
Normal Created 16m (x3 over 17m) kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r Created container
Normal Started 16m (x3 over 17m) kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r Started container
Normal Killing 12m (x6 over 17m) kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r Killing container with id docker://nginx-ingress-controller:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 7m47s (x22 over 17m) kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r Liveness probe failed: HTTP probe failed with statuscode: 500
Warning BackOff 2m46s (x44 over 12m) kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r Back-off restarting failed container
let me fetch the pod's log as well and remove the sensitive stuff
pod log:
$ kubectl logs gcstg-use1-nginx-ingress-controller-dn28b
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: 0.24.1
Build: git-ce418168f
Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------
I0417 12:14:05.324179 8 flags.go:185] Watching for Ingress class: nginx
W0417 12:14:05.324448 8 flags.go:214] SSL certificate chain completion is disabled (--enable-ssl-chain-completion=false)
nginx version: nginx/1.15.10
W0417 12:14:05.333045 8 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0417 12:14:05.333342 8 main.go:205] Creating API client for https://10.94.0.1:443
I0417 12:14:05.348097 8 main.go:249] Running in Kubernetes cluster version v1.11+ (v1.11.7-gke.12) - git (clean) commit 06f08e60069231bd21bdf673cf0595aac80b99f6 - platform linux/amd64
I0417 12:14:05.350047 8 main.go:102] Validated gcstg-use1/gcstg-use1-nginx-ingress-default-backend as the default backend.
I0417 12:14:05.595193 8 main.go:124] Created fake certificate with PemFileName: /etc/ingress-controller/ssl/default-fake-certificate.pem
I0417 12:14:05.624584 8 nginx.go:265] Starting NGINX Ingress controller
...
I0417 12:14:06.854240 8 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"xxxx", Name:"xxxx-xxxx-xxxx-server", UID:"c5e0675e-1790-11e9-99c1-4201ac100003", APIVersion:"extensions/v1beta1", ResourceVersion:"89377026", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress xxxx/xxxx-xxxx-xxxx-server
I0417 12:14:06.854563 8 backend_ssl.go:68] Adding Secret "xxxx/xxxx-info-secret" to the local store
I0417 12:14:06.854751 8 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"gcstg-use1", Name:"gcstg-use1-xxxx-monitoring-page", UID:"98c47e93-6017-11e9-935b-4201ac100009", APIVersion:"extensions/v1beta1", ResourceVersion:"89377033", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress gcstg-use1/gcstg-use1-xxxx-monitoring-page
I0417 12:14:06.855147 8 backend_ssl.go:68] Adding Secret "gcstg-use1/gcstg-use1-xxx-xxxx-info-secret" to the local store
I0417 12:14:06.931716 8 nginx.go:311] Starting NGINX process
I0417 12:14:06.931797 8 leaderelection.go:217] attempting to acquire leader lease gcstg-use1/ingress-controller-leader-nginx...
W0417 12:14:06.936231 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:06.936369 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:06.939031 8 leaderelection.go:227] successfully acquired lease gcstg-use1/ingress-controller-leader-nginx
I0417 12:14:06.941224 8 status.go:86] new leader elected: gcstg-use1-nginx-ingress-controller-dn28b
I0417 12:14:06.941335 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:07.198944 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:07 [notice] 54#54: ModSecurity-nginx v1.0.0
2019/04/17 12:14:07 [emerg] 54#54: invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: configuration file /tmp/nginx-cfg919104772 test failed
-------------------------------------------------------------------------------
W0417 12:14:07.198995 8 queue.go:130] requeuing initial-sync, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:07 [notice] 54#54: ModSecurity-nginx v1.0.0
2019/04/17 12:14:07 [emerg] 54#54: invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: configuration file /tmp/nginx-cfg919104772 test failed
-------------------------------------------------------------------------------
W0417 12:14:10.269739 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:10.269883 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:10.270525 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:10.512571 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:10 [notice] 62#62: ModSecurity-nginx v1.0.0
2019/04/17 12:14:10 [emerg] 62#62: invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: configuration file /tmp/nginx-cfg612748947 test failed
-------------------------------------------------------------------------------
W0417 12:14:10.512619 8 queue.go:130] requeuing kuku/jmeter-reporter, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:10 [notice] 62#62: ModSecurity-nginx v1.0.0
2019/04/17 12:14:10 [emerg] 62#62: invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: configuration file /tmp/nginx-cfg612748947 test failed
-------------------------------------------------------------------------------
W0417 12:14:13.603063 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:13.603133 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:13.603661 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:13.853688 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:13 [notice] 74#74: ModSecurity-nginx v1.0.0
2019/04/17 12:14:13 [emerg] 74#74: invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: configuration file /tmp/nginx-cfg020726998 test failed
-------------------------------------------------------------------------------
W0417 12:14:13.853736 8 queue.go:130] requeuing xxxxgcp01/xxxxgcp01-xxxx-rabbitmq-ha, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:13 [notice] 74#74: ModSecurity-nginx v1.0.0
2019/04/17 12:14:13 [emerg] 74#74: invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: configuration file /tmp/nginx-cfg020726998 test failed
-------------------------------------------------------------------------------
W0417 12:14:16.936792 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:16.937059 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:16.938282 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:17.205245 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:17 [notice] 81#81: ModSecurity-nginx v1.0.0
2019/04/17 12:14:17 [emerg] 81#81: invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: configuration file /tmp/nginx-cfg147961405 test failed
-------------------------------------------------------------------------------
W0417 12:14:17.205297 8 queue.go:130] requeuing gcstg-use1/xxxx.info-nfs, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:17 [notice] 81#81: ModSecurity-nginx v1.0.0
2019/04/17 12:14:17 [emerg] 81#81: invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: configuration file /tmp/nginx-cfg147961405 test failed
-------------------------------------------------------------------------------
W0417 12:14:20.269750 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:20.269825 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:20.270655 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:20.522354 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:20 [notice] 88#88: ModSecurity-nginx v1.0.0
2019/04/17 12:14:20 [emerg] 88#88: invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: configuration file /tmp/nginx-cfg161907320 test failed
-------------------------------------------------------------------------------
W0417 12:14:20.522398 8 queue.go:130] requeuing xxxxcentralgcstguse1/xxxxcentralgcstguse1-xxxx-rabbitmq-ha-discovery, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:20 [notice] 88#88: ModSecurity-nginx v1.0.0
2019/04/17 12:14:20 [emerg] 88#88: invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: configuration file /tmp/nginx-cfg161907320 test failed
-------------------------------------------------------------------------------
E0417 12:14:22.590609 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0417 12:14:22.592768 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0417 12:14:23.603091 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:23.603158 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:23.603716 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:23.859675 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:23 [notice] 95#95: ModSecurity-nginx v1.0.0
2019/04/17 12:14:23 [emerg] 95#95: invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: configuration file /tmp/nginx-cfg482805111 test failed
-------------------------------------------------------------------------------
W0417 12:14:23.859737 8 queue.go:130] requeuing kube-system/gcp-controller-manager, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:23 [notice] 95#95: ModSecurity-nginx v1.0.0
2019/04/17 12:14:23 [emerg] 95#95: invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: configuration file /tmp/nginx-cfg482805111 test failed
-------------------------------------------------------------------------------
W0417 12:14:26.936343 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:26.936410 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:26.936987 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:27.223886 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:27 [notice] 102#102: ModSecurity-nginx v1.0.0
2019/04/17 12:14:27 [emerg] 102#102: invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: configuration file /tmp/nginx-cfg479136874 test failed
-------------------------------------------------------------------------------
W0417 12:14:27.223950 8 queue.go:130] requeuing xxxxgcp7/xxxxgcp7-xxxx-xxxx-server, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:27 [notice] 102#102: ModSecurity-nginx v1.0.0
2019/04/17 12:14:27 [emerg] 102#102: invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: configuration file /tmp/nginx-cfg479136874 test failed
-------------------------------------------------------------------------------
W0417 12:14:30.269805 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:30.269957 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:30.270695 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:30.534347 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:30 [notice] 110#110: ModSecurity-nginx v1.0.0
2019/04/17 12:14:30 [emerg] 110#110: invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: configuration file /tmp/nginx-cfg755472065 test failed
-------------------------------------------------------------------------------
W0417 12:14:30.534398 8 queue.go:130] requeuing xxxx02/xxxx02-xxxx-rabbitmq-ha-discovery, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:30 [notice] 110#110: ModSecurity-nginx v1.0.0
2019/04/17 12:14:30 [emerg] 110#110: invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: configuration file /tmp/nginx-cfg755472065 test failed
-------------------------------------------------------------------------------
E0417 12:14:32.590578 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0417 12:14:32.592722 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0417 12:14:33.603075 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:33.603131 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:33.603649 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:33.892027 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:33 [notice] 117#117: ModSecurity-nginx v1.0.0
2019/04/17 12:14:33 [emerg] 117#117: invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: configuration file /tmp/nginx-cfg153138988 test failed
-------------------------------------------------------------------------------
W0417 12:14:33.892083 8 queue.go:130] requeuing xxxx/xxxx-xxxx-xxxx-persist, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:33 [notice] 117#117: ModSecurity-nginx v1.0.0
2019/04/17 12:14:33 [emerg] 117#117: invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: configuration file /tmp/nginx-cfg153138988 test failed
-------------------------------------------------------------------------------
W0417 12:14:36.936405 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:36.936474 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:36.937014 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:37.240681 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:37 [notice] 126#126: ModSecurity-nginx v1.0.0
2019/04/17 12:14:37 [emerg] 126#126: invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: configuration file /tmp/nginx-cfg213387931 test failed
-------------------------------------------------------------------------------
W0417 12:14:37.240730 8 queue.go:130] requeuing shlomidemo70/shlomidemo70-xxxx-rabbitmq-ha-discovery, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:37 [notice] 126#126: ModSecurity-nginx v1.0.0
2019/04/17 12:14:37 [emerg] 126#126: invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: configuration file /tmp/nginx-cfg213387931 test failed
-------------------------------------------------------------------------------
W0417 12:14:40.269796 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:40.269922 8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:40.270550 8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:40.504945 8 controller.go:182] Unexpected failure reloading the backend:
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:40 [notice] 133#133: ModSecurity-nginx v1.0.0
2019/04/17 12:14:40 [emerg] 133#133: invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: configuration file /tmp/nginx-cfg059340094 test failed
-------------------------------------------------------------------------------
W0417 12:14:40.504990 8 queue.go:130] requeuing xxxx02/xxxx02-xxxx-xxxx-indexer, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:40 [notice] 133#133: ModSecurity-nginx v1.0.0
2019/04/17 12:14:40 [emerg] 133#133: invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: configuration file /tmp/nginx-cfg059340094 test failed
-------------------------------------------------------------------------------
E0417 12:14:42.590939 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0417 12:14:42.592859 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0417 12:14:42.893735 8 main.go:172] Received SIGTERM, shutting down
I0417 12:14:42.893814 8 nginx.go:387] Shutting down controller queues
I0417 12:14:42.893866 8 status.go:116] updating status of Ingress rules (remove)
I0417 12:14:42.912037 8 nginx.go:395] Stopping NGINX process
2019/04/17 12:14:42 [notice] 134#134: signal process started
I0417 12:14:43.925234 8 nginx.go:408] NGINX process has stopped
I0417 12:14:43.925284 8 main.go:180] Handled quit, awaiting Pod deletion
E0417 12:14:52.593093 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0417 12:14:53.925529 8 main.go:183] Exiting with 0
2019/04/17 12:14:40 [emerg] 133#133: invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
@rimusz in your case the issue is related to a bad configuration. Are you using custom snippets?
Note: this is not going to be an issue in 0.25, thanks to https://github.com/kubernetes/ingress-nginx/pull/3802
no, we don't use any custom snippets there
any timeline for 0.25 release?
no, we don't use any custom snippets there
Ok, then use kubectl exec <ingress pod> -- cat /tmp/nginx-cfg059340094 to see exactly what's wrong
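As a hedged sketch only (the /tmp/nginx-cfg* name changes on every reload attempt, so use whichever file the latest error message names, if it still exists), you can also print just the region around the reported line 4321:
$ kubectl -n gcstg-use1 exec gcstg-use1-nginx-ingress-controller-dn28b -- sh -c 'sed -n "4310,4330p" /tmp/nginx-cfg059340094'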
any timeline for 0.25 release?
~3 weeks
Hello, I have not managed to reproduce the issue, which is why I didn't post anything. I restarted the controller 2-3 times and the issue was resolved.
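For anyone else landing here, a hedged sketch of that workaround (it is only a restart, not a fix, as pointed out later in this thread); the namespace and label are assumptions from the default manifests:
$ kubectl -n ingress-nginx delete pod -l app.kubernetes.io/name=ingress-nginx
The DaemonSet/Deployment recreates the pods, which is effectively the same as the 2-3 manual restarts described above.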
I'm also facing this issue with nginx-ingress 0.24.1:
I0603 08:23:49.076078 8 nginx.go:311] Starting NGINX process
I0603 08:23:49.076198 8 leaderelection.go:217] attempting to acquire leader lease utils/ingress-controller-leader-nginx...
I0603 08:23:49.079065 8 status.go:86] new leader elected: nginx-ingress-controller-84bb6995c5-rhzx6
I0603 08:23:49.114287 8 controller.go:170] Configuration changes detected, backend reload required.
2019/06/03 08:24:05 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0603 08:24:05.078110 8 nginx_status.go:172] unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:24:17.493531 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
2019/06/03 08:24:32 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0603 08:24:32.776660 8 nginx_status.go:172] unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:24:47.493559 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:24:55.022540 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
2019/06/03 08:25:05 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0603 08:25:05.080689 8 nginx_status.go:172] unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:25:17.493466 8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I only faced this issue in my production environment and not in my staging/qa environments.
The most interesting thing is that eventually after some restarts, the pods start working.
The same issue. It started to happen immediately after I deleted the etingroup-sync pod so the ReplicaSet would recreate it automatically. That pod itself works fine.
kubectl --context=etingroup-production -n nginx-ingress logs -f pod/nginx-ingress-controller-589f7bc68f-g2bx6
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: 0.24.1
Build: git-ce418168f
Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------
I0708 10:11:31.107184 6 flags.go:185] Watching for Ingress class: nginx
W0708 10:11:31.107602 6 flags.go:214] SSL certificate chain completion is disabled (--enable-ssl-chain-completion=false)
nginx version: nginx/1.15.10
W0708 10:11:33.680844 6 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0708 10:11:33.683650 6 main.go:205] Creating API client for https://10.55.240.1:443
I0708 10:11:35.775897 6 main.go:249] Running in Kubernetes cluster version v1.13+ (v1.13.6-gke.13) - git (clean) commit fcbc1d20b6bca1936c0317743055ac75aef608ce - platform linux/amd64
I0708 10:11:35.784068 6 main.go:102] Validated nginx-ingress/nginx-ingress-default-backend as the default backend.
I0708 10:11:49.038667 6 main.go:124] Created fake certificate with PemFileName: /etc/ingress-controller/ssl/default-fake-certificate.pem
W0708 10:11:51.214288 6 store.go:613] Unexpected error reading configuration configmap: configmaps "nginx-ingress-controller" not found
I0708 10:11:51.713804 6 nginx.go:265] Starting NGINX Ingress controller
E0708 10:11:54.582229 6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:11:57.173844 6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"argocd", Name:"argocd-server-http-ingress", UID:"1e332eb9-8451-11e9-9fe8-42010a8400b2", APIVersion:"extensions/v1beta1", ResourceVersion:"13010210", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress argocd/argocd-server-http-ingress
I0708 10:11:57.957309 6 nginx.go:311] Starting NGINX process
I0708 10:11:57.959214 6 leaderelection.go:217] attempting to acquire leader lease nginx-ingress/ingress-controller-leader-nginx...
I0708 10:11:58.130491 6 controller.go:170] Configuration changes detected, backend reload required.
I0708 10:11:58.978471 6 backend_ssl.go:68] Adding Secret "argocd/argocd-secret" to the local store
I0708 10:11:59.730692 6 backend_ssl.go:68] Adding Secret "concourse/concourse-etingroup-pl-tls" to the local store
I0708 10:11:59.723495 6 leaderelection.go:227] successfully acquired lease nginx-ingress/ingress-controller-leader-nginx
I0708 10:11:59.723578 6 status.go:86] new leader elected: nginx-ingress-controller-589f7bc68f-g2bx6
I0708 10:12:00.438162 6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"etingroup", Name:"etingroup-sync", UID:"35893ce3-84ae-11e9-90aa-42010a840106", APIVersion:"extensions/v1beta1", ResourceVersion:"13010211", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress etingroup/etingroup-sync
I0708 10:12:00.438460 6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"concourse", Name:"concourse-web", UID:"22990b6f-8f96-11e9-90aa-42010a840106", APIVersion:"extensions/v1beta1", ResourceVersion:"13010209", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress concourse/concourse-web
E0708 10:12:00.441464 6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:12:00.448210 6 backend_ssl.go:68] Adding Secret "etingroup/main-etingroup-pl-tls" to the local store
I0708 10:12:00.560179 6 main.go:172] Received SIGTERM, shutting down
I0708 10:12:00.561090 6 nginx.go:387] Shutting down controller queues
I0708 10:12:00.562061 6 status.go:116] updating status of Ingress rules (remove)
W0708 10:12:00.606436 6 template.go:108] unexpected error cleaning template: signal: terminated
E0708 10:12:00.612075 6 controller.go:182] Unexpected failure reloading the backend:
invalid NGINX configuration (empty)
W0708 10:12:00.612360 6 queue.go:130] requeuing initial-sync, err invalid NGINX configuration (empty)
I0708 10:12:00.616718 6 status.go:135] removing address from ingress status ([35.195.XXX.XX])
I0708 10:12:00.617685 6 nginx.go:395] Stopping NGINX process
I0708 10:12:00.620943 6 status.go:295] updating Ingress argocd/argocd-server-http-ingress status from [] to [{35.195.XXX.XX }]
I0708 10:12:00.621944 6 status.go:295] updating Ingress etingroup/etingroup-sync status from [] to [{35.195.XXX.XX }]
I0708 10:12:00.623930 6 status.go:295] updating Ingress concourse/concourse-web status from [] to [{35.195.XXX.XX }]
2019/07/08 10:12:00 [notice] 35#35: signal process started
E0708 10:12:04.107941 6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:12:05.757746 6 nginx.go:408] NGINX process has stopped
I0708 10:12:05.761038 6 main.go:180] Handled quit, awaiting Pod deletion
E0708 10:12:14.324338 6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:12:15.937613 6 main.go:183] Exiting with 0
values.yaml for Helm:
nginx-ingress:
controller:
service:
externalTrafficPolicy: "Local"
loadBalancerIP: "35.195.XXX.XX"
publishService:
enabled: true
* X is censored :)
The issue appears in version 1.6.18 of the Helm chart.
It fixed itself after about 15 minutes.
kubectl --context=etingroup-production -n nginx-ingress describe pod/nginx-ingress-controller-589f7bc68f-g2bx6
Name: nginx-ingress-controller-589f7bc68f-g2bx6
Namespace: nginx-ingress
Priority: 0
Node: gke-production-pool-1-6cb6f205-5rft/10.132.0.5
Start Time: Mon, 24 Jun 2019 18:55:28 +0200
Labels: app=nginx-ingress
component=controller
pod-template-hash=589f7bc68f
release=nginx-ingress
Annotations: <none>
Status: Running
IP: 10.52.1.35
Controlled By: ReplicaSet/nginx-ingress-controller-589f7bc68f
Containers:
nginx-ingress-controller:
Container ID: docker://8c469aea0a1ab17bdc4a9849686eb30c8d7810355e4de3935f52f2e7067f4c4a
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1
Image ID: docker-pullable://quay.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:76861d167e4e3db18f2672fd3435396aaa898ddf4d1128375d7c93b91c59f87f
Ports: 80/TCP, 443/TCP
Host Ports: 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--default-backend-service=nginx-ingress/nginx-ingress-default-backend
--publish-service=nginx-ingress/nginx-ingress-controller
--election-id=ingress-controller-leader
--ingress-class=nginx
--configmap=nginx-ingress/nginx-ingress-controller
State: Running
Started: Mon, 08 Jul 2019 12:31:01 +0200
Last State: Terminated
Reason: Error
Exit Code: 143
Started: Mon, 08 Jul 2019 12:25:13 +0200
Finished: Mon, 08 Jul 2019 12:25:52 +0200
Ready: True
Restart Count: 68
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: nginx-ingress-controller-589f7bc68f-g2bx6 (v1:metadata.name)
POD_NAMESPACE: nginx-ingress (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from nginx-ingress-token-n59h8 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
nginx-ingress-token-n59h8:
Type: Secret (a volume populated by a Secret)
SecretName: nginx-ingress-token-n59h8
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 35m (x59 over 13d) kubelet, gke-production-pool-1-6cb6f205-5rft Container image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1" already present on machine
Normal Created 35m (x59 over 13d) kubelet, gke-production-pool-1-6cb6f205-5rft Created container
Normal Killing 35m (x58 over 9d) kubelet, gke-production-pool-1-6cb6f205-5rft Killing container with id docker://nginx-ingress-controller:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 35m (x38 over 9d) kubelet, gke-production-pool-1-6cb6f205-5rft Readiness probe failed: HTTP probe failed with statuscode: 500
Warning Unhealthy 33m (x391 over 13d) kubelet, gke-production-pool-1-6cb6f205-5rft Readiness probe failed: Get http://10.52.1.35:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 32m (x328 over 13d) kubelet, gke-production-pool-1-6cb6f205-5rft Liveness probe failed: Get http://10.52.1.35:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 27m (x147 over 9d) kubelet, gke-production-pool-1-6cb6f205-5rft Liveness probe failed: Get http://10.52.1.35:10254/healthz: dial tcp 10.52.1.35:10254: connect: connection refused
Normal Started 17m (x66 over 13d) kubelet, gke-production-pool-1-6cb6f205-5rft Started container
Warning BackOff 7m48s (x348 over 9d) kubelet, gke-production-pool-1-6cb6f205-5rft Back-off restarting failed container
But considering the number of Unhealthy and BackOff events, this is not the first time it has gone down.
Oh, just one extra thing which may help to reproduce it: these are the Ingress annotations for the Service of this etingroup-sync Pod.
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/whitelist-source-range: "213.XXX.XXX.XXX/32,85.XXX.XXX.XXX/32"
certmanager.k8s.io/cluster-issuer: "letsencrypt"
This time it started after updating another application that does not use nginx.ingress.kubernetes.io/whitelist-source-range.
To those of you who also have this issue: do you use publishService and externalTrafficPolicy for your nginx-ingress? I just updated to the newest ingress, but the issue still exists. It would be great if somebody could fix it... it is a very critical bug.
Has anybody solved it? I still suffer from this issue.
I have the same issue on version 0.25.1.
I think this issue is not being treated with enough tenacity. Honestly, I wonder how one can use this in production, as even our staging tests have kept failing on issues like this for months.
Restarting the controller is not a fix; it's sysadmin patch work.
I've been facing this issue too, and it got fixed after solving a couple of issues in my Ingress resources.
In most cases, I could see Ingress resources deployed in my cluster whose backends had no endpoints available or were not even deployed.
After deleting those useless and problematic Ingress resources, nginx started up normally.
@eljefedelrodeodeljefe Regarding your comment about how one can use this in production, I have to say that we have been running this component in production for 3 years and so far it hasn't caused any outage.
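A minimal sketch of how to spot such backends, assuming the default kubectl table output (a Service whose Endpoints object shows <none> has nothing behind it, so any Ingress pointing at it is a candidate for cleanup):
$ kubectl get endpoints --all-namespaces | awk 'NR==1 || $3=="<none>"'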
@mmingorance-dh are you sure it is the same issue?
When something goes down, the whole ingress is down, not only one service. It works, but randomly it doesn't. I have only a few services in the cluster.
@eljefedelrodeodeljefe Do you use publishService and externalTrafficPolicy for your nginx-ingress? Maybe that is the issue. Probably not many people use it, I guess.
@kwladyka not the same issue then, as we don't use publishService or externalTrafficPolicy
I am also getting this error:
healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: i/o timeout
After some time it works again; the issue is intermittent, and it started working again without any change on my side.
Is there any solution to overcome this?
Closing. Fixed in master https://github.com/kubernetes/ingress-nginx/pull/4531
Can you please give me the image tag?
Currently I'm using quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1.
Last time you gave me something like this, for example:
if you want to test the fix, you can use the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev
quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev
Thanks for your quick response. Last question: is it okay to use the :dev image for production?
Or should we wait for :0.25.2 or something?
It is not OK to use a dev image in production in any project ;)
What do you suggest? Should we wait for 0.25.2 or something?
Because we are getting errors and they are intermittent.
In my case I will wait. I don't see another choice.
So, I've just run into this issue as well.
However, my logs indicate that the backend reload failed, but they don't say why:
Unexpected failure reloading the backend:
2019/09/20 06:26:51 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
nginx: [alert] kill(52, 1) failed (3: No such process)
2019/09/20 06:26:50 [alert] 2096#2096: kill(52, 1) failed (3: No such process)
2019/09/20 06:26:50 [notice] 2096#2096: signal process started
2019/09/20 06:26:50 [notice] 2096#2096: ModSecurity-nginx v1.0.0
nginx: [alert] kill(52, 1) failed (3: No such process)
2019/09/20 06:26:50 [alert] 2096#2096: kill(52, 1) failed (3: No such process)
2019/09/20 06:26:50 [notice] 2096#2096: signal process started
2019/09/20 06:26:50 [notice] 2096#2096: ModSecurity-nginx v1.0.0
exit status 1
requeuing nginx-private/nginx-private-nginx-ingress-controller-metrics, err exit status 1
Unexpected failure reloading the backend:
Configuration changes detected, backend reload required.
Configuration changes detected, backend reload required.
2019/09/20 06:26:48 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
2019/09/20 06:26:48 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
-------------------------------------------------------------------------------
2019/09/20 06:26:47 [notice] 122#122: ModSecurity-nginx v1.0.0
Error: signal: killed
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
2019/09/20 06:26:47 [notice] 122#122: ModSecurity-nginx v1.0.0
Error: signal: killed
-------------------------------------------------------------------------------
Will this also be solved in the next release or is this a different issue?
@OGKevin yes, you can check this using the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev
The dev image doesn't seem to work. But while trying to solve this problem, I found another error saying 'cannot list resource "ingresses" in API group "networking.k8s.io"', so I added the following; the error went away, and the ingress controller has not restarted since then.
It seems you are using k8s >= 1.14. For that reason, you need to update the roles to be able to use that API: https://github.com/kubernetes/ingress-nginx/blob/master/deploy/static/rbac.yaml#L53
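A hedged way to check whether the controller's ServiceAccount already has that permission; the nginx-ingress namespace and ServiceAccount name are assumptions, so substitute whatever your deployment uses:
$ kubectl auth can-i list ingresses.networking.k8s.io --as=system:serviceaccount:nginx-ingress:nginx-ingress
If this prints no, the ClusterRole is still missing the networking.k8s.io rule from the rbac.yaml linked above.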
Eventually we fixed this by upgrading the nodes to 1.14.7-gke.10. After that, the for i in $(seq 1 200); do curl localhost:10254/healthz; done loop inside the ingress-nginx container finished in a few seconds, whereas before it took minutes. It could well be that the upgrade triggered a reset of the root cause, which is still unknown to me. Or maybe nginx-ingress-controller:0.26.1 somehow works better with the newer Kubernetes version.
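For reference, a sketch of running that same probe-latency check from outside the pod (the pod name is a placeholder and the namespace is an assumption):
$ kubectl -n ingress-nginx exec <controller-pod> -- sh -c 'for i in $(seq 1 200); do curl -s -o /dev/null localhost:10254/healthz; done'
If this takes minutes instead of a few seconds, you are hitting the slow-healthz behaviour described above.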
I still have this issue:
kubectl --context=etingroup-production get node
NAME STATUS ROLES AGE VERSION
gke-production-pool-1-ce587bf0-rxwq Ready <none> 31m v1.14.7-gke.10

ingress version:
tag: "0.26.1"
Is it possible that it fails because a third-party pod behind an nginx-ingress Service fails? Will nginx-ingress fail because a third-party app fails?
I have the same behavior with AKS 1.14.8 and nginx-controller 0.27.1 + HPA. @kwladyka
So, does anybody know the root cause?
@ZzzJing check https://github.com/kubernetes/ingress-nginx/issues/4735
Hello, I have not managed to reproduce the issue, which is why I didn't post anything. I restarted the controller 2-3 times and the issue was resolved.
It worked for me.