What keywords did you search in NGINX Ingress controller issues before filing this one? publish-service
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
NGINX Ingress controller version:
1.17.1/0.18.0
Kubernetes version (use kubectl version):
1.10.4
Environment:
What happened:
We use "publish-service" flag to announce the ingress service ip to ingress resources so that external-dns can collect them and create DNS entries.
After some uptime nginx-ingress stops to announce its ip address to the ingress resources. A recreation of the nginx pods resolve this issue.
What you expected to happen:
I expect nginx-ingress to push its service IP address to every available Ingress resource.
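For reference, when publishing works we expect the service IP to show up in the Ingress status; a minimal sketch of what that looks like for one of our resources (names and IP are just examples from our setup):

kubectl -n kube-system get ingress pi-ingress -o yaml
...
status:
  loadBalancer:
    ingress:
    - ip: 172.22.246.1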
How to reproduce it (as minimally and precisely as possible):
We use the following args set:
- args:
  - /nginx-ingress-controller
  - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
  - --configmap=$(POD_NAMESPACE)/nginx-configuration
  - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
  - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
  - --annotations-prefix=nginx.ingress.kubernetes.io
  - --watch-namespace=$(POD_NAMESPACE)
  - --enable-ssl-passthrough
  - --publish-service=$(POD_NAMESPACE)/ingress
The problem seems to appear at random times. This time the nginx-ingress controller pods had been up for 27 days, but I cannot tell exactly when it stopped working, since we don't deploy new Ingress resources on a daily basis.
This only affects new Ingress resources; existing ones keep the ingress service IP (unless they are deleted and recreated).
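Concretely, a sketch of what we do to check (pi-ingress.yaml is a placeholder for whatever Ingress manifest we normally deploy):

kubectl -n kube-system delete ingress pi-ingress
kubectl -n kube-system apply -f pi-ingress.yaml
kubectl -n kube-system get ingress pi-ingress -w
# when everything works, the ADDRESS column is filled within a minute or two; when the bug hits, it stays empty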
Anything else we need to know:
Nothing too interesting in the logs, but I'll provide them anyway:
(I deleted and recreated an ingress resource 'pi-ingress')
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: I0903 15:36:49.382976 6 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kube-system", Name:"pi-ingress", UID:"b9222892-af8a-11e8-97bd-ea596d21e776", APIVersion:"extensions/v1beta1", ResourceVersion:"16850640", FieldPath:""}): type: 'Normal' reason: 'DELETE' Ingress kube-system/pi-ingress
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: I0903 15:36:49.383458 6 controller.go:169] Configuration changes detected, backend reload required.
kube-system/nginx-ingress-controller-5445cd64fc-68q5b[nginx-ingress-controller]: I0903 15:36:49.384144 7 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kube-system", Name:"pi-ingress", UID:"b9222892-af8a-11e8-97bd-ea596d21e776", APIVersion:"extensions/v1beta1", ResourceVersion:"16850640", FieldPath:""}): type: 'Normal' reason: 'DELETE' Ingress kube-system/pi-ingress
kube-system/nginx-ingress-controller-5445cd64fc-68q5b[nginx-ingress-controller]: I0903 15:36:49.384360 7 controller.go:169] Configuration changes detected, backend reload required.
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: I0903 15:36:49.528784 6 controller.go:185] Backend successfully reloaded.
kube-system/nginx-ingress-controller-5445cd64fc-68q5b[nginx-ingress-controller]: I0903 15:36:49.544128 7 controller.go:185] Backend successfully reloaded.
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: I0903 15:36:59.481239 6 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kube-system", Name:"pi-ingress", UID:"3259eb1e-af8f-11e8-97bd-ea596d21e776", APIVersion:"extensions/v1beta1", ResourceVersion:"16850681", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress kube-system/pi-ingress
kube-system/nginx-ingress-controller-5445cd64fc-68q5b[nginx-ingress-controller]: I0903 15:36:59.481564 7 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kube-system", Name:"pi-ingress", UID:"3259eb1e-af8f-11e8-97bd-ea596d21e776", APIVersion:"extensions/v1beta1", ResourceVersion:"16850681", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress kube-system/pi-ingress
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: I0903 15:36:59.482772 6 backend_ssl.go:60] Updating Secret "kube-system/default-ssl-cert" in the local store
kube-system/nginx-ingress-controller-5445cd64fc-68q5b[nginx-ingress-controller]: I0903 15:36:59.483366 7 backend_ssl.go:60] Updating Secret "kube-system/default-ssl-cert" in the local store
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: I0903 15:36:59.483061 6 controller.go:169] Configuration changes detected, backend reload required.
kube-system/nginx-ingress-controller-5445cd64fc-68q5b[nginx-ingress-controller]: I0903 15:36:59.483874 7 controller.go:169] Configuration changes detected, backend reload required.
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: I0903 15:36:59.608594 6 controller.go:185] Backend successfully reloaded.
kube-system/nginx-ingress-controller-5445cd64fc-68q5b[nginx-ingress-controller]: I0903 15:36:59.639084 7 controller.go:185] Backend successfully reloaded.
kube-system/nginx-ingress-controller-5445cd64fc-m59lp[nginx-ingress-controller]: W0903 15:37:04.634612 6 reflector.go:341] k8s.io/ingress-nginx/internal/ingress/controller/store/store.go:152: watch of *v1.ConfigMap ended with: too old resource version: 16848926 (16850385)
This looks fine to me, but the ingress resource never gets an IP address.
The output of kubectl get ingress looks like this:
NAME                           HOSTS                                                ADDRESS        PORTS     AGE
k8s-toolbox-ingress            k8s-toolbox-dev.example.com                          172.22.246.1   80, 443   28d
kube-opsview-ingress           opsview.example.com,kube-system-stage.example.com    172.22.246.1   80, 443   26d
kubernetes-dashboard-ingress   dashboard-dev.example.com                            172.22.246.1   80, 443   39d
pi-ingress                     pi-dev.example.com                                                  80, 443   39m
@TheKangaroo in that example the status of the ingress changes because you are deleting the Ingress and then creating a new one with the same name. The status update is executed every 60 seconds, so after (at most) two minutes you should see the status.
Sure, I see that I have to wait one to two minutes for the address to be updated. Unfortunately, this stops working at some point. As you can see in the last snippet, the ingress resource didn't get an address even after 39 minutes:
NAME                           HOSTS                                                ADDRESS        PORTS     AGE
k8s-toolbox-ingress            k8s-toolbox-dev.example.com                          172.22.246.1   80, 443   28d
kube-opsview-ingress           opsview.example.com,kube-system-stage.example.com    172.22.246.1   80, 443   26d
kubernetes-dashboard-ingress   dashboard-dev.example.com                            172.22.246.1   80, 443   39d
pi-ingress                     pi-dev.example.com                                                  80, 443   39m
Unfortunately, this stops working at some point
When the ingress controller's status updater changes something, you should see a log line like https://github.com/kubernetes/ingress-nginx/blob/master/internal/ingress/status/status.go#L362 (also when the status is being cleared).
Please use kubectl logs <ingress pod> | grep "status.go" to find out when this happens.
Are you running multiple deployments of the ingress controller?
I see this entry when I delete and recreate an ingress resource:
I0904 14:21:25.643599 7 status.go:362] updating Ingress kube-system/pi-ingress status to [{172.22.248.8 }]
But right now the ingress works as expected. I will check the logs for this line when the problem reappears. But I don't think I will see anything in the logs, since it would have shown up in the logs in my original post, right?
But I don't think I will see anything in the logs, since it would have shown up in the logs in my original post, right?
Right, but now you know what to search for in the logs. Also, check that the pod has not restarted when you look at the logs. If it has, add the --previous flag to the kubectl command.
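In concrete terms, something like this (the pod name is a placeholder; the namespace is kube-system as in the logs above):

kubectl -n kube-system logs <ingress pod> | grep "status.go"
kubectl -n kube-system logs <ingress pod> --previous | grep "status.go"   # only if the pod has restarted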
Hi @aledbf, I experienced this issue again. From the logs I see the line:
I0927 09:16:57.395266 10 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"mynamespace", Name:"demo-ingress", UID:"1529a413-c236-11e8-8a6c-0050568460f6", APIVersion:"extensions/v1beta1", ResourceVersion:"45628488", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress mynamespace/demo-ingress
I0927 09:16:57.510689 10 controller.go:185] Backend successfully reloaded.
I0927 09:16:57.512045 10 controller.go:202] Dynamic reconfiguration succeeded.
Unfortunately, I see no status.go log line anywhere near this event. After a restart of the nginx container, everything worked again. I'm a little lost as to how I can debug this and get more information. Do you have any ideas on how to troubleshoot it?
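In case it helps anyone else debugging the same thing, this is how I've been checking the status field directly (namespace and ingress name are taken from the log line above):

kubectl -n mynamespace get ingress demo-ingress -o jsonpath='{.status.loadBalancer.ingress}'
kubectl -n mynamespace get ingress demo-ingress -w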
We're experiencing exactly the same issue. After some uptime, ingress-nginx stops updating the 'status' field of our Ingresses. A restart helps, but that doesn't look like a production solution.
I've gone through the logs and don't see anything related to status.go, controller.go or event.go, so it seems all of these "processes" stopped some time ago.
Is it possible to somehow check that the status.go logic has stopped in this pod, or to reinitiate it?
It's actually really strange that nobody else has reported this issue; maybe we are doing something wrong and it's just a configuration problem on our side...
Thank you in advance!
Same problem on my side. My workaround was to delete the pod so that it gets recreated.
thank you @aledbf ❤️
I seem to have a similar issue, in that the IP address of the service is not being published, which seems to be breaking external-dns.
Kubernetes: v1.11.5
NGINX Ingress controller: 0.21.0
I have set --publish-service=networking/nginx-ingress-external-controller
kubectl get svc --all-namespaces
NAMESPACE    NAME                                TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
networking   nginx-ingress-external-controller   LoadBalancer   172.16.0.112   84.22.190.178   80:31615/TCP,443:30726/TCP   23m
So I have an external load balancer stood up with IP: 84.22.190.178.
kubectl get ing --all-namespaces
NAMESPACE   NAME    HOSTS                     ADDRESS          PORTS     AGE
default     echo    echo.exampledomain.com    <emptyaddress>   80, 443   9m
default     myapp   myapp.exampledomain.com   <emptyaddress>   80, 443   6d
My understanding is that setting --publish-service means that nginx will take the external address from that Service (e.g. 84.22.190.178) and publish it into the Ingress resources' ADDRESS field.
Can anyone shed any light?
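If it helps, a quick sketch of how to check both halves of that assumption, using the names from the --publish-service flag and the echo Ingress above (the jsonpath output will be empty if nothing was published):

kubectl -n networking get svc nginx-ingress-external-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
kubectl -n default get ingress echo -o jsonpath='{.status.loadBalancer.ingress}'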
So, for anyone who arrives here after looking at #3368, #3180, #250, #2085: I have been able to reproduce this on a GKE 1.13.11-gke.23 cluster with Helm chart 1.30.0, which uses nginx-ingress 0.28.0.
Downgrading to Helm chart 1.4.0, which uses 0.23.0, made the controller update the load balancer status in the resources again, but that is going too far back in time. I tried a few other nginx-controller combinations, to no avail, and had no idea what could be going wrong.
Finally, I decided to upgrade the cluster to 1.14.10-gke.21, and it's all working now... so I would recommend you upgrade to 1.14; it solved the issue for me.
I still have the logs from nginx-ingress-controller if someone is interested; they would need anonymizing first, but they're available if that helps.
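For reference, the cluster upgrade itself was nothing special; roughly something like this (cluster name and zone are placeholders, and node pools may need upgrading separately afterwards):

gcloud container clusters upgrade <cluster-name> --master --cluster-version 1.14.10-gke.21 --zone <zone>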
Can confirm: 1.14 on GKE fixed this for us (with 0.28.0), as @txomon said.
GKE cluster 1.13.11-gke.23 with Helm chart 1.30.0, which uses 0.28.0 nginx-ingress.
This issue is related to clusters running k8s < 1.14. It is fixed in https://github.com/kubernetes/ingress-nginx/pull/4996 and will be released in 0.29.0.
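Once 0.29.0 is available, a quick way to confirm which controller version a pod is actually running (pod name and namespace are placeholders):

kubectl -n <namespace> exec <ingress-controller-pod> -- /nginx-ingress-controller --version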