Hi,
Is there any way to prevent permanent DNS caching, especially when setting up a reverse proxy through an ingress + headless service with ExternalName?
If the upstream target changes IP, I need to rescale the nginx ingress controller pods before the target resolves to the new address.
Right now the upstreams for services of type=ExternalName use a normal server definition inside an upstream block. This means the DNS resolution is done only once, when nginx loads the configuration. To fix this we need to find a way to use variables in proxy_pass instead of an upstream block for this case.
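For context, a rough sketch of the current behaviour (the upstream name and hostname are placeholders, not taken from the actual template):

# the hostname in the upstream block is resolved once, when nginx loads the configuration
upstream external-backend {
    server foo.bar.com:80;
}

location / {
    proxy_pass http://external-backend;
}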
Something like this, using variables instead:
set $upstream_endpoint http://foo.bar.com;
proxy_pass $upstream_endpoint;
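As far as I know, plain nginx only re-resolves the name at request time when the proxy_pass target is a variable and a resolver is configured, so a fuller sketch would look like this (the resolver address is a placeholder for the cluster DNS service, not part of the original snippet):

resolver 10.96.0.10 valid=30s;             # cluster DNS IP is an assumption; adjust for your cluster
set $upstream_endpoint http://foo.bar.com; # placeholder hostname from the example above
proxy_pass $upstream_endpoint;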
Just in case it is relevant: this is not an issue in NGINX Plus (there you can add the resolve parameter to a server line inside an upstream block).
Do you have any suggestions to work around the issue? Amazon can reassign ELB IPs at a moment's notice, breaking the reverse proxy.
@michaelgeorgeattard you could try to use my sample code in a configuration-snippet annotation, but I am not sure if this will work.
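For illustration only, such an annotation might look roughly like this (the resolver address and hostname are placeholders, and this sketch is untested):

nginx.ingress.kubernetes.io/configuration-snippet: |
  resolver 10.96.0.10 valid=30s;
  set $upstream_endpoint http://foo.bar.com;
  proxy_pass $upstream_endpoint;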
@aledbf Unfortunately, I don't think there is a clean way to insert the above snippet, given the template below:
{{ if not (empty $location.Backend) }}
{{ buildProxyPass $server.Hostname $all.Backends $location $all.DynamicConfigurationEnabled }}
{{ if (or (eq $location.Proxy.ProxyRedirectFrom "default") (eq $location.Proxy.ProxyRedirectFrom "off")) }}
proxy_redirect {{ $location.Proxy.ProxyRedirectFrom }};
{{ else if not (eq $location.Proxy.ProxyRedirectTo "off") }}
proxy_redirect {{ $location.Proxy.ProxyRedirectFrom }} {{ $location.Proxy.ProxyRedirectTo }};
{{ end }}
{{ else }}
# No endpoints available for the request
return 503;
{{ end }}
{{ else }}
# Location denied. Reason: {{ $location.Denied }}
return 503;
{{ end }}
Can you kindly confirm if I'm missing something?
By any chance, did the below code changes have any impact on this limitation?
https://github.com/kubernetes/ingress-nginx/commit/d4faf684164c8b5af4b515d153a69adf724667d3
> By any chance, did the below code changes have any impact on this limitation?
After that PR there's no such limitation in dynamic mode.
You can test this using the dev image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev.
Please remove the securityContext section from the deployment if you use this image.
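If it helps, one way to try the dev image on an existing install is something like the following (the namespace, deployment and container names are assumptions about a typical install, not taken from this thread):

kubectl -n ingress-nginx set image deployment/nginx-ingress-controller \
  nginx-ingress-controller=quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev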
@aledbf: Thanks for your response.
Looks like the DNS issue is mostly resolved with the dev image + enable-dynamic-configuration set to true. Some extra info:
- I am getting the following errors (verbosity is off) on init:
W0730 11:32:25.892550 31 nginx_status.go:207] unexpected error obtaining nginx status info: unexpected error scraping nginx status page: unexpected error scraping nginx : Get http://0.0.0.0:18080/nginx_status: dial tcp 0.0.0.0:18080: connect: connection refused
W0730 11:32:25.893659 31 nginx_status.go:207] unexpected error obtaining nginx status info: unexpected error scraping nginx status page: unexpected error scraping nginx : Get http://0.0.0.0:18080/nginx_status: dial tcp 0.0.0.0:18080: connect: connection refused
- More importantly, I am still getting some "downtime" when IPs switch. With a DNS TTL of 1 second, I am still seeing around 30 seconds' worth of the old IP being resolved for the upstream.
Update: Now that I think about it, this might not be a problem if the DNS record returned by AWS Route 53 is guaranteed to be accurate for its lifetime. I need to look into this more.
The error log part still stands.
> I am getting the following errors (verbosity is off) on init:
This is not an error. The issue here is that the Prometheus metric gathering starts before the first nginx reload.
I will fix this before the next release.
@michaelgeorgeattard in dynamic mode this should have been fixed. It should take at most TTL + 1s for the DNS to be resolved again.
> With a DNS TTL of 1 second
I'd be surprised if a 1s TTL is respected by DNS servers; they probably override it with 30s or so.
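A quick way to check the TTL that is actually being served is something like this (the hostname and resolver address are placeholders):

# the second column of each answer line is the remaining TTL in seconds
dig +noall +answer foo.bar.com @10.96.0.10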
Understood.
Thanks for the support @aledbf @ElvinEfendi
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Closing. Since 0.20.0 only dynamic mode is available, and this is not an issue there.
Hello,
I hit this DNS caching issue after an AWS ELB IP change with version 0.27.1.
Dynamic mode should already be there, I guess. Restarting the controller solved the issue; the new IPs of the ELB were then used.
This is the first time I have deployed a service with a reverse proxy in the ingress, and I ran into the issue two weeks after the initial deployment.
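For anyone else hitting this, restarting the controller can be done with something along these lines (the deployment name and namespace are assumptions about a standard install):

kubectl -n ingress-nginx rollout restart deployment/ingress-nginx-controller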
The ingress config:
nginx.ingress.kubernetes.io/server-snippet: |
  location ~ /ts/($|api|(.*)/signin) {
    deny all;
  }
  location /ts/ {
    proxy_pass https://myapp.xy.io/;
    proxy_redirect https://externalapp.xy.io/ https://myapp.xy.io/ts/;
  }
  location /v1/ {
    proxy_pass https://externalapp.xy.io/v1/;
  }
Any idea how to prevent this?