Hi,
Is there any way to prevent permanent DNS caching, especially when setting up a reverse proxy through an ingress + headless service with ExternalName?
If the upstream target changes IP, I need to rescale the nginx ingress controller pods before the target resolves to the new address.
Right now the upstreams for services of type=ExternalName use a normal server definition inside an upstream block. This means the DNS resolution is done only once, when nginx loads the configuration. To fix this we need to find a way to use variables in proxy_pass instead of an upstream block for this case.
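For context, a rough sketch of the current behaviour (the upstream name and hostname are placeholders, not taken from the actual template):

# the hostname in the upstream block is resolved once, when nginx loads the configuration
upstream external-backend {
    server foo.bar.com:80;
}

location / {
    proxy_pass http://external-backend;
}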
Something like this, using variables instead:
set $upstream_endpoint http://foo.bar.com;
proxy_pass $upstream_endpoint;
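As far as I know, plain nginx only re-resolves the name at request time when the proxy_pass target is a variable and a resolver is configured, so a fuller sketch would look like this (the resolver address is a placeholder for the cluster DNS service, not part of the original snippet):

resolver 10.96.0.10 valid=30s;             # cluster DNS IP is an assumption; adjust for your cluster
set $upstream_endpoint http://foo.bar.com; # placeholder hostname from the example above
proxy_pass $upstream_endpoint;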
Just in case it is relevant: this is not an issue in NGINX Plus (there you can add the resolve parameter to a server line inside an upstream block).
Do you have any suggestions to work around the issue? Amazon can reassign ELB IPs at a moment's notice, breaking the reverse proxy.
@michaelgeorgeattard you could try to use my sample code in a configuration-snippet annotation, but I am not sure if this will work.
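For illustration only, such an annotation might look roughly like this (the resolver address and hostname are placeholders, and this sketch is untested):

nginx.ingress.kubernetes.io/configuration-snippet: |
  resolver 10.96.0.10 valid=30s;
  set $upstream_endpoint http://foo.bar.com;
  proxy_pass $upstream_endpoint;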
@aledbf Unfortunately, I don't think there is a clean way to insert the above snippet, given the template below:
{{ if not (empty $location.Backend) }}
{{ buildProxyPass $server.Hostname $all.Backends $location $all.DynamicConfigurationEnabled }}
{{ if (or (eq $location.Proxy.ProxyRedirectFrom "default") (eq $location.Proxy.ProxyRedirectFrom "off")) }}
proxy_redirect {{ $location.Proxy.ProxyRedirectFrom }};
{{ else if not (eq $location.Proxy.ProxyRedirectTo "off") }}
proxy_redirect {{ $location.Proxy.ProxyRedirectFrom }} {{ $location.Proxy.ProxyRedirectTo }};
{{ end }}
{{ else }}
# No endpoints available for the request
return 503;
{{ end }}
{{ else }}
# Location denied. Reason: {{ $location.Denied }}
return 503;
{{ end }}
Can you kindly confirm if I'm missing something?
By any chance, did the below code changes have any impact on this limitation?
https://github.com/kubernetes/ingress-nginx/commit/d4faf684164c8b5af4b515d153a69adf724667d3
> By any chance, did the below code changes have any impact on this limitation?
After that PR there's no such limitation in dynamic mode.
You can test this using the dev image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev.
Please remove the securityContext section from the deployment if you use this image.
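If it helps, one way to try the dev image on an existing install is something like the following (the namespace, deployment and container names are assumptions about a typical install, not taken from this thread):

kubectl -n ingress-nginx set image deployment/nginx-ingress-controller \
  nginx-ingress-controller=quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev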
@aledbf: Thanks for your response.
Looks like the DNS issue is mostly resolved with the dev image + enable-dynamic-configuration set to true. Some extra info:
- I am getting the following errors (verbosity is off) on init:
W0730 11:32:25.892550 31 nginx_status.go:207] unexpected error obtaining nginx status info: unexpected error scraping nginx status page: unexpected error scraping nginx : Get http://0.0.0.0:18080/nginx_status: dial tcp 0.0.0.0:18080: connect: connection refused
W0730 11:32:25.893659 31 nginx_status.go:207] unexpected error obtaining nginx status info: unexpected error scraping nginx status page: unexpected error scraping nginx : Get http://0.0.0.0:18080/nginx_status: dial tcp 0.0.0.0:18080: connect: connection refused
- More importantly, I am still getting some "downtime" when IPs switch. With a DNS TTL of 1 second, I am still seeing around 30 seconds' worth of the old IP being resolved for the upstream.
Update: Now that I think about it, this might not be a problem if the DNS record returned by AWS Route 53 is guaranteed to be accurate for its lifetime. I need to look into this more.
The error log part still stands.
> I am getting the following errors (verbosity is off) on init:
This is not an error. The issue here is that the Prometheus metric gathering starts before the first nginx reload.
I will fix this before the next release.
@michaelgeorgeattard in dynamic mode this should have been fixed. It should take at most TTL + 1s for the DNS to be resolved again.
> With a DNS TTL of 1 second
I'd be surprised if a 1s TTL is respected by DNS servers; they probably override it with 30s or so.
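A quick way to check the TTL that is actually being served is something like this (the hostname and resolver address are placeholders):

# the second column of each answer line is the remaining TTL in seconds
dig +noall +answer foo.bar.com @10.96.0.10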
Understood.
Thanks for the support @aledbf @ElvinEfendi
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Closing. Since 0.20.0 only dynamic mode is available, and this is not an issue there.
Hello,
I hit this DNS caching issue after an AWS ELB IP change with version 0.27.1.
Dynamic mode should already be there, I guess. Restarting the controller solved the issue; the new IPs of the ELB were then used.
This is the first time I have deployed a service with a reverse proxy in the ingress, and I ran into the issue two weeks after the initial deployment.
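For anyone else hitting this, restarting the controller can be done with something along these lines (the deployment name and namespace are assumptions about a standard install):

kubectl -n ingress-nginx rollout restart deployment/ingress-nginx-controller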
The ingress config:
nginx.ingress.kubernetes.io/server-snippet: |
  location ~ /ts/($|api|(.*)/signin) {
    deny all;
  }
  location /ts/ {
    proxy_pass https://myapp.xy.io/;
    proxy_redirect https://externalapp.xy.io/ https://myapp.xy.io/ts/;
  }
  location /v1/ {
    proxy_pass https://externalapp.xy.io/v1/;
  }
Any idea how to prevent this?