Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT
NGINX Ingress controller version:
0.21.0 / git-b65b85cd9
Kubernetes version (use kubectl version):
1.11
Environment:
AWS EKS
Installed from chart version 1.0.1 as daemonset.
Chart config:
controller:
  ingressClass: "nginx-public"
  publishService:
    enabled: true
    pathOverride: default/ingress02-nginx-ingress-controller
  kind: DaemonSet
  service:
    targetPorts:
      https: 80
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: xxx
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
      external-dns.alpha.kubernetes.io/hostname: xxx.xxx.xxx-xxxx.com
      prometheus.io/scrape: 'true'
      prometheus.io/port: '10254'
  config:
    proxy-next-upstream: error timeout http_502
    proxy-next-upstream-tries: "3"
    use-geoip: "true"
    enable-vts-status: "true"
    log-format-escape-json: "true"
    log-format-upstream: '{
      "@timestamp": "$time_iso8601",
      "body_bytes_sent": "$body_bytes_sent",
      "client_ip": "$remote_addr"
      ...
      "user_agent": "$http_user_agent"
      }'
  stats:
    enabled: true
  metrics:
    enabled: true
    service:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "10254"
What happened:
When scaling down or deleting pods, we see a few 502 errors returned to the client, along with this in the controller logs:
2019/01/09 12:02:19 [error] 64#64: *13620 connect() failed (111: Connection refused) while connecting to upstream, client: 66.66.66.15, server: xx-01.xx.xx-xx.com, request: "GET /xxx/xxx/xxx/alive HTTP/1.1", upstream: "http://10.0.27.39:80/xxx/xxx/xxx/alive", host: "xxx-01.xx.xxx-xxx.com"
What you expected to happen:
No error visible for client.
How to reproduce it (as minimally and precisely as possible):
Run HTTP GET requests continuously, then delete some pods or scale down the deployment.
Anything else we need to know:
Note that I have already set proxy-next-upstream: error timeout http_502, as shown above.
The deployment behind the ingress is also nginx, and it sometimes logs error 499, which suggests that the ingress controller is terminating connections before they complete?
The log from the controller shows this:
status: 502
upstream_status: 502, 502, 502
Even if the upstream pod is shutting down, and returns a 502, then shouldn't the ingress-controller retry with another pod?
No. The default value is to not do that. Please check https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#proxy-next-upstream
The default value is error timeout. You can change that by setting
proxy-next-upstream: "error timeout http_502" in the configuration ConfigMap.
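For reference, a minimal sketch of that ConfigMap entry when set directly; the ConfigMap name and namespace here are assumptions and depend on the install (with the Helm chart, the same keys go under controller.config, as in the values above):

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration   # assumed name; depends on how the controller was installed
  namespace: ingress-nginx    # assumed namespace
data:
  proxy-next-upstream: "error timeout http_502"
  proxy-next-upstream-tries: "3"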
@aledbf I did that already, see my configmap above 🙂
Sorry about that.
All good!
{ "@timestamp": "2019-01-09T12:02:21+00:00", "body_bytes_sent": "2", "client_ip": "1.2.3.4", "geoip_city": "", "geoip_country_code": "GB", "geoip_region_name": "", "http_cf_ray": "", "http_referer": "", "http_user_agent": "ApacheBench/2.3", "http_x_forwarded_for": "1.2.3.4, 54.3.4.5, 1.2.3.4", "http_x_forwarded_proto": "https", "ingress_name": "app1", "namespace": "default", "proxy_add_x_forwarded_for": "1.2.3.4, 54.3.4.5, 1.2.3.4", "proxy_protocol_addr": "", "proxy_upstream_name": "default-app1-80", "remote_address": "10.0.24.131", "remote_user": "", "request_host": "xxxx-01.xx.xxxx.com", "request_id": "6a99d7249713e81694b2208b81f1b66b", "request_length" : 594, "request_method": "GET", "request_path": "/app/monitoring/alive", "request_time": 0.003, "request": "GET /app/monitoring/alive HTTP/1.1", "response_time": "0.003", "service_name": "app1", "status": "200", "time_local": "09/Jan/2019:12:02:21 +0000", "upstream_addr": "10.0.26.32:80", "upstream_response_length": "2", "upstream_response_time": "0.004", "upstream_status": "200", "uri": "/app/monitoring/alive", "user_agent": "ApacheBench/2.3" }
2019/01/09 12:02:22 [error] 64#64: *13620 connect() failed (111: Connection refused) while connecting to upstream, client: 1.2.3.4, server: xxxx-01.xx.xxxx.com, request: "GET /app/monitoring/alive HTTP/1.1", upstream: "http://10.0.25.162:80/app/monitoring/alive", host: "xxxx-01.xx.xxxx.com"
{ "@timestamp": "2019-01-09T12:02:22+00:00", "body_bytes_sent": "2", "client_ip": "1.2.3.4", "geoip_city": "", "geoip_country_code": "GB", "geoip_region_name": "", "http_cf_ray": "", "http_referer": "", "http_user_agent": "ApacheBench/2.3", "http_x_forwarded_for": "1.2.3.4, 54.3.4.5, 1.2.3.4", "http_x_forwarded_proto": "https", "ingress_name": "app1", "namespace": "default", "proxy_add_x_forwarded_for": "1.2.3.4, 54.3.4.5, 1.2.3.4", "proxy_protocol_addr": "", "proxy_upstream_name": "default-app1-80", "remote_address": "10.0.24.131", "remote_user": "", "request_host": "xxxx-01.xx.xxxx.com", "request_id": "0f1659cbe9224562561e476d457208f7", "request_length" : 594, "request_method": "GET", "request_path": "/app/monitoring/alive", "request_time": 0.002, "request": "GET /app/monitoring/alive HTTP/1.1", "response_time": "0.002", "service_name": "app1", "status": "200", "time_local": "09/Jan/2019:12:02:22 +0000", "upstream_addr": "10.0.25.162:80, 10.0.24.174:80", "upstream_response_length": "0, 2", "upstream_response_time": "0.004, 0.000", "upstream_status": "502, 200", "uri": "/app/monitoring/alive", "user_agent": "ApacheBench/2.3" }
[09/Jan/2019:12:02:22 +0000] TCP 200 0 0 0.000
2019/01/09 12:02:22 [error] 64#64: *13620 connect() failed (111: Connection refused) while connecting to upstream, client: 1.2.3.4, server: xxxx-01.xx.xxxx.com, request: "GET /app/monitoring/alive HTTP/1.1", upstream: "http://10.0.27.29:80/app/monitoring/alive", host: "xxxx-01.xx.xxxx.com"
{ "@timestamp": "2019-01-09T12:02:22+00:00", "body_bytes_sent": "2", "client_ip": "1.2.3.4", "geoip_city": "", "geoip_country_code": "GB", "geoip_region_name": "", "http_cf_ray": "", "http_referer": "", "http_user_agent": "ApacheBench/2.3", "http_x_forwarded_for": "1.2.3.4, 54.3.4.5, 1.2.3.4", "http_x_forwarded_proto": "https", "ingress_name": "app1", "namespace": "default", "proxy_add_x_forwarded_for": "1.2.3.4, 54.3.4.5, 1.2.3.4", "proxy_protocol_addr": "", "proxy_upstream_name": "default-app1-80", "remote_address": "10.0.24.131", "remote_user": "", "request_host": "xxxx-01.xx.xxxx.com", "request_id": "d4e68d20ba3e30cfd844dd4bba71c8ce", "request_length" : 594, "request_method": "GET", "request_path": "/app/monitoring/alive", "request_time": 0.003, "request": "GET /app/monitoring/alive HTTP/1.1", "response_time": "0.003", "service_name": "app1", "status": "200", "time_local": "09/Jan/2019:12:02:22 +0000", "upstream_addr": "10.0.27.29:80, 10.0.24.236:80", "upstream_response_length": "0, 2", "upstream_response_time": "0.000, 0.004", "upstream_status": "502, 200", "uri": "/app/monitoring/alive", "user_agent": "ApacheBench/2.3" }
64: *13620 connect() failed (111: Connection refused) while connecting to upstream
That means the pod is not ready to receive connections. This should not happen if the app is using probes
"upstream_addr": "10.0.25.162:80, 10.0.24.174:80", "upstream_response_length": "0, 2", "upstream_response_time": "0.004, 0.000", "upstream_status": "502, 200",
The configuration is working, 10.0.25.162:80 returned 502 (internally) and then 10.0.24.174:80 returned 200. You should see 200 as response.
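For reference, a minimal sketch of the kind of readiness probe this refers to; the path and port are assumptions based on the health endpoint visible in the access logs above, and a liveness probe would look analogous:

readinessProbe:
  httpGet:
    path: /app/monitoring/alive   # assumed health endpoint, taken from the request logs above
    port: 80
  periodSeconds: 2
  failureThreshold: 1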
I also just deleted 2 pods and saw the same issue.
You see 502 in apachebench or just in the logs?
That means the pod is not ready to receive connections. This should not happen if the app is using probes
We are using probes. Both Liveness and Readiness. And they work.
The configuration is working, 10.0.25.162:80 returned 502 (internally) and then 10.0.24.174:80 returned 200. You should see 200 as response.
Ahh I see what you mean! It looks like this, where it tried and failed with 10.0.25.162 and succeeded with 10.0.24.174:
"upstream_addr": "10.0.25.162:80, 10.0.24.174:80"
"upstream_status": "502, 200"
That's great to see it working but we still see messages like this from the controller also:
upstream_addr: 10.0.27.39:80, 10.0.25.162:80, 10.0.26.117:80
upstream_status: 502, 502, 502
So I think the controller is doing the best it can.
You see 502 in apachebench or just in the logs?
Yes, we see it in AB.
OK at this point I'll close this issue. Maybe I'll open a new issue about the 499 as I think that's different.
But do you know how our pod could return a 502 when it should have been prevented from serving requests by k8s? There are no 502s before the pod deletion starts. Is there a small window here that is unavoidable?
There are no 502s before the pod deletion starts. Is there a small window here that is unavoidable?
From your description, you went from 10 to 5 pods but the ingress controller only retries 3 times (you can change that). If the retries are sent to the killed pods, there is a small window of time (less than a second) where the ingress controller is not up to date.
Edit: you can test this by removing only one pod. You should not see a 502.
This can also be the case when your app does not handle SIGTERM properly, in other words if it exits immediately. I'd suggest trying the following configuration for your deployment:
...
containers:
  ...
  lifecycle:
    preStop:
      exec:
        command: ["sleep", "5"]
...
From your description, you went from 10 to 5 pods but the ingress controller only retries 3 times
OK yes, that makes sense because in theory, the ingress controller could try 3 of the 5 stopping pods.
you can test this by removing only one pod. You should not see a 502
Tested, everything makes sense now!
in other words if it exits immediately
Yes, but I think the problem here is that there will always be a small window between the pod containers getting SIGTERM and the ingress-controller updating its upstreams. If your app takes 5 seconds to stop, you won't see it. But for our app, a request might only last 50ms. And it shuts down correctly, i.e. finishes any remaining requests and then exits. So in theory, the app can shut down gracefully and exit before the ingress-controller is up to date.
This can also be the case when your app does not handle SIGTERM properly, in other words if it exits immediately
True, and for nginx (not the controller, but part of our application pod), apparently /usr/sbin/nginx -s quit is required.
Anyway, in summary, I was able to avoid all 502s when scaling down by adding this to our application nginx container:
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 3 && /usr/sbin/nginx -s quit"]
And this to our app container:
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 3"]
Thanks for taking the time @aledbf and @ElvinEfendi
@max-rocket-internet thank you for posting this, it solved a major mystery that I've been chasing for 3 years!
How did you end up with sleep 3 - I think it takes longer for ingress-nginx to reload?
How did you end up with sleep 3 - I think it takes longer for ingress-nginx to reload?
@matti As long as you're using a recent ingress-nginx version, the reload time does not matter. You want to sleep for <the time it takes for ingress-nginx to get the update from the k8s API about terminating pods> + 1s (https://github.com/kubernetes/ingress-nginx/blob/d968ee9cfd0823f639013066b50595858f3c184c/rootfs/etc/nginx/lua/balancer.lua#L15) so it can sync the new endpoints.
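In other words, a hedged rule of thumb; the sleep value here is an assumption (a couple of seconds for the Endpoints update to reach the controller, plus the 1s Lua sync interval, plus some margin):

lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 5"]   # assumed value; tune to your cluster's update latency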