I'm getting a "failure to get a peer from the ring-balancer" error when calling the APIs of my service after upgrading Kong to 1.2.1 (also tried 1.2.0). The exact same config works fine in 1.1.2.
After some troubleshooting, I realized the problem has something to do with the following in my Kong yaml:
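(Simplified sketch; the names, IPs, and ports below are placeholders, since the real yaml is auto-generated and much larger.)

```yaml
_format_version: "1.1"

services:
- name: service1
  host: upstream1            # resolves to the upstream of the same name below
  routes:
  - name: route1
    paths:
    - /route1
- name: service2
  host: upstream2
  routes:
  - name: route2
    paths:
    - /route2

upstreams:
- name: upstream1
  targets:
  - target: 10.0.0.10:8080   # same IP and port...
- name: upstream2
  targets:
  - target: 10.0.0.10:8080   # ...as this target; two upstreams share the same target
```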
I have this setup because my own plugin needs different settings for Route1 and Route2 respectively; under the hood they should both go to the same microservice on the same port. I'm also using a library to auto-generate this complex Kong yaml.
Apparently the cause of the error is having multiple Upstreams sharing the same Target (same IP and same port). If I remove one of the Routes, or simply point Route2 to Service1 and get rid of the now-redundant Upstream2 and Service2, the error goes away.
We ran into this exact bug, and it's completely unclear what's breaking. Downgrading from 1.2.1 to 1.1.2 works fine.
Hi @imsky and @22vincetsang, thanks for reporting. We've started triage internally.
I believe we are experiencing the same issue. We just rolled back to 1.1.2 and are monitoring whether that fixes it.
Also running db-less on k8s with the kong-ingress-controller.
@22vincetsang Could you post Kong's error logs from around this event?
Also, do you see any errors when you (or the Kong Ingress Controller, automatically) load the configuration into Kong via /config?
ping @Tieske
@hbagdi Only these:
2019/07/23 01:59:46 [debug] 33#0: *1197 [lua] certificate.lua:22: log(): [ssl] no SNI provided by client, serving default SSL certificate
2019/07/23 01:59:46 [debug] 33#0: *1196 [lua] base_plugin.lua:43: log(): executing plugin "prometheus": log
172.20.2.1 - - [23/Jul/2019:01:59:46 +0000] "GET /my-api/v1/my_endpoint HTTP/1.0" 503 58 "-" "PostmanRuntime/7.6.1"
And yes, the "failure to get a peer from the ring-balancer" error (with an HTTP 503 status code) can be reproduced if I manually load the Kong config via the /config endpoint.
By the way, to share with those hitting the same issue: I'm currently working around it by post-processing my Kong yaml so that no two upstreams share the same target (i.e. pointing Route2 to Service1 and removing the now-redundant Upstream2 and Service2, as described above).
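In terms of the sketch above, the post-processed yaml ends up looking something like this (again, names and addresses are placeholders):

```yaml
_format_version: "1.1"

services:
- name: service1
  host: upstream1            # the only remaining upstream for this target
  routes:
  - name: route1             # my plugin config stays attached per-route
    paths:
    - /route1
  - name: route2             # Route2 is re-pointed at Service1
    paths:
    - /route2

upstreams:
- name: upstream1
  targets:
  - target: 10.0.0.10:8080
```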
Encountering the same problem on 1.2.1 in db-less mode.
Same problem on 1.2.1 in db-less mode (Kong Ingress Controller).
Hi @22vincetsang @imsky @HsinHeng @Renz2018
We have a fix for this bug and it has been released with 1.3 rc1; feel free to try it out. The final version should be released soon, stay tuned.
Same problem here on 1.2.x.
I'm stuck with the same error on 1.2.x.
Kong 1.3 has been released; please upgrade to Kong 1.3 to resolve this issue.
The fixes here are:
https://github.com/Kong/kong/pull/4817
https://github.com/Kong/kong/pull/4810
I'm facing the same issue with Kong 1.2.1. I was planning to upgrade to 1.4 in our production environment (especially for all the bug fixes and new features), but according to the 1.2.x-to-1.4 upgrade guide there are some breaking changes in version 1.3, so the upgrade will take longer than expected (testing, validation, etc.).
Whenever I restart my upstream back end, I start receiving the "failure to get a peer from the ring-balancer" error.
For us, the workaround is to restart Kong whenever we restart our upstream back end, which avoids the error mentioned above. Not ideal at all, but it keeps things running until we can find a maintenance window with our customers.
@Edenshaw not sure that would be the same thing.
Most of the earlier reports are based on identical targets, which get an identical uuid, so only one is created and the second overwrites the first. That is related to the Kong configuration.
In your case the backend changes, which has nothing to do with the Kong configuration. My suggestion would be to have a look at DNS records, changing IPs, and possibly DNS record TTL settings. And definitely check your logs, because there will likely be errors in there pointing to the probable cause.
Hi @Tieske, thanks a lot for your insight. I'll check my configuration and logs again to try to spot the problem.