We have a service that returns 503 responses when some dependencies are not available. We are observing that in this situation zuul retry one time.
The problem is that the request is a POST with json body and the first time the body is correctly sent, but when zuul retries, the body is sent empty and it results in a bad request to the client. We are in Angel.SR3 but the response is the same with SR6
{
"result": {
"code": 400,
"info": "Bad Request"
},
"errors": [
{
"code": "requestBodyInvalid",
"description": "request body is invalid"
}
]
}
Any idea of how to retry with body?? it seems a serius bug to me
Thanks in advance!!
Can you show more of your configuration please? Retries are the responsibility of ribbon in zuul. By default it only retries GET.
Hi,
After some test I push a demo with the bug running here
Steps to reproduce
2016-03-09 22:31:13.755 INFO 14706 --- [nio-8080-exec-1] org.yukatan.service.Application503 : Body arrived to the service: {"content":"hello"}
2016-03-09 22:31:13.817 INFO 14706 --- [ main] org.yukatan.service.Service503Test : Response:<400 Bad Request,{"timestamp":1457559073768,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:13 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}
2016-03-09 22:31:18.825 INFO 14706 --- [nio-8080-exec-3] org.yukatan.service.Application503 : Body arrived to the service: {"content":"hello"}
2016-03-09 22:31:18.834 INFO 14706 --- [ main] org.yukatan.service.Service503Test : Response:<400 Bad Request,{"timestamp":1457559078832,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:18 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}>
2016-03-09 22:31:23.842 INFO 14706 --- [nio-8080-exec-5] org.yukatan.service.Application503 : Body arrived to the service: {"content":"hello"}
2016-03-09 22:31:23.850 INFO 14706 --- [ main] org.yukatan.service.Service503Test : Response:<400 Bad Request,{"timestamp":1457559083848,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:23 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}>
2016-03-09 22:31:28.858 INFO 14706 --- [nio-8080-exec-7] org.yukatan.service.Application503 : Body arrived to the service: {"content":"hello"}
2016-03-09 22:31:28.867 INFO 14706 --- [ main] org.yukatan.service.Service503Test : Response:<400 Bad Request,{"timestamp":1457559088864,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:28 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}>
As you can see, post request is retried in the same second. The first one with body the second one empty
I hope this helps.
Hi,
Any advance on this?
@yukatan no update.
@yukatan - feels like a dup of https://github.com/Netflix/zuul/issues/193
Please take a look for the suggested workaround
@yukatan did you follow @dorkolog's suggestion?
Hi,
I have implemented the suggestion form another thread. is ther any plan to fix it in Sping cloud zuul/ribbon itself ?
@pradeepkusingh we will no longer use the Ribbon RestClient in Camden so this issue is not particularly valid anymore for the latest releases. Please keep all retry questions in #1290.
Hi @ryanjbaxter , @spencergibb
are there any plans to fix this bug in the near future?
We ran into this problem as well and fixed it with help of the workaround mentioned in netflix/zuul#193
Of course, the workaround is outdated but we were able to adopt it into the current code (Dalson.RELEASE with Spring-Retry and spring-cloud-commons >= 1.2.3.RELEASE / spring-cloud-netflix-core >= 1.3.2.RELEASE for the 'RetryOnStatusCode feature')
This means that we have a filter which replaces the request entity on the zuul request context and a modified version of the RetryableRibbonLoadBalancingHttpClient which can reset the request's input stream which was set through the ribbon request context.
I was wondering why it is not adopted in spring-cloud already. Maybe the FormBodyWrapperFilter can be modified to run on any request with a payload in general (POST/PUT/PATCH, I think?) and the FormBodyRequestWrapper should be used to replace the requests's input stream with a resettable one.
What do you think?
Best regards,
Andre
@doernbrackandre its not on our radar at the moment, but if you would like to submit a PR with the necessary changes that would be great!
I experienced the same issue with Dalston.SR2. The OkToRetryOnAllOperations option does not work for Requests with a body. The body is not submitted to the target for the retry.
Applying the workaround suggested in (netflix/zuul#193) seams to be dangerous for large requests because all request bodies are buffered in memory.
There's no other way, but to buffer the body. How else would it retry with a body?
It could be implemented in a way that the requests are only buffer up to a certain body size. If the limit is exceeded, a retry would not be possible - which is IMHO better as going out of memory even if retry is not necessary if a large request is passed.
Any recommendation for fixing this i Dalston Release train? Or should we not retry post requests
I dont see why not. @spencergibb do you see any reason why we can't backport this?
Most helpful comment
I dont see why not. @spencergibb do you see any reason why we can't backport this?