Spring-cloud-gateway: HTTP 503 should be dispatched in case the circuit is broken

Created on 5 Aug 2019 · 9 Comments · Source: spring-cloud/spring-cloud-gateway

I'm using Spring Cloud Gateway 2.1.2.RELEASE ( Greenwich.SR2 ) along with a Hystrix Filter and I'm running a series of unit tests in order to verify the behavior of the circuit breaker mechanism.

In case the back-end system does not respond in time (see the property: execution.isolation.thread.timeoutInMilliseconds), then the HTTP status dispatched by the Gateway is 504 (as expected).
In case multiple errors of this kind occur, then the circuit is broken (also as expected) but the HTTP status dispatched by the Gateway is 500 (while one would expect 503).

Is there any configuration we are missing here?

bug help wanted

All 9 comments

I think it's tricky to say what the right status code should be. I can see a 503 indicating that the downstream service is unavailable; however, I can also see it being read as the gateway itself being unavailable.

Thank you @ryanjbaxter for the prompt response!

Let me also add to the discussion that there is nothing wrong with the Gateway itself; it's just one route (out of a dozen registered ones) that received multiple timeout errors from a single back-end service.
A response with HTTP 500 should prevent the original service consumer from retrying, while a response with HTTP 503 should indicate that a retry is feasible after a while.

On a side note, are you aware whether HTTP 500 is coming straight from Hystrix or is that how the Gateway interprets the com.netflix.hystrix.exception.HystrixRuntimeException ?

I understand, my gut says we should return a 503 but I want to hear what my teammates think as well.

> On a side note, are you aware whether HTTP 500 is coming straight from Hystrix or is that how the Gateway interprets the com.netflix.hystrix.exception.HystrixRuntimeException ?

Not off the top of my head

If it helps, below you may find the stacktrace:

Daemon Thread [HystrixTimer-1] (Suspended (breakpoint at line 39 in HystrixRuntimeException))   
    HystrixRuntimeException.<init>(FailureType, Class<HystrixInvokable>, String, Exception, Throwable) line: 39 
    HystrixGatewayFilterFactory$RouteHystrixCommand(AbstractCommand<R>).handleFallbackDisabledByEmittingError(Exception, FailureType, String) line: 1052    
    HystrixGatewayFilterFactory$RouteHystrixCommand(AbstractCommand<R>).getFallbackOrThrowException(AbstractCommand<R>, HystrixEventType, FailureType, String, Exception) line: 878 
    HystrixGatewayFilterFactory$RouteHystrixCommand(AbstractCommand<R>).handleTimeoutViaFallback() line: 997    
    AbstractCommand<R>.access$500(AbstractCommand) line: 60 
    AbstractCommand$12.call(Throwable) line: 609    
    AbstractCommand$12.call(Object) line: 601   
    OperatorOnErrorResumeNextViaFunction$4.onError(Throwable) line: 140 
    OnSubscribeDoOnEach$DoOnEachSubscriber<T>.onError(Throwable) line: 87   
    OnSubscribeDoOnEach$DoOnEachSubscriber<T>.onError(Throwable) line: 87   
    AbstractCommand$HystrixObservableTimeoutOperator$1.run() line: 1142 
    HystrixContextRunnable$1.call() line: 41    
    HystrixContextRunnable$1.call() line: 37    
    HystrixContextRunnable.run() line: 57   
    AbstractCommand$HystrixObservableTimeoutOperator$2.tick() line: 1159    
    HystrixTimer$1.run() line: 99   
    Executors$RunnableAdapter<T>.call() line: 511   
    ScheduledThreadPoolExecutor$ScheduledFutureTask<V>(FutureTask<V>).runAndReset() line: 308   
    ScheduledThreadPoolExecutor$ScheduledFutureTask<V>.access$301(ScheduledThreadPoolExecutor$ScheduledFutureTask) line: 180    
    ScheduledThreadPoolExecutor$ScheduledFutureTask<V>.run() line: 294  
    ScheduledThreadPoolExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1149 
    ThreadPoolExecutor$Worker.run() line: 624   
    Thread.run() line: 748

Note: I'm using the latest version of Hystrix (i.e. 1.5.18).

I believe the issue is with regards to org.springframework.cloud.gateway.filter.factory.HystrixGatewayFilterFactory.
Specifically, the switch statement handles the SHORTCIRCUIT failureType as a generic/default case.

One solution here could be something along the following lines:

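  switch (failureType) {
    case TIMEOUT:
      return Mono.error(new TimeoutException());
    case SHORTCIRCUIT:
      // proposed: surface an open circuit as 503 instead of the default 500
      return Mono.error(new ServiceUnavailableException());
    // ... remaining cases unchanged
  }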

Obviously, the class ServiceUnavailableException does not exist but it should be easy to create one based on the concept of the existing org.springframework.cloud.gateway.support.TimeoutException.
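
To make the proposal concrete, here is a minimal, dependency-free sketch of the status mapping it implies. Everything in it is an assumption for illustration: `FailureType` is a local stand-in that models only three constants of Hystrix's `HystrixRuntimeException.FailureType`, and `CircuitStatusSketch.statusFor` is a hypothetical helper, not gateway code. In the real filter the status would presumably be derived from the exception type (as with the existing `TimeoutException`) rather than from a lookup like this.

```java
// Local stand-in for a few constants of Hystrix's
// HystrixRuntimeException.FailureType (assumption: only these three are modeled).
enum FailureType { TIMEOUT, SHORTCIRCUIT, COMMAND_EXCEPTION }

// Hypothetical helper, not part of Spring Cloud Gateway: maps a failure
// type to the HTTP status discussed in this thread.
final class CircuitStatusSketch {
    static final int GATEWAY_TIMEOUT = 504;
    static final int SERVICE_UNAVAILABLE = 503;
    static final int INTERNAL_SERVER_ERROR = 500;

    private CircuitStatusSketch() {}

    static int statusFor(FailureType failureType) {
        switch (failureType) {
            case TIMEOUT:
                // The back end did not answer within the configured timeout.
                return GATEWAY_TIMEOUT;
            case SHORTCIRCUIT:
                // The circuit is open: the proposed behavior is 503.
                return SERVICE_UNAVAILABLE;
            default:
                // Simplification: every other failure type maps to a generic 500.
                return INTERNAL_SERVER_ERROR;
        }
    }
}
```

With this mapping, `CircuitStatusSketch.statusFor(FailureType.SHORTCIRCUIT)` yields 503, whereas the behavior reported above corresponds to `SHORTCIRCUIT` falling into the default branch.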

Any thoughts?

Thanks, first I want others to chime in on whether returning a 503 is right.

@spencergibb @TYsewyn @OlgaMaciaszek any thoughts?

IMO it does make sense to return 503 in such cases, and AFAIK we have the failure type SHORT_CIRCUITED for that.
That new exception would indeed look like the TimeoutException.
Thumbs up from me! 👍

EDIT: We should also look at the Retry-After HTTP header. If this is not passed to the browser in a response then the browser will - in most cases, if not all - just handle the 503 like it's a 500 error.
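
To illustrate why the header matters, here is a rough sketch of how a client might turn a 503 plus `Retry-After` into a retry delay. `RetryAfterSketch` and `retryDelay` are hypothetical names, and the policy of not retrying when the header is missing is an assumption chosen to mirror the point above, not something HTTP mandates. The header value can be either a number of seconds or an HTTP-date, so both forms are handled.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical client-side helper: decides how long to wait before retrying.
final class RetryAfterSketch {
    private RetryAfterSketch() {}

    // Returns the delay to wait before retrying, or empty if the response
    // should not be retried (assumed policy: only 503 with a usable Retry-After).
    static Optional<Duration> retryDelay(int status, String retryAfter, Instant now) {
        if (status != 503 || retryAfter == null) {
            return Optional.empty();
        }
        String value = retryAfter.trim();
        try {
            // Form 1: delay in seconds, e.g. "Retry-After: 120".
            long seconds = Long.parseLong(value);
            return seconds >= 0 ? Optional.of(Duration.ofSeconds(seconds)) : Optional.empty();
        } catch (NumberFormatException notSeconds) {
            // Not a number; try the HTTP-date form below.
        }
        try {
            // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            Instant when = ZonedDateTime
                    .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            Duration delay = Duration.between(now, when);
            // A date in the past means the client may retry immediately.
            return Optional.of(delay.isNegative() ? Duration.ZERO : delay);
        } catch (DateTimeParseException notADate) {
            return Optional.empty();
        }
    }
}
```

Under this policy a 500 never yields a delay, while a 503 yields one only when the gateway actually sends `Retry-After`, which is the reason for setting the header alongside the new status.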

There's already a switch to determine if it is short-circuited where a 503 could be returned.

PRs welcome

@spencergibb, @ryanjbaxter : PR created ( https://github.com/spring-cloud/spring-cloud-gateway/pull/1230 )
