I am using Spring Boot 2.0.0.RC2 and Spring Cloud 2.0.0.M7 with the following setup:
Zuul Gateway - Replica 1
Edge service - Replica 2
Micro service - Replica 2
The configuration is as follows:
ribbon:
  ConnectTimeout: 5000
  ReadTimeout: 10000
  MaxAutoRetries: 0
  MaxAutoRetriesNextServer: 2
  retryableStatusCodes: 404,502,504
hystrix:
  shareSecurityContext: true
feign:
  hystrix:
    enabled: false
health.config.enabled: false
spring.cloud.loadbalancer.retry.enabled: true
zuul:
  retryable: true
  sensitiveHeaders: Cookie
  ignoredServices: '*'
  ribbon:
    eager-load:
      enabled: true
  routes:
    user-service:
      path: /users/**
      stripPrefix: true
    product-edge:
      path: /products/**
      stripPrefix: true
java.lang.IndexOutOfBoundsException: index (2) must be less than size (2)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:292)
at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:65)
at com.netflix.loadbalancer.AbstractServerPredicate.chooseRoundRobinAfterFiltering(AbstractServerPredicate.java:203)
at com.netflix.loadbalancer.PredicateBasedRule.choose(PredicateBasedRule.java:45)
at com.netflix.loadbalancer.BaseLoadBalancer.chooseServer(BaseLoadBalancer.java:736)
at com.netflix.loadbalancer.ZoneAwareLoadBalancer.chooseServer(ZoneAwareLoadBalancer.java:113)
at com.netflix.loadbalancer.LoadBalancerContext.getServerFromLoadBalancer(LoadBalancerContext.java:481)
at com.netflix.loadbalancer.reactive.LoadBalancerCommand$1.call(LoadBalancerCommand.java:184)
at com.netflix.loadbalancer.reactive.LoadBalancerCommand$1.call(LoadBalancerCommand.java:180)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OnSubscribeConcatMap.call(OnSubscribeConcatMap.java:94)
at rx.internal.operators.OnSubscribeConcatMap.call(OnSubscribeConcatMap.java:42)
at rx.Observable.unsafeSubscribe(Observable.java:10151)
at rx.internal.operators.OperatorRetryWithPredicate$SourceSubscriber$1.call(OperatorRetryWithPredicate.java:127)
at rx.internal.schedulers.TrampolineScheduler$InnerCurrentThreadScheduler.enqueue(TrampolineScheduler.java:73)
at rx.internal.schedulers.TrampolineScheduler$InnerCurrentThreadScheduler.schedule(TrampolineScheduler.java:52)
at rx.internal.operators.OperatorRetryWithPredicate$SourceSubscriber.onNext(OperatorRetryWithPredicate.java:79)
at rx.internal.operators.OperatorRetryWithPredicate$SourceSubscriber.onNext(OperatorRetryWithPredicate.java:45)
at rx.internal.util.ScalarSynchronousObservable$WeakSingleProducer.request(ScalarSynchronousObservable.java:276)
at rx.Subscriber.setProducer(Subscriber.java:209)
at rx.internal.util.ScalarSynchronousObservable$JustOnSubscribe.call(ScalarSynchronousObservable.java:138)
at rx.internal.util.ScalarSynchronousObservable$JustOnSubscribe.call(ScalarSynchronousObservable.java:129)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
at rx.Observable.subscribe(Observable.java:10247)
I get these exceptions frequently, and they break the retry logic. MaxAutoRetriesNextServer is 2, but only 2 servers are available, not 3. In this case the bounds should be checked before execution.
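The exception message, "index (2) must be less than size (2)", is what a round-robin chooser produces when it reuses a cursor that was valid for a longer list. The sketch below illustrates that general failure mode and a bounds-safe alternative. It is an assumption about the kind of bug involved, not a copy of Ribbon's source; `RoundRobinSketch`, `unsafeNext`, `safeNext`, and `choose` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is not Ribbon's implementation.
final class RoundRobinSketch {
    // Cursor shared across calls, even when the eligible list changes size.
    private final AtomicInteger nextIndex = new AtomicInteger();

    // Unsafe: returns the stored cursor without checking it against the
    // size of the list passed in on this call.
    int unsafeNext(int size) {
        for (;;) {
            int current = nextIndex.get();
            int next = (current + 1) % size;
            if (nextIndex.compareAndSet(current, next)) {
                return current; // may be >= size if the list shrank
            }
        }
    }

    // Safe: always reduces the counter modulo the current size.
    // floorMod keeps the result non-negative after integer overflow.
    int safeNext(int size) {
        return Math.floorMod(nextIndex.getAndIncrement(), size);
    }

    // Picks an element round-robin, or returns empty when nothing is eligible.
    <T> Optional<T> choose(List<T> eligible) {
        if (eligible.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(eligible.get(safeNext(eligible.size())));
    }

    public static void main(String[] args) {
        RoundRobinSketch unsafe = new RoundRobinSketch();
        unsafe.unsafeNext(3);                // three servers eligible
        unsafe.unsafeNext(3);                // cursor is now 2
        int stale = unsafe.unsafeNext(2);    // one server filtered out
        System.out.println("unsafe index for size 2: " + stale);

        RoundRobinSketch safe = new RoundRobinSketch();
        List<String> servers = Arrays.asList("edge-1", "edge-2");
        System.out.println("safe choice: " + safe.choose(servers).get());
        System.out.println("empty list: " + safe.choose(Collections.<String>emptyList()).isPresent());
    }
}
```

In the unsafe variant the eligible list shrinks from three entries to two between calls, so the stored cursor (2) is handed back for a list of size 2. If something similar happens in the load balancer, the number of registered instances would matter less than how the filtered list changes between selections.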
We are going to need more information. To start with, nothing in that stack trace points to anything in Spring Cloud Netflix. Is there more in the logs that would point to some code in Spring Cloud Netflix?
Same problem here with Spring Cloud M8.
Apologies, it took some time to get back. Here is the code repo I tried. Here is the complete log of the Zuul gateway.
Once you set up the config, registry, gateway, and edge services (micro is not needed for now), run the breaker project. It calls the services, the service simulates a timeout, and this error appears.
I am seeing this exact issue with Boot 1.5.10 and Edgware (RELEASE, SR1, SR2) too. I found this merged PR in Ribbon which seems to fix it.
It is a problem introduced fairly recently, in the 2.2.4 release of Ribbon, which is used by spring-cloud-netflix 1.4.x. There is no release > 2.2.4 yet, so downgrading to Dalston was the only viable option for me. Once the next release of Ribbon is out, this will be an easy fix.
Thanks @grelland.
@spencergibb do you think we should downgrade to 2.2.3? It doesn't look like the bug is in that release.
sure, we can ask for a release as well.
This suggests a quick workaround: disable Ribbon's circuit breaker.
niws:
  loadbalancer:
    availabilityFilteringRule:
      filterCircuitTripped: false # defaults to true
https://yangdongdong.org/2017/12/31/spring-cloud-feign/
@spencergibb, @ryanjbaxter Is this the right way to go for now?
I am not sure if that property will make a difference or not. You can try overriding the Ribbon dependencies in your application's POM to use version 2.2.3.
@ryanjbaxter, @spencergibb we have the same issue. Can you please help me: how do I override the Ribbon dependency in the POM to use version 2.2.3? I do not have an explicit dependency on Ribbon in the POM. Also, the latest version of Ribbon I see in Maven is 2.0.1. Do I need this at the API gateway level?
The latest Finchley release is using 2.2.5 https://github.com/spring-cloud/spring-cloud-netflix/blob/v2.0.2.RELEASE/spring-cloud-netflix-dependencies/pom.xml#L20