Question/Enhancement
When running in a cloud environment, where containers come and go, it's important that in-flight requests are not terminated abruptly but are given time to finish when the container is shut down. The container platform usually supports this by sending a SIGTERM first, then waiting some time before finally sending a SIGKILL if necessary. For example, see: https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
I don't believe Spring Cloud Gateway (Netty) handles SIGTERM at all. The question, then, is how we can support this in Spring Cloud Gateway. The concept has been discussed here: https://github.com/spring-projects/spring-boot/issues/4657. The thread basically says it is not supported generically by Spring Boot, but left to each container to handle. The sample supplied there for Netty doesn't work.
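To make the SIGTERM-then-SIGKILL window concrete, here is a minimal, framework-independent sketch of what "giving requests time to finish" could look like on the JVM. The class InFlightTracker and its methods are hypothetical names made up for illustration; they are not part of Spring Cloud Gateway or Reactor Netty. It relies only on the fact that the JVM runs registered shutdown hooks when it receives SIGTERM.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: counts in-flight requests so a shutdown hook can wait for them. */
final class InFlightTracker {
    private final AtomicInteger inFlight = new AtomicInteger();

    /** Call when a request starts. */
    void begin() { inFlight.incrementAndGet(); }

    /** Call when a request completes (successfully or not). */
    void end() { inFlight.decrementAndGet(); }

    /** Polls until nothing is in flight or the timeout elapses; returns true if drained. */
    boolean awaitDrained(Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (inFlight.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // grace period used up, requests still running
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Registers a JVM shutdown hook that waits for in-flight requests before the JVM exits. */
    void installShutdownHook(Duration gracePeriod) {
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> awaitDrained(gracePeriod), "drain-on-shutdown"));
    }
}
```

The grace period passed to installShutdownHook should be shorter than the platform's own wait before SIGKILL, otherwise the process is killed while still draining.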
@violetagg Maybe you have some ideas how we can support this?
I think that with this commit it should be OK?
https://github.com/spring-projects/spring-framework/issues/23631
Thanks! I will check it out and report back.
Hi @violetagg,
I tested this and it does not work. After SIGTERM, existing connections are closed immediately and a 500 response is sent back to clients.
2019-10-07 10:28:49.162 WARN 5424 --- [ctor-http-nio-2] r.netty.http.client.HttpClientConnect : [id: 0xd21f3379, L:0.0.0.0/0.0.0.0:63100 ! R:httpbin.org/3.225.168.125:80] The connection observed an error
reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
2019-10-07 10:28:49.203 ERROR 5424 --- [ctor-http-nio-2] a.w.r.e.AbstractErrorWebExceptionHandler : [02789889] 500 Server Error for HTTP GET "/delay/1000"
reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
2019-10-07 10:28:49.350 ERROR 5424 --- [ctor-http-nio-2] o.s.w.s.adapter.HttpWebHandlerAdapter : [02789889] Error [java.nio.channels.ClosedChannelException] for HTTP GET "/delay/1000", but ServerHttpResponse already committed (500 INTERNAL_SERVER_ERROR)
Here is how I reproduced this. Using Spring Initializr, I created a Spring Boot app with the Gateway dependency (please find attached graceful-shutdown.zip).
I created a route to httpbin's delay endpoint:
spring:
  cloud:
    gateway:
      routes:
        - id: httpbin
          uri: "http://httpbin.org:80"
          predicates:
            - Path=/delay/{delay}
After the Gateway started, I made this request, which takes 10 seconds to respond:
http://127.0.0.1:8080/delay/10
In the meantime I sent SIGTERM to the Gateway. I was expecting the Gateway to wait until the response was sent back to the client and then shut down. However, the connection is closed and the Gateway shuts down immediately after SIGTERM, without waiting for the request to complete successfully.
@libkad Can you try this:
@SpringBootApplication
public class GracefulShutdownApplication {

    public static void main(String[] args) {
        SpringApplication.run(GracefulShutdownApplication.class, args);
    }

    @Bean
    public GracefulShutdown gracefulShutdown() {
        return new GracefulShutdown();
    }
}

public class GracefulShutdown implements ApplicationListener<ContextClosedEvent> {

    @Override
    public void onApplicationEvent(ContextClosedEvent contextClosedEvent) {
        // Wait 20 seconds so in-flight requests can complete, then dispose the
        // global event loops and connection pools and block until that is done.
        HttpResources.disposeLoopsAndConnectionsLater()
                .delaySubscription(Duration.ofSeconds(20))
                .block();
    }
}
Hello,
I was thinking about stopping connections first and then waiting for the workers to finish their work, but I can't find a way to do it. Even when I stop the connections with factory.getConnectionProvider().disposeLater().block(); I'm still able to connect to the gateway while a background task is running. Is there any other way to stop connections?
... some config class
    @Bean
    ReactorResourceFactory reactorResourceFactory() {
        ReactorResourceFactory resourceFactory = new ReactorResourceFactory();
        resourceFactory.setUseGlobalResources(false);
        return resourceFactory;
    }

    @Service
    @Slf4j
    public static class GracefulShutdown implements ApplicationListener<ContextClosedEvent> {

        private final ReactorResourceFactory factory;

        public GracefulShutdown(ReactorResourceFactory factory) {
            this.factory = factory;
        }

        @Override
        public void onApplicationEvent(ContextClosedEvent contextClosedEvent) {
            log.debug("Stopping connections");
            factory.getConnectionProvider().disposeLater().block();
            log.debug("Stopping events");
            factory.getLoopResources().disposeLater().delaySubscription(Duration.ofSeconds(30)).block();
            log.debug("All stopped");
        }
    }
Thanks
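The "refuse new work first, then wait for running work" order asked about above is the same two-phase pattern that java.util.concurrent.ExecutorService offers through shutdown() and awaitTermination(). The sketch below uses only the JDK to show the pattern in isolation; TwoPhaseShutdown is a hypothetical class for illustration and says nothing about how Reactor Netty handles its sockets.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a two-phase shutdown on a plain thread pool. */
final class TwoPhaseShutdown {
    private final ExecutorService workers = Executors.newFixedThreadPool(4);
    private final AtomicInteger completed = new AtomicInteger();

    /** Submits simulated work; returns false once shutdown has begun and work is refused. */
    boolean submit(long workMillis) {
        try {
            workers.execute(() -> {
                try {
                    Thread.sleep(workMillis); // stands in for a slow upstream call
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Phase 1: refuse new work. Phase 2: wait up to graceMillis for running work. */
    boolean shutdownGracefully(long graceMillis) {
        workers.shutdown();
        try {
            if (workers.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        workers.shutdownNow(); // grace period over: interrupt whatever is left
        return false;
    }

    int completedCount() { return completed.get(); }
}
```

Work submitted before shutdownGracefully runs to completion, while anything submitted afterwards is rejected immediately instead of being cut off halfway.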
This could be the solution to stop accepting incoming requests.
@Component
public class ShutdownFilter implements GlobalFilter, Ordered, ApplicationListener<ApplicationContextEvent> {

    private static final Logger logger = LoggerFactory.getLogger(ShutdownFilter.class);

    // true once the context is closing or stopped; new requests are then rejected
    private volatile boolean fireErrorOnStop;

    @Override
    public void onApplicationEvent(ApplicationContextEvent event) {
        if (event instanceof ContextRefreshedEvent) {
            logger.debug("Allowing requests to process");
            fireErrorOnStop = false;
        }
        if (event instanceof ContextClosedEvent || event instanceof ContextStoppedEvent) {
            logger.debug("All requests will be denied");
            fireErrorOnStop = true;
        }
    }

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        if (fireErrorOnStop) {
            return Mono.error(new ServiceUnavailableException("Service is in shutdown process"));
        }
        return chain.filter(exchange);
    }

    @Override
    public int getOrder() {
        // Run before every other filter
        return Integer.MIN_VALUE;
    }

    public static class Config {
    }
}
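Stripped of the Spring types, the filter above is a flag check in front of the downstream call. The following is a minimal sketch of that idea using only the JDK; ShutdownGate, handle and the integer status codes are hypothetical stand-ins for the filter, the filter chain and the HTTP response.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical gate: lets requests through until shutdown begins, then answers 503. */
final class ShutdownGate {
    static final int SERVICE_UNAVAILABLE = 503;

    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);

    /** Flip the gate; in the filter above this happens on ContextClosedEvent. */
    void beginShutdown() { shuttingDown.set(true); }

    /** Runs the downstream handler unless shutdown has started; returns the status code. */
    int handle(Supplier<Integer> downstream) {
        if (shuttingDown.get()) {
            return SERVICE_UNAVAILABLE; // reject without touching downstream
        }
        return downstream.get();
    }
}
```

Requests that passed the gate before beginShutdown was called are not affected by it, so this only covers the "stop accepting" half and still needs something that waits for the requests already in progress.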
@violetagg's solution works fine for me. My Spring Cloud Gateway application is deployed in Kubernetes.
I scaled the service in to one instance under constant load generated by wrk, and no 500s were observed.
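This is now supported out of the box as of Spring Boot 2.3.0: set server.shutdown=graceful, and optionally tune the grace period with spring.lifecycle.timeout-per-shutdown-phase.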