Describe the bug
When I start a Quarkus application that uses the OIDC extension, startup fails if the auth server is not reachable.
Expected behavior
The application should start normally, maybe with a warning and an automatic retry.
Actual behavior
The application stops with the following exception:
java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8180
at java.base/java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:412)
at java.base/java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2044)
at io.quarkus.oidc.VertxKeycloakRecorder.setup(VertxKeycloakRecorder.java:59)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup28.deploy_0(VertxKeycloakBuildStep$setup28.zig:92)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup28.deploy(VertxKeycloakBuildStep$setup28.zig:36)
at io.quarkus.runner.ApplicationImpl1.doStart(ApplicationImpl1.zig:137)
at io.quarkus.runtime.Application.start(Application.java:94)
at io.quarkus.runner.RuntimeRunner.run(RuntimeRunner.java:135)
at io.quarkus.dev.DevModeMain.doStart(DevModeMain.java:180)
at io.quarkus.dev.DevModeMain.start(DevModeMain.java:94)
at io.quarkus.dev.DevModeMain.main(DevModeMain.java:66)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8180
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:336)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:685)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:632)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:549)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:511)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
2019-10-16 20:28:40,692 ERROR [io.qua.dev.DevModeMain] (main) Failed to start quarkus: java.lang.RuntimeException: Failed to start quarkus
at io.quarkus.runner.ApplicationImpl1.doStart(ApplicationImpl1.zig:185)
at io.quarkus.runtime.Application.start(Application.java:94)
at io.quarkus.runner.RuntimeRunner.run(RuntimeRunner.java:135)
at io.quarkus.dev.DevModeMain.doStart(DevModeMain.java:180)
at io.quarkus.dev.DevModeMain.start(DevModeMain.java:94)
at io.quarkus.dev.DevModeMain.main(DevModeMain.java:66)
Caused by: java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8180
at java.base/java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:412)
at java.base/java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2044)
at io.quarkus.oidc.VertxKeycloakRecorder.setup(VertxKeycloakRecorder.java:59)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup28.deploy_0(VertxKeycloakBuildStep$setup28.zig:92)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup28.deploy(VertxKeycloakBuildStep$setup28.zig:36)
at io.quarkus.runner.ApplicationImpl1.doStart(ApplicationImpl1.zig:137)
... 5 more
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8180
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:336)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:685)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:632)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:549)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:511)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
To Reproduce
Steps to reproduce the behavior:
Configuration
```properties
quarkus.oidc.auth-server-url=http://localhost:8180/auth/realms/test
quarkus.oidc.client-id=test-client
```
Another error occurs when Keycloak is up but the realm is incorrect. Would it be a good idea to open another issue?
2019-10-16 19:06:46,165 INFO [io.qua.dep.QuarkusAugmentor] (main) Quarkus augmentation completed in 1912ms
java.util.concurrent.CompletionException: io.vertx.core.impl.NoStackTraceThrowable: Not Found
at java.base/java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:412)
at java.base/java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2044)
at io.quarkus.oidc.VertxKeycloakRecorder.setup(VertxKeycloakRecorder.java:59)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup44.deploy_0(VertxKeycloakBuildStep$setup44.zig:73)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup44.deploy(VertxKeycloakBuildStep$setup44.zig:92)
at io.quarkus.runner.ApplicationImpl1.doStart(ApplicationImpl1.zig:169)
at io.quarkus.runtime.Application.start(Application.java:94)
at io.quarkus.runner.RuntimeRunner.run(RuntimeRunner.java:135)
at io.quarkus.dev.DevModeMain.doStart(DevModeMain.java:180)
at io.quarkus.dev.DevModeMain.start(DevModeMain.java:94)
at io.quarkus.dev.DevModeMain.main(DevModeMain.java:66)
Caused by: io.vertx.core.impl.NoStackTraceThrowable: Not Found
2019-10-16 19:06:48,149 ERROR [io.qua.dev.DevModeMain] (main) Failed to start quarkus: java.lang.RuntimeException: Failed to start quarkus
at io.quarkus.runner.ApplicationImpl1.doStart(ApplicationImpl1.zig:225)
at io.quarkus.runtime.Application.start(Application.java:94)
at io.quarkus.runner.RuntimeRunner.run(RuntimeRunner.java:135)
at io.quarkus.dev.DevModeMain.doStart(DevModeMain.java:180)
at io.quarkus.dev.DevModeMain.start(DevModeMain.java:94)
at io.quarkus.dev.DevModeMain.main(DevModeMain.java:66)
Caused by: java.util.concurrent.CompletionException: io.vertx.core.impl.NoStackTraceThrowable: Not Found
at java.base/java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:412)
at java.base/java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2044)
at io.quarkus.oidc.VertxKeycloakRecorder.setup(VertxKeycloakRecorder.java:59)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup44.deploy_0(VertxKeycloakBuildStep$setup44.zig:73)
at io.quarkus.deployment.steps.VertxKeycloakBuildStep$setup44.deploy(VertxKeycloakBuildStep$setup44.zig:92)
at io.quarkus.runner.ApplicationImpl1.doStart(ApplicationImpl1.zig:169)
... 5 more
Caused by: io.vertx.core.impl.NoStackTraceThrowable: Not Found
2019-10-16 19:06:48,150 ERROR [io.qua.dev.DevModeMain] (main) Failed to start Quarkus, attempting to start hot replacement endpoint to recover
Well, at least better error handling would be great. A message saying that the realm was not found would be much more helpful than just "Not Found".
@viniciusfcf can you please open another issue with your stacktrace. I don't think we can figure out the realm is incorrect if not found is thrown but at least we should have a more useful message logged, such as, please make sure the 'quarkus.oidc.auth-server-url' is correct
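As a rough illustration of the more useful message suggested here, the following sketch wraps the failing `CompletableFuture.join()` seen in the stack traces and rethrows with a hint about the property. `OidcStartupErrors` and `joinWithHint` are hypothetical names for this example only and are not part of quarkus-oidc; the wording of the message is also an assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical helper, not part of quarkus-oidc.
final class OidcStartupErrors {

    static final String PROPERTY = "quarkus.oidc.auth-server-url";

    // Waits for the (assumed) discovery future and rethrows any failure
    // with a message that names the property to check.
    static <T> T joinWithHint(CompletableFuture<T> discovery, String authServerUrl) {
        try {
            return discovery.join();
        } catch (CompletionException e) {
            // join() wraps the real failure; unwrap it for the message.
            Throwable cause = e.getCause() != null ? e.getCause() : e;
            throw new IllegalStateException(
                    "OIDC server at '" + authServerUrl + "' could not be used ("
                            + cause.getMessage() + "); please make sure '"
                            + PROPERTY + "' is correct and the server is running",
                    cause);
        }
    }

    public static void main(String[] args) {
        // Simulates the "Not Found" failure from the second stack trace.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new RuntimeException("Not Found"));
        try {
            joinWithHint(failed, "http://localhost:8180/auth/realms/test");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This does not detect that the realm specifically is wrong; it only points the reader at the URL, which is the limit described in the comment above.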
@UnvirtualHH For the endpoints to be protected, the IDP should be up and running at the moment the endpoint is up. If we retry, then when should it be done, on the 1st request? But that will slow things down, and we'll need to block before the server is up and running.
@viniciusfcf can you please open another issue with your stacktrace. I don't think we can figure out the realm is incorrect if not found is thrown but at least we should have a more useful message logged, such as,
please make sure the 'quarkus.oidc.auth-server-url' is correct
Sure
When I start a quarkus application with the oidc extension used, the startup will fail, when the auth server is not reachable.
I think that the issue may be a bit worse than what is described here. It looks like simply enabling the quarkus-oidc extension requires a valid OIDC server to be running at start-up. This means that you can't disable the property either. This makes it impossible to, say, use BASIC authentication for tests while using the quarkus-oidc extension in production.
It should be possible to enable the quarkus-oidc extension with the quarkus.oidc.auth-server-url property left unset, so that you could do something like the following:
%prod.quarkus.oidc.auth-server-url=<url>
%test.quarkus.security.users.embedded.enabled=true
Hi @dwalluck
On many security-related extensions, you can switch them off per environment by setting an 'enabled' property to false.
Apparently, this property is missing in OIDC. If we implement it, you will be able to configure your apps like this:
%prod.quarkus.oidc.auth-server-url=<url>
%test.quarkus.security.users.embedded.enabled=true
%test.quarkus.oidc.enabled=false
I think this would solve your issue.
@sberyozkin I can provide a PR for this if it's OK for you
Hi @loicmathieu - if you have a bit of time then yes, please do it.
@UnvirtualHH @dwalluck with #4828 you will be able to disable OIDC in the test profile like this:
%prod.quarkus.oidc.auth-server-url=<url>
%test.quarkus.security.users.embedded.enabled=true
%test.quarkus.oidc.enabled=false
You can test it by installing the branch (or master when merged) locally.
Is it OK for you? If yes, I will close the issue when the PR is merged.
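To illustrate how the profile-prefixed lines above are meant to be read, here is a simplified Java sketch: a key prefixed with `%<profile>.` takes precedence over the plain key when that profile is active. `ProfileLookup` and `lookup` are hypothetical names for this example; real Quarkus configuration resolution involves more sources and rules than a single map.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of profile-aware property lookup.
final class ProfileLookup {

    // Prefers "%<profile>.<key>" and falls back to the plain "<key>".
    static Optional<String> lookup(Map<String, String> props, String profile, String key) {
        String profiled = props.get("%" + profile + "." + key);
        return Optional.ofNullable(profiled != null ? profiled : props.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of(
                "%test.quarkus.security.users.embedded.enabled", "true",
                "%test.quarkus.oidc.enabled", "false");

        // In the test profile OIDC is switched off explicitly.
        System.out.println(lookup(props, "test", "quarkus.oidc.enabled")); // Optional[false]
        // In prod nothing is set, so the extension's own default would apply.
        System.out.println(lookup(props, "prod", "quarkus.oidc.enabled")); // Optional.empty
    }
}
```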
@loicmathieu thanks Loic :-)
One more thing that maybe we can add is a single delayed retry, say, after 10 sec, if the initial attempt to connect to the IDP fails. This would probably help @UnvirtualHH, as it appears the use case there is that in some test environment the tests start before the Docker KC instance is up. @UnvirtualHH can you please confirm?
Well, a configurable delay would be better, maybe with 10 seconds as the default value but overridable by a config param, so that I can set it to 30 seconds or more to fit my needs.
@loicmathieu Thanks! One reason I was trying to disable the quarkus-oidc extension is that I don't think the quarkus.security.users.* properties will work if it's left enabled.
This is a separate issue, but I don't think it is clear what security provider will get used if quarkus.oidc.enabled=true and quarkus.security.users.embedded.enabled=true (or quarkus.security.users.file.enabled=true) at the same time. Is it related to which extension loads first? It would help if it indicated this in the log somewhere.
@UnvirtualHH OK, I guess a property like connection-retry-interval would do, but we'll just have a single retry...thanks
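A minimal sketch of what a single delayed retry could look like, assuming the connection attempt is available as a `Supplier` and the interval comes from a property such as the proposed connection-retry-interval. `SingleRetry` and `getWithSingleRetry` are hypothetical names, not existing Quarkus code.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper: one attempt, then exactly one retry after a fixed interval.
final class SingleRetry {

    static <T> T getWithSingleRetry(Supplier<T> attempt, Duration retryInterval) {
        try {
            return attempt.get();
        } catch (RuntimeException first) {
            try {
                // Blocks the calling (startup) thread for the configured interval.
                Thread.sleep(retryInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw first;
            }
            try {
                return attempt.get();
            } catch (RuntimeException second) {
                // Keep the first failure visible in the final stack trace.
                if (second != first) {
                    second.addSuppressed(first);
                }
                throw second;
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = getWithSingleRetry(() -> {
            if (++attempts[0] == 1) {
                throw new IllegalStateException("Connection refused");
            }
            return "connected";
        }, Duration.ofMillis(10));
        System.out.println(result + " after " + attempts[0] + " attempts"); // connected after 2 attempts
    }
}
```

Note that startup is blocked for the whole interval whenever the first attempt fails, which is part of the trade-off discussed in this thread.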
@dwalluck normally, if you have multiple security providers enabled at the same time, it should crash (at build time). As the security layer has been refactored a lot recently, this may no longer be the case.
With my changes already merged, you can disable each extension, so it fixes your issue but unfortunately not the one from @UnvirtualHH...
Regarding the issue @UnvirtualHH has, I'm not a fan of a connection-retry-interval implementation: it doesn't cover a lot of possible connection issues and seems like a band-aid to me (timeout on socket or on read? which default value? which max value to avoid stopping all threads for a long time, ...).
Two things on my mind:
- Solve your chicken-and-egg issue properly in your tests (using testcontainers, for example)
- Make it possible for the OIDC extension to start without being able to connect to the OIDC server
I think the second is the best, as in a container environment we cannot control the start order of the different containers... and in a reliable architecture we cannot make different components depend on each other.
So I sponsor a change to the OIDC extension to make it start even if it cannot connect to the server.
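One possible shape for "start even if it cannot connect" is to defer the connection until it is first needed. The sketch below is only an illustration under that assumption; `LazyOidcClient` is a hypothetical name and the connector `Supplier` stands in for whatever actually contacts the OIDC server.

```java
import java.util.function.Supplier;

// Hypothetical wrapper: construction never touches the network,
// the connector runs on first use and is retried on later calls if it failed.
final class LazyOidcClient<T> {

    private final Supplier<T> connector;
    private volatile T client;

    LazyOidcClient(Supplier<T> connector) {
        this.connector = connector;
    }

    T get() {
        T current = client;
        if (current == null) {
            synchronized (this) {
                current = client;
                if (current == null) {
                    // May throw; nothing is cached then, so the next call tries again.
                    current = connector.get();
                    client = current;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        LazyOidcClient<String> lazy = new LazyOidcClient<>(() -> {
            if (++calls[0] == 1) {
                throw new IllegalStateException("Connection refused");
            }
            return "client";
        });
        System.out.println("started, connector calls so far: " + calls[0]);
        try {
            lazy.get();
        } catch (IllegalStateException e) {
            System.out.println("first use failed: " + e.getMessage());
        }
        System.out.println("second use: " + lazy.get());
    }
}
```

The cost is the one raised earlier in the thread: the first request blocks on the connection attempt, and requests fail until the server becomes reachable.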
@loicmathieu Currently, there seems to be no indication that it (quarkus.security.users.*) is not being used. You will just realize when authentication doesn't work.
My issue was similar to @UnvirtualHH's: when I tried to have a blank %test.quarkus.oidc.auth-server-url= property value, I believe I got the same exception. I wonder if an empty URL couldn't imply the same as enabled=false.
I think that starting servers in Docker is orthogonal to this issue. Every server that makes some connection may be subject to this startup issue, including the database/agroal features. For me, the startup of Keycloak in my test Docker container actually takes quite some time (closer to minutes than seconds). It is running a full copy of WildFly after all, not Quarkus. Usually this is considered an issue to be solved in Docker itself.
However, I was wondering if it could make sense to use something like microprofile fault tolerance retry (number of retries, interval) to make this feature more generic.
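As a sketch of the more generic idea (number of retries plus interval), the plain-Java helper below mimics the two parameters mentioned. It does not use the MicroProfile Fault Tolerance library itself; `RetryPolicy`, `maxRetries` and `delay` are names chosen for this example.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical stand-in for a declarative retry: maxRetries extra attempts,
// with a fixed delay before each retry.
final class RetryPolicy {

    private final int maxRetries;
    private final Duration delay;

    RetryPolicy(int maxRetries, Duration delay) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        this.maxRetries = maxRetries;
        this.delay = delay;
    }

    <T> T run(Supplier<T> action) {
        RuntimeException last = null;
        // Attempt 0 is the initial call; attempts 1..maxRetries are retries.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attempt > 0) {
                pause();
            }
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
            }
        }
        throw last;
    }

    private void pause() {
        try {
            Thread.sleep(delay.toMillis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = new RetryPolicy(3, Duration.ofMillis(10)).run(() -> {
            if (++attempts[0] < 3) {
                throw new IllegalStateException("Connection refused");
            }
            return "connected";
        });
        System.out.println(result + " after " + attempts[0] + " attempts"); // connected after 3 attempts
    }
}
```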
@dwalluck it's more complicated than this because the extension hooks into vertx/undertow to add security interceptors, the interceptors are not the same for all extensions, and this is done at build time.
So you need to use the enabled property.
@dwalluck normally, if you have multiple security providers enabled at the same time, it should crash (at build time). As the security layer has been refactored a lot recently, this may no longer be the case.
With my changes already merged, you can disable each extension, so it fixes your issue but unfortunately not the one from @UnvirtualHH...
Regarding the issue @UnvirtualHH has, I'm not a fan of a connection-retry-interval implementation: it doesn't cover a lot of possible connection issues and seems like a band-aid to me (timeout on socket or on read? which default value? which max value to avoid stopping all threads for a long time, ...).
Two things on my mind:
- Solve your chicken-and-egg issue properly in your tests (using testcontainers, for example)
- Make it possible for the OIDC extension to start without being able to connect to the OIDC server
I think the second is the best, as in a container environment we cannot control the start order of the different containers... and in a reliable architecture we cannot make different components depend on each other.
So I sponsor a change to the OIDC extension to make it start even if it cannot connect to the server.
Well, I just meant a configurable parameter for the time in seconds after which one retry should occur. So e.g. if I know that after 30 seconds all other containers are up, I will configure it with this value... if I know that it takes 180 seconds, I will configure my retry to that value. So not a setting for how many retries after what time, just a parameter for one retry.
@loicmathieu Hi Loic, I'd not go for starting Quarkus without the IDP being available because it means the whole system is not operational. We can still rely on the adapter failing to verify the token even without the IDP, but there has to be a strong enough guarantee that the IDP is listening there, ready to support all the flows. Of course, the IDP can crash after the connection has been established :-), but this is out of Quarkus' control.
Please also check the 1st comment I made to @UnvirtualHH - for the adapter to attempt to re-establish the connection later we'd need to make the concessions...
I think I made a mistake in suggesting a retry. Even a single one overcomplicates things. What may help @UnvirtualHH and others is a single configurable initial delay before the connection attempt to the IDP is made.