Running _EcrClient.listImagesPaginator()_ in a _Collection.parallelStream()_ using Spring Boot sometimes results in _WebIdentityCredentialsUtils.factory()_ not finding STS on its class path.
_Sometimes_ when I execute the above I get the error message: "To use web identity tokens, the 'sts' service module must be on the class path." I know that parallel streams delegate work to the ForkJoin common pool, and it seems that _some_ threads from this pool have a context class loader that does not have STS on its path. (Maybe because of Spring? I noticed that the loader called "app" does not have STS.)
Is there a reason you guys are using _Thread.currentThread().getContextClassLoader()_, instead of _Object.getClass().getClassLoader()_ in the WebIdentityCredentialsUtils class?
I admit I know little about class loaders and how you use them, but I cannot use parallelStream() reliably at the moment. Or do you advise against using parallelStream at all?
Not filing as bug, since it might be intended behaviour.
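To make the question about the two loaders concrete, here is a minimal sketch of how a lookup through the thread context class loader can fail while the same lookup through a class's own loader succeeds. The class name `ContextLoaderDemo` and its helper methods are hypothetical, and the restricted thread only imitates a worker whose context loader cannot see application classes; it is not a reproduction of the Spring Boot setup.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ContextLoaderDemo {

    /** Returns true if the given loader can resolve the named class. */
    static boolean canLoad(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /**
     * Looks up this class on a thread whose context loader is deliberately restricted.
     * result[0]: lookup via the thread context class loader.
     * result[1]: lookup via the class's own loader.
     */
    static boolean[] lookupOnRestrictedThread() throws InterruptedException {
        String target = ContextLoaderDemo.class.getName();
        boolean[] result = new boolean[2];
        Thread worker = new Thread(() -> {
            result[0] = canLoad(target, Thread.currentThread().getContextClassLoader());
            result[1] = canLoad(target, ContextLoaderDemo.class.getClassLoader());
        });
        // Assumption for the demo: an empty loader with no parent, so it only sees bootstrap classes.
        worker.setContextClassLoader(new URLClassLoader(new URL[0], null));
        worker.start();
        worker.join(); // join() makes the worker's writes to result visible here
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = lookupOnRestrictedThread();
        System.out.println("via context loader: " + result[0]);
        System.out.println("via own loader:     " + result[1]);
    }
}
```

The point of the sketch is only that the context loader is a property of the thread, so the outcome depends on which thread happens to run the lookup.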
I am listing docker images from multiple repositories in parallel.
@Override
public Stream<String> listImages(List<String> repositories) {
    return repositories.parallelStream()
            .flatMap(repoName -> ecr.listImagesPaginator(builder -> builder.registryId(this.registryId).repositoryName(repoName).filter(f -> f.tagStatus(TagStatus.TAGGED)))
                    .imageIds()
                    .stream()
                    .map(ImageIdentifier::imageTag));
}
I get the message "To use web identity tokens, the 'sts' service module must be on the class path." instead of it finding STS and using the token credentials.
Thank you for reporting the issue! I think we should use the class's own class loader instead of the thread context loader. We already have the ClassLoaderHelper#classLoader helper for this. Marking this as a bug.
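As an illustration of the general idea (not the SDK's actual ClassLoaderHelper implementation, which is not shown in this thread), a lookup can try the loader of a known class first and fall back to the thread context loader only if that fails. The class `FallbackClassLoading` and its `loadClass` method below are hypothetical names used for this sketch.

```java
public final class FallbackClassLoading {

    private FallbackClassLoading() {
    }

    /**
     * Resolves className using the loader that defined the anchor class first,
     * then falls back to the current thread's context class loader.
     */
    static Class<?> loadClass(String className, Class<?> anchor) throws ClassNotFoundException {
        ClassLoader own = anchor.getClassLoader();
        try {
            // A null loader here means the bootstrap loader, which Class.forName accepts.
            return Class.forName(className, false, own);
        } catch (ClassNotFoundException notVisibleToOwnLoader) {
            ClassLoader context = Thread.currentThread().getContextClassLoader();
            if (context == null || context == own) {
                // Nothing else to try, so report the original failure.
                throw notVisibleToOwnLoader;
            }
            return Class.forName(className, false, context);
        }
    }
}
```

With this ordering, the result no longer depends on which thread runs the lookup, as long as the anchor class and the target class are visible to the same loader.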
Hi @BartXZX
Could you please confirm whether you have added the "sts" module as a dependency, like
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>sts</artifactId>
    <version>2.15.20</version>
</dependency>
or it would be of great help if you could provide all dependencies related to
Hi @joviegas, we use gradle. I have the following dependencies.
implementation platform('software.amazon.awssdk:bom:2.14.23')
implementation 'software.amazon.awssdk:s3'
implementation 'software.amazon.awssdk:ecr'
implementation 'software.amazon.awssdk:sts'
implementation 'software.amazon.awssdk:eks'
implementation 'software.amazon.awssdk:dynamodb'
implementation 'software.amazon.awssdk:rds'
Also, I removed all the parallel streams that used the SDK from the codebase and the issue goes away.
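For anyone who wants to keep parallel streams before upgrading, one possible workaround is to run the stream inside a dedicated ForkJoinPool whose workers get an explicit context class loader. This is only a sketch: `DedicatedPoolStreams`, `poolWithLoader`, and `mapInParallel` are hypothetical names, the parallelism of 4 is arbitrary, and the fact that a parallel stream started from a ForkJoinPool worker stays in that pool is long-standing JDK behaviour rather than a documented guarantee.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DedicatedPoolStreams {

    /** Builds a pool whose worker threads all use the given context class loader. */
    static ForkJoinPool poolWithLoader(int parallelism, ClassLoader loader) {
        ForkJoinPool.ForkJoinWorkerThreadFactory factory = pool -> {
            ForkJoinWorkerThread worker =
                    ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
            worker.setContextClassLoader(loader);
            return worker;
        };
        return new ForkJoinPool(parallelism, factory, null, false);
    }

    /** Maps items with a parallel stream that runs in the dedicated pool instead of the common pool. */
    static <T, R> List<R> mapInParallel(List<T> items, Function<T, R> fn, ClassLoader loader)
            throws InterruptedException, ExecutionException {
        ForkJoinPool pool = poolWithLoader(4, loader);
        try {
            // The terminal operation runs on a pool worker, so forked subtasks stay in this pool.
            return pool.submit(() -> items.parallelStream().map(fn).collect(Collectors.toList())).get();
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as `mapInParallel(repositories, repo -> ..., getClass().getClassLoader())` would then run every element on a worker with that loader. Note that the stream has to be fully consumed inside the submitted task, so returning a lazy `Stream` as in the original `listImages` method would not benefit from this.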
I did some debugging and have stack traces from both cases. The thing that stands out to me is that STS could not be found when the stream delegated some of its work to another thread, a _ForkJoinWorkerThread_.
STS was not found:
2020-10-28 13:11:30.774 ERROR 1 --- [onPool-worker-3] c.p.c.p.o.util.ClassloaderUtils : STS not found, classloader: app
java.lang.ClassNotFoundException: software.amazon.awssdk.services.sts.internal.StsWebIdentityCredentialsProviderFactory
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:602) ~[na:na]
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) ~[na:na]
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) ~[na:na]
at java.base/java.lang.Class.forName0(Native Method) ~[na:na]
at java.base/java.lang.Class.forName(Class.java:416) ~[na:na]
at com.planonsoftware.cloud.pco.orchestrator.util.ClassloaderUtils.printIsStsOnClasspath(ClassloaderUtils.java:13) ~[classes!/:na]
at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.registries.EcrRegistry.listImages(EcrRegistry.java:81) ~[classes!/:na]
at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.DockerImageStoreImpl.lambda$listImages$1(DockerImageStoreImpl.java:35) ~[classes!/:na]
at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271) ~[na:na]
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1621) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[na:na]
at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952) ~[na:na]
at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926) ~[na:na]
at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327) ~[na:na]
at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746) ~[na:na]
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) ~[na:na]
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016) ~[na:na]
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665) ~[na:na]
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598) ~[na:na]
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) ~[na:na]
And right below that in our logs, STS was found:
2020-10-28 13:11:30.779 INFO 1 --- [ scheduling-1] c.p.c.p.o.util.ClassloaderUtils : STS found, classloader: null
java.lang.Exception: Stack trace
at java.base/java.lang.Thread.dumpStack(Thread.java:1379)
at com.planonsoftware.cloud.pco.orchestrator.util.ClassloaderUtils.printIsStsOnClasspath(ClassloaderUtils.java:16)
at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.registries.EcrRegistry.listImages(EcrRegistry.java:81)
at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.DockerImageStoreImpl.lambda$listImages$1(DockerImageStoreImpl.java:35)
at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271)
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1621)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:919)
at java.base/java.util.stream.DistinctOps$1.reduce(DistinctOps.java:64)
at java.base/java.util.stream.DistinctOps$1.opEvaluateParallelLazy(DistinctOps.java:110)
at java.base/java.util.stream.AbstractPipeline.sourceSpliterator(AbstractPipeline.java:434)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.DockerImageStoreImpl.listImages(DockerImageStoreImpl.java:37)
at com.planonsoftware.cloud.pco.orchestrator.services.application.ApplicationServiceImpl.refresh(ApplicationServiceImpl.java:115)
at com.planonsoftware.cloud.pco.orchestrator.services.application.ApplicationServiceImpl.refreshCache(ApplicationServiceImpl.java:85)
at com.planonsoftware.cloud.pco.orchestrator.scheduled.ScheduledTasks.refreshApplicationList(ScheduledTasks.java:22)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:830)
Hi @BartXZX,
Thanks for the above data. Appreciate your quick responses.
We have fixed the issue with PR 2139.
Could you please try to reproduce the issue with release 2.15.26 and help me validate whether the above PR fixed it?
Is there a preview build somewhere for 26, or should I just wait for it to be released?
Never mind, I see it's already in 25 :-)
And I'm happy to say I no longer see the issue!
Of course, it was an issue we saw 'sometimes' but usually we had a 50/50 chance to see it on startup.
I've tested many times now and have not seen it since, so I'm happy.
Will let you guys know if I see it pop up again.
Thank you for the quick response!
Marking this to auto close soon, feel free to reach out if the issue persists after the fix.
I've also hit the bug, and can confirm that the fix is working.
Thank you so much @casperbiering and @BartXZX.
Closing the issue.