Elasticsearch: AutoFollowIT#testCleanFollowedLeaderIndexUUIDs failure

Created on 6 Jan 2021 · 5 Comments · Source: elastic/elasticsearch

Build scan:
https://gradle-enterprise.elastic.co/s/wchepwmxp4rn6/tests/:x-pack:plugin:ccr:internalClusterTest/org.elasticsearch.xpack.ccr.AutoFollowIT/testCleanFollowedLeaderIndexUUIDs?expanded-stacktrace=WyIwIl0#1
Repro line:

./gradlew ':x-pack:plugin:ccr:internalClusterTest' --tests "org.elasticsearch.xpack.ccr.AutoFollowIT.testCleanFollowedLeaderIndexUUIDs" -Dtests.seed=C1F2474A919348E8 -Dtests.security.manager=true -Dtests.locale=et-EE -Dtests.timezone=America/Indiana/Marengo

Reproduces locally?:
No
Applicable branches:
7.x, master
Failure history:
It failed twice according to build scans

https://gradle-enterprise.elastic.co/scans/tests?search.relativeStartTime=P7D&search.timeZoneId=Europe/Madrid&tests.container=org.elasticsearch.xpack.ccr.AutoFollowIT&tests.sortField=FAILED&tests.test=testCleanFollowedLeaderIndexUUIDs&tests.unstableOnly=true

But only once according to build-stats

https://build-stats.elastic.co/app/kibana#/discover?_g=(refreshInterval:(pause:!t,value:0),time:(from:now-30d,mode:quick,to:now))&_a=(columns:!(_source),index:b646ed00-7efc-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:testCleanFollowedLeaderIndexUUIDs),sort:!(process.time-start,desc))

Failure excerpt:

org.elasticsearch.discovery.MasterNotDiscoveredException: (No message provided)
:DistributeCCR >test-failure Distributed v7.12.0 v8.0.0

All 5 comments

Pinging @elastic/es-distributed (Team:Distributed)

Here is the stack trace. The interesting bit is "Expected current thread ... to not be a transport thread. Reason: [Blocking operation]":

org.elasticsearch.xpack.ccr.AutoFollowIT > testCleanFollowedLeaderIndexUUIDs FAILED
    MasterNotDiscoveredException[null]
        at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:230)
        at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:335)
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:252)
        at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:601)
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:684)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

    com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=878, name=elasticsearch[follower2][transport_worker][T#1], state=RUNNABLE, group=TGRP-AutoFollowIT]

        Caused by:
        java.lang.AssertionError: Expected current thread [Thread[elasticsearch[follower2][transport_worker][T#1],5,TGRP-AutoFollowIT]] to not be a transport thread. Reason: [Blocking operation]
            at __randomizedtesting.SeedInfo.seed([C1F2474A919348E8]:0)
            at org.elasticsearch.transport.Transports.assertNotTransportThread(Transports.java:60)
            at org.elasticsearch.common.util.concurrent.BaseFuture.blockingAllowed(BaseFuture.java:92)
            at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:64)
            at org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:76)
            at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:61)
            at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:55)
            at org.elasticsearch.xpack.ccr.repository.CcrRepository$RestoreSession.close(CcrRepository.java:628)
            at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:74)
            at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:116)
            at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:99)
            at org.elasticsearch.xpack.ccr.repository.CcrRepository.lambda$restoreShard$2(CcrRepository.java:330)
            at org.elasticsearch.action.ActionListener$5.onFailure(ActionListener.java:303)
            at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:136)
            at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:136)
            at org.elasticsearch.indices.recovery.MultiChunkTransfer.onCompleted(MultiChunkTransfer.java:145)
            at org.elasticsearch.indices.recovery.MultiChunkTransfer.handleItems(MultiChunkTransfer.java:133)
            at org.elasticsearch.indices.recovery.MultiChunkTransfer.access$000(MultiChunkTransfer.java:59)
            at org.elasticsearch.indices.recovery.MultiChunkTransfer$1.write(MultiChunkTransfer.java:78)
            at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:108)
            at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcessAndRelease(AsyncIOProcessor.java:96)
            at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:84)
            at org.elasticsearch.indices.recovery.MultiChunkTransfer.addItem(MultiChunkTransfer.java:89)
            at org.elasticsearch.indices.recovery.MultiChunkTransfer.lambda$handleItems$4(MultiChunkTransfer.java:125)
            at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:136)
            at org.elasticsearch.action.ActionListener$3.onFailure(ActionListener.java:183)
            at org.elasticsearch.action.support.ListenerTimeouts$TimeoutableListener.onFailure(ListenerTimeouts.java:96)
            at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59)
            at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1299)
            at org.elasticsearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:328)
            at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:224)
            at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:326)
            at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:318)
            at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:137)
            at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:95)
            at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:700)
            at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142)
            at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117)
            at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82)
            at org.elasticsearch.transport.nio.MockNioTransport$MockTcpReadWriteHandler.consumeReads(MockNioTransport.java:296)
            at org.elasticsearch.nio.SocketChannelContext.handleReadBytes(SocketChannelContext.java:228)
            at org.elasticsearch.nio.BytesChannelContext.read(BytesChannelContext.java:40)
            at org.elasticsearch.nio.EventHandler.handleRead(EventHandler.java:139)
            at org.elasticsearch.transport.nio.TestEventHandler.handleRead(TestEventHandler.java:151)
            at org.elasticsearch.nio.NioSelector.handleRead(NioSelector.java:420)
            at org.elasticsearch.nio.NioSelector.processKey(NioSelector.java:246)
            at org.elasticsearch.nio.NioSelector.singleLoop(NioSelector.java:174)
            at org.elasticsearch.nio.NioSelector.runLoop(NioSelector.java:131)

The actual failure here is:

WARNING: Uncaught exception in thread: Thread[elasticsearch[follower2][transport_worker][T#1],5,TGRP-AutoFollowIT]
java.lang.AssertionError: Expected current thread [Thread[elasticsearch[follower2][transport_worker][T#1],5,TGRP-AutoFollowIT]] to not be a transport thread. Reason: [Blocking operation]
    at __randomizedtesting.SeedInfo.seed([C1F2474A919348E8]:0)
    at org.elasticsearch.transport.Transports.assertNotTransportThread(Transports.java:60)
    at org.elasticsearch.common.util.concurrent.BaseFuture.blockingAllowed(BaseFuture.java:92)
    at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:64)
    at org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:76)
    at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:61)
    at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:55)
    at org.elasticsearch.xpack.ccr.repository.CcrRepository$RestoreSession.close(CcrRepository.java:628)
    at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:74)
    at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:116)
    at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:99)
    at org.elasticsearch.xpack.ccr.repository.CcrRepository.lambda$restoreShard$2(CcrRepository.java:330)
    at org.elasticsearch.action.ActionListener$5.onFailure(ActionListener.java:303)
    at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:136)
    at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:136)
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.onCompleted(MultiChunkTransfer.java:145)
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.handleItems(MultiChunkTransfer.java:133)
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.access$000(MultiChunkTransfer.java:59)
    at org.elasticsearch.indices.recovery.MultiChunkTransfer$1.write(MultiChunkTransfer.java:78)
    at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:108)
    at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.drainAndProcessAndRelease(AsyncIOProcessor.java:96)
    at org.elasticsearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:84)
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.addItem(MultiChunkTransfer.java:89)
    at org.elasticsearch.indices.recovery.MultiChunkTransfer.lambda$handleItems$4(MultiChunkTransfer.java:125)
    at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:136)
    at org.elasticsearch.action.ActionListener$3.onFailure(ActionListener.java:183)
    at org.elasticsearch.action.support.ListenerTimeouts$TimeoutableListener.onFailure(ListenerTimeouts.java:96)
    at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59)
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1299)
    at org.elasticsearch.transport.InboundHandler.lambda$handleException$3(InboundHandler.java:328)
    at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:224)
    at org.elasticsearch.transport.InboundHandler.handleException(InboundHandler.java:326)
    at org.elasticsearch.transport.InboundHandler.handlerResponseError(InboundHandler.java:318)
    at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:137)
    at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:95)
    at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:700)
    at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:142)
    at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:117)
    at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:82)
    at org.elasticsearch.transport.nio.MockNioTransport$MockTcpReadWriteHandler.consumeReads(MockNioTransport.java:296)
    at org.elasticsearch.nio.SocketChannelContext.handleReadBytes(SocketChannelContext.java:228)
    at org.elasticsearch.nio.BytesChannelContext.read(BytesChannelContext.java:40)
    at org.elasticsearch.nio.EventHandler.handleRead(EventHandler.java:139)
    at org.elasticsearch.transport.nio.TestEventHandler.handleRead(TestEventHandler.java:151)
    at org.elasticsearch.nio.NioSelector.handleRead(NioSelector.java:420)
    at org.elasticsearch.nio.NioSelector.processKey(NioSelector.java:246)
    at org.elasticsearch.nio.NioSelector.singleLoop(NioSelector.java:174)
    at org.elasticsearch.nio.NioSelector.runLoop(NioSelector.java:131)
    at java.lang.Thread.run(Thread.java:748)

Perhaps related to @original-brownbear's latest changes?

haha, everyone so fast here

This is a result of https://github.com/elastic/elasticsearch/pull/65921, which causes a failure callback to execute on a transport thread instead of the generic thread pool where it ran previously. Will open a fix PR shortly.
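
For readers unfamiliar with this class of failure, here is a minimal, self-contained sketch of the pattern, not the actual Elasticsearch code or the eventual fix. Everything in it is hypothetical: `BlockingCloseSketch`, `isTransportThread`, `blockingClose`, `closeOn`, and `simulate` are illustration-only names, and the thread-name check is only an assumed stand-in for what `Transports.assertNotTransportThread` does. It shows a blocking cleanup tripping an assertion when invoked inline from a thread named like a transport worker, and succeeding when the same cleanup is handed to a separate "generic" executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BlockingCloseSketch {

    // Assumption: a transport thread is recognised by its name, as the
    // thread name in the stack trace suggests. Illustration only.
    static boolean isTransportThread(Thread t) {
        return t.getName().contains("transport_worker");
    }

    // Hypothetical stand-in for a blocking cleanup such as closing a restore session.
    static void blockingClose() {
        Thread current = Thread.currentThread();
        if (isTransportThread(current)) {
            throw new AssertionError("Expected current thread [" + current
                + "] to not be a transport thread. Reason: [Blocking operation]");
        }
        // A real implementation would block here waiting for a remote response.
    }

    // Runs the cleanup on the supplied executor instead of the caller's thread.
    static CompletableFuture<Void> closeOn(ExecutorService generic) {
        return CompletableFuture.runAsync(BlockingCloseSketch::blockingClose, generic);
    }

    // Simulates a failure callback arriving on a transport-named thread.
    // Returns true if the cleanup completed without tripping the assertion.
    static boolean simulate(boolean fork) throws Exception {
        ExecutorService generic = Executors.newSingleThreadExecutor(
            r -> new Thread(r, "sketch[generic][T#1]"));
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        Thread transport = new Thread(() -> {
            try {
                if (fork) {
                    // Hand the blocking work to the generic pool.
                    closeOn(generic).whenComplete((v, e) -> result.complete(e == null));
                } else {
                    // Run the blocking work inline on the transport-named thread.
                    blockingClose();
                    result.complete(true);
                }
            } catch (AssertionError e) {
                result.complete(false);
            }
        }, "sketch[transport_worker][T#1]");
        transport.start();
        try {
            return result.get(10, TimeUnit.SECONDS);
        } finally {
            generic.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inline on transport thread ok? " + simulate(false));
        System.out.println("forked to generic pool ok?     " + simulate(true));
    }
}
```

In this sketch, `simulate(false)` returns `false` because the cleanup runs on the transport-named thread, while `simulate(true)` returns `true` because the cleanup runs on the separate executor's thread. Whether the real fix forks to the generic pool or makes the close non-blocking is not stated in this thread.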
