Elasticsearch: crash on lucene merge.

Created on 18 Jul 2018 · 8 Comments · Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version):
5.6.8

Plugins installed: [none]

JVM version (java -version):
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)

OS version (uname -a if on a Unix-like system):
Linux 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
One data node crashes when a force merge is executed. When I restart the node, it crashes again.
The cluster has 3 master nodes and 15 data nodes on spinning platter drives.

Steps to reproduce:

Provide logs (if relevant):
First crash log:
hs_err_pid22329.log

Crash log after the restart. This time Elasticsearch only exits with the following log output:

[2018-07-18T19:55:16,294][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [xnode-18] fatal error in thread [elasticsearch[xnode-18][generic][T#12]], exiting
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:195) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.util.bkd.BKDReader.<init>(BKDReader.java:62) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.codecs.lucene60.Lucene60PointsReader.<init>(Lucene60PointsReader.java:105) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:55:25]
at org.apache.lucene.codecs.lucene60.Lucene60PointsFormat.fieldsReader(Lucene60PointsFormat.java:108) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:55:25]
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:134) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:74) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:197) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:103) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:467) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:103) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:79) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.elasticsearch.index.engine.InternalEngine.createSearcherManager(InternalEngine.java:329) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:175) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1602) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1584) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:1027) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.index.shard.IndexShard.skipTranslogRecovery(IndexShard.java:1048) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.indices.recovery.RecoveryTarget.prepareForTranslogOperations(RecoveryTarget.java:360) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:330) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$PrepareForTranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:324) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1556) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-5.6.8.jar:5.6.8]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.8.jar:5.6.8]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]

feedback_needed

All 8 comments

I have updated this node to JDK 1.8.0_144, and it crashes too.

I also checked memory and free disk space; both are OK.

java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code

Elasticsearch and Lucene are not using unsafe memory. Maybe the JVM itself performs an illegal memory access, or maybe your memory is corrupted. Can you upgrade to the latest Java 8 (currently update 181) and test whether your memory is corrupted?

I found that the cause is a bad disk. After I commented out the bad data path, ES is OK now.

From my dmesg:

[6663179.508007] sd 0:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[6663179.508018] sd 0:0:5:0: [sdf] Sense Key : Medium Error [current] [descriptor]
[6663179.508022] sd 0:0:5:0: [sdf] Add. Sense: Unrecovered read error
[6663179.508025] sd 0:0:5:0: [sdf] CDB: Read(16) 88 00 00 00 00 02 80 02 a2 08 00 00 01 00 00 00
[6663179.508028] blk_update_request: critical medium error, dev sdf, sector 10737590840
[6663182.792073] mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
[6663182.792091] sd 0:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[6663182.792099] sd 0:0:5:0: [sdf] Sense Key : Medium Error [current] [descriptor]
[6663182.792103] sd 0:0:5:0: [sdf] Add. Sense: Unrecovered read error
[6663182.792107] sd 0:0:5:0: [sdf] CDB: Read(16) 88 00 00 00 00 02 80 02 a2 38 00 00 00 08 00 00
[6663182.792110] blk_update_request: critical medium error, dev sdf, sector 10737590840
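For anyone who wants to check their own kernel log for the same pattern, here is a minimal sketch in Python. It assumes the log lines look like the dmesg excerpt above; the function name `find_medium_errors` and the two regular expressions are illustrative, not part of Elasticsearch or any kernel tooling, and other drivers or kernel versions may word these messages differently.

```python
import re

# Assumed line shapes, taken from the dmesg excerpt in this thread:
#   sd 0:0:5:0: [sdf] Sense Key : Medium Error [current] [descriptor]
#   blk_update_request: critical medium error, dev sdf, sector 10737590840
SENSE_ERROR = re.compile(r"\[(\w+)\] Sense Key : Medium Error")
BLK_ERROR = re.compile(r"critical medium error, dev (\w+), sector (\d+)")

SAMPLE = """\
[6663179.508018] sd 0:0:5:0: [sdf] Sense Key : Medium Error [current] [descriptor]
[6663179.508028] blk_update_request: critical medium error, dev sdf, sector 10737590840
[6663182.792099] sd 0:0:5:0: [sdf] Sense Key : Medium Error [current] [descriptor]
[6663182.792110] blk_update_request: critical medium error, dev sdf, sector 10737590840
"""


def find_medium_errors(kernel_log):
    """Group medium-error lines by device name.

    Returns {device: {"events": count of Sense Key lines,
                      "sectors": distinct failing sectors in first-seen order}}.
    """
    report = {}
    for line in kernel_log.splitlines():
        sense = SENSE_ERROR.search(line)
        if sense:
            entry = report.setdefault(sense.group(1), {"events": 0, "sectors": []})
            entry["events"] += 1
        blk = BLK_ERROR.search(line)
        if blk:
            entry = report.setdefault(blk.group(1), {"events": 0, "sectors": []})
            sector = int(blk.group(2))
            if sector not in entry["sectors"]:
                entry["sectors"].append(sector)
    return report


if __name__ == "__main__":
    for device, info in find_medium_errors(SAMPLE).items():
        print(device, info["events"], info["sectors"])
```

To use it on a real machine, replace `SAMPLE` with the text printed by `dmesg` on the host. Any device it reports can then be matched against the mount points listed in the node's data paths.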

@cyberdak Thanks for getting back to us!

Is it possible to add this to the ES docs to help other people avoid this issue?

Hi @cyberdak, I'm facing the same problem when running the official ES Docker image (5.5.3) inside a Docker Compose setup. Can you tell me which steps you took to get the log and the dmesg output? I think I might run it inside my container after the failure, right?

Anyhow, I'm getting the java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code error since the first crash. Were you getting the same at first?

I have also tried to start the image by itself (without docker compose), and it seems to be working, so it's a bit confusing!

Thanks in advance!

Hi @cyberdak

I am having the same issue, too. I was wondering where you checked for the bad memory on your file and how you erased it. Thank you so much.
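In this thread the fault turned out to be a failing disk rather than memory, so one way to narrow it down is to read every file under a data path and see which reads fail. The sketch below is only an illustration under stated assumptions: `find_unreadable_files` is a hypothetical helper, the directory you pass in is whatever your own data path is, and it assumes a bad sector surfaces as an `OSError` on a plain read, which is likely but not guaranteed.

```python
import os


def find_unreadable_files(root, chunk_size=1 << 20):
    """Read every file under root in full.

    Returns a list of (path, error message) for each file whose read
    raised OSError, for example an I/O error from a failing disk.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    # Read in chunks so large segment files do not fill memory.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures
```

An empty result does not prove the disk is healthy, since recently read data may be served from the page cache instead of the platters. The kernel log remains the more direct evidence, as in the dmesg output earlier in the thread.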
