Hey everyone,
I have an index called metadata, with 5 shards and 1 replica. After restarting ES, the index cannot be fully recovered. All primary shards are recovered properly, but two of the replica shards remained unassigned.
ES version: 1.3.2
When I executed this command:
curl -XGET "http://localhost:9200/_cat/shards"
metadata 2 p STARTED 7712779 4.2gb ip-1 Motormouth
metadata 2 r STARTED 7712779 4.2gb ip-2 Harold "Happy" Hogan
metadata 0 p STARTED 7714351 4.1gb ip-2 Harold "Happy" Hogan
metadata 0 r UNASSIGNED
metadata 3 p STARTED 7711363 4.6gb ip-1 Motormouth
metadata 3 r STARTED 7711363 4.6gb ip-2 Harold "Happy" Hogan
metadata 1 p STARTED 7712560 4.2gb ip-2 Harold "Happy" Hogan
metadata 1 r UNASSIGNED
metadata 4 p STARTED 7714620 2.7gb ip-1 Motormouth
metadata 4 r STARTED 7714620 2.7gb ip-2 Harold "Happy" Hogan
[2014-11-29 15:01:58,383][WARN ][index.engine.internal ] [Motormouth] [metadata][0] failed engine [corrupted preexisting index]
[2014-11-29 15:01:58,384][WARN ][indices.cluster ] [Motormouth] [metadata][0] failed to start shard
org.apache.lucene.index.CorruptIndexException: [metadata][0] Corrupted index [corrupted_3gXTXI3KQtm2e1WPsFntkg] caused by: CorruptIndexException[codec footer mismatch: actual footer=1308690703 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/var/lib/elasticsearch/elasticsearch/nodes/0/indices/metadata/0/index/_8x9g.fdt"))]
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:343)
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:328)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:727)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:580)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:184)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:444)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2014-11-29 15:01:58,437][WARN ][index.engine.internal ] [Motormouth] [metadata][1] failed engine [corrupted preexisting index]
[2014-11-29 15:01:58,437][WARN ][indices.cluster ] [Motormouth] [metadata][1] failed to start shard
org.apache.lucene.index.CorruptIndexException: [metadata][1] Corrupted index [corrupted_P-smhoB-SEeM7kHiTsIEug] caused by: CorruptIndexException[codec footer mismatch: actual footer=-262453147 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/var/lib/elasticsearch/elasticsearch/nodes/0/indices/metadata/1/index/_avoi_es090_0.doc"))]
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:343)
at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:328)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:727)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:580)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:184)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:444)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Hope someone can help.
Hey, first of all: nothing got lost, so that's good. When we detect a corruption we mark the shard as corrupted; there should be a corrupted_????? file on disk that prevents your cluster from allocating that shard copy on that node. We do that to let you back up the corrupted data etc.; if we can allocate the shard somewhere else, we remove it. In your case you can remove the shard from the node in question and then ES will recover it from the primary. Still, I'd want to know what happened that made the shard go corrupt. Did you upgrade lately? If so, from what version?
You can rm the files on Motormouth for shards 0 & 1 of the metadata index and then run curl -XPOST 'localhost:9200/_cluster/reroute'
This should kick off the recovery. If you are unsure, you can post the commands here and I will have a look first.
Actually, the last thing I did was run a query that should return 20M documents. I wasn't expecting that to bring ES down. After that I restarted ES and it started giving those exceptions for these shards.
So you are suggesting deleting all data under the folders for shards 0 and 1, right?
Btw, does ES try to merge two shards when recovering?
Actually, I was thinking of reducing number_of_replicas to 0 and then increasing it back to 1. But I am not sure whether that would cause any data loss, since the primary shards are on different nodes?
Your other replicas are just fine; I don't think you need to do that. But you certainly can.
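If you did go the replica route, the settings calls would look roughly like this (a sketch only; the index name and host are taken from this thread and may need adjusting):

```sh
# Drop replicas for the metadata index, then add them back.
# The primaries are untouched, so no data should be lost,
# but check cluster health between the two calls.
curl -XPUT 'localhost:9200/metadata/_settings' -d '{
  "index": { "number_of_replicas": 0 }
}'
curl -XPUT 'localhost:9200/metadata/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'
```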
Which one do you suggest: reducing the number of replicas, or removing the shard files and letting the shards recover?
I'd go and do a mv /your/path/to/data/indices/metadata/0 /your/path/to/data/indices/metadata/backup_0
then run reroute and wait until the shard is active. Remove the backup, then continue with the second shard.
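Put together, that would look something like the following (a sketch; the data path is taken from the stack traces above and may differ on your machine):

```sh
# On Motormouth, move the corrupted copy of shard 0 out of the way.
mv /var/lib/elasticsearch/elasticsearch/nodes/0/indices/metadata/0 \
   /var/lib/elasticsearch/elasticsearch/nodes/0/indices/metadata/backup_0

# Ask the master to retry allocation.
curl -XPOST 'localhost:9200/_cluster/reroute'

# Wait until shard 0 shows up as STARTED again before touching shard 1.
curl -s 'localhost:9200/_cat/shards' | grep '^metadata'

# Once the shard is active again, remove the backup and repeat for shard 1.
rm -rf /var/lib/elasticsearch/elasticsearch/nodes/0/indices/metadata/backup_0
```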
ok I will try thanks
Hi @mehmetgunturkun
Actually, last thing I did, querying something which should return 20M documents
Do you mean you requested 20M documents in one search response? e.g. `{ "from": 0, "size": 20000000 }`?
If so, that could have caused an OOM exception. Please can you look in the logs on each node to see if you had an OOM?
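A quick way to check is something like this (a sketch; the default package log location is an assumption, adjust the path if your logs live elsewhere):

```sh
# Look for OutOfMemoryError entries in each node's Elasticsearch logs.
grep -i 'OutOfMemoryError' /var/log/elasticsearch/*.log*
```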
Yeah, it gave an Out of Memory exception; but I still don't understand why there is a footer mismatch?
Basically, if you get an OOM exception, all bets are off. At that stage the JVM is in an undefined state. That said, it shouldn't write a commit point that includes a file which hasn't been written correctly.
Was this index originally created with an older version of Elasticsearch? If so, which version? It would be helpful if you could upload your logs somewhere.
Actually, I was almost sure this index was created on 1.3.2, but I saw a file, "_avoi_es090_0.doc". Does that indicate version 0.90?
sample of log file is in the following link:
https://dl.dropboxusercontent.com/u/69632603/New%20folder/elasticsearch.log
does it indicate version, 0.9?
No, that is confusing: it only means we added this codec in 0.90. It's just a naming thing.
https://dl.dropboxusercontent.com/u/69632603/New%20folder/elasticsearch.log
You know what, this looks like a half-written index from a recovery. I think what happened here is that you hit the OOM while one of the shards was recovering; since the recovery didn't finish, the shard was left in a half-baked state and got marked corrupted. We fixed this in 1.4.0: files are renamed only after they have all been written, and the commit point is renamed last. I think that is what happened, and it explains the missing file as well as the truncated one.
Yeah, that is most probably what happened, because my application was still sending documents to the index during the recovery. Thanks a lot, guys.
@mehmetgunturkun I assume this got resolved... I am closing it; please reopen if you object.
@clintongormley First, thanks for the clear (and easy to find) help here. Today we ran into an issue with a corrupted shard as well. As far as we know though, the problems started when our master node and other (not master) node started having communication issues. We are still investigating what happened, but I was wondering if you'd be interested in our logs?
Hi @Bertg
We may well be. I'd open a new issue mentioning the version that you're using, plus all of the details including the logs. Note: if you're using an older version, there's a good chance that we've already fixed the bug, so you may want to trawl through the issues list first, to see if you find something that could explain the problem.
@clintongormley Actually, investigating it more, I think we figured out what happened. A very complex query got generated, overloaded the master, and the slaves somehow "got confused". It does seem that later versions might fix the issue we had. We'll do the update and try to re-run the offending query. If it happens again we'll open a ticket.
Had this issue with 1.4.4.
The fix mentioned in this issue did work: renaming the .../0 directory to .../0.backup. The 'unassigned' replica became primary, the data was accessible again, and a new replica was created. Case closed.
Hi,
A similar error happened to me on ES 1.3.4. It also looks like a broken recovery for one shard (among 5 for the same index). A lot of people suggest it can be resolved by renaming the shard directory: the 'unassigned' replica would become primary, so the data would be accessible again and a new replica created. Can someone confirm that?
Someone in #10066 suggested that this error would not happen in version 1.5.0. Does someone agree with that?
This happened to me with 1.4.5. It could be related to the upgrade from 1.4.4 to 1.4.5 - I am sorry to say I am unsure about that. I only had 1 corrupt shard, though - I'd guess I should have had more if it were related to the upgrade...?
In either case: _rm/mv_ + a call to _reroute_ worked like a charm!