We麓re having problems with VersionConflictEngineExceptions all the time. The update should happen as a script and increment a number value (see sample document below)
We麓re running a cluster of two els instances and I can only imagine that the synchronization is causing the conflict version in one node. I would expect the update not to throw this kind of exception in a cluster, as each update is atomically. We are running four application servers that execute this code and the Exceptions are thrown randomly on all instances.
stacktrace:
Caused by: org.elasticsearch.index.engine.VersionConflictEngineException: [kpi][4] [opportunity][1442415600000]: version conflict, current [5933], provided [5932]
at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:582) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:522) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:425) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:193) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:512) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.doStart(TransportShardReplicationOperationAction.java:426) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.start(TransportShardReplicationOperationAction.java:342) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction.doExecute(TransportShardReplicationOperationAction.java:97) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.index.TransportIndexAction.innerExecute(TransportIndexAction.java:134) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.index.TransportIndexAction.doExecute(TransportIndexAction.java:112) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.index.TransportIndexAction.doExecute(TransportIndexAction.java:60) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:217) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:170) [elasticsearch-1.4.4.jar:]
at org.elasticsearch.action.support.single.instance.TransportInstanceSingleOperationAction$AsyncSingleAction$1.run(TransportInstanceSingleOperationAction.java:187) [elasticsearch-1.4.4.jar:]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_20]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_20]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_20]
sample document:
{
"_index": "kpi",
"_type": "opportunity",
"_id": "1442412000000",
"_version": 14742,
"found": true,
"_source": {
"timestamp": "2015-09-16T14:00:00.249+0000",
"own": 224,
"shared": 2,
"network": 3941,
"unknown": 10575
}
}
each update script looks like this (one of the lines, only one increment per script):
ctx._source.own+=1;
ctx._source.shared+=1;
ctx._source.network+=1;
ctx._source.unknown+=1;
Can you confirm that you are not setting the retry_on_conflict
parameter? This parameter is zero by default and is designed exactly for your use case of updates where the ordering of updates (say incrementing a counter) isn't important.
If you do confirm this, this behavior is expected when you have multiple writers attempting to update the same document. You can address this issue by using the retry_on_conflict
parameter to retry when a version conflict occurs. You can read more about this issue in the documentation on partial updates including the specific section on conflicts.
I can confirm I麓m not setting the retry_on_conflict parameter. But it sounds exactly like the parameter I want to use. I deployed with the parameter and the exceptions seem to be gone.
Most helpful comment
Can you confirm that you are not setting the
retry_on_conflict
parameter? This parameter is zero by default and is designed exactly for your use case of updates where the ordering of updates (say incrementing a counter) isn't important.If you do confirm this, this behavior is expected when you have multiple writers attempting to update the same document. You can address this issue by using the
retry_on_conflict
parameter to retry when a version conflict occurs. You can read more about this issue in the documentation on partial updates including the specific section on conflicts.