Elasticsearch: Fail put-mapping requests sooner if they will exceed the field number limit

Created on 14 Nov 2018 · 7 Comments · Source: elastic/elasticsearch

Elasticsearch version: 6.4.1

We had a scenario where an index had already hit the field number limit (index.mapping.total_fields.limit) and subsequent (high-volume) indexing requests attempted to add new fields to this index. As a result, a lot of put_mapping tasks were generated. This caused cluster states to be held in memory and remain non-GC-able until these mapping updates were eventually rejected (and the coordinating node ran out of memory).

This is an enhancement request to handle this situation better. Is this something the real memory circuit breaker in 7.0 will help with?

:DistributeCRUD

All 7 comments

Pinging @elastic/es-core-infra

This is an enhancement request to handle this situation better. Is this something the real memory circuit breaker in 7.0 will help with?

I think we should try to address the root cause if possible. It'd be nice if we could check the limit for mappings prior to a put-mapping request being sent to the master. For instance, if the local node's cluster state contains over 1000 fields in the mapping (with the default limit being 1000), we know that, even if the local cluster state is behind, the number of fields cannot decrease, so there is no need to send an update-mapping request to the master node. The request can be rejected without overloading any other node.
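
To make the idea concrete, here is a minimal sketch of such a local pre-check, written in Python for brevity rather than as Elasticsearch code. The function names, the dict-shaped mapping, and the way fields are counted (every entry under "properties" and "fields", recursively) are assumptions for illustration; the real counting rules in Elasticsearch may differ.

```python
# Default value of index.mapping.total_fields.limit, as stated in the comment above.
DEFAULT_TOTAL_FIELDS_LIMIT = 1000


def count_fields(properties):
    """Count mapped fields recursively (simplified, assumed counting rule).

    Each entry counts once; nested object fields ("properties") and
    multi-fields ("fields") are counted as well.
    """
    total = 0
    for field in properties.values():
        total += 1
        total += count_fields(field.get("properties", {}))
        total += count_fields(field.get("fields", {}))
    return total


def can_forward_put_mapping(local_mapping, new_fields, limit=DEFAULT_TOTAL_FIELDS_LIMIT):
    """Hypothetical check run on the local node before contacting the master.

    The local cluster state may be stale, but fields are never removed from a
    mapping, so the local count is a lower bound on the master's count. If the
    lower bound plus the new fields already exceeds the limit, the request
    cannot succeed and can be rejected locally.
    """
    current = count_fields(local_mapping.get("properties", {}))
    return current + new_fields <= limit
```

A request for which can_forward_put_mapping returns False would be rejected on the spot. A True result guarantees nothing, since the master may already hold more fields than the local state shows, so the master-side check is still needed.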

Hi @dakrone
Based on my understanding, the TransportPutMappingAction is handled by the master, which only checks for blocks and then goes ahead and submits a cluster state update task. Do you think it makes sense to reject the request on the master, before submitting the cluster state update task, just as we check for blocks? I believe that, since update tasks are serialized on the master and put-mapping has priority HIGH, processing by the PutMappingExecutor gets significantly delayed (especially when there are pending tasks with priority URGENT), allowing the heap to build up. This would help even in cases where the local cluster state is lagging and unaware of the field limit breach.
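
A toy model of this proposal, again in Python and not based on the actual Elasticsearch classes: MasterTaskQueue, its methods, and the error message are hypothetical. It only illustrates that a request rejected before a task is created never sits in the queue behind URGENT work.

```python
import heapq
import itertools

# Lower number is served first, mirroring URGENT being processed before HIGH.
URGENT, HIGH = 1, 2


class MasterTaskQueue:
    """Simplified stand-in for the master's serialized cluster state update queue."""

    def __init__(self, field_count, limit=1000):
        self.field_count = field_count
        self.limit = limit
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority

    def submit_put_mapping(self, new_fields):
        # Hypothetical early check, performed next to the block check and
        # before any cluster state update task is created.
        if self.field_count + new_fields > self.limit:
            raise ValueError("total fields limit [%d] would be exceeded" % self.limit)
        heapq.heappush(self._heap, (HIGH, next(self._seq), ("put-mapping", new_fields)))

    def submit_urgent(self, name):
        heapq.heappush(self._heap, (URGENT, next(self._seq), (name, 0)))

    def pending(self):
        return len(self._heap)

    def run_next(self):
        """Run the highest-priority task; re-check the limit at execution time."""
        _, _, (name, new_fields) = heapq.heappop(self._heap)
        if self.field_count + new_fields > self.limit:
            return name + " rejected"
        self.field_count += new_fields
        return name
```

The check in run_next stays in place because tasks queued earlier may add fields between submission and execution. The early check only removes the requests that are already known to fail.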

Hey @dakrone, I'd be more than happy to work on a PR for this. Please share your thoughts.

@ppf2 @dakrone any thoughts on this?

I think that the coordinating node no longer runs out of memory due to failed put-mapping calls in versions ≥7.0, so I have updated the title of this issue to reflect the remaining work mentioned in this comment.

Pinging @elastic/es-distributed
