Beats: Reevaluate Elasticsearch default bulk_max_size

Created on 10 Apr 2018 · 5 comments · Source: elastic/beats

The default bulk_max_size for the Elasticsearch output is only 50 events. The value was chosen (for Beats 1.0.0) to be very conservative, to accommodate very small Elasticsearch 2.x clusters/nodes.

Let's check if the default value can be increased for Elasticsearch 6.x:

  • What is a good value for a small one-node cluster?
  • What is a good value for the default Elastic Cloud cluster?
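
For orientation, a minimal sketch of where this setting lives in a Beats configuration; the host and the raised value are placeholders for illustration, not recommendations:

```yaml
output.elasticsearch:
  hosts: ["localhost:9200"]   # placeholder host
  # Default is a conservative 50 events per bulk request (chosen for Beats 1.0.0).
  # A larger value might suit bigger 6.x nodes, but any number would need benchmarking.
  bulk_max_size: 200          # illustrative value only
```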
Labels: Stalled, enhancement, libbeat, needs_team

All 5 comments

I think we could also use a max_bytes_size instead of bulk_max_size; we could use the same logic to split events for the Logstash or the ES output at the pipeline level.
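
A sketch of how such an option might look in the config; max_bytes_size is hypothetical here (it is only being proposed in this comment, not an existing Beats setting), and all values are illustrative:

```yaml
output.elasticsearch:
  hosts: ["localhost:9200"]   # placeholder host
  # Hypothetical option (proposed above, not implemented): cap the encoded
  # payload per bulk request instead of, or in addition to, the event count.
  max_bytes_size: 10485760    # ~10 MiB, illustrative value only
  bulk_max_size: 1024         # an event-count cap could still apply as an upper bound
```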

Or both. I think the ES pipeline is indeed measured in events/documents, not bytes. If bulk_max_size is too big, single events might get a 429 (in bursts), no matter the event size.

Adding a note here from @jsvd in case we ever plan to split based on the size of the batch.

The ES output checks the bulk size after compression, but Elasticsearch checks the size after decompression, so you can send a small bulk request that still returns a 413 due to heavy compression.
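
For context, a sketch of the Elasticsearch-side setting presumably behind that 413, which is enforced against the decompressed body; 100mb is the documented default, assuming it hasn't changed:

```yaml
# elasticsearch.yml
# Requests whose *decompressed* body exceeds this limit are rejected with
# HTTP 413, even if the compressed request sent by the Beats output was small.
http.max_content_length: 100mb
```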

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue doesn't have a Team:<team> label.
