The elasticsearch/index metricset (when xpack.enabled: true is set) consumes memory proportional to the number of indices in the ES cluster and the size of the cluster state, specifically the sizes of the GET _stats and GET _cluster/state responses.
This is somewhat expected, as the metricset needs data from those two API calls to create type: index_stats documents in .monitoring-es-* indices.
However, it may be possible to reduce the memory consumed by this metricset's code. Concretely, it would be worth looking into exactly which fields from the API responses are actually used (and whether the rest could be excluded), and also whether switching to a streaming JSON parser might help.
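To make the second idea concrete, here is a minimal Go sketch (not the metricset's actual code) of what a streaming approach could look like: it uses encoding/json's Decoder to walk a GET _stats-shaped body, decodes one index entry at a time into a small struct, and skips everything else, so peak memory scales with a single entry rather than the full response. The indexStats struct, the streamIndexStats/skipValue helpers, and the choice of fields are all illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// indexStats keeps only the fields this sketch pretends the metricset needs.
type indexStats struct {
	Primaries struct {
		Docs  struct{ Count int64 `json:"count"` }               `json:"docs"`
		Store struct{ SizeInBytes int64 `json:"size_in_bytes"` } `json:"store"`
	} `json:"primaries"`
}

// skipValue consumes and discards the next JSON value in the stream.
func skipValue(dec *json.Decoder) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); ok && (d == '{' || d == '[') {
		for depth := 1; depth > 0; {
			tok, err := dec.Token()
			if err != nil {
				return err
			}
			if d, ok := tok.(json.Delim); ok {
				switch d {
				case '{', '[':
					depth++
				case '}', ']':
					depth--
				}
			}
		}
	}
	return nil
}

// streamIndexStats walks a GET _stats-shaped body and calls fn once per index,
// so peak memory is roughly one decoded entry rather than the whole response.
func streamIndexStats(r io.Reader, fn func(name string, s indexStats)) error {
	dec := json.NewDecoder(r)
	if _, err := dec.Token(); err != nil { // consume the '{' of the root object
		return err
	}
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return err
		}
		key, _ := tok.(string)
		if key != "indices" { // skip everything except the top-level "indices" object
			if err := skipValue(dec); err != nil {
				return err
			}
			continue
		}
		if _, err := dec.Token(); err != nil { // consume the '{' that opens "indices"
			return err
		}
		for dec.More() {
			nameTok, err := dec.Token() // index name (object keys are always strings)
			if err != nil {
				return err
			}
			var s indexStats
			if err := dec.Decode(&s); err != nil {
				return err
			}
			fn(nameTok.(string), s)
		}
		if _, err := dec.Token(); err != nil { // consume the '}' that closes "indices"
			return err
		}
	}
	return nil
}

func main() {
	body := `{"_shards":{"total":2},"indices":{"logs-1":{"primaries":{"docs":{"count":42},"store":{"size_in_bytes":1024}}}}}`
	_ = streamIndexStats(strings.NewReader(body), func(name string, s indexStats) {
		fmt.Printf("%s: docs=%d, store=%d bytes\n", name, s.Primaries.Docs.Count, s.Primaries.Store.SizeInBytes)
	})
}
```

A similar token-level walk could be applied to the GET _cluster/state response, and it could be combined with server-side response filtering (e.g. Elasticsearch's filter_path query parameter) so the payload shrinks before it even reaches the Beat.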
Pinging @elastic/stack-monitoring (Stack monitoring)
@jsoriano kindly let me pick his brain on this over Zoom. Summarizing our conversation:
prometheus module. Thanks @jsoriano for chatting about this and validating some of my thinking! ❤️
@ycombinator We are still seeing Metricbeat get OOMKilled. After rolling out Metricbeat 7.6.2 memory consumption was lower and we could run Metricbeat with a 500Mi pod limit (before that we needed much more). Two weeks later we now have more indices and shards (between 300 and 400 shards), and Metricbeat is again being OOMKilled on the K8s node hosting the current master.
You mentioned 1.79 MB of memory usage in #16538, so still needing a 500Mi pod limit seems like a lot.
Perhaps you can give some insight into whether this number of shards is simply too high relative to your memory usage figures. As it stands, the Metricbeat DaemonSet reserves an unnecessarily large memory limit on each node.
Or, in the end, should I open a new support case?