In order to install Beats as a lightweight shipper in any host environment, I would like to know whether resource throttling could be added to the Beats platform to prevent it from consuming all of the host machine's resources.
The proposal is for Beats to do this natively, without the need for any third-party tool or configuration, so that anyone can install Beats as an agent without worrying about resource consumption.
Also, the idea of this is to ask:
Something that works today is to limit the Beats to a single CPU core, via the max_procs setting. Of course, that can still be too much, so I understand the feature request.
I guess there are two ways we could go about this (just dumping my thoughts to start a discussion on this):
Notes:
@tsg I'm +1 on option number 1, but I see the disadvantages, and there are probably technical problems that your notes do not yet consider. That said, automatic cgroups configuration is a good idea, and we can take a look at Job Objects (https://msdn.microsoft.com/en-us/library/windows/desktop/ms684161(v=vs.85).aspx) as a possible solution for Windows environments.
I believe that hiding this outside the process (managing this at the OS level) is better and prevents possible issues afterwards.
We've debated this again, and I still think OS level solutions (cgroups and/or nice) are better suited for this task, because the kernel has better data on resource usage and needs and can apply the limits more effectively. This means the OS level solution is a lot less likely to introduce negative side effects.
Happy to reconsider if the OS level solutions don't prove to be applicable or to productize them if they do prove applicable.
I created a new meta-issue to track this: https://github.com/elastic/beats/issues/17716. Closing this issue for now.