Locust: Multi thread master node

Created on 31 Jan 2020 · 11 Comments · Source: locustio/locust

Is your feature request related to a problem? Please describe.


The load I am putting on my master node is destroying it despite it being on a powerful box. It looks like it's only utilizing one core at the moment, so it's hard for me to scale that. If it were multithreaded, I could throw more CPUs at it...

Describe the solution you'd like


For the master node to be multithreaded.

Describe alternatives you've considered


I need this enough that I'm happy to do it myself if someone will just point me at the place in the code where I should start looking.

Additional context


Context:
I'm writing a load test for a content provider for a major travel company. My target is to be able to run 1.5mil requests per minute (25k rps).

At the moment I have 4 Amazon t2 ECSs (4 processors, 16 GB of RAM) running 4 slaves on each box. I think I can probably get to 25k rps by just adding more boxes.

The issue I'm getting is that the master node is freezing up when I start to get past 4k rps. It's pegging its CPU pretty hard, so I keep hoping for multithreading....

Labels: feature request, wontfix

All 11 comments

As I said, I'm happy to fork the project and figure this out if there's no real solution built into the code. I'd really like this to work for us....

I just ran into this issue as well. The master is only using a single CPU from what I can tell... once that gets pegged, you're toast.

Yes, Locust only utilises a single CPU core per node (master as well as slaves). However it does sound weird that the master would become a bottleneck at only 16 slave nodes. I've run load tests with >100 slave nodes without the master node maxing out on CPU.

It should be possible to parallelize some of the CPU work that is done in the master node (mainly aggregating stats data from the slave nodes), but I'd start with looking into why it's maxing out in the first place.

Are you running any slave nodes on the machine where you're running the master (I would advise against that)?

How many different named endpoints do you have in the test (e.g. endpoints that haven't been grouped using the HTTP client's name parameter)? Having a lot of different endpoints that haven't been grouped using the name parameters could probably increase CPU usage.

On a side note, I think you might need more than four 4-core machines to achieve 25k reqs/s, though it obviously depends a lot on the test scripts, and which HTTP client you're using. Though you should be able to see if that's the case by monitoring the CPU usage of the slave nodes.

Running 32 slaves. Getting up to 12k requests per second with minimal wait time. The master is on a dedicated 8-core machine. There are 8 slave machines with 4 cores each. Testing APIs, not UI.
I am not using the name parameter... so I have TONS of different URLs (REST based, so 1000ish ids are randomly used). Maybe this is my issue. I was just using the self.client model from the examples. Thanks, I will try making this change... it would be better when looking at the details as well.

I'm not familiar with what you mean by the name parameter, so I'm guessing that's what my problem is ;).

Right now we are hitting 2.3 mil different possible endpoints. I'm using python string interpolation to generate each endpoint just before making the call by randomizing a couple values in the request. If there's a better way I definitely want to know about it :). I'll go look into this....

And yes, I know we'll need to make more machines to get to 25krps, I just wasn't going to take the time until I have it working on 4krps.

Just read up on the name parameter, boy am I embarrassed -_-;;. I'll need to test it in the morning, but it looks extremely promising. I should be able to group _all_ of our requests under one name parameter. I assume that will increase performance dramatically.
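The effect of grouping can be sketched without running Locust at all. The snippet below is a rough illustration, not Locust's own code: `FakeClient`, `run`, and the `/items/...` endpoint are hypothetical, and the assumption is that one stats entry is tracked per (method, name) pair, with the name falling back to the URL when none is given. In a real test the equivalent call would be along the lines of `self.client.get(url, name="/items/[id]")`.

```python
import random
from collections import defaultdict


class FakeClient:
    """Hypothetical stand-in for an HTTP client that tracks request stats.

    Assumption: one stats entry is kept per (method, name) pair, and the
    URL is used as the name when no explicit name is passed.
    """

    def __init__(self):
        self.stats = defaultdict(int)

    def get(self, url, name=None):
        # No real request is sent; only the stats bookkeeping is simulated.
        self.stats[("GET", name or url)] += 1


def run(client, requests, grouped):
    """Issue `requests` simulated GETs and return the number of stats entries."""
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for _ in range(requests):
        item_id = rng.randint(1, 1000)
        url = f"/items/{item_id}"  # hypothetical endpoint built by interpolation
        if grouped:
            client.get(url, name="/items/[id]")  # all ids share one entry
        else:
            client.get(url)  # every distinct URL becomes its own entry
    return len(client.stats)


ungrouped_entries = run(FakeClient(), 10_000, grouped=False)
grouped_entries = run(FakeClient(), 10_000, grouped=True)
print("entries without name:", ungrouped_entries)
print("entries with name:", grouped_entries)
```

With millions of possible URLs, the ungrouped case would keep growing the set of entries that every slave has to report and the master has to aggregate, which is consistent with the CPU usage described in this thread.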

I made the update, I think this prevented the CPU from getting pegged now. Thanks @heyman!

> However it does sound weird that the master would become a bottleneck at only 16 slave nodes

Agreed.

> I'm writing a load test for a content provider for a major travel company. My target is to be able to run 1.5mil requests per minute (25k rps)

FYI we've reached loads higher than this with locust without any issues, but we scale out to 50-200 slaves. The master performed fine with less than 4 cores.

We just ran a largish load test this morning, here's the specs of master and slaves if anyone is interested for comparison.

Load test peak RPM: 1.8M

locust master: 2 cores, 1GB of memory
locust slaves: 50 slaves at 1 core and 2GB each

Master CPU was at about 40% and 70% memory.
Slaves were at 100% CPU and 44% memory

Running on k8s.
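For comparison with the 25k rps target from the original post, the figures above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes throughput scales linearly with slave count and that a different test script would reach a similar per-slave rate, neither of which is guaranteed. The helper names are made up for illustration.

```python
import math


def rps(requests_per_minute):
    """Convert requests per minute to requests per second."""
    return requests_per_minute / 60


def slaves_needed(target_rps, per_slave_rps):
    """Round up to the number of slaves needed at a given per-slave rate."""
    return math.ceil(target_rps / per_slave_rps)


reported_rps = rps(1_800_000)        # peak load reported above
per_slave_rps = reported_rps / 50    # 50 single-core slaves at 100% CPU
target_rps = rps(1_500_000)          # target from the original post
estimate = slaves_needed(target_rps, per_slave_rps)

print(f"reported: {reported_rps:.0f} rps, about {per_slave_rps:.0f} rps per slave")
print(f"estimated single-core slaves for {target_rps:.0f} rps: {estimate}")
```

Monitoring slave CPU during a smaller run, as suggested earlier in the thread, is the way to check whether a given script actually reaches a comparable per-slave rate.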

Okay, running muuuuch better now. Not sure if we should close this request or not though? Multi threading still feels like a good idea....eventually....maybe?

Great!

I'm closing it as a won't fix for now. I wouldn't mind improving the parallelism of the master at some point if it can be done without adding too much complexity. But so far I have yet to see a case where CPU on the master was the bottleneck.

If it were to be implemented, I'm guessing that a good way to do it would be to make this line:

https://github.com/locustio/locust/blob/046eeb26c37c97a0528fe1526f1c52d2ebf73a23/locust/stats.py#L672

run in a multiprocessing pool (using python threads wouldn't help because of the GIL). But I have little experience with python multiprocessing, especially together with gevent.
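As a very rough sketch of that idea, the snippet below merges per-slave request counts in a `multiprocessing.Pool`. It is not Locust code and does not reproduce the linked line: the report format (a dict of endpoint name to request count) and the function names are assumptions for illustration, and the interaction with gevent is not addressed at all.

```python
from collections import Counter
from multiprocessing import Pool


def merge_reports(reports):
    """Sum per-endpoint request counts from a list of reports (dicts)."""
    total = Counter()
    for report in reports:
        total.update(report)  # Counter.update adds counts key by key
    return total


def chunk(items, n):
    """Split items into at most n contiguous chunks of roughly equal size."""
    size = max(1, -(-len(items) // n))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def merge_reports_parallel(reports, processes=2):
    """Merge chunks of reports in worker processes, then combine the partials.

    Separate processes are used because threads would still contend for the GIL.
    """
    if not reports:
        return Counter()
    with Pool(processes) as pool:
        partials = pool.map(merge_reports, chunk(reports, processes))
    return merge_reports(partials)
```

`merge_reports_parallel` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Every report is pickled and sent to a worker, so for small reports the serial `merge_reports` may well be faster; whether a pool helps would need to be measured.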
