Locust: Why is the RPS generated by Locust much lower than that of other performance testing tools?

Created on 12 May 2015 · 20 Comments · Source: locustio/locust

I load tested an HTTP interface with several performance testing tools, and I found that the RPS generated by Locust is much lower than that of the others.

ApacheBench

command: ab -n 1000 -c 80 http://testurl:8000/echo/hello

Benchmark:

Requests per second:    291.38 [#/sec] (mean)
...
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       82  125  87.3    116    2070
Processing:    83  149 250.4    116    2830
Waiting:       82  145 245.0    115    2830
Total:        170  274 278.0    232    4899

JMeter

Set Number of Threads to 80 and Loop Count to 100; the reported Throughput was 270/sec.

Locust

Set min_wait = 0 and max_wait = 0 in the script file,
and ran the locustfile with the command: locust -f api.py --no-web -c 80 -r 80 -n 10000 --only-summary

Benchmark:

 Name                                                          # reqs      # fails     Avg     Min     Max  |  Median   req/s
--------------------------------------------------------------------------------------------------------------------------------------------
 GET /echo/hello                                                10000     0(0.00%)    1020     305    3285  |    1000   70.50
--------------------------------------------------------------------------------------------------------------------------------------------
 Total                                                          10000     0(0.00%)                                      70.50

Percentage of the requests completed within given times
 Name                                                           # reqs    50%    66%    75%    80%    90%    95%    98%    99%   100%
--------------------------------------------------------------------------------------------------------------------------------------------
 GET /echo/hello                                                 10000   1000   1100   1100   1100   1200   1400   1700   2100   3285
--------------------------------------------------------------------------------------------------------------------------------------------

jacexh

👍8
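
An editorial back-of-the-envelope check, not part of the original report: with zero wait time and a fixed number of concurrent clients, Little's law predicts throughput of roughly concurrency divided by mean response time. Plugging in the figures reported above gives estimates close to what each tool printed, which suggests the gap tracks the much higher response times measured on the Locust side. The function name is made up for illustration.

```python
def expected_rps(concurrency: int, mean_response_ms: float) -> float:
    """Little's law for a closed loop with no think time: N / R."""
    return concurrency / (mean_response_ms / 1000.0)


# Figures taken from the ab and Locust outputs in the report above.
ab_estimate = expected_rps(80, 274)        # ab: mean total time of 274 ms
locust_estimate = expected_rps(80, 1020)   # Locust: average of 1020 ms

print(f"ab:     ~{ab_estimate:.0f} req/s (reported 291.38)")
print(f"locust: ~{locust_estimate:.0f} req/s (reported 70.50)")
```

If the estimate and the reported RPS agree, the next question is why the load generator sees slower responses, for example because the generator process itself is saturated.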

Most helpful comment

There was an old article from k6 that showed Locust being very slow, but they configured it with min_wait = 5000 and max_wait = 15000, I assume because this was used in an example in the Locust docs. They have an updated article now which is very comprehensive: https://k6.io/blog/comparing-best-open-source-load-testing-tools

For comparison, we have run tests up to 30000 RPS (not using FastHttpLocust):

locust master = 2 cores, 1GB of memory
locust slaves = 50 slaves at 1 core and 2GB each

Python/Locust will likely never match the raw speed of other tools like k6 or the basic tools like wrk/ab etc., but the ease of the Python language and MUCH easier horizontal scaling on k8s with the master/slave model more than make up for it IMO.

max-rocket-internet on 18 May 2020

👍3
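
As a rough editorial illustration of the setup quoted above: dividing the total by the number of slaves gives a per-core rate, which can then be used to guess how many single-core slaves a target RPS might need. This assumes throughput scales linearly with the number of slaves, which the thread does not guarantee; the helper names are hypothetical.

```python
import math


def rps_per_worker(total_rps: float, workers: int) -> float:
    """Average request rate contributed by each worker process."""
    return total_rps / workers


def workers_needed(target_rps: float, per_worker_rps: float) -> int:
    """Workers required for a target rate, assuming linear scaling."""
    return math.ceil(target_rps / per_worker_rps)


per_worker = rps_per_worker(30000, 50)   # figures from the comment above
print(f"per slave: {per_worker:.0f} req/s")
print(f"slaves for 10000 req/s: {workers_needed(10000, per_worker)}")
```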

All 20 comments

Hi,

Were you able to solve this issue? I'm facing the same issue.

maheshnayak616 on 26 Jun 2015

Same here.

EduardoMMachado on 18 May 2017

@jacexh why was this question closed?

jonathannaguin on 5 Jul 2017

@jonathannaguin http://docs.locust.io/en/latest/running-locust-distributed.html.

A locust instance is running on a single CPU core, so this test is unfair.

jacexh on 5 Jul 2017

👍1
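
A minimal editorial sketch of what using more than one core could look like: one master process plus one slave process per CPU core. It only builds and prints the command lines. The --master, --slave and --master-host flags are taken from the Locust documentation of that era (later releases renamed --slave to --worker), so treat them as an assumption to check against your installed version; the helper name and the default host are hypothetical.

```python
import os


def distributed_commands(locustfile, workers, master_host="127.0.0.1"):
    """Build argv lists for one master and `workers` slave processes."""
    master = ["locust", "-f", locustfile, "--master"]
    slaves = [
        ["locust", "-f", locustfile, "--slave", f"--master-host={master_host}"]
        for _ in range(workers)
    ]
    return [master] + slaves


# One slave per core, since each Locust process uses a single core.
cores = os.cpu_count() or 1
for argv in distributed_commands("api.py", cores):
    print(" ".join(argv))
```

Each printed line would be started as its own process, on one machine or spread over several.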

ah! that makes sense, this should be part of a FAQ section :)

jonathannaguin on 5 Jul 2017

I am running Locust in distributed mode with 8 slaves and 1 master, but I can only get up to about 1000 requests per second. I have min_wait = 0 and max_wait = 0 in my locustfile. Can someone tell me how to increase the RPS?

mickeyshaughnessy on 26 Jul 2017

👍1

Run Locust distributed on several machines.

danron on 31 May 2018

Why is this question closed? I don't understand. I'm facing the same issue in 2019.

berkaykirmizioglu on 25 Feb 2019

Because the OP closed it.
Anyway, this is the tracker for bugs/issues, not general questions.

cgoldberg on 25 Feb 2019

Why do so many people say that, because Locust runs in one thread, it is reasonable that it can only reach about 300 RPS, and that if you want more you should run more slaves?

Usually there is no CPU-intensive work in a Locust task; it just makes HTTP requests and sends them out. That is I/O-bound, which is what gevent/greenlets should handle very efficiently.

Why just 300 RPS?

I remember we developed our own performance testing tool in Python about 10 years ago, and it could reach about 2000 RPS easily. We didn't use gevent; instead, we used a reactor event loop.

Anyway, I think it's a bit of a shame for Locust, as a load testing tool, to get just 300 RPS per core.

jcyrss on 17 Jun 2019

👍3

I remember we developed our own performance testing tool in Python about 10 years ago, and it could reach about 2000 RPS easily.

You should probably go back to using that tool if Locust does not meet your needs. You are also welcome to improve Locust and submit your changes in a PR.

cgoldberg on 17 Jun 2019

Found way to generate fixed RPS number in Locust:
https://github.com/locustio/locust/issues/646#issuecomment-507694244

savvagen on 2 Jul 2019

You should probably go back to using that tool if Locust does not meet your needs. You are also welcome to improve Locust and submit your changes in a PR.

As someone pointed out in another PR, the root cause is that Locust uses Requests to send and receive HTTP, and Requests is not efficient. I've also noticed you guys are already making your own underlying HTTP handler. That's great.

jcyrss on 21 Aug 2019

This issue should be opened up.

Sure the title is a "general question", but it could say "Improve RPS to be as fast as other testing tools". You can say "too much work", but it's a valid issue. It's particularly problematic because then you go to your team and say "well our app only does ~300 r/s, we need to spend time optimizing", even though the app can actually do 10k r/s.

As I see it and others have pointed out, there are two ways to improve:

use multiple threads automatically,
use the existing thread more effectively, similar to how nginx does it or how @jcyrss mentioned.

rokcarl on 2 Apr 2020

👍1

"well our app only does ~300 r/s, we need to spend time optimizing", even though the app can actually do 10k r/s.

First of all, even a fairly low-end machine running Locust should be able to generate far more than 300 RPS (using FastHttpLocust). The K6 people got Locust to max out at 2900 RPS running on "a small, fanless, 4-core Celeron server" in their benchmark.

However, I'd recommend that you monitor the CPU usage of your load testing tool when you run your load tests (to make sure that it isn't your load generator that is the bottleneck). That is why we've added support for seeing slave/worker CPU usage under the Slaves/Workers tab. ("Workers" in upcoming versions).

use multiple threads automatically,

That wouldn't help because of the Python GIL. Use multiple slave/worker processes to make use of multiple cores.

use the existing thread more effectively, similar to how nginx does it or how @jcyrss mentioned.

Yes, that's why we've added FastHttpLocust (check out the docs for more info). It uses another HTTP client (actually extracted from nginx, and implemented in C) that is ~6x faster (IIRC).

heyman on 2 Apr 2020
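
An editorial sketch of the CPU check suggested above, using only the standard library and not Locust's own worker CPU display: compare the CPU time a piece of work consumes with the wall-clock time it takes. A ratio near 1.0 means the process is CPU-bound, in which case the load generator rather than the target may be the bottleneck. The function names and the two stand-in workloads are hypothetical.

```python
import time


def cpu_utilisation(work, *args):
    """Run work(*args); return CPU time used divided by wall-clock time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def busy(n):
    """CPU-bound stand-in, e.g. heavy per-request processing in Python."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def idle(seconds):
    """I/O-bound stand-in: the process mostly waits, as on a network reply."""
    time.sleep(seconds)


print(f"CPU-bound work: {cpu_utilisation(busy, 2_000_000):.2f}")
print(f"I/O-bound work: {cpu_utilisation(idle, 0.2):.2f}")
```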

The K6 people got Locust to max out at 2900 RPS running on "a small, fanless, 4-core Celeron server" in their benchmark

Is this benchmark?

After I wrote another load testing tool, ultron, I did not want to talk about this issue any more.

jacexh on 3 Apr 2020

Is this benchmark?

Yes. The reason I mentioned it was to debunk the 300 RPS number that was mentioned by multiple people in this thread.

Other than that, the benchmark is really a bit like comparing apples to oranges (though I guess it's useful if what you care about is maximizing the RPS on a single URL endpoint), since Locust is a framework for simulating user behaviour in Python code, while many of the tools compared were created just to max out RPS. Also, if the test had been done with a couple of machines, Locust would actually be ahead of most of them, since most of them can't be run distributed.

heyman on 5 Apr 2020

Cool, I wasn't aware of the fast HTTP client implementation in locust when I wrote this because I was a 0.11.0 version user. Will see how this plays out. BTW, I've been using the multi-machine setup for some time now, even if it's a bit cumbersome to set up.

rokcarl on 6 Apr 2020

👍1

Hi @heyman

While running a load of 200 RPS, I am seeing some drops in the RPS. Can you please let me know how we can resolve this in Locust?
Please find the attached reference:

result