Locust: Connection pool is full, discarding connection | 'Connection aborted.', RemoteDisconnected('Remote end closed connection without response

Created on 20 Feb 2020 · 20 Comments · Source: locustio/locust

I am new to the Locust load testing framework and am in the process of migrating my existing Azure cloud-based performance testing C# scripts to Locust's Python-based scripts. Our team has almost completed the migration. During our load tests, however, we get the errors below, and the machine fails to create new requests, either because of high CPU utilization or because of the large number of exceptions in Locust. We are running Locust in web-based mode; details are given below. The scripts work fine at smaller loads of 50 to 100 users. The issue appears only at higher loads, from about 500 up to 3500 users.

"Error 1 -('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))"

"Error 2 : Connection pool is full, discarding connection"

Environment

Our load testing configuration is: "3500 users at a hatch rate of 5 users per second". Running natively (no Docker container) on an 8-core, 16 GB Ubuntu Linux virtual machine on Azure. ulimit is set to 50,000 on the Linux machine.

Please help us with your thoughts

sample code

import os
import sys
sys.path.append(os.environ.get('WORKDIR', os.getcwd()))

from locust import HttpLocust, TaskSet, task
from locust.wait_time import between

# AppUtil is the poster's own helper class (posted further down in the thread);
# its import was not included in the sample.

class ContactUsBehavior(TaskSet):

    wait_time = AppUtil.get_wait_time_function(2)

    @task(1)
    def post_load_test_contact(self):
        data = {
            "ContactName": "Mane",
            "Email": "[email protected]",
            "EmailVerifaction": "[email protected]",
            "TelephoneContact": "",
            "PhoneNumber": "",
            "ContactReason": "Other",
            "OtherComment": "TEST Comments 2019-12-30",
            "Agree": "true",
        }
        # Form fields go in data=; name= sets the label shown in the Locust statistics
        self.client.post("app/contactform", data=data, name="Contact us submission")

class UnauthenticatedUser(HttpLocust):
    task_set = ContactUsBehavior
    # host is override-able
    host = 'https://app.devurl.com/'

  • OS: Linux Ubuntu (actually a VM on Azure) - Linux 5.0.0-1032-azure x86_64
  • Python version: Python 3.6.9
  • Locust version: locust 0.14.4
  • Locust command line that you ran: ~/LocustPythonScripts$ locust -f LocustTestCases.py -H https://appURL/ GeneralUser
  • Locust file contents (anonymized if necessary): see above sample file
bug

All 20 comments

Hi!

What is AppUtil?

What kind of throughput are you getting? You'll need to run multiple load gen processes for really high throughput (Python is limited to one core). See the documentation about distributed runs.

Are there warnings in the log about high cpu usage?

It's just a utility class which gets the wait time from the global configuration.

`class AppUtil:

    @classmethod
    def get_wait_time_function(cls, taskset_avg_time):
        """
        Return a wait-time function. It yields the global wait time
        configuration (int) if one exists, or otherwise a random value
        within one second of taskset_avg_time.

        See https://github.com/locustio/locust/blob/master/locust/wait_time.py
        on why method is defined this way.
        """

        value = Configuration.get_wait_time()
        if value is not None:
            return lambda instance: value
        else:
            return lambda instance: (taskset_avg_time - 1) + (random.random() * 2)`
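
The helper above depends on a `Configuration` class and on `random`, neither of which appears in the thread. As a minimal sketch of how it behaves, assuming a hypothetical stand-in `Configuration` with a single `wait_time` attribute (the poster's real class may differ), the returned function can be exercised on its own:

```python
import random


class Configuration:
    """Hypothetical stand-in for the poster's global configuration (an assumption)."""
    wait_time = None  # None means "no global override"

    @classmethod
    def get_wait_time(cls):
        return cls.wait_time


class AppUtil:
    @classmethod
    def get_wait_time_function(cls, taskset_avg_time):
        # Locust calls wait_time with the Locust/TaskSet instance as its only argument,
        # so the helper returns a one-argument function rather than a number.
        value = Configuration.get_wait_time()
        if value is not None:
            return lambda instance: value
        return lambda instance: (taskset_avg_time - 1) + (random.random() * 2)


# No global override: waits are spread over [avg - 1, avg + 1) seconds.
wait = AppUtil.get_wait_time_function(2)
samples = [wait(None) for _ in range(1000)]
print(min(samples), max(samples))

# With a global override every call returns the configured value.
Configuration.wait_time = 5
fixed_wait = AppUtil.get_wait_time_function(2)
print(fixed_wait(None))
```

With an average of 2 this gives waits between 1 and 3 seconds, which appears to match what `between(1, 3)` from `locust.wait_time` would provide.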

Did you mean running Locust in a master-slave configuration? https://docs.locust.io/en/stable/running-locust-distributed.html

Ok!

Yes :)

.. but distributed shouldn't be necessary for such a simple test plan unless you're running at least some hundreds of requests/s. You can also try FastHttpLocust https://docs.locust.io/en/stable/increase-performance.html

I will try it soon and let you know within a day or two. Thanks for the directions.

@cyberw - we checked both FastHttpLocust and geventhttpclient, but neither has an attribute on the response object that gives the URL of the response. With the default Locust HTTP client we check the URL of the HTTP response to verify where the user lands after certain actions. Do you have any thoughts?

If FastHttpLocust doesn't fit your needs then it is probably best to stay on HttpLocust. It sounds like you have plenty of hardware, as long as you run distributed.

What kind of throughput are you at when you get the problems? Are there any cpu usage warnings in your log?

I had a look at the requests framework, and apparently "Connection pool is full..." is not so bad: https://stackoverflow.com/questions/53765366/urllib3-connectionpool-connection-pool-is-full-discarding-connection

The second error message seems to be the server side dropping the connection (maybe the server does not allow 3500 concurrent connections?) https://stackoverflow.com/questions/48105448/python-http-server-client-remote-end-closed-connection-without-response-error

@cyberw I too have been getting Error 1 ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
My concern is that the error is raised by Locust (the client) rather than produced by the server, so I believe it should be ignored and not counted as a failure. When it is shown as a failure in the output report, the reader assumes the server threw the error.

Additionally, I believe the RPS will not be accurate if these are included in the failure count.
RPS = total requests processed / (endTime - startTime)
Counting these as failures implies that the request was processed and the server rejected it, when in fact it never reached the server.

Your thoughts?

cc: @abinjaik

Hmm... I'm not sure what you mean @Naren-Hub , the "Remote end closed connection without response" is a server side error (or at least not locust/client side, it could also be network), and counting it as an error is the only reasonable behaviour.

The error is of course detected/raised on the Locust/client side, but Locust didn't cause the error.
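
For reference, the throughput formula quoted earlier in this exchange works out as follows. This is only a minimal sketch: the request counts and timestamps are made-up numbers, and `requests_per_second` is a hypothetical helper, not part of Locust.

```python
def requests_per_second(total_requests, start_time, end_time):
    """Average throughput over a run; times are in seconds."""
    elapsed = end_time - start_time  # end minus start, so the result is positive
    if elapsed <= 0:
        raise ValueError("end_time must be later than start_time")
    return total_requests / elapsed


# Hypothetical run: 12000 requests sent in 60 seconds, 300 of them failed.
total_requests = 12000
failed_requests = 300

rps = requests_per_second(total_requests, start_time=0.0, end_time=60.0)
failure_ratio = failed_requests / total_requests

print(f"RPS: {rps:.1f}, failure ratio: {failure_ratio:.1%}")
```

In this sketch throughput is computed from every request that was attempted, and failures are reported as a separate ratio, so dropped connections change the failure ratio without changing the request rate.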

@cyberw I got that, apologies.
Googling the error gave me only Python-related references, and my server is Spring Boot. That made me think the error had something to do with Locust or its internal libraries while creating users.
Anyway, thanks for answering patiently, cheers!

@cyberw I tried running my existing code with Locust's regular HttpClient in a master-slave configuration on Linux servers. The master is an 8-core, 16 GB Ubuntu Linux machine, and there are 2 slaves with 4 cores and 8 GB each. Every machine easily hits the "CPU maxed out" message. So I tried to find out how many users can be simulated with our code on a single machine in standalone mode. It manages only up to 200 users, even on the 8-core machine. Sounds weird? When I checked on the internet, it said Python only executes on a single core at a time, and that seems right: even though my machine has 8 cores, Locust is only utilizing one specific core.

Now, my question is: can we run multiple slaves on a single 8-core machine, so that I can utilize all cores on each machine?

Absolutely. Run one slave per core to ensure good utilization.

How will Locust pick the other cores on the same machine? If we open a new SSH session and execute the same command, will it pick another core?

Python/Locust is not bound to a specific core, so you don't have to worry about that; the OS will distribute the processes across all your cores.

Thanks for replying. Just confirming one point: if we open a new SSH session on the same Linux machine and execute the same --slave command, will it work?

Yes. You may want to use the --expect-slaves parameter on the master side to wait for all the slaves to connect before starting.
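
The "one slave per core" setup could be sketched as below. This is a rough illustration rather than an official recipe: `build_commands` is a hypothetical helper, and the flags (`--master`, `--slave`, `--master-host`, `--expect-slaves`, `--no-web`, `-c`, `-r`) are the Locust 0.14 names; later releases renamed them. As far as I can tell from the 0.14 documentation, `--expect-slaves` only takes effect together with `--no-web`, so the master command here is a headless run using the 3500 users and hatch rate of 5 from the original post.

```python
import os


def build_commands(locustfile, host, users, hatch_rate,
                   master_host="127.0.0.1", workers=None):
    """Build (but do not run) one master command and one slave command per core."""
    workers = workers or os.cpu_count() or 1
    master = [
        "locust", "-f", locustfile, "-H", host,
        "--master", "--no-web",
        "-c", str(users), "-r", str(hatch_rate),
        f"--expect-slaves={workers}",  # wait for every slave before starting
    ]
    slave = ["locust", "-f", locustfile, "--slave", f"--master-host={master_host}"]
    # Each slave is a separate OS process, so the OS can schedule them on different cores.
    return master, [list(slave) for _ in range(workers)]


master_cmd, slave_cmds = build_commands(
    "LocustTestCases.py", "https://appURL/", users=3500, hatch_rate=5, workers=8
)
print(" ".join(master_cmd))
for cmd in slave_cmds:
    print(" ".join(cmd))
```

Each printed line can be started in its own shell session, or launched from a script with `subprocess.Popen(cmd)`.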

Yes, that seems to be working, but I still need to test at higher loads. However, the charts on the Locust dashboard have now died. Do you know why the charts and other tabs died?

no idea, sorry...

Closing this issue, because the "Connection aborted" error was solved by adding more slaves. I will create a new issue for the charts and slave tab crashing problem. Thanks @cyberw for all your support.
