Locust: Connection pool is full, discarding connection | 'Connection aborted.', RemoteDisconnected('Remote end closed connection without response

Created on 20 Feb 2020 · 20 Comments · Source: locustio/locust

I am new to the Locust load testing framework and am migrating my existing Azure cloud-based performance testing C# scripts to Locust's Python-based scripts. Our team has almost completed the migration. During our load tests, however, we get the errors below, and the machine fails to create new requests, either because of high CPU utilization or because of the many exceptions raised in Locust. We run Locust in web mode; details are given below. The scripts work fine at smaller loads of 50 to 100 users. The issue happens only at higher loads, from about 500 up to 3500 users.

"Error 1 -('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))"

"Error 2 : Connection pool is full, discarding connection"

Environment

Our load test configuration is "3500 users at a hatch rate of 5 users per second", running natively (no Docker container) on an 8-core, 16 GB Ubuntu Linux virtual machine on Azure. ulimit is set to 50,000 on the Linux machine.

Please help us with your thoughts

sample code

import os
import sys
sys.path.append(os.environ.get('WORKDIR', os.getcwd()))

from locust import HttpLocust, TaskSet, task
from locust.wait_time import between

# AppUtil is a project-specific helper (its definition appears later in the
# thread) and must be importable here.


class ContactUsBehavior(TaskSet):

    wait_time = AppUtil.get_wait_time_function(2)

    @task(1)
    def post_load_test_contact(self):
        data = {
            "ContactName": "Mane",
            "Email": "[email protected]",
            "EmailVerifaction": "[email protected]",
            "TelephoneContact": "",
            "PhoneNumber": "",
            "ContactReason": "Other",
            "OtherComment": "TEST Comments 2019-12-30",
            "Agree": "true",
        }
        # The form body goes in `data`; `name` is the label shown in Locust's statistics.
        self.client.post("app/contactform", data=data, name="Contact us submission")


class UnauthenticatedUser(HttpLocust):
    task_set = ContactUsBehavior
    # host is override-able
    host = 'https://app.devurl.com/'

OS: Linux Ubuntu (actually a VM on Azure) - Linux 5.0.0-1032-azure x86_64
Python version: Python 3.6.9
Locust version: locust 0.14.4
Locust command line that you ran: ~/LocustPythonScripts$ locust -f LocustTestCases.py -H https://appURL/ GeneralUser
Locust file contents (anonymized if necessary): see above sample file

bug

abinjaik

All 20 comments

Hi!

What is AppUtil?

What kind of throughput are you getting? You’ll need to run multiple load generator processes for really high throughput (a single Python process is limited to one core). See the documentation about distributed runs.

Are there warnings in the log about high CPU usage?

cyberw on 20 Feb 2020

It's just a utility class which gets the wait time from a global configuration.

import random

# Configuration is another project-specific class (not shown) that exposes get_wait_time().


class AppUtil:

    @classmethod
    def get_wait_time_function(cls, taskset_avg_time):
        """
        Return a wait-time function. It yields the global wait time (int) if
        one is configured, or otherwise a random value within one second of
        taskset_avg_time.

        See https://github.com/locustio/locust/blob/master/locust/wait_time.py
        on why method is defined this way.
        """
        value = Configuration.get_wait_time()
        if value is not None:
            return lambda instance: value
        else:
            return lambda instance: (taskset_avg_time - 1) + (random.random() * 2)

Did you mean running Locust in a master-slave configuration? https://docs.locust.io/en/stable/running-locust-distributed.html

abinjaik on 20 Feb 2020
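
A minimal, standalone sketch of what the helper above produces, for readers who want to check the resulting wait times. It assumes nothing about the real `Configuration` class: the hypothetical `fixed_value` parameter stands in for the configured global wait time, and `make_wait_time` is an illustrative name, not part of Locust.

```python
import random


def make_wait_time(taskset_avg_time, fixed_value=None):
    """Return a function of the form f(instance) -> seconds.

    fixed_value is a hypothetical stand-in for the global configuration value.
    """
    if fixed_value is not None:
        # A configured value always wins.
        return lambda instance: fixed_value
    # Otherwise draw a value within one second of the average.
    return lambda instance: (taskset_avg_time - 1) + (random.random() * 2)


wait = make_wait_time(2)
samples = [wait(None) for _ in range(1000)]
# With an average of 2, every sample should fall between 1 and 3 seconds.
print(min(samples), max(samples))
```

With an average of 2 seconds, each simulated user therefore pauses roughly 1 to 3 seconds between tasks, so 3500 users would attempt on the order of 3500 / 2 = 1750 requests per second if the server and load generator could keep up.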

Ok!

Yes :)

cyberw on 20 Feb 2020

.. but distributed shouldn't be necessary for such a simple test plan unless you’re running at least some hundreds of requests/s. You can also try FastHttpLocust https://docs.locust.io/en/stable/increase-performance.html

cyberw on 20 Feb 2020

I will try this soon and let you know within a day or two. Thanks for the directions.

abinjaik on 20 Feb 2020

@cyberw - we checked both FastHttpLocust and geventhttpclient, but neither has an attribute on the response object that gives the URL of the response. With the default Locust HTTP client, we check the URL of the HTTP response to verify where the user lands after certain actions. Do you have any thoughts?

abinjaik on 20 Feb 2020

If FastHttpLocust doesn't fit your needs then it is probably best to stay on HttpLocust. It sounds like you have plenty of hardware if you just run distributed.

What kind of throughput are you at when you get the problems? Are there any CPU usage warnings in your log?

cyberw on 21 Feb 2020

I had a look at the requests library, and apparently "Connection pool is full..." is not so bad: https://stackoverflow.com/questions/53765366/urllib3-connectionpool-connection-pool-is-full-discarding-connection

The second error message seems to be the server side dropping the connection (maybe the server does not allow 3500 concurrent connections?) https://stackoverflow.com/questions/48105448/python-http-server-client-remote-end-closed-connection-without-response-error

cyberw on 21 Feb 2020

@cyberw I too have been getting error 1 ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')).
My concern is that, since the error is raised by Locust (the client) and not produced by the server, I believe it should be ignored and not counted as a failure. When it is shown as a failure in the output report, the reader assumes that the server threw the error.

Additionally, I believe the RPS will not be accurate if these are included in the failure count.
RPS = total requests processed / (endTime - startTime)
Counting these as failures implies that the request was processed and rejected by the server, whereas in fact it never reached the server.

Your thoughts?

cc: @abinjaik

Naren-Hub on 27 Feb 2020
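
The RPS formula in the comment above can be worked through with a small sketch. The request counts and timings below are made-up illustrative numbers, not measurements from this thread, and `summarize` is a hypothetical helper rather than a Locust function. It assumes failed requests are included in the total, with the failure rate tracked as a separate figure.

```python
def summarize(total_requests, failed_requests, start_time, end_time):
    """Compute throughput figures from counts and timestamps (in seconds)."""
    elapsed = end_time - start_time  # end minus start, so the result is positive
    if elapsed <= 0:
        raise ValueError("end_time must be later than start_time")
    return {
        "rps": total_requests / elapsed,
        "failures_per_s": failed_requests / elapsed,
        "failure_ratio": failed_requests / total_requests if total_requests else 0.0,
    }


# Hypothetical run: 12000 requests in 60 seconds, 300 of them failed.
stats = summarize(total_requests=12000, failed_requests=300,
                  start_time=0.0, end_time=60.0)
print(stats)
```

Keeping the failure rate as its own figure means the overall RPS still describes the load that was attempted, while the failure ratio shows how much of it did not complete.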

Hmm... I'm not sure what you mean, @Naren-Hub. "Remote end closed connection without response" is a server-side error (or at least not Locust/client side; it could also be the network), and counting it as an error is the only reasonable behaviour.

The error is of course detected/raised on the Locust/client side, but Locust didn't cause the error.

cyberw on 27 Feb 2020
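
The distinction between where the error is raised and what causes it can be reproduced without Locust. The sketch below uses only the Python standard library: a throwaway local server accepts one connection, reads the request, and hangs up without answering, and the client then reports the disconnect. The function names are illustrative, and the real cause on a loaded server (connection limits, proxies, timeouts) may of course differ.

```python
import http.client
import socket
import threading


def serve_and_hang_up(listener):
    """Accept one connection, read the full request head, then close silently."""
    conn, _ = listener.accept()
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    conn.close()  # no HTTP response is ever sent


def request_against_silent_server():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_and_hang_up, args=(listener,), daemon=True)
    server.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/")
        client.getresponse()
        return "response received"
    except http.client.RemoteDisconnected as exc:
        # Raised in the client, but triggered by the server closing the socket.
        return "{}: {}".format(type(exc).__name__, exc)
    finally:
        client.close()
        server.join(timeout=5)
        listener.close()


print(request_against_silent_server())
```

The exception surfaces in client code, yet the client did nothing wrong: the remote end closed the connection before sending a status line.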

@cyberw I got that, apologies.
Googling the error gave me only Python-related references, and my server is Spring Boot. That made me think the error had something to do with Locust or its internal libraries while creating users or something.
Anyway, thanks for answering patiently, cheers!

Naren-Hub on 28 Feb 2020

@cyberw I tried running my existing code with Locust's regular HTTP client in a master-slave configuration on Linux servers. The master is an 8-core, 16 GB Ubuntu Linux machine, and there are 2 slaves with 4 cores and 8 GB each. Every machine hits the "CPU maxed out" message easily. So I tried to find out how many users can be simulated with our code on a single machine in standalone mode. It manages only up to 200 users, even on the 8-core machine. Sounds weird? When I checked on the internet, it said Python only executes on a single core at a time, and that seems right: even though my machine has 8 cores, Locust is only utilizing one specific core.

Now, my question is: can we run multiple slaves on a single 8-core machine, so that I can utilize all cores on each machine?

abinjaik on 28 Feb 2020

Absolutely. Run one slave per core to ensure good utilization.

cyberw on 28 Feb 2020

How will Locust pick the other core on the same machine? If we open a new SSH session and execute the same command, will it pick another core?

abinjaik on 28 Feb 2020

Python/Locust is not bound to a specific core, so you don't have to worry about that. The OS will distribute the processes across all your cores.

cyberw on 28 Feb 2020

Thanks for replying. Just confirming one point: if we open a new SSH session on the same Linux machine and execute the same --slave command, will it work?

abinjaik on 28 Feb 2020

Yes. You may want to use the --expect-slaves parameter on the master side to wait for all the slaves to connect before starting.

cyberw on 28 Feb 2020
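
The one-slave-per-core advice can be sketched as a small command builder. This is an illustration under assumptions, not a tested deployment script: it assumes the Locust 0.14 flags `--master`, `--slave`, `--master-host` and `--expect-slaves`, the helper name `build_commands` is hypothetical, and it only prints the commands rather than launching them. Note that the 0.14 documentation describes `--expect-slaves` in the context of `--no-web` runs, so it may have no effect in web mode.

```python
import os


def build_commands(locustfile, master_host="127.0.0.1", slaves=None):
    """Return (master_cmd, slave_cmds) as argument lists, one slave per core by default."""
    if slaves is None:
        slaves = os.cpu_count() or 1
    master = ["locust", "-f", locustfile, "--master",
              "--expect-slaves={}".format(slaves)]
    slave = ["locust", "-f", locustfile, "--slave",
             "--master-host={}".format(master_host)]
    # Every slave runs the identical command; the OS schedules them across cores.
    return master, [list(slave) for _ in range(slaves)]


master_cmd, slave_cmds = build_commands("LocustTestCases.py", slaves=8)
print(" ".join(master_cmd))
for cmd in slave_cmds:
    print(" ".join(cmd))
```

Each printed slave command can be started in its own shell session (or passed to `subprocess.Popen`); because the commands are identical, nothing needs to be pinned to a particular core.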

Yes. That seems to be working, but we need to test at higher loads. However, the charts on the Locust dashboard have died now. Do you know why the charts and other tabs died?

abinjaik on 28 Feb 2020

no idea, sorry...

cyberw on 28 Feb 2020

Closing this issue, because the "Connection aborted" error was solved by adding more slaves. I will create a new issue for the charts and slaves tab crashing problem. Thanks @cyberw for all your support.

abinjaik on 3 Mar 2020
