Boto3: Connections to S3 much slower than with boto2?

Created on 22 Sep 2015 · 4 Comments · Source: boto/boto3

I have developed a web application with boto (v2.36.0) and am trying to migrate it to use boto3 (v1.1.3). Because the application is deployed on a multi-threaded server, I connect to S3 for each HTTP request/response interaction.

Following the guidance here, I am generating a session per request to ensure thread-safety. However, I have noticed that whereas boto2 takes approximately 0.2 ms to connect to S3, boto3 takes approximately 25 ms.

Here is an IPython session that demonstrates the issue. This was run in IPython 3.0.0 on Python 3.4.3 (Anaconda 2.2.0, 64-bit) on Ubuntu 14.04 LTS on an AWS EC2 instance in us-east-1; however, I have noted similar behavior in Python 2.7 on my local machine.

import boto as boto2
import boto3

access_key_id = ...
secret_access_key = ...
bucket_name = ...
contents = chr(0)

def connect_with_boto2():
    connection = boto2.connect_s3(access_key_id, secret_access_key)
    return connection

def connect_with_boto3():
    session = boto3.session.Session(
        aws_access_key_id=access_key_id, 
        aws_secret_access_key=secret_access_key,
    )
    connection = session.resource('s3')
    return connection

def set_with_boto3():
    connection = connect_with_boto3()
    bucket = connection.Bucket(bucket_name)
    bucket.Object('boto3').put(Body=contents)

def set_with_boto2():
    connection = connect_with_boto2()
    bucket = connection.get_bucket(bucket_name)
    key = boto2.s3.key.Key(bucket, 'boto2')
    key.set_contents_from_string(contents)

%timeit set_with_boto2()
#10 loops, best of 3: 48.9 ms per loop

%timeit set_with_boto3()
#10 loops, best of 3: 73.1 ms per loop

%timeit connect_with_boto2()
#1000 loops, best of 3: 203 µs per loop

%timeit connect_with_boto3()
#10 loops, best of 3: 26.7 ms per loop

Am I setting up these connections correctly, or am I comparing apples-to-oranges? If the latter, is there a way to get the boto3 performance to approximate boto2?


All 4 comments

So I think the documentation in boto3 is misleading. There are a few things that the example does not capture. Creating sessions and service resources can take a significant amount of time, so ideally you want to minimize the number of sessions and service resources you create.

Based on the linked documentation, a new session and resource are created every time the thread is run. Ideally the session and resource should only be created once for the lifetime of the thread. So create the session and resource when the thread is instantiated, and call all of the rest of the operations in the run() method of the thread. That should help improve performance.
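The suggestion above could be sketched roughly as follows. This is only an illustration, not code from the boto3 documentation: `S3Worker`, `make_s3_resource`, and the `factory` parameter are hypothetical names, and the session here relies on boto3's default credential lookup rather than the explicit keys used in the question.

```python
import threading


def make_s3_resource():
    """Create a fresh session and S3 resource (the slow step measured above)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    session = boto3.session.Session()
    return session.resource("s3")


class S3Worker(threading.Thread):
    """Hypothetical worker thread that owns exactly one S3 resource."""

    def __init__(self, bucket_name, keys, body=b"\0", factory=make_s3_resource):
        super().__init__()
        self.bucket_name = bucket_name
        self.keys = keys
        self.body = body
        # Pay the session/resource creation cost once, when the thread
        # object is instantiated, rather than on every operation.
        self.s3 = factory()

    def run(self):
        # Every operation in this thread reuses the same resource.
        bucket = self.s3.Bucket(self.bucket_name)
        for key in self.keys:
            bucket.Object(key).put(Body=self.body)
```

The resource is handed to a single thread and never touched by another one, which is the constraint described in this thread.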

Let me know if that works or if you have any questions.

Thanks for the clarification @kyleknap.

I don't have an easy way to access the threads (they are spawned and managed by the web server, so I'd have to monkeypatch them with the boto3 sessions and resources, which I'd prefer not to do for a bunch of reasons).

However, if I'm understanding you right, I should be able to create a pool of sessions/resources to be shared among the threads. Are there any caveats here?

That should be fine as long as each session and each service resource is not shared across more than one thread. Once a thread has a service resource (that is not shared by another thread), you should be good to use that service resource however you wish without worrying about the activities of other threads.
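One way to build such a pool, as a minimal sketch under stated assumptions: `ResourcePool`, `checkout`, and `make_s3_resource` are hypothetical names, not part of boto3. A `queue.Queue` hands each resource to at most one thread at a time, which is the caveat mentioned above.

```python
import queue
from contextlib import contextmanager


def make_s3_resource():
    """Create a fresh session and S3 resource (assumes default credentials)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    return boto3.session.Session().resource("s3")


class ResourcePool:
    """Hypothetical fixed-size pool; each item is used by one thread at a time."""

    def __init__(self, factory, size):
        self._items = queue.Queue()
        # Pay the creation cost up front, once per pooled resource.
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def checkout(self):
        item = self._items.get()  # blocks until a resource is free
        try:
            yield item
        finally:
            self._items.put(item)  # hand it back for the next thread
```

A request handler could then create `pool = ResourcePool(make_s3_resource, size=8)` once at startup and wrap its S3 calls in `with pool.checkout() as s3:`. The pool size of 8 is arbitrary; it would need to match the number of server threads to avoid blocking.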

@kyleknap I believe there are a few more items to consider here:

  1. Multithreading doesn't necessarily imply long-lived threads. Using multithreading to download files from S3 in parallel (with eventlet, for example) is common in Python, and for that use case it actually makes sense to create the session and resource once and reuse them from the short-lived threads.
  2. In addition to 1 above, but not only for that use case, there are many resources that are initialised and could be reused. For example, the urllib3 message "Connection pool is full, discarding connection" often appears when trying to parallelise S3 reads with boto3, which suggests that multi-threaded use of a shared session could be beneficial because the internal connection pools would be shared (see http://stackoverflow.com/a/18845952/135701 and https://github.com/openstack/python-swiftclient/commit/19d7e1812a99d73785146667ae2f3a7156f06898 for examples of possible solutions).

Therefore, recommending the creation of more instances for all multi-threaded use cases isn't good advice. It's probably better to make Session and Resource immutable and thread-safe, and let them manage connection pool sizing and so on. That would better support all multithreaded scenarios, including short-lived threads used to parallelise I/O, and give a good performance gain while at it.
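For the short-lived-thread case, a rough sketch of what sharing could look like is shown below. It assumes a later boto3/botocore release than the ones discussed in this thread, where low-level clients are documented as generally thread-safe and `botocore.config.Config` accepts `max_pool_connections`. It uses a client rather than a Session or Resource, and `make_shared_client` and `fetch_all` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def make_shared_client(pool_size):
    """Create one S3 client whose connection pool matches the worker count."""
    import boto3  # imported lazily so the sketch loads without boto3 installed
    from botocore.config import Config

    return boto3.client("s3", config=Config(max_pool_connections=pool_size))


def fetch_all(client, bucket_name, keys, max_workers):
    """Download several objects in parallel through one shared client."""

    def fetch(key):
        response = client.get_object(Bucket=bucket_name, Key=key)
        return response["Body"].read()

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as `keys`.
        return list(executor.map(fetch, keys))
```

Sizing the pool to the number of worker threads, for example `fetch_all(make_shared_client(16), bucket_name, keys, max_workers=16)`, is one plausible way to avoid the "Connection pool is full" message, though whether it does so depends on the installed versions.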
