I have developed a web application with boto (v2.36.0) and am trying to migrate it to use boto3 (v1.1.3). Because the application is deployed on a multi-threaded server, I connect to S3 for each HTTP request/response interaction.
Following the guidance here, I am generating a session per request to ensure thread-safety. However, I have noticed that whereas boto2 takes approximately 0.2 ms to connect to S3, boto3 takes approximately 25 ms.
Here is an IPython session that demonstrates the issue. This was run in IPython 3.0.0 on Python 3.4.3 (Anaconda 2.2.0, 64-bit) on Ubuntu 14.04 LTS on an AWS EC2 instance in us-east-1; however, I have noted similar behavior in Python 2.7 on my local machine.
import boto as boto2
import boto3
access_key_id = ...
secret_access_key = ...
bucket_name = ...
contents = chr(0)
def connect_with_boto2():
    connection = boto2.connect_s3(access_key_id, secret_access_key)
    return connection

def connect_with_boto3():
    session = boto3.session.Session(
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )
    connection = session.resource('s3')
    return connection

def set_with_boto3():
    connection = connect_with_boto3()
    bucket = connection.Bucket(bucket_name)
    bucket.Object('boto3').put(Body=contents)

def set_with_boto2():
    connection = connect_with_boto2()
    bucket = connection.get_bucket(bucket_name)
    key = boto2.s3.key.Key(bucket, 'boto2')
    key.set_contents_from_string(contents)

%timeit set_with_boto2()
#10 loops, best of 3: 48.9 ms per loop
%timeit set_with_boto3()
#10 loops, best of 3: 73.1 ms per loop
%timeit connect_with_boto2()
#1000 loops, best of 3: 203 µs per loop
%timeit connect_with_boto3()
#10 loops, best of 3: 26.7 ms per loop
Am I setting up these connections correctly, or am I comparing apples-to-oranges? If the latter, is there a way to get the boto3 performance to approximate boto2?
So I think the documentation in boto3 is misleading. There are a few things that the example does not capture. Creating sessions and service resources can take a significant amount of time, so ideally you want to minimize the number of sessions and service resources you create.
Based on the linked documentation, a new session and resource are created every time the thread is run. Ideally the session and resource should only be created once for the lifetime of the thread. So create the session and resource when the thread is instantiated, and do all of the remaining operations in the run() method of the thread. That should help improve performance.
Let me know if that works or if you have any questions.
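The suggestion above (build the session and resource once per thread, then reuse them in run()) could look roughly like the sketch below. The class name `S3Worker` and the `make_resource` and `jobs` parameters are hypothetical names introduced for illustration; they are not part of boto3. The factory is passed in so that the pattern itself does not depend on boto3 being importable.

```python
import threading


class S3Worker(threading.Thread):
    """Hypothetical worker that builds its resource once and reuses it."""

    def __init__(self, make_resource, jobs):
        super().__init__()
        # Created once, when the thread object is instantiated,
        # instead of once per operation.
        self.resource = make_resource()
        self.jobs = jobs
        self.results = []

    def run(self):
        # Every job reuses the same resource object.
        for job in self.jobs:
            self.results.append(job(self.resource))


# Assumed wiring with boto3, using the calls from the question:
#
#   def make_resource():
#       session = boto3.session.Session(
#           aws_access_key_id=access_key_id,
#           aws_secret_access_key=secret_access_key,
#       )
#       return session.resource('s3')
#
#   job = lambda s3: s3.Bucket(bucket_name).Object('boto3').put(Body=contents)
#   worker = S3Worker(make_resource, [job])
#   worker.start()
#   worker.join()
```

With this shape, the setup cost measured by connect_with_boto3() would be paid once per worker rather than once per request.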
Thanks for the clarification @kyleknap.
I don't have an easy way to access the threads (they are spawned and managed by the web server, so I'd have to monkeypatch them with the boto3 sessions and resources, which I'd prefer not to do for a bunch of reasons).
However, if I'm understanding you right, I should be able to create a pool of sessions/resources to be shared among the threads. Are there any caveats here?
That should be fine as long as each session and each service resource is not shared across more than one thread. Once a thread has a service resource (that is not shared by another thread), you should be good to use that service resource however you wish without worrying about the activities of other threads.
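For the case where the threads belong to the web server, a pool along the lines discussed above might be sketched as follows. This assumes that "not shared" means no two threads use the same resource at the same time, so each pooled item is lent to exactly one thread and returned afterwards. `ResourcePool`, `borrow`, and `make_resource` are hypothetical names, not boto3 APIs.

```python
import queue
from contextlib import contextmanager


class ResourcePool:
    """Hypothetical fixed-size pool; each item is lent to one thread at a time."""

    def __init__(self, make_resource, size):
        self._items = queue.Queue(maxsize=size)
        # Pay the creation cost up front, once per pooled item.
        for _ in range(size):
            self._items.put(make_resource())

    @contextmanager
    def borrow(self, timeout=None):
        # Blocks until an item is free; raises queue.Empty on timeout.
        item = self._items.get(timeout=timeout)
        try:
            yield item
        finally:
            # Always hand the item back, even if the caller raised.
            self._items.put(item)


# Assumed usage inside a request handler, where make_resource builds a
# boto3 session and calls session.resource('s3') as in the question:
#
#   pool = ResourcePool(make_resource, size=8)
#   with pool.borrow() as s3:
#       s3.Bucket(bucket_name).Object('boto3').put(Body=contents)
```

The pool size is a tuning choice: if it is smaller than the number of server threads, requests wait in borrow() until a resource is returned.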
@kyleknap I believe there are a few more items to consider here
Therefore, recommending the creation of more instances for all multi-threaded use cases isn't good advice. It would probably be better to make Session and Resource immutable and thread-safe, and let them manage connection pool sizing and similar concerns. That would better support all multi-threaded scenarios, including short-lived threads used to parallelize I/O, and would give a good performance gain as well.