Boto3: Very slow boto3.client.put_object

Created on 15 Dec 2015 · 14 comments · Source: boto/boto3

We're seeing an extremely puzzling issue: one of two machines, which are running identical code and nearly identical in configuration, exhibits wildly slower boto3.client('s3').put_object performance than the other machine (note: we only instantiate the client once per thread/process). Using boto3 and running multiple processes, Machine #2 transfers data at around 1.5 Gbps while Machine #1 transfers data at around 0.015 Gbps.

The machine configurations are slightly different (mostly they have differing sets of network monitoring tools), so that's suspicious, but we've confirmed that uploading using the awscli tool runs at roughly 1Gbps on either machine. So Machine #1 and #2's network setups are fine.

Checking on raw boto3, we started up a fresh Python REPL and did a minimal test of boto3.client.put_object and saw the same very low performance on Machine #1.

We switched our upload script on Machine #2 from using boto3 to subprocess-calling awscli and Machine #2's performance headed towards Machine #1's (after accounting for the overhead of shelling out to a fresh interpreter, per Amdahl's Law).

So we've ruled out all of the cases we can think of to explain the slowness of boto3.client.put_object on Machine #1 and are left with only boto3.client.put_object as the culprit. An additional strange characteristic of the slowness is that, using 'bmon', we're able to watch traffic on the interface slowly ramp up [exponentially?] until the file is completely uploaded (which can take up to a minute). Additionally, CPU sys % sits around 10% on Machine #1, which is similar to Machine #2 and indicates significant network activity (even though traffic is low).

Our usage of boto3 is basically (where `data` can be a 100 MB MP4):

```python
s3_client = boto3.client('s3')

def upload(key, data):
    s3_client.put_object(Bucket=BUCKET_NAME,
                         StorageClass='REDUCED_REDUNDANCY',
                         Key=key,
                         Body=data,
                         Metadata={'source': args.source})
```

We've run out of ideas for diagnostics. Do you have any pointers for us or any ideas as to the failure mode we're seeing?

Labels: closed-for-staleness, guidance, s3

All 14 comments

Just want to make sure I understand correctly. In this paragraph of yours:

We switched our upload script on Machine #2 from using boto3 to subprocess-calling awscli and Machine #2's performance headed towards Machine #1's (after accounting for the shelling-out-to-a-fresh-interpreter's effect on Amdahl's Law).

Did you mean subprocess-calling awscli on the faster machine (#2) slows it down? Or did you want to say subprocess-calling awscli on the slower machine (#1) speeds it up?

Ray,
Sorry for the confusion. I meant the latter. Shelling out to awscli on
Machine #2 gets us 100x the throughput as compared to using boto3 on Machine #2.

So I guess you mean "Shelling to awscli on Machine ~~TWO~~ ONE gets us 100x throughput". I put my understanding in the following table (you'd better visit this GitHub page rather than just reading your email, to see the table nicely rendered). Please correct me if any data is wrong, and we will look into this.

| Test Method | Machine 1 | Machine 2 |
| --- | --- | --- |
| spec (for us to reproduce the issue) | ? | ? |
| raw boto3 | 0.015 Gbps | 1.5 Gbps |
| aws-cli | 1 Gbps | 1 Gbps |
| subprocess-calling awscli | 1.5 Gbps? | n/a? |

@alsonkemp Just want to confirm: For the Body=data argument for put_object, the data arg is just a normal opened file object, something like data = open(filename) right? Also curious if you've had a chance to try out s3_client.upload_file?

Agree with @jamesls. And this is the documentation for s3_client.upload_file(). It accepts a filename, automatically splits a big file into multiple chunks (default chunk size 8 MB, default concurrency 10), and streams each chunk through the aforementioned low-level APIs. This will generally give you much better throughput than a single-threaded put_object(). Please let us know whether it makes a difference.

@rayluo Argh. I thought that I might get #1 and #2 backwards. Yes, shelling to awscli on slow-machine yields results that are nearly as fast as boto3 on fast-machine. But the machines are otherwise identical. boto3 is fast on fast-machine and slow on slow-machine.

Adding a bit to your table:

| Test Method | Slow Machine | Fast Machine |
| --- | --- | --- |
| spec (for us to reproduce the issue) | see below | see below |
| raw boto3 | 0.015 Gbps | 1.5 Gbps |
| aws-cli | 1 Gbps | 1 Gbps |
| subprocess-calling awscli | 1 Gbps | 1 Gbps |

Spec: unfortunately, the spec is, basically, configure two identical Dell R720xd machines with Debian Jessie, install boto3, open a Python shell in each, import boto3, use put_object to send a 100+MB file to S3. If you have our luck, you'll wind up with a slow-machine and a fast-machine...

@jamesls the Body argument was being passed the file contents, not a file object. The docs (http://boto3.readthedocs.org/en/latest/reference/services/s3.html#S3.Client.put_object) say that Body is of type b'bytes'. I'm not sure how this would affect one machine and not the other, but we'll get back with results from the change.

Different env vars perhaps, i.e. is the slow machine using a proxy? Compare library and interpreter versions on both; e.g. falling back to the interpreted ElementTree would be a 10x slowdown over the native one. Otherwise, profile it. On the network side, perhaps one machine is using an S3 native VPC endpoint and the other is going through a proxy.

Did you try the function boto3.s3.transfer instead of put_object?

I might be running into this, or something related. The AWS CLI seems to provide faster download speeds than boto3 (I'm not sure about uploads, but I assume it's the same). On a d2.8xlarge instance, which has 10 Gbps networking and fast storage, I was able to download a large file of randomly generated data (8 GB) at about 150 MB/s using the CLI, but only about 35-40 MB/s using boto3. Here's the boto3 code I used (the particular transfer config settings didn't seem to have all that much effect either with the CLI or boto3, and I believe I also tried the defaults with similar results):

```python
import boto3
import boto3.s3.transfer
import logging
from concurrent.futures import ProcessPoolExecutor

logging.basicConfig(level='DEBUG')
logging.getLogger('botocore').setLevel('INFO')
client = boto3.client('s3')

config = boto3.s3.transfer.TransferConfig(
    multipart_threshold=64 * 1024 * 1024,
    max_concurrency=10,
    num_download_attempts=10,
    multipart_chunksize=16 * 1024 * 1024,
    max_io_queue=10000
)
# (also tried the defaults, with similar results:)
# config = boto3.s3.transfer.TransferConfig()
transfer = boto3.s3.transfer.MultipartDownloader(client, config, boto3.s3.transfer.OSUtils())
transfer.download_file('my-bucket-name-here', 'path/to/key/here/foo.npy', 'foo.npy', 8000000000, {})
```

IIRC, I also tried the simpler download_file and got the same 35-40 MB/s.

Am I missing something obvious in the configuration or something?

Thanks.

I'm experiencing a similar issue. When I call download_file() on a boto3 S3 bucket, the speed is < 1% of the speed of an awscli call. The strange thing is, when I use tcp_check to monitor the network traffic, the speed ramps up gradually to something reasonable.

This is a ubuntu machine not hosted on AWS.

@alsonkemp did you ever figure out what was the cause?

@keven425 To be honest, I have little recollection of the issue (it's been nearly three years). The issue was in a cluster of 8 nearly identical Debian Jessie Dell R720xd machines which were uploading 10 minute videos to AWS S3 (of mice & rats; see https://vium.com). The machines were directly connected, via a router, to AWS Direct Connect over a 10gbps optical fiber.

It's taking about 1-2 minutes to transfer a 15 MB file using put_object. I've been using this for weeks and normally it just takes a few seconds.

I suppose the only difference is that normally I'm running this at work on MacOS, and today I'm running it from home. But I'm having no other noticeable issues with my internet today (or ever).

Following up on this old issue. As it has been more than a year since the last comment: is anyone still having this problem with the latest version of boto3? If yes, please open a new issue and I would be happy to help.
