Boto3: completing multipart upload

Created on 29 Jan 2015  ·  7 Comments  ·  Source: boto/boto3

I'm having trouble completing a multipart upload.

Given the following test code:

mp = s.create_multipart_upload(Bucket='datalake.primary', Key='test1')
uid = mp['UploadId']
p1 = s.upload_part(Bucket='datalake.primary', Key='test1', PartNumber=1, UploadId=uid, Body='part_0')
s.complete_multipart_upload(Bucket='datalake.primary', Key='test1', UploadId=uid, MultipartUpload=???)

I don't know what I'm supposed to set MultipartUpload to, and I can't work it out from the docs. I can see it needs to be a dict, but I'm not sure what it should contain.

Without it, I get this error: ClientError: An error occurred (InvalidRequest) when calling the CompleteMultipartUpload operation: You must specify at least one part

documentation question

Most helpful comment

@owenrumney this is really not obvious from the documentation, so it took me a few tries to get right. Multipart uploads require information about each part when you try to complete the upload. This is how you can accomplish it:

import boto3

bucket = 'my-bucket'
key = 'mp-test.txt'

s3 = boto3.client('s3')

# Initiate the multipart upload and send the part(s)
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
part1 = s3.upload_part(Bucket=bucket, Key=key, PartNumber=1,
                       UploadId=mpu['UploadId'], Body='Hello, world!')

# Next, we need to gather information about each part to complete
# the upload. Needed are the part number and ETag.
part_info = {
    'Parts': [
        {
            'PartNumber': 1,
            'ETag': part1['ETag']  # originally posted as part['ETag'], a typo
        }
    ]
}

# Now the upload works!
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu['UploadId'],
                             MultipartUpload=part_info)

I'll see what can be done about updating the documentation upstream. Let me know if you have any other questions!

Also, you can enable low-level logging at any time with this:

boto3.set_stream_logger(name='botocore')

All 7 comments

@danielgtaylor thanks, that's much better. I'd seen from the API docs that this was the general form, but it wasn't completely clear. If the documentation just detailed the structure of the dict, that would probably have been enough.

What is the ETag? The dict, part, is not defined in this example.

ETag is part of the response returned by s3.upload_part(). See the response structure in the docs: https://boto3.readthedocs.org/en/latest/reference/services/s3.html#S3.Client.upload_part

I guess the typo in the example is confusing you. part should be renamed to part1:

part_info = {
    'Parts': [
        {
            'PartNumber': 1,
            'ETag': part1['ETag']
        }
    ]
}

Hi,
With the same code, if I add a for loop, it does not work.

import boto3

bucket = 'my-bucket'
key = 'mp-test.txt'

s3 = boto3.client('s3')

mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)

for i in range(1,3):
    part = s3.upload_part(Bucket=bucket, Key=key, PartNumber=i,
                          UploadId=mpu['UploadId'], Body='Hello, world!')
    part_info = {
        'Parts': [
            {
                'PartNumber': i,
                'ETag': part['ETag']
            }
        ]
    }

s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu['UploadId'],
                             MultipartUpload=part_info)

Now it throws this error:
botocore.exceptions.ClientError: An error occurred (InvalidPart) when calling the CompleteMultipartUpload operation: Unknown

Can anyone help solve this issue?

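You are overwriting part_info on every pass through the loop, so only the last part reaches complete_multipart_upload. Create part_info = {'Parts': []} once before the loop, then append each part inside it: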

parts = {
    'PartNumber': i,
    'ETag': part['ETag']
}
part_info['Parts'].append(parts)
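
Putting that fix back into the loop from the question, a minimal sketch could look like the following. The function name upload_in_parts and the bodies argument are made up for illustration; the client is passed in as s3, and only the boto3 calls already used in this thread appear.

```python
def upload_in_parts(s3, bucket, key, bodies):
    """Upload each item of `bodies` as one part, then complete the upload.

    `s3` is expected to be a boto3 S3 client, e.g. boto3.client('s3').
    """
    mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = mpu['UploadId']

    # Create the list once, outside the loop, so no part is overwritten.
    part_info = {'Parts': []}
    for number, body in enumerate(bodies, start=1):
        part = s3.upload_part(Bucket=bucket, Key=key, PartNumber=number,
                              UploadId=upload_id, Body=body)
        # Record the part number together with the ETag from the response.
        part_info['Parts'].append({'PartNumber': number, 'ETag': part['ETag']})

    return s3.complete_multipart_upload(Bucket=bucket, Key=key,
                                        UploadId=upload_id,
                                        MultipartUpload=part_info)
```

It would be called as upload_in_parts(boto3.client('s3'), 'my-bucket', 'mp-test.txt', [first_chunk, second_chunk]), where the two chunk variables stand for whatever data you want to send.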

Also, it might be worth reading from an actual file instead of using the static 'Hello, world!' string as the body of each part.
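
As a rough sketch of that suggestion, the code below reads a binary file object in fixed-size chunks and uploads each chunk as a part. The names read_chunks, upload_file_in_parts and PART_SIZE are hypothetical. The 5 MiB default reflects S3's documented minimum size for every part except the last; an S3-compatible service may enforce a different limit. Aborting on failure is a design choice here, not something the thread prescribes.

```python
PART_SIZE = 5 * 1024 * 1024  # S3's minimum size for every part but the last


def read_chunks(fileobj, chunk_size=PART_SIZE):
    """Yield successive chunks of at most `chunk_size` bytes from a binary file."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def upload_file_in_parts(s3, fileobj, bucket, key, chunk_size=PART_SIZE):
    """Upload `fileobj` as a multipart upload; `s3` is a boto3 S3 client."""
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)['UploadId']
    parts = []
    try:
        for number, chunk in enumerate(read_chunks(fileobj, chunk_size), start=1):
            resp = s3.upload_part(Bucket=bucket, Key=key, PartNumber=number,
                                  UploadId=upload_id, Body=chunk)
            parts.append({'PartNumber': number, 'ETag': resp['ETag']})
        return s3.complete_multipart_upload(Bucket=bucket, Key=key,
                                            UploadId=upload_id,
                                            MultipartUpload={'Parts': parts})
    except Exception:
        # Abort so that parts uploaded so far are not left behind.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

Open the file in binary mode before passing it in, for example with open(path, 'rb') as f: upload_file_in_parts(s3, f, bucket, key).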

Is "MultipartUpload" required?
I can't find "REQUIRED" next to the MultipartUpload argument in the boto3 docs, but when I call

rep = s3.complete_multipart_upload(Bucket='bucket',
                                   Key='wentao.mp4',
                                   UploadId='2~in_WUwt5z4g7ri1yfT_MiaRqAs8MRXG')

it raises botocore.exceptions.ClientError: An error occurred (MalformedXML) when calling the CompleteMultipartUpload operation: Unknown
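
The thread does not answer this directly, but both errors reported above suggest the service rejects a completion request that carries no part list. If the original upload_part responses are no longer available, one possible approach is to ask the service which parts it holds for that UploadId and build the dict from the listing. This is a sketch only: the function name complete_from_listed_parts is made up, and it assumes the endpoint supports list_parts the way Amazon S3 does.

```python
def complete_from_listed_parts(s3, bucket, key, upload_id):
    """Complete an in-progress multipart upload using the parts the service lists.

    `s3` is expected to be a boto3 S3 client, e.g. boto3.client('s3').
    """
    parts = []
    # list_parts is paginated, so walk every page of results.
    paginator = s3.get_paginator('list_parts')
    for page in paginator.paginate(Bucket=bucket, Key=key, UploadId=upload_id):
        for part in page.get('Parts', []):
            # Keep only the two fields the completion request needs.
            parts.append({'PartNumber': part['PartNumber'],
                          'ETag': part['ETag']})

    return s3.complete_multipart_upload(Bucket=bucket, Key=key,
                                        UploadId=upload_id,
                                        MultipartUpload={'Parts': parts})
```

Collecting the ETags yourself at upload time is still the safer route when you can, because the listing only tells you what the service received, not what you intended to send.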
