Boto3: Setting Cache-Control via boto3 causes file size 0

Created on 24 Jan 2017 · 3 comments · Source: boto/boto3

I set the Cache-Control parameter using the test code below. After running the code, the modified file's size becomes 0. In the S3 browser I double-clicked the modified file to open it in a separate tab; the tab opens, then immediately closes, and Chrome downloads the file instead. The downloaded file is empty.

If I do not set the Cache-Control metadata, the file size remains unchanged.

It is not clear why setting cache-control seems to destroy the file. I cannot find anything in the documentation about this.

Test code:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket-name')

for key in bucket.objects.all():
    print(key)
    print(key.key)

    key.put(CacheControl='max-age=31536000, public')
    key_obj = key.Object()

    print(key_obj)
    print(key_obj.cache_control)
    break    # stop after one object to avoid destroying every file in the bucket

# TODO This is causing the file size to become zero, and prevents opening the file in a new tab.


All 3 comments

put maps to a PutObject call, which creates a new S3 object; if the object already exists, it is replaced. Since no Body was supplied, the replacement object has an empty body, which is why the size drops to 0. If you just want to add Cache-Control to an existing object, the easiest way is to copy the object onto itself with the updated values.
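A minimal sketch of that self-copy pattern, with the bucket and key names made up for illustration (the actual S3 call is left commented out so the snippet stands on its own):

```python
def self_copy_args(bucket, key, cache_control):
    """Build the keyword arguments for Object.copy_from that update
    Cache-Control in place without touching the object's body."""
    return {
        'CopySource': {'Bucket': bucket, 'Key': key},
        'CacheControl': cache_control,
        # Without REPLACE, S3 rejects a copy of an object onto itself.
        'MetadataDirective': 'REPLACE',
    }

# Hypothetical bucket/key, for illustration only:
args = self_copy_args('bucket-name', 'index.html', 'max-age=31536000, public')
# s3 = boto3.resource('s3')
# s3.Object('bucket-name', 'index.html').copy_from(**args)
```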

@jamesls apparently you cannot copy an object to itself if you are only changing the metadata. I'd like this issue to be reopened.

    s3 = boto3.resource('s3')
    s3.meta.client.copy(
        {"Bucket": interested_bucket, "Key": key},
        interested_bucket,
        key,
        ExtraArgs={'CacheControl': 'no-cache'},
    )

"An error occurred (InvalidRequest) when calling the CopyObject operation: This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes."

I found this because I had the same problem. The solution was to add MetadataDirective='REPLACE' to the copy_from call:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket-name')
cache_control = 'public, max-age=604800, immutable'
for summary in bucket.objects.all():
    obj = summary.Object()
    # Copy file to itself with new cache-control headers & allowing replace
    obj.copy_from(
        CopySource={'Bucket': 'bucket-name', 'Key': obj.key},
        CacheControl=cache_control,
        MetadataDirective='REPLACE',
    )

Hopefully, this will be helpful for others.
