Aws-cli: aws s3 sync --exclude doesn't work consistently

Created on 9 Dec 2013 · 9 Comments · Source: aws/aws-cli

On OSX 10.8.5, I'm trying to exclude .DS_Store but --exclude doesn't seem to work as expected.

$  aws --version
aws-cli/1.2.7 Python/2.7.2 Darwin/12.5.0

$ ls -Ra
./         ../        .DS_Store  a/         test.txt

./a:
./         ../        .DS_Store

$ aws s3 sync  ./  s3://sajee-sync/  --exclude "*.DS_Store"   --dryrun
(dryrun) upload: ./test.txt to s3://sajee-sync/test.txt

$ cd a
$ aws s3 sync  ../  s3://sajee-sync/  --exclude "*.DS_Store"   --dryrun
(dryrun) upload: ../.DS_Store to s3://sajee-sync/.DS_Store
(dryrun) upload: ../test.txt to s3://sajee-sync/test.txt

Why is ../.DS_Store getting uploaded?

Instead of --exclude "*.DS_Store", I tried --exclude .DS_Store, --exclude *.DS_Store, and --exclude ".DS_Store", but all produce the same results.

All 9 comments

==EDITED by author===

I believe that --include does not work: files end up included even when they do not match the filter. I have not read the spec yet, but if you sync from S3 down to a local file system and use --include, it does not behave as expected. Notice in the examples below that the "did not match" case is detected, yet "should_include" is still set to True.

ex:

2013-12-11 18:32:15,643 - awscli.customizations.s3.filters - DEBUG - atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131119T061701 did not match include filter: atlas-config.ec2.arbor.net/20131210T
2013-12-11 18:32:15,643 - awscli.customizations.s3.filters - DEBUG - =atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131119T061701 final filtered status, should_include: True
2013-12-11 18:32:15,644 - awscli.customizations.s3.filters - DEBUG - /var/tmp/feed-web-1/access-20131119T061701 did not match include filter: /var/tmp/20131210T
2013-12-11 18:32:15,644 - awscli.customizations.s3.filters - DEBUG - =/var/tmp/feed-web-1/access-20131119T061701 final filtered status, should_include: True

2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131122T071702 did not match include filter: atlas-config.ec2.arbor.net/20131210T
2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - =atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131122T071702 final filtered status, should_include: True
2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - /var/tmp/feed-web-1/access-20131122T081702 did not match include filter: /var/tmp/20131210T
2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - =/var/tmp/feed-web-1/access-20131122T081702 final filtered status, should_include: True

For the benefit of others: this is actually by design. The conditional in filters.py only acts when a pattern matches (first checking for --include, else --exclude), and the result defaults to True. As a result, you MUST use --exclude="*" --include="*pattern*" in order to match only the files you want.

Otherwise, filters.py would have to be forked or modified to handle the explicit use case of --include without --exclude.
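
To make the behaviour described above concrete, here is a minimal sketch of an ordered filter chain where everything is included by default and the last matching filter decides. It is not the code from filters.py; the function name should_include, the (kind, pattern) tuples, and the sample keys are assumptions made for illustration, and matching is done with Python's standard fnmatch module.

```python
from fnmatch import fnmatchcase


def should_include(path, filters):
    """Apply ordered (kind, pattern) filters; the last matching one wins."""
    include = True  # default: a file that matches no filter is included
    for kind, pattern in filters:
        if fnmatchcase(path, pattern):
            include = (kind == "include")
    return include


# Hypothetical keys modelled on the debug log above.
old_key = "logs/feed-web-1/access-20131119T061701"
new_key = "logs/feed-web-1/access-20131210T061701"

# --include alone never flips anything to False.
only_include = [("include", "*20131210T*")]

# --exclude "*" first, then --include narrows the set back down.
exclude_then_include = [("exclude", "*"), ("include", "*20131210T*")]

for key in (old_key, new_key):
    print(key, should_include(key, only_include),
          should_include(key, exclude_then_include))
```

With only an include filter, the non-matching key keeps the default of True, which is what the "did not match include filter ... should_include: True" log lines show. Adding the blanket exclude first is what lets the include filter select files.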

As to why --exclude does not work in the above, what happens if you put --debug after sync in your line? It will show the printout of what filters.py is doing.

Here's the problem:

2013-12-12 15:52:14,482 - awscli.customizations.s3.filters - DEBUG - /Temp/s3-sync-test/.DS_Store did not match exclude filter: /Temp/s3-sync-test/a/.DS_Store
2013-12-12 15:52:14,482 - awscli.customizations.s3.filters - DEBUG - =/Temp/s3-sync-test/.DS_Store final filtered status, should_include: True
2013-12-12 15:52:14,482 - botocore.service - DEBUG - Creating operation objects for: Service(s3)

A specific path to .DS_Store is being used so any .DS_Store that doesn't match that path will be included. How do I exclude any files that match .DS_Store?
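
The log lines above are consistent with the exclude pattern having been joined to the current working directory (the a/ subdirectory) rather than to the sync source. That is an inference from the debug output, not a reading of the aws-cli source. A small sketch of the effect, using a hypothetical helper anchored_pattern and the paths from the log:

```python
import posixpath
from fnmatch import fnmatchcase


def anchored_pattern(base_dir, pattern):
    """Turn a relative pattern into an absolute one rooted at base_dir."""
    return posixpath.join(base_dir, pattern)


file_path = "/Temp/s3-sync-test/.DS_Store"

# Assumed behaviour in the log: pattern rooted at the current directory (a/).
from_cwd = anchored_pattern("/Temp/s3-sync-test/a", ".DS_Store")

# Expected behaviour: pattern rooted at the sync source directory.
from_source = anchored_pattern("/Temp/s3-sync-test", ".DS_Store")

print(from_cwd, fnmatchcase(file_path, from_cwd))
print(from_source, fnmatchcase(file_path, from_source))
```

Rooted at a/, the pattern names a different file from the one being checked, so the exclude never matches and the file is uploaded.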

Yes, pathing is important; a prefix of * may be needed if the file is not in the root of the bucket/container.

aws s3 sync ./ s3://sajee-sync/ --debug --exclude "*/.DS_Store" --dryrun

I'm not sure why *.DS_Store didn't work for you, but I'd like to see what the above shows. Also, can you run aws --version for us too?

Same result. "*/.DS_Store" didn't make a difference.

The aws --version is up top in my original post.

Looks like you're running into this issue: https://github.com/aws/aws-cli/issues/548
which is fixed in https://github.com/aws/aws-cli/pull/554 and will go out in the next release.

I'm still receiving this error as well
aws-cli/1.8.9 Python/2.6.6 Linux/2.6.32-573.7.1.el6.x86_64

I am still experiencing this issue as well. --exclude patterns that are exact matches (e.g. '.DS_Store') do not work. I must have a wildcard in order for the --exclude to work.

Same here. I've had to specifically write --exclude "*.DS_Store*" for the matching to work. Thanks
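
One possible explanation for why a bare name fails while a wildcard works: if the pattern is matched against the full path under the source directory, then .DS_Store can only match the top-level file, whereas a leading * also covers nested directories because fnmatch-style * matches "/" as well. The sketch below assumes that anchoring behaviour; the helper excluded and the paths are illustrative only.

```python
import posixpath
from fnmatch import fnmatchcase

SOURCE_DIR = "/Temp/s3-sync-test"  # hypothetical sync source


def excluded(file_path, pattern, source_dir=SOURCE_DIR):
    """Check file_path against a pattern rooted at source_dir."""
    return fnmatchcase(file_path, posixpath.join(source_dir, pattern))


top_level = "/Temp/s3-sync-test/.DS_Store"
nested = "/Temp/s3-sync-test/a/.DS_Store"

for pattern in (".DS_Store", "*.DS_Store", "*.DS_Store*"):
    print(pattern, excluded(top_level, pattern), excluded(nested, pattern))
```

Under this assumption the exact name excludes only the top-level .DS_Store, while both wildcard forms exclude the file at every depth.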

This either does not seem to behave logically, or is not well-documented... I can't tell which.
