We are in the process of writing a few thousand ~500 byte files to an S3 bucket/folder.
Currently the console shows 280 files in the folder (we assume this is correct, as we have not written them all yet).
aws s3 ls s3://bucket/folder/ --recursive
shows 280 files.
aws s3 cp s3://bucket/folder/ ./ --recursive
shows it copied 280 files on the command line. The count is printed explicitly, and I also counted the lines of per-file copy output in the console.
However, on macOS (right-click > Get Info) it shows 211 files.
Additionally, ls . | wc -l
shows 211 files.
I have tried reducing the number of concurrent requests to 1. I cannot understand why it would report the copy as completed when the file doesn't exist locally. This is over Wi-Fi, by the way, so if there is no check that a file was downloaded correctly, maybe that's it? Unsure where to look next.
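For reference, a rough sketch of how to pin down exactly which keys are missing locally; the bucket and prefix names are placeholders, and it assumes the object keys contain no spaces:
# Keys in S3 (4th column of the ls output), with the prefix stripped off
aws s3 ls s3://bucket/folder/ --recursive | awk '{print $4}' | sed 's|^folder/||' | sort > s3_keys.txt
# What actually landed locally
ls -1 . | sort > local_files.txt
# Keys present in S3 but missing locally
comm -23 s3_keys.txt local_files.txt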
UPDATE: We put together a hack to get around this bug. Basically, if you run:
aws s3 cp s3://bucket/folder/ ./ --recursive --dryrun > filestodownload.txt
it lists all the files you want to download and saves them in a text file that's easy to parse. We then parsed it and ran a separate aws s3 cp command for each individual file. All downloaded successfully, albeit slowly, so now we are working on concurrency... but files are still going MIA (even though we have local checks after each download that the file size is > 0, and they all succeed).
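A minimal sketch of that workaround (bucket/prefix names are placeholders, and the xargs -P level is arbitrary):
# The dry run lists every object the recursive copy would download
aws s3 cp s3://bucket/folder/ ./ --recursive --dryrun > filestodownload.txt
# Each line looks like: (dryrun) download: s3://bucket/folder/key to ./key
# so the 3rd field is the S3 URI of the object
awk '{print $3}' filestodownload.txt > keys.txt
# One aws s3 cp call per object, 4 at a time
xargs -P4 -I{} aws s3 cp {} ./ < keys.txt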
Really confusing...
UPDATE 2: We spun up an AWS AMI and repeated the process above. It worked as expected. IDK, seems to be related to the Mac? I am on macOS 10.12.4 Sierra.
You can rerun the commands with the --debug
flag, which will give you a lot of information to help pinpoint the issue. If you still think something is a bug, you can post the sanitized logs here for us to take a look at.
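For example, the debug output goes to stderr, so you can capture it for the failing run (bucket/prefix names are placeholders):
aws s3 cp s3://bucket/folder/ ./ --recursive --debug 2> cp-debug.log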
Extremely large amount of output; let me look through it.
What are you looking for? There are a bunch of items relating to region redirects, and stuff like the lines below:
2018-01-18 18:28:17,014 - ThreadPoolExecutor-0_5 - botocore.hooks - DEBUG - Event after-call.s3.GetObject: calling handler <function enhance_error_msg at 0x104583230>
2018-01-18 18:28:17,014 - ThreadPoolExecutor-0_2 - botocore.retryhandler - DEBUG - No retry needed.
2018-01-18 18:28:17,015 - ThreadPoolExecutor-0_1 - botocore.hooks - DEBUG - Event after-call.s3.GetObject: calling handler <function enhance_error_msg at 0x104583230>
2018-01-18 18:28:17,015 - ThreadPoolExecutor-0_5 - s3transfer.tasks - DEBUG - IOWriteTask(transfer_id=6, {'offset': 0}) about to wait for the following futures []
2018-01-18 18:28:17,015 - ThreadPoolExecutor-0_0 - botocore.hooks - DEBUG - Event before-parameter-build.s3.GetObject: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x104cd23d0>>
What version of the CLI were you using when you were running into these errors? It sounds like an upgrade might have resolved it, but it would be good to know for sure.
Closing due to inactivity.
I have the same problem. I am trying to copy all files from one bucket to another with --recursive, but at the end some of the files are not copied and no error is reported.
I simply use:
aws s3 cp s3://old_bucket/folder1 s3://new_bucket/folder2/ --recursive
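A rough sketch of how to check which keys were skipped, using the same bucket/prefix names (assumes keys contain no spaces):
# Source keys, relative to the source prefix
aws s3 ls s3://old_bucket/folder1/ --recursive | awk '{print $4}' | sed 's|^folder1/||' | sort > src_keys.txt
# Destination keys, relative to the destination prefix
aws s3 ls s3://new_bucket/folder2/ --recursive | awk '{print $4}' | sed 's|^folder2/||' | sort > dst_keys.txt
# Keys present in the source but missing from the destination
comm -23 src_keys.txt dst_keys.txt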
I also have this issue. At the moment I have a feeling it might have to do with S3's eventual consistency model and me doing something weird when pushing the files into S3. In case this helps anyone.
Facing the same issue. ls shows all directories and files; cp or sync only copies some of them, with no error. Also, there is no inclusion/exclusion policy set up.
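One workaround sketch, since sync is idempotent: keep re-running it until a pass transfers nothing (bucket/prefix and local paths are placeholders; shown here as bucket-to-local):
# Repeat sync until a pass prints no download lines, i.e. nothing left to copy
while true; do
  out=$(aws s3 sync s3://bucket/folder/ ./localdir/)
  [ -z "$out" ] && break
done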
The same problem here, awscli==1.16.74.
The aws s3 cp command with the --recursive parameter goes down to the deepest directory and then only backs up one directory level. This way it omits many directories at higher levels.
@JordonPhillips please reopen this issue.
The same problem here, awscli==1.16.186.
aws s3 cp and aws s3 sync behave the same way, but when I use s3cmd it works. @JordonPhillips please reopen this issue.
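For reference, a sketch of the kind of s3cmd command that worked for a recursive download (bucket, prefix, and local path are placeholders):
# Recursive download with s3cmd instead of the AWS CLI
s3cmd get --recursive s3://bucket/folder/ ./localdir/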