I know this has been brought up before but I think I have found another scenario where this error can occur.
So I'm running the following command to sync files to S3:

```
aws s3 sync ./source/ s3://bucket/destination/ --exclude '*' --include '*.mp3' --storage-class STANDARD_IA --no-follow-symlinks
```
Great, it works fine. But every now and again the sync fails with the error:

```
[Errno 2] No such file or directory
```

I had a look around and saw people talking about broken symlinks. I thought that was quite strange, since my source is just a folder of MP3s, but I tried the `--no-follow-symlinks` flag anyway. That still didn't fix it.
So I have a theory, which I hope is correct and leads to a fix. When I run the sync command, it gathers a list of all the files in the directory up front and then syncs them; that list doesn't change after the initial scan. The folder I'm syncing changes regularly, so while the sync is running, some of the files are being deleted by their respective users. Because the file list is only built at the start of the sync, the sync fails when it reaches a file that no longer exists.

It would be nice to just be able to ignore these failures. Let me know if my thinking is correct.
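If it helps, here's a minimal Python sketch of what I think is happening (hypothetical file names; the deletion is done deliberately between the listing and the stat, just to make the race deterministic):

```python
import os
import tempfile

# Build the file list up front, the way the sync appears to.
src = tempfile.mkdtemp()
for name in ("a.mp3", "b.mp3"):
    open(os.path.join(src, name), "w").close()
file_list = sorted(os.path.join(src, n) for n in os.listdir(src))

# A user deletes one of the files after the list was gathered.
os.remove(file_list[0])

# The sync later stats every entry in the now-stale list and blows up.
errors = []
for path in file_list:
    try:
        os.stat(path)
    except OSError as e:
        errors.append(e.errno)  # errno 2 is ENOENT: No such file or directory
```

The stale list is the whole problem: nothing re-checks the directory between the initial scan and the per-file stat.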
That definitely sounds like the likely cause of the error you're seeing.
If that's the case, we'd also have the same problem even once the upload process has started. Each task thread gets its own file handle, so if a file is deleted while it's being uploaded, one of the task threads can get a file-not-found error.
We should be able to fix the case you've mentioned though. Let me discuss with the others.
I still can't sync an important directory because of this issue. Is there any way to make this a priority if I paid for it? Also, although I don't know Python, I could give fixing it a shot myself. Would you be kind enough to point me in the general direction of which file I'd need to edit? Thanks.
Also experiencing the same issue here. In my case, I rsync to a folder and then `aws s3 sync` that folder to S3. Rsync by default adds a `.tmp` extension to files as it copies them, and I exclude those with `--exclude '*.tmp'`. However, when rsync finishes and renames a file back to its original name, I get:

```
[Errno 2] No such file or directory
```

Shouldn't the CLI be excluding the `.tmp` file? Why is it coming back to it and noticing that it's no longer there? Seems like a bug to me.
I am having a similar problem. In my case the sync does not continue; it stops there. I also discovered today that you can't get it to ignore anything that starts with `.`. And I'm getting an error with the `--no-follow-symlinks` option: `Unknown options: --no-follow-symlinks`.
I'm also having this issue syncing anything starting with `.` that also has a `~` in the file name, getting the exact same generic `[Errno 2] No such file or directory`. Think of it like this, for example: `./6025/Proj_Name/Foo/Review/031416/RFJTJ~7`
I ran `find . -iname "*~*"` on my directory and found a slew of other files that fall into this area of concern. I don't necessarily want to go and remove those offending "characters to avoid", as they may be important; I don't know, I'm just told to "archive all of it."
What's the best way to get past this? I need to finish uploading this stuff. As workarounds for the AWS CLI's exclude feature leaving something to be desired, I've tried s4cmd, s3cmd, Cyberduck, and some others; all run into the same issue.
Guys, I really don't understand why you can't fix this issue. It seems like such a simple solution: check that the file still exists and don't throw an exception if it doesn't!

The problem is that it takes a huge amount of time for the CLI to iterate through a large directory. It stores the file list in a cache while it runs through and syncs to S3. Between the time the file list is created and the time the CLI actually gets to some of the files, those files may have been deleted, resulting in this problem.
I've run it with `--debug` and here is the stack trace:

```
2016-11-15 00:48:54,237 - MainThread - awscli.customizations.s3.results - DEBUG - Exception caught during command execution: [Errno 2] No such file or directory: '/var/www/mp3s/file.mp3'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/awscli/customizations/s3/s3handler.py", line 170, in __call__
    for fileinfo in fileinfos:
  File "/usr/local/lib/python2.7/dist-packages/awscli/customizations/s3/fileinfobuilder.py", line 31, in __call__
    for file_base in files:
  File "/usr/local/lib/python2.7/dist-packages/awscli/customizations/s3/comparator.py", line 79, in __call__
    src_file = advance_iterator(src_files)
  File "/usr/local/lib/python2.7/dist-packages/awscli/customizations/s3/filegenerator.py", line 142, in __call__
    for src_path, extra_information in file_iterator:
  File "/usr/local/lib/python2.7/dist-packages/awscli/customizations/s3/filegenerator.py", line 211, in list_files
    size, last_update = get_file_stat(file_path)
  File "/usr/local/lib/python2.7/dist-packages/awscli/customizations/s3/utils.py", line 212, in get_file_stat
    stats = os.stat(path)
OSError: [Errno 2] No such file or directory: '/var/www/mp3s/file.mp3'
fatal error: [Errno 2] No such file or directory: '/var/www/mp3s/file.mp3'
2016-11-15 00:48:54,247 - Thread-1 - awscli.customizations.s3.results - DEBUG - Shutdown request received in result processing thread, shutting down result thread.
```
Please guys, can you look into this? It renders the CLI useless on large, constantly changing directories. There aren't a whole lot of alternatives out there to the official AWS CLI.
We are running into this same issue using `aws s3 sync` as part of log rotation (a file is rotated while the sync is running, which causes failures even though it isn't covered by the `--include` pattern).

I believe the issue is a race condition between the check for the file's existence in `issues_warning` and when the `stat` is looked up; in that brief window, the file gets removed.

I'm going to look into whether it's reasonable to use EAFP here (catch the exception, rather than check first).
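A rough sketch of what an EAFP version of the stat lookup could look like (hypothetical helper name, not the actual CLI code):

```python
import errno
import os

def safe_get_file_stat(path):
    """Attempt the stat and treat a vanished file as 'skip', rather than
    checking for existence first (which would still race)."""
    try:
        st = os.stat(path)
        return st.st_size, st.st_mtime
    except OSError as e:
        if e.errno == errno.ENOENT:
            return None  # file disappeared mid-sync; warn and skip it
        raise
```

The caller would emit a warning and move on when it gets `None` back, instead of aborting the whole sync.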
So... how does the fix in PR #2333 look?
@kyleknap Why was this closed? I'm still encountering a similar issue
@neonb88 - Thanks for your feedback. This issue was closed because PR #2333 was merged. What version of the CLI are you using, and can you provide the output from running with the `--debug` option, please?