Dvc: Access is denied on Windows 10.

Created on 3 Nov 2020 · 7 Comments · Source: iterative/dvc

Bug Report

I am just starting to use DVC to manage data files on Windows 10 x64. When I ran dvc add on a large folder (2 GB, 10K+ files), I got errors such as OSError: File WindowsPathInfo: '.dvc\cache\ba\0faa5f5464fde0e583677c6ef568c9' copy failed, error: Access is denied. or ERROR: unexpected error - File WindowsPathInfo: '.dvc\cache\67\9dd3f295ad11d17f5960ac093e74d7' copy failed, error: Access is denied. The files were copied from another folder, so I don't think any other program is operating on them at the same time. I also ran the command from an administrator prompt, and turning off the antivirus software did not help.
I talked with Paweł, one of the DVC developers, and we came to the conclusion that the possible reason might be that there is not enough memory to add a large folder. I tried changing checksum_jobs to 1, but that did not help. Adding the subfolders one by one works as a workaround, but it wastes time. So is it really because of memory? How much memory should I have available per MB of files being added?

Please provide information about your setup

I have just updated DVC to the latest version, but I still hit the same problem.

Before:

DVC version: 1.7.0 (exe)

Platform: Python 3.7.5 on Windows-10-10.0.17134-SP0
Supports: All remotes
Cache types: hardlink
Cache directory: NTFS on C:\
Workspace directory: NTFS on C:\
Repo: dvc, git

Now:

DVC version: 1.9.1 (exe)

Platform: Python 3.7.9 on Windows-10-10.0.17134-SP0
Supports: All remotes
Cache types: hardlink, symlink
Cache directory: NTFS on C:\
Workspace directory: NTFS on C:\
Repo: dvc, git

Additional Information (if any):

If applicable, please also provide the --verbose output of the command, e.g. dvc add --verbose.
https://github.com/pzhy13/fyp/blob/main/temp
It's part of one of my trials.

bug p3-nice-to-have research

All 7 comments

Some context from our email exchange and one meeting:

and we come to a conclusion that the possible reason might be that the memory may not be enough to add a large folder

The failure usually happens somewhere while files are being copied to the cache. Upon closer inspection during the call, we noticed that just before the Access is denied exception is thrown, the number of hard faults spikes. That is what led me to believe that it might be related to memory.

The machine in question had 8GB of RAM.

Changing checksum_jobs to 1 seemed to help for at least one of the datasets (2 GB). It was not a silver bullet, though, as dvc was still failing on bigger datasets (90 GB). We edited checksum_jobs because it was the only direct way to tinker with DVC resource usage that I saw at the time.
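For readers who want to try the same setting, here is a minimal sketch of how it could be applied from a script. It assumes the option discussed above is the core.checksum_jobs config key and that dvc is on the PATH. The helper name checksum_jobs_command is made up for illustration and is not part of DVC.

```python
import subprocess


def checksum_jobs_command(jobs):
    """Build the `dvc config` command that sets core.checksum_jobs.

    Hypothetical helper for illustration only; it just assembles the
    argument list and does not run anything by itself.
    """
    if jobs < 1:
        raise ValueError("jobs must be a positive integer")
    return ["dvc", "config", "core.checksum_jobs", str(jobs)]


def set_checksum_jobs(jobs, run=subprocess.run):
    """Run the command inside the current DVC repository."""
    return run(checksum_jobs_command(jobs), check=True)
```

Calling set_checksum_jobs(1) from the repository root should be equivalent to typing dvc config core.checksum_jobs 1 at the prompt. As noted above, this reduced failures on a small dataset but did not fix the larger ones.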

How much memory should I prepare for adding per MB files?

Regretfully, we do not have an answer to that question. Our experience with dvc-bench shows that for our test dataset (~800 MB, ~25k files), 4 GB is not enough.

Changing the checksum_jobs to 1 seemed to help at least in one of the datasets case (2 GB).
I need to use a batch script to run add on every subfolder; otherwise the error appears even if I change checksum_jobs to 1. If I don't change it, the batch script fails to add some of the subfolders.
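The per-subfolder workaround described here could look roughly like the sketch below. It is not the reporter's actual batch script: the function names are hypothetical, and it assumes dvc is on the PATH and that the script is run from inside the DVC repository.

```python
import subprocess
from pathlib import Path


def list_subfolders(root):
    """Return the immediate subdirectories of root, sorted by name."""
    return sorted(p for p in Path(root).iterdir() if p.is_dir())


def add_subfolders(root, run=subprocess.run):
    """Run `dvc add <subfolder>` once per subfolder of root.

    Returns the names of the subfolders for which the command exited
    with a non-zero status, so they can be retried or inspected.
    """
    failed = []
    for folder in list_subfolders(root):
        result = run(["dvc", "add", str(folder)])
        if result.returncode != 0:
            failed.append(folder.name)
    return failed
```

For example, add_subfolders("data") would add each directory directly under a hypothetical data folder and return the ones that failed. Note that this makes DVC track each subfolder as a separate output rather than tracking the parent folder as a whole.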

@pzhy13 as far as I remember, changing the number of checksum_jobs helped at least for the dataset we discussed on our call. Am I wrong?

@pared You are right, but that dataset is far smaller than 2 GB.

@pared Thanks for the research! :pray:

@pzhy13 Unfortunately, we are not able to reproduce your problem. So far the clues suggest that you might want to just increase the RAM (or the swap/page file size) on your machine, as the problem is likely caused by a lack of it :slightly_frowning_face:. We have another ticket about RAM optimization, https://github.com/iterative/dvc/issues/4139, under which we plan to research and optimize dvc. That might help in your case as well. We plan to have another round of optimizations in December, so for now it seems you have only two options: research and contribute some dvc RAM optimizations, or increase the RAM (or swap/page file size) on your machine.

In recent trials I turned off Trend Micro's Behavior Monitoring and added dvc.exe to the Trend Micro whitelist, and the "Access is denied" errors no longer appeared!

@pzhy13 Great to hear that, thanks for sharing the solution! Seems like the problem is solved. Closing the issue.
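For anyone hitting a similar error, one way to check whether something outside DVC is denying access is to repeat the copy step with plain Python. The sketch below is only a diagnostic aid under stated assumptions, not how DVC copies files into its cache. The function name find_denied_copies is hypothetical, and the scratch directory lives in the system temp location, which may sit on a different drive than .dvc\cache.

```python
import os
import shutil
import tempfile
from pathlib import Path


def find_denied_copies(src_dir, copy=shutil.copyfile):
    """Copy every file under src_dir into a throwaway scratch directory.

    Returns (copied_count, denied_paths). denied_paths lists the source
    files whose copy raised PermissionError, which is how Python reports
    "Access is denied" on Windows.
    """
    src_dir = Path(src_dir)
    copied, denied = 0, []
    with tempfile.TemporaryDirectory() as scratch:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                source = Path(root) / name
                target = Path(scratch) / source.relative_to(src_dir)
                target.parent.mkdir(parents=True, exist_ok=True)
                try:
                    copy(source, target)
                    copied += 1
                except PermissionError:
                    denied.append(str(source))
    return copied, denied
```

If plain copies are also denied for some files, the cause is probably outside DVC (for instance a security tool, as turned out to be the case in this thread). If they all succeed, that does not rule out interference aimed specifically at dvc.exe.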
