dvc: .dvcignore trouble with nfs mounted directory

Created on 5 Nov 2019 · 13 Comments · Source: iterative/dvc

I have a large NFS mounted in a directory that I would like dvc to ignore.

Directory Structure:

directory
|___nfs
|___...
|___.dvc
|___.dvcignore

My .dvcignore has the following line:
/nfs/ (I've tried nfs/ and nfs/*)

The problem is that when I run dvc status or dvc pull, the process just hangs:

DEBUG: PRAGMA user_version;
DEBUG: fetched: [(3,)]
DEBUG: CREATE TABLE IF NOT EXISTS state (inode INTEGER PRIMARY KEY, mtime TEXT NOT NULL, size TEXT NOT NULL, md5 TEXT NOT NULL, timestamp TEXT NOT NULL)
DEBUG: CREATE TABLE IF NOT EXISTS state_info (count INTEGER)
DEBUG: CREATE TABLE IF NOT EXISTS link_state (path TEXT PRIMARY KEY, inode INTEGER NOT NULL, mtime TEXT NOT NULL)
DEBUG: INSERT OR IGNORE INTO state_info (count) SELECT 0 WHERE NOT EXISTS (SELECT * FROM state_info)
DEBUG: PRAGMA user_version = 3; 

Here is the traceback from KeyboardInterrupt:

  File "/home/ec2-user/app/proc/.env/lib/python3.7/site-packages/dvc/repo/__init__.py", line 499, in dvcignore
    return DvcIgnoreFilter(self.root_dir)
  File "/home/ec2-user/app/proc/.env/lib/python3.7/site-packages/dvc/ignore.py", line 67, in __init__
    for root, dirs, _ in os.walk(root_dir):
  File "/home/ec2-user/app/proc/.env/lib64/python3.7/os.py", line 410, in walk
    yield from walk(new_path, topdown, onerror, followlinks)
  File "/home/ec2-user/app/proc/.env/lib64/python3.7/os.py", line 368, in walk
    is_dir = entry.is_dir() 

Which makes me feel like the directory is not being ignored.

Additional
I've unmounted the NFS directory and run dvc status with no problem, so I believe the issue stems from dvc trying to traverse it.

System Information:

DVC version: 0.66.6
Python version: 3.7.4
Platform: Linux 4.14.109-99.92.amzn2.x86_64
Installation: pip

Labels: bug, p0-critical

All 13 comments

thanks a lot for the report, @chadlohrli !

Hi @chadlohrli !

So nfs is mounted outside of your dvc repo, right? And where is your dvcignore?

@mroutis could you please include a link to the Discord discussion as we usually do?

So nfs is mounted outside of your dvc repo, right? And where is your dvcignore?

The nfs is mounted inside my dvc repo. The .dvcignore is at the same level as the nfs mount directory and .dvc directory. (I've tried moving the .dvcignore file around but no luck)

I have a large NFS mounted in a directory that I would like dvc to ignore.

@chadlohrli As far as I know, .dvcignore ignores only DVC-files (those that have extension .dvc). So, I would say it works as designed, although this might not be what is expected.

To understand better how .dvcignore works, please check this example: https://katacoda.com/dvc/courses/examples/dvcignore

@dashohoxha I don't believe that's correct.
Here is the official documentation: https://dvc.org/doc/user-guide/dvcignore

@chadlohrli Got it! Sorry, only now noticed that you've provided that info in the PR comments. Could you please try adding /nfs to .gitignore and committing those changes? Have a suspicion that it might be git that is hanging there.

@efiop tried adding /nfs to the root .gitignore file and committing. No luck.

@chadlohrli Thanks for the info! So probably a bug on our side, looking into it...

@chadlohrli Indeed, looks like our dvcignore filter has a midlife crisis and has no respect for himself :smile: Jokes aside, it simply doesn't account for itself when collecting dvcignores. Preparing a patch right now.
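To illustrate the kind of problem described here, the following is a minimal sketch of collecting ignore files with os.walk while pruning ignored directories, so the walk never descends into something like a large NFS mount. It is not DVC's actual implementation: the function name collect_ignore_files and the ignored_dirs parameter are hypothetical, and matching is by plain directory name only.

```python
import os


def collect_ignore_files(root_dir, ignored_dirs, ignore_name=".dvcignore"):
    """Return paths of ignore files under root_dir.

    Hypothetical helper: directories whose name is in ignored_dirs are
    skipped entirely, so their contents are never listed.
    """
    found = []
    for root, dirs, files in os.walk(root_dir, topdown=True):
        # With topdown=True, editing dirs in place tells os.walk
        # which subdirectories it should not descend into.
        dirs[:] = [d for d in dirs if d not in ignored_dirs]
        if ignore_name in files:
            found.append(os.path.join(root, ignore_name))
    return found
```

The in-place assignment to dirs is what matters: without it, os.walk visits every subdirectory, which matches the hang seen in the traceback above.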

@chadlohrli The fix is out in 0.66.7, please upgrade and give it a try :slightly_smiling_face: Thanks for the feedback!

@efiop that seems to have done the job!

One subtlety is that the pattern for an NFS directory in .dvcignore must be nfs and not nfs/, or else dvc still attempts to traverse its contents.

Thank you for the quick fix!

@chadlohrli Glad to hear it worked! :slightly_smiling_face:

It should actually be /nfs. We have the same rules as .gitignore: nfs would make dvc ignore all files/dirs named nfs anywhere in the project, while /nfs ignores nfs only in the directory that contains the .dvcignore.

Thanks for the feedback! :slightly_smiling_face:
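The difference between the anchored /nfs pattern and the unanchored nfs pattern can be sketched with a deliberately simplified matcher. This is an assumption-laden toy model, not how DVC or git match patterns: the hypothetical is_ignored function handles only plain names with an optional leading slash, and ignores wildcards, negation, and trailing slashes.

```python
def is_ignored(pattern, rel_path):
    """Toy model of anchored vs. unanchored ignore patterns.

    rel_path is relative to the directory holding the ignore file.
    Only plain names with an optional leading '/' are supported.
    """
    parts = rel_path.strip("/").split("/")
    if pattern.startswith("/"):
        # Anchored: only the top-level entry with this name matches.
        return parts[0] == pattern[1:]
    # Unanchored: an entry with this name at any depth matches.
    return pattern in parts


print(is_ignored("/nfs", "nfs/big_file"))        # True
print(is_ignored("/nfs", "data/nfs/big_file"))   # False
print(is_ignored("nfs", "data/nfs/big_file"))    # True
```

Under this model, /nfs affects only the mount at the project root, while nfs would also hide any nested directory that happens to share the name.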
