Dvc: Set correct permissions for users in a group automatically

Created on 17 Jul 2019 · 13 Comments · Source: iterative/dvc

In a shared cache directory setup, dvc fails to set the correct permissions on the folders in the cache directory. For instance, user X executes a dvc command, e.g. dvc pull, which sets the permissions of a directory to rwxr-xr-x. A different user Y, who is in the same group and works with the same cache directory, cannot delete files in that directory because the permissions forbid it. As a result, user Y gets an "Operation not permitted" error.
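
To make the failure mode concrete, here is a minimal Python sketch (not dvc code) that checks whether a directory's group is allowed to create or delete entries in it. The helper name `group_can_modify_entries` is made up for illustration, the temporary directory is only a stand-in for a directory inside the shared cache, and ACLs are ignored.

```python
import os
import stat
import tempfile

def group_can_modify_entries(directory):
    """True if the directory's group may create/delete entries (needs w and x)."""
    mode = os.stat(directory).st_mode
    needed = stat.S_IWGRP | stat.S_IXGRP
    return (mode & needed) == needed

cache_dir = tempfile.mkdtemp()  # stand-in for a directory inside the shared cache

os.chmod(cache_dir, 0o755)      # rwxr-xr-x, the mode described in the report
blocked = group_can_modify_entries(cache_dir)

os.chmod(cache_dir, 0o775)      # rwxrwxr-x, group members may modify entries
allowed = group_can_modify_entries(cache_dir)

print(blocked, allowed)
```

Deleting a file requires write permission on the directory that contains it, which is why user Y fails even though the file itself may be readable.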

With this context, an additional parameter in the cache config, similar to git config core.sharedRepository group (https://git-scm.com/docs/git-config), which addresses this problem for git, may be a suitable solution for a working shared cache directory setup.

For a more detailed overview see the conversation in the discord channel: https://discordapp.com/channels/485586884165107732/485596304961962003

bug c3-small-fix p0-critical

All 13 comments

I have this problem too. I tried setfacl to give other users access to the files, but it still does not work.

@BoysFight Have you tried setting umask to 0 (or 002)? Theoretically it should be a viable workaround, as it will make all files and directories created by dvc accessible to other users in your group. Though you'd need to chmod your already existing files first to make it work. But in any case, this should be supported by dvc itself too, the same way git does with its core.sharedRepository, so I've prepared a patch for a cache.shared option that does precisely that. Right now I'm figuring out details for some optimizations in our code that don't play nicely with that shared mode. Should be ready soon, so stay tuned! :slightly_smiling_face:
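
A small Python sketch of why the umask matters here, assuming a plain POSIX filesystem without default ACLs. The helper `mkdir_with_umask` and the directory names "aa" and "bb" are hypothetical and only serve the illustration.

```python
import os
import stat
import tempfile

def mkdir_with_umask(parent, name, umask):
    """Create parent/name under the given umask and return its rwx bits."""
    old = os.umask(umask)
    try:
        path = os.path.join(parent, name)
        os.mkdir(path)  # requests 0o777; the umask strips bits from that
    finally:
        os.umask(old)   # always restore the previous umask
    return stat.S_IMODE(os.stat(path).st_mode) & 0o777

root = tempfile.mkdtemp()
default_mode = mkdir_with_umask(root, "aa", 0o022)  # common default umask
relaxed_mode = mkdir_with_umask(root, "bb", 0o002)  # the suggested workaround

print(oct(default_mode), oct(relaxed_mode))
```

With umask 022 the group write bit is stripped from new directories; with 002 it is kept, which is what lets other group members work in them.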

@efiop I tried, but I got "ERROR: failed to obtain data status - [Errno 1] Operation not permitted: {dvcCacheFilePath}".
As a workaround, I changed the owner of "/usr/bin/dvc" and set the suid flag.

Oh, @BoysFight , that is a neat workaround indeed! Does everything work properly after setting the suid flag? Btw, I've pushed dvc config cache.shared group support to my fork. Instructions on how to try it out are available here https://discordapp.com/channels/485586884165107732/485596304961962003/601402500779737103 , please feel free to try it out and be sure to let us know if it worked for you or not.

@efiop , take a look at this issue: https://github.com/iterative/dvc/issues/2214

It might be helpful and will probably fix it.

@mroutis Great point! Indeed, running chmod g+s fixes it.

@Marcelo00 @BoysFight Does that workaround work for you, guys? Should we still introduce cache.shared, or would that be enough (in that case we'll have to document it, of course)?

EDIT: I might've jumped the gun on this one, investigating.

Ok, so g+s helps, but in a limited way. On my Ubuntu the default umask is 0022, so newly created directories will have 0755 permissions by default, which means that the group won't be able to modify the contents of those directories. So other users won't be able to create anything inside directories that someone else created in the cache (e.g. first-level dirs like .dvc/cache/88/). To fix that you'll need to set umask for everyone who is using that dvc cache, which is potentially dangerous (since it will affect other files outside of dvc too) and not very convenient. On the dvc side, with cache.shared we could take care of everything except the initial setup (e.g. if the cache existed before cache.shared, the user will need to chown it to the proper group and chmod -R it to the proper perms once). What do you guys think?
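
The limitation can be reproduced with a short Python sketch, assuming Linux semantics for setgid directories. The temporary directory stands in for the cache root and "88" for a first-level cache directory.

```python
import os
import stat
import tempfile

root = tempfile.mkdtemp()       # stand-in for the shared cache root
os.chmod(root, 0o2775)          # rwxrwsr-x: group-writable, setgid (g+s)

old_umask = os.umask(0o022)     # the default umask mentioned above
try:
    child = os.path.join(root, "88")
    os.mkdir(child)
finally:
    os.umask(old_umask)

child_stat = os.stat(child)
# g+s takes care of group ownership of new entries...
same_group = child_stat.st_gid == os.stat(root).st_gid
# ...but the umask still strips the group write bit from them.
child_group_writable = bool(child_stat.st_mode & stat.S_IWGRP)

print(same_group, child_group_writable)
```

So g+s addresses which group owns new directories, while the permission bits on them are still decided by each user's umask.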

@efiop , do you have any more details about this?
I'm wondering: if "alice" and "bob" are in the group dvc and the cache directory is owned by the dvc group (with the setgid bit set), why aren't "alice" and "bob" able to create directories below the cache? Isn't this a matter of running chmod -R on the cache directory correctly?

How is dvc going to chmod / chown with cache.shared? Does it need special permissions (sudo)?

@mroutis Sorry for the delay here, we've discussed it in private, but just for the record: the main reason is umask, which defaults to 0022 and results in group members not having write permissions on cache directories. chmod -R will solve it, but it would need to be done every time any directory is added to the cache, which is very inconvenient and will cause problems for other users during the time between creating the dir and running chmod.

How is dvc going to chmod / chown with cache.shared? Does it need special permissions (sudo)?

No, because the files added and directories created in the cache are owned by the current user, who is allowed to chmod the files and dirs they create, and that is all we need for this to work :slightly_smiling_face:
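
As a rough Python sketch of that idea (not the actual cache.shared patch; the helper `make_shared_dir` and the 0o775 mode are assumptions for illustration): the creator of a directory can chmod it right after creating it, regardless of umask and without sudo.

```python
import os
import stat
import tempfile

def make_shared_dir(path):
    """Create path; chmod it only when we created it, so we are its owner."""
    try:
        os.mkdir(path)
    except FileExistsError:
        return False           # someone else made it; leave its mode alone
    os.chmod(path, 0o775)      # explicit chmod is not filtered by the umask
    return True

old_umask = os.umask(0o022)    # simulate the restrictive default
try:
    cache_root = tempfile.mkdtemp()
    target = os.path.join(cache_root, "88")
    created = make_shared_dir(target)
    created_again = make_shared_dir(target)
finally:
    os.umask(old_umask)

shared_mode = stat.S_IMODE(os.stat(target).st_mode) & 0o777
print(created, created_again, oct(shared_mode))
```

There is still a short window between mkdir and chmod in this sketch; it is much smaller than with a manual chmod -R, but it exists.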

:astonished: true! It makes sense: a user creates a file and they decide whether it is a good idea to share it with a group or not.

@efiop , and what about chmod g+rws? Wouldn't it set the setgid bit so that directories are group read/write by default?

EDIT: It doesn't work :sweat:

"It makes sense: a user creates a file and they decide whether it is a good idea to share it with a group or not." - this might not be the best idea. Since the cache is shared, we may end up in a situation where two users create the same file. One of them won't be able to save/read changes in this case?

@shcheklein , I was thinking about umask defaulting to 022 :sweat_smile: (my bad for specifying).

I agree this is not the desired behavior for DVC, tho

For the record: another issue with chmod -R is that you are not allowed to chmod files that are not owned by you, even if you share the same group. So if protected mode is enabled, the owner themselves should set read-only permissions on cache files when adding them.
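
A minimal Python sketch of that rule, assuming a Unix system. The helper `protect_if_owner` and the 0o444 mode are hypothetical and are not taken from dvc's protected mode implementation.

```python
import os
import stat
import tempfile

def protect_if_owner(path):
    """Make path read-only if we own it; return False instead of failing otherwise."""
    if os.stat(path).st_uid != os.geteuid():
        return False           # chmod on someone else's file would raise EPERM
    os.chmod(path, 0o444)      # r--r--r--
    return True

fd, cache_file = tempfile.mkstemp()  # stand-in for a file just added to the cache
os.close(fd)

protected = protect_if_owner(cache_file)
file_mode = stat.S_IMODE(os.stat(cache_file).st_mode)
print(protected, oct(file_mode))
```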
