Dvc: status: ignoring cache that is not in local cache nor in remote

Created on 12 Aug 2020 · 3 Comments · Source: iterative/dvc

dvc status -c should probably exit with a non-zero code if some files are present neither in the local cache nor in the remote.

It is possible for some files to be missing from both the local cache and the remote. In this case dvc status -c prints a warning and exits with code zero:

$ dvc status -c
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: data.csv, md5: 746308829575e17c3331bbcb00c0898b
Data and pipelines are up to date.

This can happen if Alice runs dvc add data.csv, git add data.csv.dvc, git commit and git push but does not run dvc push, and Bob then runs dvc status -c data.csv on another machine.

In this case there should be a way to make the dvc status command exit with a non-zero code. That would allow scripts to detect and handle such errors.
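Until such an exit code exists, a script could scan the command output for the "name: ..., md5: ..." lines shown in the warning above. The following is only a rough sketch of that workaround: the helper names find_missing_cache and run_and_check are made up for illustration, and the line format is assumed from the sample output in this issue, so it may break if the warning text changes.

```python
import re
import subprocess
from typing import List, Tuple

# Matches lines like "name: data.csv, md5: 7463...898b" from the warning.
# The optional ".dir" suffix is an assumption about directory checksums.
MISSING_LINE = re.compile(
    r"^name: (?P<name>.+), md5: (?P<md5>[0-9a-f]{32}(?:\.dir)?)$"
)


def find_missing_cache(output: str) -> List[Tuple[str, str]]:
    """Return (name, md5) pairs listed as missing in `dvc status -c` output."""
    missing = []
    for line in output.splitlines():
        match = MISSING_LINE.match(line.strip())
        if match:
            missing.append((match.group("name"), match.group("md5")))
    return missing


def run_and_check() -> int:
    """Run `dvc status -c` and return 1 if any cache files are missing."""
    proc = subprocess.run(
        ["dvc", "status", "-c"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # The warning may be written to either stream, so check both.
    missing = find_missing_cache(proc.stdout + "\n" + proc.stderr)
    for name, md5 in missing:
        print("missing cache: {} ({})".format(name, md5))
    return 1 if missing else proc.returncode
```

A CI script could call sys.exit(run_and_check()) to fail the job when Bob's situation occurs. Parsing log output is fragile, which is part of why a proper exit code would be preferable.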

Context: https://discordapp.com/channels/485586884165107732/563406153334128681/743159360745898044

feature request help wanted p2-medium

All 3 comments

dvc status -c shouldn't exit with a non-zero code by default, because a missing cache is just part of the normal status report. It should exit with non-zero when -q is specified (there is already logic for that), but the problem is that we ignore STATUS_MISSING in https://github.com/iterative/dvc/blob/64f038fda83c2400263b335fca6bb4f63f6ecf0f/dvc/repo/status.py#L96 . So the first step here is to stop ignoring it there and include it in the report. This is pretty simple and only requires adjusting the lines I've linked above.

After that, we might consider introducing a special option that tells status to error out if it is not up to date.
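To illustrate the idea (this is not the actual code in dvc/repo/status.py), here is a minimal sketch of a report builder that either skips or includes missing entries. The constants, the shape of status_info, and the functions build_report and exit_code are all hypothetical stand-ins; the -q behavior is modeled only on the description above.

```python
from typing import Dict

# Hypothetical status constants; the real ones live in DVC and may differ.
STATUS_OK = "ok"
STATUS_MISSING = "missing"
STATUS_NEW = "new"
STATUS_DELETED = "deleted"


def build_report(
    status_info: Dict[str, Dict[str, str]], include_missing: bool = True
) -> Dict[str, str]:
    """Map file names to their status, leaving out up-to-date entries."""
    report = {}
    for info in status_info.values():
        status = info["status"]
        if status == STATUS_OK:
            continue
        # Old behavior (as described above): missing entries were dropped.
        if status == STATUS_MISSING and not include_missing:
            continue
        report[info["name"]] = status
    return report


def exit_code(report: Dict[str, str], quiet: bool) -> int:
    """Assumed -q semantics: non-zero only when quiet and the report is non-empty."""
    return 1 if quiet and report else 0
```

With include_missing=False the report for a file that exists in neither place is empty, so even -q would exit with zero. Once missing entries are included, the existing -q logic would pick them up without further changes.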

Thanks, @efiop! I'll create a PR with the fix as soon as possible.

Reopening per https://github.com/iterative/dvc/pull/4398#issuecomment-678744603, just a small detail:

If missing is now an expected status, the WARNING is probably no longer needed.
I think I can achieve it by adding a log_missing=True param to dvc.data_cloud.DataCloud.status...
