DVC version 0.63.4
Ubuntu 18.04
Installed via pip via poetry
After upgrading DVC from 0.62.1 to 0.63.4 I started seeing this error when our CI system does a dvc pull command:
dvc pull --jobs 3 --with-deps ./ImputeByEcg.dvc
ERROR: unexpected error - type object got multiple values for keyword argument 'desc'
Hi @BernMcCarty !
Could you please show dvc pull --jobs 3 --with-deps ./ImputeByEcg.dvc -v output?
In case it is helpful, the tqdm version that I am using here is 4.35.0.
dvc pull --jobs 3 --with-deps ./ImputeByEcg.dvc -v
DEBUG: PRAGMA user_version;
DEBUG: fetched: [(3,)]
DEBUG: CREATE TABLE IF NOT EXISTS state (inode INTEGER PRIMARY KEY, mtime TEXT NOT NULL, size TEXT NOT NULL, md5 TEXT NOT NULL, timestamp TEXT NOT NULL)
DEBUG: CREATE TABLE IF NOT EXISTS state_info (count INTEGER)
DEBUG: CREATE TABLE IF NOT EXISTS link_state (path TEXT PRIMARY KEY, inode INTEGER NOT NULL, mtime TEXT NOT NULL)
DEBUG: INSERT OR IGNORE INTO state_info (count) SELECT 0 WHERE NOT EXISTS (SELECT * FROM state_info)
DEBUG: PRAGMA user_version = 3;
DEBUG: Preparing to download data from 's3://cv-pipeline/dvc'
DEBUG: Preparing to collect status from s3://cv-pipeline/dvc
DEBUG: Collecting information from local cache...
DEBUG: SELECT count from state_info WHERE rowid=?
DEBUG: fetched: [(0,)]
DEBUG: UPDATE state_info SET count = ? WHERE rowid = ?
ERROR: unexpected error - type object got multiple values for keyword argument 'desc'
------------------------------------------------------------
Traceback (most recent call last):
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/main.py", line 43, in main
ret = cmd.run()
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/command/data_sync.py", line 31, in run
recursive=self.args.recursive,
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/repo/__init__.py", line 33, in wrapper
ret = f(repo, *args, **kwargs)
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/repo/pull.py", line 25, in pull
recursive=recursive,
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/repo/fetch.py", line 54, in _fetch
used, jobs, remote=remote, show_checksums=show_checksums
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/data_cloud.py", line 81, in pull
show_checksums=show_checksums,
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/remote/local.py", line 389, in pull
download=True,
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/remote/local.py", line 353, in _process
download=download,
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/remote/local.py", line 262, in status
local_exists = self.cache_exists(md5s, jobs=jobs, name=self.cache_dir)
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/remote/local.py", line 218, in cache_exists
+ ("cache in " + name if name else "local cache"),
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/dvc/progress.py", line 99, in __init__
**kwargs
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/tqdm/_tqdm.py", line 988, in __init__
self.display()
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1376, in display
self.sp(self.__repr__() if msg is None else msg)
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1022, in __repr__
return self.format_meter(**self.format_dict)
File "/home/DisiGitLabRunner/.cache/pypoetry/virtualenvs/cv-pipeline-py3.6/lib/python3.6/site-packages/tqdm/_tqdm.py", line 442, in format_meter
**extra_kwargs)
TypeError: type object got multiple values for keyword argument 'desc'
------------------------------------------------------------
@casperdcl Could you please take a look? :slightly_smiling_face:
that's definitely an outdated version of tqdm being used (the path is tqdm/_tqdm.py instead of tqdm/std.py). @BernMcCarty pip install -U tqdm or pip install -U dvc should hopefully fix the issue. If not, please re-post the error output.
@casperdcl But we have 4.35.0 in our deps, so should we update it as well?
I encountered the same issue and I have an up-to-date tqdm (4.36.1).
The issue seems to be the workaround of a corner case in our format_dict that defines the desc dict field. This dict is expanded as kwargs by tqdm's formatter but desc is already defined there.
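The collision can be reproduced without tqdm or DVC. This is only a minimal sketch, assuming the formatter sets `desc` itself and then expands caller-supplied extras into the same `dict()` call (the traceback ends at `**extra_kwargs` inside `format_meter`). `build_format_dict` and `try_build` are hypothetical names, not tqdm or DVC functions.

```python
def build_format_dict(prefix, **extra_kwargs):
    # Hypothetical stand-in for a formatter that sets `desc` itself and
    # then expands caller-supplied extras into the same dict() call.
    return dict(desc=prefix or "", **extra_kwargs)


def try_build(prefix, **extra_kwargs):
    # Return the dict, or the error text if the keywords collide.
    try:
        return build_format_dict(prefix, **extra_kwargs)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


ok = try_build("pull", unit="file")      # extras without `desc` are fine
clash = try_build("pull", desc="pull")   # `desc` arrives twice
print(ok)
print(clash)
```

The exact wording of the message varies between Python versions, but it is always a `TypeError` naming `desc`, which matches the error reported above.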
wait hang on...
https://github.com/iterative/dvc/blob/8d067c8805229f2312810700b467c4de7d14ca6e/dvc/progress.py#L139
There are two things wrong with that line. Yes, will need fixing.
@efiop I think it's safe to say tqdm<5 rather than pinning but it's up to you.
BTW I was eager to adopt this v0.63.4 to see if it would stop filling up my gitlab CI logs with noisy progress. I guess the fact that we're even in tqdm probably means that it still will show progress in this version even when running on my CI server. With the recent change, is there a way I can stop this noisy progress on my CI server? I've never found a way.
@BernMcCarty do you mean your CI doesn't support carriage return (\r) so doesn't put progress on a single line?
I guess the answer is yes. We are using gitlab and we see tons of progress lines like so and then we hit the gitlab log limit and this is a big problem.
../../.dvc/cache/52/c3601f726f35e69afc581851eb69b2: 96% 6.83G/7.13G [00:13<00:00, 536MB/s]
../../.dvc/cache/52/c3601f726f35e69afc581851eb69b2: 97% 6.88G/7.13G [00:13<00:00, 539MB/s]
../../.dvc/cache/52/c3601f726f35e69afc581851eb69b2: 97% 6.94G/7.13G [00:14<00:00, 543MB/s]
../../.dvc/cache/52/c3601f726f35e69afc581851eb69b2: 98% 6.99G/7.13G [00:14<00:00, 551MB/s]
../../.dvc/cache/52/c3601f726f35e69afc581851eb69b2: 99% 7.04G/7.13G [00:14<00:00, 550MB/s]
../../.dvc/cache/52/c3601f726f35e69afc581851eb69b2: 99% 7.09G/7.13G [00:14<00:00, 550MB/s]
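For anyone wondering why the same bar is one line in a terminal but many lines in a CI log, here is a rough sketch. It assumes the log collector stores every carriage-return update as a separate line, which is a guess about the CI side rather than documented GitLab behaviour. `as_terminal` and `as_line_log` are made-up helpers.

```python
def as_terminal(stream_text):
    # Simplified terminal: '\r' moves to column 0 and the next update
    # overwrites the line (assumes updates never get shorter).
    return "\n".join(line.split("\r")[-1] for line in stream_text.split("\n"))


def as_line_log(stream_text):
    # Simplified log collector that stores every '\r' update as a new line.
    return [line for line in stream_text.replace("\r", "\n").split("\n") if line]


updates = ["96% 6.83G/7.13G", "97% 6.88G/7.13G", "98% 6.99G/7.13G"]
stream = "".join("\r" + text for text in updates)

print(as_terminal(stream))   # one visible line
print(as_line_log(stream))   # one stored line per update
```

With a multi-gigabyte download and frequent refreshes, the second form grows quickly enough to hit a log size limit.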
right, so you'd have the same problem with progressbars when pip is downloading and spinners when building wheels...
Is running dvc with --quiet not an option?
If not, probably best to open a separate issue.
@casperdcl Thanks for the fix! Btw, not sure I understand
> @efiop I think it's safe to say tqdm<5 rather than pinning but it's up to you.
Could you elaborate, please?
@BernMcCarty @jeremcs Thanks for the feedback guys! I'm releasing the new dvc version right now.
@BernMcCarty 0.63.4 is still filling up your logs? I guess gitlab is using a pseudo-tty for capturing logs 🙁 As @casperdcl noted, maybe --quiet would be an option for you?
@efiop I mean that I use semver for tqdm so any version 4.x.x should not break dvc.
I'm not a fan of aggressively pinning dependencies because that can cause version conflicts. I think dvc also recognises this and specifies minimum versions (tqdm>=4.35.0 as you said). But for ultimate safety maybe add <5? It's a minor point; I'm unlikely to release a v5 that would break anything the way I've implemented things.
@casperdcl Ah, got it. I wasn't aware of that in tqdm. Sure, let's limit with <5. I'll edit in a second. Thanks! :)
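To make the agreed range concrete, here is a small sketch of which versions `tqdm>=4.35.0,<5` would accept. The parser is deliberately naive (plain `X.Y.Z` only) and `parse` and `allowed` are illustrative helpers, not what pip or DVC actually run.

```python
def parse(version):
    # Naive parser: handles plain "X.Y.Z" only, no pre-release tags.
    return tuple(int(part) for part in version.split("."))


def allowed(version, minimum="4.35.0", next_major=5):
    # Mirrors the range "tqdm>=4.35.0,<5": at least the minimum,
    # and still within the 4.x series.
    v = parse(version)
    return parse(minimum) <= v and v[0] < next_major


for candidate in ("4.34.0", "4.35.0", "4.36.1", "5.0.0"):
    print(candidate, allowed(candidate))
```

Any 4.x release from 4.35.0 onward passes, while a future 5.0.0 is excluded until it has been checked.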
I had the --quiet option in place until very recently and it made no difference. I actually only just took it out when I tried to upgrade to v0.63.4 because from the release notes I thought progress might be suppressed automatically since it is running on a CI server.
no that was likely about suppressing output for file redirection, not for CI envs
@casperdcl Just tested and indeed -q doesn't suppress progress bar, we have a bug.
created https://github.com/iterative/dvc/issues/2665 for that
If --quiet is fixed and if it is the only way to suppress the progress output, then that still leaves folks like me who use GitLab in a pickle if --quiet is going to suppress all output. CI troubleshooting is hard and time consuming. I don't actually want DVC to be completely silent, I just don't want noisy progress that fills up my log to the limit. Something like --no-progress maybe.
@BernMcCarty How about redirecting to a file then? That way the progress bar will be automatically disabled.
But I want to see the output. Would piping the output to tee disable the progress bar somehow? Maybe worth finding out, but just having --no-progress or the ability to set a control as an environment variable would be simpler.
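A sketch of what such a switch could look like, purely as an illustration of the request and not how DVC behaves. The `NO_PROGRESS` variable and the `progress_enabled` function are made up for this example.

```python
import io
import os
import sys


def progress_enabled(stream=None, environ=None):
    # Hypothetical policy, not DVC's actual behaviour:
    # 1. an opt-out environment variable always wins;
    # 2. otherwise show progress only when the stream is a terminal.
    stream = sys.stderr if stream is None else stream
    environ = os.environ if environ is None else environ
    if environ.get("NO_PROGRESS"):  # made-up variable name
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


class FakeTty(io.StringIO):
    # Test double that pretends to be a terminal.
    def isatty(self):
        return True


print(progress_enabled(io.StringIO(), {}))                # file-like, not a TTY
print(progress_enabled(FakeTty(), {}))                    # terminal
print(progress_enabled(FakeTty(), {"NO_PROGRESS": "1"}))  # opted out
```

The environment variable matters for the case raised earlier in the thread: if the CI runner captures logs through a pseudo-tty, a TTY check alone would still leave progress on.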
@efiop Is that a new feature that redirecting dvc pull to a file automatically turns off progress? I just tried it with v0.64.0 and that did not work. I still got progress.
> dvc pull --jobs 3 ./QueryParse.dvc > ~/scrap/dvcpull.txt
Multi-Threaded:
38%|███▊ |../../../../../../../../../cv-pipeline/io/parsed/Echo measur628359168/1662167418 [00:10<05:07, 3.36MB/s]
1%|▏ |../../../../../../../../../cv-pipeline/io/parsed/ICD_pars212860928/15971303979 [00:04<1:40:18, 2.62MB/s]
100%|██████████|../../../../../../../../../cv-pipeline/io/parsed/Phenotypes_par23109797/23109797 [00:01<00:00, 2.03MB/s]
Really think it best to continue discussion in #2664...
dvc pull --jobs 3 ./QueryParse.dvc > ~/scrap/dvcpull.txt
Should be
dvc pull --jobs 3 ./QueryParse.dvc 2> ~/scrap/dvcpull-stderr.txt
Or even
dvc pull --jobs 3 ./QueryParse.dvc 2>&1 | tee
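The difference between `>` and `2>` can be checked with a stand-in process. This sketch assumes, as the commands above imply, that progress is written to stderr. The child script is a made-up stand-in for a CLI and does not call dvc.

```python
import subprocess
import sys

# Child process that mimics a CLI: the result goes to stdout, the
# progress-style message goes to stderr.
child = (
    "import sys\n"
    "sys.stdout.write('result\\n')\n"
    "sys.stderr.write('progress 50%\\n')\n"
)

proc = subprocess.run(
    [sys.executable, "-c", child],
    stdout=subprocess.PIPE,   # what `> file` would capture
    stderr=subprocess.PIPE,   # what `2> file` would capture
    universal_newlines=True,
)

print("stdout:", proc.stdout.strip())
print("stderr:", proc.stderr.strip())
```

Capturing only stdout leaves the progress stream attached to wherever stderr already points, which is why the first redirect still showed bars.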
@BernMcCarty @jeremcs 0.65.0 is out. Please upgrade. Thanks a lot for the feedback! 🙂