params diff: using parameterization
In master branch I have a params.yaml like so:
datasets:
- A
- B
task: classification
seed: 42
feature_type: SparseDataset
training:
  alpha: 0.01
In exp branch I have a params.yaml like so:
datasets:
- A
- B
task: classification
seed: 42
feature_type: SparseDataset
training:
  alpha: 10.0
I have a training stage in dvc.yaml like so:
train_model:
  foreach: ${datasets}
  do:
    cmd: >-
      python scripts/train.py
      --dataset ${item}
    params:
      - training.alpha
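To make the setup concrete, here is a rough sketch of what the foreach expansion amounts to: one stage per entry in datasets, each with its own command and the same training.alpha dependency. This is only an illustration and not DVC's resolver. The expand_foreach helper is hypothetical, the name@item stage naming is assumed from DVC's documented convention and may differ in the versions discussed here, and the flag is assumed to be spelled --dataset.

```python
from string import Template

# Values copied from the params.yaml and dvc.yaml snippets in this report.
datasets = ["A", "B"]
# Assumption: the flag is spelled --dataset.
cmd_template = Template("python scripts/train.py --dataset ${item}")
stage_params = ["training.alpha"]


def expand_foreach(stage_name, items, template, params):
    """Hypothetical helper: build one stage entry per item in the list."""
    return {
        # Assumption: generated stages are named "<stage>@<item>".
        f"{stage_name}@{item}": {
            "cmd": template.substitute(item=item),
            "params": list(params),
        }
        for item in items
    }


stages = expand_foreach("train_model", datasets, cmd_template, stage_params)
for name, stage in stages.items():
    print(name, "->", stage["cmd"], stage["params"])
```

Every generated stage depends on the same parameter, so a tool that collects parameters per stage has to take care not to report training.alpha more than once.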
When I run dvc params diff in my exp branch I get
ERROR: unexpected error - duplicated key 'training.alpha'
Example:
Path         Param           Old   New
params.yaml  training.alpha  0.01  10.0
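For reference, the comparison behind that table can be sketched in a few lines of plain Python: flatten both versions of params.yaml into dotted keys and keep the keys whose values differ. This is a minimal illustration with hypothetical helper names (flatten_params, params_diff), not the code DVC runs.

```python
def flatten_params(d, prefix=""):
    """Turn nested dicts into a flat dict with dot-separated keys."""
    flat = {}
    for key, value in d.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_params(value, path))
        else:
            flat[path] = value
    return flat


def params_diff(old, new):
    """Return (param, old_value, new_value) for every changed parameter."""
    old_flat, new_flat = flatten_params(old), flatten_params(new)
    rows = []
    for key in sorted(set(old_flat) | set(new_flat)):
        if old_flat.get(key) != new_flat.get(key):
            rows.append((key, old_flat.get(key), new_flat.get(key)))
    return rows


# The two params.yaml files from this report, written as Python dicts.
master = {
    "datasets": ["A", "B"],
    "task": "classification",
    "seed": 42,
    "feature_type": "SparseDataset",
    "training": {"alpha": 0.01},
}
exp = dict(master, training={"alpha": 10.0})

rows = params_diff(master, exp)
for param, old_value, new_value in rows:
    print("params.yaml", param, old_value, new_value)
```

Only training.alpha differs between the two branches, so a single row is the expected result.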
Mac OS Catalina 10.15.7
Python 3.7.9
Output of dvc version:
$ dvc version
Tried on different dvc versions (1.11.8, master, 1.10.2) and only works on 1.10.2
Additional Information (if any):
Output of dvc params diff -v:
2020-12-16 13:07:56,623 DEBUG: Check for update is enabled.
2020-12-16 13:07:56,639 DEBUG: fetched: [(3,)]
2020-12-16 13:07:56,703 DEBUG: 37.35 ms in resolving values
2020-12-16 13:07:56,824 DEBUG: 34.89 ms in resolving values
2020-12-16 13:07:56,851 DEBUG: fetched: [(5,)]
2020-12-16 13:07:56,858 DEBUG: fetched: [(3,)]
2020-12-16 13:07:56,911 DEBUG: 35.37 ms in resolving values
2020-12-16 13:07:56,946 DEBUG: fetched: [(5,)]
2020-12-16 13:07:56,949 ERROR: unexpected error - duplicated key 'training.alpha'
------------------------------------------------------------
Traceback (most recent call last):
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/main.py", line 90, in main
ret = cmd.run()
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/command/params.py", line 38, in run
all=self.args.all,
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/repo/params/__init__.py", line 13, in diff
return diff(self.repo, *args, **kwargs)
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/repo/params/diff.py", line 25, in diff
format_dict(old), format_dict(new), with_unchanged=with_unchanged
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/utils/diff.py", line 82, in diff
path_diff = _diff(old.get(path), new.get(path), with_unchanged)
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/utils/diff.py", line 67, in _diff
return _diff_dicts(old, new, with_unchanged)
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/utils/diff.py", line 46, in _diff_dicts
new = _flatten(new_dict)
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/utils/diff.py", line 40, in _flatten
return defaultdict(lambda: None, flatten(d))
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/dvc/utils/flatten.py", line 5, in flatten
return flatten_dict.flatten(d, reducer="dot")
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/flatten_dict/flatten_dict.py", line 91, in flatten
_flatten(d)
File "/Users/bharathc/miniconda3/envs/dev/lib/python3.7/site-packages/flatten_dict/flatten_dict.py", line 88, in _flatten
raise ValueError("duplicated key '{}'".format(flat_key))
ValueError: duplicated key 'training.alpha'
------------------------------------------------------------
2020-12-16 13:07:57,103 DEBUG: Version info for developers:
DVC version: 1.11.8 (pip)
---------------------------------
Platform: Python 3.7.9 on Darwin-19.6.0-x86_64-i386-64bit
Supports: gdrive, http, https, s3
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: None
Repo: dvc, git
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-12-16 13:07:57,104 DEBUG: Analytics is enabled.
2020-12-16 13:07:57,208 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/r0/_1zb4w596xvdc5gydc9xxc9r0000gn/T/tmpkpzmnb5t']'
2020-12-16 13:07:57,210 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/r0/_1zb4w596xvdc5gydc9xxc9r0000gn/T/tmpkpzmnb5t']'
I am not able to reproduce through the steps you mentioned. Looking at the code, it seems the issue is that we are not skipping files that are already a param dependency for the parameterization (they are always shown by default).
From your log, it is clear that we are going through 3 different revisions in the params (and it looks like this happens in metrics too), when there should have been just 2. I will try to fix these; please take a look later to see whether it fixes your issue. Thanks
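As a rough illustration of how the error at the bottom of the traceback can arise, the sketch below uses a small strict flattener that, like the one in the traceback, refuses to produce the same dotted key twice. The input is hypothetical: it assumes the same parameter was collected once in nested form and once as an already flattened key. The actual data DVC built internally is not shown in this thread, and flatten_strict is an illustrative stand-in rather than the library function.

```python
def flatten_strict(d, prefix="", out=None):
    """Flatten nested dicts to dotted keys, raising if a key repeats."""
    out = {} if out is None else out
    for key, value in d.items():
        flat_key = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flatten_strict(value, flat_key, out)
        elif flat_key in out:
            # Same message format as the ValueError in the traceback.
            raise ValueError(f"duplicated key '{flat_key}'")
        else:
            out[flat_key] = value
    return out


# Hypothetical input: training.alpha present in both nested and flat form.
collected = {
    "training": {"alpha": 10.0},
    "training.alpha": 10.0,
}

try:
    flatten_strict(collected)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

If the parameter is collected only once, the flattener has nothing to collide with and the diff can proceed.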
@bharathc346, could you please try the 1.11 branch of DVC or master, whichever you prefer, and see if the bug is fixed? I was not able to reproduce the bug, so it would be good to get a confirmation. Thanks.
Running on master works after installing with pip install git+https://github.com/iterative/dvc, so I am going to close this issue. Thanks for resolving.