Dvc: Boolean value might get unnecessarily uppercased

Created on 30 Nov 2020  路  8Comments  路  Source: iterative/dvc

Bug Report

Python uses True/False as boolean values, while in other languages it might not be the case (Julia uses true/false).

It seems that the correct behavior is dvc faithfully stores the params as strings, and only does type conversion (str -> bool/int/float) when it needs to output some statistics, e.g., dvc params diff.

This seems to be an edge case for bool values only, so feel free to close if you don't plan to fix it.

Please provide information about your setup

Output of dvc version:

DVC version: 1.10.2 (pip)
---------------------------------
Platform: Python 3.8.3 on Linux-5.4.0-54-generic-x86_64-with-glibc2.10
Supports: All remotes
Cache types: symlink
Cache directory: nfs4 on storage:/home
Caches: local
Remotes: s3
Workspace directory: nfs4 on storage:/home
Repo: dvc, git

Additional Information (if any):

# dvc.yaml
stages:
  build:
    cmd: julia -e "println(ARGS); println(typeof.(ARGS))" ${opt1} ${opt2} ${opt3}
# params.yaml
opt1: true
opt2: "true"
opt3: some_other_string

Notice that opt1 is transformed to "True" instead of "true".

jc@mathai3:~/Downloads/dvc_test$ dvc repro -f
Running stage 'build' with command:
    julia -e "println(ARGS); println(typeof.(ARGS))" True true some_other_string

["True", "true", "some_other_string"]
DataType[String, String, String]
bug parametrization

Most helpful comment

The thing here is what boolean means on the shell context. So, the use of boolean is kind of undefined behaviour, though there's nothing wrong in making it return false or true, i.e. lowercased.

All 8 comments

Here , the boolean value true and false should follow neither python nor julia but the yaml format or the json format, it is a bug I think.

The thing here is what boolean means on the shell context. So, the use of boolean is kind of undefined behaviour, though there's nothing wrong in making it return false or true, i.e. lowercased.

@skshetry
Shouldn't we deserialize these arguments through YAML, instead of passing them through the shell?

@karajan1001, they are stringified. The problem is that when YAML is loaded, it gets loaded as Python's boolean True/False which on str() becomes True/False.


in reality. the parser actually just generates a resolved dvc.yaml-like data, and substitutes.
So, if you add a cmd with just ${opt1}, the substitution happens as a boolean value.
If it has other string appended to it (eg: --value ${opt1}, then it is read as a string (with aforementioned logic of boolean stringification).

This thing has been limited by the schema that we use. It only allows validation, whereas we also need to convert those values.

https://github.com/iterative/dvc/blob/master/dvc/schema.py

I'll solve this issue by just converting boolean stringification to true/false as an edge-case.

Why not use the dump method in YAML and JSON?
image

It's many times slower than the following for simple conversion:

def conv(boolean):
  return "true" if boolean else "false"

But, your point is valid, in that using json.dumps _might_ allow us to provide better compatibility even on edge cases.

Also, we don't support the dict/list in cmd, so not much of a need to use those yet.

@johnnychen94, thanks for trying out parametrization. As you seem to use parameterization quite a lot, it'd be great if you could also provide some feedback overall (not just on the bugs but the feature :stuck_out_tongue_winking_eye:). How's been the experience?

You can either respond here or on the original issue #3633. We are hoping to release this feature at the end of the next month with a pre-release scheduled this week. So, any feedback is appreciated and will help us iron out the kinks.

Thanks again for the bug report. :pray:

Was this page helpful?
0 / 5 - 0 ratings

Related issues

shcheklein picture shcheklein  路  3Comments

shcheklein picture shcheklein  路  3Comments

anotherbugmaster picture anotherbugmaster  路  3Comments

robguinness picture robguinness  路  3Comments

mfrata picture mfrata  路  3Comments