Dvc: Consider using an alternative for ruamel.yaml due to critical CVE

Created on 30 Jun 2020  Â·  15Comments  Â·  Source: iterative/dvc

As a dependency of DVC, ruamel.yaml up to 0.16.10 (latest) contains a critical CVE (https://nvd.nist.gov/vuln/detail/CVE-2019-20478). This prevents us (and probably others) to use DVC in production systems.

awaiting response

Most helpful comment

Also, another thing to note from the documentation:

The default loader (typ='rt') is a direct derivative of the safe loader, without the methods to construct arbitrary Python
objects that make the unsafe loader unsafe, but with the changes needed for round-trip preservation of comments, etc.

So, I think we are safe here as we only use round-trip loader once when trying to dump, and looks like it is safe. All other uses PyYAML's safe_load.

All 15 comments

@stefanvangastel, sadly, there's no good alternative of ruamel.yaml, as others do not preserve comments. But, I think we can make an optional opt-in config that uses (C)SafeLoader that can be used in production.

Looks like safe_load is not affected.

Also, another thing to note from the documentation:

The default loader (typ='rt') is a direct derivative of the safe loader, without the methods to construct arbitrary Python
objects that make the unsafe loader unsafe, but with the changes needed for round-trip preservation of comments, etc.

So, I think we are safe here as we only use round-trip loader once when trying to dump, and looks like it is safe. All other uses PyYAML's safe_load.

We had a lot of trouble with serialization libraries for spaCy: the JSON library we were using, ujson, was unmaintained, and the msgpack library we were using needed a monkeypatch to support numpy arrays and then made several breaking changes.

What we ended up doing was making our own package that vendored these dependencies, srsly, so that we could maintain them. We've recently added YAML support, vendoring our own fork of Ruamel.

The current YAML support attempts to set the safe loading flags, but I want to simply remove this terrible feature from the fork. I don't think anyone should ever want that behaviour, and if they do they can use a different YAML library.

So, if it suits your needs, perhaps DVC can use srsly? We have similar concerns and together we can make sure it stays secure.

Discussed with @skshetry and it looks like dvc is safe and doesn't look like we will get any value from moving to srsly (but will gain 300KB more in dep size) at this point, but will will keep an eye on srsly if we find ourselves needing its other features. Closing for now.

@stefanvangastel btw, was it reported by some static analyzator? Veracode? If so, are you able to mark it as mitigated?

It was Sonatype Nexus IQ. And it only hit on the >= 9 CVE score of the
ruamel.yaml component. But that makes installing DVC impossible.

Op di 30 jun. 2020 om 18:33 schreef Ruslan Kuprieiev <
[email protected]>:

@stefanvangastel https://github.com/stefanvangastel btw, was it
reported by some static analyzator? Veracode? If so, are you able to mark
it as mitigated?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/iterative/dvc/issues/4134#issuecomment-651907302, or
unsubscribe
https://github.com/notifications/unsubscribe-auth/AAS34ESG54BRX4XNBS2GC7TRZIHUPANCNFSM4OL4JOQQ
.

@stefanvangastel Are you able to mark things as mitigated? How do you usually deal with such issues? Is it not possible to review particular warnings?

We can provide waivers to whitelist specific packages, so that issue is solved (not being able to install DVC due to usage of ruamel.yaml).

@stefanvangastel Just to clarify, will you be able to do that in this case to install dvc?

Ok, so to paint the entire picture;

We are in an offline / on-premise netwerk. We use Sonatype Nexus (on-prem) to proxy / cache PyPi packages. So using pip install we provide a the Nexus server als proxy.

Additionally Sonatype IQ (also on-prem) will scan all packages traveling through the proxy before allowing them in our network.

If we install DVC, pip returns an 403 error stating ruamel.yaml is placed in quarantine due to a CVE score >= 9 (in this case 9.8 out of 10). So DVC on itself is not the issue.

After providing your explanation our SEC dept allowed the specific version of ruamel.yaml package to be installed, making it possible to install DVC.

The whitelisting and allowing is just local, so we don't provide any upstream feedback to Sonatype or NIST since the CVE is valid and undisputed.

That's why my original question was; Can DVC use another package since using ruamel.yaml makes you (DVC) inherit its CVE score as it is a dependency.

@stefanvangastel Thank you for the explanation! Makes sense now!

Btw, other than on-premise Sonatype, are you aware of any tool that we could use to check our cve score during development? Without it, we won't ever be sure if the new package that we introduce is actually better or worse in CVE score. Regarding the ruamel itself, we could consider switching to srsly as suggested by @honnibal , but again I'm not sure what is the CVE score for it (seems like there is no score at all right now) and if that would really help us.

@efiop, srsly right now has only vendored the ruamel.yaml, so it's the same.

are you aware of any tool that we could use to check our cve score during development?

Doesn't dependabot catch those? I think, it's having trouble reading from our setup.py, it should work with requirements.txt files.

@efiop I can reproduce the CVE findings locally (macOS) by:

Installing a OWASP dependency checker CLI tool

brew install dependency-check

Create venv and install DVC

python3 -m venv dvc
source dvc/bin/activate
pip3 install dvc

Run a scan

dependency-check --project "dvc" --enableExperimental --scan dvc

(--enableExperimental has to be on to use the Python scanner)

This results in a dependency-check-report.html that looks like this (confirming the ruamel.yaml severity):

Screenshot 2020-07-02 at 16 01 39

@stefanvangastel Oh, I didn't know that dependency-check supports cve scanning. Nice! Created https://github.com/iterative/dvc/issues/4153 for that.

Was this page helpful?
0 / 5 - 0 ratings