Zero-to-jupyterhub-k8s: CI: Automatic hub image scanning using trivy

Created on 10 Jul 2020  ·  14 Comments  ·  Source: jupyterhub/zero-to-jupyterhub-k8s

Bug description

Running trivy image scanning against the image for k8s-hub results in a number of vulnerabilities found.

Expected behaviour

0 vulnerabilities

Actual behaviour

trivy image --ignore-unfixed jupyterhub/k8s-hub:0.9.0-n033.h8211ad2 > out.txt

Results in: out.txt showing 100 vulnerabilities

How to reproduce

  1. Install trivy
  2. Run trivy image --ignore-unfixed jupyterhub/k8s-hub:0.9.0-n033.h8211ad2 > out.txt

Your personal set up

I'm reporting on jupyterhub/k8s-hub:0.9.0-n033.h8211ad2 because the release version 0.9.0 has around 300 vulnerabilities.

Labels: help wanted, maintenance, pr-idea

All 14 comments

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:

Welcome to the Jupyter community! :tada:

@meneal thanks for introducing these toolkits as part of reporting on the security issues in the image! I'm curious to see the current status if we rebuild it now. I'll do that.

After rebuilding it, the scan ended up like this:

hub-dependencies (ubuntu 18.04)
===============================
Total: 8 (UNKNOWN: 0, LOW: 4, MEDIUM: 4, HIGH: 0, CRITICAL: 0)

+----------+------------------+----------+-------------------+-----------------+------------------------------------+
| LIBRARY  | VULNERABILITY ID | SEVERITY | INSTALLED VERSION |  FIXED VERSION  |               TITLE                |
+----------+------------------+----------+-------------------+-----------------+------------------------------------+
| libc-bin | CVE-2018-11236   | MEDIUM   | 2.27-3ubuntu1     | 2.27-3ubuntu1.2 | glibc: Integer overflow in         |
|          |                  |          |                   |                 | stdlib/canonicalize.c on           |
|          |                  |          |                   |                 | 32-bit architectures leading       |
|          |                  |          |                   |                 | to stack-based buffer...           |
+          +------------------+          +                   +                 +------------------------------------+
|          | CVE-2018-11237   |          |                   |                 | glibc: Buffer overflow in          |
|          |                  |          |                   |                 | __mempcpy_avx512_no_vzeroupper     |
+          +------------------+          +                   +                 +------------------------------------+
|          | CVE-2018-19591   |          |                   |                 | glibc: file descriptor             |
|          |                  |          |                   |                 | leak in if_nametoindex() in        |
|          |                  |          |                   |                 | sysdeps/unix/sysv/linux/if_index.c |
+          +------------------+          +                   +                 +------------------------------------+
|          | CVE-2020-1751    |          |                   |                 | glibc: array overflow in           |
|          |                  |          |                   |                 | backtrace functions for            |
|          |                  |          |                   |                 | powerpc                            |
+          +------------------+----------+                   +                 +------------------------------------+
|          | CVE-2019-19126   | LOW      |                   |                 | glibc:                             |
|          |                  |          |                   |                 | LD_PREFER_MAP_32BIT_EXEC not       |
|          |                  |          |                   |                 | ignored in setuid binaries         |
+          +------------------+          +                   +                 +------------------------------------+
|          | CVE-2019-9169    |          |                   |                 | glibc: regular-expression          |
|          |                  |          |                   |                 | match via proceed_next_node        |
|          |                  |          |                   |                 | in posix/regexec.c leads to        |
|          |                  |          |                   |                 | heap-based buffer over-read...     |
+          +------------------+          +                   +                 +------------------------------------+
|          | CVE-2020-10029   |          |                   |                 | glibc: stack corruption from       |
|          |                  |          |                   |                 | crafted input in cosl, sinl,       |
|          |                  |          |                   |                 | sincosl, and tanl...               |
+          +------------------+          +                   +                 +------------------------------------+
|          | CVE-2020-1752    |          |                   |                 | glibc: use-after-free in           |
|          |                  |          |                   |                 | glob() function when expanding     |
|          |                  |          |                   |                 | ~user                              |
+----------+------------------+----------+-------------------+-----------------+------------------------------------+
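For anyone who wants to track these counts over time, the summary line above the table can be read programmatically. The following is a minimal sketch, assuming the table-style summary line looks exactly like the one printed in this thread; `parse_summary` is a hypothetical helper, not part of trivy, and trivy's JSON output is probably a more robust source if the table format changes.

```python
import re

# Matches a summary line such as:
#   Total: 8 (UNKNOWN: 0, LOW: 4, MEDIUM: 4, HIGH: 0, CRITICAL: 0)
SUMMARY_RE = re.compile(r"Total:\s*(\d+)\s*\((.*?)\)")


def parse_summary(line):
    """Return (total, {severity: count}) parsed from a summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    total = int(match.group(1))
    counts = {}
    # Each part looks like "LOW: 4"; split on the first colon.
    for part in match.group(2).split(","):
        name, _, value = part.partition(":")
        counts[name.strip()] = int(value)
    return total, counts


total, counts = parse_summary(
    "Total: 8 (UNKNOWN: 0, LOW: 4, MEDIUM: 4, HIGH: 0, CRITICAL: 0)"
)
```

For the rebuilt image above, the per-severity counts add up to the reported total of 8.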

Thank you for your work on this @consideRatio. I applied the libc-bin patch in #1715 if that is something that you all would consider.

If you are interested, I am also happy to contribute a PR for adding image scanning to CI.

If you are interested, I am also happy to contribute a PR for adding image scanning to CI.

Absolutely! We have been sticking mostly to Travis in the JupyterHub organization, so perhaps it would make sense to have this as a CI CronJob of some kind?

I set up a TravisCI CronJob like this before, here:
https://github.com/jupyterhub/configurable-http-proxy/blob/fdb3c717e250cbf6790489621119b694f67f4955/.travis.yml#L31-L45

Positive outcomes

I'm not sure about the details of what makes sense, but I figured it was important to try to describe the goal more concretely.

  • [ ] To have a CI CronJob inspecting our hub image that we can inspect logs from.

    • [ ] Inspection probably makes sense to do on the latest published image.

    • [ ] Inspection could also be made on a fresh build of the Dockerfile for comparison. All logic working with already built images could be made git-repo agnostic, but this part requires some repo-specific configuration.

  • [ ] To have a CI CronJob automatically doing something with GitHub, for example opening an issue about how the image has degraded since the last report, or similar.

Existing GitHub Actions?

While I personally prefer that we stick to TravisCI in general, as we typically do in the JupyterHub organization on GitHub, we have started using some GitHub Actions here and there. So, if there are existing pre-defined GitHub Actions that do most of the work, it may be worth adding one of those instead.

Aha! Perhaps this is the way to go!

It looks like a setup like the one in the example here would meet both positive outcomes without covering this sub-item:

Inspection could also be made on a fresh build of the Dockerfile for comparison. All logic working with already built images could be made git-repo agnostic, but this part requires some repo-specific configuration.

As you say, I think it could be more of an optional thing to cover this though. The biggest benefit, in my mind, of doing fresh builds against the Dockerfile would be that when issues are opened that show vulnerabilities in the published images, they could be checked against current builds to see whether all that is needed is a rebuild. This is valuable, but still requires intervention to actually push a new release.

It would seem that to add the most value we would add a trivy cron job with the following features:
1. Daily check
2. Pull latest published image for hub image from dockerhub
3. Create issues if there are vulnerabilities

Should this check all images built by this repo in the folder here?

  • Check every Monday or so instead of daily, which would be too often.
  • Checking all images would be nice.
  • I'm not sure if we have a "latest" tag pushed though, so running pip install chartpress and then chartpress --skip-build may be needed to get the tags written out at the moment. I think we should perhaps push a latest tag instead to avoid this complexity.

Did you evaluate the difference between this GitHub action and the official aquasecurity action? I assume the other is more mature, but it would be good to have a note on why we chose it instead of the official one at this point in time.

Did you evaluate the difference of this github action vs the official aquasecurity action?

Not yet, but I will do so and add notes.

Comparison between the two GH actions:

  1. Trivy Action

    • Pros:

      • Creates issues when vulnerabilities are found
      • More stars ⭐ FWIW

    • Cons:

      • No activity for the last 4 months
      • Requires more permissions since it needs a personal access token to generate issues

  2. Trivy Vulnerability Scanner

    • Pros:

      • From the team which makes trivy itself
      • Updated more recently (within past 25 days)

    • Cons:

      • Fewer stars ⭐ FWIW
      • Marked as _EXPERIMENTAL_ in the "About" section FWIW
      • Does not create issues, only outputs within the action itself

Overall, the one from Aquasecurity seems more geared toward being run as a part of CI/CD while the external one seems more geared toward being run as a cron job. I think both could be run either way, but just looking at the usage guidelines for each it seems their goals are oriented in this way.

@consideRatio What do you think? IMHO issue generation is more useful, which leads me to think "Trivy Action" is the better path. That said, I have heard some concern (from within my company) that opening issues in a public repo for vulnerabilities is similar to putting a "kick me" sign on your back.

@meneal thanks for this excellent summary!

Requires more permissions since it needs a personal access token to generate issues

I think that is totally fine, because it's a token that is available by default according to this documentation, if I understand things correctly.

I think it's a net positive to post an issue about it. When a container scanner runs, it scans for known vulnerabilities, I figure, and posting about having known vulnerabilities feels sensible. Posting about unknown vulnerabilities would be a very different thing and very bad practice, but I guess that posting about known ones should be a good thing that helps us update the image and helps users update, even though those with bad intent may get a reminder about the many known vulnerabilities that could be exploited.

With regards to running as a cronjob or as part of tests triggered by pushes etc., I'm quite confident both actions could run either way. It is the trigger of the job that runs the action that decides this. Having a CronJob that tests and helps make us take action is what I would like.

Now that I think about it, I lean towards using the official GitHub action, because a badge in the README.md could point to the CronJob status, and that would be quite a good signal to act on. As a stretch goal, one could aim to upstream some notification feature.

Regarding notifications btw, from the excellent documentation of a GitHub action created by @hamelsmu I learned that issue creation could be set up outside the GitHub action itself with another github-script action, see: https://github.com/marketplace/actions/repo2docker-action#cache-builds-on-mybinderorg-and-provide-a-link

Hmmm I lean towards:

  1. Daily CronJob
  2. Official GitHub action that notifies using a badge in the README.md file about the latest run scan
  3. Optional stretch goal: either upstream a notification feature to the aquasecurity action, or add another step in the job that would create an issue if needed, like done here. If we create issues, we may want to use a weekly CronJob or so, because opening a new issue on every run of a daily CronJob would be like spam; it would instead need to update an existing issue or similar.
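The "update an existing issue instead of opening a new one each run" idea from the stretch goal can be sketched as a small decision function. This is a hypothetical helper, not something either GitHub Action provides: `issue_action`, its arguments, and the example issue title are all assumptions for illustration, and fetching the open issue titles and acting on the result would be done by a separate step (for example the github-script action mentioned above).

```python
def issue_action(open_issue_titles, report_title, finding_count):
    """Decide what a scheduled scan should do with its report.

    Returns "none" when there is nothing to report, "update" when an
    open issue with the same title already exists, and "create" otherwise.
    """
    if finding_count == 0:
        return "none"
    if report_title in open_issue_titles:
        # Reuse the existing issue so repeated runs do not spam the repo.
        return "update"
    return "create"


# Hypothetical title; a stable title is what makes de-duplication possible.
title = "Vulnerabilities found in jupyterhub/k8s-hub"
action = issue_action(open_issue_titles=[title], report_title=title, finding_count=8)
```

With a fixed report title, a daily schedule would at most keep one issue updated rather than opening one per run.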

Just to verify one other part for this before starting to code something up. In your comment here you mentioned:

im not sure if we have a "latest" tag pushed though, so: pip install chartpress, and doing: chartpress --skip-build, may be needed to get the tags written out at the moment, but i think we should push a latest tag instead perhaps to avoid this complexity

Do you think a PR to push a latest tag is a first step on this, or should we take the pip install chartpress, chartpress --skip-build approach? Seems that the latest tag push is the better option from your comment.

Do you think a PR to push a latest tag is a first step on this, or should we take the pip install chartpress, chartpress --skip-build approach? Seems that the latest tag push is the better option from your comment.

Ah hmmm, so I think we should create a latest tag no matter what, but we could run some logic to extract the latest published tag using chartpress --skip-build and inspecting its output or reading the now modified jupyterhub/values.yaml file. It will be modified on paths described in chartpress.yaml for the images built in this repo.
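Reading the tag back out of the modified values file could look roughly like this. It is a sketch under assumptions: it presumes each image is written as a `name` and `tag` pair under the dotted path configured in chartpress.yaml (taken here to be `hub.image`), and it operates on an already parsed dictionary; loading jupyterhub/values.yaml into that dictionary would need a YAML parser such as PyYAML, which is not shown. `image_reference` is a hypothetical helper name.

```python
def image_reference(values, values_path):
    """Return "name:tag" for the image stored at a dotted path in values."""
    node = values
    # Walk the nested mapping one key at a time, e.g. "hub" then "image".
    for key in values_path.split("."):
        node = node[key]
    return f"{node['name']}:{node['tag']}"


# Stand-in for the parsed contents of jupyterhub/values.yaml after running
# chartpress --skip-build (the structure is an assumption, see above).
values = {
    "hub": {
        "image": {
            "name": "jupyterhub/k8s-hub",
            "tag": "0.9.0-n033.h8211ad2",
        }
    }
}

reference = image_reference(values, "hub.image")
```

The resulting string is in the form the trivy image command expects, so the same lookup could be repeated for each image path listed in chartpress.yaml.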

If you would like to resolve this without a workaround where we would inspect the latest published tag manually, it would probably mean implementing this feature in jupyterhub/chartpress: https://github.com/jupyterhub/chartpress/issues/94. It probably isn't so complicated to implement.

@meneal I'll work to implement this feature of chartpress right now so you get unblocked.

UPDATE: I didn't get https://github.com/jupyterhub/chartpress/issues/94 resolved, and my attention is a bit too split at the moment to take it on right now.
