Security-wg: Migrate vulnerability database to it own repo

Created on 31 Jul 2018  路  25Comments  路  Source: nodejs/security-wg

As many suggested, we should move the vulnerability database (core + ecosystem) to its own repo.
Security Advisories repository: https://github.com/nodejs/security-advisories

Repository Structure

Suggested repo structure would be

package.json
README.md
core
    README.md
    ...json
npm
    README.md
    ...json

Open Questions

  1. should we also build and push an npm package with the vuln db?
  2. should we change the current identifier (the nswg-eco / nswg-core) that is based on the running ids? if so, to what?

Action Items

  • [x] transform the existing format to the new structure
  • [ ] announce the change via Twitter and official channels (we can contact Zibby Keaton)
  • [ ] announce the change via the official Node.js website by PRing an announcement there
  • [ ] an on/off switch for the vuln db in the current sec wg repo to be able to revert changes as needed (we can change the name of the directory or have a commit that removes the vuln_db directory, and later revert the commit if needed to restore immediately)
  • [ ] make sure we sync data between this repo and the vuln repo during the announce phase so consumers can already start playing around with it.
DB security-wg-agenda

Most helpful comment

So I have started to work on that. Please check PR https://github.com/nodejs/security-advisories/pull/2 . the result of the script is in PR https://github.com/nodejs/security-advisories/pull/3

Also, please take a look at the other's rep roadmap https://github.com/nodejs/security-advisories/issues/4

All 25 comments

:+1: from me for moving it.

As for directory structure, I would recommend you add a second layer under npm for the package name. Otherwise, you'll have a giant folder of IDs.

Looks how https://github.com/rubysec/ruby-advisory-db/ does it (under the gems directory).

@vdeturckheim we probably can't rush it such as doing it after next wg session as we should be following the same process as my proposal regarding vuln db updates (#351) to avoid interruptions to db consumers.

Also, if we have the opportunity of making some changes in term of the db location then I think it's worth thinking about how to structure the vuln db (we had some discussions before in terms of directory structure, etc), and maybe bring up the npm package again for discussion too?

馃憤 for moving, too. I am curious what drives the urgency?

@lirantal fair enough!

@MarcinHoppe I'm building a new integration for this db: push everything to Algolia to offer a free to use API/search API ^^

can't wait for that API already ;-)

I am also up for separating this. I like the directory structure that bundle audit has and it seems to work well there, we should look into adapting it.
We probably should have some tooling that will:

  • transform the existing format to the new structure
  • make sure we sync data between this repo and the vuln repo during the announce phase so consumers can already start playing around with it.

Not sure if packaging it as an npm module makes so much sense, but I have no strong opinions on it. I am looking forward to the API already.

I am happy to support the efforts with a dedicated repo.

Sounds lovely Andreas!
Let's bring up all the points and summarize them in the first issue comment.

should we also build and push an npm package with the vuln db

Probably pretty heavy and I don't see any big value if we have the Algolia API

should we change the current identifier (the nswg-eco / nswg-core) that is based on the running ids? if so, to what?

I'd be keeping numbering system for core.
For ecosystem, as suggested, we could move to multiple directory structure. The current system would not make much sense indeed.

Great idea to create a dedicated repo, publish as npm whould be nice. However, publish the advisories on a website such as https://security.nodejs.org/advisories/ so that clients can fetch the vulnerability infos easily via http whould be more important from my perspective. This would allow to gather them easily for multiple purposes, e.g.

+1 on a separate repo, but as mentioned we should follow the process we are working on to let people know in advance before changing where they need to get the data from.

Reference: https://github.com/nodejs/security-advisories

@vdeturckheim maybe we can update the description of the issue at the top with progress/action items? I edited it a bit to summarize the information.

@pxlpnk about the new eco vuln db system - each module to its own folder, how would the files be named? I see that bundle advisories use both an OSVDB or a CVE assignment from MITRE

As we do not always have a CVE/OSVDB I think to keep the numeric id we currently have could make sense. This might result in problems with performance and maintainability (I don't have concrete examples though).

As long as the ID is unique I don't think it matters too much. WDYT?

The incrementing sequential id number that we're using is not a good system IMO, at least not in the way it is currently implemented on a manual file being created. It is error prone as we see when needing to check open PRs manually to see if an id is taken already, or to fetch the latest repository update and check what's the highest id that is used and that isn't guaranteed either as someone else might simultaneously do the same thing and open another PR in parallel using the same id number.

If we want to use a unique sequential id like it is today we should probably start using a database to manage this for us. Another way is maybe to use something like cuid to auto-generate these ids (although for cuid specifically the fingerprint bit might raise some eyebrows).

Thinking about this more, the simple sequential ids probably won't cut it for long and lead to issues if self-picked.
The id itself - except for being an identifier - does not have any additional value/function. So having a "ticketing system" that hands out and reserves the id seems like an idea worth exploring.
Essentially like CVE/OSVDB handle it.
The only problem we are solving with this is that we can handle and publish vulnerabilities that do not/would not get one of those IDS and can be independent. However, it adds the overhead for us to operate and maintain this DB.

A quick search and some rough estimates suggest we have existing CVEs for maybe ~50% https://github.com/nodejs/security-wg/search?l=JSON&q=cves&type=

My guts tell me that having our own numbering system might be the way to go, or are we overcomplicating and engineering the problem?

or just use a UUID v4 instead of a number, so you do not need a system providing the identifiers.

@pxlpnk

The id itself - except for being an identifier - does not have any additional value/function. So having a "ticketing system" that hands out and reserves the id seems like an idea worth exploring.

Exactly.

The only problem we are solving with this is that we can handle and publish vulnerabilities that do not/would not get one of those IDS and can be independent. However, it adds the overhead for us to operate and maintain this DB.

There is another problem that we are solving here - maybe a report could be correlated with several CVEs (often happens in security advisories for OSs) so you'd want to have an identifier for a report that may span a mentioning of several CVE ids.

Currently we do have our own numbering system, it's those sequential ids :-)

@bufferoverflow that's what I was trying to approach with my suggestion for cuid, uuid v4, hyperid, and all those variants but need to think which one of them would be a better fit for this use-case. uuid4 is rather long and usually security advisories are being referred to using their identifiers, for example PSA: NSWG-ECO-100 A recent vulnerability in foobar disclosed

I agree UUID's are to long and NSWG-ECO-7 or NSWG-COR-7 is better.
Maybe just use a UUID as long as no sequential id has been assigned by the core team.

CVE is already an array within the json files, so I do not see an issue with several CVE's.

How about if we use something in between, like say convert the current millisecond timestamp to a large base representation?

i.e:

new Date().getTime().toString(36)
// something like jlmn6im3

that would be sustainable for quite a while

WDYT?

That sounds like a reasonable approach @lirantal and anyone can execute that on their local machine to generate a new idea. 馃憦

Great! Even if we agree on this we'll need to announce the change before which will take time (60 days) so let's make sure we're aligned first.

@nodejs/security-triage @nodejs/security-wg can WDYT about https://github.com/nodejs/security-wg/issues/359#issuecomment-418177154 as a method for auto-generating ids for reports?

I am happy with it. I think its own repo makes a lot of sense. Also, will migration be a good time to restructure it? Too many files in the same folder is painful to work with.

@dgonzalez: @reedloden suggested we look into how ruby-sec is doing it: https://github.com/nodejs/security-wg/issues/359#issuecomment-409286741

So I have started to work on that. Please check PR https://github.com/nodejs/security-advisories/pull/2 . the result of the script is in PR https://github.com/nodejs/security-advisories/pull/3

Also, please take a look at the other's rep roadmap https://github.com/nodejs/security-advisories/issues/4

Finally moved forward, repo is fully set up and there is a daily sync up thanks to travis.

closing in favor of #494

Was this page helpful?
0 / 5 - 0 ratings

Related issues

mhdawson picture mhdawson  路  5Comments

lirantal picture lirantal  路  7Comments

mhdawson picture mhdawson  路  5Comments

vdeturckheim picture vdeturckheim  路  8Comments

victor1342 picture victor1342  路  4Comments