Nixpkgs: Review security of nixpkgs commit process

Created on 1 Dec 2016  Â·  33Comments  Â·  Source: NixOS/nixpkgs

Issue description

Currently on nixos a lot of people can merge anything into master, and i and a few other nixos users are getting really scarred, as we are runnign nixos on production clusters and on laptops we use for critical production deployments. As i can see there are a lot of people(including myself) i have to trust with their pull requests, and there's no way i can review every commit. While we have stable releases, i have to run unstable on a lot of systems, since i have to use new pacakges, and i don't want and don't have time to backport eveything.

Possible solutions

  • Split nixpkgs repo into nixcore and nixpkgs

And allow only a few developers(maybe top 10) to merge into nixcore and eveyone else to nixpkgs. By core i mean a base nixos system(what is base nixos system should be defined)

  • Remove commit access for many developers(including myself)

This would be easiest solution, but would slow-down merge rate, which is not necessary that bad.

  • Create nixpkgs security monitor/bot

This bot will look for changes of derivation input hashes and report on any change over provided pull requests on any critical package, and at least m/N of maintainers, will have to aknowledge pull request for pull request to be merged. Any direct commits into master(excluding trusted bots/commiters), will be reported automatically. We used to have nixos monitor, what has happend with that project?

I want to start discussion on this critical issue and see others ideas. A lot of other distros have tough review proccesses for someone to become a core maintainer. I would also like to perform security review of nixpkgs and would possibly pay some security researcher to inject code into nixpkgs/nixos on purpose, to see how bad security is, if you are fine with that?

security policy discussion

Most helpful comment

I think having an option for nix to store the old nix expressions of installed packages and displaying a diff between the previous version and the new one before upgrading a package. And of course display the whole expression when installing a new package. This is how some AUR helpers work, and I think it would help in detecting whether a package was tampered with. Most updates will just involve changing the version variable and the hashes, so it wouldn't even be unrealistic to check all of the diffs before an upgrade.

I am afraid this is not gonna help at all. I used to think I was protecting myself when reading AUR pkgbuilds, but I ended up doing so just out of curiosity. Knowing whether a new patch is legit or even whether the source repo is the original one is very tricky, especially when you are building a dependency you know nothing about. When reviewing diffs, things are even worse.

Consider the following diff. It is the most common and the most innocent diff you can see in nixpkgs. The repo is the same, so as long as the maintainer of the source repository isn't evil, the worst thing, that can happen, is getting a not-really-stable version, isn't it? Turns out, it isn't, Git/GitHub are against us here (from their point of view, it is IMHO an expected behavior). 754d113b0e3144a145d50bde8370ff2cae98169c doesn't have to be in the official tree at all. It can be a commit of a not-yet-merged pull request created by anybody with a GitHub account including malicious actors. Without verifying the commit is a part of a known official branch either explicitly or by requesting the branch in the first place, truly anything can be fetched from the remote.

diff --git a/pkgs/applications/audio/non/default.nix b/pkgs/applications/audio/non/default.nix
index a7252b9e28a3..f4e5998c0375 100644
--- a/pkgs/applications/audio/non/default.nix
+++ b/pkgs/applications/audio/non/default.nix
@@ -4,12 +4,12 @@

 stdenv.mkDerivation rec {
   name = "non-${version}";
-  version = "2016-12-07";
+  version = "2017-03-29";
   src = fetchFromGitHub {
     owner = "original-male";
     repo = "non";
-    rev = "754d113b0e3144a145d50bde8370ff2cae98169c";
-    sha256 = "04h67vy966vys6krgjsxd7dph4z46r8c6maw1hascxlasy3bhhk0";
+    rev = "10c31e57291b6e42be53371567a722b62b32d220";
+    sha256 = "080rha4ffp7qycyg1mqcf4vj0s7z8qfvz6bxm0w29xgg2kkmb3fx";
   };

   buildInputs = [ pkgconfig python2 cairo libjpeg ntk libjack2 libsndfile

All 33 comments

@domenkozar @edolstra @matejc @copumpkin @joachifm

I have to run unstable on a lot of systems, since I have to use new packages, and I don't want and don't have time to backport eveything.

If you have a bit of HDD space to spare, you could at least have two
checkouts, install fresh packages from the unstable one, but run your
core system from the stable one.

It is possible to list (with import <nixpkgs-fresh> {}; gnused) in
systemPackages.

  1. Split nixpkgs repo into nixcore and nixpkgs

And allow only a few developers(maybe top 10) to merge into nixcore and eveyone else to nixpkgs.

What is core?

  1. Remove commit access for many developers(including myself)

This would be easiest solution, but would slow-down merge rate, which is not necessary that bad.

We do have a problem with merge backlog being too large and growing.
Putting pressure on fewer people to deal with it is probably going to
end badly.

  1. Create nixpkgs security monitor/bot

This bot will look for changes of derivation input hashes and report on any change over provided pull requests on any critical package, and at least m/N of maintainers, will have to aknowledge pull request for pull request to be merged. Any direct commits into master(excluding trusted bots/commiters), will be reported automatically. We used to have nixos monitor, what has happend with that project?

This seems to have the same judgement question as 1 (what is core), but
keeps the benefits of unified repo. I find this better than 1, and in
practice the core packages are expected to have a different policy on
the coordination level. This technical change could be beneficial for
the workflow from the point of view of codifying the expectations.

I want to start discussion on this critical issue and see others ideas. A lot of other distros have tough review proccesses for someone to become a core maintainer. I would also like to perform security review of nixpkgs and would possibly pay some security researcher to inject code into nixpkgs/nixos on purpose, to see how bad security is, if you are fine with that?

Given a few months of wall-clock time and 100 hours of work-time it
should be completely feasible to do for a reasonable definition of
NixPkgs code injection (one needs to do some good work, ask for commit
rights when the time comes naturally, hand-patch some stupid portability
problem in some package — we have enough corner cases where the fix is
actually needed — while introducing a vulnerability, bury such a commit
inside a large set of small fixes/updates). Good underhanded coding and
testing skills should allow doing the same inconspiciously for a package
installed by default… How well you need to masquarade the expoit patch
is an interesting question, but not my call to make, I just update stuff
so that LibreOffice keeps working most of the time…

What is core?

Core is ment as packages and nixos modules, that are required to run basic nixos system(what is basic nixos system should be defined of course).

One of the ideas i got is we could use signed commits and have public keys of maintainers in maintainers.nix. A bot would monitor signed commits and check if changed packages are maintained by the user that commited a change. Online merges would only be accepted if a maintainer or a few maintainers(if multiple would be required by package metadata) would comment with something like @merge The only exception would be superusers like @domenkozar @edolstra and some others. I would be prepared to code such a bot, but i would like to discuss other possible solutions.

One of the other idea i got is also to define a list of trusted packages during build. Any additional package would have to be made trusted during build. This could prevent rough dependency injection attacks.

A few thoughts

Is there a good writeup/analysis (preferably formalized) on this topic? I find it difficult to evaluate solutions without a clear set of objectives.

It is self evident that all we can hope to achieve is to raise the dollar cost of injecting bad code into nixpkgs. I share the concern of over engineering ourselves into a situation where we inconvenience honest people more than its worth in terms of dollar cost of attack. To my mind it makes the most sense to optimize for detection and recovery. Distrust is inefficient.

The "classic" solution is the web of trust. I and a few others sign our commits, but absent a WOT, that doesn't really do anything for security, as all it does is "prove" that the creator of the commit also controls the private key used to sign that particular commit. Regardless of the outcome of this discussion, I think it'd be worthwhile to coordinate to bootstrap a web of trust for nixpkgs.

Also cc @grahamc @fpletz

My 2 cents:

When it comes to a production environment, I have adopted the solution to run the latest stable release and those expressions not available (or not updated) in the release, I include separately from master. I tend to copy the expression into my config folder and then import into configuration.nix.
I do the same with expressions that are not available (i.e. the Sublime Text 3 Dev builds, which are only available for licensed users).

I understand the level of distrust in certain packages due to the rather cumbersome task of keeping track of what is being updated. However, there is a reason, the channel is called nixos-unstable :)
I have actually found the velocity of the project to be rather refreshing, since I tend to use the "latest and greatest" software to build solutions.

At the same time, the investments @grahamc and others have made into the vulnerability roundups and the efforts to use NixOS in commercial environments in more and more places do require a certain level of review. I am just starting to wonder if there should be some sort of solution purely for the stable releases. At the same time, not very many committers actually commit into the releases.
Personally, I have started signing all my commits and I generally only touch packages I understand. In terms of PRs, I build every single one before merging in. But when it comes to security, there is only so much you can do. Nobody has the time to spend hours on each PR to run a small security review.

I am interested to see what various people have to say about this. It would be sad to see people discouraged from contributing in any way. I think that should be in everybody's mind going forward. We did not show up in GitHub's list of projects with the most contributors without reason. We have made it comparably easy to contribute and have given those who have proven competence the ability to help others.

The "classic" solution is the web of trust. I and a few others sign our commits, but absent a WOT, that doesn't really do anything for security, as all it does is "prove" that the creator of the commit also controls the private key used to sign that particular commit. Regardless of the outcome of this discussion, I think it'd be worthwhile to coordinate to bootstrap a web of trust for nixpkgs.

Well, it does help with continuity even without WOT. After all, commit
rights are given not to «you» as identified by a nation-issued ID, they
are given to «you» the intelligence behind your old useful commits (on
the Internet nobody should care if you are a benevolent AI pretending to
be a human!); and continuity of the signatures increases the chances you
are still the same. After all, takeover of a GitHub account doesn't
require any fault from neither you nor GitHub (there is always the mail
provider); an unexpected change of signatures could trigger some
scrutiny of commits.

But with Git I only do what everyone is already doing, so I won't try
setting up this for me (or anything nontrivial with git) until there is
a page with a commonly recognized set of instructions.

@7c6f434c To me, what you describe is a workable protocol for assigning trust to a key. I don't think it's reasonable to check the commit history every time you want to authenticate a message, tarball, or what have you. Consider also that the WOT is not only for fellow developers, but also end-user recipients of nixexpr tarballs, Nix release tarballs, announcements, &c.

With your method, you still need to keep track of "good keys" and have a way of reaching agreement about which keys are the "good ones" (especially if a "good" key turns "bad" or needs to be rotated for whatever reason). At that point you're doing something that approximates WOT, so why not use the tooling support that exists?

With your method, you still need to keep track of "good keys" and have a way of reaching agreement about which keys are the "good ones" (especially if a "good" key turns "bad" or needs to be rotated whatever reason). At that point you're doing WOT, so why not use the tooling support that exists?

Yes, you are right, I wrote sloppily. But I guess we need to make sure
we don't mix up this WoT with the in-person-keysigning one. Neither kind
of trust implies the other one.

@7c6f434c agreed; I think it's pretty essential that people can become trusted without having to reveal their IRL identity. A history of good commits signed by the same key is a good protocol for bootstrapping trust while remaining anonymous, I think.

@7c6f434c agreed; I think it's pretty essential that people can become trusted without having to reveal their IRL identity. A history of good commits signed by the same key is a good protocol for bootstrapping trust while remaining anonymous, I think.

And the other way round, too: even willingness to _prove_ IRL identity
(not just declare it) should not count towards _technical_
trustworthiness.

In terms of PRs, I build every single one before merging in.

I usually believe people who explicitly claim they have tested on NixOS. It's better to spend time checking that the non-trivial changes are reasonable, i.e. changing the upstream site should be explicitly mentioned and explained in the messages etc.

I think @joachifm's point about a formalized writeup of what
_precisely_ you're looking to solve, separate from a list of solutions,
would be very helpful.


At a fundamental level, what we need is _trust_. We need to trust that
people merging and pushing code have the project's bests interests in
mind. All of these technical solutions may help add a sanity check,
but without trust they are nothing.

Well, it does help with continuity even without WOT. After all,
commit rights are given not to «you» as identified by a
nation-issued ID, they are given to «you» the intelligence behind
your old useful commits (on the Internet nobody should care if you
are a benevolent AI pretending to be a human!); and continuity of
the signatures increases the chances you are still the same.

Yes. Agreed.

I think having an option for nix to store the old nix expressions of installed packages and displaying a diff between the previous version and the new one before upgrading a package. And of course display the whole expression when installing a new package. This is how some AUR helpers work, and I think it would help in detecting whether a package was tampered with. Most updates will just involve changing the version variable and the hashes, so it wouldn't even be unrealistic to check all of the diffs before an upgrade.

We can actually implement @C0DEHERO's suggestion behind a flag to nix-env --upgrade (maybe --verbose or --show-diffs?)

It doesn't do much, but makes it easy for people willing to verify updates. I would even call it essential. If I was using nix in production, checking diffs of derivations would be on my protocol for updates, just like checking changelogs and verifying signatures, if any.

And another very fun problem: huge generated chunks of code for Ruby/NodeJS/etc. These create diffs that are unreadable for a human, and just rerunning the generating tool a day later will probably catch some version update upstream (and therefore give a different pile of text). Replacing source of some core Ruby package to a typosquatted domain could be a fun game to play…

If that counts by any means: Remove commit access for many developers is my preffered solution.
I said it before and I say it again: I do not have commit access and I do not want commit access because I think that giving away commit access is to critical.

Having 10 ppl MAX with commit access and the rest just pinging around would be _way_ better! One of the 10 core people would simply have to look at the list of people who ACKed a PR (the new github feature should be used for this) and merge if enough ppl ACK. A multi-level hierarchy is the way here, like the kernel community does. We are big enough to implement this in our process!

Having 10 ppl MAX with commit access and the rest just pinging around would be _way_ better! One of the 10 core people would simply have to look at the list of people who ACKed a PR (the new github feature should be used for this) and merge if enough ppl ACK. A multi-level hierarchy is the way here, like the kernel community does. We are big enough to implement this in our process!

«Like the kernel community does»: the list of merge requesters for
merges that personally Linus Torvalds has done for Linux 4.9 contains
99 different people. Nixpkgs has less committers total (although not by
much) and some of the committers are not active anymore.

We have around a 1000 contributors total. GitHub refuses to count Linux
kernel contributors because they are too many; they have more people
contribute to a single release.

We still have easy tasks left uncompleted because not enough people
care.

I did not say that we are comparable. But I consider it critical that we introduce a scalable process _early_ rather than failing while growing. Having a process soon helps us in a few weeks.

Having a process soon helps us in a few weeks.

Process is expensive per se, it takes effort to run.

I remember that half a year ago with 50 committers we observed growing
backlog of easy-to-review PRs. Why will it be better with only 10
trusted people? If you start relying on reviewers to ACK only good
stuff, make sure you do not get a situation that is even easier to
manipulate while also increasing the load.

And load increases backlog and backlog decreases eagerness of people to
join.

Are there any updates to this issue, please?

I think having an option for nix to store the old nix expressions of installed packages and displaying a diff between the previous version and the new one before upgrading a package. And of course display the whole expression when installing a new package. This is how some AUR helpers work, and I think it would help in detecting whether a package was tampered with. Most updates will just involve changing the version variable and the hashes, so it wouldn't even be unrealistic to check all of the diffs before an upgrade.

I am afraid this is not gonna help at all. I used to think I was protecting myself when reading AUR pkgbuilds, but I ended up doing so just out of curiosity. Knowing whether a new patch is legit or even whether the source repo is the original one is very tricky, especially when you are building a dependency you know nothing about. When reviewing diffs, things are even worse.

Consider the following diff. It is the most common and the most innocent diff you can see in nixpkgs. The repo is the same, so as long as the maintainer of the source repository isn't evil, the worst thing, that can happen, is getting a not-really-stable version, isn't it? Turns out, it isn't, Git/GitHub are against us here (from their point of view, it is IMHO an expected behavior). 754d113b0e3144a145d50bde8370ff2cae98169c doesn't have to be in the official tree at all. It can be a commit of a not-yet-merged pull request created by anybody with a GitHub account including malicious actors. Without verifying the commit is a part of a known official branch either explicitly or by requesting the branch in the first place, truly anything can be fetched from the remote.

diff --git a/pkgs/applications/audio/non/default.nix b/pkgs/applications/audio/non/default.nix
index a7252b9e28a3..f4e5998c0375 100644
--- a/pkgs/applications/audio/non/default.nix
+++ b/pkgs/applications/audio/non/default.nix
@@ -4,12 +4,12 @@

 stdenv.mkDerivation rec {
   name = "non-${version}";
-  version = "2016-12-07";
+  version = "2017-03-29";
   src = fetchFromGitHub {
     owner = "original-male";
     repo = "non";
-    rev = "754d113b0e3144a145d50bde8370ff2cae98169c";
-    sha256 = "04h67vy966vys6krgjsxd7dph4z46r8c6maw1hascxlasy3bhhk0";
+    rev = "10c31e57291b6e42be53371567a722b62b32d220";
+    sha256 = "080rha4ffp7qycyg1mqcf4vj0s7z8qfvz6bxm0w29xgg2kkmb3fx";
   };

   buildInputs = [ pkgconfig python2 cairo libjpeg ntk libjack2 libsndfile

@vojta001 wow, that is such an easy attack vector, and made even worse by how frequently our community posts our system configs, enabling targeted attacks. Would be pretty trivial for just about anyone to sneak this in. This strikes me as both a big problem but also one that could be solved with automation.

CC security team @grahamc @fpletz

@vojta001 As far as I can tell, this occurs when forking, and a pull request is not a prerequisite to a shared object space.

As an example, my only fork with commits diverging from master (without PR) is a minor tweak of horst3180/arc-firefox-theme, which adds a file solarize.sh in ed478116201678989dae2b7d0e2d15140ec0bc97. By plugging this into the tarball URL for the original repository, we can fetch my fork without me ever having made a PR to horst3180s repo:

$ curl -sL https://github.com/horst3180/arc-firefox-theme/archive/ed478116201678989dae2b7d0e2d15140ec0bc97.tar.gz | tar tz | grep solarize
arc-firefox-theme-ed478116201678989dae2b7d0e2d15140ec0bc97/arc-firefox-theme/chrome/browser/sass/solarize.sh

This extends to the .patch and .diff endpoints, which are sometimes used to pick specific commits as patches. Example

This doesn't really change the situation, checking every PR of a project was already unacceptable, I only mention this to prevent a conclusion of "There are no PRs against project Foo, so I don't need to check the commit revision".

As for scanning nixpkgs for these, I don't have a full solution. By modifying find-tarballs.nix, we can output the URLs of any derivation with url or urls attributes (careful, ~11GB RAM usage during testing):


find-urls.nix (Click to expand!)

with import <nixpkgs> { };
with lib;

{ expr }:

let

  root = expr;

  urls = map
    (drv: drv.urls or [ drv.url ])
    urlDependencies;

  urlDependencies =
    filter
      (drv: isDerivation drv && (drv ? url || drv ? urls))
      dependencies;

  dependencies = map (x: x.value) (genericClosure {
    startSet = map keyDrv (derivationsIn' root);
    operator = { key, value }: map keyDrv (immediateDependenciesOf value);
  });

  derivationsIn' = x:
    if !canEval x then []
    else if isDerivation x then optional (canEval x.drvPath) x
    else if isList x then concatLists (map derivationsIn' x)
    else if isAttrs x then concatLists (mapAttrsToList (n: v: addErrorContext "while finding tarballs in '${n}':" (derivationsIn' v)) x)
    else [ ];

  keyDrv = drv: if canEval drv.drvPath then { key = drv.drvPath; value = drv; } else { };

  immediateDependenciesOf = drv:
    concatLists (mapAttrsToList (n: v: derivationsIn v) (removeAttrs drv ["meta" "passthru"]));

  derivationsIn = x:
    if !canEval x then []
    else if isDerivation x then optional (canEval x.drvPath) x
    else if isList x then concatLists (map derivationsIn x)
    else [ ];

  canEval = val: (builtins.tryEval val).success;

in urls

We can now query (hopefully) all GitHub URLs used in nixpkgs without resorting to naive text search:

NIX_PATH=nixpkgs=channel:nixos-unstable nix-instantiate \
  --eval --json --strict ./find-urls.nix \
  --arg expr 'import <nixpkgs/maintainers/scripts/all-tarballs.nix>' \
  | jq -r '.[][]' | sort -u | grep github

From these, we can filter URLs with references to commits, and automatically check if the commit is contained in a specific repository, or perhaps present in a branch of it.

... or we could, if there was a cheap way to check membership/reachability of a git commit without a full clone. GitHub somehow lazy-loads the branch membership information when viewing a commit, but I couldn't find this information in either the v3 or v4 API. I looked at git ls-remote and git fetch-pack, and they didn't seem suited to even answering "Is this commit contained in this remote repository", let alone "Is this commit reachable via a branch of this remote repository". Then again, I'm only vaguely familiar with git internals, and there may indeed be a way.

Cloning every referenced repository probably isn't all that costly, but I would prefer a less resource-hungry (traffic and storage)
method of checking GitHub URLs in nixpkgs.

This is actually horrifying. I would even go so far that this is a "bug" at github's end. EDIT: I think this is unfixable at github's end as the git repo should contain a ref to all pull requests and therefore probably also the commit.

Also concerning the merge permissions (the original discussion) are branch protections enabled for master? Because requiring signed commits, successful status checks, at least 2-3 approving reviewers, blocking on a change-requesting review (both by members only) etc. would help a lot to increase security in my opinion.

Even if GitHub fixes this (they might be able to do it, it may however cause a huge performance hit) we are not completely safe. Malicious actors might not be able to sneak their own code in, for targeted attacks, as @tbenst pointed out, they can just downgrade a package to an old vulnerable version and later exploit it.

So the only conclusion IMO is, that changes touching a version and a hash are just as concerning as complete rewrites/new packages/whatever. We might however create some automated tools to aid reviewers with determining the legitimacy of changes as @tilpner suggests.

I wonder if a hydra job or a merge check could validate SHA once rather than on everyone’s machine? Validate using the nix-prefetch-* tools..?

The problem is also that we would need guidelines for validation. e.g. the fetchFromGithub and fetchGit should probably contain a "tag", "release", "ref" or "commit-date" attribute against which we could validate the commit. Otherwise lots of vulnerabilities like downgrading would still work.

How feasible would it be to have a pre-commit hook that does the following?
The repo and version should be available via nix-instantiate.

git clone https://github.com/horst3180/arc-firefox-theme.git --branch master --single-branch
git branch --contains ed478116201678989dae2b7d0e2d15140ec0bc97 | colrm 1 2 | grep '^master$'

Just throwing it in here:
--filter=tree:0 --no-checkoutmay help and git name-rev --name-only $sha is also way faster.
And maybe git config remote.origin.fetch '+refs/pull/*:refs/remotes/origin/pull/*' is needed (when not knowing the ref but searching for it).

Would like to know how the original question resolved itself, if it has.
Using nix to do high security payloads seems a natural application since there is a deterministic, therefore verifiable path between the source, build process and tools. Hence an audit of that process is an actual, not formal thing, with nothing left to trust.

But it has been suggested by some folks that there is risk in what they view as a highly informal system, compared to, say, ubuntu package repo, where trust the trust model is really trust of the organisation rather than the build process.

Is there some writeup on that subject I could reference that could put these concerns to rest? If not, could I help to create one?

@yakman2020 I personally recommend you to read this whole issue (or at least skim over it as there is lots of off topic stuff and lengthy discussions) but I will try to summarize:
There are currently the following security issues discussed here:

  • The commit process itself e.g. who can commit! and merge and the trust issues with that
  • Probably also indirectly whether master is a protected branch and what are the required things to get a pr merged (minimum required approves, succeeding checks etc. https://github.com/NixOS/nixpkgs/issues/20836#issuecomment-695037310)
  • The issue that github patch, diff, tarball, etc urls can reference commit hashes of forks (which is probably the major non-human security issue currently. https://github.com/NixOS/nixpkgs/issues/20836#issuecomment-694966864 contains ideas on how to mitigate this.
  • Downgrades to a vulnerable version (when hashes or branches are referenced)

    • Probably more packages should support auto-updates so this is prevented (e.g "updating" from 1.0 to vulnerable 1.1 instead of the shortly after released 1.1.1). This also improves updating efforts which would mean faster updates and quicker fixes (also of security issues)

So I think many of the concerns are still valid (at least to an extent).

Was this page helpful?
0 / 5 - 0 ratings