Nix: Restrict fixed-output derivations

Created on 4 Jul 2018 · 49 comments · Source: NixOS/nix

People have started (ab)using fixed-output derivations to introduce large impurities into Nix build processes. For example, fetchcargo in Nixpkgs takes a Cargo.lock file as an input and produces an output containing all the dependencies specified in the Cargo.lock file. This is impure, but it works because fetchcargo is a fixed-output derivation. Such impurities are bad for reproducibility because the dependencies on external files are completely implicit: there is no way to tell from the derivation graph that the derivation depends on a bunch of crates fetched from the Internet.

You could argue that fetchurl has the same problem, but fetchurl has simple semantics (fetching a file from a URL) and is more-or-less visible in the derivation graph. This allows tools like maintainers/scripts/copy-tarballs.pl to mirror fetchurl files to ensure reproducibility.

Proposed solution: Add a new sandboxing mode where fixed-output derivations are not allowed to access the network (just like regular derivations). In this mode, only builtin derivations like builtin:fetchurl would be allowed to fetch files from the network. This mode should become the default at some point.
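
For reference, what makes a derivation fixed-output is the trio of `outputHash*` attributes; declaring them is what currently lifts the network restriction for that build. A minimal sketch (the URL and hash below are placeholder values, not taken from this thread):

```nix
# Sketch of a fixed-output derivation. The url and sha256 are placeholders.
derivation {
  name = "example-src";
  system = builtins.currentSystem;
  builder = "/bin/sh";
  args = [ "-c" "curl -fo $out https://example.org/src.tar.gz" ];

  # These attributes make the derivation fixed-output: the builder gets
  # network access, and Nix only verifies the result against outputHash.
  outputHashMode = "flat";
  outputHashAlgo = "sha256";
  outputHash = "0000000000000000000000000000000000000000000000000000000000000000";
}
```

Nothing in the derivation graph records which hosts such a builder actually contacted, which is the opacity this issue is about.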

We would also need builtin:fetchGit to replace fetchGit in Nixpkgs, etc.

@cleverca22 pointed out that fixed-output derivations allow shenanigans like opening a reverse interactive shell into the build server, so that's another reason for removing network access.

Label: feature


All 49 comments

The main problem with this, of course, is that all fetchers will need to be built into Nix or provided as plugins.

An alternative approach is what we're doing in nix-fetchers: push more of this to evaluation time (see e.g. fetch-pypi-hash).

For example, fetchcargo in Nixpkgs takes a Cargo.lock file as an input and produces an output containing all the dependencies specified in the Cargo.lock file. This is impure, but it works because fetchcargo is a fixed-output derivation.

It's not clear to me why this is _impure_, at least by my definition of the word. How do you define purity? If the lock file (lots of languages have a similar notion) is specified narrowly/precisely enough, it produces the same outputs every time. And if not, the hash won't validate.

The issue arises when you cache the result of these things and don't notice that your input wasn't sufficiently well specified. This brings me back to #520 where I talk about a semi-formal notion of a "lock file", where fetchers (potentially nondeterministic) are expected to produce a fully locked down "lock file" that is supposed to be feedable back into the process in order to reproduce it (and can and should be tested regularly for consistency).
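
The semi-formal notion argued for here can be sketched. Below is a toy check (in Python, against an invented JSON-ish lock format; nothing here is a real Nix or Cargo schema) that a lock file is "fully locked down", i.e. every dependency pins a content hash and could therefore be re-fetched reproducibly:

```python
# Hypothetical lock-file model: an input is fully locked only if every
# dependency entry pins both an exact version and a content hash.
def is_fully_locked(lock: dict) -> bool:
    deps = lock.get("dependencies", [])
    return bool(deps) and all(
        entry.get("version") and entry.get("sha256")
        for entry in deps
    )

locked = {"dependencies": [
    {"name": "foo", "version": "1.0.0", "sha256": "ab12cd34"},
]}
partial = {"dependencies": [
    {"name": "bar", "version": "2.0.0"},  # no content hash: not reproducible
]}
```

A fetcher producing such a file could be nondeterministic itself, as long as its output passes this kind of check and can be fed back in to reproduce the fetch.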

I'm pretty strongly against forcing all fetchers to be built into Nix from now on. One of its biggest selling points for me is the ease of writing new fetchers and how clean that model is, and I don't honestly think that this is violating any of that. I'd be very sad to see today's model of FO derivations go away, despite its issues.

Re: fetchcargo: if the upstream server allowed posting a Cargo.lock and provided a tarball of the relevant dependencies, the functionality wouldn't be much worse than fetchurl; probably better than a submodule-aware call to fetchgit. They don't provide that, so fetchcargo does what fetchgit does. Is that so bad?

What is the problem to solve? Unlimited network access does raise security questions; maybe the additional fetchers should have to precommit to only talking to a fixed list of domain names and ports (maybe with subnet blacklist to prevent resolving to local IPs, and with a clear ban on listening on external interfaces)? Maybe we could define the notion of fetcher so that they can be written as today, then administrator has to trust them via a mechanism similar to substituters?

@copumpkin A function is impure if it depends on something other than its inputs. Fetchers depend on the network, so they are impure. The only mitigating aspect of fixed-output derivations is that the impurity is controlled in the sense that the output is verified to have a certain hash.

This is a real problem for Nix's reproducibility: for example, fetchurl calls frequently fail because a file disappeared. But at least with fetchurl, there is a generic method to mirror all fetchurl calls in a derivation.

@volth Implicitly we're already doing that, since cache.nixos.org acts as a backup of FODs. However there are a few issues:

  • The binary cache could be garbage-collected in the future, so it's probably best not to rely on it.

  • With functions like fetchcargo the granularity is not ideal: each fetchcargo output will contain dozens or hundreds of crates, so you get a lot of duplication that would be avoided if the crates were mirrored individually.

  • Store path hashes depend on the store prefix, so cache.nixos.org can only act as a mirror for people using /nix/store.

  • You don't get the download progress indicator of nix build.

OK, so the main point is probably ensuring sane granularity for caching (and that's a very good point).

At the same time, having all the fetch[VCS] functions be Nix plugins would be annoying from the point of view of the dependency structure, if nothing else.

Maybe network support for FODs should be optional (default off), with just a Nixpkgs-level policy of non-duplication (which would be explicit)? Maybe with store deduplication eventually complaining if there is duplication between FOD outputs.

(Also, does it mean that we want to eventually deduplicate Linux kernel source tarball contents across versions?)

GC policies for FOD could be different than for normal builds.

Download progress indicators for binary caches would be nice anyway; LibreOffice's build output is large, so I expect this to be a transient problem. A path-rewriting substituter for FODs is also likely to happen at some point, because the hard part is simply absent in this use case.

As an initial thought: what about fetching things that don't use HTTP(S)? E.g. if we needed a fetcher for FTP, a custom protocol, etc.? Maybe some form of plugin approach could help?

I am the maintainer of https://github.com/NixOS/nixpkgs/blob/3ff636fb2e756ac57d7f0007dc2c6c2425401997/pkgs/development/compilers/ldc/default.nix and the only reason I need to use a fixed-output derivation is that I want to run the unit tests, and the socket implementation is tested via the loopback address (127.0.0.1).
See https://github.com/dlang/phobos/blob/v2.084.1/std/socket.d#L779 for example.

I fail to see how the loopback address can introduce any impurities.
I guess it's hard to implement an exception into the sandbox for the loopback address.

If you are going to deprecate fixed-output derivations, I would like to know how I should change this derivation in the long term.
For sure, I could simply not run the tests, but my goal has always been to run all available tests, and doing so has proved useful.

You shouldn't need a fixed-output derivation for that. Regular derivations run in a network namespace where they have their own loopback interface:

$ nix-build -E 'with import <nixpkgs> {}; runCommand "foo" { buildInputs = [ iproute ]; } "ip -4 a"'
these derivations will be built:
  /nix/store/pb341wyxv056mh847fkpj7zybs1j623d-foo.drv
building '/nix/store/pb341wyxv056mh847fkpj7zybs1j623d-foo.drv'...
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever

That's right, but the problem is that h_addr_list is empty if sandboxing is enabled in the test below.
Without sandboxing, printf outputs 127.

let
  nixpkgsRev = "a44784e81181c971a41c588d93a6cf4bbd1a394c";
  nixpkgs = builtins.fetchTarball "https://github.com/NixOS/nixpkgs/archive/${nixpkgsRev}.tar.gz";
  pkgs = import nixpkgs {};

  file = pkgs.writeText "test.cpp" ''
    #include <sys/socket.h>
    #include <netdb.h>
    #include <arpa/inet.h>
    #include <stdio.h>
    int main()
    {
        struct hostent *he;
        struct in_addr ipv4addr;
        struct in6_addr ipv6addr;

        inet_pton(AF_INET, "127.0.0.1", &ipv4addr);
        he = gethostbyaddr(&ipv4addr, sizeof ipv4addr, AF_INET);
        printf("first addr: %i\n", (int)*he->h_addr_list[0]);
        return 0;
    }
  '';
in
  pkgs.runCommand "compile" {} ''
    ${pkgs.iproute}/bin/ip -4 a
    mkdir $out
    cd $out
    ${pkgs.clang}/bin/clang ${file} -o test
    ./test
  ''

I don't like the idea of Nix trying to solve all the package managers in the world. There is already enough magic with the lang2nix tools, and I don't see what is wrong with the approach we have for Rust and Go, where we are able to produce a deterministic package dependency set. It is not nice and shiny, but it reduces the magic and gets the job done.

I would much rather have a solution that is known to work than a solution that tries to emulate the "right behaviour" for the sake of addressing the issues of package managers.

The problem is that fixed-output derivations allow you to have irreproducible builds (requireFile would be the canonical example).

OK, if I understand correctly, the problem is that fixed-output derivations can be arbitrarily complex, as there are no boundaries on where information can be retrieved from the Internet. At the same time you lose the dependency graph, and it becomes time-consuming to rebuild fixed-output derivations.

Let's say that the package managers Nix tries to wrap can be arbitrarily complex and can change at any point in time, so it's hard to have a native Nix implementation, especially one that works correctly. What I see as an alternative is to limit what fixed-output derivations can do:

  • limitations on Internet access (restricting where impure derivations can connect), thus removing possible sources of impurity and improving security.
  • incremental builds (having the state of previous builds available during new builds, so fetching becomes faster).
  • the ability to split build results into smaller parts, similar to what multiple outputs do, but dynamically. This would allow putting different dependencies into different outputs and thus improve caching, as you would only have to download some of the outputs.

The other way is of course trying to wrap other package managers, but this brings in a lot of complexity, as you need lang2nix, whether as part of the build or as a pre-evaluation code generator.

I was curious how much the inefficiency of buildGoModule and friends in fetching sources actually affects us, so I went ahead and made a small experiment to see:
https://gist.github.com/adisbladis/5a6805d329326e828bc599fb18cbc058.

Cache busting is imho one of the more serious practical issues with the current model of fixed-output derivations.

So for buildGoModule we would save half the downloads (median 1.5) if we specified each dependency explicitly. I am not quite sold on having to generate a deps.nix, since it increases the evaluation time and takes more effort than a single checksum.

@Mic92 That's not exactly the case: the 1.5 number is only correct within a single nixpkgs evaluation, so 1.5 is only the immediate up-front saving.
Any change to the dependency graph using the FOD packaging model will cause the _entire_ cache for that derivation to be busted.
The only way to measure the true impact would be to measure a package over time.

The justification I've heard for tracking each dependency individually is the "dependency alignment" in Fedora and similar Linux distributions. The idea is that each library should appear in only a single version in the whole distro, preferably as a .so file, to reduce maintenance effort. That depends on the kind of maintenance, obviously. In Nixpkgs, maintenance is IMO mostly about tracking upstream closely, so the benefits aren't there.

A second argument for tracking individual dependencies would be keeping track of bugs (including security bugs). Answering the question "Is my system affected through a transitive dependency that some package pulled in?" gets tricky without it.

The data-transfer savings and the caching granularity mentioned before seem like the most relevant arguments to me.

If working with deps.nix files can be made sufficiently painless and performant for all the build tools involved, I guess I am up for some of the more "advanced" solutions in https://github.com/NixOS/nixpkgs/issues/65275#issuecomment-514139329

@offlinehacker The day might come when the "Open Buildgraph Specification" (just made it up) releases a format that build tools can use to consume/provide each others outputs/inputs w/ some propagated metadata, and the build problem will be thus solved :package:

Coming here after losing 2 days from being tricked by a fixed-output derivation (https://github.com/NixOS/nixpkgs/issues/66598).

I had overridden curl in a way that was incorrect when built via fetchurl, but didn't notice it until many days (months?) later because the thing that was fetched was fixed-output and so the curl was never built.

Thus I would argue that even

You could argue that fetchurl has the same problem, but fetchurl has simple semantics

wasn't true for me -- it tricked me badly.

Here a few questions:

  • Would it be possible to have a flag in which nix builds all dependencies, even of fixed-output derivations?

    • For example, even if fetchpatch-mything.patch is fixed-output, if nix can see that its .drv depends on curl.drv, force it to build that curl.drv?

  • Would it be possible to have a flag or functionality to re-fetch all fixed-output derivations, and check whether they produce the same hashes?
  • Why does fetchurl/default.nix not simply use builtins.fetchurl?

    • For example, in pkgsMusl we override curl but never see whether it actually works when invoked via fetchurl, because usually all that fixed-output stuff is already in the Nix store. It seems unnecessary that we even invoke programs from the nixpkgs package set / overlay to fetch stuff from the Internet when there's a Nix primop specifically designed for that. Am I missing something? If the code path pkgsMusl.mypackage -> fetchpatch -> fetchurl -> pkgsMusl.curl didn't even exist (because builtins.fetchurl was used), we couldn't get it wrong.

  • Would it be possible to have a flag or functionality to re-fetch all fixed-output derivations, and check whether they produce the same hashes?

This seems possible now with nix-store -qR to get the dependency list, filtering with nix-store -q -b outputHash to get fixed-output derivations, and nix-store -r --check to rebuild them.
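
Stitched together, that could look roughly like the following untested sketch (shell pseudocode against a placeholder .drv path; exact flag spellings may differ between Nix versions):

```sh
# Untested sketch: re-fetch and re-verify every fixed-output dependency of
# a derivation. FODs are identified here by carrying an outputHash binding.
for drv in $(nix-store -qR /nix/store/...-mypackage.drv); do
  case "$drv" in
    *.drv)
      if [ -n "$(nix-store -q --binding outputHash "$drv" 2>/dev/null)" ]; then
        nix-store -r --check "$drv"
      fi
      ;;
  esac
done
```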

This probably won't work if the goal is to redownload FODs using the new curl, because the .drv files in the Nix store contain the path to the old curl.

I would hope that nix-store -qR on a .drv file instantiated from the new version would refer to the new curl, though.

So the .drv files have to be re-instantiated, and Nix expressions for all FODs are needed (and here we are back to the task of https://discourse.nixos.org/t/using-nixos-in-an-isolated-environment/3369/15).

Well, unlike there, we do have a definition of «all» that is based on an easily evaluatable derivation.

@edolstra Would this sort of situation be improved if recursive nix were possible? E.g. maybe we could have an HTTP proxy on 127.0.0.1 to which the package managers' HTTP requests get redirected, and which serves the requests from store contents after fetchUrl-ing the data? That way you have granular insight into what stuff the package managers are fetching, without having to lift out all the logic that causes them to make those requests out into nix.
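
The proxy idea can be sketched independently of Nix: requests are keyed by URL, misses go to a real fetcher (injected here as a plain function so the sketch stays self-contained), and the request log provides exactly the granular insight described above. None of this is Nix's actual API.

```python
import hashlib

class CachingFetcher:
    """Toy model of a store-backed HTTP proxy for package-manager fetches."""

    def __init__(self, fetch):
        self.fetch = fetch   # the real downloader, e.g. something fetchurl-like
        self.store = {}      # url-key -> content (stand-in for the Nix store)
        self.log = []        # every URL requested: the missing graph insight

    def get(self, url: str) -> bytes:
        key = hashlib.sha256(url.encode()).hexdigest()
        self.log.append(url)
        if key not in self.store:          # only misses touch the network
            self.store[key] = self.fetch(url)
        return self.store[key]

proxy = CachingFetcher(fetch=lambda url: b"crate-bytes")
url = "https://crates.io/api/v1/crates/serde/1.0.0/download"
first = proxy.get(url)
second = proxy.get(url)  # served from the store, no second download
```

The interesting part is that the log is per-URL, so each crate or module could be mirrored and cached individually instead of as one monolithic FOD output.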

I like the idea of having an HTTP proxy too.

As a TODO, we also need svn fetching for Go, now that Go has added svn download support (we could likely reuse git: https://git-scm.com/docs/git-svn).

Yeah, I mean, I just don't see this as very realistic. I need a pijul fetcher, among other things that simply will not ever belong in Nix.

If we're restricting fetchers, would removing fixed outputs as well be useful? Right now it's super confusing because you change a URL (http://foo1 -> http://foo2) but forget to change the hash, and nothing breaks.

If instead the fetch was:
1) Look up the URL in the Nix mirror.
2) Fetch the output (either from the mirror or the source URL).
3) Compare the hash of the downloaded output to the hash specified as an argument to the fetch function.

This would be super useful for the workflow of updating hashes.

There is a bit of a technical discussion around what the final derivation hash should include (URL + content, or just the content hash), but as long as we check the mapping without blindly trusting it, we'd be in a much better state, and people wouldn't have surprisingly painful failures.
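
To make the trade-off concrete, here is a toy sketch (not Nix's real store-path algorithm) contrasting content-only addressing with URL-plus-content addressing. Only the latter busts the cache when a URL changes but the hash update is forgotten:

```python
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()[:12]

# Content-only addressing (today's FOD model): the path ignores the URL,
# so editing the URL without updating the hash reuses the stale output.
def content_addr(content_hash: str) -> str:
    return digest(content_hash.encode())

# URL + content addressing: a URL change yields a new path, forcing a
# re-fetch that would expose the forgotten hash update.
def url_content_addr(url: str, content_hash: str) -> str:
    return digest(url.encode() + b":" + content_hash.encode())

old = url_content_addr("http://foo1", "sha256-abc")
new = url_content_addr("http://foo2", "sha256-abc")  # same hash, new URL
```

The cost, as noted later in the thread, is that anything folded into the path (mirror lists, fetcher inputs) now triggers rebuilds when it changes.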

IMO the real devil here is outputHashMode = recursive. If you just have a single tar.gz file, it's trivial to stick in a hashed-mirror / FTP server, and it's trivially compatible with the Software Heritage archive; but if you have a big directory full of something like Bazel outputs it's hard to figure out how to actually reproduce that without a fully working Bazel in your nix store capable of running on the host.

This matters a lot more if you aren't using the community binary cache, or if you have nix stores at different prefixes that could share hashed-mirrors but not binary caches, or if your build servers are air-gapped and just have access to a hashed-mirror.

+1 for Bazel and Buck (and other complicated download systems, i.e. i2p, IPFS, Scuttlebutt git repos) showing the power and philosophy of Nix. If you were trying to do nixpkgs as a Bazel repo, you would have to implement so much stuff, probably including HTTP adapters as a service for the weird cases. Nix bypasses this by building Bazel (correctly and securely) and then using it to do the fetches. That allows so much to get done so cheaply while integrating it into the Nix ecosystem.

There are definitely issues with the current setup, but they seem fixable. For reproducible downloads, I'd love an option to have Nix by default run fetchers 2 or 3 times to make sure they're deterministic (if that's expensive, at least doing it at review time). I'd love a download checker that alerts on upstream changes (the Nix mirrors protect against this, but we should probably either update the hash or change the URL to point only at the mirrors). I'd love a good fix for forgetting to update a hash. I'd love to be able to copy upstream hashes into the system for verification, so we could confirm that no one tampered with it between us and them.


This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/speeding-up-rust-app-packaging/6384/14

Obviously, this is a huge thread and a lot of different problems are mentioned.

There is one that could be improved quite easily: fixed-output derivations are only rebuilt if their output path changes. Their output paths in turn depend on the name and the outputHash. As has been mentioned repeatedly, this has wasted hours, days, and months of error hunting.

I found this email from 2014 mentioning a solution: make fixed-output derivations also depend on their inputs.

Nix prevents me from using output paths directly or indirectly as part of derivation names. I don't know how to circumvent this without "import from derivation". Otherwise, one could even implement this in Nix itself.

Hack with import from derivation

rerun-fixed.nix

Fixing this in nix

Probably the easiest way to solve this is to have the output path include the outputHash AND the normal hash over all inputs. I can see two downsides that probably mandate making this an option rather than the default behavior:

  • When enabling this option, all fixed-output derivations would be rebuilt. This would uncover a lot of fixed-output derivations whose code no longer works, all at once.
  • Some people might have relied on the fact that exactly the fixed output hash is part of the path.

But obviously, this is far less drastic than forbidding fixed-output derivations altogether.

@c00w @nh2

@kolloch note that this will make quite a few more packages stdenv-rebuild-triggering, will require making sure nobody uses mirror://, and will break all the workarounds people use to fetch something when Nix fails to.

Note re: rebuild avoidance that fetchFromGitHub produces source directories with the name «source», even though there is more data that could be used for naming…

Opt-in rebuild-trigger argument «rebuild if this changes» would probably be nice.

Also note that fixed outputs are a tool that enables us to build pretty complicated stuff pretty easily; look for example here: https://github.com/NixOS/nixpkgs/pull/86122/files#diff-60c35744fd5d4d584764d61048838524R35

If we had to maintain generated files for such packages, it would become unmaintainable and would not make any sense.

Hacking around and reverse-engineering to package various software in Nix should not be the goal. The purity that Nix tries to achieve already presents issues in many cases. I would rather have multiple levels of purity, so the user can decide whether to install software that is not built in a pure way.

@kolloch note that this will make quite a few more packages stdenv-rebuild-triggering, make sure nobody uses mirror:// and break all the workarounds people use to fetch something when Nix fails to.

We should not trigger them all at once, agreed. I don't get the point about mirror://.

Opt-in rebuild-trigger argument «rebuild if this changes» would probably be nice.

Yes, good idea, agreed.

We should not trigger them all at once, agreed. I don't get the point about mirror://.

fetchurl supports mirror:// URLs where the well-known mirror lists are maintained inside fetchurl and can be updated whenever we want. Making maintenance of the mirror list expensive for all rev-deps has drawbacks.

Also note that fixed outputs is tool that enables us to build pretty complicated stuff, pretty easily, look for example here: https://github.com/NixOS/nixpkgs/pull/86122/files#diff-60c35744fd5d4d584764d61048838524R35

@offlinehacker this is exactly why @edolstra wants to reduce how powerful they are, as far as I understand.

We should not trigger them all at once, agreed. I don't get the point about mirror://.
fetchurl supports mirror:// URLs where the well-known mirror lists are maintained inside fetchurl and can be updated whenever we want. Making maintenance of the mirror list expensive for all rev-deps has drawbacks.

It should be possible to solve this in Nix. Nix could check which mirror list actually applies to the URL used. That way, the final derivation will not depend on the whole mirror list. If the derivation is changed by the mirror list, I'd want it to be re-instantiated...

Otherwise, if we provided a hack to exclude it, it is entirely conceivable that a change to the mirror list would break things undetected.

If the derivation is changed by the mirror list, I'd want it to be re-instantiated...

Otherwise, if we provided a hack to exclude it, it is entirely conceivable that a change to the mirror list would break things undetected.

Well, upstream changing URLs happens all the time, and that breaks the expressions without Nix being able to detect anything at evaluation time.

And the most frequent problem (missed src bump when version changes) would be avoidable if we made sure that specifically a version change triggers a rebuild…

Well, upstream changing URLs happens all the time, and that breaks the expressions without anything Nix can detect at evaluation time. And the most frequent problem (missed src bump when version changes) would be avoidable if we made sure that specifically a version change triggers a rebuild…

I'd argue that you can also mess up the mirror config, no? I don't see a problem with fixing this within the Nix code, do you? Then only the fetches that are impacted by the mirror change are re-executed. And that is nice.

(Obviously, we cannot guard against changes outside of our control like this. It would be nice to re-check our fixed-output derivations from time to time AND on any input change.)

Well, upstream changing URLs happens all the time, and that breaks the expressions without anything Nix can detect at evaluation time. And the most frequent problem (missed src bump when version changes) would be avoidable if we made sure that specifically a version change triggers a rebuild…

I'd argue that you can also mess up the mirror config, no? I don't see a problem with fixing this within the Nix code, do you? Then only the fetches that are impacted by the mirror change are re-executed. And that is nice.

Well, truly messing up the mirror config requires having _no_ valid mirrors in the list (assuming that the evaluation-time processing of the list succeeds). Rebuilding after adding one more mirror at the end will just fetch from the first one and thus check nothing about the change. So the proposed change adds a cost, does not verify many of the mirror-list updates, and we need periodic verification of upstream URLs anyway. And for mirrors specifically we should also somehow check all the mirrors, but preferably not download _everything_ from every mirror (that would be rather wasteful).

We should improve the efficiency and make Nix reuse more and rebuild from scratch less (in the normal operation), not vice versa.

I agree that updating version only in half of the places is bad, but the solution should try to minimise both false negative and false positives.

I agree that updating version only in half of the places is bad, but the solution should try to minimise both false negative and false positives.

  1. Note that a lot of the pain should be mitigated by the content-addressable store work: if a fetch is redone, all downstream work will not have to be redone, because the output is the same.
  2. Nix takes a very principled stance on many things. I am just proposing that it do what it always does: execute everything that changed, including fixed-output derivations. Currently, the commands in a fixed-output derivation have no meaning: you don't know if they were ever executed, because the original statements might be long gone. With what I propose, they were at least executed once.

I like the plan of making the inclusion of inputs in fixed-output derivations optional. There should be a system-level option that establishes the default (whether to include or exclude by default), and then an option to override that for each derivation. That way, people can choose the trade-off themselves. I could imagine migrating standard functions in nixpkgs like mkDerivation to default to the new behavior at some point.

If you argue that it's good to ignore changes in some inputs but not in others, I don't know. On the one hand, it seems like a hack and introduces even more variability into how things work. On the other hand, it could make some people's lives more pleasant.

I agree that updating version only in half of the places is bad, but the solution should try to minimise both false negative and false positives.

  1. Note that a lot of the pain should be mitigated by the content-addressable store work: if a fetch is redone, all downstream work will not have to be redone, because the output is the same.

Right, fixed-output content is supposed to be trivial to retarget, as it should not contain any path references. But anyway, I support opt-in dependence on _selected_ inputs immediately.

  1. Nix takes a very principled stance on many things. I am just proposing that it do what it always does: execute everything that changed, including fixed-output derivations. Currently, the commands in a fixed-output derivation have no meaning: you don't know if they were ever executed, because the original statements might be long gone. With what I propose, they were at least executed once.

Well, true, and there are expressions, even in Nixpkgs, that rely on that ability to create a fixed-output derivation out of thin air. I mean requireFile etc., which need to specify the expected upstream binary file when we cannot obtain the file automatically.

I like the plan of making the inclusion of inputs in fixed-output derivations optional. There should be a system-level option that establishes the default (whether to include or exclude by default), and then an option to override that for each derivation. That way, people can choose the trade-off themselves. I could imagine migrating standard functions in nixpkgs like mkDerivation to default to the new behavior at some point.

There is a _ton_ of unrelated weird failures Nix cannot really avoid. Nix tries to be fail-or-succeed-correctly, not always-succeed anyway.

If you argue that it's good to ignore changes in some inputs but not in others, I don't know. On the one hand, it seems like a hack and introduces even more variability into how things work. On the other hand, it could make some people's lives more pleasant.

I think «pleasant» understates the problem a bit.

Let's look at the current staging-next workflow. This is a jobset that includes more or less everything on x86_64-linux. (By the way, input-dependence means everything needs to be downloaded separately on every architecture, right?) We can look at https://hydra.nixos.org/job/nixpkgs/staging-next/glibc.x86_64-linux/all and count the monthly glibc rebuilds (which, under the proposed approach, would each mean a full re-download of all sources):

Apr 3
May 2
Feb 5
Jan 2
Dec 3
Nov 3
Oct 1

I believe that downloading everything 2 to 3 times per month, in triplicate each time, is overkill.

@kolloch Would just adding the source URL into the hash be enough? I normally only get massively bitten by this when I change a URL and forget to change the hash.

That may be a lot more palatable for many people.

@c00w I bet that even the part of the URL after the last / is enough in most cases (and this would be annoying only for very weird mirroring setups).

Is this even a big issue in the first place? I've been bitten by it a couple of times, and the first time was very confusing, to be sure, but there's documentation in the manual that explains what's going on, and once you stop to think about it, this kind of build-dependency erasure is intuitive and seems relatively sane (at least to me):

https://nixos.org/nix/manual/#fixed-output-drvs

@bhipple After thinking more about it, I appreciate the default behavior a bit more, and what you want is highly project-specific. Interestingly, I found a way to tune the behavior without changing Nix and wrote about it in "Rerunning Fixed Output Derivations".

That does not directly address the original point of this issue but, with some effort, we can make things less error prone.

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/parsing-go-sum-and-cargo-lock-files-to-spare-the-need-for-fixed-output-derivations/7367/1

As someone who periodically forgets to bump the sha256 of the prefetched dependencies of buildBazelPackage and then spends an hour or so on it, I join those who would appreciate having the input hashes affect the derivation's hash, in addition to the fixed hash (the "did at least run once" semantics).

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/fixed-output-derivations-to-become-part-of-flake-inputs/8263/1
