I wrote:
(pkgs.fetchpatch {
name = "5446.patch";
# TODO make URL stable
url = "https://patch-diff.githubusercontent.com/raw/haskell/cabal/pull/5446.patch";
sha256 = "15s612a241mn15ap5g7lcrphj1031nwbk0affyzk0q4l90rf8i25";
# stripLen = 2; # strips off the `a/Cabal/` prefix (the nix derivation is already built from this subdir)
includes = ["Cabal/*"];
})
After changing includes = ["Cabal/*"]; to includes = ["Cabal/asdfasdfasfd"];, which should exclude every file from the patch and thus make the build fail, nothing happens (the patch is not rebuilt).
I suspect that the whole (filtered) patch output is accidentally captured by the sha256 somehow.
nix-build (Nix) 2.0.4
Note that the lack of an error is just one example of how this can break down; the following workflow is also completely broken:
Changing includes = ["Cabal/*"]; to includes = ["cabal-install/*"]; (this changes the patch substantially) also goes unnoticed.
Found during the same investigation: #48569
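The behaviour reported here follows from how fixed-output derivations are cached: the store path is determined by the declared output hash alone, so any argument change that leaves the hash untouched is invisible. A minimal Python sketch of that caching semantics (hypothetical illustration, not actual Nix internals):

```python
# Hypothetical sketch of fixed-output caching semantics: a fixed-output
# derivation is realised at most once per declared output hash, so other
# arguments are never re-examined once a matching store path exists.
store = {}

def build_fixed_output(url, includes, sha256):
    if sha256 in store:        # a store path for this hash already exists:
        return store[sha256]   # no rebuild; the arguments are never looked at
    store[sha256] = f"patch from {url} filtered to {includes}"
    return store[sha256]

first = build_fixed_output("https://example.test/5446.patch",
                           ["Cabal/*"], "15s612a2...")
# Changing `includes` without changing the hash returns the cached result:
second = build_fixed_output("https://example.test/5446.patch",
                            ["Cabal/asdfasdfasfd"], "15s612a2...")
```

This is exactly the "nothing happens" symptom: the second call silently reuses the output of the first.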
This is common for every fixed-output derivation. When there is a hash, Nix doesn't build it again and thus also doesn't do anything with the modified arguments.
As soon as you make a meaningful change, you indeed need to invalidate the hash so that Nix rebuilds and reports the correct one.
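A common workflow for this (a sketch; it assumes `lib.fakeSha256`, a known-wrong placeholder hash provided by nixpkgs) is to deliberately break the hash, build, and copy the correct hash from the mismatch error:

```nix
(pkgs.fetchpatch {
  name = "5446.patch";
  url = "https://patch-diff.githubusercontent.com/raw/haskell/cabal/pull/5446.patch";
  # Deliberately wrong hash: nix-build fails with a hash-mismatch error
  # that reports the actual hash, which can then be pasted in here.
  sha256 = pkgs.lib.fakeSha256;
  includes = [ "cabal-install/*" ];
})
```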
@FRidh What I don't get is why the fixed-output derivation isn't just for the downloaded patch contents.
I get that there are a few things that may want to be done as pre-processing before the contents are hashed (e.g. cutting off version strings as generated by Github), but there doesn't seem to be a benefit to capture stuff like includes in the fixed-output.
I agree with you that it does more than is needed. Maybe it's better to apply the other modifications with a runCommand behind the scenes. The downside is that it then becomes even weirder regarding rebuilding. cc the contributors @globin and @Ericson2314
@nh2 The point of fetchpatch (as opposed to fetchurl) is actually that it normalizes the patch contents. It's designed for when upstream patches sometimes vary in meaningless ways, like hunks being reordered or similar. So I think the fact that the sha256 encompasses the whole derivation is by-design.
Do you think there is a way for us to make this more explicit and less surprising? I can see only making this explicit in the manual, and would therefore suggest using this issue to track adding a sentence mentioning it to the manual.
@Ekleog As I mentioned in https://github.com/NixOS/nixpkgs/issues/48567#issuecomment-430652288, the fact that the patch contents are normalised is totally fine. That isn't what I complain about in the issue description.
The issue is that right now, fetchpatch has bad implementation details in its fixed-output derivation: it doesn't notice changes to the arguments given to it that are _meaningful_, like includes, as opposed to the meaningless changes you refer to.
What should we do?
I think the most straightforward way is to separate the meaningful post-processing arguments, e.g. into a separate runCommand, like @FRidh says.
runCommand notices when its arguments change.
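The proposed split could look roughly like this (a sketch, not a tested implementation; it assumes `filterdiff` from patchutils, which fetchpatch itself uses): the fixed-output part only downloads and normalizes, while the filtering runs in an ordinary derivation whose inputs are hashed.

```nix
let
  # Fixed-output derivation: only the downloaded, normalized patch is hashed.
  raw = pkgs.fetchpatch {
    url = "https://patch-diff.githubusercontent.com/raw/haskell/cabal/pull/5446.patch";
    sha256 = "..."; # hash of the normalized download only
  };
in
# Ordinary derivation: changing the include pattern changes the input hash,
# so this step is rebuilt as expected.
pkgs.runCommand "5446-filtered.patch" { } ''
  ${pkgs.patchutils}/bin/filterdiff -i 'Cabal/*' ${raw} > $out
''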
Well, that's one of the points of being a fixed-output derivation. You can e.g. change the URL where it fetches from, or the original URL may change contents in a way that gets normalized away – all that without causing any rebuilds and still succeeding if retried. I can't see how your "straightforward" way would work, unless you specified two hashes and thus made this IMHO more difficult to maintain.
EDIT: I mean, feel free to propose something better in more detail – I just can't see it. Well, I can probably imagine with intensional store, but that's a bit longer way off.
@vcunat I mean that the sha256 should only capture the result of the _meaningless changes_ @Ekleog mentioned, like reordering hunks.
Stuff like includes should run afterwards, in a second derivation that's not fixed-output.
Ah, right, that seems possible: using the hash taken after normalization but before the other changes. I can't see a real disadvantage of that approach – at least not immediately. There will be more derivations, store files, etc., but nothing significant.
@nh2 You can't really do that while keeping the same semantics: includes defines what patch changes are meaningless. For instance, the following two patches (made with diff -r for the example) are equivalent when including file a only, but not when also including file b:
The first patch:

diff -r a/a b/a
1c1
< foo
---
> bar

The second patch:

diff -r a/a b/a
1c1
< foo
---
> bar
diff -r a/b b/b
1c1
< bar
---
> foo
Also, IMO it would be pretty counter-intuitive to have some arguments be in the fixed output (e.g. the URL) and some not be -- even more counter-intuitive than the contrary, because the fixed-output / non-fixed-output distinction is deeply rooted in nix.
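The includes semantics under discussion (fetchpatch delegates this to filterdiff from patchutils) amount to keeping only the per-file diffs whose paths match a glob. A simplified Python sketch of that filtering, assuming `diff -r`-style headers:

```python
import fnmatch

def filter_patch(patch_text, includes):
    """Keep only per-file diffs whose path matches one of the globs.

    Simplified: a new file's diff starts at a line beginning with 'diff ';
    we match the last whitespace-separated token (e.g. 'b/a') with its
    'b/' prefix stripped, roughly mirroring filterdiff's include behaviour.
    """
    kept, keep_current = [], False
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff "):
            path = line.split()[-1]
            if path.startswith("b/"):
                path = path[2:]
            keep_current = any(fnmatch.fnmatch(path, g) for g in includes)
        if keep_current:
            kept.append(line)
    return "".join(kept)

patch = """diff -r a/a b/a
1c1
< foo
---
> bar
diff -r a/b b/b
1c1
< bar
---
> foo
"""

# Including only file 'a' reduces the two-file patch to the one-file patch,
# making the two example patches equivalent under includes = ["a"].
only_a = filter_patch(patch, ["a"])
```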
I don't think you can really change anything while keeping _exactly_ the same semantics. AFAIK the main point of fetchpatch is to ignore the changes done by generating them dynamically (by SW that changes over time), so that we don't need to update hashes from time to time. That part would be kept by the suggested approach. On the other hand, the advantage doesn't seem too significant to _myself_, and it might be a little more confusing to some people, as you suggest (while perhaps less confusing to others).
> the advantage doesn't seem too significant to myself, and it might be a little more confusing to some people, as you suggest (while perhaps less confusing to others)
Yes, as in many places with nix, once you know what's going on it's easy, but before that you lose hours with every little trivial thing you want to do. I hope that we can bring that down as much as possible, because it's the biggest hurdle to adoption.
In any case, how about the following alternative route:
Expose postFetch so that one can easily call it separately from fetchpatch, e.g. via a runCommand invocation, because that functionality is very useful.
For the sake of persistence: I suggested a change to the fixed-output mechanics on IRC which would make them more intuitive and fix the fetchpatch problem. My proposition would be to handle FO derivations just the same as regular derivations (i.e. hashing based on inputs), and then just check the output against the provided hash (but not using that hash for the $out name, or maybe only using it as an alias).
That way the usual nix semantics apply: If you change an input (curl, patchutils, url, excludes etc.) the derivation is rebuilt. The hash is still checked though, containing the impurity. A simple nix-review could check if a change invalidated any hashes.
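This proposition can be sketched as follows (hypothetical semantics, not how Nix currently works): the build is keyed by all of its inputs, and the declared hash is only verified against the result afterwards.

```python
import hashlib

class HashMismatch(Exception):
    pass

cache = {}

def build_checked(url, includes, expected_sha256):
    """Input-addressed build with an output hash *check* (the proposal).

    The cache key covers every input, so changing `includes` triggers a
    rebuild; the expected hash is only used to verify the result.
    """
    key = (url, tuple(includes))       # all inputs participate in the key
    if key not in cache:
        cache[key] = f"patch from {url} filtered to {includes}"  # "fetch"
    actual = hashlib.sha256(cache[key].encode()).hexdigest()
    if actual != expected_sha256:
        raise HashMismatch(f"expected {expected_sha256}, got {actual}")
    return cache[key]
```

Under these semantics, changing includes while leaving the stated hash alone no longer goes unnoticed: the derivation is rebuilt and the hash check fails, pointing at the stale hash.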
The counter arguments were
1) this would blow up the nix cache, since every change to e.g. git would generate a whole bunch of now distinct fixed output derivations. That should be fixable with deduplication.
2) it may get Hydra rate-limited or banned on upstream platforms, since it would refetch all sources every now and then
(2) probably makes this infeasible :/
Thank you for your contributions.
This has been automatically marked as stale because it has had no activity for 180 days.
If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.
this is still important