I wrote:
(pkgs.fetchpatch {
name = "5446.patch";
# TODO make URL stable
url = "https://patch-diff.githubusercontent.com/raw/haskell/cabal/pull/5446.patch";
sha256 = "15s612a241mn15ap5g7lcrphj1031nwbk0affyzk0q4l90rf8i25";
# stripLen = 2; # strips off the `a/Cabal/` prefix (the nix derivation is already built from this subdir)
includes = ["Cabal/*"];
})
After changing includes = ["Cabal/*"]; to includes = ["Cabal/asdfasdfasfd"];, which should exclude every file from the patch and thus make the build fail, nothing happens (the patch is not rebuilt).
I suspect that the whole (filtered) patch output is accidentally captured by the sha256 somehow.
nix-build (Nix) 2.0.4
Note that the lack of an error is just one example of how this can break down; the following workflow is also completely broken:
Changing includes = ["Cabal/*"]; to includes = ["cabal-install/*"]; (this changes the patch substantially) also goes unnoticed.
Found during the same investigation: #48569
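The behaviour reported here follows from how fixed-output derivations are cached: the store path is determined by the declared output hash alone, so any argument change that leaves the hash untouched is invisible. A minimal Python sketch of that caching semantics (hypothetical illustration, not actual Nix internals):

```python
# Hypothetical sketch of fixed-output caching semantics: a fixed-output
# derivation is realised at most once per declared output hash, so other
# arguments are never re-examined once a matching store path exists.
store = {}

def build_fixed_output(url, includes, sha256):
    if sha256 in store:        # a store path for this hash already exists:
        return store[sha256]   # no rebuild; the arguments are never looked at
    store[sha256] = f"patch from {url} filtered to {includes}"
    return store[sha256]

first = build_fixed_output("https://example.test/5446.patch",
                           ["Cabal/*"], "15s612a2...")
# Changing `includes` without changing the hash returns the cached result:
second = build_fixed_output("https://example.test/5446.patch",
                            ["Cabal/asdfasdfasfd"], "15s612a2...")
```

This is exactly the "nothing happens" symptom: the second call silently reuses the output of the first.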
This is common for every fixed-output derivation. When there is a hash, Nix doesn't build it again and thus also doesn't do anything with the modified arguments.
As soon as you make a meaningful change, you indeed need to invalidate the hash so that Nix rebuilds and reports the correct one.
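A common workflow for this (a sketch; it assumes `lib.fakeSha256`, a known-wrong placeholder hash provided by nixpkgs) is to deliberately break the hash, build, and copy the correct hash from the mismatch error:

```nix
(pkgs.fetchpatch {
  name = "5446.patch";
  url = "https://patch-diff.githubusercontent.com/raw/haskell/cabal/pull/5446.patch";
  # Deliberately wrong hash: nix-build fails with a hash-mismatch error
  # that reports the actual hash, which can then be pasted in here.
  sha256 = pkgs.lib.fakeSha256;
  includes = [ "cabal-install/*" ];
})
```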
@FRidh What I don't get is why the fixed-output derivation isn't just for the downloaded patch contents.
I get that there are a few things that may want to be done as pre-processing before the contents are hashed (e.g. cutting off version strings as generated by Github), but there doesn't seem to be a benefit to capture stuff like includes in the fixed-output.
I agree with you that it does more than is needed. Maybe it's better to apply the other modifications with a runCommand behind the scenes. The downside is that it then becomes even weirder regarding rebuilding. cc the contributors @globin and @Ericson2314
@nh2 The point of fetchpatch (as opposed to fetchurl) is actually that it normalizes the patch contents. It's designed for when upstream patches sometimes vary in meaningless ways, like hunks being reordered or similar. So I think the fact that the sha256 encompasses the whole derivation is by-design.
Do you think there is a way for us to make this more explicit and less surprising? I can see only making this explicit in the manual, and would therefore suggest using this issue to track adding a sentence mentioning it to the manual.
@Ekleog As I mentioned in https://github.com/NixOS/nixpkgs/issues/48567#issuecomment-430652288, the fact that the patch contents are normalised is totally fine. That isn't what I complain about in the issue description.
The issue is that right now, fetchpatch has bad implementation details in its fixed-output derivation: it doesn't notice changes to the arguments given to it that are _meaningful_, like includes, as opposed to the meaningless changes you refer to.
What should we do?
I think the most straightforward way is to separate the meaningful post-processing arguments, e.g. into a separate runCommand, like @FRidh says.
runCommand notices when its arguments change.
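The proposed split could look roughly like this (a sketch, not a tested implementation; it assumes `filterdiff` from patchutils, which fetchpatch itself uses): the fixed-output part only downloads and normalizes, while the filtering runs in an ordinary derivation whose inputs are hashed.

```nix
let
  # Fixed-output derivation: only the downloaded, normalized patch is hashed.
  raw = pkgs.fetchpatch {
    url = "https://patch-diff.githubusercontent.com/raw/haskell/cabal/pull/5446.patch";
    sha256 = "..."; # hash of the normalized download only
  };
in
# Ordinary derivation: changing the include pattern changes the input hash,
# so this step is rebuilt as expected.
pkgs.runCommand "5446-filtered.patch" { } ''
  ${pkgs.patchutils}/bin/filterdiff -i 'Cabal/*' ${raw} > $out
''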
Well, that's one of the points of being a fixed-output derivation. You can e.g. change the URL where it fetches from, or the original URL may change contents in a way that gets normalized away – all that without causing any rebuilds and still succeeding if retried. I can't see how your "straightforward" way would work, unless you specified two hashes and thus made this IMHO more difficult to maintain.
EDIT: I mean, feel free to propose something better in more detail – I just can't see it. Well, I can probably imagine with intensional store, but that's a bit longer way off.
@vcunat I mean that the sha256 should only capture the result of the _meaningless changes_ @Ekleog mentioned, like reordering hunks.
Stuff like includes should run afterwards, in a second derivation that's not fixed-output.
Ah, right, that seems possible: using the hash taken after normalization but before the other changes. I can't see a real disadvantage of that approach – at least not immediately. There will be more derivations, store files, etc., but nothing significant.
@nh2 You can't really do that while keeping the same semantics: includes defines what patch changes are meaningless. For instance, the following two patches (made with diff -r for the example) are equivalent when including file a only, but not when also including file b:
The first patch:

diff -r a/a b/a
1c1
< foo
---
> bar

The second patch:

diff -r a/a b/a
1c1
< foo
---
> bar
diff -r a/b b/b
1c1
< bar
---
> foo
Also, IMO it would be pretty counter-intuitive to have some arguments be in the fixed output (e.g. the URL) and some not be -- even more counter-intuitive than the contrary, because the fixed-output / non-fixed-output distinction is deeply rooted in nix.
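The includes semantics under discussion (fetchpatch delegates this to filterdiff from patchutils) amount to keeping only the per-file diffs whose paths match a glob. A simplified Python sketch of that filtering, assuming `diff -r`-style headers:

```python
import fnmatch

def filter_patch(patch_text, includes):
    """Keep only per-file diffs whose path matches one of the globs.

    Simplified: a new file's diff starts at a line beginning with 'diff ';
    we match the last whitespace-separated token (e.g. 'b/a') with its
    'b/' prefix stripped, roughly mirroring filterdiff's include behaviour.
    """
    kept, keep_current = [], False
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff "):
            path = line.split()[-1]
            if path.startswith("b/"):
                path = path[2:]
            keep_current = any(fnmatch.fnmatch(path, g) for g in includes)
        if keep_current:
            kept.append(line)
    return "".join(kept)

patch = """diff -r a/a b/a
1c1
< foo
---
> bar
diff -r a/b b/b
1c1
< bar
---
> foo
"""

# Including only file 'a' reduces the two-file patch to the one-file patch,
# making the two example patches equivalent under includes = ["a"].
only_a = filter_patch(patch, ["a"])
```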
I don't think you can really change anything while keeping _exactly_ the same semantics. AFAIK the main point of fetchpatch is to ignore the changes done by generating them dynamically (by SW that changes over time), so that we don't need to update hashes from time to time. That part would be kept by the suggested approach. On the other hand, the advantage doesn't seem too significant to _myself_, and it might be a little more confusing to some people, as you suggest (while perhaps less confusing to others).
> the advantage doesn't seem too significant to myself, and it might be a little more confusing to some people, as you suggest (while perhaps less confusing to others)
Yes, as in many places with nix, once you know what's going on it's easy, but before that you lose hours with every little trivial thing you want to do. I hope that we can bring that down as much as possible, because it's the biggest hurdle to adoption.
In any case, how about the following alternative route:
Expose postFetch so that one can easily call it separately from fetchpatch, e.g. via a runCommand invocation, because that functionality is very useful.
For the sake of persistence: I suggested a change to the fixed-output mechanics on IRC which would make them more intuitive and fix the fetchpatch problem. My proposition would be to handle FO derivations just the same as regular derivations (i.e. hashing based on inputs), and then just check the output against the provided hash (but not using that hash for the $out name, or maybe only using it as an alias).
That way the usual nix semantics apply: If you change an input (curl, patchutils, url, excludes etc.) the derivation is rebuilt. The hash is still checked though, containing the impurity. A simple nix-review could check if a change invalidated any hashes.
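This proposition can be sketched as follows (hypothetical semantics, not how Nix currently works): the build is keyed by all of its inputs, and the declared hash is only verified against the result afterwards.

```python
import hashlib

class HashMismatch(Exception):
    pass

cache = {}

def build_checked(url, includes, expected_sha256):
    """Input-addressed build with an output hash *check* (the proposal).

    The cache key covers every input, so changing `includes` triggers a
    rebuild; the expected hash is only used to verify the result.
    """
    key = (url, tuple(includes))       # all inputs participate in the key
    if key not in cache:
        cache[key] = f"patch from {url} filtered to {includes}"  # "fetch"
    actual = hashlib.sha256(cache[key].encode()).hexdigest()
    if actual != expected_sha256:
        raise HashMismatch(f"expected {expected_sha256}, got {actual}")
    return cache[key]
```

Under these semantics, changing includes while leaving the stated hash alone no longer goes unnoticed: the derivation is rebuilt and the hash check fails, pointing at the stale hash.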
The counter arguments were
1) this would blow up the nix cache, since every change to e.g. git would generate a whole bunch of now distinct fixed output derivations. That should be fixable with deduplication.
2) it may get Hydra rate-limited or banned on upstream platforms, since it would refetch all sources every now and then
(2) probably makes this infeasible :/
Thank you for your contributions.
This has been automatically marked as stale because it has had no activity for 180 days.
If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.
this is still important