Nixpkgs: haskell: Setup.hs should not be compiled with the same pkgdb as the package itself

Created on 28 Apr 2018 · 22 Comments · Source: NixOS/nixpkgs

Let’s say I have a project that consists of two packages:

  1. Custom prelude. It depends on base-noprelude instead of base and provides a Prelude module so that it is imported automatically.
  2. Library. Depends, again, on base-noprelude and my custom prelude package.

(demo)

When I try to build the library package with nix, here is what I get:

~~~
these derivations will be built:
/nix/store/g9cdimcmbcmd2b5nc7xd420xij5w744k-demo-0.0.0.drv
building '/nix/store/g9cdimcmbcmd2b5nc7xd420xij5w744k-demo-0.0.0.drv'...
setupCompilerEnvironmentPhase
Build with /nix/store/kwwcbv3kdpq7wr3rqn1fawijqkzbac3s-ghc-8.4.1.
ignoring (possibly broken) abi-depends field for packages
unpacking sources
unpacking source archive /nix/store/8f8plqz8n5zgcqryw6xm2rrlpr3z0q49-demo
source root is demo
patching sources
compileBuildDriverPhase
setupCompileFlags: -package-db=/private/var/folders/sd/00sk0kcj1vqcxg1ygth7vl6h0000gn/T/nix-build-demo-0.0.0.drv-0/package.conf.d -j4 -threaded
[1 of 1] Compiling Main ( /nix/store/4mdp8nhyfddh7bllbi7xszz7k9955n79-Setup.hs, /private/var/folders/sd/00sk0kcj1vqcxg1ygth7vl6h0000gn/T/nix-build-demo-0.0.0.drv-0/Main.o )

/nix/store/4mdp8nhyfddh7bllbi7xszz7k9955n79-Setup.hs:1:1: error:
Ambiguous module name ‘Prelude’:
it was found in multiple packages: base-4.11.0.0 myprelude-0.0.0
|
1 | import Distribution.Simple
| ^
note: keeping build directory '/private/var/folders/sd/00sk0kcj1vqcxg1ygth7vl6h0000gn/T/nix-build-demo-0.0.0.drv-0'
builder for '/nix/store/g9cdimcmbcmd2b5nc7xd420xij5w744k-demo-0.0.0.drv' failed with exit code 1
error: build of '/nix/store/g9cdimcmbcmd2b5nc7xd420xij5w744k-demo-0.0.0.drv' failed
~~~

As far as I understand, the cause is that Setup.hs is built with the --make flag and with this flag GHC adds base to the set of visible packages, but my custom prelude is there as well.
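For illustration, here is a minimal sketch of a possible stopgap on the package side. It is not the fix discussed in this issue, and I have not tested it against this demo. It assumes the builder picks up a Setup.hs shipped in the source tree instead of its default one, and that Cabal is still visible in the package db. The idea is to switch off the implicit Prelude import and name the package explicitly:

```haskell
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE PackageImports #-}

-- Setup.hs: workaround sketch only (an assumption, not the nixpkgs fix).
module Main (main) where

-- Say which package Prelude comes from, so GHC does not have to choose
-- between base's Prelude and the one exposed by myprelude.
import "base" Prelude (IO)
import Distribution.Simple (defaultMain)

main :: IO ()
main = defaultMain
```

This only sidesteps the ambiguity for one package; the underlying problem of Setup.hs sharing a pkgdb with the package stays.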

I think one way to solve this is to do away with coreSetup and always build a custom environment consisting only of setupHaskellDepends. One issue I am aware of here is that Cabal does not have setupHaskellDepends and will need to be fixed with addSetupDepends Cabal [ mtl parsec ].

(@peti)
(somewhat related: #37254)

bug haskell

All 22 comments

I’ve been working on this for the last couple of days. I’ll try to quickly document my findings.

  1. Setup.hs is potentially meant to run on a different platform (that is, it is a “BuildHost” dependency), so I think it is a very good idea to split it into a separate derivation and then take it from buildHaskellPackages. Essentially the same is done in #37254, but I don’t see why we can’t do it unconditionally.
  2. Initially I tried to simply compile Setup.hs using GHC from buildHaskellPackages (as it used to be), but I got stuck trying to populate the pkgdb. The thing is, I need to put there not only everything from setupHaskellDepends, but also their propagated dependencies, and there is no easy way to do this. Normally setup.sh and its findInputs function are responsible for taking transitive dependencies into account, but I didn’t find an easy way to reuse that without creating a separate derivation. So, conceptually, I need a function Derivation -> Set Derivation that takes a dependency and returns all its transitive propagated dependencies. I don’t know, maybe @Ericson2314 could help me with this.
  3. I personally feel that delegating as much as possible to nix instead of shell is a great idea, so I’d like to have more nix functions that produce shell code. As far as I understand, there is no way to get a list of propagated dependencies, because some builders create them manually by echoing to nix-support instead of adding something to propagated inputs. It would be great if I could have the function from point 2 in nix, not in shell.
  4. withPackages has the code for setting up a compiler environment copy-pasted. I guess I might be able to deduplicate this.
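To make the Derivation -> Set Derivation idea concrete, here is a minimal sketch in Haskell rather than nix. Everything in it is a stand-in: `Derivation` is just a name, `PropagatedDeps` is a hypothetical lookup table, and the table contents are toy data, not real dependency lists. It only shows the shape of the traversal, assuming the direct propagated inputs of each derivation are known up front (which, as noted above, is exactly the part that is hard to get in nix today).

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Set (Set)

-- Hypothetical stand-in for a Nix derivation: only its name.
type Derivation = String

-- Hypothetical table: derivation -> its directly propagated inputs.
type PropagatedDeps = Map.Map Derivation [Derivation]

-- All transitive propagated dependencies of one derivation.
-- The root itself is only included if it is reachable through a cycle.
propagatedClosure :: PropagatedDeps -> Derivation -> Set Derivation
propagatedClosure table root = go Set.empty (direct root)
  where
    direct d = Map.findWithDefault [] d table
    go seen [] = seen
    go seen (d : rest)
      | d `Set.member` seen = go seen rest          -- already visited
      | otherwise = go (Set.insert d seen) (direct d ++ rest)

main :: IO ()
main = do
  -- Toy data for illustration only.
  let table = Map.fromList
        [ ("Cabal", ["mtl", "parsec"])
        , ("parsec", ["mtl", "text"])
        ]
  print (Set.toList (propagatedClosure table "Cabal"))
  -- prints ["mtl","parsec","text"]
```

The visited set keeps shared dependencies from being walked twice and makes the traversal terminate even if the table contains a cycle.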

@kirelagin https://github.com/NixOS/nixpkgs/pull/33368/files has some ideas to that effect. I think I'd almost rather go straight to @angerman's thing of building components separately, though. That change is long overdue, and I'd love to make a major push on it for 18.09.

@Ericson2314 By “components” do you mean Setup vs the rest, or something even more fine-grained (cf. #34110)?

@kirelagin I'm not absolutely sure what @Ericson2314 meant, so this is just a side note. But the expressions I construct from .cabal files do expose the components (lib, exes, ...), so in theory we could even build derivations for each of them on their own.

This sounds very promising.

Yes, I mean each lib, sub lib, exe, test, and benchmark separately.

Having a separate derivation for Setup.hs seems like a really bad idea considering that we have 10,000+ Haskell packages in Nixpkgs.

@peti so what? The vast majority of those don't have custom setups so it would be the same derivation—I'd expect this change to improve build times.

Even if it didn't, this is still the correct thing to do given cross and setup depends in general, and I'd like to see it happen. Finer grained caching > fewer derivations has been a point of agreement amongst everyone I've talked to.

There are other considerations than just build times. The sheer number of .drv files and store paths we'd create if every single component of a Cabal file lived in its own derivation takes quite a toll on the file system. That may or may not be a serious problem, I don't know, but it certainly feels like a good idea to take that issue seriously. When haskell-ng was originally implemented, we had a setup where builds used a ghcWithPackages derivation to set up their build environment, and ended up not using that approach because it was expensive.

Another consideration is the clarity of the error messages. Past experience has taught us that it's very important that users have a reasonable chance to tell which part of a build has failed and why -- even if they are not Nix experts. If they can't do that, then they'll either stop using Nix or they'll open lots of Github tickets that keep us occupied. This means that a Setup.hs build for foo should be labelled foo-setup (or something like that), because otherwise someone looking at a log of a failed build will have no clue at all what is going on. So chances for large-scale caching may exist in theory, but it's not clear to me whether practical considerations will allow us to realize them.

Last but not least, current Cabal maintainers don't give a damn about Setup.hs. Their POV is that people should use cabal-install to compile packages, and they'll continue to push features into the cabal utility that you don't get in a Setup.hs build. Personally, I've considered trying to make that move for quite a while, i.e. it should be possible to use the generic-builder with both Setup.hs (to bootstrap cabal) and cabal as a build driver.

@peti

> There are other considerations than just build times.

First of all, my intention has always been to optimize our Haskell infra towards the needs of Haskell devs over the non-dev consumer of binaries written in Haskell, where those needs conflict. Given who uses Nix, and who does the most builds, that seems very reasonable. [Yes, as a professional Haskell dev, I might be biased.] Given that, build times and incrementalism are the overriding concerns over just about everything else. Debug cycles are currently quite bad, which is why we get things like https://www.tweag.io/posts/2018-03-15-bazel-nix.html, etc. It's very important that we improve this as much as possible.

> The sheer number of .drv files and store paths we'd create if every single component of a Cabal file lived in its own derivation takes quite a toll on the file system.

Given the number of files per component either way, this is a low coefficient O(n), is it not? Moreover, given that overriding need for shorter debug cycles, I'm inclined to quite liberally trade space for time.

> When haskell-ng was originally implemented, we had a setup where builds used a ghcWithPackages derivation to set up their build environment, and ended up not using that approach because it was expensive.

In which way? My guess is the symlink trees, if it's even something that registers at all 4 years later. That's not needed for per-component.

> Another consideration is the clarity of the error messages.

Keep in mind that Cabal has changed a lot in the past 1-2 years to be more oriented towards components over packages. This is one thing cabal-install and stack people agree on.

> This means that a Setup.hs build for foo should be labelled foo-setup (or something like that), because otherwise someone looking at a log of a failed build will have no clue at all what is going on.

Absolutely all derivation names should have the package name and component name.
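As a rough sketch of what such names could look like: the `Component` type and the `<package>-<component>-<version>` scheme below are made up for illustration and are not an existing nixpkgs or Cabal convention. Only the `foo-setup` style label comes from the comment quoted above.

```haskell
-- Hypothetical component type; Cabal has its own, richer representation.
data Component
  = Library
  | SubLibrary String
  | Executable String
  | TestSuite String
  | Benchmark String
  | Setup

-- Hypothetical naming scheme: <package>-<component>-<version>.
derivationName :: String -> String -> Component -> String
derivationName pkg ver comp = pkg ++ "-" ++ suffix ++ "-" ++ ver
  where
    suffix = case comp of
      Library      -> "lib"
      SubLibrary n -> "lib-" ++ n
      Executable n -> "exe-" ++ n
      TestSuite n  -> "test-" ++ n
      Benchmark n  -> "bench-" ++ n
      Setup        -> "setup"

main :: IO ()
main = mapM_ (putStrLn . derivationName "demo" "0.0.0")
  [Setup, Library, Executable "demo-cli"]
  -- prints demo-setup-0.0.0, demo-lib-0.0.0 and demo-exe-demo-cli-0.0.0
```

With names like these, a failed build log would show which component of which package broke.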

> So chances for large-scale caching may exist in theory, but it's not clear to me whether practical considerations will allow us to realize them.

Do you have any other concrete concerns?

> Last but not least, current Cabal maintainers don't give a damn about Setup.hs. Their POV is that people should use cabal-install to compile packages, and they'll continue to push features into the cabal utility that you don't get in a Setup.hs build. Personally, I've considered trying to make that move for quite a while, i.e. it should be possible to use the generic-builder with both Setup.hs (to bootstrap cabal) and cabal as a build driver.

It is true that custom setups are slowly dying, which I am also fine with. But I think that's a reason to do this. cabal-install does build components separately, including Setup.hs itself when the setup-depends Cabal is sufficiently recent. I really like the look of @angerman's stuff, but long term I want to extract the exact build plan that cabal-install itself uses: that would only build setup components in the non-build-type: Simple case.

[Also, I suspect we might eventually need a bootstrap cabal binary, which is also fine with me.]

I dunno ... the alternative to "one derivation per component" is "multiple outputs", and that approach is already somewhat popular and accepted in Nixpkgs.

Multiple outputs is a solution for closure size, not shorter debug cycles. It doesn't help with debug cycles at all. Debug cycle length is still far and away my top priority.

@Ericson2314 Hmm, but separate derivations won’t solve the debug cycles issue either. The issue here is that all of the library gets rebuilt each time, not just the modules that changed, and I can’t see how separate derivations improve this.

I think one way of getting what you want would be to rewrite the builder to use cabal-install and structure it in such a way that its build phase can be easily reused for building the project in current directory (impurely). Then have a tool that converts .cabal files into derivations on the fly on each invocation and runs the build phase (and cabal-install will make sure to rebuild only the changed part).

@kirelagin No it doesn't solve it, but it's a step in the right direction. IIRC, cabal-install caches external packages on a per-component basis (if the Setup.hs is new enough), so this would at least align both our caching units for that. Your thing is a fine step too. I do want to get to pure per-module caching, but that will require GHC, Cabal, and Nix changes, oh my. Hopefully hnix can help (with prototyping and integrating).

I wouldn't even know how to achieve module caching without generating massive amounts of new derivations. Is there some precedent in nix for this somewhere?

There is indeed no precedent for that many derivations. But there isn't precedent for debug cycles that quick either. I want to break new ground.

I believe that https://github.com/NixOS/nixpkgs/pull/40996 fixed this issue.

I can still reproduce the issue with current HEAD (55f2889baeb98a650d19948e2db9c01ae7f94edf).

To be honest, looking at #40996 I don’t see how it could have fixed this 🤔.

Hmm, I based my statement on https://github.com/NixOS/nixpkgs/pull/39735#issuecomment-391804895 from @Ericson2314.

I ran into an issue building 'entropy' with 'ghcjs'. entropy uses a custom Setup.hs and setupBuildDepends. The short version is that the package.conf.d in the build directory contained .conf files for ghc packages instead of ghcjs packages. As a result, when ghcjs tried to compile the code, it couldn't find any js files for the dependencies.

This was triggered by the use of setupHaskellDepends in the .nix file. I am hoping https://github.com/NixOS/nixpkgs/pull/41939 will fix the issue. If not I'll add more details or open a new bug.

UPDATE: building with head fixed this problem as hoped.

Yes. Good catch.
