Summary: too many `-L` flags are passed to GCC; GCC passes the `-L` flags to its subprogram `cc1` via an environment variable; the `exec` of `cc1` then fails with `E2BIG`.
So once again, I'm building some Haskell with lots of library dependencies.
Trying to upgrade this build to 18.03, the `nix-build` of the Haskell package newly fails with
Setup: Missing dependencies on foreign libraries:
* Missing C libraries: glog
This error message is wrong (cabal doesn't correctly propagate the `argument list too long` error shown below; I filed a bug about it being misleading at https://github.com/haskell/cabal/issues/5355); the dependency exists.
The problem is that in some eventual `gcc` invocation, the list of arguments passed to GCC turns out to be very long.
`strace` confirms:
strace -fye execve -s 100000000 -v runhaskell Setup.hs build 2>&1 | grep E2BIG
execve("cc1", [arguments here], [env vars here, "COLLECT_GCC_OPTIONS='-fno-stack-protector -L... 150KB of -L flags here'"]) = -1 E2BIG (Argument list too long)
The call fails with `E2BIG (Argument list too long)`, in this case because `COLLECT_GCC_OPTIONS` is longer than 128 KB (32 * 4 KB pages, see here, and a repro script I made here).
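For readers who want to see the per-string limit in isolation, here is a minimal Python sketch (not the repro script mentioned above). It starts a trivial child process with a single environment variable of a given size and reports the errno if the exec fails. The variable name `FAKE_GCC_OPTIONS` and the helper name are made up for illustration, and the `32 * page size` constant is an assumption that mirrors the Linux kernel's `MAX_ARG_STRLEN`; other platforms behave differently.

```python
import errno
import os
import subprocess
import sys

# Assumption: mirrors the Linux kernel constant MAX_ARG_STRLEN (32 pages).
# It is not exported through sysconf, so it is recomputed here.
MAX_ARG_STRLEN = 32 * os.sysconf("SC_PAGE_SIZE")


def exec_errno_for_env_size(value_len):
    """Start a trivial child with one env var whose value is `value_len` bytes.

    Returns None if the child could be started, otherwise the errno of the
    failed exec (E2BIG is what the strace output above shows).
    """
    env = {"FAKE_GCC_OPTIONS": "x" * value_len}  # hypothetical variable name
    try:
        subprocess.run([sys.executable, "-c", "pass"], env=env, check=False)
    except OSError as e:
        return e.errno
    return None


for size in (1024, 2 * MAX_ARG_STRLEN):
    err = exec_errno_for_env_size(size)
    print(size, "ok" if err is None else errno.errorcode.get(err, err))
```

On a typical Linux system the second size should be rejected even though the total environment is far below `ARG_MAX`, which is the situation `COLLECT_GCC_OPTIONS` runs into.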
What is the `COLLECT_GCC_OPTIONS` environment variable? It is an environment variable set by `gcc` before calling out to `cc1`, over which it communicates flags to `cc1`. Most (if not all?) flags given to gcc will make it into this variable. So it can grow very big (easily larger than the 128 KB limit, especially on nix).
Note that even flags given in a "response file" via `gcc @myresponsefile.rsp` (which was designed to pass GCC flags via a file instead of command line args to circumvent command line arg limits) will be put into `COLLECT_GCC_OPTIONS` by `gcc` itself to communicate them to `cc1` (I have just confirmed that with a small example on my Ubuntu 16.04). So using `@myresponsefile.rsp` is _not_ a workaround. (Yes, this seems to defeat the purpose of response files, but I suspect those were originally made to circumvent a much smaller limit on command line arguments on Windows, where the limit is well below the 128 KB limit for environment variable lengths on Linux.)
Aside: nix inflates the number of `-L` flags because each `-L` option to gcc is present multiple times, but those duplicates only account for a factor of 4x or so; even if they were deduplicated, I'd already be at half of `MAX_ARG_STRLEN` with my medium-size Haskell project; so if I added a couple more dependencies to my Haskell project (all _recursive_ nix Haskell dependencies make it as `-L` options into the gcc command line), I'd quickly exceed that limit again even without duplication.
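To make the aside concrete, here is a rough Python estimate of how long a space-separated `-L` string gets with and without duplication. The store paths are hypothetical (a made-up 32-character hash and library names), and the count of 1000 paths is picked only so that the deduplicated string lands near half of the 128 KB limit; none of these numbers are measurements from my project.

```python
MAX_ARG_STRLEN = 32 * 4096  # 128 KiB, the per-string limit quoted above


def flags_length(paths, duplication=1):
    """Length of a space-separated '-L<path>' string.

    Each path is repeated `duplication` times, imitating the duplicated flags.
    """
    flags = ["-L" + p for p in paths for _ in range(duplication)]
    return len(" ".join(flags))


# Hypothetical store paths shaped like /nix/store/<hash>-<name>-<version>/lib.
paths = ["/nix/store/" + "0" * 32 + "-lib%03d-1.2.3/lib" % i for i in range(1000)]

for dup in (1, 4):
    n = flags_length(paths, dup)
    print(dup, n, "over limit" if n > MAX_ARG_STRLEN else "within limit")
```

The point is only the shape of the growth: deduplication divides by a constant, while the number of recursive dependencies keeps growing.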
Deduplicating the `-L` flags passed to GCC will help the issue by a small constant factor and make a couple more projects compile, but won't help with projects with many dependencies.
The amount of duplication also depends on the number of `executable` sections in the cabal file, see comment below.
Steps to reproduce: build a Haskell project with lots of dependencies on nixpkgs 18.03.
How many dependencies are we talking here? I've worked with quite a few without error.
CCing people that were in the past involved in dealing with large amount of args to gcc (#26554, #26974, #27609, #27657):
@domenkozar @edolstra @orivej @copumpkin @Ericson2314 @ryantrinkle
How many dependencies are we talking here? I've worked with quite a few without error.
@ElvishJerricco
- [...] `build-depends`
- `nix-shell`'s `ghc-pkg list` shows 330 total deps in the ghc-pkg database
- `COLLECT_GCC_OPTIONS` has 1790 `-L` flags (+200 chars that aren't `-L` flags)
- deduplicating the `-L` flags ends up with 700 of them
- each `-L` flag is 64 chars long on average
- most look like `/nix/store/hash...-haskelllibname-1.2.3/lib`, but non-Haskell `buildInputs` are in here as well
- Haskell libraries have 2 `-L` entries each: `/lib` and e.g. `/lib/ghc-8.2.2/tasty-0.11.3`
- 1790 `-L` entries * (65 chars + some spaces and quotes I stripped away in my grep) >= 128 K chars

Could we use GCC spec files instead of command line arguments? This wouldn't work with Clang, but creating a Clang wrapper for supporting spec files is something that has sounded useful for a while now.
Deduplicating the `-L` flags passed to GCC will help the issue by a small constant factor and make a couple more projects compile, but won't help with projects with many dependencies.
I am relatively certain that this involves inserting an `ordNub` into this line: https://github.com/haskell/cabal/blob/db05f8dd42bf28bfe9afa7992f3ca51e0f1af0c1/Cabal/Distribution/Simple/Configure.hs#L1684 just as in the line below that. I have confirmed that this line produces the duplicates.
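As an illustration of what an order-preserving dedup does to a flag list, here is a small Python sketch. It is an analogue of the idea, not Cabal's actual `ordNub` code, and the flags are shortened, hypothetical paths. Keeping the first occurrence matters because the order of `-L` flags determines the library search order.

```python
def ord_nub(items):
    """Drop repeated items, keeping the first occurrence of each, in order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Hypothetical, shortened search-path flags with duplicates.
flags = [
    "-L/nix/store/aaa-zlib/lib",
    "-L/nix/store/bbb-glog/lib",
    "-L/nix/store/aaa-zlib/lib",
    "-L/nix/store/ccc-tasty/lib",
    "-L/nix/store/bbb-glog/lib",
]

print(ord_nub(flags))
```

This runs in linear time, unlike a naive pairwise `nub`, which matters little for a few hundred flags but is the usual reason to prefer it.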
Could we use GCC spec files instead of command line arguments?
@ElvishJerricco I don't know. It may well be that `gcc` still passes options to `cc1` via the env var if we do it, but I'm not familiar with spec files.
Do you know how to use them? If yes, could you make a small example that we can `strace` to check?
I can see the duplicated `-L` paths already in `echo $NIX_LDFLAGS` in `nix-shell`.
The nixpkgs manual says
Bintools Wrapper's setup hook causes any `lib` and `lib64` subdirectories to be added to `NIX_LDFLAGS`.
but I haven't quite found yet where it's done and whether we can dedupe it there.
I found there's another part to it (https://github.com/haskell/cabal/pull/5356#issuecomment-393880674):
Independent of Haskell dependencies, for system dependencies (so, stuff that makes it in via non-Haskell-package `buildInputs`), Cabal adds a duplicated `-L` flag for each `executable`, `test-suite` and `benchmark` in the `.cabal` file.
My case has 11 of those, so that also contributes to some blow-up.
Great cabal fixes! We should set `strictDeps = true;` in the generic builder to further deduplicate things.
@Ericson2314 I think the `strictDeps = true;` fix does not work for stack. We'll need the cabal fix to get that working.
So is there a workaround/trick how I can build a stack project now? Can I downgrade the version of something?
@i-am-the-slime Building `stack` against a version of the `Cabal` library that has this patch https://github.com/haskell/cabal/pull/5356 should fix it.
My cabal PR has been merged which should provide a temporary mitigation: https://github.com/haskell/cabal/pull/5356
For the long run we still have to fix GCC upstream to not use the length-limited `COLLECT_GCC_OPTIONS` environment variable for passing things around.
@Mistuke I saw your comment and a mentioned fix on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86030
What confuses me is that everything there talks about `-L` flags being passed via the command line, while what we observe here is that they are being passed between `gcc` and `cc1` via an environment variable `COLLECT_GCC_OPTIONS` (which also breaks due to the length limit on env vars).
Do you know if this should be fixed by that upstream patch?
@nh2 probably, the code filters out the `-L` very early on. So what comes out of `do_spec_1` will already be shortened, so `COLLECT_GCC_OPTIONS` should use it. It does honor response files so it shouldn't have re-expanded the options yet.
If you want confirmation I can test using a GCC 9 build on Monday.
@nh2 So I'm not sure it'll fix your particular problem. It definitely adds the `-L` to a response file now, but it seems it also still expands them into `COLLECT_GCC_OPTIONS`, which seems like a bug.
But also, more annoyingly, `collect2` doesn't seem to pass them on in a response file when it calls `ld`. So it seems like the fix is too superficial.
@Mistuke Thanks for double-checking that for us. Will you follow up with upstream GCC on your issue, or should I do that?
I will re-open the ticket on Monday, but it would be helpful if you could
then comment as well, especially mention that it's a problem on Linux too
as I think that carries more weight than a Windows issue :)
On Thu, Jun 28, 2018 at 12:57 PM, Niklas Hambüchen wrote:
@Mistuke Thanks for double-checking that for us. Will you follow up with upstream GCC on your issue, or should I do that?
@Mistuke Sure, just let me know when I shall comment!
I have a similar issue on macOS using GHC 8.4.3:
Linking Setup ...
clang-5.0: warning: argument unused during compilation: '-nopie' [-Wunused-command-line-argument]
/nix/store/ckq71kkymh1ji2b44xn80wmr7fmi6wr5-clang-wrapper-5.0.2/bin/cc: line 183: /nix/store/bcl9zj60h52p47dy85s326mdrqx52417-clang-5.0.2/bin/clang: Argument list too long
`cc' failed in phase `Linker'. (Exit code: 126)
@domenkozar Same here. IIRC the maximum environment size is like one eighth of that on standard Linux, so it's a lot easier to hit.
@srhb what nixpkgs commit? I'm using 5974bb7c9c6a6fd4968516ebc6efa5370323fc8d
What's not clear to me is why this worked on GHC 8.2 and not so recent nixpkgs.
@domenkozar I wasn't aware that this was ever not a problem in the haskell infra, I always hit it for packages with a sufficiently large number of dependencies. fwiw currently head of nixos-18.03. I didn't build darwin long before that however.
For posterity:
angerman | The underlying issue is the following: we have `$out/lib` for each haskell library we build. However that folder is mostly useless for anything but haskell.
angerman | However the cc-wrapper logic puts any `$out/lib` folder of the dependencies into the NIX_CFLAGS.
angerman | Now you end up with `$pkg/lib` for each haskell dependency, as well as `$pkg/lib/ghc8.4.../` that GHC forwards to the compiler/linker.
angerman | and that overruns your limit.
angerman | So a quick fix is to just mutilate the haskell pkg builder, to stop it from generating `lib` folders.
angerman | The proper fix would be to use response-files in the cc-wrapper, so that we pass `@args-file` to `clang` or `gcc` instead of passing the flags on the command line.
domenkozar | does clang support that?
angerman | domenkozar yes, for quite a few years already, the commit was landed in clang in 2014. See https://reviews.llvm.org/D4897
See https://github.com/NixOS/nixpkgs/pull/41420#issuecomment-401601581
@nh2 I've re-opened the issue, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86030. It seems, however, that in part it's a configuration thing: when GCC is built with `--with-gnu-ld`, it will treat the response files correctly (this behaviour seems to be undocumented).
So it seems that only the environment variable stuff may still be an issue. But who knows, maybe there's another undocumented flag.
@domenkozar @angerman that's a good idea, but let's put it under at least some prefix for the sake of multiple outputs, or get `{cc,ld}-wrapper` to ignore some lib dirs.
Yes, we should use a separate output for that :)
Given that the `strictDeps` fix is not imminent, should we go with Angerman's solution?
So I've dug deeper into the issue and it has been getting progressively worse with cross-compilation fixes. I've been running the following command to count how many times a dependency shows up as a linker argument:
nix-shell -A haskellPackages.memory --run "env |grep NIX_LDFLAGS|grep -o -w basement |wc -w " .
More work on strictDeps: https://github.com/NixOS/cabal2nix/pull/358
@nh2 I've re-opened the issue, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86030. It seems, however, that in part it's a configuration thing: when GCC is built with `--with-gnu-ld`, it will treat the response files correctly (this behaviour seems to be undocumented).
So it seems that only the environment variable stuff may still be an issue. But who knows, maybe there's another undocumented flag.
@Mistuke Uh OK, interesting. Is `--with-gnu-ld` something that we can use in general or does that put some restrictions on us? If we can use it, then we probably need only the env var upstream fix indeed.
@nh2 as far as I know it's something that has no other effects but allowing the response files. I think it's because at implementation time only ld supported this. So it's safe to turn on always as long as you either only use ld or all linkers you use support response files.
So I've bisected and https://github.com/NixOS/nixpkgs/compare/114a9b625386e3ca4e142dce6ce8bcfcabea8fe3...469fd8983276d851c19827cd9e78a89dd53a5914 (part of https://github.com/NixOS/nixpkgs/pull/26805/commits) is where `LDFLAGS` first start to be duplicated in nixpkgs history.
@Ericson2314 could you explain why that's happening?
So it's safe to turn on always as long as you either only use ld or all linkers you use support response files.
@Mistuke makes sense; two questions:
- Do `gold` and `lld` support it?
- Why is this a compile-time check?
$ git co 4d2b7638173afb4c0f0c12a43654bcff5c9500ab
$ nix-shell -A libxml2 --run "env |grep LDFLAGS" . | grep --color python
NIX_LDFLAGS=-rpath /nix/store/f04m1im3cwj9rgcys31h80ic3i2n8fl3-libxml2-2.9.7/lib64 -rpath /nix/store/f04m1im3cwj9rgcys31h80ic3i2n8fl3-libxml2-2.9.7/lib -L/nix/store/6yb5rvr6rvgvx8ylpchwz808djfw07rb-python-2.7.14/lib -L/nix/store/r2icacayl0prn5n65206x0bqx4idarpm-zlib-1.2.11-dev/lib -L/nix/store/la2xxf4grdyf37a94ak28m7a61yjqzr9-zlib-1.2.11/lib -L/nix/store/6yb5rvr6rvgvx8ylpchwz808djfw07rb-python-2.7.14/lib -L/nix/store/r2icacayl0prn5n65206x0bqx4idarpm-zlib-1.2.11-dev/lib -L/nix/store/la2xxf4grdyf37a94ak28m7a61yjqzr9-zlib-1.2.11/lib
$ git co 54d01b0e97f50c3fe87a41ebbbef36bb4ddc1e7b
$ nix-shell -A libxml2 --run "env |grep LDFLAGS" . | grep --color python
NIX_LDFLAGS=-rpath /nix/store/pjdj1mnqlyqnwmy4wcvrn3ipl2jhln39-libxml2-2.9.7/lib64 -rpath /nix/store/pjdj1mnqlyqnwmy4wcvrn3ipl2jhln39-libxml2-2.9.7/lib -L/nix/store/hr461qqci171p6k53wwz52ifyp2h9g9l-python-2.7.14/lib -L/nix/store/ggxa0sw8h1y0yjvfb7agi7vniyax70dn-zlib-1.2.11-dev/lib -L/nix/store/4zs260xcpyzf9h252lw2qny8wwpm97wm-zlib-1.2.11/lib
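The duplication in the first output is easier to see when counted. Below is a small Python sketch that tallies how often each `-L` directory occurs in a `NIX_LDFLAGS`-style string. The three paths are copied from the first output above; the helper name is made up, and it assumes the `-Ldir` spelling with no space between flag and directory.

```python
from collections import Counter


def count_search_dirs(ldflags):
    """Count how often each -L directory occurs in a NIX_LDFLAGS-style string.

    Assumes the '-Ldir' form (no space between flag and directory).
    """
    return Counter(tok[2:] for tok in ldflags.split() if tok.startswith("-L"))


# The -L flags of the first NIX_LDFLAGS value above: the same three, twice.
unique_flags = [
    "-L/nix/store/6yb5rvr6rvgvx8ylpchwz808djfw07rb-python-2.7.14/lib",
    "-L/nix/store/r2icacayl0prn5n65206x0bqx4idarpm-zlib-1.2.11-dev/lib",
    "-L/nix/store/la2xxf4grdyf37a94ak28m7a61yjqzr9-zlib-1.2.11/lib",
]
ldflags = " ".join(unique_flags * 2)

for directory, n in count_search_dirs(ldflags).items():
    print(n, directory)
```

Feeding it the real `$NIX_LDFLAGS` from a `nix-shell` (for example via `os.environ`) would give the per-dependency duplication factor directly.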
I explained to domen on IRC. But let me recap here. https://github.com/NixOS/nixpkgs/commit/469fd8983276d851c19827cd9e78a89dd53a5914 was the source of the issue. It, for compatibility purposes only, runs env hooks multiple times so incorrectly categorized deps still end up working. This was done based on the `crossConfig` environment variable, a convenient bash-level way to see whether we were cross compiling or not. https://github.com/NixOS/nixpkgs/commit/330ca731e88ec015181c43d92ae8f7c77cf0226a then removed `crossConfig`, making a new `strictDeps` for this issue.
All this is to say my `strictDeps` PR isn't some sort of band-aid, but a fix of the original issue. Indeed I wish to write an RFC to just get rid of the `strictDeps` flag by enabling it everywhere.
The final issue is that strictDeps, in its strictness, breaks some Haskell packages. But, per Cabal's own plan, those packages are deemed broken. It is my opinion then that we should tolerate a higher number of overrides and/or more broken packages than usual; the build failures are not Nix's fault, but the package's. In practical terms, once both Nix and Cabal refuse to build the package for the same easily-fixable reason by the end of the year, the package authors should be compelled to fix their packages, so the number of overridden/broken packages should quickly decrease.
@nh2 I don't know how to get a version of stack built with that cabal patch. Do you happen to have one? Or some nix definition that builds it?
@nh2 Sorry, I don't know the state of support for gold and lld. I wouldn't be terribly surprised if gold didn't support it. I think lld might, but I couldn't find anything conclusive in the manual.
As to why it's a compile time check, I think a runtime check would be quite expensive to perform, particularly on platforms where this is really required, like Windows. But realistically speaking, someone probably just cut a corner here. It could have been at least a configure-time check. The same question was asked upstream in a different thread before, with no answer.
@i-am-the-slime Wait, you're seeing this issue with `stack` already, without nix? On what platform are you?
You will have to build stack from source with the cabal dependency pointed at a commit that includes my patch. I can help you do so if that really is what you need.
OK so this issue has got me again now.
My cabal fix for https://github.com/haskell/cabal/pull/5356 isn't enough. With a larger project, I now see that
- the `ghc` invocation by cabal is deduped fine
- the `cc` invocation by ghc is deduped fine
- the `gcc` invocation by cc contains all `-L` flags duplicated 10x
  - `NIX_LDFLAGS` duplicates (4x) and `NIX_BUILD_LDFLAGS` (6x)
- the `collect` invocation by gcc also, and that one crashes with `E2BIG`
This is on commit 2c07921cff84dfb0b9e0f6c2d10ee2bfee6a85ac and when I build with `musl` (but I think that is unrelated) for https://github.com/nh2/static-haskell-nix/ with nix on Ubuntu 16.04.
I am not sure if this should be clear to me via https://github.com/NixOS/nixpkgs/issues/41340#issuecomment-405272346, but does anybody understand _why_ the duplicates are in `NIX_LDFLAGS` and `NIX_BUILD_LDFLAGS`?
I see flags like `strictDeps` flying around but I don't really know what that does so it would be great if somebody could explain this to me.
@nh2 see #43559 for some discussion of the injection of library search dirs by nix for packages with `$out/lib` folders.
@nh2
You will have to build stack from source with the cabal dependency pointed at a commit that includes my patch. I can help you do so if that really is what you need.
That's exactly what I need. I would appreciate the help.
@i-am-the-slime Do you need the solution for NixOS or other distros?
For other distros, the easiest is to clone stack from git, check out the release tag you want, and change the `extra-deps` entry for `Cabal` in `stack.yaml` to point at your fork of cabal at the same release tag but with the changes cherry-picked.
Here is a concrete example of that, picking my cabal fix on top of Cabal 2.2.0.1:
diff --git a/stack.yaml b/stack.yaml
index e6696b14..4c921f08 100644
--- a/stack.yaml
+++ b/stack.yaml
@@ -18,7 +18,10 @@ flags:
supported-build: true
extra-deps:
- rio-0.1.1.0@rev:0
-- Cabal-2.2.0.1@rev:0
+- git: git@github.com:nh2/cabal.git
+  commit: 7cb409fe7433833a3a8aa4b38a5fb3c2e01a5e5d
+  subdirs:
+  - Cabal
- hpack-0.28.2
- http-api-data-0.3.8.1@rev:0
You may be able to directly use code I wrote for `static-haskell-nix`: https://github.com/nh2/static-haskell-nix/blob/ef283274ce193f713082591dd462f4bd3fb4dd1f/survey/default.nix#L96
If you don't want to build a static version of stack, you can probably also just use
stack = useFixedCabal super.stack;
with `useFixedCabal` coming from here.
Hope that helps!
@nh2 Thanks a lot for the explanation and code! It is really helpful!
I think I am having the same problem as i-am-the-slime. I am trying to use `stack` to build a package with `buildStackProject`, but it is failing with the `Argument list too long` error.
I tried to use your version of `stack` from `static-haskell-nix/survey/default.nix`. `stack` appears to build fine (except that it took a long time to build because I couldn't get cachix working), but it seems not to have fixed the problem. It still fails with the same error when trying to build `haskell-gi`:
$ stack build --fast
haskell-gi-0.21.3: configure
haskell-gi-0.21.3: build
Progress: 1/12
-- While building custom Setup.hs for package haskell-gi-0.21.3 using:
/home/illabout/.stack/setup-exe-cache/x86_64-linux-nix/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.3 --builddir=.stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1 build --ghc-options " -ddump-hi -ddump-to-file -fdiagnostics-color=always"
Process exited with code: ExitFailure 1
Logs have been written to: /home/illabout/git/termonad/.stack-work/logs/haskell-gi-0.21.3.log
Configuring haskell-gi-0.21.3...
Preprocessing library for haskell-gi-0.21.3..
gcc: error trying to exec '/nix/store/imfm3gk3qchmyv7684pjpm8irvkdrrkk-gcc-7.3.0/libexec/gcc/x86_64-unknown-linux-gnu/7.3.0/cc1': execv: Argument list too long
compiling .stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1/build/Data/GI/CodeGen/GType_hsc_make.c failed (exit code 1)
command was: /nix/store/yz6kinf4ia19r1c14yirl6x4ciwgzk67-gcc-wrapper-7.3.0/bin/cc -c .stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1/build/Data/GI/CodeGen/GType_hsc_make.c -o .stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1/build/Data/GI/CodeGen/GType_hsc_make.o -fno-stack-protector -fno-stack-protector -D__GLASGOW_HASKELL__=804 -Dlinux_BUILD_OS=1 -Dx86_64_BUILD_ARCH=1 -Dlinux_HOST_OS=1 -Dx86_64_HOST_ARCH=1 -I/nix/store/12y8lyrwiaxmhpz6vmdb1rdb3s509q1m-xz-5.2.4-dev/include -I/nix/store/28x7gibl6gf1c165d5jnlayzyqghyfn8-gdk-pixbuf-2.36.12-dev/include -I/nix/store/2fmmlcn6apvmndlzbqarxf3vbdp1kxcy-cups-2.2.6-dev/include -I/nix/store/2z9hyi2c3k0bwiqicml7y3xx2hnqcsmf-xcb-util-0.4.0-dev/include -I/nix/store/307bxzghi2r3r6x5ajc3cn2x04f27hbj-libdrm-2.4.92-dev/include -I/nix/store/34illdyfm3k676akas85kr5zc54vjqx7-graphite2-1.3.6/include
...
Here is my code if you want to play around with it yourself:
https://github.com/cdepillabout/termonad
If you have `nix` and `stack` installed, you should be able to reproduce this error with the following commands:
$ git clone git@github.com:cdepillabout/termonad.git
$ cd termonad
$ git checkout 8d371be175c3bb40c2204fefe538462ff9a5186b
$ stack --nix build
...
gcc: error trying to exec '/nix/store/imfm3gk3qchmyv7684pjpm8irvkdrrkk-gcc-7.3.0/libexec/gcc/x86_64-unknown-linux-gnu/7.3.0/cc1': execv: Argument list too long
...
Here's the commit before I started using your stuff (this also fails with the exact same error):
https://github.com/cdepillabout/termonad/commit/fa237b46238819384a2b185e10a51511af26bc08
Here's the commit where I start using your `survey/default.nix` file (this fails with the error pasted above):
https://github.com/cdepillabout/termonad/commit/8d371be175c3bb40c2204fefe538462ff9a5186b
(I am relatively new to nix, so it is possible that I am just making a stupid mistake with my nix code above.)
Is there any way to work around this problem? Ideally I could just go back to 17.09, but that doesn't have a GHC-8.4.3 derivation I can easily use.
I don't know if anyone will be interested in this (other than @i-am-the-slime), but I was able to use GHC-8.0.2 to work around this.
Here is a way to run the code. It is using commit a4fcfdb24:
$ git clone git@github.com:cdepillabout/termonad.git
$ cd termonad
$ git checkout 8d371be175c3bb40c2204fefe538462ff9a5186b
$ stack --nix build
This is using GHC-8.0.2 from the nixpkgs for NixOS-18.03.
In case anyone is interested, GHC-8.2.2 also didn't work for any version of nixpkgs I tested. I tried a bunch of different ones all the way back to 17.09.
I've pushed https://github.com/dezgeg/nixpkgs/commit/f3758258b8895508475caf83e92bfb236a27ceb9 to staging with a hope that it resolves this issue. We'll see how it goes in the next days once hydra rebuilds the world.
Testing at https://hydra.nixos.org/jobset/nixpkgs/staging-next - it should finish tomorrow unless something important comes to Hydra queue meanwhile. So far it looks very promising.
To back this up:
$ on staging-next
$ nix-shell -A haskellPackages.stack --run "env | grep NIX_LDFLAGS | wc -m"
637
$ on master
$ nix-shell -A haskellPackages.stack --run "env | grep NIX_LDFLAGS | wc -m"
26505
Thanks to @dezgeg
Thanks a lot @domenkozar and @dezgeg!
I'd love to test this when Hydra finishes building it.
How can I figure out when Hydra has finished building it? When I go to https://hydra.nixos.org/jobset/nixpkgs/staging-next, I see a list of evaluations, but I can't tell which are finished and which still haven't been built. Also, I'm not sure which ones contain the commit from @dezgeg you linked to above.
Is there a guide to using Hydra you would recommend?
@cdepillabout it's the last two at this very moment on that list.
PR is open now: https://github.com/NixOS/nixpkgs/pull/44009
Hydra testing is done on linux, but it has quite a few packages left for arch/darwin, so probably merge is imminent for tomorrow.
Merged to master!
But the underlying problem of duplicated `-L` flags is still there, it's just that we work around the problems happening with Haskell. It's being hacked around here for instance: https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/graphics/inkscape/default.nix#L24
@domenkozar I guess you could split it off into a separate issue, with say the findings from https://github.com/NixOS/nixpkgs/issues/41340#issuecomment-403453874 and any other findings you have (I haven't followed this problem as closely as you).
Yes, the underlying issue of setup-hooks being triggered 6 times per dependency is still there, but I think "argument list too long" is "fixed" for now (as in, the limit isn't reached).
$ on master
$ nix-shell -A inkscape --run "env | grep NIX_LDFLAGS | wc -m"
4672
$ on (HEAD detached at f68920176c2)
$ nix-shell -A inkscape --run "env | grep NIX_LDFLAGS | wc -m"
9112
@domenkozar I'm still having a problem when trying to build with `stack`'s `nix` support, using `buildStackProject`.
Here is my project. Here is the shell.nix I am using.
Here is what happens when I try to build this project on NixOS:
$ stack build --nix
...
haskell-gi-0.21.3: configure
haskell-gi-0.21.3: build
Progress: 1/12
-- While building custom Setup.hs for package haskell-gi-0.21.3 using:
/home/illabout/.stack/setup-exe-cache/x86_64-linux-nix/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.3 --builddir=.stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1 build --ghc-options " -ddump-hi -ddump-to-file -fdiagnostics-color=always"
Process exited with code: ExitFailure 1
Logs have been written to: /home/illabout/git/termonad/.stack-work/logs/haskell-gi-0.21.3.log
Configuring haskell-gi-0.21.3...
Preprocessing library for haskell-gi-0.21.3..
gcc: error trying to exec '/nix/store/hsw4smim26xm732xxi4xnpx5s1az81ld-gcc-7.3.0/libexec/gcc/x86_64-unknown-linux-gnu/7.3.0/cc1': execv: Argument list too long
compiling .stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1/build/Data/GI/CodeGen/GType_hsc_make.c failed (exit code 1)
command was: /nix/store/frqm8gncayv83s3hrv7hpldp865gbspc-gcc-wrapper-7.3.0/bin/cc -c .stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1/build/Data/GI/CodeGen/GType_hsc_make.c -o .stack-work/dist/x86_64-linux-nix/Cabal-2.2.0.1/build/Data/GI/CodeGen/GType_hsc_make.o -fno-stack-protector ...
Is this a known problem? Is `buildStackProject` still not supported?
@cdepillabout it builds for me, stack is a black box, maybe try `stack clean --full`.
First of all (and I am embarrassed that I don't remember saying it before): sorry for all the headache my cross changes have caused with this. Second, yes, please do open an issue for the underlying problem. I will refer to it in my upcoming `strictDeps` everywhere RFC.
@domenkozar Thanks for trying to build it on your machine. That is good to hear that it is working for you.
I was able to get it working on my machine as well.
I was running `stack build --nix` using the system-wide-installed `stack`. I thought this would re-exec the `stack` defined in `buildStackProject`, and use that to actually build the project. This turned out to not be the case.
Instead, I just ran `nix-shell` with the `shell.nix` file linked above, and then from there I ran `stack build`. That appears to work.
Thanks!
I'm on darwin and still hitting this issue, but with `hsc2hs`.
/nix/store/nc6wvwy7skyz2nsh7hyhgshg5a0y56j5-ghc-8.4.3/bin/hsc2hs: createProcess: runInteractiveProcess: exec: resource exhausted (Argument list too long)
I'm also hitting the error `'cc' failed in phase 'Linker'` on mac. However, I get it without using anything `stack` related, I can consistently get the error with something like `cabal install hyper-haskell-server`.
Both `cabal-install-2.2.0.0` and `ghc-8.4.3` were recently installed from the unstable channel.
Just like @2mol, I'm seeing this on Mac OS X in the linker when compiling `Setup.hs`.
This can be reproduced by trying to build Termonad on Mac OS X. You should see this error on the current `master` branch of Termonad (for posterity, this is commit https://github.com/cdepillabout/termonad/tree/09942c44eb5bf57fe95deedbbe601ecbdc0a9bf5):
$ git clone git@github.com:cdepillabout/termonad.git
$ nix-build
...
building '/nix/store/4f1c2pnzhh07vxq54s7q2ib5y6xrwndx-gi-gdk-3.0.16.drv'...
setupCompilerEnvironmentPhase
Build with /nix/store/dxljd69cp3m5cgfl53h0fjbq202m0fm7-ghc-8.4.4.
unpacking sources
unpacking source archive /nix/store/fab0pd9x1x2w949a33lsl3m1mkbh4890-gi-gdk-3.0.16.tar.gz
source root is gi-gdk-3.0.16
setting SOURCE_DATE_EPOCH to timestamp 1526109557 of file gi-gdk-3.0.16/stack.yaml
patching sources
compileBuildDriverPhase
setupCompileFlags: -package-db=/private/var/folders/1b/r0tb60rs58jbyrp6fcbhy62h0000gn/T/nix-build-gi-gdk-3.0.16.drv-0/setup-package.conf.d -j12 -threaded
[1 of 1] Compiling Main ( Setup.hs, /private/var/folders/1b/r0tb60rs58jbyrp6fcbhy62h0000gn/T/nix-build-gi-gdk-3.0.16.drv-0/Main.o )
Linking Setup ...
clang-5.0: warning: argument unused during compilation: '-nopie' [-Wunused-command-line-argument]
/nix/store/l928iaj0m48lfb7dib6h7p22d50l566x-clang-wrapper-5.0.2/bin/cc: line 183: /nix/store/9sczacl37p08kdw7c4a11cm05355088w-clang-5.0.2/bin/clang: Argument list too long
`cc' failed in phase `Linker'. (Exit code: 126)
builder for '/nix/store/4f1c2pnzhh07vxq54s7q2ib5y6xrwndx-gi-gdk-3.0.16.drv' failed with exit code 1
cannot build derivation '/nix/store/yn3vs02d1ns5l9w5rab7lbhw3jip3p7m-ghc-8.4.4-with-packages.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/gdyn3143lydyg2151mjwyympjah1cvk9-termonad-with-packages-8.4.4.drv': 1 dependencies couldn't be built
error: build of '/nix/store/gdyn3143lydyg2151mjwyympjah1cvk9-termonad-with-packages-8.4.4.drv' failed
Here's the full log produced by passing `-v` to GHC:
https://gist.github.com/cdepillabout/5485f1160b8597eede7790a1722f5b5f
The bad line in question is the following, which occurs when linking `Setup.hs`. This line is too long:
/nix/store/l928iaj0m48lfb7dib6h7p22d50l566x-clang-wrapper-5.0.2/bin/cc -fno-stack-protector -DTABLES_NEXT_TO_CODE -o Setup -lm -no-pie -fno-common -U__PIC__ -D__PIC__ -Wl,-no_compact_unwind /private/var/folders/1b/r0tb60rs58jbyrp6fcbhy62h0000gn/T/nix-build-gi-gdk-3.0.16.drv-0/Main.o -L/nix/store/p20qnkan97c32bxly68ldrwsjl24ifwq-haskell-gi-0.21.5/lib/ghc-8.4.4/x86_64-osx-ghc-8.4.4/haskell-gi-0.21.5-8TMPbZXMdzR6DkzsG709n8 ...
In https://github.com/NixOS/nixpkgs/pull/49552#issuecomment-438524235, @matthewbauer was saying that this problem is already fixed in nixpkgs, but maybe it is not? Or maybe it is only still broken on Mac OS X? The above code compiles correctly on Linux.
The above code is building with a version of nixpkgs from a few days ago.
@cdepillabout What is the maximum command line length on the version of OSX you're using?
I cannot reliably find information about this on the Internet, but https://apple.stackexchange.com/questions/254010/is-there-a-way-around-error-command-line-too-long suggests it's 130592 and your command line in question is 137k long.
@matthewbauer's fix is about environment variables, which unfortunately don't appear in your output, so we can't exactly tell. Perhaps with some equivalent of `strace` on OSX it could be shown what the actual environment variables are during the `exec()` call, so we can be sure that it's really the command line arguments limit, not long environment variables?
In any case, the command line looks reasonable to me, I don't think it can be shortened.
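One way to separate the two suspects (argument list vs. environment) is to add up their sizes and compare against the system limit. The Python sketch below is only an approximation under stated assumptions: it counts each string plus its NUL terminator, ignores the pointer arrays and any other overhead some kernels also charge, and the helper name and sample argv are made up.

```python
import os


def exec_payload_size(argv, env):
    """Approximate bytes passed to exec: every string plus its NUL terminator.

    Environment entries are counted in their 'KEY=value' form.
    """
    arg_bytes = sum(len(os.fsencode(a)) + 1 for a in argv)
    env_bytes = sum(
        len(os.fsencode(k)) + 1 + len(os.fsencode(v)) + 1 for k, v in env.items()
    )
    return arg_bytes + env_bytes


# Hypothetical short command line, measured against the current environment.
argv = ["cc", "-o", "Setup", "Main.o"]
limit = os.sysconf("SC_ARG_MAX")
used = exec_payload_size(argv, dict(os.environ))
print("limit:", limit, "approx. used:", used)
```

Running the same calculation with the real 137k command line and the build environment would show which of the two dominates.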
I think we really have to fix this issue in upstream compilers (like I said in https://github.com/NixOS/nixpkgs/issues/41340#issuecomment-399465674 for gcc, I didn't know clang has a similar issue), as we can only get so far with hacky workarounds trying to reduce the command line length. Modern programs link many libraries, and compilers must support that, there is no way around it.
I've been researching this issue for the past few days, because I've been trying to get the LibreOffice package to build on Darwin, and it was hitting this issue too.
In both that case, and the termonad issue posted by @cdepillabout, the issue has been that the arguments passed to the linker by clang have been too long. This probably isn't something we can fix – I suspect the reason we're hitting it and discovering all these things is that the extra linker flags added by cc-wrapper are just enough to push the linker invocation over the edge.
I fixed the issue in LibreOffice here (https://github.com/LibreOffice/core/commit/ab9d95e6073d84a0dbabf1a4e704b8468afe7bff) by using -Wl,-filelist to pass object files to link to the linker, through clang, as a file rather than an argument list. Here's an example of it being fixed in Thunderbird when they independently ran into the same issue.
These fixes are specific to particular build systems, though, so if we want a proper solution, I agree that the thing to do is fix clang. There is a bug open in clang for this exact issue, and it seems to be generally accepted to be a good idea, but there hasn't been any activity on it for five years.
Once we have that fixed, we can move on to using response files when we invoke the compiler, and that should be pretty straightforward. But there's not much point doing that until the -filelist thing is implemented. Since I've been testing it anyway, though, it should be something like this:
diff --git a/pkgs/build-support/cc-wrapper/cc-wrapper.sh b/pkgs/build-support/cc-wrapper/cc-wrapper.sh
index 8003fe1d8f3..cdc3b1ce86d 100644
--- a/pkgs/build-support/cc-wrapper/cc-wrapper.sh
+++ b/pkgs/build-support/cc-wrapper/cc-wrapper.sh
@@ -180,7 +180,10 @@ fi
PATH="$path_backup"
# Old bash workaround, see above.
-exec @prog@ \
+responseFile=$(mktemp)
+printf "%q " \
${extraBefore+"${extraBefore[@]}"} \
${params+"${params[@]}"} \
- ${extraAfter+"${extraAfter[@]}"}
+ ${extraAfter+"${extraAfter[@]}"} \
+ > "$responseFile"
+exec @prog@ "@$responseFile"
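One detail to watch with the printf "%q" approach: bash quoting and response-file quoting are not the same syntax. GCC documents response files as whitespace-separated options where quotes and backslash escapes are honored, so a writer that only uses backslash escapes is the safe subset. A minimal sketch in Python of such a writer follows. The function names are hypothetical, and I'm assuming clang tokenizes response files the same way on non-Windows hosts.

```python
def quote_for_response_file(arg):
    """Backslash-escape whitespace, quotes and backslashes in one argument."""
    if arg == "":
        return "''"  # an empty argument has to be written as an empty quoted string
    out = []
    for ch in arg:
        if ch in " \t\n\r\f\v\\'\"":
            out.append("\\")
        out.append(ch)
    return "".join(out)


def write_response_file(path, args):
    """Write one escaped argument per line; the caller then passes "@" + path."""
    with open(path, "w") as f:
        for arg in args:
            f.write(quote_for_response_file(arg) + "\n")
```

With that, the wrapper would end in the equivalent of exec @prog@ "@$responseFile", as in the diff above.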
The problem with the clang bug is that on Darwin clang eventually calls system ld64, which doesn’t support a response file. Unfortunately, -filelist is not the same as the response file, as you can’t give options in it, you can only give input files. This issue here is caused mostly by the growing number of libraries, which translates into the growing number of _options_; by making clang pass inputs to ld via -filelist we do not really solve the problem, although we slightly improve on it, of course. I guess, this is the reason for the lack of activity.
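To make the limitation concrete, here is a minimal sketch in Python of what moving inputs into a -filelist amounts to. It assumes that only plain .o paths are moved and that the file holds one path per line. split_for_filelist and write_filelist are hypothetical helpers, not part of cc-wrapper or clang.

```python
def split_for_filelist(args):
    """Separate object-file inputs from everything that must stay an argument."""
    inputs, remaining = [], []
    previous = None
    for arg in args:
        # The value of -o is an output path, not an input, even if it ends in .o.
        if arg.endswith(".o") and not arg.startswith("-") and previous != "-o":
            inputs.append(arg)
        else:
            remaining.append(arg)
        previous = arg
    return inputs, remaining


def write_filelist(path, inputs):
    """Write one input path per line and return the flag that replaces them."""
    with open(path, "w") as f:
        for name in inputs:
            f.write(name + "\n")
    return "-Wl,-filelist," + path
```

All -L and -l options end up in the remaining list, so they still count against the argument limit.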
To be honest, I don’t think anyone expects Apple to implement this in ld64. One viable strategy would be to patch ld64 ourselves, then patch clang ourselves, tell it to use our ld instead of the system one, and then maintain a patched linker and a patched C compiler 🙄. The good news is that the clang patch _might be_ upstreamable, if hidden behind some sort of --with-gnu-ld, similar to GCC, but maybe with a more indicative name. The bad news is that a patched linker is still meh 🤷‍♂️. (Although, maybe, just maybe, if we send this patch to Apple...)
@vcunat vcunat closed this in 9172c1e 4 hours ago
@vcunat @matthewbauer Should the above really have closed this issue?
"avoid running the same hook twice" sounds to me like at most a factor of 2x can be gained, which would mean the problem isn't really solved in general.
So it seems to me the real solution remains fixing GCC and other compilers to not turn response file contents into environment variables.
Reopening until somebody confirms otherwise.
So it seems to me the real solution remains fixing GCC and other compilers to not turn response file contents into environment variables.
This. So. Much. This.
Oops sorry. Yeah that commit might fix the issue but was mostly just to avoid duplication that happens when strictDeps = false.
I've made a repro (for this) in Python on my Ubuntu 16.04 to confirm the 128KB limit for me:
import os
import sys
# num_chars = 128000 # works
# num_chars = (2**16) # works
# num_chars = (2**17) # fails with E2BIG (Argument list too long)
# num_chars = (2**17 - 3) # works
num_chars = (2**17 - 2) # fails with E2BIG (Argument list too long)
sys.stdout.write("trying with " + str(num_chars) + " chars\n")
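# argv must be non-empty on Python 3; the single oversized variable goes into the environment
os.execve('/bin/true', ['/bin/true'], {
    'x': 'a' * num_chars
})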
This shows
trying with 131069 chars
and
trying with 131070 chars
OSError: [Errno 7] Argument list too long
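The off-by-three in those numbers is explained by how the string is stored: the limit applies to the whole "NAME=value" entry including its terminating NUL byte, and on Linux a single string may be at most 32 pages long. A small sketch of that arithmetic (max_env_value_len is just a name I picked for illustration):

```python
import os

MAX_ARG_STRLEN_PAGES = 32  # Linux: one argv/env string may span at most 32 pages


def max_env_value_len(name, page_size=None):
    """Longest value a variable called `name` can hold before execve fails with E2BIG."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    max_string_bytes = MAX_ARG_STRLEN_PAGES * page_size
    # Subtract the name, the "=" separator and the terminating NUL byte.
    return max_string_bytes - len(name) - 1 - 1
```

With 4 KB pages this gives 131069 for a variable named x, matching the experiment above, and slightly less for a longer name such as COLLECT_GCC_OPTIONS.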
The hsc2hs variant of this issue is captured in this ticket: https://github.com/haskell/hsc2hs/issues/22 (note that cabal actually calls hsc2hs with response files when possible: https://github.com/haskell/cabal/issues/3122)
So the error [..]/bin/hsc2hs: createProcess: runInteractiveProcess: exec: resource exhausted (Argument list too long) indicates that hsc2hs is getting invoked with a pre 2.4.0.0 version of the Cabal library. However, even with that fixed, hsc2hs itself could still fail on a long arg list, but that message would say something like [..]/bin/gcc: createProcess: runInteractiveProcess: exec: resource exhausted (Argument list too long).
I found two more points to this problem (hitting it again today):
1. NIX_LDFLAGS already contains duplicated -L flags, sneaking them past Cabal directly to the linker. This means no fixes in Cabal can help that.
   NIX_*FLAGS variables are just bad hacks and should be avoided wherever possible. These hacks cause these problems. Instead, flags should be passed directly to the build systems, as I propose for a similar variable in #79303. I think in nixpkgs we must work to remove these hacks.
2. When using a -pgmc compiler (or at least that seems to make the difference for me), then _additionally_ the same variables will also be in NIX_TARGET_LDFLAGS, which will _also_ make its way into COLLECT_GCC_OPTIONS (in addition to NIX_LDFLAGS).
For the second point, can anybody explain to me what NIX_TARGET_LDFLAGS is designed for? Its uses in the code have zero comments, it's just a pile of bash script without any explanations.
NIX_LDFLAGS already contains duplicated -L flags, sneaking them past Cabal directly to the linker.
I've developed a hacky workaround for this, which deduplicates NIX_*LDFLAGS variables straight in your preConfigure:
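The workaround script itself is not preserved in this copy of the thread. As an illustration of the idea only (this is not that script), here is a minimal Python sketch of deduplicating -L flags in a whitespace-separated flag string such as the value of NIX_LDFLAGS. It keeps the first occurrence so the search order stays the same, and it leaves -l flags alone because repeating a library can be intentional when linking statically. dedupe_search_paths is a hypothetical name.

```python
def dedupe_search_paths(flags):
    """Drop repeated -L flags from a whitespace-separated flag string."""
    tokens = flags.split()
    seen = set()
    result = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "-L" and i + 1 < len(tokens):
            # Normalize the two-token form "-L /path" to "-L/path".
            token = "-L" + tokens[i + 1]
            i += 2
        else:
            i += 1
        if token.startswith("-L"):
            if token in seen:
                continue  # already have this search path
            seen.add(token)
        result.append(token)
    return " ".join(result)
```

For example, "-L/a -lfoo -L/b -L/a -L /b -lfoo" becomes "-L/a -lfoo -L/b -lfoo".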
I'm also running into this problem, with gst_all_1.gst-plugins-bad on darwin: https://hydra.nixos.org/build/123830272
Sanity check compile stderr:
/nix/store/jhfr9mqrjjc1nfjrnjwb8sl889aj8lj9-clang-wrapper-7.1.0/bin/clang: line 194: /nix/store/mgh4h96hhxgd1szm9zfy57pijh931vmx-clang-7.1.0/bin/clang: Argument list too long
-----
meson.build:1:0: ERROR: Compiler clang can not compile programs.
I don't know if that is the problem, but PKG_CONFIG_PATH contains 142 items and 10706 characters.
Every item is duplicated once, sort -u | wc -l gives back 71 items.
NIX_CFLAGS_COMPILE_FOR_BUILD is 30330 characters and everything is included three times.
I have not tried the script above yet, but I'm not sure if it will help, since it seems to filter other things.
Edit: After trying the build in a nix-shell with NIX_DBG=1, I see that the total number of arguments that are passed is 1594, with a total of 77316 characters. I don't know what the limit is on mac.
The final argument list sent to clang has 5 copies of everything, so sort -u gives 210 arguments, instead of 1003 arguments (after merging paired arguments like -iframework /nix/... into single arguments).
Edit 2: The limit on my mac seems to be 262117 characters, so I don't know why it failed. I may be missing something.
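For anyone who wants to repeat that counting on their own build, here is a rough Python sketch of the "merge paired arguments, then compare total against unique" step. The set of flags treated as taking a separate value is my assumption and almost certainly incomplete, and both function names are made up.

```python
# Assumption: flags whose value is passed as the following argument.
PAIRED_FLAGS = {"-iframework", "-isystem", "-idirafter", "-F", "-I", "-L"}


def merge_paired(args):
    """Join a flag and its separate value into one "flag value" string."""
    merged = []
    i = 0
    while i < len(args):
        if args[i] in PAIRED_FLAGS and i + 1 < len(args):
            merged.append(args[i] + " " + args[i + 1])
            i += 2
        else:
            merged.append(args[i])
            i += 1
    return merged


def summarize(args):
    """Count merged arguments, distinct merged arguments, and raw characters."""
    merged = merge_paired(args)
    return {
        "arguments": len(merged),
        "unique": len(set(merged)),
        "chars": sum(len(arg) for arg in args),
    }
```

A large gap between "arguments" and "unique" points at duplication in the wrapper rather than at a genuinely long command line.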
I'm running into exactly the same problem, do you have any progress here?
I have a similar issue on macOS using GHC 8.4.3: