The (at this point infamous) load commands size problem still occurs on nixpkgs even after the GHC 8.0.2 / LTS 8.0 update.
I'll try to summarize the underlying issue, but please be aware I'm not an expert and am just stitching together things from other issues. Please correct me if the details are wrong.
This issue is due to a change in macOS's linker in Sierra which limits the Mach-O load commands to 32768 bytes. Those load commands are dominated by RPATH entries, and nix exacerbates the issue greatly by having a ton of long-named directories, around one per dependency.
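To check a given binary against that limit, one can read the `sizeofcmds` field of its Mach-O header. This is a minimal sketch, assuming a thin (non-fat), little-endian, 64-bit Mach-O file. The helper names and the constant name are made up for illustration, the limit is simply the 32768 quoted above, and the exact comparison dyld performs is not confirmed here.

```python
import struct

# Limit quoted in this thread; the constant name is ours, not dyld's.
LOAD_COMMANDS_LIMIT = 32768

MH_MAGIC_64 = 0xFEEDFACF  # magic number of a 64-bit Mach-O file


def load_commands_size(header: bytes) -> int:
    """Return sizeofcmds from the first 32 bytes of a thin 64-bit Mach-O file."""
    if len(header) < 32:
        raise ValueError("need at least 32 bytes of mach_header_64")
    # mach_header_64 fields, all 32-bit: magic, cputype, cpusubtype,
    # filetype, ncmds, sizeofcmds, flags, reserved
    magic, _cpu, _sub, _ftype, _ncmds, sizeofcmds, _flags, _res = struct.unpack(
        "<IiiIIIII", header[:32]
    )
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")
    return sizeofcmds


def exceeds_limit(header: bytes) -> bool:
    """True if the load commands are larger than the limit quoted above."""
    return load_commands_size(header) > LOAD_COMMANDS_LIMIT
```

Usage would be along the lines of `exceeds_limit(open(path, "rb").read(32))` on a built dylib or executable.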
In GHC (https://ghc.haskell.org/trac/ghc/ticket/12479), Stack (https://github.com/commercialhaskell/stack/issues/2577), and Cabal (https://github.com/haskell/cabal/pull/3955, https://github.com/haskell/cabal/pull/3982, et al) it looks like it was resolved by carefully managing the location of dylibs to all be located in the same place so that only one RPATH needed emitting (https://ghc.haskell.org/trac/ghc/ticket/12479#comment:42, probably others).
I could imagine a similar solution being cooked up in generic-builder and with-package-wrapper: create a lib directory for each haskell derivation being built that has symlinks for each transitive dependency. I'm really quite new to all this, though, so I haven't attempted it yet.
nix-build stack-a5b8d468.nix with stack-a5b8d468.nix:
let
  bootstrap = import <nixpkgs> {};
  pkgsSrc = bootstrap.fetchFromGitHub {
    owner = "NixOS";
    repo = "nixpkgs";
    rev = "a5b8d468a504e0eedcda71de1694201806fb921d";
    sha256 = "0alh9v12nyxsdlssg72k892zdzzc15wr8fqy8qpq9rc282bpx7xm";
  };
  pkgs = import pkgsSrc {};
in
  pkgs.stack
On my system yields:
…
[ 10 of 121] Compiling System.Process.Read ( src/System/Process/Read.hs, dist/build/System/Process/Read.o )
<no location info>: error:
ghc: panic! (the 'impossible' happened)
(GHC version 8.0.2 for x86_64-apple-darwin):
Loading temp shared object failed: dlopen(/private/var/folders/4b/7smbp2kj7m770r24jxxfwr700000gn/T/nix-build-stack-1.3.2.drv-0/ghc46406_0/libghc_135.dylib, 5): no suitable image found. Did find:
/private/var/folders/4b/7smbp2kj7m770r24jxxfwr700000gn/T/nix-build-stack-1.3.2.drv-0/ghc46406_0/libghc_135.dylib: malformed mach-o: load commands size (41792) > 32768
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
…
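When triaging build logs like the one above, it can help to pull the two numbers out of the dyld message and see how far over the limit a library is. A small sketch, assuming the message keeps the shape shown in this log; the function name is hypothetical.

```python
import re

# Matches the dyld message shape seen in the log above.
_PATTERN = re.compile(r"load commands size \((\d+)\) > (\d+)")


def parse_load_commands_error(log: str):
    """Return (actual, limit, excess) from a dyld error line, or None."""
    m = _PATTERN.search(log)
    if m is None:
        return None
    actual, limit = int(m.group(1)), int(m.group(2))
    return actual, limit, actual - limit


line = "libghc_135.dylib: malformed mach-o: load commands size (41792) > 32768"
print(parse_load_commands_error(line))  # (41792, 32768, 9024)
```

For the failure reported here, the temporary library is 9024 bytes over the limit.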
This is crushing. Basically none of my haskell projects are executable.
It is. @peti @LnL7, anybody else involved in the haskell and/or darwin infrastructure maybe have any opinions, tips, or concerns to contribute? I intend to try to fix it at some point soon if someone more experienced doesn't come along, but I'm definitely pretty new to the whole nix infrastructure.
^ same for me
I don't see a reasonable solution to this issue other than to disable dynamic linking of Haskell libraries on Darwin.
> I don't see a reasonable solution to this issue other than to disable dynamic linking of Haskell libraries on Darwin.
How can one accomplish this?
Like http://nixos.org/nixpkgs/manual/#how-to-build-with-profiling-enabled, but instead of enableLibraryProfiling = true set
enableSharedExecutables = false;
enableSharedLibraries = false;
as a preliminary report, I tried this with our project that was bombing by using:
haskellPackageOverrides = self: super: {
  mkDerivation = args: super.mkDerivation (args // {
    enableSharedExecutables = false;
    enableSharedLibraries = false;
  });
};
and it failed due to a build failure with file-embed:
building path(s) ‘/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10’
setupCompilerEnvironmentPhase
Build with /nix/store/7nl6dii88xd761nnz3xyh11qcnrqvqri-ghc-8.0.2.
unpacking sources
unpacking source archive /nix/store/qh39vhl5920wgaxwj7jy69njjy548y3y-file-embed-0.0.10.tar.gz
source root is file-embed-0.0.10
setting SOURCE_DATE_EPOCH to timestamp 1461252791 of file file-embed-0.0.10/test/sample/foo
patching sources
compileBuildDriverPhase
setupCompileFlags: -package-db=/private/var/folders/4b/7smbp2kj7m770r24jxxfwr700000gn/T/nix-build-file-embed-0.0.10.drv-0/package.conf.d -j8 -threaded
[1 of 1] Compiling Main ( Setup.lhs, /private/var/folders/4b/7smbp2kj7m770r24jxxfwr700000gn/T/nix-build-file-embed-0.0.10.drv-0/Main.o )
Linking Setup ...
configuring
configureFlags: --verbose --prefix=/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10 --libdir=$prefix/lib/$compiler --libsubdir=$pkgid --with-gcc=clang --package-db=/private/var/folders/4b/7smbp2kj7m770r24jxxfwr700000gn/T/nix-build-file-embed-0.0.10.drv-0/package.conf.d --ghc-option=-j8 --disable-split-objs --disable-library-profiling --disable-profiling --disable-shared --disable-coverage --enable-library-vanilla --disable-executable-dynamic --enable-tests
Configuring file-embed-0.0.10...
Dependency base ==4.*: using base-4.9.1.0
Dependency bytestring >=0.9.1.4: using bytestring-0.10.8.1
Dependency directory >=1.0.0.3: using directory-1.3.0.0
Dependency file-embed -any: using file-embed-0.0.10
Dependency filepath -any: using filepath-1.4.1.1
Dependency template-haskell -any: using template-haskell-2.11.1.0
Using Cabal-1.24.2.0 compiled by ghc-8.0
Using compiler: ghc-8.0.2
Using install prefix:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10
Binaries installed in:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10/bin
Libraries installed in:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10/lib/ghc-8.0.2/file-embed-0.0.10
Dynamic libraries installed in:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10/lib/ghc-8.0.2/x86_64-osx-ghc-8.0.2
Private binaries installed in:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10/libexec
Data files installed in:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10/share/x86_64-osx-ghc-8.0.2/file-embed-0.0.10
Documentation installed in:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10/share/doc/x86_64-osx-ghc-8.0.2/file-embed-0.0.10
Configuration files installed in:
/nix/store/ypy48rz2k6b8vm6mfw4qi9409bfrwc3y-file-embed-0.0.10/etc
No alex found
Using ar found on system at:
/nix/store/51bin2x0i1v9jff0sd98b51hmzv00g0v-cctools-binutils-darwin/bin/ar
No c2hs found
No cpphs found
Using gcc version 4.2.1 given by user at:
/nix/store/pr2jbdyfagqarc0pb023c64r5ny4gn87-clang-wrapper-3.7.1/bin/clang
Using ghc version 8.0.2 found on system at:
/nix/store/7nl6dii88xd761nnz3xyh11qcnrqvqri-ghc-8.0.2/bin/ghc
Using ghc-pkg version 8.0.2 found on system at:
/nix/store/7nl6dii88xd761nnz3xyh11qcnrqvqri-ghc-8.0.2/bin/ghc-pkg
No ghcjs found
No ghcjs-pkg found
No greencard found
Using haddock version 2.17.3 found on system at:
/nix/store/7nl6dii88xd761nnz3xyh11qcnrqvqri-ghc-8.0.2/bin/haddock
No happy found
Using haskell-suite found on system at: haskell-suite-dummy-location
Using haskell-suite-pkg found on system at: haskell-suite-pkg-dummy-location
No hmake found
Using hpc version 0.67 found on system at:
/nix/store/7nl6dii88xd761nnz3xyh11qcnrqvqri-ghc-8.0.2/bin/hpc
Using hsc2hs version 0.68.1 found on system at:
/nix/store/7nl6dii88xd761nnz3xyh11qcnrqvqri-ghc-8.0.2/bin/hsc2hs
Using hscolour version 1.24 found on system at:
/nix/store/fkwgiwswps6akvh0l5wga450b8hgkzxw-hscolour-1.24.1/bin/HsColour
No jhc found
Using ld found on system at:
/nix/store/pr2jbdyfagqarc0pb023c64r5ny4gn87-clang-wrapper-3.7.1/bin/ld
No lhc found
No lhc-pkg found
No pkg-config found
Using strip found on system at:
/nix/store/51bin2x0i1v9jff0sd98b51hmzv00g0v-cctools-binutils-darwin/bin/strip
Using tar found on system at:
/nix/store/ybaqdf0hlkkjrlsx5dazlp0arz6hbraq-gnutar-1.29/bin/tar
No uhc found
building
Building file-embed-0.0.10...
Preprocessing library file-embed-0.0.10...
[1 of 1] Compiling Data.FileEmbed ( Data/FileEmbed.hs, dist/build/Data/FileEmbed.o )
Preprocessing test suite 'test' for file-embed-0.0.10...
[1 of 1] Compiling Main ( test/main.hs, dist/build/test/test-tmp/Main.o )
<no location info>: error:
<command line>: can't load .so/.DLL for: libHSfile-embed-0.0.10-KWMmUDaNSRP7AxWSEHlrrK.dylib (dlopen(libHSfile-embed-0.0.10-KWMmUDaNSRP7AxWSEHlrrK.dylib, 5): image not found)
builder for ‘/nix/store/n9b72x3vfpaxl8f9dbkws759cgn7fsl1-file-embed-0.0.10.drv’ failed with exit code 1
I haven't tracked down yet for sure why this fails. It doesn't seem like file-embed has any odd test suite configuration in either nixpkgs or its cabal file. It's not the first package it tried to build with a test suite either: code-page built prior with test suites. file-embed may be the first test suite that uses TH though, and that could be the reason.
Here's a hacky way to work around this:
pkgs.haskell.packages.ghc802.override {
  overrides = self: super: {
    mkDerivation = args: super.mkDerivation (args // {
      postCompileBuildDriver = ''
        echo "Patching dynamic library dependencies"
        # 1. Link all dylibs from 'dynamic-library-dirs's in package confs to $out/lib/links
        mkdir -p $out/lib/links
        for d in $(grep dynamic-library-dirs $packageConfDir/*|awk '{print $2}'); do
          ln -s $d/*.dylib $out/lib/links
        done
        # 2. Patch 'dynamic-library-dirs' in package confs to point to the symlink dir
        for f in $packageConfDir/*.conf; do
          sed -i "s,dynamic-library-dirs: .*,dynamic-library-dirs: $out/lib/links," $f
        done
        # 3. Recache package database
        ghc-pkg --package-db="$packageConfDir" recache
      '';
    });
  };
}
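The first step of that workaround (collecting every dependency's dylibs behind one directory of symlinks) can be sketched outside of Nix as follows. This is only an illustration: the function name is hypothetical, temporary directories stand in for /nix/store paths, and the package-conf patching and `ghc-pkg recache` steps are left out.

```python
import os
import tempfile


def link_dylibs(library_dirs, links_dir):
    """Symlink every *.dylib found in library_dirs into links_dir.

    Returns the sorted list of link names created. A name already present
    in links_dir is skipped rather than overwritten.
    """
    os.makedirs(links_dir, exist_ok=True)
    created = []
    for d in library_dirs:
        for name in sorted(os.listdir(d)):
            if not name.endswith(".dylib"):
                continue
            target = os.path.join(links_dir, name)
            if os.path.lexists(target):
                continue
            os.symlink(os.path.join(d, name), target)
            created.append(name)
    return sorted(created)


# Demo with throwaway directories standing in for /nix/store paths.
with tempfile.TemporaryDirectory() as root:
    dirs = []
    for pkg in ("libHSfoo", "libHSbar"):
        d = os.path.join(root, pkg)
        os.makedirs(d)
        open(os.path.join(d, pkg + ".dylib"), "w").close()
        dirs.append(d)
    print(link_dylibs(dirs, os.path.join(root, "out", "lib", "links")))
```

With a single links directory, only one search path has to be recorded instead of one per dependency, which is the point of the workaround.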
@peti would you be ok with something like this for Darwin only? Would be happy to prepare a PR.
@dajmaki, yeah, that would be a viable workaround, I suppose.
@dajmaki I was thinking about something like this with rpath before. This will cause conflicts when using multiple haskell packages in a buildEnv unless you put the symlinks in a separate drv or output.
I just got hit with this. Still no path forward?
Just got bitten by this as well.
(whoops commented from my work account 'dajmaki', switching to my private one). @LnL7 could you give an example of a conflict? I'm not quite following.
EDIT: Another problem with my workaround is that if one is building a statically linked executable this will include the dynamic libraries in the closure for no reason.
I get this on Sierra once I pass dead_code to the linker when building purescript. See https://github.com/NixOS/nixpkgs/issues/21200#issuecomment-288345915
That makes no sense to me, why is it suddenly using a longer path?
Assigning myself to keep this on my radar, but no guarantee I'll be able to get to it in a timely manner.
marking this issue closed on account of #25537 being merged and fixing my case! thanks very much @judah @shlevy @peti @LnL7 et al!
I had an idea for a workaround here... not sure it belongs in Nix though.
Mach-O dylibs have the option to re-export other dylibs. We can create "intermediate" dylibs that re-export all the other libraries, as deep as we want. So if the first re-exporting dylib fills up, we create another one. And if needed, we do it recursively. I hope we don't get that big though 😄
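The recursive grouping described here can be sketched as a planning step, before any call to the linker. Assumptions to flag: the limit is modelled as a maximum number of entries per dylib (`fanout`) rather than a byte budget, and the umbrella names are made up; nothing below creates real dylibs.

```python
def plan_reexports(libs, fanout):
    """Group libs into layers of hypothetical re-exporting umbrella dylibs.

    Each umbrella re-exports at most `fanout` entries. Returns the list of
    top-level names to link against and a dict umbrella -> re-exported names.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    umbrellas = {}
    level, current = 0, list(libs)
    # Keep adding layers until the top level itself fits within the fan-out.
    while len(current) > fanout:
        nxt = []
        for i in range(0, len(current), fanout):
            name = f"libreexport_{level}_{i // fanout}.dylib"  # made-up naming
            umbrellas[name] = current[i:i + fanout]
            nxt.append(name)
        current, level = nxt, level + 1
    return current, umbrellas


top, umbrellas = plan_reexports([f"libHSdep{i}.dylib" for i in range(10)], 4)
print(top)
```

With ten libraries and a fan-out of four, one layer of three umbrellas is enough; a larger dependency set simply adds layers.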
FWIW we're still seeing this issue, though this fix delayed it for a while. Also seeing it in a pure stack build, so it's not just nix related....
@copumpkin I'm looking into possible solutions this week, reexport is not a bad idea sadly
We've seen the same. The workaround is to use macOS 12.10 for now.
12.10? :open_mouth:
12.11* :)
Oh, 12.11 relaxes the limit somehow? I'm warming up to the re-export but it seems like it belongs on the Haskell side of things rather than the Nix side of things.
@copumpkin I'm pretty sure he means 10.11... unless Apple had two major version bumps since High Sierra :grinning:
Ah okay. So let's maybe do the re-export stuff? It's a pretty straightforward call to ld. We even do that in libSystem: https://github.com/NixOS/nixpkgs/blob/master/pkgs/os-specific/darwin/apple-source-releases/Libsystem/default.nix#L95-L97
Just to make sure I understand: the issue is that GHC passes a ton of -L and -l options to ld, resulting in a Mach-O file with a ton of library load commands (all pretty long due to /nix/store/longhashjunk) which goes over some limit in dyld. Is that right?
If so, there's another somewhat hacky option, if ld is willing to spit those files out and it's just dyld that refuses to consume them.
@copumpkin yes, that's right as far as I understand
Okay, so the other option is I think similar to @dajmaki's proposal (as I understand it) but we let ld link the too-long output normally. We then use install_name_tool to rewrite all those library references to @loader_path/links/libname.dylib rather than the full path, using the folder of symlinks @dajmaki puts into lib. If the links are somewhere else, we can also just rewrite to @rpath/links/libname.dylib and add an rpath load command pointing to wherever the links are.
Edit: the more robust version of this is to ensure that all Haskell-generated libraries use @rpath-relative install names so we don't have to rewrite the binaries, and then make sure that all haskell output gets the rpath injected into it at link time.
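The rewrite step could be scripted roughly like this. The sketch only builds `install_name_tool` argument lists (using its `-change` and `-add_rpath` options) and does not run them. The function name, the example store paths, and the choice to rewrite only names starting with `libHS` are illustrative assumptions. It also points the rpath directly at the links directory, so references become `@rpath/libname.dylib` rather than `@rpath/links/libname.dylib`.

```python
import posixpath


def rewrite_commands(binary, references, links_rpath):
    """Build install_name_tool argument lists (not executed here).

    Absolute references to Haskell libraries are changed to
    @rpath/<basename>, and one rpath entry pointing at links_rpath is added.
    """
    cmds = []
    for ref in references:
        if ref.startswith("@"):
            continue  # already @rpath- or @loader_path-relative
        base = posixpath.basename(ref)
        if not base.startswith("libHS"):
            continue  # leave system and C libraries untouched
        cmds.append(["install_name_tool", "-change", ref, "@rpath/" + base, binary])
    cmds.append(["install_name_tool", "-add_rpath", links_rpath, binary])
    return cmds


# Hypothetical paths, for illustration only.
for cmd in rewrite_commands(
    "bin/app",
    ["/nix/store/aaa-foo/lib/libHSfoo.dylib", "/usr/lib/libSystem.B.dylib"],
    "/nix/store/zzz-app/lib/links",
):
    print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run` on a Darwin machine once the links directory exists.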
But that just gives us more breathing room, right? We can still eventually have enough libraries to break it? And how does that relate to https://github.com/NixOS/nixpkgs/pull/25537 ?
Yes, just more breathing room (re-export would be a sustainable fix). And I don't actually understand #25537. My understanding is that that wouldn't help if GHC isn't doing weird things with ld, because even if you create a folder full of symlinks to dylibs and tell ld to use those via -L, it'll ignore the path you provide and use the official "install name" (baked into the dylib) attached to the library.
So as an example of what I'm saying:
1. Create a dylib at /path/to/my/libdylib.dylib (and when creating, it _must_ have an install name, probably set to /path/to/my/libdylib.dylib)
2. ln -s /path/to/my/libdylib.dylib /another/path/libdylib.dylib
3. ld -L /another/path -ldylib ...
4. otool -L will show you that libdylib is referenced via /path/to/my/libdylib.dylib, not /another/path
hmm, that PR definitely helped us for a while :grinning: I think at this point we'd be better off with a permanent fix, but as long as we won't be revisiting this again in the next few months I suppose either works for me.
So I'm very confused about how that PR helped, but I guess it's probably not worth investigating.
But TBC, I'm not volunteering to do the re-export solution 😄 Happy to advise someone motivated to implement it, but I don't have the time to tinker with that stuff myself right now.
@copumpkin That PR fixes the issue because it makes ghc use $out/lib/links as an rpath for the dependencies.
$ otool -L result/lib/ghc-8.0.2/x86_64-osx-ghc-8.0.2/libHScomonad-5.0.1-7j4AeOMTFovFSFO9XMFm1-ghc8.0.2.dylib
result/lib/ghc-8.0.2/x86_64-osx-ghc-8.0.2/libHScomonad-5.0.1-7j4AeOMTFovFSFO9XMFm1-ghc8.0.2.dylib:
@rpath/libHScomonad-5.0.1-7j4AeOMTFovFSFO9XMFm1-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSdistributive-0.5.2-JCgfTXNR3ywAyV7fFWIBI5-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHStagged-0.8.5-1mTloBSoUxv8dqUr8XBGBt-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHStemplate-haskell-2.11.1.0-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSpretty-1.1.3.3-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSghc-boot-th-8.0.2-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSbase-orphans-0.5.4-ABoxiBf7nXc7Qqh66CgYc9-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHScontravariant-1.4-3UCY3arLvoG71jrGOYoc39-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSvoid-0.7.2-4PWwLjXxAER9U3zGpDhf6e-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHStransformers-compat-0.5.1.4-IuFogs8HAVUJBWVNMhtssu-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSsemigroups-0.18.2-GvTCUro9Hym1wGKOLNRfUA-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSStateVar-1.1.0.4-5dJbnTVECtEAhfJXPZKdbO-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHStransformers-0.5.2.0-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSstm-2.4.4.1-JQn4hNPyYjP5m9AcbI88Ve-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHScontainers-0.5.7.1-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSdeepseq-1.4.2.0-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSarray-0.5.1.1-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSbase-4.9.1.0-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSinteger-gmp-1.0.0.1-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libHSghc-prim-0.5.0.0-ghc8.0.2.dylib (compatibility version 0.0.0, current version 0.0.0)
/nix/store/dpibmlc62jlg8qmwkm9q2a1ynk1fbamz-libiconv-osx-10.11.6/lib/libiconv.dylib (compatibility version 7.0.0, current version 7.0.0)
/nix/store/m6r6d8465l4xa3i4iff7k5hp23ymv7kc-gmp-6.1.2/lib/libgmp.10.dylib (compatibility version 14.0.0, current version 14.2.0)
/nix/store/p3aw3a3qx1nxpzz5irk7lbwl3zw9syw3-Libsystem-osx-10.11.6/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
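To check a whole build for references that still carry full store paths, the `otool -L` output above can be split mechanically. A minimal sketch, assuming the output format shown here (a first line naming the file, then one reference per line followed by a parenthesised version note); the function name is hypothetical.

```python
def split_references(otool_output):
    """Split `otool -L` output into (@rpath-relative, absolute) install names."""
    rpath_refs, absolute_refs = [], []
    for line in otool_output.splitlines()[1:]:  # first line names the file
        line = line.strip()
        if not line:
            continue
        # Drop the trailing "(compatibility version ..., current version ...)".
        name = line.split(" (compatibility version", 1)[0]
        if name.startswith("@rpath/"):
            rpath_refs.append(name)
        else:
            absolute_refs.append(name)
    return rpath_refs, absolute_refs
```

For the comonad library above, this would put the `libHS*` entries in the first list and the libiconv, gmp and Libsystem store paths in the second.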
I don't know how we could solve this for stack in Nix itself.
We have about 300 dependencies, each taking on average about 100 chars for the rpath path, resulting in ~33000 chars, just above the limit.
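A back-of-the-envelope version of that estimate, which also accounts for per-command overhead on top of the raw path characters. The layout used here is an assumption, not something stated in this thread: a 12-byte fixed part per LC_RPATH command, a NUL terminator, and padding to an 8-byte boundary. It also ignores every other load command in the file.

```python
def rpath_command_size(path_len, align=8):
    """Approximate size of one LC_RPATH load command in a 64-bit Mach-O file.

    Assumes a 12-byte fixed part (cmd, cmdsize, string offset), a NUL
    terminator, and padding of the whole command to `align` bytes.
    """
    raw = 12 + path_len + 1
    return (raw + align - 1) // align * align


def estimate(deps, avg_path_len):
    """Total bytes taken by `deps` rpath commands of the given path length."""
    return deps * rpath_command_size(avg_path_len)


print(estimate(300, 100))          # 36000
print(estimate(300, 100) > 32768)  # True
```

Under these assumptions, 300 paths of 100 characters are only 30000 bytes of raw text, but the fixed part and padding push the total past 32768.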
I think stack will need to handle this similarly as we do in Nix :)
What if we package dyld https://opensource.apple.com/tarballs/dyld/dyld-421.1.tar.gz and patch #define MAX_MACH_O_HEADER_AND_LOAD_COMMANDS_SIZE (32*1024)? Nix wins :dagger:
But this probably means we have to bump our SDK to 10.12
Pre-10.10 we might have been able to use a custom dyld but this happened
Holy crap could Darwin be more hostile to developers? :angry:
sure! they could require everything to be signed
Has anyone opened a ticket with Apple (rdar or Radar or whatever they call it) regarding MAX_MACH_O_HEADER_AND_LOAD_COMMANDS_SIZE? Might be best to fix the problem at its source... assuming they actually listen to their developers...
I can only hope that since Apple is so against static linking they will be slightly more sympathetic.
Marking closed since https://github.com/NixOS/nixpkgs/pull/27536 is in, reopen if this is still broken.