Nixpkgs: pandoc - statically linked closure-size

Created on 29 Jan 2018 · 27 Comments · Source: NixOS/nixpkgs

Issue description

Statically linked pandoc from all-packages.nix has huge closure size.

Pandoc has a huge closure since 2017-12-20 https://hydra.nixos.org/job/nixpkgs/trunk/pandoc.x86_64-linux#tabs-charts.

It seems that pandoc switched to file-embed at this point, and started leaking refs to haskell libraries.

I've tried running ldd and patchelf --print-rpath, and both look clean: neither references anything in the Haskell ecosystem. But strings shows references to Haskell libraries.


> ldd /nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/bin/pandoc
        linux-vdso.so.1 (0x00007ffcf4f78000)
        libm.so.6 => /nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/libm.so.6 (0x00007fe3d4b29000)
        liblua.so.5.3 => /nix/store/1p116lfln2pwfywmsvgcbyfb6inz9l7i-lua-5.3.4/lib/liblua.so.5.3 (0x00007fe3d48f1000)
        libz.so.1 => /nix/store/la2xxf4grdyf37a94ak28m7a61yjqzr9-zlib-1.2.11/lib/libz.so.1 (0x00007fe3d46da000)
        librt.so.1 => /nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/librt.so.1 (0x00007fe3d44d2000)
        libutil.so.1 => /nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/libutil.so.1 (0x00007fe3d42cf000)
        libdl.so.2 => /nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/libdl.so.2 (0x00007fe3d40cb000)
        libgmp.so.10 => /nix/store/hcn4v9kl0lgayz666yf7nfggny504bwz-gmp-6.1.2/lib/libgmp.so.10 (0x00007fe3d3e38000)
        libpthread.so.0 => /nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/libpthread.so.0 (0x00007fe3d3c1a000)
        libc.so.6 => /nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/libc.so.6 (0x00007fe3d3868000)
        /nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/ld-linux-x86-64.so.2 => /nix/store/z0b60y0khix9jb74ka56gw7b7n9s8awx-glibc-2.26-131/lib64/ld-linux-x86-64.so.2 (0x00007fe3d4e75000)

> patchelf --print-rpath /nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/bin/pandoc 
/nix/store/1p116lfln2pwfywmsvgcbyfb6inz9l7i-lua-5.3.4/lib:/nix/store/la2xxf4grdyf37a94ak28m7a61yjqzr9-zlib-1.2.11/lib:/nix/store/hcn4v9kl0lgayz666yf7nfggny504bwz-gmp-6.1.2/lib:/nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib

> strings /nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/bin/pandoc  | grep /nix/store
/nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib/ld-linux-x86-64.so.2
/nix/store/1p116lfln2pwfywmsvgcbyfb6inz9l7i-lua-5.3.4/lib:/nix/store/la2xxf4grdyf37a94ak28m7a61yjqzr9-zlib-1.2.11/lib:/nix/store/hcn4v9kl0lgayz666yf7nfggny504bwz-gmp-6.1.2/lib:/nix/store/1zv5dwifxg5fh08gif8ld3h9f40y8czh-glibc-2.26-115/lib
/nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/bin
/nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/lib/ghc-8.2.2/pandoc-2.0.6
/nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/276rs4f8ad3xpld9kydw7kja95qxp1m3-pandoc-2.0.6-data/share/ghc-8.2.2/x86_64-linux-ghc-8.2.2/pandoc-2.0.6
/nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/libexec/x86_64-linux-ghc-8.2.2/pandoc-2.0.6
/nix/store/dikskp4kisay6gcx3dyf27q97h9a8066-pandoc-2.0.6/etc
/nix/store/sinharz0d375l5lhdj5qpb6cc4p5nqwa-pandoc-types-1.17.3/bin
/nix/store/sinharz0d375l5lhdj5qpb6cc4p5nqwa-pandoc-types-1.17.3/lib/ghc-8.2.2/pandoc-types-1.17.3
/nix/store/sinharz0d375l5lhdj5qpb6cc4p5nqwa-pandoc-types-1.17.3/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/sinharz0d375l5lhdj5qpb6cc4p5nqwa-pandoc-types-1.17.3/share/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3
/nix/store/sinharz0d375l5lhdj5qpb6cc4p5nqwa-pandoc-types-1.17.3/libexec/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3
/nix/store/sinharz0d375l5lhdj5qpb6cc4p5nqwa-pandoc-types-1.17.3/etc
/nix/store/aayklpm1bgms01qcs4b6pd2ypdgiy1w8-HTTP-4000.3.9/bin
/nix/store/aayklpm1bgms01qcs4b6pd2ypdgiy1w8-HTTP-4000.3.9/lib/ghc-8.2.2/HTTP-4000.3.9
/nix/store/aayklpm1bgms01qcs4b6pd2ypdgiy1w8-HTTP-4000.3.9/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/aayklpm1bgms01qcs4b6pd2ypdgiy1w8-HTTP-4000.3.9/share/x86_64-linux-ghc-8.2.2/HTTP-4000.3.9
/nix/store/aayklpm1bgms01qcs4b6pd2ypdgiy1w8-HTTP-4000.3.9/libexec/x86_64-linux-ghc-8.2.2/HTTP-4000.3.9
/nix/store/aayklpm1bgms01qcs4b6pd2ypdgiy1w8-HTTP-4000.3.9/etc
bug haskell
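
The gap between ldd and strings matters because Nix determines runtime dependencies by scanning outputs for store path hashes, not by reading ELF headers. A minimal sketch of that kind of scan, assuming GNU grep; `store_refs` is a hypothetical helper name, not a Nix tool, and the regex is only an approximation of valid store path names:

```shell
# List the unique /nix/store paths embedded anywhere in a file.
# -a treats binary data as text, -o prints only the matching part.
# A store path looks like /nix/store/<32-char hash>-<name>.
store_refs() {
  LC_ALL=C grep -aoE '/nix/store/[0-9a-z]{32}-[-+._?=0-9A-Za-z]+' "$1" \
    | LC_ALL=C sort -u
}
```

Called as `store_refs result/bin/pandoc`, it should print one line per referenced package rather than one line per embedded directory.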

All 27 comments

Yes, this is a problem, indeed:

$ nix-store -qR $(nix-build --no-out-link "" -A pandoc) | grep ghc
/nix/store/21xh5xbxfh7w0kbn4pphwwasvrgcq6sb-ghc-8.2.2-doc
/nix/store/iwlrpgj0g2w93hc7rmlqavhqiylpflvc-ghc-8.2.2

FWIW nix 2's "why-depends" is rather helpful for investigating these sorts of things. May want to use "--all".

$ rabin2 -z pandoc
...
100097 0x0642fac1 0x0682fac1  72  73 (.rodata) ascii pandoc-types-1.17.3-FTIUUxYRy77FizQ4Xfoub1:Text.Pandoc.Builder.C:HasMeta
100098 0x0642fb0a 0x0682fb0a  18  19 (.rodata) ascii Paths_pandoc_types
100099 0x0642fb1d 0x0682fb1d  19  20 (.rodata) ascii pandoc_types_bindir
100100 0x0642fb31 0x0682fb31  67  68 (.rodata) ascii /nix/store/i9n8fqnm5i7gpgnf7ib4qpzxg83a0311-pandoc-types-1.17.3/bin
100101 0x0642fb75 0x0682fb75  19  20 (.rodata) ascii pandoc_types_libdir
100102 0x0642fb89 0x0682fb89  97  98 (.rodata) ascii /nix/store/i9n8fqnm5i7gpgnf7ib4qpzxg83a0311-pandoc-types-1.17.3/lib/ghc-8.2.2/pandoc-types-1.17.3
100103 0x0642fbeb 0x0682fbeb  22  23 (.rodata) ascii pandoc_types_dynlibdir
100104 0x0642fc02 0x0682fc02 100 101 (.rodata) ascii /nix/store/i9n8fqnm5i7gpgnf7ib4qpzxg83a0311-pandoc-types-1.17.3/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
100105 0x0642fc67 0x0682fc67  20  21 (.rodata) ascii pandoc_types_datadir
100106 0x0642fc7c 0x0682fc7c 112 113 (.rodata) ascii /nix/store/i9n8fqnm5i7gpgnf7ib4qpzxg83a0311-pandoc-types-1.17.3/share/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3
100107 0x0642fced 0x0682fced  23  24 (.rodata) ascii pandoc_types_libexecdir
100108 0x0642fd05 0x0682fd05 114 115 (.rodata) ascii /nix/store/i9n8fqnm5i7gpgnf7ib4qpzxg83a0311-pandoc-types-1.17.3/libexec/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3
100109 0x0642fd78 0x0682fd78  23  24 (.rodata) ascii pandoc_types_sysconfdir
100110 0x0642fd90 0x0682fd90  67  68 (.rodata) ascii /nix/store/i9n8fqnm5i7gpgnf7ib4qpzxg83a0311-pandoc-types-1.17.3/etc

ghc seems to encode some library information into the binary:

pandoc_types_bindir pandoc_types_libdir pandoc_types_dynlibdir pandoc_types_datadir pandoc_types_libexecdir pandoc_types_sysconfdir

Via these library dependencies, the closure ultimately pulls in ghc.
Can we teach ghc to stop doing that?

output of nix2:

$ nix why-depends nixpkgs.pandoc nixpkgs.ghc
/nix/store/fcn6xyp22g7dyp7aaxkgcava7416kncy-pandoc-2.1.2
╚═══bin/pandoc: ….pandoc_types_bindir./nix/store/ghjrl4kznrfy2l89b5yb94gb60rxb7w1-pandoc-types-1.17.3.1/bin.pando…
    => /nix/store/ghjrl4kznrfy2l89b5yb94gb60rxb7w1-pandoc-types-1.17.3.1
    ╚═══lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2/libHSpandoc-types-1.17.3.1-BSXOOhfaVkXLfdhdUSFk6O-ghc8.2.2.so: …6_64-linux-ghc-8.2.2:/nix/store/g9baqjsh28swdy6mvvp4hp2by8xvd7af-ghc-8.2.2/lib/ghc-8.2.2/array-0…
        => /nix/store/g9baqjsh28swdy6mvvp4hp2by8xvd7af-ghc-8.2.2

The same references I found in the binary are also in this file: https://hackage.haskell.org/package/pandoc-types-1.19/docs/src/Paths_pandoc_types.html
so this seems to be autogenerated at build time.

That module is generated by Cabal if it is requested by package author. It is usually used to find datadir, in case of pandoc-types it is used to get package version. Module is listed as "other-modules" which means it is not exported from package. I would expect ghc to eliminate everything besides version symbol as dead code in such case.

Apparently it is not dead code, since I found these strings being used by machine code in the binary.

@sopvop That is true, as long as the datadir, bindir etc are not used, see git-annex or purescript for example (the latter is also compiled statically and does not refer to other libraries).

Okay, I know the reason now: (cc @peti)

When compiling statically, Cabal or GHC has to embed the libdir (and bindir, …) references of every library that uses them into the binary.

Here's the output of the pandoc binary:

○ → strings /nix/store/8ky36i3m3km1qnc7q6dzknc3qpildz36-pandoc-2.0.6/bin/pandoc | grep /nix/store
/nix/store/2kcrj1ksd2a14bm5sky182fv2xwfhfap-glibc-2.26-131/lib/ld-linux-x86-64.so.2
/nix/store/8qfd8gx0j3yzamkrbrfz5kc00h4cqd1q-gmp-6.1.2/lib:/nix/store/xjbsf7x6a9r64sxnzhd6h1f7kjs0c9vl-lua-5.3.4/lib:/nix/store/r43dk927l97n78rff7hnvsq49f3szkg6-zlib-1.2.11/lib:/nix/store/2kcrj1ksd2a14bm5sky182fv2xwfhfap-glibc-2.26-131/lib
/nix/store/8ky36i3m3km1qnc7q6dzknc3qpildz36-pandoc-2.0.6/bin
/nix/store/8ky36i3m3km1qnc7q6dzknc3qpildz36-pandoc-2.0.6/lib/ghc-8.2.2/pandoc-2.0.6
/nix/store/8ky36i3m3km1qnc7q6dzknc3qpildz36-pandoc-2.0.6/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/ji0q149fqqafrriz1cc0p8yyq19qc226-pandoc-2.0.6-data/share/ghc-8.2.2/x86_64-linux-ghc-8.2.2/pandoc-2.0.6
/nix/store/8ky36i3m3km1qnc7q6dzknc3qpildz36-pandoc-2.0.6/libexec/x86_64-linux-ghc-8.2.2/pandoc-2.0.6
/nix/store/8ky36i3m3km1qnc7q6dzknc3qpildz36-pandoc-2.0.6/etc
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/bin
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/lib/ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/share/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/libexec/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/etc
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/bin
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/lib/ghc-8.2.2/HTTP-4000.3.9
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/share/x86_64-linux-ghc-8.2.2/HTTP-4000.3.9
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/libexec/x86_64-linux-ghc-8.2.2/HTTP-4000.3.9
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/etc

Here's the output after removing each reference to Paths_pandoc from the pandoc code:

○ → strings /nix/store/89q50sngrhfbkzgyyzry0h7ahf4rk115-pandoc-2.0.6/bin/pandoc | grep /nix/store
/nix/store/2kcrj1ksd2a14bm5sky182fv2xwfhfap-glibc-2.26-131/lib/ld-linux-x86-64.so.2
/nix/store/8qfd8gx0j3yzamkrbrfz5kc00h4cqd1q-gmp-6.1.2/lib:/nix/store/xjbsf7x6a9r64sxnzhd6h1f7kjs0c9vl-lua-5.3.4/lib:/nix/store/r43dk927l97n78rff7hnvsq49f3szkg6-zlib-1.2.11/lib:/nix/store/2kcrj1ksd2a14bm5sky182fv2xwfhfap-glibc-2.26-131/lib
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/bin
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/lib/ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/share/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/libexec/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/etc
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/bin
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/lib/ghc-8.2.2/HTTP-4000.3.9
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/share/x86_64-linux-ghc-8.2.2/HTTP-4000.3.9
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/libexec/x86_64-linux-ghc-8.2.2/HTTP-4000.3.9
/nix/store/gx2cn0ybjsmxpjaqqip4a61yv94b1j4b-HTTP-4000.3.9/etc

And here's the output after removing every reference to Paths_HTTP from the HTTP module pandoc depends on:

○ → strings /nix/store/644sic3knjljhancwypscsqjbw4kib1w-pandoc-2.0.6/bin/pandoc | grep /nix/store
/nix/store/2kcrj1ksd2a14bm5sky182fv2xwfhfap-glibc-2.26-131/lib/ld-linux-x86-64.so.2
/nix/store/8qfd8gx0j3yzamkrbrfz5kc00h4cqd1q-gmp-6.1.2/lib:/nix/store/xjbsf7x6a9r64sxnzhd6h1f7kjs0c9vl-lua-5.3.4/lib:/nix/store/r43dk927l97n78rff7hnvsq49f3szkg6-zlib-1.2.11/lib:/nix/store/2kcrj1ksd2a14bm5sky182fv2xwfhfap-glibc-2.26-131/lib
/nix/store/644sic3knjljhancwypscsqjbw4kib1w-pandoc-2.0.6/bin
/nix/store/644sic3knjljhancwypscsqjbw4kib1w-pandoc-2.0.6/lib/ghc-8.2.2/pandoc-2.0.6
/nix/store/644sic3knjljhancwypscsqjbw4kib1w-pandoc-2.0.6/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/9x1fpvb7zmlfhs02xsznglnskqq4zy1d-pandoc-2.0.6-data/share/ghc-8.2.2/x86_64-linux-ghc-8.2.2/pandoc-2.0.6
/nix/store/644sic3knjljhancwypscsqjbw4kib1w-pandoc-2.0.6/libexec/x86_64-linux-ghc-8.2.2/pandoc-2.0.6
/nix/store/644sic3knjljhancwypscsqjbw4kib1w-pandoc-2.0.6/etc
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/bin
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/lib/ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/lib/ghc-8.2.2/x86_64-linux-ghc-8.2.2
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/share/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/libexec/x86_64-linux-ghc-8.2.2/pandoc-types-1.17.3.1
/nix/store/wkqhd4ax8hd9lrccmlijw02wagz6zcrl-pandoc-types-1.17.3.1/etc
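
Comparing the three listings by eye is error-prone. A minimal sketch of diffing the reference sets of two builds, assuming GNU grep and coreutils; `refs` and `refs_removed` are hypothetical helper names introduced only for this illustration:

```shell
# Unique store paths embedded in a file (binary-safe scan).
refs() {
  LC_ALL=C grep -aoE '/nix/store/[0-9a-z]{32}-[-+._?=0-9A-Za-z]+' "$1" \
    | LC_ALL=C sort -u
}

# Print store paths referenced by the first file but not by the second.
refs_removed() {
  list_a=$(mktemp)
  list_b=$(mktemp)
  refs "$1" > "$list_a"
  refs "$2" > "$list_b"
  # comm -23 keeps only lines unique to the first (sorted) list.
  LC_ALL=C comm -23 "$list_a" "$list_b"
  rm -f "$list_a" "$list_b"
}
```

Running `refs_removed old/bin/pandoc new/bin/pandoc` on an unpatched and a patched build should list exactly the packages whose Paths_ references went away. Because each rebuild of pandoc gets a new store hash, pandoc's own path will also show up in the difference.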

I don't understand that explanation. Why does ghc have to put any of those references into the binary? And why did it not have to do that before 2017-12-20?

Wasn't it the day when haskellPackages switched from 8.0.2 to 8.2.2? Seems so https://github.com/NixOS/nixpkgs/commit/d5676b04331efa730aa86c6a3c6cc0769b44a1e1

@peti it is not ghc, but generated haskell modules and some code in the executable is using this in pandoc.

@peti I think it's because HTTP and pandoc-types use their generated Paths_ module at runtime, so GHC has to embed these paths into the static binary as well. What I don't know is why apparently all of these paths are embedded, even though e.g. HTTP only uses the version function.

I get the feeling this sounds like a Cabal bug.

@Profpatsch does Haskell perform link-time optimization by default? Otherwise it might not detect that methods are unused. I don't know about the intermediate object files that the Haskell compiler generates, but judging by the binary size of Haskell programs it does not seem so.

Okay, I had a chat with @hvr (the Cabal maintainer):

  • When you have a string literal (foo = "abc") that is exported from a module, GHC directly embeds it into the interface file, which could at most be removed when linking (but apparently isn't in our case)
  • Apart from generating the Paths_ files, Cabal doesn't give GHC special instructions
  • strip doesn't work very well with cabal-generated binaries

So our problem mostly consists of packages that use version and don't need bindir, datadir, libdir etc. But cabal always generates the whole module and causes the store paths to be embedded into our results.

My proposal is to patch the nixpkgs Cabal so that one can provide a list of symbols to be exported by a package's Paths_ file, causing the generated Paths_ to contain only the mentioned symbols.
In nixpkgs we'd default to only giving version, causing build failures for packages that actually need the path symbols. Then we can fix them one by one (preferring patching out the paths if possible).

Hi, I am very interested if this problem could be solved. I'm packaging a hakyll-based binary, for generating a website, with dockerTools, and since it relies on pandoc… Well:

  • StaticHaskell binary : 80MB
  • Base layer with dockerTools : 700MB
  • Uncompressed image after import in docker : 4.7GB (fits a DVD!)

Some more information on the binary's dynamic deps:

$ ldd ./result/bin/site
    linux-vdso.so.1 (0x00007ffeb4fc2000)
    libm.so.6 => /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib/libm.so.6 (0x00007f1eb6862000)
    liblua.so.5.3 => /nix/store/v2a5c2rijnh2x9nii5y307nmp2ah0h0r-lua-5.3.4/lib/liblua.so.5.3 (0x00007f1eb662a000)
    libz.so.1 => /nix/store/gf00m2nz8079di7ihc6fj75v5jbh8p8v-zlib-1.2.11/lib/libz.so.1 (0x00007f1eb640e000)
    librt.so.1 => /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib/librt.so.1 (0x00007f1eb6206000)
    libutil.so.1 => /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib/libutil.so.1 (0x00007f1eb6003000)
    libdl.so.2 => /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib/libdl.so.2 (0x00007f1eb5dff000)
    libpthread.so.0 => /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib/libpthread.so.0 (0x00007f1eb5be0000)
    libgmp.so.10 => /nix/store/14nlrnm6bv7z25c9ismlhc2mrxsriyq2-gmp-6.1.2/lib/libgmp.so.10 (0x00007f1eb594c000)
    libc.so.6 => /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib/libc.so.6 (0x00007f1eb5598000)
    /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib/ld-linux-x86-64.so.2 => /nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib64/ld-linux-x86-64.so.2 (0x00007f1eb6bf7000)
$ nix run nixpkgs.patchelf -c patchelf --print-rpath ./result/bin/site
/nix/store/14nlrnm6bv7z25c9ismlhc2mrxsriyq2-gmp-6.1.2/lib:/nix/store/v2a5c2rijnh2x9nii5y307nmp2ah0h0r-lua-5.3.4/lib:/nix/store/gf00m2nz8079di7ihc6fj75v5jbh8p8v-zlib-1.2.11/lib:/nix/store/pr73kx0cdszbv9iw49g8dzi0nqxfjbx2-glibc-2.27/lib

strings shows references to pandoc and pandoc-types.

@MartinPotier there are two solutions, one for this specific use case and one general.

  • This specific use case: Look at the output of why-depends and patch out all references to the Paths_ modules in the respective transitive dependencies
  • General: Patch Cabal so that you can give Cabal a list of symbols it should put into the generated Paths_ module.
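
As an illustration of what "patching out" a reference can mean at the binary level, here is a minimal sketch that overwrites the hash part of one store path with the letter e, which does not occur in Nix's base32 alphabet. It assumes GNU sed; `scrub_ref` is a hypothetical name. nixpkgs ships a `remove-references-to` tool for this purpose, and this sketch only mimics the general idea, not its actual implementation. It is only safe when the program never opens the scrubbed path at runtime.

```shell
# Overwrite the hash of every embedded reference to one store path.
# The replacement has the same length, so offsets in the binary stay valid.
scrub_ref() {
  target=$1                          # /nix/store/<hash>-<name>
  file=$2
  base=${target##*/}                 # <hash>-<name>
  hash=${base%%-*}                   # <hash>
  fake=$(printf '%032d' 0 | tr 0 e)  # 32 times the letter e
  LC_ALL=C sed -i "s|/nix/store/$hash-|/nix/store/$fake-|g" "$file"
}
```

After scrubbing, Nix's reference scan no longer finds the original hash, so the path drops out of the closure.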

https://github.com/NixOS/nixpkgs/pull/32629 has the potential to solve this, but needs some investigation for pandoc.

@domenkozar I saw that #58525 (the successor to #32629) landed, is that enough to make progress on this issue? Thanks!

Sadly no. As @Profpatsch mentioned, for some reason dead code elimination doesn't do its job here; it needs to be solved in Cabal.

There are a few potential paths here:

Hokay. In case it helps anyone else, I've been just downloading the static builds provided by pandoc, for example:

stdenv.mkDerivation {
  name = "pandoc";
  propagatedBuildInputs = [ texlive.combined.scheme-basic ];
  # fetchTarball unpacks the release archive; the hash covers the unpacked tree
  src = fetchTarball {
    url = "https://github.com/jgm/pandoc/releases/download/2.5/pandoc-2.5-linux.tar.gz";
    sha256 = "1zi7d14cjpnf9xf7s71l0q26aw8xdnshciyx9ygl1as04g1yykr8";
  };
  # nothing to compile: the tarball already contains a prebuilt binary
  buildPhase = "true";
  installPhase = ''
    mkdir -p $out/bin
    cp bin/pandoc $out/bin
  '';
}

@thomasjm you now need dontStrip = true, otherwise it segfaults in newer versions. I have a pandoc-bin in my NUR repository:

$ nix-shell -p nur.repos.mic92.pandoc-bin

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/bup-on-arm64-installs-haskell-etc-1-2gb/6705/8

I patched out the indirect dependency on GHC here: https://github.com/NixOS/nixpkgs/pull/86194. The runtime closure is down to 180MB.

Still, every haskellPackages library has a runtime dependency on the GHC closure, because the base packages are not split, which is a different issue.
And of course we should still teach cabal not to generate the datadir/… symbols if we pass a flag, which we should make the default for haskellPackages (unless a package really needs it, then it can enable it manually).

At least we sorted out pandoc.

There is going to be a regression should pandoc add another one of these, so it's at best solved temporarily. We don't test for closure size regressions and I don't know how we could in the first place.
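
One possible shape for such a check, offered only as a sketch: fail when a built file embeds a store path whose name matches an unwanted pattern. It assumes GNU grep; `check_no_refs` is a hypothetical name. It inspects a single file and does not follow transitive references, so for the case in this thread it would have to look for the Haskell libraries themselves (for example pandoc-types) rather than for ghc. Within a Nix build, the `disallowedReferences` and `disallowedRequisites` derivation attributes are the built-in mechanism for this kind of assertion.

```shell
# Succeed only if FILE embeds no store path matching PATTERN
# (an extended regular expression applied to the whole store path).
check_no_refs() {
  file=$1
  pattern=$2
  found=$(LC_ALL=C grep -aoE '/nix/store/[0-9a-z]{32}-[-+._?=0-9A-Za-z]+' "$file" \
    | LC_ALL=C sort -u | grep -E -- "$pattern")
  if [ -n "$found" ]; then
    printf 'unexpected references in %s:\n%s\n' "$file" "$found" >&2
    return 1
  fi
}
```

For example, `check_no_refs result/bin/pandoc '-(pandoc-types|HTTP)-[0-9]'` would return non-zero for the binaries shown earlier in this thread.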
