-split-sections is a ghc flag new in 8.2.1:
https://downloads.haskell.org/~ghc/8.2.1/docs/html/users_guide/phases.html?highlight=split-sections#ghc-flag--split-sections
Unfortunately it is a bit of a pain to use this with stack because it requires one to compile all of one's dependencies with -split-sections.
I have created a stack.yaml and build script for stack itself here:
https://github.com/duog/stack/tree/split-sections
which builds stack from an empty stack root both with and without -split-sections. If one examines the build.out file in that repository one observes that there is very little difference in build time, but the resulting executable is about half the size with -split-sections. It's not demonstrated here, but the .a library files are much bigger.
As I'm sure you're aware, it is a bit difficult to ensure all the libraries in one's snapshot have certain ghc-options. Is there anything that can be done to make this easier?
This sounds like the sort of thing where there isn't much downside. May make sense to enable -split-sections by default.
> As I'm sure you're aware, it is a bit difficult to ensure all the libraries in one's snapshot have certain ghc-options, is there anything that can be done to make this easier?
This is the thing that bugs me most about today's stack. Making this possible is actually quite an undertaking. https://github.com/commercialhaskell/stack/issues/1265 describes a solution, and part of what's described there has been implemented. Though that issue is closed, we still hope to do something about this at some point - https://github.com/commercialhaskell/stack/issues/3330 . It'd be great if someone wanted to take on this project.
I've no doubt that enabling -split-sections by default is the right long-term solution, but it's a new feature and you'd need a way to reliably turn it off.
As devil's advocate, the downsides I'm aware of are:
Though I've no doubt you've thought about it, it seems to me that the cabal-install new-* approach is the right one; a single package database with all the different flavours of packages. You could even have a per-snapshot database, though I don't see much upside to that. The current interface for specifying ghc-options in stack.yaml is great, it's just the interaction with already installed packages that causes me problems.
I haven't had much reason to use the cabal-install new-* stuff. If I understand it correctly, one consequence is that you can't really load dependencies into ghci without fully specifying which package ids ghci should use. With stack's approach, all of the packages that have been built for a particular snapshot / local db are available and consistent.
Once implemented, implicit snapshots will provide the benefits of both approaches. It may have a little bit of overhead in creating new package DBs, but that seems to be really quick. So, I think it is better to avoid the "everything in one DB" approach.
Yes that's a good point re ghci. I think I was assuming stack would provide me a sub-database with only the packages from my stack.yaml.
I admit that I don't really follow exactly what an implicit snapshot would be, but I suppose it's something close to the sub-database I was thinking of.
The primary benefit of a single DB is in maximizing sharing between projects; hopefully implicit snapshots will provide this.
> The primary benefit of a single DB is in maximizing sharing between projects, hopefully implicit snapshots will provide this.
Yup, this can also be provided by multiple DBs. Stack already has package sharing between snapshot DBs. If some other snapshot DB already has the package (and it has the same dependencies), then it will just get registered in the other DB rather than rebuilt.
What implicit snapshots add to this is the possibility of also sharing extra-deps, and possibly even git dependencies. More importantly, in my mind, they would allow full deterministic control over the options used to build all dependencies.
Seems to be default anyway since GHC 8.2.x on linux/darwin: https://ghc.haskell.org/trac/ghc/ticket/11445
Possibly relevant data. Binary sizes when compiling Aura with various resolvers "as-is":
lts-11.16 (ghc 8.2.2) -> 28.6mb

nightly-2018-07-04 (ghc 8.4.3) -> 25.7mb

Naively adding -split-sections to the ghc-options: section of my package.yaml doesn't seem to affect anything.
Running the strip tool over the executables also doesn't seem to have an effect.
I fell into this rabbit-hole of executable-size optimization today, and gathered some more data.
I tested three stack projects:
On every project I ran stack build, checked the size of the executables in .stack-work/dist/..., ran strip on them, and compared the size of each stripped executable to the one in .stack-work/install/... to make sure they matched (they did in all cases). I then added "$everything": -split-sections to the ghc-options: section of the stack.yaml file, adding the section if it wasn't there already, and repeated the same build and check.
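The size checks described above could be scripted roughly as follows. This is only a sketch: the helper names `size_bytes` and `stripped_size_bytes` are hypothetical, and the executable path is a placeholder for whatever stack build produced in your project.

```shell
#!/bin/sh
# Hypothetical helpers for comparing executable sizes; not part of stack.

# Print the size of a file in bytes.
size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Print the size a file would have after `strip`, working on a temporary
# copy so that the original executable is left untouched.
stripped_size_bytes() {
    tmp=$(mktemp) || return 1
    cp "$1" "$tmp" && strip "$tmp" && size_bytes "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (the path is a placeholder):
#   stack build
#   size_bytes path/to/executable
#   stripped_size_bytes path/to/executable
```

Running both helpers before and after adding the ghc-options entry gives the four measurements per executable that appear in the table.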
The result:
| executable | -split-sections | strip | size |
|------------|:-----------------:|:-------:|------:|
| hbfc | no | no | 20M |
| hbfc | no | yes | 11M |
| hbfc | yes | no | 6,6M |
| hbfc | yes | yes | 3,6M |
| | | | |
| hbfi | no | no | 3,3M |
| hbfi | no | yes | 903K |
| hbfi | yes | no | 3,3M |
| hbfi | yes | yes | 903K |
| | | | |
| stack | no | no | 99M |
| stack | no | yes | 65M |
| stack | yes | no | 51M |
| stack | yes | yes | 33M |
| | | | |
| aura | no | no | 44M |
| aura | no | yes | 28M |
| aura | yes | no | 14M |
| aura | yes | yes | 8,4M |
Adding -split-sections to everything seems to make a big difference. The one exception is the hbfi executable in hbfc. I assume the reason is that it only uses base and the hbfc library, that base is already built with -split-sections, and that the library in the same package either builds with -split-sections automatically, or GHC optimizes away the unused parts anyway.
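To put the table in relative terms, here is a small sketch that computes the percentage reduction between two sizes. The helper name `reduction_pct` is hypothetical, and the inputs are the unstripped sizes in megabytes taken from the table above.

```shell
#!/bin/sh
# Hypothetical helper: percentage by which `after` is smaller than `before`,
# rounded to a whole number.
reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.0f\n", (1 - after / before) * 100 }'
}

reduction_pct 99 51   # stack, unstripped, without vs. with -split-sections
reduction_pct 44 14   # aura, unstripped, without vs. with -split-sections
```

With these inputs, stack shrinks by roughly half and aura by roughly two thirds, which matches the impression the raw numbers give.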
Nice! See also: https://github.com/NixOS/nixpkgs/issues/43795
As a data point, enabling split sections reduced the size of the binary from 50MB to 14MB in one of my personal projects.
@Berengal I'm going to test that myself with Aura right now. If that's trivially possible, then we don't need to consider Nix at all.
@fosskers Yes, I simply added
    ghc-options:
      $everything: -split-sections

to my stack.yaml.
@sluukkonen Thanks, I'm going to try that myself.
Yup, 8.8mb! I'm going to go ahead with this approach and make all my releases this way. Thank you!
How about a stack build --release flag that forces $everything: -split-sections?
I'm not sure if that would be ideal.
Split sections requires all dependencies to be built with it, so having a separate flag would mean that dependencies would be compiled twice.
Perhaps stack could switch them on by default at some point, but I'm not sure what the tradeoffs are. Compile times would be slower, at the very least.
That would be the point of --release. Rust's cargo has such a feature, and it does indeed recompile everything.
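Until something like that exists, one way to approximate it is a second configuration file selected with stack's --stack-yaml option. The sketch below is an assumption-laden illustration, not a stack feature: the helper `make_release_yaml` and the file name stack-release.yaml are hypothetical, and it simply refuses to run if the source file already has a top-level ghc-options section, since blindly appending would duplicate the key.

```shell
#!/bin/sh
# Hypothetical helper: copy an existing stack.yaml and append a ghc-options
# section that applies -split-sections to every package.
# Assumption: the source file has no top-level ghc-options key; if it does,
# merge the entry by hand instead.
make_release_yaml() {
    src=$1
    dst=$2
    if grep -q '^ghc-options:' "$src"; then
        echo "error: $src already has a ghc-options section" >&2
        return 1
    fi
    cp "$src" "$dst" || return 1
    cat >> "$dst" <<'EOF'

ghc-options:
  "$everything": -split-sections
EOF
}

# Usage:
#   make_release_yaml stack.yaml stack-release.yaml
#   stack --stack-yaml stack-release.yaml build
```

As noted above, building this way means the dependencies are compiled again with the new options, so the first release build will be slow.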
Enabling global -split-sections like this breaks some packages with custom Setup (lens (fixed) and pretty-simple).
Windows builds of vulkan (and singletons (fixed)) fail with a "too many sections" error (the same builds work in cabal-install).