This looks like a regression since v1.25.0. I'm currently trying to find the commit that caused this.
72fd2ac9836dfce1d43d7f137291adf073d299ca seems to be at fault.
I used this ghci script in stack repl to bisect this:
import Data.Text.IO
text <- Data.Text.IO.readFile "/home/simon/src/cpkg/pkgs/pkg-set.dhall"
Right e = exprFromText "" text
resolved <- loadRelativeTo "/home/simon/src/cpkg/pkgs" e
Prelude.putStrLn "Done resolving"
import Data.Either
isRight (typeOf resolved)
Before 72fd2ac9836dfce1d43d7f137291adf073d299ca this takes 10–15s. At 72fd2ac9836dfce1d43d7f137291adf073d299ca it just doesn't finish.
@sjakobi: I can think of two options:
let representation when type-checkingI've been staring at https://github.com/dhall-lang/dhall-haskell/commit/72fd2ac9836dfce1d43d7f137291adf073d299ca for a while, but I don't see what went wrong.
Profiling reveals that about 75% of time is spent in shift and subst.
Reverting seems like the pragmatic next step right now. Preserving https://github.com/dhall-lang/dhall-haskell/commit/96921f03abc9af269be924f4e9cadec5d600fd06 might a bit tricky though.
@sjakobi: You can revert 96921f0 if necessary, too
The reversions are at https://github.com/dhall-lang/dhall-haskell/compare/sjakobi/revert-simple-let.
echo https://raw.githubusercontent.com/vmchale/cpkg/d6649462f3a64ed8feea5dae39e183d2062a3b70/pkgs/pkg-set.dhall | cabal run -w ghc-8.6.5 -- dhall:dhall type
now finishes in a reasonable time. Yet the script still doesn't. I'm confused. :/
@sjakobi: Maybe stack repl is using a stale version of the dhall package?
@Gabriel439 The difference is due to the way the file is accessed: Importing it works, but piping it to dhall is problematic.
To illustrate the issue more clearly:
On master, neither of the following commands finish when executed from cpkg's pkgs directory:
cat ./pkg-set.dhall | dhall type
echo ./pkg-set.dhall | dhall type
On the sjakobi/revert-simple-let branch
echo ./pkg-set.dhall | dhall type
succeeds quickly, but
cat ./pkg-set.dhall | dhall type
never finishes.
With v1.25.0, the difference can be observed, but is much less drastic:
echo ./pkg-set.dhall | dhall
takes about 8s on my machine.
cat ./pkg-set.dhall | dhall
takes about 14s.
@sjakobi: So I know why importing the file by path rather than by contents is faster. This is because of the semi-semantic caching support that @EggBaconAndSpam added. What happens is that the first time you import ./pkgs-set.dhall as a local path then the fully resolved expression is automatically cached under ~/.dhall-haskell. Then subsequent imports of the same path go more quickly because they are cache hits
@sjakobi: Oh, I think I know what the problem is. I'm willing to wager that it's due to removing this optimization as part of implementing dependent type support:
https://github.com/dhall-lang/dhall-haskell/pull/1164/files#diff-6b5593d90a4088272a0ea8057015520dL174
Ah, that makes sense. My whole investigation was bogus, as the script was still accessing the semantic cache! :/
Do you have an idea how the optimization could be brought back?
@sjakobi: I don't think the optimization can be brought back. For example, none of the tests in the Prelude would type-check if we reinstated that optimization. The only way we can fix this for real is to finish the work described in https://github.com/dhall-lang/dhall-haskell/issues/1129
I think we should just cut the release, communicate that there will be a performance regression for giant multi-let expressions and explain in the CHANGELOG how users can work around it (like by splitting up the multi-let expression). Then I will just implement #1129 myself after the release is cut
Also, for what it's worth, I checked master against dhall-kubernetes (since it is one of the key drivers of adoption) and it's much faster than when I last checked (down from tens of seconds to 1-2 seconds on several examples).
So this seems to confirm that the problem doesn't appear to affect well-factored projects and that there is a work-around to mitigate performance issues.
@vmchale You probably want to know about this!
This is also biting dhall-to-cabal a bit - test speeds are about a factor of 5 to 10 times slower; this is unsurprising, since the fast path that had to go was put in as part of the effort to make dhall-to-cabal's performance merely bad as opposed to catastrophic. But, since the change to constructors being fields of the union rather than constructors creating a quadratically-sized record removed the huge expression causing the slowdown in the first place, it's not as bad as it was.
@quasicomputational: One thing that will help while I'm working on the more general solution in #1129 is if you could reproduce the slowdown with a minimal pathological example. The reason I ask is that such an example might lead to a short-term fix we can apply in the interim (and we could add it to the benchmark suites)
Alright, I think I understand the problem and fix much better now and wrote up my thoughts on the elaboration thread: https://github.com/dhall-lang/dhall-haskell/issues/1129#issuecomment-532928798
Fix is up here: https://github.com/dhall-lang/dhall-haskell/pull/1335
@quasicomputational: Check to see if that branch fixes the dhall-to-cabal performance regression you saw
Looks much better now!
Most helpful comment
@sjakobi: I don't think the optimization can be brought back. For example, none of the tests in the Prelude would type-check if we reinstated that optimization. The only way we can fix this for real is to finish the work described in https://github.com/dhall-lang/dhall-haskell/issues/1129
I think we should just cut the release, communicate that there will be a performance regression for giant multi-
letexpressions and explain in the CHANGELOG how users can work around it (like by splitting up the multi-letexpression). Then I will just implement #1129 myself after the release is cut