Dhall-haskell: Terrible performance when type-checking with a large enum's constructors

Created on 28 May 2018  路  19Comments  路  Source: dhall-lang/dhall-haskell

In https://github.com/quasicomputational/dhall-prelude-enum-test, I have a conceptually-small Dhall program: it has two enums, one small and one large (301 constructors), and a prelude for convenient access to the enums' constructors. main.dhall then uses that to create a list of values from the smaller enum.With 301 constructors, constructors BigEnum has a rough size of ~90,000, which is big but hopefully not prohibitive. This pattern is distilled from dhall-lang/dhall-to-cabal#42.

However, contrary to hope, this pattern makes Dhall's typechecking really, really slow:

       dhall-hash +RTS -p -RTS

    total time  =       47.01 secs   (47007 ticks @ 1000 us, 1 processor)
    total alloc = 28,577,665,440 bytes  (excludes profiling overheads)

COST CENTRE     MODULE                     SRC                                                 %time %alloc

toList          Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)   76.3   84.5
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)    8.6    1.3
normalizeWith   Dhall.Core                 src/Dhall/Core.hs:(1338,1)-(1688,22)                  4.8    5.2
denote          Dhall.Core                 src/Dhall/Core.hs:(1260,1)-(1320,39)                  2.7    2.7
normalize       Dhall.Core                 src/Dhall/Core.hs:1234:1-41                           2.7    4.1
typeWithA       Dhall.TypeCheck            src/Dhall/TypeCheck.hs:(97,1)-(724,49)                2.1    0.0
shift           Dhall.Core                 src/Dhall/Core.hs:(660,1)-(813,29)                    1.2    1.5

The problem isn't (just?) with toList from insert-ordered-containers. If I change the Data.HashMap.Strict.InsOrd.toList kts to Data.HashMap.Strict.toList (Data.HashMap.Strict.InsOrd.toHashMap kts) in the Union branch in Dhall.TypeCheck, this is the performance profile, with a significant win but still way too slow:

       dhall-hash +RTS -p -RTS

    total time  =       17.05 secs   (17054 ticks @ 1000 us, 1 processor)
    total alloc = 9,520,904,320 bytes  (excludes profiling overheads)

COST CENTRE     MODULE                     SRC                                                 %time %alloc

traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)   32.9   23.1
toHashMap       Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44           22.0   24.1
normalizeWith   Dhall.Core                 src/Dhall/Core.hs:(1338,1)-(1688,22)                 11.4   15.7
typeWithA       Dhall.TypeCheck            src/Dhall/TypeCheck.hs:(98,1)-(725,49)                8.5    4.8
denote          Dhall.Core                 src/Dhall/Core.hs:(1260,1)-(1320,39)                  6.4    8.0
normalize       Dhall.Core                 src/Dhall/Core.hs:1234:1-41                           5.8   12.4
toList          Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)    4.2    0.4
new_            Data.HashMap.Array         Data/HashMap/Array.hs:177:1-28                        3.2    5.9
shift           Dhall.Core                 src/Dhall/Core.hs:(660,1)-(813,29)                    3.0    4.5
getInput        Text.Megaparsec            Text/Megaparsec.hs:404:1-40                           2.1    0.9

Digging deeper into the profile, it looks like things are getting typechecked and typechecked and then typechecked again. Here's the hot part of the profile from the patched version:

typeWithA         Dhall.TypeCheck             src/Dhall/TypeCheck.hs:(98,1)-(725,49)              1827          2    8.5    4.8    97.8   99.0
 normalize        Dhall.Core                  src/Dhall/Core.hs:1234:1-41                         1834   11744589    1.1    2.5    12.9   21.0
  normalizeWith   Dhall.Core                  src/Dhall/Core.hs:(1338,1)-(1688,22)                1835   11744589    5.9    9.6    11.7   18.5
   denote         Dhall.Core                  src/Dhall/Core.hs:(1260,1)-(1320,39)                1836   17511510    5.3    7.9     5.5    8.4
    new_          Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1837    1788859    0.3    0.5     0.3    0.5
   new_           Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1838    1788859    0.2    0.5     0.2    0.5
 shift            Dhall.Core                  src/Dhall/Core.hs:(660,1)-(813,29)                  1875    2837894    2.9    4.5     3.0    4.7
  new_            Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1912     876525    0.1    0.2     0.1    0.2
 toHashMap        Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:599:1-44          1832      39004    4.6    4.8     5.1    5.7
  new_            Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1833    3631129    0.6    0.9     0.6    0.9
 subst            Dhall.Core                  src/Dhall/Core.hs:(820,1)-(960,29)                  1911        586    0.0    0.0     0.0    0.0
 traverseWithKey  Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)  1841        291   32.9   23.1    67.4   62.6
  normalize       Dhall.Core                  src/Dhall/Core.hs:1234:1-41                         1846   47233424    4.6    9.9    11.2   16.0
   normalizeWith  Dhall.Core                  src/Dhall/Core.hs:(1338,1)-(1688,22)                1847   47233424    5.5    6.0     6.6    6.1
    denote        Dhall.Core                  src/Dhall/Core.hs:(1260,1)-(1320,39)                1848   47234348    1.1    0.1     1.1    0.1
     new_         Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1849        284    0.0    0.0     0.0    0.0
    new_          Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1850        284    0.0    0.0     0.0    0.0
  toHashMap       Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:599:1-44          1843     156674   17.5   19.3    19.5   23.1
   new_           Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1844   14628125    2.0    3.8     2.0    3.8
  insert          Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:317:1-25          1896      38880    0.0    0.0     0.3    0.1
  insert          Dhall.Context               src/Dhall/Context.hs:40:1-49                        1894      38880    0.0    0.0     0.0    0.0
  lookup          Dhall.Context               src/Dhall/Context.hs:(61,1)-(68,33)                 1893      38880    0.1    0.0     0.1    0.0
  lookup          Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:302:1-59          1889      38880    0.0    0.0     0.1    0.0
   lookup         Data.HashMap.Base           Data/HashMap/Base.hs:(434,1)-(448,29)               1890      38880    0.1    0.0     0.1    0.0
  shift           Dhall.Core                  src/Dhall/Core.hs:(660,1)-(813,29)                  1895      38880    0.0    0.0     0.0    0.0
  new_            Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1853      12162    0.0    0.0     0.0    0.0
  mapWithKey      Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:(411,1)-(414,30)  1851          5    0.0    0.0     0.0    0.0
   new_           Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1852        284    0.0    0.0     0.0    0.0
  toList          Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)  1845          0    3.2    0.2     3.2    0.2
  toList          Data.HashMap.Array          Data/HashMap/Array.hs:397:1-21                      1842          0    0.0    0.0     0.0    0.0
 lookup           Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:302:1-59          1862        125    0.0    0.0     0.0    0.0
  lookup          Data.HashMap.Base           Data/HashMap/Base.hs:(434,1)-(448,29)               1870        125    0.0    0.0     0.0    0.0
 insert           Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:317:1-25          1876         32    0.0    0.0     0.0    0.0
 insert           Dhall.Context               src/Dhall/Context.hs:40:1-49                        1874         32    0.0    0.0     0.0    0.0
 lookup           Dhall.Context               src/Dhall/Context.hs:(61,1)-(68,33)                 1873         32    0.0    0.0     0.0    0.0
 judgmentallyEqualDhall.Core                  src/Dhall/Core.hs:(1694,1)-(1697,51)                1914         30    0.0    0.0     0.0    0.0
  alphaNormalize  Dhall.Core                  src/Dhall/Core.hs:(966,1)-(1222,11)                 1918        360    0.0    0.0     0.0    0.0
   shift          Dhall.Core                  src/Dhall/Core.hs:(660,1)-(813,29)                  1923        240    0.0    0.0     0.0    0.0
    new_          Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1927         60    0.0    0.0     0.0    0.0
   subst          Dhall.Core                  src/Dhall/Core.hs:(820,1)-(960,29)                  1922        240    0.0    0.0     0.0    0.0
    new_          Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1926         60    0.0    0.0     0.0    0.0
   new_           Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1928         60    0.0    0.0     0.0    0.0
  normalize       Dhall.Core                  src/Dhall/Core.hs:1234:1-41                         1915         60    0.0    0.0     0.0    0.0
   normalizeWith  Dhall.Core                  src/Dhall/Core.hs:(1338,1)-(1688,22)                1916         60    0.0    0.0     0.0    0.0
    denote        Dhall.Core                  src/Dhall/Core.hs:(1260,1)-(1320,39)                1917        360    0.0    0.0     0.0    0.0
     new_         Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1924         60    0.0    0.0     0.0    0.0
    new_          Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1925         60    0.0    0.0     0.0    0.0
 mapWithKey       Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:(411,1)-(414,30)  1839          1    0.0    0.0     0.0    0.0
  new_            Data.HashMap.Array          Data/HashMap/Array.hs:177:1-28                      1840         94    0.0    0.0     0.0    0.0
 toList           Data.HashMap.Strict.InsOrd  src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)  1828          0    0.9    0.2     0.9    0.2

Those numbers are really big! 11 million normalisations? 1.7 million variable shifts? That's too many; I'm sure some redundant work is going on here.

Inspecting Dhall.TypeCheck, I can see a possible reason for this: each time a variable is looked up from the environment, it gets re-typechecked, and each time a field is accessed on a record, the whole record is typechecked. So each of those 40 prelude.types.Little.Foo references is actually re-typechecking the large (301 fields, each for a 301-constructor type) record prelude.types.Big 3 times!

I think this can be improved. Let-bindings already typecheck the bound value; this can be stored in the Context, and then use sites of the variable won't need to re-typecheck. That should be easy: it just means that typeWithA needs to change the Context it accepts, and to establish the invariant that the types should match the values in it. If records could store their fields' types at construction time, they wouldn't have to re-typecheck at use time; that's harder to do, and will require some kind of variant AST.

performance

Most helpful comment

On the nbe-elaboration branch:

[nix-shell:/home/ollie/work/dhall-haskell/dhall]# time ./dist/build/dhall/dhall <<< ./dhall-prelude-enum-test/main.dhall 
[ < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
]

real    0m0.015s
user    0m0.011s
sys     0m0.004s

All 19 comments

Have you considered just slapping explicit memoization over type-checking, and seeing how that performs? Obviously it's not the best solution, but it's a good way to work out if repeated type checking is the problem.

I considered it, but I'm not sure how well it would work, since it needs to be keyed off the environment of bound variables as well. Probably worth trying!

@quasicomputational: If you can ensure that every variable inserted into a Context has already been type-checked then you can eliminate the type-check upon variable lookup (i.e. delete this line: https://github.com/dhall-lang/dhall-haskell/blob/master/src/Dhall/TypeCheck.hs#L105)

@quasicomputational: Actually, just go ahead and profile with the type-check upon variable lookup removed just so we can verify whether that is responsible for the slowdown

Another thing to note is that we can always as a last resort use a more optimal implementation under the hood for unions that are just enums (i.e. store them as just Naturals for efficiency). That way you only pay the price for them once at parse time.

Huh. This is weird. Here's the profile after I removed that check:

       dhall-hash +RTS -p -RTS

    total time  =       42.44 secs   (42436 ticks @ 1000 us, 1 processor)
    total alloc = 28,577,671,512 bytes  (excludes profiling overheads)

COST CENTRE     MODULE                     SRC                                                 %time %alloc

toList          Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)   76.0   84.5
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)    8.6    1.3
normalizeWith   Dhall.Core                 src/Dhall/Core.hs:(1326,1)-(1676,22)                  4.8    5.2
denote          Dhall.Core                 src/Dhall/Core.hs:(1248,1)-(1308,39)                  3.0    2.7
normalize       Dhall.Core                 src/Dhall/Core.hs:1222:1-41                           2.7    4.1
typeWithA       Dhall.TypeCheck            src/Dhall/TypeCheck.hs:(97,1)-(723,49)                2.0    0.0
shift           Dhall.Core                 src/Dhall/Core.hs:(648,1)-(801,29)                    1.4    1.5

That's without the toList adjustment, so compare it to the first profile above.

It's an improvement, but not as much of an improvement as I was hoping: I was expecting more like a 33% gain, since 1/3 of the per-use re-typechecks have been eliminated, but that's only about 9%. Maybe something else is going on here and I don't really understand it.

Some more thoughts on having peered more closely at this:

  • Memoisation will require a change of AST anyway to allow for fast comparison - something like storing a hash with each constructor summarising its children. But looking up the big things would still be quite slow, since they'd still have to do a full comparison.

  • Special-casing enums might not be enough. dhall-to-cabal has Extension, which I'm pretty sure is big enough to already be causing performance issues, though less dramatic ones since it's only ~120 constructors.

  • I was wrong in the initial comment: the types are already stored in the Context, so the Var case isn't actually the problem, it seems. I'm still trying to figure out exactly what the problem is, though.

I think I've got it: the Let case substitutes in the value of the bound variable directly, rather than going through the Context. But that's a pretty massive pessimisation if the value being substituted in is large! This was a deliberate change in #69, but I think we can fix it: if we check if it's a type or a term, we can stuff terms into the Context and only do the substitution for types to get the nice dependent behaviour. I'll try and work up a patch for this.

It seems like there might still be some performance issues lurking, even with #420. Here's the hot part of the profile from running dhall-to-cabal on its own dhall-to-cabal.dhall file, with a few boring lines removed:

typeWithA                               Dhall.TypeCheck                              src/Dhall/TypeCheck.hs:(98,1)-(749,49)                           2286          1    1.3    0.5    55.7   68.3
 normalize                              Dhall.Core                                   src/Dhall/Core.hs:1222:1-41                                      2358    1845053    0.2    0.5     2.3    3.6
  normalizeWith                         Dhall.Core                                   src/Dhall/Core.hs:(1326,1)-(1676,22)                             2359    1845053    1.0    1.7     2.1    3.2
   denote                               Dhall.Core                                   src/Dhall/Core.hs:(1248,1)-(1308,39)                             2360    2861138    0.9    1.3     1.0    1.4
    new_                                Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3111     325862    0.1    0.1     0.1    0.1
   new_                                 Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3112     325862    0.0    0.1     0.0    0.1
   lookup                               Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:302:1-59                       3118          2    0.0    0.0     0.0    0.0
 shift                                  Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               2500    1464940    1.3    2.1     1.4    2.2
  new_                                  Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3113     424866    0.1    0.1     0.1    0.1
 lookup                                 Dhall.Context                                src/Dhall/Context.hs:(61,1)-(68,33)                              2495     339775    0.1    0.0     0.1    0.0
 subst                                  Dhall.Core                                   src/Dhall/Core.hs:(808,1)-(948,29)                               2548     158736    0.1    0.1     0.1    0.2
  shift                                 Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               3115     110635    0.0    0.1     0.0    0.1
   new_                                 Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3116      20069    0.0    0.0     0.0    0.0
  new_                                  Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3117      22514    0.0    0.0     0.0    0.0
 insert                                 Dhall.Context                                src/Dhall/Context.hs:40:1-49                                     2496      75635    0.0    0.0     0.0    0.0
 judgmentallyEqual                      Dhall.Core                                   src/Dhall/Core.hs:(1682,1)-(1685,51)                             2543      19504    0.0    0.0     0.1    0.2
  alphaNormalize                        Dhall.Core                                   src/Dhall/Core.hs:(954,1)-(1210,11)                              2547      54964    0.0    0.0     0.1    0.1
   shift                                Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               3124      73666    0.0    0.1     0.0    0.1
    new_                                Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3149       2718    0.0    0.0     0.0    0.0
   subst                                Dhall.Core                                   src/Dhall/Core.hs:(808,1)-(948,29)                               3123      73068    0.0    0.1     0.0    0.1
    shift                               Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               3125       2958    0.0    0.0     0.0    0.0
    new_                                Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3148       2718    0.0    0.0     0.0    0.0
   new_                                 Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3128       2548    0.0    0.0     0.0    0.0
  normalize                             Dhall.Core                                   src/Dhall/Core.hs:1222:1-41                                      2544      39008    0.0    0.0     0.0    0.0
   normalizeWith                        Dhall.Core                                   src/Dhall/Core.hs:(1326,1)-(1676,22)                             2545      39008    0.0    0.0     0.0    0.0
    denote                              Dhall.Core                                   src/Dhall/Core.hs:(1248,1)-(1308,39)                             2546      54964    0.0    0.0     0.0    0.0
     new_                               Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3126       2548    0.0    0.0     0.0    0.0
    new_                                Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3127       2548    0.0    0.0     0.0    0.0
 toHashMap                              Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:599:1-44                       2415       8717    0.6    0.8     0.6    1.0
 traverseWithKey                        Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)               3047         90   14.2    6.8    49.5   60.4
  normalize                             Dhall.Core                                   src/Dhall/Core.hs:1222:1-41                                      3057   19572568    2.4    4.9    24.1   40.6
  lookup                                Dhall.Context                                src/Dhall/Context.hs:(61,1)-(68,33)                              3054    4373775    1.0    0.1     1.0    0.1
  shift                                 Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               3056    1262834    0.2    0.2     0.2    0.2
   new_                                 Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3086      27395    0.0    0.0     0.0    0.0
  insert                                Dhall.Context                                src/Dhall/Context.hs:40:1-49                                     3055     959891    0.0    0.0     0.0    0.0
  subst                                 Dhall.Core                                   src/Dhall/Core.hs:(808,1)-(948,29)                               3083     218956    0.0    0.0     0.0    0.0
   new_                                 Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3110       5434    0.0    0.0     0.0    0.0
   shift                                Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               3093       3358    0.0    0.0     0.0    0.0
  judgmentallyEqual                     Dhall.Core                                   src/Dhall/Core.hs:(1682,1)-(1685,51)                             3078     182106    0.2    0.3     0.9    1.0
   alphaNormalize                       Dhall.Core                                   src/Dhall/Core.hs:(954,1)-(1210,11)                              3082     428650    0.1    0.0     0.4    0.5
    shift                               Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               3091     346610    0.1    0.3     0.1    0.3
     new_                               Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3109       1500    0.0    0.0     0.0    0.0
    subst                               Dhall.Core                                   src/Dhall/Core.hs:(808,1)-(948,29)                               3090     343070    0.2    0.2     0.2    0.2
     shift                              Dhall.Core                                   src/Dhall/Core.hs:(648,1)-(801,29)                               3092      17070    0.0    0.0     0.0    0.0
     new_                               Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3108       1500    0.0    0.0     0.0    0.0
    new_                                Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3089       7356    0.0    0.0     0.0    0.0
   normalize                            Dhall.Core                                   src/Dhall/Core.hs:1222:1-41                                      3079     364212    0.1    0.1     0.2    0.2
  toHashMap                             Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:599:1-44                       3073      89863    6.4    8.8     7.2   10.6
  lookup                                Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:302:1-59                       3050       1372    0.0    0.0     0.0    0.0
  insert                                Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:317:1-25                       3060       1125    0.0    0.0     0.0    0.0
  new_                                  Data.HashMap.Array                           Data/HashMap/Array.hs:177:1-28                                   3075        861    0.0    0.0     0.0    0.0
  union                                 Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:385:1-23                       3129          7    0.0    0.0     0.0    0.0
  toList                                Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)               3049          0    1.8    1.1     1.8    1.1
  toList                                Data.HashMap.Array                           Data/HashMap/Array.hs:397:1-21                                   3048          0    0.0    0.0     0.0    0.0
 lookup                                 Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:302:1-59                       3023         12    0.0    0.0     0.0    0.0
 insert                                 Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:317:1-25                       3034          1    0.0    0.0     0.0    0.0
 union                                  Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:385:1-23                       3144          1    0.0    0.0     0.0    0.0
 toList                                 Data.HashMap.Strict.InsOrd                   src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)               2295          0    0.2    0.1     0.2    0.1

normalise is still being called 20 million times! I can't imagine that all of that's necessary. There's still work to be done here, I guess.

@quasicomputational: I think one clue here is that it's not just anynormalize, but rather the normalize located here:

https://github.com/dhall-lang/dhall-haskell/blob/8bd410f8f5fe0bbef73388a70ea762b3477589aa/src/Dhall/TypeCheck.hs#L482

... which occurs as part of type-checking RecordLits. Specifically, it's the normalize called on the type of the type of each record field to verify whether all the record fields are terms or types.

That indicates that there might be some poor time complexity in type-checking record literals

@quasicomputational: Another thing which can have an effect on the constant-factors of the performance is that the current parser emits way too many nested Noted constructors (mainly due to how it parses operators). You can also try changing the noted function here to id which might clean up the profile:

https://github.com/dhall-lang/dhall-haskell/blob/8bd410f8f5fe0bbef73388a70ea762b3477589aa/src/Dhall/Parser.hs#L139

@quasicomputational: Ah, I think I can guess what is the problem. I think it's because my advice on https://github.com/dhall-lang/dhall-haskell/pull/420 was bad :)

Here is what I think is happening. In the following expression:

    let x = constructors BigEnum

in e

... when e is type-checked the normalized right-hand-side of the let expression (i.e. constructors BigEnum) is added to the context, and normalizing constructors BigEnum generates the gigantic record. Type-checking that record of constructors every time we look up x from the context is leading to the huge number of normalize calls

The solution is to ignore my original advice on that pull request. Instead, you should not normalize the right-hand-side on the fast path of type-checking let expressions

That's a useful observation; I hadn't realised that normalize would turn constructors BigEnum into the large thing. Removing normalize calls where the returned expression isn't immediately inspected does improve the benchmark a bit. Before:

time                 966.7 ms   (931.6 ms .. 1.035 s)
                     0.999 R虏   (0.999 R虏 .. 1.000 R虏)
mean                 943.0 ms   (936.0 ms .. 955.7 ms)
std dev              12.51 ms   (565.0 渭s .. 15.44 ms)

After:

time                 887.8 ms   (766.7 ms .. NaN s)
                     0.998 R虏   (0.991 R虏 .. 1.000 R虏)
mean                 890.5 ms   (875.3 ms .. 907.6 ms)
std dev              20.13 ms   (7.584 ms .. 26.46 ms)

But this isn't borne out in the dhall-to-cabal test. Before:

    total time  =       13.00 secs   (12999 ticks @ 1000 us, 1 processor)
    total alloc = 7,754,179,628 bytes  (excludes profiling overheads)

COST CENTRE     MODULE                     SRC                                                 %time %alloc

getInput        Text.Megaparsec            Text/Megaparsec.hs:404:1-40                          20.9   17.5
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)   19.0    8.4
normalizeWith   Dhall.Core                 src/Dhall/Core.hs:(1325,1)-(1675,22)                 14.9   21.6
denote          Dhall.Core                 src/Dhall/Core.hs:(1247,1)-(1307,39)                 13.6   17.4
toHashMap       Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44            9.3   10.4
getPosition     Text.Megaparsec            Text/Megaparsec.hs:417:1-51                           5.1    5.6
normalize       Dhall.Core                 src/Dhall/Core.hs:1221:1-41                           3.4    5.9
toList          Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)    2.9    1.4
new_            Data.HashMap.Array         Data/HashMap/Array.hs:177:1-28                        2.8    4.7
shift           Dhall.Core                 src/Dhall/Core.hs:(647,1)-(800,29)                    2.7    3.7
typeWithA       Dhall.TypeCheck            src/Dhall/TypeCheck.hs:(102,1)-(758,49)               2.2    0.9

After:

    total time  =       13.13 secs   (13126 ticks @ 1000 us, 1 processor)
    total alloc = 7,714,118,368 bytes  (excludes profiling overheads)

COST CENTRE     MODULE                     SRC                                                 %time %alloc

getInput        Text.Megaparsec            Text/Megaparsec.hs:404:1-40                          21.2   17.6
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)   18.9    8.5
normalizeWith   Dhall.Core                 src/Dhall/Core.hs:(1325,1)-(1675,22)                 14.5   21.3
denote          Dhall.Core                 src/Dhall/Core.hs:(1247,1)-(1307,39)                 13.5   17.1
toHashMap       Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44            9.7   10.5
getPosition     Text.Megaparsec            Text/Megaparsec.hs:417:1-51                           5.2    5.6
normalize       Dhall.Core                 src/Dhall/Core.hs:1221:1-41                           3.2    5.9
new_            Data.HashMap.Array         Data/HashMap/Array.hs:177:1-28                        2.9    4.7
toList          Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)    2.7    1.4
shift           Dhall.Core                 src/Dhall/Core.hs:(647,1)-(800,29)                    2.7    3.8
typeWithA       Dhall.TypeCheck            src/Dhall/TypeCheck.hs:(102,1)-(757,49)               2.1    0.9
lookup          Dhall.Context              src/Dhall/Context.hs:(61,1)-(68,33)                   1.2    0.1

It's possible that the normalize on let-bound values is bad but the normalize on lambda-bound values is good, but this is where I start to worry about over-tuning for specific cases.

I think this makes sense: the Field case checks that the first operand is a record, and type-checking Record involves going into the types of all the fields, type-checking them, and normalising them to ensure types and terms aren't mixed. If we can figure out a way to do that once at the binding site and not at every use site, I think that's a potentially large win. (Also, I think this has changed due to #420, but I see the hot normalize call being in the Record case, not the RecordLit case; that would make sense as well, since there aren't any large RecordLits being substituted in any more.)

Here's a profile of dhall-to-cabal with noted stubbed out:

       dhall-to-cabal +RTS -p -RTS dhall-to-cabal.dhall

    total time  =       12.13 secs   (12133 ticks @ 1000 us, 1 processor)
    total alloc = 7,399,873,104 bytes  (excludes profiling overheads)

COST CENTRE     MODULE                     SRC                                                 %time %alloc

traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78)   22.3    8.5
denote          Dhall.Core                 src/Dhall/Core.hs:(1247,1)-(1307,39)                 17.2   18.2
normalizeWith   Dhall.Core                 src/Dhall/Core.hs:(1325,1)-(1675,22)                 15.9   22.6
expr            Dhall.Parser               src/Dhall/Parser.hs:1550:1-20                        14.5   20.7
toHashMap       Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44           10.9   10.9
toList          Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22)    3.9    1.4
normalize       Dhall.Core                 src/Dhall/Core.hs:1221:1-41                           3.3    6.2
new_            Data.HashMap.Array         Data/HashMap/Array.hs:177:1-28                        3.3    4.9
shift           Dhall.Core                 src/Dhall/Core.hs:(647,1)-(800,29)                    3.2    3.8
typeWithA       Dhall.TypeCheck            src/Dhall/TypeCheck.hs:(102,1)-(758,49)               2.6    1.0
lookup          Dhall.Context              src/Dhall/Context.hs:(61,1)-(68,33)                   1.3    0.1

And the benchmark:

time                 914.0 ms   (898.0 ms .. 926.6 ms)
                     1.000 R虏   (1.000 R虏 .. 1.000 R虏)
mean                 929.0 ms   (921.7 ms .. 938.0 ms)
std dev              9.101 ms   (3.645 ms .. 12.56 ms)
variance introduced by outliers: 19% (moderately inflated)

Would #693 fix this?

I think it will definitely help, but it's still possible to have large records and I think the performance is still going to be bad. It might be time for a re-evaluation and to see where we are; I think the original description as written is a bit outdated now...

We are working in replace constructor with the new direct union access, the performance of tests has improved significantly (50%-200%, and the real case dhall-to-cabal a 200%!). Both test in the same machine using dhall-1.19.0:

golden tests
  dhall-to-cabal
    SPDX:                     OK (5.18s)
    nested-conditions:        OK (4.54s)
    mixins-no-signatures:     OK (4.58s)
    map-source-repo:          OK (4.44s)
    gh-55:                    OK (4.55s)
    gh-53:                    OK (4.00s)
    empty-package:            OK (0.82s)
    dhall-to-cabal:           OK (5.35s)
    default-license-2.2:      OK (3.65s)
    conditional-dependencies: OK (5.05s)
    compiler-options-order:   OK (4.47s)
    allrightsreserved-2.2:    OK (3.42s)
    allrightsreserved-2.0:    OK (3.36s)
golden tests
  dhall-to-cabal
    SPDX:                     OK (9.33s)
    nested-conditions:        OK (9.22s)
    mixins-no-signatures:     OK (12.56s)
    map-source-repo:          OK (5.21s)
    gh-55:                    OK (6.30s)
    gh-53:                    OK (6.35s)
    empty-package:            OK (0.76s)
    dhall-to-cabal:           OK (16.17s)
    default-license-2.2:      OK (4.76s)
    conditional-dependencies: OK (8.27s)
    compiler-options-order:   OK (6.95s)
    allrightsreserved-2.2:    OK (5.17s)
    allrightsreserved-2.0:    OK (5.13s)

Thanks @FintanH! :100:

Nice! :') Thanks to @Gabriel439 for the great idea :smile:

On the nbe-elaboration branch:

[nix-shell:/home/ollie/work/dhall-haskell/dhall]# time ./dist/build/dhall/dhall <<< ./dhall-prelude-enum-test/main.dhall 
[ < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
]

real    0m0.015s
user    0m0.011s
sys     0m0.004s

I'll close this out as a duplicate of https://github.com/dhall-lang/dhall-haskell/issues/1039 since the only work left here is to complete the elaboration work linked in that issue

Was this page helpful?
0 / 5 - 0 ratings

Related issues

philandstuff picture philandstuff  路  18Comments

Michael-Kateregga picture Michael-Kateregga  路  26Comments

sjakobi picture sjakobi  路  18Comments

Michael-Kateregga picture Michael-Kateregga  路  22Comments

mmhat picture mmhat  路  23Comments