In https://github.com/quasicomputational/dhall-prelude-enum-test, I have a conceptually-small Dhall program: it has two enums, one small and one large (301 constructors), and a prelude for convenient access to the enums' constructors. main.dhall then uses that to create a list of values from the smaller enum.With 301 constructors, constructors BigEnum has a rough size of ~90,000, which is big but hopefully not prohibitive. This pattern is distilled from dhall-lang/dhall-to-cabal#42.
However, contrary to hope, this pattern makes Dhall's typechecking really, really slow:
dhall-hash +RTS -p -RTS
total time = 47.01 secs (47007 ticks @ 1000 us, 1 processor)
total alloc = 28,577,665,440 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 76.3 84.5
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 8.6 1.3
normalizeWith Dhall.Core src/Dhall/Core.hs:(1338,1)-(1688,22) 4.8 5.2
denote Dhall.Core src/Dhall/Core.hs:(1260,1)-(1320,39) 2.7 2.7
normalize Dhall.Core src/Dhall/Core.hs:1234:1-41 2.7 4.1
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(97,1)-(724,49) 2.1 0.0
shift Dhall.Core src/Dhall/Core.hs:(660,1)-(813,29) 1.2 1.5
The problem isn't (just?) with toList from insert-ordered-containers. If I change the Data.HashMap.Strict.InsOrd.toList kts to Data.HashMap.Strict.toList (Data.HashMap.Strict.InsOrd.toHashMap kts) in the Union branch in Dhall.TypeCheck, this is the performance profile, with a significant win but still way too slow:
dhall-hash +RTS -p -RTS
total time = 17.05 secs (17054 ticks @ 1000 us, 1 processor)
total alloc = 9,520,904,320 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 32.9 23.1
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 22.0 24.1
normalizeWith Dhall.Core src/Dhall/Core.hs:(1338,1)-(1688,22) 11.4 15.7
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(98,1)-(725,49) 8.5 4.8
denote Dhall.Core src/Dhall/Core.hs:(1260,1)-(1320,39) 6.4 8.0
normalize Dhall.Core src/Dhall/Core.hs:1234:1-41 5.8 12.4
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 4.2 0.4
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3.2 5.9
shift Dhall.Core src/Dhall/Core.hs:(660,1)-(813,29) 3.0 4.5
getInput Text.Megaparsec Text/Megaparsec.hs:404:1-40 2.1 0.9
Digging deeper into the profile, it looks like things are getting typechecked and typechecked and then typechecked again. Here's the hot part of the profile from the patched version:
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(98,1)-(725,49) 1827 2 8.5 4.8 97.8 99.0
normalize Dhall.Core src/Dhall/Core.hs:1234:1-41 1834 11744589 1.1 2.5 12.9 21.0
normalizeWith Dhall.Core src/Dhall/Core.hs:(1338,1)-(1688,22) 1835 11744589 5.9 9.6 11.7 18.5
denote Dhall.Core src/Dhall/Core.hs:(1260,1)-(1320,39) 1836 17511510 5.3 7.9 5.5 8.4
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1837 1788859 0.3 0.5 0.3 0.5
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1838 1788859 0.2 0.5 0.2 0.5
shift Dhall.Core src/Dhall/Core.hs:(660,1)-(813,29) 1875 2837894 2.9 4.5 3.0 4.7
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1912 876525 0.1 0.2 0.1 0.2
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 1832 39004 4.6 4.8 5.1 5.7
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1833 3631129 0.6 0.9 0.6 0.9
subst Dhall.Core src/Dhall/Core.hs:(820,1)-(960,29) 1911 586 0.0 0.0 0.0 0.0
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 1841 291 32.9 23.1 67.4 62.6
normalize Dhall.Core src/Dhall/Core.hs:1234:1-41 1846 47233424 4.6 9.9 11.2 16.0
normalizeWith Dhall.Core src/Dhall/Core.hs:(1338,1)-(1688,22) 1847 47233424 5.5 6.0 6.6 6.1
denote Dhall.Core src/Dhall/Core.hs:(1260,1)-(1320,39) 1848 47234348 1.1 0.1 1.1 0.1
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1849 284 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1850 284 0.0 0.0 0.0 0.0
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 1843 156674 17.5 19.3 19.5 23.1
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1844 14628125 2.0 3.8 2.0 3.8
insert Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:317:1-25 1896 38880 0.0 0.0 0.3 0.1
insert Dhall.Context src/Dhall/Context.hs:40:1-49 1894 38880 0.0 0.0 0.0 0.0
lookup Dhall.Context src/Dhall/Context.hs:(61,1)-(68,33) 1893 38880 0.1 0.0 0.1 0.0
lookup Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:302:1-59 1889 38880 0.0 0.0 0.1 0.0
lookup Data.HashMap.Base Data/HashMap/Base.hs:(434,1)-(448,29) 1890 38880 0.1 0.0 0.1 0.0
shift Dhall.Core src/Dhall/Core.hs:(660,1)-(813,29) 1895 38880 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1853 12162 0.0 0.0 0.0 0.0
mapWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(411,1)-(414,30) 1851 5 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1852 284 0.0 0.0 0.0 0.0
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 1845 0 3.2 0.2 3.2 0.2
toList Data.HashMap.Array Data/HashMap/Array.hs:397:1-21 1842 0 0.0 0.0 0.0 0.0
lookup Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:302:1-59 1862 125 0.0 0.0 0.0 0.0
lookup Data.HashMap.Base Data/HashMap/Base.hs:(434,1)-(448,29) 1870 125 0.0 0.0 0.0 0.0
insert Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:317:1-25 1876 32 0.0 0.0 0.0 0.0
insert Dhall.Context src/Dhall/Context.hs:40:1-49 1874 32 0.0 0.0 0.0 0.0
lookup Dhall.Context src/Dhall/Context.hs:(61,1)-(68,33) 1873 32 0.0 0.0 0.0 0.0
judgmentallyEqualDhall.Core src/Dhall/Core.hs:(1694,1)-(1697,51) 1914 30 0.0 0.0 0.0 0.0
alphaNormalize Dhall.Core src/Dhall/Core.hs:(966,1)-(1222,11) 1918 360 0.0 0.0 0.0 0.0
shift Dhall.Core src/Dhall/Core.hs:(660,1)-(813,29) 1923 240 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1927 60 0.0 0.0 0.0 0.0
subst Dhall.Core src/Dhall/Core.hs:(820,1)-(960,29) 1922 240 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1926 60 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1928 60 0.0 0.0 0.0 0.0
normalize Dhall.Core src/Dhall/Core.hs:1234:1-41 1915 60 0.0 0.0 0.0 0.0
normalizeWith Dhall.Core src/Dhall/Core.hs:(1338,1)-(1688,22) 1916 60 0.0 0.0 0.0 0.0
denote Dhall.Core src/Dhall/Core.hs:(1260,1)-(1320,39) 1917 360 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1924 60 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1925 60 0.0 0.0 0.0 0.0
mapWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(411,1)-(414,30) 1839 1 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 1840 94 0.0 0.0 0.0 0.0
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 1828 0 0.9 0.2 0.9 0.2
Those numbers are really big! 11 million normalisations? 1.7 million variable shifts? That's too many; I'm sure some redundant work is going on here.
Inspecting Dhall.TypeCheck, I can see a possible reason for this: each time a variable is looked up from the environment, it gets re-typechecked, and each time a field is accessed on a record, the whole record is typechecked. So each of those 40 prelude.types.Little.Foo references is actually re-typechecking the large (301 fields, each for a 301-constructor type) record prelude.types.Big 3 times!
I think this can be improved. Let-bindings already typecheck the bound value; this can be stored in the Context, and then use sites of the variable won't need to re-typecheck. That should be easy: it just means that typeWithA needs to change the Context it accepts, and to establish the invariant that the types should match the values in it. If records could store their fields' types at construction time, they wouldn't have to re-typecheck at use time; that's harder to do, and will require some kind of variant AST.
Have you considered just slapping explicit memoization over type-checking, and seeing how that performs? Obviously it's not the best solution, but it's a good way to work out if repeated type checking is the problem.
I considered it, but I'm not sure how well it would work, since it needs to be keyed off the environment of bound variables as well. Probably worth trying!
@quasicomputational: If you can ensure that every variable inserted into a Context has already been type-checked then you can eliminate the type-check upon variable lookup (i.e. delete this line: https://github.com/dhall-lang/dhall-haskell/blob/master/src/Dhall/TypeCheck.hs#L105)
@quasicomputational: Actually, just go ahead and profile with the type-check upon variable lookup removed just so we can verify whether that is responsible for the slowdown
Another thing to note is that we can always as a last resort use a more optimal implementation under the hood for unions that are just enums (i.e. store them as just Naturals for efficiency). That way you only pay the price for them once at parse time.
Huh. This is weird. Here's the profile after I removed that check:
dhall-hash +RTS -p -RTS
total time = 42.44 secs (42436 ticks @ 1000 us, 1 processor)
total alloc = 28,577,671,512 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 76.0 84.5
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 8.6 1.3
normalizeWith Dhall.Core src/Dhall/Core.hs:(1326,1)-(1676,22) 4.8 5.2
denote Dhall.Core src/Dhall/Core.hs:(1248,1)-(1308,39) 3.0 2.7
normalize Dhall.Core src/Dhall/Core.hs:1222:1-41 2.7 4.1
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(97,1)-(723,49) 2.0 0.0
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 1.4 1.5
That's without the toList adjustment, so compare it to the first profile above.
It's an improvement, but not as much of an improvement as I was hoping: I was expecting more like a 33% gain, since 1/3 of the per-use re-typechecks have been eliminated, but that's only about 9%. Maybe something else is going on here and I don't really understand it.
Some more thoughts on having peered more closely at this:
Memoisation will require a change of AST anyway to allow for fast comparison - something like storing a hash with each constructor summarising its children. But looking up the big things would still be quite slow, since they'd still have to do a full comparison.
Special-casing enums might not be enough. dhall-to-cabal has Extension, which I'm pretty sure is big enough to already be causing performance issues, though less dramatic ones since it's only ~120 constructors.
I was wrong in the initial comment: the types are already stored in the Context, so the Var case isn't actually the problem, it seems. I'm still trying to figure out exactly what the problem is, though.
I think I've got it: the Let case substitutes in the value of the bound variable directly, rather than going through the Context. But that's a pretty massive pessimisation if the value being substituted in is large! This was a deliberate change in #69, but I think we can fix it: if we check if it's a type or a term, we can stuff terms into the Context and only do the substitution for types to get the nice dependent behaviour. I'll try and work up a patch for this.
It seems like there might still be some performance issues lurking, even with #420. Here's the hot part of the profile from running dhall-to-cabal on its own dhall-to-cabal.dhall file, with a few boring lines removed:
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(98,1)-(749,49) 2286 1 1.3 0.5 55.7 68.3
normalize Dhall.Core src/Dhall/Core.hs:1222:1-41 2358 1845053 0.2 0.5 2.3 3.6
normalizeWith Dhall.Core src/Dhall/Core.hs:(1326,1)-(1676,22) 2359 1845053 1.0 1.7 2.1 3.2
denote Dhall.Core src/Dhall/Core.hs:(1248,1)-(1308,39) 2360 2861138 0.9 1.3 1.0 1.4
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3111 325862 0.1 0.1 0.1 0.1
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3112 325862 0.0 0.1 0.0 0.1
lookup Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:302:1-59 3118 2 0.0 0.0 0.0 0.0
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 2500 1464940 1.3 2.1 1.4 2.2
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3113 424866 0.1 0.1 0.1 0.1
lookup Dhall.Context src/Dhall/Context.hs:(61,1)-(68,33) 2495 339775 0.1 0.0 0.1 0.0
subst Dhall.Core src/Dhall/Core.hs:(808,1)-(948,29) 2548 158736 0.1 0.1 0.1 0.2
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 3115 110635 0.0 0.1 0.0 0.1
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3116 20069 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3117 22514 0.0 0.0 0.0 0.0
insert Dhall.Context src/Dhall/Context.hs:40:1-49 2496 75635 0.0 0.0 0.0 0.0
judgmentallyEqual Dhall.Core src/Dhall/Core.hs:(1682,1)-(1685,51) 2543 19504 0.0 0.0 0.1 0.2
alphaNormalize Dhall.Core src/Dhall/Core.hs:(954,1)-(1210,11) 2547 54964 0.0 0.0 0.1 0.1
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 3124 73666 0.0 0.1 0.0 0.1
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3149 2718 0.0 0.0 0.0 0.0
subst Dhall.Core src/Dhall/Core.hs:(808,1)-(948,29) 3123 73068 0.0 0.1 0.0 0.1
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 3125 2958 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3148 2718 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3128 2548 0.0 0.0 0.0 0.0
normalize Dhall.Core src/Dhall/Core.hs:1222:1-41 2544 39008 0.0 0.0 0.0 0.0
normalizeWith Dhall.Core src/Dhall/Core.hs:(1326,1)-(1676,22) 2545 39008 0.0 0.0 0.0 0.0
denote Dhall.Core src/Dhall/Core.hs:(1248,1)-(1308,39) 2546 54964 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3126 2548 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3127 2548 0.0 0.0 0.0 0.0
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 2415 8717 0.6 0.8 0.6 1.0
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 3047 90 14.2 6.8 49.5 60.4
normalize Dhall.Core src/Dhall/Core.hs:1222:1-41 3057 19572568 2.4 4.9 24.1 40.6
lookup Dhall.Context src/Dhall/Context.hs:(61,1)-(68,33) 3054 4373775 1.0 0.1 1.0 0.1
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 3056 1262834 0.2 0.2 0.2 0.2
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3086 27395 0.0 0.0 0.0 0.0
insert Dhall.Context src/Dhall/Context.hs:40:1-49 3055 959891 0.0 0.0 0.0 0.0
subst Dhall.Core src/Dhall/Core.hs:(808,1)-(948,29) 3083 218956 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3110 5434 0.0 0.0 0.0 0.0
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 3093 3358 0.0 0.0 0.0 0.0
judgmentallyEqual Dhall.Core src/Dhall/Core.hs:(1682,1)-(1685,51) 3078 182106 0.2 0.3 0.9 1.0
alphaNormalize Dhall.Core src/Dhall/Core.hs:(954,1)-(1210,11) 3082 428650 0.1 0.0 0.4 0.5
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 3091 346610 0.1 0.3 0.1 0.3
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3109 1500 0.0 0.0 0.0 0.0
subst Dhall.Core src/Dhall/Core.hs:(808,1)-(948,29) 3090 343070 0.2 0.2 0.2 0.2
shift Dhall.Core src/Dhall/Core.hs:(648,1)-(801,29) 3092 17070 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3108 1500 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3089 7356 0.0 0.0 0.0 0.0
normalize Dhall.Core src/Dhall/Core.hs:1222:1-41 3079 364212 0.1 0.1 0.2 0.2
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 3073 89863 6.4 8.8 7.2 10.6
lookup Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:302:1-59 3050 1372 0.0 0.0 0.0 0.0
insert Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:317:1-25 3060 1125 0.0 0.0 0.0 0.0
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3075 861 0.0 0.0 0.0 0.0
union Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:385:1-23 3129 7 0.0 0.0 0.0 0.0
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 3049 0 1.8 1.1 1.8 1.1
toList Data.HashMap.Array Data/HashMap/Array.hs:397:1-21 3048 0 0.0 0.0 0.0 0.0
lookup Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:302:1-59 3023 12 0.0 0.0 0.0 0.0
insert Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:317:1-25 3034 1 0.0 0.0 0.0 0.0
union Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:385:1-23 3144 1 0.0 0.0 0.0 0.0
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 2295 0 0.2 0.1 0.2 0.1
normalise is still being called 20 million times! I can't imagine that all of that's necessary. There's still work to be done here, I guess.
@quasicomputational: I think one clue here is that it's not just anynormalize, but rather the normalize located here:
... which occurs as part of type-checking RecordLits. Specifically, it's the normalize called on the type of the type of each record field to verify whether all the record fields are terms or types.
That indicates that there might be some poor time complexity in type-checking record literals
@quasicomputational: Another thing which can have an effect on the constant-factors of the performance is that the current parser emits way too many nested Noted constructors (mainly due to how it parses operators). You can also try changing the noted function here to id which might clean up the profile:
@quasicomputational: Ah, I think I can guess what is the problem. I think it's because my advice on https://github.com/dhall-lang/dhall-haskell/pull/420 was bad :)
Here is what I think is happening. In the following expression:
let x = constructors BigEnum
in e
... when e is type-checked the normalized right-hand-side of the let expression (i.e. constructors BigEnum) is added to the context, and normalizing constructors BigEnum generates the gigantic record. Type-checking that record of constructors every time we look up x from the context is leading to the huge number of normalize calls
The solution is to ignore my original advice on that pull request. Instead, you should not normalize the right-hand-side on the fast path of type-checking let expressions
That's a useful observation; I hadn't realised that normalize would turn constructors BigEnum into the large thing. Removing normalize calls where the returned expression isn't immediately inspected does improve the benchmark a bit. Before:
time 966.7 ms (931.6 ms .. 1.035 s)
0.999 R虏 (0.999 R虏 .. 1.000 R虏)
mean 943.0 ms (936.0 ms .. 955.7 ms)
std dev 12.51 ms (565.0 渭s .. 15.44 ms)
After:
time 887.8 ms (766.7 ms .. NaN s)
0.998 R虏 (0.991 R虏 .. 1.000 R虏)
mean 890.5 ms (875.3 ms .. 907.6 ms)
std dev 20.13 ms (7.584 ms .. 26.46 ms)
But this isn't borne out in the dhall-to-cabal test. Before:
total time = 13.00 secs (12999 ticks @ 1000 us, 1 processor)
total alloc = 7,754,179,628 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
getInput Text.Megaparsec Text/Megaparsec.hs:404:1-40 20.9 17.5
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 19.0 8.4
normalizeWith Dhall.Core src/Dhall/Core.hs:(1325,1)-(1675,22) 14.9 21.6
denote Dhall.Core src/Dhall/Core.hs:(1247,1)-(1307,39) 13.6 17.4
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 9.3 10.4
getPosition Text.Megaparsec Text/Megaparsec.hs:417:1-51 5.1 5.6
normalize Dhall.Core src/Dhall/Core.hs:1221:1-41 3.4 5.9
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 2.9 1.4
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 2.8 4.7
shift Dhall.Core src/Dhall/Core.hs:(647,1)-(800,29) 2.7 3.7
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(102,1)-(758,49) 2.2 0.9
After:
total time = 13.13 secs (13126 ticks @ 1000 us, 1 processor)
total alloc = 7,714,118,368 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
getInput Text.Megaparsec Text/Megaparsec.hs:404:1-40 21.2 17.6
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 18.9 8.5
normalizeWith Dhall.Core src/Dhall/Core.hs:(1325,1)-(1675,22) 14.5 21.3
denote Dhall.Core src/Dhall/Core.hs:(1247,1)-(1307,39) 13.5 17.1
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 9.7 10.5
getPosition Text.Megaparsec Text/Megaparsec.hs:417:1-51 5.2 5.6
normalize Dhall.Core src/Dhall/Core.hs:1221:1-41 3.2 5.9
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 2.9 4.7
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 2.7 1.4
shift Dhall.Core src/Dhall/Core.hs:(647,1)-(800,29) 2.7 3.8
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(102,1)-(757,49) 2.1 0.9
lookup Dhall.Context src/Dhall/Context.hs:(61,1)-(68,33) 1.2 0.1
It's possible that the normalize on let-bound values is bad but the normalize on lambda-bound values is good, but this is where I start to worry about over-tuning for specific cases.
I think this makes sense: the Field case checks that the first operand is a record, and type-checking Record involves going into the types of all the fields, type-checking them, and normalising them to ensure types and terms aren't mixed. If we can figure out a way to do that once at the binding site and not at every use site, I think that's a potentially large win. (Also, I think this has changed due to #420, but I see the hot normalize call being in the Record case, not the RecordLit case; that would make sense as well, since there aren't any large RecordLits being substituted in any more.)
Here's a profile of dhall-to-cabal with noted stubbed out:
dhall-to-cabal +RTS -p -RTS dhall-to-cabal.dhall
total time = 12.13 secs (12133 ticks @ 1000 us, 1 processor)
total alloc = 7,399,873,104 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
traverseWithKey Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(420,1)-(421,78) 22.3 8.5
denote Dhall.Core src/Dhall/Core.hs:(1247,1)-(1307,39) 17.2 18.2
normalizeWith Dhall.Core src/Dhall/Core.hs:(1325,1)-(1675,22) 15.9 22.6
expr Dhall.Parser src/Dhall/Parser.hs:1550:1-20 14.5 20.7
toHashMap Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:599:1-44 10.9 10.9
toList Data.HashMap.Strict.InsOrd src/Data/HashMap/Strict/InsOrd.hs:(580,1)-(584,22) 3.9 1.4
normalize Dhall.Core src/Dhall/Core.hs:1221:1-41 3.3 6.2
new_ Data.HashMap.Array Data/HashMap/Array.hs:177:1-28 3.3 4.9
shift Dhall.Core src/Dhall/Core.hs:(647,1)-(800,29) 3.2 3.8
typeWithA Dhall.TypeCheck src/Dhall/TypeCheck.hs:(102,1)-(758,49) 2.6 1.0
lookup Dhall.Context src/Dhall/Context.hs:(61,1)-(68,33) 1.3 0.1
And the benchmark:
time 914.0 ms (898.0 ms .. 926.6 ms)
1.000 R虏 (1.000 R虏 .. 1.000 R虏)
mean 929.0 ms (921.7 ms .. 938.0 ms)
std dev 9.101 ms (3.645 ms .. 12.56 ms)
variance introduced by outliers: 19% (moderately inflated)
Would #693 fix this?
I think it will definitely help, but it's still possible to have large records and I think the performance is still going to be bad. It might be time for a re-evaluation and to see where we are; I think the original description as written is a bit outdated now...
We are working in replace constructor with the new direct union access, the performance of tests has improved significantly (50%-200%, and the real case dhall-to-cabal a 200%!). Both test in the same machine using dhall-1.19.0:
golden tests
dhall-to-cabal
SPDX: OK (5.18s)
nested-conditions: OK (4.54s)
mixins-no-signatures: OK (4.58s)
map-source-repo: OK (4.44s)
gh-55: OK (4.55s)
gh-53: OK (4.00s)
empty-package: OK (0.82s)
dhall-to-cabal: OK (5.35s)
default-license-2.2: OK (3.65s)
conditional-dependencies: OK (5.05s)
compiler-options-order: OK (4.47s)
allrightsreserved-2.2: OK (3.42s)
allrightsreserved-2.0: OK (3.36s)
golden tests
dhall-to-cabal
SPDX: OK (9.33s)
nested-conditions: OK (9.22s)
mixins-no-signatures: OK (12.56s)
map-source-repo: OK (5.21s)
gh-55: OK (6.30s)
gh-53: OK (6.35s)
empty-package: OK (0.76s)
dhall-to-cabal: OK (16.17s)
default-license-2.2: OK (4.76s)
conditional-dependencies: OK (8.27s)
compiler-options-order: OK (6.95s)
allrightsreserved-2.2: OK (5.17s)
allrightsreserved-2.0: OK (5.13s)
Thanks @FintanH! :100:
Nice! :') Thanks to @Gabriel439 for the great idea :smile:
On the nbe-elaboration branch:
[nix-shell:/home/ollie/work/dhall-haskell/dhall]# time ./dist/build/dhall/dhall <<< ./dhall-prelude-enum-test/main.dhall
[ < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
, < Bar : {} | Baz : {} | Foo : {} >.Foo
]
real 0m0.015s
user 0m0.011s
sys 0m0.004s
I'll close this out as a duplicate of https://github.com/dhall-lang/dhall-haskell/issues/1039 since the only work left here is to complete the elaboration work linked in that issue
Most helpful comment
On the
nbe-elaborationbranch: