Julia: 20+ minute precompilation regression on v0.6.1

Created on 28 Oct 2017 · 20 comments · Source: JuliaLang/julia

From here: https://github.com/JuliaDiffEq/DifferentialEquations.jl/issues/209

We see a >20 minute compile-time regression, along with post-compilation using statements taking as long as precompilation used to take. @mauro3 noted that it may not occur on Linux, but the latest Travis tests suggest the regression is widespread.

latency regression

Most helpful comment

Switching back to Julia 0.6.0 solves the performance issue. Is this fixed on master? I know that 1.0 is about feature completeness, but 25 seconds and 1 GB of allocations for a tiny package like FileIO sound more like a bug to me.

All 20 comments

Chris narrowed it down to BoundaryValueDiffEq.jl. I ran the precompilation for both Julia 0.6 versions and see a roughly 4x regression:

Here is the comparison of precompiling BoundaryValueDiffEq.jl on v0.6.0 and v0.6.1 (with the __precompile__() statement active here):

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.0 (2017-06-19 13:05 UTC)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |  x86_64-pc-linux-gnu

julia> @time using BoundaryValueDiffEq
INFO: Recompiling stale cache file /home/mauro/.julia/lib/v0.6/BoundaryValueDiffEq.ji for module BoundaryValueDiffEq.
315.972758 seconds (9.44 M allocations: 690.993 MiB, 0.06% gc time)
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.1 (2017-10-24 22:15 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-pc-linux-gnu

julia> @time using BoundaryValueDiffEq                                                                                                                                   
INFO: Recompiling stale cache file /home/mauro/.julia/lib/v0.6/BoundaryValueDiffEq.ji for module BoundaryValueDiffEq.                                                    
1120.472693 seconds (17.77 M allocations: 1.298 GiB, 0.06% gc time)                                                                                                      

Also on master. I notice that BandedMatrices.ji is about 120MB, apparently due to excessive inference. I suspect #21677, as much as I like that change.
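For reference, one way to check the cache file size from within Julia (a sketch; the path assumes the default v0.6 cache location and a standard JULIA_PKGDIR):

julia> filesize(joinpath(homedir(), ".julia", "lib", "v0.6", "BandedMatrices.ji")) / 2^20  # size in MiB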

BoundaryValueDiffEq.jl is the only library in the DiffEq stack that uses BandedMatrices.jl, so that seems to be where the problem is. I'll remove BoundaryValueDiffEq.jl from DifferentialEquations.jl for now to isolate the problem.

I tried bisecting this. It's a little tricky since it seems to have regressed in two steps, where BandedMatrices.ji went from 20MB to 80MB to 120MB. Unfortunately the culprit from bisecting is #23012. That was definitely a bugfix, which means the bug was an accidental optimization. Frankly, even 20MB is way too big for this package, so I suspect we're doing too much inference, and the bug caused us to skip saving a bunch of stuff that actually didn't matter. We end up spending lots of time in jl_recache_types (loading lots of types takes longer than saving them).

Pinging @dlfivefifty

Commenting out precompile fixes this issue, though that's probably not the ideal solution.

But a package which has __precompile__() in it will then still precompile BandedMatrices.jl, unless you set __precompile__(false); then, however, any packages depending on it cannot have precompilation enabled.

The issue is not the __precompile__(); it's the precompile.jl file with a list of functions to precompile (generated by SnoopCompile.jl).

So I meant: comment out the include("precompile.jl") line.
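For concreteness, here is a hedged sketch of what an opt-in guard around that include could look like; the BANDEDMATRICES_PRECOMPILE environment variable is hypothetical, not something the package actually defines:

__precompile__()

module BandedMatrices

# ... package code ...

# Only load the SnoopCompile-generated directives when explicitly requested.
# BANDEDMATRICES_PRECOMPILE is a made-up opt-in switch for illustration.
if get(ENV, "BANDEDMATRICES_PRECOMPILE", "false") == "true"
    include("precompile.jl")
end

end # module

That keeps the package precompilable for downstream users while skipping the expensive directive list by default.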

This is related to the unions of matrix types like StridedMatrix. One thing that fixes it is to set MAX_UNION_SPLITTING in inference to 1 (disabling it). Long term, it would be nice to replace those type unions. For now, we might be able to hack around it in inference.
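To illustrate what union splitting means here (a made-up example, not code from BandedMatrices.jl): when a call site's argument is inferred as a small Union, inference analyzes the callee once per union member, up to MAX_UNION_SPLITTING members, so wide unions like StridedMatrix multiply the amount of inferred and cached code.

summarize(A) = sum(A)

function pick(A::Matrix{Float64}, flag::Bool)
    B = flag ? A : view(A, :, 1:2)   # B is inferred as Union{Matrix{Float64}, SubArray{...}}
    return summarize(B)              # this call site is a candidate for union splitting
end

pick(rand(4, 4), true)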

I think StridedMatrix would work better as a trait. This would allow other, non-Base, matrix types to dispatch to the right BLAS routines.
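For context, a minimal sketch of the trait idea (all names here are hypothetical; none of this exists in Base):

# A hypothetical memory-layout trait.
abstract type MemoryLayout end
struct IsStrided  <: MemoryLayout end
struct NotStrided <: MemoryLayout end

# Default: assume nothing about a matrix's layout.
memorylayout(::Type{<:AbstractMatrix}) = NotStrided()
# Seed the trait from the existing union in one place.
memorylayout(::Type{<:StridedMatrix}) = IsStrided()
# A non-Base type could opt in without being a member of the union, e.g.
#   memorylayout(::Type{<:BandedMatrix}) = IsStrided()

# Methods dispatch on the trait value instead of on a wide type union.
traitscale!(A::AbstractMatrix, b::Number) = traitscale!(memorylayout(typeof(A)), A, b)
traitscale!(::IsStrided, A, b) = (A .*= b)   # a BLAS routine could be called here
function traitscale!(::NotStrided, A, b)
    for i in eachindex(A)
        A[i] *= b
    end
    return A
end

With that, traitscale!(rand(4, 4), 2.0) takes the strided path, and a banded matrix type would only need to define one trait method to do the same.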

Is there anything that can be done?

I tracked down why Images.jl takes so long to load, and it's this:

julia> @time using FileIO
 25.040638 seconds (19.17 M allocations: 1.065 GiB, 1.07% gc time)

This is pretty long for a core package like FileIO, which cannot be avoided (I need Images.jl).

Is there any workaround?

Switching back to Julia 0.6.0 solves the performance issue. Is this fixed on master? I know that 1.0 is about feature completeness, but 25 seconds and 1 GB of allocations for a tiny package like FileIO sound more like a bug to me.

Can you try deleting your FileIO.ji precompilation file and trying again?
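For anyone trying this, a minimal sketch assuming the default v0.6 cache location (adjust the path for a non-standard JULIA_PKGDIR):

julia> ji = joinpath(homedir(), ".julia", "lib", "v0.6", "FileIO.ji");

julia> isfile(ji) && rm(ji)     # delete the stale cache file

julia> @time using FileIO       # the next using regenerates the cache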

Currently not, since I switched back to 0.6.0 and need that environment right now. I forgot to cross-reference https://github.com/JuliaIO/FileIO.jl/issues/156.

Is this still an issue? Any update?

Under 0.6.3:

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.3 (2018-05-28 20:20 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-apple-darwin14.5.0

julia> @time using FileIO
  8.100726 seconds (4.97 M allocations: 279.204 MiB, 0.75% gc time)

So it's still way too slow (in case you were interested in the 0.6.3 timing).

Sorry for not being clear: is it an issue in 0.7-alpha too? Good to see that at least it's better than the 25 seconds.

Well, I can't really do 0.7 testing yet because of the deprecation warnings. With all the deprecations being printed, I got 15 seconds on a fairly new server.

For the original issue posted here (BoundaryValueDiffEq), the whole thing took 64 seconds on 0.6.3. That included a Pkg.add(), which brought in a bunch of dependent packages for the first time as well.

I propose closing this and tracking compile-time regressions in a new/existing 0.7/1.0 related issue.
