Doing some profiling of startup time, the main items are:
- Function * objects for symbols in the system image (~15%)

Low-hanging fruit is perhaps creating Function * objects lazily and deferring libgit2 initialization until it's needed. Might also be worth looking into cholmod's init to try to get it to compile everything up front.
Loading and parsing ~/.julia_history may also be contributing when present. Doing that lazily could be a big win and shouldn't be too hard.
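The deferred loading mentioned above could look roughly like the sketch below. The `load_history` function and the module-level `Ref` are illustrative assumptions, not the actual REPL code; the real implementation would hang off the REPL's history provider.

~~~julia
# Sketch: defer reading/parsing the history file until it is first needed,
# so startup pays nothing when history is never touched.
const _history = Ref{Union{Nothing,Vector{String}}}(nothing)

function load_history(path::AbstractString)
    if _history[] === nothing                       # parse at most once
        _history[] = isfile(path) ? readlines(path) : String[]
    end
    return _history[]                               # cached on later calls
end
~~~

The first call does the file I/O; every later call returns the cached vector, so the cost moves from startup to first use of history.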
Yes, it can be significant if your history file is large.
Why is it not possible to pre-create Function * objects for symbols in the system image within the system image itself? Would it make sense to defer BLAS initialization until it is needed?
I think many history files are large; and as the user base expands that proportion will grow.
Update on what takes time during startup (run at #28118):
So, ~55% is restoring the sysimg, 18% is BLAS, 5% is initialization of the frontend, and 4% is a call to srand.
True, but also about 100x faster than Ruby 😁
@StefanKarpinski that's the exact opposite of what I just said; can you explain?
What I'm saying is that in exchange for Julia's 20% slower startup time, you get a language that's 100x faster than Ruby. The fact that Julia is _only_ 20% slower to startup is impressive given that compiling your C++ code would take considerably longer, for example. Startup and compilation time is and will continue to be a high priority for us and will be improved going forward, but posting comparisons between Julia's startup time and the startup time of various slow, interpreted languages doesn't really contribute anything useful. We already know how long it takes to start Julia and there's nothing we can learn from those languages since they are so technologically dissimilar. If you happen to know of a fast, JIT compiled language with really snappy startup time, then that would be potentially helpful.
@StefanKarpinski the first one that comes to mind is PowerShell, but that might not be a good match. I will link it in case it is helpful, thanks.
PowerShell command execution time is more comparable to timing evaluation in Julia's REPL, which is quite fast for everyone's favorite super useful example:
~~~
julia> @time println("Hello, world")
Hello, world
  0.002232 seconds (27 allocations: 1.750 KiB)

julia> @time println("Hello, world")
Hello, world
  0.000022 seconds (8 allocations: 240 bytes)
~~~
@StefanKarpinski how about luajit?
just joking 😂
BTW, here is a list of the innovative features in LuaJIT, hope it helps.
LuaJIT is a super impressive piece of work. There's still not much to learn since Lua is notoriously minimal and LuaJIT doesn't do any of the things that are taking time above: loading and initializing a BLAS library, a high-performance RNG, multithreading infrastructure, etc.
Please keep future commentary on this thread to constructive points about Julia's startup time.
"4% is a call to srand."
How about getting rid of that/MersenneTwister for Julia 1.1?
Does having to do "using Random" already eliminate srand from startup? For compatibility, a "using RandomMersenneTwister" could be a possibility, though Karpinski has said on my issue about replacing it that the specific RNG is not to be relied on. Good, much faster alternative RNG code exists in Julia that doesn't have a huge state.
Which codes are you referring to?
https://sunoru.github.io/RandomNumbers.jl/stable/man/xorshifts/
"The successor to Xorshift128 series." (that series is also implemented, e.g. "Xorshift128Plus is presently used in the JavaScript engines of Chrome, Firefox and Safari."). https://github.com/JuliaLang/julia/issues/27614
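For reference, the xorshift128+ generator mentioned above is tiny. The sketch below follows the published algorithm for illustration; it is not Base or RandomNumbers.jl code, and the type and function names are made up here. Its whole state is two UInt64 words, versus the Mersenne Twister's roughly 2.5 KB, which is what makes cheap seeding at startup plausible.

~~~julia
# Sketch of xorshift128+ (Vigna): 128 bits of state, one add, three shifts,
# three xors per 64-bit output.
mutable struct Xorshift128Plus
    s0::UInt64
    s1::UInt64
end

function next!(g::Xorshift128Plus)
    x, y = g.s0, g.s1
    g.s0 = y
    x ⊻= x << 23
    g.s1 = x ⊻ y ⊻ (x >> 17) ⊻ (y >> 26)
    return g.s1 + y            # 64 random bits
end
~~~

Two generators seeded with the same pair of words produce the same stream, and the all-zero state is the one seed that must be avoided.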
I thought those RNGs don't pass the RNG test suite.
Use 1.3. 1.2 has a known startup time regression.
And you use some arbitrary Julia package in the Julia code. Just use Printf.
Also, startup times of interpreters are not comparable with jit compiled languages.
I will try the new version and also Printf later and update.
This specific example is pretty arbitrary though. If we only talk about latency you can do
~~~
❯ cat app.jl
using Printf
@printf "%05.2f\n" 1.2
❯ time julia --compile=min app.jl
01.20
julia --compile=min app.jl  0.08s user 0.06s system 133% cpu 0.104 total
~~~
Comparing random small pieces of code between different languages is not really productive.
@KristofferC but it is a comparison. I think you're in a better position to criticise my methods after you've presented your own comparison.
My point is that simple comparisons don't provide any information that help make Julia improve. Profiling like https://github.com/JuliaLang/julia/issues/17285#issuecomment-407721005, identifying hot spots where things can be optimized etc, on the other hand, might be useful.
Saying Julia does x in 0.5s while OtherLanguage does it in 0.3s is just pointless in an issue like this. You could post the result on a blog or something.
@KristofferC you're right in that it doesn't help fix the problem. But it does help confirm the existence of the problem, and its degree. Something to help fix the problem would be more valuable, which I think you are alluding to. But suggesting that comparison testing has no value is just wrong. Comparison is literally the only way one could know a problem exists in the first place. Without comparison, you'd have to have some objective measure of what is fast and slow in regard to interpreter startup, and I am not aware of any such standard.

At any rate, in order to minimize additional noise I am editing my comment with the following. I reran the test with the suggestions and it does make a considerable difference. Julia is still slower than all of the others, but by a lesser margin:
~~~
$ bin/julia -v
julia version 1.3.0-rc5
$ cat app.jl
using Printf
@printf "%05.2f\n" 1.2
$ time bin/julia --compile=min app.jl
01.20
real 0m0.180s
~~~
~~~
$ cat app.rb
s1 = '%05.2f' % 1.2
puts s1
$ time ruby app.rb
01.20
real 0m0.140s
~~~
~~~
$ cat app.py
s1 = format(1.2, '05.2f')
print(s1)
$ time python3 app.py
01.20
real 0m0.094s
~~~
~~~
$ cat app.php
$s1 = sprintf('%05.2f', 1.2);
var_dump($s1);
$ time php app.php
string(5) "01.20"
real 0m0.078s
~~~
For cholmod, we could avoid compiling it into the system image. I imagine the compile time of SuiteSparse is hardly noticeable.
I imagine libgit2 can also be removed from the system image in 1.4 once Pkg does not need it.
Not building BLAS into the system image would open the door to easier use of alternate BLAS libraries. However, this one will affect a lot of people if it is not in the system image, and these packages probably have significant compile time.
> I imagine libgit2 can also be removed from the system image in 1.4 once Pkg does not need it.
Pkg needs LibGit2 in 1.4.
Is it 1.5 then or am I misunderstanding?
Yes, we still support using git for registries and to add unregistered packages via git URLs.
> Not building blas into the system image will open the door to easier use of alternate blas libraries.
Why not build it into the system image and also allow an alternate blas library to be used?