See https://discourse.julialang.org/t/status-of-pkg-speed-improvements-outside-us-in-v1-5/46395/17. In 1.5.2, we should clone registries by default on Windows since unpacking tarballs with lots of small files is so slow there.
Sounds like a good plan.
I'm assuming we'll do this not only for the 1.5.2 release, but also for master, right?
Or are you hoping to have "use the registry without unpacking the tarball" functionality ready in time for 1.6?
We should do it on master for now but for 1.6 I am hoping to implement reading the registry directly from the tarball. Once I've hooked up using Tar.jl for extracting tarballs, this shouldn't be very hard.
FYI, Microsoft is making progress with tarball unpacking: https://github.com/microsoft/WinDev/issues/27.
Thanks, that's a very interesting link.
Would be nice to see how Julia/Pkg behaves after that update. Is there a way to know if you have it?
I'm sure you can just look it up in the Windows Update log somehow, but I brute-forced it. I downloaded the Firefox 40 source, decompressed it, then timed the unpack with Measure-Command { tar xf .\firefox-40.0.source.tar } in PowerShell, as described in that issue. Result: 118 seconds, which means I already have the fix, since it didn't take close to 30 minutes.
And that I guess is bad news for us since I already had the fix when I got slow Pkg timings in the Discourse thread.
Thanks for fixing this.
I am teaching a class where about half the students use Windows, and this is generating some frustration, with the first package install taking 10-15 minutes.
Should already be in the 1.5.3 release.
After removing my .julia dir, adding DataFrames on win10 took over 6 minutes.
Most of the time was spent after the "Installing known registries into ..." log line.
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.3 (2020-11-09)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Xeon(R) Gold 6238R CPU @ 2.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, skylake-avx512)
julia> import Pkg
julia> @time Pkg.add("DataFrames")
Installing known registries into `C:\Users\Manela\.julia`
Added registry `General` to `C:\Users\Manela\.julia\registries\General`
Resolving package versions...
Installed Reexport ──────────────────── v0.2.0
Installed PooledArrays ──────────────── v0.5.3
Installed DataAPI ───────────────────── v1.4.0
Installed Formatting ────────────────── v0.4.1
Installed Compat ────────────────────── v3.23.0
Installed DataFrames ────────────────── v0.22.0
Installed DataStructures ────────────── v0.18.8
Installed PrettyTables ──────────────── v0.10.1
Installed SortingAlgorithms ─────────── v0.3.1
Installed IteratorInterfaceExtensions ─ v1.0.0
Installed Tables ────────────────────── v1.2.1
Installed CategoricalArrays ─────────── v0.9.0
Installed JSON ──────────────────────── v0.21.1
Installed DataValueInterfaces ───────── v1.0.0
Installed Parsers ───────────────────── v1.0.12
Installed Crayons ───────────────────── v4.0.4
Installed TableTraits ───────────────── v1.0.0
Installed InvertedIndices ───────────── v1.0.0
Installed StructTypes ───────────────── v1.1.0
Installed Missings ──────────────────── v0.4.4
Installed OrderedCollections ────────── v1.3.2
Updating `C:\Users\Manela\.julia\environments\v1.5\Project.toml`
[a93c6f00] + DataFrames v0.22.0
Updating `C:\Users\Manela\.julia\environments\v1.5\Manifest.toml`
[324d7699] + CategoricalArrays v0.9.0
[34da2185] + Compat v3.23.0
[a8cc5b0e] + Crayons v4.0.4
[9a962f9c] + DataAPI v1.4.0
[a93c6f00] + DataFrames v0.22.0
[864edb3b] + DataStructures v0.18.8
[e2d170a0] + DataValueInterfaces v1.0.0
[59287772] + Formatting v0.4.1
[41ab1584] + InvertedIndices v1.0.0
[82899510] + IteratorInterfaceExtensions v1.0.0
[682c06a0] + JSON v0.21.1
[e1d29d7a] + Missings v0.4.4
[bac558e1] + OrderedCollections v1.3.2
[69de0a69] + Parsers v1.0.12
[2dfb63ee] + PooledArrays v0.5.3
[08abe8d2] + PrettyTables v0.10.1
[189a3867] + Reexport v0.2.0
[a2af1166] + SortingAlgorithms v0.3.1
[856f2bd8] + StructTypes v1.1.0
[3783bdb8] + TableTraits v1.0.0
[bd369af6] + Tables v1.2.1
[2a0f44e3] + Base64
[ade2ca70] + Dates
[8bb1440f] + DelimitedFiles
[8ba89e20] + Distributed
[9fa8497b] + Future
[b77e0a4c] + InteractiveUtils
[76f85450] + LibGit2
[8f399da3] + Libdl
[37e2e46d] + LinearAlgebra
[56ddb016] + Logging
[d6f4376e] + Markdown
[a63ad114] + Mmap
[44cfe95a] + Pkg
[de0858da] + Printf
[3fa0cd96] + REPL
[9a3f8284] + Random
[ea8e919c] + SHA
[9e88b42a] + Serialization
[1a1011a3] + SharedArrays
[6462fe0b] + Sockets
[2f01184e] + SparseArrays
[10745b16] + Statistics
[8dfed614] + Test
[cf7118a7] + UUIDs
[4ec0a83e] + Unicode
396.214050 seconds (7.26 M allocations: 463.055 MiB, 0.07% gc time)
julia>
Which of these paths exists?
C:\Users\Manela\.julia\registries\General\.git
C:\Users\Manela\.julia\registries\General\.tree_info.toml
C:\Users\Manela\.julia\registries\General\.tree_info.toml exists but not the .git one.
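The check above can also be scripted. This is a minimal sketch, assuming (as the exchange above implies) that a Git-cloned registry contains a `.git` entry while one unpacked from a Pkg server tarball contains `.tree_info.toml`. `registry_kind` is a hypothetical helper, not part of Pkg, and the default path is taken from the paths quoted in this thread.

```julia
# Hypothetical helper (not a Pkg API): guess how a registry directory was installed.
function registry_kind(path::AbstractString)
    isdir(path) || return :missing
    if ispath(joinpath(path, ".git"))
        return :git_clone          # registry was obtained with a Git clone
    elseif isfile(joinpath(path, ".tree_info.toml"))
        return :unpacked_tarball   # registry was downloaded as a tarball and unpacked
    else
        return :unknown
    end
end

# Default location assumed from the paths quoted in this thread.
general = joinpath(homedir(), ".julia", "registries", "General")
println(registry_kind(general))
```

A result of `:unpacked_tarball` on Windows would suggest the slow tarball path is still in use.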
Looks like my commit didn't get cherry-picked onto the release-1.5 branch even though I tagged it. @KristofferC, @fredrikekre, was there some issue with backporting this change to 1.5?
No backport label, I guess. Or maybe it came after 1.5.3 was already made?
It's this PR, right? https://github.com/JuliaLang/Pkg.jl/pull/2175
We definitely gotta make sure #2175 is backported to the 1.6 branch.
Thanks for looking into it. I hope it makes it into a release soon.
Any quick fixes I can share with my students in the meanwhile?
Tell your students to open a new Julia session and run the following code:
ENV["JULIA_PKG_SERVER"] = ""
import Pkg
Pkg.Registry.rm("General")
Pkg.Registry.add("General")
This will force Pkg to Git clone the General registry.
Please note: this is the only time that the students need to set the JULIA_PKG_SERVER environment variable. We do it during this step to force Pkg to get the registry via Git.
I did put the backport label on it.
Thanks for the workaround!
It still took 8 minutes to add the git registry the first time, but hopefully updates are faster.
Yes, the bottom line is that some versions of Windows are just really slow at writing files. The only way to improve that is to avoid doing it in the first place, which we can't do yet, but the git approach at least minimizes the amount of writes on each update.
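To get a rough sense of how slow small-file writes are on a given machine, something like the following could be run. It is only a sketch: `time_small_writes` is a hypothetical helper, and the file count and size are arbitrary assumptions rather than a model of the actual General registry.

```julia
# Rough benchmark: time how long it takes to create `n` small files
# in a temporary directory (which is deleted afterwards).
function time_small_writes(n::Integer; bytes::Integer = 200)
    payload = rand(UInt8, bytes)
    mktempdir() do dir
        @elapsed for i in 1:n
            write(joinpath(dir, "file_$i.toml"), payload)
        end
    end
end

println("2000 small files: ", round(time_small_writes(2000), digits = 2), " s")
```

Comparing the number across machines (or with antivirus scanning toggled) may help tell whether file writes are the bottleneck.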
I often see an error on Windows that may be related.
ERROR: SystemError: opening file "C:\\Users\\Manela\\.julia\\registries\\General\\Registry.toml": No such file or directory
I reproduced it by removing just the General/Registry.toml file on a working installation.
My guess is that users give up waiting for the General registry to install and close julia while it is copying the files.
I found that removing the entire registries/General dir helps, for example by running the following:
rm(joinpath(homedir(), ".julia", "registries", "General"), recursive=true)
followed by a package update.
That probably should be a new issue. I guess pkg operations should first validate that the Registry.toml exists and regenerate it if not.
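Until something like that exists in Pkg, the check can be done by hand. A minimal sketch, assuming that a registry directory without `Registry.toml` is an interrupted install that is safe to delete; `remove_if_incomplete` is a hypothetical helper, not a Pkg function.

```julia
# Hypothetical helper (not part of Pkg): remove a registry directory that is
# missing its Registry.toml so the registry can be installed again from scratch.
# Returns true if a broken directory was removed, false otherwise.
function remove_if_incomplete(regdir::AbstractString)
    if isdir(regdir) && !isfile(joinpath(regdir, "Registry.toml"))
        rm(regdir; recursive = true, force = true)
        return true
    end
    return false
end

# Demonstration on a throwaway directory rather than the real depot.
mktempdir() do tmp
    broken = joinpath(tmp, "General")
    mkpath(joinpath(broken, "A"))          # partially copied registry, no Registry.toml
    println(remove_if_incomplete(broken))  # reports whether the directory was removed
    println(isdir(broken))
end
```

On a real installation the argument would be `joinpath(homedir(), ".julia", "registries", "General")`, followed by `Pkg.Registry.add("General")` or a package update to reinstall the registry.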