This is mostly just a meta issue attempting to summarize several existing discussions (issues). I'll try to include links to existing issues I've found whenever possible, but if I miss something please feel free to post a suggestion.
I think the general problem is best summarized by the following statement:
> stdlibs are the place code goes to die
I don't think this statement is entirely fair, but folks do seem to avoid contributing new stdlibs, fixes or updates (myself included). From my experience this is mostly because the stdlib dev workflow is closer to working on the core language than to working on an external package (Pkg.jl and Statistics.jl sit in the middle).
Here are some things that might make this better from a user standpoint.
] dev Statistics (current solution is pretty manual)
] free Statistics (so stdlibs are basically just pinned default packages that can be unpinned and updated).
[...] Base, stdlibs or an external package. I think folks jump to wanting things in Base that could often be an independent stdlib or even an external package?
A couple technical problems that I've seen from other issues include:
[...] pip install -U pip) and just requiring a restart? Currently, Pkg.jl doesn't need to define a compat because all dependencies are baked stdlibs, but if all stdlibs could be updated then I assume we'd need to handle potential incompatibilities appropriately?
Thanks so much for opening this!
I believe the primary reason that "stdlibs are the place where code goes to die" is that the deployment cycle of stdlibs is so completely tied to Base.
That is, you cannot make changes to a stdlib and immediately deploy them in any project unless you build Julia from source. This is a major disincentive for anyone who is using Julia to get work done.
I agree that the development cycle for stdlibs is also more painful than it needs to be, but I feel like the inability to deploy applications relying on stdlib changes is the largest problem.
As to technical challenges, I think the stdlib load time is the big one. I think the current plan is to have PackageCompiler more nicely integrated so that the sysimage can be easily rebuilt. Improvements to separate compilation have also been discussed several times, but it's not clear this will actually help as method invalidation might undo most attempts at separate compilation.
Would the idea be to have a minimal version of PackageCompiler integrated into Julia (as a stdlib), so that Pkg.jl can use it directly (rather than the other way around)?
Thanks a lot for this. This will open up a new world for Julia.
> I think the current plan is to have PackageCompiler more nicely integrated so that the sysimage can be easily rebuilt
> Would the idea be to have a minimal version of PackageCompiler integrated into Julia (as a stdlib), so that Pkg.jl can use it directly (rather than the other way around)?
Can't we use the artifacts system to download the ready stdlibs? We should not need to build them offline on each system when we do have the artifacts (?).
This probably requires each stdlib to be a separate file. It may not be possible (?) via current facilities.
> Can't we use the artifacts system to download the ready stdlibs? We should not need to build them offline
What do you mean by "ready" and "build"? With the current technology we need to rebuild the sysimage so that it contains the new version of the stdlib. This process is identical to how PackageCompiler makes a new sysimage with any other package.
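For readers unfamiliar with that process, here is a minimal sketch of rebuilding a sysimage with PackageCompiler. It assumes PackageCompiler.jl is installed and that the package being baked in is part of the active project; `Example` and the output file name are placeholders chosen for illustration, not anything proposed in this thread.

```julia
# Assumption: PackageCompiler.jl is installed in the active environment,
# and the placeholder package `Example` has been added to the project.
using PackageCompiler

# Compile the package into a new system image. Whatever version the
# active project resolves is the version that ends up in the image.
create_sysimage([:Example]; sysimage_path = "custom_sysimage.so")

# Start Julia with the rebuilt image instead of the default one:
#   julia --sysimage custom_sysimage.so
```

Updating a stdlib this way would mean repeating the build each time the stdlib version changes, which is why easier integration of this step keeps coming up.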
> This probably requires each stdlib to be a separate file. It may not be possible (?) via current facilities.
I don't understand this. Why would an stdlib have to be a single file?
If we could build each stdlib as a dynamic library (like a dll), we could just download and change that file. But from what you mentioned this may not be possible currently.
> If we could build each stdlib as a dynamic library
The difficulty is that method invalidation is likely to force a lot of that code to be recompiled dynamically as multiple modules are loaded and modify the method table of existing functions. So it's not clear that such separate compilation will actually help. Hence the plan to recompile the sysimage on demand seems like a good option.
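To make the invalidation problem concrete, here is a small self-contained sketch. The functions `f` and `g` are made up for illustration; the second definition of `f` stands in for a method added when another module is loaded.

```julia
# Step 1: a generic fallback and a caller that depends on it.
f(x) = 1
g(x) = f(x)

before = g(1)   # compiles g(::Int) against the only f method that exists

# Step 2: a later definition (think: another module being loaded) adds a
# more specific method. The compiled code for g(::Int) assumed that f(x)
# was the method to call, so it is now stale and gets invalidated.
f(x::Int) = 2

after = g(1)    # g(::Int) is compiled again and now reaches f(::Int)

println((before, after))  # prints (1, 2)
```

If `g` had been compiled ahead of time in a separate library, that work would be discarded the moment the more specific `f` method appeared, which is the concern raised above.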
More than that, most instantiations of compiled code are the result of combining a generic algorithm from one package with one or more operations on specific types from other packages. That combination inherently cannot be separately compiled. This is exactly the same as template instantiation in C++ which also cannot be separately compiled.
> The difficulty is that method invalidation is likely to force a lot of that code to be recompiled dynamically as multiple modules are loaded and modify the method table of existing functions.
The trick here is defining "a lot." I'm increasingly confident that if the community pitches in (I'm still waiting for a couple more additions to Julia before I publish my blog post and invite people to join the hunt), by the time Julia 1.6 is released our invalidations will be a small fraction (maybe 5-10%?) of what they were in Julia 1.4. At that point, separate compilation will work pretty well.
But Stefan's point remains fundamental. We may need packages that exist solely for the purpose of glue.
Not just packages: it's common that user code combines unrelated packages in such a way that new code needs to be generated for some or all of them. It may be possible to do more compilation separately, but it's not going to be a panacea for a language like this.
> That combination inherently cannot be separately compiled
As long as there exists a package that includes both, we can precompile the combination.
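A rough sketch of what such a glue package could look like, using three local modules in place of real packages. All names here (`Algo`, `Money`, `Glue`, `total`, `Cents`) are hypothetical. In a real package the `precompile` call would run during package precompilation so the result could be cached; in a plain script like this it only triggers compilation in the current session.

```julia
module Algo
# Generic algorithm: knows nothing about any concrete element type.
total(xs) = foldl(+, xs)
end

module Money
# A concrete type with its own + method: knows nothing about Algo.
struct Cents
    n::Int
end
Base.:+(a::Cents, b::Cents) = Cents(a.n + b.n)
end

module Glue
# Sees both sides, so it can request the combined specialization.
using ..Algo, ..Money
precompile(Algo.total, (Vector{Money.Cents},))
end

result = Algo.total([Money.Cents(1), Money.Cents(2)])
println(result.n)  # prints 3
```

Neither `Algo` nor `Money` could have compiled `total` for `Vector{Cents}` on its own, since each is unaware of the other. Only code that loads both can do so, and that is the role of the glue module.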