It would be nice if we could have a Project.jl file, as an extension to Project.toml, in the root directory to perform custom tasks that are not covered by Project.toml. `Pkg.instantiate()` would run this file after it processes Project.toml.
```julia
function prompt(message::String = "")::String
    print(message)
    return chomp(readline())
end

use_newfeature = prompt("Do you want to use the new feature of this package? (Y/N) ")
# do some stuff with use_newfeature

backend_lib = prompt("Which XML library do you want to use as the backend? ")
# do some stuff with backend_lib
```
```julia
using Pkg
@static if VERSION >= v"1.3"
    Pkg.add("AcuteML")
else
    Pkg.add(PackageSpec(url = "https://github.com/aminya/AcuteML.jl", version = v"0.5"))
end
```
I might want to add a dependency that is not yet registered. I can simply do that in the Project.jl file:

```julia
using Pkg
Pkg.add(PackageSpec(url = "https://github.com/aminya/AcuteML.jl", rev = "master"))
```
```julia
@static if VERSION < v"1.3"
    Base.write("src/compat.jl", """
        static_hasmethod(args...) = hasmethod(args...)
        """)
    # inside the package, one can use `@static if isfile("compat.jl")`
else
    Pkg.add("Tricks")  # a package with compat `julia = 1.3`
end
```
```julia
include("src/build.jl")
include("src/postinstallation.jl")

using Pkg
Pkg.test()
```
and many other examples. This is similar to deps/build.jl, but living in the root directory separates the Pkg and initialization logic from building.
This reminds me of installing Python packages. When you install a Python package, it can run basically any code it wants. This makes installing Python packages a nightmare.
Personally I do not like the idea of running arbitrary code when I instantiate a project.
> This reminds me of installing Python packages. When you install a Python package, it can run basically any code it wants. This makes installing Python packages a nightmare.
>
> Personally I do not like the idea of running arbitrary code when I instantiate a project.
You can run arbitrary code in Julia anyway. Not having this option just means that people will run this stuff inside their `__init__` instead: the work is merely deferred, and it will be repeated every time the package is used, rather than running only once at install time. That makes a nightmare of every use of the package.
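To illustrate the deferral, here is a toy module (hypothetical, not from any real package): setup placed in `__init__` runs in every session that loads the package, whereas install-time work would run once.

```julia
# Toy module: __init__ runs each time the module is loaded into a session,
# so any setup placed here is repeated per session, not done once at install.
module Example

const SETUP_RUNS = Ref(0)

function __init__()
    SETUP_RUNS[] += 1   # imagine probing the system, generating files, ...
end

end # module

using .Example
# After the first load in this session the setup has already run once;
# every fresh Julia session that loads the package runs it again.
```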
This:

> If `__init__` functions were empty, Julia could do way more optimizations than are possible now.

The limitation is not desirable all the time. Sometimes people want more flexibility, and not having it results in undesirable things that are very hard to fix later.
Running code is already allowed via `Pkg.build`. I don't understand what is different here, or what the reason for the objection is.
Side thing: not having this option does not help security. See here.
> Dependency based on the VERSION or OS

> I might want to add a dependency that is not registered yet. I can simply do it in the Project.jl file.
Dependencies and features should be declarative. If we want OS- or VERSION-dependent dependencies, or unregistered dependencies, then support should be added as new syntax in the existing Project file, not by running some script that queries things at install time. Interactive things during package install strike me as particularly bad.
In general, I would say we want to move more things towards more declarative and less arbitrary code style (cf the artifact system over the build scripts) so this would be a step in the wrong direction in my opinion.
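For illustration only (this is hypothetical syntax that Pkg does not support; the package names and keys are made up), a declarative version of a conditional dependency might look like:

```toml
# Hypothetical sketch — not valid Project.toml syntax today.
# The condition is declared as data, so the resolver can read it
# without executing any code.
[conditional-deps.SomeWindowsOnlyDep]
os = "windows"

[conditional-deps.Tricks]
julia = ">= 1.3"
```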
100% agree with @KristofferC. Arguing that making things less declarative somehow makes it easier to statically compile things doesn't make sense to me.
If you can give me solutions for the things I said in this issue, I would appreciate it!
Personally, I will abuse deps/build.jl with `include("../Project.jl")` for package management until this is added to Pkg.
I don't understand why Julia is different here...
@KristofferC: `Pkg.add` adds deps to the Project.toml. What is the difference here?
Even if it edited the Manifest: static compilation should mostly use Manifest.toml, not Project.toml. You want all the dependencies, not just the direct ones (unless you want to do incremental compilation).
When you have all the code in `__init__`, you can't do anything.
> Interactive things during package install strike me as particularly bad.
That was just an example of the possibilities! I would use global parameters and `isdefined` instead of prompting the user directly.
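For instance (a hypothetical sketch; `PROJECT_BACKEND` is a made-up name, not any real Pkg convention), the script could read its answers from a predefined global or an environment variable instead of blocking on a prompt:

```julia
# Hypothetical non-interactive configuration: check for a global first,
# then an environment variable, then fall back to a default.
backend_lib = if @isdefined(PROJECT_BACKEND)
    PROJECT_BACKEND                        # set by whoever includes this script
else
    get(ENV, "PROJECT_BACKEND", "EzXML")   # default backend if nothing is set
end
```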
@staticfloat has opened some good issues and pull requests regarding this kind of stuff.
The goal is to eventually make package directories completely immutable.
This is possible with a combination of:

1. The Project.toml file
2. Artifacts
3. Scratch spaces
4. Preferences

I think that all of the stuff you want to do can be covered by those four.
For example:
If there is something that you want to do that is not covered by Project.toml, artifacts, scratch spaces, or preferences, it might be best to open a specific issue to track that feature request. Then we can figure out how to implement it in a declarative way.
Also, for what it is worth, (and this may be off-topic): I think it would be cool for us to eventually completely remove deps/build.jl.
I am hoping that this will be possible eventually. But I will let @staticfloat correct me as to the feasibility of this.
Let me explain why doing package operations like

> I might want to add a dependency that is not registered yet. I can simply do it in the Project.jl file.

or

> Dependency based on the VERSION or OS
in a generic post-script hook is not such a good idea. The way Pkg (and many other package managers) works is that they gather up all the packages you want to have installed and what compatibility bounds they have as well as all packages and versions it knows about (from registries) and sends this to a "resolver".
The job of the resolver is to give back a set of package versions so that all dependencies are fulfilled and all compatibility info is adhered to.
However, if you run another package operation as part of installing a package, you call back into the resolver for a new set of versions. Those versions might differ from the first resolver call, because the package added in the post hook can introduce new compatibility bounds. So now you are re-running the resolver and might install new packages, which might have their own post-installation hooks, which might do Pkg operations calling back into the resolver, and so on. It seems likely that you could even end up in cycles, spinning through post-installation hooks forever.
Therefore, we want to be able to up-front gather all packages that are going to be installed so we only have to run the resolver once. This is done by e.g. making sure that we know what dependencies and compat info packages have without having to execute arbitrary Julia code.
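The difference can be sketched with a toy model (hypothetical types and a made-up mini-registry, nothing from Pkg's internals): with static dependency data one resolver pass suffices, while post-install hooks that add dependencies force resolution to be re-run until it (hopefully) reaches a fixed point.

```julia
# Toy model, not Pkg internals: each package has static deps plus deps that
# a post-install hook would only reveal at install time.
struct PkgInfo
    deps::Vector{String}       # declared, statically known dependencies
    hook_deps::Vector{String}  # dependencies added by a post-install hook
end

const REGISTRY = Dict(
    "A" => PkgInfo(["B"], ["C"]),  # A's hook adds C after A is installed
    "B" => PkgInfo(String[], String[]),
    "C" => PkgInfo(String[], String[]),
)

# With static deps only, a single graph traversal (one "resolver call") works.
function resolve_static(roots)
    seen, stack = Set{String}(), copy(roots)
    while !isempty(stack)
        p = pop!(stack)
        p in seen && continue
        push!(seen, p)
        append!(stack, REGISTRY[p].deps)
    end
    return seen
end

# With hooks, each install round can invalidate the previous resolution, so
# the resolver must be called repeatedly — and hooks could even cycle forever.
function resolve_with_hooks(roots; max_rounds = 10)
    installed, wanted, rounds = Set{String}(), Set(roots), 0
    while installed != wanted
        (rounds += 1) > max_rounds && error("hooks never converged")
        installed = resolve_static(collect(wanted))  # another resolver call
        for p in installed                           # run post-install hooks
            union!(wanted, REGISTRY[p].hook_deps)
        end
        union!(wanted, installed)
    end
    return installed, rounds
end
```

Here `resolve_static(["A"])` finishes in one pass with `{"A", "B"}`, while `resolve_with_hooks(["A"])` needs a second full resolver round to discover `"C"`.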
> @DilumAluthge: I think it would be cool for us to eventually completely remove deps/build.jl.
Yes, that would be great. Not sure if we'll ever be able to fully get there, but we want to at least eliminate as much non-declarative package setup as possible.
> @aminya: I don't understand why Julia is different here...
Allowing people to run arbitrary code when installing, configuring, or setting up packages is certainly the easiest thing to do from a design perspective, and it's very seductive. I suspect that's why so many systems do it: it's easy and maximally flexible.
But it really ruins reproducibility, predictability, and portability. If you run arbitrary code that can look at anything on the system when configuring or installing a package, then even how to install a project implicitly depends on all of the global mutable state of the environment it happens to run on. If arbitrary code can be run to determine the dependencies of a package, then it's not even possible to definitively say what a package's dependencies are.
Most of the Pkg work in 1.0 and since has been in the exact opposite direction: we're trying to make as much of package installation and setup as possible declarative and immutable. This proposal does the opposite and is thus antithetical to the philosophy of Pkg.
Another point: CMake is a build system, not a package manager. Build systems have to run code; that's how you build things. That's a totally different situation. Conan is a package manager, but it invokes an arbitrary build script, so it makes sense that it also allows arbitrary execution; they're just not in a position to constrain things more, even if it would be beneficial. Rust is/has a build system, not a package manager. I don't believe that Cargo actually lets you run arbitrary code to determine what the dependencies of a package are. Even if it does, that doesn't mean we should.
One of the primary things that impeded my PyEnv.jl from being useful was the fact that pip installation often must run arbitrary code in order to determine which packages to install. It makes installing the dependencies of a Python package (especially for a foreign platform/architecture) a real chore.
Because you can't know what packages need to be installed beforehand, you end up with longer install times (you can't fetch all packages in parallel, you have to fetch some packages, start installing them, then go fetch the dependencies that are missing from those dependencies, start installing _those_ dependencies, and recurse), and you end up needing to actually set up every environment that you'd want to be able to install into, rather than being able to just collect everything you need for all platforms from the get-go. The Pythonistas know this is an issue, and they're moving toward a fully-declarative model with .whl files that don't allow this kind of flexibility, because it is a disaster for reproducibility.
> you can't fetch all packages in parallel, you have to fetch some packages, start installing them, then go fetch the dependencies that are missing from those dependencies, start installing those dependencies, and recurse
Not only is this slow, but it means that Python projects can't actually do version resolution correctly, for reasons like what @KristofferC described. You need the full dependency graph in advance in order to do version resolution, and even with that graph it's a non-trivial problem. If you have to run code per version to find out what its dependencies are, then to do proper version resolution you'd have to run the code for every single version of every package you're considering installing, just to find out what it depends on, then run the setup code for that, and so on. All just to get a graph that you can use to compute which versions of which packages you actually need to install. This unfortunate design choice in pip is probably one of the major reasons why package management is such an unreliable mess in Python. Once your dependency graph requires running arbitrary code to generate, you're basically screwed. Even with a static graph it's already a very hard problem!