Currently, Project.toml is not enough to use a package unless all its deps can be found and installed from registries. I think it would be nice to include some information in Project.toml about where to find the deps. In particular, two use cases come to mind: a dependency that lives at a URL, and a dependency that lives at a local path.
name = "Foo"
uuid = "4bdd1c6b-68bb-4dfe-a423-fa0d59eb7384"
version = "0.1.0"
[deps]
Bar = "ce02910b-3083-40c5-958c-8fd1036218f1"
[where]
Bar = "https://www.bar.com"
name = "Foo"
uuid = "4bdd1c6b-68bb-4dfe-a423-fa0d59eb7384"
version = "0.1.0"
[deps]
Bar = "ce02910b-3083-40c5-958c-8fd1036218f1"
[where]
Bar = "modules/Bar"
This would make Project.toml a bit more standalone; I can give that to someone and things will work, but if I want that someone to obtain the exact state I have, then I can also send Manifest.toml.
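To make the proposal concrete, here is a minimal sketch of how a tool could read such a section. It assumes the `[where]` table exactly as written in the examples above; Pkg does not read this table, and the helper names `is_url` and `dependency_locations` are hypothetical.

```julia
using TOML  # standard library since Julia 1.6

# Assumption: `[where]` is only the section name proposed in this thread;
# Pkg itself ignores it.
project = TOML.parse("""
name = "Foo"
uuid = "4bdd1c6b-68bb-4dfe-a423-fa0d59eb7384"
version = "0.1.0"

[deps]
Bar = "ce02910b-3083-40c5-958c-8fd1036218f1"

[where]
Bar = "https://www.bar.com"
""")

# Classify a location string as a remote URL (has a scheme) or a local path.
is_url(location::AbstractString) = occursin(r"^[a-z][a-z0-9+.-]*://"i, location)

# Pair each dependency's UUID with its recorded location, or `nothing` if absent.
function dependency_locations(project::AbstractDict)
    deps = get(project, "deps", Dict{String,Any}())
    locations = get(project, "where", Dict{String,Any}())
    return Dict(name => (uuid = uuid, location = get(locations, name, nothing))
                for (name, uuid) in deps)
end

locs = dependency_locations(project)
```

The UUID stays the identity of the dependency; the location is only a hint about where to look for it.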
Is there a workaround until this is fixed? Specifically, when using an unregistered (but publicly available) package, is the way to make e.g. CI work to call
Pkg.add("url_to_repo")
before running the test?
Alternatively, check in a manifest.
Thanks, but in my specific context that does not help, as my Manifest.toml has a local path as the repo-url (I dev'd that package and am developing it in tandem with the main one).
BTW, what's the recommended test script in CI (eg Travis) now?
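using Pkg
# Add the unregistered dependency by URL.
Pkg.add(PackageSpec(url="https://github.com/tpapp/that_unregistered_package.jl"))
Pkg.activate(".")            # switch to the package's own Project.toml
Pkg.build()
Pkg.test(; coverage=true)    # run the tests with coverage enabled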
works, and is really cool (no more repeating package name :+1:), but I mostly figured it out by trial and error.
Should be no need for activate since the default Travis script puts that in for you automatically now.
AFAICT the default Travis script fails with
ERROR: LoadError: Cannot develop package with the same name or uuid as the project
when the package already has a Project.toml (yes, I know, it is not supposed to have one yet, but it is so convenient).
It does? It shouldn't, see e.g: https://github.com/JuliaData/DataFrames.jl/pull/1452
@fredrikekre In your example, what should happen if Bar already exists as a registered package? Should we use the registered version of Bar, or should we use the unregistered version from the URL (https://www.bar.com in your example) given in the [where] section of the Project.toml file?
In my opinion, the correct action would be to use the unregistered version from the URL specified in the [where] section of the Project.toml file.
I think the [where] part is quite ambiguous, considering that we should be dealing with a registry and not an individual package. The package does not store the explicit version history required for dependency resolution, so whatever version is pulled from the [where] location will put a hard brake on future updates without a proper version history.
So the following configuration is more compatible with the logic of the package manager:
[deps]
Bar = "ce02910b-3083-40c5-958c-8fd1036218f1"
[deps.registries]
BarRegistry="https://bar.package/registry/location"
Hmmm. But what if I only have one unregistered package that I want to use as a dependency? Do I have to create (and maintain!) my own registry just for that one package?
I definitely think that the package manager should support unregistered packages as dependencies, without needing a registry.
If you want to specify a version, you could do so using the same #branch or #commit syntax that you can use when adding an unregistered package from the REPL. For example:
name = "Foo"
uuid = "4bdd1c6b-68bb-4dfe-a423-fa0d59eb7384"
version = "0.1.0"
[deps]
Bar = "ce02910b-3083-40c5-958c-8fd1036218f1"
[where]
Bar = "https://www.bar.com/Bar.jl#release-1.5"
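As a rough sketch of how such a `url#rev` string could be mapped onto the existing API: `PackageSpec(url=..., rev=...)` is a real Pkg constructor, while `split_location` and `spec_for` are hypothetical helpers, and treating the first `#` as the separator is an assumption.

```julia
using Pkg

# Split "url#rev" into its parts; `rev` is `nothing` when no `#` is present.
function split_location(location::AbstractString)
    parts = split(location, '#'; limit = 2)
    url = String(parts[1])
    rev = length(parts) == 2 ? String(parts[2]) : nothing
    return (url = url, rev = rev)
end

# Build the PackageSpec that `Pkg.add` accepts for such a location.
function spec_for(location::AbstractString)
    loc = split_location(location)
    return loc.rev === nothing ? Pkg.PackageSpec(url = loc.url) :
                                 Pkg.PackageSpec(url = loc.url, rev = loc.rev)
end

loc = split_location("https://www.bar.com/Bar.jl#release-1.5")
# Pkg.add(spec_for("https://www.bar.com/Bar.jl#release-1.5"))  # needs network access
```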
@wildart: your suggestion seems to just be a different way of spelling the same information so I don't see how it changes anything. Edit: oh, I see, you want to give the location of a registry, not the package itself. Right, that requires a registry which defeats the purpose of this feature.
The [where] section would merely be informative: "here is the place where this version was last pulled from; it may or may not still be there." Which makes me wonder if it should be called [from] instead of [where]. If Bar with the right UUID appears in a registry then we still know about that location and can look for newer versions there.
> If Bar with the right UUID appears in a registry then we still know about that location and can look for newer versions there.
Basically, this feature introduces an additional, and possibly different, location for the package besides the one specified in the registry. If a new version is downloaded from the new source, which location should be used for downloading previous versions? What if the next version introduces yet another location, so that there are three or more sources from which packages could be downloaded? And so on.
I thought that the registry was responsible for keeping track of this. Hence my proposal to include the relevant registry location.
And if the package is unregistered, then it has a sole location, which does not provide any relevant version history. Then there is the question of dependency management for such a package.
The purpose of this feature is to allow recording a location for packages that are not registered. If a location is given for a package that is registered—or has become registered—that's fine: it's just another place to look for versions of the package. Having multiple upstreams is not a problem.
> And if the package is unregistered, then it has a sole location, which does not provide any relevant version history.
Version history is not necessary since we record tree hashes in the manifest. All one needs is some repository which knows about that tree hash. Unregistered packages can also still have version tags.
> Then there is the question of dependency management for such a package.
This is not a problem since the unregistered package repository contains a project file and possibly a manifest file as well. The project file records the dependencies with UUIDs, which unambiguously determines what those dependencies are. If they are registered, they can be looked up in any registry which includes them. If they are unregistered, finding them may be harder, but recording their location in the project file is the point of the proposed feature.
Alternative direction: what if instead of recording this information in project files we allowed a really lightweight "pre-registration" process where you just record that a UUID lives at a particular repository URL? The entire process would just be submitting a URL—from that URL, the server could clone the project, look up the UUID and then it would know where to find that UUID. That would allow people to just share a link to an unregistered package and as long as all of its dependencies are "pre-registered" other people could install it. We could maybe even make pre-registration part of the PkgDev project generation process... when you generate a UUID for a project, we record that UUID and the upstream URL of the project at the time. The down-side would be that people developing private packages may not want this information to be recorded, but a UUID + a URL doesn't really reveal much information.
That could be a good idea.
Although: what happens if the URL of a "pre-registered" package changes? Who would have the permissions/authorization to change the URL for a pre-registered package?
> a really lightweight "pre-registration" process where you just record that a UUID lives at a particular repository URL?
If technically feasible, I would prefer a design that does not depend on any kind of central registry, however lightweight. E.g. the "merely informative" [where] you proposed above.
A use case of this feature for me would be a set of unregistered packages I am experimenting with, potentially registering them if it pans out. In the meantime, I could just put them in repos on GitHub/GitLab, specify [where] the dependencies are, and then once registration happens the transition would be seamless. OTOH, if I abandoned these packages, there would be no trace of the project in a central registry.
> Version history is not necessary since we record tree hashes in the manifest. All one needs is some repository which knows about that tree hash. Unregistered packages can also still have version tags.
Usually, the package does not have a manifest, and only tags would provide history info. But the larger problem is the dependencies of a particular version, which are only listed in the Project.toml file of that version.
I fear that the location information would undermine the registry approach to handling the dependency problem. People would simply put a URL in a project file and tag the repo, rather than go through the process of registering a version in a registry. And then we would need a completely different way to resolve dependencies in a decentralized way (no or partial registry use).
> Usually, the package does not have a manifest, and only tags would provide history info.
I think that packages should generally check in a manifest file and use that for CI during development.
> But the larger problem is the dependencies of a particular version, which are only listed in the Project.toml file of that version.
I don't see why that's a problem...
> I fear that the location information would undermine the registry approach to handling the dependency problem. People would simply put a URL in a project file and tag the repo, rather than go through the process of registering a version in a registry. And then we would need a completely different way to resolve dependencies in a decentralized way (no or partial registry use).
I'm not all that concerned that decentralized packages become _too easy_ to use. That's a problem that no language ecosystem in the history of the world has ever had. We should be so lucky.
> If technically feasible, I would prefer a design that does not depend on any kind of central registry, however lightweight. E.g. the "merely informative" [where] you proposed above.
I'm inclined to agree.
If no one else has started working on this, I might have time a few weeks from now to be able to work on a PR for this feature.
> I think that packages should generally check in a manifest file and use that for CI during development.
My concern is more about package versioning than about the development cycle, during which many problems can be solved with a little bit of tinkering.
> I'm not all that concerned that decentralized packages become too easy to use.
That is a bold statement, especially when you try to develop one.
Originally, my correction to the proposed extension of the project dependency description was about using the existing registry-based functionality in the package manager to find the proper dependencies. The next missing step is a tool that lets developers easily manage their own registries. I have seen the efforts of @KristofferC & @fredrikekre in this direction, which I find quite exciting. As soon as those features are complete, there will be a suitable model for handling dependencies, centralized or decentralized, public or private: just know the location of a registry with your dependencies.
One could always find such a model overcomplicated, particularly novice users who do not want to bother with managing a registry for the sake of one or two packages. Thus, the original proposal is very attractive: put the direct location of the dependency in the project description. After all, many package managers do it: cargo, golang, npm. Question: how do the proposed changes fit into the established paradigm of registry-based dependency resolution? What needs to be done for this approach to become viable? There are manifests with all the necessary information about locations that can be used in a development cycle. So, if this change relates to a production cycle, then my questions above are quite valid. Just a matter of proper engineering vs. a prototype-happy approach :wink:.
> How do the proposed changes fit into the established paradigm of registry-based dependency resolution?
Maybe I missed something, but I was implicitly assuming that in a repo, tags would mark the versions (+ master is a version on its own), and the corresponding Project.toml would have the other information. I think this is what happens now indirectly via attobot (except for master implicitly being version).
In the meantime, I am wondering what the best workaround is for multiple unregistered packages, eg in a CI environment. If I had a single one, I could just pkg"add http://github...", but for multiple ones it will error as it cannot find the others.
A workaround I found is to take just the sections for the relevant packages from a working Manifest.toml, remove the deps, leave everything else, commit that, and call pkg"resolve" in the before_script: section on Travis. But this is somewhat tedious and I wonder if there is anything easier, e.g. via pkg"add ..." if it could wait with checking until I add everything.
I would prefer to avoid committing a full Manifest.toml, as one of the advantages of CI is testing in an environment that may be different from what I have.
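One possible shape for a CI script, sketched under assumptions: the URLs and the helper `add_unregistered` are hypothetical, and the ordering relies on the expectation that a dependency already present in the environment can be found when the package that uses it is added afterwards. Only `Pkg.add`, `Pkg.PackageSpec(url=...)` and `Pkg.test` are actual Pkg API.

```julia
using Pkg

# Hypothetical repository URLs; list dependencies before the packages that use them.
unregistered_urls = [
    "https://example.com/me/Utils.jl",
    "https://example.com/me/Model.jl",
]

# Add each URL in turn. `add` defaults to `Pkg.add`; it is a keyword argument
# only so the loop can be exercised without network access.
function add_unregistered(urls; add = Pkg.add)
    for url in urls
        add(Pkg.PackageSpec(url = url))
    end
    return nothing
end

# In a CI script (needs network access):
# add_unregistered(unregistered_urls)
# Pkg.test(; coverage = true)
```

This keeps Manifest.toml out of source control, at the cost of repeating the repository URLs in the CI configuration.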
You can check in the manifest and call up.
Tamas said in his previous comment that he does not want to check in the manifest.
I agree with Tamas. I would also like a solution that does not require checking Manifest.toml into source control.
> Maybe I missed something, but I was implicitly assuming that in a repo, tags would mark the versions (+ master is a version on its own).
Actually, commit SHA-1s, not tags, are tied to specific versions. Tags can be modified, so they are not a reliable source of version information (see the registry).
> I am wondering what the best workaround is for multiple unregistered packages, eg in a CI environment.
Create your own registry, add all your dependencies to it, then copy this registry into the CI environment.
> Create your own registry, add all your dependencies to it, then copy this registry into the CI environment.
If he added those packages to a registry, then by definition they would no longer be _unregistered_ packages. Tamas specifically asked about unregistered packages; your comment does not address his question.
Multiple people have said on this thread that we want a solution for unregistered dependencies. We do not want to have to create a registry just so we can use a package as a dependency.
@wildart Every one of your comments on this thread has consisted of you telling us to use a registry, which as @StefanKarpinski points out defeats the purpose of this feature request. Can I respectfully ask that you refrain from repeating this point and instead provide some constructive suggestions on how we can make _unregistered_ dependencies work?
@DilumAluthge I am just pointing out the obvious fact that the problem of matching a package UUID to a package location is already solved through centralized storage: the registry.
And what @StefanKarpinski proposed,
> lightweight "pre-registration" process where you just record that a UUID lives at a particular repository URL?
is just a simplified implementation of the same registry.
There is no simple solution to this problem. As soon as a name or a location stopped serving as the unique identifier of a package, and this role was given to the UUID, we ended up with the problem of matching the former to the latter. Only the developer of the dependency package can provide the necessary information, not a developer who uses such an unregistered package.
Here are some examples:
1. A developer introduces an unregistered dependency in their project and uses UUID1, which was in some way acquired from the unregistered package (assume it exists). So we have a pair <UUID1, pkg.location>. Then the developer of the unregistered package changes its ID to UUID2 (it is unregistered, after all). The package manager would not be able to recognize the unregistered dependency, even after downloading it locally, because of the UUID mismatch.
2. Again, a developer introduces a dependency <UUID, pkg.location1> in their project. Then the developer of the unregistered package changes the package location to pkg.location2 (e.g. github sucks). After that, the first developer gets no updates until the location is fixed in the project description.
3. Similar to scenario 2, but with a package name instead of a location.
These are just obvious examples of misuse of the proposed functionality. I didn't list the issues that arise when one unregistered package starts referencing another unregistered package, and so on.
Any of the above scenarios could happen regardless of credibility and trustworthiness of the unregistered package developer.
Given that the proposed feature is very easy to use, I'm afraid it will become the predominant way to deal with dependencies.
One obvious way to resolve the above issues is to drop the UUID as a dependency identifier and use a location[+name] instead.
name = "Foo"
uuid = "4bdd1c6b-68bb-4dfe-a423-fa0d59eb7384"
version = "0.1.0"
[deps]
Bar = "https://www.bar.com/Bar.jl.git"
Pak = "modules/Pak"
If the dependency contains a project file, then use the listed UUID; otherwise generate a new one. Of course, every time the dependency is downloaded, its UUID should be verified and updated in the manifest.
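The verification step mentioned here could look roughly like the following. This is an illustrative sketch, not Pkg's implementation: `uuid_matches` is a hypothetical helper, and it assumes the downloaded package's Project.toml is available as text.

```julia
using TOML, UUIDs

# Check that a downloaded project file declares the UUID the depending
# project expects. A project file without a `uuid` entry never matches.
function uuid_matches(project_toml::AbstractString, expected::UUID)
    project = TOML.parse(project_toml)
    haskey(project, "uuid") || return false
    return UUID(project["uuid"]) == expected
end

downloaded = """
name = "Bar"
uuid = "ce02910b-3083-40c5-958c-8fd1036218f1"
version = "0.1.0"
"""

ok = uuid_matches(downloaded, UUID("ce02910b-3083-40c5-958c-8fd1036218f1"))
```

A mismatch, as in the first scenario above, would then surface as an explicit error at download time rather than as a silently unrecognized dependency.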
@wildart:
> Only the developer of the dependency package can provide the necessary information, not a developer who uses such an unregistered package.
I am not sure I agree. If I am using an unregistered package, I can just provide the URL of the repo where I found it.
The scenarios you list would result in errors when the package is added, and prompt some response from the user/developer. Same as any other kind of bug. The use case for providing the location explicitly is WIP packages which are not yet registered, so if someone depends on these, they should understand these risks.
> Given that the proposed feature is very easy to use, I'm afraid it will become the predominant way to deal with dependencies.
I am concerned about the opposite: developers registering packages that could use a bit more time to mature, just to get some convenience benefits, which should be possible to provide independently of registration.
For example, I am now working on a set of packages which I don't want to register before they mature, because the API is evolving on a daily basis. As you can observe above, I am jumping through hoops to deal with CI, and the [where] feature would also make it easier for others to try out these packages.
I don't think that assuming that people would abuse this feature is the right mindset to approach this. Julia is very powerful and allows you to shoot yourself in the foot countless ways, yet it works out just fine.
@fredrikekre: getting back to the workaround, I found that locally I have a Manifest.toml with some local directories as I am deving these packages (they are developed concurrently), which would need to be replaced with URIs to the Git repos to the manifest I check in. Do you have a suggestion for dealing with this?
Yes: push your dev'd changes to a branch and add that branch instead.
I see there hasn't been much activity on this for a bit. Is there any specific plan to address this problem? I'm definitely in the same camp as @tpapp and @DilumAluthge; ultimately I would like a simple, low-overhead way to reference multiple in-development packages along a dependency chain, without hacking Manifest.toml.
I'm fine with the notion that such URL references should be avoided in a production environment. In addition to the added overhead, another current issue with creating our own registry (as @wildart suggests), is that it isn't actually officially documented—as far as I know—how one actually does that. That leads me to suspect that whatever solutions do exist (https://discourse.julialang.org/t/creating-a-registry/12094, https://github.com/fredrikekre/Registry) may change. Or is it safe to assume those approaches are stable-ish and can be used to create a private registry?
I attempted to code a solution, but I don't know nearly enough about the Pkg.jl codebase.
Unfortunately, I think that this feature won't come to fruition unless someone who is familiar with the Pkg.jl code has time to work on a pull request.
I have been running into this with unregistered packages depending on unregistered packages (and especially multi-package single git repository).
I had an orthogonal idea: instead of adding a where/url field in Project.toml, one could consult the file that already has the information, i.e. Manifest.toml.
The situation I currently have surprisingly often (maybe I like that style too much):
- Dir1
  - ProjectA
  - ProjectB
    - dev ../ProjectA
- Dir2
  - dev ../Dir1/ProjectB
The second dev will fail because we don't know where ProjectA lives, but the manifest of ProjectB declares where it is getting its dependency on ProjectA from.
So the information is present, but not used. To be clear, I propose to only look up location information in the manifest and only for unregistered packages. I don't think we should add the ability for registered packages to depend on unregistered packages.
My current workaround is that I do:
- Dir1
  - ProjectA
  - ProjectB
    - dev ../ProjectA
- Dir2
  - dev ../Dir1/ProjectA
  - dev ../Dir1/ProjectB
but that gets really annoying really fast and prohibits collaboration, since I can no longer say "Just dev my package at X", and it gets really convoluted with multiple packages in the same repo.
Being able to give an absolute or relative URI of the dependency would be of great help. Having to go through registries is a huge burden for some of the perfectly valid project structures, e.g.:
|- my-project-repo
   |- packages
      |- Commons
         |- ...
      |- SomePackage
         |- Project.toml (depends on Commons)
   |- applications
      |- Application1
         |- Project.toml (depends on SomePackage, Commons, etc.)
There are many more packages and applications in the above project which are not ready to be extracted into separate repos. The applications work with the latest versions of packages, but I still want the ability to package a single application together with its dependencies as a release (e.g. using PackageCompiler).
I think that when a package is dev'd we should record the upstream repo URL for the dev repo in the manifest. Note that this may be different than the registered primary repo for the package, e.g. if you have a fork of the package. The reason I think we should record that URL instead is that this is where dev commits are most likely to be found.
It seems to me that what people actually want is to be able to easily snapshot and share their current state of work. Very close to what we want is to have some command that does add url#branch for each dev'd package. We might want an option to save it to a new, separate manifest file. We probably want to get the url and branch from the current branch of each dev repo: the upstream repo and branch of the current local branch. Maybe snapshot and snapshot -m Snapshot.toml if you want to give a new file name?
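A first step for such a snapshot command would be finding the dev'd entries. The sketch below assumes that dev'd packages appear in the manifest with a `path` key; `developed_packages` is a hypothetical helper, the sample manifest is made up, and the all-zero tree hash is a placeholder. Looking up each repo's upstream URL and branch is left out.

```julia
using TOML

# Return name => path for every manifest entry tracked by path (i.e. dev'd).
# Handles both the original layout (top-level name => [entries]) and the
# `manifest_format = "2.0"` layout (entries under a "deps" table).
function developed_packages(manifest::AbstractDict)
    entries = haskey(manifest, "manifest_format") ?
        get(manifest, "deps", Dict{String,Any}()) : manifest
    found = Dict{String,String}()
    for (name, infos) in entries
        infos isa AbstractVector || continue
        for info in infos
            if info isa AbstractDict && haskey(info, "path")
                found[name] = info["path"]
            end
        end
    end
    return found
end

# Made-up manifest: Bar is dev'd from a local path, Baz is a tracked version.
manifest = TOML.parse("""
[[Bar]]
path = "../Bar"
uuid = "ce02910b-3083-40c5-958c-8fd1036218f1"
version = "0.1.0"

[[Baz]]
git-tree-sha1 = "0000000000000000000000000000000000000000"
uuid = "4bdd1c6b-68bb-4dfe-a423-fa0d59eb7384"
version = "0.5.3"
""")

devd = developed_packages(manifest)
```

Each path found this way is a candidate for replacement by an `add url#branch` entry in the snapshot manifest.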
I think I was somewhat misunderstanding the nature of this issue. The motivation for storing location info in the project file seems like it's a proxy for identity: for unregistered dependencies, a UUID is of little to no use since you need to know where you might be able to find a repo for the given UUID and you've got no registry in which to look it up. To that end having Bar.where = "https://www.bar.com" seems like it might help since at least you know where to look for the Bar with the given UUID. However, in that case why not just give someone a manifest file?
> However, in that case why not just give someone a manifest file?
I think the reason is that the workflow for adding/deving a package is much nicer than downloading + instantiating a new environment and manifest.
I think the Manifest is a fine place to record the identity information, but it only works when I instantiate the environment that it represents and not when I add a package to a different environment and said package depends on further unregistered dependencies.
@StefanKarpinski:
> We might want an option to save it to a new, separate manifest file.
This could actually solve some of my problems. Just to clarify, I am going to describe my use case.
Suppose I am working on a project Main, which depends on Model and Estimation. The first is an "application", the last two are "packages". All three are unregistered, live in separate private repositories on Gitlab, and are unlikely to ever be registered, even though eventually when the paper comes out they will be made public.
My local workflow is pkg> activate /path/to/Main each time I work on this, to which I deved Model and Estimation with local paths. Whenever I work on an isolated issue in Model or Estimation, I pkg> activate and pkg> test it locally.
The CI part of the workflow would ideally test whatever is in the repos. Here I am running into the problem that I can either
1. keep a Manifest.toml with local dev paths, in which case CI does not work, or
2. add the_gitlab_repo_url and commit the Manifest.toml, in which case I will need to update constantly whenever I make a minor change in the dependencies.
I currently work around this with a small script that goes back and forth between these two states.
The cooperative part of the workflow would involve the local part from the perspective of a coauthor, working on the same code. But I am reluctant to suggest this to someone who is not a 100% die-hard committed Julia fan ("You said Julia would make things simple!"). So currently we (ahem) share a Dropbox folder with the git repos.
Somehow "stacking" Manifest.toml layers would allow me to enable-disable one with the dev paths.
I can confirm that this issue is still a key impediment to newcomers (me and my lab) developing their own inter-dependent packages before they're ready to be registered.
I also have the same problem as @guyvdbroeck!
I want to point out to folks who are struggling with this problem that a good alternative solution that you can use now is to create a local registry via LocalRegistry. It is pretty easy to use.