Conan: Package revisions

Created on 23 Dec 2016 · 89 Comments · Source: conan-io/conan

Most package managers have a concept of package revision, i.e. an additional version number that reflects changes in packaging scripts or applied patches while the "main" version number of the packaged software remains the same.

It would be great if Conan added support for revisions too. This would make package updates more transparent ("updated from vX.Y.Z-r1 to vX.Y.Z-r2"). There could also be a policy that a "stable" channel can never change its conanfile and binaries without bumping the revision, to prevent accidental changes in packages used in CI with manifest verification.

It would also be great if it were possible to keep binary packages for previous revisions, so that a CI system with manifest checking does not break when a new revision is uploaded without new reference manifests being committed.

It was previously briefly discussed at https://github.com/conan-io/conan/issues/480#issuecomment-247545547

low high feature

Most helpful comment

Hi @ringgelerch,

There is a huge WIP #3055 and we have been internally discussing all the use cases of this feature. We know it is an important feature in our roadmap but cannot commit to a release date yet.

Thanks for your patience and stay tuned.

All 89 comments

I think I agree with the goal of this issue, but please let me ask one question: Would you like the version to contain the revision? Like what you said X.Y.Z-r1? So it has to be referenced that way? I guess no, but just in case

Ensuring the stable channel cannot be overwritten might be opt-in configurable, we don't want to break existing workflows. We'll try to ask a few more users for feedback, while this feature could be useful, it is very important not to break anything badly.

Would you like the version to contain the revision? Like what you said X.Y.Z-r1?

In dependencies list it should be possible to use either X.Y.Z to get latest revision, or X.Y.Z-r1 to get fixed revision. Feel free to omit r letter btw.

Ensuring the stable channel cannot be overwritten might be opt-in configurable

Yes, I think it would be the most convenient option, but I was afraid it would complicate server code and web UI. You can make it configurable per server instance also. For the main server, I think it would be reasonable to apply this policy to all "stable" channels, but it's up to you to decide.

In dependencies list it should be possible to use either X.Y.Z to get latest revision, or X.Y.Z-r1 to get fixed revision. Feel free to omit r letter btw.

I think this should be doable with the new version ranges. We might have to extend the notation of the ranges, but could be a reasonable approach, so basically:

Pkg/[2.3.4]@user/channel

Would get latest revision, and using Pkg/[2.3.4-r4]@user/channel or just Pkg/2.3.4-r4@user/channel would get the exact r4 revision.
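The two lookups discussed here (bare version means latest revision, suffixed version means exact revision) can be sketched in plain Python. This is only an illustration: the `-rN` suffix is the notation floated in this thread, not Conan syntax, and `split_revision` and `resolve` are hypothetical helper names.

```python
import re

# Hypothetical "-rN" revision suffix, as floated in this thread (not Conan syntax).
_REV = re.compile(r"^(?P<base>.+?)(?:-r(?P<rev>\d+))?$")


def split_revision(version):
    """Split '2.3.4-r4' into ('2.3.4', 4); a bare '2.3.4' counts as revision 0."""
    m = _REV.match(version)
    return m.group("base"), int(m.group("rev") or 0)


def resolve(requested, available):
    """Return the pinned version if a revision is given, else the latest revision."""
    base, _ = split_revision(requested)
    if requested != base:  # a revision was pinned explicitly
        return requested if requested in available else None
    candidates = [v for v in available if split_revision(v)[0] == base]
    return max(candidates, key=lambda v: split_revision(v)[1], default=None)


available = ["2.3.4", "2.3.4-r1", "2.3.4-r4", "2.3.5-r1"]
print(resolve("2.3.4", available))     # 2.3.4-r4
print(resolve("2.3.4-r1", available))  # 2.3.4-r1
```

Note that the "latest" branch needs the full list of available versions, which is the part that turns out to be expensive later in this thread.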

But Pkg/2.3.4@user/channel needs to work for backward compatibility

What do you mean that Pkg/2.3.4@user/channel needs to work for backwards compatibility? As I understand backward compatibility:

  • Yes, Pkg/2.3.4@user/channel will still work as already defined; that will not (and cannot) change.
  • 2.3.4 always needs to resolve to the same original (revision-less) package that it was pointing to.
  • If the user (the consumer) wants to use the new behavior, they might be able to do so with Pkg/[2.3.4]@user/channel or some similar syntax.

Is that what you meant? Thanks!

I mean that Pkg/2.3.4@user/channel should resolve to the latest available revision; this way no existing package will break in case we enforce a revision increment policy on stable branches.

I think that yes, it can break. Even if you force the revision increment on stable branches, automatically changing the package that users depend on, without them doing it explicitly, is a breaking change. We don't do it even for overwritten packages: you have to explicitly use --update if you want your locally cached package to be updated. Users that depend on the first (revision-less) version of a package would have to keep their behavior, depending on that exact version.

Even if the package creator starts to publish new revisions, moving consumers to those revisions without them noticing doesn't sound like the expected approach.

Please also note that enforcing revision increments in stable channels in conan.io might be difficult in the short term, or at least very controversial. We implemented the package overwriting feature because it was a highly requested feature, and many users are using it. Enforcing that (user-configurable) on conan_servers can surely be done easily, but conan.io is a different story. We have tried from the beginning to be as unopinionated as possible, letting users (and, very importantly, package creators) do almost whatever they want to do. We are not changing this unless there is a very broad consensus that this should be done.

I don't currently have anything to add to the design discussion - but this functionality would be useful for my group.

The same over here. It would be super nice to have this feature. In our project, some of the libraries have a dependency tree with a depth of 3 or 4 levels ... It is a bit annoying to have to touch all the dependencies to update the requirements. And most of the time we do not update the libraries, but only make small revisions to the recipes.

We will be reviewing the model for developing packages in next 0.23, in https://github.com/conan-io/conan/issues/1171, but also, the big picture of how packages are developed and evolve will be reviewed, so we will take this point into account.

I still don't know how to handle addressable content, compatible package binary hashes, and updates to the latest revision all at the same time. So, to make sure, the problem we are trying to solve is:

  • Modifications to the package recipes, without modifications to the source code. (if there are source code modifications, I think we all agree that it should be a different package version), without changing the package version Pkg/version@user/channel
  • Be able to store package recipes and binaries for previous revisions of a package recipe
  • Be able to address and install any specific revision of a package recipe/binary
  • Be able to install latest revision of a package, automatically without specifying anything.

Note that the last requirement might imply doing server calls, which can be slow, even for installed packages. This doesn't scale for large projects (if you have hundreds of packages, a conan install, even with everything already installed, would go from a few seconds to more than one minute).

Overall it seems a very challenging thing, but let's analyze it a bit further.

I am starting to review this issue.

I would need some feedback/help. I am looking for some references of other package managers using such concept of revisions, but so far, didn't find them. Do you have some pointers to the revision concept in other package managers?

I think that a revision is just a versioning thing. So if we have a Pkg/0.1@user/channel, creating new "revisions" of the package, if we want to make them addressable, they would be something like:

Pkg/0.1#1@user/channel
Pkg/0.1#2@user/channel
...
Pkg/0.1#N@user/channel
I am using # as an indicator of the revision, but could be any other thing. In my first analysis, I don't see the revision as something "internal", that could be hidden from the package addressable reference.
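To make the proposed notation concrete, here is a minimal parsing sketch. The grammar `name/version[#revision]@user/channel` is only the idea from this comment, not something Conan implements, and `Ref` and `parse_ref` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Ref(NamedTuple):
    name: str
    version: str
    revision: Optional[int]  # None means "no revision given"
    user: str
    channel: str


# Hypothetical grammar: name/version[#revision]@user/channel
_REF = re.compile(r"^([^/@#]+)/([^/@#]+)(?:#(\d+))?@([^/@#]+)/([^/@#]+)$")


def parse_ref(text):
    """Parse a reference string; raise ValueError if it does not fit the grammar."""
    m = _REF.match(text)
    if m is None:
        raise ValueError("not a valid reference: %r" % text)
    name, version, rev, user, channel = m.groups()
    return Ref(name, version, int(rev) if rev is not None else None, user, channel)


print(parse_ref("Pkg/0.1#3@user/channel"))
print(parse_ref("Pkg/0.1@user/channel").revision)  # None
```

Because the revision is an optional field, existing references without `#N` still parse, which keeps the notation backward compatible at the syntax level.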

If the revision number is included in the version, then, using revisions could be just a matter of tuning the version ranges expressions, and consuming it could be just:

Pkg/[0.1]@user/channel #get latest revision
Pkg/0.1#3@user/channel #get exact revision

Here are some references to this concept in other package management systems:

  • Debian Linux
    debian_revision version component
  • Gentoo Linux
    Ebuild revision (documented here and here). This is very similar to the Conan situation, as Gentoo Linux explicitly versions the build recipes (ebuilds) and not the binary packages built from them, or upstream software sources.
  • Python/pip
    Local version identifiers
  • RPM-based Linux
    Release tag (documented here and Fedora-specific usage here)
  • FreeBSD
    PORTREVISION (documented here)

FPM multi-packager calls this concept "iteration".

Thanks @himikof for such useful links. I have been reviewing them (to be honest, I had looked mostly at language package managers for developers, which is the context of conan, and not that much at system package managers), and I think there are good insights. First let's summarize the basics:

  • We want to create package revisions, in a way that they are addressable by the consumers, so consumers will be able to specify at any moment any specific revision and link against it. Such package revisions can be published to remotes and shared.
  • Packages can be simultaneously installed and coexist, so two different revisions can be installed in the same machine, and the users can swap easily with no further installations in their projects.
  • There should be some way for consumers to be updating their dependencies as new revisions are published, to the latest revision, without changing their declared requirements.

Regarding your references:

  • FPM is indeed a great project. I didn't know about their "iteration" concept, but couldn't find details about it in the docs.
  • Python/pip. From the PEP: "Local version identifiers SHOULD NOT be used when publishing upstream projects to a public index server". So they are mostly a consumer thing. Furthermore, pip doesn't allow different versions to coexist in the same installation, as they are overwritten. Sure, there are virtual environments, which allow storing different dependency graphs. Conan also supports the same concept via CONAN_USER_HOME to set the local package cache, but from our experience, conan must support installation of different versions in the same local cache, and share the cache among different projects.
  • System package managers, apt, rpm, freebsd, gentoo ebuild revisions. All of them generate a specific package reference that includes the revision number, like pkgName-version-revision. So it is not something that can be fully transparent, revision numbers are included in the package references or filenames.
  • Such managers don't smoothly handle the simultaneous installation of different versions; it is sometimes possible, but only by applying workarounds. They use their dependency resolution to get the latest revision and update accordingly if necessary.

So, in my opinion, the above suggested approach is quite aligned with the findings:

  • I proposed a versioning scheme, something like Pkg/0.1#1@user/channel. The revision number can come from a revision=1 field, and conan could append it. But the package reference will include it, it cannot be "hidden" or transparent.
  • Regarding dependency resolution, the explicit case will be directly handled.
  • The implicit use of latest revision. In my opinion, this is a complicated trade-off. If we enable automatic range-versioning resolution for all requires of the form Pkg/0.1@user/channel, to get the latest revision, it could make things slower. Range-version resolution requires more API calls to servers, which are slow. There are users with up to hundreds of dependencies, and enabling this by default could highly increase install times, without any need. Very little we could optimize, the bottleneck is the network calls.

This is why I was suggesting that this feature be opt-in: users wanting to use package revisions could use some form of requires = Pkg/[0.1]@user/channel, and version-range resolution would handle revision numbers accordingly.

Feedback very welcome.

I propose the following format for the package reference: pkg/0.1.3@user/channel[0.24] or pkg/0.1.3@user/channel-0.24, where 0.24 is the package revision. The rationale for this format (or any similar variation) is that the part before @ remains the name and version of the packaged item and the part after the @ remains "package metadata". This is easy to understand and document, follows the present setup, keeps the most important part at the beginning of the package reference, and can easily be made opt-in.

@sztomi, yes, I like the idea of associating the package revision with the channel, and the "package metadata" idea. I have to check and think about how this would be processed regarding revision resolution; it should be similar to the version ranges logic, but applied to the channel. I still think that it should be opt-in. I am not sure you want to check for the latest revision for all dependencies, regardless of whether they have revisions or not, and be much slower due to extra network calls.

Still think that it should be opt-in, I am not sure you want to check for latest revision for all dependencies

Agreed. If it's not specified, the latest revision should be used automatically.

Agreed. If it's not specified, the latest revision should be used automatically.

But that is the issue :(
If we want to use the latest revision automatically, you need to opt-in using version ranges:

  • Pkg/1.1@user/channel will not use revisions. Will just look for the exact version
  • Pkg/[1.1]@user/channel will use latest revision.

If we make the former to automatically use revisions, we are forced to do a search for every single package in the dependency graph, which is an expensive operation, and will be much slower than the current install approach.

But Pkg/1.1@user/channel which does not use latest revision is useless, in this case it should be disallowed for packages that use revisions. And if we want all packages to be revision-enabled, for the sake of sanity, this means that everyone will have to use explicit revision numbers. That's fine with me, but I'm not sure everyone will be happy about it

I'm not sure I understand why it would make installation slow. Earlier you wrote:

Range-version resolution requires more API calls to servers, which are slow.

I'm probably missing something, but the way I see it, it should be possible with a single call, shouldn't it? And this call should be the one that gets a package identified by a package reference.

(1) "Give me the package that matches this reference: Pkg/1.1@user/channel" or

(2) "Give me the package that matches this reference: Pkg/1.1@user/channel-2.3".

In the (1) case, the server can look at the repository and see if there are multiple revisions. Since there was no revision specified in the package reference, it returns the latest revision of the package. In case (2) it can select revision 2.3 and return that.
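The single-call idea described here could look roughly like the sketch below, with the "latest" decision made on the server side. The in-memory `STORE`, the `get_package` function and the dotted revision strings are all assumptions for illustration; no such endpoint exists in the discussed servers.

```python
# Hypothetical in-memory "server": reference -> {revision: payload}
STORE = {
    "Pkg/1.1@user/channel": {"2.3": "recipe at 2.3", "2.10": "recipe at 2.10"},
}


def rev_key(revision):
    """Order dotted revisions numerically, so that 2.10 sorts after 2.3."""
    return tuple(int(part) for part in revision.split("."))


def get_package(reference, revision=None):
    """One call: an explicit revision is returned as is, otherwise the latest one."""
    revisions = STORE[reference]
    if revision is None:
        revision = max(revisions, key=rev_key)
    return revision, revisions[revision]


print(get_package("Pkg/1.1@user/channel"))         # ('2.10', 'recipe at 2.10')
print(get_package("Pkg/1.1@user/channel", "2.3"))  # ('2.3', 'recipe at 2.3')
```

The client makes one request either way; the cost moves to the server, which has to know all revisions of a reference in order to pick the latest.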

For version ranges, you need to search the server, for all recipes matching the pattern, then do the actual download. That is much, much slower than just doing the actual, direct download, not only because there are more network calls, but also it requires a search in the server, that even if optimized, is slow. And if it has to be done for dependency graphs with up to hundreds of packages, by default, some people are going to complain because of the performance.

Does conan perform a separate request to find each package? Maybe server could provide a request which allows to find multiple packages at once and also resolve dependencies? However this task may be non-trivial in case when dependencies are located on several remotes.
Another possible solution is to retrieve full package index from the remotes (like many Linux distros do) and then decide what to download locally.

In the normal case it doesn't perform a separate request to find each package, right now, it directly fetches the package. When version-ranges are involved, yes, a request is issued per Pkg/*@user/channel to find all available versions for that pattern in the remote. Such call will be done to each remote in the client configured remotes, in order.
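The difference between a direct fetch and a range lookup can be sketched with a fake remote that counts calls. `FakeRemote`, `install_exact` and `install_range` are hypothetical stand-ins rather than Conan internals, and picking the "best" match with `max` (plain string comparison) is a simplification.

```python
import fnmatch


class FakeRemote:
    """Stand-in for a remote; counts how many calls of each kind it receives."""

    def __init__(self, refs):
        self.refs = set(refs)
        self.fetches = 0
        self.searches = 0

    def fetch(self, ref):
        self.fetches += 1
        return ref in self.refs

    def search(self, pattern):
        self.searches += 1
        return [r for r in self.refs if fnmatch.fnmatchcase(r, pattern)]


def install_exact(ref, remotes):
    """Direct retrieval: ask remotes in order, stop at the first that has it."""
    for remote in remotes:
        if remote.fetch(ref):
            return ref
    return None


def install_range(pattern, remotes, pick=max):
    """Range retrieval: search every remote first, then fetch the chosen match."""
    found = []
    for remote in remotes:
        found.extend(remote.search(pattern))
    return install_exact(pick(found), remotes) if found else None


remotes = [
    FakeRemote(["Pkg/1.0@user/channel"]),
    FakeRemote(["Pkg/1.0@user/channel", "Pkg/1.1@user/channel"]),
]
print(install_range("Pkg/*@user/channel", remotes))  # Pkg/1.1@user/channel
print([(r.searches, r.fetches) for r in remotes])    # [(1, 1), (1, 1)]
```

In this toy setup the range lookup adds one search per remote on top of the fetches, and on a real server each search is a pattern query rather than a key lookup.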

Retrieving the full package index would be even more costly, especially on big servers (like conan.io or bintray), and it would be a caching problem; you know, "there are two hard problems in CS: naming things and invalidating caches". This would render the revisions unusable: they are intended to be fast and agile to update, so the cached full index would quickly become obsolete, and you cannot retrieve it again and again from all the remotes, as that would kill performance for users and also saturate the servers.

We are working on it, currently considering different approaches:

  • Making revisions opt-in. Only users explicitly configuring (conan.conf) to use revisions, will use them.
  • Defining aliases/symlinks, or even creating two copies of the package, one would be always the latest one. Storage is not a problem, and if it is, the servers might provide de-duplication. And the fetching will be direct.

They both have pros & cons, so we are working on it, trying to move forward, but this is a much, much harder problem than it seems at first sight, and there are different trade-offs to take into account. Thanks very much for your feedback!

Forgot to mention: when I wrote about retrieving the full package index I meant something like git remote update, and not apt update. In other words, this could be done incrementally. E.g. the server could store a simplified changelog (simplified means without unnecessary log records like "package A was added", "package A was removed"), and the client could store a "working state" plus its revision. Then, to update the index, the client would only need to download a small part of the changelog, apply it to the current state and update the revision. Even for huge servers it's hard to imagine that such a request would be very slow.

Also what do you think about a single request to resolve multiple dependencies? In most "good cases" this approach should require 1-2 requests, while in the worst case require almost the same amount of requests as conan does now, but "worst cases" may appear only when a user is doing something really strange and wrong.

The incremental changelog is not simple: the server might need to maintain an incremental changelog for each different client; otherwise you end up with a huge changelog, and it loses its utility. And having an incremental changelog for each client is almost impossible, taking into account that most of the transactions with clients are totally anonymous (and beyond that, they do not depend on the logged-in user but on the running client instance, which also differs per CONAN_USER_HOME the machine might be running, and there can be very many of those on CI servers).

It is not so simple to retrieve multiple dependencies at once. The dependency graph has to be evaluated incrementally, and there are conditional requirements, so the process is to retrieve one dependency, check and evaluate its requirements, get them, and so on. It might improve, for sure, when some of them can be retrieved in parallel, but that depends on the graph depth and breadth. However, even if some network calls can be saved, each package still requires a search query on the server, which is still very slow. Even if we try to use fast approaches, it is still a pattern search, orders of magnitude slower than direct fetching. It is not possible to implement this as the default, which is why we are considering opt-in or other approaches.
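The incremental evaluation described above can be pictured as a breadth-first walk in which a package's requirements only become known after its recipe has been fetched. This is a simplified sketch: `resolve_graph` and the `RECIPES` table are hypothetical, and conditional requirements are ignored.

```python
def resolve_graph(root, fetch_requires):
    """Breadth-first walk; the packages of one 'wave' could be fetched in parallel."""
    seen = {root}
    wave = [root]
    waves = 0
    while wave:
        waves += 1
        next_wave = []
        for ref in wave:
            # Requirements are only known once this recipe has been retrieved.
            for dep in fetch_requires(ref):
                if dep not in seen:
                    seen.add(dep)
                    next_wave.append(dep)
        wave = next_wave
    return seen, waves


RECIPES = {"App": ["LibA", "LibB"], "LibA": ["LibC"], "LibB": ["LibC"], "LibC": []}
seen, waves = resolve_graph("App", RECIPES.__getitem__)
print(sorted(seen), waves)  # ['App', 'LibA', 'LibB', 'LibC'] 3
```

Even with perfect parallelism inside each wave, the number of sequential round trips in this model is bounded below by the graph depth, so a single request for everything is not available.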

Hmm, I didn't get why the changelog should be stored for each client. This changelog is the same for every client and every client knows its revision, so it knows which transactions are missing locally.
About the changelog size, the most important word is simplified (in my company we call it "compressed"). We use this approach for syncing data between servers and so far it works :) The main idea of such a changelog is "do not store records which do not have an impact on the final result". So in the end such a log must contain exactly as many records as there are binary packages in the repository. After the first update check, all subsequent checks should be much faster because the amount of data will be reduced significantly.
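A rough sketch of this "compressed changelog" idea follows, assuming the server keeps only the last change per package together with a global revision counter. `IndexServer` and `IndexClient` are hypothetical names, and keeping a tombstone for removed packages is an assumption of this sketch (the comment above does not say how removals would be stored).

```python
class IndexServer:
    """Hypothetical server keeping one compacted log: the last change per package."""

    def __init__(self):
        self.revision = 0
        self.last_change = {}  # package -> (revision of last change, still present?)

    def record(self, package, present=True):
        self.revision += 1
        self.last_change[package] = (self.revision, present)

    def changes_since(self, client_revision):
        """Return the current revision and only the entries newer than the client's."""
        delta = {pkg: present
                 for pkg, (rev, present) in self.last_change.items()
                 if rev > client_revision}
        return self.revision, delta


class IndexClient:
    def __init__(self):
        self.revision = 0
        self.packages = set()

    def update(self, server):
        """Apply the delta to the local working state; return how many entries came."""
        self.revision, delta = server.changes_since(self.revision)
        for package, present in delta.items():
            if present:
                self.packages.add(package)
            else:
                self.packages.discard(package)
        return len(delta)


server, client = IndexServer(), IndexClient()
server.record("A")
server.record("B")
print(client.update(server), sorted(client.packages))  # 2 ['A', 'B']
server.record("A", present=False)
server.record("C")
print(client.update(server), sorted(client.packages))  # 2 ['B', 'C']
```

A fresh client (for example a clean CI environment) starts at revision 0 and therefore receives every entry, which matches the objection about CI machines raised in the reply.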

Ok, yes, an indexed changelog in the server, where clients can send their latest index and retrieve just the new part, is possible. However, it is still not simple, and it is costly: it requires a DB on the server, with linear-time retrieval of latest indexes. And some servers like conan.io are already large, serving up to a hundred thousand packages per month, and growing. And there are many, many, many queries from CI machines (maybe even more than from developers' machines) that just fire up a new clean environment all the time, so those will require the full index every time too. I am also not sure how to handle package removals in the changelog. Also, doing server-side changes is very difficult; take into account that conan now has support in conan_server, conan.io, Artifactory and Bintray. It seems like overkill: adding complexity on the servers should be avoided at all costs, migrations are hard, and bugs take a long time to be fixed and distributed. It doesn't scale at all, and we have to consider the community of users as a whole. All the time that we would spend developing and maintaining those things is time we cannot use to develop other, more important features or to provide support.

Still, to make this issue clear. Having package-revisions is something that can be done today, just by two steps:

  • Add a revision number to the package version
  • Use version ranges to get the latest revision of a package.

We are considering ways to improve this, but they should keep a reasonable cost/value ratio. Let's keep working on it. Thanks very much for your help and feedback :)

Sure. In the end, when a team feels that things are getting slow, it could decide to use only its private server, which would contain only the needed packages.

BTW one important concept mentioned in the beginning of the thread wasn't discussed much. Is it planned to implement protection against uploaded package overwriting?

Sure. In the end, when a team feels that things are getting slow, it could decide to use only its private server, which would contain only the needed packages.

Yes, but that is precisely the main issue in #1373. That user is already using a private server, but is hosting so many packages that it becomes painfully slow.

Regarding the non overwrite, there are related issues in #679 and #1381.

Ok, I am trying to move this issue forward from the very roots of the problem:

  • Version ranges are painfully slow, because they require a search in the server, which is orders of magnitude slower than direct retrieval
  • We want to put the power in the package creators' hands: they will be the ones saying which is the good version to be used (typically the latest revision)

So I am considering the concept illustrated in my branch: https://github.com/conan-io/conan/compare/develop...memsharded:feature/conan_links?expand=1

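  • We could define "proxy" or "link" package recipes. They will look like this:
from conans import ConanFile

class TestConan(ConanFile):
    name = "Hello"
    version = "0.X"
    # conan_link is the attribute proposed here, not an existing Conan feature
    conan_link = "Hello/0.1@lasote/channel"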

Maybe don't even need the version and name.

  • That "proxy" is depended on exactly like Hello/0.X@...., or in any other way, like Hello/0.1@... for proxying Hello/0.1.1@... packages
  • The proxy is retrieved when the dependency graph is computed, but totally resolved and replaced. So in the dependency graph, only the real exact final version is used. All the hashes and package-IDs are computed with the final exact version, not the proxy one
  • This is a powerful configurable mechanism, as package creators might easily revert back to an older revision without removing the newer ones. It might also allow proxying different user/channels too.
  • It will produce one extra download of the recipe, but I think that might be faster than the mentioned search.
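The "resolved and replaced" behaviour in the list above can be sketched as follows. The `RECIPES` table, the `"link"` key and both function names are hypothetical; the sketch only shows that the final graph contains real references, never the proxy.

```python
# Hypothetical recipe index: an entry is either a link to another reference
# or a real recipe with its own requirements.
RECIPES = {
    "Hello/0.X@lasote/channel": {"link": "Hello/0.1@lasote/channel"},
    "Hello/0.1@lasote/channel": {"requires": []},
    "App/1.0@me/testing": {"requires": ["Hello/0.X@lasote/channel"]},
}


def resolve_link(ref, recipes, max_hops=10):
    """Follow links until a real recipe is reached (one extra lookup per hop)."""
    hops = 0
    while "link" in recipes[ref]:
        ref = recipes[ref]["link"]
        hops += 1
        if hops > max_hops:
            raise ValueError("link cycle or chain too long")
    return ref


def build_graph(root, recipes):
    """Dependency graph keyed by final references only; proxies never appear."""
    graph = {}
    pending = [resolve_link(root, recipes)]
    while pending:
        ref = pending.pop()
        if ref in graph:
            continue
        deps = [resolve_link(dep, recipes) for dep in recipes[ref]["requires"]]
        graph[ref] = deps
        pending.extend(deps)
    return graph


print(build_graph("App/1.0@me/testing", RECIPES))
```

Since each hop is a direct lookup by exact reference, no pattern search is involved, which is the performance argument made for this approach.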

This kind of package can be manually edited and uploaded, but further automation could be done:

  • a conan link command might generate and export such package on the fly
  • conan upload Hello* will upload the package and the proxy package.
  • Maybe some revision field in the conanfile could be used to automate the creation of the proxy package while export-ing such a recipe.

Please feedback. cc/ @claasd @lasote

package creators might easily revert back to an older revision without removing the newer ones

This should be prohibited on the public server, otherwise users who downloaded the higher revision (but don't specify an exact revision number) won't be upgraded to the good revision. Instead, the previous revision should be published with a higher revision number.

This should be prohibited on the public server, otherwise users who downloaded the higher revision (but don't specify an exact revision number) won't be upgraded to the good revision. Instead, the previous revision should be published with a higher revision number.

I still don't get the desire to restrict users, especially package creators. Users will retrieve, and be updated to, the version that the package creator wants them to be updated to. Let's say that you have published Pkg/1.2.3, which is proxied by Pkg/1.2. Then you upload Pkg/1.2.4, and update Pkg/1.2 to point to it, but you definitely screw up 1.2.4 with a serious security bug that you don't know how to fix quickly. Would you be forced to create a Pkg/1.2.5, which will be identical to Pkg/1.2.3? That doesn't seem very logical; it sounds confusing and a waste of resources. Instead you can just update Pkg/1.2 to point to Pkg/1.2.3 again, which is an almost instantaneous fix.

That's not a restriction, just common sense. Versions and revisions need to be monotonic in time, otherwise they don't make much more sense than e.g. hash sums of binaries.

Lets say that you have published Pkg/1.2.3, which is proxied by Pkg/1.2. Then you upload Pkg/1.2.4, and update Pkg/1.2 to point to it, but you definitely screw 1.2.4 with a serious security bug that you don't know how to fix quickly. Would you be forced to create a Pkg/1.2.5, which will be identical to Pkg/1.2.3?

Wait, we are talking about Pkg/1.2-r3, Pkg/1.2-r4, and Pkg/1.2-r5, aren't we? Yes, the only meaningful decision is to publish r3 as r5 if r4 is broken.

As for software versions, they are a completely separate topic from package revisions.

And yes, if you released software with version "1.2.4" and it has security bug, you definitely need to release 1.2.5 immediately, even if it's otherwise identical to 1.2.3
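The monotonic-revision argument can be illustrated with a small sketch, assuming a client that only moves forward by comparing revision numbers. That client model is an assumption made for the illustration; with the proxy approach discussed earlier, the link would instead be re-resolved on each install. `needs_update` and `republish` are hypothetical names.

```python
def needs_update(installed, latest):
    """A client that only ever moves forward, as monotonic revisions assume."""
    return latest > installed


def republish(revisions, good):
    """Roll back by re-publishing the content of `good` under a new, higher number."""
    new_rev = max(revisions) + 1
    revisions[new_rev] = revisions[good]
    return new_rev


revisions = {3: "good recipe", 4: "broken recipe"}
installed = 4  # this client already picked up the broken r4

# Option 1: point "latest" back at r3. A forward-only client sees nothing newer.
print(needs_update(installed, 3))       # False

# Option 2: publish r3's content again as r5. The same client now upgrades.
r5 = republish(revisions, good=3)
print(r5, needs_update(installed, r5))  # 5 True
```

Which option is preferable depends on how clients decide to update, which is exactly the point the two sides of this exchange disagree on.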

@memsharded +1. Technically sounds good, and I think it can solve the problems with the package revisions, slow version ranges resolving and so on.
@annulen yes, these are good practices, and we should add it to the docs, but only good practices, we can't and we won't control the versions scheme/flow management of the conan-center packages.

@lasote Versioning of the original software is out of our control, but it's not the topic here. Revision numbers are related to the respective Conan recipes only, and they can and should be controlled.

@annulen Without entering into the debate of how hard we should control the user's packages, we don't have the resources to do it, so it's not an open discussion today.

You say that conan-center is "curated" repository, how can it be called so if it doesn't even have monotonic revision numbers?

It doesn't have revision numbers at all, not even monotonic.

It doesn't have them yet, which is the point of this issue

This is a general discussion about package revisions, not centered in the conan-center repository. You can have your own practices in an on-premises conan server or Artifactory. And of course, it's interesting to find a good solution. You can also have package revisions in conan-center, of course.

Hi,
I like the solution, @memsharded. It technically solves our issues (especially if we get a conan link command).

@annulen: We basically already use package revisions, by adding the build number of our CI server to the version number (our packages look like MyPkg/1.0.0-alpha-build.11). To get the latest build, we use version ranges (MyPkg/[~1.0]@user/testing). Thus, we have a lot of different versions of each package on the server (between 70k and 100k in total), and resolving the revision is very slow. This kind of proxy package solves our speed problem. Furthermore, I get more control over which package will be selected.

As for the discussion about prohibiting and good practice, it is my strong belief that you should never enforce those constraints by technical means. Conan itself does not even enforce semantic versioning. We use it, and I encourage everyone to do so, but it may not fit everyone. We also use package references, and I would agree that if package 1.0.0-rc5 broke something, I need to release 1.0.0-rc6. But again, I would never technically enforce it.

Bottom line: I like it! When do I get it? :stuck_out_tongue_winking_eye:

Yeah, I think this is pretty good and would also help with our use cases.

@memsharded In principle I like the proposal but I have one question. You said "Maybe don't even need the version and name." Does it mean that the final consumer conanfile.py would look like this ?

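from conans import ConanFile

class MyApp(ConanFile):
    settings = 'os', 'compiler', 'build_type', 'arch'
    generators = 'cmake'

    def requirements(self):
        # Version-less proxy references: the hypothetical form being asked about
        self.requires('Lib1Proxy@user/stable')
        self.requires('Lib2Proxy@user/stable')
        ...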

If possible I would prefer to have version in the Proxy as an optional field. I think it helps to have a global view of the versions you are using in a project (For example, to know if you are using Qt4 or Qt5). But I do not have a strong preference for that.

In case this approach is approved, what would be the steps to update the conan code of a project to use this approach ? Would we need to do a lot of work ?

@piponazo No, the consumer requirement will look as always, nothing changes there.

I meant the "package-link" (btw, still need a good name to refer to this concept: link, proxy?) could be:

class TestConan(ConanFile):
    conan_link = "Hello/0.1@lasote/channel"

because it could be generated by something like:

$ conan package_link Hello/0.X Hello/0.1@user/channel

So the name and version are already explicit in the command, and can be used to put the package-link in the conan cache.

If possible I would prefer to have version in the Proxy as an optional field.

It doesn't matter. In your dependency graph, you will have the real version you are using. So you can make your App require Qt/latest@team/stable, and if you make a package-link between Qt/latest => Qt/5.0, in your dependency graph you will see "App" => "Qt/5.0"

Conan has merged the conan alias functionality into develop; it will be available in 0.25. This might be a very good core for implementing package revisions.

I have done some performance testing of the conan alias approach, and the cost seemed very reasonable: an increase of around 5% with respect to retrieving packages directly (running against a local Artifactory instance), mainly because most of the cost comes from starting conan. I still have to check the performance for single-instance resolution.

I am moving the "revisions" feature to 0.26 at the moment, or until we get some feedback about the conan alias feature.

I've no new feedback about the conan alias command. So let's wait to 0.27.

I would like to bring attention to a slightly different aspect of package revisions: build reproducibility. Basically, I don't want to use a "latest" version anywhere. If I use it, the results I get depend on the point in time when I did the build. But I want to make sure that if anyone, at any point in time, checks out our source code repository (which contains the full list of references for the packages it uses) at a given revision and builds the solution, they will get exactly the same result, including exactly the same third-party packages (even if the third-party packages used at that point had some bugs).

I believe aliases have nothing to do with this issue. We could add the revision number to references, like this: mylib/1.2.3-r.2@user/channel. And this works fine for most scenarios. The problem is that this brings semver into the domain of pre-release versions. First of all, this is not what we are trying to express here. This is the _release_ version, but it also contains the _package revision_, which is a different beast; we put it here just because we don't have another place to put it.

Besides the wrong intent, I can also point out a particular problem we have with this approach. Imagine I have a recipe for lib A, which wants to use lib B with version 1.2.x. But, according to our approach, versions of lib B are in the form 1.2.3-r1 or 1.2.4-r2. None of these match the 1.2.x mask. I would like to have a mask like 1.2.x-r, so I can get the latest revision (treated as a pre-release version by semver) for any patch version within a given major and minor version. But this mask is not a valid semver range.

So, the questions:

  1. For current version of conan, how do I specify valid version range with the meaning of "1.2.x-r", so that it matches the latest revision for latest patch version?
  2. For future versions of conan, are you considering a "proper" way to specify revisions in a reference, that does not bring version into pre-release domain?
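To illustrate the mismatch described above, here is a minimal Python sketch. It is a simplified model of how semver x-ranges treat pre-release tags, not the node-semver library that Conan uses; the helper names `parse` and `satisfies_x_range` are made up for this example.

```python
import re

# Simplified model of semver x-ranges, for illustration only (not node-semver itself).
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def parse(version):
    """Split 'X.Y.Z[-pre]' into ((X, Y, Z), pre_or_None)."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a semver version: {version!r}")
    major, minor, patch, pre = match.groups()
    return (int(major), int(minor), int(patch)), pre


def satisfies_x_range(version, major, minor):
    """True if version matches 'major.minor.x', assuming the usual semver
    rule that pre-release versions are excluded from plain ranges."""
    (v_major, v_minor, _), pre = parse(version)
    if pre is not None:
        return False  # '1.2.3-r1' counts as a pre-release of 1.2.3, so it is skipped
    return (v_major, v_minor) == (major, minor)


available = ["1.2.3-r1", "1.2.4-r2", "1.2.4"]
matching = [v for v in available if satisfies_x_range(v, 1, 2)]
print(matching)  # only the version without a '-rN' suffix survives
```

Under this model, every version that carries a revision suffix is invisible to the 1.2.x range, which is exactly the problem raised in the comment.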

@memsharded any ideas on the questions?..

Besides the wrong intent, I can also point out a particular problem we have with this approach. Imagine I have a recipe for lib A, which wants to use lib B with version 1.2.x. But, according to our approach, versions of lib B are in the form 1.2.3-r1 or 1.2.4-r2. None of these match the 1.2.x mask.

We have implemented revisions very similarly, only without the r before the revision number. We have a script that updates revisions in dependent packages recursively (because updating a revision changes the contents of the packages that depend on it, we bump the revision all the way to the leaf packages). This works very well in practice and avoids the need for version ranges. As a result, we completely avoid the issue you raise here: we always use exact versions-revisions everywhere. We'll probably never use ranges or aliases precisely because we want build reproducibility. I guess we could get away with collecting and storing the resolved version numbers at the time of build, but that detaches the information from version control.
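The recursive bump described above could look roughly like the following Python sketch. It is not the actual script from this comment: the data layout (a revision map plus a requires map) and the package names are assumptions made for illustration.

```python
from collections import deque


def bump_revisions(revisions, requires, changed):
    """Return a new revision map after bumping `changed` and everything
    that depends on it, directly or transitively.

    revisions: {package: int}
    requires:  {package: [packages it requires]}
    """
    # Invert the requires map: package -> packages that consume it.
    consumers = {name: [] for name in revisions}
    for pkg, deps in requires.items():
        for dep in deps:
            consumers[dep].append(pkg)

    bumped = dict(revisions)
    seen = {changed}
    queue = deque([changed])
    while queue:
        pkg = queue.popleft()
        bumped[pkg] += 1  # each affected package is bumped exactly once
        for consumer in consumers[pkg]:
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return bumped


# Hypothetical packages: 'app' requires 'png' and 'zlib', 'png' requires 'zlib'.
revisions = {"zlib": 1, "png": 3, "app": 7, "other": 2}
requires = {"png": ["zlib"], "app": ["png", "zlib"], "other": []}
print(bump_revisions(revisions, requires, "zlib"))
```

Packages that do not depend on the changed one ('other' here) keep their revision, so only the affected part of the graph is touched.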

There is something in @sztomi's approach that I like: being very explicit, and using automation to support being explicit. Full reproducibility, always.

Regarding your questions above, I think 1.2.3-something is not the way to go. As specified by semver, those are pre-releases, which are used in quite the opposite way to revisions. Pre-releases are only used explicitly, while revisions are intended to be picked up by default.

Leaving the revision out of the package reference is not possible if we want to satisfy the requirement of being able to explicitly opt in to a specific revision. Using version ranges by default, so that Pkg/1.2.3@user/testing automatically resolves to the latest revision, is not feasible: it doesn't scale, and the performance penalty is not acceptable for most users.

So, if anything, maybe the way to go would be to force semver, and to use revisions as the fourth version element X.Y.Z.R and consume it with something like Pkg/[X.Y.Z]@user/channel. I don't know to which extent this is possible without having to rebuild the node-semver library or without breaking existing behavior.

maybe the way to go would be to force semver

I disagree for a variety of reasons, but it all boils down to real-world use-cases.

In my experience, semver is one of those things that is designed and works well in theory, but in practice it's a false sense of security. Some projects that claim to use semver fail to deliver on the stability promises. Other projects just use a version that looks like semver, but isn't. You might not be able to add the revision as the fourth version element because a particular package might already have a fourth element in its version. It's all a crazy mess, and establishing a standard at the package maintainer level causes more harm than good, imo. Libraries might also decide to change their versioning scheme from one version to another.

In our implementation we actually have a separate attribute for the revision number, but it is ultimately appended to the version number for conan (by our conanfile base class).

@sztomi thanks for sharing your workflow! We wanted to use version ranges in recipes, but always specify concrete versions in conanfile. This way we won't need to change recipes when we introduce a new revision or version, but will still have perfect reproducibility.

@memsharded did you think of expanding reference to something like Pkg/1.2.3/4@user/channel, or Pkg/1.2.3#4@user/channel, where only the part 1.2.3 is considered to be semver, while 4 is treated as revision?

We have done some preliminary analysis of a server-side solution of this feature, which could be interesting. However, this requires development in all servers simultaneously. Moving this to 1.3, and for that release hopefully we have a full proposal of how it will be in the servers, but unfortunately won't be possible to have a full implementation.

Glad to see this is being discussed and maybe even worked on. This has been plaguing me ever since I first started using Conan and it only just now occurred to me to search for an open issue. For reference, I like the Pkg/1.2.3#1@foo/bar syntax the most, with another / instead of the # being the only other option I see as reasonable. I think manually specifying the package revision should be optional, but that all packages should have it. Recipes in a cache/server generated prior to this feature will be treated as package revision of 0 and new recipes that don't provide a revision = ... attribute will default to 1. For the sake of speed, specifying requires = 'pkg/1.2.3@foo/bar' should resolve as revision 0 but throw a LOUD warning that this is deprecated and that a revision number should be added. Conan v2.0 can remove support for requirements w/o revision (unless range is used of course).

I would go for the name "release" or "revision", though prefixing either with "package_" would be reasonable too given it's far more obvious naming.

Are there any updates on the schedule for when this issue will be solved? Revision support is an important feature for us as well.

Hi @ringgelerch,

There is a huge WIP #3055 and we have been internally discussing all the use cases of this feature. We know it is an important feature in our roadmap but cannot commit to a release date yet.

Thanks for your patience and stay tuned.

Excellent news! Thank you for the update!

I'll add my voice to those who've expressed a need for this feature.

We build several open source packages as dependencies for software we write. These need to be built with particular flags/options that are important to us; and frequently the upstream source must be patched. (We build dependencies as static libraries; but that's not terribly important for this discussion.) The particular revision of the build scripts and patch set needs to be expressed in the identity of the binary package; and it doesn't make sense to use the package's version for this (since we don't "own" that).

This scenario is very nearly the same as that for creating packages for Linux distributions; and, accordingly, we need a release number for pretty much the same reasons.

I'll also add that it's useful to have that release number—or, more accurately, release version—follow the expression [0-9](\.[0-9])*, rather than use a single integer. This can come in handy should the need arise to branch the package build scripts. E.g., I've created package 1.1.2 release 4 and the current release of my software uses it; but a previous release of my software using 1.1.2 release 3 needs a fix to the dependency that would not be satisfied by upgrading to 1.1.2 release 4 (or upgrading to that release is otherwise problematic). It makes the most sense to branch from the 1.1.2 release 3; but, of course, this new package release needs a new release number. I don't want to call it 5; because this should not be seen as an "upgrade" for users of 1.1.2 release 4. Instead, I want to call it 1.1.2 release 3.1.
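The ordering this relies on is easy to sketch in Python: treat each dotted release as a tuple of integers and compare tuples. The function name `release_key` is made up for this illustration.

```python
def release_key(release):
    """Turn a dotted release string such as '3.1' into a comparable tuple."""
    return tuple(int(part) for part in release.split("."))


releases = ["4", "3", "3.1"]
ordered = sorted(releases, key=release_key)
print(ordered)  # release 3.1 sorts after 3 and before 4
```

With this comparison, 3.1 is newer than 3 but is not seen as an upgrade over 4, which is the property the branching scenario needs.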

@bradenmcd, I would actually argue that the revision needs to be either fully semantic or just a single integer (and personally, I'd prefer a single integer). Your use case is just one _tiny_ step away from someone else saying "I'd like to fully express major/minor/patch updates."

Your use case is just one _tiny_ step away from someone else saying "I'd like to fully express major/minor/patch updates."

@DavidZemon, it's really not. As my example demonstrates, it's about identifying a new release that falls logically between existing fielded releases. That is all it's about; but, you cannot do that with a single integer.

Semantic versioning is about what increments to particular version components mean for a software interface. That's a huge leap (requiring gobs more logic to understand) relative to what I'm asking for. I'd also suggest that "semantic versioning" as it might apply to a binary package release version (separate from a software package version) is not something that's at all well-defined; thus a request for it in this context is not meaningful without providing that definition.

With the current Conan/revisions model, if you change the "version" field you have a different package, with its own revisions. But I think that, in the scenario you described, this is correct: it has to be a different package (1.1.2 release 3.1 or whatever), because users will depend explicitly on that version and they have to perceive the change. The revisions mechanism is mostly a way to guarantee reproducibility and to resolve the latest revision. The current POC (more than a POC now) is based on:

  • Storing always a new recipe when a recipe changes
  • Storing always a new binary package when (not changing the recipe) I've changed a binary package.
  • Resolving automatically the latest revision when a reference is provided.
  • Unless you specify a concrete revision. For this purpose, some way to "lock" the graph is needed (not even POC yet).
  • The "revision number" is based on a commit hash of the code or the hash of the files when the SCM is missing. So you keep a direct correlation between your SCM and your package revisions.
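As a rough illustration of the last point (a revision derived from the hash of the files when no SCM is available), here is a small Python sketch. It is not Conan's actual algorithm; the function name `files_revision`, the choice of SHA-1, and the way paths and contents are combined are all assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def files_revision(folder):
    """Hash the relative paths and contents of all files under `folder`.

    Illustrative only: the real hashing scheme used by Conan may differ.
    """
    digest = hashlib.sha1()
    root = Path(folder)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


# Any change to the recipe files yields a different revision identifier.
with tempfile.TemporaryDirectory() as tmp:
    recipe = Path(tmp) / "conanfile.py"
    recipe.write_text("version = '1.0'\n")
    first = files_revision(tmp)
    recipe.write_text("version = '1.0'  # patched\n")
    second = files_revision(tmp)

print(first != second)
```

The point of the sketch is only that identical content always maps to the same identifier, while any edit produces a new one.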

The "revision number" is based on a commit hash of the code or the hash of the files when the SCM is missing. So you keep a direct correlation between your SCM and your package revisions.

That's interesting. According to this POC, will the revision be a part of a reference? And if yes, could you please give an example of what such a reference might look like?

According to this POC, will the revision be a part of a reference? And if yes, could you please give an example of what such a reference might look like?

Yes, it will be part of a reference. Usually it won't be specified in a requirement but will instead be part of a frozen reference in a lock file, although you could specify it. The revision will be a commit hash:

lib/1.0@conan/stable#d763a453f0940f06f85102fbcc519c3c694caf8f

You could "go back in time" a revision by looking at the git log of your scm.

I am very encouraged to see how much consideration of this issue has already happened and would only like to cheer for the conan devs in hopefully releasing something that solves the issue soon. Just wondering what is the plan for roll-out of this or is it still very much a design effort?

According to this POC, will the revision be a part of a reference? And if yes, could you please give an example of what such a reference might look like?

Yes, it will be part of a reference. Usually it won't be specified in a requirement but will instead be part of a frozen reference in a lock file, although you could specify it. The revision will be a commit hash:

lib/1.0@conan/stable#d763a453f0940f06f85102fbcc519c3c694caf8f

You could "go back in time" a revision by looking at the git log of your scm.

while this appears very clever it doesn't seem incrementally intuitive for a user? for example just staring at some hash values i have no idea if it is more recent than any other unless i have access to the scm. in the case of no scm the file hash tells me nothing about which one is more recent. it is also in practice hard to have quick discussions about packages and local version history if you need to remember hashes. something like 'hey are you using revision 1 or 2 because i heard 3 fixes both issues.'

That's a pretty annoying way to do it... I think I understand why... but I don't think I like it. But maybe I'm just being averse to change here.

But just like groovyd said, this is _really_ verbose and does not work well for humans talking to each other about what version to use.

Is there a way to force a specific revision number in the recipe, instead of letting Conan auto-compute the hash? I certainly like the idea of the hash as a fallback if no other revision was provided - it's brilliant. But I don't like it as a forced option that can't be overridden.

@groovyd The work on revisions is quite advanced, and some parts like API v2 have already been merged. We are investing lots of time in this; it is a high priority for us. Now we are researching the model and flows for actually consuming those revisions and using them in CI and "lockfiles" to have reproducible builds. It is a challenging problem.

@DavidZemon I am proposing the possibility of letting users define their own revision numbers, both for recipes and binaries (like the buildID of the CI build, for example). It still needs to be discussed, but I have that in mind.

Thanks both for the feedback!

cheers and thanks for listening.


Just an FYI (and maybe i'm doing something wrong). I use the commit hash as a channel in the ci/cd to deal with this issue. Seems to solve (so far)

  • commit-hash pinning for devs
  • ci/cd parallel build/upload on the same machine

Hi @earonesty

No, that is not a bad approach, and it can work. The main issue is that it might require a lot of work on the CI side when you want to propagate that channel down the dependency graph, so that consumers start to use that specific commit of the dependency. Could you share a little bit more about how you apply it in the consumers of those packages? Thanks!

We just stick: vapilib/1.0@vida/4e95ec56ced10b333647b69dcfae60fc789bc209 in the conanfile.txt of dependent modules. Then those modules consume the correct commit and we don't have to worry about new versions of vapilib breaking things on a daily basis.

Later, if someone wants to get the most recent vapilib, they can just git ls-remote --refs [email protected]:vidaid/vapilib.git stable to get the latest ref to use.. which we have enshrined in a "update-conanfile [tag] [libname1...] " script that walks through and updates refs (by default "everything/stable")
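A script of this kind could be sketched in Python as below. This is not the actual "update-conanfile" script mentioned above; the function name `update_channel` and the sample file contents are assumptions for illustration.

```python
import re


def update_channel(conanfile_text, libname, new_channel):
    """Replace the channel of every 'libname/version@user/channel' reference."""
    pattern = re.compile(
        rf"^({re.escape(libname)}/[^@\s]+@[^/\s]+/)\S+", re.MULTILINE
    )
    # A function is used as the replacement so that digits in new_channel
    # cannot be misread as part of a group backreference.
    return pattern.sub(lambda m: m.group(1) + new_channel, conanfile_text)


# Hypothetical conanfile.txt contents pinned to a commit hash as the channel.
original = (
    "[requires]\n"
    "vapilib/1.0@vida/4e95ec56ced10b333647b69dcfae60fc789bc209\n"
    "otherlib/2.0@vida/stable\n"
)
updated = update_channel(original, "vapilib", "stable")
print(updated)
```

Only the references of the named library are rewritten; other requirements in the file are left untouched.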

Of course if we have two dependent modules set to consume different refs, things can get wacky.... but that's the same as versions (only worse). Seems ok (to me) for "new projects" or fast-moving things that don't have deep dependency trees.

The package revisions feature has been deployed in Conan 1.10 in the client and conan_server but still locked. If anyone is interested in giving it a try ping me, I can tell you how to unlock the feature and play with it. Thanks!

This guy is interested!
(Email removed for privacy)

@lasote where can we read about the chosen design? I cannot see anything in the docs.

Hi, @mmatrosov sorry for the delay. I created a wiki page here: https://github.com/conan-io/conan/wiki/Revisions-experimental-feature explaining what it is, how to try it and what is the release plan. Thanks for your interest.

I found this ticket accidentally and it appears to solve the "issue" I've tried to describe here: https://github.com/conan-io/conan/issues/4569

As far as I understand, this feature could be used to "pin" the latest package, including the dependencies. The documentation at https://github.com/conan-io/conan/wiki/Revisions-experimental-feature was a bit unclear to me.

What would it look like, from the producer and consumer point of view, if I wanted to package pre-built binaries?

project/1.0@user/rc1 # nightly snapshot, depends on A/1.0 and B/1.0
A/1.0@user/rc1 # nightly snapshot
B/1.0@user/rc1 # nightly snapshot

Like this?

Producer:
project/1.0@user/rc1#1.0-1234
requires: "A/1.0@user/rc1", "B/1.0@user/rc1" # should pick the latest revision?

Consumer:

conan install project/1.0@user/rc1 # should pick the latest revision?

hi @unzap,

Yes, this feature will fit the use case you described in #4569.

The idea is that every time you create a new package with the same reference (lib/version@user/channel), Conan will create a new revision from a hash of the recipe content or from the git/svn commit (lib/version@user/channel#recipe_revision).

This also works for package references (lib/version@user/channel#recipe_revision:package_id).
Even if the recipe does not change but you change the binary in the package (you make changes in the source code of the library), Conan will also create a new package revision (lib/version@user/channel#recipe_revision:package_id#package_revision).
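The reference layout described above can be taken apart with a small Python sketch. The regular expression and the function name `parse_reference` are assumptions made for illustration and are not Conan's own parser.

```python
import re

# name/version@user/channel[#recipe_revision][:package_id[#package_revision]]
REFERENCE_RE = re.compile(
    r"^(?P<name>[^/@#:]+)/(?P<version>[^/@#:]+)"
    r"@(?P<user>[^/@#:]+)/(?P<channel>[^/@#:]+)"
    r"(?:#(?P<recipe_revision>[^:#]+))?"
    r"(?::(?P<package_id>[^:#]+)(?:#(?P<package_revision>[^:#]+))?)?$"
)


def parse_reference(text):
    """Split a (possibly partial) reference into its named components.

    Components that are not present in the text come back as None.
    """
    match = REFERENCE_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid reference: {text!r}")
    return match.groupdict()


full = parse_reference(
    "lib/version@user/channel#recipe_revision2:package_id1#package_revision3"
)
print(full["recipe_revision"], full["package_id"], full["package_revision"])
```

A plain lib/version@user/channel reference parses as well, with the three revision-related fields left empty, which matches the idea that consumers do not have to spell out revisions.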

So you would have this kind of structure if you upload it to the remote:

lib/version@user/channel
  - recipe_revision1
    - package_id1
       - package_revision1
       - package_revision2
    - package_id2
       - package_revision1
  - recipe_revision2
    - package_id1
       - package_revision1
       - package_revision2
       - package_revision3

The interesting part here is that consumers wouldn't have to change the reference being installed.

If they install the normal reference they will get the latest recipe revision and the latest package revision suitable for their configuration.

In the above example, a conan install lib/version@user/channel command (let's suppose we are looking for a package with the settings hashed in package_id1) will download the following full reference: lib/version@user/channel#recipe_revision2:package_id1#package_revision3
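The resolution of that example can be sketched in Python. This is not Conan's implementation: the nested dictionary layout and the integer upload times used to decide what "latest" means are assumptions made for illustration.

```python
def latest(entries):
    """Return the name of the (name, upload_time) pair with the greatest time."""
    return max(entries, key=lambda entry: entry[1])[0]


def resolve(reference, recipe_revisions, package_id):
    """Pick the latest recipe revision, then the latest package revision
    available for `package_id` under it."""
    rrev = latest([(name, data["time"]) for name, data in recipe_revisions.items()])
    prev = latest(recipe_revisions[rrev]["packages"][package_id])
    return f"{reference}#{rrev}:{package_id}#{prev}"


# The tree from the example above, with made-up upload times.
remote = {
    "recipe_revision1": {"time": 1, "packages": {
        "package_id1": [("package_revision1", 1), ("package_revision2", 2)],
        "package_id2": [("package_revision1", 1)],
    }},
    "recipe_revision2": {"time": 2, "packages": {
        "package_id1": [
            ("package_revision1", 3),
            ("package_revision2", 4),
            ("package_revision3", 5),
        ],
    }},
}
resolved = resolve("lib/version@user/channel", remote, "package_id1")
print(resolved)
```

Note that in this simplified model a package_id that exists only under an older recipe revision (package_id2 here) would not be found, since the latest recipe revision is chosen first.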

Hope it helps to clarify the idea on this new feature and that you find it useful! 😄

hi @danimtb,

Hope it helps to clarify the idea on this new feature and that you find it useful!

Yes this feature is very welcome!

A couple questions still :)

Correct me if I understood it wrongly but it looks fully automatic from consumer and producer perspective?
Or do I (producer) need to add the revision, e.g. append the nightly CI snapshot number after the channel like "project/1.0@user/rc1#1.0-1234"?
If the revision feature is automatic what is the mechanism to remove (clean) old revisions from server, how to find/reference those in the "remove" command?

Does this work along with "alias" command if I want to e.g. flag "rc" as "final", will the "final" alias point to the latest revision automatically (my assumption, but just checking).

Yes, from the producer side it is currently automated by Conan and no input revision can be given, but it is something we are considering to implement. However, I think the commit hash will be more useful to track the changes as you can make a direct association source code change <> new conan package.

Regarding the commands to work with revisions, if you don't specify it manually you will get the latest resolved. However you would be able to reference the recipe revision in the installation/requirements: conan install lib/version@user/channel#recipe_revision2

The same but including packages can be done for removal/deletion: conan remove lib/version@user/channel#recipe_revision2:package_id1#package_revision3 --remote remote1 although this is something that you won't be doing often as it could affect you regarding traceability of package generation.

There are also commands for searching recipe and package revisions too!
conan search lib/version@user/channel --revisions --remote remote1 and conan search lib/version@user/channel#recipe_revision2:package_id1 --revisions --remote remote1

Regarding alias, I am not sure about their behavior with revisions, but I'd say you could point them to any reference, with or without a recipe revision.

Allow me to play a little pessimistic chord here. The whole design looks a little over-generalized to me. Especially the decision to use hashes as revisions. Let's consider your example from above (https://github.com/conan-io/conan/issues/798#issuecomment-467797171). Substituted with real values it would look like this:

lib/version@user/channel
  - recipe_revision_10cfc30b3a761f7052a545d22d0d52953c636800
    - package_id_1ce3a9dce5ee5f7adb42f16ea1e812d187502583
       - package_revision_f1e462b1d8d9cffdad98b1c373113671621dd88e
       - package_revision_d7c16de57903c4a94e17a480a9b1552d40df13a9
    - package_id_00b17467a14c6546026c59bbc3724e711d9f199d
       - package_revision_2941f23f8ae5f96996d70d11dc029b644e18492d
  - recipe_revision_8e8b05bbaf9d18277ee41c052cb2cbc44080cd0b
    - package_id_c2053771cd94bcb3078d620d01e11e5cf97726b8
       - package_revision_a48c9ba329b778e36dd6d261fadbc3ff67068277
       - package_revision_9ab2e86421bdd7e90c17c715923353634c482b8c
       - package_revision_ca586123ce701b4bc3d88d74f5606d28c8e965c0

Simple question: how do I order these things? We have had hashes as package ids from the very birth of conan, but that is ok, since ordering packages does not make any sense. But it does for revisions.

Looking at your implementation, I feel like we will continue using our custom homebrew approach, see https://github.com/conan-io/conan/issues/3158#issuecomment-402129907.

Yes, from the producer side it is currently automated by Conan

Good, this should make it easier from producer and consumer point of view

although this is something that you won't be doing often as it could affect you regarding traceability of package generation.

@danimtb
In our case probably not daily but weekly, in the "worst" case one week can produce roughly 50G-100G worth of snapshot binaries so we want to clean up old unused revisions.

Ps. it would be a handy feature to have it work something like this:

conan remove --follow-dependencies --clean-old-revisions=

@mmatrosov

I feel like we will continue using our custom homebrew approach

Afaik changing the version number in the "reference" will produce new package repositories, so how can the consumers update the packages when the reference has changed? Or are you using the alias mechanism here?

Edit: Is the work flow (consumer) below actually possible?

> conan install lib/1.0-1@user/channel
> conan install --update lib/1.0-2@user/channel  # revision changed

I.e. can you update a package from a different reference?

@unzap

Afaik changing the version number in "reference" will produce new package repositories,

If by "new package repositories" you mean "new almost completely unrelated packages", then yes, you are correct.

how can the consumers update the packages when the reference has changed?

Manually. The whole approach was invented to ensure reproducibility of our builds. I.e. when someone checks out a particular git revision of our source code and builds the solution, they will obtain exactly the same packages, no matter when they do it, even if more recent revisions have been uploaded to the server.

This means that when we prepare a new revision of a package, no one gets it automatically. We have a special file called CurrentReferences.yaml, which is stored and updated alongside our code and which contains, well, the currently used references for all libraries that we use. Thus, if you want your new revision to actually participate in the build, you update the corresponding reference in CurrentReferences.yaml. Simple, traceable, reliable, reproducible.

The same goes for recipes that reference an updated recipe. Want to reference the latest revision? Go and manually update the requires field. By "manually" I mean it could easily be done with an automated script, but either way it will be directly reflected in the code of the recipe. And yes, updating the requires of a recipe means it gains a new revision. Thus, when we update the revision of a particular recipe, the change should be propagated downstream, updating revisions for all referencing recipes. All these updates should be reflected in CurrentReferences.yaml.

If something goes out of sync, you might have different libraries referencing different versions (meaning different revisions) of a single library. Surprisingly, conan is ok with this. That's why I created https://github.com/conan-io/conan/issues/2800. Please go and cast a vote :)
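An out-of-sync state like this can be detected with a short check. The Python sketch below is only an illustration: the function name `find_conflicts`, the input layout, and the references in the sample data are all hypothetical.

```python
from collections import defaultdict


def find_conflicts(requires):
    """Find libraries that are required at more than one version-revision.

    requires: {consumer: [references like 'name/version-revision']}
    Returns {name: {version: [consumers]}} for the conflicting names only.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for consumer, refs in requires.items():
        for ref in refs:
            name, version = ref.split("/", 1)
            seen[name][version].append(consumer)
    return {
        name: dict(versions)
        for name, versions in seen.items()
        if len(versions) > 1
    }


# Hypothetical graph: two libraries pin different revisions of zlib.
requires = {
    "libA/1.0-2": ["zlib/1.2.11-3"],
    "libB/2.1-1": ["zlib/1.2.11-4"],
    "app/0.1-1": ["libA/1.0-2", "libB/2.1-1"],
}
conflicts = find_conflicts(requires)
print(conflicts)
```

Running such a check in CI would flag the mismatch before it silently ends up in a build.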

Or are you using the alias mechanism here?

No, we do not.

Edit: Is the work flow (consumer) below actually possible? "conan install --update ..."

I believe not, because we actually have different references for every revision.

@mmatrosov

Simple question: how do I order these things? We have had hashes as package ids from the very birth of conan, but that is ok, since ordering packages does not make any sense. But it does for revisions.

Actually new revisions will get a timestamp in the server and you would be able to see the chronological order of recipe revisions in the output of conan search <reference> --revisions (same concept for package revisions).

Your approach above seems reasonable if you control the automatic bumping of requirements in a recipe when you want to update them, but you have to rely on something external, probably automated in your CI or a git hook.

Regarding reproducibility, we are working on the concept of "graph locks": files that can be used to reproduce all the dependencies, and their relations, used in a conan install, so that you can get exactly the same output from a CI build, for example, without being forced to have "pinned" references.

Finally, I forgot to say that the revisions feature would be something opt-in via conan.conf/env var for the moment and you would still be able to use the same mode without revisions.

we are working on the concept of "graph locks"

This looks intriguing. Could you please provide some details?

the revisions feature would be something opt-in

I'm not a fan of "opt-in" features. Roughly speaking, if a feature is good, make it available for everyone. If it is not, why is it needed at all? Do you plan to make it available by default eventually?

I proposed the concept of the "graph lock" in #101. It is similar to the concept used in npm or yarn. It didn't seem to get any traction with the team. I think the lack of this is a critical hole in conan and an invitation for build reproducibility problems.

@kenfred,

I think it would be a great idea if you opened a new ticket with your lock file idea. I would certainly vote for it. My team was planning to implement our own version of a lock file because we absolutely need that level of reproducibility for a given build of a project.

@mmatrosov

I'm not a fan of "opt-in" features. Roughly speaking, if a feature is good, make it available for everyone. If it is not, why is it needed at all? Do you plan to make it available by default eventually?

The revisions feature changes the behavior of Conan in some flows, and we cannot break those flows. Revisions will probably be the default in Conan 2.0.

@DavidZemon While opening an issue, I came across #1042. I'm going to add some comments there.
