DCAT does not take a position on what should be considered as a version, recognising that versioning is used and implemented differently across communities, domains, and platforms.
However, it remains to be verified whether the current guidelines are useful enough for implementors and sufficient to facilitate interoperability, or whether a more precise definition of versioning should be used instead.
For instance, one possible issue is that the versioning "types" discussed in the current DCAT specification cover heterogeneous and possibly overlapping scenarios (e.g., versions vs series, versions vs translations), some of which may not strictly be considered as related to versioning.
@agreiner's review in https://github.com/w3c/dxwg/issues/1275 raises a number of key discussion points.
Quoting:
There is some interesting reading about possible uses of the term "version". I think it would be helpful to go a step further and clarify what our own definition of the term is. Moreover, I'm surprised, given the requirement to add support for versioning, that we do not seem to be ready to add a version term of our own. It's true that one can interpret the word to mean a handful of different things, but I don't think we solve the problem by simply reiterating the complexity and then telling people to do whatever works for them. I think that people who have asked for DCAT to support versions are asking that because they don't think that solution works.
If we will add such a term, the text should clearly state that and should define it narrowly. I'm worried about overlap with other concepts that people may add in a profile, like series. Obviously, one can't use the same term for all the things suggested, and there will be datasets that use more than one (e.g., series where one item is updated).
The one usage with which I disagree entirely is for languages. I don't believe that different language distributions are versions in the sense we're discussing, and I think that including that is potentially confusing.
And specifically about the first point:
As of my reading, the text didn't make it clear to me that the proposed path forward was not to add explicit support for versions, since the draft contains references to the related requirements, and readers may well assume we do plan to add terms for them.
A new draft for discussion on versioning support in DCAT has been prepared - see PR https://github.com/w3c/dxwg/pull/1295
Compared with the first draft in DCAT 3 FPWD, this section has been revised to focus specifically on versions derived from the revision of a resource, and to follow the [PAV] approach for the specification of version chains and hierarchies - previous, next, current, last version.
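To make the idea of a version chain concrete, here is a minimal sketch of how such a chain could be walked. It is an illustration only, not the text of the PR: the `ex:` identifiers and the helper functions are hypothetical, the metadata is modelled as plain (subject, property, object) tuples rather than with an RDF library, and the property names are taken from the PAV ontology (`pav:previousVersion`, `pav:hasVersion`, `pav:hasCurrentVersion`) on the assumption that the draft reuses them as-is.

```python
# Hypothetical metadata for one dataset with three revisions.
# Property names come from the PAV ontology; the ex: identifiers are made up.
triples = [
    ("ex:dataset", "pav:hasVersion", "ex:dataset-v1"),
    ("ex:dataset", "pav:hasVersion", "ex:dataset-v2"),
    ("ex:dataset", "pav:hasVersion", "ex:dataset-v3"),
    ("ex:dataset", "pav:hasCurrentVersion", "ex:dataset-v3"),
    ("ex:dataset-v2", "pav:previousVersion", "ex:dataset-v1"),
    ("ex:dataset-v3", "pav:previousVersion", "ex:dataset-v2"),
]


def current_version(triples, resource):
    """Return the version marked as current for a resource, or None."""
    for subject, prop, obj in triples:
        if subject == resource and prop == "pav:hasCurrentVersion":
            return obj
    return None


def version_history(triples, start):
    """Follow pav:previousVersion links from `start` back to the first version."""
    previous = {s: o for s, p, o in triples if p == "pav:previousVersion"}
    history = [start]
    seen = {start}
    while history[-1] in previous:
        older = previous[history[-1]]
        if older in seen:
            # A chain that loops back on itself is inconsistent metadata.
            raise ValueError(f"cycle in version chain at {older}")
        seen.add(older)
        history.append(older)
    return history


latest = current_version(triples, "ex:dataset")
history = version_history(triples, latest)
print(history)  # newest first
```

The "next" direction is not stored separately in this sketch; it can be derived by inverting the `pav:previousVersion` links, which keeps the chain consistent by construction.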
I am very much in favour of specifying the DCAT approach to the definition. Granted, DCAT shouldn't constrain publishers on specific details like the granularity of a new version (i.e. which data changes should lead to the production of a new version). But as you write, "version" in general can include notions like 'editions, adaptations' and this would go too far.
In this respect I think the new additions from PR #1295 are useful!
I am wondering whether they could be made more precise though: it's all in one sentence "versions resulting from a revision - i.e., from changes occurring to a resource as part of its life-cycle.". There is an implicit acknowledgement that time plays a role, but 'life-cycle' may still include 'adaptations'... Maybe the temporal aspect could be reinforced by bringing 'history' in the intro? And 'release', too?
By the way (and maybe this is a closely related issue) DCAT provides elements to describe releases, but the section on versioning is not very explicit on whether a new release is expected to be a new version.
Again I reckon DCAT should remain generic, but I think it could feature a guideline that says that a new release of a resource would typically lead to the creation of a new version. This could otherwise result in inconsistencies in DCAT metadata, couldn't it? (that is, at least for the data publishers who would care about versioning).
Good point, @antoine . Do you have any suggestions on how to revise the current text?
For the precision, maybe we could start by amending the note on the focus (in the intro of the Versioning section) with something like this?
This section focuses specifically on how to use DCAT to describe versions resulting from a revision - i.e., from changes occurring to a resource as part of its life-cycle. One of the typical cases here is representing a history of the versions of a dataset that have been released over time.
We could make a stronger guideline about releases, pointing to the DCAT release info properties, but as I'm not sure that everyone is on board with the idea I'm not going to suggest this yet :-)
Thanks, @aisaac . While waiting for feedback on including guidance for releases, I've revised the draft as per your suggestion - see https://github.com/w3c/dxwg/pull/1295/commits/3d6859d767f87ba74e6a441d3304b069f297d12c
Great, thanks @andrea-perego !
PR https://github.com/w3c/dxwg/pull/1295 is now merged.
I suggest we close this issue and create a new one about the point about releases raised by @aisaac in https://github.com/w3c/dxwg/issues/1277#issuecomment-784557124
Any objections?
Thanks, @agbeltran & @riccardoAlbertoni .
@aisaac , could you please create a new issue on releases, following up from your point in https://github.com/w3c/dxwg/issues/1277#issuecomment-784557124 ?
I created this issue: https://github.com/w3c/dxwg/issues/1331
@aisaac , could you please check if it needs to be further elaborated?
Meanwhile, I'm closing this issue.