Identify DCAT resources that are subject to versioning, i.e. Catalog, Dataset, Distribution.
If I have correctly understood the discussion we had in the sprint call, all first-class DCAT citizens can be subject to versioning (e.g. datasets, distributions, catalogs). Am I wrong?
As far as I am concerned, you are right
I agree - and we haven't discussed it much, but we actually need to consider dcat:Resource and services too in this context
@agbeltran If our decision is to allow versioning for all first-class citizens in DCAT, then this would also include dcat:Resource and dcat:DataService. The question is whether we would restrict this to the simple case (indicating version number/string and providing version info) or investigate the kinds of version relationships that would be relevant for those other classes.
@makxdekkers and @agbeltran: I agree, dcat:Resource and dcat:DataService can potentially be versioned as well. In my research activity, I have encountered a case in which even catalogs could be versioned.
By the way, my feeling is that there is a "core part" of versioning that does not depend much on the subjects we are considering.
For example, both the unqualified relation prov:wasRevisionOf and its qualified counterpart prov:qualifiedRevision work on prov:Entity, which can be any of the first-class citizens we mentioned before.
The same holds for PAV, which specializes PROV with specific authorship, curation and digital creation terminology.
So, in order to deal with this issue, I would suggest acknowledging in the versioning section that versioning can be applied to any of the first-class citizen classes; then, as a starting point, we can illustrate versioning with the most common cases, dealing with artefacts such as Distribution and Dataset.
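The "core part" mentioned above could look roughly like the following sketch. The `ex:` URIs and the version string are hypothetical and only serve as an illustration; the same pattern would apply unchanged to a dcat:Distribution, dcat:Catalog or dcat:DataService, since the properties are not tied to a specific DCAT class.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix pav:  <http://purl.org/pav/> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Unqualified form: one triple linking the new version to the earlier one.
ex:dataset-v2 a dcat:Dataset ;
    prov:wasRevisionOf ex:dataset-v1 ;
    # PAV counterpart, plus a version designation.
    pav:previousVersion ex:dataset-v1 ;
    pav:version "2.0" ;
    # Qualified form: the revision becomes a node that can carry more detail.
    prov:qualifiedRevision [
        a prov:Revision ;
        prov:entity ex:dataset-v1
    ] .

ex:dataset-v1 a dcat:Dataset ;
    pav:version "1.0" .
```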
+1 from me to not limiting the scope of versioning to some specific classes.
That said, I wonder what versioning a "service" actually means: is this about the version of the software and/or the service interface? This can already be addressed by indicating the specification to which the service conforms.
In other words, shouldn't versioning be related to content/data only?
@andrea-perego said:

> In other words, shouldn't versioning be related to content/data only?

This is the key question, isn't it? We're certainly more interested in content/data than in versions of software but I'm struggling to see how that could work in practice. For example, if I say that I'm using a specific version of a dcat:DataService and then the endpoint changes to a new version then it may behave differently - for example, if the previous endpoint had a defect/bug that was then fixed. Wouldn't some users want that change to be reflected in the dcat:DataService version?
So I don't see how we can limit it just to the content/data... Of course I may be missing a subtlety somewhere that we need to explain.
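To make this scenario concrete, here is a minimal sketch. All `ex:` URIs, the endpoint URLs and the version strings are hypothetical. Both descriptions conform to the same specification, so dct:conformsTo alone would not tell the fixed endpoint from the defective one; in this sketch the difference is carried only by owl:versionInfo and prov:wasRevisionOf.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Earlier deployment, which had a defect.
ex:service-a a dcat:DataService ;
    dcat:endpointURL <http://example.org/api/a> ;
    dct:conformsTo ex:api-spec ;          # hypothetical specification
    owl:versionInfo "1.0.0" .

# Later deployment with the defect fixed: same specification, new version.
ex:service-b a dcat:DataService ;
    dcat:endpointURL <http://example.org/api/b> ;
    dct:conformsTo ex:api-spec ;
    owl:versionInfo "1.0.1" ;
    prov:wasRevisionOf ex:service-a .
```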
@davebrowning, I see your point, but what has changed is the version of software/API used, and this can already be specified via dct:conformsTo.
E.g., the following is a service conformant with CSW 2.0.2:
```turtle
a:Service a dcat:DataService ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/csw/2.0.2> .
```
If the service is switched to the latest version of the CSW specification, its description would be:
```turtle
a:Service a dcat:DataService ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/csw/3.0> .
```
The version here does not concern the service, but the reference specifications:
````turtle
<http://www.opengis.net/def/serviceType/ogc/csw> a dct:Standard , adms:Asset ;
  dct:hasVersion <http://www.opengis.net/def/serviceType/ogc/csw/2.0.2> ,
    <http://www.opengis.net/def/serviceType/ogc/csw/3.0> .
<http://www.opengis.net/def/serviceType/ogc/csw/2.0.2> a dct:Standard , adms:Asset ;
  dct:isVersionOf <http://www.opengis.net/def/serviceType/ogc/csw> ;
  adms:next <http://www.opengis.net/def/serviceType/ogc/csw/3.0> ;
.
<http://www.opengis.net/def/serviceType/ogc/csw/3.0> a dct:Standard , adms:Asset ;
  dct:isVersionOf <http://www.opengis.net/def/serviceType/ogc/csw> ;
  adms:prev <http://www.opengis.net/def/serviceType/ogc/csw/2.0.2> ;
.
````
Reading the discussion here and in the last sprint on versioning, I understand that the vocabularies under consideration are DCTerms, PROV and PAV.
It may be worth mentioning that versioning was also addressed in ADMS, where specific properties are defined:
- adms:last: "A link to the current or latest version of the Asset."
- adms:next: "A link to the next version of the Asset."
- adms:prev: "A link to the previous version of the Asset."

Please note that, despite what is said in their textual definitions, no domain or range restriction is specified for these properties.
Sorry, I forgot adms:versionNotes ("A description of changes between this version and the previous version of the Asset."), and that ADMS also re-uses owl:versionInfo (which in ADMS is used as "A version number or other designation of the Asset.").
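As an illustration only, the ADMS properties listed above could be combined on a pair of dataset versions as in the sketch below. The `ex:` URIs, the version strings and the note text are hypothetical. Applying the properties to dcat:Dataset relies on the fact, noted above, that they have no domain or range restriction.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix adms: <http://www.w3.org/ns/adms#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:dataset-v1 a dcat:Dataset ;
    owl:versionInfo "1.0" ;
    adms:next ex:dataset-v2 ;
    adms:last ex:dataset-v2 .

ex:dataset-v2 a dcat:Dataset ;
    owl:versionInfo "2.0" ;
    adms:versionNotes "Hypothetical note describing what changed since 1.0."@en ;
    adms:prev ex:dataset-v1 ;
    adms:last ex:dataset-v2 .
```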
I don't remember if this was already mentioned, but another aspect of versioning may concern the resource "status" - e.g., draft, stable, deprecated, withdrawn.
The EU Publications Office maintains some reference code lists - the two that may be most relevant here are the dataset status and the concept status code lists.
E.g., the dataset status code list includes the following statuses (alphabetically ordered):

Code | Label | Definition
--- | --- | ---
COMPLETED | completed | This dataset is considered to be complete, it holds all information that is intended.
DEPRECATED | deprecated | It is recommended that the contents of this dataset be no longer used.
DEVELOP | under development | This dataset is currently being assembled. It may be in an incomplete or faulty state.
DISCONT | discontinued | This dataset is no longer produced or updated.
WITHDRAWN | withdrawn | This dataset is no longer meant to be published.

The concept status code list includes additional statuses.
This information is clearly useful for administrative purposes, but it is relevant for users as well.
E.g., in the JRC Data Catalogue, these statuses determine where a dataset can be published (e.g., a dataset in draft status is not supposed to be published in production). On the other hand, deprecated, discontinued, or withdrawn records are not removed from the catalogue (because of the persistence policy we have in place), but they are "marked" as such, so that users are aware they shouldn't be used or are no longer available.
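One possible way to "mark" a record with such a status is sketched below. Two assumptions are made here: that adms:status is the property used (the thread does not settle on a property), and that the status code is identified by a URI following the pattern shown, which should be checked against the published code list. The `ex:` URI is hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix adms: <http://www.w3.org/ns/adms#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# A record that stays in the catalogue but is flagged as deprecated.
ex:old-dataset a dcat:Dataset ;
    # Assumed URI pattern for the DEPRECATED code of the dataset status list.
    adms:status <http://publications.europa.eu/resource/authority/dataset-status/DEPRECATED> .
```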
I think we are close to an agreement on this issue, as the following sentence, already included in DCAT 2 (see the section about Versioning), shows that we apply versioning to all the first-class entities (datasets, distributions, but in principle also catalogs and data services):

> Versioning can be applied to any of the first-class citizens DCAT resources, including Catalogs, Datasets, Distributions. The notion of versions is very much related to the community practices, data management policy and the workflows in place. It is up to data providers to decide when and why a new version should be released. For this reason, DCAT refrains from providing definitions or rules about when changes in a resource should turn in a new release of it.

Some concerns were expressed about versioning for data services, in particular about whether content or service changes should trigger a new version of a data service, but I think this is up to the adopters to decide.
I suggest adding the requirement mentioned by Andrea about the "resource status" as a separate issue. Then if there are no objections, I think we can close this GitHub issue. What do others think?
+1 from me. Thanks for the summary, @riccardoAlbertoni .
I include in this discussion @pwin's view, as expressed in the mailing list:
+1 from Peter (see https://lists.w3.org/Archives/Public/public-dxwg-wg/2020May/0007.html)
I think that this is something that can be illustrated in a primer, some of the options and how DCAT might be used together with prov, pav, or similar (see https://lists.w3.org/Archives/Public/public-dxwg-wg/2020May/0008.html)