Dxwg: Should DCAT define a property for version identifiers?

Created on 14 Nov 2020  ·  19 Comments  ·  Source: w3c/dxwg

DCAT currently recommends the use of owl:versionInfo for specifying version identifiers.

It remains to be decided whether a more specific property is needed, possibly one defined in the DCAT namespace.

NB: The discussion that led to the adoption of owl:versionInfo is documented in https://github.com/w3c/dxwg/issues/92
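
For readers unfamiliar with the current recommendation, the following is a minimal sketch of a dataset description that carries its version identifier in owl:versionInfo. The dataset IRI, title, and version value are hypothetical and chosen only for illustration; they do not come from the DCAT specification.

    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:dcat="http://www.w3.org/ns/dcat#"
        xmlns:dct="http://purl.org/dc/terms/"
        xmlns:owl="http://www.w3.org/2002/07/owl#">
      <!-- Hypothetical dataset IRI and title, for illustration only -->
      <dcat:Dataset rdf:about="http://example.org/dataset/air-quality/1.2">
        <dct:title xml:lang="en">Air quality measurements</dct:title>
        <!-- Version identifier expressed as a plain literal -->
        <owl:versionInfo>1.2</owl:versionInfo>
      </dcat:Dataset>
    </rdf:RDF>

The question in this issue is whether the owl:versionInfo line should instead use a more specific property.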

dcat versioning

All 19 comments

Original thread started in https://github.com/w3c/dxwg/issues/1275 , following feedback reported in https://lists.w3.org/Archives/Public/public-dxwg-wg/2020Oct/0086.html

Copying below the comments submitted so far:

@agreiner - https://github.com/w3c/dxwg/issues/1275#issue-741202772

[...]

Since schema.org is already supporting version as a term, and it is in some sense a child vocabulary, following what schema is doing would be helpful for compatibility with them.

@andrea-perego - https://github.com/w3c/dxwg/issues/1275#issuecomment-726363227

@agreiner , [...]

Could you please elaborate?

Currently, the versioning section recommends using owl:versionInfo to specify the resource identifier. Do you see any issue in re-using this property?

@riccardoAlbertoni - https://github.com/w3c/dxwg/issues/1275#issuecomment-726831135

[...]

Just providing more context to the discussion.
We provide an alignment between schema.org and DCAT, see appendix B in the current FPWD. The alignment was already included in DCAT 2.

I wonder if we want to keep the same line and add a new line in the mapping for the correspondence between owl:versionInfo and schema:version.

@agreiner - https://github.com/w3c/dxwg/issues/1275#issuecomment-726423294

Yes, I think owl:versionInfo is meant to describe versions of ontology terms and constructs specifically.

@riccardoAlbertoni - https://github.com/w3c/dxwg/issues/1275#issuecomment-726831135

[...]

Yes, it is mainly for ontology terms, but using it for datasets is a common practice.
For example, owl:versionInfo is one of the implementation examples considered in the "Data Web Best Practice 7: Provide a version indicator" https://www.w3.org/TR/dwbp/#dataVersioning. Examples in the DWBP Rec do not have normative status. Nevertheless, they should still suggest proper ways to implement Best Practices.
Similarly, it is reused by ADMS (see https://www.w3.org/TR/vocab-adms/#owl-versioninfo).

@agreiner - https://github.com/w3c/dxwg/issues/1275#issuecomment-727094882

I think both those examples would have happily preferred a dcat:version had it been available, and neither was in a position to mint their own, so it's not surprising that they would use what was available, whether or not it was a strong choice.

@riccardoAlbertoni said:

Yes, it is mainly for ontology terms, but using it for datasets is a common practice.
For example, owl:versionInfo is one of the implementation examples considered in the "Data Web Best Practice 7: Provide a version indicator" https://www.w3.org/TR/dwbp/#dataVersioning. Examples in the DWBP Rec do not have normative status. Nevertheless, they should still suggest proper ways to implement Best Practices.
Similarly, it is reused by ADMS (see https://www.w3.org/TR/vocab-adms/#owl-versioninfo).

Just to add that owl:versionInfo is also used in DCAT-AP, and all the related profiles and extensions.

@agreiner said:

I think both those examples would have happily preferred a dcat:version had it been available, and neither was in a position to mint their own, so it's not surprising that they would use what was available, whether or not it was a strong choice.

I don't recall the DWBP rationale behind the decision to use owl:versionInfo in examples, but for ADMS the adoption of this property was based on the principle of re-using existing terms when available. A specific term could have been defined in the ADMS namespace if deemed necessary, as was done with adms:versionNotes.

I forgot to mention that DISCO also uses owl:versionInfo:

https://rdf-vocabulary.ddialliance.org/discovery.html#versioning-information

As far as I am concerned, the use of owl:versionInfo is both common usage (ADMS, DCAT, DISCO) and suggested best practice (DWBP), so I don't see why there is a need to create a property in DCAT that does exactly the same thing. To me, it feels like finding a solution where there is no (practical) problem.
My perspective is always: don't change things unless there is a serious problem, like something doesn't work in practice. That doesn't seem to be the case here.

HCLS uses pav:version, so not everyone agrees on owl:versionInfo. From looking at the use case (one of only three we have, where the third is a special case of the first one, so in a way this is half our requirements), "Being able to publish dataset version information in a standard way will help both producers publishing their data on data catalogues or archiving data and dataset consumers who want discover new versions of a given dataset, etc." Failing to adopt any specific term seems like failing to address this use case. The more recent feedback from Sandia National Lab also asks for a version term in DCAT.

Related to pav:version, we should consider that PAV (the Provenance, Authoring and Versioning vocabulary) is not a W3C recommendation.

The notes on owl:versionInfo indicate that "Although this property is typically used to make statements about ontologies, it may be applied to any OWL construct. For example, one could attach a owl:versionInfo statement to an OWL class." So, it seems a suitable existing property to cover the use case.

@agreiner said:

Failing to adopt any specific term seems like failing to address this use case.

Do you think that adopting owl:versionInfo is not appropriate for addressing the use case?

Thanks, Alejandra. I think we need to do more than just recommending. With this clarification, owl:versionInfo seems like a reasonable choice. I think what people are hoping for is an agreement to use one particular term as part of DCAT, not just a statement that there are many options and we like one of them.
-Annette

On Nov 18, 2020, at 2:32 PM, Alejandra Gonzalez-Beltran notifications@github.com wrote:

Related to pav:version https://pav-ontology.github.io/pav/#d4e869, we should consider that PAV https://pav-ontology.github.io/pav/ (the Provenance, Authoring and Versioning vocabulary) is not a W3C recommendation.

The notes on owl:versionInfo https://www.w3.org/TR/owl-ref/#versionInfo-def indicate that "Although this property is typically used to make statements about ontologies, it may be applied to any OWL construct. For example, one could attach a owl:versionInfo statement to an OWL class. ". So, it seems a suitable existing property to cover the use case.

@agreiner https://github.com/agreiner said:

Failing to adopt any specific term seems like failing to address this use case.

Do you think that adopting owl:versionInfo is not appropriate for addressing the use?


The open-world assumption (OWA) in RDF means that we can't actually force anyone to do anything.
But I agree with @agreiner that we should make a recommendation wherever possible.
At the very least we need to provide worked examples showing the preferred approach for people to emulate.

Around 15 years ago I was involved in developing the Geography Markup Language (actually, several versions of it between 2001 and 2007). We hedged our bets and provided multiple alternative options everywhere, allowing for almost every variation that anyone asked for. Big mistake. It was a huge burden on data consumers (who had to be ready to accept anything) and almost no work for data providers (since they had inserted their existing model into the standard). And no one was happy, because there was just too much variation to allow data to be brought together. I learned my lesson.

@dr-shorthair:
I don't see any problem in recommending owl:versionInfo in the normative part under the class dcat:Resource; we can borrow terms from well-established vocabularies, as we did for dct, prov, and odrl terms. If we haven't yet included the new terms in the normative part, it is mainly because we want to hear comments to consolidate the direction before embarking on a major change, which includes modifying the DCAT schema figure.

I agree with @agreiner and @dr-shorthair's comments that we need to recommend specific terms and provide examples. As @riccardoAlbertoni hinted, I think that is the intention for DCAT3, while for this FPWD we want to hear comments about the approach and properties chosen.

owl:versionInfo is perfectly fine for versioning an ontology itself. PAV author here: in http://pav-ontology.github.io/pav/pav.rdf we even use it rather than "eat our own dog food" (as doing so would take it into OWL Full or pollute it with OWL annotation properties):

    <owl:Ontology rdf:about="http://purl.org/pav/">
        <rdfs:label xml:lang="en">Provenance, Authoring and Versioning (PAV)</rdfs:label>
        <owl:versionInfo rdf:datatype="&xsd;string">2.3.1</owl:versionInfo>
        <owl:versionIRI rdf:resource="&pav;2.3"/>
        <owl:priorVersion rdf:resource="&pav;2.2"/>
        <owl:backwardCompatibleWith rdf:resource="&pav;2.2"/>
        <owl:backwardCompatibleWith rdf:resource="&pav;2.1"/>
        <owl:backwardCompatibleWith rdf:resource="&pav;2.0/"/>
        <owl:backwardCompatibleWith rdf:resource="&pav;authoring/2.0/"/>
        <owl:backwardCompatibleWith rdf:resource="&pav;provenance/2.0/"/>
        <owl:backwardCompatibleWith rdf:resource="&pav;versioning/2.0/"/>
        <owl:incompatibleWith rdf:resource="http://swan.mindinformatics.org/ontologies/1.2/pav.owl"/>
        <!-- ... -->
    </owl:Ontology>

But for tracking versions of datasets, owl:versionInfo, owl:priorVersion, etc. become cumbersome, as they imply that you are making an ontology. Your dataset is not an OWL ontology; it is just described using ontologies.

In https://practicalprovenance.wordpress.com/2016/05/07/tracking-versions-with-pav/#organize we describe how you can also use PAV relations like pav:previousVersion to show the whole version hierarchy, if you like, including the unversioned "latest":

[Image: PAV has Current Version]
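
To make the dataset case above more concrete, here is a minimal sketch in RDF/XML of a version chain expressed with PAV. It assumes hypothetical dataset IRIs under http://example.org/ and made-up version values; only the properties pav:version, pav:previousVersion, and pav:hasCurrentVersion are taken from the PAV vocabulary. It is an illustration of the idea, not the modelling adopted by DCAT.

    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:dcat="http://www.w3.org/ns/dcat#"
        xmlns:pav="http://purl.org/pav/">
      <!-- Unversioned "latest" dataset (hypothetical IRI) pointing at its current version -->
      <dcat:Dataset rdf:about="http://example.org/dataset/air-quality">
        <pav:hasCurrentVersion rdf:resource="http://example.org/dataset/air-quality/1.2"/>
      </dcat:Dataset>
      <!-- Current version: carries its own identifier and links back to its predecessor -->
      <dcat:Dataset rdf:about="http://example.org/dataset/air-quality/1.2">
        <pav:version>1.2</pav:version>
        <pav:previousVersion rdf:resource="http://example.org/dataset/air-quality/1.1"/>
      </dcat:Dataset>
      <!-- Earlier version: the start of the chain in this sketch -->
      <dcat:Dataset rdf:about="http://example.org/dataset/air-quality/1.1">
        <pav:version>1.1</pav:version>
      </dcat:Dataset>
    </rdf:RDF>

None of the resources here is typed as an owl:Ontology, which is the point made above: the datasets are described with vocabularies without being treated as ontologies themselves.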

Thanks for your feedback, @stain .

We have a new draft of the versioning section, where we opted to focus on a more specific notion of version, and to build upon PAV for the specification of version history:

https://raw.githack.com/w3c/dxwg/dcat-versioning-v2/dcat/index.html#dataset-versions

@andrea-perego just a quick (formal) note: it looks a bit strange that this new text in PR #1295 presents the new dcat:version but still has an editor note (mentioning this issue) that seems not to acknowledge that DCAT has begun to move away from owl:versionInfo.

Thanks for pointing this out, @aisaac . As this is still a draft for discussion, a ref to all the relevant open issues has been kept, whereas EDNOTEs are used to explain how they are being addressed in the current version.

Thanks @andrea-perego . You're right that the reference to the issue should be kept if the idea is not to close this issue when the PR is merged. And this is tricky to handle from an editorial perspective. Would it be possible to have the EDNOTE right after the issue note (i.e., moving the issue note on 1271 elsewhere)? This could make the story easier to understand for readers.

Excellent, @andrea-perego !

No other issues were raised after PR #1295 was merged.

Closing.
