Dxwg: Review global domain axioms on dcat properties

Created on 8 Feb 2018 · 35 Comments · Source: w3c/dxwg

Axiomatizations in RDF vocabularies can have big implications for re-use.
In particular, providing an rdfs:domain for a property means that it may only be used in the context of that class - the corollary being that when it is used, there is an entailment that the subject of the statement is a member of the domain class. That limits re-use of useful properties.

We already agreed to relax the constraint on dcat:contactPoint #95 #97.
It is probably worth reviewing all the domain axioms in DCAT and dropping those that are not needed.
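
To make the entailment concrete, here is a minimal sketch of RDFS rule rdfs2 in plain Python. Triples are modelled as tuples of prefixed names, and `ex:service1`, `ex:Service` and `ex:helpdesk` are hypothetical resources invented for illustration; only the `dcat:contactPoint` domain axiom comes from DCAT 1.0.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def entail_domain_types(schema, data):
    """Apply rule rdfs2: (p rdfs:domain C) and (s p o) entail (s rdf:type C)."""
    domains = {}
    for s, p, o in schema:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, _ in data:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


# Domain axiom from DCAT 1.0, as discussed in this issue.
schema = {("dcat:contactPoint", RDFS_DOMAIN, "dcat:Dataset")}

# Hypothetical data: dcat:contactPoint re-used on something that is not a dataset.
data = {
    ("ex:service1", RDF_TYPE, "ex:Service"),
    ("ex:service1", "dcat:contactPoint", "ex:helpdesk"),
}

inferred = entail_domain_types(schema, data)
print(sorted(inferred))  # [('ex:service1', 'rdf:type', 'dcat:Dataset')]
```

Nothing is rejected here: the statement is accepted, and the service is additionally classified as a dcat:Dataset, which is probably not what the publisher meant.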

change-proposal dcat due for closing

dr-shorthair

All 35 comments

I am in favour of being as unrestrictive as possible with domain axioms. I would suggest that rather than "_dropping those that are not needed_" we could use an approach of "keeping only those that are absolutely necessary".

makxdekkers on 13 Feb 2018

👍1

dr-shorthair on 14 Feb 2018

Properties for which the domain axiom seems to be necessary are:

For dcat:Catalog: dcat:dataset, dcat:record
For dcat:Dataset: dcat:distribution
For dcat:Distribution: dcat:accessURL, dcat:downloadURL (are domain axioms really necessary here?)

I do note that by removing the domain axiom from dcat:mediaType, it becomes identical to its super-property dct:format -- the only difference then being the usage note recommending IANA media types for dcat:mediaType and anything else for dct:format.

For dcat:theme it is similar, although the difference with dct:subject remains because of the narrower range of dcat:theme (skos:Concept) versus dct:subject (undefined).

makxdekkers on 14 Feb 2018

arminhaller on 14 Feb 2018

I agree with the idea of "keeping only those that are absolutely necessary", and for those applications that do need more restrictions, I think they could be provided as DCAT 1.1 profiles. However, we might need to think about the implications in terms of reasoning that this added flexibility will have. Additionally, I think it would be important to link these changes to a relevant use case, and derive the requirement from that.

agbeltran on 14 Feb 2018

I also agree with Makx' proposal to drop domain axioms as much as possible. There are currently 12 non-deprecated properties in DCAT with a domain restriction (see below). Dropping the domain restrictions will make these properties more broadly reusable; although it may be difficult to think of specific use cases for them in advance.
Regarding the implications in terms of reasoning, dropping domain axioms would mean that it is no longer possible to infer class membership from the use of the property. I think most users of dcat would expect class membership to be stated explicitly in the metadata.

dcat:dataset    rdfs:domain dcat:Catalog. 
dcat:record rdfs:domain dcat:Catalog. 
dcat:themeTaxonomy  rdfs:domain dcat:Catalog.
dcat:contactPoint   rdfs:domain dcat:Dataset. #see also #95 
dcat:distribution   rdfs:domain dcat:Dataset. 
dcat:keyword    rdfs:domain dcat:Dataset.
dcat:landingPage    rdfs:domain dcat:Dataset.
dcat:theme  rdfs:domain dcat:Dataset.
dcat:accessURL  rdfs:domain dcat:Distribution.
dcat:byteSize   rdfs:domain dcat:Distribution.
dcat:downloadURL    rdfs:domain dcat:Distribution.
dcat:mediaType  rdfs:domain dcat:Distribution.
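
If class membership is no longer inferred, publishers have to state it. A minimal sketch of a check for that, assuming triples are modelled as plain tuples and using hypothetical `ex:` resources; this is an illustrative helper, not part of any DCAT tooling:

```python
RDF_TYPE = "rdf:type"


def untyped_subjects(data):
    """Return subjects that make statements but carry no explicit rdf:type."""
    typed = {s for s, p, _ in data if p == RDF_TYPE}
    return {s for s, p, _ in data if p != RDF_TYPE and s not in typed}


# Hypothetical metadata: ex:ds2 relies on the (dropped) domain axiom for its type.
data = {
    ("ex:ds1", RDF_TYPE, "dcat:Dataset"),
    ("ex:ds1", "dcat:keyword", "rainfall"),
    ("ex:ds2", "dcat:keyword", "temperature"),
}

missing = untyped_subjects(data)
print(missing)  # {'ex:ds2'}
```

With the domain axiom on dcat:keyword in place, ex:ds2 would be inferred to be a dcat:Dataset; without it, the record needs an explicit type statement.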

stijngoedertier on 14 Feb 2018

danbri on 14 Feb 2018

As discussed in today's meeting, I have created separate issues for each of the non-deprecated DCAT properties so that the discussion on each can be recorded discretely. dcat:contactPoint already has #95 and #109

dr-shorthair on 15 Feb 2018

Remaining to be discussed as part of this general issue is the possible impact on tools and applications of any changes to the constraints - as discussed in part in https://www.w3.org/2018/02/14-dxwgdcat-minutes#x06

dr-shorthair on 15 Feb 2018

We may also need to reflect the motivation for this discussion up into the UCR tracking.

dr-shorthair on 15 Feb 2018

As suggested by @philarcher https://github.com/w3c/dxwg/issues/131#issuecomment-366165887 maybe replace (or supplement) rdfs:domain and rdfs:range constraints with schema:domainIncludes and schema:rangeIncludes annotations (from schema.org)
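
A rough sketch of what that replacement could look like mechanically, with triples as plain tuples. The `keep` parameter and the choice to keep dcat:distribution are assumptions made for illustration, not a decision of the group. Unlike rdfs:domain and rdfs:range, the schema.org annotations trigger no RDFS entailment.

```python
# Map each RDFS axiom predicate to its schema.org annotation counterpart.
REWRITE = {
    "rdfs:domain": "schema:domainIncludes",
    "rdfs:range": "schema:rangeIncludes",
}


def relax_axioms(schema, keep=()):
    """Swap domain/range axioms for annotations, except on properties in `keep`."""
    relaxed = set()
    for s, p, o in schema:
        if p in REWRITE and s not in keep:
            relaxed.add((s, REWRITE[p], o))
        else:
            relaxed.add((s, p, o))
    return relaxed


# A few DCAT 1.0 axioms mentioned in this thread.
schema = {
    ("dcat:keyword", "rdfs:domain", "dcat:Dataset"),
    ("dcat:landingPage", "rdfs:range", "foaf:Document"),
    ("dcat:distribution", "rdfs:domain", "dcat:Dataset"),
}

# Hypothetical choice: keep the axiom on dcat:distribution, relax the rest.
relaxed = relax_axioms(schema, keep={"dcat:distribution"})
for triple in sorted(relaxed):
    print(triple)
```

Supplementing rather than replacing would simply mean returning the union of `schema` and `relaxed`.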

dr-shorthair on 19 Feb 2018

What exactly is meant by "dropping" and "relaxing"?

The properties already are specified in a W3C Recommendation:
https://www.w3.org/TR/vocab-dcat/

Is it intended to obsolete that spec? (How would that work?)

akuckartz on 19 Feb 2018

The Data exchange working group is chartered to revise DCAT.
https://www.w3.org/2017/dxwg/charter
W3C recommendations get updated from time to time in response to requirements from the community.

dr-shorthair on 19 Feb 2018

The most relevant section in the charter seems to be "DCAT 1.1":

An update and expansion of the current DCAT Recommendation. The new version may deprecate, but MUST NOT delete, any existing terms.

akuckartz on 19 Feb 2018

@dr-shorthair @philarcher I am a little uneasy about the idea of having two versions of DCAT, e.g. DCAT-loose where domains and ranges are only suggestions, and DCAT-tight where domains and ranges are axioms. As it is, DCAT version 1 has proven to be extremely useful and I don't understand why we seem to be drifting in a direction that goes toward fundamental redesign. Yes, there are things that are missing, things that don't work very well and things that need more explanation. I hope we can concentrate on making it better, and not try to create something different. For example, I have heard no-one in the implementer community I know asking to get rid of domains and ranges. I am really not in favour of relaxing the rules because some people violate them and then penalising people who did the 'right' thing.

makxdekkers on 19 Feb 2018

@akuckartz - deprecate but not delete - indeed! However, there has been no proposal that I've seen to delete (or even deprecate) any terms. The expectations around this are quite clear.

The issue on the table here is whether and how much to adjust the definitions of some existing terms. You have quite validly raised the question about whether any proposed changes are sufficient that the meaning has changed, so new URIs are required.

As mentioned at the top of the thread, this concern was triggered by the discovery that re-use of some of the DCAT properties is - almost certainly inadvertently - disallowed because global domain/range constraints were used ubiquitously in DCAT v1 - see #95 and #97 . This issue #110 is an 'umbrella' for a set of individual issues to examine each property, in turn, to verify if the global constraints are indeed appropriate.

My take on where we have got to so far:

@makxdekkers initially proposed that the starting assumption should be to _not_ have domain constraints, in order to enable maximum re-use https://github.com/w3c/dxwg/issues/110#issuecomment-365337292 . - this got quite a few +1s

@philarcher provided a bit of context around the use of global constraints, and maybe a hint about the desirability of moving away from that modeling style https://github.com/w3c/dxwg/issues/131#issuecomment-366165887 .

I have suggested that we might adopt a pattern recently used by the Spatial Data group, in which the vocabulary is packaged in multiple graphs, which allows users to get the level of axiomatization that they prefer https://github.com/w3c/dxwg/issues/110#issuecomment-366585293 and https://github.com/w3c/dxwg/issues/111#issuecomment-366556100 .

@makxdekkers sees some risks in this https://github.com/w3c/dxwg/issues/110#issuecomment-366769200

Several people have quite reasonably cautioned that any changes must be motivated by real documented requirements.

dr-shorthair on 19 Feb 2018

A contribution to this issue sent to the mailing list:

From: Peter.[email protected] [mailto:[email protected]]
Sent: Monday, 19 February, 2018 22:40

Is there a problem with "dropping" / "relaxing" axiomatization that might lead to a problem similar to the Anemic Data Model antipattern in DDD (see https://martinfowler.com/bliki/AnemicDomainModel.html ), i.e. that axiomatization provides benefits from reasoning, and that one would have all the 'costs' of marking up the data appropriately but none of the benefits from the reasoning?

If the answer is to include the axiomatization in the application profile, then isn't this similar to the 'service layer' that Fowler discusses?

dr-shorthair on 19 Feb 2018

As @philarcher pointed out in https://github.com/w3c/dxwg/issues/131#issuecomment-366165887 there has been a tendency lately to stay clear of domain/range restrictions in ontologies, since it is hard to anticipate consumer behaviour, i.e. the way the ontology is used. In the past, global constraints have been used to impose a certain use of properties on the user. However, it has been shown in many applications (particularly in federated querying and search on the semantic Web, e.g. [1]) that these global constraints lead to undesired issues (naïve users tend to think of domain and range restrictions as constraints, too). Guarded local constraints achieve a similar purpose (i.e. suggesting a specific use of a property) without the undesired side-effects of global constraints. They should be accompanied by better documentation of the property, which can include domain/rangeIncludes annotation properties. Deprecating domain/range restrictions (i.e. relaxing the semantics) cannot, however, "break" any current use of DCAT. The only way it can have an impact is if an application wrongly relies on domain/range restrictions only for auto-completion or faceted browsing. We agreed in our last phone conference that we need to investigate whether that is the case with current DCAT implementations.
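
One possible reading of a "guarded local constraint" is an owl:allValuesFrom-style restriction attached to a class, so that the range only applies when the subject is a member of that class. The sketch below illustrates that reading with plain tuples; the `(guard_class, property, value_class)` encoding and the `ex:` resources are assumptions for illustration, and the check only looks at explicit rdf:type statements (no subclass reasoning).

```python
RDF_TYPE = "rdf:type"


def entail_local_ranges(local_ranges, data):
    """Infer (o rdf:type V) only where s is explicitly typed as the guard class.

    local_ranges: set of (guard_class, property, value_class) tuples.
    """
    types = {}
    for s, p, o in data:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in data:
        for guard, prop, value_cls in local_ranges:
            if p == prop and guard in types.get(s, ()):
                inferred.add((o, RDF_TYPE, value_cls))
    return inferred


# The v:Kind range applies to dcat:contactPoint only on dcat:Dataset instances.
local_ranges = {("dcat:Dataset", "dcat:contactPoint", "v:Kind")}

# Hypothetical data: one dataset and one service both use dcat:contactPoint.
data = {
    ("ex:ds1", RDF_TYPE, "dcat:Dataset"),
    ("ex:ds1", "dcat:contactPoint", "ex:alice"),
    ("ex:svc1", RDF_TYPE, "ex:Service"),
    ("ex:svc1", "dcat:contactPoint", "ex:helpdesk"),
}

inferred = entail_local_ranges(local_ranges, data)
print(sorted(inferred))  # [('ex:alice', 'rdf:type', 'v:Kind')]
```

The statement about ex:svc1 is left alone, so the property stays re-usable outside dcat:Dataset while datasets still get the intended typing.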

I also agree that there is no need for a separate namespace. However, there can be a lightweight profile of DCAT in a separate namespace that the new version of DCAT imports and extends (under the old DCAT namespace). We went down this route with SOSA/SSN and I am happy to outline our reasoning and design choices in the telco.

[1] http://ceur-ws.org/Vol-628/ldow2010_paper04.pdf

arminhaller on 20 Feb 2018

However, there can be a lightweight profile of DCAT in a separate namespace

... or in the _original_ namespace, particularly if the lightweight description is strictly just 'annotations'.

dr-shorthair on 20 Feb 2018

I've just looked through the current DCAT vocabulary, seeing if there are any domain/range declarations that look potentially restrictive. I found very few things to worry about. That is, the ones that are present almost all seem to be entirely untroublesome. For example, it seems to me that declaring the range of dcat:distribution as dcat:Distribution is entirely sensible. Many of the properties have a domain of dcat:Dataset, dcat:Catalog or dcat:Distribution - and they don't look wrong.

Where there might be room for change - by which I mean relaxation - is in the ranges of
dcat:contactPoint (v:Kind)
dcat:landingPage (foaf:Document)
dcat:mediaType (dct:MediaTypeOrExtent)

I'm not advocating relaxation of these, just saying that these are the only ones that are currently defined that look like potential candidates for discussion. My point was only that if DCAT declares domains and ranges that can be shown to have been harmful or that are widely ignored, OK, relax them. And, in general, only apply domains and ranges where they actually help clarify the model, otherwise, leave it open. I'm not advocating relaxing rules for the sake of it.

Add in the properties that take skos:Concept as a range to my short list and I realise that the thing they have in common is that they're all properties in the dcat namespace that declare a range outside of it - and that may or may not be how they're used in the wild.

As @makxdekkers says, DCAT is used widely already. The main task is to add in the terms that usage has shown are lacking. In doing that, and in addressing new use cases, one might find a reason to add in an extra layer of semantics - no problem with that as the SOSA/SSN experience shows.

And as @arminhaller points out, domains and ranges aren't actually restrictions, rather they can allow computers to infer the type of resource. So if you see dcat:landingPage, you can infer that its object is a foaf:Document - and Web pages certainly fit that description.

Where semantics might get more tricky, as recent discussions show, is around things like PROV. That seems more like the area where an additional graph might be handy.

philarcher on 20 Feb 2018

@philarcher - the case that triggered this discussion was #95, where a domain is provided for dcat:contactPoint which limits its re-use in the context of individuals of any other class (or more strictly, it entails that when it is used, the subject is a dcat:Dataset). Multiple domain values are combined with AND not OR, so you can't easily add further classes into the domain.
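
A small sketch of the AND behaviour, with triples as plain tuples. The second domain axiom (`ex:Service`) and the `ex:` resources are hypothetical, added only to show what happens if someone tries to widen the domain by declaring another class.

```python
RDFS_DOMAIN = "rdfs:domain"


def types_entailed_for_subject(schema, statement):
    """Every rdfs:domain value of the statement's property applies to its subject."""
    _, p, _ = statement
    return {cls for prop, pred, cls in schema if prop == p and pred == RDFS_DOMAIN}


schema = {
    ("dcat:contactPoint", RDFS_DOMAIN, "dcat:Dataset"),
    ("dcat:contactPoint", RDFS_DOMAIN, "ex:Service"),  # hypothetical extra domain
}

statement = ("ex:service1", "dcat:contactPoint", "ex:helpdesk")
entailed = types_entailed_for_subject(schema, statement)
print(sorted(entailed))  # ['dcat:Dataset', 'ex:Service']
```

The subject is entailed to be a member of both classes at once, not of one or the other.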

The suspicion was that there may be other cases in DCAT where an innocent looking global domain constraint would limit its re-use elsewhere, so it was agreed (in the last meeting) to create a complete set of issues for the DCAT properties to trigger evaluation of these.

dr-shorthair on 20 Feb 2018

That makes sense @dr-shorthair, of course. My fear was that this would become a general cry to 'burn all domains and ranges', which would, I think, risk throwing away those that are actually helpful.

philarcher on 20 Feb 2018

PR https://github.com/w3c/dxwg/pull/140 adds an RDF file for DCAT-schema.org alignment, which has domainIncludes and rangeIncludes annotations matching all domain and range axioms from DCAT 1.0

dr-shorthair on 1 Mar 2018

Of course any domain/range constraints that are dropped in the revision would be retained in the DCAT 1.0 profile - e.g. see https://github.com/w3c/dxwg/blob/gh-pages/dcat/rdf/dcat10.ttl

dr-shorthair on 19 Mar 2018

130 and ID51 are the underlying motivation for this issue and the set of individual issues generated on Feb 15th

dr-shorthair on 19 Mar 2018

@dr-shorthair I am not sure I understand the approach. In https://github.com/w3c/dxwg/blob/gh-pages/dcat/rdf/dcat10.ttl it notes that the imported namespace http://www.w3.org/ns/dcat continues to contain all existing axioms. Under which namespace URI would the new version with modified axioms be published, and how would it refer to existing properties with different axioms?

makxdekkers on 19 Mar 2018

👍1

I also noticed this line in https://github.com/w3c/dxwg/blob/gh-pages/dcat/rdf/dcat10.ttl

@prefix dcat10: <http://www.w3.org/ns/dcat10#> .

This namespace is new. I do not think that I like the creation of a new namespace for the old version and the use of the old namespace for the new version.

akuckartz on 19 Mar 2018

We can defer the namespace question for now.
This RDF file is a simple approach to building a 'profile' which merely imports the current DCAT, and adds the axioms that were removed back in.
It is essentially just bookkeeping until the preferred profile mechanism is determined.

Pending the formal publication of a DCAT 1.0 profile, people can get DCAT 1.0 by using this graph.

dr-shorthair on 19 Mar 2018

BTW - the 'new namespace' @prefix dcat10: <http://www.w3.org/ns/dcat10#> . merely identifies this graph or file. There are no classes or properties in this namespace.

dr-shorthair on 20 Mar 2018

@dr-shorthair I am still trying to understand how changing domains and/or ranges would work in terms of RDF schemas.
The file at https://github.com/w3c/dxwg/blob/gh-pages/dcat/rdf/dcat10.ttl owl:imports <http://www.w3.org/ns/dcat>. Doesn't that mean that it imports all the axioms in the RDF namespace? How could a new profile, e.g. DCAT 1.1, reuse the URIs in the existing namespace but with different domain/range axioms?

makxdekkers on 25 Mar 2018

👍1

It imports every triple (axiom) in dcat 1.1, and adds additional axioms (i.e. the ones that were removed relative to dcat 1.0).
Since we have added nothing to dcat 1.1 yet, at present you end up with pretty much what we had before - i.e. all the axioms that were in dcat 1.0.

Moving forward it'll likely get more complicated, but for now the dcat10 file is mostly bookkeeping, in a formalized way.
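
As a rough illustration of the bookkeeping, the sketch below merges a profile graph with everything it reaches through owl:imports. Graphs are sets of plain tuples keyed by name; the key `ex:dcat10-profile` and the graph contents are stand-ins invented for this example, not the actual contents of dcat10.ttl.

```python
OWL_IMPORTS = "owl:imports"


def imports_closure(root, graphs):
    """Union of the triples in `root` and in every graph reachable via owl:imports."""
    seen, stack, merged = set(), [root], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # guard against import cycles
        seen.add(name)
        triples = graphs.get(name, set())
        merged |= triples
        stack.extend(o for _, p, o in triples if p == OWL_IMPORTS)
    return merged


DCAT = "http://www.w3.org/ns/dcat"
PROFILE = "ex:dcat10-profile"  # hypothetical name for the DCAT 1.0 profile graph

graphs = {
    # Revised vocabulary: assume the domain axiom on dcat:contactPoint was dropped.
    DCAT: {("dcat:contactPoint", "rdf:type", "rdf:Property")},
    # Profile: imports the revised vocabulary and adds the dropped axiom back.
    PROFILE: {
        (PROFILE, OWL_IMPORTS, DCAT),
        ("dcat:contactPoint", "rdfs:domain", "dcat:Dataset"),
    },
}

profile_view = imports_closure(PROFILE, graphs)
revised_view = imports_closure(DCAT, graphs)
axiom = ("dcat:contactPoint", "rdfs:domain", "dcat:Dataset")
print(axiom in profile_view, axiom in revised_view)  # True False
```

A consumer who loads only the revised graph gets the relaxed semantics; one who loads the profile gets the DCAT 1.0 axioms as well, with the same property URIs in both cases.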

dr-shorthair on 25 Mar 2018

See https://www.w3.org/TR/owl2-syntax/#Imports

dr-shorthair on 26 Mar 2018

One open issue (#125) and a related issue (#313) remain outstanding at publication of 2PWD, so this issue has been removed from the milestone. All others have been completed for 2PWD.

davebrowning on 20 Sep 2018

Remaining or associated work here (#113, #125, #313, #317) is either due for closing or future work, so I will mark this as due for closing as well, since the specific issue does seem to have been addressed.

davebrowning on 25 Sep 2019

Specific open aspects are registered in other open issues which are linked to this issue, so this can be closed without harm.