The representation of Usage notes found in the rec document as skos:scopeNote elements in the RDF representation (i.e. the .ttl file) is incomplete for terms from external namespaces (mostly DCT). This should be remedied. It requires an entry for each term that we use that includes only a skos:scopeNote for each Usage note seen in the HTML document.
Maybe also the See also entries that we have included in the normative vocabulary description (i.e. the tables in chapter 6)
And the Domain and Range where they are different (tighter scope) than in the original namespace.
This is a relatively mechanical task but still needs executing.
During the translation to Czech, I came across the annotations of external vocabularies and there is an issue with this approach. We SHOULD NOT annotate external properties and classes (such as DCTERMS) with DCAT specific texts, as those statements then apply to ALL instances of those properties, regardless of DCAT.
Example: I have an endpoint where I use dcterms:issued for the date a book was issued in some library dataset. Now I use the same endpoint to store some DCAT metadata, and I load the DCAT vocabulary there. Suddenly, when someone looks at a record about a book, they see, for instance, "This indicates the date of listing the dataset in the catalog and not the publication date of the dataset itself.", which makes no sense in this context.
Therefore, if we absolutely need to state this information about the properties, we would have to create DCAT subproperties for everything and make the statements about them. Otherwise we should refrain from stating specific things about external vocabularies to avoid creating mess.
The argument that "the user can choose to separate DCAT vocabulary with those annotations into a separate graph" IMHO does not hold, as we should not presume any particular named graph use case.
In addition, this problem can be seen even in the context of DCAT alone, where we have e.g.:
dct:issued skos:definition "Date of formal issuance (e.g., publication) of the item."@en .
dct:issued skos:definition "Date of formal issuance (e.g., publication) of the distribution."@en .
dct:issued skos:definition "The date of listing (i.e. formal recording) of the corresponding dataset or service in the catalog."@en .
with no way of distinguishing which definition goes with which usage of the property.
Perhaps what this also shows is that "definition" is the wrong property for that information. Each usage community will have its own business rules for a property; maybe we need a property that specifically indicates the local definition and the local business rules, separate from the definition at the origin. In the case of DCMI Terms, there are comments that give recommendations for usage that are separate from the definitions. Users of DCMI Terms are not bound by those recommendations; adding their own guidance for usage seems logical, as long as there is a property that distinguishes it as belonging to their usage.
As for the different uses within a single dataset, this is a problem that may be solved with the use of "shapes" over RDF graphs. RDF graphs are not good at providing context for properties, while shapes are. How one adds the shapes concept to something like DCAT, perhaps with SHACL or ShEx, is yet another specification.
@jakubklimek I disagree. Following the RDF OWA 'anyone can say anything about anything' these annotations apply to these elements _in this context_. The annotations are in a file (graph) called 'dcat.ttl' so the context is clear. You should only load this graph when you want the DCAT context. You say that you have multiple applications accessing 'the same endpoint' so you are trying to switch context in one place (the application) while not also switching context in the RDF graph that you are accessing. You are attempting to shortcut setting the right context and that is what is creating the unexpected side-effect.
> adding their own guidance for usage seems logical, as long as there is a property that distinguishes it as belonging to their usage
@kcoyle I agree, it is important to agree that we want to say "this is how this property is used in DCAT", not "this property means something DCAT-specific universally". There is a huge difference. And I don't think SHACL nor ShEx will help us here - they work on top of existing data, i.e. when the damage is already done.
@dr-shorthair And I disagree with your argument in multiple points.
- "anyone can say anything about anything" holds universally for RDF - anyone can have a statement about anything they want in their data. But this is not OWA.
- [...] "anyone can say anything about anything". OWA says "The fact that there is not a particular statement does not entail that the statement is false, it is just unknown." Translated to the example given above, when DCAT says dcterms:issued is formal issuance of distribution, other people using dcterms:issued may not know about this, but it is still out there. And there is no machine-readable description of "this is how DCAT uses it". It just reads "this is what dcterms:issued is for everyone".
- [...] "what is in this file is just a view of DCAT". If we view the data model just as triples, then all triples stated everywhere hold. According to OWA, we may just not know about them yet. But they still should be semantically correct; otherwise, when we get to know them, we get an error/conflict. In this way, I think there is no such thing as a context.
- [...] trig, not ttl, so that everyone gets the same graph IRI. In addition, there should be some machine-readable description of what this graph means.
- [...] dcat.ttl file, as illustrated above - there is no machine-readable way to distinguish which annotation goes with which usage of the property, creating mess. For example, if my application is a simple RDF browser which, in addition, displays skos:definitions for properties used on instances, I will see "The date of listing (i.e. formal recording) of the corresponding dataset or service in the catalog."@en among other skos:definitions on an instance of dcat:Distribution. The application has no way of knowing the correct context of those annotations.

> You say that you have multiple applications accessing 'the same endpoint' so you are trying to switch context in one place (the application) while not also switching context in the RDF graph that you are accessing.
> You are attempting to shortcut setting the right context and that is what is creating the unexpected side-effect.
I am saying an application has no way of knowing how to switch contexts, as those are not described in a machine-readable way.
IMHO it still holds that when DCAT says something about the global dcterms:issued, it should be something applicable in all contexts where dcterms:issued is used, and therefore, should not do it. The other users, when not using DCAT, may not know about that statement. But when they do, it should not introduce mess/conflicts.
Another example to illustrate my point:
If I say dcterms:title rdfs:label "Dataset title", I am saying _The globally used property dcterms:title has a label "Dataset title" and all that use it can also use this title_, which is incorrect. What I wanted to say is _I want to use "dcterms:title" in DCAT as "Dataset title"_, which is something different. And there are 2 ways of saying that. Either create a subproperty, which is truly a Dataset title, and not a generic Title, or devise some other way of saying that, e.g. having an "annotation" entity, linked to dcterms:title and to the usage note coming from DCAT. How about using the Web Annotation Vocabulary for that?
I also think this is quite a fundamental disagreement and I would like to get views on this from the group, @makxdekkers ?
@jakubklimek @dr-shorthair I can see good points in both your opinions. I agree with @dr-shorthair that people who load dcat.ttl are obviously interested in that particular context, but I agree with @jakubklimek that it is dangerous to make any 'local' assertions about other peoples' classes and properties for the reasons he outlines. I myself made the error once to make an assertion (really a cut-and-paste error) about an external property that created a conflict with the canonical definition, with obvious interoperability consequences.
I have looked back at the Turtle expression of DCAT 2014 and saw that there only the DCAT classes and properties were included, and none of the terms of 'external' vocabularies. Maybe that is the safest way to proceed, removing all non-DCAT terms from the ttl?
I would either remove the non-DCAT terms or I would investigate the option of using the Web Annotation Vocabulary for that, if possible (I am not sure, I have not used it yet).
The problematic scope note on dct:issued is for its use as a property on dcat:CatalogRecord, (see the text, section 6.5.3). Because in the ttl file, the property context is not noted, the scope note should be revised:
skos:scopeNote "When used as a property of dcat:CatalogRecord, this property indicates the date of listing the dataset in the catalog and not the publication date of the dataset itself."@en ;
> when DCAT says something about the global dcterms:issued, it should be something applicable in all contexts where dcterms:issued is used, and therefore, should not do it
I disagree strongly. It is completely legitimate to add annotations to any class or property from any namespace, to indicate expectations _in a particular context_. The context here is DCAT. That is clear from the fact that this annotation appears in the RDF representation of DCAT, as packaged in an artefact called dcat.ttl and distributed by W3C. What is done with that information in a particular application context is up to the application developer, who should be careful to load the graph they need for the application they are building. There are many technical ways to achieve this, which are not the point here (named-graphs are just one possibility).
The 'danger' is introduced by an application developer using a graph constructed for one application in a different context. It is the responsibility of the application builder to load the graphs that are needed for their application.
Using an RDF property or class from someone else's namespace and adding some rules about how it should be used in a specific context is part of the RDF methodology. We have a lot of specific usage notes relating to dct, foaf, prov ... resources. I would be OK for them to be all moved to skos:scopeNote rather than skos:definition (I think that is @kcoyle suggestion), but not to remove them altogether from the RDF representation _in the file called dcat.ttl_.
If you can't add annotations for your context that greatly limits the concept of application profiles. If dct:title is "The name given to the resource" and if your context is that the resource is a dataset, then you will want to convey to users of your metadata specifically that this is the name given to the dataset, as "resource" may not express that well. DCMI Terms give this information in an rdfs:comment, and the wording is consciously broad to allow reuse.
Also, I have often thought that the RDF language-tagged strings do not cover all of the common use cases, in that one not only has a desire to give the user the language of their choice but in many instances there is additional context that could be used to select display forms. In the educational arena, terminology changes based on the presumed reading level of the audience, and you may not want to address early readers with the same concept names as university students. Having a "context" tag would help, although how it would function within RDF is something yet to be worked out.
> there is no machine readable way to distinguish, which annotation goes with which usage of the property, creating mess.
@jakubklimek you have a good point here. There are a small number of cases where the usage guideline is different depending on context, even within DCAT.
Yes, the web-annotation vocabulary might provide a platform - I've just skimmed it and can see the general intention. It might be interesting to develop some patterns relating to our use case here - perhaps you could propose something. This could contribute to implementation of the more general 'profile' case mentioned by @kcoyle. Mind you, OA appears to be mostly aimed at applications to annotate text-ish resources, and I'm not sure if users who loaded dcat.ttl would find or notice annotations nested in this way.
In the short term, perhaps we could provide some assistance to users by inserting the phrase "In the context of a DCAT {classname}, ..." at the beginning of each textual definition or scope note.
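As a rough sketch of this short-term suggestion, the context-dependent notes on dct:issued could each name the DCAT class they apply to. The wording below is adapted from the texts quoted earlier in this thread and is an assumption, not the text adopted in dcat.ttl; the use of skos:scopeNote is likewise only illustrative.

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Illustrative only: one note per DCAT class, so that a human reader can tell
# which usage each text belongs to even after graphs are merged.
dct:issued
    skos:scopeNote
        "In the context of a DCAT dcat:Distribution, the date of formal issuance (e.g., publication) of the distribution."@en ,
        "In the context of a DCAT dcat:CatalogRecord, the date of listing the dataset in the catalog and not the publication date of the dataset itself."@en .
```

The prefix helps human readers only; to a machine these are still two undifferentiated literals attached to the same property.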
> I disagree strongly. It is completely legitimate to add annotations to any class or property from any namespace, to indicate expectations in a particular context. The context here is DCAT.
@dr-shorthair On a philosophical level, I agree that it is legitimate. Semantically, the way it is done now seems wrong to me, because there is no machine-readable way of determining that the context is DCAT. The fact that the triples reside in a particular file is lost the moment the data is loaded into a triplestore and I do not think that leaving this to application developers is a good idea. Besides, the data should be usable and self-explanatory even without applications, e.g. for analysts using SPARQL, etc. That is why it is so important to keep it clean.
On that note, limiting all those texts to skos:scopeNote as @kcoyle suggested and prefixing them with "In the context of a DCAT {classname}, ..." seems far more reasonable, as it does not contradict anything even when merged with other vocabularies.
Having said that, I would prefer all that was said here to be machine-readable in the RDF file, i.e. that all this applies in the context of DCAT, or, in other words, that the creators of DCAT are saying that a particular property is used in a certain way in the context of DCAT.
I agree that OA seems to be aimed mostly at annotation of text-ish resources, however, I think it can be applied nicely here. E.g. instead of saying:
dct:issued skos:scopeNote "Date of formal issuance (e.g., publication) of the distribution."@en .
why not say, following this example:
<http://example.org/anno49> a oa:Annotation ;
    oa:hasBody [
        a oa:TextualBody ;
        rdf:value "Date of formal issuance (e.g., publication) of the distribution."@en ] ;
    oa:hasTarget dct:issued ;
    oa:motivatedBy oa:commenting ;
    dcterms:creator <https://some.iri.for.dcat.working.group> .
In addition, these statements could be in a separate file from the core dcat.ttl, e.g. dcat-annotations.ttl, so that everyone can decide whether or not they are interested in them.
@jakubklimek I do not see how using OA solves the context issue? The Turtle snippet in your comment appears to be about dct:issued in general, not just about the DCAT context. The fact that the annotation is attributed to the DCAT WG might be a helpful piece of information, but nowhere is it stated (machine-understandably) that the annotation only applies to DCAT. Or am I missing something?
@makxdekkers This is true. It just explicitly attributes the annotation to the DCAT WG, which is, nevertheless, better than human-readably saying that everything in dcat.ttl is in the context of DCAT, in my opinion. I was just trying to suggest a bit more acceptable alternative to the direct annotation of dcterms:issued with no contextual info.
If we were to focus on the machine-readable expression of something applicable to dcterms:issued in the scope of DCAT, I do not see any ideal solution now. Here is what I see as options:
1. [...] dcat:Distribution and see the generic "Date of formal issuance (e.g., publication) of the resource." description of dcterms:issued, instead of a more specific explanation such as "Date of formal issuance (e.g., publication) of the distribution."
2. [...] dcterms:issued (and others similarly), e.g. dcat:distributionIssued rdfs:subPropertyOf dcterms:issued; dcat:distributionIssued rdfs:comment "Date of formal issuance (e.g., publication) of the distribution."@en. This allows us to write all we want to about the property (clearly in the DCAT context), while maintaining the semantic relation to generic dcterms:issued. The downside here is the added overhead and the need for users to understand the concept of subproperties. Nevertheless, this is the way we do it in the Czech Republic when creating RDF datasets when we need to say something specific in a context.
3. [...] dcterms:title and only a human will be able to understand what.

Maybe there are some other options I do not see now, but the original solution just does not seem right to me.
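The subproperty approach mentioned above could look roughly like this in Turtle. This is a sketch only: dcat:distributionIssued is the hypothetical name used in the comment and is not a term defined by DCAT, and the example.org distribution IRI and the date are made up for illustration.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical DCAT-specific subproperty: the DCAT wording is attached here,
# so nothing is asserted about dcterms:issued itself.
dcat:distributionIssued
    a rdf:Property ;
    rdfs:subPropertyOf dcterms:issued ;
    rdfs:comment "Date of formal issuance (e.g., publication) of the distribution."@en .

# Made-up instance data using the subproperty.
<https://example.org/distribution/1>
    a dcat:Distribution ;
    dcat:distributionIssued "2019-07-01"^^xsd:date .
```

Under RDFS subproperty entailment, the last statement also implies the same date via dcterms:issued, so consumers that only know dcterms:issued can still find it, provided they apply that inference. That dependency on inference is part of the overhead noted above.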
Maybe a solution lies between 1 and 2?
In cases where the usage note in the DCAT-context doesn't really have a lot of additional value -- as in the replacement of 'resource' by 'distribution' -- I would opt for 1: keep the information in the human-readable part of the specification, and delete it from the TTL file.
In cases, where the usage note really makes a difference, in the sense that people would otherwise not understand how to use the property, I would opt for 2 and create a sub-property.
I haven't looked in any detail in the specification how many of the latter we have, but I have the impression that most are in the former category. At least, in DCAT 2014, option 1 was chosen: all usage notes for 'external' terms are only in the HTML, not in the TTL.
I don't think OA solves the problem, there doesn't seem to be any way to indicate that the annotation is about the annotation target in some context, e.g.
issued as a property of CatalogRecord
vs.
issued as a property of Distribution
and that is what would be needed for a machine readable representation.
> there is no machine-readable way of determining that the context is DCAT. The fact that the triples reside in a particular file is lost the moment the data is loaded into a triplestore ...
@jakubklimek That's the application-builder's choice. For example, they could load the data from dcat.ttl into a named-graph in a quad-store, which would differentiate it from all the other data in the store.
I agree with you and @makxdekkers that, if the semantics are really being constrained, then strictly there is a sub-property involved. But I think we all see the interoperability risks of making that sub-property explicit, by giving it a new name.
Simply making the annotations on the re-used properties a bit more explicit is the pragmatic solution. It is not machine-actionable. But annotations are 'informal' information, for human consumption only, so a for-human-consumption-only _solution_ is matched to the problem.
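OK, to sum up, the proposed solution now is to:
- change all properties attaching texts to properties and classes of external vocabularies to skos:scopeNote or skos:usageNote (i.e. no skos:definition)
- start all such texts with "In the context of DCAT 2.0 class , ..." so that the context is humanly understandable

correct?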
I am preparing a PR implementing this solution.
> OK, to sum up, the proposed solution now is to:
> - change all properties attaching texts to properties and classes of external vocabularies to skos:scopeNote or skos:usageNote (i.e. no skos:definition)
> - Start all such texts with "In the context of DCAT 2.0 class , ..." so that the context is humanly understandable
>
> correct?
@davebrowning Actually, I do not think so. There are still skos:definitions while we agreed to change them to skos:usageNote and skos:scopeNotes.
:smile: That's what question marks are for - I was going to check when I had a moment.....
@jakubklimek wrote:
> @davebrowning Actually, I do not think so. There are still skos:definitions while we agreed to change them to skos:usageNote and skos:scopeNotes.
As discussed in the related PR #1010, we have decided to keep the skos:definition when we include the original definitions from third-party vocabularies. See https://github.com/w3c/dxwg/pull/1010#issuecomment-519081757.
So unless there are some oversights, I think we can close the issue.
Yes - the local definitions are now all in skos:definition props, but they are now clearly scoped to DCAT in the text. I believe the elements are all correctly annotated now, and the context of each annotation is clear.
@jakubklimek is correct - at one point we mooted moving all the annotations into skos:scopeNote, but when I was executing the changes, I realised that the text prefix resolved the problem better and allowed the defs to stay in skos:definition props.
I believe we can now close this issue.
@riccardoAlbertoni @dr-shorthair I don't see a discussion related to what was discussed here (i.e. skos:definition vs others) in https://github.com/w3c/dxwg/pull/1010#issuecomment-519081757.
The current state concerns me for the following reasons:
- [...] skos:definitions for external vocabularies in one language - one original, one "In scope of DCAT" - this again seems like creating mess.
- [...] skos:definitions without a language tag, which is an anti-pattern. All natural language texts in RDF should have a language tag. This might be an oversight, but definitely needs to be fixed, if the next point is not fixed.
- [...] skos:definitions in DCAT, when they are exactly the same. I do not think DCAT has the mandate to define anything for items from external vocabularies - this again seems like creating mess.

@dr-shorthair This again comes down to our disagreement about the scope of statements published in RDF vocabularies.
dct:conformsTo
skos:definition "An established standard to which the described resource conforms." ;
skos:definition "Un estándar establecido al cuál se ajusta el recurso descripto."@es ;
skos:definition "In the context of DCAT 2.0 dcat:Distribution, an established standard to which the distribution conforms."@en ;
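During the discussion, two remedies were proposed, namely
1. moving the texts attached to external terms from skos:definition to skos:scopeNote or skos:usageNote;
2. starting each such text with "In the context of DCAT 2.0 ..." so that the context is humanly understandable.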
My judgement is that 2. is a better solution, and also removes the need to do 1. Furthermore, as the text in the skos:definition reflects the text in the document, which is labeled "Definition", this makes the RDF representation consistent with the document. Scope-note is _not_ the same as definition.
The ability to have repeated properties is a 'feature' of RDF, not a 'bug'. Where there is more than one definition, the ones other than the originals are now clearly scoped to DCAT (in some cases to a specified class), and by language. The vocabularies that we are re-using (DCT, FOAF, PROV, ODRL) don't all use the same way to record the definitions in the RDF representation, but now DCAT at least has adopted a uniform approach (using skos:definition).
The goal here is for users who only load dcat.ttl to have all the annotations visible to them. This solution achieves that. If there is repetition with triples from other RDF graphs, then the merged RDF graph does not create duplicates, so there is no harm done. I don't see how this is 'creating mess', unless you are running a strange RDF API.
(Two missing language tags is a small error that is corrected with #1034 .)
Though the examples in the SKOS reference and primer use just one skos:definition per language, no integrity conditions state such an expectation (contrary to what the SKOS documents state for other properties such as skos:prefLabel).
I understand the perplexity raised by @jakubklimek, but for this specific case, in which we prefix the text with a note about the context, I support the solution proposed by @dr-shorthair.
@dr-shorthair One issue I have with the approach is that while I am OK with using skos:definition for DCAT classes and properties in scope of DCAT, as DCAT clearly defines them, I am not OK with using skos:definition with classes and properties from reused vocabularies, simply because their meaning is not DCAT's to define.
What happens if, hypothetically, after DCAT becomes Recommendation, the reused vocabularies get updated, start using skos:definition with other values? Then we are back where we started - with conflicting repeated properties coming from the reused vocabulary and from DCAT, with the only possibility of separating the statements by somehow knowing they came from dcat.ttl, which, IMHO, is not a widely used pattern.
I was more OK with using skos:usageNote or skos:scopeNote prefixed with "In DCAT ..." for that, but not skos:definition.
Regarding multiple skos:definitions per DCAT class or property - when I define something, I do it so that it is clearer what is meant. When a thing has more than one definition, its meaning is automatically less clear than when it has just one. I know there is nothing prohibiting us from having multiple definitions per class or property, I am just saying that the usability and understandability of such definitions is reduced by this for no good reason. For instance, what is prohibiting us from just merging the definitions into one literal?
Finally, you say the goal is for people who just want to load dcat.ttl to see everything. But this is prohibiting me from just loading definitions of DCAT classes and properties easily. In contrast, if the files were split into dcat.ttl with only DCAT things and dcat-ext.ttl or similar with annotations of external vocabulary things, both use cases would be possible, mine with only DCAT, and "complete" with both files merged.
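A minimal sketch of what the second, optional file in this proposal might contain, assuming the name dcat-ext.ttl from the comment above. The ontology IRI is a placeholder, and the use of skos:scopeNote reflects the preference expressed in this comment rather than an agreed outcome; the note text is the DCAT-scoped wording quoted earlier for dct:conformsTo.

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Placeholder IRI for the optional annotation file (hypothetical).
<https://example.org/dcat-ext>
    a owl:Ontology ;
    rdfs:comment "Optional DCAT-specific annotations on terms from external vocabularies."@en .

# Only DCAT-scoped notes on external terms live here;
# DCAT's own classes and properties stay in dcat.ttl.
dct:conformsTo
    skos:scopeNote "In the context of DCAT 2.0 dcat:Distribution, an established standard to which the distribution conforms."@en .
```

Loading only dcat.ttl would then give just the DCAT-defined terms, while merging both files would give the "complete" view.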
+1 for skos:usageNote -1 for skos:definition
+1 for an ontology of just DCAT-defined elements in DCAT namespace and a separate vocabulary file of the entire DCAT specification. To me this makes the discovery and re-use of DCAT-defined terms clearer. It separates the origin of elements from their use in a specific context.
On this splitting of the vocab file - which one is dcat.ttl, the ontology of just DCAT-defined elements in DCAT namespace or the separate vocabulary file of the entire DCAT spec?
Since the 2014 dcat.ttl includes just DCAT-defined elements, I'd assume we keep it that way....?
@davebrowning
> On this splitting of the vocab file - which one is dcat.ttl, the ontology of just DCAT-defined elements in DCAT namespace or the separate vocabulary file of the entire DCAT spec?
> Since the 2014 dcat.ttl includes just DCAT-defined elements, I'd assume we keep it that way....?
+1
In my opinion, the second file might include only the "annotations" on external terms, not replicating the dcat 2 terms. The second file is provided as it might be handy to have it in certain applications/contexts.
We have separated DCAT into two ontologies. The first includes all the DCAT-defined elements; the second groups all the extra DCAT annotations on third-party vocabularies.
As far as I have understood, there is no agreement on whether to use skos:definition or skos:usageNote in the second ontology. Some of the members prefer to maintain coherence in the use of skos:definition between the two ontologies, others would replace skos:definition with skos:usageNote in the second ontology.
However, I think the above issue has lost some of its importance after splitting DCAT into two ontologies. In fact, the problem affects only the second ontology, which is an optional resource with extra annotations. It is not returned by resolving the dcat namespace and comes into play only if explicitly loaded. Users and systems that are uncomfortable with the extra definitions on third-party terms can simply ignore the second ontology.
I think the current setting offers a good compromise between the distinct positions. So I would suggest closing this issue. Objections?
I am marking this Issue as "due for closing", @kcoyle and @jakubklimek, you have been active in the discussion, I am hoping there are no strong objections from you. Please upvote if you are fine with the compromise. Thanks.