Dear DXWG,
I'm working on a profile for dataset description in cultural heritage (within Europeana).
I'm looking for guidance on how to specify the values that dcterms:conformsTo should take, so as to ensure machine interoperability.
I've observed some real-life cases and noticed that sometimes the values are namespaces and in other cases they are links to the specifications.
Since we are interested in machine interoperability, namespaces look more appropriate but I'm far from certain.
Here is an example. In the case of a SPARQL endpoint...
... should namespaces defined by the protocol be used:
Is there a common practice, or recommendation, within DCAT?
Thanks in advance,
Nuno
It seems that DCAT uses mostly URLs of specifications as in Example 45. Is it a best practice we should follow?
My preference would be for the use of specification URIs, not namespace URIs. This is because, according to all of the profiles thinking we have done in DXWG, a specification/profile is fundamentally a different thing from a namespace. Obviously specifications might have namespaces for their technical content but a specification is a "larger" thing than just a namespace.
The problem here is, of course, that most specifications' URIs are not set up for any sort of machine actioning, so the best you can currently hope for is that the URI provides a universally unique identifier.
I hope that, in time, specifications that wish to be properly machine-readable will provide Linked Data functionality for their specification URI, so that you can get both a human-readable specification document, as you presently can for a spec like, say, DCAT, and an RDF version of the specification. That RDF version should probably be something like a Profiles Vocabulary description, which then tells you where all the other profile parts, such as machine-actionable constraints, are.
This question is interesting for the profiles work so I'm tagging it profiles-vocabulary too.
Namespaces just give you a collection of elements with some axioms - i.e. it is the ingredients, not really the recipe. It would usually be the recipe that you want to conform to - the patterns and usage rules. These might be expressed in a machine actionable form (e.g. SHACL) but will often be supplemented by additional instructions that are expressed in natural language. As @nicholascar says, there will typically be a suite of artefacts associated with a specification. The profiles-vocabulary provides one way to describe and 'package' them. I would expect a document conforming-to the profiles vocabulary to be accessible from the specification URI (i.e. the /TR/ URI, not the /ns/ URI) (using conneg by profile of course ...)
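To make the /TR/ versus /ns/ distinction concrete, here is a minimal Python sketch that labels a W3C URI as a specification URI or a namespace URI. It assumes only the W3C path convention mentioned above; the function name `classify_w3c_uri` is hypothetical, and other publishers will follow other conventions.

```python
from urllib.parse import urlparse


def classify_w3c_uri(uri: str) -> str:
    """Return 'specification', 'namespace', or 'unknown' for a URI.

    Assumption: only the W3C path convention is checked
    (/TR/ for specification documents, /ns/ for namespaces).
    """
    parsed = urlparse(uri)
    if parsed.netloc.lower() not in {"www.w3.org", "w3.org"}:
        return "unknown"
    if parsed.path.startswith("/TR/"):
        return "specification"
    if parsed.path.startswith("/ns/"):
        return "namespace"
    return "unknown"


spec_kind = classify_w3c_uri("https://www.w3.org/TR/sparql11-protocol/")
ns_kind = classify_w3c_uri("http://www.w3.org/ns/sparql-service-description#")
```

A check like this could flag dcterms:conformsTo values that point at a namespace where a specification URI was expected; it says nothing about whether the target is machine-actionable.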
In our case we would like to use dcterms:conformsTo to refer to specifications of the metadata schema expressed as a SHACL document. The goal is to make explicit which metadata schema the provided metadata record should conform to. This is used not only as information but also to support validation of the metadata content against its schema.
@luizbonino you have a noble goal in mind here, but the issue is that the use you're describing is not standardised, so if you do this, others might not. The proposal in the Profiles Vocabulary, which is aiming at being a standard, is to identify specifications and then link things like SHACL documents to them. The documents can have roles, so you can know that Specification_X has a Resource_Y that is a SHACL file used for validation (as opposed to something else that you could use SHACL for, like transformation).
I think the Profiles Vocabulary can cater for your requirement and it would have you define your specifications of interest (and profiles of them) and the relevant validation resources.
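As a rough illustration of the "resources with roles" idea, the Python sketch below models a specification with two described resources and picks out the ones whose role is validation. The dictionary layout and the example.org URIs are hypothetical stand-ins, not a PROF serialization; the two role URIs are the validation and guidance roles from the PROF role vocabulary.

```python
# Role URIs from the PROF role vocabulary.
ROLE_VALIDATION = "http://www.w3.org/ns/dx/prof/role/validation"
ROLE_GUIDANCE = "http://www.w3.org/ns/dx/prof/role/guidance"

# Hypothetical specification description (all example.org URIs are made up).
specification_x = {
    "id": "https://example.org/spec/metadata-schema",
    "resources": [
        {
            "role": ROLE_GUIDANCE,
            "artifact": "https://example.org/spec/metadata-schema/guide.html",
            "format": "text/html",
        },
        {
            "role": ROLE_VALIDATION,
            "artifact": "https://example.org/spec/metadata-schema/shapes.ttl",
            "format": "text/turtle",
        },
    ],
}


def artifacts_with_role(spec: dict, role: str) -> list:
    """Return the artifact URLs of all resources that play the given role."""
    return [r["artifact"] for r in spec["resources"] if r["role"] == role]


validators = artifacts_with_role(specification_x, ROLE_VALIDATION)
```

The dataset itself would still state dcterms:conformsTo against the specification URI; the SHACL file is found by following the role, not by being the conformance target.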
@luizbonino We also have a similar requirement to yours. In our work around Europeana, we also have difficulties with the use of dcterms:conformsTo for data resources. Some data sources will state that the data conforms to a namespace, others use the URL of an XML Schema. And we also know that neither of these choices will be appropriate in cases of general data container formats. It is also necessary to know the data profile in use.
@nicholascar you pointed out the key aspects of unique identification and the machine readability.
I'm looking forward to seeing the Profiles Vocabulary become a standard.
@nicholascar has pointed out that this use case is a motivation for the Profiles Vocabulary as an information resource that can address such use cases. It's worth pointing out that implicit in this is the httpRange-14 problem: you need to reference the conceptual specification, as @dr-shorthair suggested, and then decide which information resource best represents it.
When you are talking about "schema", that's a perfect example of where flexibility is needed for the different resources that form part of the recipe for a specification. For XML that will be XSD, for JSON that will be JSON Schema, for RDF it's RDFS, OWL and/or SHACL. But if using JSON you also might want JSON-LD contexts.
At this point you can describe all the available resources using PROF, but you can also dereference the specification URI using content negotiation and content negotiation by profile to access resources directly without processing the PROF description.
Ultimately you need a framework for disseminating the identifiers to the community - having a URL link to some information resource is inherently fragile and non-extensible, so you need to manage non-information-resource URIs and infrastructure to dereference these.
At the OGC I am working through publishing profile descriptions for specifications, and setting up such infrastructure - but there is a legacy of embedded document references to think about too.
Some members of the DXWG are interested in best practices for publishing specifications from a practical sense, but whether this emerges as a formal deliverable in some guidance document is not clear yet. Please include us in reviews of any proposals to formalise requirements.
The intent of dct:conformsTo has to do with "established standards". The full definition is:
"An established standard to which the described resource conforms."
This is where you can say that your resource conforms to ISO 2709 or Oasis-Open "legaldocml-akomantoso". An internal document or application, such as a SHACL file, doesn't really fit this definition. The profiles vocabulary may be better suited to this. I don't see PROF as a substitute for this DC term, but a statement with a different meaning.
@kcoyle , thank you Karen, this distinction between established standards and internal documents is key for our needs. I think that, for our case, PROF would be appropriate. PROF's example 1 (https://www.w3.org/TR/dx-prof/#eg-initial-example) is very close to what we intend.
It seems to me the important question is from the client perspective. The client parses a metadata document and finds various distributions for a resource of interest; the dc:conformsTo property should provide the criteria to identify a distribution that the client will be able to parse and use. The problem is that there is a dependency on the sophistication of the client: some clients might require a very specific serialization of the resource representation to work (e.g. XML according to schema X, using vocabulary Y for property values), while others might have the logic necessary to handle any application/xml. Thus the dc:conformsTo profile URI needs to be hierarchical or multivalued. Established standards are good, but for a particular community and client, as long as the client recognizes a conformsTo URI for a distribution it can work with, it really doesn't matter if it's a 'standard'.
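The client-side selection described above can be sketched in a few lines of Python, assuming the multivalued option: each distribution carries a list of conformsTo URIs and the client keeps a set of URIs it knows how to handle. All URIs and the `pick_distribution` helper are hypothetical; matching is by exact string comparison, which is the weakest form of interoperability discussed in this thread.

```python
def pick_distribution(distributions: list, supported: set):
    """Return the first distribution with a conformsTo URI the client supports.

    Returns None when no distribution matches.
    """
    for dist in distributions:
        if supported & set(dist["conformsTo"]):
            return dist
    return None


# Hypothetical metadata record with two distributions.
distributions = [
    {
        "url": "https://example.org/data.json",
        "conformsTo": ["https://example.org/spec/json-profile"],
    },
    {
        "url": "https://example.org/data.xml",
        "conformsTo": [
            "https://example.org/spec/schema-x",
            "https://example.org/spec/vocabulary-y",
        ],
    },
]

# A strict client that only understands XML according to schema X.
strict_client = {"https://example.org/spec/schema-x"}
chosen = pick_distribution(distributions, strict_client)
```

A hierarchical scheme would need more than set intersection, for example a lookup of which profiles are profiles of which base specifications.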
@smrgeoinfo What does matter is not to change the semantics of the dct:conformsTo, since that is defined by DCMI. If you need different semantics then you need a different property. So, yes, it does matter that it is a 'standard' although the interpretation of 'established standard' is somewhat loose. I would say that using dct:conformsTo to link from a dataset to a specific processing program or internal schema is more than a stretch, and doesn't help the client select an application vs a defined standard. Obviously there are no Dublin Core Cops to stop you, but you may not be communicating well outside of your own narrow community if you use dct:conformsTo with two very different meanings.
@kcoyle
> I don't see PROF as a substitute for this DC term, but a statement with a different meaning.
PROF uses dcterms:conformsTo directly and as intended by its definition.
To achieve what @smrgeoinfo wants, we just need to see more things that are used as "An established standard...", as per the dcterms:conformsTo, allocated a URI.
I imagine that for a "big" standard like ISO 2709, you would use PROF to indicate conformance to it with dcterms:conformsTo and also the conformance of things to profiles of it that communities might make and ID with a URI.
I'll just add that I've joined ISO's Technical Committee 211 with a view to ensuring that the ISO 19* series standards (about geographic information) have sensible URIs to use for conformance claims. This is partly already the case, see https://def.isotc211.org/ which shows how to generate URIs for parts of 19* standards.
Just remember there are three things happening - all of which are fairly common sense unless mixed up...
dct:Standard is) - so there needs to be a published URI shared amongst a community (this matches @smrgeoinfo's view)
So PROF isn't a prerequisite of dct:conformsTo - it just makes the URI identifiers of conformance statement targets useful beyond simple string comparison use cases and human reading of specification documents.
I'm concerned that the dots are not being fully connected here.
@kcoyle points out that dct:conformsTo should point to an 'established standard'.
But how is this done? With a URI reference!
That begs the question of _what do you get when you dereference the URI denoting a 'standard'?_
It might be a PDF or HTML page - i.e. a classic standard document.
But you could also do content-negotiation _on the same URI_ and get a PROF resource, which is an alternative representation of a standard, in particular to support traversal to a variety of artefacts, some of which are executable in a particular environment.
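A minimal Python sketch of "same URI, different representations" follows. It only builds the two HTTP requests and does not send them, since most specification servers do not support this today. The `Accept-Profile` header comes from the draft Content Negotiation by Profile work; using the PROF namespace URI as the profile identifier, and the helper name `build_spec_request`, are assumptions made for illustration.

```python
from urllib.request import Request


def build_spec_request(spec_uri: str, media_type: str, profile_uri=None) -> Request:
    """Build (but do not send) a GET request for one representation of a spec.

    Assumption: the server honours Accept and, optionally, Accept-Profile.
    """
    headers = {"Accept": media_type}
    if profile_uri is not None:
        # Profile URIs are wrapped in angle brackets in this header.
        headers["Accept-Profile"] = f"<{profile_uri}>"
    return Request(spec_uri, headers=headers)


SPEC_URI = "https://www.w3.org/TR/sparql11-protocol/"

# The classic human-readable document.
html_request = build_spec_request(SPEC_URI, "text/html")

# A hypothetical request for a PROF description of the same specification.
prof_request = build_spec_request(
    SPEC_URI, "text/turtle", "http://www.w3.org/ns/dx/prof"
)
```

Both requests target the identical URI, so the dcterms:conformsTo value stays the same whichever representation a client ends up asking for.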
@dr-shorthair - there is no actual grounding for "established" - that's up to whoever uses the URI to determine if it meets their needs.
There is thus no control over what dereferencing will mean - only a best practice. Generally humans should be able to cope with an HTML rendering of PROF as a "landing page" - the only thing that would be upset would be automated document harvesters that are unable to ask for the right profile (which would be advertised). Still working through this implementation detail - at least we have canonical mechanisms available now, the question is only one of transition strategy ;-)
Actually I was not particularly interested in the 'established' question.
I was trying to respond to what I understood to be the original concern of @nfreire , which is whether a dcterms:conformsTo property should point to a document or to something more machine readable. I'm attempting to point out that the answer can be both, through the magic of URI references and content negotiation. Certainly it needs practices to be established, but 'Standard' (_established_ or not) does not mean only a specification document.
@kcoyle
As I see it, the semantic definition of dct:Standard does not refer to a 'standard' or 'established standard'; it is defined as "_A reference point against which other things can be evaluated or compared._" Certainly, an ISO standard or W3C Recommendation would fit this definition, but so would an 'internal schema' or some other set of rules. Even the objective of a SHACL file is to compare things against, isn't it?
Having said that, I agree that if you want something more specific than dct:conformsTo, you could define a different property, possibly a sub-property like foo:conformsToSpecification or foo:conformsToSchema.
Thanks everyone for the input! With @nfreire we've decided that we're going to put our bet on PROF. So ideally we would have URIs for the specifications used with dcterms:conformsTo, and expect PROF descriptions for them (after content negotiation). For the time being we're going to use URLs like http://www.w3.org/TR/sparql11-query/, assuming that they are the closest we have for serving as reference URIs for specs.
This is all food for the Profile Guidance document which looks like it's still definitely needed!
Yes good point @nicholascar ! I guess that instead of closing it we could keep it open and tag it with "guidance" so that we don't forget to add it there. And remove the tags "profile-vocabulary" and "DCAT"...
note that there is a relationship with the discussion in #1130
@andrea-perego w.r.t. can you shed any light on these:
https://inspire.ec.europa.eu/metadata-codelist/ProtocolValue
they reference a sort of "abstract typing" for specific OGC specifications - not the specific implementing versions.
I suspect that what they really do implicitly is reference an INSPIRE profile of a particular version of a standard, as defined in some out of band rules or guidance.
If there is something that can be added to the very plain objects being referenced at the OGC - maybe a comment or something, we can run this past the OGC NamingAuthority and update it.
For example INSPIRE profiles described using PROF could be linked to these abstract resources, and this published list could have a note saying conformance requires use of the relevant INSPIRE profile (an example of "conformance" being relative to the community of practice, and hard to be specific about).
@rob-metalinkage maybe this could be a different github issue? I was much involved in the creation of the issue, and I can't make the link. There is probably one, but it's unclear, and it seems to hint at a more general (or specific?) issue of whether to point to time-stamped standards or more abstract ones.
The original issue was one of pointing either to the standard or to the document expressing it.
In fact now I may extend myself, but because it's our issue with @nfreire maybe I can do it ;-)
My question is about protocol vs query standard, for an access service. The URI that we agreed here for a SPARQL endpoint (and was in the DCAT spec) was that of the query language (http://www.w3.org/TR/sparql11-query/).
I've just seen that a recent change in the spec uses the URI for the SPARQL protocol instead (https://www.w3.org/TR/sparql11-protocol/):
https://github.com/w3c/dxwg/pull/1310/files
Does this represent a change of approach, or some recommendation that should be followed, on what is expected wrt dct:conformsTo for data services?
As a side question, I'm wondering whether there is anything here that the DCAT spec editors should be pushed to more specific DCAT-related guidelines such as the DCAT-AP ones. (maybe @andrea-perego or @makxdekkers know?)
@aisaac said:
> @rob-metalinkage maybe this could be a different github issue?
Yep, better to create a separate issue, not to overload this one.
@rob-metalinkage , I've copy-pasted your comment in https://github.com/w3c/dxwg/issues/1338 . I'll reply to you there.
@aisaac said:
> My question is about protocol vs query standard, for an access service. The URI that we agreed here for a SPARQL endpoint (and was in the DCAT spec) was that of the query language (http://www.w3.org/TR/sparql11-query/).
> I've just seen that a recent change in the spec uses the URI for the SPARQL protocol instead (https://www.w3.org/TR/sparql11-protocol/):
> https://github.com/w3c/dxwg/pull/1310/files
> Does this represent a change of approach, or some recommendation that should be followed, on what is expected wrt dct:conformsTo for data services?
You're right, @aisaac . The revision you refer to stems from the discussion in https://github.com/w3c/dxwg/issues/1225 (in particular, https://github.com/w3c/dxwg/issues/1225#issuecomment-719374471), and I'm afraid I forgot about what was discussed in this thread.
> As a side question, I'm wondering whether there is anything here that the DCAT spec editors should be pushed to more specific DCAT-related guidelines such as the DCAT-AP ones. (maybe @andrea-perego or @makxdekkers know?)
Following the discussion in https://github.com/w3c/dxwg/issues/1225 , we have updated the draft by providing some guidelines - see the last part of §13.2.1 (Conformance to a standard).
But those guidelines are not discussing specifically whether you should point to the protocol or the query language specification.
@andrea-perego thanks for the answer. I must say I am mildly convinced by the discussion on #1225. The protocol URI seems to have appeared out of the blue there, without explanation.
I have tried to see if the European Data Portal could give us expectation of the usage of the protocol URI vs the query language URI. Firing the query
PREFIX dct: <http://purl.org/dc/terms/>
SELECT distinct ?d ?s
WHERE { ?d dct:conformsTo ?s
FILTER contains( STR(?s), "sparql")}
at https://www.europeandataportal.eu/sparql-manager/en/
I obtained only results with https://www.w3.org/TR/sparql11-protocol/. But they seem to come from one catalogue, so I'm not sure this is a strong proof.
Maybe this is something we could ask advice from the wider group? Unless you can show me some more motivation, which I would probably accept :-)
@aisaac said:
> @andrea-perego thanks for the answer. I must say I am mildly convinced by the discussion on #1225. The protocol URI seems to have appeared out of the blue there, without explanation.
I think it came out after I made a reference - see https://github.com/w3c/dxwg/issues/1225#issuecomment-718265384 - to the list of URIs maintained by OSGeo, where they use https://www.w3.org/TR/sparql11-protocol :
https://github.com/OSGeo/Cat-Interop/blob/master/LinkPropertyLookupTable.csv
> I have tried to see if the European Data Portal could give us expectation of the usage of the protocol URI vs the query language URI. Firing the query
> [...]
> at https://www.europeandataportal.eu/sparql-manager/en/
> I obtained only results with https://www.w3.org/TR/sparql11-protocol/. _But_ they seem to come from one catalogue, so I'm not sure this is a strong proof.
> Maybe this is something we could ask advice from the wider group? Unless you can show me some more motivation, which I would probably accept :-)
An argument could be that what distinguishes a service / API are the protocol and query parameters, whereas it is not necessarily bound to a specific query language (there are services / APIs that support different query languages).
But we should indeed ask input from the group.
> But they seem to come from one catalogue, so I'm not sure this is a strong proof.
@aisaac this is not proof as this comes from our Czech catalog where I used it in connection to the referenced discussion.
I agree with @andrea-perego in the argumentation that the protocol (e.g. HTTP methods and Media types) is what characterizes the data service more than the specification of the query language.
Hi, my colleague @Abbe98 suggested we could have two dcterms:conformsTo statements, one for the language and one for the protocol. Would this be acceptable?
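For reference, the two-statement option could look like the output of the Python sketch below, which prints a Turtle description of a data service with one dcterms:conformsTo value for the SPARQL protocol and one for the query language. The service and endpoint URIs and the `data_service_turtle` helper are hypothetical, and whether this pattern is acceptable is exactly the open question here.

```python
def data_service_turtle(service_uri: str, endpoint_url: str, conforms_to: list) -> str:
    """Serialize a minimal dcat:DataService description as Turtle."""
    lines = [
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{service_uri}> a dcat:DataService ;",
        f"    dcat:endpointURL <{endpoint_url}> ;",
    ]
    # One dcterms:conformsTo predicate with a comma-separated object list.
    objects = " ,\n        ".join(f"<{uri}>" for uri in conforms_to)
    lines.append(f"    dcterms:conformsTo {objects} .")
    return "\n".join(lines)


turtle = data_service_turtle(
    "https://example.org/service/sparql",  # hypothetical service URI
    "https://example.org/sparql",          # hypothetical endpoint URL
    [
        "https://www.w3.org/TR/sparql11-protocol/",
        "http://www.w3.org/TR/sparql11-query/",
    ],
)
print(turtle)
```

A client that only matches on the protocol URI and one that only matches on the query language URI would both find this service.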