Dxwg: Clarify need for unambiguous profile URIs when compound specifications referenced.

Created on 25 Jul 2019 · 12 Comments · Source: w3c/dxwg

in response to question from @tombaker

"if.. sections of those documents were not clearly defined, any DCAT profile (or indeed most profiles that currently exist in the wild) would not be suitable for content negotiation as per CONNEG?"

We need to clarify in the document that if you can't identify what a profile means, you can't make any statements about it, such as referencing it in a conneg request or response. (For example, a single specification document may offer different options for conformance while only the document itself has an identifier.)

agreed in meeting:

If a specification is unambiguous about the requirements for conformance as a data profile then it only needs a single URI...

If there are multiple options, or parts of or within a specification, that may be considered suitable for being conformed to, then these must be unambiguously identified with distinct URIs so that they can be used in content negotiation.

conneg-by-ap provides no mechanism for describing which arbitrary part of a specification is being conformed to.

profile-negotiation

All 12 comments

@rob-metalinkage Can you link to the source context for this? And hopefully it includes the question that was asked. Thanks.

https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Jul/0004.html

Included the question and updated the original comment to make it more readable.

I like the idea of tying this to functionality, such that IF you wish certain functionality then you MUST provide a "hook" (URI) into the description that supports that functionality.

Perhaps the statement in the document could be generalized to say something simple like: Any document or specification or portion thereof that is suitable for content negotiation must be unambiguously identified with a URI.

Although I do wonder if people won't think that it means a separate URI and not a URI with document fragment. Can that be made clear? It would be good to reference that section of the IETF RFC.

@kcoyle +1 to "such that IF you wish certain functionality then you MUST provide a "hook" (URI) into the description that supports that functionality."

I'm not sure the second sentence is clear enough though. It sounds as though content negotiation might apply to the document/specification/portion itself, not to the set of data instances described by it, which is the real functionality described by conneg.

Regarding the 'hook' statement:

The specification, in the Abstract Model's section 6.2, Profile Identification, states:

A client requesting the representation of a resource conforming to a profile MUST identify the resource by a Uniform Resource Identifier (URI) [RFC3986] and MUST identify a profile either by a URI or a token that unambiguously identifies the profile for the server within that request/response session.

So, from this document's point of view identification by URI is mandated.
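As a rough illustration of the quoted requirement, the sketch below builds the headers for a request that identifies the resource by URI and the profile by URI. The `Accept-Profile` header name and the angle-bracket syntax are assumed from the IETF draft mentioned in this thread, and both example.org URIs are hypothetical.

```python
# Hypothetical identifiers, for illustration only.
RESOURCE_URI = "https://example.org/dataset/1"
PROFILE_URI = "https://example.org/profiles/profile-a"


def profile_request_headers(profile_uri: str) -> dict[str, str]:
    """Return headers asking for a representation conforming to a profile.

    Assumption: the profile is named in an Accept-Profile header with the
    URI wrapped in angle brackets, as in the IETF draft.
    """
    return {"Accept-Profile": f"<{profile_uri}>"}


headers = profile_request_headers(PROFILE_URI)
print("GET", RESOURCE_URI)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Nothing here dereferences the profile URI; it is only passed along as an identifier.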

Regarding the "separate URI and not a URI with document fragment":

The specification never mentions fragment URIs, only URIs, and, for profile identification, also tokens that map to URIs. Servers never see URI fragments when resolving a URI (clients strip them before sending the request) and leave it to clients (web browsers) to navigate (scroll) to the fragment-identified part of a served resource, but the identification role of a URI that includes a fragment uses the whole URI. So yes, URIs with fragments can be used to identify profiles, and this is normal URI use.
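A small sketch of that distinction, using Python's standard `urllib.parse.urldefrag`: two hypothetical profile URIs that differ only in their fragment resolve to the same document, yet remain distinct identifiers when compared as whole strings. The URIs are invented for illustration.

```python
from urllib.parse import urldefrag

# Hypothetical profile identifiers inside one specification document.
PROFILE_A = "https://example.org/spec#profile-a"
PROFILE_B = "https://example.org/spec#profile-b"

# For retrieval, the fragment is split off: both point at the same document.
doc_a, frag_a = urldefrag(PROFILE_A)
doc_b, frag_b = urldefrag(PROFILE_B)
same_document = doc_a == doc_b

# For identification, the whole URI (fragment included) is compared.
same_profile = PROFILE_A == PROFILE_B

print(same_document, same_profile)
```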

It would be a matter for perhaps Guidance to flesh out how/why people could/should allocate URIs, with or without fragments, to document/specification portions.

So, since I don't think any change is needed to the conneg doc here, I'm marking this 'due for closing' and will close in a week unless anything else is raised here.

I'm reopening this to make sure that @tombaker is satisfied with this response to his comment.

From @tombaker 2019-08-23 https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Aug/0003.html :

Dear Lars,

On Tue, Jul 16, 2019 at 11:49:12AM +0200, Lars Svensson wrote:

I always thought that the spirit of CONNEG was to be
liberal about what may be used as a URI for content
negotiation (e.g., even purl.org/dc/terms/),

From a conneg point of view, the profile URI can point
to anything (including nothing, i. e. the profile URI
can return 404).

This is an important point with which I enthusiastically
agree. For starters, profile URIs _will_ inevitably
return 404s, increasingly over time, as resources are
moved or become unavailable.

In the IETF document we say that _if_ the profile URI
is a protocol URI (http/https/ftp/sftp/...) it SHOULD

The judicious use of SHOULD here looks spot-on.

A further possibility is to constrain
this further and say that the profile URI SHOULD
resolve to a profile description and that the server
SHOULD use content negotiation (by profile) to serve
the best available representation of the profile. The
client can ask for a specific profile of the profile
(e. g. a ShEx version) and then the server would return
a ShEx document that can be used for validation.

The first sentence refers to a "representation" of the
profile. Do I correctly understand you to mean: "the
client can ask for a specific _representation_ of the
profile" (instead of "profile") and "a ShEx
_representation_" (instead of "version")?

Tom

From @tombaker 2019-08-23 https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Aug/0004.html :

Dear Lars,

On Mon, Aug 05, 2019 at 09:49:13AM +0200, Lars Svensson wrote:

In order to be able to use profiles specified as
parts/sections of larger documents, those
profiles/sections need their own URIs.

On Wed, Aug 07, Lars wrote [1]:

If a profile is part of a larger document that contains
several profile specifications, each of those profile
specifications needs its own URI (which would probably
be a URI fragment). _Iff_ the larger document only
contains one profile specification, it MIGHT be
possible to use the document URI as a proxy for the
profile URI for conneg purposes, but that feels a bit
stretched.

I'm having trouble reconciling what look to me like three
quite different messages about the role of URIs in
conneg:

  1. The notion that "From a conneg point of view, the
    profile URI can point to anything (including nothing,
    i.e. the profile URI can return 404)". [2]

  2. The first position above: that from a conneg point of
    view, a profile that is part of a larger document
    MUST have its own URI.

  3. The second (and more radical) position above:
    that a document URI is not good enough for conneg if
    the document "contains several profile specifications".

FWIW, I strongly agree with #1. For starters, profile
URIs _will_ increasingly return 404s over time. And
service interruptions on servers that provide profile
documents should surely not prevent users from leveraging
a URI to get data.

Requiring that document fragments have their own URIs
raises many questions. A "hash URI", for example, may
have a conceptual meaning (e.g., "the definitions of
terms used in this document"), but it also may function
as an anchor for positioning the browser window at the
start of a section. What a hash URI does not do, as far
as I know, is define the _end_ of a document fragment.
In other words, a hash URI provides no basis for actually
extracting a fragment from a larger document. In the
case of PDF documents, the hash URI would not even serve
as an anchor.

WRT #3, I see a tension between the very generalized view
of profiles as being pretty much anything that builds on
something else -- e.g., the vocabulary DCMI Metadata
Terms as a "profile" of RDF -- and the functional role of
URIs in conneg. From a conneg point of view, as I
understand it (see #1), the URIs used in content
negotiation might typically be profiles, but they are in
fact not even required to be profiles at all.

Of the three approaches, the first is the most realistic
and workable. This approach can be presented with plenty
of SHOULDs (e.g., SHOULD resolve to a resource with a
profile description), as done in the IETF document.

Approaches #2 and #3 hope that implementers will grasp
the notion that a profile document (i.e., what most
people think of as profiles, such as DCAT in its various
manifestations) might actually consist of distinct
"profiles" (or "profile specifications"); that they will
come to a reasonably coherent collective understanding of
the nature and granularity of such embedded
specifications; and that they will go through the trouble
of coining URIs for those embedded specifications.

If I have understood it correctly, this approach seems
both quite new and somewhat out of line with how people
currently think of profiles (i.e., as documents). Maybe
we should be moving in this direction -- I'm sceptical,
to be honest -- but it would need to be implemented and
tested, and we would IMO need to see evidence that people
actually want to do things this way.

In short: the idea that a profile URI can point to
anything (or nothing) is IMO exactly right [4].
Introducing, at this point in the process, new, complex,
and untested notions about separately identified parts of
documents IMO muddies the waters and should therefore be
out of scope for CONNEG -- or perhaps reserved for a
future revision, by which time one might be able to point
to a significant body of implementation experience.

Tom

[1] https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Aug/att-0001/00-part
[2] https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Jul/0006.html
[3] https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Jul/0003.html
[4] https://lists.w3.org/Archives/Public/public-dxwg-comments/2019Aug/0003.html

From discussion in the CNEG subgroup meeting of 2019-08-29: when using the proposed CNEG mechanisms, it is up to each community to determine what a profile identifier means in terms of expected responses. If a URI points to an ambiguous resource with no clear conformance requirements, the nature of the server response may not be usefully predictable.

We will say this in the document with a forthcoming PR if we have consensus on this response.
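To make the point about predictability concrete, here is a minimal sketch of how a server might match requested profile identifiers. It treats each identifier as an opaque string and never dereferences it. The lookup table, file names, and URIs are all hypothetical, and this is not taken from the specification.

```python
from typing import Optional

# Hypothetical mapping a community has agreed on: each unambiguous
# profile identifier corresponds to one kind of representation.
REPRESENTATIONS = {
    "https://example.org/spec#profile-a": "data-conforming-to-a.ttl",
    "https://example.org/spec#profile-b": "data-conforming-to-b.ttl",
}


def select_representation(requested: list[str]) -> Optional[str]:
    """Return the first representation whose profile URI was requested.

    Matching is by exact string comparison; the profile URI is never
    fetched, so it may well return 404 if dereferenced.
    """
    for profile_uri in requested:
        if profile_uri in REPRESENTATIONS:
            return REPRESENTATIONS[profile_uri]
    return None


# The bare document URI covers two conformance options, so this server
# has no agreed meaning for it and cannot pick a representation.
print(select_representation(["https://example.org/spec"]))
print(select_representation(["https://example.org/spec#profile-b"]))
```

A server could of course choose to assign a meaning to the bare document URI; the sketch only shows that without such an agreement the outcome is undefined.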

@tombaker @kcoyle what do you think of this response?

@nicholascar I'm fine with this in theory and will look for it in the document when reviewing to see what it says. It sounds to me like there will be no stated mechanism for designating portions of documents, but also nothing that prevents it. If that's the case, that's fine with me.

@nicholascar OK, if you make the PR, I'll review.

And @kcoyle's comment and @tombaker's thumbs-up render my ACTION-372 void...

Closing as all have agreed the interpretation here.
