Entered from Google Doc
Suggested Re-wording: "Requirement: A short token to specify a profile may be used as long as there is a discoverable mapping from it to the profile's identifying URI"
Profile negotiation requires an identifier for the profile to be passed from client to server, and from some source of metadata (such as the server itself) to the client (either at run-time or at some prior configuration step).
With MIME encoding based negotiation, tokens are registered at IANA.
There will be potentially many more profiles across application domains than distinct encodings, as profiles are simple content rules scoped to communities of practice, whereas encodings require software clients to be implemented.
Use of URIs as profile identifiers is necessary for discovery of details - so such URIs are available as tokens for content negotiation.
This has some drawbacks however - URIs are hard to encode in query strings, and if a server supports many profiles it may be a burden for humans to read a large set (for example to choose an option).
One option is to use CURIE syntax prefix:token, in which case this would be semantically equivalent to the full URI form. This would require the server implementation to be willing to advertise prefixes it understands, or clients to specify prefix assumptions and to match either full URI or CURIE forms.
There may also be a default prefix, and a global registry of well known profiles - in which case the profile negotiation specification should perhaps declare the default prefix namespace.
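To make the "match either full URI or CURIE forms" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the prefix table, the `ex2` prefix, the example.org namespaces and the function names are all hypothetical, and the empty-string key stands in for a default prefix.

```python
# Hypothetical prefix table a server might advertise (not from any spec).
# The empty string stands for the default prefix.
PREFIXES = {
    "": "http://example.org/profiles/",
    "ex2": "http://example.org/other-profiles#",
}


def expand(value, prefixes=PREFIXES):
    """Expand a CURIE or bare token to a full URI; pass full URIs through."""
    # Crude full-URI test, sufficient for this sketch only.
    if "://" in value or value.lower().startswith("urn:"):
        return value
    prefix, sep, local = value.partition(":")
    if not sep:  # bare token: fall back to the default prefix
        prefix, local = "", value
    return prefixes[prefix] + local  # raises KeyError for unknown prefixes


def matches(requested, supported_uri, prefixes=PREFIXES):
    """True when the requested identifier (URI or CURIE) names supported_uri."""
    try:
        return expand(requested, prefixes) == supported_uri
    except KeyError:
        return False
```

With this table, `matches("ex2:dcat", "http://example.org/other-profiles#dcat")` and `matches("t1", "http://example.org/profiles/t1")` both hold, while an undeclared prefix simply fails to match.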
IMHO this should be discussed at F2F, taking advantage of joint session with JSON-LD team who are looking at "framing"[1] which is an allied (but perhaps not identical) concept
Added @rob-metalinkage 's comment to f2f agenda.
There was a longer discussion between @azaroth42 and @nicholascar in the google doc so I copy that here for completeness
Rob S:
-1, unnecessary complication and introduces non-uniqueness challenges. Short tokens are good for humans, but humans are not the actors for content negotiation. Each token would need to be publisher specific to avoid collisions, requiring additional time and architecture for no obvious gain.
Nick:
Unnecessary for machine yes but the scope of this profile neg includes humans!
Use Cases like #239 are specifically in scope so Requirements like this one follow from that.
Rob S:
Yes, and I'm questioning that scope :) What's the use case in which a token is discoverable by a human who then types it into the address bar of a browser in a particular place in order to receive content designed for machine interpretation?
Nick:
Exactly as per UC 239 examples although the profile they receive may not be entirely machine-readable (e.g. an HTML web page). We have people already using the Alternates View / Linked Data Platform approach, e.g. http://pid.geoscience.gov.au/sample/AU239?_view=alternates and I've just shown that this is completely compatible with the proposed Accept-Profile and Profile HTTP conneg (see my recent emails about demos with http://w3id.org/mediatype/).
So this Requirement is to ensure that IF a token is used THEN it must be compatible with the HTTP conneg methods.
Currently my demo may be missing some bits to satisfy this Requirement but perhaps not: the RDF format of an Alternates View allows for graph traversal from tokens to URIs. But it was made pre-this requirement and could do with alignment with new terminology ("profile" rather than "view").
I have started work on a QSA dummy Implementation of the abstract model for profile neg that we have slated for the Conneg by AP document: https://github.com/CSIRO-enviro-informatics/profile-conneg-qsa-realisation
It is not working yet but will be by next week. The code’s goal is to implement, in a simple few-file Python app, all of the QSA functions necessary to emulate the full set of HTTP profile Conneg functions.
I want to get this dummy up and feature complete - with its dummy, static data - before reimplementing the QSA approach in our real production Linked Data tooling - pyLDAPI, which currently implements the soon-to-be superseded view/format method I outlined in UC Issue 239.
Dummy (https://github.com/CSIRO-enviro-informatics/profile-conneg-qsa-realisation) seems to be now pretty complete.
Rename from "There needs to be metadata about the views provided by profiles (“named collections of properties”) that can included in a http header [ID5] (5.5)" to "Requirement: A short token to specify a profile may be used as long as there is a discoverable mapping from it to the profile's identifying URI"
A demo of the QSA demonstrator linked to above is now online at http://13.236.122.60/qsa/
Subgroup 2019-02-14: can be catered for with a list of tuples for GET requests for profile lists (uri, token), (uri, profile)... and in HTTP headers there is a microformat in Link headers allowing for extra info with URIs that might be equivalent.
Discussed in CNEG subgroup 13 Feb; action to propose some options:
Option 1:
Server offers a 1:1 mapping from (optional) tokens to URIs using typical "parameter" syntax - we define the parameter "token"
Content-profile: URI[;token=mytoken],
Pros: self-contained, flexible, explicit.
Cons: potential redundancy in URIs if many offered
Option 2:
tokens are registered in a global registry and servers may use them in place of a URI
Content-profile: token1, URI2
Pros: compact
Cons: global registry for profiles not manageable when many systems define profiles - limits all the other capabilities of profile description by forcing generic profiles.
Option 3: allow namespace declarations - like in JSON CURIE (compact URI - declare namespaces)
Content-profile: @:http://example.org/profiles/,@w3c:http://w3.org/knownprofiles#, :token1, w3c:token2
Pros: compact, no registry requirement
Cons: clients have more parsing to do
Note: could easily be a JSON-LD payload
(alternative syntax using parameter ns2;ns=http://example.org/profiles/)
Option 4: add a namespace header
Content-profile-namespaces: :http://example.org/profiles/,w3c:http://w3.org/knownprofiles#
Content-profile: :token1, w3c:token2
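For illustration, a minimal Python sketch of how a client might resolve the Option 4 headers above to full URIs. The function names are hypothetical, and the splitting rules (comma between entries, first colon separates prefix from the rest) are assumptions read off the example lines, not a specified syntax.

```python
def parse_namespaces(header_value):
    """Parse 'prefix:namespace,prefix:namespace' into a {prefix: namespace} dict.

    An empty prefix (entry starting with ':') is treated as the default namespace.
    Assumes namespace URIs contain no commas.
    """
    namespaces = {}
    for entry in header_value.split(","):
        prefix, _, uri = entry.strip().partition(":")
        namespaces[prefix] = uri
    return namespaces


def resolve_profiles(profile_header, namespaces):
    """Expand each 'prefix:token' item of a Content-profile value to a full URI."""
    uris = []
    for item in profile_header.split(","):
        prefix, _, token = item.strip().partition(":")
        uris.append(namespaces[prefix] + token)
    return uris


ns = parse_namespaces(":http://example.org/profiles/,w3c:http://w3.org/knownprofiles#")
profiles = resolve_profiles(":token1, w3c:token2", ns)
```

The extra parsing step is the "clients have more parsing to do" cost listed under Option 3; here it is a dictionary lookup plus string concatenation.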
@rob-metalinkage
Thanks for putting this together. Just some spontaneous remarks on the different options.
Option 1:
Server offers a 1:1 mapping from (optional) tokens to URIs using typical "parameter" syntax - we define the parameter "token"
Accept-profile: URI[;token=mytoken],
Pros: self-contained, flexible, explicit.
Cons: potential redundancy in URIs if many offered
Should that be Content-Profile: URI[;token=mytoken]? Otherwise I don't understand from where the client gets the mappings of URIs to tokens. In my understanding the message exchange would be:
<request>
GET /some/resource HTTP/1.1
Accept-Profile: urn:example:profile:1
-----------
<response>
HTTP/1.1 200 OK
Content-Profile: urn:example:profile:2;token=profile-2
------------
<new request, now the client knows that urn:example:profile:2 is bound to the token profile-2>
GET /some/other/resource HTTP/1.1
Accept-Profile: profile-2
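The exchange above can be sketched from the client side as follows. This is only a sketch of my reading of Option 1: the function name is hypothetical, the `;token=` parameter is the one proposed in this thread, and no real HTTP request is made.

```python
def parse_content_profile(value):
    """Parse 'URI[;token=T], URI[;token=T]' into a {token: uri} dict."""
    mapping = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        uri, _, params = item.partition(";")
        for param in params.split(";"):
            name, _, val = param.strip().partition("=")
            if name == "token" and val:
                mapping[val] = uri.strip()
    return mapping


# Step 1: the first response carries the binding of URI to token.
known_tokens = parse_content_profile("urn:example:profile:2;token=profile-2")

# Step 2: a later request may use the token the client has just learned.
next_request_headers = {"Accept-Profile": "profile-2"}
assert next_request_headers["Accept-Profile"] in known_tokens
```

Note that the client only learns a token after it has already seen the URI it is bound to.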
Option 2:
tokens are registered in a global registry and servers may use them in place of a URI
This is the cleanest solution. Whether it works really depends on how many profiles will be registered (and by whom...). For some organisations it can be too complex to register profiles. OTOH could something like prefix.cc for profiles be a solution (or maybe re-use prefix.cc!).
Option 3: allow namespace declarations - like in JSON CURIE (compact URI - declare namespaces)
Accept-profile: @:,@ns2:
Sorry, My brain cannot parse that syntax... Can you provide an example?
Option 4: add a namespace header
Do I understand correctly that this would mean that tokens are not global identifiers but only valid in the context of the specified namespace?
@nicholascar scripsit:
and in HTTP headers there is a microformat in Link headers allowing for extra info with URIs that might be equivalent.
I just re-read the syntax section of RFC 8288 (Web Linking) and could not really find anything... Can you expand a bit on what this microformat looks like and where it's defined?
fixed formatting in original comment so namespaces visible...
Please do not select a solution which depends on a central registry for profiles.
I find "a short token […] may be used" to be vague. Which party can use it, and which party is responsible for this mapping?
Does a server necessarily need to understand that short token? Is the mapping from short tokens to URIs a universally agreed one? If not, does that mean that different service can interpret different short tokens differently?
Or are short tokens… universal? Like a sort of URI?
@RubenVerborgh yes, tokens are underspecified as the document is written right now. I _think_ (@rob-metalinkage and @nicholascar might disagree...) that:
Further, my understanding is that:
Okay, but what would be the benefit of such a one-server mapping? It wouldn't really save bandwidth, and the client would still need to know what it is looking for.
Notes:
1) many servers already support such a one-server mapping for what are logically profiles of a resource - these can be described, so adding a list profiles method makes these APIs conform to this spec
2) the Profiles vocabulary supports a "preferred token"
De-tagging as profile-negotiation as dealt with by its point of view
de-tagging for Profile Negotiation as the issue is handled in the doc
I have a couple of questions, the answers of which might help me understand the reasons for a client to use a token instead of a URI.
curl, i.e., of saving keystrokes / avoiding typos?
I'm asking specifically because this can help us assess the validity of a proposed token mapping process.
client may or may not know the token or what it means - that's a different architectural question for each implementation, which we can only address by specifying how they could find out.
The token mapping exists for three main reasons:
1) it allows us to characterize existing systems that use a token to identify a profile - in general that is how the most common and trivial case of negotiation ("i want this one") is handled in many systems that offer different representations of things. (all they have to do is provide a means to list profiles and tokens - i.e. canonical form of documentation about what such tokens mean) - this is actually the whole motivating Use Case for all the participants planning to implement conneg, and why we need the profiles ontology to offer an out-of-band implementation of list profiles.
2) It makes for clarity when a human finds links that specify a particular profile to access
3) it allows us to provide a pathway towards use of URIs as dereferencable identifiers
So, yes URIs have primacy as unambiguous option, but we are seeking a way to retrofit them with minimal pain.
The guidance document could usefully cover this and other usage patterns. From the conneg perspective this is an identified requirement that has been met, and can go out for review for specific feedback. Feel free to open a new and conneg-specific issue if you want to propose an improvement in wording - or point out a specific flaw - but this issue was about meeting the requirement in the proposed text, and that has been done.
(in general this is why i have voted to restart the guidance doc - i think these questions about understanding how various current practices map onto formalisms live there in general - leaving specs clean as possible. )
@rob-metalinkage I am totally unclear how this relates to the proposed guidance document, which is about general "best practices" for the 'creation' of profiles. I am not aware that it will say anything about content negotiation by profile, nor anything about "usage patterns".
I find @RubenVerborgh 's arguments against tokens to be quite sensible.
client may or may not know the token or what it means
That puts us in trouble for creating and assessing such a mapping.
The current proposal I've seen is:
Link: <http://example.org/resource/a.profx.ttl>; rel="self"; type="text/turtle"; profile="urn:example:profile:x;token=px",
<http://example.org/resource/a.profy.ttl>; rel="alternate"; type="text/turtle"; profile="http://example.org/profile/2;token=py",
<http://example.org/resource/a.profx.xml>; rel="alternate"; type="application/xml"; profile="urn:example:profile:x;token=px",
<http://example.org/resource/a.profy.xml>; rel="alternate"; type="application/xml"; profile="http://example.org/profile/2;token=py",
<http://example.org/resource/a.html>; rel="alternate"; type="text/html"
However, this approach is useless for clients that don't know the token.
So I'm afraid we'll need better constraints. The minimum that I can imagine (but help me if I'm wrong) is that the client knows the token, but this begs the question (as I wrote above) why you wouldn't give it the URI in the first place.
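To make the proposal concrete, here is a rough Python sketch of what a client would have to do to pull the URI-to-token mapping out of such a Link header. It is an assumption-laden illustration: the function name is hypothetical, the regular expressions are a simplification rather than a full RFC 8288 parser (they assume every parameter value is double-quoted), and the `;token=` convention inside `profile` is the one from the example above.

```python
import re

# One link-value: <target> followed by zero or more ; name="value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def token_map_from_link(header_value):
    """Return {token: profile_uri} for every link whose profile carries a token."""
    mapping = {}
    for _target, raw_params in LINK_RE.findall(header_value):
        params = dict(PARAM_RE.findall(raw_params))
        profile = params.get("profile")
        if profile is None:
            continue  # e.g. the text/html alternate has no profile
        uri, sep, token = profile.partition(";token=")
        if sep:
            mapping[token] = uri
    return mapping


example = (
    '<http://example.org/resource/a.profx.ttl>; rel="self"; type="text/turtle"; '
    'profile="urn:example:profile:x;token=px", '
    '<http://example.org/resource/a.html>; rel="alternate"; type="text/html"'
)
mapping = token_map_from_link(example)
```

The result maps `px` to `urn:example:profile:x`, so a client reading this header ends up holding both identifiers at once.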
- it allows us to characterize existing systems that use a token to identify a profile
But this is a case where the client _knows_ the token, right?
- It makes for clarity when a human finds links that specify a particular profile to access
But one would presume such links to be in human format, e.g., HTML with a link? People today also don't guess the content type from a URL.
- it allows us to provide a pathway towards use of URIs as dereferencable identifiers
You mean the "token" is a URI then? But then it's just the general case, i.e., not a token at all.
In any case, none of the above 3 cases showcase a "agent does not know token" situation.
I find @RubenVerborgh 's arguments against tokens to be quite sensible.
Thanks @kcoyle, but I was not even arguing yet :-) This is me honestly trying to understand how a mapping mechanism would work, and what prior knowledge we can assume in clients that want to use it. (And indeed, given some configurations of that prior knowledge, why we have tokens altogether.)
Thanks Ruben for pushing the architecture angle.. its where so many things fail to translate to implementation that works..
A client needs to "know" a URI in exactly the same way... but external systems are empowered to make unambiguous and navigable assertions using URIs .. which is why the list profiles mechanism is provided, and can be added to a system that already expects clients know tokens, to improve its interoperability and self-description.
Clients may also know the identifier from a catalog.. such as DCAT. The same way it may know the server location and capabilities.
From server and data metadata it may also be able to gain information via the Profiles ontology now URI identifiers support this linkage. Once you have a description or have found the service endpoint the client "knows" both what the URI might mean in relation to standards it can use as well as the token mapping.
Architecturally we are trying to make servers a little better at being self describing, but still expect most descriptions to be via some form of catalog or specific documentation. In its current form i think it improves the capacity to describe services that allow content profiles to be specified for information retrieval purposes. Hopefully the extra step of looking up tokens would encourage people to use well known URIs but without well known URIs already it would be difficult to convince people to both change their systems AND have faith in a wide acceptance of such URIs as well known or dereferencable to get useful descriptions.
Without this assumption that the client may know about a profile there is a higher bar that URIs must be resolvable and to be useful that means a canonical description. The profiles vocabulary provides a canonical means to discover the relationships necessary to map a client's knowledge of profiles it needs or supports with profiles offered by a server .. and constraints specifications.. but it doesn't attempt to harmonise all those constraint description methods. It's also too early to mandate a fully self describing approach but the introduction of URI identifiers as mandatory provides a hook for future improvement.
there is a higher bar that URIs must be resolvable
That's not the case for the IETF draft we are preparing though: URIs are just (unique) identifiers. Does the W3C document mandate resolving for _all_ URIs (not just HTTP URIs)?
If the main advantage of tokens is that they do not resolve, URIs can do the same thing.
It seems that we are building a complex system to not really solve anything.
But my earlier questions still stand: what can we assume about clients that need to access the mapping? Can you (and hopefully several others) confirm that clients _always_ know the string of the token? Because if not, we don't even have a viable mechanism.
There is no advantage in not being resolvable. The advantage is in being able to describe systems that already, or prefer to, use short tokens.. so we provide a way to map to URIs. As one of the implementations planned that's my motivating use case.. to connect those existing tokens to descriptions via URIs so i contend it is more than useful.. it's critical.
@rob-metalinkage I understand, but we won't be able to help your case without an answer to the questions I have written above.
Regarding your specific use case, it seems that a solution would also be to have a client-side mapping, so a server-side mechanism might not even be needed.
@RubenVerborgh I am assuming you mean these questions:
"But my earlier questions still stand: what can we assume about clients that need to access the mapping?
Can you (and hopefully several others) confirm that clients always know the string of the token?
Because if not, we don't even have a viable mechanism."
I might be missing something important - but if a client asks for a list of profiles and gets the URI and any token mappings, then it can invoke them and potentially interrogate the URI to get information. As an implementer i can make servers do this - or describe services in catalogues. What is not viable about this, and/or what alternative exists?
- because as i tried to explain we offer a mechanism for them to discover both the URIs and tokens
Which gives them the token _and_ the URI, so why would they want to use the token then?
All of the above answers make me very concerned about the token mechanism, as it complicates things unnecessarily, since there seems to be no single action that a client can do with a token that it could not already do.
The argument above that this mechanism would allow to
describe systems that already, or prefer to, use short tokens.
is in my opinion not relevant, because the only case for such a preference seems to be legacy. Yet these legacy systems would _still_ need to be adapted to use the mapping mechanism, so it is easier to just replace the tokens by URIs.
How many such tokens are actually in use today? Is it more than 10 or 20?
Wouldn't it just be easier to publish one constant token-to-URI mapping for those that exist (let's say xyz becomes https://w3id.org/dxwg/tokens#xyz) and leave it at that?
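A constant mapping of that kind needs very little code. The sketch below is illustrative only: the base URI is the one floated in the comment above (not a registered namespace), the function name is hypothetical, and the "looks like a URI" test simply checks for an RFC 3986 scheme prefix, so a legacy token that itself contains a colon would be misread as a URI.

```python
import re
from urllib.parse import quote

# Base taken from the example in the comment above; purely illustrative.
TOKEN_BASE = "https://w3id.org/dxwg/tokens#"

# RFC 3986 scheme: a letter followed by letters, digits, '+', '-' or '.', then ':'.
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def token_to_uri(identifier):
    """Map a legacy token to a URI under the fixed base; pass URIs through unchanged."""
    if SCHEME_RE.match(identifier):
        return identifier
    # Percent-encode anything that is not safe in a fragment identifier.
    return TOKEN_BASE + quote(identifier, safe="")
```

Under these assumptions `token_to_uri("xyz")` gives `https://w3id.org/dxwg/tokens#xyz`, and an identifier such as `urn:example:profile:1` is returned untouched.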
The motivation for catering for tokens is that existing profile negotiation systems (pyLDAPI, OAI-PMH, OGC Name Server) all use tokens. The things clients using these services currently need to know is a resource URI as well as service endpoints, QSA parameters etc.
With the inclusion of tokens in list profiles responses, then the number of things client needs to know is reduced - just the resource URI.
The additional client burden, if tokens are given by a server but not wanted by the client, is that the client just has to ignore them in the Link header.
(Sorry to be the one that replies to every single comment; trying to limit my interactions to the strictly necessary.)
The motivation for catering for tokens is that existing profile negotiation systems (pyLDAPI, OAI-PMH, OGC Name Server) all use tokens.
Check, that's a good list. It would be helpful if you could answer these questions: https://github.com/w3c/dxwg/issues/290#issuecomment-525306101, the most important one being: do those clients all _know_ the token before they make a request?
With the inclusion of tokens in list profiles responses, then the number of things client needs to know is reduced - just the resource URI.
Let's deconstruct that argument. If a client does a "list profiles", I am presuming that they _don't_ know the token they want, otherwise they wouldn't ask. So presumably, they are asking for a list of possible profiles, to then either harvest them all, or give the choice to a user or agent who _does_ know about tokens. So that is either a manual choice, or there _is_ an agent who knows about the token. But that last case would negate the need to list the profiles, so only an interactive situation seems to fit. So the situation of "not knowing the token beforehand" does not seem to warrant the need for non-URI profile identifiers.
Now let's examine the situation of "knowing the token beforehand". If the client does know the token, why does it matter whether that token has the shape of the URI or whether it does not? I.e., what is the technical reason for such clients to prefer a non-URI string over a URI string?
So the number of things a client needs to know, does not seem to be reduced. You either have an identifier for the profile, or you do not. I have not yet seen an argument for why it would matter whether that identifier is a URI or a non-URI.
The additional client burden, if tokens are given by a server but not wanted by the client, is that the client just has to ignore them in the Link header.
So, we have established that:
But we have still not established that:
Or, refining that last point: is it really so much easier to retrofit existing negotiation systems with a mapping discovery process, only such that they can use non-URI identifiers for a profile?
tokens do not require extra work for servers - they can simply not support them
tokens (in the context of conneg) require less work for clients that already use them (because servers do) because now there would be a way of automating finding what they mean.
And as someone who is retrofitting an existing token based negotiation system, the proposed solution is the only option we've identified that seems workable - and its certainly easy enough.
tokens require more work for servers that are using tokens already - in that to conform they need to self-describe the mappings - but less work than if they are forced to change APIS to support URIs - which is also a burden on existing clients.
tokens (in the context of conneg) require less work for clients that already use them (because servers do) because now there would be a way of automating finding what they mean.
I have a hard time understanding that.
"finding what they mean" then equals "determining the URI for the non-URI identifier"?
To then use the URI in the future?
So those systems have to be _changed_ to look up the non-URI as URI?
If you have to change anyway, why not simply change it directly into a URI?
the proposed solution is the only option we've identified that seems workable
And changing non-URIs into URIs is not workable because?
As I wrote above, we could just make a list.
but less work than if they are forced to change APIS to support URIs
I can't see that. Tokens are strings, URIs are a subset of strings. The only change (if even needed) is to accept strings that start with xyz:.
…compared to implementing a whole mapping mechanism.
APIs with published tokens that describe what those tokens mean in documentation aren't that easy to change compared to adding a way to map them to URIs so such documentation can be found in future. yes they are just strings - but we are talking about strings that exist in the world - for example I have lots of data where such tokens have been used in entailed links based on the object type and what services will respond to it.
If those APIs are published standards, then changing token strings to be URI versions (agreed they are just strings) is even more complex than just changing the systems.
So I still see no reason for a new API to support legacy tokens, or to even have to support a standardized mapping.
I agree completely with Ruben. When paving cowpaths, not every little bush causes the road to have an unnecessary curve. Standards need to consider the good of the overall environment, not just the ease of layering on top of existing non-interoperable systems. I share the opinion that a token is unnecessary, and trending towards harmful.
The problem with the logic is here "So those systems have to be changed to look up the non-URI as URI?"
No - if systems already use tokens and understand them they don't need the URI - the lookup method allows better self-documentation and new clients to discover what tokens mean (something they can't do very well at the moment - it usually means discovering and reading human readable documentation - both non automatable).
I don't think we are disputing it is going to be preferable to use only URIs moving forward. At the moment however all the implementers proposing to test this need to handle the token case. Unfortunately the bush in the road is the existing Web, and every service out there that allows alternative representations to be provided for a given non-information resource. To offer a solution that doesn't support tokens we'd need to find at least two new implementers who are ready to go deploying systems that have stable URIs for profiles. I'm happy to retrofit to support URIs (as well) - but all my data uses links with tokens (and i can adapt to the QSA model for list-profiles) and I'm unwilling to create unreadable URLs with nested encoded URIs while I'm easing my community into the whole idea of dereferenceable URIs.
clients that do already know tokens (or URIs) dont have to be changed - what this specification offers is a canonical means of finding what profiles a server supports - so the "change" is an optional retrofit of better metadata - and the burden of describing tokens already in use is trivial - it just requires a server to map to URIs. Systems that are starting from scratch can use URIs without the added option of tokens. Minimal impact for maximum applicability.
The burden (the straight road we are forcing) is to mint stable URIs and provide a mechanism to tell people what they mean. Tokens are just a trivial annotation on that - and my view is that if we do not ease existing practices into URIs by allowing them to document existing APIs the barrier of minting URIs before you start becomes too high. (and this is from the perspective of someone employed to help a standards organisation mint URIs - its a seriously non-trivial task unless you think you have the one uber standard the world is going to use for all things and never need to specialise.)
@rob-metalinkage Can you give concrete examples of APIs using tokens for profiles, or needing them because profile discovery is currently done with tokens? And by "concrete" I mean either links to the API entry points (if they are browser-based) or their documentation. In the UCR document we have one requirement relating to tokens, although tokens are not mentioned in the Use Case it is derived from. As use case and requirement these are pretty non-specific, and you haven't yet pointed to actual usage in this thread. That would be helpful. Thanks.
OK here's one (separate from the one I will implement)
https://github.com/UKGovLD/ukl-registry-poc/wiki/Api
The OGC definitions server is based on a thing called the Linked Data API
which is also deployed on UK and Australian infrastructures:
https://documentation.ands.org.au/display/DOC/Linked+Data+API
(you need to either look at examples or dive down into specs at https://github.com/UKGovLD/linked-data-api/blob/wiki/API_Query_Parameters.md#viewing to see how it has a _view parameter which is basically a profile.)
Nick can point to Australian Government Linked Data WG practices which are different again.
Taking a slightly wider scope beyond simple URI based mechanisms, many Web Services support something similar. A case in point is OGC Web Services, which offer a GetCapabilities method that returns a list of resources that can be asked for by token (Map layer, Feature Type, etc.). One of the limitations of these is that they don't usefully distinguish between when these tokens represent different objects or alternative views of the same object - and improving that situation is a key motivation for profiles description work, and the use of conneg to map such tokens to URIs means we can make such services self-describing without forcing a rewrite of those specifications, which are adopted as standards in ISO, EU legislation etc.
@rob-metalinkage Thanks for the pointers. I may have been misunderstanding the goal here, but I was expecting that there would be examples that involve profiles, since that's the discussion here, tokens for profiles. Is the point that some future registries of profiles would use tokens? Unfortunately, when I try to see the UK registry itself linked from there I get a 404. The Australian one states that it is only for SKOS vocabularies.
@kcoyle these are examples that use things that are consistent with the functional notion of a profile. Not all things that can be called a dct:Standard will be called a "standard" - so profiles is a general term for constraints which include such things as "views" or "shapes" or "rules" or many other things.
It may help to understand that the gap between services that support specification of a particular representation - and the conneg spec - is the ability to map those options to unambiguous URIs in a canonical way (the "list profiles" function). This handles the simple case where a service supports the most trivial case - returning just the profile asked for. We do need to think about the case where "A client executing a get resource by profile MAY request a resource representation conforming to one of any number of profiles with its preference expressed in a functional profile-specific ordering."
(I'll implement support for this and the list profiles method to evolve services like these to conneg functional profile conformance, it may not be possible to fully describe existing systems as conneg conformant but we can show the pathway is not hard)
NB Conneg has another advantage of making URIs available to document profiles - it also provides rules for handling hierarchies, in that you may return a specific profile if asked for a general one, as long as the specific profile is transitively a profile of the general one (i.e. inherits constraints so that instances conform to the general one).
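The hierarchy rule described above can be sketched in a few lines of Python. Everything named here is hypothetical: the `urn:example:` profile URIs, the `IS_PROFILE_OF` table (standing in for whatever "is a profile of" relationships a server knows about), and both function names. It only illustrates the transitive check, not how a conformant server must implement it.

```python
# Hypothetical hierarchy: each profile URI maps to the profiles it directly profiles.
IS_PROFILE_OF = {
    "urn:example:profile:specific": ["urn:example:profile:general"],
    "urn:example:profile:general": ["urn:example:profile:base"],
    "urn:example:profile:base": [],
}


def conforms_to(candidate, requested, hierarchy=IS_PROFILE_OF):
    """True if candidate equals requested or is transitively a profile of it."""
    seen = set()
    stack = [candidate]
    while stack:
        current = stack.pop()
        if current == requested:
            return True
        if current in seen:
            continue  # guard against cycles in the declared hierarchy
        seen.add(current)
        stack.extend(hierarchy.get(current, []))
    return False


def pick_representation(requested, available, hierarchy=IS_PROFILE_OF):
    """Return the first available profile that satisfies the requested one, else None."""
    for profile in available:
        if conforms_to(profile, requested, hierarchy):
            return profile
    return None
```

So a request for `urn:example:profile:base` may be answered with a representation conforming to `urn:example:profile:specific`, but not the other way round.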
The discussion over at #501 has led to the following resolution:
Servers can use a second link header to publish any mapping between a profile URI and the corresponding token. This is now part of conneg-by-ap in example 12, and the Link header attribute token is defined in the same document in §7.1
In my opinion, that means that this requirement is satisfied. Do all of the commenters agree?
@nicholascar @rob-metalinkage @kcoyle @RubenVerborgh @aisaac @akuckartz @azaroth42
Most of the commenters (not @aisaac & @azaroth42) have 'thumbs up' above so I'm marking this due-for-closing.