In the introduction, there is no mention of profiles before the last sentence, which is quite disconnected from the text immediately above it. There needs to be another paragraph to explain that profiles are the solution to the problem posed above.
I also think that the motivation section belongs as part of the introduction.
makes sense. Do you want the editors to do another draft or propose changes via a PR?
As the doc gets larger, the intro needs to introduce it - so keeping the motivation separate still feels right to me at this stage.
Here's an attempt at a new introduction. It uses most of what Phil had written, but adds a paragraph at the beginning. I'm not sure about the part that talks about "best practices" for ontologies - I agree with it but I'm not sure what we should say about it here. In any case, I've left his text intact.
(updated with wording from @agreiner, 05-10-2018)
A profile is generally understood as representing a point of view. In information technology, profiles may also support the data needs of specific applications. Profiling is often the work of a community interested in interoperability and data exchange. We define a profile generally as a named set of constraints on one or more identified base specifications.
Communities create and use data standards to ensure interoperability for information exchange. Although members of a community may use the same basic standard schema, it is very common for different subsets within the larger community to need some further specification of the data they create to meet their own needs. To continue to support interoperability of their data with others, these community members need to express the specifics of their implementation of the data schema. Profiles serve this purpose. Profiles enumerate vocabulary terms, cardinality, and validation rules, and can also include descriptions of the rules used by creators to make decisions regarding their data elements.
Good metadata practice begins with the builders of vocabularies and ontologies. Builders of vocabularies and ontologies are encouraged to make their work as broadly applicable as possible so as to maximize future adoption. As a result, vocabularies and ontologies typically define a data model using minimal semantics. For example, DCAT [vocab-dcat-2] defines the concept of a dataset as an abstract entity with distributions and data services as means of accessing data. It is silent on whether a distribution should be in a particular serialization, or set of serializations. It is also silent on how data services should be configured. While it states that the value of dcat:theme should be a SKOS concept, it does not specify a particular SKOS [skos-reference] concept scheme, and so on. Other vocabularies such as Dublin Core Terms [DCTERMS] are equally parsimonious in their prescriptions of how they should be used. This means that data models and methods of working can be applied in different circumstances than those in which the original definition work was carried out and, in that sense, these promote broad interoperability.
In addition to addressing the needs of a specific community, a profile may also apply to a single system. Any individual system will be designed to meet a specific set of needs; that is, it will operate in a specific context. It is that context, and the individual choices made by the engineers working within it, that will determine how a vocabulary or set of vocabularies will be used. For example, a system ingesting data may require that a specific subset of properties from a range of vocabularies is used and that only terms from a defined code list are used as values for specified properties. In other words, where the 'base vocabulary' might say "the value of this property SHOULD be a value from a managed code list", a specialized profile will say "the value of this property MUST be from this specific code list".
This document is about how to formulate and communicate profiles.
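As an aside, the SHOULD-versus-MUST distinction in the last paragraph of the draft can be made concrete with a small sketch. This is only an illustration under stated assumptions, not wording or a mechanism proposed for the document: the class names, the `example.org` theme URIs, and the cardinality chosen for `dcterms:title` are all hypothetical. It treats a profile as a named set of constraints (cardinality and a fixed code list) over properties taken from base vocabularies.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class PropertyConstraint:
    """What a profile requires of one property (hypothetical structure)."""
    min_count: int = 0                               # minimum cardinality
    max_count: Optional[int] = None                  # None means unbounded
    allowed_values: Optional[FrozenSet[str]] = None  # None means any value


@dataclass(frozen=True)
class Profile:
    """A named set of constraints on one or more base specifications."""
    name: str
    base_specifications: Tuple[str, ...]
    constraints: Dict[str, PropertyConstraint]

    def validate(self, record: Dict[str, List[str]]) -> List[str]:
        """Return human-readable violations; an empty list means conformance."""
        errors: List[str] = []
        for prop, rule in self.constraints.items():
            values = record.get(prop, [])
            if len(values) < rule.min_count:
                errors.append(f"{prop}: needs at least {rule.min_count} value(s)")
            if rule.max_count is not None and len(values) > rule.max_count:
                errors.append(f"{prop}: allows at most {rule.max_count} value(s)")
            if rule.allowed_values is not None:
                for value in values:
                    if value not in rule.allowed_values:
                        errors.append(f"{prop}: {value!r} is not in the code list")
        return errors


# Hypothetical code list: the base vocabulary only says "a SKOS concept";
# the profile says "MUST be one of these".
THEMES = frozenset({
    "https://example.org/themes/energy",
    "https://example.org/themes/transport",
})

example_profile = Profile(
    name="Example ingest profile",
    base_specifications=("DCAT", "DCTERMS"),
    constraints={
        "dcat:theme": PropertyConstraint(min_count=1, allowed_values=THEMES),
        "dcterms:title": PropertyConstraint(min_count=1, max_count=1),
    },
)
```

A record that uses a theme from the code list and has exactly one title would produce no violations, while a record with an unlisted theme or a missing title would be reported. The same structure could serve either to test existing data or to guide its creation.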
I think @kcoyle 's text sounds good. But before approving it I'd like to ask everyone here - especially @agreiner ! - what they think of the text that @pwin proposed (the one starting with 'A profile is generally understood as being the outline', present in the note at the beginning of the intro).
It tried to put 'profile' right at the beginning...
I like the way this brings in profiles early and begins to explain what they are. I agree with the general idea, but I'd like this to be about data, not just metadata. So, something like this would work better for me:
"Communities create and use data standards to ensure interoperability for information exchange. Although members of a community may use the same basic standard schema, it is very common for different subsets within the larger community to need some further specification of the data they create to meet their own needs. To continue to support interoperability of their data with others, these community members need to express the specifics of their implementation of the data schema. Profiles serve this purpose. Profiles enumerate vocabulary terms, cardinality, and validation rules, and can also include descriptions of the rules used by creators to make decisions regarding their data elements.
Good data modeling practice begins with . . ."
The last paragraph starts with a sentence fragment, which might be solved by removing the word "since". It strikes me as a little odd that this paragraph only addresses the needs of a single system rather than a community or subset of a community. Given the first paragraph's focus on the latter, it could be reworded as
"In addition to addressing the needs of a specific community, a profile may also apply to a single system. Any individual system will be designed to meet a specific set of needs; that is, it will operate in a specific context. It is that context, and the individual choices made by the engineers working within it, that will determine how a vocabulary or set of vocabularies will be used. For example, a system ingesting data may require that a specific subset of properties from a range of vocabularies is used and that only terms from a defined code list are used as values for specified properties. In other words, where the 'base vocabulary' might say "the value of this property SHOULD be a value from a managed code list", a specialised profile will say "the value of this property MUST be from this specific code list".
@agreiner I'm fine going with "data" rather than "metadata" - it's a long-standing tension - the whole "my metadata is your data" etc. I hope we don't have to define "data" though - ?? do we?
I don't see any need to define data.
FYI, here is the text provided by @pwin
A profile is generally understood as being the outline of the margin of some thing when seen from a specific point of view. Profiling is also the task of distilling the essential aspects or character of something, such as a person, from a specific angle. Also, in the craft domain, a profile is taken by a tool that matches itself in detail to the contours of a 3-dimensional object and returns a 2-dimensional accurate representation from which other formable materials can be constrained and fashioned to that profile, or matched with it to determine how accurately it portrays the original 3-dimensional object from which the profile was taken.
In the same sense then, information entities can be viewed from different perspectives and in order to prepare them for specific uses they are frequently tested for their goodness of fit to some pattern, or the pattern can be provided prior to the gathering of the information to provide some constraint to ensure adequacy and appropriateness of that information asset to the job in hand.
How should we describe this profiling tool for information? In the physical domain we can profile with a paper cutout in the simple case, or with a laser in the technologically more complex case. Is this the same with information assets?
I think that is a brilliant analogy.
This seems to be a description of what is commonly called a 'view'.
Here's what the introduction looks like substituting Peter's top paragraphs for the previous first paragraph:
A profile is generally understood as being the outline of the margin of some thing when seen from a specific point of view. Profiling is also the task of distilling the essential aspects or character of something, such as a person, from a specific angle. Also, in the craft domain, a profile is taken by a tool that matches itself in detail to the contours of a 3-dimensional object and returns a 2-dimensional accurate representation from which other formable materials can be constrained and fashioned to that profile, or matched with it to determine how accurately it portrays the original 3-dimensional object from which the profile was taken.
In the same sense then, information entities can be viewed from different perspectives and in order to prepare them for specific uses they are frequently tested for their goodness of fit to some pattern, or the pattern can be provided prior to the gathering of the information to provide some constraint to ensure adequacy and appropriateness of that information asset to the job in hand.
Communities create and use data standards to ensure interoperability for information exchange. Although members of a community may use the same basic standard schema, it is very common for different subsets within the larger community to need some further specification of the data they create to meet their own needs. To continue to support interoperability of their data with others, these community members need to express the specifics of their implementation of the data schema. Profiles serve this purpose. Profiles enumerate vocabulary terms, cardinality, and validation rules, and can also include descriptions of the rules used by creators to make decisions regarding their data elements.
Good metadata practice begins with the builders of vocabularies and ontologies. Builders of vocabularies and ontologies are encouraged to make their work as broadly applicable as possible so as to maximize future adoption. As a result, vocabularies and ontologies typically define a data model using minimal semantics. For example, DCAT [vocab-dcat-2] defines the concept of a dataset as an abstract entity with distributions and data services as means of accessing data. It is silent on whether a distribution should be in a particular serialization, or set of serializations. It is also silent on how data services should be configured. While it states that the value of dcat:theme should be a SKOS concept, it does not specify a particular SKOS [skos-reference] concept scheme, and so on. Other vocabularies such as Dublin Core Terms [DCTERMS] are equally parsimonious in their prescriptions of how they should be used. This means that data models and methods of working can be applied in different circumstances than those in which the original definition work was carried out and, in that sense, these promote broad interoperability.
In addition to addressing the needs of a specific community, a profile may also apply to a single system. Any individual system will be designed to meet a specific set of needs; that is, it will operate in a specific context. It is that context, and the individual choices made by the engineers working within it, that will determine how a vocabulary or set of vocabularies will be used. For example, a system ingesting data may require that a specific subset of properties from a range of vocabularies is used and that only terms from a defined code list are used as values for specified properties. In other words, where the 'base vocabulary' might say "the value of this property SHOULD be a value from a managed code list", a specialized profile will say "the value of this property MUST be from this specific code list".
This document is about how to formulate and communicate profiles.
@dr-shorthair I agree that Peter's text talks about a "view". In Dublin Core we use the "view" analogy often for profiles. IMO a profile is a "view" made concrete. I suppose we could say that there are views that are temporary, and we would not generally consider those application profiles. Note also that we seem to be dropping the "application" term from the charter title and description. I don't think that's a bad thing, but we should think about whether we are losing anything in the process. It may be worth its own issue. I'll open.
The everyday understanding of profile covers both the view from a perspective and also a pattern from which material can be either tested for goodness of fit or assembled to match the pattern. That's what I am trying to include early in the document.
I think that these are the two senses in which we are using the term.
I think that thinking that my text described a view is too simplistic an interpretation and doesn't recognise the point that the profile is also potentially a validator and/or a jig.
Perhaps the actual sense is that profiles are specifications of views - they provide a basis for validation and identification of viewpoints.
Everything is observable via some viewpoint(s) - but viewpoints are not necessarily interoperable (i.e. understandable in a domain without further information not immediately accessible). Profiles provide for identification of views, and discovery of specifications that support, among other things, validation.
Also, I find this a little awkward: "the outline of the margin of some thing" - it introduces two levels of abstraction. Unless this is grounded in some article that uses this expression, why not just "the outline of some thing"?
+1 to dropping "application". We've done this everywhere in the Conneg & Prof Ont docs as it really adds nothing of value and becomes another term that then needs to be defined and understood for effective use. Happy to elaborate in a distinct issue if created (over here it seems: https://github.com/w3c/dxwg/issues/448).
Maybe the discussion on 'view' risks oversimplifying what we say about the notion of profile, especially if it can be understood as 'database views'. But I think it's worth keeping the reference to it (if only because some profiles could actually be implemented by a DB view...).
Keeping @pwin 's original questions would maybe be a way to keep that 'view' metaphor but instruct the reader that the notion of profile is more sophisticated than some understandings of the word.
Exactly my intention @aisaac
I wrote it as a "funnel" to draw the reader in from the general understanding of the term to help us then be specific about what we mean by the term.
Every time I mention to people that we're working on profiles the initial response is something along the lines of "that's a loaded term. What exactly do you mean by 'profile'?"
OK, I lack the time to make a new suggestion for text, but I'm going to try to help by making sure that other comments that were made in @kcoyle 's PR about an earlier version of her text are visible in this issue, and can be tackled in future suggestions:
https://github.com/w3c/dxwg/pull/446#issuecomment-427430817
https://github.com/w3c/dxwg/pull/446#pullrequestreview-162323738
https://github.com/w3c/dxwg/pull/446#pullrequestreview-162319297
https://github.com/w3c/dxwg/pull/446#issuecomment-427739038
Please close - issue addressed and relevant changes pulled to doc
The issue is not addressed: there are still some comments pending, as per the discussion above and the links to other tickets. For one, I can see that the intro still includes "Good metadata practice".
@nicholascar I'm not asking you to solve it btw. I guess that now that you were ready to close it, and @rob-metalinkage has chimed in on several aspects, moving on this issue is probably something @kcoyle and I could do at TPAC.
Hi everyone, we've updated the introduction in a way that hopefully addresses all the suggestions that seemed to have been agreed upon here. I hope it will be alright with everyone and we can close the issue.
Please remember that this is about improving the intro, not making it perfect!
New text: https://rawgit.com/w3c/dxwg/updates-intro/profiles/#introduction
Diff: https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fw3c.github.io%2Fdxwg%2Fprofiles%2F&doc2=https%3A%2F%2Frawgit.com%2Fw3c%2Fdxwg%2Fupdates-intro%2Fprofiles%2F
After one week the PR has now been merged, and I'm hereby closing the issue!
@agreiner I think I should have perhaps explicitly asked you for feedback, as you were the one creating the issue. Are you happy with the changes? If yes, then perfect. If no, then I'd invite you to create a new issue with suggestions, as this one is a bit crowded :-)
I'm happy with it.
Thanks @agreiner !