Current text states:
"dcat:Catalog represents a catalog, which is a dataset in which each individual item is a metadata record "
and in the next paragraph:
"dcat:Resource represents an individual item in a catalog."
This seems likely to confuse a careful reader (is a dcat:Resource a metadata record?). Suggested edit for dcat:Resource:
_dcat:Resource represents a dataset or data service described by a metadata record in a catalog. This class is not intended to be used directly, but is the parent class of dcat:Dataset, dcat:DataService, and dcat:Catalog (as a subclass of dcat:Dataset). Member items in a catalog should describe a dataset or data service, a sub-class of these, or a sub-class of dcat:Resource defined in a DCAT profile or other DCAT application. dcat:Resource is effectively an extension point for defining a catalog of any kind of resource._
Thanks, @smrgeoinfo.
I would propose rephrasing the sentence for dcat:Resource as
"dcat:Resource is a metadata record which represents an individual item in a catalog."
Rather than
"dcat:Resource represents a dataset or data service described by a metadata record in a catalog."
As we are discussing in issue #967, datasets and data services are not the only items we want to represent and, to my mind, the sentence you have proposed seems to reinforce the opposite idea.
Would this work for you?
Yes, that works for me. I posted #966 before I started considering #967. I like the more general language.
@riccardoAlbertoni, saying that a dcat:Resource is a "metadata record" implies that dcat:Resource is a subclass of dcat:CatalogRecord, which is not the case.
I think @smrgeoinfo's proposed revision is more correct, perhaps slightly modified as follows to take into account the role of dcat:Resource as an "extension point":
dcat:Resource represents a dataset, data service, or another resource described by a metadata record in a catalog.
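To make the distinction concrete, here is a minimal Python sketch (not taken from the DCAT text) that models a few triples as plain tuples. All `ex:` identifiers and the helper names are hypothetical. Linking a record to the resource it describes via `foaf:primaryTopic` follows how DCAT uses dcat:CatalogRecord. The point is only that the resource and the record about it are separate nodes with separate types.

```python
# Triples as (subject, predicate, object); every ex: name is hypothetical.
TRIPLES = {
    ("ex:catalog", "rdf:type", "dcat:Catalog"),
    ("ex:catalog", "dcat:dataset", "ex:rainfall"),
    ("ex:catalog", "dcat:record", "ex:record-1"),
    ("ex:record-1", "rdf:type", "dcat:CatalogRecord"),
    ("ex:record-1", "foaf:primaryTopic", "ex:rainfall"),
    ("ex:rainfall", "rdf:type", "dcat:Dataset"),
}


def types_of(node):
    """Return the set of classes a node is explicitly typed with."""
    return {o for s, p, o in TRIPLES if s == node and p == "rdf:type"}


def records_about(resource):
    """Return the catalog records whose primary topic is the given resource."""
    return {s for s, p, o in TRIPLES if p == "foaf:primaryTopic" and o == resource}
```

In this sketch `ex:rainfall` is typed only as dcat:Dataset, while `ex:record-1` is the dcat:CatalogRecord that describes its entry in the catalog.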
@andrea-perego wrote:
@riccardoAlbertoni, saying that a dcat:Resource is a "metadata record" implies that dcat:Resource is a subclass of dcat:CatalogRecord, which is not the case.
I am afraid I don't see the implication about subclassing that you mention, as to my mind not every metadata record is a dcat:CatalogRecord.
A dcat:CatalogRecord is a specific metadata record for an entry in the catalog (it is the metadata record of a metadata record), but aren't dcat:Dataset and dcat:DataService metadata records for the actual datasets and services?
My apologies if I have forgotten any previous discussion on that point, but I was assuming that when we use the expression "metadata record" in
dcat:Catalog represents a catalog, which is a dataset in which each individual item is a metadata record describing some resource; the scope of dcat:Catalog is collections of metadata about datasets or data services.
we were referring to all the kinds of metadata records.
That is also because, in a dataset representing a catalog, I would include all of the above.
If I am wrong and the intended meaning was "metadata record" = "dcat:CatalogRecord", then, to keep others from getting confused as I am, I suggest replacing "metadata record" with dcat:CatalogRecord, as we use "metadata record" only once in the document, in the sentence above.
@riccardoAlbertoni, to me, using "is a" states that dcat:Resource is a subclass or instance of a metadata record, and therefore that it represents a metadata record (and so "metadata about metadata", a dcat:CatalogRecord), not a resource described in a catalogue. I understand from your reply that this is not what you intended to state, but I think this is how many people will read it, so it may lead to misunderstandings.
The definition of dcat:Catalog you cite sounds correct to me, so I don't think it needs to be changed - actually, the second sentence may need to, as it seems to limit the scope of dcat:Catalog to datasets and data services, excluding other types of resources (dcat:Resource).
On a different note, this discussion has also made me wonder whether it is appropriate to state (or simply suggest) that dcat:Resource denotes only resources documented in a catalogue, as this would apply to all its subclasses - with the result that dcat:Dataset and dcat:DataService cannot be used for datasets and services which are not documented in a catalogue.
This "constraint" is not included in the actual definition of dcat:Resource, but it may be (wrongly?) inferred from other statements - e.g., the section title:
Class: Cataloged Resource
and the usage note
The class of all cataloged resources...
An option could be to slightly revise them as follows:
Class: Resource
and
The class of all resources that MAY be documented in a catalog...
Ok, let's keep the first sentence as it is, and let's revise @smrgeoinfo's proposal differently.
(changes to @smrgeoinfo's proposal are marked below in bold)
dcat:Resource represents a dataset, a data service or any other item described by a metadata record in a catalog. This class is not intended to be used directly, but is the parent class of dcat:Dataset, dcat:DataService and dcat:Catalog. Member items in a catalog should be members of one of the sub-classes, or of a sub-class of these, or of a sub-class of dcat:Resource defined in a DCAT profile or other DCAT application. dcat:Resource is effectively an extension point for defining a catalog of any kind of resource.
Would this make both @andrea-perego and @smrgeoinfo happy?
If you prefer we can even use "resource" in place of "item", but it would be a little tautological.
@andrea-perego wrote
On a different note, this discussion has also made me wonder ...
I would say that what is only suggested is not recommended and shouldn't be a big problem, but perhaps I am oversimplifying ...
That said, if you think it is a "risk" we need to address, I suggest opening a dedicated issue for it.
@andrea-perego wrote
dcat:Resource denotes only resources documented in a catalogue, as this would apply to all its subclasses - with the result that dcat:Dataset and dcat:DataService cannot be used for datasets and services which are not documented in a catalogue.
Perhaps we should say 'catalogable' in the sense of 'potentially included in a catalog', which I think is the sense of what we are talking about, i.e. descriptions of these things in a form that is suitable for inclusion in a catalog but which may not yet appear in any specific catalog instance.
I prefer 'represent' to 'denote' for the relationship between the dcat:Resource and the thing in the world it describes. If I'm following, the suggested language is now something like
dcat:Resource represents any resource that can be documented in a dcat:Catalog.
@andrea-perego wrote
using "is a" states that dcat:Resource is a subclass or instance of a metadata record
which illustrates how plain 'is a' is ambiguous. Much better to be explicit and say either
@smrgeoinfo indeed, 'denote' means 'name' rather than 'describe'.
This may be too far afield from DCAT, but in libraries we say that metadata _describes_ a resource, and also that the metadata record is a _surrogate_ for the resource itself. To me, "represent" doesn't quite carry the idea that it contains key information about the resource. You could use a triangle to represent a mountain on a map, but it doesn't say much about the mountain.
@kcoyle, "representation" is the typical term used in RDF/OWL to denote the semantic level of the description of a resource (see also "knowledge representation"), as opposed to "encoding" / "serialisation", which denotes the syntactic level.
Using other terms for the same purpose in DCAT would just add confusion.
@riccardoAlbertoni said:
Ok, let's keep the first sentence as it is, and let's revise @smrgeoinfo's proposal differently.
(changes to @smrgeoinfo's proposal are marked below in bold)
dcat:Resource represents a dataset, a data service or any other item described by a metadata record in a catalog. This class is not intended to be used directly, but is the parent class of dcat:Dataset, dcat:DataService and dcat:Catalog. Member items in a catalog should be members of one of the sub-classes, or of a sub-class of these, or of a sub-class of dcat:Resource defined in a DCAT profile or other DCAT application. dcat:Resource is effectively an extension point for defining a catalog of any kind of resource.
Would this make both @andrea-perego and @smrgeoinfo happy?
If you prefer we can even use "resource" in place of "item", but it would be a little tautological.
I would indeed replace "item" with "resource", and I would rephrase the above definition as follows:
dcat:Resource represents a dataset, a data service or any other resource that may be described by a metadata record in a catalog.
@dr-shorthair said:
@andrea-perego wrote
dcat:Resource denotes only resources documented in a catalogue, as this would apply to all its subclasses - with the result that dcat:Dataset and dcat:DataService cannot be used for datasets and services which are not documented in a catalogue.
Perhaps we should say 'catalogable' in the sense of 'potentially included in a catalog', which I think is the sense of what we are talking about, i.e. descriptions of these things in a form that is suitable for inclusion in a catalog but which may not yet appear in any specific catalog instance.
I tend to agree, but the meaning of "catalogable" may not be crystal clear to a non-native speaker, and it can easily be mistaken for "cataloged". I think it is safer to use a more "expanded" formulation of this statement.
With the previous commit, I have updated the definition of dcat:Resource as agreed with @andrea-perego, i.e.,
dcat:Resource represents a dataset, a data service or any other resource that may be described by a metadata record in a catalog. This class is not intended to be used directly, but is the parent class of dcat:Dataset, dcat:DataService and dcat:Catalog. Member items in a catalog should be members of one of the sub-classes, or of a sub-class of these, or of a sub-class of dcat:Resource defined in a DCAT profile or other DCAT application. dcat:Resource is effectively an extension point for defining a catalog of any kind of resource.
If we add at the end of the sentence above
dcat:Dataset and dcat:DataService can be used for datasets and services which are not documented in any catalog.
would it address the second part of the discussion?
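As a rough illustration of the definition above and of the proposed extra sentence, here is a small Python sketch. It assumes a simplified single-parent view of the subclass links, and the `ex:` names, the dictionaries, and the helper functions are all hypothetical. It shows dcat:Resource acting as the common parent (and extension point) and a dataset that is a valid dcat:Dataset without being listed in any catalog.

```python
# rdfs:subClassOf links as child -> parent (single inheritance is enough here).
SUBCLASS_OF = {
    "dcat:Dataset": "dcat:Resource",
    "dcat:DataService": "dcat:Resource",
    "dcat:Catalog": "dcat:Dataset",
    "ex:SoftwareTool": "dcat:Resource",  # hypothetical class from a DCAT profile
}


def is_resource_class(cls):
    """Walk up the subclass chain and report whether it reaches dcat:Resource."""
    while cls is not None:
        if cls == "dcat:Resource":
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


# Instance data: ex:orphan is typed as a dataset but no catalog lists it.
TYPES = {"ex:rainfall": "dcat:Dataset", "ex:orphan": "dcat:Dataset"}
CATALOG_MEMBERS = {"ex:catalog": {"ex:rainfall"}}


def catalogs_listing(resource):
    """Return the catalogs that list the resource (possibly none)."""
    return {c for c, members in CATALOG_MEMBERS.items() if resource in members}
```

Here `is_resource_class("dcat:CatalogRecord")` is false because no subclass link connects it to dcat:Resource, and `catalogs_listing("ex:orphan")` is empty even though `ex:orphan` is still a dcat:Dataset.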
Given that we have agreed to add the sentence
dcat:Dataset and dcat:DataService can be used for datasets and services which are not documented in any catalog.
do we want to remove "in a catalog" from the definitions of dataset and data service that follow? Otherwise, these definitions might be perceived as contradicting the added sentence.
dcat:Dataset represents a dataset in a catalog. A dataset is a collection of data, published or curated by a single agent. Data comes in many forms including numbers, words, pixels, imagery, sound and other multi-media, and potentially other types, any of which might be collected into a dataset.
dcat:DataService represents a data service in a catalog. A data service is a collection of operations accessible through an interface (API) that provide access to one or more datasets or data processing functions.
What do you think?