In DCAT, a Distribution "represents a specific available form of a dataset", and a few of the properties given for a Distribution presuppose certain forms of availability. For example, downloadURL indicates that the Dataset may be downloaded, and byteSize indicates bundled, electronic delivery (as opposed to streaming, Web Services, a printed copy etc.).
Can DCAT2 generalise Distribution to cater for more, and as yet unknown, forms of distribution by indicating, in essence, what a Distribution should contain vis-à-vis the Dataset it distributes, and how to go about specialising that essence for particular cases?
A corollary is to also remove specialised properties, such as dcat:byteSize, from Distribution.
This is motivated by work within the Research Data Alliance (the Storage Service Definitions WG: https://rd-alliance.org/groups/storage-service-definitions-wg) to characterise data storage and compute scenarios. Decision making in those scenarios will require specialised information which could, with guidance, be presented as specialised forms of a Distribution.
I agree that the definition of Distribution could be made more future-proof. It now supposes too much that a Distribution is a file.
However, I vehemently oppose removing properties in general. In the case of byteSize, it is used in practice (e.g. by around 15,000 Distributions in the European Data Portal).
> I agree that the definition of Distribution could be made more future-proof. It now supposes too much that a Distribution is a file.
+1. A clarification of what dcat:Distribution is would indeed be beneficial. And, for this purpose, it may be worth checking the relevant issues from the GLD WG:
https://www.w3.org/2011/gld/track/issues/
The word "distribution" appears in the title of quite a few of them.
> However, I vehemently oppose removing properties in general. In the case of byteSize, it is used in practice (e.g. by around 15,000 Distributions in the European Data Portal).
I totally concur.
We already have a requirement for that clarification at #52. I think it needs to subsume this issue, as well as others such as #53 and #54 (which also touches on 'a distribution isn't just a file').
I suspect #56 needs to be included as well - what are the 'services' that we expect?
@makxdekkers: I don't propose to remove the possibility for people to use byteSize, only that properties particular to one type of distribution, like this one (for a file), be made non-core members of a more general Distribution. A File Distribution could still use byteSize, format etc.
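To make the core/non-core split concrete, here is a minimal sketch in Python rather than RDF. Everything in it is an assumption for illustration only: the class names `Distribution`, `FileDistribution` and `ServiceDistribution`, the field names, and the helper `specialised_properties` are hypothetical and are not part of DCAT or of any proposal in this thread.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Distribution:
    """General (core) distribution: only what every form of availability shares."""
    dataset_id: str  # hypothetical link back to the Dataset being distributed
    title: str


@dataclass
class FileDistribution(Distribution):
    """Specialisation for a downloadable file; byte size only makes sense here."""
    download_url: str
    byte_size: Optional[int] = None
    media_type: Optional[str] = None


@dataclass
class ServiceDistribution(Distribution):
    """Specialisation for access through a service; it has no byte size at all."""
    endpoint_url: str


def specialised_properties(dist: Distribution) -> dict:
    """Return only the non-core properties added by a specialised type."""
    core = {f.name for f in fields(Distribution)}
    return {f.name: getattr(dist, f.name) for f in fields(dist) if f.name not in core}


csv_dump = FileDistribution(
    dataset_id="ds-1",
    title="CSV dump",
    download_url="https://example.org/ds-1.csv",
    byte_size=1024,
)
api = ServiceDistribution(
    dataset_id="ds-1",
    title="Query API",
    endpoint_url="https://example.org/api",
)
```

In this sketch `specialised_properties(csv_dump)` includes `byte_size`, while `specialised_properties(api)` does not, yet both objects are still instances of the general `Distribution`.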
@nicholascar I was just reacting to your formulation "_remove specialised properties, such as dcat:bytesize from distribution_" (my emphasis). It is not clear to me whether or how we will make a distinction between 'core' and 'non-core' members/properties. This would need further discussion.
This issue and #56 should be folded into #52.
Has this issue been overtaken by the new structure, with dcat:DataService alongside dcat:Distribution?
I think we can mark it 'resolved'.
This issue does seem to have been addressed already by the resolution of #52 in the editor's draft. Closing this issue and tidying up the DCAT recommendation to reflect this.