Just a feature request, assuming it is felt that SMW is suitable for handling this kind of use case...
I'd like to be able to add a table of data to a page and query that table from other pages--without needing to break up the data and add it to a multiplicity of pages.
Taking the example of say this were a table from the United Nations (and which I didn't want or need to break up into its own pages), listing countries and let's say area and population...
{|
! Country
! Area (km2)
! Population (millions)
|-
| United States || 9,833,520 || 327.2
|-
| Germany || 357,386 || 82.79
|}
It seems I might try this with subobjects. However, even the +sep= shorthand doesn't seem to imply any structure beyond indicating additional values.
But perhaps I could define one subobject for each row of the table, i.e., with one subobject for each country, and give each of such subobjects the property "Has area" and the property "Has population".
I don't know if I could use the country as the identifier and then query it (to be able to search for data of a specific country as well as search across subobjects), or if I'd need to represent the country as a property as well, like "Has country". Assuming I could make this work, it is still rather cumbersome, even with the shorthand to add multiple properties at once, because I still have to repeat the property names for each subobject.
If this could work, and SMW would be an ideal venue for it, might some shorthand be added (ideally, I would say, one which just wrapped, say, HTML tables or MediaWiki table syntax) to allow for easy input of such structures?
Perhaps the first column could be enclosed in a tag or otherwise marked up, with the first column by default indicating the owner of the properties in that row.
Thanks, and apologies if I am missing some obvious ways to implement.
I'd say use subobjects and this tip. You can use [this wiki](https://sandbox.semantic-mediawiki.org/wiki/Main_Page) to get your feet wet.
I am not clear on how that would help with entering tabular data (within a single page). That seems to be about formatting the results.
If the MediaWiki table can't just be wrapped, then to demonstrate what I'm going for, leveraging the subobject syntax a little, it might allow for something more like this:
<!-- POSSIBLE SYNTAX -->
{{#subobject:|
! fields = Country | Area | Population |+sep=|
| United States | 9,833,520 | 327.2 |+sep=|
| Germany | 357,386 | 82.79 |+sep=|
}}
Otherwise, my point is that it is too cumbersome to repeat the field names (and subobject syntax) for each row as in the following when there is a large amount of data:
{{#subobject:|
| Country=United States
| Area=9,833,520
| Population=327.2
}}
{{#subobject:|
| Country=Germany
| Area=357,386
| Population=82.79
}}
I'm also not sure that the approach just above ties the data together either.
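As an illustration of what "wrapping" a table could mean in practice, here is a minimal Python sketch that reads simple MediaWiki table markup into one dictionary per row, keyed by the header cells. It runs outside the wiki and is not an SMW feature; the function name `parse_wikitable` is hypothetical, and the parser only handles the plain `!` / `|` / `||` forms used in the example above (no captions, attributes, or multi-line cells).

```python
def parse_wikitable(text):
    """Parse a simple MediaWiki table into a list of {header: cell} dicts.

    Assumption: one header per "!" line (or "!!"-separated), one data row
    per "|" line with cells separated by "||".
    """
    headers, rows = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Table open/close, row separators and captions carry no row data.
        if not line or line.startswith(("{|", "|}", "|-", "|+")):
            continue
        if line.startswith("!"):
            headers.extend(h.strip() for h in line[1:].split("!!"))
        elif line.startswith("|"):
            cells = [c.strip() for c in line[1:].split("||")]
            rows.append(dict(zip(headers, cells)))
    return rows


table = """{|
! Country
! Area (km2)
! Population (millions)
|-
| United States || 9,833,520 || 327.2
|-
| Germany || 357,386 || 82.79
|}"""

rows = parse_wikitable(table)
for row in rows:
    print(row)
```

Each resulting dictionary corresponds to what would become one subobject, with the header cells playing the role of property names.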
Obviously I did not understand at all what you were trying to get at. No, the suggested syntax is currently not possible, and personally I am not sure I would like to have something like this. If I have many data sets, I construct the subobjects in a spreadsheet program and paste them over. You could also have a look at the ExternalData extension.
Yeah, the tip is about displaying data.
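The spreadsheet workflow described above can also be scripted. The following is a minimal Python sketch, assuming the rows are copied from a spreadsheet as tab-separated text with a header line; the function name `rows_to_subobjects` is hypothetical, and the header names are used directly as property names. It does not escape values that contain `|` or `}}`, so such data would need extra handling.

```python
import csv
import io


def rows_to_subobjects(tsv_text, delimiter="\t"):
    """Turn tab-separated rows (with a header line) into #subobject calls."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter=delimiter)
    blocks = []
    for row in reader:
        lines = ["{{#subobject:"]  # anonymous subobject, one per row
        lines += [f" |{name}={value}" for name, value in row.items()]
        lines.append("}}")
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


sample = (
    "Country\tArea\tPopulation\n"
    "United States\t9,833,520\t327.2\n"
    "Germany\t357,386\t82.79\n"
)

wikitext = rows_to_subobjects(sample)
print(wikitext)
```

The printed wikitext can then be pasted into the page, so the property names are written once in the header row rather than repeated by hand.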
> I'd like to be able to add a table of data to a page and query that table
> from other pages--without needing to break up the data and add it to a
> multiplicity of pages.
I have seen such a request in the past, and my advice is that this sort
of data sourcing should go into a different extension called something
like "SemanticStructuredDocs" (or similar), where additional content
types like CSV, JSON, or TEI are provided that allow triples to be
created from the structured source without additional user
intervention.
For example, for tabular data you would most certainly have a
CSV/TSV table with each column representing a property, and the content
type (e.g. smw/csv) would have instructions on how to create subobjects that
correspond to the columns and rows from the definition of the data.
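To make the column-to-property idea concrete, here is a minimal Python sketch of how such a mapping might produce triples. It is not code from SMW or from the snippets mentioned in this thread: the function `csv_to_triples`, the `property_map` argument, and the `Page#rowN` subject naming are all hypothetical, chosen only to show one subobject-like subject per row.

```python
import csv
import io


def csv_to_triples(csv_text, page, property_map):
    """Map each CSV row to (subject, property, value) triples.

    Columns missing from property_map are skipped.
    """
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader, start=1):
        subject = f"{page}#row{index}"  # hypothetical per-row subject name
        for column, value in row.items():
            prop = property_map.get(column)
            if prop is not None:
                triples.append((subject, prop, value))
    return triples


csv_text = "Country,Area,Population\nUnited States,9833520,327.2\nGermany,357386,82.79\n"
property_map = {
    "Country": "Has country",
    "Area": "Has area",
    "Population": "Has population",
}

triples = csv_to_triples(csv_text, "Countries", property_map)
for triple in triples:
    print(triple)
```

Keeping the mapping separate from the data is what would let the CSV stay unaltered while the property names live in metadata.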
Of course, it should be noted that those CSV/TSV or JSON contained
data have a limit since the MW parser needs to store the content in a
revision table and SMW has to create/update relevant data columns.
This means that posting a 10,000-row CSV with 10 columns will most
certainly create issues, either for MW and the HTML generation or for SMW
in handling all the subobject creation at once.
As for JSON, you would face the challenge of hierarchical data
definitions (see #3702 for some pointers).
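The difficulty with hierarchical data can be illustrated with a small Python sketch that flattens nested JSON into path/value pairs. This is only one possible way to look at the problem and is not taken from #3702; the `flatten` helper and the dotted path notation are assumptions made for illustration.

```python
import json


def flatten(value, prefix=""):
    """Yield (path, leaf_value) pairs for every leaf in nested JSON data."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, value


doc = json.loads(
    '{"country": "Germany",'
    ' "stats": {"area": 357386, "population": 82.79},'
    ' "languages": ["German"]}'
)

pairs = list(flatten(doc))
for path, leaf in pairs:
    print(path, "=", leaf)
```

Unlike a CSV column, a path such as `stats.area` has no obvious single property to map to, which is where a mapping definition would have to decide between nested subobjects and flattened property names.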
The ideal solution for those new content types would be to use MCR
where one slot contains metadata (tags, description, property mapping
to columns etc.) about the document while the main slot hosts the
unaltered ("original") content.
Since no usable information on MCR has been published (meaning on how
to use different edit fields and slots to construct a multi-content
revision), SMW core is unable to determine how to use MCR (#3345) in
general, which rules out any possibility of storing metadata for documents.
Setting aside the MCR issue, storing and constructing SMW data from
raw CSV, JSON, or TEI content can be done nevertheless.
I have some code snippets for the CSV and TEI content type and would
be willing to donate them to someone who creates the aforementioned
new extension to polish the code (including writing necessary unit and
integration tests).
Cheers
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/3911
Thanks, @mwjames. It looks like https://www.mediawiki.org/wiki/Extension:External_Data#Getting_data_from_a_non-API_text_file (this section in particular) might possibly be helpful. Though the extension name refers to external data, it does also mention that one might use it on one's own MediaWiki pages. Perhaps one could then run the data through a template or module (as per other functions in the extension), though I haven't looked closely or tried any of this. I don't know if https://www.mediawiki.org/wiki/Extension:Cargo has anything that might help in this regard either.
> https://www.mediawiki.org/wiki/Extension:External_Data#Getting_data_from_a_non-API_text_file
> (this section in particular) might possible be helpful... Though the
> extension name includes reference to external data, it does also mention
Yes, this extension has some use cases [0] when relying on an
"external data set" but, as outlined, once the data on the external
endpoint becomes outdated there is no trigger to update the referenced
data automatically. Also, when the data on the endpoint disappears, so
does the data within the wiki, and queries start to break because
the data is not permanently stored as a revision that can be re-parsed.
So, for serious applications that require some sort of revision
management for their data (as in who changed what and when) and want
to provide data consistency and availability, relying on
"Extension:External_Data" via API calls is just not an option.
[0] https://www.mediawiki.org/wiki/Extension:External_Data#Storing_data
There is no task for SMW core here; however, I added this to "Enhancements and features" in the "Not SMW core" column for future pickup.