OpenRefine: n3 importer doesn't work

Created on 7 Apr 2018 · 21 Comments · Source: OpenRefine/OpenRefine

When reporting a bug please provide the following information to help reproduce the bug:

Version of OpenRefine used (Google Refine 2.6, OpenRefine 2.8, another distribution?):

2.8

Operating Systems and version:

Windows 10

Browser + version used - Please note that OpenRefine doesn't support Internet Explorer but works OK in most cases:

Chrome 65 and Firefox 59

Steps followed to create the issue:

Try to import any .n3 or .ttl RDF file.

If you are allowed and are OK with making your data public, it would be awesome if you could include the data causing the issue or a URL pointing to where the data is (if you're concerned about keeping your data private, ping us on our mailing list):

Any n3 file, even very simple.

Current Results:

[Screenshot: screenshot-localhost-3333-2018 04 07-16-24-11]

Expected Results:

A series of triples.

Personal comment

By the way, there is no importer for the N-Triples format. Is there not a Java library nowadays (RDF4J?) that can detect and parse all RDF serializations (Turtle, RDF/XML, N-Triples, JSON-LD, RDF/JSON, TriG, N-Quads, TriX ...) without requiring a specific importer for each one?

RDF bug

All 21 comments

@ettorerizza Yes, there is now a Java package for reading and writing RDF serializations, called RIOT, from the Jena framework:

oaj.riot Package Summary

@thadguidry I was reading the code of the RDF importer and, if I understand correctly, it should be able to parse N-triple, N3 and RDF/XML. Weird.

@ettorerizza give us a dump of the console please.

Nothing special.

16:40:47.238 [                   refine] POST /command/core/get-models (109ms)
16:40:47.247 [                   refine] POST /command/core/get-rows (9ms)
16:40:54.674 [                   refine] POST /command/core/importing-controller (7427ms)
16:40:54.759 [                   refine] POST /command/core/get-models (85ms)
16:40:54.765 [                   refine] POST /command/core/get-rows (6ms)
16:40:56.930 [                   refine] POST /command/core/importing-controller (2165ms)
16:40:57.031 [                   refine] POST /command/core/get-models (101ms)
16:40:57.035 [                   refine] POST /command/core/get-rows (4ms)
16:41:02.097 [                   refine] POST /command/core/importing-controller (5062ms)
16:41:02.195 [                   refine] POST /command/core/get-models (98ms)
16:41:02.200 [                   refine] POST /command/core/get-rows (5ms)
16:44:53.100 [                   refine] POST /command/core/cancel-importing-job (230900ms) <--- I cancelled myself

It does not work with LODrefine (based on OR 2.6) either, so I wonder whether the n3 importer ever worked in the past.

[Screenshot: screenshot-127 0 0 1-3333-2018 04 07-16-54-22]

@ettorerizza I mean both consoles, including the JavaScript one (in Chrome: Ctrl + Shift + J).

This ?

index-bundle.js:10335 JQMIGRATE: Logging is active
index-bundle.js:9594 [Deprecation] Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user's experience. For more help, check https://xhr.spec.whatwg.org/.
send @ index-bundle.js:9594
chrome-extension://jnhgnonknehpejjnehehllkliplmbmhn/content_script.js:24 initializing Content Script message listener

@ettorerizza Yeah, but keep it open when you start OpenRefine and then perform your import test. Does anything else show up in the console, any errors? Also, disable all your browser extensions when you're testing stuff like this :)

I tried with Firefox (disabling all extensions) and the error messages look more explicit.

The page was reloaded, because the character encoding declaration of the HTML document was not found when prescanning the first 1024 bytes of the file. The encoding declaration needs to be moved to be within the first 1024 bytes of the file.  localhost:3333:36
JQMIGRATE: Logging is active  index-bundle.js:10335
Webconsole context has changed
Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user’s experience. For more help http://xhr.spec.whatwg.org/  index-bundle.js:9594:5
getRecipes: falling back to a synchronous message for: "http://localhost:3333"  LoginRecipes.jsm:244
this._recipeManager is null  LoginManagerParent.jsm:86
The character encoding of a framed document was not declared. The document may appear different if viewed without the document framing it.  importing-controller
XML Parsing Error: not well-formed
Location: http://localhost:3333/command/core/get-importing-job-status?jobID=2
Line Number 1, Column 1:  get-importing-job-status:1:1
XML Parsing Error: not well-formed
Location: http://localhost:3333/command/core/importing-controller?controller=core%2Fdefault-importing-controller&jobID=2&subCommand=initialize-parser-ui&format=text%2Frdf%2Bn3
Line Number 1, Column 1:  importing-controller:1:1
XML Parsing Error: not well-formed
Location: http://localhost:3333/command/core/importing-controller?controller=core%2Fdefault-importing-controller&jobID=2&subCommand=update-format-and-options
Line Number 1, Column 1:

@jackyq2015 please fix :) I also reproduced this same issue just now on the wikidata-extension branch I'm testing other things with. And same error as @ettorerizza

JQMIGRATE: Logging is active
index-bundle.js:10335:2
window.controllers/Controllers is deprecated. Do not use it for UA detection.
index-bundle.js:44477:1

Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user’s experience. For more help http://xhr.spec.whatwg.org/
index-bundle.js:9594:5

The character encoding of a framed document was not declared. The document may appear different if viewed without the document framing it.

importing-controller
XML Parsing Error: not well-formed
Location: http://127.0.0.1:3333/command/core/get-importing-job-status?jobID=1
Line Number 1, Column 1:
get-importing-job-status:1:1

XML Parsing Error: not well-formed
Location: http://127.0.0.1:3333/command/core/importing-controller?controller=core%2Fdefault-importing-controller&jobID=1&subCommand=initialize-parser-ui&format=text%2Frdf%2Bn3
Line Number 1, Column 1:
importing-controller:1:1

From the code, it seems the n3 format is still parsed as NT format. I tried to force it to parse as N3, but that does not work for some reason. I will switch to Jena and try again.

I tried with Google Refine 2.5 and with OR 2.6 RC1 on the Data Science Workbench cloud: same result. Are we sure this importer has ever worked?

I don't think it has ever worked. But no worries: since the JRDF library is deprecated, I think we can switch to Jena, and that should work.

@ettorerizza I created the PR #1563. I tried several n3 files and it works fine.
Could you please help to verify it?

@jackyq2015 Could you post an example of an n3 file that works? When I try, I get this error.

[Screenshot: untitled 1]

It looks like it doesn't like prefixes. But an n3/ttl file without prefixes is just N-Triples, or am I wrong?

For N-Triples: it works*, but the extension is not recognized (the file is parsed as line-based). You have to select "RDF/n3" yourself (even though the file is not n3).

Files tested :

https://www.wikidata.org/wiki/Special:EntityData/Q42.n3

https://www.w3.org/2000/10/swap/test/meet/blue.n3

https://www.wikidata.org/wiki/Special:EntityData/Q42.nt

https://gist.github.com/kal/ee1260ceb462d8e0d5bb (Turtle)

http://www.agfa.com/w3c/rdf/rdfs-transitive-subSubProperty/test002.nt (ntriple)

Note: Parsing an N-Triples file produces one new column per predicate. This is fine when there are few triples, but it becomes unwieldy when you import a whole Wikidata or DBpedia page. A better approach would probably be to create only three columns: subject, predicate, object. You could then transpose only if you want to.
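
To illustrate the three-column layout suggested in the note above, here is a minimal Python sketch. It assumes simple, well-formed N-Triples lines and uses a hand-written regular expression rather than a real RDF parser; `to_rows`, `TRIPLE`, and the example.org data are made up for illustration and are not OpenRefine code.

```python
import re

# Matches one simple N-Triples statement: subject, predicate, object, final dot.
# Assumption: input is well-formed; this is not a full N-Triples grammar.
TRIPLE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+'
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)\s*\.\s*$'
)


def to_rows(ntriples_text):
    """Return one (subject, predicate, object) tuple per statement."""
    rows = []
    for line in ntriples_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a simple N-Triples line: {line!r}")
        rows.append(match.groups())
    return rows


# Hypothetical sample data, not taken from the files tested in this thread.
SAMPLE = '''
<http://example.org/a> <http://example.org/name> "Alice" .
<http://example.org/a> <http://example.org/knows> <http://example.org/b> .
<http://example.org/b> <http://example.org/name> "Bob"@en .
'''

rows = to_rows(SAMPLE)
# Distinct predicates: a column-per-predicate layout needs one column for each.
predicates = {p for _, p, _ in rows}
```

With this layout the table always has three columns, whereas a column-per-predicate layout needs one column for each distinct predicate (`len(predicates)` here), which grows quickly on a full Wikidata or DBpedia page.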

@ettorerizza My bad. The issue is with the format guesser for the .n3 extension, not with the parser.

Just fixed. Tested file: blue.n3
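
As a rough illustration of what extension-based format guessing involves, here is a small Python sketch. It is not the OpenRefine implementation: the table and the `guess_format` function are hypothetical, only the format identifier `text/rdf+n3` appears in the request URLs logged earlier in this thread, and the fallback name `text/line-based` is an assumption standing in for the line-based importer.

```python
import os

# Hypothetical extension table. Only "text/rdf+n3" is taken from the logs
# in this thread; routing .ttl and .nt to it is an assumption.
FORMAT_BY_EXTENSION = {
    ".n3": "text/rdf+n3",
    ".ttl": "text/rdf+n3",
    ".nt": "text/rdf+n3",
}

# Assumed identifier for the line-based fallback importer.
FALLBACK_FORMAT = "text/line-based"


def guess_format(filename):
    """Pick an importer format from the file extension, case-insensitively."""
    extension = os.path.splitext(filename)[1].lower()
    return FORMAT_BY_EXTENSION.get(extension, FALLBACK_FORMAT)
```

When an extension is missing from such a table, the file falls through to the fallback, which matches the symptom described above where an N-Triples file was parsed as line-based until "RDF/n3" was selected by hand.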

@msaby reported further issues importing n3 at https://groups.google.com/d/msg/openrefine/t67D8_JrUs0/W-wL1jdXBQAJ

I had a quick look at Jena, and found some things that suggest it doesn't fully support n3, only the subset of n3 that makes up Turtle / ttl.

This page http://jena.apache.org/documentation/io/index.html says ".n3 is supported but only as a synonym for Turtle."

Looking at https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/riot/lang/RiotParsers.java#L58 seems to confirm this (although I'm not 100% sure).

Nice catch, @ostephens. In summary, the parser called "RDF/N3" cannot read n3, only Turtle, but it is able to parse N-Triples, which is not indicated anywhere. It should at least be renamed.

I have made a new issue. Thank you all for your answers.
Mathieu

@msaby What's the link for the new issue?
