The ClearNLP doc pointed to doesn't include quite a few of the dependency tags. Here is a Stanford doc that has all of them except DATIVE.
We use the ClearNLP converter, which differs slightly from the Stanford one in some cases. The ClearNLP converter is generally more accurate and practical for our situation (i.e., we just want to convert treebanks into dependency parses). It increases accuracy by making use of the additional annotations in the treebank. In contrast, the Stanford converter has to support the use case of converting parser output into dependencies. These parsers don't produce the additional annotations, so the Stanford converter uses less information than ClearNLP's.
If the ClearNLP docs really don't describe our dependencies, then okay, we have a problem, and I'll raise it with Jin-ho. But are you sure that's the case?
It may just be that the ClearNLP doc itself needs updating as it is rather old. Appendix B2 lists the Stanford dependencies, which also does not include all of the labels I've observed and differs from the doc I pointed to.
The following dependencies are described by the ClearNLP Doc and listed in Table 2:
ACOMP Adjectival complement
ADVCL Adverbial clause modifier
ADVMOD Adverbial modifier
AGENT Agent
NN Noun compound modifier
AMOD Adjectival modifier
APPOS Appositional modifier
ATTR Attribute
AUX Auxiliary
NUM Numeric modifier
AUXPASS Auxiliary (passive)
CC Coordinating conjunction
CCOMP Clausal complement
COMPLM Complementizer
CONJ Conjunct
CSUBJ Clausal subject
CSUBJPASS Clausal subject (passive)
DEP Unclassified dependent
DET Determiner
DOBJ Direct object
EXPL Expletive
HMOD Modifier in hyphenation
HYPH Hyphen
INFMOD Infinitival modifier
INTJ Interjection
IOBJ Indirect object
MARK Marker
META Meta modifier
NEG Negation modifier
NMOD Modifier of nominal
NPADVMOD Noun phrase as ADVMOD
NSUBJ Nominal subject
NSUBJPASS Nominal subject (passive)
NUMBER Number compound modifier
OPRD Object predicate
PARATAXIS Parataxis
PARTMOD Participial modifier
PCOMP Complement of a preposition
POBJ Object of a preposition
POSS Possession modifier
POSSESSIVE Possessive modifier
PRECONJ Pre-correlative conjunction
PREDET Predeterminer
PREP Prepositional modifier
PRT Particle
PUNCT Punctuation
QUANTMOD Quantifier phrase modifier
RCMOD Relative clause modifier
ROOT Root
XCOMP Open clausal complement
Here are the dependency labels generated by SpaCy that I've observed while parsing my corpus. * denotes labels not in the ClearNLP doc (these are only what I've observed; there may be more):
Not sure what happened to the formatting on my last post after I submitted it. In the observed labels section each label was on its own line, and the asterisks are now replaced with bullets. So the following are observed but not documented:
acl
case
compound
dative
nummod
relcl
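The comparison above can be reproduced with a small set difference. This is only a sketch: `DOCUMENTED` is typed in from the Table 2 listing quoted earlier in this thread, and `undocumented` and the sample `observed` list are hypothetical names chosen for illustration, not part of spaCy or ClearNLP.

```python
# Labels from Table 2 of the ClearNLP doc, as quoted earlier in this thread.
DOCUMENTED = set("""
acomp advcl advmod agent nn amod appos attr aux num auxpass cc ccomp complm
conj csubj csubjpass dep det dobj expl hmod hyph infmod intj iobj mark meta
neg nmod npadvmod nsubj nsubjpass number oprd parataxis partmod pcomp pobj
poss possessive preconj predet prep prt punct quantmod rcmod root xcomp
""".split())


def undocumented(observed, documented=DOCUMENTED):
    """Return the observed labels with no documented entry (case-insensitive)."""
    known = {label.lower() for label in documented}
    return sorted(label for label in set(observed) if label.lower() not in known)


# Hypothetical sample of labels seen in a parsed corpus.
observed = ["nsubj", "acl", "case", "compound", "dative",
            "nummod", "relcl", "dobj", "ROOT"]
print(undocumented(observed))
# → ['acl', 'case', 'compound', 'dative', 'nummod', 'relcl']
```

The comparison lowercases both sides because the doc prints labels in upper case while the parser output is mostly lower case.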
Hmm, okay. Thanks, I didn't realise those docs were out of date.
Hey @honnibal any chance we could get a full list of all possible dependency labels in SpaCy? Similar to spacy.parts_of_speech.NAMES?
From symbols.pyx:
"acomp": acomp,
"advcl": advcl,
"advmod": advmod,
"agent": agent,
"amod": amod,
"appos": appos,
"attr": attr,
"aux": aux,
"auxpass": auxpass,
"cc": cc,
"ccomp": ccomp,
"complm": complm,
"conj": conj,
"csubj": csubj,
"csubjpass": csubjpass,
"dep": dep,
"det": det,
"dobj": dobj,
"expl": expl,
"hmod": hmod,
"hyph": hyph,
"infmod": infmod,
"intj": intj,
"iobj": iobj,
"mark": mark,
"meta": meta,
"neg": neg,
"nmod": nmod,
"nn": nn,
"npadvmod": npadvmod,
"nsubj": nsubj,
"nsubjpass": nsubjpass,
"num": num,
"number": number,
"oprd": oprd,
"parataxis": parataxis,
"partmod": partmod,
"pcomp": pcomp,
"pobj": pobj,
"poss": poss,
"possessive": possessive,
"preconj": preconj,
"prep": prep,
"prt": prt,
"punct": punct,
"quantmod": quantmod,
"rcmod": rcmod,
"root": root,
"xcomp": xcomp
I tried that list, but it seems to be incomplete. Missing items include, for example, compound, nummod and ROOT.
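One thing worth checking is whether a label is truly absent from that list or only differs in case, since the list does contain a lowercase "root". The sketch below assumes the names are exactly those quoted from symbols.pyx above; `SYMBOL_NAMES` and `lookup` are hypothetical helpers for illustration, not spaCy APIs.

```python
# Dependency names as quoted from symbols.pyx earlier in this thread.
SYMBOL_NAMES = """
acomp advcl advmod agent amod appos attr aux auxpass cc ccomp complm conj
csubj csubjpass dep det dobj expl hmod hyph infmod intj iobj mark meta neg
nmod nn npadvmod nsubj nsubjpass num number oprd parataxis partmod pcomp
pobj poss possessive preconj prep prt punct quantmod rcmod root xcomp
""".split()


def lookup(label, names=SYMBOL_NAMES):
    """Classify a label as 'exact', 'case-only' or 'missing' relative to names."""
    if label in names:
        return "exact"
    if label.lower() in {name.lower() for name in names}:
        return "case-only"
    return "missing"


for label in ["compound", "nummod", "ROOT"]:
    print(label, lookup(label))
# compound missing
# nummod missing
# ROOT case-only
```

Under that assumption, compound and nummod are genuinely not in the quoted list, while ROOT differs from the listed name only in capitalisation.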
Hello @honnibal,
I am parsing a German text using your new model and facing the same issue: the dependency tags are not clearly documented. Could you please fix that so that we can get the most out of your API? :)
UPDATE:
I figured, the German model uses its own tags. Specifically, those of the TIGER Treebank as described here http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/annotation/tiger_introduction.pdf.
Nevertheless I am looking forward to the description of the English labels:)
Would it be too much work to adapt spaCy to output Universal Dependencies for the English and German parser?
@tanya-h: you can find more info here, but it's in German
Apologies for commenting on a closed issue, but I was scouring github (this issue and #676, #677) trying to figure out what the acl label is supposed to be, since it's not in the Stanford dependencies manual. After hopping around ClearNLP's (now _NLP4J's_) docs, I found the following page:
https://emorynlp.github.io/nlp4j/components/dependency-parsing.html
... which describes all of the mystery labels @sdenning helpfully posted above, except nummod. I post only in case this helps someone in the future.
Hi @honnibal
Could you please tell me how I can get a complete list of dependency relations in spaCy?
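Short of an official list, one empirical approach (the one used earlier in this thread) is to tally the labels the parser actually emits on your own corpus. The sketch below relies only on each token exposing a `dep_` string, as spaCy tokens do. `count_dep_labels` is a hypothetical helper, and the stand-in documents are built from `SimpleNamespace` objects so the example runs without a model installed. It can only reveal labels that occur in the data, so it is not a complete inventory.

```python
from collections import Counter
from types import SimpleNamespace


def count_dep_labels(docs):
    """Count dependency labels over an iterable of token sequences.

    Each token only needs a `dep_` attribute, so parsed spaCy documents
    can be passed in directly.
    """
    counts = Counter()
    for doc in docs:
        for token in doc:
            counts[token.dep_] += 1
    return counts


# Stand-in for parsed documents (assumption: with spaCy these would be
# the documents returned by the loaded pipeline).
fake_docs = [
    [SimpleNamespace(dep_="nsubj"), SimpleNamespace(dep_="ROOT"),
     SimpleNamespace(dep_="det"), SimpleNamespace(dep_="dobj")],
    [SimpleNamespace(dep_="nsubj"), SimpleNamespace(dep_="ROOT"),
     SimpleNamespace(dep_="nummod"), SimpleNamespace(dep_="dobj")],
]

for label, count in count_dep_labels(fake_docs).most_common():
    print(label, count)
```

Running this over a large, varied corpus gives a reasonable lower bound on the label set, which can then be checked against whatever documentation is available.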
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Most helpful comment
Here are the dependency labels generated by SpaCy I've observed while parsing my corpus, * denotes labels not in the ClearNLP doc (these are only what I've observed, there may be more):
acomp
advcl
advmod
agent
amod
appos
attr
aux
auxpass
cc
ccomp
csubj
csubjpass
dep
det
dobj
expl
intj
iobj
mark
meta
neg
nmod
npadvmod
nsubj
nsubjpass
oprd
parataxis
pcomp
pobj
poss
preconj
predet
prep
prt
punct
quantmod
xcomp