spaCy: Pronoun resolution?

Created on 10 Jun 2016 · 6 Comments · Source: explosion/spaCy

Hey, sorry if this has been asked before, but I couldn't find any docs on whether any type of pronoun resolution is built into spaCy. If it isn't, is there a plan for when it could be added?

enhancement help wanted

All 6 comments

There was this issue: https://github.com/spacy-io/spaCy/issues/22 (which is closed for some reason).

Still, with spaCy 0.100.6 (and Python 2.7), running

# nlp is an already-loaded spaCy English pipeline
for tok in nlp(u"You and I make us"):
    print tok.orth_, tok.pos_

tags every pronoun as a NOUN:

You NOUN
and CONJ
I NOUN
make VERB
us NOUN

Hey, I've been working with spaCy for a while now and, predictably, ran into my first anaphora resolution problem :) I fully realize this is currently not solvable in general, even if we only consider pronouns. Compare

The woman tipped the waitress $20. She was very generous.

versus

The woman tipped the waitress $20. She was very efficient.

That said, I'm still interested in at least generating all possible nouns that a pronoun could reference in this example just by congruency. We already have the data for pronouns in lang_data/en/morphs.json, allowing us to match by singular or plural.
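A minimal sketch of that congruency filter, assuming the tokens already carry a part-of-speech tag and a grammatical number. The `Tok` tuple, the `"sing"`/`"plur"` labels and the `candidate_antecedents` helper are all hypothetical names made up for illustration; none of this is spaCy API, and the tags are written by hand rather than produced by a tagger.

```python
from typing import List, NamedTuple, Optional


class Tok(NamedTuple):
    """Hypothetical token record; a real pipeline would supply these fields."""
    text: str
    pos: str               # coarse part-of-speech tag, e.g. "NOUN" or "PRON"
    number: Optional[str]  # "sing", "plur", or None when not applicable


def candidate_antecedents(tokens: List[Tok], pronoun_index: int) -> List[str]:
    """Return every noun before the pronoun that agrees with it in number."""
    pronoun = tokens[pronoun_index]
    return [
        tok.text
        for tok in tokens[:pronoun_index]
        if tok.pos == "NOUN" and tok.number == pronoun.number
    ]


# Hand-annotated version of the example above (assumed tags, not tagger output).
tokens = [
    Tok("The", "DET", None),
    Tok("woman", "NOUN", "sing"),
    Tok("tipped", "VERB", None),
    Tok("the", "DET", None),
    Tok("waitress", "NOUN", "sing"),
    Tok("$20", "NUM", None),
    Tok(".", "PUNCT", None),
    Tok("She", "PRON", "sing"),
]

print(candidate_antecedents(tokens, 7))  # ['woman', 'waitress']
```

Both nouns survive the filter, which is exactly the ambiguity in the two sentences: number agreement narrows the candidates but cannot pick between them.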

_Question:_ Is there a way to determine the grammatical gender of words like sister, husband, waitress, etc.?

maebert: I suggest you hunt down a lexicon that lists that out, and then add a flag to the Lexemes, like IS_MASC, IS_FEM, etc. An example of setting a user flag is here: https://github.com/spacy-io/spaCy/blob/master/examples/matcher_example.py

If you find a good lexicon for this, under a free license, please let me know. It's very easy to bake this into the data, and since it works well as binary flags, it won't take any additional memory to store.
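To illustrate why binary flags are a good fit here, below is a standalone sketch of a gender lexicon stored as bit flags and used to filter antecedent candidates. It deliberately does not use spaCy's own lexeme flag mechanism. The `Gender` enum, the tiny `GENDER_LEXICON` and `PRONOUN_GENDER` tables, and the `agrees` helper are hypothetical, and the word lists are placeholders for whatever freely licensed lexicon turns up.

```python
from enum import IntFlag


class Gender(IntFlag):
    """Hypothetical bit flags, one bit per property."""
    NONE = 0
    IS_MASC = 1
    IS_FEM = 2


# Placeholder entries; a real lexicon would be loaded from an external resource.
GENDER_LEXICON = {
    "sister": Gender.IS_FEM,
    "waitress": Gender.IS_FEM,
    "woman": Gender.IS_FEM,
    "husband": Gender.IS_MASC,
    "waiter": Gender.IS_MASC,
}

PRONOUN_GENDER = {
    "she": Gender.IS_FEM,
    "her": Gender.IS_FEM,
    "he": Gender.IS_MASC,
    "him": Gender.IS_MASC,
}


def gender_flags(word: str) -> Gender:
    """Look up a word's flags; words missing from the lexicon get no flags."""
    return GENDER_LEXICON.get(word.lower(), Gender.NONE)


def agrees(pronoun: str, noun: str) -> bool:
    """True unless the lexicon positively rules the pair out."""
    required = PRONOUN_GENDER.get(pronoun.lower(), Gender.NONE)
    flags = gender_flags(noun)
    if required == Gender.NONE or flags == Gender.NONE:
        return True  # no information on one side, so keep the candidate
    return bool(flags & required)


nouns = ["woman", "waitress", "husband", "customer"]
print([n for n in nouns if agrees("She", n)])  # ['woman', 'waitress', 'customer']
```

Keeping unknown words such as "customer" as candidates is a design choice in this sketch: the lexicon is only used to exclude pairs it has evidence against.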

https://github.com/huggingface/neuralcoref
One implementation of coreference resolution built on spaCy. Hope it is useful.

Quick update: This might be a nice use case for the new custom processing pipeline components and extension attributes introduced in v2.0!

Edit: Merging this with #820!

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
