The example from #3248 now throws a different error in 2.1.0a8-nightly:
sre_constants.error: bad escape \p at position 284
Could you share the full error with traceback and a reproducible example? Also, if you're using models, can you run python -m spacy validate and check that they're up to date?
It sounds like this might be more related to the change from regex to re, rather than the phrase matcher.
If I remember correctly, the regex package handles escaped characters slightly differently than re. I remember changing the escaping in some existing expressions for this very reason when switching over to re, and I think the error message was similar to the one cited here. I could help look into this if you can share a reproducible example!
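For anyone landing here, a minimal sketch of the difference described above. It assumes the failing expression uses a Unicode property class such as \p{L}; the actual pattern from the traceback isn't shown in this thread, so the pattern below is only illustrative. The helper name try_compile is made up for this example.

```python
import re


def try_compile(pattern, engine=re):
    """Return (True, None) if `engine` compiles `pattern`, else (False, message)."""
    try:
        engine.compile(pattern)
        return True, None
    except engine.error as err:  # both re and regex expose an `error` class
        return False, str(err)


# Illustrative pattern only (assumption): a Unicode property class,
# which regex understands but the standard-library re does not.
PATTERN = r"\p{L}+"

ok_re, msg_re = try_compile(PATTERN)
print("re:", ok_re, msg_re)

try:
    import regex  # third-party package, may not be installed
except ImportError:
    regex = None

if regex is not None:
    ok_regex, msg_regex = try_compile(PATTERN, engine=regex)
    print("regex:", ok_regex, msg_regex)
```

With the standard-library re, the compile step fails with a "bad escape \p" message, which matches the shape of the error reported at the top of this thread.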
This is also the error that shows up if a model built for the regex package gets loaded with the new nightly, though. The models have the tokenizer serialized, so if you load an old model, it has the old expressions.
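A rough way to check which spaCy version a model package was built for, without loading it (loading is what triggers the error). This is only a sketch: it assumes the model directory contains a meta.json with a "spacy_version" entry, as the packaged models do, and the helper name and the demo values are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def model_requirements(model_dir):
    """Read name, version and required spaCy version from a model's meta.json."""
    meta_path = Path(model_dir) / "meta.json"
    with meta_path.open(encoding="utf8") as f:
        meta = json.load(f)
    return meta.get("name"), meta.get("version"), meta.get("spacy_version")


# Demo with a throwaway directory and made-up values (not a real model).
with tempfile.TemporaryDirectory() as tmp:
    demo_meta = {"name": "demo_model", "version": "0.0.1", "spacy_version": ">=2.0.0"}
    (Path(tmp) / "meta.json").write_text(json.dumps(demo_meta), encoding="utf8")
    name, version, spacy_requirement = model_requirements(tmp)

print(name, version, spacy_requirement)
```

For a real model, point model_requirements at the model's data directory and compare the reported requirement with the installed spaCy version; a requirement older than 2.1 would suggest the serialized tokenizer still carries the old expressions.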
Pretty sure this should be resolved by downloading the new models – I don't think it's related to the PhraseMatcher. If there's still a problem, feel free to reopen.
Is there any plan to look into this further? Even if it is solved by downloading updated models, downstream plugins that ship their own models (NeuralCoref being a specific example) become incompatible with this switch. Is there any solution other than reporting the issue to each plugin's developer? Which version of spaCy could I revert to in order to avoid this issue?
I had this problem and found that updating to the 2.1 models worked.
Same issue with Spacy=2.1.3 and en_core_web_lg 2.1.0. python -m spacy validate returns this:
✔ Loaded compatibility table
====================== Installed models (spaCy v2.1.3) ======================
ℹ spaCy installation:
C:\Users\dyz.virtualenvs\xyz-S_K9trBh\lib\site-packages\spacy
TYPE      NAME             MODEL            VERSION
package   en-core-web-lg   en_core_web_lg   2.1.0   ✔
I'm also installing regex in addition to spaCy, which is probably what's causing the issue.
@omri374: are you at any point importing regex as re?
@svlandeg yes I am, but I tried playing with it (renaming to re2) with no luck. Do you think this is the root cause?
That would be my guess - you're going to get this kind of error when the two packages get mixed up somehow, either by building/loading a model with a different package or having them mixed at runtime...
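A quick check along those lines, as a sketch: it reports which module the name re is actually bound to in the current process and whether the third-party regex package has been imported anywhere. The helper name describe_module is made up for this example.

```python
import re
import sys


def describe_module(module):
    """Return the module's real name and file path (None if it has no file)."""
    return module.__name__, getattr(module, "__file__", None)


bound_name, bound_path = describe_module(re)
print("`re` is bound to:", bound_name, bound_path)

# After `import regex as re`, the name printed above would be "regex".
if bound_name != "re":
    print("Warning: `re` does not refer to the standard-library module.")

# True if any code in this process has imported the third-party package.
regex_loaded = "regex" in sys.modules
print("regex package imported somewhere:", regex_loaded)
```

Running this in the same module that does the aliasing should show whether the two packages are being mixed at runtime.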
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.