Spacy: Error on en_core_web_lg with thinc 6.11.0

Created on 24 Mar 2018 · 4 Comments · Source: explosion/spaCy

I just upgraded to spaCy 2.0.10 (which upgraded thinc to 6.11.0) and ran into an error when I tried to use en_core_web_lg. I'm pretty sure this is a problem with thinc, not spaCy, so let me know if I should open the issue there instead.

  • spacy==2.0.10 and thinc==6.11.0 produced the error below with en_core_web_lg
  • Loading en_core_web_sm instead worked fine
  • I downgraded to spacy==2.0.9 and thinc==6.10.2, and then that worked fine, too
  • Upgrading back to spacy==2.0.10 but keeping thinc==6.10.2 also worked fine
  • Upgrading thinc==6.11.0 caused the issue to reappear, so I'm pretty sure it's a thinc + en_core_web_lg problem.
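
To check which combination is actually installed before loading a model, something like the following could be used. It is only a minimal sketch: `KNOWN_BAD`, `installed_version`, `is_known_bad`, and `report` are hypothetical names made up for illustration, the only "bad" version listed is the one reported in this issue, and it assumes Python 3.8+ for `importlib.metadata` (the environment in this report was Python 3.6.4, where this module is not available).

```python
from importlib import metadata  # standard library in Python 3.8+

# Hypothetical lookup table: the only entry is the version reported as
# failing in this issue. It is not an official compatibility list.
KNOWN_BAD = {"thinc": {"6.11.0"}}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_known_bad(package, version, known_bad=KNOWN_BAD):
    """True if this exact version is listed as problematic for the package."""
    return version in known_bad.get(package, set())


def report(packages=("spacy", "thinc")):
    """Build one human-readable status line per package."""
    lines = []
    for name in packages:
        version = installed_version(name)
        if version is None:
            lines.append(f"{name}: not installed")
        elif is_known_bad(name, version):
            lines.append(f"{name}=={version}: reported as broken in this issue")
        else:
            lines.append(f"{name}=={version}: no known problem listed here")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

If thinc shows up as 6.11.0, the combination that worked in the steps above was spacy==2.0.10 with thinc==6.10.2.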

Here's the full traceback when it happens:

In [1]: import spacy
In [2]: nlp = spacy.load("en_core_web_lg")
In [3]: nlp("green")
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-a542f097f5fe> in <module>()
----> 1 nlp("green")

~/anaconda3/lib/python3.6/site-packages/spacy/language.py in __call__(self, text, disable)
    342             if name in disable:
    343                 continue
--> 344             doc = proc(doc)
    345         return doc
    346

pipeline.pyx in spacy.pipeline.Tagger.__call__()

pipeline.pyx in spacy.pipeline.Tagger.predict()

~/anaconda3/lib/python3.6/site-packages/thinc/neural/_classes/model.py in __call__(self, x)
    159             Must match expected shape
    160         '''
--> 161         return self.predict(x)
    162
    163     def pipe(self, stream, batch_size=128):

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in predict(self, X)
     53     def predict(self, X):
     54         for layer in self._layers:
---> 55             X = layer(X)
     56         return X
     57

~/anaconda3/lib/python3.6/site-packages/thinc/neural/_classes/model.py in __call__(self, x)
    159             Must match expected shape
    160         '''
--> 161         return self.predict(x)
    162
    163     def pipe(self, stream, batch_size=128):

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in predict(seqs_in)
    291     def predict(seqs_in):
    292         lengths = layer.ops.asarray([len(seq) for seq in seqs_in])
--> 293         X = layer(layer.ops.flatten(seqs_in, pad=pad))
    294         return layer.ops.unflatten(X, lengths, pad=pad)
    295

~/anaconda3/lib/python3.6/site-packages/thinc/neural/_classes/model.py in __call__(self, x)
    159             Must match expected shape
    160         '''
--> 161         return self.predict(x)
    162
    163     def pipe(self, stream, batch_size=128):

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in predict(self, X)
     53     def predict(self, X):
     54         for layer in self._layers:
---> 55             X = layer(X)
     56         return X
     57

~/anaconda3/lib/python3.6/site-packages/thinc/neural/_classes/model.py in __call__(self, x)
    159             Must match expected shape
    160         '''
--> 161         return self.predict(x)
    162
    163     def pipe(self, stream, batch_size=128):

~/anaconda3/lib/python3.6/site-packages/thinc/neural/_classes/model.py in predict(self, X)
    123
    124     def predict(self, X):
--> 125         y, _ = self.begin_update(X, drop=None)
    126         return y
    127

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in uniqued_fwd(X, drop)
    375                                                     return_counts=True)
    376         X_uniq = layer.ops.xp.ascontiguousarray(X[ind])
--> 377         Y_uniq, bp_Y_uniq = layer.begin_update(X_uniq, drop=drop)
    378         Y = Y_uniq[inv].reshape((X.shape[0],) + Y_uniq.shape[1:])
    379         def uniqued_bwd(dY, sgd=None):

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in begin_update(self, X, drop)
     59         callbacks = []
     60         for layer in self._layers:
---> 61             X, inc_layer_grad = layer.begin_update(X, drop=drop)
     62             callbacks.append(inc_layer_grad)
     63         def continue_update(gradient, sgd=None):

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in begin_update(X, *a, **k)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in <listcomp>(.0)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in wrap(*args, **kwargs)
    256     '''
    257     def wrap(*args, **kwargs):
--> 258         output = func(*args, **kwargs)
    259         if splitter is None:
    260             to_keep, to_sink = output

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in begin_update(X, *a, **k)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in <listcomp>(.0)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in wrap(*args, **kwargs)
    256     '''
    257     def wrap(*args, **kwargs):
--> 258         output = func(*args, **kwargs)
    259         if splitter is None:
    260             to_keep, to_sink = output

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in begin_update(X, *a, **k)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in <listcomp>(.0)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in wrap(*args, **kwargs)
    256     '''
    257     def wrap(*args, **kwargs):
--> 258         output = func(*args, **kwargs)
    259         if splitter is None:
    260             to_keep, to_sink = output

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in begin_update(X, *a, **k)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in <listcomp>(.0)
    174     def begin_update(X, *a, **k):
    175         forward, backward = split_backward(layers)
--> 176         values = [fwd(X, *a, **k) for fwd in forward]
    177
    178         output = ops.xp.hstack(values)

~/anaconda3/lib/python3.6/site-packages/thinc/api.py in wrap(*args, **kwargs)
    256     '''
    257     def wrap(*args, **kwargs):
--> 258         output = func(*args, **kwargs)
    259         if splitter is None:
    260             to_keep, to_sink = output

~/anaconda3/lib/python3.6/site-packages/thinc/neural/_classes/static_vectors.py in begin_update(self, ids, drop)
     56             ids = self.ops.xp.ascontiguousarray(ids[:, self.column])
     57         vector_table = self.get_vectors()
---> 58         vectors = vector_table[ids * (ids < vector_table.shape[0])]
     59         assert vectors.shape[0] == ids.shape[0]
     60         def finish_update(gradients, sgd=None):

IndexError: arrays used as indices must be of integer (or boolean) type

Your Environment

  • thinc==6.11.0
  • spaCy version: 2.0.10 / 2.0.9
  • Platform: Darwin-17.4.0-x86_64-i386-64bit
  • Python version: 3.6.4
  • Models: en_core_web_lg, en_core_web_sm
🔮 thinc

All 4 comments

Wtf. The requirements and setup.py for spaCy 2.0.10 both specify Thinc 6.10.2.

Should we delist Thinc 6.11.0? It was a mistaken build, and nothing should be depending on it.

I've gone ahead and deleted 6.11.0

https://www.google.com/maps/d/drive?state=%7B%22ids%22:%5B%221VuEXRAT2zrNpu6qxiQXHgDbd3Mk%22%5D,%22action%22:%22open%22,%22userId%22:%22104879032583722475577%22%7D
-------- Original message --------
From: Ines Montani notifications@github.com
Date: 3/24/18 4:10 PM (GMT-06:00)
To: explosion/spaCy spaCy@noreply.github.com
Cc: Subscribed subscribed@noreply.github.com
Subject: Re: [explosion/spaCy] Error on en_core_web_lg with thinc 6.11.0 (#2144)

Closed #2144.


This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
