I need to handle German and English with a single application. It worked fine with spaCy 1.8.2, 1.9.0, and 1.10.0, but breaks with spaCy 2.0.3.
To reproduce the issue:
>>> import spacy
>>> nlpEN = spacy.load('en')
>>> nlpDE = spacy.load('de')
>>> doc = nlpEN('Hello world!')
The error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Projects/foobar/.env/lib/python3.6/site-packages/spacy/language.py", line 333, in __call__
doc = proc(doc)
File "pipeline.pyx", line 390, in spacy.pipeline.Tagger.__call__
File "pipeline.pyx", line 402, in spacy.pipeline.Tagger.predict
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 55, in predict
X = layer(X)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 293, in predict
X = layer(layer.ops.flatten(seqs_in, pad=pad))
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 55, in predict
X = layer(X)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 125, in predict
y, _ = self.begin_update(X)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 372, in uniqued_fwd
Y_uniq, bp_Y_uniq = layer.begin_update(X[ind], drop=drop)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 61, in begin_update
X, inc_layer_grad = layer.begin_update(X, drop=drop)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/static_vectors.py", line 67, in begin_update
dotted = self.ops.batch_dot(vectors, self.W)
File "ops.pyx", line 299, in thinc.neural.ops.NumpyOps.batch_dot
ValueError: shapes (4,0) and (300,128) not aligned: 0 (dim 1) != 300 (dim 0)
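For readers puzzling over the shape error: `(4, 0)` means the lookup returned 4 token rows of width 0, i.e. the vector table the layer indexed into was not the 300-dim table the weights were trained against. A minimal NumPy sketch (not spaCy's actual code) reproduces the same ValueError:

```python
import numpy as np

# A batch of 4 tokens whose vector lookup came back empty: the wrong
# (overwritten) vector table had no columns, giving shape (4, 0).
vectors = np.zeros((4, 0))
# The model's projection weights still expect 300-dim pre-trained vectors.
W = np.zeros((300, 128))

try:
    vectors.dot(W)
except ValueError as err:
    print(err)  # shapes (4,0) and (300,128) not aligned
```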
I have a similar error working with Dutch and English in the same application. The Dutch model does not give the error; only the English model does.
from spacy import displacy
import en_core_web_md
import nl_core_news_sm
nlp_english = en_core_web_md.load()
nlp_dutch = nl_core_news_sm.load()
nlp_english("I Am an englis barclay bank")
Version
Python 3.6.4
Installed models (spaCy v2.0.5)
\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy
TYPE      NAME              MODEL             VERSION
package   xx-ent-wiki-sm    xx_ent_wiki_sm    2.0.0
package   nl-core-news-sm   nl_core_news_sm   2.0.0
package   en-core-web-md    en_core_web_md    2.0.0
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\language.py", line 333, in __call__
doc = proc(doc)
File "pipeline.pyx", line 390, in spacy.pipeline.Tagger.__call__
File "pipeline.pyx", line 402, in spacy.pipeline.Tagger.predict
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\neural\_classes\model.py", line 161, in __call__
return self.predict(x)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 55, in predict
X = layer(X)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\neural\_classes\model.py", line 161, in __call__
return self.predict(x)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 293, in predict
X = layer(layer.ops.flatten(seqs_in, pad=pad))
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\neural\_classes\model.py", line 161, in __call__
return self.predict(x)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 55, in predict
X = layer(X)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\neural\_classes\model.py", line 161, in __call__
return self.predict(x)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\neural\_classes\model.py", line 125, in predict
y, _ = self.begin_update(X)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 374, in uniqued_fwd
Y_uniq, bp_Y_uniq = layer.begin_update(X_uniq, drop=drop)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 61, in begin_update
X, inc_layer_grad = layer.begin_update(X, drop=drop)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 258, in wrap
output = func(*args, **kwargs)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 258, in wrap
output = func(*args, **kwargs)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 258, in wrap
output = func(*args, **kwargs)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\api.py", line 258, in wrap
output = func(*args, **kwargs)
File "C:\Users\mike\AppData\Local\Programs\Python\Python36\lib\site-packages\thinc\neural\_classes\static_vectors.py", line 67, in begin_update
dotted = self.ops.batch_dot(vectors, self.W)
File "ops.pyx", line 338, in thinc.neural.ops.NumpyOps.batch_dot
ValueError: shapes (7,0) and (300,128) not aligned: 0 (dim 1) != 300 (dim 0)
I am experiencing the same problem. Whenever there are multiple spaCy models in memory, one of them tends to fail (usually English) with the stack trace @zhaow-de shows in his post.
Has anyone found a fix for this?
I think this comes down to an ill-considered use of a global variable when using pre-trained models in Thinc. The global variable is used to avoid storing extra copies of the vectors data. However, I think it's not keyed correctly by the spaCy model --- causing this error when there are multiple language models in memory.
I expect to get to this bug before the end of the week -- thanks for your patience; and thanks for reporting.
Has anybody found even a temporary fix to this?
This does not only happen with pre-trained models; it is happening for me with all custom models as well.
It works if you use a smaller model.
For example, with en_core_web_md, Spanish, and Dutch loaded, English will fail,
but with en_core_web_sm, Spanish, and Dutch loaded, everything seems to work.
Still waiting on a fix though :)
I'm guessing that's because the smaller model only has the context vectors and not the full set of pre-trained vectors (which is what's getting overwritten?). Can anybody point to where this is 'keyed'?
Tom, your workaround works, but I am skeptical about what is going on behind the scenes. For example, if I load like this, everything is fine:
spanish = spacy.load('es_core_news_sm')
english = spacy.load('large_custom_english_model')
but if I load in the reverse order, with English first,
english = spacy.load('large_custom_english_model')
spanish = spacy.load('es_core_news_sm')
I get the error. This leads me not to trust the results. Does the first ordering only work because the English model overwrites the context vectors of the Spanish model, which happen to share the same dimensions, so it works coincidentally?
I also get this for running two English models (I wanted to compare them)
md_nlp = spacy.load('en_core_web_md')
sm_nlp = spacy.load('en_core_web_sm')
causes uses of md_nlp to fail.
Same situation here.
Error message:
ValueError: shapes (12,50) and (300,128) not aligned: 50 (dim 1) != 300 (dim 0)
Models:
Installed models (spaCy v2.0.8)
C:\ProgramData\Anaconda3\envs\dflt3\lib\site-packages\spacy
TYPE NAME MODEL VERSION
package en-core-web-md en_core_web_md 2.0.0
package en-core-web-sm en_core_web_sm 2.0.0
package es-core-news-sm es_core_news_sm 2.0.0
package es-core-news-md es_core_news_md 2.0.0
package xx-ent-wiki-sm xx_ent_wiki_sm 2.0.0
link es_core_news_md es_core_news_md 2.0.0
link xx xx_ent_wiki_sm 2.0.0
link es es_core_news_md 2.0.0
link en en_core_web_sm 2.0.0
link en_core_web_md en_core_web_md 2.0.0
@honnibal could you please give an update on this bug? It is a real dealbreaker for my use case, since I want to serve multiple languages. I would like to know the timeframe, if possible :), before I set up separate instances for each language.
@honnibal This is a huge problem for us; we need Spanish and English. I don't really want to change my microservices to need separate Spanish and English instances.
+1 -- We are preparing to move off of spaCy entirely in a large production environment because of this critical issue. It would have been nice to even just get a pointer to where in Thinc this is happening, so that maybe the community could investigate and help...
@liuzzi Really sorry for losing track of this issue.
First, here's some background on the life-cycle of pre-trained vectors, and how they get used within the models.
Most of the models build context-sensitive word representations using the spacy._ml.Tok2Vec function. This function builds some learned vector representations, and then optionally also concatenates on the pre-trained vectors, if a parameter is passed telling it the dimensions of the pre-trained vectors to pass in.
The pre-trained vectors are loaded within the thinc.neural._classes.static_vectors module. This maintains a dictionary to cache multiple vector tables, so that they don't need to be reloaded. To actually assign vectors to a batch of words, the StaticVectors class is passed in an array of integer IDs, and it's told ahead of time which column to read. It gets this column, and then uses this as indices into the vectors table.
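The caching scheme described above can be sketched roughly as follows. This is illustrative only (`VECTOR_TABLES`, `get_vectors`, and `embed_batch` are made-up names, not thinc's actual API): a module-level dict caches each vector table under a key, and embedding a batch means reading the row-ID column out of the feature array and using it to index the table.

```python
import numpy as np

# Hypothetical module-level cache of vector tables, keyed per model,
# so each table is only loaded once.
VECTOR_TABLES = {}

def get_vectors(key, loader):
    """Load a vector table once and cache it under `key`."""
    if key not in VECTOR_TABLES:
        VECTOR_TABLES[key] = loader()
    return VECTOR_TABLES[key]

def embed_batch(key, features, column):
    """Read the row-ID column from the feature array, then use those
    IDs as indices into the cached vector table."""
    table = VECTOR_TABLES[key]
    rows = features[:, column]
    return table[rows]

# Toy table: 10 words with 300-dim vectors, keyed per model.
get_vectors("en_md", lambda: np.random.rand(10, 300))
feats = np.array([[0, 3], [0, 7]])   # column 1 holds the vector-row IDs
out = embed_batch("en_md", feats, column=1)
print(out.shape)  # (2, 300)
```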
The StaticVectors class is where the error you're seeing occurs, and the only mention of the class should be within the spacy._ml file, within that Tok2Vec function. Having a look at the Tok2Vec function, we can already see the problem:
if pretrained_dims is not None and pretrained_dims >= 1:
    glove = StaticVectors(VECTORS_KEY, width, column=cols.index(ID))
    embed = uniqued(
        (glove | norm | prefix | suffix | shape)
        >> LN(Maxout(width, width*5, pieces=3)), column=5)
else:
    embed = uniqued(
        (norm | prefix | suffix | shape)
        >> LN(Maxout(width, width*4, pieces=3)), column=5)
We're passing in a single key VECTORS_KEY to the StaticVectors class, which means the vectors from different models are stepping on each other. For models which have different pre-trained dimensions, this then causes the models to fail to load correctly.
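To make the collision concrete, here is a toy reproduction (again with made-up names, not thinc's actual code): when every model registers its table under the one shared key, the last `load` wins, and the first model's weights then see vectors of the wrong width. Keying the cache per model keeps both tables intact.

```python
import numpy as np

VECTOR_TABLES = {}

def register(key, table):
    VECTOR_TABLES[key] = table                   # last writer wins

# Both models use the single shared key, as in the snippet above.
SHARED_KEY = "spacy_pretrained_vectors"
register(SHARED_KEY, np.random.rand(100, 300))   # English: 300-dim vectors
register(SHARED_KEY, np.random.rand(100, 0))     # small model: no vectors

W = np.random.rand(300, 128)                     # English tagger weights
rows = VECTOR_TABLES[SHARED_KEY][[1, 2, 3, 4]]   # now shape (4, 0)!
try:
    rows.dot(W)
except ValueError as err:
    print("collision:", err)

# Keying the cache per model avoids the overwrite.
register("en_core_web_md", np.random.rand(100, 300))
register("nl_core_news_sm", np.random.rand(100, 0))
ok = VECTOR_TABLES["en_core_web_md"][[1, 2, 3, 4]].dot(W)
print(ok.shape)  # (4, 128)
```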
One solution would be to avoid using the StaticVectors class altogether. That class has trouble because it expects to be passed in the array of IDs, which means it has to somehow load the vectors in the background. This is difficult.
We also have another class within the _ml module, used in the text classifier. The SpacyVectors class gets passed in a batch of Doc objects, rather than the arrays. This makes it very easy to fetch the vectors. The downside is it's harder to cache the whole vector computation per word type: if we first extract the array, we can have a convenient column to unique by. It's also much easier to keep the GPU efficient this way.
Edit: It's tough to give up on extracting the array and indexing into it, so I've been looking at passing the vectors into the Tok2Vec function. It's still difficult though. The keying issue also comes up in the link_vectors_to_models helper function, which can also be found within spacy._ml. In this function, we set the lex.rank attribute for the words in the vocab to the row in the vector table. We do this once, and it saves us a hash lookup on each token.
However, we need a unique ID for the vocab and vectors -- which currently we don't have. The vocab knows its language, but that's not unique enough. We also don't want to do id(vocab), as we can't persist that.
We also can't simply pass in the data to the StaticVectors class. If we did that, we'd have to save the vectors within each model (parser, tagger, entity recognizer, etc) --- because once we deserialized the models, we'd have the same problem.
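One direction for the missing unique ID (purely a sketch, not what spaCy shipped) would be to derive the cache key from the vector data itself: unlike `id(vocab)`, a content hash is stable across serialization and deserialization, and distinct tables hash to distinct keys. The `vectors_key` helper below is hypothetical.

```python
import hashlib
import numpy as np

def vectors_key(table):
    """Derive a persistable cache key from the vector data itself.
    A content hash survives serialize/deserialize, unlike id(vocab)."""
    h = hashlib.sha1()
    h.update(np.ascontiguousarray(table).tobytes())
    h.update(str(table.shape).encode("utf8"))  # disambiguate reshapes
    return h.hexdigest()

en = np.arange(300.0).reshape(1, 300)
nl = np.arange(128.0).reshape(1, 128)
print(vectors_key(en) == vectors_key(en.copy()))  # True: key is stable
print(vectors_key(en) == vectors_key(nl))         # False: tables differ
```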
Still working on this. In the meantime, this package should mitigate the issue: https://github.com/kootenpv/spacy_api
Same issue here, when loaded like this:
nlp_en = spacy.load('en_core_web_md')
nlp_es = spacy.load('es_core_news_md')
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.