spaCy: TypeError: Only cupy arrays can be concatenated (Training on GPU)

Created on 4 Jan 2018 · 21 comments · Source: explosion/spaCy

Training the textcat on the GPU gives the following error. It looks like to_array() doesn't produce CuPy arrays, so the ops.xp.concatenate() call in _preprocess_doc fails on the numpy inputs.

$ python scripts/train_textcat.py
Created blank 'en' model
Loading IMDB data...
Using 2000 examples (1600 training, 400 evaluation)
Training the model...
LOSS      P       R       F
Traceback (most recent call last):
  File "scripts/train_textcat.py", line 133, in <module>
    plac.call(main)
  File "/home/motoki/aes/lib/python3.6/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/home/motoki/aes/lib/python3.6/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "scripts/train_textcat.py", line 68, in main
    losses=losses)
  File "/home/motoki/aes/lib/python3.6/site-packages/spacy/language.py", line 407, in update
    proc.update(docs, golds, drop=drop, sgd=get_grads, losses=losses)
  File "pipeline.pyx", line 817, in spacy.pipeline.TextCategorizer.update
  File "/home/motoki/aes/lib/python3.6/site-packages/thinc/api.py", line 61, in begin_update
    X, inc_layer_grad = layer.begin_update(X, drop=drop)
  File "/home/motoki/aes/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/home/motoki/aes/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/home/motoki/aes/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
    output = func(*args, **kwargs)
  File "/home/motoki/aes/lib/python3.6/site-packages/thinc/api.py", line 61, in begin_update
    X, inc_layer_grad = layer.begin_update(X, drop=drop)
  File "/home/motoki/aes/lib/python3.6/site-packages/spacy/_ml.py", line 101, in _preprocess_doc
    keys = ops.xp.concatenate(keys)
  File "/home/motoki/aes/lib/python3.6/site-packages/cupy/manipulation/join.py", line 49, in concatenate
    return core.concatenate_method(tup, axis)
  File "cupy/core/core.pyx", line 2410, in cupy.core.core.concatenate_method
  File "cupy/core/core.pyx", line 2422, in cupy.core.core.concatenate_method
TypeError: Only cupy arrays can be concatenated
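
For reference, the failure can be reproduced in isolation: cupy.concatenate() only accepts cupy.ndarray inputs, and _preprocess_doc hands it the numpy arrays that come back from to_array(). A minimal sketch, assuming cupy is installed and a GPU is available:

import numpy
import cupy

keys = [numpy.array([1, 2, 3], dtype='uint64'),
        numpy.array([4, 5], dtype='uint64')]

# On CPU this is fine: numpy concatenates numpy arrays.
numpy.concatenate(keys)

# On GPU this is the failing call: cupy refuses numpy inputs.
cupy.concatenate(keys)  # TypeError: Only cupy arrays can be concatenated

# Coercing the inputs first avoids the TypeError, though as the
# comments below show, downstream thinc code still expects numpy.
cupy.concatenate([cupy.asarray(k) for k in keys])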

Your Environment

Info about spaCy

  • spaCy version: 2.0.5
  • Platform: Linux-4.10.0-42-generic-x86_64-with-Ubuntu-16.04-xenial
  • Python version: 3.6.4
  • Models: en_core_web_sm
  • thinc 6.10.1
  • cupy 2.2.0
  • Nvidia build version 387.34
  • CUDA version 8.0
Labels: bug, gpu

All 21 comments

I have the same issue.

  • spaCy version: 2.0.4
  • Platform: Windows 10 64-bit
  • Python version: 3.6.4
  • Models: en_core_web_lg
  • thinc 6.10.2
  • cupy 2.2.0
  • Nvidia build version 390.65
  • CUDA version 9.1

Any updates?

Same issue here

  • spaCy 2.0.6
  • Platform Ubuntu 16.04
  • Python version: 3.5.2
  • Models: en
  • thinc: 6.10.2
  • cupy 2.3.0
  • Nvidia build version 384.111
  • CUDA version 9.0

I am facing the same issue. Any updates on this?

I think it was fixed in the latest version.

It seems the issue has gone, but I see that even though you set the flag to use the GPU, it does not actually use it. Not sure what is wrong.

Not fixed for me.
The error still looks the same as in the first post.

  • spaCy 2.0.10
  • Platform Windows 10 64-bit
  • Python 3.6.4
  • Models en
  • thinc 6.10.1
  • cupy 2.5.0
  • NVidia 391.35
  • CUDA 9.0

I have the same error as in the first post, and it is still not fixed.

I am encountering the same error:

  • spaCy 2.0.11
  • Ubuntu 17.10 x86_64
  • Python 3.6.3
  • Models: en, en_core_web_sm
  • thinc 6.10.2
  • cupy 2.5.0
  • Nvidia build 387.26
  • CUDA 9.1

Has anyone had any luck resolving this?

Any suggestions?
@ismaeIfm please specify your spaCy version and environment!

I have the same error:

spaCy 2.0.11
Ubuntu 16.04 x86_64
Python 2.7.12
thinc 6.10.2
cupy-cuda91 4.0.0
Nvidia build 387.26
CUDA 9.1

Has anyone had any luck resolving this?

I was hoping we could use an append operation instead, but CuPy doesn't have one.

I am having the same issue with my custom code and also with the example from the website; I was able to reproduce it on multiple machines and configurations. Can someone from the team take a look at it?

I'm having the same issue.
spacy==2.0.11
thinc==6.10.2
CUDA 9.0
Ubuntu 16 (running in an nvidia-docker container on an Ubuntu machine)

Same issue running on a GPU-enabled Azure VM (NC6 with a Tesla K80 GPU)
ubuntu 17.10 x86_64
python 3.6.3
spacy 2.0.11
thinc 6.11.2
cuda 9.2.148
cupy 4.2.0
Models en_core_web_sm

Bump. Definitely having this issue, with a similar setup to the previous reporters. Blank 'en' model; spaCy HEAD (pre-2.0.12, 1a16162d), thinc 6.11.2, CUDA 9.0.176, cupy 4.3.0.

Same. I tried editing cupy/manipulation/join.py's concatenate function to convert the numpy arrays into cupy arrays (roughly as in the sketch after the version list below), but that led to several other errors down the line (something about ndarrays where bytes were expected). I would guess that thinc uses or assumes some feature of numpy that cupy does not support.
ubuntu 18.04
python 3.6.5
spacy 2.0.12
thinc 6.10.3
cuda 9.2
cupy 4.4.0
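
The edit described above presumably amounted to coercing the inputs before dispatch. A hypothetical monkey-patch version of that idea (not the commenter's actual diff, and it reproduces the same downstream failures):

import cupy

_original_concatenate = cupy.concatenate

def patched_concatenate(tup, axis=0):
    # Coerce any numpy arrays to cupy before concatenating. This
    # silences the TypeError but only pushes the numpy/cupy mismatch
    # further down into thinc, as the surrounding comments describe.
    return _original_concatenate([cupy.asarray(a) for a in tup], axis)

cupy.concatenate = patched_concatenate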

After applying a (hacky) fix to concatenate, I tried converting the cupy.ndarrays into bytes and into numpy.ndarrays; both conversions cause further problems in thinc. The heart of this error is that thinc originally uses numpy arrays, which don't work with cupy's methods, but if you try to change that, it fails later because thinc (and its Cython code) want, and only work with, numpy arrays. I also tried various changes to thinc/linear/linear.pyx:LinearModel.begin_update() (as that is where the problems seem to come from) with no success (probably owing to my lack of Cython knowledge). Honestly, the problem seems to be on thinc's side, but I still haven't figured out why some people's GPUs do work for training.
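
For the record, the conversions described here are one-liners in cupy; the point of the comment is that neither resulting type satisfies thinc's CPU-side code. A sketch, assuming a cupy array arr:

import cupy

arr = cupy.arange(10, dtype='float32')

as_numpy = cupy.asnumpy(arr)   # device-to-host copy, a numpy.ndarray
as_bytes = as_numpy.tobytes()  # raw buffer, loses dtype/shape metadata

# Per the comment above, either form still fails inside
# thinc.linear.LinearModel.begin_update() further down the line.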

Any update on this? Or any workaround?

Sorry for the delay on this. The issue is in the linear model in Thinc, which we ensemble with the textcat model in spaCy.

I think we should probably have a flag in the textcat to disable the ensembling. The ensemble is really most effective when we have small data sets, e.g. for Prodigy. For GPU training, presumably the dataset is large, so the ensemble is less motivated.

Simply fixing the linear model implementation in Thinc turns out to be difficult, because Thinc is using the "hashing trick". Making sure the hashing works the same across the CPU and GPU without making the CPU implementation inefficient is non-trivial.
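
For readers unfamiliar with the term, the "hashing trick" maps sparse feature IDs into a fixed-size weight table via a hash function instead of a learned vocabulary lookup; the portability problem is that the CPU and GPU must compute bit-identical buckets. A generic illustration, not Thinc's actual implementation (Thinc uses MurmurHash in Cython):

import numpy

N_BUCKETS = 2 ** 18  # fixed table size, independent of vocabulary size

def bucket(feature_ids, seed=0):
    # Map arbitrary 64-bit feature IDs into the fixed-size table.
    # A multiplicative hash stands in for MurmurHash here; unsigned
    # 64-bit arithmetic wraps on overflow, which is what we rely on.
    h = feature_ids * numpy.uint64(0x9E3779B97F4A7C15) + numpy.uint64(seed)
    return h % numpy.uint64(N_BUCKETS)

weights = numpy.zeros(N_BUCKETS, dtype='float32')
ids = numpy.array([0x1234, 0xBEEF, 0xCAFE], dtype='uint64')
score = weights[bucket(ids)].sum()  # linear score from hashed features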

Not related to this issue, but is there a simple way to use multiple CPU cores for the text categorizer, since the GPU throws this error at the moment?
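
Not an official answer, but for prediction (training shares optimizer state and can't be split up this way) a common pattern is to shard the texts across worker processes, each loading its own copy of the pipeline. A minimal sketch using only the standard library; the model name is just an example:

from multiprocessing import Pool

import spacy

def predict_chunk(texts):
    # Each worker loads its own pipeline copy; loaded spaCy models
    # aren't safely shareable across processes.
    nlp = spacy.load('en_core_web_sm')
    return [doc.cats for doc in nlp.pipe(texts)]

if __name__ == '__main__':
    texts = ['first document ...', 'second document ...'] * 100
    shards = [texts[i::4] for i in range(4)]  # 4 shards for 4 cores
    with Pool(4) as pool:
        results = pool.map(predict_chunk, shards)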

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
