Hi, I am using spaCy 2.2.3 with the GPU enabled. When I call the `displacy.render` function, I get an error.
The code I used is below:
```python
import spacy
spacy.require_gpu()
import pandas as pd
import re
from bs4 import BeautifulSoup
import random
from spacy.util import minibatch, compounding
from spacy import displacy
spacy.util.use_gpu(0)

df = pd.read_json("data_point_section_dataset.json", lines=True)
print(len(df))
df = df[df["doc_type"] == "doc_type"]
df = df[df["user_role"] == "special"]
print(len(df))
model_path = "spacy_2_2_3"


def populate_train_data(df):
    """Build spaCy NER training tuples from HTML-tagged annotations."""
    train_data = []
    for d_index, row in df.iterrows():
        content = row["annotations"].replace("\n", "\n").replace("\n", " ")
        content = re.sub(r"(?<=[:])(?=[^\s])", r" ", content)
        # Find the tags and store their character offsets as entities
        soup = BeautifulSoup(content, "html.parser")
        text = soup.get_text()
        entities = []
        for tag in soup.find_all():
            if tag.string is None:
                # Report and skip tags without a single string child
                print(f'Tagging is invalid: {row["_id"], tag.name}, skipping..')
                continue
            # Number of earlier occurrences of the tagged string
            tag_index = content.split(str(tag))[0].count(tag.string)
            try:
                for index, match in enumerate(re.finditer(re.escape(tag.string), text)):
                    if index == tag_index:
                        entities.append((match.start(), match.end(), tag.name))
            except Exception as e:
                print(e)
                continue
        if entities:
            train_data.append((text, {"entities": entities}))
    return train_data


def _train(train_data):
    """Train the NER component of en_core_web_sm on train_data."""
    nlp = spacy.load("en_core_web_sm")
    if "ner" not in nlp.pipe_names:
        ner = nlp.create_pipe("ner")
        nlp.add_pipe(ner, last=True)
    else:
        ner = nlp.get_pipe("ner")
    for _, annotations in train_data:
        for ent in annotations.get("entities"):
            ner.add_label(ent[2])
    optimizer = nlp.begin_training()
    for i in range(20):
        random.shuffle(train_data)
        batches = minibatch(train_data)
        for batch in batches:
            texts, annotations = zip(*batch)
            nlp.update(texts, annotations, sgd=optimizer)
    return nlp


def predict(text, expected_dps):
    """Return the expected entities found in text, plus the Doc itself."""
    try:
        nlp = spacy.load(model_path)
    except OSError:
        raise OSError(f"Model not found: {model_path}")
    text = text.replace("\n", " ")
    doc = nlp(text)
    entities = []
    for entity in doc.ents:
        if entity.label_ in expected_dps:
            data = {
                "label": entity.label_,
                "value": entity.text,
                "start_index": entity.start_char,
                "end_index": entity.end_char,
            }
            entities.append(data)
    return entities, doc


expected_dps = [
    "claim_date_claim_form",
    "injury_date_claim_form",
    "start_injury_claim_form",
    "end_injury_claim_form",
    "injuries_claim_form",
    "app_address_claim_form",
    "injury_report_date_claim_form",
]


def train():
    train_data = populate_train_data(df)
    nlp = _train(train_data)
    nlp.to_disk(model_path)
    dps, dps_for_html = predict(text="the text i want to predict", expected_dps=expected_dps)
    print(dps)
    return ""


train()
text = "the text i want to predict"
soup = BeautifulSoup(text, "html.parser")
text = soup.get_text()
dps, dps_for_html = predict(text=text, expected_dps=expected_dps)
print("hello")
print(dps)
html = displacy.render(dps_for_html.sents, style="ent")
print(html)
```
But I get the following error:
```
Traceback (most recent call last):
  File "train_test.py", line 112, in <module>
    html = displacy.render(dps_for_html.sents,style="ent")
  File "/usr/local/lib/python3.6/site-packages/spacy/displacy/__init__.py", line 46, in render
    docs = [obj if not isinstance(obj, Span) else obj.as_doc() for obj in docs]
  File "/usr/local/lib/python3.6/site-packages/spacy/displacy/__init__.py", line 46, in <listcomp>
    docs = [obj if not isinstance(obj, Span) else obj.as_doc() for obj in docs]
  File "span.pyx", line 232, in spacy.tokens.span.Span.as_doc
  File "span.pyx", line 192, in __iter__
  File "cupy/core/core.pyx", line 948, in cupy.core.core.ndarray.__add__
  File "cupy/core/_kernel.pyx", line 886, in cupy.core._kernel.ufunc.__call__
  File "cupy/core/_kernel.pyx", line 90, in cupy.core._kernel._preprocess_args
TypeError: Unsupported type <class 'numpy.ndarray'>
```
This only happens with the GPU enabled. If I remove the `require_gpu()` call, displaCy works well.
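A possible workaround while GPU mode is on, sketched under assumptions: `predict` above already returns plain dicts with `label`, `start_index` and `end_index`, so they could be converted to displaCy's manual input format and rendered with `manual=True`, which should not go through the `Span.as_doc()` call seen in the traceback. The helper name `to_displacy_manual` and the sample text are hypothetical, and this has not been verified on the affected setup.

```python
def to_displacy_manual(text, entities, title=None):
    """Convert predict()-style entity dicts into displaCy's manual "ent" input."""
    ents = [
        {"start": e["start_index"], "end": e["end_index"], "label": e["label"]}
        for e in entities
    ]
    # Keep the spans in reading order (sorted by start offset).
    ents.sort(key=lambda e: e["start"])
    return {"text": text, "ents": ents, "title": title}


# Hypothetical sample input in the shape that predict() returns.
example = to_displacy_manual(
    "Injury on 2019-01-02 was reported.",
    [
        {
            "label": "injury_date_claim_form",
            "value": "2019-01-02",
            "start_index": 10,
            "end_index": 20,
        }
    ],
)
print(example)
```

With spaCy installed, the result could then be passed as `displacy.render(example, style="ent", manual=True)`. Because the input is a plain dict, no `Doc` or `Span` has to be iterated for rendering.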
Hi @pyshahid , thanks for the report!
This certainly looks like an issue with spaCy.
Unfortunately your code snippet could not be run as-is, because it uses an external json file. Is there any chance you could reduce your code to a minimal running script that still exhibits the error? That would really help debugging on our side.
@svlandeg I tried with:
```python
import spacy
spacy.require_gpu()
import random
from spacy.util import minibatch, compounding
from spacy import displacy

nlp = spacy.load("en_core_web_sm")
train_data = [("Uber blew through $1 million", {"entities": [(0, 4, "ORG")]})]
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]
with nlp.disable_pipes(*other_pipes):
    optimizer = nlp.begin_training()
    for i in range(10):
        random.shuffle(train_data)
        batches = minibatch(train_data, size=compounding(4.0, 32.0, 1.001))
        for batch in batches:
            # Unpack the batch into parallel tuples of texts and annotations
            text, annotation = zip(*batch)
            nlp.update(text, annotation, sgd=optimizer)
nlp.to_disk("model")
model = spacy.load("model")
# Run the saved model on the example sentence
doc = model(train_data[0][0])
sentence = list(doc.sents)
displacy.serve(sentence, style="ent")
```
But I wasn't able to reproduce the issue with this script. With the code I posted earlier, I still get the error.
I will post an update once I have created a JSON file and tried again.
Hi,
I am also seeing this issue.
Running this on Google Colab with a GPU runtime:
```python
import spacy
spacy.prefer_gpu()

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
```
produces
```
NameError                                 Traceback (most recent call last)
<ipython-input-1-7caac881dafb> in <module>()
      3
      4 nlp = spacy.load("en_core_web_sm")
----> 5 doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

20 frames

/usr/local/lib/python3.6/dist-packages/spacy/language.py in __call__(self, text, disable, component_cfg)
    400 if not hasattr(proc, "__call__"):
    401 raise ValueError(Errors.E003.format(component=type(proc), name=name))
--> 402 doc = proc(doc, **component_cfg.get(name, {}))
    403 if doc is None:
    404 raise ValueError(Errors.E005.format(name=name))

pipes.pyx in spacy.pipeline.pipes.Tagger.__call__()

pipes.pyx in spacy.pipeline.pipes.Tagger.predict()

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/model.py in __call__(self, x)
    167 Must match expected shape
    168 """
--> 169 return self.predict(x)
    170
    171 def pipe(self, stream, batch_size=128):

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/feed_forward.py in predict(self, X)
     38 def predict(self, X):
     39 for layer in self._layers:
---> 40 X = layer(X)
     41 return X
     42

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/model.py in __call__(self, x)
    167 Must match expected shape
    168 """
--> 169 return self.predict(x)
    170
    171 def pipe(self, stream, batch_size=128):

/usr/local/lib/python3.6/dist-packages/thinc/api.py in predict(seqs_in)
    308 def predict(seqs_in):
    309 lengths = layer.ops.asarray([len(seq) for seq in seqs_in])
--> 310 X = layer(layer.ops.flatten(seqs_in, pad=pad))
    311 return layer.ops.unflatten(X, lengths, pad=pad)
    312

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/model.py in __call__(self, x)
    167 Must match expected shape
    168 """
--> 169 return self.predict(x)
    170
    171 def pipe(self, stream, batch_size=128):

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/feed_forward.py in predict(self, X)
     38 def predict(self, X):
     39 for layer in self._layers:
---> 40 X = layer(X)
     41 return X
     42

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/model.py in __call__(self, x)
    167 Must match expected shape
    168 """
--> 169 return self.predict(x)
    170
    171 def pipe(self, stream, batch_size=128):

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/model.py in predict(self, X)
    131
    132 def predict(self, X):
--> 133 y, _ = self.begin_update(X, drop=None)
    134 return y
    135

/usr/local/lib/python3.6/dist-packages/thinc/api.py in uniqued_fwd(X, drop)
    377 )
    378 X_uniq = layer.ops.xp.ascontiguousarray(X[ind])
--> 379 Y_uniq, bp_Y_uniq = layer.begin_update(X_uniq, drop=drop)
    380 Y = Y_uniq[inv].reshape((X.shape[0],) + Y_uniq.shape[1:])
    381

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/feed_forward.py in begin_update(self, X, drop)
     44 callbacks = []
     45 for layer in self._layers:
---> 46 X, inc_layer_grad = layer.begin_update(X, drop=drop)
     47 callbacks.append(inc_layer_grad)
     48

/usr/local/lib/python3.6/dist-packages/thinc/api.py in begin_update(X, *a, **k)
    161 def begin_update(X, *a, **k):
    162 forward, backward = split_backward(layers)
--> 163 values = [fwd(X, *a, **k) for fwd in forward]
    164
    165 output = ops.xp.hstack(values)

/usr/local/lib/python3.6/dist-packages/thinc/api.py in <listcomp>(.0)
    161 def begin_update(X, *a, **k):
    162 forward, backward = split_backward(layers)
--> 163 values = [fwd(X, *a, **k) for fwd in forward]
    164
    165 output = ops.xp.hstack(values)

/usr/local/lib/python3.6/dist-packages/thinc/api.py in wrap(*args, **kwargs)
    254
    255 def wrap(*args, **kwargs):
--> 256 output = func(*args, **kwargs)
    257 if splitter is None:
    258 to_keep, to_sink = output

/usr/local/lib/python3.6/dist-packages/thinc/api.py in begin_update(X, *a, **k)
    161 def begin_update(X, *a, **k):
    162 forward, backward = split_backward(layers)
--> 163 values = [fwd(X, *a, **k) for fwd in forward]
    164
    165 output = ops.xp.hstack(values)

/usr/local/lib/python3.6/dist-packages/thinc/api.py in <listcomp>(.0)
    161 def begin_update(X, *a, **k):
    162 forward, backward = split_backward(layers)
--> 163 values = [fwd(X, *a, **k) for fwd in forward]
    164
    165 output = ops.xp.hstack(values)

/usr/local/lib/python3.6/dist-packages/thinc/api.py in wrap(*args, **kwargs)
    254
    255 def wrap(*args, **kwargs):
--> 256 output = func(*args, **kwargs)
    257 if splitter is None:
    258 to_keep, to_sink = output

/usr/local/lib/python3.6/dist-packages/thinc/api.py in begin_update(X, *a, **k)
    161 def begin_update(X, *a, **k):
    162 forward, backward = split_backward(layers)
--> 163 values = [fwd(X, *a, **k) for fwd in forward]
    164
    165 output = ops.xp.hstack(values)

/usr/local/lib/python3.6/dist-packages/thinc/api.py in <listcomp>(.0)
    161 def begin_update(X, *a, **k):
    162 forward, backward = split_backward(layers)
--> 163 values = [fwd(X, *a, **k) for fwd in forward]
    164
    165 output = ops.xp.hstack(values)

/usr/local/lib/python3.6/dist-packages/thinc/api.py in wrap(*args, **kwargs)
    254
    255 def wrap(*args, **kwargs):
--> 256 output = func(*args, **kwargs)
    257 if splitter is None:
    258 to_keep, to_sink = output

/usr/local/lib/python3.6/dist-packages/thinc/neural/_classes/hash_embed.py in begin_update(self, ids, drop)
     57 if ids.ndim >= 2:
     58 ids = self.ops.xp.ascontiguousarray(ids[:, self.column], dtype="uint64")
---> 59 keys = self.ops.hash(ids, self.seed) % self.nV
     60 vectors = self.vectors[keys].sum(axis=1)
     61 mask = self.ops.get_dropout_mask((vectors.shape[1],), drop)

ops.pyx in thinc.neural.ops.CupyOps.hash()

NameError: name 'gpu_ops' is not defined
```
Simply removing the line `spacy.prefer_gpu()` makes the code run perfectly.
Hi,
Installing CUDA-enabled spaCy with `pip install spacy[cuda100]` solved the problem for me. Perhaps this could be made clearer in the documentation. Thanks!
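To check up front whether the GPU extras are present, a small sketch like the following might help. It uses only the standard library to see whether modules can be found, without importing them. The module names `cupy` and `thinc_gpu_ops` are an assumption about what `spacy[cuda100]` installs, and `missing_modules` is a hypothetical helper, not part of spaCy.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed GPU dependencies; adjust if your install uses different packages.
missing = missing_modules(["cupy", "thinc_gpu_ops"])
if missing:
    print("Missing GPU modules:", ", ".join(missing))
else:
    print("GPU modules found")
```

If anything is reported missing, skipping `spacy.prefer_gpu()` and staying on the CPU avoids the `NameError` above.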
Good to hear you got your issue resolved @josesho, but it does look like the original post may be related to something else.
@pyshahid: can you show the commands you used to install spaCy, along with their output? And can you provide a minimal running script that exhibits your error?
This issue has been automatically closed because there has been no response to a request for more information from the original author. With only the information that is currently in the issue, there's not enough information to take action. If you're the original author, feel free to reopen the issue if you have or find the answers needed to investigate further.
Hi,
Sorry @svlandeg, new error spotted! It seems when there are out-of-vocab words in the model, a TypeError is thrown?
```python
import spacy
spacy.prefer_gpu()

nlp_core = spacy.load("en_core_web_lg")
text1 = "The effect of anxiogenic treatments on three rodent models of anxiety: \
the open field test, the elevated plus-maze, and the light-dark box."
doc1 = nlp_core(text1)
doc1.vector
```
produces
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-531ef58ab65a> in <module>()
      1 doc1 = nlp_core(text1)
      2
----> 3 doc1.vector

doc.pyx in __iter__()

cupy/core/core.pyx in cupy.core.core.ndarray.__add__()

cupy/core/_kernel.pyx in cupy.core._kernel.ufunc.__call__()

cupy/core/_kernel.pyx in cupy.core._kernel._preprocess_args()

TypeError: Unsupported type <class 'numpy.ndarray'>
```
Relevant package versions:
```python
print(spacy.__version__)  # 2.2.3
print(cupy.__version__)   # 7.2.0
```
Hi, this problem has been fixed (#4680) and should be available soon in version 2.2.4.
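In the meantime, a script could check the installed version before enabling the GPU. This is a minimal sketch: `has_fix` is a hypothetical helper, it assumes a version string of the form `X.Y.Z` (any suffix after the leading digits of each part is ignored), and the `2.2.4` threshold is taken from the comment above.

```python
def has_fix(version, fixed_in=(2, 2, 4)):
    """Return True if `version` (e.g. "2.2.3") is at or past `fixed_in`."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits of each part, e.g. "4rc1" -> 4.
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as "2.3" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts) >= fixed_in


print(has_fix("2.2.3"))
print(has_fix("2.2.4"))
```

In practice one would call `has_fix(spacy.__version__)` and only call `spacy.prefer_gpu()` when it returns `True`, staying on the CPU otherwise.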
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.