spaCy: First example on webpage doesn't work

Created on 31 May 2017  ·  13 comments  ·  Source: explosion/spaCy

kbriggs:~/python> python -m spacy download en

Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-5txkgh-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-2rrJEB-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-5txkgh-build/

kbriggs:~/python> python try_spacy.py

Warning: no model found for 'en'

Only loading the 'en' tokenizer.

Traceback (most recent call last):
  File "try_spacy.py", line 9, in <module>
    doc = nlp(text)
  File "/usr/local/lib/python2.7/dist-packages/spacy/language.py", line 320, in __call__
    doc = self.make_doc(text)
  File "/usr/local/lib/python2.7/dist-packages/spacy/language.py", line 293, in <lambda>
    self.make_doc = lambda text: self.tokenizer(text)
TypeError: Argument 'string' has incorrect type (expected unicode, got str)

All 13 comments

Try to write the complete model name while downloading model:
python -m spacy download en_core_web_sm

Use en_core_web_sm instead of en everywhere.

Note you can also use other models. I've used the default one.

Also make sure the input text passed to nlp() is unicode, not str.

@keithbriggs I haven't had time to look at your other issue #1095 in detail, but I'm pretty sure this is related and there's an underlying installation problem in your case. (I hope it's okay if I close this and merge it with #1095, so we can keep this in one place.)

As @mraduldubey mentioned, the last error might be related to unicode. If you're using Python 2, make sure that the text in doc = nlp(text) is unicode. For example, text = u"This is a sentence".
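The unicode point above can be sketched as a small helper. This is only an illustration: to_unicode is a hypothetical name, not part of spaCy, and UTF-8 is an assumed encoding. It relies on bytes being an alias for str in Python 2.7, so the same code runs on Python 2 and 3.

```python
def to_unicode(text, encoding="utf-8"):
    """Return text as a unicode string, decoding byte strings if needed.

    Hypothetical helper for illustration; the encoding default is an assumption.
    """
    # On Python 2.7, `bytes` is the same type as `str`, so a plain "..." literal
    # is decoded here. On Python 3, only real bytes objects are decoded.
    if isinstance(text, bytes):
        return text.decode(encoding)
    return text


# A byte string (what a plain "..." literal is on Python 2) becomes unicode.
text = to_unicode(b"This is a sentence")

# Text that is already unicode passes through unchanged.
same = to_unicode(u"This is a sentence")

# The result can then be passed to the pipeline, e.g. doc = nlp(text),
# assuming nlp was created earlier with spacy.load('en').
```

For string literals in your own script, the u"..." prefix shown above is simpler; a helper like this is mainly useful for text read from files or other byte sources.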

Regarding "Try to write the complete model name while downloading model":

This shouldn't actually be necessary – spaCy should automatically map en to en_core_web_sm (the default model). However, if you want to download a non-default model, like en_core_web_md, you'll definitely have to type the full name.

Thanks very much for your help - I've got it working now. My main concern is that anyone should be able to take the first example on your website, and it should just work exactly as it is.

@ines I know that spaCy should automatically map en to en_core_web_sm (the default model). But it didn't in my case. I had to explicitly mention the model name en_core_web_sm.

@keithbriggs What did you do?

I had installed it with pip on an Ubuntu 16 system - apparently the version of pip is a bit old. So I uninstalled it, and did a new install with easy_install. This worked.

@keithbriggs I reinstalled Ubuntu today and, as it happens, ran into exactly the same problem. I resolved it using pip itself. Note that pip currently has a bug where it reports that it has not been updated to the latest version, but pip show pip will show that it has.
So,

  1. Update pip to the latest version, then run sudo pip install -U spacy
  2. Install the requirements. I downloaded requirements.txt from the spaCy GitHub repository.
  3. sudo pip install -r requirements.txt
  4. Only then download the model: python -m spacy download en

Everything is working now.

@mraduldubey That's interesting – if you have time, could you give more details on this? Like, the spaCy version you used, and if you got an error message when running python -m spacy download en?

@keithbriggs Glad to hear it worked – sorry you were having problems! Did the solution you outlined also solve #1095?

About the example: Was the unicode issue the main problem here? This definitely needs to be fixed regardless. To be honest, we're not perfectly happy with the landing page examples anyway, so we might want to re-write them entirely for spaCy v2.0.

I think so, but I was trying spacy on several different machines and did not keep records of what worked and what did not. Maybe next week I'll try again on a new clean standard-configuration Ubuntu machine.

Thanks and no worries! We'll also look into this when we update the cross-platform tests for v2. This issue was particularly interesting, because I don't remember ever seeing this exact error before. (So even if it turns out to be related to a specific pip version or bug, it'd definitely be worth mentioning that in the docs.)

@ines I am extremely sorry for not seeing this until now. I've been too busy. I don't remember what error I ran into while installing spaCy, but it may have been caused by pip or something similar. As for the version, it was definitely spacy==1.8.2.

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
