I'm considering having several apps talk to the same spaCy instance on one machine. I assume HTTP will be too slow, so I was wondering whether anyone else has considered loading spaCy only once and making it available to other apps through ZeroMQ on the local machine?
I've been thinking about what might be the best solution to this problem too.
I'm not an expert at this, but it seems to me that REST+WSGI+JSON is quite a roundabout way to serve a remote procedure call. So I've wondered whether there's a better solution.
Btw, if you want an easy way to get slightly better performance, you might try using uWSGI and communicating over a socket with nginx, using the uwsgi binary protocol. You're still adding a lot of overhead, but you might find it doesn't matter.
Big props to spaCy that we now get to worry about HTTP rather than... NLP :)
Wouldn't uWSGI spawn multiple spaCy instances? That's what I understood of it... unless the model can somehow be loaded as a shared object. I'll investigate that avenue.
The idea was to have a microservice, probably with only one worker, so only one copy of the model gets loaded. I'm sure there will be better ways to do this, but it would let you stay within the REST/JSON model.
How about gRPC?
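A minimal sketch of that single-worker idea, using only the standard library so it runs without spaCy installed. `load_pipeline` is a hypothetical stand-in for the expensive model load, `call_app` is a hypothetical helper for trying the app in-process, and the `tokens` response key is an assumption for illustration. The point is only that the model is loaded once at import time and every request reuses it:

```python
import json
from io import BytesIO
from wsgiref.util import setup_testing_defaults


def load_pipeline():
    """Hypothetical stand-in for the expensive model load."""
    # A whitespace tokenizer, used only so the sketch runs without spaCy.
    return lambda text: text.split()


# Loaded once at import time: one worker process holds exactly one copy.
NLP = load_pipeline()

JSON_HEADERS = [("Content-Type", "application/json")]


def application(environ, start_response):
    """WSGI entry point: read a JSON body, run the shared pipeline, reply in JSON."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    body = environ["wsgi.input"].read(length)
    try:
        text = json.loads(body or b"{}")["document_text"]
    except (ValueError, KeyError, TypeError):
        start_response("400 Bad Request", JSON_HEADERS)
        return [b'{"error": "expected JSON with a document_text field"}']
    out = json.dumps({"tokens": NLP(text)}).encode("utf-8")
    start_response("200 OK", JSON_HEADERS)
    return [out]


def call_app(payload):
    """Call the app in-process (no server needed) and return (status, parsed JSON)."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {
        "REQUEST_METHOD": "POST",
        "PATH_INFO": "/nlp",
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": BytesIO(raw),
    }
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    chunks = application(environ, start_response)
    return seen["status"], json.loads(b"".join(chunks))
```

Any WSGI server could host `application`; the worker count has to be set to one in that server's configuration, otherwise each worker process imports the module and loads its own copy of the model.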
http://www.grpc.io/docs/quickstart/python.html
More thoughts later... another issue is that ideally we would be able to work with a spaCy object within the user process.
So I assume we would use to_bytes and from_bytes, but doesn't from_bytes require spaCy to load a vocab, meaning the client would incur the loading hit anyway?
Or what kind of JSON would we send?
What about JSON like:
request:
POST /nlp
json={"document_text": "This. Is. Text.", "attributes": "text,lemma_,pos_,tag_,etc.."}
response:
json={"sentences": [ [{"text": "This", "lemma_": "this"}], [...], [...]]}
Can also have a bulk endpoint :)
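To make the proposed JSON shape concrete, here is a rough sketch of the server-side handlers. It is not spaCy code: `Token` and `toy_pipeline` are hypothetical stand-ins so the example runs on its own (the toy splits on periods and drops them, which a real tokenizer would not do), and the `documents` key used by the bulk handler is an assumption, since the thread doesn't specify a bulk format:

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical stand-in for a token exposing a few attributes."""
    text: str
    lemma_: str


def toy_pipeline(text):
    """Hypothetical stand-in for the NLP pipeline: a list of sentences, each a list of Tokens."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [[Token(w, w.lower()) for w in s.split()] for s in sentences]


def handle_nlp(payload, pipeline=toy_pipeline):
    """Handle POST /nlp: return only the requested attributes, per sentence and token."""
    attrs = [a.strip() for a in payload["attributes"].split(",") if a.strip()]
    sentences = pipeline(payload["document_text"])
    return {
        "sentences": [
            # Attributes the token doesn't have come back as None.
            [{a: getattr(tok, a, None) for a in attrs} for tok in sent]
            for sent in sentences
        ]
    }


def handle_bulk(payload, pipeline=toy_pipeline):
    """Handle a bulk request: the same attributes applied to many documents."""
    return {
        "documents": [
            handle_nlp(
                {"document_text": text, "attributes": payload["attributes"]},
                pipeline,
            )
            for text in payload["documents"]
        ]
    }


if __name__ == "__main__":
    request = {"document_text": "This. Is. Text.", "attributes": "text,lemma_"}
    print(handle_nlp(request))
```

Sending only the requested attributes keeps the payload small and means the client never has to deserialize a spaCy object, so it doesn't need the vocab loaded.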
Here it is: https://github.com/kootenpv/spacy_api
It's okay if I close this, right? I also added the spacy_api library to the third-party libraries in our showcase, btw: https://spacy.io/docs/usage/showcase#libraries