Hi Team Spacy,
I was wondering whether you're considering adding a comparison between spaCy and AllenNLP to the Documentation section?
I'm currently writing the new docs for v2.1 and I've actually been thinking that we should update the facts and figures / comparison section and include more relevant libraries.
To make the comparison a bit more useful, maybe we could also have a section along the lines of "When should I use what?" that describes different user scenarios and whether spaCy or any of the other NLP libraries would be a good fit. For example: "I want to try out and compare different neural network model architectures for NLP." → spaCy (no), AllenNLP (yes), and so on?
One thing that's important from our perspective is that the comparisons should be helpful and reasonable. There are many use cases, especially in research, where people want to use a different library and that's fine – in fact, we want users to be able to pick the right tool for the right purpose and be happy. (I've always disliked comparison tables that strategically pick arbitrary features that tool X has and others don't to make it look like X is the best.)
cc: @DeNeutoy, maybe you have some ideas, too?
As far as we are concerned, AllenNLP and spaCy don't really exist in the same space at all: AllenNLP's objective is explicitly _not_ to provide production-grade text processing. We're primarily a research library, and one that is almost exclusively focused on applying neural nets to different NLP tasks. For instance, we don't really have any speed-based benchmarks at all, and all of our ELMo models will be spectacularly slow on the CPU.
So, as @ines says, I'm not sure the comparison will be super helpful. We actually use spaCy for preprocessing in some of our online demos (e.g. many of our models assume tokenised input or require POS tags).
A note describing these differences could be helpful though, maybe? On the other hand, I'm not sure how many new users are trying to decide between AllenNLP and spaCy and failing.
For what it is worth, I am a new user and I was looking for this kind of information when I stumbled onto this thread. For people who are new to the subject and learn as they go along, any information that helps clarify things is very helpful.
@woctezuma Thanks for your feedback, that's really good to know!
I think one problem is that people often talk about "NLP libraries" or say things like "NLP libraries like spaCy or AllenNLP", which makes it sound like they're different options for the same thing, when in fact, they're very different libraries for very different things.
Here are some ideas for a "When should I use what?" comparison table: