Flair: What is the default Text Classification model?

Created on 15 Apr 2020 · 10 Comments · Source: flairNLP/flair

What is the default Text Classification model? Is there a paper to refer to?

question wontfix

All 10 comments

There is no default model. You have to choose some DocumentEmbeddings, e.g. the mean of the token embeddings. The last layer is a linear layer with one output per class.
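
To make the "mean of token embeddings plus a linear layer" idea concrete, here is a minimal plain-Python sketch. It does not use Flair's API; the helper names (`mean_pool`, `linear_layer`, `softmax`) and all vectors and weights are made up for illustration, and in a real classifier the weights would be learned during training.

```python
import math

def mean_pool(token_vectors):
    """Average equal-length token vectors into a single document vector."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]

def linear_layer(doc_vector, weights, bias):
    """Compute one score per class; `weights` holds one row per class."""
    return [
        sum(w * x for w, x in zip(row, doc_vector)) + b
        for row, b in zip(weights, bias)
    ]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 3-dimensional embeddings for a two-token document.
tokens = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
doc_vector = mean_pool(tokens)

# Hypothetical (untrained) weights and biases for two classes.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]

scores = linear_layer(doc_vector, weights, bias)
probs = softmax(scores)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Swapping `mean_pool` for a different way of producing the document vector changes the model, while the final linear layer stays the same.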

You can use Flair to build several kinds of models. If you use DocumentPoolEmbeddings together with classic word embeddings, you can recreate the classic FastText approach, which we often use as a baseline. With DocumentRNNEmbeddings you get an RNN over word embeddings, which was a strong approach until recently.

Now, we are adding TransformerDocumentEmbeddings to the master branch, which allows the BERT approach of fine-tuning a transformer. (It might be renamed to DocumentTransformerEmbeddings for consistency with earlier classes.) So depending on which approach you use, different papers could be cited. But transformers give the best performance by far.

Thanks for the explanation and for sharing news on the latest release! It'd be great to include more examples in the tutorial about recreating certain baseline models, for instance logistic regression, SVM, etc. Also, from the tutorial I can't tell whether setting the RNN layers and hidden states for the document RNN embeddings is actually the same as implementing an RNN with the same structure on the same word-/sentence-level embeddings.

Agree, we don't currently use "classic" methods like SVM, but I hope we can add some examples in the future on how to fit an SVM on top of sentence embeddings.

The document RNN is an RNN that takes as input word embeddings - the final hidden state of the RNN after consuming all words is then used for classification. This RNN needs to be trained for a specific task in order to make sense (happens automatically in Flair if you use this class in the ModelTrainer). So if I understand your question correctly it is the same.
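
As a rough illustration of "the final hidden state after consuming all words", here is a minimal plain-Python sketch of a simple (Elman-style) RNN. This is not Flair's implementation; the function name `rnn_final_state` and all weights and vectors are hypothetical, and the weights are fixed here only to keep the example short. In practice they are learned for the task.

```python
import math

def rnn_final_state(token_vectors, w_in, w_rec, bias):
    """Run a simple tanh RNN over the tokens and return the last hidden state."""
    hidden = [0.0] * len(bias)  # start from an all-zero hidden state
    for x in token_vectors:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))         # input contribution
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))  # recurrent contribution
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden

# Hypothetical 2-dimensional word embeddings for a two-word document.
tokens = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical (untrained) parameters for a hidden state of size 2.
w_in = [[0.5, -0.3], [0.1, 0.8]]
w_rec = [[0.2, 0.0], [0.0, -0.4]]
bias = [0.0, 0.0]

final_state = rnn_final_state(tokens, w_in, w_rec, bias)
reversed_state = rnn_final_state(list(reversed(tokens)), w_in, w_rec, bias)
```

The final state serves as the document vector that is passed to the classification layer. Unlike a mean over token embeddings, it depends on word order, which is why `final_state` and `reversed_state` differ.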

Yes, I got that part by reading the code. It seems that the tutorial on text classification could be more explicit :)

Also, may I check if flair will be supporting sequence prediction tasks?

Yes, I think we will update the tutorial once we're done with testing the new BERT additions.

We haven't yet thought of sequence prediction from our side, though maybe somebody in the community has. Are there specific sequence prediction tasks you are interested in?

Yes! I'm working on projects on trajectory mining, for instance predicting the next tourist city based on travel history and predicting the next job title based on career history, among others. The former is more of a multimodal problem, but part of the data is social media posts.

So is that something of interest? @alanakbik

Yes, sounds interesting :) I'm not sure if we will develop support for this from our side any time soon, but we would always welcome a contribution!

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
