Currently, flair supports feature-based transfer learning from BERT. Are there any plans to add a fine-tuning-based module in the near future? That would be a great addition!
Allow me to highlight the recently released paper *Parameter-Efficient Transfer Learning for NLP*. It seems relevant to this topic and worth further discussion.
@mauryaland interesting paper! Thanks for sharing!
You could override `BertEmbeddings`:

```python
class TrainableBERT(BertEmbeddings):
    pass
```

set `self.static_embeddings = False` in the constructor, and override the `_add_embeddings_internal` method to do the same thing as the current method, but without `model.eval()` and the `with torch.no_grad()` block.
AFAICT this should allow fine-tuning of the full input embeddings. You would still need to set the model back to eval mode (and disable gradient computation) once training is finished and you are making predictions, however.
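To make the difference concrete, here is a minimal sketch in plain PyTorch (not flair itself; `encoder` is just a hypothetical stand-in for the BERT model) showing why removing `model.eval()` and the `torch.no_grad()` block is what enables fine-tuning: outputs produced under `no_grad()` are detached from the graph, while outputs produced in train mode carry gradients back into the encoder's weights.

```python
import torch
import torch.nn as nn

# Stand-in for the wrapped BERT model (assumption for illustration only).
encoder = nn.Linear(4, 4)
x = torch.randn(1, 4)

# Feature extraction (flair's current behaviour): frozen encoder,
# no gradients flow, so the embeddings are static features.
encoder.eval()
with torch.no_grad():
    frozen_out = encoder(x)

# Fine-tuning: keep the encoder in train mode and skip no_grad(),
# so gradients can propagate into the encoder's parameters.
encoder.train()
tuned_out = encoder(x)
```

Here `frozen_out.requires_grad` is `False` while `tuned_out.requires_grad` is `True`, which is exactly the switch the overridden `_add_embeddings_internal` would flip.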
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Fine-tuning has been added to the master branch (#1492) and will be part of the next Flair release.