Enable per-token classification in RoBERTa (also called "sentence tagging" or "sequence tagging" in the original BERT paper)
Currently, you can only classify whole sentences with RoBERTa. There are many use cases for classifying each input token instead.
Add a --classify-per-token (or similar) flag to the sentence_prediction task, and ensure that the classification head and the sentence_prediction loss can handle processing all tokens.
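For illustration, a minimal sketch of what a per-token classification head could look like, assuming RoBERTa-style features of shape [batch, seq_len, hidden_dim]. The class and argument names here are hypothetical, not the actual fairseq API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SequenceTaggingHead(nn.Module):
    """Hypothetical head: projects every token representation to tag logits,
    instead of classifying only the first (<s>) token as sentence_prediction does."""

    def __init__(self, hidden_dim: int, num_tags: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(hidden_dim, num_tags)

    def forward(self, features):
        # features: [batch, seq_len, hidden_dim] -> logits: [batch, seq_len, num_tags]
        return self.out_proj(self.dropout(features))

# Toy usage: 2 sequences, 5 tokens each, hidden size 8, 3 tag classes
head = SequenceTaggingHead(hidden_dim=8, num_tags=3)
features = torch.randn(2, 5, 8)
logits = head(features)
print(logits.shape)  # torch.Size([2, 5, 3])

# The matching criterion would compute a token-level cross-entropy by
# flattening the batch and time dimensions:
labels = torch.randint(0, 3, (2, 5))
loss = F.cross_entropy(logits.view(-1, 3), labels.view(-1))
```

The key difference from the sentence-level setup is that the loss is averaged over every (non-padding) token position rather than one pooled representation per sentence.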
-- edit: Settled on creating a separate sequence_tagging task and criterion.
As a workaround, one can use the translation task with a different architecture.
https://arxiv.org/abs/1810.04805 (search "tagging")
A sequence tagging task would be helpful. I'd prefer not to complicate the existing sentence_prediction.py task, but please feel free to copy/paste into sequence_tagging.py and adapt as needed.
Cool, so there should be a sequence_tagging task and sequence_tagging criterion?
@myleott I updated the PR accordingly: https://github.com/pytorch/fairseq/pull/1709