Is there any way that I can feed a label file to the training mechanism, in parallel with the source and target files?
Could you be more specific, please?
@Vsanku01 Thank you for the interest.
Basically I want to feed a class label for the source text. I am wondering whether I can feed a class label while feeding the source and target text (similar to a text generation or translation task) at training time.
I think the easiest way would be to build this into your vocabulary. For example, pick a unique token per class (e.g. __class_label_0__, __class_label_1__, ..., __class_label_n__) and prepend these special tokens onto the beginning (or end) of your sequences before calling fairseq-preprocess.
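A minimal sketch of that preprocessing step, assuming one class id per source sentence (the helper name and example data here are hypothetical, not part of fairseq):

```python
def prepend_label(line: str, label: int) -> str:
    """Prepend a special class-label token to one source sentence.

    The token must remain a single whitespace-delimited unit so that
    fairseq-preprocess treats it as one vocabulary entry.
    """
    return f"__class_label_{label}__ {line.strip()}"


# Hypothetical example data: tag each source sentence with its class id
# before writing the file that fairseq-preprocess will consume.
sentences = ["the cat sat on the mat", "hello world"]
labels = [0, 1]
tagged = [prepend_label(s, l) for s, l in zip(sentences, labels)]
# tagged[0] -> "__class_label_0__ the cat sat on the mat"
```

You would write the tagged lines back out as your new source file, then run fairseq-preprocess on it as usual; the label tokens end up in the learned vocabulary like any other word.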
Thank you very much.
@lematt1991
How can I create a unique token as you mentioned above?
What if I append a token like "__class_label_0__" to the text and then do the tokenization?
Yep, that's exactly what I meant.
Thanks a lot.