Fasttext: Any plan to support different weight for each class in loss function?

Created on 17 Jan 2017 · 18 Comments · Source: facebookresearch/fastText

Looking at the current code, it seems to me that the loss function is evaluated with the same weight for each class, which is fine for balanced data. For highly imbalanced data, is there any plan to support a different weight for each class in the loss function? I am thinking of something like this on the command line:

fasttext -input XXX -output XXX -weight_class1 10 -weight_class2 1 -weight_class3 3

or simply

fasttext -weight_balanced

if the weight is inversely proportional to the number of instances in that class?
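
To make the "balanced" idea concrete, here is a minimal Python sketch of weights that are inversely proportional to class frequency. It is only an illustration of the heuristic: the function name `balanced_class_weights` and the sample labels are hypothetical, and fastText does not accept these weights as input. The formula `n_samples / (n_classes * count)` is an assumption borrowed from the common "balanced" convention, chosen so that a perfectly balanced dataset gets weight 1.0 for every class.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Hypothetical helper: weight = n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Made-up label distribution: 6 / 3 / 1 examples per class.
labels = ["__label__a"] * 6 + ["__label__b"] * 3 + ["__label__c"] * 1
weights = balanced_class_weights(labels)

for label in sorted(weights):
    print(label, round(weights[label], 3))
```

With this scheme the rarest class gets the largest weight, and the product of a class's weight and its count is the same for every class.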


kuangchen

👍17

Most helpful comment

@cpuhrsch, Is there any update on this?

narendra36 on 9 Dec 2019

👍2

All 18 comments

bkj on 28 Jan 2017

blackyang on 9 Jun 2017

Hello @kuangchen,

This is part of future work. For now you can balance classes by subsampling or upsampling (e.g. duplicating) datapoints. Indeed, a simple heuristic that takes the label count into account, as you mentioned, could already be helpful. Stay tuned for updates, and feel free to reopen this issue if you don't see such changes released.

Thanks,
Christian
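
The upsampling workaround suggested above could be sketched roughly as follows. This is a minimal example under stated assumptions, not part of fastText: the helper `upsample` is hypothetical, each training line is assumed to carry exactly one label as its first token (for example with the `__label__` prefix), and minority classes are duplicated by sampling with replacement until they match the largest class.

```python
import random
from collections import Counter, defaultdict


def upsample(lines, seed=0):
    """Duplicate minority-class lines until every class matches the largest.

    Assumes one label per line, given as the first whitespace-separated token.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        label = line.split(maxsplit=1)[0]
        by_label[label].append(line)

    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the largest class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Made-up training lines: three examples of one class, one of the other.
train = [
    "__label__pos great product",
    "__label__pos works well",
    "__label__pos would buy again",
    "__label__neg broke after a day",
]
balanced = upsample(train)
label_counts = Counter(line.split(maxsplit=1)[0] for line in balanced)
print(label_counts)
```

The balanced lines can then be written to a new training file. Note that duplicating examples makes the model see the same minority datapoints several times per epoch, so it is worth checking for overfitting on a held-out set that keeps the original class distribution.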