Question in title. Thanks!
Hi @hughbzhang,
Thank you for asking this question!
As of now, there is no support for regression in fastText. However, we might add it in the future. Could you please suggest some datasets on which we could evaluate fastText? Thanks!
Best,
Edouard
This guy wrote an example of using fastText to predict the star rating of Yelp reviews. His categories are 1-star, 2-star, 3-star, 4-star, and 5-star reviews. You can check it out here:
https://medium.com/@ageitgey/text-classification-is-your-new-secret-weapon-7ca4fad15788
It seems he achieved reasonable accuracy. However, I think this problem is better treated as a regression problem: if the ground truth of a review is 3 stars and you predict 4, that's a better prediction than, say, 5 or 1. When you treat it as a classification problem, all incorrect predictions are equally bad.
So there, that's a dataset you can use to test regression support in fastText. Looking forward to having this feature!
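To make the point about star ratings concrete, here is a minimal sketch in plain Python (no fastText involved). The function names and the toy ratings are made up for illustration. It compares a classification-style 0/1 error with a regression-style mean absolute error on the same predictions.

```python
def zero_one_error(y_true, y_pred):
    """Fraction of predictions that are not exactly right (classification view)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average distance in stars between truth and prediction (regression view)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: three reviews whose true rating is 3 stars.
truth = [3, 3, 3]
near_misses = [4, 4, 4]  # off by one star each
far_misses = [5, 1, 5]   # off by two stars each

# The 0/1 error cannot tell the two sets of predictions apart...
print(zero_one_error(truth, near_misses), zero_one_error(truth, far_misses))
# ...while the absolute error ranks the near misses as better.
print(mean_absolute_error(truth, near_misses), mean_absolute_error(truth, far_misses))
```

Both prediction sets get the same 0/1 error because every prediction is wrong, but the absolute error is twice as large for the far misses.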
I agree. Tagging problems are better implemented using regression because it allows you to estimate uncertainty.
I'd like a continuous label, or a probability distribution over the tags, just so I can tell when the model doesn't know.
This helps with debugging as well.
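As a possible workaround until regression exists, here is a rough sketch of turning a probability distribution over star labels into a continuous score plus an uncertainty measure. It assumes you already have label/probability pairs from a classifier (I believe the fastText Python bindings return these from `predict` when you ask for all labels with `k=-1`, but treat that as an assumption). It also assumes labels look like `__label__3`. The helper names are made up.

```python
import math


def parse_star(label, prefix="__label__"):
    """Turn a label such as '__label__4' into the integer 4 (assumed label format)."""
    return int(label[len(prefix):])


def expected_rating(labels, probs):
    """Probability-weighted average star rating, a continuous stand-in for regression."""
    total = sum(probs)
    return sum(parse_star(l) * p for l, p in zip(labels, probs)) / total


def normalized_entropy(probs):
    """Entropy scaled to [0, 1]: 0 means fully confident, 1 means uniform (no idea)."""
    n = len(probs)
    if n < 2:
        return 0.0
    total = sum(probs)
    ps = [p / total for p in probs if p > 0]
    return -sum(p * math.log(p) for p in ps) / math.log(n)


# Hypothetical classifier outputs for two reviews.
labels = ["__label__1", "__label__2", "__label__3", "__label__4", "__label__5"]
confident = [0.01, 0.01, 0.02, 0.06, 0.90]
unsure = [0.2, 0.2, 0.2, 0.2, 0.2]

print(expected_rating(labels, confident), normalized_entropy(confident))
print(expected_rating(labels, unsure), normalized_entropy(unsure))
```

A high normalized entropy flags inputs where the model "doesn't know", which is the debugging signal described above. This is not true regression, since the model is still trained with a classification loss.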