Hi there, how do I choose a loss function for a multi-label problem?
It's different from multi-class output: a multi-label output is a 0/1 vector that can contain multiple ones, whereas a multi-class output is a single one-hot vector.
Thanks
The categorical cross-entropy thing is for multi-class problems, I suppose?
Yes. What you want is binary_crossentropy
@NasenSpray binary_crossentropy is for multi-class, but not multi-label, right?
categorical_crossentropy: 1-of-N (one-hot)
binary_crossentropy: 1-or-more 0/1 labels
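In Keras terms, the multi-label case means a sigmoid output layer plus binary_crossentropy. A minimal sketch (the input size and layer widths here are made up for illustration):

```python
from keras.models import Sequential
from keras.layers import Dense

# Minimal multi-label sketch: each sample has 10 independent 0/1 labels,
# so the last layer is sigmoid (one probability per label, not a softmax
# over them) and the loss is binary_crossentropy.
model = Sequential([
    Dense(64, activation='relu', input_shape=(100,)),
    Dense(10, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
```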
MSE/MAE also work, though binary cross-entropy would generally be preferred.
@kingfengji I'm doing multi-label too. Could you share your code for how to do it, or give one simple example of how to implement multi-label classification? Thanks
@alyato this might help you: multi label image classification
If using binary_crossentropy as the loss function, does it mean we are minimizing the average of the cross-entropies over all classes?
I believe so.
@keunwoochoi Could you explain why binary cross-entropy is preferred for multi-label classification? I thought binary cross-entropy was only for binary classification, where the y label is only 0 or 1. Now that the y label is in the format [1,0,1,0,1,...], do you know how the loss is calculated with binary cross-entropy?
Thanks. My last layer is a softmax layer. When I use 'binary_crossentropy' as the loss I get 99% accuracy, while with other loss functions I get only 10%.
I want to know how the accuracy is calculated.
I get high accuracy, but when I look at the predicted labels, they are all zeros.
@1064950364 Yes, that's how the accuracy is defined, and that's why accuracy doesn't mean much in many multi-label problems. In your true labels there are so many zeros, right?
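A quick sketch of why, with made-up numbers, assuming accuracy is computed element-wise over the label matrix:

```python
import numpy as np

# Made-up sparse multi-label targets: 100 samples x 20 labels, ~5% ones.
rng = np.random.RandomState(0)
y_true = (rng.rand(100, 20) < 0.05).astype(float)

# A useless model that predicts 0 for every label...
y_pred = np.zeros_like(y_true)

# ...still gets very high element-wise ("binary") accuracy,
# simply because most true entries are 0.
accuracy = (y_true == (y_pred > 0.5).astype(float)).mean()
print(accuracy)  # around 0.95
```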
@lipeipei31 More precisely, cross-entropy is preferred over MAE/MSE. With a sigmoid output its gradient with respect to the logits is bounded, and its loss computation (which is in proportion to the gradient applied) is more plausible. In that case it computes the cross-entropy over each output and then averages them.
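Roughly, that averaging looks like this in numpy (a sketch, not Keras' exact implementation, which handles clipping and backend details differently):

```python
import numpy as np

def multilabel_bce(y_true, y_pred, eps=1e-7):
    # Cross-entropy for each 0/1 label independently...
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    per_label = -(y_true * np.log(y_pred)
                  + (1 - y_true) * np.log(1 - y_pred))
    # ...then the average over all outputs (and samples).
    return per_label.mean()

y_true = np.array([1, 0, 1, 0, 1], dtype=float)
y_pred = np.array([0.9, 0.2, 0.7, 0.1, 0.6])
print(multilabel_bce(y_true, y_pred))
```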
@1064950364 Did you compare the two different activations in the last layer, softmax vs. sigmoid?
I need to classify attributes of a face, like eye, hair, and skin colour, facial hair, lighting, and so on. Each has a few sub-categories. So should I apply a sigmoid directly over all the labels, or apply a separate softmax to each subcategory (hair colour, eye colour, etc.)?
Which one will be better in this case?
Or should I combine both, since some subclasses are binary?
@sarthakahuja11, it sounds like you have a multi-output problem where each output is a binary or multi-class classification. I think you should use different loss functions for different outputs.
@lipeipei31 You have identified the problem correctly. So I should choose binary cross-entropy for the binary classifications and categorical cross-entropy for the multi-class classifications, and then combine them in the same model?
@sarthakahuja11 Yes, that's right. And you can easily do that with the Keras functional API:
https://keras.io/getting-started/functional-api-guide/#multi-input-and-multi-output-models. The loss functions can be given as a list, or as a dictionary if you name the outputs.
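Something along these lines (a minimal sketch for the face-attribute case above; all layer sizes and output names are invented for illustration):

```python
from keras.layers import Input, Dense
from keras.models import Model

# Shared trunk feeding two heads.
inputs = Input(shape=(128,))
x = Dense(64, activation='relu')(inputs)

# Multi-class head (e.g. hair colour): softmax + categorical_crossentropy.
hair = Dense(5, activation='softmax', name='hair_colour')(x)
# Binary head (e.g. facial hair yes/no): sigmoid + binary_crossentropy.
beard = Dense(1, activation='sigmoid', name='facial_hair')(x)

model = Model(inputs=inputs, outputs=[hair, beard])
model.compile(
    optimizer='adam',
    loss={'hair_colour': 'categorical_crossentropy',
          'facial_hair': 'binary_crossentropy'},
)
```

Keras sums the per-output losses (optionally weighted via the loss_weights argument) into the single objective it trains on.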
Thanks! @lipeipei31
hi,
if binary cross-entropy works in Keras for multi-label problems, will categorical_crossentropy work for multiple one-hot encoded vectors as well?
My example output is:
[
[0,0,1,0],
[0,0,0,1],
[1,0,0,0]
]
So I have three one-hot encoded vectors. For a single one, the loss function to choose would be categorical cross-entropy. What will Keras do in a case like this?
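My understanding (just a numpy sketch of what I'd expect, not the actual Keras source) is that it computes the cross-entropy along the last axis, i.e. per one-hot row, and then averages:

```python
import numpy as np

# One sample whose target is three one-hot rows, shape (3, 4).
y_true = np.array([[0, 0, 1, 0],
                   [0, 0, 0, 1],
                   [1, 0, 0, 0]], dtype=float)
# Predictions from a softmax applied along the last axis.
y_pred = np.array([[0.10, 0.10, 0.70, 0.10],
                   [0.20, 0.10, 0.10, 0.60],
                   [0.80, 0.10, 0.05, 0.05]])

# Categorical cross-entropy per one-hot row, then the mean over rows.
per_row = -(y_true * np.log(y_pred)).sum(axis=-1)
print(per_row.mean())
```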