This is a new feature request.
In Caffe, SigmoidCrossEntropyLossLayer can specify a label to be ignored.
This feature is required to implement Fully Convolutional Networks for Semantic Segmentation, which states in section 4: "The training ignores pixels that are masked out (as ambiguous or difficult) in the ground truth."
The Google Groups thread "How to mask binary crossentropy loss?" mentions this feature. In that thread, the following was given as an answer:
import tensorflow as tf
from keras import backend as K

def binary_crossentropy(y_true, y_pred):
    # zero out entries whose label is the ignore value (-1)
    mask = tf.cast(tf.not_equal(y_true, -1), tf.float32)
    return K.mean(K.binary_crossentropy(y_pred * mask, y_true * mask), axis=-1)
However, this implementation only works with the TensorFlow backend.
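For completeness, a function like this can be passed to Keras directly as a custom loss when compiling; a minimal usage sketch (the model and optimizer are illustrative):

model.compile(optimizer='adam', loss=binary_crossentropy)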
I'm having the exact same issue, and opened a similar thread: #5911
But nobody has answered yet :-/
Hi @kivantium, have you figured out how to implement sigmoid cross entropy with an ignore label?
I have two questions about the code you posted.
1) How well does the code work?
2) This implementation ends with an average over the last dimension (axis=-1). Shouldn't it average only over the unignored labels, i.e., exclude the ignored ones?
Hi @wangg12,
I still don't have any good idea about implementation.
Answers to your questions about the code: 1) Sorry, I have not tested it. 2) I think so, but averaging only over the unignored labels is not supported (as far as I know), so we need some workaround.
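One possible workaround is to normalize by the count of unignored entries explicitly. A sketch using only Keras backend ops (untested; the function name is illustrative, and note that Keras 2's K.binary_crossentropy takes arguments in (target, output) order):

from keras import backend as K

def masked_binary_crossentropy(y_true, y_pred):
    mask = K.cast(K.not_equal(y_true, -1), K.floatx())
    # per-entry loss, zeroed at the ignored positions
    bce = K.binary_crossentropy(y_true * mask, y_pred * mask) * mask
    # average over the unignored entries only
    return K.sum(bce) / K.maximum(K.sum(mask), 1.0)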
OK, thanks @kivantium .
I am trying to implement the same paper now. As far as I understand (from this thread: https://stackoverflow.com/questions/37312421/tensorflow-whats-the-difference-between-sparse-softmax-cross-entropy-with-logi), you can use sparse_softmax_cross_entropy_with_logits with the labels you want to ignore set to -1 to do what you're looking for.
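If -1 labels are rejected by tf.nn.sparse_softmax_cross_entropy_with_logits on a given TF version, a related TF 1.x option is tf.losses.sparse_softmax_cross_entropy, which accepts per-pixel weights and, with its default reduction, averages over the nonzero weights only. A sketch (untested; ignore_label is an assumed integer variable):

# weight 0 at ignored positions so they drop out of the average
weights = tf.cast(tf.not_equal(labels, ignore_label), tf.float32)
# map ignored labels to a valid class id so the op never sees out-of-range labels
labels_safe = tf.where(tf.equal(labels, ignore_label), tf.zeros_like(labels), labels)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels_safe, logits=logits, weights=weights)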
Hi, here is my suggestion for dealing with ignored labels: use tf.losses.compute_weighted_loss. As an example, I use sigmoid_cross_entropy_with_logits to calculate the loss for foreground/background segmentation. unc is a tensor with the same shape as label; it is set to 0 at the positions of ignored labels and 1 at the positions of labels that should not be ignored. That way, the final loss is not calculated on the ignored labels. With this method you can ignore whatever you want.
# unc is 1 where the pixel's label counts and 0 where it is ignored
xentropy = tf.reduce_mean(
    tf.losses.compute_weighted_loss(
        weights=tf.cast(unc, tf.float32),
        losses=tf.nn.sigmoid_cross_entropy_with_logits(
            logits=logits,
            labels=tf.cast(label, tf.float32))),
    name='xentropy')
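For example, the unc mask can be derived from the label tensor itself (a sketch; the ignore value 255 is illustrative):

unc = tf.not_equal(label, 255)  # True where the pixel should enter the loss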
@TheRevanchist @JimmyCai91
Thank you! So we need to use backend functions to implement this...
Hi, this is my suggestion for dealing with an ignored label.
# flatten predictions and labels, then drop the positions carrying the ignored label
raw_prediction = tf.reshape(logits, [-1, FLAGS.NUM_OF_CLASSESS])
gt = tf.reshape(annotation, [-1])
# suppose 2 is the ignored label
indices = tf.squeeze(tf.where(tf.not_equal(gt, 2)), 1)
gt = tf.cast(tf.gather(gt, indices), tf.int32)
prediction = tf.gather(raw_prediction, indices)
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=prediction, labels=gt, name="entropy"))
If you find any problems, please tell me, but it ran OK when I tried it.
Closing as this is resolved; feel free to reopen if the problem persists.
@liuzhisheng1226 As far as I know, sigmoid_cross_entropy_with_logits should be called with valid probability distributions on labels. Wouldn't your approach mess up the probabilities?
Any status on this? Would love a cleaner solution similar to PyTorch's ignore_index parameter in CrossEntropyLoss.
I can only second that, @wt-huang. Being able to pass an integer indicating labels that should not enter the loss would be great! I know it is possible to define a custom loss function; however, dragging that around is rather cumbersome.
+1 for a cleaner solution similar to PyTorch's ignore_index and Caffe's ignore_label. I suggest reopening this issue since the accepted solution above is not clear. Could you please provide a full example using Keras and TensorFlow >= 2.x?
In Caffe, using the "SoftmaxWithLoss" layer, we can add loss_param { ignore_label: 255 } to tell Caffe to ignore this label:
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "prediction"
  bottom: "labels_with_255_as_ignore"
  loss_weight: 1
  loss_param: { ignore_label: 255 }
}
Looking around the web, there is a plethora of questions asking how to make Keras (or TensorFlow) handle an ignore label for semantic segmentation, but still no clean solution:
https://stackoverflow.com/questions/59972024/mask-the-loss-function-for-segmantic-segmentation-in-tf-keras
https://stackoverflow.com/questions/56328140/how-do-i-implement-a-masked-softmax-cross-entropy-loss-function-in-keras
https://stackoverflow.com/questions/54887933/how-to-to-drop-a-specific-labeled-pixels-in-semantic-segmentation
https://stackoverflow.com/questions/46097968/tensorflow-how-to-handle-void-labeled-data-in-image-segmentation
https://stackoverflow.com/questions/55529944/is-there-a-way-to-make-keras-ignore-a-label-when-computing-binary-crossentropy-l
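Until something like ignore_index exists in Keras itself, here is a minimal sketch of such a loss for TF 2.x (untested; the class name and the ignore value 255 are illustrative, not an official API):

import tensorflow as tf

class SparseCategoricalCrossentropyIgnore(tf.keras.losses.Loss):
    def __init__(self, ignore_label=255, from_logits=True, name='scce_ignore'):
        super().__init__(name=name)
        self.ignore_label = ignore_label
        self.from_logits = from_logits

    def call(self, y_true, y_pred):
        # flatten, then keep only the pixels whose label is not the ignore value,
        # so the default reduction averages over the valid pixels only
        y_true = tf.reshape(tf.cast(y_true, tf.int32), [-1])
        y_pred = tf.reshape(y_pred, [-1, tf.shape(y_pred)[-1]])
        valid = tf.not_equal(y_true, self.ignore_label)
        return tf.keras.losses.sparse_categorical_crossentropy(
            tf.boolean_mask(y_true, valid),
            tf.boolean_mask(y_pred, valid),
            from_logits=self.from_logits)

It could then be used as model.compile(optimizer='adam', loss=SparseCategoricalCrossentropyIgnore(ignore_label=255)).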
+1 to this. I really want to keep using Keras, but I find PyTorch way easier for ignoring background values in semantic segmentation tasks.
I cannot reopen this issue because a collaborator (@wt-huang) closed it (cf. "How to re-open an issue in GitHub?").
If you still have trouble, it might be better to open a new issue and link to this thread.