This is a new feature request.
In Caffe, SigmoidCrossEntropyLossLayer can specify a label to be ignored.
This feature is required to implement Fully Convolutional Networks for Semantic Segmentation, which states in section 4: "The training ignores pixels that are masked out (as ambiguous or difficult) in the ground truth."
The Google Groups thread "How to mask binary crossentropy loss?" mentions this feature. In that thread, the following was given as an answer:
import tensorflow as tf
from keras import backend as K

def binary_crossentropy(y_true, y_pred):
    # zero out entries whose label is the ignore value (-1)
    mask = tf.cast(tf.not_equal(y_true, -1), tf.float32)
    return K.mean(K.binary_crossentropy(y_pred * mask, y_true * mask), axis=-1)
However, this implementation only works with the TensorFlow backend.
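For completeness, a function like this can be passed to Keras directly as a custom loss when compiling; a minimal usage sketch (the model and optimizer are illustrative):

model.compile(optimizer='adam', loss=binary_crossentropy)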
I'm having the exact same issue, and opened a similar thread: #5911
But nobody has answered yet :-/
Hi @kivantium, have you figured out how to implement sigmoid cross entropy with an ignore label?
I have two questions about the code you posted.
1) How well does the code work?
2) This implementation ends with an average over the last dimension (axis=-1). Shouldn't it average only over the unignored labels, i.e., exclude the ignored ones?
Hi @wangg12,
I still don't have any good idea about implementation.
Answers to your questions about the code: 1) Sorry, I have not tested it. 2) I think so, but averaging only over the unignored labels is not supported (as far as I know), so we need some workaround.
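One possible workaround is to normalize by the count of unignored entries explicitly. A sketch using only Keras backend ops (untested; the function name is illustrative, and note that Keras 2's K.binary_crossentropy takes arguments in (target, output) order):

from keras import backend as K

def masked_binary_crossentropy(y_true, y_pred):
    mask = K.cast(K.not_equal(y_true, -1), K.floatx())
    # per-entry loss, zeroed at the ignored positions
    bce = K.binary_crossentropy(y_true * mask, y_pred * mask) * mask
    # average over the unignored entries only
    return K.sum(bce) / K.maximum(K.sum(mask), 1.0)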
OK, thanks @kivantium .
I am trying to implement the same paper now. As far as I understand (from this thread: https://stackoverflow.com/questions/37312421/tensorflow-whats-the-difference-between-sparse-softmax-cross-entropy-with-logi), you can use sparse_softmax_cross_entropy_with_logits with the labels you want to ignore set to -1 to do what you're looking for.
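If -1 labels are rejected by tf.nn.sparse_softmax_cross_entropy_with_logits on a given TF version, a related TF 1.x option is tf.losses.sparse_softmax_cross_entropy, which accepts per-pixel weights and, with its default reduction, averages over the nonzero weights only. A sketch (untested; ignore_label is an assumed integer variable):

# weight 0 at ignored positions so they drop out of the average
weights = tf.cast(tf.not_equal(labels, ignore_label), tf.float32)
# map ignored labels to a valid class id so the op never sees out-of-range labels
labels_safe = tf.where(tf.equal(labels, ignore_label), tf.zeros_like(labels), labels)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels_safe, logits=logits, weights=weights)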
Hi, here is my suggestion for dealing with ignored labels: use tf.losses.compute_weighted_loss. As an example, I use sigmoid_cross_entropy_with_logits to calculate the loss for foreground/background segmentation. unc is a tensor with the same shape as label; it is set to 0 at the positions of ignored labels and 1 at the positions of labels that should not be ignored. That way, the final loss is not calculated on the ignored labels. With this method you can ignore whatever you want.
# unc is 1 where the pixel's label counts and 0 where it is ignored
xentropy = tf.reduce_mean(
    tf.losses.compute_weighted_loss(
        weights=tf.cast(unc, tf.float32),
        losses=tf.nn.sigmoid_cross_entropy_with_logits(
            logits=logits,
            labels=tf.cast(label, tf.float32))),
    name='xentropy')
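For example, the unc mask can be derived from the label tensor itself (a sketch; the ignore value 255 is illustrative):

unc = tf.not_equal(label, 255)  # True where the pixel should enter the loss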
@TheRevanchist @JimmyCai91
Thank you! So we need to use backend functions to implement this...
Hi, this is my suggestion for dealing with an ignored label.
# flatten predictions and labels, then drop the positions carrying the ignored label
raw_prediction = tf.reshape(logits, [-1, FLAGS.NUM_OF_CLASSESS])
gt = tf.reshape(annotation, [-1])
# suppose 2 is the ignored label
indices = tf.squeeze(tf.where(tf.not_equal(gt, 2)), 1)
gt = tf.cast(tf.gather(gt, indices), tf.int32)
prediction = tf.gather(raw_prediction, indices)
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=prediction, labels=gt, name="entropy"))
If you find any problems, please tell me, but it ran OK when I tried it.
Closing as this is resolved; feel free to reopen if the problem persists.
@liuzhisheng1226 As far as I know, sigmoid_cross_entropy_with_logits should be called with valid probability distributions on labels. Wouldn't your approach mess up the probabilities?
Any status on this? Would love a cleaner solution similar to PyTorch's ignore_index parameter in CrossEntropyLoss.
I can only second that, @wt-huang. Being able to pass an integer indicating labels that should not enter the loss would be great! I know it is possible to define a custom loss function; however, dragging that around is rather cumbersome.
+1 for a cleaner solution similar to PyTorch's ignore_index and Caffe's ignore_label. I suggest reopening this issue since the accepted solution above is not clear. Could you please provide a full example using Keras and TensorFlow >= 2.x?
In Caffe, using the "SoftmaxWithLoss" layer, we can add loss_param { ignore_label: 255 } to tell Caffe to ignore this label:
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "prediction"
  bottom: "labels_with_255_as_ignore"
  loss_weight: 1
  loss_param: { ignore_label: 255 }
}
Looking around the web, there is a plethora of questions asking how to make Keras (or TensorFlow) handle an ignore label for semantic segmentation, but still no clean solution:
https://stackoverflow.com/questions/59972024/mask-the-loss-function-for-segmantic-segmentation-in-tf-keras
https://stackoverflow.com/questions/56328140/how-do-i-implement-a-masked-softmax-cross-entropy-loss-function-in-keras
https://stackoverflow.com/questions/54887933/how-to-to-drop-a-specific-labeled-pixels-in-semantic-segmentation
https://stackoverflow.com/questions/46097968/tensorflow-how-to-handle-void-labeled-data-in-image-segmentation
https://stackoverflow.com/questions/55529944/is-there-a-way-to-make-keras-ignore-a-label-when-computing-binary-crossentropy-l
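Until something like ignore_index exists in Keras itself, here is a minimal sketch of such a loss for TF 2.x (untested; the class name and the ignore value 255 are illustrative, not an official API):

import tensorflow as tf

class SparseCategoricalCrossentropyIgnore(tf.keras.losses.Loss):
    def __init__(self, ignore_label=255, from_logits=True, name='scce_ignore'):
        super().__init__(name=name)
        self.ignore_label = ignore_label
        self.from_logits = from_logits

    def call(self, y_true, y_pred):
        # flatten, then keep only the pixels whose label is not the ignore value,
        # so the default reduction averages over the valid pixels only
        y_true = tf.reshape(tf.cast(y_true, tf.int32), [-1])
        y_pred = tf.reshape(y_pred, [-1, tf.shape(y_pred)[-1]])
        valid = tf.not_equal(y_true, self.ignore_label)
        return tf.keras.losses.sparse_categorical_crossentropy(
            tf.boolean_mask(y_true, valid),
            tf.boolean_mask(y_pred, valid),
            from_logits=self.from_logits)

It could then be used as model.compile(optimizer='adam', loss=SparseCategoricalCrossentropyIgnore(ignore_label=255)).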
+1 to this. I really want to keep using Keras, but I find PyTorch way easier for ignoring background values in semantic segmentation tasks.
I cannot reopen this issue because a collaborator (@wt-huang) closed it (cf. "How to re-open an issue in GitHub?").
If you still have trouble, it might be better to open a new issue and link to this thread.