Keras: Categorical hinge loss

Created on 26 May 2016 · 8 Comments · Source: keras-team/keras

At the moment, it seems that only the binary version of hinge loss is implemented as an objective function (equivalent to binary_hinge_loss in Lasagne). Shouldn't the name of this objective better reflect the fact that it is the binary version, i.e. be named binary_hinge instead of just hinge?

Also, wouldn't it be good to mention in the docs that the binary hinge loss function should be used with labels being {-1,+1} instead of {0,1}?
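To illustrate why the label encoding matters, here is a minimal pure-Python sketch of the per-element hinge term max(0, 1 - y_true * y_pred). The function name `hinge_term` is made up for this example and is not part of Keras.

```python
def hinge_term(y_true, y_pred):
    """Per-element hinge term: max(0, 1 - y_true * y_pred)."""
    return max(0.0, 1.0 - y_true * y_pred)


# With labels in {-1, +1}, a confident correct score on either side costs nothing.
print(hinge_term(+1, 2.0))   # 0.0
print(hinge_term(-1, -2.0))  # 0.0

# With labels in {0, 1}, the product is always zero for the negative class,
# so that class costs exactly 1 whatever the prediction is.
print(hinge_term(0, -2.0))   # 1.0
print(hinge_term(0, 5.0))    # 1.0
```

With {0,1} labels the negative class therefore contributes a constant, which carries no training signal for those outputs.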

Finally, does anyone have an idea how I can implement the categorical version of hinge loss (equivalent to multiclass_hinge_loss in lasagne)?

stale

All 8 comments

I think the hinge loss in Keras is multiclass. The label and the prediction both take the form of a one-hot vector.

> Also, wouldn't it be good to mention in the docs that the binary hinge loss function should be used with labels being {-1,+1} instead of {0,1}?

Yes, I agree with you. The documentation should specify that.

To build a multiclass hinge loss for labels in {0, 1}, just add the following to objectives.py:

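from keras import backend as K

def hinge_onehot(y_true, y_pred):
    # Map labels and predictions from [0, 1] to [-1, +1].
    y_true = y_true*2 - 1
    y_pred = y_pred*2 - 1

    # Element-wise hinge, averaged over the class axis.
    return K.mean(K.maximum(1. - y_true * y_pred, 0.), axis=-1)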
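
As a rough check of what this computes, the same arithmetic can be written in pure Python for a single sample. The name `hinge_onehot_sample` is hypothetical, and the sketch assumes predictions lie in [0, 1] (for example sigmoid or softmax outputs), which the thread does not state explicitly.

```python
def hinge_onehot_sample(y_true, y_pred):
    """Pure-Python mirror of hinge_onehot for one sample (illustrative only)."""
    # Rescale labels and predictions from [0, 1] to [-1, +1].
    t = [2.0 * v - 1.0 for v in y_true]
    p = [2.0 * v - 1.0 for v in y_pred]
    # Per-class hinge terms, averaged over the class axis.
    terms = [max(0.0, 1.0 - ti * pi) for ti, pi in zip(t, p)]
    return sum(terms) / len(terms)


# A perfect prediction incurs no loss.
print(hinge_onehot_sample([0, 1, 0], [0.0, 1.0, 0.0]))
# Predicting the wrong class is penalised on both the missed and the false class.
print(hinge_onehot_sample([0, 1, 0], [1.0, 0.0, 0.0]))
```

Note that every class contributes its own term here, so this behaves like an independent binary hinge per class rather than a single margin between the correct class and its strongest competitor.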

If the hinge loss were multiclass (= multiple classes, single correct label) in Keras, then it should implement:

[image: equation of the multiclass hinge loss]

which it seems to me it does not.

It seems to implement:

[image: equation of the element-wise (binary) hinge loss]

which is multilabel or, in case of only two classes, binary.
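
The two equations in this comment were posted as images and are not available here. What follows is a hedged reconstruction in common textbook notation, not the original poster's formulas: the symbols $s_j$ (score of class $j$), $y$ (index of the correct class), $t_j \in \{-1, +1\}$ (per-class target) and $C$ (number of classes) are assumptions made for this sketch. The first line is the usual multiclass form, the second the element-wise form averaged over classes.

```latex
L_{\text{multiclass}} = \max\Bigl(0,\; 1 + \max_{j \neq y} s_j - s_y\Bigr)

L_{\text{element-wise}} = \frac{1}{C} \sum_{j=1}^{C} \max\bigl(0,\; 1 - t_j\, s_j\bigr)
```

The first form has one margin per sample, between the correct class and its strongest competitor; the second has one margin per class, which is why it reads as multilabel, or binary when there are only two classes.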

@fchollet Can you please confirm which of these functions represents the 'hinge' loss in Keras?

I just created a pull request for adding a multiclass hinge loss function

https://github.com/fchollet/keras/pull/6687

@meberstein Can you provide a source reference for this version of categorical hinge loss?

@atomextranova - I just wrote it based on how I think it should work. I added a unit test to show how it works in my PR. Is something wrong?

@atomextranova - Oh wait, I see what is happening. Let's break this down.

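from keras import backend as K

def categorical_hinge(y_true, y_pred):
    # Score of the correct class (y_true is one-hot).
    pos = K.sum(y_true * y_pred, axis=-1)
    # Largest score among the other classes (correct class masked to 0).
    neg = K.max((1.0 - y_true) * y_pred, axis=-1)
    return K.mean(K.maximum(0.0, neg - pos + 1), axis=-1)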

y_true = one-hot encoded vector representing the single correct label (all values are zero except the correct label, which is one)
y_pred = logit values representing how likely the model thinks each label is correct
pos = the logit value for the one correct label
neg = the largest logit value among the incorrect labels
return = the mean of max(0, neg - pos + 1)

This works as a multiclass hinge loss function (as mentioned above), and it reads to me as the same definition as the first equation @equialgo wrote.
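
The breakdown above can be traced by hand with a pure-Python version for a single sample. This is only a sketch: `categorical_hinge_sample` is a hypothetical name, and it omits the final mean that the Keras snippet applies across samples.

```python
def categorical_hinge_sample(y_true, y_pred):
    """Pure-Python mirror of categorical_hinge for one sample (illustrative only)."""
    # Score of the correct class: only the one-hot position survives the product.
    pos = sum(t * p for t, p in zip(y_true, y_pred))
    # Largest score among the other classes; the correct class is masked to 0.
    neg = max((1.0 - t) * p for t, p in zip(y_true, y_pred))
    return max(0.0, neg - pos + 1.0)


y_true = [0.0, 1.0, 0.0]

# Correct class leads by more than the margin of 1.
print(categorical_hinge_sample(y_true, [0.5, 3.0, 1.0]))    # 0.0
# An incorrect class scores higher than the correct one.
print(categorical_hinge_sample(y_true, [2.0, 1.5, 0.0]))    # 1.5
# Caveat: masking by multiplication leaves a 0 at the correct class, so when
# every incorrect score is negative, neg is 0 rather than the largest of them.
print(categorical_hinge_sample(y_true, [-2.0, 0.5, -3.0]))  # 0.5
```

In the last case a strict max over incorrect classes would give neg = -2.0 and a loss of 0, so the masked version matches the textbook multiclass hinge only when at least one incorrect score is non-negative.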
