I wrote my own loss function, and during training the loss went negative, which should be impossible since the loss only contains squared terms. So I suspect an overflow, but Theano's NanGuardMode doesn't complain about NaNs or Infs. The rest of my model isn't causing the negative loss, since it works fine with another loss function. Here is my loss function:
```python
import theano.tensor as T

def overlap(x1, w1, x2, w2):
    """
    Args:
        x1: x for the first box
        w1: width for the first box
        x2: x for the second box
        w2: width for the second box
    Note: widths and heights are stored as square roots, so w*w is the actual width.
    """
    l1 = x1 - w1*w1/2
    l2 = x2 - w2*w2/2
    left = T.switch(T.lt(l1, l2), l2, l1)
    r1 = x1 + w1*w1/2
    r2 = x2 + w2*w2/2
    right = T.switch(T.lt(r1, r2), r1, r2)
    return right - left

def box_intersection(a, b):
    """
    Args:
        a: the first box, a batch*49*4 tensor
        b: the second box, another batch*49*4 tensor
    Returns:
        area: batch*49 tensor, the intersection area of each pair of boxes
    """
    w = overlap(a[:, :, 0], a[:, :, 3], b[:, :, 0], b[:, :, 3])
    h = overlap(a[:, :, 1], a[:, :, 2], b[:, :, 1], b[:, :, 2])
    w = T.switch(T.lt(w, 0), 0, w)
    h = T.switch(T.lt(h, 0), 0, h)
    area = w * h
    return area

def box_union(a, b):
    i = box_intersection(a, b)
    # a[:,:,2] and a[:,:,3] hold sqrt(h) and sqrt(w), so area_a*area_a is the true area
    area_a = a[:, :, 2] * a[:, :, 3]
    area_b = b[:, :, 2] * b[:, :, 3]
    u = area_a * area_a + area_b * area_b - i
    return u

def box_iou(a, b):
    # the net and ground truth store the square roots of height and width
    u = box_union(a, b)
    i = box_intersection(a, b)
    u = T.switch(T.le(u, 0), 10000, u)
    return i / u

def custom_loss_2(y_pred, y_true):
    loss = 0.0
    y_pred = y_pred.reshape((y_pred.shape[0], 49, 30))
    y_true = y_true.reshape((y_true.shape[0], 49, 30))
    a = y_pred[:, :, 0:4]
    b = y_pred[:, :, 5:9]
    gt = y_true[:, :, 0:4]
    # iou between box a and gt
    iou_a_gt = box_iou(a, gt)
    # iou between box b and gt
    iou_b_gt = box_iou(b, gt)
    # mask is either 0 or 1; 1 means box b has a higher iou with gt than box a
    mask = T.switch(T.lt(iou_a_gt, iou_b_gt), 1, 0)
    # coordinate loss between box a and gt
    loss_a_gt = T.sum(T.square(a - gt), axis=2) * 5
    # coordinate loss between box b and gt
    loss_b_gt = T.sum(T.square(b - gt), axis=2) * 5
    loss = loss + y_true[:, :, 4] * (1 - mask) * loss_a_gt
    loss = loss + y_true[:, :, 4] * mask * loss_b_gt
    # confidence loss between a and gt
    closs_a_gt = T.square(y_pred[:, :, 4] - y_true[:, :, 4])
    # confidence loss between b and gt
    closs_b_gt = T.square(y_pred[:, :, 9] - y_true[:, :, 4])
    loss = loss + closs_a_gt * (1 - mask) * y_true[:, :, 4]
    loss = loss + closs_b_gt * mask * y_true[:, :, 4]
    # if the cell has no object
    loss = loss + closs_a_gt * (1 - y_true[:, :, 4]) * 0.5
    loss = loss + closs_b_gt * (1 - y_true[:, :, 4]) * 0.5
    # conditioned classification error
    loss = loss + T.sum(T.square(y_pred[:, :, 10:30] - y_true[:, :, 10:30]), axis=2) * y_true[:, :, 4]
    loss = T.sum(loss, axis=1)
    loss = T.mean(loss)
    return loss
```
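As a side note, one way to sanity-check the IoU helpers in isolation is to evaluate them on a couple of hand-made boxes (a minimal sketch, assuming the functions above are defined; boxes are laid out as [x, y, sqrt(h), sqrt(w)]):

```python
import numpy as np
import theano
import theano.tensor as T

a = T.tensor3('a')
b = T.tensor3('b')
iou = theano.function([a, b], box_iou(a, b))

# a single box of width 1 and height 1 centred at (0.5, 0.5), shape (batch, cells, 4)
box = np.array([[[0.5, 0.5, 1.0, 1.0]]], dtype=theano.config.floatX)
print(iou(box, box))      # identical boxes -> IoU of 1.0

shifted = box.copy()
shifted[..., 0] += 0.5    # shift x by half a width
print(iou(box, shifted))  # partial overlap -> IoU strictly between 0 and 1
```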
y_pred and y_true are batch_size*49*30 tensors; the values in y_true are either 1 or 0.
loss = loss + T.sum(T.square(y_pred[:,:,10:30] - y_true[:,:,10:30]), axis=2) * y_true[:,:,4]
If y_true has negative values, this whole term can become negative, since the y_true[:,:,4] factor sits outside the square.
@EderSantana Thanks for replying, but I'm pretty sure y_true is either 1 or 0; I've already double-checked :). It looks more like an overflow: when I set learning_rate=1e-4 the loss is negative right at the start of training, but with learning_rate=1e-8 the loss starts out positive and then decreases below zero during training.
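In case it matters, the NaN/Inf check was done by compiling with NanGuardMode, roughly like this (a sketch; the tiny function and input here are placeholders, the real check wraps the training function):

```python
import numpy as np
import theano
import theano.tensor as T
from theano.compile.nanguardmode import NanGuardMode

x = T.vector('x')
f = theano.function(
    [x], T.sum(T.square(x)),
    mode=NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True),
)
f(np.array([1.0, 2.0, 3.0], dtype=theano.config.floatX))  # raises if any value is NaN, Inf, or very large
```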
I found the problem: y_true has to be the first parameter and y_pred the second. I accidentally passed them to my loss function in the wrong order, such a stupid mistake.
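For anyone else who hits this: Keras calls a custom objective as loss(y_true, y_pred), so a function declared as custom_loss_2(y_pred, y_true) silently receives the targets in its y_pred slot and vice versa. A quick way to see the symptom on dummy data (this sketch assumes the functions above are in scope):

```python
import numpy as np
import theano
import theano.tensor as T

pred = T.matrix('pred')
true = T.matrix('true')
loss_fn = theano.function([pred, true], custom_loss_2(pred, true))

# dummy batch: raw network outputs vs. 0/1 targets, flattened to batch x (49*30)
p = np.random.uniform(-1.0, 1.0, size=(2, 49 * 30)).astype(theano.config.floatX)
t = np.random.randint(0, 2, size=(2, 49 * 30)).astype(theano.config.floatX)

print(loss_fn(p, t))  # arguments in the order the function expects: loss is non-negative
print(loss_fn(t, p))  # the order Keras actually delivered them in: the 0/1 indicator
                      # slot now holds raw predictions, so the loss can go negative
```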
Is this a custom bounding-box loss that works? Will it work with the TF backend?
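Not as written: the functions call theano.tensor ops directly, so they only run on the Theano backend. A backend-agnostic version would be written against keras.backend instead; a rough sketch of how the overlap helper could translate (the remaining ops map the same way, e.g. K.square, K.sum, K.mean):

```python
from keras import backend as K

def overlap(x1, w1, x2, w2):
    # same logic as the Theano version; the element-wise switches become
    # K.maximum / K.minimum, which behave the same on both backends
    l1 = x1 - w1 * w1 / 2
    l2 = x2 - w2 * w2 / 2
    r1 = x1 + w1 * w1 / 2
    r2 = x2 + w2 * w2 / 2
    return K.minimum(r1, r2) - K.maximum(l1, l2)
```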