Darknet: Loss function formula

Created on 15 May 2018 · 14 Comments · Source: AlexeyAB/darknet

Hi @AlexeyAB,
Could you tell me the formula of the YOLOv2 loss function? I can only find the YOLOv1 loss formula in the paper.

Caroline1994

Most helpful comment

@wsyzzz So you only have Yolo v2, and you want to know the loss function for Yolo v2 instead of Yolo v3?

For Yolo v2:

Loss (cost) = sum of squares of deltas

Total loss (cost) = sum of squares of deltas = pow(sqrt(sum_for_each_i(delta[i] * delta[i])), 2) = sum_for_each_i(delta[i] * delta[i])
https://github.com/AlexeyAB/darknet/blob/573d7e80814a4cc3c08897f6c0f67ea189339856/src/region_layer.c#L354

AlexeyAB on 16 May 2018

All 14 comments

@Caroline1994 Hi,

Loss (cost) = sum of squares of deltas

Delta =

- for classes: if (correct_class) delta = 1 - p; else delta = -p;
  https://github.com/AlexeyAB/darknet/blob/8b5344ee2dc551dbe673020a33021e7f84f305f1/src/yolo_layer.c#L143-L144
- for objectness:
  if (iou > ignore_thresh) then delta = 0
  if (iou <= ignore_thresh) then delta = -objectness (T0)
  (if (iou > truth_thresh) then delta = 1 - objectness (T0); this never happens because truth_thresh = 1)
  https://github.com/AlexeyAB/darknet/blob/8b5344ee2dc551dbe673020a33021e7f84f305f1/src/yolo_layer.c#L210-L217
- for boxes: delta_x = (truth.x - x) * (2 - truth.w*truth.h)
  or delta_w = (log(truth.w*w/anchor_w) - x) * (2 - truth.w*truth.h)
  https://github.com/AlexeyAB/darknet/blob/8b5344ee2dc551dbe673020a33021e7f84f305f1/src/yolo_layer.c#L99-L107
  https://github.com/AlexeyAB/darknet/blob/8b5344ee2dc551dbe673020a33021e7f84f305f1/src/yolo_layer.c#L253

Total loss (cost) = sum of squares of deltas = pow(sqrt(sum_for_each_i(delta[i] * delta[i])), 2) = sum_for_each_i(delta[i] * delta[i])
https://github.com/AlexeyAB/darknet/blob/8b5344ee2dc551dbe673020a33021e7f84f305f1/src/yolo_layer.c#L272
https://github.com/AlexeyAB/darknet/blob/8b5344ee2dc551dbe673020a33021e7f84f305f1/src/utils.c#L517-L524

AlexeyAB on 15 May 2018
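
The class and objectness rules and the final sum described above can be sketched in a few lines of C. This is only an illustration of the formulas as written in the comment, not the code from yolo_layer.c; the function names are made up for this sketch.

```c
#include <stddef.h>

/* Class delta for one class probability p:
   (1 - p) for the correct class, (-p) for every other class. */
float class_delta(float p, int is_correct_class)
{
    return (is_correct_class ? 1.0f : 0.0f) - p;
}

/* Objectness delta as described in the comment:
   0 if iou > ignore_thresh, otherwise -objectness. */
float objectness_delta(float objectness, float iou, float ignore_thresh)
{
    return (iou > ignore_thresh) ? 0.0f : -objectness;
}

/* Total cost: the sum of squared deltas over all outputs. */
float sum_squared_deltas(const float *delta, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += delta[i] * delta[i];
    }
    return sum;
}
```

For example, deltas of 0.5, -0.5 and 1.0 give a cost of 0.25 + 0.25 + 1.0 = 1.5.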

@Caroline1994 This slide may be helpful.

It's from https://www.slideshare.net/xavigiro/object-detection-d2l5-insightdcu-machine-learning-workshop-2017.

@AlexeyAB Could you tell me how the loss function is implemented in your previous version or in pjreddie's version? I notice that yolo_layer.c was updated 8 days ago, and it isn't in the files I downloaded earlier. Could you help me locate the loss function code?

wsyzzz on 16 May 2018

@wsyzzz yolo_layer.c was the same then as it is now. Only these were added:

Focal loss: https://github.com/AlexeyAB/darknet/commit/6056b835eb76b8a078aab18db3e7aba87314f4ce#diff-180a7a56172e12a8b79e41ec95ae569dR121

and max boxes: https://github.com/AlexeyAB/darknet/commit/35cc0aaa15b991b348cc8d9623eed5d4f8a1e435#diff-180a7a56172e12a8b79e41ec95ae569d

Or what exactly do you mean?

AlexeyAB on 16 May 2018

@AlexeyAB I cloned your project several months ago and I want to locate the code of the loss function. But in my files there is no file named yolo_layer.c.

wsyzzz on 16 May 2018

@wsyzzz So you only have Yolo v2, and you want to know the loss function for Yolo v2 instead of Yolo v3?

For Yolo v2:

Loss (cost) = sum of squares of deltas

AlexeyAB on 16 May 2018

@AlexeyAB Where is the loss of bounding box coordinate regression? Is it included in the three things you mentioned?

wsyzzz on 16 May 2018

for boxes https://github.com/AlexeyAB/darknet/blob/573d7e80814a4cc3c08897f6c0f67ea189339856/src/region_layer.c#L92-L111

AlexeyAB on 16 May 2018

@AlexeyAB Hmm, actually I want to change the second line of the picture below from (\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2 to [(w_i - \hat{w}_i) / w_i]^2 + [(h_i - \hat{h}_i) / h_i]^2.
But I don't understand how to translate the equation into code. Could you give me some advice? Thanks.

wsyzzz on 16 May 2018

Thank you for your reply, @wsyzzz.
The picture you gave shows the loss formula of YOLOv1. I would like to ask whether YOLOv2's loss formula is the same as YOLOv1's?
@AlexeyAB Could you tell me the difference between the two versions of the loss formula?

Caroline1994 on 16 May 2018

@Caroline1994 Sorry about that. The YOLOv2 paper doesn't mention it, and there seems to be no explanation on the Internet. This site may be helpful, but I still can't understand how the equation is implemented in code.

https://groups.google.com/forum/#!topic/darknet/TJ4dN9R4iJk

wsyzzz on 17 May 2018

@AlexeyAB
Hello, my problem is the same as the original poster's. I also need to change the w, h part of the loss function. Where can I modify it in the code?

lvshuaigg on 7 Jan 2019

@lvshuaigg

There are several formulas:

- deltas for bounding boxes: https://github.com/AlexeyAB/darknet/blob/fd0df9297c86a272f0bf0841291bc4565e90a7cd/src/yolo_layer.c#L94-L109
- deltas for objectness (T0):
  - https://github.com/AlexeyAB/darknet/blob/fd0df9297c86a272f0bf0841291bc4565e90a7cd/src/yolo_layer.c#L218-L221
  - https://github.com/AlexeyAB/darknet/blob/fd0df9297c86a272f0bf0841291bc4565e90a7cd/src/yolo_layer.c#L264
- deltas for classes

Deltas for classes differ in Yolo v2 and Yolo v3:

- multi-label classification: each bounding box (each anchor) can have several classes, and the model has >= 1 classes in total. Binary cross-entropy with Logistic activation (sigmoid) is used. This is used in Yolo v3: https://github.com/AlexeyAB/darknet/issues/1695#issuecomment-450995001
- multi-class classification: each bounding box (each anchor) can have only one class, and the model has >= 1 classes in total. Categorical cross-entropy with Softmax activation is used. This is used in Yolo v2: https://github.com/AlexeyAB/darknet/issues/1695#issuecomment-451002957

AlexeyAB on 7 Jan 2019

@AlexeyAB
I looked at the source code and some references and found "scale = 2 - groundtruth.w * groundtruth.h". Can you explain why this is done? And why does the source code not reflect the square roots of w and h from the loss function?
Thank you very much!

lvshuaigg on 10 Jan 2019
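
The thread does not answer this question. As plain arithmetic, and assuming groundtruth.w and groundtruth.h are relative to the image (between 0 and 1), the scale can be tabulated with a tiny helper; the function name is made up for this sketch.

```c
/* Scale factor quoted in the question. With truth_w and truth_h in (0, 1],
   the result lies in [1, 2): close to 2 for tiny boxes, 1 for a full-image box. */
float box_scale(float truth_w, float truth_h)
{
    return 2.0f - truth_w * truth_h;
}
```

So a box covering 10% of the width and height gets a scale of 1.99, while a box covering the whole image gets 1.0; errors on small boxes therefore count for more. Whether this is meant as a replacement for the square-root term of YOLOv1 is an assumption, not something stated in the thread.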

@AlexeyAB
Can you please elaborate on the function parameter best_n for the box loss function? It seems like you are assigning the best anchor box to the ground truth. Shouldn't the network predict the best anchor box ratios?
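
This question is not answered in the thread. The idea it describes, assigning to each ground-truth box the anchor whose shape overlaps it best, can be sketched like this. The types and function names are hypothetical and the code is not taken from darknet; both sizes are assumed to be in the same unit.

```c
/* Hypothetical size type for this sketch: width and height in the same unit. */
typedef struct { float w, h; } sketch_size;

/* IoU of two boxes placed on the same centre, so only width and height matter. */
float shape_iou(sketch_size a, sketch_size b)
{
    float iw = a.w < b.w ? a.w : b.w;
    float ih = a.h < b.h ? a.h : b.h;
    float inter = iw * ih;
    float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

/* Returns the index of the anchor with the highest shape IoU against truth,
   or -1 if no anchor overlaps at all (or n is 0). */
int best_anchor(sketch_size truth, const sketch_size *anchors, int n)
{
    int best_n = -1;
    float best_iou = 0.0f;
    for (int i = 0; i < n; ++i) {
        float iou = shape_iou(truth, anchors[i]);
        if (iou > best_iou) {
            best_iou = iou;
            best_n = i;
        }
    }
    return best_n;
}
```

Under this reading the anchors are fixed sizes given in advance, and the network predicts offsets relative to the chosen anchor rather than the anchor shapes themselves.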