Darkflow: yolov2 loss inconsistent with paper

Created on 26 Mar 2017 · 22 Comments · Source: thtrieu/darkflow

In the original YOLO9000 paper, the network predicts P(Object) * IOU(Object, bounding box). When I scanned through your implementation in net/yolov2/train.py, I found that the loss is defined using the binary P(Object) only. Is this consistent with the original darknet? (I haven't read the source code of the original darknet.)

bug

All 22 comments

I read the C source code of darknet. The author did implement the loss function of objectness score in the same way as he stated in the paper, so I think this inconsistency in tensorflow implementation could be a problem.

@ryansun1900

I implemented YOLOv2 training by referring to the darkflow YOLOv1 code and the YOLO9000 paper.
It may not be consistent with the original darknet C source code.
Thank you for pointing out this problem, but I have been too busy recently.
It would be great if anyone could help translate the training code directly from the C source and open a pull request!

Looking at darknet, assuming the loss code (edit: this is YOLOv1) is https://github.com/pjreddie/darknet/blob/master/src/detection_layer.c#L66 . The code in https://github.com/thtrieu/darkflow/blob/master/net/yolov2/train.py looks like a pretty good vectorization of it. I did notice two things:

  1. There is no equivalent of float rmse = box_rmse(out, truth) (line 121) in the darkflow training code. It is used when best_iou is 0.

  2. Assuming I'm understanding it correctly, the loss in darknet is (abbreviated)

l.delta[class_index+j] = l.class_scale * (net.truth[truth_index+1+j] - l.output[class_index+j]);
l.delta[p_index] = l.object_scale * (1.-l.output[p_index]);
l.delta[box_index+0] = l.coord_scale*(net.truth[tbox_index + 0] - l.output[box_index + 0]); 
*(l.cost) = pow(mag_array(l.delta, l.outputs * l.batch), 2);

but loss in darkflow is

loss = tf.pow(adjusted_net_out - true, 2)
loss = tf.multiply(loss, wght)
loss = tf.reshape(loss, [-1, H*W*B*(4 + 1 + C)])
loss = tf.reduce_sum(loss, 1)
self.loss = .5 * tf.reduce_mean(loss)

which is close, but slightly out of order.
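
To make the "slightly out of order" remark concrete, here is a minimal sketch in plain Python with toy numbers (not taken from either code base). It assumes mag_array returns the Euclidean norm of the delta array, so the darknet cost is the sum of squared deltas with the scale applied before squaring; the darkflow snippet applies the weight after squaring and halves the result. The function names are hypothetical, and the sketch covers a single example rather than a batch.

```python
def darknet_style_cost(pred, truth, scale):
    """Sum of squared deltas, where delta = scale * (truth - pred)."""
    deltas = [s * (t - p) for p, t, s in zip(pred, truth, scale)]
    return sum(d * d for d in deltas)


def darkflow_style_loss(pred, truth, weight):
    """0.5 * sum of weight * (pred - truth)^2 for a single example."""
    return 0.5 * sum(w * (p - t) ** 2 for p, t, w in zip(pred, truth, weight))


# Toy values: one coordinate term (scale 1) and one objectness term (scale 5).
pred = [0.5, 0.0]
truth = [1.0, 1.0]
scale = [1.0, 5.0]

cost_c = darknet_style_cost(pred, truth, scale)
loss_tf = darkflow_style_loss(pred, truth, scale)
print(cost_c, loss_tf)
```

In the first function the scale contributes squared (5 * 5 = 25 for the objectness term), while in the second it contributes linearly (5), so the two reported numbers differ for the same predictions. Whether that changes training depends on how each framework turns these quantities into gradients, which this sketch does not model.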

@jcarletgo there is an additional line
if(l.rescore){ l.delta[p_index] = l.object_scale * (iou - l.output[p_index]); }
which states that iou is used in the loss layer.
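
The difference that line makes can be sketched in a few lines of Python. This is an illustration only, not code from darknet or darkflow: box_iou and objectness_delta are hypothetical helpers, boxes are assumed to be (x_center, y_center, width, height) tuples, and the default object_scale of 1.0 is a placeholder.

```python
def box_iou(a, b):
    """IOU of two boxes given as (x_center, y_center, width, height)."""
    def overlap(c1, w1, c2, w2):
        # Length of the 1-D overlap between two centered intervals.
        left = max(c1 - w1 / 2, c2 - w2 / 2)
        right = min(c1 + w1 / 2, c2 + w2 / 2)
        return max(0.0, right - left)

    inter = overlap(a[0], a[2], b[0], b[2]) * overlap(a[1], a[3], b[1], b[3])
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def objectness_delta(pred_conf, pred_box, true_box, object_scale=1.0, rescore=True):
    """Delta for the objectness output of the responsible box.

    rescore=True uses the IOU as the target, as in the quoted C line;
    rescore=False uses the binary target 1.
    """
    target = box_iou(pred_box, true_box) if rescore else 1.0
    return object_scale * (target - pred_conf)


# Toy boxes in arbitrary units; the true box is half the height of the prediction.
pred_box = (2.0, 2.0, 4.0, 4.0)
true_box = (2.0, 2.0, 4.0, 2.0)

with_rescore = objectness_delta(0.9, pred_box, true_box, rescore=True)
binary_target = objectness_delta(0.9, pred_box, true_box, rescore=False)
print(with_rescore, binary_target)
```

For these toy boxes the IOU is 0.5, so a predicted confidence of 0.9 is pushed down when rescoring is on and pushed up toward 1 with the binary target. That sign difference is the practical content of the inconsistency raised at the top of the thread.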

Do we have any update on this matter?

No, nothing.
Can I ask you a question?
What kind of programming language do you usually use?

I usually use Python and R. I am still busy with some personal projects and don't have time to check the C source code. I hope someone can help check. Thanks.

R for big data?
I still mostly use C.
My Python is still at a low level.
When I look at your program, I feel so frustrated.
And I'm sorry, because I can't help you.
But you have given me a goal.

thank you

I just have a question regarding this bug: will it affect inference? If I fine-tune/train using the darknet C code to get the weights and use this repository for inference, will it work?

@ryansun1900, @minhnhat93 raises a good point. If the training script won't work in this repo because of implementation differences, then the next best thing is to train on darknet and simply use the weights here.

@minhnhat93 My instinct is that it should work fine. A bug in the training function should not affect the predictions for pretrained models. In fact, I tried darkflow on the pretrained darknet models and they appeared to work just fine.

@minhnhat93 but the problem is that when you compare person detection with the original YOLO, darkflow does worse: it fails to detect some people in an image. I did not dig too deeply into the code, so I'm not sure why this inconsistency happens, since we are using the pretrained model from the original YOLO, right?

@ryansun1900 @thtrieu was this fixed?

@ryansun1900 @thtrieu was this bug fixed?

@jcarletgo @ryansun1900 @thtrieu nice thread to follow. There's another "inconsistency" compared with the paper. In the YOLOv1 paper, YOLO should output a feature map of shape (S^2, B*5+C), but the implementation here uses shape [S^2, B*(5+C)]. If I understand correctly, for the regression on class probability, the paper penalizes classification error for all anchor boxes as long as the grid cell matches a true box, whereas the implementation here penalizes only the "best_match" anchor box and ignores the non-matching ones. I didn't look at the darknet C code to compare. Is this done on purpose?

@tianyu-tristan I was wondering the same thing and checked out darknet code. It seems like B*(5+C) is the right way to go, looking at https://github.com/pjreddie/darknet/blob/508381b37fe75e0e1a01bcb2941cb0b31eb0e4c9/src/region_layer.c#L22
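
As a quick way to see how the two layouts differ in size, here is a small Python sketch. The function names are hypothetical and the values of B and C are illustrative only; it counts the outputs per grid cell under each layout and does not reproduce any code from darknet or darkflow.

```python
def per_cell_outputs_shared_classes(B, C):
    """B boxes of 5 values each, plus one class vector shared by the cell."""
    return B * 5 + C


def per_cell_outputs_per_box_classes(B, C):
    """Every box carries its own 5 values and its own C class scores."""
    return B * (5 + C)


# Illustrative settings: 2 boxes or 5 boxes per cell, 20 classes.
shared_small = per_cell_outputs_shared_classes(B=2, C=20)
per_box_small = per_cell_outputs_per_box_classes(B=2, C=20)
shared_large = per_cell_outputs_shared_classes(B=5, C=20)
per_box_large = per_cell_outputs_per_box_classes(B=5, C=20)
print(shared_small, per_box_small, shared_large, per_box_large)
```

With a shared class vector there is a single set of class scores per cell, so a classification penalty cannot be attached to one particular box. With per-box class scores it can, which is what makes a "best match only" penalty possible in the B*(5+C) layout.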

@jcarletgo Could you elaborate on what you mean by "slightly out of order"? I'm trying hard to match the darknet implementation against the darkflow implementation of the loss calculation, but I am only slowly starting to recognise the components of the formula (https://stats.stackexchange.com/questions/287486/yolo-loss-function-explanation) in each implementation, and I do not yet see any differences.

Another good explanation of the loss function: https://hackernoon.com/understanding-yolo-f5a74bbc7967

I think @zhisong is right. Rescoring is not implemented in darkflow.
