RGB. You can see it in image.c:load_image_cv
@hemp110 Thanks~~ 😀
@hemp110 Hi, I was looking for this in the code, and as far as I understand the OpenCV documentation, an image is read in BGR order. Could you point me to the line where the order is flipped? And what happens if I have not compiled darknet with OpenCV?
Easy:
https://github.com/AlexeyAB/darknet/blob/49189c22e8c09ba7c11622ec22a72b3678887ed2/src/image.c#L1043-L1074
That function in darknet uses OpenCV as an "image loader" to read images from disk. And OpenCV always reads images into BGR in RAM.
Look at the final lines of that code:
if (out.c > 1)
rgbgr_image(out);
Meaning: if the image has more than 1 channel (i.e. it is not grayscale), run the function rgbgr_image, which is as follows:
https://github.com/AlexeyAB/darknet/blob/49189c22e8c09ba7c11622ec22a72b3678887ed2/src/image.c#L943-L951
That function flips the Blue and Red channels. In other words it converts BGR (the data read from disk by OpenCV) into RGB (the format that Darknet wants).
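To make the swap concrete, here is a rough Python sketch of the same idea. It assumes Darknet's planar layout (all values of channel 0, then all of channel 1, then all of channel 2). The function name and the plain-list image are mine, for illustration only; this is not darknet's code.

```python
def swap_red_blue_planar(data, w, h):
    """Swap the first and third channel planes of a planar image, in place.

    `data` is a flat list laid out as [plane0..., plane1..., plane2...],
    where each plane holds w * h values (hypothetical stand-in for the
    float buffer of a darknet image).
    """
    plane = w * h
    for i in range(plane):
        # Exchange plane 0 and plane 2; the middle (green) plane is untouched.
        data[i], data[i + 2 * plane] = data[i + 2 * plane], data[i]
    return data


# A 2x1 image as it would arrive from OpenCV: blue plane, green plane, red plane.
bgr = [0.1, 0.2, 0.5, 0.6, 0.9, 1.0]
rgb = swap_red_blue_planar(bgr, w=2, h=1)
print(rgb)  # [0.9, 1.0, 0.5, 0.6, 0.1, 0.2]
```

Running the swap a second time gives back the original order, which is why the same function works for both BGR -> RGB and RGB -> BGR.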
But wait, let's get more proof!
OpenCV has a built-in implementation of Darknet's algorithms (it does not use any darknet source code), and it can load Darknet models.
Here's how they define the Darknet importer: https://github.com/opencv/opencv/blob/631b246881f04021dfcdb2f6be03c6c108f82163/samples/dnn/models.yml
yolo:
  model: "yolov3.weights"
  config: "yolov3.cfg"
  mean: [0, 0, 0]
  scale: 0.00392
  width: 416
  height: 416
  rgb: true
  classes: "object_detection_classes_yolov3.txt"
  sample: "object_detection"
They use that information as follows: https://github.com/opencv/opencv/blob/5115e5decbef657ceb234d21ad8e8e33a6c96ca4/samples/dnn/classification.py
blob = cv.dnn.blobFromImage(frame, args.scale, (inpWidth, inpHeight), args.mean, args.rgb, crop=False)
So the "rgb: true/false" (args.rgb) value from the YML file above is passed as the swapRB parameter of cv.dnn.blobFromImage. If that value is true, the OpenCV "BGR" image (captured by cv.VideoCapture in that example) is flipped into "RGB".
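If it helps to see what the scale and swapRB arguments amount to, here is a simplified pure-Python sketch. It is not OpenCV's implementation: it leaves out resizing, mean subtraction and the batch dimension, and the name to_blob_sketch is hypothetical. It only shows the per-pixel scaling, the optional red/blue swap, and the change from interleaved pixels to separate channel planes.

```python
def to_blob_sketch(image_hwc, scale=1.0, swap_rb=False):
    """Turn an interleaved image into channel planes.

    image_hwc: list of rows, each row a list of 3-value pixels (c0, c1, c2).
    Returns a nested list indexed as [channel][row][col].
    Simplified illustration only: no resize, no mean subtraction.
    """
    # With swap_rb the first and third channels trade places (BGR <-> RGB).
    order = (2, 1, 0) if swap_rb else (0, 1, 2)
    return [
        [[pixel[c] * scale for pixel in row] for row in image_hwc]
        for c in order
    ]


# One row with two pixels in BGR order: pure blue, then pure red.
frame_bgr = [[(255, 0, 0), (0, 0, 255)]]
blob = to_blob_sketch(frame_bgr, scale=1 / 255, swap_rb=True)
# blob[0] is now the red plane and blob[2] the blue plane.
```

With swap_rb=False the planes stay in B, G, R order, which is the "wrong" input for a network trained on RGB.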
So, what conclusions can we draw from how OpenCV implements Darknet:
.weights files are trained as RGB files.
Lastly, you can simply use OpenCV and try their darknet engine. If you feed it a BGR image of an object which has significant blue or red color, the "detection certainty" will be LOWER than if you feed it a proper RGB image of that object. This proves that the neural network weights are tuned for RGB.
For example, I have a network trained on blue and red balls (precisely the channels that differ in BGR vs RGB). If I feed a BGR image to it, I get 0.7374227046966553 confidence. If I feed the same image flipped to RGB channels, I get 0.8908946514129639 confidence. This test was done by loading the image using cv.imread, which loads as BGR, and then using blobFromImage with swapRB set to either True (BGR -> RGB) or False (keeps it as BGR) when making the blob to feed into the neural network.
The neural network weights then decide how it reacts to each color channel. As you can see, the weights (trained by the official darknet.exe) react properly when fed RGB input.
So yes, in every way, it is conclusively proven: Darknet is RGB-based, and the weights are RGB-based.
Oh and wait, there's even more proof! Read this implementation of the actual Darknet library as a Python module: https://github.com/madhawav/YOLO3-4-Py/issues/67
So yes, Darknet = RGB.