Darknet: Image with 2 channels

Created on 9 Jul 2018 · 8 Comments · Source: AlexeyAB/darknet

Can I use this network with images that contain only 2 channels? I'm dealing with X-ray images. The first channel is the raw image (16-bit grayscale) and the second channel is the log-transformed image. Does that work? And could you tell me which file I should modify? Thank you!

ycui123

All 8 comments

> The first channel is raw image (16-bit grayscale) and the second channel is log transformed image.

If you can convert these 2 channels (1st 16-bit + 2nd 8-bit) into an 8-bit, 3-channel image (24 bits in total), then just use such images for training and detection as usual.

If you can't convert them that way, then you should change the source code to support it. Look at the changes that were made to support 1-channel 8-bit images: https://github.com/AlexeyAB/darknet/pull/936/files

You should change these functions:

https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/image.c#L936-L954

Also if OpenCV is used:

If OpenCV isn't used:

AlexeyAB on 9 Jul 2018

Thanks for the quick reply. I could convert the two channels to 8-bit and zero-pad the 3rd channel. Would that work?

ycui123 on 9 Jul 2018

> The first channel is raw image (16-bit grayscale) and the second channel is log transformed image.

Is your 1st channel 16-bit?
How many bits does the 2nd channel (the log-transformed image) have?

AlexeyAB on 9 Jul 2018

Yes. The 2nd channel is also 16-bit, since I derived it from the first channel.

ycui123 on 9 Jul 2018

So you can convert it to the common 8-bit, 3-channel format in whatever way you want and it will work:

either convert the two 16-bit channels to two 8-bit channels, and set the 3rd channel to all zeros,
or convert the first 16-bit channel to two 8-bit channels, and the second 16-bit channel to one 8-bit channel.

Just make sure you do training and detection with the same type of conversion.
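The two conversion options can be sketched in a few lines of Python. This is only an illustration and not code from darknet: the function names are hypothetical, the pixels are plain lists of integers rather than a real image buffer, and reducing 16 bits to 8 by keeping the high byte is an assumption (a min-max or percentile rescale may suit X-ray data better).

```python
def to_8bit(value16):
    """Reduce a 16-bit sample (0..65535) to 8 bits by keeping its high byte (assumed scaling)."""
    return value16 >> 8


def pack_zero_third(raw16, log16):
    """Option 1: each 16-bit channel becomes one 8-bit channel; the 3rd channel is all zeros."""
    return [(to_8bit(r), to_8bit(l), 0) for r, l in zip(raw16, log16)]


def pack_split_first(raw16, log16):
    """Option 2: the raw channel is split into high and low bytes; the log channel is reduced to 8 bits."""
    return [(r >> 8, r & 0xFF, to_8bit(l)) for r, l in zip(raw16, log16)]


# Three sample pixels: raw values and their (hypothetical) log-transformed counterparts.
raw = [0, 256, 65535]
log = [0, 32768, 65535]

print(pack_zero_third(raw, log))   # one (ch1, ch2, ch3) tuple per pixel
print(pack_split_first(raw, log))
```

Whichever packing is chosen, the same function would have to be applied to every training image and to every image passed to the detector.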

Also, you should disable some types of color data augmentation, i.e. set

saturation = 1.0
exposure = 1.5
hue = 0

instead of: https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/cfg/yolov3.cfg#L14-L16

AlexeyAB on 9 Jul 2018

👍3

Thank you! I'll try and let you know!

ycui123 on 9 Jul 2018

Hi @AlexeyAB,

> or convert two 16-bit channels to the two 8-bit channels, and set all zeros in the 3rd channel.

I used the above method and trained for 8000 iterations. I have only one class. I found that the model didn't overfit the data as the number of iterations increased.

Here's what I got for 8000 iterations:
for thresh = 0.25, precision = 0.86, recall = 0.65, F1-score = 0.74
for thresh = 0.25, TP = 652, FP = 103, FN = 348, average IoU = 62.57 %
mean average precision (mAP) = 0.677931, or 67.79 %
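As a sanity check, the precision, recall and F1-score reported above follow from the TP, FP and FN counts by the standard definitions. A minimal sketch (the function name is hypothetical, not part of darknet):

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp)   # share of detections that are correct
    recall = tp / (tp + fn)      # share of ground-truth objects that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = detection_metrics(tp=652, fp=103, fn=348)
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1-score = {f1:.2f}")
# prints: precision = 0.86, recall = 0.65, F1-score = 0.74
```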

I followed all the instructions you gave in https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

Are there any other ways to improve performance? I want to lower both FP and FN as much as possible. Should I train for more iterations?

Thank you

EDITED: My objects are very small (usually within 100 * 100) and the images are big (around 1200 * 4000).
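To put those sizes in perspective, here is a rough sketch of how large such an object is once the image has been scaled to the network input. Everything beyond the 100 * 100 and 1200 * 4000 figures is an assumption: the thread does not state the network resolution (416 x 416 is used here only as an example), the image is assumed to be stretched to the input size without letterboxing, and 1200 is taken to be the width.

```python
def size_after_resize(obj_w, obj_h, img_w, img_h, net_w=416, net_h=416):
    """Object size in pixels after the whole image is stretched to net_w x net_h (assumed values)."""
    return obj_w * net_w / img_w, obj_h * net_h / img_h


w, h = size_after_resize(100, 100, img_w=1200, img_h=4000)
print(f"{w:.1f} x {h:.1f} px")
# prints: 34.7 x 10.4 px
```

Under these assumptions a 100 * 100 object shrinks to a few dozen pixels in one dimension and about ten in the other, which may be part of why recall is limited.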

ycui123 on 12 Jul 2018

Can we use the route function to concatenate the two images? I don't know how to write that in the data layer of the .cfg file.