Darknet: Image with 2 channels

Created on 9 Jul 2018 · 8 Comments · Source: AlexeyAB/darknet

Can I use this network with images that contain only 2 channels? I'm dealing with x-ray images. The first channel is the raw image (16-bit grayscale) and the second channel is the log-transformed image. Does that work? And could you tell me which files I should modify? Thank you!

All 8 comments

The first channel is raw image(16bit grayscale) and the second channel is log transformed image.

If you can convert these 2 channels (1st 16-bit + 2nd 8-bit) into an 8-bit 3-channel image (24 bits in total), then just use such images for training and detection as usual.

If you can't convert them in such a way, then you should change the source code to support this. Look at the changes that were made to support 1-channel 8-bit images: https://github.com/AlexeyAB/darknet/pull/936/files

You should change these functions:

  1. https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/image.c#L936-L954

Also if OpenCV is used:

  1. https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/image.c#L956-L986

  2. https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/data.c#L723-L789

  3. And maybe this: https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/http_stream.cpp#L272-L328


If OpenCV isn't used:

  1. https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/image.c#L1810-L1841

  2. https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/src/data.c#L791-L842

Thanks for the quick reply. I could convert the two channels to 8 bits and zero-pad the 3rd channel. Would that work?

The first channel is raw image(16bit grayscale) and the second channel is log transformed image.

  • Is the 1st channel 16-bit?
  • How many bits does the 2nd channel (the log-transformed image) have?

Yes. And the 2nd channel is also 16-bit, since I derived it from the first channel.

So you can convert it to the common 8-bit 3-channel format in any way you want, and it will work:

  • either convert the two 16-bit channels to two 8-bit channels, and set the 3rd channel to all zeros,

  • or convert the first 16-bit channel to two 8-bit channels, and the second 16-bit channel to one 8-bit channel.

Just make sure you use the same type of conversion for both training and detection.
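The two conversion options can be sketched in Python with NumPy. This is only an illustration under assumptions that are not stated in the thread: the 16-bit to 8-bit reduction keeps the high byte, "two 8-bit channels" is read as the high and low bytes of the raw channel, and the helper names (to_uint8, pack_zero_third, pack_split_bytes) as well as the synthetic input are hypothetical.

```python
import numpy as np

def to_uint8(channel16):
    """Reduce a 16-bit channel to 8 bits by keeping its high byte (an assumed choice)."""
    return (channel16 >> 8).astype(np.uint8)

def pack_zero_third(raw16, log16):
    """Option 1: two 8-bit channels plus an all-zero third channel."""
    zeros = np.zeros(raw16.shape, dtype=np.uint8)
    return np.dstack([to_uint8(raw16), to_uint8(log16), zeros])

def pack_split_bytes(raw16, log16):
    """Option 2: high and low bytes of the raw channel, plus the 8-bit log channel."""
    high = (raw16 >> 8).astype(np.uint8)
    low = (raw16 & 0xFF).astype(np.uint8)
    return np.dstack([high, low, to_uint8(log16)])

# Synthetic data standing in for a real two-channel x-ray image
rng = np.random.default_rng(0)
raw16 = rng.integers(0, 65536, size=(4, 6), dtype=np.uint16)
log16 = (np.log1p(raw16.astype(np.float64)) / np.log(65536.0) * 65535).astype(np.uint16)

img_a = pack_zero_third(raw16, log16)   # shape (4, 6, 3), dtype uint8
img_b = pack_split_bytes(raw16, log16)  # shape (4, 6, 3), dtype uint8
```

With option 2 the raw channel can be recovered exactly from the first two output channels, but only if the images are stored in a lossless format such as PNG; JPEG compression would corrupt the low byte.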


Also, you should disable some types of color data augmentation, i.e. set

saturation = 1.0
exposure = 1.5
hue=0

instead of the default values: https://github.com/AlexeyAB/darknet/blob/e301fee8a0d1343824dd8038bc051f728b93bc57/cfg/yolov3.cfg#L14-L16
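One way to apply these settings without editing the file by hand is a small text substitution. The sketch below is a hypothetical helper (set_cfg_values is not part of Darknet), and the sample text only assumes what the [net] section of the cfg looks like.

```python
import re

def set_cfg_values(cfg_text, overrides):
    """Return cfg_text with the value of each `key = value` line replaced."""
    for key, value in overrides.items():
        # Match the key at the start of a line, keeping the original spacing around "="
        pattern = re.compile(
            rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*).*$", flags=re.MULTILINE
        )
        cfg_text = pattern.sub(lambda m, v=value: m.group(1) + str(v), cfg_text)
    return cfg_text

# Sample text assumed to resemble the [net] section of yolov3.cfg
sample = """[net]
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
"""

patched = set_cfg_values(sample, {"saturation": 1.0, "hue": 0})
print(patched)
```

Only the listed keys are touched, so exposure keeps whatever value the file already has.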

Thank you! I'll try and let you know!

Hi @AlexeyAB,

or convert two 16-bit channels to the two 8-bit channels, and set all zeros in the 3rd channel.

I used the above method and trained for 8000 iterations with a single class. The model did not overfit the data as the number of iterations increased.

Here's what I got for 8000 iterations:
for thresh = 0.25, precision = 0.86, recall = 0.65, F1-score = 0.74
for thresh = 0.25, TP = 652, FP = 103, FN = 348, average IoU = 62.57 %
mean average precision (mAP) = 0.677931, or 67.79 %
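For reference, the precision, recall and F1-score above follow directly from the reported TP, FP and FN counts. A minimal check using the standard definitions (the function name detection_metrics is just for illustration):

```python
def detection_metrics(tp, fp, fn):
    """Compute precision, recall and F1-score from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

precision, recall, f1 = detection_metrics(tp=652, fp=103, fn=348)
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1-score = {f1:.2f}")
# prints: precision = 0.86, recall = 0.65, F1-score = 0.74
```

The 348 false negatives out of 1000 ground-truth objects account for the gap between precision and recall.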

I followed all instructions you gave in https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

Are there any other ways to improve my performance? I want to lower FP as well as FN as much as possible. Should I train for more iterations?

Thank you

EDITED: My objects are very small (usually within 100 × 100) and the images are big (around 1200 × 4000).
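A rough way to see how small such objects become at the network input. This sketch rests on assumptions that are not confirmed in the thread: the image is simply stretched to the network's width and height, the network input is 416 × 416, and 1200 is the image width while 4000 is its height.

```python
def scaled_size(obj_w, obj_h, img_w, img_h, net_w, net_h):
    """Object size in pixels after stretching the image to net_w x net_h."""
    return obj_w * net_w / img_w, obj_h * net_h / img_h

# Assumed values: 416 x 416 network input, 1200 wide by 4000 high image
w, h = scaled_size(obj_w=100, obj_h=100, img_w=1200, img_h=4000, net_w=416, net_h=416)
print(f"{w:.1f} x {h:.1f} px at the network input")
```

Under these assumptions a 100 × 100 object shrinks to roughly 35 × 10 pixels, so the width and height values in the cfg file strongly affect how much detail the network sees.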

Can we use a route layer to concatenate the two images? But I don't know how to write the data layer in the .cfg file.
