Keras-retinanet: how to edit anchor box size to fit my annotations

Created on 8 Sep 2018 · 18 Comments · Source: fizyr/keras-retinanet

This is an excellent open source project! Thanks a lot.
But I have a problem: my task is to recognize text in images, so the text areas are small. After running debug.py I found that most annotations were drawn in red, so I think I need to change the anchor sizes to fit my ground-truth boxes. I am a novice, though. Could you give me some advice on how to change the generated anchor sizes?
Thank you!

All 18 comments

This is a good use case for modifying anchors. Unfortunately, the anchor sizes are currently hardcoded, although there is an open PR to improve this.

You'll have to modify the code in several places to change your anchors, see here: https://github.com/fizyr/keras-retinanet/issues/439#issuecomment-385972712

@zhouyuangan Hey, did you change the anchors and try them? If so, how did you calculate the anchors for your dataset?

@IntelligentIndia7 Hey, I changed the anchor ratios for my own dataset and it works well. I had read the YOLOv3 paper, which describes choosing anchors by k-means clustering of the ground-truth boxes. Here is my repo; try it yourself. Good luck.
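
The linked repo's exact method is not reproduced here. As a rough illustration only, the sketch below clusters box aspect ratios with a simplified 1-D k-means in log space; the YOLO papers cluster (width, height) pairs with an IoU-based distance instead. The function name `cluster_aspect_ratios` and the sample boxes are hypothetical, and the function returns height/width, which is the convention identified later in this thread.

```python
import numpy as np

def cluster_aspect_ratios(boxes, k=3, iters=50):
    """Cluster height/width ratios of (x1, y1, x2, y2) boxes with 1-D k-means.

    Works in log space so that 1:4 and 4:1 are treated symmetrically.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    log_ratios = np.log(heights / widths)

    # Deterministic init: spread the centres over quantiles of the data.
    centres = np.quantile(log_ratios, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign every box to its nearest centre.
        labels = np.argmin(np.abs(log_ratios[:, None] - centres[None, :]), axis=1)
        new_centres = centres.copy()
        for j in range(k):
            if np.any(labels == j):
                new_centres[j] = log_ratios[labels == j].mean()
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return np.sort(np.exp(centres))

# Hypothetical annotations: wide, square, and tall boxes.
sample_boxes = (
    [(0, 0, 100, 20)] * 6    # height/width = 0.2
    + [(0, 0, 50, 50)] * 6   # height/width = 1.0
    + [(0, 0, 30, 90)] * 6   # height/width = 3.0
)
ratios = cluster_aspect_ratios(sample_boxes, k=3)
print(ratios)
```

On real annotations the clusters will not be this clean, so it is worth checking the result against the debug.py visualisation before training.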

@zhouyuangan Thanks !! Will try the same.

@zhouyuangan Hey, I tried changing the anchor ratios, which I calculated using the code you provided. I changed the values in the file keras_retinanet/utils/anchors.py here:

AnchorParameters.default = AnchorParameters(
    sizes   = [32, 64, 128, 256, 512],
    strides = [8, 16, 32, 64, 128],
    ratios  = np.array([1.27, 2.08, 3.64], keras.backend.floatx()),
    scales  = np.array([2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)], keras.backend.floatx()),
)

I trained for 50 epochs; my training loss started at around 0.25 and ended around 0.02. I did not get any errors, but my mAP is very low. I am training to detect cable wires and floor mats.

Do I need to change anything anywhere else? Can you advise me if I am doing something wrong? Should I increase the number of ratios (more than 3)?

Thanks in advance.

@IntelligentIndia7 Hi, do not increase the number of anchor ratios; otherwise the classification and regression subnetworks will consume more GPU memory. I think the problem is in your dataset or your training hyperparameters.
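
To make the memory argument concrete: in RetinaNet each feature-map location gets len(ratios) × len(scales) anchors, and the final layers of the classification and regression heads emit num_anchors × num_classes and num_anchors × 4 channels respectively. A small sketch of that arithmetic follows; the helper name and the two-class setup are illustrative assumptions, not code from the repo.

```python
def head_output_channels(num_ratios, num_scales, num_classes):
    """Channels emitted per feature-map location by the two RetinaNet heads."""
    num_anchors = num_ratios * num_scales
    return {
        "num_anchors": num_anchors,
        "classification": num_anchors * num_classes,  # one score per anchor per class
        "regression": num_anchors * 4,                # four box offsets per anchor
    }

# Two classes (cable wire, floor mat), three scales.
three_ratios = head_output_channels(num_ratios=3, num_scales=3, num_classes=2)
five_ratios = head_output_channels(num_ratios=5, num_scales=3, num_classes=2)
print(three_ratios)
print(five_ratios)
```

Going from three to five ratios raises the anchors per location from 9 to 15, so the head outputs and the anchor targets grow by the same factor.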

@zhouyuangan I am using the default hyper parameters. I'll check my dataset anyways.

@yhenon I tried modifying the scales, too. Sadly, it did not work.

@zhouyuangan I have two object classes, Cable wire and Floor Mat. RetinaNet learns Cable wire fairly well but learns almost nothing for Floor Mat (these are generally long and thin, and sometimes cover the entire width of the image). When I trained YOLO, I got a very strong response for Floor Mats; even Tiny YOLO v1 performs far better on my current dataset. Am I doing something wrong, or is RetinaNet just not designed for such elongated objects?

@IntelligentIndia7 I got it. In my opinion, there is one thing you need to confirm: whether the numbers of Cable wire and Floor Mat instances in your dataset are balanced. And you said YOLO performs well for Floor Mat? I think the reason is that YOLO's anchor hyperparameters suit localizing Floor Mat but not Cable wire. I used this repo for my text detection task; after I annotated 90K images the results became good enough, and I only changed the anchor ratios. Good luck to you.

@zhouyuangan Thank you for your reply. Yes, my dataset is pretty balanced. YOLO performs much better for Floor Mat and even for Cable wire. From the detections, what I observed is that RetinaNet detects small cable wires (coiled or small USB cables) that YOLO misses, but for long, nearly straight cable wires YOLO detects far better; in fact, RetinaNet misses them completely. For floor mats, RetinaNet detects those with broad bounding boxes (inclined so that a major portion is visible) but not the thin, straight ones (directly in front). My dataset was captured by a front-facing camera mounted on the base of a robot.

@IntelligentIndia7 Sorry for the late reply. I believe RetinaNet can detect long, thin objects if we set reasonable anchor hyperparameters. I would use debug.py in this repo to see whether the anchor shapes suit the ground truth. Try increasing the number of anchor ratios for the long, thin objects in your dataset, and increase the anchor scales and their number if needed. I think it should work, based on my empirical experience.
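
As a rough numeric stand-in for the visual check (this is not what debug.py does internally), the sketch below compares anchor shapes with ground-truth shapes by computing IoU with the box centres aligned. That is the best case for a given pair of shapes; real anchors sit on a stride grid, so actual overlaps are lower. The helper names and the 300 × 30 box are hypothetical, ratios are taken as height/width, and 0.5 is, to the best of my knowledge, the default overlap needed for an anchor to count as positive.

```python
import numpy as np

def anchor_shapes(sizes, ratios, scales):
    """All (width, height) anchor shapes, taking ratio = height / width."""
    shapes = []
    for size in sizes:
        for ratio in ratios:
            for scale in scales:
                area = (size * scale) ** 2
                width = np.sqrt(area / ratio)
                shapes.append((width, width * ratio))
    return np.array(shapes)

def best_shape_iou(gt_wh, shapes):
    """Best IoU of each ground-truth (w, h) against any anchor shape, centres aligned."""
    gt = np.asarray(gt_wh, dtype=np.float64)[:, None, :]  # (G, 1, 2)
    an = shapes[None, :, :]                                # (1, A, 2)
    inter = np.minimum(gt[..., 0], an[..., 0]) * np.minimum(gt[..., 1], an[..., 1])
    union = gt[..., 0] * gt[..., 1] + an[..., 0] * an[..., 1] - inter
    return (inter / union).max(axis=1)

sizes = [32, 64, 128, 256, 512]
scales = [2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]
gt_boxes = [(300, 30)]  # hypothetical wide, thin object: 300 px wide, 30 px tall

default_iou = best_shape_iou(gt_boxes, anchor_shapes(sizes, [0.5, 1.0, 2.0], scales))
tuned_iou = best_shape_iou(gt_boxes, anchor_shapes(sizes, [0.19, 0.32, 0.79], scales))
print(default_iou, tuned_iou)
```

For this made-up box the default ratios stay below the 0.5 mark even in the best case, while the wider ratios clear it. Running the same check over a whole annotation file gives a quick estimate of how many boxes can never receive a positive anchor.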

@zhouyuangan Hey! I think I figured out the problem. The repo you shared for calculating the aspect ratios computes ratios of x to y (i.e. width to height). But the ratios defined in the RetinaNet anchors are height to width (as I understood from different explanations). So when I inverted the ratios and started training, the model seems to learn both classes and gives better results. Correct me if I am wrong! And again, thanks for your help.
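
A worked example of the convention, assuming anchors are built the way generate_anchors in keras_retinanet/utils/anchors.py appears to build them: the area is fixed at (size × scale)² and then split according to the ratio. The symbols s (size), σ (scale) and r (ratio) are notation introduced here, not names from the code.

```latex
w = \frac{s\,\sigma}{\sqrt{r}}, \qquad
h = s\,\sigma\,\sqrt{r}, \qquad
\frac{h}{w} = r

% s = 32, \sigma = 1, r = 0.19:
w = \frac{32}{\sqrt{0.19}} \approx 73.4, \qquad h = 32\sqrt{0.19} \approx 13.9
\quad \text{(wide box)}

% s = 32, \sigma = 1, r = 1/0.19 \approx 5.26:
w \approx 13.9, \qquad h \approx 73.4
\quad \text{(tall box)}
```

Under this assumption, ratios below 1 produce wide anchors, so feeding in width/height values for wide objects would produce tall anchors instead, which is consistent with the poor Floor Mat results reported above.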

@zhouyuangan Hey, am I doing it right? The results are much improved: the precision of the floor mat class went from 6% to about 80%.

@IntelligentIndia7 Could you please share the new ratios that you used? I also have some long, narrow objects and I would like to see if I can detect them. Thank you in advance!

@Anesthesic Hey, these are the parameters I am currently using. Check whether they help you, or use the repo shared by @zhouyuangan to calculate the ratios, but invert them before replacing the values in anchors.py.

sizes   = [32, 64, 128, 256, 512],
strides = [8, 16, 32, 64, 128],
ratios  = np.array([0.19, 0.32, 0.79], keras.backend.floatx()),
scales  = np.array([2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)], keras.backend.floatx()),

Hi @IntelligentIndia7, I followed your recommendations and, based on @zhouyuangan's repo, calculated the ratios to better fit my annotations. You said you inverted the ratios; how did you do it?

I am getting: [0.77, 1.04, 3.0]

Are these the inverted ones: [1.29, 0.96, 0.33]?

Hey @mariaculman18, yes, I inverted the ratios as you have shown and it worked.
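
For reference, inverting here just means taking the reciprocal of each width/height value. A minimal sketch using the numbers quoted above (the variable names are made up for illustration):

```python
import numpy as np

# Ratios as reported by the clustering script: width / height.
width_over_height = np.array([0.77, 1.04, 3.0])

# The anchor ratios are taken to be height / width, so use the reciprocal.
height_over_width = 1.0 / width_over_height

print(np.round(height_over_width, 2))
```

From the two-decimal inputs this gives roughly 1.30, 0.96 and 0.33; the 1.29 quoted above presumably came from unrounded values. The resulting array is what would go into the ratios entry of the anchor parameters.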
