I'm not sure whether this still applies, since I'm using an older, heavily modified version of this repo, but it may still be helpful to someone. In detect.py, the bounding boxes are plotted in order from highest to lowest class-specific confidence score. As a result, the bounding box with the highest class-specific confidence score is plotted first and is then covered by lower-scoring boxes wherever they overlap.
My suggestion:
Reverse this order by adding the following code in detect.py right before the for-loop "for *xyxy, conf, cls in det:" (line 103 in the current repo state):
det = det.index_select(0, torch.LongTensor([idx for idx in np.arange(det.size(0)-1, -1, -1)]).to(device))
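To illustrate why the draw order matters, here is a minimal sketch in plain Python. It does not use detect.py or OpenCV; the paint and render helpers and the two detections are made up for illustration. Each box is filled onto a small character grid in list order, so whatever is drawn last wins on the overlapping cells.

```python
def paint(canvas, box, label):
    """Fill the box region with `label`, overwriting anything drawn earlier."""
    x1, y1, x2, y2 = box
    for y in range(y1, y2):
        for x in range(x1, x2):
            canvas[y][x] = label


def render(dets, width=6, height=4):
    """Draw detections in list order onto an empty grid (hypothetical helper)."""
    canvas = [["." for _ in range(width)] for _ in range(height)]
    for box, conf, label in dets:
        paint(canvas, box, label)
    return ["".join(row) for row in canvas]


# Made-up detections, sorted by confidence from highest to lowest.
dets = [
    ((0, 0, 4, 3), 0.90, "A"),  # high-confidence box
    ((2, 1, 6, 4), 0.40, "B"),  # low-confidence box overlapping A
]

before = render(dets)        # current order: B is drawn last and covers A
after = render(dets[::-1])   # reversed order: A is drawn last and stays on top

print("\n".join(before))
print()
print("\n".join(after))
```

In the first grid the low-confidence box B covers part of A; after reversing the list, A keeps the overlapping cells.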
@Deep-Learner that's an interesting observation, thanks for the feedback!
Would it be simpler to do this with reversed() or torch.flip()?
@glenn-jocher thanks, glad I could help :)
Yes, you're right.
det = torch.flip(det, [0])
is a lot cleaner and does the same. Thanks for your advice.
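For anyone who wants to double-check that the two variants really give the same result, here is a small sketch on CPU. The det tensor is made up for illustration and assumes the usual (x1, y1, x2, y2, conf, cls) row layout, already sorted by confidence from highest to lowest.

```python
import torch

# Made-up detections: one row per box, (x1, y1, x2, y2, conf, cls),
# sorted by confidence from highest to lowest.
det = torch.tensor([
    [10.0, 10.0, 50.0, 50.0, 0.90, 0.0],
    [20.0, 20.0, 60.0, 60.0, 0.75, 5.0],
    [30.0, 30.0, 70.0, 70.0, 0.40, 0.0],
])

# Original proposal: select the rows with explicitly reversed indices.
idx = torch.arange(det.size(0) - 1, -1, -1)  # tensor([2, 1, 0])
via_index_select = det.index_select(0, idx)

# Shorter alternative: flip along dimension 0 (the rows).
via_flip = torch.flip(det, [0])

# True if both tensors have the same shape and values.
same = torch.equal(via_index_select, via_flip)
print(same)
```

As far as I know, torch.flip returns a copy rather than a view, so the original det is left unchanged either way.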
@Deep-Learner yes, this is a really good idea actually. A nice simple change that provides a big improvement:
>>> pred[0]
tensor([[1.46275e+02, 2.38052e+02, 2.20637e+02, 5.12097e+02, 8.66369e-01, 0.00000e+00],
        [4.59652e+01, 2.33343e+02, 1.62515e+02, 5.33323e+02, 8.58993e-01, 0.00000e+00],
        [4.11506e+02, 2.34880e+02, 4.95996e+02, 5.26622e+02, 8.34650e-01, 0.00000e+00],
        [2.85534e+01, 1.28263e+02, 4.96050e+02, 4.56747e+02, 7.84867e-01, 5.00000e+00],
        [1.53441e+01, 3.33301e+02, 5.90303e+01, 5.20278e+02, 4.44571e-01, 0.00000e+00]])
>>> pred[0].flip(0)
tensor([[1.53441e+01, 3.33301e+02, 5.90303e+01, 5.20278e+02, 4.44571e-01, 0.00000e+00],
        [2.85534e+01, 1.28263e+02, 4.96050e+02, 4.56747e+02, 7.84867e-01, 5.00000e+00],
        [4.11506e+02, 2.34880e+02, 4.95996e+02, 5.26622e+02, 8.34650e-01, 0.00000e+00],
        [4.59652e+01, 2.33343e+02, 1.62515e+02, 5.33323e+02, 8.58993e-01, 0.00000e+00],
        [1.46275e+02, 2.38052e+02, 2.20637e+02, 5.12097e+02, 8.66369e-01, 0.00000e+00]])