YOLOv5: Model Ensembling Tutorial

Created on 8 Jul 2020 · 22 comments · Source: ultralytics/yolov5

🚀 This guide explains how to use model ensembling during testing and inference for improved mAP and Recall. From https://www.sciencedirect.com/topics/computer-science/ensemble-modeling:

Ensemble modeling is a process where multiple diverse models are created to predict an outcome, either by using many different modeling algorithms or using different training data sets. The ensemble model then aggregates the prediction of each base model and results in one final prediction for the unseen data. The motivation for using ensemble models is to reduce the generalization error of the prediction. As long as the base models are diverse and independent, the prediction error of the model decreases when the ensemble approach is used. The approach seeks the wisdom of crowds in making a prediction. Even though the ensemble model has multiple base models within the model, it acts and performs as a single model.
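The aggregation idea above can be sketched in a few lines of Python. The toy "models" below are hypothetical stand-ins for diverse base models, not YOLOv5 networks:

```python
# Toy base models: each maps an input to a score, with its own small bias.
# All names here are illustrative, not part of the YOLOv5 codebase.

def model_a(x):
    return 2.0 * x + 0.30

def model_b(x):
    return 2.0 * x - 0.10

def model_c(x):
    return 2.0 * x + 0.05

def ensemble_predict(models, x):
    """Aggregate each base model's prediction into one final prediction (mean)."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# The individual biases partially cancel, reducing the error of the mean.
print(ensemble_predict([model_a, model_b, model_c], 1.0))
```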

Before You Start

Clone this repo and install the requirements.txt dependencies, including Python>=3.8 and PyTorch>=1.6.

git clone https://github.com/ultralytics/yolov5 # clone repo
cd yolov5
pip install -r requirements.txt # install dependencies

Test Normally

Before ensembling, we want to establish the baseline performance of a single model. This command tests YOLOv5x on COCO val2017 at image size 640 pixels. yolov5x.pt is the largest and most accurate model available. Other options are yolov5s.pt, yolov5m.pt and yolov5l.pt, or your own checkpoint from training a custom dataset, e.g. ./weights/best.pt. For details on all available models please see our README table.

$ python test.py --weights yolov5x.pt --data coco.yaml --img 640

Output:

Namespace(augment=False, batch_size=32, conf_thres=0.001, data='./data/coco.yaml', device='', img_size=640, iou_thres=0.65, save_json=True, save_txt=False, single_cls=False, task='val', verbose=False, weights=['yolov5x.pt'])
Using CUDA device0 _CudaDeviceProperties(name='Tesla P100-PCIE-16GB', total_memory=16280MB)

Fusing layers... Model Summary: 284 layers, 8.89222e+07 parameters, 0 gradients
Scanning labels ../coco/labels/val2017.cache (4952 found, 0 missing, 48 empty, 0 duplicate, for 5000 images): 5000it [00:00, 17761.74it/s]
               Class      Images     Targets           P           R      [email protected]  [email protected]:.95: 100% 157/157 [02:34<00:00,  1.02it/s]
                 all       5e+03    3.63e+04       0.409       0.754       0.669       0.476
Speed: 23.6/1.6/25.2 ms inference/NMS/total per 640x640 image at batch-size 32

COCO mAP with pycocotools... saving detections_val2017__results.json...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.492 <---------- baseline mAP
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.676
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.534
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.318
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.541
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.376
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.616
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.670
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.493
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.723
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.812

Ensemble Test

Multiple pretrained models may be ensembled together at test and inference time by simply appending extra models to the --weights argument of any existing test.py or detect.py command. This example tests an ensemble of two models together:

  • YOLOv5x
  • YOLOv5l

$ python test.py --weights yolov5x.pt yolov5l.pt --data coco.yaml --img 640

Output:

Namespace(augment=False, batch_size=32, conf_thres=0.001, data='./data/coco.yaml', device='', img_size=640, iou_thres=0.65, save_json=True, save_txt=False, single_cls=False, task='val', verbose=False, weights=['yolov5x.pt', 'yolov5l.pt'])
Using CUDA device0 _CudaDeviceProperties(name='Tesla P100-PCIE-16GB', total_memory=16280MB)

Fusing layers... Model Summary: 284 layers, 8.89222e+07 parameters, 0 gradients  # Model 1
Fusing layers... Model Summary: 236 layers, 4.77901e+07 parameters, 0 gradients  # Model 2
Ensemble created with ['yolov5x.pt', 'yolov5l.pt']  # Ensemble Notice

Scanning labels ../coco/labels/val2017.cache (4952 found, 0 missing, 48 empty, 0 duplicate, for 5000 images): 5000it [00:00, 17883.26it/s]
               Class      Images     Targets           P           R      [email protected]  [email protected]:.95: 100% 157/157 [03:42<00:00,  1.42s/it]
                 all       5e+03    3.63e+04       0.402       0.764       0.677        0.48
Speed: 37.5/1.4/38.9 ms inference/NMS/total per 640x640 image at batch-size 32

COCO mAP with pycocotools... saving detections_val2017__results.json...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.496 <---------- improved mAP
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.684
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.538
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.323
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.548
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.377
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.615
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.670
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.495
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.723
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.815

Ensemble Inference

Append extra models to the --weights argument to run ensemble inference:

$ python detect.py --weights yolov5x.pt yolov5l.pt --img 640 --source ./inference/images/

Output:

Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', img_size=640, iou_thres=0.45, output='inference/output', save_txt=False, source='./inference/images/', update=False, view_img=False, weights=['yolov5x.pt', 'yolov5l.pt'])
Using CUDA device0 _CudaDeviceProperties(name='Tesla P100-PCIE-16GB', total_memory=16280MB)

Fusing layers... Model Summary: 284 layers, 8.89222e+07 parameters, 0 gradients  # Model 1
Fusing layers... Model Summary: 236 layers, 4.77901e+07 parameters, 0 gradients  # Model 2
Ensemble created with ['yolov5x.pt', 'yolov5l.pt']  # Ensemble Notice

image 1/2 inference/images/bus.jpg: 640x512 4 persons, 1 bicycles, 1 buss, Done. (0.073s)
image 2/2 inference/images/zidane.jpg: 384x640 3 persons, 3 ties, Done. (0.063s)
Results saved to inference/output
Done. (0.319s)

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):


All 22 comments

Can I use it in version 1?

What's the influence of model ensembling compared to test-time augmentation? Which one is better?

Well, I see: the model ensembling method is actually more like using a weaker model to find missed detections for a good model. In contrast, TTA can also find missed detections by changing the input, while still using the best model.

@Zzh-tju ensembling and TTA are not mutually exclusive. You can TTA a single model, and you can ensemble a group of models with or without TTA:

python detect.py --weights model1.pt model2.pt --augment

@Zzh-tju ensembling runs multiple models, while TTA tests a single model with different augmentations. Typically I've seen the best results when merging output grids directly (i.e. ensembling YOLOv5l and YOLOv5x), rather than simply appending boxes from multiple models for NMS to sort out. This is not always possible, however; for example, when ensembling an EfficientDet model with YOLOv5x you cannot merge grids, and you must use NMS or WBF (or Merge NMS) to get a final result.
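The two aggregation strategies described above can be sketched on toy outputs. Plain Python lists stand in for output tensors, and both function names are illustrative rather than repo API:

```python
def mean_ensemble(outputs):
    """Merge output grids elementwise. Only valid when every model's output
    has identical size and shape (e.g. YOLOv5l + YOLOv5x at the same --img)."""
    n = len(outputs)
    return [sum(vals) / n for vals in zip(*outputs)]

def nms_ensemble(outputs):
    """Append all predictions from all models and let NMS/WBF sort them out
    downstream. Works even when architectures (and output shapes) differ."""
    merged = []
    for out in outputs:
        merged.extend(out)
    return merged

m1 = [0.9, 0.1, 0.8]              # toy objectness scores from model 1
m2 = [0.7, 0.3, 0.6]              # toy objectness scores from model 2
merged = mean_ensemble([m1, m2])  # elementwise average, same length as inputs
pooled = nms_ensemble([m1, m2])   # all predictions pooled; NMS would dedupe later
```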


How can I ensemble EfficientDet D7 with YOLO V5x?

@Blaze-raf97 with the right amount of coffee anything is possible.

How to solve this problem?
COCO mAP with pycocotools... saving detections_val2017__results.json...
ERROR: pycocotools unable to run: invalid literal for int() with base 10: 'Image_20200930140952222'

@LokedSher pycocotools is only intended for mAP on COCO data using coco.yaml. https://pypi.org/project/pycocotools/

Thanks for your reply!

@LokedSher I also encountered the same problem as you, but after reading your Q&A I still don't know what to improve to get the results the author shows.

I want to ensemble yolov3-spp and yolov5x models trained with your excellent yolov3 and yolov5 repos. A few months ago I got the ensemble to work, but when I try it again now I hit an error. Can you help me? Thanks!

python detect.py --weights runs/train/exp13/weights/best.pt /home/work/pretrained_weights/best.pt --source /home/work/data

Fusing layers...
Model Summary: 484 layers, 88390614 parameters, 0 gradients
Traceback (most recent call last):
  File "detect.py", line 172, in <module>
    detect()
  File "detect.py", line 33, in detect
    model = attempt_load(weights, map_location=device)  # load FP32 model
  File "/home/work/xunuo/yolov5/yolov5/models/experimental.py", line 137, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
AttributeError: 'collections.OrderedDict' object has no attribute 'float'

@PromiseXu1 we are currently updating the YOLOv3 models and should have them available for autodownload within this repo by the end of the week. In the meantime, if you have an existing YOLOv3 model that runs inference correctly with this repo, you can switch the ensemble type to an NMS ensemble to allow it to ensemble with the YOLOv5 models. v3 and v5 models have different heads (FPN and PANet), so the current ensemble method will not work, as it expects every output to have the same size and shape. You can modify the ensemble type in the Ensemble() class:
https://github.com/ultralytics/yolov5/blob/201bafc7cf9545362552ad5b0fc5033d64d7ae77/models/experimental.py#L117-L130
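As a rough illustration of that switch, here is a hedged pure-Python sketch (not the repo's actual torch code) of an ensemble container whose forward() either averages outputs elementwise, which requires identical output shapes, or concatenates them for NMS to resolve, which is shape-agnostic:

```python
class Ensemble(list):
    """Illustrative stand-in for the repo's Ensemble(nn.ModuleList)."""

    def __init__(self, mode="nms"):
        super().__init__()
        self.mode = mode  # "mean" needs identical shapes; "nms" does not

    def forward(self, x):
        y = [model(x) for model in self]  # run every member model
        if self.mode == "mean":
            # elementwise mean over models (analogue of torch.stack(y).mean(0))
            return [sum(vals) / len(y) for vals in zip(*y)]
        # concatenate all predictions for NMS to sort out (analogue of torch.cat(y, 1))
        return [p for out in y for p in out]

# Stand-in "models" with different output sizes, as with different heads:
ens = Ensemble(mode="nms")
ens.append(lambda x: [x, x + 1])  # two predictions
ens.append(lambda x: [x * 2])     # one prediction
print(ens.forward(3))             # -> [3, 4, 6]
```

With mode="mean" the same call would fail to line up outputs of different lengths, which mirrors why models with different heads must use the NMS ensemble.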


Wow! Thanks for your prompt response. I will try it, and I wish you all the best in your work.
