Yolov5: Training very very slowly

Created on 29 Oct 2020  ·  11 Comments  ·  Source: ultralytics/yolov5

❔ Question

Training is very, very slow, and GPU-Util reported by nvidia-smi is always 0, although GPU memory usage is about 20 GB+.
Is this normal?

Additional context

Here is my environment:

  • YOLOv5 version: 83deec
  • Python: 3.8
  • CUDA: 10.1
  • cuDNN: 7.6.3
  • PyTorch: 1.6.0
  • GPU: Tesla V100 (32 GB memory)

I train yolov5m on 20k+ images, and GPU usage is always 0.

All 11 comments

Hello @mengban, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook Open In Colab, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@mengban GPU utilization should be about 90% when running nvidia-smi. You may have environment problems. I would recommend the Docker Image as an easy way to reproduce our environment while fully exploiting your hardware.

Please ensure you meet all dependency requirements if you are attempting to run YOLOv5 locally. If in doubt, create a new virtual Python 3.8 environment, clone the latest repo (code changes daily), and pip install -r requirements.txt again. We also highly recommend using one of our verified environments below.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.6. To install run:

$ pip install -r requirements.txt
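Once installed, a quick sanity check (a minimal sketch, not part of the repo) can confirm that the interpreter meets the Python 3.8 requirement and that PyTorch can actually see a CUDA device; if it cannot, training falls back to CPU, which matches the 0% GPU utilization described above:

```python
# Environment sanity check -- a minimal sketch, not part of the YOLOv5 repo.
# The torch import is guarded in case requirements are not yet installed.
import sys

assert sys.version_info >= (3, 8), "YOLOv5 requires Python 3.8 or later"

try:
    import torch
    print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")
except ImportError:
    print("PyTorch not installed -- run: pip install -r requirements.txt")
```

If `torch.cuda.is_available()` prints False, PyTorch cannot see the GPU at all, and fixing the CUDA/driver install should come before any dataloader tuning.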

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

  • Google Colab Notebook with free GPU: Open In Colab
  • Kaggle Notebook with free GPU: https://www.kaggle.com/ultralytics/yolov5
  • Google Cloud Deep Learning VM. See GCP Quickstart Guide
  • Docker Image https://hub.docker.com/r/ultralytics/yolov5. See Docker Quickstart Guide

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu.

Thanks for your reply. I reinstalled the packages with pip install -r requirements.txt, and my problem still exists.
I also find that the 8 dataloader workers keep the CPU at nearly 100%, so I think it may be caused by my dataset. My images are about 3000×4000 pixels, some even 6000×4000, and a single image can contain 100+ boxes. I suspect the CPU can't feed data to the GPU in time, which slows down the whole training process. What do you think?
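One common mitigation for CPU-bound dataloading like this is to resize the oversized source images offline, once, before training. A sketch of such a step is below; `resize_dataset` is a hypothetical helper, not part of YOLOv5, and it assumes Pillow is installed and labels are in normalized YOLO format (coordinates in [0, 1]), so they stay valid after resizing and need no rewriting:

```python
# Offline pre-resize of oversized training images -- a hypothetical helper,
# not part of YOLOv5. Only .jpg files are handled in this sketch. YOLO labels
# are normalized to [0, 1], so they remain valid after resizing.
from pathlib import Path

from PIL import Image


def resize_dataset(src_dir: str, dst_dir: str, max_side: int = 1280) -> int:
    """Shrink every image whose longest side exceeds max_side; return count resized."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    n = 0
    for p in Path(src_dir).glob("*.jpg"):
        im = Image.open(p)
        scale = max_side / max(im.size)
        if scale < 1.0:  # only downscale, never upscale
            im = im.resize((round(im.width * scale), round(im.height * scale)))
            n += 1
        im.save(dst / p.name, quality=90)
    return n
```

Training then points at the resized copies. Since YOLOv5 letterboxes inputs down to --img-size anyway (640 by default), decoding a 1280-pixel JPEG is far cheaper for the dataloader workers than decoding a 6000×4000 one.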

@mengban both CPU and GPU utilization should be 90-100%. 8 --workers is the default; you're free to vary this as you see fit.

As I said, try the Docker image.

Docker usage link, https://github.com/ultralytics/yolov5/wiki/Docker-Quickstart

sudo docker run --ipc=host --gpus all -it -v "$(pwd)"/yourDirectory:/usr/src/yourDirectory ultralytics/yolov5:latest

Replace 'yourDirectory' with the directory you want to use inside the YOLOv5 Docker container.


thanks, bro. I'll have a try.

+1. In the Docker container, the yolov5 directory is placed at /usr/src/app.

So where do you see your GPU utilization? I don't see it while training.

@SiyangXie
Run one of these commands in a terminal:

$ nvidia-smi
$ watch nvidia-smi
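For scripted monitoring, nvidia-smi also has a machine-readable CSV query mode, e.g. `nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader`. The parser below is a small sketch (not part of YOLOv5) for one line of that output:

```python
# Parse one line of `nvidia-smi --query-gpu=utilization.gpu,memory.used
# --format=csv,noheader` output -- a small sketch, not part of YOLOv5.
# A line looks like: "98 %, 20480 MiB"
def parse_gpu_stats(line: str) -> dict:
    util, mem = (field.strip() for field in line.split(","))
    return {
        "util_pct": int(util.rstrip(" %")),   # "98 %" -> 98
        "mem_mib": int(mem.rstrip(" MiB")),   # "20480 MiB" -> 20480
    }


print(parse_gpu_stats("98 %, 20480 MiB"))  # {'util_pct': 98, 'mem_mib': 20480}
```

A utilization stuck at 0 with high memory use, as in this thread, is the classic signature of a starved GPU waiting on the dataloader.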

@SiyangXie @dongjuns yes the nvidia-smi command is the best way to monitor GPU stats.

W&B logging is also a new option for monitoring GPU utilization: it plots utilization, temperature, and CUDA memory over your full training run. Here are stats for a COCO128 YOLOv5x training with a V100 on Colab Pro. We are putting together tutorials this week for our recent W&B integration.
[Screenshot, 2 Nov 2020: W&B charts of GPU utilization, temperature, and memory]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
