Hey, I get a new error when I run the train script:
Downloading https://drive.google.com/uc?export=download&id=158g62Vs14E3aj7oPVPuEnNZMKFNgGyNq as weights/ultralytics49.pt... Done (2.8s)
Traceback (most recent call last):
File "train.py", line 444, in <module>
train() # train normally
File "train.py", line 111, in train
chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
File "train.py", line 111, in <dictcomp>
chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
KeyError: 'module_list.85.Conv2d.weight'
I am having a very similar issue:
File "train.py", line 111, in <dictcomp>
chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
KeyError: 'module_list.85.Conv2d.weight'
I think something is wrong with my custom .cfg and/or .data file, because when I do a sanity check with the default files I get:
'No labels found. Recommend correcting image and label paths.'
AssertionError: No labels found. Recommend correcting image and label paths.
Please see, "Train On Custom Data" - https://github.com/ultralytics/yolov3/issues/621
Did you check the coco.data file? And your .cfg file should have nothing to do with this.
The easiest way to fix this is to make sure that you have a directory called 'labels' inside your data directory. In this directory you place all the labels, for both the training and test/validation images.
Also make sure that the path names of your images are correct. I have found relative paths to work better than full paths.

Nope still broken
Why aren't there instructions on simply running your own images through it, while using coco/yolo, and getting some metrics like mAP and false positives and negatives? I can't believe the docs have made it this hard. I'm willing to rewrite them if I can figure this out.
@alontrais @joehoeller thank you for your interest in our work! Please note that most technical problems are due to:
git clone version of this repository we can not debug it. Before going further run this code and ensure your issue persists:
sudo rm -rf yolov3 # remove existing repo
git clone https://github.com/ultralytics/yolov3 && cd yolov3 # git clone latest
python3 detect.py # verify detection
python3 train.py # verify training (a few batches only)
# CODE TO REPRODUCE YOUR ISSUE HERE
train_batch0.jpg and test_batch0.jpg for a sanity check of training and testing data.
If none of these apply to you, we suggest you close this issue and raise a new one using the Bug Report template, providing screenshots and minimum viable code to reproduce your issue. Thank you!
@alontrais I had a similar error before, and I figured it out. The cause of the error on my end was that I used yolov3.cfg as my config but kept the default weights file 'ultralytics49.pt', and the two do not match.
If you want to use the default weights, you can use yolov3-spp.cfg as a baseline and modify the corresponding filters/num_class as instructed.
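For reference, in the standard YOLOv3 layout each detection layer predicts 3 anchors per grid cell, so the convolution directly before each [yolo] block needs filters = 3 * (classes + 5). A minimal sketch of that arithmetic follows; the function name is made up for illustration and is not part of the repo:

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the conv layer feeding a [yolo] layer.

    Each anchor predicts 4 box coordinates, 1 objectness score and
    one score per class, hence (num_classes + 5) values per anchor.
    """
    return anchors_per_scale * (num_classes + 5)


print(yolo_filters(80))  # COCO with 80 classes -> 255
print(yolo_filters(1))   # single-class custom dataset -> 18
```

If the cfg's filters value does not match the class count in this way, the detection layers will not line up with your labels.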
@glenn-jocher I followed your instructions:
sudo rm -rf yolov3 # remove existing repo
git clone https://github.com/ultralytics/yolov3 && cd yolov3 # git clone latest
python3 detect.py # verify detection
python3 train.py # verify training (a few batches only)
I get this when I run train.py:
line 374, in __init__
assert nf > 0, 'No labels found. Recommend correcting image and label paths.'
AssertionError: No labels found. Recommend correcting image and label paths.
python3 detect.py works just fine.
../coco/trainvalno5k.txt.
@joehoeller you need the coco dataset to run the training examples:
$ bash yolov3/data/get_coco_dataset_gdrive.sh
Yes, I already did that. Is there something that needs to be done to the labels, other than putting them in the /data folder? For example, should they be in the nested folders they came in, as @Fransisco stated above? (The labels were copied, so their original path is still intact.)
@joehoeller nothing needs to be done to the labels. You just git clone the repo, copy the coco dataset and train. You can even follow the notebook, just click play in each cell.
https://colab.research.google.com/drive/1G8T-VFxQkjDe4idzN8F-hbIBqkkkQnxw
I still get the error. Why did you close it?
On Sun, Nov 24, 2019 at 3:45 PM Glenn Jocher notifications@github.com
wrote:
@joehoeller https://github.com/joehoeller nothing needs to be done to
the labels. You just git clone the repo, copy the coco dataset and train.
You can even follow the notebook, just click play in each cell.
https://colab.research.google.com/drive/1G8T-VFxQkjDe4idzN8F-hbIBqkkkQnxw
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/ultralytics/yolov3/issues/650?email_source=notifications&email_token=ABHVQHEMX4ELZI3NHDSXMOTQVLYX7A5CNFSM4JQ3CBSKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFAVVUQ#issuecomment-557931218,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/ABHVQHG4GM62Y76WBJX7SKTQVLYX7ANCNFSM4JQ3CBSA
.
@joehoeller your error is not reproducible, there's no bug. Follow the steps, everything works properly.
That is false sir, because I did. And I get the error for the labels as
shown.
@joehoeller To get started simply run the following in a terminal, or open the notebook and click play on the first cells (same code):
https://colab.research.google.com/drive/1G8T-VFxQkjDe4idzN8F-hbIBqkkkQnxw
rm -rf yolov3 coco coco.zip # WARNING: remove existing
git clone https://github.com/ultralytics/yolov3 # clone
bash yolov3/data/get_coco_dataset_gdrive.sh # copy COCO2014 dataset (19GB)
cd yolov3
python3 train.py
How many times do I have to tell you I did that? I'm moving on to build my own solution, which I can do; I was just hoping to save time.
Don't be rude! Instead of complaining, you need to embrace the spirit of collaboration. This is the best public PyTorch implementation. Contribute to making it better.
FYI: if you are not on a notebook and you want to run this, I would advise that you follow the setup made by
bash get_coco_dataset.sh
That gives you the correct folder structure.
Let me make this more clear for you since you do not understand:
I followed the steps exactly as stated. Then I got the error message about the labels, which still persists.
So no, it's not the best. I'm making my own so I don't waste any more time.
I just thought I could save time using this, and clearly I was wrong.
@joehoeller if the default code I sent you works in your environment, then use that as a starting point for your own development efforts. You simply mimic the coco data format with your own data. All of the info, including step by step directions and code to reproduce are in the custom training example in the wiki.
https://github.com/ultralytics/yolov3/wiki
It does not, for the last time. How many times do I have to tell you? Scroll up and read the label error, because that's what I get after I ran the commands exactly as given in your instructions.
Actually, don't bother, because I'm already hooking up analytics and metrics to my own solution, which I've built in Torch with TensorBoard.
I got the same error,
Traceback (most recent call last):
File "train.py", line 444, in <module>
train() # train normally
File "train.py", line 111, in train
chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
File "train.py", line 111, in <dictcomp>
chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
KeyError: 'module_list.85.Conv2d.weight'
I have tried the suggested steps, but nothing worked out. https://github.com/ultralytics/yolov3/issues/650#issuecomment-557939734
So sad! The same error!
File "train.py", line 444, in <module>
train() # train normally
File "train.py", line 111, in train
chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
File "train.py", line 111, in <dictcomp>
chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
@Samjith888 @inspire-lts @joehoeller see https://github.com/ultralytics/yolov3/issues/657
This error is caused by a user supplying incompatible --weights and --cfg arguments. To solve this you must specify no weights (i.e. random initialization of the model) using --weights '' and any --cfg, or use a --cfg that is compatible with your --weights. If none are specified, the defaults are --weights ultralytics49.pt and --cfg cfg/yolov3-spp.cfg.
Examples of compatible combinations are:
python3 train.py --weights yolov3.pt --cfg cfg/yolov3.cfg
python3 train.py --weights yolov3.weights --cfg cfg/yolov3.cfg
python3 train.py --weights yolov3-spp.pt --cfg cfg/yolov3-spp.cfg
python3 train.py --weights ultralytics49.pt --cfg cfg/yolov3-spp.cfg
python3 train.py --weights '' --cfg cfg/*.cfg # any cfg will work here
ultralytics49.pt is currently the highest performing YOLOv3 model (trained from scratch using this repo) available at the default img-size of 416 (see https://github.com/ultralytics/yolov3/issues/310), which is the reason it is used as the default backbone.
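To illustrate why the mismatch surfaces as a KeyError: the comprehension on line 111 of train.py looks up every checkpoint key in model.state_dict(), so a key such as 'module_list.85.Conv2d.weight' that the chosen cfg never creates fails on lookup before the sizes are even compared. Below is a rough sketch of a more defensive filter. It assumes only that tensors expose numel(); FakeTensor and filter_compatible are hypothetical stand-ins for illustration, not code from the repo.

```python
class FakeTensor:
    """Stand-in for torch.Tensor; only numel() is needed here."""

    def __init__(self, n):
        self.n = n

    def numel(self):
        return self.n


def filter_compatible(chkpt_sd, model_sd):
    """Keep checkpoint entries whose key exists in the model and whose
    element count matches; report the rest instead of raising KeyError."""
    kept, skipped = {}, []
    for k, v in chkpt_sd.items():
        if k in model_sd and model_sd[k].numel() == v.numel():
            kept[k] = v
        else:
            skipped.append(k)
    return kept, skipped


# Checkpoint saved from a different architecture than the cfg builds.
chkpt = {
    'module_list.0.Conv2d.weight': FakeTensor(864),
    'module_list.85.Conv2d.weight': FakeTensor(131072),
}
model = {'module_list.0.Conv2d.weight': FakeTensor(864)}

kept, skipped = filter_compatible(chkpt, model)
print(sorted(kept))  # keys that can be loaded
print(skipped)       # keys with no counterpart in the model
```

Skipping keys silently can hide a wrong --weights/--cfg pair, so using a compatible combination as listed above remains the actual fix.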
So for the last time, what does this mean and how do I fix it:
assert nf > 0, 'No labels found. Recommend correcting image and label paths.'
A lot of this could be resolved if there were better docs and tutorials, with some minor improvements in the code.
How can we work together to make this happen?
They were pretty clear in their meaning, but it took me a second try to get my labels working. I am going to write a Medium article on how to use the Ultralytics model to train better, and a wiki page on distributed computing.
It is a great model, with some amazing work behind it. I feel that if we all contribute it can be the top YOLOv3 model.
For sure, I have some ideas and some more ppl in CV space willing to help
as well.
Send me a message and we can collaborate on the article. Or add me on LinkedIn.
Will do. Thanks!
I just updated the mAP section of the README with the latest results. We're making good progress on training. Earlier in the year we were behind darknet, now we are ahead in most metrics, using the same yolov3-spp.cfg architecture. The best results now are from ultralytics68.pt, which I should have up on the Google Drive folder soon.
https://github.com/ultralytics/yolov3#map
 | mAP@0.5:0.95 | mAP@0.5:0.95 | mAP@0.5:0.95
--- | --- | --- | ---
darknet YOLOv3-tiny | 14.0 | 16.0 | 16.6
darknet YOLOv3 | 28.7 | 31.1 | 33.0
darknet YOLOv3-SPP | 30.5 | 33.9 | 37.0
ultralytics YOLOv3-SPP | 35.2 | 38.8 | 40.4
Yes a medium article and better docs would be great! I don't have much time unfortunately though, between running different trainings and developing/debugging.
How do we correct the image/label paths? I have them, but it is not clear where to set them up.
AssertionError: No labels found. Recommend correcting image and label paths.
Send me a message and we can collaborate on the article. Or add me on LinkedIn.
Message is in your LinkedIn inbox. I built the automation tool, which I now call "Dark Chocolate"; it converts COCO annotations to the Darknet annotation format.
@joehoeller coco.data points to the train.txt and test.txt list of images on lines 2 and 3.
[image: https://user-images.githubusercontent.com/26833433/69908986-7402de80-13a8-11ea-862c-c75765f5d790.png]
These files have lists of image paths as they would be from the yolov3 directory:
[image: https://user-images.githubusercontent.com/26833433/69908992-81b86400-13a8-11ea-8a8f-4b4f41476b89.png]
If in doubt, you can run python3 train.py in debug mode, and put a breakpoint on this line to see what values the img_files are. If there are no images there, or if there are no labels in the corresponding labels folder (by replacing /images/ with /labels/ in the image paths) you will get this error message.
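If a debugger is not handy, the same check can be approximated with a small script. This is only a sketch under the convention described above (the label file is found by replacing /images/ with /labels/ and swapping the extension for .txt); label_path_for and check_list_file are hypothetical helper names, and the demo builds a throwaway dataset in a temporary directory.

```python
import os
import tempfile


def label_path_for(image_path):
    """Infer the label file: swap the images folder for labels and
    the image extension for .txt."""
    swapped = image_path.replace(os.sep + 'images' + os.sep,
                                 os.sep + 'labels' + os.sep)
    return os.path.splitext(swapped)[0] + '.txt'


def check_list_file(list_file):
    """Return (missing_images, missing_labels) for the image paths
    listed one per line in list_file."""
    missing_images, missing_labels = [], []
    with open(list_file) as f:
        paths = [line.strip() for line in f if line.strip()]
    for p in paths:
        if not os.path.isfile(p):
            missing_images.append(p)
        if not os.path.isfile(label_path_for(p)):
            missing_labels.append(p)
    return missing_images, missing_labels


# Tiny self-contained demo in a temporary directory.
root = tempfile.mkdtemp()
img_dir = os.path.join(root, 'ds1', 'images')
lbl_dir = os.path.join(root, 'ds1', 'labels')
os.makedirs(img_dir)
os.makedirs(lbl_dir)
for name in ('a.jpg', 'b.jpg'):
    open(os.path.join(img_dir, name), 'w').close()
# Only a.jpg gets a label, so b.jpg should be reported.
with open(os.path.join(lbl_dir, 'a.txt'), 'w') as f:
    f.write('0 0.5 0.5 0.2 0.2\n')
list_file = os.path.join(root, 'train.txt')
with open(list_file, 'w') as f:
    f.write(os.path.join(img_dir, 'a.jpg') + '\n')
    f.write(os.path.join(img_dir, 'b.jpg') + '\n')

missing_images, missing_labels = check_list_file(list_file)
print('missing images:', missing_images)
print('images without labels:', missing_labels)
```

Relative paths in the list file are resolved against the current working directory, so for a real dataset the script would need to be run from the yolov3 folder, just like train.py.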
These are my own images and my own annotations (in Darknet format).
The images you show are just the paths to the images. Not the labels.
I will be uploading a reader for this in case you have a custom dataset. All I can say is that it's better to use the full paths of the images, so the computer knows where to find the images/labels.
@joehoeller the same structure is used for custom data as for coco. The labels need to be in a separate folder next to the images folder. The labels folder must be found simply by replacing /images/ with /labels/ in the image folder path, as in this custom "dataset1" (ds1). Each label file name is identical to its image name, except that the extension for labels is *.txt. This example trains on the first 8 images of the dataset, and tests on the last 2.
The paths all need to be relative to your yolov3 folder (or absolute paths, though these break more easily if you move the code to a different environment).

Then run:
cd yolov3
python3 train.py --data ../data/ds1/out.data
BTW @FranciscoReveriano @joehoeller this is legacy structure from darknet, so the same exact data can also be used to train darknet.
This repo now outperforms darknet by a wide margin I believe, but nevertheless darknet has a strong following (i.e. pjreddie/darknet has 15k stars, AlexeyAB/darknet has 6k stars), so I'm not sure if we should keep following the darknet convention, or perhaps start from a clean-slate mentality about what would be easiest for the most people to train their own custom data with a minimum of hassle.
In principle this repo is here to create the most accurate, fastest object detector in the world. In practice though, people seem to care more about quick results and ease of use, and don't care as much about being the best or the fastest.
I think we need to continue with the Darknet convention. I guess people still follow it because it provides a nice benchmark with a lot of literature. Although I don't think machine learning or object detection should be 'people'-proof. At some point people should be expected to go through the learning curve. It seems like a lot of people just want quick fixes.
Although it might not be a bad idea to make a version of Facebook's Detectron 2 that could be sold. That would be the best way to start from a clean slate, in my opinion.
I agree. One of my biggest rants is how 98% of Git repos in the CV space are awful and basic. People need to learn the concepts as well as the math. (I took the Udacity CV course and recommend it because it dives deep into Torch and math.) However, for small one-offs, things like Darknet are perfect.
For me, the problem is when people ask you to interpret or figure out their results, or to tell them how to make them much better. This is GitHub, not ResearchGate. I was looking for a Udacity course to take this break. I might do that CV course.
Most of my experience is with TensorFlow and Keras. Trying to move to Torch like the rest of us.
COCO JSON to Darknet/YOLOv3 annotation conversion tool, see readme for
instructions and how to validate:
https://github.com/joehoeller/Dark-Chocolate
@glenn-jocher you show paths for images but not labels. I have been doing all of this already, and just like others in this thread it continues to fail.
@joehoeller the label paths are inferred automatically by replacing /images/ with /labels/ in the image paths. You only need to specify image paths.
The labelfile definition happens here.
This tutorial, https://github.com/ultralytics/yolov3/wiki/Train-Custom-Data , says:
I HAVE TRIED ALL OF THE SUGGESTIONS ABOVE AND STILL GET:
assert nf > 0, 'No labels found. Recommend correcting image and label paths.'
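This script will generate a file listing the paths to your images:
import os

given_dir = 'PATH_TO_CUSTOM_IMAGES'  # folder containing your images
# Write one image path per line; the with-block closes the file.
with open('FILE_NAME.txt', 'w') as f:
    for name in sorted(os.listdir(given_dir)):
        f.write(os.path.join(given_dir, name) + '\n')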
I got it going. Now I have a CUDA memory error, but that's a "me" problem, not a "you" problem. I will write a very clear and concise tutorial for Medium when I am done.
Yes, I think the default training settings should probably use a smaller batch size. The current settings should work fine for a 1080Ti or 2080Ti and up (11GB of CUDA memory), but smaller graphics cards may run out.
The current default is --batch-size 32 --accumulate 2 to get to an effective batch size of 64. I think I should reduce this to --batch-size 16 --accumulate 4 so that the largest number of people can run smoothly without CUDA out-of-memory issues. The performance hit (from batch-normalizing fewer images at a time) is not very large.
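The reasoning behind the two settings being interchangeable can be sketched numerically: the effective batch size is batch size times accumulation steps, and for a loss averaged over samples, accumulating gradients over 4 mini-batches of 16 gives the same gradient as one batch of 64. The toy loss and all names below are illustrative assumptions in plain Python, not the repo's training loop.

```python
import random


def grad_sum(w, batch):
    """Sum of d/dw (w*x - y)^2 over the samples in batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)


random.seed(0)
data = [(random.random(), random.random()) for _ in range(64)]
w = 0.3

# Gradient of the mean loss over all 64 samples at once.
full = grad_sum(w, data) / len(data)

# Same 64 samples as 4 accumulated mini-batches of 16.
batch_size, accumulate = 16, 4
acc = 0.0
for i in range(accumulate):
    chunk = data[i * batch_size:(i + 1) * batch_size]
    acc += grad_sum(w, chunk)
acc /= batch_size * accumulate

print(batch_size * accumulate)  # effective batch size
print(abs(full - acc) < 1e-9)   # gradients agree up to rounding
```

This equivalence does not extend to batch-norm statistics, which are computed per mini-batch; that is the small performance hit mentioned above.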
Ok, this should do it: https://github.com/ultralytics/yolov3/commit/93a70d958a1138b082f7b5c29c550b7d383f56f3
If you git pull you can get all the latest updates.
That's weird because I have a 2080Ti.
I would have thought 11GB is fine.
Thanks, did the git pull, and is working just fine.
How long does it take to train normally?
Note, I ran:
python3 train.py --data data/custom.data --cfg cfg/yolov3-spp.cfg --weights weights/yolov3-spp.weights
@joehoeller training speeds are here.
https://github.com/ultralytics/yolov3#speed
Roughly a week to train COCO. Smaller datasets faster of course.
@joehoeller NVIDIA Apex speeds things up a lot. This repo uses it automatically if it is installed.
https://github.com/NVIDIA/apex
Cool, thanks. By the way, how do we extract metrics like mAP, false positives, etc. from a detect command like this:
python3 detect.py --source ./coco/images/FLIR_Dataset/training/Data/ --nms-thres 0.7 --conf-thres 0.6 --data data/custom.data --cfg cfg/yolov3-spp.cfg --weights weights/yolov3-spp.weights
@joehoeller nvidia apex speeds things up a lot. This repo uses it automatically if it is installed.
https://github.com/NVIDIA/apex
Check out my PyTorch/Anaconda/TensorRT container on my GitHub; TensorRT does the same thing :)
@joehoeller to get test metrics run python3 test.py with the same dataset and model you trained on.
$ python3 test.py --weights ultralytics68.pt --img-size 512 --device 0
Namespace(batch_size=16, cfg='cfg/yolov3-spp.cfg', conf_thres=0.001, data='data/coco.data', device='0', img_size=512, iou_thres=0.5, nms_thres=0.5, save_json=False, weights='ultralytics68.pt')
Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=10989MB)
Downloading https://drive.google.com/uc?export=download&id=1Jm8kqnMdMGUUxGo8zMFZMJ0eaPwLkxSG as ultralytics68.pt... Done (7.6s)
Class Images Targets P R mAP@0.5 F1: 100%|██████████| 313/313 [08:30<00:00, 1.35it/s]
all 5e+03 3.58e+04 0.0823 0.798 0.595 0.145
person 5e+03 1.09e+04 0.0999 0.903 0.771 0.18
bicycle 5e+03 316 0.0491 0.782 0.56 0.0925
car 5e+03 1.67e+03 0.0552 0.845 0.646 0.104
motorcycle 5e+03 391 0.11 0.847 0.704 0.194
airplane 5e+03 131 0.099 0.947 0.878 0.179
bus 5e+03 261 0.142 0.874 0.825 0.244
train 5e+03 212 0.152 0.863 0.806 0.258
truck 5e+03 352 0.0849 0.682 0.514 0.151
boat 5e+03 475 0.0498 0.787 0.504 0.0937
traffic light 5e+03 516 0.0304 0.752 0.516 0.0584
fire hydrant 5e+03 83 0.144 0.916 0.882 0.248
stop sign 5e+03 84 0.0833 0.917 0.809 0.153
parking meter 5e+03 59 0.0607 0.695 0.611 0.112
bench 5e+03 473 0.0294 0.685 0.363 0.0564
bird 5e+03 469 0.0521 0.716 0.524 0.0972
cat 5e+03 195 0.252 0.908 0.78 0.395
dog 5e+03 223 0.192 0.883 0.829 0.315
horse 5e+03 305 0.121 0.911 0.843 0.214
sheep 5e+03 321 0.114 0.854 0.724 0.201
cow 5e+03 384 0.105 0.849 0.695 0.187
elephant 5e+03 284 0.184 0.944 0.912 0.308
bear 5e+03 53 0.358 0.925 0.875 0.516
zebra 5e+03 277 0.176 0.935 0.858 0.297
giraffe 5e+03 170 0.171 0.959 0.892 0.29
backpack 5e+03 384 0.0426 0.708 0.392 0.0803
umbrella 5e+03 392 0.0672 0.878 0.65 0.125
handbag 5e+03 483 0.0238 0.629 0.242 0.0458
tie 5e+03 297 0.0419 0.805 0.599 0.0797
suitcase 5e+03 310 0.0823 0.855 0.628 0.15
frisbee 5e+03 109 0.126 0.872 0.796 0.221
skis 5e+03 282 0.0473 0.748 0.454 0.089
snowboard 5e+03 92 0.0579 0.804 0.559 0.108
sports ball 5e+03 236 0.057 0.733 0.622 0.106
kite 5e+03 399 0.087 0.852 0.645 0.158
baseball bat 5e+03 125 0.0496 0.776 0.603 0.0932
baseball glove 5e+03 139 0.0511 0.734 0.563 0.0956
skateboard 5e+03 218 0.0655 0.844 0.73 0.122
surfboard 5e+03 266 0.0709 0.827 0.651 0.131
tennis racket 5e+03 183 0.0694 0.858 0.759 0.128
bottle 5e+03 966 0.0484 0.812 0.513 0.0914
wine glass 5e+03 366 0.0735 0.738 0.543 0.134
cup 5e+03 897 0.0637 0.788 0.538 0.118
fork 5e+03 234 0.0411 0.662 0.487 0.0774
knife 5e+03 291 0.0334 0.557 0.292 0.0631
spoon 5e+03 253 0.0281 0.621 0.307 0.0537
bowl 5e+03 620 0.0624 0.795 0.514 0.116
banana 5e+03 371 0.052 0.83 0.41 0.0979
apple 5e+03 158 0.0293 0.741 0.262 0.0564
sandwich 5e+03 160 0.0913 0.725 0.522 0.162
orange 5e+03 189 0.0382 0.688 0.32 0.0723
broccoli 5e+03 332 0.0513 0.88 0.445 0.097
carrot 5e+03 346 0.0398 0.766 0.362 0.0757
hot dog 5e+03 164 0.0958 0.646 0.494 0.167
pizza 5e+03 224 0.0886 0.875 0.699 0.161
donut 5e+03 237 0.0925 0.827 0.64 0.166
cake 5e+03 241 0.0658 0.71 0.539 0.12
chair 5e+03 1.62e+03 0.0432 0.793 0.489 0.0819
couch 5e+03 236 0.118 0.801 0.584 0.205
potted plant 5e+03 431 0.0373 0.852 0.505 0.0714
bed 5e+03 195 0.149 0.846 0.693 0.253
dining table 5e+03 634 0.0546 0.82 0.49 0.102
toilet 5e+03 179 0.161 0.95 0.81 0.275
tv 5e+03 257 0.0922 0.903 0.79 0.167
laptop 5e+03 237 0.127 0.869 0.744 0.222
mouse 5e+03 95 0.0648 0.863 0.732 0.12
remote 5e+03 241 0.0436 0.788 0.535 0.0827
keyboard 5e+03 117 0.0668 0.923 0.755 0.125
cell phone 5e+03 291 0.0364 0.704 0.436 0.0692
microwave 5e+03 88 0.154 0.841 0.743 0.261
oven 5e+03 142 0.0618 0.803 0.576 0.115
toaster 5e+03 11 0.0565 0.636 0.191 0.104
sink 5e+03 211 0.0439 0.853 0.544 0.0835
refrigerator 5e+03 107 0.0791 0.907 0.742 0.145
book 5e+03 1.08e+03 0.0399 0.667 0.233 0.0753
clock 5e+03 292 0.0542 0.836 0.733 0.102
vase 5e+03 353 0.0675 0.799 0.591 0.125
scissors 5e+03 56 0.0397 0.75 0.461 0.0755
teddy bear 5e+03 245 0.0995 0.882 0.669 0.179
hair drier 5e+03 11 0.00508 0.0909 0.0475 0.00962
toothbrush 5e+03 77 0.0371 0.74 0.418 0.0706
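As a reading aid for the output above: the F1 column is the harmonic mean of precision (P) and recall (R), and the per-class rows are consistent with that formula. The sketch below recomputes it for three rows copied from the table; f1_score is a hypothetical helper, not a function from test.py. The 'all' row does not follow the formula exactly (it gives about 0.149 rather than 0.145), so it is presumably an average of the per-class values, though that is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from rows of the output above.
rows = {
    'person': (0.0999, 0.903, 0.18),
    'bear': (0.358, 0.925, 0.516),
    'toaster': (0.0565, 0.636, 0.104),
}
for name, (p, r, reported) in rows.items():
    print(name, round(f1_score(p, r), 3), reported)
```

The low precision values likely reflect the very low confidence threshold shown in the Namespace line (conf_thres=0.001), which admits many low-confidence detections.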
training error:
assert c.max() <= model.nc, 'Target classes exceed model classes'
AssertionError: Target classes exceed model classes
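This assertion usually points at a label file containing a class index the model was not configured for. A quick way to check is to scan the labels for the largest class index; the sketch below assumes Darknet-format lines ('class x_center y_center width height') and zero-based class indices. The helper names are hypothetical and not part of the repo.

```python
def max_class_index(label_lines):
    """Largest class index in Darknet-format label lines
    ('class x_center y_center width height'); -1 if there are none."""
    indices = [int(line.split()[0]) for line in label_lines if line.strip()]
    return max(indices, default=-1)


def classes_ok(label_lines, num_classes):
    """True if no class index is greater than or equal to num_classes."""
    return max_class_index(label_lines) < num_classes


labels = [
    '0 0.50 0.50 0.20 0.30',
    '2 0.25 0.40 0.10 0.10',
]
print(max_class_index(labels))            # 2
print(classes_ok(labels, num_classes=3))  # True
print(classes_ok(labels, num_classes=2))  # False
```

If the check fails, either the class count in the .data and .cfg files is too small, or the conversion tool wrote indices that start at 1 instead of 0.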
UPDATE: I fixed the PR and updated the math, the COCO JSON -> Darknet conversion tool (Dark Chocolate) works now: https://github.com/joehoeller/Dark-Chocolate/issues/2