Object detection works fine, and it's a great job, thank you!
However, I want to use this repo as the first-stage detector in my project. But when I call `torch.load()` on the weights you provided, I get the following error:
```
self.model = torch.load(self.weight_path, map_location=self.device)['model']
  File "torch1.5-py37/lib/python3.7/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "torch1.5-py37/lib/python3.7/site-packages/torch/serialization.py", line 773, in _legacy_load
    result = unpickler.load()
ModuleNotFoundError: No module named 'models.yolo'
```
```python
torch.save(the_model, PATH)
the_model = torch.load(PATH)
```
However in this case, the serialized data is bound to the specific classes and the exact directory structure used, so it can break in various ways when used in other projects, or after some serious refactors.
```python
torch.save(the_model.state_dict(), PATH)
the_model = TheModelClass(*args, **kwargs)
the_model.load_state_dict(torch.load(PATH))
```
My code is as follows:

```python
weights = 'weights/yolov5m.pt'
device = torch_utils.select_device(device='cpu' if ONNX_EXPORT else '0')
model = torch.load(weights, map_location=device)['model']
torch.save(model.state_dict(), 'weights/yolov5m_resave.pt')
```
- So I use the new method to load the weights:

```python
from models.yolo import Model

yaml_path = 'models/yolov5m.yaml'
new_weights = 'weights/yolov5m_resave.pt'
model = Model(yaml_path).to(device)
model.load_state_dict(torch.load(new_weights))
```
- After that, I found I get the same model and parameters as with the `torch.load()` method you used, and the code runs. **But I got a new problem!!!**
##### New problem
- **I can get detection results before NMS, but after NMS the result is `[None]`.** My printout is as follows:

```
before nms: tensor([[[5.57901e+00, 5.70358e+00, 2.26364e+01, ..., 1.07860e-03, 9.78606e-04, 1.86649e-03],
         [1.35772e+01, 5.58121e+00, 2.83575e+01, ..., 7.84854e-04, 6.75088e-04, 1.18259e-03],
         [2.03256e+01, 5.90291e+00, 2.71849e+01, ..., 1.05030e-03, 7.25093e-04, 1.90396e-03],
         ...,
         [3.39442e+02, 3.87110e+02, 1.64121e+02, ..., 1.63732e-02, 5.22475e-03, 1.01126e-02],
         [3.65044e+02, 3.88645e+02, 1.44507e+02, ..., 1.25172e-02, 4.94093e-03, 9.01083e-03],
         [3.91104e+02, 3.97117e+02, 1.44332e+02, ..., 1.07815e-02, 4.93309e-03, 8.51673e-03]]], device='cuda:0')
after nms: [None]
```
I don't know what the problem is. And I don't understand why you use this save method instead of the more flexible alternative. Do you have any ideas about my problem? Thank you very much!
Hello @yxxxqqq, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Google Colab Notebook, Docker Image, and GCP Quickstart Guide for example environments.
If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com.
@yxxxqqq thanks for your feedback. Yes, you are correct: we use the current method of saving and loading the entire model. In the past (https://github.com/ultralytics/yolov3) we used the alternative method of creating a model from a cfg file and then replacing the random weights with the checkpoint weights using a state_dict().
This method caused two problems. The first is that initialization is slower: a model is created with random weights, which are then replaced with the checkpoint weights, duplicating effort. The second, and main, problem was that a user had to supply two items (the weights and the cfg) to load a model for inference or testing, instead of a single item. This places extra requirements on the user and introduces a failure point during usage, as users would often pair weights with an incompatible cfg (i.e. yolov3-spp.pt with yolov3.cfg), leading to errors and confusion, and to them raising issues and bug reports that consumed our time.
So we view the current method as the lesser of two evils. The main downside we see is the SourceChangeWarnings that are generated when the modules the model is built on have been updated since it was created.
@glenn-jocher Thanks for your reply! I have solved the 'SourceChangeWarnings' with the code you provided:

```python
model = torch.load(weights, map_location=device)['model']
torch.save(torch.load(weights, map_location=device), weights)  # update model if SourceChangeWarning
```
But the problem I mentioned still exists:

1. Using the original weights with `torch.load()`:

```
pred before nms: tensor([[[5.38951e+00, 6.87055e+00, 1.14993e+01, ..., 1.90228e-03, 1.01164e-03, 2.54049e-03],
         [7.83045e+00, 6.57221e+00, 1.45590e+01, ..., 1.57367e-03, 8.64962e-04, 2.01560e-03],
         [2.25311e+01, 5.58812e+00, 1.23454e+01, ..., 1.72529e-03, 9.21386e-04, 2.28453e-03],
         ...,
         [4.31154e+02, 6.14794e+02, 1.36958e+02, ..., 1.80755e-03, 1.52067e-03, 1.51791e-03],
         [4.56398e+02, 6.17055e+02, 1.22339e+02, ..., 2.12122e-03, 1.61005e-03, 1.63509e-03],
         [4.91976e+02, 6.23088e+02, 1.45217e+02, ..., 3.99010e-03, 1.72312e-03, 2.11344e-03]]], device='cuda:0')
pred after nms: [tensor([[ 44.06211, 235.47171, 162.47781, 537.28436, 0.91711, 0.00000],
        [146.72403, 240.72610, 219.93156, 511.04062, 0.90797, 0.00000],
        [412.23538, 237.46272, 497.78629, 522.23077, 0.89330, 0.00000],
        [ 22.67275, 135.73569, 490.28171, 438.86267, 0.74369, 5.00000],
        [ 16.38007, 324.36755, 63.95830, 529.78113, 0.54598, 0.00000]], device='cuda:0')]
```
2. Using the resaved weights with `model.load_state_dict()`:

```
pred before nms: tensor([[[5.39362e+00, 5.79549e+00, 2.25946e+01, ..., 1.25067e-03, 1.00686e-03, 1.47676e-03],
         [1.25392e+01, 5.98638e+00, 2.68692e+01, ..., 9.48603e-04, 8.45199e-04, 1.03681e-03],
         [2.11967e+01, 5.65385e+00, 2.41934e+01, ..., 1.24312e-03, 9.92147e-04, 1.58688e-03],
         ...,
         [4.33180e+02, 6.20522e+02, 1.69033e+02, ..., 5.71506e-03, 3.09453e-03, 3.54823e-03],
         [4.61483e+02, 6.20247e+02, 1.54342e+02, ..., 7.58316e-03, 3.30421e-03, 3.97864e-03],
         [4.91035e+02, 6.24763e+02, 1.59548e+02, ..., 9.68921e-03, 3.65757e-03, 4.65747e-03]]], device='cuda:0')
pred after nms: [None]
```
@yxxxqqq the behavior you describe is the default behavior of all PyTorch models.
For self-contained models that do not require any external dependencies or imports, you would need to export to the ONNX or TorchScript formats. An alternative solution is to integrate this repo with Torch Hub: https://pytorch.org/hub/.
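As a rough sketch of the TorchScript route, using a stand-in module rather than the actual YOLOv5 model (the real export has its own requirements), tracing bakes the graph and weights into a single file that loads without the original class definitions being importable:

```python
import torch
import torch.nn as nn

# Stand-in network; in practice you would load your trained model here
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyNet().eval()
example = torch.zeros(1, 3, 32, 32)

# trace + save produce a self-contained artifact; torch.jit.load does not
# need the TinyNet class to exist in the loading project
traced = torch.jit.trace(model, example)
traced.save("tinynet.torchscript.pt")

loaded = torch.jit.load("tinynet.torchscript.pt")
print(tuple(loaded(example).shape))  # → (1, 8, 32, 32)
```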
@glenn-jocher Thank you very much !
@yxxxqqq we recently added support for PyTorch Hub. You may be able to use YOLOv5 in your own repository like this:
```python
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
```
@glenn-jocher wow, so great! thanks for your excellent work!
@yxxxqqq you're welcome!
I used your torch.hub.load solution to get a self-contained detector module, and it works very well, thanks! However, it is very verbose: even setting verbose=True in hub.load still prints output for the whole library. Is there a less verbose approach?
@elinor-lev no
Original issue seems resolved, so I am closing this issue now.
@yxxxqqq Hello, could you please explain in detail what you did to resolve the problem?
I have run into the exact same NMS problem and I can't seem to resolve it, even with hub.load!
Thank you
@elinor-lev if you'd like to add verbose functionality to the hub loading, I don't have time to do this personally, but we are open to PRs!
I did it the dirty way: copy the models and utils directories into the target directory. This works.
@yxxxqqq Have you solved it?
I have the same problem with NMS:

```python
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
```

After NMS:

```
pred: [None]
```
@yxxxqqq if pred[i] is None for image i, you have no detections above threshold in that image.
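In other words, downstream code should treat None entries as empty results. A small torch-free sketch of that guard, where the preds list is hypothetical NMS output (one image with no detections, one with a single box):

```python
# Hypothetical per-image output of NMS: None means no detections survived
preds = [None, [[44.1, 235.5, 162.5, 537.3, 0.92, 0.0]]]

kept = []
for i, det in enumerate(preds):
    if det is None or len(det) == 0:
        print(f"image {i}: no detections above threshold")
        kept.append([])  # normalize None to an empty list
    else:
        print(f"image {i}: {len(det)} detection(s)")
        kept.append(det)

print([len(k) for k in kept])  # → [0, 1]
```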
@glenn-jocher
Thanks for the reply. For the same image, in detect.py, when I used

```python
model = attempt_load(weights, map_location=device)
pred = model(img, augment=opt.augment)[0]
```

pred had 3 dims, and after NMS the result was perfect.

I changed it to

```python
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(device)
pred = model(img, augment=opt.augment)[0]
```

pred had 5 dims, and after NMS pred = [None].

Do I need to reshape pred before NMS?
@1chimaruGin the torch hub model may be in training mode rather than eval mode.
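For reference, a minimal illustration of the train/eval flag on any nn.Module (a stand-in module here, not the hub model itself; in training mode the YOLOv5 head returns extra raw outputs, which would explain the extra dims):

```python
import torch.nn as nn

# Modules start in training mode; layers like Dropout/BatchNorm (and the
# YOLOv5 Detect head) behave differently until .eval() is called
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
print(model.training)  # → True

model.eval()  # switch to inference behavior before prediction
print(model.training)  # → False
```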
@glenn-jocher
Ah Thank you.
Got it!
@glenn-jocher
I'm commenting on this issue because I got the same problem. I just integrated detect.py into an existing project, but I got this error message:

```
model = attempt_load(weights_file, map_location=device)
  File "/home/florian/PycharmProjects/eyesr_custom_ai_detector/CustomDetector/detector_files/yolov5/models/experimental.py", line 137, in attempt_load
    model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
  File "/home/florian/PycharmProjects/eyesr_custom_ai_detector/virtual_env/lib/python3.6/site-packages/torch/serialization.py", line 584, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/florian/PycharmProjects/eyesr_custom_ai_detector/virtual_env/lib/python3.6/site-packages/torch/serialization.py", line 842, in _load
    result = unpickler.load()
ModuleNotFoundError: No module named 'models'
```

The previous messages don't really explain how to fix this (other than going back to the cfg file and so on). What exactly is the solution to this error message?
I use PyTorch 1.6 and Python 3.6.
@FlorianRuen
I faced the same problem when I used attempt_load(weights_file, map_location=device) from outside this repo.
So I load the pretrained model from the hub:

```python
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(device).eval()
```
@1chimaruGin so you're assuming that every time we launch the script it will download from the hub, and so we need network access on the device that runs the project?
If I change the line from:

```python
model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval())  # load FP32 model
```

to

```python
model.append(torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(map_location).eval())
```

I get an SSL CERTIFICATE_VERIFY_FAILED error, which I can easily work around.
But the other problem is that I'm not using the exact same directory structure. So when I try to run the script, it says it can't find utils.google_utils, which makes sense because the path should be detector_files.ultralytics.yolov5.utils.google_utils.
@FlorianRuen
Yes, it will download from the hub, but only once.
I mean in detect.py line 35, not in experimental.py:

```python
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(device).eval()
```
@1chimaruGin Thanks, but if I run detect.py without any changes to the folder structure it works. My need is to change the yolov5 folder structure to fit a bigger project, so I need to change the imports to fit my architecture.
It's very strange that after changing the imports in all the files, there is still a reference to "models", which is the wrong path... I don't know where that reference comes from. I will dig deeper to find a solution; @glenn-jocher, do you have an idea how to fix this?
Thanks again for your help
Have you tried adding

```python
import sys
sys.path.insert(0, "path/to/yolov5")
```

to the file where the bug occurs?
The same issue here. It is actually very annoying :(
I've trained the small model on a custom dataset and now I am trying to integrate it into another project. I've copied the models and utils folders and fixed the imports there. When I attempt to load the model I get the same issue: ModuleNotFoundError: No module named 'models'. When I run the same model from the original repo, it works like a charm.
Has anyone found a solution to the problem?
@glenn-jocher do you possibly have any suggestions?
Thanks.
I hope this pull request will resolve the issue. See [this branch on my fork](https://github.com/PetrDvoracek/yolov5/tree/fix-no-module-models-in-export).
I met the same issue; I used the new version and integrated it into another project.
torch.load() requires the models module to be in the same folder.
https://stackoverflow.com/questions/42703500/best-way-to-save-a-trained-model-in-pytorch

```python
torch.save(the_model, PATH)
```

Then later:

```python
the_model = torch.load(PATH)
```

However, in this case, the serialized data is bound to the specific classes and the exact directory structure used, so it can break in various ways when used in other projects, or after some serious refactors.
To fix it, save and load only the model parameters.

https://github.com/pytorch/pytorch/issues/3678

PyTorch internally uses pickle, and this is a limitation of pickle. You can try meddling with sys.path to include the directory where module.py is. This is exactly why we recommend saving only the state dicts and not whole model objects.
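Concretely, the sys.path workaround quoted above can look like this sketch, where YOLOV5_ROOT is a hypothetical path to wherever the yolov5 code lives in your project:

```python
import sys
from pathlib import Path

# Hypothetical location of the vendored yolov5 code; adjust to your layout
YOLOV5_ROOT = Path("third_party/yolov5")

# pickle resolves classes by their original module paths (e.g. models.yolo),
# so that package must be importable before torch.load() runs
if str(YOLOV5_ROOT) not in sys.path:
    sys.path.insert(0, str(YOLOV5_ROOT))

# ckpt = torch.load("weights/yolov5s.pt", map_location="cpu")  # now resolves
```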
I've tried yxxxqqq's solution and faced the problem he mentioned. After some effort, here is my solution.
In the attempt_load function, I save the model as a state_dict():
```python
def attempt_load(weights, map_location=None):
    # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
    model = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        attempt_download(w)
        model2 = torch.load(w, map_location=map_location)['model']
        torch.save(model2.state_dict(), '/path/to/best_state_model.pth')
        ....

    # Compatibility updates
    for m in model.modules():
        if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6]:
            m.inplace = True  # pytorch 1.7.0 compatibility
        elif type(m) is Conv:
            m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility

    print("len model = {}".format(len(model)))
    if len(model) == 1:
        print("pass only")
        print()
        return model[-1]  # return model
    else:
        print('Ensemble created with %s\n' % weights)
        for k in ['names', 'stride']:
            setattr(model, k, getattr(model[-1], k))
        return model  # return ensemble
```
And I load the weights using:

```python
model = Model(cfg='/path/to/yolov5s.yaml', nc=1)
print(model.state_dict().keys())
print(len(model.eval().state_dict().keys()))
model.load_state_dict(torch.load('/path/to/best_state_model.pth', map_location=device))
```
With these changes, the results of the two model-loading methods are the same.
I hope my solution can help someone.
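To double-check that two loading paths really produce identical weights, the state dicts can be compared directly. This is a small helper of my own, not part of the repo, shown here on a stand-in Linear layer:

```python
import torch

def state_dicts_equal(sd_a, sd_b):
    """Return True if two state dicts have the same keys and identical tensors."""
    if sd_a.keys() != sd_b.keys():
        return False
    return all(torch.equal(sd_a[k], sd_b[k]) for k in sd_a)

# Stand-in modules; with YOLOv5 you would compare the torch.load()'d model's
# state_dict against the load_state_dict()'d model's state_dict
a = torch.nn.Linear(3, 2)
b = torch.nn.Linear(3, 2)
b.load_state_dict(a.state_dict())
print(state_dicts_equal(a.state_dict(), b.state_dict()))  # → True
```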