The bug: I followed https://gist.github.com/jakepoz/eb36163814a8f1b6ceb31e8addbba270 to derive the torchscript model.
In both my C++ code and my Python code I tested the same picture, and I verified that the input tensors match after preprocessing, but the model outputs differ.
The picture shape is (channels = 3, height = 360, width = 640).
Python input:

```python
import cv2
import numpy as np
import torch

img_path = 'test.png'
img = cv2.imread(img_path)                     # BGR, HWC
img = letterbox(img, new_shape=(640, 640))[0]  # resize with padding
img = img[:, :, ::-1].transpose(2, 0, 1)       # BGR to RGB, HWC to CHW
img = np.ascontiguousarray(img)
img = torch.from_numpy(img).to(device).float()
img /= 255.0                                   # 0-255 to 0.0-1.0
if img.ndimension() == 3:
    img = img.unsqueeze(0)                     # shape (1, 3, 384, 640)
pred = model(img, augment=False)
print(pred[0].shape)
```

Python output:

```
torch.Size([1, 15120, 85])
```
C++ input:

```cpp
std::string img_path = "test.png";
cv::Mat img = cv::imread(img_path);
img = letterbox(img);                         // resize with padding
cv::cvtColor(img, img, cv::COLOR_BGR2RGB);    // BGR -> RGB
img.convertTo(img, CV_32FC3, 1.0f / 255.0f);  // scale to [0, 1]
// note: from_blob does not copy, so img must outlive tensor_img (or call .clone())
auto tensor_img = torch::from_blob(img.data, {img.rows, img.cols, img.channels()}, torch::kFloat32);
tensor_img = tensor_img.permute({2, 0, 1});   // HWC -> CHW
tensor_img = tensor_img.unsqueeze(0);
std::cout << "tensor size is " << tensor_img.sizes() << std::endl;  // (1, 3, 384, 640)

std::vector<torch::jit::IValue> inputs;
inputs.push_back(tensor_img);
torch::jit::IValue output = model.forward(inputs);
auto op = output.toList().get(0).toTensor();
std::cout << "op sizes: " << op.sizes() << std::endl;
```
C++ output:

```
[1, 3, 48, 80, 85]
```

and 3 * 48 * 80 = 11520 != 15120.
Expected behavior: I would expect the model output in C++ to be the same as it is in Python.
Hello @zherlock030, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.
If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients.
For more information please visit https://www.ultralytics.com.
@jakepoz
We have updated export.py to support TorchScript export now, among other formats. The tutorial is here: https://github.com/ultralytics/yolov5/issues/251
Note that these are simple examples to get you started. Actual export and deployment (to an edge device, for example) is a very complicated journey. We have not open-sourced the entire process, but we do offer paid support in this area. If you have a business need, let us know and we'd be happy to help you!
@zherlock030, this is because the final Detect layer in yolov5 undoes the action of yolo's "anchor" system during regular operation, but this step is not being exported by the export script:
https://github.com/pjreddie/darknet/issues/568
Unfortunately, I have not yet figured out the details here. It seems that some of the variables like self.anchors and self.anchor_grid are stored as registered parameters, but self.strides is not, and I have difficulty exporting the model with the anchor code turned on.
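For anyone reimplementing this step outside the model: the anchor/grid decode that the Detect layer applies can be reproduced independently. The sketch below is a plain-NumPy approximation of the decode math yolov5's Detect layer used around this time (sigmoid everything, then scale xy by the grid and stride, and wh by the anchors); the anchor values, stride, and shapes here are illustrative, so check them against the `Detect.forward` of the checkpoint you actually exported:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_detect(raw, anchors, stride):
    """Decode one raw YOLO head output of shape (bs, na, ny, nx, no)
    into pixel-space predictions of shape (bs, na*ny*nx, no).

    Approximates yolov5-style decoding:
      xy = (sigmoid(t_xy) * 2 - 0.5 + grid) * stride
      wh = (sigmoid(t_wh) * 2) ** 2 * anchor
    """
    bs, na, ny, nx, no = raw.shape
    y = sigmoid(raw)
    # grid of per-cell (x, y) offsets, broadcastable against y[..., 0:2]
    gy, gx = np.meshgrid(np.arange(ny), np.arange(nx), indexing='ij')
    grid = np.stack((gx, gy), axis=-1).reshape(1, 1, ny, nx, 2)
    y[..., 0:2] = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride
    y[..., 2:4] = (y[..., 2:4] * 2.0) ** 2 * anchors.reshape(1, na, 1, 1, 2)
    return y.reshape(bs, -1, no)
```

Applying this per head and concatenating the three flattened outputs is what reproduces the single (1, N, 85) tensor the Python model returns.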
@zherlock030 @jakepoz did you solve the problem? I've hit the same issue. Looking forward to your reply, thank you.
@winself
I think I have made it work. I'm not sure if I should open-source it, since @glenn-jocher has his concerns.
I could share my code with you.
> @zherlock030, this is because the final Detect layer in yolov5 is undoing the action of yolo's "anchor" system when in regular operations, but this is not being exported in the export script.
> Unfortunately, I have not yet figured out the details here, it seems as if some of the variables like the self.anchors and self.anchor_grid are stored as registered parameters, but self.strides is not, and I have difficulty exporting the model with the anchor code turned on.
@jakepoz
Thanks for your reply. I just treat self.strides as constants, and for now this produces reasonable results, matching the Python output.
> We have updated export.py to support torchscript export now, among others. The tutorial is here: #251
> Note that these are simple examples to get you started. Actual export and deployment (to an edge device) for example is a very complicated journey. We have not open sourced the entire process, but we do offer paid support in this area.
Thanks for your reply. I think I've made it work; yolov5s is so fast.
@zherlock030 hi no worries about open sourcing your work! The only requirement is that you retain the current GPL3 license on modifications.
We eventually want to open source 100% of everything, including the export pipelines and the iDetection iOS app source code. We are trying to adjust our business model to make this happen either later this year or next year.
@zherlock030 Thanks for your reply! `self.training |= self.export` causes this result: when export is True, training becomes True, so the TorchScript model produces the training-mode output and we need to write some code to process the result ourselves. Is this right?
Yeah, we need to write code for image preprocessing, the detect layer, and NMS.
You can see my implementation at https://github.com/zherlock030/YOLOv5_Torchscript.
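To make the "write your own NMS" step concrete, here is a minimal greedy non-maximum suppression sketch in plain NumPy. The (x1, y1, x2, y2) box format and the 0.45 IoU threshold are assumptions for illustration, not taken from the linked repo:

```python
import numpy as np

def nms(boxes, scores, iou_thres=0.45):
    """Greedy NMS on (x1, y1, x2, y2) boxes.
    Returns indices of kept boxes, highest score first."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # candidates, best first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of box i with all remaining candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # drop candidates overlapping box i too much
        order = order[1:][iou <= iou_thres]
    return keep
```

In a full pipeline this runs per class (or with class-offset boxes, as yolov5's Python NMS does) after filtering by confidence.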
Hello,
I'm also interested in running YOLOv5 in C++. @zherlock030, when you run yolov5, how many GB of GPU memory does it use? Is it lower than when running in Python?
Thanks
@zherlock030, hi! Similar to you, I also wrote the nms.cpp code. I used the official export.py to export the torchscript file; the output op is a tensor of [1, gridx, gridy, 9], but the 9-vector is totally wrong. Is the exported torchscript file incorrect? Do I need any modification? I ask because I see this warning during torch.jit.trace: `TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future.`
@easycome2009 What I did is set `model.model[-1].export = False` in export.py (line 28); with that I get similar results from Python and C++.
@phamdat09 Hi, actually I'm using a CPU.
@easycome2009 Yes, when you run export.py you need to modify the detect layer so that it just outputs the input list `x`, and then implement the detect layer in your C++ code.
@yasenh Yes, I tried that too, but that way we can't feed the network pictures of different shapes.
@zherlock030, here is my implementation, just FYI: https://github.com/yasenh/libtorch-yolov5
The image will be padded to a fixed size, e.g. (640, 640).
@yasenh Yeah, I know what you mean, but with the letterbox function an image of any shape can be fed to YOLO.
I think you can still do that, but the benefit of padding images to the same size is that we can process them as a batch; otherwise you'd have to process images of different sizes one by one.
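For reference, the letterbox geometry being discussed can be computed without touching pixels. A minimal sketch, mirroring the usual yolov5-style rule of scaling to fit and padding each side up to a multiple of the model stride (the exact split of padding between top/bottom and left/right is an implementation detail this skips):

```python
def letterbox_shape(h, w, new_shape=(640, 640), stride=32):
    """Return ((resized_h, resized_w), (total_pad_h, total_pad_w)) for a
    letterbox that scales (h, w) to fit new_shape, then pads each
    dimension up to the next multiple of `stride`."""
    r = min(new_shape[0] / h, new_shape[1] / w)  # scale ratio
    new_h, new_w = round(h * r), round(w * r)
    return (new_h, new_w), ((-new_h) % stride, (-new_w) % stride)
```

For the 360x640 picture from this thread this gives a 360x640 resize plus 24 rows of padding, i.e. the (1, 3, 384, 640) input tensor seen in both the Python and C++ snippets above.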
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
> C++ output
> output tensor shape [1, 3, 48, 80, 85], and 3*48*80 = 11520 != 15120
That's because the C++ output is a list [(1, 3, height/8, width/8, 6), (1, 3, height/16, width/16, 6), (1, 3, height/32, width/32, 6)], while the Python output is a tuple ((1, num_anchors, 6), [(1, 3, height/8, width/8, 6), (1, 3, height/16, width/16, 6), (1, 3, height/32, width/32, 6)]).
In your case: 3*(48*80+24*40+12*20) == 15120
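That count follows from the three head strides. A quick check, assuming the standard yolov5 head strides of 8, 16, and 32 and 3 anchors per grid cell:

```python
# number of predictions per detection head for a 384x640 letterboxed input
h, w = 384, 640
counts = [3 * (h // s) * (w // s) for s in (8, 16, 32)]
print(counts, sum(counts))  # [11520, 2880, 720] 15120
```

So the single [1, 3, 48, 80, 85] tensor the C++ code printed is only the stride-8 head; concatenating all three flattened heads gives the 15120 predictions seen in Python.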