I am integrating ONNX runtime to C++ application. I have latest version, build from sources, OS is Windows 10, evaluation is on CPU. I followed exactly usage of C-API as defined in C_Api_Sample.cpp (code is basically the same).
When I feed 'test pattern' instead of real image to model, I get expected results as stated
in example source code:
// Results should be as below...
// Score for class[0] = 0.000045
// Score for class[1] = 0.003846
// Score for class[2] = 0.000125
// Score for class[3] = 0.001180
// Score for class[4] = 0.001317
This led me to think, that my integration is correct.
When I however feed some images (very nice & clean from here: https://microsoft.github.io/onnxjs-demo/#/squeezenet)
I get completely non-sensical results (ie for completely different images (cheetah / lighthouse)) network returns same resulting class.
Is there any explanation for this ? Is it actually safe to think about ONNX runtime in current
state as production ready ?
Thanks for any insights !
Hi, what model are you using and have you done all the appropriate preprocessing on the image to ensure the inputs are correct? ONNX Runtime is production ready and already powers many high-volume, high-scale production scenarios for Microsoft products and services.
I am using squeezenet 1.3 and I am normalizing the inputs with appropriate mean and std-dev.
However the results are very similar even if I don't do any normalization.
Hi, I have an alternative. I'm creating another demo for inception model on imagenet dataset. And the demo code uses the C API. I have checked the result, it's correct. If you are interested, you could use this new model.
The C API and runtime itself is good, if the result was wrong, mostly likely it's because of the model or preprocessing.
I'll be happy if I can take a look into this new demo, once it is available !
Hi @petrsm,
I downloaded a sample image from the ONNX.JS squeezenet demo (the 'Granny Smith' apple) and ran a simple python script to check if the result label looked okay and it was fine. (I did this for 0.4.0 release and current master)
This is my python script -
import onnxruntime as rt
import numpy as np
from PIL import Image
def preprocess(img_path):
img = Image.open(img_path)
img = img.resize((224, 224), Image.BILINEAR)
img_data = np.array(img, dtype=np.float32)
img_data = np.transpose(img_data, [2, 0, 1])
img_data = np.expand_dims(img_data, 0)
img_data = img_data[:,0:3,:,:] # Input is RGBA - so strip off A
return img_data
sess = rt.InferenceSession("squeezenet.onnx")
input_name = sess.get_inputs()[0].name
pred_onnx = sess.run(None, {input_name: preprocess("apple.png")})[0]
max_index = np.argmax(pred_onnx)
print(max_index)
This prints 948 which corresponds to 'Granny Smith' in the ImageNet categories. So this rules out any issue in the core runtime itself.
@petrsm - It would help if you showed (code snippet) how you were feeding in the images corresponding to these lines -
"
When I however feed some images (very nice & clean from here: https://microsoft.github.io/onnxjs-demo/#/squeezenet)
I get completely non-sensical results (ie for completely different images (cheetah / lighthouse)) network returns same resulting class.
"
The only explanation I can think of is that the pre-processing was incomplete/not correct.
I guess there must be something wrong in my code, question is what :)
I created simple repro by modifying C_API_sample.cpp. Entire solution with data can be
downloaded from here:
https://drive.google.com/file/d/1EBYkoXMGjwRJSy9p7XJNpQ43nP9qMpxs/view?usp=sharing
As stated before, when feeding test pattern, all is OK. When feeding cheetah image, it is classified as 669 - 'mosquito net'. I tried various layouts of input data (can be seen in code) without any success.
I really appreciate willingness to help from all of you !
Here is relevant part of code where I feed input tensor data:
// initialize input data with values in [0.0, 1.0]
// This works as expected
for (size_t i = 0; i < input_tensor_size; i++)
input_tensor_values[i] = (float)i / (input_tensor_size + 1);
int img_sizex, img_sizey, img_channels;
stbi_uc * img_data = stbi_load("cheetah.png", &img_sizex, &img_sizey, &img_channels, STBI_default);
assert(img_data);
assert(img_sizex == 224);
assert(img_sizey == 224);
assert(img_channels == 4);
struct S_Pixel
{
unsigned char RGBA[4];
};
static_assert(sizeof(S_Pixel) == 4, "");
const S_Pixel * imgPixels(reinterpret_cast
const float mean[3] = { 0.485f, 0.456f, 0.406f };
const float stddev[3] = { 0.229f, 0.224f, 0.225f };
// Does not make any difference
const float mean[3] = { 0.0f, 0.0f, 0.0f };
const float stddev[3] = { 1.0f, 1.0f, 1.0f };
size_t offs = 0;
// NCWH layout (this should be correct)
for (size_t c = 0; c < 3; c++)
{
for (size_t y = 0; y < 224; y++)
{
for (size_t x = 0; x < 224; x++, offs++)
{
const float val((float)imgPixels[y * 224 + x].RGBA[c] / 255);
input_tensor_values[offs] = (val - mean[c]) / stddev[c];
}
}
}
// Desperate attempt to try NWHC layout (fails miserably - same as passing vector of zeroes)
for (size_t y = 0; y < 224; y++)
{
for (size_t x = 0; x < 224; x++)
{
for (size_t c = 0; c < 3; c++, offs++)
{
const float val((float)imgPixels[y * 224 + x].RGBA[c] / 255);
input_tensor_values[offs] = (val - mean[c]) / stddev[c];
}
}
}
assert(offs == input_tensor_size);
// create input tensor object from data values
OrtAllocatorInfo* allocator_info;
CHECK_STATUS(OrtCreateCpuAllocatorInfo(OrtArenaAllocator, OrtMemTypeDefault, &allocator_info));
OrtValue* input_tensor = NULL;
CHECK_STATUS(OrtCreateTensorWithDataAsOrtValue(allocator_info, input_tensor_values.data(), input_tensor_size * sizeof(float), input_node_dims.data(), 4, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT, &input_tensor));
assert(OrtIsTensor(input_tensor));
OrtReleaseAllocatorInfo(allocator_info);
...
Hi @petrsm,
I see several things that can be tried -
1) For now we can skip the mean subtraction and standard_deviation division (as the python code that works didn't require that)
2) We can skip the division by 255 for each pixel (as the python code that works works with values in the range 0-255) (i.e.) un-normalized
3) I think the correct format is NCHW (and not NCWH as you mention as "should work" in your comment).
4) After using stbi_load(), what is the format of the data bytes- is it HWC or CHW or WHC ? That needs to be checked first. If the format is HWC then life becomes easier - since we just want to ditch the last channel, you could just consume the first 224 * 224 * 3 values and after appropriate float casts, can try feeding this to ORT.
Thanks
You hit the nail on the head !
After removing normalization to <0, 1> range (and also mean/std dev) everything works fine.
I blindly followed advice from here :
https://github.com/onnx/models/tree/master/models/image_classification/squeezenet
"Preprocessing
The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. The transformation should preferrably happen at preprocessing. Check imagenet_preprocess.py for code."
Next time, when facing such issue, denormalization will be first thing to try :)
Thanks a lot for your help !
Glad you could get it working!
Closing this issue.
Most helpful comment
Hi @petrsm,
I downloaded a sample image from the ONNX.JS squeezenet demo (the 'Granny Smith' apple) and ran a simple python script to check if the result label looked okay and it was fine. (I did this for 0.4.0 release and current master)
This is my python script -
import onnxruntime as rt
import numpy as np
from PIL import Image
def preprocess(img_path):
img = Image.open(img_path)
img = img.resize((224, 224), Image.BILINEAR)
img_data = np.array(img, dtype=np.float32)
img_data = np.transpose(img_data, [2, 0, 1])
img_data = np.expand_dims(img_data, 0)
img_data = img_data[:,0:3,:,:] # Input is RGBA - so strip off A
return img_data
sess = rt.InferenceSession("squeezenet.onnx")
input_name = sess.get_inputs()[0].name
pred_onnx = sess.run(None, {input_name: preprocess("apple.png")})[0]
max_index = np.argmax(pred_onnx)
print(max_index)
This prints 948 which corresponds to 'Granny Smith' in the ImageNet categories. So this rules out any issue in the core runtime itself.