.NET Version (e.g., `dotnet --info`):
.NET Core 3.0
What did you do?
I created my own TensorFlow object detection model and tried to load it in a C# console application.
I posted the question on the blog that showed how to load a pretrained TensorFlow model, and was told to "Call out @yaeldekel in the issue. She might be able to help here."
Here is what I see in Netron

Here is how I am loading the pipeline
```C#
var pipeline = context.Transforms
    .LoadImages("input", @"D:\tensorflow1\images", nameof(ImageNetData.ImagePath))
    .Append(context.Transforms.ExtractPixels(outputColumnName: "input", interleavePixelColors: true))
    .Append(context.Model.LoadTensorFlowModel(@"D:\tensorflow1\models\research\object_detection\inference_graph\PotatoDetector.pb")
    .ScoreTensorFlowModel(outputColumnNames: new[] { "detection_boxes", "detection_classes", "detection_scores", "num_detections" }, inputColumnNames: new[] { "image_tensor" }, addBatchDimensionInput: true));
```
When I try to load the model in Python like so, it works just fine.
```python
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
num_detections = detection_graph.get_tensor_by_name('num_detections:0')
image = cv2.imread(PATH_TO_IMAGE)
image_expanded = np.expand_dims(image, axis=0)
print("going to run the model now")
(boxes, scores, classes, num) = sess.run(
[detection_boxes, detection_scores, detection_classes, num_detections],
feed_dict={image_tensor: image_expanded})
```
Any ideas on how to find the input column name that ML.NET wants?
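For reference, the shape and type that the working Python path feeds the graph can be sketched like this (a dummy array stands in for `cv2.imread`; the 500x500 size is purely illustrative):

```python
import numpy as np

# Dummy 500x500 RGB image standing in for cv2.imread (assumption for illustration)
image = np.zeros((500, 500, 3), dtype=np.uint8)

# np.expand_dims adds the batch dimension, matching image_tensor's (-1, -1, -1, 3)
image_expanded = np.expand_dims(image, axis=0)
print(image_expanded.shape)  # (1, 500, 500, 3)
print(image_expanded.dtype)  # uint8
```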
Hi! I think your pipeline may need some retooling. Take a look at this sample to understand how to load images to be consumed by a Tensorflow model.
Thanks for the feedback @gvashishtha
The only difference I see between the two is that I am not resizing the images in the pipeline. But the model was not trained with a specific image size; they were all over the board. And I do not know what the image offset is; again, the model was not trained with an image offset. Or does every .jpg have an image offset of 117?
And on ScoreTensorFlowModel my column names are different, but as I understand it, each model can have its own set of input and output column names. That is one of the reasons to inspect your model with Netron, right?
Not trying to argue with you; I just do not understand the difference between my pipeline construction and the sample's pipeline construction, given that we are using two different models.
Thanks again for the feedback.
```C#
public struct InceptionSettings
{
    // For checking tensor names, you can use tools like Netron,
    // which is installed by Visual Studio AI Tools.
    // input tensor name
    public const string inputTensorName = "input";
    // output tensor name
    public const string outputTensorName = "softmax2";
}
```
Hey there! One thing I notice is that you have "input" as the outputColumnName from the ExtractPixels step, but then you use "image_tensor" as the inputColumnName in the ScoreTensorFlowModel step. I'm not sure this is your only issue, but I think it may be one of the problems you are facing.
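One way to keep the names consistent is to use a single column name end to end. This is only a sketch of that chaining, assuming (as the error messages suggest) that the graph's input node is named `image_tensor`; it is not a verified fix:

```C#
// Each step's outputColumnName must match the next step's inputColumnName,
// and the final column name must match the graph's input node ("image_tensor").
var pipeline = context.Transforms
    .LoadImages(outputColumnName: "image_tensor", imageFolder: @"D:\tensorflow1\images",
                inputColumnName: nameof(ImageNetData.ImagePath))
    .Append(context.Transforms.ExtractPixels(outputColumnName: "image_tensor",
                                             interleavePixelColors: true))
    .Append(context.Model.LoadTensorFlowModel(@"D:\tensorflow1\models\research\object_detection\inference_graph\PotatoDetector.pb")
        .ScoreTensorFlowModel(
            outputColumnNames: new[] { "detection_boxes", "detection_classes",
                                       "detection_scores", "num_detections" },
            inputColumnNames: new[] { "image_tensor" },
            addBatchDimensionInput: true));
```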
Ahhh that makes sense, thanks a bunch. I am just starting with this technology and basically following recipes without fully understanding what is going on. Thanks again for the tip.
cheers
No problem! Do let me know if your problem gets resolved.
@gvashishtha I am getting closer. I changed the output columns to `image_tensor`. Then I got an error: `Schema mismatch for input column 'image_tensor': expected known-size image, got unknown-size image (Parameter 'inputSchema')`
So I added the transformer.resize, but I did not know what size to use so I just used 500,500.
So the next error was:
`Schema mismatch for input column 'image_tensor': expected Byte, got Single (Parameter 'inputSchema')`
So a little research brought me to add outputAsFloatArray: false to the ExtractPixels transform. Now my pipeline construction looks like this.
```C#
var pipeline = context.Transforms
    .LoadImages("image_tensor", @"D:\tensorflow1\images", nameof(ImageNetData.ImagePath))
    .Append(context.Transforms.ResizeImages(outputColumnName: "image_tensor", imageWidth: 500, imageHeight: 500, inputColumnName: "input"))
    .Append(context.Transforms.ExtractPixels(outputColumnName: "image_tensor", interleavePixelColors: true, outputAsFloatArray: false))
    .Append(context.Model.LoadTensorFlowModel(@"D:\tensorflow1\models\research\object_detection\inference_graph\PotatoDetector.pb")
    .ScoreTensorFlowModel(outputColumnNames: new[] { "detection_boxes", "detection_scores", "num_detections" }, inputColumnNames: new[] { "image_tensor" }, addBatchDimensionInput: true));
```
Now the error is:
`Input shape mismatch: Input 'image_tensor' has shape (-1, -1, -1, 3), but input data is of length 1500000.`
So I am stuck again; any ideas? I am really confused why the shape is empty. Shouldn't it be the shape of my image?
cheers
bob
OK, so your error suggests to me that somehow you're converting your input into a vector instead of a 4-dimensional set of images. I wonder if your problem stems from the same issue we had earlier: in ResizeImages, you specify "input" as the inputColumnName, but in LoadImages, your outputColumnName is "image_tensor".
Nice catch, but that did not help; I am still getting the same error message. Where in the pipeline would the image get transformed to a 4-dimensional array? I know Transforms.ExtractPixels returns a byte array, but I do not know the dimensions. And it looks like one of the pictures is sending out an empty array somehow. Does that make sense?
The image should get transformed to a 4-dimensional array in the LoadImages step. You can test this by putting the LoadImages step on its own line, inserting a debugging breakpoint, and then calling .Fit().Transform() on it and inspecting the returned object to see what the InputSchema is.
Please send your model along for further debugging if needed. What do you mean by "one of the pictures is sending out an empty array?"
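That debugging approach might look roughly like this, assuming an `IDataView` called `data` has already been built from the image paths (names here are illustrative):

```C#
// Fit and run just the LoadImages step so its output can be inspected in isolation.
var loadStep = context.Transforms.LoadImages("image_tensor", @"D:\tensorflow1\images",
                                             nameof(ImageNetData.ImagePath));
var transformed = loadStep.Fit(data).Transform(data);

// Set a breakpoint here and inspect transformed.Schema, or print it out:
foreach (var column in transformed.Schema)
    Console.WriteLine($"{column.Name}: {column.Type}");
```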
Ok I did what you suggested, and everything looked normal. The input schema was the image path, and the path to my labels. The output schema was the same plus an image named "image_tensor".
I figured it was an empty array, with the reported values being (-1, -1, -1, 3). If I understand it correctly, the first three dimensions would hold the RGB values for the pixel, and the 3 indicates RGB somehow? Not so sure on that last dimension, honestly.
Not really sure how to send along my model for further debugging. If I did it correctly, the link below should give access to the model from my OneDrive shared folder. The file name is PotatoDetector.pb.
Thanks again for sticking with me on this
cheers
bob
https://1drv.ms/u/s!ApCOT9MVH4Z1gah98NVg-dR55tic4A?e=AZB1Bb
OK, I was able to reproduce your error. I think it's coming from this line, but I'm not entirely sure why.
https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.TensorFlow/TensorflowTransform.cs#L533
As a workaround, is there some way you could change your "image_tensor" node to expect a float as input rather than a u_int8? I can't find any example code that uses OutputAsFloatArray in this method, so it may be a problem with that option.
OutputAsFloatArray is consumed here: https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.ImageAnalytics/ImagePixelExtractor.cs#L474
Well, I feel relieved that we have some sort of answer. I will ask on another forum related to the tutorial I used to make the TensorFlow model. I wonder if it would be as easy as just picking another model from the TensorFlow Model Zoo? You know, use a different algorithm to train the model. I will dig around this afternoon and get back to you tomorrow. Again, thanks a bunch for all your help.
cheers
Confirmed that the issue has to do with OutputAsFloatArray.
If OutputAsFloatArray is set to "false," then the resulting Byte array has a non-null shape, which causes the TensorFlowTransform to enter the "else" condition rather than the "if" condition in this conditional statement.
Once the "else" condition activates, the code assumes that given a data vector of length _n_ and an input tensor specified with dimensions (-1, -1, -1, 3), the data vector can only be properly deserialized if there exists an integer _d_ such that _d·d·d·3 = n_. The correct check would be whether there exist integers _x_, _y_, and _z_ such that _x·y·z·3 = n_.
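The difference between the two checks can be demonstrated with a quick sketch. Using a single 500x500 RGB image with a batch dimension gives n = 1·500·500·3 = 750000 (the numbers are illustrative, not taken from the actual transform code):

```python
def buggy_check(n):
    # Assumes all three unknown dimensions are equal: n == d * d * d * 3
    d = round((n / 3) ** (1 / 3))
    return d * d * d * 3 == n

def correct_check(n, batch, height, width):
    # Allows the unknown dimensions to differ: n == batch * height * width * 3
    return batch * height * width * 3 == n

n = 1 * 500 * 500 * 3  # one 500x500 RGB image with a batch dimension
print(buggy_check(n))                 # False: 250000 has no integer cube root
print(correct_check(n, 1, 500, 500))  # True
```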
OK, that sort of makes sense. If I take out the OutputAsFloatArray, here is my error message: `Schema mismatch for input column 'image_tensor': expected Byte, got Single (Parameter 'inputSchema')`. But this does not make much sense to me either; that is why I added the OutputAsFloatArray parameter.
I have investigated this issue, and here are my findings:
1) The OutputAsFloatArray option needs to be set to false for this model because the graph expects an input of type UINT8 (i.e., Byte). It does not need to be set for the other models in the samples because they take a float input. This explains the error: `Schema mismatch for input column 'image_tensor': expected Byte, got Single (Parameter 'inputSchema')`
2) @gvashishtha's findings on where the error occurs are correct. However, the code path here https://github.com/dotnet/machinelearning/blob/b7db4fa475ba4bd52824eb98bbf5f5bf4a0a6f7a/src/Microsoft.ML.TensorFlow/TensorflowTransform.cs#L517 is taken not because of the OutputAsFloatArray option, but because the input node in the graph has a non-null shape (-1, -1, -1, 3), whereas the samples in the repo use graphs whose input tensor shapes are not set (the shape is null).
The check at https://github.com/dotnet/machinelearning/blob/b7db4fa475ba4bd52824eb98bbf5f5bf4a0a6f7a/src/Microsoft.ML.TensorFlow/TensorflowTransform.cs#L539 assumes that all the unknown dimensions are the same, even though the tensor shape is (batch_size=-1, height=-1, width=-1, channels=3). This check is wrong, and I am working on pushing a fix for it.
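Finding (1) can be illustrated in NumPy terms (purely as an analogy; in ML.NET it is the outputAsFloatArray option that decides whether the graph receives Single or Byte pixels):

```python
import numpy as np

# outputAsFloatArray: true (the default) produces Single/float32 pixels...
float_pixels = np.zeros((1, 500, 500, 3), dtype=np.float32)

# ...but this graph's image_tensor node expects UINT8 (Byte), so
# outputAsFloatArray: false is required for this particular model.
byte_pixels = float_pixels.astype(np.uint8)
print(float_pixels.dtype, byte_pixels.dtype)  # float32 uint8
```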
@Nightrain on a different note, I am interested to know what brought you to ML.NET, given that you seem to have significant background in Python. If you wouldn't mind sharing, please do drop me a note at gopalv
Hi, will this issue be fixed in the next release?
We also have the same problem when we try to use a TensorFlow model.
We have been using CNTK for object detection, but we urgently need an alternative.
We have now trained our data with TensorFlow, but we are stuck using it in ML.NET.
We tried different models, but the issue is the same.
Thanks.