machinelearning: How to process an image with 1 channel?

Created on 2 Oct 2019 · 9 comments · Source: dotnet/machinelearning

System information

  • OS version/distro:
    Windows 10
  • .NET Version (e.g., dotnet --info):
    .NET Core 3.0

Issue

I have a model trained in Keras that accepts an input shape of 64x64x1, i.e., a grayscale image. But I haven't found a way to pre-process an image so that only one channel is sent to the model; ML.NET only has a method that extracts one of the R, G, B, or Alpha channels.

  • What did you do?

    var pipeline = mlContext.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: onnxModel.ModelInput, imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(ImageInputData.Image))
        .Append(mlContext.Transforms.ConvertToGrayscale(outputColumnName: onnxModel.ModelInput, inputColumnName: onnxModel.ModelInput))
        .Append(mlContext.Transforms.ExtractPixels(outputColumnName: onnxModel.ModelInput, inputColumnName: onnxModel.ModelInput, interleavePixelColors: true, orderOfExtraction: ImagePixelExtractingEstimator.ColorsOrder.ARGB, colorsToExtract: ImagePixelExtractingEstimator.ColorBits.Red))
        .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: onnxModel.ModelPath, outputColumnName: onnxModel.ModelOutput, inputColumnName: onnxModel.ModelInput));

My image comes from an input form, like the ONNX examples in the samples repository.

  • What happened?
    I'm getting incorrect predictions because I don't know how to reshape the image to 64x64x1. When I send it in 64x64x3 format, I get the error "Length of memory (12288) must match product of dimensions (4096)."
  • What did you expect?
    I expected ML.NET to provide a way to reshape my image into a one-channel image so that I get correct predictions.
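For reference, the numbers in the error message line up with the channel count. This small sketch (plain Python arithmetic, not ML.NET code) shows that extracting all three color channels of a 64x64 image yields 12288 values, while the model's 64x64x1 input expects 4096:

```python
height, width = 64, 64

# Values produced when three color channels are extracted.
supplied = height * width * 3   # 12288

# Values the ONNX model's 64x64x1 input actually expects.
expected = height * width * 1   # 4096

print(supplied, expected)  # 12288 4096
```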


question


All 9 comments

Hi @sergioprates. When you use a model trained outside of ML.NET, you need to make sure that you featurize the image the same way as it was featurized prior to passing it to the trainer. I would verify these things:

  1. Could you try extracting the Alpha channel instead of the Red?
  2. Make sure that the order of the pixels (row first vs. column first) is the same as at training time.
  3. Make sure to scale/normalize the pixels the same way they were scaled at training time.
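To illustrate point 2, here is a small sketch (illustrative Python only, not ML.NET code) of the difference between row-first and column-first pixel ordering for a 2x3 image:

```python
# A 2x3 "image" of pixel values.
img = [[1, 2, 3],
       [4, 5, 6]]

# Row-first (row-major) flattening: walk each row left to right.
row_first = [v for row in img for v in row]

# Column-first (column-major) flattening: walk each column top to bottom.
col_first = [img[r][c] for c in range(3) for r in range(2)]

print(row_first)  # [1, 2, 3, 4, 5, 6]
print(col_first)  # [1, 4, 2, 5, 3, 6]
```

If training used one order and inference uses the other, every pixel except those on the diagonal ends up in the wrong position, which silently degrades predictions.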

I will send some examples, but basically I'm using a fer2013-trained model in ONNX format.
In Python I simply wrote the code below, and it works with this model.

I tried the Alpha channel, but it still doesn't work.

At ML.NET

// Configure the pipeline
var pipeline = mlContext.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: onnxModel.ModelInput, imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(ImageInputData.Image))
    .Append(mlContext.Transforms.ExtractPixels(outputColumnName: onnxModel.ModelInput, inputColumnName: onnxModel.ModelInput, colorsToExtract: ImagePixelExtractingEstimator.ColorBits.Alpha))
    .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: onnxModel.ModelPath, outputColumnName: onnxModel.ModelOutput, inputColumnName: onnxModel.ModelInput));

// Bitmap to grayscale using OpenCvSharp; I tried pure C# as well, but without success
var mat = BitmapConverter.ToMat(original);
Mat gray = mat.CvtColor(ColorConversionCodes.RGB2GRAY, 1);
// I also tried other pixel formats: 8bpp, 24bpp, and 32bpp.
return gray.ToBitmap(PixelFormat.Format1bppIndexed);

ImageInputData imageInputData = new ImageInputData { Image = bitmapImage };

var probs = predictionEngine.Predict(imageInputData).PredictedLabels;

At Python

from keras.models import load_model
import cv2

# I tried this in ML.NET too
gray_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)

emotion_model_path = '../trained_models/emotion_models/fer2013_mini_XCEPTION.102-0.66.hdf5'
emotion_classifier = load_model(emotion_model_path, compile=False)
emotion_target_size = emotion_classifier.input_shape[1:3]

# Resize the image; I did this in ML.NET too
gray_face = cv2.resize(gray_face, emotion_target_size)
emotion_prediction = emotion_classifier.predict(gray_face)

Note: I have used ML.NET with another ONNX model that I also used in Python, but that model accepts 3-channel images. If you could send an example that uses only one channel, that would be nice.

I've preprocessed my images using OpenCV, but in Python I use a NumPy method that modifies the array. How do I do this for ML.NET, since it receives a Bitmap object?

private Bitmap MakeGrayscale3(Bitmap original)
{
    var fotoColorida = BitmapConverter.ToMat(original);

    var classifier = new CascadeClassifier(Environment.CurrentDirectory + @"\Modelos\haarcascade_frontalface_default.xml");

    Mat gray = fotoColorida.CvtColor(ColorConversionCodes.BGR2GRAY, 1);
    var faces = classifier.DetectMultiScale(gray, 1.3, 5);

    foreach (var face in faces)
    {
        var coordenadaAjustada = new Rect(face.X - 20, face.Y - 20, face.Width + 30, face.Height + 40);
        using (var faceCortada = new Mat(gray, coordenadaAjustada))
        {
            var redimensionado = faceCortada.Resize(new OpenCvSharp.Size(64, 64));
            redimensionado = redimensionado / 255;
            redimensionado = redimensionado - 0.5;
            redimensionado = redimensionado * 2;

            return redimensionado.ToBitmap(PixelFormat.Format32bppArgb);
        }
    }

    return original;
}

Python code

gray_face = ...  # the face region in grayscale
gray_face = gray_face / 255.0
gray_face = gray_face - 0.5
gray_face = gray_face * 2.0

# I cannot figure out how to do this in C# to pass to ML.NET
gray_face = np.expand_dims(gray_face, 0)   # add a batch dimension
gray_face = np.expand_dims(gray_face, -1)  # add a channel dimension
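The three normalization lines above map a byte pixel in [0, 255] to a float in [-1.0, 1.0]. A minimal sketch of that mapping (the helper name is mine, not from the thread):

```python
def normalize(pixel):
    # (pixel / 255 - 0.5) * 2 maps 0 -> -1.0, 127.5 -> 0.0, 255 -> 1.0
    return (pixel / 255.0 - 0.5) * 2.0

print(normalize(0), normalize(127.5), normalize(255))  # -1.0 0.0 1.0
```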

ML.NET CODE

var pipeline = mlContext.Transforms.ExtractPixels(outputColumnName: onnxModel.ModelInput, inputColumnName: nameof(ImageInputData.Image), colorsToExtract: ImagePixelExtractingEstimator.ColorBits.Alpha)
                .Append(mlContext.Transforms.ApplyOnnxModel(modelFile: onnxModel.ModelPath, outputColumnName: onnxModel.ModelOutput, inputColumnName: onnxModel.ModelInput));

Anyone?

Hi @sergioprates,
Your transforms for reading the file should be something like this:

var pipeline = mlContext.Transforms.LoadImages(outputColumnName: "image_object", imageFolder: imagesFolder, inputColumnName: nameof(DataModels.ImageData.ImagePath))
    .Append(mlContext.Transforms.ResizeImages(outputColumnName: "image_object_resized", imageWidth: ImageSettingsForTFModel.imageWidth, imageHeight: ImageSettingsForTFModel.imageHeight, inputColumnName: "image_object"))
    .Append(mlContext.Transforms.ConvertToGrayscale(outputColumnName: "image_object_resized1", inputColumnName: "image_object_resized"))
    .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "linku_input", inputColumnName: "image_object_resized1", interleavePixelColors: ImageSettingsForTFModel.channelsLast, colorsToExtract: Microsoft.ML.Transforms.Image.ImagePixelExtractingEstimator.ColorBits.Blue, outputAsFloatArray: true, scaleImage: .0039216f));

You might be wondering why I used colorsToExtract: Blue above. In my testing I found that, for a single-channel image, the Blue channel held the correct pixel values matching my Python test data. Scaling the pixels via scaleImage is also important, as I believe your model was trained on scaled pixels.

I also suggest that you compare the image data you are sending to the model in Python and verify that it matches the transformed data from the pipeline above.
This can be done as:

var transData1 = pipeline.Preview(data);
Console.WriteLine(transData1);

Preferably put a breakpoint at the Console.WriteLine and inspect your data.

-Ketan

Hi @ketankalia, I tested with your code and it works! But I don't understand the "scaleImage" parameter; what does it mean?

Hi @sergioprates, I apologize for the late response.
"scaleImage" will convert a pixel value of 255 to 1 when you use 1/255f. This value is multiplied by the actual pixel value to produce the scaled pixel value. Hope this helps.
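In other words, scaleImage is simply a multiplier applied to each extracted pixel value. A quick plain-Python check of the 1/255f value used in the pipeline above:

```python
scale = 1 / 255.0  # matches the scaleImage: .0039216f value in the pipeline

print(round(scale, 7))  # 0.0039216
# A raw byte pixel of 255 scales to (approximately) 1.0:
print(abs(255 * scale - 1.0) < 1e-9)  # True
```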

-Ketan

Since this question was answered by @ketankalia, I am closing it. Thanks for your collaboration!

@ketankalia For me, keeping scaleImage: 1/255f results in bad predictions; keeping it at 1 fixed it. I'm using emotion-ferplus-8.onnx.
Many thanks for suggesting extracting the blue channel, though. It saved my day :).

