I have saved the mask_rcnn model as a .pb file. But there are two inputs in the Keras code: image and meta. I couldn't find out how to feed both of them in TensorFlow C++.
This is my c++ code:
Status run_status = session->Run(
{{"input_image", image_tensor},{"input_image_meta",meta_tensor}},
{"output_node0"}, {}, &outputs
);
And I got the error "Running model failed: Not found: FeedInputs: unable to find feed output input_image_meta". Are there any tricks to solve this problem?
Thanks!
Hi,
Did you solve your problem? I'm facing the same issue.
Try to visualize the graph in your pb file with
https://github.com/tensorflow/tensorflow/blob/r1.4/tensorflow/python/tools/import_pb_to_tensorboard.py .
There is a good post about it at https://medium.com/@daj/how-to-inspect-a-pre-trained-tensorflow-model-5fd2ee79ced0. Probably the name "input_image_meta" has changed somehow, or the pb is not complete?
I'm also very keen on getting this to run...
@hxw111 @ivshli @jmtatsch I'm curious how you built your image_tensor and meta_tensor.
I can get my session to run, but I'm having trouble getting the outputs to work (all zeros on output_detections:0).
UPDATE: now working; posted full code in follow up comment: https://github.com/matterport/Mask_RCNN/issues/222#issuecomment-373130661
Here's how I'm running it
std::vector<tensorflow::Tensor> outputs;
tensorflow::Status run_status = session->Run({{"input_image:0", inputTensor}, {"input_image_meta:0", inputMetadataTensor}},
{"output_detections:0", "output_mrcnn_class:0", "output_mrcnn_bbox:0", "output_mrcnn_mask:0", "output_rois:0", "output_rpn_class:0", "output_rpn_bbox:0"},
{},
&outputs);
@moorage I'm working on it, trying to recode the Python code in C++; you can check the utils.resize_image, molded_images, and compose_image_meta functions.
If anyone knows how or has already done it, you're very welcome to share your experience :)
@ivshli I was indeed able to implement all of these, including unmold. Kind of painful. But in the spirit of sharing and good karma I'll post it below :) . It's not particularly efficient; happy to take feedback in that department.
// given inputMat of type RGB (not BGR) / CV_8UC3 (possibly from an imread + cvtColor)
// also given dest of type cv::Mat(inputMat.size(), CV_8UC1)
// we trained on 256x256 , so TF_MASKRCNN_IMG_WIDTHHEIGHT = 256
// we copied MEAN_PIXEL configs, so cv::Scalar TF_MASKRCNN_MEAN_PIXEL(123.7, 116.8, 103.9);
// we statically defined float TF_MASKRCNN_IMAGE_METADATA[10] = { 0 ,TF_MASKRCNN_IMG_WIDTHHEIGHT ,TF_MASKRCNN_IMG_WIDTHHEIGHT , 3 , 0 , 0 ,TF_MASKRCNN_IMG_WIDTHHEIGHT ,TF_MASKRCNN_IMG_WIDTHHEIGHT , 0 , 0 };
// Pad to a square with the max dim, so we can resize it to TF_MASKRCNN_IMG_WIDTHHEIGHT x TF_MASKRCNN_IMG_WIDTHHEIGHT
int largestDim = inputMat.size().height > inputMat.size().width ? inputMat.size().height : inputMat.size().width;
cv::Mat squareInputMat(cv::Size(largestDim, largestDim), CV_8UC3);
int leftBorder = (largestDim - inputMat.size().width) / 2;
int topBorder = (largestDim - inputMat.size().height) / 2;
cv::copyMakeBorder(inputMat, squareInputMat, topBorder, largestDim - (inputMat.size().height + topBorder), leftBorder, largestDim - (inputMat.size().width + leftBorder), cv::BORDER_CONSTANT, cv::Scalar(0));
cv::Mat resizedInputMat(cv::Size(TF_MASKRCNN_IMG_WIDTHHEIGHT, TF_MASKRCNN_IMG_WIDTHHEIGHT), CV_8UC3);
cv::resize(squareInputMat, resizedInputMat, resizedInputMat.size(), 0, 0);
// Need to "mold_image" like in mask rcnn
cv::Mat moldedInput(resizedInputMat.size(), CV_32FC3);
resizedInputMat.convertTo(moldedInput, CV_32FC3);
cv::subtract(moldedInput, TF_MASKRCNN_MEAN_PIXEL, moldedInput);
// Move the data into the input tensor
// remove memory copies by using code at https://github.com/tensorflow/tensorflow/issues/8033#issuecomment-332029092
// allocate a Tensor and get pointer to memory for that Tensor, allocate a "fake" cv::Mat from it to use as a basis to convert
tensorflow::Tensor inputTensor(tensorflow::DT_FLOAT, {1, moldedInput.size().height, moldedInput.size().width, 3}); // single image instance with 3 channels
float_t *p = inputTensor.flat<float_t>().data();
cv::Mat inputTensorMat(moldedInput.size(), CV_32FC3, p);
moldedInput.convertTo(inputTensorMat, CV_32FC3);
// Copy the TF_MASKRCNN_IMAGE_METADATA data into a tensor
tensorflow::Tensor inputMetadataTensor(tensorflow::DT_FLOAT, {1, TF_MASKRCNN_IMAGE_METADATA_LENGTH});
auto inputMetadataTensorMap = inputMetadataTensor.tensor<float, 2>();
for (int i = 0; i < TF_MASKRCNN_IMAGE_METADATA_LENGTH; ++i) {
inputMetadataTensorMap(0, i) = TF_MASKRCNN_IMAGE_METADATA[i];
}
// Run tensorflow
cv::TickMeter tm;
tm.start();
std::vector<tensorflow::Tensor> outputs;
tensorflow::Status run_status = tfSession->Run({{"input_image", inputTensor}, {"input_image_meta", inputMetadataTensor}},
{"output_detections", "output_mrcnn_class", "output_mrcnn_bbox", "output_mrcnn_mask",
"output_rois", "output_rpn_class", "output_rpn_bbox"},
{},
&outputs);
if (!run_status.ok()) {
std::cerr << "tfSession->Run failed: " << run_status << std::endl;
}
tm.stop();
std::cout << "Inference time, ms: " << tm.getTimeMilli() << std::endl;
if (outputs[3].shape().dims() != 5 || outputs[3].shape().dim_size(4) != 2) {
throw std::runtime_error("Expected mask dimensions to be [1,100,28,28,2] but got: " + outputs[3].shape().DebugString());
}
auto detectionsMap = outputs[0].tensor<float, 3>();
for (int i = 0; i < outputs[3].shape().dim_size(1); ++i) {
auto scoreAtI = detectionsMap(0, i, 5);
auto detectedClass = detectionsMap(0, i, 4);
auto y1 = detectionsMap(0, i, 0), x1 = detectionsMap(0, i, 1), y2 = detectionsMap(0, i, 2), x2 = detectionsMap(0, i, 3);
auto maskHeight = y2 - y1, maskWidth = x2 - x1;
if (maskHeight != 0 && maskWidth != 0) {
// Pointer arithmetic
const int i0 = 0, /* size0 = (int)outputs[3].shape().dim_size(1), */ i1 = i, size1 = (int)outputs[3].shape().dim_size(1), size2 = (int)outputs[3].shape().dim_size(2), size3 = (int)outputs[3].shape().dim_size(3), i4 = (int)detectedClass /*, size4 = 2 */;
int pointerLocationOfI = (i0*size1 + i1)*size2;
float_t *maskPointer = outputs[3].flat<float_t>().data();
// The shape of the detection is [28,28,2], where the last index is the class of interest.
// We'll extract index 1 because it's the toilet seat.
cv::Mat initialMask(cv::Size(size2, size3), CV_32FC2, &maskPointer[pointerLocationOfI]); // CV_32FC2 because I know size4 is 2
cv::Mat detectedMask(initialMask.size(), CV_32FC1);
cv::extractChannel(initialMask, detectedMask, i4);
// Convert to B&W
cv::Mat binaryMask(detectedMask.size(), CV_8UC1);
cv::threshold(detectedMask, binaryMask, 0.5, 255, cv::THRESH_BINARY);
// First scale and offset in relation to TF_MASKRCNN_IMG_WIDTHHEIGHT
cv::Mat scaledDetectionMat(maskHeight, maskWidth, CV_8UC1);
cv::resize(binaryMask, scaledDetectionMat, scaledDetectionMat.size(), 0, 0);
cv::Mat scaledOffsetMat(moldedInput.size(), CV_8UC1, cv::Scalar(0));
scaledDetectionMat.copyTo(scaledOffsetMat(cv::Rect(x1, y1, maskWidth, maskHeight)));
// Second, scale and offset in relation to our original inputMat
cv::Mat detectionScaledToSquare(squareInputMat.size(), CV_8UC1);
cv::resize(scaledOffsetMat, detectionScaledToSquare, detectionScaledToSquare.size(), 0, 0);
detectionScaledToSquare(cv::Rect(leftBorder, topBorder, inputMat.size().width, inputMat.size().height)).copyTo(dest);
}
}
this is really useful, thanks a lot.
@moorage hello, thanks for your code
but I think
int pointerLocationOfI = (i0*size1 + i1)*size2;
should be
int pointerLocationOfI = (i0*size1 + i1)*size2*size3*size4;
What do you think?
I don't know much about outputs[3].flat.
My version worked for me @luoshanwei :)
@moorage Hi, do you run your code on CPU or GPU? I tried to run the code on a single CPU (to test the time cost) by setting the device:
GraphDef graph_def;
SessionOptions opts;
TF_CHECK_OK(ReadBinaryProto(Env::Default(), graph_definition, &graph_def));
graph::SetDefaultDevice("/cpu:0", &graph_def);
However, it doesn't work: the program still occupies the other CPUs as well.
Did you run into this problem?
@ypflll never tried that, sorry. I ran on a single i7 laptop, but didn't check CPU usage.
@luoshanwei did you get the pointer math sorted?
I am looking at this and going mildly cross-eyed: https://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays/ but short of addressing each pixel by hand in a five-deep for loop via the tensor math, I cannot be sure how else to do it.
OpenCV leaves a lot to be desired when it comes to multichannel images: this should make short work of the problem: https://github.com/OpenImageIO/oiio/blob/master/src/libOpenImageIO/imagebuf_test.cpp
Hi, I have a question about the C++ implementation.
I implemented it with reference to the comment in #222.
But the inputs and outputs are different from that comment now; I think this is because of the latest update.
The error message is like this.
tfSession->Run failed: Invalid argument: You must feed a value for placeholder tensor 'input_anchors' with dtype float and shape [?,?,4]
But I am not sure how to build 'input_anchors'. Does anybody know how to build it in C++?
Thank you
@Masahiro1002 did you get to the bottom of this?
seems like we need to convert some python from here:
https://github.com/parai/Mask_RCNN/commit/6289c1bd08fc90a1c3e296be8155674651f82a4b
@moorage thank you! Could you tell how you converted the Keras model to .pb in the first place?
@marcown you can find a multitude of guides here: https://github.com/matterport/Mask_RCNN/issues/218
thanks!
Hey man, could you share your pb model file with me? Your code no longer works on the latest .pb model of Mask R-CNN. Here is my e-mail: [email protected]
I really need to run Mask R-CNN inference in a C++ environment, thanks!
@Masahiro1002 I have the same issue with the TensorFlow C++ API; have you solved this problem?
Why doesn't @moorage's code have 'input_anchors'?
@moorage Can you provide the complete code for calling the Mask R-CNN model in C++?
The Mask R-CNN model should have three inputs; why is the input_anchors parameter missing from your code?
I am working on it; did you finish it?
This mrcnn .pb model should have 3 inputs (input_image:0, input_image_meta:0, input_anchors:0) and 7 outputs, but why do you only have two inputs? Where's the last one?
Version 2.1 has 3 inputs, v2.0 has 2 inputs; you can find them in the source code mrcnn/model.py.
Hi! Do you know how to generate the mask on the image from the model output? With the code moorage shared above, I find maskHeight < 1 and maskWidth < 1, which makes the resize fail. How did you solve it? Could you help me?
cv::resize(binaryMask, scaledDetectionMat, scaledDetectionMat.size(), 0, 0);
My colleague modified the original code, but I don't know how to upload the file; give me your email and I will send it to you.
My email: [email protected], thank you very much!
Hello, my dear friend, can you send your tensorflow.dll, tensorflow.lib, and include files to me? I can't compile the DLL from the TensorFlow source files. I really need it. My email: [email protected]. Thanks a lot.
Hey, why are my x and y 0? No mask or box was shown. Please help me, thanks.

Why is my detectionsMap all zeros?
This is my write-up: https://blog.csdn.net/qq_33671888/article/details/89254537
@caishengzao Could you push it to a repository? My Chinese is not the best ;) , which makes it difficult to read
@MennoK OK, I will try to translate it into English and then post all the resources to GitHub.
@MennoK this is my code implementation: https://github.com/CasonTsai/MaskRcnn_tensorflow_cpp_inference
Sorry it's late.
@moorage @CasonTsai I implemented a TensorFlow C++11 inference engine and ran Mask R-CNN inference successfully, except that the output tensor "detection" is full of zeros (I can see non-zero values in the "mrcnn mask" tensor). I see that you encountered similar questions.
The model was exported using modern TF 2.0 methods with a tf 1.x compatible interface (you can easily see this from the code).

I wonder how you solved the problem (zeros in the "detection" tensor) when you got zeros from the output tensors before. This is preventing me from moving forward; any help is appreciated. @moorage
Here is some snippet of codes for inference:
// In C++ it is also possible to have TensorFlow create a reader operator to automatically read images from an image
// path, where the image tensor is built automatically and graph_def is finally converted from a variable of type tf::Scope.
// In TensorFlow, see the code defined in "tensorflow/core/framework/tensor_types.h" and "tensorflow/core/framework/tensor.h":
// users are able to use Eigen::TensorMap to extract values from the container for reading and assignment. (Lei ([email protected]) 2020.7)
tfe::Tensor _molded_images(tf::DT_FLOAT, tf::TensorShape({1, molded_shape(0), molded_shape(1), 3}));
auto _molded_images_mapped = _molded_images.tensor<float, 4>();
// @todo TODO using Eigen::TensorMap to optimize the copy operation, e.g.: float* data_mapped = _molded_images.flat<float>().data(); copy to the buf using memcpy
// ref: 1. discussion Tensorflow Github repo issue#8033
// 2. opencv2 :
// 2.1. grab buf: Type* buf = mat.ptr<Type>();
// 2.2 memcpy to the buf
// 3. Eigen::Tensor buffer :
// 3.1 grab buf in RowMajor/ColMajor layout: tensor.data();
// 3.2 convert using Eigen::TensorMap : Eigen::TensorMap<Eigen::Tensor<Type, NUM_DIMS>>(buf)
// _molded_images_mapped = Eigen::TensorMap<Eigen::Tensor<float, 4, Eigen::RowMajor>>(&data[0], 1, molded_shape_H, molded_shape_W, 3);
for (int h=0; h < molded_shape(0); h++) {
for (int w=0; w < molded_shape(1); w++) {
_molded_images_mapped(0, h, w, 0) = molded_images(0, h, w, 0);
_molded_images_mapped(0, h, w, 1) = molded_images(0, h, w, 1);
_molded_images_mapped(0, h, w, 2) = molded_images(0, h, w, 2);
}
}
inputs->emplace_back("input_image:0", _molded_images);
tfe::Tensor _images_metas(tf::DT_FLOAT, tf::TensorShape({1, images_metas.cols() } ) );
auto _images_metas_mapped = _images_metas.tensor<float, 2>();
for (int i=0; i < images_metas.cols(); i++)
{
_images_metas_mapped(0, i) = images_metas(0, i);
}
inputs->emplace_back("input_image_meta:0", _images_metas);
tfe::Tensor _anchors(tf::DT_FLOAT, tf::TensorShape({1, anchors.rows(), anchors.cols()}));
auto _anchors_mapped = _anchors.tensor<float, 3>();
for (int i=0; i < anchors.rows(); i++)
{
for (int j=0; j < anchors.cols(); j++)
{
_anchors_mapped(0,i,j) = anchors(i,j);
}
}
inputs->emplace_back("input_anchors:0", _anchors);
// @todo : TODO
// run base_engine_ detection
// see examples from main.cpp, usage of TensorFlowEngine
// load saved model
// tfe::FutureType fut = base_engine_->Run(*inputs, *outputs,
// {"mrcnn_detection/Reshape_1:0", "mrcnn_class/Reshape_1:0", "mrcnn_bbox/Reshape:0", "mrcnn_mask/Reshape_1:0", "ROI/packed_2:0", "rpn_class/concat:0", "rpn_bbox/concat:0"}, {});
// load saved graph
tfe::FutureType fut = base_engine_->Run(*inputs, *outputs,
{"output_detections:0", "output_mrcnn_class:0", "output_mrcnn_bbox:0", "output_mrcnn_mask:0", "output_rois:0", "output_rpn_class:0", "output_rpn_bbox:0"}, {});
// pass fut object to anther thread by value to avoid undefined behaviors
std::shared_future<tfe::ReturnType> fut_ref( std::move(fut) );
// wrap fut with a new future object and pass local variables in
std::future<ReturnType> wrapped_fut = std::async(std::launch::async, [=, &rets]() -> ReturnType {
LOG(INFO) << "enter into sfe TF handler ...";
// fetch result
fut_ref.wait();
tf::Status status = fut_ref.get();
std::string graph_def = base_model_dir_;
if (status.ok()) {
if (outputs->size() == 0) {
LOG(INFO) << format("[Main] Found no output from <%s>: %s!", graph_def.c_str(), status.ToString().c_str());
return status;
}
LOG(INFO) << format("[Main] Success: infer through <%s>!", graph_def.c_str());
// @todo : TODO fill out the detectron result
tfe::Tensor detections = (*outputs)[0];
tfe::Tensor mrcnn_mask = (*outputs)[3];
// @todo : TODO convert tf::Tensor to eigen matrix/tensor
auto detections_mapped = detections.tensor<float, 3>();
auto mrcnn_mask_mapped = mrcnn_mask.tensor<float, 5>();
#ifndef NDEBUG
LOG(INFO) << format("detections(shape:(%d,%d,%d)):",
detections_mapped.dimension(0),
detections_mapped.dimension(1),
detections_mapped.dimension(2))
<< std::endl << detections_mapped;
// LOG(INFO) << "mask:" << std::endl << mrcnn_mask_mapped;
#endif
for (int i=0; i < images.size(); i++) {
// Eigen::Tensor is default ColMajor layout, which is different from c/c++ matrix layout.
// Note only column layout is fully supported for the moment (v3.3.9)
// Eigen::Tensor<float, 2> detection = Eigen::TensorLayoutSwapOp<Eigen::Tensor<float, 2, Eigen::RowMajor>>
// (detections_mapped.chip(i, 0));
Eigen::Tensor<float, 2, Eigen::RowMajor> detection = detections_mapped.chip(i, 0);
// Generate mask using a threshold
// Eigen::Tensor<float, 4> mask = Eigen::TensorLayoutSwapOp<Eigen::Tensor<float, 4, Eigen::RowMajor>>
// (mrcnn_mask_mapped.chip(i, 0));
Eigen::Tensor<float, 4, Eigen::RowMajor> mask = mrcnn_mask_mapped.chip(i, 0);
DetectronResult ret;
Eigen::MatrixXi window = windows.row(i);
unmold_detections(detection, mask, image_shape, molded_shape, window, ret);
rets.push_back( std::move(ret) );
}
@yiakwy you may need to check these steps:
1. Keep the same config (such as input size and batch_size) when you save the Keras model, convert the Keras model to a TF model, and run inference with C++.
2. Check the input names, and guarantee the image data flows into the tensors correctly.
3. Check the preprocessing, such as generating the anchors.
4. You can reference these links:
https://github.com/CasonTsai/MaskRcnn_tensorflow_cpp_inference ,https://blog.csdn.net/qq_33671888/article/details/89254537
@CasonTsai Thanks for the suggestions. Could you help check the following code? You can also check out the code here:
https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/modules/models/sfe.h
https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/modules/models/simple_mrcnn_infer.cpp
Code for exporting models from Keras to TensorFlow, either as a pure protobuf file with constant variables or in the SavedModel format (the Google Cloud team introduced this method in 2017 for TensorFlow Serving), can be found in https://github.com/yiakwy/SEMANTIC_VISUAL_SUPPORTED_ODEMETRY/blob/master/python/pysvso/models/sfe.py
Both input tests and output tests are included in the C++ source to compare with results from the Python backend.
Some steps have already been taken to ensure correctness.
Have you ever encountered the same question before ?
@hxw111 @CasonTsai @121649982 @moorage
Ultimate solution: the bug has been fixed, with TensorFlow inference tests both in Python and C++ (fixed a typo, a wrong index in image_meta). Here is an example of the output:

Recently I gave a talk at a Google Developer Group (GDG) about inference on end devices. You're welcome to have a look at it!
Close the issue.
@waleedka @MennoK I want to add a pull request with a solution to this problem. I also used Mask R-CNN in my POC project svso on real-time depth estimation (a semantic SLAM project) and introduced it to the public at GDG.