Describe the bug
I am trying to get the depth value of points picked from the color image. After I transform the depth image to the color camera, the depth value at the point of interest is often 0, meaning the Azure Kinect did not detect the point. Another answer (#588) mentioned that there are interpolation/filter methods to compensate for this. How can I use those methods in code?
First edit:
I have found a promising function in the SDK documentation, k4a_transformation_depth_image_to_color_camera_custom, which takes an argument of type k4a_transformation_interpolation_type_t. Could anyone provide an example of how to use it? Much appreciated!
Second Edit:
To Reproduce
Expected behavior
Get the true depth value of a specific point extracted from the 2D color image.
Code appendix
To help reproduce the error, I also append my code here:
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <k4a/k4a.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <direct.h>
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/opencv.hpp"
#include <opencv2/core/mat.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/calib3d.hpp>
#include <math.h>
#include <string>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace std;
int main()
{
// 1.1: Start by counting the number of connected devices
uint32_t device_count = k4a_device_get_installed_count();
if (device_count == 0)
{
printf("No K4A devices found\n");
return 0;
}
else
{
printf("Found %u connected devices:\n", device_count);
}
// 1.2: Define the Exit block
int returnCode = 1;
// 1.3 Initialize the device and capture handles
k4a_device_t device = NULL;
k4a_capture_t capture = NULL;
const int32_t TIMEOUT_IN_MS = 1000;
// 1.4: Initialize the frame count
int totalFrame = 1;
int captureFrameCount = totalFrame;
printf("Capturing %d frames\n", captureFrameCount);
// 2: Set the device configuration; this can also be done after opening the device, but before starting the cameras
k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
config.color_format = K4A_IMAGE_FORMAT_COLOR_BGRA32; // <==== For Color image
config.color_resolution = K4A_COLOR_RESOLUTION_2160P;
config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED; // <==== For Depth image
config.camera_fps = K4A_FRAMES_PER_SECOND_30;
config.synchronized_images_only = true;
// 3: Open the device
if (K4A_RESULT_SUCCEEDED != k4a_device_open(K4A_DEVICE_DEFAULT, &device))
{
printf("Failed to open device\n");
goto Exit;
}
// 3.2 Set the calibration
k4a_calibration_t calibration;
if (K4A_RESULT_SUCCEEDED !=
k4a_device_get_calibration(device, config.depth_mode, config.color_resolution, &calibration))
{
printf("Failed to get calibration\n");
goto Exit;
}
// 4: Start the camera
if (K4A_RESULT_SUCCEEDED != k4a_device_start_cameras(device, &config))
{
printf("Failed to start device\n");
goto Exit;
}
// 5: Start to receive the captures and frames
while (captureFrameCount-- > 0)
{
// Get a depth frame
switch (k4a_device_get_capture(device, &capture, TIMEOUT_IN_MS))
{
case K4A_WAIT_RESULT_SUCCEEDED:
break;
case K4A_WAIT_RESULT_TIMEOUT:
printf("Timed out waiting for a capture\n");
continue;
break;
case K4A_WAIT_RESULT_FAILED:
printf("Failed to read a capture\n");
goto Exit;
}
printf("==================================\n");
printf("Capture %4d | \n", captureFrameCount);
// 5.1 Probe for a color image
k4a_image_t image_color = k4a_capture_get_color_image(capture);
if (image_color != NULL)
{
// 5.2 Get the sizes of color image
int width = k4a_image_get_width_pixels(image_color);
int height = k4a_image_get_height_pixels(image_color);
int strides = k4a_image_get_stride_bytes(image_color);
printf("Color image height, width and strides: %d, %d, %d\n", height, width, strides);
// 5.3 Store the image using opencv Mat
uint8_t* color_image_data = k4a_image_get_buffer(image_color);
const Mat color_image(height, width, CV_8UC4, (void*)color_image_data, Mat::AUTO_STEP);
// 5.4 Display the color image
namedWindow("foobar", WINDOW_AUTOSIZE);
imshow("foobar", color_image);
waitKey(1000);
// 5.6 release the image
// k4a_image_release(image_color);
}
else
{
printf(" | Color None ");
}
// 6.1 Probe for a depth16 image
const k4a_image_t image_depth = k4a_capture_get_depth_image(capture);
if (image_depth != NULL)
{
// 6.2 Get the sizes of depth image
int width = k4a_image_get_width_pixels(image_depth);
int height = k4a_image_get_height_pixels(image_depth);
int strides = k4a_image_get_stride_bytes(image_depth);
printf("Depth image height, width and strides: %d, %d, %d\n", height, width, strides);
// 6.3 Store the image using opencv Mat
uint16_t* depth_image_data = (uint16_t*)(void*)k4a_image_get_buffer(image_depth);
const Mat depth_image(height, width, CV_16U, (void*)depth_image_data, Mat::AUTO_STEP);
// 6.4 Display the depth image
namedWindow("foobar", WINDOW_AUTOSIZE);
imshow("foobar", depth_image);
waitKey(1000);
}
else
{
printf(" | Depth16 None\n");
}
// 7: Convert 2D points to 3D point cloud
int width = k4a_image_get_width_pixels(image_color);
int height = k4a_image_get_height_pixels(image_color);
// Find the position of the center point in the image
float point_row = (height / 2);
float point_column = (width / 2);
int point_row_coord = (height / 2);
int point_column_coord = (width / 2);
// 9.2 derive the depth value in the color camera geometry using the function k4a_transformation_depth_image_to_color_camera().
k4a_transformation_t transformation = NULL;
k4a_image_t transformed_depth_image = NULL;
if (K4A_RESULT_SUCCEEDED != k4a_image_create(K4A_IMAGE_FORMAT_DEPTH16,
width,
height,
width * (int)sizeof(uint16_t),
&transformed_depth_image))
{
printf("Failed to create transformed depth image\n");
goto Exit;
}
else
{
// Transform the depth image to the size of color camera
transformation = k4a_transformation_create(&calibration);
k4a_transformation_depth_image_to_color_camera(transformation, image_depth, transformed_depth_image);
// 9.3 Store the image using opencv Mat
uint16_t* transformed_depth_image_data = (uint16_t*)(void*)k4a_image_get_buffer(transformed_depth_image);
const Mat trans_depth_image(height, width, CV_16U, (void*)transformed_depth_image_data, Mat::AUTO_STEP);
// 9.4 Get the corresponding depth values of 2D point
vector<float> depth_value(1, 0);
printf("print the depth value of the center point\n");
// Print the depth value at the center point of the 2D color image.
// Mat::at takes (row, column), and DEPTH16 pixels are unsigned 16-bit values.
cout << trans_depth_image.at<uint16_t>(point_row_coord, point_column_coord) << endl;
// 9.5 Display the images
namedWindow("foobar", WINDOW_AUTOSIZE);
imshow("foobar", trans_depth_image);
waitKey(1000);
}
// To be continued.... k4a_calibration_2d_to_3d(&calibration, depth_value, )
// release images and the transformation handle
k4a_image_release(image_depth);
k4a_image_release(image_color);
k4a_image_release(transformed_depth_image);
k4a_transformation_destroy(transformation);
// release capture
k4a_capture_release(capture);
}
returnCode = 0;
Exit:
if (device != NULL)
{
k4a_device_stop_cameras(device);
k4a_device_close(device);
}
return returnCode;
}
The code displays the color image, the depth image, and the transformed depth image in turn (all three reuse the window named "foobar"), then prints the depth value of the center point.
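For reference, reading a single pixel from a DEPTH16 buffer is plain row-major indexing, with the row index first and the column second. The sketch below is illustrative only: `depth_at` is a hypothetical helper, not part of the SDK, and it assumes a buffer and stride like those returned by k4a_image_get_buffer and k4a_image_get_stride_bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an SDK function): read one DEPTH16 pixel from a
// row-major buffer. stride_bytes is the length of one row in bytes and may be
// larger than width * 2 if rows are padded.
// Returns the raw 16-bit value; 0 means the pixel has no valid depth.
uint16_t depth_at(const uint8_t* buffer, int width, int height, int stride_bytes, int row, int col)
{
    assert(row >= 0 && row < height && col >= 0 && col < width);
    (void)height; // only used by the assert above

    // Step to the start of the requested row, then to the requested column.
    const uint8_t* pixel = buffer + static_cast<size_t>(row) * static_cast<size_t>(stride_bytes) +
                           static_cast<size_t>(col) * sizeof(uint16_t);

    // memcpy avoids alignment and aliasing problems when reading from a byte buffer.
    uint16_t value = 0;
    std::memcpy(&value, pixel, sizeof(value));
    return value;
}
```

For the center point, the call would be `depth_at(buffer, width, height, stride, height / 2, width / 2)`. Swapping the last two arguments reads a different pixel (or goes out of bounds on a non-square image).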
Screenshots
Here I append the depth image and the transformed image that I have collected. Original depth image:

Transformed depth image:

Desktop (please complete the following information):
The K4A depth camera cannot always provide a depth value for every pixel. Especially for pixels representing scene points at a large distance or having low reflectivity in IR, the returned signal is too low to compute depth reliably. In such cases, we report a depth value of 0 indicating that the depth reading of the pixel is invalid.
We internally apply some processing steps in the depth engine to reduce the number of invalid pixels. However, we do not provide depth inpainting algorithms, i.e., algorithms that fill in invalid regions to provide a depth reading at every pixel. The reason is that such algorithms may be unreliable. If you want to reduce the number of invalid pixels, you might want to change the depth mode (if this is possible for your application). The narrow FoV modes will give you fewer invalid pixels than the wide FoV modes. Furthermore, the binned modes will give you fewer invalidations than the unbinned ones.
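To compare depth modes for a given scene, one option is to measure the fraction of invalid (zero) pixels in each depth frame. The following is a minimal sketch, assuming a tightly packed DEPTH16 buffer; `invalid_fraction` is a hypothetical helper name, not an SDK function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an SDK function): fraction of pixels in a tightly
// packed DEPTH16 buffer whose value is 0, i.e. pixels with no valid depth.
double invalid_fraction(const uint16_t* depth, size_t pixel_count)
{
    assert(depth != nullptr || pixel_count == 0);
    if (pixel_count == 0)
    {
        return 0.0;
    }

    size_t invalid = 0;
    for (size_t i = 0; i < pixel_count; ++i)
    {
        if (depth[i] == 0)
        {
            ++invalid;
        }
    }
    return static_cast<double>(invalid) / static_cast<double>(pixel_count);
}
```

Running this on frames captured with different depth_mode settings gives a rough, scene-dependent comparison of how many pixels each mode invalidates.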
For clarity, this issue is not related to the function k4a_transformation_depth_image_to_color_camera(). Unfortunately, the interpolation methods in k4a_transformation_depth_image_to_color_camera_custom() will not help to solve this problem either.
@mbleyer Thanks for the clarification! I tried the inpaint function built into OpenCV, and as mbleyer mentioned, inpainting is not reliable. In my project I only need the depth values of a few points of interest, so I simply use the nearest neighbor to fill in the missing depth values.
In the hope of helping others, here is what I found when testing cv::inpaint to interpolate the missing depth values. The results are appended below:
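A nearest-neighbor fill for a single point of interest could look roughly like the sketch below. This is not the code used in this thread: `nearest_valid_depth` and its `max_radius` parameter are hypothetical, and the image is assumed to be a tightly packed, row-major DEPTH16 buffer already transformed to the color camera.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: return the depth of the valid (non-zero) pixel closest
// to (row, col), searching a square window of half-size max_radius.
// Returns 0 if the window contains no valid pixel.
uint16_t nearest_valid_depth(const std::vector<uint16_t>& depth,
                             int width, int height, int row, int col, int max_radius)
{
    assert(depth.size() == static_cast<size_t>(width) * static_cast<size_t>(height));
    assert(row >= 0 && row < height && col >= 0 && col < width);

    uint16_t best = 0;
    long best_dist2 = -1; // squared pixel distance of the best candidate so far

    const int r0 = std::max(0, row - max_radius);
    const int r1 = std::min(height - 1, row + max_radius);
    const int c0 = std::max(0, col - max_radius);
    const int c1 = std::min(width - 1, col + max_radius);

    for (int r = r0; r <= r1; ++r)
    {
        for (int c = c0; c <= c1; ++c)
        {
            const uint16_t d = depth[static_cast<size_t>(r) * width + c];
            if (d == 0)
            {
                continue; // invalid pixel, skip
            }
            const long dr = r - row;
            const long dc = c - col;
            const long dist2 = dr * dr + dc * dc;
            if (best_dist2 < 0 || dist2 < best_dist2)
            {
                best_dist2 = dist2;
                best = d;
            }
        }
    }
    return best;
}
```

Note that near a depth discontinuity the closest valid pixel may belong to a different surface than the point of interest, so a small `max_radius` and a sanity check on the result are advisable.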
Original depth image:

Transform depth image to color camera:

Generate the mask of the transformed image:

Apply cv::inpaint function on depth images:

In the images above, the depth values in the inpainted image fluctuate slightly with position. It also seems to become harder to tell depth values apart as points get closer to each other.
@opalmu Thank you for sharing your depth inpainting experiment. The information you provided in this issue will surely help others too. Let us know if you have any other questions; if not, please close this issue :)
@opalmu I am using your listing from above and get the results below (with stripes in the transformed depth image).
Does anyone know this problem? Maybe it is related to my sensor?



Could somebody help with this problem?
Operating system: Windows 10 64-bit
Azure-Kinect-Sensor-SDK: latest build from source
Compiler version (if built from source): Visual Studio 2019
Firmware: Loading firmware package AzureKinectDK_Fw_1.6.108079014.bin.
File size: 1294306 bytes
This package contains:
RGB camera firmware: 1.6.108
Depth camera firmware: 1.6.79
Depth config files: 6109.7 5006.27
Audio firmware: 1.6.14
Build Config: Production
Certificate Type: Microsoft
Signature Type: Microsoft
@GHSch your striping issue may be related to #840 and #294. Hope it helps.
Thank you a lot, that solved the problem.