Azure-kinect-sensor-sdk: Deadlock in transform functions

Created on 22 May 2019 · 9 Comments · Source: microsoft/Azure-Kinect-Sensor-SDK

Describe the bug

The transform functions (e.g. k4a_transformation_color_image_to_depth_camera) are susceptible to deadlocks (I would guess due to a lost wakeup). These won't occur in the example because the functions are only called once there.

To Reproduce

Call some transformation functions in a loop and wait. The deadlock occurred much faster on my slower devices. I assembled a minimal repro application here

Expected behavior

An infinite loop of transformed frames.

Desktop (please complete the following information):

  • Windows 1903
  • c2949cc06c82530d153f33ca450f09303fc0c22a
Bug

All 9 comments

@mbleyer, can you take a look?

Not sure if I understand the bug. Does your example program hang or does it run out of memory? It seems like the images are not deallocated.

The program hangs between getting the frame and processing it. Thanks to C++ RAII the image gets deallocated after line 37.

Do you know if you still reproduce the problem if you don't re-create the transformation object in each iteration of the loop?

I was able to repro using Helco's code sample. The issue goes away when initializing the transformation object outside of the while loop, which represents the right use of the SDK. We might want to highlight that in the documentation.
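The difference between the two usage patterns can be sketched without the SDK. The following is a minimal illustration, not SDK code: `FakeTransformation` is a hypothetical stand-in for a heavyweight object such as the transformation handle, and it only counts how often it is constructed. In a real program the object would be created from the device calibration instead.

```cpp
#include <vector>

// Counts constructions of the stand-in object (illustration only).
int g_constructions = 0;

// Hypothetical stand-in for a heavyweight object such as the SDK's
// transformation handle. This is NOT the SDK type.
struct FakeTransformation {
    std::vector<float> lookup_table;

    // Pretend that setup is expensive (e.g. building lookup tables).
    FakeTransformation() : lookup_table(1024, 1.0f) { ++g_constructions; }

    // Stand-in for a per-frame transform call.
    float apply(float pixel) const { return pixel * lookup_table[0]; }
};

// Pattern from the repro: the object is re-created for every frame.
int run_recreating_each_frame(int frames) {
    g_constructions = 0;
    for (int i = 0; i < frames; ++i) {
        FakeTransformation t;  // expensive setup repeated per frame
        (void)t.apply(static_cast<float>(i));
    }
    return g_constructions;
}

// Recommended pattern: create the object once, reuse it in the loop.
int run_with_hoisted_object(int frames) {
    g_constructions = 0;
    FakeTransformation t;  // expensive setup done once
    for (int i = 0; i < frames; ++i) {
        (void)t.apply(static_cast<float>(i));
    }
    return g_constructions;
}
```

Calling `run_recreating_each_frame(5)` performs five constructions, while `run_with_hoisted_object(5)` performs one, which is why the frame rate is expected to drop with the first pattern.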

@christianmakela: We can try investigating why the deadlock occurs. It would be expected that the frame rate drops significantly in Helco's code, but we should not see a deadlock.

Yes, we should probably highlight that the transformation object itself is somewhat heavyweight. I can see why it wouldn't be obvious though. The C API hints at this, but I imagine that C++ consumers won't dig that deep unless they have a problem.

In any case, we certainly shouldn't deadlock.

I tried a few more times and was not able to repro the deadlock anymore. Andrew @rabbitdaxi is looking at it now.

Great stuff, with the workaround I also couldn't reproduce the deadlock on my machine. Also, as mbleyer guessed, the frame rate improved quite a lot.

Meanwhile, I will take a deeper look with a low-end computer. It might be a race between the worker thread and the main thread, where the mutex was not robust enough to prevent the deadlock when the user calls transformation_create in a loop (although that is not the recommended way). However, we still need to fix the deadlock if we find a way to repro :)
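For readers unfamiliar with the lost-wakeup hypothesis raised in this thread, here is a minimal sketch using only the C++ standard library. It does not reflect the SDK's actual internals, which are not shown in this thread; `JobState` and `run_job_and_wait` are hypothetical names. The point is that the waiter checks a flag protected by the same mutex, so a notification sent before the wait starts is not lost.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Shared state between the main thread and one worker (illustration only).
struct JobState {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;  // guarded by m
    int result = 0;     // guarded by m
};

// Starts a worker, waits for it to finish, and returns its result.
int run_job_and_wait(int input) {
    JobState state;

    std::thread worker([&state, input] {
        int value = input * 2;  // stand-in for the real work
        {
            std::lock_guard<std::mutex> lock(state.m);
            state.result = value;
            state.done = true;  // flag is set while holding the mutex
        }
        state.cv.notify_one();
    });

    int result = 0;
    {
        std::unique_lock<std::mutex> lock(state.m);
        // Predicate form of wait: if done is already true, this returns
        // immediately instead of blocking on a notification that has
        // already been sent. Waiting without the flag could hang forever.
        state.cv.wait(lock, [&state] { return state.done; });
        result = state.result;
    }

    worker.join();
    return result;
}
```

A slower machine changes the relative timing of the two threads, which would be consistent with the report that the hang shows up faster on slower devices, though the thread does not confirm the root cause.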
