Azure-kinect-sensor-sdk: Publish recording sample data

Created on 10 May 2019 · 27 Comments · Source: microsoft/Azure-Kinect-Sensor-SDK

Is your feature request related to a problem? Please describe.
Making recordings available enables developers without access to Azure Kinect DK hardware to begin integrating with the Azure Kinect Sensor SDK.

Describe the solution you'd like
We would like to pre-record some short sample data that would be useful for developers to test their solutions.

:question: :grey_question: :question: :grey_question: :question: :grey_question:

_Developers!_ If you have suggestions on what recorded content would be useful for you, please add them to this issue.

Enhancement More Info Needed

All 27 comments

OK. Here is a suggestion for two recording scenarios, each within 20-30 seconds. They could be used later for tests of the body tracking SDK, as well:

  1. Slow moving people with occlusion: Two, three or four people walking in a room that contains several large objects. The people may go in front or behind the objects, leave the camera view, then come back again.
  2. Fast moving people, randomly obscuring each other: Several people playing basketball (or football) outside or in a large enough room.

If possible, please provide recordings from multiple (2-3) synchronized cameras, located at the room walls or around the playfield.

For some medical-use precision tests, how about a subject in undergarments (skin showing, to evaluate reflectivity) standing in front of the sensor, which is placed on a 1 m tall tripod at 1.5 m distance (the whitepaper says that's the 1 mm error range)? Since I realize that's a privacy issue, the subject can be turned with their back towards the sensor. Also, will the SDK have a fusion API for creating and exporting vertex-colored meshes?

_The fastpointcloud example only gives you a PLY with the vertex values and no triangles. Do you suggest we interpolate the facets from the depth map in the example, or could we get some insight into how that was done in the v2 SDK? TIA_

EDIT _Or better yet, just take a recording of a known-size mannequin and give us its size, so we'd be able to measure the accuracy ourselves and see whether some point cloud smoothing techniques give even better results in case of excessive jitter..._ and enable pre-orders in Europe too :)

Great idea, that would certainly help us poor chaps without a physical sensor :)
A few more generic suggestions that could be useful for various scenarios:

  • recordings with a static sensor looking at a generic interior in all the various depth resolutions
    preferably with objects/walls at various distances all the way up to the maximum distance the sensor can see

  • a sensor moving around an object or a (cluttered) desk, to test fusion/scanning algorithms

  • a medium shot of the upper body of a person doing some hand gestures

  • a closeup of a person's face doing some expressions, speaking and looking around

  • a set of recordings from several synchronized sensors, for example a person walking in and waving

If possible with accompanying video and/or infrared data (as well as IMU)

I'd echo Brekel's requests, in particular the multi-sensor data.

Hi there, when could we expect at least 1-2 sample recordings?

Here are some recordings. Sorry they are not fancier; there is a bit more process associated with releasing videos with people in them.

https://www.microsoft.com/en-us/download/details.aspx?id=58385&WT.mc_id=

Thanks for sharing those, that's really helpful to get some of us started!

Could someone also share usage? I tried the viewer and it doesn't work. It's not like the previous Studio playback, where all apps, not just one, detected a virtual sensor during playback. I'm guessing you play back via code, but it's not clear to me how. I copied some code from playback_external_sync/main.c but it still doesn't play.
TIA

Works perfectly fine.
You can have a quick look at the files with k4aviewer from the SDK's tool folder.

From code, check the following header file: k4arecord/playback.h
Call k4a_playback_open and point it to a file.
Call k4a_playback_get_next_capture to fetch a capture frame and use it in the same way as you would from a live sensor.

The documentation in the header file seems pretty self explanatory, the only thing I haven't really figured out the usage for is: k4a_playback_get_tag

Hi @Brekel
Thanks for the prompt reply. I tried to do the same, but it still doesn't work.
As you said, I tried to add the k4arecord stuff, but it either wouldn't work from VS, failing with this error (SEE EDIT):

[2019-06-16 14:42:31.149] [error] [t=19036] D:\a\1\s\extern\Azure-Kinect-Sensor-SDK\src\record\sdk\playback.cpp (49): k4a_playback_open(). Unable to open file 'OFFICE_SAMPLE_WFOV_UNBINNED.mkv': ios_base::failbit set: iostream stream error

(after the k4a_playback_open call the handle is NULL), or it just doesn't do anything from the command line. Thanks to your tip I reduced some complexity of the code by not using the recording_t struct, but it still produces no output when adding this to fastpointcloud:

/*
...  this was added initially prior to main ... 
typedef struct
{
    char *filename;
    k4a_playback_t handle;
    k4a_record_configuration_t record_config;
    k4a_capture_t capture;
} recording_t;
*/
... added in main ...
        std::string exePath = argv[0];

        std::size_t sepPos = exePath.rfind('\\');

        std::string filePath = exePath.substr(0, sepPos + 1);

        input_mkv = argv[1];
        file_name = argv[2];

    /* This was added prior to simplified code ... BUT STILL WOULDN'T WORK WITH SDK 1.1.0
        recording_t *files = (recording_t *)malloc(sizeof(recording_t));
        if (files == NULL)
        {
            printf("Failed to allocate memory for playback (%zu bytes)\n", sizeof(recording_t));
            return 1;
        }
        memset(files, 0, sizeof(recording_t));
        files[0].filename = (char *)input_mkv.c_str();
        k4a_playback_open(files[0].filename, &files[0].handle);
        */

        k4a_playback_t handle = nullptr;
        k4a_result_t result = k4a_playback_open(filePath.append(argv[1]).c_str(), &handle);

        // Get a capture
        switch (k4a_playback_get_next_capture(handle, &capture))
        {
        case K4A_WAIT_RESULT_SUCCEEDED:
            break;
        case K4A_WAIT_RESULT_TIMEOUT:
            printf("Timed out waiting for a capture\n");
            goto Exit;
        case K4A_WAIT_RESULT_FAILED:
            printf("Failed to read a capture\n");
            goto Exit;
        }

Did I miss something? Is setting the filename all that's required, or is anything else needed too? Much obliged, TIA

EDIT
The trouble in VS with the handle being NULL was that I was passing just the filename, without the full path, in the command arguments. But even now, with code to get the full path, it failed on point_cloud_data[i].xyz.x = nanf(""); with

Exception thrown at 0x00007FF67471204D in fastpointcloud.exe: 0xC0000005: Access violation writing location 0x0000000000000000.

Why are the SDK-provided bin and lib release-only? I don't hit breakpoints prior to the line that causes the access violation. Is that because of the release-only binaries, or are you supposed to use source code you built yourself? (I'll have to switch to another computer with more storage for that, and build the kinfu example and OpenCV contribs too, or are there binaries?)
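The relative-path problem described in the EDIT above can be sidestepped by resolving the recording name before opening it. The following is only a sketch, assuming C++17 and assuming (as in the post) that the recording sits next to the executable; `resolve_recording_path` is a hypothetical helper, not an SDK function, and `argv[0]` is not guaranteed to hold a usable path on every platform.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the Azure Kinect SDK).
// Absolute recording paths are returned unchanged. Relative names are
// resolved against the directory that contains the executable (argv[0]).
std::string resolve_recording_path(const std::string &exe_path, const std::string &recording)
{
    const fs::path rec(recording);
    if (rec.is_absolute())
    {
        return rec.string();
    }
    // fs::absolute does not require the file to exist.
    const fs::path exe_dir = fs::absolute(fs::path(exe_path)).parent_path();
    return (exe_dir / rec).lexically_normal().string();
}
```

The returned string could then be passed to k4a_playback_open in place of the hand-built `filePath`; checking `fs::exists` on it first would give a clearer error than the `ios_base::failbit` message.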

Haven't compiled your code but I think your switch statement for k4a_playback_get_next_capture is the culprit as it returns a k4a_stream_result_t (and not a k4a_wait_result_t).

I do the following:

k4a_capture_t capture;
if(k4a_playback_get_next_capture(m_playback, &capture) != K4A_STREAM_RESULT_SUCCEEDED)
    return;

k4a_image_t image_color = k4a_capture_get_color_image(capture);
k4a_image_t image_IR    = k4a_capture_get_ir_image(capture);
k4a_image_t image_depth = k4a_capture_get_depth_image(capture);

Process_Color(image_color);
Process_IR(image_IR);
Process_Depth(image_depth);

k4a_image_release(image_color);
k4a_image_release(image_IR);
k4a_image_release(image_depth);
k4a_capture_release(capture);

I do use these (and some other things) right after opening the file but they may not be required, more for internal housekeeping:
k4a_playback_set_color_conversion(m_playback, K4A_IMAGE_FORMAT_COLOR_BGRA32);
k4a_playback_get_record_configuration(m_playback, &config);
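The point about result types can be illustrated without the SDK. The sketch below uses stand-in types only: `StreamResult`, `FakePlayback`, `get_next_capture` and `count_captures` are all hypothetical names that mimic a three-state "succeeded / failed / end of file" result, so that end of file is handled as a normal loop exit rather than as a timeout or an error. With the real SDK, the same loop shape would switch on the k4a_stream_result_t value returned by k4a_playback_get_next_capture.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a three-state stream result (hypothetical; the real type is
// k4a_stream_result_t from the k4arecord headers).
enum class StreamResult { Succeeded, Failed, Eof };

// Hypothetical in-memory "recording": one timestamp per capture.
struct FakePlayback
{
    std::vector<long long> timestamps_usec;
    std::size_t next = 0;
    bool corrupt = false; // set to true to simulate a read error
};

// Hypothetical reader that mimics the shape of a playback "get next" call.
StreamResult get_next_capture(FakePlayback &pb, long long &timestamp_usec)
{
    if (pb.corrupt)
        return StreamResult::Failed;
    if (pb.next >= pb.timestamps_usec.size())
        return StreamResult::Eof;
    timestamp_usec = pb.timestamps_usec[pb.next++];
    return StreamResult::Succeeded;
}

// Reads captures until end of file.
// Returns the number of captures read, or -1 if a read failed.
int count_captures(FakePlayback &pb)
{
    int count = 0;
    long long ts = 0;
    for (;;)
    {
        switch (get_next_capture(pb, ts))
        {
        case StreamResult::Succeeded:
            ++count; // a real program would process the capture here
            break;
        case StreamResult::Eof:
            return count; // normal end of the recording
        case StreamResult::Failed:
            return -1; // genuine read error
        }
    }
}
```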

Thanks, got it. It wasn't the config, but it made me look: I needed to get the calibration not from the device, which is non-existent, but from the playback. Now, a different question: will this somehow, even in C code, be abstracted in a later version? Having different methods is a mess. I'm guessing going multi-platform made MS lose kinectservice.exe...but? :/

P.S.

And that question about debug/release still stands: some breakpoints just aren't hit when using the SDK .msi. Is the source required?

Thank you, @wes-b for sharing these recordings! They are really helpful to get started with K4A SDK. Do you plan to add more recordings later, with more dynamic content and (if possible) moving people, as well? And, if not too much, may I ask 1-2 questions regarding these recordings?

@rfilkov we have aspirations to publish more. No idea of a time frame, so what are your questions?

@wes-b Thank you in advance for your aspirations to publish more. Here are my questions regarding the current recordings:

  1. Why are the depth images octagonal in NFOV mode and circular in WFOV mode?
  2. In all recordings, the config says: color_track_enabled: true, depth_track_enabled: false, ir_track_enabled: false, imu_track_enabled: false. Regardless, the depth, IR and IMU frames are present in the recording. Is this a bug, or am I missing something?

I have asked for help on #1. For #2 what tool/config are you referring to? I don't see similar results in k4aviewer.

I'm referring to the config returned by calling k4a_playback_get_record_configuration() right after k4a_playback_open(). The other values in the config seem correct.

Hmmmm weird.

When I call this right after loading one of the sample files:
k4a_record_configuration_t config;
k4a_playback_get_record_configuration(m_playback, &config);

config.color_track_enabled, config.depth_track_enabled and config.imu_track_enabled are all true.
I'm on the latest Windows 1.1 build (installed from the MSI), maybe a bug in a specific version/platform?

Hm. I installed 'Azure Kinect SDK 1.1.0.msi', as well. Hopefully there are no different msi builds of the same version.

@wes-b, how can we tell in playback mode whether the color and depth streams are synchronized? The only (maybe related) setting in the playback config is 'depth_delay_off_color_usec'. Does this time difference indicate whether the streams are synchronized?

And, if a capture contains both color and depth images (regardless of the value of 'depth_delay_off_color_usec'), are these images synchronized or just packed together?

By the way, now with body tracking SDK available, at least 1-2 recordings with several moving bodies inside would be needed. To avoid personal and security issues, only the depth stream could be added to these recordings.

@rfilkov Color and depth frames will only be packed into the same capture object if their timestamps match up.
The only exception to this is if the recording was made with a non-zero depth_delay_off_color_usec value, in which case a single capture will contain a color and depth frame that are offset by that amount.

If there is no synchronized frame available (such as if frames were dropped during recording), the capture object may only contain a single image.
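The pairing rule described above can be sketched in a few lines. This is an illustration of the rule as stated in this comment, not the SDK's implementation: `Capture` and `pair_frames` are hypothetical names, timestamps are compared for exact equality (an assumption), and a positive `depth_delay_off_color_usec` is assumed to mean the depth frame is stamped later than the color frame.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical capture: timestamps of whichever images it contains.
struct Capture
{
    std::optional<int64_t> color_usec;
    std::optional<int64_t> depth_usec;
};

// Pairs a color and a depth frame when
//   depth timestamp == color timestamp + depth_delay_off_color_usec.
// Both inputs must be sorted ascending. A frame without a partner
// (for example after a dropped frame) becomes a single-image capture.
std::vector<Capture> pair_frames(const std::vector<int64_t> &color,
                                 const std::vector<int64_t> &depth,
                                 int64_t depth_delay_off_color_usec)
{
    std::vector<Capture> out;
    std::size_t c = 0;
    std::size_t d = 0;
    while (c < color.size() && d < depth.size())
    {
        const int64_t expected_depth = color[c] + depth_delay_off_color_usec;
        if (depth[d] == expected_depth)
            out.push_back({color[c++], depth[d++]}); // synchronized pair
        else if (depth[d] < expected_depth)
            out.push_back({std::nullopt, depth[d++]}); // depth only
        else
            out.push_back({color[c++], std::nullopt}); // color only
    }
    for (; c < color.size(); ++c)
        out.push_back({color[c], std::nullopt});
    for (; d < depth.size(); ++d)
        out.push_back({std::nullopt, depth[d]});
    return out;
}
```

For example, with color frames at 0, 33333 and 66666 µs, depth frames at 0 and 66666 µs, and a zero delay, the middle capture holds only a color image.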

Now that the main documentation site has been published, this page may be useful for people working with these recordings:

https://docs.microsoft.com/en-us/azure/Kinect-dk/record-playback-api

@rfilkov regarding the question "Why are the depth images octagonal in NFOV mode and circular in WFOV mode?"

  • NFOV and WFOV modes use different laser diffusers. The NFOV diffuser generates a hexagonal illumination field (NFOV is essentially a region of interest in the depth sensor, and the ROI boundary clips the hexagon, resulting in an octagonal shape), while the WFOV diffuser generates a circular one. For quality reasons, pixels outside the illumination field are not recommended for generating depth.
  • Here is the official documentation of the depth camera concepts, which has a section discussing the invalidation caused by the illumination mask: https://docs.microsoft.com/en-us/azure/Kinect-dk/depth-camera
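The "hexagon clipped by a rectangular ROI gives an octagon" geometry can be checked with a small polygon-clipping sketch. Everything here is illustrative: the hexagon orientation, its radius and the ROI bounds are made-up values, not the sensor's real geometry, and `clip_half_plane`, `clip_to_roi` and `regular_hexagon` are hypothetical helpers built on standard Sutherland-Hodgman clipping.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point
{
    double x;
    double y;
};

// One Sutherland-Hodgman pass: keeps the part of a convex polygon that
// satisfies a*x + b*y <= c.
std::vector<Point> clip_half_plane(const std::vector<Point> &poly, double a, double b, double c)
{
    std::vector<Point> out;
    for (std::size_t i = 0; i < poly.size(); ++i)
    {
        const Point &p = poly[i];
        const Point &q = poly[(i + 1) % poly.size()];
        const double dp = a * p.x + b * p.y - c; // <= 0 means inside
        const double dq = a * q.x + b * q.y - c;
        if (dp <= 0)
            out.push_back(p);
        if ((dp < 0 && dq > 0) || (dp > 0 && dq < 0))
        {
            // The edge crosses the boundary: add the intersection point.
            const double t = dp / (dp - dq);
            out.push_back({p.x + t * (q.x - p.x), p.y + t * (q.y - p.y)});
        }
    }
    return out;
}

// Clips a convex polygon to the axis-aligned rectangle [xmin, xmax] x [ymin, ymax].
std::vector<Point> clip_to_roi(std::vector<Point> poly, double xmin, double xmax, double ymin, double ymax)
{
    poly = clip_half_plane(poly, 1.0, 0.0, xmax);   // x <= xmax
    poly = clip_half_plane(poly, -1.0, 0.0, -xmin); // x >= xmin
    poly = clip_half_plane(poly, 0.0, 1.0, ymax);   // y <= ymax
    poly = clip_half_plane(poly, 0.0, -1.0, -ymin); // y >= ymin
    return poly;
}

// Regular hexagon centered at the origin with vertices on the x axis.
std::vector<Point> regular_hexagon(double radius)
{
    const double pi = 3.14159265358979323846;
    std::vector<Point> v;
    for (int k = 0; k < 6; ++k)
        v.push_back({radius * std::cos(k * pi / 3.0), radius * std::sin(k * pi / 3.0)});
    return v;
}
```

Clipping `regular_hexagon(1.0)` with an ROI of x in [-0.75, 0.75] and y in [-1, 1] cuts off the left and right corners; each removed corner is replaced by two new vertices, so the result has eight.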

Thank you, @xthexder! Thank you, @rabbitdaxi!

Hi, how should I record using the body tracking SDK? Thanks!

Hello all, I was looking for ideas to try in my lab, and these are good ideas.
I will try to recreate some of these to the best of my ability.
My first video was posted yesterday, but please send me questions or use cases; I'd love to try them.

https://youtu.be/srSCs0vmvGU
