I am impressed you guys got hand tracking in with v0.9.4, but I found that its usefulness only goes out to about 4 to 5 feet before it loses confidence and falls back to tracking just the wrists. By comparison, the ol' Kinect v2 that I have can go out to nearly 10 feet before it can no longer detect the hands, while still telling me whether it's a fist, open, or "lasso".
I guess your models were only done at a desk and not an open space?
@JZharay You should still have joint values for the hand joint even if the confidence is set to NONE. There is a marked drop-off in hand joint accuracy beyond 2m from the camera, hence the NONE confidence value.
In Kinect v2 we had a hand state classifier for open / closed / lasso states separate to the body tracking algorithm. For Azure Kinect we have a new body tracking algorithm and we have not made a hand state classifier. This is something that could potentially be added in future if required, but another possibility is to do fully articulated hand tracking as in HoloLens 2. A state classifier can produce good results at longer distances but it only outputs one discrete label, whereas a pose tracker requires higher input resolution but can provide full 3D articulation.
I would like to understand the various scenarios for which users want hand tracking so we can move the technology in the right direction. What is your use case?
As @qm13 says, the hand joints are still tracked beyond 2m but we set the confidence to NONE to indicate that the accuracy of the joint positions is questionable.
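For anyone wondering how to act on that in an app, here is a minimal sketch of the fallback described above: use the hand joint while its confidence is above NONE, otherwise fall back to the wrist. The `Joint` and `Confidence` types and `pick_hand_position` are hypothetical stand-ins written for this example, not the SDK's own types; in a real app the values would come from the joints the Body Tracking SDK returns.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidence(IntEnum):
    # Hypothetical stand-in for the joint confidence levels discussed above.
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Joint:
    # Hypothetical stand-in for one tracked joint: a 3D position plus confidence.
    x: float
    y: float
    z: float
    confidence: Confidence


def pick_hand_position(hand, wrist, min_confidence=Confidence.LOW):
    """Return ((x, y, z), source), preferring the hand joint when it is trusted."""
    if hand.confidence >= min_confidence:
        return (hand.x, hand.y, hand.z), "hand"
    # The hand joint is still reported, but its accuracy is questionable,
    # so use the wrist position instead.
    return (wrist.x, wrist.y, wrist.z), "wrist"


# Example: a hand joint reported with NONE confidence falls back to the wrist.
far_hand = Joint(100.0, 50.0, 2500.0, Confidence.NONE)
far_wrist = Joint(90.0, 60.0, 2510.0, Confidence.MEDIUM)
position, source = pick_hand_position(far_hand, far_wrist)
```

The threshold is a parameter so an app can demand MEDIUM or better if LOW turns out to be too noisy for its interaction.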
@tobysharp-msft The hand state classifier has already been requested (many times). Look here: https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38154784-hands-support-in-body-tracking It would provide a means of interacting with virtual objects. The hand states could also be used to emulate taps, drags, drops, and interactions with the UI, like button presses, etc. I thought you had already started on it and even expected it in the next Body SDK update :(
Of course, fully articulated hand tracking would always be preferred, but even then such discrete states (like open and closed hands, finger pinches, presses, touches, etc.) would still be very valuable to developers. The beauty of Kinect has always been that the user is the controller, without carrying any extra gear like headsets, controllers, trackers, etc.
As someone who has built a direct manipulation music interface with the existing Kinect V2 classifier poses (see holofunk.com), I strongly prefer the classifier approach. Anyone doing prototyping or experimental work with the Kinect is very likely to lack the resources to develop their own classifier. It is a significant effort. The feature will be much more widely adopted if you can develop a classifier. The basic Kinect poses are the obvious start, with the possible addition of one or two more (perhaps a thumb-and-finger "bang-bang" pointing variation, or a "Spock greeting" hand pose -- whatever is distinct enough to be classifiable).
Full finger data seems unlikely to be reliable enough to do much with, but maybe the new sensor would surprise me. But please, please develop a classifier -- MSFT has far more resources to make a good one than anyone external is likely to have for some time.
Here is a new feature request regarding hand states: https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38933653-classifier-for-hand-states
The original one, https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38154784-hands-support-in-body-tracking , was closed half-completed.
@RobJellinghaus That's very useful info, and points well made. Thank you. I hope to be able to come back with news on this in the near future.
You're welcome, I sincerely appreciate your receptiveness to the feedback.
FWIW one of the biggest issues with the Kinect V2 hand pose classifier is accuracy when the hand is at a low angle and/or in front of the torso. I had to build my app to specifically work around the major loss of accuracy when the hand is in front of the torso; the classifier works well when hands are out to the side, but much less well in other poses. If this is something that the new classifier's training data set can address (having lots of training poses with hands in front of torso from the camera's perspective), it would be a significant improvement even if only the Kinect V2 poses are detected.
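To illustrate the kind of workaround mentioned above, here is a rough sketch of how an app might flag "hand in front of torso from the camera's perspective" so it can distrust the hand state in that region. This is not how my app or either SDK does it; `hand_overlaps_torso`, its arguments, and the `margin` parameter are all hypothetical, and it assumes you already have the joints' x/y coordinates in a common camera-space unit.

```python
def hand_overlaps_torso(hand_xy, left_shoulder_xy, right_shoulder_xy, pelvis_xy, margin=0.0):
    """Return True if the hand falls inside the torso's bounding box in camera x/y.

    The box spans the two shoulders horizontally and shoulders-to-pelvis
    vertically. min/max are used throughout so the check does not depend on
    which way the camera's axes point.
    """
    x_lo = min(left_shoulder_xy[0], right_shoulder_xy[0]) - margin
    x_hi = max(left_shoulder_xy[0], right_shoulder_xy[0]) + margin
    ys = (left_shoulder_xy[1], right_shoulder_xy[1], pelvis_xy[1])
    y_lo = min(ys) - margin
    y_hi = max(ys) + margin
    return x_lo <= hand_xy[0] <= x_hi and y_lo <= hand_xy[1] <= y_hi


# Example (metres): shoulders 0.4 apart, pelvis 0.5 below them.
left_shoulder, right_shoulder, pelvis = (-0.2, 0.5), (0.2, 0.5), (0.0, 0.0)
in_front = hand_overlaps_torso((0.0, 0.2), left_shoulder, right_shoulder, pelvis)
out_to_side = hand_overlaps_torso((0.6, 0.3), left_shoulder, right_shoulder, pelvis)
```

When the check is true, an app could hold the last trusted hand state instead of reacting to a new one; a positive `margin` widens the region that is treated as unreliable.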