I am currently running the mediapipe_multi_hands_tracking_aar_example.
Inside the MainActivity class I am able to obtain the normalized x and y coordinates of the landmarks by calling getX() and getY() on the received NormalizedLandmarks. However, I am trying to extract the box of pixels around the fingertips for each frame. To do this, I believe I need the unnormalized, precise coordinates of the fingertip landmarks. The incoming frames are 1080 x 1440 pixels, but the normalized landmark coordinates are all in [0,1]. Using the normalized landmark coordinates to map out the actual fingertip pixels will not be accurate at all.
How can I obtain the original, unnormalized landmark coordinates and use them in my MainActivity? If this is possible, what do these unnormalized coordinates represent? Are they pixel coordinates? Thank you.
Why do you think that it will not be accurate at all? The landmarks are at least single-precision floating-point values.
I am successfully extracting the normalized landmark coordinates and then processing them further for on-frame drawing without any issues.
The beauty of normalized coordinates is that they [generally] do not care what size of output is used.
Using getX()*width and getY()*height should be plenty accurate.
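As a rough illustration of that scaling, here is a minimal sketch that turns a normalized landmark into a pixel index and a clamped box around it. The class FingertipBox, its methods, and the halfSize parameter are hypothetical names made up for this example and are not part of MediaPipe. It also assumes the frame you crop from has the same orientation as the frame the landmarks were computed on.

```java
// Hypothetical helper for illustration only; not a MediaPipe API.
final class FingertipBox {
    /** Maps a normalized coordinate to a pixel index, clamped to [0, size - 1]. */
    static int toPixel(float normalized, int size) {
        int pixel = (int) Math.floor(normalized * size);
        // Landmarks can fall slightly outside [0, 1] when the hand leaves the frame.
        return Math.max(0, Math.min(size - 1, pixel));
    }

    /** Returns {left, top, right, bottom} (inclusive) of a square box clamped to the frame. */
    static int[] boxAround(float normX, float normY, int width, int height, int halfSize) {
        int cx = toPixel(normX, width);
        int cy = toPixel(normY, height);
        return new int[] {
            Math.max(0, cx - halfSize),
            Math.max(0, cy - halfSize),
            Math.min(width - 1, cx + halfSize),
            Math.min(height - 1, cy + halfSize)
        };
    }
}
```

In MainActivity this could be called as FingertipBox.boxAround(landmark.getX(), landmark.getY(), 1080, 1440, 20), where the half-size of 20 pixels is an arbitrary choice.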
Let's say you have a hand coordinate, in pixel space, of (123,123) in your 1080 x 1440 image, and it moves 1 pixel vertically to (123,124):
In normalized coordinates, the y values (divided by the height of 1440) would be
123/1440.0 = 0.08541666666666667
124/1440.0 = 0.08611111111111111
meaning there is a difference of
abs(0.08541666666666667 - 0.08611111111111111) = 0.000694444444444442
which floating point can easily resolve. Normalized floats are in fact finer-grained than integer pixel coordinates, so they allow for sub-pixel accuracy.
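To check the arithmetic above in single precision, here is a small sketch. The class NormalizedPrecision and its methods are hypothetical names made up for this example. It computes the normalized step for one pixel and counts how many pixel indices fail to survive a normalize/denormalize round trip through a float.

```java
// Hypothetical helper for illustration only; not a MediaPipe API.
final class NormalizedPrecision {
    /** Normalized distance between pixel and pixel + 1 along an axis of the given size. */
    static float onePixelStep(int pixel, int size) {
        return (float) (pixel + 1) / size - (float) pixel / size;
    }

    /** Counts pixel indices in [0, size) that a float round trip fails to recover. */
    static int roundTripMismatches(int size) {
        int mismatches = 0;
        for (int pixel = 0; pixel < size; pixel++) {
            float normalized = (float) pixel / size;
            // Round to the nearest pixel to absorb tiny float rounding error.
            int recovered = Math.round(normalized * size);
            if (recovered != pixel) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

For a 1080 x 1440 frame, roundTripMismatches returns 0 on both axes, since the float rounding error is orders of magnitude smaller than half a pixel.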
I realize my misinterpretation now. Thank you all for the input. I do have a new issue about accessing raw camera frames and would appreciate any help. https://github.com/google/mediapipe/issues/793