Mediapipe: How do you crop the image region containing the hand after the palm is detected?

Created on 20 Aug 2019 · 2 Comments · Source: google/mediapipe

Hi,

Nice work on hand tracking and gesture recognition! I have a question, as stated in the title: after the palm is detected, how do you guarantee that the cropped image contains all the keypoints of the hand? If I see this correctly, the red box in this gif misses some fingertips. Thanks!

model

pharrellyhy

Most helpful comment

Hi,
Thanks for the questions.
1) To crop the correct image region, we first rotate the image so that the vector connecting the wrist and the MCP joint is vertical. We then expand the palm square in each direction by a fairly large scale factor chosen from our experiments. In addition, our model is trained with heavy augmentation to capture the variation in hand location within the cropped region. You can find the implementation details in the MediaPipe hand tracking graph.
2) You are absolutely correct that the details of the hand are not predicted well. This is mostly because the model does not handle motion blur well enough, and of course the model itself is not perfect yet.
We keep working on improving the model quality in various aspects. Your feedback is much appreciated!
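
For anyone who wants to experiment with the idea in point 1, here is a rough NumPy sketch of "rotate so wrist-to-MCP is vertical, then expand the palm square". It is not the code from the MediaPipe graph: the function names, the nearest-pixel sampling, and the `scale=2.5` value are all hypothetical choices made for illustration, and the actual scale used by the authors comes from their own experiments.

```python
import math
import numpy as np


def rotation_to_vertical(wrist, mcp):
    """Angle in radians between the wrist->MCP vector and image 'up'.

    Points are (x, y) in image coordinates, where y grows downward.
    Returns 0 when the MCP joint is directly above the wrist.
    """
    dx = mcp[0] - wrist[0]
    dy = mcp[1] - wrist[1]
    return math.atan2(dx, -dy)


def expanded_rect_corners(center, palm_size, angle, scale=2.5):
    """Corners of the palm square enlarged by `scale` and rotated by `angle`.

    `scale` is an illustrative value, not the one used by MediaPipe.
    Corner order for angle 0: top-left, top-right, bottom-right, bottom-left.
    """
    half = palm_size * scale / 2.0
    local = np.array([[-half, -half], [half, -half], [half, half], [-half, half]])
    c, s = math.cos(angle), math.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return np.asarray(center, dtype=float) + local @ rot.T


def crop_rotated(image, center, side, angle, out_size):
    """Sample an upright out_size x out_size crop from the rotated square.

    Uses nearest-pixel sampling; pixels outside the image are left as zeros.
    """
    c, s = math.cos(angle), math.sin(angle)
    # Offsets of output pixel centers from the crop center, in source pixels.
    t = ((np.arange(out_size) + 0.5) / out_size - 0.5) * side
    lx, ly = np.meshgrid(t, t)  # lx varies along columns, ly along rows
    sx = center[0] + c * lx - s * ly
    sy = center[1] + s * lx + c * ly
    xi = np.floor(sx).astype(int)
    yi = np.floor(sy).astype(int)
    h, w = image.shape[:2]
    inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros((out_size, out_size) + image.shape[2:], dtype=image.dtype)
    out[inside] = image[yi[inside], xi[inside]]
    return out
```

A typical use would be to compute `angle = rotation_to_vertical(wrist, mcp)` from the palm detector's keypoints, then call `crop_rotated(frame, palm_center, palm_size * scale, angle, out_size)` and feed the result to the landmark model. If fingertips still fall outside the box, increasing `scale` is the first thing to try, at the cost of lower effective resolution on the hand.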