How can I get the bounding box from hand tracking using Python code?
The Hands Python API doesn't return the palm bounding boxes (bboxes), but you can edit the source code so that the Hands API outputs them:
--- a/mediapipe/python/solutions/hands.py
+++ b/mediapipe/python/solutions/hands.py
@@ -142,7 +142,7 @@ class Hands(SolutionBase):
'handlandmarkcpu__ThresholdingCalculator.threshold':
min_tracking_confidence,
},
- outputs=['multi_hand_landmarks', 'multi_handedness'])
+ outputs=['multi_hand_landmarks', 'multi_handedness', 'palm_detections'])
def process(self, image: np.ndarray) -> NamedTuple:
"""Processes an RGB image and returns the hand landmarks and handedness of each detected hand.
Then you can get the list of palm detections with:
for palm_detection in results.palm_detections:
    print(palm_detection.location_data.relative_bounding_box)
You will see the relative bbox data like the following:
xmin: 0.6212505102157593
ymin: 0.46138930320739746
width: 0.2514756917953491
height: 0.37721356749534607
Finally, you can multiply xmin and width by the image width, and ymin and height by the image height, to get the bbox coordinates in pixels.
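The conversion described above can be sketched as follows. This is only a minimal illustration: `RelativeBox` and `to_pixel_box` are hypothetical helper names, not part of MediaPipe, and `RelativeBox` merely stands in for any object with `xmin`, `ymin`, `width` and `height` fields such as the relative bounding box printed above. The 640x480 image size is an assumed example.

```python
from typing import NamedTuple, Tuple


class RelativeBox(NamedTuple):
    """Hypothetical stand-in for a relative bounding box (values in [0, 1])."""
    xmin: float
    ymin: float
    width: float
    height: float


def to_pixel_box(box, image_width: int, image_height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels, clamped to the image."""
    # x values scale with the image width, y values with the image height.
    left = int(round(box.xmin * image_width))
    top = int(round(box.ymin * image_height))
    right = int(round((box.xmin + box.width) * image_width))
    bottom = int(round((box.ymin + box.height) * image_height))
    # A detection can extend slightly past the frame, so clamp to valid pixels.
    left = max(0, min(left, image_width - 1))
    right = max(0, min(right, image_width - 1))
    top = max(0, min(top, image_height - 1))
    bottom = max(0, min(bottom, image_height - 1))
    return left, top, right, bottom


# Values from the sample output above, with an assumed 640x480 image.
example = RelativeBox(0.6212505102157593, 0.46138930320739746,
                      0.2514756917953491, 0.37721356749534607)
print(to_pixel_box(example, 640, 480))
```

Because the helper only reads four attributes, it should also accept the `relative_bounding_box` object directly, assuming the patched API exposes it as shown in this thread.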
This sounds like a good parameter to add in future Python builds!
I got this type of error after adding that:
Traceback (most recent call last):
  File "camera_set.py", line 34, in <module>
    for palm_detection in results.palm_detections:
AttributeError: type object 'SolutionOutputs' has no attribute 'palm_detections'
If you installed mediapipe with pip install mediapipe, you need to modify the mediapipe/python/solutions/hands.py file inside your python3.x/site-packages/mediapipe directory. If you edited the repo code instead, you need to rebuild the Python package by running setup.py.
Where is the setup.py file in the mediapipe repo?
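If it is unclear which copy of hands.py the interpreter actually loads, a small sketch like the one below may help locate it. `find_package_file` is a hypothetical helper written for this thread. It uses only the standard library and does not import mediapipe itself. The relative path python/solutions/hands.py is taken from the comment above and may differ between versions.

```python
import importlib.util
import os


def find_package_file(package_name, *relative_parts):
    """Return the path of a file inside an installed package, or None if the package is missing."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    # The first search location is the package directory (e.g. .../site-packages/mediapipe).
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, *relative_parts)


hands_path = find_package_file("mediapipe", "python", "solutions", "hands.py")
print(hands_path if hands_path else "mediapipe is not installed in this environment")
```

Editing the file at the printed path, rather than a separate checkout of the repo, is what makes the change visible to scripts that run with the same interpreter.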
I use this code instead to get a bounding box from the landmarks:
for hand_landmark in results.multi_hand_landmarks:
    # Landmark coordinates are normalized to [0, 1].
    x = [landmark.x for landmark in hand_landmark.landmark]
    y = [landmark.y for landmark in hand_landmark.landmark]
    # Mean landmark position in pixels, used as the hand center.
    cx, cy = int(np.mean(x) * image_width), int(np.mean(y) * image_height)
    cv2.circle(image, (cx, cy), 10, (255, 0, 0), 1)  # for checking the center
    # Fixed 400x400-pixel square around the center.
    cv2.rectangle(image, (cx - 200, cy - 200), (cx + 200, cy + 200), (255, 0, 0), 1)
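A variation on the landmark approach above, offered only as a sketch: instead of a fixed-size square around the mean, take the minimum and maximum of the normalized landmark coordinates so the box follows the size of the hand. `landmarks_to_pixel_box` and its `padding` parameter are hypothetical names, not MediaPipe API. The function takes plain (x, y) pairs so that it can run without MediaPipe.

```python
from typing import Iterable, Tuple


def landmarks_to_pixel_box(points: Iterable[Tuple[float, float]],
                           image_width: int,
                           image_height: int,
                           padding: int = 0) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels enclosing all normalized points."""
    xs, ys = zip(*points)
    # Scale the extremes to pixels and grow the box by the optional padding.
    left = int(round(min(xs) * image_width)) - padding
    top = int(round(min(ys) * image_height)) - padding
    right = int(round(max(xs) * image_width)) + padding
    bottom = int(round(max(ys) * image_height)) + padding
    # Landmarks may fall outside the frame, so clamp to valid pixel indices.
    left = max(0, min(left, image_width - 1))
    right = max(0, min(right, image_width - 1))
    top = max(0, min(top, image_height - 1))
    bottom = max(0, min(bottom, image_height - 1))
    return left, top, right, bottom


# Made-up normalized points standing in for hand landmarks.
sample_points = [(0.2, 0.3), (0.5, 0.1), (0.4, 0.6)]
print(landmarks_to_pixel_box(sample_points, 100, 200, padding=10))
```

With the loop from the comment above, the points could be built as `[(lm.x, lm.y) for lm in hand_landmark.landmark]` and the result passed to `cv2.rectangle` as `(left, top)` and `(right, bottom)`.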
I have tried this, but it gives the bbox for the palm, not for the whole hand. Is that expected?