Mediapipe: Getting Bounding Box on Handtracking python

Created on 13 Dec 2020 · 7 Comments · Source: google/mediapipe

How can I get the bounding box of the tracked hand using Python code?

Most helpful comment

The hands Python API doesn't return the palm bboxes, but you can edit the source code so that the hands API outputs them:

--- a/mediapipe/python/solutions/hands.py
+++ b/mediapipe/python/solutions/hands.py
@@ -142,7 +142,7 @@ class Hands(SolutionBase):
             'handlandmarkcpu__ThresholdingCalculator.threshold':
                 min_tracking_confidence,
         },
-        outputs=['multi_hand_landmarks', 'multi_handedness'])
+        outputs=['multi_hand_landmarks', 'multi_handedness', 'palm_detections'])

   def process(self, image: np.ndarray) -> NamedTuple:
     """Processes an RGB image and returns the hand landmarks and handedness of each detected hand.

Then, you can get the palm detection list with:

for palm_detection in results.palm_detections:
    print(palm_detection.location_data.relative_bounding_box)

You will see the relative bbox data like the following:

xmin: 0.6212505102157593
ymin: 0.46138930320739746
width: 0.2514756917953491
height: 0.37721356749534607

Finally, multiply xmin and width by the image width, and ymin and height by the image height, to get the bbox coordinates in pixels.
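The conversion from relative values to pixels could be sketched as follows. This is only an illustration: the helper name `relative_bbox_to_pixels` and the clamping to the image borders are assumptions, not part of the MediaPipe API.

```python
from typing import Tuple


def relative_bbox_to_pixels(
    xmin: float,
    ymin: float,
    width: float,
    height: float,
    image_width: int,
    image_height: int,
) -> Tuple[int, int, int, int]:
    """Convert a normalized bbox to pixel corners (left, top, right, bottom)."""
    # x values scale with the image width, y values with the image height.
    left = int(xmin * image_width)
    top = int(ymin * image_height)
    right = int((xmin + width) * image_width)
    bottom = int((ymin + height) * image_height)

    # Assumption: clamp to the image, since a detection near the border
    # may extend slightly outside the [0, 1] range.
    def clamp(value: int, size: int) -> int:
        return max(0, min(value, size - 1))

    return (
        clamp(left, image_width),
        clamp(top, image_height),
        clamp(right, image_width),
        clamp(bottom, image_height),
    )


if __name__ == "__main__":
    # Values taken from the example output above, for a 640x480 frame.
    print(relative_bbox_to_pixels(
        0.6212505102157593, 0.46138930320739746,
        0.2514756917953491, 0.37721356749534607,
        640, 480,
    ))
```

With the patched API, the four inputs would come from the `xmin`, `ymin`, `width` and `height` fields of `relative_bounding_box`.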

All 7 comments

This sounds like a good parameter to add in future Python builds!

I got this type of error after adding that:

Traceback (most recent call last):
  File "camera_set.py", line 34, in <module>
    for palm_detection in results.palm_detections:
AttributeError: type object 'SolutionOutputs' has no attribute 'palm_detections'

If you installed MediaPipe with pip install mediapipe, you need to modify the installed copy of mediapipe/python/solutions/hands.py, which lives under your python3.x/site-packages directory. If you edit the repo code instead, you need to rebuild the Python package by running setup.py.
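To find which copy of the file your interpreter actually imports, something like the following may help. It is a sketch using only the standard library; `find_package_file` is a hypothetical helper name, and the relative path is the one quoted in the diff above, which may differ between MediaPipe versions.

```python
import importlib.util
import os
from typing import Optional


def find_package_file(package: str, relative_path: str) -> Optional[str]:
    """Return the expected path of a file inside an installed package.

    Returns None if the package is not importable or is not a package
    directory. The returned path is not checked for existence.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, *relative_path.split("/"))


if __name__ == "__main__":
    path = find_package_file("mediapipe", "python/solutions/hands.py")
    if path is None:
        print("mediapipe is not installed in this environment")
    elif os.path.exists(path):
        print("Edit this file:", path)
    else:
        print("Expected file not found (layout may differ):", path)
```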

Where is the setup.py file in the mediapipe repo?

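I use this code instead to get the bounding box:

import cv2
import numpy as np

# Assumes `image` is the current frame and `results` comes from hands.process(...).
image_height, image_width = image.shape[:2]
for hand_landmark in results.multi_hand_landmarks or []:
    x = [landmark.x for landmark in hand_landmark.landmark]
    y = [landmark.y for landmark in hand_landmark.landmark]

    # Mean of the normalized landmarks, scaled to pixel coordinates.
    cx, cy = int(np.mean(x) * image_width), int(np.mean(y) * image_height)
    cv2.circle(image, (cx, cy), 10, (255, 0, 0), 1)  # for checking the center
    # Fixed 400x400 px box around the center.
    cv2.rectangle(image, (cx - 200, cy - 200), (cx + 200, cy + 200), (255, 0, 0), 1)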

I have tried the palm_detections change, but it gives a bbox for the palm, not for the whole hand?
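One possible way to get a box around the whole hand, rather than the palm, is to take the minimum and maximum of the normalized hand landmark coordinates. The sketch below is not part of the MediaPipe API: `landmarks_to_bbox` and its `padding` parameter are hypothetical, and it takes plain (x, y) pairs so it can be tried without a camera.

```python
from typing import Iterable, Tuple


def landmarks_to_bbox(
    points: Iterable[Tuple[float, float]],
    image_width: int,
    image_height: int,
    padding: int = 0,
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels enclosing all points.

    `points` holds normalized (x, y) pairs; `padding` is in pixels.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one landmark is required")
    xs, ys = zip(*points)

    # Scale the extreme coordinates to pixels and grow the box by `padding`.
    left = int(min(xs) * image_width) - padding
    top = int(min(ys) * image_height) - padding
    right = int(max(xs) * image_width) + padding
    bottom = int(max(ys) * image_height) + padding

    # Keep the box inside the image.
    left = max(0, min(left, image_width - 1))
    top = max(0, min(top, image_height - 1))
    right = max(0, min(right, image_width - 1))
    bottom = max(0, min(bottom, image_height - 1))
    return left, top, right, bottom


if __name__ == "__main__":
    # Made-up normalized points standing in for real landmarks.
    demo_points = [(0.25, 0.5), (0.5, 0.25), (0.75, 0.75)]
    print(landmarks_to_bbox(demo_points, 200, 100, padding=10))
```

With real results, the points could be built per hand as `[(lm.x, lm.y) for lm in hand_landmark.landmark]`, using the same fields as the snippet earlier in this thread.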
