Realtime_multi-person_pose_estimation: number of joint points

Created on 4 Feb 2017 · 5 Comments · Source: ZheC/Realtime_Multi-Person_Pose_Estimation

@ZheC The testing demo generates the outputs Mconv7_stage6_L1 (1x38x46x46) and Mconv7_stage6_L2 (1x19x46x46), which should represent the confidence maps for 19 kinds of parts and the corresponding part affinity maps. But the COCO challenge annotates 17 keypoints, and it is also quite confusing to see np = 15 in the testing demo. Could you please explain how the number of joint points is defined in each of these places? Looking forward to your answer. Thanks.

Robert0812

Most helpful comment

I am confused as well. The MPII dataset has 16 keypoints
(0 - r ankle, 1 - r knee, 2 - r hip, 3 - l hip, 4 - l knee, 5 - l ankle, 6 - pelvis, 7 - thorax, 8 - upper neck, 9 - head top, 10 - r wrist, 11 - r elbow, 12 - r shoulder, 13 - l shoulder, 14 - l elbow, 15 - l wrist).
However, I found this code in modelDescriptorFactory.cpp:
{{0, "Head"}, // head top
{1, "Neck"}, // upper neck
{2, "RShoulder"}, // r shoulder
{3, "RElbow"}, // r elbow
{4, "RWrist"}, // r wrist
{5, "LShoulder"}, // l shoulder
{6, "LElbow"}, // l elbow
{7, "LWrist"}, // l wrist
{8, "RHip"}, // r hip
{9, "RKnee"}, // r knee
{10, "RAnkle"}, // r ankle
{11, "LHip"}, // l hip
{12, "LKnee"}, // l knee
{13, "LAnkle"}, // l ankle
{14, "Chest"}, // thorax
{15, "Bkg"}}, // ?? pelvis
Does "Bkg" represent background? Where is the pelvis?

COCO has 17 keypoints:
['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle', 'right_ankle'].
But your code has:
{{0, "Nose"}, //t
{1, "Neck"}, //f is not included in coco. How do you get the Neck keypoints
{2, "RShoulder"}, //t
{3, "RElbow"}, //t
{4, "RWrist"}, //t
{5, "LShoulder"}, //t
{6, "LElbow"}, //t
{7, "LWrist"}, //t
{8, "RHip"}, //t
{9, "RKnee"}, //t
{10, "RAnkle"}, //t
{11, "LHip"}, //t
{12, "LKnee"}, //t
{13, "LAnkle"}, //t
{14, "REye"}, //t
{15, "LEye"}, //t
{16, "REar"}, //t
{17, "LEar"}, //t
{18, "Bkg"}}, //f background ??

KeyKy on 5 Feb 2017

👍8

All 5 comments

@Robert0812 @KeyKy The above lists are mostly correct. For MPI, part 14 is the human center location provided by the annotated data (rather than the chest, as in your list). We do not predict the chest or pelvis location, and they are not part of the MPI evaluation either. For COCO, part 1 is the neck position, calculated as the mean of the two shoulders. Part 15 in MPI and part 18 in COCO are background predictions.

ZheC on 5 Feb 2017

👍3
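
The COCO mapping described in the reply above can be sketched in a few lines of Python. This is only an illustration built from the two lists quoted in this thread, not the repository's own data-preparation code: the helper names `model_to_coco_name` and `coco_to_model_parts` are hypothetical, keypoints are assumed to be plain (x, y) tuples, and COCO visibility flags are ignored.

```python
# COCO's 17 annotated keypoints, in the order quoted in this thread.
COCO_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# The 18 body parts in the model's order (index 18, "Bkg", is the
# background channel and is not a joint, so it is left out here).
MODEL_PARTS = [
    "Nose", "Neck", "RShoulder", "RElbow", "RWrist",
    "LShoulder", "LElbow", "LWrist", "RHip", "RKnee", "RAnkle",
    "LHip", "LKnee", "LAnkle", "REye", "LEye", "REar", "LEar",
]


def model_to_coco_name(part):
    """Hypothetical helper: 'RShoulder' -> 'right_shoulder'."""
    if part == "Nose":
        return "nose"
    side = {"R": "right", "L": "left"}[part[0]]
    return f"{side}_{part[1:].lower()}"


def coco_to_model_parts(keypoints):
    """Reorder 17 COCO (x, y) keypoints into the 18-part model order.

    The neck is not annotated in COCO, so it is computed as the mean
    of the two shoulders, as described in the reply above.
    """
    by_name = dict(zip(COCO_NAMES, keypoints))
    parts = []
    for part in MODEL_PARTS:
        if part == "Neck":
            lx, ly = by_name["left_shoulder"]
            rx, ry = by_name["right_shoulder"]
            parts.append(((lx + rx) / 2.0, (ly + ry) / 2.0))
        else:
            parts.append(by_name[model_to_coco_name(part)])
    return parts
```

A real pipeline would also have to decide what to do when one of the shoulders is not annotated; that case is not covered in this thread and is not handled here.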

@ZheC @KeyKy Thanks, this issue is clearly resolved.

Robert0812 on 5 Feb 2017

@Robert0812 You wrote that Mconv7_stage6_L1 (1x38x46x46) and Mconv7_stage6_L2 (1x19x46x46) "should represent the confidence map of 19 kinds of part and corresponding part affinity maps". Isn't it the other way around? I mean, L2 holds the heat maps and L1 the PAFs?
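
The channel counts alone can be checked against the part list given earlier in the thread. The sketch below is an inference from those numbers, not an answer confirmed in this thread: it assumes that a confidence-map output has one channel per part plus one for background (18 + 1 = 19), and that a part affinity output stores a 2D vector (x and y) per limb, so its channel count is even. The function name `guess_output_roles` is hypothetical.

```python
NUM_PARTS = 18       # COCO model parts 0..17, as listed in the thread
NUM_BACKGROUND = 1   # part 18, "Bkg"
VECTOR_DIMS = 2      # assumption: one x and one y channel per limb


def guess_output_roles(shapes):
    """Guess what each output blob holds from its channel count.

    shapes maps a blob name to its (N, C, H, W) shape.
    """
    heatmap_channels = NUM_PARTS + NUM_BACKGROUND
    roles = {}
    for name, (_, channels, _, _) in shapes.items():
        if channels == heatmap_channels:
            roles[name] = "confidence maps"
        elif channels % VECTOR_DIMS == 0:
            limbs = channels // VECTOR_DIMS
            roles[name] = f"part affinity fields ({limbs} limbs)"
        else:
            roles[name] = "unknown"
    return roles


roles = guess_output_roles({
    "Mconv7_stage6_L1": (1, 38, 46, 46),
    "Mconv7_stage6_L2": (1, 19, 46, 46),
})
for name, role in roles.items():
    print(name, "->", role)
```

Under these assumptions, the 19-channel L2 output matches the confidence maps and the 38-channel L1 output matches 19 limbs with two channels each, which is consistent with the reading suggested in the last comment.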