When I ran OpenPose on a full HD (1920x1080) frame, it dumped a heatmap of size 24928x368. It detected just one person in this frame, and I used the --heatmaps_add_PAFs flag, with which I expect it to dump 2x19 PAFs with respect to all the limbs.
Can you help me make sense of this heatmap output? I understand that it is the unit vector for each limb, but I am unable to figure out in what form it is stored.
./build/examples/openpose/openpose.bin --image_dir /media/ladmin/Data/Code/openpose/input/frames/Apr12/R1/outdirGOPR0114/ --write_heatmaps output/heatmaps/GOPR0114/ --heatmaps_add_PAFs "true" --no_display
{
"version":0.1,
"people":[
{
"body_parts":[
1138.74,407.935,0.648187,1141.57,413.755,0.829449,1138.54,413.895,0.77001,1132.89,431.526,0.543597,1135.73,437.302,0.467062,1153.39,413.761,0.760855,1159.11,431.417,0.660411,1147.52,434.396,0.533484,1138.76,454.741,0.83723,1138.66,481.254,0.806889,1138.75,504.904,0.650429,1153.37,452.012,0.771506,1153.39,481.238,0.764944,1153.4,504.845,0.649774,1138.65,407.809,0.53724,1138.84,407.775,0.610177,1138.56,407.849,0.304282,1141.68,405.036,0.493725
]
}
]
}
I am sorry, I need to add information about body part and PAF heatmaps output. I will do it soon.
Meanwhile, they are whole images. 24928x368 is 38 (2x19) images of 656x368 (the net_resolution) concatenated side by side, since 38 x 656 = 24928. You can easily check this by opening the generated png: it is one huge image containing all the PAF heatmaps.
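The layout described above can be undone with a few lines of NumPy. This is only a minimal sketch: the function name `split_heatmap_tiles` is made up for illustration, and it assumes the png has already been loaded as a 2D array of shape (height, width) with the tiles laid out left to right.

```python
import numpy as np

def split_heatmap_tiles(concat, tile_width=656):
    """Split a horizontally concatenated heatmap image into equal-width tiles.

    concat: array of shape (height, total_width), e.g. (368, 24928).
    tile_width: width of one tile, i.e. the net_resolution width.
    """
    height, total_width = concat.shape[:2]
    if total_width % tile_width != 0:
        raise ValueError("total width is not a multiple of the tile width")
    n_tiles = total_width // tile_width
    # Each tile is a view on columns [i * tile_width, (i + 1) * tile_width).
    return [concat[:, i * tile_width:(i + 1) * tile_width] for i in range(n_tiles)]

# Synthetic stand-in for the loaded 24928x368 png (rows = 368, columns = 24928).
demo = np.zeros((368, 24928), dtype=np.uint8)
tiles = split_heatmap_tiles(demo)
```

With a real file, `demo` would be replaced by the array returned by whatever image reader you use, loaded as single-channel.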
The alternatives to saving it as a concatenated huge image were not that good:
As a side note, you posted the pose keypoints, not the heatmaps. The heatmaps should have been saved as a png in output/heatmaps/GOPR0114/.
Thank You!
Yes, I posted the keypoint data just to give an idea that the heatmap was generated with respect to this pose.
So you use the following limb pairs:
POSE_COCO_PAIRS {1,2, 1,5, 2,3, 3,4, 5,6, 6,7, 1,8, 8,9, 9,10, 1,11, 11,12, 12,13, 1,0, 0,14, 14,16, 0,15, 15,17, 2,16, 5,17};
which gives us 19 limbs, so why 2x19 images and not just 19 per pose?
Secondly, can you explain what you mean by a net_resolution of 656x368? My intention is to identify the pixels in the original image that belong to a given limb. How do I get that information from this 656x368 image?
For all these technical questions, you are better off referring to the multi-people estimation paper.
Shortly:
1. Each PAF has 2 channels, one for the x and one for the y coordinate, which are merged for rendering.
2. 656x368 is the deep net size. If you applied the net to the whole image (e.g. 1280x720), it'd be really slow, which is why the image is resized down. Decrease it --> faster, increase it --> potentially more accuracy. So you can re-scale it again from this 656x368 to the original image resolution.
Again, please check the paper for a more technical and complete explanation of point 1.
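The re-scaling back to the original resolution could be sketched as below. The helper `net_to_image` is hypothetical, and it assumes the frame was plainly stretched to the net size with independent x and y factors. If the input was instead resized with a single factor and padded to keep its aspect ratio, the mapping would need that one factor on both axes.

```python
def net_to_image(x, y, net_size=(656, 368), image_size=(1920, 1080)):
    """Map a pixel coordinate in a net-resolution heatmap to the original frame.

    net_size and image_size are (width, height) pairs.
    Assumes a plain stretch with no padding (see the note above).
    """
    net_w, net_h = net_size
    img_w, img_h = image_size
    # Multiply before dividing to keep round values exact.
    return x * img_w / net_w, y * img_h / net_h

# Centre of a 656x368 heatmap mapped back to a 1920x1080 frame.
centre = net_to_image(328, 184)
```

The same two factors can be passed to an image-resize routine to upsample a whole tile instead of mapping single pixels.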
What is the role of the last two limbs, [2,16] and [5,17]? Could the PAF channels be reduced to 2*17?
They help detect ears for people looking in the opposite direction to the camera, since the eyes are not visible. Let me know if the explanation is not clear enough. Meanwhile I'll close this solved issue. Thanks
You use 0 for nose, 1 for neck, and so on. How can I define the order myself? @gineshidalgo99
The order is used in several places. I'd say it's better just to reorder the final keypoints rather than defining a different order.
Kind of building off these questions. I am trying to use the output of the first stage to possibly modify openpose for egocentric hand detection. I want to find the right and left forearms in the image, which from above would be the 3->4 and 6->7 pairings. However, I'm currently reading these png files in Matlab and I'm not fully sure how to interpret the vectors from the pngs as described in the paper.
Would love some assistance. Thanks
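One possible way to turn a pair of PAF tiles back into vectors is sketched below in Python (the same arithmetic carries over to Matlab). Everything about the encoding here is an assumption to verify against your own output: that the png stores values from [-1, 1] mapped linearly onto 0..255, so that roughly 128 means zero, and that you already know which two tiles hold the x and y channels of the limb you want. The names `decode_paf` and `limb_mask` are made up for illustration.

```python
import numpy as np

def decode_paf(x_tile, y_tile):
    """Convert two uint8 PAF tiles into float vector components.

    ASSUMPTION: pixel value p encodes p / 255 * 2 - 1, i.e. [-1, 1] -> [0, 255].
    """
    vx = x_tile.astype(np.float32) / 255.0 * 2.0 - 1.0
    vy = y_tile.astype(np.float32) / 255.0 * 2.0 - 1.0
    magnitude = np.hypot(vx, vy)  # per-pixel vector length
    return vx, vy, magnitude

def limb_mask(magnitude, threshold=0.5):
    """Boolean mask of pixels whose PAF vector is longer than the threshold."""
    return magnitude > threshold

# Tiny synthetic tiles: one pixel pointing in +x, the rest near zero.
x_tile = np.full((2, 2), 128, dtype=np.uint8)
y_tile = np.full((2, 2), 128, dtype=np.uint8)
x_tile[0, 0] = 255
vx, vy, mag = decode_paf(x_tile, y_tile)
mask = limb_mask(mag)
```

The threshold of 0.5 is arbitrary. Pixels in the mask are in net-resolution coordinates and still need to be scaled to the original frame.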