How to calculate the head and body angle with respect to the camera?

asked 2019-01-15 21:53:39 -0600

updated 2019-01-23 04:02:32 -0600

I would like to calculate the human body angle with respect to the camera. I have attached a screenshot for your kind reference. If the person is looking into the camera then the angle is zero.
If the person is looking up then the angle is more than 0. I have the pose coordinates of the person. I would appreciate your thoughts on this. Thank you. (Attachment: Screen Shot 2019-01-16 at 12.38.55 PM.png)

EDIT: Duplicate Question

maybe use dlib to get face landmarks, then calculate the angle like:

i think, you need a "head pose estimation", maybe something like this

I used this tensorflow code for head pose estimation a while back and it worked well for me

2 answers

answered 2019-01-17 02:57:36 -0600

updated 2019-01-18 03:35:24 -0600

i still think, you want a head pose estimation here. the gist is: you can use solvePnP to get an rvec and a tvec, containing rotation / translation wrt. the camera.

for this, you will need 2d landmarks (e.g from dlib, openpose, or from opencv's landmark models) and corresponding 3d points. i'll try with FacemarkLBF here:

// create and load the facemarks model
cv::Ptr<cv::face::Facemark> facemark;
facemark = cv::face::createFacemarkLBF();
// assumed file name: point this at your pretrained LBF landmark model
facemark->loadModel("lbfmodel.yaml");

// load the 68 3d points from file (see below !)
std::vector<cv::Point3d> pts3d;
cv::FileStorage fs2("points3d.yml", cv::FileStorage::READ);
fs2["points"] >> pts3d;

then for each image, detect a face, then get the current landmarks:

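std::vector<cv::Rect> rects;
face_cascade.detectMultiScale(gray_img, rects, 1.4, 2, cv::CASCADE_SCALE_IMAGE, cv::Size(30, 30));
// (skip this image if rects is empty)

// fit the landmarks on the first detected face only
std::vector<cv::Rect> faces(1,rects[0]);
std::vector< std::vector<cv::Point2f> > shapes;
facemark->fit(gray_img, faces, shapes);

// convert the landmarks to double precision for solvePnP
std::vector<cv::Point2d> pts2d;
for(size_t k=0; k<shapes[0].size(); k++)
    pts2d.push_back(shapes[0][k]);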

now we can apply solvePnP:

// if you did not calibrate it, use a camMatrix based on img size:
cv::Mat rvec,tvec;
cv::Mat camMatrix;
cv::Size s = gray_img.size(); // size of the current image
int max_d = std::max(s.width,s.height);
camMatrix = (cv::Mat_<double>(3,3) <<
    max_d,   0, s.width/2.0,
    0,     max_d, s.height/2.0,
    0,   0,        1.0);

// 2d -> 3d correspondence
cv::solvePnP(pts3d, pts2d, camMatrix, cv::Mat(1,4,CV_64F,0.0), rvec, tvec, false, cv::SOLVEPNP_EPNP);
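The rvec that solvePnP returns is an axis-angle vector, so it still has to be converted before it reads as "angles". Below is a minimal, dependency-free sketch of that step. The names `Mat3`, `EulerDeg`, `rodrigues` and `eulerZYX` are made up for illustration, and the decomposition order (R = Rz(roll) * Ry(yaw) * Rx(pitch), camera axes x right, y down, z forward) is an assumption. Which pose comes out as "zero" also depends on how the axes of the 3d model are oriented, so a constant offset may be needed.

```cpp
#include <cmath>

// Hypothetical helper types; nothing here is part of OpenCV.
struct Mat3 { double m[3][3]; };
struct EulerDeg { double pitch, yaw, roll; };

// Rodrigues' formula: axis-angle vector (the layout of rvec) -> rotation matrix.
// With OpenCV at hand, cv::Rodrigues(rvec, R) does the same job.
Mat3 rodrigues(double rx, double ry, double rz) {
    const double theta = std::sqrt(rx * rx + ry * ry + rz * rz);
    if (theta < 1e-12)  // no rotation at all
        return Mat3{{{1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0}}};
    const double x = rx / theta, y = ry / theta, z = rz / theta;
    const double c = std::cos(theta), s = std::sin(theta), t = 1.0 - c;
    return Mat3{{{c + t * x * x,     t * x * y - s * z, t * x * z + s * y},
                 {t * x * y + s * z, c + t * y * y,     t * y * z - s * x},
                 {t * x * z - s * y, t * y * z + s * x, c + t * z * z}}};
}

// Assumed convention: R = Rz(roll) * Ry(yaw) * Rx(pitch).
EulerDeg eulerZYX(const Mat3& R) {
    const double rad2deg = 180.0 / std::acos(-1.0);
    const double pitch = std::atan2(R.m[2][1], R.m[2][2]);
    const double yaw   = std::atan2(-R.m[2][0], std::hypot(R.m[0][0], R.m[1][0]));
    const double roll  = std::atan2(R.m[1][0], R.m[0][0]);
    return EulerDeg{pitch * rad2deg, yaw * rad2deg, roll * rad2deg};
}
```

For a pure rotation of 0.5 rad about the y axis, `eulerZYX(rodrigues(0.0, 0.5, 0.0))` gives a yaw of about 28.6 degrees, with pitch and roll at zero.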


the whole code is here

and, for your convenience, here's points3d.yml:

points: [ -7.4077796936035156e+001, 4.5610500335693359e+001,
    1.7611330032348633e+001, -7.6078399658203125e+001,
    2.4455335617065430e+001, 1.4652364253997803e+000,
    -7.0680282592773438e+001, 3.8770267963409424e+000,
    1.6104341506958008e+001, -6.9542381286621094e+001,
    -1.8663349151611328e+001, -5.0522098541259766e+000,
    -6.0891132354736328e+001, -3.7201663970947266e+001,
    -4.9577393531799316e+000, -4.7551403045654297e+001,
    -5.1671474456787109e+001, 2.1515935897827148e+001,
    -3.3833507537841797e+001, -6.4209655761718750e+001,
    3.3763854980468750e+001, -1.8493196487426758e+001,
    -7.5527656555175781e+001, 4.2787197113037109e+001,
    -2.5200850963592529e+000, -7.7372253417968750e+001,
    4.5473331451416016e+001, 1.3970505714416504e+001,
    -7.3213897705078125e+001, 6.4313529968261719e+001,
    3.0962856292724609e+001, -6.6279350280761719e+001,
    3.5533737182617188e+001, 4.6108547210693359e+001,
    -5.3055961608886719e+001, 1.9751256942749023e+001,
    5.1060363769531250e+001, -3.2454692840576172e+001,
    6.6386039733886719e+001, 5.8377300262451172e+001,
    -1.4232730865478516e+001, 5.9445739746093750e+001,
    6.3227752685546875e+001, 4.3665418624877930e+000,
    4.6228359222412109e+001, 7.0979812622070312e+001,
    2.6926740646362305e+001, 4.6090355515480042e-001,
    6.5315643310546875e+001, 4.6220058441162109e+001,
    4.7723823547363281e+001, -5.8375602722167969e+001,
    5.7522071838378906e+001, 7.4911415100097656e+001,
    -4.7820030212402344e+001, 6.2965705871582031e+001,
    9.6279922485351562e+001, -3.4796894073486328e+001,
    6.4059089660644531e+001, 1.0583456420898438e+002,
    -2.3467020034790039e+001, 6.1613960266113281e+001,
    1.1014395904541016e+002, -1.2404515266418457e+001,
    5.8514854431152344e+001, 1.1110581970214844e+002,
    9.6587600708007812e+000, 5.9617969512939453e+001,
    1.1002123260498047e+002, 2.0711694717407227e+001,
    6.3654747009277344e+001, 1.0981579589843750e+002,
    3.2074752807617188e+001, 6.5145515441894531e+001,
    1.0512077331542969e+002, 4.5245258331298828e+001,
    6.3173934936523438e+001, 9.4144226074218750e+001,
    5.4559543609619141e+001, 5.6469257354736328e+001,
    7.4634750366210938e+001, -1 ...
HMMmmmnnn. No python. Too much work for me.

@supra56, again look at the learnopencv site, it has python code for the same idea, too.

@berak. I know that link.

wow wait, you got the complete solution, and then you simply say, ah no time to work on it, screw this xD

answered 2019-01-16 04:43:10 -0600

updated 2019-01-20 10:44:23 -0600

As stated above, you will need more than the basic functionality provided by OpenCV. Look at gaze estimation algorithms. I know of a paper that uses Cascade Classifiers to define a rough angle, and which is fully integrated into OpenCV. Have a look here:

@StevenPuttemans, hope you don't mind me "pirating" your answer, but given it's so simple, -- it works amazingly well !! ;)

VideoCapture cap(0);
CascadeClassifier profile("c:/p/opencv/data/haarcascades/haarcascade_profileface.xml");
while(1) {
    Mat frame, gray;
    cap >> frame;               // grab the next frame
    if (frame.empty()) break;   // camera closed or no more frames
    cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
    std::vector<cv::Rect> faces_right,faces_left;
    std::vector<int> lvl_right,lvl_left;
    std::vector<double> weights_right,weights_left;
    // right face side
    profile.detectMultiScale(gray, faces_right, lvl_right, weights_right, 1.2, 1, 0, cv::Size(30, 30), Size(), true);
    // flip, and apply again for left one
    cv::flip(gray, gray, 1);
    profile.detectMultiScale(gray, faces_left, lvl_left, weights_left, 1.2, 1, 0, cv::Size(30, 30), Size(), true);
    float angle = 0; // formula from paper: a=-90*l+90*r ;)
    if (weights_right.size()>0 && weights_right[0]>0)
        // 4: heuristic scale factor, though i'm pretty sure, this is not linear ;)
        angle += 90 * weights_right[0] / 4;
    if (weights_left.size() && weights_left[0]>0)
        angle += -90 * weights_left[0] / 4;
    cout << angle << endl;

    imshow("frame", frame); // waitKey() needs a window to receive key presses
    if (waitKey(10) > -1) break;
}
Thanks, Steven. I forgot to mention that I have pose coordinates. Is finding the center of the head and center of camera enough? By that, I will find the distance between the head and the camera. After that, I can calculate the angle between this line and that of the body parts. Does that make sense?
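If only 2d pose keypoints are available, the angle between two body segments can at least be measured inside the image plane. A minimal sketch, assuming hip, neck and head keypoints in pixel coordinates (the names `Pt`, `signedAngleDeg` and `headVsTorsoDeg` are hypothetical). It says nothing about rotation towards or away from the camera.

```cpp
#include <cmath>

// Hypothetical 2d keypoint in pixel coordinates (x right, y down).
struct Pt { double x, y; };

// Signed angle in degrees that rotates vector `from` onto vector `to`.
double signedAngleDeg(Pt from, Pt to) {
    const double cross = from.x * to.y - from.y * to.x;
    const double dot   = from.x * to.x + from.y * to.y;
    return std::atan2(cross, dot) * 180.0 / std::acos(-1.0);
}

// In-plane angle between the torso (hip -> neck) and the head (neck -> head).
double headVsTorsoDeg(Pt hip, Pt neck, Pt head) {
    const Pt torso{neck.x - hip.x, neck.y - hip.y};
    const Pt headDir{head.x - neck.x, head.y - neck.y};
    return signedAngleDeg(torso, headDir);
}
```

A head in line with the torso gives 0 degrees; with hip (0,100), neck (0,0) and head (50,-50) the result is 45 degrees.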

there are some issues:

  • you will never get the physical distance to a camera, unless you have a reference size somewhere
  • people can rotate their head unrelated to their body parts
I forgot to mention that I have pose coordinates.

berak (2019-01-16 06:26:05 -0600)

berak (2019-01-16 06:29:35 -0600)

supra56 (2019-01-16 06:54:11 -0600)

Also in general, if you have an angle dataset, you can let a DL network predict the angle directly. As long as you have annotated data. Did you consider for example OpenPose?

You guys are awesome.

@StevenPuttemans -- is the code for the paper you mention still available ? (the link to it is dead)

berak (2019-01-18 01:59:07 -0600)

prb (2019-01-18 02:23:00 -0600)

berak (2019-01-18 02:30:34 -0600)

