eye landmark points

c++
opencv

asked 2016-12-02 14:34:25 -0600

sarmad
66 ●1 ●5 ●11

updated 2016-12-09 13:38:17 -0600

I'm using facial landmark detector (dlib) to detect eye blinks . How the eye landmarks can be imported to a file ?

I need to use eye landmarks to calculate the ration between height and width of eye and to use SVM to classify blinks

Update : when I try to write landmark point to a file , different valuses are saved than the displayed landmarks in terminal windows , how to fix ?

Thanks

 #include <dlib/opencv.h>
 #include <opencv2/highgui/highgui.hpp>
 #include <dlib/image_processing/frontal_face_detector.h>
 #include <dlib/image_processing/render_face_detections.h>
 #include <dlib/image_processing.h>
 #include <dlib/gui_widgets.h>

using namespace dlib;
using namespace std;

int main()
{
try
{
    cv::VideoCapture cap(0);
    if (!cap.isOpened())
    {
        cerr << "Unable to connect to camera" << endl;
        return 1;
    }

    image_window win;
    frontal_face_detector detector = get_frontal_face_detector();
    shape_predictor pose_model;
    deserialize("shape_predictor_68_face_landmarks.dat") >> pose_model;

    while(!win.is_closed())
    {
        cv::Mat temp;
        cap >> temp;

        cv_image<bgr_pixel> cimg(temp);

        // Detect faces 
        std::vector<rectangle> faces = detector(cimg);
        // Find the pose of each face.
        std::vector<full_object_detection> shapes;
           ofstream outputfile;
           outputfile.open("data1.csv");

        for (unsigned long i = 0; i < faces.size(); ++i)
      {  

               full_object_detection shape = pose_model(cimg, faces[i]);
               cout << "number of parts: "<< shape.num_parts() << endl;

        cout << "Eye Landmark points for right eye : "<< endl;
        cout << "pixel position of 36 part:  " << shape.part(36) << endl;
        cout << "pixel position of 37 part: " << shape.part(37) << endl;
        cout << "pixel position of 38 part:  " << shape.part(38) << endl;
        cout << "pixel position of 39 part: " << shape.part(39) << endl;
        cout << "pixel position of 40 part: " << shape.part(40) << endl;
        cout << "pixel position of 41 part: " << shape.part(41) << endl;

        cout << endl;

        cout << "Eye Landmark points for left eye : "<< endl;

        cout << "pixel position of 42 part:  " << shape.part(42) << endl;
        cout << "pixel position of 43 part: " << shape.part(43) << endl;
        cout << "pixel position of 44 part:  " << shape.part(44) << endl;
        cout << "pixel position of 45 part: " << shape.part(45) << endl;
        cout << "pixel position of 46 part: " << shape.part(46) << endl;
        cout << "pixel position of 47 part: " << shape.part(47) << endl;

        double P37_41_x = shape.part(37).x() - shape.part(41).x();
        double P37_41_y=  shape.part(37).y() -shape.part(41).y() ;

        double p37_41_sqrt=sqrt((P37_41_x * P37_41_x) + (P37_41_y * P37_41_y));


       double P38_40_x = shape.part(38).x() - shape.part(40).x();
       double P38_40_y = shape.part(38).y() - shape.part(40).y();

       double p38_40_sqrt=sqrt((P38_40_x * P38_40_x) + (P38_40_y * P38_40_y));



      double P36_39_x = shape.part(36).x() - shape.part(39).x();  
      double P36_39_y = shape.part(36).y() - shape.part(39).y();

      double p36_39_sqrt=sqrt((P36_39_x * P36_39_x) + (P36_39_y * P36_39_y));



     double EAR= p37_41_sqrt +  p38_40_sqrt/2* p36_39_sqrt;


    cout << "EAR value =  " << EAR << endl;


  shapes.push_back(pose_model(cimg, faces[i]));


   const full_object_detection& d = shapes[0];

              }

        win.clear_overlay();
        win.set_image(cimg);
        win.add_overlay(render_face_detections(shapes));
    }
}
catch(serialization_error& e)
{
    cout << "You need dlib's default face landmarking model file to run this example." << endl;
    cout << "You can get it from the following URL: " << endl;
    cout << "   http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2" << endl;
    cout << endl << e.what() << endl;
}
catch(exception& e)
{
    cout << e.what() << endl;
}
 }

edit retag flag offensive close merge delete

Comments

did you actually try it ?

i don't think, that dlib's landmarks will deliver significant enough differences for open/closed eyes (but maybe i'm wrong here)

you'll still need the landmarks to detect the eye position, but imho, you'll need some cropped, open/close dataset of images to train on.

berak ( 2016-12-03 00:23:02 -0600 )edit

Hi , This paper : http://vision.fe.uni-lj.si/cvww2016/p... used facial landmarks to

detect eyes and then the eye aspect ratio (EAR) between height and width of the eye is computed.

sarmad ( 2016-12-03 12:38:21 -0600 )edit

in the end, you just need to save your EAR ratio (a single float) plus a "label" (open/closed), right ?

(i'm curious, how that'll work - training an SVM on a single feature)

berak ( 2016-12-04 03:08:38 -0600 )edit

It will only work if the classes are seperable. However, in this case I would go for a Normal Bayes classifier or a KNN classifier, who do way better in low dimensional data.

StevenPuttemans ( 2016-12-07 03:55:37 -0600 )edit

thanks @StevenPuttemans for your suggestion , is that means I should use Normal Bayes classifier or a KNN classifier on the obtained calcuaitng eye aspect ratio (EAR) ?

sarmad ( 2016-12-07 04:30:44 -0600 )edit

currently, you're saving all landmarks, but only printing out the eye-ones.

why don't you calculate your EAR right there, and save that ?

berak ( 2016-12-07 04:56:19 -0600 )edit

@berak Thanks for suggesting this , I will calculate EAR equation , but is ||p2-p6|| means Euclidian distance ? any suggestion of how can be calculated

sarmad ( 2016-12-08 05:08:32 -0600 )edit

yes, euclidean distance (L2 norm)

berak ( 2016-12-08 05:19:57 -0600 )edit

I have edited the question and included EAR calculation equation , is it correct ?

sarmad ( 2016-12-09 13:39:08 -0600 )edit

imho, you're missing braces here: (2.1 (1) in paper)

double EAR= (p37_41_sqrt +  p38_40_sqrt) / (2* p36_39_sqrt);

also, don't forget the other eye ! ("Since eye blinking is performed by both eyes synchronously, the EAR of both eyes is averaged")

(btw, just curious, what kind of (labelled??) data do you you have for this ?)

berak ( 2016-12-10 02:01:03 -0600 )edit

In the paper , it says ,

A linear SVM classier (called EAR SVM) is trained from manually annotated sequences. Positive examples are collected as ground-truth blinks, while the negatives are those that are sampled from parts of the videos where no blink occurs,

I have video data with annotations , but I don't have idea of how to make a classifier for EAR using it , do you have any suggestions ?

sarmad ( 2016-12-12 04:24:47 -0600 )edit

... look at the answer below, again ?

berak ( 2016-12-12 05:05:38 -0600 )edit

hmm, i don't quite understand your problem, as you have anything you need.

is it "reading video" ? you'd process frame by frame, and save ear value and label.

berak ( 2016-12-14 21:06:28 -0600 )edit

Yes , in processing video frames of the annotated video which has .tag ( blinks ) and .txt ( frames ). I got (EAR values computed for each frame in an annotated video sequence).

but , Now how I find a peak of an annotated blink, I don't know how to deal with the annotated video files for example in blink8 it is annotated by start and end of a blink . So the peak is probably a center of this interval. E. g. blink starts in a 38th frame and ends at 42th frame, so the blink peak is in the 40th frame of a sequence. after that take EAR values from 34th-46th frames = 13 scalar numbers and these numbers are 1 positive feature for training SVM.

sarmad ( 2016-12-15 08:42:52 -0600 )edit

oh, had to read the paper, again, to understand, what you mean...

i think, you need some kind of "ring buffer" here (eg. std::deque<float>(13)) . for each frame, push the current ear into one end, and pop the oldest at the other. then use the whole 13 elem vector as a feature for frame t-6 in the svm (write a label and all 13 ear's to csv)

(and still, for training, i'd try to do this seperately for each eye (not interpolated), to get more training data, but interpolate in the prediction phase)

berak ( 2016-12-15 09:14:33 -0600 )edit

To TEST: In a sliding-window fashion, for each frame in a video, take surrounding 13 EAR values and asked SVM classifier if these values are positive or negative. If positive, it means that the tested frame (in a center of 13 frames) is blink or not. In the annotated videos .tag and .txt shows the eye states and frame numbers LInk . but i'm very confused on how can I combine extracted EAR values with the annotated data

sarmad ( 2016-12-15 09:30:07 -0600 )edit

can you give a link to the data you're using ?

berak ( 2016-12-19 02:58:15 -0600 )edit

here is the link to it , it is annotated data . http://www2.fiit.stuba.sk/~fogelton/a...

sarmad ( 2016-12-19 06:29:21 -0600 )edit

add a comment

How eye landmarks points can be saved into .csv file ?

I applied face_landmark_detection_ex to an image and I got these points :

Test Image

are the eye landmarks correct ? how this can be used with a video file or web cam ?

sarmad ( 2016-12-05 03:43:04 -0600 )edit

yes, imho, the landmarks are correct. now try that with an closed eye image !

"how to save numbers into a txtfile" - now, cmon. that's basic.

berak ( 2016-12-05 03:53:04 -0600 )edit

I have tried on cllosed eye image , and got different eye landmarks . so now eye landmarks points are different for open/closed eyes .

I have tried this to save eye landmarks : Link to code but I'm getting errors .

sarmad ( 2016-12-05 05:01:10 -0600 )edit

sooo, what errors ?

berak ( 2016-12-05 05:24:05 -0600 )edit

unfortunately, the error does not match your code !

but i guess, you missed #include <dlib/opencv/cv_image.h>

berak ( 2016-12-06 19:08:12 -0600 )edit

eye landmark points

Comments

1 answer

Comments

Links

Question Tools

Stats

Related questions

eye landmark points edit

Comments

1 answer

Comments

Links

Question Tools

Stats

Related questions

eye landmark points