
Haar Cascade detecting only faces (no heads)?

asked 2012-09-18 16:34:11 -0500

Kamil Kalinowski

Hi, I'm currently trying to solve a face tracking problem using a mix of Haar Cascade and CamShift methods. First, I find the face region using CascadeClassifier, then I pass it to a method that calculates a histogram for that region. Then, given that region and histogram, a new region is calculated.

It looks fine on paper, but in reality it's a whole other thing. When using CascadeClassifier, I get a Rect object that contains not only the face but also the whole head and a bit of background. That makes the calculated histogram useless: with every CamShift loop iteration, the region that formerly contained the face expands because of the bits of background mixed into the histogram.

So, back to the topic: do any of you know of Haar cascade XML files that would cause CascadeClassifier to return only the face region, excluding the rest of the head? Or do you know of any method that would help me get the desired result?

So far, I have tried these resources:
- Tried; all work like the default cascades shipped with OpenCV.
- Those XML files actually cause the application to fail while loading them into the cascade, though the method used for training them (faces only) seems to be what I'm looking for.

I would appreciate any feedback, thanks


2 answers


answered 2012-09-25 11:09:15 -0500

Kirill Kornyakov

Why don't you try to cut out the face yourself? You have the face rectangle, so just create a mask in the form of an ellipse, put it on the rectangle and fill everything outside the ellipse with black. If you don't trust the face detector, you can try to find the eyes and use them to improve the positioning of the ellipse.

So, for all of the above you only need the ellipse function and setTo with a mask.

You can find some more details here; search for "elliptical mask" in the "How to preprocess facial images for Face Recognition" post.
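The masking idea above can be sketched in plain C without any OpenCV calls. This is only an illustration under stated assumptions: the image is a single-channel 8-bit buffer, and `FaceRect` and `mask_outside_ellipse` are hypothetical names invented for this sketch (they are not OpenCV types or functions). The ellipse is simply the one inscribed in the detected rectangle.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical rectangle type for this sketch (not an OpenCV type).
typedef struct { int x, y, width, height; } FaceRect;

// Set to 0 every pixel that lies inside `face` but outside the ellipse
// inscribed in it. Pixels outside `face` are left untouched.
// `stride` is the number of bytes per image row.
void mask_outside_ellipse(unsigned char *img, int img_w, int img_h,
                          size_t stride, FaceRect face)
{
    assert(img != NULL && face.width > 0 && face.height > 0);

    const double cx = face.x + face.width / 2.0;   // ellipse centre
    const double cy = face.y + face.height / 2.0;
    const double rx = face.width / 2.0;            // semi-axes
    const double ry = face.height / 2.0;

    for (int y = face.y; y < face.y + face.height; ++y) {
        if (y < 0 || y >= img_h) continue;         // skip rows outside the image
        for (int x = face.x; x < face.x + face.width; ++x) {
            if (x < 0 || x >= img_w) continue;     // skip columns outside the image
            // Test the pixel centre so the mask comes out symmetric.
            const double dx = (x + 0.5 - cx) / rx;
            const double dy = (y + 0.5 - cy) / ry;
            if (dx * dx + dy * dy > 1.0)
                img[(size_t)y * stride + (size_t)x] = 0;
        }
    }
}
```

For histogram-based tracking it may be preferable to keep the image intact and build a separate mask buffer the same way, so that black pixels do not end up in the histogram; that is a design choice beyond what the answer states.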


answered 2012-09-26 02:25:50 -0500

elmiguelao

updated 2012-09-26 05:56:37 -0500

Kirill Kornyakov

First, on the Haar output: I've always seen the face bounding box span vertically from about half the forehead to the chin, never the whole head. You can check this with an experiment: detect a face turned partially sideways and downwards. The detected centre is on the nose, i.e. not in the middle of the head, and the size is reduced (by a lot, if the face points far downwards) compared to the size of the head with hair and all.

Another option is to take the face detection radius, convert it to a bounding box and collect the skin colour pixels inside it and in its immediate neighbourhood, say 1.25 times the width. Then use those skin pixels smartly :) for instance by running the CamShift method on them.

For finding skin colour pixels, you can read about the idea at length here, but basically it is an AND of the thresholded Hue, Sat and Value channels. I found that the rule from that link works best slightly modified, like this:

skin_out = (hue < 20) ^ (sat > 48) ^ (val > 80), where ^ means pixel-wise AND

There are other rules that work in RGB space, but I have not seen a great improvement compared to this one. The HSV rule has trouble with people who look very pale, but this can usually be corrected by a luminance-equalisation preprocessing stage on the input image. This is advisable anyway if you want to track real-world sequences, where camera autoexposure can cause non-linear colour changes that will confuse the Haar detection process, the skin detection, or both.
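As a rough sketch of the radius-to-bounding-box step above, in plain C with no OpenCV calls: the circle's bounding box is grown by a scale factor (1.25 as suggested) and clamped to the image. `SearchRect` and `face_search_window` are hypothetical names made up for this example, and the rounding and clamping behaviour are my assumptions.

```c
#include <assert.h>

// Hypothetical rectangle type for this sketch (not an OpenCV type).
typedef struct { int x, y, width, height; } SearchRect;

// Turn a detected face circle (centre cx, cy and radius) into a square
// search window whose side is `scale` times the circle's diameter,
// clamped so that it stays inside an img_w x img_h image.
SearchRect face_search_window(int cx, int cy, int radius, double scale,
                              int img_w, int img_h)
{
    assert(radius > 0 && scale > 0.0 && img_w > 0 && img_h > 0);

    // Half the side of the enlarged window, rounded to the nearest pixel.
    const int half = (int)(radius * scale + 0.5);

    int x0 = cx - half, y0 = cy - half;   // inclusive top-left corner
    int x1 = cx + half, y1 = cy + half;   // exclusive bottom-right corner

    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 > img_w) x1 = img_w;
    if (y1 > img_h) y1 = img_h;

    SearchRect r = { x0, y0, x1 - x0, y1 - y0 };
    if (r.width < 0) r.width = 0;         // window entirely outside the image
    if (r.height < 0) r.height = 0;
    return r;
}
```

For example, a face centred at (100, 100) with radius 40 in a 640x480 frame gives a 100x100 window starting at (50, 50); the skin pixels inside that window are the ones to feed to the tracker.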
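One possible reading of the luminance-equalisation stage mentioned above is plain histogram equalisation of the brightness channel. The sketch below assumes that reading and works on a raw buffer of 8-bit luminance samples; `equalize_luminance` is a hypothetical helper name, not an OpenCV function, and other equalisation schemes would fit the description just as well.

```c
#include <assert.h>
#include <stddef.h>

// Equalise the histogram of `n` 8-bit luminance samples in place, so that
// the darkest occupied level maps to 0 and the brightest to 255.
void equalize_luminance(unsigned char *lum, size_t n)
{
    assert(lum != NULL);
    if (n == 0) return;

    // Count how often each brightness level occurs.
    size_t hist[256] = {0};
    for (size_t i = 0; i < n; ++i) hist[lum[i]]++;

    // Cumulative count at the darkest occupied level.
    size_t cdf_min = 0;
    for (int v = 0; v < 256; ++v) {
        if (hist[v] != 0) { cdf_min = hist[v]; break; }
    }
    if (cdf_min == n) return;   // single brightness level: nothing to stretch

    // Build the lookup table from the cumulative distribution (rounded).
    unsigned char lut[256];
    size_t cdf = 0;
    for (int v = 0; v < 256; ++v) {
        cdf += hist[v];
        if (cdf < cdf_min) { lut[v] = 0; continue; }   // below darkest level
        lut[v] = (unsigned char)(((cdf - cdf_min) * 255 + (n - cdf_min) / 2)
                                 / (n - cdf_min));
    }

    for (size_t i = 0; i < n; ++i) lum[i] = lut[lum[i]];
}
```

The idea would be to run this on the brightness plane before the skin thresholds are applied, so that the `val > 80` test is less sensitive to exposure changes.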

A sample function creating a skin/no-skin gray image, from the HSV version, would be (in C):

#include <limits.h>                      // UCHAR_MAX
#include <opencv2/core/core_c.h>         // IplImage, cvCreateImage, cvSplit, cvAnd
#include <opencv2/imgproc/imgproc_c.h>   // cvCvtColor, cvThreshold, cvErode

int compose_skin_matrix(IplImage* rgbin, IplImage* gray_out)
{
      // Crappy static pointers: allocated on the first call and reused afterwards.
      // Works, but best done some other way, since it is not thread-safe and
      // assumes every frame has the same size as the first one.
      static IplImage* imageHSV = NULL;
      static IplImage* planeH = NULL;  // Hue component.
      static IplImage* planeS = NULL;  // Saturation component.
      static IplImage* planeV = NULL;  // Brightness component.
      if (imageHSV == NULL) {
            imageHSV = cvCreateImage( cvGetSize(rgbin), IPL_DEPTH_8U, 3);
            planeH = cvCreateImage( cvGetSize(rgbin), IPL_DEPTH_8U, 1);
            planeS = cvCreateImage( cvGetSize(rgbin), IPL_DEPTH_8U, 1);
            planeV = cvCreateImage( cvGetSize(rgbin), IPL_DEPTH_8U, 1);
      }

      // rgbin is expected in RGB channel order; use CV_BGR2HSV for BGR frames.
      cvCvtColor(rgbin, imageHSV, CV_RGB2HSV);
      cvSplit(imageHSV, planeH, planeS, planeV, NULL); // Extract the 3 color components.

      // Detect which pixels in each of the H, S and V channels are probably skin pixels.
      // Assume that skin has a Hue of at most 20 (out of 180), Saturation above 48 and Brightness above 80.
      cvThreshold(planeH , planeH , 20, UCHAR_MAX, CV_THRESH_BINARY_INV);     //(hue <= 20)
      cvThreshold(planeS , planeS , 48, UCHAR_MAX, CV_THRESH_BINARY);         //(sat > 48)
      cvThreshold(planeV , planeV , 80, UCHAR_MAX, CV_THRESH_BINARY);         //(val > 80)

      // erode the HUE mask to get rid of noise.
      cvErode(planeH, planeH, NULL, 1);

      // Combine all 3 thresholded color components, so that an output pixel will only
      // be white (255) if the H, S and V pixels were also white.

      // gray_out = (hue <= 20) ^ (sat > 48) ^ (val > 80), where ^ means pixel-wise AND
      cvAnd(planeH  , planeS , gray_out, NULL);
      cvAnd(gray_out, planeV , gray_out, NULL);

      return 1; // always reports success; make it smarter ;)
}
