First, on the Haar output: in my experience the face bounding box spans vertically from roughly mid-forehead to the chin, never the whole head. You can check this by detecting a face turned partially sideways and downwards: the detected centre sits on the nose, i.e. not in the middle of the head, and the detected size shrinks (considerably, if the face is tilted far down) compared with the size of the head, hair and all.
Another option is to take the face detection radius, convert it to a bounding box, and collect the skin-colour pixels inside it and in its immediate neighbourhood, say a box 1.25 times as wide. Then use those skin pixels smartly :), for instance by running the CamShift method on them.
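As a rough sketch of the radius-to-box step, assuming the detection is given as a centre and a radius in pixels: the Rect type and the face_search_window name are made up for illustration (with OpenCV's C API you would fill a CvRect instead), and the 1.25 factor is passed in as a parameter.

```c
/* Hypothetical rectangle type; with OpenCV's C API a CvRect would play this role. */
typedef struct { int x, y, width, height; } Rect;

/* Build a square box around a detection (centre cx, cy and radius r),
   enlarge it by `scale` (e.g. 1.25) and clip it to the image bounds. */
Rect face_search_window(int cx, int cy, int r, double scale, int img_w, int img_h)
{
    int half = (int)(r * scale + 0.5);   /* half side of the enlarged box */
    int x0 = cx - half, y0 = cy - half;
    int x1 = cx + half, y1 = cy + half;
    Rect box;

    /* Clip to the image so the box can be used directly as a region of interest. */
    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 > img_w) x1 = img_w;
    if (y1 > img_h) y1 = img_h;

    box.x = x0;
    box.y = y0;
    box.width  = (x1 > x0) ? x1 - x0 : 0;
    box.height = (y1 > y0) ? y1 - y0 : 0;
    return box;
}
```

The resulting box is the region in which you would look for skin pixels and which you could hand to a tracker as its initial window.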
For finding skin-colour pixels, the idea is described at length here (http://www.shervinemami.co.cc/blobs.html); basically it is an AND of the thresholded Hue, Saturation and Value channels. I found that the rule from that link works best slightly modified:
skin_out = (hue < 20) ^ (sat > 48) ^ (val > 80), where ^ means pixel-wise AND
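Written out per pixel, the rule could look like the sketch below. It assumes 8-bit HSV values in OpenCV's ranges (hue 0 to 179, saturation and value 0 to 255) stored interleaved as H, S, V; the names skin_rule and skin_mask_hsv are only illustrative.

```c
#include <stddef.h>

/* 255 if an HSV pixel passes the rule, 0 otherwise.
   Assumes OpenCV's 8-bit HSV ranges: hue in [0, 180), sat and val in [0, 255]. */
unsigned char skin_rule(unsigned char hue, unsigned char sat, unsigned char val)
{
    return (hue < 20 && sat > 48 && val > 80) ? 255 : 0;
}

/* Apply the rule to n interleaved HSV pixels (3 bytes each), writing an n-byte mask. */
void skin_mask_hsv(const unsigned char *hsv, unsigned char *mask, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        mask[i] = skin_rule(hsv[3 * i], hsv[3 * i + 1], hsv[3 * i + 2]);
}
```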
There are other rules that work in RGB space, but I have not seen much improvement over this one. The HSV rule has trouble with very pale people, but this can usually be corrected by a luminance-equalisation preprocessing stage on the input image. That stage is advisable anyway if you want to track real-world sequences, where camera auto-exposure can cause non-linear colour changes that will confuse the Haar detection, the skin detection, or both.
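One common way to do such an equalisation is plain histogram equalisation of the luminance plane. The sketch below is only an illustration of that idea on a raw 8-bit buffer (the equalize_u8 name is made up); in OpenCV's C API, cvEqualizeHist does this job on a single-channel 8-bit image.

```c
#include <stddef.h>

/* Histogram-equalise an 8-bit single-channel buffer of n pixels in place. */
void equalize_u8(unsigned char *pix, size_t n)
{
    size_t hist[256] = {0};
    unsigned char lut[256];
    size_t i, cdf = 0, cdf_min = 0;
    int level;

    if (n == 0) return;

    /* Histogram of grey levels. */
    for (i = 0; i < n; ++i) hist[pix[i]]++;

    /* The first non-empty bin gives the smallest non-zero CDF value. */
    for (level = 0; level < 256; ++level) {
        if (hist[level] != 0) { cdf_min = hist[level]; break; }
    }
    if (cdf_min == n) return;   /* a single grey level: nothing to stretch */

    /* Map each level through the normalised cumulative histogram (rounded). */
    for (level = 0; level < 256; ++level) {
        cdf += hist[level];
        if (cdf < cdf_min)
            lut[level] = 0;     /* levels below the darkest one present (unused) */
        else
            lut[level] = (unsigned char)(((cdf - cdf_min) * 255 + (n - cdf_min) / 2)
                                         / (n - cdf_min));
    }

    for (i = 0; i < n; ++i) pix[i] = lut[pix[i]];
}
```

You would apply this to a luminance channel only (for example V of HSV) and leave hue untouched, so that the skin rule still sees the original colours.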
A sample function that builds a skin/no-skin gray image with the HSV rule would be (in C):
#include <limits.h>    // UCHAR_MAX
#include <opencv/cv.h> // legacy C API; the header path may differ between OpenCV versions

int compose_skin_matrix(IplImage* rgbin, IplImage* gray_out)
{
    // Static work buffers: allocated on the first call and reused afterwards,
    // so every call must pass images of the same size. Works, but best done some other way.
    static IplImage* imageHSV = NULL;
    static IplImage* planeH = NULL; // Hue component.
    static IplImage* planeS = NULL; // Saturation component.
    static IplImage* planeV = NULL; // Brightness component.
    if (imageHSV == NULL) {
        imageHSV = cvCreateImage(cvGetSize(rgbin), IPL_DEPTH_8U, 3);
        planeH = cvCreateImage(cvGetSize(rgbin), IPL_DEPTH_8U, 1);
        planeS = cvCreateImage(cvGetSize(rgbin), IPL_DEPTH_8U, 1);
        planeV = cvCreateImage(cvGetSize(rgbin), IPL_DEPTH_8U, 1);
    }
    // rgbin is assumed to hold RGB-ordered pixels; for OpenCV's default BGR order use CV_BGR2HSV.
    cvCvtColor(rgbin, imageHSV, CV_RGB2HSV);
    cvSplit(imageHSV, planeH, planeS, planeV, 0); // Extract the 3 colour components.
    // Detect which pixels in each of the H, S and V channels are probably skin pixels.
    // Assume that skin has a Hue of at most 20 (out of 180), Saturation above 48 and Brightness above 80.
    cvThreshold(planeH, planeH, 20, UCHAR_MAX, CV_THRESH_BINARY_INV); // hue <= 20
    cvThreshold(planeS, planeS, 48, UCHAR_MAX, CV_THRESH_BINARY);     // sat > 48
    cvThreshold(planeV, planeV, 80, UCHAR_MAX, CV_THRESH_BINARY);     // val > 80
    // Erode the thresholded hue mask to get rid of noise.
    cvErode(planeH, planeH, NULL, 1);
    // Combine all 3 thresholded colour components, so that an output pixel is
    // white (255) only if the H, S and V pixels are all white.
    // gray_out = (hue <= 20) AND (sat > 48) AND (val > 80), pixel-wise
    cvAnd(planeH, planeS, gray_out, NULL);
    cvAnd(gray_out, planeV, gray_out, NULL);
    return 1; // make it smarter ;)
}