OpenCV 2.4.6 CascadeClassifier_GPU vs CascadeClassifier

asked 2013-10-05 16:24:14 -0500 by interogativ

updated 2013-10-05 16:25:11 -0500 by berak

I've built the win x64 version of OpenCV 2.4.6, and all of the samples, with CUDA support using the CUDA 5.5 SDK, and tested on an NVIDIA GeForce GT 630 (compute capability 2.1). I modified the objectDetection tutorial to add a second version of the detectAndDisplay method that uses the GPU. Given the same 16-second (465-frame) video of me facing the camera, detectAndDisplay is much more accurate than the CUDA version, which always fails to find the eyes: over the 465 frames, the CPU version finds 378 faces and 141 eyes, while the GPU version finds 387 faces and zero eyes. Is there something about the haarcascade_eye_tree_eyeglasses.xml file that doesn't work properly on the GPU? I would expect the same results given the same input in both scenarios. The only thing I can think of is that the CV_HAAR_SCALE_IMAGE flag passed to the CPU CascadeClassifier is affecting its ability to detect the eyes. Is there a better cascade file for the GPU classifier?

I've included the two detectAndDisplay functions for comparison.

    void detectAndDisplayGPU( Mat frame )
    {
        Mat frame_gray;
        GpuMat objBuf;

        cvtColor( frame, frame_gray, CV_BGR2GRAY );
        equalizeHist( frame_gray, frame_gray );
        GpuMat gray_gpu( frame_gray );  // upload the grayscale frame to the GPU

        //-- Detect faces
        int iNumDetects = face_cascade_GPU.detectMultiScale( gray_gpu, objBuf, 1.1, 2, Size(30, 30) );
        iTotalFaceDetects += iNumDetects;

        if( iNumDetects > 0 )
        {
            //-- Download the detections back to the host
            Mat obj_host;
            objBuf.colRange( 0, iNumDetects ).download( obj_host );
            Rect* faces = obj_host.ptr<Rect>();

            for( int i = 0; i < iNumDetects; i++ )
            {
                Point center( faces[i].x + faces[i].width/2, faces[i].y + faces[i].height/2 );
                ellipse( frame, center, Size( faces[i].width/2, faces[i].height/2 ), 0, 0, 360, Scalar( 255, 0, 255 ), 2, 8, 0 );

                Mat faceROI = frame_gray( faces[i] );
                GpuMat eye_gpu( faceROI );  // upload the face ROI for eye detection

                //-- In each face, detect eyes
                GpuMat eyeObjBuf;
                int iNumEyeDetects = eyes_cascade_GPU.detectMultiScale( eye_gpu, eyeObjBuf, 1.1, 2, Size(30, 30) );
                iTotalEyeDetects += iNumEyeDetects;

                if( iNumEyeDetects > 0 )
                {
                    Mat eye_obj_host;
                    eyeObjBuf.colRange( 0, iNumEyeDetects ).download( eye_obj_host );
                    Rect* eyes = eye_obj_host.ptr<Rect>();

                    for( int j = 0; j < iNumEyeDetects; j++ )
                    {
                        Point eye_center( faces[i].x + eyes[j].x + eyes[j].width/2, faces[i].y + eyes[j].y + eyes[j].height/2 );
                        int radius = cvRound( (eyes[j].width + eyes[j].height)*0.25 );
                        circle( frame, eye_center, radius, Scalar( 255, 0, 0 ), 3, 8, 0 );
                    }
                }
            }
        }
        //-- Show what you got
        imshow( window_name, frame );
    }

    void detectAndDisplay( Mat frame )
    {
        std::vector<Rect> faces;
        Mat frame_gray;

        cvtColor( frame, frame_gray, CV_BGR2GRAY );
        equalizeHist( frame_gray, frame_gray );

        //-- Detect faces
        face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
        iTotalFaceDetects += (int) faces.size();

        for( size_t i = 0; i < faces.size(); i++ )
        {
            Point center( faces[i].x + faces[i].width/2, faces[i].y + faces[i].height/2 );
            ellipse( frame, center, Size( faces[i].width/2, faces[i].height/2 ), 0, 0, 360, Scalar( 255, 0, 255 ), 2, 8, 0 );

            Mat faceROI = frame_gray( faces[i] );
            std::vector<Rect> eyes;

            //-- In each face, detect eyes
            eyes_cascade.detectMultiScale( faceROI, eyes, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
            iTotalEyeDetects += (int) eyes.size();

            for( size_t j = 0; j < eyes.size(); j++ )
            {
                Point eye_center( faces[i].x + eyes[j].x + eyes[j].width/2, faces[i].y + eyes[j].y + eyes[j].height/2 );
                int radius = cvRound( (eyes[j].width + eyes[j].height)*0.25 );
                circle( frame, eye_center, radius, Scalar( 255, 0, 0 ), 3, 8, 0 );
            }
        }
        //-- Show what you got
        imshow( window_name, frame );
    }

Comments

I've done more research and then implemented a detectMultiScaleWithParms method that allows you to pass in the following flags:

    enum
    {
        NCVPipeObjDet_Default             = 0x000,
        NCVPipeObjDet_UseFairImageScaling = 0x001, // not implemented
        NCVPipeObjDet_FindLargestObject   = 0x002,
        NCVPipeObjDet_VisualizeInPlace    = 0x004,
    };

None of these flags makes any difference in the output.

interogativ ( 2013-10-07 16:14:28 -0500 )