Problem getting CUDA to work on OpenCV3 latest clone

I am trying to get the CUDA interface of OpenCV to work, but I get the following error the moment I call a CUDA-related function:

 OpenCV Error: Gpu API call (NCV Assertion Failed: cudaError_t=2, file=/home/spu/Documents/github/opencv_CUDA/modules/cudalegacy/src/NCV.cpp, line=363) in NCVDebugOutputHandler, file /home/spu/Documents/github/opencv_CUDA/modules/cudaobjdetect/src/cascadeclassifier.cpp, line 156
terminate called after throwing an instance of 'cv::Exception'
  what():  /home/spu/Documents/github/opencv_CUDA/modules/cudaobjdetect/src/cascadeclassifier.cpp:156: error: (-217) NCV Assertion Failed: cudaError_t=2, file=/home/spu/Documents/github/opencv_CUDA/modules/cudalegacy/src/NCV.cpp, line=363 in function NCVDebugOutputHandler

Some digging brought me to an old topic on the OpenCV dev forum that suggested upgrading my NVIDIA driver, so that is what I did. I pulled the latest NVIDIA graphics driver for Ubuntu 14.04 64-bit, downloaded CUDA 7.0 from the official website, configured my system and successfully built the OpenCV library.

However, when I run the code below, the error still occurs.

// Perform the GPU detector
Ptr<cuda::CascadeClassifier> cascade_gpu = cuda::CascadeClassifier::create("/home/spu/Documents/github/opencv_CUDA/data/haarcascades_cuda/haarcascade_fullbody.xml");
for(int scale = 1; scale<6; scale++){
    Mat current;
    // cv::Size takes (width, height), i.e. (cols, rows)
    resize(hist, current, Size(image.cols/scale, image.rows/scale));
    // Start timing here
    int64 t0 = getTickCount();
    // We need to include the time for pushing and retrieving the data to and from the GPU
    cuda::GpuMat image_gpu(current);
    cuda::GpuMat objbuf;
    cascade_gpu->detectMultiScale(image_gpu, objbuf);
    std::vector<Rect> detections;
    cascade_gpu->convert(objbuf, detections);
    // End timing here and output
    int64 t1 = getTickCount();
    double secs = (t1-t0)/getTickFrequency();
    cerr << "Measurement - division by " << scale << ": time = " << secs << " seconds"<< endl;
}

Does anyone have a clue how I can solve this?

System configuration:

  • CUDA 7.0 with graphics driver 352.30
  • Ubuntu 14.04 64-bit
  • Graphics cards: 2x NVIDIA Quadro K2000
  • OpenCV was built for all known compute capabilities, to avoid targeting the wrong one

EDIT: partially found the reason for the crash

Processing an 8000x4000 pixel image on the CPU hits no memory limit, but once you push the data to the GPU, memory restrictions can apply. Since I am iteratively downscaling the image and the error was already triggered on the first iteration (the full-resolution image), I looked deeper into the error code.

cudaError_t=2 means cudaErrorMemoryAllocation (the API call failed because it was unable to allocate enough memory to perform the requested operation). So apparently my GPU memory cannot hold the complete image. I am now looking into how to solve this.
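
To get a feeling for the numbers, here is a rough back-of-the-envelope sketch of how much GPU memory each downscaling step might need. It is not taken from the OpenCV sources. The helper names (bufferBytes, estimateDetectorBytes, printEstimates) are made up for illustration, and the buffer layout is an assumption: an 8-bit single-channel image plus a 32-bit integral image and a 64-bit squared integral image, each one pixel larger in both dimensions. The real detector may well allocate additional scratch buffers on top of this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: bytes needed for a width x height buffer.
std::size_t bufferBytes(std::size_t width, std::size_t height, std::size_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Rough estimate for one downscaled image (integer division, as in the loop above).
// Assumed buffers (not verified against the NCV implementation):
//   - 8-bit grayscale input, w x h
//   - 32-bit integral image, (w + 1) x (h + 1)
//   - 64-bit squared integral image, (w + 1) x (h + 1)
std::size_t estimateDetectorBytes(std::size_t width, std::size_t height, int scale) {
    assert(scale > 0);
    const std::size_t w = width / static_cast<std::size_t>(scale);
    const std::size_t h = height / static_cast<std::size_t>(scale);
    return bufferBytes(w, h, 1)
         + bufferBytes(w + 1, h + 1, 4)
         + bufferBytes(w + 1, h + 1, 8);
}

// Print the estimate for the same scales the detection loop iterates over.
void printEstimates(std::size_t width, std::size_t height) {
    for (int scale = 1; scale < 6; ++scale) {
        const double mib = estimateDetectorBytes(width, height, scale) / (1024.0 * 1024.0);
        std::printf("division by %d: ~%.1f MiB\n", scale, mib);
    }
}
```

Calling printEstimates(8000, 4000) lists the estimate per scale. Under these assumptions the raw 8-bit image is only 32,000,000 bytes, so most of the footprint would come from the working buffers rather than from the image itself.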

EDIT2: 32-bit versus 64-bit

I downloaded the CUDA 7.0 toolkit from https://developer.nvidia.com/cuda-downloads, which makes no distinction between 32-bit and 64-bit systems, unlike CUDA 6.5, which can be found at https://developer.nvidia.com/cuda-toolkit-65. I am going to see if the 6.5 64-bit download does the trick.

EDIT3: forget about the above remark: the 7.0 installer is simply the same for 32-bit AND 64-bit systems. I am currently running CMake against CUDA 6.5 for comparison, to see if the error occurs there as well.