Strange behaviour when using GPU module

Hi all,

When I use GPU code, I sometimes see strange behaviour that is reproducible (at least on my machine).

When I run this code

cudaStream_t stream;

cudaSafeCall( cudaStreamCreate( &stream ) );

in a function, everything is fine and cudaStreamCreate returns cudaSuccess.

As soon as I run this code

cudaStream_t stream;

cudaSafeCall( cudaStreamCreate( &stream ) );

gpu::Stream streamddd;

the cudaStreamCreate() call on the second line returns cudaErrorUnknown. Note that execution never reaches the gpu::Stream line that was added in the second example.
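To pin down exactly which call fails and where, a checking wrapper in the spirit of cudaSafeCall can report the failing expression together with its file and line. The following is only a minimal sketch under stated assumptions, not OpenCV's actual cudaSafeCall: `Status`, `statusString`, `CHECK_CALL`, `fakeStreamCreateOk` and `fakeStreamCreateFailing` are hypothetical stand-ins so that the example compiles without the CUDA toolkit. In real code the wrapped expression would be something like `cudaStreamCreate(&stream)` and the message text would come from `cudaGetErrorString`.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for cudaError_t; real code would use the CUDA type.
enum Status { kSuccess = 0, kUnknown = 1 };

// Hypothetical stand-in for cudaGetErrorString.
inline const char* statusString(Status s) {
    return s == kSuccess ? "success" : "unknown error";
}

// Build a message naming the failing expression and its source location.
inline std::string describeFailure(Status s, const char* expr,
                                   const char* file, int line) {
    std::ostringstream out;
    out << file << ":" << line << ": " << expr
        << " failed: " << statusString(s);
    return out.str();
}

// Throw if the wrapped call did not report success.
inline void checkCall(Status s, const char* expr, const char* file, int line) {
    if (s != kSuccess) {
        throw std::runtime_error(describeFailure(s, expr, file, line));
    }
}

// Stringify the expression so the error message shows which call failed.
#define CHECK_CALL(expr) checkCall((expr), #expr, __FILE__, __LINE__)

// Hypothetical stand-ins: one call that succeeds and one that fails.
inline Status fakeStreamCreateOk() { return kSuccess; }
inline Status fakeStreamCreateFailing() { return kUnknown; }
```

Wrapping a call as `CHECK_CALL(fakeStreamCreateFailing());` throws a std::runtime_error whose message contains the file, the line and the text of the wrapped expression, while `CHECK_CALL(fakeStreamCreateOk());` returns normally.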

I am running a debug build against OpenCV 2.4.9 with CUDA 4.2 in Visual Studio 2008 (32-bit build). I compiled OpenCV myself, in both debug and release configurations. Both built out of the box using CMake.

When I run opencv_test_gpu from the CMake-generated OpenCV solution, my graphics card is recognized correctly:

[----------]
[ GPU INFO ]    Run on OS Windows x32.
[----------]
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***

Device count: 1

Device 0: "Quadro K1100M"
  CUDA Driver Version / Runtime Version          6.0 / 4.20
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 2048 MBytes (2147483648 bytes)
  ( 2) Multiprocessors x (192) CUDA Cores/MP:     384 CUDA Cores
  GPU Clock Speed:                               0.71 GHz
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      No
  Device PCI Bus ID / PCI location ID:           1 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version  = 6.0, CUDA Runtime Version = 4.20, NumDevs = 1

Run tests on all supported devices

but it reports many failures such as

[ RUN      ] GPU_ImgProc/CvtColor.GRAY2BGR/8
unknown file: error: C++ exception with description "D:\compiled\OpenCV\sources\modules\dynamicuda\include\opencv2/dynamicuda/dynamicuda.hpp:1134: error: (-217) unknown error in function CudaFuncTable::mallocPitch
" thrown in the test body.
[  FAILED  ] GPU_ImgProc/CvtColor.GRAY2BGR/8, where GetParam() = (Quadro K1100M, 113x113, CV_16U, whole matrix) (3235 ms)

but not every test fails:

[ RUN      ] GPU_ImgProc/CvtColor.BGR5652BGR/2
[       OK ] GPU_ImgProc/CvtColor.BGR5652BGR/2 (1 ms)

What am I missing? What is wrong in my thinking? What does OpenCV do to my GPU? If you need any further information to work out what the problem may be, please do not hesitate to ask.

Thanks in advance.

Cheers, Willi