Gpu API call error (out of memory) in mallocPitch    
Hi, I compiled OpenCV 2.4.9 with CMake and WITH_CUDA=ON, and I get the following error on my Late 2009 MacBook Pro (GeForce 9400M GPU) running Snow Leopard. GPU-Z does not find my CUDA GPU, but OpenCV does! OpenCV detects the 9400M but does not run my test code, even though I have found posts from people who made this work. What am I doing wrong? I have been trying to solve this problem for days now. Any help would be much appreciated!
#### error ###############
OpenCV Error: Gpu API call (out of memory) in mallocPitch, file /Users/michael/Documents/OpenCV-2.4.2/modules/core/src/gpumat.cpp, line 1276
terminate called after throwing an instance of 'cv::Exception'
  what():  /Users/michael/Documents/OpenCV-2.4.2/modules/core/src/gpumat.cpp:1276: error: (-217) out of memory in function mallocPitch
Abort trap
#### Code #########################
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <numeric>
#include <opencv2/core/core.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/imgproc/imgproc.hpp>
using namespace std;
using namespace cv;
int main(int, char**)
{
    gpu::printCudaDeviceInfo(cv::gpu::getDevice());
    Mat src2, dst;
    Mat src = imread("file.png", CV_LOAD_IMAGE_GRAYSCALE);  
    gpu::GpuMat edges;
    namedWindow("Window", WINDOW_AUTOSIZE);
    for(;;)
    {
        src.copyTo(src2);
        gpu::GpuMat frame_gpu(src2);
        gpu::GaussianBlur(frame_gpu, edges, Size(7,7), 1.5, 1.5);
        edges.download(dst);
        imshow("Window", dst);
        if(waitKey(30) >= 0) break;
    }
    return 0;
}
#### CUDA Device Query (Runtime API) #######################
Device count: 1
Device 0: "GeForce 9400M"
  CUDA Driver Version / Runtime Version          4.10 / 4.10
  CUDA Capability Major/Minor version number:    1.1
  Total amount of global memory:                 254 MBytes (265945088 bytes)
  ( 2) Multiprocessors x ( 8) CUDA Cores/MP:     16 CUDA Cores
  GPU Clock Speed:                               1.10 GHz
  Memory Clock rate:                             1062.50 Mhz
  Memory Bus Width:                              128-bit
  Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             256 bytes
  Concurrent copy and execution:                 No with 0 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   No
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      No
  Device PCI Bus ID / PCI location ID:           2 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) 
#### OpenCV config ###########
-- General configuration for OpenCV 2.4.9 =====================================
--   Version control:               commit:b7b32e7
-- 
--   Platform:
--     Host:                        Darwin 10.8.0 ...
 
 
On which iteration of the for loop are you getting the error? Is it possible that you have a memory leak?

Kirill, thanks for your reply. Sorry for my late answer; I could not work on this project for a while.
The loop exits on the first iteration, when it tries to upload the Mat to the GPU.
CUDA 5.0.36 and CUDA driver 5.0.45 are now working on my Late 2009 MacBook Pro after I installed Mountain Lion 10.8.3. All the NVIDIA SDK demos for CUDA compute capability 1.1 are working.
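
Since the failure happens on the very first upload, it may help to estimate how much device memory a single pitched image allocation would need and compare it with the total reported by the device query. The following is only a rough sketch: the helper names are made up for illustration, and the 256-byte row alignment is an assumption borrowed from the "Texture alignment" line of the device query. The real pitch is chosen by the CUDA driver inside mallocPitch and may differ.

```cpp
#include <cstddef>

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
std::size_t roundUp(std::size_t value, std::size_t alignment)
{
    return ((value + alignment - 1) / alignment) * alignment;
}

// Estimated size in bytes of a pitched 2D allocation, with each row padded
// to a multiple of `alignment`.
// ASSUMPTION: 256 is only the texture alignment printed by the device query;
// the actual pitch is decided by the CUDA driver.
std::size_t estimatePitchedBytes(std::size_t rows, std::size_t cols,
                                 std::size_t elemSize,
                                 std::size_t alignment = 256)
{
    return rows * roundUp(cols * elemSize, alignment);
}

// Fraction of the device's total global memory that `bytes` would occupy.
double fractionOfDevice(std::size_t bytes, std::size_t totalBytes)
{
    return static_cast<double>(bytes) / static_cast<double>(totalBytes);
}
```

For a hypothetical 1920x1080 8-bit grayscale image, estimatePitchedBytes(1080, 1920, 1) gives 2211840 bytes, which is under 1% of the 265945088 bytes reported for the 9400M. If file.png is of comparable size, the image itself is unlikely to be what exhausts the memory. Note that the 9400M is an integrated GPU sharing host memory, so the free amount can be well below the total; querying the free memory at run time (for example with cv::gpu::DeviceInfo::freeMemory(), if your build provides it) would be more informative than this estimate.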