bharath422's profile - activity

2012-09-28 12:30:34 -0600 commented answer GPU Code Not Working Question

Oh cool, I forgot about that. Isn't it also that the maximum number of threads per block is 512, so 40 * 40 = 1600 exceeds it? Also, is the frame rate of the output video the same as the input's? Is there a way to measure this?
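
On measuring the frame rate, here is a minimal sketch, assuming the timing is done on the host once per loop iteration. `FpsMeter` and its methods are hypothetical names for illustration, not part of OpenCV. For the input side, `cap.get(CV_CAP_PROP_FPS)` asks the capture device for its rate, though it may return 0 on some webcams.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not an OpenCV API): average frames per second
// measured between the first and the most recent tick.
class FpsMeter {
public:
    typedef std::chrono::steady_clock Clock;

    // Call once per displayed frame. The timestamp parameter exists so the
    // meter can also be driven with synthetic times.
    void tick(Clock::time_point now = Clock::now()) {
        if (frames_ == 0) first_ = now;
        last_ = now;
        ++frames_;
    }

    // Average rate in frames per second; 0 until two frames have been seen.
    double fps() const {
        if (frames_ < 2) return 0.0;
        std::chrono::duration<double> elapsed = last_ - first_;
        if (elapsed.count() <= 0.0) return 0.0;
        // N frames span N - 1 intervals.
        return static_cast<double>(frames_ - 1) / elapsed.count();
    }

private:
    Clock::time_point first_;
    Clock::time_point last_;
    std::size_t frames_ = 0;
};
```

In the capture loop this would be one `meter.tick()` call after `imshow`/`waitKey`, with `meter.fps()` printed every so often. Note that `waitKey(30)` by itself limits the loop to roughly 33 iterations per second, so the displayed rate can be lower than what the camera delivers.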

2012-09-21 12:05:55 -0600 commented answer GPU Code Not Working Question

You're a genius! Works like a charm! Thanks a ton! I'll probably have more questions, and I'm hoping you can help me out. Also, I am wondering why you say 16*16 is the best decomposition. Is it because each block can hold at most 512 threads?
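
For reference, the arithmetic behind a 16*16 block can be sketched on the host as below. This is only an illustration: `divUp`, `LaunchShape`, `makeLaunchShape` and `fitsInBlock` are hypothetical names, and the limit is passed in as a parameter (512 threads per block on compute capability 1.x devices, 1024 on 2.x and later). A 16*16 block has 256 threads, which is under either limit and a multiple of the 32-thread warp size.

```cpp
// Ceiling division: number of blocks of size `block` needed to cover `total`.
inline int divUp(int total, int block) {
    return (total + block - 1) / block;
}

// Hypothetical plain struct standing in for the (grid, block) dim3 pair.
struct LaunchShape {
    int gridX, gridY;    // blocks per grid
    int blockX, blockY;  // threads per block
};

// Picks a grid that covers a cols x rows image with blockX x blockY blocks.
inline LaunchShape makeLaunchShape(int cols, int rows,
                                   int blockX = 16, int blockY = 16) {
    LaunchShape s;
    s.gridX = divUp(cols, blockX);
    s.gridY = divUp(rows, blockY);
    s.blockX = blockX;
    s.blockY = blockY;
    return s;
}

// True if the block's total thread count respects the device limit.
inline bool fitsInBlock(const LaunchShape& s, int maxThreadsPerBlock) {
    return s.blockX * s.blockY <= maxThreadsPerBlock;
}
```

For a 640 * 480 frame this gives a 40 * 30 grid of 16*16 blocks, whereas a 40*40 block (1600 threads) fails the check for both limits.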

2012-09-21 12:03:57 -0600 received badge  Scholar (source)
2012-09-20 11:55:57 -0600 asked a question GPU Code Not Working Question

Hi,

(This is a continuation from the post "Pseudocode for custom GPU computation")

Following is my GPU OpenCV code. I tried two things: 1) just setting all pixels of the output to zero, and 2) returning a monochrome image based on a threshold on the pixel values of the input image. In both cases, when I run it I don't get the result I want; I get a grayscale image that is just peppered noise. My code is below:


//The size of the video frame is 480 * 640
//funcam.cpp

using namespace std;
using namespace cv;
using namespace cv::gpu;

void callKernel(const GpuMat& src, const GpuMat& dst)
{
   uchar* p = src.data;
   uchar* p2 = dst.data;
   func(p, p2, src.step, src.cols, src.rows, dst.step);
}

int main(int, char**) 
{
   VideoCapture cap(0);
   if(!cap.isOpened()) return -1;

   int frameH    = (int) cap.get(CV_CAP_PROP_FRAME_HEIGHT);
   int frameW    = (int) cap.get(CV_CAP_PROP_FRAME_WIDTH);
   cout << frameH << " " << frameW << endl;

   CvSize size = cvSize(frameH, frameW);
   Mat frame;
   Mat input;
   Mat output;
   GpuMat d_frame;
   GpuMat d_output;

   for(;;)
   {
      cap >> frame;
      if (frame.empty())
         break;

      //convert to grayscale
      cvtColor(frame, input, CV_BGR2GRAY);

      // memory Copy from Host to Device
      d_frame.upload(input);

      // Call CUDA kernel
      d_output.create(size, CV_8UC1);
      callKernel(d_frame, d_output);

      // memory Copy from Device to Host
      d_output.download(output);

      imshow("output", output);
      if(waitKey(30) >= 0)
         break;
   }
    return 0;
}

//funcam_cuda.cu
__global__ void funcKernel(uchar* srcptr, uchar* dstptr, int step, int cols, int rows, int dststep) 
{
   int rowInd = blockIdx.y * blockDim.y + threadIdx.y;
   int colInd = blockIdx.x * blockDim.x + threadIdx.x;
   uchar* rowsrcPtr = srcptr + rowInd*step;
   uchar* rowdstPtr = dstptr + rowInd*dststep;
   uchar pixVal = rowsrcPtr[colInd];
// rowdstPtr[colInd] = (pixVal > 60 ? 255 : 0);
   rowdstPtr[colInd] = 0;
}

extern "C"
void func(uchar* srcptr, uchar* dstptr, int step, int cols, int rows, int dststep)
{
   dim3 grDim (16, 12);
   dim3 blDim (40, 40);
   funcKernel<<< grDim, blDim >>>(srcptr, dstptr, step, cols, rows, dststep);
   cudaThreadSynchronize(); //Is this reqd?
}
2012-09-20 11:53:30 -0600 commented answer Pseudocode for custom GPU computation

Hi Vladislav, I put my code in a new post "GPU Code Not Working Question". Thanks for your help so far!

2012-09-19 18:08:39 -0600 commented answer Pseudocode for custom GPU computation

When I say "I tried setting all the values of the d_output to 0", I mean the line in the CUDA kernel:

rowdstPtr[colInd] = 0;

I just wanted to make sure what I said wasn't confusing.

2012-09-19 18:02:37 -0600 commented answer Pseudocode for custom GPU computation

I tried setting all the values of the d_output to 0, but what I get is an image with peppered noise. I also tried to change the input image to monochrome using a threshold of 100, but that gave a similarly meaningless output. I also noticed that the step sizes of the source and destination images were different, so I passed the destination step into the kernel as well and computed rowdstPtr accordingly. With that I was getting something that partially looked like what my camera captures, but the image doesn't make sense; it's mostly random. However, I set all the pixels to 0 in the CUDA code, so I wonder how the output GpuMat gets the camera image at all?

It's all kind of confusing at this point. If you could give me some more hints that would help me fix the problem and get this code running, that would be great.

2012-09-19 17:54:41 -0600 commented answer Pseudocode for custom GPU computation

__global__ void funcKernel(uchar* srcptr, uchar* dstptr, int step, int cols, int rows)
{
   int rowInd = blockIdx.y * blockDim.y + threadIdx.y;
   int colInd = blockIdx.x * blockDim.x + threadIdx.x;
   uchar* rowsrcPtr = srcptr + rowInd*step;
   uchar* rowdstPtr = dstptr + rowInd*step;
   uchar pixVal = rowsrcPtr[colInd];
// rowdstPtr[colInd] = (pixVal > 100 ? 255 : 0);
   rowdstPtr[colInd] = 0;
}

2012-09-19 17:53:13 -0600 commented answer Pseudocode for custom GPU computation

extern "C"
void func(uchar* srcptr, uchar* dstptr, int step, int cols, int rows)
{
   dim3 grDim (16, 12);
   dim3 blDim (40, 40);
   funcKernel<<< grDim, blDim >>>(srcptr, dstptr, step, cols, rows);
}

2012-09-19 17:52:13 -0600 commented answer Pseudocode for custom GPU computation

I had the wrong .h files included. I am now able to compile, but the output I get is just noise. Here are my routines:

void callKernel(const GpuMat& src, const GpuMat& dst)
{
   uchar* p = src.data;
   uchar* p2 = dst.data;
   func(p, p2, src.step, src.cols, src.rows);
}

2012-09-19 15:41:02 -0600 commented answer Pseudocode for custom GPU computation

Hi Vladislav, Thanks for the help.

I followed your instructions, and when I try to compile, I get the error:

Bharath-Pattabiramans-MacBook-Pro:funcam bharath650$ make
==== Compiling funcam.o ===
g++ -c funcam.cpp -I. -I/usr/local/cuda/include -I/Developer/GPU\ Computing/C//common/inc/ -I/usr/local/cuda/include -I/usr/local/include/opencv -I/usr/local/include
funcam.cpp: In function ‘void callKernel(const cv::gpu::GpuMat&)’:
funcam.cpp:22: error: invalid use of incomplete type ‘const struct cv::gpu::GpuMat’
/usr/local/include/opencv2/core/core.hpp:99: error: forward declaration of ‘const struct cv::gpu::GpuMat’
funcam.cpp:22: error: ‘uchar3’ was not declared in this scope
funcam.cpp:22: error: expected primary-expression before ‘)’ token
funcam.cpp:22: error: invalid use of incomplete

2012-09-17 16:50:45 -0600 received badge  Supporter (source)
2012-09-17 16:49:49 -0600 commented answer Pseudocode for custom GPU computation

Hi Vladislav, thanks for the answer. But I still don't understand how to plug in the CUDA kernel. Say I write the kernel like:

__global__ void func(uchar* srcptr, int step, int cols, int rows) { ....

}

Now, I have to call it like

func<<< GRID_DIM, BLOCK_DIM... >>>(...)

I don't understand where to call this. The call you have written seems to be

func(src.ptr<uchar3>(), src.step, src.cols, src.rows);

which does not look like a CUDA call.

Also, I am working only on a grayscale image. In that case, I believe I won't need the src.step value, and I can just do:

uchar* p = src.data;
func(p, src.cols, src.rows);

Am I right?

Thanks again for your help! Bharath
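
On whether src.step can be dropped for a grayscale image: GpuMat rows are allocated with a pitch, so step (the number of bytes from one row to the next) can be larger than cols even for single-channel 8-bit data. A small host-side sketch of the two offset computations follows; `pitchedOffset` and `packedOffset` are hypothetical helper names, and the step value of 1024 used in the note afterwards is an assumed example, not a value reported in this thread.

```cpp
#include <cstddef>

// Byte offset of element (row, col) when rows are `stepBytes` apart.
// This matches how a padded (pitched) image has to be addressed.
inline std::size_t pitchedOffset(int row, int col, std::size_t stepBytes,
                                 std::size_t elemSize = 1) {
    return static_cast<std::size_t>(row) * stepBytes
         + static_cast<std::size_t>(col) * elemSize;
}

// Byte offset assuming rows are tightly packed (no padding). This is only
// correct when step == cols * elemSize.
inline std::size_t packedOffset(int row, int col, int cols,
                                std::size_t elemSize = 1) {
    return (static_cast<std::size_t>(row) * cols
          + static_cast<std::size_t>(col)) * elemSize;
}
```

With cols = 640 and an assumed step of 1024, pixel (row 2, col 5) lives at byte 2053, while the packed formula points at byte 1285, which belongs to a different row. The two agree only when step equals cols, so passing step to the kernel is the safe choice.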

2012-09-17 15:07:21 -0600 commented question Pseudocode for custom GPU computation

I don't understand what is meant by memory data. But yes, I want to use it in imshow in the end, so that the results of my computations can be viewed in real time.

2012-09-05 04:04:20 -0600 received badge  Student (source)
2012-09-04 17:41:53 -0600 asked a question Pseudocode for custom GPU computation

Hi,

I would like to work on a video stream and do some element-by-element computation on the frames, the output of which needs to be displayed in real time. I am just starting with OpenCV and am a newbie. In this case, I guess I won't be using one of the built-in GPU functions in OpenCV, but will be writing my own CUDA kernel. In addition, the output image will be smaller than the input image due to the nature of my computations. What would the pseudocode for this look like? I have come up with this so far:

#include "cv.h"
#include "highgui.h"
#include "CUDABOF.hpp"
#include "Array.hpp"
#include "LinearMemory.hpp"
#include "MemoryUtils.hpp"

using namespace std;
using namespace cv;
using namespace CUDABOF;

int main(int, char**)
{
    VideoCapture cap(0);
    if(!cap.isOpened()) return -1;

    Mat custom_output;
    namedWindow("custom_output",1);
    for(;;)
    {
        Mat frame;
        cap >> frame;
        cvtColor(frame, custom_output, CV_BGR2GRAY);
//        memory Copy from Host to Device;
//        Call CUDA kernel;
//        memory Copy from Device to Host;

        imshow("custom_output", custom_output);
        if(waitKey(30) >= 0) break;
    }
    return 0;
}

Can someone help me with the APIs to fill in the commented places in my code? Your help is much appreciated!

-Bharath

2012-09-04 14:58:58 -0600 asked a question Video element access

Hi,

I am a newbie OpenCV user, and I wanted to ask how to access the elements of a Mat. This question has been answered before; I am pasting that answer here:

You can use template version of Mat:

Mat_<Vec3b> bgrMat;
bgrMat(y, x) = Vec3b(0,0,0);
Vec3b* row = bgrMat[y]; // pointer to mat row

However, for this I have to create a template version of Mat, and I am trying to work on a video streaming problem where I want to operate on 2D images from a video stream as below:

Mat edges;
namedWindow("edges",1);
for(;;)
{
    Mat frame;
    cap >> frame;
    ............ //work on frame element by element.

}

I guess I can't use the template version of Mat in this case. I tried using it and got compiler errors. I also tried to follow the documentation, which says I can refer to the Mat elements using the .at() function provided I know the type, which I did not. I tried to use the type() function for that; it returned the value 16, which I could not interpret. The documentation only says type() can be used as in the previous versions with IplImage or CvMat, and assumes the user is acquainted with the older versions of OpenCV, but unfortunately I am not.
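
For what it's worth, the integer returned by type() packs the depth in its low 3 bits and (channels - 1) in the bits above them. A minimal decoding sketch is below; the helper names `matDepth`, `matChannels` and `matTypeName` are hypothetical (they are not OpenCV functions), and the depth table follows the OpenCV 2.x constants (CV_8U = 0 through CV_64F = 6).

```cpp
#include <string>

// Depth code stored in the low 3 bits of a Mat type value.
inline int matDepth(int type) { return type & 7; }

// Channel count: the bits above the depth hold (channels - 1).
inline int matChannels(int type) { return (type >> 3) + 1; }

// Human-readable name such as "CV_8UC3" for a value returned by Mat::type().
inline std::string matTypeName(int type) {
    static const char* const depthNames[] = {
        "8U", "8S", "16U", "16S", "32S", "32F", "64F", "USRTYPE1"
    };
    return std::string("CV_") + depthNames[matDepth(type)]
         + "C" + std::to_string(matChannels(type));
}
```

Under this encoding, 16 decodes to depth 0 and 3 channels, i.e. CV_8UC3, so a camera frame of that type would be read with `frame.at<Vec3b>(y, x)`.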

-Bharath