bharath422's profile - activity


2012-09-28 12:30:34 -0600	commented answer	GPU Code Not Working Question Oh cool, I forgot about that. Isn't it also that the maximum number of threads per block is 512, so 40 * 40 is > 512? Also, is the frame rate of the output video the same as that of the input? Is there a way to measure this?
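On the frame-rate question in the comment above, one way to measure the output rate is to count displayed frames against a steady clock. The following is a minimal sketch, not an OpenCV facility: `FpsCounter` and its methods are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not an OpenCV class): counts frames and reports the
// average rate since construction or the last reset().
class FpsCounter {
public:
    using Clock = std::chrono::steady_clock;

    FpsCounter() : start_(Clock::now()), frames_(0) {}

    // Call once per processed frame, e.g. right after imshow().
    void tick() { ++frames_; }

    std::size_t frames() const { return frames_; }

    // Average frames per second over the elapsed wall-clock time.
    double fps() const { return rate(frames_, elapsedSeconds()); }

    // Pure arithmetic, kept separate so it can be checked without a clock.
    static double rate(std::size_t frames, double seconds) {
        assert(seconds >= 0.0);
        return seconds > 0.0 ? static_cast<double>(frames) / seconds : 0.0;
    }

    void reset() {
        start_ = Clock::now();
        frames_ = 0;
    }

private:
    double elapsedSeconds() const {
        return std::chrono::duration<double>(Clock::now() - start_).count();
    }

    Clock::time_point start_;
    std::size_t frames_;
};
```

In the capture loop from the question, `tick()` would go after `imshow(...)` and `fps()` could be printed every few dozen frames. Note that `waitKey(30)` waits about 30 ms per iteration when no key is pressed, so the loop as written cannot run much faster than roughly 33 frames per second regardless of the camera or the GPU.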
2012-09-21 12:05:55 -0600	commented answer	GPU Code Not Working Question You're a genius! Works like a charm! Thanks a ton! I'll probably have more questions, and I'm hoping you can help me out. Also, I am wondering why you say 16 * 16 is the best decomposition. Is it because each block can take only up to 512 threads?
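The arithmetic behind the block-size question can be sketched on the host side. A 40 x 40 block asks for 1600 threads, which exceeds the 512-per-block limit mentioned in the thread (the limit on compute capability 1.x devices; later devices allow 1024, which 1600 still exceeds). A 16 x 16 block uses 256 threads. The names `Dim2`, `divUp`, `blockFits` and `gridFor` below are hypothetical and only stand in for the `dim3` setup in the question; the thread limit is passed in as a parameter rather than assumed.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3, limited to two dimensions.
struct Dim2 {
    int x;
    int y;
};

// Ceiling division: number of blocks needed to cover `total` elements.
inline int divUp(int total, int block) {
    assert(total >= 0 && block > 0);
    return (total + block - 1) / block;
}

// True if the block's thread count is within the per-block limit.
inline bool blockFits(Dim2 block, int maxThreadsPerBlock) {
    return block.x * block.y <= maxThreadsPerBlock;
}

// Grid that covers a cols x rows image with the given block size.
inline Dim2 gridFor(int cols, int rows, Dim2 block) {
    return Dim2{divUp(cols, block.x), divUp(rows, block.y)};
}
```

For the 640 x 480 frame in the question, `gridFor(640, 480, Dim2{16, 16})` gives a 40 x 30 grid. When the image size is not a multiple of the block size, the grid over-covers the image and the kernel needs a guard such as `if (rowInd < rows && colInd < cols)`. A launch whose block exceeds the limit fails without running the kernel, which would be consistent with the output buffer showing leftover noise.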
2012-09-21 12:03:57 -0600	received badge	● Scholar (source)
2012-09-20 11:55:57 -0600	asked a question	GPU Code Not Working Question Hi, (This is a continuation of the post "Pseudocode for custom GPU computation".) The following is my GPU OpenCV code. I tried two things: 1) just setting all pixels of the output to zero, and 2) returning a monochrome image based on a threshold on the pixel values of the input image. In both cases, I don't get the result I want when I run it; I get a grayscale image that is just peppered noise. My code is below:

    // The size of the video frame is 480 * 640
    // funcam.cpp
    using namespace std;
    using namespace cv;
    using namespace cv::gpu;

    void callKernel(const GpuMat& src, const GpuMat& dst)
    {
        uchar* p = src.data;
        uchar* p2 = dst.data;
        func(p, p2, src.step, src.cols, src.rows, dst.step);
    }

    int main(int, char**)
    {
        VideoCapture cap(0);
        if (!cap.isOpened()) return -1;
        int frameH = (int) cap.get(CV_CAP_PROP_FRAME_HEIGHT);
        int frameW = (int) cap.get(CV_CAP_PROP_FRAME_WIDTH);
        cout << frameH << " " << frameW << endl;
        CvSize size = cvSize(frameH, frameW);
        Mat frame;
        Mat input;
        Mat output;
        GpuMat d_frame;
        GpuMat d_output;
        for (;;)
        {
            cap >> frame;
            if (frame.empty()) break;
            // convert to grayscale
            cvtColor(frame, input, CV_BGR2GRAY);
            // memory copy from host to device
            d_frame.upload(input);
            // call CUDA kernel
            d_output.create(size, CV_8UC1);
            callKernel(d_frame, d_output);
            // memory copy from device to host
            d_output.download(output);
            imshow("output", output);
            if (waitKey(30) >= 0) break;
        }
        return 0;
    }

    // funcam_cuda.cu
    __global__ void funcKernel(uchar* srcptr, uchar* dstptr, int step, int cols, int rows, int dststep)
    {
        int rowInd = blockIdx.y * blockDim.y + threadIdx.y;
        int colInd = blockIdx.x * blockDim.x + threadIdx.x;
        uchar* rowsrcPtr = srcptr + rowInd * step;
        uchar* rowdstPtr = dstptr + rowInd * dststep;
        uchar pixVal = rowsrcPtr[colInd];
        // rowdstPtr[colInd] = (pixVal > 60 ? 255 : 0);
        rowdstPtr[colInd] = 0;
    }

    extern "C" void func(uchar* srcptr, uchar* dstptr, int step, int cols, int rows, int dststep)
    {
        dim3 grDim (16, 12);
        dim3 blDim (40, 40);
        funcKernel<<< grDim, blDim >>>(srcptr, dstptr, step, cols, rows, dststep);
        cudaThreadSynchronize(); // Is this required?
    }
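When a kernel like the one in the question produces unexpected output, a CPU reference of the same per-pixel operation gives something to compare the downloaded result against. The sketch below mirrors the thresholding variant (threshold 60, as in the commented-out line) and assumes 8-bit single-channel buffers with row strides given in bytes. `thresholdPitched` is a hypothetical name, not an OpenCV or CUDA function.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char uchar;

// CPU reference for the per-pixel threshold: writes 255 where the source
// pixel is greater than `thresh`, and 0 otherwise.
// srcStep and dstStep are row strides in bytes and may differ from each
// other and from `cols`; padding bytes at the end of each row are untouched.
inline void thresholdPitched(const uchar* src, std::size_t srcStep,
                             uchar* dst, std::size_t dstStep,
                             int cols, int rows, uchar thresh) {
    assert(src != 0 && dst != 0);
    assert(cols >= 0 && rows >= 0);
    assert(srcStep >= static_cast<std::size_t>(cols));
    assert(dstStep >= static_cast<std::size_t>(cols));
    for (int r = 0; r < rows; ++r) {
        // Row start is base + row * stride, exactly as in the kernel.
        const uchar* srcRow = src + static_cast<std::size_t>(r) * srcStep;
        uchar* dstRow = dst + static_cast<std::size_t>(r) * dstStep;
        for (int c = 0; c < cols; ++c) {
            dstRow[c] = srcRow[c] > thresh ? 255 : 0;
        }
    }
}
```

Running this on the host-side `input` frame and comparing it with the downloaded `output` would show whether the kernel wrote every pixel. The loop bounds here play the role of the `rowInd < rows && colInd < cols` guard a kernel needs whenever the launch grid covers more threads than pixels.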
2012-09-20 11:53:30 -0600	commented answer	Pseudocode for custom GPU computation Hi Vladislav, I put my code in a new post "GPU Code Not Working Question". Thanks for your help so far!
2012-09-19 18:08:39 -0600	commented answer	Pseudocode for custom GPU computation When I say "I tried setting all the values of the d_output to 0", I mean the line in the CUDA kernel: rowdstPtr[colInd] = 0; I just wanted to make sure what I said wasn't confusing.
2012-09-19 18:02:37 -0600	commented answer	Pseudocode for custom GPU computation I tried setting all the values of the d_output to 0, but what I get is an image with peppered noise. I also tried to change the input image to monochrome using a threshold of 100, but that gave a similarly meaningless output. I also noticed that the step sizes of the source and destination images were different, so I passed the destination step into the kernel as well and computed rowdstPtr accordingly. With that, I was getting something that partially looks like what my camera captures, but the image doesn't make sense; it's mostly random. However, I set all the pixels to 0 in the CUDA code, so I wonder how the output GpuMat gets the camera image. It's all kind of confusing at this point. If you could give me some more hints to help me fix the problem and get this code running, that would be great.
2012-09-19 17:54:41 -0600	commented answer	Pseudocode for custom GPU computation __global__ void funcKernel(uchar* srcptr, uchar* dstptr, int step, int cols, int rows) { int rowInd = blockIdx.y * blockDim.y + threadIdx.y; int colInd = blockIdx.x * blockDim.x + threadIdx.x; uchar* rowsrcPtr = srcptr + rowInd * step; uchar* rowdstPtr = dstptr + rowInd * step; uchar pixVal = rowsrcPtr[colInd]; // rowdstPtr[colInd] = (pixVal > 100 ? 255 : 0); rowdstPtr[colInd] = 0; }
2012-09-19 17:53:13 -0600	commented answer	Pseudocode for custom GPU computation extern "C" void func(uchar* srcptr, uchar* dstptr, int step, int cols, int rows) { dim3 grDim (16, 12); dim3 blDim (40, 40); funcKernel<<< grDim, blDim >>>(srcptr, dstptr, step, cols, rows); }
2012-09-19 17:52:13 -0600	commented answer	Pseudocode for custom GPU computation I had the wrong .h files included. I am now able to compile, but the output I get is just noise. Here are my routines: void callKernel(const GpuMat& src, const GpuMat& dst) { uchar* p = src.data; uchar* p2 = dst.data; func(p, p2, src.step, src.cols, src.rows); }
2012-09-19 15:41:02 -0600	commented answer	Pseudocode for custom GPU computation Hi Vladislav, Thanks for the help. I followed your instructions, and when I try to compile, I get the error: Bharath-Pattabiramans-MacBook-Pro:funcam bharath650$ make ==== Compiling funcam.o === g++ -c funcam.cpp -I. -I/usr/local/cuda/include -I/Developer/GPU\ Computing/C//common/inc/ -I/usr/local/cuda/include -I/usr/local/include/opencv -I/usr/local/include funcam.cpp: In function ‘void callKernel(const cv::gpu::GpuMat&)’: funcam.cpp:22: error: invalid use of incomplete type ‘const struct cv::gpu::GpuMat’ /usr/local/include/opencv2/core/core.hpp:99: error: forward declaration of ‘const struct cv::gpu::GpuMat’ funcam.cpp:22: error: ‘uchar3’ was not declared in this scope funcam.cpp:22: error: expected primary-expression before ‘)’ token funcam.cpp:22: error: invalid use of incomplete
2012-09-17 16:50:45 -0600	received badge	● Supporter (source)
2012-09-17 16:49:49 -0600	commented answer	Pseudocode for custom GPU computation Hi Vladislav, Thanks for the answer. But I still don't understand how to plug in the CUDA kernel. Say I write the kernel like: __global__ void func(uchar* srcptr, int step, int cols, int rows) { .... } Now, I have to call it like func<<< GRID_DIM, BLOCK_DIM... >>>(...) I don't understand where to call this. The call you have written seems to be func(src.ptr<uchar3>(), src.step, src.cols, src.rows); which does not look like a CUDA call. Also, I am working only on a grayscale image. In that case, I believe I won't need the src.step value, and I can just do: uchar* p = src.data; func(p, src.cols, src.rows); Am I right? Thanks again for your help! Bharath
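On whether the step can be dropped for a grayscale image: GpuMat rows can be padded for alignment, so the step (bytes between the starts of consecutive rows) can be larger than the number of columns even at one byte per pixel. The sketch below shows how the byte offset of a pixel depends on the step. `pixelOffset` is a hypothetical helper, and the step value 1024 is only illustrative, not a value reported in the thread.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of pixel (row, col) in an 8-bit single-channel image whose
// rows start `step` bytes apart. `step` may be larger than the image width.
inline std::size_t pixelOffset(int row, int col, std::size_t step) {
    assert(row >= 0 && col >= 0);
    assert(static_cast<std::size_t>(col) < step);
    return static_cast<std::size_t>(row) * step + static_cast<std::size_t>(col);
}
```

For a 640-column image, pixel (2, 5) sits at offset 1285 if rows are packed tightly (step 640) but at 2053 if rows are padded to an illustrative 1024 bytes. Indexing with `cols` in place of the real step therefore reads from the wrong rows as soon as any padding is present.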
2012-09-17 15:07:21 -0600	commented question	Pseudocode for custom GPU computation I don't understand what is meant by memory data. But yes, I want to use it in imshow in the end, so that the results of my computations can be viewed in real time.
2012-09-05 04:04:20 -0600	received badge	● Student (source)
2012-09-04 17:41:53 -0600	asked a question	Pseudocode for custom GPU computation Hi, I would like to work on a video stream and do some element-by-element computation on the frames, the output of which needs to be displayed in real time. I am just starting with OpenCV and am a newbie. In this case, I guess I won't be using one of the built-in GPU functions in OpenCV, but will be writing my own CUDA kernel. In addition, the output image will be smaller than the input image due to the nature of my computations. What would the pseudocode for this look like? I have come up with this so far:

    #include "cv.h"
    #include "highgui.h"
    #include "CUDABOF.hpp"
    #include "Array.hpp"
    #include "LinearMemory.hpp"
    #include "MemoryUtils.hpp"

    using namespace std;
    using namespace cv;
    using namespace CUDABOF;

    int main(int, char**)
    {
        VideoCapture cap(0);
        if (!cap.isOpened()) return -1;
        Mat custom_output;
        namedWindow("custom_output", 1);
        for (;;)
        {
            Mat frame;
            cap >> frame;
            cvtColor(frame, custom_output, CV_BGR2GRAY);
            // memory copy from host to device;
            // call CUDA kernel;
            // memory copy from device to host;
            imshow("custom_output", custom_output);
            if (waitKey(30) >= 0) break;
        }
        return 0;
    }

Can someone help me with the APIs to fill in the commented places in my code? Your help is much appreciated! -Bharath
2012-09-04 14:58:58 -0600	asked a question	Video element access Hi, I am a newbie OpenCV user, and I wanted to ask how to access the elements of a Mat. This question has been answered before; I am pasting that answer here: You can use the template version of Mat:

    Mat_<Vec3b> bgrMat;
    bgrMat(y, x) = Vec3b(0,0,0);
    Vec3b* row = bgrMat[y]; // pointer to mat row

However, for this I have to create a template version of Mat. I am trying to work on a video streaming problem where I want to operate on 2D images from a video stream, as below:

    Mat edges;
    namedWindow("edges",1);
    for(;;)
    {
        Mat frame;
        cap >> frame;
        ............ //work on frame element by element.
    }

I guess I can't use the template version of Mat in this case. I tried using it and got compiler errors. I also tried to follow the documentation, which says I can refer to the Mat elements using the .at() function provided I know the type, which I did not. I tried to use the type() function for that; it returned the value 16, which I could not interpret. The documentation only says type() can be used as in the previous versions with IplImage or CvMat, and assumes the user is acquainted with older versions of OpenCV, but unfortunately I am not. -Bharath
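The value 16 returned by type() can be decoded by hand. OpenCV packs the element depth into the low three bits of the type code and the channel count minus one into the bits above them. The sketch below reimplements that decoding in plain C++; `matDepth`, `matChannels` and `matTypeName` are hypothetical helpers that mirror what OpenCV's `CV_MAT_DEPTH` and `CV_MAT_CN` macros compute, and depth code 7 is printed as "?" because its meaning depends on the OpenCV version.

```cpp
#include <cassert>
#include <string>

// Depth code: low three bits of the type (0 = 8U, 1 = 8S, 2 = 16U, ...).
inline int matDepth(int type) { return type & 7; }

// Channel count: stored as (channels - 1) above the three depth bits.
inline int matChannels(int type) { return (type >> 3) + 1; }

// Human-readable name such as "CV_8UC3" for a value returned by type().
inline std::string matTypeName(int type) {
    static const char* const depthNames[] = {
        "8U", "8S", "16U", "16S", "32S", "32F", "64F", "?"};
    assert(type >= 0);
    return std::string("CV_") + depthNames[matDepth(type)] + "C" +
           std::to_string(matChannels(type));
}
```

For 16 this gives depth 0 (8-bit unsigned) and 3 channels, that is CV_8UC3, the usual type of a colour frame from VideoCapture. The matching element type is Vec3b, so a pixel can be read with `frame.at<Vec3b>(y, x)`.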