2012-09-28 12:30:34 -0600 | commented answer | GPU Code Not Working Question Oh cool, I forgot about that. Isn't it also that the maximum number of threads per block is 512, so 40*40 = 1600 exceeds the limit? Also, is the frame rate of the output video the same as that of the input? Is there a way to measure this? |
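The block-size arithmetic in this comment can be checked on the host before launching anything. The following is a minimal sketch, not the answerer's code: `kMaxThreadsPerBlock`, `divUp`, `blockFits`, `LaunchDims` and `makeLaunchDims` are hypothetical names, and the limit of 512 is the figure quoted in the thread (the real value depends on the device and is reported in `cudaDeviceProp::maxThreadsPerBlock`).

```cpp
#include <cassert>

// Assumption: 512 is the per-block limit quoted in the thread. The actual
// limit depends on the GPU (see cudaDeviceProp::maxThreadsPerBlock).
constexpr int kMaxThreadsPerBlock = 512;

// Hypothetical container for a 2D launch configuration.
struct LaunchDims {
    int blockX, blockY;  // threads per block
    int gridX, gridY;    // blocks per grid
};

// Integer ceiling division: blocks needed to cover `total` items.
constexpr int divUp(int total, int perBlock) {
    return (total + perBlock - 1) / perBlock;
}

// True when a blockX x blockY block stays within the per-block limit.
constexpr bool blockFits(int blockX, int blockY) {
    return blockX > 0 && blockY > 0 &&
           blockX * blockY <= kMaxThreadsPerBlock;
}

// Grid that covers a cols x rows image with the given block shape.
inline LaunchDims makeLaunchDims(int cols, int rows,
                                 int blockX = 16, int blockY = 16) {
    assert(blockFits(blockX, blockY));
    return LaunchDims{blockX, blockY,
                      divUp(cols, blockX), divUp(rows, blockY)};
}
```

With these definitions, `blockFits(40, 40)` is false (1600 threads), while a 16x16 block (256 threads) fits and covers a 640x480 frame with a 40x30 grid. When the image size is not a multiple of the block size, the grid overshoots the image, so the kernel would also need a bounds check on the row and column indices.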
2012-09-21 12:05:55 -0600 | commented answer | GPU Code Not Working Question You're a genius! Works like a charm! Thanks a ton! I'll probably have more questions, and I'm hoping you can help me out. Also, I am wondering why you say 16*16 is the best decomposition. Is it because each block can take only up to 512 threads? |
2012-09-21 12:03:57 -0600 | received badge | ● Scholar (source) |
2012-09-20 11:55:57 -0600 | asked a question | GPU Code Not Working Question Hi, (This is a continuation of the post "Pseudocode for custom GPU computation".) The following is my GPU OpenCV code. I tried two things: 1) just setting all pixels of the output to zero, and 2) returning a monochrome image based on a threshold on the pixel values of the input image. In both cases, when I run it I don't get the result I want; I get a grayscale image that is just peppered noise. My code is below: |
2012-09-20 11:53:30 -0600 | commented answer | Pseudocode for custom GPU computation Hi Vladislav, I put my code in a new post "GPU Code Not Working Question". Thanks for your help so far! |
2012-09-19 18:08:39 -0600 | commented answer | Pseudocode for custom GPU computation When I say "I tried setting all the values of the d_output to 0", I mean the line in the CUDA kernel: rowdstPtr[colInd] = 0; I just wanted to make sure what I said wasn't confusing. |
2012-09-19 18:02:37 -0600 | commented answer | Pseudocode for custom GPU computation I tried setting all the values of the d_output to 0, but what I get is an image with peppered noise. I also tried to convert the input image to monochrome using a threshold of 100, but that gave a similarly meaningless output. I also noticed that the step sizes of the source and destination images were different, so I passed the destination step into the kernel as well and computed rowdstPtr accordingly. With that I was getting something that partially looked like what my camera captures, but the image doesn't make sense; it's mostly random. However, I set all the pixels to 0 in the CUDA code, so I wonder how the output GpuMat ends up with the camera image. It's all kind of confusing at this point. If you could give me some more hints that would help me fix the problem and get this code running, that would be great. |
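Since the comment above notes that the source and destination step sizes can differ, each buffer needs its own row stride when a row pointer is computed. Below is a CPU-side reference sketch of the same threshold, assuming single-channel 8-bit data; `thresholdPitched` and its parameter names are hypothetical and only illustrate the addressing, not the GPU code from the thread.

```cpp
#include <cstddef>

typedef unsigned char uchar;

// CPU reference for the per-pixel threshold on pitched (padded) rows.
// srcStep and dstStep are row strides in bytes and may differ, so each
// row pointer is computed with the stride of its own buffer.
void thresholdPitched(const uchar* src, std::size_t srcStep,
                      uchar* dst, std::size_t dstStep,
                      int cols, int rows, uchar thresh) {
    for (int r = 0; r < rows; ++r) {
        const uchar* srcRow = src + r * srcStep;  // start of source row r
        uchar* dstRow = dst + r * dstStep;        // start of destination row r
        for (int c = 0; c < cols; ++c) {
            // Only the first `cols` bytes of each row are pixels;
            // the padding up to the step is left untouched.
            dstRow[c] = srcRow[c] > thresh ? 255 : 0;
        }
    }
}
```

A GPU kernel would replace the two loops with one thread per pixel, but the row-pointer arithmetic stays the same: using the source step to index the destination (or the reverse) shifts every row after the first and produces scrambled output.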
2012-09-19 17:54:41 -0600 | commented answer | Pseudocode for custom GPU computation __global__ void funcKernel(uchar* srcptr, uchar* dstptr, int step, int cols, int rows) { int rowInd = blockIdx.y * blockDim.y + threadIdx.y; int colInd = blockIdx.x * blockDim.x + threadIdx.x; uchar* rowsrcPtr = srcptr + rowInd*step; uchar* rowdstPtr = dstptr + rowInd*step; uchar pixVal = rowsrcPtr[colInd]; // rowdstPtr[colInd] = (pixVal > 100 ? 255 : 0); rowdstPtr[colInd] = 0; } |
2012-09-19 17:53:13 -0600 | commented answer | Pseudocode for custom GPU computation extern "C" void func(uchar* srcptr, uchar* dstptr, int step, int cols, int rows) { dim3 grDim (16, 12); dim3 blDim (40, 40); funcKernel<<< grDim, blDim >>>(srcptr, dstptr, step, cols, rows); } |
2012-09-19 17:52:13 -0600 | commented answer | Pseudocode for custom GPU computation I had the wrong .h files included. I am now able to compile, but the output I get is just noise. Here are my routines: void callKernel(const GpuMat& src, const GpuMat& dst) { uchar* p = src.data; uchar* p2 = dst.data; func(p, p2, src.step, src.cols, src.rows); } |
2012-09-19 15:41:02 -0600 | commented answer | Pseudocode for custom GPU computation Hi Vladislav, Thanks for the help. I followed your instructions, and when I try to compile, I get the error: Bharath-Pattabiramans-MacBook-Pro:funcam bharath650$ make
==== Compiling funcam.o ===
g++ -c funcam.cpp -I. -I/usr/local/cuda/include -I/Developer/GPU\ Computing/C//common/inc/ -I/usr/local/cuda/include -I/usr/local/include/opencv -I/usr/local/include |
2012-09-17 16:50:45 -0600 | received badge | ● Supporter (source) |
2012-09-17 16:49:49 -0600 | commented answer | Pseudocode for custom GPU computation Hi Vladislav, Thanks for the answer. But I still don't understand how to plug in the CUDA kernel. Say I write the kernel like: __global__ void func(uchar* srcptr, int step, int cols, int rows) { .... } Now, I have to call it like func<<< GRID_DIM, BLOCK_DIM... >>>(...) I don't understand where to call this. The call you have written seems to be func(src.ptr<uchar3>(), src.step, src.cols, src.rows); which does not look like a CUDA call. Also, I am working only on a grayscale image. In that case, I believe I won't need the src.step value, and I can do: uchar* p = src.data; func(p, src.cols, src.rows); Am I right? Thanks again for your help! Bharath |
2012-09-17 15:07:21 -0600 | commented question | Pseudocode for custom GPU computation I don't understand what is meant by memory data. But yes, I want to use it in imshow in the end, so that the results of my computations can be viewed in real time. |
2012-09-05 04:04:20 -0600 | received badge | ● Student (source) |
2012-09-04 17:41:53 -0600 | asked a question | Pseudocode for custom GPU computation Hi, I would like to work on a video stream and do some element-by-element computation on the frames, the output of which needs to be displayed in real time. I am just starting with OpenCV and am a newbie. In this case, I guess I won't be using one of the built-in GPU functions in OpenCV, but will be writing my own CUDA kernel. In addition, the output image will be smaller than the input image due to the nature of my computations. Can I know what the pseudocode for this would be? I have come up with this so far: Can someone help me with the APIs to fill in the commented places in my code? Your help is much appreciated! -Bharath |
2012-09-04 14:58:58 -0600 | asked a question | Video element access Hi, I am a newbie OpenCV user, and I wanted to ask how to access the elements of a Mat. This question has been answered before, and I am pasting that answer here: You can use template version of Mat: Mat_<Vec3b> bgrMat; bgrMat(y, x) = Vec3b(0,0,0); Vec3b* row = bgrMat[y]; // pointer to mat row However, for this I have to create a template version of Mat, and I am trying to work on a video streaming problem where I want to operate on 2D images from a video stream as below: I guess I can't use the template version of Mat in this case. I tried using it and got compiler errors. I also tried to follow the documentation, which says I can refer to the Mat elements using the .at() function provided I know the type, which I did not know. I tried to use the type() function for that, which returned the value 16, which I was not able to interpret. The documentation only says type() can be used as in the previous versions with IplImage or CvMat, and assumes the user is acquainted with the older versions of OpenCV, but unfortunately I am not. -Bharath |
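The value 16 returned by type() can be decoded by hand. The sketch below assumes OpenCV's usual packing of the type code (depth in the low 3 bits, channel count minus one in the bits above) and deliberately avoids including OpenCV; `matDepth`, `matChannels` and `matTypeName` are hypothetical helper names, not library functions.

```cpp
#include <string>

// Depth code: the low 3 bits of the value returned by Mat::type().
int matDepth(int type) { return type & 7; }

// Channel count: stored as (channels - 1) in the bits above the depth.
int matChannels(int type) { return ((type >> 3) & 511) + 1; }

// Human-readable name in the CV_<depth>C<channels> style.
std::string matTypeName(int type) {
    static const char* const depthNames[] = {
        "CV_8U", "CV_8S", "CV_16U", "CV_16S",
        "CV_32S", "CV_32F", "CV_64F", "CV_USRTYPE1"};
    return std::string(depthNames[matDepth(type)]) + "C" +
           std::to_string(matChannels(type));
}
```

Under that assumption, `matTypeName(16)` gives "CV_8UC3": 8-bit unsigned data with 3 channels, which is what a colour frame from a camera normally is. The matching element type is Vec3b, so a pixel would be read as `frame.at<cv::Vec3b>(y, x)`.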