2013-10-10 14:03:19 -0600 | answered a question | Runtime error building opencl kernel Replacing the GeForce 8600 GT with a Quadro 600 board fixed the issue. I am quite new to OpenCL, but it turned out that the GeForce 8600 GT does not support the DOUBLE_SUPPORT hardware extension or the atomic operations that some of the OpenCL kernels used in the face detect sample rely on. Hope this helps others! |
2013-10-10 13:59:58 -0600 | answered a question | Runtime error CL_INVALID_BINARY Replacing the GeForce 8600 GT with a Quadro 600 board fixed the issue. I am quite new to OpenCL, but it turned out that the GeForce 8600 GT does not support the DOUBLE_SUPPORT hardware extension or the atomic operations that some of the OpenCL kernels used in the face detect sample rely on. Hope this helps others! |
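For readers hitting the same errors, one way to check this diagnosis is to compare the device's extension string (the value `clGetDeviceInfo` returns for `CL_DEVICE_EXTENSIONS`, which tools such as `clinfo` also print) against the extensions the kernels need. The sketch below is a minimal, device-free illustration in Python. The helper name `missing_extensions` and the sample extension string are hypothetical, and the assumption that the sample needs `cl_khr_fp64` and `cl_khr_local_int32_base_atomics` is inferred from the answers above rather than taken from the OpenCV sources.

```python
# Extensions assumed (not confirmed against the OpenCV sources) to be needed:
# double precision and 32-bit atomics on __local memory.
REQUIRED = ("cl_khr_fp64", "cl_khr_local_int32_base_atomics")


def missing_extensions(extension_string, required=REQUIRED):
    """Return the required extension names absent from a device's extension string.

    extension_string is the space-separated list a device reports for
    CL_DEVICE_EXTENSIONS.
    """
    available = set(extension_string.split())
    return [name for name in required if name not in available]


# Hypothetical extension string, for illustration only; paste in the real
# string reported by your own device.
sample = "cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics"
print(missing_extensions(sample))
```

An empty list means every assumed requirement is advertised by the device; anything else names the extensions worth investigating first.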
2013-10-08 11:16:04 -0600 | asked a question | Runtime error CL_INVALID_BINARY I am running the face detect sample with OpenCL acceleration and seeing a runtime error when it tries to build the integral cols kernel for an NVIDIA GeForce 8600 GT graphics card.
    Building source:./integral_cols_D4_GeForce 8600 GT .clb
    clCreateProgramWithSource status=0
    source=
    all_build_options=
    clBuildProgram status=-42
    OpenCV Error: Gpu API call (CL_INVALID_BINARY) in openCLGetKernelFromSource, file /home/ovuser/OpenCV/opencv-2.4.6.1/modules/ocl/src/initialization.cpp, line 668
    terminate called after throwing an instance of 'cv::Exception'
    what(): /home/ovuser/OpenCV/opencv-2.4.6.1/modules/ocl/src/initialization.cpp:668: error: (-217) CL_INVALID_BINARY in function openCLGetKernelFromSource
Any pointers on how to debug this issue? |
2013-10-08 09:01:51 -0600 | asked a question | Runtime error building opencl kernel On Ubuntu 12.04 LTS with an NVIDIA GeForce 8 series GPU, I get a runtime error from the ocl-facedetect sample of OpenCV 2.4.6.1 while it builds a kernel required by the sample:
    $./ocl-example-facedetect -t haarcascade_frontalface_alt.xml -i friends.jpg
    In image read loop0
    ~~~~ Loading convertC3C4
    Building source:./convertC3C4_GeForce 8600 GT -D GENTYPE4=uchar4.clb
    ~~~~ Loading RGB2Gray
    Building source:./RGB2Gray_GeForce 8600 GT -D DEPTH_0.clb
    ~~~~ Loading resizeLN_C1_D0
    Building source:./resizeLN_C1_D0_GeForce 8600 GT .clb
    ~~~~ Loading set_to_without_mask
    Building source:./set_to_without_mask_GeForce 8600 GT -D GENTYPE=int.clb
    ~~~~ Loading calc_sub_hist_D0
    Building source:./calc_sub_hist_D0_GeForce 8600 GT .clb
    clCreateProgramWithSource status=0
    clBuildProgram status=-11
    BUILD LOG (0)
    Error: unsupported operation
    OpenCV Error: Gpu API call (CL_BUILD_PROGRAM_FAILURE) in openCLGetKernelFromSource, file /home/ovuser/OpenCV/opencv-2.4.6.1/modules/ocl/src/initialization.cpp, line 664
    terminate called after throwing an instance of 'cv::Exception'
    what(): /home/ovuser/OpenCV/opencv-2.4.6.1/modules/ocl/src/initialization.cpp:664: error: (-217) CL_BUILD_PROGRAM_FAILURE in function openCLGetKernelFromSource
It seems the OpenCL compiler is unable to build the calc_sub_hist_D0 kernel required by the face detect sample. I have added some verbose messages and attached the kernel source causing the problem. Can someone point out what's wrong with the kernel source? |
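The two logs above report raw `clBuildProgram` status values (-42 and -11) alongside the names OpenCV printed for them. As a reading aid, here is a small lookup sketch in Python. The numeric values are the standard ones from the OpenCL `cl.h` header; the table is deliberately partial, and `status_name` is a hypothetical helper, not part of OpenCV or any OpenCL binding.

```python
# Partial table of OpenCL status codes (values as defined in cl.h).
CL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -42: "CL_INVALID_BINARY",
    -43: "CL_INVALID_BUILD_OPTIONS",
}


def status_name(code):
    """Translate a numeric OpenCL status into its symbolic name, if known."""
    return CL_STATUS_NAMES.get(code, "unknown status %d" % code)


# The two failing statuses seen in the logs above.
for code in (-11, -42):
    print(code, status_name(code))
```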
2013-10-07 13:38:00 -0600 | received badge | ● Editor (source) |
2013-10-07 13:35:46 -0600 | asked a question | ocl-facedetect sample - CL_BUILD_PROGRAM_FAILURE in function openCLGetKernelFromSource building kernel for calc_sub_hist_D0 On Ubuntu 12.04 LTS with an NVIDIA GeForce 8 series GPU, I am trying to run the ocl-facedetect sample of OpenCV 2.4.6.1 and seeing the following error:
    clBuildProgram status=-11
    BUILD LOG (0)
    Error: unsupported operation
    OpenCV Error: Gpu API call (CL_BUILD_PROGRAM_FAILURE) in openCLGetKernelFromSource, file /home/ovuser/OpenCV/opencv-2.4.6.1/modules/ocl/src/initialization.cpp, line 664
    terminate called after throwing an instance of 'cv::Exception'
    what(): /home/ovuser/OpenCV/opencv-2.4.6.1/modules/ocl/src/initialization.cpp:664: error: (-217) CL_BUILD_PROGRAM_FAILURE in function openCLGetKernelFromSource
It seems the OpenCL compiler is unable to build one of the kernels required by the face detect sample. I have added some verbose messages, and here is the kernel source causing the problem. Can someone point out what's wrong with the kernel source? The source is unreadable here, so I have attached a JPEG too.
./ocl-example-facedetect -t haarcascade_frontalface_alt.xml -i friends.jpg
---- verbose dump ---
C:\fakepath\ocl-example-facedetect.jpg
~~~~ Loading calc_sub_hist_D0
Building source:./calc_sub_hist_D0_GeForce 8600 GT .clb
clCreateProgramWithSource status=0
source=
    #define PARTIAL_HISTOGRAM256_COUNT (256)
    #define HISTOGRAM256_BIN_COUNT (256)
    #define HISTOGRAM256_WORK_GROUP_SIZE (256)
    #define HISTOGRAM256_LOCAL_MEM_SIZE (HISTOGRAM256_BIN_COUNT)
    #define NBANKS (16)
    #define NBANKS_BIT (4)

    __kernel __attribute__((reqd_work_group_size(HISTOGRAM256_BIN_COUNT,1,1)))
    void calc_sub_hist_D0(__global const uint4* src, int src_step, int src_offset,
                          __global int* globalHist, int dataCount, int cols,
                          int inc_x, int inc_y, int hist_step)
    {
        __local int subhist[(HISTOGRAM256_BIN_COUNT << NBANKS_BIT)];
        int gid = get_global_id(0);
        int lid = get_local_id(0);
        int gx = get_group_id(0);
        int gsize = get_global_size(0);
        int lsize = get_local_size(0);
        const int shift = 8;
        const int mask = HISTOGRAM256_BIN_COUNT-1;
        int offset = (lid & (NBANKS-1));
        uint4 data, temp1, temp2, temp3, temp4;
        src += src_offset;

        for(int i=0, idx=lid; i<(NBANKS >> 2); i++, idx += lsize)
        {
            subhist[idx] = 0;
            subhist[idx+=lsize] = 0;
            subhist[idx+=lsize] = 0;
            subhist[idx+=lsize] = 0;
        }
        barrier(CLK_LOCAL_MEM_FENCE);

        int y = gid/cols;
        int x = gid - mul24(y, cols);
        for(int idx=gid; idx<dataCount; idx+=gsize)
        {
            data = src[mad24(y, src_step, x)];
            temp1 = ((data & mask) << NBANKS_BIT) + offset;
            data >>= shift;
            temp2 = ((data & mask) << NBANKS_BIT) + offset;
            data >>= shift;
            temp3 = ((data & mask) << NBANKS_BIT) + offset;
            data >>= shift;
            temp4 = ((data & mask) << NBANKS_BIT) + offset;
            atomic_inc(subhist + temp1.x);
            atomic_inc(subhist + temp1.y);
            atomic_inc(subhist + temp1.z);
            atomic_inc(subhist + temp1.w);
            atomic_inc(subhist + temp2.x);
            atomic_inc(subhist + temp2.y);
            atomic_inc(subhist + temp2.z);
            atomic_inc(subhist + temp2.w);
            atomic_inc(subhist + temp3.x);
            atomic_inc(subhist + temp3.y);
            atomic_inc(subhist + temp3.z);
            atomic_inc(subhist + temp3.w);
            atomic_inc(subhist + temp4.x);
            atomic_inc(subhist + temp4.y);
            atomic_inc(subhist + temp4.z);
            atomic_inc(subhist + temp4.w);
            x += inc_x;
            int off = ((x>=cols) ? -1 : 0);
            x = mad24(off, cols, x);
            y += inc_y - off;
        }
        barrier(CLK_LOCAL_MEM_FENCE);

        int bin1=0, bin2=0, bin3=0, bin4=0;
        for(int i=0; i<NBANKS; i+=4)
        {
            bin1 += subhist[(lid << NBANKS_BIT) + i];
            bin2 += subhist[(lid << NBANKS_BIT) + i+1];
            bin3 += subhist[(lid << NBANKS_BIT) + i+2];
            bin4 += subhist[(lid << NBANKS_BIT) + i+3];
        }
        globalHist[mad24(gx, hist_step, lid)] = bin1+bin2+bin3+bin4;
    }
    __kernel void __attribute__((reqd_work_group_size ... (more) |
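To make the kernel's bookkeeping easier to follow, here is a rough CPU-side sketch in Python of what `calc_sub_hist_D0` appears to compute: each work-item picks one of 16 banks from its local id, every 8-bit value in a packed word increments `subhist[(value << NBANKS_BIT) + bank]`, and the final bin is the sum over its 16 banks. This is a simplified illustration, not the OpenCV implementation: it treats the input as a flat list of 32-bit words (the kernel reads `uint4` vectors), it ignores the row stepping done with `inc_x` and `inc_y`, and the function name `banked_histogram` is hypothetical. The in-place `+= 1` stands in for the `atomic_inc` calls on `__local` memory, which are the operations the build log appears to reject on the GeForce 8600 GT.

```python
NBANKS = 16        # mirrors the kernel's NBANKS
NBANKS_BIT = 4     # log2(NBANKS)
BIN_COUNT = 256    # mirrors HISTOGRAM256_BIN_COUNT


def banked_histogram(words, local_size=256):
    """Histogram the four 8-bit values packed in each 32-bit word.

    Sequential stand-in for one work-group: word i is handled by the
    work-item with local id i % local_size.
    """
    subhist = [0] * (BIN_COUNT << NBANKS_BIT)
    for i, word in enumerate(words):
        lid = i % local_size
        bank = lid & (NBANKS - 1)          # "offset" in the kernel
        for _ in range(4):                 # four bytes per 32-bit word
            value = word & (BIN_COUNT - 1)
            subhist[(value << NBANKS_BIT) + bank] += 1  # atomic_inc on the GPU
            word >>= 8
    # Reduction step: each bin sums its NBANKS partial counters.
    return [
        sum(subhist[(b << NBANKS_BIT) + k] for k in range(NBANKS))
        for b in range(BIN_COUNT)
    ]


# Made-up input: two words packing the bytes 4,3,2,1 and 4,4,4,4.
hist = banked_histogram([0x01020304, 0x04040404])
print({value: count for value, count in enumerate(hist) if count})
```

The banking spreads increments for the same pixel value across 16 counters so that neighbouring work-items rarely contend for the same address, at the cost of the final reduction loop.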