Why is self-compiled OpenCV slower than the apt-get package?
Hi community.
I have two OpenCV installations: one is a self-compiled OpenCV 3.1 with CUDA, and the other was installed using sudo apt-get install ros-kinetic-opencv3.
When I run the same program against each installation, my self-compiled build is much slower than the one from ROS.
Example: template matching with ORB
My OpenCV 3.1: 150-200 ms per frame
ROS OpenCV 3.2.0-dev: 40-50 ms per frame
This was tested in Python, but my C++ code shows a similar result.
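For comparing the two installations fairly, a small timing harness along these lines may help. It is only a sketch: time_per_frame, summarize, process and frames are hypothetical names, and process stands in for whatever per-frame function (for example the ORB matching step) is being measured.

```python
import statistics
import time


def time_per_frame(process, frames, warmup=5):
    """Return one timing in milliseconds for each call to process(frame)."""
    frames = list(frames)
    # Untimed warm-up calls, so one-off start-up costs do not skew the numbers.
    for frame in frames[:warmup]:
        process(frame)
    timings_ms = []
    for frame in frames:
        start = time.perf_counter()
        process(frame)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


def summarize(timings_ms):
    """Reduce the per-frame timings to min, median and max."""
    return {
        "min": min(timings_ms),
        "median": statistics.median(timings_ms),
        "max": max(timings_ms),
    }
```

Running the same script once under each installation and comparing the medians gives figures that are less sensitive to a single slow frame.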
Here is the getBuildInformation() result for each version.
1. Self-compiled OpenCV
General configuration for OpenCV 3.1.0 =====================================
Version control: 3.1.0-3-g50b7dfd-dirty
Platform:
Host: Linux 3.10.96-tegra aarch64
CMake: 3.5.1
CMake generator: Unix Makefiles
CMake build tool: /usr/bin/make
Configuration: RelWithDebugInfo
C/C++:
Built as dynamic libs?: YES
C++ Compiler: /usr/bin/c++ (ver 5.4.0)
C++ flags (Release): -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wno-narrowing -Wno-delete-non-virtual-dtor -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG
C++ flags (Debug): -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wno-narrowing -Wno-delete-non-virtual-dtor -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -g -O0 -DDEBUG -D_DEBUG
C Compiler: /usr/bin/cc
C flags (Release): -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wno-narrowing -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG
C flags (Debug): -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wno-narrowing -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -g -O0 -DDEBUG -D_DEBUG
Linker flags (Release):
Linker flags (Debug):
Precompiled headers: NO
Extra dependencies: Qt5::Test Qt5::Concurrent Qt5::OpenGL /usr/lib/aarch64-linux-gnu/libwebp.so /usr/lib/aarch64-linux-gnu/libpng.so /usr/lib/aarch64-linux-gnu/libtiff.so /usr/lib/aarch64-linux-gnu/libjasper.so /usr/lib/aarch64-linux-gnu/libjpeg.so v4l1 v4l2 avcodec-ffmpeg avformat-ffmpeg avutil-ffmpeg swscale-ffmpeg /usr/lib/aarch64-linux-gnu/libbz2.so Qt5::Core Qt5::Gui Qt5::Widgets /usr/lib/aarch64-linux-gnu/hdf5/serial/lib/libhdf5.so /usr/lib/aarch64-linux-gnu/libpthread.so /usr/lib/aarch64-linux-gnu/libsz.so /usr/lib/aarch64-linux-gnu/libz.so /usr/lib/aarch64-linux-gnu/libdl.so /usr/lib/aarch64-linux-gnu/libm.so correspondence multiview numeric glog gflags dl m pthread rt /usr/lib/aarch64-linux-gnu/libGLU.so /usr/lib/aarch64-linux-gnu/libGL.so tbb atomic cudart nppc nppi npps cufft -L/usr/local/cuda-8.0/lib64
3rdparty dependencies:
OpenCV modules:
To be built: cudev core cudaarithm flann hdf imgproc ml reg surface_matching video cudabgsegm cudafilters cudaimgproc cudawarping dnn fuzzy imgcodecs photo shape videoio cudacodec highgui objdetect plot ts xobjdetect xphoto bgsegm bioinspired dpm face features2d line_descriptor saliency text calib3d ccalib cudafeatures2d cudalegacy cudaobjdetect cudaoptflow cudastereo cvv datasets rgbd stereo structured_light superres tracking videostab xfeatures2d ximgproc aruco optflow sfm stitching python2
Disabled: world contrib_world
Disabled by dependency: -
Unavailable: java python3 viz matlab
GUI:
QT 5.x: YES (ver 5.5.1)
QT OpenGL ...
What flags did you compile it with? Did you make sure to include all the optimizations your processor can use?
The answer should be found by calling getBuildInformation() (or the Python equivalent) for both builds and comparing the output.
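One way to do that comparison, assuming the text returned by cv2.getBuildInformation() has been saved from each installation, is to diff the two dumps and keep only the lines that differ. This is a sketch; diff_build_info and the labels are hypothetical names, not part of OpenCV.

```python
import difflib


def diff_build_info(info_a, info_b, name_a="self-compiled", name_b="ros-kinetic"):
    """Return only the lines that differ between two build-information dumps."""
    # Strip indentation and drop blank lines so layout differences are ignored.
    lines_a = [line.strip() for line in info_a.splitlines() if line.strip()]
    lines_b = [line.strip() for line in info_b.splitlines() if line.strip()]
    diff = list(difflib.unified_diff(
        lines_a, lines_b, fromfile=name_a, tofile=name_b, n=0, lineterm=""))
    # Skip the two file-header lines and the @@ hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]
```

Lines prefixed with - come from the first dump and lines prefixed with + from the second, so differences in build configuration, compiler flags or enabled features show up side by side.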
Thank you guys. I used the following flags:
cmake .. \
  -DWITH_OPENGL:BOOL=ON \
  -DWITH_QT:BOOL=ON \
  -DWITH_CUDA:BOOL=ON \
  -DCUDA_ARCH_BIN="5.2" \
  -DCUDA_ARCH_PTX="5.2" \
  -DCMAKE_BUILD_TYPE=RelWithDebugInfo \
  -DCMAKE_INSTALL_PREFIX=/usr/local \
  -DBUILD_TESTS:BOOL=OFF \
  -DBUILD_PERF_TESTS:BOOL=OFF \
  -DWITH_FFMPEG:BOOL=ON \
  -DENABLE_NEON:BOOL=ON \
  -DBUILD_EXAMPLES:BOOL=ON \
  -DINSTALL_C_EXAMPLES:BOOL=OFF \
  -DINSTALL_PYTHON_EXAMPLES:BOOL=ON \
  -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules
Actually, tomasth is correct. Can you edit your question with the getBuildInformation() from both versions? There should be some obvious differences.
Also, just to check, is your processor an ARM processor? Or is it Intel or AMD?
Hi. I added my getBuildInformation() result. My machine is 64-bit ARM.
Two things.
I notice you have OpenCL disabled. That could do it.
Secondly, the ROS build uses the NVIDIA HAL called carotene. You should see an option in the WITH section of your cmake named WITH_CAROTENE. Give that a try.
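To check both points quickly, the build-information text can be scanned for lines mentioning OpenCL or carotene. The following is a rough sketch; find_feature_lines is a hypothetical helper, and the exact wording of the lines varies between OpenCV versions, so it only does a case-insensitive keyword search.

```python
def find_feature_lines(build_info, keywords=("opencl", "carotene")):
    """Map each keyword to the build-info lines that mention it."""
    hits = {keyword: [] for keyword in keywords}
    for line in build_info.splitlines():
        lowered = line.lower()
        for keyword in keywords:
            if keyword in lowered:
                hits[keyword].append(line.strip())
    return hits
```

Passing the output of cv2.getBuildInformation() from each installation shows which of the two builds reports these features; an empty list means the dump does not mention the keyword at all.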