Cannot reproduce the same build
Can someone help me reproduce an earlier build of OpenCV? A year ago I compiled OpenCV 3.3.1 with CUDA support to check the performance of our own undistortion GStreamer plugin, which uses the remap function from OpenCV.
With that OpenCV 3.3.1 build, our plugin takes ~500 ms to process a 1-minute video. I thought the performance was due to CUDA support, but it looks like the GPU is not under load at all, so something else must contain an optimization for the remap function. We compiled OpenCV 4.1.1 with CUDA support to try to reach similar performance, but instead of 500 ms it takes 8 seconds.
General configuration for OpenCV 3.3.1 =====================================
  Version control:               unknown

  Extra modules:
    Location (extra):            /storage/opencv-xavier/opencv-3.3.1/opencv_contrib/modules
    Version control (extra):     3.3.1

  Platform:
    Timestamp:                   2019-10-02T11:25:59Z
    Host:                        Linux 4.9.140-tegra aarch64
    CMake:                       3.10.2
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               RELEASE

  CPU/HW features:
    Baseline:                    NEON FP16
      required:                  NEON
      disabled:                  VFPV3

  C/C++:
    Built as dynamic libs?:      YES
    C++11:                       YES
    C++ Compiler:                /usr/bin/c++ (ver 7.4.0)
    C++ flags (Release):         -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG
    C++ flags (Debug):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -g -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-narrowing -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG
    C flags (Debug):             -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-narrowing -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -g -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):
    Linker flags (Debug):
    ccache:                      NO
    Precompiled headers:         YES
    Extra dependencies:          dl m pthread rt cudart nppc nppial nppicc nppicom nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cufft -L/usr/local/cuda/lib64
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 cudev core cudaarithm flann hdf imgproc ml objdetect phase_unwrapping plot reg surface_matching video xphoto bgsegm cudabgsegm cudafilters cudaimgproc cudawarping dnn face freetype fuzzy img_hash imgcodecs photo shape videoio xobjdetect cudacodec highgui bioinspired dpm features2d line_descriptor saliency text calib3d ccalib cudafeatures2d cudalegacy cudaobjdetect cudaoptflow cudastereo datasets rgbd stereo structured_light superres tracking videostab xfeatures2d ximgproc aruco optflow stitching python2 python3
    Disabled:                    js world contrib_world
    Disabled by dependency:      -
    Unavailable:                 java ts viz cnn_3dobj cvv dnn_modern matlab sfm

  GUI:
    QT:                          NO
    GTK+ 3.x:                    YES (ver 3.22.30)
      GThread :                  YES (ver 2.56.4)
      GtkGlExt:                  NO
    OpenGL support:              NO
    VTK support:                 NO

  Media I/O:
    ZLib:                        /usr/lib/aarch64-linux-gnu ...
What are you actually calling in your code?
cvRemap()
? There's no CUDA optimization in there at all. Please show the respective code snippets.
i would not think so, there's like 3 years between 3.3 and 4.4
Sorry, my fault. I meant that these 2 builds are not absolutely the same. I also forgot to say that the architecture is ARM.
In the last few days I successfully reproduced the same build and speed with OpenCV 3.3.1. With the same flags, OpenCV 4.1.1 doesn't work: it's still slow. Looks like something was changed in the newer version.
Diff between 2 builds https://www.diffchecker.com/KvKeEPOD
cvarrToMat
-- the C API is gone in OpenCV 4, you have to change your code. And why even bother with something as outdated as 4.1.1? We're at 4.5 now.