Cannot reproduce same build

asked 2020-10-21 06:32:50 -0500

Can someone help me reproduce the same build of OpenCV? A year ago I compiled OpenCV 3.3.1 with CUDA support to check the performance of our own undistortion GStreamer plugin, which uses the remap function from OpenCV.

With the compiled OpenCV 3.3.1, our plugin takes ~500 ms to process a 1-minute video. I thought the performance was due to CUDA support, but it looks like the GPU is not loaded either. Something else must contain an optimization for the remap function. We compiled OpenCV 4.1.1 with CUDA support to try to reach similar performance, but instead of 500 ms it takes 8 seconds.
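To separate the cost of remap from the rest of the GStreamer pipeline, the time per call can be measured directly in both builds. A minimal sketch in standard C++ follows; `meanMillis` is a hypothetical helper, not part of OpenCV, and the callable passed to it is assumed to wrap one remap call on one frame.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` `iterations` times and returns the mean
// wall-clock time per call in milliseconds.
double meanMillis(const std::function<void()>& work, int iterations) {
    using clock = std::chrono::steady_clock;  // monotonic, suitable for intervals
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}
```

Calling it as `meanMillis([&] { remap(src, dst, map1, map2, INTER_LINEAR); }, 100)` in the same program linked against each build would show whether the slowdown is in remap itself or elsewhere in the plugin.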

General configuration for OpenCV 3.3.1 =====================================
  Version control:               unknown

  Extra modules:
    Location (extra):            /storage/opencv-xavier/opencv-3.3.1/opencv_contrib/modules
    Version control (extra):     3.3.1

  Platform:
    Timestamp:                   2019-10-02T11:25:59Z
    Host:                        Linux 4.9.140-tegra aarch64
    CMake:                       3.10.2
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               RELEASE

  CPU/HW features:
    Baseline:                    NEON FP16
      required:                  NEON
      disabled:                  VFPV3

  C/C++:
    Built as dynamic libs?:      YES
    C++11:                       YES
    C++ Compiler:                /usr/bin/c++ (ver 7.4.0)
    C++ flags (Release):         -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG
    C++ flags (Debug):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -g -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-narrowing -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG
    C flags (Debug):             -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-narrowing -Wno-comment -Wno-implicit-fallthrough -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fvisibility=hidden -g -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):
    Linker flags (Debug):
    ccache:                      NO
    Precompiled headers:         YES
    Extra dependencies:          dl m pthread rt cudart nppc nppial nppicc nppicom nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cufft -L/usr/local/cuda/lib64
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 cudev core cudaarithm flann hdf imgproc ml objdetect phase_unwrapping plot reg surface_matching video xphoto bgsegm cudabgsegm cudafilters cudaimgproc cudawarping dnn face freetype fuzzy img_hash imgcodecs photo shape videoio xobjdetect cudacodec highgui bioinspired dpm features2d line_descriptor saliency text calib3d ccalib cudafeatures2d cudalegacy cudaobjdetect cudaoptflow cudastereo datasets rgbd stereo structured_light superres tracking videostab xfeatures2d ximgproc aruco optflow stitching python2 python3
    Disabled:                    js world contrib_world
    Disabled by dependency:      -
    Unavailable:                 java ts viz cnn_3dobj cvv dnn_modern matlab sfm

  GUI:
    QT:                          NO
    GTK+ 3.x:                    YES (ver 3.22.30)
      GThread :                  YES (ver 2.56.4)
      GtkGlExt:                  NO
    OpenGL support:              NO
    VTK support:                 NO

  Media I/O:
    ZLib:                        /usr/lib/aarch64-linux-gnu ...



what are you actually calling in your code ? cvRemap() ? there's no CUDA optimization in there at all

please show resp. code snippets

"I know that the 2 builds are absolutely the same."

i would not think so, there's like 3 years between 3.3 and 4.4

berak ( 2020-10-21 07:56:15 -0500 )

Sorry, my fault. I meant these 2 builds are not absolutely the same. I also forgot to say that the architecture is ARM.

Over the last few days I successfully reproduced the same build and speed with OpenCV 3.3.1. With the same flags, OpenCV 4.1.1 does not reach that speed; it's still slow. It looks like something changed in the newer version.

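#include <opencv2/core/core_c.h>  // cvarrToMat (bridge from the legacy C API)
#include <opencv2/imgproc.hpp>    // remap, INTER_LINEAR

using namespace cv;

// img and outimg are legacy C-API arrays (CvArr*); wrap them without copying.
Mat src = cvarrToMat(img);
Mat dst = cvarrToMat(outimg);
// map1 and map2 are the undistortion maps; the result is written into dst.
remap(src, dst, map1, map2, INTER_LINEAR);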
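For reference, the work that a bilinear remap does per output pixel can be sketched in plain C++. This is not OpenCV's implementation, which is more heavily optimized; `Gray8` and `remapBilinear` are hypothetical names, the image is assumed to be single-channel 8-bit, and border handling is simplified so that any sample whose 2x2 neighbourhood leaves the source image becomes 0.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a single-channel 8-bit image (not cv::Mat).
struct Gray8 {
    int width;
    int height;
    std::vector<std::uint8_t> data;  // row-major, width * height bytes
    Gray8(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0) {}
    std::uint8_t at(int x, int y) const {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// For each destination pixel i, samples src at (mapX[i], mapY[i]) using
// bilinear interpolation of the four surrounding source pixels.
Gray8 remapBilinear(const Gray8& src, const std::vector<float>& mapX,
                    const std::vector<float>& mapY, int dstW, int dstH) {
    Gray8 dst(dstW, dstH);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * dstW + x;
            const float sx = mapX[i];
            const float sy = mapY[i];
            const int x0 = static_cast<int>(std::floor(sx));
            const int y0 = static_cast<int>(std::floor(sy));
            // Simplified border rule: leave the pixel at 0 if any neighbour is outside.
            if (x0 < 0 || y0 < 0 || x0 + 1 >= src.width || y0 + 1 >= src.height) {
                continue;
            }
            const float fx = sx - x0;  // horizontal weight of the right neighbours
            const float fy = sy - y0;  // vertical weight of the bottom neighbours
            const float top = src.at(x0, y0) * (1 - fx) + src.at(x0 + 1, y0) * fx;
            const float bottom = src.at(x0, y0 + 1) * (1 - fx) + src.at(x0 + 1, y0 + 1) * fx;
            dst.data[i] = static_cast<std::uint8_t>(std::lround(top * (1 - fy) + bottom * fy));
        }
    }
    return dst;
}
```

The loop touches four source pixels and a handful of multiplications per output pixel, so the achievable speed depends mostly on how well the library vectorizes and parallelizes it on the target CPU.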

Diff between 2 builds

sbofirov ( 2020-10-23 02:03:45 -0500 )
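A comparison like the diff mentioned above can also be done programmatically on the two configuration dumps. The following is a minimal sketch in standard C++; `configLines` and `onlyIn` are hypothetical helpers, and the input strings are assumed to be the text returned by `cv::getBuildInformation()` in each build.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Splits a configuration dump into trimmed, non-empty lines.
std::set<std::string> configLines(const std::string& dump) {
    std::set<std::string> lines;
    std::istringstream in(dump);
    std::string line;
    while (std::getline(in, line)) {
        const auto first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            continue;  // blank line
        }
        const auto last = line.find_last_not_of(" \t\r");
        lines.insert(line.substr(first, last - first + 1));
    }
    return lines;
}

// Returns the lines that appear in dump `a` but not in dump `b`, sorted.
std::vector<std::string> onlyIn(const std::string& a, const std::string& b) {
    const std::set<std::string> la = configLines(a);
    const std::set<std::string> lb = configLines(b);
    std::vector<std::string> result;
    for (const auto& l : la) {
        if (lb.count(l) == 0) {
            result.push_back(l);
        }
    }
    return result;
}
```

Running `onlyIn` in both directions lists the settings unique to each build; entries such as the CPU/HW features baseline or the parallel framework are the ones most likely to matter for remap speed when the GPU is idle.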

cvarrToMat -- the c-api is gone in opencv 4, you have to change your code

and why bother with something as outdated as 4.1.1 even ? we're at 4.5 now

berak ( 2020-10-23 02:23:47 -0500 )