Does OpenCV 3.4.x perform better (execution time) in release mode?

asked 2018-04-24 04:52:21 -0500

chetanskumar

updated 2018-04-26 15:31:11 -0500

I've built OpenCV 3.4.1 in Release mode with the default flags. Does this make the build any different from a Debug build? I have a large codebase that uses the OpenCV core libraries, and switching the build to Release mode has not made much difference. I'm wondering if I've done something wrong or forgotten something during my build, perhaps a flag that should be set for better performance.

Here is a log of my Release build config.

General configuration for OpenCV 3.4.1 =====================================
  Version control:               unknown

    Host:                        Linux 4.4.0-87-generic x86_64
    CMake:                       3.11.1
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               Release

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (3 files):          + SSSE3 SSE4_1
      SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (2 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (5 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (9 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2

    Built as dynamic libs?:      YES
    C++ Compiler:                /usr/bin/c++  (ver 4.8.4)
    C++ flags (Release):         -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-narrowing -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-narrowing -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):      
    Linker flags (Debug):        
    ccache:                      NO
    Precompiled headers:         YES
    Extra dependencies:          dl m pthread rt
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann highgui imgcodecs imgproc java_bindings_generator ml objdetect photo python2 python3 python_bindings_generator shape stitching superres ts video videoio videostab viz
    Disabled:                    js world
    Disabled by dependency:      -
    Unavailable:                 cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev java
    Applications:                tests perf_tests apps
    Documentation:               NO
    Non-free algorithms:         NO

    GTK+:                        YES (ver 2.24.23)
      GThread :                  YES (ver 2.40.2)
      GtkGlExt:                  NO
    VTK support:                 YES (ver 5.8.0)

  Media I/O: 
    ZLib:                        /usr/lib/x86_64-linux-gnu/ (ver 1.2.8)
    JPEG:                        /usr/lib/x86_64-linux-gnu/ (ver )
    WEBP:                        build (ver encoder: 0x020e)
    PNG:                         /usr/lib ...

Actually, it should have an effect, because a Release build strips the debug-only code: no more debug-only asserts (such as CV_DbgAssert), no more debug checks, and so on. Whether that actually improves speed a lot depends heavily on the code, though. Can you quantify the difference in numbers? Also install Threading Building Blocks (TBB), which is generally a better parallel backend than plain pthreads.

StevenPuttemans ( 2018-04-26 07:04:42 -0500 )
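To put numbers on the Debug versus Release question, a small timing harness can help. The following is a minimal sketch, not part of OpenCV: `build_flavor` and `median_ms` are hypothetical helper names, and the `__OPTIMIZE__` check assumes GCC or Clang. It reports how the calling code itself was compiled (an unoptimized application can hide any gain from an optimized library) and measures the median wall-clock time of a workload.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: describes how *this* translation unit was compiled.
// NDEBUG is the standard macro that disables assert(); __OPTIMIZE__ is
// predefined by GCC and Clang when optimization (-O1 or higher) is enabled.
inline std::string build_flavor() {
    std::string s;
#ifdef NDEBUG
    s += "NDEBUG=on";
#else
    s += "NDEBUG=off";
#endif
#ifdef __OPTIMIZE__
    s += " optimized=yes";
#else
    s += " optimized=no-or-unknown";
#endif
    return s;
}

// Hypothetical helper: runs `work` `runs` times and returns the median
// wall-clock time in milliseconds (the upper middle sample when `runs` is
// even). The median is less sensitive to one-off stalls than the mean.
template <typename F>
double median_ms(F&& work, std::size_t runs = 11) {
    if (runs == 0) return 0.0;
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

As an illustration, wrapping one OpenCV call such as `median_ms([&] { cv::GaussianBlur(src, dst, cv::Size(5, 5), 0); })` and printing the result next to `build_flavor()` in both a Debug and a Release build of the application gives two directly comparable numbers.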

My experience using the DNN module is that there is a very significant performance difference between debug and release on builds targeting Windows. For instance, executing TinyYolo on CPU, release mode builds seem to be 3x to 4x faster in processing a single frame. However, I see very little difference executing the same code targeting Android: debug and release performance is practically the same.

Evren ( 2018-04-26 11:11:14 -0500 )

Hmm, that could be because Android is a mobile platform. I am unsure whether mobile platforms handle the debug and release concepts in the same way.

StevenPuttemans ( 2018-04-27 02:16:00 -0500 )

Besides that, if you are going to build an app for sale, the compressed release binary will also be smaller than the debug one, and that still matters :)

StevenPuttemans ( 2018-04-27 02:16:53 -0500 )