Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

Can cv::dft() be sped up with the right compiler flags?

For some time I have been using cv::dft() on a large image and it always took about 4-5 seconds. I noticed it now takes about 30 seconds for the same image and I wonder why. I recently recompiled OpenCV, without having saved the original compiler flags, so maybe I am missing a flag now that speeds up the dft function?

This is the current build configuration:

General configuration for OpenCV 3.2.0 =====================================
  Version control:               unknown

  Extra modules:
    Location (extra):            /home/uname/opencv_contrib-3.2.0/modules
    Version control (extra):     unknown

  Platform:
    Timestamp:                   2018-09-17T15:22:43Z
    Host:                        Linux 4.4.0-135-generic x86_64
    CMake:                       3.5.1
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               RELEASE

  C/C++:
    Built as dynamic libs?:      YES
    C++ Compiler:                /usr/bin/c++  (ver 5.4.0)
    C++ flags (Release):         -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wno-narrowing -Wno-delete-non-virtual-dtor -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wno-narrowing -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wno-narrowing -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):
    Linker flags (Debug):
    ccache:                      NO
    Precompiled headers:         YES
    Extra dependencies:          /home/uname/anaconda3/lib/libpng.so /home/uname/anaconda3/lib/libtiff.so /usr/lib/x86_64-linux-gnu/libjasper.so /home/uname/anaconda3/lib/libjpeg.so gtk-3 gdk-3 pangocairo-1.0 pango-1.0 atk-1.0 cairo-gobject cairo gdk_pixbuf-2.0 gio-2.0 gobject-2.0 glib-2.0 gthread-2.0 avcodec-ffmpeg avformat-ffmpeg avutil-ffmpeg swscale-ffmpeg /home/uname/anaconda3/lib/libhdf5_hl.so /home/uname/anaconda3/lib/libhdf5.so /usr/lib/x86_64-linux-gnu/librt.so /usr/lib/x86_64-linux-gnu/libpthread.so /home/uname/anaconda3/lib/libz.so /usr/lib/x86_64-linux-gnu/libdl.so /usr/lib/x86_64-linux-gnu/libm.so dl m pthread rt cudart nppc nppial nppicc nppicom nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cufft -L/usr/local/cuda/lib64
    3rdparty dependencies:       libwebp IlmImf libprotobuf

  OpenCV modules:
    To be built:                 cudev core cudaarithm flann hdf imgproc ml reg surface_matching video cudabgsegm cudafilters cudaimgproc cudawarping dnn freetype fuzzy imgcodecs photo shape videoio cudacodec highgui objdetect plot ts xobjdetect xphoto bgsegm bioinspired dpm face features2d line_descriptor saliency text calib3d ccalib cudafeatures2d cudalegacy cudaobjdetect cudaoptflow cudastereo datasets rgbd stereo superres tracking videostab xfeatures2d ximgproc aruco optflow phase_unwrapping stitching structured_light python2 python3
    Disabled:                    world contrib_world
    Disabled by dependency:      -
    Unavailable:                 java viz cnn_3dobj cvv matlab sfm

  GUI: 
    QT:                          NO
    GTK+ 3.x:                    YES (ver 3.18.9)
    GThread :                    YES (ver 2.48.2)
    GtkGlExt:                    NO
    OpenGL support:              NO
    VTK support:                 NO

  Media I/O: 
    ZLib:                        /home/uname/anaconda3/lib/libz.so (ver 1.2.8)
    JPEG:                        /home/uname/anaconda3/lib/libjpeg.so (ver 80)
    WEBP:                        build (ver 0.3.1)
    PNG:                         /home/uname/anaconda3/lib/libpng.so (ver 1.6.28)
    TIFF:                        /home/uname/anaconda3/lib/libtiff.so (ver 42 - 4.0.6)
    JPEG 2000:                   /usr/lib/x86_64-linux-gnu/libjasper.so (ver 1.900.1)
    OpenEXR:                     build (ver 1.7.1)
    GDAL:                        NO
    GDCM:                        NO

  Video I/O:
    DC1394 1.x:                  NO
    DC1394 2.x:                  NO
    FFMPEG:                      YES
      avcodec:                   YES (ver 56.60.100)
      avformat:                  YES (ver 56.40.101)
      avutil:                    YES (ver 54.31.100)
      swscale:                   YES (ver 3.1.101)
      avresample:                NO
    GStreamer:                   NO
    OpenNI:                      NO
    OpenNI PrimeSensor Modules:  NO
    OpenNI2:                     NO
    PvAPI:                       NO
    GigEVisionSDK:               NO
    Aravis SDK:                  NO
    UniCap:                      NO
    UniCap ucil:                 NO
    V4L/V4L2:                    NO/YES
    XIMEA:                       NO
    Xine:                        NO
    gPhoto2:                     NO

  Parallel framework:            pthreads

  Other third-party libraries:
    Use IPP:                     NO
    Use IPP Async:               NO
    Use VA:                      NO
    Use Intel VA-API/OpenCL:     NO
    Use Lapack:                  NO
    Use Eigen:                   NO
    Use Cuda:                    YES (ver 9.1)
    Use OpenCL:                  YES
    Use OpenVX:                  NO
    Use custom HAL:              NO

  NVIDIA CUDA
    Use CUFFT:                   YES
    Use CUBLAS:                  YES
    USE NVCUVID:                 NO
    NVIDIA GPU arch:             20 30 35 37 50 52 60 61
    NVIDIA PTX archs:
    Use fast math:               YES

  OpenCL:                        <Dynamic loading of OpenCL library>
    Include path:                /home/uname/opencv-3.2.0/3rdparty/include/opencl/1.2
    Use AMDFFT:                  NO
    Use AMDBLAS:                 NO

  Python 2:
    Interpreter:                 /usr/bin/python2.7 (ver 2.7.12)
    Libraries:                   /usr/lib/x86_64-linux-gnu/libpython2.7.so (ver 2.7.12)
    numpy:                       /usr/local/lib/python2.7/dist-packages/numpy/core/include (ver 1.13.0)
    packages path:               lib/python2.7/dist-packages

  Python 3:
    Interpreter:                 /home/uname/anaconda3/bin/python3 (ver 3.5.2)
    Libraries:                   /usr/lib/x86_64-linux-gnu/libpython3.5m.so (ver 3.5.2)
    numpy:                       /home/uname/anaconda3/lib/python3.5/site-packages/numpy/core/include (ver 1.12.1)
    packages path:               lib/python3.5/site-packages

  Python (for build):            /usr/bin/python2.7

  Java:
    ant:                         NO
    JNI:                         /usr/lib/jvm/default-java/include /usr/lib/jvm/default-java/include/linux /usr/lib/jvm/default-java/include
    Java wrappers:               NO
    Java tests:                  NO

  Matlab:                        Matlab not found or implicitly disabled

  Documentation:
    Doxygen:                     /usr/bin/doxygen (ver 1.8.11)

  Tests and samples:
    Tests:                       YES
    Performance tests:           YES
    C/C++ Examples:              YES