I've been looking through the long list of OpenCV 4 compile switches generated by cmake-gui, trying to figure out the best options for fast runtime performance. That would include multithreading/multicore support, math and matrix libraries, GPU utilization, etc. I haven't found much in the way of guidance.
Switches include:
Atlas_,
BUILD_IPP_IW,
WITH_IPP,
BUILD_WITH_DYNAMIC_IPP,
WITH_ITT,
BUILD_TBB,
WITH_TBB,
various *BLAS switches,
various LAPACK switches,
various OPENCL switches,
MKL_,
OPENMP,
*PTHREADS,
WITH_CUDA,
whatever other Intel libraries,
[probably more that I'm overlooking]
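For reference, this is roughly how I check which of these ended up enabled in a build I already have. It is only a sketch: `summarize_build_info` is my own helper name, and the keyword list is a guess at the labels that appear in the text returned by `cv2.getBuildInformation()`, so it may need adjusting for a particular version.

```python
def summarize_build_info(info, keywords=("Parallel framework", "IPP", "Lapack",
                                         "OpenCL", "CUDA", "CPU/HW", "Baseline")):
    """Pick out 'label: value' lines whose label mentions one of the keywords.

    `info` is the multi-line text from cv2.getBuildInformation(). The default
    keywords are assumptions about the labels used in that report.
    """
    summary = {}
    for line in info.splitlines():
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'label: value' line
        label, value = label.strip(), value.strip()
        # Skip section headers, which have a label but no value.
        if value and any(k.lower() in label.lower() for k in keywords):
            summary[label] = value
    return summary


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not importable in this environment")
    else:
        for label, value in summarize_build_info(cv2.getBuildInformation()).items():
            print(f"{label}: {value}")
        # Runtime view: are optimized code paths on, and how many threads are used?
        print("useOptimized:", cv2.useOptimized())
        print("getNumThreads:", cv2.getNumThreads())
```

Running this against each rebuilt configuration at least shows whether a switch I set in cmake-gui actually took effect.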
Can anyone provide leads toward some kind of strategy for setting the compile switches? I'm using OpenCV with C++ and Python 3. I'll need to debug the code that I'm writing, but I probably don't need debugging info for OpenCV itself.
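To make the question concrete, here is a small sketch of how I would generate a configure command for each combination I want to try. The helper name `cmake_args` and the ON/OFF choices are placeholders of mine, not recommendations. The only part I am fairly sure about is `CMAKE_BUILD_TYPE=Release`, since I don't need debug info inside OpenCV itself.

```python
def cmake_args(options, build_type="Release"):
    """Turn a {switch_name: bool} mapping into a list of -D arguments for cmake."""
    args = [f"-DCMAKE_BUILD_TYPE={build_type}"]
    for name, enabled in sorted(options.items()):
        state = "ON" if enabled else "OFF"
        args.append(f"-D{name}={state}")
    return args


# Hypothetical combination to try; the values here are guesses on my part.
candidate = {
    "WITH_TBB": True,
    "WITH_IPP": True,
    "WITH_OPENCL": True,
    "WITH_CUDA": False,
}

# <path-to-opencv-source> is a placeholder for the OpenCV source directory.
print("cmake " + " ".join(cmake_args(candidate)) + " <path-to-opencv-source>")
```

The idea would be to build a few such variants and time the same workload against each, but I'd still like to know which combinations are worth trying in the first place.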