I also get pretty long compilation times for OpenCV, even on an Intel Sandy Bridge server with 64 GB of RAM. My guess is that it's a combination of:

  • Lots of C++ templates (see this thread for some insight into why C++ templates can be slow to build).
  • Lots of CUDA kernels. Remember, the kernels themselves are not polymorphic (unless you do some really advanced tricks), so OpenCV routines often have a separate kernel for each data type (CV_8UC1, CV_32FC3, etc.), and each one has to be compiled. This adds up quickly.
  • As Vladislav said, building for several GPU architectures (compute capability 1.0, 1.1, 1.2, 2.0, etc.) increases build time, because the device code is compiled once per architecture. You can avoid this by setting the CUDA_ARCH_BIN flag to just the architecture of your own GPU.
  • I think there's also some code generation going on at compile time. I don't remember the details, but I remember seeing a bunch of printouts about code generation during the OpenCV GPU compilation.
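
To make the CUDA_ARCH_BIN point concrete, here is a rough sketch of a configure step that targets a single architecture. The source path and the 2.0 value are placeholders I made up for illustration; substitute your own checkout location and your GPU's compute capability.

```shell
# Assumed values, not taken from any particular setup:
OPENCV_SRC="${OPENCV_SRC:-$HOME/opencv}"   # hypothetical path to the OpenCV source tree
CUDA_ARCH="${CUDA_ARCH:-2.0}"              # compute capability of your GPU (2.0 is only an example)

# Configure OpenCV with CUDA enabled, generating device code for one architecture only.
# Meant to be called from an empty build directory.
configure_opencv() {
    cmake -D WITH_CUDA=ON \
          -D CUDA_ARCH_BIN="$CUDA_ARCH" \
          "$OPENCV_SRC"
}
```

Calling configure_opencv from a fresh build directory runs the configure step; nothing happens until you call it.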

You may have already tried this, but building in parallel can help (e.g. use the flag -j8 for 8 parallel jobs, -j16 for 16 jobs, pick your favorite number; something close to your core count is a reasonable start). I've noticed that builds sometimes fail in parallel mode, but this may just be coincidence. Anyway, it's worth a try.
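
A minimal sketch of a parallel build, assuming a make-based build directory. The serial retry is my own addition to work around the occasional parallel failure mentioned above, and the fallback of 4 jobs is arbitrary.

```shell
# Use one job per core if nproc is available; otherwise fall back to 4 (an arbitrary guess).
JOBS="$(nproc 2>/dev/null || echo 4)"

# Try a parallel build first. If it fails, rerun with a single job so that
# a failure caused by the parallel run gets a second chance and any real
# error message is easier to read.
build_opencv() {
    make -j"$JOBS" || make -j1
}
```

Run build_opencv from the build directory after configuring.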

