I get pretty long compilation times for OpenCV even on an Intel Sandy Bridge server with 64 GB of RAM. My speculation is that it's a combination of:
- Lots of C++ templates (see this thread for some insight into why C++ templates can be slow to build).
- Lots of CUDA kernels -- remember, the kernels themselves are not polymorphic (unless you do some really advanced tricks), so OpenCV routines often have one kernel for each data type (`CV_8UC1`, `CV_32FC3`, etc). This adds up quickly.
- As Vladislav said, building for several architectures (Compute 1.0, 1.1, 1.2, 2.0, etc) increases build time, but you can avoid this by selecting just your architecture in the `CUDA_ARCH_BIN` flag.
- I think there's also some code generation going on at compile time. I don't remember the details, but I remember seeing a bunch of printouts about code generation during the OpenCV GPU compilation.
You may have already tried this, but building in parallel (e.g. pass `make` the flag `-j8` for 8 parallel jobs, `-j16` for 16 jobs, pick your favorite number) can help. I've noticed that builds sometimes fail in parallel mode, but this may just be coincidence. Anyway, it's worth a try.
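To make the last two suggestions concrete, here is a rough sketch of how the configure and build steps might look. The source path, the architecture value `2.0`, and the job count are placeholders I picked for illustration, and the `run` helper and `DRY_RUN` switch are my own additions so the script only prints the commands by default. `WITH_CUDA` and `CUDA_ARCH_BIN` are OpenCV CMake options, but check the names against your OpenCV version.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not a recommendation.
OPENCV_SRC="${OPENCV_SRC:-../opencv}"   # hypothetical path to the OpenCV sources
CUDA_ARCH="${CUDA_ARCH:-2.0}"           # compute capability of the GPU you actually use
JOBS="${JOBS:-8}"                       # number of parallel make jobs
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands, anything else = run them

# Print a command, and execute it unless we are in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Configure: build GPU code for a single architecture instead of several.
run cmake -D WITH_CUDA=ON -D CUDA_ARCH_BIN="$CUDA_ARCH" "$OPENCV_SRC" || exit 1

# Build in parallel. If this fails, retrying with fewer jobs is a cheap check.
run make -j"$JOBS"
```

Run it from an empty build directory with `DRY_RUN=0` to actually execute the commands. If a parallel build fails, rerunning with a smaller `JOBS` value is a quick way to see whether the failure is related to parallelism or not.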