Does anybody know where I can find the opencv_dnn module with CUDA support? The master branch supports only the MKL and OpenCL backends, but both are too slow compared to Caffe with CUDA: in my particular task, the average forward propagation time is about 15 ms in Caffe with CUDA, whereas opencv_dnn takes 350 ms with MKL and 280 ms with OpenCL.