Look, you are touching a "burned" area of OpenCV here, and IMHO you should invest your time in better things.
Let me tell you why I think so.
The situation now:
1) OpenCV was started by Intel (today it is maintained as an open-source project), and Intel has little reason to favor NVIDIA.
CUDA is a proprietary NVIDIA technology.
2) GPU support in OpenCV is done via OpenCL. OpenCL is an open standard and should work across multiple GPU vendors.
OpenCL = "cross-platform GPU computing framework"
3) At least for the DNN module I could not see any difference between CPU and GPU. The code is most likely not optimized to use the GPU at all, or the OpenCL code is just not efficient enough. After spending literally days or weeks on the topic "make OpenCV fast on GPU for DNN/CNN", I just switched to the native solution using a CUDA GPU (which was fast).
4) About Python: Python is a nice language with its own funny way of seeing the world compared to other programming languages, but performance is not its strength. A big reason Python is so popular in the machine learning domain is that you can easily call C/C++ code from it through its C extension interface. The number crunching / heavy lifting is always done in C/C++. So you can think about making CUDA calls from Python, but most likely you will need to write a C/C++ wrapper anyway.
At least with point 4) you are leaving the OpenCV world and should look for help somewhere else.
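To illustrate what point 4) means by calling C code from Python, here is a minimal sketch using the standard `ctypes` module. It assumes a POSIX system where the C standard library can be located; `c_strlen` is just a made-up helper name for this example, and `strlen` stands in for whatever native function (for instance one exported by your own CUDA wrapper library) you would really call.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library (assumption: POSIX system).
# If find_library returns None, CDLL(None) falls back to the symbols
# already loaded into the current process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Return the length of a byte string, computed by C's strlen."""
    return libc.strlen(text)


print(c_strlen(b"opencv"))
```

The same pattern applies to a shared library you compile yourself: the heavy lifting stays in C/C++ and Python only declares the signatures and passes the data.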
Personally, I just run my model on CUDA if possible; otherwise I use OpenCV, since it is fastest on CPU.
My personal advice:
If you want to do face detection, just train a CNN-based model and run it on the GPU. I would rather invest my time there than in porting some code. But your idea is not bad at all. I wanted to do the same!
Thanks for the answer.
No, in recent versions of OpenCV, CUDA wrappers for Python have appeared; it's just that they are not documented anywhere. The files with the generated Python wrappers (see build/modules/python or build/modules/python_bindings_generator) include CUDA wrappers.
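A quick way to check whether a particular build exposes these wrappers is sketched below. This is only a rough check under my own assumptions: `cuda_wrapper_status` is a made-up helper name, while `cv2.cuda.getCudaEnabledDeviceCount()` is the call from the OpenCV Python bindings that reports how many CUDA devices the build can use.

```python
def cuda_wrapper_status():
    """Summarize whether the installed cv2 exposes CUDA bindings (hypothetical helper)."""
    status = {"cv2_installed": False, "cuda_module": False, "cuda_devices": 0}
    try:
        import cv2
    except ImportError:
        return status  # OpenCV's Python package is not available at all

    status["cv2_installed"] = True

    # The generated bindings place the CUDA wrappers under cv2.cuda.
    cuda = getattr(cv2, "cuda", None)
    status["cuda_module"] = cuda is not None

    if cuda is not None and hasattr(cuda, "getCudaEnabledDeviceCount"):
        try:
            status["cuda_devices"] = int(cuda.getCudaEnabledDeviceCount())
        except cv2.error:
            pass  # e.g. a driver problem; leave the count at 0

    return status


print(cuda_wrapper_status())
```

A device count of 0 usually means either that OpenCV was built without CUDA support or that no usable GPU was found, even though the `cv2.cuda` namespace itself is present.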
The problem is that, because of my skill level and the lack of proper documentation, I can't figure it out myself, so I am asking for help.
I think this topic should be popular, since there is very little information on the Internet about using the GPU with Python. There are only old questions that deal with old versions of OpenCV, which had no such CUDA wrappers.