TBB affecting HoughCircles?
I recently installed OpenCV via CMake with the option WITH_TBB=ON on a Raspberry Pi 3.
The code I want to accelerate is basically circle detection with HoughCircles.
Unfortunately, the CPU usage is the same as it was before enabling TBB: somewhere below 30%.
Why is that? I assumed HoughCircles would be highly parallelizable, according to this paper: http://www.ijcsi.org/papers/IJCSI-9-6-3-481-486.pdf.
EDIT:
Taking a look at the source code mentioned by matman (see here), at line 1058 there are two for-loops which, as far as I can see, are just filling up the accumulator. How can they be parallelized with parallel_for_, and would that actually help speed up the detection?
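
To make the question concrete, here is a minimal sketch of one way accumulator-filling loops like these could be split across workers. It is not OpenCV's code: EdgePoint, voteRange and fillAccumulatorParallel are hypothetical names, and std::thread stands in for cv::parallel_for_ so the example compiles without OpenCV. With parallel_for_, the loop body would receive a cv::Range and do the work that voteRange does here for [begin, end). Each worker votes into its own private accumulator, so no locking is needed, and the partial accumulators are summed at the end.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical edge pixel: position plus unit gradient direction.
struct EdgePoint { int x; int y; float dx; float dy; };

// Cast votes for edges[begin, end) into accum (row-major, width * height).
// Each edge votes along its gradient, in both directions, for every radius
// in [minR, maxR].
void voteRange(const std::vector<EdgePoint>& edges,
               std::size_t begin, std::size_t end,
               int width, int height, int minR, int maxR,
               std::vector<int>& accum)
{
    for (std::size_t i = begin; i < end; ++i) {
        const EdgePoint& e = edges[i];
        for (int sign = -1; sign <= 1; sign += 2) {
            for (int r = minR; r <= maxR; ++r) {
                const int cx = static_cast<int>(std::lround(e.x + sign * r * e.dx));
                const int cy = static_cast<int>(std::lround(e.y + sign * r * e.dy));
                if (cx < 0 || cx >= width || cy < 0 || cy >= height)
                    continue;  // candidate center falls outside the image
                ++accum[static_cast<std::size_t>(cy) * width + cx];
            }
        }
    }
}

// Split the edge list into contiguous chunks, one per worker. Every worker
// fills a private accumulator; the partial results are summed afterwards.
std::vector<int> fillAccumulatorParallel(const std::vector<EdgePoint>& edges,
                                         int width, int height,
                                         int minR, int maxR,
                                         unsigned numWorkers)
{
    numWorkers = std::max(1u, numWorkers);
    const std::size_t cells = static_cast<std::size_t>(width) * height;
    std::vector<std::vector<int>> partial(numWorkers, std::vector<int>(cells, 0));

    const std::size_t chunk = (edges.size() + numWorkers - 1) / numWorkers;
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < numWorkers; ++w) {
        const std::size_t begin = std::min(edges.size(), w * chunk);
        const std::size_t end = std::min(edges.size(), begin + chunk);
        workers.emplace_back([&, w, begin, end] {
            voteRange(edges, begin, end, width, height, minR, maxR, partial[w]);
        });
    }
    for (std::thread& t : workers)
        t.join();

    // Merge step: costs numWorkers * width * height additions.
    std::vector<int> accum(cells, 0);
    for (const std::vector<int>& p : partial)
        for (std::size_t i = 0; i < cells; ++i)
            accum[i] += p[i];
    return accum;
}
```

Whether this would help presumably depends on how much time the voting takes relative to the merge step and to the rest of HoughCircles (edge detection, finding the centers), which this sketch does not touch.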