Just linking to opencv ruins thread-concurrency!?
In a small program demonstrating C++ threads, I noticed that linking with openCV ruins the concurrency of the threads. NOTE: the demo program doesn't even use openCV! I stumbled on this problem when I observed that my own application, which does use openCV, did not show the expected concurrency. It turned out that linking with openCV was the problem.
In the sample code (file main.cpp given below) three threads are launched, each doing the same calculation. From the same source, I create two executables, called 'mttok' and 'mttno', as follows:
g++ -o mttok -O3 main.cpp -lpthread
g++ -o mttno -O3 main.cpp -lpthread -L/usr/local/opencv/lib64 -lopencv_core
When I run the first executable with the gnu 'time' command I get:
=> time ./mttok
All threads are running...
result1: 5e+19
result2: 5e+19
result3: 5e+19
27.005u 0.003s 0:09.00 300.0% 0+0k 0+0io 0pf+0w
The third field is the elapsed time (9 secs), whereas the fourth field (300%) is the CPU usage as a percentage of the elapsed time, i.e. (user + system time) / elapsed time. This clearly shows the three threads running concurrently. The same is seen in an applet on my desktop visualising the CPU activity: three bars corresponding to 3 'CPUs' climb to ~100%.
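The same ratio can be measured from inside a program. The following is a minimal sketch, not part of the original demo: the names `Timing`, `busy_sum` and `time_threads` are made up for illustration. It assumes Linux with glibc, where `std::clock()` reports CPU time summed over all threads of the process (on some other platforms it behaves differently).

```cpp
#include <chrono>
#include <ctime>
#include <thread>
#include <vector>

// Hypothetical result type: CPU time, elapsed time, and a checksum of the work.
struct Timing
{
    double cpu_seconds;
    double wall_seconds;
    double checksum;
};

// Same kind of busy loop as MTTest::foo in the demo program.
double busy_sum(unsigned long n)
{
    double s = 0;
    for (unsigned long u = 0; u < n; u++)
        s += u;
    return s;
}

// Runs `nthreads` copies of busy_sum(n) concurrently and times them.
Timing time_threads(unsigned nthreads, unsigned long n)
{
    std::vector<double> results(nthreads, 0.0);
    std::vector<std::thread> workers;

    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    for (unsigned i = 0; i < nthreads; i++)
        workers.emplace_back([&results, i, n] { results[i] = busy_sum(n); });
    for (auto& t : workers)
        t.join();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    double checksum = 0;
    for (double r : results)
        checksum += r;

    const double cpu = double(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    const double wall = std::chrono::duration<double>(wall_end - wall_start).count();
    return {cpu, wall, checksum};
}
```

With three threads and at least three free cores, `cpu_seconds / wall_seconds` would be expected to come out close to 3, corresponding to the 300% above. A value close to 1 would mean the threads took turns on a single core.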
The other executable (linked with opencv) gives
=> time ./mttno
All threads are running...
result1: 5e+19
result2: 5e+19
result3: 5e+19
26.690u 0.203s 0:26.45 101.6% 0+0k 43648+0io 8pf+0w
Note the ~3x larger elapsed time (now 26 secs) and the CPU-percentage (101%). The CPU activity shows only one bar climbing to 100%.
I have tried this both with and without TBB, and both with and without OpenMP. The results are the same. The source code is a recent clone of the git repository (4.1.2-dev), but I saw the same phenomenon with the precompiled version of openCV that comes with SUSE Leap 15.1, i.e. opencv 3.3. Who can explain this behaviour and suggest what can be done to keep proper concurrent behaviour?
I have asked this question on Stack Overflow, with more detail and the source code, but this did not lead to a solution. Perhaps somebody in the openCV community can help?
Here follows the code:
=== main.cpp ===
#include <functional>  // std::ref
#include <iostream>
#include <memory>      // std::unique_ptr
#include <thread>
const unsigned long NMAX=10000000000;
class MTTest
{
public:
void foo( double& r )
{
double s = 0;
for (unsigned long u=0; u<NMAX; u++)
{
s += u;
}
r = s;
}
};
int main()
{
double s1, s2, s3;
std::unique_ptr<MTTest> ptr1( new MTTest );
std::unique_ptr<MTTest> ptr2( new MTTest );
std::unique_ptr<MTTest> ptr3( new MTTest );
std::thread t1( &MTTest::foo, ptr1.get(), std::ref(s1) );
std::thread t2( &MTTest::foo, ptr2.get(), std::ref(s2) );
std::thread t3( &MTTest::foo, ptr3.get(), std::ref(s3) );
std::cout << "All threads are running..." << std::endl;
// synchronize threads:
t1.join();
t2.join();
t3.join();
std::cout << "result1: " << s1 << std::endl;
std::cout << "result2: " << s2 << std::endl;
std::cout << "result3: " << s3 << std ...
I have no idea, but if linking produces funny symptoms, I'd try changing the linking order, that is, putting -lpthread last in the command.
@mvuori Good suggestion. I just put the pthread library at the end. It does not change the story.
unfair comparison, since you include cache warmups, opencl precompilation and such things in your measurement
@berak, Why unfair? The program does not use opencv, so I would expect the impact of linking against opencv should not have any impact at all. But it does.
Perhaps somebody would be so kind to repeat my steps above to see if the same effect is there. It takes less than 5 minutes....
there is (a lot of !) opencv code running on startup of your program, even if you don't call any code explicitly
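To illustrate the general mechanism (code running before `main` without any explicit call), here is a minimal sketch. `StartupHook` and `startup_hook_ran` are hypothetical names and nothing here is taken from openCV; it only shows that a global object's constructor runs at program start. A shared library linked into the executable can contain such objects too.

```cpp
// Hypothetical stand-in for a library's startup code.
struct StartupHook
{
    // Runs during dynamic initialization, i.e. before main() is entered.
    StartupHook() { ran = true; }
    static bool ran;
};

// Constant-initialized to false before any constructor runs.
bool StartupHook::ran = false;

// Global object: constructing it is the "code running on startup".
StartupHook hook;

// Reports whether the constructor has already run.
bool startup_hook_ran()
{
    return StartupHook::ran;
}
```

Calling `startup_hook_ran()` as the first statement of `main` returns true, although `main` never constructed anything itself.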
@berak. Well, fair enough. But how does that prevent the concurrency of the 3 threads that are started in the demo program? This is not about losing some performance when openCV is linked in, but about losing thread concurrency when openCV is linked in. This is a major penalty, and I find it hard to believe that this is normal behaviour!
Regards Bw
can you explain ?
Have a look at the beginning of my posting. The code launches three different threads. Without linking with opencv, the program runs 3x faster (elapsed time) than when opencv has been linked in. Moreover, the CPU usage points to the same conclusion: 300% (no opencv) vs 100% (with opencv). Finally, the same visual feedback is given by a CPU monitor: without opencv, three bars rise simultaneously to 100% usage, corresponding to three threads running on three CPUs. With opencv linked in, only one CPU meter rises (and the program takes three times as long to complete).
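Nothing in this thread confirms a cause. One hypothesis that would fit the symptoms (total CPU usage stuck near 100%) is that some code running at load time narrows the CPU affinity of the main thread. On Linux, new threads inherit the affinity of the thread that creates them. The sketch below is a way to test that assumption, assuming Linux/glibc and g++ (which defines `_GNU_SOURCE` by default). The function names are made up for illustration.

```cpp
#include <sched.h>   // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT (Linux/glibc)
#include <thread>

// Number of CPUs the calling thread is allowed to run on, or -1 on failure.
int allowed_cpu_count()
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // pid 0 means "the calling thread".
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}

// Same query, made from a freshly started thread, which inherits
// the affinity mask of the thread that created it.
int allowed_cpu_count_in_new_thread()
{
    int count = -1;
    std::thread t([&count] { count = allowed_cpu_count(); });
    t.join();
    return count;
}
```

Printing `allowed_cpu_count()` at the top of `main` in both executables would show whether they start with the same set of CPUs. If the executable linked with opencv reported 1 and the other reported the number of cores, that would support the affinity hypothesis. If both reported the same number, the cause lies elsewhere.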