0

warpAffine execution time varying

asked Aug 6 '14

Plouf

Hello,

I'm working on global motion estimation in videos, and I noticed that the execution time of the warpAffine function varies a lot (same input image, same transformation). Here's a small test:

Mat im = imread("image_test.png");
Mat wavimg = Mat(im.rows,im.cols,CV_8UC1);
Mat mapMatrix = Mat(2,3,CV_32F);
mapMatrix.at<float>(0,0) = 1.05;
mapMatrix.at<float>(0,1) = 0.0001;
mapMatrix.at<float>(0,2) = 5.3;
mapMatrix.at<float>(1,0) = -0.0001;
mapMatrix.at<float>(1,1) = 1.05;
mapMatrix.at<float>(1,2) = -0.8;

double tic, toc, tictoc;
for (int i=0; i<1000; i++){
    tic = (double)cvGetTickCount();
    warpAffine(im,wavimg,mapMatrix,wavimg.size(),CV_INTER_LINEAR+WARP_INVERSE_MAP);
    toc = (double)cvGetTickCount();
    tictoc = (toc-tic)/(1000*(double)cvGetTickFrequency());
    printf("time: %fms\n",(float)tictoc);
}
return 0;

Here are the execution times for 300 samples. These times are mainly around 0.8 ms, but some of them are higher (up to 1.4 ms). I don't understand why, and I'd like to know the reason. My configuration: Windows 7 Pro, Visual Studio Express 2013, OpenCV 2.4.8, Intel Core i7-4700MQ CPU @ 2.4 GHz, 64-bit.

Could anybody help me, please?

Regards,

[Chart: warpAffine execution time for 300 samples]


2 answers

1

answered Aug 12 '14

Plouf

I have solved my problem. I had compiled OpenCV with the WITH_OPENMP option, and my computer was using 8 threads. I forced it to use 1 thread (with the line omp_set_num_threads(1);), and execution is now both faster and stable. In the chart below, I call warpAffine 1000 times with one thread (blue dots) and with 8 threads (red dots). With 8 threads, the execution time occasionally explodes (the red dots at 5 ms are actually higher, around 50 ms; I clipped them to 5 ms to keep the chart readable). Since I call warpAffine several times per frame, along with some other optimized OpenCV functions, there were big jumps in execution time. Now the execution time per frame is stable (±1 ms) and about half of what it was. Using more threads is less efficient here, probably because of the cost of moving data between threads.

[Chart: warpAffine execution time over 1000 calls, one thread (blue dots) vs. 8 threads (red dots)]
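For anyone who wants to try the same fix, here is a minimal sketch of pinning OpenMP to one thread before the hot loop. Note the assumptions: brighten is a hypothetical stand-in for an OpenMP-parallel image routine (it is not OpenCV code), limitToOneThread is just an illustrative helper name, and the _OPENMP guard is only there so the snippet still builds when OpenMP is disabled. OpenCV also has cv::setNumThreads, which may be an alternative depending on how the library was built.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Limit OpenMP to a single thread and return the thread count now in effect.
// Without OpenMP support the code runs serially anyway, so this reports 1.
int limitToOneThread() {
#ifdef _OPENMP
    omp_set_num_threads(1);
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical stand-in for an OpenMP-parallel image routine (not OpenCV):
// brightens every pixel of an 8-bit buffer, saturating at 255.
void brighten(std::vector<unsigned char>& pixels, int delta) {
    assert(delta >= 0);
    const int n = static_cast<int>(pixels.size());
#pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        const int v = pixels[i] + delta;
        pixels[i] = static_cast<unsigned char>(v > 255 ? 255 : v);
    }
}
```

Call limitToOneThread() once at startup, before the per-frame loop, so that every later parallel region runs on a single thread.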

0

answered Aug 6 '14

boaz001

Issues like this are almost impossible to trace back to a single cause. Simplified explanation: the operating system controls how processes and threads are scheduled and how the processor(s) are used.

I think it is more important to solve problems than to find causes. So, is this extra delay causing you trouble performance-wise?


Comments

Hey boaz001, thanks for your answer. I want to estimate global motion in real time and still have time left to do other things. My video is 720x576 at 30 fps, and I'd like to estimate global motion in less than 15 ms. I call warpAffine 10 times per frame, so a stable execution time is important for my application. I also tried calcOpticalFlowPyrLK on a predefined set of points and noticed the same irregularities.

Plouf (Aug 7 '14)
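To make the timing budget concrete, here is a small sketch of the arithmetic, using only figures quoted in this thread (10 calls per frame, a 15 ms target, about 0.8 ms per call typically and up to 1.4 ms). The function names frameCostMs and fitsBudget are made up for illustration.

```cpp
#include <cassert>

// Total time (ms) spent in `calls` calls that each take `perCallMs`.
double frameCostMs(int calls, double perCallMs) {
    assert(calls >= 0 && perCallMs >= 0.0);
    return calls * perCallMs;
}

// True when the per-frame cost stays within the time budget.
bool fitsBudget(double costMs, double budgetMs) {
    return costMs <= budgetMs;
}
```

With these numbers, 10 calls at 0.8 ms cost about 8 ms and 10 calls at 1.4 ms cost about 14 ms, so even the slower case fits in 15 ms with little room to spare. A single spike of roughly 50 ms (as reported for the 8-thread runs) is enough to exceed the budget on its own.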

To understand the problem, rather than basically asking the same question again, you must understand that you simply cannot guarantee timing on a software platform whose hardware is controlled by an operating system. There is no stable execution delay! (Although I would call warpAffine running in ~0.9 ms 99% of the time very stable.)

Problems like this can be solved in many ways. To name a few:
- Downscaling (use a lower resolution)
- Lower framerate
- Faster hardware
- Improve or simplify the algorithm
- Improve the code
- Not solve it: accept lag or missing frames
- ...many more...

boaz001 (Aug 8 '14)
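One way to check a claim like "99% of calls run in ~0.9 ms" is to summarize the samples instead of reading a thousand printed lines. The following is a rough sketch using only the C++ standard library; TimingSummary, timeCalls and summarize are illustrative names, and the nearest-rank percentile is just one of several common definitions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

struct TimingSummary { double minMs, medianMs, p99Ms, maxMs; };

// Run `work` `runs` times and return the duration of each call in milliseconds.
template <typename Work>
std::vector<double> timeCalls(Work work, int runs) {
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return samples;
}

// Summarize the samples with nearest-rank percentiles on a sorted copy.
TimingSummary summarize(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    // Nearest rank: index of the smallest sample with at least p*n samples <= it.
    auto rank = [n](double p) {
        return static_cast<std::size_t>(std::ceil(p * n)) - 1;
    };
    return {samples.front(), samples[rank(0.50)], samples[rank(0.99)],
            samples.back()};
}
```

To use it, wrap the call being measured in a lambda, for example timeCalls([&] { /* the warpAffine call */ }, 1000), and pass the result to summarize. A large gap between p99Ms and maxMs points to rare spikes rather than general slowness.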

Stats

Asked: Aug 6 '14

Seen: 2,044 times

Last updated: Aug 12 '14