Improve Runtime of a Function

Hi, I am using a function to find the difference between two images. It takes two 8-bit grayscale images, converts them to CV_32FC1, and subtracts one from the other. Here is the function I am using:

cv::Mat calculateDiff(const cv::Mat &image_one, const cv::Mat &image_two)
{
    cv::Mat im1, im2, im_dest;
    image_one.convertTo(im1, CV_32FC1);
    image_two.convertTo(im2, CV_32FC1);
    im1 /= 2.f;
    im1 += 128.f;
    im2 /= 2.f;
    cv::subtract(im1, im2, im_dest);
    im_dest.convertTo(im_dest, CV_8UC1);

    return im_dest;
}


I have measured the run-time of each major step individually:

1. convertTo steps
2. divide by 2, add 128
3. subtract() function
4. convertTo()

When I call this function with images of size 9000 x 6000, I get a run-time of about 900 msec, but each individual step takes a lot less time. Here's one example:

1. Step 1 time: 64 msec
2. Step 2 time: 76 msec
3. Step 3 time: 51 msec
4. Step 4 time: 22 msec

When I call the function as a whole, its measured runtime is 905 msec.

The function call looks like this:

cv::Mat diff_image;
diff_image = calculateDiff(input_one, input_two);


I measure the runtime using cv::getTickCount() and cv::getTickFrequency().

Why is the function's runtime so large when the individual steps do not take that long? How can I improve the runtime? Kindly help.

Thanks!

I think your program uses OpenCL. The OpenCL source code is compiled during the first call. If you want to measure the real time, don't measure the first call; measure the second one.
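A minimal sketch of the "skip the first call" advice, using only the C++ standard library. The helper name meanMsAfterWarmup is hypothetical (it is not an OpenCV function): it runs the work once untimed, then averages the wall-clock time over a number of timed calls.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of OpenCV): calls `work` once as an untimed
// warm-up, then returns the mean wall-clock time in milliseconds over `runs`
// timed calls.
template <typename Work>
double meanMsAfterWarmup(Work&& work, std::size_t runs)
{
    using Clock = std::chrono::steady_clock;

    work();  // warm-up: absorbs one-time costs such as kernel compilation
    if (runs == 0) {
        return 0.0;
    }

    const auto start = Clock::now();
    for (std::size_t i = 0; i < runs; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / static_cast<double>(runs);
}
```

With the function from the question, a call could look like meanMsAfterWarmup([&] { diff_image = calculateDiff(input_one, input_two); }, 10);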

• How do you measure the execution time?
• Can you try cerr << cv::getBuildInformation() << endl; to find out which optimizations were built in?

Remember that cv::getTickCount() only measures CPU ticks, not operations on the GPU (if OpenCL or similar is enabled), I/O (e.g. imread), or sleep().

Got it. So what should I use if I want to measure the execution time of functions that involve I/O operations, say imread() and/or imwrite()? I have functions that may or may not have I/O operations, and I need to find their execution time.

You can speed up your program by using multiplication instead of division where possible. Furthermore, you can specify the output type inside some operations.

cv::Mat calculateDiff(const cv::Mat &image_one, const cv::Mat &image_two)
{
    cv::Mat im1, im2, im_dest;
    image_one.convertTo(im1, CV_32FC1, 0.5, 128);
    image_two.convertTo(im2, CV_32FC1, 0.5, 0);
    cv::subtract(im1, im2, im_dest, cv::noArray(), CV_8U);  // Not sure if this works here as expected
    // im_dest.convertTo(im_dest, CV_8UC1);

    return im_dest;
}
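To make the arithmetic explicit, here is a stand-alone sketch of the same per-pixel computation, a/2 + 128 - b/2, done in a single pass over raw 8-bit buffers. It uses only the standard library; the function name diffSinglePass is hypothetical, and the final conversion is modeled as round-to-nearest (ties to even) with clamping to [0, 255], which is my assumption about how OpenCV saturates floats to 8-bit. Whether a hand-written loop like this beats OpenCV's vectorized routines would have to be measured.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-alone model (not an OpenCV API): computes
// a/2 + 128 - b/2 for every pixel in one pass, without temporary float images.
std::vector<std::uint8_t> diffSinglePass(const std::vector<std::uint8_t>& a,
                                         const std::vector<std::uint8_t>& b)
{
    if (a.size() != b.size()) {
        throw std::invalid_argument("image sizes differ");
    }

    std::vector<std::uint8_t> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        // All intermediate values are multiples of 0.5, so float is exact here.
        const float v = 0.5f * a[i] + 128.0f - 0.5f * b[i];

        // Round to nearest (ties to even in the default rounding mode),
        // then clamp to the 8-bit range.
        long r = std::lrint(v);
        if (r < 0) r = 0;
        if (r > 255) r = 255;
        out[i] = static_cast<std::uint8_t>(r);
    }
    return out;
}
```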


Aside from this, it seems that in OpenCV 3 the OpenCL code is built on the first function call, even if you use cv::Mat. To avoid this, #include <opencv2/core/ocl.hpp> and call cv::ocl::setUseOpenCL(false) before the function calls.

Off the top of my head: perhaps this does the same job a bit faster if your input is 8-bit:

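cv::Mat calculateDiff(const cv::Mat &image_one, const cv::Mat &image_two)
{
    cv::Mat im1, im_dest;
    // Offset of 256 so that the final * 0.5 yields the +128 of the original function.
    image_one.convertTo(im1, CV_16S, 1, 256);
    // Mixed input types (CV_16S and CV_8U) require an explicit output type.
    cv::subtract(im1, image_two, im_dest, cv::noArray(), CV_16S);
    // Halve and saturate back to 8-bit.
    im_dest.convertTo(im_dest, CV_8U, 0.5);

    return im_dest;
}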
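Whether a 16-bit variant reproduces the float version exactly depends on the offset added before the subtraction. The float version computes (a - b)/2 + 128, while the 16-bit version computes (a + offset - b)/2, so an offset of 256 corresponds to +128 and an offset of 255 corresponds to +127.5. The sketch below checks all 65536 input pairs with the standard library only. All names are hypothetical, and the final 8-bit conversion is modeled as round-to-nearest (ties to even) with saturation, which is an assumption about OpenCV's behavior.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Round to nearest (ties to even in the default rounding mode), clamp to [0, 255].
static std::uint8_t roundAndSaturate(float v)
{
    long r = std::lrint(v);
    if (r < 0) r = 0;
    if (r > 255) r = 255;
    return static_cast<std::uint8_t>(r);
}

// Float path from the question: a/2 + 128 - b/2.
static std::uint8_t floatPath(int a, int b)
{
    return roundAndSaturate(0.5f * a + 128.0f - 0.5f * b);
}

// 16-bit path: (a + offset - b) held in a signed 16-bit value, then halved.
static std::uint8_t int16Path(int a, int b, int offset)
{
    const std::int16_t d = static_cast<std::int16_t>(a + offset - b);
    return roundAndSaturate(0.5f * d);
}

struct Mismatch
{
    int count;         // number of (a, b) pairs where the two paths differ
    int max_abs_diff;  // largest difference in gray levels
};

// Compares both paths over every pair of 8-bit inputs for a given offset.
Mismatch compareAllPairs(int offset)
{
    Mismatch m{0, 0};
    for (int a = 0; a <= 255; ++a) {
        for (int b = 0; b <= 255; ++b) {
            const int diff = std::abs(static_cast<int>(floatPath(a, b)) -
                                      static_cast<int>(int16Path(a, b, offset)));
            if (diff != 0) ++m.count;
            if (diff > m.max_abs_diff) m.max_abs_diff = diff;
        }
    }
    return m;
}
```

Under this model, compareAllPairs(256) reports no mismatches, while compareAllPairs(255) reports pixels that differ by one gray level.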


I use cv::getTickCount()

Update:

I figured out the problem. The function was being called inside another function and there was a conditional cv::imwrite() in the function. That's why I was getting the problem. The conditional part comes from another section of the program. I've fixed the part and it's working ok.

Thanks everyone!

