What are the pros and cons of pyrDown? How do pyrDown/pyrUp and resize differ in quality and performance? I wrote the code below to test this. I am using a 2557x3993 RGB image obtained from a scanner (the image contains mainly black-and-white pixels, plus a few lines of small red text).
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <iostream>
using namespace cv;
int main( int argc, char** argv )
{
    /// General instructions
    printf( "\n Zoom In-Out demo \n " );
    printf( "------------------ \n" );
    printf( " * [+] -> Zoom in \n" );
    printf( " * [-] -> Zoom out \n" );
    printf( " * [ESC] -> Close program \n \n" );
    /// String literals are const, so the window names must be const char*
    const char* window_name_1 = "Pyramids Demo";
    const char* window_name_2 = "Resize Demo";
    cv::Mat src, dst_pyr, tmp_pyr;
    cv::Mat dst_resize, tmp_resize;
    /// Test image - Make sure it's divisible by 2^{n}
    src = cv::imread( "../../data/strana210.jpg" );
    if( !src.data )
    {
        printf(" No data! -- Exiting the program \n");
        return -1;
    }
    tmp_pyr = src;
    dst_pyr = tmp_pyr;
    dst_resize = src;
    tmp_resize = dst_resize;
    /// Create windows
    cv::namedWindow( window_name_1, CV_WINDOW_AUTOSIZE );
    cv::imshow( window_name_1, dst_pyr );
    cv::namedWindow( window_name_2, CV_WINDOW_AUTOSIZE );
    cv::imshow( window_name_2, dst_resize );
    double t1, t2;
    /// Loop
    while( true )
    {
        int c;
        c = cv::waitKey(10);
        if( (char)c == 27 )
        { break; }
        if( (char)c == '+' )
        {
            /// Time pyrUp (milliseconds)
            t1 = (double)getTickCount();
            cv::pyrUp( tmp_pyr, dst_pyr, Size( tmp_pyr.cols*2, tmp_pyr.rows*2 ) );
            t1 = 1000*((double)getTickCount()-t1) / getTickFrequency() ;
            /// Time resize with its default interpolation (milliseconds)
            t2 = (double)getTickCount();
            cv::resize( tmp_resize, dst_resize, Size( tmp_resize.cols*2, tmp_resize.rows*2 ) );
            t2 = 1000*((double)getTickCount()-t2) / getTickFrequency() ;
            printf( "** Zoom In: Image x 2 \n" );
            std::cout << "pyrUp: " << t1 << "ms" << std::endl;
            std::cout << "resize up: " << t2 << "ms" << std::endl;
            std::cout << std::endl;
        }
        else if( (char)c == '-' )
        {
            /// Time pyrDown (milliseconds)
            t1 = (double)getTickCount();
            cv::pyrDown( tmp_pyr, dst_pyr, Size( tmp_pyr.cols/2, tmp_pyr.rows/2 ) );
            t1 = 1000*((double)getTickCount()-t1) / getTickFrequency() ;
            /// Time resize with its default interpolation (milliseconds)
            t2 = (double)getTickCount();
            cv::resize( tmp_resize, dst_resize, Size( tmp_resize.cols/2, tmp_resize.rows/2 ) );
            t2 = 1000*((double)getTickCount()-t2) / getTickFrequency() ;
            printf( "** Zoom Out: Image / 2 \n" );
            std::cout << "pyrDown: " << t1 << "ms" << std::endl;
            std::cout << "resize down: " << t2 << "ms" << std::endl;
            std::cout << std::endl;
        }
        cv::imshow( window_name_1, dst_pyr );
        cv::imshow( window_name_2, dst_resize );
        /// The result of this step is the input of the next one
        tmp_pyr = dst_pyr;
        tmp_resize = dst_resize;
    }
    return 0;
}
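Each timing in the program above comes from a single call, so the numbers can be noisy (the first call in particular may include warm-up cost). A minimal sketch of a more robust measurement, assuming it is acceptable to repeat the operation several times and report the median; `median_ms` is a hypothetical helper, not an OpenCV function:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Runs `op` `runs` times and returns the median wall-clock time in milliseconds.
// `median_ms` is a hypothetical helper introduced for illustration only.
template <typename Op>
double median_ms(Op op, int runs = 9)
{
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(runs);
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        op();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    // nth_element places the median at the middle position without a full sort.
    std::nth_element(samples.begin(),
                     samples.begin() + samples.size() / 2,
                     samples.end());
    return samples[samples.size() / 2];
}
```

It could be called as `median_ms([&]{ cv::pyrDown( tmp_pyr, dst_pyr ); })`, and likewise for `cv::resize`, so that both functions are measured under the same conditions.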
Now I press "-" a few times, and here are the results:
Debug version:
Zoom In-Out demo
** Zoom Out: Image / 2 pyrDown: 113.24ms resize down: 56.1988ms
** Zoom Out: Image / 2 pyrDown: 41.836ms resize down: 8.95756ms
** Zoom Out: Image / 2 pyrDown: 18.0705ms resize down: 12.6991ms
** Zoom Out: Image / 2 pyrDown: 4.68942ms resize down: 6.21811ms
** Zoom Out: Image / 2 pyrDown: 1.23563ms resize down: 1.62395ms
Release version:
Zoom In-Out demo
** Zoom Out: Image / 2 pyrDown: 119.267ms resize down: 59.2019ms
** Zoom Out: Image / 2 pyrDown: 29.2641ms resize down: 8.74413ms
** Zoom Out: Image / 2 pyrDown: 18.1238ms resize down: 12.7287ms
** Zoom Out: Image / 2 pyrDown: 4.63131ms resize down: 6.25275ms
** Zoom Out: Image / 2 pyrDown: 1.2686ms resize down: 1.6301ms
Zoom In with Release version:
** Zoom In: Image x 2 pyrUp: 1.3809ms resize up: 2.2338ms
** Zoom In: Image x 2 pyrUp: 5.29369ms resize up: 5.08109ms
** Zoom In: Image x 2 pyrUp: 20.6029ms resize up: 9.37577ms
** Zoom In: Image x 2 pyrUp: 83.8785ms resize up: 30.1823ms
Resize using divisor/multiplier 4
** Zoom Out: Image / 4 resize down: 22.3724ms
** Zoom Out: Image / 4 resize down: 3.78372ms
** Zoom Out: Image / 4 resize down: 0.349206ms
(The same operation does not work with pyrDown/pyrUp: the program crashes.)
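A likely explanation for the crash, sketched under the assumption that the documented pyrDown size constraint applies here: the destination size must satisfy |dstsize.width*2 - src.cols| <= 2 (and the same for the height), so pyrDown can only halve an image per call. The helpers below are hypothetical and only reproduce that arithmetic; they do not call OpenCV.

```cpp
#include <cassert>
#include <cstdlib>

// Documented pyrDown constraint for one dimension: |dst * 2 - src| <= 2.
// `pyrDownSizeOk` is a hypothetical helper, not an OpenCV function.
bool pyrDownSizeOk(int src, int dst)
{
    assert(src > 0 && dst > 0);
    return std::abs(dst * 2 - src) <= 2;
}

// Output size pyrDown uses by default when no dstsize is given: (src + 1) / 2.
int defaultPyrDownSize(int src)
{
    return (src + 1) / 2;
}

// Size after `levels` successive halvings, e.g. levels = 2 for a factor of 4.
int sizeAfterLevels(int src, int levels)
{
    for (int i = 0; i < levels; ++i)
        src = defaultPyrDownSize(src);
    return src;
}
```

For the 2557-pixel width, `pyrDownSizeOk(2557, 2557/4)` is false while `pyrDownSizeOk(2557, 2557/2)` is true, so a factor of 4 would need two pyrDown calls in a row (giving a width of 640) instead of one call with a quarter-size destination.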
My conclusion: the tests show that pyrDown and pyrUp are much slower than resize on large images. The Gaussian pyramid only wins once the image is down to about 250x250 px.
Image quality: the Gaussian pyramid blurs more heavily, whereas resize changes the image only gently, so the title in my image is still readable (after zooming out 4 times and zooming in once). Resize with divisor 4 is faster but gives lower quality. The result behind these last two sentences is shown here:
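The difference in blur can be illustrated on a 1-D signal. This is a simplified sketch, not OpenCV's implementation: it assumes that pyrDown smooths with the kernel [1 4 6 4 1]/16 before dropping every second sample, and that resize with its default bilinear interpolation reduces to averaging neighbouring pairs when the factor is exactly 2. All function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Clamp an index into [0, n-1], i.e. replicate the border sample.
// (OpenCV reflects the border by default; replication keeps the sketch short.)
static int clampIndex(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

// Halve a 1-D signal the way one Gaussian pyramid level does:
// smooth with [1 4 6 4 1] / 16, then keep every second sample.
std::vector<double> halveGaussian(const std::vector<double>& in)
{
    assert(!in.empty());
    static const double k[5] = {1, 4, 6, 4, 1};
    const int n = static_cast<int>(in.size());
    std::vector<double> out((n + 1) / 2);
    for (int o = 0; o < static_cast<int>(out.size()); ++o) {
        double acc = 0.0;
        for (int t = -2; t <= 2; ++t)
            acc += k[t + 2] * in[clampIndex(2 * o + t, n)];
        out[o] = acc / 16.0;
    }
    return out;
}

// Halve by averaging neighbouring pairs (assumed bilinear behaviour at exactly 2x).
std::vector<double> halveBox(const std::vector<double>& in)
{
    assert(!in.empty());
    const int n = static_cast<int>(in.size());
    std::vector<double> out((n + 1) / 2);
    for (int o = 0; o < static_cast<int>(out.size()); ++o)
        out[o] = 0.5 * (in[2 * o] + in[clampIndex(2 * o + 1, n)]);
    return out;
}
```

For a thin line such as `{0, 0, 0, 0, 1, 0, 0, 0}`, `halveGaussian` spreads the line over three output samples (0.0625, 0.375, 0.0625), while `halveBox` keeps it in a single sample of 0.5. That is consistent with small text staying more readable after resize than after pyrDown.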
My question is about how these methods differ in the blurring they apply. Why use Gaussian pyramids, and when are they useful?