How can I get a better execution time for the Sobel algorithm using OpenMP?
Hello everyone,
I tried to parallelize the Sobel algorithm using OpenMP. The results are good, but I would like to improve them further.
Initial time for the algorithm (sequential code): 1.49 s. With OpenMP: 0.523 s.
Can you tell me what improvements I can make to the code to get a better time? Thank you in advance and have a nice day!
This is my code:
// ------- C/C++ includes ------
#include <iostream>
#include <stdio.h>
#include <omp.h>
#include <time.h>
// ------ OpenCV includes ------
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/opencv.hpp>
using namespace std;
using namespace cv;
// 3x3 Sobel kernels (x and y directions)
int x[3][3];
int y[3][3];
/*----- OpenMP -----*/
int num_of_threads, i ,j;
double start, end;
int main( int argc, char** argv )
{
// check the argument count before touching argv[1]
if (argc != 2)
{
cout << "Usage: ./sobel imagePath" << endl;
return -1;
}
Mat initialImage = imread(argv[1], 0); // read as a gray-scale image
if (!initialImage.data)
{
cout << "No image data" << endl;
return -1;
}
cout << "Image OK!" << endl;
Mat finalImage = Mat::zeros(initialImage.size(), initialImage.type());
if (finalImage.type() == initialImage.type() )
{
cout << "YES" << endl;
}
//x direction
x[0][0] = -1; x[0][1] = 0; x[0][2] = 1;
x[1][0] = -2; x[1][1] = 0; x[1][2] = 2;
x[2][0] = -1; x[2][1] = 0; x[2][2] = 1;
//y direction
y[0][0] = -1; y[0][1] = -2; y[0][2] = -1;
y[1][0] = 0; y[1][1] = 0; y[1][2] = 0;
y[2][0] = 1; y[2][1] = 2; y[2][2] = 1;
num_of_threads = 8;//omp_get_num_procs();
omp_set_num_threads(num_of_threads);
start = omp_get_wtime();
for(j = 0; j < initialImage.rows - 2; j++ ){
#pragma omp parallel for private(i)
for(i = 0; i < initialImage.cols -2; i++ ){
// apply kernel in x direction
int xValOfPixel =
(x[0][0] * (int)initialImage.at<uchar>(j, i )) + (x[0][1] * (int)initialImage.at<uchar>(j + 1, i )) + (x[0][2] * (int)initialImage.at<uchar>(j + 2, i )) +
(x[1][0] * (int)initialImage.at<uchar>(j, i + 1)) + (x[1][1] * (int)initialImage.at<uchar>(j + 1, i + 1)) + (x[1][2] * (int)initialImage.at<uchar>(j + 2, i + 1)) +
(x[2][0] * (int)initialImage.at<uchar>(j, i + 2)) + (x[2][1] * (int)initialImage.at<uchar>(j + 1, i + 2)) + (x[2][2] * (int)initialImage.at<uchar>(j + 2, i + 2));
// apply kernel in y direction (read from the input image, not the output)
int yValOfPixel =
(y[0][0] * (int)initialImage.at<uchar>(j, i )) + (y[0][1] * (int)initialImage.at<uchar>(j + 1, i )) + (y[0][2] * (int)initialImage.at<uchar>(j + 2, i )) +
(y[1][0] * (int)initialImage.at<uchar>(j, i + 1)) + (y[1][1] * (int)initialImage.at<uchar>(j + 1, i + 1)) + (y[1][2] * (int)initialImage.at<uchar>(j + 2, i + 1)) +
(y[2][0] * (int)initialImage.at<uchar>(j, i + 2)) + (y[2][1] * (int)initialImage.at<uchar>(j + 1, i + 2)) + (y[2][2] * (int)initialImage.at<uchar>(j + 2, i + 2));
int sum = abs(xValOfPixel) + abs(yValOfPixel);
if(sum > 255)
sum = 255;
finalImage.at<uchar>(j, i) = (uchar)sum;
}
}
end = omp_get_wtime();
cout << "Time: " << end - start << endl;
// display the ...
Don't write per-pixel code in the first place. Don't use at(). Don't reinvent the wheel.
Use cv::Sobel, and enable TBB and IPP support at OpenCV build time.
You're reinventing the "flat tyre", and now you're trying to optimize that. Seems somewhat silly.
Have a look at how the OpenCV devs did it!
I don't want to reinvent the wheel. I just want to do a college study of the influence of OpenMP on image processing algorithms. But thank you for your answer!
The thing is, as long as your code looks like that, ANY other optimization (using pointers, loop unrolling, vector intrinsics) might beat your code without OpenMP, so it's all rigged.
I just need an OpenMP optimization for my algorithm; I don't need the best Sobel algorithm that exists. But thank you for your time and involvement.
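For the OpenMP-only part of the question, here is a minimal sketch of the two changes that tend to matter most in a loop like this. In the posted code the pragma sits on the inner (column) loop, so a parallel region is opened once per image row; moving it to the outer (row) loop opens it once for the whole image. The sketch also reads pixels through row pointers instead of per-pixel at<uchar>() calls. It works on a plain row-major 8-bit buffer rather than a cv::Mat so that it compiles without OpenCV; sobelRows is a hypothetical name, not something from this thread. Both gradients are read from the input buffer and the result is written at the centre pixel of each 3x3 window. No speedup figure is claimed; that would have to be measured on your machine.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (an assumption for illustration, not from the thread):
// 3x3 Sobel magnitude |Gx| + |Gy|, clamped to 255, on a row-major 8-bit
// grayscale buffer of size rows * cols. Border pixels of the result stay 0.
std::vector<std::uint8_t> sobelRows(const std::vector<std::uint8_t>& src,
                                    int rows, int cols)
{
    std::vector<std::uint8_t> dst(src.size(), 0);

    // One parallel region for the whole image: each thread gets a block of
    // rows. r and c are declared inside the loops, so they are private to
    // each thread without a private() clause.
    #pragma omp parallel for schedule(static)
    for (int r = 1; r < rows - 1; ++r) {
        // Row pointers replace the per-pixel at<uchar>() calls.
        const std::uint8_t* up   = src.data() + (r - 1) * cols;
        const std::uint8_t* mid  = src.data() + r * cols;
        const std::uint8_t* down = src.data() + (r + 1) * cols;
        std::uint8_t* out = dst.data() + r * cols;

        for (int c = 1; c < cols - 1; ++c) {
            // Horizontal gradient: right column minus left column.
            int gx = (up[c + 1] + 2 * mid[c + 1] + down[c + 1])
                   - (up[c - 1] + 2 * mid[c - 1] + down[c - 1]);
            // Vertical gradient: bottom row minus top row.
            int gy = (down[c - 1] + 2 * down[c] + down[c + 1])
                   - (up[c - 1] + 2 * up[c] + up[c + 1]);
            int sum = std::abs(gx) + std::abs(gy);
            out[c] = static_cast<std::uint8_t>(std::min(sum, 255));
        }
    }
    return dst;
}
```

Build with OpenMP enabled (for example -fopenmp with GCC or Clang); without that flag the pragma is ignored and the same code runs sequentially, which gives a fair baseline for the comparison. With a cv::Mat, the equivalent row pointers can be obtained from initialImage.ptr<uchar>(r) and finalImage.ptr<uchar>(r).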