Try using the scaleAdd function. It has SIMD optimizations built in.
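cv::Mat sum = cv::Mat::zeros( mats[0].rows, mats[0].cols, cvType );
// Accumulate sum += val * mats[m]; scaleAdd computes dst = alpha * src1 + src2.
for( int m = 0; m < static_cast< int >( mats.size() ); m++ )
{
    const Type val = rowVec.at< Type >( m );
    cv::scaleAdd( mats[m], val, sum, sum );
}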
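If you want to sanity-check the output, it helps to remember that each scaleAdd call computes dst = alpha * src1 + src2, so the loop above just builds a weighted sum of the matrices. Here is a minimal sketch of that same arithmetic in plain C++ without OpenCV. The name weightedSum and the flat std::vector layout are my own assumptions for illustration, not part of the OpenCV API, and it assumes every matrix has the same number of elements.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not an OpenCV function): returns
// sum[i] = weights[0] * mats[0][i] + weights[1] * mats[1][i] + ...
// Assumes all inner vectors have the same length.
std::vector<double> weightedSum(const std::vector<std::vector<double>>& mats,
                                const std::vector<double>& weights)
{
    std::vector<double> sum(mats.empty() ? 0 : mats[0].size(), 0.0);
    for (std::size_t m = 0; m < mats.size() && m < weights.size(); ++m)
    {
        // Same update as one scaleAdd call: sum = weights[m] * mats[m] + sum.
        for (std::size_t i = 0; i < sum.size(); ++i)
            sum[i] += weights[m] * mats[m][i];
    }
    return sum;
}
```

Comparing a small case against this, element by element, should tell you whether the scaleAdd loop gives the numbers you expect.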
OK, I ran the benchmarks.
Original method: 6.17536 s
scaleAdd method: 2.76857 s
That's about a 55% reduction in run time (roughly 2.2x faster), whereas the best result from the other answer was a 45% speedup.