# From Fundamental Matrix To Rectified Images

I have stereo photos coming from the same camera and I am trying to use them for 3D reconstruction.

To do that, I extract SURF features and calculate the fundamental matrix. From it I get the essential matrix, which I decompose into a rotation matrix and a translation vector. Finally, I use them to obtain rectified images.

The problem is that it works only with some specific parameters. If I set minHessian to 430, I get pretty nice rectified images. But any other value gives me just a black image or some obviously wrong images.

In all cases, the fundamental matrix seems to be fine (I draw epipolar lines on both the left and right images). However, I cannot say the same about the essential matrix, rotation matrix and translation vector, even though I tried all 4 possible combinations of R and T.

Here is my code. Any help or suggestion would be appreciated. Thanks!



if( !img_1.data || !img_2.data )
{ return -1; }

//-- Step 1: Detect the keypoints using SURF Detector
int minHessian = 430;
SurfFeatureDetector detector( minHessian );
std::vector<KeyPoint> keypoints_1, keypoints_2;
detector.detect( img_1, keypoints_1 );
detector.detect( img_2, keypoints_2 );

//-- Step 2: Calculate descriptors (feature vectors)
SurfDescriptorExtractor extractor;
Mat descriptors_1, descriptors_2;
extractor.compute( img_1, keypoints_1, descriptors_1 );
extractor.compute( img_2, keypoints_2, descriptors_2 );

//-- Step 3: Matching descriptor vectors with a brute force matcher
BFMatcher matcher(NORM_L1, true);
std::vector< DMatch > matches;
matcher.match( descriptors_1, descriptors_2, matches );

//-- Draw matches
Mat img_matches;
drawMatches( img_1, keypoints_1, img_2, keypoints_2, matches, img_matches );
//-- Show detected matches
namedWindow( "Matches", CV_WINDOW_NORMAL );
imshow("Matches", img_matches );
waitKey(0);

//-- Step 4: calculate Fundamental Matrix
vector<Point2f> imgpts1, imgpts2;
for( unsigned int i = 0; i<matches.size(); i++ )
{
    // queryIdx is the "left" image
    imgpts1.push_back(keypoints_1[matches[i].queryIdx].pt);
    // trainIdx is the "right" image
    imgpts2.push_back(keypoints_2[matches[i].trainIdx].pt);
}
Mat F = findFundamentalMat(imgpts1, imgpts2, FM_RANSAC, 0.1, 0.99);

//-- Step 5: calculate Essential Matrix
double data[] = {1189.46, 0.0, 805.49,
                 0.0, 1191.78, 597.44,
                 0.0, 0.0, 1.0}; // Camera Matrix
Mat K(3, 3, CV_64F, data);
Mat_<double> E = K.t() * F * K;

//-- Step 6: calculate Rotation Matrix and Translation Vector
Matx34d P;
//decompose E
SVD svd(E,SVD::MODIFY_A);
Mat svd_u = svd.u;
Mat svd_vt = svd.vt;
Mat svd_w = svd.w;
Matx33d W(0,-1,0,1,0,0,0,0,1);//HZ 9.13
Mat_<double> R = svd_u * Mat(W) * svd_vt; //
Mat_<double> T = svd_u.col(2); //u3

if (!CheckCoherentRotation (R)) {
std::cout<<"resulting rotation is not coherent\n";
return 0;
}

//-- Step 7: Reprojection Matrix and rectification data
Mat R1, R2, P1_, P2_, Q;
Rect validRoi[2];
double dist[] = { -0.03432, 0.05332, -0.00347, 0.00106, 0.00000};
Mat D(1, 5, CV_64F, dist);

stereoRectify(K, D, K, D, img_1.size(), R, T, R1, R2, P1_, P2_, Q, CV_CALIB_ZERO_DISPARITY, 1, img_1.size(),  &validRoi[0], &validRoi[1] );
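`CheckCoherentRotation` is called above but its body is not part of the post. A minimal sketch of what such a helper might verify, assuming its job is only to confirm that R is a proper rotation (orthonormal rows, determinant +1). The name `isProperRotation`, the `Mat3` alias and the tolerance are hypothetical, and plain arrays stand in for `cv::Mat` so the snippet compiles on its own:

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
double det3(const Mat3& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Hypothetical stand-in for CheckCoherentRotation: returns true when
// R * R^T is the identity and det(R) is +1, both within tol.
// A determinant of -1 means R is a reflection, not a rotation.
bool isProperRotation(const Mat3& R, double tol = 1e-6) {
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            double dot = 0.0;
            for (int k = 0; k < 3; ++k) dot += R[i][k] * R[j][k];
            if (std::fabs(dot - (i == j ? 1.0 : 0.0)) > tol) return false;
        }
    }
    return std::fabs(det3(R) - 1.0) <= tol;
}
```

If the determinant comes out as -1, a common remedy is to negate E and decompose again, since E is only defined up to scale and sign.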


What kind of motion does the camera undergo between the two images?

( 2014-01-25 17:18:43 -0500 )

I move the camera horizontally (almost).

( 2014-01-27 02:48:35 -0500 )

There's a degenerate case with the 8-point algorithm if all or most of the points lie on a single planar surface, e.g. a book, table or wall.

One common way to check for this is to find the homography between the point sets and observe the error. If the error is relatively low (e.g. < 3 pixels), then the points lie too close to a plane and are unstable to work with.
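The planarity check described above can be sketched roughly as follows. This assumes the homography H has already been estimated from the matches (in OpenCV that would typically be done with `findHomography` on the matched points); the names `Pt`, `meanTransferError` and `looksPlanar` are illustrative, and the 3 pixel default is just the rule of thumb from the comment:

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };
using Mat3 = std::array<std::array<double, 3>, 3>;

// Map p through the homography H, including the projective division.
Pt applyHomography(const Mat3& H, const Pt& p) {
    const double w = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
    return { (H[0][0] * p.x + H[0][1] * p.y + H[0][2]) / w,
             (H[1][0] * p.x + H[1][1] * p.y + H[1][2]) / w };
}

// Mean distance in pixels between H * left[i] and right[i].
// Assumes left and right have the same length (one entry per match).
double meanTransferError(const Mat3& H, const std::vector<Pt>& left,
                         const std::vector<Pt>& right) {
    double sum = 0.0;
    for (std::size_t i = 0; i < left.size(); ++i) {
        const Pt q = applyHomography(H, left[i]);
        sum += std::hypot(q.x - right[i].x, q.y - right[i].y);
    }
    return left.empty() ? 0.0 : sum / left.size();
}

// Heuristic: if a single homography explains the matches to within a few
// pixels, the scene is close to planar and F is poorly constrained.
bool looksPlanar(const Mat3& H, const std::vector<Pt>& left,
                 const std::vector<Pt>& right, double maxErrorPx = 3.0) {
    return meanTransferError(H, left, right) < maxErrorPx;
}
```

If `looksPlanar` returns true for a pair, the fundamental matrix estimated from those matches should be treated with suspicion.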

( 2014-01-27 03:59:03 -0500 )

Upon looking into your problem in more detail, the source of your rectification errors has become a bit more obvious. Your processing pipeline up until the decomposition of the essential matrix into rotation and translation is mostly correct (see comments further below).

When decomposing the essential matrix into rotation and translation components, there are actually 4 possible configurations, of which only one is valid for a given camera pair. The decomposition is not unique because it allows degenerate configurations where one or both of the cameras are oriented away from the scene they imaged. The solution is to test whether an arbitrary 3D point, derived from a point correspondence between both images, is located in front of each camera. In only one of the four configurations will the 3D point be located in front of both cameras. Assuming the camera matrix P1 of the first camera is set to the identity, P1 = [I | 0], the four cases differ only in the camera matrix P2 of the second camera.
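Written out in the standard Hartley and Zisserman form (the `HZ 9.13` comment in the question's code points at the same section), the four candidates look as follows. This assumes the SVD $E = U\,\mathrm{diag}(1,1,0)\,V^{T}$, with $u_3$ the last column of $U$ and camera matrices in normalized coordinates (intrinsics $K$ factored out):

```latex
P_1 = [\, I \mid 0 \,], \qquad
P_2 \in \left\{
  [\, U W V^{T} \mid +u_3 \,],\;
  [\, U W V^{T} \mid -u_3 \,],\;
  [\, U W^{T} V^{T} \mid +u_3 \,],\;
  [\, U W^{T} V^{T} \mid -u_3 \,]
\right\},
\qquad
W = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
```

The code in the question, as posted, builds only the first of these candidates (`R = svd_u * W * svd_vt`, `T = svd_u.col(2)`).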

Testing whether a given 3D point, derived from a point correspondence in both images, is in front of both cameras for one of the four possible rotation and translation combinations is a bit more involved, because you initially only have the point's projection in each image but lack the point's depth. Let X and X' be the 3D point in the first and second camera's coordinate system respectively, and (ũ, ṽ), (ũ', ṽ') the corresponding projections in normalized image coordinates in the first and second image. A rotation-translation pair, with r1 .. r3 the rows of the rotation matrix R and t the translation, then lets you estimate the point's depths z and z' in the two camera coordinate systems.

From these depths you can determine the associated 3D point's position in each camera coordinate system. If z or z' is negative, you have a degenerate configuration and have to try one of the other three essential matrix decompositions. I have made a gist of this in Python here: https://gist.github.com/jensenb/8668000#file-decompose_essential_matrix-py
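A minimal sketch of this depth test for a single correspondence, assuming the convention X' = R(X − t). Under that assumption the depth in the first camera is z = ((r1 − ũ' r3) · t) / ((r1 − ũ' r3) · x̃) with x̃ = (ũ, ṽ, 1), after which X = z x̃ and z' = r3 · (X − t). If your R and T follow OpenCV's X' = R X + T instead, substitute t = −Rᵀ T. The names `Vec3`, `Mat3` and `inFrontOfBothCameras` are illustrative and are not taken from the linked gist:

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // rows r1, r2, r3

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Cheirality test for one correspondence in normalized image coordinates:
// (u, v) is the projection in the first image, u2 the x-coordinate of the
// projection in the second image (this form of the formula needs only u2).
// Assumed convention: X' = R * (X - t).
bool inFrontOfBothCameras(const Mat3& R, const Vec3& t,
                          double u, double v, double u2) {
    const Vec3 x = {{u, v, 1.0}};
    // a = r1 - u' * r3
    const Vec3 a = {{R[0][0] - u2 * R[2][0],
                     R[0][1] - u2 * R[2][1],
                     R[0][2] - u2 * R[2][2]}};
    const double denom = dot(a, x);
    if (std::fabs(denom) < 1e-12) return false;  // depth not recoverable from this pair
    const double z = dot(a, t) / denom;          // depth in the first camera
    // X - t, with X = z * x
    const Vec3 d = {{z * x[0] - t[0], z * x[1] - t[1], z * x[2] - t[2]}};
    const double z2 = dot(R[2], d);              // depth in the second camera
    return z > 0.0 && z2 > 0.0;
}
```

Pixel coordinates have to be converted to normalized coordinates with the inverse of K first. Because a single noisy match can fail the test, it is usually safer to run it over many inlier matches for each of the four (R, t) candidates and keep the candidate for which most points pass.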

Besides this, you are not performing undistortion (lens distortion correction) prior to feature extraction / matching, which can cause problems down the line, depending on how strong the lens distortion is in your setup.

Estimation of the fundamental matrix depends on having point correspondences between both images that are undistorted (as near to an ideal pinhole camera as possible). Lens distortion is nonlinear, and depending on how close the matched features are to the image center, you will get more or less correct fundamental matrix estimates, which directly affects the quality of the stereo rectification. To summarize, I recommend performing undistortion immediately after reading in your input images.
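As a rough illustration of what undistorting a matched point involves, the sketch below applies the five-coefficient radial/tangential model (coefficients ordered k1, k2, p1, p2, k3, as in the `dist[]` array in the question) and inverts it by fixed-point iteration. In OpenCV you would normally call `undistort` on the images or `undistortPoints` on the keypoints instead; the helper names and the iteration count here are assumptions made for the example:

```cpp
#include <array>

struct Pt { double x, y; };
using Dist5 = std::array<double, 5>;  // k1, k2, p1, p2, k3

// Radial scale factor and tangential offsets of the model at point p.
void distortionTerms(const Pt& p, const Dist5& d,
                     double& radial, double& dx, double& dy) {
    const double r2 = p.x * p.x + p.y * p.y;
    radial = 1.0 + d[0] * r2 + d[1] * r2 * r2 + d[4] * r2 * r2 * r2;
    dx = 2.0 * d[2] * p.x * p.y + d[3] * (r2 + 2.0 * p.x * p.x);
    dy = d[2] * (r2 + 2.0 * p.y * p.y) + 2.0 * d[3] * p.x * p.y;
}

// Forward model: ideal normalized point -> distorted normalized point.
Pt distortNormalized(const Pt& p, const Dist5& d) {
    double radial, dx, dy;
    distortionTerms(p, d, radial, dx, dy);
    return Pt{ p.x * radial + dx, p.y * radial + dy };
}

// Inverse model by fixed-point iteration, starting from the distorted point.
Pt undistortNormalized(const Pt& q, const Dist5& d, int iterations = 10) {
    Pt p = q;
    for (int i = 0; i < iterations; ++i) {
        double radial, dx, dy;
        distortionTerms(p, d, radial, dx, dy);
        p = Pt{ (q.x - dx) / radial, (q.y - dy) / radial };
    }
    return p;
}

// Distorted pixel -> undistorted pixel, using fx, fy, cx, cy from K.
Pt undistortPixel(const Pt& px, double fx, double fy, double cx, double cy,
                  const Dist5& d) {
    const Pt q = { (px.x - cx) / fx, (px.y - cy) / fy };
    const Pt p = undistortNormalized(q, d);
    return Pt{ p.x * fx + cx, p.y * fy + cy };
}
```

With the small coefficients from the question the iteration converges within a few steps; strongly distorted lenses may need more iterations or a different solver.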


Thank you for the answer! I tried it but the problem is still there. :( Another question: does the same procedure work for RGB images?

( 2014-01-27 03:47:21 -0500 )

Estimation of the fundamental matrix only depends on the point correspondences. Your choice of feature detector / descriptor determines whether color (RGB) or intensity information is used for estimating point correspondences. SURF does not use color information, if I recall correctly. Maybe you could try other detectors / descriptors like FREAK, ORB etc.?

( 2014-01-27 04:24:58 -0500 )

Can you maybe post a link to an image with your keypoint matches drawn on it, as well as the resulting incorrect stereo rectified image?

( 2014-01-27 04:29:58 -0500 )

I tried the code with different kinds of features. The best result was with SURF.

Here is one of the cases that I have tested:

left image: http://i42.tinypic.com/10sembn.jpg

Right image: http://i42.tinypic.com/6gkdia.jpg

Fundamental matrix test on the left image: http://i39.tinypic.com/b84x10.jpg

Rectified image : http://i42.tinypic.com/2jczivt.jpg

In this specific case I set minHessian to 250. I also checked all the possible combinations of R and T. The other 3 results were just a black image.

( 2014-01-27 07:23:17 -0500 )

I updated my answer to reflect that you were not checking for degenerate essential matrix decompositions, and I added a gist in Python showing how you can perform the decomposition correctly. The OpenCV stereo rectification still isn't great with the images you posted, but it does not exhibit the symptoms you described above.

( 2014-01-28 09:09:48 -0500 )

Thank you very much, this explains a lot to me!!

( 2014-07-30 09:31:50 -0500 )

It's a very old thread, but did you figure it out with the code provided or in some other way? It seems to be wrong in my honest opinion :/

( 2019-04-29 02:06:28 -0500 )