OpenCV 3: essential matrix and recoverPose

We are currently working on a project using random 3D camera positioning.

We compiled OpenCV 3.0.0 and tried to use the functions findEssentialMat and recoverPose.

In our problem, we have two OpenGL cameras, cam1 and cam2, which observe the same 3D object. cam1 and cam2 have the same intrinsic parameters (resolution, focal length, and principal point). On each capture from those cameras, we are able to identify a set of matched points (8 points per set).

The extrinsic parameters of cam1 are known.

The objective of our work is to determine the extrinsic parameters of cam2.

So we use

float focal = 4.1f;
cv::Point2d pp(0,0);
double prob = 0.999;
double threshold = 3.0;
int method = cv::RANSAC;
cv::Mat essentialMat = cv::findEssentialMat(points1, points2, focal, pp, method, prob, threshold, mask);


then we apply

cv::Mat T;
cv::Mat R;
cv::recoverPose(essentialMat, points1, points2, R, T, focal, pp, mask);


in order to get R, the relative rotation matrix, and T, the relative translation vector.

From that, we tried to apply R and T to the extrinsic parameters of cam1, without success. Could you help us determine how to obtain the translation and orientation of cam2 from cam1, R and T?


@jaystab Did you resolve your issue (I've got the same)?

(2015-03-02 16:37:45 -0500)

One question is whether the 8 correspondences you have are actually noise-free. In theory, only 5 are required (and they should not be coplanar or close to it), but in practice you need many more because the measurements are imprecise.

(2015-07-10 03:37:50 -0500)

How do you track the matched keypoints in the 2nd frame/camera? Do you use cv::calcOpticalFlowPyrLK?

(2019-05-30 11:49:47 -0500)


Here are a few tips that might help you get it working.

1. I'm not sure what units OpenGL uses for the focal length, but OpenCV uses pixels. A focal length of 4 pixels doesn't seem very realistic. The focal length should be on the order of 400 pixels or so. If the focal length is too large or too small, the calculated rotation will be incorrect. Page 12 of the original 5-point algorithm paper gives a good example of this.

2. The findEssentialMat function might not be giving you the right answer. An easy way to verify this is to cross-check it against an essential matrix calculated from the true translation and rotation. The formula for the essential matrix is $E = [t]_\times R$, where $[t]_\times$ is the skew-symmetric cross-product matrix of the translation $t$. The two matrices may differ in scale (and sign), so it may help to normalize them before comparison. If you aren't getting the correct essential matrix, I would recommend tweaking the RANSAC threshold or using LMEDS. For RANSAC you might try a threshold around 0.1 pixels. If you use LMEDS you won't need to worry about tweaking the threshold, since LMEDS minimizes the median error instead of counting inliers. I would also recommend using more than 8 points to reduce the effects of noise and to better distinguish between candidate essential matrices.

3. Keep in mind that OpenCV defines the rotation and translation in the direction the points move, not the direction the camera moves. For example, the coordinates of a point in the second camera frame can be calculated from its coordinates in the first frame as $x_2 = R x_1 + t$. This is convenient for transforming points, but counter-intuitive, since we often think about the direction the camera moves, not the direction the points move. If you need the camera transformation you can simply invert the matrix $[R \mid t]$. See also this post.

4. If you are getting the correct essential matrix but an incorrect rotation and translation, you may need to do more than just call the recoverPose function. The essential matrix has two possible rotations and a translation that is known only up to sign. These are found using the decomposeEssentialMat function. The recoverPose function uses the cheirality constraint (positive depth) to determine which of the 4 possible combinations of rotation and translation is correct. However, it can sometimes give the wrong answer when there is noise. Additionally, page 5 of the original 5-point algorithm paper points out that the cheirality constraint does not resolve the ambiguity if "all visible points are closer to one [camera] than the other."
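Tips 2 and 3 can be sketched without OpenCV, using only the C++ standard library. This is a minimal illustration, assuming the convention $x_2 = R x_1 + t$; the types `Vec3`, `Mat3`, `Pose` and the helpers `essentialFromPose`, `essentialDistance` and `invert` are hypothetical names made up for this example, not part of the OpenCV API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };
struct Pose { Mat3 R; Vec3 t; };  // maps a point as X2 = R * X1 + t

// Skew-symmetric matrix [t]_x, so that skew(t) * v equals cross(t, v).
Mat3 skew(const Vec3& t) {
    return Mat3{{{0.0, -t.z, t.y},
                 {t.z, 0.0, -t.x},
                 {-t.y, t.x, 0.0}}};
}

Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 c{};  // zero-initialised
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c.m[i][j] += a.m[i][k] * b.m[k][j];
    return c;
}

// Essential matrix from a known (ground-truth) relative pose: E = [t]_x * R.
Mat3 essentialFromPose(const Pose& p) {
    return multiply(skew(p.t), p.R);
}

// Distance between two essential matrices after scaling both to unit
// Frobenius norm. The sign ambiguity is handled by testing a - b and a + b.
double essentialDistance(const Mat3& a, const Mat3& b) {
    double na = 0.0, nb = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            na += a.m[i][j] * a.m[i][j];
            nb += b.m[i][j] * b.m[i][j];
        }
    na = std::sqrt(na);
    nb = std::sqrt(nb);
    assert(na > 0.0 && nb > 0.0);
    double minus = 0.0, plus = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            const double x = a.m[i][j] / na;
            const double y = b.m[i][j] / nb;
            minus += (x - y) * (x - y);
            plus += (x + y) * (x + y);
        }
    return std::sqrt(std::min(minus, plus));
}

// Invert the point transform to get the pose of camera 2 expressed in
// camera-1 coordinates: orientation R^T and camera centre -R^T * t.
Pose invert(const Pose& p) {
    Pose inv{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            inv.R.m[i][j] = p.R.m[j][i];
    const double t[3] = {p.t.x, p.t.y, p.t.z};
    double c[3] = {0.0, 0.0, 0.0};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            c[i] -= inv.R.m[i][j] * t[j];
    inv.t = Vec3{c[0], c[1], c[2]};
    return inv;
}
```

In practice you would copy the 3x3 result of findEssentialMat into a `Mat3` and call `essentialDistance` against `essentialFromPose` of the known pose; a value close to zero suggests the two agree up to scale and sign.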


You CANNOT get the full extrinsics by merely decomposing the essential matrix. The translation vectors you get from recoverPose() are always unit vectors. As the documentation of decomposeEssentialMat() puts it: "By decomposing E, you can only get the direction of the translation, so the function returns unit t."

If you must calculate the extrinsics from the essential matrix, one possible solution is to introduce scale information from the real world, for example by using an AprilTag or by adding an IMU to your system.
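Once a scale is available, the pose of cam2 could be composed as in the sketch below. It rests on several assumptions that are not stated in the question: the cam1 extrinsics map world points into the cam1 frame ($X_{c1} = R_1 X_w + t_1$), the relative pose follows $X_{c2} = R X_{c1} + s\,t$, and the baseline length $s$ is known from some external measurement. The names `Pose`, `withBaseline`, `compose` and `cameraCenter` are hypothetical helpers, not OpenCV functions, and any axis-convention difference between OpenGL and OpenCV cameras is ignored here.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };
struct Pose { Mat3 R; Vec3 t; };  // maps a point as X_out = R * X_in + t

// Rescale the unit translation returned by recoverPose to a known
// baseline length (the baseline is an assumed external measurement).
Pose withBaseline(const Pose& relative, double baseline) {
    const Vec3& t = relative.t;
    const double norm = std::sqrt(t.x * t.x + t.y * t.y + t.z * t.z);
    assert(norm > 0.0);
    const double s = baseline / norm;
    Pose out = relative;
    out.t = Vec3{t.x * s, t.y * s, t.z * s};
    return out;
}

// world->cam2 = (cam1->cam2) applied after (world->cam1):
// R2 = R * R1 and t2 = R * t1 + t.
Pose compose(const Pose& cam1FromWorld, const Pose& cam2FromCam1) {
    Pose out{};
    const double t1[3] = {cam1FromWorld.t.x, cam1FromWorld.t.y, cam1FromWorld.t.z};
    double t2[3] = {cam2FromCam1.t.x, cam2FromCam1.t.y, cam2FromCam1.t.z};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            for (int k = 0; k < 3; ++k)
                out.R.m[i][j] += cam2FromCam1.R.m[i][k] * cam1FromWorld.R.m[k][j];
            t2[i] += cam2FromCam1.R.m[i][j] * t1[j];
        }
    out.t = Vec3{t2[0], t2[1], t2[2]};
    return out;
}

// Camera centre in world coordinates: C = -R^T * t.
Vec3 cameraCenter(const Pose& camFromWorld) {
    const double t[3] = {camFromWorld.t.x, camFromWorld.t.y, camFromWorld.t.z};
    double c[3] = {0.0, 0.0, 0.0};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            c[i] -= camFromWorld.R.m[j][i] * t[j];
    return Vec3{c[0], c[1], c[2]};
}
```

A typical call would be `compose(cam1, withBaseline(relative, s))`, followed by `cameraCenter` if the position of cam2 in world coordinates is needed. If the cam1 extrinsics are stored the other way round (camera-to-world), they would have to be inverted first.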

Besides, eight pairs of matched points might not be enough. Points lying on the same plane are of little use for the essential matrix method, and findEssentialMat() uses RANSAC to eliminate outliers, which works better with more data. So try using more point correspondences.

It would be much easier to locate and solve your problem if you could post some detailed data from your work.
