# findEssentialMat() pose estimation: wrong translation vector in some cases

I am currently using epipolar geometry based pose estimation for estimating pose of one camera w.r.t another, with non-zero baseline between the cameras. I am using the five-point algorithm (implemented as findEssentialMat in opencv) to determine the up-to-scale translation and rotation matrix between the two cameras.

I have found two interesting problems when working with this, it would be great if someone can share their views: I don't have a great theoretical background in computer vision:

1. If the rotation of the camera is along the Z axis i.e., parallel to the scene and the translation is non-zero, the translation between camera1 and camera2 (which is along X in the real world) is wrongly estimated to be along Z. Example case: cam1 and cam2 spaced by approx 0.5 m on the X axis, cam2 rotated clockwise by 45 deg.

Image pair

Output:

Translation vector is [-0.02513, 0.0686, 0.9973] (wrong, should be along X)
Rotation Euler angles: [-7.71364, 6.0731, -43.7583] (correct)

1. The geometry between image1 and image2 is not the exact inverse of the geometry between image2 and image1. Again while an image1<=>image2 correspondence produces the correct translation, image2<=>image1 is way off (rotation values are close though). Example below, where camera2 was displaced along X and rotated for ~30 degrees along Y

Image 1 to image 2

Output: Rotation [-1.578, 24.94, -0.1631] (Close) Translation [-0.0404, 0.035, 0.998] (Wrong)

Image 2 to image 1

Output: Rotation [2.82943, -30.3206, -3.32636] Translation [0.99366, -0.0513, -0.0999] (Correct)

Looks like it has no issues figuring the rotations out but the translations are a hit or miss.

As to question 1, I was initially concerned because because the rotation is along the Z axis, the points might appear to be all coplanar. But the five point algorithm paper particularly states: "The 5-point method is essentially unaffected by the planar degeneracy and still works".