I am trying to estimate the relative motion between two images using epipolar geometry, via OpenCV's findEssentialMat, recoverPose, and solvePnP functions. My camera images come from a simulator, so distortion is zero and the camera intrinsics are known exactly. I am using the OpenCV coordinate convention (X right, Y down, Z forward) throughout this question.
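For context, my pipeline looks roughly like the sketch below. The feature detector, matcher settings, and the intrinsics matrix K are placeholders here, not my exact values:

```python
import cv2
import numpy as np

# Placeholder intrinsics; in my setup K comes from the simulator.
K = np.array([[320.0, 0.0, 320.0],
              [0.0, 320.0, 240.0],
              [0.0, 0.0, 1.0]])

def relative_pose(img1, img2):
    # Detect and match ORB features between the two frames.
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Essential matrix with RANSAC; distortion is zero in the simulator.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    # recoverPose returns R and a unit-norm translation direction t.
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```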
When the two images are taken from positions displaced only along the X axis, the essential-matrix-based estimation works perfectly: I get a translation vector like [-0.9999, 0.0001, 0.0001]. But when the camera also moves along the forward/backward Z axis, the t vector contains an unexpectedly large Y component, while X and Z are correct. Example results from solvePnP for these two cases (a sketch of my solvePnP call follows the list):
- Image 1 taken from (0, 0, 0), image 2 from (-2, 0, 0):
  t = [-1.988, 0.023, 0.046] (Y and Z are acceptably close to 0)
- Image 1 taken from (0, 0, 0), image 2 from (-2, 0, 4):
  t = [-2.028, -0.441, 4.0983] (Y is totally off)
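For reference, my solvePnP call looks roughly like this; where the 3D points come from (simulator ground truth vs. triangulation) and the exact flag are simplified in this sketch:

```python
import cv2
import numpy as np

def pose_from_pnp(pts3d, pts2d, K):
    # pts3d: Nx3 float32 world points, pts2d: Nx2 float32 pixel points.
    dist = np.zeros(5)  # zero distortion in the simulator
    ok, rvec, tvec = cv2.solvePnP(pts3d, pts2d, K, dist,
                                  flags=cv2.SOLVEPNP_ITERATIVE)
    R, _ = cv2.Rodrigues(rvec)
    # solvePnP returns the world-to-camera transform; the camera position
    # in world coordinates (what I compare to the simulator) is -R^T * t.
    cam_pos = -R.T @ tvec
    return R, tvec, cam_pos
```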
I don't understand where the roughly 0.44 m on the Y axis is coming from. The simulator is very accurate in terms of physics, and there is absolutely no movement of that magnitude along the Y axis. The same behavior shows up in the relative R/t from the essential matrix decomposition. Any tips on solving this, or suggestions for further debugging, would be helpful.
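Since the essential matrix only recovers t up to scale, I compare results by direction; this is roughly the check I use, plugging in the failing numbers from above:

```python
import numpy as np

# Compare the direction of the estimated translation against the
# ground-truth displacement; only the direction is meaningful for the
# essential-matrix result, since its t has arbitrary scale.
t_est = np.array([-2.028, -0.441, 4.0983])  # failing solvePnP result above
t_gt = np.array([-2.0, 0.0, 4.0])           # simulator ground truth

t_est_dir = t_est / np.linalg.norm(t_est)
t_gt_dir = t_gt / np.linalg.norm(t_gt)      # approx. [-0.447, 0.0, 0.894]

cos_angle = np.clip(np.dot(t_est_dir, t_gt_dir), -1.0, 1.0)
angle = np.degrees(np.arccos(cos_angle))
print(f"translation direction error: {angle:.2f} deg")
```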