Strange issue with stereo triangulation: TWO valid solutions

asked 2015-08-24 00:58:58 -0500

saihv gravatar image

I am currently using OpenCV for a pose estimation related work, in which I am triangulating points between pairs for reconstruction and scale factor estimation. I have encountered a strange issue while working on this, especially in the opencv functions recoverPose() and triangulatePoints().

Say I have camera 1 and camera 2, spaced apart in X, with cam1 at (0,0,0) and cam2 to the right of it (positive X). I have two arrays points1 and points2 that are the matched features between the two images. According to the OpenCV documentation and code, I have noted two points:

  1. recoverPose() assumes that points1 belong to the camera at (0,0,0).
  2. triangulatePoints() is called twice: one from recoverPose() to tell us which of the four R/t combinations is valid, and then again from my code, and the documentation says:

    cv::triangulatePoints(P1, P2, points1, points2, points3D)  : points1 -> P1 and points2 -> P2.
    

Hence, as in the case of recoverPose(), it is safe to assume that P1 is [I|0] and P2 is [R|t].

What I actually found: It doesn't work that way. Although my camera1 is at 0,0,0 and camera2 is at 1,0,0 (1 being up to scale), the only correct configuration is obtained if I run

recoverPose(E, points2, points1...)
triangulatePoints([I|0], [R|t], points2, points1, pts3D)

which should be incorrect, because points2 is the set from R|t, not points1. I tested an image pair of a scene in my room where there are three noticeable objects after triangulation: a monitor and two posters on the wall behind it. Here are the point clouds resulting from the triangulation (excuse the MS Paint)

If I do it the OpenCV's prescribed way: (poster points dispersed in space, weird looking result)

Wrong

If I do it my (wrong?) way:

Correct

Can anybody share their views about what's going on here? Technically, both solutions are valid because all points fall in front of both cameras: and I didn't know what to pick until I rendered it as a pointcloud. Am I doing something wrong, or is it an error in the documentation? I am not that knowledgeable about the theory of computer vision so it might be possible I am missing something fundamental here. Thanks for your time!

edit retag flag offensive close merge delete