# Help Recovering Structure From Motion

Afternoon, all!

I have been banging my head against the problem of building a 3D structure from a set of sequential images intently for the past week or so and cannot seem to get a decent result out of it. I would greatly appreciate someone taking the time to go over my steps and let me know if they seem correct. I feel like I am missing something small but fundamental.

1. Build camera calibration matrix K and distortion coefficients from the calibration data of the chessboard provided (using findChessboardCorners(), cornerSubPix(), and calibrateCamera()).
2. Pull in the first and third images from the sequence and undistort them using K and the distortion coefficients.
3. Find features to track in the first image (using goodFeaturesToTrack() with a mask to mask off the sides of the image).
4. Track the features in the new image (using calcOpticalFlowPyrLK()). At this point, I have a set of point correspondences in image i0 and image i2.
5. Generate the fundamental matrix F from the point correspondences (using the RANSAC flag in findFundamentalMat()).
6. Correct the matches of the point correspondences I found earlier using the new F (using correctMatches()). From here, I can generate the essential matrix from F and K and extract candidate projection matrices for the second camera.
7. Generate the essential matrix E using E = K^T * F * K per HZ
8. Use SVD on E to get U, S, and V, which then allow me to build the two candidate rotations and two candidate translations.
9. For each candidate rotation, check to ensure the rotation is right-handed by checking sign of determinant. If <0, multiply through by -1. Now that I have the 4 candidate projection matrices, I want to figure out which one is the correct one.
10. Normalize the corrected matches for images i0 and i2
11. For each candidate matrix:
11.1. Triangulate the normalized correspondences using P1 = [ I | 0 ]
and P2 = candidate matrix using triangulatePoints().
11.2. Convert the triangulated 3D points out of homogeneous coordinates.
11.3. Select a test 3D point from the list and apply a perspective
transformation to it using P2 (converted to a 4x4 matrix instead of 3x4 where
the last row is [0,0,0,1]) using perspectiveTransform().
11.4. Check if the depth of the 3D point and the Z-component of the
perspectively transformed homogeneous point are both positive. If so,
use this candidate matrix as P2. Else, continue.
12. If none of the candidate matrices generate a good P2, go back to step 5. Now I should have two valid projection matrices P1 = [ I | 0 ] and P2 derived from E. I want to then use these matrices to triangulate the point correspondences I found back in step 4.
13. Triangulate the the normalized correspondence points using P1 and P2
14. Convert from homogeneous coordinates to get the real 3D points.

I already have encountered a problem here in that the 3D points I triangulate NEVER seem to correspond to the original structure. From the mug, they don't seem to form a clear ...

edit retag close merge delete