Help Recovering Structure From Motion
Afternoon, all!
I have spent the past week or so banging my head against the problem of building a 3D structure from a set of sequential images, and I cannot seem to get a decent result out of it. I would greatly appreciate someone taking the time to go over my steps and let me know if they seem correct. I feel like I am missing something small but fundamental.
1. Build camera calibration matrix K and distortion coefficients from the calibration data of the chessboard provided (using findChessboardCorners(), cornerSubPix(), and calibrateCamera()).
2. Pull in the first and third images from the sequence and undistort them using K and the distortion coefficients.
3. Find features to track in the first image (using goodFeaturesToTrack() with a mask to mask off the sides of the image).
4. Track the features in the new image (using calcOpticalFlowPyrLK()).
At this point, I have a set of point correspondences in image i0 and image i2.
5. Generate the fundamental matrix F from the point correspondences (using the RANSAC flag in findFundamentalMat()).
6. Correct the matches of the point correspondences I found earlier using the new F (using correctMatches()).
From here, I can generate the essential matrix from F and K and extract candidate projection matrices for the second camera.
7. Generate the essential matrix E using E = K^T * F * K per HZ.
8. Use SVD on E to get U, S, and V, which then allow me to build the two candidate rotations and two candidate translations.
9. For each candidate rotation, check to ensure the rotation is right-handed by checking the sign of its determinant. If <0, multiply through by -1.
Now that I have the 4 candidate projection matrices, I want to figure out which one is the correct one.
10. Normalize the corrected matches for images i0 and i2.
11. For each candidate matrix:
    11.1. Triangulate the normalized correspondences using P1 = [ I | 0 ] and P2 = candidate matrix using triangulatePoints().
    11.2. Convert the triangulated 3D points out of homogeneous coordinates.
    11.3. Select a test 3D point from the list and apply a perspective transformation to it using P2 (converted to a 4x4 matrix instead of 3x4 where the last row is [0,0,0,1]) using perspectiveTransform().
    11.4. Check if the depth of the 3D point and the Z-component of the perspectively transformed homogeneous point are both positive. If so, use this candidate matrix as P2. Else, continue.
12. If none of the candidate matrices generate a good P2, go back to step 5.
Now I should have two valid projection matrices P1 = [ I | 0 ] and P2 derived from E. I want to then use these matrices to triangulate the point correspondences I found back in step 4.
13. Triangulate the normalized correspondence points using P1 and P2.
14. Convert from homogeneous coordinates to get the real 3D points.
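For reference, the decomposition, candidate selection, and triangulation steps above can be sketched in plain NumPy on synthetic data, where the true answer is known. This is only a sketch under assumptions, not the code from my project: the helper names (`skew`, `candidate_poses`, `triangulate`, `select_pose`) are made up for illustration, the correspondences are assumed to already be in normalized coordinates (pixels multiplied by K^-1), and the linear triangulation stands in for triangulatePoints() rather than reproducing it. It also counts how many points land in front of both cameras instead of testing a single point, which is a deliberate variation on the single-point check described above.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def candidate_poses(E):
    """Return the four (R, t) candidates encoded by an essential matrix."""
    U, _, Vt = np.linalg.svd(E)
    # Force U and Vt to be proper rotations (det = +1). E is only defined
    # up to scale and sign, so this does not change the set of candidates.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]  # unit-length translation direction, sign still ambiguous
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence; returns a 3-vector."""
    A = np.stack([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]  # null vector of A (homogeneous 4-vector)
    return X[:3] / X[3]          # convert out of homogeneous coordinates

def select_pose(E, pts1, pts2):
    """Pick the candidate that puts the most points in front of both cameras."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    best = None
    for R, t in candidate_poses(E):
        P2 = np.hstack([R, t.reshape(3, 1)])
        X = np.array([triangulate(P1, P2, a, b) for a, b in zip(pts1, pts2)])
        depth1 = X[:, 2]              # Z as seen from camera 1
        depth2 = (X @ R.T + t)[:, 2]  # Z as seen from camera 2
        score = int(np.sum((depth1 > 0) & (depth2 > 0)))
        if best is None or score > best[0]:
            best = (score, R, t, X)
    return best[1], best[2], best[3]

# Synthetic scene: a small rotation about Y and a unit baseline along X.
rng = np.random.default_rng(0)
angle = 0.1
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([1.0, 0.0, 0.0])
X_true = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))

# Project into both cameras in normalized coordinates (K already removed).
pts1 = X_true[:, :2] / X_true[:, 2:]
X_cam2 = X_true @ R_true.T + t_true
pts2 = X_cam2[:, :2] / X_cam2[:, 2:]

E = skew(t_true) @ R_true
R_est, t_est, X_est = select_pose(E, pts1, pts2)
print("max rotation error:", np.abs(R_est - R_true).max())
print("max point error:", np.abs(X_est - X_true).max())
```

With real data, `pts1` and `pts2` would be the corrected matches after multiplying by K^-1. The translation recovered from E is only known up to scale, so the reconstructed points are too; the synthetic example matches exactly only because the true baseline was chosen to have unit length.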
I have already run into a problem here: the 3D points I triangulate NEVER seem to correspond to the original structure. From the mug, they don't seem to form a clear ...