OpenCV Q&A Forum - RSS feed
http://answers.opencv.org/questions/
Copyright OpenCV foundation (http://www.opencv.org), 2012-2018.

Triangulation scaling problem. (Multiple Images)
http://answers.opencv.org/question/231247/triangulation-scaling-problem-multiple-images/

Hello everyone,
I am trying to do 3D metric reconstruction from 60 images.
I use Python, and [this](https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#gad3fc9a0c82b08df034234979960b778c) is the only triangulation function OpenCV exposes to Python. It reconstructs the images two by two.
There are two methods I am considering for the triangulation.
The first method is to select one image as the base and triangulate it one by one with its neighbors,
like this: ![image description](/upfiles/15922198763035918.png)
where 1 is the base image
and 2, 3, 4, 5, 6 are the base image's neighbors.
R1 = 3x3 identity, t1 = [0, 0, 0]^T

E2, _ = cv2.findEssentialMat(img1_pts, img2_pts, K); then R2, t2 from cv2.recoverPose(E2, img1_pts, img2_pts, K)
E3, _ = cv2.findEssentialMat(img1_pts, img3_pts, K); then R3, t3 from cv2.recoverPose(E3, img1_pts, img3_pts, K)
...
E6, _ = cv2.findEssentialMat(img1_pts, img6_pts, K); then R6, t6 from cv2.recoverPose(E6, img1_pts, img6_pts, K)

P1 = K [R1|t1]
P2 = K [R2|t2]
...
P6 = K [R6|t6]

pts3d_12 = cv2.triangulatePoints(P1, P2, pts1, pts2)
pts3d_13 = cv2.triangulatePoints(P1, P3, pts1, pts3)
pts3d_14 = cv2.triangulatePoints(P1, P4, pts1, pts4)
pts3d_15 = cv2.triangulatePoints(P1, P5, pts1, pts5)
pts3d_16 = cv2.triangulatePoints(P1, P6, pts1, pts6)
When I visualize the 3D points, the clouds do not overlap; there is an offset between them (I guess it is a scaling problem).
How can I fix this?
There is a suggestion along these lines:
t2 = t2 * (baseline between camera 1 and camera 2, in metric units)
t3 = t3 * (baseline between camera 1 and camera 3, in metric units)
but this sometimes does not work. Does this give metric results?
The second method is to triangulate the photos as img1-img2, img2-img3, img3-img4 instead of img1-img2, img1-img3, img1-img4. In this case I have to bring all the 3D points into the same coordinate frame, because P = [I|0] refers to img1 in the img1-img2 triangulation, but to img2 in the img2-img3 triangulation.
So the point cloud from img1-img2 is expressed in img1's frame, while the cloud from img2-img3 is expressed in img2's frame.
How can I bring them into the same frame?
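The chaining needed for the second method can be sketched like this, under the assumption that each pair's recoverPose gives (R, t) with x_i = R @ x_{i-1} + t (frame i-1 to frame i); the relative motions below are made up for the demo:

```python
import numpy as np

def accumulate_poses(rel_poses):
    """Given relative poses (R, t) mapping frame i-1 -> frame i, return, for
    each frame, the rigid transform mapping that frame back into frame 1."""
    R_acc, t_acc = np.eye(3), np.zeros((3, 1))   # frame 1 -> frame 1
    out = [(R_acc.copy(), t_acc.copy())]
    for R, t in rel_poses:
        R_acc = R @ R_acc                        # frame 1 -> frame i
        t_acc = R @ t_acc + t
        out.append((R_acc.T, -R_acc.T @ t_acc))  # inverted: frame i -> frame 1
    return out

def _rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

# Demo with two made-up relative motions (img1->img2 and img2->img3)
rel = [(_rot_z(0.2), np.array([[1.], [0.], [0.]])),
       (_rot_z(0.3), np.array([[0.], [1.], [0.]]))]
poses = accumulate_poses(rel)
# The cloud of pair (img2, img3), expressed in img2's frame, maps into img1's
# frame via X1 = R @ X2 + t with (R, t) = poses[1].
```

This only aligns the frames; the per-pair scale ambiguity from recoverPose still has to be resolved separately (e.g. from known baselines or shared points).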
Thanks in advance.

gukna | Mon, 15 Jun 2020 06:14:23 -0500 | http://answers.opencv.org/question/231247/

Wrong rank in Fundamental Matrix
http://answers.opencv.org/question/204100/wrong-rank-in-fundamental-matrix/

Hi guys,
I'm using OpenCV for Python 3 and, based on the Mastering OpenCV book, trying to compute the epipolar geometry from many images (a structure-from-motion algorithm).
Many books say that the fundamental matrix has rank 2, but the OpenCV function returns a rank-3 matrix.
How can I make this right?
import numpy as np
import cv2

orb = cv2.ORB_create()
# find the keypoints and descriptors with ORB
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
# create BFMatcher object (ORB descriptors are binary, so Hamming distance fits better than L2)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Match descriptors.
matches = bf.match(des1, des2)
# Sort them in the order of their distance.
matches = sorted(matches, key=lambda x: x.distance)
pts1 = []
pts2 = []
for m in matches:
    pts2.append(kp2[m.trainIdx].pt)
    pts1.append(kp1[m.queryIdx].pt)
pts1 = np.float32(pts1)
pts2 = np.float32(pts2)
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
pts1 = pts1[mask.ravel() == 1]
pts2 = pts2[mask.ravel() == 1]
# F is the Fundamental Matrix
From that code, the output is like:
Processing image 0 and image 1
rank of F: 3
Processing image 0 and image 2
rank of F: 3
Processing image 0 and image 3
rank of F: 3
Processing image 0 and image 4
rank of F: 2
[...]
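For what it's worth, a standard remedy (not something from the post itself) is that the estimated F is full rank only through numerical noise, and it can be projected onto the closest rank-2 matrix by zeroing its smallest singular value:

```python
import numpy as np

def enforce_rank2(F):
    """Project F onto the nearest rank-2 matrix (Frobenius norm) via SVD."""
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                      # kill the smallest singular value
    return U @ np.diag(S) @ Vt

# Illustrative full-rank 3x3 matrix standing in for an estimated F
F = np.array([[1.0, 2.0, 0.5],
              [0.3, 1.5, 2.2],
              [0.7, 0.1, 1.0]])
F2 = enforce_rank2(F)
```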
Could someone help me? Does someone have any working SfM code using OpenCV?
Thanks in advance.
Lucas Amparo Barbosa | Mon, 26 Nov 2018 11:04:12 -0600 | http://answers.opencv.org/question/204100/

Triangulation gives weird results for rotation
http://answers.opencv.org/question/199673/triangulation-gives-weird-results-for-rotation/

OpenCV version 3.4.2
I am taking a stereo pair and using recoverPose to get the [R|t] pose of the camera. If I start at the origin and use triangulatePoints, the result looks somewhat as expected, although I would have expected the z coordinates to be positive.
These are the poses of the cameras [R|t]
P0: [1, 0, 0, 0;
0, 1, 0, 0;
0, 0, 1, 0]
P1: [0.9999726146107655, -0.0007533190856300971, -0.007362237354563941, 0.9999683127209806;
0.0007569149205790131, 0.9999995956157767, 0.0004856419317479311, -0.001340876868928852;
0.007361868534054914, -0.0004912012195572309, 0.9999727804360723, 0.007847012372698725]
I get these results where the red dot and the yellow line indicates the camera pose (x positive is right, y positive is down):
![image description](/upfiles/1537317206819271.png)
When I rotate the first camera by 58.31 degrees and then use recoverPose to get the relative pose of the second camera, the results are wrong.
These are the pose matrices, where P0 is rotated by 58.31 degrees around the y axis before calling my code below:
P0: [0.5253219888177297, 0, 0.8509035245341184, 0;
0, 1, 0, 0;
-0.8509035245341184, 0, 0.5253219888177297, 0]
P1: [0.5315721563840478, -0.0007533190856300971, 0.8470126770406503, 0.5319823932782873;
-1.561037994149129e-05, 0.9999995956157767, 0.0008991799591322519, -0.001340876868928852;
-0.8470130118915117, -0.0004912012195572309, 0.5315719296650566, -0.8467543535708145]
(x positive is right, y positive is down)
![image description](/upfiles/15373172174565108.png)
The pose of the second frame is calculated as follows:
new_frame->E = cv::findEssentialMat(last_frame->points, new_frame->points, K, cv::RANSAC, 0.999, 1.0, new_frame->mask);
int res = recoverPose(new_frame->E, last_frame->points, new_frame->points, K, new_frame->local_R, new_frame->local_t, new_frame->mask);
// https://stackoverflow.com/questions/37810218/is-the-recoverpose-function-in-opencv-is-left-handed
// Convert so transformation is P0 -> P1
new_frame->local_t = -new_frame->local_t;
new_frame->local_R = new_frame->local_R.t();
new_frame->pose_t = last_frame->pose_t + (last_frame->pose_R * new_frame->local_t);
new_frame->pose_R = new_frame->local_R * last_frame->pose_R;
hconcat(new_frame->pose_R, new_frame->pose_t, new_frame->pose);
I then call triangulatePoints using the K * P0 and K * P1 on the corresponding points.
I feel like this is some kind of coordinate-system issue: the points I would expect to have positive z values have negative z in the plots, and the rotation behaves strangely. I haven't been able to figure out what I need to do to fix it.
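One thing that may be worth checking (an assumption on my part, not a confirmed diagnosis): triangulatePoints expects world-to-camera projection matrices, so if pose_R/pose_t hold the camera pose in world coordinates, they must be inverted before forming P. A sketch in Python:

```python
import numpy as np

def projection_from_pose(K, pose_R, pose_t):
    """Build P for triangulatePoints from a camera-to-world pose.
    triangulatePoints wants x ~ K [R|t] X with (R, t) mapping world -> camera."""
    R_wc = pose_R.T
    t_wc = -R_wc @ pose_t
    return K @ np.hstack([R_wc, t_wc])

# Demo: camera sitting at x = 1 in world coordinates, looking down +z.
P = projection_from_pose(np.eye(3), np.eye(3), np.array([[1.], [0.], [0.]]))
x = P @ np.array([[2.], [0.], [5.], [1.]])   # project world point (2, 0, 5)
```

If the poses are already world-to-camera, this does not apply; it is only the convention mismatch that most often produces mirrored/squashed clouds under rotation.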
EDIT: Here is a gif of what's going on as I rotate through 360 degrees around Y. The cameras stay parallel. What am I missing? Shouldn't the shape of the point cloud remain the same if both camera poses keep their relative positions, even though they have been rotated around the origin? Why are the points squashed onto the X axis?
![image description](/upfiles/15373205818094867.gif)

maym86 | Tue, 18 Sep 2018 14:55:35 -0500 | http://answers.opencv.org/question/199673/

SFM opencv module for windows
http://answers.opencv.org/question/171416/sfm-opencv-module-for-windows/

Hi,
I want to use the Scene Reconstruction algorithm at http://docs.opencv.org/trunk/d4/d18/tutorial_sfm_scene_reconstruction.html.
It however requires the SFM module to be installed with OpenCV. Where can I find it for Windows?
Regards,
Prachi

Prachi mp | Tue, 08 Aug 2017 22:41:36 -0500 | http://answers.opencv.org/question/171416/

3D reconstruction (SFM) with multi-lens camera system (instead of pinhole camera model)
http://answers.opencv.org/question/173406/3d-reconstruction-sfm-with-multi-lens-camera-system-instead-of-pinhole-camera-model/

3D reconstruction (especially SFM algorithms) is often tied to the pinhole camera model.
The state of the art of these SFM techniques is to look at where the rays of the 2D-3D correspondences in two different cameras intersect in object space.
This presumes a pinhole camera model (where the ray of a 2D-3D correspondence is just a straight line).
But in the real world, multi-lens systems are often used, where you can't really recover the ray of a 2D-3D correspondence.
**My question is:** *How do SFM techniques work with such multi-lens camera systems?*

mirnyy | Fri, 01 Sep 2017 06:24:13 -0500 | http://answers.opencv.org/question/173406/

How is the basic pipeline of 3D reconstruction from more than two images?
http://answers.opencv.org/question/127674/how-is-the-basic-pipeline-of-3d-reconstruction-from-more-than-two-images/

I am doing 3D reconstruction using OpenCV with Python and I have already reconstructed the 3D structure from two images. Let us name the images **image_1** and **image_2**. Now I want to add another view from a third image, **image_3**. From what I understand, this has to do with bundle adjustment, but I have a problem understanding how things should be done.
So, to make my pipeline simpler, I have decided to use Python's [sba](https://pypi.python.org/pypi/sba) module, described as a *wrapper for Lourakis' sparse bundle adjustment C library*. Basically, what I understood is that I need to pass the 3D points of a structure and their corresponding 2D points from multiple images. For the example given with the source code, it looks something like this:
|0.188427 -0.036568 -0.851884| 3 | 0 |500.7 521.2| 1 |902.8 974.4| 2 |147.9 580.2|
From the first to the eighth column, this is what it represents:
1. The 3D point
2. The number of views/images in which that 3D point is visible
3. Mask (here meaning that the 3D point is visible in the first image, the 0th)
4. The 2D point of that 3D point in the first image
The 5th, 6th, 7th and 8th columns have the same meaning as the 3rd and 4th columns.
As for my condition, I have made a 3D reconstruction called **structure_1** from **image_1** and **image_2**. Then I have also made a 3D reconstruction called **structure_2** from **image_2** and **image_3**.
Then, consider that a point called **point_1** is visible in **image_1**, **image_2** and **image_3**.
This means I have two 3D points (from **structure_1** and **structure_2**) for **point_1**. How should I format it like the example in the source code? The source-code example already has a 3D point from three views, as shown in the snippet above.
Hilman | Mon, 13 Feb 2017 01:39:51 -0600 | http://answers.opencv.org/question/127674/

How to construct 3D representation of the scene with more than 2 images?
http://answers.opencv.org/question/120184/how-to-construct-3d-representation-of-the-scene-with-more-than-2-images/

From a video, I have taken three images: i1, i2 and i3. The steps for getting the keypoints in each image are:
1. I detect keypoints in i1 and track them with optical flow up until i2.
2. From i2, I add more keypoints (good keypoints from i1 still exist) and track them up until i3.
Then, from the corresponding keypoints in i1 and i2, I managed to build the 3D representation.
Using the same pipeline, I of course also managed to reconstruct the 3D representation from the corresponding keypoints in i2 and i3. Now I want to build one scene from these two reconstructed 3D scenes.
I have done a little bit of reading and am stuck in some parts. I know I will need to call `solvePnPRansac`.
I did bookkeeping during the optical flow and know which keypoints in the reconstructed 3D scene (between i1 and i2) are present in i3. So I just need to pass the reconstructed 3D points with the corresponding keypoints of i3 to `solvePnPRansac`. From there, I can get the rotation and translation of the reconstructed 3D scene with respect to i3's camera. What should I do after that?
Hilman | Sat, 24 Dec 2016 22:10:45 -0600 | http://answers.opencv.org/question/120184/

Why sfm::reconstruct return empty points3d_estimated if I set only two images?
http://answers.opencv.org/question/119618/why-sfmreconstruct-return-empty-points3d_estimated-if-i-set-only-two-images/

I am trying to use sfm::reconstruct with video images.
The steps I went through:
- detect keypoints and descriptors for two images (SIFT)
- keep only the good keypoints
- build the following data structure (vector<vector<Mat<double>>>):

               frame1           frame2                frameN
    track1 | (x11,y11) | -> | (x12,y12) | -> | (x1N,y1N) |
    track2 | (x21,y21) | -> | (x22,y22) | -> | (x2N,y2N) |
    trackN | (xN1,yN1) | -> | (xN2,yN2) | -> | (xNN,yNN) |

and pass this array of arrays to sfm::reconstruct, but I get an empty points3d_estimated :((
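For what it's worth, the sfm samples appear to expect one 2 x n_tracks matrix per frame (columns aligned across frames by track), rather than one matrix per track. That reading of the API is an assumption on my part, and the sfm module has no Python bindings, so numpy here only illustrates the data layout:

```python
import numpy as np

# tracks[j, f] = (x, y) of track j in frame f  (illustrative values)
tracks = np.array([
    [(10., 12.), (11., 12.5), (12., 13.)],   # track 1 across 3 frames
    [(40., 42.), (41., 42.5), (42., 43.)],   # track 2 across 3 frames
])

n_tracks, n_frames, _ = tracks.shape
# One 2 x n_tracks matrix per frame: points2d[f][:, j] is track j in frame f
points2d = [tracks[:, f, :].T.copy() for f in range(n_frames)]
```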
Is it perhaps necessary to have at least 3 images?

I_cool | Sun, 18 Dec 2016 11:04:47 -0600 | http://answers.opencv.org/question/119618/

Why findFundamentalMat gives different results for same but different orientation of points?
http://answers.opencv.org/question/113067/why-findfundamentalmat-gives-different-results-for-same-but-different-orientation-of-points/

Sorry if the title is kind of weird. It is quite hard to express my problem in a question title.
So, I am in the middle of a 3D reconstruction project. The pipeline is more or less the same as the standard one, where
1. Undistort image
2. Detect points with keypoint detector
3. Track the points across frames (optical flow)
4. Calculate the fundamental matrix
and so on. The only different part is step 2, where I use a Line Segment Detector and track the segments across frames.
With a keypoint detector, given two frames, I would get two sets of keypoints (one set per frame). But in my situation, I have four sets of keypoints (two sets per frame, since a line has a start point and an end point).
In order to calculate the fundamental matrix, I need to concatenate the two sets of points of each frame.
One way is to just concatenate them vertically: `np.vstack([start_point, end_point])`.
The other way is `np.hstack([start_point, end_point]).reshape(-1, 2)`, meaning they are concatenated 'alternately', i.e.
[[start_point[0],
end_point[0],
start_point[1],
end_point[1],
...]]
Both end up with the same shape but, fair enough, they produce quite different results. From my observation, `vstack` produced a more '3D-like' result while `hstack` produced a more 'planar-like' result for the reconstruction.
The question is: why? And which one is supposed to be better?
Below is sample code to give a view of this question:
import numpy as np
import cv2

np.random.seed(0)

def prepare_points(pts_frame1, pts_frame2):
    # Prepare the four sets of points
    (p1_f1, p2_f1) = pts_frame1
    (p1_f2, p2_f2) = pts_frame2
    v_stacked_f1f2 = (np.vstack([p1_f1, p2_f1]), np.vstack([p1_f2, p2_f2]))
    h_stacked_f1f2 = (np.hstack([p1_f1, p2_f1]).reshape(-1, 2),
                      np.hstack([p1_f2, p2_f2]).reshape(-1, 2))
    return (v_stacked_f1f2, h_stacked_f1f2)

pts_frame1 = np.random.random_sample((60, 2)).astype("float32")
pts_frame2 = np.random.random_sample((60, 2)).astype("float32")

# Emulate the two sets of points for each frame where
# the first set is the start point, while
# the second set is the end point of a line
pts_frame1 = (pts_frame1[::2], pts_frame1[1::2])
pts_frame2 = (pts_frame2[::2], pts_frame2[1::2])

(v_stacked_f1f2, h_stacked_f1f2) = prepare_points(pts_frame1, pts_frame2)

F_vstacked = cv2.findFundamentalMat(v_stacked_f1f2[0], v_stacked_f1f2[1],
                                    cv2.FM_RANSAC, 3, 0.99)[0]
F_hstacked = cv2.findFundamentalMat(h_stacked_f1f2[0], h_stacked_f1f2[1],
                                    cv2.FM_RANSAC, 3, 0.99)[0]
print("F_vstacked:\n", F_vstacked, "\n")
print("F_hstacked:\n", F_hstacked, "\n")
# The output:
# F_vstacked:
# [[ 3.31788127 -2.24336615 -0.77866782]
# [ 0.83418839 -1.4066019 -0.92088302]
# [-2.75413748 2.27311637 1. ]]
# F_hstacked:
# [[ 7.70558741 25.29966782 -16.20835082]
# [-12.95357284 -0.54474384 14.95490469]
# [ 1.79050172 -10.40077071 1. ]]
Hilman | Mon, 14 Nov 2016 22:59:48 -0600 | http://answers.opencv.org/question/113067/

OpenCV Structure from Motion Reprojection Issue
http://answers.opencv.org/question/98966/opencv-structure-from-motion-reprojection-issue/

I am currently facing an issue with my structure-from-motion program based on OpenCV. I will try to describe what it does and what it is supposed to do.
This program relies on the classic structure-from-motion method.
The basic idea is to take a pair of images, detect their keypoints and compute the descriptors of those keypoints. Then the keypoints are matched, with a certain number of tests to ensure the result is good. That part works perfectly.
Once this is done, the following computations are performed: fundamental matrix, essential matrix, SVD decomposition of the essential matrix, camera matrix computation and, finally, triangulation.
The result for a pair of images is a set of 3D coordinates, giving us points to be drawn in a 3D viewer. This works perfectly, for a pair.
Indeed, here is my problem : for a pair of images, the 3D points coordinates are calculated in the coordinate system of the first image of the image pair, taken as the reference image. When working with more than two images, which is the objective of my program, I have to reproject the 3D points computed in the coordinate system of the very first image, in order to get a consistent result.
My question is: how do I reproject 3D point coordinates given in one camera-related system into another camera-related system? With the camera matrices?
My idea was to take the 3D point coordinates and multiply them by the inverse of each preceding camera matrix.
Let me clarify:
Suppose I am working on the third and fourth image (hence, the third pair of images, because I am working like 1-2 / 2-3 / 3-4 and so on).
I get my 3D point coordinates in the coordinate system of the third image, how do I do to reproject them properly in the very first image coordinate system ?
I would have done the following :
Get the 3D point coordinate matrix, apply the inverse of the camera matrix for images 2 to 3, and then apply the inverse of the camera matrix for images 1 to 2. Is that even correct?
The problem is that those camera matrices are non-square, so I cannot invert them.
I am surely mistaken somewhere, and I would be grateful if someone could enlighten me; I am pretty sure this is a relatively easy one, but I am obviously missing something...
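The 3x4 camera matrices are indeed not invertible, but the rigid motions inside them are: the usual way out is to promote each [R|t] to a 4x4 homogeneous transform, which can be inverted and composed. A sketch (transform names are illustrative):

```python
import numpy as np

def to_hom(R, t):
    """Promote a rigid motion [R|t] to an invertible 4x4 homogeneous matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t.ravel()
    return T

def inv_hom(T):
    """Invert a 4x4 rigid transform without a general matrix inverse."""
    R, t = T[:3, :3], T[:3, 3:]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3:] = -R.T @ t
    return Ti

def _rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

# Demo: T12 maps frame 1 -> frame 2, T23 maps frame 2 -> frame 3 (made-up poses).
T12 = to_hom(_rz(0.1), np.array([1., 0., 0.]))
T23 = to_hom(_rz(0.2), np.array([0., 1., 0.]))
X1 = np.array([0.5, 0.2, 3.0, 1.0])       # homogeneous point in frame 1
X3 = T23 @ T12 @ X1                       # same point in frame 3
X1_back = inv_hom(T23 @ T12) @ X3         # back into the very first frame
```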
Thanks a lot for reading :)

Dysth | Tue, 26 Jul 2016 03:07:17 -0500 | http://answers.opencv.org/question/98966/

sfm::reconstruct() flat output
http://answers.opencv.org/question/93527/sfmreconstruct-flat-output/

**Problem**
I cannot get `sfm::reconstruct()` to output anything with a significant z dimension (|z|>1). Everything appears very flat, even if I run the demo `recon2v` on the demo data. When I run the `opencv_test_sfm` it passes.
**Question**
Is this a common issue? Is it possible I built something wrong? I also noticed that the
**Details**
I built the latest OpenCV yesterday. I am using the `sfm` sample (opencv_contrib/modules/sfm/samples/recon2v.cpp) and with its intended sample data (opencv_contrib/modules/sfm/samples/data/recon2v_checkerboards.txt). If I scale the z values by a factor of 100, I can see that the two planes of the checkerboard data are both very planar, but are actually on different planes. However, it doesn't make sense that you would have to scale the z values to see the depth (does it?)
I have additional suspicions that something is not built quite right: I have to compile the application with -DCERES_FOUND (which seems like it should only be done within cmake when building the library). I also get all zeros for my estimated camera intrinsics, both with the demo set and with my own pictures. There's also quite a few opencv tests which fail, though sfm isn't one of them. However, the Ceres solver passes all build tests.
rexroni | Fri, 22 Apr 2016 15:52:26 -0500 | http://answers.opencv.org/question/93527/

Is undistortPoints so noisy or is calibration the problem?
http://answers.opencv.org/question/53682/is-undistortpoints-so-noisy-or-is-calibration-the-problem/

Hi,
I am trying to do a sparse stereo reconstruction.
I am currently trying out a "synthetic" experiment to find out how much noise I get in the 3D reconstruction given "perfect" landmark detection data.
The steps:
1. I take intrinsics I got from 2 real cameras. (The cameras have a resolution of 1600 x 1200)
2. I place the cameras "virtually" 40 cm apart. Both look in the same direction (identity rotation matrix).
3. I define a couple of 3d points around 1.30m in front of the cameras.
4. I use projectpoints to get 2d image points for each camera.
5. I use undistort points on the 2d image points. (I put in the intrinsics as last argument of the function to get the 2d points in "pixel coordinates")
6. I use the undistorted 2d points to triangulate 3d points.
Then I measure the error between the original 3D points and the triangulated ones (L2 distance).
Results (in mm):
Given the following distortion coefficients of the two cameras:
distL = [-0.5844; 0.401; 0; 0; -0.2274]
distR = [-0.6186; 0.2751; 0; 0; 0.03579]
I get an **error of 16 to 22 mm in the 3D reconstruction** for points close to the frame of one of the camera images.
I get 3 mm for points whose image-point counterparts lie closer to the center.
When I set the **distortion coefficients** all to **zero**, the reconstruction is more or less **perfect**.
This leads me to the assumption that undistortPoints is very unstable and a major source of noise.
If you get that much noise after a single reconstruction step, I cannot imagine how SfM can work,
especially considering that in a real system landmark detection will also be a major source of error.
Furthermore, there are many published results on stereo reconstruction with nearly micrometer precision.
Are the distortion coefficients shown above somehow degenerate?
The calibration values I got came from a real calibration procedure. I got an RMS of 2.1 pixels (which is probably bad). But I thought that if I use these values in a "virtual environment", it should not matter what values they have, shouldn't it?
![image description](/upfiles/14220122233019551.png)
In the picture you can see the view of the "right" camera (where the points lie more towards the frame of the camera)
Blue points are projected perfect 3D coordinates with lens distortion.
Green points are the projections of the triangulated points with lens distortion.
Thick red are projected perfect 3D coordinates WITHOUT lens distortion (a ground truth for undistort).
Thin red are projected perfect 3D coordinates after undistortion.
wolfomaniac | Fri, 23 Jan 2015 04:47:13 -0600 | http://answers.opencv.org/question/53682/

SolvePnP - How to use it?
http://answers.opencv.org/question/32881/solvepnp-how-to-use-it/

Hi,
I am doing some multiview geometry reconstruction with structure from motion.
So far I have the following:
- Two images as initial input
- Camera parameters and distortion coeff
- The working rectification pipeline for the initial input images
- Creation of a disparity map
- Creation of a point cloud from the disparity map, by iterating over it and taking the disparity value as z (x and y are the pixel coordinates in the disparity map). (What is not working is reprojectImageTo3D, as my Q matrix seems to be very wrong, but everything else works perfectly.)
This gives me a good pointcloud of the scene.
Now I need to add n more images to the pipeline. I've googled a lot and found the method solvePnP will help me.
But now I am very confused...
SolvePnP takes a list of 3D points and the corresponding 2D image points and reconstructs the R and T vectors for the third, fourth camera, and so on.
I've read that the two vectors need to be aligned, meaning that the first 3D point in the first vector corresponds to the first 2D point in the second vector.
So far so good. But where do I get those correspondences from? Can I use [projectPoints](http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#projectpoints)
to get those two vectors? Or is my whole idea of using the disparity map for depth reconstruction wrong? (Alternative: triangulatePoints using the good matches found before.)
Can someone help me get this straight? How can I use solvePnP to add n more cameras, and therefore 3D points, to my point cloud and improve the reconstruction?

glethien | Tue, 06 May 2014 05:51:12 -0500 | http://answers.opencv.org/question/32881/

Help Recovering Structure From Motion
http://answers.opencv.org/question/5248/help-recovering-structure-from-motion/

Afternoon, all!
I have been banging my head against the problem of building a 3D structure from a set of sequential images for the past week or so and cannot seem to get a decent result. I would greatly appreciate someone taking the time to go over my steps and letting me know if they seem correct. I feel like I am missing something small but fundamental.
1. Build camera calibration matrix K and distortion coefficients from the calibration data of the chessboard provided (using findChessboardCorners(), cornerSubPix(), and calibrateCamera()).
2. Pull in the first and third images from the sequence and undistort them using K and the distortion coefficients.
3. Find features to track in the first image (using goodFeaturesToTrack() with a mask to mask off the sides of the image).
4. Track the features in the new image (using calcOpticalFlowPyrLK()).
At this point, I have a set of point correspondences in image i0 and image i2.
5. Generate the fundamental matrix F from the point correspondences (using the RANSAC flag in findFundamentalMat()).
6. Correct the matches of the point correspondences I found earlier using the new F (using correctMatches()).
From here, I can generate the essential matrix from F and K and extract candidate projection matrices for the second camera.
7. Generate the essential matrix E using E = K^T * F * K per HZ
8. Use SVD on E to get U, S, and V, which then allow me to build the two candidate rotations and two candidate translations.
9. For each candidate rotation, check to ensure the rotation is right-handed by checking sign of determinant. If <0, multiply through by -1.
Now that I have the 4 candidate projection matrices, I want to figure out which one is the correct one.
10. Normalize the corrected matches for images i0 and i2
11. For each candidate matrix:<pre>
11.1. Triangulate the normalized correspondences using P1 = [ I | 0 ]
and P2 = candidate matrix using triangulatePoints().
11.2. Convert the triangulated 3D points out of homogeneous coordinates.
11.3. Select a test 3D point from the list and apply a perspective
transformation to it using P2 (converted to a 4x4 matrix instead of 3x4 where
the last row is [0,0,0,1]) using perspectiveTransform().
11.4. Check if the depth of the 3D point and the Z-component of the
perspectively transformed homogeneous point are both positive. If so,
use this candidate matrix as P2. Else, continue.</pre>
12. If none of the candidate matrices generate a good P2, go back to step 5.
Now I should have two valid projection matrices P1 = [ I | 0 ] and P2 derived from E. I want to then use these matrices to triangulate the point correspondences I found back in step 4.
13. Triangulate the normalized correspondence points using P1 and P2.
14. Convert from homogeneous coordinates to get the real 3D points.
I already have encountered a problem here in that the 3D points I triangulate NEVER seem to correspond to the original structure. From the mug, they don't seem to form a clear surface, and from the statue, they're either scattered or on some line that goes off towards [-∞, -∞, 0] or similar. I am using Matplotlib's Axes3D scatter() method to plot them and see the same results with Matlab, so I assume it's not an issue with the visualization so much as the points. Any advice or insight just at this point alone would be hugely appreciated.
Moving forward though, it gets a little fuzzy in that I am not completely sure how to go about adding the additional frames. Below is my algorithm so far:
1. Store image i2 as the previous image, the image points from i2 as the previous image points, the triangulated 3D points as the corresponding real points, and the projection matrix P2 as the previous P for the loop below.
2. For each next frame iNext:<pre>
2.1. Undistort iNext using K and the distortion coefficients
2.2. Track the points from the previous image
(in the first loop iteration, I use the points from i2)
in the new image to get correspondences.
2.3. Normalize the newly tracked points.
2.4. Use the PerspectiveNPlace algorithm from OpenCV
(solvePnPRansac()) with the previous 3D points I found before
and the normalized points I tracked in the new frame to get
the rotation and translation vector of the new camera position
relative to the previous one along with a set of inliers.
2.5. Store the inlier 3D points and image points from iNext
2.6. Find new features to track in the previous image
2.7. Track the new features into the current image to get a
new set of correspondences
2.8. Correct and normalize the correspondences
2.9. Triangulate the corrected and normalized correspondences
to get a new set of 3D points (I do this to account for issues where
the original 3D points from the first triangulation in step 14 become
occluded).
2.10. Add the list of new 3D and 2D points to the inlier 3D and
2D points from step 2.5.
2.11. Repeat</pre>
3. After all of this, I will have built up a listing of 3D points found from the first triangulation between i0 and i2 and from the inliers of solvePnPRansac().
Unfortunately, the 3D points show nothing in the way of any structure, so I feel like this process of adding new images is wrong...
Any insight would be greatly appreciated, but thanks for taking the time to look over this email either way.
-Cody

cbuntain | Sun, 16 Dec 2012 13:08:47 -0600 | http://answers.opencv.org/question/5248/