OpenCV Q&A Forum - RSS feed
http://answers.opencv.org/questions/
Copyright OpenCV foundation, 2012-2018.

understanding cv2.calibrateCamera in python
http://answers.opencv.org/question/213682/understanding-cv2calibratecamera-in-python/

Hi All,
I'm fairly new to OpenCV and decided to use it in Python to keep things as easy as possible. My overall objective is to use a 3D camera to triangulate the 3D position of an object. The camera I bought is two cameras mounted together, so I know their orientation will be parallel and that they are 60mm apart.

I'm trying to follow
https://answers.opencv.org/question/117141/triangulate-3d-points-from-a-stereo-camera-and-chessboard/
So, first I'm using
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_calib3d/py_calibration/py_calibration.html
to get
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objectPointsArray, imgPointsArray, gray.shape[::-1], None, None)
using a chessboard
Now:
1. I'm confused about how this calibration works and what is returned. Should I be moving the chessboard in all dimensions while the camera is still? Or just in the x, y axes while z remains constant? Or should I keep both the camera and the board still?
2. Of ret, mtx, dist, rvecs, tvecs, which are returned:
- Is mtx the camera matrix, i.e. focal lengths and optical centres? How could it know these if it didn't know the z values of the chessboard?
- Is dist the distortion coefficients, i.e. a matrix describing the fish-eye effect of the camera?
- Is rvecs the rotation of the camera? Should this be zero if the camera is still and mounted parallel to the ground?
- Is tvecs the location of the camera? I don't understand how it could know this from looking at a chessboard, which could be anywhere.
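For reference, the key to question 2 is how the object points passed to cv2.calibrateCamera are built: the chessboard corners are given coordinates in the board's own planar frame, where z is identically zero, so no absolute z values are ever needed. A minimal sketch, assuming a 9x6 inner-corner board and a hypothetical 25 mm square size:

```python
import numpy as np

# Object points for a 9x6 inner-corner chessboard, in "board" coordinates.
# The board is assumed planar, so every corner gets z = 0 -- calibration
# never needs absolute z values, only this known planar geometry.
pattern_size = (9, 6)   # inner corners per row, per column (assumed)
square_size = 25.0      # hypothetical square edge length, in mm

objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
objp *= square_size

# The same objp is passed once per calibration image.
print(objp.shape)        # (54, 3)
print(objp[:, 2].max())  # 0.0 -- all corners lie in the z = 0 plane
```

With this setup, the rvecs and tvecs returned by calibrateCamera are one pose per view of this board frame relative to the camera, not a single camera location in any world frame.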
Any help would be greatly appreciated.

darthShana, Wed, 29 May 2019 13:09:21 -0500
http://answers.opencv.org/question/213682/

Correctly interpreting the Pose (Rotation and Translation) after 'recoverPose' from Essential matrix
http://answers.opencv.org/question/208290/correctly-interpreting-the-pose-rotation-and-translation-after-recoverpose-from-essential-matrix/

Hi,
I have been breaking my head trying to correctly interpret the results of recoverPose from the Essential matrix. Here are the high-level steps I am using:
1. Detect ORB features in two images
2. Match features using BFMatcher
3. findEssentialMat across the two images
4. recoverPose, i.e. R, T, from the two images
5. Triangulate the good features (masked from recoverPose) using the R, T to create 3D point clouds (landmarks)
6. As a ground truth, I also extract chessboard corners from the images and triangulate them using the R, T calculated above. A good planar formation of the chessboard corners indicates that R, T are accurate for triangulation.
7. Plot everything
**So as we can see from images 1488 and 1490, the camera is moving to the left and up, AND it is pointing down and to the right. However, the plot of R and T for the 2nd position reflects something completely different.**
![image description](/upfiles/15490517252702959.png)
![image description](/upfiles/15490517382481242.png)
I have tried inverting both using R' and -(R')*T, but that doesn't plot correctly either. I have tried a bunch of different combinations, but none seem to make sense.
So what gives???
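One detail worth pinning down before reading the code is which frame recoverPose's (R, t) lives in: it maps point coordinates from camera 1's frame into camera 2's frame, so the plottable pose of camera 2 is the inverse transform. A small self-contained sketch with a made-up pose (the 30-degree rotation and unit translation are hypothetical):

```python
import numpy as np

# recoverPose returns (R, t) such that x2 = R @ x1 + t maps a point's
# coordinates from camera-1's frame into camera-2's frame. The pose of
# camera 2 in camera-1's frame (what one usually wants to plot) is the
# inverse transform:  R_pose = R.T,  C2 = -R.T @ t.
# Hypothetical geometry: camera 2 at +1 along camera-1's x axis,
# rotated 30 degrees about y.
theta = np.deg2rad(30.0)
R_pose = np.array([[ np.cos(theta), 0, np.sin(theta)],
                   [ 0,             1, 0            ],
                   [-np.sin(theta), 0, np.cos(theta)]])
C2 = np.array([1.0, 0.0, 0.0])   # camera-2 center, in camera-1 coordinates

# The (R, t) recoverPose would report for this geometry:
R = R_pose.T
t = -R_pose.T @ C2

# Recover the plottable pose back from (R, t):
assert np.allclose(R.T, R_pose)
assert np.allclose(-R.T @ t, C2)
# Sanity check: the camera-2 center maps to the origin of frame 2.
assert np.allclose(R @ C2 + t, 0)
```

So plotting R and t directly shows the frame-change transform, not the second camera's pose; that alone can make a plot look "completely different" from the actual motion.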
The python script and test images can be found [here](https://drive.google.com/drive/folders/1wrYzphphyrFkUIiqXpfm-2mBQsrAN03b?usp=sharing). For reference the python code is:
import numpy as np
import cv2
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def plot_pose3_on_axes(axes, gRp, origin, axis_length=0.1):
    """Plot a 3D pose on given axis 'axes' with given 'axis_length'."""
    # gRp is the rotation from pose to global; origin is the translation (center)
    # draw the camera axes
    x_axis = origin + gRp[:, 0] * axis_length
    line = np.append(origin, x_axis, axis=0)
    axes.plot(line[:, 0], line[:, 1], line[:, 2], 'r-')

    y_axis = origin + gRp[:, 1] * axis_length
    line = np.append(origin, y_axis, axis=0)
    axes.plot(line[:, 0], line[:, 1], line[:, 2], 'g-')

    z_axis = origin + gRp[:, 2] * axis_length
    line = np.append(origin, z_axis, axis=0)
    axes.plot(line[:, 0], line[:, 1], line[:, 2], 'b-')

img1 = cv2.imread('/home/vik748/data/chess_board/GOPR1488.JPG', 1)  # queryImage
img2 = cv2.imread('/home/vik748/data/chess_board/GOPR1490.JPG', 1)

fx = 3551.342810
fy = 3522.689669
cx = 2033.513326
cy = 1455.489194

K = np.float64([[fx, 0, cx],
                [0, fy, cy],
                [0, 0, 1]])
D = np.float64([-0.276796, 0.113400, -0.000349, -0.000469])
print(K, D)

# Convert images to greyscale
gr1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gr2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# Initiate ORB detector
detector = cv2.ORB_create(nfeatures=25000, edgeThreshold=15, patchSize=125, nlevels=32,
                          fastThreshold=20, scaleFactor=1.2, WTA_K=2,
                          scoreType=cv2.ORB_HARRIS_SCORE, firstLevel=0)

# find the keypoints and descriptors with ORB
kp1, des1 = detector.detectAndCompute(gr1, None)
kp2, des2 = detector.detectAndCompute(gr2, None)
print("Points detected: ", len(kp1), " and ", len(kp2))

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)

kp1_match = np.array([kp1[mat.queryIdx].pt for mat in matches])
kp2_match = np.array([kp2[mat.trainIdx].pt for mat in matches])

kp1_match_ud = cv2.undistortPoints(np.expand_dims(kp1_match, axis=1), K, D)
kp2_match_ud = cv2.undistortPoints(np.expand_dims(kp2_match, axis=1), K, D)

E, mask_e = cv2.findEssentialMat(kp1_match_ud, kp2_match_ud, focal=1.0, pp=(0., 0.),
                                 method=cv2.RANSAC, prob=0.999, threshold=0.001)
print("Essential matrix: used ", np.sum(mask_e), " of total ", len(matches), "matches")

points, R, t, mask_RP = cv2.recoverPose(E, kp1_match_ud, kp2_match_ud, mask=mask_e)
print("points:", points, "\trecover pose mask:", np.sum(mask_RP != 0))
print("R:", R, "t:", t.T)

bool_mask = mask_RP.astype(bool)
img_valid = cv2.drawMatches(gr1, kp1, gr2, kp2, matches, None,
                            matchColor=(0, 255, 0),
                            matchesMask=bool_mask.ravel().tolist(), flags=2)
plt.imshow(img_valid)
plt.show()

ret1, corners1 = cv2.findChessboardCorners(gr1, (16, 9), None)
ret2, corners2 = cv2.findChessboardCorners(gr2, (16, 9), None)

corners1_ud = cv2.undistortPoints(corners1, K, D)
corners2_ud = cv2.undistortPoints(corners2, K, D)

# Create 3 x 4 homogeneous transforms
Pose_1 = np.hstack((np.eye(3, 3), np.zeros((3, 1))))
print("Pose_1: ", Pose_1)
Pose_2 = np.hstack((R, t))
print("Pose_2: ", Pose_2)

# Points given as an N x 1 x 2 array
landmarks_hom = cv2.triangulatePoints(Pose_1, Pose_2,
                                      kp1_match_ud[mask_RP[:, 0] == 1],
                                      kp2_match_ud[mask_RP[:, 0] == 1]).T
landmarks_hom_norm = landmarks_hom / landmarks_hom[:, -1][:, None]
landmarks = landmarks_hom_norm[:, :3]

corners_hom = cv2.triangulatePoints(Pose_1, Pose_2, corners1_ud, corners2_ud).T
corners_hom_norm = corners_hom / corners_hom[:, -1][:, None]
corners_12 = corners_hom_norm[:, :3]

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_aspect('equal')  # important!
title = ax.set_title('3D Test')
ax.set_zlim3d(-5, 10)

# Plot triangulated features in red
graph, = ax.plot(landmarks[:, 0], landmarks[:, 1], landmarks[:, 2],
                 linestyle="", marker="o", color='r')
# Plot triangulated chessboard in green
graph, = ax.plot(corners_12[:, 0], corners_12[:, 1], corners_12[:, 2],
                 linestyle="", marker=".", color='g')

# Plot pose 1
plot_pose3_on_axes(ax, np.eye(3), np.zeros(3)[np.newaxis], axis_length=0.5)
# Plot pose 2
plot_pose3_on_axes(ax, R, t.T, axis_length=1.0)

ax.set_zlim3d(-2, 5)
ax.view_init(-70, -90)
plt.show()

vik748, Fri, 01 Feb 2019 14:22:32 -0600
http://answers.opencv.org/question/208290/

What is solvePnP() exactly for? (while i already have the projection matrices P)
http://answers.opencv.org/question/175301/what-is-solvepnp-exactly-for-while-i-already-have-the-projection-matrices-p/

I already know that solvePnP() finds the position (rotation and translation) of the camera using 2D point coordinates and corresponding 3D point coordinates, but I don't really understand why I have to use it after I have triangulated some 3D points with 2 cameras and their corresponding 2D points.
Because while triangulating a new 3D point, I already have (and need) the projection matrices P1 and P2 of the two cameras, which contain the rotations R1, R2 and translations t1, t2, and therefore already give the locations of the cameras w.r.t. the newly triangulated 3D point.
*My workflow is:*
1. Get 2D-correspondences from 2 images.
2. Get Essential Matrix E using these 2D-correspondences.
3. Get relative orientation (R, t) of the 2 images from the Essential Matrix E.
4. Set Projection Matrix P1 of camera1 to
P1 = (1, 0, 0, 0,
      0, 1, 0, 0,
      0, 0, 1, 0);
and set Projection Matrix P2 of camera2 to
P2 = (R.at<double>(0, 0), R.at<double>(0, 1), R.at<double>(0, 2), t.at<double>(0),
      R.at<double>(1, 0), R.at<double>(1, 1), R.at<double>(1, 2), t.at<double>(1),
      R.at<double>(2, 0), R.at<double>(2, 1), R.at<double>(2, 2), t.at<double>(2));
5. Solve the least squares problem
P1 * X = x1
P2 * X = x2
(solving for X, the 3D point), and so on.
After that I get a triangulated 3D point X from these projection matrices P1 and P2 and the 2D point correspondences x1 and x2.
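The least-squares solve in step 5 is the standard DLT triangulation. A minimal NumPy sketch of the idea, with hypothetical projection matrices and a hypothetical point for the sanity check:

```python
import numpy as np

# DLT triangulation sketch: each camera contributes two rows to A, built
# from its 3x4 projection matrix P and its normalized 2D observation x;
# the 3D point is the null vector of A.
def triangulate_dlt(P1, P2, x1, x2):
    """x1, x2 are (u, v) in normalized image coordinates."""
    A = np.array([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    # least-squares solution: right singular vector of smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]   # de-homogenize

# Hypothetical check: identity camera 1, camera 2 shifted -1 along x,
# a point at (0, 0, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.0, 0.0, 5.0])
x1 = (X_true[0] / X_true[2], X_true[1] / X_true[2])
X2_cam = P2 @ np.append(X_true, 1.0)
x2 = (X2_cam[0] / X2_cam[2], X2_cam[1] / X2_cam[2])
assert np.allclose(triangulate_dlt(P1, P2, x1, x2), X_true, atol=1e-8)
```

Note this solves for the point given known camera matrices, which is the opposite problem to solvePnP (pose given known points).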
***My question is now again:
Why do I need to use solvePnP() to get the camera location?
I already have P1 and P2, which should already be the locations of the cameras (w.r.t. the triangulated 3D points).***

mirnyy, Thu, 28 Sep 2017 12:27:55 -0500
http://answers.opencv.org/question/175301/

Help Recovering Structure From Motion
http://answers.opencv.org/question/5248/help-recovering-structure-from-motion/

Afternoon, all!
I have been banging my head against the problem of building a 3D structure from a set of sequential images for the past week or so and cannot seem to get a decent result. I would greatly appreciate someone taking the time to go over my steps and let me know if they seem correct. I feel like I am missing something small but fundamental.
1. Build camera calibration matrix K and distortion coefficients from the calibration data of the chessboard provided (using findChessboardCorners(), cornerSubPix(), and calibrateCamera()).
2. Pull in the first and third images from the sequence and undistort them using K and the distortion coefficients.
3. Find features to track in the first image (using goodFeaturesToTrack() with a mask to mask off the sides of the image).
4. Track the features in the new image (using calcOpticalFlowPyrLK()).
At this point, I have a set of point correspondences in image i0 and image i2.
5. Generate the fundamental matrix F from the point correspondences (using the RANSAC flag in findFundamentalMat()).
6. Correct the matches of the point correspondences I found earlier using the new F (using correctMatches()).
From here, I can generate the essential matrix from F and K and extract candidate projection matrices for the second camera.
7. Generate the essential matrix E using E = K^T * F * K per HZ
8. Use SVD on E to get U, S, and V, which then allow me to build the two candidate rotations and two candidate translations.
9. For each candidate rotation, check to ensure the rotation is right-handed by checking sign of determinant. If <0, multiply through by -1.
Now that I have the 4 candidate projection matrices, I want to figure out which one is the correct one.
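Steps 8-9 can be sketched as follows; this is a hedged NumPy sketch following the standard Hartley & Zisserman decomposition, and the test essential matrix is a made-up pure x-translation:

```python
import numpy as np

def skew(v):
    """Cross-product (skew-symmetric) matrix of a 3-vector."""
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]])

# The four candidate (R, t) pairs from the SVD of E, per H&Z.
def candidate_poses(E):
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    Rs = [U @ W @ Vt, U @ W.T @ Vt]
    # step 9: if det < 0, multiply through by -1 to keep right-handed rotations
    Rs = [R if np.linalg.det(R) > 0 else -R for R in Rs]
    t = U[:, 2]   # translation is recovered only up to sign
    return [(R, s * t) for R in Rs for s in (1.0, -1.0)]

# Hypothetical essential matrix: pure translation along x, R = I,
# so E = [t]_x R = skew((1, 0, 0)).
E = skew(np.array([1.0, 0.0, 0.0]))
poses = candidate_poses(E)
assert len(poses) == 4
for R, t in poses:
    assert np.isclose(np.linalg.det(R), 1.0)   # all proper rotations
    # every candidate reproduces E up to an overall sign
    assert min(np.abs(skew(t) @ R - E).max(),
               np.abs(skew(t) @ R + E).max()) < 1e-9
```

All four candidates reproduce E up to sign, which is exactly why the cheirality test in step 11 is needed to pick the physical one.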
10. Normalize the corrected matches for images i0 and i2
11. For each candidate matrix:<pre>
11.1. Triangulate the normalized correspondences using P1 = [ I | 0 ]
and P2 = candidate matrix using triangulatePoints().
11.2. Convert the triangulated 3D points out of homogeneous coordinates.
11.3. Select a test 3D point from the list and apply a perspective
transformation to it using P2 (converted to a 4x4 matrix instead of 3x4 where
the last row is [0,0,0,1]) using perspectiveTransform().
11.4. Check if the depth of the 3D point and the Z-component of the
perspectively transformed homogeneous point are both positive. If so,
use this candidate matrix as P2. Else, continue.</pre>
12. If none of the candidate matrices generate a good P2, go back to step 5.
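The depth test in step 11.4 (the cheirality check) can be sketched in NumPy with hypothetical candidate matrices; the scale and determinant subtleties of the full H&Z depth formula are ignored here:

```python
import numpy as np

# A candidate P2 is accepted only when a triangulated point has positive
# depth in BOTH cameras.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])

def depth(P, X_h):
    """z coordinate of homogeneous point X_h in the camera with matrix P."""
    return (P @ X_h)[2] / X_h[3]

def passes_cheirality(P2, X_h):
    return depth(P1, X_h) > 0 and depth(P2, X_h) > 0

# A hypothetical point 4 units in front of camera 1, and two candidate
# second cameras: one plausible, one with a 180-degree flip about x.
X_h = np.array([0.0, 0.0, 4.0, 1.0])
P2_good = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
P2_bad = np.hstack([np.diag([1.0, -1.0, -1.0]),
                    np.array([[-1.0], [0.0], [0.0]])])
assert passes_cheirality(P2_good, X_h)
assert not passes_cheirality(P2_bad, X_h)
```

In practice it is safer to run this check over many triangulated points and take a majority vote rather than a single test point, since a single point can be an outlier.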
Now I should have two valid projection matrices P1 = [ I | 0 ] and P2 derived from E. I want to then use these matrices to triangulate the point correspondences I found back in step 4.
13. Triangulate the normalized correspondence points using P1 and P2
14. Convert from homogeneous coordinates to get the real 3D points.
I already have encountered a problem here in that the 3D points I triangulate NEVER seem to correspond to the original structure. From the mug, they don't seem to form a clear surface, and from the statue, they're either scattered or on some line that goes off towards [-∞, -∞, 0] or similar. I am using Matplotlib's Axes3D scatter() method to plot them and see the same results with Matlab, so I assume it's not an issue with the visualization so much as the points. Any advice or insight just at this point alone would be hugely appreciated.
Moving forward though, it gets a little fuzzy in that I am not completely sure how to go about adding the additional frames. Below is my algorithm so far:
1. Store image i2 as the previous image, the image points from i2 as the previous image points, the triangulated 3D points as the corresponding real points, and the projection matrix P2 as the previous P for the loop below.
2. For each next frame iNext:<pre>
2.1. Undistort iNext using K and the distortion coefficients
2.2. Track the points from the previous image
(in the first loop iteration, I use the points from i2)
in the new image to get correspondences.
2.3. Normalize the newly tracked points.
2.4. Use the Perspective-n-Point (PnP) algorithm from OpenCV
(solvePnPRansac()) with the previous 3D points I found before
and the normalized points I tracked in the new frame to get
the rotation and translation vector of the new camera position
relative to the previous one along with a set of inliers.
2.5. Store the inlier 3D points and image points from iNext
2.6. Find new features to track in the previous image
2.7. Track the new features into the current image to get a
new set of correspondences
2.8. Correct and normalize the correspondences
2.9. Triangulate the corrected and normalized correspondences
to get a new set of 3D points (I do this to account for issues where
the original 3D points from the first triangulation in step 14 become
occluded).
2.10. Add the list of new 3D and 2D points to the inlier 3D and
2D points from step 2.5.
2.11. Repeat</pre>
3. After all of this, I will have built up a listing of 3D points found from the first triangulation between i0 and i2 and from the inliers of solvePnPRansac().
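For intuition about step 2.4: solvePnPRansac recovers a camera pose from 2D-3D correspondences. The sketch below shows the underlying idea with a pure-NumPy DLT resection, which is not OpenCV's actual P3P+RANSAC solver; all poses and points are synthetic:

```python
import numpy as np

# DLT camera resection: with normalized image coordinates (K folded out),
# the 3x4 matrix mapping known 3D points to their 2D projections is [R | t],
# recovered here up to scale from n >= 6 correspondences.
def resect_dlt(X, x):
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])
    rows = []
    for i in range(n):
        rows.append(np.concatenate([Xh[i], np.zeros(4), -x[i, 0] * Xh[i]]))
        rows.append(np.concatenate([np.zeros(4), Xh[i], -x[i, 1] * Xh[i]]))
    _, _, Vt = np.linalg.svd(np.array(rows))
    P = Vt[-1].reshape(3, 4)
    P /= np.linalg.norm(P[2, :3])    # rotation rows have unit norm
    if np.linalg.det(P[:, :3]) < 0:  # fix the overall sign
        P = -P
    return P

# Synthetic check with a hypothetical pose: 10 degrees about y, small t.
th = np.deg2rad(10.0)
R = np.array([[np.cos(th), 0, np.sin(th)],
              [0, 1, 0],
              [-np.sin(th), 0, np.cos(th)]])
t = np.array([0.1, -0.2, 0.5])
rng = np.random.default_rng(0)
X = rng.uniform([-1, -1, 4], [1, 1, 6], size=(8, 3))  # points in front of camera
Xc = X @ R.T + t                                      # camera-frame coordinates
x = Xc[:, :2] / Xc[:, 2:3]                            # normalized projections
assert np.allclose(resect_dlt(X, x), np.hstack([R, t[:, None]]), atol=1e-6)
```

The RANSAC wrapper in solvePnPRansac adds robustness to the bad 2D-3D pairs that inevitably accumulate in an incremental pipeline like this, which is why the inlier mask it returns matters.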
Unfortunately, the 3D points show nothing in the way of any structure, so I feel like this process of adding new images is wrong...
Any insight would be greatly appreciated, but thanks for taking the time to look over this email either way.
-Cody

cbuntain, Sun, 16 Dec 2012 13:08:47 -0600
http://answers.opencv.org/question/5248/