Camera Pose Estimation With OpenCV and Python

asked 2017-12-05 12:16:33 -0600

I'm trying to use OpenCV with Python to track camera pose in a video stream. As a test environment, I have a code sample that determines the pose between two images.

The overall flow here is this:

1. Read in the images, resize them, and convert them to grayscale.
2. Extract features from the first image with cv2.goodFeaturesToTrack.
3. Use cv2.calcOpticalFlowPyrLK to find the matching points in the second image.
4. Convert the p1 points (starting image) to (x, y, z), with z set to 0 for all points.
5. Run cv2.solvePnPRansac to get the rotation and translation vectors.
6. Convert the angles from radians to degrees.

> import math
> import cv2
> import numpy as np
>
> def function(mtx, dist):
>     # parameters for Shi-Tomasi corner detection
>     feature_params = dict(maxCorners=100,
>                           qualityLevel=0.3,
>                           minDistance=7,
>                           blockSize=7)
>     # parameters for Lucas-Kanade optical flow
>     lk_params = dict(winSize=(15, 15),
>                      maxLevel=2,
>                      criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))
>     # image 1: read, resize, convert to gray, detect corners
>     image_1 = cv2.imread("/Users/johnmcconnell/Desktop/Test_images/Test_image.jpg")
>     image_1 = cv2.resize(image_1, (640, 480))
>     gray_1 = cv2.cvtColor(image_1, cv2.COLOR_BGR2GRAY)
>     p1 = cv2.goodFeaturesToTrack(gray_1, mask=None, **feature_params)
>     # image 2 (the same file as image 1 in this test)
>     image_2 = cv2.imread("/Users/johnmcconnell/Desktop/Test_images/Test_image.jpg")
>     image_2 = cv2.resize(image_2, (640, 480))
>     gray_2 = cv2.cvtColor(image_2, cv2.COLOR_BGR2GRAY)
>     # track the corners from image 1 into image 2
>     p2, st, err = cv2.calcOpticalFlowPyrLK(gray_1, gray_2, p1, None, **lk_params)
>     # convert the old points to 3d by appending z = 0
>     zeros = np.zeros([len(p1), 1], dtype=np.float64)
>     old_3d = np.dstack((p1, zeros))
>     # get rotation and translation vectors
>     blank, rvecs, tvecs, inliers = cv2.solvePnPRansac(old_3d, p2, mtx, dist)
>     # convert radians to degrees
>     rad = 180 / math.pi
>     roll = rvecs[0] * rad
>     pitch = rvecs[1] * rad
>     yaw = rvecs[2] * rad
>     print(roll)
>     print(pitch)
>     print(yaw)
>     print(tvecs)
>
> # mtx and dist (camera matrix and distortion coefficients) are defined elsewhere
> function(mtx, dist)

Since I am using exactly the same image for both frames in this sample calculation, I was expecting the rotation and translation vectors to be very close to zero. However, they are quite large; take a look at the sample output below. Additionally, with different images that have a known translation between them, the vectors are very wrong.

The question at hand: is my method sound? Have I approached this problem correctly? Have I matched the points correctly? Is this level of noise normal, or is there something I can do about it?

answered 2017-12-05 17:20:31 -0600

Tetragramm

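That is not a valid method of finding the pose. You give all the points a value of z = 0, which is not accurate and will not work: solvePnPRansac needs the real 3D coordinates of the points, and the depths of your tracked features are unknown.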

What you want is called Epipolar Geometry, and is described in THIS tutorial.



Thanks! One follow-up question: once I have the fundamental matrix, how do I get the change in camera pose between the frames? I need the essential matrix, correct?

jake3991 (2017-12-06 10:41:13 -0600)

I knew where to find the tutorial, but I don't actually understand it as well as I should. I think that's correct, because there's a function decomposeEssentialMat which gives you the candidate rotations and the translation direction, but as I said, I don't particularly understand it.

Tetragramm (2017-12-06 14:41:35 -0600)

