
Procedure for obtaining/updating camera pose for a moving camera

asked 2015-07-29 18:05:09 -0500

JVorwald

updated 2015-07-30 03:43:14 -0500

I would like to determine the translation and rotation of a single monocular camera (android phone) mounted on a micro helicopter. The camera has been calibrated with the chess board, so the camera matrix and distortion parameters are available. Is the following the correct procedure? The camera is moving, the background is fixed.

0) Initialize pos_R=Mat.eye(3) and pos_T=Mat.zeros(3,1).
1) Store the first image in Mat img_train and use the ORB detector and BRISK extractor to obtain keypoints / features.
2) Store the next video image in Mat img_query, and use ORB/BRISK with the BF_HG radius matcher.
3) Find the distances between keypoint matches and keep only matches with a distance below a threshold.
4) For the first frame, set it as the key frame. For subsequent frames, update the key frame if the number of keypoints falls below a required number (30) or if the percentage of keypoint matches falls below a required percentage (50).
5) Obtain the change in rotation and translation between the current frame and the last key frame. Use findEssentialMat to obtain the essential matrix from the camera focal length, principal point, and matching points. Then use recoverPose to obtain camera_R, camera_T.
6) Update pos_R and pos_T using gemm. pos_R = camera_R * keyFrame_R. pos_T = keyFrame_R * camera_T + keyFrame_T.
7) Convert to camera angles for display using Rodrigues.
8) Store the query image, keypoints, and features into the train image, keypoints, and features.
9) Repeat starting from step 2.
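
The pose update in step 6 can be sketched in plain Python as below. This is only a sketch under a stated assumption: (key_r, key_t) maps key-frame camera coordinates into the starting frame, and (rel_r, rel_t) maps current-camera coordinates into key-frame camera coordinates. Under that assumption the rotation is multiplied on the right (keyFrame_R * camera_R), so the order written in step 6 would hold only under a different convention. Whether the recoverPose output already has this direction depends on the order in which the matched points are passed, so that is worth checking. The helper names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(a, v):
    """Multiply a 3x3 matrix by a length-3 vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def compose_pose(key_r, key_t, rel_r, rel_t):
    """Chain a key-frame pose with a relative motion.

    Assumption: (key_r, key_t) maps key-frame camera coordinates into the
    start frame, and (rel_r, rel_t) maps current-camera coordinates into
    key-frame camera coordinates.
    """
    pos_r = mat_mul(key_r, rel_r)
    pos_t = [a + b for a, b in zip(mat_vec(key_r, rel_t), key_t)]
    return pos_r, pos_t


# Key frame yawed 90 degrees about z and located at (1, 0, 0).
key_r = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
key_t = [1.0, 0.0, 0.0]
# Relative motion: one unit along the key frame's own x axis, no rotation.
rel_r = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rel_t = [1.0, 0.0, 0.0]

pos_r, pos_t = compose_pose(key_r, key_t, rel_r, rel_t)
print(pos_t)
```

Because recoverPose only gives the translation direction, rel_t here would be known up to scale unless another measurement fixes the length.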

If we can get this working on android, we'll test it by moving the camera 1 foot forward/aft, left/right, up/down. Then rotate camera about vertical axis by 30, 60 deg, and pitch the camera by 15 deg, to see how the results look.
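
For checking a test like this, the recovered rotation matrix can be reduced to an axis and an angle, which is the same information a Rodrigues rotation vector carries (axis times angle in radians). The sketch below is plain Python, not the OpenCV call; treating y as the vertical axis is an assumption made only for the illustration, and the function names are hypothetical.

```python
import math


def rot_y(deg):
    """Rotation matrix for a turn of `deg` degrees about the y axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def axis_angle(r):
    """Return (unit axis, angle in degrees) of a 3x3 rotation matrix.

    Valid for angles strictly between 0 and 180 degrees.
    """
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to guard against round-off pushing the value outside [-1, 1].
    cos_a = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    angle = math.acos(cos_a)
    k = 2.0 * math.sin(angle)
    axis = [(r[2][1] - r[1][2]) / k,
            (r[0][2] - r[2][0]) / k,
            (r[1][0] - r[0][1]) / k]
    return axis, math.degrees(angle)


# Simulate the 30 degree turn from the test plan and recover it.
axis, angle = axis_angle(rot_y(30.0))
print(axis, angle)
```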

As the project progresses, an INS will be integrated and a Kalman filter implemented. Is there any video of indoor flight available for testing?

I've run the procedure on a video from a model helicopter, but I don't know the ground-truth values. The video came from an onboard cam on YouTube. I can see some problems. x, y, z are not in an earth system (X east, Y north, Z up) but instead may be in a system with x up, y right, and z forward. From a 3D graph of the x/y/z results it appears that earth z is the distance from the z axis, because the helicopter starts and ends on the z axis, and returns to the z axis at times that may correspond to the vehicle hitting the ground.

The rotation / translation are in the current camera x/y/z frame, which I think is camera up, camera right, camera forward. Getting to earth axes (X east, Y north, Z up) would require some conversion.
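
A fixed axis conversion of that kind could look like the sketch below. It assumes the camera axes are as guessed above (x up, y right, z forward) and that the camera starts level and facing north; both are assumptions, and the names are hypothetical. OpenCV's camera model is usually described as x right, y down, z forward, in which case the matrix would be different.

```python
# Rows give each earth axis (east, north, up) in terms of camera axes.
# Assumption: camera x is up, y is right, z is forward, and the camera
# starts level, facing north.
CAM_TO_EARTH = [[0, 1, 0],   # east  = camera right (y)
                [0, 0, 1],   # north = camera forward (z)
                [1, 0, 0]]   # up    = camera up (x)


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(a, v):
    """Multiply a 3x3 matrix by a length-3 vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def vector_to_earth(v_cam, m=CAM_TO_EARTH):
    """Re-express a camera-frame vector in earth axes."""
    return mat_vec(m, v_cam)


def rotation_to_earth(r_cam, m=CAM_TO_EARTH):
    """Re-express a camera-frame rotation in earth axes (M R M^T)."""
    return mat_mul(mat_mul(m, r_cam), transpose(m))


# Two units along camera forward should come out as two units north.
t_earth = vector_to_earth([0.0, 0.0, 2.0])
# A 90 degree turn about camera x (up) should become a turn about earth Z.
r_earth = rotation_to_earth([[1, 0, 0], [0, 0, -1], [0, 1, 0]])
print(t_earth, r_earth)
```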

Edit 1: Added key frame and comment about earth axis and results from sample video.


1 answer


answered 2015-08-15 09:25:05 -0500

JVorwald

updated 2015-08-22 06:43:20 -0500

One approach is shown in this example code, which implements an algorithm that consists of 1) detecting 5000 FAST key points, 2) calculating optical flow to get matched key points, 3) finding the essential matrix, and 4) finding the pose. To solve the problem of scaling, the translation vector is scaled to match the actual displacement between the photos. The program is set up to read photo files from the KITTI Odometry database and compare the calculated trajectory with the measured trajectory.
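
The scaling step can be sketched as follows. This is a minimal illustration, not the example program's code: it assumes the ground-truth camera positions for the two frames are available as 3-vectors, and the function name is hypothetical.

```python
import math


def scaled_translation(t_dir, gt_prev, gt_curr):
    """Scale a translation direction to the measured displacement.

    t_dir is the translation from pose recovery, known only up to scale.
    gt_prev and gt_curr are ground-truth positions of the two frames.
    """
    # Length of the true displacement between the two frames.
    scale = math.sqrt(sum((c - p) ** 2 for p, c in zip(gt_prev, gt_curr)))
    # Normalise defensively in case t_dir is not exactly unit length.
    norm = math.sqrt(sum(c * c for c in t_dir))
    if norm == 0.0:
        return [0.0, 0.0, 0.0]
    return [scale * c / norm for c in t_dir]


# True displacement has length 5, estimated direction is straight ahead.
t_scaled = scaled_translation([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [3.0, 0.0, 4.0])
print(t_scaled)
```

Only the length is taken from the ground truth; the direction still comes from the vision estimate, so the trajectory comparison remains meaningful.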

The program can be easily translated to Java and modified to use 1) ORB detection and 2) BRISK extraction with brute-force HAMMINGLUT matching, but the results with 500 ORB key points do not match the measured trajectory as well as those with 5000 FAST key points and optical flow.
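
For reference, brute-force Hamming matching with a distance threshold amounts to the following. This is a pure-Python stand-in for illustration, not OpenCV's matcher, and it uses toy one-byte descriptors where real binary descriptors are much longer. The function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptors(query, train, max_distance):
    """Brute-force nearest-neighbour matching on Hamming distance.

    Returns (query_index, train_index, distance) for every query
    descriptor whose closest train descriptor is within max_distance.
    """
    matches = []
    if not train:
        return matches
    for qi, q in enumerate(query):
        # Compare against every train descriptor and keep the closest.
        ti, dist = min(((i, hamming(q, t)) for i, t in enumerate(train)),
                       key=lambda pair: pair[1])
        if dist <= max_distance:
            matches.append((qi, ti, dist))
    return matches


train = [bytes([0b00000000]), bytes([0b11110000]), bytes([0b11111111])]
query = [bytes([0b11110001]), bytes([0b00001111])]
matches = match_descriptors(query, train, max_distance=2)
print(matches)
```

The second query descriptor is dropped because its best match is still four bits away, which is the kind of filtering described in step 3 of the question.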

Some other, possibly better, approaches are given here and here.

