
If you only have a monocular camera, you have to use some form of Structure from Motion. Unfortunately, OpenCV only provides some of the building blocks, not a complete solution, and I would argue that this problem is still not solved for all cases. By the way, Google is trying to do similar things (Google Tango).

For two images you can try to do the following:

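1. Detect and match features (feature detector + descriptor matcher).
2. Calculate the essential matrix with findEssentialMat (this assumes you have a calibrated camera).
3. Recover the pose with decomposeEssentialMat. It returns two candidate rotations and a translation that is known only up to sign and scale.
4. Find the right rotation and translation sign by testing all points: the correct combination puts the triangulated points in front of both cameras. recoverPose performs this test for you.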

For image sequences it gets a lot more complex (have a look at libmv; there was also a project, libmv2, that tried to make libmv compatible with OpenCV):

1. Find frames with enough parallax.
2. Store the matches (for example in a bipartite graph).
3. Track features over multiple images.
4. Use the P3P algorithm and the five-point algorithm to recover poses (have a look at "Visual Odometry" by D. Nistér).
5. Use bundle adjustment for refinement (Ceres Solver).