
Translation Vector up to a scale factor (odometry).

asked 2016-10-09 13:40:00 -0500

patrchri

updated 2016-10-09 14:16:02 -0500


I apologize in advance for a question not directly related to OpenCV, but I have a question about a project paper on visual odometry that I am reading. I am new to image processing, and because I am doing this out of personal interest rather than for a project, I don't have anyone to ask these kinds of questions. I will number my questions so the answers can be better organised:

On page 3, under 2. Algorithm -> 2.1 Problem formulation -> Output, the paper mentions:

"The vector, t can only be computed upto a scale factor in our monocular scheme."

The paper mentions this scale factor for the translation several times, but I cannot understand what it actually means.

Let me create an example to show how I understand the translation vector, the rotation vector, and the general idea of visual odometry. Please correct me where I am wrong.

Let's say we use a single camera on a mobile robot for visual odometry. In the first frame we run a feature detector (FAST, for example) and, to keep things simple, detect only the 4 corners of the portrait (A, B, C, D, marked in black) at positions A(ax,ay), B(bx,by), C(cx,cy) and D(dx,dy):

[Image: first frame, with the corners A, B, C, D of the portrait marked in black]

Now suppose the robot moves straight forward, without turning, and we capture the next frame. In it we match the same corners (using, for example, the calcOpticalFlowPyrLK() function) and mark them in red as A', B', C', D', at positions A'(ax',ay'), B'(bx',by'), C'(cx',cy') and D'(dx',dy'), while the points from the first frame are still shown in black:

[Image: second frame, with the matched corners A', B', C', D' marked in red and the first-frame corners in black]


  1. If, after the procedure above (assuming I have described it correctly), I find the Essential matrix and recover the translation and rotation with the recoverPose() function, shouldn't that be enough to know how the camera moved?
  2. Isn't the t vector the linear displacement of the camera, and likewise the rotation vector its rotation?
  3. What does the sentence I quoted above actually mean? What is a scale factor, and why should I use one for the translation?

Please bear with me for any follow-up questions that may arise.

Thank you in advance for your answers and your time.



1 answer


answered 2016-10-09 15:13:38 -0500

Tetragramm

The scale factor is basically the units of the number. If you moved ten units, is that feet, cm, km, or inches? From a single camera you can't know, so the translation vector returned is (I believe) a unit vector giving only the direction of motion.
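
To see why the units cannot be recovered, here is a minimal NumPy sketch (not from the paper or from OpenCV). The intrinsics `K`, the four scene points, and the helper `project` are all made up for illustration. It shows that if the whole scene and the camera translation are scaled by the same factor, every pixel lands in exactly the same place, so the images alone cannot tell a small scene with a small motion from a large scene with a large motion.

```python
import numpy as np

def project(points_world, R, t, K):
    """Pinhole projection, assuming the convention x_cam = R @ x_world + t."""
    cam = points_world @ R.T + t        # points in the camera frame
    pix = cam @ K.T                     # homogeneous pixel coordinates
    return pix[:, :2] / pix[:, 2:3]     # divide by depth

# Hypothetical intrinsics and four scene points (e.g. the portrait corners).
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
points = np.array([[-1.0, -0.5, 5.0],
                   [ 1.0, -0.5, 5.0],
                   [ 1.0,  0.5, 6.0],
                   [-1.0,  0.5, 4.0]])

R = np.eye(3)                    # no rotation
t = np.array([0.0, 0.0, -1.0])   # camera moved 1 unit forward along its z axis

s = 10.0                         # arbitrary scale factor (e.g. a change of units)
pixels_small = project(points, R, t, K)
pixels_large = project(s * points, R, s * t, K)

# Both worlds produce identical second-frame images.
print(np.allclose(pixels_small, pixels_large))
```

The first frame (identity pose, zero translation) is unchanged by the scaling for the same reason, so both image pairs are identical and only the direction of `t` can be estimated from them.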

I am fairly sure there's a way of obtaining relative scale using three frames: find the translation directions from 1->2, 2->3 and 1->3. Then you can solve a*t(1->2) + b*t(2->3) = t(1->3) in three dimensions (3 equations, 2 unknowns), which gives you the relative scale. In theory one such triple is enough, but in practice you have to repeat this over many sets and lengths of frames to get an okay-ish estimate of relative scale for all the frames. This is pretty computationally expensive though, and I think it has to be done in post-processing.
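
The three-frame idea can be sketched as a small least-squares problem. This is only an illustration under stated assumptions: the camera centres `C1`, `C2`, `C3` are invented, the data is noise-free, and the three unit directions are assumed to be already expressed in one common coordinate frame (in a real pipeline the translations from each frame pair would first have to be rotated into that frame). The helper names `unit` and `relative_scale` are hypothetical.

```python
import numpy as np

def unit(v):
    """Normalize a vector to length 1 (all a monocular estimate provides)."""
    return v / np.linalg.norm(v)

def relative_scale(d12, d23, d13):
    """Solve a*d12 + b*d23 = d13 in the least-squares sense.

    Three equations, two unknowns; returns (a, b).
    """
    A = np.column_stack([d12, d23])
    (a, b), *_ = np.linalg.lstsq(A, d13, rcond=None)
    return a, b

# Hypothetical camera centres; the true lengths are unknown to the solver.
C1 = np.array([0.0, 0.0, 0.0])
C2 = np.array([1.0, 0.0, 0.0])   # first step has length 1
C3 = np.array([1.0, 0.0, 2.0])   # second step has length 2

d12, d23, d13 = unit(C2 - C1), unit(C3 - C2), unit(C3 - C1)
a, b = relative_scale(d12, d23, d13)

# b / a is the length of step 2->3 relative to step 1->2.
print(b / a)
```

Note that this only works when the two steps are not parallel: if the camera moves in a straight line, all three directions coincide, the system is rank-deficient, and the relative scale cannot be recovered this way.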

