CameraCalibration -> Documentation -> Focal Lengths

asked 2014-02-16 22:59:24 -0600

Maxim Mikhisor

updated 2014-02-20 06:58:01 -0600

jensenb

Hi, everyone.

I have a question about the focal length in the CameraCalibration documentation. Currently the formula looks like this:

[image: projection equations from the OpenCV camera calibration documentation]

Why do we multiply by Fx and Fy in the last two rows, but not in the second and third rows? In my opinion we should use these rows:

......
x' = Fx * x / z
y' = Fy * y / z
......
u = x'' + Cx
v = y'' + Cy

Am I right, or am I misunderstanding something?
Can somebody help me find the error?

EDIT: [image]

For simplicity, let's consider only Z and X. Let Y = 0 everywhere.

D(x): the distortion polynomial
x: original point coordinates (X and Zx axes)
x': undistorted (ideal) coordinates (X and Zx axes)
x'': distorted (real) coordinates (X and Zx axes)
v: distorted (real) coordinates (V and Zv axes)

In my opinion, to calculate image coordinates (V and Zv axes) from camera coordinates (X and Zx axes) we should use these equations:
x' = f * x / z
x'' = D(x')
v = x'' + c

Where is my mistake?
Why does the OpenCV documentation multiply by the focal length at the end rather than at the beginning?


Comments

I have updated my answer to reflect your updated question.

jensenb (2014-02-20 08:33:55 -0600)


answered 2014-02-18 08:18:22 -0600

jensenb

updated 2014-02-20 08:32:51 -0600

The answer lies in the two middle rows of the projection equations, the ones involving x'' and y''. If your camera perfectly followed the pinhole projection model, you could skip the x'' and y'' steps entirely and use only the four equations involving x', y', u, and v. But this is not true of most (all?) cameras in practice. The camera's lens introduces distortion that shifts the projection of a 3D scene point P = [X, Y, Z] from its ideal image coordinates p = [u, v] (as predicted by the pinhole model) to distorted coordinates p* = [u*, v*]. The effect varies with the quality of the camera and the type of lens; typically points are warped away from the principal point in proportion to their distance from it (positive radial distortion). The equations involving x'' and y'' account for the lens distortion, which must be done before projecting the points onto the image plane. See Szeliski chapter 2.1.6 for more info.
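To make the order of operations concrete, here is a minimal NumPy sketch of the steps described above. It is only an illustration: it assumes just two radial coefficients (k1, k2), whereas the full OpenCV model has more terms, and the helper name `project_point` and all numeric values are made up for the example.

```python
import numpy as np

def project_point(P, fx, fy, cx, cy, k1=0.0, k2=0.0):
    """Project a 3D point given in camera coordinates to pixel coordinates.

    Hypothetical helper for illustration: only radial terms k1, k2 are modelled.
    """
    X, Y, Z = P
    # Perspective division: x' = X / Z, y' = Y / Z.
    xp, yp = X / Z, Y / Z
    # Radial distortion applied to x', y' (before any focal-length scaling).
    r2 = xp * xp + yp * yp
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    xpp, ypp = xp * radial, yp * radial
    # Focal length and principal point are applied last.
    u = fx * xpp + cx
    v = fy * ypp + cy
    return np.array([u, v])

# Same scene point, once with an ideal pinhole camera and once with k1 > 0.
ideal = project_point((0.2, 0.1, 1.0), 800, 800, 320, 240)
distorted = project_point((0.2, 0.1, 1.0), 800, 800, 320, 240, k1=0.1)
```

With k1 = k2 = 0 the x'' and y'' step is a no-op and the function reduces to the plain pinhole equations; with a positive k1 the projected point moves away from the principal point.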

EDIT: As to why distortion correction is (almost) always applied before projection on the image plane.

First, you are arguing about right and wrong, i.e. that your proposed equations are right and OpenCV is wrong. This is not a fruitful way to look at the problem. The pinhole camera model defines a set of conditions that any projection of a scene has to obey: the size of an object in the image is inversely proportional to its distance from the camera, directly proportional to the focal length, and depends on its pose relative to the camera. Distortion correction is just a way of fitting an imperfect physical system into this ideal mathematical model. There is no one absolutely correct way to apply it, as long as it brings the imaging process closer to the pinhole camera model.

So yes, you could apply distortion after projection onto the image plane but before decentering, but this requires that your distortion function be estimated in image coordinates rather than normalized camera coordinates. The distortion function is a nonlinear function without an explicit representation; it is approximated via a Taylor expansion up to 6th-degree terms (depending on OpenCV flags), and it depends on the coordinate system in which it was estimated. Because of this nonlinearity you cannot switch the order in which it is performed:

[image]

So if you want to use your projection equations you have to reestimate the distortion function in image coordinates.
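A small numeric check of this point, using the 1D simplification from the question (Y = 0). This is a sketch under stated assumptions: the distortion is reduced to a single radial term D(x) = x * (1 + k * x^2), and the values of f, c and k1 are illustrative, not taken from any real calibration.

```python
import math

f, c = 800.0, 320.0   # illustrative focal length and principal point (pixels)
k1 = 0.1              # illustrative coefficient, valid in normalized coordinates

def distort(x, k):
    # One-term radial distortion in 1D: D(x) = x * (1 + k * x^2)
    return x * (1.0 + k * x * x)

x_norm = 0.2 / 1.0    # x / z

# OpenCV order: distort in normalized coordinates, then scale by f.
v_opencv = f * distort(x_norm, k1) + c

# Proposed order with the SAME coefficient: scale by f, then distort.
v_swapped = distort(f * x_norm, k1) + c

# Proposed order with the coefficient re-expressed for pixel units.
v_rescaled = distort(f * x_norm, k1 / f**2) + c
```

Here `v_opencv` and `v_swapped` differ enormously, because k1 was only meaningful for inputs of the size of x/z. For this one-term model with a single focal length the coefficient can simply be rescaled to k1 / f^2, which makes `v_rescaled` agree with `v_opencv`. With more terms, or with Fx different from Fy, the conversion is less direct, which is one way to see why the function would have to be reestimated.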

Now there is a reason why distortion correction is typically performed as the OpenCV equations dictate, that is in normalized camera coordinates. Estimation of the distortion functions is performed during camera calibration where a planar pattern with a set of accurately known points is observed by the camera from many viewing angles and distances. Typically one of the first steps involves estimating the pose of the calibration target relative to the camera for each input image. Knowing the pose of the calibration pattern relative to the camera means you can ... (more)


Comments

Hi, jensenb. Thank you for the answer.

I know about distortion and understand what the middle rows are responsible for.

My question was about a slightly different aspect.

We can apply the distortion formula to coordinates. For example, if x' and x'' are coordinates, then we can write x'' = DISTORTION(x')

But in the formulas above x' is not a coordinate. It is just the ratio x/z.

Why is x' equal to x/z? Why is x' not equal to Fx * x / z?

Maxim Mikhisor (2014-02-19 01:41:33 -0600)

First, both u,v and x',y' are in a coordinate system, although different ones. The former, u,v, are in image coordinates, whereas x',y' are in so-called ideal camera coordinates, that is, before decentering and projection on the image plane.

jensenb (2014-02-19 02:16:44 -0600)

Second, there is a theoretical justification for undistorting points after perspective division. From an optical point of view, the lens warps light rays after they are reflected off surfaces in the scene (so according to the object's 3D pose in the camera coordinate system, after perspective division), but before the rays land on the image chip. This is why the undistortion equations are applied before projection on the image plane.

Technically speaking, projection on the image plane is just an affine transformation, so it would be possible to perform undistortion in image coordinates, but this would require reestimating the distortion factors in image coordinates, as most calibration software estimates distortion in ideal coordinates.

jensenb (2014-02-19 02:27:52 -0600)

I think I am beginning to understand where my misunderstanding is.

I do not know what "normalized camera coordinates" are.

Can you briefly explain what "normalized camera coordinates" are? Or maybe you know a link where I can read about them?

Maxim Mikhisor (2014-02-20 14:52:50 -0600)

Normalized camera coordinates are what you get after perspective division, but before projection on the image plane and decentering, i.e. x' and x'' from the OpenCV equations. They are called normalized camera coordinates because all points lie on a plane at z=1 in front of the camera. I recommend reading chapter 2 in Szeliski: http://szeliski.org/Book/

jensenb (2014-02-21 01:35:00 -0600)
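
As a quick illustration of the z=1 plane, the following NumPy sketch applies perspective division to a few made-up points that lie on the same ray from the camera center. The point values are assumptions chosen for the example only.

```python
import numpy as np

# Three illustrative points along one ray, at depths 1, 2 and 5.
points = np.array([[0.2, 0.1, 1.0],
                   [0.4, 0.2, 2.0],
                   [1.0, 0.5, 5.0]])

# Perspective division: divide each point by its own Z.
normalized = points / points[:, 2:3]

# Every row now has z = 1, and all three rows coincide at (0.2, 0.1, 1.0).
```

No focal length or principal point appears anywhere in this step, so normalized coordinates do not depend on the camera intrinsics; Fx, Fy, Cx and Cy only enter when mapping to pixels.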

Thank you very much for your help.

Maxim Mikhisor (2014-02-24 02:16:42 -0600)
