Questions on camera matrices

Nbb

I am getting very confused about camera matrices.

1) Does the extrinsic matrix describe the transformation from object to camera or from camera to object?

2) Does the extrinsic matrix apply the rotation before the translation, or vice versa? In either case, does it transform the points with respect to the original origin? That is, is it translate(p with respect to origin) then rotate(p with respect to origin), or is it translate(p with respect to origin) then rotate(p with respect to the new origin after the translation)?

3) I read that the extrinsic matrix transforms points in world coordinates to camera coordinates. When do I use this? Isn't my camera always the origin?

4) Does the extrinsic matrix bring my camera to the world origin, or does it bring the world origin to my camera?

5) Does anyone know of a link or a simple example with images that shows the above numerically? I think seeing a correct, simple numerical example would greatly help my understanding. I am getting extremely confused by this extrinsic matrix: what happens before and after applying the matrix, and where my objects and cameras are before and after.

answered Nov 18 '16

Eduardo

updated Nov 18 '16

Quick answers:

For 1), look at the projection equation: a 3D point expressed in world coordinates is projected onto the image plane using the extrinsic and intrinsic matrices. The extrinsic matrix therefore maps world (or object) coordinates to camera coordinates.
For 2), you should look for a course on this topic (homogeneous transformations); some are listed at the end of this answer.
For 3), you use it every time you capture the 3D world in a 2D image.

Whether the camera is the origin depends on how many frames you have. If you have a single object, you can use the same frame for the object and the world. If you have multiple objects, you can define an object frame for each object and a reference world frame somewhere. The extrinsic matrix gives the pose of a frame with respect to the camera frame. In fact, "extrinsic matrix" is just the name given to the homogeneous transformation that maps coordinates from one frame to the camera frame.
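To make the homogeneous transformation concrete, here is a minimal NumPy sketch. The rotation (90 degrees about Z) and the translation are made-up values chosen only for illustration, and the variable names are my own. It shows that the matrix rotates first and then translates (X_cam = R . X_world + t), and that inverting it gives the camera position in the world frame.

```python
import numpy as np

# Hypothetical pose, chosen only for illustration:
# rotation of 90 degrees about the Z axis and a translation along Z.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.0, 0.0, 5.0])

# 4x4 homogeneous transformation: world coordinates -> camera coordinates.
T_cam_world = np.eye(4)
T_cam_world[:3, :3] = R
T_cam_world[:3, 3] = t

# A world point in homogeneous coordinates.
X_world = np.array([1.0, 0.0, 0.0, 1.0])

# Applying the matrix is the same as R @ X + t:
# the point is rotated about the world origin, then translated.
X_cam = T_cam_world @ X_world

# The world origin, seen from the camera, is simply t.
world_origin_in_cam = T_cam_world @ np.array([0.0, 0.0, 0.0, 1.0])

# The inverse transformation maps camera coordinates -> world coordinates.
# Its translation part is the camera centre in the world frame, -R^T t.
T_world_cam = np.linalg.inv(T_cam_world)
camera_center_world = T_world_cam[:3, 3]

print("X_cam               =", np.round(X_cam[:3], 6))
print("world origin in cam =", np.round(world_origin_in_cam[:3], 6))
print("camera centre       =", np.round(camera_center_world, 6))
```

With these values the world point (1, 0, 0) ends up at (0, 1, 5) in the camera frame, the world origin is at (0, 0, 5) in the camera frame, and the camera sits at (0, 0, -5) in the world frame.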

For 4), look at the equation: s . (u, v, 1)^T = K . [R | t] . (X, Y, Z, 1)^T, where s is a scale factor (the depth of the point in the camera frame). Which frame is multiplied by the extrinsic matrix?
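A small numerical sketch of that equation, assuming NumPy. The intrinsic matrix K (focal length 800 px, principal point at (320, 240)) and the extrinsics are invented values, not the result of a real calibration.

```python
import numpy as np

# Assumed intrinsic matrix: focal length 800 px, principal point (320, 240).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Assumed extrinsics: no rotation, world origin 4 units in front of the camera.
R = np.eye(3)
t = np.array([[0.0], [0.0], [4.0]])
Rt = np.hstack([R, t])  # 3x4 matrix [R | t]

# World point (X, Y, Z, 1)^T in homogeneous coordinates.
X_world = np.array([1.0, 0.5, 0.0, 1.0])

# The extrinsic matrix multiplies the WORLD point and yields camera coordinates.
X_cam = Rt @ X_world

# The intrinsic matrix then maps camera coordinates to homogeneous pixels.
x = K @ X_cam  # equals s * (u, v, 1)^T

# Divide by the last component to get pixel coordinates.
u, v = x[0] / x[2], x[1] / x[2]

print("X_cam =", X_cam)
print("u, v  =", u, v)
```

Here X_cam is (1, 0.5, 4) and the pixel is (u, v) = (520, 340). The last component of x, 4, is the scale factor: the depth of the point in the camera frame.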

If you want to experiment, print a chessboard pattern and calibrate your camera. You will get the intrinsic and extrinsic matrices. Look into the OpenCV sample code:

The sample first constructs the list of 3D points for the chessboard.
If you multiply one 3D point in the object frame by the corresponding extrinsic matrix, you get its 3D coordinates in the camera frame. Also, t_x, t_y, t_z in the extrinsic matrix give the translation between the camera frame and the object frame, i.e. the position of the object-frame origin expressed in camera coordinates.
For each image, you will have a different extrinsic matrix.
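The experiment above can be sketched without a camera, assuming NumPy. The board size, square size and the two poses below are hypothetical; in a real calibration the poses would come from cv2.calibrateCamera as rvecs and tvecs, with cv2.Rodrigues converting each rvec into a 3x3 rotation matrix.

```python
import numpy as np

# Assumed chessboard: 9 x 6 inner corners, 25 mm squares.
board_size = (9, 6)   # corners per row, corners per column
square_size = 0.025   # metres

# 3D corner positions in the object (chessboard) frame: a grid on the Z = 0 plane.
object_points = np.array([[j * square_size, i * square_size, 0.0]
                          for i in range(board_size[1])
                          for j in range(board_size[0])])

# Hypothetical extrinsics (R, t), one pair per image.
extrinsics = [
    # Image 0: board facing the camera, 0.5 m away.
    (np.eye(3), np.array([0.0, 0.0, 0.5])),
    # Image 1: board rotated 90 degrees about Z, shifted, 0.8 m away.
    (np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]]), np.array([0.1, 0.0, 0.8])),
]

# Apply X_cam = R . X_obj + t to every corner, for every image.
points_in_camera = [object_points @ R.T + t for R, t in extrinsics]

for idx, pts in enumerate(points_in_camera):
    # The first corner is the object-frame origin, so it maps exactly to t.
    print("image", idx, "object origin in camera frame:", pts[0])
```

The same object points give different camera-frame coordinates in each image because each image has its own extrinsic matrix, and the object origin always lands at (t_x, t_y, t_z).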

I have found the following courses (among others):

Position & motion in 2D and 3D, Robot Vision by Peter Corke
Projective Geometry and Camera Models by Derek Hoiem
Computer Vision: Algorithms and Applications by Richard Szeliski for a general book on the computer vision field
etc.
