Quick answers:

For 1), look at the projection equation: a 3D point expressed in world coordinates is projected onto the image plane using the extrinsic and intrinsic matrices.
For 2), you should look for a course on this topic (homogeneous transformations), maybe this or this.
For 3), you use it every time you capture the world in 3 dimensions into a 2D image.

It depends on how many frames you have. If you have one object, you can use the same frame as both the object frame and the world frame. If you have multiple objects, you can define an object frame for each object and a reference world frame somewhere. The extrinsic matrix gives the pose of a frame with respect to the camera frame.

For 4), look at the equation: (u, v, 1)^T = K . [R | t] . (X, Y, Z, 1)^T (the equality holds up to a scale factor). Which frame is multiplied by the extrinsic matrix?
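To make the equation concrete, here is a minimal sketch in plain Python (no OpenCV or NumPy, only to keep it dependency-free). The values of K and [R | t] are made up for illustration, and the helper names mat_vec and project_point are hypothetical, not part of any library.

```python
def mat_vec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def project_point(K, Rt, point_3d):
    """Project a 3D point, given in the frame that [R | t] maps from, to pixels."""
    X, Y, Z = point_3d
    # Extrinsic step: object/world frame -> camera frame.
    Xc, Yc, Zc = mat_vec(Rt, [X, Y, Z, 1.0])
    # Intrinsic step: camera frame -> homogeneous image coordinates.
    su, sv, s = mat_vec(K, [Xc, Yc, Zc])
    # Divide by the scale factor s (the depth Zc for this form of K).
    return su / s, sv / s

# Assumed intrinsics: focal length 800 px, principal point (320, 240).
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]

# Assumed extrinsics: no rotation, frame origin 2 units in front of the camera.
Rt = [[1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 2.0]]

u, v = project_point(K, Rt, (0.1, -0.05, 0.0))
print(u, v)
```

The point (X, Y, Z) that goes into the extrinsic step is the one expressed in the object (or world) frame, which is the answer to the question above. With real data, cv2.projectPoints does the same job and also applies lens distortion.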

If you want to experiment, print a chessboard pattern and calibrate your camera. You will have the intrinsic and extrinsic matrices. Look into the OpenCV sample code:

here the list of the 3D points for the chessboard is constructed
if you multiply one 3D point in the object frame by the corresponding extrinsic matrix, you get its 3D coordinates in the camera frame. Also, t_x, t_y, t_z in the extrinsic matrix give you the translation between the camera frame and the object frame, that is, the position of the object frame origin expressed in the camera frame.
For each image, you will have a different extrinsic matrix.
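The three points above can be sketched as follows, again in plain Python. The board size, the square size and the two extrinsic matrices are made-up values (a real calibration would estimate one [R | t] per image), and the helper names are hypothetical.

```python
import math

def chessboard_points(cols, rows, square_size):
    """3D corner positions in the chessboard (object) frame; the board lies in Z = 0."""
    return [(j * square_size, i * square_size, 0.0)
            for i in range(rows) for j in range(cols)]

def to_camera_frame(Rt, point):
    """Apply a 3x4 extrinsic matrix [R | t] to a 3D point given in the object frame."""
    X, Y, Z = point
    return tuple(r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in Rt)

def extrinsic_rot_z(angle_deg, t):
    """Hypothetical helper: [R | t] with a rotation about the Z axis and translation t."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0, t[0]],
            [s, c, 0.0, t[1]],
            [0.0, 0.0, 1.0, t[2]]]

# Assumed board: 9 x 6 inner corners, 25 mm squares (units: metres).
board = chessboard_points(cols=9, rows=6, square_size=0.025)

# One made-up extrinsic matrix per image.
extrinsics = [extrinsic_rot_z(0.0, (0.0, 0.0, 0.5)),
              extrinsic_rot_z(90.0, (0.1, -0.2, 0.8))]

for Rt in extrinsics:
    # The object-frame origin (0, 0, 0) lands exactly on (t_x, t_y, t_z).
    origin_cam = to_camera_frame(Rt, board[0])
    # The next corner along the board's X axis, seen from the camera.
    corner_cam = to_camera_frame(Rt, board[1])
    print(origin_cam, corner_cam)
```

The same board points give different camera-frame coordinates for each image, because each image has its own extrinsic matrix.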

I have found the following courses (among others):

Position & motion in 2D and 3D, Robot Vision by Peter Corke
Projective Geometry and Camera Models by Derek Hoiem
Computer Vision: Algorithms and Applications by Richard Szeliski for a general book on the computer vision field
etc.
