My first advice is to look at a course or a book on rigid body transformations and homogeneous transformations. These topics are well covered in robotics or computer graphics courses, for instance:

Rigid Body Motion – Homogeneous Transformations, Claudio Melchiorri

You have to understand what rvec and tvec are. First, the camera model used in computer vision is described here. Additional information on pose estimation is described here.

rvec is a Rodrigues rotation vector, or axis-angle representation.

The rotation axis is described by a unit vector (the direction of rvec) and the "amount" of rotation, i.e. the angle in radians, by the length of the vector. This representation avoids issues you can have with Euler angles, i.e. gimbal lock. A quaternion is another representation of rotation that doesn't have the gimbal lock issue.
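
To make the axis-angle idea concrete, here is a minimal sketch of the Rodrigues formula in plain Python. The function name rodrigues_to_matrix is my own, not an OpenCV name. In practice you would just call cv2.Rodrigues, which performs this conversion for you.

```python
import math

def rodrigues_to_matrix(rvec):
    """Convert an axis-angle (Rodrigues) vector to a 3x3 rotation matrix."""
    rx, ry, rz = rvec
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)  # rotation angle in radians
    if theta < 1e-12:
        # No rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta  # unit rotation axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    # Rodrigues formula: R = cos(theta) I + (1 - cos(theta)) k k^T + sin(theta) [k]_x
    return [
        [c + kx * kx * v,      kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v,      ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

# Example: a rotation of 90 degrees about the Z-axis.
R = rodrigues_to_matrix([0.0, 0.0, math.pi / 2])
```

With this input, the first column of R is (0, 1, 0) up to rounding: the X-axis is rotated onto the Y-axis, as expected for a 90 degree rotation about Z.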

Now, what do rvec and tvec represent?

They allow transforming a 3D point expressed in one coordinate system to another one. Here, you transform 3D points expressed in the tag frame into the camera frame:

image description
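
In equation form, this is commonly written with a homogeneous transformation matrix. The notation below is my own assumption: the subscript c stands for the camera frame, o for the tag (object) frame, R is the 3x3 rotation matrix obtained from rvec, and t is tvec.

```latex
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
=
\begin{bmatrix}
  {}^{c}\mathbf{R}_{o} & {}^{c}\mathbf{t}_{o} \\
  \mathbf{0}_{1 \times 3} & 1
\end{bmatrix}
\begin{bmatrix} X_o \\ Y_o \\ Z_o \\ 1 \end{bmatrix}
```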

So, for the first case you should have (here I use the tag X-axis as an example):

image description

For trivial cases, you can recover the rotation matrix by looking at the values that allow transforming the X-axis, Y-axis and Z-axis of the tag into the camera frame.
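
As a rough illustration, the sketch below uses a hypothetical pose (not the one from the question): a tag facing the camera, i.e. rotated 180 degrees about its X-axis, at a distance of 0.5 along the camera Z-axis. The helper names are mine. It shows that the columns of the rotation matrix are simply the tag axes expressed in the camera frame.

```python
def transform_point(R, t, p):
    """Map a 3D point p from the tag frame to the camera frame: R * p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def transform_direction(R, d):
    """Map a direction (no translation) from the tag frame to the camera frame."""
    return [sum(R[i][j] * d[j] for j in range(3)) for i in range(3)]

# Hypothetical pose: tag rotated 180 degrees about its X-axis,
# located 0.5 (in the unit used for the tag size) in front of the camera.
R = [[1.0,  0.0,  0.0],
     [0.0, -1.0,  0.0],
     [0.0,  0.0, -1.0]]
t = [0.0, 0.0, 0.5]

x_axis_cam = transform_direction(R, [1.0, 0.0, 0.0])  # first column of R
y_axis_cam = transform_direction(R, [0.0, 1.0, 0.0])  # second column of R
z_axis_cam = transform_direction(R, [0.0, 0.0, 1.0])  # third column of R

# The tag origin lands at tvec in the camera frame.
origin_cam = transform_point(R, t, [0.0, 0.0, 0.0])
```

Reading it the other way round: if you know by inspection where the tag X, Y and Z axes point in the camera frame, you can write them down as the three columns of the rotation matrix.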

Most importantly, the components of rvec are not Euler angles.

In my opinion, it is very rare to need Euler angles, and I don't like them. Why?

First, because there are 12 different representations/conventions of Euler angles. So you first have to agree on which Euler convention you are using when you have to deal with other people or other code.

Then, gimbal lock.
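
Gimbal lock can be seen with a small experiment. The sketch below assumes one particular convention out of the 12, R = Rz(yaw) * Ry(pitch) * Rx(roll), and the function names are mine. When pitch is 90 degrees, two different (yaw, roll) pairs give the same rotation matrix, so the Euler angles can no longer be recovered uniquely.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyx_to_matrix(yaw, pitch, roll):
    """R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    Rz = [[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]]
    Ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]]
    return matmul(matmul(Rz, Ry), Rx)

deg = math.radians
# Same pitch of 90 degrees, different yaw and roll, same (roll - yaw).
R_a = euler_zyx_to_matrix(deg(30), deg(90), deg(10))
R_b = euler_zyx_to_matrix(deg(50), deg(90), deg(30))

# Largest element-wise difference between the two matrices.
max_diff = max(abs(R_a[i][j] - R_b[i][j]) for i in range(3) for j in range(3))
```

Here max_diff is zero up to floating-point rounding: at this pitch only the difference (roll - yaw) matters, so one degree of freedom is lost.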

You only need Euler angles when you have to be able to interpret the rotation, so only for printing or as a user input (e.g. to rotate a CAD model in Blender).

The rest of the time, computations should be done using the rotation matrix or quaternion representations.
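
As an example of doing the computation without Euler angles, here is a minimal sketch that converts an axis-angle vector to a unit quaternion and composes two rotations with the Hamilton product. The function names and the (w, x, y, z) ordering are my own choices. Libraries differ on the ordering, so check the one you use.

```python
import math

def rvec_to_quaternion(rvec):
    """Convert an axis-angle vector to a unit quaternion (w, x, y, z)."""
    theta = math.sqrt(sum(c * c for c in rvec))  # rotation angle in radians
    if theta < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)  # identity rotation
    s = math.sin(theta / 2.0) / theta  # scales rvec to sin(theta/2) * unit axis
    return (math.cos(theta / 2.0), rvec[0] * s, rvec[1] * s, rvec[2] * s)

def quaternion_multiply(q1, q2):
    """Hamilton product; represents applying q2 first, then q1."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

# Two successive 45-degree rotations about Z compose into one 90-degree rotation.
q45 = rvec_to_quaternion([0.0, 0.0, math.pi / 4])
q90 = quaternion_multiply(q45, q45)
expected = rvec_to_quaternion([0.0, 0.0, math.pi / 2])
```

q90 and expected agree up to rounding, and no Euler convention had to be chosen along the way.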