
ArUco orientation using the function aruco.estimatePoseSingleMarkers()

asked 2019-07-09 10:28:00 -0600

lamaa

Hi everyone!

I'm trying to write a Python app that determines the position and orientation of an ArUco marker. I calibrated the camera and used aruco.estimatePoseSingleMarkers, which returns the translation and rotation vectors. The translation vector works fine, but I don't understand how the rotation vector works. I took some pictures to illustrate my problem with the "roll rotation":

Here the rotation vector is approximately [in degrees]: [180 0 0]

[image]

Here the rotation vector is approximately [in degrees]: [123 -126 0]

[image]

And here the rotation vector is approximately [in degrees]: [0 -180 0]

[image]

I don't see the logic in these angles. I've tried the other two rotations (pitch and yaw) and they also appear "random". So if you have an explanation, I would be very happy :)



What does "is approximately [in degrees]" mean? You get an (exact) vector of 3 angles in rad.

berak (2019-07-09 10:37:17 -0600)

Yes, but I rounded them and converted them to degrees to get a better idea of their meaning.

lamaa (2019-07-09 10:40:56 -0600)

1 answer


answered 2019-07-10 03:49:23 -0600

Eduardo

updated 2019-07-10 03:59:52 -0600

My first piece of advice is to look at a course or a book on rigid-body and homogeneous transformations. These topics are well covered in robotics and computer graphics courses.

You have to understand what rvec and tvec are. First, the camera model used in computer vision is described here. Additional information on pose estimation is described here.

rvec is a Rodrigues rotation vector, or axis-angle representation.

[image]

The rotation axis is given by the direction of the vector (a unit vector once normalized) and the "amount" of rotation, in radians, by the length of the vector. This representation avoids issues you can have with Euler angles, i.e. gimbal lock. A quaternion is another representation of rotation that does not have the gimbal lock issue.

Most importantly, the three components of rvec are not Euler angles.
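To make the axis-angle reading concrete, here is a minimal sketch that splits a rotation vector into its unit axis and its angle. The helper name axis_angle_from_rvec is hypothetical (it is not an OpenCV function), and the input is the asker's second vector, converted back from the rounded degree values to radians, so the result is only approximate:

```python
import math

def axis_angle_from_rvec(rvec):
    """Split a Rodrigues vector into (unit axis, angle in radians)."""
    angle = math.sqrt(sum(x * x for x in rvec))
    if angle < 1e-12:
        # Zero rotation: the axis is arbitrary, pick X by convention.
        return (1.0, 0.0, 0.0), 0.0
    return tuple(x / angle for x in rvec), angle

# Second example from the question, given there in rounded degrees.
rvec = [math.radians(d) for d in (123.0, -126.0, 0.0)]
axis, angle = axis_angle_from_rvec(rvec)
print("axis:", axis)
print("angle [deg]:", math.degrees(angle))
```

Read this way, the three vectors in the question are rotations of roughly 180, 176 and 180 degrees about the axes (1, 0, 0), about (0.70, -0.72, 0) and (0, -1, 0). The angle stays nearly constant while the axis turns within the XY plane, which is why the individual components look unrelated when they are read as Euler angles.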

Now, what do rvec and tvec represent?

They allow transforming a 3D point expressed in one coordinate system to another one. Here, you transform 3D points expressed in the tag frame into the camera frame:

[image]
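In homogeneous coordinates, this transformation can be written as follows, where R is the 3x3 rotation matrix obtained from rvec (for instance with cv2.Rodrigues) and t is tvec. The subscript names are assumptions, chosen to match the (x,y,z,1)_cam = [R | t] (x,y,z,1)_tag notation used in the comments:

```latex
\begin{bmatrix} X_{cam} \\ Y_{cam} \\ Z_{cam} \\ 1 \end{bmatrix}
=
\begin{bmatrix} R & t \\ 0_{1 \times 3} & 1 \end{bmatrix}
\begin{bmatrix} X_{tag} \\ Y_{tag} \\ Z_{tag} \\ 1 \end{bmatrix}
```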

So, for the first case you should have (here I use the tag X-axis as an example):

[image]

For trivial cases, you can recover the rotation matrix by looking at the values that transform the X-axis, Y-axis and Z-axis of the tag into the camera frame.
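As a rough check of the first case, the sketch below builds the rotation matrix from rvec = (pi, 0, 0) with Rodrigues' formula and applies it to the three tag axes. It is written in plain Python so that it runs without OpenCV; the helpers rodrigues and mat_vec are hypothetical names, and in practice cv2.Rodrigues(rvec) performs the same conversion:

```python
import math

def rodrigues(rvec):
    """3x3 rotation matrix (nested lists) from an axis-angle vector."""
    theta = math.sqrt(sum(x * x for x in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (x / theta for x in rvec)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    # R = c*I + s*[k]x + (1 - c)*k*k^T
    return [
        [c + kx * kx * v,      kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v,      ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

# First case from the question: about 180 degrees around the X-axis.
R = rodrigues([math.pi, 0.0, 0.0])
for name, tag_axis in (("X", [1, 0, 0]), ("Y", [0, 1, 0]), ("Z", [0, 0, 1])):
    # Direction of each tag axis expressed in the camera frame.
    print(name, [round(x, 6) for x in mat_vec(R, tag_axis)])
```

Up to floating-point rounding, the tag X-axis maps to the camera +X direction, while the tag Y and Z axes map to the camera -Y and -Z directions. These three results are the columns of R.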

In my opinion, it is very rare to need to use the Euler angles and I don't like them. Why?

First, because there are 12 different representations/conventions of Euler angles. So you first have to agree on which Euler convention you are using when you have to deal with other people or other code.

Then, gimbal lock.

You only need Euler angles when a human has to interpret the rotation, so only for printing or as user input (e.g. to rotate a CAD model in Blender).

The rest of the time, computations should be (and usually are) done using the rotation matrix or quaternion representations.
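If a quaternion is preferred over a rotation matrix, it can be obtained directly from the axis-angle vector. The following is a small sketch, not an OpenCV API: the helper quaternion_from_rvec is a hypothetical name, and the (w, x, y, z) ordering is an assumption, since some libraries put the scalar part last.

```python
import math

def quaternion_from_rvec(rvec):
    """Unit quaternion (w, x, y, z) for the rotation encoded by rvec."""
    angle = math.sqrt(sum(x * x for x in rvec))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)  # identity rotation
    # Vector part is axis * sin(angle / 2), with axis = rvec / angle.
    s = math.sin(angle / 2.0) / angle
    return (math.cos(angle / 2.0), rvec[0] * s, rvec[1] * s, rvec[2] * s)

# First case from the question: about 180 degrees around the X-axis.
q = quaternion_from_rvec([math.pi, 0.0, 0.0])
print(q)
```

For this input the scalar part is zero up to rounding and the vector part points along X, which is the quaternion of a half-turn about the X-axis.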



Thanks a lot for your answer, now it's a lot clearer! :) I was struggling to find information about computer vision. Yeah, you're right, those Euler angles are really tricky and there is no consensus about their representation. I remember that our teacher almost forbade us to use them. So I will follow your advice and work with the rotation matrix using cv2.Rodrigues(rvec). So if I understand correctly, to find the coordinates of the tag, I can simply do the following: (x,y,z,1)_tag = inv(transform_matrix) (x,y,z,1)_camera? Basically, I'm working on a "vision system" for a robot (ABB MR6400). The camera would be on the robot's arm and, knowing its tool coordinates, the robot would know the position of the tag.

lamaa (2019-07-10 04:41:15 -0600)

For the inverse transformation, have a look at this, p. 72. You don't need to compute the matrix inverse.

I'm not sure I understand what you need. In (x,y,z,1)_cam = [R | t] (x,y,z,1)_tag, t is the translation of the tag frame with respect to the camera frame, expressed in the camera coordinate system. That means that if tx = 10 cm, the tag center is at 10 cm along the camera X-axis.

If you invert the transformation, you get the camera frame with respect to the tag frame, expressed in the tag coordinate system.
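Because R is a rotation matrix, its inverse is its transpose, so the inverse transformation is R^T for the rotation and -R^T t for the translation. A minimal sketch, assuming R comes from cv2.Rodrigues(rvec) and t from tvec; the helper invert_pose and the numeric values are made up for illustration:

```python
def invert_pose(R, t):
    """Invert a rigid transform: returns (R^T, -R^T t)."""
    R_inv = [[R[j][i] for j in range(3)] for i in range(3)]  # transpose
    t_inv = [-sum(R_inv[i][j] * t[j] for j in range(3)) for i in range(3)]
    return R_inv, t_inv

# Hypothetical pose: tag rotated 180 degrees about X, located 10 cm along
# the camera X-axis and 50 cm in front of the camera (units: meters).
R_cam_tag = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
t_cam_tag = [0.10, 0.0, 0.50]

R_tag_cam, t_tag_cam = invert_pose(R_cam_tag, t_cam_tag)
print(R_tag_cam)
print(t_tag_cam)  # camera position expressed in the tag frame
```

Here t_tag_cam is the position of the camera origin in the tag coordinate system, obtained without calling a general matrix inverse.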

Eduardo (2019-07-10 05:17:26 -0600)

Yes, you're right, my bad, I don't need this transformation. But I'm still struggling with the orientation of the ArUco marker... I converted rvec into quaternions and visualized them, but they don't represent the way the marker is oriented.

lamaa (2019-07-11 04:08:03 -0600)

The drawn axes are the way to check that the pose is correctly estimated (red is the X-axis, green the Y-axis and blue the Z-axis). Check that the axes correspond to how you have defined the object's 3D points.

Eduardo (2019-07-12 05:23:48 -0600)
