Ask Your Question

# solvePNP inconsistency when rotating markers on a plane

I'm implementing a 3d laser scanner based on a rotating table with aruco markers on it for camera pose estimation.

Here is a sample captured camera frame, with the aruco markers detected:

The basic pose estimation flow (for one frame) can be summarized as follows:

camera_matrix, distortion_coefficients = readCalibrationParameters()
board_configuration = readBoardConfiguration()
frame = captureCameraFrame()

found_marker_corners = aruco::detectMarkers(frame, board_configuration)

rvec, tvec = cv::solvePNP(found_marker_corners, board_configuration,
camera_matrix, distortion_coefficients)

R = cv::Rodrigues(rvec)
R = R.t();

camera_position = (-R * tvec);

As the scanner's turntable is flat, the computed camera's Z position (height) should be consistent when rotating the turntable around.

However, this is not the case.

There is a direct relationship between the perceived angle of marker's rotation and the computed camera's Z position, with extremes when the marker is seen straight on and when it's rotated by 45 degrees.

The following charts illustrate this relationship.

EDIT: The charts have been replaced with ones based on synthetic images, so as to make sure the problem is not related to some physical-world phenomena and constrain the problem domain to purely algorithmic. The previous images may be still viewed here.

### Single marker results

Effects of finding the camera position based on just one marker (4 points to solvePNP)

Straight view:

View from 45 degrees rotation:

### 4 middle markers results

Effects of finding the camera position based on the 4 middle markers (16 points to solvePNP)

Straight view:

View from 45 degrees rotation:

### All markers results

Effects of finding the camera position based on all the markers (~200 points to solvePNP)

Straight view:

View from 45 degrees rotation:

### Already checked

Some of the things I tried which had no effect on the problem (the relationship was still there):

• using low-resolution camera frames (320x240)
• detecting 4 extra-large markers instead of 52 small ones
• using a different camera
• using different camera calibration / using zero distortion
• changing corner detection parameters
• using the two other solvePNP methods, CV_EPNP and CV_P3P (the relationship was still there, but the precision was much worse)

Are the results a limitation of solvePNP, or am I doing something wrong?

Edit:

• updated the pseudocode to clarify the used methods, according to suggestions
• added separate charts for 1-marker, 4-marker and all-marker position calculation
• added reprojection error for the detected points

Edit 2:

It turned out that the printed markers were not exactly square (1mm difference across the entire table), which made some results worse than they should be. After correcting for that, the main problem still remains, however.

I have replaced the images with ones generated synthetically so as to ensure that the problem is algorithm-related (and not related to, say camera calibration or physical dimensions mismatch).

edit retag close merge delete

## Comments

3

First of all, this a great detailed question! Just some questions or advices:

• can you add the code of the missing functions like invert, just in case of.
• how many points did you use for solvePnP? Only 4 or all the points marker?
• the camera seems to be at a distance of around 50 cm? I don't know if a divergence of 1.4 cm is normal or not for this distance.
• maybe you can try with a chessboard pattern to see if you get the same behavior?
• you can try to display the object frame to see if the Z-axis divergence is visible or not, or even try to compute the reprojection error if you can?
• what is the confidence for the camera parameters, like the reprojection error?
( 2016-04-07 17:04:00 -0500 )edit
1

Thanks for the suggestions, Eduardo!

I've updated the original question with some of them (and will continue to should I reach further conclusions).

As for the distance, the camera is located about 35cm from the middle of the roundtable. The pattern of found positions is regular enough for me to believe it to have a reason that can be understood, and through it, compensated for.

The reprojection error for the camera calibration (standard opencv with chessboard pattern) used for the position estimation is 0.44.

( 2016-04-14 09:13:37 -0500 )edit

I never used Aruco but it seems that corner refinement is disabled by default.

You could try to enable it if it is not already done and see if the precision increase or not.

( 2016-04-14 11:02:43 -0500 )edit

Your image is undistorted, right? Are you accidentally passing in non-zero distortion parameters?

( 2016-04-14 19:12:13 -0500 )edit

@Eduardo I had corner refinement enabled for my experiments. Disabling it retains the same tendency, only makes the computed camera positions more spaced out.

Do you think corner detection could be biased based on the angle of the corner?

( 2016-04-17 16:01:37 -0500 )edit

Thank you for the updates and for the interesting results/works.

If you want to completely eliminate the image processing part, maybe you could try to use directly for the 2D image coordinates of the corners the back projection of the 3D coordinates?

Just an idea, as you use now synthetic data, maybe you could try to plot the real distance to the world frame and see if the shapes correspond? Also plot directly the Z error to see if there is a correlation between distance to world frame and Z error?

Or maybe the effect is not due to the distance but rather the rotation of one axis?

Unfortunately, I am not enough qualified but I would like to know if this effect is due intrinsically to the method (how the method solves the PnP problem) or numerical imprecision.

( 2016-04-18 05:03:39 -0500 )edit

## 2 answers

Sort by ยป oldest newest most voted

There is a relationship between the control points configuration and the accuracy of homography and planar pose estimation methods. I am really interested in understanding the reason behind this for my Ph.D. For the moment I found that the condition number of the A matrix in the DLT transform can serve as metric to predict the accuracy of a given control point configuration and not surprisingly the most stable 4-point configuration is a square (we proved this empirically by minimizing the condition number using a gradient descent approach).

I think also the relative orientation of the points have an influence on the estimation and this is what is happening here. If you are having similar problems please contact me, I am looking for examples of these anomalies.

More information about my research here:

https://arxiv.org/abs/1803.03025

https://www.youtube.com/watch?v=a6lDr...

more

## Comments

1

Very interesting even if I don't understand all the maths behind.

( 2018-05-02 07:30:34 -0500 )edit

The rvecs and tvecs you get is the pose of the object relative the camera coordinates. Try plotting this:

Rodrigues(rvec, R);
R = R.t();
t = (-R * tvec);
more

## Comments

Actually that's almost exactly what I was doing, only in a less direct way.

I've updated the question's pseudocode to make it more clear.

( 2016-04-14 09:15:34 -0500 )edit

Hmm, how sure are you that your board configuration is correct? If you have a scale error, it might very well show up as an oscillation in the Z direction. I don't know that that would produce quite the error you see here. I would expect higher on one side of the rotation and lower on the other.

Can you show the axis? Just put a point at (0,0,0) and the x and y axis would be fine. As if you were going to print it, but smaller. I don't have a very good idea of what x and y mean in relation to the table.

( 2016-04-14 15:41:55 -0500 )edit

Actually, a scale error was there! The physical printed marker board was streched by 0.4% in one dimension. The remaining problem, however, is not related to the scale error - I have re-run all the experiments with marker board generated synthetically to confirm that the problem is not related to any physical-world phenomena. I have also updated the original question with the new (synthetic) images.

( 2016-04-17 15:49:19 -0500 )edit

Ok, so you're at less than 1% error in the vertical direction. That's not too bad, but I guess you need better?

Your re-projection error is under a tenth of a pixel. Try simulating with a higher image resolution and see if that damps the error.

( 2016-04-17 17:04:24 -0500 )edit

I simulated the process with 4x larger image resolution (2x in each axis) and the error got smaller (about 4 times smaller for single-marker and 4-marker and 2 times smaller for all-marker). The charts can be seen here.

Still, the general dependency of the rotation angle of the markers on the Z-position is there, and I would like to find it's reason and compensate for it.

It seems to either depend on the corner detection algorithm working differently depending on the angle of the corner, or on solvePNP working differently depending on the distribution of the points.

( 2016-04-17 18:55:11 -0500 )edit

Honestly? I think the error is almost entirely due to uncertainty in the exact location of the corners. The best way I can think to deal with it is to take a best-fit curve of that oscillation and subtract it off depending on the angle. It's not hard to calculate the angle, and it seems to be pretty consistent in shape. So just calibrate and subtract. That should get you into the hundredth of a mm range. And if you need more than that, you probably shouldn't be using a camera.

( 2016-04-17 19:20:52 -0500 )edit

Official site

GitHub

Wiki

Documentation

## Stats

Asked: 2016-04-07 11:51:07 -0500

Seen: 2,048 times

Last updated: Apr 17 '16