solvePnP doesn't give the same result with four points as with more points

asked 2019-03-02 14:07:11 -0600 by ROSpioneer

I'm computing the camera pose with the solvePnP function of the OpenCV library, and I'm running into an issue: when I give it four points I get a slightly better result than when I give it more points. Here is an example of the code with some sample points:

#include <opencv2/opencv.hpp>

int main(int argc, char **argv)
{
    //with 11 points
    std::vector<cv::Point2f> image_points_eleven =  {
                          cv::Point2f(905.63, 694.37),
                          cv::Point2f(936.403, 696.781),
                          cv::Point2f(988.424, 700.553),
                          cv::Point2f(1020.02, 702.846),
                          cv::Point2f(1016.45, 741.839),
                          cv::Point2f(1012.79, 781.955),
                          cv::Point2f(1009.06, 822.609),
                          cv::Point2f(951.48, 815.937),
                          cv::Point2f(894.528, 810.86),
                          cv::Point2f(898.26, 772.418),
                          cv::Point2f(901.867, 732.744) };
    std::vector<cv::Point3f> model_points_eleven =  {
                          cv::Point3f(-155.32,155.32, 0),
                          cv::Point3f(-70.6,155.32, 0),
                          cv::Point3f(70.6,155.32, 0),
                          cv::Point3f(155.32,155.32, 0),
                          cv::Point3f(155.32,52.95, 0),
                          cv::Point3f(155.32,-52.95, 0),
                          cv::Point3f(155.32,-158.85, 0),
                          cv::Point3f(0, -155.32, 0),
                          cv::Point3f(-155.32,-155.32, 0),
                          cv::Point3f(-155.32,-52.95, 0),
                          cv::Point3f(-155.32, 52.95, 0)};

    //with 4 points     
    std::vector<cv::Point2f> image_points_four =  {
                          cv::Point2f(921.808,714.396),
                          cv::Point2f(999.474,720.263),
                          cv::Point2f(992.486,800.519),
                          cv::Point2f(914.465,793.569) };

    std::vector<cv::Point3f> model_points_four =  {
                          cv::Point3f(-211.5 / 2.0f, 211.5 / 2.0f, 0),
                          cv::Point3f(-211.5 / 2.0f, 211.5 / 2.0f, 0),
                          cv::Point3f(-211.5 / 2.0f, 211.5 / 2.0f, 0),
                          cv::Point3f(-211.5 / 2.0f, 211.5 / 2.0f, 0)};    
    // Camera internals
    cv::Mat camera_matrix = (cv::Mat_<double>(3,3) << 1296.2477, 0, 1028, 0 , 1296.2477, 771, 0, 0, 1);
    cv::Mat dist_coeffs = (cv::Mat_<double>(1,5) << -0.12658285, 0.13909541, 0, 0, -0.040676277); // Lens distortion coefficients (k1, k2, p1, p2, k3)

    // Output rotation and translation
    cv::Mat rotation_vector; // Rotation in axis-angle form
    cv::Mat translation_vector;

    // Solve for pose with the 11 points
    cv::solvePnP(model_points_eleven, image_points_eleven, camera_matrix, dist_coeffs, rotation_vector, translation_vector);
    rotation_vector.convertTo(rotation_vector, CV_32F);
    translation_vector.convertTo(translation_vector, CV_32F);

    std::cout << "Rotation Vector 11pts: " << rotation_vector << std::endl;
    std::cout << "Translation Vector 11pts: " << translation_vector << std::endl;

    // Solve for pose with the 4 points
    cv::solvePnP(model_points_four, image_points_four, camera_matrix, dist_coeffs, rotation_vector, translation_vector);
    rotation_vector.convertTo(rotation_vector, CV_32F);
    translation_vector.convertTo(translation_vector, CV_32F);

    std::cout << "Rotation Vector 4pts: " << rotation_vector << std::endl;
    std::cout << "Translation Vector 4pts: " << translation_vector << std::endl;

}

I get:

Rotation Vector 11pts: [3.0777242; 0.13331571; -0.26817828]
Translation Vector 11pts: [-187.29686; -36.374325; 3410.5]
Rotation Vector 4pts: [1.5553488; 0.13531595; -0.19250046]
Translation Vector 4pts: [83.4842; -4.0712709; -163.0168]

Comparing to an external measurement (laser), I would expect to get more precision the more points I have, or at least the same result; however it doesn't ...
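
One way to make the comparison measurable is to reproject the model points with each estimated pose and average the pixel error: the pose with the smaller reprojection error fits the given correspondences better. Below is a minimal helper sketch (the name meanReprojectionError is mine, not part of the original code; it assumes the variables defined in the program above):

#include <cmath>

// Average reprojection error, in pixels, of a pose estimate.
static double meanReprojectionError(const std::vector<cv::Point3f> &model_points,
                                    const std::vector<cv::Point2f> &image_points,
                                    const cv::Mat &camera_matrix,
                                    const cv::Mat &dist_coeffs,
                                    const cv::Mat &rvec, const cv::Mat &tvec)
{
    // Project the 3D model points into the image with the estimated pose.
    std::vector<cv::Point2f> reprojected;
    cv::projectPoints(model_points, rvec, tvec, camera_matrix, dist_coeffs, reprojected);

    // Accumulate the Euclidean distance between measured and reprojected points.
    double err = 0.0;
    for (size_t i = 0; i < image_points.size(); ++i)
    {
        const cv::Point2f d = image_points[i] - reprojected[i];
        err += std::sqrt(static_cast<double>(d.x) * d.x + static_cast<double>(d.y) * d.y);
    }
    return err / static_cast<double>(image_points.size());
}

// Usage, right after each solvePnP call:
// std::cout << meanReprojectionError(model_points_eleven, image_points_eleven,
//                                    camera_matrix, dist_coeffs,
//                                    rotation_vector, translation_vector) << " px" << std::endl;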


Comments

With 4 points, the Z component of the translation vector is negative, which implies that the points are behind the camera. Also, you have the same 3D coordinate repeated four times in the 4-point case.

With real data, you have to take into account the precision of the 2D coordinates. Four 2D/3D correspondences with precise 2D coordinates will give better results than 11 correspondences with noisy data.

Eduardo ( 2019-03-03 06:48:51 -0600 )
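
As the comment points out, model_points_four lists the same 3D coordinate four times, which makes the 4-point solve degenerate. Assuming the target is a planar square of side 211.5 (the /2.0f factors in the original suggest half-side offsets) and that image_points_four is ordered top-left, top-right, bottom-right, bottom-left, the model points would need to be four distinct corners, something like:

    // Sketch only: distinct corners of an assumed 211.5-unit square, ordered to
    // match image_points_four (top-left, top-right, bottom-right, bottom-left).
    std::vector<cv::Point3f> model_points_four =  {
                          cv::Point3f(-211.5f / 2.0f,  211.5f / 2.0f, 0),  // top-left
                          cv::Point3f( 211.5f / 2.0f,  211.5f / 2.0f, 0),  // top-right
                          cv::Point3f( 211.5f / 2.0f, -211.5f / 2.0f, 0),  // bottom-right
                          cv::Point3f(-211.5f / 2.0f, -211.5f / 2.0f, 0)}; // bottom-left

With four distinct, correctly ordered corners, the Z component of the translation should come out positive (object in front of the camera), in the same range as the 11-point result.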
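Regarding noisy 2D coordinates: if some of the 11 detections are unreliable, OpenCV's solvePnPRansac can be used instead of solvePnP to reject outlier correspondences. A sketch reusing the variables from the question's code (the iteration count and pixel threshold are illustrative guesses, not tuned values):

    // Sketch: robust pose from the 11 correspondences, discarding outliers.
    cv::Mat rvec_ransac, tvec_ransac;
    std::vector<int> inlier_indices;
    cv::solvePnPRansac(model_points_eleven, image_points_eleven,
                       camera_matrix, dist_coeffs,
                       rvec_ransac, tvec_ransac,
                       false,   // useExtrinsicGuess
                       100,     // RANSAC iterations (illustrative)
                       3.0f,    // max reprojection error in pixels for an inlier (illustrative)
                       0.99,    // confidence
                       inlier_indices);
    std::cout << "RANSAC inliers: " << inlier_indices.size() << " / "
              << model_points_eleven.size() << std::endl;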