Using OpenCV projectPoints to overlay real world data on a photo

asked 2017-11-15 03:12:57 -0600

shanemacl

I have a photograph and the matching camera position (x, y, z), orientation (yaw, pitch and roll), camera matrix (Cx, Cy, Fx, Fy), and radial and tangential correction parameters. I would like to overlay some additional 3D information, provided in the same coordinate system, on the photo. Looking at a similar post here, I feel I should be able to do this with the OpenCV projectPoints function as follows:

#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/calib3d/calib3d.hpp"
#include "opencv2/highgui/highgui.hpp"

#include <cmath>
#include <iostream>
#include <string>

int ProjectMyPoints()
{
    std::vector<cv::Point3d> objectPoints;
    std::vector<cv::Point2d> imagePoints;

    // Camera position
    double CameraX = 709095.949, CameraY = 730584.110, CameraZ = 64.740;

    // Camera orientation (converting from Grads to radians)
    double PI = 3.14159265359;
    double Pitch = -99.14890023 * (PI / 200.0),
        Yaw = PI + 65.47067336 * (PI / 200.0),
        Roll = 194.92713428 * (PI / 200.0);

    // Input data in real world coordinates 
    double x, y, z;
    x = 709092.288; y = 730582.891; z = 62.837; objectPoints.push_back(cv::Point3d(x, y, z));
    x = 709091.618; y = 730582.541; z = 62.831; objectPoints.push_back(cv::Point3d(x, y, z));
    x = 709092.131; y = 730581.602; z = 62.831; objectPoints.push_back(cv::Point3d(x, y, z));
    x = 709092.788; y = 730581.973; z = 62.843; objectPoints.push_back(cv::Point3d(x, y, z));

    // Coefficients for camera matrix
    double CV_CX = 1005.1951672908998,
        CV_CY = 1010.36740512214021,
        CV_FX = 1495.61455114326009,
        CV_FY = 1495.61455114326009,

        // Distortion coefficients

        CV_K1 = -1.74729071186991E-1,
        CV_K2 = 1.18342592220238E-1,
        CV_K3 = -2.29972026710921E-2,
        CV_K4 = 0.00000000000000E+0,
        CV_K5 = 0.00000000000000E+0,
        CV_K6 = 0.00000000000000E+0,
        CV_P1 = -3.46272954067614E-4,
        CV_P2 = -4.45389772269491E-4;

    // Intrinsic matrix / camera matrix

    cv::Mat intrinsicMat(3, 3, cv::DataType<double>::type);
    intrinsicMat.at<double>(0, 0) = CV_FX;
    intrinsicMat.at<double>(1, 0) = 0;
    intrinsicMat.at<double>(2, 0) = 0;

    intrinsicMat.at<double>(0, 1) = 0;
    intrinsicMat.at<double>(1, 1) = CV_FY;
    intrinsicMat.at<double>(2, 1) = 0;

    intrinsicMat.at<double>(0, 2) = CV_CX;
    intrinsicMat.at<double>(1, 2) = CV_CY;
    intrinsicMat.at<double>(2, 2) = 1;

    // Rotation matrix created from orientation (R = Rz(Yaw) * Ry(Pitch) * Rx(Roll))
    cv::Mat rRot(3, 3, cv::DataType<double>::type);
    rRot.at<double>(0, 0) = cos(Yaw)*cos(Pitch);
    rRot.at<double>(1, 0) = sin(Yaw)*cos(Pitch);
    rRot.at<double>(2, 0) = -sin(Pitch);

    rRot.at<double>(0, 1) = cos(Yaw)*sin(Pitch)*sin(Roll) - sin(Yaw)*cos(Roll);
    rRot.at<double>(1, 1) = sin(Yaw)*sin(Pitch)*sin(Roll) + cos(Yaw)*cos(Roll);
    rRot.at<double>(2, 1) = cos(Pitch)*sin(Roll);

    rRot.at<double>(0, 2) = cos(Yaw)*sin(Pitch)*cos(Roll) + sin(Yaw)*sin(Roll);
    rRot.at<double>(1, 2) = sin(Yaw)*sin(Pitch)*cos(Roll) - cos(Yaw)*sin(Roll);
    rRot.at<double>(2, 2) = cos(Pitch)*cos(Roll);

    // Convert 3x3 rotation matrix to 3x1 rotation vector
    cv::Mat rVec(3, 1, cv::DataType<double>::type); // Rotation vector
    cv::Rodrigues(rRot, rVec);

    cv::Mat tVec(3, 1, cv::DataType<double>::type); // Translation vector
    tVec.at<double>(0) = CameraX;
    tVec.at<double>(1) = CameraY;
    tVec.at<double>(2) = CameraZ;

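    // OpenCV expects distortion coefficients in the order (k1, k2, p1, p2, k3)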
    cv::Mat distCoeffs(5, 1, cv::DataType<double>::type);   // Distortion vector
    distCoeffs.at<double>(0) = CV_K1;
    distCoeffs.at<double>(1) = CV_K2;
    distCoeffs.at<double>(2) = CV_P1;
    distCoeffs.at<double>(3) = CV_P2;
    distCoeffs.at<double>(4) = CV_K3;

    std::cout << "Intrisic matrix ...
(more)
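(The post is truncated here by the forum's "(more)" link. A minimal sketch of how the projection step presumably continues, calling cv::projectPoints with the matrices built above and printing the results; the exact printout is an assumption, not the original author's code.)

    std::cout << "Intrinsic matrix: " << intrinsicMat << std::endl;
    std::cout << "Rotation vector: " << rVec << std::endl;
    std::cout << "Translation vector: " << tVec << std::endl;

    // Project the world points into the image using the intrinsics,
    // distortion coefficients and extrinsics assembled above
    cv::projectPoints(objectPoints, rVec, tVec, intrinsicMat, distCoeffs, imagePoints);

    for (size_t i = 0; i < imagePoints.size(); ++i)
        std::cout << "Image point: " << imagePoints[i] << std::endl;

    return 0;
}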

Comments

Which unit do you use here x = 709092.288; y = 730582.891; z = 62.837 ?

LBerger ( 2017-11-15 03:45:33 -0600 )

These are Irish national grid coordinates given in metres.

shanemacl ( 2017-11-15 03:50:19 -0600 )

Setting the camera center to 0:

double CameraX0 = 709095.949, CameraY0 = 730584.110, CameraZ0 = 64.740;
double CameraX = 0, CameraY = 0, CameraZ = 0;

x = 709092.288; y = 730582.891; z = 62.837; objectPoints.push_back(cv::Point3d(x- CameraX0, y- CameraY0, z- CameraZ0));
x = 709091.618; y = 730582.541; z = 62.831; objectPoints.push_back(cv::Point3d(x - CameraX0, y - CameraY0, z - CameraZ0));
x = 709092.131; y = 730581.602; z = 62.831; objectPoints.push_back(cv::Point3d(x - CameraX0, y - CameraY0, z - CameraZ0));
x = 709092.788; y = 730581.973; z = 62.843; objectPoints.push_back(cv::Point3d(x - CameraX0, y - CameraY0, z - CameraZ0));
LBerger ( 2017-11-15 03:54:59 -0600 )

gives:

Image point: [17627.9, 66775.4]
Image point: [20419.5, 78275.5]
Image point: [-15776, 96114.5]
Image point: [-25653, 86353.4]
LBerger ( 2017-11-15 03:55:39 -0600 )

Is the distance between the points [709092.288, 730582.891, 62.837] and [709091.618, 730582.541, 62.831] less than 0.5? And the airplane altitude is only 2 meters?

LBerger ( 2017-11-15 04:06:31 -0600 )

Thanks, I tried something similar and got very similar results, though 1000 times smaller.

[0] {x=176.27936753046254 y=667.75395317405560 }
[1] {x=204.19483501796526 y=782.75478423606671 }
[2] {x=-157.75991174112016 y=961.14485170223907 }
[3] {x=-256.52997942897093 y=863.53391123807774 }
shanemacl ( 2017-11-15 04:06:47 -0600 )

Distance is of that order (0.7559 m), which is correct. The object in question is a relatively small rectangular manhole cover.

shanemacl ( 2017-11-15 04:12:41 -0600 )

Thinking that if I compute a projection matrix from known world and corresponding image coordinates, I can back-compute the rotation, translation and K matrices as per this example: https://github.com/daviddoria/Example... This should give me some idea of what is wrong with the equivalent values I'm generating.

shanemacl ( 2017-11-15 04:17:25 -0600 )
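A minimal sketch of that decomposition, assuming a 3x4 projection matrix P has already been estimated from the world/image correspondences (P here is a placeholder, not something computed in the question's code):

    // Split an estimated 3x4 projection matrix back into K, R and the camera centre
    cv::Mat P(3, 4, CV_64F);                  // placeholder: fill with your own estimate
    cv::Mat K, R, camCentreH;
    cv::decomposeProjectionMatrix(P, K, R, camCentreH);

    // decomposeProjectionMatrix returns the camera centre in homogeneous form (4x1);
    // dehomogenise it, then the translation expected by projectPoints is t = -R * C
    cv::Mat C = camCentreH.rowRange(0, 3) / camCentreH.at<double>(3);
    cv::Mat t = -(R * C);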

OK, I see a manhole cover now! And z is 2 meters?

LBerger ( 2017-11-15 04:27:47 -0600 )

Z difference between ground and camera is about 2 meters as the camera is mounted on the roof of a vehicle.

shanemacl ( 2017-11-15 04:37:27 -0600 )

OK, now are the angles (yaw, ...) defined about the same axes?

LBerger ( 2017-11-15 04:40:35 -0600 )

Not sure, so the order in which the angles are multiplied into the rotation matrix could well be an issue. I've since tried working backwards, using solvePnP to back-compute the rotation matrix from known object and image coordinates, but even with the camera origin subtracted from the world coordinates this still results in a non-zero translation vector. This could be due to a poor choice of points, so I'm trying again with more points at greater distances to improve redundancy.

shanemacl ( 2017-11-15 10:52:57 -0600 )
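A minimal sketch of that solvePnP check, assuming the object points have already had the camera origin subtracted and that measuredImagePoints (a placeholder name) holds the corresponding pixel positions picked from the photo; the camera matrix and distortion vector are the ones built in the question:

    // Recover the extrinsics from known 3D/2D correspondences and compare
    // them with the hand-built rVec/tVec from the question
    std::vector<cv::Point2d> measuredImagePoints;     // placeholder: fill from the photo
    cv::Mat rVecPnP(3, 1, CV_64F), tVecPnP(3, 1, CV_64F);
    bool ok = cv::solvePnP(objectPoints, measuredImagePoints, intrinsicMat, distCoeffs,
                           rVecPnP, tVecPnP, false, cv::SOLVEPNP_ITERATIVE);
    if (ok)
        std::cout << "rVec: " << rVecPnP << "\ntVec: " << tVecPnP << std::endl;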

Two things to note. First, subtract off the average of all the points involved from everything. projectPoints doesn't like large translation values.

Second: Remember that the OpenCV camera coordinate system is x = right, y = down, and z = forward. So yaw, pitch, and roll don't go directly into that coordinate system.

Tetragramm ( 2017-11-15 18:05:05 -0600 )
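A minimal sketch of what that re-mapping could look like, assuming rRot from the question maps the camera/vehicle frame to the world grid and that the remaining job is to express the pose in OpenCV's x-right, y-down, z-forward camera frame; the axis permutation below is an assumption that depends entirely on how the vehicle's yaw/pitch/roll are defined:

    // A world point X projects via  x_cam = R_cv * (X - C), so the translation
    // handed to projectPoints must be  tVec = -R_cv * C, not the camera position C itself
    cv::Mat axisSwap = (cv::Mat_<double>(3, 3) <<
        0, 1, 0,     // camera x (right)   <- assumed vehicle axis
        0, 0, -1,    // camera y (down)    <- assumed vehicle axis
        1, 0, 0);    // camera z (forward) <- assumed vehicle axis

    cv::Mat R_cv = axisSwap * rRot.t();                       // world -> OpenCV camera rotation
    cv::Mat camPos = (cv::Mat_<double>(3, 1) << CameraX, CameraY, CameraZ);
    cv::Mat tVec_cv = -(R_cv * camPos);                       // translation for projectPoints
    cv::Rodrigues(R_cv, rVec);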

I would have thought that the whole point of the translation and rotation was to align the world grid with the camera grid. The only reason a large translation should be an issue is if we were working with lower-precision numbers (e.g. floats instead of doubles), but this seems to be using doubles throughout. Is there any documentation covering how exactly the rotation and translation relate to the world grid?

shanemacl ( 2017-11-16 05:48:52 -0600 )