Using OpenCV projectPoints to overlay real world data on a photo
I have a photograph and a matching camera position (x, y, z), orientation (yaw, pitch and roll), camera matrix (Cx, Cy, Fx, Fy), and radial and tangential correction parameters. I would like to overlay some additional 3D information on the photo, which is provided in the same coordinate system. Looking at a similar post here, I feel I should be able to do this using the OpenCV projectPoints function as follows:
#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/calib3d/calib3d.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <iostream>
#include <string>
#include <cmath> // for sin / cos
int ProjectMyPoints()
{
std::vector<cv::Point3d> objectPoints;
std::vector<cv::Point2d> imagePoints;
// Camera position
double CameraX = 709095.949, CameraY = 730584.110, CameraZ = 64.740;
// Camera orientation (converting from Grads to radians)
double PI = 3.14159265359;
double Pitch = -99.14890023 * (PI / 200.0),
Yaw = PI + 65.47067336 * (PI / 200.0),
Roll = 194.92713428 * (PI / 200.0);
// Input data in real world coordinates
double x, y, z;
x = 709092.288; y = 730582.891; z = 62.837; objectPoints.push_back(cv::Point3d(x, y, z));
x = 709091.618; y = 730582.541; z = 62.831; objectPoints.push_back(cv::Point3d(x, y, z));
x = 709092.131; y = 730581.602; z = 62.831; objectPoints.push_back(cv::Point3d(x, y, z));
x = 709092.788; y = 730581.973; z = 62.843; objectPoints.push_back(cv::Point3d(x, y, z));
// Coefficients for camera matrix
double CV_CX = 1005.1951672908998,
CV_CY = 1010.36740512214021,
CV_FX = 1495.61455114326009,
CV_FY = 1495.61455114326009,
// Distortion coefficients
CV_K1 = -1.74729071186991E-1,
CV_K2 = 1.18342592220238E-1,
CV_K3 = -2.29972026710921E-2,
CV_K4 = 0.00000000000000E+0,
CV_K5 = 0.00000000000000E+0,
CV_K6 = 0.00000000000000E+0,
CV_P1 = -3.46272954067614E-4,
CV_P2 = -4.45389772269491E-4;
// Intrinsic matrix / camera matrix
cv::Mat intrisicMat(3, 3, cv::DataType<double>::type);
intrisicMat.at<double>(0, 0) = CV_FX;
intrisicMat.at<double>(1, 0) = 0;
intrisicMat.at<double>(2, 0) = 0;
intrisicMat.at<double>(0, 1) = 0;
intrisicMat.at<double>(1, 1) = CV_FY;
intrisicMat.at<double>(2, 1) = 0;
intrisicMat.at<double>(0, 2) = CV_CX;
intrisicMat.at<double>(1, 2) = CV_CY;
intrisicMat.at<double>(2, 2) = 1;
// Rotation matrix created from orientation (Z-Y-X composition: yaw, then pitch, then roll)
cv::Mat rRot(3, 3, cv::DataType<double>::type);
rRot.at<double>(0, 0) = cos(Yaw)*cos(Pitch);
rRot.at<double>(1, 0) = sin(Yaw)*cos(Pitch);
rRot.at<double>(2, 0) = -sin(Pitch);
rRot.at<double>(0, 1) = cos(Yaw)*sin(Pitch)*sin(Roll) - sin(Yaw)*cos(Roll);
rRot.at<double>(1, 1) = sin(Yaw)*sin(Pitch)*sin(Roll) + cos(Yaw)*cos(Roll);
rRot.at<double>(2, 1) = cos(Pitch)*sin(Roll);
rRot.at<double>(0, 2) = cos(Yaw)*sin(Pitch)*cos(Roll) + sin(Yaw)*sin(Roll);
rRot.at<double>(1, 2) = sin(Yaw)*sin(Pitch)*cos(Roll) - cos(Yaw)*sin(Roll);
rRot.at<double>(2, 2) = cos(Pitch)*cos(Roll);
// Convert 3x3 rotation matrix to 1x3 rotation vector
cv::Mat rVec(3, 1, cv::DataType<double>::type); // Rotation vector
cv::Rodrigues(rRot, rVec);
cv::Mat tVec(3, 1, cv::DataType<double>::type); // Translation vector
tVec.at<double>(0) = CameraX;
tVec.at<double>(1) = CameraY;
tVec.at<double>(2) = CameraZ;
cv::Mat distCoeffs(5, 1, cv::DataType<double>::type); // Distortion vector
distCoeffs.at<double>(0) = CV_K1;
distCoeffs.at<double>(1) = CV_K2;
distCoeffs.at<double>(2) = CV_P1;
distCoeffs.at<double>(3) = CV_P2;
distCoeffs.at<double>(4) = CV_K3;
std::cout << "Intrinsic matrix: " << intrisicMat << std::endl;
std::cout << "Rotation vector: " << rVec << std::endl;
std::cout << "Translation vector: " << tVec << std::endl;
// Project the 3D world points into the image
cv::projectPoints(objectPoints, rVec, tVec, intrisicMat, distCoeffs, imagePoints);
for (const auto &p : imagePoints)
    std::cout << "Image point: " << p << std::endl;
return 0;
}
Which units are you using here: x = 709092.288; y = 730582.891; z = 62.837?
These are Irish National Grid coordinates, given in metres.
Setting the camera centre to 0 gives: the distance between points [709092.288, 730582.891, 62.837] and [709091.618, 730582.541, 62.831] is less than 0.5? And the airplane altitude is only 2 metres?
Thanks, I tried something similar and got very similar results, though 1000 times smaller.
The distance is of that order (0.7559 m), which is correct. The object in question is a relatively small rectangular manhole cover.
I'm thinking that if I compute a projection matrix from known world and corresponding image coordinates, I can back-compute the rotation, translation and K matrices, as per this example: https://github.com/daviddoria/Example... That should give me some idea of what is wrong with the equivalent values I'm generating.
OK, I see a manhole cover now! And z is 2 metres?
The Z difference between the ground and the camera is about 2 metres, as the camera is mounted on the roof of a vehicle.
OK, and are the yaw, pitch and roll angles defined about the same axes?
Not sure, so the order in which the angles are multiplied into the rotation matrix could well be an issue. I've since tried working backwards, using solvePnP to back-compute the rotation matrix from known object and image coordinates, but even with the camera origin subtracted from the world coordinates this still results in a translation vector. That could be due to a poor choice of points, so I'm trying again with more points at greater distances to improve redundancy.
Two things to note. First, subtract off the average of all the points involved from everything. projectPoints doesn't like large translation values.
Second: Remember that the OpenCV camera coordinate system is x = right, y = down, and z = forward. So yaw, pitch, and roll don't go directly into that coordinate system.
I would have thought that the whole point of the translation and rotation was to align the world grid with the camera grid. The only reason a large translation should be an issue is if we were working at lower precision (e.g. floats instead of doubles), but this code uses doubles throughout. Is there any documentation covering how exactly the rotation and translation relate to the world grid?