
Using solvePnP on a video stream with changing points.

I am using a stereo camera to triangulate 3D points from rectified images. I then use these points, together with the matched keypoints, to run solvePnP.
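For context, the triangulation step is roughly the following (a simplified sketch; P1/P2 and ptsL/ptsR are assumed names for the rectified 3x4 projection matrices and the matched left/right pixel coordinates):

// Sketch: triangulate matched keypoints from the rectified stereo pair.
// P1, P2: 3x4 projection matrices of the rectified left/right cameras.
// ptsL, ptsR: matched pixel coordinates (std::vector<cv::Point2f>).
cv::Mat pts4d;
cv::triangulatePoints(P1, P2, ptsL, ptsR, pts4d);      // 4xN homogeneous points
cv::Mat pts3dMat;
cv::convertPointsFromHomogeneous(pts4d.t(), pts3dMat); // Nx1, 3 channels; divides out w
std::vector<cv::Point3f> Points3d;
pts3dMat.copyTo(Points3d);                             // points are in the LEFT camera frame

From the matched keypoints and the triangulated points I then get the 2D/3D correspondences like this: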

//  Find out the 2D/3D correspondences
std::vector<cv::Point3f> list_points3d_model_match;    // 3D coordinates found in the scene
std::vector<cv::Point2f> list_points2d_scene_match;    //  2D coordinates found in the scene    
for (unsigned int match_index = 0; match_index < matchR.size(); ++match_index)
{
    cv::Point3f point3d_model = Points3d[matchR[match_index].trainIdx];   // 3D points
    cv::Point2f point2d_scene = keyPntsCurrent[matchR[match_index].queryIdx].pt;    // 2D point 
    list_points3d_model_match.push_back(point3d_model);             // add 3D point
    list_points2d_scene_match.push_back(point2d_scene);              // add 2D point
}

Then I run solvePnP:

openCvPnp(list_points2d_scene_match, list_points3d_model_match);

with:

cv::Mat rvec(3, 1, cv::DataType<double>::type);
cv::Mat tvec(3, 1, cv::DataType<double>::type);
cv::Mat tvec2(3, 1, cv::DataType<double>::type);
cv::Mat tvecFinal(3, 1, cv::DataType<double>::type);

int64 t0 = cv::getTickCount();  // timing start (t0 is not used further below)
bool useExtrinsicGuess = true;
int iterationsCount = 100;      // RANSAC parameters: only consumed by solvePnPRansac,
float reprojectionError = 0.01; // not by the plain solvePnP call below
double confidence = 0.8;
tvec = 0.0;                     // with useExtrinsicGuess == true, these zeros are
rvec = 0.0;                     // the initial pose guess fed to the solver
// p3d / p2d are the wrapper's parameters, holding the matched
// list_points3d_model_match / list_points2d_scene_match from above
cv::solvePnP(p3d, p2d,
    cameraMatrix, distCoeffs,
    rvec, tvec,
    useExtrinsicGuess,
    cv::SOLVEPNP_ITERATIVE);


cv::Mat R;
cv::Rodrigues(rvec, R);    // R is 3x3
R = R.t();                 // rotation of the inverse (camera orientation)
tvec2 = -R * tvec;         // translation of the inverse (camera position)

cv::Mat T(4, 4, R.type()); // T is the 4x4 camera-to-world transform
T(cv::Range(0, 3), cv::Range(0, 3)) = R * 1;     // copies R into T (the "* 1" makes a MatExpr, which writes into the ROI instead of rebinding the header)
T(cv::Range(0, 3), cv::Range(3, 4)) = tvec2 * 1; // copies tvec2 into T
// fill the last row of T (NOTE: depending on your types, use float or double)
double *p = T.ptr<double>(3);
p[0] = p[1] = p[2] = 0; p[3] = 1;

std::cout << tvec2 << std::endl;
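A note on the unused variables above: iterationsCount, reprojectionError, and confidence are only consumed by the RANSAC variant, not by plain solvePnP. A sketch of how they would be used (hypothetical here, since my code calls the plain version):

// Hypothetical RANSAC variant using the parameters declared above;
// plain cv::solvePnP ignores them.
cv::Mat inliers;
cv::solvePnPRansac(p3d, p2d,
    cameraMatrix, distCoeffs,
    rvec, tvec,
    useExtrinsicGuess,
    iterationsCount,
    reprojectionError,  // in pixels; 0.01 may be far too strict
    confidence,
    inliers,
    cv::SOLVEPNP_ITERATIVE);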

This runs fine and gives me pose values. BUT when I run it on a video stream, where the points are constantly changing as the camera turns, the pose values stay very low. I assume this is because I am only returning the change between frames. Is this right? Or is solvePnP choosing a different zero point each frame, and is this messing up my values?

How can I run solvePnP on a scene where the 3D points change, and recover the cumulative distance the camera has traveled?
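What I am effectively after is an accumulated pose, something like this (a sketch, assuming each frame's solvePnP result describes only the motion relative to the previous frame):

// Accumulate per-frame motion into a global camera-to-world pose.
// T is the 4x4 matrix built above for the current frame.
cv::Mat globalPose = cv::Mat::eye(4, 4, CV_64F); // initialized once, before the stream

// per frame:
globalPose = globalPose * T;
cv::Mat camPosition = globalPose(cv::Range(0, 3), cv::Range(3, 4)); // accumulated position
std::cout << camPosition.t() << std::endl;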

Thank you.


EDIT:

After checking my 3D points in a 3D program, I noticed they were in camera space. I have added the following to flip them to world space.

After the first solvePnP run:

// for the camera-to-world switch of points
if (firstRun == true)
{
    rvec.copyTo(rvecCamSpace);
    tvec.copyTo(tvecCamSpace);
    rvec.copyTo(rvecFinal);
    tvec.copyTo(tvecFinal);
    firstRun = false;
}

Every run after that (before solvePnP):

std::vector<cv::Point2d> p2d;
std::vector<cv::Point3d> p3d;
for (size_t i = 0; i < imagePoints.size(); i++)
{
    if (objectPoints[i].x != 0) // skip points that failed to triangulate
    {
        cv::Point2d tmp;
        tmp.x = imagePoints[i].x;
        tmp.y = imagePoints[i].y;
        p2d.push_back(tmp);

        cv::Point3d tmp2;
        tmp2.x = objectPoints[i].x;
        tmp2.y = objectPoints[i].y;
        tmp2.z = objectPoints[i].z; // was objectPoints[i].y -- a copy/paste bug
        if (firstRun == true)
        {
            p3d.push_back(tmp2);
        }
        // to world space
        if (firstRun == false)
        {
            cv::Mat flip(tmp2); // 3x1 CV_64F copy of the point

            cv::Mat R;
            cv::Rodrigues(rvecCamSpace, R); // R is 3x3

            flip = R.t() * flip - R.t() * tvecCamSpace; // world = R^T * (cam - t)
            cv::Point3d tmpP;
            tmpP.x = flip.at<double>(0);
            tmpP.y = flip.at<double>(1);
            tmpP.z = flip.at<double>(2);
            p3d.push_back(tmpP);
        }
    }
}
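The transform inside that loop is just the inverse camera pose applied to each point, i.e. world = R^T * (cam - t). As a small helper (same math; double precision assumed):

// Apply the inverse of the (rvec, tvec) camera pose to a camera-space point.
static cv::Point3d camToWorld(const cv::Point3d& pCam,
                              const cv::Mat& rvecCam, const cv::Mat& tvecCam)
{
    cv::Mat R;
    cv::Rodrigues(rvecCam, R); // 3x3 rotation
    cv::Mat p = (cv::Mat_<double>(3, 1) << pCam.x, pCam.y, pCam.z);
    cv::Mat w = R.t() * (p - tvecCam); // subtract the translation, then undo the rotation
    return cv::Point3d(w.at<double>(0), w.at<double>(1), w.at<double>(2));
}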

Thanks to a reply below, I have also added:

// additive: compose the accumulated pose with this frame's pose
cv::composeRT(rvecFinal, tvecFinal, rvec, tvec, rvecFinal, tvecFinal);

However, I see a similar issue: the returned translation values jump around and never rise above 5 or so, despite the camera moving 8 meters in the data set. What am I missing? Thanks!



EDIT 2:

After writing this, I realized I was missing a step: since I detect keypoints every frame, the resulting points are different every frame. I have therefore added a "match the previous frame to the current frame" step before the triangulation. Everything still runs, the 3D points look correct, and the projected points look good. But the returned camera pose does not: it jumps around and gives incorrect values.

My workflow is now:

grab stereo frames.

find keypoints in the previous (left) frame and in the current (left) and (right) frames.

match the previous (left) frame with the current (left) frame.

match the surviving left descriptors with the current (right) descriptors (see the sketch after this list).

triangulate points from the matched stereo keypoints.

use the left-camera keypoints and the triangulated 3D points to run solvePnP.

invert the rvec and tvec values to get the camera pose.

repeat.
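The matching chain looks roughly like this (a sketch; kpCurL, descPrevL, descCurL and descCurR are assumed names for the keypoints/descriptors of the previous left, current left and current right images):

// Sketch of the two-stage matching: previous-left -> current-left -> current-right.
cv::BFMatcher matcher(cv::NORM_HAMMING); // assuming binary descriptors (e.g. ORB)

// 1) temporal matches: previous left frame vs current left frame
std::vector<cv::DMatch> matchTime;
matcher.match(descPrevL, descCurL, matchTime);

// 2) keep only the current-left keypoints/descriptors that survived step 1
cv::Mat descCurLKept;
std::vector<cv::KeyPoint> keyPntsCurrent;
for (const cv::DMatch& m : matchTime)
{
    descCurLKept.push_back(descCurL.row(m.trainIdx));
    keyPntsCurrent.push_back(kpCurL[m.trainIdx]);
}

// 3) stereo matches for triangulation: surviving left descriptors vs current right
std::vector<cv::DMatch> matchR;
matcher.match(descCurLKept, descCurR, matchR);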

I have checked the 3D points in a 3D application, and I am projecting them back into the camera frame; they look good.
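The check is essentially this (a sketch; p3d/p2d, rvec/tvec, cameraMatrix and distCoeffs are as in the code above):

// Sanity check: reproject the triangulated points with the solvePnP pose
// and report the mean pixel error against the measured keypoints.
std::vector<cv::Point2d> reproj;
cv::projectPoints(p3d, rvec, tvec, cameraMatrix, distCoeffs, reproj);

double meanErr = 0.0;
for (size_t i = 0; i < reproj.size(); ++i)
{
    double dx = p2d[i].x - reproj[i].x;
    double dy = p2d[i].y - reproj[i].y;
    meanErr += std::sqrt(dx * dx + dy * dy);
}
meanErr /= static_cast<double>(reproj.size());
std::cout << "mean reprojection error: " << meanErr << " px" << std::endl;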

I use the same keypoints that I triangulate with as the image points, so the correspondences are good.

The 3D points are in camera space, as that is what triangulatePoints returns.

The calibration data is good.

I notice that even though I am matching the previous frame to the current one, the 3D point sets for consecutive frames do not align when I look at them. For example, the first point in the set is in a different location from frame 1 to frame 2.

The camera pose, inverted or not, jumps around between -1 and 1, and does not change as the camera moves.

What am I missing?

I have tried flipping the 3D points to object space every frame, and composing the tvec and rvec every frame, and I see the same result.