
[Paid job] Multi-view solvePnP routine

asked 2017-03-07 08:01:01 -0500

elliotwoods

updated 2017-04-08 04:51:25 -0500

berak

Hi all!

I need a multi-view version of the solvePnP function.

Qualitatively: we want to recover the pose (rotation + translation) of an object in space from projections of landmarks on that object onto multiple image planes. Each image plane belongs to a calibrated camera at a fixed location in the world (for each we know a priori: cameraMatrix, distortionCoefficients, rotation, translation). The object is covered in markers (e.g. 10-20 markers, of which perhaps 4-8 will be in any one camera's view at a time). The markers can be seen and identified in the cameras, and each has a known 3D position in object space and a known corresponding 2D position in each image plane. Using the correspondences between 3D points in object space and 2D points in projected image space for each camera, we must reliably (and quickly) find the rotation and translation of the object.


  • Inputs
    • Set[ intrinsics, extrinsics ] views // size N
    • Set[ Set[ObjectPoints], Set[ImagePoints] ] // size N
  • Outputs
    • rotation of object // 3-vector
    • translation of object // 3-vector

I have some notes at:

And posted a freelancer listing at:

With a single camera this is possible using the solvePnP function (which optimises the rotation and translation so that the projections of the object points match the observed points on the image plane).
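Written out, the multi-view version of that optimisation could look like the following. This is only a sketch of the objective, not something taken from OpenCV; the symbols are assumptions for illustration: (r, t) is the unknown object pose, (R_k, t_k) the known world-to-camera extrinsics of view k, \pi_k the projection of view k (cameraMatrix plus distortion), X_{k,i} the i-th object point seen by view k and x_{k,i} its observed image point.

```latex
\min_{r,\,t}\; \sum_{k=1}^{N} \sum_{i=1}^{M_k}
  \left\| \pi_k\!\left( R_k \left( R(r)\, X_{k,i} + t \right) + t_k \right) - x_{k,i} \right\|^2
```

With N = 1 and the camera at the world origin this reduces to the single-camera reprojection error that solvePnP minimises.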

Template for the function could be:

double solvePnPMultiView(vector<vector<cv::Point3f>> objectPointsPerView
        , vector<vector<cv::Point2f>> imagePointProjectionsPerView
        , vector<cv::Mat> cameraMatrixPerView
        , vector<cv::Mat> distortionCoefficientsPerView
        , vector<cv::Mat> translationPerView
        , vector<cv::Mat> rotationVectorPerView

        , cv::Mat & objectRotationVector
        , cv::Mat & objectTranslation
        , bool useExtrinsicGuess);

//same function but with different data format
double solvePnPMultiView(vector<vector<cv::Point3f>> objectPointsPerView
        , vector<vector<cv::Point2f>> undistortedImagePointProjectionsPerView
        , vector<cv::Mat> rectifiedProjectionMatrixPerView

        , cv::Mat & objectRotationVector
        , cv::Mat & objectTranslation
        , bool useExtrinsicGuess);

//specific version for stereo (calls one of the functions above)
double solvePnPStereo(vector<cv::Point3f> objectPointsObservedInCamera1
        , vector<cv::Point2f> projectedImagePointsObservedInCamera1
        , vector<cv::Point3f> objectPointsObservedInCamera2
        , vector<cv::Point2f> projectedImagePointsObservedInCamera2
        , cv::Mat cameraMatrix1
        , cv::Mat distortionCoefficientsCamera1
        , cv::Mat cameraMatrix2
        , cv::Mat distortionCoefficientsCamera2
        , cv::Mat camera1ToCamera2RotationVector
        , cv::Mat camera1ToCamera2Translation

        , cv::Mat & objectRotationVector
        , cv::Mat & objectTranslation
        , bool useExtrinsicGuess);

(these functions would all call the same code internally, but just have different ways of being used)
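As a rough illustration of what such a routine would have to minimise internally, here is a dependency-free sketch of the summed reprojection error over all views. Everything in it is an assumption for illustration: the View struct, the helper names, and the simplification that image points are already undistorted (so only fx, fy, cx, cy are needed). In an OpenCV implementation, cv::Rodrigues and cv::projectPoints would take the place of the hand-written helpers, and a non-linear solver would drive this value down over rvec and tvec.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { double u, v; };
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

// Rotation vector (axis * angle, as used by solvePnP) -> rotation matrix.
Mat3 rodrigues(const Vec3& r) {
    const double theta = std::sqrt(r.x * r.x + r.y * r.y + r.z * r.z);
    if (theta < 1e-12) return Mat3{{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}};
    const double kx = r.x / theta, ky = r.y / theta, kz = r.z / theta;
    const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;
    return Mat3{{{c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s},
                 {ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s},
                 {kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v}}};
}

// Rigid transform: returns R * X + t.
Vec3 transform(const Mat3& R, const Vec3& t, const Vec3& X) {
    return Vec3{R.m[0][0] * X.x + R.m[0][1] * X.y + R.m[0][2] * X.z + t.x,
                R.m[1][0] * X.x + R.m[1][1] * X.y + R.m[1][2] * X.z + t.y,
                R.m[2][0] * X.x + R.m[2][1] * X.y + R.m[2][2] * X.z + t.z};
}

// Hypothetical per-view container: fixed calibration plus this view's matches.
struct View {
    Mat3 R;                          // world -> camera rotation
    Vec3 t;                          // world -> camera translation
    double fx, fy, cx, cy;           // pinhole intrinsics (no distortion)
    std::vector<Vec3> objectPoints;  // markers seen by this view, object space
    std::vector<Vec2> imagePoints;   // matching undistorted pixel positions
};

// Pinhole projection of a world-space point into one view.
Vec2 project(const View& view, const Vec3& Xworld) {
    const Vec3 Xc = transform(view.R, view.t, Xworld);
    return Vec2{view.fx * Xc.x / Xc.z + view.cx,
                view.fy * Xc.y / Xc.z + view.cy};
}

// Sum of squared pixel errors over every view for a candidate object pose.
double multiViewReprojectionError(const std::vector<View>& views,
                                  const Vec3& rvec, const Vec3& tvec) {
    const Mat3 Ro = rodrigues(rvec);
    double sum = 0.0;
    for (const View& view : views) {
        for (std::size_t i = 0; i < view.objectPoints.size(); ++i) {
            const Vec3 Xworld = transform(Ro, tvec, view.objectPoints[i]);
            const Vec2 p = project(view, Xworld);
            const double du = p.u - view.imagePoints[i].u;
            const double dv = p.v - view.imagePoints[i].v;
            sum += du * du + dv * dv;
        }
    }
    return sum;
}
```

Because every view contributes to the same sum, a single (rvec, tvec) is solved for all cameras at once rather than one pose per camera.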

The object we're trying to track is a tree (with known mesh taken from photo-scan). The tree is covered in retroreflective markers. We are projecting onto the tree from a set of moving video projectors (attached to robot arm). I'm pretty confident I can figure out which marker is which before we get to the solvePnP stage. This is all part of a new artwork by our studio (please check for an example of previous work).


  • Routine should take less than 3ms on Core i7 for 2 views with 10 object points each.
  • Ideally don't use any libraries other than OpenCV (it would even be great to PR this into OpenCV)
  • I think OpenCV's only numerical solver is CvLevMarq which is C only, but I'd like to ...


First, you should close this and post a new question for visibility.

Second, are the markers associated? By that I mean, does the first marker in the list for camera1 match the first marker in the list for camera2 or are they un-ordered?

Tetragramm ( 2017-03-31 20:43:46 -0500 )

Hi @Tetragramm! I linked to this post from a few places, so I want to stick with it if possible (I think it gets bumped up the list when somebody posts here).

elliotwoods ( 2017-03-31 21:53:41 -0500 )

We can presume that the markers are in ordered sets. i.e. objectPoints1 and imagePoints1 are equal length, and objectPoints1[i] corresponds with imagePoints1[i] (as with solvePnP)

elliotwoods ( 2017-03-31 21:54:33 -0500 )

Well, I can't take your money, but if you look HERE, you can see the multi-camera any pose triangulation I've been working on. That gives you the position of each marker in 3d space.

Then use ICP to register the 3d markers to your model. (This may be overkill, but it's built into OpenCV already).

Tetragramm ( 2017-03-31 23:10:52 -0500 )
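To make the "triangulate each marker first" idea concrete, here is a minimal two-ray midpoint triangulation. It is not the multi-camera method linked above (that code is not reproduced in this thread); it is a simpler stand-in, and the function and struct names are hypothetical. Each ray is given by a camera centre and a viewing direction in world space, which is assumed to have been obtained by back-projecting an undistorted image point.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 add(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 scale(const Vec3& a, double s) { return {a.x * s, a.y * s, a.z * s}; }
static double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Finds the point halfway between the closest points of two rays
// o1 + s*d1 and o2 + t*d2. Returns false if the rays are (nearly) parallel.
bool triangulateMidpoint(const Vec3& o1, const Vec3& d1,
                         const Vec3& o2, const Vec3& d2, Vec3& out) {
    const Vec3 w = sub(o1, o2);
    const double a = dot(d1, d1), b = dot(d1, d2), c = dot(d2, d2);
    const double d = dot(d1, w), e = dot(d2, w);
    const double denom = a * c - b * b;
    if (std::fabs(denom) < 1e-12) return false;

    // Parameters of the closest point on each ray.
    const double s = (b * e - c * d) / denom;
    const double t = (a * e - b * d) / denom;

    const Vec3 p1 = add(o1, scale(d1, s));
    const Vec3 p2 = add(o2, scale(d2, t));
    out = scale(add(p1, p2), 0.5);
    return true;
}
```

With more than two cameras the same idea generalises to a least-squares intersection of all rays, which is closer to what a multi-camera triangulation would do.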

All right, I had some time, so I went ahead and worked on this. I've tested the two pieces individually, but not together, so there's a possibility of bugs. I don't have a particularly good dataset for this.

I've checked it in HERE. See the calcObjectPosition function. Be careful, the signature isn't quite what you asked for, but it's very similar.

Instead of ICP, which works for un-associated point clouds, you know what image point goes with which model point so I used a Least-Squares solution.

Tetragramm ( 2017-04-10 21:16:33 -0500 )
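The comment above does not say which least-squares formulation was used. One standard closed form for aligning associated 3D point sets is the Kabsch solution, sketched here under the assumption that p_i are the marker positions in model space and q_i the matching triangulated positions in world space (both symbol names are illustrative):

```latex
\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i, \qquad
\bar{q} = \frac{1}{n}\sum_{i=1}^{n} q_i, \qquad
H = \sum_{i=1}^{n} (p_i - \bar{p})(q_i - \bar{q})^{\top}

H = U \Sigma V^{\top}, \qquad
R = V \,\mathrm{diag}\!\left(1,\, 1,\, \det(V U^{\top})\right) U^{\top}, \qquad
t = \bar{q} - R\,\bar{p}
```

The resulting R and t minimise \sum_i \| R p_i + t - q_i \|^2; the determinant term guards against a reflection being returned when the points are noisy or nearly coplanar.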

Hi @Tetragramm. Wow, this is amazing! I'll try this as soon as I can. The artwork opening is tomorrow, and my attempt at stereo solvePnP using a non-linear fit didn't work (warning: it uses other libraries and also produces terrible results; I think I have something wrong somewhere), so I'm on a monocular route right now. I will try to switch to your solution tonight. Do you have a DM channel?

elliotwoods ( 2017-04-10 22:34:36 -0500 )

You should get an e-mail from me soon. If you don't, check your spam. I've never used this feature of the forum thing, so I don't know if it works.

Tetragramm ( 2017-04-10 22:39:50 -0500 )

Hi Tetragramm. In the end we went with single-camera because we couldn't get stereo working in time (I didn't get a chance to try your implementation). I didn't receive an email from you, and I just want to get in touch again to say thank you for your time. Let's chat over email if you have time.

elliotwoods ( 2017-08-10 09:34:37 -0500 )

gau ssg un+ew @ gma il. com

Remove the spaces of course, and let me know when you see it so I can remove it before the spammers find it.

Tetragramm ( 2017-08-10 18:58:02 -0500 )

1 answer


answered 2017-03-07 19:07:18 -0500

Tetragramm

This would be the stereoCalibrate function. It doesn't get bonus points, but it does get you what you need.



I'm sorry, but it doesn't get me what I want :). I wish there was a way to hack stereoCalibrate to do what I need (e.g. if there were flags like USE_STEREO_EXTRINSICS_GUESS | FIX_STEREO_EXTRINSICS, and it output the extrinsics to the object rather than just between the cameras). Or perhaps it wasn't clear from my question that I want the extrinsics of the stereo pair RELATIVE TO THE SCENE, not to each other. Thank you

elliotwoods ( 2017-03-07 19:43:30 -0500 )

Ah, I see what you mean. The math on this would actually be fairly complicated. Also, I apologize, I thought stereoCalibrate output a vector of tvec and rvec along with the R and T between cameras.

Your best bet is to modify the stereoCalibrate function. It's old C code though, so it'll be a lot of work. You shouldn't need to add anything but output variables. The rest is just deleting code that doesn't do what you need.

Tetragramm ( 2017-03-07 22:32:05 -0500 )

Current thinking: adding another line after to include the error from the second camera would result in a simultaneous solve. It would just need to be able to transform _r and _t by either R+T, E, F from stereoCalibrate.

elliotwoods ( 2017-03-15 20:52:20 -0500 )

Take a look at composeRT

Tetragramm ( 2017-03-20 17:50:50 -0500 )

Thanks for letting me know about composeRT! it really helps

elliotwoods ( 2017-03-31 21:55:13 -0500 )