Homography vs solvePnP for pose detection, how and why?

asked 2018-10-17 11:54:43 -0500

klayman

I have multiple planar markers, and on each of them I can reliably detect 100-200 points. In each frame I see one or two markers at most, and I need to calculate the camera pose.

I am not sure whether I should use solvePnP or findHomography for the pose detection.

1) The OpenCV homography page states that solvePnP should be used for planar objects as well, and that findHomography is only used there as a demo as far as pose detection goes. See https://docs.opencv.org/3.4.1/d9/dab/... On the other hand, solvePnP uses POSIT, right? That should be less accurate on planar features; please correct me if the implementation actually takes care of the planar case. So, should I ever consider findHomography for pose estimation or not?

2) In case findHomography still makes sense, how should I use it? Should I take two frames and feed 2D-2D coordinates to findHomography, or use the "3D object coordinates without Z" as the source?

Thanks for any hint.



solvePnP() does not use POSIT. See the documentation for the references.

Just use solvePnP(); the implemented methods can handle both planar and non-planar cases.

Eduardo ( 2018-10-17 12:18:49 -0500 )
berak ( 2018-10-17 12:30:17 -0500 )

Ok, this sounds like a conclusive answer, thanks. In order to "handle the planar case", is it required that Z=0 for the object points, or should the 3D coords be "just planar enough", or does it not matter and the accuracy is always the same? I just want to check whether I need to somehow "trigger" the detection of the planar case or not.

klayman ( 2018-10-17 12:31:28 -0500 )

In this tutorial, yes Z=0 is required. If you use solvePnP() you can use an arbitrary planar configuration (not only Z=0).

If you want to go further:

  • generate for instance 10 points that lie on the same plane in an arbitrary configuration
  • generate a rotation vector rvec and a translation vector tvec
  • generate some camera intrinsic parameters
  • project the 3D object points using the intrinsic and extrinsic (rvec and tvec) parameters
  • feed to solvePnP() the 3D object points, the 2D image points, ...
  • estimated rvec and tvec should be close to the real ones
Eduardo ( 2018-10-17 15:19:06 -0500 )

If you want to understand more, you can:

  • read the reference papers mentioned in the solvePnP() documentation (for instance, the EPnP paper explains both the planar and non-planar cases), or read the corresponding source code
  • if you use solvePnP() with SOLVEPNP_ITERATIVE, set a breakpoint here or add a cout, and you should see the method take the planar branch (this branch is pose from homography with a general configuration)
  • if you are familiar with Python, you can use it to write the test code; it is especially useful for displaying the 3D object points (if you want to validate that the points you generate are really planar)
  • you can add some noise to the projected points to have more realistic data
Eduardo ( 2018-10-17 15:26:08 -0500 )