
To do this you need three things:

Camera calibration
Perspective transformation
Bounding box of your objects

First you calibrate your camera and undistort the grabbed image. Then you find the perspective transformation between your camera and the plane onto which you will put objects. Use getPerspectiveTransform with just 4 corners of a rectangle of known dimensions that you put on your plane (for best precision make it big and mark the corners exactly - or use a checkerboard and findHomography instead). The perspective transform found this way will be used to convert the image coordinates of a pixel into Cartesian coordinates local to your rectangular pattern. Once you have found it, you may remove the pattern. It is no longer needed. Just do not move the camera with respect to the plane it is pointed at. If you do, you will have to calculate the perspective transform again.

Now any pixel coordinate in the image can be expressed as an X,Y pair in your "plane" coordinate system with a simple call to perspectiveTransform. Or, if you want to transform the image itself instead of recalculating pixel positions, use warpPerspective. This way it will be even easier. In the warped image (it will look as if you took the picture holding the camera perpendicular to your plane) find the bounding box of your object (with simple thresholding and contour finding). The size of the bounding box (minAreaRect) will be close to the real size of your object, provided the objects are relatively flat, there is almost no parallax error, and the thresholding is not corrupted (for example by shadows). You will get the best results with diffuse light, a camera-plane angle close to 90 degrees, and objects that are flat compared to the camera-plane distance.