Scale and orientation of cameraspace

Introduction

I have obtained my camera intrinsics using the standard chessboard method, with source images at 640x512. This returns the following values:

cx, cy = 319.5, 255.5 (very close to image centre, as expected)
fx = fy = 1731.4

I have done the classic camera calibration and pose estimation as follows:

Chessboard lying flat on table
Camera positioned around 800mm above the chessboard, looking straight down, such that board edges are closely parallel to image edges
Use findChessboardCorners() to get imagespace inner corners
Create modelspace corners directly, with (0,0,0) being the top-left inner corner, and side length being 21.5mm (as measured in the real world)
Use solvePnP() to get transform (rotation, translation) from the board modelspace to cameraspace
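For reference, a minimal sketch of how I build the modelspace corners. The number of inner corners (9 x 6 here) is a placeholder for illustration only, and make_model_corners is just a helper name of my own. The resulting list is what I pass (as a float array) to solvePnP() alongside the imagespace corners.

```python
# Hypothetical board size: 9 x 6 inner corners is an assumption for
# illustration only; substitute the real pattern size.
COLS, ROWS = 9, 6
SQUARE_MM = 21.5  # measured side length of one square, in mm


def make_model_corners(cols, rows, square):
    """Return inner-corner coordinates in board modelspace, row by row.

    (0, 0, 0) is the top-left inner corner and every corner lies in the
    z = 0 plane of the board.
    """
    return [(c * square, r * square, 0.0)
            for r in range(rows)
            for c in range(cols)]


model_corners = make_model_corners(COLS, ROWS, SQUARE_MM)
print(len(model_corners))  # 54
print(model_corners[1])    # (21.5, 0.0, 0.0)
```

The ordering (row by row, left to right) is chosen to match the order in which I expect findChessboardCorners() to return the imagespace corners; if the detected order differs, the correspondence would need adjusting.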

The final step gives me a rotation very close to identity, as expected given that my camera is looking directly down at the chessboard. It also gives me a translation where x and y are small (< 100) and z = 807. The size of z is as expected: it closely matches the measured distance between the camera and the board, in mm.

Question 1

However, I am confused by the direction of the camera local z axis. My understanding is that:

In modelspace, z points directly up from the board
In the pinhole camera model, cameraspace z points into the scene, i.e. from the origin towards the image plane

But given that solvePnP() returns a rotation close to identity, it is effectively giving me a cameraspace orientation where the NEGATIVE z axis points towards the board, rather than the POSITIVE z axis.
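Here is a small sanity check I can run on the sign. It assumes the convention x_cam = R * x_model + T for the solvePnP() output, uses an exact identity for R, and uses made-up x and y translation values (only z = 807 comes from my actual result). The helper name model_to_camera is my own.

```python
def model_to_camera(R, T, x):
    """Apply x_cam = R * x_model + T (assumed solvePnP convention)."""
    return [sum(r * c for r, c in zip(row, x)) + t
            for row, t in zip(R, T)]


# Idealised values: exact identity rotation, hypothetical x/y translation.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
T = [-80.0, -50.0, 807.0]

board_origin_cam = model_to_camera(R, T, [0.0, 0.0, 0.0])
along_model_z = model_to_camera(R, T, [0.0, 0.0, 10.0])

print(board_origin_cam[2])  # 807.0
print(along_model_z[2])     # 817.0
```

Under these assumptions the board origin sits at positive cameraspace z, and a point 10 mm along modelspace +z ends up further from the camera. That would suggest that, in this solution, modelspace z points away from the camera (down into the table) rather than up from the board, which may be where my mental picture is off.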

Question 2

I am confused about the scale factor required to convert back from imagespace to board modelspace.

Let R and T be the rotation and translation returned by solvePnP()
Let RI be the inverse (transpose) of R
Let MI be the inverse of the camera intrinsics matrix
Let p be my imagespace point, in homogeneous form (px, py, 1)

Then my viewspace point v = MI * p and my modelspace point b = RI * (s * v - T)

I have confirmed by inspection that this calculation of b is correct, IF the scaling factor s is set to be the distance of the camera origin from the board (around 807).

Why is this the case? How should I be deriving this scaling factor?
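One candidate derivation, which I have only checked on this sketch: if b must lie on the board plane (b_z = 0), then the third row of b = RI * (s * v - T) gives s = (RI * T)_z / (RI * v)_z. With R close to identity and v_z = 1, that reduces to s = T_z, which would match the 807 I found by inspection. The code below assumes zero skew, p = (px, py, 1), and uses made-up x and y translation values; pixel_to_board and the other helper names are my own.

```python
FX = FY = 1731.4
CX, CY = 319.5, 255.5


def transpose(m):
    return [list(row) for row in zip(*m)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def pixel_to_view(px, py):
    """v = MI * p for p = (px, py, 1), written out for zero skew."""
    return [(px - CX) / FX, (py - CY) / FY, 1.0]


def pixel_to_board(px, py, R, T):
    """Back-project a pixel onto the board plane z = 0 in modelspace."""
    RI = transpose(R)               # inverse of a rotation is its transpose
    ri_v = mat_vec(RI, pixel_to_view(px, py))
    ri_t = mat_vec(RI, T)
    s = ri_t[2] / ri_v[2]           # from the constraint b_z = 0
    b = [s * a - c for a, c in zip(ri_v, ri_t)]
    return s, b


# Idealised values: exact identity rotation, hypothetical x/y translation.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [-80.0, -50.0, 807.0]

s, b = pixel_to_board(CX, CY, R, T)
print(s)  # 807.0
print(b)  # [80.0, 50.0, 0.0]
```

If this is right, s is not a fixed constant: it is the cameraspace depth of the particular board point being imaged, and it only equals T_z exactly when R is the identity.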

Question 3

Probably related to Q2... I am unsure about the scaling factor from imagespace to modelspace (worldspace). I want to construct worldspace rays, running from the camera origin through imagespace points (pixels). I understand that in cameraspace I can place points at (px - cx, py - cy, pz), where px, py are my pixel coordinates and pz is the camera focal length in pixels (element (0,0) of the intrinsics matrix). But I need to scale these points to worldspace - how?
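To make the question concrete, here is a sketch of the ray construction as I currently understand it. It assumes x_cam = R * x_model + T, so the camera origin in modelspace would be -RI * T and a ray direction would be RI * (px - cx, py - cy, f). The x and y translation values are made up, and pixel_ray and point_on_ray are helper names of my own.

```python
import math

F = 1731.4               # focal length in pixels (fx = fy)
CX, CY = 319.5, 255.5


def transpose(m):
    return [list(row) for row in zip(*m)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def pixel_ray(px, py, R, T):
    """Return (origin, unit direction) of the pixel's ray in modelspace."""
    RI = transpose(R)
    origin = [-c for c in mat_vec(RI, T)]        # camera centre: -RI * T
    d = mat_vec(RI, [px - CX, py - CY, F])       # rotate cameraspace direction
    norm = math.sqrt(sum(c * c for c in d))
    return origin, [c / norm for c in d]


def point_on_ray(origin, direction, t):
    return [o + t * d for o, d in zip(origin, direction)]


# Idealised values: exact identity rotation, hypothetical x/y translation.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [-80.0, -50.0, 807.0]

origin, direction = pixel_ray(CX, CY, R, T)
print(origin)  # [80.0, 50.0, -807.0]

# Intersect the ray with the board plane z = 0 in modelspace.
t_hit = -origin[2] / direction[2]
hit = point_on_ray(origin, direction, t_hit)
```

Since only the direction of the ray matters, the cameraspace point (px - cx, py - cy, f) may not need any scaling at all once it is normalised; the ray parameter t is then already in mm because T is in mm. I would appreciate confirmation that this reasoning holds.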

Thanks!