Interpretation of the translation vector (tvec) results of camera calibration

I am currently debugging my stereo calibration code. The pose estimation results (from single-camera calibration) do not seem accurate enough, and the stereo rectification results are quite bad (the same object features end up about 200-400 px off in the y direction).

General observations:

  1. The single camera calibration gives me a reprojection RMS error of about 0.25 px for both cameras when using about 30 checkerboard calibration images. More images mostly just increase computation time...
  2. The estimated focal length is ~16 mm (fx/fy multiplied by the pixel size in mm; see the small conversion sketch after this list) and the tvecs have Z values on the order of 0.618 m (I use meters as world units) for a pattern that was ~70 cm away from the camera sensor.
  3. The estimated focal length and Z values seem to be quite stable when using different images for calibration.
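
Just to show what I mean by the conversion in point 2 (a minimal sketch; fx_px stands for the fx entry of the camera matrix, which OpenCV reports in pixels, and the example value is simply back-computed from the 16.2095 mm listed below):

// Focal length in mm = focal length in pixels * pixel size in mm
double pixelSize_mm = 0.00345;          // IMX304: 3.45 um pixels
double fx_px = 4698.0;                  // example value, ~16.2095 mm / 0.00345 mm
double fx_mm = fx_px * pixelSize_mm;    // ~16.2 mm, matching the values listed below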

Issues:

As far as I know, the tvecs[] represent the camera position relative to the pattern in world-space coordinates. Now, if I calculate the distance between the cameras in the XY plane I get only about 2.6 cm, while the cameras are actually about 8.7 cm apart (the exact arithmetic is shown in the small sketch after the calibration results below). You can find the estimated values of tvecs[0] for one pattern below. The pattern was lying at a fixed position and both cameras were fixed too (one image taken from a series of static images, with the pattern lying flat below both cameras, which point downwards).

Questions:

Am I interpreting the tvecs[] output values correctly, or do they perhaps need to be rotated first? There is no graphical representation of their meaning, so it's kind of difficult to know; maybe you could supply one. I used the code below to estimate the roll, pitch and yaw of the cameras. How would I adapt this code to actually calculate the distance between both cameras (in case simply subtracting the two vectors and taking the length of the difference is wrong)?

Calibration-Result: Left Camera

  • RMS: 0.254619
  • fx=16.2095 mm, fy=16.2205 mm, cx=2008.18 px, cy=1049.5 px
  • fovx=46.1164°, fovy=24.0126°, aspect ratio=1.00067, cx=6.92821 mm, cy=3.62076 mm
  • POSITION (tvecs[0]): X=-0.036155, Y=-0.00518067, Z=-0.618701
  • ROTATION (roll, pitch, yaw): 0.00080997, 0.0527651, 3.13971

Calibration-Result: Right Camera

  • RMS: 0.24088
  • fx=16.2014 mm, fy=16.211 mm, cx=2068.66 px, cy=1012.78 px
  • fovx=46.1298°, fovy=24.0285°, aspect ratio=1.0006, cx=7.13689 mm, cy=3.4941 mm
  • POSITION (tvecs[0]): X=-0.0623798, Y=-0.010667, Z=-0.618921
  • ROTATION (roll, pitch, yaw): -0.00761054, -0.0694482, -0.00125793
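
For reference, this is the calculation behind the ~2.6 cm figure mentioned above, using the two tvecs[0] values just listed (a minimal arithmetic sketch, nothing more):

// Naive XY-plane distance between the two tvecs[0] entries above
double dx = -0.036155   - (-0.0623798);
double dy = -0.00518067 - (-0.010667);
double distXY = sqrt(dx*dx + dy*dy);   // ~0.027 m, i.e. roughly 2.6-2.7 cm, far below the real ~8.7 cm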

General Info:

  • Sony IMX304 sensor (4112 x 3008, 3.45 um pixel size)
  • Captured size: 4000 x 2000 (cropped, x offset: 56, y offset: 504)
  • C-mount lens: quite long (7-10 cm)
  • Checkerboard pattern: 7 x 9, patch size: about 16.04 mm x 16.01 mm
  • Pattern fixed onto a 5 mm aluminum plate using adhesive tape

Code used to get the rotation values above:

Mat Rt, R, pos;
Rodrigues(rvec, Rt);       // rotation vector -> rotation matrix (pattern-to-camera)
transpose(Rt, R);          // R = Rt^T, i.e. camera-to-pattern rotation
pos = -R * tvec;           // camera position in the pattern's coordinate frame

// Euler angles extracted from R
double roll = atan2(-R.at<double>(2,1), R.at<double>(2,2));
double pitch = asin(R.at<double>(2,0));
double yaw = atan2(-R.at<double>(1,0), R.at<double>(0,0));
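
And this is the adaptation I have in mind for the camera-to-camera distance, applied to one image pair of the same static pattern (just a sketch; rvecL/tvecL and rvecR/tvecR are placeholder names for the left and right pose of that pattern view):

// Left camera centre in the pattern's coordinate frame
Mat RtL, RL, posL;
Rodrigues(rvecL, RtL);
transpose(RtL, RL);
posL = -RL * tvecL;

// Right camera centre, same pattern view
Mat RtR, RR, posR;
Rodrigues(rvecR, RtR);
transpose(RtR, RR);
posR = -RR * tvecR;

// Straight-line distance between the two camera centres
Mat diff = posL - posR;
double camDist = norm(diff);   // hopefully close to the physical ~8.7 cm baseline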

EDIT #1:

Here is my function to generate the known positions of the markers in world coordinates:

vector<Point3f> getKnownPositions(Size patternSize, double squareWidth, double squareHeight) {
    vector<Point3f> knownPositions;
    // Total extent of the grid, used to centre the pattern around the origin
    double Width = patternSize.width * squareWidth;
    double Height = patternSize.height * squareHeight;
    for (int j = 0; j < patternSize.height; ++j) {
        for (int i = 0; i < patternSize.width; ++i) {
            // All points lie in the Z = 0 plane of the pattern coordinate system
            knownPositions.push_back(Point3f(float(i * squareWidth - Width / 2.0),
                                             float(j * squareHeight - Height / 2.0),
                                             0.0f));
        }
    }
    return knownPositions;
}

I call it as follows:

patternSize = Size(7, 9);
knownPositions = getKnownPositions(patternSize, 0.0160375, 0.01601);
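
For completeness, this is roughly how those known positions enter the calibration (a sketch only; imagePoints, the detected corner sets for the ~30 views, is a placeholder for my actual variable):

// One copy of the board's object points per calibration view
vector<vector<Point3f>> objectPoints(imagePoints.size(), knownPositions);
Mat cameraMatrix, distCoeffs;
vector<Mat> rvecs, tvecs;
double rms = calibrateCamera(objectPoints, imagePoints, Size(4000, 2000),
                             cameraMatrix, distCoeffs, rvecs, tvecs);
// rms is the reprojection RMS (~0.25 px above); rvecs/tvecs are the per-view poses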