Hello,
I'm trying to build a fairly standard stereoscopic marker tracking system, but I'm a bit stuck. I'd like to determine the pose of a cluster of markers relative to the left camera, but there seems to be an offset along the x (horizontal) axis and some rotation about the z axis as well.
To illustrate the problem, I've calculated the pose relative to each camera separately using a single-camera method, and then used stereo matching and a least-squares rigid fit to determine the pose from both views.
Here's an example of the results:
Mono Left - R: [3.069579943540829; -0.05977419625323342; 0.1544040260297337]
T: [38.09605707684862; 21.18416077979643; 163.7890550734082]
Mono Right - R: [3.004481122258051; -0.0503887459417317; 0.1974314610837208]
T: [-40.78674398112144; 22.07558311940088; 167.3692601754227]
Stereo - R: [3.040724049612813; -0.09136183473500635; 0.3408090688901517]
T: [24.53711327862293; 21.23019811829612; 161.4103927579672]

Mono Left - R: [3.058489621212333; -0.05228215250008787; 0.13683463230099]
T: [38.96282303353249; 22.00846031736913; 186.9398802020772]
Mono Right - R: [2.961243386981001; -0.05200500666843982; 0.1801525489916241]
T: [-39.02619557523813; 23.22876352563853; 191.2417072658133]
Stereo - R: [3.045682764179083; -0.08502438636624524; 0.3180723793202949]
T: [23.8027339111667; 22.44540681050449; 185.6476516191155]

Mono Left - R: [3.080152177836276; -0.04957840617637835; -0.1266525035770767]
T: [66.53298620735731; 22.29216647181219; 187.5828556316184]
Mono Right - R: [2.935352608593301; -0.034696380289205; -0.0137351212138292]
T: [-11.92929769108779; 23.58438750381959; 189.9232143310567]
Stereo - R: [3.056379378076835; -0.07589959397021091; 0.09381399770267766]
T: [50.60496929247539; 22.09750040313026; 188.0577998604219]
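For context, the single-camera poses above come from a standard model-to-image pose fit; a minimal sketch of that kind of step (assuming cv::solvePnP with each camera's calibrated intrinsics K and distortion dist, and the detected marker centres imagePts; my actual mono code may differ in detail) would be:

//Sketch only: per-camera pose of the 4-marker cluster
//'object' is the same marker model used in the stereo code below;
//imagePts, K and dist are placeholders for that camera's detections and calibration
cv::Mat rvecMono, tvecMono;
cv::solvePnP(object, imagePts, K, dist, rvecMono, tvecMono);
std::cout << "Mono - R: " << rvecMono << "\nT: " << tvecMono << std::endl;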
I remap (rectify) the images from both cameras, then find the four markers in each image. After that I use the following code to estimate the pose of the marker cluster.
//Build (x, y, disparity) triples from the matched marker centres in the
//rectified left/right images
ARg::Point3dv disparity;
for (size_t i = 0; i < markerLeft->size(); i++) {
    disparity.push_back(cv::Point3d(markerLeft->at(i).x,
                                    markerLeft->at(i).y,
                                    markerLeft->at(i).x - markerRight->at(i).x));
}

//Reproject to 3D in the left (rectified) camera frame using the Q matrix
ARg::Point3dv image;
cv::perspectiveTransform(disparity, image, left.Q());
//std::cout << "Disparity: " << disparity << "\nImage : " << image << std::endl;

//Mean position of the observed markers (image) and of the untransformed
//model (object), which is centred at (0,0,0)
cv::Scalar imageMean = cv::mean(image);
cv::Scalar objectMean = cv::mean(object);
cv::Mat iMean = (cv::Mat_<double>(3, 1) << imageMean(0), imageMean(1), imageMean(2));
cv::Mat oMean = (cv::Mat_<double>(3, 1) << objectMean(0), objectMean(1), objectMean(2));

//Correlation matrix C = (1/n) * SUM{ (obj_i - objMean) * (img_i - imgMean)^T }
cv::Mat C = cv::Mat::zeros(3, 3, CV_64F);
for (size_t i = 0; i < image.size(); i++) {
    cv::Mat A = (cv::Mat_<double>(3, 1) << object.at(i).x - objectMean(0),
                                           object.at(i).y - objectMean(1),
                                           object.at(i).z - objectMean(2));
    cv::Mat B = (cv::Mat_<double>(1, 3) << image.at(i).x - imageMean(0),
                                           image.at(i).y - imageMean(1),
                                           image.at(i).z - imageMean(2));
    C += A * B;                           //outer product of centred model/measurement
}
C /= 4.0;                                 //n = 4 markers

//SVD of C, with the Kanatani determinant correction to guarantee a proper rotation
cv::Mat w, u, vt;
cv::SVDecomp(C, w, u, vt);
cv::Mat correct = cv::Mat::eye(3, 3, CV_64F);
correct.at<double>(2, 2) = cv::determinant(u * vt);

//As built, u*correct*vt takes measured deviations onto the model, so invert it
//to get the model -> left-camera rotation
cv::Mat R = u * correct * vt;
cv::invert(R, R);                         //transpose would do the same for a rotation
cv::Mat T = iMean - R * oMean;

//Rodrigues vector for display
cv::Mat rvec;
cv::Rodrigues(R, rvec);
std::cout << "Stereo - R: " << rvec << "\nT: " << T << std::endl;
Any ideas where my error might be coming from? My guess would be the perspectiveTransform step, but short of a bad calibration I can't see how that would go wrong.
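If it is the perspectiveTransform, one check I can do is to reproject a single marker by hand with the textbook rectified-stereo formulas, X = (x - cx)*B/d, Y = (y - cy)*B/d, Z = fx*B/d, and compare against the Q-based result (this assumes the usual Q from cv::stereoRectify with matched principal points; fx, cx, cy and the baseline B are placeholders for my rectified calibration values):

//Sketch: manual reprojection of the first marker vs. the Q-based result
const cv::Point3d& m = disparity.at(0);   //(x, y, d) from the code above
double Xm = (m.x - cx) * B / m.z;
double Ym = (m.y - cy) * B / m.z;
double Zm = fx * B / m.z;
std::cout << "Manual : " << cv::Point3d(Xm, Ym, Zm)
          << "\nQ-based: " << image.at(0) << std::endl;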
Thanks for any advice.