
One image point (u,v) to actual (x,y) w.r.t. the camera frame?

asked 2019-05-27 09:57:28 -0500

opencvr


I am detecting the center of a circle and getting its (u,v) coordinates in the image. Now I want to convert this (u,v) into actual world coordinates (x,y) w.r.t. the camera_frame. I do not need the z coordinate.

How can I get this? What is a possible way to do it?

I could not find anything relevant to this. I have my camera calibration matrix.




Do you not need z coordinate because you know how far it is? Or at least the location of the camera relative to some surface? If so, you're good.

If you know the location and orientation of a camera relative to a plane you can find the location on the plane easily. To know the location and orientation you must either have external info, or be able to see a minimum of four known locations on the plane to use solvePnP.

Tetragramm ( 2019-05-27 17:43:25 -0500 )

@Tetragramm For z: I can tell the robot arm (camera in hand) to look at the plane from above at a certain height every time, so z and the orientation are fixed. I also have the camera intrinsics cx, cy, fx, fy. So I can get the actual (x,y) = ((u-cx)/fx, (v-cy)/fy). Is this a correct method?

opencvr ( 2019-05-28 04:08:34 -0500 )

1 answer


answered 2019-05-28 19:16:33 -0500

Tetragramm

Based on your clarification, you can do what you need. First you calculate the vector in space. If you don't have the distortion matrix, you can ignore that part.

// Pull the inputs out of their OpenCV wrappers.
Mat tvec = _tvec.getMat();
Mat rvec = _rvec.getMat();
Mat camera_matrix = _cameraMatrix.getMat();
const Mat distortion_matrix = _distortionMatrix.getMat();

// Undistort the pixel; the result stays in pixel coordinates because we
// pass camera_matrix as the new camera matrix.
std::vector< Point2f > pts_in, pts_out;
pts_in.push_back( _pt );
undistortPoints( pts_in, pts_out, camera_matrix, distortion_matrix, noArray(), camera_matrix );

// Homogeneous pixel coordinates of the undistorted point.
Mat los( 3, 1, CV_64F );
los.at< double >( 0 ) = pts_out[0].x;
los.at< double >( 1 ) = pts_out[0].y;
los.at< double >( 2 ) = 1;

// Make sure everything is double precision before the matrix math.
if ( camera_matrix.type() != CV_64F )
    camera_matrix.convertTo( camera_matrix, CV_64F );
if ( rvec.type() != CV_64F )
    rvec.convertTo( rvec, CV_64F );
if ( tvec.type() != CV_64F )
    tvec.convertTo( tvec, CV_64F );

// Back-project: K^-1 * (u, v, 1)^T is the line of sight in camera coordinates.
los = camera_matrix.inv() * los;

This gives you a vector pointing out from the center of the camera. You need to rotate it by the orientation of the camera, and use the position of the camera as the origin of the vector. If you calibrated your camera using OpenCV, use this function as a guide: HERE. Otherwise you'll need to use the information you have (presumably from the robot arm). From there it's just basic geometry.
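To make the closed form concrete: for a standard pinhole camera matrix, camera_matrix.inv() * (u, v, 1)^T works out to ((u-cx)/fx, (v-cy)/fy, 1). A minimal OpenCV-free sketch of just that step (the function name and any example intrinsics are illustrative, not from the post):

```cpp
#include <array>
#include <cassert>

// Closed form of camera_matrix.inv() * (u, v, 1)^T for a pinhole camera
// matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]. The pixel (u, v) should
// already be undistorted.
std::array<double, 3> lineOfSight(double u, double v,
                                  double fx, double fy,
                                  double cx, double cy) {
    return { (u - cx) / fx, (v - cy) / fy, 1.0 };
}
```

The resulting vector points from the camera center through the pixel; as described above, it still has to be rotated by the camera orientation and anchored at the camera position before intersecting it with the table plane.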



@Tetragramm But what should I input as _rvec and _tvec? What I have is _camera_matrix (fx, cx, fy, cy), the input point (u,v) as _pt, and _distortionMatrix.

opencvr ( 2019-05-29 03:16:26 -0500 )

Camera calibration with a chessboard is giving me only _camera_matrix and _distortion_matrix.

opencvr ( 2019-05-29 09:30:41 -0500 )

You said you know the orientation and position of the camera. Is that from an opencv function that gives you rvec and tvec? Then use those.

Otherwise, it's not OpenCV, and I can't help.

Oh, except there's apparently a function called calibrateHandEye just for the task of matching robot arms to the camera.

Tetragramm ( 2019-05-29 18:04:08 -0500 )

@Tetragramm Now I see camera calibration gives me an rvec, tvec pair for every chessboard pattern image, whose location changes w.r.t. the camera. It is a huge 25 x 6 matrix, but for the calculation I need a 3 x 4 matrix? I am still not getting it.

opencvr ( 2019-05-30 04:12:09 -0500 )

This sort of question is best answered by the documentation for the function HERE. calibrateCamera treats the corner of the chessboard as the center of the world and tells you where your camera is located in each of the calibration images. Check out this tutorial to see how to combine that information with the video stream to find where the camera is at any given point.

Tetragramm ( 2019-05-30 22:18:28 -0500 )

@Tetragramm My object is placed on a table, so the distance from the camera to the table, z, is known. As I do not need orientation, from the equations in this attached photo link text I omitted the first equation (so R, t), since I do not need the complete pose.

Then, using the detected point (u,v) and fx, cx, fy, cy, and if, say, (x,y) is the real-world point in the camera_frame:

x = ((u-cx)/fx) * z,  y = ((v-cy)/fy) * z

Can I use it like this? Is this a correct method, or am I still missing something or understanding it wrong?

opencvr ( 2019-06-03 08:48:48 -0500 )

That would work as long as the camera is pointed straight down at the table. And as long as you're ok with the center of the camera being the (0,0) of your coordinate system. And as long as you don't have any distortion.

If you do have distortion, use what's in my post, but multiply the final result by z / los.at< double >( 2 ), and then the first two values of los are your x and y.

Tetragramm ( 2019-06-03 19:53:16 -0500 )
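A hedged sketch of that scaling step (names and any example numbers are illustrative; assumes the camera looks straight down and the pixel is already undistorted): stretch the line of sight so its z component equals the known camera-to-table distance, then the first two components are x and y.

```cpp
#include <array>
#include <cassert>

// Scale the line of sight so its z component equals the known distance z
// to the table, then read x and y off the first two components. Since
// los = ((u - cx)/fx, (v - cy)/fy, 1), the factor z / los[2] is simply z.
std::array<double, 2> pixelToPlane(double u, double v, double z,
                                   double fx, double fy,
                                   double cx, double cy) {
    std::array<double, 3> los = { (u - cx) / fx, (v - cy) / fy, 1.0 };
    double s = z / los[2];
    return { los[0] * s, los[1] * s };
}
```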

@Tetragramm I have attached images of my setup; the camera is pointed exactly straight down at the table.

link text

link text

link text

"the center of the camera being the (0,0) of your coordinate system" ----- does this mean the camera coordinate system? As in my case it should be the RealSense left imager's (0,0), right?

So after getting (x,y,z) in the above camera_frame, I am transforming it to the robot base_link and giving that pose to the end effector. Sometimes it goes to that position, and sometimes there is a random error of around 20 mm, 10 mm in x, y.

opencvr ( 2019-06-04 08:24:22 -0500 )

I am transforming this (x,y,z) value directly to base_link, as the camera frame is attached to the robot (ROS URDF and TF).

That is why I think the camera extrinsic parameters (R,t) are not required (others have to do this using known coordinates on the plane to get the camera pose, but in my case I do not need world coordinates on the plane). Or am I still understanding R, t wrong? Can this be a justification for omitting the first equation from this link text?

Is this error because I omitted the first equation, where R, t also appear? But according to my setup, it should work without R, t?

opencvr ( 2019-06-04 08:54:44 -0500 )

First a question. You wish to move to an X,Y. Is it true that the farther the X,Y is from the center of the camera, the larger the error? Or is the error unconnected to the distance of the X,Y from the center of the camera?

Tetragramm ( 2019-06-04 18:25:54 -0500 )

I have tested almost 10 different positions. When the object is near the camera center (e.g. x = 13 mm, y = 58 mm) there is no error in the tool position, but if I put the object away from the camera center, it does not reach the exact position; every time the position error in y is larger and the error in x smaller.

I have read the book "Multiple View Geometry", where I found that if the camera plane and the table plane on which the object is lying are parallel, then we can directly convert the object's (u,v) to camera_frame (x,y), taking R, t as the identity. Can I consider this as proof that my applied method is right?

Then what could be wrong?

opencvr ( 2019-06-05 03:42:25 -0500 )

@Tetragramm Here is an attached photo, which shows my workspace coordinates. Also, from this link link text I found that the homography matrix is a transformation between two planes. But can we say that the two planes in my case are parallel, so we can assume it is the identity matrix?

opencvr ( 2019-06-05 06:34:18 -0500 )

There are two main possibilities. It could be either, or both.

First is camera distortion. Distortion typically gets worse farther away from the center. You just calibrate your camera as in the opencv tutorial, and save the distortion matrix. Then use the undistortPoints function.

Second is how you are calculating your x and y. If you simply take a vector of length z in the direction of your point, you will fall short, because the point is on a sphere. You need to multiply the vector so that the z coordinate is of length z.

Tetragramm ( 2019-06-05 19:17:29 -0500 )
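The "fall short" effect can be quantified in a small sketch (the function name and any example numbers are made up): a vector of *length* z along the line of sight ends on a sphere of radius z, so its z component is z / |los| < z, while scaling so the z *component* equals z lands exactly on the table.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Returns how far (along z) a length-z vector in the line-of-sight
// direction falls short of the table, compared to scaling the vector so
// that its z component equals z. At the image center the two agree.
double losShortfall(double u, double v, double z,
                    double fx, double fy, double cx, double cy) {
    std::array<double, 3> los = { (u - cx) / fx, (v - cy) / fy, 1.0 };
    double n = std::sqrt(los[0] * los[0] + los[1] * los[1] + los[2] * los[2]);
    double zNaive = z * (los[2] / n);  // z component of a length-z vector
    return z - zNaive;                 // shortfall of the naive version
}
```

The shortfall grows with distance from the image center, which matches the pattern reported in the comments (no error near the center, growing error away from it).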

@Tetragramm The RealSense camera is factory calibrated, although I have also done calibration, have the distortion matrix, and am using it. Actually my circular object is very flat and hollow link text, so I can take the detected center as the object center when it is lying on the flat plane. I am also calculating x, y at depth z:

los.at< double >( 0 ) = (((xr-center.x)/focal_length1)*distance);
los.at< double >( 1 ) = (((yr-center.y)/focal_length2)*distance);
los.at< double >( 2 ) = distance;

Also, I have noticed that there is always an error in the y position when the object is placed in the negative y direction from the camera. In the positive y direction there is no error in either x or y. Can we say this problem is because the camera plane and the table plane are not exactly parallel?

opencvr ( 2019-06-06 06:58:17 -0500 )

I have also noticed that in the negative y direction there is a small slope between the ground and the table surface, so the camera plane and the table plane are not exactly parallel in the negative y direction. Assuming both planes are parallel, can I still say the applied method of taking R, t as zero is right?

opencvr ( 2019-06-06 07:03:03 -0500 )

Nope, that's not the right calculation. And yes, it could also be the non-flatness of the table. But unless that slope is pretty big, or you're really close, I wouldn't expect that to make as much difference as you see.

los.at< double >( 0 ) = (xr-center.x)/focal_length1;
los.at< double >( 1 ) = (yr-center.y)/focal_length2;
los.at< double >( 2 ) = 1;
los *= distance;

Tetragramm ( 2019-06-06 19:03:04 -0500 )

@Tetragramm Thank you for your assistance. In most positions it is working correctly, except in some cases when y is negative. So I am assuming that taking R, t as 0 would be the right method when the camera is parallel to the table. Maybe some error is coming from calibration: 1. camera, 2. hand-eye.

opencvr ( 2019-06-10 09:17:11 -0500 )

@Tetragramm Should I consider this the right method?

opencvr ( 2019-06-12 05:47:30 -0500 )

When the camera is parallel to the table, yes, you can make the assumption that R is zero. t must obviously be [0,0,z], to account for the height.

Tetragramm ( 2019-06-12 09:00:24 -0500 )
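As a sanity check on that assumption (a sketch; the function names and any intrinsics used are illustrative): projecting a table point with R = I and t = [0, 0, z], then inverting, recovers exactly the ((u-cx)/fx) * z formula discussed throughout this thread.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Pixel { double u, v; };

// Forward model with R = I, t = (0, 0, z): a table point (X, Y, 0) sits at
// (X, Y, z) in camera coordinates, so u = fx * X / z + cx, v = fy * Y / z + cy.
Pixel project(double X, double Y, double z,
              double fx, double fy, double cx, double cy) {
    return { fx * X / z + cx, fy * Y / z + cy };
}

// Inverse: X = ((u - cx) / fx) * z, Y = ((v - cy) / fy) * z.
std::array<double, 2> unproject(Pixel p, double z,
                                double fx, double fy, double cx, double cy) {
    return { (p.u - cx) / fx * z, (p.v - cy) / fy * z };
}
```

A round trip through project and unproject returns the original table point, which is why R and the x, y components of t can be dropped in this special geometry.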

@Tetragramm t = [0,0,z] I am taking from the depth directly, as in the los *= distance step we discussed above.

opencvr ( 2019-06-13 07:59:25 -0500 )

@Tetragramm Just a small question: if you look at this linked function link text, they use the depth, a 2D pixel point (x,y), and the known camera intrinsics to get a 3D real-world point (so it can project any 2D point to a 3D point). And this is w.r.t. the camera, regardless of the camera's position, without requiring it to be parallel to the table. So why do we need the R, t extrinsics of the camera when the camera is not parallel to the table?

opencvr ( 2019-07-04 08:58:47 -0500 )



Last updated: May 28 '19