# One point (u,v) to actual (x,y) w.r.t. camera frame?

Hello,

I am detecting the center of a circle and getting its (u,v) coordinates in the image. Now I want to convert this (u,v) into actual world coordinates (x,y) w.r.t. the camera_frame. I do not need the z coordinate.

How can I get this? What is a possible way to do it?

I could not find anything relevant to this. I have my camera calibration matrix.

thanks.


Do you not need the z coordinate because you know how far away it is? Or at least the location of the camera relative to some surface? If so, you're good.

If you know the location and orientation of a camera relative to a plane, you can find the location on the plane easily. To know the location and orientation, you must either have external info or be able to see a minimum of four known locations on the plane to use solvePnP.

( 2019-05-27 17:43:25 -0500 )

@Tetragramm for z, I can tell the robot arm (camera on hand) to look at the plane from above at a certain height every time, so z and the orientation are fixed every time. I also have the camera intrinsics cx, cy, fx, fy. So I can get actual (x,y) = ((u-cx)/fx, (v-cy)/fy). Is this a correct method?

( 2019-05-28 04:08:34 -0500 )


Based on your clarification, you can do what you need. First you calculate the vector in space. If you don't have the distortion matrix, you can ignore that part.

Mat tvec = _tvec.getMat();
Mat rvec = _rvec.getMat();
Mat camera_matrix = _cameraMatrix.getMat();
const Mat distortion_matrix = _distortionMatrix.getMat();

// Undistort the detected point; passing camera_matrix as the new camera
// matrix keeps the result in pixel coordinates.
std::vector< Point2f > pts_in, pts_out;
pts_in.push_back( _pt );
undistortPoints( pts_in, pts_out, camera_matrix, distortion_matrix, noArray(), camera_matrix );

// Homogeneous pixel coordinates of the undistorted point.
Mat los( 3, 1, CV_64F );
los.at< double >( 0 ) = pts_out[0].x;
los.at< double >( 1 ) = pts_out[0].y;
los.at< double >( 2 ) = 1;

// Make sure everything is double precision before inverting.
if ( camera_matrix.type() != CV_64F )
    camera_matrix.convertTo( camera_matrix, CV_64F );
if ( rvec.type() != CV_64F )
    rvec.convertTo( rvec, CV_64F );
if ( tvec.type() != CV_64F )
    tvec.convertTo( tvec, CV_64F );

// Back-project: K^-1 * (u, v, 1) gives the line-of-sight ray in the camera frame.
los = camera_matrix.inv() * los;


This gives you a vector pointing out from the center of the camera. You need to rotate it by the orientation of the camera, and use the position of the camera as the origin of the vector. If you calibrated your camera using OpenCV, use this function as a guide: HERE. Otherwise you'll need to use the information you have (presumably from the robot arm). From there it's just basic geometry.


@Tetragramm but what should I input as _rvec and _tvec? What I have is _camera_matrix (fx, cx, fy, cy), the input point (u,v) as _pt, and _distortionMatrix.

( 2019-05-29 03:16:26 -0500 )

Camera calibration with a chessboard gives me only _camera_matrix and _distortion_matrix.

( 2019-05-29 09:30:41 -0500 )

You said you know the orientation and position of the camera. Is that from an OpenCV function that gives you rvec and tvec? Then use those.

Otherwise, it's not OpenCV, and I can't help.

Oh, except there's apparently a function called calibrateHandEye just for the task of matching robot arms to the camera.

( 2019-05-29 18:04:08 -0500 )

@Tetragramm now I see camera calibration gives me rvec, tvec pairs for every chessboard pattern image, whose location changes w.r.t. the camera. It is a huge 25 x 6 matrix, but for the calculation I need a 3 x 4 matrix? I am still not getting it.

( 2019-05-30 04:12:09 -0500 )

This sort of question is best answered by the documentation for the function HERE. calibrateCamera treats the corner of the chessboard as the center of the world, and tells you where your camera is located in each of the calibration images. Check out this tutorial to see how to use that information and the video stream to find where the camera is at any given point.

( 2019-05-30 22:18:28 -0500 )

@Tetragramm as my object is placed on a table, the distance z from the camera to the table is known. Since I do not need orientation, from the equations in this attached photo link text I omitted the first equation (so R, t), as I do not need the complete pose.

Then, using the detected point (u,v) and fx, cx, fy, cy, and if (x,y) is the real-world point in the camera_frame,

x = ((u-cx)/fx) * z, y = ((v-cy)/fy) * z

Can I use it like this? Is this a correct method, or am I still missing something or understanding it wrong?

( 2019-06-03 08:48:48 -0500 )

That would work as long as the camera is pointed straight down at the table. And as long as you're ok with the center of the camera being the (0,0) of your coordinate system. And as long as you don't have any distortion.

If you do have distortion, use what's in my post, but multiply the final result by z/los.at<double>(2); then the first two values of los are your x and y.

( 2019-06-03 19:53:16 -0500 )

@Tetragramm as shown in the attached images of my setup, the camera is pointed exactly straight down at the table.

"the center of the camera being the (0,0) of your coordinate system": does this mean the camera coordinate system? In my case it should be the RealSense left imager's (0,0), right?

So after getting (x,y,z) in the above camera_frame, I transform it to the robot base_link and give that pose to the end effector. Sometimes it goes to that position, and sometimes there is a random error of around 20 mm or 10 mm in x, y.

( 2019-06-04 08:24:22 -0500 )

I am transforming this (x,y,z) value directly to base_link, as the camera frame is attached to the robot (ROS URDF and TF).

That is why I think the camera extrinsic parameters (R, t) are not required (others do this using known coordinates on the plane to get the camera pose, but in my case I do not need world coordinates on the plane). Or am I still understanding R, t wrong? Can this justify omitting the first equation from this link text?

Is this error because I omitted the first equation, where R, t also appear? But according to my setup, it should work without R, t?

( 2019-06-04 08:54:44 -0500 )

First a question. You wish to move to an X,Y. Is it true that the farther the destination is from the center of the camera, the larger the error? Or is the error unconnected to the distance of the X,Y from the center of the camera?

( 2019-06-04 18:25:54 -0500 )

I have tested almost 10 different positions. When the object is near the camera center (e.g., x = 13 mm, y = 58 mm) there is no error in the tool position, but if I put the object away from the camera center, it does not reach the exact position; every time the position error is larger in y and smaller in x.

I have read the book "Multiple View Geometry", where I found that if the camera plane and the table plane where the object is lying are parallel, then we can directly convert the object's (u,v) to camera_frame (x,y) using R, t as the identity matrix. Can I consider this as proof that my applied method is right?

Then what could be wrong?

( 2019-06-05 03:42:25 -0500 )

@Tetragramm here is an attached photo https://ibb.co/09Y9fJM , which shows my workspace coordinates. Also, from this link link text I found that the homography matrix is the transformation between two planes. Since the two planes in my case are parallel, can we assume it is the identity matrix?

( 2019-06-05 06:34:18 -0500 )

There are two main possibilities. It could be either, or both.

First is camera distortion. Distortion typically gets worse farther away from the center. Just calibrate your camera as in the OpenCV tutorial and save the distortion matrix. Then use the undistortPoints function.

Second is how you are calculating your x and y. If you simply take a vector of length z in the direction of your point, you will fall short, because the point is on a sphere. You need to multiply the vector so that the z coordinate is of length z.

( 2019-06-05 19:17:29 -0500 )

@Tetragramm the RealSense camera is factory calibrated, although I have also done calibration myself, have the distortion matrix, and am using it. Actually my circular object is very flat and hollow (link text), so I can take its center as the object center when it is lying on a flat plane. I am also calculating x, y at depth z.

los.at< double >( 0 ) = ((xr - center.x) / focal_length1) * distance;
los.at< double >( 1 ) = ((yr - center.y) / focal_length2) * distance;
los.at< double >( 2 ) = distance;


Also, I have noticed that there is always an error in the y position when the object is placed in the negative y direction relative to the camera. In the positive y direction, there is no error in either x or y. Can we say this problem is because the camera plane and the table plane are not exactly parallel?

( 2019-06-06 06:58:17 -0500 )

I have also noticed that in the negative y direction there is a slight slope between the ground and the table surface, so the camera plane and the table plane are not exactly parallel in the negative y direction. Assuming both planes are parallel, can I say the applied method of taking R, t as zero is right?

( 2019-06-06 07:03:03 -0500 )

Nope, that's not the right calculation. And yes, it could also be the non-flatness of the table. But unless that slope is pretty big, or you're really close, I wouldn't expect it to make as much difference as you see.

los.at< double >( 0 ) = (xr-center.x)/focal_length1;
los.at< double >( 1 ) = (yr-center.y)/focal_length2;
los.at< double >( 2 ) = 1;
los *= distance;

( 2019-06-06 19:03:04 -0500 )

@Tetragramm thank you for your assistance. In most positions it works correctly, except in some cases when y is negative. So I am assuming that taking R, t as 0 would be the right method when the camera is parallel to the table. Maybe some error comes from calibration: 1. camera 2. hand-eye.

( 2019-06-10 09:17:11 -0500 )

@Tetragramm should I consider this the right method?

( 2019-06-12 05:47:30 -0500 )

When the camera is parallel to the table, yes, you can make the assumption that R is zero. t must obviously be [0,0,z], to account for the height.

( 2019-06-12 09:00:24 -0500 )

@Tetragramm t: [0,0,z], I am taking z from the depth directly, as we discussed above with los *= distance.

( 2019-06-13 07:59:25 -0500 )

@Tetragramm just a small question: if you look at this function (link text), they use depth, a 2D pixel point (x,y), and known camera intrinsics to get a 3D real-world point (so it can project any 2D point to 3D). And this is w.r.t. the camera regardless of the camera's position, without the camera being parallel to the table plane. So why do we need the R, t extrinsics of the camera when the camera is not parallel to the table plane?

( 2019-07-04 08:58:47 -0500 )

They have depth. You do not.

If you know the distance to the table (t), and the table is parallel to the image plane (R), you can calculate the depth at any point on the table. So really, you do know R, t. It's just that they're simple.

So saying that it's not parallel is saying you don't know R. Now you need something else to fill that gap.

( 2019-07-04 23:12:37 -0500 )
