Based on your clarification, you can do what you need. First, calculate the line-of-sight vector for the pixel in camera space. If you don't have the distortion coefficients, you can skip the undistortPoints step.
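// Unwrap the InputArray arguments into cv::Mat headers.
Mat tvec = _tvec.getMat();
Mat rvec = _rvec.getMat();
Mat camera_matrix = _cameraMatrix.getMat();
const Mat distortion_matrix = _distortionMatrix.getMat();

// Undistort the pixel. Passing camera_matrix as the last argument keeps the result in pixel coordinates.
std::vector< Point2f > pts_in, pts_out;
pts_in.push_back( _pt );
undistortPoints( pts_in, pts_out, camera_matrix, distortion_matrix, noArray(), camera_matrix );

// Homogeneous pixel coordinate (u, v, 1).
Mat los( 3, 1, CV_64F );
los.at< double >( 0 ) = pts_out[0].x;
los.at< double >( 1 ) = pts_out[0].y;
los.at< double >( 2 ) = 1;

// Convert everything to double so the types match in the matrix product below.
if ( camera_matrix.type() != CV_64F )
    camera_matrix.convertTo( camera_matrix, CV_64F );
if ( rvec.type() != CV_64F )
    rvec.convertTo( rvec, CV_64F );
if ( tvec.type() != CV_64F )
    tvec.convertTo( tvec, CV_64F );

// Back-project through the inverse intrinsics to get the line of sight in the camera frame.
los = camera_matrix.inv() * los;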
This gives you a vector pointing out from the center of the camera, expressed in the camera's own frame. You need to rotate it by the orientation of the camera and use the position of the camera as the origin of the ray. If you calibrated your camera using OpenCV, use this function as a guide: HERE. Otherwise you'll need to use the pose information you already have (presumably from the robot arm). From there it's just basic geometry.
https://en.wikipedia.org/wiki/Pinhole...
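The rotation step described above can be sketched in plain C++ as follows. This is only an illustration, not the function linked above: it assumes the pose follows the solvePnP convention X_cam = R * X_world + t, where R is the 3x3 matrix you would get from cv::Rodrigues(rvec, R). The names Vec3, Mat3, Ray, mulTransposed and cameraRayToWorld are made up for this example.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// A ray in world coordinates: all points origin + s * dir for s >= 0.
struct Ray {
    Vec3 origin;
    Vec3 dir;
};

// Computes R^T * v without forming the transpose explicitly.
Vec3 mulTransposed(const Mat3& R, const Vec3& v) {
    Vec3 out{0.0, 0.0, 0.0};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            out[i] += R[j][i] * v[j];
    return out;
}

// Assumption: X_cam = R * X_world + t (world-to-camera pose).
// Then the camera center in world coordinates is C = -R^T * t,
// and a camera-frame direction d maps to R^T * d in the world frame.
Ray cameraRayToWorld(const Mat3& R, const Vec3& t, const Vec3& losCam) {
    // A zero direction would not define a ray.
    assert(losCam[0] != 0.0 || losCam[1] != 0.0 || losCam[2] != 0.0);
    const Vec3 rt = mulTransposed(R, t);
    Ray ray;
    ray.origin = {-rt[0], -rt[1], -rt[2]};
    ray.dir = mulTransposed(R, losCam);
    return ray;
}
```

If your pose comes from the robot arm as a camera-to-world transform instead, the rotation is applied directly rather than transposed, and the camera position is the translation itself.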
Do you not need the z coordinate because you already know how far away it is? Or at least the location of the camera relative to some surface? If so, you're good.
If you know the location and orientation of a camera relative to a plane you can find the location on the plane easily. To know the location and orientation you must either have external info, or be able to see a minimum of four known locations on the plane to use solvePnP.
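Finding the location on the plane amounts to a ray-plane intersection. Here is a minimal sketch, assuming you already have the camera position and the line-of-sight direction in the same coordinate frame as the plane. The plane is described by any point on it and its normal; the names dot and intersectRayPlane are hypothetical helpers, not OpenCV functions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Intersects the ray origin + s * dir (s >= 0) with the plane
// dot(planeNormal, X - planePoint) = 0.
// Returns false if the ray is parallel to the plane or the plane
// lies behind the ray origin; otherwise writes the point to hit.
bool intersectRayPlane(const Vec3& origin, const Vec3& dir,
                       const Vec3& planePoint, const Vec3& planeNormal,
                       Vec3& hit) {
    assert(dot(planeNormal, planeNormal) > 0.0);  // normal must be non-zero
    const double denom = dot(planeNormal, dir);
    if (std::abs(denom) < 1e-12)
        return false;  // ray runs (nearly) parallel to the plane
    const Vec3 diff = {planePoint[0] - origin[0],
                       planePoint[1] - origin[1],
                       planePoint[2] - origin[2]};
    const double s = dot(planeNormal, diff) / denom;
    if (s < 0.0)
        return false;  // intersection is behind the camera
    hit = {origin[0] + s * dir[0],
           origin[1] + s * dir[1],
           origin[2] + s * dir[2]};
    return true;
}
```

For example, a camera 2 units above the plane z = 0, looking along (0.5, 0, -1), hits the plane at (1, 0, 0).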
@Tetragramm For z, I can tell the robot arm (camera on the hand) to look at the plane from above at a certain height every time, so z and the orientation are fixed. I also have the camera intrinsics cx, cy, fx, fy, so I can get the actual (x, y) = ((u - cx)/fx, (v - cy)/fy). Is this the correct method?
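For the fixed-height case described in this comment, here is a minimal sketch under some loudly stated assumptions: the optical axis is perpendicular to the plane, lens distortion is negligible (or the pixel has already been undistorted), and z is the distance from the camera center to the plane along the optical axis. Note that (u - cx)/fx and (v - cy)/fy are unitless normalized coordinates; under the pinhole model they have to be scaled by z to become metric offsets in the camera frame. Intrinsics, Point2 and pixelToPlane are hypothetical names for this example.

```cpp
#include <cassert>

// Pinhole intrinsics in pixels.
struct Intrinsics {
    double fx, fy, cx, cy;
};

struct Point2 {
    double x, y;
};

// Maps pixel (u, v) to metric (x, y) on a plane at distance z in front of
// the camera. Assumes the camera looks straight at the plane and that
// (u, v) is already free of lens distortion.
Point2 pixelToPlane(const Intrinsics& k, double u, double v, double z) {
    assert(k.fx != 0.0 && k.fy != 0.0);
    const double xn = (u - k.cx) / k.fx;  // normalized, unitless
    const double yn = (v - k.cy) / k.fy;
    return {xn * z, yn * z};              // same units as z
}
```

The result is relative to the point where the optical axis meets the plane, with axes aligned to the image axes; converting to robot coordinates still needs the camera pose from the arm.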