First, you must calibrate the camera, i.e. find its intrinsic and extrinsic matrices. You can use a chessboard pattern for this, or the calibrated-camera approach in OpenGL (with gluPerspective, without glFrustum).

After calibration, the given set of 2D points can be mapped into 3D space.

We know many points on the 3D model, i.e. (U, V, W), but we do not know their coordinates (X, Y, Z) in the camera frame. We only know the location of the corresponding 2D points, i.e. (x, y). In the absence of radial distortion, the coordinates (x, y) of point p in image coordinates are given, up to an unknown scale factor s, by

$$
s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} =
\begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
$$

where fx and fy are the focal lengths in the x and y directions, and (cx, cy) is the optical center (assume fx = fy = focal length). Things get slightly more complicated when radial distortion is involved, and for simplicity I am leaving it out.
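As a rough illustration of the pinhole model described above, here is a minimal sketch that projects a point already expressed in camera coordinates onto the image. The names `Pixel` and `projectPoint` are made up for this example, and it assumes no lens distortion.

```cpp
#include <cassert>
#include <cmath>

// Pixel coordinates of a projected point (hypothetical helper type).
struct Pixel { double x, y; };

// Project a point (X, Y, Z), given in camera coordinates, onto the image
// of an ideal pinhole camera. Assumes no radial distortion.
Pixel projectPoint(double X, double Y, double Z,
                   double fx, double fy, double cx, double cy) {
    assert(Z > 0.0 && "point must lie in front of the camera");
    // Divide by the depth Z (the scale factor), then shift by the optical center.
    return { fx * X / Z + cx, fy * Y / Z + cy };
}
```

For example, with fx = fy = 800 and (cx, cy) = (320, 240), the point (0.5, -0.25, 2) lands at pixel (520, 140).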

For how the focal length relates to the field of view, see https://en.wikipedia.org/wiki/Angle_of_view
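If you only know the camera's angle of view, you can get an approximate focal length in pixels from it. This is a sketch under the assumption of an ideal pinhole camera with the optical center in the middle of the image; `focalLengthPixels` is a name chosen for this example.

```cpp
#include <cassert>
#include <cmath>

// Approximate focal length in pixels from the horizontal angle of view
// (in radians) and the image width in pixels: f = (w / 2) / tan(fov / 2).
double focalLengthPixels(double imageWidth, double horizontalFovRad) {
    const double kPi = std::acos(-1.0);
    assert(horizontalFovRad > 0.0 && horizontalFovRad < kPi);
    return 0.5 * imageWidth / std::tan(0.5 * horizontalFovRad);
}
```

A real lens will deviate from this, so a proper calibration is still preferable when accuracy matters.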

Using cv::solvePnP
The functions cv::solvePnP and cv::solvePnPRansac can be used to estimate the pose.

Calculate the distance and bearing between two points

Alternatively, you can measure the distance with the hypot function, which works on double values.
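As a sketch of the distance and bearing calculation for two points in a plane, something like this should work. The names `Point2`, `distanceBetween` and `bearingDegrees` are made up for the example, and the bearing convention (degrees in [0, 360), measured counterclockwise from the +x axis) is an assumption; adjust it if you need compass bearings.

```cpp
#include <cmath>

// Simple 2D point (hypothetical helper type).
struct Point2 { double x, y; };

// Euclidean distance between a and b, computed with std::hypot.
double distanceBetween(Point2 a, Point2 b) {
    return std::hypot(b.x - a.x, b.y - a.y);
}

// Direction from a to b in degrees, in [0, 360), measured
// counterclockwise from the +x axis (assumed convention).
double bearingDegrees(Point2 a, Point2 b) {
    const double kPi = std::acos(-1.0);
    double deg = std::atan2(b.y - a.y, b.x - a.x) * 180.0 / kPi;
    return deg < 0.0 ? deg + 360.0 : deg;
}
```

std::hypot avoids the overflow and underflow problems that a hand-written sqrt(dx * dx + dy * dy) can run into for very large or very small values.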