You need a few more pieces of information, all about the camera. Fortunately, OpenCV can help you obtain them. You need to know the distortion of the lens and its field of view.
The most important of these is the field of view, which tells you how much angle each pixel covers. If you don't need precise numbers, you can estimate it by placing an object of known size at a known distance and counting the number of pixels it occupies in both the vertical and horizontal directions. Simple trig, and you have the angle.
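The trig above can be sketched roughly as follows. This is only an illustration under a simple pinhole assumption (no lens distortion, object centered and facing the camera); the function names and the example numbers are made up for this post.

```python
import math

def angle_subtended_deg(object_size, distance):
    """Angle (degrees) covered by an object of known size at a known distance.
    object_size and distance must use the same unit."""
    return math.degrees(2 * math.atan(object_size / (2 * distance)))

def field_of_view_deg(object_size, distance, object_pixels, image_pixels):
    """Estimate the full field of view along one image axis.
    object_pixels: pixels the object occupies along that axis.
    image_pixels: total image size along that axis."""
    # Focal length expressed in pixels, from similar triangles.
    focal_px = object_pixels * distance / object_size
    return math.degrees(2 * math.atan(image_pixels / (2 * focal_px)))

# Hypothetical measurement: a 0.2 m wide object at 1.0 m spans 200 px
# in a 640 px wide image.
obj_angle = angle_subtended_deg(0.2, 1.0)
fov_x = field_of_view_deg(0.2, 1.0, 200, 640)
print(obj_angle, fov_x)
```

Repeat the measurement with the vertical pixel count and image height to get the vertical field of view.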
If you need more precise widths and heights, you can use OpenCV's camera calibration functions to determine what you need.
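As a rough sketch of what you do with the calibration result: OpenCV's calibrateCamera gives you a camera matrix whose diagonal holds the focal lengths fx and fy in pixels, and the field of view follows from those. The helper below is hypothetical, it assumes the principal point is near the image center, and the numbers are placeholders rather than real calibration output.

```python
import math

def fov_from_focal_lengths(fx, fy, image_width, image_height):
    """Horizontal and vertical field of view in degrees.
    fx, fy: focal lengths in pixels (camera matrix entries [0][0] and [1][1]).
    Assumes the principal point is roughly at the image center."""
    fov_x = math.degrees(2 * math.atan(image_width / (2 * fx)))
    fov_y = math.degrees(2 * math.atan(image_height / (2 * fy)))
    return fov_x, fov_y

# Placeholder values standing in for a real calibration result.
fov_x, fov_y = fov_from_focal_lengths(800.0, 800.0, 640, 480)
print(fov_x, fov_y)
```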
First, you can read about stereo camera calibration. I think if you can find the depth first, then you can find the width and height.
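Once the depth is known, going from pixels to real width and height is a similar-triangles step under the pinhole model. The sketch below assumes the image is already undistorted, the object is roughly facing the camera, and fx and fy are the focal lengths in pixels from calibration; the function name and the numbers are hypothetical.

```python
def size_from_depth(pixel_width, pixel_height, depth, fx, fy):
    """Real-world width and height of an object, in the same unit as depth.
    pixel_width, pixel_height: object extent in the image, in pixels.
    depth: distance from the camera to the object.
    fx, fy: focal lengths in pixels from the camera matrix."""
    width = pixel_width * depth / fx
    height = pixel_height * depth / fy
    return width, height

# Hypothetical example: a 100 x 50 px object at 2.0 m with fx = fy = 800 px.
w, h = size_from_depth(100, 50, 2.0, 800.0, 800.0)
print(w, h)
```

If the object is tilted relative to the camera, the depth differs across it and this only gives an approximation.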
Hi,
Did you find a solution? I have calibrated two cameras with stereoCalibrate and also calculated the depth, but I cannot figure out how to go further from this point.