3D Reconstruction with 1 camera

asked 2016-01-11 20:26:42 -0600 by MrE

updated 2016-01-12 11:18:03 -0600

I'm playing with stereo 3D reconstruction using one camera and rotating the object, but I can't seem to get a proper Z mapping.

EDITED following comments:

I follow the typical process:

  • calibrate the camera with the chessboard,
  • get the camera matrix and the distortion coefficients,
  • take left and right images by rotating the object: the camera is fixed and the object rotates on itself about the Y axis (the background is a green screen that I remove),
  • undistort the images.

All this seems fine to me. The images look good.
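
Roughly, the calibration and undistortion steps look like this (a Python/cv2 sketch; the chessboard size and file names are placeholders, not the actual values from the question):

    import glob
    import cv2
    import numpy as np

    # Chessboard calibration: pattern size and file names are placeholders
    pattern_size = (9, 6)
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)

    obj_points, img_points = [], []
    for fname in glob.glob("calib_*.png"):
        gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # Camera matrix K and distortion coefficients from the chessboard views
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)

    # Undistort the two views of the object (before and after the rotation)
    left = cv2.undistort(cv2.imread("left.png"), K, dist)
    right = cv2.undistort(cv2.imread("right.png"), K, dist)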

I get the disparity map with StereoBM.compute on the left and right images.

There are some black areas, but it is mostly gray, so the Z seems to be computed for most of the image.
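
For reference, the disparity computation is roughly (continuing the sketch above; StereoBM wants single-channel images, and the disparity range / block size are just example values to tune):

    # StereoBM works on single-channel images
    gray_l = cv2.cvtColor(left, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(right, cv2.COLOR_BGR2GRAY)

    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    # compute() returns fixed-point disparities, scaled by 16
    disp = stereo.compute(gray_l, gray_r).astype(np.float32) / 16.0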

Then I use stereoRectify to get the Q matrix:

I use a rotation matrix that I built with Rodrigues from a rotation vector. My rotation is only about the Y axis, so the rotation vector is [0, angle, 0] (angle being the angle by which the object was rotated).

The rotation matrix seems right as far as I can tell: I tried trivial angles and got what was expected.
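
Concretely, that step looks something like this (continuing the sketch; the ~20° value is just the example angle mentioned later):

    angle = np.deg2rad(20.0)             # rotation applied to the object, in radians
    rvec = np.array([0.0, angle, 0.0])   # axis-angle vector: rotation about Y only
    R, _ = cv2.Rodrigues(rvec)

    # Sanity check with a trivial angle: a 90° rotation about Y maps +Z onto +X
    R90, _ = cv2.Rodrigues(np.array([0.0, np.pi / 2, 0.0]))
    print(np.round(R90 @ np.array([0.0, 0.0, 1.0])))   # -> [1. 0. 0.]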

I also need the translation vector. Since I rotate only about Y, I used [cos(angle), 0, sin(angle)], which gives a unit-less translation of the camera along the arc of the rotation. From my reading, rotation and translation matrices are unit-less. I have tried applying a scale factor to the translation vector (using [d*cos(angle), 0, d*sin(angle)]) to account for the distance from the camera to the center of rotation, but that only seems to scale the object (in X, Y and Z, not just one dimension).

I use stereoRectify with the same camera matrix and distortion for both cameras since it is the same camera.
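
Put together, the call is roughly (still a sketch, using the unit-length translation described above and the same intrinsics on both sides):

    h, w = gray_l.shape

    # Translation as described above: a unit-length vector, regardless of the angle
    T = np.array([np.cos(angle), 0.0, np.sin(angle)])

    # Same camera, so the same camera matrix and distortion for both views
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(
        K, dist, K, dist, (w, h), R, T)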

When I call reprojectImageTo3D with Q and the disparity map, I get a result that looks OK in MeshLab when viewed from the right angle, but the depth seems way off when I move around (i.e. the Z depth of the object is ~2x its width, when in reality it is about 1/10th of the width).
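
The reprojection and color assignment step is roughly (continuing the sketch; the colors are taken from the left image, since StereoBM expresses the disparity map in the left image's pixel coordinates):

    # Turn the disparity map into an (H, W, 3) array of 3D points
    points = cv2.reprojectImageTo3D(disp, Q)

    # Keep only pixels with a valid disparity; take the colors from the left
    # image, since the disparity map is aligned with the left view
    mask = disp > disp.min()
    xyz = points[mask]
    rgb = left[mask]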

So I'm wondering whether this is normal and expected because it's only 2 images with a ~20 degree angle difference, or whether I'm just messing up somewhere.

In particular I wonder:

  • whether I need to account somewhere for the distance from the camera to the center of rotation of the object: as I mentioned, I tried applying that factor to the translation vector, but it only seems to scale the whole thing.

  • whether it may be a problem with how I apply the colors: I use one of the 2 images to get the colors, because I wasn't sure how I could use both. I am not sure how the disparity map maps to the original images: does it map to the left image, the right image, or neither? I could see that if the color assignment to the disparity map is wrong, the ...


Comments

I don't understand "take left and right images by rotating the object". What do you mean by rotating the object? If you want to use a stereo algorithm for 3D reconstruction, you are supposed to move the camera, not the object.

Balaji R ( 2016-01-11 23:06:56 -0600 )

Well, it's all relative, right? The camera rotates around the object or the object rotates in front of the camera. I use a green screen to remove the background, so effectively I rotate the object on itself, but in effect it is as if the camera had moved around the object by the same angle, at a given distance from it.

MrE ( 2016-01-11 23:47:38 -0600 )

Yes, but the translation/rotation (baseline) has to be constant. Can you move the object by exactly the same amount for each frame?

Balaji R ( 2016-01-12 00:13:05 -0600 )

I have only used 2 images so far. I understand that if I wanted to go 360 degrees around the object I would need to rotate by exactly the same angle for each frame, but that would also be the case if I were moving the camera. With 2 images, I only need to know one angle. My question really is: are 2 frames enough to get a reasonable Z, or am I supposed to go 360 degrees around? If I have to take multiple frames, how am I supposed to match the Z from one reprojection to another?

MrE ( 2016-01-12 00:21:34 -0600 )

Are you sure your translation vector between the camera at frame 1 and the camera at frame 2 is OK?

If I understand your formula correctly, when the object rotates by 20° the translation vector between the two positions of the camera is [x y z] = [0.939692621 0 0.342020143], so always a distance of 1 m whatever the angle is?

Eduardo ( 2016-01-12 04:11:40 -0600 )
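
A quick numerical check of that point, in Python: the norm of the proposed translation vector is 1 for any angle.

    import numpy as np

    angle = np.deg2rad(20.0)
    T = np.array([np.cos(angle), 0.0, np.sin(angle)])
    print(T)                   # [0.93969262 0.         0.34202014]
    print(np.linalg.norm(T))   # 1.0, whatever the angle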

From what I have been reading, the rotation and translation matrices are unit-less. Correct me if I am wrong. So the distance to the object is constant, yes, and of unit 1. I did try adding a multiplication factor to account for the distance to the object, but all that seems to do is change the scale (in X, Y and Z), not change the Z depth. If I scale the disparity map, it does change the Z scale but does not seem to map right. So I'm confused about what I am supposed to adjust, or whether it is even possible to get a reasonable Z depth with just 2 images.

MrE ( 2016-01-12 10:56:39 -0600 )

What I meant is for the computation of the Q matrix by stereoRectify:

  • you have to supply the rotation matrix between the camera frame at image 1 and the camera frame at image 2
  • you have to supply the translation vector between the camera frame at image 1 and the camera frame at image 2

Q basically contains the distance between the left and right camera frames (the baseline). Usually the stereo rig is constructed so that the views are fronto-parallel; otherwise the images are rectified first.

Eduardo ( 2016-01-12 12:16:55 -0600 )
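
For reference, that rectification step can be sketched with OpenCV's Python API as follows (a minimal sketch, assuming K, dist and the R1, R2, P1, P2 outputs from the stereoRectify call in the question):

    # Build rectification maps for each view, then warp the images so that
    # matching points end up on the same row (fronto-parallel, epipolar-aligned)
    map1x, map1y = cv2.initUndistortRectifyMap(K, dist, R1, P1, (w, h), cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(K, dist, R2, P2, (w, h), cv2.CV_32FC1)

    rect_l = cv2.remap(left, map1x, map1y, cv2.INTER_LINEAR)
    rect_r = cv2.remap(right, map2x, map2y, cv2.INTER_LINEAR)

The disparity map would then be computed on rect_l and rect_r rather than on the raw undistorted images.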

You say the views are fronto-parallel, so what does that mean for the rotation matrix? Does it mean there is NO rotation between the cameras, only translation? In my case, since I rotate the object, the 2 cameras point at the same point, which is also on the axis of rotation of the object. This is why I used the rotation matrix as I explained. If this is not correct, what should I do?

MrE ( 2016-01-12 12:30:07 -0600 )

For stereoRectify, the docs say the rotation is the rotation between the coordinate systems of the cameras, and the translation is the translation between the 2 cameras. This is what I am doing. Now if the disparity computation expects the images to be taken from a fronto-parallel setup, that's a different story: it means I should account for some 'distortion' when computing it, is that right? I had calibrated my camera on a chessboard pointed towards the camera, and used that distortion for both cameras in stereoRectify, but maybe I need to calibrate the 2 views by rotating the chessboard? I'm starting to wonder if this can even work with rotation; I guess I should 'translate' the camera instead, to simulate a fronto-parallel setup.

MrE ( 2016-01-12 12:38:33 -0600 )

Check this sample: stereo_match.cpp.

It is for stereo cameras but the principle should be the same for you.

The pipeline when using a stereo rig is:

  • calibrate the left and right cameras once with stereoCalibrate using multiple images
  • take left and right pictures
  • undistort the left and right images
  • rectify the left and right images in order to have a fronto-parallel view
  • compute the disparity map
  • convert the disparity map to a depth map

I don't think your translation is right, as you always have a distance of 1 between the two camera positions regardless of the angle of rotation. If you imagine the inverse situation, two cameras and a static object, the distance between the cameras should change with the angle.

Eduardo ( 2016-01-12 12:58:56 -0600 )
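
As an illustration of that last point, here is a sketch of one way the relative pose could be built, under an assumption not stated in the thread: the camera looks straight at the rotation axis, and the axis sits at distance d along the camera's Z axis. Rotating the object by an angle about Y is then equivalent to a camera motion with R = R_y(angle) and T = (I - R)·[0, 0, d], whose length is 2·d·sin(angle/2), i.e. a baseline that grows with both the angle and the distance d (the sign of the angle depends on the direction of rotation):

    import numpy as np
    import cv2

    d = 0.5                     # hypothetical camera-to-rotation-axis distance (metres)
    angle = np.deg2rad(20.0)    # rotation applied to the object

    # Relative pose between the two virtual camera positions (view 1 -> view 2)
    R, _ = cv2.Rodrigues(np.array([0.0, angle, 0.0]))
    T = (np.eye(3) - R) @ np.array([0.0, 0.0, d])

    print(T)                                              # [-d*sin(a), 0, d*(1 - cos(a))]
    print(np.linalg.norm(T), 2 * d * np.sin(angle / 2))   # both ~0.1736 for d = 0.5, 20°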