SLAM in 3D is hard and, as far as I know, still considered "unsolved". It's even harder when you only have 2D image data, because you first have to extract depth from it. One way is to extract 2D feature points from two consecutive frames and match them (with RANSAC and friends) to estimate the 6-DOF transformation from one frame to the other; with a single moving camera this is called structure from motion. Maybe you can add some constraints here, e.g. not translating the camera but only rotating it about one axis. That would reduce the parameter space and therefore the time complexity. After registration you'd have 3D data, which can easily be projected onto the ground plane. The projected points can then be treated as laser scanner data and fed into a mapping algorithm. The hardest part here is error drift. Have a look at http://openslam.org/rgbdslam.html: although they use additional depth data, their algorithm still drifts a lot over time, corrupting the map.
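To make the "match features, estimate the 6-DOF transform" step concrete, here is a minimal sketch of the underlying two-view geometry on noise-free synthetic data: the eight-point algorithm for the essential matrix in plain NumPy, plus its decomposition into rotation/translation candidates. This is a didactic toy, not the RANSAC-robustified pipeline you'd run on real images (there you'd reach for something like cv2.findEssentialMat / cv2.recoverPose); the poses and points below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth relative pose (made up): X_cam2 = R_true @ X_cam1 + t_true
angle = 0.1
R_true = np.array([[np.cos(angle), 0, np.sin(angle)],
                   [0,             1, 0            ],
                   [-np.sin(angle), 0, np.cos(angle)]])
t_true = np.array([0.5, 0.1, 0.05])

# Synthetic 3D points in front of camera 1
X1 = np.column_stack([rng.uniform(-1, 1, 20),
                      rng.uniform(-1, 1, 20),
                      rng.uniform(4, 6, 20)])
X2 = X1 @ R_true.T + t_true

# Normalized homogeneous image coordinates (pinhole camera, K = I)
x1 = X1 / X1[:, 2:3]
x2 = X2 / X2[:, 2:3]

# Eight-point algorithm: each match gives one linear constraint
# x2^T E x1 = 0, i.e. kron(x2, x1) . vec(E) = 0
A = np.array([np.kron(b, a) for a, b in zip(x1, x2)])
_, _, Vt = np.linalg.svd(A)
E = Vt[-1].reshape(3, 3)

# Enforce the essential-matrix property: singular values (1, 1, 0)
U, _, Vt = np.linalg.svd(E)
E = U @ np.diag([1.0, 1.0, 0.0]) @ Vt

# Epipolar residuals |x2^T E x1| should vanish on noise-free data
res = np.abs(np.einsum('ij,jk,ik->i', x2, E, x1))

# Decompose E into the classic candidate poses; a real pipeline
# (e.g. cv2.recoverPose) picks the one with positive depths (cheirality)
W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
U, _, Vt = np.linalg.svd(E)
Rs = []
for Rc in (U @ W @ Vt, U @ W.T @ Vt):
    if np.linalg.det(Rc) < 0:   # fix the sign so Rc is a proper rotation
        Rc = -Rc
    Rs.append(Rc)
t_est = U[:, 2]                  # translation direction, up to sign/scale
```

Note that a monocular setup only ever recovers the translation direction, not its scale; that scale ambiguity is one reason pure 2D-image SLAM is harder than RGB-D or stereo.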

Summary: first, look for a structure-from-motion implementation. Second, be aware that this is not easy and needs some understanding of SLAM, mapping, Bayes filters (resp. particle filters), error relaxation and ... stuff :)
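As a sketch of the "treat the projected points as laser scanner data" step from above: once the registered 3D points are in a ground-aligned frame, you can drop the height axis and bin the remaining 2D points by bearing, keeping the nearest hit per beam, like a real scanner would. The function name and parameters here (points_to_scan, n_beams, max_range) are made up for illustration:

```python
import numpy as np

def points_to_scan(points, n_beams=360, max_range=10.0):
    """Collapse 3D points (ground-aligned frame, sensor at origin) into a
    laser-scan-like array of ranges, one per angular bin."""
    xy = points[:, :2]                       # project onto the ground plane
    r = np.linalg.norm(xy, axis=1)
    theta = np.arctan2(xy[:, 1], xy[:, 0])   # bearing in [-pi, pi]
    bins = ((theta + np.pi) / (2 * np.pi) * n_beams).astype(int) % n_beams
    scan = np.full(n_beams, max_range)       # "no hit" = max_range
    np.minimum.at(scan, bins, np.minimum(r, max_range))  # keep nearest hit
    return scan

# Toy usage: two points ahead (+x) at ranges 2 and 3, one to the left (+y)
scan = points_to_scan(np.array([[2.0, 0.0, 0.5],
                                [3.0, 0.0, 1.0],
                                [0.0, 1.0, 0.2]]), n_beams=4)
```

The resulting array has the same shape as a 2D laser scan message, so it can be handed to a standard occupancy-grid mapper.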

PS: Google is trying something similar but more advanced with Project Tango. I think they use stereo cameras there.