I want to write a program that takes a video or a sequence of images (for example, of a box on the floor), computes the edges of the box, and generates a basic wireframe or textured 3D model.
I was wondering what would be a good workflow structure.
The steps I have come up with so far are:
1 Input the image sequence.
2 Locate feature points that lie on a Hough line and extract feature descriptors for them.
3 Match feature points/Hough lines between images, perhaps using optical flow to constrain the search area (mask).
4 Check that the texture between feature points/Hough lines is an unchanging planar texture from image to image, using some form of histogram matching.
5 Somehow select the feature points/Hough lines that best match the intersection of two planes.
6 Perform Delaunay triangulation to generate the wireframe, and perhaps extract the textured planar surfaces.
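To make step 5 more concrete, here is a rough sketch of the geometry I have in mind. It assumes the box faces have already been recovered as 3D planes of the form n · x = d, which the steps above do not yet provide. The edge shared by two faces is then the line where the two planes meet, and its direction is the cross product of the two normals. The function names are my own, and this is plain Python rather than any particular library:

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def plane_intersection(n1, d1, n2, d2, eps=1e-12):
    """Intersect the planes n1 . x = d1 and n2 . x = d2.

    Returns (point, direction) describing the shared line,
    or None if the planes are (nearly) parallel.
    """
    direction = cross(n1, n2)
    denom = dot(direction, direction)
    if denom < eps:
        return None  # parallel faces share no edge

    # Closed form for a point lying on both planes:
    # p = (d1 * (n2 x u) + d2 * (u x n1)) / |u|^2, with u = n1 x n2
    a = cross(n2, direction)
    b = cross(direction, n1)
    point = tuple((d1 * ai + d2 * bi) / denom for ai, bi in zip(a, b))
    return point, direction


# Hypothetical example: the floor (z = 0) and one side face of the box (x = 1).
floor = ((0.0, 0.0, 1.0), 0.0)
side = ((1.0, 0.0, 0.0), 1.0)
edge = plane_intersection(floor[0], floor[1], side[0], side[1])
print(edge)
```

With noisy plane estimates, the matched Hough lines from step 3 could then be scored by how closely they follow the projection of such an edge, but that part is still an open question for me.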
If anybody can suggest a better approach or point me to some papers, I would appreciate it.
The program doesn't need to run in real time; it just needs to work robustly.