# Technique to find corresponding objects across stereo views

Hi all,

We have fixed stereo pairs of cameras looking into a closed volume. We know the dimensions of the volume and have the intrinsic and extrinsic calibration values for the camera pairs. The objective is to identify the 3D positions of multiple identical objects accurately, which naturally leads to what the literature calls the correspondence problem. We need a fast technique to match ball A from image 1 with ball A from image 2, and so on. At the moment we use the properties of epipolar geometry (the fundamental matrix) to match the balls from different views in a crude way. This works OK when the objects are sparse, but gives a lot of false positives when the objects are densely scattered. Since ball A in image 1 can lie anywhere on its epipolar line across image 2, mismatches occur when multiple similar-looking objects lie on that line.
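For reference, the crude epipolar check described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the function name `epipolar_distance` is made up here, and it assumes the fundamental matrix follows the convention `x2^T F x1 = 0` with pixel coordinates already undistorted.

```python
import numpy as np

def epipolar_distance(F, pt1, pt2):
    """Pixel distance from pt2 (image 2) to the epipolar line of pt1 (image 1).

    Assumes the convention x2^T F x1 = 0 (an assumption, check your calibration).
    """
    x1 = np.array([pt1[0], pt1[1], 1.0])
    x2 = np.array([pt2[0], pt2[1], 1.0])
    line = F @ x1                      # (a, b, c) with a*x + b*y + c = 0 in image 2
    a, b, _ = line
    return abs(x2 @ line) / np.hypot(a, b)
```

Every ball in image 2 whose distance falls below some pixel tolerance is a candidate, which is exactly where the false positives come from when several balls sit near the same line.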

Is there a way to re-model this as a 3D line intersection problem or something similar? Since ball A in image 1 can only take a bounded range of 3D positions, is there a way to represent it as a line in 3D and do an intersection test to find the closest matching ball in image 2?

Or is there a way to generate a sparse list of 3D values which correspond to each 2D grid of pixels in images 1 and 2, and do an intersection test on these values to find the matching objects across the two cameras?

Because the objects can be identical, OpenCV feature matching approaches such as FLANN and ORB don't work.

Any ideas in the form of formulae or code are welcome.

Thanks! Sak


you understand the math well. go with your intuition.

consider a 3D ray/line for an object in the image. lines for the same object, from both camera views, will intersect (or pass closely) in space. lines that intersect/come close may not be of the same object, due to the correspondence problem, as you understand.

you'll need to calculate those lines for your objects. then, to check feasible correspondence, calculate the closest distance for each pair and check if it's small. that's a little bit of linear algebra involving cross products. you'll know that you have an unresolvable ambiguity when multiple candidates come close.
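a minimal numpy sketch of that distance check, assuming each ray is given as an (origin, direction) pair in world coordinates. the names `ray_distance`, `candidate_matches` and `max_dist` are made up for illustration, and the lines are treated as infinite (no check that the closest point lies in front of the camera).

```python
import numpy as np

def ray_distance(o1, d1, o2, d2, eps=1e-9):
    """Shortest distance between two infinite 3D lines o + t*d."""
    o1, d1, o2, d2 = (np.asarray(v, dtype=float) for v in (o1, d1, o2, d2))
    n = np.cross(d1, d2)               # common perpendicular of both lines
    n_norm = np.linalg.norm(n)
    if n_norm < eps * np.linalg.norm(d1) * np.linalg.norm(d2):
        # (near-)parallel lines: distance from o2 to the first line
        return np.linalg.norm(np.cross(o2 - o1, d1)) / np.linalg.norm(d1)
    # project the offset between the origins onto the common perpendicular
    return abs(np.dot(o2 - o1, n)) / n_norm

def candidate_matches(rays1, rays2, max_dist):
    """For each ray of view 1, list the indices of view-2 rays within max_dist."""
    return [
        [j for j, (o2, d2) in enumerate(rays2)
         if ray_distance(o1, d1, o2, d2) <= max_dist]
        for (o1, d1) in rays1
    ]
```

an entry with exactly one index is an unambiguous match, an empty entry means no partner was found, and an entry with several indices is the unresolvable case mentioned above. since the volume dimensions are known, candidates whose closest point falls outside the volume could also be rejected.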

to resolve ambiguities, you can try giving objects identity, and track them over time. give objects position and velocity. associate detections in a new video frame to objects given their predicted position over the time step, then adjust position and velocity estimate. you can formalize this as a Kalman filter, or go with the simple formulation I gave.
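the simple formulation could look roughly like this. it is a sketch only: the `Track` class, the `associate` helper, the `gate` radius and the `alpha`/`beta` gains are all made-up names and values, and the greedy nearest-neighbour assignment stands in for whatever association scheme you end up using.

```python
import numpy as np

class Track:
    """One object with a position and velocity estimate (constant-velocity model)."""

    def __init__(self, position):
        self.pos = np.asarray(position, dtype=float)
        self.vel = np.zeros_like(self.pos)

    def predict(self, dt):
        return self.pos + self.vel * dt

    def update(self, measured, dt, alpha=0.8, beta=0.5):
        # blend the prediction with the measurement (alpha-beta filter)
        predicted = self.predict(dt)
        residual = np.asarray(measured, dtype=float) - predicted
        self.pos = predicted + alpha * residual
        self.vel = self.vel + (beta / dt) * residual

def associate(tracks, detections, dt, gate):
    """Greedy nearest-neighbour association: returns {track_index: detection_index}."""
    pairs = []
    for ti, track in enumerate(tracks):
        predicted = track.predict(dt)
        for di, det in enumerate(detections):
            dist = float(np.linalg.norm(np.asarray(det, dtype=float) - predicted))
            if dist <= gate:           # ignore detections too far from the prediction
                pairs.append((dist, ti, di))
    assigned, used_tracks, used_dets = {}, set(), set()
    for dist, ti, di in sorted(pairs):  # closest pairs get assigned first
        if ti not in used_tracks and di not in used_dets:
            assigned[ti] = di
            used_tracks.add(ti)
            used_dets.add(di)
    return assigned
```

per frame you would call `associate`, then `update` each matched track with its detection. a Kalman filter replaces the fixed gains with ones derived from noise estimates, but the predict/associate/update loop stays the same.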

as for object detection, since you have white balls on dark background, simple "blob detection" (thresholding, contours/connected components, cv::moments for centroids) is probably good enough. I agree that feature matching is entirely unsuitable here.

calculating these lines given image pixel coordinates should involve the camera projection matrix (the 3x3 intrinsic matrix) and its pose matrix. "invert" the projection matrix and give it a screen coordinate (explicit inversion is numerically bad, so this is done in one "solve" step). you'll get a point on the line (and image plane), in the camera frame. you can turn that into a direction vector (4x1 homogeneous, x/y/z as you got them, w-coordinate is 0) and pass it through the pose matrix (4x4) to get that direction in world space. combine with the camera origin (4x1 homogeneous point, x/y/z are 0, w-coordinate is 1) mapped through the pose matrix to get the origin point of the line/ray in world space.
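in numpy that back-projection could be sketched like this. assumptions: `K` is the 3x3 intrinsic matrix, `cam_to_world` is a 4x4 pose that maps camera coordinates to world coordinates, and the pixel is already undistorted (e.g. via `cv2.undistortPoints`). OpenCV calibration usually gives you the world-to-camera transform, so you would likely need to invert it first. the function name is made up.

```python
import numpy as np

def pixel_to_world_ray(K, cam_to_world, pixel):
    """Back-project a pixel to a ray (origin, unit direction) in world space.

    K: 3x3 intrinsic matrix. cam_to_world: 4x4 pose (camera frame -> world frame).
    """
    u, v = pixel
    # solve K @ p = (u, v, 1) instead of explicitly inverting K
    p_cam = np.linalg.solve(K, np.array([u, v, 1.0]))
    # w = 0: a direction, so the translation part of the pose is ignored
    direction = (cam_to_world @ np.append(p_cam, 0.0))[:3]
    # w = 1: a point, so the camera centre picks up the translation
    origin = (cam_to_world @ np.array([0.0, 0.0, 0.0, 1.0]))[:3]
    return origin, direction / np.linalg.norm(direction)
```

the returned (origin, direction) pairs are what you feed into the pairwise closest-distance check between the two views.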

offhand I can't list the specific APIs in OpenCV that will solve all of this for you. if OpenCV happens to not have a function you need, maybe for linear algebra, add the Eigen library, or use python+numpy. opencv does do cross product, norms (L2 norm in particular), and matrix multiplication (dot product).


Thanks for the suggestion, will get back after I get it working.
