# Estimate object pose using multiple cameras

Hello!

TL;DR I need a function similar to solvePnP(), but that would be able to estimate the pose of a model using information from multiple cameras instead of only one camera

I am trying to find the pose (rotation and translation) of a simple object covered with markers, using n cameras placed around the object.

The pose of each camera is known: I already have a matrix Ci for each camera i such as for a point X=(x,y,z,1) in real world coordinates, Pi*X gives me the coordinates of that point in the camera's coordinate system.

The object I am trying to estimate is composed of m points, and I know the position of each of them in the object's coordinate system.

I am already able to find the object's coordinates in the plane of each cameras.

So if I put all this together, for each point j seen in a camera i I get this:

sij * Pij = Ci * A * Xj

where:

sij is an unknown scalar (it is here because we don't know how far from the camera the point found is) that multiplies the projection of the point j on the camera i (unknown)

Pij is the coordinates of the point j projected on the camera i: (x',y',1)T (known)

Ci is the matrix that describes the rotation and translation of the camera i (known)

A is the matrix I'm looking for, it describes the transformation between the object's coordinate system and real world coordinates (unknown)

Xj is the point j in the object's coordinate system: (x,y,z,1)T (known)

I will typically see 4 different points on 3 different cameras (the 12 points found are all differents), which would give me a set of 12 of those linear systems.

How do I find the matrix A that satisfies the best this set of linear systems ?

This problem looks like something that could be solved using DLT (https://en.wikipedia.org/wiki/Direct_...), but I'm not able to transform my systems to fit the form shown on this wikipedia page.

My question is similar to this one : http://answers.opencv.org/question/13..., but the answer there does not solve my problem because it requires that the points used to resolve the pose of the model are seen in multiple cameras.

edit retag close merge delete