Depth from Intensity and Photometric Stereo

asked 2014-06-09 11:12:32 -0600

updated 2014-08-10 13:03:01 -0600

I am interested in estimating the depth of a scene within 1 m of a sensor. I am using a single IR sensor and several (2-3) LED IR illumination sources. I intend to combine the information from intensity fall-off with the differences between subsequent frames illuminated from different sources, whose locations are known relative to the sensor. I can capture up to 120 fps, and my hope is that this frame rate is sufficient to capture moving articulated objects (up to some reasonable object speed).

My questions are the following:

1) If you have tried this approach, what are its limitations?

2) What frame rate is needed to capture moving objects if photometric stereo is used?

3) Do I need three sources of illumination, or can I use 2? Can the 3 sources be collinear?

4) Can anyone point me to quality recent work estimating depth from intensity fall-off, especially for close-range depth estimation?

1 answer

answered 2014-08-20 04:17:04 -0600

R.Saracchini

updated 2014-08-21 01:09:02 -0600

Answering your questions:

1) This approach is difficult to apply in uncontrolled conditions. Do not expect good results if your object does not fit the approach's constraints well. Moreover, Photometric Stereo (PS) does not recover depth directly; it computes the surface normal and albedo for each pixel. Depth is then recovered by surface gradient integration methods. Each step has its own particularities.

Normal computation: This depends largely on the approach you want to use. The classical approach assumes that the surface finish is Lambertian (matte paint, ceramic, paper-like) and that the light source directions and intensities are well known, so you will need to calibrate your PS rig carefully. Example-based approaches let you reconstruct most materials, as long as you know how the normalized light intensities behave for a given normal; this requires a reference table, normally built from a reference object with known normals captured under the same lighting conditions as the object to be reconstructed. Such approaches demand at least 3 light sources. SVD-based approaches do not need calibration (known light source positions and intensities), but they need a good number of light sources to recover good normals; using just 3 or 4 will give you "flattened" normals. Most assume the surface finish is Lambertian, or has a Lambertian component with a small specular component.
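To make the classical calibrated Lambertian case concrete: stacking the per-pixel intensities lets you solve for the albedo-scaled normal by least squares. A minimal sketch with NumPy (the function name and array layout are illustrative, not from any particular library):

```python
import numpy as np

def photometric_stereo(images, light_dirs):
    """Classical Lambertian PS: I = albedo * (L . n) per pixel.
    images:     (k, h, w) intensity images, one per light source.
    light_dirs: (k, 3) unit vectors pointing toward each light."""
    k, h, w = images.shape
    I = images.reshape(k, -1)                           # (k, h*w)
    # Least-squares solve L @ g = I, where g = albedo * normal
    g, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)
    normals = np.divide(g, albedo, out=np.zeros_like(g), where=albedo > 0)
    return normals.reshape(3, h, w), albedo.reshape(h, w)
```

With exactly 3 non-collinear lights the system is square and has a unique solution; more lights over-determine it and reduce noise sensitivity.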

The worst problem, in fact, is when the observed pixels do not closely follow the surface reflectance model expected by your PS method. This includes shadows (self-cast or projected), interreflection, specularity, and non-uniform lighting (most methods assume a constant light field provided by a light source at infinity). In the presence of these effects, the normals computed in such regions are distorted or completely wrong.
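When you have more images than the minimum, one common mitigation is to mask out pixels that are too dark (likely shadowed) or too bright (likely specular or saturated) before solving for normals. A minimal sketch; the threshold values here are illustrative assumptions, not standard constants:

```python
import numpy as np

def valid_pixel_mask(images, low=0.05, high=0.95):
    """images: (k, h, w) stack, one image per light source.
    Returns a boolean (h, w) mask of pixels whose normalized
    intensity stays inside (low, high) in every frame."""
    imgs = images.astype(np.float64) / images.max()
    return np.all((imgs > low) & (imgs < high), axis=0)
```

Pixels rejected by the mask can be solved with the remaining frames (if enough lights are left) or flagged as unreliable for the integration step.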

Depth computation: First, since this is a monocular method, you cannot determine depth on a metric scale, only on an arbitrary scale relative to a reference pixel (let's say, the top-left one). Some methods, such as the Frankot-Chellappa integrator, are really fast and are used in real-time applications. The problem is that when you integrate the surface gradient, you have to know where the depth discontinuities are (for instance, in a face, a pixel on the chin is not connected to the neck); otherwise your depth map will be badly distorted. Robust integrators use weight maps to deal with this, but photometric stereo alone cannot retrieve them; you will need to determine them by another method, such as stereo correspondence. The relative depth between disconnected regions is also impossible to determine by PS alone.
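For reference, the Frankot-Chellappa integrator mentioned above projects the gradient field onto the nearest integrable surface in the Fourier domain. A minimal NumPy sketch, assuming periodic boundaries and gradients expressed in pixel units:

```python
import numpy as np

def frankot_chellappa(p, q):
    """Integrate a gradient field (p = dz/dx, q = dz/dy) into a
    depth map via the Fourier-domain least-squares projection."""
    h, w = p.shape
    wx = np.fft.fftfreq(w) * 2.0 * np.pi     # angular frequencies in x
    wy = np.fft.fftfreq(h) * 2.0 * np.pi     # angular frequencies in y
    u, v = np.meshgrid(wx, wy)
    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = u**2 + v**2
    denom[0, 0] = 1.0                        # avoid division by zero at DC
    Z = (-1j * u * P - 1j * v * Q) / denom
    Z[0, 0] = 0.0                            # depth is defined up to a constant
    return np.real(np.fft.ifft2(Z))
```

Note that the recovered depth is only defined up to an additive constant, which matches the arbitrary-scale caveat above; discontinuities still have to be handled separately, since this integrator assumes a globally smooth surface.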

2) It depends on the speed of your moving object and the number of light sources. I have seen facial capture projects for people in motion that needed 200 fps for 5 light sources; a photometric stereo rig with 4 light sources delivering about 15 fps of reconstructions worked with a capture rate of more or less 150 fps. Remember that you need perfect synchronization between switching the lights on/off and capturing frames. If your light source intensity varies between captures (a light source is ...
