Computing depth map features

asked 2013-02-03 10:53:30 -0600

pavrom

For an object recognition/detection task, I'm looking for a way to use the depth maps I get from a Kinect-style camera to increase the accuracy of the eventual classifier. I was wondering whether computing standard features (for example SIFT, SURF, FAST or HOG) on depth maps could be useful in this context.

Thank you.

2 answers

answered 2013-02-08 12:44:35 -0600

FrieS

Hi adrom,

I'm currently writing my master's thesis on feature-based person detection in 3D data, so maybe I can help you get an overview of the available features.

The first important question is whether you want to do object detection or object recognition. For the latter you (normally) match one point cloud to another and compare their similarity, so features like FPFH or SURE (Surface Entropy, Fiolka 2012) might be a good choice: you can measure how similar the source object is to the target object by computing the distance between their feature descriptors. Other features for this task might be BRIEF, ORB, BRISK, FREAK and NARF.

For object detection, pure FPFH might not be such a good choice, because you get one feature histogram for each point of your point cloud. You could, however, combine FPFH with a clustering algorithm, or take some kind of average of the histograms, to get one descriptor for your object (or for a part of it).
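To make the "average of the histograms" idea concrete, here is a minimal, dependency-free sketch. It assumes the per-point histograms have already been computed elsewhere (for example by PCL, whose FPFH signature has 33 bins); the function name `pool_histograms` and the toy 3-bin data are only illustrative.

```python
def pool_histograms(point_histograms):
    """Average per-point histograms into one L1-normalized descriptor."""
    if not point_histograms:
        raise ValueError("need at least one per-point histogram")
    n_bins = len(point_histograms[0])
    pooled = [0.0] * n_bins
    for hist in point_histograms:
        if len(hist) != n_bins:
            raise ValueError("all histograms must have the same length")
        for i, value in enumerate(hist):
            pooled[i] += value
    # Mean over all points of the object (or object part).
    pooled = [v / len(point_histograms) for v in pooled]
    # L1-normalize so objects with different point counts stay comparable.
    total = sum(pooled)
    return [v / total for v in pooled] if total > 0 else pooled


# Toy 3-bin histograms standing in for real per-point FPFH signatures.
toy = [[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]]
descriptor = pool_histograms(toy)
```

Plain averaging throws away where on the object each histogram came from, which is why clustering first (and pooling per cluster) may work better.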

Besides this, there are a couple of features that adapt the HOG idea to depth data: HOD (Histogram of Oriented Depth), HDD (Histogram of Depth Difference) and RDSF (Relational Depth Similarities).
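As a rough illustration of the HOG idea applied to depth, the sketch below builds a magnitude-weighted histogram of depth-gradient orientations for a single cell. It is not a reference implementation of HOD: the cell/block layout and block normalization of HOG-style descriptors are left out, the depth map is assumed to be a list of rows of floats with no missing values, and `depth_gradient_histogram` and `n_bins` are names chosen here.

```python
import math


def depth_gradient_histogram(depth, n_bins=8):
    """Magnitude-weighted histogram of depth-gradient orientations (one cell)."""
    rows, cols = len(depth), len(depth[0])
    hist = [0.0] * n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences on the depth values.
            gx = (depth[r][c + 1] - depth[r][c - 1]) / 2.0
            gy = (depth[r + 1][c] - depth[r - 1][c]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat neighbourhood, no orientation to vote for
            angle = math.atan2(gy, gx) % (2 * math.pi)
            b = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
            hist[b] += mag
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


# Toy cell: depth increases by 1 per column, so every gradient points along +x.
ramp = [[float(c) for c in range(5)] for _ in range(5)]
hist = depth_gradient_histogram(ramp)
```

On real Kinect-style data the invalid (zero) depth readings would need to be masked or filled first, otherwise they produce large spurious gradients.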

Some others use the orientation of the normal vector, as FPFH does: HLSN (Histogram of Local Surface Normal, which works on point clouds) and HONV (Histogram of Oriented Normal Vectors, which works on depth maps).
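For the normal-based features, the basic ingredient is a per-pixel surface normal estimated from the depth map. The following sketch treats the depth map as a surface z(x, y) and takes the normal as (-dz/dx, -dz/dy, 1); it assumes unit pixel spacing and ignores the camera intrinsics, which a real implementation would have to account for. Binning the two resulting angles over a region would give a normal-orientation histogram in the spirit of the features above; `normal_angles` is a name made up for this example.

```python
import math


def normal_angles(depth, r, c):
    """Return (azimuth, zenith) of the surface normal at interior pixel (r, c)."""
    # Central differences approximate the partial derivatives of depth.
    dzdx = (depth[r][c + 1] - depth[r][c - 1]) / 2.0
    dzdy = (depth[r + 1][c] - depth[r - 1][c]) / 2.0
    # Unnormalized normal of the surface z(x, y); "0.0 - x" avoids negative zero.
    nx, ny, nz = 0.0 - dzdx, 0.0 - dzdy, 1.0
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    azimuth = math.atan2(ny, nx)   # direction of the tilt in the image plane
    zenith = math.acos(nz / norm)  # angle between the normal and the view axis
    return azimuth, zenith


# Toy surface tilted at 45 degrees: depth increases by 1 per column.
slope = [[float(c) for c in range(3)] for _ in range(3)]
az, ze = normal_angles(slope, 1, 1)
```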

There is code for the SURE features online, and FPFH is of course part of PCL. I couldn't find any code for the other features online, but the last four in particular shouldn't be too hard to implement.


@FrieS: thank you very much. Right now I'm trying to implement the strategy outlined in one of my previous comments (Feb 5 '13, right below). I'd like to know your opinion on it. Also, I didn't know about HOD, so thanks for that too.

pavrom (2013-02-09 09:40:58 -0600)

answered 2013-02-05 12:10:42 -0600

You can still try, but I'm not sure it will be relevant. Interest point detectors for depth maps do exist (see PCL, for example), but none are included in OpenCV so far. By the way, you will have to train two separate classifiers for that purpose (one on color/gray-level images and one on depth images). Moreover, depth maps are not aligned with RGB images, so you have to be careful about how you use your depth detector. Nevertheless, depth-based interest points could be useful; let us know which kind you select.


Thank you. I was thinking about collecting multiple features following the bag-of-words approach. For visual appearance, any feature that is fast to compute (for example FREAK) could be useful, while for depth I was thinking about using fast point feature histograms (FPFH). After collecting both kinds of features for each training sample, I would like to use the descriptors to train a single classifier. Do you think this strategy could be feasible? (I'm actually kind of a newbie to the whole thing.)
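For reference, the bag-of-words step mentioned here boils down to assigning each local descriptor to its nearest codeword and counting. A minimal sketch, assuming the codebook already exists (typically it would come from k-means over training descriptors, which is not shown) and using made-up 2D descriptors:

```python
def bow_histogram(descriptors, codebook):
    """Count how many descriptors fall closest to each codeword."""
    hist = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(d, word)) for word in codebook]
        hist[dists.index(min(dists))] += 1
    return hist


# Hypothetical 2-word codebook and three toy descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descs = [[0.1, 0.0], [0.9, 1.2], [1.0, 0.8]]
hist = bow_histogram(descs, codebook)
```

Binary descriptors such as FREAK are usually compared with the Hamming distance instead, so the distance line would have to change for them.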

pavrom (2013-02-05 15:46:32 -0600)

Descriptors for depth (FPFH) and for color (SIFT, SURF, ...) are represented by vectors of different sizes. I think it would be better to train one classifier on FPFH and another one on SIFT/SURF. After that, you can train a classifier on the occurrences of FPFH and SIFT/SURF for each object (bag of features). This solution seems tedious, but more relevant in my opinion. Alternatively, you can build a single descriptor from the occurrences of FPFH and SIFT/SURF for each object (as you suggest), but the images are not aligned, and the vector can be huge. Moreover, the features are not on the same scale, so you may need to normalize them before the training stage. I'm interested to see your results and which solution you choose.
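One simple way to handle the scale mismatch in the single-descriptor variant is to normalize each modality's bag-of-features histogram separately before concatenating. The sketch below uses L2 normalization, which is only one possible choice; the histograms and the helper names `l2_normalize` and `fuse_histograms` are illustrative.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are left as is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_histograms(color_hist, depth_hist):
    """Normalize each modality on its own, then concatenate."""
    return l2_normalize(color_hist) + l2_normalize(depth_hist)


# Toy bag-of-features histograms with different lengths and scales.
color_bow = [3.0, 4.0]
depth_bow = [0.0, 5.0, 0.0]
fused = fuse_histograms(color_bow, depth_bow)
```

Normalizing per modality keeps the block with more bins or larger counts from dominating distance-based classifiers.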

Mathieu Barnachon (2013-02-06 00:33:20 -0600)

Thank you. For the vector-size problem I could use PCA and keep only n components. The dataset I'm using was collected with OpenNI, which has a flag for registering the RGB and depth images so that they are aligned. Also, right now I'm more interested in recognition than detection. I'll probably try both strategies (one and two classifiers).

pavrom (2013-02-06 01:05:02 -0600)


Last updated: Feb 08 '13