Stereo Vision Related Queries

asked 2013-06-21 06:39:34 -0600

Hey guys,

Recently I was working a little with stereo vision, but I ran into a serious problem. I was implementing a fairly simple DSI (Disparity Space Image) based stereo matcher (built using Normalized Cross Correlation with RGB as the feature), with dynamic programming to minimize the error. It is a pretty simple model, but it produces too many local errors. I also ignored a lot of things, such as occlusion and inter-scanline consistency.
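
For readers who want something concrete, the kind of pipeline described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the code actually used: it works on a single grayscale scanline instead of RGB, uses 1 - NCC as the matching cost, and the names `ncc`, `dsi_row`, `dp_scanline` and the `smooth` weight are made up for illustration. Like the model in the question, it ignores occlusion and inter-scanline consistency.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length patches, in [-1, 1]."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0  # flat patches carry no information

def dsi_row(left, right, max_disp, half=1):
    """One DSI slice: cost[x][d] = 1 - NCC(window at left[x], window at right[x - d])."""
    n = len(left)
    invalid = 2.0  # worst possible value of 1 - NCC, used where no window fits
    cost = [[invalid] * (max_disp + 1) for _ in range(n)]
    for x in range(half, n - half):
        wl = left[x - half:x + half + 1]
        for d in range(max_disp + 1):
            xr = x - d
            if xr - half < 0:
                break  # window would fall off the right image
            cost[x][d] = 1.0 - ncc(wl, right[xr - half:xr + half + 1])
    return cost

def dp_scanline(cost, smooth=0.1):
    """Pick one disparity per pixel minimizing
    sum(cost[x][d_x]) + smooth * sum(|d_x - d_(x-1)|) along the scanline."""
    nd = len(cost[0])
    acc = [list(cost[0])]  # accumulated cost of the best path ending at (x, d)
    back = []              # back[x - 1][d] = best previous disparity
    for x in range(1, len(cost)):
        row, brow = [], []
        for d in range(nd):
            p = min(range(nd), key=lambda q: acc[-1][q] + smooth * abs(d - q))
            row.append(cost[x][d] + acc[-1][p] + smooth * abs(d - p))
            brow.append(p)
        acc.append(row)
        back.append(brow)
    d = min(range(nd), key=lambda k: acc[-1][k])
    path = [d]
    for brow in reversed(back):  # walk the back-pointers to recover the path
        d = brow[d]
        path.append(d)
    path.reverse()
    return path

# Toy usage: the right scanline is the left one shifted by 2 pixels.
left = [3, 7, 1, 9, 4, 8, 2, 6, 5, 0, 7, 3]
right = left[2:] + [0, 0]
print(dp_scanline(dsi_row(left, right, max_disp=3)))
```

The smoothness term is what suppresses isolated wrong matches, but because each scanline is solved on its own, it can also produce the streaking that comes from ignoring inter-scanline consistency.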

Observations

But the thing I observed was that if I use the block matching model, I will never be able to get any information about curvature. My disparity map will look like step functions at intensity edges (since I am using RGB features for correlation), which is pretty obvious from the primitive energy minimization model. But I cannot really switch to other models, for two reasons. One, I am not strong enough in math to understand graph cuts quickly. Two, I need speed more than accuracy, but hey, I am greedy and I would appreciate accuracy too.

Now for the queries:

If I use other feature vectors obtained from neighborhood relations or other transformations, it is kind of intuitive that they would give better results, as I would have more features than just RGB. But what do I choose if I need information about the curvature of a plain cylinder, which has the same features all over its surface? It actually just came to my mind that the disparity map will look like steps ( " _/'''\_ ", didn't get it, did you?). It can never be more than that if I use spatial domain features. So, any suggestions for getting around that problem?
If I need decent accuracy in the disparity estimate, but quickly, are there any other methods that you can suggest? I have tried the graph cut algorithms from Middlebury, and they give awesome results but are very slow. I have also had a look at NVIDIA's Push Relabel method, which works only if you have an NVIDIA card. Sad.
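
On the step-shaped disparity in the first query: part of the staircase comes simply from disparities being integers. One common, cheap refinement (not something from the original post) is to fit a parabola through the matching cost at the best integer disparity and its two neighbours, and take the vertex as a sub-pixel estimate. A minimal sketch, where `subpixel_disparity` is a hypothetical helper name and `costs` is assumed to be the cost curve of one pixel over all candidate disparities:

```python
def subpixel_disparity(costs, d):
    """Refine integer disparity d using a parabola through
    (d - 1, costs[d - 1]), (d, costs[d]), (d + 1, costs[d + 1])."""
    if d <= 0 or d >= len(costs) - 1:
        return float(d)  # no neighbour on one side, keep the integer value
    c0, c1, c2 = costs[d - 1], costs[d], costs[d + 1]
    curvature = c0 - 2.0 * c1 + c2
    if curvature <= 0:
        return float(d)  # flat or non-convex cost: no reliable minimum
    return d + 0.5 * (c0 - c2) / curvature

# Toy cost curve whose true minimum lies between integer disparities.
costs = [(k - 2.3) ** 2 for k in range(5)]
best = min(range(len(costs)), key=costs.__getitem__)
print(best, subpixel_disparity(costs, best))
```

This only helps where the cost curve has a well-defined minimum. On a truly textureless surface the curve is flat, and no choice of interpolation recovers the curvature.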

Expectations

I will be really happy if you (the reader of this long and boring question) can guide me towards some resources, share your experience, suggest a method, etc. I will be happy even if you try to think for a moment for answering this. I don't know whether this question belongs to this forum or not, but I find many people interested in OpenCV here and found people working on stereo too. Moderators can close this question if they feel so. But I thought I should try my luck here. Sorry if I am breaking any rules.

Regarding OpenCV

Are there GraphCut Energy Minimisation algorithms (implemented / in pipeline) for Stereo in OpenCV?
Can you guys start a Discussion Forum as well apart from Q&A Forum? Is it feasible?

Thanks for trying to help

Regards,

Prasanna S

answered 2014-05-07 10:16:46 -0600

Pedro Batista

updated 2014-05-07 10:32:18 -0600

Good question. I don't have any answers for you; I'm just posting this comment as an answer to bump the question up and give it attention, so please do not delete it.

I've been investigating stereo vision myself, as I need to assess its potential. I've tried OpenCV's algorithms, but they are really slow (3 - 4 fps) and the accuracy is far from what I need for the kind of applications I build. I've also been searching for a discussion forum about this technology, so far with no luck.

Since you seem much more into stereo vision than me, can you answer a question? Are there any stereo vision algorithms that work at 30 fps with accurate output? Or is the technology just not there yet? Most videos I find on YouTube about stereo are pretty poor, but sometimes I come across things like this. What do you think about it? What is behind these kinds of results?

I invite anyone invested in this issue to join the discussion.

EDIT: To bring in some input regarding your question, there are some good curvature features, such as the Histogram of Oriented Gradients, which measures the strength of the edges in an image, binned by their orientation.

Regards

Comments

Well, I haven't come across any real-time stereo algorithm that works without additional hardware such as GPGPUs. The video link you posted was indeed a surprise to me. How could they possibly do it? I will dig a bit more. Thanks for sharing! I know you will have come across devices like the Kinect (infrared) and Time of Flight cameras. Why don't you give them a try too? They seem to work in real time. And regarding curvature, I didn't get your point on how HoG helps. Can you please elaborate? I will be happy even if someone can point out that it's impossible to get the curved surface of a cylinder / sphere with a stereo rig.

Prasanna ( 2014-05-07 10:48:18 -0600 )

Actually, I work a lot with infrared cameras like the Kinect, but I am looking for an alternative that gives a depth map with similar accuracy and frame rate. Well, since I don't know how your stereo algorithm works, I cannot elaborate. It was just an idea that may help with segmenting curved edges in an image; I don't know where to go from there.

Pedro Batista ( 2014-05-07 11:37:13 -0600 )

About the video, I dug a little further. On their (Austrian Institute of Technology) website there is a publication called "Cooperative and asynchronous stereo vision for dynamic vision sensors". I believe this is related to the video I linked in my answer. So, they use a Dynamic Vision Sensor (DVS), which is a different type of camera. It is actually pretty amazing, check this link: https://www.youtube.com/watch?v=QxJ-RTbpNXw

Pedro Batista ( 2014-05-07 11:49:14 -0600 )

Well, maybe you should wait for Kinect 2.0, with which you can experiment with Time of Flight for depth. There are also these techniques which I came across: depth from blur, depth from defocus, depth from photometry. I don't know anything about their performance, I have just heard about them. Maybe you want to look into them.

What I actually meant by curvature was this: when you set your stereo rig in front of a curved surface like a cylinder / sphere, can you get a depth map in which the depth varies as you move along the surface, or will it just look like the depth map you get from repeating the experiment with a cube?

In short: will you be able to differentiate a cube from a cylinder of the same dimensions (side of cube = height of cylinder = dia. of cylinder)?

Prasanna ( 2014-05-07 11:49:36 -0600 )

Well, that was an amazing video indeed. Thanks. I came across this kind of sensor under the names Neuromorphic Cameras and Optical Flow sensors. Pretty amazing things they are.

Regarding S3E, the technique they used is based on a census transform or something. I had never heard of it before. I will share any details if I come across anything interesting.

Prasanna ( 2014-05-07 11:59:35 -0600 )

Please do. Thanks for your suggestions, I've already read about most of them. Actually, the guys from the Middlebury lab made a ranking of stereo vision algorithms that compares more than 100 methods. Check this link: http://vision.middlebury.edu/stereo/eval/#references . The 2nd best method (the best one I found with its scientific paper available) still isn't good enough for my needs, since it "only" processes 10 images per second, and I build 30 fps applications. According to the paper, the algorithm is very well optimized, with parallelization and GPU usage, so I'm starting to conclude that the technology isn't developed enough to replace 3D sensors (like the Kinect).

Pedro Batista ( 2014-05-07 12:21:05 -0600 )

About your problem, my immediate answer is yes: the depth map should show different distance values along the curved surface, and a good algorithm will most likely produce that. With a 3D sensor like the Kinect that is no problem at all.
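
A quick back-of-the-envelope check of the cube versus cylinder case, as a sketch under assumptions: a rectified pinhole stereo pair, for which disparity = focal length x baseline / depth, and a rig and object whose numbers (700 px focal length, 10 cm baseline, 20 cm wide object 1 m away) are made up purely for illustration. Perspective effects across the object are ignored.

```python
import math

def disparity_px(depth_m, focal_px, baseline_m):
    """Disparity of a point at the given depth for a rectified stereo pair."""
    return focal_px * baseline_m / depth_m

def cylinder_depth_m(x_m, axis_depth_m, radius_m):
    """Depth of the front surface of an upright cylinder at lateral
    offset x_m from its axis (camera looking straight at the axis)."""
    return axis_depth_m - math.sqrt(max(0.0, radius_m ** 2 - x_m ** 2))

# Hypothetical rig and object, not taken from the thread.
FOCAL_PX, BASELINE_M = 700.0, 0.10
AXIS_DEPTH_M, RADIUS_M = 1.0, 0.10

for x_m in (0.0, 0.05, 0.08, 0.10):
    z_cyl = cylinder_depth_m(x_m, AXIS_DEPTH_M, RADIUS_M)
    z_cube = AXIS_DEPTH_M - RADIUS_M  # flat front face, cube side = 2 * radius
    print(x_m,
          round(disparity_px(z_cyl, FOCAL_PX, BASELINE_M), 2),
          round(disparity_px(z_cube, FOCAL_PX, BASELINE_M), 2))
```

With these made-up numbers the true disparity of the cylinder drops by roughly 8 pixels between its centre line and its silhouette, while the cube face stays constant. So the geometry does separate the two shapes; whether a matcher recovers that depends on the surface having enough texture to match and on the disparity resolution.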

Pedro Batista ( 2014-05-07 12:23:42 -0600 )

Here is a link to my video of a census stereo algorithm in OpenCV: https://www.youtube.com/watch?v=glwB9QI8C9Y I can provide source code if folks are interested. It could be optimized further. I think that for a CPU algorithm that you want to run at 30 fps, you will find that SGBM is the best tradeoff between accuracy and speed. However, I would love for someone to replace the BT cost function in SGBM with the census cost function (Hamming distance), and similarly for BM (which is essentially what I tried to do). The OpenCV source code isn't very revealing, so I would need a decent amount of time to understand how to insert the different cost functions, but it would be well worth the effort.
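
For anyone unfamiliar with the census cost mentioned here, a minimal sketch of the general idea follows. It is not the code from the video and not how OpenCV implements anything; the 3x3 window, the function names, and the plain nested-list image format are assumptions for illustration. Each pixel is described by a bit string recording which neighbours are darker than it, and the matching cost is the Hamming distance between two such bit strings.

```python
def census_3x3(img, y, x):
    """8-bit census signature: one bit per neighbour in the 3x3 window,
    set to 1 when that neighbour is darker than the centre pixel."""
    centre = img[y][x]
    bits = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # the centre is the reference, not a neighbour
            bits = (bits << 1) | (1 if img[y + dy][x + dx] < centre else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")

def census_cost(left, right, y, x, d):
    """Matching cost between left pixel (y, x) and right pixel (y, x - d)."""
    return hamming(census_3x3(left, y, x), census_3x3(right, y, x - d))

# Toy usage: the right image is the left one with a gain and offset change.
left = [[10, 20, 30, 40],
        [50, 5, 60, 70],
        [80, 90, 15, 25]]
right = [[2 * v + 10 for v in row] for row in left]
print(census_cost(left, right, 1, 1, 0))
```

Because the signature depends only on intensity ordering, it is unaffected by gain and offset differences between the two cameras, which is one reason it is attractive as a replacement cost.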

Der Luftmensch ( 2014-05-07 15:48:35 -0600 )

@Der Luftmensch have you considered using a fill algorithm on your depth map? It seems to have good potential for smoothing out the depth map. Something along the lines of http://www.youtube.com/watch?v=rEm-u_sgWyM would be interesting. I have some old code that did this with a few OpenCV functions, with rather good results under the right conditions (I'll try to find and post it). I would be interested in testing with your source/method as well.

Added: http://gc2011.graphicon.ru/files/gc2011/proceedings/conference/gc2011berdnikov.pdf

Jacob Smith ( 2014-05-07 22:49:53 -0600 )

@Der Luftmensch your algorithm seems to create a very good depth map, far superior to SGBM. Is that right?

Pedro Batista ( 2014-05-08 04:09:59 -0600 )