# stereo matching, census based

Hi, I'm currently having trouble understanding a part of this paper:

My problem is the part after the subpixel calculation on page 17. I don't understand how to get the subpixel disparity map for both directions. I'm also a little confused about whether my cost aggregation is correct. The paper recommends using a 5x5 window and summing the values over this block. Do I sum all values in this 5x5 block, or do I add only every second value in every second row, as I did for the census transform? Thanks for the help!


You add all values in the 5x5 window. That is cost regularization with a simple box filter that does not respect edges. As for L->R and R->L matching, draw yourself some diagrams and convince yourself that from a single DSI you can get both a left disparity and a right disparity. The sub-pixel step is the same for each.
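As a minimal sketch of that aggregation step, the snippet below sums every value in the window around each pixel of one disparity slice. The function name `box_aggregate` and the choice to clip windows at the image border are assumptions made for illustration; the paper may treat borders differently.

```python
def box_aggregate(cost, radius=2):
    """Sum all values in the (2*radius+1) x (2*radius+1) window per pixel.

    `cost` is one disparity slice of the DSI, given as a list of rows.
    Windows are clipped at the image border (an assumption).
    """
    height, width = len(cost), len(cost[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += cost[yy][xx]
            out[y][x] = total
    return out


ones = [[1] * 7 for _ in range(7)]
agg = box_aggregate(ones)
print(agg[3][3])  # 25: the full 5x5 window fits inside the image
print(agg[0][0])  # 9: only a 3x3 part of the window lies inside the image
```

Every one of the 25 values contributes, unlike a sparse census pattern that samples only every second pixel. The same function would be applied to each disparity slice in turn.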


I'm not sure I can follow. The block diagram doesn't say that there are DSI and DSIaggr for multiple directions. So I have the aggregated costs for my disparities, and from those I can get disparity maps for both directions? I only see L->R or R->L matching when calculating the DSI, but that would result in two DSIs if I do it for both directions... Is there a way to get the other direction if you have one? If I add the disparity levels to the left image, am I matching L->R? Because that's what I'm doing so far.

(2017-03-10 03:41:40 -0500)

Every comparison that you need to make for R->L you have already made for L->R; you just read a different set of comparisons. So if you have, say, 48 disparities, then for pixel DisparityL(i,j) and DisparityR(i,j) you will look at two different sets of 48 pixels, though these sets will have members in common. For the left image pixel (DisparityL(i,j)) it's all 48 pixels to the left in the right image (R(i,j-48:j)), and for the right image pixel (DisparityR(i,j)) it's all 48 pixels to the right in the left image (L(i,j:j+48)), all as found in the DSI. Isn't there a section in that paper that shows how each DSI (L and R) can be contained in one, where they are "overlaid"? If not, look for other papers by the authors.

(2017-03-10 07:23:54 -0500)
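The idea in the comment above can be sketched as follows, assuming the DSI stores at `dsi[d][y][x]` the cost of matching left pixel `(y, x)` with right pixel `(y, x - d)`. The function names, the absolute-difference cost (used in place of the census/Hamming cost to keep the example short), and the toy one-row images are all assumptions for illustration.

```python
INF = float("inf")


def build_dsi(left, right, n_disp):
    """dsi[d][y][x] = cost of matching left (y, x) with right (y, x - d)."""
    height, width = len(left), len(left[0])
    dsi = [[[INF] * width for _ in range(height)] for _ in range(n_disp)]
    for d in range(n_disp):
        for y in range(height):
            for x in range(d, width):
                dsi[d][y][x] = abs(left[y][x] - right[y][x - d])
    return dsi


def wta_left(dsi):
    """Left disparity: for left pixel x, read dsi[d][y][x] over all d."""
    n_disp, height, width = len(dsi), len(dsi[0]), len(dsi[0][0])
    disp = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            disp[y][x] = min(range(n_disp), key=lambda d: dsi[d][y][x])
    return disp


def wta_right(dsi):
    """Right disparity: for right pixel x, read dsi[d][y][x + d] over all d."""
    n_disp, height, width = len(dsi), len(dsi[0]), len(dsi[0][0])
    disp = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            candidates = [d for d in range(n_disp) if x + d < width]
            disp[y][x] = min(candidates, key=lambda d: dsi[d][y][x + d])
    return disp


# One-row toy pair with a true disparity of 2: right[x] == left[x + 2].
left = [[10, 20, 30, 40, 50, 60, 70, 80]]
right = [[30, 40, 50, 60, 70, 80, 90, 100]]
dsi = build_dsi(left, right, n_disp=4)
print(wta_left(dsi)[0])   # [0, 1, 2, 2, 2, 2, 2, 2]
print(wta_right(dsi)[0])  # [2, 2, 2, 2, 2, 2, 1, 0]
```

Both maps come from the same volume; only the read pattern differs. The wrong values at the left edge of the left map and the right edge of the right map are pixels whose true match falls outside the other image.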

OK, there is another paper where they explain it. Thank you for the information, it helped a lot. I have one more question, though. For the confidence map, I calculate the cost difference between the two best disparities from my aggregated cost function. Does it matter for which matching direction I determine the two best disparities?

(2017-03-28 08:32:24 -0500)

As I understand it, the ratio is between the lowest and second-lowest costs in the set of n disparities for the pixel in question. This is a final step and is performed only on the single final disparity image you are interested in (typically the left image). Of course, go ahead and do it for both if you want. Link to your project if it's public and perhaps I can offer more advice.

(2017-03-28 20:59:57 -0500)
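A rough sketch of such a measure for a single pixel is shown below. The helper names and the normalization `1 - lowest / second_lowest` are assumptions chosen for illustration; the paper's exact confidence formula may differ.

```python
def two_lowest(costs):
    """Return (best_index, best_cost, second_cost) for one pixel's costs."""
    order = sorted(range(len(costs)), key=lambda d: costs[d])
    best, second = order[0], order[1]
    return best, costs[best], costs[second]


def confidence(costs):
    """Hypothetical confidence in [0, 1]: 1 - lowest / second_lowest.

    A value near 1 means a clear winner, a value near 0 means the two
    best candidates are nearly indistinguishable.
    """
    _, lowest, second_lowest = two_lowest(costs)
    if second_lowest == 0:
        return 0.0  # both candidates are perfect matches: ambiguous
    return 1.0 - lowest / second_lowest


print(two_lowest([40, 12, 3, 15]))   # (2, 3, 12)
print(confidence([40, 12, 3, 15]))   # 0.75
print(confidence([10, 10, 10, 10]))  # 0.0
```

For the left disparity map, `costs` would be the aggregated costs over all disparities at that left pixel, the same values the winner-takes-all search looks at.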

Do you mind if I send you my code so you can take a look? Maybe you'll see a mistake. Or can you see what I'm doing wrong from the pseudocode?

(2017-04-20 02:17:38 -0500)

@rt90 I don't think I have the time right now to look carefully at your code, but I'll offer the following advice: forget census as a cost and forget RL matching. What you first need to get right is the most basic algorithm, which would be something close to 5x5 SAD and LR matching. Once you have that working, add RL matching and then the census cost. If you try to get the entire thing correct in one go, you'll really struggle; there are too many indexing mistakes to make. You may also need to think about the fact that with the census cost there is a radius, an apron, of pixels that cannot get a Hamming string. Ignoring that apron could have negative effects in WTA. So, again, concentrate on the easiest methods first and then gradually add complexity and sophistication.

(2017-04-22 16:54:33 -0500)
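To make the apron issue concrete, here is a small sketch of a dense census transform that marks apron pixels as invalid instead of giving them a bit string. The function names, the dense 3x3 default window, and the use of `None` plus a fixed `invalid_cost` are assumptions for illustration; the sparse pattern mentioned in the question would change which neighbors are visited, not the apron logic.

```python
def census_transform(img, radius=1):
    """Dense census transform; apron pixels get None instead of a bit string."""
    height, width = len(img), len(img[0])
    out = [[None] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # skip the center pixel itself
                    darker = img[y + dy][x + dx] < img[y][x]
                    bits = (bits << 1) | (1 if darker else 0)
            out[y][x] = bits
    return out


def hamming_cost(a, b, invalid_cost):
    """Hamming distance of two census strings, or invalid_cost in the apron."""
    if a is None or b is None:
        return invalid_cost
    return bin(a ^ b).count("1")


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
census = census_transform(img)
print(census[1][1])                      # 240, i.e. 0b11110000
print(census[0][0])                      # None: apron pixel
print(hamming_cost(0b1010, 0b0110, 8))   # 2
print(hamming_cost(None, 0b0110, 8))     # 8
```

Giving apron pixels a cost at least as large as the maximum Hamming distance is one way to keep them from winning the WTA search by accident.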

Hey, I'm still having some trouble, but I don't know where my mistake is. I wrote my program as pseudocode. I'm pretty sure my code is correct for census, DSI, Hamming distance, and aggregation. In the last two parts, LR matching and RL matching, I search for the lowest cost (WTA) over the disparities. Did I understand that correctly?

    load left_img
    load right_img

    calculate census_left(left_img)
    calculate census_right(right_img)

    // calculate DSI

    for d=0 to N_Disp
        for y=0 to height
            for x=d to width
                dsi[d].data[y*width + x] = hamming_distance(census_left[y*width + x],
                                                            census_right[y*width + x - d])
            end
        end
    end

    // aggregate costs with a 5x5 box filter

    for d=0 to N_Disp
        for y=0 to height
            for x=d to width
                sum_pixel = boxfilter(dsi[d].data[y*width + x],5)
                dsi_aggr[d].data[y*width + x] = sum_pixel
            end
        end
    end

    // LR matching
    disp_costs[N_Disp]
    minIndex=0
    min2Index=0
    for y=0 to height
        for x=0 to width
            for d=0 to N_Disp
                disp_costs[d] = dsi_aggr[d].data[y*width + x + d]
            end
            find_minimum(disp_costs,&minIndex,&min2Index)

            dsi_min_lr[y*width + x] = minIndex
            calculate_confidence_map_value(cm.data[y*width + x],dsi_aggr[].data[y*width + x],minIndex,min2Index)
        end
    end

    // RL matching
    for y=0 to height
        for x=0+N_Disp to width
            for d=0 to N_Disp
                disp_costs[d] = dsi_aggr[d].data[y*width + x - d]
            end
            find_minimum(disp_costs,&minIndex,&min2Index)

            dsi_min_rl[y*width + x] = minIndex
            calculate_confidence_map_value(cm,dsi_aggr,minIndex,min2Index)
        end
    end
