
Calculate surface normals from depth image using neighboring pixels cross product

asked 2016-01-06 09:07:32 -0600 by theodore

updated 2016-01-07 08:55:02 -0600

As the title says I want to calculate the surface normals of a given depth image by using the cross product of neighboring pixels. However, I do not really understand the procedure. Does anyone have any experience?

Let's say that we have the following image:

[image: input depth image]

what are the steps to follow?


Update:

I am trying to translate the following pseudocode from this answer into OpenCV code.

dzdx=(z(x+1,y)-z(x-1,y))/2.0;
dzdy=(z(x,y+1)-z(x,y-1))/2.0;
direction=(-dzdx,-dzdy,1.0)
magnitude=sqrt(direction.x**2 + direction.y**2 + direction.z**2)
normal=direction/magnitude

where z(x,y) is my depth image. However, the output of the following does not seem correct to me:

for(int x = 0; x < depth.rows; ++x)
{
    for(int y = 0; y < depth.cols; ++y)
    {
        // note: reads out of bounds at the image borders (fixed in Update 3)
        double dzdx = (depth.at<double>(x+1, y) - depth.at<double>(x-1, y)) / 2.0;
        double dzdy = (depth.at<double>(x, y+1) - depth.at<double>(x, y-1)) / 2.0;
        // bug: the comma operator makes this d = Vec3d(1.0, 0, 0); it should be
        // the constructor Vec3d d(-dzdx, -dzdy, 1.0), see Update 2
        Vec3d d = (dzdx, dzdy, 1.0);
        Vec3d n = normalize(d);
    }
}

Update 2:

Ok, I think I am close:

Mat normals(depth.size(), CV_64FC3);

for(int x = 0; x < depth.rows; ++x)
{
    for(int y = 0; y < depth.cols; ++y)
    {
        double dzdx = (depth.at<double>(x+1, y) - depth.at<double>(x-1, y)) / 2.0;
        double dzdy = (depth.at<double>(x, y+1) - depth.at<double>(x, y-1)) / 2.0;

        Vec3d d;
        d[0] = -dzdx;
        d[1] = -dzdy;
        d[2] = 1.0;

        Vec3d n = normalize(d);
        normals.at<Vec3d>(x, y) = n;
    }
}

which gives me the following image:

[image: normals computed in Update 2]


Update 3:

following @berak's approach:

depth.convertTo(depth, CV_64FC1); // I do not know why it needs to be converted to a 64-bit image; my input is 32-bit

Mat nor(depth.size(), CV_64FC3);

for(int x = 1; x < depth.cols - 1; ++x)
{
    for(int y = 1; y < depth.rows - 1; ++y)
    {
        /* previous central-difference version:
        double dzdx = (depth.at<double>(y, x+1) - depth.at<double>(y, x-1)) / 2.0;
        double dzdy = (depth.at<double>(y+1, x) - depth.at<double>(y-1, x)) / 2.0;
        Vec3d d(-dzdx, -dzdy, 1.0); */
        Vec3d t(x, y-1, depth.at<double>(y-1, x));
        Vec3d l(x-1, y, depth.at<double>(y, x-1));
        Vec3d c(x, y, depth.at<double>(y, x));
        Vec3d d = (l - c).cross(t - c);
        Vec3d n = normalize(d);
        nor.at<Vec3d>(y, x) = n;
    }
}

imshow("normals", nor);

I get this one:

[image: normals via cross product (64-bit input)]

which seems quite OK. However, if I use a 32-bit image instead of a 64-bit one, the output is corrupted:

[image: corrupted normals (32-bit input)]


Comments

I see the problem like this: in the image plane, take M1(x1, y1, -f), with f the focal length and the optical center at (0, 0, 0). The depth of M1 is d1, so there is a physical point P1 with the vector equation OM1 = a·OP1; you can find P1's coordinates. Repeat the procedure for M2(x1-1, y1, -f) and M3(x1, y1-1, -f): you have a plane, and you find the normal to this plane...

LBerger ( 2016-01-06 09:50:05 -0600 )
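(For illustration, a minimal sketch of LBerger's idea; the focal length f and principal point (cx, cy) are assumed inputs that are not given in the thread, and the helper names are made up:)

// assumes #include <opencv2/opencv.hpp> and using namespace cv, like the rest of the thread
// pinhole back-projection: a pixel (x, y) at depth d maps to (d/f)*(x-cx, y-cy, f)
Vec3d backproject(int x, int y, double d, double f, double cx, double cy)
{
    return Vec3d((x - cx) * d / f, (y - cy) * d / f, d);
}

// normal of the plane spanned by the back-projections of a pixel and two neighbours;
// depth is assumed CV_64FC1 and (x, y) must not lie on the image border
Vec3d planeNormal(const Mat& depth, int x, int y, double f, double cx, double cy)
{
    Vec3d P1 = backproject(x,     y,     depth.at<double>(y, x),     f, cx, cy);
    Vec3d P2 = backproject(x - 1, y,     depth.at<double>(y, x - 1), f, cx, cy);
    Vec3d P3 = backproject(x,     y - 1, depth.at<double>(y - 1, x), f, cx, cy);
    return normalize((P2 - P1).cross(P3 - P1));
}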

@berak I do not see your answer.

theodore ( 2016-01-06 10:01:23 -0600 )

^^ sorry, i deleted the comment because it was far too naive, but, for the fun, here it is again (just don't take it too seriously ...)

   // 3d pixels, think (x,y, depth)
   * * * * *
   * * t * *
   * l c * *
   * * * * *

so, the naive way would be just: n = (t-c).cross(l-c); // normal at pixel c

the bad thing about it is that it contains a bias, depending on which neighbours you choose for the cross product

berak ( 2016-01-06 10:12:54 -0600 )

@berak no problem, I am quite new to the field as well, so I am trying to figure it out. However, why are you taking only the left and top neighbor pixels? Shouldn't the right and bottom ones be included as well, or even the diagonals?

@LBerger I did not really get what you mean; maybe you could try to explain it a bit more.

Moreover, searching a bit I found this post on SO, which describes quite a nice procedure. I am trying to figure out now how to implement it.

theodore ( 2016-01-06 11:25:26 -0600 )
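(To illustrate the bias point with a sketch that is not from the thread: average the cross products of all four consistently wound neighbour pairs instead of picking a single (l, t) pair; the names follow @berak's diagram.)

// assumes using namespace cv; depth is CV_64FC1 and (x, y) is not on the border
Vec3d unbiasedNormal(const Mat& depth, int x, int y)
{
    Vec3d c(x,     y,     depth.at<double>(y, x));
    Vec3d t(x,     y - 1, depth.at<double>(y - 1, x));
    Vec3d b(x,     y + 1, depth.at<double>(y + 1, x));
    Vec3d l(x - 1, y,     depth.at<double>(y, x - 1));
    Vec3d r(x + 1, y,     depth.at<double>(y, x + 1));

    // the four pairs go around c with the same winding, so their normals
    // agree in sign and the one-sided bias largely averages out
    Vec3d sum = (l - c).cross(t - c) + (t - c).cross(r - c)
              + (r - c).cross(b - c) + (b - c).cross(l - c);
    return normalize(sum);
}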

Have you got xyz of all image points?

LBerger ( 2016-01-06 11:37:30 -0600 )

Yes, I have this information. In the actual depth image corresponding to the image above, what you see as black is NaN values (no depth info), and the gray-ish pixels representing the panther are double distance values.

theodore ( 2016-01-06 11:48:17 -0600 )

also there is this other link, but I still cannot understand it. How do I calculate the K matrix that is mentioned there?

theodore ( 2016-01-06 12:02:33 -0600 )

The K matrix is just the intrinsic parameters. You must have these parameters in the same units as your depth map. Don't you want to use a point cloud?

LBerger ( 2016-01-06 12:27:35 -0600 )

Ok, I see. I do not have the info for this matrix. My only input is the depth image, and from that I want to get the normals using the cross product of the neighbors.

theodore ( 2016-01-06 12:57:59 -0600 )

That's not possible without this matrix. Imagine a constant depth map: it's not a plane but a cylinder, because of the pinhole camera model... You have to estimate some values. I think this method is only a shortcut of my first comment.

LBerger ( 2016-01-06 13:32:42 -0600 )
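(A sketch of what LBerger proposes, under the assumption that the intrinsics can at least be estimated; the fx, fy, cx, cy values below are placeholders, not values from the thread. Back-project the depth map to a point cloud with K, then take cross products of neighbouring differences:)

// assumes #include <opencv2/opencv.hpp> and using namespace cv
// placeholder intrinsics -- these must come from calibration
const double fx = 525.0, fy = 525.0, cx = 319.5, cy = 239.5;

Mat normalsFromPointCloud(const Mat& depth) // depth: CV_64FC1, metric Z values
{
    // inverse of the pinhole projection u = fx*X/Z + cx, v = fy*Y/Z + cy
    Mat pts(depth.size(), CV_64FC3);
    for (int y = 0; y < depth.rows; ++y)
        for (int x = 0; x < depth.cols; ++x)
        {
            double Z = depth.at<double>(y, x);
            pts.at<Vec3d>(y, x) = Vec3d((x - cx) * Z / fx, (y - cy) * Z / fy, Z);
        }

    // central differences on the point cloud, then the cross product
    Mat normals(depth.size(), CV_64FC3, Scalar::all(0));
    for (int y = 1; y < depth.rows - 1; ++y)
        for (int x = 1; x < depth.cols - 1; ++x)
        {
            Vec3d dx = pts.at<Vec3d>(y, x + 1) - pts.at<Vec3d>(y, x - 1);
            Vec3d dy = pts.at<Vec3d>(y + 1, x) - pts.at<Vec3d>(y - 1, x);
            normals.at<Vec3d>(y, x) = normalize(dx.cross(dy));
        }
    return normals;
}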

2 answers


answered 2016-01-09 06:29:56 -0600 by theodore

Ok, I think I am ready to provide some solutions. So, let's say that we have the following image as input:

The image is converted to uchar values for visualization, so keep that in mind. In order to get the correct result you should load the actual matrix with the depth values. You can still apply the solutions to this image, but the output will not be as smooth.

Solution 1:

Following the instructions from the SO thread mentioned in the comments above. It is a kind of shortcut to the cross product method.

if(depth.type() != CV_32FC1)
    depth.convertTo(depth, CV_32FC1);

Mat normals(depth.size(), CV_32FC3);

// start at 1 and stop one pixel short of the border so that the
// x+-1 / y+-1 accesses stay inside the image
for(int x = 1; x < depth.rows - 1; ++x)
{
    for(int y = 1; y < depth.cols - 1; ++y)
    {
        // use float instead of double, otherwise you will not get the correct
        // result; check my updates in the original post. I have not figured
        // out yet why this happens.
        float dzdx = (depth.at<float>(x+1, y) - depth.at<float>(x-1, y)) / 2.0f;
        float dzdy = (depth.at<float>(x, y+1) - depth.at<float>(x, y-1)) / 2.0f;

        Vec3f d(-dzdx, -dzdy, 1.0f);

        Vec3f n = normalize(d);
        normals.at<Vec3f>(x, y) = n;
    }
}

imshow("normals", normals);

Output:

[image: Solution 1 output]
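(A side note that is not part of the original answer: imshow maps float values 0..1 to black..white, so the negative normal components are clipped in these screenshots. A common remap, the same one used in the second answer below, is:)

// map normal components from [-1, 1] to [0, 1] before displaying
Mat vis = normals * 0.5 + Scalar::all(0.5);
imshow("normals (remapped)", vis);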


Solution 2:

Applying the actual cross product explicitly. It is based on @berak's suggestion/comment and the solution provided above. (@berak, if you have any problem with it or you do not feel comfortable having your solution here, just tell me to remove it, no problem.)

if(depth.type() != CV_32FC1)
    depth.convertTo(depth, CV_32FC1);

Mat normals(depth.size(), CV_32FC3);

for(int x = 1; x < depth.cols - 1; ++x)
{
    for(int y = 1; y < depth.rows - 1; ++y)
    {
        // 3d pixels, think (x, y, depth)
        /* * * * * *
           * * t * *
           * l c * *
           * * * * * */

        Vec3f t(x, y-1, depth.at<float>(y-1, x));
        Vec3f l(x-1, y, depth.at<float>(y, x-1));
        Vec3f c(x, y, depth.at<float>(y, x));

        Vec3f d = (l - c).cross(t - c);

        Vec3f n = normalize(d);
        normals.at<Vec3f>(y, x) = n;
    }
}

imshow("explicitly cross_product normals", normals);

Output:

[image: Solution 2 output]

For some reason the output differs slightly; I do not fully understand why yet, but I am looking into it. (A likely cause: the cross product here amounts to one-sided differences, z(x,y) - z(x-1,y) and z(x,y) - z(x,y-1), while Solution 1 uses central differences spanning two pixels, so the two estimate the gradient over different supports.)


Solution 3:

A convolution-based approach, based on this paper, where the method is described in the supplementary material (1st paragraph).

if(depth.type() != CV_32FC1)
    depth.convertTo(depth, CV_32FC1);

// filters
Mat_<float> f1 = (Mat_<float>(3, 3) <<  1,  2,  1,
                                        0,  0,  0,
                                       -1, -2, -1) / 8;

Mat_<float> f2 = (Mat_<float>(3, 3) <<  1,  0, -1,
                                        2,  0, -2,
                                        1,  0, -1) / 8;

/* Other filters that could be used:
% f1 = [0, 0, 0;
%       0, 1, 1;
%       0,-1,-1]/4;
% 
% f2 = [0, 0, 0;
%       0, 1,-1;
%       0, 1,-1]/4;

or

% f1 = [0, 1, 0;
%       0, 0, 0;
%       0,-1, 0]/2;
% 
% f2 = [0, 0, 0;
%       1, 0, -1;
%       0, 0, 0]/2;
*/


Mat f1m, f2m;
flip(f1, f1m, 0);
flip(f2, f2m, 1);

Mat n1, n2;
filter2D(depth, n1, -1, f1m, Point(-1, -1), 0, BORDER_DEFAULT);
filter2D(depth, n2, -1, f2m, Point(-1, -1), 0, BORDER_DEFAULT);

Comments

@theodore, i got absolutely no problem with it, i just felt totally unsure about the handedness, y-axis pointing down, world coords vs. image coords, and all. happy you could use it in the end!

berak ( 2016-01-09 06:37:11 -0600 )

@berak super, thanks. Your help was indeed really helpful. Just one more question: do you know why changing from double to float (Vec3d to Vec3f) would make any difference to the pixel values? I cannot really understand that.

theodore ( 2016-01-09 06:44:57 -0600 )

Hello, I have a CV_8UC1 image in grayscale. I was trying to do the same thing but it is not working; I just see a black image. Can someone help me?

studentEEC ( 2018-02-22 08:40:48 -0600 )

@studentEEC, please do not post answers here if you only have a question or comment, thank you

berak ( 2018-02-22 09:01:09 -0600 )

Regarding the difference between the first and the second image, isn't the color channel inverted? Is that the difference?

monsieurbeilto ( 2018-09-03 22:29:14 -0600 )

^^ no, it is a different algorithm to calculate normal maps

berak ( 2018-09-03 22:39:37 -0600 )

@berak, I see. But the author of the solution mentions that the results of method 1 and method 2 should be the same: "We should be getting the same result in Solution 2 but apparently something is missing". Plus, he says the 1st is a shortcut to the cross product method, which is explicitly coded in Solution 2.

monsieurbeilto ( 2018-09-06 16:01:10 -0600 )

@berak @ALL Why do we set the z component of the vector to 1.0 in this step: Vec3f d(-dzdx, -dzdy, 1.0f)?

monsieurbeilto ( 2018-09-29 19:16:27 -0600 )

@berak Hi, theodore doesn't use the camera intrinsics here in the end. Is that right?

baichuan ( 2020-08-23 06:28:57 -0600 )
answered 2020-08-23 06:27:21 -0600 by baichuan

Here is the code (a matrix-based calculation) that I think is right:

import numpy as np

# assumes K (3x3 camera intrinsics), img1_depth (the depth image) and
# width, height are already defined

def normalization(data):
    mo_chang = np.sqrt(np.multiply(data[:,:,0], data[:,:,0])
                       + np.multiply(data[:,:,1], data[:,:,1])
                       + np.multiply(data[:,:,2], data[:,:,2]))
    mo_chang = np.dstack((mo_chang, mo_chang, mo_chang))
    return data / mo_chang

x, y = np.meshgrid(np.arange(0, width), np.arange(0, height))
x = x.reshape([-1])
y = y.reshape([-1])
xyz = np.vstack((x, y, np.ones_like(x)))

# back-project the pixels to 3D camera coordinates
pts_3d = np.dot(np.linalg.inv(K), xyz * img1_depth.reshape([-1]))
pts_3d_world = pts_3d.reshape((3, height, width))

# differences to the right and bottom neighbours
f = pts_3d_world[:, 1:height-1, 2:width] - pts_3d_world[:, 1:height-1, 1:width-1]
t = pts_3d_world[:, 2:height, 1:width-1] - pts_3d_world[:, 1:height-1, 1:width-1]

normal_map = np.cross(f, t, axisa=0, axisb=0)  # cross the two neighbour differences
normal_map = normalization(normal_map)
normal_map = normal_map * 0.5 + 0.5  # map [-1, 1] to [0, 1] for display
alpha = np.full((height-2, width-2, 1), 1., dtype="float32")
normal_map = np.concatenate((normal_map, alpha), axis=2)
1. We should use the camera intrinsics, named K. I think the vectors f and t are based on 3D points in camera coordinates.

2. For the normal vector, (-1, -1, 100) and (255, 255, 100) are the same color in 8-bit images, but they are totally different normals; so we should map the normal values to (0, 1) by normal_map = normal_map*0.5 + 0.5.

Feel free to reach out; my email is [email protected]


Stats

5 followers

Asked: 2016-01-06 09:07:32 -0600

Seen: 18,604 times

Last updated: Jan 09 '16