Ask Your Question

How to construct a 3d face from 2d images in openCV?

asked 2017-07-26 08:53:51 -0500

linengmiao

updated 2017-07-26 09:54:47 -0500


I'd like to know what openCV offers nowadays to construct a 3d face starting from 2d images. I found a similar post on the openCV forum, but it is already 4 years old.

Does openCV now offer new solutions for this? If not, what other, newer possibilities are there nowadays?

The end goal would be to be able to do feature extraction for every face, in order to implement face recognition.


1 answer


answered 2017-07-26 10:01:25 -0500

updated 2017-07-26 10:03:10 -0500

You can use OpenCV to detect faces, but you'll have to look at something else if you want to build a 3d model of the face from the 2d data.

I can tell you that all you see from 2D-3D face rendering (such as snapchat and other apps that draw 3D items on your face) are just using "tricks", such as relative position of the eyes and mouth and other stuff to calculate an approximation of the face orientation. Then, there is a fake 3D template model in which the pixels are rendered according to these orientations, making it look like the application is able to render 3D face from 2D.
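As a toy illustration of that "trick": once you have the eye and nose positions, a very rough head-yaw estimate falls out of their horizontal symmetry. The landmark coordinates and the `approx_yaw` helper below are hypothetical, just to show the idea.

```python
# Approximate head yaw from the horizontal symmetry of the eyes
# around the nose. Coordinates are hypothetical pixel positions.

def approx_yaw(left_eye, right_eye, nose):
    """Rough yaw in [-1, 1]: 0 means frontal; the sign tells
    which way the nose has drifted off the eye midpoint."""
    eye_span = right_eye[0] - left_eye[0]
    midpoint = (left_eye[0] + right_eye[0]) / 2.0
    # Nose offset from the eye midpoint, relative to half the eye span.
    return (nose[0] - midpoint) / (eye_span / 2.0)

# Frontal face: nose centred between the eyes.
print(approx_yaw((100, 120), (200, 120), (150, 160)))  # 0.0
# Turned head: nose shifted toward the right eye.
print(approx_yaw((100, 120), (200, 120), (175, 160)))  # 0.5
```

Real apps use more landmarks and a fitted 3D template, but the principle is this kind of relative geometry, not true reconstruction.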

There isn't actually any way to build 3D models from solely 2D data; it's mathematically impossible (unless there are a lot of assumptions you can make that reduce the problem a lot).



In that case, what about the so-called 3D face recognition solutions? I assume there must be some real 3D information contained in the descriptors of the different keypoints, right? Besides that, I assume it must in some way be possible to use voxels, which are 3D pixels. What do you think?

linengmiao (2017-07-26 10:25:59 -0500)

Which 3D face recognition solutions? Is there a link?

It is possible to use 3D pixels if you have a 3D sensor. If you have a 2D sensor there is no way to magically get 3D data. Feature extraction will further reduce the dimensionality of the data, not increase it.

Just think about it. Imagine a frontal face picture. How could it ever be possible to determine, for example, the length of the nose?

Pedro Batista (2017-07-26 10:34:56 -0500)

Which 3D face recognition solutions? Is there a link?

I am just speaking in general. I suppose the title of that Wikipedia article wouldn't contain the word 3D if there weren't really some 3D info/data being used.

How could it ever be possible to determine, for example, the length of the nose?

By combining multiple images: profile, frontal, etc. You may not have an absolute scale, but a relative scale is good enough for recognition.

linengmiao (2017-07-26 11:03:28 -0500)

As I said in the answer, OpenCV will let you detect faces, eyes, mouths, noses, etc. in 2D data.

If having loads of images of a face at different angles is an option for you, why don't you set up a recognition system that computes and matches features on all of these 2D images, instead of having to reconstruct a 3D model to then perform recognition? Honestly, that seems like a needless step.

But, my answer still stands, OpenCV does not provide standard tools for this, you'll have to go through the literature and implement it yourself.

Pedro Batista (2017-07-26 11:33:30 -0500)

According to the wiki page, going via 3D is much more robust for recognition than simply using 2D images. It states: "It has been shown that 3D face recognition methods can achieve significantly higher accuracy than their 2D counterparts, rivaling fingerprint recognition." Feel free to correct me or the Wikipedia article; quite a few aspects there are new to me.

linengmiao (2017-07-26 11:37:05 -0500)

That quote is comparing recognition on a single 2D image vs. a single 3D "image".

If you can use loads of images to build a 3D model with software math calculations, and then use it to perform recognition, in the end it would be the same thing as performing recognition right away on all of those 2D images.

Pedro Batista (2017-07-26 11:41:09 -0500)

So you'd say stitching all the 2D images together (like for a panorama) and afterwards performing recognition using machine learning would be robust enough? I was thinking about a 3D model because that way you can, e.g., get an idea of one's nose length as well, while when just stitching the images together into a flattened face you are, I think, unable to know that.

Some people seem to go really far for a robust face recognition (or something similar) system. I guess the best way, after having stitched the images together using matched keypoints, would be to create my own machine-learning-based classifier that picks good/strong features (i.e. ones that hold up no matter whether he wears sunglasses, a beard, etc.) to do recognition?

linengmiao (2017-07-26 11:56:45 -0500)

@linengmiao, look carefully at that GitHub source. They do it using a 3D sensor (Microsoft Kinect), precisely as I said.

As to my previous idea, there is no need for a panorama. Just read carefully:

If you have a set of 2D images which can be transformed into a single 3D model, and then you use that 3D model's features to do recognition, wouldn't it be exactly the same as just extracting features from the set of 2D images right away and doing recognition? Why bother reconstructing?

If you use all these 2D images to extract relevant features for recognition, your machine learning method will indirectly include nose length and any other features that are actually relevant for differentiating one sample from the others.

Pedro Batista (2017-07-26 12:39:29 -0500)

Indeed, you're making a fair point. If "panorama" is not the way to go, how would you combine those different images to get "global" descriptors? I mean by that: one could calculate the descriptors for two profile pictures and afterwards, separately, the descriptors for a frontal picture. The subject would then have to show his face to the camera under multiple angles again in order to be fully recognized, and I don't see where feature matching comes into play. What I initially had in mind: when signing up, we take 3 pictures of the subject (2 profiles and one frontal) and create a 3D model with those. Then, when logging in, the user would just have to face the camera (no need to face it completely straight, as we have, so to speak, a +/- 360° view of his head in the database) to be recognized.

linengmiao (2017-07-26 12:59:03 -0500)

Recognition means that you have a database somewhere with the faces you want to recognize stored. So, at run time, your application will always compute features and compare them to this database in order to recognize whose face it is.

First you compute features from your set of 2D images, and then you compare those features against all the subjects in your database and figure out which subject returns the highest match score.
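That matching step could be sketched as follows. The feature vectors and names here are made up; in a real system they would come from a face descriptor (LBPH, deep embeddings, ...), and the threshold would be tuned on real data.

```python
import numpy as np

# Enrolled database: one (hypothetical) feature vector per subject.
database = {
    "alice": np.array([0.9, 0.1, 0.3]),
    "bob":   np.array([0.2, 0.8, 0.5]),
}

def cosine_score(a, b):
    """Cosine similarity: 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recognize(query, db, threshold=0.8):
    """Return the best-matching subject, or None if no score clears
    the acceptance threshold (an unknown face)."""
    scores = {name: cosine_score(query, feat) for name, feat in db.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

query = np.array([0.85, 0.15, 0.25])
print(recognize(query, database))  # alice
```

The threshold is what separates "recognized" from "unknown"; without it the system would always report the nearest subject, even for a complete stranger.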

As to your system design, I know nothing about it so I don't know how it should be done.

Pedro Batista (2017-07-27 04:29:25 -0500)
