question / knowledge check on face identity detection

asked 2018-06-14 07:27:52 -0500

holger

updated 2018-06-14 07:31:06 -0500


I was kindly pointed to an implementation that can do face identity detection. I want to check whether my current understanding is correct, as I don't have anyone knowledgeable to ask.

  • The first part is to find the faces and their bounding boxes. Once a face is detected, it is extracted (into a Mat).

  • The second part is to feed each extracted face Mat into another network, which produces a vector whose values represent the identity of that face.

Is my current understanding correct?

  • Does it make sense to combine these two steps into one model, or should they be separate?
  • Can the second model work for all types of objects, or does it have to be specific to the object you want to classify (I guess so)?

Thank you again for your help. That should be enough for today; I don't want to be too greedy.

Greetings, Holger


1 answer


answered 2018-06-14 07:49:06 -0500

dkurt


Does it make sense to combine these two steps into one model, or should they be separate?

An object detection network does not output images; it outputs the coordinates of bounding boxes. The second network, however, takes cropped face images as input, so the two steps should stay separate.
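The hand-off between the two networks can be sketched as follows. This is only an illustration in plain Python, assuming the image is a nested list of pixel rows and the detector returns boxes as (x, y, width, height); `crop_face` is a hypothetical helper, not an OpenCV function.

```python
def crop_face(image, box):
    """Cut the region described by box = (x, y, w, h) out of image.

    image is a list of pixel rows. The box is clamped to the image bounds
    because a detector may return coordinates slightly outside the frame.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x, y, w, h = box
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return [row[x0:x1] for row in image[y0:y1]]


# 4x5 dummy "image" whose pixel values encode their position (row * 10 + col).
image = [[r * 10 + c for c in range(5)] for r in range(4)]
face = crop_face(image, (1, 1, 3, 2))
print(face)
```

With the OpenCV Python bindings, where an image is a NumPy array, the same crop is a slice such as `frame[y0:y1, x0:x1]`; the cropped result is what gets passed to the second network.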

Can the second model work for all types of objects, or does it have to be specific to the object you want to classify (I guess so)?

You may try it, but I don't think it will work: even comparing images of the same face taken from different angles is hard.



OK, got it, don't mix things. Thank you! In the method readNetFromTorch() there is one comment I don't understand:

"Also some equivalents of these classes from cunn, cudnn, and fbcunn may be successfully imported."

Does this mean CUDA and cuDNN (I have both in my path) will be used automatically? If not, I can still write a wrapper for Torch, like I did for YOLO.

holger ( 2018-06-14 07:58:29 -0500 )

@holger, it's just a feature of the Torch framework. You can serialize a model with CudaTensors or with FloatTensors (that is, in GPU or CPU mode). However, in Torch you cannot load a model serialized in GPU mode when CUDA is not installed (that is, without an Nvidia GPU). Fortunately, there is no difference between them, and OpenCV can import Torch models even if their owner forgot to switch them to CPU mode.

dkurt ( 2018-06-14 08:11:21 -0500 )

Got it, thank you for the clarification.

So the CUDA / cuDNN optimization is in the model? I am no expert, but this sounds like a bad design decision (maybe it's really fast that way). It's a runtime aspect, IMHO, and should be transparent.

Sadly, proprietary CUDA is a must-have for me, as it makes a big difference in performance. I need to be fast :-)

Hello Torch wrapper, here I come. I find OpenCV really awesome. It's a great source of reference implementations, way better than the examples that come with the CNNs.

holger ( 2018-06-14 08:31:57 -0500 )

OK, I read about the triplet loss.

I also read about siamese networks.

@dkurt, @berak Now I understand the distance computation / embedding you were talking about. I currently have the stupid idea to drop all this complexity (triplet mining and loss are not that easy for me) and just make a small model that overfits on persons (via transfer learning). Does this make any sense, and could it work?

Edit: Overfitting here would ultimately mean that I can identify people only under the same conditions (weather, environment). Bad, right?
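For reference, the distance computation on embeddings mentioned above might look roughly like this. It is a minimal sketch, not taken from any particular implementation: the 3-value embeddings, the names in `gallery`, and the threshold of 0.6 are all made up for illustration, and `identify` is a hypothetical helper.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query, gallery, threshold):
    """Return the name of the closest known embedding, or None if it is too far."""
    best_name, best_dist = None, float("inf")
    for name, embedding in gallery.items():
        dist = euclidean(query, embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None


# Made-up 3-dimensional embeddings; real face networks output far more values.
gallery = {"alice": [0.1, 0.9, 0.2], "bob": [0.8, 0.1, 0.5]}
print(identify([0.15, 0.85, 0.25], gallery, threshold=0.6))
print(identify([5.0, 5.0, 5.0], gallery, threshold=0.6))
```

The threshold is what lets the system answer "unknown person" instead of always picking the nearest known face; a suitable value depends on the embedding network and has to be tuned.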

Greetings + Thx again, Holg

holger ( 2018-06-15 04:23:46 -0500 )

Answering my own question (writing helps me think): no, intentionally overfitting is a bad idea here. I will try out two things:

  • Naive approach: try transfer learning and see if it is possible to introduce a new object class for one specific object. One-shot learning is another term you can search for.
  • Serious approach: try to train a siamese network with a triplet loss function for the specific object class.

Thank you for helping me come to that conclusion.

holger ( 2018-06-15 04:33:52 -0500 )
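As a rough illustration of what the triplet loss in the second approach computes: for an anchor, a positive (same identity) and a negative (different identity) embedding, the loss is max(0, d(anchor, positive) - d(anchor, negative) + margin). The sketch below assumes squared Euclidean distance and an illustrative margin of 0.2; the function names are hypothetical and no training loop or triplet mining is shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther from the anchor than the positive by at least margin."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


anchor = [0.0, 0.0]
positive = [0.1, 0.0]   # same identity, close to the anchor
negative = [1.0, 0.0]   # different identity, far from the anchor

print(triplet_loss(anchor, positive, negative))  # well separated: no loss
print(triplet_loss(anchor, negative, positive))  # roles swapped: loss is positive
```

Triplets that already give zero loss contribute nothing to training, which is why selecting informative triplets (triplet mining) matters.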

Some people even use the feature vector (the output of FaceNet) as input to train a model instead of computing distances directly. Maybe a bit overkill, but it sounds consistent to me.

holger ( 2018-06-18 15:58:31 -0500 )
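A toy version of that idea, training a classifier on embeddings rather than thresholding distances, could look like the sketch below. It assumes made-up 2-value embeddings and uses a simple multi-class perceptron purely as a stand-in; any other classifier could take its place, and `train_perceptron` is a hypothetical helper.

```python
def train_perceptron(samples, labels, classes, epochs=20):
    """Train a multi-class perceptron and return a predict(x) function."""
    dim = len(samples[0])
    # One weight vector per class; the last entry is the bias.
    weights = {c: [0.0] * (dim + 1) for c in classes}

    def score(c, x):
        w = weights[c]
        return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

    def predict(x):
        return max(classes, key=lambda c: score(c, x))

    for _ in range(epochs):
        for x, y in zip(samples, labels):
            guess = predict(x)
            if guess != y:
                # Pull the correct class toward x, push the wrong guess away.
                for i, xi in enumerate(x):
                    weights[y][i] += xi
                    weights[guess][i] -= xi
                weights[y][-1] += 1.0
                weights[guess][-1] -= 1.0
    return predict


# Made-up embeddings for two people (two samples each).
samples = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
labels = ["alice", "alice", "bob", "bob"]
predict = train_perceptron(samples, labels, classes=["alice", "bob"])

print(predict([0.15, 0.85]))
print(predict([0.85, 0.15]))
```

Unlike the distance threshold, a classifier like this always answers with one of the known classes, so it needs an extra rejection step if unknown faces can appear.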
