Recognition speed? (real-time app)

asked 2019-02-13 13:04:51 -0600

Hi! I want to write an Android app that:

allows the user to upload pics into a collection

allows the user to write a note for each pic (their name, for example)

displays a person's note when the camera is pointed at them (if their pic is in the collection)

allows the user to add another user to a friend list

also displays notes of users that are in a friend's collection when the camera is pointed at them

My questions are:

  1. I know that I can get a 128-dimensional face descriptor using a neural network, but can I get a descriptor for a face that didn't take part in the training?

  2. How fast is the process of face descriptor calculation? Is an average smartphone able to handle this operation? For example, in my case the descriptor needs to be calculated from the camera feed and then compared to ~100 descriptors of the user's friends. As far as I understand, I can use face tracking (as applied for masks in apps like MSQRD and Instagram) to lock onto a face in the camera view, to avoid calculating and comparing descriptors on every frame.


Comments

"but can I get a descriptor for a face that didn't take part in the training? " -- wat ? explain !

2: Feature extraction is the expensive part, but you only have to do that once per image / person. Then you can cache it (along with your ~100 others). Comparison (dot product or L2 norm) is lightning fast, even for thousands of those, don't worry.

berak (2019-02-13 13:23:29 -0600)
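The cache-then-compare idea above can be sketched with NumPy. The 1000 random 128-D vectors here are stand-ins for whatever the embedding network outputs; the point is that one query vs. thousands of cached descriptors is a single cheap vectorized operation:

```python
import numpy as np

# Cache of descriptors for enrolled persons (random 128-D stand-ins here;
# in practice each comes from the embedding network, computed once per
# image/person and stored alongside the person's note).
rng = np.random.default_rng(0)
cache = rng.normal(size=(1000, 128))

def l2_distances(query, cache):
    """Squared L2 distance from one query descriptor to every cached one,
    computed in a single vectorized pass over the whole cache."""
    diff = cache - query
    return np.einsum('ij,ij->i', diff, diff)

# A query descriptor: a slightly noisy copy of entry 42, as if the same
# face were seen again under different conditions.
query = cache[42] + rng.normal(scale=0.01, size=128)
best = int(np.argmin(l2_distances(query, cache)))  # -> 42
```

The expensive neural-network forward pass happens only when a descriptor is first computed; everything after that is plain array arithmetic.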

user can add another user to friendlist

Hmmm, I'm not sure this will work without a central (server-based) database, are you?

berak (2019-02-13 13:39:18 -0600)

Feature extraction is the expensive part, but you only have to do that once per image / person. Then you can cache it (along with your ~100 others). Comparison (dot product or L2 norm) is lightning fast, even for thousands of those, don't worry.

Thanks!

Hmmm, I'm not sure this will work without a central (server-based) database, are you?

I plan to compute descriptors on the client side and store them on the server. Then, after launching the Android application, the user will receive from the server the descriptors of the persons in their friends' collections. It is assumed that all clients and the server will use the same neural network.

Annie99 (2019-02-13 15:26:23 -0600)
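Since a descriptor is just a fixed-length vector of floats, the client-to-server exchange described above can be as simple as serializing it next to its note. A minimal sketch, assuming JSON as the wire format (the field names are made up for illustration):

```python
import json

# Client side: a descriptor computed locally by the embedding network,
# plus the note the user attached to the picture.
record = {"note": "Annie", "descriptor": [0.12, 0.88, 0.21, 0.05]}
payload = json.dumps(record)

# Server side: store the payload as-is; later, send friends' records back
# to the client, which deserializes them for local comparison.
restored = json.loads(payload)
```

Because every client and the server run the same network, the stored vectors stay directly comparable no matter which device produced them.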

"but can I get a descriptor for a face that didn't take part in the training? " -- wat ? explain !

I know that neural networks are trained on photographs of people's faces, but is it possible to recognize a face if it was not among the photographs the neural network was trained on? PS: It seems I have already figured out myself that this is possible, so the question disappears, I apologize :D

Annie99 (2019-02-13 15:28:09 -0600)

^^ ah, I see what you mean. Maybe you should try to read the FaceNet paper (again?)

In the end, it works like this: NONE of the persons in YOUR db were part of the original (huge) training set. The whole pretrained network is used as a "fixed" feature extractor for your own set of persons. But since it has already seen millions of faces, it generalizes nicely and extracts the most discriminative features from your images.

berak (2019-02-13 15:44:16 -0600)
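To make the fixed-feature-extractor point concrete: recognizing people the network never trained on reduces to nearest-neighbor search over your own enrolled descriptors, with a distance threshold deciding "unknown". A minimal sketch, where the tiny 4-D descriptors and the 0.6 threshold are illustrative assumptions (real embeddings are 128-D, and the threshold must be tuned for the specific network):

```python
import math

# Descriptors and notes for persons the *pretrained* network never saw;
# the network only supplies the embedding function that produced them.
enrolled = {
    "Alice": [0.1, 0.9, 0.2, 0.0],
    "Bob":   [0.8, 0.1, 0.7, 0.3],
}

def match(query, enrolled, threshold=0.6):
    """Return the note of the nearest enrolled person, or None if even
    the nearest descriptor is farther than `threshold` (unknown face)."""
    best_name, best_dist = None, float("inf")
    for name, desc in enrolled.items():
        dist = math.sqrt(sum((q - d) ** 2 for q, d in zip(query, desc)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

print(match([0.12, 0.88, 0.21, 0.05], enrolled))  # close to Alice's descriptor
print(match([9.0, 9.0, 9.0, 9.0], enrolled))      # far from everyone -> None
```

Enrolling a new person is just adding one more descriptor to the dict; nothing about the network needs retraining.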