If you are new to OpenCV (and probably to computer vision as well), then tackling such a problem is, I would say, optimistic.
The algorithms I have added are nowhere near suited to datasets of 100,000 images. If you run the Eigenfaces or Fisherfaces algorithm, you simply won't be able to allocate that much memory. Algorithms like Local Binary Patterns don't need to allocate that much memory, but finding the best match is going to be very time-consuming, because it's a nearest-neighbor search over the entire dataset.
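To make the two problems concrete, here is a rough back-of-the-envelope sketch in Python. The image size, histogram length and random data are my own illustrative assumptions, not anything from OpenCV itself; the point is only the order of magnitude of the memory and of the per-query work:

    import numpy as np

    # Assumed numbers for illustration: 100,000 grayscale images at a modest 100x100 pixels.
    num_images = 100_000
    pixels_per_image = 100 * 100

    # Eigenfaces/Fisherfaces flatten every training image into one row of a single
    # matrix of doubles before doing PCA/LDA, so the training data alone needs roughly:
    bytes_needed = num_images * pixels_per_image * 8  # 8 bytes per double
    print(f"~{bytes_needed / 1024**3:.1f} GiB just for the data matrix")  # ~7.5 GiB

    # LBP-style methods avoid that matrix, but prediction is a brute-force
    # nearest-neighbor scan: one distance computation per stored histogram, per query.
    def chi_square(h1, h2, eps=1e-10):
        return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

    histograms = np.random.rand(num_images, 256)  # stand-in for the stored LBP histograms
    query = np.random.rand(256)
    distances = [chi_square(query, h) for h in histograms]  # O(N) work for every query
    best_match = int(np.argmin(distances))

So even before accuracy enters the picture, raw memory and per-query cost already rule out a naive use of these implementations at that scale.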
Coming up with a solution that scales is far from trivial. While I can't offer source code or algorithm implementations, I think there are interesting papers available. Among them is one from the face.com team (a company quite successful in this area):
- Yaniv Taigman and Lior Wolf, "Leveraging Billions of Faces to Overcome Performance Barriers in Unconstrained Face Recognition" (available online on arxiv.org)
As for similarity measures, I suggest looking into algorithms like One-Shot Similarity Kernels, which I think still provide state-of-the-art results. There's a great paper by Lior Wolf, Tal Hassner and Yaniv Taigman (face.com founder/CTO):
- Lior Wolf, Tal Hassner and Yaniv Taigman, "Effective Unconstrained Face Recognition by Combining Multiple Descriptors and Learned Background Statistics", IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 33, Issue 10, October 2011 (PDF available online)
You can find some MATLAB code for One-Shot Similarity Kernels on the project page; a rough sketch of the idea follows below.
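For a feel of what the LDA-based One-Shot Similarity score looks like, here is a minimal numpy sketch. It assumes the common formulation where, for each of the two samples, a linear discriminant is "trained" from that single sample against a fixed background (negative) set A containing faces of neither person, and the two directional scores are averaged. The function name, the regularization and the exact normalization are my assumptions; the paper and the authors' MATLAB code are the authoritative reference:

    import numpy as np

    def one_shot_similarity(x_i, x_j, A, reg=1e-3):
        """Rough LDA-based One-Shot Similarity between feature vectors x_i and x_j.

        A is an (n_background, d) matrix of background descriptors; x_i, x_j are (d,) vectors.
        """
        mu_A = A.mean(axis=0)
        # Scatter of the background set, regularized so it is invertible.
        S = np.cov(A, rowvar=False) + reg * np.eye(A.shape[1])
        S_inv = np.linalg.inv(S)

        def score(probe, exemplar):
            # LDA model learned from a single positive exemplar vs. the background set,
            # then evaluated on the probe.
            w = S_inv @ (exemplar - mu_A)
            threshold = (exemplar + mu_A) / 2.0
            return w @ (probe - threshold)

        # Symmetrize: score x_i against the model of x_j and vice versa.
        return 0.5 * (score(x_i, x_j) + score(x_j, x_i))

The appeal of this construction is that the background set does the heavy lifting, so you can compare two faces you have never seen before without training a per-person classifier on large labeled sets.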
So do I think such a project is feasible if you are working alone and don't have a (very strong) background in computer vision? I know that such a project requires solving a lot of tough problems in order to create a robust, efficient (and useful) face recognition system. In my opinion, that's far too many tough problems for one person.