Bag of Words BOWTrainer clustering for binary descriptors
Context: As you might be aware, the current OpenCV BOW implementation only works as expected with float descriptors (like SIFT and SURF), since its clustering method is k-means on Euclidean (L2) distances. The much faster binary descriptors (like ORB or BRIEF), favored in mobile and real-time contexts, cannot be used with it: any notion of "distance" between binary descriptors has to be based on Hamming distance, with some form of voting scheme to determine the cluster centroids. No such clustering method exists in the versions of OpenCV I know, 2.4 and 3.1.
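To make the two ingredients mentioned above concrete, here is a minimal sketch in plain C++ (no OpenCV) of a Hamming distance and a per-bit majority vote for a cluster centroid. The names `Descriptor`, `hammingDistance` and `majorityCentroid` are made up for illustration and are not part of OpenCV, and breaking ties toward 0 is an arbitrary assumption.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type: one binary descriptor stored as raw bytes
// (e.g. 32 bytes for a 256-bit ORB descriptor).
using Descriptor = std::vector<std::uint8_t>;

// Hamming distance: number of bit positions in which a and b differ.
// Assumes both descriptors have the same length.
int hammingDistance(const Descriptor& a, const Descriptor& b) {
    int dist = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const unsigned diff = static_cast<unsigned>(a[i] ^ b[i]);
        dist += static_cast<int>(std::bitset<8>(diff).count());
    }
    return dist;
}

// Majority-vote centroid: each bit of the result is set if it is set in
// more than half of the members. Ties resolve to 0 (an arbitrary choice).
// Assumes all members have the same length.
Descriptor majorityCentroid(const std::vector<Descriptor>& members) {
    if (members.empty()) return {};
    const std::size_t bytes = members.front().size();

    // Count, for every bit position, how many members have that bit set.
    std::vector<std::size_t> ones(bytes * 8, 0);
    for (const Descriptor& d : members) {
        for (std::size_t i = 0; i < bytes; ++i) {
            for (int bit = 0; bit < 8; ++bit) {
                if (d[i] & (1u << bit)) ++ones[i * 8 + bit];
            }
        }
    }

    // Set each centroid bit that won a strict majority.
    Descriptor centroid(bytes, 0);
    for (std::size_t i = 0; i < bytes; ++i) {
        for (int bit = 0; bit < 8; ++bit) {
            if (2 * ones[i * 8 + bit] > members.size()) {
                centroid[i] |= static_cast<std::uint8_t>(1u << bit);
            }
        }
    }
    return centroid;
}
```

A k-means-style loop would alternate between assigning each descriptor to its nearest centroid under `hammingDistance` and recomputing each centroid with `majorityCentroid`; that loop is left out here.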
The sometimes (wrongly) suggested solution of converting binary descriptors from CV_8U to CV_32F to plug them into the current BOW clustering method, though attractive as a quick-and-dirty fix, is like trying to fit a square peg into a round hole: it makes no mathematical sense and is garbage in, garbage out.
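A small sketch of why the conversion is misleading, again in plain C++ with made-up helper names (`hammingBits`, `l2OnByteValues`). Converting CV_8U to CV_32F turns each byte into its numeric value, so the Euclidean distance compares byte magnitudes rather than differing bits:

```cpp
#include <bitset>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of differing bits between two equal-length byte strings.
int hammingBits(const std::vector<std::uint8_t>& a,
                const std::vector<std::uint8_t>& b) {
    int dist = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const unsigned diff = static_cast<unsigned>(a[i] ^ b[i]);
        dist += static_cast<int>(std::bitset<8>(diff).count());
    }
    return dist;
}

// Euclidean distance after treating every byte as a plain number,
// which is what a byte-to-float conversion followed by L2 amounts to.
double l2OnByteValues(const std::vector<std::uint8_t>& a,
                      const std::vector<std::uint8_t>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return std::sqrt(sum);
}
```

With single-byte inputs, 0x00 and 0x80 differ in one bit but are 128 apart as numbers, while 0x7F and 0x80 differ in all eight bits but are only 1 apart as numbers. The two measures rank the pairs in opposite order, and a float mean of such values (for example 127.5 for 0x00 and 0xFF) is not a bit pattern at all.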
Relevant links:
http://answers.opencv.org/question/17...
http://answers.opencv.org/question/24...
http://stackoverflow.com/questions/28...
http://imagelab.ing.unimore.it/imagel...
https://github.com/opencv/opencv/issu...
Question: Is a BOW implementation compatible with binary descriptors being worked on? If none exists yet in some experimental / beta version of the latest OpenCV, is there any code example or tutorial showing how to extend / inherit / modify the current BOW so that I can build a compatible version myself? The links above discuss the logic and algorithm of such a theoretical binary BOW, but I am basically a C++ and OpenCV newbie; I wouldn't even know where to begin implementing such a new BOW myself.