Question about Bag of Words, detectors, and such
Hello all,
I am looking to build a bag-of-words type setup where I compare a picture I have taken with items in a database.
An example of what I am doing is taking a picture of a bookshelf and identifying the books on it. So one picture could contain 50 "vocabulary" items.
I am curious about which keypoint detectors, feature descriptors, and matchers I will need.
It seems there are so many choices and I don't know which would be better for what.
I would like to use something other than SURF or SIFT, because I hear you need a license for them and that getting one costs a good bit of money.
I have heard good things about FREAK, BRISK, and ORB, but those are only descriptors, right? Would I still need a keypoint detector and a matcher? (I thought I also heard that some descriptors are also detectors, or is that wrong?)
I think one of the more important things would be scale invariance, since the picture of a book in my database might not be the same size as that book appears in the picture I take of the bookshelf.
I don't think that rotation is that big a deal.
I'm not sure what else I should ask about these, but if anyone has any input to help me on my path, I would greatly appreciate it.
As for BoW itself, I hear you have a vocabulary built from keypoint descriptors, you match the descriptors found in the image against that vocabulary, and then you compare the resulting histograms?
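To check that I have the idea right, here is a toy sketch of how I picture that step. It is plain Python with made-up 2-D "descriptors"; the vocabulary, the data, and the function names (`nearest_word`, `bow_histogram`, `histogram_intersection`) are all just my assumptions for illustration, not anything from a real library. As far as I understand, in a real setup the vocabulary would come from clustering real descriptors (k-means, for example), and binary descriptors would be compared with Hamming distance rather than the squared Euclidean distance used here.

```python
# Toy bag-of-words sketch. All data and names are hypothetical.

def sq_dist(a, b):
    # Squared Euclidean distance between two descriptors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_word(descriptor, vocabulary):
    # Index of the vocabulary word closest to this descriptor.
    return min(range(len(vocabulary)),
               key=lambda i: sq_dist(descriptor, vocabulary[i]))

def bow_histogram(descriptors, vocabulary):
    # Count how often each visual word occurs, then normalize to sum to 1.
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]

def histogram_intersection(h1, h2):
    # Similarity in [0, 1] for normalized histograms; 1 means identical.
    return sum(min(a, b) for a, b in zip(h1, h2))

# Made-up vocabulary of three "visual words".
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Made-up descriptors for two database items and one query photo.
database = {
    "book_a": [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.1)],
    "book_b": [(0.0, 0.9), (0.1, 1.1), (0.0, 1.0), (0.9, 0.0)],
}
query = [(0.05, 0.05), (1.0, 0.05), (0.95, 0.0), (-0.1, 0.0)]

query_hist = bow_histogram(query, vocabulary)
scores = {
    name: histogram_intersection(query_hist, bow_histogram(desc, vocabulary))
    for name, desc in database.items()
}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```

If that is roughly the right picture, then the choice of detector and descriptor only changes where the descriptors come from, and the histogram comparison stays the same.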
I also believe I heard something about training classifiers. Why exactly would we need to do that? Is it to identify the items within the whole picture, like telling a bottle from a box?
I think that's all, thanks again to anyone who can help,
~KZ