Some suggestions for limiting the enormous search space of your image, in which every window has to pass the classifier:
- Use the lowest resolution your application allows, 640x328 in your case.
- As you said, downsample the image to half size in both dimensions, which leaves only a quarter of the pixels to scan.
- Use a detection range: just as there is a minSize, you can also set the maxSize attribute. Take about 10 shots, measure the average button size in your setup, and then use a range from 0.85 x average size to 1.15 x average size.
- If you can, first segment the image and remove all areas that you are certain do not contain the object. Then create a binary mask and only run the detector on windows inside the smallest bounding box around the remaining candidate blobs.
These tactics will drastically reduce your search space and give you a speed advantage. However, reaching 20-30 fps will be hard. I think 15 fps should be possible, which is almost real time.
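To make the size range and the mask idea concrete, here is a rough Python sketch. The helper names (`size_range`, `mask_bounding_box`, `detect_in_roi`) are made up for illustration, as are the `scaleFactor=1.1` and `minNeighbors=3` values. It assumes the button is roughly square, that `gray` and `mask` are NumPy arrays of the same shape, and that `cascade` is a loaded `cv2.CascadeClassifier`. Treat it as a starting point, not a tuned implementation.

```python
def size_range(measured_sizes, tolerance=0.15):
    """Return (min_side, max_side) in pixels around the average measured size."""
    average = sum(measured_sizes) / len(measured_sizes)
    return round(average * (1 - tolerance)), round(average * (1 + tolerance))


def mask_bounding_box(mask):
    """Return (x, y, w, h) of the smallest box enclosing all non-zero
    mask pixels, or None if the mask is empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    x0, x1 = cols[0], cols[-1]
    y0, y1 = rows[0], rows[-1]
    return x0, y0, x1 - x0 + 1, y1 - y0 + 1


def detect_in_roi(cascade, gray, mask, measured_sizes):
    """Run the cascade only inside the mask's bounding box, with a size range.

    Assumption: `cascade` behaves like cv2.CascadeClassifier and `gray`
    supports NumPy-style 2D slicing.
    """
    box = mask_bounding_box(mask)
    if box is None:
        return []  # segmentation found no candidate area
    x, y, w, h = box
    lo, hi = size_range(measured_sizes)
    roi = gray[y:y + h, x:x + w]
    hits = cascade.detectMultiScale(
        roi, scaleFactor=1.1, minNeighbors=3,
        minSize=(lo, lo), maxSize=(hi, hi),
    )
    # Shift detections from ROI coordinates back to full-image coordinates.
    return [(hx + x, hy + y, hw, hh) for (hx, hy, hw, hh) in hits]
```

If you also downsample the frame, measure the button sizes on the downsampled image (or halve them), otherwise the minSize/maxSize range will no longer match.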