OpenCV in use cases requiring millisecond-level precision

asked 2017-08-16 06:37:06 -0500

Jesper

This is not a very technical question per se, but rather one regarding the limitations of OpenCV as a framework. I'm currently looking into designing a bot which can play rhythm games (think Guitar Hero and games with similar mechanics), utilizing computer vision to determine when inputs should be made.

My current design involves using a multi-tracker to track "notes" as objects as they scroll down the lane, and then pressing the correct button when they reach the lower area of the lane. I'd like to adapt this to use machine learning once I have the computer vision part done, but that's in the future.

For the first game I'm planning on using this bot with, capturing the entire game window isn't necessary. I figure that capturing just the note lane(s) is a better option, since a smaller area should lead to faster processing (288x482 px in this case). The notes come in 3 different colors, don't change shape as they scroll down, and scroll at different speeds depending on the song. The game itself runs at 60 frames per second, so being able to process images at that rate seems necessary.
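For a rough sense of scale, the figures quoted above work out as follows. This is only arithmetic on the numbers in this post (288x482 px, 60 fps); the variable names are purely illustrative.

```python
# Back-of-the-envelope numbers for the capture region described above.
LANE_WIDTH_PX = 288
LANE_HEIGHT_PX = 482
GAME_FPS = 60

# How much data has to be looked at per frame and per second.
pixels_per_frame = LANE_WIDTH_PX * LANE_HEIGHT_PX
pixels_per_second = pixels_per_frame * GAME_FPS

# Time available to capture and process one frame before the next arrives.
frame_budget_ms = 1000.0 / GAME_FPS

print(f"pixels per frame:      {pixels_per_frame}")
print(f"pixels per second:     {pixels_per_second}")
print(f"time budget per frame: {frame_budget_ms:.2f} ms")
```

So everything done per frame (capture, processing, and sending the input) would need to fit in roughly 16.7 ms.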

Now, before I begin even thinking about implementing this design using OpenCV, I have two fundamental inquiries:

1. Would a multi-tracker be capable of performing a task like the one I've stated here? Given that there could be many notes present on-screen at any given time, would a multi-tracker still be able to accurately track all those notes?

2. With what degree of precision would OpenCV be able to track the motion of the notes and detect when they reach the designated "hit-area"? Many modern rhythm games have very small timing windows. For instance, one of the main games I'd like to use this bot with has a timing window of ~32ms to achieve the highest score for a single note.
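To make the second question more concrete, here is a small sketch of the timing involved. It assumes the ~32ms figure is the total width of the window, that a note moves at constant speed between two consecutive frames, and that y grows downward in pixels. The function name `predict_hit_time`, the hit-line position, and the sample positions are all hypothetical.

```python
GAME_FPS = 60
TIMING_WINDOW_MS = 32  # assumed to be the total width of the window

frame_period_s = 1.0 / GAME_FPS
# How many game frames fit inside the timing window.
frames_per_window = TIMING_WINDOW_MS / (1000.0 / GAME_FPS)


def predict_hit_time(y_prev, t_prev, y_curr, t_curr, hit_line_y):
    """Linearly extrapolate when a note will reach hit_line_y.

    y_* are vertical positions in pixels (increasing downward) and t_* are
    timestamps in seconds. Returns None if the note is not moving downward.
    """
    dt = t_curr - t_prev
    dy = y_curr - y_prev
    if dt <= 0 or dy <= 0:
        return None
    speed_px_per_s = dy / dt
    return t_curr + (hit_line_y - y_curr) / speed_px_per_s


# Hypothetical observation: the note moved from y=400 to y=410 in one frame,
# and the hit line is assumed to sit at y=450.
hit_time_s = predict_hit_time(400, 0.0, 410, frame_period_s, 450)

print(f"frames per timing window: {frames_per_window:.2f}")
print(f"predicted hit time: {hit_time_s * 1000:.1f} ms after the first frame")
```

With fewer than two frames per window, reacting only when a note is seen inside the hit-area leaves little margin, so extrapolating from earlier frames and scheduling the key press ahead of time may be more forgiving.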

Any answers/pointers are appreciated. If OpenCV would be unsuitable for the task I've specified above, I'd appreciate any advice regarding similar frameworks that I could use for my project here. Or perhaps a different OpenCV-powered approach, rather than using a tracker.



OpenCV is a computer vision library. Precision mostly depends on your algorithm, while performance and speed depend on the hardware you'll be using: your camera and GPU/CPU.

OpenCV supports GPU and multicore processing.

Ziri ( 2017-08-16 08:55:41 -0500 )

Thanks for the clarification! I'm referring to the different trackers that come with OpenCV (KCF, MIL, etc.) when I ask about precision here. From what I've read, KCF would likely be the best fit, but I'm still not entirely sure about its accuracy and precision for my specific use case.

Jesper ( 2017-08-16 09:03:36 -0500 )

I probably wouldn't use one of the built-in trackers at all. Just use color thresholding to separate out the three colors, and then do your own processing of the results. Since the scene is so simple and regular, you don't need anything that learns, like KCF.

Tetragramm ( 2017-08-16 17:23:01 -0500 )

Right, that sounds like a more sensible approach. My only qualm about going in that direction was whether it'd be possible to do that kind of processing at 60+ frames per second. Thinking about it, though, I don't see how it could be any more demanding than object tracking, especially when we're talking about tracking many objects at a time.
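One way to settle the 60+ fps question would be to time the per-frame work directly. The sketch below is a generic harness using only the standard library; `measure_ms` is a hypothetical helper, and `dummy_process` is just a stand-in workload, to be swapped for the real thresholding step.

```python
import statistics
import time

FRAME_BUDGET_MS = 1000.0 / 60  # time available per frame at 60 fps


def measure_ms(process_frame, frame, runs=200, warmup=20):
    """Return (median, worst) time in ms for process_frame(frame)."""
    for _ in range(warmup):  # let caches and any lazy setup settle first
        process_frame(frame)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        process_frame(frame)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


def dummy_process(frame):
    # Stand-in workload: touches every byte once. Replace with real processing.
    return sum(frame)


# Placeholder "frame": one byte per pixel of a 288x482 region.
frame = bytes(288 * 482)

median_ms, worst_ms = measure_ms(dummy_process, frame)
print(f"median {median_ms:.3f} ms, worst {worst_ms:.3f} ms, "
      f"budget {FRAME_BUDGET_MS:.2f} ms")
print("fits in budget" if worst_ms < FRAME_BUDGET_MS else "over budget")
```

The worst case is probably the more useful number here, since a single slow frame is enough to miss a note.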

Thank you for the input :)

Jesper ( 2017-08-17 03:18:36 -0500 )