Best object guess from multiple frames

I am trying to locate an object within 30 frames of video. During these 30 frames, the object and the camera are generally both stationary. However, the object detection algorithm is not strong enough to find the object consistently in every frame: when it misses the object, it reports something else instead. Still, the most common guess across the frames is the object I am looking for.

The image below shows what the guesses may look like. The green box marks the object I am trying to locate.

Is there an algorithm that can take a list of rectangles and find the most likely placement based on these rectangles?

[Image: example detection rectangles, with the actual object outlined in green]


What I have tried: I rounded each rectangle's left edge (its X value) to the nearest 10 and took the mode of the rounded X values. However, that just gives me the first rectangle with that X value, which could be too small or too big, or too high or too low. Again, is there a known algorithm for finding a best-fit rectangle from a set of rectangles?

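Hi, you can try the groupRectangles(...) function from OpenCV's object detection (objdetect) module. It clusters rectangles of similar size and position and returns one averaged rectangle per cluster. It is used internally by HOGDescriptor::detectMultiScale(...) and CascadeClassifier::detectMultiScale(...).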
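
To make the idea concrete, here is a minimal pure-Python sketch of that kind of grouping. It is only loosely modelled on what groupRectangles does and is not OpenCV's implementation: the names similar and best_rectangle, the (x, y, w, h) tuple layout, the eps value of 0.2, and the sample guesses are all assumptions made for illustration. Rectangles whose four edges lie within a size-relative tolerance of each other are merged into one cluster, and the largest cluster is averaged to give the best guess.

```python
from collections import defaultdict


def similar(a, b, eps=0.2):
    """Return True if rectangles a and b, given as (x, y, w, h),
    have all four edges within a tolerance scaled by their size."""
    delta = eps * 0.5 * (min(a[2], b[2]) + min(a[3], b[3]))
    return (abs(a[0] - b[0]) <= delta
            and abs(a[1] - b[1]) <= delta
            and abs(a[0] + a[2] - b[0] - b[2]) <= delta
            and abs(a[1] + a[3] - b[1] - b[3]) <= delta)


def best_rectangle(rects, eps=0.2):
    """Cluster similar rectangles and return
    (mean rectangle of the largest cluster, size of that cluster)."""
    if not rects:
        return None, 0

    # Union-find: each rectangle starts in its own cluster.
    parent = list(range(len(rects)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge the clusters of every pair of similar rectangles.
    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            if similar(rects[i], rects[j], eps):
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, r in enumerate(rects):
        clusters[find(i)].append(r)

    # The most common guess is the largest cluster; average its members.
    largest = max(clusters.values(), key=len)
    n = len(largest)
    mean = tuple(round(sum(r[k] for r in largest) / n) for k in range(4))
    return mean, n


# Hypothetical detections from six frames: four near the object, two stray.
guesses = [
    (100, 50, 40, 80), (102, 48, 41, 82), (98, 51, 39, 79), (101, 50, 40, 80),
    (300, 200, 30, 30), (10, 10, 60, 20),
]
best, votes = best_rectangle(guesses)
print(best, votes)
```

Averaging the whole cluster avoids the problem of picking whichever rectangle happens to come first in a bin, and the vote count gives a rough idea of how much to trust the result. With OpenCV itself you would call groupRectangles directly instead of writing this by hand.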

