Creating Regions of Interest (ROI) by clustering fragmented contours
Overall Objective: Create Regions of Interest (ROIs) and then examine them for objects such as a person, dog, or vehicle, using the Java bindings.
Approach: BackgroundSubtraction -> FindContours -> downselect to Region of Interest (smallest encompassing rectangle around contours of an object) that is then sent to be classified and/or recognized.
Problem: Almost all the time, each object produces too many fragmented contours. I've tried BackgroundSubtractorMOG and MOG2 with varying parameters (I may not have tried the right combinations) along with erode/dilate and findContours(). The contours rarely enclose the subject completely; instead I get a number of contours that each map to only part of the subject. On top of that, there are sometimes multiple objects (e.g., a person with a dog) moving through the video stream. I am not able to reliably draw a rect around the full object so that I can use that smaller window for feature detection (or a classifier, HOG, etc.).
(I am addressing the shadow issue in a different thread)
I am leaning towards grouping contours by some measure of nearness, though for people the vertical elongation can complicate the nearness calculation.
Question: Is there a method by which the nearness of contours can be evaluated, so as to group them into a larger contour/object?
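One possible nearness measure, sketched here only as an illustration: take the bounding rect of each contour and compare the horizontal and vertical gaps between rects against separate tolerances. A larger vertical tolerance would let head, torso, and leg fragments of a standing person count as near. The class name `ContourNearness`, the method names, and the tolerance values are made up for this sketch, and `java.awt.Rectangle` stands in for the rect type so the snippet runs without OpenCV on the classpath.

```java
import java.awt.Rectangle;

class ContourNearness {
    // Horizontal gap in pixels between two boxes; 0 when they overlap on the x axis.
    static int gapX(Rectangle a, Rectangle b) {
        int leftOfRightmost = Math.max(a.x, b.x);
        int rightOfLeftmost = Math.min(a.x + a.width, b.x + b.width);
        return Math.max(0, leftOfRightmost - rightOfLeftmost);
    }

    // Vertical gap in pixels between two boxes; 0 when they overlap on the y axis.
    static int gapY(Rectangle a, Rectangle b) {
        int topOfLower = Math.max(a.y, b.y);
        int bottomOfUpper = Math.min(a.y + a.height, b.y + b.height);
        return Math.max(0, topOfLower - bottomOfUpper);
    }

    // Two boxes are "near" when both gaps are within their own tolerance.
    // Assumption: maxGapY is chosen larger than maxGapX for upright subjects.
    static boolean isNear(Rectangle a, Rectangle b, int maxGapX, int maxGapY) {
        return gapX(a, b) <= maxGapX && gapY(a, b) <= maxGapY;
    }
}
```

With these definitions, a head fragment at y 0 to 10 and a leg fragment starting at y 40 in the same column are near for `maxGapY = 40` but not for `maxGapY = 5`. The tolerances would need tuning per camera and scene.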
Are there approaches to solving this problem? Below are images that illustrate the issue under consideration.
You could cluster the locations of the contours via agglomerative clustering. But is that really necessary? IMHO sth like a cascade classifier should be fast enough to be applied in real time (though of course it needs a lot of training time).
@Guanta, so it would be doable to run a cascade classifier across the entire image in a 4+ frames per second stream, with multiple cameras at 5MP?
When you said 'sth', I'm not quite sure what you mean - Soft Cascade Classifier?
k-means clustering seemed to hold promise, but it requires knowing how many clusters exist, and the number of objects moving through the camera FOV isn't known ahead of time.
Either a soft cascade classifier or a normal cascade classifier. However, multiple 5MP cameras are probably too much for one PC; maybe the drawback of downscaling the images isn't too severe, so try it. And I meant agglomerative clustering, not k-means clustering: in agglomerative clustering you can set a distance threshold, and a cluster is merged with its nearest neighbouring cluster only while that neighbour is within the threshold, so the number of clusters doesn't have to be known in advance. Furthermore, check out mean-shift / CamShift! Good luck with your project!
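To make the agglomerative idea concrete, here is a minimal sketch in plain Java. It is a simplified, greedy bottom-up merge of bounding rects with a distance cutoff, not a full agglomerative clustering implementation: after each merge the union rect stands in for the whole cluster. The class name `ContourGrouper`, the `group` method, and the gap tolerances are hypothetical, and `java.awt.Rectangle` is used as a stand-in so the code runs without OpenCV; with OpenCV the input rects would presumably come from `Imgproc.boundingRect` on each contour.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

class ContourGrouper {
    // True when the horizontal and vertical gaps between the boxes are within tolerance.
    static boolean near(Rectangle a, Rectangle b, int maxGapX, int maxGapY) {
        int gapX = Math.max(0, Math.max(a.x, b.x) - Math.min(a.x + a.width, b.x + b.width));
        int gapY = Math.max(0, Math.max(a.y, b.y) - Math.min(a.y + a.height, b.y + b.height));
        return gapX <= maxGapX && gapY <= maxGapY;
    }

    // Repeatedly merges any two near boxes into their union until no pair is near.
    // The number of resulting ROIs falls out of the threshold; it is not an input.
    static List<Rectangle> group(List<Rectangle> boxes, int maxGapX, int maxGapY) {
        List<Rectangle> rois = new ArrayList<>();
        for (Rectangle b : boxes) {
            rois.add(new Rectangle(b)); // copy so the caller's rects are not modified
        }
        boolean merged = true;
        while (merged) {
            merged = false;
            for (int i = 0; i < rois.size() && !merged; i++) {
                for (int j = i + 1; j < rois.size() && !merged; j++) {
                    if (near(rois.get(i), rois.get(j), maxGapX, maxGapY)) {
                        rois.set(i, rois.get(i).union(rois.get(j)));
                        rois.remove(j);
                        merged = true; // restart the scan with the enlarged box
                    }
                }
            }
        }
        return rois;
    }
}
```

For example, three vertically stacked person fragments and two side-by-side dog fragments far to the right would collapse into two ROIs with tolerances such as `maxGapX = 10` and `maxGapY = 15`. The restart-after-merge loop is cubic in the worst case, which should be acceptable for the few dozen contours a frame typically yields, but that is an assumption to verify on real footage.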
@Guanta, your suggestion was most helpful - thank you!