Improving an algorithm for detecting fish in a canal
I have many hours of video captured by an infrared camera placed by marine biologists in a canal. Their research goal is to count herring that swim past the camera. Watching every video is too time-consuming, so they'd like to use some computer vision to filter out the frames that do not contain fish. They can tolerate some false positives and false negatives, so they don't need a sophisticated machine learning approach at this point.
I am using a process that looks like this for each frame (a rough code sketch follows the list):
1. Load the frame from the video
2. Apply a Gaussian or median blur
3. Subtract the background using the BackgroundSubtractorMOG2 class
4. Apply a brightness threshold (the fish tend to reflect the sunlight, or an infrared light that is turned on at night), then dilate
5. Compute the total area of all of the contours in the image
6. If this area is greater than a certain percentage of the frame, the frame may contain fish; extract it.
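In code, the per-frame pipeline is roughly the following. This is a simplified sketch rather than my exact code; the numeric parameter values are placeholders for whatever the parameter search settles on, and the findContours call assumes the OpenCV 4.x signature.

```python
import cv2

# Background model is created once and updated frame by frame
backsub = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

def frame_may_contain_fish(frame, blur_ksize=5, brightness_thresh=200,
                           dilate_iterations=2, area_fraction=0.002):
    """Return True if the frame passes the total-contour-area heuristic.
    All keyword arguments are placeholders for values found by the search."""
    # Steps 1-2: grayscale + blur to suppress sensor noise
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (blur_ksize, blur_ksize), 0)

    # Step 3: foreground mask from the MOG2 background model
    fg_mask = backsub.apply(blurred)

    # Step 4: keep only bright foreground pixels, then dilate to join blobs
    _, bright = cv2.threshold(blurred, brightness_thresh, 255, cv2.THRESH_BINARY)
    mask = cv2.bitwise_and(fg_mask, bright)
    mask = cv2.dilate(mask, None, iterations=dilate_iterations)

    # Step 5: total area of all contours in the mask (OpenCV 4.x signature)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    total_area = sum(cv2.contourArea(c) for c in contours)

    # Step 6: flag the frame if the area exceeds a fraction of the frame size
    h, w = gray.shape
    return total_area > area_fraction * h * w
```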
To find optimal parameters for these operations, such as the blur algorithm and its kernel size, the brightness threshold, etc., I've taken a manually tagged video and run many versions of the detector, using an evolutionary algorithm to guide the search. However, even the best parameter set I can find still produces many false negatives (about two-thirds of the fish are not detected) and false positives (about 80% of the detected frames in fact contain no fish).
I'm looking for ways to improve the algorithm. Can I identify the fish by fitting an ellipse to their contour and checking its angle (they tend to be horizontal, or at an upward or downward angle, but not vertical or head-on)? Should I do something to normalize the lighting conditions so that the same brightness threshold works whether it is day or night, perhaps something like the sketch below? (I'm a novice when it comes to OpenCV, so examples are very appreciated.)
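For the lighting question, I was wondering whether local histogram equalization (CLAHE) on the grayscale frame before thresholding would make a single threshold workable for both day and night footage. A rough, untested sketch of what I have in mind; the clipLimit and tileGridSize values are just guesses:

```python
import cv2

# CLAHE (Contrast Limited Adaptive Histogram Equalization) spreads out the
# local intensity range, which might make day and night frames more
# comparable. clipLimit and tileGridSize are guesses, not tuned values.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

def normalize_lighting(gray_frame):
    """Equalize local contrast of a single-channel (grayscale) frame."""
    return clahe.apply(gray_frame)
```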
Can the fish swim at any vertical position in the image, or do they stay more or less in the center? Can they be closer (bigger) or farther away (smaller)? Are there any other fish that you do not want to detect?
Thanks for your comment. Yes, they can swim at any vertical position in the image. I'm sure the distribution would show trends in where they swim, but I wouldn't want to build that assumption in. The distance to the camera also varies, so yes, their apparent size will vary as well.
Regarding other fish (or objects such as leaves), it is OK if they are detected, because a human reviewer can eliminate them. But I don't want to burden the human reviewers with a significant number of false positives per true positive.
Can you post some more images (with and without fish, far and close, low and high) to give a broader perspective of the problem? I am a bit surprised that your considerable effort (employing evolutionary algorithms) produced such poor results. I'm not an expert on evolutionary algorithms, but perhaps it is due to the large variety of fish appearances (and disturbances) and the small gene pool of possible image-processing parameters. Put simply, there may be no combination of your few processing steps that catches just the fish and not the other stuff (I'm not sure what it is in the upper part of the image). My first thought would be to analyze each contour and classify it based on parameters like brightness, size, shape coefficients, main-axis angle (as you suggested), etc., roughly along the lines of the sketch below.
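To make the idea concrete, a per-contour filter might look roughly like this. Every threshold is a placeholder to be tuned on your tagged frames, and the ellipse-angle handling follows my reading of OpenCV's fitEllipse convention (width axis starts horizontal, height axis starts vertical, both rotated by the returned angle), which is worth verifying on real data:

```python
import cv2
import numpy as np

def looks_like_fish(contour, gray_frame,
                    min_area=150, max_area=20000,
                    min_brightness=120, min_elongation=1.5,
                    max_tilt_deg=45):
    """Rough per-contour classifier; all thresholds are placeholders."""
    # Size filter: reject specks and huge blobs
    area = cv2.contourArea(contour)
    if not (min_area <= area <= max_area):
        return False

    # Brightness filter: mean gray level inside the contour
    mask = np.zeros(gray_frame.shape, dtype=np.uint8)
    cv2.drawContours(mask, [contour], -1, 255, thickness=-1)
    if cv2.mean(gray_frame, mask=mask)[0] < min_brightness:
        return False

    # Shape/orientation filter via a fitted ellipse (needs >= 5 points)
    if len(contour) < 5:
        return False
    _center, (w, h), angle = cv2.fitEllipse(contour)

    # Elongation: fish should be clearly longer than they are wide
    major, minor = max(w, h), min(w, h)
    if minor == 0 or major / minor < min_elongation:
        return False

    # Tilt of the major axis relative to horizontal (my reading of the
    # fitEllipse angle convention; double-check the sign/offset on real data)
    major_angle = angle if w >= h else angle + 90
    tilt = min(major_angle % 180, 180 - (major_angle % 180))
    return tilt <= max_tilt_deg
```

You would run this on each contour found in the dilated mask and only keep frames where at least one contour passes, instead of summing the area of all contours.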