I'm using the Scene Text Detection features of OpenCV and am having trouble finding the optimal parameters for createERFilterNM1. Here is what the doc states:
C++: Ptr<ERFilter> createERFilterNM1(const Ptr<ERFilter::Callback>& cb, int thresholdDelta=1, float minArea=0.00025, float maxArea=0.13, float minProbability=0.4, bool nonMaxSuppression=true, float minProbabilityDiff=0.1)
Parameters:
- cb – Callback with the classifier. Default classifier can be implicitly loaded with function loadClassifierNM1(), e.g. from file in samples/cpp/trained_classifierNM1.xml
- thresholdDelta – Threshold step in subsequent thresholds when extracting the component tree
- minArea – The minimum area (% of image size) allowed for retrieved ERs
- maxArea – The maximum area (% of image size) allowed for retrieved ERs
- minProbability – The minimum probability P(er|character) allowed for retrieved ERs
- nonMaxSuppression – Whether non-maximum suppression is done over the branch probabilities
- minProbabilityDiff – The minimum probability difference between local maxima and local minima ERs
The default values are not the same as the ones used in the example, so I wonder what the optimal settings are.
Would it be possible to optimize the settings when I know the font size of the text to detect? Maybe by setting minArea and maxArea to match the size of a letter? With this in mind I tried to tweak the settings to get text detection working for this image (I would like to detect the subtitles). With thresholdDelta = 16, minArea = 0.0004 and maxArea = 0.04 I get these results. Changing minProbability and minProbabilityDiff does not seem to change anything. But maybe tuning is of no use given the quality of the source image.
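To make the font-size idea concrete, here is a rough sketch of how I would derive the area bounds from a known glyph height. It assumes minArea and maxArea are fractions of the total pixel count rather than literal percentages (the defaults 0.00025 and 0.13 suggest this, but I have not confirmed it). The helper `areaBoundsForGlyphHeight`, the `AreaBounds` struct, and the 5% "thin glyph" factor are my own guesses for illustration and are not part of OpenCV.

```cpp
#include <cassert>

// Hypothetical helper, not an OpenCV API: candidate minArea/maxArea values
// expressed as a fraction of the image area (assumed interpretation).
struct AreaBounds {
    float minArea;
    float maxArea;
};

AreaBounds areaBoundsForGlyphHeight(int imageWidth, int imageHeight, int glyphHeightPx)
{
    assert(imageWidth > 0 && imageHeight > 0 && glyphHeightPx > 0);

    const double imageArea = static_cast<double>(imageWidth) * imageHeight;
    const double glyphBox  = static_cast<double>(glyphHeightPx) * glyphHeightPx;

    // Guess: the thinnest glyphs ('i', 'l') cover about 5% of a square
    // whose side equals the glyph height.
    const double smallestRegion = 0.05 * glyphBox;

    // Guess: no single character fills more than that full square.
    const double largestRegion = glyphBox;

    AreaBounds bounds;
    bounds.minArea = static_cast<float>(smallestRegion / imageArea);
    bounds.maxArea = static_cast<float>(largestRegion / imageArea);
    return bounds;
}
```

For a 1280x720 frame with subtitles about 40 px tall, I would call `areaBoundsForGlyphHeight(1280, 720, 40)` and pass the resulting `minArea` and `maxArea` to createERFilterNM1. Whether bounds this tight actually help is part of what I am asking.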
I lack a clear understanding of the parameters. Could someone please explain them in more detail or give some hints on how to find the optimal values?