OpenCV: detect ellipse around text
I'm working on OCR software for table detection using the Java version of OpenCV. I can detect almost all text borders in the images, but I have problems with "circled" words/numbers.
For text detection I do the following:
I detect the horizontal and vertical lines of the table using morphological operations (from this answer). The detected lines are then removed from the original Mat image by subtraction (I'm not sure whether this is the best approach):
// convert the source image to grayscale
Mat gray = new Mat();
Imgproc.cvtColor(src, gray, Imgproc.COLOR_BGR2GRAY);
// invert the grayscale image, then binarize it with Otsu's threshold
Mat bw = new Mat();
Core.bitwise_not(gray, gray);
Imgproc.threshold(gray, bw, 0.0, 255.0, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);
// create the images that will be used to extract the horizontal and vertical lines
Mat horizontal = bw.clone();
Mat vertical = bw.clone();
// Specify size on horizontal axis
int horizontalSize = horizontal.cols() / 20;
Mat horizontalStructure = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(horizontalSize, 1));
// apply morphology operations
Imgproc.erode(horizontal, horizontal, horizontalStructure, new Point(-1, -1), 1);
Imgproc.dilate(horizontal, horizontal, horizontalStructure, new Point(-1, -1), 1);
// Imgproc.blur(horizontal, horizontal, new Size(3,3));
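// 3x3 cross-shaped kernel (MORPH_DILATE is an operation type, not a shape; it has the same value as MORPH_CROSS)
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_CROSS, new Size(3, 3));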
Imgproc.dilate(horizontal, horizontal, kernel, new Point(-1, -1), 1);
// Specify size on vertical axis
int verticalSize = vertical.rows() / 20;
Mat verticalStructure = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(1, verticalSize));
// apply morphology operations
Imgproc.erode(vertical, vertical, verticalStructure, new Point(-1, -1), 1);
Imgproc.dilate(vertical, vertical, verticalStructure, new Point(-1, -1), 1);
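// same 3x3 cross-shaped kernel as for the horizontal lines (MORPH_CROSS has the same value as MORPH_DILATE)
kernel = Imgproc.getStructuringElement(Imgproc.MORPH_CROSS, new Size(3, 3));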
Imgproc.dilate(vertical, vertical, kernel, new Point(-1, -1), 1);
// Is this the correct way to do it?
// delete lines from binary image
Core.subtract(bw, horizontal, bw);
Core.subtract(bw, vertical, bw);
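To make clear what the subtraction does: on 8-bit images Core.subtract saturates, so negative results are clamped to 0. For masks that contain only 0 and 255, this is the same as keeping a pixel only where the line mask is 0. The following is a minimal plain-Java sketch of that behaviour on small arrays. The class and method names are hypothetical and not part of OpenCV. One consequence is that any character stroke overlapping the (dilated) line mask is erased as well.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration only; not part of OpenCV.
final class LineRemovalSketch {

    // Mimics Core.subtract on 8-bit single-channel data: results below 0 are clamped to 0.
    static int[] saturatingSubtract(int[] image, int[] mask) {
        int[] out = new int[image.length];
        for (int i = 0; i < image.length; i++) {
            out[i] = Math.max(image[i] - mask[i], 0);
        }
        return out;
    }

    // Equivalent formulation for 0/255 masks: keep a pixel only where the mask is 0.
    static int[] andNot(int[] image, int[] mask) {
        int[] out = new int[image.length];
        for (int i = 0; i < image.length; i++) {
            out[i] = image[i] & (~mask[i] & 0xFF);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] bw    = {255, 255,   0, 0, 255}; // binarized image (text and lines are white)
        int[] lines = {255,   0, 255, 0,   0}; // detected line mask
        System.out.println(Arrays.toString(saturatingSubtract(bw, lines))); // [0, 255, 0, 0, 255]
        System.out.println(Arrays.toString(andNot(bw, lines)));             // [0, 255, 0, 0, 255]
    }
}
```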
Then I get the word boundaries with the findContours method:
ArrayList<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Mat hierarchy = new Mat();
Imgproc.findContours(bw, contours, hierarchy, Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);
The intermediate and final results are shown in the image:
The problem is that circled numbers are not recognized, and I am not able to detect and remove the circles around the numbers at the bottom of the table.
I've tried using the Hough Circle Transform and the fitEllipse method, with no decent results. Is trying to remove the circles by matching them with an ellipse the best approach?
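As an illustration of what "matching with an ellipse" could mean, here is a minimal sketch of one possible heuristic; it is an assumption, not a tested solution and not the code from my attempts. It compares the area enclosed by a contour with the area of the ellipse inscribed in its bounding box (pi/4 * width * height). The inputs would presumably come from Imgproc.contourArea and Imgproc.boundingRect, or from the size of the RotatedRect returned by Imgproc.fitEllipse for tilted shapes. The class name, method names and the tolerance value are hypothetical.

```java
// Hypothetical helper class for illustration only; not part of OpenCV.
final class EllipseHeuristicSketch {

    // Ratio between the area enclosed by a contour and the area of the ellipse
    // inscribed in its bounding box. Close to 1.0 for an ellipse whose axes
    // are aligned with the box.
    static double ellipseFillRatio(double contourArea, double boxWidth, double boxHeight) {
        double ellipseArea = Math.PI / 4.0 * boxWidth * boxHeight;
        return ellipseArea > 0 ? contourArea / ellipseArea : 0.0;
    }

    // True when the fill ratio is within the given tolerance of 1.0.
    static boolean looksLikeEllipse(double contourArea, double boxWidth,
                                    double boxHeight, double tolerance) {
        return Math.abs(ellipseFillRatio(contourArea, boxWidth, boxHeight) - 1.0) <= tolerance;
    }

    public static void main(String[] args) {
        // Ideal ellipse with axes 40 x 30: the ratio is exactly 1.0.
        double ellipseArea = Math.PI / 4.0 * 40 * 30;
        System.out.println(looksLikeEllipse(ellipseArea, 40, 30, 0.15)); // true

        // Filled 40 x 30 rectangle: the ratio is 4/pi (about 1.27), so it is rejected.
        System.out.println(looksLikeEllipse(40 * 30, 40, 30, 0.15));     // false
    }
}
```

A check like this only makes sense for closed outlines; a circle that was cut open by the line subtraction, or one that touches the number inside it, would likely need extra handling.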
Can anyone suggest an effective procedure to achieve this?