
Object detection and splitting, clustering?

asked 2015-11-07 05:55:40 -0600

pepper

updated 2015-11-09 05:08:44 -0600

I'm currently working on an application that will split a scanned image (that contains multiple receipts) into individual receipt images.

Below is the sample image:

image description

I was able to detect the edges of each receipt in the scanned image using the Canny function. Below is the sample image with the detected edges:

image description

My sample code so far:

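// Written against the OpenCV 2.4 Java bindings (Highgui.imread, Core.rectangle)
import org.opencv.core.*;
import org.opencv.highgui.Highgui;
import org.opencv.imgproc.Imgproc;
import java.util.ArrayList;
import java.util.List;

Mat src = Highgui.imread(filename);
Mat gray = new Mat();

int threshold = 12;

// Grayscale, light blur, then Canny with a 1:3 low/high threshold ratio
Imgproc.cvtColor(src, gray, Imgproc.COLOR_BGR2GRAY);
Imgproc.blur(gray, gray, new Size(3, 3));
Imgproc.Canny(gray, gray, threshold, threshold * 3, 3, true);

List<MatOfPoint> contours = new ArrayList<>();
Mat hierarchy = new Mat();

Imgproc.findContours(gray, contours, hierarchy,
        Imgproc.RETR_CCOMP,
        Imgproc.CHAIN_APPROX_SIMPLE);

if (hierarchy.size().height > 0 && hierarchy.size().width > 0) {
    // Walk the top-level contours via the "next sibling" link (hierarchy entry [0]);
    // the loop ends when that link is -1
    for (int idx = 0; idx >= 0; idx = (int) hierarchy.get(0, idx)[0]) {
        Rect rect = Imgproc.boundingRect(contours.get(idx));
        // Draw the bounding box in blue (channel order is BGR)
        Core.rectangle(src, new Point(rect.x, rect.y),
                new Point(rect.x + rect.width, rect.y + rect.height),
                new Scalar(255, 0, 0));
    }
}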

My problem is that I don't know how to identify the 3rd receipt. Unlike the first two, it is not enclosed in a single rectangular contour that I could use as the basis for splitting the image.

I've heard that to extract the 3rd receipt I must use a clustering algorithm such as DBSCAN, but I can't find an implementation.

Does anyone know how I can identify the 3rd receipt?

Thank you in advance!


Comments

I am not able to translate this line to Java:

big = big | Imgproc.boundingRect(contour);

Is this the bitwise_or from OpenCV? The API takes Mat as sources, not Rect. Any help, please?

Juampa ( 2019-02-26 12:34:30 -0600 )

@Juampa, please do not post answers here if you have a question or comment. Thank you.

Rect a,b;
Rect c = a | b;

is a special construct (only available in C++; Java does not have overloaded operators) denoting the union of two Rects. You'll have to code that manually in Java.
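A minimal Java sketch of such a manual union follows. `SimpleRect` is a hypothetical stand-in that only mirrors the public `x`, `y`, `width` and `height` fields of `org.opencv.core.Rect`, so that the example runs without OpenCV; the helper name `union` is likewise an assumption. Treating an empty rectangle as "no contribution" is a design choice of this sketch.

```java
// Hypothetical stand-in for org.opencv.core.Rect so the sketch runs without OpenCV.
final class SimpleRect {
    final int x, y, width, height;

    SimpleRect(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    boolean isEmpty() {
        return width <= 0 || height <= 0;
    }
}

final class RectUnion {
    // Smallest axis-aligned rectangle that contains both a and b,
    // which is what `a | b` yields for two non-empty cv::Rect in C++.
    static SimpleRect union(SimpleRect a, SimpleRect b) {
        if (a.isEmpty()) return b;   // an empty rectangle contributes nothing
        if (b.isEmpty()) return a;
        int x1 = Math.min(a.x, b.x);
        int y1 = Math.min(a.y, b.y);
        int x2 = Math.max(a.x + a.width, b.x + b.width);
        int y2 = Math.max(a.y + a.height, b.y + b.height);
        return new SimpleRect(x1, y1, x2 - x1, y2 - y1);
    }
}
```

With the real `Rect` class the same four min/max computations apply, and `big` can start as `new Rect(0, 0, 0, 0)` because the empty case is handled explicitly.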

berak ( 2019-02-26 13:30:19 -0600 )

2 answers

4

answered 2015-11-07 14:37:47 -0600

Eduardo

updated 2015-11-11 18:41:26 -0600

Edit2:

Your second test image:

image description

is not suitable for my code. The receipts are too close to each other for the contours to be clustered correctly: image description

The result is perhaps slightly better when binarising the image instead of using Canny edge detection: image description

As you are working on a specific application, if you choose a clustering approach (either DBSCAN or the Euclidean Cluster Extraction algorithm, which are pretty similar), you have to place the receipts with some space between them. For example, after erasing some parts: image description

Another option I see is to follow simple rules when scanning the receipts, for example placing them upright on a regular grid. You could then locate the receipts by finding the blank space between them (similar to this question): image description
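The blank-space idea can be sketched in plain Java as a column projection: a column with no foreground pixel is blank, and a sufficiently wide run of blank columns is a candidate cut between two receipts. The class name `BlankGaps`, the `boolean[][]` mask representation and the `minWidth` parameter are assumptions made for illustration; with OpenCV the mask would come from the binarised or Canny image.

```java
import java.util.ArrayList;
import java.util.List;

final class BlankGaps {
    // Returns [start, end) column ranges that contain no foreground pixel.
    // mask[row][col] is true where the binarised scan has ink or an edge.
    static List<int[]> blankColumnRuns(boolean[][] mask, int minWidth) {
        int cols = mask.length == 0 ? 0 : mask[0].length;
        List<int[]> runs = new ArrayList<>();
        int runStart = -1;
        // Iterate one past the last column so that a trailing run is closed too.
        for (int c = 0; c <= cols; c++) {
            boolean blank = c < cols && isColumnBlank(mask, c);
            if (blank && runStart < 0) {
                runStart = c;
            } else if (!blank && runStart >= 0) {
                if (c - runStart >= minWidth) runs.add(new int[] {runStart, c});
                runStart = -1;
            }
        }
        return runs;
    }

    private static boolean isColumnBlank(boolean[][] mask, int col) {
        for (boolean[] row : mask) {
            if (row[col]) return false;
        }
        return true;
    }
}
```

Running the same scan over rows inside each vertical strip would give the horizontal cuts of the grid. This only works when the receipts are upright, as the paragraph above requires.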

To sum up, if you can control the conditions when scanning the receipts, I would:

  • place the receipts upright on a regular grid with some space between them
  • cluster the contours and also detect the blank spaces, using them to merge the different clusters for better robustness.

Otherwise I am afraid it will be difficult to write an algorithm that successfully splits the receipts in most cases without tuning or tweaking the parameters for each situation.


Edit:

I decided to speed up the computation, as it was really too slow (due to the complexity of calculating the contour distances), by:

  • approximating the contour distance with the 16 distances between the bounding boxes of the two contours considered
  • and/or using the FLANN library to perform a radius search.
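The first approximation can be sketched in Java as follows. This assumes that "the 16 distances" are the 4 x 4 corner-to-corner distances of the two bounding boxes, which is an interpretation since the full code is truncated here. Boxes are passed as `{x, y, width, height}` arrays, and the names `BoxDistance` and `minCornerDistance` are hypothetical.

```java
final class BoxDistance {
    // Corners of a box given as {x, y, width, height}.
    private static int[][] corners(int[] box) {
        int x1 = box[0], y1 = box[1];
        int x2 = box[0] + box[2], y2 = box[1] + box[3];
        return new int[][] {{x1, y1}, {x2, y1}, {x1, y2}, {x2, y2}};
    }

    // Minimum of the 4 x 4 = 16 corner-to-corner Euclidean distances.
    static double minCornerDistance(int[] a, int[] b) {
        double best = Double.MAX_VALUE;
        for (int[] p : corners(a)) {
            for (int[] q : corners(b)) {
                best = Math.min(best, Math.hypot(p[0] - q[0], p[1] - q[1]));
            }
        }
        return best;
    }
}
```

The cost per pair of contours becomes constant instead of growing with the number of contour points. The trade-off is accuracy: two boxes whose long edges face each other, or that overlap, can get a larger value than the true distance between the contours.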

I also added code that uses the Otsu threshold to obtain the threshold for the Canny method automatically (this does not work every time).
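For reference, Otsu's method picks the gray level that maximises the between-class variance of the two pixel groups it separates. A plain-Java sketch over raw 8-bit gray values (the class name `Otsu` and the `int[]` input are assumptions; in OpenCV the value is returned by `Imgproc.threshold` when the `THRESH_OTSU` flag is set):

```java
final class Otsu {
    // Otsu's threshold for 8-bit gray values: the level t that maximises the
    // between-class variance of the pixels <= t and the pixels > t.
    static int threshold(int[] grayValues) {
        int[] hist = new int[256];
        for (int v : grayValues) hist[v]++;

        long total = grayValues.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += (long) i * hist[i];

        long weightBg = 0;   // number of pixels in the "<= t" class
        long sumBg = 0;      // sum of the gray values in that class
        double bestVariance = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightBg += hist[t];
            if (weightBg == 0) continue;        // no pixel this dark yet
            long weightFg = total - weightBg;
            if (weightFg == 0) break;           // every pixel is already in the first class
            sumBg += (long) t * hist[t];
            double meanBg = (double) sumBg / weightBg;
            double meanFg = (double) (sumAll - sumBg) / weightFg;
            double diff = meanBg - meanFg;
            double variance = (double) weightBg * weightFg * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }
}
```

How the value is mapped to the two Canny thresholds is not visible in the truncated code. One common heuristic, stated here as an assumption, is to use the Otsu value as the high threshold and half of it as the low one.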

Now the computation time is pretty good (a speed-up of about 30x compared with my previous code). The input parameters are: the Canny threshold (used if the Otsu threshold does not produce good contours), the distance threshold below which two contours are considered part of the same cluster, and the minimum number of neighbors for a contour to be considered relevant (not noise).

Another result: the test image, the Canny image and the result image:

image description


Nice code, @sturkmen!

There are often many different ways to solve a problem. My approach is to cluster the different contours:

  • the code is a direct implementation of the Euclidean Cluster Extraction algorithm
  • the result is not bad but not ideal
  • the code is not optimized at all; the complexity increases with the number of contours and the number of contour points
  • noise contours could be eliminated with some morphology operations or by filtering on the contour area
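The clustering idea can be sketched in Java on plain 2D points. This is a simplified illustration, not the code of this answer: it clusters points rather than contours, uses a brute-force neighbor search, and the names `EuclideanClusters`, `radius` and `minSize` are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class EuclideanClusters {
    // Two points end up in the same cluster when a chain of points, each within
    // `radius` of the next, connects them. Clusters with fewer than `minSize`
    // points are treated as noise and dropped.
    static List<List<Integer>> extract(double[][] pts, double radius, int minSize) {
        boolean[] visited = new boolean[pts.length];
        List<List<Integer>> clusters = new ArrayList<>();
        for (int seed = 0; seed < pts.length; seed++) {
            if (visited[seed]) continue;
            List<Integer> cluster = new ArrayList<>();
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(seed);
            visited[seed] = true;
            while (!queue.isEmpty()) {
                int i = queue.poll();
                cluster.add(i);
                // Brute-force radius search; a k-d tree would replace this loop.
                for (int j = 0; j < pts.length; j++) {
                    if (!visited[j] && dist(pts[i], pts[j]) <= radius) {
                        visited[j] = true;
                        queue.add(j);
                    }
                }
            }
            if (cluster.size() >= minSize) clusters.add(cluster);
        }
        return clusters;
    }

    private static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }
}
```

To apply it to receipts, each "point" would be a contour and `dist` a contour-to-contour distance; each resulting cluster is then wrapped in one bounding rectangle.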

image description

With a modified image:

image description

The code:

#include <opencv2/opencv.hpp>

cv::RNG rng(12345);

//@url: http://answers.opencv.org/question/75649/object-detection-and-splitting-clustering/
typedef struct contour_t {
  std::vector<cv::Point> contour_pts;
  int idx;
  int unique_id;
  cv::Rect bounding_box;

  contour_t() :
      contour_pts(), idx(-1), unique_id(-1), bounding_box() {
  }

  contour_t(const contour_t &copy) {
    contour_pts = copy.contour_pts;
    idx = copy.idx;
    unique_id = copy.unique_id;
    bounding_box = copy.bounding_box;
  }

  contour_t(const std::vector<cv::Point> &c, const int ...

Comments

@Eduardo, your solution is better. I know that my code is specific and fails in many cases.

sturkmen ( 2015-11-07 14:52:02 -0600 )

@Eduardo, first I'd like to thank you for a very detailed answer.

Since I'm a Java developer, I'm wondering whether implementing the solution in C++ rather than in Java is the better approach.

Most of the image manipulation implementations I've found use C++. Could you explain why I should consider or stick with a C++ implementation (or whether C++ is better suited to this kind of work) rather than Java?

Thanks in advance!

pepper ( 2015-11-09 21:23:15 -0600 )

Since the Java version of OpenCV seems to bind the OpenCV C++ functions (using JNI?), I would say go directly with C++ if you are comfortable with both Java and C++. Otherwise, if you have never used C++, you can use Java.

Eduardo ( 2015-11-10 07:27:48 -0600 )

Hi @Eduardo, I'd like to thank you for your very helpful inputs (for the second image). I was able to achieve what I wanted by following them. They helped me a lot. Again, thank you very much!

pepper ( 2015-11-15 21:03:13 -0600 )
2

answered 2015-11-07 09:20:16 -0600

You can find a solution related to your question here.

I also tried to solve it another way. Here is my trial C++ code (I think you can convert it to Java). If you need an explanation of the algorithm, just ask.

#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>

using namespace cv;
using namespace std;

int main( int, char** argv )
{
    Mat src = imread( argv[1] );
    Mat gray;

    int threshold = 12;

    cvtColor(src, gray, COLOR_BGR2GRAY );
    blur(gray, gray, Size(3, 3));
    erode(gray, gray, Mat());
    Canny(gray, gray, threshold, threshold * 3, 3, true);

    vector<vector<Point> > contours;

    findContours(gray, contours,
                 RETR_EXTERNAL,
                 CHAIN_APPROX_SIMPLE);

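    // Clear the edge image and redraw it as filled bounding boxes,
    // skipping contours that are at most 10 px in both dimensions (noise).
    gray = 0;
    for( size_t i = 0; i < contours.size(); i++ )
    {
        Rect rect = boundingRect(contours[i]);
        // Logical OR (the original used bitwise |, which gives the same result here)
        if( rect.width > 10 || rect.height > 10 )
            rectangle(gray, rect, Scalar(255), -1 );
    }
    // Shrink the filled boxes slightly before looking for contours again.
    erode(gray, gray, Mat());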

    findContours(gray, contours,
                 RETR_EXTERNAL,
                 CHAIN_APPROX_SIMPLE);

    Rect big;
    for( size_t i = 0; i< contours.size(); i++ )
    {
        if( contourArea(contours[i]) > src.cols * src.rows / 8)
        {
            Rect rect = boundingRect(contours[i]);
            rectangle(src, rect, Scalar(255,0,0), 2 );
        }
        else
        {
            if(big.height < 1 )
                big = boundingRect(contours[i]);
            big = big | boundingRect(contours[i]);
        }
    }

    rectangle(src, big, Scalar(0,255,0), 2 );

    imshow( "result", src );

    waitKey(0);
    return(0);
}

Result image:

image description


Comments

Hi @sturkmen, thanks for your answer.

I already tried implementing your solution in Java, but unfortunately I'm having a problem with this part of the code:

big = big | boundingRect(contours[i]);

Can you please explain what it does? Also, what should the initial value of the Rect big variable be before the for loop?

Thanks again!

pepper ( 2015-11-09 21:27:34 -0600 )

Rect big; means Rect big = Rect(0,0,0,0);

sturkmen ( 2015-11-10 01:47:34 -0600 )

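Or you can use the code below, which collects the corner points of the small bounding boxes and takes the bounding box of all of them: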

vector<Point> big;
Rect rect;
for( size_t i = 0; i< contours.size(); i++ )
{
    rect = boundingRect(contours[i]);
    if( contourArea(contours[i]) > src.cols * src.rows / 8)
    {
        rectangle(src, rect, Scalar(255,0,0), 2 );
    }
    else
    {
        big.push_back(Point(rect.x,rect.y));
        big.push_back(Point(rect.x + rect.width, rect.y + rect.height));
    }
}

rect = boundingRect( big );
rectangle(src, rect, Scalar(0,255,0), 2 );

sturkmen ( 2015-11-10 02:08:13 -0600 )

Hi @sturkmen!

I was able to run the code you provided, and the result is as expected. Unfortunately, when I tried a different image, the result was not what I expected.

Because of this, I think it might have something to do with the line of code below:

src.cols * src.rows / 8

Can you please explain its purpose?

Thank you!

pepper ( 2015-11-11 04:17:14 -0600 )

You could try different values, such as src.cols * src.rows / 7 or src.cols * src.rows / 9 (any bigger or smaller value), and you will see the difference. (I can't see the image http://s7.postimg.org/5auwuk1y3/scann... because of internet restrictions in my country.)

It is also possible to calculate the threshold from the biggest contour.

Keep in mind that my code works only if there are some big contours and one group of small contours.
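In the code above, src.cols * src.rows / 8 is an area cut-off: a contour larger than one eighth of the image gets its own rectangle, and all smaller contours are merged into the single rectangle big. A minimal Java sketch of that split, plus the biggest-contour variant, follows. The class and method names and the `fraction` parameter are hypothetical, and the contour areas are passed in as plain numbers so that the example runs without OpenCV.

```java
import java.util.ArrayList;
import java.util.List;

final class AreaSplit {
    // Indices of contours whose area exceeds imageArea / divisor; these are
    // treated as stand-alone receipts, the rest as fragments to be merged.
    static List<Integer> largeContours(double[] areas, double imageArea, double divisor) {
        double minArea = imageArea / divisor;
        List<Integer> large = new ArrayList<>();
        for (int i = 0; i < areas.length; i++) {
            if (areas[i] > minArea) large.add(i);
        }
        return large;
    }

    // Variant: derive the cut-off from the biggest contour instead of the image size.
    // `fraction` (for example 0.5) is a hypothetical tuning parameter.
    static List<Integer> largeRelativeToBiggest(double[] areas, double fraction) {
        double biggest = 0;
        for (double a : areas) biggest = Math.max(biggest, a);
        return largeContours(areas, biggest * fraction, 1.0);
    }
}
```

A larger divisor lowers the cut-off, so more contours count as separate receipts; a smaller divisor pushes more of them into the merged group.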

sturkmen ( 2015-11-11 05:01:33 -0600 )

Hi @sturkmen, thanks for your inputs. But I find Eduardo's inputs more applicable/appropriate for my case. Thanks again!

pepper ( 2015-11-15 21:04:36 -0600 )

