
Classifier problem with characters or numbers

asked May 21 '18

benz

I used HOG and SVM to train and classify characters and numbers for OCR. Everything seems to work well, and I get a label value after prediction. I used a multiclass setup for the characters and numbers, for example:

number 0  label value (20)
number 1  label value (21)
number 2  label value (22)
......
character A  label value (31)
character B  label value (32)
......

When I used findContours and got a contour with no character or number inside, prediction still returned a value, 20 or something. What is the problem?


Comments

What is the problem?

Your expectation. It can only predict what it was trained on.

berak (May 21 '18)

My expectation is that any other object found by findContours should return -1, while numbers and characters return the correct label value. Aside from predict, is there some other way to detect characters and numbers with an SVM? Or should I use detectMultiScale to detect the whole ROI area?

benz (May 21 '18)

My expectation for the other object found by findContours should be -1,

Why would that be so? You did not train it on such data (your expectation is flawed).

One way to achieve this would be: instead of a multiclass SVM, have a separate one for each character, trained on this letter vs. everything else (including weird fly-shit). But you'll find that this is a much harder problem than your original approach.
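To make the "this letter vs. everything else" idea concrete, here is a minimal sketch of only the rejection step, written in plain Python. It is not code from this thread. It assumes each per-character classifier yields a signed decision score where a positive value means "this is my character"; how you obtain such a score, and its sign convention, depend on your SVM library. The name `predict_with_reject`, the threshold, and the example scores are all hypothetical.

```python
REJECT = -1  # label returned when no per-character classifier claims the patch


def predict_with_reject(scores, threshold=0.0):
    """Pick a label from one-vs-rest scores, or reject.

    scores: dict mapping label -> signed decision score produced by that
    label's "this character vs. everything else" classifier
    (assumption: positive means the classifier accepts the patch).
    """
    if not scores:
        return REJECT
    # Label whose classifier is most confident.
    best_label = max(scores, key=scores.get)
    # Accept only if even the best classifier is above the threshold.
    return best_label if scores[best_label] > threshold else REJECT


# Hypothetical scores, using label values from the question (20 = '0', 21 = '1', 31 = 'A').
digit_patch = {20: -0.8, 21: 1.3, 31: -0.4}
noise_patch = {20: -0.9, 21: -0.2, 31: -0.6}

print(predict_with_reject(digit_patch))  # 21
print(predict_with_reject(noise_patch))  # -1
```

The rejection only works if every per-character classifier has actually seen non-character patches as negatives during training; otherwise its scores on such patches are as meaningless as the multiclass label.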

Another approach might be to have a text detection stage before the classifier.
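A real text detector is beyond a short example, but the principle of filtering candidates before they reach the classifier can be sketched with a crude geometric check on each contour's bounding rectangle. This is a stand-in, not a text detection method, and it is not code from this thread. The function name `looks_like_glyph` and all thresholds are illustrative assumptions that would need tuning on your own images.

```python
def looks_like_glyph(box, min_height=10, min_aspect=0.1, max_aspect=1.2):
    """Crude pre-filter for contour bounding boxes.

    box: (x, y, w, h), the layout a bounding-rectangle call typically returns.
    Thresholds are hypothetical; printed characters are usually taller
    than wide, so very flat or very small boxes are dropped.
    """
    x, y, w, h = box
    if w <= 0 or h < min_height:
        return False
    aspect = w / h
    return min_aspect <= aspect <= max_aspect


# Hypothetical candidates: a character-sized box, a long thin line, a speck.
candidates = [(5, 5, 12, 20), (40, 8, 200, 6), (90, 4, 3, 3)]
glyph_boxes = [b for b in candidates if looks_like_glyph(b)]

print(glyph_boxes)  # [(5, 5, 12, 20)]
```

Only the boxes that survive the filter would be passed to HOG and the SVM. Anything character-shaped but not a character still gets through, so this reduces false predictions rather than eliminating them.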

berak (May 21 '18)

1 answer


answered May 22 '18

benz

I'm trying the first approach, berak, thanks so much



Stats

Asked: May 21 '18

Seen: 355 times

Last updated: May 20 '18