# How can I use YOLO for box detection?

I want to build an app that photographs a sudoku puzzle located anywhere on a page, in a book or elsewhere. I have tried Harris corner detection, but it picks up far too many points because the ink is not uniform. I have tried Canny, but it finds only a few of the grid lines. I have tried contour detection, but it gave me broken contours. I am now considering YOLO to detect the grid boxes, trained on a self-prepared dataset of photos taken with a phone or collected from the internet, so that it covers a variety of border thicknesses. How do I implement YOLO, or use an existing YOLO implementation, to get the coordinates of each box?


Well, maybe. With YOLO I can detect all the box coordinates in one pass and then sort them, first by x and then by y. The methods mentioned above also detect unrelated clutter elsewhere on the newspaper page. I want to crop the digits, feed them to an MNIST classifier, and then pass the result to a standard backtracking sudoku solver.

( 2019-07-17 12:45:04 -0500 )
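The sorting step mentioned in the comment above can be sketched as follows. This is only an illustration, not code from the thread: `order_boxes` is a hypothetical helper, boxes are assumed to be `(x, y, w, h)` tuples in pixels, and the grid is assumed to be roughly upright so that rows do not interleave. Sorting by x and then y directly tends to break when detections are a few pixels off, so the sketch groups boxes into rows by vertical centre first and sorts each row by horizontal centre.

```python
def order_boxes(boxes, rows=9, cols=9):
    """Arrange (x, y, w, h) boxes into a rows x cols grid, top-left to bottom-right.

    Assumes exactly rows * cols detections and a roughly upright grid.
    """
    if len(boxes) != rows * cols:
        raise ValueError(f"expected {rows * cols} boxes, got {len(boxes)}")

    # Sort by vertical centre so each consecutive chunk of `cols` boxes is one row.
    by_y = sorted(boxes, key=lambda b: b[1] + b[3] / 2)

    grid = []
    for r in range(rows):
        row = by_y[r * cols:(r + 1) * cols]
        # Within a row, sort left to right by horizontal centre.
        grid.append(sorted(row, key=lambda b: b[0] + b[2] / 2))
    return grid
```

`order_boxes(detections)[r][c]` then gives the box for row `r`, column `c`, which can be cropped and passed to the digit classifier.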


Here are a number of possible traditional approaches: https://lmgtfy.com/?q=sudoku+opencv&s=g

As for YOLO, this could work without picking up extra clutter if your training set contains not just plain digits, but digits inside squares and empty squares formed by grid lines. It is quite unlikely that many digits in squares or empty squares will appear anywhere around the sudoku grid. I believe you should also include a number of cluttered newspaper pages as negative examples. If it works, it would save much of the effort the traditional methods require. And you would not need a separate digit classifier, as YOLO would do it all for you! Still, I am not sure it will be 100% accurate, which is crucial here...

If you decide to use YOLO, please do let us know your results.


Sure, I will if I succeed. But first I need help. I do not have training data, so I will have to create my own. As you said, I want to detect the whole grid directly. One row has 9 points (8 boxes), and in total I have got 81 points to detect. The major problem is that images from the internet are not all the same size, so how do I prepare the input vector? For plain MNIST all images are 28 * 28 or 32 * 32, but at that size the fit is useless. I think a 400 * 400 image might be fine, but resizing every image to 400 * 400 would distort them. Any suggestions?

( 2019-07-18 07:26:17 -0500 )
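On the distortion concern in the comment above: one common way to fit images of arbitrary size into a fixed square without changing their aspect ratio is "letterboxing", i.e. scaling uniformly and padding the remainder. The sketch below only computes the scale and padding and maps a box into the padded image. The function names and the 400-pixel target are illustrative assumptions, not anything prescribed by the thread or by a particular YOLO implementation.

```python
def letterbox_params(src_w, src_h, dst=400):
    """Return (scale, new_w, new_h, pad_x, pad_y) for fitting an image into a dst x dst square.

    The image is scaled uniformly (no distortion) and centred; the rest is padding.
    """
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = (dst - new_w) // 2, (dst - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def map_box(box, scale, pad_x, pad_y):
    """Map an (x, y, w, h) box from the original image into the letterboxed image."""
    x, y, w, h = box
    return (x * scale + pad_x, y * scale + pad_y, w * scale, h * scale)
```

For example, an 800 x 600 photo would be scaled by 0.5 to 400 x 300 and padded with 50 pixels above and below; any annotated boxes have to be mapped with the same scale and padding.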

Different input image sizes are an advantage. And you don't need a digit classifier at all, as YOLO will do everything in one go. As for the training data, sudoku boards are quite simple, so apart from searching the net you can generate numerous examples and rotate them, skew them, etc. to enrich your set. You shouldn't rely exclusively on artificial data, though.

( 2019-07-18 11:57:49 -0500 )
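When synthetic boards are rotated or skewed as suggested above, the box annotations have to be transformed along with the pixels. The following is a minimal sketch of that bookkeeping, under stated assumptions: `synthetic_cells`, `rotation_matrix`, and `transform_box` are hypothetical helpers, boxes are `(x, y, w, h)` in pixels, and the transform is a 2x3 affine matrix. The image itself would need to be warped with the same matrix (for instance with OpenCV's `cv2.warpAffine`), which is not shown here.

```python
import math


def synthetic_cells(origin=(20, 20), cell=40, n=9):
    """Boxes (x, y, w, h) for the n x n cells of an ideal, axis-aligned grid."""
    ox, oy = origin
    return [(ox + c * cell, oy + r * cell, cell, cell)
            for r in range(n) for c in range(n)]


def rotation_matrix(angle_deg, cx, cy):
    """2x3 affine matrix that rotates points by angle_deg about (cx, cy)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [[cos_a, -sin_a, cx - cos_a * cx + sin_a * cy],
            [sin_a, cos_a, cy - sin_a * cx - cos_a * cy]]


def transform_box(box, m):
    """Apply affine matrix m to a box's corners; return the enclosing axis-aligned box."""
    x, y, w, h = box
    corners = [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]
    pts = [(m[0][0] * px + m[0][1] * py + m[0][2],
            m[1][0] * px + m[1][1] * py + m[1][2]) for px, py in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
```

Because YOLO-style labels are axis-aligned, the transformed box is the bounding rectangle of the rotated cell, so large rotation angles make the labels noticeably looser than the cells themselves.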

No, no. To detect those 81 boxes I need to feed in a long vector of 81 * 4 coordinates for each of several images so that the network can train on this data. That would be a difficult and time-consuming task. Alternatively, I could mark the grid boxes of newspaper sudokus with some distinctive colour and automate the collection of pixel coordinates once the pictures are taken. Also, can I get an implementation of YOLO, something written by someone, so that I can get an idea of how it works? Everything I find on the net discusses pre-trained models and shows no code.

( 2019-07-19 13:28:49 -0500 )
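Regarding the "81 * 4 vector" in the comment above: Darknet-style YOLO training does not take one fixed-length vector per image. Each image gets a text file with one line per box, in the form `<class> <x_center> <y_center> <width> <height>`, with all four values normalised by the image width and height. The sketch below converts pixel boxes to that format. The helper names and the single class id `0` are assumptions made for illustration.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Format one (x, y, w, h) pixel box as a Darknet label line with normalised values."""
    x, y, w, h = box
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"


def label_file_text(boxes, img_w, img_h, class_id=0):
    """Join the label lines for all boxes of one image (one line per box)."""
    return "\n".join(to_yolo_line(class_id, b, img_w, img_h) for b in boxes)
```

A photo with 81 annotated cells would therefore produce a label file with 81 lines, and an image with fewer visible cells simply has fewer lines.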

It will definitely take quite some time to prepare the training data from real images. Here is how to compile and train YOLO: https://github.com/AlexeyAB/darknet Here is how to use it in OpenCV: https://www.learnopencv.com/deep-lear...

( 2019-07-20 04:15:58 -0500 )
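To give an idea of what happens after inference: when a Darknet YOLOv3 model is run through OpenCV's DNN module (`cv2.dnn.readNetFromDarknet`, then `net.forward(...)`), each output row is typically laid out as `[cx, cy, w, h, objectness, class scores...]` with coordinates normalised to the input image. The sketch below shows the usual post-processing on such rows, confidence filtering followed by non-maximum suppression, in plain Python. The row layout is an assumption to verify against your model, and the function names and thresholds are illustrative.

```python
def decode_detections(rows, img_w, img_h, conf_threshold=0.5):
    """Turn raw rows into (class_id, confidence, (x, y, w, h)) tuples in pixels.

    Assumes each row is [cx, cy, w, h, objectness, score_0, score_1, ...]
    with coordinates normalised to [0, 1].
    """
    detections = []
    for row in rows:
        cx, cy, w, h = row[:4]
        scores = row[5:]
        class_id = max(range(len(scores)), key=lambda i: scores[i])
        confidence = scores[class_id]
        if confidence < conf_threshold:
            continue
        bw, bh = w * img_w, h * img_h
        detections.append((class_id, confidence,
                           (cx * img_w - bw / 2, cy * img_h - bh / 2, bw, bh)))
    return detections


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.4):
    """Greedy non-maximum suppression: keep the most confident box among overlapping ones."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(det[2], k[2]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

OpenCV also ships its own suppression routine (`cv2.dnn.NMSBoxes`), so the hand-written `nms` here is only meant to show the idea.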
