What characteristics does opencv's k nearest neighbor algorithm use to predict?

answered 2014-03-04 11:37:51 -0600

updated 2014-03-05 02:02:04 -0600

In fact, there are two different examples in the link you give: the first works directly on the handwritten digit images, the second uses precalculated features. As the document states, in the first example each 20x20 digit image is flattened row by row into a 400-dimensional vector for training. That means there is no real feature extraction step: each image is represented by its raw intensity values (the original eigenfaces method starts from the same flattened-pixel representation). For the second example, follow the links given there to see how the features are generated. Note also that kNN is not trained the way an SVM is: the classifier simply stores the training vectors with their labels, and the real work happens at prediction time.
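
To make the flattening step concrete, here is a minimal NumPy sketch. It assumes the digits are already available as an array of 20x20 images; the random array below is only a stand-in for the real data, and the names images and samples are mine, not the tutorial's.

```python
import numpy as np

# Stand-in for 5000 digit images of 20x20 pixels (random values, NOT the real digits).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5000, 20, 20), dtype=np.uint8)

# Flatten each 20x20 image row by row into one 400-dimensional float32 vector.
# No feature extraction happens here: every element is still a raw pixel intensity.
samples = images.reshape(-1, 400).astype(np.float32)

print(samples.shape)  # (5000, 400)
```

Because the flattening is row based, element 20 of a vector is the first pixel of the image's second row.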


Comments

Sorry, should have specified it was the first example.

If only vectors are passed to the function, how are they used for training and prediction with the k nearest neighbor algorithm?

NonDescript ( 2014-03-05 15:04:40 -0600 )

In both cases, the training algorithm uses the passed data as features; it does not care what kind of features you feed in. The difference is that more discriminative features (the result of a good feature extraction method) make training more effective and lead to a more accurate classification step afterward. I think it is easier to explain with binary SVM classification: the training step generates an optimized superplan that separates the training samples (just two labels) in the feature space, and the classification step then uses that superplan to predict the label of new test patterns. With multiple labels (N labels), the training step generates several superplans for the classification step, for example N of them in a one-vs-rest scheme.

tuannhtn ( 2014-03-05 16:29:40 -0600 )

Sorry, what are superplans? I've tried googling but nothing comes up.

NonDescript ( 2014-03-05 17:19:58 -0600 )

I am sorry, because of the length restriction I could not finish the answer in one post. That explanation was about SVM, and the word is hyperplane, not superplan (it was a typo). An SVM training algorithm has to solve an optimization problem to find the hyperplanes, but the examples given use KNearest, which avoids that: training just stores the data (training samples with their true labels), and prediction finds the k smallest distances from a test digit to the training samples and decides the label from those neighbors by voting (the manual also mentions "calculating weighted sum, and so on", http://docs.opencv.org/opencv2refman.pdf). In the first example you have 10 digits (each with 500 samples, 250 used for training and 250 for testing) and k=5. So a test sample is assigned the label that occurs most often among its 5 nearest training samples.
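
A minimal NumPy sketch of that prediction step, assuming plain Euclidean distance and simple majority voting. This is not OpenCV's implementation; the function name knn_predict and the toy data are made up for illustration.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=5):
    """Label each test row by majority vote among its k nearest training rows."""
    # Squared Euclidean distance between every test row and every training row.
    diff = test_x[:, None, :] - train_x[None, :, :]
    dist = np.sum(diff * diff, axis=2)
    # Indices of the k smallest distances for each test row.
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = train_y[nearest]
    # The most frequent label among the k neighbors wins (ties go to the smaller label).
    return np.array([np.bincount(row).argmax() for row in votes])

# Toy data: two well-separated clusters in 2D instead of 400-dimensional digit vectors.
train_x = np.array([[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9]], dtype=np.float32)
train_y = np.array([0, 0, 0, 1, 1, 1])
test_x = np.array([[0.5, 0.5], [9.5, 9.5]], dtype=np.float32)

pred = knn_predict(train_x, train_y, test_x, k=3)
print(pred)  # [0 1]
```

There is nothing to "train" here: the function only needs the stored samples and labels. The full pairwise difference array is fine for toy data but memory hungry for thousands of 400-dimensional vectors, so a real run would compute distances in batches or use the library's classifier.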

tuannhtn ( 2014-03-05 17:32:38 -0600 )

For more details on kNN: http://docs.opencv.org/trunk/doc/py_tutorials/py_ml/py_knn/py_knn_understanding/py_knn_understanding.html

http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

http://blog.damiles.com/2008/11/the-basic-patter-recognition-and-classification-with-opencv/

http://www.aishack.in/2010/10/k-nearest-neighbors-in-opencv/

http://bytefish.de/blog/machine_learning_opencv/

For more details on SVM:

http://en.wikipedia.org/wiki/Support_vector_machine

http://docs.opencv.org/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.html

http://www.csie.ntu.edu.tw/~cjlin/libsvm/

tuannhtn ( 2014-03-05 17:36:28 -0600 )

Thank you very much for your help!

NonDescript ( 2014-03-05 18:04:49 -0600 )
