KMean and PCA connection

As I understand it from pattern recognition, PCA is used to remove unnecessary data from a dataset, so that when the reduced dataset is passed to k-means, the clustering has less work to do than it would on a dataset that has not been through PCA. So I could have code (pseudocode) something like this:

 assign .csv to var DATA
 PCA_DATA = PCAcompute(DATA)
 result = Kmean(PCA_DATA)
 plotToGraph(result)

Am I correct?

For almost a month now I've been looking for a sample program that imports a CSV file and then does some clustering with PCA. What I need to do is compare the k-means result on the original data against the k-means result on the PCA-reduced data, using the iris dataset.
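The pipeline in the pseudocode, plus the with/without-PCA comparison, could be sketched roughly as below. This is only a minimal illustration that assumes NumPy is available; the function names (`pca_reduce`, `kmeans`, `compare`, `contingency`) are made up for this sketch and do not come from any library, and the k-means here is a plain Lloyd's iteration rather than a tuned implementation.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X (rows = samples) onto its top principal components."""
    X_centered = X - X.mean(axis=0)  # PCA works on zero-mean columns
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every sample to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; keep it if it has none.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def compare(X, k, n_components=2):
    """Cluster X with and without PCA; return both label arrays."""
    labels_raw = kmeans(X, k)
    labels_pca = kmeans(pca_reduce(X, n_components), k)
    return labels_raw, labels_pca


def contingency(a, b, k):
    """k x k table counting samples with label i in a and label j in b."""
    table = np.zeros((k, k), dtype=int)
    np.add.at(table, (a, b), 1)
    return table
```

Assuming a hypothetical `iris.csv` whose first four columns are the numeric measurements, the data could be loaded with `X = np.loadtxt("iris.csv", delimiter=",", usecols=(0, 1, 2, 3))` and compared with `raw, reduced = compare(X, k=3)`. Cluster numbers are arbitrary, so the two results are better compared through `contingency(raw, reduced, 3)` than element by element.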

Revision 2 (retagged), updated 2014-02-28 01:54:41 -0600 by berak
