# Clustering and classifying large sparse features

In my problem the initial features are x, y and theta, each normalized to the range [0, 255].

The number of features varies from object to object.

Clustering is applied so that each cluster contains a number of features and each object belongs to multiple clusters. In the prediction stage, the clusters for each object are computed from its initial features; these cluster memberships are the new features.

Each object belongs to a maximum of 10 clusters.

The total number of clusters is 4000.

If we represent the new features as a fixed-length vector for each object, we get 4000 dimensions, which is very large for classification. Only about 10 of those features may be useful for a given object, so my features are sparse.
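To make the sparsity concrete, here is a minimal sketch of how one object could be stored as a short list of active cluster indices instead of a 4000-element vector. Only the sizes (4000 clusters, at most 10 per object) come from the description above; the helper names `to_sparse`, `to_dense` and `sparse_dot` and the sample cluster ids are made up for illustration.

```python
# Sizes taken from the question; everything else is a hypothetical illustration.
NUM_CLUSTERS = 4000
MAX_ACTIVE = 10


def to_sparse(cluster_ids):
    """Return the sorted, de-duplicated active cluster indices of one object."""
    ids = sorted(set(cluster_ids))
    if len(ids) > MAX_ACTIVE:
        raise ValueError("an object may belong to at most 10 clusters")
    if ids and (ids[0] < 0 or ids[-1] >= NUM_CLUSTERS):
        raise ValueError("cluster index out of range")
    return ids


def to_dense(sparse_ids):
    """Expand the index list into the full 4000-dimensional binary vector."""
    vec = [0] * NUM_CLUSTERS
    for i in sparse_ids:
        vec[i] = 1
    return vec


def sparse_dot(a, b):
    """Dot product of two binary vectors = number of clusters they share."""
    return len(set(a) & set(b))


obj_a = to_sparse([17, 250, 3999, 42])   # made-up cluster ids
obj_b = to_sparse([42, 250, 1000])       # made-up cluster ids
shared = sparse_dot(obj_a, obj_b)        # touches at most 10 entries per object
density = len(obj_a) / NUM_CLUSTERS      # fraction of non-zero entries
```

Comparing two objects this way looks at no more than 10 indices each, rather than all 4000 dimensions.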

My question:

Is there a way to classify these sparse features with good performance, and which classifier is suitable for them? Note: I currently use locality-sensitive hashing to classify the new 4000-dimensional features, and it is very slow.


I think you should look at dimensionality reduction for your problem, using techniques such as Fisher's Linear Discriminant Analysis (LDA) or Principal Component Analysis (PCA). They help you discover the most influential directions in your data, which you might not notice at first sight.

Good examples that use these techniques are Fisherfaces and Eigenfaces. Have a look at them!
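As a rough sketch of what PCA does here, the snippet below projects synthetic 4000-dimensional binary membership vectors onto their first 20 principal components using NumPy's SVD. The data is random and the sizes `n_objects = 200` and `n_components = 20` are arbitrary assumptions chosen for illustration; they are not values from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n_objects, n_clusters, max_active = 200, 4000, 10  # n_objects is an assumption
n_components = 20                                   # arbitrary target dimension

# Synthetic data: each object is a member of 10 randomly chosen clusters.
X = np.zeros((n_objects, n_clusters))
for i in range(n_objects):
    active = rng.choice(n_clusters, size=max_active, replace=False)
    X[i, active] = 1.0

# PCA: center the data, then take the top right singular vectors.
mean = X.mean(axis=0)
Xc = X - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:n_components]        # shape (20, 4000), orthonormal rows
X_reduced = Xc @ components.T         # shape (200, 20)

# Fraction of the total variance kept by the first 20 components.
explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
```

A new object would be reduced the same way: subtract `mean`, then multiply by `components.T`. On purely random data like this the retained variance is low; real cluster memberships are usually more structured.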


Thanks. I used PCA to reduce the dimensionality of the features and an SVM for classification, which solved my problem.

(2013-07-29 06:24:05 -0500)
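For readers who want to try the PCA + SVM combination mentioned in the comment, here is a minimal sketch using scikit-learn. It is not the commenter's actual code: the library choice, the two synthetic classes, the cluster pools, the 10 components and the linear kernel are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N_CLUSTERS, MAX_ACTIVE = 4000, 10  # sizes from the question


def make_objects(pool, count):
    """Synthetic objects: each one belongs to 10 clusters drawn from `pool`."""
    X = np.zeros((count, N_CLUSTERS))
    for i in range(count):
        X[i, rng.choice(pool, size=MAX_ACTIVE, replace=False)] = 1.0
    return X


# Two made-up classes whose objects fall into different groups of clusters.
pool_a = np.arange(0, 40)
pool_b = np.arange(2000, 2040)
X = np.vstack([make_objects(pool_a, 150), make_objects(pool_b, 150)])
y = np.array([0] * 150 + [1] * 150)

# Simple shuffled train/test split.
order = rng.permutation(len(y))
train, test = order[:200], order[200:]

# Reduce 4000 dimensions to 10 with PCA, then classify with a linear SVM.
model = make_pipeline(PCA(n_components=10, random_state=0), SVC(kernel="linear"))
model.fit(X[train], y[train])
accuracy = model.score(X[test], y[test])
```

The toy classes are trivially separable, so the score says nothing about real data. If the membership matrix is stored in a SciPy sparse format, `sklearn.decomposition.TruncatedSVD` may be a more convenient reduction step, since it accepts sparse input directly.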
