PCA in thousands of dimensions
I have 4096-dimensional VLAD codes, each one representing an image.
I have to run PCA on them without reducing their dimensionality, on learning datasets with fewer than 4096 images (e.g., the Holidays dataset, with fewer than 2k images), and obtain the rotation matrix A.
In this paper (which also explains why I need to run this PCA without dimensionality reduction), they solve the problem with the following approach:
For efficient computation of A, we compute at most the first 1,024 eigenvectors by eigendecomposition of the covariance matrix, and the remaining orthogonal complements up to D-dimensional are filled using Gram-Schmidt orthogonalization.
Now, how do I implement this using C++ libraries? I'm using OpenCV, but cv::PCA doesn't seem to offer such a strategy. Is there any way to do this?
Isn't the answer in your question? "we compute at most the first 1,024 eigenvectors by eigendecomposition of the covariance matrix"
Use the Eigen library: #include <Eigen/Eigenvalues>
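Something along these lines, as a rough untested sketch (it assumes you have already computed the covariance matrix; the function name topEigenvectors is just for illustration):

    #include <Eigen/Core>
    #include <Eigen/Eigenvalues>

    // cov: the D x D (symmetric) covariance matrix, with k <= D.
    // Returns the k leading eigenvectors, one per column, strongest first.
    Eigen::MatrixXd topEigenvectors(const Eigen::MatrixXd& cov, int k)
    {
        Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(cov);
        // SelfAdjointEigenSolver sorts eigenvalues in increasing order,
        // so the leading eigenvectors are the last k columns; reversing
        // the column order puts the strongest one first.
        return es.eigenvectors().rightCols(k).rowwise().reverse();
    }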
@LBerger thanks for your comment. However, I don't fully understand this. How do you obtain the covariance matrix that the solver takes as input? And finally, what about the D - 1024 remaining components?
https://eigen.tuxfamily.org/dox/class...
@StevenPuttermans thanks for the reference, but how could this help me? No Gram-Schmidt method is mentioned there, nor how to compute PCA in thousands of dimensions :)
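EDIT: for anyone who lands here later, this is the rough sketch I pieced together from the comments above. It is untested and reflects only my reading of the paper's two steps, not their reference code; the name buildRotation and the 1e-8 tolerance are my own choices.

    #include <Eigen/Core>
    #include <Eigen/Eigenvalues>

    // data: n x D matrix with one VLAD descriptor per row (here D = 4096).
    // Builds a D x D rotation A: the first k rows are the leading
    // eigenvectors of the covariance matrix, and the remaining rows are
    // canonical basis vectors orthonormalized by Gram-Schmidt.
    Eigen::MatrixXd buildRotation(const Eigen::MatrixXd& data, int k)
    {
        const int D = static_cast<int>(data.cols());

        // Covariance of the centered descriptors.
        Eigen::MatrixXd centered = data.rowwise() - data.colwise().mean();
        Eigen::MatrixXd cov =
            centered.transpose() * centered / static_cast<double>(data.rows());

        // Eigenvalues are returned in increasing order, so the leading
        // eigenvectors are the last k columns of eigenvectors().
        Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(cov);

        Eigen::MatrixXd A(D, D);
        for (int i = 0; i < k; ++i)
            A.row(i) = es.eigenvectors().col(D - 1 - i).transpose();

        // Fill rows k..D-1: Gram-Schmidt over the canonical basis, keeping
        // each vector not already spanned by the rows accepted so far.
        int row = k;
        for (int j = 0; j < D && row < D; ++j) {
            Eigen::VectorXd v = Eigen::VectorXd::Unit(D, j);
            v -= A.topRows(row).transpose() * (A.topRows(row) * v);
            if (v.norm() > 1e-8) {  // skip (near-)dependent candidates
                A.row(row) = v.normalized().transpose();
                ++row;
            }
        }
        return A;
    }

Since the covariance of n descriptors has rank at most n - 1, I call this with k = min(1024, n - 1). For better numerical behavior the hand-rolled loop could be replaced by a QR decomposition (e.g. Eigen's HouseholderQR), but the above follows the paper's description more literally.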