Is this matlab and opencv code the same for computing PCA?
I need to rewrite this matlab code using OpenCV. It's a simple PCA, but it's very useful because it implements the case where the number of data points is less than the number of dimensions (using the decomposition of the Gram matrix). This is the case if we want to implement dimensionality reduction for image representations in thousands of dimensions (e.g. VLAD codes), which is not present in cv::PCA
(please if you know any alternative library where this is possible please let me know) .
The function arguments are:
x
: the input data matrixnPCs
: the number of principal components that we want to keepsubtractMean
: whether we want to compute the mean vector or notU
: the resulting rotation matrixlams
: the list of eigen valuesmu
: the mean vectorUtmu
: simplyU^T*mu
The most difficult part is that (as written in the code below) x
is column-wise, while in OpenCV data are row-wise. I've rewritten the code in OpenCV considering x
row-wise.
The original code:
% [U, lams, mu, Utmu]= relja_PCA(x, nPCs, subtractMean)
%
% Author: Relja Arandjelovic ([email protected])
function [U, lams, mu, Utmu]= relja_PCA(x, nPCs, subtractMean)
% assumes x= ndims x nvectors
% automatically decides if primal or dual should be used
if nargin<3, subtractMean= true; end
nPoints= size(x,2);
nDims= size(x,1);
if nargin<2, nPCs= nDims; end
if subtractMean
mu= mean(x,2);
x= x - repmat(mu,1,size(x,2));
else
mu= zeros([size(x,1),1], class(x));
end
if nDims<=nPoints
doDual= false;
X2= double(x*x')/(nPoints-1);
else
doDual= true;
X2= double(x'*x)/(nPoints-1);
end
if nPCs < size(X2,1)
[U, L]= eigs(X2, nPCs);
else
[U, L]= eig(X2);
end
lams= diag(L);
% make sure in decreasing size
[lams, sortInd]= sort(lams,'descend');
U= U(:,sortInd);
if doDual
U= x * ( U * diag(1./sqrt(max(lams,1e-9))) / sqrt(nPoints-1) );
end
Utmu= U'*mu;
end
This is my code:
template <typename T>
void cc::relja_PCA(T &U, T &lams, T &mu, T &Utmu, T &x, size_t nPCs, bool substractMean){
int nPoints = x.rows;
int nDims = x.cols;
if(nPCs <= 0)
nPCs = x.rows;
if(substractMean){
mu = cv::Mat(1, x.cols, x.type());
for(int i=0; i<x.rows; i++)
mu += x.row(i);
mu /= x.rows;
}
else
mu = cv::Mat::zeros(1, x.cols, x.type());
bool doDual;
T X2;
if(nDims<=nPoints){
doDual = false;
cv::mulTransposed(x,X2, true); // x' * x , because row-major data
}
else{
doDual = true;
cv::mulTransposed(x, X2, false);
}
X2 /= nPoints-1;
cv::eigen(X2, U, lams);
//save only the first nPCs elements of L, which is already a vector
if(nPCs>0 && nPCs<X2.cols)
lams(cv::Range(0, nPCs), cv::Range(0, lams.cols)).copyTo(lams);
//should check if eigenvalues/vector are already sorted in decreasing size
if (doDual){
double min;
double max;
cv::minMaxLoc(lams, &min, &max);
max = max > 1e-9 ? max : 1e-9;
U = x * (U * 1.0f/std::sqrt(max) / std::sqrt(nPoints-1) );
}
Utmu ...
I always wonder why people try to reinvent the wheel: https://github.com/ameya005/VLAD-Impl...
@StevenPuttemans because your wheel doesn't implment pca and it's not parallelized ;)