Why does LDA reduction give only one dimension for two classes?

Hi,

I'm trying to reduce data from two classes with the Linear Discriminant Analysis algorithm (LDA, OpenCV documentation here).

Here is a short example of what I'm trying to accomplish:

LDA lda(num_components);
lda.compute(someData, classesLabels); //Computes LDA algorithm to find the best projection
Mat reductedData = lda.project(someData); //Reduces input data

Let's say I have 100 dimensions per sample as input and I want to get 50 after reduction. If I understand the documentation (here) correctly, num_components should be the number of dimensions kept.

However, I obtain only one dimension regardless of the number I pass to the LDA constructor. I looked at the LDA source code (here), which explains this behaviour:

...
// number of unique labels
int C = (int)num2label.size();
...
...
// clip number of components to be a valid number
if ((_num_components <= 0) || (_num_components > (C - 1))) {
    _num_components = (C - 1);
}
...
_eigenvalues = Mat(_eigenvalues, Range::all(), Range(0, _num_components));
_eigenvectors = Mat(_eigenvectors, Range::all(), Range(0, _num_components));
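To make the effect of that clipping concrete, here is a minimal sketch in plain C++ (no OpenCV). The helper name effectiveComponents is hypothetical and only mirrors the condition quoted above; it is not part of the library.

```cpp
#include <cassert>

// Hypothetical helper (not an OpenCV function): mirrors the clipping rule
// quoted above. Any request outside [1, C - 1] falls back to C - 1.
int effectiveComponents(int requested, int numClasses) {
    assert(numClasses >= 2);  // assumption: at least two classes are present
    const int maxComponents = numClasses - 1;
    if (requested <= 0 || requested > maxComponents) {
        return maxComponents;
    }
    return requested;
}
```

With two classes, effectiveComponents(50, 2) yields 1, which matches the single dimension observed after projection.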

Here are my questions:

• The behaviour described in the documentation and the behaviour of the code seem to differ. Is that expected? If so, could someone explain why the number of output dimensions is linked to the number of classes?
• How should I proceed to get more than one dimension with two classes?

"Let's say I've 100 dimensions per sample as input and I want to get 50 after reduction" -- imho, you want PCA, not LDA

From what I have read, LDA can be used as a reduction method as well (e.g. here).

Wikipedia also seems to say so:

The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.

(source)

I'm not looking for PCA: I have already used it and would like to compare it with LDA reduction.

LDA is a discriminant analysis. For two classes and a given sample (someData), it returns the probability that the sample belongs to class 1 or 2.

I think you are looking for PCA (principal component analysis). It gives you the projection onto the principal components of the parameter space (all 100 dimensions, ordered from the most important to the least important).

The code to use the PCA class is the same, only you'll have one class of data.


I might have misunderstood how this works then. Please correct me if I'm wrong: in both cases (PCA and LDA), the algorithm tries to find a projection that reduces the data as effectively as possible. From what I have read (e.g. here), LDA can be used directly to reduce data and might be more accurate than PCA.

What you're saying surprises me. The returned vector doesn't seem to contain probabilities. Here is an example of a vector returned after projection:

[1063.72663796166, 1100.15457383534, ..., -1102.669283385719, -1072.086030509124]

Are you sure the lda.project(..) function is used to classify data?

I'm positive, I already have used PCA reduction and would like to compare it to LDA reduction.

On slide 2 of the linked presentation, LDA is explained well: it computes the line y that best separates the two classes (figure 2). Each value can then be projected onto this line, which gives you an indicator of whether the value is closer to class 1 or 2. That reduces the dimensionality of your data to 1. The further dimensions (perpendicular vectors) have no meaning, as they don't separate the data.
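The limit of one dimension for two classes can also be seen from the standard Fisher formulation. The notation below (S_B, S_W, mu_c, N_c) is the usual textbook one and is an assumption here, not something taken from the OpenCV source.

```latex
% Between-class scatter for C classes with N_c samples each,
% class means \mu_c and overall mean \mu:
S_B = \sum_{c=1}^{C} N_c \, (\mu_c - \mu)(\mu_c - \mu)^{T}

% The C difference vectors satisfy one linear constraint,
\sum_{c=1}^{C} N_c \, (\mu_c - \mu) = 0
\quad\Longrightarrow\quad
\operatorname{rank}(S_B) \le C - 1

% so S_W^{-1} S_B w = \lambda w has at most C - 1 non-zero eigenvalues.
% For C = 2 (with N = N_1 + N_2):
S_B = \frac{N_1 N_2}{N} \, (\mu_1 - \mu_2)(\mu_1 - \mu_2)^{T},
\qquad
w \propto S_W^{-1} (\mu_1 - \mu_2)
```

Only the eigenvectors with non-zero eigenvalues carry class-separating information, which is consistent with the clipping to C - 1 components.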

See slide 3: PCA will give you a principal component parallel to the X axis, as it best separates the means of the classes. The second component will be perpendicular to the first, as your data has less variance in that direction. LDA will give you a mostly vertical discriminant vector, as it separates the two classes. However, the other axes don't separate the data.
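The contrast described here can be reproduced on toy 2D data. The sketch below is plain C++ rather than OpenCV code; the types Vec2 and Mat2 and the functions fisherDirection and principalAxis are hypothetical names introduced for illustration. It uses the two-class Fisher direction w proportional to Sw^-1 (m1 - m0) and the dominant axis of the pooled scatter matrix.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec2 = std::array<double, 2>;
struct Mat2 { double a, b, c, d; };  // [[a, b], [c, d]]

// Mean of a non-empty set of 2D points.
Vec2 mean(const std::vector<Vec2>& pts) {
    assert(!pts.empty());
    Vec2 m{0.0, 0.0};
    for (const Vec2& p : pts) { m[0] += p[0]; m[1] += p[1]; }
    const double n = static_cast<double>(pts.size());
    return {m[0] / n, m[1] / n};
}

// Un-normalised scatter: sum of (p - m)(p - m)^T over all points.
Mat2 scatter(const std::vector<Vec2>& pts, const Vec2& m) {
    Mat2 s{0.0, 0.0, 0.0, 0.0};
    for (const Vec2& p : pts) {
        const double dx = p[0] - m[0], dy = p[1] - m[1];
        s.a += dx * dx; s.b += dx * dy; s.c += dx * dy; s.d += dy * dy;
    }
    return s;
}

// Two-class Fisher direction, w proportional to Sw^-1 (m1 - m0), unit length.
Vec2 fisherDirection(const std::vector<Vec2>& c0, const std::vector<Vec2>& c1) {
    const Vec2 m0 = mean(c0), m1 = mean(c1);
    const Mat2 s0 = scatter(c0, m0), s1 = scatter(c1, m1);
    const Mat2 sw{s0.a + s1.a, s0.b + s1.b, s0.c + s1.c, s0.d + s1.d};
    const double det = sw.a * sw.d - sw.b * sw.c;
    assert(std::abs(det) > 1e-12);  // within-class scatter must be invertible
    const double dx = m1[0] - m0[0], dy = m1[1] - m0[1];
    // Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    const double wx = (sw.d * dx - sw.b * dy) / det;
    const double wy = (-sw.c * dx + sw.a * dy) / det;
    const double norm = std::hypot(wx, wy);
    assert(norm > 0.0);
    return {wx / norm, wy / norm};
}

// First principal axis of the pooled data; class labels are ignored.
Vec2 principalAxis(const std::vector<Vec2>& all) {
    const Mat2 s = scatter(all, mean(all));
    // Orientation of the dominant eigenvector of a symmetric 2x2 matrix.
    const double theta = 0.5 * std::atan2(2.0 * s.b, s.a - s.d);
    return {std::cos(theta), std::sin(theta)};
}
```

On data where both classes are spread widely along x and offset slightly along y, fisherDirection comes out almost vertical while principalAxis comes out almost horizontal. In 2D with two classes there is only one such Fisher direction to keep.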

It seems I had misunderstood LDA then. If I'm correct this time, LDA will always project data on (a) line(s). I thought it would be able to project it on a subspace (not especially on a 1D space).

Here is what I would like to accomplish: "Reducing data by projecting it onto a subspace (like PCA does) but using the Fisher linear discriminant."

Maximising data variance without relying on class labels works. However, depending on the situation, using the Fisher discriminant might be a better way to find a projection subspace, especially since it takes advantage of the class labels.

It seems I shouldn't use LDA itself but only the same principle (Fisher discriminant) to reduce data. Do you know if any algorithm fitting this requirement has already been implemented in OpenCV?

Yes, it's possible.

As the LDA doesn't need the mean of the data, you have to compute it yourself. It's easy.

Now use this vector and the eigenvalues and eigenvectors from the LDA to create a PCA object.
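As a rough illustration of the arithmetic behind this suggestion (compute the mean, then project the centered samples onto the eigenvectors), here is a sketch in plain C++. It does not use the OpenCV PCA or LDA classes; the names Row, Matrix, columnMean and projectCentered are hypothetical, and the layout with one eigenvector per column is an assumption based on the column range kept in the source quoted in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<double>;
using Matrix = std::vector<Row>;  // row-major storage

// Column-wise mean of the samples (one sample per row).
Row columnMean(const Matrix& samples) {
    assert(!samples.empty());
    Row mean(samples[0].size(), 0.0);
    for (const Row& s : samples) {
        assert(s.size() == mean.size());
        for (std::size_t j = 0; j < s.size(); ++j) mean[j] += s[j];
    }
    for (double& v : mean) v /= static_cast<double>(samples.size());
    return mean;
}

// y = (x - mean) * W for every sample x, where W is d x k and holds
// one basis vector (e.g. one eigenvector) per column.
Matrix projectCentered(const Matrix& samples, const Row& mean, const Matrix& basis) {
    assert(!basis.empty() && basis.size() == mean.size());
    const std::size_t k = basis[0].size();
    Matrix out;
    out.reserve(samples.size());
    for (const Row& s : samples) {
        assert(s.size() == mean.size());
        Row y(k, 0.0);
        for (std::size_t i = 0; i < s.size(); ++i) {
            assert(basis[i].size() == k);
            for (std::size_t j = 0; j < k; ++j) {
                y[j] += (s[i] - mean[i]) * basis[i][j];
            }
        }
        out.push_back(y);
    }
    return out;
}
```

The output has one row per sample and k columns, so the number of output dimensions is fixed entirely by how many basis vectors are supplied.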

However this won't really reduce the dimensionality of your space. Back to the presentation you linked, slide 3: if you project those variables on the LDA vectors, you'll get a first component (vertical vector) that separates the classes, but it's quite short. The second component (a horizontal one) is much longer.

Slide 3 of the presentation is not representative of how LDA should work but slide 4 is.

Thank you for the advice, I'm going to try it. And thank you for your answers, they were very useful. Since my question about LDA has been answered, I'm closing this thread.

Stats

Asked: 2016-04-11 08:03:10 -0500

Seen: 1,790 times

Last updated: Apr 11 '16