Within-classes scatter calculation in lda.cpp

asked 2017-07-09 11:13:01 -0600

sree

updated 2017-07-09 11:15:43 -0600

Hi,

Below is an excerpt from lda.cpp.

    // calculate sums
    for (int i = 0; i < N; i++) {
        Mat instance = data.row(i);
        int classIdx = mapped_labels[i];
        add(meanTotal, instance, meanTotal);
        add(meanClass[classIdx], instance, meanClass[classIdx]);
        numClass[classIdx]++;
    }
    // calculate total mean
    meanTotal.convertTo(meanTotal, meanTotal.type(), 1.0 / static_cast<double> (N));
    // calculate class means
    for (int i = 0; i < C; i++) {
        meanClass[i].convertTo(meanClass[i], meanClass[i].type(), 1.0 / static_cast<double> (numClass[i]));
    }
    // subtract class means
    for (int i = 0; i < N; i++) {
        int classIdx = mapped_labels[i];
        Mat instance = data.row(i);
        subtract(instance, meanClass[classIdx], instance);
    }
    // calculate within-classes scatter
    Mat Sw = Mat::zeros(D, D, data.type());
    mulTransposed(data, Sw, true);

As you can see, the last two lines calculate the within-class scatter. Is this correct? As I understand it, the within-class scatter is computed from the differences between each class's samples and that class's mean. But here mulTransposed is applied to data, which appears to hold the samples as they were before the class means were subtracted. Should it be applied to instance instead of data? Please correct me if I am wrong; I am new to this.

Thanks!


1 answer


answered 2017-07-09 23:52:55 -0600

berak

updated 2017-07-10 00:14:51 -0600

If you look at these lines:

// subtract class means
for (int i = 0; i < N; i++) {
    int classIdx = mapped_labels[i];
    Mat instance = data.row(i);
    subtract(instance, meanClass[classIdx], instance);
}

instance is a "shallow" copy: its data pointer points into the "enclosing" matrix, so the subtraction writes into the corresponding row of data. In fact, the whole data matrix gets modified, not only instance (again, it is NOT a deep copy).

So mulTransposed is applied to the "mean-corrected" data matrix.
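
The aliasing described here can be illustrated without OpenCV. The following is only a sketch in plain C++: `Matrix`, `subtractClassMeans`, and `transposeTimesSelf` are hypothetical names made up for this illustration, not code from lda.cpp, and `Matrix::row` returns a raw pointer into the matrix storage as a stand-in for the way a row header shares its parent's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a row-major matrix; this is NOT cv::Mat.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> values;  // rows * cols entries, row-major

    // Returns a pointer into the matrix's own storage (a view, no copy).
    double* row(std::size_t i) { return values.data() + i * cols; }
    const double* row(std::size_t i) const { return values.data() + i * cols; }
};

// Subtracts each sample's class mean through a row view.
// Because the view aliases the storage, `data` itself is modified in place.
void subtractClassMeans(Matrix& data, const std::vector<int>& labels,
                        const Matrix& classMeans) {
    assert(labels.size() == data.rows);
    for (std::size_t i = 0; i < data.rows; ++i) {
        double* instance = data.row(i);
        const double* mean = classMeans.row(static_cast<std::size_t>(labels[i]));
        for (std::size_t j = 0; j < data.cols; ++j) {
            instance[j] -= mean[j];
        }
    }
}

// Computes data^T * data (a cols x cols matrix), i.e. the product that
// mulTransposed(data, Sw, true) is asked for in the excerpt.
Matrix transposeTimesSelf(const Matrix& data) {
    Matrix out{data.cols, data.cols,
               std::vector<double>(data.cols * data.cols, 0.0)};
    for (std::size_t i = 0; i < data.rows; ++i) {
        const double* sample = data.row(i);
        for (std::size_t a = 0; a < data.cols; ++a) {
            for (std::size_t b = 0; b < data.cols; ++b) {
                out.values[a * data.cols + b] += sample[a] * sample[b];
            }
        }
    }
    return out;
}
```

Calling `subtractClassMeans` and then `transposeTimesSelf` on the same `Matrix` mirrors the order of operations in the excerpt: the second call already sees the mean-corrected rows, because the first call never worked on a copy.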


Comments

Thank you berak, I got it. One more thing regarding the above calculation: should we arrange the rows of the "mean-corrected" data matrix so that the classes are separated before applying the mulTransposed operation? My feeling is that if we don't arrange the rows of the data matrix, there will be between-class products in the calculation. Is this correct?

sree (2017-07-10 13:45:44 -0600)
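
One way to reason about the row-ordering question is to write the product out. The notation below is an assumption introduced for this note and does not appear in lda.cpp: $\tilde{X}$ is the N x D mean-corrected data matrix with one sample per row, $\tilde{x}_i = x_i - \mu_{y_i}$ is its i-th row written as a column vector, $y_i$ is the class of sample i, and $\mu_c$ is the mean of class c.

```latex
\tilde{X}^{\top}\tilde{X}
  = \sum_{i=1}^{N} \tilde{x}_i \tilde{x}_i^{\top}
  = \sum_{c=1}^{C} \; \sum_{i \,:\, y_i = c} (x_i - \mu_c)(x_i - \mu_c)^{\top}
  = S_w ,
\qquad
\bigl(\tilde{X}^{\top}\tilde{X}\bigr)_{ab} = \sum_{i=1}^{N} \tilde{X}_{ia}\,\tilde{X}_{ib}
```

Each entry of the product multiplies two features of the same sample and then sums over samples, so no term pairs rows from different samples. Under these assumptions the sum does not depend on the order of the rows, and grouping the rows by class would not change the result.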

