Within-classes scatter calculation in lda.cpp

Hi,

Below is the partial code from lda.cpp.

    // calculate sums
    for (int i = 0; i < N; i++) {
        Mat instance = data.row(i);
        int classIdx = mapped_labels[i];
        add(meanTotal, instance, meanTotal);
        add(meanClass[classIdx], instance, meanClass[classIdx]);
        numClass[classIdx]++;
    }
    // calculate total mean
    meanTotal.convertTo(meanTotal, meanTotal.type(), 1.0 / static_cast<double> (N));
    // calculate class means
    for (int i = 0; i < C; i++) {
        meanClass[i].convertTo(meanClass[i], meanClass[i].type(), 1.0 / static_cast<double> (numClass[i]));
    }
    // subtract class means
    for (int i = 0; i < N; i++) {
        int classIdx = mapped_labels[i];
        Mat instance = data.row(i);
        subtract(instance, meanClass[classIdx], instance);
    }
    // calculate within-classes scatter
    Mat Sw = Mat::zeros(D, D, data.type());
    mulTransposed(data, Sw, true);

As you can see, the last two statements are the ones that calculate the within-class scatter. My question is: is this correct? As I understand it, the within-class scatter is computed from the differences between each sample and its class mean. But here mulTransposed is applied to data, which appears to hold the samples as they were before the class means were subtracted. Should it be applied to instance instead of data? Please correct me if I am wrong; I am new to this.
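One detail that may bear on this: cv::Mat::row is documented as returning a new header that shares its data with the original matrix, not a copy. If so, subtract(instance, meanClass[classIdx], instance) would write the centered values back into data itself. The sketch below shows that idea in plain C++ without OpenCV. The Matrix struct and the withinClassScatter function are hypothetical stand-ins invented for illustration, not part of lda.cpp or the OpenCV API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Mat: a row-major matrix in one flat buffer.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> values;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), values(r * c, 0.0) {}
    // Returns a pointer into the shared buffer: a view of row i, not a copy.
    double* row(std::size_t i) { return &values[i * cols]; }
    const double* row(std::size_t i) const { return &values[i * cols]; }
};

// Hypothetical helper: centers every sample on its class mean IN PLACE,
// then returns data^T * data (a D x D matrix) computed from the centered rows.
Matrix withinClassScatter(Matrix& data, const std::vector<int>& labels, int numClasses) {
    assert(labels.size() == data.rows);
    const std::size_t N = data.rows, D = data.cols;
    const std::size_t C = static_cast<std::size_t>(numClasses);

    Matrix meanClass(C, D);
    std::vector<int> numClass(C, 0);

    // Accumulate per-class sums and counts.
    for (std::size_t i = 0; i < N; i++) {
        const double* instance = data.row(i);
        const std::size_t classIdx = static_cast<std::size_t>(labels[i]);
        for (std::size_t d = 0; d < D; d++) meanClass.row(classIdx)[d] += instance[d];
        numClass[classIdx]++;
    }
    // Turn the sums into means (empty classes are left at zero).
    for (std::size_t c = 0; c < C; c++) {
        if (numClass[c] == 0) continue;
        for (std::size_t d = 0; d < D; d++) meanClass.row(c)[d] /= numClass[c];
    }
    // Subtract class means through the row view; this modifies data itself.
    for (std::size_t i = 0; i < N; i++) {
        double* instance = data.row(i);
        const std::size_t classIdx = static_cast<std::size_t>(labels[i]);
        for (std::size_t d = 0; d < D; d++) instance[d] -= meanClass.row(classIdx)[d];
    }
    // Sw = data^T * data, summed one centered row at a time.
    Matrix Sw(D, D);
    for (std::size_t i = 0; i < N; i++) {
        const double* instance = data.row(i);
        for (std::size_t a = 0; a < D; a++)
            for (std::size_t b = 0; b < D; b++)
                Sw.row(a)[b] += instance[a] * instance[b];
    }
    return Sw;
}
```

In this sketch, the last loop reads from data and not from any single instance, yet it still sees centered values, because every instance pointer was a view into the same buffer. Whether the OpenCV code relies on exactly this behavior is worth confirming against the Mat::row documentation.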

Thanks!
