Expectation-maximization with a Gaussian mixture model on a 1D histogram

asked 2020-04-06 05:00:36 -0600

jcb

Hi all. I would like to detect a mixture of 2 Gaussians (clusters) in a 1D histogram. The histogram has 180 bins containing double values normalized to the range 0..1. It has a first peak at bin 25 (value = 0.2) and a second peak at bin 150 (value = 0.1).

[image: the input histogram]

I thought about using the Expectation-Maximization class (cv::ml::EM), specifying 2 clusters and passing it a 180x1 Mat containing the histogram values (values between 0 and 1). I was expecting to get:

  • mean values = 25 and 150
  • the first half of the histogram labelled '0'
  • the second half of the histogram labelled '1'

Instead I got:

  • mean values = 0.006 and 0.077
  • the first and last thirds of the histogram labelled '1'
  • the middle third of the histogram labelled '0'

(see image below '1'=white, '0'=black):

[image: label assigned to each histogram bin]

Here is my code:

cv::Ptr<cv::ml::EM> gmm = cv::ml::EM::create();
gmm->setClustersNumber(2);
gmm->setCovarianceMatrixType(cv::ml::EM::COV_MAT_DIAGONAL);
cv::TermCriteria term(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, cv::ml::EM::DEFAULT_MAX_ITERS, FLT_EPSILON);
gmm->setTermCriteria(term);
cv::Mat matSamples = cv::Mat(180, 1, CV_64F);
for (int i=0; i < 180; i++)
    matSamples.at<double>(i) = vecHisto[i];
cv::Mat matLogs, matLabels, matProbs, matWeights, matMeans;
bool trained = gmm->trainEM(matSamples, matLogs, matLabels, matProbs);
matWeights = gmm->getWeights();
matMeans = gmm->getMeans();

I've also tried with more clusters, but the EM class still failed to separate the 2 Gaussians into two different labels. I've uploaded a file containing my input samples here: https://1fichier.com/?ot3vpg71pbx3flxiurza

What am I doing wrong? What is the correct way to use the EM class with this kind of input data?
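One possible reading of the results above: EM treats each of the 180 rows as an observation, so it clusters the bin heights (values in 0..1) rather than the bin positions. That would explain means of 0.006 and 0.077, and why bins with similar heights share a label. To illustrate the alternative, here is a minimal standalone sketch (standard C++ only, not the cv::ml::EM implementation) that fits two 1D Gaussians using the bin index as the sample and the histogram value as its weight. The names Gmm2, fitHistogramGmm2 and labelOf, the initial mean guesses, and the starting variance of 100 are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Parameters of a two-component 1D Gaussian mixture (hypothetical helper type).
struct Gmm2 {
    double weight[2];
    double mean[2];
    double var[2];
};

// Weighted EM on a histogram: sample = bin index, sample weight = bin value.
// init0 / init1 are rough initial guesses for the two means (assumed inputs).
Gmm2 fitHistogramGmm2(const std::vector<double>& histo,
                      double init0, double init1, int maxIters = 200)
{
    assert(maxIters > 0);
    const double kPi = 3.14159265358979323846;
    const std::size_t n = histo.size();
    Gmm2 g = {{0.5, 0.5}, {init0, init1}, {100.0, 100.0}};

    double total = 0.0;
    for (double h : histo) total += h;
    if (n == 0 || total <= 0.0) return g;  // nothing to fit

    std::vector<double> resp0(n);  // responsibility of component 0 per bin
    for (int it = 0; it < maxIters; ++it) {
        // E-step: posterior probability that bin i belongs to component 0.
        for (std::size_t i = 0; i < n; ++i) {
            double p[2];
            for (int k = 0; k < 2; ++k) {
                const double d = static_cast<double>(i) - g.mean[k];
                p[k] = g.weight[k] * std::exp(-0.5 * d * d / g.var[k])
                       / std::sqrt(2.0 * kPi * g.var[k]);
            }
            const double s = p[0] + p[1];
            resp0[i] = (s > 0.0) ? p[0] / s : 0.5;
        }
        // M-step: re-estimate weight, mean and variance from weighted bins.
        const double prev0 = g.mean[0], prev1 = g.mean[1];
        for (int k = 0; k < 2; ++k) {
            double sw = 0.0, sx = 0.0, sxx = 0.0;
            for (std::size_t i = 0; i < n; ++i) {
                const double w = histo[i] * (k == 0 ? resp0[i] : 1.0 - resp0[i]);
                sw += w;
                sx += w * static_cast<double>(i);
            }
            if (sw <= 0.0) continue;  // empty component: keep old parameters
            const double mean = sx / sw;
            for (std::size_t i = 0; i < n; ++i) {
                const double w = histo[i] * (k == 0 ? resp0[i] : 1.0 - resp0[i]);
                const double d = static_cast<double>(i) - mean;
                sxx += w * d * d;
            }
            g.weight[k] = sw / total;
            g.mean[k] = mean;
            g.var[k] = std::max(sxx / sw, 1e-3);  // floor avoids a zero variance
        }
        // Stop once both means have effectively stopped moving.
        if (std::fabs(g.mean[0] - prev0) + std::fabs(g.mean[1] - prev1) < 1e-9)
            break;
    }
    return g;
}

// Label (0 or 1) of the component with the higher weighted log-density at x.
int labelOf(const Gmm2& g, double x)
{
    double score[2];
    for (int k = 0; k < 2; ++k) {
        const double d = x - g.mean[k];
        score[k] = std::log(g.weight[k]) - 0.5 * std::log(g.var[k])
                   - 0.5 * d * d / g.var[k];
    }
    return score[1] > score[0] ? 1 : 0;
}
```

With the histogram described above, a call such as fitHistogramGmm2(vecHisto, 45.0, 135.0) would be expected to return means near the two peaks, and labelOf would then split the bins between them. If sticking with cv::ml::EM, the analogous idea would presumably be to build matSamples from bin indices, each repeated in proportion to its histogram value, instead of from the histogram values themselves.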

Thanks for any help.
