Expectation Maximization Prediction Issues

asked 2018-12-09 07:46:01 -0500

Tarcisioflima

I am trying to segment a leaf from the background so that I can identify the type of disease affecting the leaf. I started with a small dataset taken in a controlled environment; later I found a new dataset that contains more images but is less controlled.

Controlled Environment

New Dataset Img Example

The code works very well for the first dataset, but not for the second. The code is below:


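import numpy as np

def train_em(samples, n_clusters):
    print('[Start training EM]')

    em = ...  # the right-hand side of this assignment is missing from the original post

    print('[Done training EM]')
    return em

def predict(em, image):
    # Cache the label of each (channel 0, channel 1) pair so that every
    # distinct color is passed to em.predict only once.
    pixel_labels = {}
    mask = []
    height, width, _ = image.shape
    for h in range(height):
        for w in range(width):
            key = (int(image[h, w, 0]), int(image[h, w, 1]))
            if key not in pixel_labels:
                _, probs = em.predict(np.float32([key]))
                # Label 1 when the first cluster is at least as likely as the second.
                pixel_labels[key] = 1 if probs[0][0] >= probs[0][1] else 0
            mask.append(pixel_labels[key])
    return np.array(mask).reshape(height, width).astype('uint8')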

Could someone point me in the right direction?


1 answer


answered 2018-12-09 14:07:33 -0500

kbarni

Expectation maximization, like most machine learning methods, learns to make decisions from the training data. So unfortunately it won't work well on data that differs from what it was trained on.

Your model learns that the RGB color of a healthy leaf is something like 140/160/80 +/-10 (I'm simplifying). On the second image, the color is around 50/80/60, so it is far from fitting the learned model.
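To make the mismatch concrete, here is a minimal sketch. It assumes the learned "healthy leaf" cluster is a single Gaussian with mean (140, 160, 80), a standard deviation of 10 per channel, and independent channels, which is a simplification of what EM actually fits. The function name `std_distance` and the "similar color" pixel are hypothetical. It measures how many standard deviations a color lies from the cluster center.

```python
import math

def std_distance(pixel, mean, std):
    """Distance from `mean` in units of standard deviation (diagonal covariance)."""
    return math.sqrt(sum(((p - m) / s) ** 2 for p, m, s in zip(pixel, mean, std)))

# Values taken from the simplified example in the answer.
learned_mean = (140, 160, 80)
learned_std = (10, 10, 10)

# A color close to the training data versus the color seen in the new dataset.
in_training_range = std_distance((145, 155, 85), learned_mean, learned_std)
new_dataset_color = std_distance((50, 80, 60), learned_mean, learned_std)

print(f"similar color:     {in_training_range:.2f} sigma")
print(f"new dataset color: {new_dataset_color:.2f} sigma")
```

Under these assumptions the new color lies roughly 12 standard deviations from the cluster center, so the model assigns it an almost negligible likelihood of being healthy leaf.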

There are two solutions:

  • Use the same conditions for learning and predicting. In natural conditions this is quite hard, as conditions change often and the leaves vary a great deal.
  • Try to find other descriptors for training/prediction that are invariant to the conditions and that discriminate well between the healthy and diseased parts of the leaf. Texture is a good starting point (Haralick descriptors, Gabor filters, tensors, wavelets, etc.)
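As a rough illustration of the texture idea, the sketch below computes one Haralick-style feature, the contrast of a gray-level co-occurrence matrix, in plain Python. The function name `glcm_contrast`, the tiny patches, and the uniform brightness offset used as a stand-in for a lighting change are all assumptions made for the example; a real pipeline would more likely use a library implementation and several features.

```python
from collections import Counter

def glcm_contrast(gray, dx=1, dy=0):
    """Contrast of the gray-level co-occurrence matrix for offset (dx, dy).

    `gray` is a list of rows of integer gray levels.
    """
    pairs = Counter()
    height, width = len(gray), len(gray[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                pairs[(gray[y][x], gray[ny][nx])] += 1
    total = sum(pairs.values())
    # Contrast = sum over level pairs (i, j) of P(i, j) * (i - j)^2
    return sum(count / total * (i - j) ** 2 for (i, j), count in pairs.items())

smooth = [[10, 10, 11, 11],
          [10, 11, 11, 10],
          [11, 11, 10, 10]]
spotted = [[10, 40, 10, 40],
           [40, 10, 40, 10],
           [10, 40, 10, 40]]
# The same spotted patch under a uniform brightness offset.
spotted_darker = [[v - 5 for v in row] for row in spotted]

contrast_smooth = glcm_contrast(smooth)
contrast_spotted = glcm_contrast(spotted)
contrast_spotted_darker = glcm_contrast(spotted_darker)
print(contrast_smooth, contrast_spotted, contrast_spotted_darker)
```

Because contrast depends only on differences between neighboring gray levels, the offset patch gives the same value as the original, while the smooth and spotted patches differ strongly. Real lighting changes are not purely additive, so this invariance holds only approximately in practice.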


@kbarni Well, I'm not sure about it as I'm retraining it every time. Also, shadows are an issue here.

Tarcisioflima (2018-12-09 16:26:12 -0500)

As I said, color (esp. RGB color space) varies a lot with different conditions.

You need a more robust descriptor, and texture is a good candidate.

You can also try other color spaces such as Lab or HSV (just add a cvtColor at the beginning of your algorithm). They are not as robust as texture, but better than RGB.

I have a paper about this but it's not published yet.

kbarni (2018-12-10 03:25:58 -0500)
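The color-space suggestion can be tried on a single pixel. The sketch below uses Python's standard `colorsys` module in place of OpenCV's cvtColor, purely to keep the example dependency-free. The leaf color and the "half the light" model, which scales all three channels uniformly, are illustrative assumptions, and `to_hsv` is a hypothetical helper.

```python
import colorsys

def to_hsv(rgb_255):
    """Convert an (R, G, B) tuple in 0..255 to (H, S, V), each in 0..1."""
    r, g, b = (c / 255.0 for c in rgb_255)
    return colorsys.rgb_to_hsv(r, g, b)

bright_leaf = (140, 160, 80)                    # illustrative leaf color
dim_leaf = tuple(c * 0.5 for c in bright_leaf)  # same surface, half the light (idealized)

h1, s1, v1 = to_hsv(bright_leaf)
h2, s2, v2 = to_hsv(dim_leaf)
print(f"bright: H={h1:.3f} S={s1:.3f} V={v1:.3f}")
print(f"dim:    H={h2:.3f} S={s2:.3f} V={v2:.3f}")
```

Under this idealized model H and S stay the same and only V changes. Channels that carry brightness, such as V, still follow the lighting, so which channels are fed to the model matters as much as the choice of color space. Real shadows can also shift hue and saturation, so this is only a first approximation.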

kbarni, did you deal with shadows in your paper? I'm trying to find a way to avoid them.

Tarcisioflima (2018-12-10 16:51:35 -0500)

No. We just used a lighting-invariant model. It works well on images taken in non-controlled conditions (in the field); we didn't run any laboratory tests.

kbarni (2018-12-11 11:48:35 -0500)

Lighting-invariant model? Do you have any material about it?

Tarcisioflima (2018-12-12 18:04:39 -0500)

Please don't make me repeat myself again! Texture descriptors (see above) are quite robust to varying light conditions; for the color, use the a* and b* channels from the Lab color space (as it separates the luminosity and color information).

kbarni (2018-12-13 09:53:51 -0500)
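The a*/b* suggestion can be sketched as follows. This is a minimal, hand-written conversion from 8-bit sRGB to CIE Lab with a D65 white point, included only to show what dropping the lightness channel looks like. It is not OpenCV's implementation, which scales the channels differently for 8-bit images. The function names and pixel values are illustrative assumptions.

```python
def srgb_to_lab(rgb_255):
    """Convert an 8-bit sRGB (R, G, B) tuple to CIE L*a*b* (D65 white point)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    r, g, b = (to_linear(c) for c in rgb_255)
    # Linear sRGB -> XYZ, then normalize by the D65 reference white.
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883
    fx, fy, fz = f(x), f(y), f(z)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def chroma_features(rgb_255):
    """Keep only (a*, b*) as the feature vector, dropping lightness L*."""
    _, a, b = srgb_to_lab(rgb_255)
    return a, b

lit = (140, 160, 80)    # illustrative leaf color in good light
shaded = (70, 80, 40)   # the same color with halved sRGB values (idealized shadow)

for name, px in (("lit", lit), ("shaded", shaded)):
    L, a, b = srgb_to_lab(px)
    print(f"{name}: L*={L:.1f} a*={a:.1f} b*={b:.1f}")
```

In this idealized case a* and b* still shift between the lit and shaded pixel, but by less than L*, so dropping L* removes the channel that reacts most to the lighting change. It is a mitigation and not a guarantee, since real shadows can change the color as well as the brightness.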

I'm not using RGB; I was already using just S and V from HSV to train it. I will take a look at the Lab color space! Thank you

Tarcisioflima (2018-12-13 10:03:11 -0500)

Did you publish the paper?

Tarcisioflima (2020-02-11 11:44:38 -0500)