Hello everyone.
I'm trying to train a Haar classifier to detect trees in satellite images. Generating negative samples is fairly easy (it is enough to crop regions containing streets or buildings with no trees), but I find it difficult to generate positive samples.
I've read (on this forum, too) that each positive sample should be cropped so that it contains only the desired object (a tree, in my case); however, this is hard to achieve with satellite images because:
- the images are downloaded from Bing at different zoom levels, so cropping trees produces samples of different sizes
- since satellite images show aerial views of cities, when I crop a tree I cannot isolate it from a background containing pieces of streets or parts of other trees (this happens where trees are clustered, as in parks or small green areas).
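To illustrate the zoom-level issue in the first point, here is a minimal sketch of how crops taken at different zoom levels could be brought to a common scale. It assumes the imagery follows the standard Web Mercator tile scheme with 256-pixel tiles, where each zoom level doubles the map width in pixels. The function names are hypothetical and only meant for illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 equatorial radius used by Web Mercator
TILE_SIZE_PX = 256          # assumed tile size


def ground_resolution(latitude_deg, zoom_level):
    """Approximate metres per pixel at a given latitude and zoom level."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    map_width_px = TILE_SIZE_PX * 2 ** zoom_level
    return math.cos(math.radians(latitude_deg)) * circumference / map_width_px


def rescale_factor(source_zoom, target_zoom):
    """Resize factor that brings a crop taken at source_zoom to the scale
    of target_zoom (same latitude assumed)."""
    return 2.0 ** (target_zoom - source_zoom)
```

With this, a crop taken at zoom 18 would be enlarged by a factor of 2 to match crops taken at zoom 19, so that a tree of a given physical size spans roughly the same number of pixels in every sample.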
I've tried generating a number of random square samples from a satellite image. I chose the square size so that, on average, a square contains almost an entire tree. Then I went through the samples one by one and sorted them into negative and positive sets. By visual inspection, I labeled a sample as positive if at least one tree occupies 70% or more of its area. Even so, the detection results are awful.
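The random sampling step described above can be sketched roughly as follows. This is only an illustration that uses the Python standard library: the function name, its parameters, and the fixed seed are assumptions, and the actual cropping and the manual positive/negative labeling are left out.

```python
import random


def random_square_boxes(image_width, image_height, side, count, seed=None):
    """Return `count` random square crop boxes as (left, top, right, bottom).

    Right and bottom are exclusive, so every box lies fully inside the image.
    """
    if side > image_width or side > image_height:
        raise ValueError("square side must fit inside the image")
    rng = random.Random(seed)  # seeded generator so a run can be reproduced
    boxes = []
    for _ in range(count):
        left = rng.randint(0, image_width - side)
        top = rng.randint(0, image_height - side)
        boxes.append((left, top, left + side, top + side))
    return boxes
```

Each returned box can then be passed to whatever image library is used for cropping, and the resulting squares reviewed one by one.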
My questions are:
- can I skip the requirement that all positive samples have the same aspect ratio?
- how can I generate positive samples correctly, so that the classifier learns to detect trees?
I also searched the web for a dataset of trees extracted from satellite images, but I haven't found one. Can you suggest one?
Thanks for your support.