Multiple objects classification

asked 2015-04-09 04:20:57 -0600

I want to detect and classify multiple objects in a scene. For example, an image contains several objects, of which I want to detect and recognize a car, a mobile phone, and a laptop. The image below shows what I am looking for.

[image]

How do I go about it? traincascade can be used to detect objects of the one particular type a cascade was trained on. Can I use multiple cascades, each trained for a different object, in the same program? If so, will the program run slower? Is there any other way to achieve this? I will be using OpenCV 2.4.11 on Ubuntu 14.04 with an i5 processor and 8 GB of RAM, in case it matters.

2 answers

answered 2015-04-12 03:34:09 -0600

Gino Strato

Yes, traincascade can be used to detect objects of one particular type.
The more variable these objects are in shape, the more difficult the task. In principle, you could train a single cascade using, as positives, samples of all the types of objects you want to detect [phase 1] and then use some other method to assign each detected object to one particular class [phase 2]. For phase 2 you could even use an algorithm that is slower than boosting, as it only has to run over the few detected objects.
You could experiment with this yourself, but I’m skeptical about the results you would get by following this route: there is too much variability among the samples in phase 1, and this should increase the running time and worsen the detection rate. (Anyway, when I can, I always run experiments for my projects to make sure that what I suspect is correct. Many times I’ve discovered that things are different from what I imagined.)
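To make the two-phase idea concrete, here is a minimal sketch. It assumes the phase 1 detector and the phase 2 classifier are passed in as plain callables (`detect` and `classify` are hypothetical names, not OpenCV functions), that the image supports NumPy-style slicing as OpenCV images do in Python, and that the classifier returns `None` to reject a candidate. It only shows the control flow, not how well the approach works.

```python
def detect_then_classify(gray, detect, classify):
    """Phase 1 proposes boxes with one generic detector; phase 2 labels each crop.

    detect(gray)   -> iterable of (x, y, w, h) boxes     (hypothetical callable)
    classify(crop) -> class label, or None to reject it  (hypothetical callable)
    """
    labelled = []
    for (x, y, w, h) in detect(gray):
        crop = gray[y:y + h, x:x + w]  # NumPy-style crop of the candidate
        label = classify(crop)         # may be slow: runs on few candidates only
        if label is not None:          # assumption: None means "not an object"
            labelled.append((label, (x, y, w, h)))
    return labelled
```

Because phase 2 only sees the boxes that phase 1 lets through, its cost scales with the number of candidates rather than with the number of sliding windows.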

The classical way to achieve your goal is to train multiple classifiers, use them in turn for detection over each image and put the results together.
Yes, the detection time will be the sum over all the detection times and the program will be slower.
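A rough sketch of running several cascades in turn and merging the results could look like the following. The helper names, the XML file names in the usage comment, and the `scaleFactor`/`minNeighbors` values are assumptions for illustration; only `cv2.CascadeClassifier` and `detectMultiScale` are actual OpenCV calls. Each detector is timed separately, so you can see how the total is made up.

```python
import time


def load_cascade_detector(xml_path, min_size=(30, 30)):
    """Wrap a trained cascade as a callable: detect(gray) -> [(x, y, w, h), ...]."""
    import cv2  # imported lazily so the rest of the sketch runs without OpenCV

    cascade = cv2.CascadeClassifier(xml_path)

    def detect(gray):
        rects = cascade.detectMultiScale(
            gray, scaleFactor=1.1, minNeighbors=3, minSize=min_size
        )
        return [tuple(int(v) for v in r) for r in rects]

    return detect


def detect_all(gray, detectors):
    """Run every detector in turn; return labelled boxes and seconds per detector."""
    results = []
    timings = {}
    for label, detect in detectors.items():
        start = time.perf_counter()
        boxes = detect(gray)
        timings[label] = time.perf_counter() - start
        results.extend((label, box) for box in boxes)
    return results, timings


# Hypothetical usage (file names are placeholders for your own trained cascades):
# detectors = {
#     "car": load_cascade_detector("car.xml"),
#     "laptop": load_cascade_detector("laptop.xml"),
# }
# results, timings = detect_all(gray_image, detectors)
# total_seconds = sum(timings.values())
```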
To reduce the overall detection time, you could use some tricks depending on the particular objects you are detecting.
For example, if you are detecting faces and eyes, you could run the eye detector only inside the detected faces. If you are detecting big objects, you could use a larger minimum window size (small windows are the ones that slow down the detection the most, as many more sliding windows have to be checked). If you are detecting oranges, you could run the detection only over areas with some particular colours.
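Two of these tricks can be sketched as follows. `count_windows` is a deliberately rough counting model (square windows, fixed pixel stride, window side growing by `scale_factor`), not OpenCV’s exact scanning schedule; it is only meant to show why a larger minimum window helps. `detect_inside` assumes the inner detector is a callable returning `(x, y, w, h)` boxes relative to the crop it receives, and that the image supports NumPy-style slicing. Both function names are hypothetical.

```python
def count_windows(img_w, img_h, min_win, scale_factor=1.1, stride=2):
    """Rough number of square sliding windows examined in a multi-scale scan."""
    total = 0
    side = float(min_win)
    while side <= min(img_w, img_h):
        s = int(side)
        # Number of window positions horizontally times vertically at this size.
        total += ((img_w - s) // stride + 1) * ((img_h - s) // stride + 1)
        side *= scale_factor
    return total


def detect_inside(gray, outer_boxes, inner_detect):
    """Run inner_detect only inside each outer box (e.g. eyes inside faces).

    Returned boxes are shifted back into full-image coordinates.
    """
    found = []
    for (x, y, w, h) in outer_boxes:
        roi = gray[y:y + h, x:x + w]  # NumPy-style crop of the outer detection
        for (ix, iy, iw, ih) in inner_detect(roi):
            found.append((x + ix, y + iy, iw, ih))
    return found
```

With OpenCV, the minimum window size corresponds to the `minSize` argument of `detectMultiScale`. Comparing `count_windows(640, 480, 24)` with `count_windows(640, 480, 96)` gives a feel for how much work the small windows account for under this model.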

A large amount of RAM doesn’t matter. For this purpose what you need is a good CPU (number of cores and clock speed in GHz).

Comments

Completely agree: to get a decent result, multiple models will be the best way to go. The difference between a banana and an apple is too big. Also keep in mind that boosting-based models are NOT rotation invariant!