Object detection not working

asked 2017-05-18 18:27:36 -0500

gunner

updated 2017-05-20 04:20:51 -0500

Hello everybody! I am working on my master's degree. My first task is to build a banana detector by following this tutorial:

http://technobium.com/object-detectio...

I completed the whole tutorial, but in the end the detector doesn't work. It draws red rectangles labeled "Banana" in the wrong places and doesn't recognize bananas. Total disaster. I have been trying to make it work for more than 10 days, but nothing has helped. For training I made 1400 positives and 3000 negatives and used the following commands to train the model:

opencv_createsamples -info positives.txt -num 1200 -w 60 -h 80 -vec training.vec

opencv_traincascade -data data -vec training.vec -bg negatives.txt -numPos 1000 -numNeg 1000 -numStages 10 -nsplits 2 -w 60 -h 80 -featureType LBP -minhitrate 0.999 -maxfalsealarm 0.5

Here is my BananaDetector.cpp file

#include <highgui.h>
#include <iostream>
#include <stdio.h>
#include <cv.h>

using namespace std;
using namespace cv;

int main() {

    cvNamedWindow("Banana detecting camera", 1);
    // Capture images from any camera connected to the system
    CvCapture* capture = cvCaptureFromAVI("/home/painkiller/Desktop/banana.mp4");

    // Load the trained model
    CascadeClassifier bananaDetector;

    if (bananaDetector.empty()) {
        printf("Empty model.");
        return 0;
    }

    char key;
    while (true) {

        // Get a frame from the camera
        Mat frame = cvQueryFrame( capture );

        std::vector<Rect> bananas;

        // Detect banana
        bananaDetector.detectMultiScale(frame, bananas, 1.1, 30,
                0 | CV_HAAR_SCALE_IMAGE, Size(100,440));

        for (int i = 0; i < (int) bananas.size(); i++) {
            Point pt1(bananas[i].x, bananas[i].y);
            Point pt2(bananas[i].x + bananas[i].width,
                    bananas[i].y + bananas[i].width);

            // Draw a rectangle around the detected banana
            rectangle(frame, pt1, pt2, Scalar(0, 0, 255), 2);
            putText(frame, "Banana", pt1, FONT_HERSHEY_PLAIN, 1.0,
                    Scalar(255, 0, 0), 2.0);
        }

        // Show the transformed frame
        if (!frame.empty())
            imshow("Banana detecting camera", frame);

        // Read keystrokes, exit after ESC pressed
        key = cvWaitKey(10);
        if (char(key) == 27) {
            break;
        }
    }

    return 0;
}

I have three more days, until Monday, to solve this...



It might help others help you if you add your code that does the detection and images of positives, negatives, and detector failure. Are you first creating a binary image by HSV segmentation (thresholding on yellow)? To my knowledge LBP works on 8-bit binary images.

Der Luftmensch ( 2017-05-18 22:28:36 -0500 )
  • Bananas? Terrible idea (no texture, pose problems).
  • I see you only copy/pasting other people's (also outdated) code, without any significant thought (or even research, hell...) of your own. You won't overcome that problem by Monday.
  • "For training I made 1400 positives and 3000 negatives" -- how did you do that? I can't quite believe you have 1400 real positive images (which would be good, but I guess you've been cheating by synthesizing positives from a few; if so, you're only fooling yourself).
berak ( 2017-05-19 06:23:35 -0500 )

Der Luftmensch, before using the images I converted all of them to greyscale. All my images are 8-bit. My detector doesn't show any error, it just can't recognize bananas. It draws rectangles in the wrong places when I run the video with bananas. My positive images are 80x60. I edited my post to add the BananaDetector.cpp file.

gunner ( 2017-05-19 07:51:53 -0500 )

Berak, I appreciate your answer, but you don't have to be so aggressive/rude. I am doing my best, I understand every line of code in my project, and I made various changes to my program before asking for help here. So I ask you to stop guessing that people don't understand the code they write when you know nothing about it. Secondly, I took 204 positives with my camera and used batch processes to blur them, change the brightness, and turn them upside down. So from 204 images I get 1400. If I need to take more positives with the camera, that is not a problem for me. If it will help, I can do it. But I highly doubt that is the problem, since the detector can't detect anything, and 204 initial positives and 1400 in total is not really a small number.

gunner ( 2017-05-19 08:04:05 -0500 )
  • "convert them upside down.." Please do not vary the angle by more than ±10%; the cascade expects a (more or less) fixed pose.
  • Again, bananas are a very bad target for this (there's only the shape to hook into).
  • If you need a minNeighbors value of 30, then your model clearly overfits.
  • -w 60 -h 80: I guess that's a bad ratio for an (upright) banana; rather try -w 40 -h 80.
  • Use all your negatives, but only 90% of the positives in the training.
  • Maybe use only the original 204 images; I doubt that your augmentation attempts work nicely.
berak ( 2017-05-19 08:16:59 -0500 )
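
Two of the suggestions in the list above are simple arithmetic and can be sketched in plain C++. The helper names `suggestedNumPos` and `aspectRatio` are hypothetical, and the default of 90% is berak's rule of thumb from this thread, not a value taken from the OpenCV documentation.

```cpp
#include <cstddef>

// Hypothetical helper: how many positives to request via -numPos when only
// a percentage of the samples stored in the .vec file should be used.
// The default of 90 is the rule of thumb quoted in the thread (assumption).
std::size_t suggestedNumPos(std::size_t vecSamples, std::size_t percent = 90) {
    // Integer arithmetic avoids floating-point rounding surprises.
    return vecSamples * percent / 100;
}

// Hypothetical helper: width-to-height ratio of a training window.
double aspectRatio(int width, int height) {
    return static_cast<double>(width) / height;
}
```

For the 1200 samples written to `training.vec`, `suggestedNumPos(1200)` gives 1080. `aspectRatio(60, 80)` is 0.75 for the window used in the question, against 0.5 for the suggested 40x80 window.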

I tried your suggestions but they didn't help. I'm slowly facing the reality of not making it :/... Can you please explain what makes an object good for detection? I guess that among fruits, a strawberry would be a good choice, since it has texture and no pose problems are involved. You also said that for bananas there's only the shape to hook into. What else can OpenCV hook into, and how? Do I run LBP training with all the arguments like this time and OpenCV does the rest by itself, or are there different techniques?

gunner ( 2017-05-20 04:52:54 -0500 )

I'd think traffic signs might be far easier than fruits; they come with a lot of "inner contours" and, best of all, they're standardized!

Try to avoid anything from nature: too much variation. IIRC, @StevenPuttemans once trained a strawberry classifier (so it's possible!), but they had a TON of real positives (and the negatives were from a strawberry field, so a restricted situation).

Maybe you can restrict it a little, too. While it should be fairly easy to spot a STOP sign against a blue sky, it will be difficult in a dense traffic situation or against a wall plastered with ads.

Maybe even bananas work, if you can keep the pose consistent and the background mostly uniform.

berak ( 2017-05-20 05:19:15 -0500 )

@berak, maybe a stupid question (I couldn't find a quick answer), does OpenCV's LBP cascade classifier support BGR? If so, with bananas having such a distinct color range and little texture as you mentioned, it seems silly to simply convert to greyscale (but first thresholding on HSV is a valid preprocessing step which the questioner has chosen to simply ignore).

Der Luftmensch ( 2017-05-20 08:36:55 -0500 )

@Der Luftmensch - it converts to grayscale almost before anything else.

Also, I do not think that the LBP features work nicely with binary images.
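
To illustrate why a binary image gives LBP little to work with, here is a sketch of the textbook 3x3 LBP operator in plain C++. It is an assumption that this simplified pixel-wise operator is a fair stand-in: OpenCV's cascade uses a multi-block variant computed on integral images, and the names `Patch` and `lbpCode` are hypothetical.

```cpp
#include <array>
#include <cstdint>

// 3x3 grayscale patch in row-major order; index 4 is the centre pixel.
using Patch = std::array<std::uint8_t, 9>;

// Textbook LBP (not OpenCV's multi-block variant): compare each of the
// 8 neighbours with the centre and pack the results into one byte,
// clockwise from the top-left neighbour (neighbour >= centre gives a 1 bit).
std::uint8_t lbpCode(const Patch& p) {
    // Clockwise neighbour order: TL, T, TR, R, BR, B, BL, L.
    static const int order[8] = {0, 1, 2, 5, 8, 7, 6, 3};
    std::uint8_t code = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (p[order[bit]] >= p[4]) {
            code |= static_cast<std::uint8_t>(1u << (7 - bit));
        }
    }
    return code;
}
```

In this sketch a uniformly white patch and a uniformly black patch both yield the same code, so in a thresholded image only the pixels along the mask boundary produce distinct codes.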

And sure, if it were only about finding bananas, I'd say so too: HSV thresholding on yellow, then some shape comparison. Bonus points for implementing "onedollar".

However, (some sort of) cascade training seems to be an explicit requirement (uni), no?

berak ( 2017-05-20 08:56:27 -0500 )
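
The first half of the alternative mentioned in the comment above, HSV thresholding on yellow, can be sketched without any library. The conversion follows the 8-bit convention OpenCV documents for BGR to HSV (H in 0..179, S and V in 0..255). The yellow bounds and the names `Hsv`, `bgrToHsv` and `isYellow` are assumptions for illustration and would need tuning on real footage.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Hsv {
    int h;  // hue, 0..179 (degrees divided by 2)
    int s;  // saturation, 0..255
    int v;  // value, 0..255
};

// Convert one 8-bit BGR pixel to HSV with the 8-bit scaling described above.
Hsv bgrToHsv(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    const int maxC = std::max({int(b), int(g), int(r)});
    const int minC = std::min({int(b), int(g), int(r)});
    const int delta = maxC - minC;

    double hueDeg = 0.0;  // grey pixels have no hue; 0 is used by convention
    if (delta != 0) {
        if (maxC == r)      hueDeg = 60.0 * (g - b) / delta;
        else if (maxC == g) hueDeg = 120.0 + 60.0 * (b - r) / delta;
        else                hueDeg = 240.0 + 60.0 * (r - g) / delta;
        if (hueDeg < 0.0) hueDeg += 360.0;
    }

    Hsv out;
    out.h = static_cast<int>(std::lround(hueDeg / 2.0)) % 180;
    out.s = (maxC == 0) ? 0 : static_cast<int>(std::lround(255.0 * delta / maxC));
    out.v = maxC;
    return out;
}

// Assumed yellow range (hypothetical bounds, tune on real images).
bool isYellow(const Hsv& p) {
    return p.h >= 20 && p.h <= 35 && p.s >= 80 && p.v >= 80;
}
```

Applying `isYellow` to every pixel would yield the binary mask on which a shape comparison could then run.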

Berak, I didn't succeed with bananas, so I have to pick a new object to detect with the same type of training I used for banana detection. I have been thinking a lot and decided to pick a fire extinguisher. Can you please tell me whether I made the right choice, and whether in your opinion detection will work after this type of training?

gunner ( 2017-05-22 19:20:25 -0500 )