
train_HOG not able to detect all images.

asked 2018-01-18 01:40:45 -0600

Kailash

updated 2018-01-18 02:38:27 -0600

berak

I took the code for train_HOG.cpp from the OpenCV site and followed all the steps carefully, but I cannot get accurate detections even on an image at the same resolution (the original image from which I created the positive images). I am attaching the output images and the code for analysis. I put red circles on the areas that the HOG detector did not find. I also need the object to be recognized when the source image is at a different scale (e.g. 50%, 75%, 125%, 150%). Awaiting your responses, feedback, and comments.

#include "stdafx.h"
#include "TrainHOGEx.h"


TrainHOGEx::TrainHOGEx()
{
}


TrainHOGEx::~TrainHOGEx()
{
}
int mainTestEx();
void get_svm_detectorex(const Ptr< SVM > & svm, vector< float > & hog_detector);
void convert_to_mlex(const std::vector< Mat > & train_samples, Mat& trainData);
void load_images(const String & dirname, vector< Mat > & img_lst, bool showImages);
void sample_negex(const vector< Mat > & full_neg_lst, vector< Mat > & neg_lst, const Size & size);
void computeHOGs(const Size wsize, const vector< Mat > & img_lst, vector< Mat > & gradient_lst);
int test_trained_detector(String obj_det_filename, String test_dir, String videofilename);

void get_svm_detectorex(const Ptr< SVM >& svm, vector< float > & hog_detector)
{
    // get the support vectors
    Mat sv = svm->getSupportVectors();
    const int sv_total = sv.rows;
    // get the decision function
    Mat alpha, svidx;
    double rho = svm->getDecisionFunction(0, alpha, svidx);

    CV_Assert(alpha.total() == 1 && svidx.total() == 1 && sv_total == 1);
    CV_Assert((alpha.type() == CV_64F && alpha.at<double>(0) == 1.) ||
        (alpha.type() == CV_32F && alpha.at<float>(0) == 1.f));
    CV_Assert(sv.type() == CV_32F);
    hog_detector.clear();

    hog_detector.resize(sv.cols + 1);
    memcpy(&hog_detector[0], sv.ptr(), sv.cols * sizeof(hog_detector[0]));
    hog_detector[sv.cols] = (float)-rho;
}

/*
* Convert training/testing set to be used by OpenCV Machine Learning algorithms.
* TrainData is a matrix of size (#samples x max(#cols,#rows) per samples), in 32FC1.
* Transposition of samples are made if needed.
*/
void convert_to_mlex(const vector< Mat > & train_samples, Mat& trainData)
{
    //--Convert data
    const int rows = (int)train_samples.size();
    const int cols = (int)std::max(train_samples[0].cols, train_samples[0].rows);
    Mat tmp(1, cols, CV_32FC1); //< used for transposition if needed
    trainData = Mat(rows, cols, CV_32FC1);

    for (size_t i = 0; i < train_samples.size(); ++i)
    {
        CV_Assert(train_samples[i].cols == 1 || train_samples[i].rows == 1);

        if (train_samples[i].cols == 1)
        {
            transpose(train_samples[i], tmp);
            tmp.copyTo(trainData.row((int)i));
        }
        else if (train_samples[i].rows == 1)
        {
            train_samples[i].copyTo(trainData.row((int)i));
        }
    }
}

void load_images(const String & dirname, vector< Mat > & img_lst, bool showImages = false)
{
    try {
        vector< String > files;
        try {
            glob(dirname, files);
        }
        catch (...)
        {
            AfxMessageBox("exception");
        }

        for (size_t i = 0; i < files.size(); ++i)
        {
            Mat img = imread(files[i]); // load the image
            if (img.empty())            // invalid image, skip it.
            {
                cout << files[i] << " is invalid!" << endl;
                continue;
            }

            if (showImages)
            {
                imshow("image", img);
                waitKey(1);
            }
            img_lst.push_back(img);
        }
    }
    catch (...)
    {
        AfxMessageBox("exception");
    }
}

void sample_negex(const vector< Mat > & full_neg_lst, vector< Mat > & neg_lst, const Size & size)
{
    Rect box;
    box.width = size.width;
    box.height = size.height;

    const int size_x = box.width;
    const int size_y = box ...

Comments

Please do NOT post screenshots of code. If you think we need to see it, add it to your question as text.

berak (2018-01-18 01:43:53 -0600)

"I went through all the steps carefully" -- no, you did something entirely different.

(Maybe that's "creative", but it won't deliver here.)

Instead of collecting many positive images, you tried to use the code from the (cascade) createsamples tool to augment positives from a single image using rotation and scaling. That is not helpful here, and probably even counterproductive, because the "real world" logos do not look like this.

Also, if you are looking for the Google logo in a web page, you are probably better off using (scaled) template matching instead.

berak (2018-01-18 02:29:20 -0600)
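
To make the "(scaled) template matching" suggestion concrete, here is a minimal, dependency-free sketch of the idea: resize the template to each candidate scale, slide it over the scene, and keep the position with the best zero-mean normalized cross-correlation. The `Image`, `resizeNearest`, `ncc`, and `matchMultiScale` names are hypothetical helpers written for this illustration, not OpenCV API. In OpenCV the same steps would normally be done with `cv::resize`, `cv::matchTemplate` (with `cv::TM_CCOEFF_NORMED`), and `cv::minMaxLoc`. The tie-breaking rule (larger templates win) is an assumption of this sketch, added because very small templates tend to match spuriously.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal grayscale image with row-major float pixels (illustrative only).
struct Image {
    int w, h;
    std::vector<float> px;
    Image(int width, int height)
        : w(width), h(height), px(static_cast<std::size_t>(width) * height, 0.f) {}
    float& at(int x, int y) { return px[static_cast<std::size_t>(y) * w + x]; }
    float at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

// Nearest-neighbour resize; a stand-in for cv::resize in this sketch.
Image resizeNearest(const Image& src, double scale) {
    const int w = std::max(1, static_cast<int>(std::lround(src.w * scale)));
    const int h = std::max(1, static_cast<int>(std::lround(src.h * scale)));
    Image dst(w, h);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            const int sx = std::min(src.w - 1, static_cast<int>(x / scale));
            const int sy = std::min(src.h - 1, static_cast<int>(y / scale));
            dst.at(x, y) = src.at(sx, sy);
        }
    return dst;
}

// Zero-mean normalized cross-correlation between tpl and the patch of img
// whose top-left corner is (ox, oy). Returns 0 for flat (constant) patches.
double ncc(const Image& img, const Image& tpl, int ox, int oy) {
    const double n = static_cast<double>(tpl.w) * tpl.h;
    double meanImg = 0.0, meanTpl = 0.0;
    for (int y = 0; y < tpl.h; ++y)
        for (int x = 0; x < tpl.w; ++x) {
            meanImg += img.at(ox + x, oy + y);
            meanTpl += tpl.at(x, y);
        }
    meanImg /= n;
    meanTpl /= n;
    double num = 0.0, varImg = 0.0, varTpl = 0.0;
    for (int y = 0; y < tpl.h; ++y)
        for (int x = 0; x < tpl.w; ++x) {
            const double a = img.at(ox + x, oy + y) - meanImg;
            const double b = tpl.at(x, y) - meanTpl;
            num += a * b;
            varImg += a * a;
            varTpl += b * b;
        }
    const double den = std::sqrt(varImg * varTpl);
    return den > 1e-12 ? num / den : 0.0;
}

struct Match { double score; int x, y; double scale; };

// Try every scale, largest first. A smaller template replaces the current
// best only if it is clearly better (assumption: ties go to larger templates).
Match matchMultiScale(const Image& img, const Image& tpl, std::vector<double> scales) {
    std::sort(scales.rbegin(), scales.rend());  // descending order
    Match best{-2.0, 0, 0, 1.0};
    for (double s : scales) {
        const Image t = resizeNearest(tpl, s);
        if (t.w > img.w || t.h > img.h) continue;  // template does not fit
        for (int oy = 0; oy + t.h <= img.h; ++oy)
            for (int ox = 0; ox + t.w <= img.w; ++ox) {
                const double score = ncc(img, t, ox, oy);
                if (score > best.score + 1e-6) best = Match{score, ox, oy, s};
            }
    }
    return best;
}
```

Calling `matchMultiScale(scene, logo, {0.5, 0.75, 1.0, 1.25, 1.5})` would cover the scales mentioned in the question. The exhaustive search is quadratic in image size, so this is only meant to show the logic, not to be fast.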

1 answer


answered 2018-01-18 02:27:27 -0600

updated 2018-01-18 02:27:59 -0600

"but not able to get the accuracy even in same resolution image"

So what accuracy are you expecting? You cannot evaluate a detector on a frame-by-frame basis; you evaluate it over a global dataset.
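
As a rough illustration of what evaluating over a dataset can look like, the sketch below accumulates true positives, false positives, and false negatives across many annotated images and reports precision and recall. The `Box`, `ImageResult`, `Counts`, and `evaluate` names, the greedy matching, and the 0.5 intersection-over-union threshold are all assumptions chosen for this example; they are not part of train_HOG.cpp or of OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Box { int x, y, w, h; };

// Intersection over union of two axis-aligned boxes.
double iou(const Box& a, const Box& b) {
    const int x1 = std::max(a.x, b.x);
    const int y1 = std::max(a.y, b.y);
    const int x2 = std::min(a.x + a.w, b.x + b.w);
    const int y2 = std::min(a.y + a.h, b.y + b.h);
    const double inter = static_cast<double>(std::max(0, x2 - x1)) * std::max(0, y2 - y1);
    const double uni = static_cast<double>(a.w) * a.h + static_cast<double>(b.w) * b.h - inter;
    return uni > 0.0 ? inter / uni : 0.0;
}

// Ground truth and detector output for one test image (hypothetical layout).
struct ImageResult {
    std::vector<Box> truth;
    std::vector<Box> detections;
};

struct Counts {
    int tp = 0, fp = 0, fn = 0;
    double precision() const { return tp + fp ? static_cast<double>(tp) / (tp + fp) : 0.0; }
    double recall() const { return tp + fn ? static_cast<double>(tp) / (tp + fn) : 0.0; }
};

// Greedily match each detection to the best unused ground-truth box.
// A detection counts as a true positive when IoU >= minIou; unmatched
// detections are false positives, unmatched ground truth is a false negative.
Counts evaluate(const std::vector<ImageResult>& dataset, double minIou = 0.5) {
    Counts c;
    for (const ImageResult& img : dataset) {
        std::vector<bool> used(img.truth.size(), false);
        for (const Box& d : img.detections) {
            int best = -1;
            double bestIou = minIou;
            for (std::size_t i = 0; i < img.truth.size(); ++i) {
                if (used[i]) continue;
                const double v = iou(d, img.truth[i]);
                if (v >= bestIou) { bestIou = v; best = static_cast<int>(i); }
            }
            if (best >= 0) { used[best] = true; ++c.tp; }
            else ++c.fp;
        }
        c.fn += static_cast<int>(std::count(used.begin(), used.end(), false));
    }
    return c;
}
```

A single missed object on one image then shows up as one false negative among many results, which is a more meaningful number than a red circle on one screenshot.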

"I put red circle on area that was not found by HOG utility"

HOG is not a holy grail; it is a rigid model and thus prone to false negatives (meaning an actual object goes undetected).

Like @berak said, people will refrain from clicking links... Please put some effort into your questions.


Comments

I would like to know more about the global dataset. Is it like the code below?

hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

Here, is HOGDescriptor_getDefaultPeopleDetector a global dataset that is defined secretly in the library? If it is not, please send me some links on "What is a global dataset and how to use it."

Kailash (2018-01-18 04:57:14 -0600)

I also found some links where other people are trying to do the same thing in a different way, but with the same API calls internally. So it seems to be possible with HOG; maybe I am missing some configuration or giving it wrong values (like the positive image size, SVM inputs, gradient list, or detection logic).

https://stackoverflow.com/questions/1...

http://www.coldvision.io/2017/03/23/v...

Kailash (2018-01-18 04:57:47 -0600)

Oh man ... I think you just have to start reading up on HOG in general. getDefaultPeopleDetector is a model, not a dataset; it stores the weights of the trained HOG model you are going to evaluate...

StevenPuttemans (2018-01-18 04:58:53 -0600)
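
To illustrate the "model, not a dataset" point: the detector passed to setSVMDetector is just a vector of numbers. In the question's get_svm_detectorex it is the single support vector followed by -rho, i.e. the weights of a linear classifier plus a bias term. The sketch below shows, under that assumption, how such a vector could score one HOG descriptor. `windowScore` and `isDetection` are hypothetical names for this illustration, and the default threshold of 0 is likewise an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear decision value for one detection window.
// detector   = [w_0, ..., w_{n-1}, b]  (weights followed by the bias, here -rho)
// descriptor = [x_0, ..., x_{n-1}]     (HOG features of the window)
double windowScore(const std::vector<float>& detector,
                   const std::vector<float>& descriptor) {
    assert(detector.size() == descriptor.size() + 1);
    double score = detector.back();  // start from the bias term
    for (std::size_t i = 0; i < descriptor.size(); ++i)
        score += static_cast<double>(detector[i]) * descriptor[i];
    return score;
}

// A window is reported as a detection when its score reaches the threshold.
bool isDetection(const std::vector<float>& detector,
                 const std::vector<float>& descriptor,
                 double hitThreshold = 0.0) {
    return windowScore(detector, descriptor) >= hitThreshold;
}
```

The training images only influence these numbers during training; at detection time nothing but the weights and the bias is consulted.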

I am a beginner in image processing code. Please help me understand how I can solve this issue with the HOG technique.

Kailash (2018-01-18 05:37:54 -0600)

The DefaultPeopleDetector was trained on the INRIA Person dataset; for the Daimler one it is this

berak (2018-01-18 11:22:57 -0600)

"I am beginner in image processing codes." --> Then you might just be in over your head. Your task is not as easy as it seems and probably cannot be solved robustly with off-the-shelf OpenCV software...

"Please help on how can I solve this issue by HOG technique." --> If you want a detection on every single frame and for every single person, HOG simply won't cut it...

StevenPuttemans (2018-01-19 02:42:04 -0600)
