Ask Your Question

sub_o's profile - activity

2019-04-21 13:14:30 -0500 received badge  Popular Question (source)
2017-12-18 09:22:29 -0500 received badge  Popular Question (source)
2017-12-16 09:56:15 -0500 received badge  Notable Question (source)
2017-11-17 04:57:21 -0500 received badge  Popular Question (source)
2017-10-26 08:15:56 -0500 received badge  Self-Learner (source)
2015-12-18 09:12:35 -0500 received badge  Nice Answer (source)
2015-10-16 15:57:59 -0500 received badge  Teacher (source)
2015-06-12 10:30:39 -0500 received badge  Nice Question (source)
2015-03-25 16:02:23 -0500 received badge  Popular Question (source)
2015-03-18 09:33:15 -0500 received badge  Nice Question (source)
2014-02-27 11:02:44 -0500 asked a question Creating OpenGL's perspective projection matrix from calibrated parameters

Hi everyone,

I have calibrated intrinsic parameters, i.e. fx, fy, cx, cy. And I wanted to create perspective projection matrix for OpenGL. Normally in OpenGL you would either call glFrustum which follows the similar matrix below, or glm::perspective function if you're using GLM.

However, when I look into how perspective projection matrix on OpenGL are created, it doesn't factor in focal points and camera's principal points. How should I modify the matrix in order to consider those intrinsic parameters?

image description

In the end I'd like to create a MVP (model view projection) matrix by doing MVP = projection * view * model. With model's translation and rotation matrix obtained from solvePnP, thus. model = inverse( translation * rotation ) ,

Any idea on how should I create the perspective projection matrix?

2013-04-29 00:18:18 -0500 asked a question How do you classify true negatives ?

I'm gathering results from my image detector algorithm. So basically what I do is that, from a set of images (with the size of 320 x 480), I would run a sliding window of 64x128 thru it, and also under a number of predefined scales.

I understand that:

  • True Positives = when my detected window overlaps (within defined intersection size / centroid) with the ground-truth (annotated bounding boxes)
  • False Positives = when the algorithm gives me positive windows, which are outside of the grond truth.
  • False Negatives = when it failed me to give positive window, while the ground truth annotation states that there's an object.

But what about True Negatives ? Are these true negatives all the windows that my classifier gives me negative results ? That sounds weird, since I'm sliding a small window (64x128) by 4 pixels at a time, and I've around 8 different scales used in detection. If I were to do that, then I'd have lots of true negatives per image.

Or do I prepare a set of pure negative images (no objects / human at all), where I just slide thru, and if there's one or more positive detections in each of these images, I'd count it as False Negative, and vice versa ?

Here's an example image (with green rects as the ground truth)

Example image, not real result

2013-04-24 23:28:39 -0500 asked a question How to merge detected windows ?

So basically I've created my own pedestrian detection algorithm (I need it for some research purposes, thus decided not to use the supplied HoG detector) .

After detection, I'd have many overlapping rectangles around the detected object / human. Then I'd apply non-maxima suppression to retain the local maxima. However there are still overlapping rectangles in location out of search range of the non-maxima suppression algorithm.

How would you merge the rectangles ? I tried to use grouprectangles, but somehow i'm lost about how it came up with the result (e.g. grouprectangles( rects, 1.0, 0.2 ) )

I applied a rudimentary merging algorithm that merge if there are rectangles that overlapped for certain percentage of the area, the code is shown below.

 * Merge a set of rectangles if there's an overlap between each rectangle for more than 
 * specified overlap area
 * @param   boxes a set of rectangles to be merged
 * @param   overlap the minimum area of overlap before 2 rectangles are merged
 * @param   group_threshold only the rectangles that have more than the remaining group_threshold rectangles will be retained
 * @return  a set of merged rectangles
vector<Rect> Util::mergeRectangles( const vector<Rect>& boxes, float overlap, int group_threshold ) {
    vector<Rect> output;
    vector<Rect> intersected;
    vector< vector<Rect> > partitions;
    vector<Rect> rects( boxes.begin(), boxes.end() );

    while( rects.size() > 0 ) {
        Rect a      = rects[rects.size() - 1];
        int a_area  = a.area();

        if( partitions.empty() ) {
            vector<Rect> vec;
            vec.push_back( a );
            partitions.push_back( vec );
        else {
            bool merge = false;
            for( int i = 0; i < partitions.size(); i++ ){

                for( int j = 0; j < partitions[i].size(); j++ ) {
                    Rect b = partitions[i][j];
                    int b_area = b.area();

                    Rect intersect = a & b;
                    int intersect_area = intersect.area();

                    if (( a_area == b_area ) && ( intersect_area >= overlap * a_area  ))
                        merge = true;
                    else if (( a_area < b_area ) && ( intersect_area >= overlap * a_area  ) )
                        merge = true;
                    else if (( b_area < a_area ) && ( intersect_area >= overlap * b_area  ) )
                        merge = true;

                    if( merge )

                if( merge ) {
                    partitions[i].push_back( a );

            if( !merge ) {
                vector<Rect> vec;
                vec.push_back( a );
                partitions.push_back( vec );

    for( int i = 0; i < partitions.size(); i++ ) {
        if( partitions[i].size() <= group_threshold )

        Rect merged = partitions[i][0];
        for( int j = 1; j < partitions[i].size(); j++ ) {
            merged |= partitions[i][j];

        output.push_back( merged );


    return output;

However what I'd like to now if this is actually an accepted way to merge rectangles in computer vision, especially when I want to check the precision recall of my algorithm. My approach seems to be too simplistic at times, and every merged rectangles get bigger and bigger mainly because of merged |= partitions[i][j]; which finds the minimum rectangle that enclose both rectangles.

If this is an acceptable way to merge detection windows, what's the common value for merging overlap (i.e. if overlap area >= what percentage) ?

2013-04-24 13:43:56 -0500 commented answer Slow SVM.predict speed

Yeah, I was using linear C_SVM with C set to 0.001, it ended up with more than 4000 support vectors. I've since reduced it to 0.1, giving me around 650 support vecs, and the whole thing sped up to around 4 seconds now.

2013-04-24 07:38:07 -0500 commented answer Slow SVM.predict speed

3780 features. I found out that there's 4417 support vectors in my SVM. Maybe I should train with different C value, or I should switch to RandomForest

2013-04-24 07:19:04 -0500 asked a question Slow SVM.predict speed

I'm writing a sliding window to extract features and feed it into CvSVM's predict function. However, what I've stumbled upon is that the svm.predict function is relatively slow.

Basically the window slides thru the image with fixed stride length, on number of image scales.

  • The speed traversing the image plus extracting features for each window takes around 1000 ms (1 sec).
  • Inclusion of weak classifiers trained by adaboost resulted in around 1200 ms (1.2 secs)
  • However when I pass the features (which has been marked as positive by the weak classifiers) to svm.predict function, the overall speed slowed down to around 16000 ms ( 16 secs )
  • Trying to collect all 'positive' features first, before passing to svm.predict utilizing TBB's threads resulted in 19000 ms ( 19 secs ), probably due to the overhead needed to create the threads, etc.

My OpenCV build was compiled to include both TBB (threading) and OpenCL (GPU) functions.

Has anyone managed to speed up OpenCV's SVM.predict function ?

I've been stuck in this issue for quite sometime, since it's frustrating to run this detection algorithm thru my test data for statistics and threshold adjustment.

Thanks a lot for reading thru this !

2013-04-23 04:12:13 -0500 received badge  Critic (source)
2013-04-23 03:53:28 -0500 asked a question How to utilize cascade classifier in custom detection algorithm ?

So, I've created my own HoG features extractor and a simple sliding window algorithm, which pseudo code looks like this:

for( int i = 0; i < img.rows; i++ ) {
   for( int j = 0; j < img.cols; j++ ) {
     extract image ROI from the current position
     calculate features for the ROI
     feed the features into svm.predict() function, to determine whether it's human or not

However since it's very slow (especially when you include different scales), I've decided to train some cascade classifiers using openv_traincascade command on my positive and negative samples.

opencv_traincascade provides me with cascade.xml, params.xml, and a number of stages.xml files

My question is how do I utilize this trained cascade classifiers in my detection loop ?

2013-04-22 08:32:27 -0500 received badge  Scholar (source)
2013-04-18 03:43:01 -0500 commented question How to identify an image once in a video sequence?

Like what you say, maybe you could use tracking algorithm. But if you want to filter it so that you don't have duplicates, maybe try to match it with your already detected objects ? Maybe some template matching, or some other algorithm (i.e. normalize image, convert to grayscale, compare with detected ones) ?

2013-04-18 02:56:29 -0500 commented question Image Recognition question

Will the dollar bill be crumpled ?

If not, then you could try some edge detection algorithm, see if you could fit a polylines to it, and see if it's 4 sided. Extract that ROI, maybe warp the perspective (?) a bit, and calculate the color histogram of it, and see if it's similar to your expected histogram.

2013-04-18 02:53:44 -0500 asked a question How to have a faster sliding window algorithm ?

I'm running a sliding window algorithm that extract features from the patch and feed it into my trained SVM. The sliding window itself is naive brute force sliding (i.e. sliding from 0 to max y, and 0 to max x)

The feature extraction algorithm is pretty fast, it took around 900 ms to slide and extract features from the entire image.

However passing the features into svm.predict( features, true ) slows it down to around 40 secs.

Is there any algorithm that can speed up the whole process ?

2013-04-18 02:50:24 -0500 commented answer How to training HOG and use my HOGDescriptor?

Sadly no. I decided to create my own multi scale window detection, but still struggling with its speed.

2013-04-18 02:27:03 -0500 answered a question How to training HOG and use my HOGDescriptor?

In OpenCV, there's undocumented API

HOGDescriptor::compute(const cv::Mat &img, vector<float> &descriptors)

which you could pass in your image, and retrieve the descriptors for it. And then you could construct a large feature matrix containing both positive and negative features, and also label matrix that specifies which entry in that feature matrix is positive or negative.

Both the features and label matrices then could be passed into CvSVM class for training, and you could save it in HDD and load it afterwards. Later on you could extract features from a patch of an image, use compute to extract the features, and pass it into svm.predict to obtain the prediction whether it's positive or negative image. The downside from this is that you need to apply your own detection window algorithm, etc.

Below is some sample code to train your own SVM.

Mat labels( pos_count + neg_count, 1, CV_32FC1, Scalar(-1.0) );
labels.rowRange( 0, pos_count ) = Scalar( 1.0 );

CvSVM svm;
CvSVMParams params;
params.C = c;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit = cvTermCriteria( CV_TERMCRIT_ITER, 10000, 1e-6 );

Mat feat_matrix( pos_count + neg_count, pos_feats[0].cols, pos_feats[0].type() );
for( int i = 0; i < pos_count; i++ )
    pos_feats[i].copyTo( feat_matrix.rowRange(i, i+1) );
for( int i = 0; i < neg_count; i++ )
    neg_feats[i].copyTo( feat_matrix.rowRange(i + pos_count, i + pos_count + 1) );

cout << "Start training SVM" << endl;
svm.train( feat_matrix, labels, Mat(), Mat(), params ); (path + svm_file).c_str() );
2013-04-18 02:10:39 -0500 commented question SVM Predict Slow

I have similar issues here. I'm running sliding window across an image, and while my feature extraction took only around 900ms, by passing it to SVM it slows down to almost 40 secs. Using TBB's parallel_for to break down my loop, made it worse, the whole thing slowed down to 50 secs (either thru manual or automatic chunking)

2012-11-09 04:19:26 -0500 received badge  Supporter (source)
2012-11-05 04:30:31 -0500 answered a question Comparing two HOG descriptors vectors

Hi Ben-Uri. From what I know the vector should be multiply of 9, which means that it's a flatten version of HoG with 9 bins. Each 9 float in that vector should represent a single cell region (? correct me if I'm wrong).

I'm creating my own HoG using OpenCV's function (besides its own HoG method), and you can find it here:

2012-10-22 09:38:55 -0500 asked a question How do I approach training dataset for HoG, using images which are larger than 64 x 128 pixels ?

I'm writing my own HoG for future modification purposes, and experimenting with different approaches. But I stumbled upon this question / issue.

I have downloaded dataset from INRIA, and there are images which are in 320 x 240. While the default training window size for HoG is 64 x 128.

How should I go around this ?

For the positive images, they are around 96 x 160 pixels, and what I did is to resize them down to 64 x 128. But for larger images, do I resize them, use a sliding window which moves pixel by pixel, or do I calculate features for 64 x 128 patches in that large image ?

What's the best way to approach this ?

2012-08-13 04:44:06 -0500 received badge  Student (source)
2012-08-13 04:42:38 -0500 asked a question More questions on feeding HoG features to CvSVM

I've managed to extract HoG features from positive and negative images (from INRIA's person dataset ) using OpenCV's HOGDescriptor::compute function.

I've also managed to pack the data correctly and feed it into CvSVM for training purposes.

I have several questions:

  • While extracting features, I used positive images with dimension of 96 x 128, while the negative images are on average 320 x 240. I have been using window size of 64 x 128 for HoG extraction, should I use other window size ?

  • The size of extracted features for positive images are around 28800 features, while the negative ones are around 500000+. I have been truncating the features from negative ones to 28800, I think this is wrong, since I believe I'm losing too much information when feeding these features to SVM. How should I go and tackle this ? (It seems like I can only feed the same sample size for negative and positive features)

  • While doing prediction on images bigger than 64 x 128 (or 96 x 160), should I use a sliding window to do prediction ? Since large negative images still gives me more than 500000 features, but I can't feed it into SVM due to sample size.

Thanks a lot in advance !

2012-08-08 08:55:51 -0500 commented question libpng issues ?

@sammy Do you have any idea how to do this ? I am quite new when it comes to compiling my own libraries.

2012-08-08 08:14:38 -0500 asked a question libpng issues ?

I am trying to open up png files on my XCode 4.4 (Mountain Lion) with the following codes (works for jpg files)

Mat image = imread( "/Users/user_name/Desktop/result.png" );
imshow( "", image );
waitKey( 0 );

But OpenCV throws me this error:

libpng warning: Application built with libpng-1.4.12 but running with 1.5.4
OpenCV Error: Bad flag (parameter or structure field) (Unrecognized or unsupported array type) in cvGetMat, file /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_graphics_opencv/opencv/work/OpenCV-2.4.2/modules/core/src/array.cpp, line 2482
libc++abi.dylib: terminate called throwing an exception

Seems like my png file was created using a newer libpng, while OpenCV is using older one. So how do I resolve this issue ?

Edit: more information

I'm using OSX Mountain Lion, with XCode 4.4, and using OpenCV 2.4.2 installed by using MacPorts

2012-08-07 05:40:23 -0500 answered a question Question regarding feeding extracted HoG features into CvSVM's train

I found out the issue, since I was testing whether it is correct to pass feature vectors into the SVM in order to train it, I didn't bother to prepare both negative and positive samples.

Yet, CvSVM requires at least 2 different classes for training, that's why the error it threw.

Thanks a lot anyway !

2012-08-06 23:33:20 -0500 asked a question Question regarding feeding extracted HoG features into CvSVM's train

This is a silly question since I'm quite new to SVM,

I've managed to extract features and locations using:

vector< float > features;
vector< Point > locations;
hog_descriptors.compute( image, features, Size(0, 0), Size(0, 0), locations );

Then I proceed to use CvSVM to train the SVM based on the features I've extracted.

Mat training_data( features );
CvSVM svm;
svm.train( training_data, labels, Mat(), Mat(), params );

Which gave me an error:

OpenCV Error: Bad argument (There is only a single class) in cvPreprocessCategoricalResponses, file /opt/local/var/macports/build/

My question is that, how do I convert the vector < features > into appropriate matrix to be fed into CvSVM ? Obviously I am doing something wrong, the given tutorial shows that a 2D matrix containing the training data is fed into SVM. So, how do I convert vector < features > into a 2D matrix, what are the values in the 2nd dimension ?

Could anyone please explain to me how am I supposed to tackle this ?

2012-08-06 11:36:08 -0500 asked a question HOGDescriptor's compute

I am not sure if this is actually a bug, but I was trying to use HOG compute with my own parameters instead of the default ones. So what I did is that rather than using default constructor for HOGDescriptor, I used the other one, as shown in the code snippet below

Size win_size( 64, 128 );
Size cell_size( 8, 8 );
Size block_size( 16, 16 );
Size stride( 8, 8 );

int bins                = 9;
double sigma            = -1;
double l2_hys           = 0.2;
bool gamma_correction   = true;
int levels              = 64;

HOGDescriptor hog_descriptors( win_size, cell_size, block_size, stride, bins, sigma, l2_hys, gamma_correction, levels );
vector< Point > locations;
hog_descriptors.compute( image, features, Size(8, 8), Size(32, 32), locations );

Yet, my program fails every time when it hits hog_descriptors.compute , with OpenCV (2.4.2) throwing the following error:

OpenCV Error: Assertion failed ((winSize.width - blockSize.width) % blockStride.width == 0 && (winSize.height - blockSize.height) % blockStride.height == 0) in getDescriptorSize, file /opt/local/var/macports/build/

Weirdly enough if I were to use default constructor instead, i.e.:

HOGDescriptor hog_descriptors;

the compute function works without error, even though when I was using the non default constructor, I was supplying (apparently, not sure if I missed out anything) the same parameters. My question is that, is this somewhat an expected behavior, did I miss something out, or is it a bug within OpenCV ? I am working with an image with the size of 800 x 599

2012-07-23 23:54:08 -0500 commented question Extracting HoG features ?

@TRiBi, How do you extract the HoG features ? If you could, then maybe you could try to look into SVMLight or libsvm on the format that they required to train your own SVM. I am not sure how to pass it back to OpenCV though.

2012-07-23 02:54:12 -0500 received badge  Editor (source)
2012-07-23 02:49:33 -0500 asked a question Extracting HoG features ?

Hi everyone, this is my first time posting here.

I am trying to train my own SVM using features extracted using HoG. So far I can use the given pedestrian detector SVM for HoG to detect people. But I want to detect more poses, so I decided to train my own SVM using HoG.

My question is: Is there any OpenCV's API that allow me to extract HoG features and train a SVM later on ?

So far I couldn't find any answer, so I am trying to port HoG library from to C++, but I just realized that the developers use libsvm library, I am unsure whether I should also port that library.

Edit: I found libsvm library. But I am still wondering if there's any API directly from OpenCV that I can use, or any existing code regarding extracting HoG features within OpenCV's source code.

Thank you very much !