How to merge detected windows ?

asked 2013-04-24 23:28:39 -0500

sub_o gravatar image

So basically I've created my own pedestrian detection algorithm (I need it for some research purposes, thus decided not to use the supplied HoG detector) .

After detection, I'd have many overlapping rectangles around the detected object / human. Then I'd apply non-maxima suppression to retain the local maxima. However there are still overlapping rectangles in location out of search range of the non-maxima suppression algorithm.

How would you merge the rectangles ? I tried to use grouprectangles, but somehow i'm lost about how it came up with the result (e.g. grouprectangles( rects, 1.0, 0.2 ) )

I applied a rudimentary merging algorithm that merge if there are rectangles that overlapped for certain percentage of the area, the code is shown below.

/**
 * Merge a set of rectangles if there's an overlap between each rectangle for more than 
 * specified overlap area
 * @param   boxes a set of rectangles to be merged
 * @param   overlap the minimum area of overlap before 2 rectangles are merged
 * @param   group_threshold only the rectangles that have more than the remaining group_threshold rectangles will be retained
 * @return  a set of merged rectangles
 **/
vector<Rect> Util::mergeRectangles( const vector<Rect>& boxes, float overlap, int group_threshold ) {
    vector<Rect> output;
    vector<Rect> intersected;
    vector< vector<Rect> > partitions;
    vector<Rect> rects( boxes.begin(), boxes.end() );

    while( rects.size() > 0 ) {
        Rect a      = rects[rects.size() - 1];
        int a_area  = a.area();
        rects.pop_back();

        if( partitions.empty() ) {
            vector<Rect> vec;
            vec.push_back( a );
            partitions.push_back( vec );
        }
        else {
            bool merge = false;
            for( int i = 0; i < partitions.size(); i++ ){

                for( int j = 0; j < partitions[i].size(); j++ ) {
                    Rect b = partitions[i][j];
                    int b_area = b.area();

                    Rect intersect = a & b;
                    int intersect_area = intersect.area();

                    if (( a_area == b_area ) && ( intersect_area >= overlap * a_area  ))
                        merge = true;
                    else if (( a_area < b_area ) && ( intersect_area >= overlap * a_area  ) )
                        merge = true;
                    else if (( b_area < a_area ) && ( intersect_area >= overlap * b_area  ) )
                        merge = true;

                    if( merge )
                        break;
                }

                if( merge ) {
                    partitions[i].push_back( a );
                    break;
                }
            }

            if( !merge ) {
                vector<Rect> vec;
                vec.push_back( a );
                partitions.push_back( vec );
            }
        }
    }

    for( int i = 0; i < partitions.size(); i++ ) {
        if( partitions[i].size() <= group_threshold )
            continue;

        Rect merged = partitions[i][0];
        for( int j = 1; j < partitions[i].size(); j++ ) {
            merged |= partitions[i][j];
        }

        output.push_back( merged );

    }

    return output;
}

However what I'd like to now if this is actually an accepted way to merge rectangles in computer vision, especially when I want to check the precision recall of my algorithm. My approach seems to be too simplistic at times, and every merged rectangles get bigger and bigger mainly because of merged |= partitions[i][j]; which finds the minimum rectangle that enclose both rectangles.

If this is an acceptable way to merge detection windows, what's the common value for merging overlap (i.e. if overlap area >= what percentage) ?

edit retag flag offensive close merge delete

Comments

1

I guess the percentage of overlap is something people define specific for their application. If you know that people will stand close to eachother, or partially occluding eachother, and you have 2 frames with 50% overlap, then you do not want to merge those. Only if your algorithm can handle occlusion. However, if you do not have partially occluded objects, you might prefer to do merge 50% overlap. Guess its a trial and error value.

StevenPuttemans gravatar imageStevenPuttemans ( 2013-04-25 04:24:40 -0500 )edit