roboteyes's profile - activity

2016-11-02 19:34:58 -0600 received badge  Student (source)
2016-09-30 03:53:01 -0600 asked a question NEG count step gets progressively longer during training

I see the NEG count step slow down progressively at each successive stage of Haar training. Is this normal behavior? If so, what causes the slowdown?

2016-09-30 03:48:40 -0600 asked a question Drawing conclusions about cascade quality during training

I am wondering if someone could help me draw conclusions about the quality of my Haar cascades from the output of opencv_traincascade while they are being generated. My reasons: a) I'm curious, and b) I'd like to stop wasting resources training a cascade that is going to end up being 'funky'.

Given a minHitRate of 0.99 and a maxFalseAlarmRate of 0.5, with properly curated positive and background samples at a 3:1 neg:pos ratio, I have seen:

  • cascades that complete at a much earlier stage than expected
  • stages where I don't see FA fall below 1 until N > 5
  • a small file size for the final cascade.xml (< 30 kB)
  • cascades where the N column (what does this signify?) exceeds 40, yet others where it rarely exceeds 10
  • things like this (single entry with a zero false alarm rate):
===== TRAINING 9-stage =====
POS count : consumed   2723 : 2817
NEG count : acceptanceRatio    6960 : 0.000287357
Precalculation time: 6
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1| 0.999265|        0|
+----+---------+---------+
END
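
As a rough sanity check on the numbers in that log, the per-stage settings can be compounded across stages. The sketch below assumes that stage numbering starts at 0 (so nine stages are already trained when the "9-stage" banner prints) and that acceptanceRatio is the number of negatives accepted divided by the number of candidate windows examined. Both are my reading of the output, not something it states, and the function names are made up for illustration.

```python
# Per-stage targets quoted in the question.
MIN_HIT_RATE = 0.99
MAX_FALSE_ALARM = 0.5


def cumulative_targets(num_stages):
    """Compounded hit-rate floor and false-alarm ceiling after num_stages stages."""
    return MIN_HIT_RATE ** num_stages, MAX_FALSE_ALARM ** num_stages


def windows_scanned(neg_count, acceptance_ratio):
    """Approximate candidate windows examined to collect neg_count negatives.

    Assumes acceptance_ratio = accepted negatives / windows examined.
    """
    return neg_count / acceptance_ratio


# Values taken from the "TRAINING 9-stage" log above.
hit, fa = cumulative_targets(9)
scanned = windows_scanned(6960, 0.000287357)
print(f"compounded hit-rate floor:      {hit:.4f}")
print(f"compounded false-alarm ceiling: {fa:.6f}")
print(f"windows examined this stage:    about {scanned:,.0f}")
```

Under those assumptions the compounded false alarm target after nine stages is 0.5^9, about 0.00195, so the logged acceptanceRatio of 0.000287 is already well below it. Collecting 6960 negatives at that ratio means examining roughly 24 million windows, which would make the NEG count step slower as stages accumulate.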

So I'm asking: what do some of you pros look for in the training output that helps you predict the general quality of the following?

  • input settings
  • positive sample quality
  • background sample quality
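
To compare stages across runs, the table printed for each stage can be pulled into numbers. This is only a sketch that parses the row format shown in the log above; parse_stage_rows and summarize_stage are hypothetical helpers, and the summary deliberately reports values without judging cascade quality. It assumes N counts the weak classifiers added to the stage so far, which is my understanding rather than something the output says.

```python
import re

# Matches table rows such as "|   1| 0.999265|        0|".
_NUM = r"\d*\.?\d+(?:[eE][+-]?\d+)?"
ROW = re.compile(rf"^\|\s*(\d+)\|\s*({_NUM})\|\s*({_NUM})\|$")


def parse_stage_rows(log_text):
    """Return (n, hit_rate, false_alarm) tuples found in one stage's table."""
    rows = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            n, hr, fa = match.groups()
            rows.append((int(n), float(hr), float(fa)))
    return rows


def summarize_stage(rows):
    """Summarize the last row of a stage table; returns None if it is empty."""
    if not rows:
        return None
    n, hr, fa = rows[-1]
    return {
        "weak_classifiers": n,  # assumed meaning of the N column
        "final_hit_rate": hr,
        "final_false_alarm": fa,
        "single_row_zero_fa": len(rows) == 1 and fa == 0.0,
    }


stage_table = """\
+----+---------+---------+
|  N |    HR   |    FA   |
+----+---------+---------+
|   1| 0.999265|        0|
+----+---------+---------+
END
"""
print(summarize_stage(parse_stage_rows(stage_table)))
```

Header and separator lines are skipped because they do not match the row pattern, so the same function can be fed a whole stage's output.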
2016-09-30 03:38:24 -0600 received badge  Enthusiast
2016-09-27 16:17:33 -0600 commented question How to choose an appropriate set of negative samples?

Steven, OK, now this is a really interesting statement. I've verified this behavior in imagestorage.cpp: the window moves left to right, top to bottom. So a 240x240 negative image with a 24x24 window yields 100 samples. There is so much emphasis on having lots of negatives, but people might not realize that they can simply use slightly larger images and gain two orders of magnitude more samples. It also means that sample choice is a crucial step. If your negs are all 1280x720, for example, several 24x24 regions would contain only extremely low-frequency information. Likewise, small negs would contain too much.
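
The 100-sample figure is easy to check with a little arithmetic. The sketch below assumes non-overlapping 24x24 tiles at a single scale, which is the simplification behind that number. The second function, with its 12-pixel step, is a hypothetical variant included only to show how quickly the count grows if the window overlaps; neither is meant to reproduce what the trainer actually does.

```python
def tile_count(img_w, img_h, win_w=24, win_h=24):
    """Number of non-overlapping win_w x win_h tiles that fit in the image."""
    return (img_w // win_w) * (img_h // win_h)


def sliding_count(img_w, img_h, win_w=24, win_h=24, step=12):
    """Number of window positions when the window moves by `step` pixels.

    The default step is an illustrative assumption, not a trainer setting.
    """
    if img_w < win_w or img_h < win_h:
        return 0
    cols = (img_w - win_w) // step + 1
    rows = (img_h - win_h) // step + 1
    return cols * rows


print(tile_count(24, 24))       # 1
print(tile_count(240, 240))     # 100
print(tile_count(1280, 720))    # 1590
print(sliding_count(240, 240))  # 361
```

Going from a 24x24 negative to a 240x240 one takes the tile count from 1 to 100, which is the two-orders-of-magnitude gain mentioned above.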

2016-09-27 01:19:20 -0600 commented question How to choose an appropriate set of negative samples?

I understand that detectMultiScale scales the feature window when detecting; however, this happens after the fact. I'm wondering whether a specific negative set can influence the training, or if it's simply not worth the time and effort. Thoughts?

2016-09-27 00:03:08 -0600 asked a question How to choose an appropriate set of negative samples?

So I'm not sure exactly how much time and effort I should be spending on preparing my negative picture sets.

I usually just choose backgrounds that I think I'll encounter and that don't contain the entity I'm searching for. Also, I don't have a rule of thumb for what their dimensions should be relative to the positive sample sizes.

See my images below:

image description

Now my question is, how much does your negative set impact the overall quality of your detector? Does the trainer slice the negative images into sub-windows and then walk across them in some order, or is it random? Does the trainer scale or crop the negative images? Is it worth the time to customize the negative set? Not much is said in the Viola-Jones paper, but from what I can surmise, it seems like they used random regions from random photos.

And thank you for spending your time on my question!

2016-09-24 12:10:44 -0600 received badge  Supporter (source)
2016-09-24 12:10:26 -0600 commented answer Region of interest for Haar samples?

Helpful paper, thanks! A few extracts

"The training faces are only roughly aligned."

"place a bounding box around each face just above the eyebrows and about half-way between the mouth and the chin"

So I think you're right: a little less tightly cropped, but only slightly.

2016-09-23 22:36:47 -0600 received badge  Editor (source)
2016-09-23 22:18:12 -0600 asked a question Region of interest for Haar samples?

I've noticed that with many face detection algorithms, the detected region of interest closely crops the eyes and mouth, excluding the ears, hair and chin. When collecting and cropping my positive samples, should I crop tightly, ensuring that no background is visible, or crop the region loosely?

See my example:

image description

My guess is the right hand side, but I'd just like to double-check with the experts here.