You should take a look at the source code of the boosting process to understand it completely, but you are misreading the output and what is actually happening. Let me explain.
Each stage follows these steps:
- Starts by grabbing #pos positive and #neg negative samples for the stage.
- Takes a first feature from the complete feature pool (generated from the model dimensions) that allows the set of positive samples to be classified 100% correctly.
- Calculates the false alarm (FA) rate that this single feature (if you selected a weak classifier depth of 1, this will yield a weak classifier stump) yields on the negative samples, and checks whether it is already below the maxFalseAlarmRate setting.
- Now iteratively adds extra features from the feature pool, each time ensuring that the positives are still classified well enough that the hit rate does not drop below the minimum hit rate (for example 0.995) AND that the FA rate on the negative samples drops.
- Continues to add features until the FA rate drops below the maxFalseAlarmRate. At that point you have a classifier stage that is a bit better than a random guess (50%), and we move up to the next stage. A minimal sketch of this loop is given below.
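To make the control flow concrete, here is a minimal toy sketch in Python. This is not OpenCV's actual traincascade code: `stump_pass`, `stage_pass` and `train_stage` are invented names, weak classifiers are simplified to threshold stumps, and a stage is simplified to a conjunction of stumps instead of a boosted weighted vote. It only illustrates the loop described above: keep adding weak classifiers until the FA rate on the negatives drops below the target, without letting the hit rate on the positives fall under the minimum.

```python
import random

def stump_pass(sample, stump):
    # A stump accepts a window if one feature value clears its threshold.
    idx, thresh = stump
    return sample[idx] >= thresh

def stage_pass(sample, stage):
    # Toy simplification: a window passes the stage only if every stump
    # accepts it (real boosting sums weighted votes instead).
    return all(stump_pass(sample, s) for s in stage)

def hit_rate(stage, positives):
    return sum(stage_pass(p, stage) for p in positives) / len(positives)

def fa_rate(stage, negatives):
    return sum(stage_pass(n, stage) for n in negatives) / len(negatives)

def train_stage(feature_pool, positives, negatives,
                min_hit_rate=0.995, max_false_alarm_rate=0.5):
    stage = []
    # An empty stage accepts everything: hit rate 1.0, FA rate 1.0.
    while fa_rate(stage, negatives) > max_false_alarm_rate:
        # Pick the stump that cuts the FA rate the most while keeping
        # the hit rate on the positives above the minimum.
        best = None
        for stump in feature_pool:
            candidate = stage + [stump]
            if hit_rate(candidate, positives) < min_hit_rate:
                continue
            fa = fa_rate(candidate, negatives)
            if best is None or fa < best[0]:
                best = (fa, stump)
        if best is None or best[0] >= fa_rate(stage, negatives):
            break  # no remaining stump improves the stage any further
        stage.append(best[1])
    return stage

# Tiny demo with random 2-D "windows": positives cluster high, negatives are uniform.
random.seed(0)
positives = [(random.uniform(0.5, 1.0), random.uniform(0.5, 1.0)) for _ in range(200)]
negatives = [(random.uniform(0.0, 1.0), random.uniform(0.0, 1.0)) for _ in range(200)]
pool = [(i, t / 10) for i in (0, 1) for t in range(1, 10)]
stage = train_stage(pool, positives, negatives)
print(len(stage), "stumps, FA:", fa_rate(stage, negatives), "hit:", hit_rate(stage, positives))
```

Because the stages are chained, a per-stage FA target of only 0.5 compounds quickly: 20 stages give a combined false alarm rate of roughly 0.5^20 ≈ 10^-6, which is why "a bit better than a random guess" per stage is enough.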
When moving to the next stage:
- Discard all positive samples that are wrongly classified by the previous stage and grab new ones to replace them. This is why you should never set the numPos parameter to the total number of positive samples that you have.
- Remove all negative samples that were correctly classified (rejected) and grab new windows that still pass through all previous stages, until you have as many as numNeg (see the sketch after this list).
- Train a new stage of weak classifiers.
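Continuing the same toy sketch (it reuses `stage_pass` from above), here is roughly what that replenishment step looks like. `positive_source` and `random_window` are hypothetical stand-ins for the .vec file reader and the background window sampler in the real tool:

```python
def refresh_samples(cascade, positives, negatives, positive_source,
                    random_window, num_pos, num_neg):
    def cascade_pass(sample):
        # A window survives only if every stage trained so far accepts it.
        return all(stage_pass(sample, stage) for stage in cascade)

    # Keep only positives the cascade still classifies correctly, then
    # top the set back up to num_pos from the source.
    positives = [p for p in positives if cascade_pass(p)]
    while len(positives) < num_pos:
        positives.append(next(positive_source))

    # Negatives that were correctly rejected are useless for the next
    # stage; keep only the false positives and mine new hard negatives
    # that fool every stage trained so far.
    negatives = [n for n in negatives if cascade_pass(n)]
    while len(negatives) < num_neg:
        window = random_window()
        if cascade_pass(window):  # hard negative: passes all stages
            negatives.append(window)

    return positives, negatives
```

Because up to a fraction (1 - minHitRate) of the positives can be discarded after every stage, roughly numPos + (numStages - 1) * (1 - minHitRate) * numPos positive samples can be consumed over a full training run, so the positive set needs headroom beyond numPos.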