AdaBoost Influence Trimming Takes Longer to Train
The documentation states that influence trimming can be used "to reduce the computation time for boosted models without substantially losing accuracy". By default, the weight_trim_rate parameter is 0.95. After disabling influence trimming by setting that parameter to 0, I actually get a large speed-up. With a dataset of 262144 samples, training is about 5x faster. With a dataset ten times larger, it is about 3x faster. This seems to be the opposite of the expected behavior. Can anyone explain why this might be happening? Thanks!
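For readers unfamiliar with the technique, here is a minimal sketch of influence trimming as it is commonly described: in each boosting round, the lowest-weight samples are dropped until the dropped weight would exceed a fraction (1 - weight_trim_rate) of the total. The function name `kept_after_trimming` and the handling of the 0 and 1 edge cases are assumptions for illustration only; this is not the library's actual implementation.

```python
def kept_after_trimming(weights, trim_rate):
    """Return indices of samples kept for this boosting round.

    Hypothetical helper: drops the lowest-weight samples whose combined
    weight stays within (1 - trim_rate) of the total weight.
    """
    n = len(weights)
    # Assumption: a rate outside (0, 1) means trimming is disabled.
    if trim_rate <= 0 or trim_rate >= 1:
        return list(range(n))

    total = sum(weights)
    budget = (1.0 - trim_rate) * total  # weight mass we may discard

    # Visit samples from smallest to largest weight.
    order = sorted(range(n), key=lambda i: weights[i])
    dropped = set()
    discarded = 0.0
    for i in order:
        if discarded + weights[i] > budget:
            break
        discarded += weights[i]
        dropped.add(i)

    return [i for i in range(n) if i not in dropped]


# Five samples; the two lightest carry 10% of the total weight.
sample_weights = [40, 30, 20, 6, 4]
print(kept_after_trimming(sample_weights, 0.8))  # keeps the three heaviest
print(kept_after_trimming(sample_weights, 0))    # trimming disabled, keeps all
```

Note that this sketch sorts the weights on every call, which is extra work that the untrimmed path does not do. Whether that kind of overhead accounts for the timings reported here is not established.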
100 weak classifiers with a max depth of 1
Trim     Accuracy   MSE      Training Time   Speedup (x)
0        95.03      3.989    10.607          -
0.6      7.88       86.77    1.252           8.472044728
0.7      15.76      78.21    2.319           4.573954291
0.8      33.35      57.73    52.972          0.200237862
0.9      94.68      4.89     52.484          0.202099688
0.95     94.94      4.189    52.31           0.202771937
0.99     95.03      3.99     47.026          0.225556075
0.999    95.02      3.985    44.432          0.238724343
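The last column of the table is consistent with a ratio rather than a percentage: the training time with trimming disabled (10.607) divided by the training time for that row. A short sketch that reproduces it from the reported times follows; the variable and function names are illustrative, and the values are copied from the table.

```python
# Training time with trimming disabled (Trim = 0), taken from the table.
BASELINE_TIME = 10.607

# Training times for each weight_trim_rate value, taken from the table.
training_times = {
    0.6: 1.252,
    0.7: 2.319,
    0.8: 52.972,
    0.9: 52.484,
    0.95: 52.31,
    0.99: 47.026,
    0.999: 44.432,
}


def speedup(baseline, elapsed):
    """Ratio > 1 means faster than the baseline, < 1 means slower."""
    return baseline / elapsed


for rate, elapsed in training_times.items():
    print(f"trim={rate}: speedup={speedup(BASELINE_TIME, elapsed):.3f}x")
```

A value below 1, such as the one for the default rate of 0.95, means training was slower than with trimming disabled.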
It is exactly as stated: training is faster, but look at your accuracy! It collapses completely, and that should be avoided :)
I apologize, as the data may be confusing. The base case here is when influence trimming is disabled. That gives an accuracy of 95.03 and a training time of 10.607. When influence trimming is turned on (with the default of 0.95), the accuracy drops to 94.94 as expected, but training takes about 5x as long.