Adaboost Influence Trimming Takes Longer to Train

asked 2014-05-28 11:12:37 -0600

radford.parker

updated 2014-05-28 11:32:33 -0600

The documentation states that influence trimming can be used "to reduce the computation time for boosted models without substantially losing accuracy". By default, the weight_trim_rate parameter is 0.95. After disabling influence trimming by setting that parameter to 0, I actually get a large speed-up: 5x on a dataset with 262144 samples, and 3x on a dataset ten times larger. This is the opposite of the expected behavior. Can anyone explain why this might be happening? Thanks!
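For readers unfamiliar with the idea, here is a minimal sketch of what influence trimming does in each boosting round. It is not OpenCV's implementation: the function name `trim_by_weight` and the exact selection rule (keep the heaviest samples until they cover `trim_rate` of the total weight, with 0 meaning "off") are assumptions made for illustration.

```python
def trim_by_weight(weights, trim_rate):
    """Return the indices of samples kept for the next boosting round.

    Assumed rule (illustrative only): keep the heaviest samples whose
    weights together reach at least `trim_rate` of the total weight.
    A trim_rate of 0 (or anything outside (0, 1)) disables trimming.
    """
    if trim_rate <= 0 or trim_rate >= 1:
        return list(range(len(weights)))

    total = sum(weights)
    # Visit samples from heaviest to lightest.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)

    kept, running = [], 0.0
    for i in order:
        kept.append(i)
        running += weights[i]
        if running >= trim_rate * total:
            break
    return sorted(kept)


# Hypothetical sample weights, not taken from the dataset in the question.
weights = [50, 30, 10, 5, 5]
print(trim_by_weight(weights, 0.85))  # trimming on
print(trim_by_weight(weights, 0))     # trimming off
```

In this sketch the selection step is extra work done every round, so trimming only pays off if training on the reduced sample set saves more time than the selection costs. Whether that trade-off explains the timings in the question is not something the sketch can show.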

100 weak classifiers with a max depth of 1

Trim     Accuracy    MSE       Training Time    Speedup vs. Trim 0 (ratio)
0        95.03        3.989    10.607           1 (baseline)
0.6       7.88       86.77      1.252           8.472044728
0.7      15.76       78.21      2.319           4.573954291
0.8      33.35       57.73     52.972           0.200237862
0.9      94.68        4.89     52.484           0.202099688
0.95     94.94        4.189    52.31            0.202771937
0.99     95.03        3.99     47.026           0.225556075
0.999    95.02        3.985    44.432           0.238724343
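The last column is a ratio rather than a percentage: the training time with trimming disabled divided by the training time at each trim rate, so values below 1 mean training got slower. A short sketch that reproduces it from the timings in the table (the variable and function names are illustrative):

```python
# Training times copied from the table above, keyed by weight_trim_rate.
baseline_time = 10.607  # trim = 0, influence trimming disabled
times = {
    0.6: 1.252,
    0.7: 2.319,
    0.8: 52.972,
    0.9: 52.484,
    0.95: 52.31,
    0.99: 47.026,
    0.999: 44.432,
}


def speedup(baseline, measured):
    """Ratio of baseline time to measured time; below 1 means slower."""
    return baseline / measured


ratios = {trim: speedup(baseline_time, t) for trim, t in times.items()}

for trim, ratio in ratios.items():
    label = "faster" if ratio > 1 else "slower"
    print(f"trim={trim}: {ratio:.3f}x ({label} than no trimming)")
```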

Comments

It is exactly as stated: training is faster, but look at your accuracy! It collapses entirely, and that should be avoided :)

StevenPuttemans (2014-05-28 11:53:41 -0600)

I apologize, as the data may be confusing. The base case here is with influence trimming disabled, which gives an accuracy of 95.03 and a training time of 10.607. With influence trimming turned on (at the default of 0.95), the accuracy drops to 94.94 as expected, but training takes about 5x as long.

radford.parker (2014-05-28 12:08:48 -0600)