It's much easier to overfit the data with a large max_depth than with a large max_num_of_trees_in_the_forest. In fact, in my experience with random forests it is safe to use a fairly large number of trees, e.g. 30, 40 or 50. I would plot a graph of "accuracy vs number of trees". The curve should flatten out as the number of trees grows, and the point where it levels off should give you an idea of what number to pick.
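Here is a rough sketch of how the "accuracy vs number of trees" curve could be produced. It uses scikit-learn's RandomForestClassifier rather than the API the parameter names above come from, so n_estimators stands in for max_num_of_trees_in_the_forest. The synthetic dataset, the train/test split and the helper name accuracy_vs_trees are all assumptions for illustration; swap in your own data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def accuracy_vs_trees(tree_counts, max_depth=None, seed=0):
    """Return (n_trees, held-out accuracy) pairs, one per forest size."""
    # Synthetic data stands in for the real training set (assumption).
    X, y = make_classification(
        n_samples=600, n_features=20, n_informative=8, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    results = []
    for n in tree_counts:
        # n_estimators plays the role of max_num_of_trees_in_the_forest.
        forest = RandomForestClassifier(
            n_estimators=n, max_depth=max_depth, random_state=seed
        )
        forest.fit(X_train, y_train)
        # score() returns mean accuracy on the held-out samples.
        results.append((n, forest.score(X_test, y_test)))
    return results


curve = accuracy_vs_trees([1, 5, 10, 20, 30, 40, 50])
for n_trees, acc in curve:
    print(f"{n_trees:3d} trees -> accuracy {acc:.3f}")
```

The (n_trees, accuracy) pairs can be handed to any plotting tool. Running it again with a different max_depth lets you compare how each parameter affects held-out accuracy.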