Hello,
The CvDTree documentation, in the section on __max_categories__, says:
"In case of regression and categorical variable the optimal split can be found efficiently without employing clustering, thus the parameter is not used in these cases."
I was wondering how this is achieved algorithmically. I looked for the algorithm in the paper cited there but could not find it.
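
To make the question concrete, here is my current guess. I have not confirmed it against the OpenCV source or the cited paper. For regression with a squared-error criterion, there is a known result from the CART literature: if the categories are sorted by their mean response, the best binary partition is always one of the k-1 "prefix" splits in that order, so there is no need to try all 2^(k-1)-1 subsets or to cluster the categories. The function name `best_categorical_split` and its arguments are made up for illustration and are not part of any OpenCV API.

```python
from collections import defaultdict


def best_categorical_split(categories, targets):
    """Return (left_categories, sse) for the binary split of a categorical
    variable that minimizes the total squared error of a regression target.

    Hypothetical helper: a sketch of the sort-by-mean trick, not OpenCV code.
    """
    # Accumulate the sample count and target sum for each category.
    count = defaultdict(int)
    total = defaultdict(float)
    for c, y in zip(categories, targets):
        count[c] += 1
        total[c] += y

    # Order the categories by their mean target value.
    ordered = sorted(count, key=lambda c: total[c] / count[c])

    n = len(targets)
    sum_all = sum(targets)
    sum_sq = sum(y * y for y in targets)

    best_left, best_sse = None, float("inf")
    n_left, sum_left = 0, 0.0
    # Scan only the k-1 prefix splits of the sorted category list.
    for i, c in enumerate(ordered[:-1]):
        n_left += count[c]
        sum_left += total[c]
        n_right = n - n_left
        sum_right = sum_all - sum_left
        # Total squared error around the two child means:
        # sum(y^2) - S_left^2 / n_left - S_right^2 / n_right
        sse = sum_sq - sum_left ** 2 / n_left - sum_right ** 2 / n_right
        if sse < best_sse:
            best_left, best_sse = set(ordered[: i + 1]), sse
    return best_left, best_sse
```

If this is what the library does, the cost per variable would be one pass over the samples plus a sort of the k categories, which would explain why __max_categories__ is not needed here. Is this the algorithm that is actually used?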