Hi! I worked through this once. It's easy :)

In the case of a categorical variable, a tree split is stored as a bitmap "subset". This mask determines which categories of the split variable (i.e. which samples) go to the left child node (direction -1) and which go to the right one (direction +1). The macro computes the direction for a given category of the variable.

In the implementation the bitmap "subset" is an array of 'int'. "idx" is a given category.

(idx)>>5 - this is equivalent to integer division by 32, the number of bits in a 32-bit 'int' (sizeof(int)*8), i.e. we find the index of the element of the array "subset" that contains the bit for the given category;

(idx) & 31 - this is the remainder of dividing by 32. Here we find the position of the category's bit inside that array element.

1 << ((idx) & 31) - this gives a mask filled with zeros and having a single "1" in the required position.

(subset[(idx)>>5]&(1 << ((idx) & 31)))==0 - here we test the bit for the given category: the comparison yields 1 if the bit is clear and 0 if it is set.

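(2*((subset[(idx)>>5]&(1 << ((idx) & 31)))==0)-1) - this maps the comparison result to a direction: if the bit is set, the comparison gives 0 and we get direction -1 (left); if the bit is clear, it gives 1 and we get +1 (right).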
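
To make the steps above concrete, here is a minimal sketch in C. The helper names (word_index, bit_index, send_left, cat_dir) are made up for illustration and are not part of the library; only the bit arithmetic follows the macro expression quoted above. It assumes a 32-bit 'int' and uses unsigned arithmetic for the shift, which is a small deviation from the macro to keep bit 31 well defined.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers; only the bit arithmetic follows the macro. */

/* The shift by 5 and the mask 31 only make sense if an int has 32 bits.
   This typedef fails to compile on platforms where that is not true. */
typedef char assume_int_is_32_bits[(sizeof(int) * CHAR_BIT == 32) ? 1 : -1];

/* Index of the array element that holds the bit for category idx. */
static int word_index(int idx) { return idx >> 5; }   /* idx / 32 */

/* Position of that bit inside the array element. */
static int bit_index(int idx) { return idx & 31; }    /* idx % 32 */

/* Mark category idx as going to the left child by setting its bit. */
static void send_left(int *subset, int idx)
{
    unsigned mask;
    assert(idx >= 0);
    mask = 1u << bit_index(idx);
    subset[word_index(idx)] = (int)((unsigned)subset[word_index(idx)] | mask);
}

/* Direction for category idx: -1 (left) if its bit is set,
   +1 (right) if its bit is clear. */
static int cat_dir(int idx, const int *subset)
{
    unsigned mask = 1u << (idx & 31);
    int bit_is_clear = ((unsigned)subset[idx >> 5] & mask) == 0;
    return 2 * bit_is_clear - 1;
}
```

For example, with a zeroed two-element array, calling send_left for categories 3 and 37 leaves subset[0] equal to 8 (bit 3) and subset[1] equal to 32 (bit 5, since 37 = 32 + 5). After that, cat_dir returns -1 for categories 3 and 37 and +1 for any other category below 64.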