Sad as it is, none of what you wanted is built in. All strings are discarded while the CSV is read, and each one is replaced with a (1-based!) index, assigned in order of first appearance.
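To make that replacement concrete, here is a small sketch of the kind of mapping described above. It is an illustration in plain Python, not OpenCV's actual code; the function name `index_strings` and the sample column are made up for the example.

```python
def index_strings(values):
    """Map each distinct string to a 1-based index, in order of first appearance."""
    mapping = {}
    indices = []
    for value in values:
        if value not in mapping:
            # The first new string gets 1, the next new one gets 2, and so on.
            mapping[value] = len(mapping) + 1
        indices.append(mapping[value])
    return indices, mapping


# Hypothetical string column from a CSV file.
column = ["cat", "autobus", "cat", "accordion", "autobus"]
indices, mapping = index_strings(column)
print(indices)
print(mapping)
```

Note that the original strings are gone after this step; only the indices remain, and they depend on the row order of the file.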

You will probably need to do your own CSV preprocessing to handle this better, instead of relying on OpenCV's TrainData utility.

Categorical values that do not represent an order (e.g. "cat", "autobus", "accordion") should NOT be represented in a single numeric variable (as is done here, (0,1,2)). Instead, you have to find a suitable embedding for your string set, such as "one-hot" encoding them.
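A minimal sketch of such preprocessing, assuming a small CSV with one string column and otherwise numeric columns. It uses only the Python standard library; the helper name `one_hot_column`, the column position, and the sample data are assumptions made for illustration, not part of OpenCV.

```python
import csv
import io


def one_hot_column(rows, col):
    """Replace the string column `col` with one 0/1 column per category."""
    # Collect categories in order of first appearance.
    categories = []
    for row in rows:
        if row[col] not in categories:
            categories.append(row[col])

    encoded = []
    for row in rows:
        one_hot = [1.0 if row[col] == c else 0.0 for c in categories]
        # All remaining columns are assumed to be numeric.
        numeric = [float(x) for i, x in enumerate(row) if i != col]
        # Put the one-hot block where the string column used to be.
        encoded.append(numeric[:col] + one_hot + numeric[col:])
    return encoded, categories


# Hypothetical data: feature, category, label.
CSV_TEXT = "1.5,cat,0\n2.0,autobus,1\n0.5,accordion,0\n3.0,cat,1\n"
rows = list(csv.reader(io.StringIO(CSV_TEXT)))
encoded, categories = one_hot_column(rows, col=1)

for line in encoded:
    print(line)
```

Each category gets its own column, so no artificial ordering between "cat", "autobus" and "accordion" is implied. The resulting all-numeric rows can then be converted to a float32 matrix and handed to the ml classes directly.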