class_balance as hyperparameter name instead of class_weight.
The class_weight hyperparameter name is recognized by check_estimator(), which then runs the test check_class_weight_classifiers(). That test uses the dict parameter and, for a decision tree, requires the “min_weight_fraction_leaf” hyperparameter to be implemented in order to pass.
sample_weights) are used and for multi-output (single model), class_weights are calculated and treated separately for each output.
Scikit-learn’s compute_class_weight() and compute_sample_weight() functions multiply the sample_weights for multi-output together into a single sample_weight.
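The difference between the two treatments can be illustrated with a small pure-Python sketch. It does not call scikit-learn or this library. The helper name balanced_class_weights is hypothetical, and the "balanced" heuristic n_samples / (n_classes * class_count) is assumed here only to have concrete numbers to work with.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Balanced weights for one output: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Toy multi-output targets: each row is one sample, each column one output.
y = [[0, 1],
     [0, 0],
     [0, 1],
     [1, 1]]

columns = list(zip(*y))  # one tuple of labels per output

# Per-output treatment: one weight table per output, kept separate.
per_output = [balanced_class_weights(col) for col in columns]

# Multiplied treatment: one weight per sample, the product over all outputs.
combined = []
for row in y:
    w = 1.0
    for table, label in zip(per_output, row):
        w *= table[label]
    combined.append(w)
```

In the per-output form, each output keeps its own weight table, so a rare class in one output does not change how the other output is weighted. In the multiplied form, that information is folded into a single number per sample.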
Overridden score() function instead of the inherited score() function, which relies on the metrics module.
Scikit-learn’s metrics module does not support “multiclass-multioutput” format.
We provide and use the same Random Number Generator from our C++ implementation in Python.
Cython (Python bindings for C++ library)
To allow all data to be passed by reference (not copied) between Python and C++, multi-dimensional arrays in Python, such as X, y and class_weights, are handled as one-dimensional arrays in C++. The downside is that this leads to a less elegant programming style in our C++ library.
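Handling a two-dimensional array as a one-dimensional one comes down to index arithmetic. The sketch below assumes a row-major (C order) layout, which the text above does not state, and the helper name flat_index is hypothetical. It only shows the addressing scheme; the list comprehension copies the data, whereas the actual bindings pass it by reference.

```python
def flat_index(i, j, n_cols):
    """Row-major offset of element (i, j) in a matrix with n_cols columns."""
    return i * n_cols + j

X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
n_rows, n_cols = len(X), len(X[0])

# One contiguous buffer in row-major order (layout is an assumption).
X_flat = [v for row in X for v in row]

# Element X[1][2] is found at offset 1 * 3 + 2 = 5 in the flat buffer.
value = X_flat[flat_index(1, 2, n_cols)]
```

Every access on the C++ side then has to carry the number of columns along, which is the loss of elegance mentioned above.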
The basic concepts for the implementation of the classifiers, including the stack, the samples LUT with in-place partitioning, and incremental histogram updates, are based on:
Louppe, Understanding Random Forests, PhD Thesis, 2014
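As a rough illustration of in-place partitioning on a samples look-up table, the following sketch reorders a slice of sample indices around a threshold without allocating new arrays. It is not this library's code. The function name, the argument names and the "less than or equal goes left" convention are assumptions made for the example.

```python
def partition_samples(samples, start, end, feature_values, threshold):
    """Reorder samples[start:end] in place so that indices whose feature value
    is <= threshold come first; return the position of the first right-child index."""
    i, j = start, end
    while i < j:
        if feature_values[samples[i]] <= threshold:
            i += 1  # already on the left side
        else:
            j -= 1  # move this index to the right side by swapping
            samples[i], samples[j] = samples[j], samples[i]
    return i

feature_values = [0.9, 0.1, 0.5, 0.7, 0.3]
samples = [0, 1, 2, 3, 4]  # look-up table of sample indices

split = partition_samples(samples, 0, len(samples), feature_values, 0.5)
left, right = samples[:split], samples[split:]
```

After the call, a node's samples are described by a (start, end) range into the same table, so child nodes can be pushed onto a stack as index ranges rather than as copies of the data.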
Not Missing At Random (NMAR)
The probability of an instance having a missing value for a feature may depend on the value of that feature.
Training: The split criterion considers missing values as another category, and samples with missing values are passed to either the left or the right child, depending on which option provides the best split.
Testing: If the split criterion includes missing values, a missing value is dealt with accordingly (passed to the left or right child). If the split criterion does not include missing values, a missing value at that split is dealt with by combining the results from both children proportionally to the number of samples that were passed to the children during training (the same as for MCAR, Missing Completely At Random).
Note that the number of samples that are passed to the children represents the feature’s estimated probability distribution for the particular missing value based on the training data.
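The test-time rule can be sketched as a small recursive function over a tree stored as nested dictionaries. This is a minimal sketch and not this library's implementation. The dictionary keys (missing_go_left, n_left, n_right and so on) are hypothetical, and None stands in for a missing value.

```python
def predict_proba(node, x):
    """Class probabilities for sample x (a list; None marks a missing value)."""
    if "proba" in node:  # leaf node
        return node["proba"]
    value = x[node["feature"]]
    if value is None:
        if node["missing_go_left"] is not None:
            # The split learned a direction for missing values during training.
            child = node["left"] if node["missing_go_left"] else node["right"]
            return predict_proba(child, x)
        # No learned direction: combine both children, weighted by the
        # number of training samples that went to each child.
        n_l, n_r = node["n_left"], node["n_right"]
        p_l = predict_proba(node["left"], x)
        p_r = predict_proba(node["right"], x)
        return [(n_l * a + n_r * b) / (n_l + n_r) for a, b in zip(p_l, p_r)]
    child = node["left"] if value <= node["threshold"] else node["right"]
    return predict_proba(child, x)

# Toy tree: 30 training samples went left, 10 went right.
tree = {"feature": 0, "threshold": 0.5, "missing_go_left": None,
        "n_left": 30, "n_right": 10,
        "left": {"proba": [0.9, 0.1]}, "right": {"proba": [0.2, 0.8]}}

p = predict_proba(tree, [None])
```

With these counts, the left child contributes three quarters of the result and the right child one quarter, giving approximately [0.725, 0.275].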