koho.cpp 1.1.0

koho::DepthFirstTreeBuilder Class Reference

Build a binary decision tree in depth-first order.

#include <decision_tree.h>
Public Member Functions

  DepthFirstTreeBuilder (OutputsIdx_t n_outputs, ClassesIdx_t *n_classes, ClassesIdx_t n_classes_max, FeaturesIdx_t n_features, SamplesIdx_t n_samples, ClassWeights_t *class_weight, TreeDepthIdx_t max_depth, FeaturesIdx_t max_features, unsigned long max_thresholds, std::string missing_values, RandomState const &random_state)
      Create and initialize a new depth-first tree builder.

  void build (Tree &tree, Features_t *X, Classes_t *y, SamplesIdx_t n_samples)
      Build a binary decision tree from the training data.

Protected Attributes

  TreeDepthIdx_t max_depth
  std::string missing_values
  BestSplitter splitter
Build a binary decision tree in depth-first order.
koho::DepthFirstTreeBuilder::DepthFirstTreeBuilder (
    OutputsIdx_t        n_outputs,
    ClassesIdx_t *      n_classes,
    ClassesIdx_t        n_classes_max,
    FeaturesIdx_t       n_features,
    SamplesIdx_t        n_samples,
    ClassWeights_t *    class_weight,
    TreeDepthIdx_t      max_depth,
    FeaturesIdx_t       max_features,
    unsigned long       max_thresholds,
    std::string         missing_values,
    RandomState const & random_state )
Create and initialize a new depth-first tree builder.
Parameters
    [in]  n_outputs       Number of outputs (multi-output), minimum 1.
    [in]  n_classes       Number of classes in the training data for each output, minimum 2 [n_outputs].
    [in]  n_classes_max   Maximum number of classes across all outputs.
    [in]  n_features      Number of features in the training data, minimum 1.
    [in]  n_samples       Number of samples in the training data, minimum 2.
    [in]  class_weight    Class weights for each output separately. The weight for each class should be inversely proportional to the class frequency in the training data for class balancing, or 1.0 otherwise [n_outputs x max(n_classes for each output)].
    [in]  max_depth       The tree is expanded until the specified maximum depth is reached, all leaves are pure, or no further impurity improvement can be achieved.
    [in]  max_features    Number of random features to consider when looking for the best split at each node, between 1 and n_features. Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if that requires inspecting more than max_features features (up to the point that all features have been considered).
    [in]  max_thresholds  Number of random thresholds to consider when looking for the best split at each node, 0 or 1. If 0, all thresholds based on the mid-points of the node samples are considered. If 1, one random threshold is considered.
    [in]  missing_values  Handling of missing values: "NMAR" or "None" (default "None").
                          If "NMAR" (Not Missing At Random), then during training the split criterion treats missing values as another category, and samples with missing values are passed to either the left or the right child, whichever provides the best split. During testing, if the split criterion includes missing values, a missing value is handled accordingly (passed to the left or right child); if the split criterion does not include missing values, a missing value at that split is handled by combining the results from both children, proportionally to the number of samples passed to each child during training.
                          If "None", an error is raised if one of the features has a missing value. An alternative is to impute (fill in) missing values before using the decision tree classifier.
    [in]  random_state    Initialized random number generator.
"Decision Tree": max_features=n_features, max_thresholds=0.

The following configurations should only be used for "decision forests":
- "Random Tree": max_features < n_features, max_thresholds=0.
- "Extremely Randomized Trees (ET)": max_features=n_features, max_thresholds=1.
- "Totally Randomized Trees": max_features=1, max_thresholds=1; very similar to "Perfect Random Trees (PERT)".
void koho::DepthFirstTreeBuilder::build (
    Tree &         tree,
    Features_t *   X,
    Classes_t *    y,
    SamplesIdx_t   n_samples )
Build a binary decision tree from the training data.
Parameters
    [in,out]  tree       A binary decision tree.
    [in]      X          Training input samples [n_samples x n_features].
    [in]      y          Target class labels corresponding to the training input samples [n_samples].
    [in]      n_samples  Number of samples, minimum 2.