Module ensemble (1.5.0)

Ensemble models. This module is styled after scikit-learn's ensemble module: https://scikit-learn.org/stable/modules/ensemble.html

Classes

RandomForestClassifier

RandomForestClassifier(
    n_estimators: int = 100,
    *,
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 0.8,
    gamma: float = 0.0,
    max_depth: int = 15,
    subsample: float = 0.8,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    tol: float = 0.01,
    enable_global_explain=False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9"
)

A random forest classifier.

A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
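To make the idea concrete, here is a minimal pure-Python sketch of bagging with depth-1 trees (stumps): each stump is fit on a random sub-sample drawn with replacement, and the forest predicts by majority vote. The helper names (`fit_stump`, `predict_stump`, `fit_forest`, `predict_forest`) are hypothetical and exist only for illustration; this is not how the class itself trains models.

```python
import random
from collections import Counter

def fit_stump(rows, labels):
    """Fit a one-feature threshold classifier (a depth-1 tree) by exhaustive search."""
    best = None
    for f in range(len(rows[0])):
        for thr in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:
        # No valid split (all sampled rows identical): always predict the majority label.
        maj = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]

def predict_stump(stump, row):
    f, thr, l_lab, r_lab = stump
    return l_lab if row[f] <= thr else r_lab

def fit_forest(rows, labels, n_estimators=25, subsample=0.8, seed=0):
    """Fit each stump on its own random sub-sample of the rows (drawn with replacement)."""
    rng = random.Random(seed)
    k = max(1, int(subsample * len(rows)))
    forest = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(rows)) for _ in range(k)]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest

def predict_forest(forest, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]

X = [[0.0], [1.0], [2.0], [3.0], [7.0], [8.0], [9.0], [10.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
print(predict_forest(forest, [0.0]), predict_forest(forest, [10.0]))
```

A single stump trained on an unlucky sub-sample can be wrong, but the vote across many stumps is far more stable, which is the effect the description above refers to.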

Parameters
Name Description
n_estimators Optional[int]

Number of parallel trees constructed during each iteration. Default to 100. Minimum value is 2.

tree_method Optional[str]

Which tree construction method to use. Default to "auto", in which case XGBoost chooses the most conservative option available. Possible values: "auto", "exact", "approx", "hist".

min_tree_child_weight Optional[float]

Minimum sum of instance weight (hessian) needed in a child. Default to 1.

colsample_bytree Optional[float]

Subsample ratio of columns when constructing each tree. Default to 1.0. The value should be between 0 and 1.

colsample_bylevel Optional[float]

Subsample ratio of columns for each level. Default to 1.0. The value should be between 0 and 1.

colsample_bynode Optional[float]

Subsample ratio of columns for each split. Default to 0.8. The value should be between 0 and 1.

gamma Optional[float]

(min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.

max_depth Optional[int]

Maximum tree depth for base learners. Default to 15. The value should be greater than 0.

subsample Optional[float]

Subsample ratio of the training instance. Default to 0.8. The value should be greater than 0 and less than 1.

reg_alpha Optional[float]

L1 regularization term on weights (xgb's alpha). Default to 0.0.

reg_lambda Optional[float]

L2 regularization term on weights (xgb's lambda). Default to 1.0.

tol Optional[float]

Minimum relative loss improvement necessary to continue training. Default to 0.01.

enable_global_explain Optional[bool]

Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.

xgboost_version Optional[str]

Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
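The documented ranges can be checked before a model is constructed. The sketch below is a hypothetical helper, `check_random_forest_params`, that is not part of this module. It assumes that "between 0 and 1" means greater than 0 and at most 1 (the defaults of 1.0 must pass), and it takes the subsample bound literally as strictly between 0 and 1.

```python
def check_random_forest_params(
    n_estimators=100,
    tree_method="auto",
    colsample_bytree=1.0,
    colsample_bylevel=1.0,
    colsample_bynode=0.8,
    subsample=0.8,
    xgboost_version="0.9",
):
    """Return a list of problems; an empty list means every value fits the documented range."""
    problems = []
    if n_estimators < 2:
        problems.append("n_estimators must be at least 2")
    if tree_method not in ("auto", "exact", "approx", "hist"):
        problems.append(f"unknown tree_method: {tree_method!r}")
    ratios = {
        "colsample_bytree": colsample_bytree,
        "colsample_bylevel": colsample_bylevel,
        "colsample_bynode": colsample_bynode,
    }
    for name, value in ratios.items():
        # Assumption: the lower bound is exclusive and the upper bound inclusive.
        if not 0.0 < value <= 1.0:
            problems.append(f"{name} must be in (0, 1]")
    if not 0.0 < subsample < 1.0:
        problems.append("subsample must be strictly between 0 and 1")
    if xgboost_version not in ("0.9", "1.1"):
        problems.append(f"unknown xgboost_version: {xgboost_version!r}")
    return problems

print(check_random_forest_params())                               # defaults pass
print(check_random_forest_params(n_estimators=1, subsample=1.5))  # two problems
```

Returning a list instead of raising on the first failure lets a caller report every out-of-range value at once.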

RandomForestRegressor

RandomForestRegressor(
    n_estimators: int = 100,
    *,
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree=1.0,
    colsample_bylevel=1.0,
    colsample_bynode=0.8,
    gamma=0.0,
    max_depth: int = 15,
    subsample=0.8,
    reg_alpha=0.0,
    reg_lambda=1.0,
    tol=0.01,
    enable_global_explain=False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9"
)

A random forest regressor.

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.

Parameters
Name Description
n_estimators Optional[int]

Number of parallel trees constructed during each iteration. Default to 100. Minimum value is 2.

tree_method Optional[str]

Which tree construction method to use. Default to "auto", in which case XGBoost chooses the most conservative option available. Possible values: "auto", "exact", "approx", "hist".

min_tree_child_weight Optional[float]

Minimum sum of instance weight (hessian) needed in a child. Default to 1.

colsample_bytree Optional[float]

Subsample ratio of columns when constructing each tree. Default to 1.0. The value should be between 0 and 1.

colsample_bylevel Optional[float]

Subsample ratio of columns for each level. Default to 1.0. The value should be between 0 and 1.

colsample_bynode Optional[float]

Subsample ratio of columns for each split. Default to 0.8. The value should be between 0 and 1.

gamma Optional[float]

(min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.

max_depth Optional[int]

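Maximum tree depth for base learners. Default to 15. The value should be greater than 0.

subsample Optional[float]

Subsample ratio of the training instance. Default to 0.8. The value should be greater than 0 and less than 1.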

reg_alpha Optional[float]

L1 regularization term on weights (xgb's alpha). Default to 0.0.

reg_lambda Optional[float]

L2 regularization term on weights (xgb's lambda). Default to 1.0.

tol Optional[float]

Minimum relative loss improvement necessary to continue training. Default to 0.01.

enable_global_explain Optional[bool]

Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.

xgboost_version Optional[str]

Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
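The sampling ratios are easier to reason about with concrete numbers. The sketch below shows what `subsample` and the three `colsample_*` ratios select for a dataset of 100 rows and 10 columns, using the random forest defaults. It assumes sampling without replacement and applies the column ratios cumulatively (tree, then level, then split), which is how the XGBoost parameter documentation describes them. The helpers `sample_rows` and `sample_columns` are hypothetical.

```python
import random

def sample_rows(n_rows, subsample, rng):
    """Pick a fraction of the row indices without replacement."""
    k = max(1, round(subsample * n_rows))
    return sorted(rng.sample(range(n_rows), k))

def sample_columns(columns, ratio, rng):
    """Pick a fraction of the given columns without replacement."""
    k = max(1, round(ratio * len(columns)))
    return sorted(rng.sample(columns, k))

rng = random.Random(42)
columns = [f"f{i}" for i in range(10)]

rows = sample_rows(100, 0.8, rng)                  # subsample=0.8
tree_cols = sample_columns(columns, 1.0, rng)      # colsample_bytree=1.0
level_cols = sample_columns(tree_cols, 1.0, rng)   # colsample_bylevel=1.0
node_cols = sample_columns(level_cols, 0.8, rng)   # colsample_bynode=0.8

print(len(rows), len(tree_cols), len(level_cols), len(node_cols))  # 80 10 10 8
```

With these defaults every tree sees 80% of the rows, and every split considers 8 of the 10 columns.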

XGBClassifier

XGBClassifier(
    n_estimators: int = 1,
    *,
    booster: typing.Literal["gbtree", "dart"] = "gbtree",
    dart_normalized_type: typing.Literal["tree", "forest"] = "tree",
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 1.0,
    gamma: float = 0.0,
    max_depth: int = 6,
    subsample: float = 1.0,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    learning_rate: float = 0.3,
    max_iterations: int = 20,
    tol: float = 0.01,
    enable_global_explain: bool = False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9"
)

XGBoost classifier model.

Parameters
Name Description
n_estimators Optional[int]

Number of parallel trees constructed during each iteration. Default to 1.

booster Optional[str]

Specify which booster to use: gbtree or dart. Default to "gbtree".

dart_normalized_type Optional[str]

Type of normalization algorithm for DART booster. Possible values: "tree", "forest". Default to "tree".

tree_method Optional[str]

Which tree construction method to use. Default to "auto", in which case XGBoost chooses the most conservative option available. Possible values: "auto", "exact", "approx", "hist".

min_tree_child_weight Optional[float]

Minimum sum of instance weight (hessian) needed in a child. Default to 1.

colsample_bytree Optional[float]

Subsample ratio of columns when constructing each tree. Default to 1.0.

colsample_bylevel Optional[float]

Subsample ratio of columns for each level. Default to 1.0.

colsample_bynode Optional[float]

Subsample ratio of columns for each split. Default to 1.0.

gamma Optional[float]

(min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.

max_depth Optional[int]

Maximum tree depth for base learners. Default to 6.

subsample Optional[float]

Subsample ratio of the training instance. Default to 1.0.

reg_alpha Optional[float]

L1 regularization term on weights (xgb's alpha). Default to 0.0.

reg_lambda Optional[float]

L2 regularization term on weights (xgb's lambda). Default to 1.0.

learning_rate Optional[float]

Boosting learning rate (xgb's "eta"). Default to 0.3.

max_iterations Optional[int]

Maximum number of rounds for boosting. Default to 20.

tol Optional[float]

Minimum relative loss improvement necessary to continue training. Default to 0.01.

enable_global_explain Optional[bool]

Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.

xgboost_version Optional[str]

Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
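The interaction between `max_iterations` and `tol` can be sketched as a stopping rule: boosting runs for at most `max_iterations` rounds and stops earlier once the relative loss improvement falls below `tol`. The function below, `boosting_rounds`, is a hypothetical illustration of that rule applied to a given sequence of per-round losses. The exact criterion used by the training backend is an assumption here.

```python
def boosting_rounds(losses, max_iterations=20, tol=0.01):
    """Count the rounds that run before the relative improvement drops below tol."""
    rounds = 0
    prev = None
    for loss in losses[:max_iterations]:
        rounds += 1
        if prev is not None and prev > 0:
            relative_improvement = (prev - loss) / prev
            if relative_improvement < tol:
                break  # improvement too small: stop training
        prev = loss
    return rounds

# Improvement collapses in round 4: (0.3 - 0.299) / 0.3 is about 0.0033 < 0.01.
print(boosting_rounds([1.0, 0.5, 0.3, 0.299, 0.298]))

# Loss halves every round, so only max_iterations ends training.
print(boosting_rounds([2.0 ** -i for i in range(30)]))
```

Under this rule a smaller `tol` allows more rounds, while `max_iterations` always acts as the hard cap.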

XGBRegressor

XGBRegressor(
    n_estimators: int = 1,
    *,
    booster: typing.Literal["gbtree", "dart"] = "gbtree",
    dart_normalized_type: typing.Literal["tree", "forest"] = "tree",
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 1.0,
    gamma: float = 0.0,
    max_depth: int = 6,
    subsample: float = 1.0,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    learning_rate: float = 0.3,
    max_iterations: int = 20,
    tol: float = 0.01,
    enable_global_explain: bool = False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9"
)

XGBoost regression model.

Parameters
Name Description
n_estimators Optional[int]

Number of parallel trees constructed during each iteration. Default to 1.

booster Optional[str]

Specify which booster to use: gbtree or dart. Default to "gbtree".

dart_normalized_type Optional[str]

Type of normalization algorithm for DART booster. Possible values: "tree", "forest". Default to "tree".

tree_method Optional[str]

Which tree construction method to use. Default to "auto", in which case XGBoost chooses the most conservative option available. Possible values: "auto", "exact", "approx", "hist".

min_tree_child_weight Optional[float]

Minimum sum of instance weight (hessian) needed in a child. Default to 1.

colsample_bytree Optional[float]

Subsample ratio of columns when constructing each tree. Default to 1.0.

colsample_bylevel Optional[float]

Subsample ratio of columns for each level. Default to 1.0.

colsample_bynode Optional[float]

Subsample ratio of columns for each split. Default to 1.0.

gamma Optional[float]

(min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.

max_depth Optional[int]

Maximum tree depth for base learners. Default to 6.

subsample Optional[float]

Subsample ratio of the training instance. Default to 1.0.

reg_alpha Optional[float]

L1 regularization term on weights (xgb's alpha). Default to 0.0.

reg_lambda Optional[float]

L2 regularization term on weights (xgb's lambda). Default to 1.0.

learning_rate Optional[float]

Boosting learning rate (xgb's "eta"). Default to 0.3.

max_iterations Optional[int]

Maximum number of rounds for boosting. Default to 20.

tol Optional[float]

Minimum relative loss improvement necessary to continue training. Default to 0.01.

enable_global_explain Optional[bool]

Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.

xgboost_version Optional[str]

Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
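The regularization parameters are easiest to read through the textbook XGBoost formulas for a leaf weight and a split gain, where G and H are the sums of gradients and hessians of the rows in a node. The sketch below follows the formulation in the XGBoost introduction to boosted trees, extended with L1 soft-thresholding for `reg_alpha`. It is an illustration under those assumptions, not the backend's code, and the names `soft_threshold`, `leaf_weight`, `split_gain` and `min_child_hessian` are hypothetical.

```python
def soft_threshold(g, alpha):
    """Shrink g toward zero by alpha (the effect of L1 regularization)."""
    if g > alpha:
        return g - alpha
    if g < -alpha:
        return g + alpha
    return 0.0

def leaf_weight(G, H, reg_lambda=1.0, reg_alpha=0.0):
    """Optimal leaf value: larger reg_lambda or reg_alpha pulls it toward zero."""
    return -soft_threshold(G, reg_alpha) / (H + reg_lambda)

def split_gain(GL, HL, GR, HR, reg_lambda=1.0, reg_alpha=0.0,
               gamma=0.0, min_child_hessian=1.0):
    """Loss reduction of a split; a non-positive gain means the split is not worth making."""
    if HL < min_child_hessian or HR < min_child_hessian:
        return float("-inf")  # a child is too light: the split is not allowed

    def score(G, H):
        return soft_threshold(G, reg_alpha) ** 2 / (H + reg_lambda)

    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma

# Squared-error example: 4 rows per side, each with hessian 1.
w_left = leaf_weight(G=-4.0, H=4.0)   # 4 / (4 + 1) = 0.8
step = 0.3 * w_left                   # learning_rate scales each tree's contribution
print(w_left, step)
print(split_gain(-4.0, 4.0, 4.0, 4.0))             # 0.5 * (3.2 + 3.2 - 0) = 3.2
print(split_gain(-4.0, 4.0, 4.0, 4.0, gamma=4.0))  # negative: gamma vetoes the split
```

Read this way, `gamma` is a fixed price per split, `reg_lambda` and `reg_alpha` shrink leaf values, and `learning_rate` shrinks how much each boosting round moves the prediction.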