GradientBoostingClassifier

Gradient boosting is a machine learning method that produces predictions by training an ensemble of weak estimators. GradientBoostingClassifier is an implementation of gradient boosting for classification tasks.

Import

import * as DataCook from '@pipcook/datacook';
const { GradientBoostingClassifier } = DataCook.Model;

Constructor

const gb = new GradientBoostingClassifier({ nEstimators: 10 });

Option parameters

| Parameter | Type | Description |
| --- | --- | --- |
| nEstimators | number | Number of estimators for fitting, default = 100 |
| criterion | | |
| minSamplesLeaf | number | The minimum number of samples required to be at a leaf node, default = 1 |
| minImpurityDecrease | number | A node will be split if the split induces a decrease of the impurity greater than or equal to this value |
| minWeightFractionLeaf | | |
| minSamplesSplit | number | The minimum number of samples required to split an internal node, default = 2 |
| validationFraction | number | The proportion of training data to set aside as a validation set for early stopping. Must be between 0 and 1. Only used if nIterNoChange is set to an integer |
| ccpAlpha | number | Complexity parameter used for minimal cost-complexity pruning. The subtree with the largest cost complexity that is smaller than ccpAlpha will be chosen. By default, no pruning is performed |
| maxDepth | number | Maximum depth of the individual regression trees, default = 3 |
| maxFeatures | number, or "auto" \| "sqrt" \| "log2" | The number of features to consider when looking for the best split: an integer value is used directly at each split; a fractional value means Math.floor(maxFeatures * nFeatures) features are considered at each split; "auto" and "sqrt" mean maxFeatures = sqrt(nFeatures); "log2" means maxFeatures = log2(nFeatures); if unset, maxFeatures = nFeatures |
| tol | number | Tolerance for early stopping. When the loss is not improving by at least tol for nIterNoChange iterations (if set to a number), training stops, default = 1e-4 |
| nIterNoChange | number | Used to decide if early stopping will be used to terminate training when the validation score is not improving. Not set by default, which disables early stopping. If set to a number, a validationFraction portion of the training data is set aside as a validation set, and training terminates when the validation score has not improved in any of the previous nIterNoChange iterations. The split is stratified |
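The fractional maxFeatures rule above can be sketched in plain TypeScript; featuresPerSplit is a hypothetical helper written only to illustrate the rule, not part of the library:

```typescript
// Illustration of how a numeric maxFeatures value maps to a feature count,
// following the rule described above (hypothetical helper, not library code).
function featuresPerSplit(maxFeatures: number, nFeatures: number): number {
  if (Number.isInteger(maxFeatures)) {
    // Integer value: used directly at each split
    return maxFeatures;
  }
  // Fractional value: a fraction of nFeatures, rounded down
  return Math.floor(maxFeatures * nFeatures);
}

console.log(featuresPerSplit(3, 9));   // 3
console.log(featuresPerSplit(0.5, 9)); // 4
```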

Methods

fit

Fit the gradient boosting classifier.

Syntax

async fit(xData: number[][] | Tensor2D, yData: number[] | string[] | boolean[] | Tensor1D): Promise<void>

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| xData | Tensor2D \| number[][] | Input data of shape (nSamples, nFeatures), as an array or tensor |
| yData | Tensor1D \| number[] \| string[] \| boolean[] | Input target |

predict

Make predictions using gradient boosting model.

async predict(xData: Tensor | RecursiveArray<number>): Promise<Tensor>

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| xData | Tensor \| RecursiveArray<number> | Input data |

Returns

Promise of predicted values

fromJson

Load model parameters from a JSON string.

async fromJson(modelJson: string)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| modelJson | string | Model JSON string |

toJson

Export model parameters to a JSON string.

async toJson(): Promise<string>

Returns

Promise of the model JSON string