OneHotEncoder

OneHotEncoder is used to encode categorical features as a one-hot numeric array.

Import

import * as datacook from '@pipcook/datacook';
const { OneHotEncoder } = datacook.Preprocess;

Constructor

const onehotEncoder = new OneHotEncoder({ drop: "first" });

Options

parameter        type                                description
drop (optional)  { 'none', 'binary-only', 'first' }  Specifies how to drop one category per feature, which helps avoid collinearity among the encoded columns. However, dropping one category may introduce a bias term in downstream models. default='none'

'none': default, keep all categories
'first': drop the first category of each feature
'binary-only': drop the first category only for features that have exactly two categories
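
The effect of the three drop options can be illustrated with a minimal plain-TypeScript sketch. This is not datacook's implementation: oneHotSketch and DropMode are hypothetical names introduced here, and the sketch assumes a single feature whose ordered category list is passed in explicitly.

```typescript
type DropMode = 'none' | 'first' | 'binary-only';

// Hypothetical helper, not part of datacook: one-hot encodes `values`
// against an ordered category list, then applies the `drop` rule.
function oneHotSketch(
  values: string[],
  categories: string[],
  drop: DropMode = 'none'
): number[][] {
  // Column 0 is dropped for 'first', or for 'binary-only' when the
  // feature has exactly two categories.
  const dropFirst =
    drop === 'first' || (drop === 'binary-only' && categories.length === 2);
  return values.map((value) => {
    const row = categories.map((category) => (category === value ? 1 : 0));
    return dropFirst ? row.slice(1) : row;
  });
}

console.log(oneHotSketch(['3', '3', '2'], ['1', '2', '3'], 'none'));
// [[0, 0, 1], [0, 0, 1], [0, 1, 0]]
console.log(oneHotSketch(['3', '3', '2'], ['1', '2', '3'], 'first'));
// [[0, 1], [0, 1], [1, 0]]
console.log(oneHotSketch(['no', 'yes'], ['no', 'yes'], 'binary-only'));
// [[0], [1]]
```

With 'first', a row of all zeros stands for the dropped first category, which is why the remaining columns are no longer linearly dependent.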

Properties

categories <Tensor1D>

One-dimensional tensor of categories. The columns of the one-hot result follow the order in which the categories appear in this tensor.

drop <'first' | 'binary-only' | 'none'>

Drop method for this encoder.

Methods

init()

Initialize one-hot encoder

Syntax

init(x: Tensor1D | number[] | string[]): Promise<void>

Parameters

parameter  type                            description
x          Tensor1D | number[] | string[]  input data used to initialize the encoder

Example

const onehotEncoder = new OneHotEncoder();
await onehotEncoder.init([ '1', '2', '3' ]);

encode()

Encode a given feature into one-hot format

Syntax

async encode(x: Tensor | number[] | string[]): Promise<Tensor>

Parameters

parameter  type                          description
x          Tensor | number[] | string[]  original data to encode

Returns

<Tensor> transformed one-hot feature

Example

const onehotEncoder = new OneHotEncoder();
await onehotEncoder.init([ '1', '2', '3' ]);
const encoded = await onehotEncoder.encode(['3', '3', '2']);
/**
 * Tensor
 *   [[0, 0, 1],
 *    [0, 0, 1],
 *    [0, 1, 0]]
 */

decode()

Decode one-hot array to original category array

Syntax

async decode(x: Tensor | RecursiveArray<number>): Promise<Tensor>

Parameters

parameter  type                            description
x          Tensor | RecursiveArray<number>  one-hot data to decode

Returns

<Tensor> transformed category data

Example

const onehotEncoder = new OneHotEncoder();
await onehotEncoder.init([ '1', '2', '3' ]);
const decoded = await onehotEncoder.decode([
    [0, 0, 1], [0, 0, 1], [0, 1, 0]
]);
/**
 * Tensor
 *   ['3', '3', '2']
 */
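
Conceptually, decoding maps each one-hot row back to the category whose column is set. The following is a minimal plain-TypeScript sketch of that idea, not datacook's implementation: decodeSketch is a hypothetical helper, and it assumes no column was dropped (drop set to 'none') and that the category list is in the same order the encoder used.

```typescript
// Hypothetical helper, not part of datacook: maps each one-hot row back
// to the category whose column holds the largest value (argmax).
function decodeSketch(rows: number[][], categories: string[]): string[] {
  return rows.map((row) => {
    const hotIndex = row.indexOf(Math.max(...row));
    return categories[hotIndex];
  });
}

const decodedSketch = decodeSketch(
  [[0, 0, 1], [0, 0, 1], [0, 1, 0]],
  ['1', '2', '3']
);
console.log(decodedSketch); // ['3', '3', '2']
```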