LabelEncoder#

class sklearn.preprocessing.LabelEncoder[source]#

Encode target labels with value between 0 and n_classes-1.

This transformer should be used to encode target values, i.e. y, and not the input X.

See also

OrdinalEncoder: Encode categorical features using an ordinal encoding scheme.
OneHotEncoder: Encode categorical features as a one-hot numeric array.

Examples

LabelEncoder can be used to normalize labels.

>>> from sklearn.preprocessing import LabelEncoder
>>> le = LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2]...)
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])

It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels.

>>> le = LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"])
array([2, 2, 1]...)
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']

fit(y)[source]#

Fit label encoder.

Parameters:

yarray-like of shape (n_samples,): Target values.

Returns:

selfreturns an instance of self.: Fitted label encoder.

fit_transform(y)[source]#

Fit label encoder and return encoded labels.

Parameters:

yarray-like of shape (n_samples,): Target values.

Returns:

yarray-like of shape (n_samples,): Encoded labels.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routingMetadataRequest: A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:

deepbool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

paramsdict: Parameter names mapped to their values.

inverse_transform(y)[source]#

Transform labels back to original encoding.

Parameters:

yndarray of shape (n_samples,): Target values.

Returns:

yndarray of shape (n_samples,): Original encoding.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:

transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

"default": Default output format of a transformer
"pandas": DataFrame output
"polars": Polars output
None: Transform configuration is unchanged

Added in version 1.4: "polars" option was added.

Returns:

selfestimator instance: Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**paramsdict: Estimator parameters.

Returns:

selfestimator instance: Estimator instance.

transform(y)[source]#

Transform labels to normalized encoding.

Parameters:

yarray-like of shape (n_samples,): Target values.

Returns:

yarray-like of shape (n_samples,): Labels as normalized encodings.