# Comprehensive Guide on sklearn's LabelEncoder

Last updated: Aug 11, 2023

The `LabelEncoder` class in `sklearn.preprocessing` encodes target labels as integers from 0 to `n_classes - 1` (e.g. 0, 1, 2, ...).

# Encoding numerical target labels

Suppose our target labels are as follows:

```
raw_y = [6,9,2,5,6]
```

Our objective is to apply the following mapping:

```
2: 0
5: 1
6: 2
9: 3
```

Here, the value 2 gets mapped to 0, the value 5 gets mapped to 1, and so on: the unique values are sorted, and each one is assigned its position in that sorted order.
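To make the mapping concrete, here is a minimal sketch that reproduces it with plain NumPy. It is only an illustration of the idea (sorted unique values, then the position of each label among them), not a claim about how `LabelEncoder` is implemented internally; the names `classes` and `codes` are chosen for this example.

```python
import numpy as np

raw_y = [6, 9, 2, 5, 6]

# np.unique returns the sorted unique values; with return_inverse=True it also
# returns, for every element of raw_y, its index in that sorted array
classes, codes = np.unique(raw_y, return_inverse=True)

print(classes)  # [2 5 6 9]
print(codes)    # [2 3 0 1 2]
```

Each entry of `codes` is the position of the corresponding label in `classes`, which is exactly the mapping listed above.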

We can make use of `LabelEncoder` like so:

```
from sklearn.preprocessing import LabelEncoder

raw_y = [6,9,2,5,6]
encoder = LabelEncoder()
y = encoder.fit_transform(raw_y)   # returns a NumPy array
y
array([2, 3, 0, 1, 2])
```

Here, `y` holds the encoded values.

We can access the classes, that is, the unique values in our target like so:

```
encoder.classes_
array([2, 5, 6, 9])
```
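Since the position of a label in `classes_` is its encoded integer, the full mapping can be read off as a dictionary. The following is a small sketch under that observation; the variable name `mapping` is just for illustration.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
encoder.fit([6, 9, 2, 5, 6])

# The index of each label in classes_ is the integer it is encoded as
mapping = {int(label): code for code, label in enumerate(encoder.classes_)}

print(mapping)  # {2: 0, 5: 1, 6: 2, 9: 3}
```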

We can also recover the original `raw_y` from the encoded values using the `inverse_transform(~)` method:

```
encoder.inverse_transform(y)
array([6, 9, 2, 5, 6])
```
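A fitted encoder can also be reused on new labels with `transform(~)`, as long as those labels were seen during fitting. The sketch below shows both cases; the label `7` is a made-up value that is deliberately absent from the fitted data, and `unseen_error` is a name used only in this example.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
encoder.fit([6, 9, 2, 5, 6])

# Labels that appeared during fit are encoded with the same mapping
new_codes = encoder.transform([9, 2])
print(new_codes)  # [3 0]

# A label that never appeared during fit cannot be encoded
unseen_error = None
try:
    encoder.transform([7])
except ValueError as err:
    unseen_error = str(err)
    print("ValueError:", unseen_error)
```

In practice this means the encoder should be fitted on data that contains every label expected later.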

# Encoding categorical string target labels

`LabelEncoder` also works when the target labels are strings:

```
from sklearn.preprocessing import LabelEncoder

raw_y = ["A","B","A","C"]
encoder = LabelEncoder()
y = encoder.fit_transform(raw_y)
y
array([0, 1, 0, 2])
```

We can see all our classes like so:

```
encoder.classes_
array(['A', 'B', 'C'], dtype='<U1')
```

We can retrieve the original `raw_y` like so:

```
encoder.inverse_transform(y)
array(['A', 'B', 'A', 'C'], dtype='<U1')
```
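One thing worth checking with string labels is the order of the resulting integers: the classes are sorted alphabetically, not by order of appearance or by any meaning the strings carry. The sketch below uses hypothetical size labels (`"small"`, `"medium"`, `"large"`) to illustrate this.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels, chosen so that alphabetical order differs from
# both the order of appearance and the "natural" size order
sizes = ["small", "large", "medium", "small"]

encoder = LabelEncoder()
codes = encoder.fit_transform(sizes)

print(encoder.classes_)  # ['large' 'medium' 'small']
print(codes)             # [2 0 1 2]
```

Here `"large"` is encoded as 0 and `"small"` as 2, so the integers should be treated as class identifiers rather than as an ordering.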
