
Comprehensive Guide on sklearn's LabelEncoder

Last updated: Aug 11, 2023

Tags: Machine Learning, Python

Encoding numerical target labels

Suppose our target labels are as follows:

    raw_y = [6, 9, 2, 5, 6]

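Our objective is to apply the following mapping, in which the sorted unique values are assigned consecutive integers starting from 0:

    2 → 0
    5 → 1
    6 → 2
    9 → 3

Here, the smallest value 2 gets mapped to 0, the next smallest value 5 gets mapped to 1, and so on.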

We can make use of LabelEncoder like so:

    from sklearn.preprocessing import LabelEncoder

    raw_y = [6, 9, 2, 5, 6]
    encoder = LabelEncoder()
    y = encoder.fit_transform(raw_y)   # returns a NumPy array
    y

    array([2, 3, 0, 1, 2])

Here, y holds the encoded values, one for each element of raw_y.
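For intuition, the same encoding can be reproduced with NumPy alone. The following is a minimal sketch that mirrors the behaviour of LabelEncoder on this input; it is not meant as a description of how scikit-learn implements it internally. The variable names classes and codes are chosen here for illustration.

```python
import numpy as np

raw_y = [6, 9, 2, 5, 6]

# np.unique returns the sorted unique values; with return_inverse=True it also
# returns, for each element of raw_y, its index into that sorted array.
classes, codes = np.unique(raw_y, return_inverse=True)

print(classes)  # [2 5 6 9]
print(codes)    # [2 3 0 1 2]
```

The codes match the encoded values above because both approaches number the unique labels in sorted order.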

We can access the classes, that is, the unique values in our target like so:

    encoder.classes_

    array([2, 5, 6, 9])
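Since classes_ is sorted, the position of each class in the array is its encoded value. As a small sketch, the full mapping can be written out as a dictionary; the name mapping is introduced here only for illustration.

```python
from sklearn.preprocessing import LabelEncoder

raw_y = [6, 9, 2, 5, 6]
encoder = LabelEncoder()
encoder.fit(raw_y)

# Each class's index in classes_ is the integer it is encoded as.
# int(...) converts NumPy integers to plain Python ints for readability.
mapping = {int(label): index for index, label in enumerate(encoder.classes_)}
print(mapping)  # {2: 0, 5: 1, 6: 2, 9: 3}
```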

We can also recover the original raw_y from the encoded values using the inverse_transform(~) method:

    encoder.inverse_transform(y)

    array([6, 9, 2, 5, 6])
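Because the fitted encoder stores its classes, it can also encode labels that arrive later by calling transform(~) instead of fit_transform(~). The sketch below uses made-up new labels, and assumes that a label absent from the fitted classes (here 7) causes transform(~) to raise a ValueError.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
encoder.fit([6, 9, 2, 5, 6])

# Encode new labels using the mapping learned during fit.
new_y = encoder.transform([9, 2, 2])
print(new_y)  # [3 0 0]

# A label that was not present during fit cannot be encoded.
unseen_error = None
try:
    encoder.transform([7])
except ValueError as error:
    unseen_error = str(error)
print(unseen_error)
```

If unseen labels are expected, they need to be handled before encoding, for example by fitting the encoder on the full set of possible labels.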

Encoding categorical string target labels

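LabelEncoder also works when the target labels are strings. As with numeric labels, the unique values are sorted (here alphabetically) and then assigned consecutive integers starting from 0:

    from sklearn.preprocessing import LabelEncoder

    raw_y = ["A", "B", "A", "C"]
    encoder = LabelEncoder()
    y = encoder.fit_transform(raw_y)
    y

    array([0, 1, 0, 2])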

We can see all our classes like so:

    encoder.classes_

    array(['A', 'B', 'C'], dtype='<U1')
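Note that the encoding follows the sorted order of the classes, not the order in which labels first appear. The following sketch uses hypothetical labels ("dog", "cat", "bird") to make the difference visible, since in the example above both orders happen to coincide.

```python
from sklearn.preprocessing import LabelEncoder

# "dog" appears first but sorts last, so it receives the largest code.
raw_y = ["dog", "cat", "dog", "bird"]
encoder = LabelEncoder()
y = encoder.fit_transform(raw_y)

print(encoder.classes_)  # ['bird' 'cat' 'dog']
print(y)                 # [2 1 2 0]
```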

We can retrieve the original raw_y like so:

    encoder.inverse_transform(y)

    array(['A', 'B', 'A', 'C'], dtype='<U1')

Published by Isshin Inada
