Preventing strings from getting parsed as NaN for read_csv in Pandas

Last updated: Aug 12, 2023

Tags: Python, Pandas


Consider a my_data.txt file with two columns, A and B, in which the first value of column A is the string NA.
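The exact file is not shown here, so as a minimal sketch, assuming it is comma-separated with a header row and matches the output printed in the next section, it could be created like this:

```python
from pathlib import Path

# Assumed contents (not taken from the original file): a header row A,B
# followed by two data rows, consistent with the output shown in this guide.
file_text = "A,B\nNA,5\na,6\n"

# Write the sample file to the current working directory.
Path("my_data.txt").write_text(file_text)
```

Running this once gives the later read_csv calls a file to work with.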

Disabling default NaN

By default, read_csv parses strings such as "NA" as NaN. To prevent this behaviour, set keep_default_na=False like so:

    import pandas as pd

    df = pd.read_csv("my_data.txt", keep_default_na=False)
    df

        A  B
    0  NA  5
    1   a  6

Here, the NA that appears in column A is of type string.
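To make the difference concrete, the following sketch reads the same text twice, once with the defaults and once with keep_default_na=False. The variable names and the in-memory csv_text are assumptions made for illustration; StringIO simply stands in for the file.

```python
from io import StringIO

import pandas as pd

# Hypothetical file contents, assumed to match the output shown above.
csv_text = "A,B\nNA,5\na,6\n"

# Default behaviour: the string "NA" is converted to a missing value.
default_df = pd.read_csv(StringIO(csv_text))

# With keep_default_na=False, "NA" is kept as an ordinary string.
literal_df = pd.read_csv(StringIO(csv_text), keep_default_na=False)

print(default_df["A"].isna().tolist())  # [True, False]
print(literal_df["A"].tolist())         # ['NA', 'a']
```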

Specifying custom NaN parser

The problem with just setting keep_default_na=False is that values like nan and empty entries in the file will no longer be parsed as NaN.

For instance, if the second row of my_data.txt has an empty entry in column B, then we get an empty string for column B as opposed to NaN:

    df = pd.read_csv("my_data.txt", keep_default_na=False)
    df

        A  B
    0  NA  5
    1   a
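A side effect worth checking is that column B stops being numeric, because the empty entry is kept as text. The sketch below assumes hypothetical file contents with an empty last field and uses StringIO in place of the file.

```python
from io import StringIO

import pandas as pd

# Hypothetical contents: the second data row has an empty entry in column B.
csv_text = "A,B\nNA,5\na,\n"

df = pd.read_csv(StringIO(csv_text), keep_default_na=False)

# Both values in B come back as strings, and none is treated as missing.
print(df["B"].tolist())      # ['5', '']
print(df["B"].isna().any())  # False
```

Since the 5 is now the string '5', arithmetic on column B would not behave as expected until the column is converted.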

This is usually undesirable, so the fix is to specify your own set of values that should be parsed as NaN using the na_values parameter:

    df = pd.read_csv("my_data.txt", keep_default_na=False, na_values="")
    df

        A    B
    0  NA  5.0
    1   a  NaN

Here, we specified that empty values are to be mapped to NaN.
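The na_values parameter also accepts a list of strings, or a dictionary that maps column labels to the strings to treat as missing in that column. The sketch below illustrates both forms; the file contents and variable names are assumptions chosen for the example.

```python
from io import StringIO

import pandas as pd

# Hypothetical contents: column A holds the strings "NA" and "nan",
# and column B has an empty entry in the second row.
csv_text = "A,B\nNA,5\nnan,\n"

# List form: empty entries and the text "nan" become NaN in every column,
# while "NA" is still kept as a string.
df_list = pd.read_csv(
    StringIO(csv_text), keep_default_na=False, na_values=["", "nan"]
)

# Dictionary form: only empty entries in column B become NaN;
# column A is left untouched, so "nan" stays a string there.
df_dict = pd.read_csv(
    StringIO(csv_text), keep_default_na=False, na_values={"B": [""]}
)

print(df_list["A"].isna().tolist())  # [False, True]
print(df_dict["A"].tolist())         # ['NA', 'nan']
print(df_dict["B"].isna().tolist())  # [False, True]
```

The dictionary form is handy when a marker such as NA is a real value in one column but should still count as missing in another.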

Related

Pandas | read_csv method

Reads a file, and parses its content into a DataFrame.


Published by Isshin Inada
