Pandas DataFrame | fillna method

Last updated: Jul 1, 2022
Tags: Python, Pandas

Pandas' DataFrame.fillna(~) method fills NaN (missing values) with a specified value or with a filling rule.

Parameters

1. value | scalar or dict or Series or DataFrame | optional

The value to replace NaN. If a dict or Series is specified, then the key/index is the column label, and the value is the filler.

2. method | None or string | optional

The rule by which to fill NaN:

Value                   Description
"backfill" or "bfill"   Use the next non-NaN value to fill.
"pad" or "ffill"        Use the previous non-NaN value to fill.
None                    Does not apply any filling rule; value is used instead.

By default, method=None.

WARNING

Specify either value or method, not both.

3. axis | int or string | optional

Whether to fill each column or each row:

Axis              Description
0 or "index"      Fill column-wise
1 or "columns"    Fill row-wise

By default, axis=0. Note that this is only relevant if you specified method instead of value.

4. inplace | boolean | optional

  • If True, then the method will directly modify the source DataFrame instead of creating a new DataFrame.

  • If False, then a new DataFrame will be created and returned.

By default, inplace=False.

5. limit | None or int | optional

  • If method is specified, then limit is the maximum number of consecutive NaN to fill. If a gap contains more than limit consecutive NaN, then the gap is only partially filled: limit values are filled, starting from the side of the value being propagated, and the rest remain NaN.

  • If method is not specified, then limit is the maximum number of fills to perform per row or column.

By default, limit=None.

Return Value

A DataFrame with NaN replaced by a filler.
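
As a quick sketch of the return value and the inplace parameter described above, the following checks that the default call leaves the source DataFrame untouched, while inplace=True modifies it directly. The variable names (source, filled and the nan_* counters) are ours, chosen for illustration only:

```python
import pandas as pd

source = pd.DataFrame({"A": [None, 5, 6], "B": [7, None, 8]})

# Default (inplace=False): a new DataFrame is returned and the source keeps its NaN
filled = source.fillna(0)
nan_in_source = int(source.isna().sum().sum())   # total NaN still in the source
nan_in_filled = int(filled.isna().sum().sum())   # total NaN in the returned copy
print(nan_in_source, nan_in_filled)

# inplace=True: the source DataFrame itself is modified, so no reassignment is needed
source.fillna(0, inplace=True)
nan_after_inplace = int(source.isna().sum().sum())
print(nan_after_inplace)
```

Reassigning the result (df = df.fillna(0)) is usually the easier pattern to read, since it does not depend on what the call returns when inplace=True.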

Examples

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({"A":[None,5,6],"B":[7,None,8],"C":[9,None,None]})
df
   A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN

Filling with a value

To fill NaN with the value 10:

df.fillna(10)
   A     B     C
0  10.0  7.0   9.0
1  5.0   10.0  10.0
2  6.0   8.0   10.0

Filling certain columns only

To specify which columns to fill, provide a dict or Series like so:

df.fillna({"A":"*","C":10})
   A  B    C
0  *  7.0  9.0
1  5  NaN  10.0
2  6  8.0  10.0

Notice how column B still has NaN here since the provided dict has no B key.
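
The example above uses a dict, but a Series works the same way: its index is matched against the column labels. A minimal sketch, assuming we want to fill each column with its own mean (the names column_means and filled_with_means are ours):

```python
import pandas as pd

df = pd.DataFrame({"A": [None, 5, 6], "B": [7, None, 8], "C": [9, None, None]})

# df.mean() returns a Series indexed by column label (NaN are skipped by default)
column_means = df.mean()

# Each column's NaN are replaced by the entry of the Series with the same label
filled_with_means = df.fillna(column_means)
print(filled_with_means)
```

Here every column appears in the Series, so no NaN remains. A column whose label is missing from the Series would be left unfilled, just like column B in the dict example.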

Specifying a filling method

Consider the same df as before:

df
   A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN

Backward-fill

To fill NaNs with the next non-NaN value column-wise:

df.fillna(method="bfill")   # or method="backfill"
   A    B    C
0  5.0  7.0  9.0
1  5.0  8.0  NaN
2  6.0  8.0  NaN

Notice how we still have some NaNs remaining in column C. This is because no non-NaN value comes after them, and so we don't have a filling value.

Forward-fill

To fill NaNs with the previous non-NaN value column-wise:

df.fillna(method="ffill")   # or method="pad"
   A    B    C
0  NaN  7.0  9.0
1  5.0  7.0  9.0
2  6.0  8.0  9.0

Again, we have a NaN at A[0] because there is no value that comes before the NaN, and so we don't have a filling value.
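
As far as we are aware, pandas 2.1 and later emit a deprecation warning when method is passed to fillna(~), and recommend the dedicated DataFrame.bfill(~) and DataFrame.ffill(~) methods instead. A minimal sketch of the equivalent calls on the same df, which should behave the same as the two method-based examples above:

```python
import pandas as pd

df = pd.DataFrame({"A": [None, 5, 6], "B": [7, None, 8], "C": [9, None, None]})

# Counterpart of df.fillna(method="bfill"): use the next non-NaN value in each column
backward_filled = df.bfill()

# Counterpart of df.fillna(method="ffill"): use the previous non-NaN value in each column
forward_filled = df.ffill()

print(backward_filled)
print(forward_filled)
```

If your code has to run on newer pandas releases, these dedicated methods are the safer choice.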

Specifying the axis

Just for your reference, here's the same df:

df
   A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN

By default, axis=0, which means that the filling method is applied column-wise:

df.fillna(method="ffill")      # axis=0
   A    B    C
0  NaN  7.0  9.0
1  5.0  7.0  9.0
2  6.0  8.0  9.0

We can perform the filling row-wise by setting axis=1:

df.fillna(method="ffill", axis=1)
   A    B    C
0  NaN  7.0  9.0
1  5.0  5.0  5.0
2  6.0  8.0  8.0

Note that the axis parameter is only relevant if you specify the method parameter instead of value.

Specifying a limit

When parameter method is specified

If method is specified, then limit is the maximum number of consecutive NaN to fill.

Just for your reference, here's the same df:

df
   A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN

To limit the number of consecutive NaNs to fill to 1:

df.fillna(method="ffill", limit=1)
   A    B    C
0  NaN  7.0  9.0
1  5.0  7.0  9.0
2  6.0  8.0  NaN

Notice how cell C[2] still has a NaN value. This is because we have 2 consecutive NaNs here, and since we specified limit=1, only the first one got filled.
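
To see the partial fill more clearly, here is a small sketch with a longer gap. The DataFrame gap_df is made up for illustration, and we use DataFrame.ffill(~) with a limit, which we assume behaves the same as fillna(method="ffill", limit=...):

```python
import pandas as pd

# One column with a gap of 3 consecutive NaN between 1.0 and 5.0
gap_df = pd.DataFrame({"A": [1.0, None, None, None, 5.0]})

# Forward-fill at most 2 consecutive NaN; the third NaN in the gap is left as is
partially_filled = gap_df.ffill(limit=2)
print(partially_filled)
```

The first two NaNs receive 1.0, and the third remains NaN because the limit has been reached.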

When parameter method is not specified

When method is not specified, limit is the maximum number of fills to perform per row or column. For instance, consider the following DataFrame:

df = pd.DataFrame({"A":[None,5,None],"B":[7,None,8],"C":[9,None,None]})
df
   A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  NaN  8.0  NaN

Performing a fill with limit=1 yields:

df.fillna(2, limit=1)
   A    B    C
0  2.0  7.0  9.0
1  5.0  2.0  2.0
2  NaN  8.0  NaN

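Notice how columns A and C still have NaNs in them. With limit=1, at most one NaN is filled per column, so only the first NaN in each column was replaced.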
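
One way to confirm this is to count the NaNs left in each column after the fill. A short sketch using the same DataFrame (the names limited and remaining are ours):

```python
import pandas as pd

df = pd.DataFrame({"A": [None, 5, None], "B": [7, None, 8], "C": [9, None, None]})

# Fill with the value 2, performing at most one fill per column
limited = df.fillna(2, limit=1)

# Count the NaN that are still present in each column
remaining = limited.isna().sum()
print(remaining)
```

Columns A and C each started with two NaNs, so one is left in each, while column B had a single NaN and is now complete.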

Published by Isshin Inada