
Pandas DataFrame | replace method

Last updated: Mar 9, 2022
Tags: Python, Pandas

Pandas' DataFrame.replace(~) method replaces the specified values with another set of values.

Parameters

1. to_replace | string or regex or list or dict or Series or number or None

The values that will be replaced.

2. value | number or dict or list or string or regex or None | optional

The value(s) that will replace to_replace. By default, value=None.

3. inplace | boolean | optional

  • If True, then the method will directly modify the source DataFrame instead of creating a new DataFrame.

  • If False, then a new DataFrame will be created and returned.

By default, inplace=False.

4. limit | int | optional

The maximum number of consecutive fills to perform. By default, limit=None.

5. regex | boolean or string | optional

If True, then to_replace is interpreted as a regular expression. Note that this requires to_replace to be a string.

By default, regex=False.

6. method | string or None | optional

The rule by which to replace to_replace:

Method              Description
"ffill" or "pad"    Fills the value with the preceding row's value.
"bfill"             Fills the value with the next row's value.

This parameter takes effect only when value=None. By default, method="pad".

Return Value

A DataFrame with the specified values replaced with your desired values.

Examples

Replacing single value with a single value

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

To replace all values of 1 with 5:

df.replace(1, 5)
   A  B
0  5  3
1  2  4

Replacing multiple values with a single value

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

To replace all values of 1 and 2 with 5:

df.replace([1,2], 5)
   A  B
0  5  3
1  5  4

Replacing multiple values with corresponding values

Using array

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

To replace all values of 1 and 2 with 5 and 6, respectively:

df.replace([1,2], [5,6])
   A  B
0  5  3
1  6  4
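
The pairing of the two lists is positional, which the following minimal sketch tries to make explicit. It also shows what happens when the lists cannot be paired; the variable names paired and mismatch_error are ours and purely illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Element i of the first list is replaced by element i of the second list.
paired = df.replace([1, 2], [5, 6])
print(paired)

# Lists of different lengths cannot be paired up, so pandas raises an error.
try:
    df.replace([1, 2], [5, 6, 7])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)
```

If the two lists may differ in length in your own code, a dict is usually the safer way to express the mapping.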

Using dict

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

To replace all values of 1 and 3 with 5 and 6, respectively:

df.replace({1:5, 3:6})
   A  B
0  5  6
1  2  4
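
As a quick check, the dict form can be compared against the equivalent pair of lists. This is only a sketch on the same small DataFrame; by_dict and by_lists are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Each key is a value to look for; its dict value is the replacement.
by_dict = df.replace({1: 5, 3: 6})

# The same mapping written as two positional lists.
by_lists = df.replace([1, 3], [5, 6])

print(by_dict.equals(by_lists))
```

The dict keeps each old value next to its replacement, which tends to be easier to read once the mapping grows.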

Replacing using regex

Consider the following DataFrame:

df = pd.DataFrame({"A":["alex","bob"], "B":["cathy","doge"]})
df
   A     B
0  alex  cathy
1  bob   doge

To replace all values starting with the letter "a" with "eric":

df.replace("^a.*", "eric", regex=True)
   A     B
0  eric  cathy
1  bob   doge

Notice how we had to enable regex by specifying regex=True.
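
The pattern above uses ^a.* rather than just a for a reason that a small comparison may help illustrate. Without regex=True the pattern is compared against whole cell values; with regex=True only the matched portion of each string is substituted. The names literal, partial and whole are ours.

```python
import pandas as pd

df = pd.DataFrame({"A": ["alex", "bob"], "B": ["cathy", "doge"]})

# Plain replace: no cell is exactly "a", so nothing changes.
literal = df.replace("a", "X")

# Regex replace: every "a" inside a cell is substituted, the rest is kept.
partial = df.replace("a", "X", regex=True)

# Anchoring with ^ and consuming the rest with .* replaces the whole cell.
whole = df.replace("^a.*", "eric", regex=True)

print(literal)
print(partial)
print(whole)
```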

Replacing for certain columns only

Replacing single value with a single value

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,2], "B":[1,2]})
df
   A  B
0  1  1
1  2  2

To replace all values of 1 with 3 in column A only, we must provide a dict as follows:

df.replace({"A":1}, 3)
   A  B
0  3  1
1  2  2

Notice how column B is unaffected despite containing a value of 1.

Replacing multiple values with corresponding values

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

To replace values 1 and 2 with 5 and 6 respectively for just column A:

df.replace({"A":{1:5, 2:6}})
   A  B
0  5  3
1  6  4
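
The nested dict can also carry a different mapping for each column. The sketch below, on an illustrative DataFrame of our own, compares it with replacing column by column through Series.replace.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [1, 2]})

# Outer keys are column labels; inner dicts map old values to new values.
nested = df.replace({"A": {1: 10}, "B": {2: 20}})

# The same result built one column at a time.
per_column = df.assign(
    A=df["A"].replace(1, 10),
    B=df["B"].replace(2, 20),
)

print(nested)
print(nested.equals(per_column))
```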

Replacing using fills

When you don't explicitly provide the value parameter, the replace(~) method will forward-fill the matched values by default, that is, replace each match of to_replace with the preceding row's value.

Consider the following DataFrame:

df = pd.DataFrame({"A":["a","b","c"]})
df
   A
0  a
1  b
2  c

Forward-fill

To forward-fill all occurrences of "b":

df.replace("b", method="ffill")   # or simply leave out the method parameter.
   A
0  a
1  a
2  c

Notice how the value "a", which is the value preceding the match (i.e. "b"), was used as the filler.

WARNING

When there is no preceding row, nothing will be replaced.

Consider the case when we want to forward-fill all occurrences of "a":

df.replace("a", method="ffill")
   A
0  a
1  b
2  c

Notice how, even though we have "a" in our DataFrame, it did not get replaced. Since there is no preceding row, we have no filler and hence no replacement is performed.

Backward-fill

To backward-fill all occurrences of "b":

df.replace("b", method="bfill")
   A
0  a
1  c
2  c

Notice how the value "c", which comes right after the match (i.e. "b"), was used as the filler.

WARNING

When there is no next row, nothing will be replaced.

Consider the case when we want to backward-fill all occurrences of "c":

df.replace("c", method="bfill")
   A
0  a
1  b
2  c

Notice how, even though we have "c" in our DataFrame, it did not get replaced. Since there is no next row, we have no filler and hence no replacement is performed.
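
Newer pandas releases are understood to have deprecated the method keyword of replace, so it is worth checking the release notes for your version. The same fills can be sketched with mask, ffill and bfill, which are long-standing Series methods. The helper fill_matches below is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c"]})


def fill_matches(series, target, direction="ffill"):
    """Hypothetical helper: replace `target` with the neighbouring row's value."""
    # Turn every match into a missing value.
    blanked = series.mask(series == target)
    # Copy the previous (ffill) or next (bfill) row's value into the gaps.
    filled = blanked.ffill() if direction == "ffill" else blanked.bfill()
    # Rows with no neighbour to copy from keep their original value.
    return filled.fillna(series)


forward = fill_matches(df["A"], "b")
backward = fill_matches(df["A"], "b", direction="bfill")
no_preceding = fill_matches(df["A"], "a")
no_next = fill_matches(df["A"], "c", direction="bfill")

print(forward.tolist(), backward.tolist())
print(no_preceding.tolist(), no_next.tolist())
```

One limitation of this sketch: missing values that were already in the column are filled as well, since they are indistinguishable from the blanked matches.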

Limit

Consider the following DataFrame:

df = pd.DataFrame({"A":["a","b","b"]})
df
   A
0  a
1  b
2  b

By default, limit=None, which means that there is no restriction on how many consecutive fills are allowed:

df.replace("b", method="ffill")
   A
0  a
1  a
2  a

In contrast, setting limit=1 yields:

df.replace("b", method="ffill", limit=1)
   A
0  a
1  a
2  b

Here, notice how "b" was filled only once. Also note that limit restricts the number of consecutive fills only: a separate run of matches elsewhere in the column gets its own allowance.
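
For readers who prefer not to rely on the method and limit keywords of replace, a similar effect can be approximated with mask followed by ffill, whose limit argument also counts consecutive fills. This is a sketch under that assumption, not the way replace itself is implemented.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "b"]})
col = df["A"]

# Turn every "b" into a missing value so that ffill can act on it.
blanked = col.mask(col == "b")

# No limit: both consecutive gaps receive the preceding value.
unlimited = blanked.ffill().fillna(col)

# limit=1: only the first gap in the run is filled; the second keeps "b".
limited = blanked.ffill(limit=1).fillna(col)

print(unlimited.tolist())
print(limited.tolist())
```

The final fillna(col) restores the original value wherever no fill took place.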

Replacing in-place

To perform replacement in-place, we need to set inplace=True. This will directly perform the replace operation on the source DataFrame instead of creating a new one.

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,2],"B":[3,4]})
df
   A  B
0  1  3
1  2  4

We replace all occurrences of 1 with 5 with inplace=True:

df.replace(1, 5, inplace=True)
df
   A  B
0  5  3
1  2  4

As shown in the output, the source DataFrame has been directly modified.
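
For contrast, here is a minimal sketch of the default behaviour (inplace=False), where the source DataFrame is left untouched and the result has to be captured in a variable. The names original and replaced are illustrative.

```python
import pandas as pd

original = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# With the default inplace=False, a new DataFrame is returned.
replaced = original.replace(1, 5)

print(original)  # still contains 1
print(replaced)  # contains 5 instead of 1
```

Assigning the result back to the same name (df = df.replace(1, 5)) gives a similar end state to inplace=True while keeping the call chainable.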

Published by Isshin Inada