
Pandas DataFrame | pct_change method

Last updated: Jul 1, 2022
Tags: Python, Pandas

Pandas DataFrame.pct_change(~) computes the percentage change between consecutive values of each column of the DataFrame. The result is expressed as a fraction rather than a percentage, so 1.0 means a 100% increase.

Parameters

1. periods | int | optional

The number of rows to look back when computing the percentage change. For example, if periods=2, each value is compared with the value two rows earlier. By default, periods=1, which means that each value is compared with the value in the previous row.

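2. fill_method | string | optional

The rule by which to fill missing values before computing the percentage change:

Value | Description
"backfill" or "bfill" | Use the next non-NaN value to fill.
"pad" or "ffill" | Use the previous non-NaN value to fill.

By default, fill_method="pad". Note that recent pandas releases (2.1 and later) deprecate fill_method and limit; the recommended alternative there is to fill explicitly with ffill() or bfill() before calling pct_change(~).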

WARNING

Regardless of fill_method, the first row will always be NaN when periods is positive, since there is no prior value with which to compute the percentage change.

3. limit | int | optional

The maximum number of consecutive NaN values to fill. By default, limit=None, which means that there is no maximum.

4. freq | string or timedelta or DateOffset | optional

The time interval to look back when the DataFrame is a time series, such as "2D" for two days. By default, freq=None.

Return Value

A DataFrame holding the percentage changes of the values in each column.

Examples

Basic usage

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({"A":[2,4,12], "B":[1,3,15]})
df
    A   B
0   2   1
1   4   3
2  12  15

To compute the percentage change of consecutive values for each column in df:

df.pct_change()
     A    B
0  NaN  NaN
1  1.0  2.0
2  2.0  4.0

Here, note the following:

  • the first row is always NaN because there is no prior value with which to compute the percentage change.

  • to see how these percentage changes are calculated, take the bottom-right value (4.0) as an example. First subtract the prior value from the current value (15-3=12), and then divide this difference by the prior value (12/3=4.0).
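The calculation described above can be reproduced by hand. The following is a minimal sketch that uses shift(1) to line up each value with the one in the previous row; the variable names previous, manual and builtin are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 12], "B": [1, 3, 15]})

# Values of the previous row, aligned with the current row.
previous = df.shift(1)

# Fractional change: (current - previous) / previous.
manual = (df - previous) / previous

# The built-in method should give the same numbers.
builtin = df.pct_change()

print(manual)
print(builtin)

# Multiply by 100 to express the change as a percentage.
print(manual * 100)
```

Since the result is a fraction, multiplying by 100 converts it to a percentage (for example, 4.0 becomes 400.0).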

Specifying periods

Consider the following DataFrame:

df = pd.DataFrame({"A":[2,4,12], "B":[1,3,15]})
df
    A   B
0   2   1
1   4   3
2  12  15

By default, periods=1, which means that the previous row is used to compute the percentage change:

df.pct_change()   # periods=1
     A    B
0  NaN  NaN
1  1.0  2.0
2  2.0  4.0

To compute the percentage change relative to the value two rows back:

df.pct_change(periods=2)
     A     B
0  NaN   NaN
1  NaN   NaN
2  5.0  14.0

We get NaN for the first two rows because neither has a row two positions earlier to compare with.

NOTE

To use the subsequent row to compute the percentage change, set periods=-1.
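As a short illustration of the note above, the sketch below calls pct_change(~) with periods=-1 on the same DataFrame. The variable name forward is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 12], "B": [1, 3, 15]})

# Compare each value with the value in the next row:
# (current - next) / next.
forward = df.pct_change(periods=-1)

# Row 0 of column A is (2 - 4) / 4 = -0.5.
# The last row is NaN because there is no next row.
print(forward)
```

With a negative periods, the NaN values appear at the bottom of the result instead of the top.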

Specifying fill_method

Consider the following DataFrame with some missing values:

import numpy as np

df = pd.DataFrame({"A":[2,np.nan,12], "B":[1,3,np.nan]})
df
      A    B
0   2.0  1.0
1   NaN  3.0
2  12.0  NaN

pad

By default, fill_method="pad", which means that the previous non-NaN value is used to fill NaN:

df.pct_change()   # fill_method="pad"
     A    B
0  NaN  NaN
1  0.0  2.0
2  5.0  0.0

Note that this is equivalent to calling pct_change() on the following:

pd.DataFrame({"A":[2,2,12], "B":[1,3,3]})
    A  B
0   2  1
1   2  3
2  12  3

WARNING

Regardless of fill_method, the first row will always have NaN since there is no prior value with which to compute the percentage change.
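The equivalence described above can also be written out explicitly. This is a minimal sketch, assuming a pandas version that accepts fill_method=None (which disables the filling step inside pct_change(~)); the variable names filled and unfilled are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [2, np.nan, 12], "B": [1, 3, np.nan]})

# Forward-fill explicitly, then compute the change without any further filling.
filled = df.ffill().pct_change(fill_method=None)

# For comparison: no filling at all. A NaN then affects both its own row
# and the row that uses it as the prior value.
unfilled = df.pct_change(fill_method=None)

print(filled)
print(unfilled)
```

Filling explicitly keeps the fill step visible in the code, and it is the pattern that recent pandas releases recommend in place of the fill_method argument.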

bfill

To fill NaN using the next non-NaN value in the DataFrame:

df.pct_change(fill_method="bfill")
     A    B
0  NaN  NaN
1  5.0  2.0
2  0.0  NaN

Note that this is equivalent to calling pct_change() on the following:

pd.DataFrame({"A":[2,12,12], "B":[1,3,np.nan]})
    A    B
0   2  1.0
1  12  3.0
2  12  NaN

Notice how the NaN in the bottom-right corner is still a NaN. This is because there is no later row from which a non-NaN value could be taken to fill it.
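To show how a cap on the number of filled values (the role of the limit parameter) interacts with back-filling, here is a small sketch that fills explicitly with bfill(limit=1) and then calls pct_change(~) with fill_method=None. The one-column DataFrame and the names full and limited are hypothetical and chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two consecutive missing values.
df = pd.DataFrame({"A": [2, np.nan, np.nan, 12]})

# Back-fill every NaN, then compute the change without further filling.
full = df.bfill().pct_change(fill_method=None)

# Back-fill at most one NaN per gap; the remaining NaN stays in place
# and propagates into the result.
limited = df.bfill(limit=1).pct_change(fill_method=None)

print(full)
print(limited)
```

In the limited case only the NaN directly above 12 is filled, so the rows that depend on the remaining NaN are NaN in the result.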

Specifying freq

Consider the following time-series DataFrame:

idx = pd.date_range(start="2020-12-20", periods=4)
df = pd.DataFrame({"A":[2,4,12,24], "B":[1,3,15,30]}, index=idx)
df
             A   B
2020-12-20   2   1
2020-12-21   4   3
2020-12-22  12  15
2020-12-23  24  30

To compute the percentage change relative to the value 2 days earlier (e.g. 12-22 compared with 12-20):

df.pct_change(freq="2D")
              A     B
2020-12-20  NaN   NaN
2020-12-21  NaN   NaN
2020-12-22  5.0  14.0
2020-12-23  5.0   9.0

Here, we get NaN values for the first 2 rows because the dates 2 days earlier (12-18 and 12-19) do not exist in the DataFrame, so there is nothing with which to compute the percentage change.
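One way to see what freq does is to reproduce the result by shifting the index instead of the rows. The sketch below is an illustration under that reading, not a description of the internal implementation; the names shifted, manual and builtin are illustrative.

```python
import pandas as pd

idx = pd.date_range(start="2020-12-20", periods=4)
df = pd.DataFrame({"A": [2, 4, 12, 24], "B": [1, 3, 15, 30]}, index=idx)

# Move every value forward by two calendar days (the index changes, not the row order).
shifted = df.shift(freq="2D")

# Division aligns on dates; keep only the dates of the original DataFrame.
manual = (df / shifted - 1).reindex(df.index)

builtin = df.pct_change(freq="2D")

print(manual)
print(builtin)
```

Because the comparison is made by date rather than by row position, this approach also handles time series with gaps in the index, where "2 days earlier" and "2 rows earlier" are not the same thing.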

Published by Isshin Inada