Pandas DataFrame | rolling method
Pandas' DataFrame.rolling(~) method is used to compute statistics over moving windows. A window is simply a sequence of consecutive values over which a statistic, such as the mean, is computed.
Parameters
1. window | int or offset or BaseIndexer subclass
The size of the moving window. When dealing with time series, that is, when the index of the source DataFrame is a DatetimeIndex, an offset represents the time interval covered by each window.
2. min_periods | int | optional
The minimum number of observations required in a window. If a window contains fewer than min_periods observations, then NaN is returned as the computed statistic for that window. The default value depends on the following:
- if window is offset-based, then min_periods=1.
- otherwise, min_periods=window.
3. center | boolean | optional
- If True, then the observation is placed at the center of the window.
- If False, then the observation is placed at the right edge of the window.
By default, center=False. Consult the examples below for clarification.
4. win_type | string | optional
The type of the window (e.g. boxcar, triang). For more information, consult the official documentation.
5. on | string | optional
The label of the datetime-like column to use instead of the DatetimeIndex. This is only relevant when dealing with time series.
6. axis | int or string | optional
Whether to compute statistics for each column or each row. By default, axis=0, that is, the statistic is computed for each column.
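7. closed | string | optional
Whether the endpoints of the window are inclusive or exclusive:

Value     | Description
"right"   | Only the right endpoint is inclusive.
"left"    | Only the left endpoint is inclusive.
"both"    | Both endpoints are inclusive.
"neither" | Both endpoints are exclusive.

By default, closed="right" for offset-based windows. For fixed-size windows, older pandas releases documented a default of closed="both", whereas recent releases document closed="right" for all windows.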
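As a minimal sketch of the on parameter described above, the following keeps the timestamps in an ordinary column rather than in the index. The column name time and the sample values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical data: the timestamps live in a regular column, not in the index
df_on = pd.DataFrame({
    "time": pd.to_datetime([
        "2020-12-20 15:00:00",
        "2020-12-20 15:00:01",
        "2020-12-20 15:00:02",
        "2020-12-20 15:00:04",
    ]),
    "A": [1, 10, 100, 1000],
})

# on="time" builds the 2-second windows from the "time" column
result_on = df_on.rolling(window="2s", on="time").sum()
print(result_on)
```

Without on, an offset-based window such as "2s" would require the index itself to be datetime-like.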
Return Value
A Window or Rolling object that will be used to compute some statistic.
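To illustrate the return value, here is a small sketch showing that rolling(~) by itself computes nothing and that the statistic is only produced once an aggregation method is called on the returned object. The DataFrame df_demo is made up for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"A": [2, 4, 8, 10]})

# rolling() alone returns a Rolling object; no statistic is computed yet
roller = df_demo.rolling(window=2)
print(type(roller).__name__)

# The same object can then be reused for several statistics
rolling_mean = roller.mean()
rolling_max = roller.max()
print(rolling_mean)
print(rolling_max)
```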
Examples
Basic usage
Consider the following DataFrame:
import pandas as pd

df = pd.DataFrame({"A":[2,4,8,10],"B":[4,5,6,7]}, index=["a","b","c","d"])
df

    A  B
a   2  4
b   4  5
c   8  6
d  10  7
To compute the sum of values with a moving window of size 2:

df.rolling(window=2).sum()

      A     B
a   NaN   NaN
b   6.0   9.0
c  12.0  11.0
d  18.0  13.0
Here, note the following:
- since axis=0 (default), we are computing the statistic (sum) down each column.
- window=2 means that the sum is computed using two consecutive observations:
  - we get 6.0 in the first column because 2+4=6.
  - we get 12.0 because 4+8=12.
  - we get 18.0 because 8+10=18.
- we get NaN for the first row because, when the window is not offset-based, min_periods defaults to the value of window. The minimum number of observations required to compute the statistic is therefore 2, but the window for the very first row contains only one number, so NaN is returned.
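The effect of min_periods on that first row can be sketched as follows. The variable names are made up for illustration; the data is column A from the example above.

```python
import pandas as pd

df_mp = pd.DataFrame({"A": [2, 4, 8, 10]}, index=["a", "b", "c", "d"])

# Default for a fixed-size window: min_periods equals window, so row "a" is NaN
default_sum = df_mp.rolling(window=2).sum()

# Relaxed: a result is produced as soon as one observation is available
relaxed_sum = df_mp.rolling(window=2, min_periods=1).sum()

print(default_sum)
print(relaxed_sum)
```

With min_periods=1, the first row becomes the sum of the single available value instead of NaN, while the remaining rows are unchanged.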
Specifying center
Consider the following DataFrame:
df = pd.DataFrame({"A":[2,4,8,10]}, index=["a","b","c","d"])
df

    A
a   2
b   4
c   8
d  10

By default, center=False, which means that the window will not be centered around an observation:

df.rolling(window=3, min_periods=0).sum()   # center=False

      A
a   2.0
b   6.0
c  14.0
d  22.0
Here, the numbers are computed like so:
A[a]: 2 = 2
A[b]: 2 + 4 = 6         # the observation is 4 (see how 4 is right-aligned)
A[c]: 2 + 4 + 8 = 14    # the observation is 8
A[d]: 4 + 8 + 10 = 22   # the observation is 10

Compare this with the output of center=True:

df.rolling(window=3, min_periods=0, center=True).sum()

      A
a   6.0
b  14.0
c  22.0
d  18.0
Here, the numbers are computed like so:
A[a]: 2 + 4 = 6
A[b]: 2 + 4 + 8 = 14    # the observation is 4 (see how 4 is centered here)
A[c]: 4 + 8 + 10 = 22   # the observation is 8
A[d]: 8 + 10 = 18
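The window placement for both settings can be reproduced by hand. The sketch below uses a hypothetical helper, manual_window_sum, which is not part of pandas and assumes an odd window size; it only illustrates which values fall into each window.

```python
import pandas as pd

values = [2, 4, 8, 10]
df_c = pd.DataFrame({"A": values}, index=["a", "b", "c", "d"])

def manual_window_sum(values, window, center):
    """Sum each window by hand, truncating windows that run past the edges."""
    # Hypothetical helper for illustration only; assumes an odd window size
    half = window // 2
    sums = []
    for i in range(len(values)):
        if center:
            start, stop = i - half, i + half + 1   # observation in the middle
        else:
            start, stop = i - window + 1, i + 1    # observation at the right edge
        sums.append(float(sum(values[max(start, 0):min(stop, len(values))])))
    return sums

right_aligned = df_c.rolling(window=3, min_periods=1).sum()["A"].tolist()
centered = df_c.rolling(window=3, min_periods=1, center=True).sum()["A"].tolist()

print(right_aligned, manual_window_sum(values, 3, center=False))
print(centered, manual_window_sum(values, 3, center=True))
```

Here min_periods=1 is used instead of 0; since every window contains at least one value, the results are the same as in the example above.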
Time-series case
Consider the following time-series DataFrame:
idx = [pd.Timestamp('2020-12-20 15:00:00'),
       pd.Timestamp('2020-12-20 15:00:01'),
       pd.Timestamp('2020-12-20 15:00:02'),
       pd.Timestamp('2020-12-20 15:00:04'),
       pd.Timestamp('2020-12-20 15:00:05')]
df = pd.DataFrame({"A":[1,10,100,1000,10000]}, index=idx)
df

                         A
2020-12-20 15:00:00      1
2020-12-20 15:00:01     10
2020-12-20 15:00:02    100
2020-12-20 15:00:04   1000
2020-12-20 15:00:05  10000
Summing a window with a period of 2 seconds:
df.rolling(window="2s").sum()

                           A
2020-12-20 15:00:00      1.0
2020-12-20 15:00:01     11.0
2020-12-20 15:00:02    110.0
2020-12-20 15:00:04   1000.0
2020-12-20 15:00:05  11000.0
Note that since window is offset-based, min_periods=1 by default.
You can specify the closed parameter to indicate whether the endpoints should be inclusive or exclusive:
df.rolling(window="2s", closed="both").sum()   # both endpoints are inclusive

                           A
2020-12-20 15:00:00      1.0
2020-12-20 15:00:01     11.0
2020-12-20 15:00:02    111.0
2020-12-20 15:00:04   1100.0
2020-12-20 15:00:05  11000.0
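To compare the settings of closed side by side, the sketch below runs the same 2-second rolling sum once per option on the data above. The variable names are made up for illustration, and the lowercase "2s" alias is used for the offset.

```python
import pandas as pd

idx_ts = pd.to_datetime([
    "2020-12-20 15:00:00",
    "2020-12-20 15:00:01",
    "2020-12-20 15:00:02",
    "2020-12-20 15:00:04",
    "2020-12-20 15:00:05",
])
df_ts = pd.DataFrame({"A": [1, 10, 100, 1000, 10000]}, index=idx_ts)

# One rolling sum per endpoint rule, collected as plain lists for comparison
sums_by_closed = {
    closed: df_ts.rolling(window="2s", closed=closed).sum()["A"].tolist()
    for closed in ["right", "left", "both", "neither"]
}

for closed, sums in sums_by_closed.items():
    print(f"{closed:>8}: {sums}")
```

With "left" and "neither", the current observation is excluded from its own window, so a window that contains no other observation yields NaN.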