Pandas DataFrame | cumsum method
Pandas' DataFrame.cumsum(~) method computes the cumulative sum along the rows or columns of the source DataFrame.
Parameters
1. axis | int or string | optional
Whether to compute the cumulative sum along the row or the column:
| Axis | Description |
|---|---|
| 0 or "index" | Compute the cumulative sum of each column. |
| 1 or "columns" | Compute the cumulative sum of each row. |
By default, axis=0.
2. skipna | boolean | optional
Whether or not to ignore NaN. By default, skipna=True.
Return Value
A DataFrame holding the cumulative sums of the row or column values.
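As a quick check of the return value, the sketch below builds a small DataFrame (the values are illustrative assumptions) and confirms that cumsum hands back a new DataFrame of the same shape while leaving the source untouched.

```python
import pandas as pd

# Illustrative data; any numeric DataFrame would do.
df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

result = df.cumsum()

print(type(result).__name__)      # the result is itself a DataFrame
print(result.shape == df.shape)   # same shape as the source
print(df["A"].tolist())           # the source still holds its original values
```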
Examples
Consider the following DataFrame:
df
   A  B
0  3  5
1  4  6
Cumulative sum of each column
To compute the cumulative sum for each column:
df.cumsum()
   A   B
0  3   5
1  7  11
Cumulative sum of each row
To compute the cumulative sum for each row:
df.cumsum(axis=1)
   A   B
0  3   8
1  4  10
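The two examples above can be reproduced end to end with a short script. This is a minimal sketch: the DataFrame is rebuilt from the values shown, and the variable names (col_wise, row_wise, same_cols, same_rows) are chosen here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

col_wise = df.cumsum()         # axis=0: running total down each column
row_wise = df.cumsum(axis=1)   # axis=1: running total across each row

# The string aliases are expected to match the integer axes.
same_cols = col_wise.equals(df.cumsum(axis="index"))
same_rows = row_wise.equals(df.cumsum(axis="columns"))

print(col_wise)
print(row_wise)
print(same_cols, same_rows)
```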
Dealing with missing values
Consider the following DataFrame with a missing value:
df
     A
0  3.0
1  NaN
2  5.0
By default, skipna=True, which means that missing values are skipped and do not affect the running sum:
df.cumsum() # skipna=True
     A
0  3.0
1  NaN
2  8.0
To have missing values propagate through the sum instead:
df.cumsum(skipna=False)
     A
0  3.0
1  NaN
2  NaN
Here, notice how we end up with a NaN after the first NaN. This is because the sum of a scalar and a NaN in Pandas is a NaN, that is:
5 + np.nan   # with NumPy imported as np
nan
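To see both skipna settings side by side, the following sketch rebuilds the single-column DataFrame from the values shown above. The names skipped and propagated are illustrative, and NumPy is imported directly because np.nan is a stable way to write a missing value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [3, np.nan, 5]})

skipped = df.cumsum()                  # skipna=True (default): NaN is ignored in the running sum
propagated = df.cumsum(skipna=False)   # every value from the first NaN onward becomes NaN

print(skipped)
print(propagated)
print(5 + np.nan)   # adding NaN to a number yields NaN
```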