Pandas DataFrame | stack method
Pandas DataFrame.stack(~) method converts the specified column levels to row levels. This is the reverse of unstack(~).
Parameters
1. level | int or string | optional
The integer index or name(s) of the column level to convert into a row level. By default, level=-1, which means that the inner-most column level is converted.
2. dropna | boolean | optional
Whether or not to drop resulting rows that contain just NaN. By default, dropna=True.
Return Value
A Series or a DataFrame.
Examples
Stacking single-level DataFrames
Consider the following single-level DataFrame:
import pandas as pd

df = pd.DataFrame([[2,3],[4,5]], columns=["alice","bob"], index=["age","height"])
df

        alice  bob
age         2    3
height      4    5
Calling stack() on df gives:
df.stack()
age     alice    2
        bob      3
height  alice    4
        bob      5
dtype: int64
Here, note the following:
- the return type is Series with a 2-level index.
- the row labels and the column labels in df have merged to form a multi-index.
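The two observations above, together with the earlier remark that stack(~) is the reverse of unstack(~), can be checked directly. The following is a minimal sketch that rebuilds the same df; the variable names stacked and restored are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[2, 3], [4, 5]], columns=["alice", "bob"], index=["age", "height"])

stacked = df.stack()

# With single-level columns, the result is a Series rather than a DataFrame.
print(type(stacked).__name__)

# The index has two levels: the original row labels, then the column labels.
print(stacked.index.nlevels)

# A value is addressed by a (row label, column label) pair.
print(stacked.loc[("height", "alice")])

# unstack() moves the inner-most row level back to the columns.
restored = stacked.unstack()
print(restored)
```

Here restored should have the same labels and values as the original df, since the stacked column level is moved back to where it came from.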
Stacking DataFrames with multi-level columns
Consider the following DataFrame with multi-level columns:
index = [("A", "alice"), ("A", "bob"), ("B","cathy")]
multi_index = pd.MultiIndex.from_tuples(index)
df = pd.DataFrame([[2,3,4],[5,6,7]], columns=multi_index, index=["age","height"])
df

           A         B
       alice bob cathy
age        2   3     4
height     5   6     7
By default, level=-1, which means that the inner-most column level ([alice,bob,cathy]) will be converted into a row level:
df.stack()
                A    B
age    alice  2.0  NaN
       bob    3.0  NaN
       cathy  NaN  4.0
height alice  5.0  NaN
       bob    6.0  NaN
       cathy  NaN  7.0
Note the following:
- the inner-most column level ([alice, bob, cathy]) became a row level, and is positioned as the inner-most level of the row index.
- stacking multi-level columns often yields many NaN values since, for instance, no data exists about the age of alice in group B.
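To see where the NaN values come from, it may help to compare the column pairs that exist in the original DataFrame against the number of cells in the stacked result. This is a small sketch using the same data; existing_pairs is an illustrative name.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples([("A", "alice"), ("A", "bob"), ("B", "cathy")])
df = pd.DataFrame([[2, 3, 4], [5, 6, 7]], columns=columns, index=["age", "height"])

stacked = df.stack()

# Only three (group, name) pairs exist in the original columns.
existing_pairs = list(df.columns)
print(existing_pairs)

# The stacked frame has one row per (row label, name) and one column per group,
# so it has more cells than the original had values.
print(stacked.shape)

# Cells whose (group, name) pair never existed in df are filled with NaN.
print(int(stacked.isna().sum().sum()))
```

The original holds 6 values, while the stacked frame has 12 cells, so half of them have no source value and are reported as NaN.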
To specify which levels to convert, pass the level parameter like so:
df.stack(level=0)
          alice  bob  cathy
age    A    2.0  3.0    NaN
       B    NaN  NaN    4.0
height A    5.0  6.0    NaN
       B    NaN  NaN    7.0
Here, level=0 means that the outermost column level ([A,B]) is converted into a row level.
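Since level also accepts a level name, the same result can be obtained by naming the column levels and passing the name. The sketch below assumes the level names "group" and "person", which are hypothetical and not part of the original example.

```python
import pandas as pd

# "group" and "person" are hypothetical level names chosen for illustration.
columns = pd.MultiIndex.from_tuples(
    [("A", "alice"), ("A", "bob"), ("B", "cathy")],
    names=["group", "person"],
)
df = pd.DataFrame([[2, 3, 4], [5, 6, 7]], columns=columns, index=["age", "height"])

# Select the outermost column level by name, then by position.
by_name = df.stack(level="group")
by_position = df.stack(level=0)

# The stacked level keeps its name in the row index.
print(list(by_name.index.names))

# Both calls refer to the same level, so the results match.
print(by_name.equals(by_position))
```

Referring to levels by name tends to be more robust than positions when the order of the column levels may change.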
Specifying dropna
Consider the following DataFrame:
index = [("A", "alice"), ("A", "bob"), ("B","cathy")]
multi_index = pd.MultiIndex.from_tuples(index)
df = pd.DataFrame([[2,3,None],[5,6,7]], columns=multi_index, index=["age","height"])
df

           A         B
       alice bob cathy
age        2   3   NaN
height     5   6   7.0
By default, dropna=True, which means that rows that contain just NaN will be removed from the result:
df.stack()
                A    B
age    alice  2.0  NaN
       bob    3.0  NaN
height alice  5.0  NaN
       bob    6.0  NaN
       cathy  NaN  7.0
Notice how cathy's row under age is missing. This is because every value in that row is NaN.
To keep all rows, pass dropna=False like so:
df.stack(dropna=False)
                A    B
age    alice  2.0  NaN
       bob    3.0  NaN
       cathy  NaN  NaN
height alice  5.0  NaN
       bob    6.0  NaN
       cathy  NaN  7.0
Notice how we now have cathy's row under age.
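To make the rule "rows that contain just NaN" concrete, the sketch below applies DataFrame.dropna(how="all") to the stacked result explicitly. This only illustrates the rule and is not a description of how stack(~) works internally. It deliberately avoids passing dropna to stack(~), because pandas 2.1 introduced a future_stack parameter under which dropna must be left unspecified, so the accepted arguments can depend on the installed version.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples([("A", "alice"), ("A", "bob"), ("B", "cathy")])
df = pd.DataFrame([[2, 3, None], [5, 6, 7]], columns=columns, index=["age", "height"])

stacked = df.stack()

# Explicitly remove rows in which every column is NaN.
# Rows with at least one non-NaN value are kept.
cleaned = stacked.dropna(how="all")

# (age, cathy) has NaN in both A and B, so it is not in the cleaned result.
print(("age", "cathy") in cleaned.index)

# (height, cathy) has a value in B, so it is kept even though A is NaN.
print(("height", "cathy") in cleaned.index)

print(len(cleaned))
```

A row such as (height, cathy) survives because only one of its values is missing, which is the distinction that how="all" captures.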