# Calculating the percentage of each value in each group in Pandas

Aug 12, 2023


Consider the following DataFrame:

```
import pandas as pd

df = pd.DataFrame({"A":[2,3,4],"B":[6,7,8],"group":["a","a","b"]})
df
   A  B group
0  2  6     a
1  3  7     a
2  4  8     b
```

To compute the percentage of each value within each distinct `group`:

```
df.groupby("group").apply(lambda my_df: my_df / my_df.sum())
     A         B
0  0.4  0.461538
1  0.6  0.538462
2  1.0  1.000000
```

Note the following:

- the function passed to `apply(~)` is called twice in this case, once for each group.
- the argument (`my_df`) passed to this function is a DataFrame representing a single group.
- `my_df.sum()` returns a Series containing the sum of each column of `my_df`. In this case, for group `a`, `my_df.sum()` evaluates to a Series holding the values `[5,13]`.
- dividing `my_df` by this Series divides the values in column `A` by `5` and the values in column `B` by `13`.
- the function passed to `apply(~)` returns a DataFrame.
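To make these steps concrete, the following sketch reproduces the computation for group `a` by hand. Selecting the group with a boolean mask and restricting it to columns `A` and `B` are assumptions made for illustration and are not part of the recipe above; the names `group_a`, `col_sums` and `pct_a` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3, 4], "B": [6, 7, 8], "group": ["a", "a", "b"]})

# Rows belonging to group "a", numeric columns only (illustrative selection)
group_a = df.loc[df["group"] == "a", ["A", "B"]]

# Column sums for this group: A -> 5, B -> 13
col_sums = group_a.sum()

# Dividing a DataFrame by a Series aligns the Series index with the column labels,
# so column A is divided by 5 and column B by 13
pct_a = group_a / col_sums
```

Here `pct_a` holds the share of each value within group `a`, so each of its columns sums to 1.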

* * *

To compute the percentage of a specific column instead of all numeric columns:

```
df.groupby("group").apply(lambda my_df: my_df["A"] / my_df["A"].sum())
group
a      0    0.4
       1    0.6
b      2    1.0
Name: A, dtype: float64
```
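As an alternative sketch, the same within-group share of column `A` can be computed with `transform("sum")`, which broadcasts each group's sum back onto the original rows. This is a different technique from the `apply(~)` approach above, and the column name `A_pct` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3, 4], "B": [6, 7, 8], "group": ["a", "a", "b"]})

# transform("sum") returns a Series aligned with df's index,
# holding the sum of A for the group each row belongs to
group_sums = df.groupby("group")["A"].transform("sum")

# Share of each value of A within its group, stored as a new column
df["A_pct"] = df["A"] / group_sums
```

Because the result keeps the original row index, it can be assigned directly as a new column of `df`.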

Published by Isshin Inada
