
Pandas DataFrame | agg method

Last updated: Aug 12, 2023
Tags: Python, Pandas

Pandas' DataFrame.agg(~) applies one or more specified functions to each row or column of the DataFrame.

Parameters

1. func | string or list or dict or function

The function to use for aggregating:

Type | Example
function | np.sum
function name as a string | "mean"
list of functions or strings | [np.sum, "mean"]
dict | {"A": np.sum}

Built-in functions that can be used for aggregating are as follows:

Built-in aggregate | Description
sum | sum
prod | product of values
size | number of values
count | number of non-NaN values
mean | mean
var | variance
std | standard deviation
sem | unbiased standard error of the mean
mad | mean absolute deviation (removed in pandas 2.0)
min | minimum
max | maximum
median | median
mode | mode
quantile | quantile
abs | absolute value
skew | unbiased skewness
kurt | unbiased kurtosis
cumsum | cumulative sum
cumprod | cumulative product
cummax | cumulative max
cummin | cumulative min
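Not every entry in the list above reduces a column to a single value. As a minimal sketch (using a small DataFrame made up for illustration), the cumulative functions return a result with the same shape as the input, whereas a reducing function such as "sum" returns one value per column:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# A reducing aggregate: one value per column, returned as a Series
totals = df.agg("sum")

# A cumulative function: one value per input row, returned as a DataFrame
running = df.agg("cumsum")

print(totals)
print(running)
```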

2. axis | int or string | optional

Whether to apply the function column-wise or row-wise:

Axis | Description
0 or "index" | func will be applied to each column.
1 or "columns" | func will be applied to each row.

By default, axis=0.
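The integer and string forms of axis are interchangeable. A short sketch, using an illustrative DataFrame, of how the two axis values change what gets aggregated:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Default axis=0 (or "index"): one result per column
by_column = df.agg("sum")

# axis=1 (or "columns"): one result per row
by_row = df.agg("sum", axis="columns")

print(by_column)
print(by_row)
```

Here by_column is indexed by the column labels (A and B), while by_row is indexed by the row labels (0 and 1).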

3. *args

Positional arguments to pass to func.

4. **kwargs

Keyword arguments to pass to func.

Return Value

A scalar, Series or DataFrame, depending on the func that is passed.
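As a quick illustration of how the return type follows from func (the DataFrame here is made up for the sketch), a single aggregate yields a Series while a list of aggregates yields a DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

single = df.agg("mean")              # one aggregate -> Series
multiple = df.agg(["mean", "min"])   # list of aggregates -> DataFrame

print(type(single).__name__)
print(type(multiple).__name__)
```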

Examples

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

Computing a single aggregate

To compute the mean of each column:

df.agg("mean")      # Equivalent to df.agg(np.mean), with NumPy imported as np
A    1.5
B    3.5
dtype: float64

Here, the returned data type is Series.

Computing multiple aggregates

To compute the mean as well as the minimum of each column:

df.agg(["mean", "min"])
      A    B
mean  1.5  3.5
min   1.0  3.0

Here, the returned data type is DataFrame.

Computing aggregates for a subset of columns

To compute the minimum of just column A:

df.agg({"A":"min"})   # Returns a Series
A    1
dtype: int64
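The dict form can also assign a different aggregate to each column. The following is a minimal sketch with an illustrative DataFrame; when the dict values are lists, the result is a DataFrame, and combinations that were not requested are filled with NaN:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# One aggregate per column -> Series indexed by column label
per_column = df.agg({"A": "min", "B": "max"})

# Lists of aggregates per column -> DataFrame with NaN for missing combinations
mixed = df.agg({"A": ["min", "max"], "B": ["sum"]})

print(per_column)
print(mixed)
```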

Computing aggregates row-wise

To compute the maximum of every row, set axis=1 like so:

df.agg("max", axis=1)   # Returns a Series
0    3
1    4
dtype: int64

Defining custom aggregate functions

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

We can pass a custom function that serves as the aggregate:

df.agg(lambda col: 2 * sum(col))
A     6
B    14
dtype: int64

Here, col is a Series that represents a column of df.

Passing in additional parameters

We can pass in additional parameters to func like so:

def foo(col, x):
    return col + x

df.agg(foo, x=5)
   A  B
0  6  8
1  7  9
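Extra arguments can also be passed positionally through *args. Because axis is the second parameter of agg, positional arguments for func come after it. A small sketch, where scaled_sum and factor are hypothetical names chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

def scaled_sum(col, factor):
    # col is a Series holding one column; factor is the extra argument
    return factor * col.sum()

# Positional form: func, then axis, then the arguments forwarded to func
positional = df.agg(scaled_sum, 0, 2)

# Keyword form: forwarded to func via **kwargs
keyword = df.agg(scaled_sum, factor=2)

print(positional)
print(keyword)
```

Passing the argument by keyword is usually easier to read, since it avoids having to spell out axis first.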
Published by Isshin Inada