Pandas DataFrame | agg method
Pandas DataFrame.agg(~) applies the specified function to each row or column of the DataFrame.
Parameters
1. func | string or list or dict or function
The function to use for aggregating:
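| Type | Example |
|---|---|
| Function | lambda col: 2 * sum(col) |
| Function name as a string | "mean" |
| List of functions or function names | ["mean", "min"] |
| Dict mapping column labels to functions or function names | {"A": "min"} |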
Built-in functions that can be used for aggregating are as follows:
| Built-in aggregates | Description |
|---|---|
| sum | sum |
| prod | product of values |
| size | number of values |
| count | number of non-NaN values |
| mean | mean |
| var | variance |
| std | standard deviation |
| sem | unbiased standard error of mean |
| mad | mean absolute deviation (removed in pandas 2.0) |
| min | minimum |
| max | maximum |
| median | median |
| mode | mode |
| quantile | quantile |
| abs | absolute value |
| skew | unbiased skewness |
| kurt | unbiased kurtosis |
| cumsum | cumulative sum |
| cumprod | cumulative product |
| cummax | cumulative max |
| cummin | cumulative min |
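As a rough illustration of the list above, the sketch below contrasts a reducing aggregate ("count") with a cumulative one ("cumsum"). The DataFrame df_nan and its values are made up for this example and are not part of the original page.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column A
df_nan = pd.DataFrame({"A": [1.0, np.nan], "B": [3, 4]})

# "count" reduces each column to its number of non-missing values
counts = df_nan.agg("count")

# "cumsum" is cumulative, so the result keeps the shape of the input
running = df_nan[["B"]].agg("cumsum")

print(counts)
print(running)
```

Reducing aggregates collapse each column into a single value, whereas the cumulative ones return one value per input row.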
2. axis | int or string | optional
Whether to apply the function column-wise or row-wise:
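| Axis | Description |
|---|---|
| 0 or "index" | The function is applied to each column. |
| 1 or "columns" | The function is applied to each row. |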
By default, axis=0.
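A minimal sketch of how the axis parameter changes the result, assuming the same two-column DataFrame used in the examples on this page. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# axis=0 (default): one value per column
col_sums = df.agg("sum")

# axis=1: one value per row
row_sums = df.agg("sum", axis=1)

# The string alias "columns" behaves like axis=1
row_sums_alias = df.agg("sum", axis="columns")

print(col_sums)
print(row_sums)
```

With axis=0 the result is indexed by column label; with axis=1 it is indexed by row label.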
3. *args
Positional arguments to pass to func.
4. **kwargs
Keyword arguments to pass to func.
Return Value
A new scalar, Series or a DataFrame depending on the func that is passed.
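The following sketch shows one way each return type can arise. Note the assumption that the scalar case comes from calling agg on a single column (a Series) rather than on the whole DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# One function on a single column (Series.agg) gives a scalar
scalar_result = df["A"].agg("mean")

# One function on the DataFrame gives a Series
series_result = df.agg("mean")

# A list of functions on the DataFrame gives a DataFrame
frame_result = df.agg(["mean", "min"])

print(type(scalar_result), type(series_result), type(frame_result))
```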
Examples
Consider the following DataFrame:
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df
   A  B
0  1  3
1  2  4
Computing a single aggregate
To compute the mean of each column:
df.agg("mean")   # Same result as df.agg(np.mean), with numpy imported as np
A    1.5
B    3.5
dtype: float64
Here, the returned data type is Series.
Computing multiple aggregates
To compute the mean as well as the minimum of each column:
df.agg(["mean", "min"])
        A    B
mean  1.5  3.5
min   1.0  3.0
Here, the returned data type is DataFrame.
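Since each aggregate becomes a row label of the result, individual values can be looked up with .loc. A short sketch, assuming the same df as above (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

stats = df.agg(["mean", "min"])

# Rows are labelled by aggregate name, columns by the original column labels
mean_of_A = stats.loc["mean", "A"]
min_row = stats.loc["min"]

print(mean_of_A)
print(min_row)
```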
Computing aggregates for a subset of columns
To compute the minimum of just column A:
df.agg({"A":"min"}) # Returns a Series
A    1
dtype: int64
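The dict form can also assign a different aggregate to each column, or a list of aggregates to one column. The sketch below assumes the same df. The combinations chosen are only examples.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# A different aggregate per column: the result is a Series
per_column = df.agg({"A": "min", "B": "max"})

# A list of aggregates for one column: the result is a DataFrame
several_for_A = df.agg({"A": ["min", "max"]})

print(per_column)
print(several_for_A)
```

Columns that do not appear as keys in the dict are left out of the result.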
Computing aggregates row-wise
To compute the maximum of every row, set axis=1 like so:
df.agg("max", axis=1) # Returns a Series
0    3
1    4
dtype: int64
Defining custom aggregate functions
Consider the following DataFrame:
df
   A  B
0  1  3
1  2  4
We can pass a custom function that serves as the aggregate:
df.agg(lambda col: 2 * sum(col))
A     6
B    14
dtype: int64
Here, col is a Series that represents a column of df.
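A named function works the same way as a lambda. As a small sketch, the hypothetical helper value_range below (not part of pandas) computes the spread between the largest and smallest value in each column:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

def value_range(col):
    # col is a Series holding one column of df
    return col.max() - col.min()

ranges = df.agg(value_range)
print(ranges)
```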
Passing in additional parameters
We can pass in additional parameters to func like so:
def foo(col, x):
    return col + x

df.agg(foo, x=5)
   A  B
0  6  8
1  7  9
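Note that foo returns a whole Series for each column, which is why the result above is a DataFrame. Extra keyword arguments work just as well with a function that reduces each column to one value. The helper scaled_range and its scale parameter below are hypothetical, chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

def scaled_range(col, scale):
    # Reduce the column to a single number, then apply the extra argument
    return (col.max() - col.min()) * scale

# scale=10 is forwarded to scaled_range through **kwargs
result = df.agg(scaled_range, scale=10)
print(result)
```

Because scaled_range returns a single value per column, the result here is a Series rather than a DataFrame.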