Pandas DataFrame | apply method
Pandas DataFrame.apply(~) applies the specified function to each row or column of the DataFrame.
Parameters
1. func | function
The function to apply along the rows or columns.
2. axis | string or int | optional
The axis along which to perform the function:
| Axis | Description |
|---|---|
| 0 or "index" | Function will be applied to each column. |
| 1 or "columns" | Function will be applied to each row. |
By default, axis=0.
3. raw | boolean | optional
If True, then a NumPy array will be passed as the argument for func. If False, then a Series will be passed instead.
Performance-wise, if you're applying a reductive NumPy function such as np.sum, then opt for raw=True. By default, raw=False.
4. result_type | string or None | optional
How to parse list-like return values of func. This is only relevant when axis=1 (when func is applied row-wise):
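| Value | Description |
|---|---|
| "expand" | Values of list-like results (e.g. [1,2,3]) will be separated out into columns. |
| "reduce" | Values of list-like results will be reduced to a single Series. |
| "broadcast" | Values of list-like results will be separated out into columns, but unlike "expand", the column names will be retained. |
| None | Behaviour depends on the value returned by your function. If a Series is returned, then it will be expanded into columns. If a list-like is returned, then it will be placed in a Series. |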
By default, result_type=None. Consult the examples below for clarification.
5. args | tuple | optional
Additional positional arguments you want to supply to your func.
6. **kwds | optional
Additional keyword arguments you want to supply to your func.
Return Value
The resulting Series or DataFrame after applying your function.
Examples
Applying function on columns
Consider the following DataFrame:
df
   A  B
0  2  4
1  3  5
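The code that builds this DataFrame is not shown here. A minimal sketch that reproduces it, assuming two integer columns and the default integer index:

```python
import pandas as pd

# Assumed construction: columns A and B with the default RangeIndex (0, 1)
df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})
print(df)
```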
To apply the np.sum function column-wise:
df.apply(np.sum)
A    5
B    9
dtype: int64
Pandas can benefit from performance gains if you set raw=True when applying a NumPy reductive function like np.sum.
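To see what raw=True changes, the sketch below records the type of object that the function receives. The helper total and the list seen_types are hypothetical names used only for illustration. No timing is measured here, so treat the performance claim as something to benchmark on your own data.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

seen_types = []  # type names of the arguments handed to the function

def total(values):
    # Hypothetical helper: with raw=True, values is a NumPy array, not a Series
    seen_types.append(type(values).__name__)
    return values.sum()

result = df.apply(total, raw=True)
print(result)      # column sums, labelled by column name
print(seen_types)  # what kind of object total received
```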
Applying function on rows
Consider the same DataFrame as before:
df
   A  B
0  2  4
1  3  5
To apply the np.sum function row-wise, set axis=1:
df.apply(np.sum, axis=1)
0    6
1    8
dtype: int64
Applying built-in bundlers
Consider the same DataFrame as before:
df
   A  B
0  2  4
1  3  5
You could bundle values using built-in functions such as tuple, list and even Series:
df.apply(tuple)
A    (2, 3)
B    (4, 5)
dtype: object
Applying a custom function
Consider the same DataFrame as before:
df
   A  B
0  2  4
1  3  5
To apply a custom function:
def foo(col):
    return 2 * col

df.apply(foo)
   A   B
0  4   8
1  6  10
Our function foo takes in as argument a column (axis=0) of type Series, and returns the transformed column as a Series.
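The same pattern works row-wise. With axis=1, the function receives one row at a time as a Series whose index holds the column labels. A small sketch, where the helper name spread is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

def spread(row):
    # row is a Series indexed by column name, so values are looked up by label
    return row["B"] - row["A"]

result = df.apply(spread, axis=1)
print(result)  # one value per row, labelled by the row index
```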
Passing in keyword arguments
To pass in keyword arguments to func:
def foo(col, x):
    return x * col

df.apply(foo, x=2)
   A   B
0  4   8
1  6  10
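The args parameter covers the positional case. As a sketch, the call below passes 2 as the second positional argument of foo and should give the same result as the keyword form. Note that args must be a tuple, hence the trailing comma.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

def foo(col, x):
    return x * col

by_position = df.apply(foo, args=(2,))  # x supplied positionally via args
by_keyword = df.apply(foo, x=2)         # x supplied as a keyword argument
print(by_position.equals(by_keyword))
```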
Different ways of parsing list-like return values
Consider the following DataFrame:
df
   A  B
a  4  6
b  5  7
The parameter result_type comes into play when the return type of the function is list-like.
Return type None
When result_type is not set, the default behaviour is to place list-like return values in a Series:
df.apply(lambda x: [1,2,3], axis=1) # Returns a Series
a    [1, 2, 3]
b    [1, 2, 3]
dtype: object
Note that lambda x: [1,2,3] is equivalent to the following:
def foo(x):  # The function name isn't important here
    return [1,2,3]
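The default behaviour differs when the function returns a Series instead of a list. In that case the result is expanded into columns, and the labels of the returned Series become the column names. A minimal sketch, where the helper summarise and the labels total and diff are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5], "B": [6, 7]}, index=["a", "b"])

def summarise(row):
    # Returning a Series (not a list) makes apply build a DataFrame
    return pd.Series({"total": row["A"] + row["B"], "diff": row["B"] - row["A"]})

result = df.apply(summarise, axis=1)  # result_type left as None
print(result)
```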
Return type expand
When result_type="expand", the values of a list-like will be separated out into columns, resulting in a DataFrame:
df.apply(lambda x: [1,2,3], axis=1, result_type="expand") # Returns a DataFrame
   0  1  2
a  1  2  3
b  1  2  3
Notice how we no longer have our original column names A and B.
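Since "expand" labels the new columns with integers, you may want to assign names afterwards. One possible approach, where the column name total is an assumption made for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5], "B": [6, 7]}, index=["a", "b"])

# Each row yields a three-item list, which "expand" spreads over columns 0, 1, 2
expanded = df.apply(
    lambda row: [row["A"], row["B"], row["A"] + row["B"]],
    axis=1,
    result_type="expand",
)

# Replace the integer labels with meaningful names
expanded.columns = ["A", "B", "total"]
print(expanded)
```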
Return type broadcast
When result_type="broadcast", the list-like values will be separated out into columns, but unlike "expand", the column names will be retained:
df.apply(lambda x: [1,2], axis=1, result_type="broadcast") # Returns a DataFrame
   A  B
a  1  2
b  1  2
For this to work, the length of the list-like must equal the number of columns in the source DataFrame. This means that returning [1,2,3] instead of [1,2] in this case would result in an error.
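The length requirement can be checked with a short sketch. The exception is caught broadly here because the exact exception type and message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5], "B": [6, 7]}, index=["a", "b"])

# Length 2 matches the two columns, so the original labels are kept
ok = df.apply(lambda row: [1, 2], axis=1, result_type="broadcast")
print(ok)

# Length 3 does not match, so pandas is expected to raise
try:
    df.apply(lambda row: [1, 2, 3], axis=1, result_type="broadcast")
    failed = False
except Exception as exc:  # exact type not assumed
    failed = True
    print(type(exc).__name__, exc)
```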