Pandas DataFrame | transform method

Last updated: Aug 10, 2023
Tags: Python, Pandas

Pandas DataFrame.transform(~) method applies a function to transform the rows or columns of the source DataFrame. Note that a new DataFrame is returned, and the source DataFrame is kept intact.

Parameters

1. func | function or string or list or dict

The transformation applied to the rows or columns of the source DataFrame. If a function is passed, then it takes in as argument a Series or a DataFrame.

The allowed values are as follows:

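  • a function (e.g. np.abs). The function must return a result of the same length as its input, so reducing functions such as np.mean are not allowed.

  • the name of a function as a string (e.g. "abs")

  • a list of the above two (e.g. [np.exp, "sqrt"])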

  • dictionary with:

    • key: row/column label

    • value: function, function name or list of such
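The string and dictionary forms can be sketched as follows. The DataFrame here is made up for illustration, and the sketch assumes that the string "abs" is resolved to the DataFrame's own abs method.

```python
import numpy as np
import pandas as pd

# Illustrative data, chosen only for this sketch
df = pd.DataFrame({"A": [-3, 4], "B": [5, -6]})

# String form: the name of a method to look up on the DataFrame
by_name = df.transform("abs")

# Dictionary form: a different function for each column label
by_dict = df.transform({"A": np.abs, "B": lambda col: col + 1})
```

With the dictionary form, only the columns named as keys appear in the result.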

2. axis | int or string | optional

Whether to apply the transformation row-wise or column-wise:

Axis              Description
0 or "index"      Transform each column.
1 or "columns"    Transform each row.

By default, axis=0.

3. *args | any

The positional arguments you want to pass to func. In the method signature, these come after axis.

4. **kwargs | any

The keyword arguments you want to pass to func.

Return Value

A new DataFrame that has the same length as the source DataFrame. When a single function is passed, the shape is unchanged as well.
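The length requirement means that a reducing function is rejected. The sketch below uses a hypothetical helper named column_total and assumes the behaviour of recent pandas versions, where such a call raises a ValueError.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

def column_total(col):
    # Reduces each column to a single number, so it is not a transformation
    return col.sum()

try:
    df.transform(column_total)
    raised = False
except ValueError:
    # The result does not share the source's index, so pandas refuses it
    raised = True
```

For reductions like this, agg(~) or apply(~) is the more suitable method.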

Examples

Basic usage

Consider the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({"A":[-3,4],"B":[5,-6]})
df
A B
0 -3 5
1 4 -6

Applying NumPy's abs(~) method, which returns the absolute value of the input:

df.transform(np.abs) # or "abs"
A B
0 3 5
1 4 6

Passing in a function

Consider the following DataFrame:

df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
A B
0 3 5
1 4 6

We can pass in a custom function like so:

def foo(col):
    return 2 * col if np.sum(col) >= 10 else col

df.transform(foo)
A B
0 3 10
1 4 12

Here, our custom function foo takes in as argument col, which is a single column of the DataFrame (Series).

Passing in multiple functions

Consider the following DataFrame:

df = pd.DataFrame({"A":[-3,4],"B":[-5,6]})
df
A B
0 -3 -5
1 4 6

To apply multiple transformations, pass in a list like so:

df.transform([np.abs, lambda x: x + 1])
         A                   B
  absolute <lambda_0> absolute <lambda_0>
0        3         -2        5         -4
1        4          5        6          7

Notice how the two transformations are independent, that is, both transformations are applied to the original values.
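Because the result has two levels of column labels, it can help to pass named functions so that the second level is predictable. The following is a minimal sketch with two hypothetical helpers, double and plus_one, assuming the second level is taken from each function's name.

```python
import pandas as pd

df = pd.DataFrame({"A": [-3, 4], "B": [-5, 6]})

def double(x):
    return 2 * x

def plus_one(x):
    return x + 1

result = df.transform([double, plus_one])

# Columns are (original column, function name) pairs
a_doubled = result[("A", "double")]
b_plus_one = result["B"]["plus_one"]
```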

Transforming each column

Consider the following DataFrame:

df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
A B
0 3 5
1 4 6

By default, axis=0, which means that we are transforming each column:

def foo(col):
    return 2 * col if np.sum(col) >= 10 else col

df.transform(foo) # axis=0
A B
0 3 10
1 4 12

Transforming each row

Consider the same DataFrame as before:

df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
A B
0 3 5
1 4 6

To transform each row, pass in axis=1 like so:

def foo(row): # row is a Series representing a single row
    return 2 * row if np.sum(row) >= 10 else row

df.transform(foo, axis=1)
A B
0 3 5
1 8 12
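The axis can also be given by its label. As a small sketch (the function name double_if_large is made up for this example), axis="columns" should give the same result as axis=1:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

def double_if_large(row):
    # row is a Series holding one row; double it when its sum reaches 10
    return 2 * row if np.sum(row) >= 10 else row

by_number = df.transform(double_if_large, axis=1)
by_label = df.transform(double_if_large, axis="columns")
```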

Transforming certain columns only

Consider the following DataFrame:

df = pd.DataFrame({"A":[3,4],"B":[5,6],"C":[7,8]})
df
A B C
0 3 5 7
1 4 6 8

By default, the transform(~) method will either transform all columns (axis=0) or all rows (axis=1).

To transform certain columns, select the columns you wish to transform first:

def foo(val):
    return val * 3

# Here, we are transforming just columns A and B
df_new_cols = df[["A","B"]].transform(foo)
df_new_cols
A B
0 9 15
1 12 18

Here, a new DataFrame with the transformed columns is returned while the original DataFrame df is left intact. To replace the original columns, assign the result back to them:

def foo(val):
    return val * 3

df_new_cols = df[["A","B"]].transform(foo)
df[["A","B"]] = df_new_cols
df
A B C
0 9 15 7
1 12 18 8
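If you would rather keep df unchanged and still obtain a full DataFrame with the transformed columns in place, one option is assign(~). This is a sketch of that alternative rather than the approach used above; triple is a hypothetical helper name.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6], "C": [7, 8]})

def triple(val):
    return val * 3

transformed = df[["A", "B"]].transform(triple)

# assign returns a new DataFrame; df itself is not modified
updated = df.assign(**{col: transformed[col] for col in transformed.columns})
```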

Passing in arguments

Consider the following DataFrame:

df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
A B
0 3 5
1 4 6

To pass keyword arguments through to func:

def foo(x, threshold):
    return 2 * x if np.sum(x) >= threshold else x

df.transform(foo, threshold=10)
A B
0 3 10
1 4 12
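Positional arguments work too, with one caveat: assuming the signature transform(func, axis=0, *args, **kwargs), the second positional slot is taken by axis, so axis has to be written out before any extra positional value. A minimal sketch:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

def foo(x, threshold):
    return 2 * x if np.sum(x) >= threshold else x

# 0 fills the axis slot, so 10 is what reaches threshold
positional = df.transform(foo, 0, 10)
keyword = df.transform(foo, threshold=10)
```

Passing the value by keyword, as in the second call, avoids this ambiguity.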
Published by Isshin Inada