**Pandas**

*chevron_left*

**Row and Column Operations Cookbook**

# Updating rows based on column values in Pandas DataFrame

*schedule*May 20, 2023

*toc*Table of Contents

*expand_more*

**mathematics behind data science**with 100+ top-tier guides

Start your free 7-days trial now!

# Filling rows where condition is based on their values with a constant

Consider the following DataFrame:

```
df = pd.DataFrame({"A":[1,2,3],"B":[4,5,6],"C":[7,8,9]}, index=["a","b","c"])df
A B Ca 1 4 7b 2 5 8c 3 6 9
```

To fill rows where value for column `A`

is `1`

or value for column `C`

is greater than or equal to `9`

:

```
df.loc[(df["A"] == 1) | (df["C"] >= 9)] = 0df
A B Ca 0 0 0b 2 5 8c 0 0 0
```

Here, we are first extracting the following `Series`

of booleans:

```
(df["A"] == 1) | (df["C"] >= 9)
a Trueb Falsec Truedtype: bool
```

Passing in this boolean mask into the `loc`

property will return the rows that correspond to `True`

. We then fill these rows with the value `0`

using standard assignment (`=`

).

# Filling certain row values where condition is based on their values with a constant

Consider the following DataFrame:

```
df = pd.DataFrame({"A":[1,2,3],"B":[4,5,6],"C":[7,8,9]}, index=["a","b","c"])df
A B Ca 1 4 7b 2 5 8c 3 6 9
```

Instead of filling the entire rows with a constant, you can specify which rows to fill like so:

```
df.loc[(df["A"] == 1) | (df["C"] >= 9), "B"] = 0df
A B Ca 1 0 7b 2 5 8c 3 0 9
```

Here, the `"B"`

after the comma indicates that we want to only update column `B`

, and leave the other column values intact.

# Filling rows where condition is based on a function of their values

Consider the following DataFrame:

```
df = pd.DataFrame({"A":[1,2,3],"B":[4,5,6],"C":[7,8,9]}, index= ["a","b","c"])df
A B Ca 1 4 7b 2 5 8c 3 6 9
```

To fill rows where the sum of the value for column `A`

and the value for column `B`

is greater than `6`

:

```
def criteria(my_df): return my_df["A"] + my_df["B"] > 6
```

df.loc[criteria] = 0df
A B Ca 1 4 7b 0 0 0c 0 0 0

To clarify, `criteria(my_df)`

takes in as argument the source DataFrame, and returns a `Series`

of booleans where `True`

corresponds to the rows that satisfy the condition:

```
def criteria(df): print(df["A"] + df["B"] > 6) return df["A"] + df["B"] > 6
```

df.loc[criteria] = 0
a Falseb Truec Truedtype: bool

The `loc`

property will then return all the rows that correspond to `True`

in this boolean mask.

# Filling rows using a function of their values

Consider the following DataFrame:

```
df = pd.DataFrame({"A":[1,2,3],"B":[4,5,6],"C":[7,8,9]}, index= ["a","b","c"])df
A B Ca 1 4 7b 2 5 8c 3 6 9
```

To double the values of rows where the value for column `B`

is larger than `4`

:

```
df.loc[df["B"] > 4] = df * 2df
A B Ca 1 4 7b 4 10 16c 6 12 18
```

Here, `loc`

returns all the rows where the value for column `B`

is larger than `4`

. These rows are then assigned new values using `=`

. Note that the assignment only updates the rows returned by `loc`

, and so the rows that do not satisfy the condition will be kept intact.