Pandas DataFrame | where method
Pandas DataFrame.where(~) uses a boolean mask to selectively replace values in the source DataFrame.
Parameters
1. cond | boolean or array-like or callable
A boolean mask, which is an array-like structure (e.g. Series and DataFrame) that contains either True or False as its entries.
If an entry is True, then the corresponding value in the source DataFrame will be left as is.
If an entry is False, then the corresponding value in the source DataFrame will be replaced by that in other.
If a callable is passed, then the function takes as argument a DataFrame and returns a DataFrame of booleans. This callable must not modify the source DataFrame.
2. other | scalar or Series or DataFrame or function | optional
The values that replace the entries where cond is False. By default, other=NaN.
If a callable is passed, then the function takes the source DataFrame as its argument and returns a scalar, Series or DataFrame that holds the replacement values. Once again, this callable must not modify the source DataFrame.
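As a minimal sketch of what happens when other is left out, the snippet below passes only a mask. The names df, mask and result are illustrative. Because the replaced entries become NaN, the integer columns are upcast to floats.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
mask = pd.DataFrame({"A": [True, False], "B": [False, True]})

# No `other` is given, so entries where the mask is False become NaN.
result = df.where(mask)
print(result)
print(result.dtypes)  # the integer columns are upcast to float64 to hold NaN
```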
3. inplace | boolean | optional
Whether or not to perform the method in place. An in-place method directly modifies the source DataFrame instead of creating and returning a new DataFrame. By default, inplace=False.
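The following sketch illustrates the difference, assuming a small DataFrame named df (the name returned is only for illustration). With inplace=True the call returns None and df itself is changed.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

# With inplace=True, nothing is returned and df itself is modified.
returned = df.where(df > 4, 10, inplace=True)

print(returned)  # None
print(df)        # values that are not greater than 4 have been replaced by 10
```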
4. axis | int | optional
The axis along which to perform the method. By default, axis=None.
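One situation where axis comes into play is when other is a Series, since pandas needs to know whether to align it with the row labels or the column labels. The sketch below is an illustration under that assumption; row_fill and col_fill are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

# axis=0: the Series is aligned with the row labels (the index).
row_fill = pd.Series([100, 200], index=df.index)
by_row = df.where(df > 4, row_fill, axis=0)

# axis=1: the Series is aligned with the column labels.
col_fill = pd.Series({"A": -1, "B": -2})
by_col = df.where(df > 4, col_fill, axis=1)

print(by_row)
print(by_col)
```

In both calls only column A is replaced, because every value in column B is greater than 4.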
5. level | int | optional
The levels on which to perform the method. This is only relevant if your source DataFrame has a MultiIndex.
6. errors | string | optional
Whether to raise or suppress errors:
| Value | Description |
|---|---|
| "raise" | Allow for errors to be raised. |
| "ignore" | When an error occurs, return the source DataFrame. |
By default, errors="raise". Note that this parameter was deprecated in pandas 1.5 and removed in pandas 2.0.
7. try_cast | boolean | optional
Whether or not to cast the resulting DataFrame into the source DataFrame's type. By default, try_cast=False. Note that this parameter was deprecated in pandas 1.3 and removed in pandas 2.0.
Return Value
A DataFrame.
Examples
Basic usage
Consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
   A  B
0  3  5
1  4  6
Suppose we have the following DataFrame that acts as the boolean mask:
df_mask = pd.DataFrame({"A":[True,False],"B":[False,True]})
df_mask
       A      B
0   True  False
1  False   True
We then call where(~) to replace the values in df whose corresponding entry in df_mask is False:
df.where(df_mask, 10)
    A   B
0   3  10
1  10   6
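For comparison, pandas also provides DataFrame.mask(~), which replaces values where the condition is True rather than False. The sketch below, with illustrative variable names, suggests that inverting the mask and calling mask(~) gives the same result as where(~).

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
df_mask = pd.DataFrame({"A": [True, False], "B": [False, True]})

# where keeps values where the mask is True and replaces the rest.
kept_where_true = df.where(df_mask, 10)

# mask does the opposite, so the inverted mask yields the same DataFrame.
replaced_where_true = df.mask(~df_mask, 10)

print(kept_where_true.equals(replaced_where_true))
```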
Passing in a callable for cond
Consider the same DataFrame as before:
df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
   A  B
0  3  5
1  4  6
Instead of specifying an array-like mask as the first parameter, we can also pass in a function like so:
def foo(my_df):
    return my_df > 4

df.where(foo, 10)
    A  B
0  10  5
1  10  6
Here, the function foo takes in as argument the entire DataFrame, and returns a DataFrame of booleans. Again, a boolean of True would mean that the corresponding values will be kept intact, while replacement is carried out for False.
Note that the previous code snippet can be written compactly using lambdas:
df.where(lambda x : x > 4, 10)
    A  B
0  10  5
1  10  6
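A callable for cond can be convenient in a method chain, where the intermediate DataFrame has no variable name to refer to. The following is a sketch under that assumption; the column C and the replacement value 0 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

# The lambda passed to where receives the DataFrame produced by assign,
# which includes the new column C.
result = (
    df.assign(C=lambda d: d["A"] + d["B"])
      .where(lambda d: d > 4, 0)
)
print(result)
```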
Passing in callable for other
Consider the same DataFrame as before:
df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
   A  B
0  3  5
1  4  6
Suppose we have a mask as follows:
my_mask = [[True,False],[True,False]]
my_mask
[[True, False], [True, False]]
Let us pass a callable for the other parameter:
df.where(my_mask, lambda x : x + 10)
   A   B
0  3  15
1  4  16
The callable takes the source DataFrame as its argument and returns the replacement values. Only the entries where the mask is False are taken from its result.
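To check what the callable actually receives, the sketch below records the type of its argument. The helper replacer and the list seen are hypothetical names used only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
my_mask = [[True, False], [True, False]]

seen = []  # records the type name of whatever the callable is given

def replacer(frame):
    seen.append(type(frame).__name__)
    return frame + 10

result = df.where(my_mask, replacer)

print(seen)
print(result)
```

Column A is left untouched because its mask entries are True, while column B takes the values computed by replacer.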