search
Search
Unlock 100+ guides
search toc
close
account_circle
Profile
exit_to_app
Sign out
What does this mean?
Why is this true?
Give me some examples!
search
keyboard_voice
close
Searching Tips
Search for a recipe:
"Creating a table in MySQL"
Search for an API documentation: "@append"
Search for code: "!dataframe"
Apply a tag filter: "#python"
Useful Shortcuts
/ to open search panel
Esc to close search panel
to navigate between search results
d to clear all current filters
Enter to expand content preview
Doc Search
Code Search Beta
SORRY NOTHING FOUND!
mic
Start speaking...
Voice search is only supported in Safari and Chrome.
Shrink
Navigate to

# Getting rows with missing values (NaNs) in Pandas DataFrame

schedule Aug 12, 2023
Last updated
local_offer
PythonPandas
Tags
mode_heat
Master the mathematics behind data science with 100+ top-tier guides
Start your free 7-days trial now!

# Example

Consider the following DataFrame:

``` df = pd.DataFrame({"A":[np.nan,3,np.nan],"B":[4,5,6],"C":[np.nan,7,8]}, index=["a","b","c"])df A B Ca NaN 4 NaNb 3.0 5 7.0c NaN 6 8.0 ```

## Rows with at least one missing value

### Solution

To get rows with missing values in `df`:

``` df[df.isna().any(axis=1)] A B Ca NaN 4 NaNc NaN 6 8.0 ```

### Explanation

The `isna()` method returns a DataFrame of booleans where `True` indicates the presence of a missing value:

``` df.isna() A B Ca True False Trueb False False Falsec True False False ```

We then use `any(axis=1)` to obtain a `Series` where `True` represents the presence of at least one `True` in each row:

``` df.isna().any(axis=1) a Trueb Falsec Truedtype: bool ```

The parameter `axis=1` is needed here since the default behaviour of `any(~)` is to scan through each column (as opposed to each row).

With this boolean mask, we can then extract rows that correspond to `True` using `[]` syntax:

``` df[df.isna().any(axis=1)] A B Ca NaN 4 NaNc NaN 6 8.0 ```

## Rows with missing value for a certain column

We show the same `df` here for your reference:

``` df = pd.DataFrame({"A":[np.nan,3,np.nan],"B":[4,5,6],"C":[np.nan,7,8]}, index=["a","b","c"])df A B Ca NaN 4 NaNb 3.0 5 7.0c NaN 6 8.0 ```

### Solution

To get rows with missing value in column `C`:

``` df[df["C"].isna()] A B Ca NaN 4 NaN ```

### Explanation

We first begin by extracting column `C` as a Series:

``` df["C"] a NaNb 7.0c 8.0Name: C, dtype: float64 ```

Next, we use the Series' `isna()` method to get a Series of booleans where `True` indicates the presence of `NaN`:

``` df["C"].isna() a Trueb Falsec FalseName: C, dtype: bool ```

Finally, we pass in this boolean mask to extract the rows corresponding to `True` using `[~]` syntax:

``` df[df["C"].isna()] A B Ca NaN 4 NaN ```

## Rows with missing value for multiple columns

We show the same `df` here for your reference:

``` df = pd.DataFrame({"A":[np.nan,3,np.nan],"B":[4,5,6],"C":[np.nan,7,8]}, index=["a","b","c"])df A B Ca NaN 4 NaNb 3.0 5 7.0c NaN 6 8.0 ```

### Solution

To get rows with missing value in columns `A` and `C`:

``` df[df[["A","C"]].isna().all(axis=1)] A B Ca NaN 4 NaN ```

### Explanation

We first fetch columns `A` and `C` as a DataFrame using `[~]` syntax:

``` df[["A","C"]] A Ca NaN NaNb 3.0 7.0c NaN 8.0 ```

We then use the `isna()` method to get a DataFrame of booleans where `True` indicates the presence of `NaN`:

``` df[["A","C"]].isna() A Ca True Trueb False Falsec True False ```

Next, we use `all(axis=1)` get a Series of booleans where `True` indicates a row with all `True`s:

``` df[["A","C"]].isna().all(axis=1) a Trueb Falsec Falsedtype: bool ```

Finally, we use the `[]` syntax to extract the rows corresponding to `True`:

``` df[df[["A","C"]].isna().all(axis=1)] A B Ca NaN 4 NaN ```

## Rows with missing values for all columns

Consider the following DataFrame:

``` df = pd.DataFrame({"A":[np.nan,np.nan],"B":[np.nan,4]}, index=["a","b"])df A Ba NaN NaNb NaN 4.0 ```

### Solution

To get rows with missing values for all columns:

``` df[df.isna().all(axis=1)] A Ba NaN NaN ```

### Explanation

The logic is exactly the same as the case for getting rows with at least one missing value, except that we use `all(~)` instead of `any(~)`. The difference is as follows:

• `all(~)` returns a Series of booleans where `True` indicates a row with all missing column values `(axis=1)`.

• `any(~)` returns a Series of booleans where `True` indicates a row with at least one missing column value.

Edited by 0 others
thumb_up
thumb_down
Comment
Citation
Ask a question or leave a feedback...
thumb_up
0
thumb_down
0
chat_bubble_outline
0
settings
Enjoy our search
Hit / to insta-search docs and recipes!