
Getting index of rows with missing values (NaNs) in Pandas DataFrame

Last updated: Jul 1, 2022
Tags: Python, Pandas

Getting index (row label)

Consider the following DataFrame with some missing values:

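import numpy as np
import pandas as pd

# pd.np was removed in pandas 2.0, so use NumPy's np.nan directly
df = pd.DataFrame({"A": [3, np.nan, 5], "B": [6, 7, np.nan]}, index=["a", "b", "c"])
df
     A    B
a  3.0  6.0
b  NaN  7.0
c  5.0  NaN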

Solution

To get the index labels of rows that contain at least one missing value:

temp = df.isna().any(axis=1)
temp[temp].index
Index(['b', 'c'], dtype='object')

Explanation

We first check for the presence of NaNs using isna(), which returns a DataFrame of booleans where True indicates the presence of a NaN:

df.isna()
       A      B
a  False  False
b   True  False
c  False   True

Next, we use any(axis=1), which scans each row and returns a Series of booleans where True indicates a row with at least one True:

df.isna().any(axis=1)
a False
b True
c True
dtype: bool
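
To make the role of the axis argument concrete, here is a minimal sketch that rebuilds the same df so it runs on its own and contrasts axis=1 with axis=0. The variable names row_has_nan and col_has_nan are illustrative only and do not appear in the original recipe.

```python
import numpy as np
import pandas as pd

# Same example DataFrame as above
df = pd.DataFrame({"A": [3, np.nan, 5], "B": [6, 7, np.nan]}, index=["a", "b", "c"])

# axis=1 collapses the columns: one boolean per row
row_has_nan = df.isna().any(axis=1)

# axis=0 collapses the rows: one boolean per column
col_has_nan = df.isna().any(axis=0)

print(row_has_nan.tolist())  # [False, True, True]
print(col_has_nan.tolist())  # [True, True]
```

Since the goal here is to find rows, axis=1 is the one we want.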

We then temporarily store this intermediate result in a variable called temp. Our goal now is to extract the Index where the corresponding value is True (b and c in this case).

We first exclude the entries whose value is False by using temp as a boolean mask on itself:

temp[temp]
b True
c True
dtype: bool
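
The same boolean mask is not limited to filtering itself. As a small sketch (the names rows_with_nan and complete_rows are assumptions made for illustration), it can also be applied to df to pull out the affected rows, or inverted with ~ to keep only the complete rows:

```python
import numpy as np
import pandas as pd

# Same example DataFrame as above
df = pd.DataFrame({"A": [3, np.nan, 5], "B": [6, 7, np.nan]}, index=["a", "b", "c"])
temp = df.isna().any(axis=1)

# Rows with at least one NaN
rows_with_nan = df[temp]

# Rows with no NaN at all (~ inverts the boolean mask)
complete_rows = df[~temp]

print(rows_with_nan.shape)  # (2, 2)
print(complete_rows.shape)  # (1, 2)
```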

Finally, all we need to do is to access the index property of this Series:

temp[temp].index
Index(['b', 'c'], dtype='object')
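
If the intermediate variable feels awkward, the steps above can be wrapped in a small function. The sketch below is one possible way to do it, not the only one: labels_of_rows_with_nan is a hypothetical helper name (it is not part of Pandas), and it indexes df.index with the mask directly instead of going through temp[temp].

```python
import numpy as np
import pandas as pd


def labels_of_rows_with_nan(frame: pd.DataFrame) -> pd.Index:
    """Return the index labels of rows containing at least one NaN.

    Hypothetical helper for illustration; not part of pandas.
    """
    mask = frame.isna().any(axis=1)
    # Selecting from the Index with a boolean array keeps the labels where mask is True
    return frame.index[mask.to_numpy()]


df = pd.DataFrame({"A": [3, np.nan, 5], "B": [6, 7, np.nan]}, index=["a", "b", "c"])
print(labels_of_rows_with_nan(df).tolist())  # ['b', 'c']
```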

Getting integer index

Again, consider the same df as above:

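import numpy as np
import pandas as pd

# pd.np was removed in pandas 2.0, so use NumPy's np.nan directly
df = pd.DataFrame({"A": [3, np.nan, 5], "B": [6, 7, np.nan]}, index=["a", "b", "c"])
df
     A    B
a  3.0  6.0
b  NaN  7.0
c  5.0  NaN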

Solution

To get the integer indexes of rows with missing values:

np.where(df.isna().any(axis=1))[0] # returns a NumPy array
array([1, 2])

Explanation

Similar to the case above, we start by checking for the presence of NaN values using isna():

df.isna() # returns a DataFrame
       A      B
a  False  False
b   True  False
c  False   True

We then check for rows where there is at least one True:

df.isna().any(axis=1) # returns a Series
a False
b True
c True
dtype: bool

To get the integer positions of the True values, use np.where(), which returns the positions of the truthy elements when it is given only a condition:

np.where(df.isna().any(axis=1)) # returns a tuple of size one
(array([1, 2]),)

Here, np.where() returns a tuple with one array per dimension of its input. Our input is one-dimensional, so the tuple has size one, and we use [0] to extract the NumPy array of integer positions:

np.where(df.isna().any(axis=1))[0] # returns a NumPy array
array([1, 2])
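
As a hedged sketch of why the tuple appears and how the result can be used, the snippet below compares the one-dimensional and two-dimensional cases, shows np.flatnonzero() as an alternative that skips the tuple, and feeds the positions to iloc. The variable names (mask, positions, row_pos, col_pos) are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Same example DataFrame as above
df = pd.DataFrame({"A": [3, np.nan, 5], "B": [6, 7, np.nan]}, index=["a", "b", "c"])
mask = df.isna().any(axis=1).to_numpy()  # 1-D boolean NumPy array

# 1-D condition: the tuple holds a single array of positions
positions = np.where(mask)[0]
print(positions.tolist())  # [1, 2]

# np.flatnonzero returns the same positions directly, without the tuple
print(np.flatnonzero(mask).tolist())  # [1, 2]

# 2-D condition: the tuple holds one array per dimension (rows, columns)
row_pos, col_pos = np.where(df.isna().to_numpy())
print(row_pos.tolist(), col_pos.tolist())  # [1, 2] [0, 1]

# Integer positions pair with iloc, whereas index labels pair with loc
print(df.iloc[positions].index.tolist())  # ['b', 'c']
```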
Published by Isshin Inada