Difference between a group's count and size in Pandas

Last updated: Jul 1, 2022

The difference between a group's count() and size() is as follows:

  • count() returns the number of non-NaN values in each column of each group. When called on a grouped DataFrame, the result is a DataFrame with one column per non-grouping column.

  • size() returns the number of rows in each group. This method does not distinguish between NaN and non-NaN values.

Example

Consider the following DataFrame about some products:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "price": [500, 300, 700, 200, np.nan],
        "brand": ["apple", "google", "apple", "google", "apple"],
        "device": ["phone", "phone", "computer", "phone", "phone"],
    },
    index=["a", "b", "c", "d", "e"],
)
df
   price   brand    device
a  500.0   apple     phone
b  300.0  google     phone
c  700.0   apple  computer
d  200.0  google     phone
e    NaN   apple     phone

Notice that the price of the last product is missing (NaN).

Here's the count() of each brand group:

df.groupby("brand").count()
        price  device
brand
apple       2       3
google      2       2

Note the following:

  • the return type is DataFrame.

  • the count for apple's price is 2, not 3, since only non-NaN values are counted.
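If only one column is of interest, it can be selected before calling count(). The following is a minimal sketch that rebuilds the same DataFrame so it runs on its own; the variable name price_counts is only illustrative. In this case the result is a Series rather than a DataFrame:

```python
import numpy as np
import pandas as pd

# Same example data as above
df = pd.DataFrame(
    {
        "price": [500, 300, 700, 200, np.nan],
        "brand": ["apple", "google", "apple", "google", "apple"],
        "device": ["phone", "phone", "computer", "phone", "phone"],
    },
    index=["a", "b", "c", "d", "e"],
)

# Select a single column from the grouped object, then count non-NaN values
price_counts = df.groupby("brand")["price"].count()
print(type(price_counts).__name__)  # Series
print(price_counts)
```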

Now, consider the size() of each brand group:

df.groupby("brand").size()
brand
apple     3
google    2
dtype: int64

Note the following:

  • the return type is Series.

  • the size of the apple group is 3, since size() counts every row in the group, including the row with the missing price.
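To see both numbers side by side, one option is to pass both method names to agg() on a single column. This is a sketch rather than the only way to do it; the names summary and missing are made up for illustration. The gap between size and count gives the number of missing prices per brand:

```python
import numpy as np
import pandas as pd

# Same example data as above
df = pd.DataFrame(
    {
        "price": [500, 300, 700, 200, np.nan],
        "brand": ["apple", "google", "apple", "google", "apple"],
        "device": ["phone", "phone", "computer", "phone", "phone"],
    },
    index=["a", "b", "c", "d", "e"],
)

# Compute count (non-NaN values) and size (all rows) for the price column
summary = df.groupby("brand")["price"].agg(["count", "size"])

# Rows in the group minus non-NaN values = number of missing prices
summary["missing"] = summary["size"] - summary["count"]
print(summary)
```

Here apple has a count of 2 and a size of 3, so exactly one of its prices is missing, while google has no missing prices.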

Published by Isshin Inada