Counting the occurrence of values in columns of a Pandas DataFrame
Counting occurrence of a single value in a column
Consider the following DataFrame:
df
   A
0  a
1  b
2  a
Solution
To count the number of times the value "a" occurs in column A:
2
Explanation
To break this down, we are first fetching a Series of booleans where True indicates a match:
(df["A"] == "a")
0     True
1    False
2     True
Name: A, dtype: bool
Because True counts as 1 and False as 0 when summed, taking the sum of this Series gives the total number of occurrences:
2
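The steps above can be sketched as follows. The DataFrame construction and the variable names mask and count are assumptions made for illustration; the comparison and the sum follow the explanation directly.

```python
import pandas as pd

# Assumed construction of the example DataFrame shown above
df = pd.DataFrame({"A": ["a", "b", "a"]})

# Boolean Series: True where column A equals "a"
mask = df["A"] == "a"

# Summing the booleans counts the True values (True is 1, False is 0)
count = mask.sum()
print(count)  # 2
```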
Counting occurrence of a single value in multiple columns
Consider the following DataFrame:
df
   A  B
0  a  a
1  b  a
2  a  a
Solution
To get the number of "a" in each column:
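A    2
B    3
dtype: int64
The idea is exactly the same as in the single-column case above: comparing the whole DataFrame with "a" gives a DataFrame of booleans, and sum() then adds up each column.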
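A minimal sketch of the multi-column case, assuming the DataFrame is built from a dictionary (the construction and the name per_column are illustrative assumptions):

```python
import pandas as pd

# Assumed construction of the two-column example DataFrame
df = pd.DataFrame({"A": ["a", "b", "a"], "B": ["a", "a", "a"]})

# df == "a" is a DataFrame of booleans; sum() adds them up column by column
per_column = (df == "a").sum()
print(per_column)
```

The result is a Series indexed by column label, so per_column["B"] gives the count for column B alone.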
Counting occurrences of multiple values in a column
Consider the following DataFrame:
df
   A
0  a
1  b
2  a
3  c
Solution
To count the occurrences of multiple values in column A:
3
Explanation
We first obtain a frequency count of the values in column A using Series' value_counts():
counts
a    2
c    1
b    1
Name: A, dtype: int64
We then extract the values we are interested in using [] syntax:
counts[values] # returns a Series
a    2
b    1
Name: A, dtype: int64
We then use the Series' sum() method:
3
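Put together, the three steps might look like the sketch below. The list values = ["a", "b"] is an assumption inferred from the output shown above, and the DataFrame construction is likewise illustrative. The isin() variant at the end is an alternative, not part of the original walkthrough.

```python
import pandas as pd

# Assumed construction of the example DataFrame
df = pd.DataFrame({"A": ["a", "b", "a", "c"]})

# Assumed list of values to count, inferred from the output above
values = ["a", "b"]

# Step 1: frequency of every distinct value in column A
counts = df["A"].value_counts()

# Step 2: keep only the values of interest (label-based selection)
selected = counts[values]

# Step 3: add up their frequencies
total = selected.sum()
print(total)  # 3

# Alternative: build a boolean mask with isin() and sum it
total_isin = df["A"].isin(values).sum()
print(total_isin)  # 3
```

One design consideration: in recent pandas versions, counts[values] raises a KeyError if any entry of values never appears in the column, whereas the isin() form simply counts zero matches for it.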
Counting the total number of occurrences
Consider the same two-column df as in the multiple-columns example above:
df
   A  B
0  a  a
1  b  a
2  a  a
Solution
To count the total number of "a" in df:
Explanation
Once again, we first check for the presence of "a" like so:
df == "a" # returns a DataFrame of booleans
       A     B
0   True  True
1  False  True
2   True  True
True counts as 1 and False as 0, so taking the sum of each column yields:
A    2
B    3
dtype: int64
This tells us that we have 2 occurrences of "a" in column A, and 3 in B. What we want is the total number so we must take a second sum:
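A minimal sketch of the double sum, assuming the DataFrame is built from a dictionary (the construction and the names per_column and total are illustrative assumptions):

```python
import pandas as pd

# Assumed construction of the two-column example DataFrame
df = pd.DataFrame({"A": ["a", "b", "a"], "B": ["a", "a", "a"]})

# First sum: number of "a" per column (A is 2, B is 3)
per_column = (df == "a").sum()

# Second sum: add the per-column counts to get the overall total
total = per_column.sum()
print(total)  # 5
```

The two steps can also be chained as (df == "a").sum().sum().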