
Comprehensive Guide on Bayes' Theorem

Last updated: Nov 1, 2022
Tags: Probability and Statistics
Theorem.

Bayes' Theorem

If $A$ and $B$ are events with $\mathbb{P}(A)\ne0$ and $\mathbb{P}(B)\ne0$, then:

$$\mathbb{P}(B|A)= \frac{\mathbb{P}(A|B)\cdot\mathbb{P}(B)}{\mathbb{P}(A)}$$

Where:

  • $\mathbb{P}(B|A)$, which is the probability of $B$ given $A$, is called the posterior probability.

  • $\mathbb{P}(A|B)$ is called the likelihood.

  • $\mathbb{P}(B)$ is called the prior probability.

  • $\mathbb{P}(A)$ is called the marginal probability.

Proof. Recall the definition of conditional probability:

$$\begin{equation}\label{eq:G6sUfcheCFjfdWZZddk} \mathbb{P}(B|A)=\frac{\mathbb{P}(A,B)}{\mathbb{P}(A)}, \;\;\;\;\;\;\; \mathbb{P}(A|B)=\frac{\mathbb{P}(A,B)}{\mathbb{P}(B)} \end{equation}$$

Therefore, $\mathbb{P}(A,B)$ can be written as:

$$\begin{align*} \mathbb{P}(A,B)&=\mathbb{P}(B|A)\cdot\mathbb{P}(A)\\ \mathbb{P}(A,B)&=\mathbb{P}(A|B)\cdot\mathbb{P}(B)\\ \end{align*}$$

Equating the two right-hand sides and solving for $\mathbb{P}(B|A)$ gives us Bayes' theorem:

$$\mathbb{P}(B|A)= \frac{\mathbb{P}(A|B)\cdot\mathbb{P}(B)}{\mathbb{P}(A)}$$

This completes the proof.
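As a sanity check on the identity just proved, the following minimal sketch computes both sides from a small joint distribution. The four joint probabilities are made-up values chosen only for illustration (they are not from the guide), and exact fractions are used so the two sides can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution over two events A and B.
# These four numbers are invented for illustration and sum to 1.
joint = {
    (True, True): Fraction(3, 20),    # P(A, B)
    (True, False): Fraction(5, 20),   # P(A, not B)
    (False, True): Fraction(4, 20),   # P(not A, B)
    (False, False): Fraction(8, 20),  # P(not A, not B)
}

# Marginal probabilities, obtained by summing the joint table.
p_a = sum(p for (a, _), p in joint.items() if a)  # P(A)
p_b = sum(p for (_, b), p in joint.items() if b)  # P(B)
p_ab = joint[(True, True)]                        # P(A, B)

# Conditional probabilities straight from the definition.
p_b_given_a = p_ab / p_a  # P(B|A)
p_a_given_b = p_ab / p_b  # P(A|B)

# Right-hand side of Bayes' theorem: P(A|B) * P(B) / P(A).
bayes_rhs = p_a_given_b * p_b / p_a

print(p_b_given_a, bayes_rhs)
```

Both printed values agree, as the proof says they must for any joint distribution with $\mathbb{P}(A)\ne0$ and $\mathbb{P}(B)\ne0$.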

Theorem.

Bayes' Theorem using law of total probability

If we have events $A$, $B_1$ and $B_2$ where $B_1$ and $B_2$ partition the sample space, then:

$$\mathbb{P}(B_j|A)= \frac{\mathbb{P}(A|B_j)\cdot\mathbb{P}(B_j)} {\mathbb{P}(A|B_1)\cdot\mathbb{P}(B_1) + \mathbb{P}(A|B_2)\cdot\mathbb{P}(B_2)}$$

Where $j$ can be either $1$ or $2$.

Proof. Suppose we have events $A$, $B_1$ and $B_2$. Bayes' theorem tells us that:

$$\begin{equation}\label{eq:jtfupsOLvfmYwQhazH1} \mathbb{P}(B_j|A)= \frac{\mathbb{P}(A|B_j)\cdot\mathbb{P}(B_j)}{\mathbb{P}(A)} \;,\;\;\;\;\;\;\; \text{for }\;j=1,2 \end{equation}$$

Now, suppose $B_1$ and $B_2$ form a partition of the sample space. From the law of total probability, we know that:

$$\begin{equation}\label{eq:vWmostTQX3GHfm5mKVx} \mathbb{P}(A) = \mathbb{P}(A|B_1)\cdot\mathbb{P}(B_1) + \mathbb{P}(A|B_2)\cdot\mathbb{P}(B_2) \end{equation}$$

Substituting this into \eqref{eq:jtfupsOLvfmYwQhazH1} gives:

$$\begin{align*} \mathbb{P}(B_j|A)&= \frac{\mathbb{P}(A|B_j)\cdot\mathbb{P}(B_j)}{\mathbb{P}(A)}\\ &= \frac{\mathbb{P}(A|B_j)\cdot\mathbb{P}(B_j)} {\mathbb{P}(A|B_1)\cdot\mathbb{P}(B_1) + \mathbb{P}(A|B_2)\cdot\mathbb{P}(B_2)} \end{align*}$$

This completes the proof.

Example.

Drawing a ball from two bags

Suppose we have the following two bags:

  • bag $\text{I}$, which contains $1$ red and $3$ green balls.

  • bag $\text{II}$, which contains $2$ red and $1$ green balls.

We pick one bag at random and then draw a single ball. Given that the drawn ball is red, compute the probability that it came from bag $\mathrm{II}$.

Solution. We define the following events:

  • event $\mathrm{I}$ - bag $\mathrm{I}$ is chosen.

  • event $\mathrm{II}$ - bag $\mathrm{II}$ is chosen.

  • event $\color{red}\text{red}$ - a red ball is drawn.

The probability we are after is therefore $\mathbb{P}(\mathrm{II}|{\color{red}\mathrm{red}})$. Using Bayes' theorem, we have that:

$$\mathbb{P}(\mathrm{II}|{\color{red}\mathrm{red}})= \frac{\mathbb{P}({\color{red}\mathrm{red}}|\mathrm{II})\cdot\mathbb{P}(\mathrm{II})} {\mathbb{P}({\color{red}\mathrm{red}})}$$

We are not given $\mathbb{P}({\color{red}\mathrm{red}})$ directly, so we use the second form of Bayes' theorem, in which the denominator $\mathbb{P}({\color{red}\mathrm{red}})$ is expanded using the law of total probability:

$$\mathbb{P}(\mathrm{II}|{\color{red}\mathrm{red}})= \frac{\mathbb{P}({\color{red}\mathrm{red}}|\mathrm{II})\cdot\mathbb{P}(\mathrm{II})} { \mathbb{P}({\color{red}\mathrm{red}}|\mathrm{I})\cdot \mathbb{P}(\mathrm{I})+ \mathbb{P}({\color{red}\mathrm{red}}|\mathrm{II})\cdot \mathbb{P}(\mathrm{II}) } $$

Let's start with the denominator:

$$\begin{align*} \mathbb{P}({\color{red}\mathrm{red}}) &=\mathbb{P}({\color{red}\mathrm{red}}|\mathrm{I})\cdot \mathbb{P}(\mathrm{I})+ \mathbb{P}({\color{red}\mathrm{red}}|\mathrm{II})\cdot \mathbb{P}(\mathrm{II})\\ &= \frac{1}{4}\cdot{\frac{1}{2}}+ \frac{2}{3}\cdot{\frac{1}{2}}\\ &=\frac{11}{24} \end{align*}$$

Next, the numerator is:

$$\begin{align*} \mathbb{P}({\color{red}\mathrm{red}}|\mathrm{II})\cdot\mathbb{P}(\mathrm{II}) &=\frac{2}{3}\cdot\frac{1}{2}\\ &=\frac{1}{3} \end{align*}$$

Substituting the denominator and numerator into Bayes' theorem gives:

$$\begin{align*} \mathbb{P}(\mathrm{II}|{\color{red}\mathrm{red}}) &=\frac{1}{3}\cdot\frac{24}{11}\\ &=\frac{8}{11}\\ &\approx0.73 \end{align*}$$

Therefore, the probability that the red ball comes from bag $\mathrm{II}$ is around $0.73$.
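The arithmetic above can be reproduced with exact fractions. This is only a sketch of the calculation in the example; the dictionary names `prior` and `p_red_given` are illustrative choices, not notation from the guide.

```python
from fractions import Fraction

# Each bag is equally likely to be picked.
prior = {"I": Fraction(1, 2), "II": Fraction(1, 2)}

# Probability of drawing a red ball from each bag:
# bag I has 1 red out of 4 balls, bag II has 2 red out of 3 balls.
p_red_given = {"I": Fraction(1, 4), "II": Fraction(2, 3)}

# Denominator: law of total probability.
p_red = sum(p_red_given[bag] * prior[bag] for bag in prior)

# Numerator: likelihood times prior for bag II.
numerator = p_red_given["II"] * prior["II"]

posterior_II = numerator / p_red

print(p_red)                # 11/24
print(posterior_II)         # 8/11
print(float(posterior_II))  # approximately 0.73
```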

Bayes' rule as an update rule based on evidence

Let's now revisit Bayes' theorem from the above example:

$$\mathbb{P}(\mathrm{II}|{\color{red}\mathrm{red}})= \frac{\mathbb{P}({\color{red}\mathrm{red}}|\mathrm{II})\cdot\mathbb{P}(\mathrm{II})} {\mathbb{P}({\color{red}\mathrm{red}})}$$

Note the following terminology:

  • $\mathbb{P}(\mathrm{II}|{\color{red}\text{red}})$ is the posterior probability.

  • $\mathbb{P}({\color{red}\text{red}}|\mathrm{II})$ is the likelihood.

  • $\mathbb{P}(\mathrm{II})$ is the prior probability.

  • $\mathbb{P}({\color{red}\text{red}})$ is the marginal probability.

Suppose we are not given the information (or evidence) that the ball is red. In this case, the probability that the ball is from the second bag is $\mathbb{P}(\mathrm{II})=1/2$, since each bag is equally likely to be picked. This prior probability can be thought of as the probability before the evidence is considered.

Suppose we are now given evidence that the ball is red. The posterior probability $\mathbb{P}(\mathrm{II}|{\color{red}\text{red}})=8/11\approx0.73$ represents the updated probability based on this new evidence. In this way, Bayes' theorem can be thought of as a rule for updating probabilities whenever new evidence becomes available.
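One way to see the update at work is to simulate the two-bag experiment many times and keep only the trials in which the drawn ball is red. The sketch below is a Monte Carlo illustration rather than part of the original derivation; the function name, trial count and seed are arbitrary choices.

```python
import random


def simulate_bag_posterior(n_trials: int = 100_000, seed: int = 0) -> float:
    """Estimate P(II | red) by simulating the two-bag experiment."""
    rng = random.Random(seed)
    bags = {
        "I": ["red"] + ["green"] * 3,   # 1 red, 3 green
        "II": ["red"] * 2 + ["green"],  # 2 red, 1 green
    }
    red_draws = 0    # trials where the evidence (a red ball) was observed
    red_from_ii = 0  # of those, trials where bag II had been chosen
    for _ in range(n_trials):
        bag = rng.choice(["I", "II"])  # prior: each bag with probability 1/2
        ball = rng.choice(bags[bag])
        if ball == "red":
            red_draws += 1
            if bag == "II":
                red_from_ii += 1
    # Conditioning on the evidence means looking only at the red trials.
    return red_from_ii / red_draws


estimate = simulate_bag_posterior()
print(f"estimated P(II | red) = {estimate:.3f}")
```

Before filtering, about half of all trials use bag $\mathrm{II}$ (the prior). Among the red trials only, the share from bag $\mathrm{II}$ should settle near $8/11$ (the posterior), up to sampling noise.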

Example.

Probability of carrying a disease after testing

Suppose we have a testing kit for a disease that affects $1$ in $1000$ people. The accuracy of the testing kit is as follows:

  • if the person actually has the disease, then the test will return positive $99\%$ of the time.

  • if the person does not have the disease, then the test will return positive $2\%$ of the time.

If a random person is chosen and the test returns positive, then what is the probability that this person actually has the disease?

Solution. We define the following events:

  • event $A$ - test returns a positive.

  • event $B_1$ - the person has the disease.

  • event $B_2$ - the person does not have the disease.

Events $B_1$ and $B_2$ partition the sample space: they are mutually exclusive and together cover every outcome, so $\mathbb{P}(B_1)+\mathbb{P}(B_2)=1$. Our goal is to compute $\mathbb{P}(B_1|A)$, which is the probability that the person actually has the disease given that the test returned positive.

Bayes' theorem tells us that:

$$\begin{align*} \mathbb{P}(B_1|A)&= \frac{\mathbb{P}(A|B_1)\cdot\mathbb{P}(B_1)} {\mathbb{P}(A|B_1)\cdot\mathbb{P}(B_1) + \mathbb{P}(A|B_2)\cdot\mathbb{P}(B_2)}\\ &=\frac{\dfrac{99}{100}\cdot\dfrac{1}{1000}} {\dfrac{99}{100}\dfrac{1}{1000}+ \dfrac{2}{100}\dfrac{999}{1000}}\\ &\approx0.047 \end{align*}$$

Therefore, even if the test returns positive, the probability that the person actually has the disease is only around $4.7\%$. This is surprising because the testing kit seems to be very accurate given the information provided in the question.

This counterintuitive result can be explained by the fact that the disease is rare, affecting only $1$ in $1000$ people, so even with an accurate test there is still a high chance that a person who tests positive does not carry the disease. To show this mathematically, let's write out Bayes' theorem more explicitly:

$$\begin{align*} \mathbb{P}(\text{infected}|\text{test pos})&= \frac{\color{green}\mathbb{P}(\text{test pos}|\mathrm{infected})\cdot\mathbb{P}(\mathrm{infected})} {{\color{green}\mathbb{P}(\text{test pos}|\text{infected})\cdot\mathbb{P}(\text{infected})} + \color{red}\mathbb{P}(\text{test pos}|\neg\;\text{infected})\cdot\mathbb{P}(\neg\;\text{infected})}\\ &=\frac{\color{green}\dfrac{99}{100}\cdot\dfrac{1}{1000}} {{\color{green}\dfrac{99}{100}\dfrac{1}{1000}}+ \color{red}\dfrac{2}{100}\dfrac{999}{1000}}\\ \end{align*}$$

Here, $\neg$ is to be read as "not". Because the disease is rare, $\mathbb{P}(\text{infected})$ is very small, which makes the green terms small (about $0.001$). In contrast, the red term is much larger (about $0.020$) because $\mathbb{P}(\neg\;\text{infected})$ is so high - close to $1$. Therefore, the probability $\mathbb{P}(\text{infected}|\text{test pos})$ ends up being small.
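To make the effect of rarity concrete, the sketch below recomputes the posterior for several prevalence values while holding the test's accuracy fixed at the figures from the example. The parameter names `sensitivity` and `false_positive_rate` are labels introduced here for the $99\%$ and $2\%$ figures, and every prevalence other than $1$ in $1000$ is a hypothetical value used for comparison only.

```python
def posterior_infected(
    prevalence: float,
    sensitivity: float = 0.99,          # P(test pos | infected), from the example
    false_positive_rate: float = 0.02,  # P(test pos | not infected), from the example
) -> float:
    """Return P(infected | test pos) using Bayes' theorem."""
    true_positive = sensitivity * prevalence                  # green term
    false_positive = false_positive_rate * (1 - prevalence)   # red term
    return true_positive / (true_positive + false_positive)


# 0.001 is the prevalence from the example; the others are hypothetical.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    print(f"prevalence={prevalence:<6} posterior={posterior_infected(prevalence):.3f}")
```

With a prevalence of $1$ in $1000$ the posterior is roughly $0.047$, matching the result above. Raising the prevalence to $1$ in $100$ already lifts it to about one third, which shows how strongly the prior drives the answer.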

Theorem.

General formula for Bayes' theorem

If the events $B_1$, $B_2$, $\cdots$, $B_k$ form a partition for the sample space $S$ such that $\mathbb{P}(B_i)\ne0$ for $i=1$, $2$, $\cdots$, $k$, then for any event $A$ in $S$, we have:

$$\mathbb{P}(B_j|A)= \frac{\mathbb{P}(B_j)\cdot\mathbb{P}(A|B_j)} {\sum^k_{i=1}\mathbb{P}(B_i)\cdot\mathbb{P}(A|B_i)}$$

For $j=1,2,\cdots,k$.

Proof. Recall the general law of total probability, which states that if events $B_1$, $B_2$, $\cdots$, $B_k$ form a partition for the sample space $S$ such that $\mathbb{P}(B_i)\ne0$ for $i=1$, $2$, $\cdots$, $k$, then for any event $A$ in $S$, we have:

$$\begin{equation}\label{eq:iz4HXwsDfKxxk2xH9zI} \mathbb{P}(A) =\sum^k_{i=1}\mathbb{P}(A\cap{B_i}) =\sum^k_{i=1}\mathbb{P}(B_i)\cdot\mathbb{P}(A|B_i) \end{equation}$$

Now, from Bayes' theorem, we know that:

$$\begin{equation}\label{eq:VtpXP99Fib9DpX86Nws} \mathbb{P}(B_j|A)= \frac{\mathbb{P}(A|B_j)\cdot\mathbb{P}(B_j)}{\mathbb{P}(A)} \;,\;\;\;\;\;\;\; \text{for }\;j=1,2,\cdots,k \end{equation}$$

Substituting \eqref{eq:iz4HXwsDfKxxk2xH9zI} into the denominator of \eqref{eq:VtpXP99Fib9DpX86Nws} gives us the general Bayes' theorem:

$$\mathbb{P}(B_j|A)= \frac{\mathbb{P}(B_j)\cdot\mathbb{P}(A|B_j)} {\sum^k_{i=1}\mathbb{P}(B_i)\cdot\mathbb{P}(A|B_i)}$$

This completes the proof.
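The general formula translates almost directly into code. The following is a minimal sketch, not a reference implementation: the function name `bayes_posterior` and its argument names are illustrative, and it assumes the caller passes priors $\mathbb{P}(B_i)$ that sum to $1$ together with the matching likelihoods $\mathbb{P}(A|B_i)$.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return [P(B_1|A), ..., P(B_k|A)] for a partition B_1, ..., B_k.

    priors[i] is P(B_i) and likelihoods[i] is P(A|B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Numerators P(B_i) * P(A|B_i), one per event in the partition.
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: P(A) from the law of total probability.
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("P(A) is zero, so the posterior is undefined")
    return [j / evidence for j in joint]


# Usage: the two-bag example (k = 2), conditioning on a red ball.
two_bag_posterior = bayes_posterior(
    priors=[Fraction(1, 2), Fraction(1, 2)],
    likelihoods=[Fraction(1, 4), Fraction(2, 3)],
)
print(two_bag_posterior)  # posteriors for bag I and bag II: 3/11 and 8/11
```

Returning the posterior for every $B_j$ at once is a design choice made here for convenience: the denominator is shared, so it is computed only once, and the returned values always sum to $1$.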

Example.

Drawing a ball from three bags

Suppose we have the following three bags:

  • bag $\text{I}$, which contains $1$ red and $3$ green balls.

  • bag $\text{II}$, which contains $2$ red and $1$ green balls.

  • bag $\mathrm{III}$, which contains $3$ red and $4$ green balls.

We pick one bag at random, and then draw a single ball. Given that the drawn ball is green, compute the probability that it came from bag $\mathrm{II}$.

Solution. Let's define the following events:

  • event $\mathrm{I}$ - choosing bag $\mathrm{I}$.

  • event $\mathrm{II}$ - choosing bag $\mathrm{II}$.

  • event $\mathrm{III}$ - choosing bag $\mathrm{III}$.

  • event $\color{green}\text{green}$ - drawing a green ball.

The general Bayes' theorem tells us that:

$$\begin{align*} \mathbb{P}(\mathrm{II}|{\color{green}\text{green}}) &=\frac{\mathbb{P}(\mathrm{II})\cdot\mathbb{P}({\color{green}\text{green}}|\mathrm{II})} {\mathbb{P}(\mathrm{I})\cdot\mathbb{P}({\color{green}\text{green}}|\mathrm{I})+ \mathbb{P}(\mathrm{II})\cdot\mathbb{P}({\color{green}\text{green}}|\mathrm{II})+ \mathbb{P}(\mathrm{III})\cdot\mathbb{P}({\color{green}\text{green}}|\mathrm{III})}\\ &=\frac{\dfrac{1}{3}\cdot\dfrac{1}{3}}{\dfrac{1}{3}\cdot\dfrac{3}{4}+\dfrac{1}{3}\cdot\dfrac{1}{3}+\dfrac{1}{3}\cdot\dfrac{4}{7}}\\ &\approx0.20 \end{align*}$$

Therefore, the probability that the drawn green ball is from bag $\mathrm{II}$ is around $0.20$.
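The same result can be checked with exact fractions, starting from the ball counts in each bag. This is a sketch of the calculation above; the variable names are illustrative, and it assumes, as the example does, that each bag is picked with probability $1/3$.

```python
from fractions import Fraction

# (red, green) ball counts for each bag, as given in the example.
bags = {"I": (1, 3), "II": (2, 1), "III": (3, 4)}

# Each of the three bags is equally likely to be picked.
prior = {name: Fraction(1, len(bags)) for name in bags}

# P(green | bag) = green balls / total balls in that bag.
p_green_given = {
    name: Fraction(green, red + green) for name, (red, green) in bags.items()
}

# Denominator: law of total probability over the three bags.
p_green = sum(prior[name] * p_green_given[name] for name in bags)

# Posterior probability of each bag given that the ball is green.
posterior = {
    name: prior[name] * p_green_given[name] / p_green for name in bags
}

print(posterior["II"])         # 28/139
print(float(posterior["II"]))  # approximately 0.20
```

The three posteriors come out as $63/139$, $28/139$ and $48/139$ for bags $\mathrm{I}$, $\mathrm{II}$ and $\mathrm{III}$, which sum to $1$ as expected.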

Published by Isshin Inada