Comprehensive Guide on Variance of Random Variables

To understand this guide, you must be familiar with the concept of expected values. Please consult our guide on expected values if you are not.

Definition.

Variance of a random variable

If $X$ is a random variable with mean $\mathbb{E}(X)=\mu$, then the variance of $X$ is defined to be the expected value of $(X-\mu)^2$, that is:

$$\mathbb{V}(X)=\mathbb{E}\Big[(X-\mu)^2\Big]$$

Intuitively, we can think of the variance as the average squared distance between a random variable and its mean, measuring the spread of the random variable's distribution.
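
To make this concrete, here is a minimal Python sketch (not part of the original guide) that estimates the variance by averaging squared distances from the sample mean over many simulated draws. It uses the five-valued distribution discussed below, where each of the values $1$ to $5$ occurs with probability $1/5$:

```python
import numpy as np

rng = np.random.default_rng(0)

# The example distribution used later in this guide: values 1..5, each with probability 1/5.
values = np.array([1, 2, 3, 4, 5])
probs = np.full(5, 1 / 5)

# Draw many samples and average the squared distance from the sample mean.
samples = rng.choice(values, size=100_000, p=probs)
mu = samples.mean()
var_estimate = np.mean((samples - mu) ** 2)

print(f"estimated V(X): {var_estimate:.3f}")  # close to the true value of 2
```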

Intuition behind the variance formula

Recall that the expected value $\mathbb{E}(X)$ can be interpreted as the average value that $X$ takes when we repeatedly sample $X$ from its distribution. Therefore, $\mathbb{E}(X)$ is to be thought of as the mean of its distribution.

For example, suppose a random variable $X$ has the following probability mass function:

| $x$ | $p(x)$ |
| --- | --- |
| $1$ | $1/5$ |
| $2$ | $1/5$ |
| $3$ | $1/5$ |
| $4$ | $1/5$ |
| $5$ | $1/5$ |

Recall that the mean of a symmetric probability distribution lies at its center, which in this case is $\mathbb{E}(X)=\mu=3$.

Distance squared

The quantity $X-\mu$ measures the distance between a possible value of the random variable and the mean $\mu$. This is easy to calculate for each value of $X$:

| $x$ | $x-\mu$ |
| --- | --- |
| $1$ | $-2$ |
| $2$ | $-1$ |
| $3$ | $0$ |
| $4$ | $1$ |
| $5$ | $2$ |

Notice how in the case when $X$ is smaller than the mean, the distance is negative. This is problematic for two reasons:

  • we only care about how far $X$ is from the mean $\mu$ and not the sign of the distance.

  • the positive and negative distances cancel out each other, which makes the average distance become zero, that is, $(1/5)\sum_{i=1}^5(x_i-\mu)=0$.

The second point implies that the expected value of $X-\mu$ is zero, which can also be shown using the linearity of expected values:

$$\begin{align*} \mathbb{E}(X-\mu) &=\mathbb{E}(X)-\mu\\ &=3-3\\ &=0 \end{align*}$$
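
As a quick numerical check (a sketch, assuming the five-valued distribution above), the signed distances weighted by their probabilities do indeed sum to zero:

```python
import numpy as np

# The example distribution: x = 1..5, each with probability 1/5, so mu = 3.
x = np.array([1, 2, 3, 4, 5])
p = np.full(5, 1 / 5)
mu = np.sum(x * p)

# Expected signed distance from the mean: positive and negative terms cancel.
print(np.sum((x - mu) * p))  # 0.0
```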

Clearly, this is not a good measure of spread because the probability distribution tells us that the values of $X$ are spread out, so the spread should not be equal to zero! Therefore, we should adjust the distance metric $X-\mu$ such that we don't encounter negative distances. One way of doing so is to take the square of $X-\mu$ like so:

| $x$ | $(x-\mu)^2$ |
| --- | --- |
| $1$ | $4$ |
| $2$ | $1$ |
| $3$ | $0$ |
| $4$ | $1$ |
| $5$ | $4$ |

The quantity $(X-\mu)^2$ measures the squared distance between $X$ and the mean of $X$. Even though we have taken the square of the distance, $(X-\mu)^2$ is still a measure of how far $X$ is from the mean $\mu$.

Why take the square instead of the absolute value?

For those wondering why we don't take the absolute value of $X-\mu$ instead, it turns out that defining the variance in terms of the squared distance $(X-\mu)^2$ rather than the absolute distance $\vert{X-\mu}\vert$ leads to extremely useful properties such as:

$$\begin{align*} \mathbb{V}(X)&=\mathbb{E}(X^2)-[\mathbb{E}(X)]^2\\ \mathbb{V}(X+Y)&=\mathbb{V}(X)+\mathbb{V}(Y)+2\cdot\mathrm{cov}(X,Y) \end{align*}$$

Neither of these properties holds if we define the variance using the absolute value.
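
Both properties are easy to check numerically. The sketch below verifies the first identity exactly on the example distribution, and the second by simulation using two correlated variables (an illustrative construction, not from the guide):

```python
import numpy as np

rng = np.random.default_rng(0)

# First identity, checked exactly: V(X) = E(X^2) - [E(X)]^2.
x = np.array([1, 2, 3, 4, 5])
p = np.full(5, 1 / 5)
mu = np.sum(x * p)
print(np.sum((x - mu) ** 2 * p), np.sum(x**2 * p) - mu**2)  # 2.0 2.0

# Second identity, checked by simulation: Y depends on X, so cov(X, Y) != 0.
xs = rng.normal(size=200_000)
ys = 0.5 * xs + rng.normal(size=200_000)
lhs = np.var(xs + ys)
rhs = np.var(xs) + np.var(ys) + 2 * np.cov(xs, ys, bias=True)[0, 1]
print(lhs, rhs)  # approximately equal
```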

Taking the expected value of the squared distance

We now know that $(X-\mu)^2$ is to be interpreted as the squared distance of $X$ from the mean of $X$ and is thus a measure of how far off $X$ is from the mean $\mu$. Taking the expected value of $(X-\mu)^2$ will therefore give us the average squared distance from the mean as we repeatedly sample $X$ from its distribution. This expected value is the definition of variance:

$$\mathbb{V}(X)= \mathbb{E}\Big[(X-\mu)^2\Big]$$

Let's now compute the variance of $X$. Recall the following property of expected values:

$$\mathbb{E}[g(X)] =\sum_xg(x)\cdot{p(x)}$$

If we define $g(X)=(X-\mu)^2$, then:

$$\mathbb{E}[(X-\mu)^2]=\sum_x(x-\mu)^2\cdot{p(x)}$$

This allows us to compute the variance of $X$ like so:

$$\begin{align*} \mathbb{V}(X)&= \mathbb{E}\Big[(X-\mu)^2\Big]\\ &=\sum_x(x-\mu)^2\cdot{p(x)}\\ &=(4)(1/5)+(1)(1/5)+(0)(1/5)+(1)(1/5)+(4)(1/5)\\ &=4/5+1/5+1/5+4/5\\ &=2 \end{align*}$$

Therefore, the measure of spread for $X$ is $2$. The higher this number, the more spread out $X$ is.
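
The same computation can be written out in a few lines of Python; the sketch below uses exact rational arithmetic to mirror the fractions in the derivation above:

```python
from fractions import Fraction

# The example distribution: x = 1..5, each with probability 1/5.
xs = [1, 2, 3, 4, 5]
p = Fraction(1, 5)

mu = sum(x * p for x in xs)                # E(X) = 3
var = sum((x - mu) ** 2 * p for x in xs)   # (4 + 1 + 0 + 1 + 4) / 5
print(mu, var)                             # 3 2
```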

Comparing the spread of two random variables

Suppose we have two random variables $X$ and $Y$ and their distributions:

[Plots omitted: $X$ has a narrow distribution (low variance), while $Y$ has a wide distribution (high variance).]

We can see that $Y$ is more spread out compared to $X$. Instead of having to rely on visual plots, we can mathematically show that $Y$ has more spread than $X$ since $\mathbb{V}(Y)\gt\mathbb{V}(X)$.
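
For instance, the following sketch compares two hypothetical distributions on the same support (the guide's actual plots are not reproduced, so these probabilities are illustrative): one concentrates its mass near the mean, the other pushes mass toward the extremes.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
p_X = np.array([0.05, 0.20, 0.50, 0.20, 0.05])  # mass concentrated near the center
p_Y = np.array([0.30, 0.10, 0.20, 0.10, 0.30])  # mass pushed toward the extremes

for name, p in [("X", p_X), ("Y", p_Y)]:
    mu = np.sum(x * p)
    var = np.sum((x - mu) ** 2 * p)
    print(f"V({name}) = {var:.2f}")

# V(X) = 0.80 and V(Y) = 2.60, so Y is more spread out than X.
```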

Example.

Rolling an unfair die

Suppose we roll an unfair three-sided die with faces $1$, $2$ and $3$. Let the random variable $X$ denote the outcome of the roll. We assume that $X$ has the following probability mass function:

| $x$ | $p(x)$ |
| --- | --- |
| $1$ | $1/6$ |
| $2$ | $3/6$ |
| $3$ | $2/6$ |

Compute the variance of $X$.

Solution. To compute the variance $\mathbb{V}(X)$ using its definition, we must first compute the mean of $X$, that is, $\mu$. By the definition of expected values, we have that:

$$\begin{align*} \mu&= \mathbb{E}(X)\\ &=\sum_{x}x\cdot{p(x)}\\ &=(1)\cdot{\frac{1}{6}} +(2)\cdot{\frac{3}{6}} +(3)\cdot{\frac{2}{6}}\\ &=\frac{13}{6} \end{align*}$$

Now, let's use the definition of variance:

$$\begin{align*} \mathbb{V}(X) &=\mathbb{E}\Big[(X-\mu)^2\Big]\\ &=\sum_{x}(x-\mu)^2\cdot{p(x)}\\ &= \Big(1-\frac{13}{6}\Big)^2\Big(\frac{1}{6}\Big)+ \Big(2-\frac{13}{6}\Big)^2\Big(\frac{3}{6}\Big)+ \Big(3-\frac{13}{6}\Big)^2\Big(\frac{2}{6}\Big)\\ &=\frac{17}{36}\\ &\approx0.47 \end{align*}$$

Note that the property of expected values from earlier is again used for the second equality.
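
The worked example can be verified exactly with rational arithmetic; a minimal sketch:

```python
from fractions import Fraction

# PMF of the unfair three-sided die from the example.
pmf = {1: Fraction(1, 6), 2: Fraction(3, 6), 3: Fraction(2, 6)}

mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

print(mu)                # 13/6
print(var, float(var))   # 17/36, approximately 0.47
```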

Another measure of the spread of a random variable is the standard deviation, which is defined as the square root of the variance. Variance is the more popular measure of spread because it has much nicer mathematical properties than the standard deviation.

What's nice about the standard deviation is that it has the same units as the random variable itself. For instance, if a random variable $X$ has units of minutes, then its standard deviation is also in minutes.

Definition.

Standard deviation of a random variable

The standard deviation of a random variable $X$ is defined as the square root of the variance of $X$, that is:

$$\sigma_X=\sqrt{\mathbb{V}(X)}=\sqrt{\mathbb{E}\Big[(X-\mu)^2\Big]}$$

where $\mu$ is the mean, or expected value, of $X$.
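
Continuing the die example, the standard deviation is just the square root of the variance computed earlier; a quick sketch:

```python
import math

# Standard deviation of the unfair die: sigma = sqrt(V(X)) = sqrt(17/36).
var = 17 / 36
sigma = math.sqrt(var)
print(f"{sigma:.3f}")  # 0.687
```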

In the next section, we will explore the mathematical properties of the variance of random variables!

Published by Isshin Inada