
# Comprehensive Guide on Expected Values of Random Variables

Mar 5, 2023


Instead of stating the mathematical definition of expected values upfront, let's go through a motivating example first to develop our intuition.

# Motivating example

Suppose we have an **unfair** coin with the following probabilities of heads and tails:

$$\begin{align*}
\mathbb{P}(\text{heads})&=0.8\\
\mathbb{P}(\text{tails})&=0.2
\end{align*}$$

Now, suppose we play a game where we get $3$ dollars per heads and $4$ dollars per tails. Whether we get heads or tails is random, so we can use a random variable $X$ to denote the profit of a single toss. The probability mass function of $X$, denoted as $p(x)$, is as follows:

| Outcome | $x$ | $p(x)$ |
|---|---|---|
| Heads | $3$ | $0.8$ |
| Tails | $4$ | $0.2$ |

This table states that for a single toss:

- the probability of getting $3$ dollars is $0.8$.
- the probability of getting $4$ dollars is $0.2$.

Note that $x$ represents a specific value that a random variable $X$ could take on.

Now, the key question we wish to answer is:

If we toss the coin $n=10$ times, how much do we expect to profit?

To answer this question, we should first find out how many heads and tails we expect from $10$ tosses. Intuitively, since the probability of heads is $0.8$, we should expect the outcome to be $8$ heads and $2$ tails. Mathematically, we are performing the following calculations:

$$\begin{align*}
\text{expected number of heads}&=10\times0.8=8\\
\text{expected number of tails}&=10\times0.2=2
\end{align*}$$

Keep in mind that these are only the **expected** outcomes since the actual outcome is random. Because we profit $3$ dollars for each heads and $4$ dollars for each tails, we can easily calculate the expected profit from heads and the expected profit from tails:

$$\begin{align*}
\text{expected profit from heads}&=8\times3=24\\
\text{expected profit from tails}&=2\times4=8
\end{align*}$$

The sum of the expected profits from heads and tails gives us the total expected profit:

$$\text{total expected profit}=24+8=32$$

Therefore, if we were to toss the coin $10$ times, then our expected profit is $32$ dollars! We can generalize the above calculations to $n$ tosses as follows:

$$\text{expected profit from }n\text{ tosses}=
(3)\cdot{n}\cdot(0.8)+(4)\cdot{n}\cdot(0.2)$$

Now, instead of tossing the coin $n=10$ times, let's calculate the expected profit from a single coin toss, that is, $n=1$. Plugging $n=1$ into the above formula gives:

$$\text{expected profit from a single toss}=
(3)\cdot(0.8)+(4)\cdot(0.2)\tag{1}$$

Recall that the random variable $X$ represents the profit of a single toss. We can mathematically denote the expected value of $X$ as $\mathbb{E}(X)$, which in this case means:

$$\mathbb{E}(X)=\text{expected profit from a single toss}\tag{2}$$

Moreover, notice how the right-hand side of $(1)$ can be written as:

$$(3)\cdot{p(3)}+(4)\cdot{p(4)}=\sum_x{x}\cdot{p(x)}\tag{3}$$

Here, the summation symbol $\sum_x$ means that we are summing over all possible values of $x$.

Therefore, using $(2)$ and $(3)$, we can express $(1)$ generally as:

$$\mathbb{E}(X)=\sum_x{x}\cdot{p(x)}$$

For our example, we know that $p(3)=0.8$ and $p(4)=0.2$, so the expected profit for a single toss is:

$$\mathbb{E}(X)=(3)(0.8)+(4)(0.2)=3.2$$

Therefore, we should expect to profit $3.2$ dollars per toss on average! Note that this does not mean that we get $3.2$ dollars per toss - in fact, that's impossible because we either get $3$ or $4$ dollars per toss. An expected profit of $3.2$ dollars per toss means that if we were to toss the coin a large number of times, the average profit we make per toss is $3.2$ dollars.
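The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the dictionary `pmf` and the helper `expected_total_profit` are names made up for this illustration, and the helper simply multiplies the per-toss expected value by the number of tosses $n$.

```python
# PMF of the profit X from a single toss, copied from the table above.
pmf = {3: 0.8, 4: 0.2}

def expected_total_profit(n, pmf):
    """Expected profit from n tosses: n times the sum of x * p(x)."""
    return n * sum(x * p for x, p in pmf.items())

print(round(expected_total_profit(10, pmf), 2))  # expected profit from 10 tosses
print(round(expected_total_profit(1, pmf), 2))   # expected profit from a single toss
```

The two printed numbers should agree with the $32$ dollars and $3.2$ dollars derived above, up to floating-point rounding.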

## Simulation to demonstrate expected values

Let's run a quick simulation to demonstrate this. Suppose we toss the coin $1000$ times - the simulated outcome is as follows:

As expected, we get around $800$ heads ($X=3$) and $200$ tails ($X=4$). Below is a graph showing the running average of the profit per toss:

We can see that the average profit per toss fluctuates a lot in the beginning. However, as we keep tossing the coin, the average profit per toss stabilizes to the theoretical expected value of $3.2$ that we calculated earlier! The more tosses we make, the closer the average profit per toss will be to $3.2$.
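A simulation of this kind could be sketched as follows. The function name `simulate_running_average`, the seed and the use of Python's standard `random` module are all choices made for this illustration; a different seed will produce a different path, but the running average should still settle near $3.2$.

```python
import random

def simulate_running_average(num_tosses, p_heads=0.8, seed=0):
    """Toss the unfair coin num_tosses times; return the running average profit."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    total = 0
    running_average = []
    for toss in range(1, num_tosses + 1):
        profit = 3 if rng.random() < p_heads else 4  # heads pays 3, tails pays 4
        total += profit
        running_average.append(total / toss)
    return running_average

averages = simulate_running_average(1000)
# Running average after 10, 100 and 1000 tosses
print(averages[9], averages[99], averages[-1])
```

Plotting `averages` against the toss number would give a running-average graph like the one described above.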

In this example, the random variable $X$ represents the profit from a single toss and so $\mathbb{E}(X)$ represents the expected profit from a toss. Of course, $X$ does not have to represent profit - it could for instance represent waiting time, in which case $\mathbb{E}(X)$ would represent the expected waiting time.

We now state the formal definition of the expected value of random variables.

# Expected value of a discrete random variable

Let $X$ be a discrete random variable with probability mass function $p(x)$. The expected value of $X$ is defined as follows:

$$\mathbb{E}(X)=\sum_x{x}\cdot{p(x)}$$

In words, the expected value is the sum, over all possible values of the random variable $X$, of each value multiplied by its probability. Intuitively, the expected value of $X$ is a number that tells us the average value of $X$ we expect to see when we perform a large number of independent repetitions of an experiment.
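As a rough illustration of this definition, the sketch below wraps the sum in a small helper. The name `expected_value`, the dictionary representation of the probability mass function and the validity checks are assumptions made for this example, not part of the definition itself.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return the sum of x * p(x) for a PMF given as a {value: probability} dict."""
    if any(p < 0 for p in pmf.values()):
        raise ValueError("probabilities must be non-negative")
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())

# Exact fractions avoid floating-point rounding in the sum-to-one check.
coin = {3: Fraction(4, 5), 4: Fraction(1, 5)}
print(expected_value(coin))  # prints 16/5, that is, 3.2
```

Using `Fraction` keeps the check that the probabilities sum to one exact; with floats, a tolerance-based comparison would be needed instead.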

Note that $\mathbb{E}(X)$ is sometimes denoted as $\mu$.

## Expected profit from a lottery

Consider a lottery game where the cost of a single ticket is $3$ dollars. Let the random variable $X$ denote the lottery prize. The following table describes the three possible outcomes and their corresponding probabilities:

| $x$ | $p(x)$ |
|---|---|
| $0$ | $0.90$ |
| $10$ | $0.08$ |
| $100$ | $0.02$ |

Compute and interpret the expected value of $X$.

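Solution. By definition, the expected value of $X$ is:

$$\begin{align*}
\mathbb{E}(X)&=\sum_x{x}\cdot{p(x)}\\
&=(0)(0.90)+(10)(0.08)+(100)(0.02)\\
&=2.8
\end{align*}$$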

This means that we should expect to win $2.8$ dollars per lottery game. However, since the cost of a single lottery ticket is $3$ dollars, we will lose $0.2$ dollars per game on average. If we play $100$ games, then we should expect to lose a total of $20$ dollars.
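The same bookkeeping can be sketched in Python. The variable names below are made up for this illustration; the numbers come from the table and the ticket price stated above.

```python
# Prize distribution from the table above; a ticket costs 3 dollars.
prize_pmf = {0: 0.90, 10: 0.08, 100: 0.02}
ticket_cost = 3

expected_prize = sum(x * p for x, p in prize_pmf.items())
expected_net = expected_prize - ticket_cost  # average gain per game (negative means a loss)
expected_net_100 = 100 * expected_net        # average gain over 100 games

print(round(expected_prize, 2), round(expected_net, 2), round(expected_net_100, 2))
```

A negative net value corresponds to an average loss, matching the $0.2$ dollars per game and $20$ dollars per $100$ games found above.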

## Expected waiting time for food

The waiting time (in minutes) for food delivery is represented by the random variable $X$. The probability mass function of $X$ is:

| $x$ | $p(x)$ |
|---|---|
| $10$ | $0.2$ |
| $20$ | $0.3$ |
| $30$ | $0.5$ |

On average, how long do we have to wait to get our food?

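Solution. Since $X$ is the waiting time for the food to arrive, the average waiting time is the expected value of $X$:

$$\begin{align*}
\mathbb{E}(X)&=\sum_x{x}\cdot{p(x)}\\
&=(10)(0.2)+(20)(0.3)+(30)(0.5)\\
&=23
\end{align*}$$

Therefore, on average, we must wait $23$ minutes for our food!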

# Expected value of a symmetric distribution

Suppose we have a probability mass function like so:

| $x$ | $4$ | $5$ | $6$ |
|---|---|---|---|
| $p(x)$ | $1/4$ | $2/4$ | $1/4$ |

This is visualized below:

When we have symmetric distributions like this, we can find the expected value without having to calculate it. We know that the expected value is the mean value that a random variable takes in the long run. For a symmetric distribution, the mean is at the center, which means $\mathbb{E}(X)=5$.
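To see that the shortcut agrees with the definition, the sketch below checks the symmetry of this probability mass function about $5$ and then computes the expected value the long way. The helper `is_symmetric_about` is a hypothetical name introduced only for this example.

```python
from fractions import Fraction

pmf = {4: Fraction(1, 4), 5: Fraction(2, 4), 6: Fraction(1, 4)}

def is_symmetric_about(pmf, center):
    """True if every value x has the same probability as its mirror image 2*center - x."""
    return all(pmf.get(2 * center - x, 0) == p for x, p in pmf.items())

center = 5
mean = sum(x * p for x, p in pmf.items())  # definition: sum of x * p(x)
print(is_symmetric_about(pmf, center), mean == center)  # prints True True
```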

## Expected value of rolling a die

Suppose we roll a fair die once. Let the random variable $X$ denote the face of the rolled die. What is the expected value of $X$?

Solution. Since we have a fair die, the probability mass function of $X$ is:

| $x$ | $1$ | $2$ | $3$ | $4$ | $5$ | $6$ |
|---|---|---|---|---|---|---|
| $p(x)$ | $1/6$ | $1/6$ | $1/6$ | $1/6$ | $1/6$ | $1/6$ |

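The expected value of $X$ is:

$$\begin{align*}
\mathbb{E}(X)&=\sum_x{x}\cdot{p(x)}\\
&=(1)\Big(\frac{1}{6}\Big)+(2)\Big(\frac{1}{6}\Big)+(3)\Big(\frac{1}{6}\Big)
+(4)\Big(\frac{1}{6}\Big)+(5)\Big(\frac{1}{6}\Big)+(6)\Big(\frac{1}{6}\Big)\\
&=\frac{21}{6}\\
&=3.5
\end{align*}$$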

One way of interpreting this is to think of $X$ as points - for instance, if we roll a $6$, we get $6$ points. On average, we would get $3.5$ points per roll.

As a side note, notice that $p(x)$ is a symmetric probability distribution. We know from earlier that the expected value of a symmetric distribution is at its center, which here equals the mean of the possible values of $x$, that is:

$$\mathbb{E}(X)=\frac{1+2+3+4+5+6}{6}=3.5$$

This way of computing the expected value is convenient because we don't have to deal with probabilities - we simply compute the average of all possible $x$ values!
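Both routes can be compared directly in Python. This is a small sketch using the standard `fractions` and `statistics` modules; the variable names are made up for this illustration.

```python
from fractions import Fraction
from statistics import mean

faces = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)  # each face of a fair die is equally likely

weighted_sum = sum(x * p for x in faces)  # definition: sum of x * p(x)
plain_average = mean(faces)               # shortcut: average of the faces

print(weighted_sum, plain_average)
```

The first value is the exact fraction $7/2$ and the second is its decimal form $3.5$, so the two approaches agree.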

We will now discuss the expected values of continuous random variables. The underlying intuition is the same as that for the discrete case - the main difference is that instead of summing over all possible values of a random variable, we integrate over them!

# Expected value of a continuous random variable

Let $X$ be a continuous random variable with probability density function $f(x)$. The expected value of $X$ is defined as follows:

$$\mathbb{E}(X)=\int^{\infty}_{-\infty}x\cdot{f(x)}\;dx$$

Note the following:

- this definition holds only if the integral exists.
- the bounds of the integral are typically written as $-\infty$ and $\infty$ for the formal definition of expected values. In practice, we integrate only over the range of values that $X$ can take, since $f(x)$ is zero elsewhere.
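To make the second note concrete, here is a short sketch that assumes, purely for illustration, that $f(x)$ is zero outside some interval $[a,b]$. The integral over the whole real line then splits into three pieces, two of which vanish:

$$\begin{align*}
\mathbb{E}(X)&=\int^{\infty}_{-\infty}x\cdot{f(x)}\;dx\\
&=\int^{a}_{-\infty}x\cdot0\;dx+\int^{b}_{a}x\cdot{f(x)}\;dx+\int^{\infty}_{b}x\cdot0\;dx\\
&=\int^{b}_{a}x\cdot{f(x)}\;dx
\end{align*}$$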

## Computing the expected value of a continuous random variable (1)

Suppose random variable $X$ has the following probability density function:

Compute the expected value of $X$.

Solution. By definition, the expected value of $X$ is:

Therefore, if we repeatedly sample from this distribution a large number of times, the average value that $X$ will take on is $2/3$.
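An integral of this kind can also be approximated numerically. In the sketch below, the density $f(x)=2x$ on $0\le{x}\le1$ is an assumption picked for illustration (it is a valid density whose expected value is $2/3$) and is not necessarily the one used in this example. The helper `expected_value_continuous` is a hypothetical name, and the midpoint rule is just one simple way to approximate the integral.

```python
def expected_value_continuous(pdf, lower, upper, steps=100_000):
    """Approximate the integral of x * pdf(x) over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th subinterval
        total += x * pdf(x) * width
    return total

# ASSUMPTION: f(x) = 2x on [0, 1], chosen only because its expected value is 2/3.
def assumed_pdf(x):
    return 2 * x

approx = expected_value_continuous(assumed_pdf, 0, 1)
print(approx)  # close to 2/3
```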

## Computing the expected value of a continuous random variable (2)

Suppose random variable $X$ has the following probability density function:

Compute $\mathbb{E}(X)$.

Solution. By definition, the expected value of $X$ is:

Therefore, the average value that $X$ takes on in the long run is $6.5$.

Just as in the discrete case, we don't have to rely on the formula to compute the expected value when the distribution is symmetric. The probability density function of $X$ in this case is symmetric:

Because $\mathbb{E}(X)$ represents the average value that $X$ takes on upon repeated sampling, we have that $\mathbb{E}(X)$ must be at the center when the distribution of $X$ is symmetric. Therefore, we can easily conclude that $\mathbb{E}(X)=6.5$ in this case.
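The symmetry argument can be checked by repeated sampling. The sketch below assumes, purely for illustration, that $X$ is uniformly distributed on $[5,8]$, which is one symmetric density centered at $6.5$; the actual density in this example may differ. The seed and sample size are arbitrary choices.

```python
import random

# ASSUMPTION: X is uniform on [5, 8], a symmetric density centered at 6.5.
lower, upper = 5.0, 8.0
center = (lower + upper) / 2  # midpoint of the symmetric density

rng = random.Random(42)  # seeded so the run is reproducible
samples = [rng.uniform(lower, upper) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

print(center, round(sample_mean, 2))
```

The sample mean should land close to the center, in line with the long-run interpretation of $\mathbb{E}(X)$.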

In the next section, we will go over the mathematical properties of the expected value of random variables.