
Comprehensive Guide on Random Variables

Last updated: Oct 31, 2022

Motivating example

Recall that events in the context of a statistical experiment are random binary outcomes. For instance, we might be interested in the following events when rolling a die once:

  • is the outcome odd?

  • is the outcome greater than $4$?

  • is the outcome a $3$?

Each of these questions has a yes-or-no answer, which is what makes the events binary. In many cases, however, we are interested in numeric quantities instead.

For instance, suppose we roll a die twice. We might be interested in how many times we roll a $3$, and questions of the type "how many" cannot be answered with yes or no. Let's introduce a numeric variable $X$ that represents the number of times we roll a $3$. In this case, the possible values that $X$ may take are:

$$X\in\{0,1,2\}$$

Because the value of $X$ depends on the outcome of the experiment, $X$ is known as a random variable. Notationally, we use an uppercase $X$ to refer to a random variable and a lowercase $x$ to denote a particular value that $X$ takes on. The main difference is that $X$ is random whereas $x$ is a specific observed value that is not random.

To be more mathematically precise, a random variable is defined as a function that associates a real value $x$ with every possible outcome of the experiment, that is, with every element of the sample space. For our dice-rolling experiment, let's define the events of success ($\mathrm{S}$) and failure ($\mathrm{F}$) as follows:

  • success - the outcome of rolling a $3$.

  • failure - the outcome of not rolling a $3$.

Therefore, the sample space $\Omega$ is:

$$\Omega= \{\;\mathrm{FF},\mathrm{SF},\mathrm{FS},\mathrm{SS}\;\}$$

Here, $\mathrm{SF}$, for instance, represents a success followed by a failure.

Remember, every element in the sample space is called a sample point. In this case, we have four sample points in our sample space. We defined the random variable $X$ as the number of times we roll a $3$, which is equivalent to the number of times we observe a success $\mathrm{S}$. The random variable $X$ maps each sample point to a specific value $x$ like so:

| Sample point | $x$ |
|---|---|
| $\mathrm{FF}$ | $0$ |
| $\mathrm{SF}$ | $1$ |
| $\mathrm{FS}$ | $1$ |
| $\mathrm{SS}$ | $2$ |

Note the following:

  • $X$ maps $\mathrm{FF}$ to the value $0$ because this is the case when we roll no $3$s. Mathematically, $X(\mathrm{FF})=0$.

  • $X$ maps $\mathrm{SF}$ and $\mathrm{FS}$ both to the value $1$ because these are the cases when we roll a single $3$. Mathematically, $X(\mathrm{SF})=X(\mathrm{FS})=1$. This also demonstrates how $X$ can map different sample points to the same value.

  • $X$ maps $\mathrm{SS}$ to the value $2$ because this is the case when we roll two $3$s. Mathematically, $X(\mathrm{SS})=2$.

  • the values that $X$ can take on are $0$, $1$ and $2$.
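The mapping above can be sketched in a few lines of Python. This is only an illustration: the names `sample_space`, `X`, `mapping` and `possible_values` are chosen here for the sketch and do not come from the guide. The point is that $X$ is an ordinary function from sample points to numbers.

```python
from itertools import product

# Each roll is reduced to "S" (rolled a 3) or "F" (did not roll a 3).
# Two rolls give the four sample points FF, FS, SF and SS.
sample_space = ["".join(point) for point in product("FS", repeat=2)]


def X(sample_point: str) -> int:
    """Random variable: the number of successes (3s) in a sample point."""
    return sample_point.count("S")


# Apply X to every sample point to see the value it is mapped to.
mapping = {point: X(point) for point in sample_space}

# The distinct values that X can take on.
possible_values = sorted(set(mapping.values()))

print(mapping)
print(possible_values)
```

Note that two different sample points ($\mathrm{SF}$ and $\mathrm{FS}$) end up with the same value, so the dictionary has four keys but only three distinct values.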

We now formally state the definition of a random variable.

Definition.

Random variable

A random variable $X$ is a function that assigns a real value to each element (sample point) of the sample space.

Example.

Random variables of drawing balls from a bag

Suppose we draw two balls from a bag containing many red and green balls. We are interested in the number of green balls we draw. How should we define the random variable in this case?

Solution. We should define random variable $X$ as the number of green balls we draw. The sample space $\Omega$ in this case is:

$$\Omega=\{{ {\color{red}\mathrm{RR}}, {\mathrm{\color{red}R\color{green}G}}, \mathrm{\color{green}G\color{red}R},\color{green}\mathrm{GG}}\}$$

Here, $\color{red}\mathrm{R}$ represents the event of drawing a red ball, and $\color{green}\mathrm{G}$ represents the event of drawing a green ball. The random variable $X$ maps each of the sample points to a real value $x$, which is the number of green balls we draw for each sample point:

| Sample point | $x$ |
|---|---|
| $\color{red}\mathrm{RR}$ | $0$ |
| $\mathrm{\color{red}R\color{green}G}$ | $1$ |
| $\mathrm{\color{green}G\color{red}R}$ | $1$ |
| $\color{green}\mathrm{GG}$ | $2$ |

Therefore, $X$ can take on the values $0$, $1$ or $2$.

Assigning probabilities to random variables

Just like how we assign probabilities to events, we can do the same to random variables. Consider the following example:

Example.

Assigning probabilities to random variables of drawing balls from a bag

Suppose we draw two balls with replacement from a bag containing 2 red and 3 green balls. What is the probability of drawing:

  • no red balls?

  • one red ball?

  • two red balls?

Solution. Let's define random variable $X$ as the number of red balls we draw. The probabilities of interest are:

  • $\mathbb{P}(X=0)$ - the probability of drawing no red balls.

  • $\mathbb{P}(X=1)$ - the probability of drawing one red ball.

  • $\mathbb{P}(X=2)$ - the probability of drawing two red balls.

We can compute these probabilities by referring to the sample space:

| Sample point | $x$ |
|---|---|
| $\color{green}\mathrm{GG}$ | $0$ |
| $\mathrm{\color{red}R\color{green}G}$ | $1$ |
| $\mathrm{\color{green}G\color{red}R}$ | $1$ |
| $\color{red}\mathrm{RR}$ | $2$ |

Let's calculate $\mathbb{P}(X=0)$ and $\mathbb{P}(X=2)$ first:

$$\begin{align*} \mathbb{P}(X=0) &=\mathbb{P}(\mathrm{\color{green}GG})\\ &=\frac{3}{5}\cdot\frac{3}{5}\\ &=\frac{9}{25}\\\\ \mathbb{P}(X=2) &=\mathbb{P}(\mathrm{\color{red}RR})\\ &=\frac{2}{5}\cdot\frac{2}{5}\\ &=\frac{4}{25} \end{align*}$$

Next, let's calculate $\mathbb{P}(X=1)$. $X=1$ is true when either event $\mathrm{\color{red}R\color{green}G}$ or $\mathrm{\color{green}G\color{red}R}$ occurs. Since $\mathrm{\color{red}R\color{green}G}$ and $\mathrm{\color{green}G\color{red}R}$ are disjoint, we can apply the third axiom of probability to calculate $\mathbb{P}(X=1)$ like so:

$$\begin{align*} \mathbb{P}(X=1) &=\mathbb{P}(\mathrm{\color{red}R\color{green}G})+ \mathbb{P}(\mathrm{\color{green}G\color{red}R})\\ &=\frac{2}{5}\cdot\frac{3}{5}+ \frac{3}{5}\cdot\frac{2}{5}\\ &=\frac{12}{25} \end{align*}$$

Notice the following:

$$\mathbb{P}(X=0)+ \mathbb{P}(X=1)+ \mathbb{P}(X=2)= 1$$

This holds because the probability of the sample space must be equal to one:

$$\begin{align*} \mathbb{P}(X=0)+ \mathbb{P}(X=1)+ \mathbb{P}(X=2)&= \mathbb{P}({\color{green}\mathrm{GG}}) + \mathbb{P}({\color{red}\mathrm{R}\color{green}\mathrm{G}})+ \mathbb{P}({\color{green}\mathrm{G}\color{red}\mathrm{R}})+ \mathbb{P}({\color{red}\mathrm{RR}})\\ &=1 \end{align*}$$
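As a quick check of the arithmetic above, here is a minimal Python sketch that enumerates the sample space and adds up the probability of each sample point. The helper name `red_count_probabilities` and the dictionary `p_colour` are made up for this illustration; exact fractions are used so that the results can be compared directly with $9/25$, $12/25$ and $4/25$.

```python
from fractions import Fraction
from itertools import product

# Bag from the example: 2 red and 3 green balls, drawn with replacement,
# so every draw has the same colour probabilities.
p_colour = {"R": Fraction(2, 5), "G": Fraction(3, 5)}


def red_count_probabilities(draws: int = 2) -> dict:
    """Return P(X = x) for X = number of red balls drawn."""
    probabilities = {}
    for point in product("RG", repeat=draws):
        # Draws are independent, so multiply the per-draw probabilities.
        p_point = Fraction(1)
        for colour in point:
            p_point *= p_colour[colour]
        # Sample points with the same number of reds are disjoint,
        # so their probabilities add up.
        x = point.count("R")
        probabilities[x] = probabilities.get(x, Fraction(0)) + p_point
    return probabilities


probabilities = red_count_probabilities()
for x in sorted(probabilities):
    print(x, probabilities[x])
print("total:", sum(probabilities.values()))
```

The inner loop mirrors the multiplication step in the solution, and the accumulation by `x` mirrors the addition of the two disjoint sample points for $X=1$.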

Types of random variables

There are two types of random variables - discrete and continuous.

Discrete random variables

A discrete random variable only takes on a countable number of values. The examples we have looked at previously are all discrete random variables:

  • the number of times we get a 3 after rolling a die twice.

  • the number of green balls we get after randomly drawing two balls from a bag.

These are discrete random variables because each can take on only a finite number of values ($0$, $1$ or $2$), and any finite set of values is countable.

Other examples of discrete random variables are:

  • the number of days it rains in some given month.

  • the number of people waiting at a particular bus stop.

Continuous random variables

A continuous random variable takes on values from an uncountably infinite set, such as an interval of real numbers. For instance, let's consider the height of an adult as a random variable $X$. Unlike the discrete case, we cannot list the possible values of $X$ one by one because there are infinitely many values between any two heights - a person could be $170.5\mathrm{cm}$ tall or even $170.0005\mathrm{cm}$ tall.

Examples of continuous random variables include:

  • the amount of rain in some given month.

  • the length of time we wait for the bus.
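The contrast between the two types can be seen in a small simulation. This is a rough sketch under stated assumptions: the waiting time is modelled as uniform between 0 and 10 minutes, which is a hypothetical choice made only for illustration, and the seed and function names are arbitrary.

```python
import random

rng = random.Random(0)  # arbitrary seed so that reruns give the same samples


def count_threes(rolls: int = 2) -> int:
    """Discrete: number of 3s observed when rolling a fair die `rolls` times."""
    return sum(1 for _ in range(rolls) if rng.randint(1, 6) == 3)


def waiting_time_minutes(max_wait: float = 10.0) -> float:
    """Continuous: waiting time under a hypothetical uniform model."""
    return rng.uniform(0.0, max_wait)


discrete_samples = [count_threes() for _ in range(1000)]
continuous_samples = [waiting_time_minutes() for _ in range(1000)]

# The discrete variable keeps landing on the same few values...
print("distinct discrete values:", sorted(set(discrete_samples)))
# ...whereas repeated values of the continuous variable are very unlikely.
print("distinct continuous values:", len(set(continuous_samples)))
```

However many samples we draw, the discrete variable can only ever show the values $0$, $1$ or $2$, while the continuous one is expected to produce a new value almost every time.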

Definition.

Independence of random variables

Two (discrete) random variables $X$ and $Y$ are independent if:

$$\mathbb{P}(X=x \,\, \text{and } Y=y) = \mathbb{P}(X=x)\cdot\mathbb{P}(Y=y), \;\;\;\;\;\;\text{for all }\;x,y$$

The independence of random variables is analogous to the case when we have two independent events $A$ and $B$ - the probability that both $A$ and $B$ occur is the product of the probability of $A$ and the probability of $B$, that is:

$$\mathbb{P}(A\text{ and }B) =\mathbb{P}(A)\cdot\mathbb{P}(B)$$

Example.

Tossing two coins

Suppose we have a fair coin, and perform the following three separate tosses:

  • we toss the coin once and we define random variable $X$ as the number of heads - $X\in\{0,1\}$.

  • we then toss the coin two more times and we define random variable $Y$ as the number of tails in these two tosses - $Y\in\{0,1,2\}$.

Compute $\mathbb{P}(X=1\text{ and }Y=2)$.

Solution. The outcome of a coin toss does not affect the outcome of subsequent tosses. This means that random variables $X$ and $Y$ are independent:

$$\begin{align*} \mathbb{P}(X=1\text{ and }Y=2) &=\mathbb{P}(X=1)\cdot{\mathbb{P}(Y=2)}\\ &=\Big(\frac{1}{2}\Big)\cdot\Big(\frac{1}{2}\cdot\frac{1}{2}\Big)\\ &=\frac{1}{8} \end{align*}$$
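The result can be double-checked by brute force. The sketch below assumes that $X$ and $Y$ come from three separate fair tosses (the first toss defines $X$, the other two define $Y$), which is how the solution treats them; the helper names `X`, `Y` and `prob` are chosen for this illustration only.

```python
from fractions import Fraction
from itertools import product

# Assumption: three separate fair tosses. The first toss defines X and the
# remaining two define Y. All 8 sequences are equally likely.
outcomes = list(product("HT", repeat=3))
p_each = Fraction(1, len(outcomes))


def X(outcome) -> int:
    """Number of heads in the first toss."""
    return 1 if outcome[0] == "H" else 0


def Y(outcome) -> int:
    """Number of tails in the second and third tosses."""
    return outcome[1:].count("T")


def prob(event) -> Fraction:
    """Probability of an event, given as a predicate on outcomes."""
    return sum((p_each for o in outcomes if event(o)), Fraction(0))


p_joint = prob(lambda o: X(o) == 1 and Y(o) == 2)
p_x = prob(lambda o: X(o) == 1)
p_y = prob(lambda o: Y(o) == 2)

print(p_joint, p_x * p_y)
```

Counting outcomes directly gives the joint probability without using the product rule, so agreement between `p_joint` and `p_x * p_y` confirms the calculation rather than assuming it.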

Final remarks

The concept of random variables provides the fundamental building block for other important statistical concepts such as probability distribution functions and linear regression. Random variables can be thought of as numeric events that can either be discrete or continuous. The main difference between the two types is that discrete random variables take on a countable number of values whereas continuous random variables take on values from a continuum, such as an interval of real numbers.

We have also briefly explored how we can assign probabilities to random variables. In the next section, we will discuss this further by looking at:

  • probability mass functions that assign probabilities to each possible value of a discrete random variable.

  • probability density functions that assign probabilities to a range of values of a continuous random variable.

Published by Isshin Inada