Comprehensive Introduction to Matrices

Last updated: Jan 28, 2023

What are matrices?

A matrix is a group of numbers arranged in rows and columns. For example, below is a matrix with $2$ rows and $3$ columns:

$$\boldsymbol{A}=\begin{pmatrix} 3&6&5\\ 5&9&1\\ \end{pmatrix}$$

The general convention is to use bold uppercase letters such as $\boldsymbol{A}$ to denote a matrix. We refer to a matrix with $2$ rows and $3$ columns as a $2\times3$ matrix where $\times$ is read as "by".

Some textbooks use square brackets instead of circular brackets:

$$\boldsymbol{A}=\begin{bmatrix} 3&6&5\\ 5&9&1\\ \end{bmatrix}$$

We use the following notation to represent a generic $m\times{n}$ matrix:

$$\boldsymbol{A}=\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{pmatrix}$$

The convention is to denote each number in the matrix using a non-bold lowercase letter with subscripts. For instance, $a_{12}$ represents the number in the first row and second column. We typically match the notation used for the matrix and the numbers it contains. For instance, the numbers in matrix $\boldsymbol{B}$ are denoted as $b_{ij}$.
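To make the subscript notation concrete, here is a minimal sketch in plain Python. Storing a matrix as a list of rows and the helper name `entry` are assumptions of this sketch, not part of the notation itself. Note that Python lists are indexed from $0$, whereas the subscripts in $a_{ij}$ start from $1$.

```python
# The 2x3 matrix A from above, stored as a list of rows
# (a representation chosen for this sketch).
A = [
    [3, 6, 5],
    [5, 9, 1],
]


def entry(matrix, i, j):
    """Return the entry in row i and column j, using 1-based subscripts."""
    return matrix[i - 1][j - 1]


num_rows = len(A)      # number of rows
num_cols = len(A[0])   # number of columns

print(num_rows, num_cols)  # 2 3
print(entry(A, 1, 2))      # 6, that is, a_12
```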

Relationship between vectors and matrices

Recall that vectors are essentially arrows that connect one point in space to another point. An example of a vector is:

$$\boldsymbol{v}= \begin{pmatrix} 3\\2 \end{pmatrix}$$

This can be treated as a matrix with $2$ rows and $1$ column. In fact, matrices can be thought of as a generalization of vectors that allow for multiple columns!

Special matrices

Definition.

Square matrix

A square matrix contains the same number of rows and columns. Below is an example of a $3\times3$ square matrix:

$$\begin{pmatrix} 3&1&5\\8&1&2\\2&1&2 \end{pmatrix}$$
Definition.

Zero matrix

The zero matrix contains all zeros. Below is an example of a $2\times3$ zero matrix:

$$\begin{pmatrix} 0&0&0\\0&0&0 \end{pmatrix}$$
Definition.

Identity matrix

The identity matrix is a square matrix, denoted by $\boldsymbol{I}$, whose main diagonal is filled with $1$s and whose other entries are all $0$s. Below is an example of a $3\times3$ identity matrix:

$$\boldsymbol{I}_3=\begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix}$$

As we did here, we sometimes use the notation $\boldsymbol{I}_n$ to denote an $n\times{n}$ identity matrix. We will later explore how the identity matrix plays a critical role in linear algebra.

Matrix addition and subtraction

Suppose we have two matrices of the same shape $\boldsymbol{A}$ and $\boldsymbol{B}$. To compute $\boldsymbol{A}+\boldsymbol{B}$, we add the corresponding pair of numbers in the matrices. For instance, consider the following matrices:

$$\boldsymbol{A}=\begin{pmatrix} 3&6&5\\ 5&9&1\\ \end{pmatrix},\;\;\;\;\; \boldsymbol{B}=\begin{pmatrix} 0&2&1\\ 1&3&2\\ \end{pmatrix}$$

Their sum is:

$$\begin{align*} \boldsymbol{A}+\boldsymbol{B}&=\begin{pmatrix} 3&6&5\\ 5&9&1\\ \end{pmatrix}+\begin{pmatrix} 0&2&1\\ 1&3&2\\ \end{pmatrix}\\&= \begin{pmatrix} 3+0&6+2&5+1\\ 5+1&9+3&1+2\\ \end{pmatrix}\\ &= \begin{pmatrix} 3&8&6\\ 6&12&3\\ \end{pmatrix} \end{align*}$$

Similarly, to compute $\boldsymbol{A}-\boldsymbol{B}$, we subtract the numbers in $\boldsymbol{B}$ from the corresponding numbers in $\boldsymbol{A}$ like so:

$$\begin{align*} \boldsymbol{A}-\boldsymbol{B}&=\begin{pmatrix} 3&6&5\\ 5&9&1\\ \end{pmatrix}-\begin{pmatrix} 0&2&1\\ 1&3&2\\ \end{pmatrix}\\&= \begin{pmatrix} 3-0&6-2&5-1\\ 5-1&9-3&1-2\\ \end{pmatrix}\\ &= \begin{pmatrix} 3&4&4\\ 4&6&-1\\ \end{pmatrix} \end{align*}$$

Note that matrix addition and subtraction are only defined for the case when the two matrices have the same number of rows and columns.
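The entry-by-entry rule can be sketched in plain Python as follows. The function names `add` and `subtract`, and the choice to raise a `ValueError` when the shapes differ, are assumptions of this sketch.

```python
def shape(M):
    """Return (number of rows, number of columns) of a list-of-rows matrix."""
    return (len(M), len(M[0]))


def add(A, B):
    """Add two matrices of the same shape entry by entry."""
    if shape(A) != shape(B):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def subtract(A, B):
    """Subtract B from A entry by entry."""
    if shape(A) != shape(B):
        raise ValueError("matrices must have the same shape")
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[3, 6, 5], [5, 9, 1]]
B = [[0, 2, 1], [1, 3, 2]]

print(add(A, B))       # [[3, 8, 6], [6, 12, 3]]
print(subtract(A, B))  # [[3, 4, 4], [4, 6, -1]]
```

Both results agree with the sums and differences computed by hand above.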

Scalar-matrix multiplication

Scalar-matrix multiplication works just like scalar-vector multiplication: every entry of the matrix is multiplied by the scalar. For instance, multiplying a $2\times3$ matrix by a scalar $k$ gives:

$$k \begin{pmatrix} a_{11}&a_{12}&a_{13}\\ a_{21}&a_{22}&a_{23}\\ \end{pmatrix}= \begin{pmatrix} ka_{11}&ka_{12}&ka_{13}\\ ka_{21}&ka_{22}&ka_{23}\\ \end{pmatrix}$$

Here's a more concrete example:

$$3 \begin{pmatrix} 1&4&2\\ 0&2&1\\ \end{pmatrix}= \begin{pmatrix} 3\times1&3\times4&3\times2\\ 3\times0&3\times2&3\times1\\ \end{pmatrix} = \begin{pmatrix} 3&12&6\\ 0&6&3\\ \end{pmatrix}$$
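As a quick check, scalar-matrix multiplication can be sketched in plain Python. The helper name `scale` is hypothetical and the matrix is again stored as a list of rows.

```python
def scale(k, M):
    """Multiply every entry of the matrix M by the scalar k."""
    return [[k * entry for entry in row] for row in M]


M = [[1, 4, 2], [0, 2, 1]]
print(scale(3, M))  # [[3, 12, 6], [0, 6, 3]]
```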

Matrix-matrix multiplication

Unfortunately, matrix multiplication is not as straightforward as matrix addition. Let's go through an example - consider the following matrices:

$$\boldsymbol{A}=\begin{pmatrix} 3&6&5\\ 5&0&1\\ \end{pmatrix},\;\;\;\;\; \boldsymbol{B}= \begin{pmatrix}2&4&1&0\\1&3&3&1\\0&5&1&2\\\end{pmatrix}$$

Matrix $\boldsymbol{A}$ is a $2\times3$ matrix while matrix $\boldsymbol{B}$ is a $3\times4$ matrix. The matrix product $\boldsymbol{AB}$ is another matrix whose shape is determined by the following rule:

$$\large{\underset{({\color{green}2}\times{\color{red}3})}{\boldsymbol{A}}\;\; \underset{({\color{red}3}\times{\color{blue}4})}{\boldsymbol{B}}\;\; =\;\; \underset{{(\color{green}2}\times{\color{blue}4})}{\boldsymbol{AB}}}$$

In words, matrix $\boldsymbol{AB}$ contains:

  • the same number of rows as $\boldsymbol{A}$.

  • the same number of columns as $\boldsymbol{B}$.

Note that the number of columns of $\boldsymbol{A}$ and the number of rows of $\boldsymbol{B}$ must match; otherwise, matrix multiplication is not defined! We will show why this rule always holds later, but for now, please accept that $\boldsymbol{AB}$ will be a $2\times4$ matrix in this case.

WARNING

We write the product of matrices $\boldsymbol{A}$ and $\boldsymbol{B}$ as $\boldsymbol{AB}$ instead of $\boldsymbol{A}\times\boldsymbol{B}$. The notation $\boldsymbol{A}\times\boldsymbol{B}$ is not correct and should be avoided.

For notational convenience, let's define $\boldsymbol{C}=\boldsymbol{AB}$. Our goal now is to compute the entries of $\boldsymbol{C}$ below:

$$\boldsymbol{C}=\begin{pmatrix} c_{11}&c_{12}&c_{13}&c_{14}\\ c_{21}&c_{22}&c_{23}&c_{24}\\ \end{pmatrix}$$

To compute $c_{11}$, we take the dot product of the $1$st row of $\boldsymbol{A}$ and $1$st column of $\boldsymbol{B}$ below:

$$\boldsymbol{AB}=\begin{pmatrix} \color{green}3&\color{green}6&\color{green}5\\5&0&1\\\end{pmatrix} \begin{pmatrix}\color{green}2&4&1&0\\\color{green}1&3&3&1\\\color{green}0&5&1&2\\\end{pmatrix}$$

Therefore, $c_{11}$ is:

$$\begin{align*} c_{11} &=(3)(2)+(6)(1)+(5)(0)\\ &=12 \end{align*}$$

Next, to compute $c_{12}$, we take the dot product of the $1$st row of $\boldsymbol{A}$ and the $2$nd column of $\boldsymbol{B}$ below:

$$\boldsymbol{AB}=\begin{pmatrix} \color{green}3&\color{green}6&\color{green}5\\5&0&1\\\end{pmatrix} \begin{pmatrix}2&\color{green}4&1&0\\1&\color{green}3&3&1\\0&\color{green}5&1&2\\\end{pmatrix}$$

Therefore, $c_{12}$ is:

$$\begin{align*} c_{12} &=(3)(4)+(6)(3)+(5)(5)\\ &=55 \end{align*}$$

Can you see how there is a pattern in how the entries of $\boldsymbol{AB}$ are computed? $c_{13}$ is computed by taking the dot product of the $1$st row of $\boldsymbol{A}$ and the $3$rd column of $\boldsymbol{B}$.

Let's now move on to the entries of the $2$nd row of $\boldsymbol{AB}$. To compute $c_{21}$, we take the dot product of the $2$nd row of $\boldsymbol{A}$ and the $1$st column of $\boldsymbol{B}$ below:

$$\boldsymbol{AB}=\begin{pmatrix} 3&6&5\\\color{green}5&\color{green}0&\color{green}1\\\end{pmatrix} \begin{pmatrix}\color{green}2&4&1&0\\\color{green}1&3&3&1\\\color{green}0&5&1&2\\\end{pmatrix}$$

Therefore, $c_{21}$ is:

$$\begin{align*} c_{21} &=(5)(2)+(0)(1)+(1)(0)\\ &=10 \end{align*}$$

To compute $c_{22}$, we take the dot product of the $2$nd row of $\boldsymbol{A}$ and the $2$nd column of $\boldsymbol{B}$ below:

$$\boldsymbol{AB}=\begin{pmatrix} 3&6&5\\\color{green}5&\color{green}0&\color{green}1\\\end{pmatrix} \begin{pmatrix}2&\color{green}4&1&0\\1&\color{green}3&3&1\\0&\color{green}5&1&2\\\end{pmatrix}$$

Therefore, $c_{22}$ is:

$$\begin{align*} c_{22} &=(5)(4)+(0)(3)+(1)(5)\\ &=25 \end{align*}$$

As you would expect, we can compute $c_{23}$ by taking the dot product of the $2$nd row of $\boldsymbol{A}$ and the $3$rd column of $\boldsymbol{B}$. To generalize, the entry $c_{ij}$ is computed by taking the dot product of the $i$-th row of $\boldsymbol{A}$ and the $j$-th column of $\boldsymbol{B}$. In this way, finding the entries of the matrix product involves taking the dot product of the rows of the first matrix and the columns of the second matrix!
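The row-times-column rule translates almost directly into code. Below is a minimal sketch in plain Python, assuming matrices are stored as lists of rows. The function name `matmul` is our own choice, and this is only one straightforward way to write it, not an optimized implementation.

```python
def matmul(A, B):
    """Return the product AB, where c_ij is the dot product of
    the i-th row of A and the j-th column of B."""
    n = len(A[0])  # number of columns of A
    if n != len(B):
        raise ValueError("columns of A must match rows of B")
    r = len(B[0])  # number of columns of B
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
        for i in range(len(A))
    ]


A = [[3, 6, 5], [5, 0, 1]]
B = [[2, 4, 1, 0], [1, 3, 3, 1], [0, 5, 1, 2]]

C = matmul(A, B)
print(C)  # [[12, 55, 26, 16], [10, 25, 6, 2]]
```

The entries $c_{11}=12$, $c_{12}=55$, $c_{21}=10$ and $c_{22}=25$ match the values computed by hand, and the remaining entries follow the same pattern.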

Example.

Computing matrix product of 2x2 matrices

Let's go through a simpler example - consider the following matrices:

$$\boldsymbol{A}=\begin{pmatrix} 3&2\\ 1&4\\ \end{pmatrix},\;\;\;\;\; \boldsymbol{B}= \begin{pmatrix}5&0\\6&7\end{pmatrix}$$

Solution. Firstly, the shape of $\boldsymbol{AB}$ is:

$$\large{\underset{({\color{green}2}\times{\color{red}2})}{\boldsymbol{A}}\;\; \underset{({\color{red}2}\times{\color{blue}2})}{\boldsymbol{B}}\;\; =\;\; \underset{{(\color{green}2}\times{\color{blue}2})}{\boldsymbol{AB}}}$$

Note that the product $\boldsymbol{AB}$ is defined because the number of columns of $\boldsymbol{A}$ and the number of rows of $\boldsymbol{B}$ match. Let's now compute entries of $\boldsymbol{AB}$ by taking the dot products of the rows of $\boldsymbol{A}$ and the columns of $\boldsymbol{B}$ like so:

$$\begin{align*} \boldsymbol{AB}&=\begin{pmatrix} \color{green}3&\color{green}2\\\color{green}1&\color{green}4\\\end{pmatrix} \begin{pmatrix}\color{red}5&\color{red}0\\\color{red}6&\color{red}7\end{pmatrix}\\ &=\begin{pmatrix} {\color{green}(3)}{\color{red}(5)}+{\color{green}(2)}{\color{red}(6)}&{\color{green}(3){\color{red}(0)}}+{\color{green}(2)}{\color{red}(7)}\\ {\color{green}(1)}{\color{red}(5)}+{\color{green}(4)}{\color{red}(6)}&{\color{green}(1){\color{red}(0)}}+{\color{green}(4)}{\color{red}(7)} \end{pmatrix}\\ &=\begin{pmatrix} 27&14\\ 29&28 \end{pmatrix} \end{align*}$$
Theorem.

Deducing the resulting shape of a matrix product

If $\boldsymbol{A}$ is an $m\times{n}$ matrix and $\boldsymbol{B}$ is an $n\times{r}$ matrix, then the shape of their product is $m\times{r}$, that is:

$$\large{\underset{({\color{green}m}\times{\color{red}n})}{\boldsymbol{A}}\;\; \underset{({\color{red}n}\times{\color{blue}r})}{\boldsymbol{B}}\;\; =\;\; \underset{{(\color{green}m}\times{\color{blue}r})}{\boldsymbol{AB}}}$$

Note that if the number of columns of $\boldsymbol{A}$ and the number of rows of $\boldsymbol{B}$ do not match, then the product is not defined.

Proof. Suppose $\boldsymbol{A}$ is a $2\times3$ matrix represented below:

$$\boldsymbol{A}=\begin{pmatrix} *&*&*\\ *&*&*\end{pmatrix}$$

Suppose we have another matrix $\boldsymbol{B}$ and we take the product $\boldsymbol{AB}$. What must the shape of $\boldsymbol{B}$ be for the matrix multiplication to work? Let's consider the following case:

$$\boldsymbol{AB}=\begin{pmatrix} *&*&*\\ *&*&*\end{pmatrix} \begin{pmatrix} \bullet&\bullet\\ \bullet&\bullet\\ \bullet&\bullet\\ \bullet&\bullet\\ \end{pmatrix}$$

Does this matrix multiplication work? Recall that the top-left entry of $\boldsymbol{AB}$ is computed by taking the dot product of the first row of $\boldsymbol{A}$ and the first column of $\boldsymbol{B}$ like so:

$$\boldsymbol{AB}=\begin{pmatrix} \color{green}*&\color{green}*&\color{green}*\\ *&*&*\end{pmatrix} \begin{pmatrix} \color{green}\bullet&\bullet\\ \color{green}\bullet&\bullet\\ \color{green}\bullet&\bullet\\ \color{green}\bullet&\bullet\\ \end{pmatrix}$$

Here, the dot product is not defined because the first row of $\boldsymbol{A}$ contains $3$ numbers whereas the first column of $\boldsymbol{B}$ contains $4$ numbers. The only way for the dot product to work is if the number of rows of $\boldsymbol{B}$ is the same as the number of columns of $\boldsymbol{A}$ like so:

$$\begin{equation}\label{eq:iL6ajgdqb1tLrVa09rS} \boldsymbol{AB}=\begin{pmatrix} \color{green}*&\color{green}*&\color{green}*\\ *&*&*\end{pmatrix} \begin{pmatrix} \color{green}\bullet&\bullet\\ \color{green}\bullet&\bullet\\ \color{green}\bullet&\bullet\\ \end{pmatrix} \end{equation}$$

Now that we know how many rows $\boldsymbol{B}$ must have, how about the number of columns? Recall that the entry $c_{ij}$ of $\boldsymbol{C}=\boldsymbol{AB}$ is computed by taking the dot product of the $i$-th row of $\boldsymbol{A}$ and the $j$-th column of $\boldsymbol{B}$. This means that in the case of \eqref{eq:iL6ajgdqb1tLrVa09rS}, because $\boldsymbol{A}$ has $2$ rows and $\boldsymbol{B}$ has $2$ columns, the product $\boldsymbol{C}$ would take on the following shape:

$$\begin{pmatrix} c_{11}&c_{12}\\ c_{21}&c_{22}\\ \end{pmatrix}$$

What if the shape of $\boldsymbol{B}$ was as follows:

$$\boldsymbol{C}=\boldsymbol{AB}=\begin{pmatrix} *&*&*\\ *&*&*\end{pmatrix} \begin{pmatrix} \bullet&\bullet&\bullet\\ \bullet&\bullet&\bullet\\ \bullet&\bullet&\bullet\\ \end{pmatrix}$$

Since $\boldsymbol{A}$ has $2$ rows and $\boldsymbol{B}$ has $3$ columns, $\boldsymbol{C}$ would also have $2$ rows and $3$ columns:

$$\begin{pmatrix} c_{11}&c_{12}&c_{13}\\ c_{21}&c_{22}&c_{23}\\ \end{pmatrix}$$

In general, if $\boldsymbol{A}$ has $m$ rows and $\boldsymbol{B}$ has $r$ columns, then $\boldsymbol{C}$ must have $m$ rows and $r$ columns. The way to remember this rule is to first write down the shapes of the two matrices $\boldsymbol{A}$ and $\boldsymbol{B}$ like so:

$$\large{\underset{({\color{green}m}\times{\color{red}n})}{\boldsymbol{A}}\;\; \underset{({\color{red}n}\times{\color{blue}r})}{\boldsymbol{B}}\;\; =\;\; \underset{{(\color{green}m}\times{\color{blue}r})}{\boldsymbol{AB}}}$$

If the inner numbers ($\color{red}n$ in this case) are the same, then the product is defined - otherwise, the product is not defined. The shape of the resulting matrix is equal to the outer numbers ($\color{green}m$ and $\color{blue}r$ in this case).
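The inner/outer rule can be written as a small shape check. This is a sketch in plain Python where shapes are `(rows, columns)` tuples. The function name `product_shape` and the use of `ValueError` for undefined products are assumptions of this sketch.

```python
def product_shape(shape_a, shape_b):
    """Return the shape of AB given the shapes of A and B.

    The inner numbers must match. The result takes the outer numbers.
    """
    m, n = shape_a
    rows_b, r = shape_b
    if n != rows_b:
        raise ValueError(f"inner numbers differ: {n} and {rows_b}")
    return (m, r)


print(product_shape((2, 3), (3, 4)))  # (2, 4)
print(product_shape((2, 3), (3, 2)))  # (2, 2)

try:
    product_shape((2, 3), (4, 2))
except ValueError as error:
    print("product not defined:", error)
```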

Example.

Shape of a matrix-vector product

The shape of a matrix-vector product is:

$$\large{\underset{({\color{green}m}\times{\color{red}n})}{\boldsymbol{A}}\;\; \underset{({\color{red}n}\times{\color{blue}1})}{\boldsymbol{v}}\;\; =\;\; \underset{{(\color{green}m}\times{\color{blue}1})}{\boldsymbol{Av}}}$$

This means that a matrix-vector product results in another vector!

Theorem.

Matrix multiplication is not commutative

In general, matrix multiplication is not commutative, that is:

$$\boldsymbol{AB}\ne \boldsymbol{BA}$$

Where $\boldsymbol{A}$ and $\boldsymbol{B}$ are matrices.

Example. Consider the following matrices:

$$\boldsymbol{A}=\begin{pmatrix} 3&1\\ 0&2\\ \end{pmatrix},\;\;\;\;\; \boldsymbol{B}=\begin{pmatrix} 4&6\\ 5&7\\ \end{pmatrix}$$

The product $\boldsymbol{AB}$ and $\boldsymbol{BA}$ are:

$$\boldsymbol{AB}=\begin{pmatrix} 17&25\\ 10&14\\ \end{pmatrix},\;\;\;\;\; \boldsymbol{BA}=\begin{pmatrix} 12&16\\ 15&19\\ \end{pmatrix}$$

Notice that $\boldsymbol{AB}\ne{\boldsymbol{BA}}$. Therefore, unlike the multiplication of ordinary numbers, the ordering of the multiplication is important!
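We can confirm these two products with a short plain-Python sketch. The helper `matmul` below is our own illustrative function that applies the row-times-column rule to matrices stored as lists of rows.

```python
def matmul(A, B):
    """Multiply two list-of-rows matrices using the row-times-column rule."""
    n = len(A[0])
    if n != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[3, 1], [0, 2]]
B = [[4, 6], [5, 7]]

AB = matmul(A, B)
BA = matmul(B, A)

print(AB)        # [[17, 25], [10, 14]]
print(BA)        # [[12, 16], [15, 19]]
print(AB == BA)  # False
```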

Theorem.

Multiplying a matrix with an identity matrix

One of the key properties of identity matrices $\boldsymbol{I}$ is that multiplying them with another matrix $\boldsymbol{A}$ will yield the matrix itself, that is:

$$\boldsymbol{IA}=\boldsymbol{AI}=\boldsymbol{A}$$

Proof. Consider the $m\times{n}$ matrix $\boldsymbol{A}$ and $n\times{n}$ identity matrix $\boldsymbol{I}_n$ below:

$$\boldsymbol{A}=\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{pmatrix},\;\;\;\;\;\; \boldsymbol{I}_n= \begin{pmatrix} 1&0&\cdots&0\\ 0&1&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&1\\ \end{pmatrix}$$

The product $\boldsymbol{A}\boldsymbol{I}_n$ is:

$$\begin{align*} \boldsymbol{A}\boldsymbol{I}_n &=\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{pmatrix} \begin{pmatrix} 1&0&\cdots&0\\ 0&1&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&1\\ \end{pmatrix}\\ &=\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{pmatrix}\\ &=\boldsymbol{A} \end{align*}$$

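The second equality holds because the $j$-th column of $\boldsymbol{I}_n$ has a $1$ in the $j$-th position and $0$s elsewhere, so the dot product of the $i$-th row of $\boldsymbol{A}$ with this column is simply $a_{ij}$. Similarly, computing the product $\boldsymbol{I}_m\boldsymbol{A}$, where $\boldsymbol{I}_m$ is the $m\times{m}$ identity matrix, will yield $\boldsymbol{A}$. This completes the proof.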

Example.

Computing the product of a matrix and an identity matrix

Compute the following product:

$$\begin{pmatrix} 3&1&5\\8&1&2\\2&1&2 \end{pmatrix} \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix}$$

Solution. Notice how the right matrix is the identity matrix $\boldsymbol{I}_3$. The product will therefore return the left matrix:

$$\begin{pmatrix} 3&1&5\\8&1&2\\2&1&2 \end{pmatrix} \begin{pmatrix} 1&0&0\\0&1&0\\0&0&1 \end{pmatrix}= \begin{pmatrix} 3&1&5\\8&1&2\\2&1&2 \end{pmatrix}$$
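Here is a minimal plain-Python sketch that checks this property numerically. The helper names `identity` and `matmul` are our own, and matrices are stored as lists of rows. For a non-square $m\times{n}$ matrix, the identity matrix on the right must be $n\times{n}$ and the one on the left must be $m\times{m}$ for the products to be defined.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Multiply two list-of-rows matrices using the row-times-column rule."""
    n = len(A[0])
    if n != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


M = [[3, 1, 5], [8, 1, 2], [2, 1, 2]]
print(matmul(M, identity(3)) == M)  # True
print(matmul(identity(3), M) == M)  # True

# A non-square 2x3 matrix: I_3 on the right, I_2 on the left.
A = [[3, 6, 5], [5, 9, 1]]
print(matmul(A, identity(3)) == A)  # True
print(matmul(identity(2), A) == A)  # True
```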
Theorem.

Matrix product as a column operation

Let $\boldsymbol{A}$ be any $m\times{n}$ matrix and $\boldsymbol{B}$ be any $n\times{r}$ matrix represented below:

$$\boldsymbol{B}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{b}_1&\boldsymbol{b}_2&\cdots&\boldsymbol{b}_r\\ \vert&\vert&\cdots&\vert \end{pmatrix}$$

Here, the columns of $\boldsymbol{B}$ are represented as vectors $\boldsymbol{b}_1$, $\boldsymbol{b}_2$, $\cdots$, $\boldsymbol{b}_r$.

The product $\boldsymbol{AB}$ is:

$$\boldsymbol{AB}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{A}\boldsymbol{b}_1&\boldsymbol{A}\boldsymbol{b}_2&\cdots&\boldsymbol{A}\boldsymbol{b}_r\\ \vert&\vert&\cdots&\vert \end{pmatrix}$$

Note that $\boldsymbol{A}\boldsymbol{b}_1$ is a vector.

Proof. Let matrices $\boldsymbol{A}$ and $\boldsymbol{B}$ be represented as follows:

$$\boldsymbol{A}=\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\smash\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{pmatrix},\;\;\;\;\;\; \boldsymbol{B}= \begin{pmatrix} b_{11}&b_{12}&\cdots&b_{1r}\\ b_{21}&b_{22}&\cdots&b_{2r}\\ \vdots&\vdots&\smash\ddots&\vdots\\ b_{n1}&b_{n2}&\cdots&b_{nr} \end{pmatrix}$$

The first column of the matrix product $\boldsymbol{AB}$ is:

$$\begin{align*} \begin{pmatrix} a_{11}b_{11}+a_{12}b_{21}+\cdots+a_{1n}b_{n1}\\ a_{21}b_{11}+a_{22}b_{21}+\cdots+a_{2n}b_{n1}\\ \vdots\\ a_{m1}b_{11}+a_{m2}b_{21}+\cdots+a_{mn}b_{n1}\\ \end{pmatrix}&= \begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn}\\ \end{pmatrix} \begin{pmatrix} b_{11}\\b_{21}\\\vdots\\b_{n1} \end{pmatrix} =\boldsymbol{A}\boldsymbol{b}_1 \end{align*}$$

The second column of matrix product $\boldsymbol{AB}$ is:

$$\begin{align*} \begin{pmatrix} a_{11}b_{12}+a_{12}b_{22}+\cdots+a_{1n}b_{n2}\\ a_{21}b_{12}+a_{22}b_{22}+\cdots+a_{2n}b_{n2}\\ \vdots\\ a_{m1}b_{12}+a_{m2}b_{22}+\cdots+a_{mn}b_{n2}\\ \end{pmatrix}&= \begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn}\\ \end{pmatrix} \begin{pmatrix} b_{12}\\b_{22}\\\vdots\\b_{n2} \end{pmatrix} =\boldsymbol{A}\boldsymbol{b}_2 \end{align*}$$

We can therefore infer that the columns of $\boldsymbol{AB}$ are:

$$\boldsymbol{AB}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{A}\boldsymbol{b}_1&\boldsymbol{A}\boldsymbol{b}_2&\cdots&\boldsymbol{A}\boldsymbol{b}_r\\ \vert&\vert&\cdots&\vert \end{pmatrix}$$

This completes the proof.

Example.

Computing matrix product column by column

Consider the following matrices:

$$\boldsymbol{A}=\begin{pmatrix} 4&3\\ 1&2 \end{pmatrix},\;\;\;\;\;\;\boldsymbol{B}= \begin{pmatrix} 0&3\\5&5 \end{pmatrix}$$

Find $\boldsymbol{AB}$ using the theorem above, that is, by treating the matrix product as a column operation.

Solution. The first column of $\boldsymbol{AB}$ is:

$$\begin{pmatrix} 4&3\\1&2 \end{pmatrix} \begin{pmatrix}0\\5\end{pmatrix} = \begin{pmatrix}15\\10\end{pmatrix}$$

The second column of $\boldsymbol{AB}$ is:

$$\begin{pmatrix} 4&3\\1&2 \end{pmatrix} \begin{pmatrix}3\\5\end{pmatrix} = \begin{pmatrix}27\\13\end{pmatrix}$$

Therefore, $\boldsymbol{AB}$ is:

$$\boldsymbol{AB}=\begin{pmatrix} 15&27\\ 10&13 \end{pmatrix}$$
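The column-by-column computation can be sketched in plain Python. The helper names `matvec`, `column` and `matmul_by_columns` are hypothetical, and matrices are stored as lists of rows, so the computed columns are rearranged into rows at the end.

```python
def matvec(A, x):
    """Return the matrix-vector product Ax as a list."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def column(M, j):
    """Return the j-th column of M (0-based) as a list."""
    return [row[j] for row in M]


def matmul_by_columns(A, B):
    """Build AB one column at a time: the j-th column of AB is A times b_j."""
    columns = [matvec(A, column(B, j)) for j in range(len(B[0]))]
    # Rearrange the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


A = [[4, 3], [1, 2]]
B = [[0, 3], [5, 5]]

print(matvec(A, column(B, 0)))  # [15, 10]
print(matvec(A, column(B, 1)))  # [27, 13]
print(matmul_by_columns(A, B))  # [[15, 27], [10, 13]]
```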
Theorem.

Expressing a sum of scalar-vector products using a matrix-vector product

Consider the following sum:

$$x_1\boldsymbol{a}_1+ x_2\boldsymbol{a}_2+ \cdots+ x_n\boldsymbol{a}_n$$

Where $x_i\in\mathbb{R}$ and $\boldsymbol{a}_i\in\mathbb{R}^m$ for $i=1,2,\cdots,n$. This can be expressed as a matrix-vector product:

$$\begin{equation}\label{eq:y0B5oUWXjxeqQX1HUIN} x_1\boldsymbol{a}_1+ x_2\boldsymbol{a}_2+ \cdots+ x_n\boldsymbol{a}_n= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\cdots&\boldsymbol{a}_n\\ \vert&\vert&\cdots&\vert \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n\\ \end{pmatrix}= \boldsymbol{A}\boldsymbol{x} \end{equation}$$

Here, $\boldsymbol{A}$ is a matrix whose columns are composed of vectors $\boldsymbol{a}_i$.

Proof. Suppose matrix $\boldsymbol{A}\in\mathbb{R}^{m\times{n}}$ and $\boldsymbol{x}\in\mathbb{R}^{n}$ are as follows:

$$\boldsymbol{A}=\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{pmatrix},\;\;\;\;\;\; \boldsymbol{x}= \begin{pmatrix} x_1\\x_2\\\vdots\\x_n \end{pmatrix}$$

Taking their product:

$$\begin{align*} \boldsymbol{A}\boldsymbol{x}&=\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{pmatrix} \begin{pmatrix} x_1\\x_2\\\vdots\\x_n \end{pmatrix}\\ &=\begin{pmatrix} a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n\\ a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n\\ \vdots\\ a_{m1}x_1+a_{m2}x_2+\cdots+a_{mn}x_n \end{pmatrix}\\ &=x_1\begin{pmatrix} a_{11}\\a_{21}\\\vdots\\a_{m1} \end{pmatrix}+x_2\begin{pmatrix} a_{12}\\a_{22}\\\vdots\\a_{m2} \end{pmatrix} +\cdots +x_n\begin{pmatrix} a_{1n}\\a_{2n}\\\vdots\\a_{mn} \end{pmatrix}\\ &=x_1\boldsymbol{a}_1+x_2\boldsymbol{a}_2+\cdots+x_n\boldsymbol{a}_n \end{align*}$$

Where $\boldsymbol{a}_1$, $\boldsymbol{a}_2$, $\cdots$, $\boldsymbol{a}_n$ are the columns of matrix $\boldsymbol{A}$. This completes the proof.

Example.

Expressing a sum of three vectors

Express the following sum as a matrix-vector product:

$$2\boldsymbol{x}_1+ 3\boldsymbol{x}_2+ 5\boldsymbol{x}_3$$

Where $\boldsymbol{x}_1$, $\boldsymbol{x}_2$ and $\boldsymbol{x}_3$ are the following vectors:

$$\boldsymbol{x}_1=\begin{pmatrix} 4\\3\\2 \end{pmatrix},\;\;\;\; \boldsymbol{x}_2=\begin{pmatrix} 1\\8\\7 \end{pmatrix},\;\;\;\; \boldsymbol{x}_3=\begin{pmatrix} 6\\4\\2 \end{pmatrix}$$

Solution. By the theorem above, we have that:

$$\begin{align*} 2\boldsymbol{x}_1+ 3\boldsymbol{x}_2+ 5\boldsymbol{x}_3\;\;{\color{blue}=}\; \begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\boldsymbol{x}_3\\ \vert&\vert&\vert \end{pmatrix} \begin{pmatrix} 2\\3\\5 \end{pmatrix}\;{\color{blue}=}\; \begin{pmatrix} 4&1&6\\3&8&4\\2&7&2\\ \end{pmatrix} \begin{pmatrix} 2\\3\\5 \end{pmatrix} \end{align*}$$
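To check that both sides really agree, here is a plain-Python sketch that evaluates the sum directly and then as a matrix-vector product. The helper names `linear_combination`, `columns_to_matrix` and `matvec` are our own.

```python
def linear_combination(coefficients, vectors):
    """Return c_1 * v_1 + c_2 * v_2 + ... computed entry by entry."""
    m = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coefficients, vectors)) for i in range(m)]


def columns_to_matrix(vectors):
    """Build a list-of-rows matrix whose columns are the given vectors."""
    return [list(row) for row in zip(*vectors)]


def matvec(A, x):
    """Return the matrix-vector product Ax as a list."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


x1 = [4, 3, 2]
x2 = [1, 8, 7]
x3 = [6, 4, 2]
coefficients = [2, 3, 5]

X = columns_to_matrix([x1, x2, x3])
print(X)                                              # [[4, 1, 6], [3, 8, 4], [2, 7, 2]]
print(linear_combination(coefficients, [x1, x2, x3])) # [41, 50, 35]
print(matvec(X, coefficients))                        # [41, 50, 35]
```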
Theorem.

Important properties of matrix-vector products

If $\boldsymbol{A}$ and $\boldsymbol{B}$ are $m\times{n}$ matrices, $\boldsymbol{v}$ and $\boldsymbol{w}$ are vectors in $\mathbb{R}^n$, and $c$ is a scalar in $\mathbb{R}$, then:

  1. $\boldsymbol{A}(\boldsymbol{v}+\boldsymbol{w})= \boldsymbol{A}\boldsymbol{v}+\boldsymbol{A}\boldsymbol{w}$.

  2. $\boldsymbol{A}(c\boldsymbol{v}) =c(\boldsymbol{A}\boldsymbol{v})$.

  3. $(\boldsymbol{A}+\boldsymbol{B})\boldsymbol{v}= \boldsymbol{A}\boldsymbol{v}+\boldsymbol{B}\boldsymbol{v}$.

Proof. We will prove these properties for the simple case when $n=3$ but the proofs can easily be generalized. Suppose we have matrices $\boldsymbol{A},\boldsymbol{B} \in\mathbb{R}^{m\times3}$ and vectors $\boldsymbol{v},\boldsymbol{w}\in\mathbb{R}^3$ like so:

$$\boldsymbol{v}= \begin{pmatrix} v_1\\v_2\\v_3\end{pmatrix},\;\;\;\;\;\; \boldsymbol{w}=\begin{pmatrix} w_1\\w_2\\w_3 \end{pmatrix},\;\;\;\;\;\; \boldsymbol{A}=\begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert\end{pmatrix},\;\;\;\;\;\; \boldsymbol{B}=\begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{b}_1&\boldsymbol{b}_2&\boldsymbol{b}_3\\ \vert&\vert&\vert\end{pmatrix}$$

We start by proving the first property:

$$\begin{align*} \boldsymbol{A}(\boldsymbol{v}+\boldsymbol{w})&= \begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert\\ \end{pmatrix}\Big[ \begin{pmatrix} v_1\\v_2\\v_3 \end{pmatrix}+ \begin{pmatrix} w_1\\w_2\\w_3 \end{pmatrix}\Big]\\ &= \begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert\\ \end{pmatrix}\begin{pmatrix} v_1+w_1\\v_2+w_2\\v_3+w_3 \end{pmatrix}\\ &=(v_1+w_1)\boldsymbol{a}_1+(v_2+w_2)\boldsymbol{a}_2+(v_3+w_3)\boldsymbol{a}_3\\ &=v_1\boldsymbol{a}_1+w_1\boldsymbol{a}_1+v_2\boldsymbol{a}_2+w_2\boldsymbol{a}_2+v_3\boldsymbol{a}_3+w_3\boldsymbol{a}_3\\ &=(v_1\boldsymbol{a}_1+v_2\boldsymbol{a}_2+v_3\boldsymbol{a}_3)+(w_1\boldsymbol{a}_1+w_2\boldsymbol{a}_2+w_3\boldsymbol{a}_3)\\ &=\begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert \end{pmatrix}\begin{pmatrix} v_1\\v_2\\v_3 \end{pmatrix}+ \begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert \end{pmatrix}\begin{pmatrix} w_1\\w_2\\w_3 \end{pmatrix}\\ &=\boldsymbol{Av}+\boldsymbol{Aw} \end{align*}$$

Next, we prove the second property:

$$\begin{align*} \boldsymbol{A}(c\boldsymbol{v}) &=\boldsymbol{A}\Big[c\begin{pmatrix} v_1\\v_2\\v_3 \end{pmatrix}\Big]\\ &=\boldsymbol{A}\begin{pmatrix} cv_1\\cv_2\\cv_3\end{pmatrix}\\ &=\begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert\\\end{pmatrix}\begin{pmatrix}cv_1\\cv_2\\cv_3\end{pmatrix}\\ &=cv_1\boldsymbol{a}_1+cv_2\boldsymbol{a}_2+cv_3\boldsymbol{a}_3\\ &=c(v_1\boldsymbol{a}_1+v_2\boldsymbol{a}_2+v_3\boldsymbol{a}_3)\\&=c\Big[\begin{pmatrix} \vert&\vert&\vert\\\boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert\\\end{pmatrix}\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}\Big]\\ &=c\boldsymbol{A}\boldsymbol{v} \end{align*}$$

Finally, we prove the last property:

$$\begin{align*} (\boldsymbol{A}+\boldsymbol{B})\boldsymbol{v} &=\Big[\begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert\end{pmatrix}+ \begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{b}_1&\boldsymbol{b}_2&\boldsymbol{b}_3\\ \vert&\vert&\vert\end{pmatrix}\Big]\boldsymbol{v}\\ &=\begin{pmatrix} \vert&\vert&\vert\\ \boldsymbol{a}_1+\boldsymbol{b}_1&\boldsymbol{a}_2+\boldsymbol{b}_2&\boldsymbol{a}_3+\boldsymbol{b}_3\\ \vert&\vert&\vert\end{pmatrix}\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}\\ &=(\boldsymbol{a}_1+\boldsymbol{b}_1)v_1+ (\boldsymbol{a}_2+\boldsymbol{b}_2)v_2+ (\boldsymbol{a}_3+\boldsymbol{b}_3)v_3\\ &=v_1\boldsymbol{a}_1+v_1\boldsymbol{b}_1+v_2\boldsymbol{a}_2+v_2\boldsymbol{b}_2+v_3\boldsymbol{a}_3+v_3\boldsymbol{b}_3\\ &=(v_1\boldsymbol{a}_1+v_2\boldsymbol{a}_2+v_3\boldsymbol{a}_3)+(v_1\boldsymbol{b}_1+v_2\boldsymbol{b}_2+v_3\boldsymbol{b}_3)\\ &=\begin{pmatrix}\vert&\vert&\vert\\\boldsymbol{a}_1&\boldsymbol{a}_2&\boldsymbol{a}_3\\ \vert&\vert&\vert\end{pmatrix}\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}+ \begin{pmatrix}\vert&\vert&\vert\\\boldsymbol{b}_1&\boldsymbol{b}_2&\boldsymbol{b}_3\\ \vert&\vert&\vert\end{pmatrix}\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}\\ &=\boldsymbol{Av}+\boldsymbol{Bv} \end{align*}$$

This completes the proofs of the three properties.
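As a sanity check, the three properties can be verified numerically on one concrete case. The matrices, vectors and scalar below are arbitrary values chosen for this sketch, and the helper names are our own. A numerical check like this illustrates the properties but does not replace the proofs.

```python
def matvec(A, x):
    """Return the matrix-vector product Ax as a list."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def vec_add(v, w):
    """Add two vectors entry by entry."""
    return [a + b for a, b in zip(v, w)]


def vec_scale(c, v):
    """Multiply every entry of the vector v by the scalar c."""
    return [c * a for a in v]


def mat_add(A, B):
    """Add two matrices of the same shape entry by entry."""
    return [vec_add(row_a, row_b) for row_a, row_b in zip(A, B)]


# Arbitrary 2x3 matrices, vectors in R^3 and a scalar.
A = [[1, 2, 3], [4, 5, 6]]
B = [[0, 1, 0], [2, 0, 1]]
v = [1, 2, 3]
w = [4, 0, -1]
c = 3

# Property 1: A(v + w) = Av + Aw
print(matvec(A, vec_add(v, w)) == vec_add(matvec(A, v), matvec(A, w)))  # True

# Property 2: A(cv) = c(Av)
print(matvec(A, vec_scale(c, v)) == vec_scale(c, matvec(A, v)))         # True

# Property 3: (A + B)v = Av + Bv
print(matvec(mat_add(A, B), v) == vec_add(matvec(A, v), matvec(B, v)))  # True
```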

Published by Isshin Inada