Comprehensive Guide on Orthogonal Matrices in Linear Algebra

Last updated: Aug 12, 2023

Tags: Linear Algebra
Definition.

Orthogonal matrices

An $n\times{n}$ matrix $\boldsymbol{Q}$ is said to be orthogonal if $\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{I}_n$ where $\boldsymbol{I}_n$ is the $n\times{n}$ identity matrix.

Example.

Checking if a matrix is an orthogonal matrix

Consider the following matrix:

$$\boldsymbol{A}=\begin{pmatrix} 0&1\\ -1&0\\ \end{pmatrix}$$

Show that $\boldsymbol{A}$ is orthogonal.

Solution. To check whether matrix $\boldsymbol{A}$ is orthogonal, use the definition directly:

$$\begin{align*} \boldsymbol{A}^T\boldsymbol{A} &=\begin{pmatrix} 0&-1\\ 1&0\\ \end{pmatrix} \begin{pmatrix} 0&1\\ -1&0\\ \end{pmatrix}\\ &=\begin{pmatrix} 1&0\\ 0&1\\ \end{pmatrix} \end{align*}$$

Since $\boldsymbol{A}^T\boldsymbol{A}=\boldsymbol{I}_2$, we have that $\boldsymbol{A}$ is an orthogonal matrix.
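
For readers who want to check this numerically, here is a minimal NumPy sketch (the matrix and the use of `np.allclose` are only illustrative) that verifies $\boldsymbol{A}^T\boldsymbol{A}=\boldsymbol{I}_2$:

```python
import numpy as np

# The matrix from the example above
A = np.array([[0, 1],
              [-1, 0]])

# An n x n matrix is orthogonal when A^T A equals the identity matrix
product = A.T @ A
print(product)
# [[1 0]
#  [0 1]]

print(np.allclose(product, np.eye(2)))  # True
```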

Theorem.

Transpose of an orthogonal matrix is equal to its inverse

If $\boldsymbol{Q}$ is an orthogonal matrix, then $\boldsymbol{Q}^T$ is the inverse of $\boldsymbol{Q}$, that is:

$$\boldsymbol{Q}^T=\boldsymbol{Q}^{-1}$$

Proof. By definition of orthogonal matrices, we have that:

$$\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{I}_n$$

By the definition of matrix inverses, if the product of two matrices $\boldsymbol{A}$ and $\boldsymbol{B}$ results in the identity matrix, then $\boldsymbol{A}$ is an inverse of $\boldsymbol{B}$ and vice versa. In this case, we have that $\boldsymbol{Q}^T$ must be the inverse of $\boldsymbol{Q}$. This completes the proof.
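
As a quick numerical illustration of this theorem, the following sketch (using the matrix from the earlier example as an assumed input) confirms that the transpose and the inverse coincide:

```python
import numpy as np

Q = np.array([[0, 1],
              [-1, 0]])

# For an orthogonal matrix, the transpose coincides with the inverse
print(np.allclose(Q.T, np.linalg.inv(Q)))  # True
```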

Theorem.

Orthogonal matrices are invertible

If $\boldsymbol{Q}$ is an orthogonal matrix, then $\boldsymbol{Q}$ is invertible.

Proof. If $\boldsymbol{Q}$ is an orthogonal matrix, then $\boldsymbol{Q}^{-1}=\boldsymbol{Q}^T$ by the previous theorem. Since the transpose of a matrix always exists, $\boldsymbol{Q}^{-1}$ always exists. This means that $\boldsymbol{Q}$ is invertible by definition. This completes the proof.

Theorem.

Equivalent definition of orthogonal matrices

If $\boldsymbol{Q}$ is an $n\times{n}$ orthogonal matrix, then:

$$\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{Q}\boldsymbol{Q}^T =\boldsymbol{I}_n$$

Proof. Because $\boldsymbol{Q}^T=\boldsymbol{Q}^{-1}$, we have that:

$$\begin{align*} \boldsymbol{Q}^T\boldsymbol{Q}&=\boldsymbol{Q}^{-1}\boldsymbol{Q}\\ &=\boldsymbol{Q}\boldsymbol{Q}^{-1}\\ &=\boldsymbol{Q}\boldsymbol{Q}^{T}\\ \end{align*}$$

This completes the proof.

Theorem.

Transpose of an orthogonal matrix is also orthogonal

Matrix $\boldsymbol{Q}$ is orthogonal if and only if $\boldsymbol{Q}^T$ is orthogonal.

Proof. We first prove the forward proposition. Assume matrix $\boldsymbol{Q}$ is orthogonal. We know from the previous theorem that:

$$\begin{equation}\label{eq:mRczyO42IwvNUv4Luv9} \boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{Q}\boldsymbol{Q}^T=\boldsymbol{I}_n \end{equation}$$

Now, taking the transpose of a matrix twice results in the matrix itself, that is:

$$(\boldsymbol{Q}^T)^T= \boldsymbol{Q}$$

Substitute this expression for $\boldsymbol{Q}$ into $\boldsymbol{Q}\boldsymbol{Q}^T$ in \eqref{eq:mRczyO42IwvNUv4Luv9} to get:

$$(\boldsymbol{Q}^T)^T \boldsymbol{Q}^T=\boldsymbol{I}_n$$

Here, $(\boldsymbol{Q}^T)^T$ is the transpose of $\boldsymbol{Q}^T$, and their product results in the identity matrix. This means that $\boldsymbol{Q}^T$ must be orthogonal by definition.

We now prove the converse. Assume $\boldsymbol{Q}^T$ is an orthogonal matrix. We have just proven that taking a transpose of an orthogonal matrix results in an orthogonal matrix. The transpose of $\boldsymbol{Q}^T$ is $\boldsymbol{Q}$, which means that $\boldsymbol{Q}$ is orthogonal.

This completes the proof.

Theorem.

Inverse of orthogonal matrix is also orthogonal

Matrix $\boldsymbol{Q}$ is orthogonal if and only if $\boldsymbol{Q}^{-1}$ is orthogonal.

Proof. We first prove the forward proposition. We assume matrix $\boldsymbol{Q}$ is orthogonal. By the previous theorem, if $\boldsymbol{Q}$ is orthogonal, then $\boldsymbol{Q}^T$ is orthogonal. Because $\boldsymbol{Q}^T=\boldsymbol{Q}^{-1}$, we have that $\boldsymbol{Q}^{-1}$ is orthogonal.

We now prove the converse. Assume $\boldsymbol{Q}^{-1}$ is orthogonal. We have just proven that taking the inverse of an orthogonal matrix results in an orthogonal matrix. The inverse of $\boldsymbol{Q}^{-1}$ is $\boldsymbol{Q}$, which means that $\boldsymbol{Q}$ is orthogonal. This completes the proof.
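
A small numerical check of this theorem might look as follows; the specific matrix is just an assumed example:

```python
import numpy as np

Q = np.array([[0, 1],
              [-1, 0]])   # an orthogonal matrix

Q_inv = np.linalg.inv(Q)

# The inverse should itself satisfy the definition of orthogonality
print(np.allclose(Q_inv.T @ Q_inv, np.eye(2)))  # True
```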

Theorem.

Row vectors of an orthogonal matrix form an orthonormal basis

An $n\times{n}$ matrix $\boldsymbol{Q}$ is orthogonal if and only if the row vectors of $\boldsymbol{Q}$ form an orthonormal basis of $\mathbb{R}^n$. This means that every row vector of $\boldsymbol{Q}$ is a unit vector that is perpendicular to every other row vector of $\boldsymbol{Q}$.

Proof. Let the $i$-th row of the orthogonal matrix $\boldsymbol{Q}$ be represented by a row vector $\boldsymbol{r}_i$. The product $\boldsymbol{Q}\boldsymbol{Q}^T$ would therefore be:

$$\begin{equation}\label{eq:VGberAmNvO8jSBbFXKI} \begin{aligned}[b] \boldsymbol{Q}\boldsymbol{Q}^T&= \begin{pmatrix} -&\boldsymbol{r}_1&-\\ -&\boldsymbol{r}_2&-\\ \vdots&\vdots&\vdots\\ -&\boldsymbol{r}_n&- \end{pmatrix} \begin{pmatrix} \vert&\vert&\dots&\vert\\ \boldsymbol{r}_1& \boldsymbol{r}_2& \dots& \boldsymbol{r}_n\\ \vert&\vert&\dots&\vert\\ \end{pmatrix}\\ &= \begin{pmatrix} \boldsymbol{r}_1\cdot\boldsymbol{r}_1&\boldsymbol{r}_1\cdot\boldsymbol{r}_2&\dots&\boldsymbol{r}_1\cdot\boldsymbol{r}_n\\ \boldsymbol{r}_2\cdot\boldsymbol{r}_1&\boldsymbol{r}_2\cdot\boldsymbol{r}_2&\dots&\boldsymbol{r}_2\cdot\boldsymbol{r}_n\\ \vdots&\vdots&\ddots&\vdots\\ \boldsymbol{r}_n\cdot\boldsymbol{r}_1&\boldsymbol{r}_n\cdot\boldsymbol{r}_2&\dots&\boldsymbol{r}_n\cdot\boldsymbol{r}_n\\ \end{pmatrix} \end{aligned} \end{equation}$$

Since $\boldsymbol{Q}$ is orthogonal, we know that \eqref{eq:VGberAmNvO8jSBbFXKI} is equal to the identity matrix. This means that the diagonal entries $\boldsymbol{r}_i\cdot\boldsymbol{r}_i$ must equal $1$ while the off-diagonal entries $\boldsymbol{r}_i\cdot\boldsymbol{r}_j$ (with $i\ne{j}$) must equal $0$. Mathematically, we can write this as:

$$\begin{equation}\label{eq:hrOax8UeTHrPQL7iQ6A} \begin{aligned}[b] \boldsymbol{r}_i\cdot\boldsymbol{r}_i&=1\\ \boldsymbol{r}_i\cdot\boldsymbol{r}_j&=0\;\;\;\text{when}\;i\ne{j} \end{aligned} \end{equation}$$

By the relationship between the dot product and the norm, the first equation can be written as:

$$\boldsymbol{r}_i\cdot \boldsymbol{r}_i=1 \;\;\;\;\;\;\;\;\; \Longleftrightarrow \;\;\;\;\;\;\;\;\; \Vert\boldsymbol{r}_i\Vert^2=1$$

Because magnitudes cannot be negative, we have that $\Vert\boldsymbol{r}_i\Vert=1$, that is, every row vector is a unit vector!

Next, since the dot product of $\boldsymbol{r}_i$ and $\boldsymbol{r}_j$ is equal to zero when $i\ne{j}$, any pair of rows $\boldsymbol{r}_i$ and $\boldsymbol{r}_j$ is perpendicular by definition. This means that the row vectors form an orthonormal set that spans $\mathbb{R}^n$, and thus the row vectors form an orthonormal basis of $\mathbb{R}^n$.

We now prove the converse, that is, if the row vectors of $\boldsymbol{Q}$ form an orthonormal basis of $\mathbb{R}^n$, then $\boldsymbol{Q}$ is orthogonal. The proof is very similar: let $\boldsymbol{Q}$ be defined as:

$$\boldsymbol{Q}= \begin{pmatrix} -&\boldsymbol{r}_1&-\\ -&\boldsymbol{r}_2&-\\ \vdots&\vdots&\vdots\\ -&\boldsymbol{r}_n&- \end{pmatrix}$$

Where \eqref{eq:hrOax8UeTHrPQL7iQ6A} holds. $\boldsymbol{QQ}^T$ is:

$$\begin{equation}\label{eq:bzEtTaIyRjdUGhIi7zD} \begin{aligned}[b] \boldsymbol{Q}\boldsymbol{Q}^T&= \begin{pmatrix} -&\boldsymbol{r}_1&-\\ -&\boldsymbol{r}_2&-\\ \vdots&\vdots&\vdots\\ -&\boldsymbol{r}_n&- \end{pmatrix} \begin{pmatrix} \vert&\vert&\dots&\vert\\ \boldsymbol{r}_1& \boldsymbol{r}_2& \dots& \boldsymbol{r}_n\\ \vert&\vert&\dots&\vert\\ \end{pmatrix}\\ &= \begin{pmatrix} \boldsymbol{r}_1\cdot\boldsymbol{r}_1&\boldsymbol{r}_1\cdot\boldsymbol{r}_2&\dots&\boldsymbol{r}_1\cdot\boldsymbol{r}_n\\ \boldsymbol{r}_2\cdot\boldsymbol{r}_1&\boldsymbol{r}_2\cdot\boldsymbol{r}_2&\dots&\boldsymbol{r}_2\cdot\boldsymbol{r}_n\\ \vdots&\vdots&\smash\ddots&\vdots\\ \boldsymbol{r}_n\cdot\boldsymbol{r}_1&\boldsymbol{r}_n\cdot\boldsymbol{r}_2&\dots&\boldsymbol{r}_n\cdot\boldsymbol{r}_n\\ \end{pmatrix}\\ &=\begin{pmatrix} 1&0&\cdots&0\\ 0&1&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&1\\ \end{pmatrix}\\ &=\boldsymbol{I}_n \end{aligned} \end{equation}$$

Because $\boldsymbol{QQ}^T=\boldsymbol{I}_n$, we have that $\boldsymbol{Q}$ is orthogonal by definition. This completes the proof.
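
To see this property in action, here is a short sketch (the rotation matrix below is an assumed example of an orthogonal matrix) confirming that the rows are unit vectors and mutually perpendicular:

```python
import numpy as np

theta = np.pi / 6  # an arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Every row should have length 1
print(np.linalg.norm(Q, axis=1))         # [1. 1.]

# Distinct rows should be perpendicular (dot product of 0)
print(np.isclose(Q[0] @ Q[1], 0.0))      # True

# Equivalently, Q Q^T should be the identity matrix
print(np.allclose(Q @ Q.T, np.eye(2)))   # True
```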

Theorem.

Column vectors of an orthogonal matrix form an orthonormal basis

An $n\times{n}$ matrix $\boldsymbol{Q}$ is orthogonal if and only if the column vectors of $\boldsymbol{Q}$ form an orthonormal basis of $\mathbb{R}^n$. This means that every column vector of $\boldsymbol{Q}$ is a unit vector that is perpendicular to every other column vector of $\boldsymbol{Q}$.

Proof. The proof is nearly identical to that of the previous theorem, except that we represent matrix $\boldsymbol{Q}$ using column vectors $\boldsymbol{c}_i$ instead of row vectors:

$$\begin{equation}\label{eq:Rmy9J1pebd6tRjfi5vt} \begin{aligned}[b] \boldsymbol{Q}^T\boldsymbol{Q}&= \begin{pmatrix} -&\boldsymbol{c}_1&-\\ -&\boldsymbol{c}_2&-\\ \vdots&\vdots&\vdots\\ -&\boldsymbol{c}_n&- \end{pmatrix} \begin{pmatrix} \vert&\vert&\dots&\vert\\ \boldsymbol{c}_1&\boldsymbol{c}_2&\dots&\boldsymbol{c}_n\\ \vert&\vert&\dots&\vert\\ \end{pmatrix}\\ &= \begin{pmatrix} \boldsymbol{c}_1\cdot\boldsymbol{c}_1&\boldsymbol{c}_1\cdot\boldsymbol{c}_2&\dots&\boldsymbol{c}_1\cdot\boldsymbol{c}_n\\ \boldsymbol{c}_2\cdot\boldsymbol{c}_1&\boldsymbol{c}_2\cdot\boldsymbol{c}_2&\dots&\boldsymbol{c}_2\cdot\boldsymbol{c}_n\\ \vdots&\vdots&\smash\ddots&\vdots\\ \boldsymbol{c}_n\cdot\boldsymbol{c}_1&\boldsymbol{c}_n\cdot\boldsymbol{c}_2&\dots&\boldsymbol{c}_n\cdot\boldsymbol{c}_n\\ \end{pmatrix} \end{aligned} \end{equation}$$

By definition of orthogonal matrices, we have that $\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{I}_n$. This means that:

$$\begin{equation}\label{eq:c5LLrknfeKg1oV66aEB} \begin{aligned}[b] \boldsymbol{c}_i\cdot\boldsymbol{c}_i&=1\\ \boldsymbol{c}_i\cdot\boldsymbol{c}_j&=0\;\;\;\text{when}\;i\ne{j} \end{aligned} \end{equation}$$

The first equation implies $\Vert\boldsymbol{c}_i\Vert=1$, which means every column vector is a unit vector.

Next, by the definition of the dot product, every pair of vectors $\boldsymbol{c}_i$ and $\boldsymbol{c}_j$ with $i\ne{j}$ must be orthogonal. This means that the column vectors of $\boldsymbol{Q}$ form an orthonormal set that spans $\mathbb{R}^n$. Therefore, the column vectors of $\boldsymbol{Q}$ form an orthonormal basis of $\mathbb{R}^n$.

We now prove the converse. Let $\boldsymbol{Q}$ be defined like so:

$$\boldsymbol{Q}= \begin{pmatrix} \vert&\vert&\dots&\vert\\ \boldsymbol{c}_1&\boldsymbol{c}_2&\dots&\boldsymbol{c}_n\\ \vert&\vert&\dots&\vert\\ \end{pmatrix}$$

Where \eqref{eq:c5LLrknfeKg1oV66aEB} holds. Now, $\boldsymbol{Q}^T\boldsymbol{Q}$ is:

$$\begin{align*} \boldsymbol{Q}^T\boldsymbol{Q}&= \begin{pmatrix} -&\boldsymbol{c}_1&-\\ -&\boldsymbol{c}_2&-\\ \vdots&\vdots&\vdots\\ -&\boldsymbol{c}_n&- \end{pmatrix} \begin{pmatrix} \vert&\vert&\dots&\vert\\ \boldsymbol{c}_1&\boldsymbol{c}_2&\dots&\boldsymbol{c}_n\\ \vert&\vert&\dots&\vert\\ \end{pmatrix}\\ &= \begin{pmatrix} \boldsymbol{c}_1\cdot\boldsymbol{c}_1&\boldsymbol{c}_1\cdot\boldsymbol{c}_2&\dots&\boldsymbol{c}_1\cdot\boldsymbol{c}_n\\ \boldsymbol{c}_2\cdot\boldsymbol{c}_1&\boldsymbol{c}_2\cdot\boldsymbol{c}_2&\dots&\boldsymbol{c}_2\cdot\boldsymbol{c}_n\\ \vdots&\vdots&\smash\ddots&\vdots\\ \boldsymbol{c}_n\cdot\boldsymbol{c}_1&\boldsymbol{c}_n\cdot\boldsymbol{c}_2&\dots&\boldsymbol{c}_n\cdot\boldsymbol{c}_n\\ \end{pmatrix}\\ &= \begin{pmatrix} 1&0&\cdots&0\\ 0&1&\cdots&0\\ 0&0&\smash\ddots&0\\ 0&0&\cdots&1 \end{pmatrix}\\ &=\boldsymbol{I}_n \end{align*}$$

Because $\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{I}_n$, we have that $\boldsymbol{Q}$ is orthogonal by definition. This completes the proof.

Theorem.

Product of orthogonal matrices is also orthogonal

If $\boldsymbol{Q}$ and $\boldsymbol{R}$ are any $n\times{n}$ orthogonal matrices, then their product $\boldsymbol{Q}\boldsymbol{R}$ is also an $n\times{n}$ orthogonal matrix.

Proof. From the definition of orthogonal matrices, we know that:

$$\begin{align*} \boldsymbol{Q}\boldsymbol{Q}^T=\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{I}_n\\ \boldsymbol{R}\boldsymbol{R}^T=\boldsymbol{R}^T\boldsymbol{R}=\boldsymbol{I}_n \end{align*}$$

Now, let's use the definition of orthogonal matrices once again to check if $\boldsymbol{Q}\boldsymbol{R}$ is orthogonal:

$$\begin{align*} (\boldsymbol{Q}\boldsymbol{R})^T (\boldsymbol{Q}\boldsymbol{R}) &= (\boldsymbol{R}^T\boldsymbol{Q}^T) (\boldsymbol{Q}\boldsymbol{R})\\ &= \boldsymbol{R}^T(\boldsymbol{Q}^T \boldsymbol{Q})\boldsymbol{R}\\ &= \boldsymbol{R}^T\boldsymbol{I}_n\boldsymbol{R}\\ &= \boldsymbol{R}^T\boldsymbol{R}\\ &= \boldsymbol{I}_n\\ \end{align*}$$

For the first step, we used the theorem $(\boldsymbol{A}\boldsymbol{B})^T=\boldsymbol{B}^T\boldsymbol{A}^T$. This completes the proof.
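
A quick numerical sanity check of this theorem, using a rotation and a reflection as assumed examples of orthogonal matrices:

```python
import numpy as np

# Two orthogonal matrices: a 90-degree rotation and a reflection
Q = np.array([[0, -1],
              [1,  0]])
R = np.array([[1,  0],
              [0, -1]])

QR = Q @ R

# The product should again satisfy (QR)^T (QR) = I
print(np.allclose(QR.T @ QR, np.eye(2)))  # True
```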

Theorem.

Magnitude of the product of an orthogonal matrix and a vector

If $\boldsymbol{Q}$ is an $n\times{n}$ orthogonal matrix and $\boldsymbol{x}\in{\mathbb{R}}^n$ is any vector, then:

$$\Vert\boldsymbol{Q}\boldsymbol{x}\Vert=\Vert{\boldsymbol{x}}\Vert$$

Proof. The matrix-vector product $\boldsymbol{Qx}$ is itself a vector, and the magnitude of a vector can be written in terms of the dot product like so:

$$\begin{align*} \Vert\boldsymbol{Q}\boldsymbol{x}\Vert &=(\boldsymbol{Q}\boldsymbol{x}\cdot\boldsymbol{Q}\boldsymbol{x})^{1/2}\\ &=(\boldsymbol{x}\cdot\boldsymbol{Q}^T\boldsymbol{Q}\boldsymbol{x})^{1/2}\\ &=(\boldsymbol{x}\cdot\boldsymbol{I}_n\boldsymbol{x})^{1/2}\\ &=(\boldsymbol{x}\cdot\boldsymbol{x})^{1/2}\\ &=(\Vert\boldsymbol{x}\Vert^2)^{1/2}\\ &=\Vert\boldsymbol{x}\Vert \end{align*}$$

To clarify the steps:

  • the second equality uses the theorem $\boldsymbol{A}\boldsymbol{v}\cdot\boldsymbol{w}= \boldsymbol{v}\cdot\boldsymbol{A}^T\boldsymbol{w}$.

  • the second-to-last step uses the theorem $\boldsymbol{x}\cdot\boldsymbol{x}= \Vert\boldsymbol{x}\Vert^2$.

This completes the proof.

Intuition. Recall that a matrix-vector product can be considered as a linear transformation applied to the vector. The fact that $\Vert\boldsymbol{Qx}\Vert=\Vert\boldsymbol{x}\Vert$ means that applying the transformation $\boldsymbol{Q}$ on $\boldsymbol{x}$ preserves the length of $\boldsymbol{x}$.
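
The length-preserving behaviour is easy to confirm numerically; the matrix and vector below are assumed examples:

```python
import numpy as np

Q = np.array([[0, 1],
              [-1, 0]])   # an orthogonal matrix
x = np.array([3.0, 4.0])  # an arbitrary vector

# The transformation Qx should leave the length of x unchanged
print(np.linalg.norm(Q @ x))  # 5.0
print(np.linalg.norm(x))      # 5.0
```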

Theorem.

Orthogonal transformation preserves angle

Let $\boldsymbol{Q}$ be an orthogonal matrix. If $\theta$ represents the angle between vectors $\boldsymbol{v}$ and $\boldsymbol{w}$, then the angle between $\boldsymbol{Qv}$ and $\boldsymbol{Qw}$ is also $\theta$.

Proof. Recall that the dot product of two vectors can be written in terms of the angle between them:

$$\begin{align*} \boldsymbol{v}\cdot\boldsymbol{w}&= \Vert\boldsymbol{v}\Vert\Vert\boldsymbol{w}\Vert\cos(\theta) \end{align*}$$

Where $\theta$ is the angle between $\boldsymbol{v}$ and $\boldsymbol{w}$. Similarly, we have that:

$$\begin{equation}\label{eq:D7KabFz9nSld9XLbo08} \boldsymbol{Qv}\cdot\boldsymbol{Qw}= \Vert\boldsymbol{Qv}\Vert\Vert\boldsymbol{Qw}\Vert\cos(\theta_*) \end{equation}$$

Where $\theta_*$ is the angle between $\boldsymbol{Qv}$ and $\boldsymbol{Qw}$. Our goal is to show that $\theta=\theta_*$.

We start by making $\theta_*$ in \eqref{eq:D7KabFz9nSld9XLbo08} the subject like so:

$$\begin{align*} \theta_*= \arccos\left( \frac {\boldsymbol{Qv}\cdot\boldsymbol{Qw}} {\Vert\boldsymbol{Qv}\Vert\Vert\boldsymbol{Qw}\Vert}\right) \end{align*}$$

From the previous theorem, we have that $\Vert{\boldsymbol{Qv}}\Vert=\Vert\boldsymbol{v}\Vert$ and $\Vert{\boldsymbol{Qw}}\Vert=\Vert\boldsymbol{w}\Vert$, thereby giving us:

$$\begin{align*} \theta_*= \arccos\left( \frac {\boldsymbol{Qv}\cdot\boldsymbol{Qw}} {\Vert\boldsymbol{v}\Vert\Vert\boldsymbol{w}\Vert}\right) \end{align*}$$

We rewrite the dot product in the numerator as a matrix-matrix product:

$$\begin{align*} \theta_*&= \arccos\left( \frac{(\boldsymbol{Qv})^T\boldsymbol{Qw}} {\Vert\boldsymbol{v}\Vert\Vert\boldsymbol{w}\Vert}\right)\\ &=\arccos\left( \frac{\boldsymbol{v}^T\boldsymbol{Q}^T\boldsymbol{Qw}} {\Vert\boldsymbol{v}\Vert\Vert\boldsymbol{w}\Vert}\right)\\ &=\arccos\left( \frac{\boldsymbol{v}^T\boldsymbol{I}\boldsymbol{w}} {\Vert\boldsymbol{v}\Vert\Vert\boldsymbol{w}\Vert}\right)\\ &=\arccos\left( \frac{\boldsymbol{v}^T\boldsymbol{w}} {\Vert\boldsymbol{v}\Vert\Vert\boldsymbol{w}\Vert}\right)\\ &=\arccos\left( \frac{\boldsymbol{v}\cdot\boldsymbol{w}} {\Vert\boldsymbol{v}\Vert\Vert\boldsymbol{w}\Vert}\right)\\ &=\theta \end{align*}$$

Here, in the second step, we used the theorem $(\boldsymbol{AB})^T=\boldsymbol{B}^T\boldsymbol{A}^T$. This completes the proof.

Example.

Visualizing how the length and angle are preserved after an orthogonal transformation

Suppose we have the following vectors:

$$\boldsymbol{v}= \begin{pmatrix} 2\\1 \end{pmatrix},\;\;\;\;\; \boldsymbol{w}= \begin{pmatrix} 1\\3 \end{pmatrix}$$

Suppose we apply the transformations $\boldsymbol{Qv}$ and $\boldsymbol{Qw}$, where $\boldsymbol{Q}$ is an orthogonal matrix defined as:

$$\boldsymbol{Q}=\begin{pmatrix} 0&1\\ -1&0\\ \end{pmatrix}$$

Visually show that the length of each vector and the angle between the two vectors are preserved after the transformation.

Solution. The vectors after the transformation are:

$$\begin{align*} \boldsymbol{Qv}=\begin{pmatrix} 0&1\\ -1&0\\ \end{pmatrix} \begin{pmatrix} 2\\1 \end{pmatrix}= \begin{pmatrix} 1\\-2 \end{pmatrix}\\\boldsymbol{Qw}= \begin{pmatrix} 0&1\\ -1&0\\ \end{pmatrix} \begin{pmatrix} 1\\3 \end{pmatrix}= \begin{pmatrix} 3\\-1 \end{pmatrix} \end{align*}$$

Plotting $\boldsymbol{v}$, $\boldsymbol{w}$, $\boldsymbol{Qv}$ and $\boldsymbol{Qw}$ on the same axes, we observe the following (a numerical check follows the list):

  • the length of the vectors remains unchanged after transformation.

  • the angle between the two vectors also remains unchanged after transformation.
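
These observations can also be verified numerically (the plotting itself is omitted here); the sketch below recomputes the lengths and the cosine of the angle before and after the transformation:

```python
import numpy as np

Q = np.array([[0, 1],
              [-1, 0]])
v = np.array([2.0, 1.0])
w = np.array([1.0, 3.0])

Qv, Qw = Q @ v, Q @ w
print(Qv, Qw)  # [ 1. -2.] [ 3. -1.]

# Lengths are preserved
print(np.linalg.norm(v), np.linalg.norm(Qv))  # 2.236... and 2.236...
print(np.linalg.norm(w), np.linalg.norm(Qw))  # 3.162... and 3.162...

# The angle between the vectors is also preserved
cos_before = (v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))
cos_after = (Qv @ Qw) / (np.linalg.norm(Qv) * np.linalg.norm(Qw))
print(np.isclose(cos_before, cos_after))  # True
```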

Theorem.

Dot product of Qx and Qy where Q is an orthogonal matrix and x, y are vectors

Suppose $\boldsymbol{Q}$ is an $n\times{n}$ orthogonal matrix. For all $\boldsymbol{x}\in{\mathbb{R}}^n$ and $\boldsymbol{y}\in{\mathbb{R}}^n$, we have that:

$$\boldsymbol{Q}\boldsymbol{x}\cdot \boldsymbol{Q}\boldsymbol{y}= \boldsymbol{x}\cdot\boldsymbol{y}$$

Proof. Recall the following identity, which expresses the dot product in terms of vector norms:

$$\begin{equation}\label{eq:bvWCKdz5Oa0Kj2YbNcl} \boldsymbol{x}\cdot{\boldsymbol{y}} =\frac{1}{4}\Vert{\boldsymbol{x}} +\boldsymbol{y}\Vert^2- \frac{1}{4}\Vert{\boldsymbol{x}} -\boldsymbol{y}\Vert^2 \end{equation}$$

Since $\boldsymbol{Qx}$ and $\boldsymbol{Qy}$ are vectors, we can use \eqref{eq:bvWCKdz5Oa0Kj2YbNcl} to get:

$$\begin{align*} \boldsymbol{Q}\boldsymbol{x}\cdot \boldsymbol{Q}\boldsymbol{y}&= \frac{1}{4}\Vert{\boldsymbol{Q}\boldsymbol{x}} +\boldsymbol{Q}\boldsymbol{y}\Vert^2- \frac{1}{4}\Vert{\boldsymbol{Q}\boldsymbol{x}} -\boldsymbol{Q}\boldsymbol{y}\Vert^2\\ &= \frac{1}{4}\Vert{\boldsymbol{Q}(\boldsymbol{x}} +\boldsymbol{y})\Vert^2- \frac{1}{4}\Vert{\boldsymbol{Q}(\boldsymbol{x}} -\boldsymbol{y})\Vert^2\\ \end{align*}$$

We know from the previous theorem that $\Vert\boldsymbol{Q}\boldsymbol{x}\Vert=\Vert{\boldsymbol{x}}\Vert$. Therefore, we have:

$$\begin{align*} \boldsymbol{Q}\boldsymbol{x}\cdot \boldsymbol{Q}\boldsymbol{y} &= \frac{1}{4}\Vert{\boldsymbol{x}} +\boldsymbol{y}\Vert^2- \frac{1}{4}\Vert{\boldsymbol{x}} -\boldsymbol{y}\Vert^2\\ \end{align*}$$

Using \eqref{eq:bvWCKdz5Oa0Kj2YbNcl} once more gives us the desired result:

$$\begin{align*} \boldsymbol{Q}\boldsymbol{x}\cdot \boldsymbol{Q}\boldsymbol{y} &=\boldsymbol{x}\cdot{\boldsymbol{y}} \end{align*}$$

This completes the proof.
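
Here is a minimal numerical check of this result, with an assumed orthogonal matrix and two arbitrary vectors:

```python
import numpy as np

Q = np.array([[0, 1],
              [-1, 0]])   # an orthogonal matrix
x = np.array([1.0, 2.0])
y = np.array([4.0, -3.0])

# The dot product is unchanged by an orthogonal transformation
print((Q @ x) @ (Q @ y))  # -2.0
print(x @ y)              # -2.0
```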

Theorem.

Determinant of an orthogonal matrix

If $\boldsymbol{Q}$ is an orthogonal matrix, then:

$$\det(\boldsymbol{Q})=\pm1$$

Proof. By definition of orthogonal matrices, we have that:

$$\boldsymbol{Q}\boldsymbol{Q}^T=\boldsymbol{I}$$

Taking the determinant of both sides gives:

$$\begin{equation}\label{eq:OQGa4knzgaWQr3ShXuQ} \det(\boldsymbol{Q}\boldsymbol{Q}^T)= \det(\boldsymbol{I}) \end{equation}$$

We know that the determinant of an identity matrix is $1$. Next, we know that $\det(\boldsymbol{AB})=\det(\boldsymbol{A})\cdot\det(\boldsymbol{B})$ for any two square matrices $\boldsymbol{A}$ and $\boldsymbol{B}$. Therefore, \eqref{eq:OQGa4knzgaWQr3ShXuQ} becomes:

$$\begin{equation}\label{eq:rm2ktxJUk80N1X6CTbx} \det(\boldsymbol{Q})\cdot\det(\boldsymbol{Q}^T)= 1 \end{equation}$$

Next, recall that $\det(\boldsymbol{Q}^T)=\det(\boldsymbol{Q})$. Therefore, \eqref{eq:rm2ktxJUk80N1X6CTbx} becomes:

$$\begin{align*} \det(\boldsymbol{Q})\cdot\det(\boldsymbol{Q})&=1\\ \big[\det(\boldsymbol{Q})\big]^2&=1\\ \det(\boldsymbol{Q})&=\pm1 \end{align*}$$

This completes the proof.
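
Numerically, a rotation matrix and a reflection matrix (both assumed examples of orthogonal matrices) illustrate the two possible determinant values:

```python
import numpy as np

# A rotation (determinant +1) and a reflection (determinant -1)
rotation = np.array([[0, -1],
                     [1,  0]])
reflection = np.array([[1,  0],
                       [0, -1]])

print(np.linalg.det(rotation))    # 1.0
print(np.linalg.det(reflection))  # -1.0
```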

Matrices with orthogonal columns

Recall that the column vectors of an $n\times{n}$ orthogonal matrix form an orthonormal basis for $\mathbb{R}^n$. This means that:

  • every column vector is a unit vector.

  • every column vector is orthogonal to every other column vector.

In this section, we will look at a more relaxed version of an orthogonal matrix that only satisfies the second property.

Theorem.

Square matrix with orthogonal columns is invertible

If $\boldsymbol{A}$ is a square matrix with orthogonal columns, then $\boldsymbol{A}$ is invertible.

Proof. Let $\boldsymbol{A}$ be an $n\times{n}$ matrix with orthogonal columns. The column vectors of $\boldsymbol{A}$ form an orthogonal basis of $\mathbb{R}^n$, which means that the set of column vectors is linearly independent. A square matrix with linearly independent columns is invertible, so $\boldsymbol{A}$ is invertible. This completes the proof.

Theorem.

Inverse of a square matrix with orthogonal columns

If $\boldsymbol{A}$ is an $n\times{n}$ matrix with orthogonal columns, then:

$$\boldsymbol{A}^{-1} =\boldsymbol{D}^2\boldsymbol{A}^T$$

Where $\boldsymbol{D}$ is the following $n\times{n}$ diagonal matrix:

$$\boldsymbol{D}=\begin{pmatrix} \frac{1}{\Vert\boldsymbol{a}_{1}\Vert}&0&\cdots&0\\ 0&\frac{1}{\Vert\boldsymbol{a}_{2}\Vert}&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\frac{1}{\Vert\boldsymbol{a}_{n}\Vert}\\ \end{pmatrix}$$

Where $\boldsymbol{a}_1$, $\boldsymbol{a}_2$, $\cdots$, $\boldsymbol{a}_n$ are the column vectors of $\boldsymbol{A}$.

Proof. Let $\boldsymbol{A}$ be an $n\times{n}$ matrix with orthogonal columns:

$$\boldsymbol{A}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{a_1}&\boldsymbol{a_2}&\cdots&\boldsymbol{a_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}$$

We can convert $\boldsymbol{A}$ into an orthogonal matrix $\boldsymbol{Q}$ by converting each column vector into a unit vector:

$$\begin{equation}\label{eq:BDLJ6XiStiRppatzIID} \begin{aligned}[b] \boldsymbol{Q} &= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \frac{\boldsymbol{a_1}}{\Vert\boldsymbol{a}_{1}\Vert}& \frac{\boldsymbol{a_2}}{\Vert\boldsymbol{a}_{2}\Vert} &\cdots&\frac{\boldsymbol{a_n}}{\Vert\boldsymbol{a}_{n}\Vert}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}\\&= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{a_1}&\boldsymbol{a_2}&\cdots&\boldsymbol{a_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}\begin{pmatrix} \frac{1}{\Vert\boldsymbol{a}_{1}\Vert}&0&\cdots&0\\ 0&\frac{1}{\Vert\boldsymbol{a}_{2}\Vert}&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\frac{1}{\Vert\boldsymbol{a}_{n}\Vert}\\ \end{pmatrix}\\ &=\boldsymbol{AD} \end{aligned} \end{equation}$$

Here, the second equality holds because multiplying a matrix on the right by a diagonal matrix scales each of its columns by the corresponding diagonal entry.

By the property of orthogonal matrices, we have that $\boldsymbol{Q}^T=\boldsymbol{Q}^{-1}$. Taking the transpose of both sides of \eqref{eq:BDLJ6XiStiRppatzIID} gives:

$$\begin{align*} \boldsymbol{Q}^T&= (\boldsymbol{AD})^T\\ &=\boldsymbol{D}^T\boldsymbol{A}^T\\ &=\boldsymbol{D}\boldsymbol{A}^T\\ \end{align*}$$

Where the second equality follows from the theorem $(\boldsymbol{AB})^T=\boldsymbol{B}^T\boldsymbol{A}^T$, and the third equality holds because $\boldsymbol{D}$ is diagonal and so $\boldsymbol{D}^T=\boldsymbol{D}$. The inverse $\boldsymbol{Q}^{-1}$ is:

$$\begin{align*} \boldsymbol{Q}^{-1}&= (\boldsymbol{AD})^{-1}\\ &= \boldsymbol{D}^{-1}\boldsymbol{A}^{-1} \end{align*}$$

Where the second equality holds by the theorem $(\boldsymbol{AB})^{-1}=\boldsymbol{B}^{-1}\boldsymbol{A}^{-1}$. Equating $\boldsymbol{Q}^T= \boldsymbol{Q}^{-1}$ gives:

$$\begin{align*} \boldsymbol{DA}^T&= \boldsymbol{D}^{-1}\boldsymbol{A}^{-1}\\ \boldsymbol{DDA}^T&=\boldsymbol{A}^{-1}\\ \boldsymbol{A}^{-1}&=\boldsymbol{D}^2\boldsymbol{A}^T \end{align*}$$

This completes the proof.
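
The formula $\boldsymbol{A}^{-1}=\boldsymbol{D}^2\boldsymbol{A}^T$ can be verified numerically; the matrix below, whose columns are orthogonal but not of unit length, is an assumed example:

```python
import numpy as np

# A matrix with orthogonal (but not unit-length) columns
A = np.array([[3.0, -4.0],
              [4.0,  3.0]])

# D is the diagonal matrix holding the reciprocals of the column norms
col_norms = np.linalg.norm(A, axis=0)
D = np.diag(1.0 / col_norms)

# The theorem states A^{-1} = D^2 A^T
print(np.allclose(np.linalg.inv(A), D @ D @ A.T))  # True
```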

Theorem.

Upper triangular matrix with orthogonal columns is a diagonal matrix

If $\boldsymbol{A}$ is an upper triangular matrix with orthogonal columns, then $\boldsymbol{A}$ is a diagonal matrix.

Proof. Since $\boldsymbol{A}$ has orthogonal columns, $\boldsymbol{A}$ is invertible by the earlier theorem. Because $\boldsymbol{A}$ is an upper triangular matrix, we have the following:

  • $\boldsymbol{A}^{-1}$ is an upper triangular matrix.

  • $\boldsymbol{A}^T$ is a lower triangular matrix.

Since $\boldsymbol{A}$ has orthogonal columns, $\boldsymbol{A}^{-1} =\boldsymbol{D}^2\boldsymbol{A}^T$ where $\boldsymbol{D}$ is some diagonal matrix, by the previous theorem. The product of diagonal matrices is diagonal, so $\boldsymbol{D}^2$ is also diagonal. Because $\boldsymbol{A}^T$ is a lower triangular matrix, the product $\boldsymbol{D}^2\boldsymbol{A}^T$ is also a lower triangular matrix. Because $\boldsymbol{A}^{-1}= \boldsymbol{D}^2\boldsymbol{A}^T$, we have that $\boldsymbol{A}^{-1}$ is a lower triangular matrix.

Therefore, $\boldsymbol{A}^{-1}$ is both upper triangular and lower triangular. This means that $\boldsymbol{A}^{-1}$ is a diagonal matrix. The inverse of a diagonal matrix is also diagonal, which means that $(\boldsymbol{A}^{-1})^{-1}=\boldsymbol{A}$ is also diagonal. This completes the proof.

Theorem.

Upper triangular orthogonal matrix is a diagonal matrix with entries either 1 or minus 1

If $\boldsymbol{A}$ is an upper triangular orthogonal matrix, then $\boldsymbol{A}$ is a diagonal matrix with entries $\pm1$.

Proof. By the previous theorem, if $\boldsymbol{A}$ is an upper triangular orthogonal matrix, then $\boldsymbol{A}$ is a diagonal matrix. Suppose $\boldsymbol{A}$ is as follows:

$$\boldsymbol{A}= \begin{pmatrix} a_{11}&0&\cdots&0\\ 0&a_{22}&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&a_{nn}\\ \end{pmatrix}$$

Since $\boldsymbol{A}$ is orthogonal, the column vectors must be unit vectors. This means that each diagonal entry must be either $1$ or $-1$:

$$\boldsymbol{A}= \begin{pmatrix} \pm1&0&\cdots&0\\ 0&\pm1&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\pm1 \end{pmatrix}$$

This completes the proof.
