Comprehensive Guide on Orthogonal Diagonalization

Last updated: Aug 12, 2023
Tags: Linear Algebra
Definition.

Orthogonal similar matrices

Let $\boldsymbol{A}$ and $\boldsymbol{B}$ be square matrices of the same size. $\boldsymbol{A}$ is said to be orthogonally similar to $\boldsymbol{B}$ if there exists an orthogonal matrixlink $\boldsymbol{Q}$ such that $\boldsymbol{A}= \boldsymbol{Q}\boldsymbol{B}\boldsymbol{Q}^T$.

Example.

Checking that two matrices are orthogonally similar

Consider the following matrices:

$$\boldsymbol{B}=\begin{pmatrix} 0&2\\3&4 \end{pmatrix},\;\;\;\;\; \boldsymbol{Q}=\begin{pmatrix} 1/\sqrt2&-1/\sqrt2\\1/\sqrt2&1/\sqrt2 \end{pmatrix}$$

Find a matrix $\boldsymbol{A}$ that is orthogonally similar to $\boldsymbol{B}$ through $\boldsymbol{Q}$.

Solution. The dot productlink of the column vectors of $\boldsymbol{Q}$ is $0$, which means that the column vectors are orthogonal. Also, the column vectors are of unit length and thus by theoremlink, we have that $\boldsymbol{Q}$ is an orthogonal matrix.

Let's now compute $\boldsymbol{Q}\boldsymbol{BQ}^T$ like so:

$$\begin{align*} \boldsymbol{Q}\boldsymbol{BQ}^T&= \begin{pmatrix}1/\sqrt2&-1/\sqrt2\\1/\sqrt2&1/\sqrt2\end{pmatrix} \begin{pmatrix}0&2\\3&4\end{pmatrix} \begin{pmatrix}1/\sqrt2&1/\sqrt2\\-1/\sqrt2&1/\sqrt2\end{pmatrix}\\ &=\begin{pmatrix}-1/2&-5/2\\-3/2&9/2\end{pmatrix} \end{align*}$$

If we let $\boldsymbol{A}$ equal this matrix, then $\boldsymbol{A}$ will be orthogonally similar to $\boldsymbol{B}$.
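To make the example concrete, here is a small numerical check, assuming NumPy is available. It confirms that $\boldsymbol{Q}$ is orthogonal and computes $\boldsymbol{A}=\boldsymbol{QBQ}^T$:

```python
import numpy as np

# The matrices B and Q from the example above.
B = np.array([[0.0, 2.0],
              [3.0, 4.0]])
s = 1 / np.sqrt(2)
Q = np.array([[s, -s],
              [s,  s]])

# Q is orthogonal: its columns are orthonormal, so Q Q^T = I.
assert np.allclose(Q @ Q.T, np.eye(2))

# A = Q B Q^T is orthogonally similar to B.
A = Q @ B @ Q.T
print(A)  # [[-0.5 -2.5]
          #  [-1.5  4.5]]
```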

Theorem.

Orthogonally similar matrices are also similar matrices

If matrix $\boldsymbol{A}$ is orthogonally similarlink to matrix $\boldsymbol{B}$, then $\boldsymbol{A}$ is similarlink to $\boldsymbol{B}$.

Proof. By definitionlink, if $\boldsymbol{A}$ is orthogonally similar to $\boldsymbol{B}$, then there exists an orthogonal matrix $\boldsymbol{Q}$ such that:

$$\boldsymbol{A} =\boldsymbol{Q}\boldsymbol{BQ}^T =\boldsymbol{Q}\boldsymbol{BQ}^{-1}$$

Where the second equality follows because $\boldsymbol{Q}^T= \boldsymbol{Q}^{-1}$ by propertylink. Since $\boldsymbol{Q}$ is invertible by propertylink, $\boldsymbol{Q}^{-1}$ exists. This means that $\boldsymbol{A}$ and $\boldsymbol{B}$ are similarlink by definition. This completes the proof.

Because orthogonally similar matrices are similar matrices, orthogonally similar matrices enjoy all the nice properties of similar matrices! For instance, orthogonal similarity is a symmetric relation: if $\boldsymbol{A}$ is orthogonally similar to $\boldsymbol{B}$, then $\boldsymbol{B}$ is orthogonally similar to $\boldsymbol{A}$, as the short derivation below shows. To learn more about the other properties of similar matrices, please check out our guide on similar matrices.
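To see why the relation is symmetric, suppose $\boldsymbol{A}=\boldsymbol{QBQ}^T$. Multiplying on the left by $\boldsymbol{Q}^T$ and on the right by $\boldsymbol{Q}$ gives:

$$\boldsymbol{Q}^T\boldsymbol{A}\boldsymbol{Q} =\boldsymbol{Q}^T\boldsymbol{Q}\boldsymbol{B}\boldsymbol{Q}^T\boldsymbol{Q} =\boldsymbol{B} \;\;\;\;\Longrightarrow\;\;\;\; \boldsymbol{B}=\boldsymbol{Q}^T\boldsymbol{A}\big(\boldsymbol{Q}^T\big)^T$$

Since $\boldsymbol{Q}^T$ is itself an orthogonal matrix, $\boldsymbol{B}$ is orthogonally similar to $\boldsymbol{A}$ by definitionlink.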

Definition.

Orthogonally diagonalizable matrices

Suppose matrix $\boldsymbol{A}$ is orthogonally similar to some diagonal matrixlink $\boldsymbol{D}$, that is:

$$\boldsymbol{A}=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^T$$

Where $\boldsymbol{Q}$ is some orthogonal matrix. We say that:

  • $\boldsymbol{A}$ is orthogonally diagonalizable.

  • $\boldsymbol{Q}$ orthogonally diagonalizes $\boldsymbol{A}$.

Theorem.

Equivalent definition of orthogonally diagonalizable matrices

Matrix $\boldsymbol{A}$ is orthogonally diagonalizable if and only if there exist an orthogonal matrix $\boldsymbol{Q}$ and diagonal matrix $\boldsymbol{D}$ such that:

$$\boldsymbol{D}=\boldsymbol{Q}^T\boldsymbol{A}\boldsymbol{Q}$$

Proof. By definitionlink, if $\boldsymbol{A}$ is orthogonally diagonalizable, then there exists an orthogonal matrix $\boldsymbol{Q}$ and diagonal matrix $\boldsymbol{D}$ such that:

$$\begin{equation}\label{eq:MZoWp18igntt89SRfef} \boldsymbol{A}=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^T \end{equation}$$

By propertylink of orthogonal matrices, we have that $\boldsymbol{Q}^T=\boldsymbol{Q}^{-1}$. Therefore, \eqref{eq:MZoWp18igntt89SRfef} becomes:

$$\begin{align*} \boldsymbol{A} &=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^{-1}\\ \boldsymbol{AQ}&=\boldsymbol{Q}\boldsymbol{D}\\ \boldsymbol{Q}^{-1}\boldsymbol{AQ}&=\boldsymbol{D}\\ \boldsymbol{D}&=\boldsymbol{Q}^T\boldsymbol{AQ} \end{align*}$$

This completes the proof.
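As a quick numerical sanity check of this equivalence, we can build an orthogonally diagonalizable matrix from a hypothetical orthogonal $\boldsymbol{Q}$ (a 2D rotation) and a hypothetical diagonal $\boldsymbol{D}$, then recover $\boldsymbol{D}$ in the equivalent form. A minimal sketch, assuming NumPy:

```python
import numpy as np

# A hypothetical orthogonal Q (a 2D rotation) and a hypothetical diagonal D.
theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
D = np.diag([2.0, -1.0])

# A is orthogonally diagonalizable by construction: A = Q D Q^T.
A = Q @ D @ Q.T

# The equivalent form of the theorem: D = Q^T A Q.
assert np.allclose(Q.T @ A @ Q, D)
```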

Theorem.

Orthogonally diagonalizable matrices are diagonalizable

If $\boldsymbol{A}$ is an orthogonally diagonalizable matrix, then $\boldsymbol{A}$ is diagonalizablelink.

Proof. By definitionlink, if $\boldsymbol{A}$ is orthogonally diagonalizable, then there exists an orthogonal matrix $\boldsymbol{Q}$ and a diagonal matrix $\boldsymbol{D}$ such that:

$$\begin{equation}\label{eq:oQhoi4LJxYzTmi4v5Za} \begin{aligned}[b] \boldsymbol{A} &=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^T\\ &=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^{-1} \end{aligned} \end{equation}$$

Note that the second equality follows because of propertylink, that is, $\boldsymbol{Q}^T=\boldsymbol{Q}^{-1}$.

Now, recall that an $n\times{n}$ matrix $\boldsymbol{A}$ is said to be diagonalizablelink if there exists an invertible matrix $\boldsymbol{P}$ and a diagonal matrix $\boldsymbol{D}$ such that:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Since any orthogonal matrix $\boldsymbol{Q}$ is invertible by propertylink, we conclude that orthogonally diagonalizable matrices are diagonalizable. This completes the proof.

This means that orthogonally diagonalizable matrices inherit all the properties of diagonalizable matrices! For instance, given $\boldsymbol{A}=\boldsymbol{QDQ}^T$, the column vectors of $\boldsymbol{Q}$ are the eigenvectors of $\boldsymbol{A}$ and the diagonal entries of $\boldsymbol{D}$ are the corresponding eigenvalues. However, orthogonal diagonalization imposes the stricter condition that the eigenvectors must be orthogonal, which fails for many matrices. In other words, most matrices are not orthogonally diagonalizable (although they may still be diagonalizable).
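The eigen-interpretation can be checked numerically as well. Continuing with the same hypothetical $\boldsymbol{Q}$ and $\boldsymbol{D}$ as in the sketch above, each column $\boldsymbol{q}_i$ of $\boldsymbol{Q}$ satisfies $\boldsymbol{Aq}_i=d_i\boldsymbol{q}_i$:

```python
import numpy as np

theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
D = np.diag([2.0, -1.0])
A = Q @ D @ Q.T

# Column i of Q is an eigenvector of A with eigenvalue D[i, i].
for i in range(2):
    assert np.allclose(A @ Q[:, i], D[i, i] * Q[:, i])
```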

We will now state and prove an extraordinary theorem that characterizes exactly which matrices are orthogonally diagonalizable.

Theorem.

Real spectral theorem

Let $\boldsymbol{A}$ be an $n\times{n}$ matrix. $\boldsymbol{A}$ is orthogonally diagonalizable if and only if $\boldsymbol{A}$ is symmetriclink.

Proof. We start by proving the forward proposition, which is much simpler than the proof of the converse. By definitionlink, if $\boldsymbol{A}$ is orthogonally diagonalizable, then there exist an orthogonal matrix $\boldsymbol{Q}$ and a diagonal matrix $\boldsymbol{D}$ such that:

$$\begin{equation}\label{eq:i2tUYOvl2WZETZvV3vi} \boldsymbol{A}=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^{T} \end{equation}$$

Now, let's take the transpose of both sides:

$$\begin{align*} \boldsymbol{A}^T&=(\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^{T})^T\\ \boldsymbol{A}^T&=(\boldsymbol{Q}^T)^T\boldsymbol{D}^T\boldsymbol{Q}^{T}\\ \boldsymbol{A}^T&=\boldsymbol{Q}\boldsymbol{D}^T\boldsymbol{Q}^T\\ \boldsymbol{A}^T&=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^T\\ \boldsymbol{A}^T&=\boldsymbol{A}\\ \end{align*}$$

Here, we used theoremlink for the second equality and theoremlink for $\boldsymbol{D}^T=\boldsymbol{D}$.

Since the transpose of $\boldsymbol{A}$ is equal to itself, $\boldsymbol{A}$ must be symmetric by definitionlink. This completes the proof of the forward proposition.

* * *

We will now prove the converse by induction on the size $n$ of a symmetric matrix $\boldsymbol{A}$. For the base case, suppose $n=1$, that is, $\boldsymbol{A}$ is a $1\times{1}$ symmetric matrix. Let's check if $\boldsymbol{A}$ is orthogonally diagonalizable:

$$\begin{pmatrix}1\end{pmatrix}^T\begin{pmatrix}a\end{pmatrix} \begin{pmatrix}1\end{pmatrix}= \begin{pmatrix}a\end{pmatrix}$$

Since $(1)$ is an orthogonal matrix and $(a)$ is a diagonal matrix, we conclude that $\boldsymbol{A}$ is orthogonally diagonalizable by definitionlink. We now assume that if $\boldsymbol{A}$ is an $(n-1)\times(n-1)$ symmetric matrix, then $\boldsymbol{A}$ is orthogonally diagonalizable. Our goal is to show that if $\boldsymbol{A}$ is an $n\times{n}$ symmetric matrix, then $\boldsymbol{A}$ is orthogonally diagonalizable.

Suppose $\boldsymbol{A}$ is an $n\times{n}$ symmetric matrix. Let $\lambda_1$ be an eigenvalue of $\boldsymbol{A}$ and let $\boldsymbol{x}_1$ be a corresponding eigenvector (a real symmetric matrix is guaranteed to have at least one real eigenvalue). Let $\boldsymbol{q}_1=\boldsymbol{x}_1/\Vert\boldsymbol{x}_1\Vert$, which is a unit eigenvector corresponding to $\lambda_1$, and apply theoremlink to construct a basis for $\mathbb{R}^n$ using $\boldsymbol{q}_1$. Suppose this basis is:

$$\{\boldsymbol{q}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_n\}$$

We now use the Gram-Schmidt processlink to convert this basis into an orthonormal basislink $\{\boldsymbol{q}_1,\boldsymbol{q}_2,\cdots,\boldsymbol{q}_n\}$ for $\mathbb{R}^n$. Let's define an orthogonal matrix $\boldsymbol{Q}_1$ whose column vectors are $\boldsymbol{q}_1$, $\boldsymbol{q}_2$, $\cdots$, $\boldsymbol{q}_n$ like so:

$$\boldsymbol{Q}_1= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert \end{pmatrix}$$

Now, consider $\boldsymbol{B}= \boldsymbol{Q}_1^{T}\boldsymbol{A}\boldsymbol{Q}_1$. Let's first show that $\boldsymbol{B}$ is symmetric:

$$\begin{align*} \boldsymbol{B}^T&=(\boldsymbol{Q}_1^T\boldsymbol{A}\boldsymbol{Q}_1)^T\\ &=\boldsymbol{Q}_1^T\boldsymbol{A}^T(\boldsymbol{Q}_1^T)^T\\ &=\boldsymbol{Q}_1^T\boldsymbol{A}^T\boldsymbol{Q}_1\\ &=\boldsymbol{Q}_1^T\boldsymbol{A}\boldsymbol{Q}_1\\ &=\boldsymbol{B} \end{align*}$$

Note that we used theoremlink for the first step. Since $\boldsymbol{B}^T=\boldsymbol{B}$, we have that $\boldsymbol{B}$ is a symmetric matrix by definitionlink.

Let's now evaluate the right-hand side of $\boldsymbol{B}= \boldsymbol{Q}^T_1\boldsymbol{A}\boldsymbol{Q}_1$ like so:

$$\begin{equation}\label{eq:ZaNbHzF3mH8VKKOMxcd} \begin{aligned}[b] \boldsymbol{B}&=\boldsymbol{Q}_1^T\boldsymbol{A}\boldsymbol{Q}_1 \\&= \begin{pmatrix}-&\boldsymbol{q}^T_1&-\\-&\boldsymbol{q}^T_2&-\\ \vdots&\vdots&\vdots\\-&\boldsymbol{q}^T_n&-\\ \end{pmatrix}\boldsymbol{A} \begin{pmatrix}\vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\end{pmatrix}\\ &= \begin{pmatrix}-&\boldsymbol{q}^T_1&-\\-&\boldsymbol{q}^T_2&-\\ \vdots&\vdots&\vdots\\-&\boldsymbol{q}^T_n&-\\ \end{pmatrix} \begin{pmatrix}\vert&\vert&\cdots&\vert\\ \boldsymbol{A}\boldsymbol{q}_1& \boldsymbol{A}\boldsymbol{q}_2&\cdots& \boldsymbol{A}\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\end{pmatrix}\\ \end{aligned} \end{equation}$$

Here, we used theoremlink for the last step. The top-left entry of $\boldsymbol{B}$ is:

$$\begin{equation}\label{eq:rWZWVlSi936aOGhV9hP} \begin{aligned}[b] \boldsymbol{q}_1^T \boldsymbol{Aq}_1&= \boldsymbol{q}_1\cdot \boldsymbol{Aq}_1 \end{aligned} \end{equation}$$

Now, recall that $\boldsymbol{q}_1$ is an eigenvector corresponding to the eigenvalue $\lambda_1$. By definition then, the following is true:

$$\begin{equation}\label{eq:nwSUU2wYAAmUmcPgtBv} \boldsymbol{A}\boldsymbol{q}_1= \lambda_1\boldsymbol{q}_1 \end{equation}$$

Substituting \eqref{eq:nwSUU2wYAAmUmcPgtBv} into the right-hand side of \eqref{eq:rWZWVlSi936aOGhV9hP} gives:

$$\begin{align*} \boldsymbol{q}_1^T \boldsymbol{Aq}_1&= \boldsymbol{q}_1\cdot \lambda_1\boldsymbol{q}_1\\ &=\lambda_1(\boldsymbol{q}_1\cdot \boldsymbol{q}_1)\\ &=\lambda_1\Vert\boldsymbol{q}_1\Vert^2\\ &=\lambda_1(1)^2\\ &=\lambda_1 \end{align*}$$

Here, we used:

  • the theoremlink that $\boldsymbol{q}_1\cdot\boldsymbol{q}_1= \Vert\boldsymbol{q}_1\Vert^2$.

  • $\Vert\boldsymbol{q}_1\Vert=1$ since $\boldsymbol{q}_1$ belongs to an orthonormal basis and thus has a magnitude of $1$.

Let's now focus on the other entries of the first column of $\boldsymbol{B}$ in \eqref{eq:ZaNbHzF3mH8VKKOMxcd}. The $2$nd, $3$rd, $\cdots$, $n$-th entries of the first column are:

$$\begin{gather*} \boldsymbol{q}_2\cdot\boldsymbol{A}\boldsymbol{q}_1 \;{\color{blue}=}\;\boldsymbol{q}_2\cdot\lambda_1\boldsymbol{q}_1\;{\color{blue}=}\;\lambda_1(\boldsymbol{q}_2\cdot\boldsymbol{q}_1)\;{\color{blue}=}\;\lambda_1(0)\;{\color{blue}=}\;0\\ \boldsymbol{q}_3\cdot\boldsymbol{A}\boldsymbol{q}_1 \;{\color{blue}=}\;\boldsymbol{q}_3\cdot\lambda_1\boldsymbol{q}_1\;{\color{blue}=}\;\lambda_1(\boldsymbol{q}_3\cdot\boldsymbol{q}_1)\;{\color{blue}=}\;\lambda_1(0)\;{\color{blue}=}\;0\\ \vdots\\ \boldsymbol{q}_n\cdot\boldsymbol{A}\boldsymbol{q}_1 \;{\color{blue}=}\;\boldsymbol{q}_n\cdot\lambda_1\boldsymbol{q}_1 \;{\color{blue}=}\;\lambda_1(\boldsymbol{q}_n\cdot\boldsymbol{q}_1)\;{\color{blue}=}\;\lambda_1(0)\;{\color{blue}=}\;0 \\ \end{gather*}$$

This means that in the first column of $\boldsymbol{B}$:

  • the first entry is $\lambda_1$.

  • the remaining entries are $0$.

Let's now focus on the first row of $\boldsymbol{B}$. We have proven that $\boldsymbol{B}$ is a symmetric matrix, which means that the first row of $\boldsymbol{B}$ is the transpose of the first column of $\boldsymbol{B}$. Therefore, in the first row of $\boldsymbol{B}$:

  • the first entry is $\lambda_1$.

  • the remaining entries are $0$.

Therefore, we have that:

$$\boldsymbol{B}= \begin{pmatrix} \lambda_1&0&0&\cdots&0\\ 0&\color{blue}*&\color{blue}*&\color{blue}\cdots&\color{blue}*\\ 0&\color{blue}*&\color{blue}*&\color{blue}\cdots&\color{blue}*\\ \vdots&\color{blue}\vdots&\color{blue}\vdots&\color{blue}\smash\ddots&\color{blue}\vdots\\ 0&\color{blue}*&\color{blue}*&\color{blue}\cdots&\color{blue}*\\ \end{pmatrix}= \begin{pmatrix} \lambda_1&\boldsymbol{0}\\ \boldsymbol{0}&\color{blue}\boldsymbol{C} \end{pmatrix}$$

Here, we are representing $\boldsymbol{B}$ as a block matrixlink where:

  • $\boldsymbol{0}$ is a zero matrix.

  • $\boldsymbol{C}$ represents the blue sub-matrix.

We don't know the exact entries of $\boldsymbol{C}$ but we do know the following:

  • the shape of $\boldsymbol{C}$ is $(n-1)\times(n-1)$.

  • $\boldsymbol{C}$ is symmetric because $\boldsymbol{B}$ is symmetric.

We now use the inductive assumption that any $(n-1)\times(n-1)$ symmetric matrix is orthogonally diagonalizable. This means that $\boldsymbol{C}$ is orthogonally diagonalizable. By definitionlink of orthogonally diagonalizable matrices, we have that:

$$\boldsymbol{D}= \boldsymbol{P}^T\boldsymbol{C}\boldsymbol{P}$$

Where $\boldsymbol{D}$ is an $(n-1)\times(n-1)$ diagonal matrix and $\boldsymbol{P}$ is an $(n-1)\times(n-1)$ orthogonal matrix, which we define as a collection of row vectors:

$$\boldsymbol{P}= \begin{pmatrix} -&\boldsymbol{p}_1&-\\ -&\boldsymbol{p}_2&-\\ \vdots&\vdots&\vdots\\ -&\boldsymbol{p}_{n-1}&-\\ \end{pmatrix}$$

Now, define an $n\times{n}$ block matrix $\boldsymbol{Q}_2$ like so:

$$\boldsymbol{Q}_2= \begin{pmatrix} 1&\boldsymbol{0}\\ \boldsymbol{0}&\boldsymbol{P} \end{pmatrix}$$

Our goal is to show that $\boldsymbol{Q}_1\boldsymbol{Q}_2$ orthogonally diagonalizes $\boldsymbol{A}$. By definitionlink then, we must show that:

  • $\boldsymbol{Q}_1\boldsymbol{Q}_2$ is orthogonal.

  • $(\boldsymbol{Q}_1\boldsymbol{Q}_2)^T \boldsymbol{A} (\boldsymbol{Q}_1\boldsymbol{Q}_2)$ results in a diagonal matrix.

Let's start by showing that $\boldsymbol{Q}_2$ is orthogonal. By theoremlink, the transpose of $\boldsymbol{Q}_2$ is:

$$\boldsymbol{Q}_2^T= \begin{pmatrix} 1&\boldsymbol{0}\\ \boldsymbol{0}&\boldsymbol{P}^T \end{pmatrix}$$

By theoremlink, we can evaluate $\boldsymbol{Q}_2\boldsymbol{Q}^T_2$ in block form like so:

$$\begin{align*} \boldsymbol{Q}_2 \boldsymbol{Q}_2^T&= \begin{pmatrix} 1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{P}\end{pmatrix} \begin{pmatrix} 1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{P}^T \end{pmatrix}\\ &= \begin{pmatrix} 1&\boldsymbol{0}\\ \boldsymbol{0}&\boldsymbol{PP}^T \end{pmatrix} \end{align*}$$

Because $\boldsymbol{P}$ is orthogonal, we have that $\boldsymbol{PP}^T= \boldsymbol{I}_{n-1}$ by definitionlink. This means $\boldsymbol{Q}_2\boldsymbol{Q}^T_2=\boldsymbol{I}_n$ and so $\boldsymbol{Q}_2$ is orthogonal by definitionlink. Now, by theoremlink, because $\boldsymbol{Q}_1$ and $\boldsymbol{Q}_2$ are both orthogonal, we have that their product $\boldsymbol{Q}_1\boldsymbol{Q}_2$ is also orthogonal.

Now, we must show that $(\boldsymbol{Q}_1\boldsymbol{Q}_2)^T \boldsymbol{A} (\boldsymbol{Q}_1\boldsymbol{Q}_2)$ results in a diagonal matrix:

$$\begin{equation}\label{eq:e3E4DXBim5OT3VSMhMO} \begin{aligned}[b] (\boldsymbol{Q}_1\boldsymbol{Q}_2)^T \boldsymbol{A} (\boldsymbol{Q}_1\boldsymbol{Q}_2) &= \boldsymbol{Q}_2^T\boldsymbol{Q}_1^T \boldsymbol{A} \boldsymbol{Q}_1\boldsymbol{Q}_2\\ &=\boldsymbol{Q}_2^T(\boldsymbol{Q}_1^T \boldsymbol{A} \boldsymbol{Q}_1)\boldsymbol{Q}_2\\ &=\boldsymbol{Q}_2^T(\boldsymbol{B})\boldsymbol{Q}_2\\ &= \begin{pmatrix} 1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{P}^T \end{pmatrix} \begin{pmatrix}\lambda_1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{C} \end{pmatrix} \begin{pmatrix}1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{P} \end{pmatrix}\\ &=\begin{pmatrix} \lambda_1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{P}^T\boldsymbol{C} \end{pmatrix} \begin{pmatrix}1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{P} \end{pmatrix}\\ &=\begin{pmatrix} \lambda_1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{P}^T\boldsymbol{CP} \end{pmatrix}\\ &=\begin{pmatrix} \lambda_1&\boldsymbol{0}\\\boldsymbol{0}&\boldsymbol{D} \end{pmatrix} \end{aligned} \end{equation}$$

Note the following:

  • for the first equality, we used theoremlink $(\boldsymbol{AB})^T=\boldsymbol{B}^T\boldsymbol{A}^T$.

  • for the third equality, we used the definition $\boldsymbol{B}=\boldsymbol{Q}^T_1\boldsymbol{A}\boldsymbol{Q}_1$.

  • for the multiplication of block matrices, we used theoremlink.

  • for the last equality, we used the definition $\boldsymbol{D}=\boldsymbol{P}^T\boldsymbol{C}\boldsymbol{P}$.

Since $\boldsymbol{D}$ is a diagonal matrix, we have that \eqref{eq:e3E4DXBim5OT3VSMhMO} is diagonal.

Because $\boldsymbol{Q}_1\boldsymbol{Q}_2$ is orthogonal and $(\boldsymbol{Q}_1\boldsymbol{Q}_2)^T \boldsymbol{A}(\boldsymbol{Q}_1\boldsymbol{Q}_2)$ results in a diagonal matrix, we conclude that $\boldsymbol{Q}_1\boldsymbol{Q}_2$ orthogonally diagonalizes $\boldsymbol{A}$. In other words, $\boldsymbol{A}$ is orthogonally diagonalizable. This completes the proof.
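The inductive proof is constructive, and the construction can be sketched in code. The sketch below assumes NumPy; it leans on `np.linalg.eigh` only to supply one eigenpair per step (corresponding to picking $\lambda_1$ and $\boldsymbol{x}_1$ in the proof) and uses a QR factorization in place of the Gram-Schmidt step:

```python
import numpy as np

def orthogonally_diagonalize(A):
    """Sketch of the proof's induction: for a real symmetric A, return
    (Q, D) with Q orthogonal and D diagonal such that A = Q D Q^T."""
    n = A.shape[0]
    if n == 1:
        return np.eye(1), A.copy()  # base case: (1) is orthogonal
    # One eigenpair (lambda_1, x_1); symmetric matrices have real eigenpairs.
    eigvals, eigvecs = np.linalg.eigh(A)
    lam1 = eigvals[0]
    q1 = eigvecs[:, 0] / np.linalg.norm(eigvecs[:, 0])
    # Extend q_1 to an orthonormal basis of R^n; the QR factorization plays
    # the role of the Gram-Schmidt step in the proof.
    Q1, _ = np.linalg.qr(np.column_stack([q1, np.eye(n)]))
    if Q1[:, 0] @ q1 < 0:        # QR may flip the sign of the first column
        Q1[:, 0] = -Q1[:, 0]
    B = Q1.T @ A @ Q1            # B = [[lam1, 0], [0, C]] up to roundoff
    C = B[1:, 1:]                # (n-1) x (n-1) symmetric block
    P, D_sub = orthogonally_diagonalize(C)   # inductive hypothesis
    Q2 = np.block([[np.eye(1),            np.zeros((1, n - 1))],
                   [np.zeros((n - 1, 1)), P                   ]])
    Q = Q1 @ Q2
    D = np.block([[lam1 * np.eye(1),      np.zeros((1, n - 1))],
                  [np.zeros((n - 1, 1)), D_sub                ]])
    return Q, D

# Sanity check on a random symmetric matrix:
rng = np.random.default_rng(42)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2
Q, D = orthogonally_diagonalize(A)
assert np.allclose(Q @ Q.T, np.eye(4))
assert np.allclose(Q @ D @ Q.T, A)
```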

Theorem.

Orthogonally diagonalizable matrices of size n have a set of n orthonormal eigenvectors

Let $\boldsymbol{A}$ be an $n\times{n}$ matrix. $\boldsymbol{A}$ is orthogonally diagonalizable if and only if $\boldsymbol{A}$ has an orthonormal set of $n$ eigenvectors.

Proof. We first prove the forward proposition. Let $\boldsymbol{A}$ be an $n\times{n}$ orthogonally diagonalizable matrix, which means that there exists an orthogonal matrix $\boldsymbol{Q}$ and a diagonal matrix $\boldsymbol{D}$ such that:

$$\boldsymbol{A}=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^{-1}$$

Since an orthogonally diagonalizable matrix is also diagonalizable by theoremlink, we can apply the diagonalization theoremlink. This means that the columns of $\boldsymbol{Q}$ are eigenvectors of $\boldsymbol{A}$. Since $\boldsymbol{Q}$ is an orthogonal matrix, its columns are orthonormal by propertylink. Therefore, $\boldsymbol{A}$ has an orthonormal set of $n$ eigenvectors. This completes the proof of the forward proposition.

* * *

Let's now prove the converse. Let $\boldsymbol{A}$ be an $n\times{n}$ matrix with an orthonormal set of $n$ eigenvectors $\boldsymbol{q}_1$, $\boldsymbol{q}_2$, $\cdots$, $\boldsymbol{q}_n$. Let the corresponding eigenvalues be $\lambda_1$, $\lambda_2$, $\cdots$, $\lambda_n$. By definitionlink of eigenvalues and eigenvectors, the following equations hold:

$$\begin{equation}\label{eq:ZyE5KJCsAVLL48QgSrz} \begin{gathered} \boldsymbol{Aq}_1=\lambda_1\boldsymbol{q}_1\\ \boldsymbol{Aq}_2=\lambda_2\boldsymbol{q}_2\\ \vdots\\ \boldsymbol{Aq}_n=\lambda_n\boldsymbol{q}_n\\ \end{gathered} \end{equation}$$

Let $\boldsymbol{Q}$ be a matrix whose column vectors are $\boldsymbol{q}_1$, $\boldsymbol{q}_2$, $\cdots$, $\boldsymbol{q}_n$ like so:

$$\boldsymbol{Q}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert \end{pmatrix}$$

Since the column vectors of $\boldsymbol{Q}$ are orthonormal, $\boldsymbol{Q}$ is an orthogonal matrix by theoremlink. Using theoremlink, the product $\boldsymbol{AQ}$ can be written as:

$$\boldsymbol{AQ}= \begin{pmatrix}\vert&\vert&\cdots&\vert\\ \boldsymbol{Aq}_1& \boldsymbol{Aq}_2&\cdots& \boldsymbol{Aq}_n\\\vert&\vert&\cdots&\vert \end{pmatrix}$$

Now, we use \eqref{eq:ZyE5KJCsAVLL48QgSrz} to get:

$$\boldsymbol{AQ}= \begin{pmatrix}\vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{q}_1& \lambda_2\boldsymbol{q}_2&\cdots& \lambda_n\boldsymbol{q}_n\\\vert&\vert&\cdots&\vert \end{pmatrix}$$

Using theoremlink, this can be separated into:

$$\begin{align*} \boldsymbol{AQ} &= \begin{pmatrix}\vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1& \boldsymbol{q}_2&\cdots& \boldsymbol{q}_n\\\vert&\vert&\cdots&\vert \end{pmatrix}\begin{pmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\lambda_n \end{pmatrix} \\ &=\boldsymbol{QD} \end{align*}$$

Therefore, we have that $\boldsymbol{AQ}=\boldsymbol{QD}$. Because $\boldsymbol{Q}$ is an orthogonal matrix, $\boldsymbol{Q}$ is invertible by propertylink and so $\boldsymbol{Q}^{-1}$ exists. Finally, we perform some basic matrix manipulation:

$$\begin{align*} \boldsymbol{AQ}&=\boldsymbol{QD}\\ \boldsymbol{Q}^{-1}\boldsymbol{AQ}&=\boldsymbol{Q}^{-1}\boldsymbol{QD}\\ \boldsymbol{Q}^{T}\boldsymbol{AQ}&=\boldsymbol{D}\\ \end{align*}$$

This means that $\boldsymbol{A}$ is orthogonally diagonalizable by definitionlink. This completes the proof.
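The converse direction mirrors how a symmetric matrix is orthogonally diagonalized in practice. A brief check, assuming NumPy, on a hypothetical symmetric matrix (`np.linalg.eigh` returns an orthonormal set of eigenvectors as the columns of $\boldsymbol{Q}$):

```python
import numpy as np

# A hypothetical symmetric matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

# eigh returns the eigenvalues and an orthonormal set of eigenvectors
# (the columns of Q) for a symmetric matrix.
eigvals, Q = np.linalg.eigh(A)

# Q^T A Q is the diagonal matrix of eigenvalues, as in the proof.
assert np.allclose(Q.T @ A @ Q, np.diag(eigvals))
```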

Theorem.

Principal axis theorem - equivalent statements of orthogonally diagonalizable matrices

Let $\boldsymbol{A}$ be an $n\times{n}$ matrix. The following statements are equivalent:

  1. $\boldsymbol{A}$ is orthogonally diagonalizable.

  2. $\boldsymbol{A}$ is symmetric.

  3. $\boldsymbol{A}$ has an orthonormal set of $n$ eigenvectors.

Proof. $(1)\Longleftrightarrow(2)$ is true by the real spectral theoremlink. $(1)\Longleftrightarrow(3)$ is true by theoremlink. Therefore, we have that:

$$(1)\Longleftrightarrow(2) \Longleftrightarrow(3)$$

This completes the proof.

Theorem.

Eigenvectors of symmetric matrices corresponding to distinct eigenvalues are orthogonal

If $\boldsymbol{A}$ is a symmetric matrixlink, then the eigenvectorslink of $\boldsymbol{A}$ corresponding to distinct eigenvalueslink are orthogonal.

Proof. Let $\boldsymbol{A}$ be a symmetric matrix. Denote any two distinct eigenvalues of $\boldsymbol{A}$ as $\lambda_1$ and $\lambda_2$ with corresponding eigenvectors $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$. Our goal is to show that $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$ are orthogonal.

By definitionlink of eigenvalues and eigenvectors, we have that:

$$\begin{equation}\label{eq:iR4kSnCnYf90PSjY8zB} \begin{aligned}[b] \boldsymbol{A}\boldsymbol{x}_1&=\lambda_1\boldsymbol{x}_1\\ \boldsymbol{A}\boldsymbol{x}_2&=\lambda_2\boldsymbol{x}_2\\ \end{aligned} \end{equation}$$

Let's focus on the first equation. Taking the transpose of both sides gives:

$$(\boldsymbol{A}\boldsymbol{x}_1)^T =(\lambda_1\boldsymbol{x}_1)^T$$

By propertylink and propertylink of matrix transpose, we have that:

$$\boldsymbol{x}_1^T\boldsymbol{A}^T =\lambda_1\boldsymbol{x}_1^T$$

Since $\boldsymbol{A}$ is symmetric, $\boldsymbol{A}^T=\boldsymbol{A}$. Therefore, we have:

$$\boldsymbol{x}_1^T\boldsymbol{A} =\lambda_1\boldsymbol{x}_1^T$$

Multiplying both sides on the right by the eigenvector $\boldsymbol{x}_2$ gives:

$$\boldsymbol{x}_1^T\boldsymbol{A}\boldsymbol{x}_2 =\lambda_1\boldsymbol{x}_1^T\boldsymbol{x}_2$$

Using the second equation of \eqref{eq:iR4kSnCnYf90PSjY8zB}, we get:

$$\begin{align*} \boldsymbol{x}_1^T\lambda_2\boldsymbol{x}_2 &=\lambda_1\boldsymbol{x}_1^T\boldsymbol{x}_2\\ \lambda_2\boldsymbol{x}_1^T\boldsymbol{x}_2 &=\lambda_1\boldsymbol{x}_1^T\boldsymbol{x}_2\\ \lambda_2\boldsymbol{x}_1^T\boldsymbol{x}_2- \lambda_1\boldsymbol{x}_1^T\boldsymbol{x}_2 &=0\\ (\lambda_2-\lambda_1)\,\boldsymbol{x}_1^T\boldsymbol{x}_2 &=0\\ (\lambda_2-\lambda_1)(\boldsymbol{x}_1\cdot\boldsymbol{x}_2) &=0 \end{align*}$$

Because $\lambda_2$ and $\lambda_1$ are distinct, we have that $\lambda_2-\lambda_1\ne0$ and thus:

$$\boldsymbol{x}_1\cdot\boldsymbol{x}_2=0$$

By theoremlink, $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$ must be orthogonal. This completes the proof.
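To see this theorem in action without relying on a routine that already returns orthonormal vectors, we can recover an eigenvector for each eigenvalue as a null vector of $\boldsymbol{A}-\lambda\boldsymbol{I}$ and check the dot products directly. A sketch assuming NumPy; the matrix below is hypothetical and has distinct eigenvalues:

```python
import numpy as np

# A symmetric matrix with three distinct eigenvalues.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
eigvals = np.linalg.eigvalsh(A)

def null_vector(M):
    # Right-singular vector for the smallest singular value of M,
    # i.e. an (approximate) null vector of the singular matrix M.
    return np.linalg.svd(M)[2][-1]

# One eigenvector per eigenvalue, computed independently for each lambda.
vecs = [null_vector(A - lam * np.eye(3)) for lam in eigvals]

# Eigenvectors corresponding to distinct eigenvalues are orthogonal.
for i in range(3):
    for j in range(i + 1, 3):
        assert abs(vecs[i] @ vecs[j]) < 1e-8
```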

Example.

Demonstrating that eigenvectors of symmetric matrices are orthogonal

Consider the following symmetric matrix:

$$\boldsymbol{A}= \begin{pmatrix} 1&2\\2&1 \end{pmatrix}$$

Find the eigenvectors of $\boldsymbol{A}$ and show that they are orthogonal.

Solution. The characteristic polynomiallink of $\boldsymbol{A}$ is:

$$\begin{align*} \det(\boldsymbol{A}-\lambda\boldsymbol{I}) &= \begin{vmatrix} 1-\lambda&2\\2&1-\lambda\\ \end{vmatrix}\\ &=(1-\lambda)^2-2^2\\ &=1-2\lambda+\lambda^2-4\\ &=\lambda^2-2\lambda-3\\ &=(\lambda-3)(\lambda+1) \end{align*}$$

The eigenvalues of $\boldsymbol{A}$ are $\lambda_1=3$ and $\lambda_2 =-1$. We now find the eigenvectors corresponding to each of these eigenvalues:

$$\begin{align*} \begin{pmatrix}1-\lambda_1&2\\2&1-\lambda_1\\\end{pmatrix} \begin{pmatrix}x_{11}\\x_{12}\end{pmatrix}&= \begin{pmatrix}0\\0\end{pmatrix}\\ \begin{pmatrix}-2&2\\2&-2\\\end{pmatrix} \begin{pmatrix}x_{11}\\x_{12}\end{pmatrix}&= \begin{pmatrix}0\\0\end{pmatrix}\\ \begin{pmatrix}-2&2\\0&0\\\end{pmatrix} \begin{pmatrix}x_{11}\\x_{12}\end{pmatrix}&= \begin{pmatrix}0\\0\end{pmatrix}\\ \begin{pmatrix}1&-1\\0&0\\\end{pmatrix} \begin{pmatrix}x_{11}\\x_{12}\end{pmatrix}&= \begin{pmatrix}0\\0\end{pmatrix}\\ \end{align*}$$

Let $x_{12}=r$ where $r$ is some scalar. Using the first row, we get $x_{11}=r$. Therefore, the eigenvectors corresponding to $\lambda_1=3$ can be expressed as:

$$\begin{pmatrix} x_{11}\\x_{12} \end{pmatrix}= \begin{pmatrix} r\\r \end{pmatrix}= \begin{pmatrix} 1\\1 \end{pmatrix}r$$

Let's now get the eigenvectors corresponding to $\lambda_2=-1$ below:

$$\begin{align*} \begin{pmatrix}1-\lambda_2&2\\2&1-\lambda_2\\\end{pmatrix} \begin{pmatrix}x_{21}\\x_{22}\end{pmatrix}&= \begin{pmatrix}0\\0\end{pmatrix}\\ \begin{pmatrix}2&2\\2&2\\\end{pmatrix} \begin{pmatrix}x_{21}\\x_{22}\end{pmatrix}&= \begin{pmatrix}0\\0\end{pmatrix}\\ \begin{pmatrix}1&1\\0&0\\\end{pmatrix} \begin{pmatrix}x_{21}\\x_{22}\end{pmatrix}&= \begin{pmatrix}0\\0\end{pmatrix}\\ \end{align*}$$

Let $x_{22}=t$ where $t\in\mathbb{R}$. Using the first row, we get $x_{21}=-t$. Therefore, the eigenvectors corresponding to $\lambda_2=-1$ can be expressed as:

$$\begin{pmatrix} x_{21}\\x_{22} \end{pmatrix}= \begin{pmatrix} -t\\t \end{pmatrix}= \begin{pmatrix} -1\\1 \end{pmatrix}t$$

Let's now check that the eigenvectors corresponding to $\lambda_1$ and $\lambda_2$ are orthogonal:

$$\begin{align*} \begin{pmatrix}1\\1\end{pmatrix}r\cdot \begin{pmatrix}-1\\1\end{pmatrix}t&= rt\left[\begin{pmatrix}1\\1\end{pmatrix}\cdot \begin{pmatrix}-1\\1\end{pmatrix}\right]\\ &=rt\big[(1)(-1)+(1)(1)\big]\\ &=0 \end{align*}$$

Since the dot product of the two eigenvectors is $0$, they are orthogonal by theoremlink.
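The same conclusion can be verified numerically, assuming NumPy (the signs of the returned eigenvectors may differ from our hand computation):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

eigvals, eigvecs = np.linalg.eigh(A)
print(eigvals)                        # [-1.  3.]

# The two eigenvectors are orthogonal: their dot product is ~0.
print(eigvecs[:, 0] @ eigvecs[:, 1])  # 0.0
```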

Example.

Performing orthogonal diagonalization on a symmetric 2x2 matrix

Consider the same symmetric matrix as in the previous example:

$$\boldsymbol{A}= \begin{pmatrix} 1&2\\2&1 \end{pmatrix}$$

Orthogonally diagonalize $\boldsymbol{A}$.

Solution. Firstly, since matrix $\boldsymbol{A}$ is symmetric, $\boldsymbol{A}$ is orthogonally diagonalizable by theoremlink. This means that there exists an orthogonal matrix $\boldsymbol{Q}$ and a diagonal matrix $\boldsymbol{D}$ such that:

$$\boldsymbol{A}=\boldsymbol{Q}\boldsymbol{D}\boldsymbol{Q}^T$$

In examplelink, we found that:

  • Eigenvalue $\lambda_1=3$, with eigenspace $\left\{\begin{pmatrix} 1\\1 \end{pmatrix}r \;|\; r\in\mathbb{R} \right\}$ and example eigenvector $\begin{pmatrix} 1\\1 \end{pmatrix}$.

  • Eigenvalue $\lambda_2=-1$, with eigenspace $\left\{\begin{pmatrix} -1\\1 \end{pmatrix}t \;|\; t\in\mathbb{R} \right\}$ and example eigenvector $\begin{pmatrix} -1\\1 \end{pmatrix}$.

Let's normalize the eigenvectors:

$$\boldsymbol{u}_1= \frac{1}{\sqrt{(1)^2+(1)^2}} \begin{pmatrix}1\\1\end{pmatrix}= \frac{1}{\sqrt2}\begin{pmatrix}1\\1\end{pmatrix} ,\;\;\;\;\;\;\; \boldsymbol{u}_2= \frac{1}{\sqrt{(-1)^2+(1)^2}} \begin{pmatrix}-1\\1\end{pmatrix}= \frac{1}{\sqrt2}\begin{pmatrix}-1\\1\end{pmatrix}$$

We can now orthogonally diagonalize $\boldsymbol{A}$ like so:

$$\begin{align*} \boldsymbol{A}&= \begin{pmatrix}\vert&\vert\\\boldsymbol{u}_1& \boldsymbol{u}_2\\\vert&\vert\end{pmatrix} \begin{pmatrix}\lambda_1&0\\0&\lambda_2\end{pmatrix} \begin{pmatrix}-&\boldsymbol{u}_1&-\\ -&\boldsymbol{u}_2&-\end{pmatrix}\\ &=\begin{pmatrix}1/\sqrt2&-1/\sqrt2\\1/\sqrt2&1/\sqrt2\end{pmatrix} \begin{pmatrix}3&0\\0&-1\end{pmatrix} \begin{pmatrix}1/\sqrt2&1/\sqrt2\\-1/\sqrt2&1/\sqrt2\end{pmatrix} \end{align*}$$
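As a final check, multiplying the three factors back together recovers $\boldsymbol{A}$ (assuming NumPy):

```python
import numpy as np

s = 1 / np.sqrt(2)
Q = np.array([[s, -s],
              [s,  s]])
D = np.diag([3.0, -1.0])

# Q D Q^T should reproduce the original matrix A.
print(Q @ D @ Q.T)  # [[1. 2.]
                    #  [2. 1.]]
```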