# Introduction to Diagonalization

Last updated: Aug 11, 2023

Tags: Linear Algebra
Definition.

# Diagonalizable matrix

An $n\times{n}$ matrix $\boldsymbol{A}$ is said to be diagonalizable if there exists an invertible matrix $\boldsymbol{P}$ and a diagonal matrix $\boldsymbol{D}$ such that:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Note that $\boldsymbol{P}$ is said to diagonalize $\boldsymbol{A}$. We say that $\boldsymbol{PDP}^{-1}$ is an eigen-decomposition of $\boldsymbol{A}$.

Example.

## 2x2 diagonalizable matrix

Here's an example of a diagonalizable $2\times2$ matrix:

$$\begin{pmatrix} 2&3\\4&1\\ \end{pmatrix}=\begin{pmatrix} 1&3\\1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\0&-2\\ \end{pmatrix} \begin{pmatrix} 1&3\\1&-4\\ \end{pmatrix}^{-1}$$
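This factorization is easy to verify numerically. Here is a minimal sketch using NumPy (assuming it is available), reconstructing the matrix from its claimed factors:

```python
import numpy as np

# The matrices from the example above
A = np.array([[2, 3],
              [4, 1]])
P = np.array([[1, 3],
              [1, -4]])
D = np.diag([5, -2])

# Reconstruct A from P D P^{-1} and compare
reconstructed = P @ D @ np.linalg.inv(P)
assert np.allclose(reconstructed, A)
print(reconstructed)  # recovers [[2. 3.] [4. 1.]]
```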
Theorem.

# Diagonalizable matrices and the corresponding diagonal matrices are similar

If $\boldsymbol{A}$ is a diagonalizable matrix and $\boldsymbol{D}$ is the corresponding diagonal matrix, then $\boldsymbol{A}$ and $\boldsymbol{D}$ are similar.

Proof. Let $\boldsymbol{A}$ be a diagonalizable matrix:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Where $\boldsymbol{P}$ is an invertible matrix and $\boldsymbol{D}$ is a diagonal matrix. By the definition of similar matrices, $\boldsymbol{A}$ and $\boldsymbol{D}$ are similar.

This means that a diagonalizable matrix and its associated diagonal matrix enjoy the properties of similar matrices.

Theorem.

# Diagonalization Theorem

Suppose we have an $n\times{n}$ matrix $\boldsymbol{A}$ that is diagonalizable, which means $\boldsymbol{A}$ can be written as:

$$\boldsymbol{A}=\boldsymbol{PDP}^{-1}$$

Where $\boldsymbol{P}$ is some invertible matrix and $\boldsymbol{D}$ is a diagonal matrix.

The diagonalization theorem states that:

• $\boldsymbol{D}$ is an $n\times{n}$ diagonal matrix with the eigenvalues of $\boldsymbol{A}$ as its diagonal entries.

• $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix with the eigenvectors of $\boldsymbol{A}$ as its columns.

Proof. Suppose we have an $n\times{n}$ diagonalizable matrix $\boldsymbol{A}$ as follows:

$$\boldsymbol{A}=\boldsymbol{PDP}^{-1}$$

Where $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix and $\boldsymbol{D}$ is an $n\times{n}$ diagonal matrix.

Suppose $\boldsymbol{P}$ has columns $\boldsymbol{x}_1$, $\boldsymbol{x}_2$, $\cdots$, $\boldsymbol{x}_n$ like so:

$$\boldsymbol{P}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}$$

Suppose the diagonal matrix $\boldsymbol{D}$ is:

$$\boldsymbol{D}= \begin{pmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\smash{\ddots}&\vdots\\ 0&0&\cdots&\lambda_n\\ \end{pmatrix}$$

Where the diagonal entries $\lambda_1$, $\lambda_2$, $\cdots$, $\lambda_n$ are some real numbers. Note that we have defined $\boldsymbol{P}$ and $\boldsymbol{D}$ as above, but we haven't imposed any conditions on them; we're simply expressing them as generic matrices.

By the properties of matrix multiplication, the matrix product $\boldsymbol{PD}$ scales each column of $\boldsymbol{P}$ by the corresponding diagonal entry of $\boldsymbol{D}$:

\begin{equation}\label{eq:LDiIcKRQr8b5m9eHnuG} \begin{aligned}[b] \boldsymbol{PD} &= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\lambda_n\\ \end{pmatrix}\\ &= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{aligned} \end{equation}

Similarly, the matrix product $\boldsymbol{AP}$ is:

\begin{equation}\label{eq:JWfoh7K8qb73Xx4MAR1} \begin{aligned}[b] \boldsymbol{AP} &= \boldsymbol{A} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}\\ &= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{aligned} \end{equation}

Now recall that $\boldsymbol{A}=\boldsymbol{PDP}^{-1}$. Multiplying both sides by $\boldsymbol{P}$ would give us $\boldsymbol{AP}=\boldsymbol{PD}$. Using equations \eqref{eq:LDiIcKRQr8b5m9eHnuG} and \eqref{eq:JWfoh7K8qb73Xx4MAR1}, we have that:

\begin{align*} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{align*}

Equating the columns gives us:

\begin{align*} \boldsymbol{Ax}_1&=\lambda_1\boldsymbol{x}_1\\ \boldsymbol{Ax}_2&=\lambda_2\boldsymbol{x}_2\\ &\;\;\vdots\\ \boldsymbol{Ax}_n&=\lambda_n\boldsymbol{x}_n\\ \end{align*}

Because $\boldsymbol{P}$ is invertible, its column vectors $\boldsymbol{x}_i$ are non-zero vectors. Therefore, by the definition of eigenvalues and eigenvectors:

• $\lambda_1$, $\lambda_2$, $\cdots$, $\lambda_n$ must represent the eigenvalues.

• $\boldsymbol{x}_1$, $\boldsymbol{x}_2$, $\cdots$, $\boldsymbol{x}_n$ must represent the corresponding eigenvectors!

This completes the proof.
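The column-by-column equalities $\boldsymbol{Ax}_i=\lambda_i\boldsymbol{x}_i$ at the heart of this proof can be checked numerically for the $2\times2$ example from earlier. A quick sketch with NumPy (assumed available):

```python
import numpy as np

A = np.array([[2, 3],
              [4, 1]])
P = np.array([[1, 3],
              [1, -4]])   # columns are the eigenvectors x_1, x_2
lambdas = [5, -2]         # corresponding eigenvalues

# Verify A x_i = lambda_i x_i for each column of P
for i, lam in enumerate(lambdas):
    x = P[:, i]
    assert np.allclose(A @ x, lam * x)
print("A x_i = lambda_i x_i holds for every column of P")
```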

Example.

## Diagonalizing a 2x2 matrix

Diagonalize the following matrix:

$$\boldsymbol{A}=\begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix}$$

That is, find the diagonal matrix $\boldsymbol{D}$ and invertible matrix $\boldsymbol{P}$ such that $\boldsymbol{A}=\boldsymbol{PDP}^{-1}$.

Solution. We will use the diagonalization theorem to diagonalize matrix $\boldsymbol{A}$. We know that the columns of matrix $\boldsymbol{P}$ are the eigenvectors of $\boldsymbol{A}$ whereas the diagonal entries of matrix $\boldsymbol{D}$ are the eigenvalues of $\boldsymbol{A}$.

Therefore, we must compute the eigenvalues and eigenvectors of $\boldsymbol{A}$. We already went through the step-by-step calculations for this specific matrix $\boldsymbol{A}$ in our guide on eigenvalues and eigenvectors, so we will skip them here. The eigenvalues of $\boldsymbol{A}$ are as follows:

$$\lambda_1=5,\;\;\; \lambda_2=-2$$

The corresponding eigenvectors are:

$$\boldsymbol{x}_1=\begin{pmatrix} 1\\ 1\\ \end{pmatrix},\;\;\; \boldsymbol{x}_2=\begin{pmatrix} 3\\ -4\\ \end{pmatrix}$$

We can now fully define matrices $\boldsymbol{P}$ and $\boldsymbol{D}$ like so:

$$\boldsymbol{P} =\begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix},\;\;\; \boldsymbol{D} =\begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix}$$

Matrix $\boldsymbol{A}$ can therefore be written as:

\begin{align*} \boldsymbol{A}&=\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1} \\ &=\begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix} \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix}^{-1} \end{align*}
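In practice, this computation can be delegated to a numerical library. A sketch using NumPy (assumed available): `np.linalg.eig` returns the eigenvalues and a matrix whose columns are eigenvectors, playing the roles of $\boldsymbol{D}$'s diagonal and $\boldsymbol{P}$. The returned eigenvectors are normalized, so they differ from ours by a scalar factor, but any non-zero scaling of an eigenvector works equally well.

```python
import numpy as np

A = np.array([[2, 3],
              [4, 1]])

# eig returns (eigenvalues, matrix of column eigenvectors) = (diag of D, P)
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

print(sorted(eigenvalues))                        # approximately [-2.0, 5.0]
print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True
```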
Theorem.

# Taking the power of a diagonalized matrix

If $\boldsymbol{A}$ is a diagonalizable $n\times{n}$ matrix, then $\boldsymbol{A}$ raised to the power of $k$ is:

$$\boldsymbol{A}^k=\boldsymbol{P}\boldsymbol{D}^k\boldsymbol{P}^{-1}$$

Proof. Since $\boldsymbol{A}$ is assumed to be a diagonalizable matrix, we have that:

$$\boldsymbol{A}=\boldsymbol{PDP}^{-1}$$

Taking the power of $k$ on both sides yields:

\begin{align*} \boldsymbol{A}^k&=(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})^k\\ &=\underbrace{(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})\cdots(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})}_{k}\\ &=\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\cdots\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\\ &=\boldsymbol{P}\boldsymbol{D}\boldsymbol{I}\boldsymbol{D}\boldsymbol{I}\boldsymbol{D}\boldsymbol{I}\cdots\boldsymbol{I}\boldsymbol{D}\boldsymbol{P}^{-1}\\ &=\boldsymbol{P}\underbrace{\boldsymbol{D}\boldsymbol{D}\boldsymbol{D}\cdots\boldsymbol{D}}_k\boldsymbol{P}^{-1}\\ &=\boldsymbol{P}\boldsymbol{D}^k\boldsymbol{P}^{-1}\\ \end{align*}

This completes the proof.

Example.

## Computing the power of a diagonalizable matrix

Consider the same matrix as earlier:

$$\boldsymbol{A}=\begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix}$$

Compute $\boldsymbol{A}^{3}$.

Solution. Recall that the diagonalized form of $\boldsymbol{A}$ is:

\begin{align*} \boldsymbol{A}&=\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1} \\ &=\begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix} \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix}^{-1} \end{align*}

We now apply the theorem:

\begin{align*} \boldsymbol{A}^3&= \boldsymbol{P} \boldsymbol{D}^3 \boldsymbol{P}^{-1}\\ &= \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix}^3 \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix}^{-1}\\ &= \begin{pmatrix}1&3\\1&-4\\\end{pmatrix} \begin{pmatrix}5^3&0\\0&(-2)^3\\\end{pmatrix} \begin{pmatrix}\frac{4}{7}&\frac{3}{7}\\\frac{1}{7}&\frac{-1}{7}\\\end{pmatrix}\\ &= \begin{pmatrix}68&57\\76&49\\\end{pmatrix} \end{align*}

This theorem is particularly useful when computing much higher powers. For instance, suppose we wanted to compute $\boldsymbol{A}^{100}$. Without the theorem, this would involve computing matrix products $99$ times:

$$\boldsymbol{A}^{100} = \underbrace{ \begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix} \begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix} \cdots \begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix} }_{100}$$

In contrast, using our theorem involves computing the following:

\begin{align*} \boldsymbol{A}^{100}&= \boldsymbol{P} \boldsymbol{D}^{100} \boldsymbol{P}^{-1}\\ &= \begin{pmatrix}1&3\\1&-4\\\end{pmatrix} \begin{pmatrix}5^{100}&0\\0&(-2)^{100}\\\end{pmatrix} \begin{pmatrix}\frac{4}{7}&\frac{3}{7}\\\frac{1}{7}&\frac{-1}{7}\\\end{pmatrix} \end{align*}

Computing powers of several numbers is computationally much cheaper than performing matrix multiplication $99$ times!
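This shortcut is just as easy to exploit programmatically, since powering $\boldsymbol{D}$ only requires powering its diagonal entries. A sketch with NumPy (assumed available), reproducing the $\boldsymbol{A}^3$ computation above:

```python
import numpy as np

A = np.array([[2, 3],
              [4, 1]])
P = np.array([[1, 3],
              [1, -4]])
d = np.array([5, -2])  # diagonal entries of D

# A^3 = P D^3 P^{-1}, where D^3 just cubes the diagonal entries
A_cubed = P @ np.diag(d ** 3) @ np.linalg.inv(P)
print(np.round(A_cubed).astype(int))  # [[68 57] [76 49]]

# Cross-check against direct repeated multiplication
assert np.allclose(A_cubed, np.linalg.matrix_power(A, 3))
```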

Theorem.

# Matrix of size n is diagonalizable if and only if it has n linearly independent eigenvectors

Let $\boldsymbol{A}$ be an $n\times{n}$ matrix. $\boldsymbol{A}$ is diagonalizable if and only if $\boldsymbol{A}$ has $n$ linearly independent eigenvectors.

Proof. We start by proving the forward proposition. Assume $\boldsymbol{A}$ is diagonalizable, which means that:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Where:

• $\boldsymbol{D}$ is a diagonal matrix with eigenvalues of $\boldsymbol{A}$ as its diagonal entries.

• $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix with the eigenvectors of $\boldsymbol{A}$ as its columns.

Since $\boldsymbol{A}$ is diagonalizable, both $\boldsymbol{D}$ and $\boldsymbol{P}$ exist. Since $\boldsymbol{P}$ is invertible, $\boldsymbol{P}$ has $n$ linearly independent columns. Since the columns of $\boldsymbol{P}$ are eigenvectors of $\boldsymbol{A}$, we conclude that $\boldsymbol{A}$ has $n$ linearly independent eigenvectors.

* * *

Let's now prove the converse. We assume that $\boldsymbol{A}$ has $n$ linearly independent eigenvectors $\boldsymbol{x}_1$, $\boldsymbol{x}_2$, $\cdots$, $\boldsymbol{x}_n$ and corresponding eigenvalues $\lambda_1$, $\lambda_2$, $\cdots$, $\lambda_n$. By the definition of eigenvectors and eigenvalues, the following equations hold:

$$\begin{equation}\label{eq:x4dH09uGAwai0JZqNxh} \begin{gathered} \boldsymbol{A}\boldsymbol{x}_1 =\lambda_1\boldsymbol{x}_1\\ \boldsymbol{A}\boldsymbol{x}_2 =\lambda_2\boldsymbol{x}_2\\ \vdots\\ \boldsymbol{A}\boldsymbol{x}_n =\lambda_n\boldsymbol{x}_n\\ \end{gathered} \end{equation}$$

In matrix form, we can express this as:

$$\begin{equation}\label{eq:ymonPOqHL1v7Dg4EZ2F} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{equation}$$

Here, equating the columns of \eqref{eq:ymonPOqHL1v7Dg4EZ2F} recovers the equations in \eqref{eq:x4dH09uGAwai0JZqNxh}. By the properties of matrix multiplication, the left-hand side of \eqref{eq:ymonPOqHL1v7Dg4EZ2F} can be expressed as:

$$\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \boldsymbol{A}\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \boldsymbol{AP}$$

Where $\boldsymbol{P}$ is a matrix whose columns are the $n$ linearly independent eigenvectors.

Next, again by the properties of matrix multiplication, the right-hand side of \eqref{eq:ymonPOqHL1v7Dg4EZ2F} can be expressed as:

$$\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}=\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\lambda_n\\ \end{pmatrix}= \boldsymbol{PD}$$

Where $\boldsymbol{D}$ is a diagonal matrix whose diagonal entries are the eigenvalues of $\boldsymbol{A}$.

Now, \eqref{eq:ymonPOqHL1v7Dg4EZ2F} can be expressed as $\boldsymbol{AP}=\boldsymbol{PD}$. Since the columns of $\boldsymbol{P}$ are linearly independent, $\boldsymbol{P}$ is invertible. This means that $\boldsymbol{P}^{-1}$ exists and thus:

\begin{align*} \boldsymbol{AP}&=\boldsymbol{PD}\\ \boldsymbol{APP}^{-1}&=\boldsymbol{PDP}^{-1}\\ \boldsymbol{AI}_n&=\boldsymbol{PDP}^{-1}\\ \boldsymbol{A}&=\boldsymbol{PDP}^{-1}\\ \end{align*}

By definition, $\boldsymbol{A}$ is therefore diagonalizable. This completes the proof.
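To see the "only if" direction fail in practice, consider a shear matrix, a standard example of a non-diagonalizable matrix: its only eigenvalue has a one-dimensional eigenspace. A sketch with NumPy (assumed available); note that `np.linalg.eig` still returns two eigenvector columns, but they are parallel up to numerical tolerance:

```python
import numpy as np

# A shear matrix: its only eigenvalue is 1 and its eigenspace is
# one-dimensional, so it lacks 2 linearly independent eigenvectors
A = np.array([[1, 1],
              [0, 1]])

eigenvalues, vectors = np.linalg.eig(A)
print(eigenvalues)  # repeated eigenvalue: [1. 1.]

# The columns of `vectors` are (numerically) parallel, so the matrix
# of eigenvectors has rank 1 rather than 2, and A is not diagonalizable
print(np.linalg.matrix_rank(vectors))  # 1
```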

Theorem.

# Matrix of size n with n distinct eigenvalues is diagonalizable

If $\boldsymbol{A}$ is an $n\times{n}$ matrix with $n$ distinct eigenvalues, then $\boldsymbol{A}$ is diagonalizable.

Proof. By the definition of diagonalizable matrices and the diagonalization theorem, we know that $\boldsymbol{A}$ is diagonalizable if we can write $\boldsymbol{A}$ as:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Where:

• $\boldsymbol{D}$ is a diagonal matrix with eigenvalues of $\boldsymbol{A}$ as its diagonal entries.

• $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix with the eigenvectors of $\boldsymbol{A}$ as its columns.

If an $n\times{n}$ matrix $\boldsymbol{A}$ has $n$ distinct eigenvalues, then $\boldsymbol{A}$ has $n$ linearly independent eigenvectors, since eigenvectors corresponding to distinct eigenvalues are linearly independent. By the previous theorem, $\boldsymbol{A}$ is therefore diagonalizable. This completes the proof.

Note that the converse is not necessarily true: a diagonalizable $n\times{n}$ matrix may not have $n$ distinct eigenvalues.
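Both directions of this note can be illustrated numerically. A sketch with NumPy (assumed available), using the identity matrix as a diagonalizable matrix with a repeated eigenvalue:

```python
import numpy as np

# Distinct eigenvalues guarantee diagonalizability ...
A = np.array([[2, 3],
              [4, 1]])
print(sorted(np.linalg.eigvals(A)))  # approximately [-2.0, 5.0]: distinct

# ... but the converse fails: the identity matrix is diagonalizable
# (I = P I P^{-1} for any invertible P) yet its only eigenvalue is 1
I2 = np.eye(2)
print(np.linalg.eigvals(I2))  # [1. 1.]
```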

Theorem.

# Matrix is diagonalizable if and only if it has an eigenbasis

Let $\boldsymbol{A}$ be a square matrix. $\boldsymbol{A}$ is diagonalizable if and only if $\boldsymbol{A}$ has an eigenbasis.

Proof. This theorem is essentially a restatement of the earlier theorem that $\boldsymbol{A}$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors. Let's first prove the forward proposition. Assume an $n\times{n}$ matrix $\boldsymbol{A}$ is diagonalizable. By that theorem, $\boldsymbol{A}$ has $n$ linearly independent eigenvectors, which form an eigenbasis for $\mathbb{R}^n$.

Let's now prove the converse. Assume that $\boldsymbol{A}$ has an eigenbasis. This means that $\boldsymbol{A}$ has $n$ linearly independent eigenvectors, so by the same theorem, $\boldsymbol{A}$ is diagonalizable.

This completes the proof.
