
# Comprehensive Guide on Cholesky Factorization

Last updated: Aug 12, 2023

Tags: Linear Algebra

We have proven in the previous guide that a positive definite matrix $\boldsymbol{A}$ can be Cholesky-factorized into $\boldsymbol{A}=\boldsymbol{U}^T\boldsymbol{U}$, where $\boldsymbol{U}$ is an upper triangular matrix with positive diagonal entries. However, that theorem merely guarantees the existence of the Cholesky factorization without providing concrete steps to obtain it.

In this section, we will go over an algorithm to obtain the Cholesky factorization of a positive definite matrix and then prove its correctness and uniqueness.

Definition.

# Cholesky Factorization

The Cholesky factorization of $\boldsymbol{A}$ is defined as $\boldsymbol{A}=\boldsymbol{U}^T\boldsymbol{U}$ where $\boldsymbol{U}$ is an upper triangular matrix with positive diagonal entries.

Theorem.

# Algorithm for Cholesky Factorization

If $\boldsymbol{A}$ is a positive definite matrix, then the existence theorem guarantees that $\boldsymbol{A}$ can be Cholesky-factorized as $\boldsymbol{A}=\boldsymbol{U}^T\boldsymbol{U}$. To obtain the Cholesky factorization, perform the following two steps:

1. Row-reduce $\boldsymbol{A}$ to an upper triangular matrix $\boldsymbol{U}_1$ using only the elementary row operation of adding a multiple of one row to another row below it.

2. Divide each row of $\boldsymbol{U}_1$ by the square root of the diagonal entry in that row.
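The two steps above can be sketched in code. Below is a minimal NumPy sketch (the function name `cholesky_via_elimination` is our own; it assumes the input is positive definite and uses floating-point arithmetic):

```python
import numpy as np

def cholesky_via_elimination(A):
    """Sketch of the two-step algorithm: (1) eliminate entries below the
    diagonal using only row-addition operations, (2) divide each row by
    the square root of its diagonal entry. Assumes A is positive definite."""
    U1 = np.array(A, dtype=float)
    n = U1.shape[0]
    # Step 1: row-reduce to the upper triangular matrix U1.
    for j in range(n):
        for i in range(j + 1, n):
            U1[i] -= (U1[i, j] / U1[j, j]) * U1[j]
    # Step 2: scale each row by the square root of its pivot.
    return U1 / np.sqrt(np.diag(U1))[:, None]

A = np.array([[4., 2.],
              [2., 3.]])
U = cholesky_via_elimination(A)
print(np.allclose(U.T @ U, A))  # True
```

The returned factor is upper triangular with positive diagonal entries, and `U.T @ U` reproduces `A` up to floating-point error.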

Before proving the correctness of this algorithm, we will first go through a simple example for demonstration.

Example.

# Performing Cholesky Factorization

Perform Cholesky factorization on the following matrix:

$$\boldsymbol{A}= \begin{pmatrix} 2&3&1\\ 3&6&2\\ 1&2&1 \end{pmatrix}$$

Solution. The first step of the Cholesky factorization is to row-reduce $\boldsymbol{A}$ by only performing the elementary row operation of adding a multiple of one row to another row below it:

$$\begin{equation}\label{eq:S4V5aGJ7dID0Iu3dhTc} \begin{pmatrix} 2&3&1\\3&6&2\\1&2&1 \end{pmatrix}\sim \begin{pmatrix} 2&3&1\\0&3/2&1/2\\0&1/2&1/2 \end{pmatrix}\sim \begin{pmatrix} 2&3&1\\0&3/2&1/2\\0&0&1/3 \end{pmatrix} \end{equation}$$

The rightmost matrix in \eqref{eq:S4V5aGJ7dID0Iu3dhTc} is $\boldsymbol{U}_1$. For the second step, we divide each row of $\boldsymbol{U}_1$ by the square root of the diagonal entry in that row:

$$\boldsymbol{U}=\begin{pmatrix} 2/\sqrt2&3/\sqrt2&1/\sqrt2\\ 0&(3/2)/\sqrt{3/2}&(1/2)/\sqrt{3/2}\\ 0&0&(1/3)/\sqrt{1/3} \end{pmatrix}$$

Therefore, the Cholesky factorization of $\boldsymbol{A}$ is:

\begin{align*} \boldsymbol{A}&=\boldsymbol{U}^T\boldsymbol{U} \\&= \begin{pmatrix} 2/\sqrt2&0&0\\ 3/\sqrt2&(3/2)/\sqrt{3/2}&0\\ 1/\sqrt2&(1/2)/\sqrt{3/2}&(1/3)/\sqrt{1/3} \end{pmatrix} \begin{pmatrix} 2/\sqrt2&3/\sqrt2&1/\sqrt2\\ 0&(3/2)/\sqrt{3/2}&(1/2)/\sqrt{3/2}\\ 0&0&(1/3)/\sqrt{1/3} \end{pmatrix} \end{align*}

As you can see, obtaining the Cholesky factorization of a positive definite matrix is quite simple!
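The defining properties of the factorization can also be checked numerically. A small sketch using NumPy's built-in routine, which returns the lower factor $\boldsymbol{L}$ with $\boldsymbol{A}=\boldsymbol{L}\boldsymbol{L}^T$, so the upper factor is $\boldsymbol{U}=\boldsymbol{L}^T$ (the test matrix here is an arbitrary positive definite matrix, not the one from the example):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4 * np.eye(4)    # a generic positive definite matrix

L = np.linalg.cholesky(A)      # NumPy returns lower L with A = L @ L.T
U = L.T                        # so the upper factor is L^T

print(np.allclose(U.T @ U, A))     # True: U^T U recovers A
print(np.allclose(U, np.triu(U)))  # True: U is upper triangular
print(np.all(np.diag(U) > 0))      # True: positive diagonal entries
```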

# Proving the correctness of the Cholesky Factorization algorithm

Let $\boldsymbol{A}$ be an $n\times{n}$ positive definite matrix. By the existence theorem, because $\boldsymbol{A}$ is positive definite, $\boldsymbol{A}$ can be factorized as:

$$\begin{equation}\label{eq:W2mRvwhj6ndbox3IZK6} \boldsymbol{A}= \boldsymbol{U}^T\boldsymbol{U} \end{equation}$$

Because $\boldsymbol{U}^T$ and $\boldsymbol{U}$ are square matrices, their diagonal entries are the same by a property of the matrix transpose. Let $\boldsymbol{D}$ be a diagonal matrix whose diagonal entries are those of $\boldsymbol{U}^T$ and $\boldsymbol{U}$. Suppose the diagonal entries of $\boldsymbol{D}$ are $d_1$, $d_2$, $\cdots$, $d_n$. The inverse of $\boldsymbol{D}$ is another diagonal matrix whose diagonal entries are $1/d_1$, $1/d_2$, $\cdots$, $1/d_n$.

Now, taking the transpose of an upper triangular matrix yields a lower triangular matrix. $\boldsymbol{U}^T$ is thus a lower triangular matrix with diagonal entries $d_1$, $d_2$, $\cdots$, $d_n$. The product $\boldsymbol{U}^T\boldsymbol{D}^{-1}$ is therefore a lower triangular matrix whose diagonal entries are all ones, since the reciprocals cancel out.

The inverse of a lower triangular matrix is also a lower triangular matrix whose diagonal entries are the reciprocals of those of the original matrix. This means that $(\boldsymbol{U}^T\boldsymbol{D}^{-1})^{-1}$ is also lower triangular with all diagonal entries equal to one. Let's denote this matrix by $\boldsymbol{L}$ like so:

$$\begin{equation}\label{eq:i88wig70IwSLSdp0TEk} \boldsymbol{L}= (\boldsymbol{U}^T\boldsymbol{D}^{-1})^{-1} \end{equation}$$

Using equations \eqref{eq:W2mRvwhj6ndbox3IZK6} and \eqref{eq:i88wig70IwSLSdp0TEk}, we can evaluate $\boldsymbol{LA}$ as:

\begin{align*} \boldsymbol{LA}&= \big[(\boldsymbol{U}^T\boldsymbol{D}^{-1})^{-1}\big] (\boldsymbol{U}^T\boldsymbol{U})\\ &=(\boldsymbol{D}^{-1})^{-1}(\boldsymbol{U}^T)^{-1} \boldsymbol{U}^T\boldsymbol{U}\\ &=\boldsymbol{D}\boldsymbol{I}\boldsymbol{U}\\ &=\boldsymbol{D}\boldsymbol{U}\\ \end{align*}

Here, the 2nd equality holds by the property of invertible matrices that $(\boldsymbol{B}\boldsymbol{C})^{-1}=\boldsymbol{C}^{-1}\boldsymbol{B}^{-1}$.

Recall the following:

• $\boldsymbol{D}$ is a diagonal matrix with diagonal entries $d_1$, $d_2$, $\cdots$, $d_n$.

• $\boldsymbol{U}$ is an upper triangular matrix with diagonal entries $d_1$, $d_2$, $\cdots$, $d_n$.

The product $\boldsymbol{DU}$ is therefore an upper triangular matrix with diagonal entries $d_1^2$, $d_2^2$, $\cdots$, $d_n^2$.

* * *

Now, suppose matrix $\boldsymbol{A}$ is reduced to an upper triangular matrix $\boldsymbol{U}_1$ using only the elementary row operation of adding a multiple of one row to another row below it. Each corresponding elementary matrix is then a lower triangular matrix whose diagonal entries are all ones.

To understand why, suppose we have some $3\times3$ matrix. The elementary matrix corresponding to the elementary row operation of multiplying the first row by $5$ and then adding it to the $3$rd row is:

$$\begin{pmatrix} 1&0&0\\ 0&1&0\\ 5&0&1\\ \end{pmatrix}$$

Notice how this is a lower triangular matrix whose diagonal entries are all ones. Applying an elementary row operation to matrix $\boldsymbol{A}$ is equivalent to left-multiplying $\boldsymbol{A}$ by the corresponding elementary matrix. The series of elementary row operations required to reduce $\boldsymbol{A}$ to $\boldsymbol{U}_1$ can therefore be represented by:

$$\begin{equation}\label{eq:dYd9w6a3KJz4r8kLOfo} \boldsymbol{E}_r \cdots \boldsymbol{E}_2 \boldsymbol{E}_1\boldsymbol{A} =\boldsymbol{U}_1 \end{equation}$$

where $\boldsymbol{E}_1$, $\boldsymbol{E}_2$, $\cdots$, $\boldsymbol{E}_r$ are the corresponding elementary matrices. As just discussed, these matrices are all lower triangular with unit diagonal, and the product of such matrices is again lower triangular with diagonal entries of all ones. This means that we can define a lower triangular matrix $\boldsymbol{L}_1$ whose diagonal entries are all ones such that:

$$\boldsymbol{L}_1=\boldsymbol{E}_r \cdots \boldsymbol{E}_2 \boldsymbol{E}_1$$

With this, \eqref{eq:dYd9w6a3KJz4r8kLOfo} can be simplified as:

$$\begin{equation}\label{eq:GM9sQFICSgDTohkl2CW} \boldsymbol{L}_1\boldsymbol{A}= \boldsymbol{U}_1 \end{equation}$$
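Equation \eqref{eq:GM9sQFICSgDTohkl2CW} can be illustrated numerically. The sketch below builds the elementary matrices explicitly for a small positive definite matrix chosen so the multipliers are simple (this $3\times3$ matrix is our own illustration, not one from the text):

```python
import numpy as np

# A 3x3 positive definite matrix chosen so the multipliers are simple.
A = np.array([[4., 2., 2.],
              [2., 5., 3.],
              [2., 3., 6.]])

# Elementary matrices: each adds a multiple of one row to a row below it.
E1 = np.eye(3); E1[1, 0] = -1/2   # R2 <- R2 - (1/2) R1
E2 = np.eye(3); E2[2, 0] = -1/2   # R3 <- R3 - (1/2) R1
E3 = np.eye(3); E3[2, 1] = -1/2   # R3 <- R3 - (1/2) R2

L1 = E3 @ E2 @ E1   # product of unit lower triangular matrices
U1 = L1 @ A         # row-reduced form of A

print(np.allclose(L1, np.tril(L1)))   # True: L1 is lower triangular
print(np.allclose(np.diag(L1), 1.0))  # True: unit diagonal
print(np.allclose(U1, np.triu(U1)))   # True: U1 is upper triangular
```

As the proof states, the accumulated matrix `L1` is lower triangular with diagonal entries of all ones, and `L1 @ A` is upper triangular.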

Now, consider $\boldsymbol{L}_1 \boldsymbol{U}^T_1$ below:

\begin{equation}\label{eq:BDBiPwRKUmPsHaj6Phc} \begin{aligned}[b] \boldsymbol{L}_1\boldsymbol{U}_1^T &=\boldsymbol{L}_1(\boldsymbol{L}_1\boldsymbol{A})^T\\ &=\boldsymbol{L}_1\boldsymbol{A}^T\boldsymbol{L}_1^T\\ &=\boldsymbol{L}_1\boldsymbol{A}\boldsymbol{L}_1^T\\ &=\boldsymbol{U}_1\boldsymbol{L}_1^T \end{aligned} \end{equation}

Here, note the following:

• the 2nd equality holds by the property of the matrix transpose that $(\boldsymbol{BC})^T=\boldsymbol{C}^T\boldsymbol{B}^T$.

• the 3rd equality uses the fact that $\boldsymbol{A}$ is symmetric (since $\boldsymbol{A}$ is positive definite), that is, $\boldsymbol{A}^T=\boldsymbol{A}$.

Now, define a diagonal matrix $\boldsymbol{D}_1$ whose diagonal entries are those of $\boldsymbol{U}_1$. Suppose these diagonal entries are $c_1$, $c_2$, $\cdots$, $c_n$.

Next, consider $\boldsymbol{L}_1( \boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1)$ below:

\begin{equation}\label{eq:kcPfPZHi6IgDt0TIMLQ} \begin{aligned}[b] \boldsymbol{L}_1(\boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1)&= (\boldsymbol{L}_1\boldsymbol{U}^T_1)\boldsymbol{D}^{-1}_1\\ &=\boldsymbol{U}_1\boldsymbol{L}^T_1\boldsymbol{D}^{-1}_1 \end{aligned} \end{equation}

Here, the 2nd equality follows by substituting in \eqref{eq:BDBiPwRKUmPsHaj6Phc}.

Let's now determine the nature of the matrices on both sides of \eqref{eq:kcPfPZHi6IgDt0TIMLQ}. We start with the left-hand side. The matrix $\boldsymbol{U}^T_1$ is lower triangular with diagonal entries $c_1$, $c_2$, $\cdots$, $c_n$. Next, $\boldsymbol{D}_1^{-1}$ is a diagonal matrix whose diagonal entries are $1/c_1$, $1/c_2$, $\cdots$, $1/c_n$. The product $\boldsymbol{U}^T_1 \boldsymbol{D}_1^{-1}$ is thus a lower triangular matrix whose diagonal entries are all ones. Because $\boldsymbol{L}_1$ and $\boldsymbol{U}^T_1\boldsymbol{D}_1^{-1}$ are both lower triangular matrices whose diagonal entries are all ones, their product $\boldsymbol{L}_1(\boldsymbol{U}^T_1\boldsymbol{D}_1^{-1})$ is lower triangular with diagonal entries of ones.

Now, let's focus on the right-hand side of \eqref{eq:kcPfPZHi6IgDt0TIMLQ}. The matrix $\boldsymbol{L}^T_1$ is upper triangular with diagonal entries of all ones. Again, $\boldsymbol{D}_1^{-1}$ is a diagonal matrix with diagonal entries $1/c_1$, $1/c_2$, $\cdots$, $1/c_n$. The product $\boldsymbol{L}^T_1\boldsymbol{D}^{-1}_1$ is therefore upper triangular with diagonal entries $1/c_1$, $1/c_2$, $\cdots$, $1/c_n$. Because $\boldsymbol{U}_1$ is an upper triangular matrix with diagonal entries $c_1$, $c_2$, $\cdots$, $c_n$, the product $\boldsymbol{U}_1\boldsymbol{L}^T_1 \boldsymbol{D}^{-1}_1$ is another upper triangular matrix whose diagonal entries are ones.

What we have just shown is that the left-hand side of \eqref{eq:kcPfPZHi6IgDt0TIMLQ} is lower triangular with unit diagonal, while the right-hand side is upper triangular with unit diagonal. Since the two sides are equal, the matrix must be both lower and upper triangular, and hence diagonal, with every diagonal entry equal to one. The only way for this to hold is if both sides of \eqref{eq:kcPfPZHi6IgDt0TIMLQ} are equal to the identity matrix $\boldsymbol{I}_n$. Taking the left-hand side of \eqref{eq:kcPfPZHi6IgDt0TIMLQ}, we get:

\begin{equation}\label{eq:mP4AEo8Q55fxNPl9BRB} \begin{aligned}[b] \boldsymbol{L}_1(\boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1)&= \boldsymbol{I}_n\\ \boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1&= \boldsymbol{L}_1^{-1} \end{aligned} \end{equation}

Now, define a second diagonal matrix $\boldsymbol{D}_2$ whose diagonal entries are $\sqrt{c_1}$, $\sqrt{c_2}$, $\cdots$, $\sqrt{c_n}$. These square roots are well-defined because the pivots $c_1$, $c_2$, $\cdots$, $c_n$ of a positive definite matrix are positive. Note that $\boldsymbol{D}^2_2=\boldsymbol{D}_1$. In addition, define the following matrix:

$$\boldsymbol{U}= \boldsymbol{D}^{-1}_2 \boldsymbol{U}_1$$

The inverse $\boldsymbol{D}^{-1}_2$ is diagonal, which makes $\boldsymbol{U}$ upper triangular. Now, $\boldsymbol{U}^T\boldsymbol{U}$ is:

\begin{align*} \boldsymbol{U}^T\boldsymbol{U}&= (\boldsymbol{D}^{-1}_2\boldsymbol{U}_1)^T \boldsymbol{D}^{-1}_2\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T(\boldsymbol{D}^{-1}_2)^T \boldsymbol{D}^{-1}_2\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T\boldsymbol{D}^{-1}_2 \boldsymbol{D}^{-1}_2\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T(\boldsymbol{D}_2^2)^{-1}\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T\boldsymbol{D}_1^{-1}\boldsymbol{U}_1\\ &=\boldsymbol{L}_1^{-1}\boldsymbol{U}_1\\ &=\boldsymbol{A}\\ \end{align*}

Here, note the following:

• the 2nd equality holds by the property of the matrix transpose that $(\boldsymbol{BC})^T=\boldsymbol{C}^T\boldsymbol{B}^T$.

• the 3rd equality holds because $\boldsymbol{D}_2^{-1}$ is diagonal and hence equal to its own transpose.

• the 6th equality follows from \eqref{eq:mP4AEo8Q55fxNPl9BRB}, which gives $\boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1=\boldsymbol{L}_1^{-1}$.

• the 7th equality follows from \eqref{eq:GM9sQFICSgDTohkl2CW}, which gives $\boldsymbol{L}_1^{-1}\boldsymbol{U}_1=\boldsymbol{A}$.

Since $\boldsymbol{U}$ is upper triangular with positive diagonal entries $\sqrt{c_1}$, $\sqrt{c_2}$, $\cdots$, $\sqrt{c_n}$ and satisfies $\boldsymbol{U}^T\boldsymbol{U}=\boldsymbol{A}$, it is indeed a Cholesky factor of $\boldsymbol{A}$. This completes the proof.

Theorem.

# Uniqueness of Cholesky Factorization

The Cholesky factorization of a positive definite matrix $\boldsymbol{A}$ is unique, that is, $\boldsymbol{A}$ cannot be Cholesky-factorized in multiple ways.

Proof. Let $\boldsymbol{A}$ be a positive definite matrix. Suppose $\boldsymbol{A}$ admits two Cholesky factorizations:

\begin{equation}\label{eq:z4xrkDH28cVaxeOr7nk} \begin{aligned} \boldsymbol{A}&=\boldsymbol{U}_1^T\boldsymbol{U}_1\\ \boldsymbol{A}&=\boldsymbol{U}_2^T\boldsymbol{U}_2 \end{aligned} \end{equation}

Now, let's define a matrix $\boldsymbol{D}$ below:

$$\begin{equation}\label{eq:WktRxXhdmllPm2xaRoW} \boldsymbol{D}= \boldsymbol{U}_1\boldsymbol{U}_2^{-1} \end{equation}$$

Our goal is to show that $\boldsymbol{D}$ is the identity matrix, which would imply $\boldsymbol{U}_2=\boldsymbol{U}_1$. We will do this in two stages:

1. show that $\boldsymbol{D}$ is a diagonal matrix.

2. show that the diagonal entries of $\boldsymbol{D}$ are all $1$.

Let's start by showing that $\boldsymbol{D}$ is diagonal. Focus on \eqref{eq:WktRxXhdmllPm2xaRoW}. Since $\boldsymbol{U}_2$ is upper triangular with positive diagonal entries, it is invertible and $\boldsymbol{U}^{-1}_2$ is also upper triangular. Since $\boldsymbol{D}$ is a product of two upper triangular matrices, $\boldsymbol{D}$ is upper triangular.

Now, equating the two factorizations in \eqref{eq:z4xrkDH28cVaxeOr7nk} gives:

\begin{equation}\label{eq:JdU1WATpAruyotT8aoY} \begin{aligned}[b] \boldsymbol{U}_1^T\boldsymbol{U}_1 &=\boldsymbol{U}_2^T\boldsymbol{U}_2\\ \boldsymbol{U}_1 &=(\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\boldsymbol{U}_2 \end{aligned} \end{equation}

Substituting \eqref{eq:JdU1WATpAruyotT8aoY} into \eqref{eq:WktRxXhdmllPm2xaRoW} gives:

\begin{align*} \boldsymbol{D}&= \big((\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\boldsymbol{U}_2\big) \boldsymbol{U}_2^{-1}\\ &=(\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\boldsymbol{U}_2 \boldsymbol{U}_2^{-1}\\ &=(\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\\ \end{align*}

Here, $\boldsymbol{U}^T_1$ is lower triangular, which means $(\boldsymbol{U}^T_1)^{-1}$ is also lower triangular. Similarly, $\boldsymbol{U}^T_2$ is lower triangular. This means that $\boldsymbol{D}$, as a product of lower triangular matrices, is lower triangular.

Since $\boldsymbol{D}$ is both an upper triangular matrix and a lower triangular matrix, $\boldsymbol{D}$ must be a diagonal matrix.

* * *

We now move on to the second stage of showing that the diagonal entries of $\boldsymbol{D}$ are all $1$. Let's take the transpose of both sides of \eqref{eq:WktRxXhdmllPm2xaRoW} to get:

\begin{equation}\label{eq:R0ed9PdC8DjQca29faI} \begin{aligned}[b] \boldsymbol{D}^T&= (\boldsymbol{U}_1\boldsymbol{U}_2^{-1})^T\\ \boldsymbol{D}&= (\boldsymbol{U}_2^{-1})^T\boldsymbol{U}_1^T \end{aligned} \end{equation}

Here, we used the property that a diagonal matrix equals its own transpose for the left-hand side, and the property of the matrix transpose that $(\boldsymbol{BC})^T=\boldsymbol{C}^T\boldsymbol{B}^T$ for the right-hand side.

Now, go back to \eqref{eq:JdU1WATpAruyotT8aoY} and make $\boldsymbol{U}^T_1$ the subject:

$$\begin{equation}\label{eq:G2NJRTQrr2strtGvtXG} \boldsymbol{U}_1^T= \boldsymbol{U}_2^T\boldsymbol{U}_2\boldsymbol{U}_1^{-1} \end{equation}$$

Substituting \eqref{eq:G2NJRTQrr2strtGvtXG} into \eqref{eq:R0ed9PdC8DjQca29faI} gives:

\begin{equation}\label{eq:jSAAyRYX4ZsfXk8zNtx} \begin{aligned}[b] \boldsymbol{D}&= (\boldsymbol{U}_2^{-1})^T (\boldsymbol{U}_2^T\boldsymbol{U}_2\boldsymbol{U}_1^{-1})\\ &=(\boldsymbol{U}_2^{T})^{-1} \boldsymbol{U}_2^T\boldsymbol{U}_2\boldsymbol{U}_1^{-1}\\ &=\boldsymbol{I}\boldsymbol{U}_2\boldsymbol{U}_1^{-1}\\ &=\boldsymbol{U}_2\boldsymbol{U}_1^{-1}\\ \end{aligned} \end{equation}

Here, the 2nd equality holds because $(\boldsymbol{U}_2^{-1})^T=(\boldsymbol{U}_2^T)^{-1}$, a property of the matrix transpose. We have now established by \eqref{eq:WktRxXhdmllPm2xaRoW} and \eqref{eq:jSAAyRYX4ZsfXk8zNtx} that:

\begin{align*} \boldsymbol{D}&=\boldsymbol{U}_1\boldsymbol{U}_2^{-1}\\ \boldsymbol{D}&=\boldsymbol{U}_2\boldsymbol{U}_1^{-1} \end{align*}

Let's rearrange the top equation to get $\boldsymbol{U}_2=\boldsymbol{D}^{-1}\boldsymbol{U}_1$ and substitute this into the bottom equation:

\begin{align*} \boldsymbol{D}&= \boldsymbol{D}^{-1}\boldsymbol{U}_1\boldsymbol{U}_1^{-1}\\ \boldsymbol{D}&=\boldsymbol{D}^{-1}\\ \boldsymbol{D}^2&=\boldsymbol{I}\\ \end{align*}

Squaring a diagonal matrix $\boldsymbol{D}$ squares each of its diagonal entries. If we let $d_{ii}$ represent the $i$-th diagonal entry of $\boldsymbol{D}$, then:

\begin{align*} d_{ii}^2&=1\\ d_{ii}&=\pm1\\ \end{align*}

Let's now show that $d_{ii}$ must be positive. Recall from \eqref{eq:WktRxXhdmllPm2xaRoW} that we defined $\boldsymbol{D}$ like so:

$$\begin{equation}\label{eq:M4oy54M40H6KqXmxa1R} \boldsymbol{D}= \boldsymbol{U}_1\boldsymbol{U}_2^{-1} \end{equation}$$

By the definition of Cholesky factorization, the diagonal entries of $\boldsymbol{U}_1$ and $\boldsymbol{U}_2$ are positive. The diagonal entries of $\boldsymbol{U}_2^{-1}$ are the reciprocals of those of $\boldsymbol{U}_2$, so they are also positive. The diagonal entries of $\boldsymbol{D}$ are the pairwise products of the diagonal entries of $\boldsymbol{U}_1$ and $\boldsymbol{U}^{-1}_2$, which means that $\boldsymbol{D}$ has positive diagonal entries. Combined with $d_{ii}=\pm1$, this forces $d_{ii}=1$ for all $i$.

We have now established that $\boldsymbol{D}$ is equal to the identity matrix. Therefore, \eqref{eq:M4oy54M40H6KqXmxa1R} becomes:

$$\boldsymbol{U}_2= \boldsymbol{U}_1$$

This proves that the Cholesky factorization of a positive definite matrix is unique, completing the proof.
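As a footnote on why the positive-diagonal requirement matters for uniqueness: without it, $\boldsymbol{A}=\boldsymbol{V}^T\boldsymbol{V}$ has multiple upper triangular solutions, since flipping the sign of any row of $\boldsymbol{U}$ preserves the product. A small NumPy sketch (the $2\times2$ matrix is our own illustration):

```python
import numpy as np

A = np.array([[4., 2.],
              [2., 3.]])

U = np.linalg.cholesky(A).T   # the Cholesky factor: diag(U) > 0
S = np.diag([1., -1.])        # sign flip for the second row
V = S @ U                     # still upper triangular, and V^T V = A

print(np.allclose(V.T @ V, A))   # True: V also reproduces A
print(np.all(np.diag(V) > 0))    # False: V violates the sign condition
```

Here $\boldsymbol{V}^T\boldsymbol{V}=\boldsymbol{U}^T\boldsymbol{S}^T\boldsymbol{S}\boldsymbol{U}=\boldsymbol{U}^T\boldsymbol{U}=\boldsymbol{A}$, yet $\boldsymbol{V}$ is not a Cholesky factor; the positivity condition rules out all such sign-flipped variants.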
