The Cholesky factorization of $\boldsymbol{A}$ is defined as $\boldsymbol{A}= \boldsymbol{U}^T\boldsymbol{U}$ where $\boldsymbol{U}$ is an upper triangular matrixlink with positive diagonal entries.

Theorem.

Algorithm for Cholesky Factorization

If $\boldsymbol{A}$ is a positive definite matrixlink, then theoremlink guarantees that $\boldsymbol{A}$ can be Cholesky-factorized as $\boldsymbol{A}=\boldsymbol{U}^T\boldsymbol{U}$. To obtain the Cholesky factorization, perform the following two steps:

row-reduce $\boldsymbol{A}$ to upper triangular matrix $\boldsymbol{U}_1$ only by performing the elementary row operation of adding a multiple of one row to another row below it.
divide each row of $\boldsymbol{U}_1$ by the square root of the diagonal entry in that row.

Before proving the correctness of this algorithm, we will first go through a simple example for demonstration.

Example.

Performing Cholesky Factorization

Perform Cholesky factorization on the following matrix:

$$\boldsymbol{A}= \begin{pmatrix} 2&3&1\\ 3&6&2\\ 1&2&5 \end{pmatrix}$$

Solution. The first step of the Cholesky factorization is to row-reduce $\boldsymbol{A}$ by only performing the elementary row operation of adding a multiple of one row to another row below it:

$$\begin{equation}\label{eq:S4V5aGJ7dID0Iu3dhTc} \begin{pmatrix} 2&3&1\\3&6&2\\1&2&5 \end{pmatrix}\sim \begin{pmatrix} 2&3&1\\0&3/2&1/2\\0&1/2&1/2 \end{pmatrix}\sim \begin{pmatrix} 2&3&1\\0&3/2&1/2\\0&0&1/3 \end{pmatrix} \end{equation}$$

The rightmost matrix in \eqref{eq:S4V5aGJ7dID0Iu3dhTc} is $\boldsymbol{U}_1$. For the second step, we divide each row of $\boldsymbol{U}_1$ by the square root of the diagonal entry in that row:

$$\boldsymbol{U}=\begin{pmatrix} 2/\sqrt2&3/\sqrt2&1/\sqrt2\\ 0&(3/2)/\sqrt{3/2}&(1/2)/\sqrt{3/2}\\ 0&0&(1/3)/\sqrt{1/3} \end{pmatrix}$$

Therefore, the Cholesky factorization of $\boldsymbol{A}$ is:

$$\begin{align*} \boldsymbol{A}&=\boldsymbol{U}^T\boldsymbol{U} \\&= \begin{pmatrix} 2/\sqrt2&0&0\\ 3/\sqrt2&(3/2)/\sqrt{3/2}&0\\ 1/\sqrt2&(1/2)/\sqrt{3/2}&(1/3)/\sqrt{1/3} \end{pmatrix} \begin{pmatrix} 2/\sqrt2&3/\sqrt2&1/\sqrt2\\ 0&(3/2)/\sqrt{3/2}&(1/2)/\sqrt{3/2}\\ 0&0&(1/3)/\sqrt{1/3} \end{pmatrix} \end{align*}$$

As you can see, obtaining the Cholesky factorization of a positive definite matrix is quite simple!

■

Proving the correctness of the Cholesky Factorization algorithm

Let $\boldsymbol{A}$ be an $n\times{n}$ positive definite matrix. By theoremlink, because $\boldsymbol{A}$ is positive definite, $\boldsymbol{A}$ can be factorized as:

$$\begin{equation}\label{eq:W2mRvwhj6ndbox3IZK6} \boldsymbol{A}= \boldsymbol{U}^T\boldsymbol{U} \end{equation}$$

Because $\boldsymbol{U}^T$ and $\boldsymbol{U}$ are square matrices, their diagonals entries are the same by propertylink of matrix transpose. Let $\boldsymbol{D}$ be a diagonal matrix whose diagonal entries are those of $\boldsymbol{U}^T$ and $\boldsymbol{U}$. Suppose the diagonal entries of $\boldsymbol{D}$ are $d_1$, $d_2$, $\cdots$, $d_n$. By theoremlink, the inverse of $\boldsymbol{D}$ is another diagonal matrix whose diagonal entries are $1/d_1$, $1/d_2$, $\cdots$, $1/d_n$.

Now, taking a transpose of an upper triangular matrix yields a lower triangular matrix by theoremlink. $\boldsymbol{U}^T$ is thus a lower triangular matrix with diagonals $d_1$, $d_2$, $\cdots$, $d_n$. By theoremlink, the product $\boldsymbol{U}^T\boldsymbol{D}^{-1}$ is a lower triangular matrix whose diagonal entries are all ones since the reciprocals cancel out.

The inverse of a lower triangular matrix is also a lower triangular matrix whose diagonal entries are the reciprocal of the original matrix by theoremlink. This means that $(\boldsymbol{U}^T\boldsymbol{D}^{-1})^{-1}$ is also lower triangular with all diagonal entries equalling one. Let's denote this matrix as $\boldsymbol{L}$ like so:

$$\begin{equation}\label{eq:i88wig70IwSLSdp0TEk} \boldsymbol{L}= (\boldsymbol{U}^T\boldsymbol{D}^{-1})^{-1} \end{equation}$$

Using equations \eqref{eq:W2mRvwhj6ndbox3IZK6} and \eqref{eq:i88wig70IwSLSdp0TEk}, we can evaluate $\boldsymbol{LA}$ as:

$$\begin{align*} \boldsymbol{LA}&= \big[(\boldsymbol{U}^T\boldsymbol{D}^{-1})^{-1}\big] (\boldsymbol{U}^T\boldsymbol{U})\\ &=(\boldsymbol{D}^{-1})^{-1}(\boldsymbol{U}^T)^{-1} \boldsymbol{U}^T\boldsymbol{U}\\ &=\boldsymbol{D}\boldsymbol{I}\boldsymbol{U}\\ &=\boldsymbol{D}\boldsymbol{U}\\ \end{align*}$$

Here, the 2nd equality holds by propertylink of invertible matrices.

Remember, the following:

$\boldsymbol{D}$ is a diagonal matrix with diagonal entries $d_1$, $d_2$, $\cdots$, $d_n$.
$\boldsymbol{U}$ is an upper triangular matrix with diagonal entries $d_1$, $d_2$, $\cdots$, $d_n$.

By theoremlink, the product $\boldsymbol{DU}$ is an upper triangular matrix with diagonal entries $d_1^2$, $d_2^2$, $\cdots$, $d_n^2$.

* * *

Now, suppose matrix $\boldsymbol{A}$ is reduced to an upper triangular matrix $\boldsymbol{U}_1$ using only the elementary row operating of adding a multiple of one row to another row below it. The corresponding elementary matrixlink will always be a lower triangular matrix according to propertylink whose diagonal entries are all ones.

To understand why, suppose we have some $3\times3$ matrix. The elementary matrix corresponding to the elementary row operation of multiplying the first row by $5$ and then adding it to the $3$rd row is:

$$\begin{pmatrix} 1&0&0\\ 0&1&0\\ 5&0&1\\ \end{pmatrix}$$

Notice how this is a lower triangular matrix whose diagonal entries are all ones. By propertylink, applying an elementary row operation on matrix $\boldsymbol{A}$ is equivalent to multiplying the corresponding elementary matrix to $\boldsymbol{A}$. The series of elementary row operations required to reduce $\boldsymbol{A}$ to $\boldsymbol{U}_1$ can therefore be represented by:

$$\begin{equation}\label{eq:dYd9w6a3KJz4r8kLOfo} \boldsymbol{E}_r \cdots \boldsymbol{E}_2 \boldsymbol{E}_1\boldsymbol{A} =\boldsymbol{U}_1 \end{equation}$$

Where $\boldsymbol{E}_1$, $\boldsymbol{E}_2$, $\cdots$, $\boldsymbol{E}_r$ are the corresponding elementary matrices. As just discussed, these matrices are all lower triangular. The product of lower triangular matrices is also lower triangular by propertylink and the product would have diagonal entries of all ones by propertylink. This means that we can define a lower triangular matrix $\boldsymbol{L}_1$ whose diagonal entries are all ones such that:

$$\boldsymbol{L}_1=\boldsymbol{E}_r \cdots \boldsymbol{E}_2 \boldsymbol{E}_1$$

With this, \eqref{eq:dYd9w6a3KJz4r8kLOfo} can be simplified as:

$$\begin{equation}\label{eq:GM9sQFICSgDTohkl2CW} \boldsymbol{L}_1\boldsymbol{A}= \boldsymbol{U}_1 \end{equation}$$

Now, consider $\boldsymbol{L}_1 \boldsymbol{U}^T_1$ below:

$$\begin{equation}\label{eq:BDBiPwRKUmPsHaj6Phc} \begin{aligned}[b] \boldsymbol{L}_1\boldsymbol{U}_1^T &=\boldsymbol{L}_1(\boldsymbol{L}_1\boldsymbol{A})^T\\ &=\boldsymbol{L}_1\boldsymbol{A}^T\boldsymbol{L}_1^T\\ &=\boldsymbol{L}_1\boldsymbol{A}\boldsymbol{L}_1^T\\ &=\boldsymbol{U}_1\boldsymbol{L}_1^T \end{aligned} \end{equation}$$

Here, note the following:

the 2nd equality holds by the propertylink of matrix transpose.
the 3rd equality uses the fact that $\boldsymbol{A}$ is symmetriclink (since $\boldsymbol{A}$ is positive definite), that is, $\boldsymbol{A}^T=\boldsymbol{A}$.

Now, define a diagonal matrix $\boldsymbol{D}_1$ whose diagonal entries are those of $\boldsymbol{U}_1$. Suppose these diagonal entries are $c_1$, $c_2$, $\cdots$, $c_n$.

Next, consider $\boldsymbol{L}_1( \boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1)$ below:

$$\begin{equation}\label{eq:kcPfPZHi6IgDt0TIMLQ} \begin{aligned}[b] \boldsymbol{L}_1(\boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1)&= (\boldsymbol{L}_1\boldsymbol{U}^T_1)\boldsymbol{D}^{-1}_1\\ &=\boldsymbol{U}_1\boldsymbol{L}^T_1\boldsymbol{D}^{-1}_1 \end{aligned} \end{equation}$$

Here, the 2nd equality follows by substituting in \eqref{eq:BDBiPwRKUmPsHaj6Phc}.

Let's now determine the nature of the matrix of both sides in \eqref{eq:kcPfPZHi6IgDt0TIMLQ}. We start with the left-hand side. The matrix $\boldsymbol{U}^T_1$ is lower triangular by propertylink and holds diagonal entries $c_1$, $c_2$, $\cdots$, $c_n$. Next, $\boldsymbol{D}_1^{-1}$ is a diagonal matrix whose diagonal entries are $1/c_1$, $1/c_2$, $\cdots$, $1/c_n$ by theoremlink. The product $\boldsymbol{U}^T_1 \boldsymbol{D}_1^{-1}$ is a lower triangular matrix and its diagonal entries are all ones by theoremlink. By theoremlink and theoremlink, because $\boldsymbol{L}_1$ and $\boldsymbol{U}^T_1\boldsymbol{D}_1^{-1}$ are both lower triangular matrices whose diagonal entries are all ones, we conclude that their product $\boldsymbol{L}_1(\boldsymbol{U}^T_1\boldsymbol{D}_1^{-1})$ is lower triangular with diagonal entries of ones.

Now, let's focus on the right-hand side of \eqref{eq:kcPfPZHi6IgDt0TIMLQ}. The matrix $\boldsymbol{L}^T_1$ is upper triangular by propertylink and has diagonal entries of all ones. Again, $\boldsymbol{D}_1^{-1}$ is a diagonal matrix with diagonal entries $1/c_1$, $1/c_2$, $\cdots$, $1/c_n$ by theoremlink. The product $\boldsymbol{L}^T_1\boldsymbol{D}^{-1}$ is therefore upper triangular with diagonal entries $1/c_1$, $1/c_2$, $\cdots$, $1/c_n$ by theoremlink. Because $\boldsymbol{U}_1$ is an upper triangular matrix with diagonal entries $c_1$, $c_2$, $\cdots$, $c_n$, the product $\boldsymbol{U}_1\boldsymbol{L}^T_1 \boldsymbol{D}^{-1}$ is another upper triangular matrix whose diagonal entries are ones by theoremlink.

What we have just shown is that:

the left-hand side of \eqref{eq:kcPfPZHi6IgDt0TIMLQ} is a lower triangular matrix with diagonal entries of ones.
the right-hand side of \eqref{eq:kcPfPZHi6IgDt0TIMLQ} is an upper triangular matrix with diagonal entries of ones.

The only way for this to hold is if both sides of \eqref{eq:kcPfPZHi6IgDt0TIMLQ} are equal to the identity matrix $\boldsymbol{I}_n$. Taking the left-hand side of \eqref{eq:kcPfPZHi6IgDt0TIMLQ}, we get:

$$\begin{equation}\label{eq:mP4AEo8Q55fxNPl9BRB} \begin{aligned}[b] \boldsymbol{L}_1(\boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1)&= \boldsymbol{I}_n\\ \boldsymbol{U}^T_1\boldsymbol{D}^{-1}_1&= \boldsymbol{L}_1^{-1} \end{aligned} \end{equation}$$

Now, define a second diagonal matrix $\boldsymbol{D}_2$ whose diagonal entries are $\sqrt{c_1}$, $\sqrt{c_2}$, $\cdots$, $\sqrt{c_n}$. By theoremlink, $\boldsymbol{D}^2_2= \boldsymbol{D}_1$. In addition, define the following matrix:

$$\boldsymbol{U}= \boldsymbol{D}^{-1}_2 \boldsymbol{U}_1$$

The inverse $\boldsymbol{D}^{-1}_2$ is diagonal by theoremlink, which makes $\boldsymbol{U}$ upper triangular by theoremlink. Now, $\boldsymbol{U}^T\boldsymbol{U}$ is:

$$\begin{align*} \boldsymbol{U}^T\boldsymbol{U}&= (\boldsymbol{D}^{-1}_2\boldsymbol{U}_1)^T \boldsymbol{D}^{-1}_2\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T(\boldsymbol{D}^{-1}_2)^T \boldsymbol{D}^{-1}_2\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T\boldsymbol{D}^{-1}_2 \boldsymbol{D}^{-1}_2\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T(\boldsymbol{D}_2^2)^{-1}\boldsymbol{U}_1\\ &=\boldsymbol{U}_1^T\boldsymbol{D}_1^{-1}\boldsymbol{U}_1\\ &=\boldsymbol{L}_1^{-1}\boldsymbol{U}_1\\ &=\boldsymbol{A}\\ \end{align*}$$

Here, note the following:

for the 2nd equality, we used propertylink of transpose.
for the 3rd equality, we used propertylink of diagonal matrices.
for the 4th equality, we used propertylink of invertible matrices.
for the 5th equality, we used the fact that $\boldsymbol{D}^2_2=\boldsymbol{D}_1$.
for the 6th equality, we substituted \eqref{eq:mP4AEo8Q55fxNPl9BRB}.
for the final equality, we used \eqref{eq:GM9sQFICSgDTohkl2CW}.

This completes the proof.

■

Theorem.

Uniqueness of Cholesky Factorization

The Cholesky factorization of a positive definite matrixlink $\boldsymbol{A}$ is unique, that is, $\boldsymbol{A}$ cannot be Cholesky-factorized in multiple ways.

Proof. Let $\boldsymbol{A}$ be a positive definite matrix. Suppose there exist two distinct Cholesky factorizations of $\boldsymbol{A}$ below:

$$\begin{equation}\label{eq:z4xrkDH28cVaxeOr7nk} \begin{aligned} \boldsymbol{A}&=\boldsymbol{U}_1^T\boldsymbol{U}_1\\ \boldsymbol{A}&=\boldsymbol{U}_2^T\boldsymbol{U}_2 \end{aligned} \end{equation}$$

Now, let's define a matrix $\boldsymbol{D}$ below:

$$\begin{equation}\label{eq:WktRxXhdmllPm2xaRoW} \boldsymbol{D}= \boldsymbol{U}_1\boldsymbol{U}_2^{-1} \end{equation}$$

Our goal is to show that $\boldsymbol{D}$ is an identity matrix, which would make imply $\boldsymbol{U}_2= \boldsymbol{U}_1$. We will do this in two stages:

show that $\boldsymbol{D}$ is a diagonal matrix.
show that the diagonal entries of $\boldsymbol{D}$ are all $1$.

Let's start by showing that $\boldsymbol{D}$ is diagonal. Focus on \eqref{eq:WktRxXhdmllPm2xaRoW}. Since $\boldsymbol{U}_2$ is upper triangular, $\boldsymbol{U}^{-1}_2$ is also upper triangular by propertylink. Since $\boldsymbol{D}$ is a product of two upper triangular matrices, $\boldsymbol{D}$ is upper triangular by propertylink.

Now, equating \eqref{eq:z4xrkDH28cVaxeOr7nk} gives:

$$\begin{equation}\label{eq:JdU1WATpAruyotT8aoY} \begin{aligned}[b] \boldsymbol{U}_1^T\boldsymbol{U}_1 &=\boldsymbol{U}_2^T\boldsymbol{U}_2\\ \boldsymbol{U}_1 &=(\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\boldsymbol{U}_2 \end{aligned} \end{equation}$$

Substituting \eqref{eq:JdU1WATpAruyotT8aoY} into \eqref{eq:WktRxXhdmllPm2xaRoW} gives:

$$\begin{align*} \boldsymbol{D}&= \big((\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\boldsymbol{U}_2\big) \boldsymbol{U}_2^{-1}\\ &=(\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\boldsymbol{U}_2 \boldsymbol{U}_2^{-1}\\ &=(\boldsymbol{U}_1^T)^{-1}\boldsymbol{U}_2^T\\ \end{align*}$$

Here, $\boldsymbol{U}^T_1$ is lower triangular by propertylink, which means $(\boldsymbol{U}^T_1)^{-1}$ is also lower triangular by propertylink. Similarly, $\boldsymbol{U}^T_2$ is lower triangular by propertylink. This means that $\boldsymbol{D}$ is lower triangular by propertylink.

Since $\boldsymbol{D}$ is both an upper triangular matrix and a lower triangular matrix, $\boldsymbol{D}$ must be a diagonal matrix.

* * *

We now move on to the second stage of showing that the diagonal entries of $\boldsymbol{D}$ are all $1$. Let's take the transpose of both sides of \eqref{eq:WktRxXhdmllPm2xaRoW} to get:

$$\begin{equation}\label{eq:R0ed9PdC8DjQca29faI} \begin{aligned}[b] \boldsymbol{D}^T&= (\boldsymbol{U}_1\boldsymbol{U}_2^{-1})^T\\ \boldsymbol{D}&= (\boldsymbol{U}_2^{-1})^T\boldsymbol{U}_1^T \end{aligned} \end{equation}$$

Here, we used propertylink of diagonal matrices for the left-hand side and propertylink of matrix transpose for the right-hand side.

Now, go back to \eqref{eq:JdU1WATpAruyotT8aoY} and make $\boldsymbol{U}^T_1$ the subject:

$$\begin{equation}\label{eq:G2NJRTQrr2strtGvtXG} \boldsymbol{U}_1^T= \boldsymbol{U}_2^T\boldsymbol{U}_2\boldsymbol{U}_1^{-1} \end{equation}$$

Substituting \eqref{eq:G2NJRTQrr2strtGvtXG} into \eqref{eq:R0ed9PdC8DjQca29faI} gives:

$$\begin{equation}\label{eq:jSAAyRYX4ZsfXk8zNtx} \begin{aligned}[b] \boldsymbol{D}&= (\boldsymbol{U}_2^{-1})^T (\boldsymbol{U}_2^T\boldsymbol{U}_2\boldsymbol{U}_1^{-1})\\ &=(\boldsymbol{U}_2^{T})^{-1} \boldsymbol{U}_2^T\boldsymbol{U}_2\boldsymbol{U}_1^{-1}\\ &=\boldsymbol{I}\boldsymbol{U}_2\boldsymbol{U}_1^{-1}\\ &=\boldsymbol{U}_2\boldsymbol{U}_1^{-1}\\ \end{aligned} \end{equation}$$

Here, the 2nd equality holds by propertylink of matrix transpose. We have now established by \eqref{eq:WktRxXhdmllPm2xaRoW} and \eqref{eq:jSAAyRYX4ZsfXk8zNtx} that:

$$\begin{align*} \boldsymbol{D}&=\boldsymbol{U}_1\boldsymbol{U}_2^{-1}\\ \boldsymbol{D}&=\boldsymbol{U}_2\boldsymbol{U}_1^{-1} \end{align*}$$

Let's rearrange the top equation to get $\boldsymbol{U}_2=\boldsymbol{D}^{-1}\boldsymbol{U}_1$ and substitute this into the bottom equation:

$$\begin{align*} \boldsymbol{D}&= \boldsymbol{D}^{-1}\boldsymbol{U}_1\boldsymbol{U}_1^{-1}\\ \boldsymbol{D}&=\boldsymbol{D}^{-1}\\ \boldsymbol{D}^2&=\boldsymbol{I}\\ \end{align*}$$

By propertylink, squaring a diagonal matrix $\boldsymbol{D}$ involves squaring all the diagonal entries of $\boldsymbol{D}$. If we let $d_{ii}$ represent the $i$-th diagonal entry of $\boldsymbol{D}$, then:

$$\begin{align*} d_{ii}^2&=1\\ d_{ii}&=\pm1\\ \end{align*}$$

Let's now show that $d_{ii}$ must be positive. Recall from \eqref{eq:WktRxXhdmllPm2xaRoW} that we defined $\boldsymbol{D}$ like so:

$$\begin{equation}\label{eq:M4oy54M40H6KqXmxa1R} \boldsymbol{D}= \boldsymbol{U}_1\boldsymbol{U}_2^{-1} \end{equation}$$

By definitionlink of Cholesky factorization, the diagonal entries of $\boldsymbol{U}_1$ and $\boldsymbol{U}_2$ are positive. By propertylink, the diagonal entries of $\boldsymbol{U}_2^{-1}$ are the reciprocals of the diagonal entries of $\boldsymbol{U}_2$, which means that they are also positive. By propertylink, the diagonal entries of $\boldsymbol{D}$ are the pairwise product of the diagonal entries of $\boldsymbol{U}_1$ and $\boldsymbol{U}^{-1}_2$, which means that $\boldsymbol{D}$ has positive diagonal entries.

We have now established that $\boldsymbol{D}$ is equal to the identity matrix. Therefore, \eqref{eq:M4oy54M40H6KqXmxa1R} becomes:

$$\boldsymbol{U}_2= \boldsymbol{U}_1$$

This proves that the Cholesky factorization is unique. This completes the proof.

■

Published by Isshin Inada

Edited by 0 others

Did you find this page useful?

thumb_up

thumb_down

Comment

Citation

Ask a question or leave a feedback...

thumb_up

thumb_down

chat_bubble_outline

settings

Enjoy our search

Hit / to insta-search docs and recipes!