$\boldsymbol{Q}$ is an $m\times{n}$ orthogonal matrixlink.
$\boldsymbol{R}$ is an $n\times{n}$ invertiblelink upper triangular matrixlink with positive diagonal entries.

Proof. Let $\boldsymbol{A}$ be an $m\times{n}$ matrix with $n$ linearly independent column vectors and let $\boldsymbol{Q}$ be an $m\times{n}$ orthogonal matrixlink whose column vectors are constructed by applying the Gram-Schmidt process on the columns of $\boldsymbol{A}$. These matrices are shown below:

$$\boldsymbol{A}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix},\;\;\;\;\;\; \boldsymbol{Q}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}$$

Note that for the Gram-Schmidt process to work, the column vectors of $\boldsymbol{A}$ must be linearly independent.

The set $\{\boldsymbol{q}_1,\boldsymbol{q}_2, \cdots,\boldsymbol{q}_n\}$ is an orthonormal basislink for $\mathbb{R}^n$. By theoremlink, each column vector of $\boldsymbol{A}$ can be expressed using this orthonormal basis like so:

$$\begin{align*} \boldsymbol{x}_1 &=(\boldsymbol{x}_1\cdot\boldsymbol{q}_1)\boldsymbol{q}_1 +(\boldsymbol{x}_1\cdot\boldsymbol{q}_2)\boldsymbol{q}_2 +\cdots+(\boldsymbol{x}_1\cdot\boldsymbol{q}_n)\boldsymbol{q}_n\\ \boldsymbol{x}_2 &=(\boldsymbol{x}_2\cdot\boldsymbol{q}_1)\boldsymbol{q}_1 +(\boldsymbol{x}_2\cdot\boldsymbol{q}_2)\boldsymbol{q}_2 +\cdots+(\boldsymbol{x}_2\cdot\boldsymbol{q}_n)\boldsymbol{q}_n\\ &\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots\\ \boldsymbol{x}_n&=(\boldsymbol{x}_n\cdot\boldsymbol{q}_1)\boldsymbol{q}_1 +(\boldsymbol{x}_n\cdot\boldsymbol{q}_2)\boldsymbol{q}_2 +\cdots+(\boldsymbol{x}_n\cdot\boldsymbol{q}_n)\boldsymbol{q}_n \end{align*}$$

By theoremlink, this can be written as:

$$\begin{equation}\label{eq:LOwUNEaa0XTeKlhE1sn} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1 &\boldsymbol{x}_2\cdot\boldsymbol{q}_1 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_1\\ \boldsymbol{x}_1\cdot\boldsymbol{q}_2& \boldsymbol{x}_2\cdot\boldsymbol{q}_2 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_2\\ \vdots&\vdots&\smash\ddots&\vdots\\ \boldsymbol{x}_1\cdot\boldsymbol{q}_n &\boldsymbol{x}_2\cdot\boldsymbol{q}_n &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_n \end{pmatrix} \end{equation}$$

By the propertylink of the Gram-Schmidt process, we have that:

$$\begin{align*} &\boldsymbol{x_1} \;\text{ is orthogonal to }\; \boldsymbol{\boldsymbol{q}_2},\boldsymbol{\boldsymbol{q}_3}, \boldsymbol{\boldsymbol{q}_4},\cdots\boldsymbol{\boldsymbol{q}_n}\\ &\boldsymbol{x_2} \;\text{ is orthogonal to }\; \boldsymbol{\boldsymbol{q}_3},\boldsymbol{\boldsymbol{q}_4},\cdots, \boldsymbol{\boldsymbol{q}_n}\\ &\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots\\ &\boldsymbol{x_{n-1}} \;\text{ is orthogonal to }\; \boldsymbol{\boldsymbol{q}_n} \end{align*}$$

Therefore, \eqref{eq:LOwUNEaa0XTeKlhE1sn} becomes:

$$\begin{equation}\label{eq:YO3WKNRMjx6l4ScZ6pw} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1 &\boldsymbol{x}_2\cdot\boldsymbol{q}_1 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_1\\ 0& \boldsymbol{x}_2\cdot\boldsymbol{q}_2 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_2\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0 &0&\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_n \end{pmatrix} \end{equation}$$

Notice how the right-hand side is now a product of an orthogonal matrix $\boldsymbol{Q}$ and an upper triangular matrix which we denote as $\boldsymbol{R}$. Let's rewrite \eqref{eq:YO3WKNRMjx6l4ScZ6pw} in short-hand:

$$\boldsymbol{A}= \boldsymbol{QR}$$

Next, by theoremlink, we have that $\boldsymbol{x}_i \cdot \boldsymbol{q}_i =\Vert\boldsymbol{v}_i\Vert$ for $i=1,2,\cdots,n$ where $\boldsymbol{v}_i$ is an orthogonal basis vector. Therefore, \eqref{eq:YO3WKNRMjx6l4ScZ6pw} can be written as:

$$\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \Vert\boldsymbol{v}_1\Vert &\boldsymbol{x}_2\cdot\boldsymbol{q}_1 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_1\\ 0& \Vert\boldsymbol{v}_2\Vert &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_2\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\Vert\boldsymbol{v}_n\Vert \end{pmatrix}$$

Note the following:

magnitude cannot be negative by definitionlink.
a basis vector cannot be zero by definitionlink.

This means that $\Vert\boldsymbol{v}_i\Vert\gt0$ for $i=1,2,\cdots,n$. By theoremlink, a triangular matrix with non-zero diagonal entries is invertible. This completes the proof.

■

Theorem.

Invertible matrices can be QR-factorized

If $\boldsymbol{A}$ is an invertible matrixlink, then $\boldsymbol{A}$ can be $\boldsymbol{QR}$-factorized.

Proof. By theoremlink, if $\boldsymbol{A}$ is an invertible matrix, then the columns of $\boldsymbol{A}$ are linearly independent. By theoremlink, $\boldsymbol{A}$ can be $\boldsymbol{QR}$-factorized. This completes the proof.

■

Example.

Performing QR factorization on a 2x2 matrix

Perform $\boldsymbol{QR}$ factorization on the following matrix:

$$\begin{pmatrix} 3&2\\ 4&6 \end{pmatrix}$$

Proof. The first step is to obtain an orthogonal matrix using the Gram-Schmidt process:

$$\boldsymbol{Q}= \begin{pmatrix} 3&2\\4&6 \end{pmatrix}$$

We start as follows:

$$\boldsymbol{v}_1= \begin{pmatrix} 3\\4 \end{pmatrix}$$

Next, we find $\boldsymbol{v}_2$ that is orthogonal to $\boldsymbol{v}_1$ like so:

$$\begin{align*} \boldsymbol{v}_2&= \begin{pmatrix}2\\6\end{pmatrix}- \frac{(2)(3)+(6)(4)}{3^2+4^2} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\begin{pmatrix}2\\6\end{pmatrix}- \frac{30}{25} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\begin{pmatrix}2\\6\end{pmatrix}- \begin{pmatrix}18/5\\24/5\end{pmatrix}\\ &=\begin{pmatrix}-8/5\\6/5\end{pmatrix} \end{align*}$$

Let's turn $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$ into unit vectors $\boldsymbol{q}_1$ and $\boldsymbol{q}_2$ respectively. $\boldsymbol{q}_1$ is:

$$\begin{align*} \boldsymbol{q}_1&= \frac{1}{\sqrt{3^2+4^2}} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\frac{1}{5} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\begin{pmatrix}3/5\\4/5\end{pmatrix} \end{align*}$$

Next, $\boldsymbol{q}_2$ is:

$$\begin{align*} \boldsymbol{q}_2&= \frac{1}{\sqrt{(-8/5)^2+(6/5)^2}} \begin{pmatrix}-8/5\\6/5\end{pmatrix}\\ &=\frac{1}{2} \begin{pmatrix}-8/5\\6/5\end{pmatrix}\\ &=\begin{pmatrix}-4/5\\3/5\end{pmatrix} \end{align*}$$

Therefore, the associated orthogonal matrix of $\boldsymbol{A}$ is:

$$\boldsymbol{Q}= \begin{pmatrix}3/5&-4/5\\4/5&3/5\end{pmatrix}$$

Next, the upper triangular matrix $\boldsymbol{R}$ is:

$$\begin{align*} \boldsymbol{R}&=\begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1& \boldsymbol{x}_2\cdot\boldsymbol{q}_1\\ 0&\boldsymbol{x}_2\cdot\boldsymbol{q}_2 \end{pmatrix}\\&= \begin{pmatrix} (3)(3/5)+(4)(4/5)& (2)(3/5)+(6)(4/5) \\0&(2)(-4/5)+(6)(3/5) \end{pmatrix}\\&= \begin{pmatrix}5&6\\0&2\end{pmatrix} \end{align*}$$

Therefore, the $\boldsymbol{QR}$ factorization of $\boldsymbol{A}$ is:

$$\boldsymbol{A}= \begin{pmatrix}3/5&-4/5\\4/5&3/5\end{pmatrix} \begin{pmatrix}5&6\\0&2\end{pmatrix}$$

■

Example.

Performing QR factorization on a 3x2 matrix

Perform $\boldsymbol{QR}$ factorization on the following matrix:

$$\boldsymbol{A}=\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix} $$

Solution. We first must obtain the associated orthogonal matrix $\boldsymbol{Q}$ of $\boldsymbol{A}$ using the Gram-Schmidt process. Let $\boldsymbol{q}_1$ be:

$$\boldsymbol{q}_1= \begin{pmatrix} 0\\1\\0 \end{pmatrix}$$

The second vector $\boldsymbol{v}_2$ that is orthogonal to $\boldsymbol{q}_1$ is:

$$\begin{align*} \boldsymbol{v}_2&= \begin{pmatrix}1\\0\\1\end{pmatrix} -\frac{(1)(0)+(0)(1)+(1)(0)}{(1)^2} \begin{pmatrix}0\\1\\0\end{pmatrix} =\begin{pmatrix}1\\0\\1\end{pmatrix} \end{align*}$$

Let's convert $\boldsymbol{v}_2$ into an unit vector:

$$\boldsymbol{q}_2= \frac{1}{\sqrt{1^2+1^2}} \begin{pmatrix}1\\0\\1\end{pmatrix}= \begin{pmatrix}1/\sqrt2\\0\\1/\sqrt2\end{pmatrix}$$

Therefore, the associated orthogonal matrix is:

$$\boldsymbol{Q}= \begin{pmatrix} 0&1/\sqrt2\\1&0\\0&1/\sqrt2 \end{pmatrix}$$

The upper triangular matrix $\boldsymbol{R}$ is:

$$\begin{align*} \boldsymbol{R}&=\begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1& \boldsymbol{x}_2\cdot\boldsymbol{q}_1\\ 0&\boldsymbol{x}_2\cdot\boldsymbol{q}_2 \end{pmatrix}\\&= \begin{pmatrix} (0)(0)+(1)(1)+(0)(0)&(1)(0)+(0)(1)+(1)(0)\\ 0&(1)(1/\sqrt2)+(0)(0)+(1)(1/\sqrt2) \end{pmatrix}\\ &=\begin{pmatrix}1&0\\0&\sqrt2\end{pmatrix} \end{align*}$$

Therefore, the $\boldsymbol{QR}$ factorization of $\boldsymbol{A}$ is:

$$\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix}= \begin{pmatrix} 0&1/\sqrt2\\1&0\\0&1/\sqrt2 \end{pmatrix} \begin{pmatrix}1&0\\0&\sqrt2\end{pmatrix} $$

■

Theorem.

Uniqueness of QR factorization

The $\boldsymbol{QR}$ factorization of a matrix is unique. In other words, there can be at most one $\boldsymbol{QR}$ form of a matrix.

Proof. Suppose $\boldsymbol{A}$ can be $\boldsymbol{QR}$-factorized into $\boldsymbol{A}=\boldsymbol{Q}_1\boldsymbol{R}_1$ and $\boldsymbol{A}= \boldsymbol{Q}_2 \boldsymbol{R}_2$. Equating the two equations gives:

$$\begin{equation}\label{eq:ueU7wc2mRd2QCZaVdNM} \begin{aligned}[b] \boldsymbol{Q}_1\boldsymbol{R}_1&= \boldsymbol{Q}_2\boldsymbol{R}_2\\ \boldsymbol{Q}_1\boldsymbol{R}_1\boldsymbol{R}_2^{-1}&= \boldsymbol{Q}_2\\ \boldsymbol{R}_1\boldsymbol{R}_2^{-1}&= \boldsymbol{Q}_1^{-1}\boldsymbol{Q}_2\\ \end{aligned} \end{equation}$$

By theoremlink, because $\boldsymbol{R}_2$ is an upper triangular matrix, its inverse $\boldsymbol{R}^{-1}_2$ is also an upper triangular matrix. By theoremlink, $\boldsymbol{R}_1\boldsymbol{R}^{-1}_2$ is upper triangular because the product of two upper triangular matrices is also upper triangular.

Next, by theoremlink, the inverse of an orthogonal matrix is also orthogonal, which means $\boldsymbol{Q}^{-1}_1$ is orthogonal. By theoremlink, $\boldsymbol{Q}^{-1}_1\boldsymbol{Q}_2$ is also orthogonal because the product of two orthogonal matrices is orthogonal.

Therefore, the left-hand side of \eqref{eq:ueU7wc2mRd2QCZaVdNM} is upper triangular while the right-hand side is orthogonal. By theoremlink, upper triangular orthogonal matrices are diagonal matrices with diagonal entries $\pm1$. However, because the diagonal entries of $\boldsymbol{R}_1$ and $\boldsymbol{R}_2^{-1}$ are strictly positive, the diagonal entries of $\boldsymbol{R}_1\boldsymbol{R}^{-1}_2$ are also strictly positive by theoremlink. This means that:

$$\begin{equation}\label{eq:V8pFSzlUhOsQTF5m6aG} \boldsymbol{R}_1\boldsymbol{R}_2^{-1}= \boldsymbol{Q}_1^{-1}\boldsymbol{Q}_2=\boldsymbol{I}_n \end{equation}$$

Where $\boldsymbol{I}_n$ is the $n\times{n}$ identity matrix. Now, \eqref{eq:V8pFSzlUhOsQTF5m6aG} implies:

$$\begin{align*} \boldsymbol{R}_1\boldsymbol{R}_2^{-1} &=\boldsymbol{I}_n\\ \boldsymbol{R}_1 &=\boldsymbol{R}_2 \end{align*}$$

Also, \eqref{eq:V8pFSzlUhOsQTF5m6aG} implies:

$$\begin{align*} \boldsymbol{Q}_1^{-1}\boldsymbol{Q}_2 &=\boldsymbol{I}_n\\ \boldsymbol{Q}_2 &=\boldsymbol{Q}_1\\ \end{align*}$$

Because $\boldsymbol{R}_1=\boldsymbol{R}_2$ and $\boldsymbol{Q}_1=\boldsymbol{Q}_2$, we conclude that $\boldsymbol{QR}$-factorization is unique. This completes the proof.

■

Theorem.

Using QR factorization to obtain the least squares solution

Suppose $\boldsymbol{A}$ can be $\boldsymbol{QR}$-factorized. For any $\boldsymbol{b}\in\mathbb{R}^m$, the system $\boldsymbol{Ax}=\boldsymbol{b}$ has a unique least squares solution given by:

$$\boldsymbol{x}= \boldsymbol{R}^{-1} \boldsymbol{Q}^T\boldsymbol{b}$$

Proof. Recalllink that the least squares solution of the system $\boldsymbol{Ax}=\boldsymbol{b}$ is given by:

$$\boldsymbol{x} =(\boldsymbol{A}^T\boldsymbol{A})^{-1} \boldsymbol{A}^T\boldsymbol{b}$$

Now, we substitute $\boldsymbol{A}=\boldsymbol{QR}$ to get:

$$\begin{align*} \boldsymbol{x} &=\big((\boldsymbol{QR})^T(\boldsymbol{QR})\big) ^{-1}(\boldsymbol{QR})^T\boldsymbol{b}\\ &=\big(\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{QR}\big)^{-1}\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{b}\\ &=\big(\boldsymbol{R}^T\boldsymbol{Q}^{-1}\boldsymbol{QR}\big)^{-1}\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{b}\\ &=\big(\boldsymbol{R}^T\boldsymbol{R}\big)^{-1}\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{b}\\ &=\boldsymbol{R}^{-1} (\boldsymbol{R}^{T})^{-1}\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{b}\\ &=\boldsymbol{R}^{-1}\boldsymbol{Q}^T\boldsymbol{b} \end{align*}$$

Here, we used theoremlink and theoremlink. This completes the proof.

■

Theorem.

Solving the least squares problem using QR factorization

Let's revisit examplelink in which we found the least squares solution to the system $\boldsymbol{Ax}=\boldsymbol{b}$ where:

$$\boldsymbol{A}=\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix},\;\;\;\;\; \boldsymbol{b}= \begin{pmatrix} 1\\2\\2 \end{pmatrix} $$

Find the least squares solution using $\boldsymbol{QR}$ factorization.

Proof. We found the least squares solution by solving:

$$\begin{equation}\label{eq:YwRlrMsBKoQdhcvtdix} \boldsymbol{x} =(\boldsymbol{A}^T\boldsymbol{A})^{-1}\boldsymbol{A}^T\boldsymbol{b} \end{equation}$$

The least squares solution was:

$$\begin{equation}\label{eq:tfLTHmD6bgmLeH74dOt} \boldsymbol{x}=\begin{pmatrix}2\\1.5\end{pmatrix} \end{equation}$$

Let's arrive at the same solution using $\boldsymbol{QR}$ factorization. As found in examplelink, the $\boldsymbol{QR}$ factorization of $\boldsymbol{A}$ is:

$$\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix}= \begin{pmatrix} 0&1/\sqrt2\\1&0\\0&1/\sqrt2 \end{pmatrix} \begin{pmatrix}1&0\\0&\sqrt2\end{pmatrix} $$

In this case, $\boldsymbol{R}$ is a diagonal matrix. By theoremlink, the inverse of $\boldsymbol{R}$ is another diagonal matrix whose diagonal entries are the reciprocals:

$$\boldsymbol{R^{-1}}= \begin{pmatrix}1&0\\0&1/\sqrt2\end{pmatrix}$$

By theoremlink, the least squares solution is:

$$\begin{align*} \boldsymbol{x}&= \boldsymbol{R}^{-1} \boldsymbol{Q}^T\boldsymbol{b}\\ &=\begin{pmatrix}1&0\\0&1/\sqrt2\end{pmatrix} \begin{pmatrix}0&1&0\\1/\sqrt2&0&1/\sqrt2\end{pmatrix} \begin{pmatrix}1\\2\\2\end{pmatrix}\\ &=\begin{pmatrix} 2\\1.5 \end{pmatrix} \end{align*}$$

Great, this aligns with \eqref{eq:tfLTHmD6bgmLeH74dOt}.

■

Benefits of using QR factorization

We won't go into the technical details here but it turns out that solving certain problems such as the least squares problem via $\boldsymbol{QR}$ factorization offers higher numerical accuracy. In contrast, the standard approach of computing \eqref{eq:YwRlrMsBKoQdhcvtdix} using computer programs is that the result may be inaccurate due to rounding-off errors. In addition, an iterative algorithm called the $\boldsymbol{QR}$ algorithm, which is based on $\boldsymbol{QR}$ factorization, is also widely used to find eigenvalues of larger matrices efficiently.

Published by Isshin Inada

Edited by 0 others

Did you find this page useful?

thumb_up

thumb_down

Comment

Citation

Ask a question or leave a feedback...

thumb_up

thumb_down

chat_bubble_outline

settings

Enjoy our search

Hit / to insta-search docs and recipes!