Comprehensive Guide on QR Factorization

Last updated: Aug 10, 2023
Tags: Linear Algebra
Theorem.

QR factorization

If $\boldsymbol{A}$ is an $m\times{n}$ matrix with linearly independent column vectors, then $\boldsymbol{A}$ can be factorized as:

$$\boldsymbol{A}= \boldsymbol{QR}$$

where $\boldsymbol{Q}$ is an $m\times{n}$ matrix whose column vectors are orthonormal and $\boldsymbol{R}$ is an $n\times{n}$ invertible upper triangular matrix.

Proof. Let $\boldsymbol{A}$ be an $m\times{n}$ matrix with $n$ linearly independent column vectors and let $\boldsymbol{Q}$ be an $m\times{n}$ orthogonal matrix whose column vectors are constructed by applying the Gram-Schmidt process to the columns of $\boldsymbol{A}$. These matrices are shown below:

$$\boldsymbol{A}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix},\;\;\;\;\;\; \boldsymbol{Q}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}$$

Note that for the Gram-Schmidt process to work, the column vectors of $\boldsymbol{A}$ must be linearly independent.

The set $\{\boldsymbol{q}_1,\boldsymbol{q}_2, \cdots,\boldsymbol{q}_n\}$ is an orthonormal basis for the column space of $\boldsymbol{A}$. Each column vector of $\boldsymbol{A}$ can therefore be expressed in this orthonormal basis like so:

$$\begin{align*} \boldsymbol{x}_1 &=(\boldsymbol{x}_1\cdot\boldsymbol{q}_1)\boldsymbol{q}_1 +(\boldsymbol{x}_1\cdot\boldsymbol{q}_2)\boldsymbol{q}_2 +\cdots+(\boldsymbol{x}_1\cdot\boldsymbol{q}_n)\boldsymbol{q}_n\\ \boldsymbol{x}_2 &=(\boldsymbol{x}_2\cdot\boldsymbol{q}_1)\boldsymbol{q}_1 +(\boldsymbol{x}_2\cdot\boldsymbol{q}_2)\boldsymbol{q}_2 +\cdots+(\boldsymbol{x}_2\cdot\boldsymbol{q}_n)\boldsymbol{q}_n\\ &\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots\\ \boldsymbol{x}_n&=(\boldsymbol{x}_n\cdot\boldsymbol{q}_1)\boldsymbol{q}_1 +(\boldsymbol{x}_n\cdot\boldsymbol{q}_2)\boldsymbol{q}_2 +\cdots+(\boldsymbol{x}_n\cdot\boldsymbol{q}_n)\boldsymbol{q}_n \end{align*}$$

In matrix form, this can be written as:

$$\begin{equation}\label{eq:LOwUNEaa0XTeKlhE1sn} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1 &\boldsymbol{x}_2\cdot\boldsymbol{q}_1 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_1\\ \boldsymbol{x}_1\cdot\boldsymbol{q}_2& \boldsymbol{x}_2\cdot\boldsymbol{q}_2 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_2\\ \vdots&\vdots&\smash\ddots&\vdots\\ \boldsymbol{x}_1\cdot\boldsymbol{q}_n &\boldsymbol{x}_2\cdot\boldsymbol{q}_n &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_n \end{pmatrix} \end{equation}$$

By the properties of the Gram-Schmidt process, we have that:

$$\begin{align*} &\boldsymbol{x}_1 \;\text{ is orthogonal to }\; \boldsymbol{q}_2,\boldsymbol{q}_3, \boldsymbol{q}_4,\cdots,\boldsymbol{q}_n\\ &\boldsymbol{x}_2 \;\text{ is orthogonal to }\; \boldsymbol{q}_3,\boldsymbol{q}_4,\cdots, \boldsymbol{q}_n\\ &\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots\\ &\boldsymbol{x}_{n-1} \;\text{ is orthogonal to }\; \boldsymbol{q}_n \end{align*}$$

Therefore, \eqref{eq:LOwUNEaa0XTeKlhE1sn} becomes:

$$\begin{equation}\label{eq:YO3WKNRMjx6l4ScZ6pw} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1 &\boldsymbol{x}_2\cdot\boldsymbol{q}_1 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_1\\ 0& \boldsymbol{x}_2\cdot\boldsymbol{q}_2 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_2\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0 &0&\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_n \end{pmatrix} \end{equation}$$

Notice how the right-hand side is now a product of an orthogonal matrix $\boldsymbol{Q}$ and an upper triangular matrix which we denote as $\boldsymbol{R}$. Let's rewrite \eqref{eq:YO3WKNRMjx6l4ScZ6pw} in short-hand:

$$\boldsymbol{A}= \boldsymbol{QR}$$

Next, we have that $\boldsymbol{x}_i \cdot \boldsymbol{q}_i =\Vert\boldsymbol{v}_i\Vert$ for $i=1,2,\cdots,n$, where $\boldsymbol{v}_i$ is the $i$-th orthogonal vector produced by the Gram-Schmidt process before normalization. Therefore, \eqref{eq:YO3WKNRMjx6l4ScZ6pw} can be written as:

$$\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x}_1&\boldsymbol{x}_2&\cdots&\boldsymbol{x}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{q}_1&\boldsymbol{q}_2&\cdots&\boldsymbol{q}_n\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \Vert\boldsymbol{v}_1\Vert &\boldsymbol{x}_2\cdot\boldsymbol{q}_1 &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_1\\ 0& \Vert\boldsymbol{v}_2\Vert &\cdots&\boldsymbol{x}_n\cdot\boldsymbol{q}_2\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\Vert\boldsymbol{v}_n\Vert \end{pmatrix}$$

Note that because the column vectors of $\boldsymbol{A}$ are linearly independent, each Gram-Schmidt vector $\boldsymbol{v}_i$ is non-zero. This means that $\Vert\boldsymbol{v}_i\Vert\gt0$ for $i=1,2,\cdots,n$. A triangular matrix with non-zero diagonal entries is invertible, so $\boldsymbol{R}$ is invertible. This completes the proof.
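The constructive proof above translates directly into code. Below is a minimal sketch of $\boldsymbol{QR}$ factorization via the Gram-Schmidt process using NumPy; the function name `gram_schmidt_qr` is our own, not a library routine, and the code assumes the columns of the input are linearly independent, as the theorem requires:

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR factorization via the classical Gram-Schmidt process.

    Assumes the columns of A are linearly independent. Returns Q with
    orthonormal columns and R upper triangular with positive diagonal.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        # Subtract from the j-th column its projections onto the
        # previously computed orthonormal vectors q_1, ..., q_{j-1}.
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # the entry x_j . q_i of R
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)       # ||v_j|| on the diagonal
        Q[:, j] = v / R[j, j]             # normalize to get q_j
    return Q, R
```

In practice one would use a library routine such as `np.linalg.qr`, which is based on Householder reflections and is numerically more stable than classical Gram-Schmidt.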

Theorem.

Invertible matrices can be QR-factorized

If $\boldsymbol{A}$ is an invertible matrix, then $\boldsymbol{A}$ can be $\boldsymbol{QR}$-factorized.

Proof. If $\boldsymbol{A}$ is an invertible matrix, then the columns of $\boldsymbol{A}$ are linearly independent. By the theorem above, $\boldsymbol{A}$ can therefore be $\boldsymbol{QR}$-factorized. This completes the proof.

Example.

Performing QR factorization on a 2x2 matrix

Perform $\boldsymbol{QR}$ factorization on the following matrix:

$$\begin{pmatrix} 3&2\\ 4&6 \end{pmatrix}$$

Solution. The first step is to obtain the orthogonal matrix $\boldsymbol{Q}$ by applying the Gram-Schmidt process to the columns of:

$$\boldsymbol{A}= \begin{pmatrix} 3&2\\4&6 \end{pmatrix}$$

We start as follows:

$$\boldsymbol{v}_1= \begin{pmatrix} 3\\4 \end{pmatrix}$$

Next, we find $\boldsymbol{v}_2$ that is orthogonal to $\boldsymbol{v}_1$ like so:

$$\begin{align*} \boldsymbol{v}_2&= \begin{pmatrix}2\\6\end{pmatrix}- \frac{(2)(3)+(6)(4)}{3^2+4^2} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\begin{pmatrix}2\\6\end{pmatrix}- \frac{30}{25} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\begin{pmatrix}2\\6\end{pmatrix}- \begin{pmatrix}18/5\\24/5\end{pmatrix}\\ &=\begin{pmatrix}-8/5\\6/5\end{pmatrix} \end{align*}$$

Let's turn $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$ into unit vectors $\boldsymbol{q}_1$ and $\boldsymbol{q}_2$ respectively. $\boldsymbol{q}_1$ is:

$$\begin{align*} \boldsymbol{q}_1&= \frac{1}{\sqrt{3^2+4^2}} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\frac{1}{5} \begin{pmatrix}3\\4\end{pmatrix}\\ &=\begin{pmatrix}3/5\\4/5\end{pmatrix} \end{align*}$$

Next, $\boldsymbol{q}_2$ is:

$$\begin{align*} \boldsymbol{q}_2&= \frac{1}{\sqrt{(-8/5)^2+(6/5)^2}} \begin{pmatrix}-8/5\\6/5\end{pmatrix}\\ &=\frac{1}{2} \begin{pmatrix}-8/5\\6/5\end{pmatrix}\\ &=\begin{pmatrix}-4/5\\3/5\end{pmatrix} \end{align*}$$

Therefore, the associated orthogonal matrix of $\boldsymbol{A}$ is:

$$\boldsymbol{Q}= \begin{pmatrix}3/5&-4/5\\4/5&3/5\end{pmatrix}$$

Next, the upper triangular matrix $\boldsymbol{R}$ is:

$$\begin{align*} \boldsymbol{R}&=\begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1& \boldsymbol{x}_2\cdot\boldsymbol{q}_1\\ 0&\boldsymbol{x}_2\cdot\boldsymbol{q}_2 \end{pmatrix}\\&= \begin{pmatrix} (3)(3/5)+(4)(4/5)& (2)(3/5)+(6)(4/5) \\0&(2)(-4/5)+(6)(3/5) \end{pmatrix}\\&= \begin{pmatrix}5&6\\0&2\end{pmatrix} \end{align*}$$

Therefore, the $\boldsymbol{QR}$ factorization of $\boldsymbol{A}$ is:

$$\boldsymbol{A}= \begin{pmatrix}3/5&-4/5\\4/5&3/5\end{pmatrix} \begin{pmatrix}5&6\\0&2\end{pmatrix}$$
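We can check this hand computation numerically. Note that `np.linalg.qr` may return the factors with flipped signs, so the more robust check is that our $\boldsymbol{Q}$ has orthonormal columns and that $\boldsymbol{QR}$ reproduces $\boldsymbol{A}$:

```python
import numpy as np

# The factors computed by hand above.
A = np.array([[3.0, 2.0],
              [4.0, 6.0]])
Q = np.array([[3/5, -4/5],
              [4/5,  3/5]])
R = np.array([[5.0, 6.0],
              [0.0, 2.0]])

# Q has orthonormal columns, R is upper triangular, and Q @ R = A.
ok = np.allclose(Q @ R, A) and np.allclose(Q.T @ Q, np.eye(2))
print(ok)  # True
```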
Example.

Performing QR factorization on a 3x2 matrix

Perform $\boldsymbol{QR}$ factorization on the following matrix:

$$\boldsymbol{A}=\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix} $$

Solution. We first obtain the associated orthogonal matrix $\boldsymbol{Q}$ of $\boldsymbol{A}$ using the Gram-Schmidt process. Since the first column of $\boldsymbol{A}$ is already a unit vector, we let $\boldsymbol{q}_1$ be:

$$\boldsymbol{q}_1= \begin{pmatrix} 0\\1\\0 \end{pmatrix}$$

The second vector $\boldsymbol{v}_2$ that is orthogonal to $\boldsymbol{q}_1$ is:

$$\begin{align*} \boldsymbol{v}_2&= \begin{pmatrix}1\\0\\1\end{pmatrix} -\frac{(1)(0)+(0)(1)+(1)(0)}{(1)^2} \begin{pmatrix}0\\1\\0\end{pmatrix} =\begin{pmatrix}1\\0\\1\end{pmatrix} \end{align*}$$

Let's convert $\boldsymbol{v}_2$ into a unit vector:

$$\boldsymbol{q}_2= \frac{1}{\sqrt{1^2+1^2}} \begin{pmatrix}1\\0\\1\end{pmatrix}= \begin{pmatrix}1/\sqrt2\\0\\1/\sqrt2\end{pmatrix}$$

Therefore, the associated orthogonal matrix is:

$$\boldsymbol{Q}= \begin{pmatrix} 0&1/\sqrt2\\1&0\\0&1/\sqrt2 \end{pmatrix}$$

The upper triangular matrix $\boldsymbol{R}$ is:

$$\begin{align*} \boldsymbol{R}&=\begin{pmatrix} \boldsymbol{x}_1\cdot\boldsymbol{q}_1& \boldsymbol{x}_2\cdot\boldsymbol{q}_1\\ 0&\boldsymbol{x}_2\cdot\boldsymbol{q}_2 \end{pmatrix}\\&= \begin{pmatrix} (0)(0)+(1)(1)+(0)(0)&(1)(0)+(0)(1)+(1)(0)\\ 0&(1)(1/\sqrt2)+(0)(0)+(1)(1/\sqrt2) \end{pmatrix}\\ &=\begin{pmatrix}1&0\\0&\sqrt2\end{pmatrix} \end{align*}$$

Therefore, the $\boldsymbol{QR}$ factorization of $\boldsymbol{A}$ is:

$$\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix}= \begin{pmatrix} 0&1/\sqrt2\\1&0\\0&1/\sqrt2 \end{pmatrix} \begin{pmatrix}1&0\\0&\sqrt2\end{pmatrix} $$
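As with the previous example, we can verify the factorization numerically. For a non-square $\boldsymbol{A}$ like this one, $\boldsymbol{Q}$ is rectangular, so it is not orthogonal in the strict square sense; instead we check that its columns are orthonormal, i.e. $\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{I}_2$:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = np.array([[0.0, 1/np.sqrt(2)],
              [1.0, 0.0],
              [0.0, 1/np.sqrt(2)]])
R = np.array([[1.0, 0.0],
              [0.0, np.sqrt(2)]])

print(np.allclose(Q @ R, A))            # True
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: orthonormal columns
```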
Theorem.

Uniqueness of QR factorization

The $\boldsymbol{QR}$ factorization of an invertible matrix is unique. In other words, there is at most one way to write $\boldsymbol{A}=\boldsymbol{QR}$ where $\boldsymbol{Q}$ is orthogonal and $\boldsymbol{R}$ is upper triangular with positive diagonal entries.

Proof. Suppose $\boldsymbol{A}$ can be $\boldsymbol{QR}$-factorized into $\boldsymbol{A}=\boldsymbol{Q}_1\boldsymbol{R}_1$ and $\boldsymbol{A}= \boldsymbol{Q}_2 \boldsymbol{R}_2$. Equating the two equations gives:

$$\begin{equation}\label{eq:ueU7wc2mRd2QCZaVdNM} \begin{aligned}[b] \boldsymbol{Q}_1\boldsymbol{R}_1&= \boldsymbol{Q}_2\boldsymbol{R}_2\\ \boldsymbol{Q}_1\boldsymbol{R}_1\boldsymbol{R}_2^{-1}&= \boldsymbol{Q}_2\\ \boldsymbol{R}_1\boldsymbol{R}_2^{-1}&= \boldsymbol{Q}_1^{-1}\boldsymbol{Q}_2\\ \end{aligned} \end{equation}$$

Because $\boldsymbol{R}_2$ is an upper triangular matrix, its inverse $\boldsymbol{R}^{-1}_2$ is also upper triangular. The product $\boldsymbol{R}_1\boldsymbol{R}^{-1}_2$ is therefore upper triangular because the product of two upper triangular matrices is upper triangular.

Next, the inverse of an orthogonal matrix is also orthogonal, which means $\boldsymbol{Q}^{-1}_1$ is orthogonal. The product $\boldsymbol{Q}^{-1}_1\boldsymbol{Q}_2$ is also orthogonal because the product of two orthogonal matrices is orthogonal.

Therefore, the left-hand side of \eqref{eq:ueU7wc2mRd2QCZaVdNM} is upper triangular while the right-hand side is orthogonal. An upper triangular orthogonal matrix must be a diagonal matrix with diagonal entries $\pm1$. However, because the diagonal entries of $\boldsymbol{R}_1$ and $\boldsymbol{R}_2^{-1}$ are strictly positive, the diagonal entries of $\boldsymbol{R}_1\boldsymbol{R}^{-1}_2$ are also strictly positive. This means that:

$$\begin{equation}\label{eq:V8pFSzlUhOsQTF5m6aG} \boldsymbol{R}_1\boldsymbol{R}_2^{-1}= \boldsymbol{Q}_1^{-1}\boldsymbol{Q}_2=\boldsymbol{I}_n \end{equation}$$

Where $\boldsymbol{I}_n$ is the $n\times{n}$ identity matrix. Now, \eqref{eq:V8pFSzlUhOsQTF5m6aG} implies:

$$\begin{align*} \boldsymbol{R}_1\boldsymbol{R}_2^{-1} &=\boldsymbol{I}_n\\ \boldsymbol{R}_1 &=\boldsymbol{R}_2 \end{align*}$$

Also, \eqref{eq:V8pFSzlUhOsQTF5m6aG} implies:

$$\begin{align*} \boldsymbol{Q}_1^{-1}\boldsymbol{Q}_2 &=\boldsymbol{I}_n\\ \boldsymbol{Q}_2 &=\boldsymbol{Q}_1\\ \end{align*}$$

Because $\boldsymbol{R}_1=\boldsymbol{R}_2$ and $\boldsymbol{Q}_1=\boldsymbol{Q}_2$, we conclude that $\boldsymbol{QR}$-factorization is unique. This completes the proof.
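The uniqueness claim can be illustrated numerically. `np.linalg.qr` may return $\boldsymbol{R}$ with negative diagonal entries, but after normalizing the signs, the theorem says we must recover exactly the factors found by hand for the $2\times2$ matrix from the earlier example:

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [4.0, 6.0]])

Q, R = np.linalg.qr(A)
# The library may return R with negative diagonal entries; flipping
# the signs of the matching columns of Q and rows of R yields the
# unique factorization with a positive diagonal (Q D D R = Q R = A).
signs = np.sign(np.diag(R))
Q, R = Q * signs, signs[:, None] * R

print(np.allclose(R, [[5, 6], [0, 2]]))  # True: matches the hand computation
```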

Theorem.

Using QR factorization to obtain the least squares solution

Suppose $\boldsymbol{A}$ can be $\boldsymbol{QR}$-factorized. For any $\boldsymbol{b}\in\mathbb{R}^m$, the system $\boldsymbol{Ax}=\boldsymbol{b}$ has a unique least squares solution given by:

$$\boldsymbol{x}= \boldsymbol{R}^{-1} \boldsymbol{Q}^T\boldsymbol{b}$$

Proof. Recall that the least squares solution of the system $\boldsymbol{Ax}=\boldsymbol{b}$ is given by:

$$\boldsymbol{x} =(\boldsymbol{A}^T\boldsymbol{A})^{-1} \boldsymbol{A}^T\boldsymbol{b}$$

Now, we substitute $\boldsymbol{A}=\boldsymbol{QR}$ to get:

$$\begin{align*} \boldsymbol{x} &=\big((\boldsymbol{QR})^T(\boldsymbol{QR})\big) ^{-1}(\boldsymbol{QR})^T\boldsymbol{b}\\ &=\big(\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{QR}\big)^{-1}\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{b}\\ &=\big(\boldsymbol{R}^T\boldsymbol{R}\big)^{-1}\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{b}\\ &=\boldsymbol{R}^{-1} (\boldsymbol{R}^{T})^{-1}\boldsymbol{R}^T\boldsymbol{Q}^T\boldsymbol{b}\\ &=\boldsymbol{R}^{-1}\boldsymbol{Q}^T\boldsymbol{b} \end{align*}$$

Here, we used the fact that $\boldsymbol{Q}^T\boldsymbol{Q}=\boldsymbol{I}_n$ because the columns of $\boldsymbol{Q}$ are orthonormal, as well as the fact that $(\boldsymbol{R}^T\boldsymbol{R})^{-1}=\boldsymbol{R}^{-1}(\boldsymbol{R}^T)^{-1}$. This completes the proof.
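As a sketch, this formula can be applied directly with NumPy's built-in `np.linalg.qr`. The helper function `qr_least_squares` below is our own name; it solves the triangular system $\boldsymbol{Rx}=\boldsymbol{Q}^T\boldsymbol{b}$ rather than forming $\boldsymbol{R}^{-1}$ explicitly, which is standard numerical practice:

```python
import numpy as np

def qr_least_squares(A, b):
    """Least squares solution of Ax = b via x = R^{-1} Q^T b.

    Assumes the columns of A are linearly independent, so the reduced
    QR factorization exists and R is invertible.
    """
    Q, R = np.linalg.qr(A)          # reduced QR: Q is m x n, R is n x n
    # Solve R x = Q^T b instead of computing R^{-1} explicitly.
    return np.linalg.solve(R, Q.T @ np.asarray(b, dtype=float))
```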

Example.

Solving the least squares problem using QR factorization

Let's revisit a previous example in which we found the least squares solution to the system $\boldsymbol{Ax}=\boldsymbol{b}$ where:

$$\boldsymbol{A}=\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix},\;\;\;\;\; \boldsymbol{b}= \begin{pmatrix} 1\\2\\2 \end{pmatrix} $$

Find the least squares solution using $\boldsymbol{QR}$ factorization.

Solution. Previously, we found the least squares solution by solving:

$$\begin{equation}\label{eq:YwRlrMsBKoQdhcvtdix} \boldsymbol{x} =(\boldsymbol{A}^T\boldsymbol{A})^{-1}\boldsymbol{A}^T\boldsymbol{b} \end{equation}$$

The least squares solution was:

$$\begin{equation}\label{eq:tfLTHmD6bgmLeH74dOt} \boldsymbol{x}=\begin{pmatrix}2\\1.5\end{pmatrix} \end{equation}$$

Let's arrive at the same solution using $\boldsymbol{QR}$ factorization. As found in the earlier example, the $\boldsymbol{QR}$ factorization of $\boldsymbol{A}$ is:

$$\begin{pmatrix} 0&1\\1&0\\0&1 \end{pmatrix}= \begin{pmatrix} 0&1/\sqrt2\\1&0\\0&1/\sqrt2 \end{pmatrix} \begin{pmatrix}1&0\\0&\sqrt2\end{pmatrix} $$

In this case, $\boldsymbol{R}$ is a diagonal matrix, so its inverse is the diagonal matrix whose diagonal entries are the reciprocals:

$$\boldsymbol{R^{-1}}= \begin{pmatrix}1&0\\0&1/\sqrt2\end{pmatrix}$$

By the theorem above, the least squares solution is:

$$\begin{align*} \boldsymbol{x}&= \boldsymbol{R}^{-1} \boldsymbol{Q}^T\boldsymbol{b}\\ &=\begin{pmatrix}1&0\\0&1/\sqrt2\end{pmatrix} \begin{pmatrix}0&1&0\\1/\sqrt2&0&1/\sqrt2\end{pmatrix} \begin{pmatrix}1\\2\\2\end{pmatrix}\\ &=\begin{pmatrix} 2\\1.5 \end{pmatrix} \end{align*}$$

Great, this aligns with \eqref{eq:tfLTHmD6bgmLeH74dOt}.

Benefits of using QR factorization

We won't go into the technical details here, but solving certain problems such as the least squares problem via $\boldsymbol{QR}$ factorization offers higher numerical accuracy. In contrast, computing \eqref{eq:YwRlrMsBKoQdhcvtdix} directly on a computer can yield inaccurate results due to rounding errors, because explicitly forming $\boldsymbol{A}^T\boldsymbol{A}$ squares the condition number of the problem. In addition, an iterative method called the $\boldsymbol{QR}$ algorithm, which is based on repeated $\boldsymbol{QR}$ factorization, is widely used to compute the eigenvalues of large matrices efficiently.
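To give a flavor of the $\boldsymbol{QR}$ algorithm, here is a minimal unshifted sketch (not production code, which uses shifts and Hessenberg reduction): repeatedly factor $\boldsymbol{A}=\boldsymbol{QR}$ and replace $\boldsymbol{A}$ with $\boldsymbol{RQ}$. Since $\boldsymbol{RQ}=\boldsymbol{Q}^{-1}\boldsymbol{AQ}$, each iterate is similar to $\boldsymbol{A}$ and hence has the same eigenvalues; for a symmetric matrix with distinct eigenvalues, the diagonal converges to those eigenvalues:

```python
import numpy as np

def qr_algorithm(A, iters=200):
    """Unshifted QR iteration: factor A = QR, then set A <- RQ.

    A minimal sketch for symmetric matrices with distinct eigenvalues;
    practical implementations add shifts and Hessenberg reduction.
    """
    A = np.asarray(A, dtype=float).copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(A)
        A = R @ Q   # similar to the previous A, so eigenvalues are preserved
    return np.sort(np.diag(A))  # the diagonal converges to the eigenvalues
```

For example, `qr_algorithm(np.array([[2., 1.], [1., 2.]]))` converges to the eigenvalues $1$ and $3$ of that matrix.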

Published by Isshin Inada