Linear Algebra Note #4: LU Factorization

This note is based on MIT 18.06 📒


Inverse of Operations

The inverse of matrix product:

\[ (AB)^{-1} = B^{-1}A^{-1} \]

The inverse of transpose:

\[ (A^{T})^{-1} = (A^{-1})^{T} \]

Understanding: a column combination of \(A\) corresponds to a row combination of \(A^{T}\) with the same multipliers. For the operation matrix (the inverse), the order of transpose and inverse does not matter: either order maps the same multipliers from the columns of \(A\) to the rows of \(A^{T}\).
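Both identities are easy to verify numerically. A quick NumPy check on a hypothetical pair of invertible matrices (the specific values are just an illustration):

```python
import numpy as np

# Two invertible matrices chosen for illustration.
A = np.array([[2.0, 1.0], [8.0, 7.0]])
B = np.array([[1.0, 3.0], [2.0, 5.0]])

# (AB)^{-1} = B^{-1} A^{-1}: note the reversed order on the right.
lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)
assert np.allclose(lhs, rhs)

# (A^T)^{-1} = (A^{-1})^T: transpose and inverse commute.
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)
```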

LU Factorization

Let's think about another view of elimination. Given a \(2\times 2\) matrix \(A\), one elementary matrix (operation) \(E_{21}\) can get to \(U\) (the result matrix after elimination):

\[ \begin{array}{c} \begin{bmatrix} 1 & 0 \\\\ -4 & 1 \end{bmatrix} \\ E_{21} \end{array} \times \begin{array}{c} \begin{bmatrix} 2 & 1 \\\\ 8 & 7 \end{bmatrix} \\ A \end{array} = \begin{array}{c} \begin{bmatrix} 2 & 1 \\\\ 0 & 3 \end{bmatrix} \\ U \end{array} \]

To restore \(A\) from \(U\), multiply by the inverse of \(E_{21}\), which we call \(L\):

\[ \begin{array}{c} \begin{bmatrix} 2 & 1 \\\\ 8 & 7 \end{bmatrix} \\ A \end{array} = \begin{array}{c} \begin{bmatrix} 1 & 0 \\\\ 4 & 1 \end{bmatrix} \\ L \end{array} \times \begin{array}{c} \begin{bmatrix} 2 & 1 \\\\ 0 & 3 \end{bmatrix} \\ U \end{array} \]

Properties of the factorized matrices:

  • \(L\) is a lower triangular matrix, while \(U\) is an upper triangular one.
  • The diagonal of \(L\) is all \(1\)s, meaning the pivots are left unchanged.
  • The diagonal entries of \(U\) are the pivots themselves.
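The \(2\times 2\) example above can be reproduced numerically; a small NumPy sketch:

```python
import numpy as np

A = np.array([[2.0, 1.0], [8.0, 7.0]])
E21 = np.array([[1.0, 0.0], [-4.0, 1.0]])  # subtract 4 * row1 from row2

U = E21 @ A                  # upper triangular, pivots 2 and 3 on the diagonal
L = np.linalg.inv(E21)       # lower triangular with 1s on the diagonal

assert np.allclose(U, [[2.0, 1.0], [0.0, 3.0]])
assert np.allclose(L, [[1.0, 0.0], [4.0, 1.0]])
assert np.allclose(L @ U, A)  # A = LU
```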

A more balanced form \(A=LDU\):

\[ \begin{array}{c} \begin{bmatrix} 2 & 1 \\\\ 8 & 7 \end{bmatrix} \\ A \end{array} = \begin{array}{c} \begin{bmatrix} 1 & 0 \\\\ 4 & 1 \end{bmatrix} \\ L \end{array} \times \begin{array}{c} \begin{bmatrix} 2 & 0 \\\\ 0 & 3 \end{bmatrix} \\ D \end{array} \times \begin{array}{c} \begin{bmatrix} 1 & \frac{1}{2} \\\\ 0 & 1 \end{bmatrix} \\ U \end{array} \]

where the pivots are normalized to \(1\)s in \(U\).
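The split of \(U\) into a diagonal pivot matrix \(D\) and a unit-diagonal factor can be checked the same way:

```python
import numpy as np

L = np.array([[1.0, 0.0], [4.0, 1.0]])
U = np.array([[2.0, 1.0], [0.0, 3.0]])

# Pull the pivots off U's diagonal into D, leaving 1s on the diagonal of U.
D = np.diag(np.diag(U))          # [[2, 0], [0, 3]]
U_unit = np.linalg.inv(D) @ U    # [[1, 1/2], [0, 1]]

assert np.allclose(L @ D @ U_unit, [[2.0, 1.0], [8.0, 7.0]])  # A = LDU
```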

Why Is Factorization Better?

Consider a \(3\times 3\) matrix \(A\), and suppose there is no row exchange needed for the elimination:

\[ E_{32}E_{31}E_{21}A = U \]

The inverses of the three elementary operations are applied in reverse order, and can be combined into a single matrix \(L\):

\[ \begin{aligned} A &= E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}U \\\\ &= LU \end{aligned} \]

Why does \(L\) have a simpler form than \(E\)?

Assume the elimination needs only two operations, \(E_{21}\) and \(E_{32}\), which subtract a multiple of \(row_{1}\) from \(row_{2}\) and a multiple of \(row_{2}\) from \(row_{3}\), respectively (\(E_{31}\) is the identity):

\[ \begin{array}{c} \begin{bmatrix} 1 & 0 & 0 \\\\ 0 & 1 & 0 \\\\ 0 & -5 & 1 \end{bmatrix} \\ E_{32} \end{array} \begin{array}{c} \begin{bmatrix} 1 & 0 & 0 \\\\ -2 & 1 & 0 \\\\ 0 & 0 & 1 \end{bmatrix} \\ E_{21} \end{array} = \begin{array}{c} \begin{bmatrix} 1 & 0 & 0 \\\\ -2 & 1 & 0 \\\\ 10 & -5 & 1 \end{bmatrix} \\ E \end{array} \]

Here \(E\) produces a new entry \(10\), because \(row_1\) affects \(row_3\) indirectly, through its earlier effect on \(row_2\).

While for \(L\):

\[ \begin{array}{c} \begin{bmatrix} 1 & 0 & 0 \\\\ 2 & 1 & 0 \\\\ 0 & 0 & 1 \end{bmatrix} \\ E_{21}^{-1} \end{array} \begin{array}{c} \begin{bmatrix} 1 & 0 & 0 \\\\ 0 & 1 & 0 \\\\ 0 & 5 & 1 \end{bmatrix} \\ E_{32}^{-1} \end{array} = \begin{array}{c} \begin{bmatrix} 1 & 0 & 0 \\\\ 2 & 1 & 0 \\\\ 0 & 5 & 1 \end{bmatrix} \\ L \end{array} \]

The entries of \(L\) are just the multipliers of the elimination steps. Because the inverse operations act on the result matrix \(U\), whose rows are already the affected versions, no chain effects need to be represented explicitly in the entries of \(L\).

Understanding: In the elimination matrix \(E\), the row operations create a chain effect among the rows (\(row_1 \rightarrow row_2 \rightarrow row_3\)). Since \(row_2\) in \(U\) already reflects the modification from \(row_1\), we can add it directly to \(row_3\) when performing the inverse operation. This cancels the subtraction introduced by \(E_{32}\) without introducing any additional multipliers of \(row_1\) into \(row_3\).

The only direct influence of \(row_1\) on \(row_3\) occurs in the first pivot column, and it is represented simply by the negated multiplier from \(E_{31}\).
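A quick NumPy check of the two products above, showing the chain entry \(10\) in \(E\) but only the bare multipliers in \(L\):

```python
import numpy as np

E21 = np.array([[1.0, 0.0, 0.0], [-2.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
E32 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -5.0, 1.0]])

E = E32 @ E21                                   # forward product
L = np.linalg.inv(E21) @ np.linalg.inv(E32)     # inverses in reverse order

assert E[2, 0] == 10   # chain effect: (-5) * (-2), row1 -> row2 -> row3
assert L[2, 0] == 0    # L records only the multipliers 2 and 5
assert np.allclose(L, [[1, 0, 0], [2, 1, 0], [0, 5, 1]])
```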

In summary: if there are no row exchanges, the multipliers go directly into \(L\).

  • \(L\) contains all the multipliers.
  • \(U\) contains the finished rows with the pivots.
  • Together, \(L\) and \(U\) carry all the information of \(A\).
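This observation can be turned into code. A minimal sketch of LU without row exchanges (the helper name `lu_no_pivot` is my own, not a standard API): each multiplier is stored directly into \(L\) at the moment it is used.

```python
import numpy as np

def lu_no_pivot(A):
    """LU factorization assuming no row exchanges are needed."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = U[i, k] / U[k, k]   # multiplier for eliminating U[i, k]
            L[i, k] = m             # goes straight into L, no chain effects
            U[i, :] -= m * U[k, :]
    return L, U

A = np.array([[2.0, 1.0], [8.0, 7.0]])
L, U = lu_no_pivot(A)
assert np.allclose(L, [[1, 0], [4, 1]])
assert np.allclose(U, [[2, 1], [0, 3]])
assert np.allclose(L @ U, A)
```

Note that real library routines (e.g. `scipy.linalg.lu`) use partial pivoting, so they return a permutation factor as well; this sketch only covers the no-exchange case discussed here.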

Cost of Elimination

One operation = one multiplication + one subtraction

How many operations need to be made on a \(n\times n\) matrix \(A\) during elimination?

  • First stage (first pivot): every entry below the first pivot must become \(0\); each of the \(n-1\) rows below is updated across all \(n\) of its entries, which is about \(n^2\) operations (exactly \(n\times (n-1)\)).
  • Second stage: about \((n-1)^2\) (actually \((n-1)\times (n-2)\))
  • ...
  • To sum up: about \(n^2 + (n-1)^2 + \cdots + 1^2 \approx \frac{1}{3}n^{3}\) operations in total.
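The \(\frac{1}{3}n^3\) approximation can be sanity-checked by comparing it with the exact stage-by-stage count:

```python
def elimination_ops(n):
    """Exact count: the stage with k rows remaining costs k * (k - 1) operations."""
    return sum(k * (k - 1) for k in range(2, n + 1))

n = 100
exact = elimination_ops(n)
approx = n ** 3 / 3
# The ratio tends to 1 as n grows; at n = 100 it is already within 1%.
assert abs(exact / approx - 1) < 0.01
```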