Linear Algebra Note#3: Multiplication & Inverse
This note is based on MIT 18.06 📒
Content
- #1: Row, Column & Matrix
- #2: Elimination
- #3: Multiplication & Inverse 👈
Matrix Multiplication
\[ \begin{gathered} \begin{bmatrix} a_{11} & a_{12} \\\\ a_{21} & a_{22} \end{bmatrix} \times \begin{bmatrix} b_{11} & b_{12} \\\\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} \\\\ c_{21} & c_{22} \end{bmatrix} \end{gathered} \]
Element Perspective
Each element of matrix \(C\) is the dot product of a row of \(A\) and a column of \(B\).
\[ c_{21} = a_{21} \times b_{11} + a_{22} \times b_{21} \]
Note: two matrices can be multiplied only when their shapes are \(m\times n\) and \(n\times p\): the rows of \(A\) must have the same length as the columns of \(B\) so the dot products are defined.
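A minimal NumPy sketch of the element perspective, using small example matrices (not from the note): each entry of \(C\) is one row-times-column dot product.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

C = A @ B

# c_21 (row 2, column 1, 1-indexed) computed by hand:
# a21*b11 + a22*b21
c_21 = A[1, 0] * B[0, 0] + A[1, 1] * B[1, 0]
print(c_21, C[1, 0])  # both are 43
```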
Linear Combination Perspective
Each column in \(C\) is a linear combination of the columns in \(A\), where the coefficients are elements of the column in \(B\) with the same column index.
\[ \begin{gathered} b_{11} \times \begin{bmatrix} a_{11} \\\\ a_{21} \end{bmatrix} + b_{21} \times \begin{bmatrix} a_{12} \\\\ a_{22} \end{bmatrix} = \begin{bmatrix} c_{11} \\\\ c_{21} \end{bmatrix} \end{gathered} \]
Similarly, each row in \(C\) is a linear combination of the rows in \(B\), where the coefficients are elements of the row in \(A\) with the same row index.
\[ \begin{gathered} a_{11} \times \begin{bmatrix} b_{11} & b_{12} \end{bmatrix} + a_{12} \times \begin{bmatrix} b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} \end{bmatrix} \end{gathered} \]
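The two combinations above can be checked numerically. A small sketch with example 2×2 matrices (my own, not from the note): the first column of \(C\) is a combination of \(A\)'s columns, and the first row of \(C\) is a combination of \(B\)'s rows.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
C = A @ B

# First column of C: b11 * (col 1 of A) + b21 * (col 2 of A)
col1 = B[0, 0] * A[:, 0] + B[1, 0] * A[:, 1]

# First row of C: a11 * (row 1 of B) + a12 * (row 2 of B)
row1 = A[0, 0] * B[0, :] + A[0, 1] * B[1, :]

print(col1, C[:, 0])  # [19 43] both times
print(row1, C[0, :])  # [19 22] both times
```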
Column \(\times\) Row Perspective
\(C\) can be seen as a sum of columns of \(A\) multiplied by rows of \(B\). The \(i^{th}\) column of \(A\) times the \(i^{th}\) row of \(B\) spreads that column into a matrix the same shape as \(C\), with the elements of the \(B\) row as multipliers.
\[ \begin{gathered} \begin{bmatrix} a_{11} \\\\ a_{21} \end{bmatrix} \times \begin{bmatrix} b_{11} & b_{12} \end{bmatrix} = \begin{bmatrix} c_{11}^{11} & c_{12}^{11} \\\\ c_{21}^{11} & c_{22}^{11} \end{bmatrix} \end{gathered} \]
Note: the superscript "\(11\)" means the result of the \(1^{st}\) column of \(A\) and the \(1^{st}\) row of \(B\). The resulting matrix here is not the final \(C\); summing over all column-row pairs gives \(C\).
Similarly, it can also be seen as spreading the \(B\) row into a matrix with multipliers of different elements in the \(A\) column.
Understanding: this still matches the previous perspective. Summing the column-times-row matrices adds up multiples of each \(A\) column (or \(B\) row), which is equivalent to taking linear combinations of the columns of \(A\) (or the rows of \(B\)).
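A quick numerical sketch of this perspective, using example matrices of my own: \(A \times B\) equals the sum of outer products (column \(k\) of \(A\)) \(\times\) (row \(k\) of \(B\)).

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Sum of outer products: (col k of A) times (row k of B)
total = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

print(np.array_equal(total, A @ B))  # True
```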
Matrix Block Multiplication
\[ \begin{gathered} \begin{bmatrix} A_{1} & A_{2} \\\\ A_{3} & A_{4} \end{bmatrix} \times \begin{bmatrix} B_{1} & B_{2} \\\\ B_{3} & B_{4} \end{bmatrix} = \begin{bmatrix} C_{1} & C_{2} \\\\ C_{3} & C_{4} \end{bmatrix} \end{gathered} \]
Note: \(A_1\) etc. are smaller matrices themselves, representing blocks of the original matrix.
\[ C_1 = A_1 \times B_1 + A_2 \times B_3 \]
Understanding of block multiplications:
- \(C_1\) is determined by the first block row of \(A\) and the first block column of \(B\), i.e. \(A_1\), \(A_2\) and \(B_1\), \(B_3\)
- moreover, \(A_1\) pairs only with \(B_1\), and \(A_2\) pairs only with \(B_3\)
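A sketch of block multiplication, assuming 4×4 matrices split into 2×2 blocks (random example data, not from the note):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 5, (4, 4))
B = rng.integers(0, 5, (4, 4))

# Split each matrix into four 2x2 blocks
A1, A2 = A[:2, :2], A[:2, 2:]
A3, A4 = A[2:, :2], A[2:, 2:]
B1, B2 = B[:2, :2], B[:2, 2:]
B3, B4 = B[2:, :2], B[2:, 2:]

# The block formula for the top-left block of A @ B
C1 = A1 @ B1 + A2 @ B3

print(np.array_equal(C1, (A @ B)[:2, :2]))  # True
```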
Matrix Inverse
The matrices discussed here are all square.
The inverse of a matrix is directional:
- Left inverse: \(A^{-1} \times A = I\)
- Right inverse: \(A \times A^{-1} = I\)
For square matrices, the left inverse is the same as the right inverse.
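A quick check of both directions, using the invertible matrix from the Gauss-Jordan example below:

```python
import numpy as np

A = np.array([[1., 3.],
              [2., 7.]])
A_inv = np.linalg.inv(A)
I = np.eye(2)

# For a square invertible matrix, both products give the identity
print(np.allclose(A_inv @ A, I))  # True
print(np.allclose(A @ A_inv, I))  # True
```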
Singular Case
A matrix is singular when it has no inverse.
\[ \begin{gathered} \begin{bmatrix} 1 & 3 \\\\ 2 & 6 \end{bmatrix} \nRightarrow \begin{bmatrix} 1 & 0 \\\\ 0 & 1 \end{bmatrix} \end{gathered} \]
For the case above, no linear combination of the columns (or rows) can ever form the identity matrix \(I\), which means there is no inverse matrix.
Understanding: each \(1\) in \(I\) requires a contribution in a new, independent direction, but these two columns point in the same direction; geometrically, producing \(n\) one-hot vectors requires the columns of \(A\) to span the whole \(n\)-dimensional space.
The general rule: if \(Ax = 0\) has a solution \(x\) that is not the zero vector, then \(A\) is not invertible.
- If \(A^{-1}\) existed, then \(x = A^{-1} A x = A^{-1} 0 = 0\), contradicting that \(x\) is nonzero.
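A sketch of the rule applied to the singular matrix above: a nonzero \(x\) is sent to the zero vector, so no inverse can exist.

```python
import numpy as np

A = np.array([[1, 3],
              [2, 6]])

# The columns are dependent: 3 * (col 1) - 1 * (col 2) = 0,
# so x = [3, -1] is a nonzero vector with A @ x = 0
x = np.array([3, -1])

print(A @ x)  # [0 0]
# The determinant is 0, so np.linalg.inv(A) would raise LinAlgError
print(abs(np.linalg.det(A)) < 1e-9)  # True
```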
Gauss-Jordan
One of the ways to compute the inverse of a matrix.
\[ \begin{gathered} \underset{\begin{array}{c}\\ A \end{array}}{ \begin{bmatrix} 1 & 3 \\\\ 2 & 7 \end{bmatrix}} \times \underset{\begin{array}{c}\\ A^{-1} \end{array}}{ \begin{bmatrix} a & c \\\\ b & d \end{bmatrix}} = \underset{\begin{array}{c}\\ I \end{array}}{ \begin{bmatrix} 1 & 0 \\\\ 0 & 1 \end{bmatrix}} \end{gathered} \]
\[ \begin{gathered} \begin{bmatrix} 1 & 3 \\\\ 2 & 7 \end{bmatrix} \times \begin{bmatrix} a \\\\ b \end{bmatrix} = \begin{bmatrix} 1 \\\\ 0 \end{bmatrix} ,\ \ \ \begin{bmatrix} 1 & 3 \\\\ 2 & 7 \end{bmatrix} \times \begin{bmatrix} c \\\\ d \end{bmatrix} = \begin{bmatrix} 0 \\\\ 1 \end{bmatrix} \end{gathered} \]
If the two systems of equations corresponding to the columns of \(I\) are both solvable, then \(A\) is invertible.
How to actually get \(A^{-1}\)
- Augmenting \(A\) with an Identity Matrix \(I\):
\[ \left[\begin{array}{cc|cc} 1 & 3 & 1 & 0\\\\ 2 & 7 & 0 & 1 \end{array}\right] \]
- Operating on both \(A\) and the augmented \(I\) together, aiming to turn \(A\) into an Identity Matrix:
\[ \Rightarrow \left[\begin{array}{cc|cc} 1 & 3 & 1 & 0\\\\ 0 & 1 & -2 & 1 \end{array}\right] \]
- The augmented part at the end is \(A^{-1}\):
\[ \Rightarrow \left[\begin{array}{cc|cc} 1 & 0 & 7 & -3\\\\ 0 & 1 & -2 & 1 \end{array}\right] \]
When \(A\) has been turned into \(I\), the combined row operations amount to a left multiplication by \(A^{-1}\). The same operations applied to the augmented part therefore produce \(A^{-1} I = A^{-1}\).
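The elimination steps above can be sketched directly in NumPy: augment \(A\) with \(I\), apply the two row operations, and read off the inverse from the augmented half.

```python
import numpy as np

A = np.array([[1., 3.],
              [2., 7.]])

# Build the augmented matrix [A | I]
M = np.hstack([A, np.eye(2)])

M[1] -= 2 * M[0]  # R2 <- R2 - 2*R1: clears the entry below the first pivot
M[0] -= 3 * M[1]  # R1 <- R1 - 3*R2: clears the entry above the second pivot

# The left half is now I; the right half is A^{-1}
A_inv = M[:, 2:]
print(A_inv)                                  # [[ 7. -3.] [-2.  1.]]
print(np.allclose(A @ A_inv, np.eye(2)))      # True
```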