[Linear Algebra] 11. Cramer's rule, explained geometrically

2022. 3. 12. 13:19

In this section, let's look at Cramer's rule geometrically.

 

Cramer's rule is not the best way to compute solutions of systems of linear equations; Gaussian elimination is generally faster, especially as the system grows. But understanding Cramer's rule geometrically helps consolidate the relationship between determinants and systems of linear equations.

Example

In this setup, we consider a system of linear equations with (1) two equations in two unknowns, $x$ and $y$, and (2) a non-zero determinant. These equations can be interpreted geometrically as a known matrix $\begin{bmatrix} v_1 & v_2 \\ w_1 & w_2 \end{bmatrix}$ transforming an unknown vector $\begin{bmatrix} x \\ y \end{bmatrix}$ into a known vector $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$:

$$
\begin{matrix}
v_1x + v_2y = u_1 \\
w_1x + w_2y = u_2 \\
\end{matrix}
\longrightarrow
\begin{bmatrix} v_1 & v_2 \\ w_1 & w_2 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
$$

 

Think of the $x$-coordinate of the unknown vector as the dot product of $\begin{bmatrix} x \\ y \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, and of the $y$-coordinate as the dot product of $\begin{bmatrix} x \\ y \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.

 

It would be fantastic if the dot product of $T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)$ and $T\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right)$ were also the $x$-coordinate of the unknown vector, because we know both of them: $T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)$ is the output vector and $T\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right)$ is the first column of the matrix. Unfortunately, for most linear transformations, dot products before and after the transformation are very different.
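As a quick sanity check of that last claim, here is a minimal Python sketch. The matrix and the vector are hypothetical values chosen only for illustration, and `apply` and `dot` are small helper names introduced here, not part of any library.

```python
def apply(matrix, vector):
    """Apply a 2x2 matrix (given as a tuple of two rows) to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


def dot(p, q):
    """Dot product of two 2D vectors."""
    return p[0] * q[0] + p[1] * q[1]


# Hypothetical example: a matrix that is not orthonormal.
matrix = ((2, 1),
          (1, 3))
unknown = (3, 2)   # pretend we do not know this vector
e1 = (1, 0)        # the basis vector that picks out the x-coordinate

before = dot(unknown, e1)                               # 3, the x-coordinate
after = dot(apply(matrix, unknown), apply(matrix, e1))  # (8, 9) . (2, 1) = 25

print(before, after)  # 3 25
```

The dot product with the basis vector gives $x = 3$ before the transformation, but $25$ afterwards, so it cannot be read off from the outputs alone.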

Orthonormal transformation

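Transformations that preserve dot products are special enough to have their own name: orthonormal transformations (more commonly called orthogonal transformations). These are the ones that leave all the basis vectors perpendicular to each other with unit length. Solving a linear system with an orthonormal matrix is very easy: since $T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)\cdot T\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right)$ is the same as $\begin{bmatrix} x \\ y \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, the $x$-coordinate is just the dot product of the output vector with the first column of the matrix. Likewise, the $y$-coordinate is the dot product of the output vector with the second column.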
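The orthonormal case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 90-degree rotation and the input vector are hypothetical example values, and `solve_orthonormal` is a name made up for this sketch. It gives correct answers only when the columns of the matrix really are orthonormal.

```python
def apply(matrix, vector):
    """Apply a 2x2 matrix (given as a tuple of two rows) to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


def dot(p, q):
    """Dot product of two 2D vectors."""
    return p[0] * q[0] + p[1] * q[1]


def solve_orthonormal(matrix, output):
    """Recover (x, y) from the output vector, assuming orthonormal columns."""
    first_column = (matrix[0][0], matrix[1][0])    # where i-hat lands
    second_column = (matrix[0][1], matrix[1][1])   # where j-hat lands
    return (dot(output, first_column), dot(output, second_column))


# Hypothetical example: rotation by 90 degrees counterclockwise.
rotation = ((0, -1),
            (1, 0))
output = apply(rotation, (3, 2))            # (-2, 3)

print(solve_orthonormal(rotation, output))  # (3, 2)
```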

Non-orthonormal transformation

To handle non-orthonormal transformations, view each coordinate as the area of a parallelogram rather than as a scalar multiple of a basis vector. Specifically, the $y$-coordinate is the signed area of the parallelogram spanned by $\mathbf{\hat{i}}$ and $\begin{bmatrix} x \\ y \end{bmatrix}$: its base along $\mathbf{\hat{i}}$ has length 1 and its height is $y$. It's a wacky way to view coordinates. The area has to be signed, so the orientation of the parallelogram matters: the area is negative when $y$ is negative. Symmetrically, the $x$-coordinate is the signed area of the parallelogram spanned by $\begin{bmatrix} x \\ y \end{bmatrix}$ and $\mathbf{\hat{j}}$.
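Written as $2 \times 2$ determinants, with the column order fixing the sign, both areas can be checked directly:

$$
\det\left(\begin{bmatrix} 1 & x \\ 0 & y \end{bmatrix}\right) = 1 \cdot y - x \cdot 0 = y,
\qquad
\det\left(\begin{bmatrix} x & 0 \\ y & 1 \end{bmatrix}\right) = x \cdot 1 - 0 \cdot y = x
$$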

 

Why view coordinates as areas and volumes? When a matrix transformation is applied, the areas of the parallelograms don't stay the same; they may get scaled up or down. But all areas get scaled by the same factor, which is the key idea behind the determinant.
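A small Python sketch can confirm the scaling claim numerically. The matrix and vectors are hypothetical example values, and `signed_area` is a helper name introduced only for this sketch.

```python
def apply(matrix, vector):
    """Apply a 2x2 matrix (given as a tuple of two rows) to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


def signed_area(p, q):
    """Signed area of the parallelogram spanned by p and q (in that order)."""
    return p[0] * q[1] - q[0] * p[1]


# Hypothetical example matrix; its determinant is 2*3 - 1*1 = 5.
matrix = ((2, 1),
          (1, 3))
det = matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]

i_hat, j_hat, vec = (1, 0), (0, 1), (3, 2)

# Parallelogram spanned by i-hat and vec: area is the y-coordinate, 2.
area_y = signed_area(i_hat, vec)
scaled_y = signed_area(apply(matrix, i_hat), apply(matrix, vec))

# Parallelogram spanned by vec and j-hat: area is the x-coordinate, 3.
area_x = signed_area(vec, j_hat)
scaled_x = signed_area(apply(matrix, vec), apply(matrix, j_hat))

print(det, area_y, scaled_y, area_x, scaled_x)  # 5 2 10 3 15
```

Both areas are multiplied by the same factor, 5, which is the determinant of the example matrix.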

 

That is why the signed area of the parallelogram spanned by $T\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right)$ and $T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right)$ is the determinant of $T$ multiplied by the $y$-coordinate. So the $y$-coordinate is this area divided by the determinant of $T$. The area itself is the determinant of the $2 \times 2$ matrix whose first column is where $\mathbf{\hat{i}}$ lands and whose second column is the output vector. Writing $A$ for the matrix of $T$, this gives:

$$
y = \frac{\mathbf{Area}}{\det(A)}
= \frac{\det\left(\begin{bmatrix} v_1 & u_1 \\ w_1 & u_2\end{bmatrix}\right)}{\det\left(\begin{bmatrix} v_1 & v_2 \\ w_1 & w_2 \end{bmatrix}\right)}
$$

 

Using only the output of the transformation, namely the columns of the matrix and the coordinates of the output vector, we can recover the $y$-coordinate of our mystery input vector. The same idea gives the $x$-coordinate.
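For the $x$-coordinate, the relevant parallelogram is spanned by the output vector and the vector where $\mathbf{\hat{j}}$ lands, so the output vector replaces the first column of the matrix instead of the second:

$$
x = \frac{\det\left(\begin{bmatrix} u_1 & v_2 \\ u_2 & w_2 \end{bmatrix}\right)}{\det\left(\begin{bmatrix} v_1 & v_2 \\ w_1 & w_2 \end{bmatrix}\right)}
$$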

 

This formula for finding the solutions to a system of linear equations is known as Cramer's rule.
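To tie everything together, here is a minimal Python sketch of Cramer's rule for the $2 \times 2$ system above. The function names `det2` and `cramer_2x2` and the sample numbers are assumptions made for illustration; the parameter names follow the symbols $v_1, v_2, w_1, w_2, u_1, u_2$ used in this post.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c


def cramer_2x2(v1, v2, w1, w2, u1, u2):
    """Solve v1*x + v2*y = u1 and w1*x + w2*y = u2 using Cramer's rule."""
    det_a = det2(v1, v2, w1, w2)
    if det_a == 0:
        # The rule only applies when the determinant is non-zero.
        raise ValueError("determinant is zero; Cramer's rule does not apply")
    x = det2(u1, v2, u2, w2) / det_a   # output vector replaces the first column
    y = det2(v1, u1, w1, u2) / det_a   # output vector replaces the second column
    return x, y


# Hypothetical example: 2x + y = 8 and x + 3y = 9.
print(cramer_2x2(2, 1, 1, 3, 8, 9))  # (3.0, 2.0)
```

The sketch is meant to mirror the geometric argument, not to be an efficient solver; for larger systems, Gaussian elimination remains the practical choice.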