# Epsilons, no. 2: Understanding matrix multiplication

### The geometric and algebraic interpretation of matrix products

Matrix multiplication is not easy to understand.

Even looking at the definition used to make me sweat, let alone trying to comprehend the pattern. Yet, there is a stunningly simple explanation behind it.

Let's pull back the curtain!

First, some notation.

The product of *A* and *B* is given by the following formula. It is not the easiest (or most pleasant) thing to look at.
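For reference, here is the standard entrywise definition in LaTeX. The dimensions and index names are my assumptions for illustration: *A* is *n × m* with entries *a_ik*, and *B* is *m × l* with entries *b_kj*.

```latex
(AB)_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj},
\qquad A \in \mathbb{R}^{n \times m}, \quad B \in \mathbb{R}^{m \times l}.
```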

The situation is worse if we drop the simpler notation and fully write out *AB*.
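Written out in full, with the same assumed conventions (*A* is *n × m* with entries *a_ik*, *B* is *m × l* with entries *b_kj*), the product looks roughly like this:

```latex
AB =
\begin{bmatrix}
\sum_{k=1}^{m} a_{1k} b_{k1} & \cdots & \sum_{k=1}^{m} a_{1k} b_{kl} \\
\vdots & \ddots & \vdots \\
\sum_{k=1}^{m} a_{nk} b_{k1} & \cdots & \sum_{k=1}^{m} a_{nk} b_{kl}
\end{bmatrix}
```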

A quick visualization before the technical details. The element in the *i*-th row and *j*-th column of *AB* is the dot product of *A*'s *i*-th row and *B*'s *j*-th column.
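To make the row-times-column rule concrete, here is a minimal sketch in plain Python, with matrices stored as lists of rows. The helper names (`dot`, `column`, `matmul`) and the example matrices are my own choices for illustration, and indices are 0-based, unlike the 1-based indices in the text.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def column(M, j):
    """The j-th column of M (0-based), as a list."""
    return [row[j] for row in M]


def matmul(A, B):
    """Entry (i, j) of the product is the dot product of
    A's i-th row and B's j-th column."""
    return [
        [dot(A[i], column(B, j)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]

AB = matmul(A, B)
print(AB)  # [[19, 22], [43, 50]]
```

For instance, the top-right element is 1·6 + 2·8 = 22: the first row of `A` dotted with the second column of `B`.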

Let’s unwrap matrix multiplication!

We start with a special case: multiplying *A* with a (column) vector whose first component is *1*, and the rest are *0*. Let's name this special vector *e₁*.

Turns out, the product of *A* and *e₁* is the first column of *A*.

Similarly, multiplying *A* with *e₂* (that is, a column vector whose second component is *1* and the rest are *0*) yields the second column of *A*.

That's a pattern!

By the same logic, we conclude that *A* times *eₖ* equals the *k*-th column of *A*.
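You can check this pattern numerically. The sketch below uses plain Python lists; `matvec`, `unit_vector`, and `column` are hypothetical helpers written for this example, and `k` is 0-based in the code.

```python
def matvec(A, x):
    """Matrix-vector product: each component is a row of A dotted with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def unit_vector(n, k):
    """Vector of length n with a 1 in position k (0-based) and 0 elsewhere."""
    return [1 if i == k else 0 for i in range(n)]


def column(A, k):
    """The k-th column of A (0-based), as a list."""
    return [row[k] for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]

# Multiplying by each unit vector picks out the matching column.
for k in range(3):
    assert matvec(A, unit_vector(3, k)) == column(A, k)

print(matvec(A, unit_vector(3, 0)))  # [1, 4]
```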

This sounds a bit algebra-y, so let's see this idea in geometric terms.

Matrices represent linear transformations. You know, those that stretch, skew, rotate, flip, or otherwise linearly distort the space.

The images of basis vectors form the columns of the matrix. We can visualize this in two dimensions.

Thus, the columns of *A* can also be thought of as the images of *eₖ* under *A*.

Moreover, we can look at a matrix-vector product as a linear combination of the column vectors. Make a mental note of this, because it is important.
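Here is a small sketch of that claim: computing *Ax* row by row gives the same result as scaling the columns of *A* by the components of *x* and adding them up. The function names and the example numbers are assumptions made for illustration.

```python
def matvec(A, x):
    """Matrix-vector product, computed row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def linear_combination(coeffs, vectors):
    """coeffs[0] * vectors[0] + coeffs[1] * vectors[1] + ..."""
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]


A = [[1, 2],
     [3, 4]]
x = [10, -1]

# The columns of A: [1, 3] and [2, 4].
columns = [[row[k] for row in A] for k in range(len(A[0]))]

by_rows = matvec(A, x)
by_columns = linear_combination(x, columns)  # 10 * [1, 3] + (-1) * [2, 4]

print(by_rows)     # [8, 26]
print(by_columns)  # [8, 26]
```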

(If unwrapping the matrix-vector product seems too complex, I've got you. The computation below is the same as the one above, only in vectorized form.)

Now, about the matrix product formula.

From a geometric perspective, *AB* is the same as first applying *B*, then *A* to our underlying space.
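A quick sanity check of the "first *B*, then *A*" reading, as a sketch in plain Python. The two matrices are examples I picked: `B` stretches the x-axis by a factor of 2, and `A` rotates the plane by 90 degrees counterclockwise.

```python
def matvec(A, x):
    """Matrix-vector product, computed row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matmul(A, B):
    """Matrix product via the entrywise sum over k of A[i][k] * B[k][j]."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[0, -1],
     [1, 0]]   # rotation by 90 degrees counterclockwise
B = [[2, 0],
     [0, 1]]   # stretch along the x-axis by a factor of 2
x = [1, 1]

step_by_step = matvec(A, matvec(B, x))  # apply B first, then A
all_at_once = matvec(matmul(A, B), x)   # apply the single matrix AB

print(step_by_step)  # [-1, 2]
print(all_at_once)   # [-1, 2]
```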

Now, let’s study *ABe₁*, that is, the first column of *AB*.

Recall that matrix-vector products are linear combinations of column vectors. Thus, the first column of *AB* is a linear combination of *A*'s columns, with coefficients taken from the first column of *B*.

We can collapse the linear combination into a single vector, resulting in a formula for the first column of *AB*. This is straight from the mysterious definition of the matrix product.
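In symbols, the chain of steps looks like this. The notation is my assumption: **a**ₖ denotes the *k*-th column of *A*, **b**₁ the first column of *B*, and *A* has *m* columns.

```latex
AB e_1 = A (B e_1) = A \mathbf{b}_1 = \sum_{k=1}^{m} b_{k1} \, \mathbf{a}_k
\quad \Longrightarrow \quad
(AB)_{i1} = \sum_{k=1}^{m} a_{ik} b_{k1}.
```

The right-hand side is the *i*-th component of the collapsed vector: each column **a**ₖ contributes its *i*-th entry *a_ik*, weighted by *b_k1*.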

The same logic applies to every other column: the *j*-th column of *AB* is *A* times the *j*-th column of *B*. This gives an explicit formula for the elements of a matrix product. And we are done!
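To see that the column-by-column view and the entrywise formula agree, here is a sketch that builds the product both ways. Both function names and the example matrices are hypothetical, chosen for this illustration.

```python
def column(M, j):
    """The j-th column of M (0-based), as a list."""
    return [row[j] for row in M]


def linear_combination(coeffs, vectors):
    """coeffs[0] * vectors[0] + coeffs[1] * vectors[1] + ..."""
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]


def matmul_by_columns(A, B):
    """Column j of AB is a linear combination of A's columns,
    with coefficients taken from column j of B."""
    a_cols = [column(A, k) for k in range(len(A[0]))]
    ab_cols = [linear_combination(column(B, j), a_cols) for j in range(len(B[0]))]
    # Turn the list of columns back into a list of rows.
    return [list(row) for row in zip(*ab_cols)]


def matmul_by_formula(A, B):
    """Entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[1, 2],
     [3, 4],
     [5, 6]]        # 3 x 2
B = [[1, 0, 2],
     [-1, 3, 1]]    # 2 x 3

print(matmul_by_columns(A, B))  # [[-1, 6, 4], [-1, 12, 10], [-1, 18, 16]]
print(matmul_by_columns(A, B) == matmul_by_formula(A, B))  # True
```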

Linear algebra is powerful exactly because it abstracts away the complexity of manipulating data structures like vectors and matrices.

Instead of explicitly dealing with arrays and convoluted sums, we can use simple expressions like *AB*.

That's a huge deal.

Without a doubt, linear algebra is one of the most important mathematical tools for a machine learning practitioner.

If you want to master it like a pro, check out my upcoming Mathematics of Machine Learning book. I release the chapters as I finish them, and all the linear algebra ones are done and published. (The linear algebra chapters are also available as a stand-alone book.)

Understanding mathematics is a superpower, and I want to help you unlock it.