The unreasonable effectiveness of orthogonal systems
Decomposing complex objects into a sum of simple parts
Last week, we talked about the all-important concept of orthogonality. At least, on the surface. Deep down, the previous post was about an extremely important idea: finding the right viewpoint is half the success in mathematics.
We illustrated this through the concept of vector spaces, inner products, and orthogonality. Through clever representations, we can stretch mathematical definitions to objects that seemingly lie far from the original domain. Functions as vectors? Check. Riemann integral as an inner product? Check. Enclosed angle by functions? Once more, check.
This time, we’ll continue on this thread. Today, the core idea in the spotlight: decomposing complex objects into simple parts. Believe it or not, this is behind the stunning success of modern science.
(If the title of the post is familiar, it is not an accident. It is a reference to the classic paper “The Unreasonable Effectiveness of Mathematics in the Natural Sciences” by Eugene Wigner.)
Inner products from a geometric perspective
Let’s go back to square one. Last time, we provided all kinds of generalizations for the inner product. Now we revisit the simplest possible example: the Euclidean plane.
There, the inner product is defined by the famous “magnitude × magnitude × cosine of the enclosed angle” formula.
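In symbols, writing α for the angle enclosed by x and y (the notation is an assumption made here for illustration), the formula reads:

```latex
\langle x, y \rangle = \|x\| \, \|y\| \cos \alpha
```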
What does this formula mean?
If we recall the geometric definition of the cosine, the answer is revealed to us. Here is an illustration to help.
From this, a little algebra shows that the inner product ⟨x, y⟩ equals the (signed) length of the projection of x onto y, multiplied by the length of y.
If y is a unit vector, we can simplify. In this case, the projection of x onto y is given by a much simpler formula: ⟨x, y⟩ y.
This allows us to decompose x into two vectors: one is orthogonal to y, while the other is parallel to it.
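To make the decomposition concrete, here is a minimal NumPy sketch. The function name `decompose` and the example vectors are made up for illustration; the only idea used is the projection described above, with y normalized first so that it does not need to be a unit vector.

```python
import numpy as np

def decompose(x, y):
    """Split x into a part parallel to y and a part orthogonal to y."""
    u = y / np.linalg.norm(y)      # unit vector pointing in the direction of y
    parallel = np.dot(x, u) * u    # projection of x onto y
    orthogonal = x - parallel      # the remainder is orthogonal to y
    return parallel, orthogonal

x = np.array([3.0, 4.0])
y = np.array([2.0, 0.0])
parallel, orthogonal = decompose(x, y)
print(parallel, orthogonal)        # [3. 0.] [0. 4.]
```

The two parts add up to x again, and the inner product of the orthogonal part with y is zero.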
(If you are interested in more on this idea, here is a deep-dive Twitter thread about the topic I published a few weeks ago.)
Awesome. But why is this good for us?
A physical application
Let’s consider one of the simplest mechanical systems possible: a rectangular object, sliding on a slope.
How would you calculate the velocity and acceleration of such an object? When will it reach the bottom?
We can find all the answers by decomposing the gravitational force into components parallel and orthogonal to the slope itself.
With a simple trigonometric calculation, we can explicitly find F₁ and F₂. By understanding these forces, it’s easy to quantitatively analyze this system.
Are there any more serious applications? You bet. This is just the tip of the iceberg. We’ll take vector decomposition to the extremes. This idea is behind the technologies that have propelled our civilization light-years forward.
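As a rough sketch of that calculation, the snippet below projects gravity onto the slope direction. Everything specific here is an assumption for illustration: the names `gravity_components` and `time_to_bottom`, the 30° angle, the 2 kg mass, a frictionless slope, and an object starting from rest.

```python
import numpy as np

def gravity_components(mass, alpha, g=9.81):
    """Split gravity into parts parallel and orthogonal to a slope of angle alpha."""
    gravity = np.array([0.0, -mass * g])                   # points straight down
    downhill = np.array([np.cos(alpha), -np.sin(alpha)])   # unit vector along the slope
    f_parallel = np.dot(gravity, downhill) * downhill      # magnitude: m g sin(alpha)
    f_orthogonal = gravity - f_parallel                    # magnitude: m g cos(alpha)
    return f_parallel, f_orthogonal

def time_to_bottom(length, alpha, g=9.81):
    """Time to slide a distance `length` from rest, assuming no friction."""
    acceleration = g * np.sin(alpha)        # only the parallel part accelerates the object
    return np.sqrt(2.0 * length / acceleration)

alpha = np.radians(30.0)
f_par, f_orth = gravity_components(mass=2.0, alpha=alpha)
print(np.linalg.norm(f_par), np.linalg.norm(f_orth))
print(time_to_bottom(length=5.0, alpha=alpha))
```

Only the parallel component moves the object along the slope; the orthogonal one is balanced by the surface.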
Instead of decomposing vectors in terms of other ones, let’s turn this question upside down. Can we find a set of orthogonal vectors that can be used to express any vector?
It turns out that the answer is yes, even in the most general cases. Simple examples first, surprising ones later. The most straightforward one is in the n-dimensional Euclidean space. There, the (positive) unit vectors along every dimension provide such an example.
Why is this a big deal? First, let’s capture the essential properties of such a vector system, and turn that into a definition. (This is what we did last week when we extended orthogonality from arrows to functions.)
There are exactly three of them:
the vector set must be a basis,
its elements should be orthogonal to each other,
and its elements should have unit norms.
Such a system is called an orthonormal basis.
There are (uncountably) infinitely many variations, some of which are more useful than others. The system is often selected to fit the problem. For instance, recall the slope: there, the ideal orthonormal system consists of two vectors: one is parallel to the slope, while the other is perpendicular to it.
Why do we love orthonormal bases?
Suppose that somebody hands an arbitrary vector to us. How do we express it in terms of our orthonormal basis?
We apply the ever-so-powerful principle of wishful thinking! Suppose that we know exactly how our vector is written.
We can explicitly compute the coefficients. How? By selecting any element of the orthonormal basis and calculating the inner product!
Thus, any vector can be expressed after performing a few cheap calculations.
This convenience is special to orthonormal bases! For a general basis, finding the coefficients means solving a system of linear equations.
Another way to interpret this is that the k-th coefficient only contains information about the k-th basis element and the vector itself. In other words, the basis vectors do not contain redundant information.
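Here is a small sketch of this in NumPy. The orthonormal basis is an arbitrary one, obtained from the QR decomposition of a random matrix, and the variable names are ours. The reason the trick works: taking the inner product of the expansion with the k-th basis vector kills every term except the k-th, because all other inner products between basis vectors are zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# The columns of Q form an orthonormal basis of R^4.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
x = rng.normal(size=4)

# k-th coefficient = inner product of x with the k-th basis vector.
coeffs = Q.T @ x
reconstructed = Q @ coeffs
print(np.allclose(reconstructed, x))   # True

# For contrast: with a generic (non-orthogonal) basis stored in the columns of B,
# the coefficients come from solving the linear system B c = x.
B = rng.normal(size=(4, 4))
coeffs_general = np.linalg.solve(B, x)
```

In the orthonormal case, each coefficient costs a single inner product; in the general case, all coefficients are coupled through the linear system.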
It’s time to see some examples. I’ll talk about two. Both are fundamental discoveries: Principal Component Analysis and the Fourier series.
Principal Component Analysis
Let’s start with the famous Principal Component Analysis. The gist is this: given a data matrix X, we can find a principal component transformation W such that the transformed data T = XW is optimal in the sense of variance. That is, we rewrite the data in an orthonormal basis where the basis vectors maximize the variance when the data is projected onto them.
It’s easier to draw a picture. Suppose that we have this simple dataset below, straight from Wikipedia.
The two vectors superimposed on the image are the basis given by PCA. These are well-suited to represent the dataset, as they represent features that maximize the variance and eliminate redundancy between each other.
Hence, PCA is often used to reduce dimensionality. After the transformation, the less expressive features can be omitted.
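A minimal sketch of PCA with plain NumPy follows, on a synthetic two-dimensional dataset invented for this example. It assumes the data is centered before the transformation and uses the eigenvectors of the covariance matrix as the columns of W; this is one standard way to compute it, not necessarily how a given library does it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, correlated 2D data (made up for illustration).
X = rng.multivariate_normal(mean=[1.0, 3.0], cov=[[3.0, 2.0], [2.0, 2.0]], size=500)

Xc = X - X.mean(axis=0)                  # center the data
cov = np.cov(Xc, rowvar=False)           # 2 x 2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order

order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
eigvals, W = eigvals[order], eigvecs[:, order]

T = Xc @ W                               # data expressed in the PCA basis
T_reduced = T[:, :1]                     # keep only the most expressive feature

print(eigvals)                           # variance along each principal direction
```

The columns of W are orthonormal, and the covariance matrix of T is diagonal, which is the precise sense in which the new features carry no redundancy.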
Trigonometric functions as an orthonormal system
Now comes the juicy part. Recall that last week, we talked about how the famous trigonometric functions sine and cosine are orthogonal, at least in the function space we called L².
Here comes the surprise: the (suitably normalized) trigonometric functions form an orthonormal basis for the L² space! That is, all L² functions can be expressed in terms of sines and cosines of various frequencies.
The resulting expression is called the Fourier series.
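One common way to write it, assuming functions on the interval [-π, π] (the interval and the normalization are a convention picked here for illustration):

```latex
f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \bigl( a_k \cos(kx) + b_k \sin(kx) \bigr)
```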
(This equality symbol is “over-loaded”. You don’t need to be concerned with such details, but equality does not necessarily hold point-wise; it holds in the L²-sense. For sufficiently smooth functions, these concepts overlap.)
As we have seen earlier, the coefficients can be explicitly calculated.
Why is such a representation useful for us? Think about how functions are stored inside a computer. Analog signals, such as sound waves, are continuous in nature, and we don’t have an explicit formula to calculate their value.
This problem is solved by the Fourier series. Calculate the Fourier coefficients, and store those to represent the function. For sound waves, each coefficient represents the contribution of a given frequency.
If you are puzzled about why the coefficients are calculated this way, think back to what we have seen earlier. When expressing an arbitrary vector (f(x) in our case) as the linear combination of an orthonormal basis (here, the trigonometric functions), the coefficients are the inner products.
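To see the inner products at work, here is a sketch that approximates the Fourier coefficients numerically. It assumes the common convention on [-π, π], where a_k = (1/π) ∫ f(x) cos(kx) dx and b_k = (1/π) ∫ f(x) sin(kx) dx, so the factor 1/π plays the role of the normalization. The helper `fourier_coefficients` is hypothetical, and the integrals are approximated with a simple midpoint rule.

```python
import numpy as np

def fourier_coefficients(f, n_terms, n_samples=20000):
    """Approximate a_0, a_k, b_k (k = 1..n_terms) of f on [-pi, pi]."""
    dx = 2.0 * np.pi / n_samples
    x = -np.pi + (np.arange(n_samples) + 0.5) * dx   # midpoints of the subintervals
    fx = f(x)
    # Each coefficient is an inner product of f with a basis function.
    a0 = np.sum(fx) * dx / np.pi
    a = np.array([np.sum(fx * np.cos(k * x)) * dx / np.pi
                  for k in range(1, n_terms + 1)])
    b = np.array([np.sum(fx * np.sin(k * x)) * dx / np.pi
                  for k in range(1, n_terms + 1)])
    return a0, a, b

a0, a, b = fourier_coefficients(lambda x: x, n_terms=5)
print(np.round(b, 3))
```

For f(x) = x, the exact values are a_k = 0 and b_k = 2(-1)^(k+1)/k, so the printed numbers should be close to 2, -1, 2/3, -1/2, 2/5.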
(If you are interested in more about this topic, check out the always fantastic 3Blue1Brown.)
With the Fourier representation, problems like noise filtering become much simpler. For instance, high-frequency noise can be cleaned by throwing away the coefficients after a cutoff point.
This is not just a simple filtering operation. It is also a method for data compression! Think about it: removing the high-frequency parts reduces the continuous signal to a finite-dimensional vector representation. Thus, the Fourier series is the foundation of modern telecommunication.
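A hedged sketch of such a filter is below. Note that it swaps in the discrete Fourier transform of a sampled signal (via NumPy’s FFT routines) in place of the Fourier series itself, and that the test signal, sampling rate, and cutoff are all invented for the example.

```python
import numpy as np

fs = 1000                                      # sampling rate in Hz (assumed)
n = 1000                                       # one second of samples
t = np.arange(n) / fs

clean = np.sin(2 * np.pi * 5 * t)              # slow 5 Hz component
noise = 0.5 * np.sin(2 * np.pi * 120 * t)      # fast 120 Hz "noise"
signal = clean + noise

spectrum = np.fft.rfft(signal)                 # coefficients, one per frequency
freqs = np.fft.rfftfreq(n, d=1 / fs)           # frequency (Hz) of each coefficient

cutoff_hz = 50.0
keep = freqs <= cutoff_hz
filtered_spectrum = np.where(keep, spectrum, 0.0)   # throw away everything above the cutoff
filtered = np.fft.irfft(filtered_spectrum, n=n)

kept = int(keep.sum())
print(kept, "of", spectrum.size, "coefficients kept")
print(np.max(np.abs(filtered - clean)))        # tiny: the noise is gone
```

Storing only the kept coefficients is the compression step: a small number of values stands in for the whole sampled signal.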
And it’s all thanks to orthogonal bases.
Let’s summarize what we have seen so far. Last week, we talked about how one can extend the notion of orthogonality; at least, that was the cover story. In reality, the story was about one of the fundamental ideas of mathematics: abstraction.
This time, we carried on the subject to highlight another essential principle. By combining orthogonality with the concept of linear bases, we can decompose complex objects into the sum of their parts. This is stunningly powerful. Thus, the concept of orthonormal bases is born.
From easy examples like motion on a slope to the analysis of sound waves, they are almost everywhere in mathematics, science, and technology. The discovery of the Fourier series (and eventually the general orthonormal function bases) truly changed the course of science.
Even the JPEG format is based on them. Think about this the next time you see an image. (Which should be within the next five seconds.)