Education or Experience? Both.
What the musings of an old mathematician can teach machine learning engineers
Ever since the inception of machine learning, several debates have remained unsettled. If you follow the discussions in the community, you have likely seen many takes on the following questions.
“Do you need formal education or hands-on experience?”
“Should you focus on theory or practice?”
“Specialize in one area or strive to be a generalist?”
The answers you find typically sit at one extreme of the spectrum or the other, and there are strong arguments on both sides.
On the one hand, formal education does not imply in-depth knowledge, especially not in specialized fields that are not part of a standard curriculum. Requiring a degree to work as a data scientist can mean gatekeeping, denying entry to many talented engineers.
On the other hand, hands-on experience without understanding how things work is just user-level knowledge. If you know how to drive a car, it doesn’t mean that you can fix it when the engine starts to smoke. In tech, we all strive to be pit stop mechanics in competitive environments.
The same question, formulated differently, is the age-old theory vs. practice debate. For some, machine learning equals proving theorems. For others, it is cleaning data or serving models on Kubernetes. Which side is right?
Both.
How mathematics faced the same questions
Science is an upward spiral: we ask questions that previous generations have already answered. As times change, new contexts prompt us to reevaluate the problems, although many of the old arguments still hold.
Mathematicians went through similar dilemmas in the 20th century, when applied research began to blossom.
Pure mathematics was motivated by itself, which was uncommon among scientific fields. (It still is.) Problems gave rise to new theories, which spawned new problems, and the cycle went on and on.
Évariste Galois laid the foundations of group theory to show that there is no general formula in radicals for the roots of polynomials of degree five or higher. Monoid theory, in turn, was formulated out of mathematical curiosity: what happens if we don’t assume that every element is invertible?
That doesn’t mean that monoid theory is not useful. On the contrary, it has found fruitful applications in various fields, such as computer science. Was it motivated by applications, though? No.
However, the newer generation of mathematicians often challenged this view, and thus a divide was born.
Theory or applications?
Rényi’s Ars Mathematica
The most brilliant answer was given by Alfréd Rényi, one of the founding fathers of mathematics research in Hungary. If you are into probability theory, you have encountered the concept of Rényi entropy, a generalization of the well-known Shannon entropy.
(If this is not familiar, you have probably seen this quote or one of its variations: “A mathematician is a device for turning coffee into theorems.” This is from Rényi, even though it is frequently attributed to Paul Erdős.)
If poets have an ars poetica, Rényi thought, why can’t mathematicians have an ars mathematica?
In his writing (unfortunately, only available in Hungarian), he outlined several key dilemmas that mathematicians were facing.
Study or research?
Theory or applications?
Wide or deep knowledge?
Hard work or a lucky idea?
Individual work or teamwork?
Self-criticism or self-confidence?
Mathematical precision or intuition?
Open new research areas or find unsolved problems in the existing ones?
These questions are relevant today for data scientists, engineers, and machine learning researchers. Rényi’s short answer?
“In all of the dilemmas, replace “or” with “and”. This is the Ars Mathematica.” — Alfréd Rényi
What does this mean for us, machine learning people?
That instead of debating whether theory or practice comes first, we should realize that both are important, but you don’t need both to enter the field.
There are many paths to machine learning, and numerous paths within it. Each one has its strengths and weaknesses, but they complement each other well. It is perfectly fine to lack theoretical knowledge because not all jobs require you to implement a neural network from scratch. (In fact, very few do.)
A similar argument applies to those coming from a more theoretical background: it is fine to start without much hands-on experience.
Questions like “How much mathematics do you need?” are ill-posed. Machine learning is HUGE. Which part do you have in mind? Some parts require no math at all, while others are full of it. However, depending on your career path, a time will probably come when you need to know mathematics. When it does, a degree in computer science or mathematics is useful, but you can manage without one if you are willing to put in the time and learn like a student.
The solution: you need theory AND practice, but starting from one direction is perfectly fine.