Making Neural Networks Cheaper, Smaller, and Faster
Building smaller systems, one weight at a time
Hey!
It’s Tivadar from The Palindrome. Although I come from the theoretical side of machine learning, I have an ongoing fascination with low-level hardware and optimization.
One topic that always gets me going is accelerating neural networks: reducing their size, speeding up their operations, and so on. This post is a deep dive into the techniques and methods for doing exactly that.
Speaking of deep dives: today’s issue is sponsored by Together AI, a GPU cloud and model platform provider. They are hosting a discussion between Dylan Patel (SemiAnalysis) and Ian Buck (NVIDIA) for an insider look at NVIDIA Blackwell, the latest microarchitecture behind the GPUs that power our neural networks.
The deep dive will cover architecture, optimizations, implementation, and more, along with an opportunity to get your questions answered.