Making Neural Networks Cheaper, Smaller, and Faster
Building smaller systems, one weight at a time
Hey!
It’s Tivadar from The Palindrome. Although I come from the theoretical side of machine learning, I have an ongoing fascination with low-level hardware and optimization.
One topic that always gets me going is accelerating neural networks: reducing their size, speeding up their operations, and so on. This post is a deep dive into the techniques and methods for doing exactly that.
Speaking of deep dives: today’s issue is sponsored by Together AI, a GPU cloud and model platform provider. They are hosting a discussion between Dylan Patel (SemiAnalysis) and Ian Buck (NVIDIA) for an insider look at NVIDIA Blackwell, the latest microarchitecture behind the GPUs that power our neural networks.
The deep dive will cover architecture, optimizations, implementation, and more, along with an opportunity to get your questions answered.