<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Palindrome]]></title><description><![CDATA[mathematics ∪ machine learning]]></description><link>https://thepalindrome.org</link><image><url>https://substackcdn.com/image/fetch/$s_!5Jm3!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8b68cf8-d3f4-42f6-b8dd-cccde036005f_720x720.png</url><title>The Palindrome</title><link>https://thepalindrome.org</link></image><generator>Substack</generator><lastBuildDate>Mon, 01 Jun 2026 19:46:39 GMT</lastBuildDate><atom:link href="https://thepalindrome.org/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Tivadar Danka]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[thepalindrome@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[thepalindrome@substack.com]]></itunes:email><itunes:name><![CDATA[Tivadar Danka]]></itunes:name></itunes:owner><itunes:author><![CDATA[Tivadar Danka]]></itunes:author><googleplay:owner><![CDATA[thepalindrome@substack.com]]></googleplay:owner><googleplay:email><![CDATA[thepalindrome@substack.com]]></googleplay:email><googleplay:author><![CDATA[Tivadar Danka]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[How to OOP in Python Like a Pro]]></title><description><![CDATA[Inheritance and composition]]></description><link>https://thepalindrome.org/p/how-to-oop-in-python-like-a-pro</link><guid isPermaLink="false">https://thepalindrome.org/p/how-to-oop-in-python-like-a-pro</guid><dc:creator><![CDATA[Stephen Gruppetta]]></dc:creator><pubDate>Sat, 30 May 2026 06:49:08 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/f2c5ca9e-989d-47d2-8cce-001481f77916_1635x962.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>If you liked this post, you should support me with a paid subscription. It&#8217;s $100 a year, or $10 a month, and in return you get well-researched machine learning/mathematics deep dives such as <a href="https://thepalindrome.org/p/machine-learning-is-not-just-statistics-4d9">Machine Learning is Not Just Statistics</a>, <a href="https://thepalindrome.org/p/matrices-and-graphs">Matrices and Graphs</a>, or <a href="https://thepalindrome.org/p/an-introduction-to-vectorization">Vectorization in Theory</a> + <a href="https://thepalindrome.org/p/vectorization-in-practice">Vectorization in Practice.</a></em></p><p><em>Your support makes it possible for me to write these high value, high signal posts every week.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://thepalindrome.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://thepalindrome.org/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p>Hey!</p><p>It&#8217;s Tivadar from The Palindrome. Back when I wrote the <em>Mathematics of Machine Learning</em> book, I realized</p><ol><li><p>what a great language Python is,</p></li><li><p>and that object-oriented Python should be the first thing taught to every machine learning engineer.</p></li></ol><p>So, I&#8217;ve been thinking about publishing a series, but my Python skills are not exactly top-of-the-line; I just hack and slash until things work. 
Fortunately, I found the best person who could do that!</p><p>It&#8217;s my pleasure to introduce <a href="https://open.substack.com/users/120170782-stephen-gruppetta?utm_source=mentions">Stephen Gruppetta</a>, author of <a href="https://open.substack.com/pub/thepythoncodingstack">The Python Coding Stack</a>, and my longtime online friend from back when X was called Twitter.</p><p>If you ever wanted to become a power user of Python and take advantage of all the heavy machinery provided by classes, operator overloading, inheritance, composition, etc., this is the article for you. </p><p>(It&#8217;s the second post in the object-oriented Python miniseries. <a href="https://thepalindrome.org/p/introduction-to-object-oriented-programming">Check the first one here for the foundations of OOP!</a>)</p><p>Dig in!</p><p>Cheers,<br>Tivadar</p><div><hr></div><p>A class is a template you can use to create several objects that share the same characteristics. All the objects created from a class will have the same structure and can perform the same actions.</p><p>However, sometimes you need to create objects that don&#8217;t have the same attributes but that are still similar to each other. You may need objects that share some characteristics but are sufficiently different from each other, so you can&#8217;t create them using the same class.</p><p>You want to avoid writing a brand new class from scratch. Instead, you&#8217;d like to reuse some of the code from the original class.</p><p>There are several options for handling overlap between classes. Inheritance is a common tool to link classes. However, inheritance is not always the right solution. Composition is an alternative technique to use a class as part of another class. In this article, you&#8217;ll learn about inheritance and composition. 
You&#8217;ll learn how to use each technique, and just as importantly, when to use them.</p><h1>The Velocity Class</h1><p>You learned about classes in <a href="https://thepalindrome.org/p/introduction-to-object-oriented-programming">the first article in this series</a>. You wrote <a href="https://thepalindrome.org/i/194861422/special-methods-how-python-works">a Vector class</a> that you can use to create vector instances.</p><p>You&#8217;ll make some changes to this class as you progress through this article. However, you&#8217;ll primarily work on a new class: the <code>Velocity</code> class.</p><p>You&#8217;ll consider two different routes to create the <code>Velocity</code> class. <a href="https://en.wikipedia.org/wiki/Velocity">Velocity</a> is a vector: it has a magnitude and a direction. Therefore, you can start from the <code>Vector</code> class you defined in the first article and use it as a starting point to build the <code>Velocity</code> class. This is the inheritance route.</p><p>However, you can also create a class that has a <code>Vector</code> instance as one of its attributes. You&#8217;ll also explore this composition route in this article.</p>
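<p>To make the two routes concrete before going further, here is a minimal sketch. The <code>Vector</code> class below is a simplified, hypothetical stand-in (the version from the first article is not reproduced here), and the names <code>VelocityByInheritance</code>, <code>VelocityByComposition</code>, <code>unit</code>, and <code>speed()</code> are assumptions made purely for illustration; they are not necessarily the classes built later in this article.</p>

```python
import math


class Vector:
    """Hypothetical, simplified stand-in for the Vector class from the first article."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def magnitude(self):
        # Euclidean length of the 2D vector
        return math.hypot(self.x, self.y)


class VelocityByInheritance(Vector):
    """Inheritance route: a velocity *is a* vector."""

    def __init__(self, x, y, unit="m/s"):
        super().__init__(x, y)  # reuse Vector's initialisation
        self.unit = unit

    def speed(self):
        # magnitude() is inherited from Vector
        return self.magnitude()


class VelocityByComposition:
    """Composition route: a velocity *has a* vector."""

    def __init__(self, x, y, unit="m/s"):
        self.vector = Vector(x, y)  # Vector instance stored as an attribute
        self.unit = unit

    def speed(self):
        # Delegate the calculation to the wrapped Vector instance
        return self.vector.magnitude()


inherited = VelocityByInheritance(3, 4)
composed = VelocityByComposition(3, 4)

print(inherited.speed(), composed.speed())  # 5.0 5.0
print(isinstance(inherited, Vector))  # True
print(isinstance(composed, Vector))  # False
```

<p>Both objects report the same speed, but only the inherited version passes the <code>isinstance()</code> check against <code>Vector</code>. That difference between an is-a relationship and a has-a relationship is the trade-off to keep in mind when choosing between the two routes.</p>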
      <p>
          <a href="https://thepalindrome.org/p/how-to-oop-in-python-like-a-pro">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Explore LLM word representations using similarity analysis (part 2)]]></title><description><![CDATA[Investigate semantic information inside the attention matrices of GPT-2]]></description><link>https://thepalindrome.org/p/explore-llm-word-representations-c57</link><guid isPermaLink="false">https://thepalindrome.org/p/explore-llm-word-representations-c57</guid><dc:creator><![CDATA[Mike X Cohen, PhD]]></dc:creator><pubDate>Wed, 20 May 2026 11:06:26 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/49c6e1a4-b00f-4a2e-b2d0-87083aa4aecb_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>What you will learn in this 2-part post series</h2><p>The primary goal of this post series is to teach you Representational Similarity Analysis (RSA), a machine-learning method for comparing distributed representations across different systems.</p><p>If you haven&#8217;t already read <a href="https://thepalindrome.org/p/explore-llm-word-representations">Part 1 in this series</a>, please do so! It provides necessary background about how the RSA score is calculated and interpreted.</p><p>As a brief reminder, RSA works by comparing cosine similarity matrices across different embeddings spaces (layers, blocks, models, etc.). The idea is that different embeddings spaces may have distinct coordinate systems and even different dimensionalities, but if their internal representational structures are similar, the relative similarities should be strongly correlated even if the vectors are distinct.</p><p>The additional goals of this second post are (1) to learn more about RSA and category specificity, and (2) to learn how to dissect the &#8220;hidden layers&#8221; of an LLM, and in particular, the Query, Keys, and Values vectors inside the transformer block. 
Those <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> vectors are part of the mechanism by which LLMs figure out which information from previous words is relevant for using the current word to make predictions about subsequent words.</p><p>You will discover that the adjustment vectors are largely uncorrelated while their RSA scores are quite high. These results show that although individual attention matrices have idiosyncratic internal calculations, they learn meaningful representations and words that allow them to interact in elegant and semantically meaningful ways.</p><p>If you want to learn more about the attention algorithm, I can humbly recommend <a href="https://mikexcohen.substack.com/p/llm-breakdown-56-attention">my post on the topic</a>.</p><p>This post roughly corresponds to Project 38 in my recent book on <a href="https://github.com/mikexcohen/ML4LLM_book">using machine-learning projects to understand how LLMs work</a>. Don&#8217;t worry, you don&#8217;t need the book to follow this post.</p><h3>How to use the code with these posts</h3><p>The accompanying code file will reproduce all the figures in this post, but you can do so much more by thinking of the code as a starting point for your continued explorations. Try changing parameters, adding new words or categories, using different similarity/distance metrics, different models, etc.</p><p><a href="https://github.com/mikexcohen/Substack/blob/main/MLonLLMs/Cohen_RSA_LLMs_part2.ipynb">The code is available here on my GitHub.</a> In the video below, I show how to get and run the code using Google Colab. 
You can also download the notebook file and run it locally, but I recommend using Colab because you won&#8217;t need to worry about local installations or library versions.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;362fafd8-b46c-41d2-9a37-23c52a90cdbf&quot;,&quot;duration&quot;:null}"></div><p></p><h2>What are the <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> vectors in the attention algorithm?</h2><p>You&#8217;ve probably heard of the &#8220;attention algorithm&#8221; at the heart of the LLM transformer block. Attention is an elegant and clever trick that allows language models to determine how much information is contained in each pair of tokens and how that information is relevant for generating predictions about new text.</p><p><a href="https://open.substack.com/pub/mikexcohen/p/llm-breakdown-56-attention?r=658yg">You can read more about the attention algorithm in this linked post; the paragraphs below provide a very brief summary.</a></p><p>There are three sets of activations for each token position, called the Query (<strong>Q</strong>), Keys (<strong>K</strong>), and Values (<strong>V</strong>) matrices. The idea is this: For each pair of tokens, the dot product between their corresponding <strong>Q</strong> and <strong>K</strong> vectors creates a scalar weighting value, with higher dot products indicating that more importance (attention) should be paid to that pair; that weight value then determines how much relevant information in <strong>V</strong> gets added onto the embeddings vectors.</p><p>The result of the attention algorithm is an adjustment to the embeddings vectors as they pass through the transformer stack. 
In other words, the embeddings vectors you worked with in the previous post are rotated and scaled by each transformer layer, and those adjustments nudge the vectors from pointing towards the tokens in the input (e.g., the text prompt you gave to Claude) towards other tokens to generate an appropriate and context-relevant output.</p><p>The attention calculation is separated into &#8220;heads,&#8221; which are low-dimensional views of the hidden states that capture distinct features of the text and are combined at the end of the attention algorithm. I decided to ignore the attention heads for this post. That&#8217;s partly because we&#8217;re not working with the QK&#7488; dot products, which is where the representations become head-specific, and partly in the interest of focusing on the mechanics and interpretations of RSA. An interesting extension of this project could involve running the analyses separately per-head, although that would create a dimensionality-explosion due to the number of heads and matrices that could be compared with RSA.</p><h2>Import and inspect GPT-2-XL</h2><p>I decided to use GPT-2-XL for this post. The code to import the model and its tokenizer from Hugging Face was shown in the previous post. 
The screenshot below shows an overview of the model architecture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!soB2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!soB2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 424w, https://substackcdn.com/image/fetch/$s_!soB2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 848w, https://substackcdn.com/image/fetch/$s_!soB2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 1272w, https://substackcdn.com/image/fetch/$s_!soB2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!soB2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png" width="512" height="367" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84589e08-4598-49a4-9f04-8cd172461630_512x367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:367,&quot;width&quot;:512,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!soB2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 424w, https://substackcdn.com/image/fetch/$s_!soB2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 848w, https://substackcdn.com/image/fetch/$s_!soB2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 1272w, https://substackcdn.com/image/fetch/$s_!soB2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84589e08-4598-49a4-9f04-8cd172461630_512x367.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>If you&#8217;re new to PyTorch models and LLMs, then this overview might look intimidating. The relevant information for us here is that there are 48 transformer layers (h is for &#8220;hidden layer&#8221;), and each hidden layer contains an attention block (attn) among other components like layerNorm and MLP. The <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> vectors are calculated in the c_attn layer. That matrix is 4800&#215;1600. The 1600 corresponds to the embeddings dimensionality, and 4800 corresponds to the <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> matrices concatenated into one wide matrix (4800 = 3&#215;1600).</p><h2>Access the internal calculations using hooks</h2><p>In the previous post, we didn&#8217;t need to prompt the model because we could just grab the embeddings vectors for each word we wanted to analyze. However, accessing the internal calculations of an LLM is a little more involved. 
The reason is that the transformer modifies each embeddings vector according to context (previous words in the text); in other words, the representation of the word &#8220;the&#8221; depends on all the words that come before it. Thus, we need to prompt the model with some text in order to analyze its internals.</p><p>But that creates a new problem: Even small LLMs create huge data matrices during each forward pass, and storing all of those internal calculations for each prompt would require terabytes of space. Therefore, the internal calculations are destroyed as soon as they&#8217;re no longer needed.</p><p>Fortunately, PyTorch provides a special method (like a function) that we can implant into the model that allows us to grab the internal calculations before they are destroyed. It&#8217;s called a &#8220;hook function&#8221; and looks like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O_x4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O_x4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 424w, https://substackcdn.com/image/fetch/$s_!O_x4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 848w, https://substackcdn.com/image/fetch/$s_!O_x4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 1272w, 
https://substackcdn.com/image/fetch/$s_!O_x4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O_x4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png" width="1456" height="755" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:755,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;code&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="code" title="code" srcset="https://substackcdn.com/image/fetch/$s_!O_x4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 424w, https://substackcdn.com/image/fetch/$s_!O_x4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 848w, https://substackcdn.com/image/fetch/$s_!O_x4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 1272w, 
https://substackcdn.com/image/fetch/$s_!O_x4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05cb8ab0-ff4b-4e64-875e-b8638bdf8814_2824x1465.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>There&#8217;s a lot going on in that code, and it might look intimidating if you&#8217;re new to working with LLMs in Python. 
But the idea is to implant the &#8220;hook&#8221; function into the attention block of each transformer layer, make a copy of the <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> activations, convert them to NumPy, and store them in a dictionary called <code>activations</code> that we can access later. This function gets called each time we prompt the model with some text.</p><p>Now we&#8217;re ready to prompt the model using the 34 words in 3 categories that you used in the previous post.</p><p>But here&#8217;s the thing about language models: They&#8217;re not trained to process isolated words; they&#8217;re trained to extract rich and context-specific meaning from sequences of words like sentences and paragraphs that contain hundreds or thousands of words. Presenting one token at a time to an LLM will elicit unusual and outlier-like activations. Indeed, most interpretability analyses of LLMs specifically exclude the first token in the sequence because of its extreme activation patterns.</p><p>So I&#8217;ve done something very simple: I&#8217;ve presented to the model the sentence &#8220;The next word is ___&#8221;, replacing &#8220;___&#8221; with each of the 34 words that we want to analyze. This is good experimental design because it means that all words have identical context, and thus any differences and similarities can only be attributed to world knowledge that the model learned about each word.</p><p>The screenshot below shows code that creates the batch of token sequences and prompts GPT-2 with those sequences. 
When the model runs through its calculations, the hook function is activated, copying and storing the attention activations vectors into the dictionary.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MP9s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MP9s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 424w, https://substackcdn.com/image/fetch/$s_!MP9s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 848w, https://substackcdn.com/image/fetch/$s_!MP9s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 1272w, https://substackcdn.com/image/fetch/$s_!MP9s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MP9s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png" width="1456" height="475" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:475,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;code&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="code" title="code" srcset="https://substackcdn.com/image/fetch/$s_!MP9s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 424w, https://substackcdn.com/image/fetch/$s_!MP9s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 848w, https://substackcdn.com/image/fetch/$s_!MP9s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 1272w, https://substackcdn.com/image/fetch/$s_!MP9s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373a0ce9-1f05-42d2-8bec-b973b4aded72_2621x855.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nIF5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nIF5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 424w, https://substackcdn.com/image/fetch/$s_!nIF5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 848w, 
https://substackcdn.com/image/fetch/$s_!nIF5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 1272w, https://substackcdn.com/image/fetch/$s_!nIF5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nIF5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png" width="1456" height="252" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/616dc110-4b42-426b-9076-026872288603_2120x367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:252,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;code&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="code" title="code" srcset="https://substackcdn.com/image/fetch/$s_!nIF5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 424w, https://substackcdn.com/image/fetch/$s_!nIF5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 848w, 
https://substackcdn.com/image/fetch/$s_!nIF5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 1272w, https://substackcdn.com/image/fetch/$s_!nIF5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F616dc110-4b42-426b-9076-026872288603_2120x367.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Output:</p><p><code>(dict_keys(['attn_0_q', 'attn_0_k', 'attn_0_v', 'attn_1_q', 'attn_1_k', 'attn_1_v', 'attn_2_q', 'attn_2_k', 'attn_2_v', 'attn_3_q', 'attn_3_k', 'attn_3_v', 'attn_4_q', 'attn_4_k', 'attn_4_v', 'attn_5_q', 'attn_5_k', 'attn_5_v', 'attn_6_q', 'attn_6_k', 'attn_6_v', 'attn_7_q', 'attn_7_k', 'attn_7_v', 'attn_8_q', 'attn_8_k', 'attn_8_v', 'attn_9_q', 'attn_9_k', 'attn_9_v', 'attn_10_q', 'attn_10_k', 'attn_10_v', 'attn_11_q', 'attn_11_k', 'attn_11_v', 'attn_12_q', 'attn_12_k', 'attn_12_v', 'attn_13_q', 'attn_13_k', 'attn_13_v', 'attn_14_q', 'attn_14_k', 'attn_14_v', 'attn_15_q', 'attn_15_k', 'attn_15_v', 'attn_16_q', 'attn_16_k', 'attn_16_v', 'attn_17_q', 'attn_17_k', 'attn_17_v', 'attn_18_q', 'attn_18_k', 'attn_18_v', 'attn_19_q', 'attn_19_k', 'attn_19_v', 'attn_20_q', 'attn_20_k', 'attn_20_v', 'attn_21_q', 'attn_21_k', 'attn_21_v', 'attn_22_q', 'attn_22_k', 'attn_22_v', 'attn_23_q', 'attn_23_k', 'attn_23_v', 'attn_24_q', 'attn_24_k', 'attn_24_v', 'attn_25_q', 'attn_25_k', 'attn_25_v', 'attn_26_q', 'attn_26_k', 'attn_26_v', 'attn_27_q', 'attn_27_k', 'attn_27_v', 'attn_28_q', 'attn_28_k', 'attn_28_v', 'attn_29_q', 'attn_29_k', 'attn_29_v', 'attn_30_q', 'attn_30_k', 'attn_30_v', 'attn_31_q', 'attn_31_k', 'attn_31_v', 'attn_32_q', 'attn_32_k', 'attn_32_v', 'attn_33_q', 'attn_33_k', 'attn_33_v', 'attn_34_q', 'attn_34_k', 'attn_34_v', 'attn_35_q', 'attn_35_k', 'attn_35_v', 'attn_36_q', 'attn_36_k', 'attn_36_v', 'attn_37_q', 
'attn_37_k', 'attn_37_v', 'attn_38_q', 'attn_38_k', 'attn_38_v', 'attn_39_q', 'attn_39_k', 'attn_39_v', 'attn_40_q', 'attn_40_k', 'attn_40_v', 'attn_41_q', 'attn_41_k', 'attn_41_v', 'attn_42_q', 'attn_42_k', 'attn_42_v', 'attn_43_q', 'attn_43_k', 'attn_43_v', 'attn_44_q', 'attn_44_k', 'attn_44_v', 'attn_45_q', 'attn_45_k', 'attn_45_v', 'attn_46_q', 'attn_46_k', 'attn_46_v', 'attn_47_q', 'attn_47_k', 'attn_47_v']), (34, 5, 1600))</code></p><p>The size of each data tensor is 34&#215;5&#215;1600. There are 34 target words embedded into a sentence comprising 5 tokens (&#8220;The next word is ___&#8221;) with an embeddings dimensionality of 1600.</p><p>We&#8217;re almost ready for the analyses! The last preparatory step is to create two matrix masks that will allow us to identify the word pairs that are within-category (e.g., galaxy-comet and bed-wall) vs. across-category (e.g., star-sofa and window-banana).</p><p>To create those masks, I identified the diagonal blocks corresponding to the word pairs of the same category, and subtracted them from an upper-triangular matrix. 
The result is the two matrices in panels C and D in Figure 1 below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D9L8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D9L8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 424w, https://substackcdn.com/image/fetch/$s_!D9L8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 848w, https://substackcdn.com/image/fetch/$s_!D9L8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!D9L8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D9L8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png" width="1456" height="874" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:874,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!D9L8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 424w, https://substackcdn.com/image/fetch/$s_!D9L8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 848w, https://substackcdn.com/image/fetch/$s_!D9L8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!D9L8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47fd613e-eb44-4a78-b71e-7e88aa2e05e9_1920x1152.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Figure 1: Creating two binary matrix masks (panels C and D) to isolate and extract the within- vs. across-category similarity values from symmetric matrices. Panels A and B show the two intermediate matrix masks from which the key masks are created.</em></figcaption></figure></div><p>The upshot is this: I can apply those mask matrices to cosine similarity matrices to extract the similarity values within- vs. across-categories, and then apply the RSA method you learned about in the previous post.</p><h2>Correlating <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> activations</h2><p>The goal of Part 2 is to correlate the activations between pairs of attention activation matrices. Spoiler alert: The correlations will be close to zero. But you need to see how small these correlations are, in order to appreciate the insights gained from applying the RSA technique.</p><p>To begin, I ran the analysis in layer index 6. 
The scatter plots in Figure 2 show the correlations between all pairs of attention vectors for the second token position.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dcpd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dcpd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 424w, https://substackcdn.com/image/fetch/$s_!dcpd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 848w, https://substackcdn.com/image/fetch/$s_!dcpd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 1272w, https://substackcdn.com/image/fetch/$s_!dcpd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dcpd!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png" width="1200" height="350.27472527472526" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:425,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!dcpd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 424w, https://substackcdn.com/image/fetch/$s_!dcpd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 848w, https://substackcdn.com/image/fetch/$s_!dcpd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 1272w, https://substackcdn.com/image/fetch/$s_!dcpd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86c98fdd-84c5-4e65-b4af-64a935020032_2304x672.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Figure 2: Direct correlations amongst the three attention matrices. Each dot in the scatter plots reflects the activation value for one embeddings dimension for one token position.</em></figcaption></figure></div><p>The correlation between <strong>Q</strong> and <strong>K</strong> is weakly negative, while the other two pairs have correlations near zero. The negative correlation between <strong>Q</strong> and <strong>K</strong> stems from the shift towards negative values in <strong>QK&#7488;</strong>, which is an important part of the attention algorithm but isn&#8217;t relevant here. 
Overall, it seems that the activations in one matrix are unrelated to the activations in the other matrices &#8212; even though the tokens and contexts are identical.</p><p>By the way, I asked you to extract the data from the second token instead of the final (target) token to demonstrate that the weak correlations are not trivially due to processing different tokens; these scatter plots reflect identical token sequences at this point. In fact, some of the correlations are even closer to zero when using the final token. You can explore that yourself in the code.</p><p>Next I repeated the analysis for each transformer layer in a for-loop. That&#8217;s a lot of scatter plots to look at, so instead I visualized the correlation coefficients (Figure 3).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8gMQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8gMQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!8gMQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 848w, https://substackcdn.com/image/fetch/$s_!8gMQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8gMQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8gMQ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png" width="1200" height="360.16483516483515" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:437,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8gMQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!8gMQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 848w, https://substackcdn.com/image/fetch/$s_!8gMQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8gMQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d02a77-f533-4ea2-8339-f36838bca890_1920x576.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Figure 3: The correlations in Figure 2 were repeated for all layers and visualized here.</em></figcaption></figure></div><p>The results are similar throughout the model: Near-zero correlations for the pairs involving the <strong>V</strong> vectors, and weakly negative correlations between <strong>Q</strong> and <strong>K</strong> across most 
layers (those negative correlations impose sparsity for reasons that I detail <a href="https://mikexcohen.substack.com/p/llm-breakdown-56-attention?r=6bsj8n">in other posts</a>).</p><p>What have we learned so far? In the previous post, you saw that although the embeddings spaces in different models are not directly comparable, their internal statistical relational structures are highly consistent, at least in the small sample we examined. What about the attention matrices in this post: do they have high RSA scores despite low direct correlations? Perhaps you&#8217;ve guessed that the answer is Yes, but let&#8217;s not trust our intuition; instead, let&#8217;s gather statistical evidence to make data-informed decisions.</p><p>We&#8217;ll start by focusing on data from one transformer layer to build the visualizations, intuition, and code, and then we&#8217;ll expand that analysis to all the layers.</p><h2>Cosine similarities and RSA (one layer)</h2><p>Remember from the previous post that RSA involves correlating similarity values, not correlating dimension-coordinate values. So to calculate an RSA score, we first need to calculate the within-matrix similarity values across all word pairs.</p><p>I&#8217;ll start with data from layer 6. 
In the code, I extract the <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> activations from the final token of every sequence in the batch, and calculate the word &#215; word cosine similarity matrix within each attention matrix.</p><p>The cosine similarity matrices are interesting to look at: The within-category similarities (block-diagonals) are visibly higher than the across-category off-diagonal elements, and the overall similarities are highest for <strong>K</strong> and lowest for <strong>V</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p09h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p09h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!p09h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 848w, https://substackcdn.com/image/fetch/$s_!p09h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 1272w, https://substackcdn.com/image/fetch/$s_!p09h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!p09h!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png" width="1200" height="360.16483516483515" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:437,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!p09h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!p09h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 848w, https://substackcdn.com/image/fetch/$s_!p09h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 1272w, https://substackcdn.com/image/fetch/$s_!p09h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b09d0aa-f0d0-4d2b-9f9b-b72357a081cd_1920x576.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Figure 4: Cosine similarity matrices across all word pairs (x- and y-axes) within each attention matrix, from one layer. Dashed lines indicate category boundaries. Colormaps have the same limits for all matrices.</em></figcaption></figure></div><p>Now we can calculate the pairwise RSA scores by correlating the non-redundant and non-trivial cosine similarity values (the upper triangle of each matrix, excluding the diagonal) across the pairs of attention matrices. I&#8217;m not separating into within- vs. 
across-category submatrices yet; the RSA scores here are based on all words.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0b5O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0b5O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!0b5O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 848w, https://substackcdn.com/image/fetch/$s_!0b5O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 1272w, https://substackcdn.com/image/fetch/$s_!0b5O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0b5O!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png" width="1200" height="360.16483516483515" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:437,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0b5O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!0b5O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 848w, https://substackcdn.com/image/fetch/$s_!0b5O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 1272w, https://substackcdn.com/image/fetch/$s_!0b5O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F905519d5-b004-4e11-9325-56d4f32e541a_1920x576.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Figure 5: Scatter plots showing the RSA results for the three pairs of matrices. Each marker is one cosine similarity value from the upper-triangle of the matrices shown in Figure 4. The correlation (r) value is the RSA score.</em></figcaption></figure></div><p>After having seen the weak correlations of activation values across the attention matrices, the RSA is a remarkable and refreshing change: Even the &#8220;weakest&#8221; RSA is still around .85.</p><p>The interpretation of the results so far is that the <strong>Q</strong>, <strong>K</strong>, and <strong>V</strong> matrices represent token updates in distinct (often orthogonal) ways, but the nature of how those adjustments relate to each other is comparable across the matrices. 
In other words, the internal representations are similar while the coordinate spaces are distinct.</p><p>That is no accident: If the three sets of attention vectors were already so closely correlated, then they would be redundant and the attention algorithm wouldn&#8217;t be terribly useful. Instead, each of the three matrices (especially <strong>V</strong>) is trained into orthogonal spaces so that their unique contributions to the hidden-state adjustments can be information-rich and context-selective.</p><p>But before we get too excited about this result, let&#8217;s see if these observations are unique to this layer, or whether we see similar results in other layers.</p><h2>Laminar profile of RSA scores</h2><p>The online code file shows how to repeat the RSA analysis in a for-loop over all transformer layers. Figure 6 below shows the laminar profiles of the correlation coefficients (RSA scores).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8do6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8do6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!8do6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 848w, 
https://substackcdn.com/image/fetch/$s_!8do6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 1272w, https://substackcdn.com/image/fetch/$s_!8do6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8do6!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png" width="1200" height="360.16483516483515" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:437,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8do6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 424w, https://substackcdn.com/image/fetch/$s_!8do6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 848w, 
https://substackcdn.com/image/fetch/$s_!8do6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 1272w, https://substackcdn.com/image/fetch/$s_!8do6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d470cd7-f857-49ce-9266-d0574d2c8ffd_1920x576.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Figure 6: The RSA values calculated in Figure 5 were repeated for each layer and 
visualized.</em></figcaption></figure></div><p>With the exception of the first transformer layer, the RSA scores are all roughly equally strong, around .9, and show no obvious change with depth into the model.</p><h2>Category separability in one layer</h2><p>We still haven&#8217;t incorporated the categories into the analyses. That&#8217;s the goal of the last two sections of this post. We will quantify the category separability of cosine similarity values within each of the attention matrices using an effect size calculation called Cohen&#8217;s <em>d</em> (unrelated &#128578;). And then we&#8217;ll calculate the RSA scores separately for the similarity values within- vs. across-categories.</p><p>Before getting into the details, let&#8217;s build some intuition by visualizing distributions of cosine similarity values. I&#8217;ll use the within- and across-category mask matrices I created earlier to isolate the within- and across-category cosine similarity values from one transformer layer. 
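</p><p>Here is a minimal sketch of that masking step, not the code from the online file. It assumes a square cosine similarity matrix <code>cs</code> and one integer category label per token in <code>labels</code>; both names, and the helper <code>split_by_category</code>, are hypothetical and used only for illustration.</p>

```python
import numpy as np


def split_by_category(cs, labels):
    """Return (within, across) similarity values from the strict upper triangle.

    cs     : square matrix of pairwise cosine similarities (hypothetical name)
    labels : one category label per token, same order as the rows of cs
    """
    labels = np.asarray(labels)
    # True where the two tokens of a pair share a category
    same = labels[:, None] == labels[None, :]
    # Strict upper triangle: each pair once, no self-similarities
    upper = np.triu(np.ones(same.shape, dtype=bool), k=1)
    within_mask = np.logical_and(same, upper)
    across_mask = np.logical_and(np.logical_not(same), upper)
    return cs[within_mask], cs[across_mask]


# Toy data: 4 random "token" vectors in 2 categories (illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
cs = Xn @ Xn.T  # cosine similarity matrix

within, across = split_by_category(cs, [0, 0, 1, 1])
print(within.shape, across.shape)
```

<p>Taking only the strict upper triangle counts each token pair once and drops the self-similarities on the diagonal.</p><p>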
See Figure 7.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z4MU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z4MU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 424w, https://substackcdn.com/image/fetch/$s_!Z4MU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 848w, https://substackcdn.com/image/fetch/$s_!Z4MU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 1272w, https://substackcdn.com/image/fetch/$s_!Z4MU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z4MU!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png" width="1200" height="300" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:364,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Z4MU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 424w, https://substackcdn.com/image/fetch/$s_!Z4MU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 848w, https://substackcdn.com/image/fetch/$s_!Z4MU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 1272w, https://substackcdn.com/image/fetch/$s_!Z4MU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4292bc5-bf79-496c-98c9-d9e4f1f3ef74_2304x576.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><em>Figure 7: Histograms of cosine similarity values, separated by within- (blue) vs. across- (orange) category, for each of the attention matrices. Notice the differences in the similarity values (x-axis ticks in different panels). 
The title of each panel also indicates Cohen&#8217;s d, a measure of effect size used here to quantify category separability.</em></figcaption></figure></div><p>The distributions are clearly well separated: For each attention matrix, the similarity values are stronger within- vs. across-categories. This result shows that even deep inside the attention algorithm, LLMs have some relational structure that incorporates semantic world-knowledge into their token adjustment vectors.</p><p>The d-values in the titles are Cohen&#8217;s <em>d</em> effect sizes that I calculated using the <code>compute_effsize</code> function in the <code>pingouin</code> library. Cohen&#8217;s <em>d</em> is the difference of the means of the two distributions, scaled by their pooled standard deviation. It&#8217;s closely related to the t-value, but it is not inflated by sample size, which makes it easier to interpret. Effect sizes of around 2.5 are very large; in the experimental psychology literature, by comparison, researchers are very happy with effect sizes of around .6 to .8.</p><h2>Laminar profile of category separability and RSA</h2><p>The final part of this post is to calculate the effect size and RSA score for each layer, in order to determine whether category specificity evolves across the transformer stack.</p><p>The coding here is fairly straightforward, and requires only some minor modifications, for example splitting the RSA calculations by category class and storing the results per-layer.</p><p>Here are the results:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HeWn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!HeWn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 424w, https://substackcdn.com/image/fetch/$s_!HeWn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 848w, https://substackcdn.com/image/fetch/$s_!HeWn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 1272w, https://substackcdn.com/image/fetch/$s_!HeWn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HeWn!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png" width="1200" height="339.56043956043953" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:412,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" title="" 
srcset="https://substackcdn.com/image/fetch/$s_!HeWn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 424w, https://substackcdn.com/image/fetch/$s_!HeWn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 848w, https://substackcdn.com/image/fetch/$s_!HeWn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 1272w, https://substackcdn.com/image/fetch/$s_!HeWn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16150e0c-8e6e-4538-ad4c-e1dacaeccffb_2304x652.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Figure 8: Category separability effect size (from Figure 7) and RSA scores are shown across all layers.</em></figcaption></figure></div><p>The results show high category separability and RSA scores throughout the depth of the model, with some relative decrease in separability around the middle layers.</p><p>Interestingly, the across-category RSA scores seem to be weaker than the within-category scores. Let&#8217;s investigate that observation further.</p><p>The scatter plot below shows the direct comparison of within- to across-category RSA scores, with each marker indicating one attention matrix in one layer, and the diagonal line showing unity.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sRt6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sRt6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 424w, https://substackcdn.com/image/fetch/$s_!sRt6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 848w, 
https://substackcdn.com/image/fetch/$s_!sRt6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 1272w, https://substackcdn.com/image/fetch/$s_!sRt6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sRt6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png" width="1152" height="960" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:960,&quot;width&quot;:1152,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!sRt6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 424w, https://substackcdn.com/image/fetch/$s_!sRt6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 848w, 
https://substackcdn.com/image/fetch/$s_!sRt6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 1272w, https://substackcdn.com/image/fetch/$s_!sRt6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67bb4fac-37c1-4ce8-8f72-234cd6b4ca60_1152x960.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Figure 9: Comparing RSA scores within- vs. across-category. 
Each marker is an RSA score from one attention matrix and one layer. The color indicates the layer index, going from earlier transformer layers in dark purple, to later layers in yellow. The diagonal line of unity indicates equal RSA scores; data values above the line indicate stronger within- compared to across-category RSA.</em></figcaption></figure></div><p>Every single marker is above the line of unity, meaning that for every attention matrix in every layer, the within-category RSA score is larger than the corresponding across-category score. You don&#8217;t need a statistical test to see that it&#8217;s a real effect!</p><p>By the way, this is not simply due to a ceiling effect of RSA scores (they are bounded by 1), because correcting for the ceiling by re-running the analysis using the Fisher-z transform doesn&#8217;t change the key result. I don&#8217;t show that re-analysis here, but you can test it by applying the <code>np.arctanh</code> function (aliased as <code>np.atanh</code> in NumPy 2.0 and later) to the RSA scores.</p><p>These results reveal that semantic information is present within each attention matrix, that it can be quantified by a linear analysis (though this does not prove that the representations are fully linear), and that the relational structures are comparable across matrices even though the matrices occupy nearly orthogonal spaces.</p><h2>So you wanna learn more?</h2><p>If you think that leveraging machine-learning techniques to investigate and understand LLM architecture and internal calculations is a useful approach, then please consider checking out <a href="https://github.com/mikexcohen/ML4LLM_book">my book from which</a> these posts were adapted, and/or my 90-hour <a href="https://www.udemy.com/course/dullms_x/?couponCode=202509">video-based course on LLM architecture</a>, training, and interpretability. And of course, you can check out <a href="https://mikexcohen.substack.com/">my other Substack posts</a>.</p>]]></content:encoded></item><item><title><![CDATA[The 10 Most Important Lessons 20 Years of Mathematics Taught Me]]></title><description><![CDATA[#5. 
There are no shortcuts to mastery.]]></description><link>https://thepalindrome.org/p/the-10-most-important-lessons-20-abe</link><guid isPermaLink="false">https://thepalindrome.org/p/the-10-most-important-lessons-20-abe</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Sun, 17 May 2026 13:57:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/60c07f96-d990-473a-89e2-b64116427fb1_3840x2160.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div id="youtube2-dxWeZMszSGo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;dxWeZMszSGo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/dxWeZMszSGo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><strong>One.</strong> <em>Breaking the rules is often the best course of action.</em></p><p>I can&#8217;t even count the number of math-breaking ideas that propelled science forward by light-years.</p><p>We have set theory because Bertrand Russell broke the notion that <em>&#8220;sets are just collections of things.&#8221;</em> We have complex numbers because Gerolamo Cardano kept the computations going when encountering &#8730;&#8722;1, refusing to acknowledge that it doesn&#8217;t exist. 
We have non-Euclidean spaces because J&#225;nos Bolyai did not accept that, given a line and an external point, there&#8217;s only one parallel line that intersects the point.</p><p>A triangle of three right angles breaks one space, but creates another.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w8lM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w8lM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!w8lM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!w8lM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!w8lM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w8lM!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:616822,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thepalindrome.org/i/197797926?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w8lM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!w8lM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!w8lM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!w8lM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26c00d04-8b0d-45a4-bff1-dba8977c4fac_3840x2160.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>No assumption should be set in stone, and I&#8217;m not afraid to challenge them. You shouldn&#8217;t be afraid either.</p><div><hr></div><p><em>&#128204; If you find value in my work, consider supporting The Palindrome with a paid subscription! 
Your support is what makes independent and high-quality education possible.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://thepalindrome.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://thepalindrome.org/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p><strong>Two.</strong> <em>You have to deeply understand the rules to successfully break them.</em></p><p>Let&#8217;s be honest: breaking rules is a clich&#233;.</p><p>All of my examples of successful rule-breaking above came from experts in their fields: Russell was a master of logic, Cardano of algebra, and Bolyai of geometry.</p><p>Think of doing <s>mathematics</s> science as holding a large marble orb in your hands, with treasure inside. You want to open it, but the marble is so sturdy that you can&#8217;t just smash it open.</p><p>What you <em>can</em> do, however, is obsessively study it in detail until you find a tiny crack, giving you a way in.</p><p>If you don&#8217;t know what you are doing, you are just an elephant in a porcelain store.</p><p>This holds true outside of math. Miles Davis, the legendary jazz musician, famously stated that</p><blockquote><p><em>&#8220;Once is a mistake, twice is jazz.&#8221;</em></p></blockquote><p>Mistakes are easy. Jazz is hard.</p><div><hr></div><p><strong>Three.</strong> <em>Understanding happens when you take things slow.</em></p><p>At university, most of my classes were in the classic chalk + blackboard style, and let me tell you, taking notes on formulas, definitions, and theorems by hand from a blackboard is infinitely more beneficial than relying on a PowerPoint presentation.</p><p>We were <em>thinking and working together</em> with the lecturers instead of just tagging along for the ride.</p><p>Something magical happens when you copy formulas by hand. 
Your mind automatically wraps itself around the concepts, bending them, studying them.</p><p>It&#8217;s not just true for math but for <em>any</em> topic.</p><p>That&#8217;s why typing out a solution from an LLM response (or StackOverflow if you are old-school) is better than copying and pasting it.</p><div><hr></div><p><strong>Four.</strong> <em>The best way to learn is to solve problems.</em></p><p>I just said that <em>&#8220;understanding happens when you take things slow.&#8221;</em></p><p>Well, the easiest way to take things slow is to be forced to by pushing your skills and knowledge to the limit.</p><p>This happens when you are working with tools you don&#8217;t know how to use, on problems you don&#8217;t know how to solve.</p><p>Until I implemented my own neural network library from scratch, my understanding was limited to the user level. I prepared my datasets, ran a couple of epochs, evaluated the model, and repeated the process until I liked what I saw.</p><p>Soon, it wasn&#8217;t enough. I wanted to dig deeper, so I built a neural network framework from scratch. And another one. And another one that became <a href="https://github.com/the-palindrome/mlfz">mlfz</a>, the cornerstone of my next book.</p><p>Consuming YouTube tutorials is an extremely inefficient way to improve at anything. If you want to get good, get your hands dirty and start solving problems.</p><p>Think.</p><p>Build.</p><p>Take it apart and put it back together again.</p><div><hr></div><p><strong>Five.</strong> <em>There are no shortcuts to mastery.</em></p><p>Once upon a time, the legendary Greek mathematician Euclid was summoned by King Ptolemy I Soter (not to be confused with the astronomer Ptolemy), who wanted to study geometry.</p><p>Judging from the fact that, two thousand years later, we still call classical geometry <em>Euclidean</em>, you have probably guessed that Euclid was pretty good. 
However, the king soon got impatient and asked if there was a shortcut.</p><p>Euclid replied that</p><blockquote><p><em>&#8220;There is no royal road to geometry.&#8221;</em></p></blockquote><p>The man invented the axiomatic approach to mathematics, wrote the second-most printed book of all time, and told a king to f**k off. That&#8217;s a pretty impressive resume.</p><p>But what does no royal road mean?</p><p>That there are no shortcuts. You can&#8217;t pay your way to knowledge; you must put in the work for math and anything else that&#8217;s worth doing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jDwW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jDwW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!jDwW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!jDwW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!jDwW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jDwW!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png" width="1200" height="675" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:1443462,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thepalindrome.org/i/197797926?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jDwW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!jDwW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!jDwW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 1272w, 
https://substackcdn.com/image/fetch/$s_!jDwW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00af48c1-71f8-4282-9449-bf30f0ca7bd1_3840x2160.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p><strong>Six.</strong> <em>Always tackle one issue at a time. Look at special cases, then add complexity step by step.</em></p><p>I&#8217;m quite an impatient person. Even though my recklessness has gotten better with age, sometimes I still add two new features to a single commit. 
I just can&#8217;t help myself.</p><p>However, if you are a frequent reader of mine, you have probably heard me say, <em>&#8220;We&#8217;ll add complexity one layer at a time,&#8221;</em> or something along those lines.</p><p>This is not an accident.</p><p>Whenever I want to understand a new concept, I know that mentally juggling more than one <s>ball</s> thought is a recipe for failure, so I avoid it at all costs. Once I&#8217;m familiar with the basics, I add more and more details.</p><p>Think of the very first time you drove a car. Personally, my head was spinning so hard I didn&#8217;t even know whether I was coming or going. I was letting off the gas and gently pressing the brake with one foot, smashing the clutch with the other, finding the right gear with my right hand, and holding the steering wheel steady with the other. That&#8217;s just slowing down before a turn. (I&#8217;m European, so I learned to drive on a manual.)</p><p>After twenty years of experience, I can do all this in my sleep, and I learned it one step at a time.</p><p>Driving on autopilot is the problem now, but that&#8217;s a topic for another day.</p><div><hr></div><p><strong>Seven.</strong> <em>Finding the right perspective is half the success.</em></p><p>The all-time most-read post of The Palindrome is titled <a href="https://thepalindrome.org/p/matrices-and-graphs">Matrices and Graphs</a>, and it is secretly about this principle. By looking at matrices as graphs, we can immediately prove complex and profound results. As in computer science, each representation has its pros and cons. 
The proper choice of data structure can make or break a problem.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M_3y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M_3y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 424w, https://substackcdn.com/image/fetch/$s_!M_3y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 848w, https://substackcdn.com/image/fetch/$s_!M_3y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 1272w, https://substackcdn.com/image/fetch/$s_!M_3y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M_3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png" width="1456" height="531" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:531,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:103276,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thepalindrome.org/i/164861571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!M_3y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 424w, https://substackcdn.com/image/fetch/$s_!M_3y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 848w, https://substackcdn.com/image/fetch/$s_!M_3y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 1272w, https://substackcdn.com/image/fetch/$s_!M_3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33bcf14f-9e02-4021-b8a2-47b6173191a1_1920x700.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Matrices on one hand, graphs on the other.</p><p>Algebraic expressions like <em>a + ib</em> on one hand, vectors on the Euclidean plane on the other.</p><p>Driving your startup into the ground on one hand, a priceless learning opportunity on the other.</p><p>Success and failure are matters of perspective: you win, or you learn.</p><div><hr></div><p><strong>Eight.</strong> <em>Asking questions is a superpower.</em></p><p><em>&#8220;It&#8217;s not that there are no stupid questions,&#8221;</em> the words of my professor echo in my ears, <em>&#8220;it&#8217;s that not asking your questions is stupid.&#8221;</em></p><p>You probably don&#8217;t know this about me, but I&#8217;m quite shameless. 
I never had a problem raising my hand and asking my question during a lecture, no matter what it was.</p><p>My mid-(and sometimes post)-lecture discussions with the professors followed a notable trajectory of improvement, from <em>&#8220;What does that &#8704; symbol on the board mean?&#8221;</em> to discussing open problems and eventually solving a couple of them.</p><p>The key to that improvement was my blatant disregard for others&#8217; perceptions of me, daring to play some wrong notes to master my instrument.</p><div><hr></div><p><strong>Nine.</strong> <em>Talent is just the icing on the cake. The rest is hard work and perseverance.</em></p><p>When people discover that I have a PhD in mathematics, one of the most common reactions is, <em>&#8220;you must be a genius.&#8221;</em></p><p>Let me tell you, this could not be further from the truth.</p><p>I was never a mathlete or a brilliant researcher. I am a slow thinker. I have problems with mental arithmetic.</p><p>However, three redeeming qualities helped me get where I am today. I am</p><ul><li><p>emotionally resilient,</p></li><li><p>curious,</p></li><li><p>and a hard worker.</p></li></ul><p>Without these, no amount of brilliance can put you at the top or even in the middle of the pack.</p><div><hr></div><p><strong>Ten.</strong> <em>Don&#8217;t give too much credit to advisors and professors. They are people like you, and experience is the only thing they have over you.</em></p><p>I follow quite a few notable people who have successfully realized their vision, solving problems to move our world forward. Their audiences are full of the young and motivated, yearning to carve out their piece of the pie, failing to realize that the only path they can walk is their own.</p><p>What worked for Elon Musk or Paul Erd&#337;s might not work for you.</p><p>Why? 
Because you have a different personality, a different socio-economic background, and potentially a different zeitgeist.</p><p><em>&#8220;No man ever steps in the same river twice, for it&#8217;s not the same river and he&#8217;s not the same man.&#8221;</em></p><div><hr></div><p>Thanks for reading! If you have enjoyed the post, consider supporting me with a paid subscription. I&#8217;m an independent researcher/content creator, and your support is what makes work like this one possible.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://thepalindrome.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://thepalindrome.org/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[Introduction to Object-Oriented Programming in Python]]></title><description><![CDATA[The true way of doing stuff with data]]></description><link>https://thepalindrome.org/p/introduction-to-object-oriented-programming</link><guid isPermaLink="false">https://thepalindrome.org/p/introduction-to-object-oriented-programming</guid><dc:creator><![CDATA[Stephen Gruppetta]]></dc:creator><pubDate>Fri, 08 May 2026 12:29:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4dfc98cb-b18b-4311-b358-ed04e8491619_1672x941.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey!</p><p>It&#8217;s Tivadar from The Palindrome. Back when I wrote the <em>Mathematics of Machine Learning</em> book, I realized</p><ol><li><p>what a great language Python is,</p></li><li><p>and that object-oriented Python should be the first thing taught to every machine learning engineer.</p></li></ol><p>So, I&#8217;ve been thinking about publishing a series, but my Python skills are not exactly top-of-the-line; I just hack and slash until things work. 
Fortunately, I found the best person who could do that, and even better, he agreed to write a special article for you!</p><p>It&#8217;s my pleasure to introduce <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Stephen Gruppetta&quot;,&quot;id&quot;:120170782,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ca736a83-f5a1-4563-ac6c-c09a9e6fa351_800x800.png&quot;,&quot;uuid&quot;:&quot;4a700fa6-f573-43a7-bb02-df1279787ada&quot;}" data-component-name="MentionToDOM"></span>, author of <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;The Python Coding Stack&quot;,&quot;id&quot;:1563052,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/thepythoncodingstack&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ab4a59e8-e362-456b-8427-934e87c31a0d_600x600.png&quot;,&quot;uuid&quot;:&quot;26bcf8de-47da-4533-a58b-b49266ccd00c&quot;}" data-component-name="MentionToDOM"></span>, and my longtime online friend from back when X was called Twitter.</p><p>If you ever wanted to become a power user of Python and take advantage of all the heavy machinery provided by classes, operator overloading, inheritance, composition, etc., this is the article for you.</p><p>Dig in!</p><p>Cheers,<br>Tivadar</p><div><hr></div><p><em>&#8220;A computer program stores data and does stuff with the data.&#8221;</em></p><p>This is not the most technical definition of a computer program you&#8217;ll see. But it&#8217;s a valid one. When you learn to code, you learn about data structures to store different types of data. And you also learn about tools needed to manipulate and transform the data. 
You often define functions containing code to &#8220;do stuff&#8221; with the data.</p><p>Your code will contain data structures and functions, and you pass those data structures to the functions when needed.</p><p>Object-oriented programming (OOP) brings these two aspects together into a single unit. This unit is the object, which contains the data and the tools needed to manipulate the data. This may not sound like much, but it enables you to think differently about the problem you&#8217;re trying to solve. You can visualize your problem in a way closer to how humans see the world.</p><p>Let&#8217;s look at some examples, starting with a concrete one. Consider a country. There&#8217;s plenty of data relevant to a country: its name, population size, geographical area, capital city, and more. All countries have these attributes. Therefore, you can create a template that includes these attributes and reuse it each time you want to represent a different country. Countries also include people, so these can be included in the data for each country, too.</p><p>But countries also perform actions. They issue passports, they collect taxes, they create laws, and so on. OOP urges you to think of a single unit to represent a country, which includes all the data and tools needed for the country to perform the actions required. You&#8217;d create a class called <code>Country</code>. This is the template you need to create lots of countries. The class doesn&#8217;t represent a specific country but the idea of a country. Once you define the class, you can create as many instances of the class as you need. Each instance represents a specific country.</p><h2>Classes and Instances &#8226; The Vector Class</h2><p>But let&#8217;s work on a different example in this article. Let&#8217;s consider a vector. One way to view vectors is as entities with both magnitude and direction. 
In three-dimensional (3D) Euclidean geometry, a vector is represented by three numbers.</p><p>Let&#8217;s put on our OOP hat. We need a unit in our program to represent a vector. It needs to represent its data and its functionality. Let&#8217;s start by creating a class called Vector:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u3p3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u3p3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 424w, https://substackcdn.com/image/fetch/$s_!u3p3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 848w, https://substackcdn.com/image/fetch/$s_!u3p3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 1272w, https://substackcdn.com/image/fetch/$s_!u3p3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u3p3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png" width="1456" height="294" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91aa0151-710d-4297-8136-508e350e1718_2120x428.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:294,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;code&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="code" title="code" srcset="https://substackcdn.com/image/fetch/$s_!u3p3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 424w, https://substackcdn.com/image/fetch/$s_!u3p3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 848w, https://substackcdn.com/image/fetch/$s_!u3p3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 1272w, https://substackcdn.com/image/fetch/$s_!u3p3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91aa0151-710d-4297-8136-508e350e1718_2120x428.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Admittedly, this class doesn&#8217;t do much for now. The ellipsis (...) is just a placeholder that&#8217;s valid Python syntax. You&#8217;ll add more to this class soon. In the previous section, I mentioned that a class is a template for creating many objects modelled from the same blueprint. 
Let&#8217;s create a few instances of this class:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V99G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V99G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 424w, https://substackcdn.com/image/fetch/$s_!V99G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 848w, https://substackcdn.com/image/fetch/$s_!V99G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 1272w, https://substackcdn.com/image/fetch/$s_!V99G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V99G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png" width="1456" height="420" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/71034348-fb22-4f22-a654-2652fab011c1_2120x611.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;code&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="code" title="code" srcset="https://substackcdn.com/image/fetch/$s_!V99G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 424w, https://substackcdn.com/image/fetch/$s_!V99G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 848w, https://substackcdn.com/image/fetch/$s_!V99G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 1272w, https://substackcdn.com/image/fetch/$s_!V99G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71034348-fb22-4f22-a654-2652fab011c1_2120x611.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There&#8217;s only one Vector class. But now you have two instances of this class. You create an instance of the class when you add parentheses after the class name. The objects referenced by v1 and v2 are separate objects that occupy different areas of your computer&#8217;s memory. You can confirm that they&#8217;re different objects by showing their identity using Python&#8217;s <code>id()</code> function. 
The two instances of the Vector class have different identities:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2JPb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2JPb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 424w, https://substackcdn.com/image/fetch/$s_!2JPb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 848w, https://substackcdn.com/image/fetch/$s_!2JPb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 1272w, https://substackcdn.com/image/fetch/$s_!2JPb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2JPb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png" width="1456" height="527" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:527,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;code&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="code" title="code" srcset="https://substackcdn.com/image/fetch/$s_!2JPb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 424w, https://substackcdn.com/image/fetch/$s_!2JPb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 848w, https://substackcdn.com/image/fetch/$s_!2JPb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 1272w, https://substackcdn.com/image/fetch/$s_!2JPb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8126d64e-c245-4d07-8973-70caa9365ed6_2120x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>You&#8217;ll get different values from the ones shown here when you run this code on your computer. But what matters is that the two numbers you get are different from each other. You can also confirm that <code>v1</code> and <code>v2</code> represent different objects by evaluating <code>v1 is v2</code>, which returns <code>False</code>.</p><p>Note that the terms <em>object</em> and <em>instance</em> are both commonly used to refer to the unit created by a class. They refer to the same thing.</p><p>However, these are &#8220;blank&#8221; objects. They don&#8217;t have anything beyond the bare minimum a Python object needs. Let&#8217;s add some data.</p>
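<p>Since the listings above are images, here is a minimal text sketch of the identity check they describe. It assumes <code>Vector</code> is still an empty class at this point; that definition and the extra name <code>v3</code> are illustrative assumptions, not the exact code from the screenshots.</p>

```python
class Vector:
    """Empty placeholder class, assumed to mirror the 'blank' Vector so far."""
    pass


# Adding parentheses after the class name creates a new instance each time.
v1 = Vector()
v2 = Vector()

# id() returns an integer identity; the values vary from run to run,
# but they differ for two objects that exist at the same time.
print(id(v1), id(v2))

# The `is` operator compares identities directly.
print(v1 is v2)

# Hypothetical extra name: v3 refers to the same object as v1, so no new instance is created.
v3 = v1
print(v3 is v1)
```

<p>The last two lines show the other side of the same idea: assigning an existing object to a new name does not create a second instance, so <code>v3 is v1</code> evaluates to <code>True</code>.</p>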
      <p>
          <a href="https://thepalindrome.org/p/introduction-to-object-oriented-programming">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Explore LLM word representations using similarity analysis (part 1)]]></title><description><![CDATA[A hands-on introduction to representational similarity analysis (RSA) with GPT-2 and BERT embeddings]]></description><link>https://thepalindrome.org/p/explore-llm-word-representations</link><guid isPermaLink="false">https://thepalindrome.org/p/explore-llm-word-representations</guid><dc:creator><![CDATA[Mike X Cohen, PhD]]></dc:creator><pubDate>Wed, 22 Apr 2026 12:43:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/c50dc465-a555-4819-b83e-d9d4f29dba7c_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Hey! It&#8217;s Tivadar.</em></p><p><em><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Mike X Cohen, PhD&quot;,&quot;id&quot;:382604135,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c804d93-69c2-49a9-a797-2216b4bae5ba_1000x1000.jpeg&quot;,&quot;uuid&quot;:&quot;b23ff038-6b40-48bb-ab4c-b4d5522ff932&quot;}" data-component-name="MentionToDOM"></span> returns to The Palindrome! You know I&#8217;m a big fan of his work, and if you are into machine learning, you should be too. His posts always strike the perfect balance between educational, practical, and entertaining.</em></p><p><em>He recently published the book <a href="https://github.com/mikexcohen/ML4LLM_book">50 ML Projects to Understand LLMs</a>, and his upcoming two-part series on exploring word representations is taken directly from the book. If you want to understand how Large Language Models work under the hood, don&#8217;t miss the post below.</em></p><p><em>Enjoy!</em></p><p><em>Cheers,<br>Tivadar</em></p>
      <p>
          <a href="https://thepalindrome.org/p/explore-llm-word-representations">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[I Built the Knowledge Graph of Machine Learning]]></title><description><![CDATA[Exploring the structure of machine learning]]></description><link>https://thepalindrome.org/p/i-built-the-knowledge-graph-of-machine</link><guid isPermaLink="false">https://thepalindrome.org/p/i-built-the-knowledge-graph-of-machine</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Sun, 19 Apr 2026 07:46:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/WR-VyH0pIgs" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey! It&#8217;s Tivadar from The Palindrome.</p><p><em>&#8220;How to get started in machine learning?&#8221;</em> is one of the most common questions I get. I have a couple of default answers, but they are based more on my personal experience than on science.</p><p>Inspired by this, I&#8217;ve mapped out the knowledge graph of machine learning, building a hierarchy of concepts that can guide you from the foundations to the state of the art.</p><p>A couple of fascinating patterns have emerged from my journey: the thin spine of mathematics that holds up the entire knowledge graph, the central concepts like gradient descent that enable modern machine learning as we know it, and more.</p><p>Here&#8217;s the video where I talk about my findings:</p>
      <p>
          <a href="https://thepalindrome.org/p/i-built-the-knowledge-graph-of-machine">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[This Week at The Palindrome]]></title><description><![CDATA[Finishing up with knowledge graphs and building more interactive tools]]></description><link>https://thepalindrome.org/p/this-week-at-the-palindrome</link><guid isPermaLink="false">https://thepalindrome.org/p/this-week-at-the-palindrome</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Sat, 11 Apr 2026 09:42:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xKLg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462454e3-5f77-4889-b287-ebc2750db03b_1345x770.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey! It&#8217;s Tivadar from The Palindrome. It&#8217;s time for an update.</p><p>I&#8217;m finishing up my video about <a href="https://the-palindrome.github.io/ml-knowledge-graph/">the knowledge graph of machine learning</a>, which will be released next week; I&#8217;ll do a live premiere right here on Substack Live, with a discussion after the viewing. (<a href="https://open.substack.com/live-stream/160251">You can join here.</a>)</p><p>This video will mark a milestone for me: instead of relying on hand-crafted slides and a presentation-style exposition, I built a scripting engine on top of the knowledge graph explorer that turns a JSON script like</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;json&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-json">[{
  "at": 0.0,
  "action": "autoRotate",
  "axis": "y",
  "speed": 0.01,
  "duration": 9.3,
  "windDown": 2.0,
  "easing": "linear"
},
{
  "at": 9.3,
  "action": "selectNode",
  "nodeId": "gpt",
  "showPrerequisites": true,
  "showDependents": false,
  "duration": 2.0
}]</code></pre></div><p>into a beautifully rendered video.</p><p>Here&#8217;s a sneak peek:</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;f441ec90-5bce-463d-a7a6-504a23d70b56&quot;,&quot;duration&quot;:null}"></div><p>The machine learning knowledge graph project could serve as a template for my future content. From now on, I&#8217;ll go full multimodal, meaning that I&#8217;ll</p><ul><li><p>build interactive visualizations (such as <a href="https://the-palindrome.github.io/ml-knowledge-graph/">the knowledge graph explorer</a>),</p></li><li><p>then write posts and record videos, aided by the interactive tool.</p></li></ul><p>Now that the current video-in-progress is about to be finished, what&#8217;s next?</p><p>Read on.</p>
      <p>
          <a href="https://thepalindrome.org/p/this-week-at-the-palindrome">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Power of Mathematical Modeling]]></title><description><![CDATA[What do online rumors, computer viruses and zombie apocalypses have in common?]]></description><link>https://thepalindrome.org/p/the-power-of-mathematical-modeling</link><guid isPermaLink="false">https://thepalindrome.org/p/the-power-of-mathematical-modeling</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Tue, 07 Apr 2026 11:03:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QFjt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5475cbc3-22c6-4279-a65a-f2a61e3d6f71_759x506.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey! It&#8217;s Tivadar from The Palindrome.</p><p>This week, it&#8217;s my pleasure to introduce <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Manlio De Domenico, Ph.D.&quot;,&quot;id&quot;:38842368,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/537de500-db20-4bcd-a894-5ef6226bbf13_1080x1080.jpeg&quot;,&quot;uuid&quot;:&quot;8477290e-4570-449a-b642-328a24f2b0f1&quot;}" data-component-name="MentionToDOM"></span>, a fellow scholar working at the forefront of physics, mathematics, and computer science.</p><p>One of the main reasons behind the success of modern science is mathematical modeling, the process of translating complex real-life observations into a language that allows us to generalize, understand, and predict.</p><p>If you have enjoyed this post, subscribe to his newsletter <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Complexity 
Thoughts&quot;,&quot;id&quot;:1183925,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/manlius&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d142d85-7836-48c2-8e36-664af3a7d8ef_1280x1280.png&quot;,&quot;uuid&quot;:&quot;943a9401-ffb7-4d44-a5dc-b0d8c29eade4&quot;}" data-component-name="MentionToDOM"></span>, a space dedicated to translating the complexity of the empirical world, from your cells to entire societies, into language that is <strong>as simple as possible, though not necessarily simpler.</strong></p>
      <p>
          <a href="https://thepalindrome.org/p/the-power-of-mathematical-modeling">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Explore Machine Learning as a Knowledge Graph]]></title><description><![CDATA[And see how everything connects]]></description><link>https://thepalindrome.org/p/explore-machine-learning-as-a-knowledge</link><guid isPermaLink="false">https://thepalindrome.org/p/explore-machine-learning-as-a-knowledge</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Mon, 30 Mar 2026 08:25:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kyQK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13131e5c-5c36-4833-9ffe-4b89bf9e8f03_898x850.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>TL;DR: <em>Now you can play around with the <a href="https://the-palindrome.github.io/ml-knowledge-graph/">Machine Learning Knowledge Graph Explorer</a> I&#8217;ve been building. Check it out; it&#8217;s awesome.</em></p>
      <p>
          <a href="https://thepalindrome.org/p/explore-machine-learning-as-a-knowledge">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[This Week at The Palindrome (2025, Week 13)]]></title><description><![CDATA[Knowledge graphs and Minecraft epidemics]]></description><link>https://thepalindrome.org/p/this-week-at-the-palindrome-2025</link><guid isPermaLink="false">https://thepalindrome.org/p/this-week-at-the-palindrome-2025</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Fri, 27 Mar 2026 17:16:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!YMWO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F951085b5-b67c-44c0-970e-2943d5579254_926x875.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey! It&#8217;s Tivadar from The Palindrome.</p><p>Let&#8217;s try something new. In recent months, I started to feel that the weekly publishing schedule takes its toll on the quality of my posts. I want to take more time per post to give you some high-quality technical content. I&#8217;m inspired by amazing writers such as Sebastian Raschka (<span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Ahead of AI&quot;,&quot;id&quot;:1174659,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/sebastianraschka&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png&quot;,&quot;uuid&quot;:&quot;030fabf3-a6a7-4feb-90dd-5f95e98572b6&quot;}" data-component-name="MentionToDOM"></span>) or Cameron R. 
Wolfe (<span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;id&quot;:1092659,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/cameronrwolfe&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;uuid&quot;:&quot;20698ad0-6f62-44dd-957b-39485eb37f8f&quot;}" data-component-name="MentionToDOM"></span>), who publish once a month, but each post is a work of art.</p><p>On the other hand, I miss you. A monthly schedule feels too long, and I have a lot to share with you. Ever since I got my ChatGPT Max subscription with access to Codex, my creativity has known no bounds.</p><p>So, here&#8217;s a new format. Each week, I&#8217;m sending you my unfiltered stream of consciousness, all the projects that I&#8217;m currently working on. Think of it as joining me for a coffee, where we talk about all the exciting/revolutionary/insane ideas we have in our minds.</p><p>This week, there are two things on my mind: knowledge graphs and Minecraft.</p><p>Let&#8217;s start with knowledge graphs.</p><h1>The Complete Map of Machine Learning</h1><p>If you are a regular reader, you know that one of the most common questions I get is <em>&#8220;which part of mathematics do I need to study machine learning?&#8221;</em> My default answer, based on my decade of experience, is: a ton of linear algebra, a decent amount of calculus, and a snippet of probability theory.</p><p>To be honest, I&#8217;m not completely satisfied with my reply. So, I dug deep with my newfound agentic AI-fueled superpower to find out what is scientifically backed.</p><p>Without further ado: here&#8217;s the full knowledge graph of mathematics and machine learning.
2081 nodes, 5149 edges.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pN9B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pN9B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 424w, https://substackcdn.com/image/fetch/$s_!pN9B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 848w, https://substackcdn.com/image/fetch/$s_!pN9B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 1272w, https://substackcdn.com/image/fetch/$s_!pN9B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pN9B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png" width="728" height="607.8947368421053" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:825,&quot;width&quot;:988,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:493437,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thepalindrome.org/i/192289224?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pN9B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 424w, https://substackcdn.com/image/fetch/$s_!pN9B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 848w, https://substackcdn.com/image/fetch/$s_!pN9B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 1272w, https://substackcdn.com/image/fetch/$s_!pN9B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60918a9d-6076-4b86-879b-cc39b7454f06_988x825.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The knowledge graph of machine learning</figcaption></figure></div><p>(The images are screenshots from the interactive graph explorer I&#8217;m building, which will be open source and publicly available.)</p><p>Let&#8217;s unravel this.</p>
      <p>
          <a href="https://thepalindrome.org/p/this-week-at-the-palindrome-2025">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Mathematics of Machine Learning workshop]]></title><description><![CDATA[Watch now | The full recording of the workshop]]></description><link>https://thepalindrome.org/p/mathematics-of-machine-learning-workshop</link><guid isPermaLink="false">https://thepalindrome.org/p/mathematics-of-machine-learning-workshop</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Sun, 22 Mar 2026 08:23:46 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a06820a5-38c7-4999-825d-0dba24ca1159_1920x1008.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey!</p><p>Yesterday we concluded our first monthly workshop! As promised, here is the full recording, exclusive to paid subscribers.</p><p>You can access the Jupyter Notebook lecture notes here: <a href="https://github.com/the-palindrome/mathematics-of-machine-learning-workshop">https://github.com/the-palindrome/mathematics-of-machine-learning-workshop</a></p><p>The next monthly workshop is already in the works; it&#8217;s going to be the next iteration of the Neural Networks From Scratch course. The tentative date is April 18th, 15:00&#8211;19:00 CET, but stay tuned for the announcement, as this date might change.</p><p>Thanks again so much for attending!</p><p>Cheers,<br>Tivadar</p>
      <p>
          <a href="https://thepalindrome.org/p/mathematics-of-machine-learning-workshop">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Machine Learning From Zero, Chapter 01]]></title><description><![CDATA[Machine Learning From Zero, Chapter 01]]></description><link>https://thepalindrome.org/p/what-is-machine-learning</link><guid isPermaLink="false">https://thepalindrome.org/p/what-is-machine-learning</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Sat, 14 Mar 2026 14:43:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2554ada4-39ba-41ae-80ca-41a491e2e5e8_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey! This is Tivadar from The Palindrome.</p><p>I&#8217;m finally working on my upcoming book <em>Machine Learning From Zero</em>, the sequel to <em>Mathematics of Machine Learning</em>.</p><p>The first chapter has just been finished; it lays the foundations for the neat stuff, like implementing neural networks from scratch (which is the core of the book).</p><p>Without further ado, here&#8217;s an exclusive preview.</p><p>Enjoy!</p><p>P.S. I&#8217;m writing this book in Jupyter Notebooks, and I turn them into Substack posts with <a href="https://notebookpress.xyz/">NotebookPress</a>, a tool I&#8217;m building to bring technical writing on Substack to the next level. If you write math-and-code-heavy content, you should check it out.</p><div><hr></div><p><em>Machine learning is training predictive models from data.</em></p><p>Sure, we can be academic about it and refine the definition of machine learning by looking at the countless nuances, but that&#8217;s not what we are here to do. We are here to understand the core fundamentals of machine learning &#8212; the fundamentals that will take you further than anything else.</p><p>I believe the only way to acquire deep knowledge of any technical subject is to take it apart and put it back together again.
<em>This</em> is what we are here to do.</p><p>The fundamental machine learning setup consists of:</p><ol><li><p>a dataset, usually coming in the form of input and target variables,</p></li><li><p>a parametric function that models the relation between the input and the target variables,</p></li><li><p>and a loss function that measures the model&#8217;s fit to the data.</p></li></ol><p>Let&#8217;s start with the data.</p><p>To look behind the curtain of machine learning algorithms, we have to precisely formulate the problems that we deal with. Three important parameters determine a machine learning paradigm: the input, the output, and the training data.</p><p>All machine learning tasks boil down to finding a model that provides additional insight into the data, i.e., a function <em>f</em> that transforms the input <em>x</em> into the useful representation <em>y</em>. This can be a prediction, an action to take, a high-level feature representation, and many more. We&#8217;ll learn about all of them.</p><p>Mathematically speaking, the basic machine learning setup consists of:</p><ol><li><p>a dataset &#119967;,</p></li><li><p>a function <em>f</em> that describes the true relation between the input and the output,</p></li><li><p>and a parametric model <em>h</em> &#8212; also called a <em>hypothesis</em> &#8212; that serves as our estimation of <em>f</em>.</p></li></ol><div><hr></div><p><strong>Remark.</strong> <em>(Common abuses of machine learning notation.)</em></p><p><em>Note that although the function f only depends on the input x, the parametric model h also depends on the parameters and the training dataset.</em></p><p><em>Thus, it is customary to write h(x) as h(x; w, &#119967;), where w represents the parameters, and &#119967; is our training dataset.</em></p><p><em>This dependence is often omitted, but keep in mind that it&#8217;s always there.</em></p><div><hr></div><p>We place no restrictions on how the model <em>h</em> is constructed.
It can be a deterministic function like <em>h</em>(<em>x</em>) = &#8722;13.2 <em>x</em>&#178; + 0.92 <em>x</em> + 3.0 or a probability distribution <em>h</em>(<em>x</em>) = <em>P</em>(<em>Y</em> = <em>y</em> &#8739; <em>X</em> = <em>x</em>). Models have all kinds of families like <em>generative</em>, <em>discriminative</em>, and more. We&#8217;ll talk about them in detail; in fact, models will be the focal points of the majority of the chapters.</p><p>First, let&#8217;s focus on the paradigms themselves. There are four major ones:</p><ul><li><p>supervised learning,</p></li><li><p>unsupervised learning,</p></li><li><p>semi-supervised learning,</p></li><li><p>and reinforcement learning.</p></li></ul><p>What are these?</p><h2>Supervised learning</h2><p>The most common paradigm is <em>supervised learning</em>. There, we have inputs &#119857;&#7522; &#8712; &#8477;&#7504; and ground truth labels y&#7522; that form our training dataset</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!voUV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!voUV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 424w, https://substackcdn.com/image/fetch/$s_!voUV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 848w, 
https://substackcdn.com/image/fetch/$s_!voUV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 1272w, https://substackcdn.com/image/fetch/$s_!voUV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!voUV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png" width="1456" height="179" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:179,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;math&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="math" title="math" srcset="https://substackcdn.com/image/fetch/$s_!voUV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 424w, https://substackcdn.com/image/fetch/$s_!voUV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 848w, 
https://substackcdn.com/image/fetch/$s_!voUV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 1272w, https://substackcdn.com/image/fetch/$s_!voUV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ec88697-eef9-4b76-ab01-752340845aa9_1920x236.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Although the labels can be anything, such as numbers or text, they are all available to us. The goal is to construct a function that models the relationship between the input <em>x</em> and the target variable <em>y</em>.</p><p>Let&#8217;s saddle up and see a couple of examples.</p>
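<p>Before the examples, the three ingredients of the setup (a dataset, a parametric model <em>h</em>, and a loss function) can be sketched in a few lines of plain Python. This is a minimal illustration under my own assumptions: the toy dataset, the linear form of <code>h</code>, and the mean squared error loss are hypothetical choices made for the sketch, not something the chapter prescribes.</p>

```python
# Toy training dataset of (input, label) pairs.
# Hypothetical values, generated from y = 2x + 1 purely for illustration.
dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def h(x, w):
    """Parametric model h(x; w): a line with slope w[0] and intercept w[1]."""
    return w[0] * x + w[1]


def mse_loss(w, data):
    """Mean squared error between the predictions h(x; w) and the labels y."""
    return sum((h(x, w) - y) ** 2 for x, y in data) / len(data)


good_w = (2.0, 1.0)  # the parameters that generated the toy labels
bad_w = (0.0, 0.0)   # a model that always predicts zero

print(mse_loss(good_w, dataset))  # 0.0
print(mse_loss(bad_w, dataset))   # 21.0
```

<p>The loss is what lets us compare two parameter choices: the lower value for <code>good_w</code> says that this <em>h</em> fits the labeled data better, which is the sense in which the ground truth labels supervise the model.</p>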
      <p>
          <a href="https://thepalindrome.org/p/what-is-machine-learning">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Let’s Bring Jupyter Notebooks to Substack]]></title><description><![CDATA[From Jupyter Notebook to Substack post in two clicks]]></description><link>https://thepalindrome.org/p/lets-bring-jupyter-notebooks-to-substack</link><guid isPermaLink="false">https://thepalindrome.org/p/lets-bring-jupyter-notebooks-to-substack</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Thu, 05 Mar 2026 12:30:25 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/c4e4cc41-93d5-4fbe-bb07-030c0c7aeea7_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Jupyter Notebooks are my favorite publishing format by far. I write all my posts in them.</p><p>They are the perfect medium for math-and-code-heavy technical content: they support LaTeX snippets, code execution, and, to top it all, enable interactive exploration. Every time I&#8217;m reading a hands-on tutorial about some fancy new framework, I cannot resist the urge to jump into edit mode and break the code in ways no author can think of.</p><p>Unfortunately, if you choose to write in Jupyter Notebooks, you either abandon content distribution by platforms such as Substack, LinkedIn, or X (because they don&#8217;t support the format) or manually convert the notebooks to satisfy every possible whim of every possible editor.</p><p>So, I built a tool that enables writers to publish Jupyter Notebooks on Substack (and other platforms) with a couple of clicks. It&#8217;s called <a href="https://notebookpress.xyz/">NotebookPress</a>, and it solves four major pain points:</p><ul><li><p>LaTeX rendering,</p></li><li><p>code snippets,</p></li><li><p>user interactivity,</p></li><li><p>and cross-platform compatibility.</p></li></ul><p>Let me give you a t&#8230;</p>
      <p>
          <a href="https://thepalindrome.org/p/lets-bring-jupyter-notebooks-to-substack">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Accio Insights: The Marauder’s Map of the ML World]]></title><description><![CDATA[A deep dive into the swiss army knife of machine learning]]></description><link>https://thepalindrome.org/p/accio-insights-the-marauders-map</link><guid isPermaLink="false">https://thepalindrome.org/p/accio-insights-the-marauders-map</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Thu, 19 Feb 2026 10:58:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!chXt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff6f3fe5-9a37-40af-97c1-226ba20f247d_480x320.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi there! It&#8217;s Tivadar from The Palindrome.</p><p>Please welcome <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Sairam Sundaresan&quot;,&quot;id&quot;:85853406,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!3vud!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79cc4b2d-3161-4743-85d8-97910007711b_1463x1463.jpeg&quot;,&quot;uuid&quot;:&quot;bc15a281-1e09-4497-a6e2-0e4c8ef5204a&quot;}" data-component-name="MentionToDOM"></span>, author of the brilliant <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Gradient Ascent&quot;,&quot;id&quot;:1199871,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/artofsaience&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/01dfb858-3107-4656-b289-cf13de969a17_800x800.png&quot;,&quot;uuid&quot;:&quot;ce83f3e6-f9b6-4a5c-89ac-1b7d07b9f0fd&quot;}" data-component-name="MentionToDOM"></span> Substack. 
I&#8217;ve been following his work for years, and I&#8217;m honored to have him here for a guest post.</p><p>By the way, <a href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Tivadar">he is hosting a workshop on February 28th titled &#8220;Machine Learning and Generative AI System Design,&#8221;</a> and he has kindly offered a 35% discount for readers of <em>The Palindrome</em>.</p><p>The code <strong>TIVADAR35</strong> is valid until February 24th.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!COP6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!COP6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 424w, https://substackcdn.com/image/fetch/$s_!COP6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 848w, https://substackcdn.com/image/fetch/$s_!COP6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 1272w, https://substackcdn.com/image/fetch/$s_!COP6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!COP6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp" width="1179" height="578" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:578,&quot;width&quot;:1179,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!COP6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 424w, https://substackcdn.com/image/fetch/$s_!COP6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 848w, https://substackcdn.com/image/fetch/$s_!COP6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 1272w, https://substackcdn.com/image/fetch/$s_!COP6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe2d1426-cd58-44f5-9ac8-a90df07f85ec_1179x578.webp 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Tivadar&quot;,&quot;text&quot;:&quot;Reserve Your Seat&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Tivadar"><span>Reserve Your Seat</span></a></p><p>Now, I&#8217;ll pass the mic to Sairam.</p><p>Enjoy!</p><p>Cheers,<br>Tivadar</p>
      <p>
          <a href="https://thepalindrome.org/p/accio-insights-the-marauders-map">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Palindrome in 2026]]></title><description><![CDATA[What's coming]]></description><link>https://thepalindrome.org/p/the-palindrome-in-2026</link><guid isPermaLink="false">https://thepalindrome.org/p/the-palindrome-in-2026</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Sun, 15 Feb 2026 07:50:11 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/01a35c2b-d6bb-4a35-8a50-1ac8d52b15a3_1920x1008.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey! It's Tivadar.</p><p>Yes, I know that it is February. It took me a bit longer to write this post.</p><p>I also know that your inbox is (<em>was</em>) overloaded with triumphant 2025 reviews and grand plans for 2026. To respect your time, here's a no-bullshit summary of what you'll get from The Palindrome this year, and if you want the details, just read on.</p><p><strong>All subscribers:</strong></p><ul><li><p>I'm finishing my Machine Learning From Zero book this year, where we'll implement all the fundamental algorithms from scratch. This'll be the topic for my technical posts.</p></li><li><p>I'll do more explainer-style videos, <a href="https://youtu.be/PB-1_JTHyEU?si=-7o5KsgJjA_WYRyc">like this one from the Matrices and Graphs post</a>.</p></li></ul><p><strong>Paid subscribers:</strong></p><ul><li><p>Monthly live workshops, streamed right here on Substack.</p></li><li><p>First workshop: Mathematics of Machine Learning, March 7th, 15:00 - 20:00 CET.</p></li><li><p>Second workshop: Neural Networks from Scratch, date TBD.</p></li></ul>
      <p>
          <a href="https://thepalindrome.org/p/the-palindrome-in-2026">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Day My Project Went to Space]]></title><description><![CDATA[From whiteboard to orbit to science]]></description><link>https://thepalindrome.org/p/the-day-my-project-went-to-space</link><guid isPermaLink="false">https://thepalindrome.org/p/the-day-my-project-went-to-space</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Mon, 02 Feb 2026 12:33:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!C5XA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff444b75c-8d7c-4ceb-ab1c-b20622d27b5f_1600x1200.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi there! It&#8217;s Tivadar from The Palindrome.</p><p>Today&#8217;s post is a very special one, written by my friend Mikl&#243;s, whom I met during our PhD years. (Which was more than ten years ago. I feel old.) He is one of the smartest people I know, and he&#8217;s been doing impressive research projects since then.</p><p>One of his latest projects made the news recently, because the data collection took place on the International Space Station (ISS). This is interesting in itself, but what you rarely see is the &#8220;backend&#8221; side of science: the stuff that doesn&#8217;t make the news but makes or breaks a research project of this scale.</p><p>What follows is a deep-dive report on the entire lifecycle of a space-bound project:</p><ul><li><p>grant proposal (moving from a whiteboard to orbit),</p></li><li><p>agile problem solving (like jumping through hoops to meet Apple Store regulations),</p></li><li><p>stakeholder coordination (managing the logistics between international space agencies),</p></li><li><p>project management (handling &#8220;no second chance&#8221; execution under the pressure of shifting launch windo&#8230;</p></li></ul>
      <p>
          <a href="https://thepalindrome.org/p/the-day-my-project-went-to-space">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Why is the Golden Ratio Hiding in the Fibonacci Sequence?]]></title><description><![CDATA[The non-recursive formula for Fibonacci numbers]]></description><link>https://thepalindrome.org/p/why-is-the-golden-ratio-hiding-in</link><guid isPermaLink="false">https://thepalindrome.org/p/why-is-the-golden-ratio-hiding-in</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Fri, 30 Jan 2026 09:17:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/43220097-ce29-4505-aa4a-da3b6f699577_2090x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The Fibonacci numbers form one of the most famous integer sequences, known for their close connection to the golden ratio, sunflower spirals, the mating habits of rabbits, and several other things.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iE9D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iE9D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!iE9D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!iE9D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 1272w, 
https://substackcdn.com/image/fetch/$s_!iE9D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iE9D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:156485,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thepalindrome.org/i/186281090?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iE9D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!iE9D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 848w, 
https://substackcdn.com/image/fetch/$s_!iE9D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!iE9D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F849eea4a-dec2-4dd5-bcb4-107d3e703b30_3840x2160.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Because of its recursive nature, computing the Fibonacci sequence via brute force is computationally expensive.</p><p>However, 
the Fibonacci numbers have a simple and beautiful closed-form expression written in terms of the golden ratio (&#966;) and the conjugate golden ratio (&#968;).</p>
      <p>
          <a href="https://thepalindrome.org/p/why-is-the-golden-ratio-hiding-in">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[How did the Babylonians know √2 up to six digits?]]></title><description><![CDATA[The greatest known computational accuracy in the ancient world]]></description><link>https://thepalindrome.org/p/how-did-the-babylonians-know-2-up-e2c</link><guid isPermaLink="false">https://thepalindrome.org/p/how-did-the-babylonians-know-2-up-e2c</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Wed, 21 Jan 2026 17:04:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/3O730YTS8Yg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There&#8217;s an ancient Babylonian clay tablet from 1800-1600 BC that contains the square root of two with 99.9999% precision.</p><p>How did they compute it?</p><p>Let me show you:</p>
      <p>
          <a href="https://thepalindrome.org/p/how-did-the-babylonians-know-2-up-e2c">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[I Finally Listened to You and I’m So Glad I Did]]></title><description><![CDATA[(you will be too)]]></description><link>https://thepalindrome.org/p/i-finally-listened-to-you-and-im</link><guid isPermaLink="false">https://thepalindrome.org/p/i-finally-listened-to-you-and-im</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Thu, 15 Jan 2026 19:50:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/PB-1_JTHyEU" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there!</p><p>For the longest time, people have told me that I should start sharing my educational content in video format.</p><p>I&#8217;ve even received compliments that my illustrations resemble those of 3B1B, which is a huge honor.</p><p>So, after a few years of second-guessing whether I should dive into video content, I&#8217;ve finally decided to give it a try&#8230;</p><p>&#8230;and, in the short time I&#8217;ve been doing it, the results have amazed me.</p><h2>Introducing The Palindrome YouTube channel</h2><p>I&#8217;ve had a YouTube account since 2011, but only recently did I start shifting towards the creator side instead of being just a consumer.</p><p>My goal for the next couple of weeks is simple: every fan-favorite post from The Palindrome newsletter and from my social media posts will be turned into a video.</p><p>So far, there are three videos uploaded to the channel. These are sort of the greatest hits of my educational content. The ones that, no matter how many times I republish them, always get a lot of engagement and I keep gett&#8230;</p>
      <p>
          <a href="https://thepalindrome.org/p/i-finally-listened-to-you-and-im">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Camel Principle: Why Adding Zero is the Most Powerful Trick in Mathematics]]></title><description><![CDATA[What it is, how it works, and why it is essential]]></description><link>https://thepalindrome.org/p/the-camel-principle-why-adding-zero</link><guid isPermaLink="false">https://thepalindrome.org/p/the-camel-principle-why-adding-zero</guid><dc:creator><![CDATA[Tivadar Danka]]></dc:creator><pubDate>Mon, 12 Jan 2026 20:23:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/fjMKGkocgaE" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Behold one of the mightiest tools in mathematics: the camel principle.</p><p>I am dead serious. Deep down, this tiny rule is the cog in many methods. Ones that you use every day.</p><p>Here is what it is, how it works, and why it is essential:</p>
      <p>
          <a href="https://thepalindrome.org/p/the-camel-principle-why-adding-zero">
              Read more
          </a>
      </p>
   ]]></content:encoded></item></channel></rss>