From Loops to Klein Bottles: Uncovering Hidden Topology in High Dimensional Data
Gunnar Carlsson, Mattimore Cronin

Motivation: Dimensionality reduction is vital to the analysis of high dimensional data. It allows for a better understanding of the data, so that one can formulate useful analyses. Dimensionality reduction produces a set of points in a vector space of dimension n, where n is much smaller than the number of features N in the data set. If n is 1, 2, or 3, it is possible to visualize the data and obtain insights; if n is larger, that becomes much more difficult. One interesting situation, though, is where the data concentrates around a non-linear surface whose dimension is 1, 2, or 3, but which can only be embedded in a dimension higher than 3. We will discuss such examples in this post.
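As a purely illustrative sketch (our own addition, not taken from the post), the snippet below hides a 1-dimensional loop in a 50-dimensional feature space and recovers it with a 2-dimensional reduction; it assumes numpy and scikit-learn, and the data is synthetic.

```python
# Illustrative sketch: a 1-dimensional loop (a circle) hidden in a
# 50-dimensional feature space, recovered by linear dimensionality reduction.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Sample points on a circle (an intrinsically 1-dimensional, non-linear surface).
t = rng.uniform(0, 2 * np.pi, size=500)
circle = np.column_stack([np.cos(t), np.sin(t)])          # shape (500, 2)

# Embed the circle in a 50-dimensional ambient space via a random orthonormal
# map plus a little noise, mimicking a wide data set with N = 50 features.
embedding, _ = np.linalg.qr(rng.normal(size=(50, 2)))     # orthonormal 50 x 2 map
X = circle @ embedding.T + 0.01 * rng.normal(size=(500, 50))

# Reducing to n = 2 dimensions makes the loop visible again.
X_2d = PCA(n_components=2).fit_transform(X)
print(X_2d.shape)   # (500, 2); a scatter plot of X_2d shows the circle
```

A Klein bottle behaves differently: it is a 2-dimensional surface that cannot be embedded in fewer than 4 dimensions, so no reduction to 3 or fewer dimensions can represent it without self-intersection.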

Read More
Geometry of Features in Mechanistic Interpretability
Gunnar Carlsson, Mattimore Cronin

This post is motivated by the observation in Open Problems in Mechanistic Interpretability by Sharkey, Chughtai, et al. that “SDL (sparse dictionary learning) leaves feature geometry unexplained”, and that it is desirable to utilize geometric structures to gain interpretability for sparse autoencoder features.

We strongly agree, and the goal of this post is to describe one method for imposing such structures on data sets in general. Of course, it applies in particular to the case of sparse autoencoder features in LLMs. The need for geometric structures on feature sets arises generally in the data science of wide data sets (those with many columns), such as the activation data sets of complex neural networks. We will give some examples from the life sciences, and conclude with one derived from LLMs.
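The post's own construction is described in the full article; as one hedged example of what "imposing geometric structure" on a feature set can mean in practice, the sketch below builds a k-nearest-neighbor graph on a matrix of feature vectors under cosine distance. The synthetic feature matrix, the choice of k, and the use of scikit-learn are assumptions made purely for illustration.

```python
# Minimal sketch (an assumption, not necessarily the post's exact method):
# impose geometric structure on a feature set via a k-nearest-neighbor graph.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Stand-in for a feature set; rows could be, e.g., SAE decoder directions.
features = rng.normal(size=(1000, 128))        # 1000 features, 128 dimensions

# Connect each feature to its k nearest neighbors under cosine distance.
k = 10
nn = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(features)
distances, indices = nn.kneighbors(features)

# Edge list of the resulting graph (skip column 0, which is the point itself).
edges = [(i, int(j)) for i, row in enumerate(indices) for j in row[1:]]
print(len(edges))   # 1000 * k edges; this graph carries the geometric structure
```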

Read More
Topological Data Analysis and Mechanistic Interpretability
Gunnar Carlsson, Mattimore Cronin

In this post, we’ll look at some ways to use topological data analysis (TDA) for mechanistic interpretability.

We’ll first show how one can apply TDA in a very simple way to the internals of convolutional neural networks to obtain information about the “responsibilities” of the various layers, as well as about the training process. For LLMs, though, applying TDA directly to “raw” weights or activations yields limited insight, and one needs additional methods such as sparse autoencoders (SAEs) to obtain useful information about the internals. We will discuss this methodology, and give a few initial examples where TDA helps reveal structure in SAE feature geometry.
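As a minimal, self-contained illustration of the kind of TDA computation involved (not the specific analyses from the post), the sketch below computes persistent homology of a synthetic point cloud standing in for activation vectors. It assumes the open-source ripser package (pip install ripser) and numpy; the data and dimensions are invented for the example.

```python
# Minimal sketch: persistent homology of a point cloud, e.g. activation
# vectors from one layer. Uses the open-source `ripser` package.
import numpy as np
from ripser import ripser

rng = np.random.default_rng(0)

# Stand-in for activation data: 300 points sampled near a loop in 64 dimensions.
t = rng.uniform(0, 2 * np.pi, size=300)
basis, _ = np.linalg.qr(rng.normal(size=(64, 2)))          # orthonormal 64 x 2 map
X = np.column_stack([np.cos(t), np.sin(t)]) @ basis.T + 0.05 * rng.normal(size=(300, 64))

# Persistence diagrams in degrees 0 and 1; a long-lived degree-1 bar signals a loop.
diagrams = ripser(X, maxdim=1)["dgms"]
h1 = diagrams[1]
print("most persistent 1-cycle lifetime:", float(np.max(h1[:, 1] - h1[:, 0])))
```

A long-lived bar in the degree-1 diagram is evidence of a loop in the data, which is the kind of structure the post looks for in SAE feature geometry.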

Read More