Circuit Tracing in Transformers: Peeking Inside the Black Box

### Title

Circuit Tracing in Transformers: Peeking Inside the Black Box

### Describe your Talk

I will explain what circuit tracing is in large language models like Gemma. It is a recent interpretability method that looks inside a model and checks which internal units (features) activate, and how they influence one another, while the model answers a question. Anthropic open-sourced a library called circuit-tracer, and the Neuronpedia website hosts an interactive frontend for exploring the results, which helps us find features linked to real-world concepts like "Texas" or "capital".

I'll show how this works with a live demo based on their Jupyter notebook. We'll see what nodes and supernodes are, and how they connect to form reasoning paths in the model. This can help us debug models, understand them, and make them safer.
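To give a feel for nodes, supernodes, and reasoning paths before the demo, here is a minimal toy sketch in plain Python. It does not use the circuit-tracer API. The node names, the "Dallas"/"Austin" prompt words, the edge weights, and the helpers `collapse` and `strongest_path` are all hypothetical and made up for illustration. Summing edge weights when merging nodes is a simplifying assumption, not necessarily what the real tooling does.

```python
from collections import defaultdict

# Toy attribution graph: (source node, target node) -> invented edge weight.
# None of these numbers come from a real model.
EDGES = {
    ("token: Dallas", "feature: Texas (a)"): 0.5,
    ("token: Dallas", "feature: Texas (b)"): 0.25,
    ("token: capital", "feature: say a capital"): 0.75,
    ("feature: Texas (a)", "logit: Austin"): 0.5,
    ("feature: Texas (b)", "logit: Austin"): 0.25,
    ("feature: say a capital", "logit: Austin"): 0.5,
}

# Hypothetical manual grouping of related nodes into supernodes.
SUPERNODES = {
    "feature: Texas (a)": "Texas",
    "feature: Texas (b)": "Texas",
    "feature: say a capital": "capital",
}


def collapse(edges, groups):
    """Merge nodes into their supernodes, summing weights of parallel edges."""
    merged = defaultdict(float)
    for (src, dst), weight in edges.items():
        merged[(groups.get(src, src), groups.get(dst, dst))] += weight
    return dict(merged)


def strongest_path(edges, start, goal):
    """Return (score, path) with the largest product of edge weights.

    Assumes the graph is acyclic; returns (0.0, []) if goal is unreachable.
    """
    if start == goal:
        return 1.0, [goal]
    best_score, best_path = 0.0, []
    for (src, dst), weight in edges.items():
        if src != start:
            continue
        score, path = strongest_path(edges, dst, goal)
        if path and score * weight > best_score:
            best_score, best_path = score * weight, [start] + path
    return best_score, best_path


if __name__ == "__main__":
    merged = collapse(EDGES, SUPERNODES)
    score, path = strongest_path(merged, "token: Dallas", "logit: Austin")
    print(" -> ".join(path), score)
```

In this toy graph the two "Texas" features collapse into one supernode, so the path from the input token to the output runs through a single, more readable step.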

### Pre-requisites & reading material

Read about transformers, and about activations and circuits.

### Resources

https://www.anthropic.com/research/open-source-circuit-tracing


### Time required for the talk

25 minutes

### Link to slides/demos

https://docs.google.com/presentation/d/1FNd37jW3nB95lko2imfk6A7VGVG0S0H53hUgWocYJ1g/edit?usp=sharing

### About you

Viraj 

### Availability

21/06/2025

### Any comments

I'll try to make it engaging, and I will post a demo video of the talk.

Circuit Tracing in Transformers: Peeking Inside the Black Box #345