Mapping an MLP in Terms of Tokens

If you look at the elements of the residual stream that correspond to a specific token, and then examine which MLP inputs map to which MLP outputs, how much of that mapping can be described as a simple program that checks which token directions are present and then runs a shallow computation to produce the output token directions?
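One way to make the question concrete is a minimal sketch, not the author's method: feed each token direction through an MLP and read the output back in terms of token directions. Everything here is an assumption for illustration: the weights are random rather than taken from a trained model, the same hypothetical `token_dirs` matrix is used to write tokens into the residual stream and to read them out, and layer norm and attention are ignored. The names `token_map`, `mlp`, and `token_dirs` are made up for this example.

```python
import math
import random


def gelu(x):
    # tanh approximation of GELU, a common MLP nonlinearity in LLMs
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def random_matrix(rows, cols, rng, scale=1.0):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def mlp(x, w_in, w_out):
    # Two-layer MLP without biases: w_out @ gelu(w_in @ x)
    hidden = [gelu(h) for h in matvec(w_in, x)]
    return matvec(w_out, hidden)


def token_map(token_dirs, w_in, w_out, top_k=2):
    # For each input token direction, list the top_k token directions
    # that the MLP output aligns with most strongly (by dot product).
    table = {}
    for tok, direction in enumerate(token_dirs):
        out = mlp(direction, w_in, w_out)
        scores = matvec(token_dirs, out)
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        table[tok] = ranked[:top_k]
    return table


# Toy sizes and random weights (assumptions; a real study would load a trained model)
rng = random.Random(0)
VOCAB, D_MODEL, D_HIDDEN = 6, 8, 16
token_dirs = random_matrix(VOCAB, D_MODEL, rng)
w_in = random_matrix(D_HIDDEN, D_MODEL, rng, scale=0.5)
w_out = random_matrix(D_MODEL, D_HIDDEN, rng, scale=0.5)

table = token_map(token_dirs, w_in, w_out)
for tok, outs in table.items():
    print(tok, "->", outs)
```

The resulting table is the crudest version of a "program": a lookup from one input token to a few output tokens. The open question is how much of a real MLP's behavior survives when it is summarized this way, especially for inputs that combine several token directions at once.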

llm mechinterp MLP

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{holtzman-mapping-an-mlp-2026,
  author = {Holtzman, Ari},
  title = {Mapping an MLP in Terms of Tokens},
  year = {2026},
  url = {https://hypogenic.ai/ideahub/idea/jEMWwJOLbyG4DsqfcPns}
}
