Post-hoc Low-Rank Residual Stream

Suppose we run PCA on a large set of hidden states collected across many layers of an already pre-trained transformer, and then, every time the residual stream is written to, project it onto the low-rank subspace spanned by the first k principal components. What perplexity will we get? What behaviors will remain?
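
To make the projection step concrete, here is a minimal NumPy sketch on synthetic data. It is an illustration under assumptions, not a specification of the experiment: the helper names `fit_pca_basis` and `project_low_rank` are hypothetical, the "hidden states" are random toy vectors rather than real activations, and the choice to center on the mean before projecting is one option among several. Applying this inside an actual model would additionally need a hook after each residual-stream write, which is not shown.

```python
import numpy as np

def fit_pca_basis(hidden_states, k):
    """Fit PCA on rows of hidden_states, shape (n_samples, d_model).

    Returns the mean, shape (d_model,), and an orthonormal basis of the
    top-k principal directions, shape (d_model, k).
    """
    mean = hidden_states.mean(axis=0)
    centered = hidden_states - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k].T

def project_low_rank(x, mean, basis):
    """Project x onto the affine subspace mean + span(basis)."""
    return mean + (x - mean) @ basis @ basis.T

# Toy stand-in for pooled hidden states: near rank-4 data in 16 dimensions.
# (Assumption: real activations would be gathered from the model instead.)
rng = np.random.default_rng(0)
d_model, true_rank, n_samples = 16, 4, 512
latent = rng.normal(size=(n_samples, true_rank))
mixing = rng.normal(size=(true_rank, d_model))
states = latent @ mixing + 0.01 * rng.normal(size=(n_samples, d_model))

mean, basis = fit_pca_basis(states, k=4)
recon = project_low_rank(states, mean, basis)

# Fraction of the centered norm lost by the projection.
rel_err = np.linalg.norm(states - recon) / np.linalg.norm(states - mean)
print(f"relative reconstruction error at k=4: {rel_err:.4f}")
```

Sweeping k and measuring perplexity at each value, instead of reconstruction error on toy data, would be the version of this that addresses the question above.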

llms mechinterp

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{holtzman-posthoc-lowrank-residual-2026,
  author = {Holtzman, Ari},
  title = {Post-hoc Low-Rank Residual Stream},
  year = {2026},
  url = {https://hypogenic.ai/ideahub/idea/zKibqFxRFzrpZvmVDCLH}
}
