Induction heads allow models to copy data according to patterns displayed earlier in the text. They are often active even when the model isn't directly copying, and are believed to be responsible for pattern-matching behavior such as in-context learning (ICL). But this raises a question: do induction heads end up leaking whatever is patterned in the text even when the model shouldn't be copying anything? Is this one of the reasons models struggle to be random?
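To make the copying behavior concrete, here is a toy sketch of the classic induction rule "A B ... A → B": look for the most recent earlier occurrence of the current token and predict whatever followed it. This is an assumption-laden illustration, not the mechanism inside a real transformer; the function name `induction_predict` is hypothetical.

```python
def induction_predict(tokens):
    """Toy induction rule (illustrative, not a real attention head):
    find the most recent earlier occurrence of the last token and
    predict the token that followed it there."""
    last = tokens[-1]
    # Scan backwards through the prefix for a previous occurrence of `last`.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]
    return None  # no earlier occurrence: the rule has nothing to copy

# The pattern "A B ... A" yields "B":
print(induction_predict(["A", "B", "C", "A"]))  # prints B
```

Note how this rule, applied to a prompt like `7 3 7 3 7`, would continue the pattern with `3` rather than emit anything random, which is exactly the leakage concern: a mechanism tuned to copy patterns has no notion of when copying is inappropriate.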
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-are-induction-heads-2026,
author = {Holtzman, Ari},
title = {Are induction heads a large source of leakage?},
year = {2026},
url = {https://hypogenic.ai/ideahub/idea/bZUARiC6O4K2APDVhZ3j}
}