Finding non-compositional phrases

I previously noted that I wanted to find all the words and phrases LLMs have memorized: https://hypogenic.ai/ideahub/idea/cSOuC7g6YSQHQcWgINsJ

Can we use Future Lens (https://arxiv.org/abs/2311.04897) to at least lower-bound this? If several upcoming tokens can be predicted in advance from a single hidden state, then with high likelihood the model has memorized that sequence as a single unit.
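
To make the proposal concrete, here is a minimal sketch of the span-extraction step only. It assumes some Future Lens-style probe has already produced, for each position, a list of guesses for the tokens that follow it, decoded from that position's hidden state alone. The function name `predictable_spans`, the `future_preds` input format, the `min_future` threshold, and the toy data are all hypothetical and not taken from the paper. A span is reported only when at least `min_future` consecutive upcoming tokens are guessed correctly, so at the default of 2 the probe must get at least one token beyond the immediate next one right.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int, List[str]]


def predictable_spans(
    tokens: Sequence[str],
    future_preds: Sequence[Sequence[str]],
    min_future: int = 2,
) -> List[Span]:
    """Return (start, end, phrase) for spans predicted from a single position.

    future_preds[i][k] is the probe's guess for tokens[i + k + 1], made from
    the hidden state at position i only (hypothetical input format).
    """
    spans: List[Span] = []
    for i, preds in enumerate(future_preds):
        matched = 0
        for offset, pred in enumerate(preds, start=1):
            j = i + offset
            # Stop at the first wrong guess or at the end of the text.
            if j >= len(tokens) or tokens[j] != pred:
                break
            matched += 1
        if matched >= min_future:
            # Candidate unit: the anchor token plus the tokens it foretold.
            spans.append((i, i + matched, list(tokens[i : i + matched + 1])))
    return spans


# Toy data, invented for illustration; no model or probe was run.
tokens = ["He", "kicked", "the", "bucket", "yesterday"]
future_preds = [
    ["said", "that"],     # guesses made from "He"
    ["the", "bucket"],    # guesses made from "kicked"
    ["bucket", "and"],    # guesses made from "the"
    ["of", "water"],      # guesses made from "bucket"
    [],                   # nothing follows "yesterday"
]

for start, end, phrase in predictable_spans(tokens, future_preds):
    print(start, end, " ".join(phrase))
```

The set of spans found this way is only a lower bound in the sense of the question above: a probe can fail to decode a phrase that the model does store as a unit, so missing spans are not evidence of compositionality.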

llms mechinterp implicit vocabulary

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{holtzman-finding-noncompositional-phrases-2026,
  author = {Holtzman, Ari},
  title = {Finding non-compositional phrases},
  year = {2026},
  url = {https://hypogenic.ai/ideahub/idea/h2s84jdJLvdiluiD18Dd}
}
