To be honest, I've been in denial for years about how powerful a technique nostalgebraist's Logit Lens is. That said, it's clear that the residual stream carries extra information that looking at the top-1, or even top-k, predicted tokens doesn't give you. Can we 'delete' the information Logit Lens does capture from the residual stream and see what's left?
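One possible reading of 'delete', sketched below purely as an illustration: treat Logit Lens as a linear readout `W_U @ h`, take the unembedding rows of the top-k predicted tokens, and project the residual-stream vector onto their orthogonal complement. Everything here is an assumption rather than the idea's stated method: the names `logit_lens` and `remove_topk_directions` are hypothetical, the weights are random stand-ins for a real model, and the final layer norm that Logit Lens normally applies before unembedding is ignored for simplicity.

```python
import numpy as np

def logit_lens(h, W_U):
    """Read vocabulary logits (V,) off a residual-stream vector h (d,).

    Simplification: the final layer norm is skipped.
    """
    return W_U @ h

def remove_topk_directions(h, W_U, k=5):
    """Project h onto the orthogonal complement of its top-k unembedding rows."""
    logits = logit_lens(h, W_U)
    topk = np.argsort(logits)[-k:]       # indices of the k largest logits
    # Orthonormal basis (d, k) for the span of the top-k unembedding rows.
    Q, _ = np.linalg.qr(W_U[topk].T)
    residual = h - Q @ (Q.T @ h)         # remove the component in that span
    return residual, topk

# Toy stand-ins for a real model's unembedding matrix and hidden state.
rng = np.random.default_rng(0)
d, V = 64, 1000
W_U = rng.normal(size=(V, d))
h = rng.normal(size=d)

residual, topk = remove_topk_directions(h, W_U, k=5)
```

After the projection, the Logit Lens readout of `residual` is numerically zero for the original top-k tokens, so whatever remains in `residual` is, under this particular definition, what those top-k predictions did not account for. Other definitions of 'delete' (for example, removing the full row space of `W_U`, or working after the layer norm) would give different leftovers.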
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-what-doesnt-logit-2026,
author = {Holtzman, Ari},
title = {What doesn't Logit Lens capture?},
year = {2026},
url = {https://hypogenic.ai/ideahub/idea/PnwOw4B0VBhYh4djyjhT}
}