How much can you store in a hidden state if you encode things in ASCII?
What about if you use an LLM vocab?
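One rough way to frame the first two questions is a counting argument: a hidden state with a fixed number of dimensions and a fixed numeric precision has a hard ceiling on raw bits, and each symbol costs log2(alphabet size) bits. The sketch below is only that back-of-the-envelope bound. The function names, the 4096-dimensional state, the 16 bits per dimension, and the 50,257-entry vocabulary are illustrative assumptions, not values taken from the idea itself.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Bits needed to pick one symbol from a uniformly coded alphabet."""
    return math.log2(alphabet_size)


def raw_state_bits(dim: int, bits_per_dim: int) -> int:
    """Loose upper bound: every bit of every dimension counted as usable."""
    return dim * bits_per_dim


def max_symbols(dim: int, bits_per_dim: int, alphabet_size: int) -> int:
    """How many symbols fit under the raw bit ceiling."""
    return math.floor(raw_state_bits(dim, bits_per_dim) / bits_per_symbol(alphabet_size))


# Assumed sizes, for illustration only.
DIM = 4096            # hypothetical hidden size
BITS_PER_DIM = 16     # hypothetical 16-bit floats
ASCII_SIZE = 128      # 7-bit ASCII
VOCAB_SIZE = 50257    # assumed vocabulary size (GPT-2 scale)

ascii_capacity = max_symbols(DIM, BITS_PER_DIM, ASCII_SIZE)
vocab_capacity = max_symbols(DIM, BITS_PER_DIM, VOCAB_SIZE)
print(ascii_capacity, vocab_capacity)
```

Under these assumptions fewer vocabulary tokens fit than ASCII characters, but each token usually spans several characters, so the comparison in terms of text stored can go the other way. What a model can actually recover from a hidden state is presumably far below this ceiling.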
What if you encode things through a fixed walk where the magnitude of the vector encodes the token and the direction encodes the position?
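The magnitude-and-direction question can be read in more than one way. The sketch below is one possible interpretation, not the author's construction: each position gets a fixed unit direction, the token ID sets the length of the step taken along it, and the hidden state is the sum of the steps. The directions here are assumed to be orthonormal (normalised Walsh-Hadamard rows, so the dimension must be a power of two), and `direction`, `encode`, and `decode` are hypothetical helper names.

```python
import math


def direction(pos: int, dim: int) -> list[float]:
    """Fixed unit vector for a position: a normalised Walsh-Hadamard row."""
    if dim <= 0 or dim & (dim - 1):
        raise ValueError("dim must be a power of two")
    scale = 1.0 / math.sqrt(dim)
    return [scale * (-1) ** bin(pos & j).count("1") for j in range(dim)]


def encode(token_ids: list[int], dim: int) -> list[float]:
    """Sum one step per position; the step length carries the token ID."""
    if len(token_ids) > dim:
        raise ValueError("at most dim positions fit with orthonormal directions")
    state = [0.0] * dim
    for pos, tok in enumerate(token_ids):
        d = direction(pos, dim)
        magnitude = tok + 1  # offset so token 0 differs from an empty slot
        for j in range(dim):
            state[j] += magnitude * d[j]
    return state


def decode(state: list[float], length: int) -> list[int]:
    """Project onto each position's direction and read the magnitude back."""
    dim = len(state)
    tokens = []
    for pos in range(length):
        d = direction(pos, dim)
        magnitude = sum(s * x for s, x in zip(state, d))
        tokens.append(round(magnitude) - 1)
    return tokens


tokens = [72, 105, 0]
recovered = decode(encode(tokens, dim=8), length=len(tokens))
print(recovered == tokens)
```

With orthonormal directions this scheme holds at most one token per dimension, and the decoder has to be told the sequence length. Whether non-orthogonal directions or limited precision would raise or lower that count is exactly the kind of thing the question leaves open.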
What if you can use the unembeddings?
What if you can use one layer of the Transformer as a decoder?
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-how-much-can-2026,
author = {Holtzman, Ari},
title = {How much can you store in a hidden state?},
year = {2026},
url = {https://hypogenic.ai/ideahub/idea/Vj6sySyijVaJXqhfJywK}
}