I'll be the first to admit that we don't have a great definition of what unconscious information is in humans. That said, unconscious information clearly exists in people: there are many cases in which people can be manipulated via signals they don't seem to consciously notice. The linear representation hypothesis suggests that information in LLMs is largely linear. I suspect there is some nonlinear information in LLMs, though proving this directly appears difficult. However, some papers have argued that things like refusal are nonlinear (https://arxiv.org/html/2501.08145v1). Since operations such as attention largely work through linear transformations, is it possible that LLMs are aware of, or can introspect on, linear information but cannot directly introspect on nonlinear information? What "aware" or "introspection" means here is still left partially undefined. But isn't it tantalizing to consider that maybe LLMs aren't fully aware of their own refusal behavior, even if they are fully aware of something like where on the political spectrum the persona they've taken on lies?
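To make the linear/nonlinear distinction concrete, here is a minimal sketch on synthetic data, not on real LLM activations. It assumes a random Gaussian matrix as a stand-in for hidden states, and it defines two hypothetical binary features: one encoded along a single direction, and one encoded as the XOR of two coordinates. The helper `linear_probe_accuracy` is an illustrative name, and the least-squares probe is just one simple choice of linear readout.

```python
import numpy as np

def linear_probe_accuracy(X, y, seed=0):
    """Fit a least-squares linear probe on half the data; return held-out accuracy."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    train, test = idx[:half], idx[half:]
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    targets = 2.0 * y - 1.0                    # map labels {0, 1} to {-1, +1}
    w, *_ = np.linalg.lstsq(Xb[train], targets[train], rcond=None)
    preds = (Xb[test] @ w > 0).astype(int)
    return float((preds == y[test]).mean())

rng = np.random.default_rng(0)
# Assumption: synthetic Gaussian vectors standing in for hidden activations.
X = rng.standard_normal((4000, 16))

# Linearly encoded feature: which side of a hyperplane the vector falls on.
direction = rng.standard_normal(16)
y_linear = (X @ direction > 0).astype(int)

# Nonlinearly encoded feature: XOR of the signs of two coordinates.
y_xor = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)

acc_linear = linear_probe_accuracy(X, y_linear)
acc_xor = linear_probe_accuracy(X, y_xor)
print(f"linear feature, linear probe: {acc_linear:.3f}")
print(f"XOR feature, linear probe:    {acc_xor:.3f}")
```

Under these assumptions the linear probe should recover the first feature well and stay near chance on the XOR feature, even though both are fully determined by the same vectors. Whether refusal or any other LLM behavior is actually encoded this way is exactly the open question; the sketch only illustrates what "information that a linear readout cannot see" could mean.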
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-is-linearnonlinear-information-2025,
author = {Holtzman, Ari},
title = {Is Linear/Nonlinear Information in LLMs Similar to Conscious/Unconscious Information in Humans?},
year = {2025},
url = {https://hypogenic.ai/ideahub/idea/gnjZBYzECR9LhJSPhBrA}
}