Linear representations in language models can change dramatically over a conversation. Perhaps one of the best ways to figure out how LLMs work is to find a space that describes what a given vector will do in a given context. In this view, traditional steering vectors are the zeroth-order case: a vector that always has the same effect. But what about a first-order case, a vector whose effect changes linearly with a single feature of the context? Have we found one of those?
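To make the zeroth-order versus first-order distinction concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea, not an established method: the names `zeroth_order`, `first_order`, and `steer`, the scalar `feature`, and the additive form `v0 + feature * v1` are all assumptions introduced here, and how one would actually measure a context feature in a real model is left open.

```python
from typing import List

Vector = List[float]


def zeroth_order(v0: Vector, feature: float) -> Vector:
    """Traditional steering vector: the same vector regardless of context.

    `feature` is accepted only to mirror the first-order signature; it is ignored.
    """
    return list(v0)


def first_order(v0: Vector, v1: Vector, feature: float) -> Vector:
    """Hypothetical first-order case: the steering vector varies linearly
    with one scalar context feature, v(feature) = v0 + feature * v1."""
    return [a + feature * b for a, b in zip(v0, v1)]


def steer(hidden: Vector, delta: Vector, alpha: float = 1.0) -> Vector:
    """Add a scaled steering vector to a hidden state (assumed additive steering)."""
    return [h + alpha * d for h, d in zip(hidden, delta)]


# Toy 2-d example: the base direction v0 and the context-dependent direction v1
# are made-up numbers, not measurements from any model.
v0 = [1.0, 0.0]
v1 = [0.0, 2.0]
hidden = [0.0, 0.0]

early = steer(hidden, first_order(v0, v1, feature=0.0))  # feature absent
late = steer(hidden, first_order(v0, v1, feature=0.5))   # feature partly present
```

When `feature` is 0 the first-order vector reduces to `v0`, so the zeroth-order case falls out as the special case in which `v1` is the zero vector.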
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-conditional-steering-vectors-2026,
author = {Holtzman, Ari},
title = {Conditional Steering Vectors},
year = {2026},
url = {https://hypogenic.ai/ideahub/idea/1bp7sjkSbVfBBKE4djMd}
}