Since steering vectors work, there appear to be directions in activation space associated with different styles (in addition to other aspects of text). One question is: if it is hard to get a steering vector for a particular style, does that also mean the style will be hard to prompt for? Does the model represent any style it 'knows' a priori (as opposed to one it constructs online, e.g., if I ask it to always end every sentence with 🥕) non-linearly, or are all such styles represented linearly?🥕
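To make the linear versus non-linear distinction concrete, here is a minimal sketch on synthetic data. It assumes a steering vector is estimated as a difference of mean activations, which is one common recipe and not necessarily the one intended here. The random arrays stand in for real model activations, and every name in it (`steering_vector`, `projection_accuracy`, `evaluate`) is hypothetical. One toy style is a shift along a single direction. The other has the same mean as neutral text and differs only in spread along one axis, so no single direction captures it.

```python
import numpy as np

def steering_vector(styled, neutral):
    # Difference of mean activations (an assumed, commonly used estimator).
    return styled.mean(axis=0) - neutral.mean(axis=0)

def projection_accuracy(styled, neutral, direction):
    # Classify by thresholding the projection onto the unit direction.
    unit = direction / np.linalg.norm(direction)
    s, n = styled @ unit, neutral @ unit
    threshold = (s.mean() + n.mean()) / 2.0
    correct = (s > threshold).sum() + (n <= threshold).sum()
    return correct / (len(s) + len(n))

def evaluate(styled, neutral):
    # Fit the direction on the first half, score it on the held-out second half.
    half = len(styled) // 2
    v = steering_vector(styled[:half], neutral[:half])
    return projection_accuracy(styled[half:], neutral[half:], v)

rng = np.random.default_rng(0)
dim, n = 16, 1000  # toy sizes, not taken from any real model

# Synthetic stand-ins for activations on neutral text.
neutral = rng.normal(size=(n, dim))

# "Linear" style: every example is shifted along one fixed axis.
linear_style = rng.normal(size=(n, dim))
linear_style[:, 0] += 4.0

# "Non-linear" style: shifted by +4 or -4 at random, so the mean matches
# neutral and the difference of means carries almost no signal.
nonlinear_style = rng.normal(size=(n, dim))
nonlinear_style[:, 0] += 4.0 * rng.choice([-1.0, 1.0], size=n)

linear_acc = evaluate(linear_style, neutral)
nonlinear_acc = evaluate(nonlinear_style, neutral)
print(f"linear style accuracy:     {linear_acc:.2f}")
print(f"non-linear style accuracy: {nonlinear_acc:.2f}")
```

In this toy setup the mean-difference direction separates the first style well and sits near chance on the second, even though the second is easy to tell apart with a non-linear feature such as the squared value along that axis. Whether any style a real model 'knows' behaves like the second case is exactly the open question.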
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-nonlinear-diction-2026,
author = {Holtzman, Ari},
title = {Non-linear Diction},
year = {2026},
url = {https://hypogenic.ai/ideahub/idea/Tb1IIhOnXoYhPPqMsGcg}
}