Direction Conflation

by Ari Holtzman, 5 months ago

Can we find a direction in the residual stream that clearly has two very different interventional effects in different contexts or at different layers? This seems inevitable, since there aren't enough directions to encode every dimension of reality a model might want to represent. But I haven't seen it yet, except in cases that seem like "exceptions that prove the rule," such as days of the week (https://arxiv.org/abs/2405.14860).
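
One way to make the question concrete is to add the same direction to the residual stream in two contexts and compare the resulting changes in the output. The block below is only a minimal sketch on a made-up toy model, not a result from any real network: the weights, the 3-dimensional "residual stream", and the helper names `logits`, `intervention_effect`, and `cosine` are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy "model": one ReLU layer reading from a 3-d residual stream.
# Axis 0 is the candidate direction; axes 1 and 2 mark context A and context B.
# Each hidden unit reads the candidate direction but is gated by one context.
W_in = np.array([[1.0, 5.0, 0.0],
                 [1.0, 0.0, 5.0]])
b_in = np.array([-5.0, -5.0])
W_out = np.eye(2)  # hidden unit i writes directly to output logit i


def logits(h):
    """Map a residual-stream vector to output logits."""
    return W_out @ np.maximum(W_in @ h + b_in, 0.0)


def intervention_effect(h, direction, alpha):
    """Change in logits caused by adding alpha * direction to h."""
    return logits(h + alpha * direction) - logits(h)


def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


direction = np.array([1.0, 0.0, 0.0])
h_context_a = np.array([0.0, 1.0, 0.0])
h_context_b = np.array([0.0, 0.0, 1.0])

# Same direction, same scale, two different contexts.
delta_a = intervention_effect(h_context_a, direction, alpha=2.0)
delta_b = intervention_effect(h_context_b, direction, alpha=2.0)

# A similarity near 1 means the direction does the same thing in both
# contexts; a similarity near 0 means its effects are unrelated.
effect_similarity = cosine(delta_a, delta_b)
print(delta_a, delta_b, effect_similarity)
```

In this toy setup the direction moves only the first logit in context A and only the second in context B, so the two effects are orthogonal. On a real model, the analogous measurement would patch the direction into activations at a chosen layer and compare the logit changes across many prompts; low similarity across contexts or layers would be a candidate case of conflation.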

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{holtzman-direction-conflation-2025,
  author = {Holtzman, Ari},
  title = {Direction Conflation},
  year = {2025},
  url = {https://hypogenic.ai/ideahub/idea/wppB9sqtQPMBMefTwX8Z}
}
