I have a hypothesis: in long multi-turn conversations, LLMs regress toward their base-model prior more than anything else. A significant amount of noise certainly contributes to the total degeneration seen after long multi-turn conversations, but I think much of what is happening is that alignment training only holds for the first few turns, which are the region of the distribution the model traversed most thoroughly during alignment.
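A minimal sketch of how one might probe this, using mean per-token KL(chat || base) over each assistant turn as a proxy for distance from the base prior. The model pair, the prefix-alignment trick, and the KL proxy are all my illustrative assumptions, not part of the idea as posted:

```python
# A sketch under stated assumptions, not a definitive experiment:
#   (a) the aligned model and its base share one tokenizer (illustrative pair below),
#   (b) the chat template renders every conversation prefix as a token-level
#       prefix of the full rendering (true of most templates, worth verifying),
#   (c) mean per-token KL(chat || base) is a usable proxy for distance from
#       the base prior.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_ID = "Qwen/Qwen2.5-0.5B-Instruct"  # aligned model (illustrative choice)
BASE_ID = "Qwen/Qwen2.5-0.5B"           # its pretrained base

tok = AutoTokenizer.from_pretrained(CHAT_ID)
chat = AutoModelForCausalLM.from_pretrained(CHAT_ID).eval()
base = AutoModelForCausalLM.from_pretrained(BASE_ID).eval()

@torch.no_grad()
def kl_per_assistant_turn(messages):
    """Mean KL(chat || base) over the tokens of each assistant turn."""
    gaps = []
    for i in range(2, len(messages) + 1, 2):  # slice ending at each assistant turn
        ids = tok.apply_chat_template(messages[:i], return_tensors="pt")
        # tokens belonging to the newest assistant turn (assumption (b))
        start = tok.apply_chat_template(messages[:i - 1],
                                        return_tensors="pt").shape[1]
        logp_chat = F.log_softmax(chat(ids).logits[0, start:], dim=-1)
        logp_base = F.log_softmax(base(ids).logits[0, start:], dim=-1)
        # KL(chat || base), averaged over the turn's token positions
        kl = F.kl_div(logp_base, logp_chat, log_target=True,
                      reduction="batchmean")
        gaps.append(kl.item())
    return gaps

convo = [
    {"role": "user", "content": "Explain overfitting in one sentence."},
    {"role": "assistant", "content": "Memorizing training noise instead of signal."},
    {"role": "user", "content": "And underfitting?"},
    {"role": "assistant", "content": "Being too simple to capture the signal at all."},
]
print(kl_per_assistant_turn(convo))  # a downward trend would support the hypothesis
```

If the hypothesis holds, the returned sequence should trend downward as the turn index grows. Comparing against the base model directly, rather than scoring output quality, separates regression to the prior from other sources of multi-turn degradation.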
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-do-multiturn-conversations-2026,
  author = {Holtzman, Ari},
  title = {Do Multi-Turn Conversations Regress to the Prior?},
  year = {2026},
  url = {https://hypogenic.ai/ideahub/idea/WHbvDdMiYyDhTZxbMjhO}
}
I like this idea! It's fundamentally related to the idea about rollback in multi-turn interaction that I posted earlier.