Hypothesis: LLMs' biggest disadvantage is that they already know how they want their next turn to end. This is not true of base LLMs, but those are uncontrollable. The remarkable stability and pre-planned nature of text from RL'd models mean that they tend not to be able to react to new information. I wonder if there's a way to use this: ask a series of questions that rotates which parts of the answer are treated as certain, forcing the model to concentrate its allowance for uncertainty on a different part of a larger question each time.
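One possible reading of the rotation idea, sketched in Python. This is only an illustration under assumptions, not a method stated in the post: the helper name `rotate_certainty`, the decomposition of a question into named parts, and the prompt wording are all hypothetical. Each generated prompt declares every part but one as settled, so the model's room for uncertainty falls on a single part at a time.

```python
from typing import Dict, List


def rotate_certainty(question: str, parts: Dict[str, str]) -> List[str]:
    """Build one prompt per part (hypothetical helper).

    `parts` maps a part name to a tentative answer for it. In each prompt,
    all parts except one are stated as settled; the held-out part is the
    only thing the model is asked about.
    """
    prompts = []
    for open_part in parts:
        # Every part other than the held-out one is presented as fixed.
        fixed = [
            f"- {name}: {value}"
            for name, value in parts.items()
            if name != open_part
        ]
        lines = [
            f"Question: {question}",
            "Treat the following as settled:",
            *fixed,
            f"Now answer only the open part: {open_part}",
        ]
        prompts.append("\n".join(lines))
    return prompts


if __name__ == "__main__":
    # Example decomposition; the parts and values are made up for illustration.
    demo = rotate_certainty(
        "Plan a database migration",
        {"order of steps": "schema first", "rollback plan": "snapshot", "downtime": "none"},
    )
    for p in demo:
        print(p, end="\n\n")
```

Each prompt would be sent as its own turn; whether the answers to the rotated prompts disagree with each other is the signal this sketch is meant to expose.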
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{holtzman-llms-know-the-2026,
author = {Holtzman, Ari},
title = {LLMs Know the Answer They Want to Give},
year = {2026},
url = {https://hypogenic.ai/ideahub/idea/SMLB9aNB7XQOaTDEfl29}
}