We should take relatively old LLMs, e.g., trained around 2023, finetune them on data from 2026, and measure which of that data they have trouble converging on. Where are LLMs 'stuck', and where are they flexible?
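One concrete way to operationalize "stuck" vs. "flexible" is to record per-document loss trajectories during finetuning and flag documents whose loss barely drops. The sketch below is illustrative only: the trajectories, document names, and threshold are all hypothetical, and in practice the losses would come from finetuning an actual 2023-era model on 2026 text.

```python
# Hedged sketch: classify documents as 'stuck' or 'flexible' from
# per-document loss trajectories recorded during finetuning.
# All names and the 0.5 threshold are illustrative assumptions.

def convergence_ratio(losses):
    """Fraction of the initial loss still remaining at the final step."""
    return losses[-1] / losses[0]

def split_stuck_flexible(trajectories, threshold=0.5):
    """Documents whose loss barely drops are 'stuck'; the rest 'flexible'."""
    stuck, flexible = [], []
    for name, losses in trajectories.items():
        if convergence_ratio(losses) > threshold:
            stuck.append(name)
        else:
            flexible.append(name)
    return stuck, flexible

# Toy trajectories: loss recorded every k finetuning steps per document.
trajectories = {
    "2026_slang_post":   [4.0, 3.9, 3.8, 3.8],  # barely moves -> stuck
    "2026_news_article": [4.0, 2.5, 1.4, 0.8],  # converges    -> flexible
}
stuck, flexible = split_stuck_flexible(trajectories)
print("stuck:", stuck, "flexible:", flexible)
```

The interesting output of such an experiment is not the split itself but what the stuck set has in common, e.g., novel vocabulary, new world events, or shifted discourse conventions.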
If you are inspired by this idea, you can reach out to the author for collaboration or cite it:
@misc{holtzman-what-text-cant-2026,
author = {Holtzman, Ari},
title = {What text can't LLMs simulate?},
year = {2026},
url = {https://hypogenic.ai/ideahub/idea/EH1P3KtZ7FUbewLH0z5X}
}