FABLES (Kim et al. 2024) finds that content selection errors, such as omitting crucial narrative elements or over-emphasizing the conclusion, are just as damaging as outright hallucinations in long-context summarization. Yet current faithfulness metrics rarely capture these errors, a particular problem in legal contexts, where omitting a key precedent or argument can be as misleading as fabricating one. This idea proposes an evaluation and training paradigm that explicitly models and penalizes omission and over-emphasis errors alongside traditional factuality. Techniques might include fine-grained salience annotation of legal cases (e.g., which facts or arguments are essential to the holding) and auxiliary loss functions or reward signals, building on RL frameworks like LongReward (Zhang et al. 2024), that encourage balanced, comprehensive summaries. The goal is models whose summaries are not only factually correct but also legally salient and contextually complete, addressing a gap in both the literature and practical deployment.
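One way such a reward signal might look is sketched below. This is a minimal illustration under stated assumptions, not the method of FABLES or LongReward: the `Unit` type, the salience scale, the per-unit token counts, and the weights `alpha` and `beta` are all hypothetical, and the sketch assumes some upstream step has already aligned summary text to annotated source units and produced a factuality score in [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One annotated content unit of the source, e.g. a fact or an argument."""
    salience: float      # annotator-assigned importance, >= 0 (hypothetical scale)
    summary_tokens: int  # summary tokens devoted to this unit; 0 means omitted


def omission_penalty(units):
    """Share of total salience mass that the summary leaves out, in [0, 1]."""
    total = sum(u.salience for u in units)
    if total == 0:
        return 0.0
    missed = sum(u.salience for u in units if u.summary_tokens == 0)
    return missed / total


def emphasis_penalty(units):
    """Total variation distance between salience shares and summary-token shares.

    0 means the summary spends its length in proportion to salience;
    values near 1 mean most of the length goes to low-salience units.
    Returns 0 for an empty summary, which omission_penalty already covers.
    """
    total_sal = sum(u.salience for u in units)
    total_tok = sum(u.summary_tokens for u in units)
    if total_sal == 0 or total_tok == 0:
        return 0.0
    return 0.5 * sum(
        abs(u.salience / total_sal - u.summary_tokens / total_tok) for u in units
    )


def faithfulness_reward(factuality, units, alpha=1.0, beta=0.5):
    """Factuality score minus weighted omission and over-emphasis penalties.

    alpha and beta are illustrative defaults, not tuned values.
    """
    return (
        factuality
        - alpha * omission_penalty(units)
        - beta * emphasis_penalty(units)
    )


# A fully factual summary that skips the most salient unit and spends
# all of its length on a minor one.
skewed = [Unit(salience=3.0, summary_tokens=0), Unit(salience=1.0, summary_tokens=10)]

# A fully factual summary whose length is proportional to salience.
balanced = [Unit(salience=3.0, summary_tokens=30), Unit(salience=1.0, summary_tokens=10)]

print(faithfulness_reward(1.0, skewed))
print(faithfulness_reward(1.0, balanced))
```

With these toy inputs the skewed summary scores below the balanced one even though both are assumed fully factual, which is the behavior the proposal asks of a metric. In a legal setting, the salience values would come from the fine-grained annotation described above, for example marking which facts are essential to the holding.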
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{gpt-4.1-faithfulness-beyond-factuality-2025,
author = {GPT-4.1},
title = {Faithfulness Beyond Factuality: Modeling Omission, Emphasis, and Legal Salience in Long-Context Summarization},
year = {2025},
url = {https://hypogenic.ai/ideahub/idea/qcyjASiOKVxDTWu207PW}
}