Jakubik et al. (2022) and Chen et al. (2023) report that traditional explanations, especially those focused on the predicted outcome, can increase overreliance, particularly when the AI is wrong. Building on this, the proposed research would develop and test "counterfactual explanations" that explicitly show users similar past situations in which the AI's recommendation would have been incorrect or harmful. Unlike current approaches, which mostly justify the AI's present recommendation, this method actively teaches users where the AI's limits lie, fostering a more discerning, situation-aware delegation strategy. Such counterfactuals could be tailored to the user's expertise (as in Erlei et al., 2024) and surfaced dynamically when the AI's uncertainty is high (as in Shukla et al., 2024). The hypothesis is that this approach will help users avoid blind delegation and so improve decision quality and trust calibration, especially in safety-critical or high-uncertainty domains.
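One way the selection step described above might look in practice is sketched below. This is a minimal illustration under stated assumptions, not the method of any cited paper: the names `PastCase` and `select_counterfactuals` are hypothetical, Euclidean distance over numeric features stands in for whatever similarity measure the research would actually use, and the uncertainty threshold of 0.3 is an arbitrary placeholder.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class PastCase:
    """A logged decision (hypothetical schema, assumed for illustration)."""
    features: tuple[float, ...]   # numeric description of the situation
    ai_recommendation: str        # what the AI recommended at the time
    correct_outcome: str          # what turned out to be right

    @property
    def ai_was_wrong(self) -> bool:
        return self.ai_recommendation != self.correct_outcome


def select_counterfactuals(query, history, ai_uncertainty,
                           uncertainty_threshold=0.3, k=2):
    """Return up to k past cases, most similar to `query`, where the AI erred.

    Nothing is surfaced when the AI's uncertainty is below the threshold
    (assumption: uncertainty is a score in [0, 1] supplied by the caller).
    """
    if ai_uncertainty < uncertainty_threshold:
        return []
    # Keep only past failures, then rank them by closeness to the current case.
    failures = [case for case in history if case.ai_was_wrong]
    failures.sort(key=lambda case: dist(query, case.features))
    return failures[:k]


# Toy history: made-up values used only to exercise the function.
history = [
    PastCase((0.9, 0.1), "approve", "approve"),
    PastCase((0.8, 0.2), "approve", "reject"),
    PastCase((0.1, 0.9), "reject", "approve"),
    PastCase((0.7, 0.3), "approve", "reject"),
]

shown = select_counterfactuals((0.78, 0.22), history, ai_uncertainty=0.6)
for case in shown:
    print(f"Similar past case {case.features}: AI said "
          f"{case.ai_recommendation!r}, correct was {case.correct_outcome!r}")
```

Tailoring to user expertise, as the proposal suggests, could plausibly enter through `k` or the threshold, but that mapping is left open here because the source does not specify it.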
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{gpt-4.1-counterfactual-explanations-for-2025,
author = {GPT-4.1},
title = {Counterfactual Explanations for Delegation: Teaching Humans When NOT to Rely on AI},
year = {2025},
url = {https://hypogenic.ai/ideahub/idea/6nyVyDz62SR5q3lFxizD}
}