Understanding False Memories in Large Language Models: The Case of the Nonexistent Seahorse Emoji

by Ari Holtzman, 7 months ago

Large Language Models (LLMs) occasionally exhibit “false memories”: they confidently recall information that doesn’t exist, such as asserting that there is a seahorse emoji. What drives these errors? One hypothesis is that LLMs, when asked about items in a category (e.g., which emojis exist), rely on pre-grouped sets of related concepts like “common sea creatures.” This could cause them to infer that a plausible-sounding member (like a seahorse) exists even when it doesn’t. Alternatively, other mechanisms may be at play, either in the training data or in how the model retrieves what it has learned. Investigating this phenomenon can shed light on how LLMs organize knowledge, make inferences, and why they sometimes “remember” things that never were.
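One way to start probing the category hypothesis is to measure how often a model affirms category members that do not actually exist. The sketch below is only an illustration under stated assumptions. `model_says_exists` is a hypothetical callable standing in for a real LLM query, `always_yes` is a stub rather than a real model, and a Unicode character-name lookup is used as a rough proxy for "this emoji exists" (it depends on the Unicode version bundled with the local Python build and does not distinguish emoji from other characters).

```python
import unicodedata
from typing import Callable, Iterable


def unicode_has_character(name: str) -> bool:
    """Return True if the local Unicode database has a character with this exact name.

    Assumption: name lookup is only a proxy for emoji existence.
    """
    try:
        unicodedata.lookup(name)
    except KeyError:
        return False
    return True


def false_memory_rate(
    candidates: Iterable[str],
    model_says_exists: Callable[[str], bool],
) -> float:
    """Fraction of nonexistent candidates that the model claims exist.

    `model_says_exists` is a hypothetical hook; wire it to a real model yourself.
    """
    absent = [name for name in candidates if not unicode_has_character(name)]
    if not absent:
        return 0.0
    affirmed = sum(1 for name in absent if model_says_exists(name))
    return affirmed / len(absent)


def always_yes(name: str) -> bool:
    """Stub 'model' that affirms everything; a stand-in, not a real LLM."""
    return True


# Illustrative category list; the names are Unicode-style character names.
sea_creatures = ["DOLPHIN", "OCTOPUS", "TROPICAL FISH", "SEAHORSE"]
print(false_memory_rate(sea_creatures, always_yes))
```

Comparing this rate between tightly grouped categories (such as sea creatures) and loosely related word lists would be one possible test of whether category co-membership drives the error.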

If you are inspired by this idea, you can reach out to the author for collaboration or cite it:

@misc{holtzman-understanding-false-memories-2025,
  author = {Holtzman, Ari},
  title = {Understanding False Memories in Large Language Models: The Case of the Nonexistent Seahorse Emoji},
  year = {2025},
  url = {https://hypogenic.ai/ideahub/idea/Y8fqtRA0TGY6TECEfTIo}
}
