Fully Invertible Language Model Output Spaces

by Ari Holtzman, 2 months ago

Morris et al. introduce the problem of Language Model Inversion: recovering the prompt given the output of an LLM.

This problem raises many questions. Here's one: Is there any easy-to-describe subset of text for which attacks reliably recover the prompt (say, with >90% accuracy), other than the case where strings from the prompt are repeated in the output?
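To make the ">90% accuracy" criterion concrete, here is a minimal sketch of how one might score an inversion attack by exact match on a set of (prompt, output) pairs. Everything in it is an assumption made for illustration: the function names, the toy pairs, and the `first_line_attack` baseline, which stands in for the trivial "prompt is repeated in the output" case. It is not the method of Morris et al.

```python
from typing import Callable, Iterable, Tuple


def exact_match_accuracy(
    pairs: Iterable[Tuple[str, str]],
    invert: Callable[[str], str],
) -> float:
    """Fraction of (prompt, output) pairs where invert(output) == prompt."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (prompt, output) pair")
    hits = sum(1 for prompt, output in pairs if invert(output) == prompt)
    return hits / len(pairs)


def first_line_attack(output: str) -> str:
    """Hypothetical trivial attack: guess that the output's first line is the prompt.

    This only succeeds when the model echoes the prompt verbatim.
    """
    return output.split("\n", 1)[0]


def is_reliably_invertible(accuracy: float, threshold: float = 0.9) -> bool:
    """Apply the '>90% accuracy' bar from the question (threshold is adjustable)."""
    return accuracy > threshold


# Toy data (invented for illustration): one echoed prompt, one not echoed.
TOY_PAIRS = [
    ("Translate 'cat' to French", "Translate 'cat' to French\nchat"),
    ("Name a prime", "7"),
]

accuracy = exact_match_accuracy(TOY_PAIRS, first_line_attack)
print(accuracy, is_reliably_invertible(accuracy))
```

The open question is whether some easily described subset of outputs admits an attack that clears this bar without relying on echoing of the kind `first_line_attack` exploits.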

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{holtzman-fully-invertible-language-2026,
  author = {Holtzman, Ari},
  title = {Fully Invertible Language Model Output Spaces},
  year = {2026},
  url = {https://hypogenic.ai/ideahub/idea/ScraBwhpLTKDhffjzVmV}
}
