Diluted Adversarial Prompts?

There is evidence that adversarial prompts work by essentially hijacking an LLM's attention mechanism (e.g., https://arxiv.org/abs/2406.11717). Is it possible to build diluted adversarial prompts, with the adversarial tokens spread thinly through a long context, or does dilution simply stop them from disrupting the model's processing?
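
One way to make the question concrete is to treat dilution as a measurable knob. The sketch below is only an illustration under an assumption the post does not state: that "dilution" means interleaving the tokens of an adversarial prompt with benign filler tokens from a long context. The names `dilute`, `density`, and `gap` are hypothetical, and the tokens are placeholders rather than a real adversarial string.

```python
from typing import List


def dilute(adv_tokens: List[str], filler_tokens: List[str], gap: int) -> List[str]:
    """Insert `gap` filler tokens after each adversarial token, preserving order.

    Assumption: dilution = spreading adversarial tokens through benign text.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    if gap > 0 and not filler_tokens:
        raise ValueError("filler_tokens must be non-empty when gap > 0")
    out: List[str] = []
    cursor = 0  # position in the filler stream; wraps around if it runs out
    for tok in adv_tokens:
        out.append(tok)
        for _ in range(gap):
            out.append(filler_tokens[cursor % len(filler_tokens)])
            cursor += 1
    return out


def density(adv_tokens: List[str], diluted: List[str]) -> float:
    """Fraction of the diluted sequence that is adversarial tokens."""
    return len(adv_tokens) / len(diluted) if diluted else 0.0


if __name__ == "__main__":
    adv = ["<adv_1>", "<adv_2>", "<adv_3>"]  # placeholder tokens only
    filler = "the quick brown fox jumps over the lazy dog".split()
    for gap in (0, 1, 4, 16):
        mixed = dilute(adv, filler, gap)
        print(gap, round(density(adv, mixed), 3), len(mixed))
```

Sweeping `gap` and recording the model's behavior at each density would show whether the effect fades gradually or falls off at some threshold, which is one reading of what the question is asking.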

Tags: llms, adversarial prompts, long context

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{holtzman-diluted-adversarial-prompts-2026,
  author = {Holtzman, Ari},
  title = {Diluted Adversarial Prompts?},
  year = {2026},
  url = {https://hypogenic.ai/ideahub/idea/NKAaMPoJfSe3UJWJuH8u}
}
