We all know about adversarial prompts that look like gibberish yet get an LLM to do something like tell you how to build a bomb, or simply write a Python function. Is it easier or harder to hide such 'instructions' in very long documents? Does the surrounding document create noise that drowns out the adversarial effect, or does the extra length give an attacker more room to work with?
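One way to make the question concrete: take a known adversarial suffix, embed it at different positions in documents of increasing length, and measure how often the attack still succeeds. Below is a minimal sketch of that loop using Hugging Face transformers. The model name, the adversarial suffix, the filler text, and the crude refusal-string check are all placeholder assumptions, not part of the original idea.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholders: a real study would use an instruction-tuned, long-context
# model and a suffix found by an attack such as GCG.
MODEL_NAME = "gpt2"
ADV_SUFFIX = "<adversarial suffix found by e.g. GCG goes here>"
FILLER = "This paragraph is benign filler text about nothing in particular. "

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def attack_success(prompt: str, max_new_tokens: int = 64) -> bool:
    """Generate a completion and apply a crude refusal-string check."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    text = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    refusals = ("I'm sorry", "I cannot", "I can't")
    return not any(r in text for r in refusals)

# Vary document length and suffix position, then record the success rate.
# Larger filler counts require a model with a long context window.
for n_filler in (0, 10, 100):
    doc = FILLER * n_filler
    variants = {
        "prefix": ADV_SUFFIX + " " + doc,
        "middle": doc[: len(doc) // 2] + " " + ADV_SUFFIX + " " + doc[len(doc) // 2:],
        "suffix": doc + " " + ADV_SUFFIX,
    }
    for position, prompt in variants.items():
        print(n_filler, position, attack_success(prompt))

A real experiment would swap in a proper harmfulness judge in place of the refusal-string check and sweep document lengths well beyond what a small model's context window allows.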
If this idea inspires you, you can reach out to the authors to collaborate, or cite it:
@misc{holtzman-is-it-easier-2025,
  author = {Holtzman, Ari},
  title  = {Is it easier or harder to hide adversarial prompts in longer documents?},
  year   = {2025},
  url    = {https://hypogenic.ai/ideahub/idea/RE4hPBZFPcPjlucQzkA8}
}