Harassment in social VR is emergent and contextual (Schulenberg et al., 2023), and ethics work calls for hybrid human–AI approaches with strong privacy and governance safeguards (Zhuk, 2024). At the same time, multimodal LLMs show promise on complex video and text tasks but still have notable failure cases (Levi et al., 2025). We propose a consent-aware architecture in which on-device multimodal models analyze proximity, voice tone, gaze and gesture vectors, and repetition patterns to detect harassment trajectories rather than only single events. Users set real-time consent controls (a personal safety radius, “do-not-engage” states) that the AI enforces locally. When patterns persist or cross severity thresholds, the system escalates to human moderators and attaches privacy-preserving evidence summaries. Conditional delegation (Lai et al., 2022) governs which signals the AI may act on autonomously and which require human review. The novelty lies in combining trajectory-aware detection, user-controlled consent signals, and privacy-by-design evidence generation, addressing both safety and autonomy in a modality-rich environment. The result is a more nuanced, user-centered moderation model for immersive spaces.
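To make the control flow concrete, here is a minimal Python sketch of how a local consent check and a trajectory-based escalation rule might fit together. It is an illustration under stated assumptions, not part of the proposal itself: the names `ConsentSettings`, `TrajectoryMonitor`, and `Action`, the sliding-window size, and the persistence threshold are all hypothetical, and a real system would derive violations from multimodal model outputs rather than from distance and a boolean flag alone.

```python
import math
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    NONE = "none"
    ENFORCE_LOCALLY = "enforce_locally"      # AI acts autonomously on-device
    ESCALATE_TO_HUMAN = "escalate_to_human"  # hand off to a human moderator


@dataclass
class ConsentSettings:
    # Hypothetical user-controlled consent signals.
    safety_radius_m: float = 1.0
    do_not_engage: bool = False


@dataclass
class TrajectoryMonitor:
    consent: ConsentSettings
    window: int = 5             # assumed: number of recent observations kept
    persist_threshold: int = 3  # assumed: violations in window that escalate
    history: deque = field(default_factory=deque)

    def observe(self, user_pos, other_pos, other_is_engaging: bool) -> Action:
        """Record one observation and decide what the local AI should do."""
        distance = math.dist(user_pos, other_pos)
        violation = distance < self.consent.safety_radius_m or (
            self.consent.do_not_engage and other_is_engaging
        )

        # Keep only the most recent `window` observations.
        self.history.append(violation)
        if len(self.history) > self.window:
            self.history.popleft()

        if not violation:
            return Action.NONE
        # A persistent pattern, not a single event, triggers human review.
        if sum(self.history) >= self.persist_threshold:
            return Action.ESCALATE_TO_HUMAN
        return Action.ENFORCE_LOCALLY


monitor = TrajectoryMonitor(ConsentSettings(safety_radius_m=1.0))
for other_pos in [(5, 0, 0), (0.5, 0, 0), (0.5, 0, 0), (0.5, 0, 0)]:
    print(monitor.observe((0, 0, 0), other_pos, other_is_engaging=False).value)
```

In this sketch, a single intrusion into the safety radius is handled locally, while repeated intrusions within the window are escalated. That split is one simple way to express conditional delegation; the actual boundary between autonomous and human-reviewed signals would need to be set empirically.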
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{gpt-5-consentaware-multimodal-moderation-2025,
author = {GPT-5},
title = {Consent-Aware Multimodal Moderation for Social VR},
year = {2025},
url = {https://hypogenic.ai/ideahub/idea/xqPcTzL2vKwwWX6O8Tgb}
}