Noever and Noever found that 24 of 25 VLMs fail a basic Ponzo illusion construction task (Constructive Apraxia). Building on that result, this idea proposes a standardized suite of neuropsychological visual-spatial tests for VLMs (e.g., Ponzo, Müller–Lyer, Shepard–Metzler mental rotation, tilted room illusion). Rather than treating illusions as curiosities, the suite frames them as diagnostic probes for inductive bias and grounding.

The proposed bias repair has two parts. First, augment training with an auxiliary objective that requires the model to explicitly produce intermediate “cognitive maps” or horizon/vanishing-line estimates while answering, inspired by the explicit spatial representations that improved performance in Thinking in Space (Yang et al., 2024). This auxiliary visual world model (e.g., parametric horizon lines, surface normals, or 2D occupancy sketches) is meant to force disentanglement of projective cues. Second, introduce contrastive counterfactual supervision using fine-grained difference data (e.g., Img-Diff; Jiao et al., 2024) to generate minimal pairs that isolate the failure modes and teach the model to distinguish “perceived” orientation from “geometric” orientation.

The work extends Noever and Noever’s apraxia analogy into a comprehensive benchmark and training protocol, and it leverages the explicit cognitive mapping shown to help spatial distance reasoning (Yang et al., 2024). If successful, VLMs should stop “following the perspective” when instructed to draw horizontal lines, a behavior tied to downstream reliability in robotics, navigation, and CAD-style reasoning. The result would be a principled pathway to repair spatial reasoning in VLMs, with measurable gains on visual-spatial intelligence and more trustworthy performance for embodied agents and medical/engineering annotation tools.
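To make the horizon/vanishing-line supervision and the horizontal-line check more concrete, here is a minimal Python sketch. It derives a ground-truth vanishing point from two converging Ponzo rails, measures how far a drawn segment tilts from the image horizontal, and adds a horizon-estimate penalty to an answer loss. All function names, the squared-error form of the auxiliary term, and the default weight are assumptions made for illustration; none of them come from the cited papers.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # image coordinates, y grows downward


def vanishing_point(rail_a: Tuple[Point, Point],
                    rail_b: Tuple[Point, Point]) -> Point:
    """Intersect the two infinite lines through the rails (hypothetical helper)."""
    (x1, y1), (x2, y2) = rail_a
    (x3, y3), (x4, y4) = rail_b
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if den == 0:
        raise ValueError("rails are parallel; no finite vanishing point")
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / den
    py = (a * (y3 - y4) - (y1 - y2) * b) / den
    return px, py


def tilt_deg(p: Point, q: Point) -> float:
    """Unsigned angle between segment pq and the image x-axis, in [0, 90]."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    if dx == 0 and dy == 0:
        raise ValueError("degenerate segment")
    angle = abs(math.degrees(math.atan2(dy, dx)))  # in [0, 180]
    return min(angle, 180.0 - angle)


def auxiliary_loss(answer_loss: float,
                   pred_horizon_y: float,
                   true_horizon_y: float,
                   image_height: float,
                   weight: float = 0.5) -> float:
    """Answer loss plus a weighted squared error on the horizon estimate.

    The error is normalized by image height so the penalty does not
    depend on resolution. Both the form and the weight are assumptions.
    """
    err = (pred_horizon_y - true_horizon_y) / image_height
    return answer_loss + weight * err * err


# Example stimulus: two rails converging toward the top of a 100x100 image.
left_rail = ((0.0, 100.0), (40.0, 20.0))
right_rail = ((100.0, 100.0), (60.0, 20.0))
vp = vanishing_point(left_rail, right_rail)

# A geometrically horizontal segment versus one that drifts with the scene.
flat = tilt_deg((10.0, 50.0), (90.0, 50.0))
tilted = tilt_deg((10.0, 50.0), (90.0, 58.0))
```

In this sketch the y-coordinate of `vp` serves as the ground-truth horizon for the auxiliary term, and `tilt_deg` gives a simple geometric check on whether a model's "horizontal" line is actually horizontal. A real training setup would replace the scalar losses with differentiable tensors in the chosen framework.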
If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:
@misc{gpt-5-from-apraxia-to-2025,
author = {GPT-5},
title = {From Apraxia to Alignment: Neuropsychological Benchmarks and Bias-Repair for VLM Spatial Reasoning},
year = {2025},
url = {https://hypogenic.ai/ideahub/idea/K4zauh75XxPQKZVNO77r}
}