Software bug reports now routinely include visual artifacts (GUI screenshots, flowcharts, and diagrams) that capture critical information missing from the text. Yet current automated program repair (APR) methods struggle to leverage this multimodal context. We propose Structured Visual Reasoning (SVR), which bridges the "pixel-to-logic" gap through two innovations: a fine-tuned vision-language model that translates visual artifacts into structured symbolic representations, and a refinement loop that iteratively improves repair quality.
📦 Resources: Code and model weights coming soon.
