Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models
📄 ACL 2025 Findings Paper — Math2Visual
📘 Annotated Visual Language and Visual Dataset
🤖 Visual Language Generation Model
In this project, we present Math2Visual, an automatic framework for generating pedagogically meaningful visuals from math word problem text descriptions. Math2Visual leverages a pre-defined visual language and a design space grounded in interviews with math teachers to illustrate the core mathematical relationships in math word problems. Using Math2Visual, we construct an annotated dataset of 1,903 visuals and evaluate Text-to-Image (TTI) models for their ability to generate visuals that align with our design. We further fine-tune several TTI models with our dataset, demonstrating improvements in educational visual generation. Our work establishes a new benchmark for automated generation of pedagogically meaningful visuals and offers insights into key challenges in producing multimodal educational content, such as the misrepresentation of mathematical relationships and the omission of essential visual elements.
We have released the full dataset on Hugging Face, including:
- Annotated visual language with corresponding math word problems
- Generated formal and intuitive visuals in both `.svg` and `.png` formats
👉 Browse the dataset on Hugging Face
You can preview images and download files directly from the Hugging Face web interface.
```shell
git clone https://github.com/eth-lre/math2visual.git
conda create -n math2visual python=3.12.4
conda activate math2visual
cd math2visual
pip install -r requirements_a.txt
pip install -r requirements_b.txt
touch .env
echo "OPENAI_API_KEY=<your_openai_key>" >> .env
```

Download our model adapter on Hugging Face.
Place the `adapter_model.safetensors` file into `model/check-point/`.
Download the base model meta-llama/Llama-3.1-8B from Hugging Face.
Place the downloaded folder into `model/base_model/`.
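Before running the generation scripts, it can help to confirm that the `.env` file was written correctly. The small parser below is an illustrative sketch assuming a simple `KEY=value` format; the project's scripts may load the key differently (e.g. via a dotenv library):

```python
# Sketch: read a KEY=value entry from a .env file to confirm setup.
# Hypothetical helper for verification only; not part of the repo.
def read_env_key(path=".env", key="OPENAI_API_KEY"):
    """Return the value for `key` in a simple .env file, or None if absent."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith(key + "="):
                return line.split("=", 1)[1]
    return None
```

Running `read_env_key()` from the repository root should return the key you appended with `echo`; `None` means the `.env` step did not take effect.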
Replace the `mwp` and `formula` fields with your own math word problem content in `generate_visual_language_with_our_model.py` (around line 102). Then run:

```shell
python3 generate_visual_language_with_our_model.py
```

It will print out the generated visual language and save it in `/output_visual_language/visual_langauge.txt`.
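As an illustration of what these two fields might look like, here is a hypothetical example; the problem text and formula below are made up for demonstration and are not taken from the paper's dataset:

```python
# Illustrative values for the 'mwp' and 'formula' fields edited in
# generate_visual_language_with_our_model.py (around line 102).
# Both values below are invented examples, not dataset entries.
mwp = (
    "Lisa has 3 apples. Tom gives her 4 more apples. "
    "How many apples does Lisa have now?"
)
formula = "3+4=7"
```

The `mwp` field holds the full problem text, and `formula` holds the arithmetic expression that the visual should depict.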
Replace the `mwp` and `formula` fields with your own math word problem content in `generate_visual_language_with_gpt.py` (around line 196). Then run:

```shell
python3 generate_visual_language_with_gpt.py
```

It will print out the generated visual language and save it in `/output_visual_language/visual_langauge.txt`.
Replace the `visual_language` field with your own generated visual language in `generate_visual_formal.py` (around line 1406). Then run:

```shell
python3 generate_visual_formal.py
```

It will generate the visual and save it in `/output_visual_formal/01.svg`.
Replace the `visual_language` field with your own generated visual language in `generate_visual_intuitive.py` (around line 4263). Then run:

```shell
python3 generate_visual_intuitive.py
```

It will generate the visual and save it in `/output_visual_intuitive/01.svg`.
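If you want a quick sanity check that a generation step produced a usable output file, a minimal sketch using only the standard library is shown below; the helper name `is_valid_svg` is our own and not part of the project's scripts:

```python
# Sanity-check a generated visual: confirm the output file is
# well-formed XML with an <svg> root element. This is a standalone
# helper sketch, not part of the Math2Visual codebase.
import xml.etree.ElementTree as ET

def is_valid_svg(path):
    """Return True if the file at `path` parses as XML with an svg root."""
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        return False
    # SVG root tags are usually namespaced, e.g. "{http://www.w3.org/2000/svg}svg"
    return root.tag.endswith("svg")
```

For example, after running the formal pipeline you could call `is_valid_svg("output_visual_formal/01.svg")` before feeding the file into downstream rendering or evaluation.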
```bibtex
@inproceedings{wang2025math2visual,
  title={Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models},
  author={Wang, Junling and Rutkiewicz, Anna and Wang, April Yi and Sachan, Mrinmaya},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2025},
  year={2025},
  url={https://arxiv.org/abs/2506.03735}
}
```
This work is licensed under the Apache License 2.0.
For research inquiries, please contact: Junling Wang — wangjun [at] ethz [dot] ch