neulab/SWE-Playground

Training Versatile Coding Agents in Synthetic Environments

Yiqi Zhu1, Apurva Gandhi2, Graham Neubig2

1Tsinghua University, 2Carnegie Mellon University

📃 Paper · 🤗 Data & Models · 🌐 Project Page


We present SWE-Playground, a fully automated pipeline that synthesizes tasks and verifiable unit tests from scratch for training versatile coding agents. Previous environments for training SWE agents typically rely on pre-existing GitHub repositories and focus predominantly on issue-resolution tasks. SWE-Playground addresses these limitations by proposing tasks from scratch, offering flexibility and extensibility when constructing training instances, and supporting the training of versatile coding agents that benefit mutually from diverse types of trajectories.

Overview

The SWE-Playground pipeline starts with an LLM proposing a project, which is then decomposed into step-by-step tasks and initialized as a repository. Two separate agents then sequentially generate unit tests and implement functionality to complete the development. Finally, to curate data tailored to specific benchmarks, a task-specific generation stage adapts the pipeline to diverse task types, including issue resolution (SWE-bench), issue reproduction (SWT-Bench), and library generation from scratch (Commit-0).


Results demonstrate that SWE-Playground trains agents to achieve strong performance across various benchmarks using a significantly smaller dataset, and that trajectories from different task types mutually enhance performance on a single benchmark. Further analysis substantiates the high data efficiency of our approach: our trajectories contain dense training signal and exhibit an execution-based software development paradigm.


🔧 Setup

1. Environment Installation

```shell
# Create and activate the Conda environment
conda create -n swe-play python=3.12
conda activate swe-play

# Install the package in editable mode
pip install -e .
```

2. OpenHands Configuration

This repository leverages OpenHands in headless mode. Please follow the OpenHands Development setup instructions.

3. Environment Variables Setup

Configure the required environment variables to connect to your LLM provider and the OpenHands runtime:

```shell
export OPENAI_API_KEY="your_api_key"
export OPENAI_BASE_URL="your_api_endpoint"
export OPENHANDS_CONFIG_PATH="path/to/openhands/config.toml"
```
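As an optional sanity check (not part of the pipeline itself), the following loop reports whether each of the three variables is visible to child processes:

```shell
# Report which required environment variables are set
for v in OPENAI_API_KEY OPENAI_BASE_URL OPENHANDS_CONFIG_PATH; do
  if [ -n "$(printenv "$v")" ]; then echo "$v is set"; else echo "$v is MISSING"; fi
done
```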

🚀 Quickstart

Generating Projects

To launch the proposal pipeline and generate a new coding project with associated step-by-step tasks:

```shell
python -m swe_play.propose.pipeline --model claude-sonnet-4-20250514 --output generated
```

This will initialize the project repository and save the project description and task breakdown in the generated/ directory.
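You can then inspect what the proposal stage produced; the exact file layout depends on your run, so treat this as a sketch and consult PROPOSE_README.md for the authoritative structure:

```shell
# List files the proposal stage wrote under generated/
find generated -maxdepth 2 -type f | sort | head -n 20
```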

For detailed documentation on the propose pipeline, please refer to PROPOSE_README.md

Running Rollouts

To collect trajectories from agents working on the generated tasks:

```shell
# Use absolute paths here
python -m swe_play.rollout.rollout --repo-path /path/to/your/project/repo --runtime-folder /path/to/your/runtime
```

We also support generating trajectories tailored to specific benchmarks, including issue resolution (SWE-bench), issue reproduction (SWT-Bench), and library generation from scratch (Commit-0). Enable these by appending the corresponding flags:

```shell
python -m swe_play.rollout.rollout --repo-path /path/to/your/project/repo --runtime-folder /path/to/your/runtime --swe --swt --commit0
```
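If you have generated several projects, a simple loop can roll them out in turn. This is a sketch: the projects/ layout and runtime path are placeholders, and realpath is used to satisfy the absolute-path requirement noted above:

```shell
# Roll out every repo under projects/, resolving each to an absolute path
for repo in projects/*/; do
  [ -d "$repo" ] || continue   # skip if the glob matched nothing
  abs=$(realpath "$repo")
  python -m swe_play.rollout.rollout --repo-path "$abs" --runtime-folder /path/to/your/runtime
done
```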

Our pipeline is designed for easy adaptation to new tasks and benchmarks. You are encouraged to implement custom adapters for specific tasks, benchmarks and use cases.

For detailed documentation on the rollout pipeline, please refer to ROLLOUT_README.md


🔥 Training and Evaluation

Training

Dataset: Our collected and filtered trajectories are available on Hugging Face.

Training Scripts: Scripts for training versatile coding agents are located in the train/ directory. These scripts are adapted from the official R2E-Gym repository.

Prerequisites

  1. Install LLaMA-Factory: Follow the official instructions to install LLaMA-Factory, which is required for our training.
  2. Sequence Parallelism (Optional): For sequence parallelism support, please refer to 360-LLaMA-Factory for setup.

Configuration & Execution

Before running the scripts, update the base model and dataset paths in the relevant config files under the train/ directory.
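For instance, the base-model path can be updated in place with sed. The key name model_name_or_path follows common LLaMA-Factory configs and the model path is a placeholder; verify both against the actual files under train/:

```shell
# Point the training config at your local base model (path is a placeholder)
sed -i 's|^model_name_or_path:.*|model_name_or_path: /path/to/base/model|' train/train_sweplay_raw.yaml
```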

Note: When using the Qwen base model, you must first modify the configuration to extend the maximum context length to 128k. Please refer to the official Qwen documentation for details.

To start training, run:

```shell
llamafactory-cli train train/train_sweplay_raw.yaml
```

Evaluation

Model Serving: We recommend serving our model using vLLM. Please refer to the official documentation for installation and setup.
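A minimal serving command, assuming vLLM is installed and using a placeholder checkpoint path, might look like:

```shell
# Serve the checkpoint behind an OpenAI-compatible API on port 8000;
# --max-model-len matches the 128k context used in training
vllm serve /path/to/trained/model --port 8000 --max-model-len 131072
```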

Evaluation: For evaluation across various coding benchmarks, please refer to the official implementation of VersaBench. This repository provides easy-to-use entries and scripts for evaluating diverse benchmarks, even extending beyond the coding domain.

Note: We use slightly modified prompts for these benchmarks. To reproduce our results, please substitute the original prompts with the versions provided in the evaluation_prompts/ directory.


📚 Citation

@misc{zhu2025trainingversatilecodingagents,
      title={Training Versatile Coding Agents in Synthetic Environments}, 
      author={Yiqi Zhu and Apurva Gandhi and Graham Neubig},
      year={2025},
      eprint={2512.12216},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2512.12216}, 
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

About

Official Repository for "Training Versatile Coding Agents in Synthetic Environments"
