⚔️ OpenHands PR Arena ⚔️ is a platform for evaluating and benchmarking agentic coding assistants through paired pull request (PR) generations.

neulab/pr-arena

⚔️ OpenHands PR Arena ⚔️

👐 We welcome your feedback. Feel free to fill out the Google Form, send an email, or open an issue on this repository. 👐

OpenHands PR Arena is a platform for evaluating and benchmarking agentic coding assistants through paired pull request (PR) generations. PR Arena enables developers to compare multiple LLMs in real-world issue resolution by presenting side-by-side pull requests and allowing users to select the better fix.

Follow the instructions below to set up the Arena for the OpenHands resolver.

Demo

Maintainer

X (formerly Twitter) Follow GitHub Website

7 LLMs ready to enter the Arena!

  • Claude Sonnet 4
  • DeepSeek R1
  • GPT-4.1
  • Gemini 2.5 Pro
  • Qwen3 Coder 480B
  • DeepSeek V3.1
  • GPT-5 Mini

How to Get Started with the OpenHands PR Arena GitHub App

How to use

  1. Install OpenHands PR Arena on your GitHub repository
    • Go to the installation page
    • Under Repository access, select the repositories you want to install the app on

Once you've installed the GitHub App ...

🎉 You’re all set. Let’s start fixing your GitHub issues!

  1. Open the repository where the GitHub App was installed (i.e., where you’d like to resolve issues).
  2. Label an issue with pr-arena to trigger the automated fix:
    • Open or create an issue, click Labels in the sidebar, and select pr-arena
  3. Wait for the agent to resolve the issue and open the Arena (this may take 10-20 minutes)
  4. Click the link in the comment to enter the Arena and choose your preferred model
  5. The selected fix will be automatically submitted as a Pull Request
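
For readers who prefer scripting step 2 over clicking through the UI, the following is a minimal sketch that builds the GitHub REST API request for adding the pr-arena label to an issue. It assumes the App reacts to labels added through the API the same way it reacts to labels added in the sidebar, which this README does not confirm. The owner, repository, issue number, and token are placeholders, and `build_add_label_request` is a hypothetical helper name.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"
ARENA_LABEL = "pr-arena"  # label name taken from the steps above


def build_add_label_request(
    owner: str, repo: str, issue_number: int, token: str
) -> urllib.request.Request:
    """Build (but do not send) the request that adds the pr-arena label."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{issue_number}/labels"
    body = json.dumps({"labels": [ARENA_LABEL]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage; sending is left to the caller:
#   req = build_add_label_request("my-org", "my-repo", 42, "<token>")
#   urllib.request.urlopen(req)
```

The request is only constructed here so it can be inspected before anything is sent to GitHub.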

⭐️ Please watch the guideline video that explains how to use the OpenHands PR Arena GitHub App!

Arena Lifecycle

  • Progress is continuously updated via comments on the issue — keep an eye on them!
  • The Arena will automatically close 60 minutes after the label is applied, but you can still view fixes and vote.
  • For guidance on locally testing proposed fixes and viewing Arena results after closure, see ARENA_GUIDE.md.
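
The 60-minute window above can be sketched as simple date arithmetic. This is an illustration only: the function names are hypothetical, and treating the Arena as closed at exactly the 60-minute mark is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone

ARENA_WINDOW = timedelta(minutes=60)  # from the lifecycle note above


def arena_close_time(labeled_at: datetime) -> datetime:
    """Return the moment the Arena is expected to close."""
    return labeled_at + ARENA_WINDOW


def is_arena_open(labeled_at: datetime, now: datetime) -> bool:
    """True while `now` falls inside the window after labeling.

    Assumption: the boundary itself (exactly 60 minutes) counts as closed.
    """
    return labeled_at <= now < arena_close_time(labeled_at)


# Example with a made-up labeling time:
labeled = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(arena_close_time(labeled).isoformat())  # 2025-01-01T13:00:00+00:00
```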

Privacy Notification

  1. The only code we collect is the git_diff and traces generated during issue resolution. We never access or store the entire codebase, access GitHub secrets, or release any user data.
  2. Important: Installing this App will automatically add a workflow file named pr-arena-workflow.yml to your repository. This file redirects to the actual resolver workflow located here. If you are concerned about repository workflows, we encourage you to review the resolver workflow to understand the operations it performs.
  3. Do not modify the injected workflow. Any modifications will prevent it from being triggered.
  4. Please install and use this app only on repositories where you consent to having code snippets (i.e., git_diff) processed by the LLM provider.
  5. The following metadata is collected for research purposes:
    • User info: owner, repo, repo URL
    • Model info: user preference on model, duration of an attempt
    • Code info: agent code (git_diffs), commit hash, repository language composition
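
To make the collected fields easier to picture, here is a hypothetical record type that mirrors the three groups listed above. The class name, field names, and types are assumptions made for illustration; the actual storage schema is not described in this README.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ArenaRecord:
    """Hypothetical shape of one collected record (not the real schema)."""

    # User info
    owner: str
    repo: str
    repo_url: str
    # Model info
    preferred_model: str
    attempt_duration_seconds: float
    # Code info
    git_diffs: list[str]
    commit_hash: str
    language_composition: dict[str, float] = field(default_factory=dict)


# Made-up example values:
record = ArenaRecord(
    owner="my-org",
    repo="my-repo",
    repo_url="https://github.com/my-org/my-repo",
    preferred_model="model-a",
    attempt_duration_seconds=600.0,
    git_diffs=["diff --git a/app.py b/app.py"],
    commit_hash="abc123",
    language_composition={"Python": 1.0},
)
print(sorted(asdict(record)))
```

Note that nothing in this sketch holds the full codebase or any secrets, in line with item 1 above.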

Q&A

Can I use the App in my forked repository?

✅ Yes — you can install and use OpenHands PR Arena in a forked repository. ⚠️ Note: GitHub disables Issues on forks by default. To enable them:

  1. Go to your forked repository.
  2. Navigate to Settings → General.
  3. Scroll down to Features.
  4. Check the box for Issues.
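
The same setting can also be changed through the GitHub REST API by updating the repository with `has_issues` set to true. The sketch below only builds the request; the owner, repository, and token are placeholders, `build_enable_issues_request` is a hypothetical helper name, and the token is assumed to have sufficient rights to change repository settings.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_enable_issues_request(
    owner: str, repo: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a request that turns Issues on for a repo."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}"
    body = json.dumps({"has_issues": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage:
#   req = build_enable_issues_request("my-user", "my-fork", "<token>")
#   urllib.request.urlopen(req)
```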

How can I track the progress?

The agent will automatically comment on the issue at each stage of the process:

  • 👐 OpenHands PR-Arena has started the task: [click here for details]. For more info about how to use OpenHands PR-Arena, [click this link].
    • Step 1. OpenHands begins resolving the issue. Please wait 10-20 minutes for the next comment.
  • ⚔️PR-Arena is now open⚔️! You can view the proposed fixes and make a decision at [this link].
    • Step 2. The Arena is open. Click the link to review both fixes and choose your preferred one.
  • PR has been created based on the fix you've selected. Please review the changes.
    • Step 3. A pull request has been created. You can now review and merge it.
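
If you want to track these stages from a script, one option is to match the comment bodies against phrases from the three messages above. This is a rough sketch: the marker phrases are copied from the comments quoted in this section and may change, and `classify_comment` and `latest_stage` are hypothetical helper names.

```python
from typing import Iterable, Optional

# Phrase fragments taken from the three comments listed above.
STAGE_MARKERS = [
    ("has started the task", 1),
    ("is now open", 2),
    ("PR has been created", 3),
]


def classify_comment(body: str) -> Optional[int]:
    """Return the step number a comment corresponds to, or None."""
    for marker, step in STAGE_MARKERS:
        if marker in body:
            return step
    return None


def latest_stage(comment_bodies: Iterable[str]) -> Optional[int]:
    """Return the furthest step seen across all comments on an issue."""
    steps = [s for s in map(classify_comment, comment_bodies) if s is not None]
    return max(steps) if steps else None
```

The comment bodies themselves would come from the issue's comment thread, for example by listing the issue comments through the GitHub API.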

What happens if an error occurs?

If an error occurs, the agent will comment on the issue with an appropriate message. You can retry by removing the pr-arena label, waiting 5 seconds, and adding it again.
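
The retry procedure (remove the label, wait 5 seconds, add it again) can be sketched with the GitHub REST API as follows. It assumes that label changes made through the API trigger the App the same way as changes made in the UI, which this README does not confirm. All helper names are hypothetical, and `send` and `sleep` are injectable so the sequence can be dry-run without network access.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable

GITHUB_API = "https://api.github.com"
ARENA_LABEL = "pr-arena"


def _headers(token: str) -> dict:
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }


def build_remove_label_request(owner, repo, issue_number, token):
    """DELETE the pr-arena label from the issue."""
    label = urllib.parse.quote(ARENA_LABEL, safe="")
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{issue_number}/labels/{label}"
    return urllib.request.Request(url, method="DELETE", headers=_headers(token))


def build_add_label_request(owner, repo, issue_number, token):
    """POST the pr-arena label back onto the issue."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{issue_number}/labels"
    body = json.dumps({"labels": [ARENA_LABEL]}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST", headers=_headers(token)
    )


def retrigger_arena(
    owner: str,
    repo: str,
    issue_number: int,
    token: str,
    send: Callable = urllib.request.urlopen,
    sleep: Callable[[float], None] = time.sleep,
    wait_seconds: float = 5.0,  # the 5-second pause described above
) -> None:
    """Remove the label, pause, then add it again."""
    send(build_remove_label_request(owner, repo, issue_number, token))
    sleep(wait_seconds)
    send(build_add_label_request(owner, repo, issue_number, token))
```

Passing a recording function as `send` (and a no-op as `sleep`) lets you check the order of operations before running it against a real repository.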

How long does the process take?

Processing time depends on the complexity of the issue, and some models may take longer than others. Typically, the process takes less than 30 minutes, so please be patient.

How does this affect my GitHub Actions build minutes?

The workflow makes API calls to our backend infrastructure, where OpenHands agents run remotely. Your GitHub Actions runner only handles lightweight tasks such as triggering the workflow and creating pull requests. The actual AI processing and code generation happen on our servers, so the workflow consumes minimal GitHub Actions minutes (typically just a few minutes per issue).

Security & Permissions

This GitHub App requires the following permissions:

  • Read & Write access to Issues and Pull Requests — to analyze issues and generate PRs
  • Workflow execution — to trigger automated fixes via GitHub Actions
  • Access to repository contents — to apply code changes and submit pull requests

No user secrets or sensitive information are stored in your repository. All sensitive operations are securely handled through our backend infrastructure.

Support and Acknowledgment

If you run into any problems, please open an issue on this GitHub repository; we're happy to help! Alternatively, you can email us or join the OpenHands Slack workspace and ask there.

This project is built upon OpenHands GitHub Backlog Resolver and inspired by Copilot Arena, an open-source AI coding assistant that provides paired autocomplete completions from different LLMs.

Powered by OpenHands
