Sandbox Agent

The Sandbox Agent is a specialized AI assistant for working with the .NET MAUI Sandbox app to test, validate, and experiment with MAUI features through automated deployment and testing.

How to Use This Agent

VS Code: Custom Agents in VS Code
Copilot CLI: GitHub Copilot CLI

Export Your Chat Sessions

If you use any of our agents or just Copilot in general to work on an issue:

End with a message summarizing your experience
Export the chat session (how-to)
Attach the JSON to your PR

What It Does

The Sandbox Agent:

✅ Sets up test scenarios in the Sandbox app
✅ Deploys to iOS/Android simulators and emulators
✅ Runs automated Appium tests to reproduce issues
✅ Validates PR fixes work correctly
✅ Reproduces reported issues
✅ Iteratively fixes issues - once reproduction is automated, keeps working until fix is validated
✅ Converts Sandbox scenarios to UI tests when ready
✅ Captures device logs and screenshots

When to Use

Use the Sandbox Agent when:

You want to manually verify a PR fix works on device/simulator
You need to reproduce an issue hands-on
You want to experiment with a MAUI feature
You need functional validation beyond code review
You want to iteratively fix an issue - reproduce → fix → test → repeat until solved
You're ready to convert a working Sandbox scenario into proper UI tests

Note: The Sandbox Agent focuses on functional testing, not code review. For reviewing code quality, use the PR Reviewer instead.

Example Prompts

test PR #12345 on Android

test this PR on iOS

validate PR #12345 on both Android and iOS

reproduce issue #12345 in Sandbox

try to reproduce issue #12345 on Android

test this PR on iPhone 15

test PR #12345 on iOS 18.5

verify that PR #12345 actually fixes issue #12000

set up a test in Sandbox for CollectionView with 1000 items

create a Sandbox test that demonstrates Grid layout with SafeArea

reproduce issue #12345 - the bug happens when you tap the button twice quickly

test PR #12345 and verify that:
1. Button click works
2. Label updates correctly
3. No crashes occur

reproduce issue #12345 with Appium automation, then work on fixing it until the test passes

The fix works! Now move this Sandbox scenario to proper UI tests.

What to Expect

When you invoke the Sandbox Agent, it will:

Understand the issue - Reviews PR/issue details
Create test scenario - Modifies Sandbox MainPage.xaml[.cs] with reproduction code
Set up Appium test - Creates automated test script
Build and deploy - Uses BuildAndRunSandbox.ps1 to deploy to device
Run validation - Executes Appium test and captures results
Provide report - Summarizes findings with logs and screenshots

The test output includes:

Test Summary: What was tested and results
Validation Results: Pass/fail with details
Device Logs: Relevant log excerpts
Screenshots: Visual confirmation (optional)
Verdict: Clear assessment of whether fix works

Test Workflow

The Sandbox Agent follows this workflow:

1. Modify Sandbox app → 2. Create Appium test → 3. Deploy to device → 4. Run test → 5. Report results

Files Modified

src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml - UI for test scenario
src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml.cs - Test logic
CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs - Appium test script (auto-generated)

Logs Captured

CustomAgentLogsTmp/Sandbox/android-device.log or ios-device.log - Device logs
CustomAgentLogsTmp/Sandbox/build-run-output.log - Build and deployment logs
CustomAgentLogsTmp/Sandbox/appium.log - Appium test execution logs
CustomAgentLogsTmp/Sandbox/*.png - Screenshots (if test captures them)

Tips for Best Results

Link the PR or Issue

Instead of:

test the CollectionView fix

Try:

test PR #12345 which fixes CollectionView crash on item removal

Specify Platform When Relevant

test PR #12345 on Android

This is more efficient than testing every platform when the change is platform-specific.

Provide Reproduction Context

reproduce issue #12345 - the bug happens when you tap the button twice quickly

Helps the agent create the right test scenario.

Request Specific Validation

test PR #12345 and verify that:
1. Button click works
2. Label updates correctly
3. No crashes occur

Gives clear success criteria.

Common Use Cases

Iterative Issue Fixing (Recommended Workflow)

This is the most powerful way to use the Sandbox Agent - set up automated reproduction, then work with Copilot to fix the issue iteratively.

Step 1: Set up automated reproduction

reproduce issue #12345 in Sandbox with Appium automation

What happens:

Agent creates MainPage with reproduction scenario
Agent creates Appium test that demonstrates the bug
Agent runs test and confirms: "Bug reproduced - test fails as expected"

Step 2: Iteratively fix the issue

Now work on fixing issue #12345. Keep testing after each change until the Appium test passes.

What happens:

Copilot analyzes the bug and proposes a fix
Modifies MAUI framework code (not Sandbox)
Reruns BuildAndRunSandbox.ps1 to test
If test still fails → analyzes why → tries different approach
Repeats until Appium test passes

Step 3: Convert to UI tests

The fix works! Now move this Sandbox scenario to proper UI tests.

What happens:

Agent creates TestCases.HostApp/Issues/Issue12345.xaml[.cs]
Agent creates TestCases.Shared.Tests/Tests/Issues/Issue12345.cs
Copies working test logic from Sandbox
Cleans up Sandbox

Why this workflow works:

✅ Fast feedback loop - Appium test validates each attempt
✅ Objective validation - Not guessing if fix works, test proves it
✅ Incremental progress - Each iteration gets closer to solution
✅ Smooth transition - Working test becomes regression test

Example conversation:

You: reproduce issue #12345 - CollectionView crashes when removing last item

Agent: [Sets up reproduction, runs test]
"Bug reproduced successfully. Appium test demonstrates crash on item removal."

You: Now fix this issue. Keep testing until it works.

Agent: [Iteration 1]
"Added null check in CollectionViewHandler. Testing..."
"Test still fails - crash occurs before null check. Analyzing..."

Agent: [Iteration 2]
"Modified item removal sequence to update adapter first. Testing..."
"Test passes! No crash observed. Fix validated."

You: Great! Now move this to UI tests.

Agent: [Creates Issue12345.xaml and Issue12345.cs in proper locations]
"UI test created. Sandbox cleaned up. Ready for PR."

Validate a Bug Fix

verify PR #12345 fixes the crash reported in issue #12000

Expected flow:

Agent creates reproduction scenario
Tests WITH the fix
Optionally tests WITHOUT the fix to confirm bug exists
Reports whether fix resolves the issue

Quick Manual Testing

deploy PR #12345 to Android so I can test it manually

Expected flow:

Agent sets up basic test scenario
Deploys to device
Leaves app running for manual exploration
Provides instructions for manual validation

Pre-Submit Validation

test my changes on iOS before I submit a PR

Expected flow:

Tests current branch changes
Validates functionality
Confirms no obvious issues
Gives green light or flags concerns

Reproduce Community-Reported Bugs

reproduce issue #12345 to verify it still happens on main branch

Expected flow:

Reads issue description
Creates reproduction scenario
Tests on main branch
Confirms whether bug is reproducible

Platform Selection

The agent automatically selects platforms based on:

PR title tags - [Android], [iOS], etc.
Modified file paths - Platform-specific code paths
Issue description - Mentioned platforms
Code changes - Cross-platform vs. platform-specific

Default: Tests on Android only (faster) unless PR affects iOS-specific code or cross-platform controls.

You can override by specifying:

test PR #12345 on iOS

Understanding Test Results

✅ Success

✅ FIX VALIDATED - Test scenario completes successfully, expected behavior observed

Meaning: PR fix works as expected, no issues found.

⚠️ Partial Success

⚠️ PARTIAL - Fix appears to work but noticed a minor animation glitch

Meaning: Fix mostly works but there are concerns worth noting.

❌ Issues Found

❌ ISSUES FOUND - App crashes when tapping button after navigation

Meaning: Test revealed problems with the fix.

🚫 Cannot Test

🚫 CANNOT TEST - Build failed due to missing dependency

Meaning: Unable to complete testing due to technical issues.

Troubleshooting

Build Failures

If the agent reports build failures:

the build failed - can you check what went wrong?

The agent will analyze build logs and suggest fixes.

Test Can't Find Elements

If Appium can't locate UI elements:

the test can't find "TestButton" - can you check the AutomationIds?

The agent will verify and fix AutomationId mismatches.

App Crashes

If the app crashes during testing:

the app crashed - what does the log say?

The agent will analyze crash logs and identify the root cause.

Manual Validation

After automated testing, you can manually validate by:

Simulator stays running - The app remains deployed
Navigate to Sandbox - Find the app on the simulator
Test manually - Interact with the test scenario
Review logs - Check CustomAgentLogsTmp/Sandbox/ for captured logs

The Sandbox Agent leaves the environment ready for hands-on exploration.

Advanced Usage

Test Multiple Scenarios

test PR #12345 with these scenarios:
1. Tap button once
2. Tap button rapidly 10 times
3. Navigate away and back, then tap

Creates comprehensive test coverage.

Capture Specific Metrics

test PR #12345 and measure the Grid layout dimensions

Uses Appium to capture element properties.

Compare Branches

test this PR on Android, then test main branch to compare behavior

Shows before/after comparison.

Best Practices

✅ Test platform-specific changes on that platform - Don't test iOS changes on Android
✅ Start with one platform - Test Android first (faster), then iOS if needed
✅ Read issue reproduction steps - Use them as your test scenario when available
✅ Validate incrementally - Test small changes frequently rather than large batches
✅ Keep Sandbox simple - Focus on the specific bug or feature, don't create complex scenarios

Cleanup

The Sandbox Agent leaves your repository in a ready state:

Sandbox app contains your test scenario
Logs are captured in CustomAgentLogsTmp/Sandbox/
Device remains booted with app deployed

To clean up:

git checkout -- src/Controls/samples/Controls.Sample.Sandbox/
rm -rf CustomAgentLogsTmp/Sandbox/

Only clean up when you're done with this test cycle and want to start fresh.

How the Sandbox Agent Uses the RunWithAppiumTest Script to Run Tests

Overview: The Complete Workflow

The sandbox-agent uses a multi-step automated workflow to test MAUI apps. Here's the complete flow:

flowchart TD
    A[1. Agent Updates MainPage.xaml] --> B[2. Agent Creates RunWithAppiumTest.cs]
    B --> C[3. Run BuildAndRunSandbox.ps1]
    C --> D[4. Script Orchestrates Everything]
    D --> E[Build & Deploy]
    D --> F[Start Appium]
    D --> G[Run Test]
    D --> H[Capture Logs]

Detailed Steps:

1. Agent Updates MainPage.xaml[.cs]
- Creates test scenario (UI elements, event handlers)
- Adds AutomationIds to elements for Appium to find

2. Agent Creates RunWithAppiumTest.cs
- Copies from template
- Updates to match MainPage AutomationIds
- Adds test logic (tap buttons, verify labels, etc.)
- Saves to: CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs

3. User/Agent Runs BuildAndRunSandbox.ps1

pwsh .github/scripts/BuildAndRunSandbox.ps1 -Platform android

4. BuildAndRunSandbox.ps1 Orchestrates Everything
- Validates prerequisites
- Starts device/emulator
- Builds and deploys app
- Starts Appium server
- Runs Appium test
- Captures logs

Step-by-Step: How BuildAndRunSandbox.ps1 Works

Phase 1: Setup & Validation 🔧

Directory Structure:

CustomAgentLogsTmp/Sandbox/
├── RunWithAppiumTest.cs     # Agent creates this
├── appium.log               # Created by script
├── android-device.log       # Created by script
└── *.png                    # Screenshots from test

Validation Steps:

✅ Check if RunWithAppiumTest.cs exists
✅ Check if dotnet is available
✅ Check if Appium is installed

Phase 2: Device Management 📱

Android:

# 1. Find Android emulators
adb devices

# 2. Pick first available (or use specified UDID)
$DeviceUdid = "emulator-5554"

iOS:

# 1. Find iOS simulators
xcrun simctl list devices

# 2. Boot iPhone Xs if not running
xcrun simctl boot "AC8BCB28-..."

# 3. Return UDID
$DeviceUdid = "AC8BCB28-..."

Result: $DeviceUdid contains the target device
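
The Android selection step above ("pick first available, or use specified UDID") can be sketched in C# as follows. This is only an illustration of the idea, not code from BuildAndRunSandbox.ps1: ParseOnlineDevices and PickDevice are hypothetical names, and the sample text assumes the usual two-column output of adb devices.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of BuildAndRunSandbox.ps1): returns the serials
// that `adb devices` lists in the "device" (online) state.
static List<string> ParseOnlineDevices(string adbDevicesOutput)
{
    var serials = new List<string>();
    foreach (string line in adbDevicesOutput.Split('\n'))
    {
        // A device row is "<serial><whitespace><state>"; the header row and
        // blank lines do not match this shape and are skipped.
        string[] parts = line.Split(new[] { ' ', '\t', '\r' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 2 && parts[1] == "device")
            serials.Add(parts[0]);
    }
    return serials;
}

// Hypothetical helper: use the requested UDID if it is online,
// otherwise fall back to the first online device (or null if there is none).
static string? PickDevice(List<string> online, string? requestedUdid)
{
    if (!string.IsNullOrEmpty(requestedUdid))
        return online.Contains(requestedUdid) ? requestedUdid : null;
    return online.Count > 0 ? online[0] : null;
}

string sample = "List of devices attached\nemulator-5554\tdevice\nemulator-5556\toffline\n";
List<string> devices = ParseOnlineDevices(sample);
Console.WriteLine(PickDevice(devices, null) ?? "no device online"); // emulator-5554
```

Filtering on the "device" state keeps offline or unauthorized entries from being chosen as the target.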

Phase 3: Build & Deploy 🏗️

# Build
dotnet build Maui.Controls.Sample.Sandbox.csproj `
    -f net10.0-android `
    -c Debug

# Deploy - Android
adb -s $DeviceUdid install bin/Debug/.../*.apk

# Deploy - iOS
xcrun simctl install $DeviceUdid bin/Debug/.../*.app

Result: Sandbox app is installed on device

Phase 4: Appium Server 🔌

# Check if Appium already running
try {
    $response = Invoke-WebRequest "http://localhost:4723/status"
    Write-Host "✅ Already running"
}
catch {
    # Start Appium in background
    $appiumJob = Start-Job { 
        appium --log-level info > appium.log 
    }
    
    # Wait up to 30 seconds for it to be ready
    $waited = 0
    while ($waited -lt 30) {
        try {
            Invoke-WebRequest "http://localhost:4723/status" | Out-Null
            break  # Ready!
        }
        catch {
            Start-Sleep -Seconds 1
            $waited++
        }
    }
}

Result: Appium server running on http://localhost:4723
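
The same "poll until ready, give up after a fixed number of attempts" pattern can be sketched in C#. This is a minimal illustration rather than part of the script: WaitUntilReadyAsync is a hypothetical helper, and the readiness check is passed in as a delegate so the sketch runs without a live Appium server.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: calls `probe` until it returns true or the attempts run out.
// A probe that throws (for example, connection refused) counts as "not ready yet".
static async Task<bool> WaitUntilReadyAsync(Func<Task<bool>> probe, int maxAttempts, TimeSpan delay)
{
    for (int attempt = 1; attempt <= maxAttempts; attempt++)
    {
        try
        {
            if (await probe())
                return true;
        }
        catch (Exception)
        {
            // Server not reachable yet; fall through and retry.
        }

        if (attempt < maxAttempts)
            await Task.Delay(delay);
    }
    return false;
}

// Simulated server that becomes ready on the third probe.
// Against a real server, the probe could instead issue an HTTP GET to
// http://localhost:4723/status and return whether the status code is a success.
int calls = 0;
bool ready = await WaitUntilReadyAsync(
    () => Task.FromResult(++calls >= 3),
    maxAttempts: 30,
    delay: TimeSpan.FromMilliseconds(10));
Console.WriteLine($"ready={ready} after {calls} probes"); // ready=True after 3 probes
```

Bounding the number of attempts means a server that never starts surfaces as a clear failure instead of a hang.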

Phase 5: Run Appium Test 🎯 ← KEY PART

# Set environment variable for test script
$env:DEVICE_UDID = $DeviceUdid

# Change to Sandbox directory where RunWithAppiumTest.cs lives
cd CustomAgentLogsTmp/Sandbox/

# Run the C# script as a file-based app with dotnet run (.NET 10 SDK)
$appiumOutput = dotnet run RunWithAppiumTest.cs /p:NoWarn="CA1307;CS0162" 2>&1

# Display output
$appiumOutput | ForEach-Object { Write-Host $_ }

# Extract PID from output (Android only)
$pidLine = $appiumOutput | Select-String "SANDBOX_APP_PID=(\d+)"
if ($pidLine) {
    $sandboxPid = $pidLine.Matches.Groups[1].Value
}

What Happens Inside RunWithAppiumTest.cs

Script Structure

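#!/usr/bin/env dotnet run
#:package Appium.WebDriver@8.0.1
// ↑ Tells dotnet run to restore the Appium.WebDriver NuGet package

using System;
using OpenQA.Selenium;                 // By
using OpenQA.Selenium.Appium;
using OpenQA.Selenium.Appium.Android;
using OpenQA.Selenium.Appium.Enums;    // MobileCapabilityType

// NOTE: Android-only excerpt. WaitForElement and GetAppPidFromAdb are helper
// methods, not Appium.WebDriver APIs; their definitions are omitted here.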

// ========== 1. READ ENVIRONMENT ==========
var udid = Environment.GetEnvironmentVariable("DEVICE_UDID")
    ?? throw new InvalidOperationException("DEVICE_UDID is not set");
string PLATFORM = udid.Contains("-") && udid.Length > 20 ? "ios" : "android";

// ========== 2. CONFIGURE APPIUM ==========
// iOS configuration is omitted from this excerpt
var options = new AppiumOptions();
if (PLATFORM == "android") {
    options.PlatformName = "Android";
    options.AutomationName = "UIAutomator2";
    options.AddAdditionalAppiumOption("appium:appPackage", "com.microsoft.maui.sandbox");
    options.AddAdditionalAppiumOption("appium:noReset", true);  // ← CRITICAL for Android
    options.AddAdditionalAppiumOption(MobileCapabilityType.Udid, udid);
}

// ========== 3. CONNECT TO APPIUM & LAUNCH APP ==========
var serverUri = new Uri("http://localhost:4723");
AndroidDriver driver = new AndroidDriver(serverUri, options);

// ========== 4. GET APP PID (for logcat filtering) ==========
if (PLATFORM == "android") {
    var pid = GetAppPidFromAdb();
    Console.WriteLine($"SANDBOX_APP_PID={pid}");  // ← PowerShell captures this!
}

// ========== 5. VERIFY APP LAUNCHED ==========
driver.WaitForElement("InstructionLabel", TimeSpan.FromSeconds(30));
Console.WriteLine("✅ App launched successfully");

// ========== 6. RUN TEST LOGIC ==========
// Agent implements custom test logic here
driver.FindElement(By.Id("NavigateButton")).Click();
var labelText = driver.FindElement(By.Id("ResultLabel")).Text;
Console.WriteLine($"Label text: {labelText}");

// ========== 7. EXIT (app stays running) ==========
// No driver.Quit() - app remains open for manual validation

Key Implementation Details

Package Management

#!/usr/bin/env dotnet run
#:package Appium.WebDriver@8.0.1

Uses dotnet run on a single .cs file (file-based apps, available in the .NET 10 SDK) to run C# as a script
The #:package directive restores the listed NuGet package automatically
No need to create a full .csproj project!

Device Detection

var udid = Environment.GetEnvironmentVariable("DEVICE_UDID")
    ?? throw new InvalidOperationException("DEVICE_UDID is not set");
string PLATFORM = udid.Contains("-") && udid.Length > 20 ? "ios" : "android";

How it works:

iOS UDIDs: Long with hyphens (e.g., AC8BCB28-A72D-4A2D-90E7-E78FF0BA07EE)
Android device IDs: 20 characters or fewer, or no hyphens at all (e.g., emulator-5554, 192.168.1.100:5555)
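
As a minimal sketch, the heuristic above can be wrapped in a small function that also rejects a missing DEVICE_UDID. DetectPlatform is a hypothetical name used for illustration; it is not a function from the template.

```csharp
using System;

// Hypothetical helper: applies the documented heuristic and fails with a clear
// message when the UDID is missing, instead of a NullReferenceException.
static string DetectPlatform(string? udid)
{
    if (string.IsNullOrWhiteSpace(udid))
        throw new ArgumentException("DEVICE_UDID is empty or not set", nameof(udid));

    // iOS simulator UDIDs are GUID-like: hyphenated and longer than 20 characters.
    bool looksLikeIos = udid.Contains('-') && udid.Length > 20;
    return looksLikeIos ? "ios" : "android";
}

Console.WriteLine(DetectPlatform("AC8BCB28-A72D-4A2D-90E7-E78FF0BA07EE")); // ios
Console.WriteLine(DetectPlatform("emulator-5554"));                        // android
Console.WriteLine(DetectPlatform("192.168.1.100:5555"));                   // android
```

Because this is a heuristic, an Android serial that happens to be long and hyphenated would be classified as iOS, so an explicit platform argument is the safer choice when one is available.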

PID Capture (Android Only)

Inside RunWithAppiumTest.cs:

var pid = GetAppPidFromAdb();  // adb shell pidof com.microsoft.maui.sandbox
Console.WriteLine($"SANDBOX_APP_PID={pid}");

Inside BuildAndRunSandbox.ps1:

$pidLine = $appiumOutput | Select-String "SANDBOX_APP_PID=(\d+)"
$sandboxPid = $pidLine.Matches.Groups[1].Value

# Use PID to filter logcat
adb -s $DeviceUdid logcat -d --pid=$sandboxPid > android-device.log

Why this matters:

Without PID: Get ALL logcat output (thousands of lines from all apps)
With PID: Get only Sandbox app logs (clean, focused output)
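
The handoff relies on a simple convention: the test prints a marker line on stdout and the caller searches for it. The sketch below shows the parsing side in C#, mirroring the Select-String pattern above. ExtractSandboxPid is a hypothetical helper, and the sample output is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the PID from the first "SANDBOX_APP_PID=<digits>"
// marker in the captured output, or null when no marker is present.
static int? ExtractSandboxPid(string output)
{
    Match match = Regex.Match(output, @"SANDBOX_APP_PID=(\d+)");
    if (match.Success && int.TryParse(match.Groups[1].Value, out int pid))
        return pid;
    return null;
}

// Made-up sample of what the test script might print.
string sampleOutput = "SANDBOX_APP_PID=12345\nApp launched successfully\nLabel text: Done";
int? found = ExtractSandboxPid(sampleOutput);
Console.WriteLine(found.HasValue ? $"--pid={found}" : "PID marker not found");
// prints "--pid=12345"
```

Returning null rather than throwing lets the caller decide what to do when the marker is missing, for example when the app never launched.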

Why `appium:noReset` is CRITICAL for Android

options.AddAdditionalAppiumOption("appium:noReset", true);

Without `noReset`	With `noReset`
❌ Appium clears app data	✅ App data persists
❌ .NET MAUI Fast Deployment breaks	✅ Fast Deployment works
❌ Crash: "No assemblies found"	✅ App launches successfully

Fast Deployment uses a special directory (__override__) to deploy assemblies quickly. When Appium clears app data, this directory is deleted, causing the app to crash immediately.

Phase 6: Log Capture 📝

# Extract PID from Appium output
$sandboxPid = "12345"  # Extracted from "SANDBOX_APP_PID=12345"

# Dump Android logcat for that specific PID
if ($Platform -eq "android") {
    adb -s $DeviceUdid logcat -d --pid=$sandboxPid > android-device.log
}

# iOS logs captured via xcrun simctl
if ($Platform -eq "ios") {
    xcrun simctl spawn booted log stream --predicate 'processImagePath contains "Sandbox"' > ios-device.log
}

Result: All logs saved to CustomAgentLogsTmp/Sandbox/

Complete Flow Diagram

BuildAndRunSandbox.ps1
│
├─ 1. Validate prerequisites
│   └─ Check RunWithAppiumTest.cs exists ✅
│
├─ 2. Start device
│   └─ Start-Emulator.ps1 → returns UDID
│
├─ 3. Build & deploy app
│   └─ Build-AndDeploy.ps1 → installs app
│
├─ 4. Start Appium server
│   └─ appium --log-level info > appium.log
│
├─ 5. Run Appium test  ← MAIN ACTION
│   │
│   └─ dotnet run RunWithAppiumTest.cs
│       │
│       ├─ Reads $env:DEVICE_UDID
│       ├─ Connects to http://localhost:4723
│       ├─ Launches com.microsoft.maui.sandbox
│       ├─ Outputs "SANDBOX_APP_PID=12345"
│       ├─ Waits for UI element
│       ├─ Runs test logic (tap, verify, etc.)
│       └─ Exits (app stays running)
│
└─ 6. Capture logs
    ├─ Extract PID from output
    └─ adb logcat --pid=12345 > android-device.log

Summary Table

Component	Purpose	Key Details
BuildAndRunSandbox.ps1	Orchestrates entire workflow	Builds, deploys, starts Appium, runs test, captures logs
RunWithAppiumTest.cs	Appium test script	C# script that connects to Appium and tests the app
dotnet run (file-based app)	Script runner	Runs a single .cs file and restores NuGet packages declared with `#:package`
Appium Server	UI automation	Runs on port 4723, connects to device, controls app
PID Capture	Log filtering	Extracts app PID to filter logcat (Android only)
`appium:noReset`	Prevents data wipe	Critical for Android Fast Deployment to work

Key Innovation

Using dotnet run with #:package directive allows running C# scripts with NuGet dependencies without creating a full project!

This makes it easy for agents to:

Generate a single .cs file
Run it directly with dotnet run
Automatically install dependencies via #:package

No need for .csproj, dotnet restore, or complex project setup! 🎉

Files Involved

File	Purpose
`.github/scripts/BuildAndRunSandbox.ps1`	Main orchestration script
`.github/scripts/templates/RunWithAppiumTest.template.cs`	Template for Appium test
`CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs`	Generated Appium test
`CustomAgentLogsTmp/Sandbox/appium.log`	Appium server logs
`CustomAgentLogsTmp/Sandbox/android-device.log`	Android logcat output
`CustomAgentLogsTmp/Sandbox/ios-device.log`	iOS device logs
`CustomAgentLogsTmp/Sandbox/*.png`	Screenshots from test
`src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml`	Test UI

Next Steps

Want to run a test? Here's how:

# 1. Create your test scenario in MainPage.xaml
# 2. Create RunWithAppiumTest.cs in CustomAgentLogsTmp/Sandbox/
# 3. Run the script
pwsh .github/scripts/BuildAndRunSandbox.ps1 -Platform android

# 4. Check logs
cat CustomAgentLogsTmp/Sandbox/android-device.log

