
Sandbox Agent

Shane Neuville edited this page Dec 13, 2025 · 7 revisions

The Sandbox Agent is a specialized AI assistant for working with the .NET MAUI Sandbox app to test, validate, and experiment with MAUI features through automated deployment and testing.

How to Use This Agent


Export Your Chat Sessions

If you use any of our agents or just Copilot in general to work on an issue:

  1. End with a message summarizing your experience
  2. Export the chat session (how-to)
  3. Attach the JSON to your PR

What It Does

The Sandbox Agent:

  • ✅ Sets up test scenarios in the Sandbox app
  • ✅ Deploys to iOS/Android simulators and emulators
  • ✅ Runs automated Appium tests to reproduce issues
  • ✅ Validates PR fixes work correctly
  • ✅ Reproduces reported issues
  • ✅ Iteratively fixes issues - once reproduction is automated, it keeps working until the fix is validated
  • ✅ Converts Sandbox scenarios to UI tests when ready
  • ✅ Captures device logs and screenshots

When to Use

Use the Sandbox Agent when:

  • You want to manually verify a PR fix works on device/simulator
  • You need to reproduce an issue hands-on
  • You want to experiment with a MAUI feature
  • You need functional validation beyond code review
  • You want to iteratively fix an issue - reproduce → fix → test → repeat until solved
  • You're ready to convert a working Sandbox scenario into proper UI tests

Note: The Sandbox Agent focuses on functional testing, not code review. For reviewing code quality, use the PR Reviewer instead.

Example Prompts

test PR #12345 on Android
test this PR on iOS
validate PR #12345 on both Android and iOS
reproduce issue #12345 in Sandbox
try to reproduce issue #12345 on Android
test this PR on iPhone 15
test PR #12345 on iOS 18.5
verify that PR #12345 actually fixes issue #12000
set up a test in Sandbox for CollectionView with 1000 items
create a Sandbox test that demonstrates Grid layout with SafeArea
reproduce issue #12345 - the bug happens when you tap the button twice quickly
test PR #12345 and verify that:
1. Button click works
2. Label updates correctly
3. No crashes occur
reproduce issue #12345 with Appium automation, then work on fixing it until the test passes
The fix works! Now move this Sandbox scenario to proper UI tests.

What to Expect

When you invoke the Sandbox Agent, it will:

  1. Understand the issue - Reviews PR/issue details
  2. Create test scenario - Modifies Sandbox MainPage.xaml[.cs] with reproduction code
  3. Set up Appium test - Creates automated test script
  4. Build and deploy - Uses BuildAndRunSandbox.ps1 to deploy to device
  5. Run validation - Executes Appium test and captures results
  6. Provide report - Summarizes findings with logs and screenshots

The test output includes:

  • Test Summary: What was tested and results
  • Validation Results: Pass/fail with details
  • Device Logs: Relevant log excerpts
  • Screenshots: Visual confirmation (optional)
  • Verdict: Clear assessment of whether fix works

Test Workflow

The Sandbox Agent follows this workflow:

1. Modify Sandbox app → 2. Create Appium test → 3. Deploy to device → 4. Run test → 5. Report results

Files Modified

  • src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml - UI for test scenario
  • src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml.cs - Test logic
  • CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs - Appium test script (auto-generated)

Logs Captured

  • CustomAgentLogsTmp/Sandbox/android-device.log or ios-device.log - Device logs
  • CustomAgentLogsTmp/Sandbox/build-run-output.log - Build and deployment logs
  • CustomAgentLogsTmp/Sandbox/appium.log - Appium test execution logs
  • CustomAgentLogsTmp/Sandbox/*.png - Screenshots (if test captures them)

Tips for Best Results

Link the PR or Issue

Instead of:

test the CollectionView fix

Try:

test PR #12345 which fixes CollectionView crash on item removal

Specify Platform When Relevant

test PR #12345 on Android

This is more efficient than testing every platform when the change is platform-specific.


Provide Reproduction Context

reproduce issue #12345 - the bug happens when you tap the button twice quickly

Helps the agent create the right test scenario.


Request Specific Validation

test PR #12345 and verify that:
1. Button click works
2. Label updates correctly
3. No crashes occur

Gives clear success criteria.


Common Use Cases

Iterative Issue Fixing (Recommended Workflow)

This is the most powerful way to use the Sandbox Agent - set up automated reproduction, then work with Copilot to fix the issue iteratively.

Step 1: Set up automated reproduction

reproduce issue #12345 in Sandbox with Appium automation

What happens:

  • Agent creates MainPage with reproduction scenario
  • Agent creates Appium test that demonstrates the bug
  • Agent runs test and confirms: "Bug reproduced - test fails as expected"

Step 2: Iteratively fix the issue

Now work on fixing issue #12345. Keep testing after each change until the Appium test passes.

What happens:

  • Copilot analyzes the bug and proposes a fix
  • Modifies MAUI framework code (not Sandbox)
  • Reruns BuildAndRunSandbox.ps1 to test
  • If test still fails → analyzes why → tries different approach
  • Repeats until Appium test passes

Step 3: Convert to UI tests

The fix works! Now move this Sandbox scenario to proper UI tests.

What happens:

  • Agent creates TestCases.HostApp/Issues/Issue12345.xaml[.cs]
  • Agent creates TestCases.Shared.Tests/Tests/Issues/Issue12345.cs
  • Copies working test logic from Sandbox
  • Cleans up Sandbox

Why this workflow works:

  • Fast feedback loop - Appium test validates each attempt
  • Objective validation - Not guessing if fix works, test proves it
  • Incremental progress - Each iteration gets closer to solution
  • Smooth transition - Working test becomes regression test

Example conversation:

You: reproduce issue #12345 - CollectionView crashes when removing last item

Agent: [Sets up reproduction, runs test]
"Bug reproduced successfully. Appium test demonstrates crash on item removal."

You: Now fix this issue. Keep testing until it works.

Agent: [Iteration 1]
"Added null check in CollectionViewHandler. Testing..."
"Test still fails - crash occurs before null check. Analyzing..."

Agent: [Iteration 2]
"Modified item removal sequence to update adapter first. Testing..."
"Test passes! No crash observed. Fix validated."

You: Great! Now move this to UI tests.

Agent: [Creates Issue12345.xaml and Issue12345.cs in proper locations]
"UI test created. Sandbox cleaned up. Ready for PR."

Validate a Bug Fix

verify PR #12345 fixes the crash reported in issue #12000

Expected flow:

  1. Agent creates reproduction scenario
  2. Tests WITH the fix
  3. Optionally tests WITHOUT the fix to confirm bug exists
  4. Reports whether fix resolves the issue

Quick Manual Testing

deploy PR #12345 to Android so I can test it manually

Expected flow:

  1. Agent sets up basic test scenario
  2. Deploys to device
  3. Leaves app running for manual exploration
  4. Provides instructions for manual validation

Pre-Submit Validation

test my changes on iOS before I submit a PR

Expected flow:

  1. Tests current branch changes
  2. Validates functionality
  3. Confirms no obvious issues
  4. Gives green light or flags concerns

Reproduce Community-Reported Bugs

reproduce issue #12345 to verify it still happens on main branch

Expected flow:

  1. Reads issue description
  2. Creates reproduction scenario
  3. Tests on main branch
  4. Confirms whether bug is reproducible

Platform Selection

The agent automatically selects platforms based on:

  1. PR title tags - [Android], [iOS], etc.
  2. Modified file paths - Platform-specific code paths
  3. Issue description - Mentioned platforms
  4. Code changes - Cross-platform vs. platform-specific

Default: Tests on Android only (faster) unless PR affects iOS-specific code or cross-platform controls.

You can override by specifying:

test PR #12345 on iOS

Understanding Test Results

✅ Success

✅ FIX VALIDATED - Test scenario completes successfully, expected behavior observed

Meaning: PR fix works as expected, no issues found.


⚠️ Partial Success

⚠️ PARTIAL - Fix appears to work but noticed a minor animation glitch

Meaning: Fix mostly works but there are concerns worth noting.


❌ Issues Found

❌ ISSUES FOUND - App crashes when tapping button after navigation

Meaning: Test revealed problems with the fix.


🚫 Cannot Test

🚫 CANNOT TEST - Build failed due to missing dependency

Meaning: Unable to complete testing due to technical issues.


Troubleshooting

Build Failures

If the agent reports build failures:

the build failed - can you check what went wrong?

The agent will analyze build logs and suggest fixes.


Test Can't Find Elements

If Appium can't locate UI elements:

the test can't find "TestButton" - can you check the AutomationIds?

Agent will verify and fix AutomationId mismatches.


App Crashes

If the app crashes during testing:

the app crashed - what does the log say?

Agent will analyze crash logs and identify the root cause.
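
If you want to look at the captured log yourself first, a small shell sketch like the one below can pull out likely crash lines. The function name crash_lines and the two search patterns are illustrative assumptions, not something the agent or BuildAndRunSandbox.ps1 defines; adjust the patterns to the crash you are chasing.

```shell
# crash_lines FILE
# Prints matching lines (with line numbers) that look like a crash.
# The patterns are assumptions: "FATAL EXCEPTION" is what Android's runtime logs
# for an uncaught Java exception; "Unhandled exception" is a common .NET marker.
crash_lines() {
  grep -n -i -E 'FATAL EXCEPTION|Unhandled exception' "$1"
}

# Usage (path taken from the Logs Captured list above):
# crash_lines CustomAgentLogsTmp/Sandbox/android-device.log
```

The function returns a non-zero status when nothing matches, so it can also be used in an if condition.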


Manual Validation

After automated testing, you can validate manually:

  1. Simulator stays running - The app remains deployed
  2. Navigate to Sandbox - Find the app on simulator
  3. Test manually - Interact with the test scenario
  4. Review logs - Check CustomAgentLogsTmp/Sandbox/ for captured logs

The Sandbox Agent leaves the environment ready for hands-on exploration.


Advanced Usage

Test Multiple Scenarios

test PR #12345 with these scenarios:
1. Tap button once
2. Tap button rapidly 10 times
3. Navigate away and back, then tap

Creates comprehensive test coverage.


Capture Specific Metrics

test PR #12345 and measure the Grid layout dimensions

Uses Appium to capture element properties.


Compare Branches

test this PR on Android, then test main branch to compare behavior

Shows before/after comparison.


Best Practices

  • Test platform-specific changes on that platform - Don't test iOS changes on Android
  • Start with one platform - Test Android first (faster), then iOS if needed
  • Read issue reproduction steps - Use them as your test scenario when available
  • Validate incrementally - Test small changes frequently rather than large batches
  • Keep Sandbox simple - Focus on the specific bug or feature, don't create complex scenarios

Cleanup

The Sandbox Agent leaves your repository in a ready state:

  • Sandbox app contains your test scenario
  • Logs are captured in CustomAgentLogsTmp/Sandbox/
  • Device remains booted with app deployed

To clean up:

git checkout -- src/Controls/samples/Controls.Sample.Sandbox/
rm -rf CustomAgentLogsTmp/Sandbox/

Only clean up when you're done with this test cycle and want to start fresh.

How the Sandbox Agent Uses the RunWithAppiumTest Script to Run Tests

Overview: The Complete Workflow

The sandbox-agent uses a multi-step automated workflow to test MAUI apps. Here's the complete flow:

flowchart TD
    A[1. Agent Updates MainPage.xaml] --> B[2. Agent Creates RunWithAppiumTest.cs]
    B --> C[3. Run BuildAndRunSandbox.ps1]
    C --> D[4. Script Orchestrates Everything]
    D --> E[Build & Deploy]
    D --> F[Start Appium]
    D --> G[Run Test]
    D --> H[Capture Logs]

Detailed Steps:

  1. Agent Updates MainPage.xaml[.cs]

    • Creates test scenario (UI elements, event handlers)
    • Adds AutomationIds to elements for Appium to find
  2. Agent Creates RunWithAppiumTest.cs

    • Copies from template
    • Updates to match MainPage AutomationIds
    • Adds test logic (tap buttons, verify labels, etc.)
    • Saves to: CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs
  3. User/Agent Runs BuildAndRunSandbox.ps1

    pwsh .github/scripts/BuildAndRunSandbox.ps1 -Platform android
  4. BuildAndRunSandbox.ps1 Orchestrates Everything

    • Validates prerequisites
    • Starts device/emulator
    • Builds and deploys app
    • Starts Appium server
    • Runs Appium test
    • Captures logs

Step-by-Step: How BuildAndRunSandbox.ps1 Works

Phase 1: Setup & Validation 🔧

Directory Structure:

CustomAgentLogsTmp/Sandbox/
├── RunWithAppiumTest.cs     # Agent creates this
├── appium.log               # Created by script
├── android-device.log       # Created by script
└── *.png                    # Screenshots from test

Validation Steps:

  • ✅ Check if RunWithAppiumTest.cs exists
  • ✅ Check if dotnet is available
  • ✅ Check if Appium is installed
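
To run the same checks by hand before invoking the script, something like the following shell sketch works. The function name missing_prereqs is hypothetical and this is not the script's actual implementation; it only mirrors the three checks listed above.

```shell
# missing_prereqs TEST_SCRIPT COMMAND...
# Prints one line per missing prerequisite; prints nothing when all are present.
missing_prereqs() {
  test_script=$1
  shift
  # The Appium test file must already exist
  [ -f "$test_script" ] || echo "missing file: $test_script"
  # Each required tool must be on PATH
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "missing command: $cmd"
  done
}

# Usage:
# missing_prereqs CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs dotnet appium
```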

Phase 2: Device Management 📱

Android:

# 1. Find Android emulators
adb devices

# 2. Pick first available (or use specified UDID)
$DeviceUdid = "emulator-5554"

iOS:

# 1. Find iOS simulators
xcrun simctl list devices

# 2. Boot iPhone Xs if not running
xcrun simctl boot "AC8BCB28-..."

# 3. Return UDID
$DeviceUdid = "AC8BCB28-..."

Result: $DeviceUdid contains the target device
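
As an illustration of the "pick first available" step on Android: adb devices prints a header line followed by one serial and state per line, so the first usable device can be picked with a short filter. This is a sketch under that assumption, not the script's real code, and the function name first_adb_device is made up for the example.

```shell
# first_adb_device
# Reads `adb devices` output on stdin and prints the serial of the first
# entry whose state is "device" (skipping "offline" or "unauthorized" ones).
first_adb_device() {
  awk 'NR > 1 && $2 == "device" { print $1; exit }'
}

# Demo with captured output; the real call would be: adb devices | first_adb_device
printf 'List of devices attached\nemulator-5554\tdevice\nemulator-5556\toffline\n' | first_adb_device
# prints: emulator-5554
```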


Phase 3: Build & Deploy 🏗️

# Build
dotnet build Maui.Controls.Sample.Sandbox.csproj `
    -f net10.0-android `
    -c Debug

# Deploy - Android
adb -s $DeviceUdid install bin/Debug/.../*.apk

# Deploy - iOS
xcrun simctl install $DeviceUdid bin/Debug/.../*.app

Result: Sandbox app is installed on device


Phase 4: Appium Server 🔌

# Check if Appium already running
try {
    $response = Invoke-WebRequest "http://localhost:4723/status"
    Write-Host "✅ Already running"
}
catch {
    # Start Appium in background
    $appiumJob = Start-Job { 
        appium --log-level info > appium.log 
    }
    
    # Wait up to 30 seconds for it to be ready
    $waited = 0
    while ($waited -lt 30) {
        try {
            Invoke-WebRequest "http://localhost:4723/status" | Out-Null
            break  # Ready!
        }
        catch {
            # Not ready yet: wait a second and count it against the timeout
            Start-Sleep 1
            $waited++
        }
    }
}

Result: Appium server running on http://localhost:4723
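
The same poll-until-ready idea can be tried from a plain shell when debugging outside the script. The helper below is a generic sketch; the name wait_until is hypothetical, and the usage line assumes curl is installed and Appium is on its default port.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds (returns 0)
# or TIMEOUT_SECONDS attempts have failed (returns 1).
wait_until() {
  timeout=$1
  shift
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Usage:
# wait_until 30 curl -sf http://localhost:4723/status && echo "Appium is ready"
```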


Phase 5: Run Appium Test 🎯 ← KEY PART

# Set environment variable for test script
$env:DEVICE_UDID = $DeviceUdid

# Change to Sandbox directory where RunWithAppiumTest.cs lives
cd CustomAgentLogsTmp/Sandbox/

# Run the C# file directly with dotnet run (.NET SDK file-based app)
$appiumOutput = dotnet run RunWithAppiumTest.cs /p:NoWarn="CA1307;CS0162" 2>&1

# Display output
$appiumOutput | ForEach-Object { Write-Host $_ }

# Extract PID from output (Android only)
$pidLine = $appiumOutput | Select-String "SANDBOX_APP_PID=(\d+)"
if ($pidLine) {
    $sandboxPid = $pidLine.Matches.Groups[1].Value
}

What Happens Inside RunWithAppiumTest.cs

Script Structure

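#!/usr/bin/env dotnet run
#:package Appium.WebDriver@8.0.1
// ↑ The #:package directive tells dotnet run to restore the Appium client from NuGet

using System;
using OpenQA.Selenium;                 // By
using OpenQA.Selenium.Appium;
using OpenQA.Selenium.Appium.Android;
using OpenQA.Selenium.Appium.Enums;    // MobileCapabilityType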

// ========== 1. READ ENVIRONMENT ==========
var udid = Environment.GetEnvironmentVariable("DEVICE_UDID")
    ?? throw new InvalidOperationException("DEVICE_UDID is not set");
string PLATFORM = udid.Contains("-") && udid.Length > 20 ? "ios" : "android";

// ========== 2. CONFIGURE APPIUM ==========
var options = new AppiumOptions();  // created up front so it is definitely assigned on every path
if (PLATFORM == "android") {
    options.PlatformName = "Android";
    options.AutomationName = "UIAutomator2";
    options.AddAdditionalAppiumOption("appium:appPackage", "com.microsoft.maui.sandbox");
    options.AddAdditionalAppiumOption("appium:noReset", true);  // ← CRITICAL for Android
    options.AddAdditionalAppiumOption(MobileCapabilityType.Udid, udid);
}

// ========== 3. CONNECT TO APPIUM & LAUNCH APP ==========
var serverUri = new Uri("http://localhost:4723");
AndroidDriver driver = new AndroidDriver(serverUri, options);

// ========== 4. GET APP PID (for logcat filtering) ==========
if (PLATFORM == "android") {
    var pid = GetAppPidFromAdb();
    Console.WriteLine($"SANDBOX_APP_PID={pid}");  // ← PowerShell captures this!
}

// ========== 5. VERIFY APP LAUNCHED ==========
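// WaitForElement is not a built-in Appium driver method; it is assumed to be a helper defined in the script
driver.WaitForElement("InstructionLabel", TimeSpan.FromSeconds(30));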
Console.WriteLine("✅ App launched successfully");

// ========== 6. RUN TEST LOGIC ==========
// Agent implements custom test logic here
driver.FindElement(By.Id("NavigateButton")).Click();
var labelText = driver.FindElement(By.Id("ResultLabel")).Text;
Console.WriteLine($"Label text: {labelText}");

// ========== 7. EXIT (app stays running) ==========
// No driver.Quit() - app remains open for manual validation

Key Implementation Details

Package Management

#!/usr/bin/env dotnet run
#:package Appium.WebDriver@8.0.1

  • Uses the .NET SDK's file-based app support (dotnet run <file>.cs) to run C# as a script
  • The #:package directive restores NuGet packages automatically
  • No need to create a full .csproj project!

Device Detection

var udid = Environment.GetEnvironmentVariable("DEVICE_UDID");
string PLATFORM = udid.Contains("-") && udid.Length > 20 ? "ios" : "android";

How it works:

  • iOS UDIDs: Long with hyphens (e.g., AC8BCB28-A72D-4A2D-90E7-E78FF0BA07EE)
  • Android UDIDs: Short (e.g., emulator-5554, 192.168.1.100:5555), so they fail the length check even when they contain a hyphen
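
To check which branch a given UDID would take, the same heuristic (contains a hyphen and is longer than 20 characters) can be reproduced in a few lines of shell. This only mirrors the C# expression above for experimentation; detect_platform is a hypothetical name.

```shell
# detect_platform UDID
# Prints "ios" when the UDID contains a hyphen AND is longer than 20 characters,
# otherwise prints "android" (same rule as the C# one-liner).
detect_platform() {
  udid=$1
  case "$udid" in
    *-*)
      if [ "${#udid}" -gt 20 ]; then
        echo ios
      else
        echo android
      fi
      ;;
    *)
      echo android
      ;;
  esac
}

detect_platform "AC8BCB28-A72D-4A2D-90E7-E78FF0BA07EE"   # prints: ios
detect_platform "emulator-5554"                          # prints: android
```

Note that this is a heuristic: an Android serial that happens to be long and hyphenated would be classified as iOS.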

PID Capture (Android Only)

Inside RunWithAppiumTest.cs:

var pid = GetAppPidFromAdb();  // adb shell pidof com.microsoft.maui.sandbox
Console.WriteLine($"SANDBOX_APP_PID={pid}");

Inside BuildAndRunSandbox.ps1:

$pidLine = $appiumOutput | Select-String "SANDBOX_APP_PID=(\d+)"
$sandboxPid = $pidLine.Matches.Groups[1].Value

# Use PID to filter logcat
adb -s $DeviceUdid logcat -d --pid=$sandboxPid > android-device.log

Why this matters:

  • Without PID: Get ALL logcat output (thousands of lines from all apps)
  • With PID: Get only Sandbox app logs (clean, focused output)
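
If you have saved the test output and want to redo the filtering by hand, the PID can be pulled out with sed. This is a shell sketch of the same extraction the PowerShell regex performs; extract_sandbox_pid is a made-up name, and the usage line assumes you saved the output to a file called test-output.txt.

```shell
# extract_sandbox_pid
# Reads test output on stdin and prints the digits that follow the first
# SANDBOX_APP_PID= marker. Prints nothing if the marker is absent.
extract_sandbox_pid() {
  sed -n 's/.*SANDBOX_APP_PID=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (file name is hypothetical):
# pid=$(extract_sandbox_pid < test-output.txt)
# adb logcat -d --pid="$pid" > android-device.log
```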

Why appium:noReset is CRITICAL for Android

options.AddAdditionalAppiumOption("appium:noReset", true);

| Without noReset | With noReset |
| --- | --- |
| ❌ Appium clears app data | ✅ App data persists |
| ❌ .NET MAUI Fast Deployment breaks | ✅ Fast Deployment works |
| ❌ Crash: "No assemblies found" | ✅ App launches successfully |

Fast Deployment uses a special directory (__override__) to deploy assemblies quickly. When Appium clears app data, this directory is deleted, causing the app to crash immediately.


Phase 6: Log Capture 📝

# Extract PID from Appium output
$sandboxPid = "12345"  # Extracted from "SANDBOX_APP_PID=12345"

# Dump Android logcat for that specific PID
if ($Platform -eq "android") {
    adb -s $DeviceUdid logcat -d --pid=$sandboxPid > android-device.log
}

# iOS logs captured via xcrun simctl
# (log stream keeps running until it is stopped)
if ($Platform -eq "ios") {
    xcrun simctl spawn booted log stream --predicate 'processImagePath contains "Sandbox"' > ios-device.log
}

Result: All logs saved to CustomAgentLogsTmp/Sandbox/


Complete Flow Diagram

BuildAndRunSandbox.ps1
│
├─ 1. Validate prerequisites
│   └─ Check RunWithAppiumTest.cs exists ✅
│
├─ 2. Start device
│   └─ Start-Emulator.ps1 → returns UDID
│
├─ 3. Build & deploy app
│   └─ Build-AndDeploy.ps1 → installs app
│
├─ 4. Start Appium server
│   └─ appium --log-level info > appium.log
│
├─ 5. Run Appium test  ← MAIN ACTION
│   │
│   └─ dotnet run RunWithAppiumTest.cs
│       │
│       ├─ Reads $env:DEVICE_UDID
│       ├─ Connects to http://localhost:4723
│       ├─ Launches com.microsoft.maui.sandbox
│       ├─ Outputs "SANDBOX_APP_PID=12345"
│       ├─ Waits for UI element
│       ├─ Runs test logic (tap, verify, etc.)
│       └─ Exits (app stays running)
│
└─ 6. Capture logs
    ├─ Extract PID from output
    └─ adb logcat --pid=12345 > android-device.log

Summary Table

| Component | Purpose | Key Details |
| --- | --- | --- |
| BuildAndRunSandbox.ps1 | Orchestrates entire workflow | Builds, deploys, starts Appium, runs test, captures logs |
| RunWithAppiumTest.cs | Appium test script | C# script that connects to Appium and tests the app |
| dotnet run (file-based app) | Script runner | Runs a single .cs file, with #:package directives for NuGet dependencies |
| Appium Server | UI automation | Runs on port 4723, connects to device, controls app |
| PID Capture | Log filtering | Extracts app PID to filter logcat (Android only) |
| appium:noReset | Prevents data wipe | Critical for Android Fast Deployment to work |

Key Innovation

Using dotnet run with the #:package directive allows running C# scripts with NuGet dependencies without creating a full project!

This makes it easy for agents to:

  1. Generate a single .cs file
  2. Run it directly with dotnet run
  3. Automatically install dependencies via #:package

No need for .csproj, dotnet restore, or complex project setup! 🎉


Files Involved

| File | Purpose |
| --- | --- |
| .github/scripts/BuildAndRunSandbox.ps1 | Main orchestration script |
| .github/scripts/templates/RunWithAppiumTest.template.cs | Template for Appium test |
| CustomAgentLogsTmp/Sandbox/RunWithAppiumTest.cs | Generated Appium test |
| CustomAgentLogsTmp/Sandbox/appium.log | Appium server logs |
| CustomAgentLogsTmp/Sandbox/android-device.log | Android logcat output |
| CustomAgentLogsTmp/Sandbox/ios-device.log | iOS device logs |
| CustomAgentLogsTmp/Sandbox/*.png | Screenshots from test |
| src/Controls/samples/Controls.Sample.Sandbox/MainPage.xaml | Test UI |

Next Steps

Want to run a test? Here's how:

# 1. Create your test scenario in MainPage.xaml
# 2. Create RunWithAppiumTest.cs in CustomAgentLogsTmp/Sandbox/
# 3. Run the script
pwsh .github/scripts/BuildAndRunSandbox.ps1 -Platform android

# 4. Check logs
cat CustomAgentLogsTmp/Sandbox/android-device.log

That's it! 🚀
