Verify Tests Fail Without Fix
Verifies that UI tests actually catch the issue they target. Supports two workflow modes:
Mode 1: Verify Failure Only (Test Creation)
Use when creating tests before writing a fix:
- Runs tests to verify they FAIL (proving they catch the bug)
- No fix files required
- Perfect for test-first development
    # Auto-detect test filter from changed test files
    pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android

    # With explicit test filter
    pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios -TestFilter "Issue33356"
Mode 2: Full Verification (Fix Validation)
Use when validating both tests and fix:
- Without fix - tests should FAIL (bug is present)
- With fix - tests should PASS (bug is fixed)
    # Auto-detect everything (recommended)
    pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform android -RequireFullVerification

    # With explicit test filter
    pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 -Platform ios -TestFilter "Issue33356" -RequireFullVerification
Note: -RequireFullVerification ensures the script errors if no fix files are detected, preventing silent fallback to failure-only mode.
Requirements
Verify Failure Only Mode:
- Test files in the PR (or working directory)
Full Verification Mode:
- Test files in the PR
- Fix files in the PR (non-test code changes)
The script auto-detects which mode to use based on whether fix files are present.
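The mode selection described above can be sketched as follows. This is a minimal Python illustration, not the script's actual PowerShell logic. The `is_test_file` rule (any path containing `TestCases.`) and the function names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ChangeSet:
    test_files: list
    fix_files: list


def is_test_file(path: str) -> bool:
    # Assumption: any path inside a "TestCases." project is test code.
    return "TestCases." in path


def classify(changed: list) -> ChangeSet:
    """Split changed paths into test files and fix (non-test) files."""
    tests = [p for p in changed if is_test_file(p)]
    fixes = [p for p in changed if not is_test_file(p)]
    return ChangeSet(tests, fixes)


def select_mode(changes: ChangeSet, require_full: bool = False) -> str:
    """Pick the workflow mode; mirrors the -RequireFullVerification guard."""
    if not changes.test_files:
        raise ValueError("no test files changed")
    if changes.fix_files:
        return "full"
    if require_full:
        # Error out instead of silently falling back to failure-only mode.
        raise ValueError("no fix files detected but full verification was required")
    return "failure-only"
```

With a test file and `src/Core/src/File.cs` both changed, `select_mode` returns `"full"`. With only the test file changed it returns `"failure-only"`, or raises if `require_full` is set.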
Expected Output
Verify Failure Only Mode:
    ╔═══════════════════════════════════════════════════════════╗
    ║                    VERIFICATION PASSED ✅                 ║
    ╠═══════════════════════════════════════════════════════════╣
    ║  Tests FAILED as expected!                                ║
    ║  This proves the tests correctly reproduce the bug.       ║
    ╚═══════════════════════════════════════════════════════════╝
Full Verification Mode:
    ╔═══════════════════════════════════════════════════════════╗
    ║                    VERIFICATION PASSED ✅                 ║
    ╠═══════════════════════════════════════════════════════════╣
    ║  - FAIL without fix (as expected)                         ║
    ║  - PASS with fix (as expected)                            ║
    ╚═══════════════════════════════════════════════════════════╝
What It Does
Verify Failure Only Mode (no fix files):
- Fetches base branch from origin (if available)
- Auto-detects test classes from changed test files
- Runs tests (should FAIL to prove they catch the bug)
- Updates PR labels based on result
- Reports result
Full Verification Mode (fix files detected):
- Fetches base branch from origin to ensure accurate diff
- Auto-detects fix files (non-test code) from git diff
- Auto-detects test classes from TestCases.Shared.Tests/*.cs
- Reverts fix files to base branch
- Runs tests (should FAIL without fix)
- Restores fix files
- Runs tests (should PASS with fix)
- Generates markdown reports:
  - CustomAgentLogsTmp/TestValidation/verification-report.md - Full detailed report
  - CustomAgentLogsTmp/PRState/verification-report.md - Gate section for PR agent
- Updates PR labels based on result
- Reports result
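The revert, run, restore, run sequence above can be sketched in Python as follows. The callables `run_tests`, `revert_fix`, and `restore_fix` are hypothetical stand-ins for the script's real steps, and the `try`/`finally` is a design choice of this sketch rather than confirmed behavior of the script.

```python
from typing import Callable


def verify(
    run_tests: Callable[[], bool],
    revert_fix: Callable[[], None],
    restore_fix: Callable[[], None],
    has_fix: bool,
) -> bool:
    """Return True when verification passes.

    run_tests() returns True if the tests passed, False if they failed.
    """
    if not has_fix:
        # Failure-only mode: the tests must FAIL to prove they catch the bug.
        return not run_tests()

    revert_fix()
    try:
        passed_without_fix = run_tests()
    finally:
        # Always put the fix back, even if the test run raises.
        restore_fix()

    passed_with_fix = run_tests()
    # Pass only when the tests fail without the fix and pass with it.
    return (not passed_without_fix) and passed_with_fix
```

Restoring inside `finally` keeps the working tree from being left in the reverted state if the first test run is interrupted.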
PR Labels
The skill automatically manages two labels on the PR to indicate verification status:
| Label | Color | When Applied |
|---|---|---|
| s/ai-reproduction-confirmed | 🟢 Green (#2E7D32) | Tests correctly FAIL without fix (AI verified tests catch the bug) |
| s/ai-reproduction-failed | 🟠 Orange (#E65100) | Tests PASS without fix (AI verified tests don't catch the bug) |
Behavior:
- When verification passes, adds s/ai-reproduction-confirmed and removes s/ai-reproduction-failed if present
- When verification fails, adds s/ai-reproduction-failed and removes s/ai-reproduction-confirmed if present
- If a PR is re-verified after fixing tests, labels are updated accordingly
- No label = AI hasn't verified tests yet
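The label behavior above amounts to a small set update, which can be sketched in Python as follows. The function name `update_labels` is hypothetical, and the sketch works on an in-memory set rather than calling any GitHub API.

```python
CONFIRMED = "s/ai-reproduction-confirmed"
FAILED = "s/ai-reproduction-failed"


def update_labels(current: set, verification_passed: bool) -> set:
    """Return the PR's label set after a verification run."""
    if verification_passed:
        add, remove = CONFIRMED, FAILED
    else:
        add, remove = FAILED, CONFIRMED
    # Drop the opposite label if present, then add the matching one.
    return (set(current) - {remove}) | {add}
```

Because the two labels replace each other, a PR carries at most one of them at a time, and unrelated labels are left untouched.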
Output Files
The skill generates output files under CustomAgentLogsTmp/PRState/<PRNumber>/verify-tests-fail/:
| File | Description |
|---|---|
| verification-report.md | Comprehensive markdown report with test results and full logs |
| verification-log.txt | Text log of the verification process |
| test-without-fix.log | Full test output from run without fix |
| test-with-fix.log | Full test output from run with fix |
Plus UI test logs in CustomAgentLogsTmp/UITests/:
- android-device.log or ios-device.log - Device logs
- test-output.log - NUnit test output
Example structure:
    CustomAgentLogsTmp/
    ├── UITests/                              # Shared UI test logs
    │   ├── android-device.log
    │   └── test-output.log
    └── PRState/
        └── 27847/
            └── verify-tests-fail/
                ├── verification-report.md    # Full detailed report
                ├── verification-log.txt
                ├── test-without-fix.log
                └── test-with-fix.log
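For tooling that needs to locate these files, the layout above can be expressed as a small path builder. This is a Python sketch with a hypothetical helper name, `output_paths`. It only composes the paths listed in this section and does not touch the filesystem.

```python
from pathlib import PurePosixPath

REPORT_FILES = (
    "verification-report.md",
    "verification-log.txt",
    "test-without-fix.log",
    "test-with-fix.log",
)


def output_paths(pr_number: str, root: str = "CustomAgentLogsTmp") -> dict:
    """Map each output file name to its path under PRState/<PRNumber>/verify-tests-fail/."""
    base = PurePosixPath(root) / "PRState" / pr_number / "verify-tests-fail"
    return {name: base / name for name in REPORT_FILES}
```

For example, `output_paths("27847")["test-with-fix.log"]` corresponds to `CustomAgentLogsTmp/PRState/27847/verify-tests-fail/test-with-fix.log`.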
PR Number Detection:
- Auto-detected from branch name (e.g., pr-27847)
- Falls back to gh pr view command
- Uses "unknown" if detection fails
- Can be manually specified with -PRNumber parameter
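The detection order above can be sketched in Python as follows. The regular expression `pr-(\d+)`, the precedence given to an explicit value, and the injected `gh_lookup` callable (standing in for the `gh pr view` fallback) are assumptions of this sketch. The script's exact matching rules are not documented here.

```python
import re
from typing import Callable, Optional


def detect_pr_number(
    branch: str,
    explicit: Optional[str] = None,
    gh_lookup: Optional[Callable[[], Optional[str]]] = None,
) -> str:
    """Resolve the PR number used in output paths."""
    # Assumption: a manually specified -PRNumber wins over auto-detection.
    if explicit:
        return str(explicit)

    # Branch names such as "pr-27847" carry the number directly.
    match = re.search(r"pr-(\d+)", branch)
    if match:
        return match.group(1)

    # Fallback: ask an external lookup (hypothetical wrapper around gh pr view).
    if gh_lookup is not None:
        try:
            number = gh_lookup()
        except Exception:
            number = None
        if number:
            return str(number)

    return "unknown"
```

Passing the lookup in as a callable keeps the sketch free of subprocess calls, so each fallback step can be exercised in isolation.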
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| No fix files detected | Base branch detection failed or no non-test files changed | Use -FixFiles or -BaseBranch explicitly |
| Tests pass without fix | Tests don't detect the bug | Review test assertions, update test |
| Tests fail with fix | Fix doesn't work or test is wrong | Review fix implementation |
| App crashes | Duplicate issue numbers, XAML error | Check device logs |
| Element not found | Wrong AutomationId, app crashed | Verify IDs match |
Optional Parameters
    # Require full verification (fail if no fix files detected) - recommended
    -RequireFullVerification

    # Explicit test filter
    -TestFilter "Issue32030|ButtonUITests"

    # Explicit fix files
    -FixFiles @("src/Core/src/File.cs")

    # Explicit base branch
    -BaseBranch "main"