Test Fixer Agent
Your CI is red. Or worse, it's red sometimes. This agent systematically diagnoses and fixes both failing and flaky RSpec tests so you can trust your test suite again.
When Tests Become the Problem
Test pain comes in two flavors: failing and flaky. Both cost you time and trust:
- "The test was passing yesterday" — something changed, but what?
- "Just re-run CI" — the universal prayer for flaky tests
- "It works when I run it locally" — environment or timing differences strike again
- "We quarantined it" — quarantine lists that grow forever
- "The error message makes no sense" — cryptic stack traces hide the real cause
The result? Developers stop trusting the test suite. They merge despite failures. Real bugs slip through because "it's probably just flaky." Hours lost debugging. Features blocked waiting for green CI.
How the Agent Works
Test Suite Analysis
Maps the current state of your test suite and separates failures from flakiness.
- → Runs full test suite to identify all failures
- → Isolates each failing spec to rule out order dependencies
- → Runs suspect tests repeatedly to confirm flakiness vs deterministic failure
- → Categorizes failures: assertion errors, exceptions, timeouts, intermittent
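As a rough illustration of the repeat-run step, the sketch below classifies an outcome as a stable pass, a deterministic failure, or flaky. The helper name `classify_runs` and the run counts are assumptions made for this example, and the block merely stands in for "run one spec".

```ruby
# Hypothetical helper: run a check several times and classify the outcome.
# The block stands in for running a single spec; it returns true on a pass.
def classify_runs(runs: 10)
  results = Array.new(runs) do
    begin
      passed = yield
      passed ? :pass : :fail
    rescue StandardError
      :fail # an exception counts as a failed run
    end
  end

  passes = results.count(:pass)
  if passes == runs
    :stable_pass
  elsif passes.zero?
    :deterministic_failure
  else
    :flaky
  end
end

# Deterministic stand-ins so the example is reproducible.
always_green = classify_runs { true }
always_red   = classify_runs { raise ArgumentError, "boom" }

attempt = 0
every_third_fails = classify_runs(runs: 9) do
  attempt += 1
  attempt % 3 != 0
end
```

In a real project the block could shell out to a single example, for instance `system("bundle", "exec", "rspec", "spec/models/user_spec.rb:12")`, where the spec path is purely illustrative.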
Root Cause Diagnosis
Determines whether the test or the code is wrong, and identifies the pattern behind flaky failures.
- → Checks if code changed intentionally (test needs update)
- → Checks if it's a bug (code needs fix)
- → Detects async/timing issues (Turbo, Stimulus, Ajax)
- → Identifies test order dependencies, shared state, and time-dependent logic
Test Data Verification
Ensures factories and fixtures produce valid test data.
- → Checks factory definitions match current model requirements
- → Verifies let vs let! usage for proper setup timing
- → Validates database constraints and foreign keys
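To show why setup timing matters, here is a toy model of lazy versus eager setup. It is not RSpec's implementation: `SetupModel`, `lazy`, and `eager` are names invented for this sketch, standing in for `let` and `let!` respectively.

```ruby
# Toy model of let / let! semantics (NOT RSpec's implementation).
# `lazy` mimics let: the block runs on first access and is memoized.
# `eager` mimics let!: the block runs immediately, like a before hook.
class SetupModel
  def initialize
    @definitions = {}
    @values = {}
  end

  def lazy(name, &block)
    @definitions[name] = block
  end

  def eager(name, &block)
    lazy(name, &block)
    fetch(name)
  end

  def fetch(name)
    @values.fetch(name) { @values[name] = @definitions.fetch(name).call }
  end
end

users_table = [] # stands in for a database table

lazy_setup = SetupModel.new
lazy_setup.lazy(:user) { users_table << "alice"; "alice" }
count_with_lazy = users_table.size # nothing has referenced :user yet

eager_setup = SetupModel.new
eager_setup.eager(:user) { users_table << "bob"; "bob" }
count_with_eager = users_table.size # the record exists before the "test body"
```

A spec that asserts on a record count without ever referencing the lazily defined record sees an empty table, which is the usual reason a `let` needs to become a `let!`.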
Fix Application
Applies the appropriate fix based on diagnosis.
- → Updates test expectations when code changed intentionally
- → Fixes bugs in application code when test is correct
- → Adds explicit waits: have_css("[data-loaded]", wait: 10)
- → Wraps jobs: perform_enqueued_jobs { example.run }
- → Freezes time: travel_to(Time.zone.local(2024, 1, 15))
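The explicit-wait fix rests on one idea: wait for a condition, not for a duration. Capybara's waiting matchers handle this in a real suite. The plain-Ruby sketch below only models the idea, and `wait_until` with its parameters is a hypothetical helper.

```ruby
# Hypothetical polling helper: returns as soon as the condition holds,
# or gives up once the timeout passes. The short sleep is only the
# polling interval, not a fixed "hope it is done by now" delay.
def wait_until(timeout: 2.0, interval: 0.01)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Stand-in for async work, e.g. an Ajax response arriving later.
loaded = false
worker = Thread.new do
  sleep 0.05
  loaded = true
end

became_ready = wait_until(timeout: 2.0) { loaded }
worker.join

# A condition that never holds fails after the timeout instead of hanging.
never_ready = wait_until(timeout: 0.05) { false }
```

Compared with a fixed sleep, the wait ends the moment the condition is met, so the test is faster on a quick machine and more tolerant on a slow CI runner.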
Full Suite Validation
Ensures the fix doesn't break anything else.
- → Runs linters to check syntax and style
- → Runs full test suite to confirm no regressions
- → Runs previously flaky tests 50+ times to confirm stability
- → Submits the PR only when all tests pass reliably
Optional: Coverage Expansion
Beyond fixing failing tests, we can identify gaps between production data and test data.
Production Error Analysis
Analyzes error logs to find untested edge cases that break in production.
Boundary Conditions
Generates property-based tests for edge cases your team didn't think of.
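A property-based test asserts a rule over many generated inputs rather than over a few hand-picked ones. The sketch below is a minimal, library-free version using a seeded generator. `clamp_percentage` and `check_property` are hypothetical names chosen for illustration, and a real suite would more likely use a dedicated property-testing gem.

```ruby
# Hypothetical function under test: coerce any input into the range 0..100.
def clamp_percentage(value)
  return 0 if value.nil?

  value.to_i.clamp(0, 100)
end

# Runs the property against fixed boundary values plus seeded random inputs
# and returns every input for which the property did not hold.
def check_property(trials: 200, seed: 42)
  rng = Random.new(seed)
  boundaries = [nil, 0, 100, -1, 101, "", "50", 10**12, -10**12]
  random_inputs = Array.new(trials) { rng.rand(-1_000..1_000) }

  (boundaries + random_inputs).reject { |input| yield(input) }
end

counterexamples = check_property do |input|
  clamp_percentage(input).between?(0, 100)
end
```

Seeding the generator keeps a failing input reproducible, so a counterexample found in CI can be replayed locally.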
Negative Tests
Adds tests for nil values, empty strings, and invalid states.
Common Issues We Fix
Assertion Failures
Expected value changed due to intentional refactoring.
Fix: Update expectation to match new behavior
Async/Timing Issues
Tests that don't wait for Ajax, Turbo, or Stimulus to complete.
expect(page).to have_no_css("[aria-busy]")
NoMethodError / Nil Errors
Method renamed, moved, or test setup doesn't create required associations.
Fix: Add missing let! or factory associations
Background Jobs
Tests that assert on job side-effects without running the job.
perform_enqueued_jobs { example.run }
Time Dependencies
Tests that pass or fail depending on time of day or day of week.
travel_to(Time.zone.local(2024, 1, 15))
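To see why pinning the clock helps, consider a method that reads the current time. The sketch below uses plain Ruby with an injectable time argument in place of Rails' time helpers, and `weekend?` is a hypothetical method made up for this example.

```ruby
# Hypothetical time-dependent method: reads the clock unless given a time.
def weekend?(now = Time.now)
  now.saturday? || now.sunday?
end

# Unpinned, weekend? gives a different answer depending on the day the
# suite runs. Pinned, it gives the same answer on every run.
monday   = Time.utc(2024, 1, 15)
saturday = Time.utc(2024, 1, 20)

pinned_weekday = weekend?(monday)
pinned_weekend = weekend?(saturday)
```

In a Rails spec, `travel_to` plays the role of the explicit argument here: the code under test keeps calling the clock, but the clock is held at a known instant.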
Order Dependencies
Tests that rely on state from previous tests or shared global state.
let(:record) { create(:record) } # fresh each time
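The sketch below models how shared state makes a result depend on execution order, and how resetting that state before each example removes the dependency. The lambdas stand in for individual examples, and every name in it is hypothetical.

```ruby
# State shared between examples: the source of the order dependency.
shared_cache = {}

leaky_writer = -> { shared_cache[:currency] = "EUR"; true }
leaky_reader = -> { shared_cache.fetch(:currency, "USD") == "USD" }

# Runs the examples in the given order and returns pass/fail for each one.
# The optional reset callable models per-example cleanup.
def run_in_order(examples, reset: nil)
  examples.map do |example|
    reset&.call
    example.call
  end
end

reader_first = run_in_order([leaky_reader, leaky_writer])
writer_first = run_in_order([leaky_writer, leaky_reader])

# Resetting state before every example makes the order irrelevant.
isolated = run_in_order(
  [leaky_writer, leaky_reader],
  reset: -> { shared_cache.clear }
)
```

The reader passes when it runs first and fails when it runs after the writer, the same signature as a spec that only fails under certain random seeds.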
Safety Guarantees
Diagnosis Before Fix
Understands the root cause before applying any changes.
Full Suite Regression Check
Every fix is validated against the entire test suite.
No Sleep Hacks or Stubbing
Uses real data and explicit condition waits. No arbitrary sleep calls or mock workarounds.
Clear Explanations
Every PR explains what was wrong and why the fix works, so your team learns the pattern.
Linter Compliance
All changes pass RuboCop and any other configured linters.
Minimal Changes
Fixes only what's broken. No scope creep or unnecessary refactoring.
What This Is NOT
- ✗ Not skipping or deleting tests. We fix them rather than marking them pending or removing them.
- ✗ Not adding sleeps or stubbing everything. Proper waits for conditions, real data that catches real bugs.
- ✗ Not blind pattern matching. Understands why tests fail before fixing.
- ✗ Not weakening assertions. Fixed tests stay meaningful and still catch the bugs they were designed for.
Ready to Get Back to Green?
Start with a $1,500 audit. Get a full test suite health report covering failing tests, flaky tests, and their root causes.