Skip to content

test-debug

Use when tests are failing unexpectedly — diagnoses root cause by reading test + code + imports, classifies failure type (CODE BUG / CODE GAP / TEST BUG / ENV), and applies targeted fix. Never brute-forces.

ModelSourceCategory
sonnetcoreOther
  • Tests failing unexpectedly
  • Test passes locally but fails in CI
  • Need to understand WHY a specific test fails
  • After refactoring causes test breakage
Full Reference

Systematic test failure diagnosis. Reads test + code + imports, classifies failure type, applies targeted fix. Never brute-forces.

Mandatory Announcement — FIRST OUTPUT before anything else:

┏━ 🔍 test-debug ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ [one-line: which test(s) are failing] ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
  • Tests failing unexpectedly
  • Test passes locally but fails in CI
  • Need to understand WHY a specific test fails
  • After refactoring causes test breakage

Run the failing test and capture FULL error output — not just the summary line.

Terminal window
npm test -- [test-file] 2>&1

Read the test file line by line. Understand:

  • What is being tested (the behavior, not the implementation)
  • What assertions are made
  • What setup/teardown exists
  • What mocks or fixtures are used

Read the implementation being tested. Understand:

  • What it does
  • What it returns
  • What side effects it has
  • What its edge cases are

Read all imports and dependencies of both test and implementation. Check for:

  • Breaking interface changes
  • Mock mismatches (mock doesn’t match real signature)
  • Side effect conflicts between tests
  • Shared state pollution
TypeMeaningExample
CODE BUGImplementation has a logic errorOff-by-one, wrong condition
CODE GAPImplementation missing expected behaviorUnhandled edge case
TEST BUGTest assertion is wrong or outdatedTesting old interface
ENVConfig, dependency, or timing issueMissing env var, race condition

If unclear, search official docs for the test framework to confirm expected behavior. Don’t assume — verify.

ClassificationAction
CODE BUGFix the implementation
CODE GAPAdd the missing behavior
TEST BUGFix the test assertion
ENVFix config/setup
Terminal window
npm test -- [test-file]

Max 3 diagnostic cycles. If still failing after 3 rounds, escalate — you’re likely missing something structural.

StepActionTime
1Capture output30s
2Read test1m
3Read implementation1m
4Trace imports1m
5Classify30s
6Verify framework1m
7Fix root cause2-5m
8Re-run30s
Never Do ThisDo This Instead
Delete a failing testFix the root cause
Skip/disable testsDocument why if absolutely necessary
Change assertions to match wrong behaviorFix the code, not the test
Retry the same fix hoping for different resultsRe-classify the failure
Add try/catch to suppress errorsFix the error source
Brute-force random changesFollow the 8-step process
MistakeFix
Only reading the error messageRead the FULL test + implementation + imports
Guessing instead of classifyingAlways complete Steps 1-5 before fixing
Fixing symptoms not root causeTrace to the actual source of the problem
Not checking shared test stateTests that share state can interfere — check isolation
Ignoring flaky testsFlaky = timing or state issue — diagnose properly