Testing Rules
Philosophy: Diagnose, Don’t Brute-Force
When tests fail, the goal is NEVER to tweak code blindly until the suite goes green. Every failure is one of:
- A BUG in the code — code doesn’t do what it should
- A GAP in the code — code is missing something the test expects
- A BUG in the test — test itself is wrong or outdated
- An ENVIRONMENTAL issue — missing env var, wrong config, external dependency
Diagnose which one it is, then fix the ROOT CAUSE.
7-Step Diagnostic Process
- READ the failing test. Understand what it expects and why. Read the ENTIRE test file.
- READ the code under test line by line. Trace actual execution path for the failing case.
- READ imports and dependencies. Check shared state, utilities, side effects.
- DIAGNOSE the category. Is it a code bug, code gap, test bug, or environmental?
- VERIFY framework behavior. If unclear, search official docs for the test framework.
- APPLY the root-cause fix. Fix the actual problem, not the symptom.
- RE-RUN and verify. Run the specific test, then the FULL suite to catch regressions.
Maximum 3 diagnostic cycles per failure. If still stuck after 3 attempts, escalate to user with findings.
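The cycle limit can be pictured as a bounded loop. The following is a minimal sketch, not a prescribed implementation: the type names, the `runDiagnosticCycles` function, and its three callbacks are all hypothetical. Only the four failure categories and the three-cycle limit come from the rules above.

```typescript
// Hypothetical names throughout; only the four categories and the
// three-cycle limit are taken from the rules above.
type FailureCategory = "code-bug" | "code-gap" | "test-bug" | "environment";

interface Diagnosis {
  category: FailureCategory;
  rootCause: string;
}

type Outcome =
  | { status: "fixed"; cycles: number; diagnosis: Diagnosis }
  | { status: "escalate"; cycles: number; findings: Diagnosis[] };

const MAX_CYCLES = 3;

// `diagnose` stands in for the reading and diagnosing steps, `applyFix` for
// the root-cause fix, and `rerun` for re-running the test and the full suite.
function runDiagnosticCycles(
  diagnose: (attempt: number) => Diagnosis,
  applyFix: (diagnosis: Diagnosis) => void,
  rerun: () => boolean,
): Outcome {
  const findings: Diagnosis[] = [];
  for (let attempt = 1; attempt <= MAX_CYCLES; attempt++) {
    const diagnosis = diagnose(attempt);
    findings.push(diagnosis);
    applyFix(diagnosis);
    if (rerun()) {
      return { status: "fixed", cycles: attempt, diagnosis };
    }
  }
  // Still failing after three cycles: hand the collected findings to the user.
  return { status: "escalate", cycles: MAX_CYCLES, findings };
}
```

Keeping every diagnosis in `findings` means the escalation carries the full history of what was tried, not just the last guess.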
Test Requirements by Task Type
| Task Type | Test Requirement |
|---|---|
| Bug fix | Regression test proving the bug is fixed |
| New feature | Unit tests for core logic + integration for API/UI |
| Refactor | Existing tests must still pass (no behavior change) |
| API endpoint | Request/response validation, error cases, auth checks |
| Schema/content change | Build validation passes |
- Every new feature or bugfix requires tests where applicable.
- Run the full test suite before committing. Fix all failures.
- NEVER skip, disable, or delete tests to make a commit pass.
- NEVER use `test.skip()` or `test.todo()` without a tracked TODO explaining why.
- Test edge cases: empty inputs, null/undefined, boundary values, error paths.
- Snapshot tests are a last resort. Prefer explicit assertions.
- Mock external services in unit tests.
- Use `describe` blocks to group related tests logically.
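As an illustration of several rules above (grouping with `describe`, explicit assertions, and edge-case coverage), here is a small sketch. It assumes Node's built-in `node:test` runner only so the example needs no extra dependencies; the same structure applies to whatever framework the project uses. `parseQuantity` is a hypothetical function invented for the example.

```typescript
import { describe, it } from "node:test";
import { strict as assert } from "node:assert";

// Hypothetical unit under test: parses a quantity typed into a form field.
function parseQuantity(input: string | null | undefined): number {
  if (input == null || input.trim() === "") return 0;
  const value = Number(input);
  if (!Number.isInteger(value) || value < 0) {
    throw new RangeError(`invalid quantity: ${input}`);
  }
  return value;
}

describe("parseQuantity", () => {
  it("returns 0 for empty, null and undefined input", () => {
    assert.equal(parseQuantity(""), 0);
    assert.equal(parseQuantity("   "), 0);
    assert.equal(parseQuantity(null), 0);
    assert.equal(parseQuantity(undefined), 0);
  });

  it("accepts boundary values", () => {
    assert.equal(parseQuantity("0"), 0);
    assert.equal(parseQuantity("1"), 1);
  });

  it("rejects negative, fractional and non-numeric values", () => {
    assert.throws(() => parseQuantity("-1"), RangeError);
    assert.throws(() => parseQuantity("1.5"), RangeError);
    assert.throws(() => parseQuantity("abc"), RangeError);
  });
});
```

Each test asserts on observable behavior (return value or thrown error), so the internals of `parseQuantity` can be refactored without touching the tests.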
Anti-Patterns
| Anti-Pattern | Why It’s Wrong | Do Instead |
|---|---|---|
| Changing code randomly until tests pass | Hides real bug, creates new ones | Follow 7-step diagnostic process |
| Deleting a failing test | Removes safety net | Fix the root cause |
| Adding `// @ts-ignore` to pass type tests | Masks type errors | Fix the type issue |
| Testing implementation details | Breaks on refactor | Test behavior and outcomes |
| No assertions in test | False confidence | Every test must assert something |
Visual Testing (Frontend Only)
Activates when the project has a frontend stack, detected from `stack.json`, `package.json` deps (react, next, astro, svelte, vue), or file patterns (`src/components/`, `*.tsx`, `*.jsx`).
Pure backend/CLI projects: Skip this section entirely.
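The detection rule could be sketched roughly as follows. The dependency names and file patterns are the ones listed above; the function name `hasFrontendStack` and the shape of its inputs are assumptions, and the `stack.json` check is left out because its format is not described here.

```typescript
// Dependency names and file patterns are the ones listed above; everything
// else (names, input shape) is an assumption made for this sketch.
const FRONTEND_DEPS = ["react", "next", "astro", "svelte", "vue"];
const FRONTEND_FILE_PATTERNS = [/^src\/components\//, /\.tsx$/, /\.jsx$/];

interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

function hasFrontendStack(pkg: PackageJson, files: string[]): boolean {
  // Merge runtime and dev dependencies; either one counts as a signal.
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  const depMatch = FRONTEND_DEPS.some((name) => name in deps);
  const fileMatch = files.some((file) =>
    FRONTEND_FILE_PATTERNS.some((pattern) => pattern.test(file)),
  );
  return depMatch || fileMatch;
}
```

For example, `hasFrontendStack({ dependencies: { express: "^4.0.0" } }, ["src/server.ts"])` is `false`, so a pure backend project skips the visual rules, while any project containing `src/components/Button.tsx` activates them.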
Expanded TDD Cycle
When frontend work is detected, the TDD cycle becomes:
RED → GREEN → VISUAL → REFACTOR
- RED: failing functional test (behavior, not appearance)
- GREEN: implementation passes functional test
- VISUAL: capture/verify visual baseline
  - Playwright `toHaveScreenshot()` for automated regression
  - Cross-browser: Chromium + Firefox + WebKit (minimum)
  - Viewports: mobile (375px), tablet (768px), desktop (1280px)
  - Deterministic: `animations: 'disabled'`, fonts loaded, time frozen
  - `mask` option for dynamic elements (timestamps, avatars, ads)
- REFACTOR: clean up with both functional + visual tests as safety net
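The browser and viewport requirements of the VISUAL step form a 3 × 3 matrix. A minimal sketch of generating that matrix follows. The browser names and viewport widths come from the list above; the viewport heights, the project naming scheme, and the `buildVisualProjects` helper are assumptions for illustration. The objects follow the shape of Playwright project entries (`name`, `use.browserName`, `use.viewport`) but are plain data, so the sketch runs without Playwright installed.

```typescript
// Browser names and viewport widths come from the list above. The heights
// and the project naming scheme are assumptions for illustration.
type BrowserName = "chromium" | "firefox" | "webkit";

interface VisualProject {
  name: string;
  use: {
    browserName: BrowserName;
    viewport: { width: number; height: number };
  };
}

const BROWSERS: BrowserName[] = ["chromium", "firefox", "webkit"];
const VIEWPORTS = [
  { label: "mobile", width: 375, height: 667 },
  { label: "tablet", width: 768, height: 1024 },
  { label: "desktop", width: 1280, height: 720 },
];

// One project per browser/viewport pair: 3 browsers x 3 viewports = 9.
function buildVisualProjects(): VisualProject[] {
  return BROWSERS.flatMap((browserName) =>
    VIEWPORTS.map(({ label, width, height }) => ({
      name: `${browserName}-${label}`,
      use: { browserName, viewport: { width, height } },
    })),
  );
}
```

In a `playwright.config.ts`, the resulting array could be passed as the `projects` option of `defineConfig`, so a single `toHaveScreenshot()` assertion is checked against all nine combinations.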
Visual Approval Workflow
| Scenario | Action |
|---|---|
| Intentional visual change | `npx playwright test --update-snapshots` |
| Unintentional visual diff | Treat as RED — it’s a regression, fix it |
| New component (no baseline) | First run creates baseline, commit snapshots |
Local vs CI
| Context | Scope |
|---|---|
| Local development | Selective — only changed components’ visual tests |
| CI pipeline | Full visual suite across all browsers + viewports |
| Pre-commit | Affected visual tests only |
| Pre-push | Full visual suite |
Interactive Verification
When `claude --chrome` is available:
- Use for live design verification during development
- Complementary to automated tests, not a replacement
- Good for subjective quality checks automated tests can’t catch
Deterministic Rendering Checklist
Before capturing screenshots:
- `animations: 'disabled'` in Playwright config
- Fonts loaded (`page.waitForLoadState('networkidle')` or font-face check)
- Time frozen (`page.clock.setFixedTime()` for timestamps)
- Dynamic content masked (`mask: [page.locator('.avatar')]`)
- Viewport set explicitly (`page.setViewportSize()`)
- Color scheme set (`page.emulateMedia({ colorScheme: 'light' })`)
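The page-level items of the checklist can be bundled into one helper that runs before each screenshot. This is a sketch under stated assumptions: `PageLike` is a hand-written stand-in covering only the Playwright `Page` methods named above (plus `goto`), so the example runs without Playwright installed; `openDeterministically`, the fixed date, the 720px height, and the recording fake are all hypothetical.

```typescript
// Minimal structural stand-in for the parts of Playwright's `Page` used here.
interface PageLike {
  setViewportSize(size: { width: number; height: number }): Promise<void>;
  emulateMedia(options: { colorScheme: "light" | "dark" }): Promise<void>;
  goto(url: string): Promise<unknown>;
  waitForLoadState(state: "networkidle"): Promise<void>;
  clock: { setFixedTime(time: Date): Promise<void> };
}

// Hypothetical helper; the fixed date and the 720px height are arbitrary.
async function openDeterministically(page: PageLike, url: string): Promise<void> {
  // Freeze time and pin the rendering environment before the page loads.
  await page.clock.setFixedTime(new Date("2024-01-01T00:00:00Z"));
  await page.setViewportSize({ width: 1280, height: 720 });
  await page.emulateMedia({ colorScheme: "light" });
  await page.goto(url);
  // Wait for network activity (including font requests) to settle.
  await page.waitForLoadState("networkidle");
}

// Fake page that records the order of calls, used to demonstrate the helper.
function createRecordingPage(): { page: PageLike; calls: string[] } {
  const calls: string[] = [];
  const record = (name: string) => async (): Promise<void> => {
    calls.push(name);
  };
  const page: PageLike = {
    setViewportSize: record("setViewportSize"),
    emulateMedia: record("emulateMedia"),
    goto: record("goto"),
    waitForLoadState: record("waitForLoadState"),
    clock: { setFixedTime: record("clock.setFixedTime") },
  };
  return { page, calls };
}
```

In a real Playwright test the helper would receive the `page` fixture, and the remaining two checklist items would go on the assertion itself, for example `await expect(page).toHaveScreenshot({ animations: 'disabled', mask: [page.locator('.avatar')] })`.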
Integration
| Skill | How it uses this rule |
|---|---|
| test-driven-development | Adds VISUAL step after GREEN when frontend detected |
| verification-before-completion | Adds visual regression check to completion gate |
| playwright | Viewport presets + cross-browser config templates |