
observation-masking

Token-efficient tool output management — decides when to keep tool output in context vs offload to files, using write-to-file + return-reference patterns. Use when tool outputs are large, when context is filling up, or when building token-efficient agent workflows.

| Model | Source |
|---|---|
| sonnet | pack: context-engineering |
Full Reference

Tool outputs are context. Every byte of stdout, every JSON blob, every file read that lands in the assistant turn costs tokens — and stays in context for every future turn. Observation masking is the discipline of deciding what deserves context residency and what gets offloaded.

Two patterns cover 95% of cases: write-to-file + return-reference and selective extraction. A third, summary-in-context + detail-in-file, combines the two. The decision between them, and whether to mask at all, lives in reference/decision-matrix.md.


| Item | Value |
|---|---|
| Core question | Does this output need to be in the window for future turns? |
| Primary pattern | Write full output to file → return file path as reference |
| Secondary pattern | Extract only the needed fields → discard the rest |
| Trigger threshold | >500 tokens of tool output that won’t be re-referenced |
| Context budget signal | >60% context used → apply masking aggressively |
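
The two numeric signals in the table can be checked mechanically. The following is a minimal sketch, not part of the skill itself: `estimate_tokens`, `mask_threshold`, and `should_mask` are hypothetical helper names, the 4-bytes-per-token ratio is a rough heuristic rather than a real tokenizer, and lowering the threshold to 200 tokens above 60% usage is an assumed reading of "aggressively".

```shell
#!/bin/sh
# Rough token estimate for a file: ~4 bytes per token.
# ASSUMPTION: this ratio is a common rule of thumb, not a tokenizer.
estimate_tokens() {
  bytes=$(wc -c < "$1")
  echo $(( ($bytes + 3) / 4 ))
}

# Token threshold above which output should be masked.
# 500 comes from the table above; 200 under pressure is an ASSUMED value.
mask_threshold() {
  used_pct=$1
  if [ "$used_pct" -gt 60 ]; then
    echo 200
  else
    echo 500
  fi
}

# should_mask FILE USED_PCT
# Succeeds (exit 0) when the file's estimated size exceeds the threshold.
should_mask() {
  tokens=$(estimate_tokens "$1")
  [ "$tokens" -gt "$(mask_threshold "$2")" ]
}
```

For example, `should_mask .claude/tmp/tool-output.txt 72 && echo "mask it"` would apply the lower threshold because context usage is above 60%.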

| I want to… | File |
|---|---|
| See write-to-file, selective extraction, and summary-in-context patterns with code | reference/patterns.md |
| Decide whether to mask, extract, or keep based on output size and reuse | reference/decision-matrix.md |

Usage: Read the reference file matching your current task from the index above. Each file is self-contained with inline examples and gotchas.


Every tool observation enters the context window as a message. Without masking, large outputs accumulate:

Turn 1: bash output → 2,000 tokens (stays forever)
Turn 2: file read → 3,500 tokens (stays forever)
Turn 3: API call → 1,800 tokens (stays forever)
Turn 4: you're now 7,300 tokens poorer for data you looked at once

With masking:

Turn 1: bash output → written to .claude/tmp/scan-output.txt → "Saved to .claude/tmp/scan-output.txt (2,847 lines)"
Turn 2: file read → extract 3 relevant lines → "Found: PORT=3000, DB_URL=..., NODE_ENV=production"
Turn 3: API call → written to .claude/tmp/api-response.json → "200 OK — 47 records. Saved to .claude/tmp/api-response.json"
Turn 4: context budget: intact
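
The masked version of Turn 1 can be wrapped in a small helper so every large command leaves only a one-line reference behind. This is a sketch under assumptions: `run_masked` is a hypothetical name, and including the exit status in the reference line is a design choice added here so a failing command is not hidden by the masking.

```shell
#!/bin/sh
# run_masked OUTFILE COMMAND [ARGS...]
# Runs COMMAND, stores stdout and stderr in OUTFILE, and prints a
# one-line reference instead of the output itself.
run_masked() {
  out=$1
  shift
  mkdir -p "$(dirname "$out")"
  "$@" > "$out" 2>&1
  rc=$?
  # tr strips the padding some wc implementations add
  lines=$(wc -l < "$out" | tr -d ' ')
  echo "Saved to $out ($lines lines, exit $rc)"
  return "$rc"
}
```

Usage would look like `run_masked .claude/tmp/scan-output.txt some-tool --verbose`, where `some-tool` stands in for whatever command produces the large output.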

Tool is about to produce output
└── Is output large? (>500 tokens estimated)
    ├── No → Keep in context (default)
    └── Yes → Will I reference specific fields later?
        ├── Yes, known fields → Selective extraction
        ├── Yes, unknown scope → Write-to-file + reference
        └── No → Write-to-file + discard (summary only)

Full decision logic with size thresholds and reuse scoring → reference/decision-matrix.md.
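
The tree is small enough to express as a single function. The sketch below is illustrative only: `choose_strategy` and its return labels (`keep`, `extract`, `file+reference`, `file+summary`) are hypothetical names, and the reuse scoring described in reference/decision-matrix.md is not modelled here.

```shell
#!/bin/sh
# choose_strategy TOKENS REUSE
# TOKENS: estimated size of the tool output
# REUSE:  known   = specific, known fields are needed later
#         unknown = may be needed later, scope unclear
#         none    = will not be referenced again
choose_strategy() {
  tokens=$1
  reuse=$2
  # Not large (500 tokens or fewer): keep in context, the default
  if [ "$tokens" -le 500 ]; then
    echo "keep"
    return 0
  fi
  case "$reuse" in
    known)   echo "extract" ;;
    unknown) echo "file+reference" ;;
    none)    echo "file+summary" ;;
    *)       echo "unrecognised reuse value: $reuse" >&2; return 2 ;;
  esac
}
```

Calling `choose_strategy 3500 unknown` selects the write-to-file + reference branch.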


Write-to-file + return-reference — full output preserved on disk, tiny reference in context:

# Instead of reading a 500-line config dump into context:
some-tool --verbose > .claude/tmp/tool-output.txt && echo "Saved to .claude/tmp/tool-output.txt"

Selective extraction — pull only what matters, discard the rest:

# Instead of keeping a full JSON blob:
curl -s api/endpoint | jq '{id: .id, status: .status, error: .error}'
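
The jq one-liner handles JSON. For line-oriented files, such as the config read in the Turn 2 example, the same idea can be sketched with grep. `extract_keys` is a hypothetical helper, and it assumes a simple `KEY=value` layout with one setting per line.

```shell
#!/bin/sh
# extract_keys FILE KEY [KEY...]
# Prints only the KEY=value lines for the requested keys, in the order
# the keys are given; everything else in FILE stays out of context.
extract_keys() {
  file=$1
  shift
  for key in "$@"; do
    # Report a missing key on stderr rather than failing silently
    grep -E "^${key}=" "$file" || echo "$key: not found" >&2
  done
}
```

For instance, `extract_keys .env PORT NODE_ENV` would return two lines regardless of how long the file is.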

Summary-in-context + detail-in-file — human-readable summary stays, raw data offloaded:

# npm audit exits non-zero when it finds vulnerabilities, so chain with ";" rather than "&&"
npm audit --json > .claude/tmp/audit.json; npm audit 2>&1 | tail -5

Full implementations with edge cases → reference/patterns.md.


Apply observation masking when:

  • A tool output exceeds ~500 tokens and won’t be re-read verbatim
  • Context usage is above 60% (aggressive masking mode)
  • You are building multi-turn agent workflows where context accumulates over many steps
  • Scanning or auditing tools produce exhaustive output (dependency trees, lint reports, audit logs)
  • API responses contain nested data where only 2-3 fields matter

Do NOT apply when:

  • Output is small (<200 tokens) — masking overhead exceeds benefit
  • Output will be referenced multiple times in the same session
  • Output needs to be in context for the model to reason over it holistically
  • You are writing a test or implementation where the full content is the work product

┏━ ⚡ observation-masking ━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ [one-line description of what’s being managed] ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛