macos-agent-builder

Use when building AI-driven macOS automation agents — configuring Peekaboo agent mode, selecting AI models, composing MCP tool chains, or exporting agent configurations for Claude Desktop/Cursor.

Model	Source
sonnet	pack: macos-automation

Full Reference

macOS Agent Builder

Guide for building AI-powered macOS desktop automation agents. Covers Peekaboo’s built-in agent mode, custom MCP tool chains, model selection, and agent configuration export for Claude Desktop, Cursor, and other MCP clients.

Mandatory Announcement — FIRST OUTPUT before anything else:

┏━ 🧠 macos-agent-builder ━━━━━━━━━━━━━━━━━━━━━━━┓
┃ [one-line description of agent being designed]   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

No exceptions. Box frame first, then work.

Agent Architecture Options

Option 1: Peekaboo Built-in Agent

The fastest path. Peekaboo’s agent command turns a natural-language task into a tool chain out of the box.

peekaboo agent "Fill out the registration form with test data" --model claude-sonnet-4.5

Free-form natural language task description
Multi-model support (see Model Selection Guide)
Interactive chat mode (auto TTY detection)
Session persistence (--list-sessions, --resume)
Audio input (--audio, --audio-file, --realtime)

Best for: Quick tasks, interactive exploration, prototyping automations.
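When the built-in agent is launched from a script rather than typed by hand, it can help to assemble the invocation as an argument list first. The sketch below is a minimal Python helper that assumes only the flags named in this guide (--model, --max-steps, --dry-run). build_agent_command is a hypothetical name, and nothing is executed.

```python
import shlex


def build_agent_command(task, model="claude-sonnet-4.5", max_steps=None, dry_run=False):
    """Return the argv list for a `peekaboo agent` call without running it."""
    argv = ["peekaboo", "agent", task, "--model", model]
    if max_steps is not None:
        # --max-steps caps the total number of agent actions.
        argv += ["--max-steps", str(max_steps)]
    if dry_run:
        # --dry-run shows planned actions without executing them.
        argv.append("--dry-run")
    return argv


argv = build_agent_command("Fill out the registration form with test data", dry_run=True)
# shlex.join quotes the task so the line can be pasted into a shell.
print(shlex.join(argv))
```

To execute, the list can be handed to subprocess.run(argv). The flag names should be checked against the help output of the installed Peekaboo version.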

Option 2: Custom MCP Tool Chain

Selective tool exposure through MCP server configuration. More control over what the AI can do.

{
  "mcpServers": {
    "peekaboo": {
      "command": "npx",
      "args": ["-y", "@steipete/peekaboo", "mcp"],
      "env": {
        "PEEKABOO_ALLOW_TOOLS": "see,click,type,press,hotkey,list,image"
      }
    },
    "macos-automator": {
      "command": "npx",
      "args": ["-y", "@steipete/macos-automator-mcp@latest"]
    }
  }
}

Best for: Production agents, restricted environments, specific tool needs.
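If the allowlist changes per agent, the JSON above can be generated instead of hand-edited. This is a minimal sketch: peekaboo_server is a hypothetical helper, while the package names and the PEEKABOO_ALLOW_TOOLS variable are taken from the config shown in this section.

```python
import json


def peekaboo_server(allow_tools=None):
    """Build the Peekaboo mcpServers entry; None leaves the full tool set exposed."""
    entry = {"command": "npx", "args": ["-y", "@steipete/peekaboo", "mcp"]}
    if allow_tools:
        # The allowlist is passed as one comma-separated string.
        entry["env"] = {"PEEKABOO_ALLOW_TOOLS": ",".join(allow_tools)}
    return entry


config = {
    "mcpServers": {
        "peekaboo": peekaboo_server(["see", "click", "type", "press", "hotkey", "list", "image"]),
        "macos-automator": {
            "command": "npx",
            "args": ["-y", "@steipete/macos-automator-mcp@latest"],
        },
    }
}
print(json.dumps(config, indent=2))
```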

Option 3: Hybrid (Peekaboo Agent + Script Tools)

Combine Peekaboo’s visual capabilities with macos-automator’s script execution.

Run Peekaboo and macos-automator as two separate MCP servers. The AI client then has both the visual tools and the scripting tools available.

Best for: Complex workflows needing both GUI interaction and scripted data extraction.

Model Selection Guide

Model	Speed	Reasoning	Vision	Best For
gpt-5.1	Fast	Good	Yes	Simple UI tasks, form filling, navigation
claude-sonnet-4.5	Medium	Excellent	Yes	Complex multi-step, error recovery, data extraction
gemini-3-flash	Fast	Good	Excellent	Vision-heavy tasks, screen reading, visual verification
ollama (local)	Varies	Model-dependent	Model-dependent	Privacy-sensitive, offline, no API cost
grok-3	Fast	Good	Yes	General purpose, experimental

Recommendation: Start with claude-sonnet-4.5 for complex tasks. Switch to gemini-3-flash for vision-heavy tasks. Use gpt-5.1 for speed on simple tasks.
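The recommendation can be written down as a small selection function. This is only a sketch: pick_model is a hypothetical helper, and letting a vision-heavy task take precedence over a complex one is an assumption, not something the table states.

```python
def pick_model(complex_task=False, vision_heavy=False):
    """Map a rough task profile to one of the model names recommended above."""
    if vision_heavy:
        # Assumption: vision-heavy wins when a task is both complex and visual.
        return "gemini-3-flash"
    if complex_task:
        return "claude-sonnet-4.5"
    # Simple tasks favour speed.
    return "gpt-5.1"


print(pick_model(complex_task=True))
```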

Provider Configuration:

Provider	Env Var	Models
OpenAI	`OPENAI_API_KEY`	gpt-5.1, gpt-4o, gpt-4o-mini
Anthropic	`ANTHROPIC_API_KEY`	claude-sonnet-4.5, claude-haiku-4.5
xAI	`GROK_API_KEY`	grok-3, grok-3-mini
Google	`GEMINI_API_KEY`	gemini-3-flash, gemini-3-pro
Ollama	`PEEKABOO_OLLAMA_BASE_URL`	Any installed model
Custom	`~/.peekaboo/credentials`	OpenRouter, Groq, Together AI
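A missing API key is easier to diagnose before the agent starts than after. The sketch below encodes the provider table as a lookup and reports which variable is unset for a given model. The function names are hypothetical, and Ollama and custom providers are deliberately left out because their models cannot be resolved by name alone.

```python
import os

# Provider table from this guide: env var -> models listed for that provider.
PROVIDER_MODELS = {
    "OPENAI_API_KEY": ["gpt-5.1", "gpt-4o", "gpt-4o-mini"],
    "ANTHROPIC_API_KEY": ["claude-sonnet-4.5", "claude-haiku-4.5"],
    "GROK_API_KEY": ["grok-3", "grok-3-mini"],
    "GEMINI_API_KEY": ["gemini-3-flash", "gemini-3-pro"],
}


def required_env(model):
    """Return the env var a listed model needs, or None if the model is not in the table."""
    for env_var, models in PROVIDER_MODELS.items():
        if model in models:
            return env_var
    return None


def missing_credential(model, environ=None):
    """Return the name of the unset env var for `model`, or None if nothing is missing."""
    environ = os.environ if environ is None else environ
    env_var = required_env(model)
    if env_var is not None and not environ.get(env_var):
        return env_var
    return None


# Checked against an explicit dict here so the result does not depend on the real environment.
print(missing_credential("claude-sonnet-4.5", {"OPENAI_API_KEY": "set"}))
```

A key stored in ~/.peekaboo/credentials would not be seen by this check, so a reported variable means "not in the environment", not necessarily "not configured".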

Tool Chain Configuration

Full Tool Set (all 35+ commands)

Default — no filtering. AI has access to everything.

peekaboo agent "task description" --model claude-sonnet-4.5

Restricted Set (recommended for production)

Allow only specific tools:

PEEKABOO_ALLOW_TOOLS="see,click,type,press,hotkey,scroll,list,image" peekaboo agent "task"

Or deny dangerous tools:

PEEKABOO_DISABLE_TOOLS="app quit,clean,config" peekaboo agent "task"

Or via config.json:

{
  "tools": {
    "allow": ["see", "click", "type", "press", "hotkey", "scroll", "list", "image"],
    "deny": []
  }
}

Minimal Set (safest)

Read-only + basic interaction:

PEEKABOO_ALLOW_TOOLS="see,list,image"

⚠ GOTCHA [silent-bug]: Tool filtering works at the tool level only. allow: ["window"] enables every window subcommand (close, minimize, maximize, etc.); there is no per-subcommand filtering.
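Because each allowed tool brings all of its subcommands with it, it can be worth flagging any allowlist entry that goes beyond the restricted set before shipping a config. A minimal sketch follows, using the minimal and restricted sets from this section. classify_allowlist and the level names are hypothetical.

```python
# Tool sets as listed in this section.
READ_ONLY = {"see", "list", "image"}
RESTRICTED = READ_ONLY | {"click", "type", "press", "hotkey", "scroll"}


def classify_allowlist(value):
    """Classify a PEEKABOO_ALLOW_TOOLS string and list the tools that need review."""
    tools = {name.strip() for name in value.split(",") if name.strip()}
    if tools <= READ_ONLY:
        return "minimal", []
    if tools <= RESTRICTED:
        return "restricted", []
    # Anything else is exposed together with all of its subcommands.
    return "extended", sorted(tools - RESTRICTED)


print(classify_allowlist("see,click,type,window,app"))  # ('extended', ['app', 'window'])
```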

MCP Configuration Export

Claude Desktop

{
  "mcpServers": {
    "peekaboo": {
      "command": "npx",
      "args": ["-y", "@steipete/peekaboo", "mcp"],
      "env": {
        "PEEKABOO_ALLOW_TOOLS": "see,click,type,press,hotkey,scroll,list,image,window,app,menu"
      }
    }
  }
}

Config location: ~/Library/Application Support/Claude/claude_desktop_config.json

Cursor

{
  "mcpServers": {
    "peekaboo": {
      "command": "npx",
      "args": ["-y", "@steipete/peekaboo", "mcp"]
    }
  }
}

Config location: ~/.cursor/mcp.json
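Both clients keep other servers in the same file, so an export step should merge rather than overwrite. The sketch below adds one entry under mcpServers and leaves everything else in place. add_mcp_server is a hypothetical helper, and the demo writes to a temporary file rather than to a real client config.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name, entry):
    """Insert or replace one mcpServers entry, keeping every other key in the file."""
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file so no real client config is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp.json"
    demo_path.write_text('{"mcpServers": {"other": {"command": "true"}}}')
    merged = add_mcp_server(
        demo_path,
        "peekaboo",
        {"command": "npx", "args": ["-y", "@steipete/peekaboo", "mcp"]},
    )
    print(sorted(merged["mcpServers"]))  # ['other', 'peekaboo']
```

For real use, pass one of the config locations listed above as config_path. Backing up the file first is a sensible precaution, since the helper rewrites it in place.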

Multi-Tool Agent Config

Combine Peekaboo (GUI) + macos-automator (scripts) + applescript-mcp (raw bridge):

{
  "mcpServers": {
    "peekaboo-gui": {
      "command": "npx",
      "args": ["-y", "@steipete/peekaboo", "mcp"],
      "env": {
        "PEEKABOO_ALLOW_TOOLS": "see,click,type,press,hotkey,scroll,list,image,window,app,menu,dialog"
      }
    },
    "macos-scripts": {
      "command": "npx",
      "args": ["-y", "@steipete/macos-automator-mcp@latest"]
    },
    "raw-applescript": {
      "command": "npx",
      "args": ["-y", "@peakmojo/applescript-mcp"]
    }
  }
}

Agent Patterns

Form Filler

See → identify form fields → type data into each → click submit → verify success

Model: gpt-5.1 (fast, simple task)
Tools: see, click, type, press

Data Extractor

See → identify data elements → copy/read values → format output

Model: claude-sonnet-4.5 (data reasoning)
Tools: see, list + macos-automator for script-accessible data

App Automator

Launch app → navigate to feature → perform action → verify result

Model: claude-sonnet-4.5 (multi-step reasoning)
Tools: see, click, type, press, hotkey, app, menu, window

Screen Monitor

Capture → diff against baseline → alert on change → repeat

Model: gemini-3-flash (fast vision)
Tools: see, image, capture
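The capture, diff, and alert steps of the Screen Monitor pattern can be prototyped without a model in the loop. The sketch below assumes the captures are already available as raw bytes (how they are produced is left out) and uses an exact SHA-256 comparison. That is a deliberately crude stand-in: any changed pixel, such as a clock tick, counts as a change. fingerprint and monitor are hypothetical names.

```python
import hashlib


def fingerprint(image_bytes):
    """Exact-content fingerprint of one capture."""
    return hashlib.sha256(image_bytes).hexdigest()


def monitor(baseline, captures, on_change):
    """Compare each capture to the baseline and call on_change(index) on a mismatch."""
    expected = fingerprint(baseline)
    changed = 0
    for index, frame in enumerate(captures):
        if fingerprint(frame) != expected:
            on_change(index)
            changed += 1
    return changed


alerts = []
count = monitor(b"frame-A", [b"frame-A", b"frame-B", b"frame-A"], alerts.append)
print(count, alerts)  # 1 [1]
```

A production monitor would more likely compare a cropped region or ask the vision model whether the difference matters.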

Workflow Executor

Load .peekaboo.json → execute steps → handle errors → report

No AI needed — use peekaboo run script.peekaboo.json

Safety Considerations

Tool Filtering

Always filter for production agents — don’t expose app quit, clean, config commands
Use PEEKABOO_ALLOW_TOOLS allowlist (safer than denylist)

Timeouts

Peekaboo --wait-for defaults to 5000ms — increase for slow apps
macos-automator timeout_seconds defaults to 60 — increase for long scripts
Agent --max-steps caps total actions — set based on task complexity

Error Boundaries

Max 3 retries per action (See→Act→Verify loop)
Fail loudly after retry exhaustion — don’t loop forever
Log each action for debugging
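The three rules above fit in one small wrapper. This is a sketch under stated assumptions: act and verify are caller-supplied callables standing in for the Act and Verify steps, "max 3 retries" is read as one initial attempt plus up to three retries, and run_action and ActionFailed are hypothetical names.

```python
import logging

log = logging.getLogger("agent")


class ActionFailed(RuntimeError):
    """Raised once an action has used up its retries."""


def run_action(name, act, verify, max_retries=3):
    """Run act(), check the result with verify(), and retry up to max_retries times."""
    last_error = None
    for attempt in range(1 + max_retries):  # first try plus the retries
        log.info("action=%s attempt=%d", name, attempt + 1)
        try:
            result = act()
            if verify(result):
                return result
            last_error = "verification failed"
        except Exception as exc:  # any failure in the action uses up one attempt
            last_error = repr(exc)
        log.warning("action=%s attempt=%d failed: %s", name, attempt + 1, last_error)
    # Fail loudly instead of looping forever.
    raise ActionFailed(f"{name} failed after {max_retries} retries: {last_error}")
```

A caller might wrap a click as run_action("click submit", do_click, page_shows_success), where both callables are whatever the agent uses to act and to re-check the screen.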

Human-in-the-Loop

Interactive chat mode starts automatically when the Peekaboo agent detects a TTY
/help for available commands during agent session
Esc cancels current action
Ctrl+D exits agent mode
--dry-run shows planned actions without executing

Sensitive Operations

Never automate password entry without explicit user consent
Financial transactions should always have human confirmation
File deletion operations should use --dry-run first
System settings changes should be reversible
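One way to enforce these rules in a custom agent loop is a confirmation gate in front of every action. The sketch below is illustrative only: the category names in SENSITIVE_KINDS, the guarded function, and the confirm callback are all hypothetical, and how an action gets classified is left to the caller.

```python
# Hypothetical category labels mirroring the four rules above.
SENSITIVE_KINDS = {
    "password_entry",
    "financial_transaction",
    "file_deletion",
    "system_settings",
}


def guarded(kind, run, confirm):
    """Call run() only if `kind` is not sensitive or confirm(kind) returns True."""
    if kind in SENSITIVE_KINDS and not confirm(kind):
        raise PermissionError(f"{kind} requires human confirmation")
    return run()


# A confirm callback that always declines, to show the refusal path.
try:
    guarded("file_deletion", lambda: "deleted", confirm=lambda kind: False)
except PermissionError as exc:
    print(exc)
```

In an interactive session, confirm could prompt the user on the terminal. In an unattended run, declining by default keeps sensitive steps from executing silently.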

Integration

Need	Skill
Peekaboo agent mode reference	`peekaboo` (agent.md)
MCP server setup	`peekaboo` (mcp.md)
Script execution tools	`macos-automator` (tools.md)
Multi-step workflow patterns	`macos-workflow`
Script generation	`applescript-forge`