macos-agent-builder
Use when building AI-driven macOS automation agents — configuring Peekaboo agent mode, selecting AI models, composing MCP tool chains, or exporting agent configurations for Claude Desktop/Cursor.
| Model | Source |
|---|---|
| sonnet | pack: macos-automation |
Full Reference
macOS Agent Builder
Guide for building AI-powered macOS desktop automation agents. Covers Peekaboo’s built-in agent mode, custom MCP tool chains, model selection, and agent configuration export for Claude Desktop, Cursor, and other MCP clients.
Mandatory Announcement — FIRST OUTPUT before anything else:
    ┏━ 🧠 macos-agent-builder ━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃ [one-line description of agent being designed] ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

No exceptions. Box frame first, then work.
Agent Architecture Options
Option 1: Peekaboo Built-in Agent
The fastest path. Peekaboo’s `agent` command turns a natural-language task into a tool chain out of the box.
    peekaboo agent "Fill out the registration form with test data" --model claude-sonnet-4.5

- Free-form natural language task description
- Multi-model support (see Model Selection Guide)
- Interactive chat mode (auto TTY detection)
- Session persistence (`--list-sessions`, `--resume`)
- Audio input (`--audio`, `--audio-file`, `--realtime`)

Best for: Quick tasks, interactive exploration, prototyping automations.
Option 2: Custom MCP Tool Chain
Selective tool exposure through MCP server configuration. More control over what the AI can do.

    {
      "mcpServers": {
        "peekaboo": {
          "command": "npx",
          "args": ["-y", "@steipete/peekaboo", "mcp"],
          "env": {
            "PEEKABOO_ALLOW_TOOLS": "see,click,type,press,hotkey,list,image"
          }
        },
        "macos-automator": {
          "command": "npx",
          "args": ["-y", "@steipete/macos-automator-mcp@latest"]
        }
      }
    }

Best for: Production agents, restricted environments, specific tool needs.
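
The configuration above can also be generated rather than hand-edited, which keeps the allowlist in one place. The following is a minimal sketch, not part of Peekaboo: the function name `build_mcp_config` is hypothetical, and the only details taken from this guide are the package names, the `mcp` argument, and the comma-separated `PEEKABOO_ALLOW_TOOLS` variable.

```python
import json


def build_mcp_config(allow_tools):
    """Return an MCP client config exposing Peekaboo with a tool allowlist.

    `build_mcp_config` is a hypothetical helper for illustration only.
    """
    return {
        "mcpServers": {
            "peekaboo": {
                "command": "npx",
                "args": ["-y", "@steipete/peekaboo", "mcp"],
                # The allowlist is passed as one comma-separated string.
                "env": {"PEEKABOO_ALLOW_TOOLS": ",".join(allow_tools)},
            },
            "macos-automator": {
                "command": "npx",
                "args": ["-y", "@steipete/macos-automator-mcp@latest"],
            },
        }
    }


if __name__ == "__main__":
    tools = ["see", "click", "type", "press", "hotkey", "list", "image"]
    print(json.dumps(build_mcp_config(tools), indent=2))
```

Running the script prints JSON with the same shape as the hand-written example, so the output can be pasted into an MCP client config.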
Option 3: Hybrid (Peekaboo Agent + Script Tools)
Combine Peekaboo’s visual capabilities with macos-automator’s script execution.
Configure Peekaboo as one MCP server and macos-automator as a separate MCP server. The AI client then has both visual and scripting tools available.
Best for: Complex workflows needing both GUI interaction and scripted data extraction.
Model Selection Guide
| Model | Speed | Reasoning | Vision | Best For |
|---|---|---|---|---|
| gpt-5.1 | Fast | Good | Yes | Simple UI tasks, form filling, navigation |
| claude-sonnet-4.5 | Medium | Excellent | Yes | Complex multi-step, error recovery, data extraction |
| gemini-3-flash | Fast | Good | Excellent | Vision-heavy tasks, screen reading, visual verification |
| ollama (local) | Varies | Model-dependent | Model-dependent | Privacy-sensitive, offline, no API cost |
| grok-3 | Fast | Good | Yes | General purpose, experimental |
Recommendation: Start with claude-sonnet-4.5 for complex tasks. Switch to gemini-3-flash for vision-heavy tasks. Use gpt-5.1 for speed on simple tasks.
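
The recommendation above can be written as a small lookup. This is only a sketch: `recommend_model` is a hypothetical helper, and letting "vision-heavy" take priority over "complex" when both apply is an assumption, since the guide does not rank the two.

```python
def recommend_model(complex_task=False, vision_heavy=False):
    """Pick a starting model following the recommendation above.

    Assumption: vision-heavy wins when a task is both complex and
    vision-heavy. The guide itself does not specify a priority.
    """
    if vision_heavy:
        return "gemini-3-flash"
    if complex_task:
        return "claude-sonnet-4.5"
    # Simple tasks: favor speed.
    return "gpt-5.1"
```

For example, `recommend_model(complex_task=True)` returns `"claude-sonnet-4.5"`, which can then be passed to `--model`.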
Provider Configuration:
| Provider | Env Var | Models |
|---|---|---|
| OpenAI | OPENAI_API_KEY | gpt-5.1, gpt-4o, gpt-4o-mini |
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4.5, claude-haiku-4.5 |
| xAI | GROK_API_KEY | grok-3, grok-3-mini |
| Google | GEMINI_API_KEY | gemini-3-flash, gemini-3-pro |
| Ollama | PEEKABOO_OLLAMA_BASE_URL | Any installed model |
| Custom | ~/.peekaboo/credentials | OpenRouter, Groq, Together AI |
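
Before starting an agent, it can help to check that the key for the chosen model is present. The sketch below encodes the table above as a dictionary. `MODEL_ENV_VARS` and `missing_credential` are hypothetical names, and Ollama and custom providers are deliberately left out because they are configured through a base URL or a credentials file rather than a single API key.

```python
import os

# Model -> API key variable, copied from the provider table above.
MODEL_ENV_VARS = {
    "gpt-5.1": "OPENAI_API_KEY",
    "gpt-4o": "OPENAI_API_KEY",
    "gpt-4o-mini": "OPENAI_API_KEY",
    "claude-sonnet-4.5": "ANTHROPIC_API_KEY",
    "claude-haiku-4.5": "ANTHROPIC_API_KEY",
    "grok-3": "GROK_API_KEY",
    "grok-3-mini": "GROK_API_KEY",
    "gemini-3-flash": "GEMINI_API_KEY",
    "gemini-3-pro": "GEMINI_API_KEY",
}


def missing_credential(model, environ=None):
    """Return the unset env var name required by `model`, or None.

    None is also returned for models not in the table (for example
    Ollama models), since this sketch cannot check those.
    """
    environ = os.environ if environ is None else environ
    var = MODEL_ENV_VARS.get(model)
    if var is None:
        return None
    return None if environ.get(var) else var
```

A caller might print `missing_credential("claude-sonnet-4.5")` and stop early if the result is not `None`.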
Tool Chain Configuration
Full Tool Set (all 35+ commands)
Default: no filtering. The AI has access to everything.

    peekaboo agent "task description" --model claude-sonnet-4.5

Restricted Set (recommended for production)
Allow only specific tools:

    PEEKABOO_ALLOW_TOOLS="see,click,type,press,hotkey,scroll,list,image" peekaboo agent "task"

Or deny dangerous tools:

    PEEKABOO_DISABLE_TOOLS="app quit,clean,config" peekaboo agent "task"

Or via config.json:

    {
      "tools": {
        "allow": ["see", "click", "type", "press", "hotkey", "scroll", "list", "image"],
        "deny": []
      }
    }

Minimal Set (safest)
Read-only + basic interaction:

    PEEKABOO_ALLOW_TOOLS="see,list,image"

⚠ GOTCHA [silent-bug]: Tool filtering is ALL-or-NOTHING for subcommands. `allow: ["window"]` allows ALL window subcommands (close, minimize, maximize, etc.). There’s no per-subcommand filtering.
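
To make the gotcha concrete, here is a simplified model of the matching it describes. This is an illustration of the documented behavior, not Peekaboo's actual implementation, and `is_allowed` is a hypothetical name: only the top-level command is compared against the allowlist, so every subcommand rides along.

```python
def is_allowed(invocation, allow):
    """Simplified model: only the first word of the invocation is checked."""
    parts = invocation.split()
    return bool(parts) and parts[0] in allow


allow = {"see", "window"}
for call in ("window minimize", "window close", "click"):
    print(call, "->", "allowed" if is_allowed(call, allow) else "blocked")
```

Under this model, allowing `window` for the sake of `window minimize` also permits `window close`. If that is too broad, the only option is to leave `window` out of the allowlist entirely.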
MCP Configuration Export
Claude Desktop

    {
      "mcpServers": {
        "peekaboo": {
          "command": "npx",
          "args": ["-y", "@steipete/peekaboo", "mcp"],
          "env": {
            "PEEKABOO_ALLOW_TOOLS": "see,click,type,press,hotkey,scroll,list,image,window,app,menu"
          }
        }
      }
    }

Config location: `~/Library/Application Support/Claude/claude_desktop_config.json`
Cursor

    {
      "mcpServers": {
        "peekaboo": {
          "command": "npx",
          "args": ["-y", "@steipete/peekaboo", "mcp"]
        }
      }
    }

Config location: `~/.cursor/mcp.json`
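
Both clients keep their servers under a top-level `mcpServers` key, so exporting usually means adding one entry to a file that may already list other servers. The sketch below merges an entry without overwriting the rest of the file. `add_mcp_server` is a hypothetical helper, and it assumes the config file is plain JSON.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, entry):
    """Add or replace one server entry, preserving everything else.

    Hypothetical helper; assumes the client config is plain JSON.
    """
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    # Create the mcpServers section if the file did not have one.
    config.setdefault("mcpServers", {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For Cursor, this could be called as `add_mcp_server("~/.cursor/mcp.json", "peekaboo", {"command": "npx", "args": ["-y", "@steipete/peekaboo", "mcp"]})`. Backing up the file first is a sensible precaution.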
Multi-Tool Agent Config
Combine Peekaboo (GUI) + macos-automator (scripts) + applescript-mcp (raw bridge):

    {
      "mcpServers": {
        "peekaboo-gui": {
          "command": "npx",
          "args": ["-y", "@steipete/peekaboo", "mcp"],
          "env": {
            "PEEKABOO_ALLOW_TOOLS": "see,click,type,press,hotkey,scroll,list,image,window,app,menu,dialog"
          }
        },
        "macos-scripts": {
          "command": "npx",
          "args": ["-y", "@steipete/macos-automator-mcp@latest"]
        },
        "raw-applescript": {
          "command": "npx",
          "args": ["@peakmojo/applescript-mcp"]
        }
      }
    }

Agent Patterns
Form Filler

    See → identify form fields → type data into each → click submit → verify success

Model: gpt-5.1 (fast, simple task)

Tools: see, click, type, press

Data Extractor

    See → identify data elements → copy/read values → format output

Model: claude-sonnet-4.5 (data reasoning)

Tools: see, list, plus macos-automator for script-accessible data

App Automator

    Launch app → navigate to feature → perform action → verify result

Model: claude-sonnet-4.5 (multi-step reasoning)

Tools: see, click, type, press, hotkey, app, menu, window

Screen Monitor

    Capture → diff against baseline → alert on change → repeat

Model: gemini-3-flash (fast vision)

Tools: see, image, capture

Workflow Executor

    Load .peekaboo.json → execute steps → handle errors → report

No AI needed: use `peekaboo run script.peekaboo.json`.
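
The AI-driven patterns above pair a model with a tool set, which maps naturally onto a lookup table. This is a sketch under assumptions: the pattern keys and the `pattern_env` helper are hypothetical, while the model and tool values are copied from the patterns. Workflow Executor is omitted because it needs no model.

```python
# Pattern -> model and Peekaboo tools, copied from the patterns above.
# The dictionary keys are hypothetical labels for illustration.
AGENT_PATTERNS = {
    "form-filler": {
        "model": "gpt-5.1",
        "tools": ["see", "click", "type", "press"],
    },
    "data-extractor": {
        "model": "claude-sonnet-4.5",
        "tools": ["see", "list"],
    },
    "app-automator": {
        "model": "claude-sonnet-4.5",
        "tools": ["see", "click", "type", "press", "hotkey", "app", "menu", "window"],
    },
    "screen-monitor": {
        "model": "gemini-3-flash",
        "tools": ["see", "image", "capture"],
    },
}


def pattern_env(name):
    """Return the env override restricting Peekaboo to the pattern's tools."""
    tools = AGENT_PATTERNS[name]["tools"]
    return {"PEEKABOO_ALLOW_TOOLS": ",".join(tools)}
```

For example, `pattern_env("form-filler")` yields `{"PEEKABOO_ALLOW_TOOLS": "see,click,type,press"}`, which can be merged into the environment of the agent process or into an MCP server's `env` block.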
Safety Considerations
Tool Filtering
- Always filter for production agents; don’t expose the `app quit`, `clean`, and `config` commands
- Use a `PEEKABOO_ALLOW_TOOLS` allowlist (safer than a denylist)
Timeouts
- Peekaboo `--wait-for` defaults to 5000ms; increase for slow apps
- macos-automator `timeout_seconds` defaults to 60; increase for long scripts
- Agent `--max-steps` caps total actions; set based on task complexity
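
When launching the agent from a script, the step cap is easy to forget. The sketch below builds the argument list explicitly so the cap is always a visible parameter. `agent_command` is a hypothetical helper. It assumes `--max-steps` takes an integer value and that `--dry-run` (mentioned later in this guide) can be combined with it, so check `peekaboo agent --help` before relying on either.

```python
import shlex


def agent_command(task, model, max_steps=None, dry_run=False):
    """Build an argv list for `peekaboo agent` (does not run it).

    Assumption: --max-steps takes an integer argument.
    """
    argv = ["peekaboo", "agent", task, "--model", model]
    if max_steps is not None:
        argv += ["--max-steps", str(max_steps)]
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    cmd = agent_command("Fill out the form", "gpt-5.1", max_steps=20)
    # shlex.join quotes the task so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, timeout=...)` would add a wall-clock limit on top of the step cap.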
Error Boundaries
- Max 3 retries per action (See→Act→Verify loop)
- Fail loudly after retry exhaustion — don’t loop forever
- Log each action for debugging
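
The three rules above fit in a few lines. In this minimal sketch, `run_step`, `ActionFailed`, and the three callables are hypothetical stand-ins for whatever performs the see, act, and verify phases. "Max 3 retries" is read here as one initial attempt plus up to three retries, which is an interpretation rather than something the guide states.

```python
import logging

log = logging.getLogger("agent")


class ActionFailed(RuntimeError):
    """Raised when an action still fails after all retries."""


def run_step(name, see, act, verify, max_retries=3):
    """Run See -> Act -> Verify; retry on failed verification.

    Returns the number of attempts used. Raises ActionFailed once the
    initial attempt and `max_retries` retries are exhausted.
    """
    for attempt in range(1, max_retries + 2):
        state = see()      # observe the current UI state
        act(state)         # perform the action based on what was seen
        ok = bool(verify())
        log.info("step=%s attempt=%d ok=%s", name, attempt, ok)
        if ok:
            return attempt
    raise ActionFailed(f"{name}: not verified after {max_retries} retries")
```

Raising instead of returning a flag means a caller cannot silently ignore an exhausted retry budget.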
Human-in-the-Loop
- Peekaboo agent interactive chat mode: auto TTY detection
- `/help` for available commands during agent session
- Esc cancels current action
- Ctrl+D exits agent mode
- `--dry-run` shows planned actions without executing
Sensitive Operations
- Never automate password entry without explicit user consent
- Financial transactions should always have human confirmation
- File deletion operations should use `--dry-run` first
- System settings changes should be reversible
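
One way to enforce these rules in a custom agent loop is a gate that every action passes through before execution. This is a sketch only. The category labels and the `may_proceed` helper are hypothetical, and classifying an action into a category is left to the caller.

```python
# Hypothetical labels for the sensitive categories listed above.
SENSITIVE_KINDS = {
    "password_entry",
    "financial_transaction",
    "file_deletion",
    "system_settings",
}


def may_proceed(action_kind, confirm):
    """Return True if the action may run.

    Non-sensitive actions pass through. Sensitive ones run only if
    `confirm(action_kind)` returns a truthy value, e.g. a human
    answering a yes/no prompt.
    """
    if action_kind not in SENSITIVE_KINDS:
        return True
    return bool(confirm(action_kind))
```

In an interactive session, `confirm` could be `lambda kind: input(f"Allow {kind}? [y/N] ").strip().lower() == "y"`. In unattended runs, passing `lambda kind: False` blocks every sensitive action.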
Integration
| Need | Skill |
|---|---|
| Peekaboo agent mode reference | peekaboo (agent.md) |
| MCP server setup | peekaboo (mcp.md) |
| Script execution tools | macos-automator (tools.md) |
| Multi-step workflow patterns | macos-workflow |
| Script generation | applescript-forge |