Extensions

Extensions are built-in capabilities bundled with the pi agent. Each extension either registers tools the agent can call directly, or provides hooks, commands, and stream-level guardrails that operate transparently without direct tool calls.

20 extensions are active by default. They fall into a few broad categories:

Interaction — collect user input and secrets through masked, paged TUI flows (ask-user-questions, get-secrets-from-user)
Shell & process management — run commands asynchronously, manage persistent background processes, and track their lifecycle (async-jobs, bg-shell)
Browser automation — full Playwright-powered browser control with 61 tools covering navigation, interaction, assertions, tracing, and network inspection (browser-tools)
Search & documentation — web search via Brave, page fetching, Google Search via Gemini grounding, and up-to-date library docs via Context7 (search-the-web, google-search, context7)
macOS automation — Accessibility API, window management, and screen capture for native app control (mac-tools)
Infrastructure — native MCP server client, subagent spawning, activity logging, and configuration discovery (mcp-client, subagent, gsd, universal-config)
Guardrails & hooks — zero-cost stream-level regex guardrails, AWS credential auto-refresh, GitHub sync, and remote configuration validation (ttsr, aws-auth, github-sync, remote-questions)

Cards below are sorted by tool count, so extensions with more tools appear first.

browser-tools 61 tools

browser-tools — pi extension: full browser interaction via Playwright.

browser_action_cache —
browser_assert — Run one or more explicit browser assertions and return structured PASS/FAIL results. Prefer this for verification instead of inferring success from prose summaries.
browser_diff — Report meaningful browser-state changes. By default compares the current page to the most recent tracked action state. Use this to understand what changed after a click, submit, or navigation.
browser_batch — Execute multiple explicit browser steps in one call. Prefer this for obvious action sequences like click → type → wait → assert to reduce round trips and token usage.
browser_generate_test —
browser_emulate_device —
browser_extract —
browser_analyze_form —
browser_fill_form —
browser_check_injection —
browser_get_console_logs —
browser_get_network_logs —
browser_get_dialog_logs —
browser_evaluate —
browser_get_accessibility_tree —
browser_find — Find elements on the page by text content, ARIA role, or CSS selector. Returns only the matched nodes as a compact accessibility snapshot — far cheaper than browser_get_accessibility_tree. Use this after any action to locate a specific button, input, heading, or link before clicking it.
browser_get_page_source —
browser_find_best —
browser_act —
browser_click —
browser_drag —
browser_type —
browser_upload_file —
browser_scroll —
browser_hover —
browser_key_press —
browser_select_option —
browser_set_checked —
browser_set_viewport —
browser_navigate —
browser_go_back —
browser_go_forward —
browser_reload —
browser_mock_route —
browser_block_urls —
browser_clear_routes —
browser_list_pages —
browser_switch_page —
browser_close_page —
browser_list_frames —
browser_select_frame —
browser_save_pdf —
browser_snapshot_refs —
browser_get_ref —
browser_click_ref —
browser_hover_ref —
browser_fill_ref —
browser_screenshot —
browser_close —
browser_trace_start —
browser_trace_stop —
browser_export_har —
browser_timeline —
browser_session_summary —
browser_debug_bundle —
browser_save_state —
browser_restore_state —
browser_verify — Run a structured browser verification flow: navigate to a URL, run checks (element visibility, text content), capture screenshots as evidence, and return structured pass/fail results.
browser_visual_diff —
browser_wait_for —
browser_zoom_region —

gsd 15 tools

GSD Activity Log — Save raw chat sessions to .gsd/activity/. Before each context wipe in auto-mode, dumps the full session as JSONL.

gsd_decision_save — Alias for a canonically named GSD tool. Prefer the canonical name.
gsd_requirement_update — Update an existing requirement in the GSD database and regenerate REQUIREMENTS.md. Provide the requirement ID (e.g. R001) and any fields to update.
gsd_summary_save — Save a summary, research, context, or assessment artifact to the GSD database and write it to disk. Computes the file path from milestone/slice/task IDs automatically.
gsd_milestone_generate_id — Generate the next milestone ID for a new GSD milestone. Scans existing milestones on disk and respects the unique_milestone_ids preference. Always use this tool when creating a new milestone — never invent milestone IDs manually.
gsd_plan_milestone — Write milestone planning state to the GSD database, render ROADMAP.md from DB, and clear caches after a successful render.
gsd_plan_slice — Write slice planning state to the GSD database, render S##-PLAN.md plus task PLAN artifacts from DB, and clear caches after a successful render.
gsd_plan_task — Write task planning state to the GSD database, render tasks/T##-PLAN.md from DB, and clear caches after a successful render.
gsd_task_complete — Record a completed task to the GSD database, render a SUMMARY.md to disk, and toggle the plan checkbox — all in one atomic operation. Writes the task row inside a transaction, then performs filesystem writes outside the transaction.
gsd_slice_complete — Record a completed slice to the GSD database, render SUMMARY.md and UAT.md to disk, and toggle the roadmap checkbox, all in one atomic operation. Validates all tasks are complete before proceeding. Writes the slice row inside a transaction, then performs filesystem writes outside the transaction.
gsd_complete_milestone — Record a completed milestone to the GSD database, render MILESTONE-SUMMARY.md to disk — all in one atomic operation. Validates all slices are complete before proceeding.
gsd_validate_milestone — Validate a milestone before completion — persist validation results to the DB, render VALIDATION.md to disk. Records verdict (pass/needs-attention/needs-remediation) and rationale.
gsd_replan_slice — Replan a slice after a blocker is discovered. Structurally enforces preservation of completed tasks — mutations to completed task IDs are rejected with actionable error payloads. Writes replan history to DB, applies task mutations, re-renders PLAN.md, and renders REPLAN.md.
gsd_reassess_roadmap — Reassess the milestone roadmap after a slice completes. Structurally enforces preservation of completed slices — mutations to completed slice IDs are rejected with actionable error payloads. Writes assessment to DB, applies slice mutations, re-renders ROADMAP.md, and renders ASSESSMENT.md.
gsd_save_gate_result — Save the result of a quality gate evaluation (Q3-Q8) to the GSD database. Called by gate evaluation sub-agents after analyzing a specific quality question.
gsd_journal_query — Query the structured event journal for auto-mode iterations. Returns matching journal entries filtered by flow ID, unit ID, rule name, event type, or time range.
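
The "structurally enforces preservation of completed tasks" behavior described for gsd_replan_slice can be pictured with a small validation pass. The following is a minimal sketch, not the extension's code: the `Task`, `Mutation`, and `ReplanResult` shapes and the `replanSlice` function are hypothetical names chosen for illustration, and the real GSD schema is not shown in this document.

```typescript
// Hypothetical shapes for illustration only; the real GSD schema may differ.
type Task = { id: string; title: string; completed: boolean };
type Mutation = { taskId: string; newTitle: string };
type ReplanResult =
  | { ok: true; tasks: Task[] }
  | { ok: false; errors: string[] };

// Apply title mutations to a slice's tasks, rejecting any that touch a completed task.
function replanSlice(tasks: Task[], mutations: Mutation[]): ReplanResult {
  const byId = new Map<string, Task>();
  for (const t of tasks) byId.set(t.id, t);

  const errors: string[] = [];
  for (const m of mutations) {
    const target = byId.get(m.taskId);
    if (!target) {
      errors.push(`${m.taskId}: unknown task ID`);
    } else if (target.completed) {
      errors.push(`${m.taskId}: task is completed and cannot be changed; add a new task instead`);
    }
  }
  // Reject the whole replan if any mutation is invalid, so nothing is half-applied.
  if (errors.length > 0) return { ok: false, errors };

  const titles = new Map<string, string>();
  for (const m of mutations) titles.set(m.taskId, m.newTitle);
  return {
    ok: true,
    tasks: tasks.map((t) => ({ ...t, title: titles.get(t.id) ?? t.title })),
  };
}
```

Checking every mutation before applying any of them is one way to return an actionable error payload without leaving the plan partially rewritten.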

mac-tools 12 tools

mac-tools — pi extension. Gives the agent macOS automation capabilities via a Swift CLI that interfaces with Accessibility APIs, NSWorkspace, and CGWindowList.

mac_check_permissions — Check whether macOS Accessibility and Screen Recording permissions are enabled for the current terminal. Returns { accessibilityEnabled, screenRecordingEnabled }. Accessibility is required for UI automation; Screen Recording is required for mac_screenshot. Both are granted in System Settings > Privacy & Security.
mac_list_apps — List all running macOS applications. Returns an array of { name, bundleId, pid, isActive } for user-facing apps (regular activation policy). Set includeBackground to true to also include accessory/background apps.
mac_launch_app — Launch a macOS application by name or bundle ID. Returns { launched, name, bundleId, pid } on success. Provide either 'name' (e.g. 'TextEdit') or 'bundleId' (e.g. 'com.apple.TextEdit').
mac_activate_app — Bring a running macOS application to the front. Returns { activated, name } on success. Errors if the app is not running. Provide either 'name' or 'bundleId'.
mac_quit_app — Quit a running macOS application. Returns { quit, name } on success. Errors if the app is not running. Provide either 'name' or 'bundleId'.
mac_list_windows — List all on-screen windows for a macOS application. Returns an array of { windowId, title, bounds: {x,y,width,height}, isOnScreen, layer }. The windowId can be used with getWindowInfo for detailed inspection or with screenshotWindow for capture. Returns an empty array (not error) if the app is running but has no visible windows. Errors if the app is not running.
mac_find — Find UI elements in a macOS application's accessibility tree. Three modes: 'search' (default) finds elements matching role/title/value/identifier criteria and returns a numbered list of matches; 'tree' dumps the full accessibility subtree as an indented tree (use maxDepth/maxCount to bound output); 'focused' gets the currently focused element in the app and needs no criteria. The 'app' param accepts an app name (e.g. 'Finder') or bundle ID (e.g. 'com.apple.Finder').
mac_get_tree — Get a compact accessibility tree of a macOS application's UI structure. Returns an indented tree showing role, title, and value of each element. Tighter defaults than mac_find's tree mode, designed for quick structure inspection. Each line: `role "title" [value]` with 2-space indent per depth level. Omits title/value when nil or empty.
mac_click — Click a UI element in a macOS application by performing AXPress. Finds the first element matching the given criteria (role, title, value, identifier) and clicks it. At least one criterion is required. Returns the clicked element's attributes.
mac_type — Type text into a UI element in a macOS application by setting its AXValue attribute. Finds the first element matching the given criteria and sets its value. Returns the actual value after setting (read-back verification). At least one criterion is required.
mac_screenshot — Take a screenshot of a macOS application window by its window ID (from mac_list_windows). Returns the screenshot as an image content block for visual analysis, alongside text metadata (dimensions and format). Requires Screen Recording permission — use mac_check_permissions to verify.
mac_read — Read one or more accessibility attributes from a UI element in a macOS application. Finds the first element matching the given criteria and reads the named attribute(s). AXValue subtypes (CGPoint, CGSize, CGRect, CFRange) are automatically unpacked to structured dicts. Use 'attribute' for a single attribute or 'attributes' for multiple. At least one search criterion is required.
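
The line format described for mac_get_tree (`role "title" [value]`, two spaces of indent per depth level, title and value omitted when empty) can be sketched as a small renderer. This is an illustration under assumptions: the `AXNode` shape and `renderTree` function are hypothetical, and the Swift CLI's actual data model is not shown in this document.

```typescript
// Hypothetical node shape for illustration; not the Swift CLI's real output type.
type AXNode = { role: string; title?: string; value?: string; children?: AXNode[] };

// Render one line per element, indented 2 spaces per depth level.
function renderTree(node: AXNode, depth: number = 0): string[] {
  let line = " ".repeat(depth * 2) + node.role;
  if (node.title) line += ` "${node.title}"`; // skipped when undefined or empty
  if (node.value) line += ` [${node.value}]`; // skipped when undefined or empty
  const lines = [line];
  for (const child of node.children ?? []) {
    lines.push(...renderTree(child, depth + 1));
  }
  return lines;
}
```

For a window titled "Untitled" containing a text area with value "hi", this yields `AXWindow "Untitled"` followed by an indented `AXTextArea [hi]`.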

async-jobs 3 tools

Async Jobs Extension. Allows bash commands to run in the background.

async_bash — Run a bash command in the background. Returns a job ID immediately so you can continue working. Use await_job to get results or cancel_job to stop. Ideal for long-running builds, tests, or installs. Output is truncated to the last ${DEFAULT_MAX_LINES} lines or ${DEFAULT_MAX_BYTES / 1024}KB.
await_job —
cancel_job —
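
The tail truncation mentioned for async_bash (keep only the last lines, bounded by a line count or a size limit) might look roughly like the sketch below. The function name `truncateTail` is hypothetical, the limits are passed in because the actual defaults are not given here, and the size limit is approximated in characters rather than bytes to keep the example dependency-free.

```typescript
// Keep only the tail of the output, stopping at whichever limit is reached first.
// Assumption: size is measured in characters here, as a stand-in for bytes.
function truncateTail(output: string, maxLines: number, maxChars: number): string {
  const lines = output.split("\n");
  const kept: string[] = [];
  let used = 0;
  // Walk backwards from the last line so the most recent output survives.
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = lines[i] ?? "";
    const cost = line.length + 1; // +1 for the newline separator
    if (kept.length >= maxLines || used + cost > maxChars) break;
    kept.unshift(line);
    used += cost;
  }
  return kept.join("\n");
}
```

Note that in this sketch a final line longer than the size limit produces an empty result; a real implementation would likely cut inside the line instead.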

mcp-client 3 tools

MCP Client Extension — Native MCP server integration for pi. Provides on-demand access to MCP servers configured in project files (.mcp.json, .gsd/mcp.json) using the @modelcontextprotocol/sdk Client directly, with no external CLI dependency required.

mcp_servers — List all available MCP servers configured in project files (.mcp.json, .gsd/mcp.json). Shows server names, transport type, and connection status. Use mcp_discover to get full tool schemas for a server.
mcp_discover — Get detailed tool signatures and JSON schemas for a specific MCP server. Connects to the server on first call (lazy connection). Use this to understand what tools a server provides and what arguments they accept before calling them with mcp_call.
mcp_call — Call a tool on an MCP server. Provide the server name, tool name, and arguments. Connects to the server on first call (lazy connection). Use mcp_discover first to see available tools and their required arguments.
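
Both mcp_discover and mcp_call are described as connecting on first call (lazy connection). A minimal sketch of that pattern follows. It does not use the @modelcontextprotocol/sdk API: `Connection`, `LazyConnections`, and the injected `connect` function are hypothetical stand-ins, and the code is synchronous for brevity where a real client would connect asynchronously.

```typescript
// Hypothetical connection handle; a real one would wrap an SDK client.
type Connection = { server: string };

class LazyConnections {
  private cache = new Map<string, Connection>();
  private connect: (server: string) => Connection;
  connectCount = 0; // exposed only so the sketch can show how often we connect

  constructor(connect: (server: string) => Connection) {
    this.connect = connect;
  }

  // Return the cached connection, connecting only on the first request for a server.
  get(server: string): Connection {
    let conn = this.cache.get(server);
    if (!conn) {
      conn = this.connect(server);
      this.connectCount++;
      this.cache.set(server, conn);
    }
    return conn;
  }
}
```

With this shape, listing servers costs nothing, and the connection cost is paid once per server the first time a tool on it is discovered or called.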

search-the-web 3 tools

Web Search Extension v4. Native Anthropic hooks stay eager.

fetch_page — Fetch a web page and extract its content as clean markdown. Use this to read the full content of URLs found via search-the-web. Uses Jina Reader for high-quality markdown extraction. Control the amount of content returned with maxChars (default: 8000, max: 30000).
search_and_read — Search the web AND read page content in a single call. Returns pre-extracted, relevance-scored text from multiple pages, so no separate fetch_page call is needed. Best when you need content, not just links. For selective URL browsing, use search-the-web + fetch_page instead.
search-the-web — Search the web using Brave Search API. Returns top results with titles, URLs, descriptions, extra contextual snippets, result ages, and optional AI summary. Supports freshness filtering, domain filtering, and auto-detects recency-sensitive queries.

context7 2 tools

Context7 Documentation Extension. Replaces the context7 MCP server with a native pi extension.

resolve_library — Search the Context7 library catalogue by name and return matching libraries with metadata. Use this to find the correct library ID before fetching documentation. Results are ranked by trustScore (0–10) and benchmarkScore — prefer the highest. If you already have a library ID (e.g. /vercel/next.js), skip this and call get_library_docs directly.
get_library_docs — Fetch up-to-date documentation from Context7 for a specific library. Pass the library ID from resolve_library (e.g. /websites/react_dev) and a focused topic query to get the most relevant snippets. The tokens parameter controls how much documentation to retrieve (default 5000, max 10000). A specific query (e.g. 'server actions form submission') returns better results than a broad one.
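
Choosing a library ID from resolve_library results ("ranked by trustScore and benchmarkScore, prefer the highest") can be sketched as a simple comparison. The `LibraryMatch` shape and `pickBestLibrary` function are hypothetical, and using benchmarkScore only as a tie-breaker is an assumption; the text above does not say how the two scores are combined.

```typescript
// Hypothetical result shape; only the two score names come from the description above.
type LibraryMatch = { id: string; trustScore: number; benchmarkScore: number };

// Prefer the highest trustScore; break ties with benchmarkScore (an assumed ordering).
function pickBestLibrary(matches: LibraryMatch[]): LibraryMatch | undefined {
  const sorted = [...matches].sort(
    (a, b) => b.trustScore - a.trustScore || b.benchmarkScore - a.benchmarkScore,
  );
  return sorted[0];
}
```

The returned `id` is what would then be passed to get_library_docs.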

ask-user-questions 1 tool

Request User Input — LLM tool for asking the user questions. Thin wrapper around the shared interview-ui.

ask_user_questions — Request user input for one to three short questions and wait for the response. Single-select questions have 2-3 mutually exclusive options with a free-form 'None of the above' added automatically. Multi-select questions (allowMultiple: true) let the user toggle multiple options with SPACE and confirm with ENTER.
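
The limits stated for ask_user_questions (one to three questions, two or three options for single-select) can be expressed as a pre-flight check. This is a sketch under assumptions: the `Question` shape with `prompt` and `options` fields and the `validateQuestions` function are hypothetical, and only `allowMultiple` is a name taken from the description above.

```typescript
// Hypothetical question shape; only allowMultiple is named in the tool description.
type Question = { prompt: string; options: string[]; allowMultiple?: boolean };

// Return a list of problems; an empty list means the request fits the stated limits.
function validateQuestions(questions: Question[]): string[] {
  const errors: string[] = [];
  if (questions.length < 1 || questions.length > 3) {
    errors.push("expected one to three questions");
  }
  questions.forEach((q, i) => {
    // The 2-3 option limit is stated for single-select questions only.
    if (!q.allowMultiple && (q.options.length < 2 || q.options.length > 3)) {
      errors.push(`question ${i + 1}: single-select needs 2 or 3 options`);
    }
  });
  return errors;
}
```

The free-form 'None of the above' choice is added automatically according to the description, so it is not counted among the options here.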

bg-shell 1 tool

Background Shell Extension v2. Command/tool registration is deferred in interactive mode so startup does not block on the full background-process stack before the TUI paints.

bg_shell — Run shell commands in the background without blocking. Manages persistent background processes with intelligent lifecycle tracking. Actions: start (launch with auto-classification & readiness detection), digest (structured summary, ~30 tokens vs ~2000 raw), output (raw lines with incremental delivery), wait_for_ready (block until process signals readiness), send (write stdin), send_and_wait (expect-style: send + wait for output pattern), run (execute a command on a persistent shell session, block until done, return output + exit code), env (query shell cwd and environment variables), signal (send OS signal), list (all processes with status), kill (terminate), restart (kill + relaunch), group_status (health of a process group), highlights (significant output lines only).

get-secrets-from-user 1 tool

get-secrets-from-user — paged secure env var collection + apply. Collects secrets one per page via masked TUI input, then writes them to .env (local), Vercel, or Convex.

secure_env_collect — Collect one or more env vars through a paged masked-input UI, then write them to .env, Vercel, or Convex. Values are shown masked to the user (e.g. sk-ir***dgdh) and never echoed in tool output.
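
The masked display in the example above (sk-ir***dgdh) suggests showing a short prefix and suffix with the middle hidden. A minimal sketch, assuming a 5-character prefix and 4-character suffix inferred from that single example; `maskSecret` is a hypothetical name and the extension's real masking rules are not documented here.

```typescript
// Show the first 5 and last 4 characters, hiding everything in between.
// Assumption: the 5/4 split is inferred from the example sk-ir***dgdh.
function maskSecret(value: string): string {
  // Values this short would be mostly revealed by a prefix and suffix, so hide them fully.
  if (value.length <= 9) return "***";
  return `${value.slice(0, 5)}***${value.slice(-4)}`;
}
```

For instance, `maskSecret("sk-ir1234567dgdh")` returns `sk-ir***dgdh`.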

google-search 1 tool

Google Search Extension. Provides a `google_search` tool that performs web searches via Gemini's Google Search grounding feature.

google_search — Search the web using Google Search via Gemini. Returns an AI-synthesized answer grounded in Google Search results, plus source URLs. Use this when you need current information from the web: recent events, documentation, product details, technical references, news, etc. Requires GEMINI_API_KEY or Google login. Alternative to Brave-based search tools.

subagent 1 tool

Subagent Tool - Delegate tasks to specialized agents. Spawns a separate `pi` process for each subagent invocation, giving it an isolated context window.

subagent — Delegate tasks to specialized subagents with isolated context windows. Each subagent is a separate pi process with its own tools, model, and system prompt. Modes: single ({ agent, task }), parallel ({ tasks: [{agent, task},...] }), chain ({ chain: [{agent, task},...] } with {previous} placeholder). Agents are defined as .md files in ~/.gsd/agent/agents/ (user) or .gsd/agents/ (project). Use the /subagent command to list available agents and their descriptions. Use chain mode to pipeline: scout finds context, planner designs, worker implements.
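
Chain mode passes each step's output to the next through the {previous} placeholder. The sketch below illustrates that substitution only: `ChainStep` mirrors the `{agent, task}` shape from the description, while `RunAgent` and `runChain` are hypothetical, and the runner is synchronous for brevity where the real tool spawns a separate pi process per step.

```typescript
type ChainStep = { agent: string; task: string };

// Hypothetical runner: takes an agent name and a task, returns that agent's output.
type RunAgent = (agent: string, task: string) => string;

// Run steps in order, replacing every {previous} with the prior step's output.
function runChain(chain: ChainStep[], run: RunAgent): string {
  let previous = "";
  for (const step of chain) {
    // split/join substitutes literally, so "$" in the output is not treated specially.
    const task = step.task.split("{previous}").join(previous);
    previous = run(step.agent, task);
  }
  return previous;
}
```

In a scout, planner, worker pipeline, the planner's task text would embed the scout's findings, and the worker's task text would embed the planner's design.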

universal-config 1 tool

Universal Config Discovery Extension. Auto-detects and displays configuration from 8 AI coding tools: Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Cline, GitHub Copilot, and VS Code.

discover_configs — Scan for existing AI coding tool configurations in this project and the user's home directory. Discovers MCP servers, rules, context files, settings, Claude skills, and Claude plugins from Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Cline, GitHub Copilot, and VS Code. Read-only — never modifies config files.

aws-auth

AWS Auth Refresh Extension. Automatically refreshes AWS credentials when Bedrock API requests fail with authentication/token errors, then retries the user's message.
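
The refresh-then-retry behavior can be sketched as a small wrapper. This is an illustration, not the extension's implementation: `Outcome`, `withAuthRetry`, and the injected `send` and `refresh` callbacks are hypothetical, no AWS SDK calls are made, and the code is synchronous for brevity. Retrying only once is also an assumption.

```typescript
// Hypothetical result type: either a value or a failure flagged as auth-related or not.
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; authError: boolean; message: string };

// Send once; on an auth/token failure, refresh credentials and retry a single time.
function withAuthRetry<T>(send: () => Outcome<T>, refresh: () => void): Outcome<T> {
  const first = send();
  if (first.ok || !first.authError) return first; // success or unrelated failure
  refresh();
  return send();
}
```

Failures that are not authentication errors are returned unchanged, so the wrapper never refreshes credentials for unrelated problems.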