# browser_mcp
MCP server (Go) that exposes the registry's CDP browser-control functions
(`fn-registry/functions/browser`) as MCP tools. Drive a live Chrome/Chromium over the
Chrome DevTools Protocol: navigate, read the DOM, click, manage cookies, evaluate
JavaScript, operate iframes, and persist/restore session state.
The server exposes 33 tools, grouped by domain. See `app.md` for the full per-tool
reference and the "Omitido en v1" ("Omitted in v1") section.
## Build
```bash
cd projects/web_scraping/apps/browser_mcp
go mod tidy # first time only
go build -o browser_mcp .
```
`browser_mcp` only imports `fn-registry/functions/browser` (no sqlite/cgo), so a plain
`go build` works. If transitive deps ever require it, fall back to
`CGO_ENABLED=1 go build -tags fts5 -o browser_mcp .`.
## Architecture: live CDP connection pool
Unlike `registry_mcp` (one DB handle), `browser_mcp` keeps a **pool of live CDP
connections** keyed by port. A CDP connection is a live WebSocket session to a "page"
tab; reusing it avoids paying the ~50-200ms handshake on every tool call and preserves
state between calls (e.g. the persistent dialog auto-handler armed by `handle_dialog`).
If a call fails with a dead-connection error (Chrome may have closed the tab between
calls), the pool retries it once. See `pool.go` and `deps.withConn` in `main.go`.
## Register in Claude Code
Add the server to a `.mcp.json` file (the project's `projects/web_scraping/.mcp.json` already includes it):
```json
{
"mcpServers": {
"browser": {
"command": "/home/enmanuel/fn_registry/projects/web_scraping/apps/browser_mcp/browser_mcp",
"args": []
}
}
}
```
For an inspection-only session that cannot mutate browser state, pass `"args": ["--read-only"]`.
## Transports
- **stdio** (default) — for MCP clients.
- **HTTP** — `./browser_mcp --http :7740` (Streamable HTTP). `--bind 0.0.0.0` requires
`REGISTRY_API_TOKEN` (bearer auth).
## Example session
Assuming Chrome is already running with `--remote-debugging-port=9222` (or that you call
`browser_launch` first), a typical agent flow looks like this:
```
browser_launch { "port": 9222, "url": "https://example.com" } # -> "launched pid=... port=9222"
browser_connect { "port": 9222 } # -> "connected port=9222"
tab_navigate { "port": 9222, "url": "https://example.com" }
page_wait_load { "port": 9222, "timeout_ms": 10000 }
page_get_html { "port": 9222 } # -> serialized HTML (truncated 200k)
dom_find_by_text { "port": 9222, "text": "More information" } # -> "a" / "#id" selector
dom_click { "port": 9222, "selector": "a" }
page_eval_js { "port": 9222, "expression": "document.title" } # -> page title
page_screenshot { "port": 9222, "path": "/tmp/example.png", "full_page": true }
browser_disconnect { "port": 9222 }
```
Cookies, iframes (`frame_list` -> `frame_eval`/`frame_get_html`), keyboard/scroll
(`press_key`, `scroll`), JS dialogs (`handle_dialog`), and session persistence
(`storage_save` / `storage_load`) follow the same per-port pattern.
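
On the wire, each line of the example session is an MCP `tools/call` JSON-RPC request whose `arguments` object is the JSON shown next to the tool name. The sketch below builds one such request for `tab_navigate`; the `toolCall` and `callParams` types and the `newToolCall` helper are illustrative names, not part of `browser_mcp`, and an MCP client library would normally handle this framing for you.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the JSON-RPC 2.0 envelope of an MCP "tools/call" request.
type toolCall struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// callParams carries the tool name and its arguments.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// newToolCall (hypothetical helper) serializes one tool invocation.
func newToolCall(id int, name string, args map[string]any) ([]byte, error) {
	return json.Marshal(toolCall{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: name, Arguments: args},
	})
}

func main() {
	b, err := newToolCall(1, "tab_navigate", map[string]any{
		"port": 9222,
		"url":  "https://example.com",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Every tool takes the same `port` argument, which is how a call is routed to the right pooled connection.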