feat: P0 LLM-readiness - isolated Chrome (9333), deterministic tab_select, page_get_text, page_perceive
@@ -5,9 +5,23 @@ MCP server (Go) that exposes the registry's CDP browser-control functions
 Chrome DevTools Protocol: navigate, read the DOM, click, manage cookies, evaluate
 JavaScript, operate iframes, and persist/restore session state.

-33 tools total, grouped by domain. See `app.md` for the full per-tool reference and the
+36 tools total, grouped by domain. See `app.md` for the full per-tool reference and the
 "Omitido en v1" section.

+## Security: isolated Chrome by default (port 9333)
+
+**By default the MCP operates on its OWN isolated Chrome, NOT the user's daily browser.**
+
+In this ecosystem the user's daily chromium has CDP enabled globally on port **9222** (via
+`/etc/chromium.d/cdp`). If the MCP defaulted there, the agent could drive the user's own
+tabs (banking, email). To prevent that:
+
+- The default CDP port is **9333** (the MCP's dedicated Chrome), not 9222.
+- `browser_launch` without `user_data_dir` uses a dedicated isolated profile
+  (`<tmp>/browser_mcp_userdata`) on port 9333.
+- **Port 9222 = the daily browser.** Pass `port: 9222` explicitly, with care, only when you
+  deliberately want to attach to it.
+
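The defaulting rules in the list above can be sketched in Go roughly as follows. This is an illustration only, not the server's actual code: `resolvePort`, `resolveUserDataDir`, and both constants are hypothetical names, and `<tmp>` is assumed to mean `os.TempDir()`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

const (
	defaultCDPPort = 9333 // the MCP's dedicated, isolated Chrome
	dailyCDPPort   = 9222 // the user's daily browser; never chosen implicitly
)

// resolvePort returns the CDP port for a tool call.
// A zero value means the caller omitted "port", so the isolated default applies.
func resolvePort(requested int) int {
	if requested == 0 {
		return defaultCDPPort
	}
	return requested
}

// resolveUserDataDir returns the profile directory for browser_launch.
// An empty value means "user_data_dir" was omitted, so the dedicated
// profile under the temp directory is used (assumed location).
func resolveUserDataDir(requested string) string {
	if requested == "" {
		return filepath.Join(os.TempDir(), "browser_mcp_userdata")
	}
	return requested
}

func main() {
	fmt.Println(resolvePort(0))            // port omitted: isolated default
	fmt.Println(resolvePort(dailyCDPPort)) // explicit opt-in to the daily browser
	fmt.Println(resolveUserDataDir(""))
}
```

The point of the design is that reaching the daily browser always requires an explicit value from the caller; no code path falls back to 9222.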
 ## Build

 ```bash
@@ -54,22 +68,21 @@ For an inspection-only session that cannot mutate browser state, pass `"args": [

 ## Example session

-Assuming a Chrome already running with `--remote-debugging-port=9222` (or call
-`browser_launch` first), a typical agent flow:
+The default port is **9333** (the MCP's isolated Chrome). A typical LLM-readiness agent
+flow (launch isolated Chrome, pick the right tab, perceive the page, act, read the result):

 ```
-browser_launch    { "port": 9222, "url": "https://example.com" }  # -> "launched pid=... port=9222"
-browser_connect   { "port": 9222 }  # -> "connected port=9222"
-tab_navigate      { "port": 9222, "url": "https://example.com" }
-page_wait_load    { "port": 9222, "timeout_ms": 10000 }
-page_get_html     { "port": 9222 }  # -> serialized HTML (truncated 200k)
-dom_find_by_text  { "port": 9222, "text": "More information" }  # -> "a" / "#id" selector
-dom_click         { "port": 9222, "selector": "a" }
-page_eval_js      { "port": 9222, "expression": "document.title" }  # -> page title
-page_screenshot   { "port": 9222, "path": "/tmp/example.png", "full_page": true }
-browser_disconnect{ "port": 9222 }
+browser_launch    { "url": "https://example.com" }  # -> "launched pid=... port=9333 user_data_dir=<tmp>/browser_mcp_userdata"
+tab_list          { }  # -> JSON list of targets (id, type, url, title)
+tab_select        { "match": "example.com" }  # -> "selected target matching: example.com" (deterministic, by id or URL substring)
+page_perceive     { }  # -> indented accessibility outline (roles, names, #ref); the LLM "sees" the page compactly
+dom_click         { "selector": "a" }  # act on what you perceived
+page_get_text     { "selector": "body", "max_bytes": 20000 }  # -> visible innerText, compact (does NOT blow up the context like page_get_html)
+browser_disconnect{ }
 ```

+To attach to the daily browser instead, pass `port: 9222` explicitly in each call (with care).
+
 Cookies, iframes (`frame_list` -> `frame_eval`/`frame_get_html`), keyboard/scroll
 (`press_key`, `scroll`), JS dialogs (`handle_dialog`), and session persistence
 (`storage_save` / `storage_load`) follow the same per-port pattern.
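As a rough client-side sketch of the per-port pattern, the Go snippet below builds MCP `tools/call` requests in which `port` is just another optional argument. The JSON-RPC envelope follows the MCP specification. `buildToolCall` and the struct names are hypothetical, and it is an assumption that `frame_list` needs no arguments other than `port`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// callParams and request model the JSON-RPC 2.0 envelope of an MCP tools/call.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

type request struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// buildToolCall serializes one tool call. A port of 0 omits "port" so the
// server's default (the isolated Chrome) applies; any other value is sent
// explicitly, for example 9222 for the daily browser.
func buildToolCall(id int, tool string, port int, args map[string]any) ([]byte, error) {
	merged := map[string]any{}
	for k, v := range args {
		merged[k] = v
	}
	if port != 0 {
		merged["port"] = port
	}
	return json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: tool, Arguments: merged},
	})
}

func main() {
	isolated, _ := buildToolCall(1, "frame_list", 0, nil)
	daily, _ := buildToolCall(2, "frame_list", 9222, nil)
	fmt.Println(string(isolated))
	fmt.Println(string(daily))
}
```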