egutierrez 254f089982 fix: kill the chromium processes the MCP launches to close the RAM leak
The pool never stored the PID of the Chrome launched by browser_launch, so
closeAll() and drop() closed with CdpClose(c, 0): they only released the WebSocket
and left the chromium process alive and orphaned (~789 MiB RSS each). Repeated
calls to browser_launch piled up instances without limit until RAM was exhausted
(outage of 06/06/2026, ~35 orphaned chromium processes).

Changes:
- pool.go: the pool records the launched PID per port (map `pids`) with
  setPID/getPID/clearPID/launchedCount. drop() and closeAll() kill the whole
  process group (CdpClose with the real pid) ONLY if the PID is registered, that
  is, if the MCP launched it. An external Chrome with no registered PID (the
  user's daily browser on 9222) is never killed: pid=0 only closes the WebSocket.
  The new releaseConn() releases only the WebSocket and preserves the PID, for
  internal reconnection (it must not kill the browser).
- tools_session.go: handleLaunch records the PID returned by ChromeLaunch
  (setPID); it is idempotent per port (reuses the Chrome already launched), passes
  ReuseExisting=true so it does not duplicate a Chrome already alive on the port,
  and enforces a hard cap of 4 instances (maxLaunchedChromes), returning a tool
  error when the cap is exceeded. browser_disconnect now kills the MCP's own Chrome.
- main.go: SIGTERM/SIGINT handler that calls closeAll before exiting (defers
  do not run when a signal is received). The withConn retry uses releaseConn
  instead of drop so that reconnecting does not kill the Chrome.
- pool_test.go: logic tests without Chrome (cap, idempotency, PID lifecycle, drop).
- pool_e2e_test.go: tests with a real Chrome (gated by BMCP_E2E=1): golden path
  (3 launches → closeAll → 0 orphans), same-port dedup, and the own-vs-external safeguard.
- app.md: e2e_checks (build, unit, leak_no_orphans) + growth log + bump to 0.5.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 17:06:14 +02:00

browser_mcp

MCP server (Go) that exposes the registry's CDP browser-control functions (fn-registry/functions/browser) as MCP tools. Drive a live Chrome/Chromium over the Chrome DevTools Protocol: navigate, read the DOM, click, manage cookies, evaluate JavaScript, operate iframes, and persist/restore session state.

36 tools total, grouped by domain. See app.md for the full per-tool reference and the "Omitido en v1" ("Omitted in v1") section.

Security: isolated Chrome by default (port 9333)

By default the MCP operates on its OWN isolated Chrome, NOT the user's daily browser.

In this ecosystem the user's daily chromium has CDP enabled globally on port 9222 (via /etc/chromium.d/cdp). If the MCP defaulted there, the agent could drive the user's own tabs (banking, email). To prevent that:

  • The default CDP port is 9333 (the MCP's dedicated Chrome), not 9222.
  • browser_launch without user_data_dir uses a dedicated isolated profile (<tmp>/browser_mcp_userdata) on port 9333.
  • Port 9222 = the daily browser. Pass port: 9222 explicitly, with care, only when you deliberately want to attach to it.
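
The port policy above can be sketched in a few lines of Go. This is only an illustration: resolvePort and targetsDailyBrowser are hypothetical names, not functions taken from browser_mcp, and treating 0 as "port omitted" is an assumption made for the example.

```go
package main

import "fmt"

const (
	defaultPort = 9333 // the MCP's dedicated, isolated Chrome (per this README)
	dailyPort   = 9222 // the user's daily browser (per this README)
)

// resolvePort is a hypothetical helper: an omitted port (0 here, by
// assumption) falls back to the isolated default, never to 9222.
func resolvePort(requested int) int {
	if requested == 0 {
		return defaultPort
	}
	return requested
}

// targetsDailyBrowser reports whether a call would attach to the daily
// browser, which only happens when the caller asks for 9222 explicitly.
func targetsDailyBrowser(requested int) bool {
	return resolvePort(requested) == dailyPort
}

func main() {
	for _, p := range []int{0, 9333, 9222} {
		fmt.Printf("requested=%d resolved=%d daily=%v\n",
			p, resolvePort(p), targetsDailyBrowser(p))
	}
}
```

The point of the design is that the safe choice needs no argument at all, while the risky one has to be spelled out in every call.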

Build

cd projects/web_scraping/apps/browser_mcp
go mod tidy        # first time only
go build -o browser_mcp .

browser_mcp only imports fn-registry/functions/browser (no sqlite/cgo), so a plain go build works. If transitive dependencies ever pull in sqlite/cgo, fall back to this command (the trailing dot is the package path): CGO_ENABLED=1 go build -tags fts5 -o browser_mcp .

Architecture: live CDP connection pool

Unlike registry_mcp (one DB handle), browser_mcp keeps a pool of live CDP connections keyed by port. A CDP connection is a live WebSocket session to a "page" tab; reusing it avoids paying the ~50-200ms handshake on every tool and preserves state between tools (e.g. the persistent dialog auto-handler armed by handle_dialog). The pool retries once on a dead-connection error (Chrome may have closed the tab between tools). See pool.go and deps.withConn in main.go.

Register in Claude Code

Add to a .mcp.json (the project's projects/web_scraping/.mcp.json already has it):

{
  "mcpServers": {
    "browser": {
      "command": "/home/enmanuel/fn_registry/projects/web_scraping/apps/browser_mcp/browser_mcp",
      "args": []
    }
  }
}

For an inspection-only session that cannot mutate browser state, pass "args": ["--read-only"].

Transports

  • stdio (default): for MCP clients.
  • HTTP: ./browser_mcp --http :7740 (Streamable HTTP). --bind 0.0.0.0 requires REGISTRY_API_TOKEN (bearer auth).

Example session

The default port is 9333 (the MCP's isolated Chrome). A typical LLM-readiness agent flow — launch isolated Chrome, pick the right tab, perceive the page, act, read result:

browser_launch    { "url": "https://example.com" }                  # -> "launched pid=... port=9333 user_data_dir=<tmp>/browser_mcp_userdata"
tab_list          { }                                               # -> JSON list of targets (id, type, url, title)
tab_select        { "match": "example.com" }                        # -> "selected target matching: example.com" (deterministic, by id or URL substring)
page_perceive     { }                                               # -> indented accessibility outline (roles, names, #ref) — the LLM "sees" the page compactly
dom_click         { "selector": "a" }                               # act on what you perceived
page_get_text     { "selector": "body", "max_bytes": 20000 }        # -> visible innerText, compact (does NOT blow up the context like page_get_html)
browser_disconnect{ }                                               # release the connection; also kills the Chrome if the MCP launched it

To attach to the daily browser instead, pass port: 9222 explicitly in each call (with care).

Cookies, iframes (frame_list -> frame_eval/frame_get_html), keyboard/scroll (press_key, scroll), JS dialogs (handle_dialog), and session persistence (storage_save / storage_load) follow the same per-port pattern.
