| name | kind | lang | domain | version | purity | signature | description | tags | uses_functions | uses_types | returns | returns_optional | error_type | imports | params | output | tested | tests | test_file_path | file_path |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| llm_propose_scraping_schema | function | py | infra | 1.0.0 | impure | def llm_propose_scraping_schema(url: str, ax_tree: list, max_chunks: int = 5, max_chars_per_chunk: int = 25000) -> dict | Orchestrates trim_ax_tree -> chunk_ax_tree -> N calls to the Claude CLI -> merge. Proposes a scraping schema (fields, selectors, types) from a page's AX tree. | | | | | false | error_go_core | | | dict {schema: [{field, selector, sample_value, type, source_role}], notes: str, chunks_processed: int, truncated: bool} | false | | | python/functions/infra/llm_propose_scraping_schema.py |
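The `max_chunks` and `max_chars_per_chunk` parameters suggest that the AX tree is split under a character budget before Claude is called. The sketch below shows one way such a split could work. `chunk_nodes` is a hypothetical stand-in, not the real `chunk_ax_tree`, whose implementation is not shown here. Measuring size as serialized JSON length, and setting `truncated` when chunks are dropped, are both assumptions.

```python
import json


def chunk_nodes(nodes, max_chunks=5, max_chars_per_chunk=25000):
    """Hypothetical sketch: group AX nodes into chunks under a character budget.

    Returns (chunks, truncated). truncated is True when more chunks were
    produced than max_chunks allows and the excess was dropped.
    """
    chunks = []
    current, current_len = [], 0
    for node in nodes:
        # Assumption: a node's size is the length of its JSON serialization.
        size = len(json.dumps(node, ensure_ascii=False))
        # Close the current chunk when adding this node would exceed the budget.
        if current and current_len + size > max_chars_per_chunk:
            chunks.append(current)
            current, current_len = [], 0
        # A single oversized node still gets a chunk of its own.
        current.append(node)
        current_len += size
    if current:
        chunks.append(current)
    truncated = len(chunks) > max_chunks
    return chunks[:max_chunks], truncated
```

Under this reading, each returned chunk costs one Claude call, so `max_chunks` is an upper bound on the number of calls per page.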
Example
import sys
sys.path.insert(0, "python/functions")
from infra.llm_propose_scraping_schema import llm_propose_scraping_schema
# ax_tree obtained beforehand with cdp_get_ax_tree
result = llm_propose_scraping_schema(
url="https://shop.example.com/products",
ax_tree=ax_tree,
max_chunks=3,
)
# {"schema": [{"field": "price", "selector": ".product-price", ...}], "notes": "...", ...}
for field in result["schema"]:
print(field["field"], "->", field["selector"])
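The returned dict also carries `notes`, `chunks_processed` and `truncated`, which are worth checking before trusting the proposal. The helper below is a minimal sketch for reporting them. `summarize_schema_result` is a hypothetical name, the sample values are made up for illustration, and reading `truncated` as "part of the tree was left out" is an assumption.

```python
def summarize_schema_result(result):
    """Hypothetical helper: turn a result dict into human-readable lines."""
    lines = [
        f"{len(result['schema'])} field(s) from {result['chunks_processed']} chunk(s)"
    ]
    if result["truncated"]:
        # Assumption: truncated means some of the AX tree was not analysed.
        lines.append("warning: AX tree was truncated, schema may be incomplete")
    for entry in result["schema"]:
        lines.append(f"{entry['field']} ({entry['type']}): {entry['selector']}")
    if result["notes"]:
        lines.append(f"notes: {result['notes']}")
    return lines


# Illustrative result in the documented output shape (values are invented).
sample = {
    "schema": [
        {"field": "price", "selector": ".product-price", "sample_value": "9.99",
         "type": "number", "source_role": "StaticText"},
    ],
    "notes": "",
    "chunks_processed": 3,
    "truncated": True,
}
for line in summarize_schema_result(sample):
    print(line)
```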
When to use it
Use it when you have a page's AX tree and want Claude to propose automatically which fields to extract and which CSS selectors to use. It is a discovery step before writing the recipe YAML by hand or with assistance.
Gotchas
- Requires the `claude` CLI to be installed and available on PATH (validated by `claude_cli_prompt`).
- Each chunk triggers one call to Claude (token cost). Use a conservative `max_chunks` on very large pages.
- Claude's response is parsed so that fenced code blocks are tolerated (for example, a `json` fence around the payload). If Claude returns prose without JSON, the chunk is skipped and an error note is recorded.
- Dedup by `field`: the first occurrence wins if the same field appears in several chunks.
- Does not access the network directly; it delegates to `claude_cli_prompt`.
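The parsing and dedup behaviour in the list above can be sketched as follows. This illustrates the described rules and is not the function's actual code. `extract_json` and `merge_chunk_replies` are hypothetical names, and the wording of the error note is an assumption.

```python
import json
import re

# Build the fence marker programmatically so this snippet nests cleanly in Markdown.
FENCE = "`" * 3
_FENCE_RE = re.compile(
    FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE
)


def extract_json(reply):
    """Parse JSON from a reply, with or without a surrounding code fence.

    Returns the parsed value, or None when no valid JSON is found.
    """
    match = _FENCE_RE.search(reply)
    candidate = match.group(1) if match else reply
    try:
        return json.loads(candidate.strip())
    except json.JSONDecodeError:
        return None


def merge_chunk_replies(replies):
    """Merge per-chunk replies; the first occurrence of each field wins."""
    schema, seen, notes = [], set(), []
    for index, reply in enumerate(replies):
        parsed = extract_json(reply)
        if not isinstance(parsed, dict):
            # Prose without usable JSON: skip the chunk and record a note.
            notes.append(f"chunk {index}: no JSON found, skipped")
            continue
        for entry in parsed.get("schema") or []:
            if not isinstance(entry, dict):
                continue
            name = entry.get("field")
            if name is None or name in seen:
                continue  # a later duplicate loses to the first occurrence
            seen.add(name)
            schema.append(entry)
    return {"schema": schema, "notes": "; ".join(notes)}
```

Because the first occurrence wins, the order in which chunks are processed decides which selector survives when two chunks propose the same field.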