---
name: jupyter_read
kind: function
lang: py
domain: notebook
version: "1.0.0"
purity: impure
signature: "def jupyter_read_cells(notebook_path: str, server_url: str = 'http://localhost:8888', token: str = '', cell_index: int | None = None) -> list[dict]"
description: "Reads cells from an open Jupyter notebook via the real-time collaboration protocol (CRDT/Y.js). Returns the current state, including unsaved changes. Also exposes jupyter_notebook_info() for quick metadata."
tags: [jupyter, notebook, crdt, yjs, websocket, cells, read, realtime, pendiente-usar, extractor]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: "error_go_core"
imports: [jupyter_nbmodel_client]
params:
  - name: notebook_path
    desc: "Relative path of the notebook"
  - name: server_url
    desc: "Jupyter server URL (default: http://localhost:8888)"
  - name: token
    desc: "Authentication token"
  - name: cell_index
    desc: "Index of a specific cell to read (optional)"
output: "List of dicts with cell info (index, type, source, outputs), or a dict with notebook metadata"
tested: false
tests: []
test_file_path: ""
file_path: "python/functions/notebook/jupyter_read.py"
---
## Exported functions
### `jupyter_read_cells`
```python
def jupyter_read_cells(
    notebook_path: str,
    server_url: str = "http://localhost:8888",
    token: str = "",
    cell_index: int | None = None,
) -> list[dict]
```
Reads all cells, or one specific cell, from a live Jupyter notebook.
Returns a list of dicts, each shaped like:
```python
{"index": 0, "type": "code", "source": "import pandas as pd", "outputs": ["..."]}
```
- For markdown and raw cells, the `outputs` field is omitted.
- For code cells, outputs are converted to readable text:
  - `stream` -> plain text
  - `display_data`/`execute_result` -> `text/plain`, or a placeholder such as `[HTML Output]` or `[Image Output (PNG)]`
  - `error` -> clean traceback (ANSI codes stripped)
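The conversion rules above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the helper names `output_to_text` and `_as_text`, the ANSI regex, and the order in which MIME types are checked are all assumptions, and each output is assumed to be a plain dict in the usual nbformat shape (`output_type`, `text`, `data`, `traceback`).

```python
import re

# Assumed pattern: matches ANSI CSI escape sequences such as "\x1b[0;31m".
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def _as_text(value) -> str:
    """nbformat text fields may be a single string or a list of strings."""
    return "".join(value) if isinstance(value, list) else str(value)


def output_to_text(output: dict) -> str:
    """Hypothetical helper: turn one nbformat-style output dict into readable text."""
    kind = output.get("output_type")
    if kind == "stream":
        return _as_text(output.get("text", ""))
    if kind in ("display_data", "execute_result"):
        data = output.get("data", {})
        # Prefer text/plain; otherwise fall back to a short placeholder.
        if "text/plain" in data:
            return _as_text(data["text/plain"])
        if "text/html" in data:
            return "[HTML Output]"
        if "image/png" in data:
            return "[Image Output (PNG)]"
        return ""
    if kind == "error":
        # Join the traceback lines, then strip ANSI color codes.
        return ANSI_RE.sub("", "\n".join(output.get("traceback", [])))
    return ""
```

For example, `output_to_text({"output_type": "stream", "text": ["a\n", "b\n"]})` joins the fragments into one string.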
### `jupyter_notebook_info`
```python
def jupyter_notebook_info(
    notebook_path: str,
    server_url: str = "http://localhost:8888",
    token: str = "",
) -> dict
```
Returns notebook metadata:
```python
{
    "notebook_path": "notebooks/analysis.ipynb",
    "server_url": "http://localhost:8888",
    "total_cells": 12,
    "cell_counts": {"code": 9, "markdown": 3}
}
```
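As an illustration of how `total_cells` and `cell_counts` relate to the per-cell dicts, the sketch below derives both from a list shaped like the return value of `jupyter_read_cells`. The helper name `summarize_cells` and the sample data are hypothetical; the real function may compute these values differently.

```python
from collections import Counter


def summarize_cells(cells: list[dict]) -> dict:
    """Hypothetical helper: derive total_cells and cell_counts from cell dicts."""
    counts = Counter(cell["type"] for cell in cells)
    return {"total_cells": len(cells), "cell_counts": dict(counts)}


# Sample data in the documented cell shape (made up for this example).
cells = [
    {"index": 0, "type": "markdown", "source": "# Title"},
    {"index": 1, "type": "code", "source": "x = 1", "outputs": []},
    {"index": 2, "type": "code", "source": "x + 1", "outputs": ["2"]},
]
summary = summarize_cells(cells)
print(summary)
```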
## Example
```python
from notebook.jupyter_read import jupyter_read_cells, jupyter_notebook_info

# Read all cells
cells = jupyter_read_cells(
    "notebooks/analysis.ipynb",
    server_url="http://localhost:8888",
    token="my-token",
)
for cell in cells:
    print(f"[{cell['index']}] {cell['type']}: {cell['source'][:60]}")

# Read only cell 3 (per the signature, the result is still a list of dicts)
cell = jupyter_read_cells("notebooks/analysis.ipynb", token="my-token", cell_index=3)

# Metadata only
info = jupyter_notebook_info("notebooks/analysis.ipynb", token="my-token")
print(f"Total cells: {info['total_cells']}")
```
## CLI
```bash
# Show all cells (human-readable format)
python jupyter_read.py notebooks/analysis.ipynb --token MY_TOKEN
# Show only cell 5
python jupyter_read.py notebooks/analysis.ipynb --token MY_TOKEN --cell 5
# Metadata only
python jupyter_read.py notebooks/analysis.ipynb --token MY_TOKEN --info
# JSON output (all cells)
python jupyter_read.py notebooks/analysis.ipynb --token MY_TOKEN --json
# Remote server
python jupyter_read.py notebooks/analysis.ipynb --server http://my-server:8888 --token MY_TOKEN
```
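The flags used in the commands above could be declared with `argparse` along these lines. This is only a sketch: the flag names come from the examples, but the defaults, the types, and the function name `build_parser` are assumptions, and the script's real parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the CLI examples."""
    parser = argparse.ArgumentParser(prog="jupyter_read.py")
    parser.add_argument("notebook_path", help="path relative to the Jupyter server root")
    parser.add_argument("--server", default="http://localhost:8888", help="Jupyter server URL")
    parser.add_argument("--token", default="", help="authentication token")
    parser.add_argument("--cell", type=int, default=None, help="index of a single cell")
    parser.add_argument("--info", action="store_true", help="print notebook metadata only")
    parser.add_argument("--json", action="store_true", help="print cells as JSON")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["notebooks/analysis.ipynb", "--token", "MY_TOKEN", "--cell", "5"]
)
print(args.notebook_path, args.cell, args.server)
```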
## Notes
- Uses `NbModelClient` from `jupyter_nbmodel_client` over the CRDT/Y.js protocol.
- The internal functions are `async`; the public ones wrap them with `asyncio.run()` so they can be called synchronously.
- Reads the state **held in the server's memory**, not the `.ipynb` file on disk, so unsaved changes are captured.
- `notebook_path` must be relative to the Jupyter server root, not to the local filesystem.
- For servers without a token, use `token=""` (the default).
- In human-readable mode, the CLI shows a preview of up to 8 lines of source and 4 lines per output.
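
The sync-over-async pattern mentioned in the notes can be sketched as below. The coroutine here is a stand-in that returns canned data rather than talking to a Jupyter server; the names `_read_cells_async` and `read_cells` are hypothetical and only illustrate the wrapping with `asyncio.run()`.

```python
import asyncio


async def _read_cells_async(notebook_path: str) -> list[dict]:
    """Stand-in for the real async implementation (hypothetical)."""
    await asyncio.sleep(0)  # placeholder for the websocket round trip
    return [{"index": 0, "type": "code", "source": "print('hi')", "outputs": ["hi"]}]


def read_cells(notebook_path: str) -> list[dict]:
    """Synchronous facade: run the coroutine to completion on a fresh event loop."""
    return asyncio.run(_read_cells_async(notebook_path))


cells = read_cells("notebooks/analysis.ipynb")
print(cells[0]["source"])
```

One consequence of this design: `asyncio.run()` raises `RuntimeError` when called from a thread that already has a running event loop (for example, inside a notebook cell), so wrappers of this kind are best suited to scripts and CLI use.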