feat(eda): AutomaticEDA core, chapter-based document + no-cut PDF/PPTX renderers

Introduces the intermediate layer between the content of an EDA and its output format. A document is a list of versioned chapters; each chapter is an ordered set of format-independent blocks (heading, markdown, kv_table, data_table, figure, image, caption, note).

Core (support package python/functions/datascience/automatic_eda/):
- model.py: block dataclasses + Chapter, defensive normalizers (accept a dataclass or a dict, never raise), ENGINE_VERSION and the per-chapter manifest (automatic_eda_manifest.json).
- text_layout.py: shared character-grid text measurement/wrapping.
- chapters_registry.py: pre-declared CHAPTER_ORDER + build_document with convention-based chapter auto-discovery (chapters can be added in parallel without editing the registry).
- render_pdf_impl.py: mobile A5 portrait paginator that MEASURES each block and never cuts: text in whole lines, long tables split by rows with the header repeated, figures/images scaled to fit whole. Per-chapter versioned footer.
- render_pptx_impl.py: same principle on 16:9 slides (continues on a "(cont.)" slide; tables repeat the header; figures exported as scaled PNGs).
- chapters/portada.py and chapters/overview.py: reference chapters. Portada (cover) with name, Automatic-EDA label, source, storage (inferred from source), European-format date, rows×cols, description, granularity and quality with criteria. Overview with df.head (honest placeholder if head_rows is missing), column dictionary (type/nulls/examples) and numeric describe.

Public registry functions (eda group, dict-no-throw):
- render_automatic_eda_pdf / render_automatic_eda_pptx: accept chapters or a TableProfile (building the chapters with build_document) and write the manifest. Additive: they do not replace render_eda_pdf.

Self-contained tests (no DuckDB) for both renderers: golden (portada + overview), splitting of long tables with the header repeated, no cutting of long cells or long markdown, a None/{} profile producing a valid 1-page/1-slide output, and the error path for a non-writable directory. 23 tests green (including the earlier render_eda_pdf tests, untouched).

New dependency python-pptx>=1.0.2 declared in python/pyproject.toml.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 14:30:31 +02:00
parent 5501507588
commit 9cdde4a341
17 changed files with 2563 additions and 0 deletions
@@ -0,0 +1,57 @@
+"""AutomaticEDA — chapter-based, versioned EDA document with PDF + PPTX output.
+
+Public surface (support package for the registry functions
+``render_automatic_eda_pdf`` and ``render_automatic_eda_pptx``):
+
+- Document model: ``Heading``, ``Markdown``, ``KVTable``, ``DataTable``,
+  ``Figure``, ``Image``, ``Caption``, ``Note``, ``Chapter``; normalizers
+  ``as_blocks`` / ``as_chapters``; ``ENGINE_VERSION`` / ``ENGINE_NAME``.
+- ``build_document(profile, ctx)`` — assemble the ordered chapters of a profile.
+- ``render_pdf(chapters, out_path, meta)`` / ``render_pptx(...)`` — the two
+  renderers (used by the public registry functions).
+- ``merge_manifest(...)`` — write/update the per-chapter version manifest.
+"""
+
+from __future__ import annotations
+
+from .model import (  # noqa: F401
+    ENGINE_NAME,
+    ENGINE_VERSION,
+    Caption,
+    Chapter,
+    DataTable,
+    Figure,
+    Heading,
+    Image,
+    KVTable,
+    Markdown,
+    Note,
+    as_blocks,
+    as_chapters,
+    merge_manifest,
+)
+from .chapters_registry import CHAPTER_ORDER, build_chapter, build_document  # noqa: F401
+from .render_pdf_impl import render_pdf  # noqa: F401
+from .render_pptx_impl import render_pptx  # noqa: F401
+
+__all__ = [
+    "ENGINE_NAME",
+    "ENGINE_VERSION",
+    "Heading",
+    "Markdown",
+    "KVTable",
+    "DataTable",
+    "Figure",
+    "Image",
+    "Caption",
+    "Note",
+    "Chapter",
+    "as_blocks",
+    "as_chapters",
+    "merge_manifest",
+    "CHAPTER_ORDER",
+    "build_chapter",
+    "build_document",
+    "render_pdf",
+    "render_pptx",
+]
@@ -0,0 +1,7 @@
+"""AutomaticEDA chapters.
+
+Each chapter is a module ``<id>.py`` exposing ``build_<id>(profile, ctx) ->
+Chapter | None`` and a ``CHAPTER_VERSION`` constant. The canonical document
+order lives in :mod:`automatic_eda.chapters_registry`. Implemented today:
+``portada`` and ``overview`` (the reference chapters other agents copy).
+"""
@@ -0,0 +1,176 @@
+"""Overview chapter — df.head, column dictionary and describe (reference).
+
+Second reference chapter for AutomaticEDA. Renders (across as many pages/slides
+as needed, the renderers paginate):
+
+1. ``df.head`` — the first rows of the table. The current ``TableProfile`` does
+   NOT carry the raw head, so this is read from ``ctx['head_rows']`` /
+   ``profile['head_rows']`` (a list of row dicts). When absent the chapter shows
+   an honest placeholder documenting the missing key instead of inventing data.
+2. Column dictionary — name / type / nulls / non-null examples. Examples come
+   from ``columns[i]['examples']`` when present; otherwise they are derived from
+   real non-null profile values (categorical top values, numeric min/median/max)
+   so the cell is never empty nor fabricated.
+3. ``df.describe`` — mean / median / min / max / std for every numeric column.
+
+Contract: build_<id>(profile, ctx) -> Chapter | None ; CHAPTER_VERSION = "x.y.z".
+"""
+
+from __future__ import annotations
+
+from .. import model
+
+CHAPTER_VERSION = "1.0.0"
+CHAPTER_ID = "overview"
+CHAPTER_TITLE = "Overview"
+
+# Profile/ctx keys the calculation phase must add for a full head + examples.
+HEAD_KEY = "head_rows"          # list[dict] — df.head(n)
+EXAMPLES_KEY = "examples"       # per column: list of non-null sample values
+
+
+def _fmt_num(value, decimals: int = 3) -> str:
+    if value is None:
+        return "—"
+    if isinstance(value, bool):
+        return str(value)
+    if isinstance(value, int):
+        return f"{value:,}".replace(",", ".")
+    if isinstance(value, float):
+        if value != value:  # NaN
+            return "NaN"
+        if value in (float("inf"), float("-inf")):
+            return str(value)
+        text = f"{value:.{decimals}f}".rstrip("0").rstrip(".")
+        return text if text else "0"
+    return str(value)
+
+
+def _fmt_pct(value, decimals: int = 1) -> str:
+    if value is None:
+        return "—"
+    try:
+        return f"{float(value) * 100:.{decimals}f}%"
+    except (TypeError, ValueError):
+        return str(value)
+
+
+def _examples_for(col: dict) -> str:
+    """Build a short string of real non-null example values for a column."""
+    explicit = col.get(EXAMPLES_KEY)
+    if isinstance(explicit, (list, tuple)) and explicit:
+        return ", ".join(model._safe_str(v) for v in explicit[:4])
+    cat = col.get("categorical") or {}
+    top = cat.get("top") or []
+    if top:
+        vals = [model._safe_str((t or {}).get("value")) for t in top[:4]
+                if isinstance(t, dict)]
+        vals = [v for v in vals if v]
+        if vals:
+            return ", ".join(vals)
+    num = col.get("numeric") or {}
+    if num:
+        bits = []
+        for key in ("min", "median", "max"):
+            v = num.get(key)
+            if v is not None:
+                bits.append(_fmt_num(v))
+        if bits:
+            return ", ".join(bits)
+    return "—"
+
+
+def _head_block(profile: dict, ctx: dict):
+    """Return a DataTable for df.head, or a Note documenting the missing key."""
+    head = ctx.get(HEAD_KEY) or profile.get(HEAD_KEY)
+    if isinstance(head, list) and head and isinstance(head[0], dict):
+        # Column order from the profile; fall back to the first row's keys.
+        cols = [c.get("name") for c in (profile.get("columns") or [])
+                if isinstance(c, dict) and c.get("name")]
+        if not cols:
+            cols = list(head[0].keys())
+        rows = [[model._safe_str(r.get(c)) for c in cols] for r in head[:10]]
+        return model.DataTable(header=cols, rows=rows,
+                               note=f"primeras {len(rows)} filas")
+    return model.Note(
+        "df.head no disponible: el TableProfile no incluye 'head_rows'. La fase "
+        "de cálculo debe añadir profile['head_rows'] (lista de dicts fila) o "
+        "pasarlo en ctx['head_rows'] para mostrar las primeras filas.")
+
+
+def _columns_block(profile: dict):
+    cols = profile.get("columns") or []
+    header = ["Columna", "Tipo", "Nulos", "Ejemplos (no nulos)"]
+    rows = []
+    for c in cols:
+        if not isinstance(c, dict):
+            continue
+        name = c.get("name") or "(col)"
+        ctype = c.get("inferred_type") or c.get("physical_type") or "—"
+        sem = c.get("semantic_type")
+        if sem:
+            ctype = f"{ctype} ({sem})"
+        null_pct = c.get("null_pct")
+        null_count = c.get("null_count")
+        if null_pct is not None:
+            nulls = _fmt_pct(null_pct)
+            if null_count is not None:
+                nulls += f" ({null_count})"
+        elif null_count is not None:
+            nulls = str(null_count)
+        else:
+            nulls = "—"
+        rows.append([name, ctype, nulls, _examples_for(c)])
+    if not rows:
+        return None
+    return model.DataTable(header=header, rows=rows, title="Columnas")
+
+
+def _describe_block(profile: dict):
+    cols = profile.get("columns") or []
+    header = ["Columna", "mean", "median", "min", "max", "std"]
+    rows = []
+    for c in cols:
+        if not isinstance(c, dict) or c.get("inferred_type") != "numeric":
+            continue
+        num = c.get("numeric") or {}
+        if not num:
+            continue
+        rows.append([
+            c.get("name") or "(col)",
+            _fmt_num(num.get("mean")),
+            _fmt_num(num.get("median")),
+            _fmt_num(num.get("min")),
+            _fmt_num(num.get("max")),
+            _fmt_num(num.get("std")),
+        ])
+    if not rows:
+        return None
+    return model.DataTable(header=header, rows=rows, title="Estadística (describe)")
+
+
+def build_overview(profile: dict, ctx: dict):
+    """Build the Overview Chapter, or None if there are no columns and no head rows."""
+    profile = profile or {}
+    ctx = ctx or {}
+    cols = profile.get("columns") or []
+    if not cols and not (ctx.get(HEAD_KEY) or profile.get(HEAD_KEY)):
+        return None
+
+    blocks = [
+        model.Heading(text="Primeras filas (df.head)", level=2),
+        _head_block(profile, ctx),
+    ]
+    cols_block = _columns_block(profile)
+    if cols_block is not None:
+        blocks.append(model.Heading(
+            text="Diccionario de columnas", level=2))
+        blocks.append(cols_block)
+    desc_block = _describe_block(profile)
+    if desc_block is not None:
+        blocks.append(model.Heading(
+            text="Resumen estadístico numérico", level=2))
+        blocks.append(desc_block)
+
+    return model.Chapter(id=CHAPTER_ID, title=CHAPTER_TITLE,
+                         version=CHAPTER_VERSION, blocks=blocks)
@@ -0,0 +1,156 @@
+"""Cover chapter (PORTADA) — the reference chapter for AutomaticEDA.
+
+Builds the document cover from a TableProfile plus an optional ``ctx`` of
+presentation metadata. Reads everything defensively (``.get``) and degrades
+honestly: a field that is neither in the profile nor in ``ctx`` is shown as a
+placeholder rather than invented, leaving a hook for the LLM layer to fill it.
+
+Contract for chapter authors (see ``docs/capabilities/automatic_eda.md``):
+    build_<id>(profile: dict, ctx: dict) -> Chapter | None
+    CHAPTER_VERSION = "x.y.z"
+"""
+
+from __future__ import annotations
+
+import os
+from datetime import datetime, timezone
+
+from .. import model
+
+CHAPTER_VERSION = "1.0.0"
+CHAPTER_ID = "portada"
+CHAPTER_TITLE = "Portada"
+
+# Default human description of what the table quality score measures. Callers
+# can override it via ctx["quality_criteria"].
+_DEFAULT_QUALITY_CRITERIA = (
+    "media de los scores por columna (0–100): completitud (sin nulos/vacíos), "
+    "validez (tipo y rango coherentes) y consistencia (sin duplicados/constantes)."
+)
+
+
+def _storage_from_source(source: str) -> str:
+    """Infer the storage technology the dataset currently lives in.
+
+    Heuristic on the profile ``source`` string (a path, DSN or backend name).
+    Returns a human label; falls back to the raw source when unknown.
+    """
+    s = (source or "").strip().lower()
+    if not s:
+        return "—"
+    if s.endswith(".csv") or s.endswith(".tsv"):
+        return "CSV"
+    if s.endswith(".parquet") or s.endswith(".pq"):
+        return "Parquet"
+    if s.endswith(".json") or s.endswith(".ndjson"):
+        return "JSON"
+    if s.endswith(".xlsx") or s.endswith(".xls"):
+        return "Excel"
+    if "sqlite" in s:  # checked before ".db" so sqlite:///app.db is not DuckDB
+        return "SQLite"
+    if s.endswith((".duckdb", ".ddb", ".db")) or s == "duckdb":
+        return "DuckDB"
+    if "postgres" in s:  # also matches postgres:// and postgresql:// DSNs
+        return "PostgreSQL"
+    if "bigquery" in s or (s.count(".") == 2 and " " not in s and "/" not in s):
+        return "BigQuery"
+    # Unknown: show the raw source so nothing is hidden.
+    return source
+
+
+def _fmt_int(v) -> str:
+    if v is None:
+        return "—"
+    try:
+        return f"{int(v):,}".replace(",", ".")
+    except (TypeError, ValueError):
+        return str(v)
+
+
+def _fmt_date_eu(value) -> str:
+    """Format a date/ISO string as European DD/MM/YYYY HH:mm (UI convention).
+
+    Accepts a datetime, an ISO-8601 string (with or without microseconds/tz) or
+    any other string. Non-parseable strings are returned verbatim so nothing is
+    lost; None yields a placeholder.
+    """
+    if value is None:
+        return "—"
+    if isinstance(value, datetime):
+        return value.strftime("%d/%m/%Y %H:%M")
+    s = str(value).strip()
+    if not s:
+        return "—"
+    try:
+        dt = datetime.fromisoformat(s.replace("Z", "+00:00"))
+        return dt.strftime("%d/%m/%Y %H:%M")
+    except (TypeError, ValueError):
+        # Try a couple of common forms before giving up.
+        for fmt in ("%Y-%m-%d %H:%M:%S UTC", "%Y-%m-%d %H:%M UTC",
+                    "%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
+            try:
+                return datetime.strptime(s, fmt).strftime("%d/%m/%Y %H:%M")
+            except ValueError:
+                continue
+        return s
+
+
+def build_portada(profile: dict, ctx: dict):
+    """Build the cover Chapter, or None if there is truly nothing to show."""
+    profile = profile or {}
+    ctx = ctx or {}
+
+    dataset_name = (ctx.get("dataset_name") or profile.get("table")
+                    or "(dataset sin nombre)")
+    source = profile.get("source") or ""
+    # Where the dataset comes from (origin), distinct from where it is stored.
+    source_origin = ctx.get("source_origin") or source or "—"
+    storage = ctx.get("storage") or _storage_from_source(source)
+
+    when = _fmt_date_eu(
+        ctx.get("generated_at") or profile.get("profiled_at")
+        or datetime.now(timezone.utc))
+
+    n_rows = profile.get("n_rows")
+    n_cols = profile.get("n_cols")
+    shape = f"{_fmt_int(n_rows)} filas × {_fmt_int(n_cols)} columnas"
+
+    score = profile.get("quality_score")
+    quality_criteria = ctx.get("quality_criteria") or _DEFAULT_QUALITY_CRITERIA
+    quality_value = "—" if score is None else f"{score} / 100"
+
+    # Granularity: ctx wins; else derive from key candidates; else be honest.
+    granularity = ctx.get("granularity")
+    if not granularity:
+        keys = profile.get("key_candidates") or []
+        if keys:
+            granularity = ("Cada fila parece identificada por "
+                           + ", ".join(str(k) for k in keys[:3]) + ".")
+        else:
+            granularity = ("Cada fila es… (granularidad no determinada — "
+                           "pendiente de la capa de cálculo/LLM).")
+
+    description = ctx.get("description")
+    if not description:
+        description = ("Descripción no provista — pendiente de la capa LLM "
+                       "(`run_llm`) o de `ctx['description']`.")
+
+    blocks = [
+        model.Heading(text=str(dataset_name), level=1),
+        model.Markdown(text="**Automatic-EDA** · informe exploratorio automático"),
+        model.KVTable(rows=[
+            ("Fuente", source_origin),
+            ("Almacenamiento", storage),
+            ("Generado", when),
+            ("Tamaño", shape),
+            ("Calidad", quality_value),
+            ("Criterios de calidad", quality_criteria),
+        ]),
+        model.Heading(text="Descripción", level=2),
+        model.Markdown(text=str(description)),
+        model.Heading(text="Granularidad", level=2),
+        model.Markdown(text=str(granularity)),
+    ]
+
+    return model.Chapter(id=CHAPTER_ID, title=CHAPTER_TITLE,
+                         version=CHAPTER_VERSION, blocks=blocks)
@@ -0,0 +1,89 @@
+"""Chapter registry — the canonical order of an AutomaticEDA document.
+
+``CHAPTER_ORDER`` declares every chapter the engine will *ever* place, in the
+order they appear in the document. Each id maps by convention to a module
+``automatic_eda/chapters/<id>.py`` exposing ``build_<id>(profile, ctx) ->
+Chapter | None`` and a ``CHAPTER_VERSION`` constant.
+
+This pre-declared order is what lets many agents add chapters in parallel
+without contention: an agent only creates its own ``chapters/<id>.py`` module —
+it never edits this file. ``build_document`` imports each chapter lazily; a
+chapter whose module does not exist yet (not implemented) is simply skipped, so
+the document is always renderable with whatever chapters are present today.
+
+``build_document`` never raises: a chapter whose import or builder raises is
+silently dropped, and a chapter that returns ``None`` (does not apply to this
+dataset, e.g. time series on a dataset with no date column) is omitted.
+"""
+
+from __future__ import annotations
+
+import importlib
+
+from . import model
+
+# Canonical document order. Implemented today: portada, overview. The rest are
+# placeholders other agents will fill by creating chapters/<id>.py — they will
+# appear in this exact position automatically once their module exists.
+CHAPTER_ORDER = [
+    "portada",       # cover
+    "overview",      # df.head + columns/types/nulls/examples + describe
+    "num_distr",     # numeric distributions
+    "cat_distr",     # categorical distributions
+    "calidad",       # data quality
+    "correlacion",   # correlations / associations
+    "modelos",       # cheap models (PCA/KMeans/outliers)
+    "analisis_llm",  # LLM interpretation
+    "timeseries",    # time-series analysis
+    "geospatial",    # geospatial
+    "agregacion",    # aggregations / pivots
+]
+
+
+def build_chapter(chapter_id: str, profile: dict, ctx: dict):
+    """Build a single chapter by id, or None if absent/not-applicable/error.
+
+    Looks up ``automatic_eda.chapters.<chapter_id>`` and calls its
+    ``build_<chapter_id>(profile, ctx)``. Returns a normalized Chapter, or None
+    when the module is missing, the builder returns None, or anything raises.
+    """
+    mod_name = f"{__package__}.chapters.{chapter_id}"
+    try:
+        mod = importlib.import_module(mod_name)
+    except Exception:  # noqa: BLE001 — chapter not implemented yet → skip.
+        return None
+    builder = getattr(mod, f"build_{chapter_id}", None)
+    if builder is None:
+        return None
+    try:
+        result = builder(profile or {}, ctx or {})
+    except Exception:  # noqa: BLE001 — a broken chapter never aborts the doc.
+        return None
+    return model.as_chapter(result)
+
+
+def build_document(profile: dict | None, ctx: dict | None = None) -> list:
+    """Build the full ordered list of chapters for a TableProfile.
+
+    Args:
+        profile: the ``eda`` group TableProfile dict (may be None/empty).
+        ctx: optional context dict carrying presentation metadata not present in
+            the profile (dataset_name, source_origin, storage, generated_at,
+            description, granularity, quality_criteria, head_rows, ...).
+
+    Returns:
+        list[Chapter] in canonical order, containing only the chapters that are
+        implemented and applicable. Never raises.
+    """
+    if profile is None:
+        profile = {}
+    if not isinstance(profile, dict):
+        profile = {}
+    if ctx is None:
+        ctx = {}
+    chapters = []
+    for cid in CHAPTER_ORDER:
+        ch = build_chapter(cid, profile, ctx)
+        if ch is not None and ch.blocks:
+            chapters.append(ch)
+    return chapters
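
A chapter module following the convention above might look like the sketch below. It reuses the declared but not yet implemented id `calidad` purely for illustration; the title, the blocks and the rule for returning `None` are assumptions, not the planned chapter. It returns a plain dict instead of importing `model.Chapter` so that the snippet runs on its own, relying on the fact that `as_chapter` in model.py accepts dicts with `id`, `title`, `version` and `blocks` keys.

```python
from __future__ import annotations

# Hypothetical contents for chapters/calidad.py (illustration only).
CHAPTER_VERSION = "0.1.0"
CHAPTER_ID = "calidad"


def build_calidad(profile: dict, ctx: dict) -> dict | None:
    """Return a chapter as a plain dict, or None when it does not apply."""
    profile = profile or {}
    score = profile.get("quality_score")
    if score is None:
        # Nothing honest to show: signal "not applicable" to the registry.
        return None
    return {
        "id": CHAPTER_ID,
        "title": "Calidad",
        "version": CHAPTER_VERSION,
        "blocks": [
            {"kind": "heading", "text": "Calidad", "level": 2},
            {"kind": "kv_table", "rows": [("Score", f"{score} / 100")]},
        ],
    }


chapter = build_calidad({"quality_score": 87}, {})
```

Under this convention, dropping such a file into `chapters/` would be enough for `build_document` to place it at the position that `CHAPTER_ORDER` reserves for its id, without any edit to the registry.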
@@ -0,0 +1,310 @@
+"""AutomaticEDA document model — format-independent blocks and chapters.
+
+This is the intermediate layer between *content* (what an EDA chapter wants to
+say) and *output format* (PDF for mobile reading, PPTX for sharing). A document
+is an ordered list of :class:`Chapter`. A chapter is ``{id, title, version,
+blocks}``. A block is one of a small, closed set of presentation primitives
+(heading, markdown, key/value table, data table, figure, image, caption, note).
+
+Neither renderer knows anything about the EDA profile: they only know how to lay
+out blocks so that **nothing is ever cut** — long text wraps to whole lines,
+long tables split by rows repeating the header, figures and images are scaled to
+fit entirely. Each chapter declares its own ``version`` so every page/slide can
+be stamped ``<Chapter> · v<version>`` and tracked in a manifest for continuous,
+per-chapter improvement.
+
+Reading is defensive throughout (the ``eda`` group "dict-no-throw" style): the
+normalizers accept dataclass blocks *or* plain dicts, coerce anything unknown
+into a readable :class:`Note` instead of raising, and the renderers degrade a
+malformed block to text rather than crashing the whole document.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+from dataclasses import dataclass, field
+from typing import Any, Callable, Optional
+
+# Global engine version. Bump when the document model or a renderer changes in a
+# way that affects output. Individual chapters carry their own CHAPTER_VERSION.
+ENGINE_VERSION = "1.0.0"
+ENGINE_NAME = "AutomaticEDA"
+
+
+# --------------------------------------------------------------------------- #
+# Block primitives. Each carries a stable ``kind`` string so renderers can
+# dispatch by kind (works for dataclass instances and for plain dicts alike).
+# --------------------------------------------------------------------------- #
+@dataclass
+class Heading:
+    """A section heading. ``level`` 1 (largest) .. 3 (smallest)."""
+
+    text: str = ""
+    level: int = 1
+    kind: str = field(default="heading", init=False)
+
+
+@dataclass
+class Markdown:
+    """A block of light markdown text.
+
+    Supported subset (everything else is rendered verbatim, never dropped):
+    ``#``/``##``/``###`` headings, ``-``/``*`` bullet lists, ``| a | b |``
+    tables (consecutive pipe lines become a data table), blank lines as
+    paragraph breaks, and ``**bold**`` inline markers (markers are stripped, the
+    text is kept). Text is wrapped to whole lines so it is never cut mid-line.
+    """
+
+    text: str = ""
+    kind: str = field(default="markdown", init=False)
+
+
+@dataclass
+class KVTable:
+    """A two-column key/value table. ``rows`` is a list of ``(label, value)``."""
+
+    rows: list = field(default_factory=list)
+    title: Optional[str] = None
+    kind: str = field(default="kv_table", init=False)
+
+
+@dataclass
+class DataTable:
+    """A tabular block with a header row.
+
+    If it does not fit in the remaining page/slide space it is split by rows,
+    **repeating the header** on each continuation. Long cell text wraps inside
+    its column (the row grows taller) so no cell content is ever lost.
+    """
+
+    header: list = field(default_factory=list)
+    rows: list = field(default_factory=list)  # list[list[Any]]
+    title: Optional[str] = None
+    note: Optional[str] = None
+    kind: str = field(default="data_table", init=False)
+
+
+@dataclass
+class Figure:
+    """A matplotlib figure, scaled to fit entirely (never cropped).
+
+    Provide either an already-built ``fig`` (a ``matplotlib.figure.Figure``) or
+    a zero-arg ``make`` callable that returns one (lazy: only built when the
+    renderer needs it). ``height_in`` is an optional hint for the target height
+    on the page; renderers clamp it to the available space preserving aspect.
+    """
+
+    fig: Any = None
+    make: Optional[Callable[[], Any]] = None
+    caption: Optional[str] = None
+    height_in: Optional[float] = None
+    kind: str = field(default="figure", init=False)
+
+
+@dataclass
+class Image:
+    """A raster image (PNG/JPG) by path, scaled to fit entirely."""
+
+    path: str = ""
+    caption: Optional[str] = None
+    height_in: Optional[float] = None
+    kind: str = field(default="image", init=False)
+
+
+@dataclass
+class Caption:
+    """Small auxiliary text rendered under a figure/table."""
+
+    text: str = ""
+    kind: str = field(default="caption", init=False)
+
+
+@dataclass
+class Note:
+    """Small auxiliary note (italic). Also the fallback for unknown content."""
+
+    text: str = ""
+    kind: str = field(default="note", init=False)
+
+
+@dataclass
+class Chapter:
+    """An ordered set of blocks with an id, a title and a generation version."""
+
+    id: str = ""
+    title: str = ""
+    version: str = "1.0.0"
+    blocks: list = field(default_factory=list)
+
+
+# --------------------------------------------------------------------------- #
+# Defensive normalizers — accept dataclasses OR plain dicts, never raise.
+# --------------------------------------------------------------------------- #
+_BLOCK_BY_KIND = {
+    "heading": Heading,
+    "markdown": Markdown,
+    "kv_table": KVTable,
+    "data_table": DataTable,
+    "figure": Figure,
+    "image": Image,
+    "caption": Caption,
+    "note": Note,
+}
+
+
+def as_block(obj: Any):
+    """Coerce a value into a block dataclass. Unknown values become a Note."""
+    if isinstance(obj, (Heading, Markdown, KVTable, DataTable, Figure, Image,
+                        Caption, Note)):
+        return obj
+    if isinstance(obj, dict):
+        kind = obj.get("kind")
+        cls = _BLOCK_BY_KIND.get(kind)
+        if cls is None:
+            return Note(text=_safe_str(obj))
+        # Build only with fields the dataclass accepts (ignore extras).
+        try:
+            if cls is Heading:
+                return Heading(text=_safe_str(obj.get("text")),
+                               level=int(obj.get("level", 1) or 1))
+            if cls is Markdown:
+                return Markdown(text=_safe_str(obj.get("text")))
+            if cls is KVTable:
+                return KVTable(rows=list(obj.get("rows") or []),
+                               title=obj.get("title"))
+            if cls is DataTable:
+                return DataTable(header=list(obj.get("header") or []),
+                                 rows=list(obj.get("rows") or []),
+                                 title=obj.get("title"), note=obj.get("note"))
+            if cls is Figure:
+                return Figure(fig=obj.get("fig"), make=obj.get("make"),
+                              caption=obj.get("caption"),
+                              height_in=obj.get("height_in"))
+            if cls is Image:
+                return Image(path=_safe_str(obj.get("path")),
+                             caption=obj.get("caption"),
+                             height_in=obj.get("height_in"))
+            if cls is Caption:
+                return Caption(text=_safe_str(obj.get("text")))
+            if cls is Note:
+                return Note(text=_safe_str(obj.get("text")))
+        except Exception:  # noqa: BLE001 — never raise on a malformed block.
+            return Note(text=_safe_str(obj))
+    return Note(text=_safe_str(obj))
+
+
+def as_blocks(seq: Any) -> list:
+    """Normalize an arbitrary sequence into a list of block dataclasses."""
+    if seq is None:
+        return []
+    if not isinstance(seq, (list, tuple)):
+        return [as_block(seq)]
+    return [as_block(b) for b in seq]
+
+
+def as_chapter(obj: Any) -> Optional[Chapter]:
+    """Coerce a value into a Chapter (or None). Accepts a dict or a Chapter."""
+    if obj is None:
+        return None
+    if isinstance(obj, Chapter):
+        obj.blocks = as_blocks(obj.blocks)
+        return obj
+    if isinstance(obj, dict):
+        return Chapter(
+            id=_safe_str(obj.get("id")),
+            title=_safe_str(obj.get("title")) or _safe_str(obj.get("id")),
+            version=_safe_str(obj.get("version")) or "1.0.0",
+            blocks=as_blocks(obj.get("blocks")),
+        )
+    return None
+
+
+def as_chapters(seq: Any) -> list:
+    """Normalize a sequence of chapters, dropping anything that can't coerce."""
+    if seq is None:
+        return []
+    if isinstance(seq, Chapter):
+        return [as_chapter(seq)]
+    if not isinstance(seq, (list, tuple)):
+        return []
+    out = []
+    for c in seq:
+        ch = as_chapter(c)
+        if ch is not None:
+            out.append(ch)
+    return out
+
+
+def _safe_str(v: Any) -> str:
+    """str() that never raises and maps None to ''."""
+    if v is None:
+        return ""
+    try:
+        return str(v)
+    except Exception:  # noqa: BLE001
+        return ""
+
+
+# --------------------------------------------------------------------------- #
+# Manifest — per-chapter versions and page/slide counts for tracking.
+# --------------------------------------------------------------------------- #
+def merge_manifest(manifest_path: str, renderer: str, chapters_meta: list,
+                   generated_at: str,
+                   engine_version: str = ENGINE_VERSION) -> dict:
+    """Read-modify-write the AutomaticEDA manifest, merging one renderer's run.
+
+    The manifest lives next to the outputs as ``automatic_eda_manifest.json``
+    and records, per chapter, its version plus the page count (PDF) and slide
+    count (PPTX). Calling either renderer creates or updates it. Never raises:
+    on any error returns the in-memory manifest without writing.
+
+    Args:
+        manifest_path: path to the JSON manifest to create or update.
+        renderer: "pdf" or "pptx" — selects which count key is written.
+        chapters_meta: list of ``{"id", "version", "n_pages"|"n_slides"}``.
+        generated_at: ISO-ish timestamp string for this run.
+        engine_version: AutomaticEDA engine version.
+
+    Returns:
+        The merged manifest dict (also written to disk on success).
+    """
+    data: dict = {}
+    try:
+        if manifest_path and os.path.exists(manifest_path):
+            with open(manifest_path, "r", encoding="utf-8") as fh:
+                loaded = json.load(fh)
+            if isinstance(loaded, dict):
+                data = loaded
+    except Exception:  # noqa: BLE001 — a corrupt manifest is overwritten.
+        data = {}
+
+    data["engine"] = ENGINE_NAME
+    data["engine_version"] = engine_version
+    data["generated_at"] = generated_at
+    chapters = data.get("chapters")
+    if not isinstance(chapters, dict):
+        chapters = {}
+    count_key = "n_slides" if renderer == "pptx" else "n_pages"
+    for cm in chapters_meta or []:
+        if not isinstance(cm, dict):
+            continue
+        cid = cm.get("id")
+        if not cid:
+            continue
+        entry = chapters.get(cid)
+        if not isinstance(entry, dict):
+            entry = {}
+        entry["version"] = cm.get("version") or entry.get("version") or "1.0.0"
+        entry[count_key] = cm.get(count_key, cm.get("n_pages", cm.get("n_slides")))
+        chapters[cid] = entry
+    data["chapters"] = chapters
+
+    try:
+        parent = os.path.dirname(os.path.abspath(manifest_path))
+        os.makedirs(parent, exist_ok=True)
+        with open(manifest_path, "w", encoding="utf-8") as fh:
+            json.dump(data, fh, ensure_ascii=False, indent=2, default=str)
+    except Exception:  # noqa: BLE001 — never raise from the manifest writer.
+        pass
+    return data
@@ -0,0 +1,532 @@
+"""AutomaticEDA PDF renderer — A5 portrait, mobile-first, never cuts content.
+
+A flow paginator: it measures each block (using the deterministic character grid
+from :mod:`text_layout`) and places it top-to-bottom on the current page. When a
+unit does not fit in the remaining space it moves whole to the next page —
+text by whole lines (never mid-line, never mid-word), data tables by rows
+**repeating the header**, figures/images scaled to fit entirely (never cropped).
+
+Each chapter starts on a fresh page and every page is stamped in the footer with
+``<Chapter> · v<version>`` plus the engine version and a running page number, so
+output is versioned per chapter for continuous improvement.
+
+dict-no-throw: a failure inside one block is caught and noted; the PDF is always
+produced and at least one page is guaranteed. Engine: matplotlib ``PdfPages``.
+"""
+
+from __future__ import annotations
+
+import io
+import os
+
+import matplotlib
+
+matplotlib.use("Agg")
+
+import matplotlib.image as mpimg  # noqa: E402
+import matplotlib.pyplot as plt  # noqa: E402
+from matplotlib.backends.backend_pdf import PdfPages  # noqa: E402
+from matplotlib.patches import Rectangle  # noqa: E402
+
+from . import model  # noqa: E402
+from . import text_layout as tl  # noqa: E402
+
+# A5 portrait, inches.
+_W, _H = 5.83, 8.27
+_ML, _MR, _MT, _MB = 0.5, 0.42, 0.55, 0.5
+_FOOTER_H = 0.34
+_USABLE_W = _W - _ML - _MR
+_CONTENT_TOP = _MT
+_CONTENT_BOTTOM = _H - _MB - _FOOTER_H
+
+# Palette / type (inherits the Tufte-ish mobile look of render_eda_pdf).
+_INK = "#1b1b1b"
+_ACCENT = "#2a6f97"
+_MUTED = "#8a8a8a"
+_RULE = "#cccccc"
+_HEAD_BG = "#eef3f6"
+
+_RC = {
+    "font.size": 10,
+    "font.family": "sans-serif",
+    "figure.facecolor": "white",
+    "savefig.facecolor": "white",
+    "pdf.fonttype": 42,  # embed TrueType — text stays selectable on mobile.
+}
+
+# Font sizes (pt) and spacing constants (in).
+_FS_H1, _FS_H2, _FS_H3 = 17, 13, 11
+_FS_BODY, _FS_CELL, _FS_NOTE = 10.5, 9.0, 9.0
+_GAP = 0.12          # vertical gap after a block, inches.
+_CELL_PAD = 0.06     # horizontal padding inside a table cell, inches.
+_ROW_VPAD = 0.05     # vertical padding inside a table row, inches.
+
+
+class _PdfState:
+    """Mutable layout cursor for the running PDF document."""
+
+    def __init__(self, pdf, title: str):
+        self.pdf = pdf
+        self.title = title
+        self.fig = None
+        self.y = _CONTENT_TOP        # inches from the top of the page.
+        self.page = 0                # global page counter.
+        self.chapter = None          # current Chapter (for the footer).
+        self.chapter_pages = 0       # pages produced for the current chapter.
+
+
+# --------------------------------------------------------------------------- #
+# Coordinate helpers (inches-from-top → matplotlib figure fraction).
+# --------------------------------------------------------------------------- #
+def _yf(y_in: float) -> float:
+    return 1.0 - (y_in / _H)
+
+
+def _xf(x_in: float) -> float:
+    return x_in / _W
+
+
+def _new_page(st: _PdfState) -> None:
+    """Close the current page (if any) and open a fresh one with a footer."""
+    _flush_page(st)
+    st.fig = plt.figure(figsize=(_W, _H))
+    st.y = _CONTENT_TOP
+    st.page += 1
+    st.chapter_pages += 1
+    _draw_footer(st)
+
+
+def _flush_page(st: _PdfState) -> None:
+    if st.fig is not None:
+        st.pdf.savefig(st.fig)
+        plt.close(st.fig)
+        st.fig = None
+
+
+def _draw_footer(st: _PdfState) -> None:
+    ch = st.chapter
+    left = ""
+    if ch is not None:
+        left = f"{ch.title} · v{ch.version}"
+    right = f"{model.ENGINE_NAME} v{model.ENGINE_VERSION} · p.{st.page}"
+    yb = (_MB * 0.45) / _H
+    st.fig.text(_xf(_ML), yb, left, fontsize=7.5, color=_MUTED,
+                ha="left", va="center")
+    st.fig.text(_xf(_W - _MR), yb, right, fontsize=7.5, color=_MUTED,
+                ha="right", va="center")
+    # A thin rule above the footer.
+    st.fig.add_artist(Rectangle(
+        (_xf(_ML), (_MB + _FOOTER_H * 0.5) / _H),
+        _xf(_W - _MR) - _xf(_ML), 0.0008,
+        transform=st.fig.transFigure, color=_RULE, lw=0.6))
+
+
+def _remaining(st: _PdfState) -> float:
+    return _CONTENT_BOTTOM - st.y
+
+
+def _ensure_space(st: _PdfState, height: float) -> None:
+    """Open a new page if ``height`` does not fit in the remaining space."""
+    if _remaining(st) < height:
+        _new_page(st)
+
+
+# --------------------------------------------------------------------------- #
+# Block placers. Each advances st.y and paginates as needed.
+# --------------------------------------------------------------------------- #
+def _place_heading(st: _PdfState, block) -> None:
+    level = max(1, min(3, int(getattr(block, "level", 1) or 1)))
+    fs = {1: _FS_H1, 2: _FS_H2, 3: _FS_H3}[level]
+    text = tl.strip_inline_md(getattr(block, "text", ""))
+    max_chars = tl.chars_per_line(_USABLE_W, fs)
+    lines = tl.wrap(text, max_chars)
+    lh = tl.line_height_in(fs, leading=1.2)
+    block_h = lh * len(lines) + 0.06
+    # Keep at least the heading + a couple of body lines together when possible.
+    _ensure_space(st, min(block_h + tl.line_height_in(_FS_BODY) * 2,
+                          _CONTENT_BOTTOM - _CONTENT_TOP))
+    for ln in lines:
+        _ensure_space(st, lh)
+        st.fig.text(_xf(_ML), _yf(st.y), ln, fontsize=fs, fontweight="bold",
+                    color=_INK, ha="left", va="top")
+        st.y += lh
+    if level == 1:
+        # Accent underline under a top-level heading.
+        st.fig.add_artist(Rectangle(
+            (_xf(_ML), _yf(st.y + 0.02)), _xf(_ML + 1.4) - _xf(_ML), 0.0016,
+            transform=st.fig.transFigure, color=_ACCENT, lw=0))
+        st.y += 0.10
+    st.y += _GAP
+
+
+def _place_text_lines(st: _PdfState, lines: list, fs: float, color: str,
+                      style: str = "normal", indent: float = 0.0) -> None:
+    lh = tl.line_height_in(fs)
+    for ln in lines:
+        _ensure_space(st, lh)
+        st.fig.text(_xf(_ML + indent), _yf(st.y), ln, fontsize=fs, color=color,
+                    ha="left", va="top", style=style)
+        st.y += lh
+
+
+def _place_markdown(st: _PdfState, block) -> None:
+    raw = getattr(block, "text", "") or ""
+    md_lines = str(raw).split("\n")
+    i = 0
+    n = len(md_lines)
+    while i < n:
+        line = md_lines[i]
+        stripped = line.strip()
+        # Consecutive pipe-table lines → a DataTable.
+        if stripped.startswith("|") and stripped.endswith("|"):
+            j = i
+            tbl_lines = []
+            while j < n and md_lines[j].strip().startswith("|") \
+                    and md_lines[j].strip().endswith("|"):
+                tbl_lines.append(md_lines[j])
+                j += 1
+            parsed = tl.parse_md_table(tbl_lines)
+            if parsed:
+                header, rows = parsed
+                _place_data_table(st, model.DataTable(header=header, rows=rows))
+                i = j
+                continue
+        if stripped == "":
+            st.y += tl.line_height_in(_FS_BODY) * 0.5
+            i += 1
+            continue
+        if stripped.startswith("### "):
+            _place_heading(st, model.Heading(stripped[4:], level=3))
+            i += 1
+            continue
+        if stripped.startswith("## "):
+            _place_heading(st, model.Heading(stripped[3:], level=2))
+            i += 1
+            continue
+        if stripped.startswith("# "):
+            _place_heading(st, model.Heading(stripped[2:], level=1))
+            i += 1
+            continue
+        if stripped.startswith("- ") or stripped.startswith("* "):
+            content = tl.strip_inline_md(stripped[2:])
+            bullet_chars = tl.chars_per_line(_USABLE_W - 0.22, _FS_BODY)
+            wrapped = tl.wrap(content, bullet_chars)
+            first = True
+            for w in wrapped:
+                prefix = "•  " if first else "   "
+                _place_text_lines(st, [prefix + w], _FS_BODY, _INK,
+                                  indent=0.0)
+                first = False
+            i += 1
+            continue
+        # Plain paragraph (gather following plain lines into one paragraph).
+        para = [tl.strip_inline_md(stripped)]
+        j = i + 1
+        while j < n:
+            nxt = md_lines[j].strip()
+            if nxt == "" or nxt.startswith(("|", "#", "- ", "* ")):
+                break
+            para.append(tl.strip_inline_md(nxt))
+            j += 1
+        text = " ".join(para)
+        max_chars = tl.chars_per_line(_USABLE_W, _FS_BODY)
+        _place_text_lines(st, tl.wrap(text, max_chars), _FS_BODY, _INK)
+        i = j
+    st.y += _GAP
+
+
+def _place_kv_table(st: _PdfState, block) -> None:
+    title = getattr(block, "title", None)
+    if title:
+        _place_heading(st, model.Heading(title, level=2))
+    rows = getattr(block, "rows", []) or []
+    key_w = 1.9  # inches reserved for the label column.
+    val_chars = tl.chars_per_line(_USABLE_W - key_w - 0.1, _FS_BODY)
+    lh = tl.line_height_in(_FS_BODY)
+    for row in rows:
+        try:
+            label, value = row[0], row[1]
+        except Exception:  # noqa: BLE001
+            label, value = str(row), ""
+        v_lines = tl.wrap(model._safe_str(value), val_chars)
+        row_h = lh * len(v_lines) + _ROW_VPAD
+        _ensure_space(st, row_h)
+        y0 = st.y
+        st.fig.text(_xf(_ML), _yf(y0), tl.strip_inline_md(model._safe_str(label)),
+                    fontsize=_FS_BODY, color=_MUTED, ha="left", va="top")
+        for k, vl in enumerate(v_lines):
+            st.fig.text(_xf(_ML + key_w), _yf(y0 + k * lh), vl,
+                        fontsize=_FS_BODY, color=_INK, ha="left", va="top")
+        st.y = y0 + row_h
+    st.y += _GAP
+
+
+def _col_widths(header: list, rows: list, fs: float) -> list:
+    """Distribute usable width across columns proportional to content length."""
+    ncol = len(header) if header else (len(rows[0]) if rows else 1)
+    ncol = max(1, ncol)
+    natural = [3] * ncol
+    for c in range(ncol):
+        if header and c < len(header):
+            natural[c] = max(natural[c], len(model._safe_str(header[c])))
+        for r in rows:
+            if c < len(r):
+                natural[c] = max(natural[c], len(model._safe_str(r[c])))
+    # Clamp so one very long column does not starve the others.
+    clamped = [min(max(w, 4), 40) for w in natural]
+    total = float(sum(clamped)) or 1.0
+    widths = [_USABLE_W * w / total for w in clamped]
+    # Enforce a minimum readable column width.
+    min_w = 0.45
+    widths = [max(w, min_w) for w in widths]
+    # Renormalize if the minimums pushed us over the usable width.
+    s = sum(widths)
+    if s > _USABLE_W:
+        widths = [w * _USABLE_W / s for w in widths]
+    return widths
+
+
+def _wrap_row(cells: list, widths: list, fs: float) -> list:
+    """Wrap each cell to its column width → list of line-lists per cell."""
+    out = []
+    for c, w in enumerate(widths):
+        text = model._safe_str(cells[c]) if c < len(cells) else ""
+        max_chars = tl.chars_per_line(w - _CELL_PAD * 2, fs)
+        out.append(tl.wrap(text, max_chars))
+    return out
+
+
+def _draw_table_row(st: _PdfState, cells_lines: list, widths: list, fs: float,
+                    y0: float, header: bool) -> float:
+    lh = tl.line_height_in(fs)
+    nlines = max((len(c) for c in cells_lines), default=1)
+    row_h = lh * nlines + _ROW_VPAD * 2
+    if header:
+        st.fig.add_artist(Rectangle(
+            (_xf(_ML), _yf(y0 + row_h)), _xf(_ML + _USABLE_W) - _xf(_ML),
+            _yf(y0) - _yf(y0 + row_h), transform=st.fig.transFigure,
+            color=_HEAD_BG, lw=0, zorder=0))
+    x = _ML
+    for c, lines in enumerate(cells_lines):
+        for k, ln in enumerate(lines):
+            st.fig.text(_xf(x + _CELL_PAD), _yf(y0 + _ROW_VPAD + k * lh), ln,
+                        fontsize=fs, color=_INK,
+                        fontweight="bold" if header else "normal",
+                        ha="left", va="top", zorder=2)
+        x += widths[c]
+    # Bottom rule of the row.
+    st.fig.add_artist(Rectangle(
+        (_xf(_ML), _yf(y0 + row_h)), _xf(_ML + _USABLE_W) - _xf(_ML), 0.0006,
+        transform=st.fig.transFigure, color=_RULE, lw=0, zorder=1))
+    return row_h
+
+
+def _place_data_table(st: _PdfState, block) -> None:
+    title = getattr(block, "title", None)
+    if title:
+        _place_heading(st, model.Heading(title, level=2))
+    header = list(getattr(block, "header", []) or [])
+    rows = list(getattr(block, "rows", []) or [])
+    fs = _FS_CELL
+    widths = _col_widths(header, rows, fs)
+    header_lines = _wrap_row(header, widths, fs) if header else None
+    lh = tl.line_height_in(fs)
+
+    def header_h() -> float:
+        if not header_lines:
+            return 0.0
+        return lh * max((len(c) for c in header_lines), default=1) + _ROW_VPAD * 2
+
+    def draw_header() -> None:
+        if header_lines:
+            st.y += _draw_table_row(st, header_lines, widths, fs, st.y,
+                                    header=True)
+
+    # Ensure header + first row fit, else start on a new page.
+    first_row_h = 0.0
+    if rows:
+        first_lines = _wrap_row(rows[0], widths, fs)
+        first_row_h = lh * max((len(c) for c in first_lines), default=1) \
+            + _ROW_VPAD * 2
+    _ensure_space(st, header_h() + max(first_row_h, lh))
+    draw_header()
+    for r in rows:
+        cells_lines = _wrap_row(r, widths, fs)
+        row_h = lh * max((len(c) for c in cells_lines), default=1) \
+            + _ROW_VPAD * 2
+        if _remaining(st) < row_h:
+            _new_page(st)
+            draw_header()  # repeat header on the continuation page.
+        st.y += _draw_table_row(st, cells_lines, widths, fs, st.y, header=False)
+    note = getattr(block, "note", None)
+    if note:
+        _place_text_lines(st, tl.wrap(model._safe_str(note),
+                          tl.chars_per_line(_USABLE_W, _FS_NOTE)),
+                          _FS_NOTE, _MUTED, style="italic")
+    st.y += _GAP
+
+
+def _resolve_figure(block):
+    fig = getattr(block, "fig", None)
+    if fig is not None:
+        return fig, False
+    make = getattr(block, "make", None)
+    if callable(make):
+        try:
+            return make(), True
+        except Exception:  # noqa: BLE001
+            return None, False
+    return None, False
+
+
+def _png_from_figure(fig) -> bytes:
+    buf = io.BytesIO()
+    fig.savefig(buf, format="png", dpi=150, bbox_inches="tight")
+    buf.seek(0)
+    return buf.read()
+
+
+def _place_image_array(st: _PdfState, arr, caption) -> None:
+    h_px, w_px = arr.shape[0], arr.shape[1]
+    aspect = (h_px / w_px) if w_px else 1.0
+    # Reserve one caption line (plus its gap) so image and caption share a page;
+    # the 1e-6 slack absorbs float rounding in the later fit checks.
+    cap_h = tl.line_height_in(_FS_NOTE) + 0.04 if caption else 0.0
+    max_h = _CONTENT_BOTTOM - _CONTENT_TOP - cap_h - 1e-6
+    target_w = _USABLE_W
+    target_h = target_w * aspect
+    if target_h > max_h:
+        target_h = max_h
+        target_w = target_h / aspect if aspect else _USABLE_W
+    # Move the whole image to the next page if it does not fit in the remaining
+    # space. On an untouched page the image already fits (it is clamped above),
+    # so breaking there would only leave a blank page behind.
+    if st.y > _CONTENT_TOP and _remaining(st) < target_h + cap_h:
+        _new_page(st)
+    left_frac = _xf(_ML + (_USABLE_W - target_w) / 2.0)
+    bottom_frac = _yf(st.y + target_h)
+    ax = st.fig.add_axes([left_frac, bottom_frac, target_w / _W, target_h / _H])
+    ax.imshow(arr)
+    ax.axis("off")
+    st.y += target_h + 0.04
+    if caption:
+        _place_text_lines(st, tl.wrap(model._safe_str(caption),
+                          tl.chars_per_line(_USABLE_W, _FS_NOTE)),
+                          _FS_NOTE, _MUTED, style="italic")
+    st.y += _GAP
+
+
+def _place_figure(st: _PdfState, block) -> None:
+    fig, owned = _resolve_figure(block)
+    if fig is None:
+        _place_text_lines(st, ["(figura no disponible)"], _FS_NOTE, _MUTED,
+                          style="italic")
+        st.y += _GAP
+        return
+    try:
+        png = _png_from_figure(fig)
+    finally:
+        if owned:
+            try:
+                plt.close(fig)
+            except Exception:  # noqa: BLE001
+                pass
+    arr = mpimg.imread(io.BytesIO(png))
+    _place_image_array(st, arr, getattr(block, "caption", None))
+
+
+def _place_image(st: _PdfState, block) -> None:
+    path = getattr(block, "path", "")
+    if not path or not os.path.exists(path):
+        _place_text_lines(st, [f"(imagen no encontrada: {path})"], _FS_NOTE,
+                          _MUTED, style="italic")
+        st.y += _GAP
+        return
+    arr = mpimg.imread(path)
+    _place_image_array(st, arr, getattr(block, "caption", None))
+
+
+def _place_caption(st: _PdfState, block) -> None:
+    _place_text_lines(st, tl.wrap(getattr(block, "text", ""),
+                      tl.chars_per_line(_USABLE_W, _FS_NOTE)),
+                      _FS_NOTE, _MUTED, style="italic")
+    st.y += _GAP
+
+
+def _place_note(st: _PdfState, block) -> None:
+    _place_text_lines(st, tl.wrap(getattr(block, "text", ""),
+                      tl.chars_per_line(_USABLE_W, _FS_NOTE)),
+                      _FS_NOTE, _MUTED, style="italic")
+    st.y += _GAP
+
+
+_PLACERS = {
+    "heading": _place_heading,
+    "markdown": _place_markdown,
+    "kv_table": _place_kv_table,
+    "data_table": _place_data_table,
+    "figure": _place_figure,
+    "image": _place_image,
+    "caption": _place_caption,
+    "note": _place_note,
+}
+
+
+def render_pdf(chapters: list, out_path: str, meta: dict | None = None) -> dict:
+    """Render a list of Chapters into an A5-portrait, mobile-readable PDF.
+
+    Never raises. Returns ``{path, n_pages, chapters, note}`` where ``chapters``
+    is a list of ``{id, version, n_pages}`` for the manifest. On a fatal write
+    error ``path`` is None and ``note`` explains why.
+    """
+    meta = meta or {}
+    chapters = model.as_chapters(chapters)
+    notes = []
+
+    try:
+        parent = os.path.dirname(os.path.abspath(out_path))
+        os.makedirs(parent, exist_ok=True)
+    except Exception as e:  # noqa: BLE001 (a None out_path raises TypeError)
+        return {"path": None, "n_pages": 0, "chapters": [],
+                "note": f"no se pudo crear el directorio destino: {e}"}
+
+    title = meta.get("title") or model.ENGINE_NAME
+    chapters_meta = []
+    try:
+        with plt.rc_context(_RC):
+            with PdfPages(out_path) as pdf:
+                st = _PdfState(pdf, title)
+                for ch in chapters:
+                    st.chapter = ch
+                    st.chapter_pages = 0
+                    _new_page(st)  # each chapter starts on a fresh page.
+                    for block in ch.blocks:
+                        placer = _PLACERS.get(getattr(block, "kind", ""),
+                                              _place_note)
+                        try:
+                            placer(st, block)
+                        except Exception as e:  # noqa: BLE001
+                            notes.append(
+                                f"bloque '{getattr(block, 'kind', '?')}' del "
+                                f"capítulo '{ch.id}' omitido: {e}")
+                    chapters_meta.append({"id": ch.id, "version": ch.version,
+                                          "n_pages": st.chapter_pages})
+                _flush_page(st)
+                if st.page == 0:
+                    # No chapters at all → guarantee one valid page.
+                    st.chapter = model.Chapter(id="vacio", title=title,
+                                               version=model.ENGINE_VERSION)
+                    _new_page(st)
+                    _place_note(st, model.Note(
+                        "(documento vacío — sin capítulos aplicables)"))
+                    _flush_page(st)
+                n_pages = st.page
+    except Exception as e:  # noqa: BLE001
+        return {"path": None, "n_pages": 0, "chapters": [],
+                "note": f"fallo al escribir el PDF: {e}"}
+
+    note = f"{n_pages} páginas"
+    if notes:
+        note += " · " + "; ".join(notes)
+    return {"path": out_path, "n_pages": n_pages, "chapters": chapters_meta,
+            "note": note}
@@ -0,0 +1,518 @@
+"""AutomaticEDA PPTX renderer — 16:9 slides, never cuts content.
+
+Same flow principle as the PDF renderer but onto PowerPoint slides: measure each
+block and place it top-to-bottom; when it does not fit in the remaining slide
+space, continue on a new slide titled ``<Chapter> (cont.)``. Data tables split by
+rows **repeating the header**; figures/images are scaled to fit entirely. Every
+slide carries a footer ``<Chapter> · v<version>`` plus the engine version.
+
+dict-no-throw: a failure inside one block is caught and noted; the deck is always
+produced with at least one slide. Engine: ``python-pptx`` (added dependency).
+"""
+
+from __future__ import annotations
+
+import io
+import os
+
+from . import model
+from . import text_layout as tl
+
+try:
+    from pptx import Presentation
+    from pptx.util import Inches, Pt, Emu
+    from pptx.dml.color import RGBColor
+    from pptx.enum.text import PP_ALIGN
+    _PPTX_OK = True
+    _PPTX_ERR = ""
+except Exception as _e:  # noqa: BLE001 — surfaced as a dict-no-throw note.
+    _PPTX_OK = False
+    _PPTX_ERR = str(_e)
+
+# 16:9 widescreen, inches.
+_W, _H = 13.333, 7.5
+_ML, _MR = 0.7, 0.7
+_TITLE_TOP, _TITLE_H = 0.28, 0.7
+_CONTENT_TOP = 1.12
+_FOOTER_H = 0.4
+_CONTENT_BOTTOM = _H - _FOOTER_H - 0.15
+_USABLE_W = _W - _ML - _MR
+
+_INK = (0x1B, 0x1B, 0x1B)
+_ACCENT = (0x2A, 0x6F, 0x97)
+_MUTED = (0x8A, 0x8A, 0x8A)
+_HEAD_BG = (0xEE, 0xF3, 0xF6)
+_WHITE = (0xFF, 0xFF, 0xFF)
+
+_FS_TITLE = 26
+_FS_H1, _FS_H2, _FS_H3 = 20, 16, 13
+_FS_BODY, _FS_CELL, _FS_NOTE = 14, 11, 11
+_GAP = 0.12
+
+
+class _PptxState:
+    def __init__(self, prs, title: str):
+        self.prs = prs
+        self.title = title
+        self.slide = None
+        self.y = _CONTENT_TOP
+        self.chapter = None
+        self.slide_no = 0
+        self.chapter_slides = 0
+
+
+def _rgb(c):
+    return RGBColor(*c)
+
+
+def _new_slide(st: _PptxState, cont: bool = False) -> None:
+    blank = st.prs.slide_layouts[6]
+    st.slide = st.prs.slides.add_slide(blank)
+    st.y = _CONTENT_TOP
+    st.slide_no += 1
+    st.chapter_slides += 1
+    _draw_title(st, cont)
+    _draw_footer(st)
+
+
+def _draw_title(st: _PptxState, cont: bool) -> None:
+    ch = st.chapter
+    title = ch.title if ch is not None else st.title
+    if cont:
+        title = f"{title} (cont.)"
+    box = st.slide.shapes.add_textbox(
+        Inches(_ML), Inches(_TITLE_TOP), Inches(_USABLE_W), Inches(_TITLE_H))
+    tf = box.text_frame
+    tf.word_wrap = True
+    p = tf.paragraphs[0]
+    run = p.add_run()
+    run.text = title
+    run.font.size = Pt(_FS_TITLE)
+    run.font.bold = True
+    run.font.color.rgb = _rgb(_INK)
+
+
+def _draw_footer(st: _PptxState) -> None:
+    ch = st.chapter
+    left = f"{ch.title} · v{ch.version}" if ch is not None else ""
+    right = f"{model.ENGINE_NAME} v{model.ENGINE_VERSION} · {st.slide_no}"
+    box = st.slide.shapes.add_textbox(
+        Inches(_ML), Inches(_H - _FOOTER_H), Inches(_USABLE_W),
+        Inches(_FOOTER_H * 0.7))
+    tf = box.text_frame
+    tf.word_wrap = False
+    p = tf.paragraphs[0]
+    r = p.add_run()
+    r.text = left
+    r.font.size = Pt(9)
+    r.font.color.rgb = _rgb(_MUTED)
+    # Right-aligned engine stamp on a second textbox.
+    box2 = st.slide.shapes.add_textbox(
+        Inches(_ML), Inches(_H - _FOOTER_H), Inches(_USABLE_W),
+        Inches(_FOOTER_H * 0.7))
+    tf2 = box2.text_frame
+    p2 = tf2.paragraphs[0]
+    p2.alignment = PP_ALIGN.RIGHT
+    r2 = p2.add_run()
+    r2.text = right
+    r2.font.size = Pt(9)
+    r2.font.color.rgb = _rgb(_MUTED)
+
+
+def _remaining(st: _PptxState) -> float:
+    return _CONTENT_BOTTOM - st.y
+
+
+def _ensure(st: _PptxState, height: float) -> None:
+    if _remaining(st) < height:
+        _new_slide(st, cont=True)
+
+
+def _add_text(st: _PptxState, lines: list, fs: float, color, bold=False,
+              italic=False, indent=0.0, bullet=False) -> None:
+    lh = tl.line_height_in(fs)
+    height = lh * len(lines) + 0.05
+    _ensure(st, height)
+    box = st.slide.shapes.add_textbox(
+        Inches(_ML + indent), Inches(st.y), Inches(_USABLE_W - indent),
+        Inches(height))
+    tf = box.text_frame
+    tf.word_wrap = True
+    first = True
+    for ln in lines:
+        p = tf.paragraphs[0] if first else tf.add_paragraph()
+        first = False
+        run = p.add_run()
+        run.text = ("•  " + ln) if bullet else ln
+        run.font.size = Pt(fs)
+        run.font.bold = bold
+        run.font.italic = italic
+        run.font.color.rgb = _rgb(color)
+    st.y += height
+
+
+def _place_heading(st: _PptxState, block) -> None:
+    level = max(1, min(3, int(getattr(block, "level", 1) or 1)))
+    fs = {1: _FS_H1, 2: _FS_H2, 3: _FS_H3}[level]
+    text = tl.strip_inline_md(getattr(block, "text", ""))
+    lines = tl.wrap(text, tl.chars_per_line(_USABLE_W, fs))
+    _add_text(st, lines, fs, _INK, bold=True)
+    st.y += 0.04
+
+
+def _place_markdown(st: _PptxState, block) -> None:
+    raw = str(getattr(block, "text", "") or "")
+    md_lines = raw.split("\n")
+    i, n = 0, len(md_lines)
+    while i < n:
+        stripped = md_lines[i].strip()
+        if stripped.startswith("|") and stripped.endswith("|"):
+            j = i
+            tbl = []
+            while j < n and md_lines[j].strip().startswith("|") \
+                    and md_lines[j].strip().endswith("|"):
+                tbl.append(md_lines[j])
+                j += 1
+            parsed = tl.parse_md_table(tbl)
+            if parsed:
+                header, rows = parsed
+                _place_data_table(st, model.DataTable(header=header, rows=rows))
+                i = j
+                continue
+        if stripped == "":
+            st.y += tl.line_height_in(_FS_BODY) * 0.4
+            i += 1
+            continue
+        if stripped.startswith("### "):
+            _place_heading(st, model.Heading(stripped[4:], level=3))
+            i += 1
+            continue
+        if stripped.startswith("## "):
+            _place_heading(st, model.Heading(stripped[3:], level=2))
+            i += 1
+            continue
+        if stripped.startswith("# "):
+            _place_heading(st, model.Heading(stripped[2:], level=1))
+            i += 1
+            continue
+        if stripped.startswith("- ") or stripped.startswith("* "):
+            content = tl.strip_inline_md(stripped[2:])
+            lines = tl.wrap(content, tl.chars_per_line(_USABLE_W - 0.3, _FS_BODY))
+            _add_text(st, lines, _FS_BODY, _INK, bullet=True)
+            i += 1
+            continue
+        para = [tl.strip_inline_md(stripped)]
+        j = i + 1
+        while j < n:
+            nxt = md_lines[j].strip()
+            if nxt == "" or nxt.startswith(("|", "#", "- ", "* ")):
+                break
+            para.append(tl.strip_inline_md(nxt))
+            j += 1
+        text = " ".join(para)
+        _add_text(st, tl.wrap(text, tl.chars_per_line(_USABLE_W, _FS_BODY)),
+                  _FS_BODY, _INK)
+        i = j
+    st.y += _GAP
+
+
+def _place_kv_table(st: _PptxState, block) -> None:
+    title = getattr(block, "title", None)
+    if title:
+        _place_heading(st, model.Heading(title, level=2))
+    rows = getattr(block, "rows", []) or []
+    data_rows = []
+    for row in rows:
+        try:
+            label, value = row[0], row[1]
+        except Exception:  # noqa: BLE001
+            label, value = str(row), ""
+        data_rows.append([model._safe_str(label), model._safe_str(value)])
+    _place_data_table(st, model.DataTable(header=["Campo", "Valor"],
+                                          rows=data_rows), shaded_header=True,
+                      key_value=True)
+
+
+def _col_widths(header, rows):
+    ncol = len(header) if header else (len(rows[0]) if rows else 1)
+    ncol = max(1, ncol)
+    natural = [3] * ncol
+    for c in range(ncol):
+        if header and c < len(header):
+            natural[c] = max(natural[c], len(model._safe_str(header[c])))
+        for r in rows:
+            if c < len(r):
+                natural[c] = max(natural[c], len(model._safe_str(r[c])))
+    clamped = [min(max(w, 4), 44) for w in natural]
+    total = float(sum(clamped)) or 1.0
+    return [_USABLE_W * w / total for w in clamped]
+
+
+def _row_height_in(cells, widths, fs) -> float:
+    lh = tl.line_height_in(fs)
+    maxlines = 1
+    for c, w in enumerate(widths):
+        text = model._safe_str(cells[c]) if c < len(cells) else ""
+        lines = tl.wrap(text, tl.chars_per_line(w - 0.12, fs))
+        maxlines = max(maxlines, len(lines))
+    return lh * maxlines + 0.10
+
+
+def _emit_table(st: _PptxState, header, chunk, widths, fs) -> None:
+    nrows = len(chunk) + (1 if header else 0)
+    ncol = len(widths)
+    # Pre-measure total height to size the shape (pptx still auto-grows rows).
+    heights = []
+    if header:
+        heights.append(_row_height_in(header, widths, fs))
+    for r in chunk:
+        heights.append(_row_height_in(r, widths, fs))
+    total_h = sum(heights)
+    gtable = st.slide.shapes.add_table(
+        nrows, ncol, Inches(_ML), Inches(st.y), Inches(_USABLE_W),
+        Inches(total_h)).table
+    gtable.first_row = bool(header)
+    gtable.horz_banding = False
+    for c in range(ncol):
+        gtable.columns[c].width = Emu(int(Inches(widths[c])))
+    ridx = 0
+    if header:
+        for c in range(ncol):
+            cell = gtable.cell(0, c)
+            cell.text = model._safe_str(header[c]) if c < len(header) else ""
+            _style_cell(cell, fs, _INK, bold=True, fill=_HEAD_BG)
+        ridx = 1
+    for r in chunk:
+        for c in range(ncol):
+            cell = gtable.cell(ridx, c)
+            cell.text = model._safe_str(r[c]) if c < len(r) else ""
+            _style_cell(cell, fs, _INK, bold=False, fill=_WHITE)
+        ridx += 1
+    st.y += total_h + _GAP
+
+
+def _style_cell(cell, fs, color, bold, fill) -> None:
+    cell.fill.solid()
+    cell.fill.fore_color.rgb = _rgb(fill)
+    cell.margin_left = Inches(0.05)
+    cell.margin_right = Inches(0.05)
+    cell.margin_top = Inches(0.02)
+    cell.margin_bottom = Inches(0.02)
+    for p in cell.text_frame.paragraphs:
+        for run in p.runs:
+            run.font.size = Pt(fs)
+            run.font.bold = bold
+            run.font.color.rgb = _rgb(color)
+
+
+def _place_data_table(st: _PptxState, block, shaded_header=True,
+                      key_value=False) -> None:
+    title = getattr(block, "title", None)
+    if title:
+        _place_heading(st, model.Heading(title, level=2))
+    header = list(getattr(block, "header", []) or [])
+    rows = list(getattr(block, "rows", []) or [])
+    fs = _FS_CELL
+    widths = _col_widths(header, rows)
+    header_h = _row_height_in(header, widths, fs) if header else 0.0
+
+    idx = 0
+    n = len(rows)
+    if n == 0 and header:
+        # Header-only table is still rendered; an empty one emits nothing.
+        _ensure(st, header_h + 0.2)
+        _emit_table(st, header, [], widths, fs)
+        return
+    while idx < n:
+        # Greedily fill the current slide with as many rows as fit.
+        if _remaining(st) < header_h + _row_height_in(rows[idx], widths, fs):
+            _new_slide(st, cont=True)
+        avail = _remaining(st) - header_h
+        chunk = []
+        used = 0.0
+        while idx < n:
+            rh = _row_height_in(rows[idx], widths, fs)
+            if used + rh > avail and chunk:
+                break
+            chunk.append(rows[idx])
+            used += rh
+            idx += 1
+        _emit_table(st, header, chunk, widths, fs)
+    note = getattr(block, "note", None)
+    if note:
+        _add_text(st, tl.wrap(model._safe_str(note),
+                  tl.chars_per_line(_USABLE_W, _FS_NOTE)), _FS_NOTE, _MUTED,
+                  italic=True)
+
+
+def _img_size_px(data: bytes):
+    try:
+        from PIL import Image
+        with Image.open(io.BytesIO(data)) as im:
+            return im.size  # (w, h)
+    except Exception:  # noqa: BLE001
+        return (1200, 800)
+
+
+def _resolve_png(block):
+    fig = getattr(block, "fig", None)
+    make = getattr(block, "make", None)
+    f = fig
+    owned = False
+    if f is None and callable(make):
+        try:
+            f = make()
+            owned = True
+        except Exception:  # noqa: BLE001
+            f = None
+    if f is None:
+        return None
+    try:
+        import matplotlib.pyplot as plt
+        buf = io.BytesIO()
+        f.savefig(buf, format="png", dpi=150, bbox_inches="tight")
+        buf.seek(0)
+        return buf.read()
+    except Exception:  # noqa: BLE001
+        return None
+    finally:
+        if owned:
+            try:
+                import matplotlib.pyplot as plt
+                plt.close(f)
+            except Exception:  # noqa: BLE001
+                pass
+
+
+def _place_picture_bytes(st: _PptxState, data: bytes, caption) -> None:
+    w_px, h_px = _img_size_px(data)
+    aspect = (h_px / w_px) if w_px else 0.66
+    cap_h = tl.line_height_in(_FS_NOTE) + 0.05 if caption else 0.0
+    max_h = _CONTENT_BOTTOM - _CONTENT_TOP - cap_h  # leave room for the caption
+    target_w = _USABLE_W
+    target_h = target_w * aspect
+    if target_h > max_h:
+        target_h = max_h
+        target_w = target_h / aspect if aspect else _USABLE_W
+    if _remaining(st) < target_h + cap_h:
+        _new_slide(st, cont=True)
+    left = _ML + (_USABLE_W - target_w) / 2.0
+    st.slide.shapes.add_picture(io.BytesIO(data), Inches(left), Inches(st.y),
+                                width=Inches(target_w), height=Inches(target_h))
+    st.y += target_h + 0.05
+    if caption:
+        _add_text(st, tl.wrap(model._safe_str(caption),
+                  tl.chars_per_line(_USABLE_W, _FS_NOTE)), _FS_NOTE, _MUTED,
+                  italic=True)
+    st.y += _GAP
+
+
+def _place_figure(st: _PptxState, block) -> None:
+    png = _resolve_png(block)
+    if png is None:
+        _add_text(st, ["(figura no disponible)"], _FS_NOTE, _MUTED, italic=True)
+        st.y += _GAP
+        return
+    _place_picture_bytes(st, png, getattr(block, "caption", None))
+
+
+def _place_image(st: _PptxState, block) -> None:
+    path = getattr(block, "path", "")
+    if not path or not os.path.exists(path):
+        _add_text(st, [f"(imagen no encontrada: {path})"], _FS_NOTE, _MUTED,
+                  italic=True)
+        st.y += _GAP
+        return
+    try:
+        with open(path, "rb") as fh:
+            data = fh.read()
+    except Exception as e:  # noqa: BLE001
+        _add_text(st, [f"(no se pudo leer la imagen: {e})"], _FS_NOTE, _MUTED,
+                  italic=True)
+        st.y += _GAP
+        return
+    _place_picture_bytes(st, data, getattr(block, "caption", None))
+
+
+def _place_caption(st: _PptxState, block) -> None:
+    _add_text(st, tl.wrap(getattr(block, "text", ""),
+              tl.chars_per_line(_USABLE_W, _FS_NOTE)), _FS_NOTE, _MUTED,
+              italic=True)
+    st.y += _GAP
+
+
+def _place_note(st: _PptxState, block) -> None:
+    _place_caption(st, block)
+
+
+_PLACERS = {
+    "heading": _place_heading,
+    "markdown": _place_markdown,
+    "kv_table": _place_kv_table,
+    "data_table": _place_data_table,
+    "figure": _place_figure,
+    "image": _place_image,
+    "caption": _place_caption,
+    "note": _place_note,
+}
+
+
+def render_pptx(chapters: list, out_path: str, meta: dict = None) -> dict:
+    """Render a list of Chapters into a 16:9 PPTX deck. Never raises.
+
+    Returns ``{path, n_slides, chapters, note}`` where ``chapters`` is a list of
+    ``{id, version, n_slides}`` for the manifest. On a fatal error ``path`` is
+    None and ``note`` explains why (e.g. python-pptx not installed).
+    """
+    meta = meta or {}
+    if not _PPTX_OK:
+        return {"path": None, "n_slides": 0, "chapters": [],
+                "note": f"python-pptx no disponible: {_PPTX_ERR}"}
+
+    chapters = model.as_chapters(chapters)
+    notes = []
+    try:
+        parent = os.path.dirname(os.path.abspath(out_path))
+        os.makedirs(parent, exist_ok=True)
+    except OSError as e:
+        return {"path": None, "n_slides": 0, "chapters": [],
+                "note": f"no se pudo crear el directorio destino: {e}"}
+
+    title = meta.get("title") or model.ENGINE_NAME
+    chapters_meta = []
+    try:
+        prs = Presentation()
+        prs.slide_width = Inches(_W)
+        prs.slide_height = Inches(_H)
+        st = _PptxState(prs, title)
+        for ch in chapters:
+            st.chapter = ch
+            st.chapter_slides = 0
+            _new_slide(st, cont=False)
+            for block in ch.blocks:
+                placer = _PLACERS.get(getattr(block, "kind", ""), _place_note)
+                try:
+                    placer(st, block)
+                except Exception as e:  # noqa: BLE001
+                    notes.append(
+                        f"bloque '{getattr(block, 'kind', '?')}' del capítulo "
+                        f"'{ch.id}' omitido: {e}")
+            chapters_meta.append({"id": ch.id, "version": ch.version,
+                                  "n_slides": st.chapter_slides})
+        if st.slide_no == 0:
+            st.chapter = model.Chapter(id="vacio", title=title,
+                                       version=model.ENGINE_VERSION)
+            _new_slide(st, cont=False)
+            _place_note(st, model.Note(
+                "(documento vacío — sin capítulos aplicables)"))
+        prs.save(out_path)
+        n_slides = st.slide_no
+    except Exception as e:  # noqa: BLE001
+        return {"path": None, "n_slides": 0, "chapters": [],
+                "note": f"fallo al escribir el PPTX: {e}"}
+
+    note = f"{n_slides} slides"
+    if notes:
+        note += " · " + "; ".join(notes)
+    return {"path": out_path, "n_slides": n_slides, "chapters": chapters_meta,
+            "note": note}
@@ -0,0 +1,107 @@
+"""Shared text-measurement helpers for the AutomaticEDA renderers.
+
+Both renderers flow content top-to-bottom and must know, *before* placing a
+block, how much vertical space it will take. That is what keeps content from
+being cut: a unit either fits in the remaining space or moves whole to the
+next page/slide. Measuring proportional text exactly in matplotlib/pptx is
+impractical, so we use a deterministic character-grid estimate (chars-per-line
+from an average glyph width). For typical mixed-case text it errs on the side
+of over-estimating, so a block is rarely judged to fit when it would overflow.
+
+Wrapping is word-aware (``textwrap``) and additionally hard-splits any single
+token longer than the line, so a 200-character value still wraps instead of
+overflowing. No visible character is lost; only whitespace is normalized.
+"""
+
+from __future__ import annotations
+
+import textwrap
+
+
+def avg_char_width_in(fontsize_pt: float) -> float:
+    """Approximate average glyph width in inches for a sans-serif font.
+
+    ~0.5 of the point size is a conservative mean advance width for proportional
+    sans fonts; dividing by 72 converts points to inches.
+    """
+    return 0.5 * fontsize_pt / 72.0
+
+
+def line_height_in(fontsize_pt: float, leading: float = 1.32) -> float:
+    """Line height in inches for a given font size and leading."""
+    return leading * fontsize_pt / 72.0
+
+
+def chars_per_line(width_in: float, fontsize_pt: float) -> int:
+    """How many average glyphs fit in ``width_in`` at ``fontsize_pt``."""
+    cw = avg_char_width_in(fontsize_pt)
+    if cw <= 0:
+        return 80
+    n = int(width_in / cw)
+    return max(1, n)
+
+
+def wrap(text: str, max_chars: int) -> list:
+    """Word-wrap ``text`` to lines of at most ``max_chars``, never losing chars.
+
+    Long tokens are hard-split so they cannot overflow, and existing newlines
+    are hard breaks. Only whitespace at wrap points is dropped. Empty input
+    yields a single empty line so callers can still reserve a row.
+    """
+    if max_chars < 1:
+        max_chars = 1
+    s = "" if text is None else str(text)
+    out: list = []
+    for raw_line in s.split("\n"):
+        if raw_line == "":
+            out.append("")
+            continue
+        # textwrap with break_long_words so no token overflows the column.
+        wrapped = textwrap.wrap(
+            raw_line, width=max_chars, break_long_words=True,
+            break_on_hyphens=False, replace_whitespace=True,
+            drop_whitespace=True,
+        )
+        if not wrapped:
+            out.append("")
+        else:
+            out.extend(wrapped)
+    return out or [""]
+
+
+def strip_inline_md(text: str) -> str:
+    """Strip a tiny subset of inline markdown markers, keeping the text.
+
+    Removes ``**bold**`` / ``__bold__`` / `` `code` `` markers so the content
+    is preserved without trying to style spans (which the line-grid layout
+    cannot do). Single-asterisk ``*em*`` markers are left untouched.
+    """
+    if not text:
+        return ""
+    s = str(text)
+    for marker in ("**", "__", "`"):
+        s = s.replace(marker, "")
+    return s
+
+
+def parse_md_table(lines: list):
+    """Parse consecutive ``| a | b |`` lines into ``(header, rows)`` or None.
+
+    Accepts an optional separator row (``|---|---|``) right after the header,
+    which is ignored. Returns None if the lines are not a pipe table.
+    """
+    cells_rows = []
+    for ln in lines:
+        s = ln.strip()
+        if not (s.startswith("|") and s.endswith("|")):
+            return None
+        parts = [c.strip() for c in s.strip("|").split("|")]
+        cells_rows.append(parts)
+    if not cells_rows:
+        return None
+    header = cells_rows[0]
+    body = cells_rows[1:]
+    # Drop a markdown separator row (all cells are dashes/colons).
+    if body and all(set(c) <= set("-: ") and "-" in c for c in body[0]):
+        body = body[1:]
+    return header, body