fn_registry/python/functions/infra/render_table_page_pdfpages.py
egutierrez faac610745 feat: bulk extraction of footprint_aurgi (41 funcs + 4 types + geo Docker stack)
Extracts functions from the internal footprint_aurgi project into the registry:
- core (6): slugify_ascii, normalize_for_join, cp_provincia_es, infer_provincia_from_cp, safe_read_csv_fallback, csv_to_parquet_duckdb
- pure geo (7): haversine_km, point_in_ring, point_in_polygon, point_in_polygons_bbox, polygon_bbox, extent_with_padding, distance_bucket
- geo I/O (4): load_geojson_polygons, load_boundary_gdf, add_basemap_osm, add_basemap_with_timeout
- valhalla client (4): valhalla_route, valhalla_isochrone, valhalla_isochrones_async, valhalla_matrix_1_to_n
- datascience stats (7): trimmed_mean, geometric_mean, detect_distribution_type, best_central_tendency, summary_stats, kde_density_levels, alpha_shape_concave_hull
- datascience fuzzy (3): fuzzy_merge_adaptive (rapidfuzz), words_to_dataset, remove_words_from_column
- datascience viz (2): plot_kde_2d, plot_heatmap_log
- infra (4): compress_pdf_ghostscript, render_table_page_pdfpages, add_header_logo, osm2pgsql_ingest
- pipelines (4): setup_geo_stack_docker, compute_centers_reachability, generate_isochrones_by_zone, count_points_per_zone
- types geo (4): LonLat, BBox, IsochroneRequest, Centro

Includes:
- apps/footprint_geo_stack/ (PostGIS + Martin + Valhalla via docker-compose)
- 131/132 tests pass (1 expected skip: needs osm2pgsql on PATH)
- Issue tracker dev/issues/0052-footprint-aurgi-extraction.md
- Uniform attribution: source_repo internal:footprint_aurgi, source_license internal-aurgi
- Built with 9 agents in parallel (8 in wave 1 + 1 in wave 2 for pipelines)

Also commits previously uncommitted work: aggregate_extraction_results, chunk_with_overlap, clean_pdf_text, merge_entity_aliases, extract_graph_gliner2, extract_relations_mrebel, extract_triples_spacy_es, gliner2/mrebel/marianmt/rebel/spacy_es load_model, parse_rebel_output, translate_es_to_en, issue 0050/0051.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:35:22 +02:00


"""Render paginated table pages into a matplotlib PdfPages object."""

from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from matplotlib.backends.backend_pdf import PdfPages


def render_table_page_pdfpages(
    pdf: "PdfPages",
    title: str,
    rows: list[list[str]],
    col_labels: list[str],
    max_rows: int = 28,
    figsize: tuple[float, float] = (11.69, 8.27),
    fontsize: int = 8,
    dpi: int = 300,
) -> None:
    """Render rows as paginated table pages into an open PdfPages object.

    Partitions rows into chunks of max_rows and writes one A4-landscape page
    per chunk using matplotlib's table widget. Each page carries the given title.

    Args:
        pdf: An open matplotlib PdfPages context.
        title: Page title shown above the table.
        rows: List of rows, each row is a list of string cell values.
        col_labels: Column header labels.
        max_rows: Maximum rows per page before starting a new page.
        figsize: Figure size in inches (default A4 landscape 11.69x8.27).
        fontsize: Font size for table cells.
        dpi: Resolution used when saving each page.
    """
    import matplotlib

    # Select the non-interactive backend so rendering works without a display.
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    # Always render at least one page; when rows is empty, emit a single
    # title-only page (no table is drawn).
    chunks: list[list[list[str]]] = []
    if not rows:
        chunks = [[]]
    else:
        for start in range(0, len(rows), max_rows):
            chunks.append(rows[start : start + max_rows])

    for chunk in chunks:
        fig, ax = plt.subplots(figsize=figsize)
        ax.axis("off")
        if chunk:
            table = ax.table(cellText=chunk, colLabels=col_labels, loc="center")
            table.auto_set_font_size(False)
            table.set_fontsize(fontsize)
            table.scale(1, 1.3)
        ax.set_title(title, fontsize=14, pad=12)
        pdf.savefig(fig, dpi=dpi)
        # Close the figure so long reports do not accumulate open figures.
        plt.close(fig)
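
The pagination step of the function above can be looked at in isolation. The following is a minimal sketch, not part of the registry: `paginate_rows` is a hypothetical helper name, and the `max_rows < 1` guard is an added assumption (the original passes `max_rows` straight to `range()`, which raises for a step of 0 and yields no chunks for a negative step). It mirrors the chunking rule, including the single empty chunk returned for empty input.

```python
from __future__ import annotations


def paginate_rows(rows: list[list[str]], max_rows: int = 28) -> list[list[list[str]]]:
    """Split rows into page-sized chunks, always returning at least one chunk."""
    if max_rows < 1:
        # Assumption: reject non-positive page sizes up front instead of
        # relying on range() behaviour for zero or negative steps.
        raise ValueError("max_rows must be >= 1")
    if not rows:
        # One empty chunk corresponds to a single title-only page.
        return [[]]
    return [rows[start:start + max_rows] for start in range(0, len(rows), max_rows)]


# Illustrative data only: 60 two-column rows split at 28 rows per page.
sample_rows = [[str(i), f"item-{i}"] for i in range(60)]
pages = paginate_rows(sample_rows, max_rows=28)
page_sizes = [len(page) for page in pages]
print(page_sizes)  # [28, 28, 4]
```

With the default `max_rows=28`, a 60-row input would therefore produce three pages, the last one holding the remaining four rows.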