feat: Python functions for datascience, finance, cybersecurity and pipelines

Datascience: aggregate_by_group, deduplicate_entities/relations, detect_drift, diff_entities/relations, extract_entities/relations_llm, hotness_score, melt, merge_graphs, pivot, build_entity/relation_schema_prompt.
Finance: avellaneda_stoikov_quotes, generate_gbm_prices, generate_taker_order, hawkes_intensity, plus the finance.py module.
Cybersecurity: envelope_encrypt/decrypt, plus the cybersecurity.py module.
Pipelines: extraction_pipeline, monte_carlo_market, run_market_sim.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:11:32 +02:00
parent 25a392df48
commit 63a9cb5273
62 changed files with 5376 additions and 0 deletions
@@ -0,0 +1,45 @@
+---
+name: aggregate_by_group
+kind: function
+lang: py
+domain: datascience
+version: "1.0.0"
+purity: pure
+signature: "def aggregate_by_group(rows: list[dict], group_by: list[str], aggs: dict[str, str]) -> list[dict]"
+description: "GROUP BY plus aggregations over tabular data. aggs maps each column to a function name (sum, mean, count, min, max, first, last, collect). collect accumulates values into a list. None is ignored in numeric aggregations."
+tags: [datascience, tabular, groupby, aggregate, transform, python]
+uses_functions: []
+uses_types: []
+returns: []
+returns_optional: false
+error_type: ""
+imports: ["collections"]
+tested: true
+tests:
+  - "Group by a single column with sum"
+  - "Group by multiple columns"
+  - "Aggregations mean, count, min, max"
+  - "collect accumulates into a list"
+  - "Group with a single row"
+  - "Field with None is ignored in numeric aggregations"
+test_file_path: "python/functions/datascience/aggregate_by_group_test.py"
+file_path: "python/functions/datascience/aggregate_by_group.py"
+---
+
+## Example
+
+```python
+rows = [
+    {"dept": "eng", "salary": 100},
+    {"dept": "eng", "salary": 120},
+    {"dept": "sales", "salary": 80},
+]
+aggregate_by_group(rows, group_by=["dept"], aggs={"salary": "mean"})
+# [{"dept": "eng", "salary": 110.0}, {"dept": "sales", "salary": 80.0}]
+```
+
+## Notes
+
+Pure function with no external dependencies (only collections.defaultdict from the stdlib).
+Preserves the order in which each group first appears.
+The 'collect' function does not filter None: it accumulates every value, including None.
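
The behavior described in the notes (first-appearance group order, None skipped in numeric aggregations, None kept by collect) can be illustrated with a minimal sketch. This is not the committed implementation: the name `aggregate_by_group_sketch` is hypothetical, only `sum`, `mean`, `min`, `max` and `collect` are covered, and two details are assumptions that the notes do not state: a missing key is treated as None, and a numeric aggregation over a group with no non-None values returns None.

```python
from collections import defaultdict

# Numeric reducers applied to the non-None values of a column.
_NUMERIC = {
    "sum": sum,
    "mean": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
}


def aggregate_by_group_sketch(rows, group_by, aggs):
    """Illustrative sketch only; hypothetical, not the code added in this commit."""
    # defaultdict is a dict subclass, so it keeps insertion order (Python 3.7+):
    # groups are emitted in the order their key first appears in `rows`.
    groups = defaultdict(list)
    for row in rows:
        # Assumption: a missing key is treated like None.
        key = tuple(row.get(col) for col in group_by)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(group_by, key))
        for col, func in aggs.items():
            values = [member.get(col) for member in members]
            if func == "collect":
                # collect keeps every value, including None.
                out[col] = values
            elif func in _NUMERIC:
                present = [v for v in values if v is not None]
                # Assumption: a group with no usable values yields None.
                out[col] = _NUMERIC[func](present) if present else None
            else:
                raise ValueError(f"unsupported aggregation in this sketch: {func}")
        result.append(out)
    return result


rows = [
    {"dept": "sales", "salary": 80, "bonus": None},
    {"dept": "eng", "salary": 100, "bonus": 5},
    {"dept": "sales", "salary": None, "bonus": 3},
    {"dept": "eng", "salary": 120, "bonus": None},
]
result = aggregate_by_group_sketch(
    rows, group_by=["dept"], aggs={"salary": "mean", "bonus": "collect"}
)
print(result)
```

Here "sales" comes out before "eng" because it appears first in the input, the None salary is left out of the mean, and the None bonuses stay in the collected lists.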