feat: Python functions for datascience, finance, cybersecurity, and pipelines
Datascience: aggregate_by_group, deduplicate_entities/relations, detect_drift, diff_entities/relations, extract_entities/relations_llm, hotness_score, melt, merge_graphs, pivot, build_entity/relation_schema_prompt.
Finance: avellaneda_stoikov_quotes, generate_gbm_prices, generate_taker_order, hawkes_intensity, plus the finance.py module.
Cybersecurity: envelope_encrypt/decrypt, plus the cybersecurity.py module.
Pipelines: extraction_pipeline, monte_carlo_market, run_market_sim.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,45 @@
---
name: aggregate_by_group
kind: function
lang: py
domain: datascience
version: "1.0.0"
purity: pure
signature: "def aggregate_by_group(rows: list[dict], group_by: list[str], aggs: dict[str, str]) -> list[dict]"
description: "GROUP BY plus aggregations over tabular data. aggs is a dict mapping each column to a function (sum, mean, count, min, max, first, last, collect). collect accumulates values into a list. None is ignored in numeric aggregations."
tags: [datascience, tabular, groupby, aggregate, transform, python]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: ""
imports: ["collections"]
tested: true
tests:
  - "Group by a single column with sum"
  - "Group by multiple columns"
  - "Aggregation with mean, count, min, max"
  - "collect accumulates into a list"
  - "Group with a single row"
  - "Field with None is ignored in numeric aggregations"
test_file_path: "python/functions/datascience/aggregate_by_group_test.py"
file_path: "python/functions/datascience/aggregate_by_group.py"
---

## Example

```python
rows = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 120},
    {"dept": "sales", "salary": 80},
]
aggregate_by_group(rows, group_by=["dept"], aggs={"salary": "mean"})
# [{"dept": "eng", "salary": 110.0}, {"dept": "sales", "salary": 80.0}]
```

## Notes

Pure function with no external dependencies (it uses only collections.defaultdict from the standard library).
Groups are returned in the order in which each group first appears.
The 'collect' aggregation does not filter None: it accumulates every value, including None.