837563c3ba
Datascience: aggregate_by_group, deduplicate_entities/relations, detect_drift, diff_entities/relations, extract_entities/relations_llm, hotness_score, melt, merge_graphs, pivot, build_entity/relation_schema_prompt. Finance: avellaneda_stoikov_quotes, generate_gbm_prices, generate_taker_order, hawkes_intensity + finance.py module. Cybersecurity: envelope_encrypt/decrypt + cybersecurity.py module. Pipelines: extraction_pipeline, monte_carlo_market, run_market_sim. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1.5 KiB
| name | kind | lang | domain | version | purity | signature | description | tags | uses_functions | uses_types | returns | returns_optional | error_type | imports | tested | tests | test_file_path | file_path |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| aggregate_by_group | function | py | datascience | 1.0.0 | pure | def aggregate_by_group(rows: list[dict], group_by: list[str], aggs: dict[str, str]) -> list[dict] | GROUP BY plus aggregations over tabular data. aggs is a dict mapping each column to a function (sum, mean, count, min, max, first, last, collect). collect accumulates values into a list. None is ignored in numeric aggregations. | | | | | false | | | true | | python/functions/datascience/aggregate_by_group_test.py | python/functions/datascience/aggregate_by_group.py |
Example
rows = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 120},
    {"dept": "sales", "salary": 80},
]
aggregate_by_group(rows, group_by=["dept"], aggs={"salary": "mean"})
# [{"dept": "eng", "salary": 110.0}, {"dept": "sales", "salary": 80.0}]
Notes
Pure function with no external dependencies (only collections.defaultdict from the stdlib). Preserves the order in which each group first appears. The 'collect' function does not filter None: it accumulates every value, including None.
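The behaviour described in the notes can be illustrated with a minimal sketch. This is not the code in aggregate_by_group.py. The name aggregate_by_group_sketch is hypothetical, and several details are assumptions made for illustration: count skips None, first/last do not skip None, and a group with no non-None values yields None for mean/min/max.

```python
from collections import defaultdict


def aggregate_by_group_sketch(rows, group_by, aggs):
    """Hypothetical illustration of the documented semantics, not the registry's code."""
    # defaultdict is a dict subclass, so keys keep insertion order (Python 3.7+):
    # groups are emitted in order of first appearance.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row.get(col) for col in group_by)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(group_by, key))
        for col, func in aggs.items():
            values = [m.get(col) for m in members]
            if func == "collect":
                out[col] = values  # keeps None entries, as the notes state
                continue
            if func == "first":  # assumption: first/last do not skip None
                out[col] = values[0]
                continue
            if func == "last":
                out[col] = values[-1]
                continue
            present = [v for v in values if v is not None]  # None is ignored
            if func == "count":  # assumption: count only non-None values
                out[col] = len(present)
            elif func == "sum":
                out[col] = sum(present)
            elif func == "mean":
                out[col] = sum(present) / len(present) if present else None
            elif func == "min":
                out[col] = min(present) if present else None
            elif func == "max":
                out[col] = max(present) if present else None
            else:
                raise ValueError(f"unknown aggregation: {func}")
        result.append(out)
    return result


sample = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": None},
    {"dept": "sales", "salary": 80},
]
mean_rows = aggregate_by_group_sketch(sample, ["dept"], {"salary": "mean"})
collect_rows = aggregate_by_group_sketch(sample, ["dept"], {"salary": "collect"})
```

With this sample, mean skips the None salary in the eng group, while collect keeps it in the accumulated list.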