---
name: aggregate_by_group
kind: function
lang: py
domain: datascience
version: "1.0.0"
purity: pure
signature: "def aggregate_by_group(rows: list[dict], group_by: list[str], aggs: dict[str, str]) -> list[dict]"
description: "GROUP BY plus aggregations over tabular data. aggs is a dict mapping each column to a function name (sum, mean, count, min, max, first, last, collect). collect accumulates the values into a list. None is ignored in numeric aggregations."
tags: [datascience, tabular, groupby, aggregate, transform, python]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: ""
imports: ["collections"]
tested: true
tests:
  - "Group by a single column with sum"
  - "Group by multiple columns"
  - "Aggregation with mean, count, min, max"
  - "collect accumulates into a list"
  - "Group with a single row"
  - "A field with None is ignored in numeric aggregations"
test_file_path: "python/functions/datascience/aggregate_by_group_test.py"
file_path: "python/functions/datascience/aggregate_by_group.py"
---

## Example

```python
rows = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 120},
    {"dept": "sales", "salary": 80},
]

# Average salary per department; each output row holds the group key
# plus one aggregated value per column listed in aggs.
aggregate_by_group(rows, group_by=["dept"], aggs={"salary": "mean"})
# [{"dept": "eng", "salary": 110.0}, {"dept": "sales", "salary": 80.0}]
```

## Notes

Pure function with no external dependencies (it uses only `collections.defaultdict` from the standard library). Groups are returned in order of first appearance. The `collect` aggregation does not filter `None`; it accumulates every value, including `None`.
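
The source file itself is not reproduced here, so the following is only a minimal sketch of how the documented behavior could be implemented, not the actual contents of `aggregate_by_group.py`. The name `aggregate_by_group_sketch` and the helper `_apply` are hypothetical. Several details are assumptions because the card does not specify them: `count` counts non-`None` values, `first` and `last` return the raw value without skipping `None`, `mean`/`min`/`max` return `None` when a group has no non-`None` values, and an unknown function name raises `ValueError`.

```python
from collections import defaultdict

# Aggregation names listed in the card's description.
_KNOWN = {"sum", "mean", "count", "min", "max", "first", "last", "collect"}


def _apply(func, values):
    """Hypothetical helper: reduce one column's values for one group."""
    if func not in _KNOWN:
        # Assumption: the card does not say how unknown names are handled.
        raise ValueError(f"unknown aggregation: {func!r}")
    if func == "collect":
        return list(values)  # documented: None values are kept
    if func == "first":
        return values[0]  # assumption: None is not skipped
    if func == "last":
        return values[-1]  # assumption: None is not skipped

    # Documented: None is ignored in numeric aggregations.
    present = [v for v in values if v is not None]
    if func == "count":
        return len(present)  # assumption: count ignores None too
    if func == "sum":
        return sum(present)
    if not present:
        return None  # assumption: nothing to aggregate
    if func == "mean":
        return sum(present) / len(present)
    if func == "min":
        return min(present)
    return max(present)


def aggregate_by_group_sketch(rows, group_by, aggs):
    # defaultdict is a dict subclass, so it keeps insertion order and the
    # groups come out in order of first appearance.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row.get(col) for col in group_by)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(group_by, key))
        for col, func in aggs.items():
            out[col] = _apply(func, [r.get(col) for r in members])
        result.append(out)
    return result


rows = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": None},
    {"dept": "sales", "salary": 80},
]
aggregate_by_group_sketch(rows, ["dept"], {"salary": "collect"})
# [{"dept": "eng", "salary": [100, None]}, {"dept": "sales", "salary": [80]}]
```

Because `aggs` maps each column to a single function name, this shape allows only one aggregation per column, and the result is stored under the column's own name, as in the example above.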