---
name: agent_coding_eval
lang: py
domain: datascience
description: "Evaluation of coding agents (Qwen 2.5-Coder and others) on real tasks from the fn_registry."
tags: [agents, coding, eval, qwen, llm, jupyter]
uses_functions: []
uses_types: []
framework: "jupyterlab"
entry_point: "notebooks/"
dir_path: "analysis/agent_coding_eval"
repo_url: "https://gitea-dgg044oo04woo4ggcsws4gk0.organic-machine.com/dataforge/agent_coding_eval"
---

## Notes

Notebooks for evaluating coding agents against tasks from the registry.
They test local models (Qwen 2.5-Coder) and compare the results against baselines.
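
The comparison against baselines could be summarized along the following lines. This is a minimal sketch, not the notebooks' actual code: the record shape `(model, task_id, passed)` and the helper names `pass_rates` and `delta_vs_baseline` are assumptions made for illustration.

```python
from collections import defaultdict


def pass_rates(results):
    """Return {model: fraction of tasks passed}.

    `results` is an iterable of (model, task_id, passed) tuples.
    This record shape is hypothetical, chosen only for this sketch.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for model, _task_id, passed in results:
        totals[model] += 1
        passes[model] += int(bool(passed))
    return {model: passes[model] / totals[model] for model in totals}


def delta_vs_baseline(rates, baseline):
    """Return each model's pass rate minus the baseline's pass rate."""
    if baseline not in rates:
        raise KeyError(f"baseline {baseline!r} has no recorded results")
    base = rates[baseline]
    return {model: rate - base for model, rate in rates.items() if model != baseline}
```

A notebook would call `pass_rates` on its collected run records and then `delta_vs_baseline(rates, "<baseline name>")`; a positive value means the model passed a larger share of tasks than the baseline.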