feat(browser): auto-commit with 178 changes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,68 @@
---
name: fetch_hackernews_search
kind: function
lang: py
domain: datascience
version: "1.0.0"
purity: impure
signature: "def fetch_hackernews_search(query: str, limit: int = 50, tags: str = \"story\") -> list[dict]"
description: "Searches Hacker News through the public Algolia API (no auth, no anti-bot measures) and normalizes each hit to a common market-intelligence shape. Sends a GET to hn.algolia.com/api/v1/search, filtered by tags (story/comment/...)."
tags: [market-intel, hackernews, scraping, http, social, demand, impure, datascience]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: "error_go_core"
imports: [requests]
params:
  - name: query
    desc: "search term (e.g. 'i wish there was a tool')"
  - name: limit
    desc: "maximum number of results (Algolia's hitsPerPage, capped at ~1000)"
  - name: tags
    desc: "Algolia item-type filter: 'story' (default), 'comment', '(story,comment)' for either type (Algolia ANDs comma-separated tags unless they are parenthesized), 'show_hn', 'ask_hn'"
output: "list[dict] (may be []). Each row: {source:'hackernews', platform_id:str, title:str, body:str, url:str, author:str, channel:'hn', created_utc:float, platform_score:int, query:str}"
tested: true
tests:
  - "parser normalizes hits to the exact shape"
  - "hit without an external url falls back to the news.ycombinator.com item link"
  - "points None maps to 0"
  - "empty hits returns an empty list"
test_file_path: "python/functions/datascience/fetch_hackernews_search_test.py"
file_path: "python/functions/datascience/fetch_hackernews_search.py"
---

## Example

```python
from datascience import fetch_hackernews_search

# Search stories
rows = fetch_hackernews_search("i wish there was a tool", limit=50, tags="story")
for r in rows[:3]:
    print(r["platform_score"], r["title"], r["url"])

# Search comments (a stronger signal of conversational demand)
comments = fetch_hackernews_search("alternative to", limit=100, tags="comment")
```
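
The request behind such a call can be sketched roughly as follows. This is a minimal sketch, not the function's actual source: the helper name `build_search_url` is hypothetical, and clamping `limit` to the 1..1000 range is an assumption based on the `hitsPerPage` cap mentioned in the parameter description. `query`, `tags`, and `hitsPerPage` are the query-string parameters of the Algolia HN search endpoint.

```python
from urllib.parse import urlencode

ALGOLIA_SEARCH_URL = "https://hn.algolia.com/api/v1/search"
MAX_HITS_PER_PAGE = 1000  # assumption: approximate cap noted in the param docs


def build_search_url(query: str, limit: int = 50, tags: str = "story") -> str:
    """Hypothetical helper: build the GET URL for an Algolia HN search."""
    params = {
        "query": query,
        "tags": tags,
        # Assumption: keep hitsPerPage inside 1..MAX_HITS_PER_PAGE.
        "hitsPerPage": max(1, min(limit, MAX_HITS_PER_PAGE)),
    }
    # urlencode percent-encodes the values (spaces become '+').
    return f"{ALGOLIA_SEARCH_URL}?{urlencode(params)}"


print(build_search_url("alternative to", limit=100, tags="comment"))
```

Building the URL separately from sending it keeps the parameter mapping easy to test without touching the network.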

## When to use it

Use it as a complementary source to `fetch_reddit_search` in market
intelligence pipelines. HN concentrates technical/SaaS demand, and the Algolia
API is stable and has no anti-bot measures, which makes it well suited to
recurring scans. Pass `tags="comment"` to capture demand expressed in threads
(usually richer than story titles).
Combine it with `score_demand_signal` to score each hit.
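
A recurring scan over several queries and both item types might look like the sketch below. The helper `scan_hackernews` is hypothetical and not part of the registry; it takes the fetch function as a parameter so it can be exercised without network access, and it assumes only the `platform_id` and `created_utc` fields from the documented row shape. Scoring with `score_demand_signal` is left out because its signature is not shown here.

```python
from typing import Callable


def scan_hackernews(
    fetch: Callable[..., list[dict]],
    queries: list[str],
    tags_list: tuple[str, ...] = ("story", "comment"),
    limit: int = 50,
) -> list[dict]:
    """Hypothetical helper: run every query under each tag filter and merge
    the rows, keeping the first row seen for each platform_id."""
    seen: set[str] = set()
    merged: list[dict] = []
    for query in queries:
        for tags in tags_list:
            for row in fetch(query, limit=limit, tags=tags):
                # The same item can match more than one query.
                if row["platform_id"] in seen:
                    continue
                seen.add(row["platform_id"])
                merged.append(row)
    # Newest first, so recent demand surfaces at the top.
    merged.sort(key=lambda r: r["created_utc"], reverse=True)
    return merged
```

It would be called as `scan_hackernews(fetch_hackernews_search, ["i wish there was a tool", "alternative to"])`, and the merged rows can then be handed to the scoring step.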

## Gotchas

- **No network = empty list, not an exception**: if the request fails (network
  error, 5xx, malformed JSON), the function returns `[]`. An empty result can
  therefore mean either no matches or a failed request, so check the size of
  the result.
- `created_utc` comes from `created_at_i` (epoch seconds, as a float).
- `platform_score` is the item's `points`, or `0` if Algolia does not provide
  them (typical for comments, which have no visible points in the API).
- `url`: if the hit is a story with an external link, `url` is that link;
  otherwise (Ask HN, comments, Show HN without a link) it falls back to the
  permalink `https://news.ycombinator.com/item?id={objectID}`.
- Unlike Reddit, Algolia does **not** require a User-Agent and does not
  rate-limit aggressively under normal use, but it is best not to abuse it.
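
The field mappings listed above can be illustrated with a small sketch. This is not the function's actual parser: `normalize_hit` and `parse_response` are hypothetical names, and the choices for `title` (falling back to `story_title`) and `body` (taken from `story_text` or `comment_text`) are assumptions about how Algolia hit fields might be used.

```python
def normalize_hit(hit: dict, query: str) -> dict:
    """Hypothetical sketch: map one Algolia hit to the common row shape."""
    object_id = str(hit.get("objectID", ""))
    return {
        "source": "hackernews",
        "platform_id": object_id,
        # Assumption: comments have no title, so use the parent story title.
        "title": hit.get("title") or hit.get("story_title") or "",
        # Assumption: body text comes from story_text or comment_text.
        "body": hit.get("story_text") or hit.get("comment_text") or "",
        # No external link: fall back to the HN permalink.
        "url": hit.get("url")
        or f"https://news.ycombinator.com/item?id={object_id}",
        "author": hit.get("author") or "",
        "channel": "hn",
        # created_at_i is epoch seconds; expose it as a float.
        "created_utc": float(hit.get("created_at_i") or 0),
        # points may be None (typical for comments): map it to 0.
        "platform_score": int(hit.get("points") or 0),
        "query": query,
    }


def parse_response(payload: dict, query: str) -> list[dict]:
    """Hypothetical sketch: a missing or empty 'hits' key yields []."""
    return [normalize_hit(hit, query) for hit in payload.get("hits") or []]
```

Using `or` for the fallbacks treats both missing keys and explicit `None` values the same way, which matches the "points None maps to 0" behavior described in the tests.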