AI-focused news aggregator that ranks, summarizes, and deduplicates articles about artificial intelligence in real time.
- Aggregates AI news from 32 sources (OpenAI, Anthropic, DeepMind, Google AI, arXiv, HN, GitHub trending, HuggingFace, and more)
- Semantic deduplication via Voyage embeddings + pgvector (collapses "GPT-5 released" from 5 sources into one item)
- Importance scoring with Claude Haiku 4.5 (calibrated 0–100 rubric)
- Topic clustering with automatic cluster labeling
- Real-time ranking blending Claude importance, HN/GitHub/HF engagement, source trust, cluster size, and citation signals
- Real-time UI via Supabase Realtime — new items stream to the browser as they drop
- Discord webhook push for high-importance items
- No accounts — public read-only feed, filter by category / min importance, sort by hot / new / trending
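
The ranking bullet above names the blended signals but not how they combine. A minimal sketch of one possible blend, assuming a simple weighted sum: the `RankSignals` shape, the `WEIGHTS` values, the log-scaling caps, and the `blendScore` name are all hypothetical illustrations, not taken from this repo.

```typescript
// Hypothetical signal shape; field names are illustrative, not the repo's schema.
interface RankSignals {
  importance: number;  // Claude importance score, 0-100
  engagement: number;  // combined HN/GitHub/HF engagement, raw count
  sourceTrust: number; // source reputation weight, 0-1
  clusterSize: number; // number of items in the same topic cluster
  citations: number;   // citation count
}

// Assumed weights; they sum to 1 so the blended result stays within 0-100.
const WEIGHTS = {
  importance: 0.5,
  engagement: 0.2,
  sourceTrust: 0.15,
  clusterSize: 0.1,
  citations: 0.05,
};

// Squash an unbounded count into 0-1 with log scaling; `cap` is the count that maps to 1.
function squash(count: number, cap: number): number {
  if (count <= 0) return 0;
  return Math.min(1, Math.log1p(count) / Math.log1p(cap));
}

// Weighted sum of normalized signals, scaled back to a 0-100 integer.
function blendScore(s: RankSignals): number {
  const blended =
    WEIGHTS.importance * (s.importance / 100) +
    WEIGHTS.engagement * squash(s.engagement, 1000) +
    WEIGHTS.sourceTrust * Math.min(1, Math.max(0, s.sourceTrust)) +
    WEIGHTS.clusterSize * squash(s.clusterSize, 20) +
    WEIGHTS.citations * squash(s.citations, 100);
  return Math.round(blended * 100);
}
```

Log scaling keeps a single viral item from swamping the importance score; the actual weighting in this project may differ.
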
| Layer | Tool |
|---|---|
| Framework | Next.js 16 (App Router) + TypeScript |
| UI | Tailwind v4 + custom primitives + lucide-react icons |
| DB + realtime | Supabase Postgres + pgvector + Realtime |
| LLM | Anthropic claude-haiku-4-5-20251001 (prompt-cached) |
| Embeddings | Voyage AI voyage-3 (1024-dim) — optional |
| Ingestion | Vercel Cron → Next.js API routes per source adapter |
| Deploy | Vercel (frontend + cron), Supabase (DB) |
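
Semantic dedup with embeddings comes down to a similarity test between vectors. In this stack the nearest-neighbor search presumably runs inside Postgres via pgvector; the in-memory sketch below only illustrates the idea. The `0.9` threshold, the `Embedded` type, and the `findDuplicate` helper are assumptions, not values or names from this repo.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / Math.sqrt(normA * normB);
}

// Assumed cutoff; the real threshold is not stated in this README.
const DUPLICATE_THRESHOLD = 0.9;

// Hypothetical shape for an already-stored item.
interface Embedded {
  id: string;
  embedding: number[];
}

// Returns the id of the most similar existing item at or above the threshold, or null.
function findDuplicate(
  candidate: number[],
  existing: Embedded[],
  threshold: number = DUPLICATE_THRESHOLD
): string | null {
  let best: { id: string; score: number } | null = null;
  for (const item of existing) {
    const score = cosineSimilarity(candidate, item.embedding);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { id: item.id, score };
    }
  }
  return best ? best.id : null;
}
```

A new article whose embedding matches an existing item would be folded into that item instead of creating a second feed entry.
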
- Create a project at supabase.com.
- In the SQL editor, run (in order):
  1. `supabase/migrations/001_schema.sql`: full schema, RLS, realtime, triggers, and RPCs (`similar_items`, `similar_recent_items`, `bump_duplicate_count`, `recompute_topic_sizes`, `trending_items`, `top_topics`)
  2. `supabase/seed/sources.sql`: 32 seed sources + reputation weights

`cp .env.example .env.local`

Fill in:
- `NEXT_PUBLIC_SUPABASE_URL`, `NEXT_PUBLIC_SUPABASE_ANON_KEY`, `SUPABASE_SERVICE_ROLE_KEY`: from Supabase project settings → API
- `ANTHROPIC_API_KEY`: from console.anthropic.com
- `VOYAGE_API_KEY`: optional; enables semantic dedup via `voyage-3` embeddings (200M free tokens on signup)
- `CRON_SECRET`: any long random string; required to call `/api/cron/*` endpoints
- `ITEM_RETENTION_DAYS`: default `14`; items older than this get auto-pruned at ingest time
- `GITHUB_TOKEN`: optional; raises GitHub trending/search rate limits from 60/hr to 5000/hr
- `SEMANTIC_SCHOLAR_API_KEY`: optional; raises S2 rate limits for arXiv citation enrichment
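
Since `CRON_SECRET` gates the `/api/cron/*` endpoints through an `Authorization: Bearer` header, the check on the server side might look like the sketch below. The function names are hypothetical and the constant-time comparison is an assumption about good practice, not a description of what the repo's route handlers actually do.

```typescript
// Compares two strings without exiting early on the first differing character.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Returns true only when the header is exactly "Bearer <secret>".
// A missing header or an unset secret is always rejected.
function isAuthorizedCronRequest(
  authHeader: string | null,
  secret: string | undefined
): boolean {
  if (!authHeader || !secret) return false;
  return constantTimeEqual(authHeader, `Bearer ${secret}`);
}
```

Rejecting requests when the secret is unset avoids accidentally exposing the cron endpoints on a deployment where the variable was forgotten.
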
You need two terminals: one for the Next.js dev server, one for the ingest/enrich loop. The dev server alone shows an empty feed — new items only flow in when the cron endpoints get hit.
Terminal 1 — dev server:

    npm install
    npm run dev

Open http://localhost:3000. Empty feed is expected until the loop kicks in.
Terminal 2 — ingest + enrich loop (every 15 min):

    powershell -ExecutionPolicy Bypass -File .\scripts\loop.ps1

`scripts/loop.ps1` reads `CRON_SECRET` from `.env.local`, hits `/api/cron/ingest` then `/api/cron/enrich`, sleeps 15 min, repeats. Leave it running; items start appearing in the feed within a minute.
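
For readers not on Windows, the same ingest-then-enrich cycle can be sketched in TypeScript. This is not a port of `scripts/loop.ps1`: the `FetchLike` type, `runCycle`, and `loop` are hypothetical names, and logging failures instead of exiting is an assumed behavior.

```typescript
// Minimal fetch-like signature so the cycle can be exercised with a stub.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number }>;

// Calls ingest, then enrich, in that order; throws on the first non-2xx response.
async function runCycle(
  baseUrl: string,
  cronSecret: string,
  fetchFn: FetchLike
): Promise<void> {
  for (const path of ["/api/cron/ingest", "/api/cron/enrich"]) {
    const res = await fetchFn(`${baseUrl}${path}`, {
      headers: { Authorization: `Bearer ${cronSecret}` },
    });
    if (!res.ok) throw new Error(`${path} failed with status ${res.status}`);
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Repeats the cycle every `intervalMs` (15 minutes by default).
// Failures are logged and the loop keeps going (an assumed policy).
async function loop(
  baseUrl: string,
  cronSecret: string,
  fetchFn: FetchLike,
  intervalMs: number = 15 * 60 * 1000
): Promise<never> {
  while (true) {
    try {
      await runCycle(baseUrl, cronSecret, fetchFn);
    } catch (err) {
      console.error("cycle failed:", err);
    }
    await sleep(intervalMs);
  }
}
```

On Node 18 or later, passing the global `fetch` as `fetchFn` along with `http://localhost:3000` and the value of `CRON_SECRET` should be enough to drive a local dev server.
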
Quick one-off test (no loop):

    curl -H "Authorization: Bearer <CRON_SECRET>" http://localhost:3000/api/cron/ingest
    curl -H "Authorization: Bearer <CRON_SECRET>" http://localhost:3000/api/cron/enrich

Survive reboots: register `scripts/loop.ps1` in Windows Task Scheduler as a "run at logon" task.
- Vercel cron: `vercel.json` already declares schedules. Requires Vercel Pro ($20/mo); Hobby's 60s function timeout and daily-only crons won't work here.
- GitHub Actions: free (2000 min/mo). Add a workflow that `curl`s the deployed `/api/cron/*` endpoints on a `*/15 * * * *` schedule.

    app/                 Next.js routes (feed, item detail, topic detail, search, API crons)
    components/          UI primitives (item card, filter bar, topics strip, …)
    lib/
      anthropic/         Claude client, enrichment prompt + parser, embeddings
      supabase/          browser + server + service-role clients
      ingest/            source adapters + dedup + normalization + engagement scoring
      topics/            cluster.ts (union-find) + label.ts (Claude cluster labeling)
      types.ts           shared domain types (Item, Source, Category)
    supabase/
      migrations/        schema
      seed/              source registry seed

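
The layout above notes that `cluster.ts` uses union-find. A generic sketch of that technique follows, assuming clusters are formed from pairs of items already judged similar; `UnionFind` and `clusterBySimilarPairs` are hypothetical names and this is not the repo's implementation.

```typescript
// Disjoint-set (union-find) with path halving and union by size.
class UnionFind {
  private parent: number[];
  private size: number[];

  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i); // each item starts as its own root
    this.size = new Array<number>(n).fill(1);
  }

  // Returns the root of x, shortening the path as it walks up.
  find(x: number): number {
    while (this.parent[x] !== x) {
      this.parent[x] = this.parent[this.parent[x]]; // path halving
      x = this.parent[x];
    }
    return x;
  }

  // Merges the sets containing a and b, attaching the smaller tree under the larger.
  union(a: number, b: number): void {
    let ra = this.find(a);
    let rb = this.find(b);
    if (ra === rb) return;
    if (this.size[ra] < this.size[rb]) {
      const tmp = ra;
      ra = rb;
      rb = tmp;
    }
    this.parent[rb] = ra;
    this.size[ra] += this.size[rb];
  }
}

// Groups item indices 0..itemCount-1 into clusters, given pairs judged similar.
function clusterBySimilarPairs(
  itemCount: number,
  similarPairs: Array<[number, number]>
): number[][] {
  const uf = new UnionFind(itemCount);
  for (const [a, b] of similarPairs) uf.union(a, b);

  const groups = new Map<number, number[]>();
  for (let i = 0; i < itemCount; i++) {
    const root = uf.find(i);
    const group = groups.get(root);
    if (group) group.push(i);
    else groups.set(root, [i]);
  }
  return Array.from(groups.values());
}
```

For example, `clusterBySimilarPairs(5, [[0, 1], [1, 2], [3, 4]])` groups items 0, 1, and 2 together even though 0 and 2 were never compared directly, which is the transitive behavior that makes union-find a natural fit for topic clustering.
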
MIT — see LICENSE.