The continuous leaderboard for AI models and agents.
Benchmarks say it's great. Your gut says it got dumber. We track the gap.
How it works
- Hourly rankings across 18 public signals — benchmarks, GitHub, package registries, IDE marketplaces, social platforms.
- Multi-factor composite: sentiment, adoption, infrastructure, positioning, buzz, trust. Hype- and nerf-corrected.
- Validation logic is held to a peer-review standard (NeurIPS 2026 Datasets and Benchmarks submission); production weights are re-tuned as data improves.
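
To make the composite concrete, here is a minimal sketch of how six factor scores could be combined and ranked. It is an illustration under stated assumptions, not the production pipeline: the equal weights, the `[0, 1]` score range, the `damping` parameter, and the rule that shrinks buzz toward adoption are all hypothetical stand-ins for the real, re-tuned weights and hype correction. Nerf correction is left out entirely.

```python
FACTORS = ("sentiment", "adoption", "infrastructure", "positioning", "buzz", "trust")

# Hypothetical equal weights; the real production weights are not shown here.
WEIGHTS = {name: 1.0 / len(FACTORS) for name in FACTORS}


def hype_corrected(factors, damping=0.5):
    """Illustrative hype correction (an assumption, not the real rule):
    when buzz exceeds adoption, pull buzz part of the way back toward adoption."""
    corrected = dict(factors)
    gap = factors["buzz"] - factors["adoption"]
    if gap > 0:
        corrected["buzz"] = factors["buzz"] - damping * gap
    return corrected


def composite_score(factors, weights=WEIGHTS):
    """Weighted mean of the six factor scores, each assumed to lie in [0, 1]."""
    missing = [name for name in FACTORS if name not in factors]
    if missing:
        raise ValueError(f"missing factors: {missing}")
    total_weight = sum(weights[name] for name in FACTORS)
    return sum(weights[name] * factors[name] for name in FACTORS) / total_weight


def rank(models):
    """Return model names ordered by hype-corrected composite score, best first."""
    return sorted(
        models,
        key=lambda name: composite_score(hype_corrected(models[name])),
        reverse=True,
    )
```

For example, calling `rank` on a dict that maps each model name to its six factor scores returns the names in leaderboard order. Separating the correction step from the weighted mean keeps the weights easy to re-tune without touching the correction logic.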