Running • 62 • Stick To Your Role! Leaderboard 🎠 • Benchmarking LLMs on the stability of simulated populations
Article • Improving Prompt Consistency with Structured Generations • willkurt, remi, clefourrier +1 • Apr 30, 2024 • 68
Running • 600 • Scaling test-time compute 📈 • Boost LLM answers with flexible test-time search strategies
Llama 3.3 (All Versions) • Collection • Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated Apr 22 • 38
meta-llama/Llama-3.3-70B-Instruct • Text Generation • 71B • Updated Dec 21, 2024 • 875k • 2.79k
Running • Agents • 111 • Judge Arena 💻 • View and compare open-source AI model rankings with Elo scores
meta-llama/Meta-Llama-3-8B-Instruct • Text Generation • 8B • Updated Jun 18, 2025 • 1.59M • 4.55k