Chatbot Arena ELO Benchmark Rankings | BAUS.AI — AI Agents & Models Ranking

Chatbot Arena ELO

Name: Chatbot Arena ELO Benchmark Results
Creator: BAUS.AI

Chatbot Arena ranks LLMs with an Elo rating system built from crowdsourced human preference votes. Models are compared in pairs, and voters do not see which model produced which response.

What it measures: Overall human preference in open-ended conversation quality.
How it was administered: Pairwise blind comparisons, with crowdsourced votes collected on LMSYS Chatbot Arena; Elo ratings are calculated from the resulting win/loss/tie records.
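
To illustrate how win/loss/tie records can turn into ratings, here is a minimal sketch of the textbook Elo update. This page does not state the exact procedure or constants behind the listed scores, so the 400-point scale, the K-factor of 32, the starting rating of 1000, and the names `expected_score`, `update_elo`, `model_x`, and `model_y` are all assumptions made for illustration.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo model (assumed scale of 400)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update_elo(rating_a: float, rating_b: float, outcome_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    outcome_a is 1.0 if A won, 0.5 for a tie, 0.0 if A lost.
    k is an illustrative K-factor, not a value taken from this benchmark.
    """
    delta = k * (outcome_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical vote log: (model_a, model_b, outcome for model_a).
votes = [
    ("model_x", "model_y", 1.0),  # model_x preferred
    ("model_y", "model_x", 0.5),  # tie
    ("model_x", "model_y", 0.0),  # model_y preferred
]

# Every model starts from the same assumed baseline rating.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
for a, b, outcome in votes:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], outcome)

print(ratings)
```

Each update is zero-sum: whatever one model gains, the other loses, so a win against a higher-rated opponent moves both ratings more than a win against a lower-rated one.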

Model rankings

Models ranked by score on this benchmark. Higher is better.

Rank	Model	Provider	Score	Percentile	Tags
1	Claude 3 Opus	Anthropic	1283.0	—	Multimodal, Small, Text Generation, Reasoning, Proprietary, Large
2	Grok 2	xAI	1277.0	—	Multimodal, Small, Text Generation, Reasoning, Proprietary
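
As a rough guide to reading the score gap between adjacent rows, the sketch below converts two scores from the table into an implied head-to-head preference rate. It assumes the scores follow the conventional Elo scale (base 10, 400 points), which this page does not confirm; `implied_win_probability` is a hypothetical helper name.

```python
def implied_win_probability(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise votes won by A over B, assuming a standard Elo scale."""
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Scores taken from the table above: rank 1 (1283.0) versus rank 2 (1277.0).
p = implied_win_probability(1283.0, 1277.0)
print(f"{p:.3f}")  # about 0.509
```

Under that assumption, a 6-point gap corresponds to the higher-ranked model being preferred in only slightly more than half of direct comparisons.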