TruthfulQA
TruthfulQA evaluates a model's tendency to avoid common misconceptions and answer factually when faced with misleading questions.
About this test
- What it measures: Truthfulness, meaning resistance to false beliefs and to imitating human misconceptions.
- How it was administered: Multiple-choice and free-form generation tasks built from questions designed to elicit false answers; scored with the MC1 and MC2 multiple-choice metrics and with generation metrics.
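This page does not define MC1 and MC2, so the sketch below assumes the commonly used definitions: MC1 is 1 when the single correct choice receives the highest log-probability, and MC2 is the share of probability mass (normalized over all listed choices) that falls on the true choices. The function names `mc1` and `mc2` and the input layout are hypothetical, chosen only for illustration.

```python
import math


def mc1(logprobs, correct_index):
    """Assumed MC1: 1.0 if the one correct choice has the highest log-probability."""
    best = max(range(len(logprobs)), key=lambda i: logprobs[i])
    return 1.0 if best == correct_index else 0.0


def mc2(logprobs, is_true):
    """Assumed MC2: probability mass on true choices, normalized over all choices."""
    probs = [math.exp(lp) for lp in logprobs]
    true_mass = sum(p for p, t in zip(probs, is_true) if t)
    return true_mass / sum(probs)


# Illustrative per-choice log-probabilities for one question (made-up values).
example_logprobs = [math.log(0.5), math.log(0.3), math.log(0.2)]

mc1_score = mc1(example_logprobs, correct_index=0)
mc2_score = mc2(example_logprobs, is_true=[True, False, True])
```

With these made-up inputs, `mc1_score` is 1.0 because the first choice scores highest, and `mc2_score` is roughly 0.7 because the two true choices hold 0.5 + 0.2 of the total mass. A benchmark-level score would then typically be the mean over all questions, expressed as a percentage.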
Model rankings
Models ranked by score on this benchmark. Higher is better.
| Rank | Model | Provider | Score | Percentile | Tags |
|---|---|---|---|---|---|
| 1 | | OpenAI | 79.5 | — | Text Generation, Small, Multimodal, Reasoning, Proprietary |
| 2 | | Meta | 76.4 | — | Reasoning, Large, Text Generation, Open Weight |