Provides LLM-as-judge and code-based evaluators for scoring LLM outputs, with built-in templates for hallucination, relevance, and toxicity detection.
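To illustrate what a code-based evaluator can look like, here is a minimal sketch in plain Ruby. It does not use this gem's API, which is not shown on this page: the names `EvalResult` and `KeywordCoverageEvaluator`, and the label/score/explanation result shape, are assumptions made for illustration only. An LLM-as-judge evaluator would return a similar result but obtain it by prompting a model with a template rather than by running deterministic code.

```ruby
# Hypothetical sketch: these names are illustrative and are NOT taken from
# the gem's actual API.
EvalResult = Struct.new(:label, :score, :explanation, keyword_init: true)

# A deterministic, code-based evaluator: the score is the fraction of
# required keywords that appear (case-insensitively) in the output.
class KeywordCoverageEvaluator
  def initialize(keywords)
    @keywords = keywords.map(&:downcase)
  end

  def evaluate(output)
    text = output.to_s.downcase
    found = @keywords.select { |kw| text.include?(kw) }
    # With no keywords to check, treat the output as fully covered.
    score = @keywords.empty? ? 1.0 : found.size.to_f / @keywords.size

    EvalResult.new(
      label: score == 1.0 ? "pass" : "fail",
      score: score,
      explanation: "matched #{found.size} of #{@keywords.size} keywords"
    )
  end
end

evaluator = KeywordCoverageEvaluator.new(%w[Paris France])
result = evaluator.evaluate("The capital of France is Paris.")
puts "#{result.label} (#{result.explanation})"
```

Returning a small result object rather than a bare number keeps the label, numeric score, and explanation together, which makes it easier to aggregate results across many outputs.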
Required Ruby Version
>= 3.4
Authors
Matthew Iverson
Versions
- 0.3.13 March 30, 2026 (21.5 KB)
- 0.3.11 March 30, 2026 (21.5 KB)
- 0.3.10 March 29, 2026 (21 KB)
- 0.3.9 March 29, 2026 (21 KB)
- 0.3.8 March 27, 2026 (20.5 KB)