Provides LLM-as-judge and code-based evaluators for scoring LLM outputs, with built-in templates for hallucination, relevance, and toxicity detection.
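
To make the two evaluator styles concrete, here is a minimal Ruby sketch of the distinction. It is not this gem's API: `exact_match`, `JudgeEvaluator`, the template text, and the label-to-score mapping are all hypothetical names chosen for illustration, and the judge is a stub lambda rather than a real model call.

```ruby
# Hypothetical illustration only; none of these names come from the gem.

# Code-based evaluator: a deterministic check with no model call.
def exact_match(output:, expected:)
  output.strip.casecmp?(expected.strip) ? 1.0 : 0.0
end

# LLM-as-judge evaluator: fills a prompt template, passes it to a judge
# callable, and maps the label the judge returns to a numeric score.
class JudgeEvaluator
  # Assumed template wording, in the spirit of a hallucination check.
  HALLUCINATION_TEMPLATE = <<~PROMPT
    Reference: %<reference>s
    Answer: %<output>s
    Does the answer contain claims not supported by the reference?
    Reply with exactly one word: "factual" or "hallucinated".
  PROMPT

  DEFAULT_SCORES = { "factual" => 1.0, "hallucinated" => 0.0 }.freeze

  def initialize(judge:, template: HALLUCINATION_TEMPLATE, scores: DEFAULT_SCORES)
    @judge = judge        # any object responding to #call(prompt)
    @template = template
    @scores = scores
  end

  def call(fields)
    prompt = format(@template, fields)
    label = @judge.call(prompt).to_s.strip.downcase
    # An unrecognized label yields a nil score instead of a guess.
    { label: label, score: @scores[label] }
  end
end

# Stub judge standing in for a real LLM client.
stub_judge = ->(_prompt) { "Factual\n" }
evaluator = JudgeEvaluator.new(judge: stub_judge)
result = evaluator.call(reference: "Paris is the capital of France.",
                        output: "The capital of France is Paris.")
```

Keeping the judge as an injected callable means the same evaluator can be exercised with a stub in tests and with a real client in production.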

Required Ruby Version

>= 3.4

Authors

Matthew Iverson

Versions

  1. 0.3.13 - March 30, 2026 (21.5 KB)
  2. 0.3.11 - March 30, 2026 (21.5 KB)
  3. 0.3.10 - March 29, 2026 (21 KB)
  4. 0.3.9 - March 29, 2026 (21 KB)
  5. 0.3.8 - March 27, 2026 (20.5 KB)
The five most recent of 13 total versions are listed.
