Promptfoo

Active

GitHub · TypeScript · MIT

Description

Promptfoo is an evaluation and regression testing tool for LLM apps and agents, useful for comparing prompts, tool-call results, and model outputs over time.

Related Projects

Agenta

4.1k · TypeScript

Active

Agenta is an open-source LLMOps platform that provides a prompt playground, prompt management, LLM evaluation, and LLM observability in one place.

observability · llmops · prompt-management · +2

Giskard

5.3k · Python

Active

Giskard is an open-source evaluation and testing library for LLM agents that provides automated model scanning, bias detection, performance benchmarking, and compliance checks.

evaluation · testing · llm-safety · +3

AgentLabs

548 · TypeScript

Stale

AgentLabs is a toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.

testing · developer-tools · evaluation · +1

AWS Agent Evaluation

360 · Python

Stale

AWS Agent Evaluation is Amazon's tool for automated quality assessment of Bedrock Agents and other LLM agents, with multi-dimensional metrics and benchmarks.

aws · evaluation · benchmark · +2
