Promptfoo

Active

GitHub TypeScript MIT

Description

Promptfoo is an evaluation and regression testing tool for LLM apps and agents, useful for comparing prompts, tool-call results, and model outputs over time.

Related Projects

Agenta

4.2k · TypeScript

Active

Agenta is an open-source LLMOps platform that provides a prompt playground, prompt management, LLM evaluation, and LLM observability in one place.

observability, llmops, prompt-management +2

Deep Research Bench

738 · Python

Active

A comprehensive benchmark for deep research agents that provides a systematic framework for evaluating their performance.

benchmark, evaluation, deep-research +2

Giskard

5.4k · Python

Active

An open-source evaluation and testing library for LLM agents that provides automated model scanning, bias detection, performance benchmarking, and compliance checks.

evaluation, testing, llm-safety +3

AgentLabs

550 · TypeScript

Stale

AgentLabs is a toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.

testing, developer-tools, evaluation +1
