Promptfoo
ActiveDescription
Promptfoo is an evaluation and regression testing tool for LLM apps and agents, useful for comparing prompts, tool-call results, and model outputs over time.
Promptfoo is an evaluation and regression testing tool for LLM apps and agents, useful for comparing prompts, tool-call results, and model outputs over time.
Agenta is an open-source LLMOps platform providing prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
Comprehensive benchmark for deep research agents, providing systematic evaluation framework for assessing deep research agent performance.
An open-source evaluation and testing library for LLM agents providing automated model scanning, bias detection, performance benchmarking, and compliance checks.
AgentLabs is a toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.