PydanticAI Harness

Active

GitHub · Python · MIT

Description

Batteries for your Pydantic AI agent: the official harness, providing testing, evaluation, and debugging infrastructure.

Related Projects

AgentLabs

550 · TypeScript

Stale

AgentLabs is a toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.

testing · developer-tools · evaluation · +1 more

DeepEval

15.9k · Python

Active

DeepEval is an open-source evaluation framework for LLM applications. It provides rich evaluation metrics and tools, supporting unit testing and integration testing to help developers build reliable LLM applications.

llm · evaluation · testing · +1 more

Harbor

2.3k · Python

Active

A framework for running agent evaluations and creating RL environments to measure and improve agent performance.

evaluation · benchmark · rl-environments · +2 more

Prompt Ops

816 · Python

Normal

An open-source tool from Meta for LLM prompt optimization. It automates the continuous improvement and refinement of LLM prompts.

prompt-engineering · llm · tools · +2 more
