Harbor
Framework for running agent evaluations and creating RL environments to measure and improve agent performance.
AgentLabs
Toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.
An open-source tool from Meta that automates the continuous improvement and refinement of LLM prompts.
DeepEval
Open-source evaluation framework for LLM applications. It provides rich evaluation metrics and tooling, supporting unit and integration testing to help developers build reliable LLM applications.
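To illustrate the metric-based unit-testing pattern that DeepEval supports, here is a minimal self-contained sketch. Note this is a conceptual toy, not the deepeval API: the `TestCase` class, the keyword-coverage metric, and the threshold are all illustrative stand-ins (real frameworks typically use LLM-as-judge or embedding-based metrics).

```python
# Conceptual sketch of metric-based unit testing for LLM outputs.
# NOT the deepeval API -- names and metric are illustrative only.

from dataclasses import dataclass

@dataclass
class TestCase:
    input: str
    actual_output: str
    expected_keywords: list  # keywords the output should mention

def keyword_coverage(case: TestCase) -> float:
    """Toy metric: fraction of expected keywords present in the output."""
    hits = sum(1 for k in case.expected_keywords
               if k.lower() in case.actual_output.lower())
    return hits / len(case.expected_keywords)

def assert_passes(case: TestCase, threshold: float = 0.7) -> None:
    """Fail the test if the metric score falls below the threshold,
    mirroring how evaluation frameworks gate LLM outputs in CI."""
    score = keyword_coverage(case)
    assert score >= threshold, f"score {score:.2f} below threshold {threshold}"

case = TestCase(
    input="What is the return policy?",
    actual_output="Items can be returned within 30 days for a full refund.",
    expected_keywords=["return", "30 days", "refund"],
)
assert_passes(case)  # passes: all 3 keywords are covered
```

In a real evaluation suite, the assertion step is what lets these checks run under an ordinary test runner such as pytest.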
RouteLLM
Framework for serving and evaluating LLM routers, enabling cost reduction without compromising quality by intelligently routing requests across multiple model tiers.
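The routing idea can be sketched in a few lines. This is a conceptual illustration of cost-aware routing across model tiers, not the routellm API: the model names, prices, and length-based difficulty heuristic are all made up for the example (RouteLLM itself uses learned routers trained on preference data).

```python
# Conceptual sketch of cost-aware LLM routing (NOT the routellm API).
# Cheap requests go to a small model; hard ones go to a strong model.

from dataclasses import dataclass

@dataclass
class Route:
    model: str          # hypothetical model names, for illustration
    cost_per_1k: float  # illustrative prices, not real ones

CHEAP = Route("small-model", 0.10)
STRONG = Route("large-model", 1.00)

def difficulty(prompt: str) -> float:
    """Toy heuristic: longer, question-dense prompts score higher.
    A learned router would replace this with a trained classifier."""
    score = min(len(prompt) / 500, 1.0) + 0.2 * prompt.count("?")
    return min(score, 1.0)

def route(prompt: str, threshold: float = 0.5) -> Route:
    """Send the request to the strong tier only when predicted
    difficulty crosses the threshold; otherwise use the cheap tier."""
    return STRONG if difficulty(prompt) >= threshold else CHEAP

print(route("hi").model)                       # short prompt -> cheap tier
print(route("x" * 600 + "???").model)          # long, hard prompt -> strong tier
```

Tuning the threshold trades cost against quality: a lower threshold routes more traffic to the strong model, which is the knob router frameworks expose and evaluate.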