AgentList

Harbor

Active
GitHub · Python · Apache-2.0

Description

A framework for running agent evaluations and creating reinforcement learning (RL) environments to measure and improve agent performance.

Tags

evaluation benchmark rl-environments agent-testing python

Categories

📊 Observability ⚡ Agent Tools

Project Metrics

Stars 1.5k
Forks 918
Watchers 1.5k
Issues 264
Created August 4, 2025
Last commit April 17, 2026

Deployment

Local

Related Projects

AgentLabs

546 · TypeScript
Stale

AgentLabs is a toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.

testing developer-tools evaluation +1

Prompt Ops

800 · Python
Active

An open-source tool from Meta for LLM prompt optimization. It automates the continuous improvement and refinement of LLM prompts.

prompt-engineering llm tools +2

DeepEval

14.8k · Python
Active

DeepEval is an open-source evaluation framework for LLM applications. It provides rich evaluation metrics and tools, supporting unit testing and integration testing to help developers build reliable LLM applications.

llm evaluation testing +1

RouteLLM

4.8k · Python
Stale

RouteLLM is a framework for serving and evaluating LLM routers, enabling cost reduction without compromising quality through intelligent request routing across multiple model tiers.

llm-routing cost-optimization evaluation +1
AgentList

Curated directory of open-source AI agent projects

© 2026 AgentList. All rights reserved.