Coval
Coval is an evaluation tool for voice and conversational agents that helps teams test response quality, interaction stability, and realistic dialog behavior.
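A common pattern behind this kind of testing is simulated dialog: a scripted "user" drives the agent turn by turn while assertions check each reply. The sketch below uses a placeholder agent and hypothetical test script, not Coval's actual API:

```python
# Minimal sketch of simulated-dialog testing for a conversational agent.
# `agent_reply` is a stand-in for the system under test.

def agent_reply(user_turn: str) -> str:
    # Placeholder agent: a real test would call the deployed voice agent.
    if "book" in user_turn:
        return "Sure, what date works for you?"
    return "Could you tell me more?"

script = [
    ("I want to book a table.", "date"),  # (user turn, expected keyword)
    ("Something random.", "more"),
]

for turn, expected in script:
    reply = agent_reply(turn)
    assert expected in reply.lower(), f"unexpected reply: {reply!r}"
print("all scripted turns passed")
```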
An open-source evaluation and testing library for LLM agents, providing automated model scanning, bias detection, performance benchmarking, and compliance checks.
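To illustrate what automated bias detection involves, here is a minimal sketch: counterfactual prompt pairs probe whether the agent's behavior changes when only a demographic term changes. The `agent` function and the probes are hypothetical stand-ins, not the library's API:

```python
# Minimal sketch of an automated bias check for an LLM agent.
# A real scanner would generate these probes and score responses itself.

def agent(prompt: str) -> str:
    # Placeholder for the model under test (deliberately biased here).
    return "approved" if prompt.startswith("He ") else "denied"

# Counterfactual prompt pairs: identical except for a demographic term.
probe_pairs = [
    ("The male engineer asked for a loan.", "The female engineer asked for a loan."),
    ("He applied as a nurse.", "She applied as a nurse."),
]

# Flag pairs where the agent's output changes with the demographic term.
flagged = [(a, b) for a, b in probe_pairs if agent(a) != agent(b)]
print(f"{len(flagged)}/{len(probe_pairs)} probes show divergent behavior")
```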
An automatic prompt optimization framework by Salesforce AI Research that leverages LLMs to search for and refine prompts for improved model performance.
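The core idea of LLM-driven prompt search can be sketched as a simple loop: score candidate prompts, ask a proposer model to rewrite the best one, and repeat. The scorer and proposer below are toy stand-ins, not the framework's actual components:

```python
import random

# Toy sketch of LLM-in-the-loop prompt search. A real framework would
# score candidates against a dev set and use an LLM as the proposer.

def score(prompt: str) -> float:
    # Stand-in scorer: reward prompts that ask for a bare numeric answer.
    return sum(word in prompt for word in ("answer", "only", "number")) / 3

def propose_rewrites(prompt: str, n: int = 3) -> list[str]:
    # Stand-in for an LLM rewriting the prompt based on failures.
    edits = ["answer", "only", "number", "briefly"]
    return [prompt + " " + random.choice(edits) for _ in range(n)]

best = "Solve the problem."
for _ in range(5):  # a few rounds of search
    candidates = [best] + propose_rewrites(best)
    best = max(candidates, key=score)

print(best, score(best))
```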
An open-source AI training tracking and visualization tool with a modern design. Supports PyTorch, Transformers, and more, letting teams monitor and evaluate AI agent training runs.
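Experiment trackers of this kind generally follow an init-log-finish pattern around the training loop. The `Tracker` class below is a self-contained local stand-in for that pattern, not any particular tool's API:

```python
import json
import time

# Sketch of the experiment-tracking pattern: start a run, log scalar
# metrics per step, then persist the run for a UI to visualize.

class Tracker:
    def __init__(self, project: str):
        self.run = {"project": project, "start": time.time(), "metrics": []}

    def log(self, step: int, **metrics: float):
        self.run["metrics"].append({"step": step, **metrics})

    def finish(self, path: str = "run.json"):
        with open(path, "w") as f:
            json.dump(self.run, f, indent=2)

tracker = Tracker(project="demo")
for step in range(100):
    loss = 1.0 / (step + 1)   # stand-in for a real training loss
    tracker.log(step, loss=loss)
tracker.finish()              # a dashboard would visualize run.json
```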
A comprehensive benchmark to evaluate LLMs as agents (ICLR 2024), covering operating systems, databases, knowledge graphs, digital card games and more.
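Agent benchmarks spanning heterogeneous environments typically wrap each one in a common observe/step/done interface and score task success under a step cap. The toy environment and policy below illustrate that harness loop only; they are not the benchmark's real interface:

```python
# Sketch of a generic agent-benchmark loop over a common env interface.

class CountdownEnv:
    """Toy environment: the agent must count the state down to zero."""
    def __init__(self, start: int = 3):
        self.state = start
    def observe(self) -> str:
        return f"counter={self.state}"
    def step(self, action: str) -> None:
        if action == "decrement":
            self.state -= 1
    def done(self) -> bool:
        return self.state <= 0

def agent_policy(observation: str) -> str:
    # Stand-in for an LLM choosing an action from the observation.
    return "decrement"

env = CountdownEnv()
steps = 0
while not env.done() and steps < 10:  # step cap, as benchmarks enforce
    env.step(agent_policy(env.observe()))
    steps += 1
print("success" if env.done() else "failure", f"in {steps} steps")
```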