DeepEval
DeepEval is an open-source evaluation framework for LLM applications. It provides rich evaluation metrics and tools, supporting unit testing and integration testing to help developers build reliable LLM applications.
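The unit-testing idea behind DeepEval can be sketched in plain Python: score an LLM output with a metric and assert it clears a threshold. The keyword-overlap metric and the 0.5 threshold below are illustrative stand-ins, not DeepEval's actual API.

```python
import re

def overlap_score(question: str, answer: str) -> float:
    """Toy relevance metric: fraction of question keywords found in the answer."""
    q = {w for w in re.findall(r"[a-z]+", question.lower()) if len(w) > 3}
    a = set(re.findall(r"[a-z]+", answer.lower()))
    return len(q & a) / len(q) if q else 0.0

def test_answer_relevance():
    # Unit-test style: the evaluation is a pass/fail assertion on a metric,
    # so it can run in CI like any other test.
    question = "What metrics does the framework provide?"
    answer = "The framework does provide many evaluation metrics."
    assert overlap_score(question, answer) >= 0.5

test_answer_relevance()
```

In DeepEval itself the metric would typically be LLM-judged rather than keyword-based, but the test harness shape is the same.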
Ragas is a framework for evaluating RAG (Retrieval-Augmented Generation) systems. It provides evaluation metrics including faithfulness, answer relevancy, and context precision, helping developers optimize RAG application performance.
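To make the context-precision idea concrete, here is a toy computation in plain Python (not the Ragas API): given retrieved chunks in ranked order, each labeled relevant or not, average the precision@k at every rank that holds a relevant chunk. Retrievers that rank relevant context higher score closer to 1.0.

```python
def context_precision(relevant_flags):
    """Toy context precision over ranked retrieved chunks.

    relevant_flags[i] is True if the chunk at rank i (0-based)
    was actually useful for answering the question.
    """
    if not any(relevant_flags):
        return 0.0
    score, hits = 0.0, 0
    for k, rel in enumerate(relevant_flags, start=1):
        if rel:
            hits += 1
            score += hits / k  # precision@k at each relevant rank
    return score / hits

# Ranking relevant chunks first scores higher:
context_precision([True, True, False])   # -> 1.0
context_precision([False, True, True])   # -> ~0.58
```

Ragas computes the relevance judgments themselves with an LLM; this sketch only shows how those judgments aggregate into a ranking-sensitive score.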
AgentLabs is a toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.
An open-source tool from Meta for LLM prompt optimization that automates the process of continuously improving and refining prompts.
PromptTools provides open-source tools for prompt testing and experimentation, supporting multiple LLMs (OpenAI, LLaMA) and vector databases (Chroma, Weaviate, LanceDB) to help developers systematically evaluate and optimize RAG systems.
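The systematic-experimentation workflow such tools support amounts to running every prompt variant against every model and tabulating the results for side-by-side comparison. A minimal plain-Python sketch of that grid (the function names and the stub generator are illustrative, not the PromptTools API):

```python
from itertools import product

def run_experiment(prompts, models, generate):
    """Run every prompt against every model; collect one row per combination."""
    return [
        {"model": m, "prompt": p, "output": generate(m, p)}
        for m, p in product(models, prompts)
    ]

# Stub standing in for real API calls to each provider:
def fake_generate(model, prompt):
    return f"[{model}] response to: {prompt}"

rows = run_experiment(
    prompts=["Summarize RAG.", "Define faithfulness."],
    models=["gpt-4", "llama-2"],
    generate=fake_generate,
)
# 2 models x 2 prompts -> 4 rows to compare side by side
```

In practice each row would also carry metric scores (latency, cost, an evaluation score), which is what turns the grid into a decision table for picking a prompt/model pair.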