AgentList

pytest-evals

Stale
GitHub · Jupyter Notebook · MIT

Description

A pytest plugin for running and analyzing LLM evaluation tests, enabling systematic validation of AI agent performance.
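To illustrate the run-then-analyze pattern that this kind of plugin structures, here is a minimal sketch in plain Python; the `classify` stub, the test cases, and the accuracy threshold are assumptions for the example, not pytest-evals' actual API.

```python
# Sketch of an LLM-eval flow: score each case, then assert on the aggregate.
# `classify` stands in for a real LLM call; all names here are illustrative.

CASES = [
    {"text": "refund my order", "label": "support"},
    {"text": "what are your prices", "label": "sales"},
    {"text": "cancel my subscription", "label": "support"},
]

def classify(text: str) -> str:
    # Stub "model": keyword rules in place of a real LLM call.
    return "support" if any(w in text for w in ("refund", "cancel")) else "sales"

def run_eval(cases):
    # Phase 1: collect a per-case record for later analysis.
    return [{"case": c, "correct": classify(c["text"]) == c["label"]} for c in cases]

def analyze(results, threshold=0.8):
    # Phase 2: pass or fail on the aggregate metric, not each case alone,
    # since individual LLM outputs are noisy but the overall rate should hold.
    accuracy = sum(r["correct"] for r in results) / len(results)
    return accuracy >= threshold, accuracy

ok, acc = analyze(run_eval(CASES))
print(f"accuracy={acc:.2f} pass={ok}")
```

Separating case collection from aggregate analysis is what distinguishes an eval suite from ordinary unit tests: a single wrong answer does not fail the build, but a drop below the threshold does.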

Tags

evaluation testing llm pytest python

Categories

🛡️ Security & Guardrails

Project Metrics

Stars 159
Forks 4
Watchers 159
Issues 2
Created January 13, 2025
Last commit February 5, 2025

Deployment

Local

Related Projects

PyRIT

3.7k · Python
Active

The Python Risk Identification Tool for generative AI — an open-source framework by Microsoft for proactively identifying risks in generative AI systems through red teaming and automated probing.

python · security · evaluation +2

Giskard

5.3k · Python
Active

An open-source evaluation and testing library for LLM agents, providing automated model scanning, bias detection, performance benchmarking, and compliance checks.

evaluation · testing · llm-safety +3

Purple Llama

4.1k · Python
Active

Meta's set of tools to assess and improve LLM security, including safety benchmarks, prompt injection detection, and output auditing to help evaluate and enhance the safety of large language models.

security · evaluation · python +2

LLM Guard

2.9k · Python
Stale

The security toolkit for LLM interactions, providing prompt injection detection, PII anonymization, content safety auditing, and more to secure production LLM deployments.

security · llm · python +2
AgentList

Curated directory of open-source AI agent projects

Quick Links

  • Project List
  • Featured Articles
  • Browse Categories

Contact

  • About
  • Privacy Policy
  • Contact Us

© 2026 AgentList. All rights reserved.

Made with ♥ for the open source community