相关项目
Agent Governance Toolkit
1.1k · Python
Microsoft's AI Agent Governance Toolkit providing policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/10 OWASP Agentic Top 10.
securityevaluationpython +2
Rogue
1.0k · Python
AI Agent Evaluator and Red Team Platform. Provides systematic security evaluation and adversarial testing tools to discover and fix vulnerabilities in agent systems.
securityevaluationobservability +2
Giskard
5.3k · Python
开源 LLM Agent 评估与测试库,提供自动化模型扫描、偏见检测、性能基准测试和合规检查,帮助团队在部署前全面验证 AI Agent 质量。
evaluationtestingllm-safety +3
AgentBench
3.3k · Python
A comprehensive benchmark to evaluate LLMs as agents (ICLR 2024), covering operating systems, databases, knowledge graphs, digital card games and more.
evaluationpythonagent +1