AgentDojo
A dynamic environment from ETH Zurich for evaluating attacks and defenses for LLM agents, providing standardized benchmarks for measuring the security of agent systems.
An easy-to-use Python framework for generating adversarial jailbreak prompts, helping researchers systematically evaluate LLM safety defenses by combining multiple attack methods.
Microsoft's open-source AI red teaming playground labs, providing infrastructure for running AI red teaming training sessions and hands-on security exercises.
Security Comprehension Awareness Measure by 1Password. An open-source benchmark testing AI agents' security awareness during realistic, multi-turn workplace tasks.
An open-source evaluation and testing library for LLM agents, providing automated model scanning, bias detection, performance benchmarking, and compliance checks.