EasyJailbreak
NormalDescription
An easy-to-use Python framework for generating adversarial jailbreak prompts, helping researchers systematically evaluate LLM safety defenses with multiple attack method combinations.
An easy-to-use Python framework for generating adversarial jailbreak prompts, helping researchers systematically evaluate LLM safety defenses with multiple attack method combinations.
An automated LLM fuzzing tool by CyberArk that helps developers and security researchers identify and mitigate jailbreak vulnerabilities in LLM APIs with multiple attack vectors.
A dynamic environment by ETH Zurich to evaluate attacks and defenses for LLM agents, providing standardized benchmarks for measuring agent system security.
Microsoft's open-source AI red teaming playground labs with infrastructure for running AI red teaming trainings and hands-on security exercises.
Bag of Tricks for benchmarking jailbreak attacks on LLMs. NeurIPS 2024 paper providing empirical tricks for LLM jailbreaking with standardized evaluation.