Garak
NVIDIA's open-source LLM vulnerability scanner that automatically detects security issues in language models, including safety vulnerabilities, hallucination tendencies, jailbreak risks, and prompt injection attacks.
Tencent's full-stack AI red teaming platform integrating OpenClaw security scanning, agent scanning, skills scanning, MCP scanning, AI infrastructure scanning, and LLM jailbreak evaluation.
An open-source LLM vulnerability scanner and AI red teaming kit for automated security fuzzing of LLM applications, detecting jailbreaks, prompt injection, and adversarial attacks.
OpenAI's framework for evaluating LLMs and LLM systems, providing an open-source registry of benchmarks and tools for systematic model assessment.
EleutherAI's framework for few-shot evaluation of language models; it provides standardized evaluation pipelines supporting hundreds of benchmark tasks and is widely adopted as a core LLM evaluation tool in the community.