🛡️

Best Security & Guardrails Top 20

Top 20 most popular open-source Security & Guardrails projects, ranked by GitHub Stars.

SWE-agent

SWE-agent takes a GitHub issue and automatically generates fixes using your LLM of choice, also applicable to cybersecurity auditing and competitive coding. NeurIPS 2024 paper.

swecodingagentcybersecurity

OpenAI Evals

18.4k Stars

OpenAI's framework for evaluating LLMs and LLM systems, providing an open-source registry of benchmarks and tools for systematic model assessment.

llm-evaluationbenchmarkevalsred-teaming

PentAGI

16.8k Stars

Fully autonomous AI Agents system capable of performing complex penetration testing tasks using multi-agent architecture with support for multiple LLM providers.

securitytestingmulti-agentagent

PentestGPT

13.0k Stars

An automated penetration testing agentic framework powered by large language models for security testing and vulnerability discovery.

penetration-testingsecurityllmautomation

E2B

12.2k Stars

E2B provides secure cloud sandboxes for AI agents, supporting code execution, file operations, and isolated compute as an execution layer for coding and automation workflows.

sandboxcode-executionsecuritypython

Portkey AI Gateway

11.7k Stars

Portkey AI Gateway is a blazing fast AI gateway with integrated guardrails, routing to 200+ LLMs with 50+ AI guardrails through a single fast and friendly API.

gatewayllm-routingguardrailsai-safety

OpenSandbox

10.6k Stars

OpenSandbox is an open-source, secure, fast, and extensible sandbox runtime for AI agents, developed by Alibaba.

sandboxai-infrastructurekubernetessecurity

GhidraMCP

8.8k Stars

MCP server for Ghidra reverse engineering platform, enabling AI agents to autonomously perform binary analysis and vulnerability discovery.

mcpreverse-engineeringghidrasecurity

HexStrike AI

8.7k Stars

HexStrike AI is an advanced MCP server that lets AI agents autonomously run 150+ cybersecurity tools for automated pentesting, vulnerability discovery, and security research.

cybersecuritypentestingmcp-serversecurity

Garak

7.8k Stars

NVIDIA's open-source LLM vulnerability scanner that automatically detects security issues in language models including safety vulnerabilities, hallucination tendencies, jailbreak risks, and prompt injection attacks.

llm-securityvulnerability-scannerllm-evaluationred-teaming

Guardrails AI

6.8k Stars

Guardrails AI adds programmable guardrails to large language models, ensuring reliability and safety through input/output validation, structured data extraction, and custom validators.

guardrailsllm-safetyvalidationoutput-validation

Superagent

6.6k Stars

Superagent protects AI applications against prompt injections, data leaks, and harmful outputs, embedding safety directly into your app.

ai-safetyguardrailsagent-toolssecurity

Anthropic Cybersecurity Skills

6.2k Stars

754 structured cybersecurity skills for AI agents mapped to 5 frameworks: MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND and NIST AI RMF. Works with Claude Code, Codex CLI, Cursor, Gemini CLI and 20+ platforms.

pythonsecurityagenttools

NeMo Guardrails

6.1k Stars

NVIDIA NeMo Guardrails is an open-source toolkit for adding programmable guardrails to LLM-based conversational systems, supporting topic control, safety enforcement, and dialog guidance.

guardrailsllm-safetynvidiaconversational-ai

Microsandbox

6.0k Stars

Secure, local, cross-platform and programmable sandboxes for AI agents. Provides strict resource isolation using microVM technology.

rustagenttoolssecurity

OpenShell

5.8k Stars

OpenShell is the safe, private runtime for autonomous AI agents, developed by NVIDIA. Provides controlled execution environments and resource management.

rustagentframeworksecurity

Giskard

5.3k Stars

An open-source evaluation and testing library for LLM agents providing automated model scanning, bias detection, performance benchmarking, and compliance checks.

evaluationtestingllm-safetybias-detection

Purple Llama

4.2k Stars

Meta's set of tools to assess and improve LLM security, including safety benchmarks, prompt injection detection, and output auditing to help evaluate and enhance the safety of large language models.

securityevaluationpythonllm

PyRIT

3.8k Stars

The Python Risk Identification Tool for generative AI — an open-source framework by Microsoft for proactively identifying risks in generative AI systems through red teaming and automated probing.

pythonsecurityevaluationllm

MCP Context Forge

3.7k Stars

An AI Gateway, registry, and proxy by IBM that sits in front of any MCP, A2A, or REST/gRPC APIs, exposing a unified endpoint with centralized discovery, guardrails, and management.

mcpa2aapi-gatewayregistry

Agent 评估LLM 评测自动化测试

Agent Evaluation and Testing: From Vibe Checks to End-to-End Pipelines

Most teams evaluate agents by checking a few examples. Real evaluation needs layered metrics, non-rotting datasets, and judges that push back. This article provides runnable code patterns and a practical decision framework.

AI Agent安全Prompt Injection

AI Agent Security in Practice: From Prompt Injection to Defense in Depth

A systematic walkthrough of three major attack surfaces in AI agents, with practical code examples for prompt injection defense, tool permission scoping, and output filtering.

AI 编程Coding AgentCLI

AI Coding Agents Deep Dive: Architecture Trade-offs from CLI to IDE-Integrated

A deep architectural comparison of seven open-source coding agents across three paradigms — CLI-first, IDE-integrated, and fully autonomous — examining context management, tool access, and autonomy levels to help you pick the right tool for each development scenario.