Security Guardrails

OpenShell is a safe, private runtime for autonomous AI agents, developed by NVIDIA. It provides controlled execution environments and resource management.

rust, agent, framework +2

Garak

7.6k · HTML

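Active

NVIDIA's open-source LLM vulnerability scanner. It automatically probes large language models for security vulnerabilities, hallucination tendencies, jailbreak risks, and prompt injection, and is a core tool for LLM security evaluation.

llm-security, vulnerability-scanner, llm-evaluation +2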

Portkey AI Gateway

11.4k · TypeScript

Active

Portkey AI Gateway is a high-performance AI gateway that routes to 200+ LLM providers, ships with 50+ built-in AI guardrails, and exposes a unified API.

gateway, llm-routing, guardrails +2

SWE-ReX

485 · Python

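Active

A sandboxed code execution environment for AI agents. It supports local and cloud deployment and large-scale parallel execution, giving coding agents such as SWE-agent a safe, reliable code runtime.

sandbox, code-execution, swe-agent +3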

SWE-agent

19.0k · Python

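Active

SWE-agent automatically analyzes GitHub issues and uses an LLM to generate fixes. It also supports cybersecurity auditing and competitive programming scenarios. It is the project behind a NeurIPS 2024 paper.

swe, coding, agent +2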

DeepCamera

2.7k · JavaScript

Active

Open-source AI camera skills platform, AI NVR, and CCTV surveillance. An LLM-powered agentic security camera system with pluggable AI skills. Runs on Mac Mini and AI PC.

javascript, agent, tools +3

AI-Infra-Guard

3.5k · Python

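Active

Tencent's open-source full-stack AI red-teaming platform, integrating OpenClaw security scanning, agent scanning, skills scanning, MCP scanning, AI infrastructure scanning, and LLM jailbreak evaluation.

ai-security, red-teaming, llm-security +2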

Inspect AI

1.9k · Python

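Active

An open-source LLM evaluation framework from the UK AI Safety Institute (AISI). It provides comprehensive tools for evaluating model capabilities and supports safety and alignment testing.

llm-evaluation, ai-safety, evaluation-framework +2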

Arrakis

802 · Go

Inactive

Arrakis is a fully customizable, self-hosted sandboxing solution written in Go. It is designed for AI agent code execution and provides a secure, isolated runtime environment.

go, agent, security +2

AgentShield

510 · TypeScript

Active

AI agent security scanner that detects vulnerabilities in agent configurations, MCP servers, and tool permissions. Available as a CLI, a GitHub Action, and a GitHub App integration.

typescript, security, llm +2

AgentGateway

2.4k · Rust

Active

Next-generation agentic proxy for AI agents and MCP servers. It provides unified traffic management, routing, and security controls.

rust, mcp, agent +3

OpenSandbox

10.1k · Python

Active

OpenSandbox is Alibaba's open-source sandbox runtime for AI agents, designed to be secure, fast, and scalable.

sandbox, ai-infrastructure, kubernetes +2

Archestra

3.6k · TypeScript

Active

Enterprise AI platform with guardrails, an MCP registry, a gateway, and an orchestrator for comprehensive AI agent governance and management.

typescript, mcp, security +2

UQLM

1.1k · Python

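Active

CVS Health's open-source uncertainty quantification (UQ) library for LLMs. It is used for UQ-based hallucination detection and provides confidence scoring and hallucination mitigation tools to help identify and reduce unreliable LLM output.

hallucination-detection, uncertainty-quantification, llm-evaluation +2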

E2B

11.8k · Python

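Active

E2B provides secure cloud sandboxes for AI agents, supporting code execution, file operations, and isolated compute. It is suited to serve as the execution layer for coding agents, data agents, and automation tasks.

sandbox, code-execution, security +1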

Agent Safehouse

1.6k · Shell

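Active

A local sandboxing tool for AI agents. It uses filesystem permission controls so that an agent can read and write only the files it needs, keeping local runs safe.

sandbox, security, agent-tools +2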

PentestGPT

12.7k · Python

Normal

An LLM-based automated penetration testing agent framework that uses LLMs to drive security testing and vulnerability discovery.

penetration-testing, security, llm +2

Guardrails AI

6.7k · Python

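Active

Guardrails AI adds programmable guardrails to large language models, using input/output validation, structured data extraction, and custom validators to keep LLM applications reliable and safe.

guardrails, llm-safety, validation +2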

LLM-Jailbreaks

603

Inactive

A comprehensive collection of LLM jailbreak techniques and prompts for ChatGPT, Claude, Llama, and other models, intended as a reference for LLM security research.

llm, security, prompt-engineering

Purple Llama

4.1k · Python

Active

Meta's set of tools for assessing and improving the security of large language models, including safety benchmarks, prompt injection detection, and output auditing.

security, evaluation, python +2

PyRIT

3.7k · Python

Active

The Python Risk Identification Tool for generative AI (PyRIT) is an open-source framework from Microsoft for proactively identifying risks in generative AI systems through red teaming and automated probing.

python, security, evaluation +2

Agent Governance Toolkit

1.1k · Python

Active

Microsoft's AI Agent Governance Toolkit provides policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. It covers all 10 items of the OWASP Agentic Top 10.

security, evaluation, python +2

Agentic Security

1.8k · Python

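Normal

An open-source LLM vulnerability scanner and AI red-teaming toolkit. It supports automated security fuzzing of LLM applications to detect risks such as jailbreaks, prompt injection, and adversarial attacks.

llm-security, red-teaming, llm-fuzzer +2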

Anthropic Cybersecurity Skills

5.3k · Python

Active

754 structured cybersecurity skills for AI agents, mapped to 5 frameworks: MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND, and NIST AI RMF. Works with Claude Code, Codex CLI, Cursor, Gemini CLI, and 20+ other platforms.

python, security, agent +2

OpenAI Evals

18.2k · Python

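Active

OpenAI's LLM evaluation framework. It provides a standardized benchmark registry and toolset for systematically evaluating the performance of large language models and LLM systems.

llm-evaluation, benchmark, evals +2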

AI Agents From Scratch

3.4k · JavaScript

Active

Demystify AI agents by building them yourself. Local LLMs, no black boxes, real understanding of function calling, memory, and ReAct patterns.

javascript, agent, evaluation +2

LLM Guard

2.8k · Python

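Inactive

A security toolkit for LLM interactions. It provides prompt injection detection, sensitive data redaction, and content safety auditing to secure LLM calls in production.

security, llm, python +2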

Rebuff

1.5k · TypeScript

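Inactive

A prompt injection detector for LLMs. It combines heuristic rules, vector similarity, and a language model as layered defenses to identify and block malicious prompt injection attacks.

security, llm, testing +2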

Rogue

1.0k · Python

Active

AI agent evaluator and red-team platform. It provides systematic security evaluation and adversarial testing tools to discover and fix vulnerabilities in agent systems.

security, evaluation, observability +2

Agent Scan

2.2k · Python

Active

Snyk's security scanner for AI agents, MCP servers, and agent skills. Use it to detect and fix security vulnerabilities before deployment.

python, security, mcp +2

Agentic Radar

953 · Python

Inactive

A security scanner for LLM agentic workflows. It automatically detects security vulnerabilities, prompt injection risks, and permission violations in agent pipelines before deployment.

security, agent, python +2

CodeGate

784 · Python

Inactive

Security gateway for AI coding agents that provides security protection, workspace isolation, and multiplexing. It supports Claude, Copilot, Cline, and other IDE extensions, and helps prevent sensitive data leaks and malicious prompt injections.

python, security, agent +5

ToolHive

1.7k · Go

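Active

An enterprise-grade platform for running and managing MCP servers. It offers containerized MCP server deployment with security mechanisms such as permission isolation, network policies, and resource limits, and can manage large fleets of MCP servers through Kubernetes or Docker.

mcp, tools, go +4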

Superagent

6.5k · TypeScript

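Active

Superagent is a security platform for AI applications. It provides prompt injection protection, data leak detection, and harmful output filtering, and can be embedded in any AI application.

ai-safety, guardrails, agent-tools +2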

Microsandbox

5.6k · Rust

Active

Secure, local, cross-platform, and programmable sandboxes for AI agents. Uses microVM technology to provide strict resource isolation.

rust, agent, tools +2

Prompt Injection Defenses

677

Inactive

Every practical and proposed defense against prompt injection, collected as a comprehensive reference for LLM security practitioners.

security, llm, prompt-engineering

PentAGI

15.3k · Go

Active

A fully autonomous AI agent system that performs complex penetration testing tasks using a multi-agent architecture, with support for multiple LLM providers.

security, testing, multi-agent +2

Awesome-LLM-Safety

1.8k · HTML

Normal

A curated collection of safety-related papers, articles, and resources on large language models, intended as a comprehensive reference for researchers and practitioners exploring LLM safety.

llm, security, evaluation

43 projects

HexStrike AI

Giskard

MCP Context Forge

GhidraMCP

NeMo Guardrails

OpenShell