53AI Hub
Open-source platform for quickly building operational AI portals that launch agents, prompts, and tools. Integrates with Coze and Dify.
Development tools and libraries for agent systems
Open-source deep research agent from Alibaba Tongyi Lab, using multi-stage iterative information retrieval and reasoning to conduct deep analysis, synthesis, and summarization of complex topics with web search and document analysis.
Get 10X more out of Claude Code, Codex or any coding agent. Manage agent tasks through kanban boards, track progress, and optimize workflows.
BAML is an AI framework that adds engineering rigor to prompt engineering, offering type-safe prompt definitions, automatic testing, version management, and multi-model support across Python, TypeScript, Ruby, Java, C#, Rust, and Go.
Dynamically convert OpenAPI specs into AI agent tools for automatic API-to-tool transformation.
New API is a unified AI model hub for aggregation and distribution, supporting cross-conversion of various LLMs into OpenAI, Claude, or Gemini-compatible formats. A centralized gateway for personal and enterprise model management.
MCP server providing Chrome DevTools capabilities to coding agents, enabling web debugging, performance analysis, and DOM manipulation automation.
Fast, flexible LLM inference engine built in Rust — supports multiple model architectures and quantization schemes for high-performance local LLM deployment.
Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider, using Stream's edge network for ultra-low latency realtime interactions.
Curated catalog of must-have external toolkits to integrate with AI agents built with Python agent frameworks.
Safe local execution layer for AI agent tools to build, validate, and publish MCP tools with a no-password secure runtime.
Full toolkit for running an AI agent service built with LangGraph, FastAPI, and Streamlit, providing a complete reference architecture for agent service deployment.
MCP server for Ghidra reverse engineering platform, enabling AI agents to autonomously perform binary analysis and vulnerability discovery.
AnythingLLM is an all-in-one AI productivity app with a self-hosted chat UI, RAG knowledge base, AI agents, and multi-model management, privacy-first with zero configuration.
A deep research agent framework optimized for complex research and prediction tasks, with MiroThinker-1.7 and MiroThinker-H1 models achieving 74.0 and 88.2 on BrowseComp benchmark, supporting multi-step reasoning and information retrieval.
A zero-code platform for auto-generating production-grade AI agents using Harness Engineering principles, unifying tools, skills, memory, and orchestration with built-in constraints and feedback loops.
An open-source library by NVIDIA for efficiently connecting and optimizing teams of AI agents with orchestration, tool calling, and workflow management.
NVIDIA NeMo Guardrails is an open-source toolkit for adding programmable guardrails to LLM-based conversational systems, supporting topic control, safety enforcement, and dialog guidance.
An end-to-end RL training framework by NVIDIA for orchestrating tools and agentic workflows. Optimizes multi-step agent decision-making and tool-use policies.
An attempt to engineer prompts that help us understand AI agents. Research into agent reasoning mechanisms through prompt engineering.
Open-source AI agent desktop app for Windows and macOS with one-click install of Claude Code, MCP tools, and Skills, featuring sandbox isolation, multi-model support, and Feishu/Slack integration.
A spec-driven development workflow MCP server for AI-assisted software development, featuring a real-time web dashboard and VSCode extension for monitoring and managing project progress in AI coding workflows.
Official Polymarket autonomous trading AI agents that automatically make and execute trading decisions in prediction markets.
Portkey AI Gateway is a blazing fast AI gateway with integrated guardrails, routing to 200+ LLMs with 50+ AI guardrails through a single fast and friendly API.
A curated list of awesome LLM and AI Agent Skills, resources and tools for customising AI Agent workflows. Works with Claude Code, Codex, Gemini CLI and custom agents.
Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more.
Middleware providing an OpenAI-compatible API endpoint that bridges MCP tools to any client or framework supporting the OpenAI API format.
Your AI agent skills, finally organized — a macOS app to browse, edit, and manage skills across Claude Code, Cursor, Codex, Windsurf, and Amp.
An integrated platform for AI agent tool management and security with tool registration, access control, and audit trails.
Arrakis is a fully customizable and self-hosted sandboxing solution written in Go, designed specifically for AI agent code execution scenarios, providing a secure isolated runtime environment.
AG-UI is the open-source implementation of the Agent-User Interaction Protocol, defining a standardized interaction protocol between AI agents and frontend applications, initiated by the CopilotKit team.
Enterprise-ready MCP Gateway and Registry that centralizes AI development tools with secure OAuth authentication, dynamic tool discovery, and unified access with Keycloak/Entra integration.
AgentLabs is a toolkit for agent development and testing, focused on experimentation, replay, and workflow support to improve iteration speed.
A simple, open format for guiding coding agents. Define agent behavior, rules, and skills through structured AGENTS.md files to help AI coding assistants better understand project requirements.
An Agent Development Kit providing core abstractions and tools for building enterprise-grade AI agents with multiple LLM backends, tool use, and workflow orchestration.
Realtime Voice AI on Arduino ESP32 with 100+ Voice AI Models for AI Toys, Companions, and Devices. Supports OpenAI Realtime, Gemini, Grok, and ElevenLabs.
OpenSandbox is an open-source, secure, fast, and extensible sandbox runtime for AI agents, developed by Alibaba.
An AI-powered research assistant web UI that performs iterative, deep research on any topic by combining search engines with LLM reasoning.
Official Python SDK from Anthropic for building Claude-powered AI agents with tool use, multi-turn conversations, and agent orchestration.
A web scraping and browser automation library for Node.js to build reliable crawlers, supporting Puppeteer, Playwright, Cheerio, and raw HTTP. Extract data for AI, LLMs, RAG, or GPTs with proxy rotation and both headful and headless modes.
A CLI tool for code structural search, lint, and rewriting based on AST. Written in Rust, supports 20+ languages, providing precise code pattern matching for AI coding agents.
An AI Agent assistant that integrates multiple IM platforms, LLMs, plugins and AI features, supporting QQ, Telegram, Discord and more.
AI agent tooling for data engineering workflows, providing intelligent agent-assisted capabilities for data processing pipelines.
Official AWS Python SDK for building AI agents on Amazon Bedrock with lifecycle management, tool integration, memory, and audit trails.
Amazon Bedrock AgentCore samples that accelerate AI agents into production with scale, reliability, and security for real-world deployment.
Pi Mono is a comprehensive AI agent toolkit including a coding agent CLI, unified LLM API, TUI and web UI libraries, Slack bot, and vLLM pod management for end-to-end agent development.
LiteLLM provides a unified interface and proxy gateway for LLM calls, simplifying multi-model switching, routing, and cost control.
Blaxel AI SDK is a production-focused toolkit for agent systems, emphasizing tool definitions, execution control, tracing, and service integrations for enterprise apps.
Model Context Protocol server for searching and analyzing arXiv papers, enabling AI agents to retrieve and deeply analyze academic research.
Conversational voice AI agents platform for building natural language phone interactions with multilingual speech synthesis and real-time dialogue management.
An MCP integration for Roblox Studio that enables AI agents to participate in game-development workflows, resource editing, and automation.
Botpress is an open-source conversational AI platform with a visual flow editor, knowledge base integration, multi-channel deployment, and GPT/LLM agent building capabilities for enterprise chatbot development.
An open-source long-horizon SuperAgent harness by ByteDance that researches, codes, and creates with sandboxes, memories, tools, skills, subagents and message gateway for complex tasks.
CLI to control iOS and Android devices for AI agents, enabling coding agents to directly interact with mobile devices for testing and automation.
Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language and get accurate SQL, charts, and BI insights. Supports 12+ data sources and any LLM.
An open-source implementation of Programmatic Tool Calling that demonstrates how agents can execute code and invoke tools through MCP-style mechanisms.
An agentic workflow tool for OpenCode that provides context-engineering support to help coding agents organize project knowledge.
Composio is a tools and SaaS integration layer for agents, helping applications connect quickly to services like Gmail, Slack, and GitHub for multi-tool workflows.
DeepEval is an open-source evaluation framework for LLM applications. It provides rich evaluation metrics and tools, supporting unit testing and integration testing to help developers build reliable LLM applications.
Contextal is a context management and retrieval-enhancement tool for multi-turn agents, long conversations, and complex knowledge injection workflows.
The open agent control plane that governs autonomous AI agents with pre-execution policy enforcement, approval gates, and audit trails. Works with LangChain, CrewAI, MCP, and more.
A curated collection of AI tools, utilities, and resources for developers and creators building agent-powered applications.
Crawl4AI is a web crawling toolkit for LLM and agent systems, offering structured extraction, site traversal, cleanup, and crawl controls for external knowledge acquisition.
CrewAI Tools provides reusable integrations for the CrewAI ecosystem, including search, scraping, database access, and code execution to extend multi-agent workflows quickly.
A beautiful Ruby API for OpenAI, Anthropic, Gemini, Azure, Ollama, and more. Built-in agents, chat, vision, audio, tools, streaming, and Rails integration.
CVS Health's open-source uncertainty quantification library for language models, providing UQ-based hallucination detection with confidence scoring and mitigation tools to identify and reduce unreliable LLM outputs.
Development environments for coding agents. Enable multiple agents to work safely and independently with your preferred stack. Provides isolated development environments to avoid conflicts and improve collaboration.
A frontier, first-principles handbook for moving beyond prompt engineering to the wider discipline of context design, orchestration, and optimization — inspired by Karpathy and 3Blue1Brown.
Daytona provides secure development-environment infrastructure for coding agents and automation workflows, serving as a runtime base for remote execution tasks.
Claude Code Router is a model routing tool for coding-agent scenarios, unifying requests across providers to optimize cost, latency, and task-specific routing strategies.
Curated collection of system prompts for top AI tools, aimed at AI agent builders and prompt engineers. Includes ChatGPT, Claude, Perplexity, Manus, Claude Code, Lovable, v0, Grok, Same.new, Windsurf, Notion, and Meta AI.
Open-source all-in-one AI productivity platform combining a generalist AI agent, workflow engine, instant messaging, and online documents.
E2B provides secure cloud sandboxes for AI agents, supporting code execution, file operations, and isolated compute as an execution layer for coding and automation workflows.
Official Python SDK for ElevenLabs voice AI services — text-to-speech, voice cloning, real-time streaming, and Conversational AI agents.
Sandbox your local AI agents so they can only read and write what they need. File system permission control for secure local agent execution.
FastRTC is a developer tool for real-time multimodal and voice applications, useful as a communication layer for low-latency agent conversations and interactive audio/video workflows.
An AI agent that automates the job application process, analyzing job requirements and tailoring applications for personalized mass submission.
Build production-ready agentic workflows with natural language, supporting browser automation, computer use, and RAG workflows.
Visual workflow builder for AI agents powered by Firecrawl - drag-and-drop web scraping pipelines with real-time execution. Build agent workflows without coding.
AI agent and animation engine powered by Large Language Models for creating interactive animations and visual content.
Graphiti is a temporal knowledge-graph engine for agent memory, helping systems continuously accumulate long-term context.
Phantom is an AI co-worker with its own computer, featuring self-evolving capabilities, persistent memory, and MCP server support, autonomously completing complex tasks like a virtual colleague.
GitHub Copilot CLI brings the power of Copilot coding agent directly to your terminal. Supports code generation, command suggestions, error fixing and more.
Multi-platform SDK for integrating GitHub Copilot Agent into apps and services. Supports multiple programming languages and platforms through a unified Agent API.
A Python library by Google for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization, designed for data annotation and knowledge extraction workflows.
Gradio Agents is Gradio's interaction-layer toolkit for agent interfaces, helping developers build demoable and testable agent UIs for prototyping and human-in-the-loop workflows.
AI Agent Tools library for the Graphlit Platform providing knowledge retrieval and content processing capabilities for Python agents.
An automated penetration testing agentic framework powered by large language models for security testing and vulnerability discovery.
Guardrails AI adds programmable guardrails to large language models, ensuring reliability and safety through input/output validation, structured data extraction, and custom validators.
AI Agent Gateway to install MCP servers and skills once and share across all AI agents with unified tool management.
Framework for running agent evaluations and creating RL environments to measure and improve agent performance.
Model Context Protocol server for Excel file manipulation, enabling AI agents to read, create, and modify spreadsheets.
PromptTools provides open-source tools for prompt testing and experimentation, supporting multiple LLMs (OpenAI, LLaMA) and vector databases (Chroma, Weaviate, LanceDB) to help developers systematically evaluate and optimize RAG systems.
Helicone is an open-source proxy and observability platform for LLM applications, offering request tracing, caching, and cost analytics.
LLM Agent framework within ComfyUI integrating MCP server, TTS, OCR, GraphRAG, and other AI tool nodes for visual workflow building.
An open-source AI Voice Agent that integrates with Asterisk/FreePBX using AudioSocket/RTP technology for low-latency AI-powered phone interactions.
smolagents is a lightweight agent framework from Hugging Face for quickly building tool-using LLM agents.
Build local voice agents with open-source models. An end-to-end speech-to-speech pipeline from Hugging Face for fully local voice AI agent deployment.
Inngest Agent Kit is a TypeScript toolkit for agent development that combines step orchestration, tool calling, streaming execution, and event-driven workflows for production tasks.
A Postgres-based backend platform built for coding agents, combining auth, storage, compute, hosting, and an AI gateway for rapid app development.
Instructor is a Python library providing structured outputs for LLMs using Pydantic models, enabling AI agents to receive reliable typed responses — a key building block for agent tool-use.
Meta-project for the AI agent tooling ecosystem integrating Mulch, Seeds, Canopy, and Overstory agent tools.
Jina AI Serve is a cloud-native framework for building multimodal AI applications, supporting RAG pipelines, agent systems, and multimodal search.
Dynamic AI agent automation platform with multi-provider orchestration, adaptive memory, smart features, and a versatile plugin system.
An AI-native proxy and data plane for agentic apps with built-in orchestration, safety, observability, and smart LLM routing so developers can focus on agent core logic.
Polyglot document intelligence framework with a Rust core, extracting text, metadata, and structured data from PDFs, Office documents, images and 91+ formats via MCP server, CLI, and REST API.
A flexible framework for experimenting with heterogeneous LLM inference and fine-tuning optimizations. Runs large language models efficiently on consumer hardware with kernel-level optimizations.
LangMem is LangChain's memory layer for agents, helping developers add long-term memory, replay summaries, and context management to improve multi-turn performance.
Letta (formerly MemGPT) is an open-source framework for building stateful AI agents with advanced reasoning and transparent long-term memory. It allows you to visually test, debug, and observe agents.
Open source real-time audio/video infrastructure for AI agents. WebRTC transport, agent framework, SIP telephony, and real-time transcription.
RouteLLM is a framework for serving and evaluating LLM routers, enabling cost reduction without compromising quality through intelligent request routing across multiple model tiers.
Go implementation of the Model Context Protocol SDK enabling seamless integration between LLM applications and external data sources and tools.
Automatically generate demo applications using LLMs. Describe your idea and get an interactive prototype in Streamlit or Gradio format.
Mem0 TS is the TypeScript version of Mem0, offering long-term memory management, preference extraction, and context compression for agent applications built in JS/TS stacks.
Mem0 is a long-term memory layer for AI agents, supporting cross-session memory management and personalized context retrieval.
An open-source tool from Meta for LLM prompt optimization. Automates the process of continuously improving and refining LLM prompts.
Agent Lightning is Microsoft's open-source training framework for AI agents, using reinforcement learning to enhance agent capabilities.
Microsoft AI call center solution. Place phone calls from an AI agent via an API call, or call the bot directly at a configured phone number.
Playwright MCP is a Microsoft MCP server exposing Playwright browser automation capabilities to AI agents, supporting web interaction, screenshots, and structured data extraction.
MindsDB is a query engine for AI analytics that enables building self-reasoning agents across live data, connecting diverse data sources with AI models.
Mintlify is a developer documentation and AI-search platform that gives agent toolchains, SDKs, and APIs a structured knowledge surface for both humans and assistants.
High-performance in-browser LLM inference engine — run large language models directly in the browser using WebGPU, no server-side computation needed.
Model Context Protocol server for mobile automation and scraping on iOS, Android, emulators, simulators, and real devices.
A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems. Use when building, optimizing, or debugging agent systems.
Extract and convert data from any document (PDFs, images, Word, PPT, URLs) into multiple formats including Markdown, JSON, and CSV.
All-in-one AI framework for semantic search, LLM orchestration, and language model workflows with agent support, RAG, and a vector database.
Universal skills loader for AI coding agents. One-command installation of skill packages. Extends agent capabilities with code review, test generation, documentation writing and more.
OctoTools is an agentic framework with extensible tools for complex reasoning, featuring a tool card system for flexible composition of diverse reasoning capabilities.
OpenCompass is a comprehensive LLM evaluation platform supporting a wide range of models including Llama, Mistral, GPT-4, Qwen, GLM, and Claude across 100+ benchmark datasets.
GitAgent is a framework-agnostic, git-native standard for defining AI agents where identity, rules, memory, tools, and skills are version-controlled files in a Git repository, enabling reproducible and collaborative agent development.
A customer service demo built with the OpenAI Agents SDK, demonstrating tool use, context management, and intelligent customer support workflows.
OpenAI's framework for evaluating LLMs and LLM systems, providing an open-source registry of benchmarks and tools for systematic model assessment.
A financial data platform for analysts, quants and AI agents, providing comprehensive financial data access across stocks, crypto, economics and more.
An autonomous LLM agent framework from the OpenBMB team for complex task solving with automatic task decomposition, tool usage, and multi-step reasoning.
OpenClaw is an open-source personal AI assistant platform supporting 25+ messaging channels (WhatsApp, Telegram, Slack, etc.) with multi-LLM integration and personal knowledge management.
Open Interpreter is a natural language interface for computers that lets LLMs run code locally to perform file operations, data analysis, and system management tasks.
OpenOperator is an open-source agent project for computer and browser control, focused on GUI automation, task execution, and human-in-the-loop workflows.
OpenRouter Agents is OpenRouter's platform capability for multi-model agent use cases, focused on routing, tool calling, and unified access layers.
Context management for Claude Code with hooks for state maintenance via ledgers and handoffs. Enables MCP execution without context pollution and agent orchestration with isolated context windows for long-running conversations.
In-depth tutorials on LLMs, RAG, and real-world AI agent applications. Rich notebook examples for learning AI engineering practices.
Pipecat is an open-source framework for voice and multimodal conversational AI, enabling real-time voice assistants, video bots, and multimodal agents with integrated TTS, STT, and LLM services.
An LLM-based data-analysis agent for dbt projects that automates exploration of models and project structure via a remote MCP server.
AI agent tooling for Python data science workflows providing agents with data analysis and visualization capabilities.
Home of the AI workforce featuring multi-agent systems, AI agents, and tools for building autonomous AI workflows in enterprises.
AgentGPT is a platform for assembling, configuring, and deploying autonomous AI Agents in your browser, allowing users to create goal-driven agents that execute tasks autonomously.
Run coding agents in sandboxes. Control them over HTTP. Supports Claude Code, Codex, OpenCode, and Amp with isolated execution environments.
An open-source AI coworker with persistent memory, supporting multi-turn conversations and context retention for knowledge management and collaborative task completion.
An ICLR 2024 Spotlight LM-based emulation framework for identifying the risks of LM agents with tool use, helping discover safety issues in tool-using agents.
Self-hosted, open-source AI gateway providing one API for 20+ LLM providers, databases, and files with integrated RAG, voice, and guardrails.
AI-based Python scraper that uses LLMs and knowledge graphs to automatically build web data extraction pipelines.
All-in-one LLM CLI tool with Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, supporting OpenAI, Claude, Gemini, Ollama, Groq, and more.
LLM is Simon Willison's open-source CLI and plugin framework for working with multiple models through one interface, with embeddings, templates, tool extensions, and lightweight agent workflows.
Open-source agentic framework that uses computers like a human, capable of completing complex GUI tasks with autonomous learning and experience accumulation.
A Burp Suite extension that adds MCP tooling, AI-assisted analysis, privacy controls, and passive or active scanning to security testing workflows.
A general-purpose biomedical AI agent from Stanford for autonomous bioinformatics analysis, literature search, and scientific reasoning.
One API is an LLM API management and redistribution system that unifies OpenAI, Azure, Anthropic Claude, Google Gemini, DeepSeek, and more under a single API. Supports key management, redistribution, and one-click Docker deployment.
MCP Sequential Thinking server that recommends the most effective MCP tools at each reasoning stage, enhancing AI agent tool selection.
Community edition of Spring AI Playground providing a safe local execution layer for AI agent tools, used to build and validate MCP tools.
bolt.diy is an open-source platform to prompt, run, edit, and deploy full-stack web applications using any LLM you want, providing a visual development environment for AI-powered app creation.
Security gateway for AI coding agents providing security protection, workspace isolation, and multiplexing, supporting Claude, Copilot, Cline, and other IDE extensions to prevent sensitive data leaks and malicious prompt injections.
An enterprise-grade platform for running and managing MCP servers with containerized deployment, security isolation, network policies, resource limits, and unified management of large-scale MCP server fleets via Kubernetes or Docker.
HELM (Holistic Evaluation of Language Models) is Stanford CRFM's open-source framework for holistic, reproducible, and transparent evaluation of foundation models including LLMs and multimodal models.
macOS CLI and MCP server enabling AI agents to capture screenshots with optional visual question answering via AI models.
Superagent protects AI applications against prompt injections, data leaks, and harmful outputs, embedding safety directly into your app.
Collection of Apple-native tools for the Model Context Protocol, giving AI agents access to macOS system features like Notes, Calendar, Reminders, and more.
Library to expose FastAPI endpoints as Model Context Protocol tools with authentication support, enabling AI agents to call existing APIs directly.
TanStack Store is a lightweight state-management tool that works well for agent UIs, workflow frontends, and real-time consoles that need to manage agent state and event flows.
Official Taskade MCP server and OpenAPI to MCP codegen for building AI agent tools from any OpenAPI specification.
TensorZero is an open-source inference gateway and optimization platform for LLM apps and agent systems, focused on high-performance serving, experimentation, routing, and production observability.
A Rust-based sandboxed TypeScript interpreter for AI agent tool execution, designed as a fast lightweight alternative to MCP-style tool calling.
CUA provides open-source infrastructure for Computer-Use Agents, including sandboxes, SDKs, and benchmarks to train and evaluate AI agents that control full desktops (macOS, Linux, Windows).
No-code multi-agent framework to build LLM agents, workflows, and applications with your own data, supporting diverse data source integrations.
Deep Research enables deep research using any LLM provider, offering SSE API and MCP server support with OpenAI, Gemini, DeepSeek, Ollama, and more.
Enterprise-grade multi-tenant AI agent development platform from China Unicom, featuring RAG, workflow orchestration, and MCP tool integration.
Agent framework designed for fintech and enterprise scenarios, providing task orchestration, tool integration, and production-grade reliability with multi-LLM backend support.
Context7 is Upstash's context-engineering toolkit for agents, helping applications manage long context windows, retrieval injection, and history compression.
A high-throughput and memory-efficient inference and serving engine for LLMs, featuring PagedAttention, continuous batching, and optimized KV cache management for production deployments.
Guardrail capabilities for Pydantic AI including cost tracking, prompt injection detection, PII filtering, and safety validation.
ARIS (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in.
A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's chain-of-thought reasoning traces with Anthropic Claude models.
Shell Superpowers for AI Agents. Enhanced CLI toolkit helping AI agents execute tasks more efficiently in terminal environments.
An active tool-discovery framework for autonomous LLM agents, helping them discover and select MCP tools at runtime.
An open-source agentic AI sandbox matrix for Kubernetes and cloud-native environments, focused on isolated agent execution.
A framework that uses code execution as agent actions. Research shows code is a better action space than text for agents, powering the CodeAct paradigm.
Repomix packs your entire repository into a single AI-friendly file, perfect for feeding your codebase to LLMs like Claude, ChatGPT, and DeepSeek for analysis, review, or code generation.
Claude Code skill for generating production-quality SVG and PNG technical diagrams — supports 8 diagram types, 5 visual styles, and deep AI/Agent domain knowledge.
Deep research agent to help you find the best GitHub repositories — AI-powered intelligent search to discover the most suitable open-source projects for your needs.
Model Context Protocol server for converting web pages, PDFs, Office documents, and other formats to Markdown for AI agent consumption.
Comparing container, WebAssembly, and process-level isolation approaches, with practical code for safely executing agent-generated code.
Breaking down three abstraction layers for browser automation—from raw Playwright to structured extraction—with production patterns, runnable code, and common pitfalls.
Learn how to build stateful AI agents with long-term memory using Letta (formerly MemGPT), solving the LLM context window limitation.
Based on real production experience, this guide explains how to build a closed loop of tracing, evaluation, and cost analytics for AI agents with Langfuse.
From protocol modeling and server design to permission isolation, this guide shows how to build a stable tool integration layer for AI agents with MCP.
Learn how to evaluate RAG systems using Ragas and DeepEval, including measuring key metrics like faithfulness, answer relevance, and context precision.