Dify in Practice: Full-Stack Low-Code Platform from RAG to Agent Workflows
Dify (145k Stars, $30M Pre-A) is the benchmark for open-source LLM app platforms. From Docker deployment to RAG pipelines, agent orchestration and MCP integration — this article takes you end-to-end.
If WordPress made website building accessible to everyone, Dify is doing the same for LLM application development — turning "writing code" into "dragging on a canvas."
Dify is the world's most popular open-source LLM application development platform (145k+ GitHub Stars), created by the LangGenius team and open-sourced on GitHub on May 15, 2023. On March 9, 2026, it closed a $30M Series Pre-A round led by HSG (Hongshan, formerly Sequoia Capital China), with participation from GL Ventures, Alt-Alpha Capital, 5Y Capital, Mizuho Leaguer Investment, and NYX Ventures, at a reported post-money valuation of ~$180M.
Its core capability matrix covers three key layers from prototype to production:
- RAG Knowledge Base Layer: document parsing → chunking → vectorization → retrieval → reranking
- Agent Orchestration Layer: Function Calling / ReAct → tool invocation → MCP extension
- Workflow Automation Layer: visual drag-and-drop → conditions / loops / parallelism → publishing
All three layers live in a single platform, so there is no need to switch between different toolchains.
Quick Deployment: Start in 5 Minutes
Dify provides one of the most mature deployment setups among LLM platforms. The official Docker Compose configuration starts all services with a single command:
```bash
git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
docker compose up -d
```
Visit http://localhost/install after startup to complete initialization.
The docker-compose setup orchestrates the following services:
| Component | Tech Stack | Responsibility |
|---|---|---|
| Frontend | Next.js | Web UI, admin panel, chat widget |
| API Service | Python (Flask) | Business logic, API endpoints |
| Worker | Python (Celery) | Async tasks (document parsing, index building) |
| PostgreSQL | — | App data, users, conversation history |
| Redis | — | Cache, task queue, session management |
| Weaviate | Vector DB | Default vector store (optional: Qdrant/Milvus/PGVector) |
The Worker's async architecture is a key differentiator — all time-consuming operations (PDF parsing, vector indexing, large file processing) happen in the background queue without blocking the API.
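The enqueue-and-return pattern can be illustrated with a minimal standard-library sketch. This is not Dify's worker code: `enqueue_document` and `parse_document` are hypothetical names, and a thread with an in-memory queue stands in for a real task queue backed by Redis.

```python
import queue
import threading

tasks = queue.Queue()   # in-memory stand-in for a Redis-backed task queue
results = {}            # where the worker records finished jobs


def parse_document(name: str) -> str:
    """Stand-in for slow work such as PDF parsing or index building."""
    return f"indexed:{name}"


def worker() -> None:
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        name = tasks.get()
        if name is None:
            tasks.task_done()
            break
        results[name] = parse_document(name)
        tasks.task_done()


def enqueue_document(name: str) -> str:
    """What an API handler would do: enqueue the job and return immediately."""
    tasks.put(name)
    return "queued"


thread = threading.Thread(target=worker, daemon=True)
thread.start()
status = enqueue_document("manual.pdf")  # returns without waiting for parsing
tasks.put(None)                          # tell the worker to stop
tasks.join()                             # block here only for the demo
thread.join()
```

The API call finishes as soon as the job is queued; the caller would normally poll or be notified when indexing completes.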
Production Recommendations
For production deployments, Dify recommends:
- Replace the vector database: Use a dedicated Qdrant or Milvus cluster instead of the Weaviate container
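- Use external PostgreSQL: Run a managed or separately backed-up PostgreSQL instance instead of the bundled container, so that losing the container or its volume does not mean losing data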
- Configure S3-compatible storage: Migrate from local volumes to MinIO / AWS S3
- HTTPS + reverse proxy: Nginx with Certbot auto-renewal
Dify also offers Kubernetes Helm Charts and one-click AWS / GCP deployment templates.
Multi-Tenancy with Workspaces
Dify has built-in Workspace support: a single instance can host multiple Workspaces, each with independent members, knowledge bases, apps, and configurations. Enterprises can serve multiple teams from one deployment with tenant-level data and configuration isolation.
RAG Knowledge Base: From Documents to Retrieval
Dify's RAG capabilities cover the complete pipeline from document upload to chunking, vectorization, retrieval, and reranking.
Supported Input Formats
- Documents: PDF, TXT, Markdown, JSON, DOCX, XLSX, CSV, HTML
- Web pages: Built-in web crawler — just enter a URL
- Notion / Confluence: Sync via API
- API Import: Batch write through the Knowledge Base API
Dify's document parser can handle PDFs with tables, formatted Word documents, and deeply nested JSON structures. It supports OCR preprocessing for extracting text from scanned PDFs.
Chunking Strategy Comparison
Dify offers three chunking strategies for different document types:
| Strategy | Principle | Best For |
|---|---|---|
| Fixed-size chunking | Split by token count with overlap window | Simple text, news articles |
| Paragraph chunking | Split by newlines / headings | Structured docs (manuals, papers) |
| Parent-child chunking | Split into parent chunks, then sub-chunks; return parent on child match | Long document Q&A, tech support docs |
Parent-child chunking is Dify's differentiator: sub-chunks handle precise matching while parent chunks provide full context. For example, retrieving a code snippet (sub-chunk) returns the full function definition (parent chunk), giving the LLM more accurate context.
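A rough sketch of the parent-child idea, under simplifying assumptions: parents are split on blank lines, children are fixed-width character slices, and a substring check stands in for vector similarity. The function names are hypothetical and do not reflect Dify's internals.

```python
from typing import List, Optional, Tuple


def build_parent_child(text: str, child_size: int = 16) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Split on blank lines into parents, then slice each parent into children."""
    parents = [p.strip() for p in text.split("\n\n") if p.strip()]
    children = []  # (child_text, index of the parent it came from)
    for idx, parent in enumerate(parents):
        for start in range(0, len(parent), child_size):
            children.append((parent[start:start + child_size], idx))
    return parents, children


def retrieve_parent(query: str, parents: List[str], children: List[Tuple[str, int]]) -> Optional[str]:
    """Match the query against small children, but return the enclosing parent."""
    for child_text, parent_idx in children:
        if query.lower() in child_text.lower():  # stand-in for embedding similarity
            return parents[parent_idx]
    return None


doc = "def add(a, b):\n    return a + b\n\ndef mul(a, b):\n    return a * b"
parents, children = build_parent_child(doc)
context = retrieve_parent("return a * b", parents, children)
```

The query hits a small child slice, yet `context` holds the whole `mul` function, which is what the LLM would receive.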
Each strategy is configurable: chunk size (in tokens), chunk overlap (in tokens), and separator selection. Dify provides a "preview" feature to inspect actual chunking results after upload and adjust parameters.
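To make the size and overlap parameters concrete, here is a minimal sketch of fixed-size chunking with an overlap window. It assumes whitespace-separated words as a stand-in for model tokens; `chunk_with_overlap` is a hypothetical helper, not a Dify function.

```python
from typing import List


def chunk_with_overlap(tokens: List[str], size: int, overlap: int) -> List[List[str]]:
    """Slide a window of `size` tokens, advancing by `size - overlap` each step."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):  # the window already reached the end
            break
    return chunks


tokens = "a b c d e f g".split()
chunks = chunk_with_overlap(tokens, size=4, overlap=1)
```

With `size=4` and `overlap=1`, the token `d` appears at the end of the first chunk and the start of the second, so a sentence cut at the boundary is still seen whole by one of them.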
Retrieval Strategies
Dify supports three retrieval modes:
- Vector similarity: Embedding-based semantic search for fuzzy matching
- Full-text search (keyword): Inverted index exact matching for proper nouns, IDs, code
- Hybrid search: Weighted combination of vector + full-text search, balancing semantics and precision
In practice, hybrid search delivers the best results. Dify lets you set weights (via sliders) for the vector vs. keyword ratio, and supports reranking via Cohere Rerank, Jina Reranker, or BGE Reranker models to improve top-k result quality.
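The weighting can be sketched as a blend of two normalized score maps. This is an illustrative assumption about how such a slider might work, not Dify's actual scoring code; `hybrid_scores` and the min-max normalization are choices made for the example.

```python
from typing import Dict, List, Tuple


def hybrid_scores(vector: Dict[str, float], keyword: Dict[str, float],
                  vector_weight: float = 0.7) -> List[Tuple[str, float]]:
    """Blend two {doc_id: score} maps after min-max normalizing each one."""

    def normalize(scores: Dict[str, float]) -> Dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: (s - lo) / span if span else 1.0 for d, s in scores.items()}

    v, k = normalize(vector), normalize(keyword)
    blended = {
        doc: vector_weight * v.get(doc, 0.0) + (1 - vector_weight) * k.get(doc, 0.0)
        for doc in set(v) | set(k)
    }
    # Highest blended score first; a reranker would reorder this shortlist.
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)


vector_hits = {"a": 0.9, "b": 0.5, "c": 0.1}   # cosine-style similarities
keyword_hits = {"b": 12.0, "c": 3.0}           # BM25-style scores
semantic_first = [doc for doc, _ in hybrid_scores(vector_hits, keyword_hits, 0.7)]
keyword_first = [doc for doc, _ in hybrid_scores(vector_hits, keyword_hits, 0.3)]
```

Moving the weight from 0.7 to 0.3 flips the top result from the semantically closest document to the one with the strongest keyword match.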
Vector Database Options
| Database | Deployment | Best For |
|---|---|---|
| Weaviate | Docker | Default, development |
| Qdrant | Docker / standalone | Production, high-performance |
| Milvus | Cluster | Large-scale knowledge bases (millions of docs) |
| PGVector | PostgreSQL plugin | Avoiding extra infrastructure |
Switch between them by setting VECTOR_STORE in .env.
Agent Orchestration: From Tools to Intelligence
Dify's agent engine supports two core strategies:
Function Calling Mode
When the underlying LLM supports Function Calling (GPT-4, Claude 3, GLM-4, etc.), Dify automatically uses the native Tool Call API. Users simply "add tools" from the interface — built-in or custom — and the agent selects the right tool based on user input.
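Conceptually, the platform's side of Function Calling is a registry plus a dispatcher. The sketch below assumes a tool call shaped as a name with JSON-encoded arguments, which is a common convention among providers; the `get_weather` tool and `dispatch` function are hypothetical.

```python
import json

# Each tool carries a JSON-Schema-style parameter description plus a handler.
TOOLS = {
    "get_weather": {
        "description": "Return a canned weather report for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "handler": lambda city: f"Sunny in {city}",
    }
}


def dispatch(tool_call: dict) -> str:
    """Validate a model-issued tool call and run the matching handler."""
    tool = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    missing = [p for p in tool["parameters"]["required"] if p not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return tool["handler"](**args)


result = dispatch({"name": "get_weather", "arguments": '{"city": "Berlin"}'})
```

The model only ever sees the descriptions and parameter schemas; the handler stays on the platform side.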
ReAct Mode
For models without Function Calling support, Dify uses the ReAct (Reasoning + Acting) prompting framework: the model first reasons about what information is needed, generates a tool call instruction, the system executes it, and feeds the result back. This adds one interaction round but offers broader model compatibility.
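The reason-act-observe loop can be sketched in a few lines. Everything here is a stand-in: `fake_model` is a scripted replacement for an LLM, the `Action: tool[input]` text format is one common ReAct convention, and none of it is Dify's actual prompt or parser.

```python
import re


def fake_model(transcript: str) -> str:
    """Scripted stand-in for an LLM: act once, then answer from the observation."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute this.\nAction: calculator[6*7]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


def calculator(expression: str) -> str:
    """Toy tool that only understands 'a*b'."""
    a, b = expression.split("*")
    return str(int(a) * int(b))


TOOLS = {"calculator": calculator}
ACTION = re.compile(r"Action: (\w+)\[(.*?)\]")


def react(question: str, max_steps: int = 3) -> str:
    """Alternate model turns and tool executions until a final answer appears."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = fake_model(transcript)
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = ACTION.search(reply)
        if match is None:
            raise RuntimeError("model produced neither an action nor an answer")
        tool, arg = match.groups()
        # Feed the tool result back so the next model turn can use it.
        transcript += f"\nObservation: {TOOLS[tool](arg)}"
    raise RuntimeError("step limit reached")


answer = react("What is 6*7?")
```

The extra round trip is visible here: the model is called twice, once to request the tool and once to answer.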
50+ Built-in Tools
Dify provides 50+ pre-built tools covering common scenarios:
- Web search: Bing Search, Google Search, SearXNG, Serper, SerpAPI
- Code execution: Safe Python / Node.js execution in Dify Sandbox
- Image generation: Stable Diffusion, DALL-E, Flux
- Data retrieval: Yahoo Finance, Wikipedia, YouTube Transcript
- Communication: Slack, Email, Discord notifications
- File processing: Document extraction, table reading
Each tool has independent parameter definitions and error handling logic.
Custom Tools: OpenAPI / API Definition
Dify supports adding custom tools via:
- OpenAPI / Swagger spec: Import any REST API's OpenAPI 3.0 spec to auto-generate tool definitions
- API Definition (manual): Manually configure endpoint, parameters, and auth
- Plugin marketplace: Install community-contributed tools
This means any existing system's REST API can become a Dify Agent tool in under a minute — no code required.
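What "auto-generate tool definitions" might involve can be sketched as a walk over the spec's `paths` object. The output shape and the `tools_from_openapi` helper are assumptions made for illustration; Dify's real importer handles far more (auth, request bodies, schemas).

```python
from typing import Dict, List

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def tools_from_openapi(spec: Dict) -> List[Dict]:
    """Turn each operation in an OpenAPI 3.0 document into a flat tool record."""
    tools = []
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if method not in HTTP_METHODS:  # skip keys such as "parameters"
                continue
            fallback = f"{method}_{path.strip('/').replace('/', '_')}"
            tools.append({
                "name": op.get("operationId", fallback),
                "description": op.get("summary", ""),
                "method": method.upper(),
                "path": path,
                "parameters": [p["name"] for p in op.get("parameters", [])],
            })
    return tools


spec = {
    "openapi": "3.0.0",
    "paths": {
        "/orders/{id}": {
            "get": {
                "operationId": "getOrder",
                "summary": "Fetch an order",
                "parameters": [{"name": "id", "in": "path", "required": True,
                                "schema": {"type": "string"}}],
            }
        }
    },
}
tools = tools_from_openapi(spec)
```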
MCP Client: Access the MCP Ecosystem
As of 2026, Dify supports MCP (Model Context Protocol) in both directions: as an MCP Client and as an MCP Server.
As an MCP Client, Dify can connect to any standard MCP Server via SSE (Server-Sent Events) or Streamable HTTP:
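```bash
# Illustrative MCP server settings. The variable names below are as given in
# this article and may not match your Dify version; recent releases register
# MCP servers from the Tools page in the UI, so check the official docs.
MCP_SERVER_URL=http://localhost:8000/sse
MCP_SERVER_NAME=my-custom-server
```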
Tools exposed by the MCP Server are automatically registered in Dify's Agent tool list. This dramatically expands Dify's tool ecosystem — no longer dependent on built-in tools or manual OpenAPI imports.
The MCP ecosystem now includes hundreds of available Servers: GitHub API, Jira, Notion, Slack, PostgreSQL, Google Calendar, Figma, and more. Through Dify's MCP Client, these capabilities can be connected to an Agent workflow in minutes.
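Under the hood, MCP messages are JSON-RPC 2.0, and tool discovery and invocation use the `tools/list` and `tools/call` methods. The sketch below only builds the request payloads a client would send; the helper names and the `search_issues` tool are hypothetical, and transport (SSE or Streamable HTTP) and session initialization are left out.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # JSON-RPC requests need unique ids per session


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def list_tools_request() -> str:
    """Ask the server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> str:
    """Invoke one tool by name with a dict of arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})
```

A client would send `list_tools_request()` first, register whatever the server returns, and later send `call_tool_request("search_issues", {"query": "bug"})` when the agent picks that tool.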
Dify Sandbox: Secure Code Execution
Dify's code execution relies on langgenius/dify-sandbox, an isolated execution environment that confines each run with Linux process-level restrictions such as seccomp system call filtering. It restricts:
- File system access (read-only whitelist)
- Network requests (whitelist domains)
- System calls (dangerous syscalls disabled)
- Execution time (configurable timeout)
- Memory usage (upper limit)
This allows Python or Node.js code execution within Dify Workflows without compromising host security.
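Of the restrictions listed above, the execution time limit is the easiest to illustrate. The sketch below is not a sandbox and provides no file system, network, or syscall isolation; it only shows a child interpreter being killed when it exceeds its time budget. `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_s: float = 5.0) -> dict:
    """Run code in a separate Python process and kill it if it runs too long."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point.
        return {"ok": False, "error": "timeout"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


fast = run_with_timeout("print(1 + 1)")
stuck = run_with_timeout("while True: pass", timeout_s=0.5)
```

A real sandbox layers the other limits on top of this, which is why a dedicated component is used instead of a plain subprocess.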
Workflow: Visual Drag-and-Drop Orchestration
Workflow is Dify's killer feature, setting it apart from most agent platforms. It provides a React Flow-based drag-and-drop canvas for building complex AI processing pipelines.
Node Types
| Node Type | Function | Typical Use |
|---|---|---|
| LLM | Call LLM for text generation | Chat, translation, summarization |
| Knowledge Retrieval | Retrieve from knowledge base | RAG Q&A |
| Code | Execute Python / JavaScript | Data cleaning, formatting |
| Condition | if/else branching | Route based on input |
| Loop | Iterate sub-workflow | Batch processing |
| HTTP | Send HTTP requests | Call external APIs |
| Agent | Embed an Agent node | Complex reasoning |
| Template | Jinja2 template transformation | Text assembly |
| Variable | Declare and assign variables | Store intermediate results |
| Parameter Extract | Extract structured params from input | Entity recognition |
| Iteration | Traverse list executing sub-flow | Process items one by one |
| TTS | Text-to-speech | Voice output |
| S3 Storage | Upload files to S3 | Archive results |
| Answer | Set final output | Return to user |
Each node can reference outputs from previous nodes as variables using {{nodeId.outputField}} syntax to form data flows. Condition nodes support multiple branches; Loop and Iteration support nesting.
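The data-flow idea can be sketched as a small template resolver that substitutes references with earlier node outputs. It assumes the `{{nodeId.outputField}}` form quoted above; the exact syntax on the canvas may differ between Dify versions, and `resolve` is a hypothetical helper.

```python
import re

# Matches references such as {{retriever.text}}, allowing optional spaces.
REF = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def resolve(template: str, outputs: dict) -> str:
    """Replace each reference with the named field of an upstream node's output."""

    def lookup(match: "re.Match") -> str:
        node_id, field = match.groups()
        return str(outputs[node_id][field])  # KeyError if the node or field is missing

    return REF.sub(lookup, template)


outputs = {"retriever": {"text": "Dify docs"}, "start": {"lang": "English"}}
prompt = resolve("Summarize {{retriever.text}} in {{ start.lang }}", outputs)
```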
Workflow Design Tips
- One Workflow, one job. Don't build monoliths; split into sub-workflows and call them via HTTP nodes
- Condition nodes as guards. Check input validity before LLM nodes to avoid wasting tokens on invalid inputs
- Prefer Iteration over Loop for batch data. Iteration can run sub-flows in parallel (platform-dependent), which can be much faster than processing items one at a time
- Use Code nodes for data cleaning. LLM output is inconsistent; validate and transform with Python/JS before passing to the next node
- Set reasonable timeouts and retries. HTTP and LLM nodes can time out, so configure 30-60s timeouts with 1-2 retries
- Manage variables wisely. Avoid more than 20 variable nodes in complex workflows; use Code nodes for data aggregation
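As an example of the data-cleaning tip, here is a sketch of a Code node body that extracts and validates JSON from free-form LLM output. It assumes the Code node convention of a `main` function that returns a dict of output variables; the `order_id` and `intent` keys are hypothetical.

```python
import json
import re


def main(llm_output: str) -> dict:
    """Pull the outermost {...} span out of LLM text and check required keys."""
    match = re.search(r"\{.*\}", llm_output, re.DOTALL)
    if match is None:
        return {"valid": False, "data": {}}
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return {"valid": False, "data": {}}
    valid = isinstance(data, dict) and {"order_id", "intent"}.issubset(data)
    return {"valid": valid, "data": data if valid else {}}
```

A Condition node placed after this one can branch on `valid`, so malformed model output never reaches downstream nodes.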
Publishing Options
Dify offers three publishing channels:
- API Endpoint: Each app generates an independent REST API (POST, JSON response). Most flexible, best for system integration
- Web App: Auto-generated user-facing web page with chat or form interaction
- Embedded Widget: Embed via `<iframe>` or JS SDK into existing websites, commonly used for customer service bots
Each channel has independent access control (public / private / whitelist) with rate limits and token usage caps.
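For the API channel, a request to a chat app can be sketched as follows. The `/chat-messages` path, the `Bearer` API key header, and the `inputs` / `query` / `response_mode` / `user` fields follow Dify's published chat API as best understood here; treat them as assumptions and confirm against the API page of your own app. The base URL and key are placeholders.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, query: str, user: str) -> urllib.request.Request:
    """Build (but do not send) a blocking chat request for a Dify app."""
    body = {
        "inputs": {},                 # app-defined input variables, if any
        "query": query,
        "response_mode": "blocking",  # wait for the full answer instead of streaming
        "user": user,                 # stable end-user identifier
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat-messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("http://localhost/v1/", "app-placeholder-key", "What is Dify?", "user-123")
```

Sending it is a matter of `urllib.request.urlopen(request)` and reading the JSON response body.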
2026 New Features
Bidirectional MCP Support
Dify implemented bidirectional MCP around 2026:
- MCP Client: Connect external MCP Servers, register their tools in Dify's Agent tool list
- MCP Server: Expose Dify apps as MCP Servers for external systems to call
This makes Dify both a consumer and producer in the MCP ecosystem. Enterprises can connect Dify to a unified MCP tool directory, sharing tools and knowledge across all LLM applications.
Supervisor Multi-Agent Mode
The Supervisor pattern introduces a "manager agent": upon receiving a task, the Supervisor Agent breaks it down, delegates to sub-agents, and aggregates results. Each sub-agent can have different models, tool sets, and prompt configurations.
Ideal for:
- Complex research: one agent searches, one analyzes, one writes
- Multi-step customer service: one handles auth, one queries orders, one generates responses
- Code generation + review: one writes code, one reviews it
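The delegation flow for the research case can be sketched with plain functions standing in for sub-agents. This is a toy under stated assumptions: the fixed plan replaces LLM-driven task decomposition, and `Supervisor`, `searcher`, `analyst`, and `writer` are hypothetical names rather than Dify components.

```python
from typing import Callable, Dict, List, Tuple


def searcher(topic: str) -> str:
    return f"notes on {topic}"


def analyst(notes: str) -> str:
    return f"analysis of {notes}"


def writer(analysis: str) -> str:
    return f"report: {analysis}"


class Supervisor:
    """Breaks a task into steps, delegates each one, and collects the trace."""

    def __init__(self, agents: Dict[str, Callable[[str], str]]) -> None:
        self.agents = agents

    def plan(self, task: str) -> List[str]:
        # A real supervisor would ask an LLM to produce this plan.
        return ["search", "analyze", "write"]

    def run(self, task: str) -> Tuple[str, List[Tuple[str, str]]]:
        payload, trace = task, []
        for step in self.plan(task):
            payload = self.agents[step](payload)  # hand the work to one sub-agent
            trace.append((step, payload))
        return payload, trace


supervisor = Supervisor({"search": searcher, "analyze": analyst, "write": writer})
final, trace = supervisor.run("vector databases")
```

Because each entry in `agents` is independent, each sub-agent could be backed by a different model, tool set, or prompt.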
The combination of Supervisor with Workflow is Dify's most exciting direction — embedding Supervisor Agent nodes in the visual canvas for a "visual orchestration + autonomous planning" layered architecture.
Plugin System
The 2026 plugin marketplace brings extensibility to Dify. Plugins are packaged as .difypkg files containing:
- Tool definitions (like custom tool OpenAPI specs)
- Node types (new Workflow nodes)
- Model integrations (new LLM / embedding / rerank models)
- UI extensions (custom configuration interfaces)
Plugins are browsed and installed from the Dify admin panel. Community-contributed plugins are reviewed by the Dify team before listing.
Prompt Version Management & A/B Testing
Dify's prompt version management supports:
- Automatic history snapshots on every prompt change
- Version comparison (diff view)
- One-click rollback to any historical version
- A/B testing: run two prompt versions simultaneously with traffic splitting, automatically tracking key metrics (response time, token consumption, user feedback)
A/B testing is especially valuable for operations teams — stop guessing which prompt works better and let data decide.
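Traffic splitting is commonly done by hashing a stable user identifier, so the same user always sees the same prompt version. The sketch below shows that general technique; it is not taken from Dify's implementation, and `assign_variant` is a hypothetical helper.

```python
import hashlib


def assign_variant(user_id: str, share_b: float = 0.5) -> str:
    """Deterministically map a user to prompt version 'A' or 'B'."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1)
    return "B" if bucket < share_b else "A"


assignments = [assign_variant(f"user-{i}") for i in range(1000)]
share_of_b = assignments.count("B") / len(assignments)
```

Metrics such as response time or feedback can then be grouped by the returned variant and compared.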
Dify vs Competitors
| Dimension | Dify | Open WebUI | LobeChat | LangChain |
|---|---|---|---|---|
| Positioning | LLM app platform | LLM chat UI + tools | Modern AI chat client | Dev framework (Python/TS SDK) |
| RAG Knowledge Base | Full built-in pipeline | Basic built-in RAG | Plugin-based | DIY |
| Workflow Orchestration | Visual drag-and-drop | None | None (chat only) | LangGraph (code-level) |
| Agent Mode | Function Calling + ReAct | Function Calling | Plugin Agent | Custom Agent |
| MCP Support | Bidirectional (Client + Server) | Read-only tool calls | Client only | Manual extension |
| Deployment | Docker Compose / K8s | Single Docker container | Docker / Vercel | SDK integration |
| Enterprise Features | Workspace, logging, monitoring, API | Basic user management | None | None |
| Best For | Enterprise LLM app dev | Personal/team chat UI | Personal multi-model use | Deep custom development |
Open WebUI is positioned as "a better ChatGPT interface" — excellent for chat UX and lightweight tool integration but lacks Workflow and enterprise capabilities.
LobeChat focuses on personal UX and plugin ecosystem, great for multi-model chat but unsuitable for complex app development.
LangChain is a dev framework, not a product — maximum flexibility but requires building your own frontend, API layer, knowledge base management, and all infrastructure.
Dify's uniqueness: it integrates RAG + Agent + Workflow + MCP in a visual interface, enabling non-engineers to build LLM applications. This is the fundamental difference between a "product" and a "framework."
Summary
Dify isn't universal — for extreme customization, LangChain + custom frontend remains the path. But for 90% of LLM application scenarios, Dify's visual pipeline is sufficient, with development speed 3-5x faster than writing code.
Dify's trajectory is worth watching: from "Chatbot builder" to "LLM app platform" to bidirectional MCP and plugin ecosystem — it's evolving from a single product into a platform. For teams choosing their LLM tech stack, Dify offers the lowest barrier to entry and the broadest coverage.
If you haven't tried Dify yet, start with Docker deployment and build a simple RAG Q&A app — you'll be amazed that 30 minutes turns a PDF into an interactive knowledge query system. That is Dify's greatest value: LLM application development no longer starts from zero.
A final note on licensing: Dify is released under the Dify Open Source License, which is based on Apache 2.0 but adds two conditions — no unauthorized multi-tenant SaaS use, and preservation of frontend Logo/copyright notices. Personal and internal enterprise deployments are generally unaffected; if you plan to offer a commercial multi-tenant hosted service based on Dify, you need to contact LangGenius for a commercial license.
Repo: langgenius/dify · Docs: docs.dify.ai · Try it: cloud.dify.ai
Projects in this article
- Dify (146.2k ⭐): Dify is an open-source LLM application development platform with a visual agent orchestration interface, supporting workflows, knowledge bases, and multiple models.
- Dify Sandbox (1.2k ⭐): A lightweight, fast, and secure code execution environment supporting multiple programming languages; it provides sandboxed code execution for the Dify platform.
- Open WebUI (142.6k ⭐): Open WebUI is a feature-rich, user-friendly self-hosted AI platform supporting Ollama and OpenAI-compatible APIs, with RAG, agents, and MCP capabilities.
- Lobe Chat (79.0k ⭐): Lobe Chat is an open-source ChatGPT-style chat application with a plugin system and multi-model support, suitable as an agent conversation interface.