Building MCP Servers in Practice: Custom Tool Chains for AI Agents
Build a production-grade MCP server from scratch, covering tool definition, authentication design, and testing strategies to turn any API into an agent-ready tool.
The Model Context Protocol (MCP) gives agents a uniform way to call external tools. But official examples usually stop at "Hello World": a tool that returns a fixed string. Production MCP servers need authentication, error recovery, streaming responses, and version compatibility. This article builds a production-grade MCP server from scratch, covering these real scenarios.
Three Capabilities of MCP Servers
MCP defines three server capabilities. Understanding the distinction is the first step in server design:
| Capability | Purpose | Typical Use Case |
|---|---|---|
| Tools | Functions agents can invoke | Search docs, execute queries, send notifications |
| Resources | Data agents can read | File contents, database records, config info |
| Prompts | Predefined prompt templates | Standard workflows for common tasks, formatting instructions |
Selection principle: If the agent needs to "do something" (has side effects), use a Tool. If it only needs to "read information" (no side effects), use a Resource. If it needs to "guide behavior patterns," use a Prompt.
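On the wire, each capability maps to a pair of JSON-RPC 2.0 methods: one to list the available items, one to fetch or invoke a specific item. A minimal sketch of what a client sends (the method names come from the MCP spec; the `make_request` helper and the example arguments are illustrative):

```python
import json

# Each capability is exposed as a list method plus a fetch/invoke method.
CAPABILITY_METHODS = {
    "tools": ("tools/list", "tools/call"),
    "resources": ("resources/list", "resources/read"),
    "prompts": ("prompts/list", "prompts/get"),
}

def make_request(method: str, params: dict, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body as an MCP client would send it."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params,
    })

# A tools/call request for the search tool built later in this article:
request = make_request("tools/call", {
    "name": "search_docs",
    "arguments": {"query": "rate limiting", "limit": 3},
})
```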
Building From Scratch: A Practical MCP Server
We will build a "Doc Assistant" MCP server with document search, summary generation, and bookmark management.
Project Structure
```
doc-assistant-mcp/
├── server.py          # MCP Server main entry
├── tools/
│   ├── search.py      # Document search tool
│   ├── summarize.py   # Summary generation tool
│   └── bookmarks.py   # Bookmark management tool
├── auth.py            # Authentication module
└── pyproject.toml
```
Core Implementation
```python
# server.py
import asyncio

from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent

from tools.search import search_documents, format_search_results, SearchError
from tools.bookmarks import save_bookmark, load_bookmarks

app = Server("doc-assistant")


@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="search_docs",
            description="Search the document library for relevant content. Returns the best-matching document snippets.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query in natural language",
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Number of results to return, default 5",
                        "default": 5,
                    },
                    "doc_type": {
                        "type": "string",
                        "enum": ["api", "guide", "tutorial", "all"],
                        "description": "Filter by document type",
                        "default": "all",
                    },
                },
                "required": ["query"],
            },
        ),
        Tool(
            name="bookmark",
            description="Save a document snippet as a bookmark for quick reference later.",
            inputSchema={
                "type": "object",
                "properties": {
                    "title": {
                        "type": "string",
                        "description": "Bookmark title",
                    },
                    "content": {
                        "type": "string",
                        "description": "Bookmark content (document snippet)",
                    },
                    "tags": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Tag list for categorization",
                    },
                },
                "required": ["title", "content"],
            },
        ),
        Tool(
            name="get_bookmarks",
            description="Retrieve saved bookmarks, optionally filtered by tag.",
            inputSchema={
                "type": "object",
                "properties": {
                    "tag": {
                        "type": "string",
                        "description": "Filter by tag; returns all if omitted",
                    },
                },
            },
        ),
    ]


@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "search_docs":
        return await handle_search(arguments)
    elif name == "bookmark":
        return await handle_bookmark(arguments)
    elif name == "get_bookmarks":
        return await handle_get_bookmarks(arguments)
    else:
        return [TextContent(type="text", text=f"Unknown tool: {name}")]


async def handle_search(args: dict) -> list[TextContent]:
    query = args["query"]
    limit = args.get("limit", 5)
    doc_type = args.get("doc_type", "all")
    try:
        results = await search_documents(query, limit=limit, doc_type=doc_type)
        if not results:
            return [TextContent(
                type="text",
                text=f"No documents found for '{query}'. Try different keywords or broaden the scope.",
            )]
        formatted = format_search_results(results)
        return [TextContent(type="text", text=formatted)]
    except SearchError as e:
        return [TextContent(type="text", text=f"Search failed: {e}")]


async def handle_bookmark(args: dict) -> list[TextContent]:
    title = args["title"]
    content = args["content"]
    tags = args.get("tags", [])
    bookmark_id = await save_bookmark(title, content, tags)
    return [TextContent(
        type="text",
        text=f"Saved bookmark '{title}' (ID: {bookmark_id})",
    )]


async def handle_get_bookmarks(args: dict) -> list[TextContent]:
    tag = args.get("tag")
    bookmarks = await load_bookmarks(tag=tag)
    if not bookmarks:
        return [TextContent(type="text", text="No bookmarks yet.")]
    formatted = "\n\n".join(
        f"**{b['title']}** (tags: {', '.join(b['tags'])})\n{b['content'][:200]}"
        for b in bookmarks
    )
    return [TextContent(type="text", text=formatted)]


async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())


if __name__ == "__main__":
    asyncio.run(main())
```
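Note that the handlers above fall back to `args.get(...)` defaults by hand. A server can instead derive defaults and required-field checks directly from the `inputSchema` before dispatch. A minimal sketch of such a pre-dispatch helper (illustrative; a production server might use the `jsonschema` package instead of hand-rolling this):

```python
def prepare_arguments(schema: dict, arguments: dict) -> dict:
    """Check required fields and fill in schema defaults before dispatch."""
    missing = [f for f in schema.get("required", []) if f not in arguments]
    if missing:
        raise ValueError(f"Missing required arguments: {', '.join(missing)}")
    prepared = dict(arguments)
    for field, spec in schema.get("properties", {}).items():
        if field not in prepared and "default" in spec:
            prepared[field] = spec["default"]
    return prepared

# The search_docs schema from server.py, trimmed to its structural parts:
search_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "default": 5},
        "doc_type": {"type": "string", "default": "all"},
    },
    "required": ["query"],
}

args = prepare_arguments(search_schema, {"query": "auth"})
# args now carries limit=5 and doc_type="all" from the schema defaults
```

This keeps the defaults in exactly one place (the schema the agent already sees), so the tool description and the runtime behavior cannot drift apart.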
Authentication Design: API Keys and Scopes
Production MCP servers need authentication to prevent unauthorized access. The core MCP protocol does not mandate an authentication scheme; in practice it is implemented at the transport or application layer.
```python
# auth.py
import hashlib
import hmac
import time
from dataclasses import dataclass


@dataclass
class AuthScope:
    tools: list[str]
    max_calls: int
    rate_limit_window: int  # seconds


SCOPES = {
    "read_only": AuthScope(
        tools=["search_docs", "get_bookmarks"],
        max_calls=100,
        rate_limit_window=3600,
    ),
    "read_write": AuthScope(
        tools=["search_docs", "bookmark", "get_bookmarks"],
        max_calls=500,
        rate_limit_window=3600,
    ),
    "admin": AuthScope(
        tools=["search_docs", "bookmark", "get_bookmarks", "admin_tools"],
        max_calls=2000,
        rate_limit_window=3600,
    ),
}


class MCPAuth:
    def __init__(self):
        self.api_keys: dict[str, str] = {}
        self.key_scopes: dict[str, AuthScope] = {}
        self.call_counts: dict[str, list[float]] = {}

    def register_key(self, key_id: str, secret: str, scope_name: str = "read_only"):
        hashed = hashlib.sha256(secret.encode()).hexdigest()
        self.api_keys[key_id] = hashed
        self.key_scopes[key_id] = SCOPES.get(scope_name, SCOPES["read_only"])

    def authenticate(self, key_id: str, secret: str) -> bool:
        if key_id not in self.api_keys:
            return False
        hashed = hashlib.sha256(secret.encode()).hexdigest()
        # Constant-time comparison to avoid timing side channels.
        return hmac.compare_digest(self.api_keys[key_id], hashed)

    def authorize(self, key_id: str, tool_name: str) -> tuple[bool, str]:
        scope = self.key_scopes.get(key_id)
        if not scope:
            return False, "Invalid API key"
        if tool_name not in scope.tools:
            return False, f"Tool '{tool_name}' not in scope"
        # Sliding-window rate limit: keep only calls inside the window.
        now = time.time()
        calls = self.call_counts.get(key_id, [])
        calls = [t for t in calls if now - t < scope.rate_limit_window]
        if len(calls) >= scope.max_calls:
            return False, f"Rate limit reached ({scope.max_calls} per {scope.rate_limit_window}s)"
        calls.append(now)
        self.call_counts[key_id] = calls
        return True, "ok"
```
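The sliding-window check inside `authorize()` is the part most worth testing in isolation. A self-contained sketch with an injectable clock, so the rollover behavior is deterministic (class and variable names are illustrative):

```python
import time


class SlidingWindowLimiter:
    """Allow at most `max_calls` per `window` seconds, per key."""

    def __init__(self, max_calls: int, window: float, clock=time.time):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock  # injectable for deterministic tests
        self.calls: dict[str, list[float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        recent = [t for t in self.calls.get(key, []) if now - t < self.window]
        if len(recent) >= self.max_calls:
            self.calls[key] = recent
            return False
        recent.append(now)
        self.calls[key] = recent
        return True


# Deterministic demo: 3 calls per 60-second window, driven by a fake clock.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=3, window=60, clock=lambda: fake_now[0])
results = [limiter.allow("key-1") for _ in range(4)]  # [True, True, True, False]
fake_now[0] = 61.0                                    # window rolls over
recovered = limiter.allow("key-1")                    # True again
```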
Error Handling: Graceful Degradation
In agent systems, tool call failures should not crash the entire task. Each tool call should return clear error types with recovery suggestions.
```python
from dataclasses import dataclass
from enum import Enum

from mcp.types import TextContent


class ErrorType(Enum):
    VALIDATION = "validation"
    NOT_FOUND = "not_found"
    RATE_LIMITED = "rate_limited"
    UPSTREAM_ERROR = "upstream"
    TIMEOUT = "timeout"


@dataclass
class ToolError:
    error_type: ErrorType
    message: str
    suggestion: str
    retryable: bool


def handle_tool_error(error: ToolError) -> TextContent:
    parts = [f"[{error.error_type.value}] {error.message}"]
    if error.retryable:
        parts.append(f"Suggestion: {error.suggestion} (retryable)")
    else:
        parts.append(f"Suggestion: {error.suggestion}")
    return TextContent(type="text", text="\n".join(parts))
```
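On the calling side, the `retryable` flag is what drives recovery. A hedged sketch of a client-side retry loop; the exception-style `ToolError` mirror and the backoff policy are illustrative, not part of the MCP spec:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToolError(Exception):
    """Client-side mirror of the server's error shape, raised instead of returned."""
    error_type: str
    message: str
    retryable: bool


async def call_with_retry(tool, args: dict, max_attempts: int = 3, base_delay: float = 0.01):
    """Retry only errors the server marked retryable, with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await tool(args)
        except ToolError as e:
            if not e.retryable or attempt == max_attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


# Demo: a flaky tool that times out twice, then succeeds on the third attempt.
attempts = {"n": 0}

async def flaky_tool(args):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ToolError("timeout", "upstream search timed out", retryable=True)
    return {"ok": True}

result = asyncio.run(call_with_retry(flaky_tool, {}))  # {"ok": True} after 3 attempts
```

Non-retryable errors (validation failures, missing permissions) propagate immediately, so the agent can change its plan instead of burning attempts.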
Testing Strategy
MCP server testing operates on three layers:
```python
# test_tools.py
import asyncio

import pytest

from server import call_tool, list_tools
from tools.search import search_documents


# Layer 1: Tool logic tests (no MCP protocol dependency)
@pytest.mark.asyncio
async def test_search_returns_results():
    results = await search_documents("RAG pipeline", limit=3)
    assert len(results) <= 3
    for r in results:
        assert "title" in r
        assert "content" in r


# Layer 2: Schema validation tests
def test_tool_schemas_are_valid():
    tools = asyncio.run(list_tools())
    for tool in tools:
        assert tool.inputSchema.get("type") == "object"
        required = tool.inputSchema.get("required", [])
        properties = tool.inputSchema.get("properties", {})
        for field_name in required:
            assert field_name in properties


# Layer 3: Integration tests (simulate MCP calls)
@pytest.mark.asyncio
async def test_search_tool_integration():
    result = await call_tool("search_docs", {"query": "test query"})
    assert len(result) > 0
    assert result[0].type == "text"


@pytest.mark.asyncio
async def test_bookmark_roundtrip():
    save_result = await call_tool("bookmark", {
        "title": "Test Bookmark",
        "content": "Test content",
        "tags": ["test"],
    })
    assert "Saved" in save_result[0].text
    get_result = await call_tool("get_bookmarks", {"tag": "test"})
    assert "Test Bookmark" in get_result[0].text
```
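The Layer-2 checks above can go further than required-field coverage: every `default` in a schema should itself be a legal value for its field, or the agent receives contradictory guidance. A self-contained sketch of such an extra check (the function and the deliberately broken schema are illustrative):

```python
def check_schema_consistency(schema: dict) -> list[str]:
    """Extra Layer-2 checks: every default must be a legal value for its field."""
    problems = []
    for name, spec in schema.get("properties", {}).items():
        enum = spec.get("enum")
        if enum is not None and not enum:
            problems.append(f"{name}: empty enum")
        if enum and "default" in spec and spec["default"] not in enum:
            problems.append(f"{name}: default {spec['default']!r} not in enum")
    return problems

# The search_docs schema from this article passes cleanly...
search_schema = {
    "type": "object",
    "properties": {
        "doc_type": {"type": "string", "enum": ["api", "guide", "tutorial", "all"], "default": "all"},
        "limit": {"type": "integer", "default": 5},
    },
    "required": ["query"],
}
assert check_schema_consistency(search_schema) == []

# ...while a default outside its enum is caught before any agent ever sees the tool.
bad_schema = {
    "type": "object",
    "properties": {
        "doc_type": {"type": "string", "enum": ["api", "guide"], "default": "all"},
    },
}
issues = check_schema_consistency(bad_schema)  # ["doc_type: default 'all' not in enum"]
```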
Common Mistakes
Mistake 1: "Tool descriptions don't matter, the agent will figure it out"
Tool descriptions are the only basis for an agent to decide when to invoke a tool. Vague descriptions lead to tools being called in wrong scenarios. Each description should clearly state: what this tool does, when to use it, and when not to use it.
Mistake 2: "MCP servers don't need rate limiting"
If your MCP server wraps paid APIs (search, translation, LLM calls), no rate limiting means a single prompt injection could trigger massive API costs. Rate limiting is a security measure, not a performance optimization.
Mistake 3: "It's fine for error messages to expose internal implementation"
Never expose database connection strings, API endpoints, or internal service names in error messages. Error messages should give agents enough context to recover without handing attackers reconnaissance material.
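One defensive pattern is to scrub internal identifiers from any message before it reaches the model. A minimal sketch; the patterns and names here are illustrative, and a real deployment would maintain its own denylist:

```python
import re

# Illustrative patterns for material that must never reach the model.
REDACTION_PATTERNS = [
    (re.compile(r"postgres(?:ql)?://\S+"), "[redacted database url]"),
    (re.compile(r"https?://[\w.-]*internal[\w.-]*\S*"), "[redacted internal endpoint]"),
    (re.compile(r"(?i)api[_-]?key\s*[=:]\s*\S+"), "api_key=[redacted]"),
]

def sanitize_error(message: str) -> str:
    """Replace connection strings, internal hosts, and keys with placeholders."""
    for pattern, replacement in REDACTION_PATTERNS:
        message = pattern.sub(replacement, message)
    return message

raw = "Query failed: postgresql://admin:s3cret@db-internal:5432/docs timed out"
safe = sanitize_error(raw)
# safe == "Query failed: [redacted database url] timed out"
```

Run this over every message produced by `handle_tool_error`-style formatting, not just exceptions you anticipated; the dangerous leaks come from upstream libraries interpolating their own config into error strings.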
Summary
- MCP servers provide three capabilities (Tools/Resources/Prompts) — choose based on side effects
- Tool descriptions drive agent invocation decisions — write them to be clear and specific, with concrete use cases
- Authentication at the transport layer: API Key + scopes + rate limiting, all three are mandatory
- Error handling should classify types and provide recovery suggestions: tell the agent "what went wrong" and "what to do next"
- Three-layer testing strategy: tool logic → schema validation → integration tests
Prepared by AgentList.