## What We Built
| Page | What You Learned |
|---|---|
| Introduction | LLM vs agent distinction, ReAct loop with createAgent, weather agent with tool calling |
| Model Context Protocol | MCP architecture, 3-server split (KnowledgeBase, CustomerInfo, IncidentTicket), tool design principles |
| Agent Memory | Working memory (MemorySaver + thread_id), long-term memory (cross-session persistence), integration patterns |
## Key Takeaways
- Agents = LLM + tools + loop. A simple LLM call can’t look up data or take actions. Agents add tool calling with a reasoning loop — the LLM decides which tool to use, observes the result, and continues until it can answer.
- MCP decouples tools from agents. Instead of hardcoding tools inside agents, MCP servers expose them over HTTP. Any agent connects, discovers tools via `tools/list`, and calls them — no code changes when tools are added or updated.
- Split servers by domain. One monolithic MCP server with 15 tools is hard to maintain. Three servers (knowledge base, customer info, tickets) can be deployed, scaled, and owned independently. The agent connects to all of them via `MultiServerMCPClient`.
- The agent is stateless. `user_uuid` and `thread_id` are passed per call. All conversation state lives in the `MemorySaver` checkpointer, not the agent instance. One agent serves all users concurrently.
- User identity flows via headers, not tool params. The `x-user-uuid` header identifies the user at the HTTP layer. MCP servers read it from the request — the LLM never needs to pass it. Tools like `get_customer_info` take no user parameter.
- Tool design is the #1 reliability lever. Clear names, “when to use / when NOT to use” descriptions, and 1-3 parameters per tool. The LLM only sees descriptions — not your implementation.
- Memory has two tiers. Working memory (`MemorySaver` + `thread_id`) handles within-session state. Long-term memory (external store) handles cross-session persistence. Start with working memory; add long-term when needed.
## Production Checklist
- Tools exposed via MCP servers (not hardcoded in agent)
- Servers split by domain (separate deployment and ownership)
- Tool descriptions include “when to use” and “when NOT to use”
- Tool parameters kept simple (1-3 per tool)
- User identity via `x-user-uuid` header (not in tool schemas)
- `MemorySaver` with `thread_id` for session persistence
- Agent is stateless — `userId` and `threadId` passed per call
- Structured error responses from tools (never throw to agent)
- System prompt instructs agent to call `get_customer_info` on first turn
- Tool call logging for observability
## Learn More
### Specifications & Standards
- Model Context Protocol — MCP specification
- Anthropic Tool Use — Claude’s tool calling guide
- OpenAI Function Calling — GPT function calling
### Frameworks (used in this module)
- LangChain / LangGraph — Agent framework with `createAgent`, `MemorySaver`
- @langchain/mcp-adapters — MCP tool loading for LangChain
- @modelcontextprotocol/sdk — MCP server/client SDK
### Other Agent Frameworks
- CrewAI — Multi-agent orchestration
- AutoGen — Microsoft’s agent framework
- Semantic Kernel — Microsoft’s AI SDK
- Vercel AI SDK — `generateText` with `maxSteps` for agent loops
- Google ADK — Agent Development Kit
### Research
- Tool Space Interference — Microsoft Research on tool design at scale
- Stop Converting REST APIs to MCP — Tool consolidation patterns
### Memory
- Redis Agent Memory Server — Production memory with semantic search
- Mem0 — Managed memory layer
- Zep — Long-term memory with automatic extraction
- Managing Memory for AI Agents — O’Reilly report