Understanding AI Agents: From Concept to Production
My Notes on Agentic AI - Fundamentals
Introduction
AI agents are rapidly transforming how we interact with technology, moving beyond simple task execution to complex problem-solving. In this comprehensive guide, we’ll explore what AI agents are, the core components that power them, essential design principles, various architectural patterns, and practical considerations for building and deploying them effectively.
What We’ll Cover
In this blog post, we’ll dive deep into:
- What AI Agents Are: Understanding the fundamental concepts and components
- Design Principles: The key principles for building effective AI agents
- Design Patterns: Essential patterns like tool use, agentic RAG, and multi-agent collaboration
- Production Deployment: Practical considerations for real-world deployment
- Frameworks and Protocols: Overview of agentic frameworks and the Model Context Protocol
Whether you’re a developer looking to build intelligent systems, a product manager exploring AI capabilities, or simply curious about the future of AI agents, this post will provide valuable insights into this exciting field.
What Are AI Agents?
At their core, AI agents are sophisticated systems designed to understand user requests, create plans to fulfill those tasks, and execute actions to complete them. They represent a significant evolution from simple AI models to intelligent entities capable of autonomous decision-making and task execution.
Core Components of AI Agents
The fundamental parts of an AI agent include:
- Large Language Model (LLM): This is the reasoning engine behind AI agents. It enables them to identify user-requested tasks, formulate a plan to complete them, and perform the necessary actions.
- Memory: AI agents utilize memory to improve over time. This can be short-term memory (the immediate conversation context) or long-term memory (a collection of data for ongoing improvement). For example, an agent might remember your toothpaste flavor preference from long-term memory or the current status of your teeth brushing from short-term memory.
- Tools: These are external services, APIs, functions, or data sources that allow agents to perform actions or gather information. An agent uses its LLM to recognize a task, identify needed tools, and use memory to collect necessary data.
```mermaid
graph TD
    A[AI Agent] --> B[Large Language Model]
    A --> C[Memory System]
    A --> D[Tools & APIs]
    B --> E[Task Identification]
    B --> F[Plan Formulation]
    B --> G[Action Execution]
    C --> H[Short-term Memory]
    C --> I[Long-term Memory]
    H --> J[Current Context]
    I --> K[User Preferences]
    I --> L[Historical Data]
    D --> M[External Services]
    D --> N[Functions]
    D --> O[Data Sources]
```
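To make the diagram above concrete, here is a minimal, framework-agnostic sketch of how the three components fit together in a single loop. The `call_llm` function, the tool registry, and the brushing example are illustrative stand-ins, not any specific vendor API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """Toy agent: an LLM for reasoning, memory for context, tools for actions."""
    call_llm: Callable[[str], str]                         # hypothetical reasoning engine
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)
    short_term: list[str] = field(default_factory=list)    # current conversation
    long_term: dict[str, str] = field(default_factory=dict)  # e.g. user preferences

    def run(self, user_request: str) -> str:
        self.short_term.append(f"user: {user_request}")
        # 1. Task identification and plan formulation happen inside the LLM prompt.
        prompt = (
            f"Known preferences: {self.long_term}\n"
            f"Conversation: {self.short_term}\n"
            f"Available tools: {list(self.tools)}\n"
            "Decide which tool to call."
        )
        tool_name = self.call_llm(prompt)                   # 2. action selection
        result = self.tools.get(tool_name, lambda: "no tool needed")()
        self.short_term.append(f"agent: {result}")          # 3. remember what happened
        return result

# Usage with a stubbed LLM that always picks the same tool.
agent = Agent(call_llm=lambda _prompt: "check_brushing_status",
              tools={"check_brushing_status": lambda: "2 minutes left"})
agent.long_term["toothpaste_flavor"] = "mint"
print(agent.run("How is my brushing going?"))
```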
Principles for Building Good AI Agents
Three key principles contribute to effective AI agent design:
- Space: This refers to the agent’s environment. Good agents should connect events, people, and knowledge, be easily discoverable, and seamlessly transition between foreground and background use based on user needs. Clear instructions on usage and limitations should be provided, perhaps in the UI/UX.
- Time: This principle emphasizes how an agent operates over time with a user, allowing for improvement through reflection and connection to past events. Displaying past prompts and interactions in the UI helps users understand the agent’s long-term context.
- Core: This involves embracing uncertainty inherent in LLM-driven planning. Building trust and transparency is crucial, empowering users with visible controls and feedback mechanisms, much like video playback controls (pause, play, captions).
Key Design Patterns for AI Agents
Several design patterns enable AI agents to perform complex tasks and work collaboratively:
Tool Use Design Pattern
- While LLMs are powerful on their own, completing user tasks often requires external tools such as a calculator, a flight-status API, or a currency-exchange function.
- Agents can combine multiple tools to automate complex workflows, like analyzing an email, retrieving knowledge base information, and forwarding it to customer service.
- Tools can be configured so the agent calls them automatically, or marked as required for specific scenarios.
- LLMs’ ability to interpret natural language means developers don’t need to define every specific condition (e.g., “is a city in Europe?”) if the LLM’s base training data covers it.
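The email-triage workflow mentioned above can be sketched as a simple chain of tool calls. All three functions below are hypothetical placeholders (in a real agent each would wrap an LLM call, a retrieval step, or an external API):

```python
def analyze_email(body: str) -> str:
    """Stand-in for an LLM call that extracts the customer's issue."""
    return "refund request"

def search_knowledge_base(issue: str) -> str:
    """Stand-in for a retrieval step against an internal knowledge base."""
    return f"KB article: how to process a {issue}"

def forward_to_support(issue: str, context: str) -> str:
    """Stand-in for the action that files a ticket with customer service."""
    return f"Ticket created for '{issue}' with context: {context}"

def handle_incoming_email(body: str) -> str:
    issue = analyze_email(body)                 # tool 1: understand the task
    context = search_knowledge_base(issue)      # tool 2: gather information
    return forward_to_support(issue, context)   # tool 3: take the action

print(handle_incoming_email("Hi, I'd like my money back for order #123."))
```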
Agentic Retrieval Augmented Generation (RAG)
- Basic RAG allows LLMs to retrieve relevant information from a database and add it to the conversation context, providing more relevant responses beyond just training data.
- Agentic RAG enhances this by enabling the agent to analyze the query, create a plan of smaller tasks using data sources and tools, and verify if retrieved information is sufficient. If not, it can repeat the tool-calling process.
- A key benefit is the agent’s ability to maintain long-term memory of this process, recalling previous attempts to improve future interactions.
```mermaid
flowchart TD
    Q[User Query] --> AR{Agentic RAG}
    AR --> A[Analyze Query]
    A --> P[Create Plan]
    P --> RT[Retrieve from Tools/Data]
    RT --> V{Sufficient Info?}
    V -->|No| P
    V -->|Yes| G[Generate Response]
    G --> M[Update Memory]
    M --> R[Return Response]
    subgraph "Memory System"
        M --> STM[Short-term Memory]
        M --> LTM[Long-term Memory]
        LTM --> P
    end
    subgraph "Data Sources"
        RT --> DB[(Database)]
        RT --> API[External APIs]
        RT --> KB[Knowledge Base]
    end
```
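A minimal sketch of the loop in the flowchart above is shown below. The `retrieve`, `is_sufficient`, and `generate` functions are hypothetical stand-ins for a vector-store query, an LLM judgment, and an LLM completion:

```python
def retrieve(query: str, attempt: int) -> list[str]:
    # Simulate a first attempt that finds nothing and a retry that succeeds.
    corpus = {0: [], 1: ["Policy doc: refunds take 5-7 business days."]}
    return corpus.get(attempt, [])

def is_sufficient(query: str, docs: list[str]) -> bool:
    return len(docs) > 0          # a real agent would ask the LLM to verify

def generate(query: str, docs: list[str]) -> str:
    return f"Answer to '{query}' grounded in {len(docs)} document(s)."

def agentic_rag(query: str, max_attempts: int = 3) -> str:
    memory: list[str] = []        # long-term record of what was tried
    for attempt in range(max_attempts):
        docs = retrieve(query, attempt)
        memory.append(f"attempt {attempt}: retrieved {len(docs)} docs")
        if is_sufficient(query, docs):
            return generate(query, docs)
    return "Could not find enough information."

print(agentic_rag("How long do refunds take?"))
```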
Planning Design Pattern
- This pattern explicitly has an AI agent list the subtasks that constitute a more complex task. For example, a travel plan can be broken into booking flights, hotels, and activities.
- It’s especially impactful with multiple agents, where subtasks can be completed by separate agents or processes.
- The agent’s output can be structured (e.g., JSON) and validated to ensure all necessary information is present for subsequent steps or other agents.
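One way to validate a planner agent's structured output is with pydantic (v2); the JSON here is hard-coded for illustration, whereas in practice it would come back from the LLM:

```python
from pydantic import BaseModel

class Subtask(BaseModel):
    assigned_agent: str      # e.g. "flight_booking", "hotel_booking"
    description: str

class TravelPlan(BaseModel):
    main_task: str
    subtasks: list[Subtask]

raw = """
{
  "main_task": "Plan a trip to Paris",
  "subtasks": [
    {"assigned_agent": "flight_booking", "description": "Book round-trip flights"},
    {"assigned_agent": "hotel_booking", "description": "Reserve a hotel for 4 nights"},
    {"assigned_agent": "activities", "description": "Suggest two museum visits"}
  ]
}
"""

plan = TravelPlan.model_validate_json(raw)   # raises if required fields are missing
for task in plan.subtasks:
    print(f"{task.assigned_agent}: {task.description}")
```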
Multi-Agent Design Pattern
This involves multiple AI agents working together to achieve a common goal. Different patterns apply based on their objective:
- Group Chat: Messages are broadcast to all agents, and a “group chat manager” (often another agent) selects the appropriate agent to handle the task (e.g., separate agents for booking, complaints, or flight status in customer service).
- Handoff Pattern: Agents complete sequential steps in a defined workflow, passing the task to the next agent upon completion.
- Collaborative Filtering: Each agent acts as a specialist, responding to a task in its unique way, ideal for analysis requiring diverse inputs or perspectives.
Across these patterns, agents can refine results by giving each other feedback, often producing better suggestions than a single agent working alone.
```mermaid
graph LR
    subgraph "Group Chat Pattern"
        GM[Group Chat Manager] --> A1[Booking Agent]
        GM --> A2[Complaint Agent]
        GM --> A3[Status Agent]
        U1[User Query] --> GM
    end
    subgraph "Handoff Pattern"
        U2[User Request] --> B1[Agent 1]
        B1 --> B2[Agent 2]
        B2 --> B3[Agent 3]
        B3 --> R1[Final Result]
    end
    subgraph "Collaborative Filtering"
        U3[Complex Task] --> C1[Specialist A]
        U3 --> C2[Specialist B]
        U3 --> C3[Specialist C]
        C1 --> AG[Aggregator]
        C2 --> AG
        C3 --> AG
        AG --> R2[Refined Result]
    end
```
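As a concrete (and deliberately simplified) example of the group-chat pattern, here is a manager that routes each message to one specialist agent. The keyword rule stands in for what would normally be an LLM routing decision, and the specialist agents are placeholder functions:

```python
SPECIALISTS = {
    "booking": lambda msg: f"Booking agent handling: {msg}",
    "complaint": lambda msg: f"Complaint agent handling: {msg}",
    "status": lambda msg: f"Status agent handling: {msg}",
}

def group_chat_manager(message: str) -> str:
    # A real manager would ask an LLM which agent fits; keywords stand in here.
    if "refund" in message or "unhappy" in message:
        choice = "complaint"
    elif "delayed" in message or "status" in message:
        choice = "status"
    else:
        choice = "booking"
    return SPECIALISTS[choice](message)

print(group_chat_manager("Is flight BA123 delayed?"))
print(group_chat_manager("I'm unhappy and want a refund."))
```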
Metacognition Design Pattern
- Metacognition in AI agents means “thinking about thinking”. Agents use data and analysis to identify errors, make improvements in their planning and responses, and adapt over time.
- This enables agents to be more transparent in their reasoning, adapt to changing environments, and improve data interpretation.
- An agent can reflect on decisions (e.g., how it defines the “best” flight) and incorporate user preferences for future decisions, requiring the system to gather, store, and retrieve feedback.
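The flight example above can be sketched as a single reflection pass: the agent drafts an answer, critiques it against stored user preferences, and revises it. Both steps would normally be LLM calls; the strings and the `preferences` dictionary here are illustrative assumptions:

```python
def draft_answer(task: str) -> str:
    return "Best flight: the cheapest one."

def critique(task: str, answer: str, preferences: dict) -> str | None:
    if preferences.get("priority") == "fewest stops" and "cheapest" in answer:
        return "User prefers fewest stops over price."
    return None   # no issue found

def solve_with_reflection(task: str, preferences: dict) -> str:
    answer = draft_answer(task)
    issue = critique(task, answer, preferences)
    if issue:
        answer = f"Best flight: the direct option ({issue.lower()})"
    return answer

print(solve_with_reflection("Find the best flight to Tokyo",
                            {"priority": "fewest stops"}))
```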
Building Trustworthy and Effective AI Agents
Key Architectural Considerations
- System Message Framework: This framework is crucial for setting clear and specific instructions for AI agents. It involves using a basic system message as a template to generate more detailed prompts, covering responsibilities, tone, style, and interaction instructions. This promotes iterative prompt building and scalability across multiple agents.
- Human-in-the-Loop Architecture: For scenarios requiring human approval or intervention, this architecture integrates a human user into the multi-agent cooperation workflow. Agents can be instructed to perform specific actions when a human input like “approve” is detected.
- Security and Privacy: Considerations include ensuring agents have only the required access to tools and data. It’s recommended to exercise caution with databases in production due to potential vulnerabilities, even with read-only access. Overall, LLM security requires careful measures.
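A minimal sketch of the system message framework, combined with the "approve" convention from the human-in-the-loop point above: a base template is expanded into a concrete system prompt per agent. The template text and agent details are illustrative only.

```python
BASE_TEMPLATE = """You are {agent_name}, an assistant for {domain}.
Responsibilities: {responsibilities}
Tone and style: {tone}
When a human types "approve", proceed with the pending action; otherwise wait."""

def build_system_message(agent_name: str, domain: str,
                         responsibilities: str, tone: str) -> str:
    return BASE_TEMPLATE.format(agent_name=agent_name, domain=domain,
                                responsibilities=responsibilities, tone=tone)

print(build_system_message(
    agent_name="TravelBooker",
    domain="corporate travel",
    responsibilities="search flights, hold reservations, never purchase without approval",
    tone="concise and professional",
))
```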
Deploying AI Agents to Production
Getting AI agents ready for real-world use requires rigorous evaluation at every step of the system, not just the final output. Key evaluation points include:
- Initial request to the LLM (connection, response times, model selection).
- Agent’s ability to identify user intent and complete the requested task.
- Agent’s ability to identify and use the correct tools.
- Tool response to the agent’s request (errors, malformed responses, uptime issues).
- Collecting user feedback (e.g., thumbs up/down in UI, manual evaluation, or using LLMs to judge responses).
- Preparing for scenarios where external services might be down or return errors by implementing backup functions.
```mermaid
graph TD
    UR[User Request] --> LLM[LLM Connection]
    LLM --> UI[Intent Identification]
    UI --> TC[Tool Calling]
    TC --> TR[Tool Response]
    TR --> RG[Response Generation]
    RG --> UF[User Feedback]
    subgraph "Evaluation Points"
        E1[Connection & Response Times]
        E2[Intent Recognition Accuracy]
        E3[Tool Selection & Usage]
        E4[Tool Response Validation]
        E5[Feedback Collection]
        E6[Error Handling]
    end
    LLM -.-> E1
    UI -.-> E2
    TC -.-> E3
    TR -.-> E4
    UF -.-> E5
    subgraph "Backup Systems"
        TC --> BF[Backup Functions]
        TR --> EH[Error Handlers]
        BF -.-> E6
        EH -.-> E6
    end
```
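Two of the evaluation points above, tool latency and backup functions, can be combined in a small wrapper like the sketch below. The flight-status tools are hypothetical; in production the logs would feed your evaluation pipeline rather than `print`:

```python
import time

def primary_flight_status(flight: str) -> str:
    raise TimeoutError("upstream flight API is down")   # simulate an outage

def backup_flight_status(flight: str) -> str:
    return f"{flight}: status unavailable, using last cached value (on time)."

def call_with_fallback(flight: str) -> str:
    start = time.perf_counter()
    try:
        result = primary_flight_status(flight)
    except Exception as error:                 # log the failure for evaluation
        print(f"[eval] primary tool failed: {error!r}")
        result = backup_flight_status(flight)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"[eval] tool latency: {elapsed_ms:.1f} ms")   # response-time metric
    return result

print(call_with_fallback("BA123"))
```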
Agentic Frameworks and Model Context Protocol
Popular Agentic Frameworks
Several agentic frameworks assist in building AI agents by offering control over task management, contextual understanding, and agent collaboration. Examples include:
- Azure AI Agent Service: Designed for single agents, integrating well with existing Azure services.
- Semantic Kernel: An enterprise-focused framework supporting C#, Java, and Python, with various model connectors.
- Autogen: From Microsoft Research, focusing on enabling experimentation with the latest agentic research ideas.
Model Context Protocol (MCP)
The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to LLMs, often compared to USB-C for AI. It helps connect various resources beyond just code.
MCP interactions involve three key parts:
- Hosts: The environments or tools where MCP operations occur (e.g., VS Code).
- Clients: The tools used to interact with MCP servers (e.g., GitHub Copilot).
- Servers: Provide the core functionality, tools, and resources.
```mermaid
graph TB
    subgraph "MCP Architecture"
        H[Host Environment<br/>VS Code]
        C[Client<br/>GitHub Copilot]
        S[MCP Server]
        H <--> C
        C <--> S
    end
    subgraph "MCP Server Components"
        S --> T[Tools<br/>Functions & Actions]
        S --> R[Resources<br/>Data & Information]
        S --> P[Prompts<br/>Templates]
        S --> SA[Sampling<br/>Auto-calling]
    end
    T --> T1[Retrieve Data]
    T --> T2[Send Messages]
    T --> T3[Update Records]
    R --> R1[Text Files]
    R --> R2[Database Records]
    R --> R3[Images]
    P --> P1[Dynamic Input]
    P --> P2[Consistent Messaging]
```
MCP servers expose:
- Tools: Functions invoked by the model to perform actions (e.g., retrieving data, sending messages, updating records).
- Resources: Ways to expose information or data to the application (e.g., text, files, database records, images), which can reduce token usage and provide assets to the user.
- Prompts: User-controlled, predefined templates for interactions with the AI tool, allowing for dynamic input and consistent messaging.
- Sampling: A way for the server to ask the client to run an LLM call on its behalf (for example, to execute a command or summarize a resource), while keeping the current context in mind.
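To ground the tools/resources/prompts distinction, here is a small server sketch assuming the official MCP Python SDK's FastMCP helper; the decorator names and `run()` behavior reflect my understanding of that SDK and may differ across versions, so treat it as a sketch rather than a reference implementation:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def get_order_status(order_id: str) -> str:
    """Tool: an action the model can invoke."""
    return f"Order {order_id} is out for delivery."

@mcp.resource("orders://recent")
def recent_orders() -> str:
    """Resource: read-only data exposed to the client."""
    return "order-1, order-2, order-3"

@mcp.prompt()
def complaint_reply(customer_name: str) -> str:
    """Prompt: a reusable, user-controlled template."""
    return f"Draft a polite reply to {customer_name} about their complaint."

if __name__ == "__main__":
    mcp.run()   # serves over stdio by default in the SDK's examples
```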
Conclusion
Building AI agents is an evolving field that demands continuous experimentation to find the right balance between the model, prompt, and context to achieve optimal results. As we move forward, the combination of well-designed agents, robust frameworks, and standardized protocols like MCP will enable more sophisticated and reliable AI systems.
The future of AI agents lies not just in their individual capabilities, but in their ability to work together, learn from experience, and seamlessly integrate into our digital workflows. Whether you’re building a simple task automation agent or a complex multi-agent system, understanding these core concepts and patterns will be essential for success.
*This blog post covers the fundamental concepts of AI agents as of July 2025. The field continues to evolve rapidly, with new patterns and frameworks emerging regularly.*