Building Multi-Agent Systems with Google ADK: The Complete Step-by-Step Guide

Google’s Agent Development Kit is the same framework powering Agentspace and Google’s Customer Engagement Suite. This guide teaches you to build production-grade multi-agent systems with it — from your first agent to parallel specialist teams.

Contents

The Day One Agent Problem
Part 1: Understanding ADK’s Architecture
The Hierarchy Model
Part 2: Installation and Setup
Part 3: Your First Agent — One LlmAgent with Tools
Part 4: Tool Design — Plain Python Functions
Part 5: AgentTool — Agents as Tools
Part 6: SequentialAgent — Guaranteed-Order Pipelines
Part 7: ParallelAgent — Concurrent Specialist Teams
Part 8: LoopAgent — Iterative Refinement (Generator-Critic)
Part 9: The Complete Multi-Agent System
Part 10: Session State and Agent Communication
Part 11: Running and Debugging
Part 12: Deployment
The Architecture Mental Model
What You’ve Built
Resources

The Day One Agent Problem

Every AI agent project starts with an optimistic prompt: “You are a smart assistant. Handle everything the user asks.”

Three weeks later, that single agent is juggling 40 tools, a system prompt that’s 3,000 tokens long, and a reliability rate that drops with every new capability you add. The more it knows, the worse it performs at any one thing.

This is the monolith trap. And the solution — like in software architecture — is decomposition.

Instead of one agent that does everything, build a team of specialists that each do one thing exceptionally well, coordinated by an orchestrator that knows how to delegate. That’s exactly what multi-agent systems are designed for.

Google’s Agent Development Kit (ADK) was built for this exact pattern. Announced at Google Cloud NEXT 2025 and now open source, ADK aims to simplify end-to-end development of agents and multi-agent systems while giving developers flexibility and precise control. Critically, it is the same framework that powers agents inside Google products such as Agentspace and the Google Customer Engagement Suite (CES).

This guide teaches you every concept you need, with working code at every step.

Part 1: Understanding ADK’s Architecture

Before writing code, internalize the mental model. ADK is built around a handful of clean primitives that compose naturally.

The Agent is the fundamental worker unit, designed for a specific task. Agents can use language models (LlmAgent) for complex reasoning, or act as deterministic controllers of execution, called workflow agents (SequentialAgent, ParallelAgent, LoopAgent). Tools give agents abilities beyond conversation, letting them interact with external APIs, search for information, run code, or call other services.

The four agent types serve different roles:

Type	Powered by	Use when
`LlmAgent`	Gemini / any LLM	Reasoning, decision-making, dynamic responses
`SequentialAgent`	Deterministic	Fixed step-by-step pipelines
`ParallelAgent`	Deterministic	Independent tasks that can run concurrently
`LoopAgent`	Deterministic	Iterative refinement until a condition is met

The ADK empowers developers to get more reliable, sophisticated, multi-step behaviors from generative models. Instead of one complex prompt, ADK lets you build a flow of multiple, simpler agents that collaborate on a problem by dividing the work.

Why does this matter? Because specialized agents are more reliable at their specific tasks than one large, complex agent. It’s easier to fix or improve a small, specialized agent without breaking other parts of the system. Agents built for one workflow can be easily reused in others.

The Hierarchy Model

In ADK, you organize agents in a tree structure. A root coordinator sits at the top. Specialist sub-agents handle specific domains. Communication flows through three mechanisms: shared session state, LLM-driven delegation (agent transfer), and explicit invocation via AgentTool.

Root Coordinator (LlmAgent)
├── Specialist A (LlmAgent + tools)
├── Specialist B (LlmAgent + tools)
└── Workflow Orchestrator
    ├── Stage 1 Agent
    ├── Stage 2 Agent
    └── Stage 3 Agent
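
To make the tree concrete, here is a minimal sketch in plain Python. It is not ADK code: the Node class and render helper are hypothetical names used only for illustration, and treating the workflow orchestrator as a SequentialAgent is an assumption. It mirrors how a root coordinator owns its specialists and a workflow agent owns its stages.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an agent in the tree (not an ADK class)."""
    name: str
    kind: str
    children: list["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> list[str]:
    """Return one indented line per agent, walking the tree depth-first."""
    lines = [f"{'  ' * depth}{node.name} ({node.kind})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


tree = Node("root_coordinator", "LlmAgent", [
    Node("specialist_a", "LlmAgent"),
    Node("specialist_b", "LlmAgent"),
    # Assumption: the workflow orchestrator is a SequentialAgent.
    Node("workflow", "SequentialAgent", [
        Node("stage_1", "LlmAgent"),
        Node("stage_2", "LlmAgent"),
        Node("stage_3", "LlmAgent"),
    ]),
])

print("\n".join(render(tree)))
```

In ADK itself the parent-child links are declared through the sub_agents argument, as the later parts show.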

Part 2: Installation and Setup

ADK is available in Python, TypeScript, Go, and Java. We’ll use Python throughout.

# Create project and install ADK
mkdir travel-multi-agent && cd travel-multi-agent
python -m venv .venv && source .venv/bin/activate

pip install google-adk

# Set your Gemini API key
export GOOGLE_API_KEY="your_gemini_api_key_here"
# Get one free at: https://aistudio.google.com/app/apikey

Verify the install:

adk --version

ADK ships with a built-in developer UI you can launch for any project:

adk web                  # Launches the visual debugger at http://localhost:8000
adk run <agent_folder>   # CLI runner for a single agent package

The developer UI is one of ADK’s most practical advantages over other frameworks — every event, tool call, state change, and agent transfer is inspectable in real time without any extra instrumentation.

Part 3: Your First Agent — One LlmAgent with Tools

Let’s start minimal. A single LlmAgent with a tool teaches you the fundamental pattern before we add orchestration.

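# weather_agent/agent.py
# pip install google-adk

from google.adk.agents import LlmAgent
from google.adk.tools import google_search

# A minimal single agent
weather_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="weather_agent",
    description="Answers weather-related questions using Google Search.",
    instruction="""
    You are a helpful weather assistant.
    Always use the google_search tool to find current weather data.
    Provide concise, accurate answers including temperature, conditions,
    and any relevant weather warnings.
    """,
    tools=[google_search],
)

# The adk CLI looks for a module-level variable named root_agent
root_agent = weather_agent

Run it. The adk CLI loads an agent package rather than a single file, so keep agent.py in a folder (weather_agent/ here, an illustrative name) next to an __init__.py containing "from . import agent", then pass the folder name from its parent directory:

adk run weather_agent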

Three things are worth noting here. First, model="gemini-2.0-flash" sets the LLM — ADK natively supports all Gemini variants, and via LiteLLM integration you can swap in Claude, Mistral, or any open model with one line. Second, description is what other agents read when deciding whether to delegate to this agent — it’s the sub-agent’s job posting. Third, instruction is the system prompt — be specific and prescriptive.

Part 4: Tool Design — Plain Python Functions

ADK’s cleanest design decision: a plain Python function can serve as a tool. ADK builds the tool’s schema from the function name, type hints, and docstring and shows it to the model, so the function itself needs no wrappers, decorators, or SDK imports.

# tools.py

def search_flights(origin: str, destination: str, date: str) -> dict:
    """Search for available flights between two cities on a given date.
    
    Args:
        origin: Departure city (e.g. 'Mumbai')
        destination: Arrival city (e.g. 'London')
        date: Travel date in YYYY-MM-DD format
    
    Returns:
        dict with available flights and prices
    """
    # In production: wire to a real flights API (Amadeus, Skyscanner, etc.)
    return {
        "flights": [
            {"flight": "AI-101", "departure": "08:00", "price_usd": 850},
            {"flight": "AI-205", "departure": "14:30", "price_usd": 720},
        ],
        "origin": origin,
        "destination": destination,
        "date": date,
    }


def search_hotels(city: str, check_in: str, check_out: str) -> dict:
    """Search for hotels in a given city for given dates.
    
    Args:
        city: City name
        check_in: Check-in date YYYY-MM-DD
        check_out: Check-out date YYYY-MM-DD
    
    Returns:
        dict with available hotels and prices
    """
    return {
        "hotels": [
            {"name": "Grand Hotel", "stars": 5, "price_per_night_usd": 180},
            {"name": "City Suites", "stars": 4, "price_per_night_usd": 95},
        ],
        "city": city,
    }


# Each tool goes to the specialist that needs it — NOT to all agents
from google.adk.agents import LlmAgent

flight_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="flight_agent",
    description="Searches for available flights between cities.",
    instruction="You are a flights specialist. Use search_flights to find options.",
    tools=[search_flights],
)

hotel_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="hotel_agent",
    description="Finds and recommends hotel accommodations.",
    instruction="You are a hotel specialist. Use search_hotels to find options.",
    tools=[search_hotels],
)

The discipline here matters: give each tool to exactly the agent that needs it. Never give all tools to a coordinator. Tool overload is how monolith agents happen.
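
To see why type hints and docstrings matter, the following sketch derives a schema-like dictionary from a function using only the standard library. It illustrates the idea and is not ADK's actual parser: describe_tool and the layout of its output are hypothetical.

```python
import inspect


def describe_tool(func) -> dict:
    """Build a schema-like dict from a function's signature and docstring.

    Hypothetical helper for illustration; a real framework's schema
    generation is more thorough than this.
    """
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # The first docstring line serves as the short description.
        "description": doc.splitlines()[0] if doc else "",
        # Map each parameter name to the name of its annotated type.
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def search_hotels(city: str, check_in: str, check_out: str) -> dict:
    """Search for hotels in a given city for given dates."""
    return {"hotels": [], "city": city}


schema = describe_tool(search_hotels)
print(schema)
```

A function with a vague docstring or missing type hints produces an equally vague description, which is why each tool above documents its arguments explicitly.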

Part 5: AgentTool — Agents as Tools

The most powerful pattern in ADK: wrapping a sub-agent as a tool that the coordinator calls explicitly. This gives the coordinator full control over when each specialist runs, while keeping each specialist cleanly isolated.

# coordinator.py
from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool

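# flight_agent and hotel_agent are defined in tools.py above.
# Use "from .tools import ..." instead if both files live inside a package.
from tools import flight_agent, hotel_agent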

# Coordinator delegates to specialists via AgentTool
coordinator = LlmAgent(
    model="gemini-2.0-flash",
    name="travel_coordinator",
    description="Orchestrates travel planning by delegating to specialist agents.",
    instruction="""
    You are a travel planning coordinator.
    When users ask about travel:
    - Use the flight_agent tool for anything related to flights
    - Use the hotel_agent tool for anything related to accommodation
    - Synthesize both results into a coherent, complete travel plan
    - Present the plan clearly with costs and timings
    """,
    tools=[
        AgentTool(agent=flight_agent),
        AgentTool(agent=hotel_agent),
    ],
)

When the coordinator receives a request such as “Find a flight to Paris and a hotel”, it calls flight_agent, gets the result, then calls hotel_agent, gets that result, and synthesises both into a unified response. Control returns to the coordinator after every call, so it decides when each specialist runs and how their outputs are combined.
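
The call-and-return shape of this pattern can be sketched without any framework. In the toy version below, each specialist is a plain function and keyword matching stands in for the LLM's decision about which tool to call. All names here (flight_specialist, hotel_specialist, TOOLS, coordinate) are hypothetical.

```python
from typing import Callable


# Hypothetical stand-ins: each "specialist" is a plain function here.
def flight_specialist(request: str) -> str:
    return f"flights for: {request}"


def hotel_specialist(request: str) -> str:
    return f"hotels for: {request}"


# The coordinator sees specialists only as named tools.
TOOLS: dict[str, Callable[[str], str]] = {
    "flight_agent": flight_specialist,
    "hotel_agent": hotel_specialist,
}


def coordinate(request: str) -> dict[str, str]:
    """Call every tool whose keyword appears in the request.

    Keyword matching replaces the LLM's tool-selection step in this sketch.
    """
    keywords = {"flight_agent": "flight", "hotel_agent": "hotel"}
    results = {}
    for tool_name, keyword in keywords.items():
        if keyword in request.lower():
            # The result comes back to the coordinator, which keeps control.
            results[tool_name] = TOOLS[tool_name](request)
    return results


plan = coordinate("Find a flight to Paris and a hotel near the Louvre")
print(plan)
```

The point of the sketch is the control flow: every specialist result is returned to the caller, which then decides what to do next.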

Part 6: SequentialAgent — Guaranteed-Order Pipelines

Some workflows must run in strict order: you can’t summarise a document before fetching it. You can’t run a risk model before gathering market data. For these, SequentialAgent is the right primitive.

SequentialAgent is a workflow agent that executes its sub-agents in the order they appear in its sub_agents list, with no LLM deciding what runs next.

Here’s an equity analyst pipeline — research → risk assessment → report generation, guaranteed in that order:

# analyst_pipeline.py
from google.adk.agents import LlmAgent, SequentialAgent

def fetch_market_data(ticker: str) -> dict:
    """Fetch latest market data for a stock ticker."""
    return {"ticker": ticker, "price": 142.50, "volume": 1_200_000, "change_pct": 2.3}

def run_risk_model(data: dict) -> dict:
    """Run risk assessment on market data."""
    return {"risk_score": 0.42, "recommendation": "moderate_buy", "data": data}


# Step 1: Research — writes to session state via output_key
research_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="research_agent",
    description="Fetches and structures market data for analysis.",
    instruction="""Fetch market data for the requested ticker.
    Return structured data including price, volume, and daily change.""",
    tools=[fetch_market_data],
    output_key="market_data",        # ← writes result to session state
)

# Step 2: Risk — reads {market_data} from session state
risk_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="risk_agent",
    description="Runs risk assessment on the researched market data.",
    instruction="""Read the market data from {market_data} in session state.
    Run a risk assessment and produce a structured recommendation.""",
    tools=[run_risk_model],
    output_key="risk_assessment",
)

# Step 3: Report — synthesises both outputs
report_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="report_agent",
    description="Generates the final analyst report.",
    instruction="""Using the market data from {market_data} and risk assessment
    from {risk_assessment}, write a concise investment report with:
    - Executive summary
    - Key metrics
    - Risk rating
    - Recommendation""",
)

# SequentialAgent: guaranteed order, no LLM routing overhead
analyst_pipeline = SequentialAgent(
    name="equity_analyst_pipeline",
    sub_agents=[research_agent, risk_agent, report_agent],
)

The output_key parameter is how agents communicate through session state — a lightweight shared memory available to all agents in the tree during a single session. Agent B can read what Agent A wrote simply by referencing {agent_a_output_key} in its instruction.
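
The hand-off through state can be modelled in a few lines of plain Python. This is a simplified sketch, not ADK's implementation: render_instruction is a hypothetical helper, and the sample value is made up. It treats {key} as required and {key?} as optional, which is an assumption about how placeholder handling behaves.

```python
import re


def render_instruction(template: str, state: dict) -> str:
    """Replace {key} placeholders with values from state.

    Simplified model: {key} must exist in state, while {key?} is
    optional and becomes an empty string when the key is missing.
    """
    def substitute(match: re.Match) -> str:
        key, optional = match.group(1), match.group(2) == "?"
        if key in state:
            return str(state[key])
        if optional:
            return ""
        raise KeyError(f"state key not found: {key}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


state = {}
# Step 1 finishes; its final response is stored under its output_key.
state["market_data"] = "price 142.50, change 2.3%"
# Step 2's instruction is rendered against the shared state before it runs.
prompt = render_instruction("Assess risk for: {market_data}", state)
print(prompt)
```

The practical consequence is ordering: a step can only reference a key that an earlier step has already written, which is exactly what a sequential pipeline guarantees.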

Part 7: ParallelAgent — Concurrent Specialist Teams

When sub-tasks are independent of each other, there’s no reason to run them serially. ParallelAgent runs all sub-agents concurrently and collects their results before returning.

# parallel_research.py
from google.adk.agents import LlmAgent, ParallelAgent

def search_flights(origin: str, destination: str, date: str) -> dict:
    """Search flights between two cities."""
    return {"flights": [{"flight": "AI-101", "price_usd": 850}]}

def search_hotels(city: str, check_in: str, check_out: str) -> dict:
    """Search hotels in a city."""
    return {"hotels": [{"name": "Grand Hotel", "price_per_night_usd": 180}]}

def search_activities(city: str, date: str) -> dict:
    """Search top activities in a city."""
    return {"activities": ["Eiffel Tower", "Louvre Museum", "Seine River Cruise"]}


flight_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="flight_agent",
    description="Searches for flights.",
    instruction="Find flights for the given route and date.",
    tools=[search_flights],
    output_key="flight_results",
)

hotel_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="hotel_agent",
    description="Finds hotels.",
    instruction="Find hotels for the given city and dates.",
    tools=[search_hotels],
    output_key="hotel_results",
)

activities_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="activities_agent",
    description="Finds things to do.",
    instruction="Find top activities and attractions for the given city.",
    tools=[search_activities],
    output_key="activities_results",
)

# ParallelAgent: all three run concurrently instead of one after another
research_team = ParallelAgent(
    name="travel_research_team",
    sub_agents=[flight_agent, hotel_agent, activities_agent],
)

If each of the three searches takes about 3 seconds, running them one after another costs roughly 9 seconds, while running them concurrently takes about as long as the slowest one, roughly 3 seconds. For any workflow whose steps are independent, ParallelAgent is the right choice.
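
The timing argument is easy to check with the standard library. The sketch below uses asyncio.sleep as a stand-in for a slow, independent API call and compares serial against concurrent execution. The function names and the 0.1-second delay are illustrative choices, not part of ADK.

```python
import asyncio
import time

NAMES = ("flights", "hotels", "activities")


async def fake_search(name: str, delay: float) -> str:
    """Stand-in for a slow, independent API call."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_serial(delay: float) -> list[str]:
    # Each call waits for the previous one to finish.
    return [await fake_search(name, delay) for name in NAMES]


async def run_concurrent(delay: float) -> list[str]:
    # All three calls are in flight at the same time.
    return list(await asyncio.gather(*(fake_search(name, delay) for name in NAMES)))


def timed(coro_factory, delay: float) -> tuple[list[str], float]:
    """Run the coroutine and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = asyncio.run(coro_factory(delay))
    return result, time.perf_counter() - start


serial_result, serial_time = timed(run_serial, 0.1)
concurrent_result, concurrent_time = timed(run_concurrent, 0.1)
print(f"serial: {serial_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

Both versions return the same results in the same order; only the wall-clock time differs. The gain disappears as soon as one step needs another step's output, which is the case SequentialAgent covers.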

Part 8: LoopAgent — Iterative Refinement (Generator-Critic)

Some outputs improve with iteration. A first-draft blog post benefits from a critic pass. A travel itinerary improves when checked against constraints. LoopAgent implements this generator-critic pattern: it loops through its sub-agents repeatedly until one of them triggers an escalate signal or max_iterations is reached.

# refinement_loop.py
from google.adk.agents import LlmAgent, LoopAgent
from google.adk.tools.tool_context import ToolContext


def exit_loop(tool_context: ToolContext) -> dict:
    """Call this only when the draft scores 8 or above, to stop the loop."""
    # An LLM cannot end the loop by writing "escalate=true" in its reply;
    # the escalate flag has to be set on the tool context's actions.
    tool_context.actions.escalate = True
    return {}


# Writer produces or revises the draft
writer_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="writer_agent",
    description="Writes or revises the content draft.",
    # The trailing ? marks a state key as optional, so the first pass does
    # not fail on keys that have not been written yet.
    instruction="""
    If there is no draft yet, write an initial blog post based on the topic.
    If there is a draft, revise it based on the critic's feedback.
    Output the improved draft.

    Current draft: {current_draft?}
    Critic feedback: {critic_feedback?}
    """,
    output_key="current_draft",
)

# Critic reviews and decides whether to continue or finish
critic_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="critic_agent",
    description="Reviews content quality and decides whether to continue iterating.",
    instruction="""
    Review the draft in {current_draft}. Score it from 1-10 for:
    clarity, accuracy, engagement, and SEO value.
    Provide specific, actionable improvement notes.
    If the overall score is 8 or above, call the exit_loop tool to finish.
    Otherwise return your notes so the writer can revise.
    """,
    tools=[exit_loop],
    output_key="critic_feedback",
)

# Loops until exit_loop sets escalate or max_iterations is reached
content_refinement_loop = LoopAgent(
    name="content_refinement_loop",
    sub_agents=[writer_agent, critic_agent],
    max_iterations=5,
)

This maps directly onto production use cases: report generation with quality gates, code generation with test-run feedback, regulatory documents with compliance checks.
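
The control flow of a generator-critic loop can be reduced to a few lines of plain Python. The sketch below is a framework-free model with hypothetical names (run_loop, write, critique) and a toy scoring rule based on word count. It shows the two exit conditions: the critic signalling completion, or the iteration budget running out.

```python
def run_loop(write, critique, max_iterations: int = 5) -> tuple[str, int]:
    """Alternate writer and critic until the critic escalates or the budget ends.

    Returns the final draft and the number of iterations that ran.
    """
    draft, feedback, iterations_run = "", "", 0
    for _ in range(max_iterations):
        iterations_run += 1
        draft = write(draft, feedback)
        feedback, escalate = critique(draft)
        if escalate:
            break
    return draft, iterations_run


def write(draft: str, feedback: str) -> str:
    # The first pass creates a draft; later passes extend it.
    return "Paris in three days." if not draft else draft + " Added detail."


def critique(draft: str) -> tuple[str, bool]:
    # Toy quality gate: the draft passes once it reaches 8 words.
    words = len(draft.split())
    return f"{words} words so far", words >= 8


final_draft, iterations = run_loop(write, critique)
print(iterations, final_draft)
```

Setting max_iterations lower than the number of passes the critic needs shows the second exit path: the loop stops at the budget and returns whatever draft exists at that point.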

Part 9: The Complete Multi-Agent System

Now compose every pattern into one production system: a travel planner that runs research in parallel, refines the itinerary through a writer-critic loop, then validates before delivery.

# travel_planner.py — full production multi-agent system
from google.adk.agents import LlmAgent, SequentialAgent, ParallelAgent, LoopAgent
from google.adk.tools.agent_tool import AgentTool


# ── Tool functions ────────────────────────────────────────────────────────────

def search_flights(origin: str, destination: str, date: str) -> dict:
    """Search flights between two cities."""
    return {"flights": [{"flight": "AI-101", "price_usd": 850}]}

def search_hotels(city: str, check_in: str, check_out: str) -> dict:
    """Search hotels in a city."""
    return {"hotels": [{"name": "Grand Hotel", "price_per_night_usd": 180}]}

def search_activities(city: str, date: str) -> dict:
    """Search top attractions in a city."""
    return {"activities": ["Eiffel Tower", "Louvre Museum"]}

def validate_itinerary(itinerary: str) -> dict:
    """Validate an itinerary for conflicts and completeness."""
    return {"valid": True, "issues": []}


# ── Stage 1: Parallel research team ──────────────────────────────────────────

flight_agent    = LlmAgent(model="gemini-2.0-flash", name="flight_agent",
    description="Searches for available flights.",
    instruction="Find flights for the given route and date.",
    tools=[search_flights], output_key="flight_results")

hotel_agent     = LlmAgent(model="gemini-2.0-flash", name="hotel_agent",
    description="Finds hotels.",
    instruction="Find hotels for the city and dates.",
    tools=[search_hotels], output_key="hotel_results")

activities_agent = LlmAgent(model="gemini-2.0-flash", name="activities_agent",
    description="Recommends activities and attractions.",
    instruction="Find top activities for the city.",
    tools=[search_activities], output_key="activities_results")

research_team = ParallelAgent(
    name="research_team",
    sub_agents=[flight_agent, hotel_agent, activities_agent],
)

# ── Stage 2: Writer-critic refinement loop ────────────────────────────────────

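from google.adk.tools.tool_context import ToolContext

def exit_loop(tool_context: ToolContext) -> dict:
    """Call this only when the itinerary scores 8 or above, to stop the loop."""
    # Escalation must be set through the tool context; text output cannot do it.
    tool_context.actions.escalate = True
    return {}

writer_agent = LlmAgent(model="gemini-2.0-flash", name="itinerary_writer",
    description="Drafts a travel itinerary from research results.",
    instruction="""Using the research results below, compose a detailed
    3-day travel itinerary. On revision rounds, apply the critic feedback.
    Flights: {flight_results}
    Hotels: {hotel_results}
    Activities: {activities_results}
    Critic feedback: {critic_feedback?}""",
    output_key="itinerary_draft")

critic_agent = LlmAgent(model="gemini-2.0-flash", name="itinerary_critic",
    description="Reviews the itinerary for quality.",
    instruction="""Review the itinerary in {itinerary_draft}.
    Check for: logical flow, realistic timing, missing essentials.
    Score 1-10. If score >= 8, call the exit_loop tool.""",
    tools=[exit_loop],
    output_key="critic_feedback")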

refinement_loop = LoopAgent(
    name="itinerary_refinement",
    sub_agents=[writer_agent, critic_agent],
    max_iterations=3,
)

# ── Stage 3: Validation ───────────────────────────────────────────────────────

validator_agent = LlmAgent(model="gemini-2.0-flash", name="validator_agent",
    description="Validates the final itinerary.",
    instruction="""Validate the itinerary in {itinerary_draft} using the
    validate_itinerary tool. Return the validation result.""",
    tools=[validate_itinerary],
    output_key="validation_result")

# ── Full pipeline: Research → Refine → Validate ───────────────────────────────

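travel_planner = SequentialAgent(
    name="travel_planner",
    sub_agents=[research_team, refinement_loop, validator_agent],
)

# The adk CLI looks for a module-level variable named root_agent
root_agent = travel_planner

Run this with the code saved as agent.py inside a package folder (travel_planner/ here, next to an __init__.py containing "from . import agent"), from the folder's parent directory:

adk run travel_planner
# Or test with the web UI and pick the agent from its list:
adk web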

The architecture: Research (parallel) → Refinement Loop (quality gates) → Validation (safety check) → Final output. Each stage is independently testable, swappable, and improvable without touching the others.

Part 10: Session State and Agent Communication

The mechanism agents use to pass data between each other in ADK is session state — a shared key-value store available within a single conversation session. output_key on an LlmAgent writes the agent’s final response to a state key. Any downstream agent can read it via {key_name} interpolation in its instruction.

This is the recommended pattern for SequentialAgent pipelines. For AgentTool invocations, the result is returned inline to the calling coordinator — no state write needed.

For cross-session persistence (memory that survives across different user conversations), ADK provides a Memory component separate from State. Think of State as session RAM and Memory as persistent storage.
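
The RAM-versus-storage analogy can be shown with two small classes. This is a conceptual sketch only: MemoryStore and Session are hypothetical names and do not correspond to ADK's session or memory services.

```python
class MemoryStore:
    """Hypothetical long-lived store that outlives any single session."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = {}

    def remember(self, user_id: str, fact: str) -> None:
        self._facts.setdefault(user_id, []).append(fact)

    def recall(self, user_id: str) -> list[str]:
        return list(self._facts.get(user_id, []))


class Session:
    """Hypothetical session whose state starts empty every time."""

    def __init__(self, user_id: str, memory: MemoryStore) -> None:
        self.user_id = user_id
        self.state: dict[str, str] = {}
        self.memory = memory


memory = MemoryStore()

first = Session("user-1", memory)
first.state["destination"] = "Paris"                    # visible only in this session
memory.remember("user-1", "prefers morning flights")    # survives the session

second = Session("user-1", memory)
print(second.state)                # {}
print(memory.recall("user-1"))     # ['prefers morning flights']
```

The second session starts with empty state but can still recall what was stored in memory, which is the distinction the paragraph above draws.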

Reference: Sessions & Memory — ADK Docs

Part 11: Running and Debugging

ADK’s developer tooling is one of its strongest differentiators.

# Run interactively in the terminal (pass the agent package folder)
adk run travel_planner

# Launch the visual dev UI (inspect events, state, tool calls)
adk web

# Evaluate against test datasets
adk eval travel_planner eval_dataset.json

The web UI shows every Event in the execution tree: which agent ran, which tools were called, what was written to state, and how long each step took. For multi-agent systems with 5+ agents, this is invaluable for debugging delegation failures and unexpected routing.
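
One way to think about checking execution paths is to compare the tool calls an agent actually made against the calls a test case expects. The sketch below is a hypothetical, framework-free illustration of that idea. EvalCase and trajectory_score are made-up names, the recorded runs are invented data, and none of it reflects the file format or metrics that adk eval uses.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    """Hypothetical test case: a query plus the tool calls we expect."""
    query: str
    expected_tools: list[str]


def trajectory_score(cases: list[EvalCase], actual: dict[str, list[str]]) -> float:
    """Fraction of cases whose recorded tool calls match exactly, in order."""
    if not cases:
        return 0.0
    hits = sum(1 for case in cases if actual.get(case.query, []) == case.expected_tools)
    return hits / len(cases)


cases = [
    EvalCase("flights to Paris", ["search_flights"]),
    EvalCase("plan a Paris trip", ["search_flights", "search_hotels"]),
]
# Recorded runs (made-up data): the second run skipped the hotel search.
recorded = {
    "flights to Paris": ["search_flights"],
    "plan a Paris trip": ["search_flights"],
}
print(trajectory_score(cases, recorded))  # 0.5
```

A score below 1.0 points at a routing problem, such as a specialist that was never invoked, which is the kind of failure the event view in the web UI helps to trace.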

Part 12: Deployment

When your agent is production-ready, ADK provides first-class deployment to Google Cloud:

# Deploy to Vertex AI Agent Engine (managed, auto-scaling)
adk deploy agent_engine --project YOUR_GCP_PROJECT --region YOUR_REGION --staging_bucket gs://YOUR_BUCKET travel_planner

# Or containerise for Cloud Run
adk deploy cloud_run --project YOUR_GCP_PROJECT --region YOUR_REGION travel_planner

ADK’s architecture includes several production-focused features: direct integration with Vertex AI Agent Engine, support for containerised deployment, pre-built connectors to enterprise systems and databases like AlloyDB, BigQuery, and NetApp, bidirectional streaming support for real-time audio and video interactions, and built-in frameworks to assess response quality and execution paths.

References: Deploy to Agent Engine, Deploy to Cloud Run

The Architecture Mental Model

USER QUERY
     │
     ▼
┌─────────────────────────────────────────────────────────────┐
│  ROOT COORDINATOR (LlmAgent)                                │
│  Receives query → decides which agents/tools to invoke      │
└────────┬──────────────┬──────────────────────┬─────────────┘
         │              │                      │
         ▼              ▼                      ▼
  AgentTool A     AgentTool B           SequentialAgent
  (Specialist)    (Specialist)          └─ Step 1 Agent
                                        └─ Step 2 Agent
                                        └─ Step 3 Agent
                                                │
                                         ParallelAgent
                                         ├─ Worker A  ──┐
                                         ├─ Worker B  ──┤ → merged
                                         └─ Worker C  ──┘
                                                │
                                           LoopAgent
                                           ├─ Writer → draft
                                           └─ Critic → escalate?
                                                │
                                         FINAL RESPONSE

What You’ve Built

Walking through this guide, you’ve assembled the full ADK vocabulary: LlmAgent for reasoning specialists, SequentialAgent for guaranteed-order pipelines, ParallelAgent for concurrent research teams, LoopAgent for iterative refinement cycles, and AgentTool for explicit coordinator-to-specialist delegation.

The travel planner is a working template for any multi-agent system in production: research fast (parallel), draft well (loop), gate with quality checks (critic), validate before shipping (sequential). Swap the domain, adjust the tools, deploy to Vertex AI.

This is how Google builds its own production agent systems. Now it’s your framework too.

Resources

ADK Official Documentation — home of all ADK guides
ADK Python Quickstart — your first agent in 5 minutes
Multi-Agent Systems in ADK — patterns and primitives
Sequential Agents — guaranteed-order pipelines
Parallel Agents — concurrent execution
Loop Agents — iterative refinement
Sessions & Memory — state and cross-session persistence
Deploy to Agent Engine — Vertex AI deployment
Google Cloud Blog: Build Multi-Agentic Systems
ADK Technical Overview — deep dive on architecture

All code examples syntax-verified against Python 3.11. Install: pip install google-adk. Get a free Gemini API key at aistudio.google.com.
