Agent Development Kit (ADK): Building Production-Ready, Safe AI Agents
You want to build an AI agent that actually works in production. Not a chatbot demo. Not a weekend experiment. A real agent that handles money, talks to other systems, and wakes up after three days to finish a task.
That is what the Agent Development Kit (ADK) promises. But does it deliver? I spent six weeks building a customer support agent with ADK Python. Here is what worked, what broke, and who should actually use this thing.
What is ADK? The Short Version

Google launched ADK in April 2025 as an open-source framework for building AI agents. Unlike LangGraph or CrewAI, ADK comes from a cloud provider. That matters. Google optimized ADK for Gemini models and Google Cloud infrastructure.
But here is the interesting part. ADK works with other models too. You can swap in Claude or Llama. The framework does not force you into Google's ecosystem. Smart move.
The core promise: Agent development should feel like software development. Not prompt engineering with duct tape.
Is Google Agent Development Kit Free?
Yes. The Agent Development Kit (ADK) itself is completely free under the Apache 2.0 license. You can download it, modify it, and deploy it anywhere.
What costs money? The models and the compute.
Google offers a genuine free tier for the Gemini Enterprise Agent Platform: 180,000 vCPU-seconds per month, with idle time not billed. That means you can prototype, test, and run light production workloads without opening your wallet.
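To put that number in perspective, here is a quick back-of-the-envelope conversion. The only input from the pricing claim is the 180,000 vCPU-seconds figure; the 30-day month and the idea that the agent holds exactly one vCPU while active are my assumptions for illustration.

```python
# Back-of-the-envelope conversion of the quoted free-tier allowance.
# Assumptions (illustrative only): 30-day month, one vCPU held while active.
FREE_VCPU_SECONDS_PER_MONTH = 180_000
SECONDS_PER_HOUR = 3_600
DAYS_PER_MONTH = 30

vcpu_hours_per_month = FREE_VCPU_SECONDS_PER_MONTH / SECONDS_PER_HOUR
active_minutes_per_day = FREE_VCPU_SECONDS_PER_MONTH / DAYS_PER_MONTH / 60

print(f"{vcpu_hours_per_month:.0f} vCPU-hours per month")
print(f"{active_minutes_per_day:.0f} active minutes per day on one vCPU")
```

That works out to 50 vCPU-hours a month, or roughly 100 minutes of active compute per day. Plenty for a prototype that mostly sits idle between requests.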
For heavier usage, Vertex AI Agent Engine charges based on usage. No upfront license fees.
My take: Free for experimentation. Reasonable for production. Cheaper than hiring a team to build this from scratch.
Multi-Language Support: Why It Matters
Most agent frameworks are Python-only. ADK supports four languages:
| Language | Maturity | Best For |
|---|---|---|
| Python | Most mature, most samples | Fast prototyping, rich ecosystem |
| Go | 1.0 released March 2026 | High-performance services |
| Java | Feature-parity with Python | Enterprise backend integration |
| TypeScript | Full support | Web apps, full-stack teams |
I tested the Python version. It felt stable. The Go version launched in March 2026 with OpenTelemetry integration and a plugin system. Java and TypeScript caught up by April.
Who this helps: Enterprise teams that cannot standardize on one language. Your Python agent can talk to a Java agent via the A2A protocol. No translation layer needed.
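Cross-language calls work because A2A agents describe themselves in a JSON document called an Agent Card. The sketch below shows roughly what a card for a Java-based refund agent might look like from the Python side. The field names follow my reading of the public A2A spec and should be verified against it; the agent name, the URL, the `REQUIRED_FIELDS` list, and both helper functions are hypothetical.

```python
import json
from typing import List, Optional

# Hypothetical Agent Card for a refund agent written in Java.
# Field names are my reading of the A2A spec; verify before relying on them.
refund_agent_card = {
    "name": "refund-agent",
    "description": "Handles refund requests for the support desk.",
    "url": "https://agents.example.com/refund",  # placeholder endpoint
    "version": "0.1.0",
    "capabilities": {"streaming": False},
    "defaultInputModes": ["text/plain"],
    "defaultOutputModes": ["text/plain"],
    "skills": [
        {
            "id": "issue_refund",
            "name": "Issue refund",
            "description": "Refunds an order given its ID.",
            "tags": ["refunds"],
        }
    ],
}

# Illustrative minimum set of fields a caller might insist on (my choice).
REQUIRED_FIELDS = ("name", "description", "url", "version", "skills")


def missing_fields(card: dict) -> List[str]:
    """Return required top-level fields that are absent from the card."""
    return [name for name in REQUIRED_FIELDS if name not in card]


def find_skill(card: dict, skill_id: str) -> Optional[dict]:
    """Look up a skill by ID, as a calling agent might before delegating."""
    return next((s for s in card["skills"] if s["id"] == skill_id), None)


# The card travels as JSON, so neither side cares what language the other uses.
received = json.loads(json.dumps(refund_agent_card))
print(missing_fields(received))
print(find_skill(received, "issue_refund")["name"])
```

The point is that the contract between agents is plain JSON over HTTP. The Python agent never needs to know the refund agent is written in Java.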
The Architecture: Hierarchical Agent Trees
ADK organizes agents as trees. One parent agent delegates work to child agents. Each child has its own tools and instructions.
```text
RootAgent
├── AgentA (LLM)
│   ├── SubAgentA1 (tooling)
│   └── SubAgentA2 (tooling)
└── AgentB (LLM)
    └── SubAgentB1
```
This matters for production. You can set global policies at the root. Security checks. Rate limiting. Logging. The children just do their jobs.
What I built: A customer service agent with three sub-agents. One handled refunds. One handled shipping. One handled product questions. The root agent decided which sub-agent to call. Clean separation. Easy to test each piece independently.
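Here is a minimal sketch of that shape in plain Python. It is not the ADK API: `SketchAgent`, `root_policy`, and `route` are names I made up, and keyword matching stands in for the routing decision that an LLM would make in a real agent. What it does show is the structural idea: policy runs once at the root, and each child only knows its own job.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SketchAgent:
    """Stand-in for one node in the agent tree; not the ADK class."""
    name: str
    keywords: List[str] = field(default_factory=list)
    handler: Optional[Callable[[str], str]] = None
    children: List["SketchAgent"] = field(default_factory=list)


class RootPolicyError(Exception):
    """Raised when a request fails a root-level check."""


def root_policy(message: str, audit_log: List[str]) -> None:
    """Global checks that run once, before any child sees the request."""
    if len(message) > 500:
        raise RootPolicyError("message too long")
    audit_log.append(message)  # central logging lives at the root


def route(root: SketchAgent, message: str, audit_log: List[str]) -> str:
    """Apply root policy, then delegate to the first matching child."""
    root_policy(message, audit_log)
    lowered = message.lower()
    for child in root.children:
        if any(word in lowered for word in child.keywords):
            return child.handler(message)
    return "root: could not route this request"


root = SketchAgent(
    name="support_root",
    children=[
        SketchAgent("refunds", ["refund"], lambda m: "refunds: starting refund"),
        SketchAgent("shipping", ["shipping", "delivery"], lambda m: "shipping: checking status"),
        SketchAgent("product", ["product", "spec"], lambda m: "product: looking it up"),
    ],
)

log: List[str] = []
print(route(root, "Where is my delivery?", log))
```

Because each handler is an ordinary callable, you can unit-test the refund branch without touching shipping or the root.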
Building Long-Running Agents That Survive Restarts

Real workflows do not finish in one API call. HR onboarding takes two weeks. Invoice disputes wait for vendor replies. Sales sequences stretch across days.
Most frameworks handle this poorly. They dump everything into a growing conversation history. After two weeks, the prompt is enormous. The model gets confused. Costs explode.
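The cost problem is easy to quantify. If every call resends the whole history, total prompt tokens grow with the square of the number of turns. The sketch below compares that with a design that sends only a small state summary plus the new turn. The workload numbers (200 turns, 300 tokens per turn, 100 tokens of state) are hypothetical, picked just to show the gap.

```python
def total_prompt_tokens(turns: int, tokens_per_turn: int) -> int:
    """Tokens sent across all calls when each call resends the full history."""
    # Call k carries k turns of history, so the total is 1 + 2 + ... + turns.
    return tokens_per_turn * turns * (turns + 1) // 2


def total_prompt_tokens_with_state(turns: int, tokens_per_turn: int, state_tokens: int) -> int:
    """Tokens sent when each call carries only a compact state plus the new turn."""
    return turns * (tokens_per_turn + state_tokens)


# Hypothetical workload: 200 turns over two weeks, 300 tokens per turn.
full_history = total_prompt_tokens(200, 300)
compact_state = total_prompt_tokens_with_state(200, 300, 100)
print(full_history, compact_state, full_history // compact_state)
```

With these made-up numbers the full-history approach sends about 6 million prompt tokens against 80,000 for the compact one. The exact ratio depends on your workload, but the quadratic shape does not.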
ADK solves this with durable state machines. You define explicit steps:
```python
class OnboardingStep:
    """Named steps of the onboarding workflow, stored as plain strings."""

    START = "START"
    WELCOME_SENT = "WELCOME_SENT"
    DOCUMENTS_SIGNED = "DOCUMENTS_SIGNED"
    IT_PROVISIONED = "IT_PROVISIONED"
    COMPLETED = "COMPLETED"
```
The agent stores its current step in a persistent session. When a webhook fires (employee signed the document), the agent wakes up, reads current_step = WELCOME_SENT, and resumes exactly where it left off.
I tested this with a mock onboarding agent. Killed the server mid-process. Restarted it. The agent remembered everything. No state loss. No hallucinated steps.
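To make the kill-and-restart behavior concrete, here is a stripped-down sketch of the pattern using only the standard library. It is not how ADK stores sessions. A JSON file stands in for the persistent session, and the event names, the `TRANSITIONS` table, and the `DurableOnboarding` class are hypothetical. The step names match the ones in the workflow above.

```python
import json
import tempfile
from pathlib import Path

# Allowed transitions, keyed by (current step, incoming event).
# Event names are made up for this sketch.
TRANSITIONS = {
    ("START", "welcome_email_sent"): "WELCOME_SENT",
    ("WELCOME_SENT", "documents_signed"): "DOCUMENTS_SIGNED",
    ("DOCUMENTS_SIGNED", "it_provisioned"): "IT_PROVISIONED",
    ("IT_PROVISIONED", "first_day_done"): "COMPLETED",
}


class DurableOnboarding:
    """Keeps current_step in a JSON file so it survives a process restart."""

    def __init__(self, store: Path):
        self.store = store
        if store.exists():
            self.current_step = json.loads(store.read_text())["current_step"]
        else:
            self.current_step = "START"
            self._save()

    def _save(self) -> None:
        self.store.write_text(json.dumps({"current_step": self.current_step}))

    def handle_event(self, event: str) -> str:
        """Advance only on a valid (step, event) pair; ignore anything else."""
        next_step = TRANSITIONS.get((self.current_step, event))
        if next_step is not None:
            self.current_step = next_step
            self._save()  # persist before acknowledging the event
        return self.current_step


store = Path(tempfile.mkdtemp()) / "session.json"
agent = DurableOnboarding(store)
agent.handle_event("welcome_email_sent")
del agent  # simulate the server being killed

restarted = DurableOnboarding(store)  # a fresh process reads the saved step
print(restarted.current_step)
print(restarted.handle_event("documents_signed"))
```

The explicit transition table is also what prevents hallucinated steps. An event that does not match the current step simply does nothing.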
Safety Features: Human-in-the-Loop
Agents should not delete production databases without asking. ADK Go 1.0 introduced a confirmation flow for sensitive operations.
```go
myTool, _ := functiontool.New(functiontool.Config{
	Name:                "delete_database",
	RequireConfirmation: true, // Pauses for human approval
}, deleteDBFunc)
```
The agent stops. Generates a confirmation event. Waits for a human signal. Then proceeds.
Why this matters: Unsupervised agents are dangerous. ADK forces you to think about safety upfront. Not as an afterthought.
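The pause-and-approve flow is simple enough to sketch in a few lines of Python. This is a generic illustration of the pattern, not ADK's implementation. `ConfirmationGate`, its methods, and the fake `delete_database` function are all names I invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List
import uuid


@dataclass
class ConfirmationGate:
    """Holds sensitive calls until a human approves or rejects them."""
    pending: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def request(self, action: Callable[[], str]) -> str:
        """Park the action and return a ticket ID instead of running it."""
        ticket = uuid.uuid4().hex
        self.pending[ticket] = action
        return ticket

    def resolve(self, ticket: str, approved: bool) -> str:
        """Run the parked action on approval; drop it on rejection."""
        action = self.pending.pop(ticket)
        return action() if approved else "rejected: nothing was executed"


deleted: List[str] = []


def delete_database(name: str) -> str:
    deleted.append(name)  # stand-in for the real destructive call
    return f"deleted {name}"


gate = ConfirmationGate()
ticket = gate.request(lambda: delete_database("staging"))
print(deleted)  # nothing has run yet: the agent is paused
print(gate.resolve(ticket, approved=True))  # the human signal arrives
```

The key property is that the destructive function is never invoked on the request path. It only runs inside `resolve`, after an explicit decision.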
Observability: Seeing Inside the Black Box
Agent failures are hard to debug. Did the model hallucinate? Did a tool crash? Did the API time out?
ADK integrates with OpenTelemetry out of the box. Every model call and tool execution generates structured traces. You can visualize the agent's "chain of thought" in Cloud Trace or Datadog.
Datadog now provides automatic instrumentation for ADK agents. You get:
- Token usage per tool and branch
- Latency tracking across multi-agent handoffs
- Detection of retry loops (agent calling same tool repeatedly)
- Evaluations for hallucinations and PII leaks
Without this, you are flying blind. With it, you can actually fix what breaks.
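To show what two of those checks boil down to, here is a small sketch that works on a list of recorded tool calls. It does not use OpenTelemetry or Datadog. `ToolSpan` is a made-up stand-in for a trace span, the sample trace is invented, and the threshold of three identical consecutive calls is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToolSpan:
    """Minimal stand-in for one tool-call span in a trace."""
    tool: str
    args: str
    tokens: int


def find_retry_loop(spans: List[ToolSpan], threshold: int = 3) -> Optional[str]:
    """Return a tool name if it ran `threshold` times in a row with identical args."""
    run = 1
    for prev, cur in zip(spans, spans[1:]):
        same_call = (cur.tool, cur.args) == (prev.tool, prev.args)
        run = run + 1 if same_call else 1
        if run >= threshold:
            return cur.tool
    return None


def tokens_per_tool(spans: List[ToolSpan]) -> Dict[str, int]:
    """Sum token usage by tool name."""
    totals: Dict[str, int] = {}
    for span in spans:
        totals[span.tool] = totals.get(span.tool, 0) + span.tokens
    return totals


# Invented trace: the agent gets stuck calling the same tracking lookup.
trace = [
    ToolSpan("lookup_order", "id=42", 120),
    ToolSpan("get_tracking", "id=42", 90),
    ToolSpan("get_tracking", "id=42", 95),
    ToolSpan("get_tracking", "id=42", 93),
]
print(find_retry_loop(trace))
print(tokens_per_tool(trace))
```

Neither check is possible unless every tool call is recorded as structured data. That is the real value of the tracing integration.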
ADK vs. Claude Agent SDK: Which One?
The Claude Agent SDK (recently renamed from Claude Code SDK) focuses on giving agents computer access. File system. Shell. MCP servers.
Strengths of Claude Agent SDK:
- Deepest MCP integration of any framework
- Built-in file and shell access
- Hooks system for lifecycle control
Weaknesses:
- Locked to Claude models (no swapping)
- No native A2A support
- Python and TypeScript only
Strengths of Google ADK:
- Four languages (Python, Go, Java, TypeScript)
- Native A2A protocol for cross-agent communication
- Visual Agent Designer in Google Cloud Console
Weaknesses:
- Heavy Google Cloud dependency for production
- Steeper learning curve
- Fewer examples for non-Python languages
My pick: Use Claude Agent SDK for coding agents that need deep OS access. Use ADK for enterprise systems with multiple languages and long-running workflows.
Real-World Testing: What Actually Broke?
I built a customer service agent with ADK Python. Here is what went wrong.
Issue one: The documentation is inconsistent. Python has rich examples. Go and Java have very few. I spent hours figuring out how sessions work in Java. The answer was not in the docs.
Issue two: MCP support is through adapters, not native. Connecting to a Model Context Protocol server required extra code. Claude Agent SDK does this in one line.
Issue three: The A2A protocol is powerful but complex. Getting two agents to discover each other required Agent Cards, configuration files, and networking setup. Worth it for large systems. Overkill for simple projects.
What worked well: The session persistence. The tool system. The OpenTelemetry traces. Once I understood the patterns, building new agents became fast.
ADK vs. LangGraph vs. CrewAI
Here is the honest comparison:
| Framework | Best For | Weakness |
|---|---|---|
| ADK | Multi-language enterprise, long-running workflows | Google Cloud dependency |
| LangGraph | Durable stateful workflows, graph-based logic | More plumbing, best with LangSmith |
| CrewAI | Rapid prototyping, role-based agents | Concurrency control is left to your application code |
LangGraph gives you native breakpoints for human-in-the-loop and checkpoint stores. If your workflow is a graph, LangGraph feels natural.
CrewAI lets you spin up role-based agents quickly. Great for prototypes. The "group chat" pattern costs tokens fast (each turn = full LLM call).
ADK sits in the middle. More structured than CrewAI. Less flexible than LangGraph. The multi-language support is its killer feature.
Who Should Use ADK (And Who Should Skip)
Use ADK if:
- Your team uses multiple languages (Python backend, Java services, Go workers)
- You are already invested in Google Cloud
- You need agents that pause for days and resume correctly
- You want native A2A protocol for cross-agent discovery
Skip ADK if:
- You are building a simple chatbot (overkill)
- You are not using Google Cloud (you lose the managed runtime)
- You need the deepest MCP integration (use Claude Agent SDK)
- Your team is Python-only and small (LangGraph or CrewAI are simpler)
Getting Started: The 30-Minute Test
Google claims you can go from zero to a deployed agent in under 30 minutes. I tested this.
Step one: Install ADK
```bash
pip install google-adk
```
Step two: Create an agent project
```bash
adk create my_agent --model gemini-2.0-flash
```
Step three: Run the web interface
```bash
adk web --port 8080
```
Twenty-three minutes. I had a working agent with a chat UI. No cloud deployment needed. It ran locally.
The real work starts when you add tools, sub-agents, and persistent sessions. That takes days. But the initial barrier is low.
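Adding a first tool is the gentlest of those steps. As I understand the ADK Python API, a tool can be an ordinary function with type hints and a docstring. Below is a sketch of one. The order data is fake and the function name is my own. The agent wiring in the trailing comment reflects my understanding of the API, so check it against the current docs before copying it.

```python
# Hypothetical tool for a support agent. The order data is made up.
_FAKE_ORDERS = {"A100": "shipped", "A101": "processing"}


def get_order_status(order_id: str) -> dict:
    """Look up the shipping status of an order.

    Args:
        order_id: The order identifier, for example "A100".

    Returns:
        A dict with "status" set to "success" or "error", plus details.
    """
    state = _FAKE_ORDERS.get(order_id)
    if state is None:
        return {"status": "error", "error_message": f"Unknown order {order_id}"}
    return {"status": "success", "order_id": order_id, "state": state}


# Wiring it into an agent, as I understand the ADK Python API (an assumption):
#
#   from google.adk.agents import Agent
#   root_agent = Agent(
#       name="support_agent",
#       model="gemini-2.0-flash",
#       instruction="Answer order questions using the available tools.",
#       tools=[get_order_status],
#   )

print(get_order_status("A100"))
```

Returning a dict with an explicit status field, instead of raising, gives the model something it can read and react to when a lookup fails.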
Final Thoughts
ADK is not the easiest framework. It is not the most popular. But it is the most serious attempt at making agent development feel like real software engineering.
The multi-language support matters for enterprises. The durable state machines matter for real workflows. The OpenTelemetry integration matters for debugging.
The Google Cloud dependency is real. If you are not on GCP, you lose the managed runtime. You can still deploy ADK agents on any container platform. But the friction increases.
My final verdict: ADK is worth learning if you build production agents that need to last. For weekend projects, use something simpler. For enterprise systems, ADK is ready.