Agent Development Kit (ADK): Building Production-Ready, Safe AI Agents

Category: Google Cloud May 21, 2026

You want to build an AI agent that actually works in production. Not a chatbot demo. Not a weekend experiment. A real agent that handles money, talks to other systems, and wakes up after three days to finish a task.

That is what the Agent Development Kit (ADK) promises. But does it deliver? I spent six weeks building a customer support agent with ADK Python. Here is what worked, what broke, and who should actually use this thing.

What is ADK? The Short Version

Google launched ADK in April 2025 as an open-source framework for building AI agents. Unlike LangGraph or CrewAI, ADK comes from a cloud provider. That matters. Google optimized ADK for Gemini models and Google Cloud infrastructure.

But here is the interesting part. ADK works with other models too. You can swap in Claude or Llama. The framework does not force you into Google's ecosystem. Smart move.

The core promise: Agent development should feel like software development. Not prompt engineering with duct tape.

Is Google Agent Development Kit Free?

Yes. The Agent Development Kit (ADK) itself is completely free. It ships under the Apache 2.0 license. You can download it, modify it, and deploy it anywhere.

What costs money? The models and the compute.

Google offers a genuine free tier for the Gemini Enterprise Agent Platform: 180,000 vCPU-seconds per month, with idle time not billed. That means you can prototype, test, and run light production workloads without opening your wallet.
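
To put that number in perspective, here is the conversion as plain arithmetic. The 180,000 figure is the one quoted above. The 2 vCPU-seconds per request is a made-up assumption for illustration, not a measured or published value.

```python
# Back-of-the-envelope view of the free tier quoted above.
FREE_VCPU_SECONDS_PER_MONTH = 180_000

# Hypothetical assumption: an average request uses 2 vCPU-seconds of active compute.
ASSUMED_VCPU_SECONDS_PER_REQUEST = 2.0

vcpu_hours = FREE_VCPU_SECONDS_PER_MONTH / 3600
requests_covered = FREE_VCPU_SECONDS_PER_MONTH / ASSUMED_VCPU_SECONDS_PER_REQUEST

print(f"{vcpu_hours:.0f} vCPU-hours per month")
print(f"about {requests_covered:,.0f} requests at {ASSUMED_VCPU_SECONDS_PER_REQUEST} vCPU-s each")
```

Swap in your own per-request figure once you have traces from a real workload.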

For heavier usage, Vertex AI Agent Engine charges based on usage. No upfront license fees.

My take: Free for experimentation. Reasonable for production. Cheaper than hiring a team to build this from scratch.

Multi-Language Support: Why This Matters?

Most agent frameworks are Python-only. ADK supports four languages:

Language	Maturity	Best For
Python	Most mature, most samples	Fast prototyping, rich ecosystem
Go	1.0 released March 2026	High-performance services
Java	Feature-parity with Python	Enterprise backend integration
TypeScript	Full support	Web apps, full-stack teams

I tested the Python version. It felt stable. The Go version launched in March 2026 with OpenTelemetry integration and a plugin system. Java and TypeScript caught up by April.

Who this helps: Enterprise teams that cannot standardize on one language. Your Python agent can talk to a Java agent via the A2A protocol. No translation layer needed.

The Architecture: Hierarchical Agent Trees

ADK organizes agents as trees. One parent agent delegates work to child agents. Each child has its own tools and instructions.

text

RootAgent
├── AgentA (LLM)
│   ├── SubAgentA1 (tooling)
│   └── SubAgentA2 (tooling)
└── AgentB (LLM)
    └── SubAgentB1

This matters for production. You can set global policies at the root: security checks, rate limiting, logging. The children just do their jobs.

What I built: A customer service agent with three sub-agents. One handled refunds. One handled shipping. One handled product questions. The root agent decided which sub-agent to call. Clean separation. Easy to test each piece independently.
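
The shape of that setup can be sketched in plain Python. This is not the ADK API: `RootAgent`, `SubAgent`, and the keyword matching are hypothetical stand-ins (in a real agent the root's model decides where to delegate). The point is only that the policy check lives in one place, at the root.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SubAgent:
    """A leaf agent with one narrow responsibility (stand-in for an LLM agent)."""
    name: str
    keywords: List[str]
    handle: Callable[[str], str]


@dataclass
class RootAgent:
    """Parent node: applies shared policies, then delegates to one child."""
    children: List[SubAgent]
    max_message_chars: int = 500  # example of a global policy
    audit_log: List[str] = field(default_factory=list)

    def route(self, message: str) -> str:
        # Global policy enforced once, at the root, before any child runs.
        if len(message) > self.max_message_chars:
            return "rejected: message too long"
        text = message.lower()
        for child in self.children:
            if any(word in text for word in child.keywords):
                self.audit_log.append(f"{child.name} <- {message}")
                return child.handle(message)
        return "escalated: no sub-agent matched"


root = RootAgent(children=[
    SubAgent("refunds", ["refund", "money back"], lambda m: "refunds: request opened"),
    SubAgent("shipping", ["shipping", "delivery"], lambda m: "shipping: tracking lookup started"),
    SubAgent("products", ["product", "size", "spec"], lambda m: "products: answer drafted"),
])

print(root.route("I want a refund for order 1234"))
print(root.route("Where is my delivery?"))
```

Because each child is just a name plus a handler, you can unit test the refunds handler without touching routing at all.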

Building Long-Running Agents That Survive Restarts

Real workflows do not finish in one API call. HR onboarding takes two weeks. Invoice disputes wait for vendor replies. Sales sequences stretch across days.


Most frameworks handle this poorly. They dump everything into a growing conversation history. After two weeks, the prompt is enormous. The model gets confused. Costs explode.

ADK solves this with durable state machines. You define explicit steps:

python

class OnboardingStep:
    START = "START"
    WELCOME_SENT = "WELCOME_SENT"
    DOCUMENTS_SIGNED = "DOCUMENTS_SIGNED"
    IT_PROVISIONED = "IT_PROVISIONED"
    COMPLETED = "COMPLETED"

The agent stores its current step in a persistent session. When a webhook fires (employee signed the document), the agent wakes up, reads current_step = WELCOME_SENT, and resumes exactly where it left off.

I tested this with a mock onboarding agent. Killed the server mid-process. Restarted it. The agent remembered everything. No state loss. No hallucinated steps.
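
Here is a minimal sketch of that kill-and-resume behaviour, assuming a JSON file as the store. The file stands in for ADK's persistent session, and `DurableOnboarding` is a hypothetical name, not an ADK class. What matters is that only the current step is persisted, not a growing conversation.

```python
import json
import tempfile
from pathlib import Path

# Ordered steps, mirroring the OnboardingStep constants above.
STEPS = ["START", "WELCOME_SENT", "DOCUMENTS_SIGNED", "IT_PROVISIONED", "COMPLETED"]


class DurableOnboarding:
    """Keeps only the current step on disk, not the whole conversation."""

    def __init__(self, state_file: Path) -> None:
        self.state_file = state_file
        if not state_file.exists():
            self._save("START")

    def _save(self, step: str) -> None:
        self.state_file.write_text(json.dumps({"current_step": step}))

    @property
    def current_step(self) -> str:
        return json.loads(self.state_file.read_text())["current_step"]

    def advance(self) -> str:
        """Move to the next step when an external event (e.g. a webhook) arrives."""
        index = STEPS.index(self.current_step)
        if index < len(STEPS) - 1:
            self._save(STEPS[index + 1])
        return self.current_step


state_path = Path(tempfile.mkdtemp()) / "employee_42.json"

agent = DurableOnboarding(state_path)
agent.advance()   # START -> WELCOME_SENT
del agent         # simulate the server being killed

resumed = DurableOnboarding(state_path)  # fresh object, same stored state
print(resumed.current_step)
resumed.advance()  # webhook: employee signed the document
print(resumed.current_step)
```

A production version would want a database and atomic writes, but the resume logic stays this small.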

Safety Features: Human-in-the-Loop

Agents should not delete production databases without asking. ADK Go 1.0 introduced a confirmation flow for sensitive operations.

go

myTool, _ := functiontool.New(functiontool.Config{
    Name:                "delete_database",
    RequireConfirmation: true,  // Pauses for human approval
}, deleteDBFunc)

The agent stops. Generates a confirmation event. Waits for a human signal. Then proceeds.

Why this matters: Unsupervised agents are dangerous. ADK forces you to think about safety upfront. Not as an afterthought.
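
The stop, emit, wait, proceed sequence is easy to model outside any framework. The sketch below is plain Python, not the ADK implementation. `ConfirmingToolRunner` and `PendingConfirmation` are hypothetical names used only to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass
class PendingConfirmation:
    """Event emitted instead of running a sensitive tool."""
    request_id: int
    tool_name: str
    argument: str


class ConfirmingToolRunner:
    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[Callable[[str], str], bool]] = {}
        self._pending: Dict[int, PendingConfirmation] = {}
        self._next_id = 1

    def register(self, name: str, func: Callable[[str], str],
                 require_confirmation: bool = False) -> None:
        self._tools[name] = (func, require_confirmation)

    def call(self, name: str, argument: str) -> Union[str, PendingConfirmation]:
        func, needs_approval = self._tools[name]
        if not needs_approval:
            return func(argument)
        # Sensitive tool: park the request and hand back an event for a human.
        event = PendingConfirmation(self._next_id, name, argument)
        self._pending[event.request_id] = event
        self._next_id += 1
        return event

    def resolve(self, request_id: int, approved: bool) -> str:
        """Run or cancel a parked request once a human has decided."""
        event = self._pending.pop(request_id)
        if not approved:
            return f"{event.tool_name} cancelled by reviewer"
        func, _ = self._tools[event.tool_name]
        return func(event.argument)


runner = ConfirmingToolRunner()
runner.register("delete_database", lambda db: f"{db} deleted", require_confirmation=True)

event = runner.call("delete_database", "staging")
print(event)  # nothing has been deleted yet
print(runner.resolve(event.request_id, approved=True))
```

The tool function never runs inside `call` for a sensitive tool, so a forgotten approval leaves the database untouched.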

Observability: Seeing Inside the Black Box

Agent failures are hard to debug. Did the model hallucinate? Did a tool crash? Did the API time out?

ADK integrates with OpenTelemetry out of the box. Every model call and tool execution generates structured traces. You can visualize the agent's "chain of thought" in Cloud Trace or Datadog.

Datadog now provides automatic instrumentation for ADK agents. You get:

Token usage per tool and branch
Latency tracking across multi-agent handoffs
Detection of retry loops (agent calling same tool repeatedly)
Evaluations for hallucinations and PII leaks

Without this, you are flying blind. With it, you can actually fix what breaks.
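
Two of those checks are simple enough to sketch by hand, which helps when reading what a dashboard reports. The record format below is a hypothetical flattening of trace data into `(tool_name, arguments, tokens_used)` tuples. It is not the OpenTelemetry or Datadog schema.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical flattened trace records: (tool_name, arguments, tokens_used).
Span = Tuple[str, str, int]


def tokens_per_tool(spans: List[Span]) -> Dict[str, int]:
    """Sum token usage for each tool across the trace."""
    totals: Dict[str, int] = defaultdict(int)
    for tool, _args, tokens in spans:
        totals[tool] += tokens
    return dict(totals)


def find_retry_loops(spans: List[Span], threshold: int = 3) -> List[Tuple[str, str, int]]:
    """Report runs of `threshold` or more consecutive identical tool calls."""
    loops = []
    run_start = 0
    for i in range(1, len(spans) + 1):
        same = i < len(spans) and spans[i][:2] == spans[run_start][:2]
        if not same:
            run_length = i - run_start
            if run_length >= threshold:
                tool, args, _ = spans[run_start]
                loops.append((tool, args, run_length))
            run_start = i
    return loops


trace = [
    ("lookup_order", "id=1234", 120),
    ("check_refund_policy", "id=1234", 300),
    ("check_refund_policy", "id=1234", 310),
    ("check_refund_policy", "id=1234", 305),
    ("issue_refund", "id=1234", 90),
]

print(tokens_per_tool(trace))
print(find_retry_loops(trace))
```

The threshold of three identical calls is an arbitrary choice here. Tune it to how often your tools legitimately retry.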

ADK vs. Claude Agent SDK: Which One?

The Claude Agent SDK (recently renamed from Claude Code SDK) focuses on giving agents computer access: file system, shell, and MCP servers.

Strengths of Claude Agent SDK:

Deepest MCP integration of any framework
Built-in file and shell access
Hooks system for lifecycle control

Weaknesses:

Locked to Claude models (no swapping)
No native A2A support
Python and TypeScript only

Strengths of Google ADK:

Four languages (Python, Go, Java, TypeScript)
Native A2A protocol for cross-agent communication
Visual Agent Designer in Google Cloud Console

Weaknesses:

Heavy Google Cloud dependency for production
Steeper learning curve
Fewer examples for non-Python languages

My pick: Use Claude Agent SDK for coding agents that need deep OS access. Use ADK for enterprise systems with multiple languages and long-running workflows.

Real-World Testing: What Actually Broke?

I built a customer service agent with ADK Python. Here is what went wrong.

Issue one: The documentation is inconsistent. Python has rich examples. Go and Java have very few. I spent hours figuring out how sessions work in Java. The answer was not in the docs.

Issue two: MCP support is through adapters, not native. Connecting to a Model Context Protocol server required extra code. Claude Agent SDK does this in one line.

Issue three: The A2A protocol is powerful but complex. Getting two agents to discover each other required Agent Cards, configuration files, and networking setup. Worth it for large systems. Overkill for simple projects.
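
For a sense of what an Agent Card involves, here is a rough sketch of one as a Python dict. The field names follow my reading of the public A2A specification and should be checked against the current version. The agent, URL, and values are invented, and the `REQUIRED_FIELDS` set is my own assumption for the sanity check, not a rule taken from the spec.

```python
import json

# Illustrative Agent Card for a hypothetical refunds agent.
agent_card = {
    "name": "refund-agent",
    "description": "Handles refund requests for customer orders.",
    "url": "https://agents.example.com/refunds",  # placeholder endpoint
    "version": "0.1.0",
    "capabilities": {"streaming": False, "pushNotifications": False},
    "defaultInputModes": ["text/plain"],
    "defaultOutputModes": ["text/plain"],
    "skills": [
        {
            "id": "issue_refund",
            "name": "Issue refund",
            "description": "Validates an order and opens a refund.",
            "tags": ["refunds", "billing"],
        }
    ],
}

# Assumption: the fields this sketch treats as mandatory before publishing.
REQUIRED_FIELDS = {"name", "description", "url", "version", "capabilities", "skills"}


def missing_fields(card: dict) -> set:
    """Return the fields a discovering agent would expect but cannot find."""
    return REQUIRED_FIELDS - card.keys()


print(json.dumps(agent_card, indent=2))
print(missing_fields(agent_card))
```

Writing the card is the easy part. Most of my setup time went into hosting it where the other agent could reach it.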

What worked well: The session persistence. The tool system. The OpenTelemetry traces. Once I understood the patterns, building new agents became fast.

ADK vs. LangGraph vs. CrewAI

Here is the honest comparison:

Framework	Best For	Weakness
ADK	Multi-language enterprise, long-running workflows	Google Cloud dependency
LangGraph	Durable stateful workflows, graph-based logic	More plumbing, best with LangSmith
CrewAI	Rapid prototyping, role-based agents	Concurrency control left to your application code

LangGraph gives you native breakpoints for human-in-the-loop and checkpoint stores. If your workflow is a graph, LangGraph feels natural.

CrewAI lets you spin up role-based agents quickly. Great for prototypes. The "group chat" pattern costs tokens fast (each turn = full LLM call).

ADK sits in the middle. More structured than CrewAI. Less flexible than LangGraph. The multi-language support is its killer feature.

Who Should Use ADK (And Who Should Skip)

Use ADK if:

Your team uses multiple languages (Python backend, Java services, Go workers)
You already invest in Google Cloud
You need agents that pause for days and resume correctly
You want native A2A protocol for cross-agent discovery

Skip ADK if:

You are building a simple chatbot (overkill)
You are not using Google Cloud (you lose the managed runtime)
You need the deepest MCP integration (use Claude Agent SDK)
Your team is Python-only and small (LangGraph or CrewAI are simpler)

Getting Started: The 30-Minute Test

Google claims you can go from zero to a deployed agent in under 30 minutes. I tested this.

Step one: Install ADK

bash

pip install google-adk

Step two: Create an agent project

bash

adk create my_agent --model gemini-2.0-flash

Step three: Run the web interface

bash

adk web --port 8080

Twenty-three minutes. I had a working agent with a chat UI. No cloud deployment needed. It ran locally.

The real work starts when you add tools, sub-agents, and persistent sessions. That takes days. But the initial barrier is low.
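
A first tool can be as small as a typed Python function with a docstring. The sketch below uses a hypothetical in-memory order table, and the function name and return shape are my own choices. The wiring into the generated agent is shown only as a comment, because it needs google-adk installed and the exact generated file may differ.

```python
from typing import Dict

# Hypothetical in-memory data standing in for a real order system.
_ORDERS: Dict[str, str] = {"1234": "shipped", "5678": "processing"}


def get_order_status(order_id: str) -> Dict[str, str]:
    """Look up the fulfilment status of an order.

    Args:
        order_id: The order number the customer provided.

    Returns:
        A dict whose "status" key is "success" or "error", plus details.
    """
    state = _ORDERS.get(order_id)
    if state is None:
        return {"status": "error", "message": f"Order {order_id} not found"}
    return {"status": "success", "order_state": state}


# Assumed wiring inside the generated agent.py (not executed here):
#   root_agent = Agent(..., tools=[get_order_status])

print(get_order_status("1234"))
print(get_order_status("9999"))
```

Returning a structured error instead of raising gives the model something it can explain to the customer.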

The Final Thoughts

ADK is not the easiest framework. It is not the most popular. But it is the most serious attempt at making agent development feel like real software engineering.

The multi-language support matters for enterprises. The durable state machines matter for real workflows. The OpenTelemetry integration matters for debugging.

The Google Cloud dependency is real. If you are not on GCP, you lose the managed runtime. You can still deploy ADK agents on any container platform. But the friction increases.

My final verdict: ADK is worth learning if you build production agents that need to last. For weekend projects, use something simpler. For enterprise systems, ADK is ready.
