<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>Developers &amp; Practitioners</title><link>https://cloud.google.com/blog/topics/developers-practitioners/</link><description>Developers &amp; Practitioners</description><atom:link href="https://cloudblog.withgoogle.com/blog/topics/developers-practitioners/rss/" rel="self"></atom:link><language>en</language><lastBuildDate>Wed, 22 Apr 2026 14:27:09 +0000</lastBuildDate><image><url>https://cloud.google.com/blog/topics/developers-practitioners/static/blog/images/google.a51985becaa6.png</url><title>Developers &amp; Practitioners</title><link>https://cloud.google.com/blog/topics/developers-practitioners/</link></image><item><title>Next '26 Hands-On: 10 Codelabs to Build Featured Tech</title><link>https://cloud.google.com/blog/topics/developers-practitioners/next-26-hands-on-10-codelabs-to-build-featured-tech/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;Significant contributors to this article include &lt;strong&gt;Megan O'Keefe&lt;/strong&gt;, Senior Staff Developer Advocate, and &lt;/span&gt;&lt;strong&gt;Karl Weinmeister&lt;/strong&gt;, Director of Developer Relations.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are joining us in person in Las Vegas or tuning in virtually from around the world, Google Cloud Next '26 offers a deep look into the practical evolution of AI. With &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;89% of sessions&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; this year dedicated to artificial intelligence, the focus has shifted from high-level concepts to the "Day 2" reality of building and maintaining agentic systems.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We've assembled &lt;strong&gt;55+ new codelabs&lt;/strong&gt; across Cloud at Next, and we want to share 10 highlights with you. The following curated list of codelabs is designed to help you translate the announcements from the talks and demos into functional code. These labs provide a structured way to explore the latest in multi-agent orchestration, data grounding, and enterprise security for your own workflows.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Dive into Codelabs!&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;1&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Build Rich Agent Experiences (ADK + A2UI)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/adk-a2ui/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Improve user interaction&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;through intuitive, high-quality interfaces that allow users to interact with agentic systems seamlessly.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;2&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Building a Multi-Agent System&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/multi-agent-system#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Build&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;the&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;architecture required to make multiple agents work together to achieve a shared goal.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;3&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond the Simple SELECT: AlloyDB NL2SQL&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/alloydb-querydata#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Democratize data access by building systems that allow users to query complex databases using natural language, supported by high-speed vector search.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;4&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Beat Fraud with an AI Shield (Spanner &amp;amp; BigQuery Graph)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/spanner-bigquery-graph/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Implement real-time reasoning with Spanner and BigQuery Graph databases. Analyze complex relationships in your data to prevent fraud at the point of transaction.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;5&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Building Secure Agents: Protecting Access and Data&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/showcase-build-secure-agent/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Protect the reasoning engine with&lt;strong&gt; &lt;/strong&gt;&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Model Armor&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Identity and Access Management (IAM) &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;to manage agent access and ensure that sensitive data remains protected during execution.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;6&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Ground Agents with Google Maps Platform&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/maps-grounding/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Use Geo-intelligent logistics to ground your agents in real-world location data to optimize field operations and logistics in real-time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;7&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy and Scale Agents on Agent Engine&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/adk-deploy-scale/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Deploy agents as containerized microservices that scale dynamically with your workload.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;8&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;The Ultimate Guide to Cloud Run: From Zero to Production&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/ultimate-cloud-run-guide/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Achieve rapid deployment using this lab as a blueprint for moving from a local prototype to a production-ready, auto-scaling platform on Cloud Run.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;9&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—Developer Keynote: Building Agents with Skills &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;|&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/dev-keynote/building-agents-with-skills#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Learn the ins and outs of AI agent development including Agent Development Kit (ADK), prompting, Agent Skill usage, and MCP. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;10—General Keynote: Forecasting with AI Agents | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/gen-keynote/raw-data-forecasting#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Transform unstructured chaos into actionable business intelligence in seconds&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Start Building Today&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These codelabs will connect you to the heart of the conference. You'll be able to bridge the high-level announcements, talks, and demos into the reality of the technology featured at Next '26. Whether you're here in person or attending virtually, these labs provide the concrete skills to drive real-world value during the conference &lt;em&gt;&lt;strong&gt;and&lt;/strong&gt;&lt;/em&gt; long after the conference ends.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;And there's more!&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Go to the &lt;a href="https://codelabs.developers.google.com/?event=googlecloudnext2026" rel="noopener" target="_blank"&gt;Codelab landing page&lt;/a&gt; to find the &lt;code&gt;Cloud Next '26&lt;/code&gt; tag and access &lt;strong&gt;more than 75 total&lt;/strong&gt; &lt;strong&gt;codelabs&lt;/strong&gt; that support the featured tech at this year's conference.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 13:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/next-26-hands-on-10-codelabs-to-build-featured-tech/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/corrected_updated_final_codelabs_image.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Next '26 Hands-On: 10 Codelabs to Build Featured Tech</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/corrected_updated_final_codelabs_image.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/next-26-hands-on-10-codelabs-to-build-featured-tech/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Mandy Grover</name><title>Strategic Content, Google Cloud</title><department></department><company></company></author></item><item><title>Level Up Your Agents: Announcing Google's Official Skills Repository</title><link>https://cloud.google.com/blog/topics/developers-practitioners/level-up-your-agents-announcing-googles-official-skills-repository/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As AI models improve, technical practitioners are increasingly turning to agentic AI tools to build with Google Cloud products, from Firebase and the Gemini API, 
to BigQuery and GKE. But how can you ensure that the model is equipped with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;accurate, up-to-date information &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;about these technologies? &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One way to do this is to plug your AI agent into a grounded, real-time information source. For instance, &lt;/span&gt;&lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google offers a Model Context Protocol (MCP) server for its developer documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. But heavily using MCP servers can cause a problem called “context bloat,” where huge amounts of context are loaded into the context window, confusing the model and racking up token costs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We need a way to equip agents with additional, condensed expertise — and we can do this with &lt;/span&gt;&lt;a href="https://agentskills.io/home" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Skills.&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://agentskills.io/home" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Skills&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; are “a simple, open format for giving agents new capabilities and expertise.” Think of a skill as compact, agent-first documentation for a specific technology or task. Skills are written in Markdown and can contain reference files, code snippets, and other assets. Agents load in skill information &lt;/span&gt;&lt;a href="https://agentskills.io/what-are-skills#how-skills-work" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;only as-needed,&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; reducing the risk of context bloat. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, on Day 1 of &lt;/span&gt;&lt;a href="https://www.googlecloudevents.com/next-vegas/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Next 2026,&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; we’re excited to announce the launch of Google’s official Agent Skills repository: &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/google/skills" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;github.com/google/skills&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This repository is starting off with thirteen skills, focused on Google Cloud technologies: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;A selection of products&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;AlloyDB, BigQuery, Cloud Run, Cloud SQL, Firebase, Gemini API, and Google Kubernetes Engine (GKE).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Three &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/architecture/framework"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Well-Architected Pillar&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;skills: Security, Reliability, and Cost Optimization &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;“Recipe” skills for Google Cloud Onboarding, Authentication, and Network Observability. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
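To make the format concrete, here is a minimal sketch of what one of these skills can look like, following the open SKILL.md layout (a Markdown file with YAML frontmatter); the skill name and contents below are hypothetical, not taken from the repository:

```markdown
---
name: cloud-run-deploy
description: Guidance for deploying containerized services to Cloud Run.
---

# Cloud Run deployment skill

When the user asks to deploy a service to Cloud Run:

1. Build and push the container image.
2. Deploy with `gcloud run deploy`, confirming the region and
   authentication flags with the user.
3. Verify the service URL responds before reporting success.
```

Because the agent reads the frontmatter first and loads the body only when the skill is relevant, the full instructions stay out of the context window until they are needed.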
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image_1_BwwkF6A.max-1000x1000.png"
        
          alt="image (1)"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;npx skills install &lt;/code&gt;&lt;a href="http://github.com/google/skills" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;github.com/google/skills&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to install these skills to your agents of choice, including &lt;/span&gt;&lt;a href="https://antigravity.google/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://geminicli.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and third-party agents. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/agent_skills-2.max-1000x1000.png"
        
          alt="agent_skills-2"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stay tuned as we launch additional skills in this repo in the coming weeks and months! &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now get building!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 13:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/level-up-your-agents-announcing-googles-official-skills-repository/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Agent_Skills_Blog_-_Hero.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Level Up Your Agents: Announcing Google's Official Skills Repository</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Agent_Skills_Blog_-_Hero.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/level-up-your-agents-announcing-googles-official-skills-repository/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Megan O'Keefe</name><title>Senior Staff Developer Advocate</title><department></department><company></company></author></item><item><title>What’s new with the Cross-Cloud Network at Next ‘26</title><link>https://cloud.google.com/blog/products/networking/whats-new-in-cloud-networking-at-next26/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While generative AI sparked a revolution, the true paradigm shift is the rapid evolution from standalone AI models to multi-agent autonomous systems. In this new era, the network transcends basic connectivity to become the critical integration layer for your agentic enterprise.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As AI agents and services surge, your core applications remain as vital as ever. To thrive in this rapidly evolving landscape, you need a planet-scale network to connect, protect, govern, deliver, and secure all your users, data, agents, AI services, and core applications across clouds and on-premises.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud's Cross-Cloud Network provides this unified foundation, and is now used by 65% of the Fortune 100 and handles up to 27 exabytes of data per month. At Google Cloud Next, we are introducing networking innovations to accelerate your AI infrastructure, strengthen security, and simplify operations. &lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Optimized networking infrastructure for AI &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As we move toward an agentic world, the network must support massive-scale inference paired with reinforcement learning. At Google, we’ve spent years refining this cycle to power our own global AI services. Today, we’re announcing AI infrastructure network innovations that bring this same architecture directly to your workloads, across &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;agents&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;inference&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;training&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and beyond.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Networking for agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is a comprehensive enterprise environment designed to build, scale, govern, and optimize the next generation of autonomous agents. Key innovations being announced in preview include: &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Gateway:&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt; Air-traffic control for agentic traffic&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Gateway understands MCP and A2A agentic protocols and provides an open, extensible, scalable way to enforce centralized governance policies to securely connect agents, models, and tools across runtimes.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Ambient networking: &lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;A seismic shift in service-to-service connectivity&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ambient networking, a new integrated data plane for Google Kubernetes Engine (GKE) and Cloud Run, provides service discovery, zero-trust access, and traffic management without the need for complex and resource-heavy sidecar proxies. It reduces operational overhead and enables up to a 10x reduction in GKE resource usage for layer 4 (L4) mesh capabilities&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ambient networking underpins two new capabilities:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Service bindings &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;automatically establish service-to-service connectivity, allowing developers to move faster to build and scale their agentic applications and services.&lt;/span&gt;&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Network Services Monitoring&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; bridges application and network observability gaps resulting in faster root-cause analysis and simplified troubleshooting.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rich partner integrations and customizations&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the help of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/service-extensions/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Service Extensions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, we are developing solutions&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;for identity, governance, and AI security for agent-to-anywhere traffic. Coming soon in preview to Agent Gateway are:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Identity and governance administration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Offering delegated authorization to Cloud IAM and partner services from Okta, Ping, Saviynt, and Silverfort to enforce real-time, contextual governance policies based on application and business context.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Runtime security:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; As a universal enforcement point by integrating with Google Cloud’s Model Armor and partner solutions from Broadcom, Check Point, Cisco, CrowdStrike, Exabeam, F5, Netskope, Palo Alto Networks, Thales, and Zscaler. Together, these can help to secure agentic communications against emerging AI attack vectors.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These innovations are built on an open foundation including Envoy and Kubernetes, providing strong, integrated governance in multicloud environments using standard Kubernetes Gateway APIs.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Networking for inference&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google we run inference at scale with optimized use of distributed GPU and TPU resources, automatic failover between regions for high availability, and optimized global request routing for fast end-user performance. GKE Inference Gateway delivers these capabilities to our cloud customers including the following new innovations:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multi-region support &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allows scaling inference services across regions, enabling cross-regional failover, optimized utilization, and reduced global latency (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Predictive latency boost&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; improves utilization with intelligent request routing based on predefined performance targets (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Disaggregated serving&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; leverages llm-d’s SGLang support, offering the flexibility to choose between vLLM and SGLang for model serving (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-pull_quote"&gt;&lt;div class="uni-pull-quote h-c-page"&gt;
  &lt;section class="h-c-grid"&gt;
    &lt;div class="uni-pull-quote__wrapper h-c-grid__col h-c-grid__col--8 h-c-grid__col-m--6 h-c-grid__col-l--6
      h-c-grid__col--offset-2 h-c-grid__col-m--offset-3 h-c-grid__col-l--offset-3"&gt;
      &lt;div class="uni-pull-quote__inner-wrapper h-c-copy h-c-copy"&gt;
        &lt;q class="uni-pull-quote__text"&gt;Gemini Enterprise Agent Platform reduced Time to First Token (TTFT) latency by over 35% for Qwen3-Coder by using GKE Inference Gateway.&lt;/q&gt;

        
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/section&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Before GKE Inference Gateway, managing our inference stack with Ray Serve created a complex, dual-orchestration layer that was a significant burden on our small operations team. Moving to the Inference Gateway and native Kubernetes deployments was the 'North Star' architecture we needed to simplify management and achieve robust production stability with a GKE-native batteries-included solution.”&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;- &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Mikhail Lubinets, Lead HPC Engineer, Technology Innovation Institute&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Networking for training&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google, we build and run the largest AI models in the world — and we built a network to support that. Some of the new enhancements are:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Massive scale with &lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;Virgo Network&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This new &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;non-blocking data center fabric&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; removes latency barriers: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Virgo&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; can link up-to 134,000 chips with 47 Petabits/sec of non-blocking bi-sectional bandwidth in a single fabric. This delivers a staggering 1.6M Exaflops of FP4 compute. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;With &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;enhancements in Pathways and JAX&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, you can further connect these Virgo fabrics to scale to over 1 million TPU chips in a single training cluster.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We are also making Virgo Network&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; available on NVIDIA Vera Rubin NVL72&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, supporting up to 960,000 GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more on Virgo Network, check out this &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/introducing-virgo-megascale-data-center-fabric"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;blog&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Accelerator network profiles&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It’s easier than ever to handle the complex networking prerequisites for accelerator-equipped GKE node pools with &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/introducing-managed-dranet-in-google-kubernetes-engine"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which improves bandwidth for distributed AI/ML workloads by up to 60% (GA).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-native Cloud Interconnect&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;SLA-backed, and optimized for efficiency, Cloud Interconnect supports petabit-scale data transfers and is available with a fixed price option. Cloud Interconnect now supports:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;400 Gbps circuits with up to 3.2 Tbps in a single connection (GA)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Partner Cross-Cloud Interconnect for AWS (GA), CoreWeave (in preview soon), and Lumen (in preview soon)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network for AI and core applications&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Cross-Cloud Network helps ensure you can securely connect users, data, locations, applications, services, and infrastructure anywhere in the world, at planetary scale. We designed our global multi-shard network to scale horizontally to meet the demands of the AI era and enable us to accommodate our 10x WAN traffic growth from 2020 to 2025.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These are some of the improvements we’re making to the Cross-Cloud Network: &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Ultra Low Latency Solution for financial exchanges &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In partnership with CME Group, we are bringing the world's leading derivatives marketplace to Google Cloud. To support CME Group’s performance requirements, we developed an ultra low latency (ULL) networking and compute solution. This fully managed cloud environment will allow CME Group and its clients to migrate its core trading systems to Google Cloud. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now in preview, the solution is designed to meet the unique and exacting requirements of running financial exchanges in the cloud. It includes several new technologies:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deterministic high-performance compute &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;powered by ULL networking, with bare metal and VM form factors, delivers a comprehensive portfolio for your trading compute needs. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Scalable multicast data distribution &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;with hardware-based ultra-low latency enables reliable one-to-many market data sharing.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Nanosecond-level clock sync &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;enabled by &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/understanding-the-firefly-clock-synchronization-protocol/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Firefly&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a novel clock synchronization system. Firefly achieves sub-10ns NIC-to-NIC synchronization to support high-frequency trading.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Advanced network observability &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;with 64-bit nanosecond timestamps, support for multiple traffic-mirroring destinations and multicast traffic, and support for auditing and regulatory requirements.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Low-latency inference &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allowing exchange participants to connect their AI-driven services to the exchange’s infrastructure. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;The Google Cloud Ultra Low Latency Solution provides the level of performance necessary for CME Group futures and options markets to run in the cloud, expanding access to clients worldwide.” &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;- Sunil Cutinho, CIO, CME Group&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-cloud observability for networks, applications, and agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you’re running core applications or new AI agents, you need visibility into your network infrastructure. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Network Insights&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, offers network performance monitoring (NPM) and digital experience monitoring (DEM) to dramatically reduce the mean time to detect and mitigate network-related agent, application, and API issues.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights is enabled by technologies from Broadcom’s AppNeta and powered by AI-enabling natural language queries through Gemini Cloud Assist.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"In an environment as complex and high-scale as Sabre’s, total visibility isn't just a luxury — it's a requirement for operational resilience. Cloud Network Insights will enable us to further shift our posture from reactive troubleshooting to proactive optimization. By providing granular, real-time telemetry across our global cloud footprint, it helps eliminate the traditional 'black box' of the network, allowing our teams to resolve bottlenecks before they impact the traveler experience."&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Alfredo Rodriguez, VP Cloud Platform Infrastructure, Sabre Corporation&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Cloud Network Insights closes the 'visibility gap' between the private corporate network and the public cloud, empowering our joint customers to pinpoint performance bottlenecks in seconds rather than hours.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Alan Davidson, CIO, Broadcom&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network for distributed applications&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Multicloud and hybrid networks require secure, reliable, and high-performance connectivity. New enhancements for our foundational networking services and tools include:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Private Service Connect &lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Private Service Connect traffic volume grew 4x in 2025 and it now supports &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/private-service-connect-compatibility"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;40+ Google and third-party published services&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, enabling secure private global access to your managed services. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Private Service Connect endpoint-based security &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allows for granular authorization policies for producer-to-consumer service communications (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Cloud Assist for Private Service Connect&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; provides for automated troubleshooting (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud-native IP address management (IPAM)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Number Registry &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;is an IPAM solution powered by agentic technologies. Network admins can easily find free IP ranges, track utilization, and allocate resources (preview). It also integrates with Infoblox Universal DDI for Cross-Cloud Network IPAM discovery and enforcement.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Hybrid Subnets&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; allow you to migrate legacy workloads from on-premises to a VPC without needing to change hard-coded IP addresses (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud NAT &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allows you to connect your IPv6-only workloads to private IPv4 destinations using the combined power of DNS64 and private NAT64 (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
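&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Cloud NAT item above pairs DNS64 with NAT64: the resolver synthesizes an AAAA answer by embedding the IPv4 destination in an IPv6 prefix (RFC 6052), typically the well-known 64:ff9b::/96, and NAT64 translates the packets in flight. A minimal sketch of that address synthesis, assuming the well-known prefix and illustrative addresses:&lt;/span&gt;&lt;/p&gt;

```python
import ipaddress

def dns64_synthesize(ipv4: str, prefix: str = "64:ff9b::/96") -> ipaddress.IPv6Address:
    """Embed an IPv4 address in a /96 DNS64 prefix (RFC 6052 layout)."""
    net = ipaddress.IPv6Network(prefix)
    assert net.prefixlen == 96, "IPv4 bits occupy the low 32 bits of a /96 prefix"
    # OR the 32-bit IPv4 value into the low 32 bits of the prefix.
    return ipaddress.IPv6Address(
        int(net.network_address) | int(ipaddress.IPv4Address(ipv4))
    )

# RFC 6052's worked example: 192.0.2.33 inside the well-known prefix.
print(dns64_synthesize("192.0.2.33"))  # 64:ff9b::c000:221
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;An IPv6-only client simply connects to the synthesized address; the IPv4 bits sit in the low 32 bits of the /96, which is what lets the NAT64 gateway recover the original destination.&lt;/span&gt;&lt;/p&gt;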
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Network Connectivity Center (NCC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Partner Cross-Cloud Interconnect for AWS is available as a connectivity type in NCC (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Support for static routes using an &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;internal load balancer as the next hop&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; allows the integration of Secure Web Proxy and third-party network security virtual appliances (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Support for &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;privately used public IP&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (PUPI) allows the exchange of PUPI IPv4 addresses with VPC spokes and producer VPC spokes (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
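&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The ILB-next-hop item above can be sketched with gcloud; all names and addresses here are hypothetical placeholders, and the next hop is the internal IP of a forwarding rule for an internal passthrough Network Load Balancer fronting the security appliances:&lt;/span&gt;&lt;/p&gt;

```shell
# Hypothetical example: steer all egress from prod-vpc through an internal
# load balancer (identified by its forwarding rule's IP) that fronts
# third-party inspection appliances; NCC can then propagate this route.
gcloud compute routes create inspect-egress \
    --network=prod-vpc \
    --destination-range=0.0.0.0/0 \
    --next-hop-ilb=10.128.0.99
```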
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Granular networking charge visibility&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cost Explorer and the new App Optimize API now provide attribution of associated Data Transfer costs to the originating resources for Google Cloud products (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network for internet-facing services&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As part of Cross-Cloud Network, the &lt;/span&gt;&lt;a href="https://cloud.google.com/solutions/cross-cloud-network#deliver-internet-facing-apps-and-content"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Global Front End&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; simplifies how you deliver, scale, and protect web, API, and AI workloads. New capabilities include: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Global Front End Enterprise delivers&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; simplified consumption by combining capabilities from global Cloud Load Balancing, Google Cloud Armor, Cloud CDN, and Service Extensions with up to 15% lower TCO (in preview soon). &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Post quantum cryptography &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;(PQC) helps secure your workloads with industry-standard algorithms that provide a layered defense against both classical and quantum adversaries.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google tag gateway,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; enabling advertisers to serve tags from their own domain, which can significantly improve the accuracy and resilience of measurement signals (GA soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud CDN&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, an important part of the Global Front End, now offers:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Built-in image optimization &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;to help you deliver content that best fits your end users’ screens and saves on bandwidth costs (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE Gateway support&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; so you can enable and manage caching services using GKE APIs (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network’s Cloud WAN for global enterprises&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud WAN is a fully managed, reliable global backbone to connect your enterprise. New capabilities include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Expanded geographic reach: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Our network spans more than 10 million kilometers of terrestrial and subsea fiber, and Network Connectivity Center’s site-to-site data transfer is now available in over &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/network-connectivity-center/concepts/locations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;25 countries&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;NCC Gateway &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;enables third-party secure service edge (SSE) integrations from Palo Alto Networks (GA soon) and Symantec (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Verified Peering Provider program&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;which offers highly reliable internet connectivity to Google, now has dramatically expanded availability through &lt;/span&gt;&lt;a href="https://peering.google.com/#/options/verified-peering-provider" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;175+ providers worldwide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Last mile connectivity&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Provision site-to-cloud private connectivity in minutes with preferred partners from the Google Cloud console (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Cloud WAN enables Dun &amp;amp; Bradstreet to evolve our global network via composable, cloud-native constructs. Leveraging NCC, we’ve built a resilient, high-performance platform that simplifies operations and optimizes costs. This foundation supports continued modernization and AI-driven workloads. We expect to extend this architecture as new patterns emerge, maintaining our blueprints-first approach.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Josh Barry, VP, Network Engineering, Dun &amp;amp; Bradstreet&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;AI-powered security against evolving threats&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The threat landscape is evolving faster than ever, with AI-driven attacks. Staying ahead requires the latest defenses. Cross-Cloud Network relies on Cloud NGFW and Cloud Armor for advanced security capabilities. Here’s the latest on those offerings.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud NGFW &lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Advanced malware sandbox &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;uses AI models trained on data from 70k+ customers &lt;/span&gt;&lt;a href="https://www.paloaltonetworks.com/apps/pan/public/downloadResource?pagePath=/content/pan/en_US/resources/datasheets/advanced-wildfire" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;to stop 99% of known and unknown malware&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, including evasive zero-days. Advanced malware sandbox is powered by Palo Alto Networks Advanced Wildfire (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Internal Application and proxy Network Load Balancer &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;support helps to enforce consistent, service-centric security for abstracted services like GKE, Cloud Run, and Private Service Connect traffic (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Project-level policies &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allow for creating and managing Cloud NGFW endpoints, security profiles, and security profile groups at the project level (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Armor &lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed rules, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;built-in rulesets across 15 threat categories, deliver automated threat protection against a broad set of attacks and zero-day CVEs. This is powered by Thales Imperva based on visibility to &lt;/span&gt;&lt;a href="https://engage-cybersec.thalesgroup.com/rs/727-WRL-406/images/EMEA-2025-Partner-Connect-05-Shailes-Nanda.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;1.5 trillion web requests each month&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud Fraud Defense integration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; helps to &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;discern the legitimacy and authorization of bots, humans, and agents. &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-cloud-fraud-defense-the-next-evolution-of-recaptcha"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Fraud Defense&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is the evolution of reCAPTCHA, which protects over 14 million domains from fraud and abuse.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Adaptive protection for Network Load Balancers &amp;amp; VMs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; brings advanced machine learning to L3/L4 traffic, to detect and mitigate volumetric DDoS attacks (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A simplified user experience&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with a visual rule builder makes custom rule creation easier (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-powered network operations&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, new AI-powered technologies in &lt;/span&gt;&lt;a href="https://cloud.google.com/products/gemini/cloud-assist"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Cloud Assist&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; can help automate manual tasks, ease troubleshooting, predict reliability issues, improve security, and help optimize your network to reduce toil and improve reliability with new specialist agents. These include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A network security agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; that streamlines network security operations by assisting with policy generation, recommendations, and impact analysis (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A network agent &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;that optimizes workload placement for performance and reliability, and also provides advanced cost estimation for observability services (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Additionally, to enable customers and partners to build their own agents, we are releasing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Network observability MCP tools and agent skills.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This will allow their agents to leverage connectivity tests, and allows for natural language querying of VPC Flow Logs (both in preview).&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;The network that scales with you&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We built our Cross-Cloud Network on the same global infrastructure that powers Google’s largest AI and internet services. This provides you with a blazing-fast, planet-scale foundation that is both secure by design and open by principle, allowing you to integrate your trusted partners across any environment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As we move into the agentic era, our flexible, future-proof solutions ensure you can quickly adopt the latest AI technologies while maintaining the reliability of your core applications. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whatever comes next, we’ve built the network to help you lead it. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Attend our networking sessions at Next ’26 to learn more, or learn more about the &lt;/span&gt;&lt;a href="https://cloud.google.com/solutions/cross-cloud-network?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cross-Cloud Network&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 12:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/whats-new-in-cloud-networking-at-next26/</guid><category>Hybrid &amp; Multicloud</category><category>Infrastructure Modernization</category><category>Developers &amp; Practitioners</category><category>Google Cloud Next</category><category>Networking</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_5_Dark.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What’s new with the Cross-Cloud Network at Next ‘26</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_5_Dark.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/whats-new-in-cloud-networking-at-next26/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Rob Enns</name><title>VP/GM of Cloud Networking</title><department></department><company></company></author></item><item><title>Introducing Gemini Enterprise Agent Platform, powering the next wave of agents</title><link>https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span 
style="vertical-align: baseline;"&gt;In the early days of generative AI, building safe and reliable business tools took massive engineering effort and a high tolerance for trial and error. We helped solve that with &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, our trusted AI development platform. But today, we’re managing a different level of complexity, with agents interacting across multiple systems — and often without security and governance guardrails. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To move toward a truly autonomous enterprise, one where agents can act with the same independence and reliability as a member of your team, you need a foundation that can sustain that level of trust. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What’s new: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we’re launching &lt;/span&gt;&lt;a href="https://console.cloud.google.com/agent-platform/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; — our new, comprehensive platform to build, scale, govern, and optimize agents. It’s the evolution of Vertex AI, bringing the model selection, model building, and agent building capabilities that customers love, together with new features for agent integration, DevOps, orchestration, and security. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform provides a single destination for your technical teams to build agents that can transform your products, services, and operations. These agents can be seamlessly delivered to your employees through the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/whats-new-in-gemini-enterprise"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise app&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, all while remaining tightly integrated with your IT operations to help ensure control, governance, and security as you scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The platform also provides first-class access to more than 200 of the world’s leading models through Model Garden. This includes our latest first-party breakthroughs like &lt;/span&gt;&lt;a href="https://deepmind.google/models/gemini/pro/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini 3.1 Pro&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://deepmind.google/models/gemini-image/flash/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini 3.1 Flash Image&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://deepmind.google/models/lyria/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lyria 3&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, alongside our open models like &lt;/span&gt;&lt;a href="https://deepmind.google/models/gemma/gemma-4/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemma 4&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. And, of course, customers have full flexibility to use the best model for the job with support for third-party models like Anthropic’s Claude Opus, Sonnet and Haiku. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Moving forward, all Vertex AI services and roadmap evolutions will be delivered exclusively through the Agent Platform, rather than as a standalone service, to power the next generation of agent development.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why Agent Platform matters for your business: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform helps you move from managing individual AI tasks to delegating business outcomes with total confidence. You can: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Choose the right environment for the job — from the low-code, visual interface of the new &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Studio,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to the code-first logic of the upgraded &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. We’ve simplified the entire lifecycle with AI-native coding capabilities to help you ship production-grade agents faster.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Scale:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Clear the path to production with the re-engineered &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Runtime&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This supports long-running agents that maintain state for days at a time and are backed by &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Memory Bank&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for persistent, long-term context.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Govern: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Establish centralized control with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Identity, Agent Registry, and Agent Gateway&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. These capabilities help ensure every agent — whether built on Agent Platform or sourced from our partner ecosystem — has a trackable identity and operates within enterprise-grade guardrails. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Optimize:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Guarantee quality with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Simulation, Agent Evaluation, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;and&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Agent Observability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. These tools provide full execution traces and a real-time lens into agent reasoning to help ensure your agents always hit their goals.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_0_gemini_enterprise_agent_platform.max-1000x1000.jpg"
        
          alt="1 gemini enterprise agent platform"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started with Agent Platform: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Visit &lt;/span&gt;&lt;a href="https://console.cloud.google.com/agent-platform/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in the Google Cloud console to explore new features and start building today. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keep reading for a deeper look at our latest releases and how Agent Platform helps you deliver the production-ready agents you can trust at every stage of the journey.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;How customers are achieving more with Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_gemini_enterprise_agent_platform.max-1000x1000.jpg"
        
          alt="2- GEAP Logo Wall"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"Burns &amp;amp; McDonnell uses Agent Platform to transform how organizational knowledge is applied across the enterprise. Using ADK, we are building an AI agent that turns decades of project data into real-time, actionable intelligence. Agent Platform enables this innovation to scale responsibly by combining deterministic business rules with probabilistic reasoning — making AI a trusted operational capability, not just a productivity tool. With Agent Platform, we aren’t just managing knowledge; we are activating experience to drive faster, more confident decisions." &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Matt Olson, Chief Innovation Officer, Burns &amp;amp; McDonnell&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Color Health uses Agent Platform to power our Virtual Cancer Clinic, delivering end-to-end care. By building our Color Assistant with the Agent Development Kit (ADK) and scaling it via Agent Runtime, we are helping more women get screened for breast cancer. The Color Assistant engages users to check screening eligibility, connects them to clinicians, and helps schedule appointments. The power of the agent lies in the scale it enables — helping us reach more people and respond to individual risk and eligibility in real time.” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Jayodita Sanghvi, PhD., Head of AI Platform, Color&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“By rebuilding Comcast’s Xfinity Assistant with Agent Development Kit (ADK), we’ve moved beyond simple scripted automation to conversational generative intelligence that delivers personalized troubleshooting and self-service support to our customers. Agent Runtime has been a massive accelerator, allowing us to deploy a sophisticated multi-agent architecture that increases digital containment while ensuring secure, grounded interactions via Gemini. We aren't just reducing repeat interactions by solving customers’ issues the first time; we're redefining the customer experience at scale.” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Rick Rioboli, Chief Technical Officer, Connectivity &amp;amp; Platforms, Comcast&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Geotab uses Agent Platform to rapidly accelerate our AI Agent Center of Excellence. Google's Agent Development Kit (ADK) provides the flexibility to orchestrate various frameworks under a single, governable path to production, while offering an exceptional developer experience that dramatically speeds up our build-test-deploy cycle. For Geotab, ADK is the foundation that allows us to rapidly and safely scale our agentic AI solutions across the enterprise” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Mike Branch, Vice President, Data &amp;amp; Analytics, GeoTab&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"Gurunavi uses Agent Platform to power 'UMAME!', an AI restaurant discovery app that leverages Memory Bank to achieve a deep understanding of user context. Unlike conventional prompt-based systems, our agent remembers a user's past actions and preferences to proactively present the best options. This eliminates the need for manual searches and creates a seamless experience that will improve user satisfaction by 30% or more. We view this memory function as a non-negotiable feature for the future of new culinary experiences.” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Toshiaki Iwamoto, CTO, Gurunavi&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"At L'Oréal, Beauty Tech is not just a support function — it is a powerful catalyst to create the beauty that moves the world. To live up to that ambition, we decided to build our own proprietary Beauty Tech Agentic Platform, powered by Google Cloud. Leveraging Agent Development Kit (ADK), we are leading a fundamental shift: moving from deterministic workflow automation to autonomous, outcome-oriented agent orchestration. Our agents are not locked in a vacuum — through Model Context Protocol (MCP), they are securely connected to our single sources of truth, including our Beauty Tech Data Platform and core operational applications. Google Cloud gives us the resilience, the multi-LLM flexibility, and the enterprise-grade trust framework we need to scale this platform globally, while keeping human oversight at the center."&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; – Etienne BERTIN, Group CIO, L'Oréal&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Payhawk uses Agent Platform to transform our AI agents from simple task executors into genuine financial assistants. By leveraging Memory Bank, we have moved from stateless interactions to long-term context retention. Our agents now act like dedicated team members, autonomously recalling user-specific constraints and history. For example, our Financial Controller Agent now remembers a user’s habits to auto-submit expenses, reducing submission time by over 50%. This shift allows our agents to anticipate needs based on past behavior rather than just reacting to prompts.”&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; – Diyan Bogdanov, Principal Applied AI Engineer, Payhawk&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"PayPal uses Agent Platform to rapidly build and deploy agents in production. Specifically, we use Agent Development Kit (ADK) and visual tools to inspect agent interactions, and manage multi-agent workflows. This provides the step-by-step visibility we need to visualize the flow of intent and payment mandates. Finally, Agent Payment Protocol (AP2) on Agent Platform provides the critical foundation for trusted agent payments. helping our ecosystem accelerate the shipping of secure agent-based commerce experiences." &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Nitin Sharma, Principal Engineer, AI, PayPal&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Build AI agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Build agents quickly and easily by empowering your developers, business users and everyone in between to build and deploy agents at scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Build smarter agents, faster&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A major upgrade to ADK: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;More than six trillion tokens are processed monthly on Gemini models through &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/adk"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Unlock more powerful reasoning by organizing agents into a network of sub-agents. This new, graph-based framework allows you to define clear, reliable logic for how agents work together to solve complex problems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Workspaces are secure-by-design: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Give agents a hardened, sandboxed environment to run bash commands and manage files safely, isolated from your core systems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multimodal streaming:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Bring human-like stability to real-time interactions with multimodal support for live audio and video cues.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
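The sub-agent pattern described above can be sketched in plain Python. This is an illustrative toy, not the ADK API: the `Coordinator` and `SubAgent` classes and the agent names are hypothetical, standing in for ADK's own agent types and routing logic.

```python
# Toy coordinator that routes tasks to a network of sub-agents by
# skill. All names here are hypothetical illustrations, not ADK APIs.

class SubAgent:
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill  # the topic this agent can handle

    def handle(self, task):
        return f"{self.name} handled: {task}"

class Coordinator:
    """Routes each incoming task to the sub-agent whose skill matches."""
    def __init__(self, sub_agents):
        self.sub_agents = {a.skill: a for a in sub_agents}

    def run(self, skill, task):
        agent = self.sub_agents.get(skill)
        if agent is None:
            return f"no sub-agent for skill '{skill}'"
        return agent.handle(task)

coordinator = Coordinator([
    SubAgent("billing-agent", "billing"),
    SubAgent("research-agent", "research"),
])
print(coordinator.run("billing", "refund order #42"))
# → billing-agent handled: refund order #42
```

In a real multi-agent system the routing decision would itself be model-driven; the fixed skill map above only illustrates the "clear, reliable logic" the graph-based framework is meant to encode.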
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Connect your agents to the enterprise&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Securely access any system:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use plug-and-play architecture with Native Ecosystem Integrations to connect agents to your internal data and tools without custom coding.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automate background operations:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Activate your data in BigQuery and Pub/Sub with Batch &amp;amp; Event-driven agents. This way, you can run massive, asynchronous tasks like content evaluation or data analysis in the background.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
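Batch and event-driven agent work of this kind can be approximated with a simple queue-draining loop. This is a stdlib sketch under stated assumptions: a `queue.Queue` stands in for a Pub/Sub subscription, and `handle_invoice` is a hypothetical handler, not a platform API.

```python
import queue

# Minimal event-driven dispatch loop. In the platform described above,
# events would arrive from Pub/Sub or BigQuery; here a local queue and
# a hypothetical invoice handler stand in for illustration.

def handle_invoice(event):
    # Pretend "processing": tag the event as handled.
    return {"id": event["id"], "status": "processed"}

def drain(events, handler):
    """Consume every queued event and return the handler results."""
    results = []
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return results
        results.append(handler(event))

events = queue.Queue()
for i in range(3):
    events.put({"id": i})
print(drain(events, handle_invoice))
```

The point of the pattern is that the handler runs asynchronously in the background, triggered by data arriving, rather than by an interactive prompt.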
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Go from idea to production in hours&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enable AI-driven development: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;A programmatic interface for coding agents to access Google’s complete suite of agentic capabilities, allowing them to build, evaluate, and deploy production-ready agents on your behalf.&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bringing agent building directly to Agent Studio:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Now, you can move seamlessly from building simple prompts to deploying complex agents in &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/agent-studio/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Once you're ready for deep customization, export your logic directly into ADK to continue development in a full-code environment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Get a head start with pre-built agents: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Access a curated set of agent templates in &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/agent-garden"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Garden&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; — including code modernization, financial analysis, economic research, invoice processing, and more — that serve as immediate building blocks for your multi-agent systems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Scale AI agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To move from a proof-of-concept to a live environment, you need a platform that can handle the performance, state, and security requirements of real-world work. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Powering high-performance agent execution&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The latest Agent Runtime: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Our revamped &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/runtime"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Runtime&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; delivers sub-second cold starts and allows you to provision new agents in seconds.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Support for multi-day workflows:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can now deploy long-running agents that run autonomously for days at a time. This allows your agents to manage complex, multi-step workflows and deep reasoning tasks that require extended persistence, like managing a sales prospecting sequence. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Autonomous action with security-by-design environments:&lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/sandboxes"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/a&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/sandbox/code-execution-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Sandbox&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;provides a hardened environment to safely execute model-generated code and perform computer use tasks like browser-based automation without risk to your host systems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent-to-agent orchestration:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Enables agents to seamlessly delegate tasks to one another, including support for complex, generative, and deterministic orchestration patterns. This ensures that for critical flows such as compliance, your agents follow well-specified paths every time.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Move beyond temporary session data to high-accuracy context&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Personalize interactions:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/memory-bank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Memory Bank&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; dynamically generates and curates long-term memories from conversations. Using new Memory Profiles, agents can recall high-accuracy details with low latency, ensuring context is never lost.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Link AI interactions to your existing records: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Store and manage history using &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/sessions"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Sessions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. With Custom Session IDs, you can use your own unique identifiers to track sessions and map them directly to your internal database and CRM records.&lt;/span&gt;&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enable real-time, human-like interactions:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using the WebSocket protocol for Bidirectional Streaming, you can help ensure your agents are highly responsive during live customer or employee interactions, processing audio and video without lag.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
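The Custom Session ID idea, keying agent sessions by identifiers your own systems already use, can be sketched as follows. The `SessionStore` class and its field names are illustrative assumptions, not the Agent Sessions API.

```python
# Toy session store keyed by caller-supplied IDs, so a session can be
# looked up later by the same identifier an internal database or CRM
# already uses. Class and field names here are hypothetical.

class SessionStore:
    def __init__(self):
        self._sessions = {}

    def create(self, session_id, user):
        # The caller chooses the ID (e.g. an existing CRM record key),
        # so no separate mapping table is needed.
        self._sessions[session_id] = {"user": user, "events": []}
        return session_id

    def append(self, session_id, event):
        self._sessions[session_id]["events"].append(event)

    def get(self, session_id):
        return self._sessions.get(session_id)

store = SessionStore()
store.create("crm-case-1017", user="ada")
store.append("crm-case-1017", "asked about screening eligibility")
print(store.get("crm-case-1017"))
```

Because the session key is the CRM key itself, support tooling can join agent history against existing records with no extra lookup step.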
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Govern AI agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Govern with a secure-by-design architecture that applies enterprise rigor to every agent in your fleet – from the ones you build on Agent Platform to the ones you source from our partner ecosystem.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Manage all of your agents through a single source of truth for identity and access.&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Assign every agent a verifiable identity: &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/runtime/agent-identity"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Identity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; improves the security posture of your agents by ensuring every agent receives a unique cryptographic ID. This creates a clear, auditable trail for every action an agent takes, mapped back to defined &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/runtime/agent-identity"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;authorization policies&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Maintain a central library of approved tools:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Our new &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/agent-registry"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Registry&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; provides a single source of truth for your enterprise. It indexes every internal agent, tool, and skill, simplifying discovery and ensuring only governed, approved assets are available to your users.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Manage your agent fleet from one control point:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/gateways/agent-gateway-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; acts as the air traffic control for your agent ecosystem. It provides secure, unified connectivity between agents and tools across any environment, while enforcing consistent security policies and &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/model-armor?e=48754805&amp;amp;hl=en"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Armor&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; protections to safeguard against prompt injection and data leakage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Use AI-powered insights to detect hidden risks and suspicious behavior before they impact your business.&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detect suspicious behavior in real-time:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Agent Anomaly Detection uses statistical models and an LLM-as-a-judge framework to flag unusual reasoning. This works alongside &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/view-security-findings"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Threat Detection&lt;/span&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;to provide visibility into malicious activity, such as reverse shells or connections to known bad IP addresses.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Uncover vulnerabilities automatically:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A new &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/view-security-findings"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Security&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; dashboard, powered by &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/security-command-center"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Security Command Center&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, unifies threat detection and risk analysis. It allows your teams to map relationships between agents and models, automate asset discovery, and scan for vulnerabilities in the underlying operating system and language packages.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
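A statistical anomaly flag of the kind described above can be illustrated with a simple z-score check over a per-session metric. The metric (tool calls per session) and the threshold are illustrative assumptions only, not how the product's detector works.

```python
import statistics

# Flag sessions whose tool-call count sits far from the fleet's norm.
# The metric and threshold are illustrative assumptions.

def flag_anomalies(counts, z_threshold=2.0):
    mean = statistics.mean(counts)
    stdev = statistics.stdev(counts)
    if stdev == 0:
        return []  # no variation, nothing stands out
    return [i for i, c in enumerate(counts)
            if abs(c - mean) / stdev > z_threshold]

calls_per_session = [4, 5, 6, 5, 4, 5, 60]  # last one is suspicious
print(flag_anomalies(calls_per_session))
# → [6]
```

A production detector would combine signals like this with semantic review of the agent's reasoning (the LLM-as-a-judge half), since a count alone cannot say whether unusual activity is malicious.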
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Optimize AI agents &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform gives you the visibility needed to understand how your AI is performing, making it easy to refine their logic and get smarter over time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Test your agents before they ship&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Simulate realistic conversations: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/evaluate-simulated"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Simulation&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; to test agents against human-like synthetic user interactions and virtualized tools in a controlled environment. Agents are automatically scored based on task success and safety across multi-step conversations.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Monitor and improve in production&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Track live performance: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/agent-evaluation"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Evaluation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to continuously score agents against live traffic using multi-turn autoraters that can evaluate the logic of an entire conversation, not just a single response. With turnkey dashboards and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/observability/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Observability&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, you can visually trace complex reasoning to debug issues as they happen.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automate agent refinement: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of manually digging through logs, &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/optimize-agent"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Optimizer&lt;/span&gt;&lt;/a&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;automatically clusters real-world failures and suggests refined system instructions to improve accuracy.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Detailed technical guides and a full list of updates are available in our updated &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/release-notes"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;release notes&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;a href="http://cloud.google.com/products/gemini-enterprise-agent-platform"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Platform &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;is the new standard for enterprise agent development, built to help you move from experimentation to production-scale impact, starting today.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 12:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform/</guid><category>Developers &amp; Practitioners</category><category>Google Cloud Next</category><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0_gemini_enterprise_agent_platform.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing Gemini Enterprise Agent Platform, powering the next wave of 
agents</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0_gemini_enterprise_agent_platform.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Michael Gerstenhaber</name><title>VP, Product Management, Cloud AI</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Michael Bachman</name><title>VP/GM, Cloud Foundations</title><department></department><company></company></author></item><item><title>Next ‘26: Redefining security for the AI era with Google Cloud and Wiz</title><link>https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz/</link><description>&lt;div class="block-aside"&gt;&lt;dl&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The AI era demands a new security era. Organizations are facing the dual challenge of harnessing the potential of AI while &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/threat-intelligence/defending-enterprise-ai-vulnerabilities?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;defending against its malicious use&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and Google Cloud can help you adapt and thrive.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The latest research from Google Cloud shows that adversaries are using AI to &lt;/span&gt;&lt;a href="https://cloud.google.com/transform/new-mandiant-report-boost-basics-with-ai-to-counter-adversaries/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;accelerate the speed, scale, and sophistication of attacks&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Meanwhile, &lt;/span&gt;&lt;a href="https://cloud.google.com/security/resources/m-trends?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;M-Trends 2026&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; also showed that increased threat actor coordination has driven down the time to hand-off from an initial access to a secondary threat actor from eight hours to 22 seconds in the last three years.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today at Google Cloud Next, we are showcasing how Google Cloud can help you defend against increasingly sophisticated threats at machine speed, protect AI and multicloud environments, and secure cloud workloads at scale. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Delivering agentic defense &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our full-stack AI approach, from the chips to the models, gives you a competitive advantage with better integration and velocity to help protect customers. Not only can Google action insights from the world’s largest threat observatory and Mandiant frontline experts, but we also bring cutting-edge insights and breakthroughs from Google DeepMind, to help make your platforms more secure. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today we are introducing three new agents in &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/security-operations"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google Security Operations&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to help you defend at the speed of AI. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Threat Hunting agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, can help teams proactively hunt for novel attack patterns and stealthy adversary behaviors that bypass traditional defenses. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detection Engineering agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, can identify coverage gaps and create new detections for threat scenarios, reducing toil and transforming detection creation from a manual craft into an automated science. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Third-Party Context agent, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;coming soon to preview, can enrich your workflows with contextual data from third-party content. &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1_-_Threat_Hunt_Initiation.gif"
        
          alt="1 - Threat Hunt Initiation"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="mhwgf"&gt;Initiating a threat hunt with the Threat Hunting agent&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Triage and Investigation agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; processed over &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;5 million alerts&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in the last year, reducing a typical 30-minute manual analysis to 60 seconds with Gemini.&lt;/span&gt;&lt;span style="text-decoration: line-through; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Operational resilience and cybersecurity are the bedrock of customer trust at BBVA. By integrating advanced artificial intelligence, such as the Triage and Investigation agent, we are able to scale in new ways," said Diego Martinez Blanco, head of Security Technology, BBVA. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“It handles the initial heavy lifting and filters out false positives so we can prioritize issues that require human attention. The agent's transparent explanations allow our team to understand recommendations and ultimately dedicate our resources to more complex investigations,” he said.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can build your own security agents with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;remote Google Cloud model context protocol (MCP) server support for Google Security Operations&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now generally available. To make it even easier, you can also access the MCP server client directly from the Google Security Operations chat interface, available in preview. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-pull_quote"&gt;&lt;div class="uni-pull-quote h-c-page"&gt;
  &lt;section class="h-c-grid"&gt;
    &lt;div class="uni-pull-quote__wrapper h-c-grid__col h-c-grid__col--8 h-c-grid__col-m--6 h-c-grid__col-l--6
      h-c-grid__col--offset-2 h-c-grid__col-m--offset-3 h-c-grid__col-l--offset-3"&gt;
      &lt;div class="uni-pull-quote__inner-wrapper h-c-copy h-c-copy"&gt;
        &lt;q class="uni-pull-quote__text"&gt;Organizations leveraging an intelligence-led, AI-augmented approach to modern security operations with Google Cloud&amp;#x27;s agentic defense can realize a strong ROI.&lt;/q&gt;

        
          &lt;cite class="uni-pull-quote__author"&gt;
            
            
              &lt;span class="uni-pull-quote__author-meta"&gt;
                
                  &lt;strong class="h-u-font-weight-medium"&gt;Christopher Kissel&lt;/strong&gt;&lt;br /&gt;
                
                
                  Research Vice President, IDC
                
              &lt;/span&gt;
            
          &lt;/cite&gt;
        
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/section&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/2_-_Threat_Hunt_report.gif"
        
          alt="2 - Threat Hunt report"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="mhwgf"&gt;Findings report created by the Threat Hunting agent&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Security teams can also automate response actions with &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/rsac-26-supercharging-agentic-ai-defense-with-frontline-threat-intelligence"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;agentic automation&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;in Google Security Operations. To further move teams from manual triage to agentic defense, we introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/bringing-dark-web-intelligence-into-the-ai-era"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;dark web intelligence&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in Google Threat Intelligence, now in preview. Internal tests show it can &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;analyze millions of daily external events with 98% accuracy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to elevate threats that truly matter.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"IDC found that organizations experienced measurable operational gains, including substantial reductions in mean time to detect and mean time to respond, fewer false positives, and higher analyst productivity with AI-powered context and automation. These operational improvements translate into significant &lt;/span&gt;&lt;a href="https://services.google.com/fh/files/misc/gti_idc_business_value_report.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;business outcomes&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, such as shorter disruption periods, lower incident-related costs, and improved executive confidence in security posture and decision-making," said Christopher Kissel, research vice president, IDC. "Organizations leveraging an intelligence-led, AI-augmented approach to modern security operations with Google Cloud's agentic defense can realize a strong ROI." &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;New partner-supported workflows for Google Security Operations&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we are also announcing a robust cohort of &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/next26-announcing-new-partner-supported-workflows-for-google-security-operations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;new partner integrations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for Google Security Operations. Designed to deliver high-fidelity security workflows right out of the box, our latest participating Google Cloud Security integration ecosystem partners include Darktrace, Gigamon, and SAP.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Protecting AI and cloud applications across any infrastructure&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI and cloud applications are built across multiple platforms and models. To protect them end-to-end, we want to make it easier and faster to mitigate risk, regardless of where and how you build. This support includes major cloud environments like Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud; software-as-a-service (SaaS) environments like OpenAI; and even custom hosted environments. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/google-completes-acquisition-of-wiz?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz, now a part of Google Cloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, expands and deepens our ability to protect the apps you build and run. Wiz empowers you to quickly and securely adopt AI, while also helping protect the AI development lifecycle. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz announced its &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-ai-app" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;AI-Application Protection Platform&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (AI-APP) at the RSA Conference, providing deep visibility, risk posture, and runtime analysis for your AI applications. Wiz also announced &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-agents" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz Security Agents&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-workflows" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz Workflows&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, helping you identify and respond to risks and threats at machine speed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we’re taking our commitment to secure customers in any cloud, platform, and AI environment further. Wiz now &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/wiz-databricks-security-graph" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;supports Databricks&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; as well as new agent studios like AWS Agentcore, Gemini Enterprise Agent Platform, Microsoft Azure Copilot Studio, and Salesforce Agentforce, so customers gain visibility however their teams choose to build.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition, Wiz continues to support security ecosystems with integrations to the outer layer of the cloud, including &lt;/span&gt;&lt;a href="http://wiz.io/blog/wiz-apigee-integration-for-api-discovery" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Apigee&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.cloudflare.com/press/press-releases/2026/cloudflare-partners-with-wiz-to-secure-the-global-ai-attack-surface/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloudflare AI Security for Apps&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and the &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-vercel-integration" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vercel platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, further extending the power of the Wiz Security Graph. We’ve also updated how we integrate security detections from Wiz Defend with Google Security Operations and Mandiant Threat Defense to help analysts more easily configure automatic threat information forwarding.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz is also announcing new capabilities designed to secure the AI-native development lifecycle, helping teams to innovate faster and more securely:  &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure vibe-coded applications: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz is announcing a new integration, generally available in May, that runs Wiz security scanning directly inside the Lovable platform so vulnerabilities, secrets, and misconfigurations caught by Wiz surface in Lovable's built-in security view, right where teams are already building.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure AI-generated code&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz removes risks from AI-generated code the moment it is created. Inline AI security hooks integrate directly into IDEs and agent workflows to evaluate prompts and scan AI-generated output instantly, injecting security guardrails before the code is ever committed.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent-based remediation&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Wiz Skills equip coding agents and AI-native IDEs with full code-to-cloud context and validated attack surface findings from the Wiz Security Graph. These capabilities enable teams to trigger automated, agent-driven remediation workflows either locally from the developer's individual IDE or globally at the repository and pull request level within your version control system.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Eliminate shadow AI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Wiz’s dynamic &lt;/span&gt;&lt;a href="https://www.wiz.io/academy/ai-security/ai-bom-ai-bill-of-materials" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI-Bill of Materials&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (AI-BOM)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; automatically inventories all AI frameworks, models, and IDE extensions across your environment. This provides complete visibility into what is writing code across your stack, allowing you to track sanctioned corporate tools like Gemini Code Assist and GitHub Copilot while simultaneously uncovering unapproved shadow AI plugins.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can learn more about the &lt;/span&gt;&lt;a href="https://wiz.io/blog/wiz-at-google-cloud-next" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz announcements here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Securing your agents and the agentic web&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition to securing your cloud and AI workloads, Google Cloud’s secure-by-design foundation can help you innovate at the speed of AI — from agents to fraud defense to the web.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Securing and governing agents with the Gemini Enterprise Agent Platform&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;To build, orchestrate, govern, and optimize agents&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;today we are announcing &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; including:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Identity&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to enable access management and &lt;/span&gt;&lt;a href="https://cloud.google.com/transform/these-4-ai-governance-tips-help-counter-shadow-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI governance at scale&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Our new&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;capability provides agents unique identities to operate autonomously with specific authentication flows, and with scoped human delegation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Gateway, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;which&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;enables policy enforcement for all agent-to-agent and agent-to-tool connections. It governs your enterprise agent traffic and understands agent protocols like MCP and Agent2Agent (A2A) to inspect and secure every agent interaction.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Model Armor&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;,&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;our runtime protection for model and agent interactions, now integrates with Agent Gateway, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Runtime&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;/a&gt;&lt;a href="https://docs.cloud.google.com/model-armor/model-armor-langchain-integration"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Langchain&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; available in preview, and &lt;/span&gt;&lt;a href="https://firebase.google.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Firebase&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, generally available, to help developers add inline enforcement and sanitization of agent traffic and interactions without the need to change code. These integrations expand Model Armor's protection against runtime risks such as prompt injections, tool poisoning, and sensitive data leakage across &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/model-armor/integrations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud services and our AI portfolio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Securing the agentic web with Google Cloud Fraud Defense and Chrome Enterprise&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we are evolving reCAPTCHA with the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-cloud-fraud-defense-the-next-evolution-of-recaptcha"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;launch of &lt;/span&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Fraud Defense&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, generally available. This comprehensive platform is designed to discern the legitimacy and authorization of bots, humans, and agents. Using the same scale and signals that protect Google’s own ecosystem, Fraud Defense will soon offer in preview agent-specific capabilities for human users and AI agents that can help secure the digital commerce journey, from account creation and login to payment and checkout.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our commitment to securing AI extends to the browser, a vital endpoint for interacting with AI. &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/chrome-enterprise/new-ways-to-navigate-the-ai-era-with-googles-enterprise-platforms-and-devices"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Chrome Enterprise&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; provides comprehensive data protection for the AI era with the visibility and controls needed to embrace AI safely without compromising corporate data:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-aware extension threat detections&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, can surface advanced extension telemetry that helps security teams detect and respond to anomalous AI agent activity. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;New &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;shadow AI reporting&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, generally available soon, can help you gain visibility into the shadow AI landscape by flagging employee use of unsanctioned web-based AI and SaaS applications. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What’s new in Trusted Cloud&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We continue to offer new security controls and enhance capabilities across identity, data, and  networking on our cloud platform to help you secure your environments. Today we’re announcing the following updates:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Simplifying permissions with modern IAM&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;To help achieve least privilege quickly and simply, we’ve streamlined our predefined roles catalog with easy-to-use administrator, editor, and viewer roles, such as the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/iam/docs/role-picker-gemini"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;IAM role picker&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and the ability to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/authentication/reauthentication"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;re-authenticate sensitive actions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Data security&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We are announcing several new capabilities for our cloud platform data security portfolio to help protect your most sensitive data and accelerate AI transformation.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Confidential Computing&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: In partnership with NVIDIA, today we’re announcing &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/confidential-computing"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Confidential Computing&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; support for G4 VMs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, featuring NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on Google Compute Engine (GCE) Confidential G4 VMs, available in preview globally, to help strengthen confidentiality and integrity for a wide spectrum of sensitive AI workloads. In partnership with Intel, we’re also introducing the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;preview of C4 Confidential VMs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, bringing Intel TDX to 6th Gen Xeon processors to help protect diverse AI and &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/compute/c4-vms-based-on-intel-6th-gen-xeon-granite-rapids-now-ga"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;analytics workloads&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; while providing industry-leading compute density and performance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Key Management Services (KMS)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: We are announcing the new &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Confidential External Key Manager (cEKM)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in preview, giving you the flexibility to host and protect external keys in any region and maintain verifiable control within a confidential environment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Post-quantum cryptography (PQC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: We are introducing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;KMS Quantum Safe Key Imports&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, available&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;in preview, to help you bring your own keys with quantum-safe algorithms. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secret Manager&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To help prevent password leaks and mitigate prompt injection risks, we are announcing the general availability of the native integration of our &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Secret Manager with Agent Development Kit&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
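To make the Secret Manager item above concrete, here is a minimal sketch of how an agent could pull a secret at runtime instead of hard-coding it. The native Agent Development Kit integration announced here is not shown, so this uses the standard google-cloud-secret-manager client directly; the project and secret names are placeholders.

```python
# Sketch: reading an API key from Secret Manager at agent-tool runtime,
# so the value never lives in source code or agent prompts. The native
# Agent Development Kit integration is not shown here; this uses the
# standard google-cloud-secret-manager client directly.

def secret_version_name(project: str, secret_id: str, version: str = "latest") -> str:
    # Secret Manager addresses each secret version by a fully
    # qualified resource name.
    return f"projects/{project}/secrets/{secret_id}/versions/{version}"

def fetch_secret(project: str, secret_id: str, version: str = "latest") -> str:
    # Lazy import so the helper above stays usable without the client
    # library installed; requires `pip install google-cloud-secret-manager`.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(
        request={"name": secret_version_name(project, secret_id, version)}
    )
    return response.payload.data.decode("utf-8")
```

An agent tool would call `fetch_secret("my-project", "reddit-api-key")` (hypothetical names) once at startup and hand the value to the downstream client, rather than embedding it in a prompt where it could leak through prompt injection.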
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Network security &lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud’s Cross-Cloud Network security products offer several new capabilities:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud NGFW: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We’re announcing the &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/firewall?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud NGFW&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;advanced malware sandbox&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, in preview later this year, to help defend against highly evasive zero-day threats. This capability is powered by &lt;/span&gt;&lt;a href="https://www.paloaltonetworks.com/apps/pan/public/downloadResource?pagePath=/content/pan/en_US/resources/datasheets/advanced-wildfire" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Palo Alto Networks Advanced Wildfire&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, trained on data from &lt;/span&gt;&lt;a href="https://www.paloaltonetworks.com/apps/pan/public/downloadResource?pagePath=/content/pan/en_US/resources/datasheets/advanced-wildfire" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;more than 70,000 Palo Alto Networks customers to stop 99% of known and unknown malware&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Armor: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We have released new &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Armor&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; managed rules, powered by Thales Imperva&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;and&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;available in preview, to detect Layer 7 application attacks and zero-day CVEs (like &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/responding-to-cve-2025-55182"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;React2Shell&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;). &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Advancing Google Cloud security with SCC&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;As our Google Cloud-native security solution, Security Command Center (SCC) establishes a cloud security baseline to protect both your traditional and AI applications on Google Cloud:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;AI agents, models, and MCP servers are secured by providing continuous discovery and comprehensive risk analysis to identify threats, vulnerabilities, and misconfigurations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;SCC will add deep runtime visibility to uncover shadow AI for your Google Cloud workloads. Coming soon in preview, SCC will automatically discover unmanaged agentic workloads — including agents, MCP servers hosted on Cloud Run, GKE, and inference endpoints running on GKE, and surface those as posture findings in SCC.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Our enhanced &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/security-command-center?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Security Command Center Standard tier&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; provides data security posture management, compliance, vulnerability management, and risk analysis to help any Google Cloud customer establish strong security, compliance and risk coverage from the start at no additional costs. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Take the next step&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you make Google part of your security team, you gain the power of an intelligence-driven, AI-native defense; the freedom of an open cloud that’s secure-by-design; and the industry's most-battle tested experts as an extension of your organization. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more on these new innovations and how you can secure what’s next, &lt;/span&gt;&lt;a href="https://www.googlecloudevents.com/next-vegas/session-library?session_id=3818847&amp;amp;name=secure-what&amp;amp;" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;tune in to watch our security spotlight&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. And be sure to check out the many great security breakout sessions — live and on-demand — to learn more about all of our Next ‘26 announcements.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 12:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz/</guid><category>AI &amp; Machine Learning</category><category>Networking</category><category>Developers &amp; Practitioners</category><category>Google Cloud Next</category><category>Security &amp; Identity</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_3_Dark.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Next ‘26: Redefining security for the AI era with Google Cloud and Wiz</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_3_Dark.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Francis deSouza</name><title>COO, Google Cloud and President, Security Products</title><department></department><company></company></author></item><item><title>From keynote to the terminal: Join our Next ‘26 developer 
livestreams</title><link>https://cloud.google.com/blog/topics/developers-practitioners/join-our-next26-developer-livestreams/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The main stage at Google Cloud Next is where the vision is set. This year, we’re bridging the gap between those massive "Cloud-scale" announcements and your local terminal.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are thrilled to announce the Next ‘26 developer livestreams, a daily broadcast live from the show floor at Google Cloud Next. We aren't just reporting the news, we’re deconstructing it into actionable demos and immediate workflows before the keynote seats are even cold.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;What to expect&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Real-time demos that turn inspiration into versioning.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Energy from the show floor delivered straight to your screen.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Interviews with the builders, community leaders, and disruptors moving at light speed.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Schedule&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Day 1: From the Next ‘26 main stage to the terminal&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When: Wednesday, April 22, beginning at 11 AM PT&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=m9HeWXndjAU"
      data-glue-modal-trigger="uni-modal-m9HeWXndjAU-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/yt_vid_2_ctenqNy.max-1000x1000.png);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;From the Next ‘26 main stage to the terminal&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-m9HeWXndjAU-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="m9HeWXndjAU"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=m9HeWXndjAU"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Immediately following the opening keynote host Jason Davenport kicks things off with special guests including Acquired's Ben Gilbert and David Rosenthal to get their reaction to the day’s announcements. Then we dive into the hardware and platforms powering the next wave of AI with Addy Osmani, Shubham Saboo, Philip Kelly of Baseten, Yasmeen Ahmad, and other surprise guests. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Day 2: Next ‘26 Developer keynote deep-dive &lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When: Thursday, April 23, beginning at 12 PM PT&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=JemyjTlOvy0"
      data-glue-modal-trigger="uni-modal-JemyjTlOvy0-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/yt_vid_1.max-1000x1000.png);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Next ‘26 Developer keynote deep-dive&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-JemyjTlOvy0-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="JemyjTlOvy0"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=JemyjTlOvy0"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Fresh off the Developer Keynote, we’re taking the tech to the terminal. We’ll be live-coding agentic workflows and testing new announcements in real-world scenarios.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Host Stephanie Wong will sit down with Michele Catasta (President &amp;amp; Head of AI at Replit). We’ll also feature a "hot off the press" breakdown with Google Cloud’s Sarah Kennedy and Ricky Robinett, plus a security deep dive with Ankur Kotwal and Wiz’s Salman Ladha. And hear from LangChain’s Harrison Chase, and conversations with Googlers Kevin Moore and Ines Envid, and more!&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Where to watch&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Don’t just watch the news — build it with us. We’ll be streaming live across all your favorite platforms. Bookmark the links below and set your reminders now!&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/@googlecloudtech/streams" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Tech YouTube&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://x.com/GoogleCloudTech" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Tech X&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/showcase/google-cloud/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://discord.com/channels/1009525727504384150/@home" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Discord&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Replays will be available on-demand.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Next digital pass &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make sure you don't miss any of the action, claim your complimentary digital pass today. Stream select breakout and Spotlight sessions, catch the big keynote announcements as they drop, and enjoy short-form videos – all from wherever you happen to be. Plus, your digital ticket unlocks special offers once Next wraps up. &lt;/span&gt;&lt;a href="https://www.googlecloudevents.com/next-vegas/developer-experiences?utm_source=cgc-blog&amp;amp;utm_medium=blog&amp;amp;utm_campaign=FY26-Q2-GLOBAL-GLO27877-physicalevent-er-next26-mc-105752&amp;amp;utm_content=cgc-blog-lp-devs&amp;amp;utm_term=-" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register now&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;! &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;What’s next after Next? Stay agent-ready with GEAR&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The conversation around AI agents is moving fast. Want to stay in the loop? &lt;/span&gt;&lt;a href="https://developers.google.com/program/gear?utm_source=cgc-blog&amp;amp;utm_medium=blog&amp;amp;utm_campaign=FY-26-Q2-GEAR-sign-up&amp;amp;utm_content=livestream-blog-cgc&amp;amp;utm_term=-" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Join the Gemini Enterprise Agent Ready (GEAR)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; program and get access to curated news and learning materials from the experts at Google.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 21 Apr 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/join-our-next26-developer-livestreams/</guid><category>AI &amp; Machine Learning</category><category>Google Cloud Next</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Next_26_developer_livestreams.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>From keynote to the terminal: Join our Next ‘26 developer livestreams</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Next_26_developer_livestreams.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/join-our-next26-developer-livestreams/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Google Cloud Content &amp; Editorial </name><title></title><department></department><company></company></author></item><item><title>Introducing the Builders Hub from the Google Developer 
Program</title><link>https://cloud.google.com/blog/topics/developers-practitioners/introducing-the-builders-hub-from-the-google-developer-program/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today’s developer experience is often spread across dozens of consoles, documentation pages, and sites. We know that the friction of jumping between surfaces can slow down the most important part of your day: building.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To solve this, we are introducing &lt;a href="http://builders.google" rel="noopener" target="_blank"&gt;Builders Hub&lt;/a&gt; within Google Developer Program as a new centralized service designed to provide developers with a unified entry point, a workbench for projects, and resources—including personalized suggestions for community engagement and learning.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/builders_hub_-_cropped_1080_height_1.max-1000x1000.png"
        
          alt="builders hub - cropped 1080 height (1)"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are a vibe coder, an AI Builder, or a professional developer, Builders Hub has something to offer you. Learn more about how the new Builders Hub &lt;em&gt;&lt;strong&gt;helps you move faster&lt;/strong&gt;&lt;/em&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;A Frictionless "Front Door"&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/projects_-_full_list_grid_1.max-1000x1000.png"
        
          alt="projects - full list grid (1)"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Getting started should be measured in seconds, not hours. Builders Hub eliminates onboarding complexity by providing a unified activation point for all Google developer tools.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Unified Project Dashboard&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Access and view all of your Google Cloud, Firebase and AI Studio projects and apps from a single destination. You can now see at a glance exactly which services are enabled across your entire environment without hopping between separate consoles.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Personalized Learning &amp;amp; Interests&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Receive tailored recommendations and compatible interest suggestions based on the specific services you’ve selected. Builders Hub understands your tech stack and serves up the most relevant learning paths to help you master new tools faster.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Integrated Workbench: Build While You Learn&lt;/span&gt;&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/codelabs_-_Cloud_next_26_filtered_1.max-1000x1000.png"
        
          alt="codelabs - Cloud next &amp;#x27;26 filtered (1)"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We’re moving beyond static documentation. The new Builders Hub introduces an interactive environment where learning and execution happen side-by-side, allowing you to focus on innovation.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Integrated Credits &amp;amp; Seamless Execution&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Unlock Google Cloud credits directly within a Codelab to get started with zero friction. This seamless flow allows you to spin up real environments immediately, so you can learn by doing without the traditional operational toil of manual billing or account setup.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Showcase Your Proficiency with Badges:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Every milestone counts. Unlock and showcase digital badges that highlight your specific achievements and skill sets. These credentials allow you to prove your proficiency to the global community and potential employers.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Grow With Your Career&lt;/span&gt;&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Your_Profile_1.max-1000x1000.png"
        
          alt="Your Profile (1)"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With Builders Hub, Google Developer Program is no longer just a place to start—it’s where you build a legacy. We’ve expanded the Hub to prioritize community and professional recognition, giving you the tools to turn your technical proficiency into career-defining milestones.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Discover Local Communities &amp;amp; Events:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Connect with other builders in your backyard. The Hub now features a dedicated discovery engine for communities and local events, making it easier than ever to build your network and find your tribe.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Pulse of Google Developers&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Stay connected with an integrated feed of upcoming events and recent blog posts from across all of Google’s developer channels, curated directly within your workbench.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Get Started&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The transition to agentic, AI-driven development requires a new set of tools and a more integrated experience. Builders Hub is built to be your workbench for this next era.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Access the new Builders Hub today&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; by signing into Google Developer Program at &lt;/span&gt;&lt;a href="http://builders.google" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;builders.google&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 21 Apr 2026 13:26:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/introducing-the-builders-hub-from-the-google-developer-program/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_image_-_blog_post.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing the Builders Hub from the Google Developer Program</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_image_-_blog_post.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/introducing-the-builders-hub-from-the-google-developer-program/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Chris Demeke</name><title>Group Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Bala Muthukrishnan</name><title>Senior Product Manager</title><department></department><company></company></author></item><item><title>Create Expert Content: Deploying a Multi-Agent System with Terraform and Cloud Run</title><link>https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-deploying-a-multi-agent-system-with-terraform-and-cloud-run/</link><description>&lt;div 
class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In support of our mission to accelerate the developer journey on Google Cloud, we built Dev Signal: a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the first three parts of this series, we laid the essential groundwork by establishing its core capabilities and local verification process:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1"&gt;part 1&lt;/a&gt;, &lt;span style="vertical-align: baseline;"&gt;we standardize the agent's capabilities through the Model Context Protocol (MCP), connecting it to Reddit for trend discovery and Google Cloud Docs for technical grounding. In &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run?utm_campaign=CDR_0x91b1edb5_default_b8022895&amp;amp;utm_medium=external&amp;amp;utm_source=social"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;part 2&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, we built a multi-agent architecture and integrated the Vertex AI memory bank to allow the system to learn and persist user preferences across different conversations. In &lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory"&gt;part 3&lt;/a&gt;, we verified the full end-to-end lifecycle locally using a dedicated test runner to ensure that research, content creation, and cloud-based memory retrieval were perfectly synchronized.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you’d like to dive straight into the code, you can clone the repository &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Deployment to Cloud Run and the Path to Production&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help you transition from this local prototype to a production service, this final part focuses on building the production backbone of your agent using the foundational deployment patterns provided by the &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/agent-starter-pack" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Starter Pack&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. We will implement the essential structural components required for monitoring, data integrity, and long-term state management in the cloud. You will learn to implement the application server and helper utilities needed for a production-ready deployment before provisioning secure, reproducible infrastructure with Terraform.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While the Dockerfile packages your agent's code and its specialized dependencies, such as Node.js for the Reddit MCP tool, Terraform is used to build the platform it lives on. Terraform automates the creation of your Artifact Registry, least-privilege service accounts, and Secret Manager integrations to ensure your API keys remain protected.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By the end of this part, you will have a standardized application framework deployed on Google Cloud Run and a roadmap for graduating your prototype through continuous evaluation, CI/CD and advanced observability.&lt;/span&gt;&lt;/p&gt;
&lt;h2 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Production Utilities and Server: Building the System's Body&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this section, you implement the structural components required for monitoring and long-term state management in the cloud.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Application Server:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Initializing the FastAPI server and establishing a vital connection to the Vertex AI memory bank.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Implementing Telemetry: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Enabling 'Agent Traces' for visibility into internal reasoning.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Application Server &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;fast_api_app.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file serves as the vital entry point for your agent, transforming the core logic into a production FastAPI server that acts as the "body" of your system. When deploying to Cloud Run, this server is essential because it provides the necessary web interface to listen for incoming HTTP requests and dispatch them to the agent for processing. Beyond basic serving, its most critical role is establishing a connection to the Vertex AI memory bank by defining a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;MEMORY_URI&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, which allows the ADK framework to persist and retrieve user preferences across different production sessions. Additionally, the application server initializes production-grade telemetry for real-time monitoring.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Go back to the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent folder.&lt;/code&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;cd ..&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8dd60&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste the following code in &lt;/span&gt;&lt;code&gt;dev_signal_agent/fast_api_app.py&lt;/code&gt;: &lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import os\r\nfrom fastapi import FastAPI\r\nfrom google.adk.cli.fast_api import get_fast_api_app\r\nfrom google.cloud import logging as cloud_logging\r\nfrom vertexai import agent_engines\r\nfrom dev_signal_agent.app_utils.env import init_environment\r\n\r\n# --- Initialization &amp;amp; Secure Secret Retrieval ---\r\n# We now unpack the SECRETS dictionary returned by our updated env.py\r\nPROJECT_ID, MODEL_LOC, SERVICE_LOC, SECRETS = init_environment()\r\nlogger = cloud_logging.Client().logger(__name__)\r\n\r\n# Access sensitive credentials from the SECRETS dictionary \r\n# These keys stay in memory and are NOT injected into os.environ\r\nREDDIT_CLIENT_ID = SECRETS.get(&amp;quot;REDDIT_CLIENT_ID&amp;quot;)\r\nREDDIT_CLIENT_SECRET = SECRETS.get(&amp;quot;REDDIT_CLIENT_SECRET&amp;quot;)\r\nREDDIT_USER_AGENT = SECRETS.get(&amp;quot;REDDIT_USER_AGENT&amp;quot;)\r\nDK_API_KEY = SECRETS.get(&amp;quot;DK_API_KEY&amp;quot;)\r\n\r\n# --- Configuration &amp;amp; Sessions ---\r\nAGENT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))\r\n# Non-sensitive configuration uses environment variables \r\nBUCKET = os.environ.get(&amp;quot;AI_ASSETS_BUCKET&amp;quot;) \r\nUSE_IN_MEMORY = os.environ.get(&amp;quot;USE_IN_MEMORY_SESSION&amp;quot;, &amp;quot;&amp;quot;).lower() in (&amp;quot;true&amp;quot;, &amp;quot;1&amp;quot;)\r\n\r\n# --- MEMORY BANK CONNECTION ---\r\ndef _get_memory_bank_uri():\r\n    if USE_IN_MEMORY: return None, None\r\n    # We use \&amp;#x27;dev_signal_agent\&amp;#x27; as the display name for the Vertex AI memory bank\r\n    name = os.environ.get(&amp;quot;AGENT_ENGINE_MEMORY_BANK_NAME&amp;quot;, &amp;quot;dev_signal_agent&amp;quot;) \r\n    existing = list(agent_engines.list(filter=f&amp;quot;display_name={name}&amp;quot;))\r\n    ae = existing[0] if existing else agent_engines.create(display_name=name)\r\n    uri = f&amp;quot;agentengine://{ae.resource_name}&amp;quot;\r\n    print(f&amp;quot;DEBUG: Connecting to Memory Bank: {uri} (display_name={name})&amp;quot;)\r\n    return uri, uri\r\n\r\nSESSION_URI, MEMORY_URI = _get_memory_bank_uri()\r\n\r\n# --- Initialize FastAPI with ADK ---\r\napp: FastAPI = get_fast_api_app(\r\n    agents_dir=AGENT_DIR,\r\n    web=True,\r\n    artifact_service_uri=f&amp;quot;gs://{BUCKET}&amp;quot; if BUCKET else None,\r\n    allow_origins=os.getenv(&amp;quot;ALLOW_ORIGINS&amp;quot;, &amp;quot;&amp;quot;).split(&amp;quot;,&amp;quot;) if os.getenv(&amp;quot;ALLOW_ORIGINS&amp;quot;) else None,\r\n    session_service_uri=SESSION_URI,\r\n    memory_service_uri=MEMORY_URI, # &amp;lt;--- Connects the Memory Bank\r\n    otel_to_cloud=True,            # &amp;lt;--- Enables production telemetry\r\n)\r\n\r\nif __name__ == &amp;quot;__main__&amp;quot;:\r\n    import uvicorn\r\n    # Standard Cloud Run port is 8080 \r\n    uvicorn.run(app, host=&amp;quot;0.0.0.0&amp;quot;, port=8080)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8de20&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Implementing Telemetry&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a production environment, visibility into your agent's reasoning is critical. We leverage the built-in observability features of the Google ADK by setting the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;otel_to_cloud=True&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; flag in our application server. This single parameter handles the majority of the instrumentation automatically, exporting "Agent Traces" directly to the Google Cloud Console. These traces provide a "visual waterfall" of the agent's operation, including individual agent thought processes, LLM invocations, and MCP tool calls.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Monitoring vs. Targeted Evaluation&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It is essential to understand that production tracing is subject to sampling to balance performance and cost. Because Cloud Run captures only a subset of requests, not every individual user interaction will be visible.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;System Traces (Monitoring):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Used to analyze behavior "at large," such as identifying latency bottlenecks or system timeouts.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reasoning Traces (Evaluation):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; High-quality evaluation mandates targeted trace capture. This means calling the agent specifically for a test case where you know you will evaluate that particular request in full detail.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
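&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make targeted trace capture concrete, here is a minimal Python sketch of invoking the deployed agent for one known test case, so that that single request's trace can later be inspected in full. The endpoint path and payload shape follow common ADK FastAPI conventions (a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/run&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; endpoint taking an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app_name&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;/&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;new_message&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; body), and the service URL is a placeholder; verify both against your own deployment:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import json
import urllib.request


def build_run_payload(app_name: str, user_id: str, session_id: str, text: str) -> dict:
    # Request body shape assumed from the ADK FastAPI server's /run endpoint.
    return {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": text}]},
    }


def run_eval_case(base_url: str, payload: dict) -> dict:
    # Deliberately call the agent for one known test case; note the time and
    # session ID so the matching trace is easy to find in Trace Explorer.
    req = urllib.request.Request(
        f"{base_url}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example (hypothetical service URL):
# result = run_eval_case(
#     "https://dev-signal-xxxx-uc.a.run.app",
#     build_run_payload("dev_signal_agent", "eval-user", "eval-session-1",
#                       "Summarize this week's top Cloud Run questions"),
# )
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because you initiated the request yourself, you know exactly which trace to pull up and evaluate end to end, rather than hoping the sampler captured an organic user interaction.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;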
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Viewing the Trace&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To see your traces, navigate to the Trace Explorer in the Google Cloud Console and filter for your service (e.g., &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev-signal&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;). Clicking a specific Trace ID opens a Gantt chart that allows you to distinguish between cognitive reasoning failures (wrong decisions) and physical system issues (timeouts).&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/trace.max-1000x1000.png"
        
          alt="trace"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For advanced configurations, refer to the following documentation:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/run/docs/trace#trace_sampling_rate?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run Trace Sampling&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/instrumentation/ai-agent-adk#configure?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Configuring ADK Telemetry&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/trace/docs/collect-view-multimodal-prompts-responses?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Multimodal Trace Capture&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://google.github.io/adk-docs/integrations/bigquery-agent-analytics/" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;BigQuery Agent Analytics Integration&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Infrastructure as Code: Provisioning Secure Cloud Resources&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We utilize the infrastructure-as-code patterns provided by the &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/agent-starter-pack" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Starter Pack&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;'s security-first design. The starter pack builds the professional platform required to automate the creation of least-privilege service accounts and robust secret management in seconds.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Using Terraform ensures that your entire Google Cloud environment - from IAM roles to Secret Manager versions - is defined in reproducible, secure code. We break our infrastructure into the following logical blocks:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Resources &amp;amp; Variables&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Define the specific project, region, and sensitive API secrets used by the agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Core Infrastructure&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Enable essential APIs and provision a private Artifact Registry to host your agent's container images.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Identity &amp;amp; Access Management (IAM)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Configure specialized Service Accounts that strictly follow the Principle of Least Privilege to ensure your system remains secure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secret Management&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Securely ingest API credentials into Google Secret Manager for protected runtime access.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Run Configuration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Define the container environment, resource limits, and automated secret injection for the final deployment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To begin provisioning, return to the root folder of your project (dev-signal) and create the necessary deployment directories:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;cd ..\r\nmkdir deployment\r\ncd deployment\r\nmkdir terraform\r\ncd terraform&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8dee0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Terraform Resources and Variables&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;variables.tf&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file defines the configurable parameters for your deployment, allowing you to customize the infrastructure without altering the underlying logic. It includes variables for the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;project_id&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the deployment &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;region&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (defaulting to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;us-central1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;), and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;service_name&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for your Cloud Run instance. Furthermore, it defines a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;secrets&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; map used to securely ingest sensitive API credentials—such as Reddit and Developer Knowledge keys—into Google Secret Manager for runtime access. This modular approach ensures your production environment remains reproducible, secure, and adaptable across different projects.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste the following code into &lt;/span&gt;&lt;code&gt;deployment/terraform/variables.tf&lt;/code&gt;:&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;variable &amp;quot;project_id&amp;quot; {\r\n description = &amp;quot;The Google Cloud Project ID&amp;quot;\r\n type        = string\r\n}\r\nvariable &amp;quot;region&amp;quot; {\r\n description = &amp;quot;The Google Cloud region to deploy to&amp;quot;\r\n type        = string\r\n default     = &amp;quot;us-central1&amp;quot;\r\n}\r\nvariable &amp;quot;service_name&amp;quot; {\r\n description = &amp;quot;The name of the Cloud Run service&amp;quot;\r\n type        = string\r\n default     = &amp;quot;dev-signal&amp;quot;\r\n}\r\nvariable &amp;quot;secrets&amp;quot; {\r\n description = &amp;quot;A map of secret names and their values (e.g., REDDIT_CLIENT_ID, DK_API_KEY)&amp;quot;\r\n type        = map(string)\r\n default     = {}\r\n}\r\nvariable &amp;quot;ai_assets_bucket&amp;quot; {\r\n description = &amp;quot;The GCS bucket for storing AI assets&amp;quot;\r\n type        = string\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d9d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Core Infrastructure Logic &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We define our infrastructure in logical blocks. Here is what each part does:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Enable APIs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Ensures the project has the necessary services active (Cloud Run, Vertex AI, etc.). We use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;disable_on_destroy = false&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to prevent accidental data loss if the Terraform is destroyed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste the following code into &lt;/span&gt;&lt;code&gt;deployment/terraform/main.tf&lt;/code&gt;:&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;resource &amp;quot;google_project_service&amp;quot; &amp;quot;services&amp;quot; {\r\n  project = var.project_id\r\n  for_each = toset([\r\n    &amp;quot;run.googleapis.com&amp;quot;,\r\n    &amp;quot;artifactregistry.googleapis.com&amp;quot;,\r\n    &amp;quot;cloudbuild.googleapis.com&amp;quot;,\r\n    &amp;quot;aiplatform.googleapis.com&amp;quot;,\r\n    &amp;quot;secretmanager.googleapis.com&amp;quot;,\r\n    &amp;quot;logging.googleapis.com&amp;quot;\r\n  ])\r\n  service            = each.key\r\n  disable_on_destroy = false\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d340&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Artifact Registry&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Creates a private Docker registry to store our agent's container images.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;resource &amp;quot;google_artifact_registry_repository&amp;quot; &amp;quot;repo&amp;quot; {\r\n  location      = var.region\r\n  project       = var.project_id\r\n  repository_id = &amp;quot;dev-signal-repo&amp;quot;\r\n  description   = &amp;quot;Docker repository for Dev Signal Agent&amp;quot;\r\n  format        = &amp;quot;DOCKER&amp;quot;\r\n  depends_on    = [google_project_service.services]\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d850&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;3. Service Account &amp;amp; IAM: Adhering to the Principle of Least Privilege&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; - This is a critical security step. In accordance with the Principle of Least Privilege, we avoid using the default compute service account and instead provision a dedicated user-managed service account (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev-signal-sa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;). By designating this as the Cloud Run service identity, we can grant it only the minimum necessary permissions—specifically &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;roles/aiplatform.user&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;roles/logging.logWriter&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;roles/storage.objectAdmin&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. This granular access control ensures that the agent has the exact permissions required to interact with Vertex AI and Cloud Storage without over-granting access to other sensitive cloud resources, significantly reducing the potential impact of a compromised account. 
Learn more &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/iam/docs/best-practices-service-accounts?content_ref=because%20a%20service%20account%20is%20a%20principal%20you%20must%20limit%20its%20privileges%20to%20reduce%20the%20potential%20harm%20that%20can%20be%20done%20by%20a%20compromised%20service%20account&amp;amp;utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;best practices for using service accounts securely&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;resource &amp;quot;google_service_account&amp;quot; &amp;quot;agent_sa&amp;quot; {\r\n  project      = var.project_id\r\n  account_id   = &amp;quot;${var.service_name}-sa&amp;quot;\r\n  display_name = &amp;quot;Dev Signal Agent Service Account&amp;quot;\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8da30&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
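&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The block above creates the service account itself; the role bindings described earlier are not shown. As an illustrative sketch (not the repository's exact code), the three roles named above could be granted with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google_project_iam_member&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; resources like this:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```hcl
# Illustrative: bind only the minimum roles named above to the agent's service account.
resource "google_project_iam_member" "agent_roles" {
  for_each = toset([
    "roles/aiplatform.user",     # Vertex AI access (models, memory bank)
    "roles/logging.logWriter",   # write application logs
    "roles/storage.objectAdmin", # read/write AI assets in Cloud Storage
  ])
  project = var.project_id
  role    = each.key
  member  = "serviceAccount:${google_service_account.agent_sa.email}"
}
```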
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;4. &lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;Secret Management&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This handles your API keys securely. It creates secrets in Google Secret Manager and gives the agent's Service Account permission to access them at runtime.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;resource &amp;quot;google_secret_manager_secret&amp;quot; &amp;quot;agent_secrets&amp;quot; {\r\n  project   = var.project_id\r\n  for_each  = toset(keys(var.secrets))\r\n  secret_id = each.key\r\n  replication {\r\n    auto {}\r\n  }\r\n  depends_on = [google_project_service.services]\r\n}\r\n\r\nresource &amp;quot;google_secret_manager_secret_version&amp;quot; &amp;quot;agent_secrets_version&amp;quot; {\r\n  for_each    = toset(keys(var.secrets))\r\n  secret      = google_secret_manager_secret.agent_secrets[each.key].id\r\n  secret_data = var.secrets[each.key]\r\n}\r\n\r\nresource &amp;quot;google_secret_manager_secret_iam_member&amp;quot; &amp;quot;secret_accessor&amp;quot; {\r\n  project   = var.project_id\r\n  for_each  = toset(keys(var.secrets))\r\n  secret_id = google_secret_manager_secret.agent_secrets[each.key].id\r\n  role      = &amp;quot;roles/secretmanager.secretAccessor&amp;quot;\r\n  member    = &amp;quot;serviceAccount:${google_service_account.agent_sa.email}&amp;quot;\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d910&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;5. Cloud Run Configuration:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Security Best Practice:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To satisfy production security standards, our &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;main.tf&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; grants the Service Account the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;secretmanager.secretAccessor&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; role. Our Python application then uses the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/best-practices#coding-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Secret Manager SDK &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;to pull these credentials directly into local memory at runtime, ensuring they never touch the container's environment configuration&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# 5. Cloud Run Service Deployment\r\nresource &amp;quot;google_cloud_run_v2_service&amp;quot; &amp;quot;default&amp;quot; {\r\n  project  = var.project_id\r\n  name     = var.service_name\r\n  location = var.region\r\n  ingress  = &amp;quot;INGRESS_TRAFFIC_ALL&amp;quot;\r\n\r\n  template {\r\n    service_account = google_service_account.agent_sa.email\r\n    \r\n    containers {\r\n      image = &amp;quot;us-docker.pkg.dev/cloudrun/container/hello&amp;quot; # Placeholder until first build\r\n     \r\n      env {\r\n        name  = &amp;quot;GOOGLE_CLOUD_PROJECT&amp;quot;\r\n        value = var.project_id\r\n      }\r\n      env {\r\n        name  = &amp;quot;GOOGLE_CLOUD_LOCATION&amp;quot;\r\n        value = &amp;quot;global&amp;quot;\r\n      }\r\n      env {\r\n        name  = &amp;quot;GOOGLE_GENAI_USE_VERTEXAI&amp;quot;\r\n        value = &amp;quot;True&amp;quot;\r\n      }\r\n      env {\r\n        name  = &amp;quot;AI_ASSETS_BUCKET&amp;quot;\r\n        value = var.ai_assets_bucket\r\n      }\r\n\r\n      resources {\r\n        limits = {\r\n          cpu    = &amp;quot;1&amp;quot;\r\n          memory = &amp;quot;2Gi&amp;quot;\r\n        }\r\n      }\r\n    }\r\n  }\r\n  \r\n  traffic {\r\n    type    = &amp;quot;TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST&amp;quot;\r\n    percent = 100\r\n  }\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d880&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Provision the Infrastructure&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before we can deploy our code, we need to provision the Google Cloud infrastructure we just defined.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Initialize Terraform&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This downloads the necessary provider plugins. Run this in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment/terraform&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; folder&lt;/span&gt;:&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;terraform init&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8db20&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Create a Variables File&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment/terraform/terraform.tfvars&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and update it with your project details and secrets.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;project_id = &amp;quot;your-project-id&amp;quot;\r\nregion     = &amp;quot;us-central1&amp;quot;\r\nservice_name      = &amp;quot;dev-signal&amp;quot;\r\nai_assets_bucket  = &amp;quot;your-bucket-name&amp;quot;\r\nsecrets = {\r\n  REDDIT_CLIENT_ID     = &amp;quot;your_client_id&amp;quot;\r\n  REDDIT_CLIENT_SECRET = &amp;quot;your_client_secret&amp;quot;\r\n  REDDIT_USER_AGENT    = &amp;quot;your_user_agent&amp;quot;\r\n  DK_API_KEY           = &amp;quot;your_dk_api_key&amp;quot;\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d040&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Plan configuration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This allows you to review the changes before they are applied. Run this in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment/terraform&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; folder:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;terraform plan -out=plan.tfplan&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d1f0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Apply Configuration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Once you have reviewed the plan and confirmed it does what you want, run:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;terraform apply plan.tfplan&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8dcd0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Deployment: Containerization and the Cloud Build Pipeline&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this final stage of the build process, we package our agent's "body" and "brain" into a portable, production-ready container. This ensures that every component - from our Python logic to the Node.js environment required for the Reddit MCP tool - is bundled together with its exact dependencies.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We utilize a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Dockerfile&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to define this environment and a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Makefile&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to orchestrate the deployment pipeline. When you trigger the deployment, &lt;/span&gt;&lt;a href="https://console.cloud.google.com/cloud-build/builds" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Build&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; takes your local source code, builds the container image according to the Dockerfile, and stores it in the private Artifact Registry created earlier by Terraform. Finally, the pipeline automatically updates your Cloud Run service to serve traffic using this fresh image, completing the journey from local code to a live, secure cloud workload.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code&gt;dev-signal/Dockerfile&lt;/code&gt;:&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;FROM python:3.12-slim\r\n\r\n# Install Node.js and npm for MCP tools (like reddit-mcp)\r\nRUN apt-get update &amp;amp;&amp;amp; apt-get install -y \\\r\n    curl \\\r\n    &amp;amp;&amp;amp; curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \\\r\n    &amp;amp;&amp;amp; apt-get install -y nodejs \\\r\n    &amp;amp;&amp;amp; npm install -g reddit-mcp \\\r\n    &amp;amp;&amp;amp; apt-get clean \\\r\n    &amp;amp;&amp;amp; rm -rf /var/lib/apt/lists/*\r\n\r\nRUN pip install --no-cache-dir uv==0.8.13\r\n\r\nWORKDIR /code\r\n\r\nCOPY ./pyproject.toml ./README.md ./uv.lock* ./\r\nCOPY ./dev_signal_agent ./dev_signal_agent\r\n\r\nRUN uv sync --frozen\r\n\r\nEXPOSE 8080\r\n\r\nCMD [&amp;quot;uv&amp;quot;, &amp;quot;run&amp;quot;, &amp;quot;uvicorn&amp;quot;, &amp;quot;dev_signal_agent.fast_api_app:app&amp;quot;, &amp;quot;--host&amp;quot;, &amp;quot;0.0.0.0&amp;quot;, &amp;quot;--port&amp;quot;, &amp;quot;8080&amp;quot;]&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8dc40&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Makefile&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; automates the build and deploys.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code&gt;dev-signal/Makefile&lt;/code&gt;:&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;PROJECT_ID ?= $(shell gcloud config get-value project)\r\nREGION     ?= us-central1\r\nIMAGE_REPO ?= dev-signal-repo\r\nIMAGE      := $(REGION)-docker.pkg.dev/$(PROJECT_ID)/$(IMAGE_REPO)/agent:latest\r\n\r\n# Deploy via Cloud Build &amp;amp; Container\r\ndocker-deploy:\r\n\t@echo &amp;quot;Building and deploying to $(PROJECT_ID) via Cloud Build...&amp;quot;\r\n\tgcloud builds submit --tag $(IMAGE) --project $(PROJECT_ID) .\r\n\tgcloud run services update dev-signal \\\r\n\t\t--image $(IMAGE) \\\r\n\t\t--region $(REGION) \\\r\n\t\t--project $(PROJECT_ID) \\\r\n\t\t--labels dev-tutorial=dev-signal-agent&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d790&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy Application&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now that our infrastructure is ready, we can build and deploy the application code.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Run the following command from the root of your project:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;make docker-deploy&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d100&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What happens when you run this?&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Google Cloud Build takes your local code and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, builds a container image, and stores it in the Artifact Registry.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deploy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: It updates the Cloud Run service defined in Terraform to use this new image.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When the deployment completes, you should get a message like this:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Service [dev-signal] revision [dev-signal...] has been deployed and is serving 100 percent of traffic.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;Service URL: https://dev-signal-...-.us-central1.run.app&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Verification: Accessing and Testing Your Deployed Agent&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Since production services are private by default, this section covers how to grant permissions and access the agent securely.&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Managing IAM Permissions:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Granting the necessary &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;run.invoker&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; role to authorized users.&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure Access via Cloud Run Proxy:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; proxy to interact with your live service.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Granting User Permissions&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before you can invoke the service, you must grant your Google account the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;roles/run.invoker&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; role for this specific service. Run the following command:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud run services add-iam-policy-binding dev-signal \\\r\n  --member=&amp;quot;user:$(gcloud config get-value account)&amp;quot; \\\r\n  --role=&amp;quot;roles/run.invoker&amp;quot; \\\r\n  --region=us-central1 \\\r\n  --project=$(gcloud config get-value project)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8dfd0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Launch the Proxy&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, access your private service securely via the proxy:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud run services proxy dev-signal \\\r\n  --region us-central1 \\\r\n  --project $(gcloud config get-value project)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271c8d3d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Visit &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;http://localhost:8080&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to chat with your deployed agent! S&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;ee a possible test scenario in &lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory"&gt;part 3&lt;/a&gt; of the series.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Summary&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Congratulations! You have successfully built &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Dev Signal&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What we covered:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1"&gt;&lt;strong style="vertical-align: baseline;"&gt;Tooling (MCP)&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: You connected your agent to &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Reddit&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Docs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Local Image Generator&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the Model Context Protocol.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run"&gt;&lt;strong style="vertical-align: baseline;"&gt;Architecture&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: You implemented a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Root Orchestrator&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; managing specialized agents (Scanner, Expert, Drafter).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory"&gt;&lt;strong style="vertical-align: baseline;"&gt;Memory&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: You integrated &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Vertex AI memory bank&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to give your agent long-term persistence across sessions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Production&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: You deployed the entire stack to &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud Run&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Terraform&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for secure, reproducible infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You now have a solid foundation for building sophisticated, stateful AI applications on Google Cloud.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 17 Apr 2026 08:56:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-deploying-a-multi-agent-system-with-terraform-and-cloud-run/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Create Expert Content: Deploying a Multi-Agent System with Terraform and Cloud Run</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-deploying-a-multi-agent-system-with-terraform-and-cloud-run/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI, Product DevRel</title><department></department><company></company></author></item><item><title>Building Event-Driven Data Agents with BigQuery, Pub/Sub, and ADK</title><link>https://cloud.google.com/blog/topics/developers-practitioners/building-event-driven-data-agents-with-bigquery-pubsub-and-adk/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Need for Real-Time Autonomous Agents&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Data is only as valuable as your ability to act on it. In the modern enterprise, reacting to events hours—or even minutes—after they occur is often too late. Whether you're dealing with financial fraud or dynamic supply chain disruptions, every second counts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But many systems still rely on slow scheduled batch jobs or fragile microservices that constantly poll for changes. By the time a problem surfaces, it's often too late. That leaves human investigators scrambling to piece things together by digging through logs and database queries. It's a slow, painful process that just doesn't scale.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Enter Event-Driven Data Agents&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What if, instead of waiting for slow pipelines and manual triage, your data platform could instantly push an alert as soon as an anomaly is detected, triggering an autonomous AI agent to investigate and resolve it?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is the promise of the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Event-Driven Data Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; architecture. By combining &lt;/span&gt;&lt;a href="https://cloud.google.com/bigquery"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; continuous queries&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/pubsub/docs/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Pub/Sub&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-development-kit/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;ADK Agents&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; on &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI Agent Engine&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, you can build a pipeline that triages events in real time and autonomously investigates them. The agent uses advanced reasoning to gather context, analyze the data, and either resolve the issue on the spot or escalate it to a person when human-in-the-loop intervention is needed.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Hybrid Architecture: How it Works&lt;/span&gt;&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/blog_image2.max-1000x1000.jpg"
        
          alt="blog_image2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This event-driven pipeline leverages three core building blocks:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detection:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; BigQuery continuous queries monitor live data streams and detect anomalies using a rules-based engine.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Routing:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Pub/Sub reliably delivers these events, using Single Message Transforms (SMTs) to reshape the payloads into the exact format your AI agents expect, thereby triggering the agentic pipeline to start its investigation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Resolution:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A Vertex AI Agent (built with ADK) receives the event, investigates using custom tools, and logs its decision.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let’s dive in and explore each component. To make this concrete, we'll walk through a simple use case: detecting and investigating fraudulent financial transactions in real-time.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Part 1: BigQuery Continuous Queries&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://cloud.google.com/bigquery/docs/continuous-queries-introduction"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery continuous queries&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; allow you to build real-time event streams natively using standard SQL. They are persistent SQL queries that run continuously, analyzing incoming data and immediately exporting SQL results to destinations like Pub/Sub.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;The shift from pulling to pushing streaming events natively in BigQuery&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; means you can detect complex anomalies (like a user transacting in two different countries within a user-specified window) within your data warehouse using standard SQL. There’s no need to move your data to a separate streaming analytics engine.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This transformation is powered by the launch of BigQuery continuous query &lt;/span&gt;&lt;a href="https://cloud.google.com/bigquery/docs/continuous-queries#stateful_processing_with_joins_and_windowing_aggregations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;stateful data processing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in public preview, which introduces native support for stream-to-stream JOINs, windowed aggregations, and tumbling windows. By allowing you to correlate disparate data streams and calculate complex metrics—such as rolling averages or sum totals—directly in BigQuery, we are democratizing stream processing for any SQL user. This eliminates the need for specialized external tools or deep data science expertise to build a real-time 'System of Action' that detects and reacts to events as they happen. This approach also helps manage LLM token costs; by using stateful SQL to filter for specific anomalies, you ensure that your agents only process the exact context they need, rather than overwhelming them with raw data.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Implementing this is straightforward.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By combining a standard SQL query with an EXPORT DATA statement, you can &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/bigquery/docs/continuous-queries#export-pubsub"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;route matching rows directly into a Pub/Sub topic&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; the second they occur:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;EXPORT DATA OPTIONS (\r\n  format = &amp;quot;CLOUD_PUBSUB&amp;quot;,\r\n  uri = &amp;quot;https://pubsub.googleapis.com/projects/YOUR_PROJECT_ID/topics/cymbal-bank-escalations-topic&amp;quot;\r\n) AS (\r\n  WITH TransactionHeuristics AS (\r\n    SELECT\r\n      *,\r\n      _CHANGE_TIMESTAMP AS bq_changed_ts,\r\n    FROM APPENDS(TABLE `cymbal_bank.retail_transactions`, CURRENT_TIMESTAMP() - INTERVAL 10 MINUTE)\r\n  )\r\n  SELECT\r\n    TO_JSON_STRING(STRUCT(\r\n      window_end,\r\n      user_id,\r\n      COUNT(*) AS tx_count,\r\n      SUM(amount) AS total_window_spend,\r\n      MAX_BY(merchant_name, amount) AS highest_value_merchant,\r\n      MAX_BY(merchant_category_code, amount) AS highest_value_mcc,\r\n      100 AS final_risk_score,\r\n      STRUCT(\r\n        APPROX_COUNT_DISTINCT(location_country) &amp;gt; 1 AS is_impossible_travel,\r\n        LOGICAL_OR(NOT is_trusted_device) AS has_security_mismatch\r\n      ) AS logic_signals\r\n    )) AS data\r\n  FROM TUMBLE(TABLE TransactionHeuristics, &amp;quot;bq_changed_ts&amp;quot;, INTERVAL 2 MINUTE)\r\n  GROUP BY window_start, window_end, user_id\r\n  HAVING APPROX_COUNT_DISTINCT(location_country) &amp;gt; 1\r\n);&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-sql&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726c387df0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
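The core of the detection logic above — bucket a user's change events into 2-minute tumbling windows and flag any window that spans more than one country — can be sketched in plain Python (the record shape here is hypothetical, for reasoning about the SQL only):

```python
from collections import defaultdict

WINDOW_SECONDS = 120  # 2-minute tumbling windows, matching the SQL TUMBLE interval

def flag_impossible_travel(events):
    """events: iterable of (user_id, epoch_seconds, country) tuples.

    Returns {(user_id, window_start): countries} for windows spanning more than
    one country, mirroring HAVING APPROX_COUNT_DISTINCT(location_country) > 1.
    """
    windows = defaultdict(set)
    for user_id, ts, country in events:
        window_start = ts - (ts % WINDOW_SECONDS)  # tumbling (non-overlapping) bucket
        windows[(user_id, window_start)].add(country)
    return {key: countries for key, countries in windows.items() if len(countries) > 1}

events = [
    ("u1", 0, "US"),
    ("u1", 60, "FR"),   # same 2-minute window, second country -> flagged
    ("u2", 10, "US"),
    ("u2", 130, "US"),  # different window, same country -> not flagged
]
# flag_impossible_travel(events) -> {("u1", 0): {"US", "FR"}}
```

Because the windows tumble rather than slide, each event lands in exactly one bucket, which is what keeps the SQL version cheap to run continuously.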
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Part 2: Pub/Sub &amp;amp; Single Message Transforms (SMT)&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Bridging the schema gap with Pub/Sub.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The exported event data from our continuous query is sent directly to a Pub/Sub topic. Before this raw data can be consumed by our AI agent, the payload needs to be transformed to match the schema expected by our agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of deploying something like a dedicated Cloud Function to reformat these messages, you can handle it entirely within the Pub/Sub subscription using a &lt;/span&gt;&lt;a href="https://cloud.google.com/pubsub/docs/smts/smts-overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Single Message Transform (SMT)&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. SMTs allow you to run lightweight, inline &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/pubsub/docs/smts/udfs-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JavaScript User-Defined Functions (UDFs) &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;directly within Pub/Sub to map, reshape, or clean the payload on the fly.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For instance, you can define a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;transform.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with a JavaScript snippet that intercepts the BigQuery payload and wraps it in the exact &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;query&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; format our Agent Engine expects:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;function process(res) {\r\n  let bq_payload = JSON.parse(res.message.data);\r\n  res.message.data = JSON.stringify({&amp;quot;query&amp;quot;: bq_payload});\r\n  return res;\r\n}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-js&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726c3876d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
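The effect of that UDF is easy to reason about locally: take the raw message data (the JSON row BigQuery exported) and wrap it under a `query` key. A Python mirror of the same reshaping, useful for unit-testing the transform logic before you wire it into the subscription:

```python
import json

def transform(message_data: str) -> str:
    """Mirror of the SMT UDF: parse the exported BigQuery row and wrap it
    as {"query": <row>} so the payload matches what the agent expects."""
    bq_payload = json.loads(message_data)
    return json.dumps({"query": bq_payload})

# Example with a hypothetical exported row:
raw = json.dumps({"user_id": "u1", "tx_count": 3, "final_risk_score": 100})
wrapped = json.loads(transform(raw))
# wrapped["query"]["user_id"] == "u1"
```

Keeping the transform this small is the point of SMTs: the reshaping runs inline in Pub/Sub, so there is no extra service to deploy, scale, or monitor.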
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;To configure the routing pipeline&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, you create a Pub/Sub Push Subscription. This subscription automatically pushes every transformed BigQuery event directly to your AI agent's webhook endpoint:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;gcloud pubsub subscriptions create cymbal-bank-escalations-sub \
  --topic=projects/$PROJECT_ID/topics/cymbal-bank-escalations-topic \
  --message-transforms-file=setup/transform.yaml \
  --push-endpoint="https://YOUR_AGENT_WEBHOOK_URL" \
  --push-no-wrapper \
  --ack-deadline=600 \
  --push-auth-service-account="adk-agent-sa@$PROJECT_ID.iam.gserviceaccount.com"&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/pub_sub_screenshot.max-1000x1000.png"
        
          alt="pub_sub_screenshot"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Notice the push-endpoint parameter above. This webhook URL is generated by our final architectural piece: the AI Agent itself.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Part 3: ADK and Vertex AI Agent Engine&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When an agent is deployed to &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI Agent Engine&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the platform automatically provisions a secure &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;streamQuery&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; endpoint specifically designed to receive these incoming events.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;This is the brain of the operation.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Once an anomaly is detected and routed via Pub/Sub, the message triggers an ADK agent deployed on Vertex AI.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
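Once the agent is deployed, the push subscription's webhook URL is just that deployment's streamQuery URL. As a rough sketch (the helper function and sample resource name below are illustrative, not part of the official SDK), you can derive it from the Agent Engine resource name returned at deployment time:

```python
# Sketch: derive the Pub/Sub push endpoint from a deployed Agent Engine's
# resource name (e.g. the resource_name of the object returned by the
# Vertex AI SDK deployment call). Helper and sample values are illustrative.

def stream_query_endpoint(resource_name: str) -> str:
    """Build the streamQuery URL that Pub/Sub pushes transformed events to."""
    # resource_name has the form:
    #   projects/PROJECT/locations/LOCATION/reasoningEngines/ENGINE_ID
    parts = resource_name.split("/")
    location = parts[parts.index("locations") + 1]
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"{resource_name}:streamQuery"
    )

print(stream_query_endpoint(
    "projects/my-project/locations/us-central1/reasoningEngines/42"
))
# https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/reasoningEngines/42:streamQuery
```

This is the value you would paste in place of `https://YOUR_AGENT_WEBHOOK_URL` in the subscription command above.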
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/vertex_ai_agent_engine.max-1000x1000.png"
        
          alt="vertex_ai_agent_engine"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;To implement the reasoning loop,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; you define your agent, equipped with tools, and deploy it:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;investigation_agent = Agent(
    model="gemini-2.5-flash",
    name="fraud_investigation_agent",
    description="Expert fraud analyst agent that autonomously investigates alerts...",
    instruction=(
        "You are an expert fraud investigator for Cymbal Bank. "
        "Your goal is to investigate financial transaction alerts, "
        "determine if they are fraudulent, and take appropriate action. "
        "Use the BigQuery toolset to analyze data in the transactions table... "
        "Use the Google Search toolset to search for the merchant... "
        "Consolidate your findings and use the escalate_to_human tool if required..."
    ),
    tools=[
        bigquery_toolset,
        google_search,
    ],
)&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Equipped with specific instructions and this custom toolset, the agent autonomously investigates the alert by actively gathering external context. It can query BigQuery for a user’s transaction history, analyze unstructured data like receipts, or ground its findings with Google Search to verify a merchant's reputation. Ultimately, it categorizes the transaction as a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;FALSE_POSITIVE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or flags it as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ESCALATION_NEEDED&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Human-in-the-Loop Advantage&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This approach is central to the architecture's scalability. By effectively filtering out the noise, it dramatically reduces operational overhead and ensures that your investigators only spend their time on the most complex cases. And since ADK offers an impressive array of &lt;/span&gt;&lt;a href="https://google.github.io/adk-docs/integrations/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;tools and integrations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, you can have your agent escalate events to a wide range of enterprise systems for human-in-the-loop engagement, or even automate pipelines end-to-end with human-on-the-loop observability.&lt;/span&gt;&lt;/p&gt;
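For instance, the escalate_to_human tool referenced in the agent's instruction could be as simple as a plain Python function registered as an ADK tool. The sketch below is hypothetical: the parameters, verdict labels, and escalation side effects are placeholders for whichever enterprise system you integrate.

```python
# Hypothetical sketch of an escalate_to_human function tool.
# In ADK, a plain Python function with a docstring can be passed in the
# agent's tools list; everything below is illustrative, not a shipped API.

def escalate_to_human(transaction_id: str, verdict: str, summary: str) -> dict:
    """Escalate a flagged transaction to a human fraud investigator."""
    if verdict not in ("FALSE_POSITIVE", "ESCALATION_NEEDED"):
        return {"status": "error", "detail": f"unknown verdict {verdict!r}"}
    # A real implementation might open a ticket, publish to a
    # case-management topic, or page an on-call analyst here.
    return {
        "status": "ok",
        "case": {"id": transaction_id, "verdict": verdict, "summary": summary},
    }

result = escalate_to_human("txn-1042", "ESCALATION_NEEDED",
                           "Merchant reputation could not be verified.")
print(result["status"])  # ok
```

Returning a structured dict (rather than raising) lets the model read the tool's outcome and adjust its reasoning on the next turn.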
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Bringing it All Together: Agent Analytics&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once your pipeline is live, the work shifts from building to monitoring. Unlike traditional software, autonomous agents run persistently in the background. Because they operate behind the scenes, having deep observability into &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;what&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; they are doing, &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;how long&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; they take, and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;how much&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; they cost is critical.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;By initializing the &lt;/span&gt;&lt;a href="https://adk.dev/integrations/bigquery-agent-analytics/" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery Agent Analytics plugin&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; during deployment, the ADK automatically logs all trace data, tool usage, and execution latency directly into BigQuery:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/bigquery_results3.max-1000x1000.png"
        
          alt="bigquery_results3"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By joining this trace data with the structured decisions output by your agent, you unlock rich analytics. This enables you to build dynamic dashboards and set up custom alerts to monitor your AI workforce in real-time. You &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/adk-bigquery-agent-analytics-plugin" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;can check out this Codelab&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to learn more about using and the Agent Analytics Plugin.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The convergence of real-time data streaming and Agentic AI is changing how we handle operational alerts.&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detect in real-time&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with BigQuery continuous queries.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Transform and Route&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with Pub/Sub SMTs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Investigate and Resolve&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with Vertex AI Agent Engine.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Analyze&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with BigQuery Agent Analytics Plugin&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This architecture enables you to build a proactive, autonomous workforce capable of handling anomalies the moment they occur—all within a governed, scalable, and serverless Google Cloud environment.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to get hands-on?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://codelabs.developers.google.com/bigquery-adk-event-driven-agents" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Check out our codelab&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for a step-by-step guide on how to build this Cymbal Bank pipeline from scratch!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 10 Apr 2026 21:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/building-event-driven-data-agents-with-bigquery-pubsub-and-adk/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/blog_hero_image_final.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Building Event-Driven Data Agents with BigQuery, Pub/Sub, and ADK</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/blog_hero_image_final.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/building-event-driven-data-agents-with-bigquery-pubsub-and-adk/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Rachael Deacon-Smith</name><title>Developer Advocate, Google</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Nick Orlove</name><title>BigQuery Product Manager</title><department></department><company></company></author></item><item><title>Migrating to Google Cloud’s Application Load Balancer: A practical guide</title><link>https://cloud.google.com/blog/products/networking/migrate-on-prem-application-load-balancing-to-google-cloud/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Migrating your existing 
application load balancer infrastructure from an on-premises hardware solution to Cloud Load Balancing offers substantial advantages in scalability, cost-efficiency, and tight integration within the Google Cloud ecosystem. Yet, a fundamental question often arises: "What about our current load balancer configurations?"&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Existing on-premises load balancer configurations often contain years of business-critical logic for traffic manipulation. The good news is that not only can you fully migrate existing functionalities, but this migration also presents a significant opportunity to modernize and simplify your traffic management.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;This guide outlines a practical approach for migrating your existing load balancer to Google Cloud’s Application Load Balancer. It addresses common functionalities, leveraging both its declarative configurations and the innovative, event-driven Service Extensions edge compute capability.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;A simple, phased approach to migration&lt;/span&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Transitioning from an imperative, script-based system to a cloud-native, declarative-first model requires a structured plan. We recommend a straightforward, four-phase approach.&lt;/span&gt;&lt;/p&gt;
&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 1: Discovery and mapping&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Before commencing any migration, you must understand what you have. Analyze and categorize your current load balancer configurations. What is each rule's intent? Is it performing a simple HTTP-to-HTTPS redirect? Is it engaged in HTTP header manipulation (addition or removal)? Or is it handling complex, custom authentication logic? &lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Most configurations typically fall into two primary categories:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Common patterns:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Logic that is common to most web applications, such as redirects, URL rewrites, basic header manipulation, and IP-based access control lists (ACLs).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bespoke business logic:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Complex logic unique to your application, like custom proprietary token authentication, advanced header extraction / replacement, dynamic backend selection based on HTTP attributes, or HTTP response body manipulation. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 2: Choose your Google Cloud equivalent&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once your rules are categorized, the next step involves mapping them to the appropriate Google Cloud feature. This is not a one-to-one replacement; it's a strategic choice.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Option 1: the declarative path (for ~80% of rules)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;For the majority of common patterns, leveraging the Application Load Balancer's built-in declarative features is usually the best approach. Instead of a script, you define the desired state in a configuration file. This is simpler to manage, version-control, and scale.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Common patterns to declarative feature mapping:  &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="3" style="list-style-type: square; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Redirects/rewrites&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; -&amp;gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Application Load Balancer URL maps&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="3" style="list-style-type: square; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;ACLs/throttling&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; -&amp;gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud Armor security policies&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="3" style="list-style-type: square; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Session persistence&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; -&amp;gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;backend service configuration&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Option 2: The programmatic path (for complex, bespoke rules)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When dealing with complex, bespoke business logic, you have a programmatic equivalent: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/service-extensions/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Service Extensions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a powerful edge compute capability that allows you to inject custom code (written in Rust, C++ or Go) directly into the load balancer's data path. This approach gives you flexibility in a modern, managed, and high-performance framework.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_bkebSe1.max-1000x1000.jpg"
        
          alt="image1"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="s1mli"&gt;This flowchart helps you decide the appropriate Google Cloud feature for each configuration&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 3: Test and validate&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once you’ve chosen the appropriate path for your configurations, you are ready to &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;deploy your new Application Load Balancer configuration in a staging environment that mirrors your production setup. Thoroughly test all application functionality, paying close attention to the migrated logic. Use a combination of automated testing and manual QA to validate the redirects, security policies, and that the custom Service Extensions logic are behaving as expected.&lt;/span&gt;&lt;/p&gt;
&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 4: Phased cutover (canary deployment)&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Don't flip a single switch for all your traffic; instead, implement a phased migration strategy. Start the transitioning process by routing a small percentage of production traffic (e.g., 5-10%) to your new Google Cloud load balancer. During this initial period, be sure to monitor key metrics like latency, error rates, and application performance. As you gain confidence, you can progressively increase the percentage of traffic routed to the Application Load Balancer. Always have a clear rollback plan to revert back to the legacy infrastructure in the event you encounter critical issues.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Best practices for a smooth migration&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Drawing from our practical experience, we have compiled the following recommendations to assist you in planning your load balancer migrations. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Analyze first, migrate second:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A thorough analysis of your existing configurations is the most critical step. Don't "lift and shift" logic that is no longer needed.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Prefer declarative:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Always default to Google Cloud's managed, declarative features (URL Maps, Cloud Armor) first. They are simpler, more scalable, and require less maintenance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Use Service Extensions strategically:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Reserve Service Extensions for the complex, bespoke business logic that declarative features cannot handle.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Monitor everything:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Continuously monitor both your existing load balancers and Google Cloud load balancers during the migration. Watch key metrics like traffic volume, latency, and error rates to detect and address issues instantly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Train your team:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Ensure your team is trained on Cloud Load Balancing concepts. This will empower them to effectively operate and maintain the new infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Migrating from the existing on-premises load balancer infrastructure is more than just a technical task, it's an opportunity to modernize your application delivery. By thoughtfully mapping your current load balancing configurations and capabilities to either declarative Application Load Balancer features or programmatic Service Extensions, you can build a more scalable, resilient, and cost-effective infrastructure destined for future demands.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get started, review the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/application-load-balancer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Application Load Balancer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/service-extensions/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Service Extensions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; features and advanced capabilities to come up with the right design for your application. For more guidance and complex use cases, contact your &lt;/span&gt;&lt;a href="https://cloud.google.com/contact"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud team&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 10 Apr 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/migrate-on-prem-application-load-balancing-to-google-cloud/</guid><category>Cloud Migration</category><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Migrating to Google Cloud’s Application Load Balancer: A practical guide</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/migrate-on-prem-application-load-balancing-to-google-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Gopinath Balakrishnan</name><title>Customer Engineer, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Xiaozang Li</name><title>Customer Engineer, Google 
Cloud</title><department></department><company></company></author></item><item><title>Create Expert Content: Local Testing of a Multi-Agent System with Memory</title><link>https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In support of our mission to accelerate the developer journey on Google Cloud, we built Dev Signal: a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1"&gt;part 1&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run?utm_campaign=CDR_0x91b1edb5_default_b8022895&amp;amp;utm_medium=external&amp;amp;utm_source=social"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;part 2&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of this series, we established the essential groundwork by standardizing the core capabilities through the Model Context Protocol (MCP) and constructing a multi-agent architecture integrated with the Vertex AI memory bank to provide long-term intelligence and persistence. Now, we'll explore how to test your multi-agent system locally!&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you’d like to dive straight into the code and explore it at your own pace, you can clone the repository &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Testing the Agent Locally&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before transitioning your agentic system to Google Cloud Run, it is essential to ensure that its specialized components work seamlessly together on your workstation. This testing phase allows you to validate trend discovery, technical grounding, and creative drafting within a local feedback loop, saving time and resources during the development process.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this section, you will configure your local secrets, implement environment-aware utilities, and use a dedicated test runner to verify that Dev Signal can correctly retrieve user preferences from the Vertex AI memory bank on the cloud. This local verification ensures that your agent's "brain" and "hands" are properly synchronized before moving to deployment.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Environment Setup&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;.env&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file in your project root. These variables are used for local development and will be replaced by Terraform/Secret Manager in production.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev-signal/.env&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and update it with your own details.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Note&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GOOGLE_CLOUD_LOCATION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is set to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;global&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; because that is where Gemini-3-flash-preview is supported. We will use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GOOGLE_CLOUD_LOCATION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for the model location.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Google Cloud Configuration\r\nGOOGLE_CLOUD_PROJECT=your-project-id\r\nGOOGLE_CLOUD_LOCATION=global\r\nGOOGLE_CLOUD_REGION=us-central1\r\nGOOGLE_GENAI_USE_VERTEXAI=True\r\nAI_ASSETS_BUCKET=your_bucket_name\r\n\r\n# Reddit API Credentials\r\nREDDIT_CLIENT_ID=your_client_id\r\nREDDIT_CLIENT_SECRET=your_client_secret\r\nREDDIT_USER_AGENT=my-agent/0.1\r\n\r\n# Developer Knowledge API Key\r\nDK_API_KEY=your_api_key&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d2fa0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
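As a minimal illustration (a sketch, not python-dotenv's actual implementation), the work that `load_dotenv()` performs on a file like the one above can be approximated as: parse `KEY=VALUE` lines, skip comments and blanks, and populate `os.environ` without clobbering variables that are already set.

```python
import os

def load_env_text(text: str) -> dict:
    """Sketch of load_dotenv(): parse KEY=VALUE lines, skip comments and
    blank lines, and set each variable in os.environ only if it is not
    already defined. Returns the parsed values as a dictionary."""
    loaded = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Existing environment variables take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded

# Two of the variables from the .env file above:
env = load_env_text(
    "# Google Cloud Configuration\n"
    "GOOGLE_CLOUD_LOCATION=global\n"
    "GOOGLE_CLOUD_REGION=us-central1\n"
)
```

This precedence (real environment wins over the `.env` file) is what lets the same code run unchanged in production, where Terraform/Secret Manager supplies the values instead.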
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Helper Utilities&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a new directory for your application utils.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;cd dev_signal_agent\r\nmkdir app_utils\r\ncd app_utils&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d20d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Environment configuration &lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This module standardizes how the agent discovers the active Google Cloud Project and Region, ensuring a seamless transition between development environments. Using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;load_dotenv()&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the script first checks for local configurations before falling back to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google.auth.default()&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or environment variables to retrieve the Project ID. This automated approach ensures your agent is properly authenticated and grounded in the correct cloud context without requiring manual configuration changes.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond basic project discovery, the script provides a robust &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Secret Management&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; layer. It attempts to resolve sensitive credentials, such as Reddit API keys, first from the local environment (for rapid development) and then dynamically from the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/reference/rest?rep_location=me-central2&amp;amp;utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Secret Manager API&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for production security. By returning these as a dictionary rather than injecting them into environment variables, the module maintains a clean security posture.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The script further calibrates the environment by distinguishing between global and regional requirements for different AI services. It specifically assigns the "global" location for models to access cutting-edge preview features while designating a regional location, such as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;us-central1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, for infrastructure like the Vertex AI Agent Engine. By finalizing this setup with a global SDK initialization, the module integrates these settings into the session, allowing the rest of your application to interact with models and memory banks without having to repeatedly pass project or location parameters.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code into &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/app_utils/env.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import os\r\nimport google.auth\r\nimport vertexai\r\nfrom google.cloud import secretmanager\r\nfrom dotenv import load_dotenv\r\n\r\ndef _fetch_secrets(project_id: str):\r\n    &amp;quot;&amp;quot;&amp;quot;Fetch secrets from Secret Manager and return them as a dictionary.&amp;quot;&amp;quot;&amp;quot;\r\n    secrets_to_fetch = [&amp;quot;REDDIT_CLIENT_ID&amp;quot;, &amp;quot;REDDIT_CLIENT_SECRET&amp;quot;, &amp;quot;REDDIT_USER_AGENT&amp;quot;, &amp;quot;DK_API_KEY&amp;quot;]\r\n    fetched_secrets = {}\r\n\r\n    # First, check local environment (for local development via .env)\r\n    for s in secrets_to_fetch:\r\n        val = os.getenv(s)\r\n        if val:\r\n            fetched_secrets[s] = val\r\n\r\n    # If keys are missing (common in production), fetch from Secret Manager API\r\n    if len(fetched_secrets) &amp;lt; len(secrets_to_fetch):\r\n        client = secretmanager.SecretManagerServiceClient()\r\n        for secret_id in secrets_to_fetch:\r\n            if secret_id not in fetched_secrets:\r\n                name = f&amp;quot;projects/{project_id}/secrets/{secret_id}/versions/latest&amp;quot;\r\n                try:\r\n                    response = client.access_secret_version(request={&amp;quot;name&amp;quot;: name})\r\n                    # DO NOT set os.environ[secret_id] here.\r\n                    # Keep it in this dictionary only.\r\n                    fetched_secrets[secret_id] = response.payload.data.decode(&amp;quot;UTF-8&amp;quot;)\r\n                except Exception as e:\r\n                    print(f&amp;quot;Warning: Could not fetch {secret_id} from Secret Manager: {e}&amp;quot;)\r\n\r\n    return fetched_secrets\r\n\r\ndef init_environment():\r\n    &amp;quot;&amp;quot;&amp;quot;Consolidated environment discovery.&amp;quot;&amp;quot;&amp;quot;\r\n    load_dotenv()\r\n    try:\r\n        _, project_id = google.auth.default()\r\n    except Exception:\r\n        project_id = os.getenv(&amp;quot;GOOGLE_CLOUD_PROJECT&amp;quot;)\r\n    \r\n    model_location = os.getenv(&amp;quot;GOOGLE_CLOUD_LOCATION&amp;quot;, &amp;quot;global&amp;quot;)\r\n    service_location = os.getenv(&amp;quot;GOOGLE_CLOUD_REGION&amp;quot;, &amp;quot;us-central1&amp;quot;)\r\n    \r\n    secrets = {}\r\n    if project_id:\r\n        vertexai.init(project=project_id, location=service_location)\r\n        # Fetch secrets into a local variable\r\n        secrets = _fetch_secrets(project_id)\r\n        \r\n    return project_id, model_location, service_location, secrets&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d2e50&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
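The local-first precedence used by `_fetch_secrets` (take what the environment already provides, call Secret Manager only for what is missing, and log failures without crashing) can be sketched in isolation. Here a simple callable stands in for the Secret Manager client; the stub and its return values are purely illustrative.

```python
def resolve_secrets(names, local_env, fetch_remote):
    """Local-first secret resolution: values present in local_env
    (e.g. os.environ populated from .env) win; fetch_remote, standing in
    for a Secret Manager lookup, is called only for missing keys.
    A failed remote fetch logs a warning instead of raising."""
    resolved = {name: local_env[name] for name in names if local_env.get(name)}
    for name in names:
        if name not in resolved:
            try:
                resolved[name] = fetch_remote(name)
            except Exception as exc:
                print(f"Warning: could not fetch {name}: {exc}")
    return resolved

# Locally, REDDIT_CLIENT_ID comes from .env; DK_API_KEY falls back to the stub.
secrets = resolve_secrets(
    ["REDDIT_CLIENT_ID", "DK_API_KEY"],
    {"REDDIT_CLIENT_ID": "local-id"},
    lambda name: f"remote-{name}",
)
```

Returning the resolved values as a dictionary, rather than writing them back into `os.environ`, keeps credentials scoped to the code that actually needs them.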
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Local testing script&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Google ADK comes with a built-in Web UI, which is excellent for visualizing agent logic and tool composition.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can launch it by running in the project root:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;uv run adk web&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d2a30&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;However, the default Web UI will not test the long-term memory integration described in this tutorial because it is not pre-connected to a Vertex AI memory session. By default, the generic UI often relies on in-memory services that do not persist data across sessions. Therefore, we use the dedicated &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;test_local.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; script to explicitly initialize the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;VertexAiMemoryBankService&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. This ensures that even in a local environment, your agent is communicating with the real cloud-based memory bank to validate preference persistence.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;test_local.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; script:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Connects to the real &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/overview?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI Agent Engine&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in the cloud for memory storage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Uses an in-memory session service for local chat history (so you can wipe it easily).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Run a chat loop where you can talk to your agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Go back to the root folder &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev-signal&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;cd ../..&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d26d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev-signal&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/test_local.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import asyncio\r\nimport os\r\nimport google.auth\r\nimport vertexai\r\nimport uuid\r\nfrom dotenv import load_dotenv\r\nfrom google.adk.runners import Runner\r\nfrom google.adk.memory.vertex_ai_memory_bank_service import VertexAiMemoryBankService\r\nfrom google.adk.sessions import InMemorySessionService\r\nfrom vertexai import agent_engines\r\nfrom google.genai import types\r\nfrom dev_signal_agent.agent import root_agent\r\n\r\n# Load environment variables\r\nload_dotenv()\r\n\r\nasync def main():\r\n    # 1. Setup Configuration\r\n    project_id = os.getenv(&amp;quot;GOOGLE_CLOUD_PROJECT&amp;quot;)\r\n    # Agent Engine (Memory) MUST use a regional endpoint\r\n    resource_location = &amp;quot;us-central1&amp;quot;\r\n    agent_name = &amp;quot;dev-signal&amp;quot;\r\n    \r\n    print(f&amp;quot;--- Initializing Vertex AI in {resource_location} ---&amp;quot;)\r\n    vertexai.init(project=project_id, location=resource_location)\r\n\r\n    # 2. Find the Agent Engine Resource for Memory\r\n    existing_agents = list(agent_engines.list(filter=f&amp;quot;display_name={agent_name}&amp;quot;))\r\n    if existing_agents:\r\n        agent_engine = existing_agents[0]\r\n        agent_engine_id = agent_engine.resource_name.split(&amp;quot;/&amp;quot;)[-1]\r\n        print(f&amp;quot;✅ Using persistent Memory Bank from Agent: {agent_engine_id}&amp;quot;)\r\n    else:\r\n        print(f&amp;quot;❌ Error: Agent Engine \&amp;#x27;{agent_name}\&amp;#x27; not found. Please deploy with Terraform first.&amp;quot;)\r\n        return\r\n\r\n    # 3. Initialize Services\r\n    # We use InMemorySessionService for easier local testing (IDs are flexible)\r\n    # BUT we use VertexAiMemoryBankService for REAL cloud persistence\r\n    session_service = InMemorySessionService()\r\n    \r\n    memory_service = VertexAiMemoryBankService(\r\n        project=project_id,\r\n        location=resource_location,\r\n        agent_engine_id=agent_engine_id\r\n    )\r\n\r\n    # 4. Create a Runner\r\n    runner = Runner(\r\n        agent=root_agent,\r\n        app_name=&amp;quot;dev-signal&amp;quot;,\r\n        session_service=session_service,\r\n        memory_service=memory_service \r\n    )\r\n\r\n    # 5. Run a Test Loop\r\n    user_id = &amp;quot;local-tester&amp;quot;\r\n    \r\n    print(&amp;quot;\\n--- TEST SCENARIO ---&amp;quot;)\r\n    print(&amp;quot;1. Start a session, tell the agent your preference (e.g., \&amp;#x27;write in rhymes\&amp;#x27;).&amp;quot;)\r\n    print(&amp;quot;2. Type \&amp;#x27;new\&amp;#x27; to start a FRESH session (local state wiped).&amp;quot;)\r\n    print(&amp;quot;3. Ask for a blog post. The agent should retrieve your preference from the CLOUD memory.&amp;quot;)\r\n    \r\n    current_session_id = f&amp;quot;session-{str(uuid.uuid4())[:8]}&amp;quot;\r\n    await session_service.create_session(\r\n        app_name=&amp;quot;dev-signal&amp;quot;,\r\n        user_id=user_id,\r\n        session_id=current_session_id\r\n    )\r\n    print(f&amp;quot;\\n--- Chat Session (ID: {current_session_id}) ---&amp;quot;)\r\n\r\n    while True:\r\n        user_input = input(&amp;quot;\\nYou: &amp;quot;)\r\n        \r\n        if user_input.lower() in [&amp;quot;exit&amp;quot;, &amp;quot;quit&amp;quot;]:\r\n            break\r\n            \r\n        if user_input.lower() == &amp;quot;new&amp;quot;:\r\n            # Simulate starting a completely fresh session\r\n            current_session_id = f&amp;quot;session-{str(uuid.uuid4())[:8]}&amp;quot;\r\n            await session_service.create_session(\r\n                app_name=&amp;quot;dev-signal&amp;quot;,\r\n                user_id=user_id,\r\n                session_id=current_session_id\r\n            )\r\n            print(f&amp;quot;\\n--- Fresh Session Started (ID: {current_session_id}) ---&amp;quot;)\r\n            print(&amp;quot;(Local history is empty, retrieval must come from Memory Bank)&amp;quot;)\r\n            continue\r\n\r\n        print(&amp;quot;Agent is thinking...&amp;quot;)\r\n        async for event in runner.run_async(\r\n            user_id=user_id,\r\n            session_id=current_session_id,\r\n            new_message=types.Content(parts=[types.Part(text=user_input)])\r\n        ):\r\n            if event.content and event.content.parts:\r\n                for part in event.content.parts:\r\n                    if part.text:\r\n                        print(f&amp;quot;Agent: {part.text}&amp;quot;)\r\n            \r\n            if event.get_function_calls():\r\n                for fc in event.get_function_calls():\r\n                    print(f&amp;quot;Tool Call: {fc.name}&amp;quot;)\r\n\r\nif __name__ == &amp;quot;__main__&amp;quot;:\r\n    asyncio.run(main())&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d2ee0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Running the Test&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, ensure you have your Application Default Credentials set up:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud auth application-default login&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d27c0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Then run the script:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;uv run test_local.py&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f72714d2be0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;section id="test-scenario"&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Test Scenario&lt;/span&gt;&lt;/h2&gt;
&lt;/section&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This scenario validates the full end-to-end lifecycle of the agent: from discovery and research to multimodal content creation and long-term memory retrieval.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;1: Teaching &amp;amp; Multimodal Creation (Session 1)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;Goal: Establish technical context and set a specific stylistic preference.&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;
&lt;h4 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Discovery&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/h4&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Ask the agent to find trending Cloud Run topics.&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Input&lt;/span&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;"Find high-engagement questions about AI agents on Cloud Run from the last 21 days."&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/test1.max-1000x1000.png"
        
          alt="test1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/test2.max-1000x1000.png"
        
          alt="test2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Research&lt;/span&gt;&lt;/h4&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Instruct the agent to perform a deep dive on a specific result.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Input&lt;/span&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;"Use the GCP Expert to research topic #1."&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/test3.max-1000x1000.png"
        
          alt="test3"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Personalization&lt;/span&gt;&lt;/h4&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Request a blog post and explicitly set your style preference.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Input&lt;/span&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;"Draft a blog post based on this research. From now on, I want all my technical blogs written in the style of a 90s Rap Song."&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/test4.max-1000x1000.png"
        
          alt="test4"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Image generation&lt;/span&gt;&lt;/h4&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Ask the agent to generate an image that demonstrates the main ideas in the blog using the Nano Banana Pro tool. The image would be saved to your bucket in Google Cloud and you should get the path to see it which will look like this: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://storage.mtls.cloud.google.com/...&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
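Assuming the asset tool hands back a gs:// object URI (an assumption for illustration; the bucket and object names below are hypothetical), the mapping to that browsable URL is a simple path rewrite:

```python
def gcs_to_browser_url(gs_uri: str) -> str:
    """Rewrite a gs://bucket/object URI into the authenticated browser
    URL of the form https://storage.mtls.cloud.google.com/bucket/object."""
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {gs_uri}")
    return "https://storage.mtls.cloud.google.com/" + gs_uri[len(prefix):]

# Hypothetical bucket and object names:
url = gcs_to_browser_url("gs://your_bucket_name/blog/hero.png")
```

Opening the resulting URL in a browser requires that your Google account has read access to the bucket.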
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/tokenoptimization.max-1000x1000.png"
        
          alt="tokenoptimization"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;2: Long-Term Memory Recall (Session 2)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;Goal: Verify the agent recalls preferences across a completely fresh session.&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Type &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;new&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in the console to wipe local session history and start a fresh state.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Retrieval) Inquire about your stored preferences to test the Vertex AI memory bank.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ol&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Input&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;"What are my current topics of interest and what is my preferred blogging style?"&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Verification: Confirm the agent successfully retrieves your "AI Agents on Cloud Run" interest and "Rap" style from the cloud.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/test5.max-1000x1000.png"
        
          alt="test5"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Final Test&lt;/strong&gt;: Ask for a new blog on a different topic (e.g., "GKE Autopilot") and ensure it is automatically written as a rap song without being prompted.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Summary&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this part of our series we focused on verifying the agent's functionality in a local environment before proceeding to cloud deployment. By configuring local secrets and utilizing environment-aware utilities, we used a dedicated test runner to confirm that the core reasoning and tool logic are properly integrated. We successfully validated the full lifecycle: from Reddit discovery to expert content creation, confirming that the agent correctly retrieves preferences from the cloud-based Vertex AI memory bank even in completely fresh sessions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to run the test scenario yourself? Clone the &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and try the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;test_local.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; script to see 'Dev Signal' retrieve your preferences from the Vertex AI memory bank in real-time. For a deeper dive into the underlying mechanics of memory orchestration, check out this &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/quickstart-adk?content_ref=manage%20long%20term%20memories%20for%20you%20this%20tutorial%20demonstrates%20how%20you%20can%20use%20memory%20bank%20with%20the%20adk%20to%20manage%20long%20term%20memories%20create%20your%20local%20adk%20agent%20and%20runner&amp;amp;utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;quickstart guide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the final part of this series, we will transition our prototype into a production service on Google Cloud Run using Terraform for secure infrastructure, and explore the roadmap to production excellence through continuous evaluation and security.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Special thanks to &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Remigiusz Samborski&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; for the helpful review and feedback on this article.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;For more content like this, Follow me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Linkedin&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://x.com/shirmeir86?lang=en" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 10 Apr 2026 08:11:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Create Expert Content: Local Testing of a Multi-Agent System with Memory</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI, Product DevRel</title><department></department><company></company></author></item><item><title>Experimenting with GPUs: GKE managed DRANET and Inference Gateway AI 
Deployment</title><link>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-gpus-gke-managed-dranet-and-inference-gateway-ai-deployment/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building and serving models on infrastructure is a strong use case for businesses. In Google Cloud, you have the ability to design your AI infrastructure to suit your workloads. Recently, I experimented with Google Kubernetes Engine &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;(GKE) managed DRANET&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; while deploying a model for inference with NVIDIA B200 GPUs on GKE. In this blog, we will explore this setup in easy to follow steps.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;What is DRANET?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Dynamic Resource Allocation (DRA)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is a Kubernetes feature that lets you request and share resources among Pods. DRANET builds on DRA to let you request and allocate networking resources for your Pods, including network interfaces that support TPUs and Remote Direct Memory Access (RDMA). In my case, that meant the RDMA-capable interfaces used by high-end GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;How GPU RDMA VPC works &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/rdma-network-profiles#overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RDMA network&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is set up as an isolated VPC, which is regional and assigned a network profile type; in this case, the profile type is RoCEv2. This VPC is dedicated to GPU-to-GPU communication. The GPU VM families have RDMA-capable NICs that connect to the RDMA VPC, and the GPUs on multiple nodes communicate over this low-latency, high-speed, rail-aligned fabric.&lt;/span&gt;&lt;/p&gt;
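&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;GKE managed DRANET creates this RDMA VPC for you. Purely as an illustration of what that automation involves, the sketch below composes the equivalent gcloud commands as strings; the resource names and the network-profile name format are assumptions, and the commands are echoed rather than executed so the sketch runs anywhere.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```shell
#!/usr/bin/env bash
# Illustrative only: GKE managed DRANET normally creates the RDMA VPC automatically.
# Resource names and the RoCEv2 network-profile name format are assumptions.
REGION="us-central1"
RDMA_PROFILE="${REGION}-b-vpc-roce"   # an assumed RoCEv2 network-profile name
RDMA_VPC="gpu-rdma-net"

# Compose the commands as strings; echoed, not executed, so no project is needed.
CREATE_VPC="gcloud compute networks create ${RDMA_VPC} \
  --network-profile=${RDMA_PROFILE} --subnet-mode=custom"
CREATE_SUBNET="gcloud compute networks subnets create ${RDMA_VPC}-sub \
  --network=${RDMA_VPC} --region=${REGION} --range=10.10.0.0/16"

echo "${CREATE_VPC}"
echo "${CREATE_SUBNET}"
```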
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Design pattern example&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our aim was to deploy an LLM (DeepSeek) onto a GKE cluster with &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/compute/docs/accelerator-optimized-machines#a4-vms"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A4 nodes&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that support 8 B200 GPUs, and serve it privately via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. To set up an &lt;a href="https://docs.cloud.google.com/ai-hypercomputer/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt; GKE cluster you can use the Cluster Toolkit, but in my case I wanted to test the &lt;span style="vertical-align: baseline;"&gt;GKE managed &lt;/span&gt;DRANET dynamic setup of the networking that supports RDMA for GPU communication.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-archgpu.max-1000x1000.png"
        
          alt="1-archgpu"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This design utilizes the following services to provide an end-to-end solution:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;VPC:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Three VPCs in total: one created manually, plus two created automatically by &lt;span style="vertical-align: baseline;"&gt;GKE managed &lt;/span&gt;DRANET (one standard and one for RDMA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To deploy the workload.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE Inference Gateway:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To expose the workload internally using a regional internal Application Load Balancer of type gke-l7-rilb.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A4 VMs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; These support RoCEv2 and provide 8 NVIDIA B200 GPUs per VM.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Putting it together &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get access to the A4 VMs, a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/ai-hypercomputer/docs/consumption-models#comparison"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;future reservation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; was used; the reservation is linked to a specific zone.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Begin:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Set up the environment &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/create-modify-vpc-networks#create-custom-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;standard VPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, with firewall rules and subnet in the same zone as the reservation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;proxy-only subnet&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;; this will be used by the internal regional Application Load Balancer attached to the GKE Inference Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
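&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a sketch of the two steps above, the commands below are composed as strings and echoed rather than executed; the names, region, and IP range are assumptions for illustration.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```shell
#!/usr/bin/env bash
# Sketch of the environment setup above; names, region, and ranges are assumptions.
REGION="us-central1"
GVNIC_NETWORK_PREFIX="dranet-demo"

# Standard VPC for the cluster (firewall rules and the node subnet would follow).
CREATE_VPC="gcloud compute networks create ${GVNIC_NETWORK_PREFIX}-main \
  --subnet-mode=custom"

# Proxy-only subnet for the regional internal Application Load Balancer.
CREATE_PROXY_SUBNET="gcloud compute networks subnets create ${GVNIC_NETWORK_PREFIX}-proxy \
  --network=${GVNIC_NETWORK_PREFIX}-main --region=${REGION} \
  --range=10.129.0.0/23 --purpose=REGIONAL_MANAGED_PROXY --role=ACTIVE"

# Echoed rather than executed so the sketch runs without a Google Cloud project.
echo "${CREATE_VPC}"
echo "${CREATE_PROXY_SUBNET}"
```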
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Create a standard GKE cluster with a default node pool.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud container clusters create $CLUSTER_NAME \\\r\n    --location=$ZONE \\\r\n    --num-nodes=1 \\\r\n    --machine-type=e2-standard-16 \\\r\n    --network=${GVNIC_NETWORK_PREFIX}-main \\\r\n    --subnetwork=${GVNIC_NETWORK_PREFIX}-sub \\\r\n    --release-channel rapid \\\r\n    --enable-dataplane-v2 \\\r\n    --enable-ip-alias \\\r\n    --addons=HttpLoadBalancing,RayOperator \\\r\n    --gateway-api=standard \\\r\n    --enable-ray-cluster-logging \\\r\n    --enable-ray-cluster-monitoring \\\r\n    --enable-managed-prometheus \\\r\n    --enable-dataplane-v2-metrics \\\r\n    --monitoring=SYSTEM&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5b0a0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once that is complete you can connect to your cluster:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud container clusters get-credentials $CLUSTER_NAME --zone $ZONE --project $PROJECT&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5ba60&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#enable-dra-driver-gpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GPU node pool&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (this example uses, A4 VM with reservation) and additionals flags: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;code style="vertical-align: baseline;"&gt;--accelerator-network-profile=auto&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (GKE automatically adds the gke.networks.io/accelerator-network-profile: auto label to the nodes)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;code style="vertical-align: baseline;"&gt;--node-labels=cloud.google.com/gke-networking-dra-driver=true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (enables DRA for high-performance networking)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud beta container node-pools create $NODE_POOL_NAME \\\r\n  --cluster $CLUSTER_NAME \\\r\n  --location $ZONE \\\r\n  --node-locations $ZONE \\\r\n  --machine-type a4-highgpu-8g \\\r\n  --accelerator type=nvidia-b200,count=8,gpu-driver-version=latest \\\r\n  --enable-autoscaling --num-nodes=1 --total-min-nodes=1 --total-max-nodes=3 \\\r\n  --reservation-affinity=specific \\\r\n--reservation=projects/$PROJECT/reservations/$RESERVATION_NAME/reservationBlocks/$BLOCK_NAME \\\r\n   --accelerator-network-profile=auto \\\r\n--node-labels=cloud.google.com/gke-networking-dra-driver=true&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5bc40&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Create a ResourceClaimTemplate, which will be used to attach the networking resources to your deployments. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deviceClassName: mrdma.google.com &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;is used for GPU workloads:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;apiVersion: resource.k8s.io/v1\r\nkind: ResourceClaimTemplate\r\nmetadata:\r\n  name: all-mrdma\r\nspec:\r\n  spec:\r\n    devices:\r\n      requests:\r\n      - name: req-mrdma\r\n        exactly:\r\n          deviceClassName: mrdma.google.com\r\n          allocationMode: All&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5b490&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy model and inference&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Now that the cluster and node pool are set up,&lt;/span&gt; we can deploy a model and serve it via the Inference Gateway. In my experiment I used DeepSeek, but this could be any model.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy model and services&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt; nodeSelector: gke.networks.io/accelerator-network-profile: auto &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;is used to schedule the Pods onto the GPU nodes&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt; resourceClaims: &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;field attaches the networking resource claim we defined earlier&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
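&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a minimal sketch, the two bullets above reduce to the following Pod-spec excerpt; the full Deployment below carries the complete container spec.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```yaml
# Excerpt only: the fields that wire a Pod to DRANET-managed networking.
spec:
  nodeSelector:
    gke.networks.io/accelerator-network-profile: auto   # land on the GPU nodes
  resourceClaims:
  - name: rdma-claim
    resourceClaimTemplateName: all-mrdma   # the ResourceClaimTemplate created earlier
  containers:
  - name: vllm-inference
    resources:
      claims:
      - name: rdma-claim   # attach the claimed RDMA devices to this container
```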
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Create a secret (&lt;/span&gt;&lt;a href="https://huggingface.co/docs/hub/security-tokens#how-to-manage-user-access-tokens" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;I used a Hugging Face&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt; token)&lt;/strong&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl create secret generic hf-secret \\\r\n  --from-literal=hf_token=${HF_TOKEN}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5b160&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Deployment&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;apiVersion: apps/v1\r\nkind: Deployment\r\nmetadata:\r\n  name: deepseek-v3-1-deploy\r\nspec:\r\n  replicas: 1\r\n  selector:\r\n    matchLabels:\r\n      app: deepseek-v3-1\r\n  template:\r\n    metadata:\r\n      labels:\r\n        app: deepseek-v3-1\r\n        ai.gke.io/model: deepseek-v3-1\r\n        ai.gke.io/inference-server: vllm\r\n        examples.ai.gke.io/source: user-guide\r\n    spec:\r\n      containers:\r\n      - name: vllm-inference\r\n        image: us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20250819_0916_RC01\r\n        resources:\r\n          requests:\r\n            cpu: &amp;quot;190&amp;quot;\r\n            memory: &amp;quot;1800Gi&amp;quot;\r\n            ephemeral-storage: &amp;quot;1Ti&amp;quot;\r\n            nvidia.com/gpu: &amp;quot;8&amp;quot;\r\n          limits:\r\n            cpu: &amp;quot;190&amp;quot;\r\n            memory: &amp;quot;1800Gi&amp;quot;\r\n            ephemeral-storage: &amp;quot;1Ti&amp;quot;\r\n            nvidia.com/gpu: &amp;quot;8&amp;quot;\r\n          claims:\r\n          - name: rdma-claim\r\n        command: [&amp;quot;python3&amp;quot;, &amp;quot;-m&amp;quot;, &amp;quot;vllm.entrypoints.openai.api_server&amp;quot;]\r\n        args:\r\n        - --model=$(MODEL_ID)\r\n        - --tensor-parallel-size=8\r\n        - --host=0.0.0.0\r\n        - --port=8000\r\n        - --max-model-len=32768\r\n        - --max-num-seqs=32\r\n        - --gpu-memory-utilization=0.90\r\n        - --enable-chunked-prefill\r\n        - --enforce-eager\r\n        - --trust-remote-code\r\n        env:\r\n        - name: MODEL_ID\r\n          value: deepseek-ai/DeepSeek-V3.1\r\n        - name: HUGGING_FACE_HUB_TOKEN\r\n          valueFrom:\r\n            secretKeyRef:\r\n              name: hf-secret\r\n              key: hf_token\r\n        volumeMounts:\r\n        - mountPath: /dev/shm\r\n          name: 
dshm\r\n        livenessProbe:\r\n          httpGet:\r\n            path: /health\r\n            port: 8000\r\n          initialDelaySeconds: 1800\r\n          periodSeconds: 10\r\n        readinessProbe:\r\n          httpGet:\r\n            path: /health\r\n            port: 8000\r\n          initialDelaySeconds: 1800\r\n          periodSeconds: 5\r\n      volumes:\r\n      - name: dshm\r\n        emptyDir:\r\n            medium: Memory\r\n      nodeSelector:\r\n        gke.networks.io/accelerator-network-profile: auto\r\n      resourceClaims:\r\n      - name: rdma-claim\r\n        resourceClaimTemplateName: all-mrdma\r\n---\r\napiVersion: v1\r\nkind: Service\r\nmetadata:\r\n  name: deepseek-v3-1-service\r\nspec:\r\n  selector:\r\n    app: deepseek-v3-1\r\n  type: ClusterIP\r\n  ports:\r\n    - protocol: TCP\r\n      port: 8000\r\n      targetPort: 8000\r\n---\r\napiVersion: monitoring.googleapis.com/v1\r\nkind: PodMonitoring\r\nmetadata:\r\n  name: deepseek-v3-1-monitoring\r\nspec:\r\n  selector:\r\n    matchLabels:\r\n      app: deepseek-v3-1\r\n  endpoints:\r\n  - port: 8000\r\n    path: /metrics\r\n    interval: 30s&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5b5b0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy GKE Inference Gateway&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway#prepare-environment"&gt;installs the needed Custom Resource Definitions (CRDs) in your GKE cluster:&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For GKE versions &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;1.34.0-gke.1626000&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or later, install only the alpha &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; CRD:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/v1.0.0/config/crd/bases/inference.networking.x-k8s.io_inferenceobjectives.yaml&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5b1c0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Create Inference pool  &lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;helm install deepseek-v3-pool \\\r\n  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool \\\r\n  --version v1.0.1 \\\r\n  --set inferencePool.modelServers.matchLabels.app=deepseek-v3-1 \\\r\n  --set provider.name=gke \\\r\n  --set inferenceExtension.monitoring.gke.enabled=true&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7273d5b8e0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Create the Gateway, HTTPRoute and InferenceObjective&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# 1. The Regional Internal Gateway (ILB)\r\napiVersion: gateway.networking.k8s.io/v1\r\nkind: Gateway\r\nmetadata:\r\n  name: deepseek-v3-gateway\r\n  namespace: default\r\nspec:\r\n  gatewayClassName: gke-l7-rilb\r\n  listeners:\r\n  - name: http\r\n    protocol: HTTP\r\n    port: 80\r\n    allowedRoutes:\r\n      namespaces:\r\n        from: Same\r\n---\r\n# 2. The HTTPRoute (Routing to the Pool)\r\napiVersion: gateway.networking.k8s.io/v1\r\nkind: HTTPRoute\r\nmetadata:\r\n  name: deepseek-v3-route\r\n  namespace: default\r\nspec:\r\n  parentRefs:\r\n  - name: deepseek-v3-gateway\r\n  rules:\r\n  - matches:\r\n    - path:\r\n        type: PathPrefix\r\n        value: /\r\n    backendRefs:\r\n    - group: inference.networking.k8s.io\r\n      kind: InferencePool\r\n      name: deepseek-v3-pool\r\n---\r\n# 3. The Inference Objective (Performance Logic)\r\napiVersion: inference.networking.x-k8s.io/v1alpha2\r\nkind: InferenceObjective\r\nmetadata:\r\n  name: deepseek-v3-objective\r\n  namespace: default\r\nspec:\r\n  poolRef:\r\n    name: deepseek-v3-pool&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7255317b20&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once complete, you can create a test VM in your main VPC and make a call to the IP address of the GKE Inference Gateway:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;curl -N -s -X POST &amp;quot;http://$GATEWAY_IP/v1/chat/completions&amp;quot; \\\r\n  -H &amp;quot;Content-Type: application/json&amp;quot; \\\r\n  -d \&amp;#x27;{\r\n    &amp;quot;model&amp;quot;: &amp;quot;deepseek-ai/DeepSeek-V3.1&amp;quot;,\r\n    &amp;quot;messages&amp;quot;: [{&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;Box A: red. Box B: blue. Box C: empty. Move A to C, Move B to A, Swap B and C. Where is red?&amp;quot;}],\r\n    &amp;quot;stream&amp;quot;: true\r\n  }\&amp;#x27; | stdbuf -oL grep &amp;quot;data: &amp;quot; | sed -u \&amp;#x27;s/^data: //\&amp;#x27; | grep -v &amp;quot;\\[DONE\\]&amp;quot; | \\\r\n  jq --unbuffered -rj \&amp;#x27;.choices[0].delta | (.reasoning_content // .reasoning // .content // empty)\&amp;#x27;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7255317040&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
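&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The pipeline after curl simply peels off the Server-Sent Events framing. The snippet below runs the same grep/sed stage on a canned stream so you can see what each stage removes; the JSON payloads are made up, and jq is omitted to keep the sketch dependency-free.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```shell
#!/usr/bin/env bash
# Canned Server-Sent Events shaped like the gateway's stream; payloads are illustrative.
STREAM='data: {"choices":[{"delta":{"content":"Red"}}]}
data: {"choices":[{"delta":{"content":" is in box C."}}]}
data: [DONE]'

# Same idea as the curl pipeline above: keep "data: " lines, strip the prefix,
# and drop the [DONE] sentinel that terminates the stream.
PAYLOADS=$(printf '%s\n' "$STREAM" | grep '^data: ' | sed 's/^data: //' | grep -v '\[DONE\]')
echo "$PAYLOADS"
```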
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Next Steps&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To take a deeper dive into GKE managed DRANET and GKE Inference Gateway, review the following:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Blog: &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-device-management-with-dra-dynamic-resource-allocation?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRA: A new era of Kubernetes device management with Dynamic Resource Allocation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Document set: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/config-auto-net-for-accelerators"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Documentation: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/ai-hypercomputer/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Want to ask a question, find out more, or share a thought? Please connect with me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/ammett/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 08 Apr 2026 10:05:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-gpus-gke-managed-dranet-and-inference-gateway-ai-deployment/</guid><category>Networking</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dranet.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Experimenting with GPUs: GKE managed DRANET and Inference Gateway AI Deployment</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dranet.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-gpus-gke-managed-dranet-and-inference-gateway-ai-deployment/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ammett Williams</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>See beyond the IP and secure URLs with Google Cloud NGFW</title><link>https://cloud.google.com/blog/products/identity-security/see-beyond-the-ip-and-secure-urls-with-google-cloud-ngfw/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a cloud-first world, traditional IP-based defenses are no longer enough to protect your perimeter. 
As services migrate to shared infrastructure and content delivery networks, relying on static IP addresses and FQDNs can create security gaps.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because a single IP address can host multiple services, and IP addresses can change frequently, we are introducing domain filtering with a wildcard capability in Cloud Next Generation Firewall (NGFW) Enterprise. This new capability provides increased security and granular policy controls.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Why domain and SNI filtering matters&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Cloud NGFW URL filtering service performs deep inspections of HTTP payloads to secure workloads against threats from both public and internal networks. This service elevates security controls to the application layer and helps restrict access to malicious domains. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Key use cases include: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Granular egress control&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This capability enables the precise allowing and blocking of connections based on domain names and SNI information found in egress HTTP(S) messages. By inspecting Layer 7 (L7) headers, it offers significantly finer control than traditional filtering based solely on IP addresses and FQDNs, which can be inefficient when a single IP hosts multiple services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Control access without decrypting&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: For organizations that prefer not to perform full TLS decryption on their traffic, Cloud NGFW can still enforce security policies by controlling traffic based on SNI headers provided during the TLS handshake. This allows for effective domain-level filtering while maintaining end-to-end encryption for privacy or compliance reasons.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reduced operational overhead&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Implementing domain-based filtering helps reduce the constant maintenance typically required to track frequently changing IP addresses and DNS records. By focusing on stable domain identities rather than dynamic network attributes, security teams can minimize the manual effort involved in updating firewall rulebases.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Flexible matching&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The service utilizes matcher strings within URL lists, supporting limited wildcard domains to define criteria for both domains and subdomains. For example, using a wildcard like &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;*.example.com&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; allows a single filter to cover all associated subdomains, providing a more scalable solution than defining thousands of individual FQDN entries.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Improved security: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;URL filtering significantly enhances your security posture by protecting against sophisticated attacks such as SNI header spoofing. By evaluating L7 headers before allowing access to an application, Cloud NGFW ensures that attackers cannot bypass security controls by simply spoofing lower-layer identifiers. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
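&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the wildcard semantics concrete, here is a small shell sketch that approximates matching a domain against a filter such as *.example.com. It illustrates the idea only; it is not Cloud NGFW's actual matcher.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```shell
#!/usr/bin/env bash
# Approximates the wildcard matching described above; illustrative only,
# not Cloud NGFW's implementation.
matches_filter() {
  local domain="$1" filter="$2"
  case "$domain" in
    $filter) return 0 ;;   # unquoted so the shell expands * as a glob
    *)       return 1 ;;
  esac
}

matches_filter "api.example.com" "*.example.com" && echo "api.example.com: matched"
matches_filter "example.org" "*.example.com" || echo "example.org: no match"
```

A single `*.example.com` entry therefore covers `api.example.com`, `www.example.com`, and any other subdomain, without enumerating FQDNs one by one.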
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;How Cloud NGFW URL filtering works&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The URL filtering service functions by inspecting traffic at L7 using a distributed architecture. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_zzP0Xt6.max-1000x1000.png"
        
          alt="image1"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="6nmqq"&gt;Cloud NGFW URL filtering service&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can get started with URL filtering in three simple steps.&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deploy Cloud NGFW endpoints&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ol&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The first step is to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-firewall-endpoints#create-firewall-endpoint"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;create and deploy a Cloud NGFW endpoint&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in a zone. The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-firewall-endpoints"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;NGFW endpoint&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is an organization level resource. Please ensure you have the right permission before deploying the endpoint.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the endpoint is deployed you can &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-firewall-endpoint-associations#create-end-assoc-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;associate it to one or more VPCs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of your choice.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Create security profiles and security profile groups:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ol&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-security-profiles#url-filtering-profile"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;URL filtering security profile&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; holds the URL filters with matcher strings and an action (allow or deny).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-security-profile-groups"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;security profile group&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; acts as a container for these security profiles, which is then referenced by a firewall policy rule. &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-urlf-security-profiles#create-urlf-security-profile"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create URL filtering security profiles&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; with desired URLs, wildcard FQDNs and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-security-profile-groups#create-security-profile-group"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;add them to a security profile group&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the security profile group is created, you will need to reference the security profile group in firewall policies.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Policy enforcement:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ol&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;You enable the service by configuring a hierarchical or global network firewall policy rule using the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;apply_security_profile_group&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; action, specifying the name of your security profile group. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/ol&gt;
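&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Put together, the three steps above map to a handful of gcloud commands. The sketch below is illustrative rather than authoritative: resource names, zones, and IDs are placeholders, and the exact subcommand for creating a URL filtering profile should be verified against the documentation linked above.&lt;/span&gt;&lt;/p&gt;

```shell
# 1a. Create a Cloud NGFW endpoint (organization-level resource).
gcloud network-security firewall-endpoints create my-ngfw-endpoint \
    --zone=us-central1-a \
    --organization=ORGANIZATION_ID \
    --billing-project=PROJECT_ID

# 1b. Associate the endpoint with a VPC network.
gcloud network-security firewall-endpoint-associations create my-assoc \
    --zone=us-central1-a \
    --network=projects/PROJECT_ID/global/networks/my-vpc \
    --endpoint=organizations/ORGANIZATION_ID/locations/us-central1-a/firewallEndpoints/my-ngfw-endpoint \
    --project=PROJECT_ID

# 2. Create a URL filtering security profile and a security profile group.
#    (Subcommand names are indicative; confirm against the security-profiles docs.)
gcloud network-security security-profiles url-filtering create my-urlf-profile \
    --organization=ORGANIZATION_ID --location=global
gcloud network-security security-profile-groups create my-spg \
    --organization=ORGANIZATION_ID --location=global

# 3. Reference the profile group from a global network firewall policy rule.
gcloud compute network-firewall-policies rules create 1000 \
    --firewall-policy=my-policy --global-firewall-policy \
    --direction=EGRESS --layer4-configs=tcp:443 \
    --dest-ip-ranges=0.0.0.0/0 \
    --action=apply_security_profile_group \
    --security-profile-group=//networksecurity.googleapis.com/organizations/ORGANIZATION_ID/locations/global/securityProfileGroups/my-spg \
    --enable-logging
```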
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more information about configuring a firewall policy rule, see the following:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/using-firewall-policies#create-ingress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an ingress hierarchical firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/using-firewall-policies#create-egress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an egress hierarchical firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/use-network-firewall-policies#create-ingress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an ingress global network firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/use-network-firewall-policies#create-egress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an egress global network firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Getting started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Get started with Cloud NGFW URL filtering by visiting our &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-url-filtering"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/cloud-ngfw-enterprise-urlf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;codelab&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 07 Apr 2026 17:30:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/identity-security/see-beyond-the-ip-and-secure-urls-with-google-cloud-ngfw/</guid><category>Networking</category><category>Developers &amp; Practitioners</category><category>Security &amp; Identity</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>See beyond the IP and secure URLs with Google Cloud NGFW</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/identity-security/see-beyond-the-ip-and-secure-urls-with-google-cloud-ngfw/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Uttam Ramesh</name><title>Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Susan Wu</name><title>Outbound Product Manager</title><department></department><company></company></author></item><item><title>Envoy: A future-ready foundation for agentic AI networking</title><link>https://cloud.google.com/blog/products/networking/the-case-for-envoy-networking-in-the-agentic-ai-era/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In today's agentic AI environments, the network has a new 
set of responsibilities.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a traditional application stack, the network mainly moves requests between services. But as discussed in a recent white paper,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://services.google.com/fh/files/misc/cloud_infrastructure_in_the_agent_native_era.pdf" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Cloud Infrastructure in the Agent-Native Era&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in an agentic system the network sits in the middle of model calls, tool invocations, agent-to-agent interactions, and policy decisions that can shape what an agent is allowed to do. The rapid proliferation of agents, often built on diverse frameworks, necessitates a consistent enforcement of governance and security across all agentic paths at scale. To achieve this, the enforcement layer must shift from the application level to the underlying infrastructure. That means the network can no longer operate as a blind transport layer. It has to understand more, enforce better, and adapt faster. This shift is precisely where Envoy comes in.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a high-performance distributed proxy and universal data plane, Envoy is built for massive scale. Trusted by demanding enterprise environments, including Google Cloud, it supports everything from single-service deployments to complex service meshes using Ingress, Egress, and Sidecar patterns. Because of its deep extensibility, robust policy integration, and operational maturity, Envoy is uniquely suited for an era where protocols change quickly and the cost of weak control is steep. For teams building agentic AI, Envoy is more than a concept: it's a practical, production-ready foundation.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_xPxMxF4.max-1000x1000.jpg"
        
          alt="1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic AI changes the networking problem&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic workloads still often use HTTP as a transport, but they break some of the assumptions that traditional HTTP intermediaries rely on. Protocols such as&lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (MCP) and&lt;/span&gt;&lt;a href="https://github.com/google/A2A" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent2agent&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (A2A) use&lt;/span&gt;&lt;a href="https://www.jsonrpc.org/specification" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JSON-RPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or&lt;/span&gt;&lt;a href="https://grpc.io" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;gRPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; over HTTP, adding protocol-level phases such as MCP initialization, where client and server exchange their capabilities, on top of standard HTTP request/response semantics. The key aspects of agentic systems that require intermediaries to adapt include:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Diverse enterprise governance imperatives. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The primary challenge is satisfying the wide spectrum of non-negotiable enterprise requirements for safety, security, data privacy, and regulatory compliance. These needs often go beyond standard network policies and require deep integration with internal systems, custom logic, and the ability to rapidly adapt to new organizational rules or external regulations. This demands a highly extensible framework where enterprises can plug in their specific governance models.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Policy attributes live inside message bodies, not headers.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Unlike traditional web traffic where policy inputs like paths and headers are readily accessible, agentic protocols frequently bury critical attributes (e.g., model names, tool calls, resource IDs) deep within JSON-RPC or gRPC payloads. This shift requires intermediaries to possess the ability to parse and understand message contents to apply context-aware policies.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Handling diverse and evolving protocol characteristics. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic protocols are not uniform. Some, like MCP with Streamable HTTP, can introduce stateful interactions requiring session management across distributed proxies (e.g., using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Mcp-Session-Id&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;). The need to support such varied behaviors, along with future protocol innovations, reinforces the necessity of an inherently adaptable and extensible networking foundation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
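&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the second point concrete, here is a representative MCP tools/call request (the field values are illustrative). The attributes a gateway policy cares about, the JSON-RPC method and the tool name, sit inside the body rather than in HTTP headers:&lt;/span&gt;&lt;/p&gt;

```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_issue",
    "arguments": {
      "owner": "example-org",
      "repo": "example-repo",
      "issue_number": 123
    }
  }
}
```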
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These factors mean enterprises need more than just connectivity. The network must now serve as a central point for enforcing the crucial governance needs mentioned earlier. This includes providing capabilities like centralized security, comprehensive auditability, fine-grained policy enforcement, and dynamic guardrails, all while keeping pace with the rapid evolution of protocols and agent behaviors. Put simply, agentic AI transforms the network from a mere transit path into a critical control point.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Why Envoy fits this shift&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is a strong fit for agentic AI networking for three reasons. Envoy is:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Battle-tested.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Enterprises already rely on Envoy in high-scale, security-sensitive environments, making it a credible platform to anchor a new generation of traffic management and policy enforcement.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Extensible.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Envoy can be extended through native filters, Rust modules, WebAssembly (Wasm) modules, and &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;external processing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; patterns. That gives platform teams room to adopt new protocols without having to rebuild their networking layer every time the ecosystem changes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Operationally useful today.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Envoy already acts as a gateway, enforcement point, observability layer, and integration surface for control planes. That makes it a practical choice for organizations that need to move now, not after the standards settle.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building on these core strengths, Envoy has introduced specific architectural advancements to meet the unique demands of agentic networking:&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;1. Envoy understands agent traffic&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The first requirement for agentic networking is simple: The gateway needs to understand what the agent is actually trying to do.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That’s harder than it sounds. In protocols such as MCP, A2A, and OpenAI-style APIs, important policy signals may live inside the request body. Traditional HTTP proxies are optimized to treat bodies as opaque byte streams. That design is efficient, but it limits what the proxy can enforce. For protocols that use JSON messages, a proxy may need to buffer the entire request body to locate attribute values needed for policy application — especially when those attributes appear at the end of the JSON message. Business logic specific to gen AI protocols, such as rate limiting based on consumed tokens, may also require parsing server responses.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy addresses this by deframing protocol messages carried over HTTP and exposing useful attributes to the rest of the filter chain. The extensibility model for gen AI protocols was guided by two goals:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Easy reuse of existing HTTP extensions that work with gen AI protocols out of the box, such as RBAC or tracers.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Easy access to deframed messages for gen-AI-specific extensions, so that developers can focus on gen AI business logic without needing to deal with HTTP or JSON envelopes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Based on these goals, new extensions for gen AI protocols are still built as HTTP extensions and configured in the HTTP filter chain. This provides flexibility to mix HTTP-native business logic, such as OAuth or mTLS authorization, with gen AI protocol logic in a single chain. A deframing extension parses the protocol messages carried by HTTP and provides an ambient context with extracted attributes, or even the entirety of parsed messages, to downstream extensions via well-known filter state and metadata values.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of forcing every policy component to parse JSON envelopes or protocol-specific message formats on its own, Envoy makes those attributes available as structured metadata. Once the gateway has deframed protocol messages, existing Envoy extensions such as &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_authz_filter" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ext_authz&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or RBAC can read protocol properties to evaluate policies using protocol-specific attributes such as tool names for MCP, message attributes for A2A, or model names for OpenAI.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Access logs can include message attributes for enhanced monitoring and auditing. The protocol attributes are also available to the &lt;/span&gt;&lt;a href="https://cel.dev/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Common Expression Language&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (CEL) runtime, simplifying creation of complex policy expressions in RBAC or composite extensions.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_t4lf1kG.max-1000x1000.png"
        
          alt="2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Buffering and memory management&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is designed to use as little memory as possible when proxying HTTP requests. However, parsing agentic protocols may require an arbitrary amount of buffer space, especially when extensions require the entire message to be in memory. The flexibility of allowing extensions to use larger buffers needs to be balanced with adequate protection from memory exhaustion, especially in the presence of untrusted traffic.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To achieve this, Envoy now provides a per-request buffer size limit. Buffers that hold request data are also integrated with the overload manager, enabling a full range of protective actions under memory pressure, such as reducing idle timeouts or resetting requests that consume the most memory for an extended duration. These changes pave the way for Envoy to serve as a gateway and policy-enforcement point for gen AI protocols without compromising its resource efficiency.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;2. Envoy enforces policy on things that matter&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Understanding traffic is only useful if the gateway can act on it.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In agentic systems, policy is not just about which service an agent can reach. It’s about which tools an agent can call, which models it can use, what identity it presents, how much it can consume, and what kinds of outputs require additional controls. Those are higher-value decisions than simple layer-4 or path-based controls, and they are exactly the kinds of controls enterprises care about when agents are allowed to take action on their behalf.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is well-positioned here because it can combine transport-level security with application-aware policy enforcement. Teams can authenticate workloads with mTLS and SPIFFE identities, then enforce protocol-specific rules with RBAC, external authorization, external processing, access logging, and CEL-based policy expressions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This capability is crucial because it lets platform teams decouple agent development from enforcement. Developers can focus on building useful agents, while operators enforce a consistent zero-trust posture at the network layer, even as tools, models, and protocols continue to change.&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;A prime example of this zero-trust decoupling is the critical "user-behind-agent" scenario, where an AI agent must execute tasks on a human user's behalf. Traditionally, handing user credentials directly to an application introduces severe security risks — if the agent is compromised or manipulated via prompt injection, an attacker could exfiltrate or misuse those credentials. By offloading identity management to Envoy, the proxy can automatically insert user delegation tokens into outbound requests at the infrastructure layer. Because the agent never directly holds the sensitive credential, the risk of a compromised agent misusing or leaking the token is completely neutralized, ensuring actions remain strictly bound to the user's actual permissions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Case study: Restricting an agent to specific GitHub MCP tools&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Consider an agent that triages GitHub issues.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GitHub MCP server may expose dozens of tools, but the agent may only need a small read-only subset, such as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;list_issues&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;get_issue&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;get_issue_comments&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. In most enterprises, that difference matters. A useful agent should not automatically become an unrestricted one.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With Envoy in front of the MCP server, the gateway can verify the agent identity using SPIFFE during the mTLS handshake, parse the MCP message via &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/http/mcp/v3/mcp.proto#envoy-v3-api-msg-extensions-filters-http-mcp-v3-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;the deframing filter&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, extract the requested method and tool name, and enforce a policy that allows only the approved tool calls for that specific agent identity. RBAC uses metadata created by the MCP deframing filter to check the method and tool name in the MCP message:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;envoy.filters.http.rbac:\r\n  &amp;quot;@type&amp;quot;: type.googleapis.com/envoy.extensions.filters.http.rbac.v3.RBACPerRoute\r\n  rbac:\r\n    rules:\r\n      policies:\r\n        github-issue-reader-policy:\r\n          permissions:\r\n            - and_rules:\r\n                rules:\r\n                  - sourced_metadata:\r\n                      metadata_matcher:\r\n                        filter: envoy.http.filters.mcp\r\n                        path: [{ key: &amp;quot;method&amp;quot; }]\r\n                        value: { string_match: { exact: &amp;quot;tools/call&amp;quot; } }\r\n                  - sourced_metadata:\r\n                      metadata_matcher:\r\n                        filter: envoy.http.filters.mcp\r\n                        path: [{ key: &amp;quot;params&amp;quot; }, { key: &amp;quot;name&amp;quot; }]\r\n                        value:\r\n                          or_match:\r\n                            value_matchers:\r\n                              - string_match: { exact: &amp;quot;list_issues&amp;quot; }\r\n                              - string_match: { exact: &amp;quot;get_issue&amp;quot; }\r\n                              - string_match: { exact: &amp;quot;get_issue_comments&amp;quot; }\r\n          principals:\r\n            - authenticated:\r\n                principal_name:\r\n                  exact: &amp;quot;spiffe://cluster.local/ns/github-agents/sa/issue-triage-agent&amp;quot;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f727146b310&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That’s the real value: Policy is enforced centrally, close to the traffic, and in terms that match the agent's actual behavior.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_jtbLCMn.max-1000x1000.png"
        
          alt="3"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Beyond static rules: External authorization&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;A complex compliance policy that can’t be expressed using RBAC rules can be implemented in an external authorization service using the &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_authz_filter" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ext_authz&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; protocol. Envoy provides MCP message attributes along with HTTP headers in the context of the ext_authz RPC. It can also forward the agent's SPIFFE identity from the peer certificate:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;http_filters:\r\n  - name: envoy.filters.http.ext_authz\r\n    typed_config:\r\n      &amp;quot;@type&amp;quot;: type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz\r\n      grpc_service:\r\n        envoy_grpc:\r\n          cluster_name: auth_service_cluster\r\n      include_peer_certificate: true\r\n      metadata_context_namespaces:\r\n        - envoy.http.filters.mcp&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7255317880&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This allows external services to make authorization decisions based on the full combination of agent identity, MCP method, tool name, and any other protocol attributes, without the agent or the MCP server needing to be aware of the policy layer.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Protocol-native error responses&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When Envoy denies a request, the error should be meaningful to the calling agent. For MCP traffic, Envoy can use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;local_reply_config&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to map HTTP error codes to appropriate JSON-RPC error responses. For example, a 403 Forbidden can be mapped to a JSON-RPC response with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;isError: true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and a human-readable message, ensuring the agent receives a protocol-appropriate denial rather than an opaque HTTP status code.&lt;/span&gt;&lt;/p&gt;
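As an illustrative sketch only (not taken from this post), a local_reply_config along the following lines could rewrite a gateway-issued 403 into a JSON-RPC-shaped body. The matcher and body fields follow Envoy's LocalReplyConfig API, but the exact response shape is an assumption, and a production mapping would also need to carry the original request's JSON-RPC id:

```yaml
# Hypothetical sketch: map an HTTP 403 produced by the gateway into a
# JSON-RPC style denial body. The body shape here is an assumption.
local_reply_config:
  mappers:
    - filter:
        status_code_filter:
          comparison:
            op: EQ
            value:
              default_value: 403
              runtime_key: unused_runtime_key
      status_code: 200   # deliver the denial as a syntactically valid JSON-RPC response
      body_format_override:
        json_format:
          jsonrpc: "2.0"
          result:
            isError: true
            content:
              - type: "text"
                text: "Denied by gateway policy: this agent may not call the requested tool."
```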
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;3. Envoy supports stateful agent interactions at scale&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Not all agent traffic is stateless. Some protocols, including Streamable HTTP for MCP, can rely on session-oriented behavior. That creates a new challenge for intermediaries, especially when traffic flows through multiple gateway instances to achieve scale and resilience. An MCP session effectively binds the agent to the server that established it, and all intermediaries need to know this to direct incoming MCP connections to the correct server.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If a session is established on one backend, later requests in that conversation need to reach the right destination. That sounds straightforward for a single-proxy deployment, but it becomes more complicated in horizontally scaled systems, where multiple Envoy instances may handle different requests from the same agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Passthrough gateway&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In the simpler passthrough mode, Envoy establishes one upstream connection for each downstream connection. Its primary use is enforcing centralized policies, such as client authorization, RBAC, rate limiting, and authentication, for external MCP servers. The session state transferred between intermediaries needs to include only the address of the server that established the session over the initial HTTP connection, so that all session-related requests are directed to that server.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Session state transfer between different Envoy instances is achieved by appending encoded session state to the MCP session ID provided by the MCP server. Envoy removes the session-state suffix from the session ID before forwarding the request to the destination MCP server. This session stickiness is enabled by configuring Envoy's &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/http/stateful_session/envelope/v3/envelope.proto" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;envoy.http.stateful_session.envelope&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; extension.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
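For illustration, enabling that extension might look roughly like the sketch below. The filter and extension names come from the Envoy docs linked above, but the exact type URLs and the lowercase session-ID header name are assumptions to verify against the envelope extension reference:

```yaml
# Hypothetical sketch: envelope-based session stickiness for MCP traffic.
http_filters:
  - name: envoy.filters.http.stateful_session
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.stateful_session.v3.StatefulSession
      session_state:
        name: envoy.http.stateful_session.envelope
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.http.stateful_session.envelope.v3.EnvelopeSessionState
          header:
            name: "mcp-session-id"   # Envoy appends encoded session state to this ID
```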
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/4_j0wGyAp.max-1000x1000.png"
        
          alt="4"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Aggregating gateway&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In aggregating mode, Envoy acts as a single MCP server by aggregating the capabilities, tools, and resources of multiple backend MCP servers. In addition to enforcing policies, this simplifies agent configuration and unifies policy application for multiple MCP servers.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Session management in this mode is more complicated because the session state also needs to include mapping from tools and resources to the server addresses and session IDs that advertised them. The session ID that Envoy provides to the agent is created before tools or resources are known, and the mapping has to be established later, after the MCP initialization phases between Envoy and the backend MCP servers are complete.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One approach, currently implemented in Envoy, is to combine the name of a tool or resource with the identifier and session ID of its origin server. Exact tool or resource names are typically not meaningful to the agent, so they can carry this additional provenance information. If unmodified tool or resource names are desirable, another approach is for an Envoy instance that does not have the mapping to recreate it by issuing a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;tools/list&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command before calling a specific tool. This approach trades added latency for avoiding the complexity of deploying an external global store of MCP sessions, and is currently in planning based on user feedback.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/5_61xwM79.max-1000x1000.png"
        
          alt="5"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This matters because it moves Envoy beyond simple traffic forwarding. It allows Envoy to serve as a reliable intermediary for real agent workflows, including those spanning multiple requests, tools, and backends.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;4. Envoy supports agent discovery&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is adding support for the A2A protocol and agent discovery via a well-known AgentCard endpoint. AgentCard, a JSON document with agent capabilities, enables discovery and multi-agent coordination by advertising skills, authentication requirements, and service endpoints. The AgentCard can be provisioned statically via direct response configuration or obtained from a centralized agent registry server via xDS or ext_proc APIs. A more detailed description of A2A implementation and agent discovery will be published in a forthcoming blog post.&lt;/span&gt;&lt;/p&gt;
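To make the static option concrete, a hypothetical route sketch could serve a card through Envoy's direct_response. The well-known path and the card fields shown here are placeholders, not details from this post:

```yaml
# Hypothetical sketch: statically serve an AgentCard from Envoy route config.
# Path and card contents are placeholders; check the A2A spec for the
# current well-known location.
routes:
  - match:
      path: "/.well-known/agent-card.json"
    direct_response:
      status: 200
      body:
        inline_string: '{"name": "issue-triage-agent", "skills": [], "url": "https://agents.example.com/a2a"}'
```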
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;5. Envoy is a complete solution for agentic networking challenges&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building on the same foundation that enabled policy application for MCP protocol in demanding deployments, Envoy is adding support for OpenAI and transcoding of agentic protocols into RESTful HTTP APIs. This transcoding capability simplifies the integration of gen AI agents with existing RESTful applications, with out-of-the-box support for OpenAPI-based applications and custom options via dynamic modules or Wasm extensions. In addition to transcoding, Envoy is being strengthened in critical areas for production readiness, such as advanced policy applications like quota management, comprehensive telemetry adhering to&lt;/span&gt;&lt;a href="https://opentelemetry.io/docs/specs/semconv/gen-ai/" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OpenTelemetry semantic conventions for generative AI systems&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and integrated guardrails for secure agent operation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Guardrails for safe agents&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The next significant area of investment is centralized management and application of guardrails for all agentic traffic. Integrating policy enforcement points with external guardrails presently requires bespoke implementation, and this problem area is ripe for standardization.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Control planes make this operational&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The gateway is only part of the story. To achieve this policy management and rollout at scale, a separate control plane is required to dynamically configure the data plane using the xDS protocol, also known as the universal data plane API.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That is where control planes become important. Cloud Service Mesh, alongside open-source projects such as &lt;/span&gt;&lt;a href="https://aigateway.envoyproxy.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Envoy AI Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://github.com/kubernetes-sigs/kube-agentic-networking" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;kube-agentic-networking&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, uses Envoy as the data plane while giving operators higher-level ways to define and manage policy for agentic workloads.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This combination is powerful: Envoy provides the enforcement and extensibility in the traffic path, while control planes provide the operating model teams need to deploy that capability consistently.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Why this matters now&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The shift towards agentic systems and gen AI protocols such as MCP, A2A, and OpenAI necessitates an evolution in network intermediaries. The primary complexities Envoy addresses include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deep protocol inspection.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Protocol deframing extensions extract policy-relevant attributes (tool names, model names, resource paths) from the body of HTTP requests, enabling precise policy enforcement where traditional proxies would only see an opaque byte stream.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fine-grained policy enforcement.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By exposing these internal attributes, existing Envoy extensions like RBAC and ext_authz can evaluate policies based on protocol-specific criteria. This allows network operators to enforce a unified, zero-trust security posture, ensuring agents comply with access policies for specific tools or resources.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Stateful transport management.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Envoy supports managing session state for the Streamable HTTP transport used by MCP, enabling robust deployments in both passthrough and aggregating gateway modes, even across a fleet of intermediaries.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic AI protocols are still in their early stages, and the protocol landscape will continue to evolve. That’s exactly why the networking layer needs to be adaptable. Enterprises should not have to rebuild their security and traffic infrastructure every time a new agent framework, transport pattern, or tool protocol gains traction. They need a foundation that can absorb change without sacrificing control.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy brings together three qualities that are hard to get in one place: proven production maturity, deep extensibility, and growing protocol awareness for agentic workloads. By leveraging Envoy as an agent gateway, organizations can decouple security and policy enforcement from agent development code.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That makes Envoy more than just a proxy that happens to handle AI traffic. It makes Envoy a future-ready foundation for agentic AI networking.&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sup&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Special thanks to the additional co-authors of this blog: Boteng Yao, Software Engineer, Google and Tianyu Xia, Software Engineer, Google and Sisira Narayana, Sr Product Manager, Google.&lt;/span&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 03 Apr 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/the-case-for-envoy-networking-in-the-agentic-ai-era/</guid><category>Containers &amp; Kubernetes</category><category>AI &amp; Machine Learning</category><category>GKE</category><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Envoy: A future-ready foundation for agentic AI networking</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/the-case-for-envoy-networking-in-the-agentic-ai-era/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yan Avlasov</name><title>Staff Software Engineer, Google</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Erica Hughberg</name><title>Product and Product Marketing Manager, Tetrate</title><department></department><company></company></author></item><item><title>Activating Your Data Layer for Production-Ready AI</title><link>https://cloud.google.com/blog/topics/developers-practitioners/activating-your-data-layer-for-production-ready-ai/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When discussing applications and systems using generative AI and the new opportunities they present, one component of the ecosystem is irreplaceable - data. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Specifically, the data that companies gather, hold, and use daily. 
This data serves as the backbone for applications, analytics, knowledge bases, and much more. We use databases to store and work with this data, and most, if not all, AI-driven initiatives and new applications are going to use that data layer.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But how can we start to use the data in our AI systems? Let me introduce you to some of the labs showing how to prepare and use the data with AI models in Google databases.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Semantic Search: Text Embeddings in the Database&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our journey starts by preparing your data for semantic search and running your first tests, augmenting the Gen AI model's response by grounding it with your semantic search results. This grounding data is the basis for retrieval-augmented generation (RAG). Then, you can improve the performance of your search by indexing your embeddings using the latest indexing techniques.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One of the options is the &lt;/span&gt;&lt;a href="https://cloud.google.com/products/alloydb?e=48754805"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google AlloyDB database&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which has direct integration with AI models and supports the most demanding workloads. The following lab guides us through all the steps, starting from creating an AlloyDB cluster, loading sample data, and generating embeddings, to using those embeddings to generate an augmented response from the Gen AI model.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;Go to the lab!&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271d717f0&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI integration is not limited to AlloyDB. All Google Cloud databases have AI integration and are capable of generating and using embeddings for semantic search. For example, if you are using &lt;/span&gt;&lt;a href="https://cloud.google.com/sql"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud SQL&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, you can also generate and use embeddings for semantic search directly within your existing &lt;/span&gt;&lt;a href="https://cloud.google.com/sql/postgresql"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;PostgreSQL&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;a href="https://cloud.google.com/sql/mysql"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;MySQL&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; instances.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The next two labs are very similar to the previous one, but instead of Google AlloyDB for PostgreSQL, we are using Cloud SQL for PostgreSQL and Cloud SQL for MySQL to use semantic search as the grounding engine for the model's response. Some steps are of course different due to variations in SQL language and different database engines, but the main idea stays the same: use our data to ground the model response and improve output.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;Go to the labs!&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271d71b50&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Semantic search using text data is one of the cornerstones and important features making responses much more reliable and useful, but Google Gen AI models can offer much more. Let's talk about multimodal search.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Multimodal Embeddings: Bring Images to the Search&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In real life, of course, we use all our senses, including vision, to evaluate the world around us. The Google multimodal embedding models bring an additional layer of understanding, improving search by using embeddings not only for text but also for images.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the following lab, we use a catalog of products stored in AlloyDB and supplemented by images in Google Cloud Storage. We show how text descriptions and images can complement or substitute for each other in semantic search, naturally incorporating image-based input into our responses.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;Go to the lab!&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271d71760&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Preparing the data and taking the first steps are important for a general understanding of RAG and the tools available for search, but Google also offers direct AI integration that can help with your data analysis without any data preparation.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;AlloyDB AI Functions and Reranking&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google AlloyDB database comes with additional AI integrations that help you use some AI capabilities without data preparation. For example, the AI.IF function can perform semantic search on the fly, evaluating sentiment or comparing data in columns with a natural language query, returning results filtered by the query condition. Also, you can apply a ranking function to the search output, improving the final result. You can try some of the new functionality using the following lab and let us know if it can help in your use case.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;Go to the lab!&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271d717c0&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But what if somebody is not particularly savvy with SQL, or not familiar with the data structure in your database? AlloyDB AI natural language (NL2SQL) can help with that.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Generate SQL using AlloyDB AI Natural Language&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The "alloydb_ai_nl" AlloyDB extension allows you not only to generate SQL queries based on the default metadata available out of the box, but also to build automatic or custom context, helping to get the best out of query generation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The NL2SQL functions can add a layer describing your data structure, the relations between tables, and metadata based on real data samples from your tables, without compromising the data itself. This provides the information the AI model needs to build the best query. The following lab helps you get started with the new features and generate your first queries based on your data schema.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;Go to the lab!&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7271d71250&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;From Tests to Production&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Those labs are part of the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;From Data Foundations to Advanced RAG&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; module of our &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/production-ready-ai-with-google-cloud-learning-path"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Production-Ready AI with Google Cloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; program. Check the other modules and see if they can help you adopt the AI capabilities provided by our Google Cloud services and tools. The end goal is a high-quality application that uses the full potential of modern technologies.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;And keep an eye on the release notes for AlloyDB and Cloud SQL - the engineering team is busy working on new features and improvements. Happy testing.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 02 Apr 2026 13:18:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/activating-your-data-layer-for-production-ready-ai/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_new.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Activating Your Data Layer for Production-Ready AI</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_new.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/activating-your-data-layer-for-production-ready-ai/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Gleb Otochkin</name><title>Cloud Advocate, Databases</title><department></department><company></company></author></item><item><title>Create Expert Content: Architect A Personalized Multi-Agent System with Long-Term Memory</title><link>https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In support of our mission to accelerate the developer journey on Google Cloud, we built &lt;strong&gt;Dev Signal&lt;/strong&gt;—a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;first part&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of this series for the &lt;strong&gt;Dev Signal&lt;/strong&gt;, we laid the essential groundwork for this system by establishing a project environment and equipping core capabilities through the Model Context Protocol (MCP). We standardized our external integrations, connecting to Reddit for trend discovery, Google Cloud Docs for technical grounding, and building a custom Nano Banana Pro MCP server for multimodal image generation. If you missed &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Part 1&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or want to explore the code directly, you can find the complete project implementation in our &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, in Part 2, we focus on building the multi-agent architecture and integrating the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI memory bank&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to personalize these capabilities. We will implement a Root Orchestrator that manages three specialist agents: the Reddit Scanner, GCP Expert, and Blog Drafter, to provide a seamless flow from trend discovery to expert content creation. We will also integrate a long-term memory layer that enables the agent to learn from your feedback and persist your stylistic preferences across different conversations. This ensures that Dev Signal doesn't just process data, but actually learns to match your professional voice over time.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Infrastructure and Model Setup&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, we initialize the environment and the shared Gemini model.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/agent.py&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt; &lt;/code&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;from google.adk.agents import Agent\r\nfrom google.adk.apps import App\r\nfrom google.adk.models import Gemini\r\nfrom google.adk.tools import google_search, AgentTool, load_memory_tool, preload_memory_tool\r\nfrom google.adk.tools.tool_context import ToolContext\r\nfrom google.genai import types\r\nfrom dev_signal_agent.app_utils.env import init_environment\r\nfrom dev_signal_agent.tools.mcp_config import (\r\n    get_reddit_mcp_toolset, \r\n    get_dk_mcp_toolset, \r\n    get_nano_banana_mcp_toolset\r\n)\r\n\r\nPROJECT_ID, MODEL_LOC, SERVICE_LOC, SECRETS = init_environment()\r\n\r\n\r\nshared_model = Gemini(\r\n    model=&amp;quot;gemini-3-flash-preview&amp;quot;, \r\n    vertexai=True, \r\n    project=PROJECT_ID, \r\n    location=MODEL_LOC,\r\n    retry_options=types.HttpRetryOptions(attempts=3),\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7239a018b0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Memory Ingestion Logic&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We want Dev Signal&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; to do more than just follow instructions - we want it to learn from you. By capturing your preferences, such as specific technical interests on Reddit or a preferred blogging style, the agent can personalize its output for future use. To achieve this, we use the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI memory bank&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to persist session history across different conversations.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Long-term Memory&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We automate this through the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;save_session_to_memory_callback&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function. This callback is configured to run automatically after every turn, ensuring that session details are captured and stored in the memory bank without manual intervention.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;How Managed Memory Works&lt;/span&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Ingestion&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;save_session_to_memory_callback&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; sends the conversation data to Vertex AI.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Embedding&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Vertex AI converts the text into numerical vectors (embeddings) that capture the semantic meaning of your preferences.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Storage&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: These vectors are stored in a managed index, enabling the agent to perform semantic searches and retrieve relevant history in future sessions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Retrieval&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The agent recalls this history using built-in ADK tools. The PreloadMemoryTool proactively brings in context at the start of an interaction, while the LoadMemoryTool allows the agent to fetch specific memories on an as-needed basis.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
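The four stages above can be sketched as a toy, self-contained mock. The bag-of-words "embedding" and the in-memory index below are illustrative stand-ins for what the managed Vertex AI Memory Bank service handles for you; they are not its real API.

```python
# Conceptual mock of the ingest -> embed -> store -> retrieve cycle.
# A real deployment uses Vertex AI Memory Bank; this sketch only shows
# the shape of the pattern with a toy bag-of-words "embedding".
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

memory_index = []  # stand-in for the managed vector index

def add_session_to_memory(fact: str) -> None:
    # Ingestion + embedding + storage.
    memory_index.append((embed(fact), fact))

def load_memory(query: str, top_k: int = 1) -> list:
    # Retrieval: semantic search over stored memories.
    q = embed(query)
    ranked = sorted(memory_index, key=lambda m: cosine(q, m[0]), reverse=True)
    return [fact for _, fact in ranked[:top_k]]

add_session_to_memory("User prefers a witty blogging style")
add_session_to_memory("User is interested in Cloud Run agents")
print(load_memory("what writing style does the user like?"))
# -> ['User prefers a witty blogging style']
```

The point of the mock is the division of labor: the callback only needs to ship sessions into the store, and the agent only needs to issue a semantic query; everything in between is managed by the service.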
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/agent.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;async def save_session_to_memory_callback(*args, **kwargs) -&amp;gt; None:\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    Defensive callback to persist session history to the Vertex AI memory bank.\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    ctx = kwargs.get(&amp;quot;callback_context&amp;quot;) or (args[0] if args else None)\r\n    \r\n    # Check connection to Memory Service\r\n    if ctx and hasattr(ctx, &amp;quot;_invocation_context&amp;quot;) and ctx._invocation_context.memory_service:\r\n        # Save the session!\r\n        await ctx._invocation_context.memory_service.add_session_to_memory(\r\n            ctx._invocation_context.session\r\n        )&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7239a01220&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Short-term Memory&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;add_info_to_state&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function serves as the agent's short-term working memory, allowing the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcp_expert&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to reliably hand off its detailed findings to the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;blog_drafter&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; within the same session. This working memory and the conversation transcript are managed by the Vertex AI Session Service to ensure that active context survives server restarts or transient failures.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;The boundary between session-based state and long-term persistence &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;- It is important to note that while this service provides stability during an active interaction, this short-term memory does not persist between different sessions. Starting a fresh session ID effectively resets this working state, ensuring a clean slate for new tasks. Cross-session continuity, where the agent remembers your stylistic preferences or past feedback, is handled by the Vertex AI Memory Bank.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/agent.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;def add_info_to_state(tool_context: ToolContext, key: str, data: str) -&amp;gt; dict:\r\n    tool_context.state[key] = data\r\n    return {&amp;quot;status&amp;quot;: &amp;quot;success&amp;quot;, &amp;quot;message&amp;quot;: f&amp;quot;Saved \&amp;#x27;{key}\&amp;#x27; to state.&amp;quot;}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7239a019d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
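To see how this handoff behaves, here is a self-contained sketch. The `FakeToolContext` class is a hypothetical stand-in for the `ToolContext` that the ADK runtime injects; only its `state` dict matters for this illustration.

```python
# Hypothetical stand-in for the ADK-injected ToolContext: in production
# the runtime supplies this object, and `state` is backed by the
# Vertex AI Session Service rather than a plain dict.
class FakeToolContext:
    def __init__(self):
        self.state: dict = {}

def add_info_to_state(tool_context, key: str, data: str) -> dict:
    tool_context.state[key] = data
    return {"status": "success", "message": f"Saved '{key}' to state."}

ctx = FakeToolContext()

# gcp_expert writes its research under the agreed-upon key...
add_info_to_state(ctx, "technical_research_findings", "Cloud Run findings...")

# ...and blog_drafter later reads the same session state to ground its draft.
findings = ctx.state["technical_research_findings"]
print(findings)  # -> Cloud Run findings...
```

The key name is the contract between the two specialists: the expert's instruction tells it to save under `technical_research_findings`, and the drafter's instruction template reads that same key back.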
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Specialist 1: Reddit Scanner (Discovery)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Reddit scanner is our “Trend Spotter," it identifies high-engagement questions from the last 21 days (3 weeks) to ensure that all research findings remain both timely and relevant.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Memory Usage:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; It leverages &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;load_memory&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to retrieve your past areas of interest and preferred topics from the Vertex AI memory bank If relevant history exists, the agent prioritizes those specific topics in its search to provide a personalized discovery experience.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond simple retrieval, each sub-agent actively updates its memories by listening for new preferences and explicitly acknowledging them during the chat. This process captures relevant information in the session history, where an automated callback then persists it to the long-term Vertex AI memory bank for future use.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This memory management is supported by two distinct retrieval patterns within the Google Agent Development Kit (ADK). The first is the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PreloadMemoryTool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, which proactively brings in historical context at the beginning of every interaction to ensure the agent is fully briefed before addressing the current request. The second is the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;LoadMemoryTool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, which the agent uses on an as-needed basis, calling upon it only when it decides that deeper past knowledge would be beneficial for the current step in the workflow.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/agent.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Singleton toolsets\r\nreddit_mcp = get_reddit_mcp_toolset(\r\n    client_id=SECRETS.get(&amp;quot;REDDIT_CLIENT_ID&amp;quot;, &amp;quot;&amp;quot;),\r\n    client_secret=SECRETS.get(&amp;quot;REDDIT_CLIENT_SECRET&amp;quot;, &amp;quot;&amp;quot;),\r\n    user_agent=SECRETS.get(&amp;quot;REDDIT_USER_AGENT&amp;quot;, &amp;quot;&amp;quot;)\r\n)\r\nreddit_scanner = Agent(\r\n    name=&amp;quot;reddit_scanner&amp;quot;,\r\n    model=shared_model,\r\n    instruction=&amp;quot;&amp;quot;&amp;quot;\r\n    You are a Reddit research specialist. Your goal is to identify high-engagement questions \r\n    from the last 3 weeks on specific topics of interest, such as AI/agents on Cloud Run.\r\n    \r\n    Follow these steps:\r\n    1. **MEMORY CHECK**: Use `load_memory` to retrieve the user\&amp;#x27;s **past areas of interest** and **preferred topics**. Calibrate your search to align with these interests.\r\n    2. Use the Reddit MCP tools to search for relevant subreddits and posts.\r\n    3. Filter results for posts created within the last 21 days (3 weeks).\r\n    4. Analyze &amp;quot;high-engagement&amp;quot; based on upvote counts and the number of comments.\r\n    5. Recommend the most important and relevant questions for a technical audience.\r\n    6. **CRITICAL**: For each recommended question, provide a direct link to the original thread and a concise summary of the discussion.\r\n    7. **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. 
Explicitly acknowledge them to ensure they are captured in the session history for future personalization.\r\n    &amp;quot;&amp;quot;&amp;quot;,\r\n    tools=[reddit_mcp, load_memory_tool.LoadMemoryTool()],\r\n    after_agent_callback=save_session_to_memory_callback,\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7239a01e50&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Specialist 2: GCP Expert (Grounding)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GCP expert is our "The Technical Authority". It triangulates facts by synthesizing official documentation from the Google Cloud Developer Knowledge MCP Server, community sentiment from Reddit, and broader context from Google Search.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/agent.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;dk_mcp = get_dk_mcp_toolset(api_key=SECRETS.get(&amp;quot;DK_API_KEY&amp;quot;, &amp;quot;&amp;quot;))\r\n\r\n\r\nsearch_agent = Agent(\r\n    name=&amp;quot;search_agent&amp;quot;,\r\n    model=shared_model,\r\n    instruction=&amp;quot;Execute Google Searches and return raw, structured results (Title, Link, Snippet).&amp;quot;,\r\n    tools=[google_search],\r\n)\r\ngcp_expert = Agent(\r\n    name=&amp;quot;gcp_expert&amp;quot;,\r\n    model=shared_model,\r\n    instruction=&amp;quot;&amp;quot;&amp;quot;\r\n    You are a Google Cloud Platform (GCP) documentation expert. \r\n    Your goal is to provide accurate, detailed, and cited answers to technical questions by synthesizing official documentation with community insights.\r\n    \r\n    For EVERY technical question, you MUST perform a comprehensive research sweep using ALL available tools:\r\n    \r\n    1. **Official Docs (Grounding)**: Use DeveloperKnowledge MCP (`search_documents`) to find the definitive technical facts.\r\n    2. **Social Media Research (Reddit)**: Use the Reddit MCP to research the question on social media. This allows you to find real-world user discussions, common pain points, or alternative solutions that might not be in official documentation.\r\n    3. 
**Broader Context (Web/Social)**: Use the `search_agent` tool to find recent technical blogs, social media discussions, or tutorials.\r\n    \r\n    Synthesize your answer:\r\n    - Start with the official answer based on GCP docs.\r\n    - Add &amp;quot;Social Media Insights&amp;quot; or &amp;quot;Common Issues&amp;quot; sections derived from Reddit and Web Search findings.\r\n    - **CRITICAL**: After providing your answer, you MUST use the `add_info_to_state` tool to save your full technical response under the key: `technical_research_findings`.\r\n    - Cite your sources specifically at the end of your response, providing **direct links** (URLs) to the official documentation, blog posts, and Reddit threads used.\r\n    - **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. Explicitly acknowledge them to ensure they are captured in the session history for future personalization.\r\n    &amp;quot;&amp;quot;&amp;quot;,\r\n    tools=[dk_mcp, AgentTool(search_agent), reddit_mcp, add_info_to_state],\r\n    after_agent_callback=save_session_to_memory_callback,\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7239a01a90&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt; Specialist 3: Blog Drafter (Creativity)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The blog drafter is our Content Creator. It drafts the blog based on the expert's findings and offers to generate visuals.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Memory Usage&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: It checks &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;load_memory&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for the user's &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;preferred writing style&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (e.g. "Witty", "Rap") stored in the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Vertex AI memory bank&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/agent.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;nano_mcp = get_nano_banana_mcp_toolset()\r\n\r\n\r\nblog_drafter = Agent(\r\n    name=&amp;quot;blog_drafter&amp;quot;,\r\n    model=shared_model,\r\n    instruction=&amp;quot;&amp;quot;&amp;quot;\r\n    You are a professional technical blogger specializing in Google Cloud Platform. \r\n    Your goal is to draft high-quality blog posts based on technical research provided by the GDE expert and reliable documentation.\r\n    \r\n    You have access to the research findings from the gcp_expert_agent here:\r\n    {{ technical_research_findings }}\r\n \r\n    Follow these steps:\r\n    1. **MEMORY CHECK**: Use `load_memory` to retrieve past blog posts, **areas of interest**, and user feedback on writing style. Adopt the user\&amp;#x27;s preferred style and depth.\r\n    2. **REVIEW &amp;amp; GROUND**: Review the technical research findings provided above. **CRITICAL**: Use the `dk_mcp` (Developer Knowledge) tool to verify key facts, technical limitations, and API details. Ensure every claim in your blog is grounded in official documentation.\r\n    3. Draft a blog post that is engaging, accurate, and helpful for a technical audience.\r\n    4. Include code snippets or architectural diagrams if relevant.\r\n    5. Provide a &amp;quot;Resources&amp;quot; section with links to the official documentation used.\r\n    6. Ensure the tone is professional yet accessible, while adhering to any style preferences found in memory.\r\n    7. **VISUALS**: After presenting the drafted blog post, explicitly ask the user: &amp;quot;Would you like me to generate an infographic-style header image to illustrate these key points?&amp;quot; If they agree, use the `generate_image` tool (Nano Banana).\r\n    8. **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. 
Explicitly acknowledge them to ensure they are captured in the session history for future personalization.\r\n    &amp;quot;&amp;quot;&amp;quot;,\r\n    tools=[dk_mcp, load_memory_tool.LoadMemoryTool(), nano_mcp],\r\n    after_agent_callback=save_session_to_memory_callback,\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7239a017c0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Root Orchestrator&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The root agent serves as the system's strategist, managing a team of specialist agents and orchestrating their actions based on the specific goals provided by the user. At the start of a conversation, the orchestrator retrieves memory to establish context by checking for the user's past areas of interest, preferred topics, or previous projects. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;/agent.py&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;root_agent = Agent(\r\n    name=&amp;quot;root_orchestrator&amp;quot;,\r\n    model=shared_model,\r\n    instruction=&amp;quot;&amp;quot;&amp;quot;\r\n    You are a technical content strategist. You manage three specialists:\r\n    1. reddit_scanner: Finds trending questions and high-engagement topics on Reddit.\r\n    2. gcp_expert: Provides technical answers based on official GCP documentation.\r\n    3. blog_drafter: Writes professional blog posts based on technical research.\r\n \r\n    Your responsibilities:\r\n    - **MEMORY CHECK**: At the start of a conversation, use `load_memory` to check if the user has specific **areas of interest**, preferred topics, or past projects. Tailor your suggestions accordingly.\r\n    - **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. Explicitly acknowledge them to ensure they are captured in the session history for future personalization.\r\n    - If the user wants to find trending topics or questions from Reddit, delegate to reddit_scanner.\r\n    - If the user has a technical question or wants to research a specific theme, delegate to gcp_expert.\r\n    - **CRITICAL**: After the gcp_expert provides an answer, you MUST ask the user: \r\n      &amp;quot;Would you like me to draft a technical blog post based on this answer?&amp;quot;\r\n    - If the user agrees or asks to write a blog, delegate to blog_drafter.\r\n    - Be proactive in helping the user navigate from discovery (Reddit) to research (Docs) to content creation (Blog).\r\n    &amp;quot;&amp;quot;&amp;quot;,\r\n    tools=[load_memory_tool.LoadMemoryTool(), preload_memory_tool.PreloadMemoryTool()],\r\n    after_agent_callback=save_session_to_memory_callback,\r\n    sub_agents=[reddit_scanner, gcp_expert, blog_drafter]\r\n)\r\n\r\napp = App(root_agent=root_agent, name=&amp;quot;dev_signal_agent&amp;quot;)&amp;#x27;), 
(&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f7239a012b0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Summary&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this part of our series, we built multi-agent architecture and implemented a robust, dual-layered memory system. We established a Root Orchestrator, managing three specialist agents: a Reddit Scanner for trend discovery, a GCP Expert for technical grounding, and a Blog Drafter for creative content creation. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By utilizing short-term state to pass information reliably between specialists and integrating the Vertex AI memory bank for long-term persistence, we’ve enabled the agent to learn from your feedback and remember specific writing styles across different conversations. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In &lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory"&gt;part 3&lt;/a&gt;, we will show you how to test the agent locally to verify these components on your workstation, before transitioning to a full production deployment on Google Cloud Run in part 4. Can't wait for Part 3? The full implementation is already available for you to explore on &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more about the underlying technology, explore the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI Memory Bank overview&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or dive into the official &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-development-kit/overview?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK Documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to see how to orchestrate complex multi-agent workflows.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Special thanks to &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Remigiusz Samborski&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; for the helpful review and feedback on this article.&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more content like this, Follow me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Linkedin&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://x.com/shirmeir86?lang=en" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 31 Mar 2026 09:31:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Create Expert Content: Architect A Personalized Multi-Agent System with Long-Term Memory</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI, Product DevRel</title><department></department><company></company></author></item><item><title>Five techniques to reach the efficient frontier of LLM inference</title><link>https://cloud.google.com/blog/topics/developers-practitioners/five-techniques-to-reach-the-efficient-frontier-of-llm-inference/</link><description>&lt;div 
class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Every dollar that you spend on model inference buys you a position on a graph of latency and throughput. On this plot is a curve of optimal configurations, where you've squeezed the maximum possible performance from your hardware. That curve, borrowed from portfolio theory in finance, is the &lt;/span&gt;&lt;a href="https://en.wikipedia.org/wiki/Efficient_frontier" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;efficient frontier&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the assumption that you have a fixed budget for hardware, you can trade latency for throughput. But, you can't improve one aspect without sacrificing the other, unless the frontier curve itself moves. There are two fundamentally different dynamics at play, and this is the central insight for anyone running LLMs in production.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The first dynamic is &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;getting to the frontier&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, which involves applying the full stack of techniques available to you today. This part is within your control. &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-tensortllm?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Continuous batching&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/best-practices/machine-learning/inference/llm-optimization#model-memory"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;paged attention&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;intelligent routing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/blog/posts/from-research-to-production-accelerate-oss-llm-with-eagle-3-on-vertex?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;speculative decoding&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/best-practices/machine-learning/inference/llm-optimization#quantization"&gt;&lt;span style="text-decoration: 
underline; vertical-align: baseline;"&gt;quantization&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; all exist right now. If you're not using these techniques, you're operating below the frontier and leaving performance on the table.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The second dynamic is that &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;the frontier itself is constantly moving outward&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. This part is largely outside of your control. Researchers publish new algorithms. Hardware vendors ship new architectures. Open-source projects mature. Each breakthrough redefines what's physically achievable and expands the curve so that yesterday's optimal configuration is today's inefficiency.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Your job as a platform engineer is to stay as close to the frontier as possible as you build infrastructure that's flexible enough to absorb each new advance as it arrives. This article gives you the tools to do just that.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Why inference has an efficient frontier&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Every LLM request has two computational phases, and each is bottlenecked by a different hardware resource.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Prefill (Compute-Bound)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: In this phase, the GPU processes your entire input prompt at once to build the &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/best-practices/machine-learning/inference/llm-optimization#attention-layer-optimization?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;key-value (KV) cache&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for the attention mechanism. Because the prompt's tokens are processed in parallel, the GPU's compute cores (tensor cores) are highly utilized. This phase is fast and efficient: the processors have all of the data that they need, immediately available, to perform massive matrix multiplications. Longer prompts simply mean more parallel computation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Decode (Memory-Bandwidth-Bound)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This phase generates new tokens, one at a time, &lt;/span&gt;&lt;a href="https://en.wikipedia.org/wiki/Autoregressive_model" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;autoregressively&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. To generate even a single token, the GPU can't batch the work. It must fetch the &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;entire&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; model's weights and the growing KV cache from &lt;/span&gt;&lt;a href="https://en.wikipedia.org/wiki/High_Bandwidth_Memory" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;High-Bandwidth Memory (HBM)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; into the compute cores. The GPU then computes that one token and repeats the entire process for the next one.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This mismatch is the fundamental reason that the frontier exists. You can't optimize a single system for both phases simultaneously without making some tradeoffs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
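The gap between the two phases is easy to quantify with a back-of-the-envelope roofline model. The sketch below uses illustrative, assumed numbers (an 8B-parameter FP16 model, ~1 PFLOP/s of tensor compute, ~2 TB/s of HBM bandwidth); the point is the shape of the bounds, not the exact figures.

```python
# Roofline sketch: prefill is limited by compute, decode by memory bandwidth.
# All hardware numbers below are illustrative assumptions.

PARAMS = 8e9          # model parameters (assumed 8B model)
BYTES_PER_PARAM = 2   # FP16 weights
FLOPS = 1e15          # peak tensor compute, FLOP/s (assumed)
HBM_BW = 2e12         # memory bandwidth, bytes/s (assumed)

def prefill_seconds(prompt_tokens: int) -> float:
    """Prefill is compute-bound: roughly 2 * params FLOPs per token,
    all processed in parallel."""
    return (2 * PARAMS * prompt_tokens) / FLOPS

def decode_seconds_per_token(batch_size: int = 1) -> float:
    """Decode is bandwidth-bound: every step re-reads all weights from HBM,
    amortized across the batch."""
    return (PARAMS * BYTES_PER_PARAM) / HBM_BW / batch_size

ttft = prefill_seconds(2048)        # time to first token for a 2K prompt
tbt = decode_seconds_per_token()    # time between tokens at batch size 1

print(f"TTFT lower bound: {ttft * 1e3:.1f} ms")
print(f"TBT  lower bound: {tbt * 1e3:.1f} ms -> {1 / tbt:.0f} tok/s")
```

Under these assumptions, a 2K-token prefill takes tens of milliseconds while decode at batch size 1 is capped at a few hundred tokens per second, and batching raises decode throughput almost for free, which is exactly the tradeoff the frontier curve describes.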
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/prefill-vs-decode.max-1000x1000.jpg"
        
          alt="prefill-vs-decode"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The two axes of inference&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of risk and return, the efficient frontier of LLM inference measures a different fundamental tradeoff, with the assumption that the hardware budget is fixed:&lt;/span&gt;&lt;/p&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table border="1" style="border-collapse: collapse; width: 99.9748%;"&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="width: 31.5124%;"&gt;&lt;strong&gt;Axis&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 31.5124%;"&gt;&lt;a href="https://bentoml.com/llm/inference-optimization/llm-inference-metrics" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Key metrics measured&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td style="width: 31.5133%;"&gt;&lt;strong&gt;Hardware constraint&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="width: 31.5124%;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Latency (the X-Axis)&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 31.5124%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Time to First Token (TTFT) + Time Between Tokens (TBT)&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 31.5133%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Compute (prefill) and memory bandwidth (decode)&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="width: 31.5124%;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Throughput (the Y-Axis)&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 31.5124%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Total tokens per second across all concurrent users&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 31.5133%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Batch size × memory capacity&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cost is the constraint that buys the curve itself: increase your hardware budget, or wait for the industry to deliver a new algorithmic breakthrough, and the entire frontier shifts outward. For a given budget and software stack, you can apply today's best practices to move from a sub-optimal point &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;towards&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; that frontier.&lt;/span&gt;&lt;/p&gt;
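As a minimal sketch of how the two axes translate into request-level numbers (all figures below are illustrative, and in practice TBT itself grows as the batch gets larger):

```python
def request_latency(ttft_s: float, tbt_s: float, output_tokens: int) -> float:
    """End-to-end latency for one request: time to the first token,
    plus the per-token gap for each subsequent token."""
    return ttft_s + tbt_s * (output_tokens - 1)

def aggregate_throughput(tbt_s: float, batch_size: int) -> float:
    """Tokens/s across all concurrent users when decode steps are batched."""
    return batch_size / tbt_s

# Illustrative numbers: 50 ms TTFT, 20 ms TBT, 256 output tokens, 32 users.
print(f"{request_latency(0.050, 0.020, 256):.2f} s per request")
print(f"{aggregate_throughput(0.020, 32):.0f} tok/s across the batch")
```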
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Getting to the frontier: Five techniques within your control&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Most production inference systems today operate &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;below&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the frontier. They're leaving performance on the table, not because better techniques don't exist, but because they haven't adopted them yet. Everything described in this section is available today. If you're not applying these techniques, you're choosing to operate below the curve.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/interventions.max-1000x1000.jpg"
        
          alt="interventions"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;1. Semantic routing across model tiers&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Not every query needs a 400B-parameter model. Simple classification, summarization, or formatting tasks can be routed to smaller, quantized models that are orders of magnitude cheaper per token. A lightweight classifier at the gateway edge analyzes query complexity and routes accordingly: frontier-class models for hard reasoning, and small models for everything else.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/how-gke-inference-gateway-improved-latency-for-vertex-ai?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Semantic routing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; pushes your system dramatically closer to its theoretical maximum throughput, and avoids wasted cycles on easy tasks, without sacrificing aggregate output quality.&lt;/span&gt;&lt;/p&gt;
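A minimal sketch of what such a gateway-edge router might look like. The tier names and the scoring heuristics below are hypothetical; a production router would typically use a trained lightweight classifier rather than keyword rules.

```python
# Hypothetical gateway-edge router: score query complexity cheaply,
# then pick a model tier. Tier names and rules are illustrative only.

SMALL_MODEL = "small-8b-int4"      # assumed tier name
FRONTIER_MODEL = "frontier-400b"   # assumed tier name

# Crude complexity signals; a real classifier learns these from data.
REASONING_MARKERS = ("prove", "step by step", "debug", "why does", "derive")

def route(query: str) -> str:
    """Send easy tasks to the small tier, hard reasoning to the frontier tier."""
    q = query.lower()
    hard = len(q.split()) > 200 or any(m in q for m in REASONING_MARKERS)
    return FRONTIER_MODEL if hard else SMALL_MODEL

print(route("Summarize this paragraph in one sentence."))      # small tier
print(route("Prove this invariant holds and debug the loop."))  # frontier tier
```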
&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;2. Prefill and decode disaggregation&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Physically separating prefill and decode phases onto different hardware is one of the most architecturally significant optimizations available today.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The prefill phase needs compute-dense GPUs. The decode phase needs high-bandwidth memory. If you force both phases onto the same GPU, then one resource is always underutilized.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To push both phases toward their theoretical hardware limits independently, run dedicated prefill clusters and decode clusters. Connect these clusters with high-speed networks that transfer only the compressed KV cache state from the prefill cluster to the decode cluster.&lt;/span&gt;&lt;/p&gt;
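One way to see why co-location wastes capacity is a toy utilization model. The per-phase utilization figures below are illustrative assumptions, not measurements:

```python
# Toy model: prefill keeps compute busy but barely touches bandwidth;
# decode is the reverse. Numbers are illustrative assumptions.

# (compute utilization, bandwidth utilization) for each phase on one GPU
PREFILL = (0.90, 0.20)
DECODE = (0.15, 0.85)

def colocated(prefill_frac: float) -> tuple[float, float]:
    """One shared pool runs both phases; utilization is a time-weighted blend."""
    d = 1 - prefill_frac
    return (PREFILL[0] * prefill_frac + DECODE[0] * d,
            PREFILL[1] * prefill_frac + DECODE[1] * d)

# With a 30/70 prefill/decode time split, the blended pool never saturates
# either resource, while dedicated pools run each phase near its own limit.
compute, bandwidth = colocated(0.30)
print(f"co-located:    compute {compute:.0%}, bandwidth {bandwidth:.0%}")
print(f"disaggregated: prefill pool compute {PREFILL[0]:.0%}, "
      f"decode pool bandwidth {DECODE[1]:.0%}")
```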
&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;3. Quantization: Trading precision for speed&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/best-practices/machine-learning/inference/llm-optimization#quantization?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;reduce model weights&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; from FP16 to the INT8 or INT4 formats, you cut the memory footprint to half or a quarter. Because the decode phase is memory-bandwidth-bound, 4-bit weights can be read up to 4× faster than 16-bit weights. This approach provides a direct TBT improvement.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The tradeoff is quality because naive quantization degrades model outputs. Modern techniques like Activation-aware Weight Quantization (AWQ) and GPTQ preserve the quality of sensitive weights, but aggressively compress others, to achieve near-FP16 quality at INT4 speeds.&lt;/span&gt;&lt;/p&gt;
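The memory arithmetic behind that speedup is straightforward. A small sketch, using an illustrative 70B-parameter model (weights only; KV cache and activations add more on top):

```python
def weight_footprint_gb(params: float, bits: int) -> float:
    """GB needed to hold the weights alone, at a given precision."""
    return params * bits / 8 / 1e9

PARAMS = 70e9  # illustrative 70B-parameter model
for bits in (16, 8, 4):
    gb = weight_footprint_gb(PARAMS, bits)
    # In the bandwidth-bound decode phase, reading fewer bytes per step
    # translates roughly linearly into a lower TBT.
    print(f"{bits:>2}-bit: {gb:.0f} GB of weights, "
          f"~{16 // bits}x faster weight reads than FP16")
```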
&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;4. Context routing: The biggest lever that most teams miss&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a production deployment with dozens of model replicas, the&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; routing layer &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;is where the biggest competitive advantages are won or lost today.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In 2026, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vertex-ai/generative-ai/docs/open-models/model-garden-published-notebooks/model_garden_advanced_features#prefix_caching_"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;prefix caching&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is foundational. If ten users ask questions about the exact same 100-page RAG document, or use the identical massive system prompt, you shouldn't run the compute-heavy prefill phase ten times. You should compute the KV cache once, store it, and then let the other nine users reuse it. This approach slashes TTFT by up to 85% and drastically reduces compute costs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But, there's a catch: a standard L4 load balancer scatters requests randomly. If user 2's request lands on a different GPU than user 1's request, the prefix cache is useless. The system has to recompute the cache from scratch.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is why &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;context-aware L7 routing&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; is the differentiator. An intelligent router inspects the incoming prompt's prefix and intentionally routes the request to the specific pod that &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;already holds that context in its cache&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. You stop wasting compute power on redundant work and instantly push your latency and throughput closer to the physical limits of your hardware.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
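A minimal sketch of the idea behind prefix affinity, using a hash of the prompt's leading bytes as the routing key. The pod names and the 1 KB key length are assumptions; a production L7 router matches against actual per-pod cache state rather than a static hash, but the effect is the same: shared prefixes land on the same replica.

```python
import hashlib

# Hypothetical prefix-affinity router: requests that share the same leading
# bytes (system prompt, RAG document) hash to the same pod, so only the
# first request pays the prefill cost for that prefix.

PODS = ["pod-a", "pod-b", "pod-c", "pod-d"]  # assumed replica names
PREFIX_CHARS = 1024  # how much of the prompt forms the routing key (assumed)

def pick_pod(prompt: str) -> str:
    """Deterministically map the prompt's prefix to one replica."""
    key = hashlib.sha256(prompt[:PREFIX_CHARS].encode()).digest()
    return PODS[int.from_bytes(key[:4], "big") % len(PODS)]

# A long shared system prompt (longer than the routing key, so per-user
# question text never changes the key).
system_prompt = "You are a support agent for ExampleCo. Answer politely. " * 30

# Ten users with different questions but the same prefix hit one pod.
pods = {pick_pod(system_prompt + f" user question {i}") for i in range(10)}
print(pods)  # a single pod, because the routing key is identical
```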
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/prefix-aware-routing.max-1000x1000.jpg"
        
          alt="prefix-aware-routing"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;5. Speculative decoding&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Remember: during the decode phase, tensor cores are mostly idle because the bottleneck is memory bandwidth. &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/blog/posts/from-research-to-production-accelerate-oss-llm-with-eagle-3-on-vertex?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Speculative decoding&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; exploits this idle compute capacity.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A small, fast "draft" model generates several candidate tokens cheaply. The large target model then &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;verifies&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; all of the candidates in a single forward pass, which is a parallel compute-bound operation, rather than a sequential memory-bound one. If the draft model predicted the candidates correctly, you've generated 4-5 tokens for the memory cost of one.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This approach directly breaks the TBT floor set by memory bandwidth. If you're not using speculative decoding for latency-sensitive workloads, you're not leveraging one of the most impactful optimizations available.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Although the addition of a draft model can introduce some operational complexity and slightly increase compute costs, the draft model is relatively tiny compared to the main model. This tradeoff for latency is worthwhile.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Note that some newer models have introduced &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;self-speculative decoding&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, which eliminates the overhead of managing a second model. These models use specialized internal layers (often called &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;prediction heads&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;) that are trained to predict several future tokens simultaneously. These heads typically achieve high acceptance rates for their drafted tokens.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Case study: How Vertex AI moved closer to the frontier&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Vertex AI engineering team moved closer to the frontier when they adopted &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/how-gke-inference-gateway-improved-latency-for-vertex-ai?utm_campaign=CDR_0x2b6f3004_default&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which is built on the standard Kubernetes Gateway API. Inference Gateway intercepted requests at Layer 7 and added two critical layers of intelligence:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Load-aware routing&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: It scraped real-time metrics (like KV cache utilization and queue depth) directly from the model server's Prometheus endpoints, and routed each request to the pod that could serve it the fastest.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Content-aware routing&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Crucially, it inspected request prefixes and routed traffic to the pod that already held that specific context in its KV cache, avoiding expensive re-computation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
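A minimal sketch of the load-aware half of that logic. The metric names, weights, and pod inventory below are illustrative, not the gateway's actual scoring function:

```python
# Hypothetical load-aware pod selection from scraped metrics. Real gateways
# read live model-server metrics (KV cache utilization, queue depth) rather
# than this static snapshot.

pods = {
    "pod-a": {"kv_cache_util": 0.90, "queue_depth": 6},
    "pod-b": {"kv_cache_util": 0.40, "queue_depth": 1},
    "pod-c": {"kv_cache_util": 0.75, "queue_depth": 3},
}

def pick_least_loaded(metrics: dict) -> str:
    """Score each pod by a weighted blend of cache pressure and queueing,
    then pick the least-loaded one. Weights are illustrative."""
    def score(m: dict) -> float:
        return 0.5 * m["kv_cache_util"] + 0.5 * (m["queue_depth"] / 10)
    return min(metrics, key=lambda name: score(metrics[name]))

print(pick_least_loaded(pods))  # pod-b: lowest cache pressure and queue
```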
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When the production workloads were migrated to this intelligent routing architecture, the Vertex AI team proved that optimizing the network layer is key to unlocking performance at scale. Validated on production traffic, the results were stark:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;35% faster TTFT&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for Qwen3-Coder (context-heavy coding agent workloads)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;2x better P95 tail latency&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (52% improvement) for DeepSeek V3.1 (bursty chat workloads)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Doubled prefix cache hit rate&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (optimized from 35% to 70%)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The bottom line&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;LLM inference has an efficient frontier, which represents a hard boundary where latency and throughput are optimally balanced for a given compute budget.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Getting to that frontier is within your control&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. The techniques exist today: continuous batching, paged attention, intelligent L7 routing, speculative decoding, quantization, and prefill and decode disaggregation. The GKE Inference Gateway case study shows that routing alone, without changing hardware, models, or cluster size, cut TTFT by 35% and doubled cache efficiency. If you're not applying the full stack, you're operating below the curve and overpaying for every token.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;The frontier itself keeps moving outward&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This part is outside of your control. Researchers publish new algorithms, hardware vendors ship new architectures, and open-source serving frameworks integrate these algorithms and architectures. What was a cutting-edge optimization 18 months ago is now table stakes. Your job isn't to predict which breakthrough comes next; it's to build infrastructure flexible enough to absorb it when it arrives.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The organizations that will win on inference economics aren't the ones with the most GPUs. They're the ones that systematically close the gap to today's frontier while they stay ready for tomorrow's.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Have you applied any of these optimization techniques to your own LLM inference workloads? I'd love to hear about your experience! Share what you've built with me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/karlweinmeister/" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://x.com/kweinmeister" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;, or &lt;/span&gt;&lt;a href="https://bsky.app/profile/kweinmeister.bsky.social" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Bluesky&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 27 Mar 2026 10:02:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/five-techniques-to-reach-the-efficient-frontier-of-llm-inference/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/hero-image.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Five techniques to reach the efficient frontier of LLM inference</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/hero-image.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/five-techniques-to-reach-the-efficient-frontier-of-llm-inference/</url></og><author 
xmlns:author="http://www.w3.org/2005/Atom"><name>Karl Weinmeister</name><title>Director, Developer Relations</title><department></department><company></company></author></item><item><title>The new AI literacy: Insights from student developers</title><link>https://cloud.google.com/blog/topics/developers-practitioners/how-uc-berkeley-students-use-ai-as-a-learning-partner/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI has made it easier than ever for student developers to work efficiently, tackle harder problems, and pursue ambitious projects. But for students earning technical degrees, these new capabilities also create genuine tensions around learning. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;How much should I use AI? What should I use it for? &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As 90% of technology professionals now use AI in their daily work according to &lt;/span&gt;&lt;a href="https://dora.dev/dora-report-2025/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google's DORA 2025 report&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, understanding how the next generation navigates these tools matters more than ever. Contrary to fears that students use AI to cheat or are becoming intellectually lazy, our research with UC Berkeley students reveals something different. Students treated AI as a learning partner rather than a shortcut, using it strategically for some tasks while deliberately turning it off for others. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As AI becomes foundational to software development, the question isn't whether to adopt these tools but how to work with them thoughtfully. The students at UC Berkeley are showing us one answer: with curiosity, caution, and a commitment to genuine learning that technology can support but never replace.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The research&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our team of four student researchers (Andrew Harlan, Mindy Tsai, Kenny Ly, and Karissa Wong) conducted a mixed methods research project with UC Berkeley students in Computer Science, Electrical Engineering, Design, and Data Science to understand how they're integrating AI into their academic work. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A separate UC Berkeley study (conducted by Edward Fraser, Jessie Deng, and Eileen Thai) used eye-tracking technology to observe how developers with one to five years of experience actually interact with AI coding assistants. Both student teams were supported by dedicated mentors, with Googlers Harini Sampath, Becky Sohn, and Derek DeBellis advising the mixed methods research, and UC Berkeley Professor John Chuang, PhD, advising the eye-tracking study.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Together, these studies reveal three key insights about how students balance AI's capabilities with their need to develop genuine expertise. The patterns emerging among students closely mirror what DORA research has found in professional developers.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Finding #1: The 24/7 office hour&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;AI as a tutor, not a shortcut&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When asked to describe their relationship with AI, every student in our study used educational terms. They referred to AI as a "tutor" or "teacher," not an assistant or productivity tool.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"AI is a teacher...in the sense that it is most helpful for understanding dense content and potentially parts of code that are prewritten in the database to allow for fundamental understanding of the project."&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"I use [AI] as my own private tutor...to [cover] any specific topics in the classes or lectures...not just in CS classes but in all classes."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This framing matters because it reveals strategic use rather than dependency. Rather than asking AI to complete assignments, students described using AI metacognitively to identify gaps in their knowledge, clarify confusing concepts, and guide their learning process. They used AI to summarize academic papers mentioned in lectures so they could decide which ones warranted deeper reading. They asked AI to explain why their code produced specific errors.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One student explained their workflow:&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"When I don't understand what my professor is explaining, I ask AI to help me understand the concept or what a piece of code is doing. If I don't know how to begin a lab, I give the prompt to AI to figure out where to start, then write the code myself and ask AI to correct my work."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For students with learning disabilities, this constant availability addresses a real access gap:&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"As a student with a learning disability, I need more time to understand a problem. AI has helped me a lot—it's like having a 24/7 TA."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By extending access beyond limited office hours, AI allows students to iterate on their understanding without waiting for help. This frees up cognitive space for higher-level thinking:&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"I spend less time actually coding and more time on big picture ideation. Now, my time is spent thinking through logic, concepts, and coming up with ideas creatively, rather than producing code manually."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These accounts portray AI as a scaffold for exploration rather than a producer of finished work. This mirrors what DORA research found: when AI handles routine toil, developers can focus more energy on delivering user value.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Finding #2: Active resistance to overdependence&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Building guardrails to protect learning&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Despite embracing AI as a learning tool, students expressed genuine anxiety about becoming too dependent on it.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"If AI disappeared, I'd struggle more with figuring out how to solve things on my own."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a recent study using EEG to measure brain activity during essay writing, researchers found that AI users showed weaker cognitive engagement patterns than those using search engines or no tools. Frequent AI users who later wrote without assistance also remembered less of their content and felt less ownership over it, a deficit the authors termed "cognitive debt."&lt;sup&gt;1&lt;/sup&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our research revealed a positive signal: rather than passively accepting this risk, students responded by establishing deliberate boundaries.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One mechanical engineering student described how she's developed a competency-based system over years of working with electronics: &lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"When I use basic sensors like a servo or ultrasonic, I can still code that myself. But when I have more complex sensors where I don't necessarily know the exact functions, that's when I'll use AI." She explained her reasoning: "I have the background to understand why things aren't working, but I don't always know the direct language to fix it, so AI is good for helping overcome that."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For a recent project building a tactile storytelling tool, she knew the basic concept but needed help structuring the counting and comparison system. "AI was really useful in setting up that structure, but I still had to code after to fine-tune it." She's clear about the division of labor: "I'm still working with doing the code myself. I wouldn't say that I'm just handing it off like a technical expert. I'm working in tandem with it. I have to be the initiator of what I want it to actually do. If I just give it a blind request, it's not useful at all."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Even when students do engage AI, they often set explicit rules:&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"Sometimes I tell AI not to give me the full answer, just to guide me in the right direction."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Students have developed several specific strategies to prevent overreliance:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Limiting access to powerful models:&lt;/strong&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"I don't want to pay for AI tools because it could lead me to overuse the models."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Alternating between assisted and unassisted work:&lt;/strong&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"I have actually gone back to hand-coding for certain things, like a for-loop for example."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Warning against "vibe coding":&lt;/strong&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"AI tools can definitely be a good companion to boost developer productivity. However, one needs to be very mindful and not get used to vibe coding. It's very important to understand and validate the code AI is generating and use it appropriately."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This anxiety is itself a form of metacognitive awareness. Students recognize that the path of least resistance may not be the path of greatest learning. This mirrors DORA's findings: despite 90% adoption, about 30% of practitioners report little to no trust in AI-generated code. Effective AI use requires mastering critical evaluation and verification, not just adoption.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Finding #3: Knowing when to use AI and when to turn it off&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;What the eye-tracking data reveals&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A separate study using eye-tracking technology provides behavioral validation. When researchers observed developers with one to five years of experience interacting with AI coding assistants, they found stark differences in AI engagement depending on task type:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;During interpretive tasks&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; requiring deep understanding: &amp;lt;1% visual attention on AI&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;During mechanical tasks&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; like boilerplate code: 19% visual attention on AI&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Developers actively ignored AI suggestions during complex work, even when those suggestions were accurate and could save time. AI creates cognitive load during deep understanding work, and experienced developers know when to turn it off.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Strategic selectivity, not blanket adoption&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Students in our interviews echoed this context-dependent approach:&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"I typically use AI to generate ideas for a starting point."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"Despite knowing AI was allowed, I wanted to go through the friction of learning and failing and having space for creativity."&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Customization matters&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Most AI coding assistants now let developers toggle inline suggestions, enable on-demand only modes, or adjust suggestion frequency. By experimenting with these settings, developers can align AI behavior with the cognitive demands of different tasks, reducing disruption during deep work while maintaining assistance for routine tasks.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;What this means for the industry&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Students are modeling the future of AI-augmented development&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The students in these studies are ahead of the curve. They've developed a literacy that knows when to engage AI, how to verify its output, and when to work manually to preserve understanding. For teams navigating AI adoption, the student experience offers direction:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Experiment with customization&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to find configurations that support rather than disrupt work&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build verification practices&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; into workflows rather than accepting suggestions uncritically&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Create space for unassisted work&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; on complex problems where understanding matters more than speed&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As AI becomes foundational to software development, the question isn't whether to adopt these tools but how to work with them thoughtfully. The students at UC Berkeley are showing us one answer: with curiosity, caution, and a commitment to genuine learning that technology can support but never replace.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more about how professionals across the industry are navigating AI adoption, &lt;/span&gt;&lt;a href="https://dora.dev/dora-report-2025/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;download the DORA 2025 State of AI-assisted Software Development Report&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. You can also &lt;/span&gt;&lt;a href="https://dora.dev/insights/tags/uc-berkeley/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;read the full research articles from our collaboration with researchers at UC Berkeley.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sup&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;1. Kosmyna, Nataliya, et al. "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task." &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;arXiv&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, 10 June 2025, doi:10.48550/arXiv.2506.08872. Accessed 28 Jan. 2026.&lt;/span&gt;&lt;/em&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 26 Mar 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/how-uc-berkeley-students-use-ai-as-a-learning-partner/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>The new AI literacy: Insights from student developers</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/how-uc-berkeley-students-use-ai-as-a-learning-partner/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Andrew Harlan, Ph.D.</name><title>UX Researcher &amp; Creative Technologist, Independent</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Steve Fadden, Ph.D.</name><title>UX Research Lead, Google</title><department></department><company></company></author></item><item><title>Building Distributed AI Agents</title><link>https://cloud.google.com/blog/topics/developers-practitioners/building-distributed-ai-agents/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let's be honest: building an AI agent that works &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;once&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; is easy. 
Building an AI agent that works &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;reliably&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in production, integrated with your existing React or Node.js application? That's a whole different ball game.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;(TL;DR: Want to jump straight to the code? Check out the &lt;/span&gt;&lt;a href="https://github.com/amitkmaraj/course-creation-ai-agent-architecture" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Course Creator Agent Architecture on GitHub&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;.)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We've all been there. You have a complex workflow—maybe it's researching a topic, generating content, and then grading it. You shove it all into one massive Python script or a giant prompt. It works on your machine, but the moment you try to hook it up to your sleek frontend, things get messy. Latency spikes, debugging becomes a nightmare, and scaling is impossible without duplicating the entire monolith.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But what if you didn't have to rewrite your entire application to accommodate AI? What if you could just... plug it in?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this post, we're going to explore a better way: the &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;orchestrator pattern&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. Instead of just one powerful agent that does everything, we'll build a team of specialized, distributed microservices. This approach lets you integrate powerful AI capabilities directly into your existing frontend applications without the headache of a monolithic rewrite.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We'll use Google's &lt;/span&gt;&lt;a href="https://github.com/google/adk-python" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to build the agents, the &lt;/span&gt;&lt;a href="https://a2a-protocol.org" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent-to-Agent (A2A)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; protocol to connect them and let them communicate with each other, and deploy them as scalable microservices on &lt;/span&gt;&lt;a href="https://cloud.google.com/run"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Why Distributed Agents? (And Why Your Frontend Team Will Love You)&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Imagine you have a polished Next.js application. You want to add a "Course Creator" feature.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you build a monolithic agent, your frontend has to wait for a single, long-running process to finish everything. If the research part hangs, the whole request times out. You also can’t scale individual agents independently. For example, if your judge agent needs more processing, you’ll have to scale &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;all&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; your agents up, instead of just the judge agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By adopting a distributed orchestrator pattern, you gain scalability and flexibility:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Seamless integration:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Your frontend talks to one endpoint (the orchestrator), which manages the chaos behind the scenes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Independent scaling:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Is the judge step slow? Scale just that service to 100 instances. Your research service can stay small.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Modularity:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can write the high-performance networking parts in Go and the data science parts in Python. They just speak HTTP.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
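To make the "seamless integration" point concrete, here is a minimal sketch of the frontend-facing surface: one function, one orchestrator URL. The URL and payload shape are placeholders (not from the repo), and the HTTP transport is injected so the routing logic is visible without a network call; in a real app you would pass something like `requests.post`.

```python
import json

# Hypothetical endpoint; the only URL the frontend ever needs to know.
ORCHESTRATOR_URL = "https://orchestrator.example.com/rpc"

def create_course(topic, post):
    """Send one request to the orchestrator; it fans out to the
    researcher and judge services internally."""
    payload = {"task": "create_course", "topic": topic}
    return post(ORCHESTRATOR_URL, json.dumps(payload))

# The caller never learns about researcher-service or judge-service:
sent = []
create_course("Intro to Rust", post=lambda url, body: sent.append((url, body)))
print(sent[0][0])  # only the orchestrator URL appears
```

The researcher and judge services can move, scale, or be rewritten without the frontend changing a line.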
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Blueprint: Course Creator App&lt;/span&gt;&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/building-distributed-ai-agents-course-creator.gif"
        
          alt="building-distributed-ai-agents-course-creator"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let's build that course creator system. We'll break it down into three distinct specialists:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The researcher&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: A specialist that digs up information.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The judge&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: A QA specialist that ensures quality.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The orchestrator&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The manager that coordinates the work and talks to your frontend.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: Hiring the Specialist (The Researcher)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, we need someone to do the legwork. We'll build a focused agent using ADK whose only job is to use Google Search.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# researcher/app/agent.py\r\nfrom google.adk.agents import Agent\r\nfrom google.adk.tools import google_search\r\n\r\nresearcher = Agent(\r\n    name=&amp;quot;researcher&amp;quot;,\r\n    model=&amp;quot;gemini-2.5-flash&amp;quot;,\r\n    description=&amp;quot;Gathers information on a topic using Google Search.&amp;quot;,\r\n    instruction=&amp;quot;&amp;quot;&amp;quot;\r\n    You are an expert researcher. Your goal is to find comprehensive information.\r\n    Use the `google_search` tool to find relevant information.\r\n    Summarize your findings clearly.\r\n    &amp;quot;&amp;quot;&amp;quot;,\r\n    tools=[google_search],\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726c776af0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;See? Simple. It doesn't know about courses or frontends. It just researches.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: The Judge (Structured Output)&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/building-distributed-ai-agents-judge.max-1000x1000.png"
        
          alt="building-distributed-ai-agents-judge"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We can't have our agents rambling. We need strict pass or fail grades so our code can make decisions. We use &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Pydantic&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to enforce this contract.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# judge/app/agent.py\r\nfrom pydantic import BaseModel, Field\r\nfrom typing import Literal\r\n\r\nclass JudgeFeedback(BaseModel):\r\n    status: Literal[&amp;quot;pass&amp;quot;, &amp;quot;fail&amp;quot;] = Field(\r\n        description=&amp;quot;Whether the research is sufficient (\&amp;#x27;pass\&amp;#x27;) or needs more work (\&amp;#x27;fail\&amp;#x27;).&amp;quot;\r\n    )\r\n    feedback: str = Field(\r\n        description=&amp;quot;Detailed feedback on what is missing.&amp;quot;\r\n    )\r\n\r\njudge = Agent(\r\n    name=&amp;quot;judge&amp;quot;,\r\n    model=&amp;quot;gemini-2.5-flash&amp;quot;,\r\n    description=&amp;quot;Evaluates research findings.&amp;quot;,\r\n    instruction=&amp;quot;&amp;quot;&amp;quot;\r\n    You are a strict editor. Evaluate the findings.\r\n    If they are missing key info, output status=\&amp;#x27;fail\&amp;#x27; and provide feedback.\r\n    &amp;quot;&amp;quot;&amp;quot;,\r\n    output_schema=JudgeFeedback, # Enforce the contract!\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726c776a90&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, when the judge speaks, it speaks JSON. Your application logic can trust it.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: The Universal Language (A2A Protocol)&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/building-distributed-ai-agents-a2a-protoco.max-1000x1000.png"
        
          alt="building-distributed-ai-agents-a2a-protocol"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here's the magic. We wrap these agents as web services using the &lt;/span&gt;&lt;a href="https://a2a-protocol.org" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2A Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Think of it as a universal language for agents. It lets them describe what they do (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;agent.json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) and talk over standard HTTP.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# researcher/app/server.py\r\nfrom fastapi import FastAPI\r\nfrom a2a.server.apps import A2AFastAPIApplication\r\nfrom app.agent import app as adk_app\r\n\r\n# ... setup runner ...\r\n\r\n# Create the A2A App wrapper\r\na2a_app = A2AFastAPIApplication(agent_card=agent_card, http_handler=request_handler)\r\n\r\napp = FastAPI(lifespan=lifespan)\r\n\r\n# Register routes: /.well-known/agent.json and /rpc\r\na2a_app.add_routes_to_app(app)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726c776760&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, your researcher is a microservice running on port 8000. It's ready to be called by anyone—including your orchestrator.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 4: The Orchestrator Pattern&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/building-distributed-ai-agents-orchestrato.max-1000x1000.png"
        
          alt="building-distributed-ai-agents-orchestrator"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is where it all comes together. The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;orchestrator&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; is the general contractor. It doesn't do the research; it hires the researcher. It doesn't make judgments; it asks the judge.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Crucially, &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;this is the only agent your frontend needs to know about&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# orchestrator/app/agent.py\r\nfrom google.adk.agents import LoopAgent, SequentialAgent\r\nfrom google.adk.agents.remote_a2a_agent import RemoteA2aAgent\r\n\r\n# Connect to the remote Researcher service\r\nresearcher = RemoteA2aAgent(\r\n    name=&amp;quot;researcher&amp;quot;,\r\n    agent_card=&amp;quot;http://researcher-service:8000/.well-known/agent.json&amp;quot;,\r\n    description=&amp;quot;Gathers information on a topic.&amp;quot;\r\n)\r\n\r\n# Connect to the remote Judge service\r\njudge = RemoteA2aAgent(\r\n    name=&amp;quot;judge&amp;quot;,\r\n    agent_card=&amp;quot;http://judge-service:8000/.well-known/agent.json&amp;quot;,\r\n    description=&amp;quot;Evaluates research findings.&amp;quot;\r\n)\r\n\r\n# The Orchestrator manages the loop\r\nresearch_loop = LoopAgent(\r\n    name=&amp;quot;research_loop&amp;quot;,\r\n    sub_agents=[researcher, judge, escalation_checker],\r\n    max_iterations=3,\r\n)\r\n\r\n# The full pipeline\r\nroot_agent = SequentialAgent(\r\n    name=&amp;quot;course_creation_pipeline&amp;quot;,\r\n    sub_agents=[research_loop, content_builder],\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726c776dc0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The orchestrator handles the complexity—retries, loops, state management—so your frontend stays clean and simple.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Deployment: The "Grocery Store" Model&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying this system on Cloud Run gives you what I call the "grocery store" model. If the checkout lines (researcher tasks) get long, you don't build a new store. You just open more registers. Cloud Run scales your researcher service independently to handle the load, while your judge service stays lean.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Caveats &amp;amp; Security Considerations&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Of course, with great power comes great responsibility (and security reviews).&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Authentication&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: In this demo, agents talk over open HTTP. In production, you &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;must&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; lock this down. Use mTLS, OIDC, or API keys to ensure that only your orchestrator can talk to your researcher.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Latency&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Every hop adds time. Use this pattern for coarse-grained tasks (like "research this topic") rather than chatty, low-level interactions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Error handling&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Networks fail. Your orchestrator needs to be robust enough to handle timeouts and retries gracefully.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
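The error-handling caveat above is worth a concrete shape. Below is a minimal retry-with-backoff helper of the kind the orchestrator needs around each downstream agent call; the delay values are illustrative, and the `sleep` function is injectable so the logic can be exercised without waiting.

```python
import time

def call_with_retries(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying transient timeouts with exponential backoff."""
    last_err = None
    for i in range(attempts):
        try:
            return fn()
        except TimeoutError as err:        # only retry transient failures
            last_err = err
            if i != attempts - 1:
                sleep(base_delay * (2 ** i))   # 0.5s, 1s, 2s, ...
    raise last_err

# A flaky downstream call that succeeds on the third try:
state = {"n": 0}
def flaky():
    state["n"] += 1
    if state["n"] >= 3:
        return "ok"
    raise TimeoutError("agent timed out")

print(call_with_retries(flaky, sleep=lambda s: None))  # ok
```

Catching only `TimeoutError` (rather than all exceptions) matters: retrying a request that fails deterministically, such as a malformed payload, just multiplies the damage.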
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to Build?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stop trying to build one giant agent that does it all. By using the &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;orchestrator pattern&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; and distributed microservices, you can build AI systems that are scalable, maintainable, and—best of all—play nicely with the apps that you already have.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Want to see the code? Check out the full &lt;/span&gt;&lt;a href="https://github.com/amitkmaraj/course-creation-ai-agent-architecture" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Course Creator Agent Architecture on GitHub&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;And if you're ready to deploy, get started with &lt;/span&gt;&lt;a href="https://cloud.google.com/run"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://github.com/google/adk-python" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://a2a-protocol.org" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2A&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to bring your agent team to life.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 18 Mar 2026 19:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/building-distributed-ai-agents/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/building-distributed-ai-agents-hero.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Building Distributed AI Agents</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/building-distributed-ai-agents-hero.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/building-distributed-ai-agents/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Amit Maraj</name><title>AI Developer Relations Engineer</title><department></department><company></company></author></item><item><title>Create Expert Content: Building Capabilities for a Multi-Agent System with Google ADK, MCP, and Cloud 
Run</title><link>https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;My team’s mission is to accelerate the developer journey from writing code to running secure AI workloads on Google Cloud. To help developers succeed, we focus on identifying their most pressing questions and building demos that provide straightforward, easy-to-implement solutions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Recently, I was struck with inspiration when the new &lt;/span&gt;&lt;a href="https://developers.google.com/knowledge/mcp?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Developer Knowledge MCP server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; was released. It led me to build &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Dev Signal&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;—a multi-agent system designed with &lt;/span&gt;&lt;a href="https://github.com/google/adk-python" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Agent Development Kit (ADK)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;—to identify technical questions from Reddit, research them using official documentation, and draft detailed technical blogs. 
&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Dev Signal&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; also provides custom visuals using &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/products/nano-banana-pro/?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Nano Banana Pro&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. I even integrated a long-term &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;memory&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; layer so the agent remembers my specific preferences and blogging style.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By connecting my coding assistant, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini/docs/codeassist/gemini-cli?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, to the developer knowledge MCP server, I built and deployed this entire system to &lt;/span&gt;&lt;a href="https://cloud.google.com/run/docs?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in just two days.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you want to learn how to architect a complex multi-agent system with long term memory, leverage local and remote MCP servers for tool standardization, or write detailed Terraform scripts for secure Cloud Run deployment, I'll show you how!&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you’d rather dive straight into the code and explore it at your own pace, you can clone the repository &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=abZxJiXGrJs"
      data-glue-modal-trigger="uni-modal-abZxJiXGrJs-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        &lt;img src="//img.youtube.com/vi/abZxJiXGrJs/maxresdefault.jpg"
             alt="A YouTube video that walks through a demo to set up the Dev Signal system"/&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-abZxJiXGrJs-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="abZxJiXGrJs"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=abZxJiXGrJs"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;What you'll learn&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;In this four-part blog series, I’ll walk you through the step-by-step process of how I brought this project to life. &lt;/span&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Each blog post captures the journey of building and deploying &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Dev Signal&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Part 1: Tools for building agent capabilities (this blog post) &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;– You’ll begin by setting up your project environment and equipping your agent with tools using the Model Context Protocol (MCP). You’ll learn how to connect to Reddit for trend discovery, Google Cloud docs for technical grounding, and a custom Nano Banana Pro tool for image generation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run"&gt;Part 2: The Multi-Agent Architecture with Long-term Memory&lt;/a&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;– You’ll build the "brain" of the system by implementing a root orchestrator and a team of specialized agents. You’ll also integrate the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI memory bank&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, enabling the agent to learn and persist your preferences across sessions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory"&gt;&lt;strong style="vertical-align: baseline;"&gt;Part 3: Testing the agent Locally&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; – Before moving to the cloud, you’ll synchronize the agent's components and verify its performance on your workstation. You’ll use a dedicated test runner to simulate the full lifecycle of discovery, research, and multimodal creation, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;with a special focus on validating long-term memory persistence by connecting your local agent directly to the cloud-based Vertex AI memory bank.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-deploying-a-multi-agent-system-with-terraform-and-cloud-run"&gt;Part 4: Deployment to Cloud Run and the Path to Production&lt;/a&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;– Finally, you’ll deploy your service on Google Cloud Run using Terraform for reproducible infrastructure. You’ll also discuss the next steps required for a high quality secure production system.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Getting started with Dev Signal&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Dev Signal&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; is an intelligent monitoring agent designed to filter noise and create value. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Dev Signal&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; operates in the following ways:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Discovery&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Scouts Reddit for high-engagement technical questions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Grounding&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Researches answers using official Google Cloud documentation to ensure accuracy.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Creation&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Drafts professional technical blog posts based on its findings.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multimodal Generation&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Generates custom infographic headers for those posts.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Long-Term Memory&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Uses &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Vertex AI memory bank&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to remember your feedback across different sessions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
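Conceptually, the first three stages above form a sequential pipeline in which each stage consumes the previous stage's output. A toy sketch of that flow (the stage functions and data shapes here are stand-ins, not Dev Signal's actual code):

```python
def discover():
    """Stand-in for the Reddit discovery step."""
    return [{"title": "How do I secure Cloud Run?", "score": 412}]

def ground(questions):
    """Stand-in for research grounded in official documentation."""
    return [{**q, "sources": ["cloud.google.com/run/docs"]} for q in questions]

def create(researched):
    """Stand-in for drafting a blog post from the grounded research."""
    return [{"draft": f"Answering: {r['title']}", "sources": r["sources"]}
            for r in researched]

def run_pipeline():
    # Chain the stages the way an orchestrator sequences its sub-agents.
    return create(ground(discover()))

posts = run_pipeline()
```

In the real system each stage is a specialized agent rather than a plain function, but the data-flow shape is the same: discovery feeds grounding, and grounding feeds creation.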
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Prerequisites&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before you begin, verify the following is installed: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Python 3.12+&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;uv&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (Python package manager): &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl -LsSf https://astral.sh/uv/install.sh | sh&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/sdk/docs/install?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud SDK&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; CLI) installed and authenticated.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://developer.hashicorp.com/terraform/install" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Terraform&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (for infrastructure as code).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.npmjs.com/downloading-and-installing-node-js-and-npm" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Node.js &amp;amp; npm&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (required for the Reddit MCP tool).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You will also need:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;A &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/resource-manager/docs/creating-managing-projects?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Project&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; with billing enabled.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/endpoints/docs/openapi/enable-api?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;APIs Enabled&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Vertex AI, Cloud Run, Secret Manager, Artifact Registry.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reddit API Credentials&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (Client ID, Secret) - You can get these from the &lt;/span&gt;&lt;a href="https://www.reddit.com/prefs/apps" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Reddit Developer Portal&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Developer Knowledge API Key&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (for Google Cloud docs search) - Instructions on how to get it are &lt;/span&gt;&lt;a href="https://developers.google.com/knowledge/mcp?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
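Since these credentials are read from the environment, it helps to fail fast when one is missing. A small sketch: the Reddit variable names match those used in the agent's Reddit tool configuration, while `DEVELOPER_KNOWLEDGE_API_KEY` is an illustrative placeholder (check the repository for the exact name it expects):

```python
import os

# The Reddit names match the Reddit MCP tool's expected environment variables;
# DEVELOPER_KNOWLEDGE_API_KEY is an illustrative placeholder.
REQUIRED_VARS = [
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "DEVELOPER_KNOWLEDGE_API_KEY",
]

def missing_credentials(environ=os.environ):
    """Return the required credential names absent (or empty) in environ."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]

# Demo against a deliberately incomplete environment:
missing = missing_credentials({"REDDIT_CLIENT_ID": "abc123"})
```

Running a check like this at startup turns a confusing mid-run tool failure into an immediate, actionable error message.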
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Project Setup&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Dev Signal&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; system was built by first running the&lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/agent-starter-pack" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; Agent Starter Pack,&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; following the automated architect workflow described in the &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=XCGbDx7aSks" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Factory episode&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; by &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Remigiusz Samborski&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/vkolesnikov/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vlad Kolesnikov&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This foundation provided the project’s modular directory structure, which is used to separate concerns between Agent Logic, Server Code, Utilities, and Tools.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The starter pack acts as a powerful starting point because it automates the creation of professional infrastructure, CI/CD pipelines, and observability tools in seconds. This allows you to focus entirely on the agent’s unique intelligence while ensuring the underlying platform remains secure and scalable. By building on top of this generated boilerplate with AI assistance from &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemini-cli-open-source-ai-agent/?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://antigravity.google/?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the development process is highly accelerated. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agent starter pack high level architecture:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/agentstarterpack.max-1000x1000.png"
        
          alt="agentstarterpack"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;1. Initialize the Project&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a new directory for your project and initialize it. We'll use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, which is an extremely fast Python package manager.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;uv init dev-signal&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51940&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;2. Folder Structure&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our project will follow this structure. We will populate these files step-by-step.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;dev-signal/\r\n├── dev_signal_agent/\r\n│   ├── __init__.py\r\n│   ├── agent.py           # Agent logic &amp;amp; orchestration\r\n│   ├── fast_api_app.py    # Application server &amp;amp; memory connection\r\n│   ├── app_utils/         # Env Config\r\n│   │   └── env.py\r\n│   └── tools/             # External capabilities\r\n│       ├── __init__.py\r\n│       ├── mcp_config.py  # Tool configuration (Reddit, Docs)\r\n│       └── nano_banana_mcp/# Custom local image generation tool\r\n│           ├── __init__.py\r\n│           ├── main.py\r\n│           ├── nano_banana_pro.py\r\n│           ├── media_models.py\r\n│           ├── storage_utils.py\r\n│           └── requirements.txt\r\n├── deployment/\r\n│   └── terraform/         # Infrastructure as Code\r\n├── .env                   # Local secrets (API keys)\r\n├── Makefile               # Shortcuts for building/deploying\r\n├── Dockerfile             # Container definition\r\n└── pyproject.toml         # Dependencies&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51760&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;3. Define Dependencies&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Update your &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pyproject.toml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with the necessary dependencies. We use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-adk&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for the agent framework and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-genai&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for the model interaction.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;[project]\r\nname = &amp;quot;dev-signal&amp;quot;\r\nversion = &amp;quot;0.1.0&amp;quot;\r\ndescription = &amp;quot;A multi-agent system for monitoring and content creation.&amp;quot;\r\nreadme = &amp;quot;README.md&amp;quot;\r\nrequires-python = &amp;quot;&amp;gt;=3.12, &amp;lt;3.14&amp;quot;\r\ndependencies = [\r\n     &amp;quot;google-adk&amp;gt;=0.1.0&amp;quot;,\r\n    \xa0&amp;quot;google-genai&amp;gt;=1.0.0&amp;quot;,\r\n     &amp;quot;mcp&amp;gt;=1.0.0&amp;quot;,\r\n    \xa0&amp;quot;python-dotenv&amp;gt;=1.0.0&amp;quot;,\r\n     &amp;quot;fastapi&amp;gt;=0.110.0&amp;quot;,\r\n     &amp;quot;uvicorn&amp;gt;=0.29.0&amp;quot;,\r\n     &amp;quot;google-cloud-logging&amp;gt;=3.0.0&amp;quot;,\r\n     &amp;quot;google-cloud-aiplatform&amp;gt;=1.38.0&amp;quot;,\r\n    \xa0&amp;quot;fastmcp&amp;gt;=2.13.0&amp;quot;,\r\n     &amp;quot;google-cloud-storage&amp;gt;=3.6.0&amp;quot;,\r\n     &amp;quot;google-auth&amp;gt;=2.0.0&amp;quot;,\r\n     &amp;quot;google-cloud-secret-manager&amp;gt;=2.26.0&amp;quot;,\r\n]&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51040&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Run &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uv sync&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to install everything.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a new directory for the agent code.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;mkdir dev_signal_agent\r\ncd dev_signal_agent&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51bb0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Building the agent capabilities: MCP tools &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our agent needs to interact with the outside world. We use the &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to standardize this. The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; is a universal standard for connecting AI agents to external data and tools. Instead of writing custom API wrappers, we use standard MCP servers. This allows us to connect to APIs (Reddit), Knowledge Bases (Google Cloud Docs), and even local scripts (Image Generation using Nano Banana Pro) using a common interface. Create a new directory for the agent tools.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;mkdir tools\r\ncd tools&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51d60&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Tools Configuration&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We'll define our toolsets in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/mcp_config.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This file defines the connection parameters for our three main tools.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reddit&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Connected via a local stdio subprocess.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Developer Knowledge&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Connected via a remote HTTP endpoint.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Nano Banana&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Connected via a local stdio subprocess (our custom Python script).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Reddit Search (Discovery Tool)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://github.com/Arindam200/reddit-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Reddit MCP server &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;acts as a bridge to the Reddit API, allowing your agent to discover trending posts and analyze engagement without you having to write complex API wrappers. To ensure portability, the code uses a "find or fetch" strategy: it first checks for a local installation and, if missing, automatically uses &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;npx&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to download and run the server on demand.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of a network connection, the agent launches the server as a local subprocess and communicates via standard input and output (stdio). Within the Google ADK, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;McpToolset&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; class acts as a universal wrapper that standardizes these connections, enabling your agent to interact with various tools, from community resources to custom scripts like the Nano Banana image generator, using a common interface. Because API credentials are passed through the subprocess environment rather than hard-coded, secrets stay out of your source while these plug-and-play modules bridge the agent and external platforms.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/mcp_config.py:&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import os\r\nimport shutil\r\nfrom mcp import StdioServerParameters\r\nfrom google.adk.tools import McpToolset\r\nfrom google.adk.tools.mcp_tool import StreamableHTTPConnectionParams, StdioConnectionParams\r\n\r\ndef get_reddit_mcp_toolset(client_id: str = &amp;quot;&amp;quot;, client_secret: str = &amp;quot;&amp;quot;, user_agent: str = &amp;quot;&amp;quot;):\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    Connects to the Reddit MCP server.\r\n    This server runs as a local subprocess (stdio) and proxies requests to the Reddit API.\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    # Check if \&amp;#x27;reddit-mcp\&amp;#x27; is installed globally, otherwise use npx to run it\r\n    cmd = &amp;quot;reddit-mcp&amp;quot; if shutil.which(&amp;quot;reddit-mcp&amp;quot;) else &amp;quot;npx&amp;quot;\r\n    args = [] if shutil.which(&amp;quot;reddit-mcp&amp;quot;) else [&amp;quot;-y&amp;quot;, &amp;quot;--quiet&amp;quot;, &amp;quot;reddit-mcp&amp;quot;]\r\n    \r\n    # Inject secrets into the environment of the subprocess only\r\n    env = {\r\n        **os.environ, \r\n        &amp;quot;DOTENV_CONFIG_SILENT&amp;quot;: &amp;quot;true&amp;quot;, \r\n        &amp;quot;LANG&amp;quot;: &amp;quot;en_US.UTF-8&amp;quot;\r\n    }\r\n\r\n    if client_id: env[&amp;quot;REDDIT_CLIENT_ID&amp;quot;] = client_id\r\n    if client_secret: env[&amp;quot;REDDIT_CLIENT_SECRET&amp;quot;] = client_secret\r\n    if user_agent: env[&amp;quot;REDDIT_USER_AGENT&amp;quot;] = user_agent\r\n\r\n    return McpToolset(\r\n        connection_params=StdioConnectionParams(\r\n            server_params=StdioServerParameters(\r\n                command=cmd, \r\n                args=args, \r\n                env=env # Pass injected secrets directly to the subprocess\r\n            ),\r\n            timeout=120.0\r\n        )\r\n    )&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), 
(&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51b20&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
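The "find or fetch" launch decision above can be factored into a pure function, which makes the fallback behavior easy to see and test in isolation. This is a minimal sketch; the `resolve_launch` helper name is ours, not part of the ADK or the Reddit MCP server.

```python
import shutil
from typing import List, Optional, Tuple

# Pure-function sketch of the "find or fetch" strategy: prefer a globally
# installed binary, otherwise fall back to npx so the package is downloaded
# and run on demand. (resolve_launch is an illustrative helper name.)
def resolve_launch(package: str, installed_path: Optional[str]) -> Tuple[str, List[str]]:
    """Return the (command, args) pair for launching an npm-distributed MCP server."""
    if installed_path:
        return package, []
    return "npx", ["-y", "--quiet", package]

# shutil.which returns the binary's path when it is on PATH, else None.
cmd, args = resolve_launch("reddit-mcp", shutil.which("reddit-mcp"))
```

Keeping the decision in one place means the same resolver can serve any npm-packaged MCP server, not just Reddit's.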
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud Docs (Knowledge Tool)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://developers.google.com/knowledge/mcp?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Developer Knowledge MCP server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; provides grounding for your agent by allowing it to search the entire corpus of official Google Cloud documentation. Unlike the local Reddit server, this is a managed service hosted by Google and accessed as a remote endpoint over the internet. It exposes specialized tools like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google_developer_documentation_search&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for semantic queries and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google_developer_documentation_fetch&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to retrieve full markdown content, ensuring that every technical claim the agent makes is supported by definitive, up-to-date facts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Note:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can also connect coding assistant tools such as &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemini-cli-open-source-ai-agent/?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;a href="https://antigravity.google/?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to the Developer Knowledge MCP server to give them handy, up-to-date Google Cloud documentation. I used it when writing this blog!&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To connect, the agent uses the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;McpToolset&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; class with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;StreamableHTTPConnectionParams&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, pointing to a web URL instead of launching a local process. It securely authenticates using a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;DK_API_KEY&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (&lt;/span&gt;&lt;a href="https://developers.google.com/knowledge/mcp?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;create your API key&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) passed in the request headers, allowing the agent to perform a "comprehensive research sweep" across official docs, community sentiment, and broader web context through a single standardized interface.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/mcp_config.py:&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;def get_dk_mcp_toolset(api_key: str = &amp;quot;&amp;quot;):\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    Connects to Developer Knowledge (Google Cloud Docs).\r\n    This is a remote MCP server accessed via HTTP.\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    headers = {}\r\n    if api_key:\r\n        headers[&amp;quot;X-Goog-Api-Key&amp;quot;] = api_key\r\n    else:\r\n        # Fallback to os.environ for local testing if not passed via API\r\n        headers[&amp;quot;X-Goog-Api-Key&amp;quot;] = os.getenv(&amp;quot;DK_API_KEY&amp;quot;, &amp;quot;&amp;quot;)\r\n\r\n    return McpToolset(\r\n        connection_params=StreamableHTTPConnectionParams(\r\n            url=&amp;quot;https://developerknowledge.googleapis.com/mcp&amp;quot;,\r\n            headers=headers\r\n        )\r\n    )&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51610&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
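Under the hood, the `McpToolset` speaks the MCP protocol for you: a tool invocation travels as a JSON-RPC 2.0 `tools/call` request. As a hedged sketch of what that wire message looks like (the query string below is illustrative, and you never build these messages by hand when using the ADK):

```python
import json

# Build an MCP "tools/call" JSON-RPC request, as an MCP client would send
# it over the streamable HTTP transport. The tool name comes from the
# server's advertised tool list; the query is a made-up example.
def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

payload = build_tool_call(
    "google_developer_documentation_search",
    {"query": "Cloud Run concurrency settings"},
)
```

Seeing the raw shape makes it clear why MCP tools are interchangeable: every server, local or remote, answers the same `tools/call` envelope.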
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Image Generator (Nano Banana MCP)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While we've used external MCP servers for Reddit and documentation, we can also build our own custom MCP server to wrap specific Python logic. In this case, we are creating an image generation tool powered by Gemini 3 Pro Image (also known as Nano Banana Pro). This demonstrates that any Python function can be standardized into a tool that any agent can understand.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;How the image generation works:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://gofastmcp.com/getting-started/welcome" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;FastMCP&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: We use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;fastmcp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; library to drastically simplify server creation, allowing us to register Python functions as tools with just a few lines of code.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Integration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The server uses the Google GenAI SDK to call the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini-3-pro-image-preview&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; model, which converts the agent's descriptive prompts into raw image bytes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GCS Upload &amp;amp; Hosting:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Because agent interfaces typically require a URL to display images, the server automatically uploads the generated bytes to Google Cloud Storage (GCS) and returns a public link.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To connect this local tool, we use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;StdioConnectionParams&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; because the server runs as a local subprocess communicating via standard input and output. This transport method directly matches the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;transport="stdio"&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; configuration we will define in our server entrypoint, ensuring a seamless connection for your custom local scripts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The following code defines the MCP connection in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/mcp_config.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. We use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uv run&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to ensure the server starts in an isolated environment with all its dependencies correctly installed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/mcp_config.py:&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;def get_nano_banana_mcp_toolset():\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    Connects to our local \&amp;#x27;Nano Banana\&amp;#x27; image generator.\r\n    This demonstrates how to wrap a local Python script as an MCP tool.\r\n    &amp;quot;&amp;quot;&amp;quot;\r\n    path = os.path.join(&amp;quot;dev_signal_agent&amp;quot;, &amp;quot;tools&amp;quot;, &amp;quot;nano_banana_mcp&amp;quot;, &amp;quot;main.py&amp;quot;)\r\n    bucket = os.getenv(&amp;quot;AI_ASSETS_BUCKET&amp;quot;)     \r\n    return McpToolset(\r\n        connection_params=StdioConnectionParams(\r\n            server_params=StdioServerParameters(\r\n                command=&amp;quot;uv&amp;quot;, \r\n                args=[&amp;quot;run&amp;quot;, path], \r\n                env={**os.environ, &amp;quot;AI_ASSETS_BUCKET&amp;quot;: bucket}\r\n            ),\r\n            timeout=600.0 # Image generation can take time\r\n        )\r\n    )&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa516d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Implementing the Nano Banana Pro Server Logic&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, we will implement the actual logic for this server. This implementation is based on the &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=XCGbDx7aSks&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=2" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Factory&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; demo &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/a9a5f64a3394a4b5ecc64061f397bd5ed82927ee/ai-ml/agent-factory-antigravity-nano-banana-pro/mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;code&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; by Remigiusz Samborski. While Remi's original code provides instructions for deploying the MCP server to Cloud Run, we will run it here as a local subprocess for faster development and testing.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get started, create the directory for our new server:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;mkdir -p dev_signal_agent/tools/nano_banana_mcp\r\ncd dev_signal_agent/tools/nano_banana_mcp&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa513a0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;The Server Entrypoint (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;main.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; )&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This file acts as the "brain" that initializes and starts the MCP server.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;FastMCP Initialization: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;FastMCP&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; library to create a server named "MediaGenerators" and register our &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;generate_image&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function as a tool&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Safe Logging: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;_initialize_console_logging&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function is critical. It forces all logs to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;sys.stderr&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. This is because the MCP "stdio" transport uses &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;sys.stdout&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for communication between the agent and the tool; standard logs sent to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;stdout&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; would corrupt that protocol.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Execution&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;mcp.run(transport="stdio")&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; line starts the server as a local subprocess, allowing it to listen for requests from your agent via standard input.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/nano_banana_mcp/main.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import logging\r\nimport os\r\nimport sys\r\nfrom fastmcp import FastMCP\r\nfrom dotenv import load_dotenv\r\nfrom nano_banana_pro import generate_image\r\n\r\ndef _initialize_console_logging(min_level: int = logging.INFO):\r\n    # Ensure logs go to STDERR so they don\&amp;#x27;t break the MCP stdio protocol\r\n    handler = logging.StreamHandler(sys.stderr)\r\n    logging.basicConfig(level=min_level, handlers=[handler], force=True)\r\n\r\ntools = [generate_image]\r\nmcp = FastMCP(name=&amp;quot;MediaGenerators&amp;quot;, tools=tools)\r\n\r\nif __name__ == &amp;quot;__main__&amp;quot;:\r\n    load_dotenv()\r\n    _initialize_console_logging()\r\n    mcp.run(transport=&amp;quot;stdio&amp;quot;)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa518b0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
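The stderr-only logging rule is easy to verify for yourself. This minimal sketch redirects stdout, logs a message, and confirms that only protocol traffic lands on stdout (the JSON string printed here is an illustrative stand-in for an MCP message):

```python
import io
import logging
import sys
from contextlib import redirect_stdout

# Mirror the server's safe-logging setup: every log record goes to
# sys.stderr, leaving sys.stdout free for the MCP stdio protocol.
handler = logging.StreamHandler(sys.stderr)
logging.basicConfig(level=logging.INFO, handlers=[handler], force=True)

buf = io.StringIO()
with redirect_stdout(buf):
    logging.info("tool started")      # emitted on stderr, not captured here
    print('{"jsonrpc": "2.0"}')       # stand-in for protocol traffic on stdout

# Only the protocol line reached stdout; the log line did not corrupt it.
captured = buf.getvalue().strip()
```

If a dependency ever prints to stdout anyway (as some dotenv banners do), the subprocess env vars like `DOTENV_CONFIG_SILENT` in the toolset config are there to suppress it.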
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;The Generation Logic (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;nano_banana_pro.py)&lt;/code&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is where the actual image generation happens using Gemini.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GenAI Client:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We initialize the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;genai.Client()&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to interact with Google's generative models.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Model Selection:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; It specifically targets the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini-3-pro-image-preview&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; model. We set the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;response_modalities&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to "IMAGE" to tell the model we want pixels, not just text.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Robustness&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The code includes a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;MAX_RETRIES&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; loop (set to 5) to handle any transient generation errors, ensuring the agent has multiple attempts to get a valid image.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Byte Processing: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Once the model generates the image, it arrives as raw inline data. We extract these bytes and call our helper to move them to the cloud.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;URI Conversion:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Finally, it replaces the internal &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gs://&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; path with a browser-accessible &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; URL so the user can actually see the image.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/nano_banana_mcp/nano_banana_pro.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import logging\r\nfrom typing import Literal, Optional\r\nfrom google import genai\r\nfrom google.genai import types\r\nfrom media_models import MediaAsset\r\nfrom storage_utils import upload_data_to_gcs\r\n\r\nAUTHORIZED_URI = &amp;quot;https://storage.mtls.cloud.google.com/&amp;quot;\r\nMAX_RETRIES = 5\r\n\r\nasync def generate_image(\r\n    prompt: str,\r\n    aspect_ratio: Literal[&amp;quot;16:9&amp;quot;, &amp;quot;9:16&amp;quot;] = &amp;quot;16:9&amp;quot;,\r\n) -&amp;gt; MediaAsset:\r\n    &amp;quot;&amp;quot;&amp;quot;Generates an image using Gemini 3 Image model.&amp;quot;&amp;quot;&amp;quot;\r\n    genai_client = genai.Client()\r\n    content = types.Content(parts=[types.Part.from_text(text=prompt)], role=&amp;quot;user&amp;quot;)\r\n    \r\n    logging.info(f&amp;quot;Starting image generation for prompt: {prompt[:50]}...&amp;quot;)\r\n    asset = MediaAsset(uri=&amp;quot;&amp;quot;)\r\n    \r\n    for _ in range(MAX_RETRIES):\r\n        response = genai_client.models.generate_content(\r\n            model=&amp;quot;gemini-3-pro-image-preview&amp;quot;,\r\n            contents=[content],\r\n            config=types.GenerateContentConfig(\r\n                response_modalities=[&amp;quot;IMAGE&amp;quot;],\r\n                image_config=types.ImageConfig(aspect_ratio=aspect_ratio)\r\n            )\r\n        )\r\n        if response and response.parts:\r\n            for part in response.parts:\r\n                if part.inline_data and part.inline_data.data:\r\n                    # Upload the raw bytes to GCS\r\n                    gcs_uri = await upload_data_to_gcs(\r\n                        &amp;quot;mcp-tools&amp;quot;,\r\n                        part.inline_data.data,\r\n                        part.inline_data.mime_type\r\n                    )\r\n                    asset = MediaAsset(uri=gcs_uri)\r\n                    break\r\n        if asset.uri: break\r\n\r\n 
   if not asset.uri:\r\n        asset.error = &amp;quot;No image was generated.&amp;quot;\r\n    else:\r\n        # Convert gs:// URI to an HTTP accessible URL if needed\r\n        asset.uri = asset.uri.replace(\&amp;#x27;gs://\&amp;#x27;, AUTHORIZED_URI)\r\n        logging.info(f&amp;quot;Image URL: {asset.uri}&amp;quot;)\r\n        \r\n    return asset&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa516a0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
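The final URI rewrite is just a scheme swap, which this pure-function sketch isolates (the bucket and object names below are illustrative, not from the tutorial project):

```python
# Sketch of the URI conversion performed at the end of generate_image:
# replace the gs:// scheme with a browser-reachable HTTPS prefix so the
# agent can hand the user a clickable link.
AUTHORIZED_URI = "https://storage.mtls.cloud.google.com/"

def to_public_url(gcs_uri: str) -> str:
    """Rewrite a gs://bucket/object URI into an HTTPS URL."""
    return gcs_uri.replace("gs://", AUTHORIZED_URI)

url = to_public_url("gs://my-bucket/assets/mcp-tools/abc123.png")
```

Note this relies on the bucket allowing authenticated browser access via the `storage.mtls.cloud.google.com` endpoint; a fully public bucket would use `storage.googleapis.com` instead.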
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;GCS Upload Helper (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;storage_utils.py)&lt;/code&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Since agents need a web link to display images, this utility handles the hosting on Google Cloud Storage (GCS).&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dynamic Bucket Selection&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: It looks for a bucket name in your environment variables, falling back from &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;AI_ASSETS_BUCKET&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;LOGS_BUCKET_NAME&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to ensure it always has a place to save data.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Unique Filenames:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We use an MD5 hash of the raw image data to create a unique filename. This prevents filename collisions and acts as a simple way to avoid duplicate uploads of the same image.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Upload: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;blob.upload_from_string&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; method pushes the raw image bytes directly to your GCS bucket.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/nano_banana_mcp/storage_utils.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import hashlib\r\nimport mimetypes\r\nimport os\r\nfrom google.cloud.storage import Client, Blob\r\nfrom dotenv import load_dotenv\r\n\r\nload_dotenv()\r\nstorage_client = Client()\r\nai_bucket_name = os.environ.get(&amp;quot;AI_ASSETS_BUCKET&amp;quot;) or os.environ.get(&amp;quot;LOGS_BUCKET_NAME&amp;quot;)\r\nai_bucket = storage_client.bucket(ai_bucket_name)\r\n\r\nasync def upload_data_to_gcs(agent_id: str, data: bytes, mime_type: str) -&amp;gt; str:\r\n    file_name = hashlib.md5(data).hexdigest()\r\n    ext = mimetypes.guess_extension(mime_type) or &amp;quot;&amp;quot;\r\n    blob_name = f&amp;quot;assets/{agent_id}/{file_name}{ext}&amp;quot;\r\n    blob = Blob(bucket=ai_bucket, name=blob_name)\r\n    blob.upload_from_string(data, content_type=mime_type, client=storage_client)\r\n    return f&amp;quot;gs://{ai_bucket_name}/{blob_name}&amp;quot;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51fd0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
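The naming scheme deserves a closer look, because it is what makes uploads idempotent. Here is the content-addressed naming logic on its own, testable without touching GCS (the agent id and bytes below are illustrative):

```python
import hashlib
import mimetypes

# Content-addressed blob naming, as in upload_data_to_gcs above: hashing
# the raw bytes yields a deterministic name, so re-uploading identical
# image data always maps to the same object.
def blob_name_for(agent_id: str, data: bytes, mime_type: str) -> str:
    """Build a deterministic GCS object name from the asset's bytes."""
    ext = mimetypes.guess_extension(mime_type) or ""
    return f"assets/{agent_id}/{hashlib.md5(data).hexdigest()}{ext}"

name = blob_name_for("mcp-tools", b"fake-image-bytes", "image/png")
```

MD5 is fine here because the hash is used only for naming and de-duplication, not for any security guarantee.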
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Data Model (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;media_models.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;)&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This file ensures that our data follows a strict structure (Schema).&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Structured Output:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By using a Pydantic &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;BaseModel&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, we guarantee that the tool always returns a consistent JSON object containing a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uri&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (the link) and an optional &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;error&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; message. This makes it much easier for the AI agent to understand and process the tool's result.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/nano_banana_mcp/media_models.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;from typing import Optional\r\nfrom pydantic import BaseModel\r\n\r\nclass MediaAsset(BaseModel):\r\n    uri: str\r\n    error: Optional[str] = None&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f726fa51df0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Tool Dependencies (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;requirements.txt)&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While we use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to run our code, a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;requirements.txt&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file remains essential: it pins the dependencies that &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; installs into the isolated environment before the Nano Banana server starts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This file lists the three core libraries required for this tool:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;google-cloud-storage&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Used for hosting the generated images on the cloud.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;google-genai&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Provides the logic for the Gemini 3 Pro image generation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;fastmcp&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The framework that turns our Python script into a standardized MCP tool.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Paste this code in &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dev_signal_agent/tools/nano_banana_mcp/&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;requirements&lt;/code&gt;&lt;code style="vertical-align: baseline;"&gt;.txt&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;google-cloud-storage==3.6.*
google-genai==1.52.*
fastmcp==2.13.*&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
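&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To illustrate how these "ingredients" are consumed, here is a hedged sketch of launching the MCP server with uv. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;server.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; filename is a placeholder for your Nano Banana server's actual entry point, and the command assumes a recent uv release that supports the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--with-requirements&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; flag.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```shell
# From dev_signal_agent/tools/nano_banana_mcp/, ask uv to build an
# isolated environment from requirements.txt and run the server in it.
# "server.py" is a hypothetical entry-point name for illustration.
uv run --with-requirements requirements.txt python server.py
```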
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Summary&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this first part of our series, we focused on establishing the agent's core capabilities by standardizing its external integrations through the Model Context Protocol (MCP). We initialized the project using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for high-speed dependency management and successfully configured three critical toolsets: Reddit for trend discovery, Google Cloud Docs for technical grounding, and a custom Nano Banana MCP server for multimodal image generation. By utilizing the Google ADK’s &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;McpToolset&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, we’ve abstracted away complex API logic into simple, plug-and-play modules, ensuring that our tools share a common interface that decouples integration from intelligence.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For a deeper look into our technical foundation, you can explore the &lt;/span&gt;&lt;a href="https://developers.google.com/knowledge/mcp?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Developer Knowledge MCP server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to learn more about knowledge grounding or visit the &lt;/span&gt;&lt;a href="https://github.com/google/adk-python" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;Google ADK GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to explore the framework's core capabilities&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With our toolset fully configured and ready for action, we can now move to &lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/multi-agent-architecture-and-long-term-memory-with-adk-mcp-and-cloud-run"&gt;Part 2&lt;/a&gt;, where we will build the multi-agent architecture and integrate the Vertex AI memory bank to orchestrate these capabilities. You can also jump ahead to &lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory"&gt;Part 3&lt;/a&gt;, where we will show you how to test the agent locally to verify these components on your workstation. If you’d like to dive ahead, you can explore the complete code for the entire series in our &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Special thanks to&lt;/span&gt;&lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt; Remigiusz Samborski &lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;  for the helpful review and feedback on this article.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more content like this, follow me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Linkedin&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://x.com/shirmeir86?lang=en" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 18 Mar 2026 09:18:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Create Expert Content: Building Capabilities for a Multi-Agent System with Google ADK, MCP, and Cloud Run</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/devsignalheroimage.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI, Product DevRel</title><department></department><company></company></author></item></channel></rss>