AI Strategy | 5 min read | June 24, 2026

Google Just Made Your AI Agent Architecture Official. The Problem Is They Sold It Back to You as a Product.

Google Antigravity, Gemini 3.5 Flash, Multi-Agent Architecture, Orchestrator, Operator Strategy, Framework Moat, AI Business Automation, Solo Operator, Agent-First, AgentSkillVault

Google's Tulsee Doshi, Senior Director and Head of Product for Gemini, said something at I/O 2026 that most of the AI press buried in paragraph nine. She was explaining how Gemini 3.5 Pro and Gemini 3.5 Flash are designed to work together, and she said this: '3.5 Pro becomes your orchestrator, your planner, and then it actually can leverage Flash to be the various sub-agents.' Read that again slowly. That is the canonical multi-agent architecture. One reasoning model at the top of the hierarchy — the orchestrator — that plans and delegates. Multiple fast, cheap models running in parallel below it — the workers — that execute the individual tasks. Solo operators who have been building serious AI workflows for the last year recognize that description immediately, because it is the architecture they pieced together themselves from first principles. Today, Gemini 3.5 Flash became generally available inside Antigravity, Google's agent-first development platform, with the orchestrator/worker architecture baked directly into the product. Google just made the advanced multi-agent framework official. They also turned it into a platform subscription. Those two things are not the same event — and most operators are about to confuse them.

What Google Just Shipped

Gemini 3.5 Flash goes GA today via Antigravity, the Gemini API, and Gemini Enterprise. On benchmarks, it outperforms Gemini 3.1 Pro on coding and agentic tasks and runs four times faster than other frontier models, and Google has an optimized variant that is twelve times faster at the same quality level. That speed is the point: at twelve times the speed of GPT-5.5, Flash can run as multiple parallel sub-agents on a long-horizon task without the latency compounding into workflow death. Antigravity 2.0, the platform housing all of this, is a standalone desktop application built around agent orchestration. It ships with a CLI, an SDK, and what Google calls Managed Agents, a hosted execution layer that runs your agents for you on Google's infrastructure. At Google I/O, engineers demonstrated Antigravity agents spawning off to work on separate components of an operating system simultaneously, then merging the work. The orchestrator/planner handled architecture decisions. Flash workers handled implementation in parallel. The OS was built from scratch in the demo. The platform architecture that Google shipped is: 3.5 Pro thinks, 3.5 Flash executes, Antigravity hosts and manages it all. That is the product. And it works.
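
The division of labor in that demo (a planner decides, fast workers implement in parallel, the results get merged) fits in a few lines of Python. What follows is a minimal sketch and not Antigravity's API: `plan`, `run_worker`, and `orchestrate` are hypothetical names, and both model calls are replaced with stubs so the example runs without any provider account.

```python
import asyncio


def plan(goal: str) -> list[str]:
    """Orchestrator tier (stub): split a goal into independent subtasks.

    In a real system this would be a call to a reasoning model.
    """
    return [f"{goal}: {part}" for part in ("scheduler", "filesystem", "shell")]


async def run_worker(subtask: str) -> str:
    """Worker tier (stub): execute one subtask.

    In a real system this would be a call to a fast, cheap model.
    """
    await asyncio.sleep(0.01)  # stands in for model latency
    return f"done[{subtask}]"


async def orchestrate(goal: str) -> list[str]:
    subtasks = plan(goal)
    # Fan out: the workers run concurrently, so wall-clock time is roughly
    # the slowest worker rather than the sum of all workers.
    results = await asyncio.gather(*(run_worker(s) for s in subtasks))
    # Merge: gather returns results in the same order as the subtasks.
    return list(results)


if __name__ == "__main__":
    for line in asyncio.run(orchestrate("build OS")):
        print(line)
```

The concurrency is why worker speed matters so much in this pattern: every planning round waits on its slowest worker, so a faster worker tier shortens every round of a long-horizon task.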

The Part Nobody's Talking About

The AI press is running this as a model benchmark story — Flash vs. GPT-5.5, speed comparisons, pricing math. That framing is accurate and it buries the operator implication entirely. Doshi did not accidentally describe the orchestrator/planner + sub-agent architecture. Google built Antigravity 2.0 around it because it is the only architecture that actually scales for long-horizon autonomous work. One model doing everything is a chatbot. A planner model routing tasks to specialized worker models is an agent system. This is not new information — it is the architecture that serious AI operators have been hand-building for a year. Here is what is new: Google is now selling access to that architecture as a managed platform. Antigravity hosts your agents. Antigravity routes between 3.5 Pro and Flash. Antigravity runs the execution layer. For operators who have not yet built the orchestrator/worker architecture themselves, Antigravity is genuinely attractive — it gives them a working multi-agent system without having to understand the framework underneath. That is also the trap. The operators who plug into Antigravity without building their own framework layer are not building an AI capability. They are renting Google's architecture on Google's infrastructure at Google's pricing, with Google's deprecation schedule and Google's terms of service. The Fable 5 export ban happened twelve days ago. Single-platform dependency is not a theory. Antigravity is a powerful tool. Platform dependency on Antigravity is the same risk with a different logo.
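
One way to keep a single platform from becoming a single point of failure is to put every provider behind the same small interface and try them in order. The sketch below assumes each provider can be wrapped as a plain prompt-to-text function; `ModelFn`, `ProviderUnavailable`, `call_with_fallback`, and both stub adapters are hypothetical names invented for illustration, not part of any vendor SDK.

```python
from typing import Callable

# A "model" here is just a function from prompt to text. Real adapters
# would wrap each vendor's client behind this same signature (assumption).
ModelFn = Callable[[str], str]


class ProviderUnavailable(Exception):
    """Raised by an adapter when its provider cannot serve the request."""


def call_with_fallback(
    prompt: str, providers: list[tuple[str, ModelFn]]
) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, output)."""
    errors = []
    for name, fn in providers:
        try:
            return name, fn(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub adapters for illustration only.
def hosted_platform(prompt: str) -> str:
    raise ProviderUnavailable("model withdrawn")


def backup_model(prompt: str) -> str:
    return f"backup answer to: {prompt}"


if __name__ == "__main__":
    chain = [("hosted_platform", hosted_platform), ("backup_model", backup_model)]
    print(call_with_fallback("summarize the ticket", chain))
```

If the provider list lives in your own code, losing one provider means editing a list. If it lives inside a managed platform, that decision belongs to the platform.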

What This Means for Your AI Agent Workflow

The practical read for solo operators is this: if you have already built the orchestrator/worker architecture in your own agent framework, Gemini 3.5 Flash going GA today is just a new fast worker model to plug in. Test it against your existing worker-tier models (GPT-5.5, Claude Sonnet 4.6, Gemini 3 Flash) and route based on how each performs on your actual tasks. If Flash wins on your tasks at its price point, update the config. That is the entire migration. If you have not yet built the orchestrator/worker architecture, Antigravity shows you what to build: its design is the pattern, and studying it costs nothing. The two-tier architecture, with a reasoning model on top and an execution model at the task level, is what you should build, and Google's product description tells you exactly how it works. Build your own version of that architecture in your own framework, with your own model routing logic, before you decide whether to use Antigravity's managed execution layer. The Antigravity SDK means you can eventually run your own architecture on Google's infrastructure if you choose. That is meaningfully different from building your architecture inside Antigravity's managed layer, where the routing decisions are Google's to make. The twelve-times speed multiplier on Flash is real. A well-designed orchestrator/worker system using 3.5 Pro and Flash in tandem will outperform any single-model workflow on long-horizon tasks. Build the framework, then decide which platform hosts it.
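
"Update the config" is meant literally when the routing logic is yours. Here is a minimal sketch of a two-tier routing table, under the assumption that each task can be flagged as planning or execution. `ROUTING`, `Task`, `pick_model`, and `with_worker` are hypothetical names, and the model identifiers are placeholder strings rather than verified API model names.

```python
from dataclasses import dataclass

# Hypothetical routing table: tier -> model identifier (placeholders).
ROUTING = {
    "orchestrator": "reasoning-model",
    "worker": "current-worker-model",
}


@dataclass(frozen=True)
class Task:
    description: str
    needs_planning: bool  # True for reasoning or planning steps


def pick_model(task: Task, routing: dict[str, str]) -> str:
    """Send planning steps to the orchestrator tier, the rest to workers."""
    tier = "orchestrator" if task.needs_planning else "worker"
    return routing[tier]


def with_worker(routing: dict[str, str], model: str) -> dict[str, str]:
    """Return a copy of the routing table with a different worker model."""
    return {**routing, "worker": model}


if __name__ == "__main__":
    extract = Task("extract invoice fields", needs_planning=False)
    print(pick_model(extract, ROUTING))
    # The "migration": swap one entry once the candidate wins on your tasks.
    print(pick_model(extract, with_worker(ROUTING, "candidate-fast-model")))
```

Nothing that calls `pick_model` has to change when the worker model does, which is what keeps the swap to a one-line change.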

Bottom Line

Google shipping Gemini 3.5 Flash GA through Antigravity is the clearest external validation of the orchestrator/worker architecture thesis since multi-agent frameworks became a real category. Google's own head of product described the canonical pattern — planner model orchestrates, fast models execute — and built a platform around it. For operators who already have the framework: Gemini 3.5 Flash is a powerful new worker to test. For operators who don't: Antigravity tells you exactly what to build — but build it in your own framework first, not inside their managed platform. The model is not the moat. The architecture you own — not the architecture you rent — is.

4 Moves to Make Right Now

1. Map your current agent workflows to the orchestrator/worker pattern today. For every agent workflow you run, identify which step requires genuine reasoning and planning (that is your orchestrator tier) and which steps are execution, data retrieval, or repeatable task completion (those are your worker tier). If you cannot draw this distinction in your current workflows, your architecture is not yet optimized for the multi-agent era. The pattern Google shipped in Antigravity is the benchmark. Your goal is to replicate that architecture in your own framework before you decide whether to host it anywhere.
2. Test Gemini 3.5 Flash as a worker-tier model on your three highest-volume execution tasks. The benchmark case for Flash is speed and cost at high quality: four times faster than competing frontier models, twelve times faster with the optimized variant. Run your current worker-tier tasks through Flash and compare output quality, latency, and cost against your existing model. If Flash wins on your actual tasks, not the benchmark tasks, update your routing config. This is the entire migration for operators who have the framework. It should take two hours, not two weeks.
3. Read the Antigravity SDK documentation before you commit to the managed platform. Google released the Antigravity SDK alongside Antigravity 2.0, which means you can use the Gemini orchestration tooling without being locked into the managed execution layer. There is a meaningful difference between using Google's SDK to build agent architecture that you control and deploying your agents into Antigravity's managed execution environment where Google controls the runtime. If Antigravity's managed layer is compelling, understand what you are giving up in portability before you migrate. The SDK path keeps your options open.
4. Use the Antigravity architecture description as a free framework audit. Doshi's quote ('3.5 Pro becomes your orchestrator, your planner, and then it actually can leverage Flash to be the various sub-agents') is a one-sentence description of what a real agent architecture looks like. Hold it against your current workflows: do you have an orchestrator that plans and delegates? Do you have worker models that execute in parallel? If not, that is the framework gap to close. Pre-built multi-agent orchestration framework templates that give you this architecture without the managed platform lock-in are at https://agentskillvault.ai/catalog. Use them as the foundation and own the routing logic yourself.
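
The second move, comparing candidate worker models on quality, latency, and cost, can be run as a small harness. This is a sketch under stated assumptions: `Candidate`, `Report`, and `evaluate` are hypothetical names, quality is scored by exact match against an expected string (a stand-in for whatever check suits your task), cost is a flat per-call figure you supply, and the models are stubs.

```python
import time
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class Candidate:
    name: str
    call: Callable[[str], str]  # prompt -> output (a stub in this sketch)
    cost_per_call: float        # assumed flat cost, in your own units


@dataclass
class Report:
    name: str
    accuracy: float
    mean_latency_s: float
    total_cost: float


def evaluate(candidate: Candidate, cases: list[tuple[str, str]]) -> Report:
    """Run each (prompt, expected) case and score it by exact match."""
    latencies = []
    hits = 0
    for prompt, expected in cases:
        start = time.perf_counter()
        output = candidate.call(prompt)
        latencies.append(time.perf_counter() - start)
        if output.strip() == expected:
            hits += 1
    return Report(
        name=candidate.name,
        accuracy=hits / len(cases),
        mean_latency_s=mean(latencies),
        total_cost=candidate.cost_per_call * len(cases),
    )


if __name__ == "__main__":
    # Replace these with prompts and expected outputs from your own workload.
    cases = [("capital of France?", "Paris"), ("capital of Japan?", "Tokyo")]
    answers = dict(cases)
    current = Candidate("current", lambda p: answers[p], cost_per_call=1.0)
    candidate = Candidate("candidate", lambda p: "Paris", cost_per_call=0.25)
    for c in (current, candidate):
        print(evaluate(c, cases))
```

Feed it cases drawn from your three highest-volume execution tasks. A cheaper, faster model that loses accuracy on those cases has not won on your workload.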

Gemini 3.5 Flash going GA in Antigravity is a genuinely good product launch. The orchestrator/worker architecture is correct. The speed multiplier is real. For operators who have the framework, it is a good day — a fast new worker model and a validation that the architecture they built is the right one. For operators who do not have the framework, this is the clearest signal yet that the architecture matters, that every major AI lab is now building platforms around it, and that waiting is no longer a neutral choice. The options in front of you are: build the framework and own the architecture, or rent it from whichever lab is currently winning the benchmark war. Two weeks ago the winning lab's flagship model was pulled by the government. The framework you own cannot be banned. Start building it at https://agentskillvault.ai/catalog.
