Gemini 3.5 Pro Leaks: Coding Matches GPT-5.5, Spark Agent Unveiled

Breaking AI Intelligence Report — May 15, 2026

Just hours ago, Gemini 3.5 Pro — codenamed “Cappuccino” — leaked publicly, with checkpoint artifacts already circulating among developers. What was rumored as Gemini 3.2 mere hours earlier has been superseded by a full 3.5 release — signaling Google’s strategic pivot ahead of its annual I/O keynote.

Gemini 3.5 Pro leak preview

🚀 Major Capabilities Confirmed

✅ Interactive Multi-Modal Generation

DualShock 4 controller blueprint with interactive SVG decomposition
Fully functional vector illustration of a pelican riding a bicycle — featuring 7 real-time customizable dimensions: frame color, lighting, headgear, basket contents, pedaling speed, and more

Pelican bicycle interactive demo

This is not static SVG output — it’s a prompt-generated, self-contained web application, complete with live controls and responsive rendering.

Interactive UI panel

✅ Programming Performance: On Par With GPT-5.5

According to Abacus.AI CEO Bindu Reddy’s benchmark data:
– Gemini 3.2 Flash achieves 92% of GPT-5.5’s coding & reasoning capability, at 15–20× lower inference cost
– LM Arena scores confirm 3.5 Flash outperforms 3.1 Pro in SVG generation, interactive 3D coding, and animation logic

LM Arena benchmark comparison

✅ Gemini Spark: The 24/7 Autonomous Agent

A newly exposed beta — Gemini Spark — redefines AI assistance:
– Runs continuously across devices
– Manages email, calendars, multi-step workflows, and e-commerce
– Integrates deeply with Google services: Gmail, Drive, Maps, Chrome history, Personal Intelligence, and location data

⚠️ Critical Privacy Note: Spark may execute purchases or share personal data without explicit consent, though it attempts pre-action confirmation where possible.

Gemini Spark interface preview

Spark evolves from internal project “Remy”, previously limited to AI Ultra subscribers — now scaled as a full-time digital life concierge.

⚖️ Reality Check: Where Gemini 3.5 Stands Today

Per exclusive reporting by Alex Heath (The Verge), the new model lands firmly at the GPT-5.5 tier, falling short of the frontier-defining Mythos — which recently passed both UK AI Safety Institute cybersecurity evaluation suites, while GPT-5.5 cleared only one.

Model hierarchy comparison

Model	LMArena Elo (Historical)	Cybersecurity Passes (AISI)
Gemini 3 (2025)	1501	—
GPT-5.5	~1620	1/2
Mythos	~1740	2/2

💻 Programming: Google’s Most Pressing Gap

DeepMind insiders acknowledge mounting pressure to close the developer trust gap — especially vs. Claude, now the de facto default for engineering teams.

Antigravity, Google’s AI coding platform, lags significantly in usability and fidelity:
XDA benchmarks show Claude Code correctly interprets complex creative prompts on first attempt
Antigravity outputs resemble “MS Paint doodles” — low-fidelity, brittle, and contextually shallow

Antigravity vs. Claude Code output comparison

Pricing friction persists: credit-based system, opaque quota alerts, and inconsistent free tiers erode developer goodwill.

Yet experts like Haider suggest Google’s long game isn’t head-to-head coding rivalry — but building the world’s most capable multimodal foundation, tightly coupled with ubiquitous distribution.

🌐 The ASI Flywheel Accelerates

All three leaders are now accelerating on parallel tracks:

Company	Strategy	Key Move
OpenAI	Speed & ecosystem lock-in	Codex ultrafast mode + 2-month enterprise subsidy → 2,000 devs onboarded in 3h
Anthropic	Quality & safety leadership	Opus 4.7 Fast + 50% Claude Code quota boost
Google	Distribution & ambient intelligence	Gemini Spark embedded in 1B+ Android/iOS devices; MCP tooling support confirmed

MCP tool testing interface

New UI reveals native MCP (Model Control Protocol) tool integration, plus a redesigned Thinking Mode with two global settings:
– Standard: Optimized for everyday queries
– Extended: Reserved for deep-reasoning, multi-step problems

Thinking mode toggle

🔮 Final Perspective

While Gemini 3.5 doesn’t reset the frontier, it marks Google’s decisive shift from model-first to agent-first AI — leveraging scale, integration, and behavioral data to fuel next-generation training loops no competitor can replicate.

As the ASI flywheel spins faster, developers aren’t just choosing tools — they’re betting on architectures. And right now, the race isn’t about who’s strongest today, but who builds the most sustainable feedback loop tomorrow.

Three-way AI race visualization

Sources:
– @alexeheath
– @Lentils80
– TestingCatalog: Google Prepares Gemini Spark Ahead of I/O