Google I/O Releases Gemini 3.5 Flash, 4x Faster Than Comparable Models and Less Than Half the Cost
Model ReleaseGoogle
At I/O 2026, Google launched the Gemini 3.5 series, with Gemini 3.5 Flash now globally available. The model excels in benchmarks such as Terminal-Bench 2.1 (76.2%) and MCP Atlas (83.6%), delivering output speeds of approximately 280 tokens/second—four times faster than other leading models—at less than half the cost. Google estimates enterprises could save over $10 billion annually by migrating 80% of workloads to Gemini. Gemini 3.5 Flash is now the default model for Gemini apps and Search AI Mode, serving over 900 million monthly active users. The more powerful 3.5 Pro is expected next month. Google also introduced Antigravity 2.0 desktop app and CLI, supporting multi-agent collaborative development.
Google Unveils Gemini Omni World Model for Multimodal Video Generation and Conversational Editing
Model ReleaseVideo Generation
At I/O, Google introduced Gemini Omni, with its first model, Gemini Omni Flash, now available on the Gemini app, Google Flow, and YouTube Shorts. The model can generate high-quality video from any combination of text, image, audio, and video inputs, enabling iterative video editing via natural language dialogue while preserving character and scene consistency. Omni understands physical laws and can create personalized digital avatars for video generation. All generated content includes an invisible SynthID watermark. Accessible to Google AI Plus, Pro, and Ultra subscribers, with API rollout planned. DeepMind CEO Hassabis called it a significant step toward general AI.
Google Launches Gemini Spark Personal AI Agent Capable of 24/7 Autonomous Task Execution
AI AgentProduct Release
At I/O, Google unveiled Gemini Spark, a cloud-based personal AI agent operating 24/7, continuing tasks even when devices are off. Spark autonomously performs multi-step actions across Google Workspace and third-party apps, such as analyzing bills, consolidating schedules, and drafting emails, seeking user confirmation before high-risk operations. Available to testers this week and rolling out in beta next week to U.S.-based Google AI Ultra subscribers. Google also introduced a $100/month AI Ultra entry tier, reduced the top-tier subscription from $250 to $200, and shifted from daily prompt limits to compute-based billing.
Andrej Karpathy Joins Anthropic to Lead Pretraining Research Team
Talent MovementAnthropic
OpenAI co-founder and former Tesla AI lead Andrej Karpathy announced he is joining Anthropic to lead a team using Claude models to accelerate pretraining research. Karpathy holds a PhD in computer science from Stanford and previously led Tesla's Autopilot vision team. After leaving Tesla in 2022, he briefly returned to OpenAI before founding Eureka Labs, an AI education company. He stated his education project will be paused but not abandoned. Karpathy has publicly praised Claude Code multiple times. His move marks a strategic talent acquisition for Anthropic, with Hugging Face CEO speculating it may drive greater open-source contributions from the company.
Alibaba Releases Qwen 3.7 Preview, Top-Ranked Domestic Model on Arena Leaderboard
Model ReleaseAlibaba
Alibaba released the preview versions of the Qwen 3.7 series, including Max and Plus variants. Qwen3.7-Max-Preview ranks 13th overall on the large model leaderboard, making it the highest-ranked Chinese-developed model to date, with top-10 performance in math and programming. The Plus version ranks 16th on the visual leaderboard. Real-world tests show the Max version can solve IMO-level math problems in about four minutes and demonstrates strong code generation capabilities. Both models are currently accessible via closed-access Qwen Studio, with further technical details to be revealed at Alibaba Cloud Summit on May 20. Since February 2026, Alibaba has released three generations of Qwen updates, significantly accelerating its iteration pace.
OpenAI Adopts SynthID Watermarking and Launches Public AI Image Verification Portal
AI SafetyContent Detection
OpenAI announced the integration of Google's SynthID digital watermarking into images generated by ChatGPT, Codex, and its API, while continuing to use the C2PA content credentials standard to form a multi-layer verification system. SynthID retains source signals even after transformations like screenshots, while C2PA provides detailed metadata. OpenAI launched a public verification portal allowing users to upload images to check if they were generated by OpenAI tools, initially limited to its own systems. The company has joined the C2PA compliance program. On Google's side, SynthID has already watermarked over 100 billion images and videos, with Meta and ElevenLabs among partners joining the initiative.
KPMG and Anthropic Form Global Strategic Alliance, Granting 276,000 Employees Access to Claude
Enterprise CollaborationAnthropic
KPMG and Anthropic formed a global strategic alliance, integrating Claude into KPMG’s core Digital Gateway platform for client services in tax, legal, and cybersecurity. Over 276,000 KPMG employees worldwide will gain access to Claude. Anthropic has named KPMG its preferred partner in private equity, with KPMG deploying Claude and AI agents across PE portfolio companies. AI agent capabilities that previously took weeks to build can now be deployed in minutes via Cowork and Managed Agents integration. The two firms are also collaborating with the University of Texas at Austin to study the role of human judgment in AI-driven value creation.
Blackstone Invests $5B with Google to Build TPU Data Centers, Challenging Nvidia
InfrastructureInvestment
Blackstone Group will invest $5 billion in equity to partner with Google on building new data centers powered by Google's proprietary TPU chips, aiming to deploy 500 megawatts of AI compute by 2027. Total investment, including leverage, could reach $25 billion. The project aims to create a 'new cloud' platform offering external access to Google's TPU computing power, enabling customers to rent Google's AI hardware and infrastructure. At I/O, Google also unveiled its eighth-generation TPU (8t for training and 8i for inference), supporting distributed training across over one million chips. Google expects annual infrastructure spending of $18–19 billion.
Hugging Face Releases Carbon Genomic Model, Inference Speed 275x Faster Than Peers
Open Source ModelBiotech
Hugging Face launched Carbon, an open-source generative DNA foundation model family. Carbon-3B, after training on one trillion tokens, matches the performance of leading DNA model Evo2-7B while achieving over 275x faster inference. The model uses 6-mer tokenization—each token containing six nucleotides—reducing sequence length by 6x. The team switched to a factorized loss function mid-training to improve prediction accuracy. Carbon supports zero-shot generation of novel DNA sequences and assessing functional impacts of mutations, capable of processing a full human genome on a single GPU within two days. The model, code, and data are all publicly available.
OpenAI Launches Long-Term Compute Reservation Service with 1–3 Year Discounted Token Pricing
Business ModelOpenAI
OpenAI co-founders Greg Brockman and CEO Sam Altman announced a guaranteed compute capacity service, allowing customers to secure discounted token pricing and reserved compute allocation through 1–3 year commitments. Altman noted that as models become more useful, demand for compute is expected to grow continuously. Current allocations will remain on sale until fully subscribed, with additional rounds planned. The service addresses rising enterprise demand for stable compute access while ensuring sufficient capacity for ChatGPT and Codex. This reflects a broader shift in the AI industry from on-demand usage to long-term contract models.