AI Daily Brief

Monday, May 25, 2026

10 stories, 3 min read

Today's Highlights

Anthropic's Three Unreleased Models Exposed on the Same Day: Mythos-1, Opus 4.8, Sonnet 4.8

AI Model, Cybersecurity, Anthropic

Three unreleased Anthropic AI models were exposed on the same day. The cybersecurity model Mythos-1 briefly appeared in the Claude Code interface. In Firefox exploit testing it generated 181 valid payloads, compared with just 2 for Opus 4.6, and through Project Glasswing, a collaboration with AWS, Google, and nine other major companies, it has already identified over 10,000 high-risk vulnerabilities, including a 17-year-old RCE vulnerability in FreeBSD. Claude Opus 4.8 was discovered in Google Vertex AI's backend. npm source code leaks confirmed that Sonnet 4.8 skips version 4.7 entirely, inheriting Opus 4.7's visual capabilities and introducing a new reasoning tier, though token consumption is expected to increase by approximately 30%. Mythos-1 is currently limited to around 40 partner organizations and has not been publicly released.

GPT-5.6 Leak Hints at June Launch, Multiple Internal Test Variants Revealed

OpenAI, Model Release, AI Competition

Developers have found GPT-5.6-related tags in OpenAI's internal testing environments and Codex logs, including the codenames iris-alpha, ember-alpha, and beacon-alpha, indicating the model has entered deep internal testing. In its latest internal version, GPT-5.6 (codenamed iris-alpha) made a major advance in UI design generation, creating a minimalist note-taking app called Lumen Notes that was far cleaner than the cluttered interfaces earlier AI models generated. The new version is expected to improve multi-step reasoning, code debugging, long-context understanding, autonomous agent workflows, and front-end interface generation. OpenAI may launch both standard and Pro versions, with an official release anticipated in June, positioning it against Claude Sonnet 4.8 and Gemini 3.5 Pro.

Researchers Use Claude Code to Autonomously Discover AI Inference Algorithm, Reducing Token Usage by 70%

AI Research, Inference Optimization

Researchers from UMD, Google, and Meta proposed AutoTTS, a method in which Claude Code acts as a coding agent that autonomously searches for optimal test-time scaling (TTS) algorithms within a simulated environment. The AI-discovered algorithm achieved comparable or higher accuracy than standard self-consistency methods on mathematical benchmarks like AIME and HMMT, while consuming approximately 70% fewer tokens. It also transferred to other models and tasks, such as DeepSeek-R1 and GPQA-Diamond. The algorithm adjusts its search paths dynamically based on confidence, and its coordination logic is described as too complex for manual human design. The entire discovery process cost only about $40 and took 160 minutes, marking a paradigm shift from manually writing algorithms to constructing search spaces.
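
For readers unfamiliar with the baseline: self-consistency samples several answers and takes a majority vote. The sketch below contrasts that fixed-budget vote with a simple confidence-based early stop, which is one way a test-time scaling method can save tokens. It is an illustration only, not the algorithm AutoTTS discovered. The function names, the use of vote share as a confidence measure, and the threshold values are all assumptions made for this example.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Iterable, Tuple


def make_fixed_sampler(answers: Iterable[str]) -> Callable[[], str]:
    """Hypothetical stand-in for a model call: replays a fixed answer list."""
    stream = cycle(list(answers))
    return lambda: next(stream)


def self_consistency(sample: Callable[[], str], n: int) -> Tuple[str, int]:
    """Baseline: always draw n answers, then return the majority answer."""
    votes = Counter(sample() for _ in range(n))
    return votes.most_common(1)[0][0], n


def early_stop_vote(
    sample: Callable[[], str],
    max_n: int,
    threshold: float = 0.75,  # assumed confidence cutoff (vote share)
    min_n: int = 4,           # assumed minimum samples before stopping
) -> Tuple[str, int]:
    """Stop sampling once the leading answer's vote share reaches threshold.

    Returns the chosen answer and the number of samples actually drawn.
    """
    votes: Counter = Counter()
    for used in range(1, max_n + 1):
        votes[sample()] += 1
        answer, count = votes.most_common(1)[0]
        if used >= min_n and count / used >= threshold:
            return answer, used
    # Budget exhausted without reaching the threshold: fall back to majority.
    return votes.most_common(1)[0][0], max_n
```

With a sampler that mostly agrees on one answer, `early_stop_vote` returns the same answer as `self_consistency` while drawing fewer samples. Each sample corresponds to one full model generation, so fewer samples means fewer tokens.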

NextEra Acquires Dominion Energy for $67B to Build AI Data Center-Specific Power Network

AI Infrastructure, Energy, M&A

NextEra Energy acquired Dominion Energy for $67 billion, marking the largest power company merger in U.S. history. The acquisition aims to address surging energy demands from AI data centers by building a dedicated power network for AI infrastructure. Electricity has become one of the primary bottlenecks for AI scaling, with data center power demand growing at an unprecedented rate. This deal highlights the deep dependency of the AI industry on energy infrastructure and reflects how traditional energy sectors are accelerating integration with AI. Similar moves include Blackstone’s $5 billion investment with Google to build data centers.

AMD's Lisa Su Identifies HBM as New AI Chip Supply Bottleneck, Consumer DRAM Prices Surge

Chip Supply Chain, HBM, AMD

AMD CEO Lisa Su stated on May 23 that the AI chip supply bottleneck has shifted from advanced packaging technology to High Bandwidth Memory (HBM). HBM consumes three times more wafer area per GB than DDR5 and suffers from low yields, which keeps supply tight. Nvidia, Google, AMD, and Amazon now account for over 90% of HBM and CoWoS capacity, leaving less capacity for consumer-grade DRAM. TrendForce reports that conventional DRAM contract prices rose 90–95% quarter-over-quarter in Q1 2026, with another 58–63% increase expected in Q2. Gartner predicts that by the end of 2026 DRAM and SSD prices will be 130% higher than 2025 levels, pushing average PC prices up by 17% and eliminating sub-$500 entry-level PCs before 2028.
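
Quarter-over-quarter increases compound rather than add. The sketch below works out what the two TrendForce ranges would imply in total, assuming both figures refer to the same contract-price series and that the Q2 forecast holds. The function name is hypothetical, and the cumulative figure is derived here, not reported by TrendForce.

```python
from typing import Sequence


def cumulative_increase_pct(quarterly_pcts: Sequence[float]) -> float:
    """Compound a sequence of percentage increases into one total increase."""
    factor = 1.0
    for pct in quarterly_pcts:
        factor *= 1.0 + pct / 100.0
    return (factor - 1.0) * 100.0


# Low and high ends of the reported Q1 rise and the expected Q2 rise.
low = cumulative_increase_pct([90, 58])
high = cumulative_increase_pct([95, 63])
```

Under those assumptions, `low` is about 200% and `high` is about 218%, meaning contract prices at the end of Q2 2026 would be roughly three times their level at the end of 2025.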

AI Agent System Design Conductor Completes Full 7nm Chip Design Flow Autonomously

Chip Design, AI Agent, Automation

The AI agent system Design Conductor autonomously completed an entire chip design workflow, from architectural design through RTL coding, functional verification, and timing closure to 7nm GDSII layout generation, in just 12 hours. It worked from only a 219-word requirement description, with no human intervention. The system is composed of multiple sub-agents for planning, review, implementation, integration, root cause analysis, and PPA convergence, and it coordinates EDA toolchains via an LLM. It can independently analyze VCD waveforms, locate pipeline logic defects, and generate fixes, but the run consumed hundreds of billions of tokens. The resulting design, VerCore, is a simple RISC-V core with limited performance that has not yet been fabricated, but it validates the technical feasibility of AI-driven end-to-end chip design.

Microsoft Research Releases Webwright, Terminal-Native Web Agent Framework with 60.1% Accuracy

Web Agent, Microsoft, Open Source

Microsoft Research released Webwright, a terminal-native web agent framework in which agents complete web tasks by writing and executing Playwright scripts instead of predicting browser actions step by step. On the Odysseys benchmark, Webwright paired with GPT-5.4 scored 60.1%, a 79.4% relative improvement over the baseline GPT-5.4 score of 33.5%. It also achieved 86.67% accuracy on Online-Mind2Web. The framework consists of about 1,000 lines of code and requires no multi-agent orchestration. Even smaller models like Qwen3.5-9B, when combined with a pre-built toolkit, achieve 66.2% on challenging tasks. Task scripts can be packaged into reusable CLIs and shared across platforms such as Claude Code and Codex.
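
The 79.4% figure is a relative gain, not a difference in percentage points. A minimal check of the arithmetic using the two Odysseys scores quoted above (the function names are illustrative):

```python
def absolute_gain_points(baseline: float, new: float) -> float:
    """Difference between two scores, in percentage points."""
    return new - baseline


def relative_improvement_pct(baseline: float, new: float) -> float:
    """Gain expressed as a percentage of the baseline score."""
    return (new - baseline) / baseline * 100.0


baseline_score = 33.5   # GPT-5.4 alone on Odysseys
webwright_score = 60.1  # Webwright paired with GPT-5.4

points = absolute_gain_points(baseline_score, webwright_score)
relative = relative_improvement_pct(baseline_score, webwright_score)
```

Here `points` rounds to 26.6 percentage points and `relative` rounds to 79.4%, which matches the figure quoted in the summary.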

Google CEO Pichai Admits Gemini Lags Behind Competitors in Programming Agent Capabilities

Google, AI Competition, Programming Agent

Google CEO Sundar Pichai admitted in a New York Times podcast interview that Gemini lags behind Anthropic and OpenAI in agentic coding, tool use, instruction following, and long-horizon tasks requiring multi-step reasoning. He said Google is closing the gap with new models like Gemini 3.5 Flash and internal tools such as Antigravity 2.0, whose internal token usage doubles weekly. Pichai emphasized the rapid pace of AI development, saying that changes over 30–60 days now equal five years of past progress. He suggested AGI may arrive sooner than expected and acknowledged that public concerns about AI are justified.

Pope Leo XIV Issues First AI Encyclical 'Magnifica Humanitas'

AI Ethics, Policy, Religion

Pope Leo XIV issued 'Magnifica Humanitas,' the first papal encyclical on artificial intelligence, on May 25. Co-authored with an Anthropic co-founder, it explores AI's impact on human dignity, labor, and ethics and emphasizes protecting humans in the AI era. It is the Catholic Church’s first formal encyclical to address AI development systematically and is seen as a landmark event that treats AI as a societal transformation comparable to the Industrial Revolution. Released amid accelerating global AI advancement and with multiple AI firms preparing for IPOs, it underscores the urgency of ethical discourse surrounding technology.

AWS MCP Server Now Generally Available, Offering IAM Governance and Full API Coverage for AI Agents

AWS, AI Agent, Cloud Service

AWS announced general availability of its Managed Model Context Protocol (MCP) Server, providing AI coding agents with IAM-based access control, CloudWatch metrics monitoring, and CloudTrail log auditing. The server supports all AWS APIs—including long-running operations and file uploads—and allows agents to execute multi-step tasks within a sandboxed Python environment without requiring access to local file systems or shell commands. Part of the newly launched open-source Agent Toolkit for AWS, the MCP Server offers agents up-to-date documentation, controlled API access, and operational guidance aimed at reducing errors and token consumption, positioning AWS as the default platform for AI coding agents.
