AI Daily Brief

Friday, May 8, 2026

10 stories3 min read

Today's Highlights

OpenAI Releases Three Realtime Voice Models, GPT-Realtime-2 Features GPT-5-Level Reasoning

Speech AIOpenAIAPI

OpenAI has launched three real-time voice AI models: GPT-Realtime-2 features GPT-5-level reasoning capability, supports a 128K context window and parallel tool calling, suitable for complex voice agent scenarios; GPT-Realtime-Translate enables real-time translation from over 70 languages into 13 output languages; GPT-Realtime-Whisper provides low-latency streaming speech-to-text. All three models are now available via the Realtime API, priced at $32/$64 per million audio input/output tokens, $0.034 per minute, and $0.017 per minute respectively. Performance is 11% higher than previous generations, with pricing unchanged.

Read full article

EU Reaches Deal to Ease AI Act, High-Risk AI Compliance Delayed to End of 2027

AI PolicyEURegulation

EU lawmakers have reached a provisional agreement to delay the compliance deadline for high-risk AI systems from August 2026 to December 2027, with AI systems embedded in products extended to August 2028. Driven by Germany, the deal significantly exempts industrial AI applications from regulation, easing compliance burdens for companies like Siemens and Bosch, though medical devices remain unaffected. The new rules ban AI systems that generate child sexual abuse material or non-consensual sexually explicit deepfakes, which must comply by December 2026. A three-month grace period is granted for watermarking requirements on AI-generated content. SMEs receive additional regulatory benefits. This marks the EU's first significant relaxation of tech regulation in the digital domain.

Read full article

Google DeepMind Releases AlphaEvolve Impact Report, Reduces DNA Sequencing Errors by 30%

Google DeepMindAI ApplicationScientific Computing

AlphaEvolve, a code agent based on Gemini, has made significant impacts across multiple domains: reducing DNA sequencing error detection rates by 30% in genomics; increasing feasible solution ratios in grid optimization from 14% to over 88%; improving natural disaster prediction accuracy by 5%. In research, it enabled a 10-fold reduction in error for quantum physics simulations and assisted Terence Tao in solving the Erdős discrepancy problem. On AI infrastructure, it optimized next-gen TPU designs and reduced write amplification in Spanner databases by 20%. Commercially, it doubled training speed for Klarna and improved route efficiency by 10.4% for FM Logistic.

Read full article

Roche Acquires AI Pathology Company PathAI for Up to $1.05 Billion

Healthcare AIM&ARoche

Roche announced a definitive merger agreement with PathAI, involving an upfront payment of $750 million and up to $300 million in milestone payments. PathAI specializes in AI-driven digital pathology, and its image management system will be integrated into Roche’s diagnostics division and rolled out globally. The two companies have collaborated since 2021, expanding in 2024 to co-develop AI companion diagnostic algorithms. PathAI’s AIM-NASH tool has already received FDA qualification. The deal, expected to close in the second half of 2026, will accelerate biomarker discovery and personalized medicine development.

Read full article

Gemini 3.1 Flash-Lite Officially Launched, Cost Reduced by Approximately 60%

GoogleModel ReleaseEnterprise AI

Google Cloud announced the general availability of Gemini 3.1 Flash-Lite, the fastest and most cost-efficient model in the Gemini 3 series, designed for ultra-low latency and high-throughput tasks. Customer case studies show Gladly achieved about 60% cost reduction on its customer service platform, with p95 latency maintained at 1.8 seconds and success rate reaching 99.6%. JetBrains uses it to accelerate IDE AI assistant responses, while Ramp and OffDeal apply it to real-time financial research. The model outperforms Gemini 2.5 Flash Lite in audio input, RAG ranking, translation, data extraction, and code completion.

Read full article

Lambda Secures $1 Billion Credit Facility to Expand Gigawatt-Scale AI Compute Factories

FundingAI InfrastructureCompute

AI infrastructure company Lambda announced the closure of a $1 billion syndicated senior secured credit facility led by JPMorgan Chase, a significant increase from its $275 million facility in August 2025 and oversubscribed. The funds will deploy next-generation NVIDIA AI accelerators and expand data center capacity to meet compute demands for frontier AI model training and super-intelligence customers. Since its founding in 2012, Lambda has evolved from a GPU hardware provider to an AI cloud infrastructure provider, aiming to make computing as accessible as electricity.

Read full article

Zhipu AI Lists on HKEX, Becomes World's First Large Model IPO with Market Cap Over HK$50 Billion

IPOLarge ModelsZhipu AI

Zhipu AI listed on the Hong Kong Stock Exchange at an offering price of HK$116.2, raising HK$4.3 billion, with an opening market capitalization exceeding HK$50 billion, making it the world’s first publicly traded large model company. Originating from technology transfer at Tsinghua University, its core product is the GLM series of large models. The company reported revenue of RMB 312.4 million in 2024, with cumulative R&D investment of approximately RMB 4.4 billion. Models such as GLM-4.6 and GLM-4.7 perform strongly in coding and visual understanding, and are compatible with domestic chips like Cambricon. Major shareholders include Legend Capital, Meituan, and Tencent, with cornerstone investors subscribing HK$2.98 billion.

Read full article

NASA and IBM's Prithvi Becomes First On-Orbit Deployed AI Geospatial Foundation Model

NASASatellite AIOpen Source

Prithvi, an open-source geospatial AI foundation model jointly developed by NASA and IBM, has been successfully deployed on the International Space Station and the Kanyini satellite, becoming the first AI geospatial foundation model to operate in orbit. Trained on 13 years of Landsat and Sentinel-2 satellite data, the model supports flood monitoring, disaster detection, and crop yield prediction. Its foundation model architecture allows functional expansion via small uploaded decoders, drastically reducing bandwidth needs. Its open-source nature significantly shortens R&D time. NASA also plans to release other foundation models for heliophysics and planetary science.

Read full article

Xiaomi Open-Sources OmniVoice Speech Cloning Model Covering 646 Languages

Speech SynthesisOpen SourceXiaomi

Xiaomi AI Lab open-sourced the OmniVoice multilingual speech cloning TTS model, featuring an ultra-minimalist bidirectional Transformer architecture and covering 646 languages—the broadest in the industry. Technical innovations include a full-codebook random masking strategy and the first introduction of LLM pre-trained parameters into a non-autoregressive model. It outperforms commercial systems across 24 languages and achieves near-natural intelligibility in 102 languages, even enabling high-quality synthesis for low-resource languages with less than 10 hours of data. It can train on 100,000 hours of data in one day, with PyTorch inference speeds reaching 40x real-time. Code and weights are released under the Apache 2.0 license.

Read full article

Cloudflare Lays Off Over 1,100 Employees, Calls It Strategic Restructuring for Agent AI Era

LayoffsCloudflareOrganizational Change

Cloudflare announced the layoff of more than 1,100 employees, with the CEO describing it as a strategic restructuring for the agent AI era rather than a cost-cutting measure. The company stated internal AI usage grew over 600% in three months, fundamentally changing work patterns. Departing employees will receive an industry-leading severance package, including full base salary through end of 2026, continued healthcare, and accelerated equity vesting. The company emphasized this is a one-time adjustment to avoid the uncertainty caused by multiple rounds of smaller layoffs.

Read full article

Don't Miss Tomorrow's Insights

Join thousands of professionals who start their day with AI Daily Brief