AI Daily Brief

Sunday, January 18, 2026

9 stories · 3 min read

Today's Highlights

Marvell to Acquire Celestial AI for $3.25 Billion, Betting on Optical Interconnects

M&A · AI Hardware · Data Center

Marvell announced that it has signed a definitive agreement to acquire photonic interconnect company Celestial AI for approximately $3.25 billion. Celestial AI specializes in 'Photonic Fabric' chip-to-chip communication technology, which targets the high-bandwidth, low-power interconnect needs of data centers and AI accelerators. The acquisition is expected to strengthen Marvell's portfolio in high-speed interconnects and custom silicon, enabling more energy-efficient data paths for large-scale GPU/accelerator clusters. The announcement did not disclose further details such as the closing timeline or regulatory conditions.

Google Vertex AI Privilege Escalation Flaw: Low-Privilege Users Can Obtain Service Agent Tokens

Security · Cloud Services · LLMOps

Security research has revealed a privilege escalation vulnerability in Google Vertex AI's Agent Engine and Ray on Vertex AI: low-privilege users can exploit default configurations to obtain Service Agent tokens and inject malicious tool code for remote execution, potentially accessing LLM memory, chat logs, and data assets in GCS/BigQuery. Google reportedly classified this behavior as 'by design' and has not patched it. Researchers recommend enforcing IAM least-privilege principles, disabling interactive shells on head nodes, and monitoring metadata service access and anomalous token usage to reduce lateral movement risks.

Chinese Customs Blocks Nvidia H200 Shipments, Suppliers Pause Production

AI Chips · Supply Chain · Geopolitical Risk

Reports indicate that Chinese customs has blocked the import of Nvidia's H200 AI chips, prompting some component suppliers to pause related production lines. Materials such as PCBs dedicated to the H200 are difficult to repurpose, leading to inventory buildup and order uncertainty across the supply chain; expansion plans previously made to meet Chinese customer demand have also been disrupted. The report does not specify the reason for the block or whether it is temporary. The incident highlights compliance and policy uncertainties surrounding high-end GPU cross-border trade and may accelerate domestic customers' evaluation of alternative solutions and overseas compute leasing.

xAI Releases Grok 4.1 Focused on Physical World Understanding and Manipulation

Model Release · Robotics · World Models

xAI has updated Grok 4.1 on its website, emphasizing capabilities in understanding and manipulating the physical world, targeting applications in robotics and autonomous systems. The release claims to lower deployment barriers and hardware requirements, making it suitable for workflows such as product design, physical simulation, and real-world interaction. The official summary does not disclose parameter scale, public benchmark results, or external API/pricing information; actual performance and applicability boundaries await further technical details and third-party validation. Stable inference interfaces, if provided later, would facilitate integration into robotics stacks and simulation platforms.

Alibaba Cloud's Bailian Launches New Features: wan2.6 Video Generation and Qwen Multimodal Updates

Cloud Model · Multimodal · Product Update

Alibaba Cloud's large model platform Bailian launched several model updates on January 16 in both mainland China and international regions: wan2.6-i2v-flash supports generating videos with or without audio, with enhanced multi-shot storytelling and audio processing; the qwen-image-edit-max series improves industrial design, geometric reasoning, and character consistency; and qwen3-tts-vc-realtime-2026-01-15 further optimizes real-time voice cloning. The company also stated that its text-to-image, reasoning, and speech recognition capabilities will continue to receive iterative upgrades to keep service consistent and available across regions.

EU Accelerates AI Act Enforcement, Tightening Compliance for General and High-Risk AI

Policy & Regulation · Compliance · Generative AI

The European Union is accelerating enforcement of the AI Act, with the European Commission and national regulators strengthening bans on 'unacceptable risk' systems and clarifying transparency and technical documentation obligations for high-risk and general-purpose AI (GPAI). The guidance indicates that upstream platforms generating images, videos, text, or audio must disclose synthetic content, explain their training data practices, and conduct risk assessments, building compliance into product design and supply chain responsibilities. Teams relying on third-party models should expect stricter contractual terms and heavier audit and traceability obligations.

Pixel2Play Open-Sources 8,300 Hours of Data for Training General-Purpose Real-Time Game AI

Open Source · Embodied Intelligence · Dataset

Pixel2Play (P2P) has open-sourced a foundation model and dataset for general game AI, releasing over 8,300 hours of annotated gameplay data spanning more than 40 games, along with training and inference code. The model takes pixels and text instructions as input and directly outputs keyboard and mouse actions. The team uses a lightweight Transformer with an autoregressive discrete action space (8 tokens per action) and reports approximately 5x faster inference. The release provides a more reproducible data foundation for real-time interactive gaming, simulation training, and embodied intelligence research, lowering the barrier from data collection to training and evaluation.

CUHK(SZ) × Microsoft Propose TARS: Speech Reasoning Recovery Rate Over 100%

Research · Speech AI · Reinforcement Learning

A joint team from CUHK(SZ) and Microsoft introduced TARS (Trajectory Alignment for Reasoning in Speech), addressing the 'modality reasoning gap' that degrades large model reasoning performance when using speech input. The method uses on-policy reinforcement learning (GRPO) to align reasoning trajectories between speech and text branches—not just final answers—and combines hidden state similarity with semantic consistency constraints. The paper reports modality recovery rates exceeding 100% on tasks like MMSU and OBQA, with additional gains in text reasoning, offering a new training paradigm for reliable reasoning in speech-based agents and speech RAG systems.

ViDoRe V3 Released: 26K-Page Enterprise Document Retrieval Benchmark + ColPali

Benchmark · RAG · Open Source

ILLUIN Technology and NVIDIA released ViDoRe V3 on Hugging Face, a visual document retrieval benchmark for evaluating enterprise multimodal RAG systems. The benchmark spans 10 real-world datasets, over 26,000 document pages, and 3,099 queries in six languages, with human-annotated relevant pages, bounding boxes for key elements, and reference answers. The project also open-sourced the ColPali retrieval model, which uses an 'image-to-embedding generation with late interaction' architecture to improve retrieval efficiency and accuracy, so developers can reproduce, compare, and regression-test different visual document RAG approaches.
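
For readers unfamiliar with late interaction, the sketch below illustrates the general scoring idea (often called MaxSim): each query token embedding is matched to its most similar page patch embedding, and the per-token maxima are summed. This is a minimal illustration and not ColPali's actual code; the function names, page names, and toy 2-dimensional vectors are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_vecs: List[Vector], page_vecs: List[Vector]) -> float:
    """MaxSim: for each query vector, take its best match among the page
    vectors, then sum those best-match similarities."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: List[Vector], pages: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Score every page against the query and sort best-first."""
    scored = [
        (name, late_interaction_score(query_vecs, vecs))
        for name, vecs in pages.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two query token embeddings, two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],  # one patch matches each query token
    "page_b": [[0.5, 0.5]],              # a single, less specific patch
}

ranking = rank_pages(query, pages)
print(ranking)
```

Because page embeddings do not depend on the query, they can be computed once at indexing time, leaving only the cheap max-and-sum step for query time.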
