Thursday, March 26, 2026
10 stories · 3 min read

Today's Highlights

1

Arm Launches 3nm 136-Core AGI CPU, Bets on Data Centers

AI Chip · Data Center

Arm unveiled its first in-house-designed, mass-produced data center CPU, the 'AGI CPU', fabricated on TSMC's 3nm process. It features up to 136 Neoverse V3 cores, a maximum clock speed of 3.7GHz, and a TDP of approximately 300W. The chip supports 12-channel DDR5-8800 (bandwidth above 800GB/s), 96 lanes of PCIe Gen6, and CXL 3.0. Arm claims rack-level performance on agentic AI workloads can more than double that of the latest x86 systems, and projects around $15 billion in annual revenue within about five years. Meta is among the initial partners, with OpenAI and others expressing support.
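
The quoted memory bandwidth is easy to sanity-check: 12 channels of DDR5-8800, each moving 8 bytes per transfer, gives a theoretical peak just above the 800GB/s figure. A quick back-of-the-envelope calculation (assuming standard 64-bit-wide DDR5 channels):

```python
# Peak DRAM bandwidth = channels x transfer rate (MT/s) x bytes per transfer.
# A DDR5 channel carries 64 bits (8 bytes) of data per transfer.
channels = 12
mt_per_s = 8800          # DDR5-8800: mega-transfers per second
bytes_per_transfer = 8   # 64-bit data bus per channel

peak_gb_s = channels * mt_per_s * bytes_per_transfer / 1000
print(peak_gb_s)  # 844.8
```

At 844.8GB/s theoretical peak, the article's ">800GB/s" claim is consistent with the stated channel count and data rate.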

Read full article
2

SK hynix Files Confidential ADR Listing Application with SEC to Expand HBM Production

Semiconductor · Financing

SK hynix confirmed it has filed confidentially with the U.S. SEC to prepare for an ADR listing in the United States, supporting AI memory expansion and its global capital strategy; the offering size and timeline remain undisclosed. The company is simultaneously advancing construction of its M15X fab in Cheongju and facilities in the Yongin semiconductor cluster in South Korea, as well as advanced packaging facilities in Indiana, USA. SK hynix stated it plans to reserve over 100 trillion KRW in net cash for long-term strategic investments. Reports also mention its recent procurement commitment of approximately $7.97 billion worth of ASML equipment and plans to sample next-generation HBM4E to meet surging AI-driven HBM demand.

Read full article
3

Google TurboQuant Requires No Retraining: Up to 6x KV Cache Compression, 8x Attention Speedup

Inference Optimization · Model Compression

Google Research introduced TurboQuant, a quantization and compression suite including PolarQuant and QJL, targeting memory bottlenecks caused by KV caches during inference. According to the release, it achieves up to 6x reduction in KV cache memory usage without requiring retraining, and boosts attention computation speed by up to 8x on H100 GPUs, while supporting compression down to 3-bit with minimal impact on output quality. Designed for long-context LLMs and vector retrieval systems, the method can be directly applied to existing fine-tuned models, reducing VRAM requirements and inference costs.
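
As a rough illustration of why low-bit KV quantization saves memory, the sketch below applies generic per-channel symmetric quantization to a key-cache slice. This is not Google's TurboQuant, PolarQuant, or QJL algorithm; `quantize_kv`, the tensor shapes, and the 3-bit setting are illustrative assumptions only.

```python
import numpy as np

def quantize_kv(cache: np.ndarray, bits: int = 3):
    """Per-channel symmetric quantization of a KV cache slice.

    Generic low-bit quantization sketch: stores values as small signed
    integers plus one float scale per channel, shrinking memory roughly
    (16 / bits)x versus a float16 cache.
    """
    qmax = 2 ** (bits - 1) - 1                          # 3 for 3-bit signed
    scale = np.abs(cache).max(axis=0, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)            # avoid divide-by-zero
    q = np.clip(np.round(cache / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize_kv(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

# Example: a (seq_len, head_dim) slice of a key cache.
rng = np.random.default_rng(0)
k = rng.normal(size=(128, 64)).astype(np.float32)
q, s = quantize_kv(k, bits=3)
err = np.abs(dequantize_kv(q, s) - k).mean()
```

A real system would pack the 3-bit values (int8 storage here wastes 5 bits per value) and use fused dequantize-attention kernels, which is where the claimed H100 speedups would come from.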

Read full article
4

LiteLLM 1.82.8 Poisoned via PyPI Supply Chain Attack: Downloaded ~47K Times in 46 Minutes

Supply Chain Security · Open Source Ecosystem

Malicious versions of LiteLLM on PyPI, including version 1.82.8, were found to contain .pth files whose payloads run silently at Python interpreter startup, with no explicit import required. These payloads steal SSH keys, cloud credentials, Kubernetes configurations, database passwords, and API keys, and attempt to deploy backdoors within Kubernetes environments. The package reportedly sees around 97 million monthly downloads, and the malicious version was downloaded approximately 47,000 times in roughly 46 minutes after publication. Statistics indicate about 88% of dependent packages do not enforce strict version pinning, amplifying transitive risk. The affected versions have been yanked, and users are advised to rotate credentials and audit their environments.
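
The attack vector relies on documented CPython behavior: at startup, the `site` module executes any line in a site-packages `.pth` file that begins with `import`. As a defensive sketch (not an official remediation tool; `find_executable_pth_lines` is a name invented here), the script below lists such lines for manual review:

```python
import site
from pathlib import Path

def find_executable_pth_lines():
    """Report .pth lines that execute code at interpreter startup.

    CPython's `site` module runs any .pth line beginning with
    'import ' (or 'import\t') every time Python starts -- the mechanism
    the malicious LiteLLM releases abused. This only reports findings;
    review hits manually, since some legitimate packages (e.g.
    setuptools' distutils shim) also use this hook.
    """
    hits = []
    dirs = site.getsitepackages() + [site.getusersitepackages()]
    for d in dirs:
        for pth in Path(d).glob("*.pth"):
            lines = pth.read_text(errors="replace").splitlines()
            for lineno, line in enumerate(lines, start=1):
                if line.startswith(("import ", "import\t")):
                    hits.append((str(pth), lineno, line.strip()))
    return hits

for path, lineno, line in find_executable_pth_lines():
    print(f"{path}:{lineno}: {line}")
```

Pinning exact versions (and ideally hashes) in lockfiles would also have narrowed the window in which the poisoned release could propagate transitively.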

Read full article
5

GitHub Updates Copilot Data Policy: Interaction Data Will Be Used for Training by Default Starting April 24

Developer Tools · Data Governance

GitHub has updated its policy on Copilot interaction data usage: starting April 24, user interactions from Copilot Free, Pro, and Pro+ will be used by default to train and improve AI models within GitHub and Microsoft, unless users actively opt out in settings. Copilot Business and Enterprise tiers are unaffected by this change. Collected interaction data includes input prompts, model outputs, accepted or modified code snippets, and associated context such as related file information and repository structure, aimed at enhancing model quality and defect detection capabilities. GitHub emphasizes that this data will not be shared with third-party AI model providers or independent services.

Read full article
6

Anthropic Introduces Auto Mode for Claude Code: Automatically Assess Permissions and Block High-Risk Actions

Programming Agent · Security Mechanism

Anthropic launched a research preview feature for Claude Code called 'Auto mode', which lets the model decide on its own whether to execute certain permissioned actions, without requiring item-by-item human approval. The system performs a risk assessment before each action, focusing on blocking large-scale deletions, sensitive data leaks, and execution of malicious code, and restricting anomalous instructions that likely stem from prompt injection attacks. If an action is blocked repeatedly, the system falls back to requiring explicit user authorization. The feature currently supports only Claude Sonnet 4.6 and Claude Opus 4.6, and is recommended for use in sandboxed environments isolated from production. A high-risk option to bypass permission checks remains available but is not recommended.
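
The described flow (auto-approve low-risk actions, block high-risk ones, escalate to the user after repeated blocks) can be sketched as a toy gate. This is purely illustrative: `AutoModeGate`, its substring pattern list, and the three-strikes threshold are assumptions, not Anthropic's implementation.

```python
from dataclasses import dataclass

# Naive substring patterns standing in for a real risk classifier.
HIGH_RISK = ("rm -rf", "curl | sh", "DROP TABLE")

@dataclass
class AutoModeGate:
    """Toy permission gate: auto-approve low-risk commands, block
    high-risk ones, and require explicit user authorization once the
    block count reaches a threshold."""
    max_blocks: int = 3
    blocked: int = 0

    def decide(self, command: str) -> str:
        if any(pattern in command for pattern in HIGH_RISK):
            self.blocked += 1
            if self.blocked >= self.max_blocks:
                return "require_user_approval"
            return "block"
        return "auto_approve"

gate = AutoModeGate()
print(gate.decide("git status"))  # low-risk: auto-approved
print(gate.decide("rm -rf /"))    # high-risk: blocked
```

A production version would replace the substring check with model-based risk assessment and tie the escalation to the interactive approval prompt.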

Read full article
7

Cursor Launches Self-Hosted Cloud Coding Agents: Enterprises Can Run Internally with Cloud Orchestration

Programming Agent · Enterprise Deployment

Cursor released 'self-hosted cloud agents,' allowing enterprises to deploy coding agent workers on their own infrastructure, ensuring codebases, dependency caches, secrets, and build artifacts remain within internal networks to meet compliance and security requirements. Architecturally, the workers connect outbound via HTTPS to Cursor’s cloud for orchestration, eliminating the need to open inbound ports or configure complex VPNs. Official support includes Helm charts for Kubernetes deployment and fleet management APIs, enabling scaling to large numbers of workers with centralized control. This capability targets enterprise use cases involving test execution, access to internal network endpoints, and CI/CD resources.

Read full article
8

JetBrains Launches Central to Manage AI Coding Agents, Phasing Out Code With Me

Developer Tools · Agent Platform

JetBrains announced Central, a new platform to manage AI coding agents, provide cloud execution infrastructure, and enable cross-project context sharing. It includes the preview Air IDE and JetBrains Console, which offers token management and AI usage analytics, with early access expected in Q2 2026. Positioned as the control and execution layer for agent-driven software development, Central supports integration with both JetBrains-native and third-party agents, delivering governance features such as policy enforcement, identity and permissions, observability, auditing, and cost attribution. Concurrently, JetBrains announced it will phase out the collaborative coding feature Code With Me: version 2026.1 will be the last officially supported release, with plugin support continuing until Q1 2027.

Read full article
9

Meituan Open-Sources LongCat-Next Multimodal Model: 74B Parameters, MIT License

Open Source Model · Multimodal

Meituan has open-sourced the native multimodal large model LongCat-Next on Hugging Face, featuring approximately 74 billion parameters. It claims to jointly model text, images, and audio within a unified discrete token space. The model employs the DiNA (Discrete Native Autoregression Paradigm) framework, combined with discrete visual representations and native-resolution Vision Transformers, enabling capabilities such as image understanding and generation, speech recognition, voice-to-voice conversion, and customizable voice cloning. Released under the MIT license, it facilitates research and engineering integration. Documentation advises users to independently evaluate accuracy, safety, and compliance risks.

Read full article
10

Google DeepMind Releases Lyria 3 Pro: Generates Up to 3-Minute Tracks with SynthID Watermarking

Generative Audio · Product Release

Google DeepMind released updates to its music generation models, Lyria 3 and Lyria 3 Pro, emphasizing enhanced structural control to generate tracks up to approximately three minutes long, with support for specifying segment structures (e.g., intro, verse, chorus, bridge) using natural language. The models are now available in public preview on Vertex AI and accessible for trial through developer portals like AI Studio. All generated audio includes embedded SynthID imperceptible watermarks to identify AI-generated content and enhance traceability. The company also noted it employs policies and filtering to reduce risks of directly imitating specific artist styles, targeting applications in video production and creative toolchain integration.

Read full article

Don't Miss Tomorrow's Insights

Join thousands of professionals who start their day with AI Daily Brief