Zhipu AI has released and open-sourced its flagship model GLM-5 on Z.ai, shifting its positioning from 'Vibe Coding' to 'Agentic Engineering'. The model adopts a MoE architecture with 744 billion total parameters and 40 billion activated during inference, trained on an expanded dataset of 28.5T tokens. It introduces DeepSeek Sparse Attention to reduce deployment costs for long-context scenarios. Official benchmarks claim leadership among open-source models on long-horizon coding and agent tasks such as SWE-bench Verified, Terminal-Bench 2.0, and Vending Bench 2. Weights are released under the MIT license, supporting local deployment via vLLM, SGLang, and multi-chip adaptation.
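Since the weights are open, local serving follows the standard OpenAI-compatible pattern. A minimal sketch, assuming a local vLLM server started with `vllm serve zai-org/GLM-5` (the model ID, port, and prompt are assumptions, not from the release; SGLang exposes an equivalent endpoint):

```python
import json
import urllib.request

def ask(prompt, base="http://localhost:8000/v1/chat/completions"):
    """POST a chat completion request to a local OpenAI-compatible server."""
    body = json.dumps({
        "model": "zai-org/GLM-5",  # assumed model ID; check the release page
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode()
    req = urllib.request.Request(
        base, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

if __name__ == "__main__":
    try:
        print(ask("Outline a minimal agentic bug-fix loop."))
    except OSError:
        print("No local vLLM/SGLang server running on port 8000.")
```

The same client code works unchanged against either serving stack, since both implement the OpenAI chat completions schema.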
DeepSeek Upgrades to 1M Context and Reportedly Prepares V4 Flagship by Mid-February
Large Model · Long Context · Agent
DeepSeek has updated its model on both the web and app platforms, expanding the context window to 1 million tokens and enabling single-pass processing of extremely long texts; real-world testing showed it can parse documents exceeding 240,000 tokens, such as the novel 'Jane Eyre'. Meanwhile, reports indicate DeepSeek plans to launch its next-generation flagship model, DeepSeek-V4, around mid-February, focusing on fixing logical reasoning deficiencies in earlier versions and strengthening autonomous task execution (AI agent) capabilities. The strategy is described as pursuing 'low cost, high performance' to compete globally; however, no further technical details or third-party evaluation results have been disclosed.
Palo Alto Completes Acquisition of CyberArk: Unifying Identity Protection for Humans, Machines, and AI Agents
Security · Acquisition · Identity Security
Palo Alto Networks announced the completion of its acquisition of CyberArk, integrating CyberArk’s identity security platform into its broader security ecosystem to cover identity and privileged access management for humans, machines, and AI agents. The company stated that machine identities already outnumber human identities by more than 80 times, approximately 75% of organizations still rely on outdated, over-permissioned identity management practices, and nearly 90% have experienced identity-centric security incidents. Palo Alto will expand privileged access and least-privilege enforcement to reduce risks from static permissions and lateral movement, claiming up to 80% faster incident response. CyberArk products will continue operating independently while gradually being integrated.
Claude Desktop Extensions Exposed with CVSS 10 Zero-Click RCE: Calendar Invites Can Trigger Execution
Security · Vulnerability · Agent
Security firm LayerX disclosed a zero-click remote code execution (RCE) vulnerability in Anthropic's Claude Desktop Extensions (also known as MCP Bundles), rated CVSS 10/10. Attackers can inject commands through Google Calendar events/invitations, inducing Claude to invoke local MCP servers and execute malicious operations without user interaction. The exploit chain includes downloading, compiling, and running arbitrary code. The root cause is attributed to insufficient sandboxing and the model’s high-privilege tool access. Anthropic reportedly responded that this scenario falls outside their current threat model, as these integrations are intended for local development tools where security responsibility lies primarily with users.
Nebius Acquires Tavily for $275M: Strengthening Real-Time Search Layer for AI Agents
Acquisition · AI Infrastructure · Retrieval
AI infrastructure company Nebius Group has announced the acquisition of Israeli startup Tavily for an initial $275 million, with milestone-based payouts potentially raising the total to $400 million. Tavily offers a search API tailored for AI agents, extracting real-time, structured information from public and private sources to mitigate LLM hallucinations and outdated responses. Customers reportedly include Cohere, MongoDB, IBM, and AWS. Nebius plans to integrate Tavily into its platform to strengthen end-to-end agent development and operation (build, fine-tune, and run) and to position real-time search as a core foundational component of agent systems. The deal is expected to close within the coming weeks.
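The "real-time search layer" pattern Tavily sells can be sketched in a few lines: retrieve fresh results first, then ground the LLM prompt in them. A minimal illustration with a stubbed search function (the function names and fake records below are inventions for the example, not Tavily's actual client; its real API is a hosted search endpoint returning ranked results):

```python
def stub_search(query: str) -> list[dict]:
    """Stand-in for a real search API call; returns fixed fake records."""
    return [
        {"url": "https://example.com/a", "content": f"Fresh result about {query}"},
        {"url": "https://example.com/b", "content": f"Second source on {query}"},
    ]

def grounded_prompt(question: str) -> str:
    """Assemble an LLM prompt that cites retrieved snippets, so the model
    answers from fresh sources instead of stale parametric knowledge."""
    results = stub_search(question)
    context = "\n".join(
        f"[{i + 1}] {r['url']}: {r['content']}" for i, r in enumerate(results))
    return f"Answer using only the sources below.\n{context}\n\nQ: {question}"

print(grounded_prompt("Nebius acquisition of Tavily"))
```

Numbering the sources lets the model cite them in its answer, which is the usual way agent frameworks make retrieval auditable.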
Singapore Releases Agentic AI Governance Framework: Emphasizing Human Accountability and Full Lifecycle Control
Policy · Governance · Agent
During the World Economic Forum, Singapore’s Infocomm Media Development Authority (IMDA) released the 'AI Governance Framework' for agentic AI, described as the world’s first governance guidance specifically targeting systems with autonomous decision-making capabilities. The framework outlines four core dimensions: risk assessment, human accountability, technical controls, and end-user responsibilities. It requires organizations to define clear usage boundaries, implement human approval checkpoints, and enforce technical oversight and auditability across the full lifecycle, from development through deployment to decommissioning. The document promotes voluntary compliance and alignment with international standards, aiming to fill policy gaps around 'autonomous-acting' systems and to provide actionable compliance guidance for enterprises scaling agent deployments.
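One of the controls the framework calls for, a human approval checkpoint in front of high-impact agent actions, is simple to implement in code. A minimal sketch (the action names and risk policy are invented for illustration, not taken from the IMDA document):

```python
# Actions the organization has classified as high-impact (hypothetical list).
HIGH_RISK = {"transfer_funds", "delete_records", "send_external_email"}

def execute(action: str, approver=None) -> str:
    """Run an agent action, requiring explicit human sign-off for anything
    on the high-risk list; every outcome is returned so it can be logged."""
    if action in HIGH_RISK:
        if approver is None or not approver(action):
            return f"BLOCKED: {action} (no human approval)"
    return f"EXECUTED: {action}"

audit_log = [
    execute("summarize_report"),                         # low risk, runs
    execute("transfer_funds"),                           # blocked, no approver
    execute("transfer_funds", approver=lambda a: True),  # human approved
]
for entry in audit_log:
    print(entry)
```

Keeping the decision and its outcome in a single audit log is what makes the "technical oversight and auditability" requirement checkable after the fact.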
NVIDIA Nemotron 3 Nano 30B Lands on SageMaker JumpStart: Open-Source with 1M Context Support
Cloud Service · Open Source · Large Model
AWS announced the general availability of NVIDIA’s Nemotron 3 Nano 30B MoE model on Amazon SageMaker JumpStart. The small 30-billion-parameter MoE model activates 3 billion parameters per forward pass, uses a hybrid Transformer-Mamba architecture, and supports context lengths of up to 1 million tokens. It targets tasks including code generation, scientific reasoning, math, and instruction following. AWS and NVIDIA emphasize that the model is 'fully open-source', providing weights, datasets, and training methodology so it can be customized and deployed on private infrastructure for privacy and security needs. Developers can deploy it directly from SageMaker Studio or integrate it via the AWS CLI and SageMaker SDK.
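The SDK route can be sketched as follows. The `model_id` string and instance type are assumptions (look up the exact identifier in the JumpStart catalog), and the call is wrapped so the sketch degrades gracefully when the `sagemaker` SDK or AWS credentials are absent:

```python
MODEL_ID = "nvidia-nemotron-3-nano-30b"  # hypothetical ID; check the catalog

def deploy_jumpstart(model_id: str = MODEL_ID):
    """Deploy a JumpStart model to a real-time endpoint, returning the
    predictor, or None when the SDK or AWS credentials are unavailable."""
    try:
        from sagemaker.jumpstart.model import JumpStartModel
        model = JumpStartModel(model_id=model_id)
        # deploy() provisions a managed endpoint (this incurs AWS charges):
        return model.deploy(instance_type="ml.g5.12xlarge")  # assumed size
    except Exception:
        return None  # SDK missing, unknown model ID, or no AWS credentials

endpoint = deploy_jumpstart()
print("endpoint ready" if endpoint else "SageMaker SDK/credentials unavailable")
```

The Studio path is the same operation driven from the JumpStart UI; the returned predictor accepts inference requests once the endpoint is in service.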
OpenAI Reportedly Disbands Mission Alignment Team, Leader Appointed 'Chief Futurist'
Company News · AI Safety
According to a Platformer report picked up by The Verge, OpenAI has disbanded its Mission Alignment team and redistributed its members across other departments. The team was responsible for work related to 'ensuring AGI benefits all humanity.' Former head Joshua Achiam will move into a newly created role, 'chief futurist.' Few public details are available on the number of affected staff, project handovers, or implications for external safety commitments. The move is read as OpenAI further aligning its organizational structure with product development and commercialization, though the company has not offered a comprehensive explanation in response to the report.
SMIC Warns HBM Shortage Will Last Years, Flags Risk of AI Data Center Overbuilding
AI Infrastructure · Chip · Compute
In its earnings briefing, Semiconductor Manufacturing International Corporation (SMIC) said that strong AI-driven demand for high-end memory will keep HBM (High Bandwidth Memory) in shortage for several years, with the key bottleneck potentially shifting from wafer fabrication to back-end testing. This could squeeze supply for other end markets and drive up costs. The company also warned that some enterprises aim to build a decade’s worth of data center capacity within just one to two years; if use cases are poorly planned, utilization rates may fall short of expectations. SMIC disclosed a 95.7% capacity utilization rate and projected capital expenditures of $8.1 billion for 2025, with similar levels expected in 2026.
Reuters: ByteDance in Talks with Samsung on AI Chips and Memory Supply to Ease Compute Chain Strain
Chip · Supply Chain · Big Tech
Reuters cited sources indicating that ByteDance is negotiating with Samsung on developing AI chips and securing scarce memory chip supplies to address supply chain pressures arising from global AI infrastructure expansion. The report frames this within the broader context of 'AI data center growth versus compute and storage shortages,' highlighting that critical components like memory have become tangible constraints on training and inference scalability. Specifics of the collaboration—such as whether it involves in-house design, joint development, or foundry arrangements—along with production commitments, delivery timelines, and product lines involved were not disclosed. This move reflects how leading content and platform companies are extending deeper into the semiconductor supply chain to reduce reliance on general-purpose GPUs and external suppliers.