Third-party model directories and media report that DeepSeek plans to release its flagship V4 in the first week of March: a trillion-parameter Mixture-of-Experts (MoE) model with approximately 32 billion activated parameters, natively multimodal across text, images, video, and audio, and supporting up to 1 million tokens of context. The information indicates it is optimized primarily for Huawei Ascend and Cambricon chips, with no availability for Nvidia/AMD during pre-release. Leaked benchmarks suggest a HumanEval score of around 90% and SWE-bench Verified performance exceeding 80%, with potential adoption of an MIT/Apache 2.0 open license and lower inference pricing; however, these specifications, performance metrics, and pricing have not been officially confirmed.
OpenAI Discloses Defense Department Contract: Bans Surveillance and Autonomous Weapons
Policy & Compliance, AI Safety, Defense
OpenAI has disclosed further details about its contract terms and deployment methods with the U.S. Department of Defense, stating that its models can be used within classified network environments but under three strict red lines: not for domestic mass surveillance, not for autonomous weapons systems, and not for high-risk automated decision-making (such as social credit scoring). The company emphasizes risk mitigation through cloud/API delivery, retaining control over safety architecture, involvement of cleared personnel, and contractual constraints, and reserves the right to terminate cooperation if terms are violated. Critics express concern that intelligence collection under Executive Order 12333 could still circumvent these restrictions, sparking debate over whether deployment architecture can substitute for legal safeguards.
Military Reportedly Used Claude to Plan Iran Airstrike Despite Ban
AI Governance, Defense
Although the Trump administration announced on March 1 a federal ban on using Anthropic's Claude and mandated a six-month phase-out, the Wall Street Journal reported that the U.S. military continued to use Claude the next day for intelligence analysis and target identification during planning of an airstrike on Iran. The usage reportedly occurred within hours of the ban’s announcement, highlighting a gap between national security demands and policy enforcement, and intensifying discussions on military AI boundaries, auditability, and accountability. The incident also brings public attention to tensions between 'supply chain risk' classifications, actual procurement processes, and frontline operational timelines.
China Releases National Standard System for Embodied AI and Humanoid Robots
Policy & Standards, Embodied Intelligence
According to Xinhua News Agency, China released on March 1 the first national standard system covering the entire industrial chain and full lifecycle of humanoid robots and embodied artificial intelligence. Developed by a technical committee under the Ministry of Industry and Information Technology, it involved over 120 research institutions, enterprises, and industry users. The system is divided into six parts: foundational commonalities, brain-inspired and intelligent computing, limbs and components, complete machines and systems, applications, and safety and ethics. It sets norms for data lifecycle management, model training and deployment, and scenario operations. The report also notes that 2025 is seen as the year of mass production, with more than 140 domestic manufacturers having launched over 330 models; the new standard system aims to provide unified interfaces, testing protocols, and safety boundaries for large-scale deployment.
Radford et al. Propose Token-Level Pretraining Filtering: Relearning Cost ×7000
AI Safety, Training Data
Jiqizhixin (Synced) reports that Alec Radford and colleagues propose token-level data filtering during pretraining, using loss masking or direct token removal, to surgically prevent models from learning specific dangerous knowledge while discarding less useful data than document-level filtering. The paper reports a defense that strengthens with scale: on a 1.8B-parameter model, relearning efficiency in filtered domains drops by roughly 7,000-fold, and the method proves more resistant to recovery through adversarial fine-tuning than post-hoc machine unlearning. If validated, this suggests safety strategies could rely more on the representational differences of a 'never learned' state rather than solely on post-alignment semantic discrimination.
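The two filtering options named above, loss masking and token removal, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the blocklist-based `flag_tokens` is a hypothetical stand-in for whatever classifier the authors use, and the per-token log-probabilities are made-up inputs.

```python
import math

def flag_tokens(tokens, blocked_terms):
    """Hypothetical stand-in for a token classifier: True means 'keep'."""
    return [tok.lower() not in blocked_terms for tok in tokens]

def masked_token_loss(token_log_probs, keep_mask):
    """Loss masking: mean negative log-likelihood over kept tokens only.

    Masked tokens stay in the input sequence but contribute nothing
    to the training loss.
    """
    if len(token_log_probs) != len(keep_mask):
        raise ValueError("log-probs and mask must have the same length")
    kept = [-lp for lp, keep in zip(token_log_probs, keep_mask) if keep]
    return sum(kept) / len(kept) if kept else 0.0

def remove_tokens(tokens, keep_mask):
    """Token removal: drop flagged tokens from the sequence entirely."""
    return [tok for tok, keep in zip(tokens, keep_mask) if keep]

# Illustrative inputs only (assumed, not from the paper).
tokens = ["the", "toxin", "binds"]
mask = flag_tokens(tokens, blocked_terms={"toxin"})
log_probs = [math.log(0.5), math.log(0.1), math.log(0.25)]
loss = masked_token_loss(log_probs, mask)
filtered = remove_tokens(tokens, mask)
```

Under loss masking the flagged token is still visible as context for later predictions; under removal the model never sees it at all.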
Open Cowork Open-Sources Desktop Agent: Supports GUI Operations and Remote Collaboration
Open Source, AI Agents, GUI Automation
The open-source project Open Cowork has launched a 'desktop virtual colleague' solution that emphasizes an end-to-end workflow from conversation to deliverables. Its GUI module enables general screen interactions such as clicks and inputs, its Skills modules generate structured outputs such as documents, and it supports remote collaboration and control via Feishu, feeding local execution results directly back into team workflows. The project warns of permission risks and adopts a secure-by-default design, restricting read/write access to designated Workspaces and recommending execution in isolated environments such as WSL2 or Lima to minimize impact on host systems and sensitive data.
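The idea of restricting file access to a designated Workspace can be sketched as a path check in Python. This is a generic illustration, not Open Cowork's code: the function name `is_inside_workspace` and the example paths are assumptions.

```python
from pathlib import Path

def is_inside_workspace(workspace, candidate):
    """Return True if `candidate` resolves to a path inside `workspace`.

    Both paths are resolved first, so '..' segments and absolute paths
    cannot be used to escape the workspace directory.
    """
    ws = Path(workspace).resolve()
    # Joining with an absolute candidate discards `ws`, which is the
    # behaviour we want: the absolute path is then checked on its own.
    target = (ws / candidate).resolve()
    return target == ws or ws in target.parents

# Hypothetical workspace location, for illustration only.
allowed = is_inside_workspace("/tmp/ws", "notes/report.md")
escaped = is_inside_workspace("/tmp/ws", "../etc/passwd")
absolute = is_inside_workspace("/tmp/ws", "/etc/passwd")
```

A check like this limits only file paths; it does not replace running the agent in an isolated environment, which is why the project recommends both.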
Clay Uses LangSmith to Monitor 300M Monthly Agent Runs with 99.5% Billing Accuracy
LLMOps, AI Agents
A LangChain customer case study reports that Clay runs approximately 300 million AI agent tasks per month, each typically involving 10–30 tool calls. The team uses LangSmith for tracing and observability, raising the accuracy of cost reconciliation between platform data and vendor invoices to 99–99.5%. They also run automated evaluations on real-user traces using rule-based checks and LLM-as-a-judge methods. Clay lets support teams inspect execution trees directly for faster debugging and uses a routing layer to distribute tasks across different models, managing both cost and quality regressions. The case shows how observability and financial reconciliation become essential for agent operations at scale.
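The case study does not say how reconciliation accuracy is computed. One plausible definition, assumed here purely for illustration, is one minus the total absolute gap between traced and invoiced cost, divided by the invoiced total. The vendor names and amounts in this Python sketch are hypothetical.

```python
def reconciliation_accuracy(traced_cost, invoiced_cost):
    """Assumed metric: 1 - sum(|traced - invoiced|) / sum(invoiced).

    Both arguments map a vendor name to a cost in the same currency.
    A vendor missing from the traces counts as zero traced cost.
    """
    total_invoiced = sum(invoiced_cost.values())
    if total_invoiced <= 0:
        raise ValueError("invoiced total must be positive")
    gap = sum(
        abs(traced_cost.get(vendor, 0.0) - amount)
        for vendor, amount in invoiced_cost.items()
    )
    return 1.0 - gap / total_invoiced

# Hypothetical monthly totals, not figures from the case study.
traced = {"vendor_a": 995.0, "vendor_b": 500.0}
invoiced = {"vendor_a": 1000.0, "vendor_b": 500.0}
accuracy = reconciliation_accuracy(traced, invoiced)
```

Summing absolute gaps per vendor, rather than comparing grand totals, keeps an overcount at one vendor from cancelling an undercount at another.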
MORPH PDE Foundation Model: 7M–480M Pretrained Across 1D–3D
Research Paper, Scientific Computing
The MORPH paper introduces a shape-agnostic PDE foundation model using convolutional vision Transformers to uniformly handle heterogeneous spatiotemporal data across 1D–3D, multiple resolutions, and scalar/vector fields. The authors introduce a Unified Physical Tensor Format (UPTF-7) and pretrain variants ranging from 7M to 480M parameters. After pretraining on six datasets, they fine-tune on seven downstream tasks. Experiments show superior performance over from-scratch training and existing PDE foundation models in both zero-shot and full-shot settings, with parameter-efficient fine-tuning methods like LoRA remaining effective—pointing toward more data-efficient and transferable pathways in scientific machine learning.
Honor Launches 'Robot Phone' and First Humanoid Robot
AI Hardware, Embodied Intelligence
According to the South China Morning Post, Honor unveiled its first 'robot phone' and first humanoid robot ahead of MWC Barcelona. The robot phone features a motorized three-axis gimbal camera that automatically extends, enabling motion tracking, interactive lens movements, full-angle video calling, and responsive actions such as nodding or dancing to music. The concurrently launched humanoid robot targets applications including shopping assistance, workplace inspection, and companionship. The launch reflects smartphone makers integrating multimodal AI with mechanical structures to explore terminal devices with stronger interaction and execution capabilities beyond traditional 'voice + screen' AI phones.