DeepSeek V4 Released: Trillion-Parameter MoE Model Open-Sourced with $5.2M Training Cost
Open Source ModelMoE ArchitectureAI Competition
AI startup DeepSeek has launched its latest model, DeepSeek V4, a trillion-parameter Mixture of Experts (MoE) architecture. The model is fully open-sourced under the Apache 2.0 license, matching the performance of top U.S. AI models while achieving a remarkably low training cost of approximately $5.2 million, highlighting exceptional cost efficiency. Reports suggest DeepSeek V4 will run entirely on Huawei chips, marking accelerated progress in China's AI computing self-reliance. The release underscores the potential of MoE technology in large-scale AI, delivering state-of-the-art performance at minimal cost and challenging industry pricing norms. The open-source strategy enables global developers to freely use and study the model, advancing an efficient and open AI ecosystem.
OpenAI Proposes Robot Tax, Public Wealth Fund, and Four-Day Workweek
AI PolicySocial ImpactOpenAI
OpenAI has released a set of policy recommendations to address socioeconomic impacts from rapid AI advancement, including establishing a public wealth fund to distribute AI-generated income to citizens, taxing automated labor, advocating a four-day workweek without pay cuts, shifting tax bases from labor to corporate income and capital gains, and accelerating expansion of U.S. power infrastructure to meet AI data center energy demands. Chief Global Affairs Officer Chris Lehane stated these proposals are inspired by historical technological shifts and have been discussed with U.S. government officials and senators. OpenAI also launched a safety research fellowship and policy research grants, offering up to $1 million in API credits for external research.
OpenAI CEO Sam Altman aims to push for an IPO in Q4 2026, targeting a valuation as high as $1 trillion to outpace rival Anthropic. However, CFO Sarah Friar has expressed concerns, questioning whether the company is ready for public markets. Key worries include over $600 billion in committed cloud server capacity spending over the next five years, which could consume more than $200 billion before achieving positive cash flow. Friar has since been reassigned to report to business head Fidji Simo and excluded from certain critical financial discussions. Meanwhile, several executives have recently departed or shifted roles, including the COO’s reassignment and CMO’s exit due to health reasons, intensifying uncertainty around the IPO outlook.
Generalist Releases GEN-1 Robotics AI Model, Task Success Rate Jumps from 64% to 99%
RoboticsEmbodied IntelligenceAI Model
Generalist has unveiled GEN-1, an AI model that significantly enhances robotic manipulation in real-world environments, increasing average task success rates from 64% to 99%. Built on the GEN-0 architecture, GEN-1 operates at roughly three times the speed of its predecessor and can autonomously recover after unexpected disruptions. It has demonstrated high success rates in tasks such as folding 86 consecutive T-shirts, inspecting robot vacuums over 200 times, and packing blocks 1,800 times. Unlike traditional models reliant on extensive robot-collected data, GEN-1 uses low-cost data gathered via human-worn devices for pre-training, enabling efficient learning. While not solving all tasks, the company states GEN-1 lays foundational progress for scalable physical intelligence.
Claude Code Hit by Critical Security Bypass Vulnerability, Over 50 Subcommands Can Evade Rules
AI SecurityVulnerabilityAnthropic
A critical security bypass vulnerability exists in Anthropic's Claude Code AI coding agent: when shell command chains exceed 50 subcommands, the system skips all user-defined deny rules and falls back only to generic permission prompts. Attackers could embed long chains containing over 50 benign commands plus hidden malicious ones into a CLAUDE.md file in a GitHub repository, silently bypassing security policies when developers clone and run the project, potentially stealing SSH keys or cloud credentials. Additionally, Anthropic accidentally leaked over 500,000 lines of Claude Code source code, which has since been re-uploaded to GitHub and widely analyzed in China’s developer community. The vulnerability has been patched in version v2.1.90.
Nvidia Rubin GPU Production May Drop 25% Due to Memory Shortage, Output Cut from 2M to 1.5M
NvidiaChip Supply ChainAI Infrastructure
According to KeyBanc analyst John Vinh, Nvidia may reduce production of its next-generation Rubin GPUs in 2026 from the planned 2 million units to 1.5 million due to delays in high-bandwidth memory (HBM) supply from SK Hynix and Micron. The Rubin GPU will power the upcoming Vera Rubin AI servers, expected to deliver 3.3x the performance of Blackwell Ultra, with a planned launch in late 2026. Meanwhile, Microsoft and Google are breaking industry norms by signing three-year DRAM supply agreements with SK Hynix, incorporating price floor guarantees and 10%-30% advance deposits, transforming DRAM procurement from commodity buying into strategic resource stockpiling. KeyBanc maintains its 'overweight' rating and $275 price target on Nvidia.
PrismML Emerges from Stealth with $16.25M Funding, Open-Sources 1-bit LLM Bonsai Requiring Just 1GB Memory
Model CompressionOpen SourceEdge AI
PrismML, founded by Caltech researchers, has closed a $16.25 million seed round and open-sourced Bonsai, a family of 1-bit large language models available in 8B, 4B, and 1.7B parameter versions. Bonsai 8B achieves performance comparable to standard 16-bit models—typically requiring 16GB of memory—while consuming only about 1GB. PrismML claims the model is fully end-to-end binarized with no high-precision fallback, enabling up to 8x faster processing and 75%-80% lower energy consumption, making it ideal for edge device deployment. Although its 'intelligence density' metric remains unverified independently, the technology is seen as a key direction for addressing rising AI scaling costs.
Google DeepMind Releases Framework for AI Agent Traps, Prompt Injection Succeeds Partially in 86% of Cases
AI SecurityAgentGoogle DeepMind
Google DeepMind has published the first systematic framework for AI Agent Traps, identifying six categories of web-based attacks targeting autonomous AI agents across perception, reasoning, memory, action, multi-agent dynamics, and human oversight. Empirical data shows prompt injection partially succeeded in 86% of test scenarios, sub-agent hijacking achieved 58%-90% success, and data leakage attacks exceeded 80% success across five different architectures. More critically, these traps can be chained together and distributed across multi-agent systems, surpassing the defensive capabilities of single-layer security filters. The paper also highlights a legal 'accountability gap': responsibility remains unclear when hijacked agents commit financial crimes.
Wipro Acquires Mindsprint for $375M and Secures $1B Long-Term Contract with Olam Group
M&AIT ServicesDigital Transformation
Indian IT giant Wipro has announced the acquisition of Singapore-based IT services firm Mindsprint for $375 million, expected to close by June 30, 2026. The acquisition is closely tied to an eight-year strategic agreement with Olam Group valued at over $1 billion, including $800 million in committed spending. Formerly Olam Group’s IT division, Mindsprint employs over 3,200 people and operates in enterprise applications, cloud, cybersecurity, and digital engineering, generating $135.6 million in revenue during FY2025. Through this move, Wipro strengthens its expertise in AI-driven digital transformation, supply chain, and agriculture sectors.
Japan's NII Open-Sources LLM-jp-4, Surpassing GPT-4o in Japanese Language Understanding
Open Source ModelJapanese AIMultilingual
Japan's National Institute of Informatics (NII) open-sourced its domestic large language model series LLM-jp-4 on April 3, including 8B and 32B-A3B versions. Trained on approximately 12 trillion tokens of high-quality data, both models scored 7.54 and 7.782 respectively on the Japanese MT-Bench benchmark, outperforming GPT-4o (7.29) and Qwen3-8B (7.14), while achieving performance on par with GPT-4o in English. The 32B version uses an MoE architecture, activating only 3B parameters, and supports up to 65,000 tokens for input and output. Trained on ~19.5 trillion tokens—about six times larger than its predecessor—using the AIST ABCI 3.0 supercomputer, development involved over 2,600 participants through a public-private collaboration. A larger 332B model is currently in development.