Saturday, April 25, 2026
9 stories · 3 min read

Today's Highlights

1. DeepSeek Releases V4 Open-Source Model, 1.6T Parameters Fully Optimized for Huawei Ascend Chips

Model Release · Open Source · Domestic Computing Power

On April 24, DeepSeek released and open-sourced its V4 series, including V4-Pro (1.6T total parameters / 49B activated) and V4-Flash (284B / 13B), both supporting a 1 million token context window. The models adopt a hybrid attention architecture (CSA+HCA), reducing KV cache to just 10% of V3.2's usage, with over 27 trillion tokens used in pretraining. V4-Pro scores 93.5 on LiveCodeBench and 3206 on Codeforces, surpassing GPT-5.4 and Claude Opus 4.6 in coding performance. API pricing is highly competitive: the Pro version costs only $3.48 per million output tokens, approximately 1/9th of GPT-5.5’s price. The model runs entirely on Huawei Ascend platforms, achieving 20ms low-latency inference for V4-Pro on Ascend 950 super-nodes—marking the first cutting-edge open-source model independent of NVIDIA hardware. Released under the MIT license, weights are now available on Hugging Face.
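To put the KV-cache claim in context: at a 1 million token window, cache size is the binding constraint on serving cost, which is why a 90% reduction matters. A back-of-envelope sizing sketch follows; all architectural numbers (layers, KV heads, head dimension) are assumed for illustration and are not DeepSeek's published configuration.

```python
# Hedged sketch: rough KV-cache sizing for one sequence.
# Layer/head/dim values below are illustrative assumptions only.

def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes to cache keys and values (hence the factor of 2) for one sequence."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens

# Hypothetical dense baseline vs. a cache holding only 10% as much state.
baseline = kv_cache_bytes(num_layers=60, num_kv_heads=8, head_dim=128,
                          context_tokens=1_000_000)
reduced = int(baseline * 0.10)
print(f"baseline: {baseline / 2**30:.1f} GiB, reduced: {reduced / 2**30:.1f} GiB")
```

Even with these toy numbers, a full-size cache runs to hundreds of GiB per million-token sequence, so a 10x reduction is the difference between multi-node and single-node serving.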

2. Google Plans Up to $40 Billion Investment in Anthropic, Valuing Company at $350 Billion

Funding · Cloud Computing · Competitive Landscape

Google plans to invest up to $40 billion in Anthropic, including an immediate $10 billion cash injection (valuing the company at $350 billion) and $30 billion tied to performance milestones, plus a five-year commitment of 5 gigawatts of computing capacity, bringing total commitments to about $43 billion. Anthropic has reached annualized revenue of $30 billion, with Claude holding a 32% share of the enterprise LLM API market and counting eight of the Fortune 10 among its clients. Amazon previously committed up to $33 billion. The move is seen as a strategic defense by Google, compensating for Gemini's weak enterprise competitiveness and ensuring Anthropic's compute demand flows into Google Cloud infrastructure. The deal structure avoids merger scrutiny but raises antitrust concerns. Anthropic's valuation on Forge Global's secondary market has already reached $1 trillion.

3. Cohere Acquires Germany's Aleph Alpha, Merging to Form $20 Billion Entity to Advance European AI Sovereignty

M&A · AI Sovereignty · Enterprise AI

Canadian AI firm Cohere announced the acquisition of German AI company Aleph Alpha, forming a combined entity valued at approximately $20 billion, with Cohere shareholders owning 90% and Aleph Alpha shareholders 10%. Germany's Schwarz Group (parent of Lidl) will invest $500–600 million in Cohere's upcoming Series E round. The merged company will operate dual headquarters in Canada and Heidelberg, focusing on regulated sectors such as defense, finance, and public services, with contracts already signed with German federal and local governments. Cohere reports $240 million in annual recurring revenue. Backed by both the Canadian and German governments, the deal proceeds under the bilateral 'Sovereign Technology Alliance' framework, which aims to deliver AI solutions independent of U.S. tech giants.

4. NVIDIA Deploys OpenAI Codex Agent to All Employees, Weekly Active Developers Surpass 4 Million

Enterprise Deployment · AI Agent · Development Tools

NVIDIA has rolled out the OpenAI Codex AI agent, powered by GPT-5.5, to all employees, following early access for around 10,000 staff across engineering, legal, marketing, and other departments. CEO Jensen Huang described internal testing feedback as "shocking" and "life-changing" in an internal email. Codex runs on NVIDIA's own Blackwell infrastructure. OpenAI simultaneously launched the Codex Labs enterprise program, with seven global system integrators, including Accenture and Tata Consultancy Services, already on board. Weekly active developers using Codex have surpassed 4 million. GitHub Copilot has also officially integrated the GPT-5.5 model, offering promotional pricing that reduces per-request costs by a factor of 7.5, available to Pro+, Business, and Enterprise users.

5. Meituan Launches LongCat-2.0 Trillion-Parameter LLM, Trained on 50,000–60,000 Domestic AI Accelerator Cards

Model Release · Domestic Computing Power

On April 24, Meituan unveiled LongCat-2.0-Preview, a trillion-parameter large language model supporting a 1M-token context window and matching GPT-5.5 in performance. Optimized for agent scenarios, it excels in code generation, task planning, and enterprise automation. The model was trained entirely on domestic AI clusters using 50,000–60,000 domestically produced accelerator cards, a record for the largest training run on Chinese-made compute. CEO Wang Xing emphasized three consecutive years of large-scale AI investment, including stakes in Moore Threads, Zhipu AI, and Moonshot. LongCat-2.0-Preview is now open for testing, with 10 million free tokens per day.

6. ComfyUI Raises $30 Million at $500 Million Valuation, Building Open-Source Creative AI Platform with 4 Million Users

Funding · Open Source · Creative Tools

Open-source generative AI workflow platform ComfyUI announced a $30 million funding round, reaching a $500 million valuation and bringing total funding to $48 million. The round was led by Craft Ventures, with participation from Pace Capital and Chemistry. ComfyUI offers a node-based visual interface enabling fine-grained control over diffusion models for image, video, and audio generation. It boasts over 4 million users, more than 60,000 community-built nodes, and averages 150,000 downloads per day. CEO Yoland Yan noted that traditional prompt-based tools achieve only 60–80% of desired results, while the node-based architecture ensures output consistency and reproducibility. The platform is widely adopted in visual effects, animation, advertising, and industrial design.

7. Critical Unpatched Ollama Vulnerability: Malicious Model Files Can Steal Sensitive Server Memory Data

Security Vulnerability · AI Infrastructure

Security researchers discovered a critical unpatched vulnerability, CVE-2026-5757, in the Ollama platform, affecting how its model quantization engine processes GGUF-format files. Attackers can upload malicious AI model files that exploit unchecked metadata and unsafe memory operations to read sensitive data, including API keys, private user data, and intellectual property, from server heap memory without authentication. Leaked memory content can be embedded into new model layers and exfiltrated via registration APIs. As of late April, CERT had not contacted the vendor, and no official patch is available. Immediate mitigations include disabling model uploads, restricting deployment environments, and using only trusted model sources. The vulnerability poses significant risks to shared development environments and internet-exposed systems.
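Since the root cause is unchecked GGUF metadata, one generic defense-in-depth step (not Ollama's code, and not a substitute for a vendor patch) is a pre-flight sanity check on a model file's header before any loader touches it. The field layout below follows the public GGUF specification; the `MAX_*` bounds are illustrative assumptions, not values taken from the advisory.

```python
# Hedged sketch: reject GGUF headers with a bad magic or implausible
# tensor/metadata counts before passing the file to a real loader.
import struct

GGUF_MAGIC = b"GGUF"
MAX_TENSORS = 100_000        # assumed upper bound for a sane model
MAX_METADATA_KEYS = 10_000   # assumed upper bound

def check_gguf_header(header: bytes) -> bool:
    """Sanity-check the first 24 bytes of a GGUF file:
    magic(4) + version(uint32) + tensor_count(uint64) + kv_count(uint64),
    all little-endian per the GGUF spec."""
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        return False
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    # Absurd counts are a red flag for crafted metadata.
    return 1 <= version <= 3 and tensor_count <= MAX_TENSORS and kv_count <= MAX_METADATA_KEYS
```

A check like this only filters obviously malformed files; it does not close the memory-safety hole itself, so the listed mitigations still apply.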

8. Orkes Completes $60 Million Series B, Building AI Workflow Orchestration Platform Based on Netflix Conductor

Funding · Enterprise AI · Workflow

AI workflow orchestration platform Orkes announced a $60 million Series B financing led by AVP, bringing total funding to approximately $90 million. The company’s core technology originates from Netflix’s open-source project Conductor and already serves major enterprises like JPMorgan Chase and Tesla, aiming to solve the challenge of scaling AI applications from pilot to production. New features include Agent Runtime, MCP Gateway, and Prompt-to-Workflow, designed to improve predictability, fault tolerance, and internal API integration in AI agent systems. Funds will expand global operations and enhance tooling suites to support enterprise-scale deployment of mission-critical AI applications.

9. Cloud Providers Like Microsoft Prioritize Internal GPU Allocation, Restricting Access for AI Startups

Compute Supply · Supply Chain

According to The Information, major cloud providers, including Microsoft, are prioritizing allocation of NVIDIA GPUs to internal AI teams and large customers such as OpenAI, making it difficult for AI startups to secure sufficient compute. Smaller external clients face longer wait times and higher prices for spot instances, pushing them toward multi-vendor strategies, older accelerators, or on-premise clusters. Techniques such as model distillation, quantization, and parameter-efficient fine-tuning (e.g., LoRA) are gaining value as a result. The trend reflects ongoing AI hardware supply constraints and dominant firms' control over infrastructure, exacerbating inequality across the industry.
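The LoRA technique mentioned above is what makes fine-tuning viable on scarce compute: instead of updating a full weight matrix W, one trains two small low-rank factors B and A and merges them as W + (α/r)·BA. A toy, dependency-free sketch of that merge step, with all dimensions and values made up for illustration:

```python
# Hedged sketch of the LoRA merge: W_eff = W + (alpha / r) * (B @ A).
# Pure-Python matrices keep the example dependency-free.

def matmul(X, Y):
    """Naive matrix multiply on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_merge(W, B, A, alpha: float, r: int):
    """Add the scaled low-rank update (alpha/r) * B @ A to base weights W."""
    delta = matmul(B, A)
    return [[w + (alpha / r) * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: 2x2 base weights with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]        # d_out x r
A = [[0.5, 0.5]]          # r x d_in
merged = lora_merge(W, B, A, alpha=2.0, r=1)
```

The point of the factorization is that B and A together hold d_out·r + r·d_in parameters instead of d_out·d_in, which is why it cuts fine-tuning compute and memory when r is small.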

