MiniMax Open Sources M2.5, SWE-Bench Verified 80.2%
Model ReleaseOpen SourceAI Coding
MiniMax launched its text model M2.5 on February 12 and open-sourced it globally on February 13, positioning it as a production-grade model built natively for agents. It scored 80.2% on SWE-Bench Verified and 51.3% on Multi-SWE-Bench. The Lightning version delivers over 100 TPS output speed, with input priced at roughly $0.3 per million tokens and output at roughly $2.4 per million tokens; tool-calling and search capabilities improved by 20%. The company attributes roughly 40x training acceleration to its Forge framework and large-scale agent reinforcement learning.
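At the quoted rates, per-request cost is simple arithmetic. A minimal sketch, using the announced prices; the token counts in the example are made-up values for illustration, not MiniMax figures:

```python
# Per-request cost at the quoted M2.5 Lightning rates. Prices come from
# the announcement above; the example token counts are hypothetical.

INPUT_PRICE_PER_M = 0.3   # USD per 1M input tokens (quoted)
OUTPUT_PRICE_PER_M = 2.4  # USD per 1M output tokens (quoted)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single request at the quoted rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A hypothetical agent turn: 50k tokens of context in, 5k tokens out.
print(f"${request_cost(50_000, 5_000):.4f}")  # $0.0270
```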
Anthropic Completes $30 Billion Funding Round, Valued at $380 Billion
FundingLarge Model
Anthropic announced the completion of a new $30 billion funding round at a post-money valuation of $380 billion, led by Coatue and Singapore's sovereign wealth fund GIC, with participation from 38 investors including Microsoft and NVIDIA. The company disclosed annualized revenue of approximately $14 billion and more than 500 enterprise customers, eight of which are in the Fortune Top 10. Funds will go toward frontier research, product development, and scaling compute infrastructure. Anthropic also pledged to offset electricity price increases caused by its data center expansion so that power costs are not passed on to residential users.
Ant Group Open Sources Ring-2.5-1T, Self-Evaluated 35/42 on IMO
Open SourceReasoning ModelLong Context
Ant Group open-sourced its reasoning model Ring-2.5-1T, built on a hybrid attention architecture that interleaves MLA and Lightning Linear layers at a 1:7 ratio and is designed for long-context reasoning. Documentation indicates that once generation length exceeds 32K tokens, memory access drops to one-tenth of the previous generation's while generation throughput triples. Training adds dense rewards on top of RLVR and uses fully asynchronous agent reinforcement learning to strengthen long-horizon task planning and tool collaboration. In self-evaluation the model scored 35/42 on IMO 2025, which the company claims falls within gold-medalist range.
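The 1:7 ratio implies a repeating layer schedule. A minimal sketch, assuming the common interpretation of one full-attention (MLA) layer per seven linear-attention layers; the exact interleaving used by Ring-2.5-1T is not specified in the announcement, so this ordering is an illustrative assumption:

```python
# A 1:7 hybrid stack: seven linear (Lightning Linear) layers followed by
# one full-attention (MLA) layer, repeated. The actual ordering inside
# Ring-2.5-1T may differ; this is illustration only.

def hybrid_schedule(num_layers: int, linear_per_mla: int = 7) -> list[str]:
    """Per-layer attention type for a repeating (N linear + 1 MLA) stack."""
    pattern = ["linear"] * linear_per_mla + ["mla"]
    return [pattern[i % len(pattern)] for i in range(num_layers)]

layers = hybrid_schedule(32)
print(layers.count("linear"), layers.count("mla"))  # 28 4
```

The payoff of such a schedule is that only the sparse MLA layers keep a full KV cache, which is where the claimed memory-access reduction at long generation lengths comes from.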
DeepMind Introduces Aletheia, Achieves 95.1% on IMO-ProofBench Advanced
AI AgentResearch Progress
Google DeepMind released Aletheia, a mathematical research agent built on an enhanced Gemini Deep Think that uses a three-phase 'generate-verify-revise' agent loop to improve proof accuracy. It achieved 95.1% on IMO-ProofBench Advanced, surpassing the prior record of 65.7%. The team reported that scaling inference-time compute cut the computational cost of solving Olympiad problems 100-fold relative to the 2025 version. Aletheia independently resolved four open problems from the Erdős problem set and proposed a hierarchical framework for assessing AI autonomy in mathematical research.
vLLM Discloses DeepSeek Performance of 7360 TGS per GB300 Card
Inference AccelerationComputeEngineering Practice
vLLM published performance benchmarks for running DeepSeek-V3.2 and DeepSeek-R1 on NVIDIA GB300 (Blackwell Ultra). Under NVFP4 quantization with TP2 parallelism, V3.2 reaches 7360 TGS (tokens per GPU per second) in prefill and 2816 TGS in mixed scenarios (2k input / 1k output); R1 reaches 22476 TGS prefill and 3072 TGS mixed on 2×GB300 with an EP2 configuration. The article notes that GB300 offers roughly 8x prefill and 10–20x mixed throughput gains over Hopper, and discusses prefill/decode disaggregation to improve latency and throughput under high concurrency.
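Reading TGS as tokens per GPU per second, the quoted figures translate directly into daily volume. A back-of-the-envelope sketch on the quoted numbers:

```python
# Daily token volume implied by a TGS (tokens per GPU per second) figure.
# Assumes sustained throughput with no batching gaps, so this is an
# upper bound, not a capacity-planning number.

SECONDS_PER_DAY = 86_400

def tokens_per_day(tgs: float, gpus: int = 1) -> float:
    """Aggregate tokens processed per day at a given per-GPU rate."""
    return tgs * gpus * SECONDS_PER_DAY

# V3.2 prefill on a single GB300 at the quoted 7360 TGS:
print(f"{tokens_per_day(7360) / 1e9:.2f}B tokens/day")  # 0.64B tokens/day
```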
GitHub Previews Agentic Workflows, Markdown-Defined Agents Run in Actions
AI AgentEngineering Practice
GitHub unveiled a technical preview of Agentic Workflows, which lets developers describe intent in Markdown and have AI agents execute repository maintenance tasks in GitHub Actions, such as triaging issues, updating documentation, and suggesting code simplifications, positioning the approach as 'Continuous AI'. To mitigate overreach and errors, workflows default to read-only permissions; any write action must map through 'safe outputs' onto pre-approved, auditable GitHub operations, keeping humans in the approval loop. GitHub emphasized that agents run in sandboxed environments with explicit declarations of permissions, tools, and allowed outputs.
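The 'safe outputs' idea amounts to an allow-list between the agent and anything that writes. A minimal sketch of that gating pattern; the operation names and policy shape below are assumptions for illustration, not GitHub's actual Agentic Workflows schema:

```python
# Allow-list gating for agent-proposed writes: the agent runs read-only,
# and a proposed write only proceeds if it maps to a pre-approved
# operation. Operation names here are hypothetical.

SAFE_OUTPUTS = {"create-issue-comment", "add-labels", "open-pull-request"}

def apply_agent_output(action: str, payload: dict) -> dict:
    """Gate a proposed write behind the allow-list; never run raw writes."""
    if action not in SAFE_OUTPUTS:
        return {"status": "rejected", "reason": f"{action!r} is not a safe output"}
    # A real system would enqueue this for an auditable API call,
    # typically still subject to human review.
    return {"status": "queued", "action": action, "payload": payload}

print(apply_agent_output("delete-branch", {})["status"])                # rejected
print(apply_agent_output("add-labels", {"labels": ["bug"]})["status"])  # queued
```

The design choice is that the model never holds write credentials at all; only the fixed mapping layer does, which bounds the blast radius of prompt injection.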
AWS Adds Proxy and Persistent Profile Support to AgentCore Browser
Cloud ServiceAI AgentEnterprise Deployment
AWS enhanced Amazon Bedrock AgentCore Browser with enterprise-grade browsing capabilities: proxy routing support ensures stable egress IPs and satisfies corporate network compliance; persistent browser profiles retain cookies and local storage across sessions, reducing repeated logins; and Chrome extensions hosted in S3 can be loaded to customize page-processing logic. AWS also introduced a tiered routing priority (bypass mode, domain rules, default proxy) for granular control over traffic routing and data boundaries, improving agent usability and security in real-world web workflows.
Sophos: OpenClaw Exposes Over 30K Instances, Agent Security Concerns Rise
AI SecurityAI Agent
Sophos warned of enterprise security risks around OpenClaw (also known as Moltbot/Clawdbot): researchers found over 30,000 OpenClaw instances exposed online, and attackers are already discussing repurposing its 'skills' for botnets. Threats include malicious skills or prompt injection compromising the local host, and agents shuttling sensitive data between trusted and untrusted systems, creating data-leakage chains. The report recommends that enterprises prohibit direct use or restrict execution to sandboxes containing no sensitive data, while establishing approved skill marketplaces, dedicated LLM access layers, and session isolation to mitigate the risks of combining executable tools, external connectivity, and untrusted content.
Pete Warden Open Sources Streaming ASR, 245M Parameters with 6.65% WER
SpeechOpen Source
Pete Warden announced in an OpenAI developer community forum post that he has open-sourced a streaming speech-to-text (STT) model and runtime library for real-time speech recognition. Its largest model has roughly 245 million parameters and achieves a 6.65% word error rate (WER) on the Hugging Face OpenASR leaderboard, beating the 7.44% WER of Whisper Large v3 despite the latter's roughly 1.5 billion parameters. Because of forum posting limits he did not attach the repository link directly; resources are collected in the pinned blog post on his personal site. The release sparked community discussion of lightweight, low-latency ASR for real-time agents and voice interaction.
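For readers comparing the WER figures: WER is the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. A self-contained sketch with toy strings, unrelated to the leaderboard's actual scoring harness:

```python
# Word error rate: word-level Levenshtein distance over the reference length.

def wer(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    d = list(range(len(hyp) + 1))          # DP row: distances vs. empty ref
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i               # prev holds the diagonal cell
        for j, h in enumerate(hyp, 1):
            cur = min(d[j] + 1,            # deletion (ref word dropped)
                      d[j - 1] + 1,        # insertion (extra hyp word)
                      prev + (r != h))     # substitution, or free match
            prev, d[j] = d[j], cur
    return d[-1] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the") in 6 words:
print(f"{wer('the cat sat on the mat', 'the cat sit on mat'):.3f}")  # 0.333
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why it is an error rate rather than an accuracy.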
Musk Announces xAI Restructuring, Grok Line Reorganization and Layoffs
Company NewsOrganizational Change
According to Business Insider Japan, Musk announced organizational restructuring at xAI during an all-hands meeting on the evening of February 10: establishing new leadership structures for Grok, Grok Voice, Grok Code, and Grok Imagine, and adjusting management of the white-collar automation project 'Macrohard'. Following reports of multiple co-founders departing, additional staff exits and headcount reductions occurred. Internal sources indicated Musk expressed dissatisfaction with progress on certain projects and pushed for team downsizing. This move comes about a week after SpaceX's acquisition of xAI, with reports suggesting the integrated company plans to pursue an IPO within 2026.