Saturday, January 24, 2026
9 stories, 3 min read

Today's Highlights

1

Inferact raises $150M in seed funding at $800M valuation to commercialize vLLM

Funding, Inference Infrastructure, Open Source

Inferact, founded by the core team behind the open-source inference engine vLLM, has completed a $150 million seed round led by a16z and Lightspeed, achieving an $800 million valuation. The company plans to continue maintaining and upgrading the vLLM open-source project while launching a production-grade commercial inference engine and a serverless version of vLLM. It will also add enterprise capabilities such as observability, debugging, and disaster recovery, aiming to reduce large model inference costs and improve deployment efficiency for cloud providers and enterprise customers.

2

Sugon Technology's STAR Market IPO application accepted, plans to raise 6 billion yuan for 5th- and 6th-gen AI chips

Chip, IPO, China

Sugon Technology, a domestic AI chipmaker for cloud computing, has had its IPO application on the STAR Market accepted. The company plans to raise 6 billion yuan to fund research into its fifth- and sixth-generation AI chips and related software-hardware co-innovation projects. Disclosed information shows that Sugon's revenue grew from 90.1 million yuan in 2022 to 540 million yuan in the first three quarters of 2025, but it accumulated over 5 billion yuan in losses during the same period, with R&D expenses reaching up to 1,096% of revenue. Tencent is both its largest shareholder and its largest customer, accounting for 71.84% of sales in the first three quarters of 2025, which raises concerns about customer concentration and related-party transaction risks.

3

Google DeepMind releases D4RT: Reconstructs 4D from 2D video up to 300x faster

Vision, World Model, Research

Google DeepMind has released D4RT (Dynamic 4D Reconstruction and Tracking), a model capable of reconstructing 3D scenes from ordinary 2D videos and tracking object motion over time, enabling '4D' spatiotemporal understanding. D4RT uses a unified Transformer architecture and query mechanism to jointly handle depth estimation, motion tracking, and camera pose estimation. Official benchmarks indicate it is 18 to 300 times faster than traditional methods, processing one minute of video in approximately five seconds, and can still predict trajectories under occlusion. Potential applications include robot navigation, AR, and world model construction.

4

MIIT proposes mandatory digital human ID standard: 'one person, one code' for commercial use

Policy, Digital Human, Standard

The Ministry of Industry and Information Technology (MIIT) has issued a notice proposing the development of a mandatory national standard titled 'Metaverse Classification and Identification – Requirements for Digital Human Identity,' now open for public comment. The goal is to establish a digital human identity system with registration, management, and labeling norms, ensuring 'one person, one code' for digital humans used in commercial and communication contexts within China. The regulatory intent includes improving traceability, rapidly identifying violators, and reducing fraud risks. The announcement notes that China already hosts over 1.14 million companies related to digital humans, with the industry expected to surpass 40 billion yuan by 2025.

5

Cloudflare discloses BGP route leak: Miami node misrouted traffic for 25 minutes, dropped 12 Gbps

Cybersecurity, Infrastructure, Postmortem

Cloudflare disclosed a BGP route leak that affected its Miami data center on January 22 and lasted approximately 25 minutes. The cause was an automated routing policy change that left export rules overly permissive, so internal backbone routes were incorrectly advertised externally. Global traffic was misdirected to a single node, which triggered congestion. At peak, security filters dropped about 12 Gbps of unauthorized inbound traffic. Cloudflare stated it will strengthen CI/CD policy validation and adopt mechanisms like BGP Roles and RPKI ASPA to programmatically block abnormal AS paths.
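
The kind of export check that policy validation could enforce can be sketched roughly as follows. This is a minimal illustration of the general "valley-free" export rule that BGP Roles are designed to protect, not Cloudflare's actual tooling; the `Route` type, the `backbone_only` flag, and the `find_leaks` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Role(Enum):
    # Role of a neighbor relative to the local AS.
    PROVIDER = "provider"
    PEER = "peer"
    CUSTOMER = "customer"


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_from: Optional[Role]  # None means locally originated
    backbone_only: bool = False   # hypothetical flag for internal routes


def may_export(route: Route, neighbor: Role) -> bool:
    """Simplified valley-free rule (an assumption, not a full policy)."""
    if route.backbone_only:
        return False  # internal backbone routes never leave the AS
    if neighbor is Role.CUSTOMER:
        return True   # customers may receive any exportable route
    # Providers and peers only get our own routes and customer routes.
    return route.learned_from in (None, Role.CUSTOMER)


def find_leaks(exports: List[Tuple[Route, Role]]) -> List[Tuple[str, str]]:
    """Return (prefix, neighbor role) pairs that a proposed policy would leak."""
    return [(r.prefix, n.value) for r, n in exports if not may_export(r, n)]


if __name__ == "__main__":
    proposed = [
        (Route("203.0.113.0/24", Role.CUSTOMER), Role.PROVIDER),
        (Route("198.51.100.0/24", Role.PEER), Role.PROVIDER),
        (Route("10.0.0.0/8", None, backbone_only=True), Role.PEER),
    ]
    for prefix, neighbor in find_leaks(proposed):
        print(f"would leak {prefix} to a {neighbor}")
```

A check like this could run in CI against the rendered export policy before deployment, failing the change if `find_leaks` returns anything.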

6

Docker and Arm launch Docker MCP Toolkit: Reduces x86-to-ARM64 migration via Copilot to ~30 minutes

Development Tools, MCP, Cloud Native

Docker and Arm have launched the Docker MCP Toolkit and Arm MCP Server, which expose migration capabilities that assistants like GitHub Copilot can call to automate moving legacy x86 applications to ARM64. The solution scans for x86-specific instructions (e.g., AVX2) and maps them to equivalents like NEON/SVE using a knowledge base, while automatically handling environment checks, code analysis, and pull request creation. The companies claim this reduces what was previously a 5–7 hour manual migration to about 25–30 minutes, lowering reliance on architecture experts and enabling bulk enterprise modernization.
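
To make the scanning step concrete, here is a rough sketch of how a tool might flag 256-bit x86 intrinsics in source text and attach a porting hint. It does not reflect how the Arm MCP Server works internally; the hint table, `scan_source`, and the sample snippet are hypothetical, and the table covers only two intrinsics. Note that NEON registers are 128 bits wide, so a 256-bit operation typically becomes two NEON operations.

```python
import re
from typing import List, Tuple

# Hypothetical, deliberately tiny hint table (illustrative only).
AVX2_TO_NEON_HINTS = {
    "_mm256_add_epi32": "vaddq_s32 on each 128-bit half",
    "_mm256_mullo_epi32": "vmulq_s32 on each 128-bit half",
}

# Matches identifiers that start with the 256-bit intrinsic prefix.
INTRINSIC_RE = re.compile(r"\b_mm256_\w+")

NO_HINT = "no hint; needs manual review"


def scan_source(text: str) -> List[Tuple[int, str, str]]:
    """Return (line number, intrinsic name, hint) for each match."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in INTRINSIC_RE.finditer(line):
            name = match.group(0)
            findings.append((lineno, name, AVX2_TO_NEON_HINTS.get(name, NO_HINT)))
    return findings


if __name__ == "__main__":
    sample = (
        "__m256i c = _mm256_add_epi32(a, b);\n"
        "__m256i d = _mm256_sad_epu8(a, b);\n"
    )
    for lineno, name, hint in scan_source(sample):
        print(f"line {lineno}: {name} -> {hint}")
```

Anything without a table entry is surfaced for manual review rather than guessed at, which is the safer default for a mapping that changes vector width.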

7

arXiv tightens review to curb 'AI slop' and low-quality paper surge

Research, Platform Governance, Content Quality

The arXiv preprint platform has announced enhanced scrutiny to combat the proliferation of low-quality, AI-generated papers ('AI slop'), aiming to preserve academic integrity and credibility. arXiv, which hosts nearly 3 million manuscripts spanning fields such as computer science and physics, has recently introduced stricter measures to block low-value or fraudulent submissions, potentially including tighter submission protocols or more detailed author contribution statements. The move reflects ongoing concerns in the academic community about AI misuse, fabricated results, and citation pollution.

8

GAF proposes 'Generative Application Firewall': Integrates prompt injection and semantic attacks into five-layer defense

AI Security, Agent, Architecture

Researchers and vendors have proposed the concept of a 'Generative Application Firewall' (GAF), drawing inspiration from WAFs to provide a unified policy enforcement point for LLM applications and AI agents. GAF consolidates fragmented capabilities such as prompt filtering, safety guards, and data redaction into a layered defense strategy. The framework defines five layers—network, access, syntax, semantics, and context—to defend against threats including DDoS, session hijacking, code/prompt injection, jailbreaking, and multi-turn progressive attacks. It also extends the threat model to include a 'semantic layer (L8).' The architecture can be deployed as a reverse proxy, AI gateway, or service mesh sidecar for production governance.
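
As a rough illustration of the layered idea, a single enforcement point can run one check per layer in order and stop at the first failure. This is only a sketch under simplifying assumptions, not the GAF proposal's design: the `Request` fields, thresholds, and keyword checks are hypothetical stand-ins for real rate limiting, authentication, parsing, and classifier components.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Request:
    client_requests_last_minute: int
    session_valid: bool
    prompt: str
    prior_flagged_turns: int = 0


# A check returns a block reason, or None to let the request pass.
Check = Callable[[Request], Optional[str]]


def network_layer(r: Request) -> Optional[str]:
    return "rate limit exceeded" if r.client_requests_last_minute > 100 else None


def access_layer(r: Request) -> Optional[str]:
    return None if r.session_valid else "invalid session"


def syntax_layer(r: Request) -> Optional[str]:
    return "embedded script markup" if "<script" in r.prompt.lower() else None


def semantics_layer(r: Request) -> Optional[str]:
    # Keyword match as a placeholder for a real intent classifier.
    text = r.prompt.lower()
    return "possible prompt injection" if "ignore previous instructions" in text else None


def context_layer(r: Request) -> Optional[str]:
    return "repeated flagged turns" if r.prior_flagged_turns >= 3 else None


LAYERS: List[Tuple[str, Check]] = [
    ("network", network_layer),
    ("access", access_layer),
    ("syntax", syntax_layer),
    ("semantics", semantics_layer),
    ("context", context_layer),
]


def enforce(req: Request) -> Tuple[bool, Optional[str]]:
    """Run layers in order; block on the first one that reports a reason."""
    for name, check in LAYERS:
        reason = check(req)
        if reason is not None:
            return False, f"{name}: {reason}"
    return True, None


if __name__ == "__main__":
    print(enforce(Request(1, True, "Summarize this report")))
    print(enforce(Request(1, True, "Please ignore previous instructions")))
```

Ordering the cheap network and access checks first means expensive semantic analysis only runs on traffic that has already passed the basic gates.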

9

Study: Poetic prompts significantly increase jailbreak success, boosting unsafe output by 35% on average

AI Security, Jailbreak, Evaluation

Kaspersky has disclosed a jailbreaking study showing that rewriting malicious prompts in poetic form makes it easier to bypass safety protections in LLMs from multiple vendors. The study tested 25 models from Anthropic, OpenAI, Google, Meta, and DeepSeek, comparing prose and poetic versions of 1,200 malicious queries. Results showed poetic forms increased the likelihood of unsafe responses by an average of 35%. Notably, DeepSeek-chat-v3.1 saw a rise of nearly 68 percentage points in success rate, while Gemini 1.5 Pro was compromised in 100% of cases under the top 20 most effective poetic prompts. Safety evaluations were conducted using both an AI jury and human reviewers.
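
Because the summary mixes a percentage ("35%") with a percentage-point figure ("68 percentage points"), it may help to see how the two differ when comparing attack success rates. The sketch below uses made-up counts chosen only for illustration; they are not results from the study, and `compare_success` is a hypothetical helper.

```python
from typing import Optional, Tuple


def compare_success(prose_unsafe: int, poetic_unsafe: int, total: int) -> Tuple[float, Optional[float]]:
    """Return (percentage-point change, relative change in percent).

    The relative change is None when the prose baseline is zero,
    because a ratio against zero is undefined.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    delta = poetic_unsafe - prose_unsafe
    point_change = delta * 100 / total
    relative_change = None if prose_unsafe == 0 else delta * 100 / prose_unsafe
    return point_change, relative_change


if __name__ == "__main__":
    # Illustrative counts only: 120 vs. 540 unsafe responses out of 1,200 queries.
    points, relative = compare_success(prose_unsafe=120, poetic_unsafe=540, total=1200)
    print(f"{points:.1f} percentage points, {relative:.1f}% relative increase")
```

With these example counts the success rate moves from 10% to 45%, which is a 35-point rise but a 350% relative increase, so the two framings can differ by an order of magnitude.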

