On March 2, XPeng Motors launched its second-generation VLA (Vision-Language-Action) model, an end-to-end embodied AI large model with 22B parameters. The company claims a 32x improvement in inference efficiency and a decision latency of approximately 80ms, with the model running in real time on vehicles via XPeng's self-developed Turing chips. The system is trained on about 50PB of data and consumes 58.8 trillion tokens daily; its world model generates over 500,000 simulation scenarios, equivalent to around 30 million kilometers of testing per day. XPeng reports that average disengagement distance improved 25x and safe disengagement distance improved 50x. The system will be rolled out in batches starting in late March to models such as the P7 Ultra, with plans to merge the VLA with the cabin VLM by year-end and to launch robotaxi operations.
Tongyi Releases Fun-CosyVoice3.5: Misreading Rate Drops to 5.3%
Speech Generation
Alibaba's Tongyi Lab speech team has released Fun-CosyVoice3.5 and Fun-AudioGen-VD, both of which accept natural-language FreeStyle instructions for style and scene control. CosyVoice3.5 targets Instruct-TTS and adds support for further languages including Thai, Indonesian, Portuguese, and Vietnamese; it is reported to lead on WER and SpkSim metrics across 13 languages. After optimization on rare characters and complex sentences, the misreading rate dropped from 15.2% to 5.3%. With reinforcement learning and a halved tokenizer frame rate, first-response latency fell by 35%, improving real-time interaction. AudioGen-VD can generate integrated auditory scenes that combine 'speaker voice, emotion, and environmental sounds' from text descriptions.
Alibaba Open-Sources Qwen3.5 Small Models: 0.8B–9B, 262K Context
Open Source Model, Edge AI
The Alibaba Qwen team has open-sourced the Qwen3.5 small model series (0.8B/2B/4B/9B), which uses hybrid architectures such as Gated DeltaNet and natively supports text, image, and video inputs. According to official figures, the 9B model outperforms the much larger open-source OpenAI gpt-oss-120B on multiple benchmarks. The models natively support a 262K context length, extendable to approximately 1 million tokens via YaRN, and support 201 languages and multi-token prediction. On the inference side, the 9B model requires about 5GB of VRAM under 4-bit quantization, enabling local execution on consumer-grade GPUs or laptops. Weights are released under the Apache 2.0 license, making them suitable for self-hosted deployment.
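The roughly 5GB figure is consistent with a back-of-envelope estimate of weight memory alone. The sketch below is not from the Qwen release; it assumes every parameter is stored at the stated bit width and ignores quantization scales, the KV cache, and activations, which would account for the remainder.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes.

    Assumption: all parameters use the same bit width; real 4-bit
    formats also store per-group scales, so actual usage is higher.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Parameter counts taken from the model names in the announcement.
SIZES = {"0.8B": 0.8e9, "2B": 2e9, "4B": 4e9, "9B": 9e9}

for name, n_params in SIZES.items():
    fp16 = weight_memory_gb(n_params, 16)
    int4 = weight_memory_gb(n_params, 4)
    print(f"{name}: {fp16:.1f} GB at 16-bit, {int4:.1f} GB at 4-bit (weights only)")
```

For the 9B model this gives 4.5 GB of weights at 4 bits per parameter, leaving about half a gigabyte of the reported budget for everything else.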
Xiaomi Humanoid Robot Deploys in EV Factory: 3-Hour Autonomy, 90.2% Success Rate
Humanoid Robot, Embodied Intelligence
Xiaomi revealed that its humanoid robot has entered an electric vehicle factory to perform self-tapping nut assembly. The robot operated autonomously for 3 hours without interruption at a single station, achieved a 90.2% success rate in synchronized dual-arm installation, and met the production line's fastest takt time of 76 seconds. The solution uses an end-to-end, data-driven control approach that combines Xiaomi's self-developed 4.7B-parameter VLA large model, Xiaomi-Robotics-0, with reinforcement learning, and fuses head and wrist vision, fingertip tactile sensing, and proprioception to improve alignment accuracy and resistance to magnetic interference. Whole-body motion control employs a hybrid architecture of optimal control and RL, balancing safety constraints against real-time responsiveness. Xiaomi stated that it is validating additional stations, such as sorting and front badge installation, as it advances toward large-scale production deployment.
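To put the reported figures in proportion, the arithmetic below estimates how many installation cycles fit into the 3-hour run. It assumes one installation attempt per 76-second takt cycle, which the announcement does not state explicitly, so the result is an illustration rather than a reported number.

```python
TAKT_SECONDS = 76      # fastest takt time, from the announcement
RUN_HOURS = 3          # continuous autonomous operation, from the announcement
SUCCESS_RATE = 0.902   # dual-arm installation success rate, from the announcement

# Assumption: exactly one installation attempt per takt cycle.
cycles = RUN_HOURS * 3600 // TAKT_SECONDS
expected_successes = cycles * SUCCESS_RATE
expected_failures = cycles - expected_successes

print(f"cycles in {RUN_HOURS} h: {cycles}")
print(f"expected successful installs: {expected_successes:.0f}")
print(f"expected attempts needing intervention: {expected_failures:.0f}")
```

Under that assumption, a 3-hour run covers 142 cycles, of which roughly 14 would not succeed on the first attempt.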
ServiceNow Acquires Traceloop: Deal Valued at $60M–$80M
Acquisition, AI Governance
ServiceNow has acquired Israeli AI observability startup Traceloop in a deal reportedly valued between $60 million and $80 million. Founded about two and a half years ago, Traceloop's technology is based on the open-source project OpenLLMetry, providing automated evaluation, tracing, and monitoring for LLM applications and AI agents, helping identify performance degradation, prompt failures, and toolchain issues in production environments. Clients include IBM, Miro, and Dynatrace. ServiceNow plans to integrate these capabilities into its AI Control Tower, strengthening enterprise governance, auditing, and reliable operations for models and agents, reflecting that 'observability + compliance' is becoming foundational infrastructure for agent deployment.
Xiaohongshu Open-Sources FireRed-OCR: 2B Model Scores 92.94% on OmniDocBench
OCR, Open Source
Xiaohongshu's tech team has open-sourced FireRed-OCR, a 2B-parameter document parsing model specialized to address the 'structural hallucination' in tables, formulas, and reading order that is common in general-purpose VLMs. The project adopts a three-stage training pipeline: multi-task pre-alignment to establish spatial awareness, domain-specific SFT to unify logical expressions, and then GRPO reinforcement learning with format constraints to enforce output structure. According to the team, the model achieved a comprehensive score of 92.94% on the OmniDocBench v1.5 leaderboard, suggesting that end-to-end parsing at a small parameter scale can surpass much larger general-purpose models. The model and methodology have been published on REDtech and integrated into the ModelScope ecosystem, which should make it easier for enterprises to build document understanding, information extraction, and RPA pipelines.
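The third stage can be illustrated with a minimal sketch of the group-relative advantage that GRPO computes, paired with a format-based reward. The `format_reward` check here is hypothetical (balanced `<table>` tags and `$` delimiters) and is not FireRed-OCR's actual reward; only the normalization of rewards within a sampled group reflects the general GRPO recipe.

```python
import statistics


def format_reward(output: str) -> float:
    """Hypothetical format constraint: 1.0 if table tags and math
    delimiters are balanced, else 0.0. Not the project's real reward."""
    tables_ok = output.count("<table>") == output.count("</table>")
    math_ok = output.count("$") % 2 == 0
    return 1.0 if tables_ok and math_ok else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: each sample's reward, normalized by the
    mean and standard deviation of its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# A group of candidate parses sampled for the same page (made-up strings).
samples = [
    "<table><tr><td>1</td></tr></table>",  # well-formed table
    "<table><tr><td>1</td></tr>",          # unclosed table
    "Area is $x^2$ for side x",            # balanced math
    "Area is $x^2 for side x",             # unbalanced math
]

rewards = [format_reward(s) for s in samples]
advantages = group_relative_advantages(rewards)

for s, r, a in zip(samples, rewards, advantages):
    print(f"reward={r:.0f} advantage={a:+.2f}  {s}")
```

Structurally valid outputs receive a positive advantage and malformed ones a negative advantage, so the policy update pushes probability toward well-formed parses without needing a separate value model.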
Galbot Secures 2.5 Billion Yuan Funding: Big Fund III's First Embodied Intelligence Investment
Funding, Embodied Intelligence
Embodied intelligence company Galbot announced completion of a 2.5 billion yuan funding round, with multiple reports indicating China's National IC Fund Phase III has entered the embodied intelligence sector for the first time. The company focuses on the end-to-end embodied large model 'AstraBrain' and synthetic data system 'AstraSynth', generating vast interaction data through simulation and aligning with minimal real-world data to accelerate closed-loop training from perception to control. Its robots have been deployed in industrial and service scenarios, including CATL production lines and smart pharmacies. Funds will support model training, hardware iteration, and large-scale scenario deployment; the involvement of state-backed capital signals that embodied models are entering a phase of capital-intensive, long-cycle industrial competition.
Huawei & CityU Use LLM + Evolutionary Computing: Break 98 Records in CVRP
Algorithm Research, LLM Application
A joint team from Huawei and City University of Hong Kong won the top international CVRP (Capacitated Vehicle Routing Problem) competition using an 'LLM + evolutionary computing' EoH framework, setting new records in 98 instances. The method enables large models to generate heuristic algorithm ideas and code snippets, which then undergo evolutionary selection, execution evaluation, and iterative mutation in a closed loop, reducing reliance on manual expertise and hand-tuning; humans primarily provide problem templates and evaluation metrics. Results show AI can now rewrite algorithmic mechanisms in combinatorial optimization, not just assist coding, potentially reshaping R&D processes in logistics scheduling, warehouse distribution, and production planning software.
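The closed loop described above (propose, execute, evaluate, select, mutate) can be sketched on a toy CVRP instance. This is not the EoH framework or the team's code: `propose_variant` is a hypothetical stand-in for the LLM call and merely perturbs two numeric weights, whereas the real method has the model write new heuristic ideas and code. The instance generator and greedy scorer are likewise illustrative assumptions.

```python
import math
import random


def make_instance(n_customers=12, capacity=10, seed=0):
    """Random toy instance: depot, [(location, demand)], vehicle capacity."""
    rng = random.Random(seed)
    customers = [((rng.random(), rng.random()), rng.randint(1, 4))
                 for _ in range(n_customers)]
    return (0.5, 0.5), customers, capacity


def route_cost(weights, instance):
    """Execute a heuristic: greedy construction that picks the feasible
    customer minimizing w_dist * distance - w_demand * demand.
    Returns the total route length (lower is better)."""
    depot, customers, capacity = instance
    w_dist, w_demand = weights
    unvisited = set(range(len(customers)))
    pos, load, total = depot, 0, 0.0
    while unvisited:
        feasible = [i for i in unvisited if load + customers[i][1] <= capacity]
        if not feasible:  # vehicle full: return to depot and start a new route
            total += math.dist(pos, depot)
            pos, load = depot, 0
            continue
        nxt = min(feasible, key=lambda i: w_dist * math.dist(pos, customers[i][0])
                  - w_demand * customers[i][1])
        total += math.dist(pos, customers[nxt][0])
        pos, load = customers[nxt][0], load + customers[nxt][1]
        unvisited.remove(nxt)
    return total + math.dist(pos, depot)


def propose_variant(parent, rng):
    """Hypothetical stand-in for the LLM: perturb the parent heuristic."""
    return tuple(w + rng.gauss(0, 0.3) for w in parent)


def evolve(instance, generations=30, pop_size=6, seed=1):
    rng = random.Random(seed)
    population = [(1.0, 0.0)]  # start from plain nearest-neighbor
    for _ in range(generations):
        children = [propose_variant(rng.choice(population), rng)
                    for _ in range(pop_size)]
        # Evaluate every candidate and keep the best pop_size (elitist selection).
        population = sorted(set(population + children),
                            key=lambda w: route_cost(w, instance))[:pop_size]
    return population[0]


instance = make_instance()
baseline = route_cost((1.0, 0.0), instance)
best = evolve(instance)
print(f"nearest-neighbor cost: {baseline:.3f}")
print(f"evolved heuristic {best} cost: {route_cost(best, instance):.3f}")
```

Because selection is elitist, the evolved heuristic is never worse than the starting one on the evaluation instance; the human contribution is limited to the problem template (`make_instance`) and the metric (`route_cost`), matching the division of labor described in the summary.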
Databricks has introduced Real-Time Mode (RTM) for Spark Structured Streaming, achieving sub-second end-to-end latency for common streaming feature engineering workloads without introducing a second engine like Flink. Key changes include shifting from micro-batching to continuous data flow, adopting pipelined parallel scheduling to avoid stage blocking, and using streaming shuffle to bypass traditional disk bottlenecks. Official benchmarks show up to 92% faster performance than Apache Flink on certain stateless transformations, joins, and aggregations. RTM aims to enable enterprises to use a single Spark API for both ETL and real-time applications, reducing operational and governance overhead.