OpenAI Partners with Broadcom to Launch First In-House Inference Chip Jalapeño, Completed Design and Tape-Out in 9 Months
AI ChipOpenAIInference Acceleration
OpenAI and Broadcom jointly released Jalapeño, their first custom chip designed specifically for large language model inference. Utilizing a die-only design, it achieved tape-out just nine months after architecture definition—setting a record for the fastest development cycle of a high-performance ASIC—with parts of the design accelerated by OpenAI's own models. The chip focuses on minimizing data movement and balancing resources to approach theoretical peak performance, delivering significantly better performance per watt than current industry standards while remaining compatible with all major LLMs across the industry. Responsibilities are clearly divided: OpenAI leads on architecture and core logic, Broadcom provides silicon implementation and Tomahawk networking chips, and Celestica handles system integration. Deployment is scheduled to begin by the end of 2026 in gigawatt-scale data centers, with Microsoft among the partners as part of its multi-generation compute platform strategy.
Google Integrates Computer Use Capability into Gemini 3.5 Flash for Cross-Browser and Desktop Task Execution
GeminiAI AgentEnterprise Automation
Google announced that 「computer use」 is now an integrated feature in Gemini 3.5 Flash, replacing the previous standalone model. Developers can directly access this capability via the Gemini API and enterprise-grade Agent platform to build agents capable of perceiving, reasoning, and acting across browsers, mobile devices, and desktop environments—ideal for long-running enterprise automation tasks such as continuous software testing. For security, targeted adversarial training mitigates prompt injection risks, with optional enterprise protections: requiring user confirmation for irreversible actions and automatically halting tasks upon detection of indirect prompt injection. Google recommends a defense-in-depth approach and provides a demo environment and reference implementations for rapid onboarding.
ByteDance Releases Seedance 2.5 Video Model, Generating 30-Second 4K Videos from Single Prompt
Video GenerationByteDance
ByteDance has launched Seedance 2.5, its next-generation AI video generation model capable of producing 30-second, 4K-resolution videos from a single text prompt. The model supports up to 50 reference inputs—including images, videos, or audio—to enhance controllability during generation. It is scheduled for release in China next month. This launch continues the trend in the video generation race toward longer duration, higher clarity, and improved control.
4
OpenAI Updates GPT-5.5 Instant with Enhanced Intent Understanding and Complex Constraint Handling
GPT-5.5OpenAIModel Update
OpenAI has introduced an updated version of GPT-5.5 Instant, featuring improvements in intent understanding, handling of complex constraints, and recommendation coherence. The update was rolled out to paying users on the day of announcement and became available to free users the following day. Focused on real-world usability in conversational scenarios, the model demonstrates greater stability when processing multiple constraints simultaneously.
NVIDIA NeMo AutoModel Boosts MoE Fine-Tuning Throughput by 3.4–3.7x with One-Line Code Change
Model Fine-TuningNVIDIAMoE
NVIDIA released NeMo AutoModel based on Transformers v5, achieving 3.4–3.7x higher throughput and 29–32% lower GPU memory usage for MoE model fine-tuning—all with only a one-line import change and no other code modifications. The acceleration stems from three innovations: Expert Parallelism shards expert weights across multiple GPUs to reduce memory pressure, DeepEP fuses communication and computation, and TransformerEngine kernels accelerate core operators. On a 128-GPU setup, full fine-tuning of the 550B-parameter Nemotron 3 Ultra model becomes feasible, whereas native v5 fails due to insufficient memory. The library outputs standard HuggingFace checkpoints, ensuring compatibility with downstream tools like vLLM and SGLang.
Indian Manufacturer Tata Electronics Hit by Ransomware Attack, Over 200K Documents Including Tesla and Apple Specs Leaked
Data BreachCybersecurity
Indian electronics manufacturer Tata Electronics suffered a cybersecurity breach, with ransomware group The World Leaks publishing over 200,000 alleged internal documents containing product specifications, technical details, employee emails, and personal information related to Tesla and Apple. Meanwhile, FFmpeg patched a critical vulnerability named Pixelsmash that could enable remote code execution or denial-of-service attacks; researchers identified a 「role confusion」 issue in large language models that increases prompt injection success rates to 61%, reducible to 10% through 「de-stylization」 techniques. The Five Eyes alliance warned that frontier AI models with significant cyberattack capabilities may emerge within months.
7
Databricks Co-Founders Advocate for Open Frontier Ecosystem, Launch Open Meta-Framework Omnigent
AI AgentOpen SourceEnterprise AI
Databricks co-founders Matei Zaharia and Reynold Xin outlined their vision for an open agent ecosystem. Their open-source meta-framework, Omnigent, offers a unified API across different agent systems, covering conversations, files, tools, and collaboration, and can be layered atop platforms like Claude Code, Codex, and Cursor to solve issues around portability, conversation history, security, and cost control. They also proposed LTAP (Lakehouse Transactional Analytics Processing), which writes transactional data into columnar storage while supporting both real-time operational queries and analytics—providing AI agents with live data access. Their central thesis: as frontier model performance becomes increasingly commoditized, enterprise-specific data, controlled access, operational state, and workflows—the 「context」—will form lasting competitive moats, with open formats (Delta Lake, Parquet) ensuring data portability as a strategic advantage.
WAIC 2026 Future Tech Expo Selects 175 Early-Stage AI Projects from 1,200 Applications, Secures 268 Million Yuan in意向Orders
WAICVenture CapitalIndustry Trend
The WAIC 2026 Future Technology Expo selected 175 early-stage AI projects from 1,200 applications across four tracks, with industry applications and embodied intelligence emerging as the most competitive categories—reflecting AI’s accelerating shift from research to deployment. Representative startups showcased cutting-edge directions including quantum computing, streaming video generation, and neural EMG data acquisition. The OPC Independent Pioneer Challenge emphasized originality, openness, and individuality, with eight teams rising from over 600 entries across verticals like gaming, education, and finance. The event invited more than 200 investors and reached over 1,200 potential clients, generating 268 million yuan (~$37M USD) in preliminary orders, establishing an efficient channel connecting capital, use cases, and innovation.
Chinese 3D Generation Company Yingmu Tech Raises Hundreds of Millions in Funding, Launches Hyper3D Rodin Gen-2.5
3D GenerationAI FundingEmbodied Intelligence
Yingmu Tech has secured hundreds of millions in new funding and unveiled Hyper3D Rodin Gen-2.5, a native 3D generation model featuring an adaptive mechanism akin to 「depth of thought」 that treats representation length as a scalable variable, dynamically allocating compute based on object complexity. It enables five adjustable levels of generation speed and quality, supports meshes with millions of polygons, and outputs 12K native textures—meeting industrial usability standards. The model addresses controllability challenges in generative AI through four technologies: local editing, recursive component separation, 3D ControlNet, and full DCC plugin coverage. Commercially, the company reports near-100% B2B renewal rates, ARR in the tens of millions of USD, serving major clients including Lowe's, Unity, and NVIDIA, with 80% of revenue coming from overseas markets. Half of its 60-person team have won awards at SIGGRAPH, and 70% of its research outputs are successfully commercialized.
Cursor Launches Notion Integration and GLM 5.2 Model, Expanding Task Delegation and Coding Capabilities
CursorAI ProgrammingGLM
AI coding tool Cursor has added Notion integration, allowing users to delegate tasks directly from Notion to cloud-based agents built on the same SDK as the editor. Additionally, the GLM 5.2 model is now available within Cursor, along with benchmark results. These updates further expand Cursor’s capabilities in cross-platform task delegation and multi-model support, enabling developers to leverage AI coding assistance across broader workflows.