Thursday, December 18, 2025
10 stories, 3 min read

Today's Highlights

1

OpenAI Releases GPT-Image-1.5, Boosting Image Generation Speed by 4x and Significantly Upgrading Editing Capabilities

OpenAI, AIGC, Image Generation

OpenAI has officially launched GPT-Image-1.5, significantly enhancing ChatGPT's image generation and editing capabilities. Image generation is up to 4 times faster than with the previous model, with major improvements in text rendering and multi-step editing. The model supports more complex image synthesis and better detail retention, has reclaimed the top spot in mainstream benchmarks, and competes directly with rivals like Google Nano Banana Pro.

2

Google Gemini 3 Flash Launches Globally, Boosts Reasoning Speed by 3x and Significantly Reduces Costs

Google, Large Language Model, Multimodal

Google has released the Gemini 3 Flash model, combining Pro-level reasoning capability with the high efficiency and low latency of the Flash series. Its reasoning speed is 3 times that of the 2.5 Pro model, at one-fourth the cost. It supports multimodal inputs (text, image, audio, video) and leads in benchmarks like MMMU Pro. It has become the default model for the Gemini App and Google Search AI mode, and is available free to users worldwide.

3

Google Gemini 3 Flash Performs on Par With or Exceeds GPT-5.2 in Multiple Tests, AI Model Competition Intensifies

Google, OpenAI, LLM Comparison

Gemini 3 Flash scored within 1 percentage point of OpenAI's GPT-5.2 in high-difficulty reasoning tests like Humanity’s Last Exam, and scored 81.2% on the MMMU Pro multimodal understanding test, ahead of GPT-5.2's 79.5%. The results indicate intensifying competition between Google and OpenAI on flagship model performance.

4

Meta Releases SAM Audio, Supporting Multimodal Audio Source Separation via Text, Vision, or Time Segments

Meta, Multimodal, Audio AI

Meta has introduced the SAM Audio model, which for the first time separates a target sound source from mixed audio using prompts based on text, vision, or time segments. It suits scenarios like background noise removal and instrument extraction, and its range of input methods makes audio editing more flexible. The model and evaluation tools have been open-sourced.

5

MIT-IBM Proposes New PaTH Attention Architecture, Enhancing LLM State Tracking and Reasoning Capabilities for Long Text

Large Language Model, Architectural Innovation, Academic Frontier

The MIT-IBM Watson AI Lab has proposed the PaTH Attention positional encoding method, significantly improving Transformer models' state tracking and sequential reasoning abilities on long texts, outperforming mainstream RoPE methods. Its effectiveness has been validated in multiple reasoning and long-context tasks, advancing large model applications in structured domains.

6

Google Labs Launches Gmail AI Assistant CC, Automatically Summarizes Schedules & Tasks, Supports Email Interaction

Google, AI Assistant, Productivity Tool

Google Labs has launched the experimental AI assistant CC. Built on the Gemini model, it automatically integrates information from Gmail, Calendar, Drive, etc., and pushes a daily "Your Day Ahead" email summary. Users can interact via email to add tasks, query information, or customize preferences. It's available to AI Pro/Ultra subscribers in the US/Canada.

7

Multi-Agent AI System Research: Google and MIT Experiments Show 'More is Better' Does Not Hold

Multi-Agent, AI System, Academic Research

A joint study by Google and MIT shows that across 180 experiments, multi-agent collaboration can improve efficiency in tasks like financial analysis, but in tasks requiring sequential reasoning (e.g., Minecraft), it can cause a 70% performance drop. Multi-agent systems also tend to use up token budgets faster, suggesting businesses should adopt multi-agent orchestration cautiously.

8

GitHub Actions to Introduce $0.002 per Minute Platform Fee from 2026, Adjusting Cloud CI/CD Cost Structure

GitHub, DevOps, Cloud Services

GitHub has announced that starting March 2026, all Actions workflows (including self-hosted runners) will incur a $0.002 per minute cloud platform fee. Concurrently, prices for GitHub-hosted runners will be reduced by up to 39%. This move will impact CI/CD cost structures for open-source and enterprise users, driving the commercial transformation of cloud automation tools.
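As a rough illustration of what the per-minute fee means in practice, the sketch below multiplies workflow minutes by the quoted $0.002 rate. The function name, the usage figures, and the assumption of a flat charge with no free allowance or rounding rules are all hypothetical; actual billing may work differently.

```python
from decimal import Decimal

# Per-minute platform fee quoted in the summary above.
PLATFORM_FEE_PER_MINUTE = Decimal("0.002")


def platform_fee(minutes: int) -> Decimal:
    """Return the platform fee in USD for a number of workflow minutes.

    Assumption: a flat per-minute charge with no free allowance and no
    rounding rules. Real billing terms may differ.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return PLATFORM_FEE_PER_MINUTE * minutes


if __name__ == "__main__":
    # Hypothetical monthly usage figures, chosen only for illustration.
    for minutes in (1_000, 10_000, 100_000):
        print(f"{minutes:>7} min -> ${platform_fee(minutes):.2f}")
```

Under these assumptions, 10,000 workflow minutes in a month would add $20 in platform fees, before any change in hosted-runner prices.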

9

PromptPwnd: New Supply Chain Injection Vulnerability Emerges in AI Agent-Powered GitHub Actions

AI Security, DevOps, Supply Chain Attack

Security researchers have uncovered the PromptPwnd vulnerability in GitHub Actions and GitLab CI/CD pipelines that use AI agents like Gemini CLI and Claude Code. Attackers can inject malicious content via issues, PRs, etc., to trick AI agents into leaking sensitive information such as GITHUB_TOKEN. Official remediation advice has been released.

10

exe.dev Launches Efficient VM Hosting Service for AI Agents, Supporting Large-Scale, Low-Latency Deployment

AI Infrastructure, Cloud Computing, DevOps

exe.dev has launched a public beta of its VM hosting platform, supporting rapid, large-scale spin-up of Ubuntu virtual machines via an SSH API. It emphasizes privacy, persistence, low latency, and no per-VM marginal cost, making it suitable for batch-running AI agents and automation tools, simplifying AI infrastructure deployment.

