AI Daily Brief

Saturday, June 13, 2026

10 stories · 3 min read

Today's Highlights

MiniMax Releases Open-Source M3 Model with 428 Billion Parameters and 1 Million Token Context

Open Source Model · MiniMax

Chinese AI company MiniMax launched the open-source Mixture-of-Experts model M3 on June 11, with 428 billion total parameters, of which 23 billion are activated per token. The model natively supports text, image, and video inputs. It uses MSA, MiniMax's proprietary sparse attention technology, to enable a 1 million token context window, with prefill speed improved by 9.7x and decoding speed by 15.6x. It scored 59% on SWE-Bench Pro, 66% on Terminal Bench 2.1, and 74.2% on MCP Atlas. Model weights are freely available on Hugging Face for local deployment. API pricing doubles after 512K tokens. Deployment on NVIDIA Blackwell infrastructure is now supported.
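As a rough illustration of what the M3 figures above imply, the sketch below computes the share of parameters that is activated and a back-of-the-envelope size for the full weights. The 2 bytes per parameter (16-bit weights) is an assumption, not something stated in the release, and the helper names are hypothetical.

```python
# Figures taken from the story above.
TOTAL_PARAMS = 428e9    # total parameters
ACTIVE_PARAMS = 23e9    # parameters activated at a time

# Assumption for illustration only: weights stored at 16 bits (2 bytes) each.
BYTES_PER_PARAM = 2


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that is activated."""
    return active / total


def weight_footprint_gb(total: float, bytes_per_param: float) -> float:
    """Approximate size of all weights in gigabytes (1 GB = 1e9 bytes)."""
    return total * bytes_per_param / 1e9


print(f"{active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")            # 5.4%
print(f"{weight_footprint_gb(TOTAL_PARAMS, BYTES_PER_PARAM):.0f} GB")   # 856 GB
```

Under that assumption, only about one parameter in nineteen does work at a time, but all of the weights still have to be stored for local deployment.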

BAAI Unveils “World’s First” General-Purpose World Model, Open-Sources Native Multimodal Emu3.5

World Model · BAAI

Beijing Academy of Artificial Intelligence (BAAI) unveiled two world models, “Wujie·Physis-v0.1” and “Wujie·RoboBrain Orca-v0”, at the 2026 BAAI Conference. The former adopts a “next physical state prediction” paradigm to unify video, RGB-D, 3D point clouds, and force-tactile feedback, supporting over 50 physical scenarios. BAAI also open-sourced the native multimodal world model Emu3.5, a 34B-parameter model based on a decoder-only Transformer architecture, pre-trained on over 10 trillion tokens, and integrated with a diffusion decoder capable of 2K-resolution reconstruction. Emu3.5 matches or surpasses Gemini-2.5-Flash-Image across multiple benchmarks. Its discrete diffusion adaptation technique accelerates image inference by nearly 20x.

Bezos’ AI Startup Prometheus Raises $12 Billion at $41 Billion Valuation

Funding · Physical AI

Prometheus, a physical AI company co-founded by Jeff Bezos and former Verily CEO Vik Bajaj, announced on June 11 that it has raised $12 billion at a $41 billion valuation. The round was led by Bezos, JPMorgan Chase, Goldman Sachs, and BlackRock, and is the company’s second major raise following a $6.2 billion round last year. The company aims to build an “artificial general engineer”: AI software capable of autonomously designing and manufacturing complex physical systems such as jet engines, drug compounds, and skyscrapers. It has 150 employees across San Francisco, London, and Zurich, and most of the capital will go toward meeting computing demands. The valuation places Prometheus among the highest-valued AI startups in history.

SpaceX Completes IPO at $1.77 Trillion Valuation, Shares Surge 11% on Debut

IPO · SpaceX

SpaceX completed its historic IPO on June 12 at $135 per share, raising $75 billion by issuing about 555.6 million shares ($75 billion divided by $135) at a $1.77 trillion valuation, surpassing Tesla to become the seventh-largest U.S. company by market cap. On its first trading day, shares traded between 11% and 25.5% above the offer price, briefly pushing the company’s market value above $2.21 trillion. Elon Musk, who holds about 40% of the equity but controls 85% of the voting power, saw his net worth approach $1 trillion, putting him close to becoming the world’s first trillionaire. The stock trades on Nasdaq under the ticker SPCX, and the company combines Starlink, xAI, and the X platform. Despite a “sell” rating from CFRA analysts, retail orders exceeded $100 billion, far surpassing allocations and signaling strong investor appetite ahead of anticipated IPOs from Anthropic and OpenAI.

Google Open-Sources Edge Multimodal Gemma 3n, Runs on Just 2GB GPU Memory

Open Source Model · Edge AI

Google released Gemma 3n, an open-source edge multimodal model available in 5B (E2B) and 8B (E4B) versions. Its novel MatFormer architecture enables nested elastic inference, so the two versions run with memory footprints comparable to 2B and 4B models respectively and need as little as 2GB of GPU memory. It scored 1,303 points on LMArena, becoming the first sub-10B model to surpass 1,300. It natively supports text, image, audio, and video input, with a USM-based audio encoder for speech recognition and translation and the new MobileNet-V5-300M visual encoder. Memory usage is significantly reduced via Per-Layer Embeddings (PLE) and KV cache sharing. The model is now accessible through Google AI Studio, Ollama, and Hugging Face.

Moonshot Releases Trillion-Parameter Open-Source Coding Model Kimi-K2.7-Code

Open Source Model · AI Programming

Moonshot AI released the open-source coding model Kimi-K2.7-Code in June, built on a Mixture-of-Experts architecture with 1 trillion total parameters, of which 32 billion are activated per token. Compared to the previous K2.6 version, it reduces inference token usage by 30%, significantly lowering computational cost and latency. It shows substantial gains across multiple benchmarks: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite for multilingual tasks, with notable accuracy improvements in Python, Rust, and Go programming. The model is available via the Kimi platform API and on Hugging Face under a Modified MIT license that allows commercial use. This is the company’s fifth major release within a year, with a focus on agent capabilities and long-context processing.

Claude Independently Solves Knuth’s Graph Theory Conjecture, Discovers “Fiber Decomposition” Solution in 31 Steps

AI Math · Claude

Claude Opus 4.6 solved Knuth’s unsolved graph theory conjecture after 31 exploration steps. The task is to find three arc-disjoint Hamiltonian cycles that together cover all directed edges in an m×m×m toroidal grid. The search space scales as 3^(m³), and the problem was previously solvable only for small m. Claude proposed a method called “fiber decomposition,” partitioning the graph using s = (i + j + k) mod m, and ultimately discovered a universal construction rule based on “bump” patterns derived from the values of s, i, and j. Knuth published a paper on Stanford’s website reacting with “Shock! Shock!” and stating that the result forces a reevaluation of generative AI’s role in mathematical research. This marks the first officially documented case of generative AI participating in core mathematical discovery.
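To make the partition s = (i + j + k) mod m concrete, here is a minimal sketch. It assumes the grid's directed edges go from each vertex (i, j, k) to the three vertices obtained by adding 1 (mod m) to a single coordinate, which is an assumption about the graph rather than a detail given in the story. The function names are hypothetical, and the code only checks the fiber structure, not Claude's cycle construction.

```python
from itertools import product


def fiber(v, m):
    """Fiber index s = (i + j + k) mod m of vertex v = (i, j, k)."""
    i, j, k = v
    return (i + j + k) % m


def successors(v, m):
    """Assumed arcs: increase exactly one coordinate by 1, wrapping mod m."""
    i, j, k = v
    return [((i + 1) % m, j, k), (i, (j + 1) % m, k), (i, j, (k + 1) % m)]


def check_fibers(m):
    """Return the size of each fiber if every arc leads from fiber s to
    fiber (s + 1) mod m; return None if any arc breaks that pattern."""
    sizes = [0] * m
    for v in product(range(m), repeat=3):
        s = fiber(v, m)
        sizes[s] += 1
        for w in successors(v, m):
            if fiber(w, m) != (s + 1) % m:
                return None
    return sizes


print(check_fibers(3))  # [9, 9, 9]
```

Under this assumption each fiber holds m² vertices and every arc advances the fiber index by exactly one, so any Hamiltonian cycle has to pass through the fibers in cyclic order.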

MIIT Issues “AI+ICT” Implementation Guidelines, Targeting 30 High-Value Scenarios by 2028

AI Policy · China

China’s Ministry of Industry and Information Technology (MIIT) released the “Implementation Opinions on Innovative Development of ‘AI+Information and Communication’ (2026–2028).” By 2028, it aims to establish an initial framework for AI and ICT integration, achieve high-level autonomous intelligence in communication networks, create over 30 high-value application scenarios, and reach at least 75% coverage of sub-1 ms latency for metropolitan computing power. By 2030, integrated sensing, computing, and intelligent service capabilities are to be significantly enhanced. The document outlines 17 tasks across four pillars: advancing intelligent upgrades in ICT, strengthening AI foundations, deepening fusion applications, and enhancing industry governance. Key initiatives include promoting 5G-A/6G and AI convergence, accelerating agent development, building 400Gbps/800Gbps backbone networks, and establishing a three-tiered hub-region-edge computing hierarchy.

Zyphra Releases Zamba2-VL Vision-Language Model, Reducing Time-to-First-Token by 10x

Open Source Model · Vision-Language Model

Zyphra released Zamba2-VL, an open-source hybrid Mamba2-Transformer vision-language model available in 1.2B, 2.7B, and 7B parameter sizes. It is built on a LLaVA-style architecture with the Qwen2.5-VL vision encoder and lightweight MLP adapters. Its core innovation is replacing dense Transformer attention with Mamba2 state-space layers, which gives near-linear prefill time and fixed-size recurrent states. This reduces Time-to-First-Token (TTFT) by approximately an order of magnitude, which is particularly beneficial for long-sequence inputs. The model excels in visual counting (PixMoCount) and document understanding (DocVQA), though it lags behind larger models in knowledge-intensive reasoning. Weights and code are publicly available under the Apache 2.0 license.

PixelRAG Vision Retrieval System Outperforms Text RAG by 18% in Accuracy, Cuts Token Cost by 10x

RAG · AI Research

A research team from UC Berkeley, Princeton University, EPFL, and Databricks introduced PixelRAG, a system that bypasses traditional text parsing by rendering web pages directly into screenshots and indexing them for vision-language models to read. In tests involving 30 million screenshot tiles from Wikipedia, PixelRAG outperformed text-based RAG systems across six benchmarks, with accuracy gains of up to 18.1%. The study identifies three main failure modes in HTML parsing: parsing loss (36.6%), ordering loss (55.2%), and reader loss (8.2%). Using Qwen3-VL-Embedding-2B and FAISS for efficient retrieval, PixelRAG reduced token consumption in agent tasks from 37.5 million to 3.6 million, cutting token costs roughly 10x. The researchers recommend adopting hybrid text-vision retrieval strategies in the near term.
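The figures quoted above can be sanity-checked with a few lines of arithmetic. This is only a sketch using the numbers from the story; the variable names are hypothetical.

```python
# Failure-mode shares reported for HTML parsing, in percent.
failure_share = {
    "parsing loss": 36.6,
    "ordering loss": 55.2,
    "reader loss": 8.2,
}

# Token consumption in agent tasks, before and after PixelRAG.
tokens_text_rag = 37.5e6
tokens_pixel_rag = 3.6e6

# The three failure modes should account for all observed failures.
total_share = round(sum(failure_share.values()), 1)

# Ratio behind the "10x" claim.
reduction_factor = tokens_text_rag / tokens_pixel_rag

print(total_share)                 # 100.0
print(f"{reduction_factor:.1f}x")  # 10.4x
```

The three failure modes add up to 100%, and the token reduction works out to about 10.4x, consistent with the rounded 10x figure.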
