Friday, June 19, 2026
10 stories, 3 min read

Today's Highlights

1

OpenAI Collaborates with Hundreds of Doctors Across 60 Countries to Enhance ChatGPT's Health Capabilities; GPT-5.5 Instant Now Freely Available to All Users

Large Model, Healthcare AI

OpenAI announced a collaboration with hundreds of practicing physicians across 60 countries to evaluate and improve the accuracy, safety, and communication quality of ChatGPT's answers to health questions, emphasizing the democratization of scarce specialist medical knowledge. Meanwhile, GPT-5.5 Instant has reached the performance level of OpenAI's frontier reasoning models on health-related queries and is now freely accessible to all users. OpenAI views improving human health as one of the most personal and tangible impacts of AGI. Additionally, its o3 Deep Research initiative, in collaboration with Boston Children's Hospital and Harvard, identified 18 new diagnoses among 376 previously undiagnosed complex cases, including a rare case of myofibrillar myopathy.

2

OpenAI Releases Alignment Research Showing RL-Trained Models Improve on 44 of 53 Independent Evaluations

AI Safety, Large Model

OpenAI published new research on model alignment that applies reinforcement learning (RL) to real dialogues across 12 domains to strengthen beneficial traits such as truthfulness and fairness. The trained models improved on 44 of 53 independent alignment evaluations, generalizing beyond the training scenarios. They also showed increased resistance to harmful fine-tuning and adversarial prompts while still responding appropriately to legitimate, beneficial instructions. OpenAI describes this as an early step toward training "widely and consistently beneficial" models, with the aim of ensuring that AI behaves safely and reliably even in novel high-risk domains.

3

OpenAI Codex Launches Record & Replay: Generate Reusable Skills from a Single Demonstration

AI Agent, Development Tool

OpenAI introduced the Record & Replay feature for Codex: users demonstrate a repetitive task once on Mac (e.g., uploading a video to YouTube, filling in forms, submitting a PR), and Codex observes and captures the sequence of actions and preferences, automatically generating a skill that is inspectable and editable. The skill can then be reused with different parameters to automate the task. It runs in entirely new contexts and can execute via computer use, browser use, or a combination of plugins, without requiring step-by-step prompting. The feature is rolling out gradually in select markets, with broader availability planned.

4

Claude Code Introduces Artifacts Feature: AI Programming Evolves Toward Visual, Real-Time Collaboration

AI Programming, Collaboration Tool

Anthropic launched the Artifacts feature for Claude Code (in Beta for Team and Enterprise plans), which turns work-in-progress from conversations, such as research findings, PR explanations, system diagrams, and team dashboards, into shareable, real-time web links. Artifacts are generated using the full conversation context and refresh automatically as the session progresses, so collaborators always see the latest version. They are private by default and can be shared only within an organization. Developers view this as a major workflow enhancement, similar to Codex Sites but available to all users, and capable of directly generating clickable mobile prototypes for review.

5

Perplexity Launches Self-Evolving Memory System Brain, Boosting Task Accuracy by 25% and Reducing Costs by 13%

AI Agent, Memory System

Perplexity introduced Brain, a self-evolving memory system for its Computer agent, centered on the principle of "remembering what the agent has done" rather than "remembering the user." Brain builds a traceable context graph (automatically loaded into the agent sandbox as an LLM wiki) that is updated nightly by synthesizing sessions, connector outputs, and historical corrections. Early internal data shows a 25% increase in answer accuracy and a 16% improvement in recall on familiar tasks, with a 13% reduction in cost for tasks requiring historical context. A feedback loop enables recursive self-improvement. The company describes current token consumption as an investment in future efficiency gains.

6

Study Reveals Deep Research Agents Leak Privacy via Web Queries; PA-DR Method Reduces Leaks by 3x

AI Safety, Privacy

The MosaicLeaks study shows that deep research agents can leak private information from local documents during web queries through a "mosaic effect": individual queries that each look harmless cumulatively reconstruct sensitive facts. Experiments show that reinforcement learning (RL) training focused solely on task success significantly increases leakage: the success rate rises from 48.7% to 59.3%, but leakage jumps from 34.0% to 51.7%. The study proposes a privacy-aware method, PA-DR, which combines situational task rewards with learned privacy rewards and achieves 58.7% success with only 9.9% leakage, while reducing required training samples by 5–6x. The conclusion: "Privacy cannot be injected via prompts; it must be trained in."
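To put the reported figures side by side, here is a small arithmetic sketch. It uses only the percentages quoted in this summary; the dictionary keys and function name are illustrative, and which comparison the headline's "3x" refers to is an inference rather than something the summary states.

```python
# Success and leakage rates (percent) as quoted in the summary.
# The keys are illustrative labels, not names used by the study.
results = {
    "before_rl": {"success": 48.7, "leakage": 34.0},
    "rl_task_only": {"success": 59.3, "leakage": 51.7},
    "pa_dr": {"success": 58.7, "leakage": 9.9},
}


def leakage_ratio(reference: str, method: str = "pa_dr") -> float:
    """Return how many times lower the method's leakage is than the reference's."""
    return results[reference]["leakage"] / results[method]["leakage"]


vs_before_rl = leakage_ratio("before_rl")
vs_rl_task_only = leakage_ratio("rl_task_only")
success_gap = results["rl_task_only"]["success"] - results["pa_dr"]["success"]

print(f"Leakage vs. agent before RL: {vs_before_rl:.1f}x lower")
print(f"Leakage vs. task-only RL:    {vs_rl_task_only:.1f}x lower")
print(f"Success given up vs. task-only RL: {success_gap:.1f} points")
```

On these numbers, PA-DR's leakage is roughly 3.4x lower than the pre-RL agent's and roughly 5.2x lower than task-only RL's, at a cost of about 0.6 points of task success, so the headline figure appears closest to the pre-RL comparison.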

7

Anthropic Project Fetch Phase II: Claude Opus 4.7 Programs Robot Dog ~20x Faster Than Humans

Robotics, Anthropic

Anthropic's Frontier Red Team released results from Phase II of Project Fetch: Claude Opus 4.7 programmed a robot dog approximately 20 times faster than last year’s top human teams using Opus 4.1. This project assesses frontier models’ progress in autonomous physical-world manipulation and programming tasks, reflecting rapid advances in AI autonomy within robotics control.

8

Claude Launches Enterprise-Grade Centralized Authorization for MCP Connectors, Integrating Okta, Asana, and More

Enterprise AI, MCP

Anthropic announced support for the Enterprise-Managed Auth extension in MCP, which lets administrators centrally authorize workplace connectors through identity providers (IdPs), eliminating the need for users to configure OAuth per application. Because it is based on the open MCP standard, any client, server, or IdP can adopt it. Currently in Beta with Okta support, it integrates connectors including Asana, Atlassian, Canva, Figma, and Granola. With centralized management, Claude can execute cross-tool workflows, such as data retrieval, customer call recaps, ticket creation, and draft updates, via a single organization-level request.

9

OpenAI Launches Usage Analytics and Enhanced Spend Controls for ChatGPT Enterprise, Unifying Codex and ChatGPT Credit Data

Enterprise AI, Cost Management

OpenAI introduced new credit usage analytics and upgraded spend control tools for ChatGPT Enterprise. The global admin console unifies credit data from ChatGPT and Codex, enabling administrators to track trends and analyze spending patterns at the user, product, and model levels, identify high-usage users, and manage costs effectively. Spend controls allow setting default workspace limits, group limits, and individual overrides. Employees can request additional credits with explanatory notes, avoiding blanket quota increases. Usage data can also be accessed via a unified Cost API for integration into enterprise systems for deeper analysis and cost optimization.
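The three tiers of spend control can be pictured as a simple lookup. The sketch below is a hypothetical model, not OpenAI's API or implementation: the `SpendPolicy` class, its field names, the single-group-per-user simplification, and the precedence order (individual override, then group limit, then workspace default) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SpendPolicy:
    """Hypothetical credit-limit policy with three tiers (all names assumed)."""

    workspace_default: int
    group_limits: Dict[str, int] = field(default_factory=dict)
    user_overrides: Dict[str, int] = field(default_factory=dict)

    def effective_limit(self, user: str, group: Optional[str] = None) -> int:
        # Assumed precedence: the most specific setting wins.
        if user in self.user_overrides:
            return self.user_overrides[user]
        if group is not None and group in self.group_limits:
            return self.group_limits[group]
        return self.workspace_default


policy = SpendPolicy(
    workspace_default=1_000,
    group_limits={"research": 5_000},
    user_overrides={"alice": 8_000},
)

alice_limit = policy.effective_limit("alice", group="research")
bob_limit = policy.effective_limit("bob", group="research")
carol_limit = policy.effective_limit("carol")
```

Under this assumed ordering, granting one employee's request for extra credits only adds an entry to `user_overrides`, which mirrors the stated goal of avoiding blanket quota increases.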

10

AMP Founder Proposes Independent Compute Network Vision, Arguing Frontier AI Is About System Efficiency, Not GPU Procurement

AI Infrastructure, Compute

AMP founder Anjney Midha argues that the real bottleneck in frontier AI competition is compute utilization efficiency, not GPU acquisition. He contrasts xAI's model FLOPs utilization (MFU) of under 10% with industry-leading benchmarks of 60–70%, and stresses that scheduling, networking, kernels, and cluster reliability are key. AMP's vision is to become an "Independent System Operator" for compute, aggregating supply and demand across multiple clouds and chip vendors so that FLOPs flow like megawatts. He also notes that about 20% of U.S. data centers face community opposition, suggests sharing marginal costs (e.g., $0.5/hour) with local communities, and criticizes lab "research hoarding" for causing market failure.
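For readers unfamiliar with MFU, the sketch below shows how the metric is commonly estimated for dense transformer training: achieved model FLOPs per second divided by the cluster's theoretical peak, using the widely used approximation of about 6 FLOPs per parameter per training token. The function name and every number in the example are hypothetical and do not describe any real cluster mentioned in the article.

```python
def model_flops_utilization(
    params: float,
    tokens_per_second: float,
    num_accelerators: int,
    peak_flops_per_accelerator: float,
) -> float:
    """Estimate MFU for dense transformer training.

    Uses the common ~6 * params FLOPs-per-token approximation
    (forward plus backward pass), which ignores attention FLOPs.
    """
    achieved_flops_per_second = 6.0 * params * tokens_per_second
    peak_flops_per_second = num_accelerators * peak_flops_per_accelerator
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical example: a 70B-parameter model processing 1M tokens/s
# on 1,024 accelerators that each peak at 1e15 FLOPs/s.
mfu = model_flops_utilization(
    params=70e9,
    tokens_per_second=1.0e6,
    num_accelerators=1024,
    peak_flops_per_accelerator=1e15,
)
print(f"MFU: {mfu:.1%}")
```

Because the denominator is fixed by the hardware, raising MFU means raising sustained throughput, which is where scheduling, networking, kernels, and cluster reliability come in.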

