cuppa

today's signal · no scroll

brewed 03:13 AM

Thursday, May 28, 2026

the brief

OpenAI leaned into enterprise agent plumbing: private MCP servers inside your network, Workload Identity Federation, and an expanded Admin API, plus a Codex demo for real-time meeting Q&A. Real deployments showed up with self-improving tax agents, while ITBench-AA from IBM and Artificial Analysis and Google's zero-trust analytics underscored how much reliability and privacy work remains. Tooling kept pace with a Claude Code update, and SQLite added an AGENTS.md aimed at people pointing agents at its codebase.

the pour · 12 items

alerts

(01)

@unknown · X · May 27, 02:59 PM
Codex deprecates GPT-5.2 and 5.3
OpenAI will sunset GPT-5.2 and GPT-5.3-Codex in Codex on June 2 for users logged in with a ChatGPT account, and GPT-5.5 becomes the default frontier model on free plans. Update model pins and tests before the cutoff.
To simplify our Codex compute fleet management, we will be sunsetting GPT-5.2 and GPT-5.3-Codex in Codex on June 2nd when logged in with your ChatGPT account. For free plans, GPT-5.5 will be the default frontier model to build and work with going forward. These models will
signal 7 · hype 1 · model_deprecation · product_update · openai · launch
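One way to act on the "update pins" advice is to scan configuration text for the retiring model names. The sketch below is hypothetical: the identifier spellings `gpt-5.2` and `gpt-5.3-codex`, the function name `find_retiring_pins`, and the sample config are assumptions, not taken from the announcement or from Codex's actual configuration format.

```python
import re

# Assumed identifier spellings for the two retiring models; adjust them to
# whatever strings your own configs actually pin.
RETIRING_MODELS = ("gpt-5.2", "gpt-5.3-codex")


def find_retiring_pins(text: str) -> list[tuple[int, str]]:
    """Return (line_number, model) pairs for each retiring model named in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for model in RETIRING_MODELS:
            # Require a boundary on both sides so a longer identifier that
            # merely starts with "gpt-5.2" (e.g. "gpt-5.2-mini") is not flagged.
            pattern = rf"(?<![\w-]){re.escape(model)}(?![\w-])"
            if re.search(pattern, lowered):
                hits.append((lineno, model))
    return hits


if __name__ == "__main__":
    sample = 'model = "gpt-5.2"\nmodel = "gpt-5.5"\nfallback: GPT-5.3-Codex\n'
    for lineno, model in find_retiring_pins(sample):
        print(f"line {lineno}: pinned to retiring model {model}")
```

Running a check like this in CI before June 2 would surface stale pins early; it only inspects text and says nothing about which model a session actually resolves to.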

pulse

(05)

@OpenAIDevs · X · May 27, 06:29 PM
Private MCP servers for OpenAI tools
ChatGPT, Codex, and the Responses API can now reach MCP servers kept inside your network over outbound-only HTTPS, which eases secure enterprise deployments.
Private MCP servers 🤝 OpenAI products. Your team can keep MCP servers inside your network while ChatGPT, Codex, and the Responses API connect through outbound-only HTTPS. 🔗 developers.openai.com/api/docs/guide… pic.x.com/uMsQJJK9ho
signal 8 · hype 2 · mcp · openai · enterprise · launch
@gdb · X · May 27, 03:47 PM
Codex real-time meeting Q&A
A shared demo shows Codex transcribing a meeting live and answering questions about it in real time, an agent workflow that depends on audio capture, memory, and retrieval.
Codex for transcribing and answering questions about a meeting in real time: x.com/_simonsmith/st…
signal 5 · hype 1 · demo · meeting_agent · real_time · launch
@OpenAIDevs · X · May 27, 06:29 PM
Workload Identity Federation for API
Teams can now manage OpenAI API access through cloud IAM workflows instead of distributing permanent API keys across services, which reduces static key sprawl and aligns with enterprise security practices.
Workload Identity Federation brings cloud-based identity to the OpenAI API platform. Teams can manage access through IAM workflows while reducing the need to distribute permanent API keys across services. 🔗 developers.openai.com/api/docs/guide… pic.x.com/IJmsc3B8K9
signal 7 · hype 1 · auth · iam · workload_identity · launch
@OpenAIDevs · X · May 27, 06:29 PM
Admin API adds enterprise controls
The expanded Admin API adds spend alerts, model allowlists, data retention controls, hosted tool controls, and more granular cost visibility for capabilities such as file search and web search, so large organizations can govern OpenAI projects programmatically.
We’ve expanded the Admin API to help enterprises manage OpenAI projects programmatically. New support includes spend alerts, model allowlists, data retention controls, hosted tool controls, and more granular cost visibility for capabilities like file search and web search. 🔗 pic.x.com/M9rBcju7HF
signal 6 · hype 1 · api_update · enterprise · governance · launch
anthropics/claude-code · First-party · May 28, 12:52 AM
Claude Code v2.1.153 ships
The update adds a skipLfs option that skips Git LFS downloads for plugin marketplace clones and updates, passes COLUMNS and LINES environment variables to status line commands, extends dispatch-input autocomplete to native slash commands and bundled skills, and shows a one-time notice when an npm global install can't auto-update.
v2.1.153: What's changed. Added skipLfs option to github/git plugin marketplace sources to skip Git LFS downloads during clone and update. Claude Code now shows a one-time notice when your npm global install can't auto-update; /doctor lists the fixes. Status line commands now receive COLUMNS and LINES environment variables so scripts can size output to the terminal width. claude agents: autocomplete in the dispatch input now suggests native slash commands and bundled skills, not just project ski...
signal 9 · hype 1 · release_notes · claude_code · git_lfs · launch
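As an illustration of the COLUMNS and LINES change, a status line script could read the width from its environment and trim its output to fit. This is a minimal sketch: only the COLUMNS variable name comes from the release notes, while the function name `fit_to_width`, the 80-column fallback, and the sample status text are assumptions.

```python
import os


def fit_to_width(text: str, env=None) -> str:
    """Truncate text to the terminal width given by the COLUMNS variable."""
    env = os.environ if env is None else env
    try:
        width = int(env.get("COLUMNS", "80"))
    except ValueError:
        width = 80  # assumed fallback when COLUMNS is not a number
    if width < 1:
        width = 80
    if len(text) <= width:
        return text
    # Reserve one character for the ellipsis that marks the cut.
    return text[: width - 1] + "…"


if __name__ == "__main__":
    # Hypothetical status text; a real script would build this from its input.
    print(fit_to_width("branch: main | tests: passing | last run: 2 minutes ago"))
```

Passing the environment in as a parameter keeps the function easy to test without changing the process environment.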

findings

(04)

huggingface/blog · First-party · May 27, 05:20 PM
Frontier models lag on ITBench-AA
IBM and Artificial Analysis release what they describe as the first benchmark for agentic enterprise IT tasks. Frontier models score below 50% on it, which points to large gaps in real-world IT automation.
ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks (by Artificial Analysis and IBM)
signal 8 · hype 1 · benchmark · agents · evaluation · technical
google/research · First-party · May 27, 04:56 PM
Zero-trust aggregation for private analytics
Google presents a system for privacy-preserving telemetry that avoids trusting any single party, combining cryptography and aggregation to enable useful analytics without raw data exposure.
Private analytics via zero-trust aggregation · Security, Privacy and Abuse Prevention
signal 5 · hype 1 · privacy · zero_trust · analytics · technical
@OpenAIDevs · X · May 27, 02:12 PM
How Codex self-improves on errors
OpenAI details a Tax AI workflow, co-built with ThriveHoldings, in which Codex traces each reviewer correction back to the underlying failure, improves the system, and tests the change before it ships. It is a practical loop for reliable agent systems.
⚙️ Behind the build of self-improving tax agents with Codex: We co-built Tax AI with @ThriveHoldings around tax prep workflows so when reviewers fix any errors, Codex can trace the failure, improve the system, and test the change before it ships. openai.com/index/building…
signal 6 · hype 3 · agent_framework · case_study · self_improving · technical
simonw/blog · Analysis · May 27, 11:44 PM
SQLite adds AGENTS.md for AI
SQLite’s repo now includes an AGENTS.md that, per Simon Willison, is not intended for SQLite's own development but is presumably aimed at people pointing agents at the codebase. Among other things, it states that SQLite does not accept pull requests without prior agreement.
sqlite AGENTS.md (https://github.com/sqlite/sqlite/blob/master/AGENTS.md): SQLite gained an AGENTS.md file five days ago (https://github.com/sqlite/sqlite/commit/a1e5778889252d2609a59fd9b819d70392c5789e) - but it's not intended for their own development, it's presumably aimed at people who are pointing agents at the SQLite codebase. It includes: "SQLite does not accept pull requests without prior agreement an..."
signal 8 · hype 1 · agents · sqlite · commit_link · technical

voices

(02)

@unknown · X · May 27, 02:56 PM
Self-improving tax agents in production
ThriveHoldings reports that the product it built with OpenAI processed more than 7,000 tax returns this season across the 30+ accounting firms it owns, and that it meaningfully improved as accountants used it. That is rare real-world evidence of agents that get better through iteration.
At @ThriveHoldings, we built a product with @OpenAI to automate tax prep for the 30+ accounting firms we own across the country. This season, it processed 7k+ returns. But what I think is more interesting is that the product meaningfully self-improved as accountants used it.
signal 4 · hype 3 · case_study · enterprise_ai · automation · cultural
simonw/blog · Analysis · May 27, 04:38 PM
PMF arrives for leading AI labs
Simon Willison argues that OpenAI and Anthropic have found product-market fit, citing rumors of Anthropic's first profitable quarter and reports of companies surprised by how large their LLM bills have grown from staff usage.
I think Anthropic and OpenAI have found product-market fit: Anthropic are strongly rumored (https://techcrunch.com/2026/05/20/anthropic-says-its-about-to-have-its-first-profitable-quarter/) to be about to have their first profitable quarter. Stories are circulating (https://www.theinformation.com/newsletters/applied-ai/uber-cto-shows-claude-code-can-blow-ai-budgets) of companies surprised at how expensive their LLM bills are becoming from usage by their staff. I th...
signal 6 · hype 2 · industry_trends · llm_costs · pmf · cultural
