cuppa

today's signal · no scroll

live

brewed 03:31 AM

Tuesday, May 19, 2026

the brief

Today tilted toward infrastructure and agent plumbing. Anthropic is acquiring Stainless to deepen its MCP and SDK tooling, and shipped several Claude updates: prompt cache diagnostics in Console, Opus 4.7 as the fast mode default in Claude Code, and doubled token limits in Claude Design. Vercel made WAF-mitigated traffic free and added a firewall command to its CLI, llama.cpp sped up Qwen3.6 with MTP, and Google released the open-source gemini-cli. Hugging Face published an open agent leaderboard and a practical Cosmos fine-tuning recipe.

the pour · 12 items

pulse (08)

@unknown · X · May 18, 05:00 PM
Anthropic acquires Stainless for MCP SDKs
Anthropic is acquiring Stainless, the SDK and MCP server platform behind every Anthropic SDK, signaling deeper investment in agent connectivity and developer tooling around MCP.
Anthropic is acquiring @stainlessapi, an SDK and MCP server platform that has powered every Anthropic SDK since the earliest days of our API. Read more: anthropic.com/news/anthropic…
signal 9 · hype 1 · acquisition · mcp · sdk · launch
@unknown · X · May 18, 07:40 PM
Claude Design doubles token limits
Claude Design token limits doubled on every plan, so users can create and iterate more before hitting their plan's cap.
You can now create more with Claude Design. We've doubled token limits across every plan. pic.x.com/d2AemkZUxW
signal 6 · hype 2 · product_update · token_limits · anthropic · launch
@ClaudeDevs · X · May 18, 05:59 PM
Claude Console adds prompt cache diffing
New cache diagnostics show exactly which prompt segments changed on a miss and the token delta, making cache tuning and cost regression debugging far easier.
Prompt cache diagnostics are now in Claude Console. When a request misses the cache, you can now see exactly which part of your prompt changed and how many tokens it cost you. pic.x.com/z0dV6zzLPm
signal 8 · hype 1 · prompt_cache · feature_update · developer_tools · launch
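To make the idea of a cache miss concrete, here is a minimal sketch of prefix diffing. It assumes prompt caching is prefix-based, so everything from the first changed segment onward has to be reprocessed. The function name, the segment lists, and the whitespace token count are all hypothetical stand-ins; this is not how Claude Console computes its diagnostics.

```python
def first_cache_divergence(previous: list[str], current: list[str]) -> dict:
    """Compare two prompts, given as ordered segments, and report where the shared prefix ends."""
    shared = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        shared += 1

    def count_tokens(segments: list[str]) -> int:
        # Assumption: whitespace word count as a crude stand-in for a real tokenizer.
        return sum(len(segment.split()) for segment in segments)

    longest = max(len(previous), len(current))
    return {
        # Index of the first segment that differs, or None if the prompts are identical.
        "first_changed_segment": shared if shared < longest else None,
        "reusable_tokens": count_tokens(current[:shared]),
        "reprocessed_tokens": count_tokens(current[shared:]),
    }


# Hypothetical prompts: only the tool list changed between the two requests.
previous = [
    "You are a helpful assistant.",
    "Tools: search, calculator",
    "User: summarize the report",
]
current = [
    "You are a helpful assistant.",
    "Tools: search, calculator, browser",
    "User: summarize the report",
]

report = first_cache_divergence(previous, current)
print(report)
```

In this toy case the edit to the second segment also invalidates the unchanged third segment, which is the kind of regression a cache diff view helps surface.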
@ClaudeDevs · X · May 18, 07:18 PM
Claude Code fast mode now Opus 4.7
Fast mode in Claude Code now defaults to Opus 4.7, delivering lower-latency responses for interactive coding sessions when speed matters more than per‑token cost.
Fast mode now defaults to Opus 4.7 in Claude Code. Try it out today with /fast pic.x.com/i0roMdEgEg
signal 7 · hype 1 · claude_code · model_update · default_change · launch
vercel/news · First-party · May 18, 08:00 PM
Vercel makes firewall‑mitigated traffic free
Requests denied, challenged, or rate‑limited by Vercel WAF no longer incur CDN or data transfer charges, effectively extending free DDoS-style mitigation to custom rules.
Firewall‑mitigated traffic is free on Vercel — Vercel Firewall now waives CDN Requests and Fast Data Transfer for any traffic denied, challenged, or rate‑limited by Web Application Firewall (WAF). Vercel has always provided unlimited DDoS mitigation at no cost. Vercel WAF, included in CDN cost, gives you custom rules, managed rules, and rate limiting for bad traffic that isn't DDoS. With this change, you don't pay for requests or bandwidth that WAF denies, challenges, or rate‑limits. That mea...
signal 6 · hype 1 · vercel · pricing_change · waf · launch
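As a rough illustration of what the pricing change means for a bill, the sketch below subtracts WAF-mitigated requests from the total. The class, field names, and traffic numbers are hypothetical; actual metering follows Vercel's own usage accounting.

```python
from dataclasses import dataclass


@dataclass
class TrafficSample:
    """Hypothetical request counts for one billing period."""
    total_requests: int
    waf_denied: int
    waf_challenged: int
    waf_rate_limited: int

    def mitigated(self) -> int:
        # Denied, challenged, and rate-limited requests are the three waived categories.
        return self.waf_denied + self.waf_challenged + self.waf_rate_limited

    def billable_requests(self) -> int:
        return self.total_requests - self.mitigated()


# Made-up numbers for illustration only.
sample = TrafficSample(
    total_requests=1_000_000,
    waf_denied=120_000,
    waf_challenged=30_000,
    waf_rate_limited=50_000,
)
print(sample.mitigated(), sample.billable_requests())
```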
@unknown · X · May 18, 04:51 PM
Vercel CLI adds firewall management
A new vercel firewall command lets teams and agents manage custom rules, IP blocks, system mitigations, and Attack Mode directly from the terminal and CI.
The new Vercel CLI 𝚏𝚒𝚛𝚎𝚠𝚊𝚕𝚕 command brings firewall configuration to the terminal. You and your agents can now manage custom rules, IP blocks, system mitigations, Attack Mode, and more: vercel.com/changelog/mana…
signal 6 · hype 2 · vercel_cli · firewall · security · launch
pplx/oss-rising-24h · Research · May 19, 03:01 AM
Google releases open‑source gemini‑cli agent
Google’s gemini-cli brings Gemini to the terminal as an extensible agent, enabling local workflows, tool use, and scripting without wiring a full web app.
GitHub, google-gemini/gemini-cli: An open-source AI agent that brings the power of Gemini directly into your terminal.
signal 7 · hype 1 · github_repo · oss_release · cli · launch
@unknown · X · May 18, 07:27 PM
llama.cpp adds MTP for Qwen3.6 speed
Speculative multi-token prediction boosts Qwen3.6‑27B dense generation from ~25 tok/s to ~45 tok/s on an A10G; enable via --spec-type draft-mtp and --spec-draft-n-max 2.
llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀 Qwen3.6-27B dense generation (on A10G): From 25 tok/s → 45 tok/s (+78%). Two flags on llama-server: --spec-type draft-mtp --spec-draft-n-max 2 pic.x.com/hhslKpLE71 x.com/ggerganov/stat…
signal 8 · hype 2 · llama_cpp · performance · mtp · hack
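For anyone scripting this, a small sketch that assembles the llama-server invocation with the two flags quoted in the post. The model path is a hypothetical placeholder, and the helper name is ours; only the two speculative-decoding flags come from the source.

```python
import shlex


def llama_server_mtp_command(model_path: str, draft_max: int = 2) -> list[str]:
    """Build an argv list for llama-server with the MTP flags quoted in the post."""
    return [
        "llama-server",
        "-m", model_path,                       # path to a local GGUF file (hypothetical here)
        "--spec-type", "draft-mtp",             # flag as quoted in the post
        "--spec-draft-n-max", str(draft_max),   # flag as quoted in the post
    ]


cmd = llama_server_mtp_command("models/qwen3.6-27b.gguf")
print(shlex.join(cmd))

# Speedup from the rounded throughput figures in the post.
baseline_tps, mtp_tps = 25, 45
speedup = mtp_tps / baseline_tps
print(f"{speedup:.2f}x")
```

The rounded figures work out to 1.8x, or about +80%; the post reports +78%, presumably from unrounded measurements.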

findings (03)

huggingface/blog · First-party · May 18, 02:12 PM
Hugging Face launches Open Agent Leaderboard
A new leaderboard tracks open agents across standardized tasks and evals, providing reproducible baselines and a common yardstick for agent system performance.
The Open Agent Leaderboard
signal 7 · hype 2 · leaderboard · agents · benchmarking · launch
huggingface/blog · First-party · May 18, 04:00 PM
Fine‑tuning Cosmos Predict 2.5 with LoRA/DoRA
HF details fine‑tuning NVIDIA’s Cosmos Predict 2.5 for robot video generation using LoRA/DoRA, with code, data recipes, and training tips for efficient adaptation.
Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation
signal 7 · hype 1 · fine_tuning · lora · dora · technical
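Why LoRA counts as efficient adaptation can be shown with a quick parameter count. The sketch below assumes a single linear layer with a low-rank update B·A; the layer size and rank are hypothetical and not taken from the Hugging Face post. DoRA additionally learns a magnitude vector, which this count leaves out.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Weights in a dense d_out x d_in layer."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights in a LoRA update: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, for illustration only.
d_out, d_in, rank = 4096, 4096, 16

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(full, lora, f"{lora / full:.2%}")
```

At these sizes the adapter trains under 1% of the layer's weights while the base weights stay frozen.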
@ClaudeDevs · X · May 18, 03:56 PM
Lessons from running Claude Code at scale
Anthropic shares practices from deployments across multi-million-line monorepos, decades-old legacy systems, and distributed microservices, covering repo indexing, context strategies, and operational patterns for agentic coding.
What are best practices for running Claude Code at scale? New blog post on what we've learned from teams running it across multi-million-line monorepos, decades-old legacy systems, and distributed microservices: claude.com/blog/how-claud…
signal 9 · hype 1 · claude_code · best_practices · scale · technical

voices (01)

@trq212 · X · May 18, 04:45 PM
Prompt pattern: keep implementation notes
A simple directive—maintain implementation-notes.html while coding—captures decisions, tradeoffs, and deviations from spec, improving handoffs, auditability, and agent collaboration.
a prompt I've been using a lot recently: implement <SPEC> and while you do, keep a running implementation-notes.html file (or markdown) with decisions you had to make weren't in the spec, things you had to change, tradeoffs you had to make or anything else I should know pic.x.com/qQFTES4fjo
signal 6 · hype 1 · prompting · dev_workflow · engineering_process · hack
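One way to reuse the pattern is to wrap it in a small template. This is a sketch only: the function name, the default notes filename, and the example spec are hypothetical, and the wording paraphrases the quoted prompt rather than reproducing an official template.

```python
def build_prompt(spec: str, notes_file: str = "implementation-notes.md") -> str:
    """Wrap a spec in the 'keep running implementation notes' instruction."""
    return (
        f"Implement the following spec:\n\n{spec}\n\n"
        f"While you do, keep a running {notes_file} file with:\n"
        "- decisions you had to make that weren't in the spec\n"
        "- things you had to change\n"
        "- tradeoffs you had to make\n"
        "- anything else I should know\n"
    )


# Hypothetical spec text for illustration.
prompt = build_prompt("Add pagination to the /orders endpoint.")
print(prompt)
```

Passing notes_file="implementation-notes.html" gives the HTML variant mentioned in the post.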
