the brief

Today tilted toward infrastructure and agent plumbing: Anthropic is acquiring Stainless to deepen its MCP and SDK tooling, and shipped several Claude updates, from prompt cache diagnostics to an Opus 4.7 default for Claude Code fast mode and doubled Claude Design token limits. Vercel made WAF‑mitigated traffic free and added a firewall command to its CLI, llama.cpp sped up Qwen3.6 with MTP, and Google released an open-source gemini-cli. Hugging Face pushed agent evals and a practical Cosmos fine‑tuning recipe.

the pour · 12 items

pulse

(08)
  • @unknown · May 18, 05:00 PM

    Anthropic acquires Stainless for MCP SDKs

    Anthropic is acquiring Stainless, the SDK and MCP server platform behind every Anthropic SDK, signaling deeper investment in agent connectivity and developer tooling around MCP.

    Anthropic is acquiring @stainlessapi, an SDK and MCP server platform that has powered every Anthropic SDK since the earliest days of our API. Read more: anthropic.com/news/anthropic…

  • @unknown · May 18, 07:40 PM

    Claude Design doubles token limits

    Claude Design token limits doubled across every plan, leaving room for larger compositions and longer iteration loops before a plan's cap is reached.

    You can now create more with Claude Design. We've doubled token limits across every plan. pic.x.com/d2AemkZUxW

    2x token limits on Claude Design, so you can create more.
    signal 6 · hype 2 · product_update · token_limits · anthropic · source ↗
  • @ClaudeDevs · May 18, 05:59 PM

    Claude Console adds prompt cache diffing

    New cache diagnostics in Claude Console show which part of a prompt changed on a cache miss and how many tokens the miss cost, making cache tuning and cost-regression debugging far easier.

    Prompt cache diagnostics are now in Claude Console. When a request misses the cache, you can now see exactly which part of your prompt changed and how many tokens it cost you. pic.x.com/z0dV6zzLPm

    signal 8 · hype 1 · prompt_cache · feature_update · developer_tools · source ↗
  • @ClaudeDevs · May 18, 07:18 PM

    Claude Code fast mode now Opus 4.7

    Fast mode in Claude Code now defaults to Opus 4.7, delivering lower-latency responses for interactive coding sessions when speed matters more than per‑token cost.

    Fast mode now defaults to Opus 4.7 in Claude Code. Try it out today with /fast pic.x.com/i0roMdEgEg

    signal 7 · hype 1 · claude_code · model_update · default_change · source ↗
  • vercel/news · feed · May 18, 08:00 PM

    Vercel makes firewall‑mitigated traffic free

    Requests denied, challenged, or rate‑limited by Vercel WAF no longer incur CDN Requests or Fast Data Transfer charges, so traffic blocked by custom rules, managed rules, and rate limits is now free in the same way DDoS mitigation already was.

    Firewall‑mitigated traffic is free on Vercel — Vercel Firewall now waives CDN Requests and Fast Data Transfer for any traffic denied, challenged, or rate‑limited by Web Application Firewall (WAF). Vercel has always provided unlimited DDoS mitigation at no cost. Vercel WAF, included in CDN cost, gives you custom rules, managed rules, and rate limiting for bad traffic that isn't DDoS. With this change, you don't pay for requests or bandwidth that WAF denies, challenges, or rate‑limits. That mea...

  • @unknown · May 18, 04:51 PM

    Vercel CLI adds firewall management

    A new vercel firewall command lets teams and agents manage custom rules, IP blocks, system mitigations, and Attack Mode directly from the terminal and CI.

    The new Vercel CLI firewall command brings firewall configuration to the terminal. You and your agents can now manage custom rules, IP blocks, system mitigations, Attack Mode, and more: vercel.com/changelog/mana…

    signal 6 · hype 2 · vercel_cli · firewall · security · source ↗
  • pplx/oss-rising-24h · research · May 19, 03:01 AM

    Google releases open‑source gemini‑cli agent

    Google’s gemini-cli brings Gemini to the terminal as an extensible agent, enabling local workflows, tool use, and scripting without wiring a full web app.

    GitHub - google-gemini/gemini-cli: An open-source AI agent that brings the power of Gemini directly into your terminal.

  • @unknown · May 18, 07:27 PM

    llama.cpp adds MTP for Qwen3.6 speed

    Speculative decoding with multi-token prediction (MTP) lifts Qwen3.6‑27B dense generation from ~25 tok/s to ~45 tok/s on an A10G; enable it on llama-server with --spec-type draft-mtp and --spec-draft-n-max 2.

    llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀 Qwen3.6-27B dense generation (on A10G): From 25 tok/s → 45 tok/s (+78%). Two flags on llama-server: --spec-type draft-mtp --spec-draft-n-max 2 pic.x.com/hhslKpLE71 x.com/ggerganov/stat…
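
For the llama.cpp item above, a minimal Python sketch of how the two quoted flags might be assembled into a llama-server command line, plus the arithmetic behind the reported gain. The flags `--spec-type draft-mtp` and `--spec-draft-n-max 2` are taken from the post as quoted; the model path and both function names are hypothetical, and the sketch only builds the command without running it.

```python
import shlex


def mtp_server_command(model_path: str, draft_n_max: int = 2) -> list[str]:
    """Build (but do not run) a llama-server command line with MTP drafting."""
    return [
        "llama-server",
        "--model", model_path,                   # hypothetical path to a local GGUF file
        "--spec-type", "draft-mtp",              # flag as quoted in the post
        "--spec-draft-n-max", str(draft_n_max),  # flag as quoted in the post
    ]


def speedup_percent(before_tok_s: float, after_tok_s: float) -> float:
    """Relative throughput gain, in percent."""
    return (after_tok_s / before_tok_s - 1.0) * 100.0


if __name__ == "__main__":
    # Print a shell-safe version of the command for copy/paste.
    print(shlex.join(mtp_server_command("models/qwen3.6-27b.gguf")))
    # Gain computed from the rounded figures in the post.
    print(f"{speedup_percent(25, 45):.0f}%")
```

With the rounded figures, 25 → 45 tok/s works out to about +80%; the quoted +78% presumably reflects unrounded measurements.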

findings

(03)
  • huggingface/blog · feed · May 18, 02:12 PM

    Hugging Face launches Open Agent Leaderboard

    A new leaderboard tracks open agents across standardized tasks and evals, providing reproducible baselines and a common yardstick for agent system performance.

    The Open Agent Leaderboard

    signal 7 · hype 2 · leaderboard · agents · benchmarking · source ↗
  • huggingface/blog · feed · May 18, 04:00 PM

    Fine‑tuning Cosmos Predict 2.5 with LoRA/DoRA

    HF details fine‑tuning NVIDIA’s Cosmos Predict 2.5 for robot video generation using LoRA/DoRA, with code, data recipes, and training tips for efficient adaptation.

    Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

  • @ClaudeDevs · May 18, 03:56 PM

    Lessons from running Claude Code at scale

    Anthropic shares practices from deployments across multi‑million‑line monorepos and microservices, covering repo indexing, context strategies, and operational patterns for agentic coding.

    What are best practices for running Claude Code at scale? New blog post on what we've learned from teams running it across multi-million-line monorepos, decades-old legacy systems, and distributed microservices: claude.com/blog/how-claud…

    signal 9 · hype 1 · claude_code · best_practices · scale · source ↗

voices

(01)
  • @trq212 · May 18, 04:45 PM

    Prompt pattern: keep implementation notes

    A simple directive, keep a running implementation-notes.html file while coding, captures decisions the spec did not cover, changes, and tradeoffs, improving handoffs, auditability, and agent collaboration.

    a prompt I've been using a lot recently: implement <SPEC> and while you do, keep a running implementation-notes.html file (or markdown) with decisions you had to make weren't in the spec, things you had to change, tradeoffs you had to make or anything else I should know pic.x.com/qQFTES4fjo

    signal 6 · hype 1 · prompting · dev_workflow · engineering_process · source ↗
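
The prompt pattern above could be wrapped in a small helper so it is applied consistently. This is a sketch only: the function name `build_spec_prompt` and its `notes_file` parameter are hypothetical, and the wording is a lightly tidied paraphrase of the quoted prompt rather than the author's exact text.

```python
def build_spec_prompt(spec: str, notes_file: str = "implementation-notes.html") -> str:
    """Wrap a spec in the 'keep implementation notes' directive."""
    return (
        f"Implement the following spec:\n\n{spec.strip()}\n\n"
        f"While you work, keep a running {notes_file} file with:\n"
        "- decisions you had to make that weren't in the spec\n"
        "- things you had to change\n"
        "- tradeoffs you had to make\n"
        "- anything else I should know\n"
    )


if __name__ == "__main__":
    # Example spec text is a placeholder for illustration.
    print(build_spec_prompt("Add a /health endpoint that returns 200 OK."))
```

Passing `notes_file="implementation-notes.md"` covers the markdown variant mentioned in the post.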