the brief

Model access and performance dominated: OpenAI teased a GPT‑5.6 lineup (Sol, Terra, Luna) while U.S. officials okayed Anthropic’s Mythos only for trusted partners. Google pushed on‑device speed with frozen multi‑token prediction, and agent reality checks landed—from Claude Code hardening to a red‑team postmortem and a clear stack explainer.

the pour

sit · sip · 8 items

pulse

(03)
  • simonw/blog · feed · Jun 26, 05:10 PM

    OpenAI previews GPT‑5.6 trio

    Limited preview introduces Sol (flagship), Terra (GPT‑5.5‑level at half cost), and Luna (fast, low‑cost), signaling a refreshed price–performance ladder ahead of broader availability.

    Quoting OpenAI (https://openai.com/index/previewing-gpt-5-6-sol/): "We're beginning a limited preview of the GPT‑5.6 series: Sol, our flagship model; Terra, a balanced model for everyday work; and Luna, a fast and affordable model. Terra has competitive performance to GPT‑5.5 while being 2x cheaper and Luna brings strong capability at our lowest cost. [...] We believe in broad access, and we plan to make GPT‑5.6 Sol, Terra, and Luna generally available in the coming..."

    signal 9 · hype 1 · model_release · openai · gpt_5_6 · source ↗
  • hn/frontpage · feed · Jun 26, 10:48 PM

    Anthropic cleared to share Mythos selectively

    Reuters, citing Semafor, reports that U.S. officials approved Anthropic’s high‑risk Mythos model for release to 'trusted partners,' highlighting a cautious, gatekept path to enterprise access for cutting‑edge capabilities.

    US allows Anthropic to release Mythos to 'trusted partners'
    Article URL: https://www.reuters.com/technology/us-releases-anthropic-model-mythos-some-us-companies-semafor-reports-2026-06-26/
    Comments URL: https://news.ycombinator.com/item?id=48692995
    Points: 232 · Comments: 220

    signal 8 · hype 1 · model_release · regulation · anthropic · source ↗
  • anthropics/claude-code · feed · Jun 26, 09:29 PM

    Claude Code v2.1.195 ships fixes

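    Adds CLAUDE_CODE_DISABLE_MOUSE_CLICKS to turn off mouse click, drag, and hover in fullscreen mode while keeping wheel scroll; makes hook matchers with hyphenated identifiers exact‑match instead of substring‑match; and fixes macOS voice dictation capturing silence after the default input device changes.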

    v2.1.195, What's changed:
      • Added CLAUDE_CODE_DISABLE_MOUSE_CLICKS to disable mouse click/drag/hover in fullscreen mode while keeping wheel scroll
      • Fixed hook matchers with hyphenated identifiers (e.g. code-reviewer, mcp__brave-search) accidentally substring-matching; they now exact-match. Use mcp__brave-search__.* to match all tools from a hyphenated MCP server.
      • Fixed voice dictation on macOS capturing silence in long-running sessions after the default input device changes
      • Fixed voice dictatio...

    signal 9 · hype 0 · release_notes · claude_code · mcp · source ↗
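
The difference between substring matching and exact matching in the hook-matcher fix can be illustrated with a small sketch. This is not Claude Code's implementation: it assumes matchers behave like regular expressions (the `.*` in the release note suggests as much), and the tool name `mcp__brave-search__web_search` is a hypothetical example.

```python
import re

def substring_match(matcher: str, tool_name: str) -> bool:
    """Pre-fix behavior as described: the matcher may hit anywhere in the name."""
    return re.search(matcher, tool_name) is not None

def exact_match(matcher: str, tool_name: str) -> bool:
    """Post-fix behavior as described: the whole name must match the matcher."""
    return re.fullmatch(matcher, tool_name) is not None

# Hypothetical tool exposed by a hyphenated MCP server.
tool = "mcp__brave-search__web_search"

print(substring_match("mcp__brave-search", tool))  # True: prefix hit counts
print(exact_match("mcp__brave-search", tool))      # False: name is longer
print(exact_match("mcp__brave-search__.*", tool))  # True: wildcard covers the rest
```

Under these assumptions, a bare server name no longer matches every tool on that server by accident; the explicit `__.*` suffix is what opts in to matching all of them.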

findings

(01)
  • google/research · feed · Jun 26, 06:30 PM

    Frozen MTP speeds Gemini Nano on Pixel

    Google details 'frozen Multi‑Token Prediction' to accelerate on‑device Gemini Nano by predicting multiple tokens per step, reducing latency and power while preserving quality for mobile use.

    Accelerating Gemini Nano models on Pixel with frozen Multi-Token Prediction — Machine Intelligence

    signal 7 · hype 1 · research_blog · on_device_inference · optimization · source ↗
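
The excerpt does not say how Google's frozen MTP works internally, so the following is only a toy sketch of one common way multi-token prediction is used for speed: draft several tokens per step, then keep only the prefix the base model agrees with. The functions `base_next` and `draft_k` are hypothetical stand-ins for a frozen base model and its extra prediction heads.

```python
from typing import List, Tuple

def base_next(ctx: List[int]) -> int:
    """Toy frozen base model: the next token is (last + 1) mod 10."""
    return (ctx[-1] + 1) % 10

def draft_k(ctx: List[int], k: int) -> List[int]:
    """Toy multi-token drafter: counts upward but forgets the wrap at 10."""
    out, last = [], ctx[-1]
    for _ in range(k):
        last += 1
        out.append(last)
    return out

def greedy(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one base-model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(base_next(seq))
    return seq

def generate(prompt: List[int], n_new: int, k: int) -> Tuple[List[int], int]:
    """Draft k tokens per step and accept the prefix the base model confirms.

    Returns the sequence and the number of draft-and-verify steps taken.
    """
    seq, steps = list(prompt), 0
    target = len(prompt) + n_new
    while len(seq) < target:
        steps += 1
        for tok in draft_k(seq, k):
            # A real system would score all draft positions in one batched
            # pass; the toy calls the base model per position instead.
            expected = base_next(seq)
            seq.append(expected)  # always emit the base model's token
            if tok != expected or len(seq) == target:
                break  # first disagreement (or end of output) ends the step
    return seq, steps

seq, steps = generate([0], n_new=12, k=4)
print(seq == greedy([0], 12), steps)  # True 4
```

In this toy, the output is identical to one-token-at-a-time decoding while taking 4 steps instead of 12, which is the general shape of the latency win the summary describes.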

voices

(04)
  • simonw/blog · feed · Jun 26, 06:33 PM

    Red‑teaming an email AI assistant

    Fernando Irarrázaval’s challenge drew 6,000 attempts from 2,000 people trying to leak his assistant’s secrets by email; it cost about $500 in token spend and triggered a Google account suspension, surfacing practical guardrails and operational gotchas for agent builders.

    What happened after 2,000 people tried to hack my AI assistant (https://www.fernandoi.cl/posts/hackmyclaw/): Fernando Irarrázaval ran a challenge on hackmyclaw.com (https://hackmyclaw.com/) to see if anyone could leak secrets held by his OpenClaw test instance by sending it email. Surprisingly, after 6,000 attempts (and $500 in token spend and a Google account suspension trig...

    signal 8 · hype 1 · agent_security · prompt_injection · red_team · source ↗
  • tailscale/blog · feed · Jun 26, 05:00 PM

    A practical explainer for agentic AI

    Tailscale cuts through hype with a clear map of the agent stack—planning, tools, memory, evaluators—useful for engineers wiring agents into real systems.

    A no-nonsense explainer to Agentic AI — Cut through the buzzwords with a clear explanation of the agentic AI stack.

  • simonw/blog · feed · Jun 26, 05:58 PM

    Hypothetical AI code review meltdown

    Andrew Nesbitt’s satirical incident report depicts dueling AI review agents stuck in a costly disagreement loop—an all‑too‑plausible failure mode to design around.

    Incident Report: CVE-2026-LGTM (https://nesbitt.io/2026/06/26/incident-report-cve-2026-lgtm.html): Spectacular hypothetical incident report by Andrew Nesbitt. "Day 2, 16:00 UTC: Two AI review agents from competing vendors, both attached to a downstream pull request bumping foxhole-lz4, enter a disagreement loop over whether the package is malicious. After 340 comments and $4..."

    signal 7 · hype 1 · ai_agents · supply_chain_security · incident_report · source ↗
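
One way to design around a disagreement loop like the one in this satire is a hard cap on rounds and spend, with a hand-off to a human once either is hit. The sketch below is illustrative only: `LoopGuard`, `run_review`, and the verdict strings are hypothetical names, not part of any vendor's review agent.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class LoopGuard:
    """Tracks rounds and spend for an agent-to-agent exchange."""
    max_rounds: int
    max_cost_usd: float
    rounds: int = 0
    cost_usd: float = 0.0

    def record(self, cost_usd: float) -> None:
        """Account for one completed round of back-and-forth."""
        self.rounds += 1
        self.cost_usd += cost_usd

    def should_escalate(self) -> bool:
        """True once either the round cap or the spend cap is reached."""
        return self.rounds >= self.max_rounds or self.cost_usd >= self.max_cost_usd

def run_review(guard: LoopGuard, verdicts_a: Iterable[str],
               verdicts_b: Iterable[str], cost_per_round: float) -> str:
    """Compare two agents' verdicts round by round until they agree or a cap trips."""
    for a, b in zip(verdicts_a, verdicts_b):
        guard.record(cost_per_round)
        if a == b:
            return f"agreed:{a}"
        if guard.should_escalate():
            return "escalate_to_human"
    return "escalate_to_human"  # ran out of verdicts without agreement

guard = LoopGuard(max_rounds=3, max_cost_usd=5.0)
result = run_review(guard, ["malicious"] * 10, ["safe"] * 10, cost_per_round=0.5)
print(result, guard.rounds, guard.cost_usd)  # escalate_to_human 3 1.5
```

The point of the design is that the stopping condition lives outside both agents, so neither one's persistence can extend the loop.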
  • simonw/blog · feed · Jun 26, 10:25 PM

    Dean W. Ball on release timing

    A sharp take on how each week of regulatory delay eats into the few post‑release months in which a frontier model recoups its training cost, before competition emerges and margins compress, reframing policy debates in stark economic terms.

    Quoting Dean W. Ball (https://www.hyperdimensional.co/p/what-should-be-done): "This is a bad state of affairs. Consider, in particular, some industry dynamics: 1. Frontier models are trained at an enormous cost, and a significant fraction of that cost is recouped in the few post-release months that they are broadly available. After that period elapses, the models become sub-frontier, competition emerges, and margins compress. Every week of delay is eating into th..."

    signal 5 · hype 1 · industry_economics · frontier_models · policy · source ↗