the brief

Today leaned into agents and tooling: Simon Willison shipped a coding agent and showed DSPy improving SQL prompts, while Vercel’s Andrew Qu unpacked eve’s skills and sandboxes. Core dev stacks moved with a Next.js canary and Claude Code fixes, and research surfaced on mutual information (MI) estimation and model unlearning. Meta tempered agent expectations, and Anthropic’s IPO prep signaled that the capital race continues.

the poursit · sip · 12 items

pulse

(06)
  • simonw/blog · feed · Jul 2, 07:33 PM

    Simon releases llm-coding-agent alpha

    An early Python coding agent built on his LLM library adds file‑aware planning and tool use, offering a simple baseline for hacking on code‑gen agents.

    llm-coding-agent 0.1a0 — Release: llm-coding-agent 0.1a0 (https://github.com/simonw/llm-coding-agent/releases/tag/0.1a0). Another Fable 5 experiment. Now that my LLM library (https://llm.datasette.io/) has evolved into more of an agent framework it's time to see what a simple coding agent would look like built on it. I started a new Python... (https://github.com/simonw/llm-coding-agent/tree/2466fa03ba8e5122c3bfa93d52167d33bce40ac6)

    signal 8 · hype 1 · agent_framework · release · open_source
  • anthropics/claude-code · feed · Jul 2, 11:35 PM

    Claude Code v2.1.199 ships fixes

    Stacked slash‑skills now load correctly, SSL certificate errors fail fast with guidance, and partial outputs persist after mid‑stream API errors—quality‑of‑life updates for coding sessions.

    v2.1.199 — What's changed: (1) Stacked slash-skill invocations like "/skill-a /skill-b do XYZ" now load all leading skills (up to 5), not just the first. (2) Fixed SSL certificate errors (TLS-inspecting proxies, missing NODE_EXTRA_CA_CERTS, expired certs) burning retries before showing actionable guidance; they now fail immediately with the fix hint. (3) Fixed streaming responses being discarded when the API emits a mid-stream overloaded/server error after partial output; the partial is now kept with an inc...

    signal 9 · hype 1 · release_notes · bugfix · developer_tooling
  • vercel/next.js · feed · Jul 3, 12:04 AM

    Next.js 16.3 canary.76 updates

    Fixes navigation regressions, caching edge cases, history handling, and upgrades React to latest commit—useful for teams riding canary builds ahead of 16.3 stabilization.

    v16.3.0-canary.76 — Misc Changes: Fix navigation getting reverted when a Server Action is in flight (#95391); Fix false-positive nested-cache error for a short default profile (#95373); Skip saving expire: 0 values in the default cache handler in prod (#95363); [ci] Disable mid-stack PR optimization for native PR stacks (#95427); Fix history push getting treated like replace when followed by refresh (#95392); Upgrade React from ec0fca31-20260701 to 3508aee6-20260702 (#95410); fix(config): correctly valida...

    signal 8 · hype 1 · release_notes · framework · nextjs
  • hn/frontpage · feed · Jul 2, 10:57 PM

    crustc translates rustc to C

    A bold effort ports the entire Rust compiler to C, potentially enabling new build environments, bootstrapping strategies, and platform reach for the Rust toolchain.

    crustc: entirety of `rustc`, translated to C — Article URL: https://github.com/FractalFir/crustc · Comments URL: https://news.ycombinator.com/item?id=48768464 · Points: 153 · Comments: 31

  • techmeme · feed · Jul 2, 09:00 PM

    Zuckerberg tempers Meta agent timelines

    At an internal town hall, Zuckerberg said Meta’s agent push hasn’t accelerated as hoped and reorgs were messy—signal that agent productization remains hard.

    At a town hall, Mark Zuckerberg said Meta's AI agent development has not accelerated as expected and its reorganization was not as "clean" as it could have been (Katie Paul/Reuters) — Meta (META.O) Chief Executive Mark Zuckerberg told an internal town hall on Thursday that AI agent development …

  • techmeme · feed · Jul 3, 01:10 AM

    Anthropic lines up IPO counsel

    The Information reports that Anthropic’s bankers have hired UK law firm Freshfields to advise on the company’s IPO, which is expected to raise tens of billions, underscoring the capital race for frontier AI.

    Sources: Anthropic's bankers have hired UK law firm Freshfields to advise on its IPO; it also advised on Google's acquisition of Wiz and ServiceNow's Armis deal (The Information) — A British law firm has scored a big win: a role in the Anthropic initial public offering, expected to raise tens of billions at a valuat...
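
The Claude Code item above says stacked invocations such as /skill-a /skill-b do XYZ now load every leading skill, up to 5. A minimal Python sketch of that kind of parsing follows. It is not Claude Code's implementation: the function name, the rule that a skill token is any whitespace-separated word starting with "/", and the choice to leave tokens past the cap in the remaining prompt are all assumptions made for illustration.

```python
MAX_LEADING_SKILLS = 5  # the cap quoted in the release notes


def split_leading_skills(message, limit=MAX_LEADING_SKILLS):
    """Split a message into (leading skill names, remaining prompt text).

    Hypothetical helper: only slash-prefixed words at the very start of the
    message count as skills, and at most `limit` of them are collected.
    """
    tokens = message.split()
    skills = []
    while (
        tokens
        and tokens[0].startswith("/")
        and len(tokens[0]) > 1      # a bare "/" is not a skill name
        and len(skills) < limit
    ):
        skills.append(tokens.pop(0)[1:])  # drop the leading slash
    return skills, " ".join(tokens)


skills, prompt = split_leading_skills("/skill-a /skill-b do XYZ")
print(skills, prompt)
```

Under these assumptions a slash-prefixed word that appears after ordinary text is treated as part of the prompt, since only leading skills are collected.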

findings

(03)
  • simonw/blog · feed · Jul 2, 06:25 PM

    DSPy tunes Datasette Agent prompts

    Willison shows how to use DSPy to systematically evaluate and refine SQL system prompts for Datasette Agent, with a reproducible repo and measurable gains.

    Using DSPy to evaluate and improve Datasette Agent's SQL system prompts — Research: Using DSPy to evaluate and improve Datasette Agent's SQL system prompts (https://github.com/simonw/research/tree/main/dspy-datasette-agent-prompts#readme). One of this morning's AIE keynotes covered dspy (https://github.com/stanfordnlp/dspy), which reminded me I've been meaning to see if it could help me improve the system prompt used by https://a...

    signal 9 · hype 1 · dspy · prompt_evaluation · agents
  • tmlr-pub.bsky.social · Jul 3, 12:19 AM

    MIST: supervised mutual information estimation

    A TMLR paper proposes MI estimation via supervised training, claiming improved stability and performance over contrastive methods; details and evaluations are available on OpenReview.

    MIST: Mutual Information Estimation via Supervised Training. Authors: German Gritsai, Megan Richards, Maxime Méloux, Kyunghyun Cho, Maxime Peyrard. Action editor: Matthew Holland. https://openreview.net/forum?id=Qi4JgS2PLw #information #predict #estimators

    signal 5 · hype 1 · paper · mutual_information · estimation
  • tmlr-pub.bsky.social · Jul 2, 08:19 PM

    Unlearning as knowledge tracing

    New TMLR work reframes data‑tracing unlearning for foundation models as knowledge‑tracing, outlining methods and metrics to reason about what a model retains or forgets.

    Lifting Data-Tracing Machine Unlearning to Knowledge-Tracing for Foundation Models. Authors: Yuwen Tan, Boqing Gong. Action editor: Tianbao Yang. https://openreview.net/forum?id=ScvUCNMdYN #unlearning #tracing #knowledge

    signal 4 · hype 1 · research_paper · machine_unlearning · foundation_models
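
For readers less familiar with the quantity the MIST item estimates, here is a minimal Python sketch of mutual information computed exactly from a known discrete joint distribution. This is the textbook definition, I(X;Y) = sum over (x, y) of p(x,y) * log(p(x,y) / (p(x) p(y))), and not the paper's supervised estimator, which targets the harder case where only samples are available. The function name and the list-of-rows input format are assumptions for illustration.

```python
import math


def mutual_information(joint):
    """Exact mutual information, in nats, of a discrete joint distribution.

    joint[i][j] is P(X = i, Y = j); the entries are assumed to sum to 1.
    """
    px = [sum(row) for row in joint]        # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]  # marginal of Y (column sums)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # zero-probability cells contribute nothing
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


independent = [[0.25, 0.25], [0.25, 0.25]]  # X and Y unrelated
identical = [[0.5, 0.0], [0.0, 0.5]]        # Y always equals X
print(mutual_information(independent), mutual_information(identical))
```

The two toy tables bracket the range for a fair binary variable: zero when X and Y are independent, and log 2 nats (one bit) when Y is a copy of X.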

voices

(03)
  • pragmatic/engineer · feed · Jul 2, 06:46 PM

    Pragmatic routing for AI models

    The Pragmatic Engineer surveys emerging 'smart' model routers that pick the right LLM per task, mapping the players and approaches for cost‑performance wins.

    The Pulse: a new trend, smart model routing — Are there any ‘intelligent’ router solutions out there which select the right model for the right task? I looked into it, and there are a few options.

    signal 6 · hype 2 · model_routing · tooling_survey · trend_analysis
  • latentspace/podcast · feed · Jul 3, 12:08 AM

    Vercel’s Andrew Qu on agents

    Inside Vercel’s eve framework: skills, sandboxing, and agent‑readable websites—why agents look like a new software class and how devs should design for them.

    Vercel's Andrew Qu on why agents are a new kind of software — The Vercel Chief of Software explains how its agent framework, eve, was created — and why skills, sandboxes and agent-readable websites now matter.

    signal 6 · hype 2 · agent_framework · vercel · podcast
  • simonw/blog · feed · Jul 2, 05:07 PM

    Understand to participate, not absorb

    Willison argues developers must keep cognitive debt low when collaborating with coding agents, building tools and UX that make agent reasoning and diffs tractable.

    Understand to participate — I saw Geoffrey Litt speak at AIE (https://www.ai.engineer/worldsfair/2026) yesterday, and one framing he used particularly resonated with me: Understand to participate. Geoffrey was talking about the challenge of collaborating with coding agents as they construct increasingly large and sophisticated changes, and the need to avoid taking on cognitive debt (https://simonwillison.net/tags/cognitive-debt/) as you...

    signal 6 · hype 1 · agents · engineering_process · cognitive_debt
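
The model-routing item above describes routers that select the right model for the right task. One simple policy such a router could use is "cheapest model that clears a quality bar", sketched below in Python. Everything here is hypothetical: the model names, costs, and quality scores are made up, and the routers surveyed in the article may work quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str             # hypothetical identifier
    cost_per_call: float  # made-up relative cost
    quality: float        # made-up score in [0, 1]


def route(models, required_quality):
    """Return the cheapest model whose quality meets the requirement."""
    eligible = [m for m in models if m.quality >= required_quality]
    if not eligible:
        # Nothing clears the bar: fall back to the strongest model.
        return max(models, key=lambda m: m.quality)
    return min(eligible, key=lambda m: m.cost_per_call)


CATALOG = [
    Model("small", 1.0, 0.60),
    Model("medium", 4.0, 0.80),
    Model("large", 20.0, 0.95),
]

print(route(CATALOG, 0.7).name)
```

In practice the hard part is estimating the required quality for a given task, which this sketch takes as an input rather than solving.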