An update on recent Claude Code quality reports: the problems came from the harness, not the models

simonwillison.net · 2026-04-24 · Excerpt

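Anthropic has published a postmortem on the recent decline in Claude Code quality, confirming that the problems users reported over the past two months were real. The root cause was not a regression in the underlying models but three separate engineering defects (including a prompt injection vulnerability, a caching bug, and muddled logging) that together degraded output quality. The company has fixed these issues and committed to strengthening its testing process.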

Simon Willison

24th April 2026 - Link Blog

An update on recent Claude Code quality reports (via). It turns out the high volume of complaints that Claude Code was providing worse quality results over the past two months was grounded in real problems.

The models themselves were not to blame, but three separate issues in the Claude Code harness caused complex but material problems which directly affected users.

Anthropic's postmortem describes these in detail. This one in particular stood out to me:

On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive.
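To make that failure mode concrete, here is a minimal Python sketch of how a "clear once after an idle gap" rule can turn into "clear on every turn". It is purely illustrative: the quote does not say how the bug was implemented, and `Session`, `buggy_turn`, `fixed_turn`, `run` and the sticky `resumed_stale` flag are all hypothetical names, not anything from Claude Code.

```python
from dataclasses import dataclass, field

IDLE_THRESHOLD_S = 3600  # "idle for over an hour", per the quote above


@dataclass
class Session:
    # Hypothetical session model; none of these names come from Claude Code.
    thinking: list[str] = field(default_factory=list)
    last_active: float = 0.0
    resumed_stale: bool = False  # only used by the buggy variant


def buggy_turn(s: Session, now: float, new_thinking: str) -> None:
    # Assumed bug shape: the flag is set on resume and never reset,
    # so older thinking is cleared on every later turn as well.
    if now - s.last_active > IDLE_THRESHOLD_S:
        s.resumed_stale = True
    if s.resumed_stale:
        s.thinking.clear()
    s.thinking.append(new_thinking)
    s.last_active = now


def fixed_turn(s: Session, now: float, new_thinking: str) -> None:
    # Intended behavior: clear once, on the first turn after the idle gap.
    if now - s.last_active > IDLE_THRESHOLD_S:
        s.thinking.clear()
    s.thinking.append(new_thinking)
    s.last_active = now


def run(turn, times):
    # Replay a sequence of turn timestamps (in seconds) through one session.
    s = Session()
    for i, t in enumerate(times):
        turn(s, t, f"t{i}")
    return s.thinking


# Three quick turns, a gap of more than an hour, then three more turns.
times = [0, 10, 20, 4000, 4010, 4020]
print(run(buggy_turn, times))  # ['t5']
print(run(fixed_turn, times))  # ['t3', 't4', 't5']
```

In the buggy variant the session retains only the most recent turn's thinking after the resume, which is one plausible way a model could come across as "forgetful and repetitive".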

I frequently have Claude Code sessions which I leave for an hour (or often a day or longer) before returning to them. Right now I have 11 of those (according to ps aux | grep 'claude ') and that's after closing down dozens more the other day.

I estimate I spend more time prompting in these "stale" sessions than sessions that I've recently started!

If you're building agentic systems it's worth reading this article in detail - the kinds of bugs that affect harnesses are deeply complicated, even if you put aside the inherent non-deterministic nature of the models themselves.
