Update on recent Claude Code quality reports: the problems stem from system architecture flaws, not the model itself
(Original title: An update on recent Claude Code quality reports)

simonwillison.net · 2026-04-24 · Excerpt

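Anthropic acknowledges that Claude Code's output quality declined over the past two months, but attributes the decline to three independent system-level problems: prompt engineering errors, flawed tool-call logic, and improper context management. These problems caused complex tasks to fail; the model's capabilities did not regress. The company has published a detailed postmortem and proposed fixes.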

Simon Willison

24th April 2026 - Link Blog

An update on recent Claude Code quality reports. It turns out the high volume of complaints that Claude Code was providing worse quality results over the past two months was grounded in real problems.

The models themselves were not to blame, but three separate issues in the Claude Code harness caused complex but material problems which directly affected users.

Anthropic's postmortem describes these in detail. This one in particular stood out to me:

"On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive."
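To make that failure mode concrete, here is a minimal Python sketch of one way a "clear once after an idle hour" rule can turn into "clear on every turn". It is purely illustrative: the quoted passage does not say how the bug was implemented, and the Session class, the resumed_stale flag, and both start_turn functions are hypothetical names invented for this example, not Claude Code internals.

```python
from dataclasses import dataclass, field

# "Idle for over an hour", as described in the quoted postmortem passage.
IDLE_THRESHOLD_SECONDS = 60 * 60


@dataclass
class Session:
    """Hypothetical session state; not Claude Code's real data model."""
    last_active: float
    thinking: list = field(default_factory=list)
    resumed_stale: bool = False  # hypothetical sticky flag used by the buggy variant


def start_turn_buggy(session, now, new_thinking):
    """One plausible shape of the bug: the 'stale' flag is set but never reset."""
    if now - session.last_active > IDLE_THRESHOLD_SECONDS:
        session.resumed_stale = True
    # The flag stays True for the rest of the session, so this runs on every turn.
    if session.resumed_stale:
        session.thinking.clear()
    session.thinking.append(new_thinking)
    session.last_active = now


def start_turn_fixed(session, now, new_thinking):
    """Intended behaviour: clear only on the turn that resumes a stale session."""
    if now - session.last_active > IDLE_THRESHOLD_SECONDS:
        session.thinking.clear()
    session.thinking.append(new_thinking)
    session.last_active = now


def simulate(start_turn):
    """Resume a session after two idle hours, then take two quick follow-up turns."""
    session = Session(last_active=0.0, thinking=["old-1", "old-2"])
    for i, now in enumerate([7200.0, 7210.0, 7220.0], start=1):
        start_turn(session, now, f"turn-{i}")
    return session.thinking


print(simulate(start_turn_buggy))  # ['turn-3']
print(simulate(start_turn_fixed))  # ['turn-1', 'turn-2', 'turn-3']
```

In the buggy variant only the latest turn's thinking survives, which matches the "forgetful and repetitive" symptom; in the fixed variant the older thinking is dropped once on resume and everything after that is retained.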

I frequently have Claude Code sessions which I leave for an hour (or often a day or longer) before returning to them. Right now I have 11 of those (according to ps aux | grep 'claude ') and that's after closing down dozens more the other day.

I estimate I spend more time prompting in these "stale" sessions than sessions that I've recently started!

If you're building agentic systems it's worth reading this article in detail - the kinds of bugs that affect harnesses are deeply complicated, even if you put aside the inherent non-deterministic nature of the models themselves.
