A discussion of AI image tests: WHY ARE YOU LIKE THIS

simonwillison.net · 2026-04-25 · Excerpt

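In response to Simon Willison's "pelican riding a bicycle" benchmark, a commenter suggested stacking more elements (such as a horse and an astronaut) on top of the existing test to make it harder. The reply is a humorous look at how a current AI image generation system handles a scene with several objects balanced on one another.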

Simon Willison

25th April 2026

@scottjla on Twitter in reply to my pelican riding a bicycle benchmark:

I feel like we need to stack these tests now

I checked to confirm that the model (ChatGPT Images 2.0) added the "WHY ARE YOU LIKE THIS" sign of its own accord, and it did. The prompt Scott used was:

Create an image of a horse riding an astronaut, where the astronaut is riding a pelican that is riding a bicycle. It looks very chaotic but they all just manage to balance on top of each other

Posted 25th April 2026 at 4:44 pm
