
Thoughts on AI image generation benchmarks: WHY ARE YOU LIKE THIS

simonwillison.net · 2026-04-25 · Excerpt

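In response to the pelican riding a bicycle benchmark, someone suggested adding more varied test cases on top of the existing test. The post shows an AI-generated image: a pelican rides a bicycle along a dirt road with a police car following behind, and the pelican looks alarmed, possibly because an astronaut (with oddly prehensile toes) is riding too. The example prompted further discussion of AI image generation capabilities and testing methods.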

Simon Willison

25th April 2026

@scottjla on Twitter in reply to my pelican riding a bicycle benchmark:

I feel like we need to stack these tests now

I checked to confirm that the model (ChatGPT Images 2.0) added the "WHY ARE YOU LIKE THIS" sign of its own accord and it did - the prompt Scott used was:

Create an image of a horse riding an astronaut, where the astronaut is riding a pelican that is riding a bicycle. It looks very chaotic but they all just manage to balance on top of each other

Posted 25th April 2026 at 4:44 pm

This is a note by Simon Willison, posted on 25th April 2026.

Tags: ai, generative-ai, slop, text-to-image, pelican-riding-a-bicycle
