To understand GPT-5.4's progress on knowledge work, you first need to understand the design logic behind the GDPval benchmark.
Chris was last on the show three years ago, as he was first stepping into the role, and we spent quite a bit of time then talking about his plans to collect more data, spin off parts of the company, and think about the future of collectibles… which, at that time, meant NFTs. Look, a lot’s happened in three years! NFTs just weren’t one of them. You’ll hear Chris laugh about this throughout, actually.
For details, see the newly added reference materials.
The industry data tells a consistent story of volume outrunning quality. CircleCI’s 2026 report measured the largest year-over-year jump in feature branch activity they’ve ever seen, up 59%, yet main branch deployments fell and build success hit a five-year low. Nearly 3 in 10 merges to main are now failing. Faros AI’s telemetry across 10,000+ developers says teams with high AI adoption merged 98% more PRs, but also that review time ballooned 91%. The reviews aren’t disappearing. The attention behind them is thinning. And it turns out AI doesn’t automatically help on this side of the equation either. Meta found that showing AI-generated patches to reviewers actually increased review time. They say it’s because reviewers felt obligated to verify the AI’s work on top of their own, but I’ve seen many other reasons for similar or larger increases in review time for AI-assisted work.
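Things get done by people, so at some point in the process above someone will inevitably try to appeal to your emotions, usually after a complaint has already resulted in an actual penalty. For example, after I filed two complaints in a row, a staff member from the pickup station came to my door unprompted to explain himself. He told me that every penalty ultimately gets passed down to him, and asked to exchange contact details so that in future I would reach out to him directly instead of filing another complaint. In other cases the courier will come to your door in person, likewise ask you to add their contact details, and then deliver your parcels for you. Either way, it shows that the pickup station and the courier have taken notice of your problem and have decided who will deal with you directly. The effect is the same.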