E7

Are AI benchmarks doomed?

May 1

1h 5m

Episode Description

AI benchmarks saturate quickly, struggle to capture what we care about, and cost more than ever to build. But are they doomed? Greg Burnham, who leads Epoch's benchmarking team, and Tom Adamczewski, who developed MirrorCode, push back on the pessimism and dig into what the next generation of AI benchmarks could look like.
