Episode Description
The two models that you will hear discussed for at least the next two months - Claude Opus 4.6 and GPT 5.3 Codex - were just released within 26 minutes of each other. Here is the full breakdown of around 250 pages of reports, with just the most interesting moments: the battle over which is best, Claude personhood, the surprising misbehaviour of Opus 4.6, and much more.
https://assemblyai.com/aiexplained
Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai
AI Insiders ($9): https://www.patreon.com/AIExplained
Chapters:
00:00 - Introduction
00:54 - Self-improvement?
02:44 - Knowledge Work
05:30 - Overly agentic behaviour
09:12 - Who Shouldn’t Use Claude Opus
11:39 - Step-change?
15:09 - Claude’s ‘Personhood’
Hassabis Roadmap: https://www.patreon.com/posts/hassabis-roadmap-149750869
Release of Opus 4.6: https://www.anthropic.com/news/claude-opus-4-6
212 Page System Card: https://www-cdn.anthropic.com/0dd865075ad3132672ee0ab40b05a53f14cf5288.pdf
Claude Code Tip: https://x.com/bcherny/status/2019475897691124107
GPT 5.3 Codex: https://openai.com/index/introducing-gpt-5-3-codex/
GPT 5.3 Codex System Card: https://openai.com/index/gpt-5-3-codex-system-card/
Browse Comp: https://arxiv.org/pdf/2504.12516v1
Finance Agent: https://www.vals.ai/benchmarks/finance_agent
Terminal Bench 2: https://arxiv.org/pdf/2601.11868
Vending Bench: https://andonlabs.com/blog/opus-4-6-vending-bench
My X post: https://x.com/AIExplainedYT/status/2016851303436095647
Anthropic Apology: https://x.com/ch402/status/2014066134194995256/photo/1
Altman rebuttal: https://x.com/sama/status/2019139174339928189
Altman rebuttal (continued): https://x.com/sama/status/2019140276246442089
4% of GitHub: https://x.com/dylan522p/status/2019490550911766763
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/