S6 E27
Reading model benchmarks like a pro, Mythos is looming, and Claude talk caveman, save big token
Episode Description
Is the secret to slashing your token costs by 65% forcing your LLM to speak like a caveman? This week on the Friday Deploy, Andrew and Ben test out a hilarious new Claude plugin that reduces AI output to primitive shorthand before diving into Anthropic's $100 million push to win the cybersecurity arms race with Project Glasswing. The hosts also unpack the sudden release of four game-changing open-source models—including Gemma 4 and Holo3—and explain why modern AI benchmarks are proving that humans still have a cognitive edge. Finally, they wrap up by sharing how they deploy custom background agents to hack their way through expo floors at industry conferences.
Read the guide: The APEX Framework
Follow the show:
- Subscribe to our Substack
- Follow us on LinkedIn
- Subscribe to our YouTube Channel
- Leave us a Review
Follow the hosts:
Follow today's stories:
- Project Glasswing
- Tool School: Benchmarking 101 (How To Read AI Model Report Cards)
- Four Open Models Just Proved You Can Own Frontier AI at Every Scale
- JuliusBrussee/caveman
OFFERS
- Start Free Trial: Get started with LinearB's AI productivity platform for free.
- Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.
LEARN ABOUT LINEARB
- AI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.
- AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.
- AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.
- MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.