Controlling AI Models from the Inside

January 20
43 mins

Episode Description

As generative AI moves into production, traditional guardrails and input/output filters can prove too slow, too expensive, or too limited. In this episode, Alizishaan Khatri of Wrynx joins Daniel and Chris to explore a fundamentally different approach to AI safety. They unpack the limits of today's black-box defenses, the role of interpretability, and how model-native runtime signals can enable safer AI systems.

Featuring: Alizishaan Khatri of Wrynx, with hosts Daniel and Chris
