Death by Uptime

December 8
1 hr

Episode Description

We hit a new (and disturbing!) failure mode recently when a production rack that had been up for several months saw every (!) compute sled's service processor become simultaneously unresponsive. Bryan and Adam were joined by the members of the Oxide team who debugged the vexing issue -- and reached its surprising root cause.

In addition to Bryan Cantrill and Adam Leventhal, we were joined by Oxide colleagues Cliff Biffle, Matt Keeter, and Will Chandler.

Previously, on Oxide and Friends:

Some of the topics we hit on, in the order that we hit them:

If we got something wrong or missed something, please file a PR! Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!
