Episode Transcript
1
00:00:04,431 --> 00:00:10,281
The original title for this episode was AI
Killed the QA Star, which would make more
2
00:00:10,281 --> 00:00:14,811
sense if you knew the eighties lore of
the very first music video played on MTV.
3
00:00:15,201 --> 00:00:18,141
Um, that's a music television channel.
4
00:00:18,141 --> 00:00:20,541
Back, back in the day, how you
watched videos before the internet.
5
00:00:20,911 --> 00:00:25,141
That was in 1981 and it was called
Video Killed the Radio Star.
6
00:00:25,141 --> 00:00:30,091
But I decided that a deep-cut title
was too obscure for this conversation.
7
00:00:30,214 --> 00:00:34,464
Yet the question still remains, could
the increased velocity of shipping
8
00:00:34,464 --> 00:00:40,254
AI generated code cause businesses
to leave human based QA behind,
9
00:00:40,621 --> 00:00:43,501
presumably, because we're not gonna
hire any more of them, and we don't
10
00:00:43,501 --> 00:00:46,951
want to grow those teams and operations
teams just because we created AI code.
11
00:00:47,401 --> 00:00:52,051
And would we start relying more on
production observability to detect code
12
00:00:52,051 --> 00:00:54,571
issues that affect user experience?
13
00:00:55,031 --> 00:00:59,441
And that's the theory of today's
guest, Andrew Tunall, the President
14
00:00:59,441 --> 00:01:01,721
and Chief Product Officer at Embrace.
15
00:01:02,111 --> 00:01:05,261
They're a mobile observability
platform company that I first
16
00:01:05,261 --> 00:01:07,271
met at KubeCon London this year.
17
00:01:07,646 --> 00:01:12,776
Their pitch was that mobile apps were
ready for the full observability stack
18
00:01:12,986 --> 00:01:18,206
and that we now have SDKs to let mobile
dev teams integrate with the same
19
00:01:18,206 --> 00:01:22,766
tools that we platform engineers and
DevOps people and operators have been
20
00:01:22,766 --> 00:01:24,446
building and enjoying for years now.
21
00:01:25,046 --> 00:01:30,506
I can tell you that we don't yet have
exactly a full picture on how AI will
22
00:01:30,506 --> 00:01:35,216
affect those roles, but I can tell you
that business management is being told
23
00:01:35,546 --> 00:01:40,346
that similar to software development,
they can expect gains from using AI to
24
00:01:40,346 --> 00:01:46,046
assist or replace operators, testers,
build engineers, QA and DevOps.
25
00:01:46,436 --> 00:01:48,746
That's not true, or at least not yet.
26
00:01:49,316 --> 00:01:50,996
But it seems to be an expectation.
27
00:01:51,269 --> 00:01:56,559
And that's not gonna stop us from trying
to integrate LLMs into our jobs more.
28
00:01:56,559 --> 00:02:01,149
So I wanted to hear from observability
experts on how they think this
29
00:02:01,149 --> 00:02:02,319
is all going to shake out.
30
00:02:02,889 --> 00:02:05,829
So I hope you enjoyed this
conversation with Andrew of Embrace.
31
00:02:07,804 --> 00:02:08,304
Hello.
32
00:02:08,804 --> 00:02:09,204
Hey, Bret.
33
00:02:09,254 --> 00:02:09,734
How are you doing?
34
00:02:09,912 --> 00:02:14,542
Andrew Tunall is the president and
chief product officer of Embrace.
35
00:02:14,922 --> 00:02:20,152
And if you've not heard of Embrace,
they are, I think your claim to me the
36
00:02:20,152 --> 00:02:22,659
first time we talked was, you're the first
37
00:02:23,159 --> 00:02:27,579
mobile-focused or mobile-only
observability company in the
38
00:02:27,579 --> 00:02:29,009
Cloud Native Computing Foundation.
39
00:02:29,009 --> 00:02:30,289
Is that a correct statement?
40
00:02:30,789 --> 00:02:33,629
I don't know if in,
probably in CNCF, right?
41
00:02:33,649 --> 00:02:38,219
Because like we're, we were certainly
the first company that started going
42
00:02:38,219 --> 00:02:42,099
all in on OpenTelemetry as the means
with which we published instrumentation.
43
00:02:42,159 --> 00:02:45,049
obviously CNCF has a bunch of
observability vendors, some of
44
00:02:45,049 --> 00:02:49,849
which do mobile and web RUM,
but not completely focused on that.
45
00:02:49,849 --> 00:02:50,009
Yeah.
46
00:02:50,009 --> 00:02:50,789
I would say we're the first.
47
00:02:51,289 --> 00:02:51,599
Yeah.
48
00:02:51,719 --> 00:02:53,849
I mean, we can just, we can
say that until it's not true.
49
00:02:54,399 --> 00:02:57,919
The internet will judge us harshly,
for whatever claims we make today.
50
00:02:58,200 --> 00:02:58,620
Yeah.
51
00:02:59,120 --> 00:03:00,690
Well, and that just means
people are listening,
52
00:03:00,770 --> 00:03:01,170
Andrew.
53
00:03:01,170 --> 00:03:06,700
Okay, I told people on the internet that
we were going to talk about... so, the idea
54
00:03:06,760 --> 00:03:12,179
that you gave me, that QA is at risk of
falling behind. Because if we're,
55
00:03:12,550 --> 00:03:16,506
If we're producing more code, if we're
shipping more code because of AI, even
56
00:03:16,506 --> 00:03:20,116
the pessimistic stuff about, you know,
we've seen some studies in the last
57
00:03:20,116 --> 00:03:25,986
quarter around effective use of AI, it is
anywhere between negative 20% and positive
58
00:03:25,986 --> 00:03:30,136
30 percent in productivity, depending
on the team and the organization.
59
00:03:30,326 --> 00:03:33,866
So let's assume for a minute it's on
the positive side, and the AI has helped
60
00:03:34,096 --> 00:03:39,936
the team produce more code, ship more
releases, have, even in the perfect world
61
00:03:39,936 --> 00:03:45,036
that I see teams automating everything,
there's still usually, and almost always,
62
00:03:45,036 --> 00:03:48,706
in fact, I will say in my experience,
100% of cases, there is a human at
63
00:03:48,706 --> 00:03:50,466
some point during the deployment process.
64
00:03:50,666 --> 00:03:51,826
Whether that's QA.
65
00:03:52,191 --> 00:03:56,411
Whether that's a PR reviewer,
whether that's someone who spun
66
00:03:56,411 --> 00:03:59,931
up the test instances to run
it in a staging environment.
67
00:04:00,391 --> 00:04:01,491
There's always something.
68
00:04:01,501 --> 00:04:04,741
So tell me a little bit about where
this idea came from and what you
69
00:04:04,751 --> 00:04:06,491
think might be a solution to that.
70
00:04:06,991 --> 00:04:07,541
Yeah.
71
00:04:07,641 --> 00:04:11,731
I'll start with the, the belief that
I mean, AI is going to fundamentally
72
00:04:11,731 --> 00:04:14,851
alter the productivity of software
engineering organizations.
73
00:04:14,851 --> 00:04:18,201
I mean, I think the, CTOs I
talk to out there are making
74
00:04:18,201 --> 00:04:19,681
a pretty big bet that it will.
75
00:04:20,181 --> 00:04:22,981
you know, there's today and then
there's, okay, think about the, even
76
00:04:22,981 --> 00:04:26,781
just the pace that, that AI has evolved
in, in the past, the past year and
77
00:04:26,781 --> 00:04:29,421
you know, what it'll look like given
the investment that's flowing into it.
78
00:04:29,431 --> 00:04:34,341
But if you start with that, the claim
that AI is going to kill QA is more
79
00:04:34,341 --> 00:04:37,911
about the fact that we built our
software development life cycle under
80
00:04:37,911 --> 00:04:42,441
the assumption that software was slow to
build and relatively expensive to do so.
81
00:04:42,571 --> 00:04:47,201
And so if those start to change,
right, like a lot of the, systems
82
00:04:47,281 --> 00:04:50,471
and processes we put around, our
software development life cycle,
83
00:04:50,471 --> 00:04:51,771
they probably need to change too.
84
00:04:52,081 --> 00:04:56,371
Because ultimately, if you say, okay,
we had 10 engineers and formerly we were
85
00:04:56,371 --> 00:04:59,391
going to have to double that number to
go build the number of features we want.
86
00:04:59,421 --> 00:05:04,541
And suddenly those engineers become, more
productive to go build the features and
87
00:05:04,541 --> 00:05:07,201
capabilities you want inside your apps.
88
00:05:07,701 --> 00:05:11,651
I find it really hard to believe that
those organizations are going to make
89
00:05:11,651 --> 00:05:16,781
the same investment in QA organizations
and automated testing, etc. to keep
90
00:05:16,791 --> 00:05:18,581
pace with that level of development.
91
00:05:18,931 --> 00:05:22,511
The underlying hypothesis was like
productivity allows them to build
92
00:05:22,511 --> 00:05:23,871
more software cheaper, right?
93
00:05:23,911 --> 00:05:27,081
And like cheaper rarely correlates
with adding more humans to the loop.
94
00:05:27,281 --> 00:05:28,651
Yeah, the promise.
95
00:05:29,151 --> 00:05:30,931
Like I make this joke and it's probably
96
00:05:31,431 --> 00:05:34,521
not true in most cases, 'cause it's a
little old, but I used to say things
97
00:05:34,521 --> 00:05:38,141
like, yeah, CIO magazine told them to
deploy Kubernetes or, you know, like
98
00:05:38,141 --> 00:05:44,201
whatever, whatever executive, VP level, C-level,
CIO, CTO, and you're at that chief level.
99
00:05:44,201 --> 00:05:45,471
So I'm making a joke about you, but,
100
00:05:45,811 --> 00:05:45,851
funny.
101
00:05:46,187 --> 00:05:49,307
but that's when you're not in
the engineering ranks and you
102
00:05:49,307 --> 00:05:52,067
maybe have multiple levels of
management, you get that sort of
103
00:05:52,567 --> 00:05:55,957
overview where you're reading and you're
discussing things with other suits.
104
00:05:56,147 --> 00:06:00,187
You're reading the suits magazines,
the CIO magazines, the uh, IT
105
00:06:00,187 --> 00:06:01,657
pro magazines and all that stuff.
106
00:06:01,793 --> 00:06:05,503
yeah, the point I'm making here is
that, people are being told that AI
107
00:06:05,503 --> 00:06:06,933
is going to save the business money.
108
00:06:07,003 --> 00:06:10,453
I, I think I've said this to several
podcasts already, but, you weren't here.
109
00:06:10,663 --> 00:06:15,633
So I'm telling you that I was in a
media event in London, KubeCon, and,
110
00:06:15,653 --> 00:06:18,323
they gave me a media pass for some
reason, because I make a podcast.
111
00:06:18,323 --> 00:06:21,503
So they act like I'm a journalist
standing around all real journalists
112
00:06:21,523 --> 00:06:26,473
and pretending, but I was there and
I watched an analyst who's, from my
113
00:06:26,473 --> 00:06:29,393
understanding, I don't actually know
the company, but they sound like the
114
00:06:29,413 --> 00:06:31,083
Gartner of Europe or something like that.
115
00:06:31,583 --> 00:06:35,003
and the words came out of the mouth
at an infrastructure conference,
116
00:06:35,023 --> 00:06:39,213
the person said, my clients
are looking for humanless ops.
117
00:06:39,713 --> 00:06:43,483
And I visibly, I think,
chuckled in the room because
118
00:06:43,483 --> 00:06:44,353
I thought, well, that's great.
119
00:06:44,383 --> 00:06:47,913
That's rich that you're
at a 12,000-ops-person
120
00:06:47,953 --> 00:06:52,013
conference telling us that your
companies want none of this.
121
00:06:52,484 --> 00:06:54,124
these are all humans here doing this work.
122
00:06:55,150 --> 00:07:00,820
the premise of your argument about QA
is my exact same thought: nobody
123
00:07:00,820 --> 00:07:05,080
is budgeting for more QA or more
operations personnel or more DevOps
124
00:07:05,110 --> 00:07:10,340
personnel, just because the engineers
are able to produce more code with AI,
125
00:07:10,440 --> 00:07:12,700
in the app teams and the feature teams.
126
00:07:13,120 --> 00:07:16,400
And so we've all got to do better
and we've got to figure out where
127
00:07:16,460 --> 00:07:19,050
AI can help us because if we don't.
128
00:07:19,550 --> 00:07:21,710
They're going to, they're just
going to hire the person that
129
00:07:21,710 --> 00:07:24,860
says they can, even though maybe
it's a little bit of a shit show.
130
00:07:25,085 --> 00:07:28,025
my belief is that software
organizations need to change
131
00:07:28,455 --> 00:07:32,795
the way they work for an AI future, so
that might be cultural changes, it might
132
00:07:32,795 --> 00:07:37,635
be role changes, it might be, the, you
know, the word, the words like human in
133
00:07:37,635 --> 00:07:41,125
the loop get tossed around a lot when
it's about engineers interacting with
134
00:07:41,125 --> 00:07:44,435
AI, and the question is, okay, what
does that actually look like, right?
135
00:07:44,445 --> 00:07:48,625
Are we just reviewing AI's PRs and,
you know, kind of blindly saying, yep,
136
00:07:48,845 --> 00:07:51,725
like they wrote the unit tests for
something that works, or is it like
137
00:07:51,725 --> 00:07:55,555
we're actually doing critical thinking,
critical systems thinking, unique
138
00:07:55,565 --> 00:08:00,345
thinking that allows us as owners of
the business and our users' success
139
00:08:00,845 --> 00:08:03,925
to design and build better
software with AI as a tool.
140
00:08:04,205 --> 00:08:07,185
and it's not just QA, it's kind of
like all along the software development
141
00:08:07,185 --> 00:08:10,815
life cycle, how do we put the right
practices in place, and how do we build
142
00:08:10,815 --> 00:08:15,735
an organization that actually allows us,
with this new AI driven future, you know,
143
00:08:15,735 --> 00:08:19,965
whether it's agents, doing work on our
behalf, or whether it's us just with AI
144
00:08:19,965 --> 00:08:22,405
assisted coding, to build better software.
145
00:08:22,595 --> 00:08:25,785
And, yeah, I'm interested in what that
looks like over the next couple of years.
146
00:08:26,226 --> 00:08:30,746
That's kind of the premise of my,
the new Agentic DevOps podcast.
147
00:08:30,746 --> 00:08:34,466
And also as I'm building out this new
GitHub Actions course, I'm realizing that
148
00:08:34,966 --> 00:08:36,786
I'm having to make up the best practices.
149
00:08:36,806 --> 00:08:40,526
I'm having to figure out what is risky
and what's not because no one has really
150
00:08:40,526 --> 00:08:46,476
figured this out yet in any great detail
and, in fact, at, KubeCon London in April,
151
00:08:46,476 --> 00:08:53,316
which feels like a lifetime ago, there
was only one talk around using AI anywhere
152
00:08:53,316 --> 00:08:55,446
in the DevOps and operations path
153
00:08:55,946 --> 00:08:56,886
to benefit us.
154
00:08:56,916 --> 00:08:59,876
It was all about how to
run infrastructure for AI.
155
00:09:00,176 --> 00:09:04,236
And granted, KubeCon is an infrastructure
conference for platform engineering
156
00:09:04,246 --> 00:09:06,066
builders and all that, so it makes sense.
157
00:09:06,326 --> 00:09:10,556
But the fact that really only one talk,
and it was a team from Cisco, which
158
00:09:10,576 --> 00:09:13,446
I don't think of Cisco as like the
bleeding edge AI company, but it was a
159
00:09:13,446 --> 00:09:18,251
team from Cisco, simply trying to get
a workflow, or maybe you would call it
160
00:09:18,251 --> 00:09:22,411
like an agentic workflow for PR review,
which in my case, in that case, I'm
161
00:09:22,411 --> 00:09:26,451
presuming that humans are still writing
their code and AI is reviewing the code.
162
00:09:26,701 --> 00:09:30,761
I'm actually, I was just yesterday, I
was on GitHub trying to figure out if
163
00:09:30,771 --> 00:09:35,111
I, if there was a way to make branch
rules or, some sort of automation
164
00:09:35,111 --> 00:09:41,536
rule that if the AI wrote the PR,
the AI doesn't get to review the PR
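A minimal sketch of the kind of branch guardrail being described here, assuming a CI step with a GitHub token; the bot logins and the CI wiring are illustrative assumptions, not a built-in GitHub feature:

```python
# Sketch of a merge gate: if a bot authored the PR, require at least one
# approval from a non-bot reviewer. Assumes REPO ("owner/name"), PR_NUMBER,
# and GITHUB_TOKEN are provided by the CI job; the bot logins are examples.
import os
import sys
import requests

BOT_AUTHORS = {"copilot-swe-agent[bot]", "devin-ai-integration[bot]"}  # assumed names

def gh(path: str):
    resp = requests.get(
        f"https://api.github.com/repos/{os.environ['REPO']}{path}",
        headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

pr_number = os.environ["PR_NUMBER"]
pr = gh(f"/pulls/{pr_number}")
reviews = gh(f"/pulls/{pr_number}/reviews")

if pr["user"]["login"] in BOT_AUTHORS:
    human_approved = any(
        r["state"] == "APPROVED" and r["user"]["type"] != "Bot" for r in reviews
    )
    if not human_approved:
        sys.exit("AI-authored PR with no human approval; failing the check.")
print("Review guardrail satisfied.")
```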
165
00:09:42,036 --> 00:09:42,996
Yeah, yeah, right.
166
00:09:43,287 --> 00:09:45,047
we've got this coming
at us from both angles.
167
00:09:45,047 --> 00:09:47,027
We've got AI in our IDEs.
168
00:09:47,067 --> 00:09:51,177
We've got, multiple companies,
Devin and GitHub themselves.
169
00:09:51,177 --> 00:09:54,127
They now have the GitHub
Copilot code agent.
170
00:09:54,137 --> 00:09:55,427
I have to make sure I
get that term right.
171
00:09:55,787 --> 00:09:59,177
GitHub Copilot code agent,
which will write the PR and
172
00:09:59,177 --> 00:10:00,927
the code to solve your issue.
173
00:10:01,427 --> 00:10:05,367
And then they have a PR
code Copilot reviewer
174
00:10:05,662 --> 00:10:07,302
agent that will review the code.
175
00:10:07,802 --> 00:10:12,842
It's the same models, different
context, but, it feels like that
176
00:10:12,842 --> 00:10:14,332
doesn't, that's not a human in the loop.
177
00:10:14,572 --> 00:10:19,242
So we're going to need these guardrails
and these checks in order to make sure
178
00:10:19,242 --> 00:10:23,802
that code didn't end up in production
with literally no human eyeballs
179
00:10:24,002 --> 00:10:28,552
ever in the path of that code being
created, reviewed, tested, and shipped.
180
00:10:28,932 --> 00:10:29,832
cause we can do that now.
181
00:10:29,852 --> 00:10:30,532
Like we did, we
182
00:10:30,576 --> 00:10:31,836
Yeah, totally, you're totally good.
183
00:10:31,856 --> 00:10:35,396
And I mean, you can easily perceive the
mistakes that could happen too, right?
184
00:10:35,396 --> 00:10:39,566
I mean, before I, I took this role, at
Embrace, I was at New Relic for four and a
185
00:10:39,576 --> 00:10:41,356
half years, and before that I was at AWS.
186
00:10:41,656 --> 00:10:44,786
And so, obviously I've spent a
lot of time around CloudFormation
187
00:10:44,787 --> 00:10:46,426
templates, Terraform, etc.
188
00:10:46,926 --> 00:10:49,696
You can see a world where, you
know, AI builds your CloudFormation
189
00:10:49,706 --> 00:10:53,496
template for you and selects an EC2
instance type because the information
190
00:10:53,496 --> 00:10:57,176
it has says your workload is
optimal for this EC2 instance type.
191
00:10:57,646 --> 00:11:00,756
But in the region you're running,
that instance type's not freely
192
00:11:00,756 --> 00:11:02,616
available for you to autoscale to.
193
00:11:02,656 --> 00:11:05,106
And pretty soon, you go try
to provision more instances,
194
00:11:05,106 --> 00:11:06,546
and poof, you hit your cap.
195
00:11:06,586 --> 00:11:11,726
Because, that instance type just
doesn't have availability in Singapore.
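The failure he's describing is checkable up front; a rough sketch with boto3, where the region and instance type are example values, and note this checks whether the type is offered at all, not whether capacity is free at that moment:

```python
# Rough pre-deploy check: is this instance type even offered in the target
# region? Values are examples; an offering is not a guarantee of live capacity.
import boto3

def instance_type_offered(instance_type: str, region: str) -> bool:
    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.describe_instance_type_offerings(
        LocationType="region",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )
    return len(resp["InstanceTypeOfferings"]) > 0

if not instance_type_offered("m7i.4xlarge", "ap-southeast-1"):  # Singapore
    raise SystemExit("Instance type not offered in this region; pick another.")
```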
196
00:11:11,796 --> 00:11:16,556
And, us as humans and operators, we learn
a lot about our operating environments, we
197
00:11:16,556 --> 00:11:20,636
learn about our workloads, we learn about
the, the character, the peculiarities
198
00:11:20,636 --> 00:11:24,626
of them that don't make sense to a
computer, but are, like, based on reality.
199
00:11:25,126 --> 00:11:27,626
over time, maybe AI gets really
good at those things, right?
200
00:11:27,626 --> 00:11:31,616
But the question is, how do we build
the culture into kind of be guiding our,
201
00:11:31,656 --> 00:11:36,560
our, you know, army of assistants to
build software that really works for our
202
00:11:36,560 --> 00:11:41,510
users, instead of just trusting it to go
do the right thing, because we view, you
203
00:11:41,510 --> 00:11:44,730
know, everything as having a true and
pure result, which I don't think is true.
204
00:11:44,730 --> 00:11:47,350
a lot of the tech we build is
for people who build consumer
205
00:11:47,350 --> 00:11:48,510
mobile apps and websites.
206
00:11:48,580 --> 00:11:50,360
I mean, that is what the
tech we build is for.
207
00:11:50,860 --> 00:11:54,790
And you can easily see, you know, some
of our, our engineers have been playing
208
00:11:54,790 --> 00:11:59,770
around with using AI assisted coding to
implement our SDK in a consumer mobile
209
00:11:59,770 --> 00:12:01,790
app, and it works quite well, right?
210
00:12:02,290 --> 00:12:05,790
You can see situations where, an
engineer gets asked by somebody, a
211
00:12:05,800 --> 00:12:09,500
product manager, a leader to say, Hey,
you know, we got a note from marketing.
212
00:12:09,500 --> 00:12:13,300
They want to implement this new
attribution SDK that's going to go
213
00:12:13,610 --> 00:12:17,010
build up a profile of our users
and help us build more, you know,
214
00:12:17,140 --> 00:12:18,970
customer friendly experiences.
215
00:12:19,470 --> 00:12:22,780
You have the bots go do it, it tests
the code, everything works just fine.
216
00:12:22,820 --> 00:12:29,320
And then, for some reason, that SDK makes
an uncached request out to US West 2
217
00:12:29,330 --> 00:12:33,110
for users globally, which, you know, for
your users in Southeast Asia ends up
218
00:12:33,110 --> 00:12:37,480
adding an additional six and a half
seconds to app startup, because physics.
219
00:12:37,980 --> 00:12:39,720
And, what do those users do?
220
00:12:39,730 --> 00:12:41,750
if you start an app and it
221
00:12:41,980 --> 00:12:46,590
sits there hanging for four, five,
six seconds, and you have to use
222
00:12:46,590 --> 00:12:49,560
the app because, you know, you're
waiting for your boarding pass to come
223
00:12:49,560 --> 00:12:51,380
up and you're about to get on the plane.
224
00:12:51,580 --> 00:12:53,770
You probably perceive it
as broken, and abandon it.
225
00:12:53,800 --> 00:12:58,190
And to me, that's like a reliability
problem that requires systems thinking
226
00:12:58,190 --> 00:13:01,600
and cultural design around how your
engineering organization works, to avoid.
227
00:13:01,915 --> 00:13:03,765
one that I don't think is
immediately solved with AI.
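Measuring that scenario is mostly about putting a span around the third-party init; a sketch in Python using the OpenTelemetry API (a real consumer app would do this in Kotlin or Swift via an OTel-based SDK, and the span and attribute names here are made up):

```python
# Illustrative only: wrap a third-party SDK's startup work in a child span of
# app startup, so a slow cross-region call is visible per region in traces.
# Span and attribute names are invented; mobile apps would use Kotlin/Swift.
import time
from opentelemetry import trace

tracer = trace.get_tracer("app.startup")

def init_attribution_sdk():
    time.sleep(0.1)  # stand-in for the SDK's network call at startup

with tracer.start_as_current_span("app_startup"):
    with tracer.start_as_current_span("attribution_sdk_init") as span:
        start = time.monotonic()
        init_attribution_sdk()
        span.set_attribute("sdk.init.duration_s", time.monotonic() - start)
        # Alerting on this attribute by region would surface a 6.5-second
        # init in Southeast Asia that looks fine from us-west-2.
```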
228
00:13:04,265 --> 00:13:04,715
right.
229
00:13:05,075 --> 00:13:09,655
Observability is only getting more complex.
Even, you know, it's a cat and mouse game,
230
00:13:09,655 --> 00:13:13,815
I think, in all regards, and I
think everything we do has a yin and yang. I'm
231
00:13:13,815 --> 00:13:19,855
watching organizations that are full
steam ahead, aggressively using AI, and...
232
00:13:21,225 --> 00:13:22,735
I'd love to see some stats.
233
00:13:22,755 --> 00:13:26,795
I don't know if you have empirical
evidence or sort of anecdotal stuff
234
00:13:26,795 --> 00:13:31,585
from your clients, but where you see them
accelerate with AI and then almost have a
235
00:13:32,085 --> 00:13:35,955
pulling back effect because they realize
how easy it is for them to ship more bugs.
236
00:13:36,455 --> 00:13:40,915
and then sure, we could have the AI
writing tests, but it can also write
237
00:13:40,915 --> 00:13:43,465
really horrible tests, or it can
just delete tests because it doesn't
238
00:13:43,465 --> 00:13:44,725
like them because they keep failing.
239
00:13:45,025 --> 00:13:46,485
It's done that to me multiple times.
240
00:13:46,495 --> 00:13:48,685
In fact, I think, Oh, what's his name?
241
00:13:48,685 --> 00:13:50,125
Gene, not, it wasn't Gene Kim.
242
00:13:50,425 --> 00:13:53,795
There's a, a famous, it might've been
the guy who created Extreme Programming.
243
00:13:54,195 --> 00:13:58,125
I think I heard him on a podcast talking
about how he wished he could make certain all
244
00:13:58,125 --> 00:13:59,695
of his test files have to be read-only.
245
00:14:00,195 --> 00:14:04,185
Because he writes the tests, and then
he expects the AI to make them pass,
246
00:14:04,355 --> 00:14:07,355
and the AI will eventually give up
and then want to rewrite his tests.
247
00:14:07,375 --> 00:14:10,915
And he doesn't seem to be able to stop
the AI from doing that, other than just
248
00:14:11,255 --> 00:14:13,325
denying, deny, or cancel, cancel, cancel.
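One way to approximate that read-only wish is a CI gate rather than file permissions; a sketch, where the tests/ path, the base ref, and the override label are all assumptions:

```python
# Sketch of a CI gate approximating "read-only test files": fail when a change
# touches anything under tests/ unless a human applied an override label.
# BASE_REF, PR_LABELS, and the tests/ layout are assumptions for illustration.
import os
import subprocess
import sys

base = os.environ.get("BASE_REF", "origin/main")
diff = subprocess.run(
    ["git", "diff", "--name-status", f"{base}...HEAD", "--", "tests/"],
    capture_output=True, text=True, check=True,
).stdout.strip()

labels = os.environ.get("PR_LABELS", "")  # e.g. exported by the CI job
if diff and "tests-change-approved" not in labels:
    print(diff)
    sys.exit("Test files modified without human sign-off; refusing to pass.")
```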
249
00:14:13,665 --> 00:14:16,895
And there's a scenario where
that can easily happen in
250
00:14:16,895 --> 00:14:18,895
the automation of CI and CD,
251
00:14:19,160 --> 00:14:22,900
where, you know, suddenly it decided
that the tests failing were okay.
252
00:14:23,090 --> 00:14:24,900
And then it's going to push
it to production anyway, or
253
00:14:24,900 --> 00:14:26,620
whatever craziness ensues.
254
00:14:26,840 --> 00:14:28,840
Do you see stuff happening
on the ground there?
255
00:14:28,840 --> 00:14:29,110
That's
256
00:14:29,110 --> 00:14:35,660
did you read that article about, the,
AI driven store, a fake store purveyor
257
00:14:35,670 --> 00:14:37,860
that Anthropic created named Claudius?
258
00:14:38,020 --> 00:14:42,270
That, when it got pushback from
the Anthropic employees around
259
00:14:42,270 --> 00:14:43,890
what it was stocking its fake fridge,
260
00:14:43,890 --> 00:14:48,780
its fake store with, it called security
on the employees to try to displace them.
261
00:14:48,780 --> 00:14:52,330
I mean, it's, I think what you're
pointing to is like the moral compass of
262
00:14:52,330 --> 00:14:57,160
AI is not necessarily there, like that's
a very complex thing to solve, right?
263
00:14:57,160 --> 00:14:58,930
Is this the right thing to do or not?
264
00:14:58,940 --> 00:15:01,610
Because it's trying to solve the
problem however it can solve the
265
00:15:01,610 --> 00:15:03,160
problem with whatever tools it has.
266
00:15:03,170 --> 00:15:05,090
Not whether the problem
is the right one to solve.
267
00:15:05,090 --> 00:15:08,380
And right is, I mean, obviously, you
know, even humans struggle with this one.
268
00:15:08,880 --> 00:15:09,240
Right.
269
00:15:09,490 --> 00:15:11,450
I've just been labeling it all as taste.
270
00:15:11,790 --> 00:15:13,990
like the AI lacks taste
271
00:15:14,259 --> 00:15:14,999
Yeah, that's
272
00:15:15,240 --> 00:15:19,000
like in a certain team culture,
if you have a bad team culture,
273
00:15:19,000 --> 00:15:21,320
deleting tests might be acceptable.
274
00:15:21,330 --> 00:15:24,400
I used to work with a team
that would ignore linting.
275
00:15:24,420 --> 00:15:25,910
So I would implement linting.
276
00:15:26,070 --> 00:15:29,170
I'd give them the tools to customize
linting and then they would ignore
277
00:15:29,170 --> 00:15:30,540
it and accept the PR anyway.
278
00:15:30,940 --> 00:15:32,680
And did not care.
279
00:15:33,180 --> 00:15:35,720
They basically followed the
linting rules of the file.
280
00:15:35,750 --> 00:15:38,490
Now they were in a monolithic
repo with thousands and thousands
281
00:15:38,490 --> 00:15:39,870
of files in a 10-year-old Ruby app.
282
00:15:40,350 --> 00:15:42,960
But they would just ignore the linting.
283
00:15:43,340 --> 00:15:46,330
And I always saw that as a
culture problem for them.
284
00:15:46,790 --> 00:15:49,740
That I as the consultant can't,
I would always tell the boss that
285
00:15:49,740 --> 00:15:52,220
I was working for there that I
can't solve your culture problems.
286
00:15:52,220 --> 00:15:54,310
I'm just a consultant here trying
to help you with your DevOps.
287
00:15:54,570 --> 00:15:59,260
You need a full time person leading
teams to tell the rest of the team
288
00:15:59,260 --> 00:16:02,580
that this is important and that these
things matter over the course of
289
00:16:02,590 --> 00:16:04,260
years as you change out engineers.
290
00:16:04,260 --> 00:16:07,090
You can't just go with,
well, this file is tabs.
291
00:16:07,290 --> 00:16:08,920
This file is spaces.
292
00:16:08,940 --> 00:16:12,160
And this one follows, especially
in places like Ruby and Python and
293
00:16:12,160 --> 00:16:15,260
whatnot, where there's multiple, sort
of style guides and whatnot like that.
294
00:16:15,260 --> 00:16:17,670
But I just call that a taste
issue or culture issue.
295
00:16:18,080 --> 00:16:19,950
AI doesn't have culture or taste.
297
00:16:20,365 --> 00:16:22,155
It just follows the rules we give it.
298
00:16:22,655 --> 00:16:26,865
we're not really good yet as
an industry at defining rules.
299
00:16:26,865 --> 00:16:29,095
I think that's actually part
of the course I'm creating is
300
00:16:29,595 --> 00:16:32,475
figuring out all the different
places where we can put AI rules.
301
00:16:32,475 --> 00:16:34,025
We've seen them put in repos.
302
00:16:34,035 --> 00:16:36,605
We've seen them put in
your UI and your IDEs.
303
00:16:37,105 --> 00:16:41,115
Now I'm trying to figure out how do I
put that into CI and how do I put that
304
00:16:41,115 --> 00:16:45,765
into even like the operations if you're
going to try to put AIs somewhere in your,
305
00:16:46,195 --> 00:16:49,795
somewhere where they're going to help with
troubleshooting or visibility or discovery
306
00:16:49,795 --> 00:16:52,395
of issues, like they also need to have
307
00:16:52,810 --> 00:16:55,570
taste, which is why I want to change all
my rules files to taste files.
308
00:16:55,570 --> 00:16:56,520
I guess that's the platform
309
00:16:56,520 --> 00:16:56,920
I'm standing on.
310
00:16:58,790 --> 00:16:59,110
Yeah.
311
00:16:59,380 --> 00:16:59,746
Cause it,
312
00:16:59,784 --> 00:17:02,644
Yeah, I mean, at the same time, you
probably don't want to be reviewing,
313
00:17:02,764 --> 00:17:06,544
hundreds of PRs that an AI robot is
going through, just changing casing of
314
00:17:06,544 --> 00:17:11,374
legacy code to meet your new rule. Change
obviously introduces the opportunity
315
00:17:11,384 --> 00:17:15,084
for failure, bugs, etc. And if it's,
you know, it's somewhat arbitrary
316
00:17:15,114 --> 00:17:17,974
simply because you've given it a new
rule that this is what it has to do.
317
00:17:17,974 --> 00:17:21,724
I mean, I had a former coworker
many years ago, like 15, 20 years
318
00:17:21,724 --> 00:17:26,124
ago, who, was famous for just, going
through periodically and making
319
00:17:26,124 --> 00:17:30,004
nothing but casing code changes
because that's what he preferred.
320
00:17:30,014 --> 00:17:33,924
And it was just this, endless stream
of, trivialized changes on code we
321
00:17:33,924 --> 00:17:35,454
hadn't touched in months or years.
322
00:17:35,454 --> 00:17:35,694
That's it.
323
00:17:35,954 --> 00:17:38,284
that would inevitably lead to
some sort of problem, right?
324
00:17:38,294 --> 00:17:39,224
And because,
325
00:17:39,420 --> 00:17:40,420
The toil of that.
326
00:17:40,470 --> 00:17:43,520
I'm just like, yeah, you're
giving me heartburn just by
327
00:17:43,550 --> 00:17:44,690
telling me about that story.
328
00:17:45,190 --> 00:17:48,240
yeah, it's, well, I've
gotten over the PTSD of that.
329
00:17:48,330 --> 00:17:49,200
It's a long time ago.
330
00:17:49,700 --> 00:17:50,400
So, okay.
331
00:17:50,410 --> 00:17:52,040
So, what are you seeing on the ground?
332
00:17:52,090 --> 00:17:57,300
do you have some examples of
how this is manifesting in apps?
333
00:17:57,300 --> 00:17:59,250
I mean, you've already given me
a couple of things, but I'm just
334
00:17:59,250 --> 00:18:00,470
curious if you've got some more,
335
00:18:00,945 --> 00:18:02,615
I kind of alluded to it
at the very beginning.
336
00:18:02,615 --> 00:18:07,035
obviously, in my role, I talk to a lot
of senior executives about directionally
337
00:18:07,035 --> 00:18:11,185
where they're going with their engineering
organizations, because I think, the
338
00:18:11,185 --> 00:18:15,115
software we build, it does a bunch of
tactical things, but at a broader sense,
339
00:18:15,185 --> 00:18:19,845
it allows people to measure reliability as
expressed through, you know, our customers
340
00:18:20,135 --> 00:18:24,535
staying engaged in your experience by
virtue of the fact that they, otherwise
341
00:18:24,545 --> 00:18:28,635
liking your software, are having a better
technical experience with the apps you
342
00:18:28,635 --> 00:18:32,580
build, on the front end. And ultimately
measuring that and then thinking through
343
00:18:32,580 --> 00:18:36,340
all of the underlying root causes is a
cultural change in these organizations
344
00:18:36,340 --> 00:18:37,880
and how they think about reliability.
345
00:18:37,880 --> 00:18:41,830
It's no longer just like are
the API endpoints at the edge of
346
00:18:41,830 --> 00:18:45,100
our data center delivering their
payload, you know, responding in
347
00:18:45,110 --> 00:18:46,790
a timely manner and error free.
348
00:18:46,810 --> 00:18:52,110
And then the user experience is well
tested and, you know, we ship it to prod.
349
00:18:52,110 --> 00:18:52,200
And.
350
00:18:52,700 --> 00:18:53,350
That's enough.
351
00:18:53,850 --> 00:18:57,050
it really isn't a shift in how people
think about it, but when I talk to them,
352
00:18:57,050 --> 00:19:01,870
I mean, a lot of CTOs really are taking
a bet that AI is going to be the way
353
00:19:01,910 --> 00:19:05,580
that productivity gains, I won't say make
their business more efficient, but they
354
00:19:05,580 --> 00:19:07,640
do allow them to do more with less, right?
355
00:19:08,140 --> 00:19:11,750
You know, like it or not, especially
over the past few years, I mean, in B2B,
356
00:19:11,780 --> 00:19:16,160
we're in the B2B SaaS business, there's
been, you know, times have definitely
357
00:19:16,200 --> 00:19:21,460
been tougher than they were in the early
2020s, for kind of everyone. I think
358
00:19:21,460 --> 00:19:24,560
there's a lot of pressure in consumer
tech with tariffs and everything to do
359
00:19:24,580 --> 00:19:28,470
things more cost effectively, and first
and foremost on the grounds, this is a
360
00:19:28,530 --> 00:19:31,920
change we are seeing, whether we like
it or not, and, you know, we can argue
361
00:19:31,920 --> 00:19:35,310
about whether it's going to work and
how long it'll take, but the fact is
362
00:19:35,320 --> 00:19:40,220
that, like you said, you mentioned that
the CIO magazines, leadership is taking
363
00:19:40,220 --> 00:19:43,973
a bet this is going to happen, and I
think as we start to talk about that
364
00:19:43,973 --> 00:19:47,753
with these executives, the question is
like, is the existing set of tools I have
365
00:19:47,763 --> 00:19:49,883
to cope with that reality good enough?
366
00:19:50,293 --> 00:19:54,453
And, yeah, I guess my underlying
hypothesis is it probably isn't
367
00:19:54,453 --> 00:19:55,743
for most companies, right?
368
00:19:55,743 --> 00:20:01,063
if you think about the, the world of,
you know, the web as an example, like
369
00:20:01,503 --> 00:20:05,003
a lot of, a lot of companies that are
consumer facing tech companies will
370
00:20:05,003 --> 00:20:08,863
measure their Core Web Vitals, in
large part because it has SEO impact.
371
00:20:08,953 --> 00:20:12,793
And then they'll put an exception handler
in there, like Sentry that grabs, you
372
00:20:12,793 --> 00:20:16,953
know, kind of JavaScript errors and
tries to give you some level of impact.
373
00:20:17,453 --> 00:20:20,503
around, you know, how many are impacting
your users and whether it's high
374
00:20:20,503 --> 00:20:23,723
severity and then you kind of have to
sort through them to figure out which
375
00:20:23,723 --> 00:20:25,553
ones really matter for you to solve.
376
00:20:26,053 --> 00:20:31,063
so take, existing user frustration
with, human efficiency of delivering
377
00:20:31,063 --> 00:20:33,173
code and the pace, the existing pace.
378
00:20:33,673 --> 00:20:37,423
users are already frustrated that
Core Web Vitals are hard for them to
379
00:20:37,423 --> 00:20:40,873
quantify what that really means in
terms of user impact and whether a user
380
00:20:40,873 --> 00:20:42,423
decides to stay on the site or not.
381
00:20:42,423 --> 00:20:42,439
so, yeah.
382
00:20:42,758 --> 00:20:45,688
and the fact that they're overwhelmed with
the number of JavaScript errors that could
383
00:20:45,698 --> 00:20:49,708
be out there because it's really, I mean,
you go to any site, go to developer tools.
384
00:20:50,008 --> 00:20:53,998
look at the number of JavaScript errors
you see, and then, you know, take your
385
00:20:53,998 --> 00:20:58,228
human, experienced idea of how you're
interacting with the site, and chances
386
00:20:58,228 --> 00:20:59,878
are most of those don't impact you, right?
387
00:20:59,878 --> 00:21:04,198
They're just like, it's a, it's an
analytics pixel that failed to load, or
388
00:21:04,198 --> 00:21:07,868
it's a library that's just barfing some
error, but it's otherwise working fine.
389
00:21:08,368 --> 00:21:11,998
So take that and put it on steroids now
where you have a bunch of AI assisted
390
00:21:11,998 --> 00:21:15,468
developers doubling or tripling the
number of things they're doing or
391
00:21:15,468 --> 00:21:18,448
just driving more experimentation,
right, which I think a lot of
392
00:21:18,458 --> 00:21:19,998
businesses have always wanted to do.
393
00:21:20,008 --> 00:21:23,528
But again, software has been kind
of slow and expensive to build.
394
00:21:23,958 --> 00:21:26,998
And so if it's slow and expensive
to build, my thirst for delivering
395
00:21:26,998 --> 00:21:31,958
an experiment across three customer
cohorts when I can only, deliver one,
396
00:21:32,198 --> 00:21:36,198
given my budget, just means that, I
only have to test for one variation.
397
00:21:36,198 --> 00:21:39,468
Well, now triple that, or quadruple
it, and, you know, multiply that
398
00:21:39,468 --> 00:21:41,728
by a number of, arbitrary user factors.
399
00:21:41,988 --> 00:21:44,458
It just gets more challenging,
and I think we need to think about
400
00:21:44,458 --> 00:21:45,748
how we measure things differently.
401
00:21:46,085 --> 00:21:50,165
once the team has the tools in place to
manage multiple experiments at the same
402
00:21:50,165 --> 00:21:54,665
time, that just ramps up exponentially
until the team can't handle it anymore.
403
00:21:54,665 --> 00:21:58,205
But if AI is giving them a chance
to go further, then yeah, they're
404
00:21:58,205 --> 00:21:59,075
just, they're going to do it.
405
00:21:59,105 --> 00:21:59,575
I mean,
406
00:21:59,639 --> 00:22:01,999
yeah, I mean, you're going to get
overwhelmed with support tickets
407
00:22:01,999 --> 00:22:04,959
and bad app reviews and whatever it
is, which I think most people are.
408
00:22:05,159 --> 00:22:08,189
Most business leaders would be
pretty upset if that's how they, are
409
00:22:08,209 --> 00:22:10,009
responding to reliability issues.
410
00:22:10,109 --> 00:22:15,179
I was just gonna say, we've already had
decades now of pressure at all levels
411
00:22:15,179 --> 00:22:18,239
of engineering to reduce personnel.
412
00:22:18,519 --> 00:22:22,089
you need to justify every new hire pretty
significantly unless you're funded and
413
00:22:22,089 --> 00:22:25,239
you're just, you know, in an early stage
startup and they're just growing to the
414
00:22:25,399 --> 00:22:26,909
point that they can burn all their cash.
415
00:22:27,309 --> 00:22:31,339
having been around 30 years in
tech, I've watched operations
416
00:22:31,389 --> 00:22:33,374
get, you know, merged into DevOps.
417
00:22:33,374 --> 00:22:36,584
I've watched DevOps teams get, which
we didn't traditionally call them that.
418
00:22:36,584 --> 00:22:40,184
We might have called them sysadmins
or automation or build engineers
419
00:22:40,194 --> 00:22:43,554
or CI engineers, and they get
merged into the teams themselves.
420
00:22:43,554 --> 00:22:45,094
And the teams have to take
on that responsibility.
421
00:22:45,094 --> 00:22:48,364
I mean, we've got this weird culture
that we go to this Kubernetes conference
422
00:22:48,364 --> 00:22:52,374
all the time and one of the biggest
complaints is devs who don't want to
423
00:22:52,374 --> 00:22:57,144
be operators, but it's saddled on them
because somehow the industry got the
424
00:22:57,144 --> 00:22:59,949
word DevOps confused with something.
425
00:22:59,989 --> 00:23:02,929
And we all thought, Oh, that
means the developers can do ops.
426
00:23:03,159 --> 00:23:07,039
That's not what the word meant, but we've,
we've worked, we've gotten to this world
427
00:23:07,039 --> 00:23:12,279
where I'm getting hired as a consultant
to help teams deal with just the sheer
428
00:23:12,279 --> 00:23:16,009
amount of ridiculous expectations a
single engineer is supposed to have,
429
00:23:16,349 --> 00:23:18,339
not just the knowledge you're supposed
to have, but the systems you're supposed to
430
00:23:18,339 --> 00:23:20,449
be able to run while making features.
431
00:23:20,889 --> 00:23:26,259
And it's already, I feel like a
decade ago, it felt unsustainable.
432
00:23:26,579 --> 00:23:30,109
So now here we are having to
give some of that work to AI.
433
00:23:30,609 --> 00:23:35,199
When it's still doing random
hallucinations on a daily basis,
434
00:23:35,199 --> 00:23:36,679
at least even in the best models.
435
00:23:36,679 --> 00:23:40,289
I think I was just ranting yesterday
on a podcast that SWE-bench, which is
436
00:23:40,289 --> 00:23:45,179
like an engineering benchmark website
for AI models and how well they
437
00:23:45,179 --> 00:23:49,289
solve GitHub issues, essentially,
and the best models in the world
438
00:23:49,709 --> 00:23:51,989
can barely get two-thirds of them
439
00:23:51,989 --> 00:23:52,519
correct.
440
00:23:52,529 --> 00:23:56,389
And that's, if you're paying the
premium bucks and you've got the premium
441
00:23:56,389 --> 00:24:00,339
foundational models and you are on the
bleeding edge stuff, which most teams
442
00:24:00,339 --> 00:24:04,719
are not because they have rules or
limitations on which model they can use,
443
00:24:04,719 --> 00:24:08,129
or they can only use the ones in house,
or they can only use a particular one.
444
00:24:08,509 --> 00:24:10,979
and it's just, it's one of
those things where I feel like
445
00:24:10,979 --> 00:24:12,209
we're being pushed from all sides.
446
00:24:12,504 --> 00:24:16,094
and at some point, it's amazing
that any of this even works.
447
00:24:16,104 --> 00:24:18,364
It's amazing that apps
actually load on phones.
448
00:24:18,784 --> 00:24:21,494
it's just, it just, it feels like
garbage on top of garbage, on top
449
00:24:21,494 --> 00:24:23,714
of garbage, turtles all the way
down, whatever you want to call it.
450
00:24:24,214 --> 00:24:28,084
So where are you coming in here to
help solve some of these problems?
451
00:24:28,108 --> 00:24:29,318
yeah, that's fair, yeah.
452
00:24:29,448 --> 00:24:33,218
I'll even add to that: all of that's
even discounting the fact that new tools
453
00:24:33,218 --> 00:24:37,938
are coming that make it even simpler
to push out software that like barely
454
00:24:37,938 --> 00:24:41,228
works without even the guided hands
of a software engineer who has any
455
00:24:41,228 --> 00:24:43,158
professional experience writing that code.
456
00:24:43,658 --> 00:24:47,188
some of the vibe coding tools, like
our product management team uses,
457
00:24:47,348 --> 00:24:49,148
largely for rapid prototyping.
458
00:24:49,648 --> 00:24:51,688
And I can write OK Python.
459
00:24:51,788 --> 00:24:54,528
I used to write OK C#.
460
00:24:55,028 --> 00:24:58,058
I have never been particularly
good at writing JavaScript.
461
00:24:58,438 --> 00:25:01,908
I can read it OK, but like when
it comes to fixing a particular
462
00:25:01,908 --> 00:25:03,818
problem, I quickly get out of my depth.
463
00:25:04,318 --> 00:25:06,238
That's not to say, I couldn't
be capable of doing it.
464
00:25:06,238 --> 00:25:09,618
It's just not what I do every day,
nor do I particularly have the energy
465
00:25:09,618 --> 00:25:12,658
when I'm, you know, done with a full
workday to, go teach myself JavaScript.
466
00:25:12,808 --> 00:25:15,628
And, I'll build an app with
one of the vibe coding tools
467
00:25:15,838 --> 00:25:19,028
as a means of communicating
how I expect something to work.
471
00:25:34,908 --> 00:25:38,458
and I'm like, ah, it's better
to just delete the project
472
00:25:38,458 --> 00:25:39,558
and start all over again.
473
00:25:40,028 --> 00:25:43,158
And, you know, if you can make
it work, that doesn't necessarily
474
00:25:43,158 --> 00:25:44,398
mean it'll work at scale.
475
00:25:44,438 --> 00:25:47,908
It doesn't necessarily mean that there
aren't like a myriad of use cases
476
00:25:47,918 --> 00:25:52,118
you haven't tested for as you click
through the app in the simulator.
477
00:25:52,618 --> 00:25:54,948
and so, you know, I think the
question is, okay, you know, given
478
00:25:54,948 --> 00:25:57,598
the fact that we have to accept more
software is going to make its way
479
00:25:57,598 --> 00:26:01,068
into human beings' hands, because we
build software for human beings, it's
480
00:26:01,078 --> 00:26:02,318
going to get to more human beings.
481
00:26:02,818 --> 00:26:05,958
How do we build a, a reliability
paradigm where we can measure
482
00:26:05,958 --> 00:26:07,028
whether or not it's working?
483
00:26:07,038 --> 00:26:11,538
And I think that stops focusing
on, kind of, I guess to go back
484
00:26:11,538 --> 00:26:15,218
to the intentionally inflammatory
title of today's discussion, it
485
00:26:15,228 --> 00:26:18,523
stops focusing on like a zero
bug paradigm where I test things.
486
00:26:18,683 --> 00:26:22,503
I test every possible
pathway for my users.
487
00:26:22,523 --> 00:26:25,273
I, you know, have a set of
requirements, again, around
488
00:26:25,273 --> 00:26:27,033
trivialized performance and stuff.
489
00:26:27,533 --> 00:26:32,743
And, trying to put up these kinds of
barriers to getting code into human hands.
490
00:26:32,753 --> 00:26:36,543
And just accept the fact more code
is going to get into human hands faster
491
00:26:36,543 --> 00:26:38,323
at a pace I can't possibly control.
492
00:26:38,613 --> 00:26:42,063
And so, therefore, I have to put
measurements in the real world.
493
00:26:42,083 --> 00:26:44,392
So, I use a lot of different tools around
my app so that I can be as responsive
494
00:26:44,392 --> 00:26:48,032
as possible to resolving those issues
when I find them, which is I guess my,
495
00:26:48,402 --> 00:26:53,512
you know, Charity Majors, who co-founded
Honeycomb, she, her and I were talking a
496
00:26:53,512 --> 00:26:57,352
few months ago, and, big fan of stickers,
she shipped me an entire envelope of
497
00:26:57,362 --> 00:27:00,162
stickers and they're all like, you
know, ship fast and break things, right?
498
00:27:00,162 --> 00:27:02,122
I test in production, stuff like that.
499
00:27:02,622 --> 00:27:05,502
And somehow I feel like, you know,
in our world, because we build
500
00:27:05,502 --> 00:27:08,712
observability for front end and mobile
experiences, like web and mobile
501
00:27:08,712 --> 00:27:13,012
experiences, I feel like that message
just hadn't gotten through historically.
502
00:27:13,362 --> 00:27:16,682
part of it's because like release
cycles on mobile were really slow.
503
00:27:16,762 --> 00:27:19,432
you had to wait days for
an app to get out there.
504
00:27:19,452 --> 00:27:22,662
Part of it was software is expensive
to build and slow to build and so
505
00:27:22,682 --> 00:27:25,892
getting feature flags out there where
you can operate in production was hard.
506
00:27:26,392 --> 00:27:29,542
Part of it was just the observability
paradigm hadn't shifted, right?
507
00:27:29,572 --> 00:27:33,692
Like the, the paradigm of measure
everything and then find root cause
508
00:27:34,012 --> 00:27:36,022
had not made its way to frontend.
509
00:27:36,072 --> 00:27:40,492
It was more like measure the known
things you look for, like web exceptions,
510
00:27:40,492 --> 00:27:43,852
like Core Web Vitals, or on mobile,
look at crashes, and that's about it.
511
00:27:44,352 --> 00:27:48,032
And the notion of, okay, measure
whether, users are starting the app
512
00:27:48,032 --> 00:27:52,022
successfully, and when you see some
unknown outcomes start to occur or users
513
00:27:52,022 --> 00:27:55,712
start to abandon, how can you then sift
through the data to find the root cause?
514
00:27:56,132 --> 00:27:57,392
That hadn't really migrated its way over.
515
00:27:57,392 --> 00:27:58,472
And that's what we're trying to do.
516
00:27:58,532 --> 00:28:01,892
we're trying to bring that paradigm
of how do you define the, for
517
00:28:01,892 --> 00:28:03,512
lack of a better term, the APIs.
518
00:28:04,012 --> 00:28:07,332
The things that your humans interact
with, with your apps, the things they
519
00:28:07,332 --> 00:28:10,972
do, you don't build an API in your app
for human beings, you build a login
520
00:28:10,972 --> 00:28:15,432
screen, you build a checkout screen,
a cart experience, a product catalog.
521
00:28:15,932 --> 00:28:19,442
How do we take those things and measure
the success of them and then try to,
522
00:28:19,512 --> 00:28:23,992
attribute them to underlying technical
causes where your teams can have, better
523
00:28:24,002 --> 00:28:26,042
have those socio-technical conversations?
524
00:28:26,402 --> 00:28:30,042
so that they can understand and then
resolve them, probably using AI, right?
525
00:28:30,142 --> 00:28:30,712
as we grow.
526
00:28:30,762 --> 00:28:33,972
but like it, it allows better,
system knowledge and the interplay
527
00:28:33,972 --> 00:28:38,232
between real human activities and
the telemetry we're gathering.
528
00:28:38,732 --> 00:28:39,102
Yeah.
529
00:28:39,602 --> 00:28:42,422
Have you been, I'm just curious,
have you been playing with, having
530
00:28:42,452 --> 00:28:48,322
AI look at observability data,
whether it's logs or metrics?
531
00:28:48,382 --> 00:28:49,692
have you had any experience with that?
532
00:28:49,722 --> 00:28:54,702
I'm asking that simply as a generic
question, because the conversations
533
00:28:54,702 --> 00:28:59,152
I've had in the last few months, it
sounds like AI is much better at reading
534
00:28:59,152 --> 00:29:03,542
logs than it is at reading metrics or
dashboards or anything that sort of
535
00:29:03,562 --> 00:29:09,082
lacks context or, you know, it's not
like we're putting in alt text descriptions
536
00:29:09,082 --> 00:29:10,802
for every single dashboard graph.
537
00:29:10,802 --> 00:29:15,012
And that's probably all coming from
the providers, just because if
538
00:29:15,012 --> 00:29:16,532
they're expecting AI to look at stuff.
539
00:29:17,032 --> 00:29:20,232
they're gonna have to give more context,
but it sounds like it's not as easy
540
00:29:20,232 --> 00:29:24,552
as just giving AI access to all those
systems and saying, yeah, go read the
541
00:29:24,552 --> 00:29:29,062
website for my dashboard, Grafana,
and figure out what's the problem?
542
00:29:29,501 --> 00:29:33,331
I've seen it deployed in two ways, one
of which I find really interesting and
543
00:29:33,341 --> 00:29:36,661
something we're actively working on,
because I think it just has a high degree
544
00:29:36,661 --> 00:29:40,501
of utility, and, you know, given
the kind of state of LLMs, I think it's
545
00:29:40,511 --> 00:29:45,441
probably something that's relatively easy
to get right, which is the, the notion of,
546
00:29:45,491 --> 00:29:50,051
when you come into a product like ours,
or a Grafana, or, you know, a New Relic,
547
00:29:50,181 --> 00:29:53,751
you probably have an objective in mind,
maybe you have a question you're trying
548
00:29:53,751 --> 00:29:55,701
to ask, what's the health of this service?
549
00:29:55,731 --> 00:29:58,161
Or, I got paged on a particular issue.
550
00:29:58,161 --> 00:30:01,711
I'm going to, I need to build a chart
that shows me like the interplay between
551
00:30:01,721 --> 00:30:06,191
latency for this particular service
and the success rate of, you know,
552
00:30:06,201 --> 00:30:09,861
some other type of thing, more like
database calls or something like that.
553
00:30:10,361 --> 00:30:13,981
Today, like people broadly have to
manually create those queries and it
554
00:30:13,981 --> 00:30:17,451
requires a lot of human knowledge around
the query language or schema of your data.
555
00:30:17,461 --> 00:30:21,551
And I think there's a ton of opportunity
for us to simply ask a human question
556
00:30:21,621 --> 00:30:26,851
of, you know, show me a query of all
active sessions on this mobile app for
557
00:30:26,861 --> 00:30:32,181
the latest iOS version and the number
of, traces, like startup traces that
558
00:30:32,181 --> 00:30:33,831
took greater than a second and a half.
559
00:30:33,831 --> 00:30:37,471
And have it just simply pull up your
dashboard and query language and build
560
00:30:37,471 --> 00:30:40,991
the chart for you quite rapidly, which
is a massive time savings, right?
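For a sense of what that saves, the plain-English ask above might compile down to something like the following; the table and column names are entirely invented for illustration:

```python
# Hypothetical output of such an assistant: the hand-written query the user
# would otherwise need. Every table and column name here is invented.
generated_query = """
SELECT count(DISTINCT s.session_id)
FROM sessions AS s
JOIN spans AS t ON t.session_id = s.session_id
WHERE s.os = 'iOS'
  AND s.os_version = (SELECT max(os_version) FROM sessions WHERE os = 'iOS')
  AND t.name = 'app-startup'
  AND t.duration_ms > 1500
"""
print(generated_query)
```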
562
00:30:41,591 --> 00:30:46,441
It also just makes our tech, which
you're right, can get quite complex, more
563
00:30:46,441 --> 00:30:48,201
approachable by your average engineer.
564
00:30:48,231 --> 00:30:51,581
Which, you know, I'm a big believer that
if every engineer in your organization
565
00:30:51,581 --> 00:30:53,331
understands how your systems work
566
00:30:53,766 --> 00:30:55,866
and the data around it, you're going
to build a lot better software,
567
00:30:55,926 --> 00:30:57,516
especially as they use AI, right?
568
00:30:57,516 --> 00:31:01,036
Because now they, they understand
how things work and they can better
569
00:31:01,096 --> 00:31:03,056
provide instructions to the robots.
570
00:31:03,556 --> 00:31:05,636
so I think that's really
a useful, interesting way.
571
00:31:05,646 --> 00:31:10,016
And we've seen people start to roll that
type of assistant functionality out.
572
00:31:10,516 --> 00:31:14,516
The second way I've seen it deployed, I
see mixed results with, which is, I mean, in an
573
00:31:14,516 --> 00:31:17,726
incident, go look at every potential
signal that I can see related to this
574
00:31:17,726 --> 00:31:20,516
incident and try to tell me what's
going on and get to the root cause, and
575
00:31:20,526 --> 00:31:24,326
more often than not, I find it's just a
summarization of stuff that you, as an
576
00:31:24,326 --> 00:31:27,986
experienced user, probably would come to
the exact conclusions on, I think there's
577
00:31:28,006 --> 00:31:33,086
utility there, certainly, it gets you a
written summary quickly of what you see.
578
00:31:33,396 --> 00:31:37,896
But I do also worry that, it doesn't
apply a high degree of critical thinking.
579
00:31:38,066 --> 00:31:42,986
and you know, an example of where lacking
context, I mean, it wouldn't be very
580
00:31:42,986 --> 00:31:47,196
smart, right, is you've probably seen
it, every traffic chart, around service
581
00:31:47,206 --> 00:31:52,256
traffic, depending upon how it runs,
tends to be pretty lumpy with time of day.
582
00:31:52,306 --> 00:31:57,806
Because most companies don't
have equivalent distribution
583
00:31:57,806 --> 00:31:59,136
of traffic across the globe.
584
00:31:59,935 --> 00:32:02,925
Not every country across the globe
has an equivalent population.
585
00:32:03,425 --> 00:32:06,825
And so you'd tend to see these spikes
of where, you know, you have a number
586
00:32:06,825 --> 00:32:11,335
of service requests spiking during
daylight hours or the Monday of every
587
00:32:11,335 --> 00:32:14,665
week because people come into the office
and suddenly start e-commerce shopping.
588
00:32:15,165 --> 00:32:18,605
And you see it taper throughout the
week, or it tapers into the evening.
589
00:32:19,095 --> 00:32:19,955
I think that's normal.
590
00:32:19,955 --> 00:32:22,465
You understand that as an
operator of your services, because
591
00:32:22,465 --> 00:32:23,675
it's unique to your business.
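A context-free threshold misfires on exactly that lumpiness; a toy sketch of the seasonal baseline an operator carries in their head, comparing the same hour-of-week across prior weeks (the 3x threshold and the four-week window are arbitrary illustration values):

```python
# Toy seasonal baseline: compare this reading only against the same
# hour-of-week in prior weeks, so the Monday-morning surge reads as normal.
# The 3x multiplier and the four-week window are arbitrary choices.
from statistics import mean

def is_anomalous(current: float, same_hour_prior_weeks: list[float]) -> bool:
    baseline = mean(same_hour_prior_weeks)
    return baseline > 0 and current > 3 * baseline

# This Monday at noon vs. the last four Mondays at noon:
print(is_anomalous(5200.0, [4800.0, 5100.0, 4950.0, 5300.0]))  # False: seasonal
```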
592
00:32:23,695 --> 00:32:28,605
I think the AI would struggle, lacking
context around, your business to
593
00:32:28,605 --> 00:32:32,955
understand, somewhat normal fluctuations,
or the fact that, you know, marketing
594
00:32:32,965 --> 00:32:36,855
dropped a campaign where there's no
data inside your observability system
595
00:32:36,855 --> 00:32:38,455
to tell you that that campaign dropped.
596
00:32:38,455 --> 00:32:39,505
It's not a release.
597
00:32:39,885 --> 00:32:41,445
So your AI is lacking context.
598
00:32:41,445 --> 00:32:45,875
It's lacking the historical context
that the humans already have implicitly.
599
00:32:45,885 --> 00:32:46,215
Yeah,
600
00:32:46,419 --> 00:32:49,119
and I mean that context might be in a
Slack channel where marketing said we
601
00:32:49,119 --> 00:32:53,569
just dropped an email, and, you know, an
email drop, so expect an increased number
602
00:32:53,569 --> 00:32:58,019
of requests to this endpoint as you know
people retrieve their, their special
603
00:32:58,019 --> 00:33:01,589
offer token or whatever that will allow
them to use it in our checkout flow.
604
00:33:02,044 --> 00:33:07,464
Today, like, just giving, if we provided
that scope to a, an AI model within our
605
00:33:07,464 --> 00:33:09,264
system, we would unlock that type of context.
606
00:33:09,764 --> 00:33:13,334
Yeah, I'm not creating any of
these apps, I'm sure they've already
607
00:33:13,334 --> 00:33:15,184
thought of all of this, but, the
first thing that comes to mind is
608
00:33:15,184 --> 00:33:19,084
well, the hack for me would be give
it access to our ops Slack room,
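That hack is straightforward to sketch with the Slack SDK; the channel ID, the lookback window, and how the text gets fed to the model are all assumptions:

```python
# Sketch of the "give it the ops Slack room" hack: pull recent channel history
# so notes like "marketing just dropped an email" can ride along as context.
# Channel ID and the 24-hour window are assumptions for illustration.
import os
import time
from slack_sdk import WebClient

client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
day_ago = f"{time.time() - 24 * 3600:.6f}"

history = client.conversations_history(channel="C0OPSROOM", oldest=day_ago, limit=200)
context_lines = [m.get("text", "") for m in history["messages"]]
# Prepend these lines to the incident-analysis prompt so the model can tell
# a marketing campaign from a bad release before it guesses at root cause.
```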
609
00:33:19,584 --> 00:33:20,064
Right.
610
00:33:20,535 --> 00:33:23,590
we're probably all having those
conversations of, oh, what's going on here?
611
00:33:23,590 --> 00:33:27,680
And someone who
happened to get reached out to by
612
00:33:27,680 --> 00:33:30,880
marketing was like, well, you know,
yesterday we did send out a, you
613
00:33:30,880 --> 00:33:32,240
know, a new coupon sale or whatever.
614
00:33:32,730 --> 00:33:36,240
So, yeah, having it read all that
stuff might be necessary for it to
615
00:33:36,240 --> 00:33:38,170
understand the context, because you're right.
616
00:33:38,220 --> 00:33:41,320
it's not like we have a dashboard in
Grafana that's the number of marketing
617
00:33:41,320 --> 00:33:46,110
emails, you know, the emails sent per day
or the level of sale we're expecting,
618
00:33:46,200 --> 00:33:51,170
based on, in America, the 4th of
July sale, or, you know, some holiday
619
00:33:51,180 --> 00:33:52,680
in a certain region of the world.
620
00:33:52,937 --> 00:33:56,417
Or a social media influencer dropping
like some sort of link to your
621
00:33:56,417 --> 00:34:00,707
product that suddenly, you know,
it's a new green doll that people
622
00:34:00,707 --> 00:34:02,297
attach to their designer handbags.
623
00:34:02,297 --> 00:34:05,897
I don't know like anything about what
the kids are into these days, but
624
00:34:05,897 --> 00:34:09,977
it seems kind of arbitrary and like
I, I would struggle to predict that.
625
00:34:09,977 --> 00:34:10,757
Let me put it that way.
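As a rough illustration of that "give it the ops Slack room" hack: the sketch below pulls recent channel history through Slack's conversations.history endpoint and folds it into an incident prompt. The channel ID, token variable, and prompt shape are assumptions for illustration, not a real integration.

```typescript
// Enrich an incident summary prompt with recent ops/marketing chat so the
// model sees business context (email drops, promos) that never appears in
// observability data. Channel ID and env var are hypothetical.

async function fetchRecentSlackMessages(channelId: string, hoursBack: number): Promise<string[]> {
  const oldest = (Date.now() / 1000 - hoursBack * 3600).toString();
  const res = await fetch(
    `https://slack.com/api/conversations.history?channel=${channelId}&oldest=${oldest}`,
    { headers: { Authorization: `Bearer ${process.env.SLACK_TOKEN}` } },
  );
  const body = (await res.json()) as { messages?: { text: string }[] };
  return (body.messages ?? []).map((m) => m.text);
}

async function buildIncidentPrompt(metricSummary: string): Promise<string> {
  const opsChat = await fetchRecentSlackMessages("C0_OPS_CHANNEL", 24); // hypothetical ID
  return [
    "Explain this traffic anomaly. Business context from chat:",
    ...opsChat.map((t) => `- ${t}`),
    `Metrics: ${metricSummary}`,
  ].join("\n");
}
```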
626
00:34:11,257 --> 00:34:12,517
they're into everything old.
627
00:34:12,517 --> 00:34:14,907
So it's into everything that I'm into.
628
00:34:15,407 --> 00:34:15,497
Yes.
629
00:34:15,598 --> 00:34:15,948
all right.
630
00:34:15,958 --> 00:34:19,068
So if we're talking about this at a
high level, we talked a little bit
631
00:34:19,068 --> 00:34:25,188
about before the show around how
Embrace is thinking about observability
632
00:34:25,198 --> 00:34:28,878
and particularly on mobile, but
you know, anything front end there.
633
00:34:29,138 --> 00:34:34,002
The tooling ecosystem for engineers
on web and mobile is pretty rich.
634
00:34:34,002 --> 00:34:37,642
but they all tend to be just like
hammers for a particular nail, right?
635
00:34:37,642 --> 00:34:41,132
It's, you know, how do we give you a
better crash reporter, a better exception
636
00:34:41,132 --> 00:34:43,652
handler, how do we go measure X or Y?
637
00:34:44,152 --> 00:34:47,272
some of the stuff that we're thinking
about, which is really how we define
638
00:34:47,272 --> 00:34:51,527
the objective of observability
for the coming digital age, right?
639
00:34:51,527 --> 00:34:54,487
Which is, you know, as
creators of user experiences.
640
00:34:54,937 --> 00:34:58,267
I think my opinion is that we
shouldn't just be measuring
641
00:34:58,267 --> 00:34:59,857
like crash rate on an app.
642
00:35:00,337 --> 00:35:04,257
We should be measuring, are users
staying engaged with our experience?
643
00:35:04,407 --> 00:35:08,197
And when we see they are not, sometimes
it's crashes, I mean, the obvious
644
00:35:08,197 --> 00:35:10,457
answer is they can't, right?
645
00:35:10,457 --> 00:35:12,507
Because the app explodes.
646
00:35:13,017 --> 00:35:16,077
But, I think, you know, I was
talking to a senior executive at
647
00:35:16,077 --> 00:35:18,337
a, a massive food delivery app.
648
00:35:18,337 --> 00:35:21,914
And it's, listen, we know anecdotally
there's more than just crashes that make
649
00:35:21,914 --> 00:35:23,814
our users throw their phone at the wall.
650
00:35:23,874 --> 00:35:28,384
You're trying to
order lunch at noon and something's
651
00:35:28,384 --> 00:35:31,754
really slow or you keep running into
a, just a validation error because we
652
00:35:31,754 --> 00:35:34,924
shipped you an experiment thinking it
worked and you can't order the item
653
00:35:34,924 --> 00:35:36,704
you want on the two for one promotion.
654
00:35:37,204 --> 00:35:42,204
you're enraged because you really want
the, you know, the spicy dry fried chicken
655
00:35:42,370 --> 00:35:43,439
hangry.
656
00:35:43,444 --> 00:35:46,224
and you want two of them because I
want to eat the other one tonight.
657
00:35:46,724 --> 00:35:49,704
and you've already suckered me
into that offer, you've convinced
658
00:35:49,704 --> 00:35:52,924
me I want it, and now I'm having
trouble completing my objective.
659
00:35:53,424 --> 00:35:56,894
And broadly speaking, the observability
ecosystem on the front end really
660
00:35:56,894 --> 00:35:58,134
hasn't measured that, right?
661
00:35:58,134 --> 00:36:01,084
We've used all sorts of proxy
measurements out of the data
662
00:36:01,084 --> 00:36:04,504
center because the reliability story
has been really well told and evolved.
663
00:36:04,964 --> 00:36:09,514
Over the past 10 to 15 years in the data
center world, but it just really hasn't
664
00:36:09,514 --> 00:36:10,944
materially evolved in the front end.
665
00:36:10,964 --> 00:36:14,904
And so, a lot of that's like shifting
the objective from how do I just measure
666
00:36:14,924 --> 00:36:21,334
counts of things I already know are bad to
measuring what user engagement looks like
667
00:36:21,334 --> 00:36:25,644
and whether I can attribute that to change
in my software or defects I've introduced.
668
00:36:26,144 --> 00:36:27,424
So, that's kind of the take.
669
00:36:27,494 --> 00:36:32,264
just about anybody who has ever built
a consumer mobile app has Firebase
670
00:36:32,264 --> 00:36:35,124
Crashlytics in the app, which is
a free service provided by Google.
671
00:36:35,144 --> 00:36:36,804
It was a company a long time ago,
672
00:36:37,741 --> 00:36:40,991
Crashlytics, that got bought by Twitter
and then got reacquired by Google.
673
00:36:40,991 --> 00:36:44,341
it basically gives you rough cut
performance metrics and crash reporting.
674
00:36:44,341 --> 00:36:44,521
Right.
675
00:36:44,521 --> 00:36:48,591
I would consider this like the
foundational requirement of any level
676
00:36:48,611 --> 00:36:54,353
of app quality, but to call this
app quality, I think, you know, our
677
00:36:54,353 --> 00:36:56,483
opinion is that would be a misnomer.
678
00:36:56,483 --> 00:36:58,563
So we're going to kind of go
through what this looks like, right?
679
00:36:58,563 --> 00:36:59,783
Which is, it's giving you things
680
00:37:00,283 --> 00:37:04,733
you would expect to see, a number of
events that are crashes, et cetera, and,
681
00:37:04,743 --> 00:37:08,363
you can do what you would expect here,
crashes are bad, so I need to solve a
682
00:37:08,363 --> 00:37:13,413
crash, so I'm going to go into a crash,
view stack traces, get the information I
683
00:37:13,413 --> 00:37:16,193
need to actually be able to resolve it.
684
00:37:16,293 --> 00:37:20,853
and, you know, I think we see a lot
of customers before we talk to them
685
00:37:20,863 --> 00:37:25,173
who are just like, well, I have crash
reporting and I have QA, that's enough.
686
00:37:25,673 --> 00:37:28,483
there's a lot of products
that have other features.
687
00:37:28,483 --> 00:37:31,813
So like Core Web Vital measurements
on a page level, this is Sentry.
688
00:37:31,813 --> 00:37:35,473
It's a lot of data, but I don't
really know what to do with this.
689
00:37:35,873 --> 00:37:38,693
beyond, okay, it probably
has some SEO impact.
690
00:37:38,703 --> 00:37:42,743
There's a, you know, a bad Core
Web Vital or a slow, you know, a slow
691
00:37:42,743 --> 00:37:45,313
render on this page.
692
00:37:45,633 --> 00:37:48,633
How do I actually go
figure out root cause?
693
00:37:49,263 --> 00:37:51,063
But again, right, this is a single signal.
694
00:37:51,073 --> 00:37:56,703
So this is kind of, you don't know
whether or not the P75 Core Web Vital here
695
00:37:56,733 --> 00:38:01,803
that is scored badly by Google
is actually causing your users to bounce.
696
00:38:02,293 --> 00:38:05,983
And I think that's important because I
was reading this article the other day
697
00:38:05,983 --> 00:38:09,783
on this like notion of a performance
plateau, like there's empirical science
698
00:38:09,793 --> 00:38:13,473
proving that like faster Core Web
Vitals, especially with like contentful
699
00:38:13,483 --> 00:38:16,333
paint and Interaction to Next Paint,
700
00:38:16,613 --> 00:38:20,683
et cetera, improve bounce rate
materially, like people are less likely
701
00:38:20,683 --> 00:38:22,883
to bounce if the page loads really fast.
702
00:38:23,383 --> 00:38:26,573
But at some point, like if it's long
enough, there's this massive long
703
00:38:26,573 --> 00:38:29,613
tail of people who just have a rotten
experience and you kind of have to
704
00:38:29,613 --> 00:38:33,093
figure out, I can't make everyone
globally have a great experience.
705
00:38:33,093 --> 00:38:36,513
Where's this plateau where, I know
that I'm improving the experience for
706
00:38:36,513 --> 00:38:40,023
people who I'm likely to retain and
improve their bounce rate versus I'm
707
00:38:40,023 --> 00:38:41,473
just, you know, going to live with this.
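One way to go looking for that plateau, sketched under the assumption you already have per-session web-vital and bounce data (the shapes here are invented for illustration): bucket sessions by the vital and compare bounce rates across buckets.

```typescript
// Bucket sessions by a web vital (INP in ms here) and compute bounce rate
// per bucket; past some bucket the curve flattens, and that flat region is
// the performance plateau. Data shape is hypothetical.

type SessionStat = { inpMs: number; bounced: boolean };

function bounceRateByBucket(sessions: SessionStat[], bucketMs = 200): Map<number, number> {
  const totals = new Map<number, { n: number; bounced: number }>();
  for (const s of sessions) {
    const bucket = Math.floor(s.inpMs / bucketMs) * bucketMs;
    const t = totals.get(bucket) ?? { n: 0, bounced: 0 };
    t.n += 1;
    if (s.bounced) t.bounced += 1;
    totals.set(bucket, t);
  }
  const rates = new Map<number, number>();
  for (const [bucket, t] of totals) rates.set(bucket, t.bounced / t.n);
  return rates; // plot rate vs. bucket; spend effort where the slope is steep
}
```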
708
00:38:41,973 --> 00:38:44,703
And so we kind of have a different
take, which is like we wanted to
709
00:38:44,703 --> 00:38:48,753
center our experience less on just
individual signals, and more on
710
00:38:48,753 --> 00:38:53,423
like these flows, these tasks that
users are performing in your app.
711
00:38:53,923 --> 00:38:58,593
So if you think about the key flows,
like I'm breaking these down into
712
00:38:58,593 --> 00:39:01,873
the types of activities that I
actually built for my end users.
713
00:39:02,373 --> 00:39:05,613
And I want to say, okay, how many
of them were successful versus
714
00:39:05,613 --> 00:39:09,403
how many ended in an error, like
something went truly bad, right?
715
00:39:09,403 --> 00:39:10,763
You just could not proceed.
716
00:39:11,073 --> 00:39:14,853
Versus how many abandoned, and
when they abandoned, why, right?
717
00:39:14,863 --> 00:39:18,833
Did they abandon because they, they
clicked on a product catalog screen, they
718
00:39:18,833 --> 00:39:20,403
saw some stuff that they didn't like?
719
00:39:20,903 --> 00:39:25,983
Or did they abandon because the product
catalog was so slow to load, and images
720
00:39:25,983 --> 00:39:29,233
slow, like slow, so slow to hydrate?
721
00:39:29,653 --> 00:39:33,073
That they perceived it as
broken, lost interest in the
722
00:39:33,073 --> 00:39:34,563
experience and ended up leaving.
723
00:39:34,623 --> 00:39:38,113
And so, the way you do that is you
basically take the telemetry we're
724
00:39:38,163 --> 00:39:41,843
emitting from the app, the exhaust
we collect by default, and you create
725
00:39:41,873 --> 00:39:46,333
these start and end events that allow
you to then, we post process the data.
726
00:39:46,333 --> 00:39:49,213
We go through all of these sessions
we're collecting, which is basically
727
00:39:49,213 --> 00:39:52,973
a play by play of like linear
events that users went through.
728
00:39:53,368 --> 00:39:56,228
And we hydrate the flow to tell
you where people are dropping off.
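A minimal sketch of that flow idea, to make it concrete: classify each session's linear event stream against a start event and terminal events, then aggregate. The event names are hypothetical, and this is an illustration of the concept, not Embrace's actual processing.

```typescript
// Classify one flow per session as completed, errored, or abandoned based on
// start and terminal events, then compute a completion rate. Names invented.

type AppEvent = { name: string; timestamp: number };
type Outcome = "completed" | "errored" | "abandoned" | "not_started";

function classifyFlow(session: AppEvent[], start: string, success: string, failure: string): Outcome {
  const startIdx = session.findIndex((e) => e.name === start);
  if (startIdx === -1) return "not_started";
  for (const e of session.slice(startIdx + 1)) {
    if (e.name === success) return "completed";
    if (e.name === failure) return "errored";
  }
  return "abandoned"; // started, but never reached a terminal event
}

function completionRate(sessions: AppEvent[][]): number {
  const outcomes = sessions
    .map((s) => classifyFlow(s, "checkout_started", "order_confirmed", "checkout_error"))
    .filter((o) => o !== "not_started");
  return outcomes.filter((o) => o === "completed").length / Math.max(outcomes.length, 1);
}
```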
729
00:39:56,728 --> 00:40:00,478
and so you can see like their actual
completion rates over time, you know,
730
00:40:00,478 --> 00:40:04,118
obviously it's a test app, so there's
not a ton of data there, but what gets
731
00:40:04,118 --> 00:40:09,688
really cool is we start to
build out this notion of, once you see
732
00:40:10,188 --> 00:40:15,458
the issues happen, well, how can I now
go look at all of the various attributes
733
00:40:15,458 --> 00:40:21,378
of those populations under the hood
to try to specify which of the things
734
00:40:21,748 --> 00:40:25,558
are most likely to be attributed to
the population suffering the issue?
735
00:40:25,948 --> 00:40:28,008
So that could be an experiment.
736
00:40:28,388 --> 00:40:32,778
it could be a particular mobile device,
a particular version they're on.
737
00:40:32,818 --> 00:40:34,748
It could be, an OS version, right?
738
00:40:34,748 --> 00:40:38,658
You just shipped an experiment that
isn't supported in older OSs, and those
739
00:40:38,848 --> 00:40:40,458
users start having a bad experience.
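And a toy version of that population comparison: score each attribute value (app version, OS version, experiment flag) by how over-represented it is among affected users. The naive lift score and data shapes are simplifying assumptions, not how any particular product computes it.

```typescript
// For each attribute value, compare its share of the affected population to
// its share of everyone else; lift well above 1 marks a suspect attribute.

type UserRecord = { attributes: Record<string, string>; affected: boolean };

function suspectScores(users: UserRecord[]): Map<string, number> {
  const keyOf = (k: string, v: string) => `${k}=${v}`;
  const share = (group: UserRecord[], key: string) =>
    group.filter((u) => Object.entries(u.attributes).some(([k, v]) => keyOf(k, v) === key)).length /
    Math.max(group.length, 1);

  const affected = users.filter((u) => u.affected);
  const rest = users.filter((u) => !u.affected);
  const keys = new Set(users.flatMap((u) => Object.entries(u.attributes).map(([k, v]) => keyOf(k, v))));

  const scores = new Map<string, number>();
  for (const key of keys) {
    scores.set(key, share(affected, key) / Math.max(share(rest, key), 1e-9));
  }
  return scores; // e.g. "experiment=new_checkout" scoring high is your lead
}
```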
740
00:40:40,958 --> 00:40:45,168
And then each of those gets you down to
what we call, this, user play by play
741
00:40:45,198 --> 00:40:50,548
session timeline, where you basically
get a full recreation of every part of
742
00:40:50,588 --> 00:40:54,278
the exhaust stream that we're gathering
from you interacting with the app or
743
00:40:54,278 --> 00:40:56,838
website just for reproduction purposes.
744
00:40:56,898 --> 00:40:59,428
once you've distilled here,
you can say, okay, now let me
745
00:40:59,428 --> 00:41:00,778
look at that cohort of users.
746
00:41:01,268 --> 00:41:03,578
And so I can do pattern
recognition, which I think is pretty
747
00:41:03,729 --> 00:41:04,099
Hmm.
748
00:41:04,618 --> 00:41:08,698
So for the audio audience that
didn't get to watch the video,
749
00:41:09,038 --> 00:41:14,858
what are some of the key sort of, if
someone's in a mobile and front end
750
00:41:14,858 --> 00:41:18,618
team and this is actually going back
to a conversation I had with one of
751
00:41:18,618 --> 00:41:25,128
your Embrace team members at KubeCon in
London, what are some of the key changes
752
00:41:25,138 --> 00:41:26,998
or things that they need to be doing?
753
00:41:27,498 --> 00:41:32,688
I guess if I back up and say, the
premise here is that
754
00:41:32,708 --> 00:41:34,038
there's almost like two archetypes.
755
00:41:34,148 --> 00:41:34,908
What am I trying to say here?
756
00:41:35,108 --> 00:41:36,468
There's two archetypes
that I'm thinking about.
757
00:41:36,468 --> 00:41:41,368
I'm thinking about me, the DevOps
slash observability system maintainer.
758
00:41:41,758 --> 00:41:43,018
I've probably set up
759
00:41:43,518 --> 00:41:47,318
ELK, or, you know, I've got
the key names, the Lokis, the
760
00:41:47,338 --> 00:41:49,368
Prometheus, the Grafanas.
761
00:41:49,368 --> 00:41:51,178
I've got all these things
that I've implemented.
762
00:41:51,208 --> 00:41:53,218
I've brought my
engineering teams on board.
763
00:41:53,218 --> 00:41:54,328
They like these tools.
764
00:41:54,828 --> 00:41:58,268
They tend to have, especially for
mobile, they tend to have other
765
00:41:58,268 --> 00:42:00,408
tools that I don't deal with.
766
00:42:00,778 --> 00:42:02,568
They might have platform, like the, the
767
00:42:02,672 --> 00:42:03,402
Traditionally, right?
768
00:42:03,452 --> 00:42:05,012
We're obviously trying to change that.
769
00:42:05,012 --> 00:42:07,592
But yeah, traditionally, they
have five or six other tools that
770
00:42:07,602 --> 00:42:11,402
don't play into the
observability ecosystem you have set up.
771
00:42:11,808 --> 00:42:12,298
Yeah.
772
00:42:12,298 --> 00:42:16,798
So we're on this journey to
try to centralize, bring them
773
00:42:16,798 --> 00:42:18,258
into the observability world.
774
00:42:18,308 --> 00:42:21,858
you know, traditional mobile
app developers might not even
775
00:42:22,358 --> 00:42:26,908
be aware of what's going on in the
cloud native observability space and
776
00:42:27,218 --> 00:42:28,348
we're bringing them on board here.
777
00:42:28,368 --> 00:42:34,808
Now suddenly they get even more code
coming at them that's slightly less
778
00:42:34,828 --> 00:42:40,918
reliable or maybe presents some unusual
problems that we didn't anticipate.
779
00:42:40,918 --> 00:42:45,258
So now, you know, we're in a
world where suddenly what we have
780
00:42:45,258 --> 00:42:46,668
in observability isn't enough.
781
00:42:47,262 --> 00:42:47,593
yeah,
782
00:42:47,643 --> 00:42:49,653
you're a potential solution.
783
00:42:49,653 --> 00:42:53,373
What are you looking at for behaviors
that they need to change, things
784
00:42:53,373 --> 00:42:54,613
that people can take home with them?
785
00:42:54,653 --> 00:42:55,253
And
786
00:42:55,362 --> 00:42:58,142
I mean, I guess the way I think about
it is, right, the reason
787
00:42:58,142 --> 00:43:04,832
observability became so widely adopted
in server side products was because in
788
00:43:04,832 --> 00:43:10,002
an effort to more easily maintain our
software and to avoid widespread defects
789
00:43:10,002 --> 00:43:14,372
of high blast radius, we shifted from a
paradigm of like monoliths deployed on
790
00:43:14,372 --> 00:43:19,782
bare metal to virtualization, which was,
you know, various container schemes,
791
00:43:19,782 --> 00:43:25,322
which right now has most widely settled
around Kubernetes and microservices
792
00:43:25,352 --> 00:43:28,612
because you could scale them independently
and you could deploy them independently.
793
00:43:28,612 --> 00:43:28,962
Right.
794
00:43:29,462 --> 00:43:34,902
And that complexity of the deployment
scheme and the different apps and services
795
00:43:34,902 --> 00:43:39,682
interplaying with each other necessitated
x-ray vision into your entire system
796
00:43:39,682 --> 00:43:44,452
where you could understand system-wide
impacts to the end of your world.
797
00:43:44,482 --> 00:43:48,232
And the end of your world, for the
most part, became your API surface
798
00:43:48,232 --> 00:43:51,932
layer, the things that served
your web and mobile experiences.
799
00:43:51,972 --> 00:43:55,452
And, you know, there are
businesses that just serve APIs.
800
00:43:55,802 --> 00:43:59,842
Right, but broadly speaking, the brands
we interact with as human beings serve us
801
00:43:59,892 --> 00:44:02,302
visual experiences that we interact with.
802
00:44:02,802 --> 00:44:03,172
right.
803
00:44:03,292 --> 00:44:06,812
It's the server team managing
the server analytics, not so much
804
00:44:06,812 --> 00:44:10,862
the client device analytics that
805
00:44:11,021 --> 00:44:11,351
right.
806
00:44:11,831 --> 00:44:15,121
the world has gotten a lot more
complicated in what the front
807
00:44:15,121 --> 00:44:16,571
end experience looks like.
808
00:44:16,591 --> 00:44:21,901
And you could have a service that
consistently responds and has a nominal
809
00:44:21,901 --> 00:44:27,171
increase in latency and is well within
your alert thresholds, but where the
810
00:44:27,201 --> 00:44:32,831
SDK or library designed for your front
end experience suddenly starts retrying
811
00:44:32,871 --> 00:44:37,711
a lot more frequently, delivering
perceived latency to your end user.
812
00:44:38,211 --> 00:44:42,441
And so I think the question
is, could you uncover that incident?
813
00:44:42,451 --> 00:44:46,961
Because if users suffer perceived latency
and therefore abandon, what metrics do
814
00:44:46,961 --> 00:44:51,441
you have to go measure whether or not
users are performing the actions you
815
00:44:51,471 --> 00:44:54,551
care about them performing, whether
that's attributable to system change?
816
00:44:55,051 --> 00:44:58,181
In most instances, I don't think
most observability systems have that.
817
00:44:58,681 --> 00:45:01,671
and then the second question is, right,
so, and by the way, Bret, that's the
818
00:45:01,671 --> 00:45:05,471
underlying supposition that in a real
observability scheme, mean time to detect
819
00:45:05,471 --> 00:45:09,481
is as important, if not
more so, than mean time to resolve.
820
00:45:09,981 --> 00:45:13,441
the existing tooling ecosystem for
frontend and mobile has been set up to
821
00:45:13,461 --> 00:45:17,391
optimize mean time to resolve for known
problems where I can basically just
822
00:45:17,391 --> 00:45:19,301
count the instances and then alert you.
823
00:45:19,311 --> 00:45:22,591
So, you know, the lack of
desire to be on call, like I've
824
00:45:22,591 --> 00:45:26,551
heard this stupid saying that
there's no such thing as a front end
825
00:45:26,551 --> 00:45:28,701
emergency, which is like ridiculous.
826
00:45:29,001 --> 00:45:33,781
If I'm, you know, if I'm a major travel
website and I run a thousand different
827
00:45:33,791 --> 00:45:39,241
experiments and a team in, Eastern
Europe drops an experiment that affects
828
00:45:39,241 --> 00:45:42,511
users globally, 1 percent of users
globally in the middle of my night.
829
00:45:42,816 --> 00:45:47,496
That makes the calendar control
broken and some segment of that
830
00:45:47,496 --> 00:45:49,066
population can't book their flights.
831
00:45:49,346 --> 00:45:51,596
That sounds a lot like a
production emergency to me.
832
00:45:51,866 --> 00:45:55,226
I, that has material business
impact in terms of revenue.
833
00:45:55,726 --> 00:45:59,266
Or the font color changes
and the font's not readable
834
00:45:59,456 --> 00:46:04,496
yeah, I guess I am imploring the
world to shift to a paradigm where
835
00:46:04,496 --> 00:46:09,386
they view like users willingness
and ability to interact with your
836
00:46:09,386 --> 00:46:12,006
experiences as a reliability signal.
837
00:46:12,506 --> 00:46:17,056
And I think the underlying supposition
is that this only becomes more acute of a
838
00:46:17,076 --> 00:46:20,066
problem as the number of features we ship grows.
839
00:46:20,126 --> 00:46:26,676
I guess I'm starting with the
belief that from what I hear, people
840
00:46:26,676 --> 00:46:28,646
are doubling down on this, right?
841
00:46:28,646 --> 00:46:33,236
They're saying we need to make software
cheaper to build and faster to build.
842
00:46:33,736 --> 00:46:37,106
Because it is a competitive environment
and if we don't do it, somebody else will.
843
00:46:37,606 --> 00:46:40,976
and as that world starts to
expand, the acuity of the
844
00:46:40,976 --> 00:46:42,746
problem space only increases.
845
00:46:43,246 --> 00:46:43,596
Yeah.
846
00:46:43,606 --> 00:46:49,316
I think your theory here matches
up well with some others. Like, we're
847
00:46:49,316 --> 00:46:52,916
all kind of in this place where
we've got a little bit of evidence.
848
00:46:52,916 --> 00:46:54,746
We hear some things
and we've got theories.
849
00:46:55,136 --> 00:46:58,396
We don't have years of facts of
exactly how AI is affecting a lot of
850
00:46:58,406 --> 00:47:03,036
these things, but other compatible
theories I've heard recently on this,
851
00:47:03,266 --> 00:47:04,846
actually with three guests on the show,
852
00:47:05,181 --> 00:47:06,661
that are just coming to mind.
853
00:47:06,661 --> 00:47:10,701
One of them is, because of the velocity
change, because of
854
00:47:10,701 --> 00:47:14,831
this increase in velocity, it's only
going to increase the desire for
855
00:47:14,831 --> 00:47:20,351
standardization in CI and deployment,
which those of us in, if you've been
856
00:47:20,391 --> 00:47:23,161
living in this Kubernetes world, we've
all been trying to approach that.
857
00:47:23,161 --> 00:47:27,491
Like we're all leaning into Argo CD
as the number one way
858
00:47:27,491 --> 00:47:29,021
on Kubernetes to deploy software.
859
00:47:29,271 --> 00:47:29,881
you know, we've got
860
00:47:30,246 --> 00:47:35,876
this GitOps idea of how to standardize
change as we deliver in CI.
861
00:47:35,886 --> 00:47:37,826
It's still completely wild, wild west.
862
00:47:37,826 --> 00:47:41,746
You've got a thousand vendors
and a thousand ways you
863
00:47:41,746 --> 00:47:42,896
can create your pipelines.
864
00:47:43,286 --> 00:47:46,226
And hence the reason I need to make
courses on it for people, because
865
00:47:46,226 --> 00:47:47,956
there's a lot of art still to it.
866
00:47:48,126 --> 00:47:50,746
We don't have a checkbox of
this is exactly how we do it.
867
00:47:51,146 --> 00:47:54,326
And in that world, the theory is
868
00:47:54,616 --> 00:47:57,526
right now that maybe AI is
going to force us
869
00:47:57,526 --> 00:48:01,326
to standardize because we can't have a
thousand different workflows or pipelines
870
00:48:01,616 --> 00:48:05,726
that are all slightly different for
different parts of our software stack,
871
00:48:06,036 --> 00:48:10,256
because then when we get to production and
we start having problems, or if the
872
00:48:10,256 --> 00:48:13,886
AI is starting to take more control, it's
just going to get worse because the AI
873
00:48:13,990 --> 00:48:17,540
Yeah, and at the end of the day,
right, standardization is more
874
00:48:17,540 --> 00:48:18,970
for the mean than the outliers.
875
00:48:18,970 --> 00:48:22,450
And I think a lot of people assume they're
like, oh, we can do it right because we
876
00:48:22,450 --> 00:48:26,250
have hundreds of engineers working on like
our, you know, our automation and stuff.
877
00:48:26,250 --> 00:48:27,990
It's, do you know how many
companies there are out there
878
00:48:27,990 --> 00:48:29,550
that are not technology companies?
879
00:48:29,590 --> 00:48:32,330
They're a warehouse
company with technology.
880
00:48:32,330 --> 00:48:32,345
Right.
881
00:48:32,895 --> 00:48:37,435
And like standardization allows them
to more often than not build good
882
00:48:37,585 --> 00:48:42,635
technology, to measure correctly, to
deploy things the right way, like we're
883
00:48:42,635 --> 00:48:46,995
all going to interact with it in some
way, right, and as the pressure for more
884
00:48:46,995 --> 00:48:51,690
velocity and them building technology
speeds up, the need for, you know, a
885
00:48:51,700 --> 00:48:55,640
base layer of doing things
consistently right only increases.
886
00:48:55,640 --> 00:48:57,910
And I think that's, you know,
for the world it's a challenge,
887
00:48:57,910 --> 00:48:59,260
for us it's an opportunity.
888
00:48:59,760 --> 00:49:00,090
Yeah.
889
00:49:00,590 --> 00:49:01,060
Awesome.
890
00:49:01,560 --> 00:49:03,890
I think that's a perfect
place to wrap it up.
891
00:49:04,020 --> 00:49:05,280
we've been going for a while now.
892
00:49:05,790 --> 00:49:09,745
but, Andrew, embrace.io, right?
893
00:49:09,835 --> 00:49:10,115
That's,
894
00:49:10,329 --> 00:49:10,619
Yeah.
895
00:49:10,649 --> 00:49:11,509
www.
896
00:49:11,909 --> 00:49:12,169
embrace.
897
00:49:12,769 --> 00:49:12,779
io.
898
00:49:13,115 --> 00:49:13,925
let's just bring those up.
899
00:49:13,925 --> 00:49:16,455
We didn't show a lot of
that stuff, but, the,
900
00:49:16,505 --> 00:49:20,075
in case people didn't know, I made a short
a few months ago about Embrace or with one
901
00:49:20,075 --> 00:49:21,815
of the Embrace team members at KubeCon.
902
00:49:21,815 --> 00:49:24,975
which we had touched a little bit on
this show, but we talked about that.
903
00:49:25,475 --> 00:49:28,475
Observability is here for your mobile
apps now and your front end apps and
904
00:49:28,475 --> 00:49:32,575
that those people, those developers
can now join the rest of us in this,
905
00:49:32,675 --> 00:49:38,515
world of modern metrics, collection
and consolidation of logging tools
906
00:49:38,515 --> 00:49:43,175
and bringing it all together into
ideally one, one single pane of glass.
907
00:49:43,225 --> 00:49:46,325
if you're advanced enough to figure
all that out, the tools are making
908
00:49:46,325 --> 00:49:48,475
it a little bit easier nowadays,
but I still think there's
909
00:49:48,975 --> 00:49:52,155
a lot of effort in terms of
implementation engineering to get all
910
00:49:52,155 --> 00:49:53,615
this stuff to work the way we hope.
911
00:49:53,645 --> 00:49:56,615
But it sounds like you all
are making it easier for those
912
00:49:56,615 --> 00:49:58,085
people, with your platform.
913
00:49:58,585 --> 00:49:59,305
We're definitely trying.
914
00:49:59,305 --> 00:49:59,525
Yeah.
915
00:49:59,525 --> 00:50:03,985
I think a future state that would
be pretty cool would be the ability for
916
00:50:03,985 --> 00:50:09,075
operators to look at a Grafana dashboard
or, you know, a dashboard in Chronosphere
917
00:50:09,075 --> 00:50:10,675
or New Relic or whatever it is.
918
00:50:11,175 --> 00:50:18,335
see that, you know, they see
an engagement decrease on login on a web
919
00:50:18,335 --> 00:50:23,575
property, and immediately open an incident
where they page in the front end team and
920
00:50:23,575 --> 00:50:30,995
the core team servicing the auth APIs in
a company and have them operating on data
921
00:50:31,025 --> 00:50:32,375
where the front end team can be like,
922
00:50:32,875 --> 00:50:38,525
We're seeing a number of retries
happening after we updated to the
923
00:50:38,525 --> 00:50:42,265
new version of the API that you
serve for login credentials.
924
00:50:42,765 --> 00:50:46,345
even though they're all 200s, they're
all successful requests, what's going on?
925
00:50:46,435 --> 00:50:50,295
And that backend team says, well, it
looks like our, you know, we were slow
926
00:50:50,295 --> 00:50:55,845
rolling it out and the P90 latency
is actually 250 milliseconds longer.
927
00:50:56,345 --> 00:50:57,715
Why would that impact you?
928
00:50:57,745 --> 00:51:01,985
And they say, well, the SDK retries
after 500 milliseconds and our P50
929
00:51:01,985 --> 00:51:03,565
latency before this was 300 milliseconds.
930
00:51:03,575 --> 00:51:06,885
So, 10 percent of our users or
something are starting to retry
931
00:51:06,895 --> 00:51:07,985
and that's why we're seeing this.
932
00:51:07,985 --> 00:51:10,455
You know, the answer here is to increase
933
00:51:10,955 --> 00:51:14,435
the resource provisioning for the
auth service to get latency back
934
00:51:14,435 --> 00:51:19,775
down, and/or change our SDK to have
a more permissive retry policy.
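The arithmetic behind that story is worth sketching: with a fixed client-side retry timeout, the retry rate is just the share of latencies past the cutoff, so a modest backend regression can swing it dramatically even while every request still returns a 200. The sample numbers below are invented to roughly match the anecdote.

```typescript
// Share of requests that trip a client retry, before and after a uniform
// backend latency regression. Sample distribution and the uniform shift are
// assumptions for illustration.

function retryFraction(latenciesMs: number[], retryAfterMs: number, addedLatencyMs = 0): number {
  const retried = latenciesMs.filter((l) => l + addedLatencyMs > retryAfterMs).length;
  return retried / latenciesMs.length;
}

const sample = [200, 250, 300, 300, 320, 350, 400, 450, 480, 600]; // P50 ~300ms
console.log(retryFraction(sample, 500));      // 0.1: only the slowest request retries
console.log(retryFraction(sample, 500, 250)); // 0.8: most of the tail now retries
```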
935
00:51:20,275 --> 00:51:24,425
and, you know, have teams be able to
collaborate around the right design
936
00:51:24,425 --> 00:51:27,625
of software for their end users and
understand the problem from both
937
00:51:27,645 --> 00:51:31,465
perspectives, but be able to kick
off that incident because they saw
938
00:51:31,825 --> 00:51:36,535
real people failing to engage,
not just some server side metrics,
939
00:51:36,595 --> 00:51:37,805
which I think would be pretty neat.
940
00:51:38,305 --> 00:51:38,695
Yeah.
941
00:51:39,195 --> 00:51:42,265
And I should mention you all,
if I remember correctly in cloud
942
00:51:42,275 --> 00:51:49,145
native, you're lead maintainers
on the mobile observability SDK.
943
00:51:49,145 --> 00:51:50,285
Is that, am I getting that right?
944
00:51:50,335 --> 00:51:50,555
I'm trying
945
00:51:50,569 --> 00:51:54,319
we have engineers who are
approvers, on Android and iOS.
946
00:51:54,479 --> 00:51:59,059
we have, from what I'm
aware of, the only production
947
00:51:59,059 --> 00:52:01,829
React Native OpenTelemetry SDK.
948
00:52:01,989 --> 00:52:07,009
we are also participants in a new
browser SIG, which is a subset
949
00:52:07,019 --> 00:52:08,699
of the former JavaScript SDK.
950
00:52:08,699 --> 00:52:11,439
So our OpenTelemetry SDK
for web properties is
951
00:52:11,839 --> 00:52:15,759
basically a very slimmed-down chunk of
instrumentation that's only relevant
952
00:52:15,759 --> 00:52:17,729
for browser React implementations.
953
00:52:18,229 --> 00:52:22,929
so yeah, working to advance the kind of
standards community in the cloud native
954
00:52:22,939 --> 00:52:27,659
environment for instrumenting
real-world runtimes where Swift,
955
00:52:27,699 --> 00:52:31,299
Kotlin, and JavaScript are executed.
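For anyone who hasn't touched OpenTelemetry from an app yet, here's a minimal sketch of wrapping a user flow in a span with the vendor-neutral @opentelemetry/api package. The SDK and exporter wiring is omitted, and the span and attribute names are hypothetical, not Embrace's actual conventions.

```typescript
// Wrap a user flow in an OpenTelemetry span so its outcome and duration show
// up alongside the rest of your telemetry. Names here are hypothetical.
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("checkout-flow");

async function submitOrder(cartId: string): Promise<void> {
  // Stand-in for the app's real network call.
}

async function placeOrder(cartId: string): Promise<void> {
  const span = tracer.startSpan("checkout", { attributes: { "app.cart_id": cartId } });
  try {
    await submitOrder(cartId);
    span.setStatus({ code: SpanStatusCode.OK });
  } catch (err) {
    span.recordException(err as Error);
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw err;
  } finally {
    span.end();
  }
}
```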
956
00:52:31,799 --> 00:52:32,339
Nice.
957
00:52:32,839 --> 00:52:34,369
Andrew, thanks so much for being here.
958
00:52:34,379 --> 00:52:35,639
So we'll see you soon.
959
00:52:35,709 --> 00:52:36,259
Bye everybody.
