Episode Transcript
1
00:00:04,333 --> 00:00:08,053
This is DevOps and Docker talk,
and I am your host, Bret Fisher.
2
00:00:08,413 --> 00:00:10,483
This was a fun episode this week.
3
00:00:10,483 --> 00:00:14,743
Nirmal Mehta is back from
AWS and our guest this week
4
00:00:14,773 --> 00:00:17,443
was Michael Irwin of Docker.
5
00:00:17,443 --> 00:00:22,183
He is a recurring guest if you've been
listening to this podcast for any length
6
00:00:22,183 --> 00:00:25,743
of time, I think we had him on earlier
this year, and he's been a friend for a
7
00:00:25,743 --> 00:00:30,693
decade, former Docker captain, now Docker
employee advocating for Docker everywhere.
8
00:00:30,693 --> 00:00:34,233
He's all over the place, and I
always have him on because he
9
00:00:34,233 --> 00:00:39,109
breaks stuff down to help us all
understand what's going on at Docker.
10
00:00:39,199 --> 00:00:45,109
And there is a giant list of things we had
to talk about this week, all AI related,
11
00:00:45,139 --> 00:00:49,789
all separate pieces of the puzzle that
can be used independently or together.
12
00:00:49,969 --> 00:00:54,259
So in this episode, we will be covering
the Docker Model Runner, which I've
13
00:00:54,259 --> 00:00:58,609
talked about a lot on this show over
the last four months, for running open
14
00:00:58,669 --> 00:01:03,109
or free models locally or remotely on
servers or on your machine, wherever.
15
00:01:03,229 --> 00:01:05,599
Basically you get to run your own
models and Docker will host them for
16
00:01:05,599 --> 00:01:10,289
you, Docker will encapsulate them and allow
you to use the Docker CLI to manage it.
17
00:01:10,589 --> 00:01:15,539
Then we have the Hub model catalog,
so you can pick one of the dozens of
18
00:01:15,539 --> 00:01:19,649
models, I guess, that are available in
Docker Hub and we talk about Gordon
19
00:01:19,649 --> 00:01:24,509
AI, which is their chatbot built into
Docker Desktop and the Docker CLI.
20
00:01:25,059 --> 00:01:31,509
We then get into the MCP Toolkit and the
Hub's MCP catalog, and how to bring all
21
00:01:31,509 --> 00:01:37,504
our tools into our local AI, or, like,
using some other AI and you're just
22
00:01:37,504 --> 00:01:39,124
wanting to use your own MCP tools.
23
00:01:39,334 --> 00:01:43,204
We talk about how Docker manages
all that and it uses the MCP gateway
24
00:01:43,204 --> 00:01:47,764
that they recently open sourced
to front end all those tools and
25
00:01:47,764 --> 00:01:49,534
to help you manage your MCP tools.
26
00:01:50,074 --> 00:01:56,554
We also then get into Compose and how
you add models and the MCP gateway plus
27
00:01:56,554 --> 00:02:00,049
MCP tools, all into your Compose files.
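As a rough sketch of what adding a model to a Compose file can look like (the service name, image, and model here are made-up examples, and the exact schema depends on your Compose version), the newer top-level models element reads something like:

```yaml
# Hypothetical compose.yaml sketch: an app service wired to a local model.
# The image and model names are examples, not from the episode.
services:
  app:
    image: my-agent-app
    models:
      - llm             # makes the model's endpoint available to the service
models:
  llm:
    model: ai/smollm2   # pulled and served by Docker Model Runner
```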
28
00:02:00,319 --> 00:02:03,979
Then we talk about how to use that
for building agents a little bit.
29
00:02:04,089 --> 00:02:10,209
And then Offload, which allows
you to build, run containers,
30
00:02:10,209 --> 00:02:15,999
run Docker models, all in Docker
Cloud, and they call it Offload.
31
00:02:15,999 --> 00:02:18,264
It actually is a pretty unique name.
32
00:02:18,264 --> 00:02:21,564
Most other people just called it Docker
Cloud, but they call it Docker Offload.
33
00:02:21,834 --> 00:02:22,464
Great name.
34
00:02:22,464 --> 00:02:25,254
You just toggle inside your
Docker UI and you're good to go.
35
00:02:25,254 --> 00:02:26,064
Everything's remote.
36
00:02:26,154 --> 00:02:31,764
And then at the very end we talked
about how Compose now works, at least
37
00:02:31,764 --> 00:02:35,364
the YAML files of Compose,
inside of Google Cloud Run.
38
00:02:35,704 --> 00:02:37,744
Probably coming to other places soon.
39
00:02:37,924 --> 00:02:43,504
So we go for almost 90 minutes on
this show, and for good reason: we
40
00:02:43,534 --> 00:02:49,904
talk about the use cases of all these
different AI pieces of the puzzle.
41
00:02:50,264 --> 00:02:55,814
And I'm excited, because it paints,
I hope, a complete picture for you
42
00:02:56,114 --> 00:03:00,579
of what Docker has released, how it
all works together, when you would
43
00:03:00,579 --> 00:03:04,149
choose each part for solving
different problems, and all that.
44
00:03:04,359 --> 00:03:08,709
So please enjoy this episode with
Nirmal and Michael Irwin of Docker.
45
00:03:10,863 --> 00:03:11,533
Hi!
46
00:03:12,020 --> 00:03:13,071
Hey, Awesome.
47
00:03:13,444 --> 00:03:14,144
I'm doing all right.
48
00:03:14,174 --> 00:03:17,255
We have a special guest,
Michael, another Virginian.
49
00:03:17,265 --> 00:03:17,955
Hello.
50
00:03:18,699 --> 00:03:19,179
Hello,
51
00:03:19,472 --> 00:03:21,222
We've all known each other for a decade.
52
00:03:21,242 --> 00:03:22,592
So Michael, tell us who you are.
53
00:03:23,052 --> 00:03:25,302
So, I'm Michael Irwin; I work at Docker.
54
00:03:25,302 --> 00:03:29,896
I'm on our developer success team,
and do a lot of teaching, training,
55
00:03:29,896 --> 00:03:34,966
education, breaking down the stuff,
trying to make it somewhat understandable.
56
00:03:35,056 --> 00:03:38,936
I've been with Docker for almost
three and a half years now, so it's
57
00:03:38,936 --> 00:03:41,726
gone pretty quickly, but, it's fun
to spend time with the community
58
00:03:41,776 --> 00:03:43,436
and just help developers out.
59
00:03:43,836 --> 00:03:47,006
Learn about this stuff, but also
how do you actually take advantage
60
00:03:47,006 --> 00:03:49,086
of it and do cool stuff with it?
61
00:03:49,385 --> 00:03:50,115
That's awesome.
62
00:03:50,145 --> 00:03:53,705
I mean, this is like a new wave
of, essentially, development
63
00:03:53,715 --> 00:03:56,515
tooling, and it's pretty exciting.
64
00:03:56,915 --> 00:03:57,275
Yeah.
65
00:03:57,275 --> 00:03:58,265
We've got a list, people.
66
00:03:58,365 --> 00:04:01,765
I only make lists on this
show a couple of times a year.
67
00:04:02,015 --> 00:04:05,495
Yeah, it does seem like every time I'm
on the show, there's a list involved.
68
00:04:06,065 --> 00:04:07,685
yeah, that's a pretty decent list.
69
00:04:07,965 --> 00:04:10,965
Let's break it down, because we've got
a lot to get through, and we realized
70
00:04:10,975 --> 00:04:13,895
that on Docker's channel and on
this channel, they've been talking
71
00:04:13,895 --> 00:04:15,175
about all these fun features.
72
00:04:15,205 --> 00:04:17,425
I've talked at length
about Docker Model Runner.
73
00:04:17,835 --> 00:04:23,155
Hub Catalog, I don't have a bunch of
videos on MCP or the MCP tools, but
74
00:04:23,725 --> 00:04:25,955
actually we've had streams on it before.
75
00:04:26,005 --> 00:04:30,605
But I think the MCP Toolkit and the
MCP Gateway are what I'm most excited about.
76
00:04:30,605 --> 00:04:31,935
So we're going to get
to that in a little bit.
77
00:04:32,329 --> 00:04:35,789
I think, technically, the first
thing out of the gate was Gordon AI.
78
00:04:36,794 --> 00:04:38,704
This, to me, is:
79
00:04:39,439 --> 00:04:44,509
From a user's perspective, it's a Docker-focused
or maybe a developer-focused
80
00:04:44,579 --> 00:04:52,219
AI chatbot, essentially similar to
ChatGPT, but it's in my Docker interface.
81
00:04:52,279 --> 00:04:56,329
If I'm staring at the Docker Desktop
interface, there's an Ask Gordon button.
82
00:04:56,929 --> 00:05:00,499
And if you've never clicked that,
or if you clicked it once a couple
83
00:05:00,499 --> 00:05:03,789
years ago, it has changed a lot.
84
00:05:03,849 --> 00:05:09,685
There have been enhancements, so now
we have memory and threads and it saves
85
00:05:09,685 --> 00:05:14,355
me from having to go to ChatGPT when
I'm working, specifically around Docker
86
00:05:14,355 --> 00:05:18,505
stuff, but it feels like I can
ask it anything developer-related.
87
00:05:18,795 --> 00:05:21,275
Can you tell me now, is
this free for everyone?
88
00:05:21,275 --> 00:05:24,675
How does this play out in
terms of who can use this?
89
00:05:24,675 --> 00:05:26,655
Yeah, so everybody can
access it right now.
90
00:05:26,655 --> 00:05:29,665
The only limitations may be
some of our business orgs.
91
00:05:29,705 --> 00:05:32,065
Those orgs have to enable it.
92
00:05:32,535 --> 00:05:35,825
We're just not going to roll out all
the AI stuff, as most organizations
93
00:05:35,825 --> 00:05:37,155
are pretty cautious about that.
94
00:05:37,442 --> 00:05:39,522
but yeah, Gordon's available to everybody.
95
00:05:39,572 --> 00:05:43,562
It started off actually mostly as a
documentation helper: just help me
96
00:05:43,572 --> 00:05:46,802
answer questions and keep up with
new features and that kind of stuff.
97
00:05:47,292 --> 00:05:51,472
But, as you've noted, we've added new
capabilities and new tools along the way.
98
00:05:51,937 --> 00:05:55,961
I was doing some experiments just
the other day: Hey Gordon, help
99
00:05:55,961 --> 00:06:00,601
me convert this Dockerfile to use
the new Docker Hardened Images.
100
00:06:00,611 --> 00:06:05,511
And it would do the conversion, find the
images that my organization has that match
101
00:06:05,511 --> 00:06:07,481
the one that I'm using in this Dockerfile.
102
00:06:07,531 --> 00:06:10,981
You're starting to see more and more
capabilities built into it as well. It's
103
00:06:10,981 --> 00:06:12,681
a pretty fun little assistant there.
104
00:06:13,057 --> 00:06:13,567
Yeah,
105
00:06:13,694 --> 00:06:18,714
I highly recommend, for folks that are
listening: if you've not touched Docker and
106
00:06:18,714 --> 00:06:23,714
you just open it up, check out Gordon
and ask those questions that you probably
107
00:06:23,724 --> 00:06:27,104
have. "Compose all the things": you could
probably put that in there, and Gordon will
108
00:06:27,104 --> 00:06:28,804
probably try to compose all the things.
109
00:06:28,857 --> 00:06:32,887
You know, one of my common uses for
this is when I need to translate a docker
110
00:06:32,887 --> 00:06:36,477
run command into a Compose file, or
back and forth: like, please give me the
111
00:06:36,477 --> 00:06:40,807
docker run equivalent of this Compose
file, or please turn these docker run
112
00:06:40,807 --> 00:06:46,277
commands or this docker build command
into a Compose file. And it saves me.
113
00:06:46,677 --> 00:06:49,967
It's not, I mean, I've been doing this
stuff almost every day for a decade,
114
00:06:50,207 --> 00:06:55,697
so it's not like I needed it to do it
for me, but it's still faster than a pro
115
00:06:55,747 --> 00:06:58,847
typing it in from memory. I'm at the
point now where I rarely need to refer
116
00:06:58,897 --> 00:07:03,067
to the docs, but it's still faster
than me at writing a Compose file
117
00:07:03,067 --> 00:07:04,467
out and saving it to the hard drive.
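As an illustration of the kind of translation being described (the image, ports, and volume here are a made-up example, not from the episode), a docker run command and its rough Compose equivalent:

```yaml
# This hypothetical docker run command:
#   docker run -d --name web -p 8080:80 -v site:/usr/share/nginx/html nginx:alpine
# maps to roughly this compose.yaml:
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
    volumes:
      - site:/usr/share/nginx/html
volumes:
  site:
```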
118
00:07:04,867 --> 00:07:08,757
You know, because sometimes we debate
around whether AI tools are useful for
119
00:07:08,767 --> 00:07:13,887
senior versus junior and blah, blah, blah,
and I don't know, as a senior, it saves
120
00:07:13,887 --> 00:07:17,727
me keystrokes, it saves me time, unlike
when it first came out a couple years
121
00:07:17,727 --> 00:07:23,562
ago. I have tracked it, given feedback to
the team, and it no longer makes Compose
122
00:07:23,562 --> 00:07:26,262
files with old Compose information.
123
00:07:26,262 --> 00:07:27,532
Like it, at least in the
124
00:07:27,721 --> 00:07:28,591
the version tag.
125
00:07:28,772 --> 00:07:29,192
Yeah.
126
00:07:29,192 --> 00:07:30,632
Cause it, the version tag.
127
00:07:30,642 --> 00:07:31,042
Yeah.
128
00:07:31,142 --> 00:07:33,772
I was a big proponent of: Hey, this
hasn't been in there for five years.
129
00:07:33,772 --> 00:07:34,802
Why is it recommending it?
130
00:07:35,192 --> 00:07:36,122
and they fixed that.
131
00:07:36,132 --> 00:07:38,072
So, I'm very appreciative of it.
132
00:07:38,492 --> 00:07:41,042
We're going to save it, we're going
to talk about MCP tools, and then
133
00:07:41,042 --> 00:07:43,342
we're going to come back to this,
because it's gotten better,
134
00:07:43,592 --> 00:07:45,672
and one of the reasons it's gotten
better is it can talk to tools.
135
00:07:45,922 --> 00:07:49,382
So I don't want to really spoil that.
But yeah, I love the suggestions, because
136
00:07:49,382 --> 00:07:53,625
sometimes when you're staring at a blank
chat box, it feels kind of like being a
137
00:07:53,625 --> 00:07:55,625
writer and staring at a blank document.
138
00:07:55,685 --> 00:07:57,845
It's like, I know this is
supposed to be a cool tool, but
139
00:07:57,845 --> 00:07:59,095
I don't know what to do with it.
140
00:07:59,125 --> 00:07:59,775
Where do I start?
141
00:07:59,885 --> 00:08:00,255
Yeah.
142
00:08:00,271 --> 00:08:00,951
analogy
143
00:08:01,375 --> 00:08:04,148
There are even some cool spots, like if
you go to the Containers tab, you'll
144
00:08:04,148 --> 00:08:08,428
see on the right side a little like
magic icon there, and then you can
145
00:08:08,438 --> 00:08:10,338
ask questions about that container.
146
00:08:10,338 --> 00:08:13,348
So if you have a container that's
failing to start or whatever, you can
147
00:08:13,538 --> 00:08:17,668
basically start a thread around the
context of why is this thing not starting.
148
00:08:17,906 --> 00:08:18,386
Interesting.
149
00:08:18,386 --> 00:08:19,616
I have not tried that.
150
00:08:20,256 --> 00:08:23,506
it actually reminds me of the little
star we get for Gemini that's in every
151
00:08:23,506 --> 00:08:25,066
single Google app that I'm opening.
152
00:08:25,066 --> 00:08:29,896
I guess we're all slowly converging
on the starlight AI icons now. I
153
00:08:29,896 --> 00:08:32,336
would have thought it would have
been a robot face, but we chose this
154
00:08:32,346 --> 00:08:34,666
vague collection of stars and pluses.
155
00:08:34,666 --> 00:08:35,646
And that is pretty cool.
156
00:08:35,656 --> 00:08:38,336
So, is there more? Is
it on images as well?
157
00:08:38,396 --> 00:08:40,746
So yeah, anywhere you see
that icon, you can start a
158
00:08:40,746 --> 00:08:42,636
conversation around that thing.
159
00:08:42,997 --> 00:08:45,137
I'm going to start clicking on that
more often to see what it tells me.
160
00:08:45,137 --> 00:08:47,307
Like, Hey, what's the, what data.
161
00:08:47,652 --> 00:08:49,402
Ooh, how do I back up this volume?
162
00:08:49,432 --> 00:08:50,332
That's, that is pretty cool.
163
00:08:50,332 --> 00:08:50,542
So they're
164
00:08:50,778 --> 00:08:52,568
that is the number one question
165
00:08:52,736 --> 00:08:54,386
I love that they're context aware.
166
00:08:54,386 --> 00:08:57,446
Like they know that this is a
volume, or at least the prompts.
167
00:08:57,956 --> 00:09:00,912
it's like prompt suggestions,
which, this feels a little meta.
168
00:09:00,912 --> 00:09:03,162
Is the AI suggesting prompts for the AI?
169
00:09:03,242 --> 00:09:06,152
It's not in this case,
but it certainly could.
170
00:09:08,053 --> 00:09:08,433
All right.
171
00:09:08,973 --> 00:09:10,573
So that's Ask Gordon.
172
00:09:10,673 --> 00:09:10,763
Woo!
173
00:09:11,283 --> 00:09:11,693
Alright.
174
00:09:11,976 --> 00:09:12,966
And that is what you said.
175
00:09:13,164 --> 00:09:15,614
That is available to everyone
running Docker Desktop.
176
00:09:15,804 --> 00:09:19,114
Is that, to be clear, that's
not available for people running
177
00:09:19,124 --> 00:09:21,404
Docker Engine on Linux, right?
178
00:09:21,424 --> 00:09:25,494
That is not. There is a
CLI to it, isn't there?
179
00:09:25,594 --> 00:09:26,484
There's Docker AI.
180
00:09:26,705 --> 00:09:29,595
but that's part of the
Docker Desktop installation.
181
00:09:29,595 --> 00:09:32,435
that's a CLI plugin and it's going
to talk to the components that are
182
00:09:32,435 --> 00:09:34,115
bundled in Docker Desktop there.
183
00:09:34,165 --> 00:09:34,465
Yeah.
184
00:09:34,505 --> 00:09:37,895
So we have, this is an AI
that's outsourced to Docker.
185
00:09:37,895 --> 00:09:38,795
It's not running locally.
186
00:09:38,825 --> 00:09:41,265
it's just calling APIs from
the Docker Desktop interface.
187
00:09:41,315 --> 00:09:47,165
So then if I don't want to use Gordon
AI, but I want to run my own AI models
188
00:09:47,175 --> 00:09:52,295
locally, which I feel like this is
pretty niche because a lot of people I
189
00:09:52,295 --> 00:09:58,645
talk to, their GPU is underwhelming, and
they don't have an M4 Mac or a NVIDIA
190
00:09:58,655 --> 00:10:02,035
desktop tower with a giant GPU in it.
191
00:10:02,065 --> 00:10:05,185
granted, I guess there are
models that run on CPUs, but I
192
00:10:05,185 --> 00:10:07,195
am not the model expert here.
193
00:10:07,405 --> 00:10:11,075
I've been trying to catch up this year
because I actually do have a decent
194
00:10:11,075 --> 00:10:15,345
laptop now with an M4 in it so I can
run at least some of the Apple models.
195
00:10:15,675 --> 00:10:18,415
But this Docker Model Runner,
196
00:10:18,818 --> 00:10:22,318
it's a pretty cool feature, but
you can pick your models, so
197
00:10:22,318 --> 00:10:23,458
you can pick a really small one.
198
00:10:24,458 --> 00:10:28,178
On Windows, it has to be an
NVIDIA GPU to work.
199
00:10:28,188 --> 00:10:28,828
Is that right?
200
00:10:29,272 --> 00:10:32,542
So it supports both NVIDIA and
Adreno, so if you've got a
201
00:10:32,572 --> 00:10:35,298
Qualcomm chip, it'll work there.
202
00:10:35,668 --> 00:10:39,238
Okay, and on Mac, it just uses system
memory because that's how Macs work.
203
00:10:39,238 --> 00:10:41,768
They have the unified memory, right?
204
00:10:42,188 --> 00:10:45,588
And I mean, for me, it's been
great because I have a brand
205
00:10:45,588 --> 00:10:46,908
new machine with 48 gig of RAM.
206
00:10:46,948 --> 00:10:48,378
I know that is not normal.
207
00:10:48,808 --> 00:10:54,395
but these models on a Mac can be a little
problematic, because it's like, if I have
208
00:10:54,395 --> 00:10:57,475
a bunch of browser tabs open, I can't
run the full size model that I want to
209
00:10:57,475 --> 00:11:00,545
run, because that's one of the problems
is like, it's all using the same thing.
210
00:11:00,545 --> 00:11:03,635
So I have to do like we did back with
VMs, I have to shut all my apps down
211
00:11:03,975 --> 00:11:06,555
and then run, because I always want
to run the biggest model possible.
212
00:11:06,555 --> 00:11:09,390
So I have this like 40, 46 gig model.
213
00:11:09,390 --> 00:11:11,820
I think it's the new Devstral
model that's supposed to be
214
00:11:11,820 --> 00:11:13,110
great for local development.
215
00:11:13,570 --> 00:11:16,550
So for those of you listening, if you
want to go in depth, we're not going to
216
00:11:16,580 --> 00:11:20,440
have time for that. But Docker
Model Runner lets you run models locally.
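For a concrete picture of what that looks like from code: Model Runner exposes an OpenAI-compatible HTTP API, so a minimal client sketch might read as follows (the port, path, and model name are assumptions, not from the episode; check your own Docker Desktop settings):

```python
import json
import urllib.request

# Docker Model Runner speaks the OpenAI chat-completions API. The host
# TCP port (12434 here) must be enabled in Docker Desktop, and the model
# name below is just an example.
BASE_URL = "http://localhost:12434/engines/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(model: str, prompt: str) -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Swapping models is then a one-string change, which is part of the appeal.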
217
00:11:20,820 --> 00:11:23,820
you can technically run this now
on Docker infrastructure, right?
218
00:11:23,820 --> 00:11:25,070
Which we're going to
get to in a little bit.
219
00:11:25,759 --> 00:11:27,089
dun dun.
220
00:11:27,130 --> 00:11:27,450
Okay.
221
00:11:27,600 --> 00:11:29,290
Yeah, there is this thing called Offload.
222
00:11:29,380 --> 00:11:31,010
Michael's going to know
way more about that.
223
00:11:31,010 --> 00:11:33,850
Cause that's a really new feature,
but you don't have to actually
224
00:11:33,850 --> 00:11:34,970
run these models locally now.
225
00:11:34,970 --> 00:11:36,290
Like you could offload them.
226
00:11:36,290 --> 00:11:36,530
Right.
227
00:11:36,930 --> 00:11:37,240
Okay.
228
00:11:37,540 --> 00:11:38,530
What are you seeing?
229
00:11:38,530 --> 00:11:42,206
are people coming to Docker, like trying
to run the biggest models possible?
230
00:11:42,206 --> 00:11:45,446
They got like multiple GPUs or are you
just seeing people kind of tinkering around?
231
00:11:45,546 --> 00:11:46,556
What are some analogies there?
232
00:11:46,995 --> 00:11:48,655
there's a little bit of everything.
233
00:11:48,835 --> 00:11:52,785
I'd say a lot of folks are still exploring
the space and figuring out the
234
00:11:52,785 --> 00:11:54,765
right ways of doing things, too.
235
00:11:54,765 --> 00:11:58,205
So, one of the interesting things is,
and one of the things to keep in mind
236
00:11:58,775 --> 00:12:02,585
is this: let's break out of the AI space.
237
00:12:02,585 --> 00:12:04,595
when you say, for example, a database.
238
00:12:04,625 --> 00:12:04,805
Okay.
239
00:12:04,805 --> 00:12:05,945
I'm using a Postgres database.
240
00:12:06,365 --> 00:12:10,585
If I'm using a Postgres database, a
managed offering out in the cloud.
241
00:12:10,645 --> 00:12:10,795
Okay.
242
00:12:10,795 --> 00:12:11,965
Let's just say RDS.
243
00:12:12,115 --> 00:12:12,355
Okay.
244
00:12:13,155 --> 00:12:17,085
during development I can run a Postgres
container and yeah, it's a smaller
245
00:12:17,085 --> 00:12:18,795
version and it's not a managed offering.
246
00:12:19,075 --> 00:12:21,595
and that works because all the
binary protocols are the same.
247
00:12:21,595 --> 00:12:22,915
I can basically just swap out
248
00:12:23,120 --> 00:12:24,870
my connection URL, and it just works.
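A minimal sketch of that swap in application code (the URLs and variable name here are illustrative, not from the episode):

```python
import os

# Development defaults to a local Postgres container; production sets
# DATABASE_URL to the managed endpoint (e.g. an RDS hostname). Because
# the wire protocol is identical, nothing else in the app changes.
LOCAL_DEFAULT = "postgresql://app:secret@localhost:5432/appdb"

def database_url() -> str:
    """Return the connection URL, preferring the environment override."""
    return os.environ.get("DATABASE_URL", LOCAL_DEFAULT)
```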
249
00:12:25,360 --> 00:12:26,690
But models are a little different.
250
00:12:27,130 --> 00:12:30,430
And so, I've seen some folks that
have been like, I'm going to use a
251
00:12:30,450 --> 00:12:34,810
smaller version of a model during local
development, but then when I deploy,
252
00:12:34,840 --> 00:12:39,637
I'm going to use the larger hosted
model that my cloud provider provides.
253
00:12:39,727 --> 00:12:43,937
Okay, so maybe it's a parameter
change, I'm using fewer parameters
254
00:12:43,937 --> 00:12:46,347
so it can actually run locally,
and then use the larger version.
255
00:12:47,977 --> 00:12:52,497
The analogy of, okay, me running a
local PostgreSQL container and me
256
00:12:52,497 --> 00:12:58,117
running a local model, if it's a
different model, it's a different model.
257
00:12:58,707 --> 00:13:02,347
And the results that you get from that
model are going to vary quite a bit.
258
00:13:02,627 --> 00:13:05,077
And so that's one of the things that
we have to kind of keep reminding
259
00:13:05,077 --> 00:13:08,562
people: if you use, for example, a four
billion parameter model
260
00:13:08,562 --> 00:13:09,838
during local development, then yes,
it can fit on your machine,
261
00:13:09,838 --> 00:13:11,206
but if you deploy and you're using
262
00:13:11,206 --> 00:13:16,047
the 32 billion parameter version in
production, those are very different
263
00:13:16,057 --> 00:13:18,627
models and you're going to get very
different interactions and different
264
00:13:18,637 --> 00:13:20,177
outputs from the models there.
265
00:13:20,177 --> 00:13:23,457
So, it's something to keep in
mind as folks are looking at
266
00:13:23,717 --> 00:13:24,707
building their applications.
267
00:13:26,157 --> 00:13:28,617
So, where are we seeing folks use this?
268
00:13:28,657 --> 00:13:33,132
Well, of course, if you can use
the same model across your entire
269
00:13:33,282 --> 00:13:35,822
software development life cycle,
then that works out pretty well.
270
00:13:36,322 --> 00:13:40,742
but we're also starting to see a little
bit of a rise of using basically
271
00:13:40,752 --> 00:13:45,092
fine-tuned, use-case-specific models
or folks training their own models.
272
00:13:45,422 --> 00:13:48,442
and using those for the
specific application.
273
00:13:48,832 --> 00:13:53,422
And again, that tends to be a little
bit smaller, more use-case specific.
274
00:13:53,652 --> 00:13:55,892
And then, yes, that
makes sense to use there.
275
00:13:56,242 --> 00:13:58,312
okay, I need to be able
to run that on my own.
276
00:13:58,722 --> 00:13:59,392
et cetera.
277
00:13:59,442 --> 00:14:02,152
again, I think a lot of folks are
still feeling out the space and
278
00:14:02,152 --> 00:14:05,362
figuring out exactly: how should I
think about this, how should I use this?
279
00:14:05,702 --> 00:14:08,532
and of course, the tooling has to
exist kind of before you can actually
280
00:14:08,532 --> 00:14:10,102
do a lot of those experiments.
281
00:14:10,102 --> 00:14:13,012
So, in many ways, we've been
building out that tooling to help
282
00:14:13,112 --> 00:14:14,582
support that experimentation.
283
00:14:15,052 --> 00:14:17,792
but I think in many ways, folks are still
figuring out exactly what this is going
284
00:14:17,792 --> 00:14:19,242
to look like for them going forward.
285
00:14:19,647 --> 00:14:20,047
Yeah.
286
00:14:20,361 --> 00:14:23,531
I'm trying to imagine what enterprises are
doing and building out and I'm imagining
287
00:14:23,531 --> 00:14:29,421
it's not like this, but one of my
imaginations is reminding me of 20 years
288
00:14:29,421 --> 00:14:33,581
ago, buying a Google box, which I don't
remember the name of, they called it, but
289
00:14:33,581 --> 00:14:37,331
it was this appliance you would put in
your data center that was, it was yellow.
290
00:14:37,381 --> 00:14:37,891
Racked
291
00:14:38,107 --> 00:14:39,067
search appliance.
292
00:14:39,491 --> 00:14:39,771
There you go.
293
00:14:39,771 --> 00:14:40,071
Yeah.
294
00:14:40,351 --> 00:14:43,381
And I don't know if Nirmal, if you had
any customers back then with, if you
295
00:14:43,381 --> 00:14:44,811
were a consultant back then, but that,
296
00:14:45,047 --> 00:14:46,507
I can't name those customers.
297
00:14:46,831 --> 00:14:50,921
So, I can. I was in the City of
Virginia Beach, running the IT and
298
00:14:50,921 --> 00:14:53,131
the data center there, or at least
running the engineering groups.
299
00:14:53,461 --> 00:14:57,771
And those were back in the days
where we didn't want Google indexing
300
00:14:57,771 --> 00:14:59,481
our internal infrastructure.
301
00:14:59,631 --> 00:15:02,481
You don't want your internal data to be
302
00:15:02,784 --> 00:15:07,154
accessed or used potentially
by this big IT conglomerate.
303
00:15:07,434 --> 00:15:07,674
Yeah.
304
00:15:07,674 --> 00:15:11,724
And so they sell you
an on-prem box and you put it in your
305
00:15:11,724 --> 00:15:14,894
data center and it would scan and have
all access to everything you could give
306
00:15:14,894 --> 00:15:18,384
it, back when apps didn't really have
their own search, Google was providing
307
00:15:18,384 --> 00:15:22,014
that for us, and they would index
our email and index our file servers
308
00:15:22,014 --> 00:15:24,724
and databases, if we wanted to give
it access to them. I'm not sure that's
309
00:15:24,734 --> 00:15:29,914
around anymore, but, at the time, Google
wasn't going to give away their software
310
00:15:29,914 --> 00:15:31,344
and we didn't all know how to run it.
311
00:15:31,694 --> 00:15:33,434
And, I'm sure it was running on Linux.
312
00:15:33,434 --> 00:15:37,864
And at the time in the mid 2000s,
we weren't yet Linux at the city.
313
00:15:38,624 --> 00:15:39,684
For different reasons.
314
00:15:39,684 --> 00:15:41,934
This kind of feels like the same
moment for enterprise where.
315
00:15:42,254 --> 00:15:44,814
They're going to have to buy GPUs
probably for the first time if they're
316
00:15:44,814 --> 00:15:47,314
going to run it on prem, they're
going to want to keep it separate.
317
00:15:47,654 --> 00:15:50,114
They're going to either buy something
for the first time, probably GPUs, if they're
318
00:15:50,114 --> 00:15:53,484
not training models, they just want to
run things internally to access their
319
00:15:53,484 --> 00:15:58,004
internal data, or maybe they're doing
it in the cloud and Nirmal's company,
320
00:15:58,034 --> 00:16:04,194
AWS, is providing them the GPUs and
then presumably because they're getting
321
00:16:04,194 --> 00:16:07,614
dedicated hardware, they won't have
to worry about OpenAI or Anthropic
322
00:16:07,624 --> 00:16:08,894
having access to all their stuff.
323
00:16:08,914 --> 00:16:12,624
So with respect to Model Runner, and
again, just a reminder, these are
324
00:16:12,624 --> 00:16:16,074
my own opinions and not that of
my employer, Amazon Web Services.
325
00:16:16,375 --> 00:16:18,065
There's still a lot of use cases.
326
00:16:18,075 --> 00:16:21,055
what Michael was talking about with
respect to choosing lots of different
327
00:16:21,065 --> 00:16:26,625
models for different types of tasks, I
think there's probably a hybrid model
328
00:16:26,625 --> 00:16:32,150
at some point where folks are using
different fine tuned niche models
329
00:16:32,530 --> 00:16:37,430
for specific tasks that they're doing
locally, and then as hardware improves,
330
00:16:37,430 --> 00:16:43,260
and hopefully, you know, maybe your 3
year old developer laptop that you get
331
00:16:43,290 --> 00:16:47,980
at your corporation has enough, or the
models get, you know, optimized
332
00:16:47,980 --> 00:16:52,760
enough that they can run on CPU or the
GPUs that you have on a corporate laptop,
333
00:16:52,770 --> 00:16:57,230
but there'll be some tooling, probably
embedded into the development tooling
334
00:16:57,550 --> 00:16:59,690
and, or you can choose your own models.
335
00:17:00,200 --> 00:17:04,190
And then there'll be other models
where you need to reach out to the
336
00:17:04,240 --> 00:17:08,840
hyperscalers, because you're just
not going to get the depth of reasoning,
337
00:17:08,850 --> 00:17:13,340
you're not going to get the depth of
knowledge that you will from something
338
00:17:13,340 --> 00:17:18,640
like Claude in the cloud, versus something
like a quantized DeepSeek
339
00:17:18,640 --> 00:17:24,700
running on your Mac. But again, this
is all changing very, very fast. Right
340
00:17:24,700 --> 00:17:30,740
now, we're in a different state where
folks have to spend money to access
341
00:17:30,790 --> 00:17:35,620
those larger models, which hasn't been
the pattern with respect to software
342
00:17:35,620 --> 00:17:37,990
development in a long time, right?
343
00:17:38,210 --> 00:17:39,900
Not everyone has that advantage, right?
344
00:17:40,070 --> 00:17:45,280
I mean, if you want to use Claude Code
and, not hit any major limits, then
345
00:17:45,280 --> 00:17:50,480
you've got to pay $200 a month, and not
everyone's going to be able to afford
346
00:17:50,480 --> 00:17:54,990
$2,400 a year to do software development.
There's also edge use cases, right?
347
00:17:54,990 --> 00:17:57,930
So IoT devices, just
trying to figure it out.
348
00:17:57,930 --> 00:17:59,020
Also just kicking the tires.
349
00:17:59,020 --> 00:18:01,880
I think that's probably the main,
like Michael said, I think that's
350
00:18:01,890 --> 00:18:04,950
just, everyone is just trying to kick
the tires as cheaply as possible.
351
00:18:05,156 --> 00:18:08,386
Model Runner feels like a gateway, like
it feels like a gateway drug to get
352
00:18:08,396 --> 00:18:13,616
me hooked on, like, the idea, 'cause I, I
mean, I wasn't paying attention to, I
353
00:18:13,616 --> 00:18:18,236
wasn't a machine, I wasn't an ML person
or I wasn't building AI infrastructure,
354
00:18:18,256 --> 00:18:21,846
but Docker Model Runner and, well, and,
you know, to a lesser extent, Ollama,
355
00:18:22,166 --> 00:18:25,056
Ollama always felt like it was more
for those people that were doing that,
356
00:18:25,056 --> 00:18:28,916
but bringing this capability into a
tool that I'm already using actually
357
00:18:28,936 --> 00:18:34,136
felt like, Okay, this is meant for me
now, like this is, I don't have to be,
358
00:18:34,226 --> 00:18:39,736
I don't have to understand weights and
all the intricacies of how models are
359
00:18:39,746 --> 00:18:43,776
built and work and the different, I
don't have to, I don't have to understand
360
00:18:43,786 --> 00:18:46,976
the different file format and whether
that works on my particular thing.
361
00:18:47,026 --> 00:18:47,976
it just all kind of works.
362
00:18:47,976 --> 00:18:49,656
It's Docker easy at that point for me.
363
00:18:51,355 --> 00:18:55,695
I think that's the key point here of
just, let's try to increase the access
364
00:18:55,695 --> 00:18:59,935
to these capabilities and let folks
start to experiment, and I think, again,
365
00:18:59,935 --> 00:19:03,615
we, as an industry, are trying to still
figure out exactly how to use a lot
366
00:19:03,615 --> 00:19:08,545
of these tools and, okay, what is the
right size model for the job at hand?
367
00:19:08,965 --> 00:19:12,225
and so, again, in order to be able
to do that experimentation, you
368
00:19:12,775 --> 00:19:14,185
have to increase the access to it.
369
00:19:14,565 --> 00:19:17,305
and so, but that's still kind of,
you know, Where we are in many ways.
370
00:19:17,632 --> 00:19:19,622
So did we check off
another thing on the list?
371
00:19:20,381 --> 00:19:23,751
I'm about to, I'm gonna have to keep
saying this: we're not a news podcast.
372
00:19:23,771 --> 00:19:28,591
We don't talk about the latest things
like in general in tech but you
373
00:19:28,591 --> 00:19:31,232
know, I got to give a shout out to
Fireship, because between that
374
00:19:31,232 --> 00:19:34,082
and a few other channels, I've really
learned a lot this year about models.
375
00:19:34,392 --> 00:19:38,152
And I was, there's this little,
diagram of sort of the state of a lot
376
00:19:38,152 --> 00:19:41,602
of the open-weight, or free,
I'm just going to say free models.
377
00:19:41,852 --> 00:19:44,562
They're free in some capacity,
that you can download.
378
00:19:44,592 --> 00:19:45,912
And, I was using Devstral.
379
00:19:45,932 --> 00:19:48,422
We talked about it a couple of
times on this show already, which
380
00:19:48,422 --> 00:19:49,802
is a model that came out in May.
381
00:19:49,842 --> 00:19:51,122
I actually had a newsletter on that.
382
00:19:51,122 --> 00:19:52,302
Shout out to Bret.news.
383
00:19:52,302 --> 00:19:54,702
There's this guy, Bret,
he makes a newsletter, Bret.news.
384
00:19:54,852 --> 00:19:56,002
You can go check that out.
385
00:19:56,282 --> 00:20:01,372
and I talked about this, that maybe this
was the sweet spot because it was small
386
00:20:01,372 --> 00:20:07,212
enough, you could run it with a modern
GPU or modern Mac, and it wasn't the
387
00:20:07,212 --> 00:20:11,632
worst, nothing like the frontier models
that we get with OpenAI and Anthropic,
388
00:20:11,632 --> 00:20:13,682
but it was something that was better.
389
00:20:15,342 --> 00:20:16,792
And, that's called Devstral.
390
00:20:17,302 --> 00:20:19,962
And then we had Qwen 3
just come out, I don't know, a
391
00:20:19,962 --> 00:20:22,872
week ago or something that is
392
00:20:23,198 --> 00:20:24,478
I think it was earlier this week.
393
00:20:24,892 --> 00:20:27,222
Oh, see, time warp of AI,
394
00:20:27,378 --> 00:20:27,778
I think it
395
00:20:27,922 --> 00:20:29,262
three days is like three weeks.
396
00:20:29,746 --> 00:20:30,926
It was two days ago.
397
00:20:31,198 --> 00:20:31,648
Yeah,
398
00:20:31,912 --> 00:20:32,752
Oh gosh.
399
00:20:33,242 --> 00:20:35,972
I always assume that I'm seeing
Fireship videos late, but, yeah,
400
00:20:35,972 --> 00:20:37,432
one day ago, so yeah, that's true.
401
00:20:37,432 --> 00:20:41,387
One day ago, there's a newer model
coming out from Alibaba, right?
402
00:20:41,467 --> 00:20:45,687
And it is even better, although
it does take more GPU memory, I
403
00:20:45,918 --> 00:20:48,438
that's hard to run like on a laptop.
404
00:20:48,598 --> 00:20:54,901
I don't think you can, so that's a perfect
encapsulation of like why Docker Model
405
00:20:54,901 --> 00:20:59,951
Runner is there, but also why it's going
to be a mix of models going forward.
406
00:21:00,421 --> 00:21:05,751
so we've, we have an AI that helps
you use Docker and get started.
407
00:21:05,871 --> 00:21:11,121
We have a tool that helps you run
models locally. Understanding that,
408
00:21:11,481 --> 00:21:13,611
what's the next step in this journey?
409
00:21:14,088 --> 00:21:17,378
by the way, you go to Docker Hub
to look for models, or you can
410
00:21:17,378 --> 00:21:20,388
do it in Docker Desktop, or you
can look at things in the CLI.
411
00:21:20,398 --> 00:21:22,148
there's many ways to see the models.
412
00:21:22,508 --> 00:21:24,128
you can pull things from HuggingFace now.
413
00:21:24,351 --> 00:21:24,981
That's pretty sweet.
414
00:21:25,090 --> 00:21:26,660
Can we build, can we make our own yet?
415
00:21:26,660 --> 00:21:27,500
Or do we have that?
416
00:21:28,480 --> 00:21:28,730
We can
417
00:21:28,874 --> 00:21:32,464
so, you can package, but you would
have to already have the GGUF
418
00:21:32,484 --> 00:21:33,914
file and all that kind of stuff.
419
00:21:33,914 --> 00:21:36,174
So we don't have a lot of
the tooling there to help you
420
00:21:36,174 --> 00:21:37,814
actually create the model itself.
421
00:21:38,004 --> 00:21:41,554
Although a lot of folks will use
container-based environments to do that.
422
00:21:41,914 --> 00:21:44,264
we don't have any specific
tooling around that ourselves.
423
00:21:44,275 --> 00:21:46,385
So is there, you're saying there's
a doc, oh, there, there is.
424
00:21:46,385 --> 00:21:47,375
Oh, I didn't realize.
425
00:21:47,415 --> 00:21:52,685
So there is now a Docker model
package CLI for creating, basically
426
00:21:52,685 --> 00:21:59,615
wrapping the GGUF or, other models
or whatever, into the Docker OCI
427
00:21:59,625 --> 00:22:04,895
standard format for shipping and
pulling and pushing, essentially, Docker
428
00:22:05,505 --> 00:22:08,535
models that are in Docker Hub
that are in the Docker format.
429
00:22:09,034 --> 00:22:10,004
OCI format.
430
00:22:10,044 --> 00:22:10,434
Yep.
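The packaging flow described here can be sketched as a few CLI calls. This is only a sketch: the Hugging Face repo and the `myorg/my-model` name below are placeholders, and the exact `docker model package` flags may differ between Docker Desktop versions, so check `docker model --help` before relying on them.

```shell
# Sketch of the GGUF-to-OCI flow (placeholder names; verify flags
# with `docker model --help` on your Docker Desktop version).

# Pull a model straight from Hugging Face into the local store:
docker model pull hf.co/some-org/some-model-GGUF

# Wrap an existing local GGUF file as an OCI artifact and push it:
docker model package --gguf ./my-model.gguf --push myorg/my-model:latest

# It then pulls and runs like anything else in the catalog:
docker model run myorg/my-model:latest "Hello there"
```

The point of the OCI wrapping is just what's said above: the model ships, pushes, and pulls through the same registry plumbing as any image.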
431
00:22:11,484 --> 00:22:15,324
So when we look at agentic applications,
step one is, you need models.
432
00:22:15,324 --> 00:22:17,174
It's kind of the brains of the operation.
433
00:22:17,469 --> 00:22:19,029
and then you need tools.
434
00:22:19,029 --> 00:22:21,439
And so you've got highlighted
here the MCP toolkit.
435
00:22:21,439 --> 00:22:25,299
That was kind of the first
adventure into the MCP space.
436
00:22:25,299 --> 00:22:29,419
And that one was focused a little bit
more on how do we provide tools to the
437
00:22:29,429 --> 00:22:32,539
other agentic applications that you
are already running on your machine.
438
00:22:32,539 --> 00:22:39,929
So Claude Desktop or using VS Code
Copilot on my machine or Cursor, etc.
439
00:22:40,579 --> 00:22:44,709
How do we provide those MCP servers?
440
00:22:45,599 --> 00:22:49,079
In containerized ways, with secure
credential injection, etc.
441
00:22:49,259 --> 00:22:52,179
Basically manage the life
cycle of those MCP servers.
442
00:22:52,559 --> 00:22:57,809
Again, in the use case of connecting
them to your other agentic applications.
443
00:22:58,449 --> 00:23:00,700
And so, again, this is kind of
where we started our MCP journey.
444
00:23:01,080 --> 00:23:04,520
if you see flipping through a couple
of these, actually we just released a
445
00:23:04,520 --> 00:23:09,030
Docker Hub MCP server that allows you
to search for images on Hub or, you
446
00:23:09,030 --> 00:23:13,070
know, those within your organization,
which is super helpful for like maybe
447
00:23:13,460 --> 00:23:15,520
a write me a Dockerfile that does X.
448
00:23:15,550 --> 00:23:16,130
Well, cool.
449
00:23:16,130 --> 00:23:18,740
Let's go find that the right image
that should be used for that.
450
00:23:19,140 --> 00:23:22,060
so again, starts to open up some of
these, additional capabilities here.
451
00:23:22,826 --> 00:23:23,286
Yeah.
452
00:23:23,286 --> 00:23:28,956
And inside the Docker Desktop UI,
there is now a beta tab, essentially,
453
00:23:29,346 --> 00:23:30,846
that's called MCP Toolkit.
454
00:23:31,346 --> 00:23:39,796
And it is a GUI that allows me to explore
and enable one of 141 different tools
455
00:23:39,836 --> 00:23:41,636
and growing that Docker has added.
456
00:23:41,966 --> 00:23:47,676
So like a lot of the other places on the
internet that either they host models
457
00:23:47,676 --> 00:23:53,346
like Anthropic or OpenAI, or they're a
place where you can create AI applications
458
00:23:53,626 --> 00:23:58,036
and all those places have started to
create their own little portals for
459
00:23:58,036 --> 00:24:01,516
finding tools and they may or may not,
I mean, most of them all now settle on
460
00:24:01,516 --> 00:24:05,286
MCP, but before we had really MCP as
the protocol standard, they were already
461
00:24:05,286 --> 00:24:09,386
doing like OpenAI was doing this before,
but they were very, it was proprietary.
462
00:24:09,386 --> 00:24:14,116
You don't know how Evernote or
Notion got, showed up as a tool
463
00:24:14,126 --> 00:24:16,556
feature in ChatGPT, it did.
464
00:24:17,371 --> 00:24:21,021
But we just assumed that was their
custom integration and now we have this
465
00:24:21,021 --> 00:24:24,531
standard called MCP that everything
should interact with everything
466
00:24:24,551 --> 00:24:26,941
properly the way that they should.
467
00:24:27,631 --> 00:24:29,491
At least right now it's
the one that's winning.
468
00:24:29,491 --> 00:24:31,531
We don't know whether we'll
still be talking about MCP in
469
00:24:31,531 --> 00:24:33,321
five years, but it's here now.
470
00:24:33,631 --> 00:24:34,751
It's what we're talking about now.
471
00:24:35,091 --> 00:24:39,034
And this lights up a lot of capabilities.
472
00:24:39,544 --> 00:24:42,334
In other words, you turn on an MCP tool.
473
00:24:43,334 --> 00:24:46,184
And that sits behind
something called MCP Gateway.
474
00:24:46,434 --> 00:24:48,304
So tell me, what is MCP Gateway?
475
00:24:48,749 --> 00:24:49,219
Yeah.
476
00:24:49,219 --> 00:24:53,719
So at the end of the day, the toolkit is
a combination of several different things.
477
00:24:53,749 --> 00:24:57,199
the MCP gateway is actually
a component that we just open
478
00:24:57,199 --> 00:24:58,539
sourced at WeAreDevelopers.
479
00:24:58,539 --> 00:25:00,054
So you can actually run
this gateway directly.
480
00:25:00,344 --> 00:25:01,844
In a container, completely on its own.
481
00:25:02,314 --> 00:25:05,724
And the MCP gateway is what's
actually responsible for managing
482
00:25:05,734 --> 00:25:07,554
the lifecycle of these MCP servers.
483
00:25:07,711 --> 00:25:10,471
it itself is an MCP server.
484
00:25:10,511 --> 00:25:12,531
think of it more like an MCP proxy.
485
00:25:12,731 --> 00:25:17,351
it exposes itself as an MCP server that
then can connect to your applications.
486
00:25:17,661 --> 00:25:20,596
But when you ask that server,
Hey, what tools do you have?
487
00:25:21,056 --> 00:25:25,146
It's really delegating, or I mean,
it's using cache versions of, okay,
488
00:25:25,146 --> 00:25:27,546
what are the downstream MCP servers?
489
00:25:28,316 --> 00:25:31,336
And so it's acting as
basically a proxy here.
490
00:25:31,946 --> 00:25:37,141
so when requests come in and say, hey,
you know, from the agentic app, if
491
00:25:37,141 --> 00:25:42,233
I want to execute this tool, go do a
search on DuckDuckGo at that point, the
492
00:25:42,233 --> 00:25:47,353
MCP gateway will actually spin up the
container, that DuckDuckGo MCP server,
493
00:25:47,643 --> 00:25:52,013
and then delegate the request to that
container, which then does the search, and
494
00:25:52,013 --> 00:25:54,043
then the MCP gateway returns the results.
495
00:25:54,043 --> 00:25:58,023
So kind of think of it as a proxy
that's managing the lifecycle of all
496
00:25:58,023 --> 00:26:01,803
those containers, but also, you know,
injecting the credentials, configuration.
497
00:26:02,143 --> 00:26:05,203
it also does other things like actually
looking at what's going in and out of
498
00:26:05,203 --> 00:26:09,233
the prompts going through the proxy to
make sure, you know, secrets aren't being
499
00:26:09,233 --> 00:26:11,243
leaked or, that kind of stuff as well too.
500
00:26:11,293 --> 00:26:14,743
and we're even starting to do some
further explorations of what are other
501
00:26:14,743 --> 00:26:16,883
ways to kind of secure those MCP servers?
502
00:26:16,933 --> 00:26:20,283
You know, for example, a file system
one should never have network access.
503
00:26:20,483 --> 00:26:24,473
Cool, so let's, when that container
starts, you get no network access,
504
00:26:24,493 --> 00:26:27,923
or, the GitHub MCP server, you know,
it's talking to the GitHub APIs.
505
00:26:29,178 --> 00:26:32,898
Let's only authorize those host
names that it can communicate with.
506
00:26:32,908 --> 00:26:35,988
So, you know, start to do a little
bit more of a kind of permissioning
507
00:26:35,998 --> 00:26:40,238
model around these MCP servers, which
is where a lot of people are kind of
508
00:26:40,238 --> 00:26:44,038
most cautious and nervous about MCP
servers, because it's, they're completely
509
00:26:44,038 --> 00:26:48,458
autonomous for the most part, and you
have to trust what's going on there.
510
00:26:48,478 --> 00:26:52,338
this exact feature is both
necessary and also Solomon Hyke's
511
00:26:52,358 --> 00:26:54,128
prediction from a month ago.
512
00:26:54,128 --> 00:26:57,408
He was on this show and was saying
that we're going to see all these
513
00:26:57,408 --> 00:27:00,398
infrastructure companies and all
these tooling companies that are
514
00:27:00,398 --> 00:27:01,908
going to offer to lock this shit down.
515
00:27:01,908 --> 00:27:03,788
I think it's the quote
I have to get from him.
516
00:27:04,217 --> 00:27:04,777
that sounds right.
517
00:27:04,928 --> 00:27:08,578
he compared the origin of containers
and how it started with developers.
518
00:27:08,578 --> 00:27:12,268
And then eventually IT took it and sort
of managed it in the infrastructure
519
00:27:12,268 --> 00:27:16,708
layer and provided all the restrictions
and security and limitations and
520
00:27:16,738 --> 00:27:17,858
configuration and all this stuff.
521
00:27:18,268 --> 00:27:20,618
And the same thing's happening to MCP.
522
00:27:20,828 --> 00:27:25,348
Where it started out as a developer tool
to empower developers to do all these
523
00:27:25,348 --> 00:27:29,328
cool things with AI that they couldn't do
and let the AI actually do stuff for us.
524
00:27:29,588 --> 00:27:34,038
And now very quickly in a matter of
months, IT is coming in and saying,
525
00:27:34,038 --> 00:27:35,218
okay, we're going to lock this down.
526
00:27:35,218 --> 00:27:35,898
It's crazy.
527
00:27:35,908 --> 00:27:39,088
You can, you know, your prompts can
delete your, drop your databases,
528
00:27:39,088 --> 00:27:42,748
your, as we just saw happen on
the internet recently this week.
529
00:27:43,038 --> 00:27:48,598
I want to, on this gateway topic
though, it can sound complicated.
530
00:27:49,013 --> 00:27:52,403
And maybe the internals are a
little, and there was obviously
531
00:27:52,403 --> 00:27:54,003
code built into this program.
532
00:27:54,223 --> 00:27:58,353
But for those of us that aren't
maybe building agents yet, or
533
00:27:58,353 --> 00:28:03,593
like really getting into building
apps that use AIs in the app, this
534
00:28:03,593 --> 00:28:05,733
just appears as kind of magic.
535
00:28:05,773 --> 00:28:11,263
Like it, you go into the Docker Desktop
UI, I enable, I go through the toolkit,
536
00:28:11,273 --> 00:28:15,788
there's all these suggestions, everything
from, the MCP server for GitHub itself to
537
00:28:15,998 --> 00:28:20,418
an MCP server that could give me access
to Grafana data to accessing the Heroku
538
00:28:20,418 --> 00:28:24,248
API and you're looking at all these things
and you're just like, I'm enabling them.
539
00:28:24,258 --> 00:28:25,518
it's like a kid in a candy store.
540
00:28:25,518 --> 00:28:26,348
I'm just going check, check, check.
541
00:28:26,348 --> 00:28:27,778
Yeah, I want Notion.
542
00:28:27,778 --> 00:28:28,538
I want Stripe.
543
00:28:28,588 --> 00:28:31,518
I get them in a list, they're
enabled, which means they're
544
00:28:31,518 --> 00:28:32,968
not actually running, right?
545
00:28:32,968 --> 00:28:37,238
they're waiting to be called before
the gateway runs them in memory.
546
00:28:37,518 --> 00:28:40,448
That's all transparent to me, I
don't realize that's happening.
547
00:28:40,838 --> 00:28:47,108
And if I choose to use this toolkit
with Gordon, if I just go into Gordon,
548
00:28:47,158 --> 00:28:52,668
And in the Gordon AI, if I don't want
to run a local model myself, or I'm
549
00:28:52,848 --> 00:28:57,338
not using Claude Desktop or something
that gives me the ability to enable MCP
550
00:28:57,338 --> 00:29:02,318
tools, I can just go in here and say,
enable all my MCP tools, all 34 of them.
551
00:29:02,418 --> 00:29:08,188
I've got Git ones and I've got, and
so now what that means is the Gordon
552
00:29:08,188 --> 00:29:13,618
AI can now use these tools, which
makes this free AI even smarter.
553
00:29:14,008 --> 00:29:20,592
And I can say, is there a
Docker Hub image NGINX?
554
00:29:20,712 --> 00:29:21,812
I don't know if there is.
555
00:29:22,407 --> 00:29:22,787
Let's see.
556
00:29:22,817 --> 00:29:23,817
I've never even tested this.
557
00:29:23,817 --> 00:29:25,947
So it's kind of a, what could go wrong?
558
00:29:25,977 --> 00:29:28,137
let's use Google live on the
internet and see what happens.
559
00:29:28,137 --> 00:29:32,037
yeah, I was just saying, look at it,
going out and checking, using the
560
00:29:32,037 --> 00:29:37,297
Docker, the newly created Docker Hub
MCP tool that you had just released.
561
00:29:38,107 --> 00:29:46,027
so is this going through the MCP gateway
or is this not with the MCP gateway yet?
562
00:29:46,245 --> 00:29:53,165
when you and Gordon AI flip the switch to
say, yes, I want to use the MCP toolkit,
563
00:29:53,165 --> 00:29:59,319
basically what that's doing is, in the
Gordon AI application here, it's enrolling
564
00:29:59,529 --> 00:30:01,729
the MCP toolkit as an MCP server.
565
00:30:02,118 --> 00:30:05,796
And so then it's going to ask
the MCP toolkit, hey, what
566
00:30:05,796 --> 00:30:06,586
tools do you have available?
567
00:30:06,586 --> 00:30:10,086
And so when you saw that list of tools,
that's again coming from the gateway.
568
00:30:10,536 --> 00:30:15,961
Gordon is just simply treating the
MCP toolkit as an MCP server, which in
569
00:30:15,961 --> 00:30:18,141
itself is going to launch MCP servers.
570
00:30:18,141 --> 00:30:21,301
So that's kind of why I mentioned, it's
kind of thinking of it like a proxy there.
571
00:30:21,327 --> 00:30:22,707
Yeah, it does feel like one.
572
00:30:22,707 --> 00:30:23,197
Yeah.
573
00:30:23,307 --> 00:30:27,387
And this, for those watching, like it
didn't actually work because I don't
574
00:30:27,397 --> 00:30:30,377
actually have access to hardened images,
but, I just wanted to see what it would,
575
00:30:30,377 --> 00:30:32,117
what it'd say, but it, the, the UI
576
00:30:32,318 --> 00:30:33,648
Which is the right answer?
577
00:30:33,708 --> 00:30:34,788
Which is the right answer for
578
00:30:35,027 --> 00:30:36,317
did the right thing.
579
00:30:36,367 --> 00:30:39,607
it didn't expose a
vulnerability in the MCP server.
580
00:30:39,657 --> 00:30:43,047
But yeah, so it basically,
I can give Gordon AI more.
581
00:30:43,467 --> 00:30:46,407
And I can do a lot more functionality,
more abilities to do things without
582
00:30:46,417 --> 00:30:49,497
having to run my own model, without
having to figure out Claude Desktop.
583
00:30:49,767 --> 00:30:54,927
But I will say, because I'm in love with
this toolkit so much, because I love
584
00:30:54,927 --> 00:31:00,237
this idea of one place for my MCP tools,
for me to enter in the API secrets so it
585
00:31:00,237 --> 00:31:03,987
can access my Notion and my Gmail, but I
don't want to have to do that in Claude
586
00:31:04,007 --> 00:31:08,057
Desktop, and then in Warp, and then in
VS Code, and then in Docker, and then in
587
00:31:08,057 --> 00:31:10,637
Ollama and like every place I might run.
588
00:31:11,367 --> 00:31:14,127
A tool that needs MCP
tools or access to an LLM.
589
00:31:14,417 --> 00:31:16,807
So I did it in Docker Desktop.
590
00:31:16,827 --> 00:31:18,907
I enabled the ones and set
them up the way I wanted.
591
00:31:19,347 --> 00:31:25,537
And then inside of my tools around
my computer that all support AI MCPs,
592
00:31:26,537 --> 00:31:31,787
they all have now added MCP client
functionality that lets me talk to
593
00:31:31,847 --> 00:31:36,364
another MCP, any MCP server that
speaks proper MCP through their API.
594
00:31:36,694 --> 00:31:40,454
And in this case, what I've done in the
warp terminal, because it does support
595
00:31:40,464 --> 00:31:46,494
MCP, is I just tell it, the command it's
going to run is docker mcp gateway run.
596
00:31:46,904 --> 00:31:50,124
And it uses the standard in and
standard out, which is one of
597
00:31:50,124 --> 00:31:51,714
the ways that you can use MCP.
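Wiring another client to the toolkit the way Bret describes usually comes down to a single config entry. A minimal sketch, assuming the Claude Desktop-style `mcpServers` JSON shape that many clients accept; the file name, location, and exact schema vary per client, so treat this as illustrative:

```shell
# Sketch: a Claude Desktop-style MCP client config entry that points
# the client at the Docker MCP gateway over stdin/stdout. File name
# and location depend on which client you are configuring.
cat > mcp.json <<'EOF'
{
  "mcpServers": {
    "docker-toolkit": {
      "command": "docker",
      "args": ["mcp", "gateway", "run"]
    }
  }
}
EOF

# The client spawns this command and speaks MCP over stdio, so every
# tool enabled in Docker Desktop shows up through this one entry.
cat mcp.json
```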
598
00:31:52,124 --> 00:31:57,084
And then I suddenly have all 34 tools
that we enabled in my Docker Desktop.
599
00:31:57,514 --> 00:32:01,254
Available in Warp, just as long as Docker
Desktop's running, that's all I gotta do.
600
00:32:01,874 --> 00:32:06,864
And then, because Warp is using Claude
Sonnet 4, or whatever I told Warp to
601
00:32:06,864 --> 00:32:10,204
do, Docker doesn't care about that,
because I'm not asking it to use,
602
00:32:10,614 --> 00:32:12,704
I think, we talked about this, it's
called bring your own key, I guess
603
00:32:12,704 --> 00:32:13,664
that's what everybody's talking about.
604
00:32:14,024 --> 00:32:19,334
Uh, bring your own key is when you want
to bring your own model to whatever
605
00:32:19,384 --> 00:32:19,854
to access,
606
00:32:20,134 --> 00:32:21,464
your key to access the model.
607
00:32:21,464 --> 00:32:25,234
Yeah, but in Warp in particular,
like this is nuanced, but in
608
00:32:25,234 --> 00:32:28,434
Warp, you can't usually, you can't
yet access your own models.
609
00:32:28,444 --> 00:32:30,394
I think they're going to make that
an enterprise feature or something.
610
00:32:30,744 --> 00:32:32,754
But if I open up VS Code,
I could do the same thing.
611
00:32:32,754 --> 00:32:36,304
If I opened up a ChatGPT desktop,
I could do the same thing.
612
00:32:36,664 --> 00:32:41,494
And, Aider or like any of the CLI tools,
although anything that accepts MCP
613
00:32:41,524 --> 00:32:44,314
so far, I've gotten to work this way.
614
00:32:44,444 --> 00:32:49,914
And it's been awesome because all these,
all these different IDEs and AI tools
615
00:32:50,154 --> 00:32:51,324
all set up a little bit different.
616
00:32:51,324 --> 00:32:54,374
Goose you set up differently
than Claude Desktop.
617
00:32:54,374 --> 00:32:54,854
So
618
00:32:54,865 --> 00:32:55,565
Client.
619
00:32:56,225 --> 00:33:00,936
they're all, and all of
them have different knobs and ways of
620
00:33:00,936 --> 00:33:05,976
controlling MCP servers and at varying
degrees of control and flexibility.
621
00:33:06,416 --> 00:33:09,926
So this is really nice because
then you can also just have
622
00:33:09,936 --> 00:33:11,706
all those tools running if you
623
00:33:11,815 --> 00:33:12,265
right.
624
00:33:12,335 --> 00:33:15,295
They look like one giant MCP server.
625
00:33:15,295 --> 00:33:19,875
Yeah, because normally I would have to
add each MCP tool as its own server,
626
00:33:20,021 --> 00:33:22,811
I think, it feels like some of the
tools are all standardizing on Claude
627
00:33:22,831 --> 00:33:25,451
Desktop as, that config file, which I
628
00:33:25,563 --> 00:33:25,713
the mcp.json,
630
00:33:27,191 --> 00:33:31,771
Yeah, it feels like that, like everyone's
settling on just using that one file.
631
00:33:32,131 --> 00:33:35,731
Which is, I guess it's kind of feels
kind of hacky, but I guess it's fine.
632
00:33:36,211 --> 00:33:38,731
it feels like every editor
using my VIM settings.
633
00:33:38,731 --> 00:33:39,351
It's like, no, no, no, no.
634
00:33:39,381 --> 00:33:39,851
Calm down.
635
00:33:39,851 --> 00:33:42,041
I don't necessarily want you
all to use the same file.
636
00:33:42,261 --> 00:33:45,031
I don't want you all overwriting each
other and changing the same file.
637
00:33:45,081 --> 00:33:49,901
so you're bringing up a really important
thing, which is, since we're new to this,
638
00:33:50,001 --> 00:33:55,451
the number of MCP servers is not too many
yet, even though it does feel like there's
639
00:33:55,452 --> 00:33:56,571
probably like a lot of MCP servers.
640
00:33:56,572 --> 00:33:59,541
I've been using 10 new MCP servers
since we've started this conversation.
641
00:34:00,201 --> 00:34:04,451
But it's still like a number of tools
that we can still rationalize about.
642
00:34:04,881 --> 00:34:10,319
But probably in another month or two at
this rate, there is a limit to how many
643
00:34:10,329 --> 00:34:17,434
MCP tools one single client, I guess
you could say, or one instantiation of
644
00:34:17,434 --> 00:34:23,824
a task that you're using like Claude
Code or Cline, it's context window can
645
00:34:23,844 --> 00:34:25,924
only use a certain amount of tools.
646
00:34:26,354 --> 00:34:33,424
And so, are there some ideas about breaking
up the MCP gateway, like having maybe
647
00:34:33,424 --> 00:34:37,934
like sets of tools that have specific
supersets of tools or something like that?
648
00:34:38,014 --> 00:34:38,284
Yeah.
649
00:34:38,284 --> 00:34:38,914
Good question.
650
00:34:38,934 --> 00:34:40,314
And so that's a good call out.
651
00:34:40,314 --> 00:34:43,774
And so I actually want to, zoom in on
that just a tiny bit there, because
652
00:34:43,844 --> 00:34:46,584
for folks that may be new to this,
they may not quite understand that
653
00:34:47,154 --> 00:34:51,584
the way that tools work is basically,
it's taking all the tool descriptions.
654
00:34:51,904 --> 00:34:53,034
Okay, here's a tool name.
655
00:34:53,034 --> 00:34:54,704
Here's when I'm going to use this tool.
656
00:34:54,704 --> 00:34:57,214
Here's the parameters that are
needed to invoke this tool.
657
00:34:57,594 --> 00:35:00,584
And it's sending that to
the model on every request.
658
00:35:01,754 --> 00:35:05,364
And so the model's having to read
all that and basically say, hey,
659
00:35:05,364 --> 00:35:07,764
based on this conversation, hey,
here's a toolbox of stuff that
660
00:35:07,764 --> 00:35:09,564
I may or may not be able to use.
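To make that cost concrete, here is a sketch of one tool description in the common OpenAI-style function-calling shape. The `search_images` tool and its parameters are hypothetical; the point is that something like this rides along on every single request, once per tool, for every enabled server:

```shell
# Sketch: one tool description in the common OpenAI-style shape.
# The tool name and parameters here are made up for illustration.
cat > tool.json <<'EOF'
{
  "type": "function",
  "function": {
    "name": "search_images",
    "description": "Search Docker Hub for images matching a query.",
    "parameters": {
      "type": "object",
      "properties": {
        "query": { "type": "string", "description": "Search terms" }
      },
      "required": ["query"]
    }
  }
}
EOF

# Rough context cost of carrying just this one tool on every request:
wc -c < tool.json
```

Multiply that byte count by 30-plus enabled tools and it is easy to see why it eats into the context window before the conversation even starts.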
661
00:35:10,234 --> 00:35:15,889
But as Nirmal just pointed out, like
that takes context window there and
662
00:35:16,619 --> 00:35:21,099
granted yes some of the newer models
have incredibly huge context windows but
663
00:35:21,099 --> 00:35:25,589
depending on the use case, it's going
to affect your speed, it's going to
664
00:35:25,589 --> 00:35:29,559
affect the quality, and so yeah, you do
want to be careful of like, okay, I'm
665
00:35:29,559 --> 00:35:32,619
not just going to go in there and just
flip the box on all the MCP servers.
666
00:35:32,629 --> 00:35:33,739
Now you have access to everything.
667
00:35:33,739 --> 00:35:36,619
Like, you do want to be a little
conscious of that as well.
668
00:35:36,929 --> 00:35:40,599
in fact, I found it funny in a, I
was playing in Cursor not long ago,
669
00:35:40,599 --> 00:35:44,119
and you know, they have even just
YOLO mode, just go crazy with it.
670
00:35:44,529 --> 00:35:48,339
But even they have a warning once
you, I think it's after you enable
671
00:35:48,339 --> 00:35:54,009
the 31st tool of like, hey, heads
up, you're getting a little crazy.
672
00:35:54,499 --> 00:35:58,599
So like, I'm like, for the one that
has YOLO mode to call me out for being
673
00:35:58,599 --> 00:36:02,569
crazy for too many tools, like it was,
it's again, kind of a reminder of just,
674
00:36:03,029 --> 00:36:05,899
okay, you do want to be conscious of
the number of tools that you're using.
675
00:36:06,369 --> 00:36:07,619
so to actually answer the question.
676
00:36:07,619 --> 00:36:07,739
Yeah.
677
00:36:07,739 --> 00:36:12,099
It's been something that we've been
exploring and kind of waiting to just see
678
00:36:12,099 --> 00:36:14,169
what the feedback is gonna be on that.
679
00:36:14,229 --> 00:36:17,349
Are there separate tool sets
that, clients can connect to.
680
00:36:18,139 --> 00:36:21,929
you know, that's certainly a possibility
as well, since this MCP gateway is an
681
00:36:21,929 --> 00:36:26,779
open source container, when you run
this for your application, not only can
682
00:36:26,779 --> 00:36:30,229
you say, these are the servers I want,
but then you can even further filter
683
00:36:30,239 --> 00:36:33,559
through, these are the tools from those
servers that I actually want to expose.
684
00:36:33,559 --> 00:36:36,069
So, for example, I think the
GitHub official one is up to
685
00:36:36,069 --> 00:36:38,109
72 tools now or something.
686
00:36:38,349 --> 00:36:39,829
It's a crazy number.
687
00:36:40,179 --> 00:36:41,949
but most of the time, I only
need maybe three or four of them.
688
00:36:42,269 --> 00:36:43,559
So, I want to filter that.
689
00:36:43,559 --> 00:36:46,529
And that's why you see Claude and
VS Code and many of these others.
690
00:36:46,539 --> 00:36:50,309
Even though you're pulling in
these MCP servers, many of those
691
00:36:50,329 --> 00:36:53,249
provide client side functionality
to kind of filter that list as well,
692
00:36:53,249 --> 00:36:53,629
Yeah.
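Since the gateway is open source and runnable on its own, the server- and tool-filtering Michael describes can be sketched as flags on the run command. The flag names below are assumptions on my part, not confirmed syntax, so check `docker mcp gateway run --help` on your install, and the tool names are illustrative:

```shell
# Sketch: run the gateway standalone, exposing only one server and a
# handful of its tools instead of all 72 of the GitHub server's tools.
# --servers/--tools flag spellings are assumptions; verify with
# `docker mcp gateway run --help` before using.
docker mcp gateway run \
  --servers github-official \
  --tools get_issue,list_issues,create_pull_request
```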
693
00:36:54,290 --> 00:36:57,730
I wonder if we get to a state because
the MC, this is getting a little bit
694
00:36:57,730 --> 00:37:01,760
meta, but everything when you talk about
agentic AI gets meta really quickly.
695
00:37:02,450 --> 00:37:07,910
So, I wonder if the MCP gateway itself
is an MCP server, so it can rationalize
696
00:37:07,920 --> 00:37:12,700
about itself, I wonder if we get into
the pattern of, okay, there's this new
697
00:37:12,700 --> 00:37:17,850
task that I want this agent to do, and
the first thing, after it comes up with
698
00:37:17,850 --> 00:37:22,210
its task list, the steps it wants to
take, is go through that list and then,
699
00:37:22,690 --> 00:37:28,505
ask the MCP gateway to reconfigure
itself on each task and turn on only
700
00:37:28,505 --> 00:37:32,505
the ones that it identified as likely
the ones that it needs for that task.
701
00:37:32,815 --> 00:37:36,265
And just dynamically, at any
given time, don't have anything
702
00:37:36,265 --> 00:37:37,485
more than five running.
703
00:37:37,865 --> 00:37:38,905
So figure it out.
704
00:37:39,005 --> 00:37:41,555
You can choose whatever five
you want, but only have five.
705
00:37:41,618 --> 00:37:45,548
We've done some experiments with that,
not quite to that full dynamicness,
706
00:37:45,578 --> 00:37:49,548
but I've even done some ones of,
okay, here's a tool to enable other
707
00:37:49,548 --> 00:37:51,398
tools, is basically what it is.
708
00:37:51,848 --> 00:37:55,368
And, okay, and give me parameters
of, okay, do you need, GitHub?
709
00:37:55,378 --> 00:37:56,648
Do you need, Slack?
710
00:37:56,648 --> 00:37:59,768
You know, tell me what it is
that you need, and then I'll
711
00:37:59,768 --> 00:38:01,288
enable those specific things.
712
00:38:01,288 --> 00:38:05,378
And then what's cool then is,
as part of the MCP protocol,
713
00:38:05,378 --> 00:38:06,488
there's also notifications.
714
00:38:06,488 --> 00:38:10,393
So the MCP server can then notify
the client: hey, there's a new
715
00:38:10,393 --> 00:38:13,573
list of tools available, and then
the next API request to the model
716
00:38:13,863 --> 00:38:15,213
then has this new set of tools.
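That notification is part of the MCP spec: a plain JSON-RPC message with no `id`, since notifications expect no response. A minimal sketch of what the server sends, after which the client re-fetches the list with a `tools/list` request:

```shell
# Sketch: the MCP tools-changed notification as it appears on the
# wire. It carries no "id" field because JSON-RPC notifications do
# not get a reply; the client reacts by calling tools/list again.
cat > notification.json <<'EOF'
{
  "jsonrpc": "2.0",
  "method": "notifications/tools/list_changed"
}
EOF
cat notification.json
```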
717
00:38:15,985 --> 00:38:16,055
I
718
00:38:16,055 --> 00:38:16,635
think we're almost
719
00:38:16,763 --> 00:38:18,353
the capability is there, but,
720
00:38:19,055 --> 00:38:20,385
I think that's likely the next step.
721
00:38:21,423 --> 00:38:26,793
but it's also kind of like a,
yeah, how do you safeguard that?
722
00:38:26,853 --> 00:38:31,398
So it's, Yeah, it's an
interesting time period, for sure.
723
00:38:31,975 --> 00:38:35,445
we got an interesting question: is
MCP Gateway's intent to replace an
724
00:38:35,465 --> 00:38:37,545
API Gateway or in parallel to it?
725
00:38:37,935 --> 00:38:38,645
Great question.
726
00:38:38,645 --> 00:38:39,715
Michael, you want to take that one?
727
00:38:39,858 --> 00:38:40,638
yeah, great question.
728
00:38:40,688 --> 00:38:44,672
I'd say in some ways that there's
similar functionality, but they
729
00:38:44,732 --> 00:38:45,932
serve very different purposes.
730
00:38:45,932 --> 00:38:49,822
So an API gateway, I'll just take
the most basic example, but I know
731
00:38:49,822 --> 00:38:53,932
there's lots of different ones. An API
gateway, single endpoint, and I may
732
00:38:53,932 --> 00:38:55,652
have lots of different microservices.
733
00:38:55,672 --> 00:38:57,092
Let's just pick a catalog.
734
00:38:57,132 --> 00:39:00,452
Okay, so for product related ones,
it's going to go to this microservice.
735
00:39:00,642 --> 00:39:02,182
Users, it's going to go to this other one.
736
00:39:02,182 --> 00:39:03,862
Cart, another service, whatever.
737
00:39:04,202 --> 00:39:08,112
And the API gateway is routing all
those different requests and rate
738
00:39:08,112 --> 00:39:14,751
limiting, etc. In many ways, like this
MCP gateway serves in a similar fashion
739
00:39:15,371 --> 00:39:18,921
in which it's going to be routing
to the right MCP server to actually
740
00:39:18,941 --> 00:39:20,521
handle the tool execution and whatnot.
741
00:39:20,931 --> 00:39:24,231
But again, it's only for the MCP protocol.
742
00:39:24,531 --> 00:39:27,811
So it's not going to be replacing an
API gateway because it's not doing
743
00:39:27,811 --> 00:39:33,451
normal API requests, etc. It's only
for MCP related workloads and requests.
744
00:39:34,381 --> 00:39:35,931
There are different protocols at play here.
745
00:39:36,737 --> 00:39:38,607
I think that's probably the
best way to describe it.
746
00:39:38,657 --> 00:39:44,357
otherwise, you could also say that
MCP and API Gateway are likely
747
00:39:44,357 --> 00:39:46,407
going to be running in parallel.
748
00:39:46,767 --> 00:39:50,757
and so probably what I would see would
be, I have an API gateway that routes
749
00:39:50,757 --> 00:39:55,807
a request to an endpoint, and then that
particular application, let's just say
750
00:39:55,807 --> 00:40:01,977
it's an agentic application, can then have
its own MCP gateway to satisfy whatever
751
00:40:01,977 --> 00:40:03,997
agentic flow it needs to use there.
752
00:40:03,997 --> 00:40:07,467
I wanted to, while you guys were having
an awesome conversation, I was trying
753
00:40:07,467 --> 00:40:13,947
to draw up, just a visualization to
try to represent, okay, so just so
754
00:40:13,947 --> 00:40:16,907
people understand, because this MCP,
we could make a whole show on MCP
755
00:40:16,907 --> 00:40:20,107
tools, honestly, from an infrastructure
perspective, how do these things talk?
756
00:40:20,117 --> 00:40:20,927
How do they integrate?
757
00:40:21,347 --> 00:40:23,367
The fact that you're talking about
that they're just really adding to
758
00:40:23,367 --> 00:40:26,898
the context window is a fantastic
fact. A lot of people could go
759
00:40:27,088 --> 00:40:31,498
months or years using MCP tools day
to day and never know that, right?
760
00:40:31,558 --> 00:40:35,458
a normal non engineer could use
MCP tools, not understand how
761
00:40:35,458 --> 00:40:36,478
these things are all working.
762
00:40:36,738 --> 00:40:40,288
for those that are into this, are
playing around with MCP tools elsewhere
763
00:40:40,698 --> 00:40:44,538
and understanding a little bit of MCP
server functionality and client versus
764
00:40:44,538 --> 00:40:46,138
server versus host and all that stuff.
765
00:40:46,458 --> 00:40:51,908
Before Docker's gateway, the MCP gateway,
you would have like an MCP client
766
00:40:51,978 --> 00:40:55,748
That, whether it's your IDE, your terminal,
or AI chat desktop, or whatever you've
767
00:40:55,748 --> 00:40:58,068
got, that is acting as an MCP client.
768
00:40:58,358 --> 00:41:01,518
Assuming it supports MCP servers,
you can add them one at a time.
769
00:41:01,748 --> 00:41:05,978
So I would add GitHub's MCP server, then
I would add DuckDuckGo's MCP server.
770
00:41:06,228 --> 00:41:09,058
I might add Notion's MCP server,
since I'm a big Notion fan.
771
00:41:09,338 --> 00:41:14,368
And each one of those servers
has one to infinity tools, which
772
00:41:14,368 --> 00:41:16,538
I look at as like API routes.
773
00:41:16,918 --> 00:41:20,308
and each one has its
own very niche purpose.
774
00:41:20,774 --> 00:41:23,414
depending on the tool, and this is part
of the frustration with the ecosystem
775
00:41:23,414 --> 00:41:26,414
right now is we're only months into this,
but it's amazing that all these tools are
776
00:41:26,414 --> 00:41:30,024
all starting to support each other. Tools
have different ways of managing this.
777
00:41:30,034 --> 00:41:33,244
Some of them you can disable
and enable specific servers.
778
00:41:33,454 --> 00:41:36,914
Some, you can actually choose
the tools individually, which
779
00:41:36,914 --> 00:41:38,494
is like choosing API routes.
780
00:41:38,944 --> 00:41:41,994
And to me, it's you're always trying
to get down to the smallest amount of
781
00:41:41,994 --> 00:41:44,134
tools that you need to prevent confusion.
782
00:41:44,144 --> 00:41:47,854
'Cause my biggest problem is
I enable all the tools because
783
00:41:47,854 --> 00:41:49,514
I get tired of managing them.
784
00:41:49,674 --> 00:41:50,398
I just want them to work.
785
00:41:50,759 --> 00:41:52,829
I just want them all to
work when they need to work.
786
00:41:53,189 --> 00:41:54,589
And then I, so I enable them all.
787
00:41:55,379 --> 00:41:56,939
I end up with 50 plus tools.
788
00:41:57,239 --> 00:42:01,419
And then when I'm asking AI to do things,
it chooses the wrong tool because I
789
00:42:01,419 --> 00:42:06,699
wasn't precise enough in my ask to
trigger the right words that are written
790
00:42:06,699 --> 00:42:09,419
in the system prompt of that MCP server.
791
00:42:09,469 --> 00:42:14,769
So actually, maybe an easier update might
be to put another layer on top of the
792
00:42:14,769 --> 00:42:18,029
MCP server, a kind of in-between.
793
00:42:18,109 --> 00:42:23,929
so I'm connecting the MCP gateway
now to multiple other MCP servers.
794
00:42:24,189 --> 00:42:25,049
So I get, yeah, you're right.
795
00:42:25,049 --> 00:42:26,009
I need another layer here.
796
00:42:26,009 --> 00:42:27,379
That's actually MCP servers.
797
00:42:27,739 --> 00:42:33,829
so, there's now this gateway in the
middle, and the only negative of this
798
00:42:33,829 --> 00:42:39,119
approach is for right now, because we
don't have this futuristic utopia yet,
799
00:42:39,539 --> 00:42:47,579
is that to my terminal, or my IDE, it
all looks like one giant list of tools.
800
00:42:47,859 --> 00:42:51,589
And in one MCP server, which is
just the nature of a proxy, right?
801
00:42:51,639 --> 00:42:54,379
But behind one IP address is a
whole bunch of websites, like
802
00:42:54,379 --> 00:42:55,349
you don't realize it, right?
803
00:42:55,719 --> 00:42:58,129
So it is, the analogy still
works, I believe, there.
804
00:42:58,349 --> 00:43:04,289
But in this case, because it's connecting
all of them together into one proxy,
805
00:43:04,309 --> 00:43:07,929
and the nice thing is, it's, I can see
in the memory usage and the containers.
806
00:43:07,929 --> 00:43:12,229
In fact, when Michael was on weeks
ago, We saw the MCP gateway spinning up
807
00:43:12,229 --> 00:43:15,619
servers dynamically and then shutting
them down and you could see the container
808
00:43:15,619 --> 00:43:19,279
launch run, you know, run the curl
command or whatever, and then close.
809
00:43:19,399 --> 00:43:22,919
And it was so quick, we couldn't,
capture it and swap toggle windows
810
00:43:23,129 --> 00:43:24,269
to see the tools launching.
811
00:43:24,269 --> 00:43:25,379
And, I mean, it's beautiful.
812
00:43:25,379 --> 00:43:26,819
It's exactly what containers were for.
813
00:43:26,894 --> 00:43:28,654
It's, it's ephemeral, it's wonderful.
814
00:43:29,379 --> 00:43:34,639
But, if your IDE or if your chat desktop
or whatever is acting as your MCP client,
815
00:43:34,639 --> 00:43:39,729
the agent thing, if that doesn't let
you choose individual tools, then this
816
00:43:39,729 --> 00:43:44,989
approach is a little hard because the
only way from my IDE is that I have to
817
00:43:44,989 --> 00:43:47,169
turn off all of Docker or none of Docker.
818
00:43:47,169 --> 00:43:47,762
I think.
819
00:43:48,312 --> 00:43:51,342
This gets us to a conversation
of eventually we will have this.
820
00:43:51,532 --> 00:43:55,162
I'm thinking of it as like the model,
the plan model before the model that will
821
00:43:55,162 --> 00:43:57,112
go, okay, you used all these keywords.
822
00:43:57,132 --> 00:43:59,512
I'm going to pick out the right
tools and I'm going to hand those
823
00:43:59,512 --> 00:44:01,782
off to the next model, which
is going to do the actual work.
824
00:44:02,352 --> 00:44:03,912
That's probably already here.
825
00:44:04,022 --> 00:44:05,632
Solomon predicted it a month ago.
826
00:44:05,822 --> 00:44:06,422
Yeah, I'm sorry.
827
00:44:06,422 --> 00:44:06,682
What?
828
00:44:07,023 --> 00:44:09,213
so that's what Michael and I,
while you were drawing this
829
00:44:09,402 --> 00:44:10,562
Oh, is that what you
were just talking about?
830
00:44:10,763 --> 00:44:12,533
that's what Michael and
I, we were talking about.
831
00:44:12,734 --> 00:44:18,600
So the gateway itself has its own
MCP server that controls itself.
832
00:44:19,180 --> 00:44:24,030
And so we're a few months away from
exactly what you were just talking about.
833
00:44:24,290 --> 00:44:28,200
Bret, because of context windows,
because there's too many tools, because
834
00:44:28,200 --> 00:44:31,690
of all the things that you did, all the
challenges you just mentioned, Bret.
835
00:44:32,120 --> 00:44:37,300
the first step might be the client
going to the MCP gateway, MCP server
836
00:44:37,300 --> 00:44:41,500
first and saying, hey, these are
the things I'm about to go do.
837
00:44:41,500 --> 00:44:46,560
Out of the list, check your MCP
gateway and tell me the list of, MCP
838
00:44:46,560 --> 00:44:49,010
tools that I actually need for that.
839
00:44:49,480 --> 00:44:52,690
And then only turn those
on for the next task.
840
00:44:53,228 --> 00:44:53,698
Yeah.
841
00:44:54,250 --> 00:44:57,130
and then it'll just
repeat that cycle again.
842
00:44:57,240 --> 00:45:03,640
and then winnow down that list of
MCP tools to the only things that
843
00:45:03,640 --> 00:45:05,290
are needed for that task at hand.
844
00:45:05,740 --> 00:45:09,220
So there's another layer here,
which, Michael and I,
845
00:45:09,220 --> 00:45:11,520
we were discussing while you were
building that beautiful diagram.
846
00:45:12,110 --> 00:45:14,530
It's, people are experimenting with that.
847
00:45:14,580 --> 00:45:19,460
All the pieces are in place, but this
pattern isn't quite there just yet, but
848
00:45:19,460 --> 00:45:23,200
it will likely be, I'm pretty sure this
is what we're going to be doing pretty
849
00:45:23,253 --> 00:45:26,823
Nobody wants to go manually choose
every MCP server that they're going
850
00:45:26,823 --> 00:45:28,403
to need before every AI request.
851
00:45:28,953 --> 00:45:31,673
almost feels like it takes away
the speed advantage of using the
852
00:45:31,673 --> 00:45:33,373
MCP tool to go get the data for me.
853
00:45:33,373 --> 00:45:38,388
if I have to do all this work in each
tool independently. Because I often
854
00:45:38,388 --> 00:45:43,648
will have an IDE accessing AI, acting
as the MCP client, and then I'll have
855
00:45:43,648 --> 00:45:45,318
a terminal acting as an MCP client.
856
00:45:45,548 --> 00:45:49,728
At the same time, I've got ChatGPT
desktop running over here, also
857
00:45:49,728 --> 00:45:52,928
alongside VS Code. I think a lot of us
eventually evolve to the point where
858
00:45:52,988 --> 00:45:57,418
we've got two or three tools all at
the same time managing MCP tools.
859
00:45:57,448 --> 00:45:59,828
We've got, I guess we have
multiple IDEs, I should say.
860
00:46:00,238 --> 00:46:04,848
and trying to understand how all this
comes together is only interesting right
861
00:46:04,848 --> 00:46:08,538
now, but in six months, we're not going
to want to be messing with all this stuff.
862
00:46:08,538 --> 00:46:11,598
We're just going to want this part to
work so we can work on building agents,
863
00:46:11,648 --> 00:46:11,958
All right.
864
00:46:11,958 --> 00:46:16,348
So Compose, my favorite tool, a lot of
people's favorite Docker tool, other
865
00:46:16,348 --> 00:46:17,588
than the fact that Docker exists.
866
00:46:17,838 --> 00:46:21,828
you announced at WeAreDevelopers
that Compose is getting more.
867
00:46:22,138 --> 00:46:24,878
There's functionality in the YAML
specifically, where I guess we're talking
868
00:46:24,878 --> 00:46:30,248
about the YAML configuration that drives
the Compose command line, that in just
869
00:46:30,248 --> 00:46:34,458
three months ago, you were adding model
support, and that was like an early
870
00:46:34,458 --> 00:46:40,078
alpha idea of what if I could specify
the model I wanted Docker Model Runner
871
00:46:40,078 --> 00:46:47,588
to run when I launch my app that maybe
needs a model, a local model, and I
872
00:46:47,588 --> 00:46:51,893
use the example, and I have an actual
demo over on a Gist, that people can
873
00:46:51,893 --> 00:46:57,693
pick up that you simply, you, you write
your Compose file, you use something
874
00:46:57,693 --> 00:47:00,548
called open web, open, what is it?
875
00:47:00,598 --> 00:47:03,398
Open WebUI, I think.
876
00:47:03,618 --> 00:47:04,908
Yeah, horrible name.
877
00:47:05,448 --> 00:47:10,308
Extremely generic name for what
is a ChatGPT clone, essentially.
878
00:47:10,368 --> 00:47:14,428
the open source variant, which can
use any models or more than one model.
879
00:47:14,438 --> 00:47:16,448
It actually lets you
choose it in the interface.
880
00:47:16,908 --> 00:47:21,608
And all you need is a
little bit of compose file.
881
00:47:23,528 --> 00:47:27,368
So, I created 29 lines, and it
probably needs to be updated because
882
00:47:27,368 --> 00:47:34,978
it's probably outdated, but, 29 lines of
Compose that's half comments that allows
883
00:47:34,978 --> 00:47:40,958
me to spin up an Open WebUI container
while also spinning up the models or
884
00:47:40,968 --> 00:47:44,768
making sure, basically, that I have the
models locally that I need to run it.
885
00:47:44,998 --> 00:47:49,138
And this gives me a ChatGPT
experience without ChatGPT.
886
00:47:49,138 --> 00:47:50,238
Thank you.
887
00:47:50,318 --> 00:47:52,098
And you guys, you enable this.
888
00:47:52,098 --> 00:47:53,928
Now you're not creating the models.
889
00:47:54,088 --> 00:47:55,938
You're not creating the open web UI.
890
00:47:56,138 --> 00:48:00,358
you're simply providing the
glue for it to all come together
891
00:48:00,358 --> 00:48:02,208
in a very easy way locally.
892
00:48:02,548 --> 00:48:02,788
Yeah.
893
00:48:02,788 --> 00:48:06,258
Agentic apps need three things.
894
00:48:06,258 --> 00:48:09,578
They need models, they need tools, and
then the code that glues it all together.
895
00:48:09,978 --> 00:48:13,398
What the Compose file lets us
do now is define all three of
896
00:48:13,398 --> 00:48:15,788
those in a single document.
897
00:48:15,978 --> 00:48:21,658
here's the models that my app is going to
need, the MCP gateway that I'm just going
898
00:48:21,658 --> 00:48:23,378
to run as another containerized service.
899
00:48:23,708 --> 00:48:27,408
And then the code, the custom code,
can be really any agentic framework.
900
00:48:27,418 --> 00:48:32,308
this example is Open WebUI, but in that
Compose snippet, what we've done is
901
00:48:32,308 --> 00:48:37,728
we've evolved the specification, so now
models are a top-level element in the
902
00:48:37,728 --> 00:48:39,153
Compose file, which is pretty cool.
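For reference, a minimal sketch of what that looks like; the image name is a placeholder, and the exact field names (`endpoint_var`, `model_var`) are my reading of the new spec, so check the current Compose documentation:

```yaml
# Sketch only: top-level `models` element (names are placeholders).
models:
  llm:
    model: ai/smollm2          # pulled as an OCI artifact, e.g. from Docker Hub

services:
  app:
    image: my-agentic-app      # your custom code / agentic framework
    models:
      llm:
        endpoint_var: OPENAI_BASE_URL   # injected env var: where to reach the model
        model_var: OPENAI_MODEL         # injected env var: which model to request
```

Swap the entry under `models:` and the injected variables follow, so the app needs no changes as long as it reads those variables.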
903
00:48:39,693 --> 00:48:42,453
This just dropped in the last couple
of weeks, so this is brand new.
904
00:48:42,819 --> 00:48:43,749
Gotta update my gist.
905
00:48:44,213 --> 00:48:47,713
yep, and so where before, yeah,
you had to use this provider
906
00:48:47,713 --> 00:48:49,613
syntax, and that still works.
907
00:48:49,933 --> 00:48:52,123
now it's actually part
of the specification.
908
00:48:52,543 --> 00:48:55,203
Defining a model, this is
going to pull from Docker Hub.
909
00:48:55,213 --> 00:48:58,318
again, you can have your own models
and your own container registry.
910
00:48:58,318 --> 00:48:59,608
It's just an OCI artifact.
911
00:48:59,608 --> 00:49:01,058
You can specify that anywhere.
912
00:49:01,698 --> 00:49:03,108
then we've got the services,
913
00:49:03,378 --> 00:49:04,908
And then the app itself.
914
00:49:05,228 --> 00:49:08,958
What's cool about the model now is
with the specification evolution,
915
00:49:09,278 --> 00:49:13,658
you can now specify, hey, this is the
environment variable I want you to
916
00:49:13,658 --> 00:49:15,658
basically inject into my container.
917
00:49:16,038 --> 00:49:19,868
to specify what's the endpoint,
where's the base URL that I
918
00:49:19,878 --> 00:49:22,108
should use to access this model.
919
00:49:22,488 --> 00:49:24,098
And then what's the model name as well.
920
00:49:24,398 --> 00:49:29,718
So the cool thing then is I can
go back up to the top level model
921
00:49:29,718 --> 00:49:33,908
specification, I can swap that out
and the environment variables will be
922
00:49:33,918 --> 00:49:37,858
automatically updated, and assuming
that my app is using those environment
923
00:49:37,858 --> 00:49:39,848
variables, everything just works.
924
00:49:40,308 --> 00:49:44,188
So again, think of Compose as, it's
the glue that's making sure that
925
00:49:44,188 --> 00:49:48,218
everything is there for the application
to actually be able to leverage it.
926
00:49:49,029 --> 00:49:49,479
Yeah.
927
00:49:49,539 --> 00:49:51,729
the gateway part here
was pretty cool to me.
928
00:49:51,729 --> 00:49:58,429
That I can add in my tools, my
MCP tools inside of the YAML file.
929
00:49:58,449 --> 00:50:00,039
when I saw that part, I was like, yes.
930
00:50:00,089 --> 00:50:05,329
that is like my vision, my dream is
that I can pass a Compose file to
931
00:50:05,329 --> 00:50:09,159
someone else and it'll use their keys.
932
00:50:09,689 --> 00:50:10,469
Presuming my
933
00:50:10,934 --> 00:50:15,354
team is all using the same provider,
we would have the same variables.
934
00:50:15,364 --> 00:50:19,544
Because, well, OpenAI
base URL, OpenAI model, and
935
00:50:19,544 --> 00:50:21,704
then OpenAI API key or whatever.
936
00:50:21,734 --> 00:50:24,364
if you're going to use ones in the
SaaS, like those are all pretty generic.
937
00:50:24,514 --> 00:50:28,154
Even if you're not using OpenAI, they're
all pretty generic, environment variables.
938
00:50:28,164 --> 00:50:30,794
So I guess this would work
across teams or across people
939
00:50:30,838 --> 00:50:32,388
well, and that's a good point to call out.
940
00:50:32,398 --> 00:50:36,528
One of the things that OpenAI did when
they released their APIs was basically,
941
00:50:36,528 --> 00:50:40,008
hey, here's a specification on how to
interact with models that pretty much
942
00:50:40,058 --> 00:50:42,168
everybody else has adopted and used.
943
00:50:42,548 --> 00:50:47,068
and so the Docker Model Runner
exposes an OpenAI compatible API.
944
00:50:47,068 --> 00:50:49,428
And so that's why you see these
environment variables kind
945
00:50:49,428 --> 00:50:52,288
of using the OpenAI prefix.
946
00:50:52,718 --> 00:50:56,168
Because again, I can use now any
agentic application that can talk to
947
00:50:56,168 --> 00:50:58,278
OpenAI or use the OpenAI libraries.
948
00:50:58,278 --> 00:51:00,328
And it's just a configuration
change at this point.
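Because it is just configuration, the same request-building code works against either backend; here's a sketch with no network I/O, where the localhost base URL and model name are assumptions about a local Model Runner setup, not documented defaults:

```python
import json

# Build an OpenAI-style /chat/completions request. Pointing base_url
# at OpenAI's SaaS or at a local OpenAI-compatible runner is the only
# difference; the URL and model names below are assumptions.
def chat_request(base_url: str, model: str, prompt: str) -> tuple[str, bytes]:
    url = f"{base_url}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, body

# Same call shape, local endpoint instead of api.openai.com:
url, body = chat_request("http://localhost:12434/engines/v1",
                         "ai/qwen3", "Hello!")
```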
949
00:51:00,433 --> 00:51:00,813
All right.
950
00:51:00,833 --> 00:51:01,923
Now, the coup de grâce.
951
00:51:02,363 --> 00:51:03,753
Pièce de résistance.
952
00:51:04,463 --> 00:51:08,393
I can't even do my pretend French. All
this stuff has been running locally.
953
00:51:08,393 --> 00:51:10,893
Like when we think of Docker desktop,
we think of everything locally.
954
00:51:10,893 --> 00:51:17,153
And then a year or two ago, Docker
launched Docker Build Cloud, which was
955
00:51:17,163 --> 00:51:19,583
like getting back to Docker's roots.
956
00:51:19,583 --> 00:51:23,903
I almost feel like, of providing more of
a SaaS service that essentially is
957
00:51:24,678 --> 00:51:26,528
doing something in a container for me.
958
00:51:26,528 --> 00:51:30,228
And in that case, it was just building
containers using an outsourced BuildKit.
959
00:51:30,238 --> 00:51:34,538
So it was better for parallelization
and multi architecture.
960
00:51:34,588 --> 00:51:35,168
it was sweet.
961
00:51:35,178 --> 00:51:39,878
And I love it for when I need to build
like enterprise tools or big business
962
00:51:39,968 --> 00:51:41,838
things that take 20 minutes to build.
963
00:51:41,838 --> 00:51:45,598
None of my sample little examples do
that, but anything in the real world
964
00:51:45,598 --> 00:51:48,718
takes that long and you need to build
multi architecture and generally it's
965
00:51:48,758 --> 00:51:50,368
going to be faster in a cloud environment.
966
00:51:50,368 --> 00:51:51,278
So you provided that.
967
00:51:51,658 --> 00:51:58,428
Now it feels like you've upgraded, like
it's beyond just building, and it does any
968
00:51:58,428 --> 00:52:04,088
image or any container I want to run, any
model I want to run, I guess, not maybe
969
00:52:04,098 --> 00:52:07,748
any, I don't know if there's a limitation
there, but, bigger models than maybe I
970
00:52:07,748 --> 00:52:11,238
can run locally, and then, also builds.
971
00:52:11,653 --> 00:52:15,443
So it can do building the image,
hosting the container, running the
972
00:52:15,443 --> 00:52:17,903
model endpoint for a Qwen3 or whatever.
973
00:52:18,423 --> 00:52:20,953
I can now do all that in
something called Offload.
974
00:52:21,303 --> 00:52:22,243
So tell me about that.
975
00:52:22,548 --> 00:52:26,588
Docker Offload, the way I explain it to
people is, hey, you need more resources?
976
00:52:26,768 --> 00:52:27,808
Burst into the cloud.
977
00:52:28,628 --> 00:52:31,838
And so it's basically, I'm going
to offload this into the cloud,
978
00:52:32,178 --> 00:52:35,548
but yet it's still, everything
still works as if it were local.
979
00:52:35,548 --> 00:52:38,058
So if I've got bind mounts, okay,
great, we're going to automatically
980
00:52:38,058 --> 00:52:39,518
set up the synchronized file shares.
981
00:52:39,528 --> 00:52:43,448
And so all that's going to work,
using Mutagen and some of the other tools
982
00:52:43,448 --> 00:52:44,948
behind the scenes to make that work.
983
00:52:45,278 --> 00:52:48,008
Port publishing that still
works as you would expect it to.
984
00:52:48,008 --> 00:52:54,798
So again, it gives that local
experience, but using remote resources.
985
00:52:54,848 --> 00:52:57,758
I'm just offloading this to
the cloud, but yet it's still
986
00:52:58,393 --> 00:52:59,153
my environment.
987
00:52:59,553 --> 00:53:02,553
and so, yeah, to make it clear, like,
this is not a production runtime
988
00:53:02,553 --> 00:53:05,863
environment, I can't share this
environment out, or, I can't, you
989
00:53:05,863 --> 00:53:08,573
know, create a URL and say, hey, check
this out, colleague, or whatever,
990
00:53:08,573 --> 00:53:10,253
it's still for your personal use.
991
00:53:10,423 --> 00:53:13,853
Now, of course, can you make a,
Cloudflare tunnels, and I'm going
992
00:53:13,853 --> 00:53:17,013
to make it production, sure, but I
993
00:53:17,053 --> 00:53:17,243
wouldn't
994
00:53:17,304 --> 00:53:17,614
could hack
995
00:53:17,973 --> 00:53:18,223
that.
996
00:53:18,444 --> 00:53:18,884
but yeah.
997
00:53:18,945 --> 00:53:19,555
Yeah.
998
00:53:19,795 --> 00:53:20,755
So what is the intent?
999
00:53:20,765 --> 00:53:23,095
so what is the use case?
1000
00:53:23,935 --> 00:53:24,205
the big question is,
1001
00:53:24,267 --> 00:53:26,147
should I use Docker Offload for first?
1002
00:53:26,450 --> 00:53:27,620
Yeah, so, okay, great.
1003
00:53:27,630 --> 00:53:31,160
You're wanting to play around these
agentic apps and, you know, we
1004
00:53:31,170 --> 00:53:35,450
were talking about not everybody
has access to high end GPUs or,
1005
00:53:35,450 --> 00:53:37,930
you know, M4 machines and whatnot.
1006
00:53:37,970 --> 00:53:41,800
Great, with the flip of a switch, and
you had it there in Docker Desktop,
1007
00:53:41,800 --> 00:53:45,440
but at the top you just flip a
switch, and now you're using Offload.
1008
00:53:45,810 --> 00:53:51,990
and so now you've got access to a pretty
significant NVIDIA GPU, and additional
1009
00:53:51,990 --> 00:53:57,120
resources, and so yeah, we
see the use case, especially more for
1010
00:53:57,120 --> 00:54:00,640
the agent applications, because that's
where those resources are needed.
1011
00:54:01,580 --> 00:54:07,070
It does open up some interesting doors
for, maybe I'm just on a super lightweight
1012
00:54:07,100 --> 00:54:09,930
laptop that I'm using for school and
I don't have the ability to even run
1013
00:54:09,930 --> 00:54:11,640
a lot of my containerized workloads.
1014
00:54:12,220 --> 00:54:15,480
Great, I can use that for offload,
you know, offload that to the cloud.
1015
00:54:15,530 --> 00:54:19,460
it does open up some interesting
opportunities for, Use cases beyond
1016
00:54:19,460 --> 00:54:23,190
agentic apps, but that's kind of
where the big focus is right now.
1017
00:54:23,415 --> 00:54:26,765
So if you're like a Docker insider, or if
you're someone who's used Docker a while,
1018
00:54:27,420 --> 00:54:32,470
it's the Docker context command that we've
had forever, augmenting or changing the
1019
00:54:32,470 --> 00:54:36,100
environment variable DOCKER_HOST,
host, which we've had since almost the
1020
00:54:36,100 --> 00:54:43,475
beginning, and it allows you from your
local Docker CLI. And even the GUI works
1021
00:54:43,475 --> 00:54:48,785
this way too, because I could always
set up a Docker remote engine and then
1022
00:54:49,075 --> 00:54:51,545
create a new context in the Docker CLI.
1023
00:54:52,025 --> 00:54:57,665
That would use SSH tunneling to go to that
server, and then I could run my Docker
1024
00:54:57,665 --> 00:55:02,015
CLI locally, my Compose CLI locally, and
it would technically be accessing and
1025
00:55:02,015 --> 00:55:06,615
running against, the remote host that I
had set up, but that was never really a
1026
00:55:06,615 --> 00:55:11,695
cloud service, like it was never, no one
provides Docker API access as a service
1027
00:55:11,695 --> 00:55:16,205
that I'm aware of, and, the context
command, while it's easy to use, and you
1028
00:55:16,205 --> 00:55:19,195
can actually use it on any command, you
can use docker run --context,
1029
00:55:19,195 --> 00:55:22,815
I believe, or docker --context
run, I can't remember the order, but,
1030
00:55:22,965 --> 00:55:24,125
you can change that on any command.
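For the record, the pre-Offload workflow being described looks roughly like this; the context name and SSH host are placeholders, and since `--context` is a global flag, it goes before the subcommand:

```shell
# Sketch: pointing the Docker CLI at a remote engine the manual way.
# Guarded so this is a no-op where Docker isn't installed, and the
# placeholder host means individual commands may fail harmlessly.
command -v docker >/dev/null 2>&1 || { echo "docker not installed; skipping"; exit 0; }

docker context create my-remote --docker "host=ssh://user@remote-host" || true
docker --context my-remote ps || true   # one-off: global flag before the subcommand
docker context use default || true      # keep the local engine as the default
docker context rm my-remote || true     # clean up the sketch's context
```

Offload automates this context switching plus the file-sharing and port-publishing plumbing that the plain context mechanism never handled.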
1031
00:55:24,275 --> 00:55:27,285
these are all things that existed,
but you made it, stupid easy.
1032
00:55:27,945 --> 00:55:30,445
It's just, like you said,
it's a toggle, it's so easy.
1033
00:55:30,885 --> 00:55:33,655
You just click that button
and then the UI changes, the
1034
00:55:34,015 --> 00:55:35,895
colors change, so now you know.
1035
00:55:36,365 --> 00:55:37,525
You're now remote.
1036
00:55:38,350 --> 00:55:41,830
Yeah, and so I'll go ahead and say too,
behind the scenes, it's using context.
1037
00:55:41,830 --> 00:55:43,250
It's using those exact things.
1038
00:55:43,610 --> 00:55:47,980
The tricky part is, because I've done
similar, development environments where,
1039
00:55:48,030 --> 00:55:52,280
I'm going to work against a Raspberry Pi
at home or, whatever else it might be.
1040
00:55:52,720 --> 00:55:56,050
the tricky part is when you want to
get into bind mounts, file sharing
1041
00:55:56,050 --> 00:55:58,620
kind of stuff, or port publishing,
and I want to be able to access
1042
00:55:58,620 --> 00:56:01,530
that port from my machine, like
automating all those different pieces.
1043
00:56:02,420 --> 00:56:03,730
That's not trivial.
1044
00:56:03,750 --> 00:56:04,900
I mean, it's possible.
1045
00:56:05,080 --> 00:56:07,810
a separate tool, yeah, you gotta
download ngrok or something,
1046
00:56:08,560 --> 00:56:12,310
And so this brings all that together
into a single offering here.
1047
00:56:12,902 --> 00:56:13,802
That's pretty amazing.
1048
00:56:13,812 --> 00:56:18,132
Like there's a lot going on underneath
the hood that, switch is hiding
1049
00:56:18,142 --> 00:56:19,922
a lot of different functionality.
1050
00:56:19,952 --> 00:56:20,152
Like
1051
00:56:20,152 --> 00:56:20,442
it's.
1052
00:56:20,997 --> 00:56:22,587
To make that very transparent
1053
00:56:23,250 --> 00:56:25,350
And this supports builds too, right?
1054
00:56:25,360 --> 00:56:25,890
So like.
1055
00:56:26,505 --> 00:56:30,475
When I toggle this in the
UI, or is there a CLI toggle?
1056
00:56:30,685 --> 00:56:31,305
Yeah, there is.
1057
00:56:31,591 --> 00:56:31,991
okay.
1058
00:56:32,251 --> 00:56:35,681
So if I toggle this, it's, yeah, you're
like, you're saying it's a context
1059
00:56:35,681 --> 00:56:41,161
change, but it's UI aware, and it takes
in all the other little things that we
1060
00:56:41,161 --> 00:56:42,761
don't think about until they don't work.
1061
00:56:42,761 --> 00:56:45,761
And then we're like, oh, yeah, it's
not really running locally anymore.
1062
00:56:45,761 --> 00:56:47,231
So now I can't use localhost:port.
1063
00:56:47,381 --> 00:56:49,631
Well, that all just, I'm going to show
you how this works and you don't even
1064
00:56:49,641 --> 00:56:52,721
have to know kind of like the rest of
the Dockery because you don't really
1065
00:56:52,721 --> 00:56:54,231
have to know how it works underneath.
1066
00:56:54,591 --> 00:56:57,531
but if you think it's too much magic,
I like to break it down and say,
1067
00:56:57,701 --> 00:56:59,241
it's just really the Docker context.
1068
00:56:59,271 --> 00:57:01,221
I didn't actually look
at the code at all.
1069
00:57:01,241 --> 00:57:02,641
I don't know really how it's working.
1070
00:57:02,891 --> 00:57:06,391
But to me, when I went and checked,
it does change the context for me.
1071
00:57:06,391 --> 00:57:08,031
It actually injects it
and then removes it.
1072
00:57:08,311 --> 00:57:10,011
I did notice it from the CLI.
1073
00:57:10,211 --> 00:57:11,441
I could change context.
1074
00:57:11,867 --> 00:57:15,657
And it would retain the context, but
if I use the toggle button, it deletes
1075
00:57:15,657 --> 00:57:16,987
the context and then re-adds it.
1076
00:57:17,107 --> 00:57:20,347
Regardless, it is in the
background, it's doing cool things.
1077
00:57:20,647 --> 00:57:23,817
I think the immediate request from
the captains was, can I do both?
1078
00:57:24,017 --> 00:57:30,157
Can I have per-workload or per-service
offload so that just my model's remote
1079
00:57:30,157 --> 00:57:34,087
and maybe that really big database
server and then all my apps are local.
1080
00:57:34,397 --> 00:57:38,497
I don't know why I would care, but like
it, that's something that people ask for.
1081
00:57:38,707 --> 00:57:40,927
I'm not sure that I
care that to that level.
1082
00:57:40,927 --> 00:57:44,857
I think I'm fine with either or, but I can
understand that if I'm running some things
1083
00:57:44,857 --> 00:57:49,167
locally already and I just want to add on
something in addition, it would be neat
1084
00:57:49,167 --> 00:57:51,477
if I could just choose for one service.
1085
00:57:52,462 --> 00:57:52,682
Yeah.
1086
00:57:52,682 --> 00:57:54,612
So as of right now, it
is an all-or-nothing.
1087
00:57:54,622 --> 00:57:55,742
you're doing everything local.
1088
00:57:55,742 --> 00:57:57,232
You're doing everything out in the cloud.
1089
00:57:57,232 --> 00:57:58,317
there's not a way to
1090
00:57:58,787 --> 00:58:00,257
split that up yet.
1091
00:58:00,307 --> 00:58:02,367
it's something that we've heard
from a couple of folks, but
1092
00:58:02,417 --> 00:58:05,237
again, it's that same thing of,
tell us more about the use cases.
1093
00:58:05,237 --> 00:58:08,447
So if that's a use case you have,
feel free to reach out to us and
1094
00:58:08,447 --> 00:58:13,907
help us better understand why
you might want to split runtime
1095
00:58:14,147 --> 00:58:16,977
hosting, split environment,
hybrid environment.
1096
00:58:17,237 --> 00:58:18,317
That's the correct term.
1097
00:58:18,476 --> 00:58:19,236
Why do you say it like that?
1098
00:58:19,436 --> 00:58:21,816
and just to be clear,
offload has its own cost.
1099
00:58:21,816 --> 00:58:24,416
like this isn't free forever for infinity.
1100
00:58:24,416 --> 00:58:26,246
You can't just take up a bunch of GPUs.
1101
00:58:26,506 --> 00:58:29,686
I was asking the team a little bit
and without getting too nerdy, it
1102
00:58:29,696 --> 00:58:33,826
sounds like it isolates, it spins up
a VM or there's maybe some hot VMs.
1103
00:58:33,826 --> 00:58:35,086
And I get a dedicated
1104
00:58:36,026 --> 00:58:39,476
OS, essentially, it sounds like, so
that I can get the GPU if I need.
1105
00:58:39,476 --> 00:58:43,206
And you kind of get an option of, do
I want servers with GPUs or not?
1106
00:58:43,216 --> 00:58:45,486
do I, am I going to run
GPU workloads or not?
1107
00:58:45,486 --> 00:58:48,476
And that affects pricing. Do we
get anything out of the box with
1108
00:58:48,476 --> 00:58:51,386
a Docker subscription, or is it
a completely separate
1109
00:58:51,726 --> 00:58:56,153
So actually, it's a kind of private
beta, but people can sign up
1110
00:58:56,163 --> 00:58:57,353
for it and that kind of stuff.
1111
00:58:57,683 --> 00:59:01,633
folks will get the 300 GPU minutes,
which isn't a ton, but it's enough to
1112
00:59:01,653 --> 00:59:03,143
experiment and play around with it.
1113
00:59:03,443 --> 00:59:05,233
and then, start giving us feedback, etc.
1114
00:59:05,549 --> 00:59:08,169
Yeah, if you spin up the GPU
instance and then go to lunch, by
1115
00:59:08,169 --> 00:59:10,659
the time you get back, you'll have
probably used up your free minutes.
1116
00:59:10,738 --> 00:59:11,448
It's a long lunch,
1117
00:59:11,508 --> 00:59:12,668
hey, that's my kind of lunch.
1118
00:59:14,458 --> 00:59:17,998
but yeah, so we went an hour,
and we barely scratched the
1119
00:59:17,998 --> 00:59:19,228
surface. Did we cover it all?
1120
00:59:19,298 --> 00:59:22,978
Did we list at least all the
announcements of major features and tools?
1121
00:59:22,978 --> 00:59:25,148
I don't even want to say we've covered
all the features because there's
1122
00:59:25,148 --> 00:59:26,798
probably some stuff with MCP we missed.
1123
00:59:27,188 --> 00:59:31,338
So you open-sourced the MCP Gateway, but
we should point out you don't actually
1124
00:59:31,338 --> 00:59:36,488
have to know, like you can just use
Docker Desktop and MCP tools locally.
1125
00:59:37,028 --> 00:59:40,618
But the reason you provide the
MCP Gateway as open source is so
1126
00:59:40,618 --> 00:59:44,478
we could put it in the compose
file and then run it on servers.
1127
00:59:44,528 --> 00:59:45,498
think about it this way.
1128
00:59:45,518 --> 00:59:48,478
the MCP toolkit bundled with Docker
Desktop is going to be more for,
1129
00:59:48,728 --> 00:59:52,678
I'm consuming, I'm just wanting to
use MCP servers and connect them
1130
00:59:52,678 --> 00:59:54,558
to my other agentic applications.
1131
00:59:54,868 --> 00:59:57,888
And the MCP Gateway is going to
be more for, now I want to build
1132
00:59:57,888 --> 01:00:01,948
my own agentic applications and
connect those tools to those
1133
01:00:01,948 --> 01:00:03,508
applications that we're running there.
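As a rough illustration of that consume-versus-build split, a Compose file can run the open-source gateway as its own service. This is only a sketch: the image name, flags, port, and server list below are assumptions for illustration, not confirmed syntax.

```yaml
# Hypothetical sketch: running the open-source MCP Gateway from Compose.
# Image name, flags, and server names are assumptions, not confirmed syntax.
services:
  mcp-gateway:
    image: docker/mcp-gateway
    ports:
      - "8811:8811"
    # let the gateway launch MCP servers as sibling containers
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command:
      - --transport=sse        # expose one endpoint for agent clients
      - --servers=duckduckgo   # which catalog servers to enable

  agent-app:
    build: .
    environment:
      # hypothetical variable the app reads to find the gateway
      MCP_GATEWAY_URL: http://mcp-gateway:8811/sse
    depends_on:
      - mcp-gateway
```

The point of the shape: the agentic application talks to one gateway endpoint rather than wiring up each MCP server individually, and the same file can run on a server instead of Docker Desktop.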
1134
01:00:03,949 --> 01:00:04,389
Yeah.
1135
01:00:04,729 --> 01:00:07,229
Do you see people using
MCP Gateway in production?
1136
01:00:07,249 --> 01:00:10,819
Do you see that as, like, a, not that
you provide support or anything like
1137
01:00:10,819 --> 01:00:12,499
that, but is it designed so that
1138
01:00:13,499 --> 01:00:16,989
We've got a couple of folks that
are already starting to do so.
1139
01:00:17,059 --> 01:00:19,609
stay tuned for some use
case stories around that.
1140
01:00:19,659 --> 01:00:19,939
Yeah.
1141
01:00:20,569 --> 01:00:20,919
Awesome.
1142
01:00:21,419 --> 01:00:22,329
well, this is a lot.
1143
01:00:22,359 --> 01:00:28,139
I feel like I need to launch another
10 Docker YouTube uploads just
1144
01:00:28,149 --> 01:00:31,319
to cover each tool specifically,
each use case specifically.
1145
01:00:31,609 --> 01:00:35,314
there's a lot here, but
this is amazing work.
1146
01:00:35,314 --> 01:00:39,884
I mean, I don't know if you have a fleet
of AI robots working for you yet, but
1147
01:00:40,118 --> 01:00:44,348
certainly feels like a lot of different
products that are all coming together very
1148
01:00:44,348 --> 01:00:48,678
quickly that are all somehow related to
each other, but also independently usable.
1149
01:00:49,143 --> 01:00:54,103
And, I'm having you on the show as
usual is a great way to break it down
1150
01:00:54,133 --> 01:00:58,233
into the real usable bits. What do
we really care about, without all the
1151
01:00:58,233 --> 01:01:03,053
marketing hype, the general AI hype, which
is always a problem on the internet.
1152
01:01:03,053 --> 01:01:05,173
But this feels like really useful stuff.
1153
01:01:05,203 --> 01:01:05,793
Um,
1154
01:01:08,058 --> 01:01:09,268
Eivor, another podcast.
1155
01:01:09,268 --> 01:01:10,738
I don't know, Eivor, what's up?
1156
01:01:10,748 --> 01:01:13,038
Are you requesting yet another podcast?
1157
01:01:13,398 --> 01:01:14,028
Um,
1158
01:01:14,432 --> 01:01:15,132
a whole new show
1159
01:01:15,178 --> 01:01:17,668
about Compose provider services?
1160
01:01:18,088 --> 01:01:18,888
Oh, yes.
1161
01:01:18,928 --> 01:01:24,448
Also, you can now run Compose directly
from, well, you can use Compose YAML
1162
01:01:24,468 --> 01:01:27,618
directly inside of cloud tools.
1163
01:01:28,173 --> 01:01:29,923
The first one was Google Cloud Run.
1164
01:01:30,253 --> 01:01:33,373
So I could technically spin
up Google, which I love,
1165
01:01:33,383 --> 01:01:34,883
Google Cloud Run is fantastic.
1166
01:01:35,203 --> 01:01:38,403
Um, it would be my first choice
for running any containers in
1167
01:01:38,403 --> 01:01:39,673
Google if I was using Google.
1168
01:01:40,193 --> 01:01:45,902
Um, and so now they're accepting
the Compose YAML spec, essentially,
1169
01:01:46,422 --> 01:01:48,282
inside of their command line.
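As a sketch of what that reuse looks like, the same minimal Compose file can drive both local development and the Cloud Run flow. The service shape here is illustrative, and the exact gcloud subcommand is an assumption.

```yaml
# compose.yaml -- illustrative only: one file for both local and cloud use.
services:
  web:
    build: .           # Cloud Run's Compose support builds and deploys this
    ports:
      - "8080:8080"    # Cloud Run services conventionally listen on 8080
```

Locally this is just `docker compose up`; on the Google side, the preview command is something along the lines of `gcloud run compose up` (the exact subcommand and flags may differ).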
1170
01:01:48,559 --> 01:01:51,039
So this is like, this feels like the
opposite of what Docker used to do.
1171
01:01:51,039 --> 01:01:54,339
Docker used to build in cloud
functionality into the Docker tooling.
1172
01:01:54,339 --> 01:01:57,769
But now we're saying, Hey, let's partner
with those tools, those companies,
1173
01:01:57,989 --> 01:02:03,489
and let them build in cloud or
compose specification into their tool.
1174
01:02:03,849 --> 01:02:06,419
So we can have basically file reuse.
1175
01:02:06,479 --> 01:02:07,209
YAML reuse.
1176
01:02:07,209 --> 01:02:07,919
Is that right?
1177
01:02:08,549 --> 01:02:08,829
yeah.
1178
01:02:08,829 --> 01:02:13,249
So this is exactly the first time in which
it's not Docker tooling that's providing
1179
01:02:13,249 --> 01:02:15,959
the cloud support, but it's cloud native.
1180
01:02:16,216 --> 01:02:19,406
They're the ones building the tooling
and consuming the Compose file.
1181
01:02:19,590 --> 01:02:21,160
yeah, it's a big moment.
1182
01:02:21,170 --> 01:02:25,530
And as we work with Google Cloud
on this, yeah, you can deploy the
1183
01:02:25,530 --> 01:02:27,020
normal container workloads, etc.
1184
01:02:27,020 --> 01:02:30,890
But they already have support for Model
Runner to be able to run the models
1185
01:02:30,890 --> 01:02:34,460
there as well. It's pretty exciting.
And I know, the provider services,
1186
01:02:34,520 --> 01:02:37,410
this is how we started with models.
1187
01:02:37,815 --> 01:02:39,555
having support in Compose, where
1188
01:02:39,555 --> 01:02:44,964
that was another service
in which the service wasn't
1189
01:02:45,024 --> 01:02:46,704
backed by a normal container.
1190
01:02:47,174 --> 01:02:47,984
the old method.
1191
01:02:48,034 --> 01:02:52,894
Yes, but what's cool about this is, so
first off, these hooks are still in place.
1192
01:02:53,309 --> 01:02:57,259
So that a Compose file can basically
delegate off to this additional
1193
01:02:57,279 --> 01:03:00,899
provider plugin to say, hey, this is
how you're going to spin up a model.
1194
01:03:01,159 --> 01:03:04,029
But it starts to open up a whole
ecosystem where anybody can make a
1195
01:03:04,029 --> 01:03:08,779
provider. Or, okay, hey, I've got this
cloud-based database, just as an example.
1196
01:03:09,069 --> 01:03:12,429
And, okay, now I can still use
Compose and it's going to spin up
1197
01:03:12,429 --> 01:03:17,439
my containers, but also create this
cloud-based container and then inject
1198
01:03:17,629 --> 01:03:19,669
environment variables into my app.
1199
01:03:19,719 --> 01:03:22,039
again, it starts to open up
some pretty cool extensibility
1200
01:03:22,039 --> 01:03:23,589
capabilities of Compose as well.
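The provider mechanism described above looks roughly like this in a Compose file. The `model` provider type comes from the Model Runner integration; the option names and the injected environment variables are assumptions for illustration and may differ by version.

```yaml
# Sketch of a Compose provider service, as used by Docker Model Runner.
# Option names and injected variable names are illustrative assumptions.
services:
  app:
    build: .
    depends_on:
      - llm   # Compose injects the provider's connection details
              # (e.g. a URL and model name) into this service's environment

  llm:
    provider:
      type: model          # handled by a provider plugin, not a container
      options:
        model: ai/smollm2  # a model from the Docker Hub model catalog
```

A third-party provider, say for that hypothetical cloud-based database, would follow the same shape: declare it as a service, let the plugin provision it, and let Compose hand its connection details to the dependent app.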
1201
01:03:23,639 --> 01:03:28,279
I think we, yeah, we need to bring Michael
back just to dig into that because, it's
1202
01:03:28,279 --> 01:03:31,359
essentially like extensions or plugins
1203
01:03:31,807 --> 01:03:32,197
Yeah.
1204
01:03:32,328 --> 01:03:32,478
for
1205
01:03:32,507 --> 01:03:35,467
Yeah, so Compose is about to
get a whole lot more love.
1206
01:03:35,477 --> 01:03:38,777
It feels like it's already, I
mean, it's been years since we've
1207
01:03:38,777 --> 01:03:41,007
added a root extension or like a,
1208
01:03:41,158 --> 01:03:41,738
top level,
1209
01:03:41,947 --> 01:03:42,857
top level build.
1210
01:03:43,497 --> 01:03:46,247
it's not every day that Docker
decides there's a whole new
1211
01:03:46,807 --> 01:03:48,617
type of thing that we deploy.
1212
01:03:48,617 --> 01:03:52,127
Now we have models, we'll see if
providers someday become something.
1213
01:03:52,427 --> 01:03:52,777
that'll be cool.
1214
01:03:53,237 --> 01:03:56,687
and this is all due to the Compose
spec, which now allows other
1215
01:03:56,687 --> 01:03:59,877
tools to use the Compose standard.
1216
01:03:59,887 --> 01:04:02,157
And that's just great for everybody,
because everybody uses Compose.
1217
01:04:02,167 --> 01:04:05,117
it's like the most universal
YAML out there, in my opinion.
1218
01:04:05,527 --> 01:04:05,997
great.
1219
01:04:06,087 --> 01:04:07,677
Well, I think we've covered it all.
1220
01:04:07,917 --> 01:04:10,907
Nirmal and I need another
month to digest all this, and
1221
01:04:10,907 --> 01:04:11,997
then we'll invite you back on.
1222
01:04:12,367 --> 01:04:12,627
do it.
1223
01:04:13,177 --> 01:04:16,487
but yeah, we've checked the
box of everything Docker, first
1224
01:04:16,487 --> 01:04:18,917
half of the year, stay tuned
for the second half of the year.
1225
01:04:19,137 --> 01:04:22,357
I actually sincerely hope you don't
have as busy of a second half,
1226
01:04:22,387 --> 01:04:25,027
just because it's, these are a lot
of videos I got to make, you're
1227
01:04:25,027 --> 01:04:26,727
putting a lot of work into my inbox,
1228
01:04:26,787 --> 01:04:28,607
We're helping you have content to create.
1229
01:04:28,903 --> 01:04:32,563
I know, yeah, there's no shortage of
content to create right now with Docker.
1230
01:04:32,903 --> 01:04:34,813
I am very excited to play
with all these things.
1231
01:04:34,813 --> 01:04:37,253
I sound excited because I am excited.
1232
01:04:37,253 --> 01:04:42,123
This is real stuff that I think is
beneficial and largely free, largely,
1233
01:04:42,438 --> 01:04:45,678
Like almost all of this stuff is
really just extra functionality that
1234
01:04:45,678 --> 01:04:49,218
already exists in our tooling
without adding a whole bunch of SaaS
1235
01:04:49,218 --> 01:04:50,808
services we have to buy on top of it.
1236
01:04:51,168 --> 01:04:52,728
yeah, so congrats.
1237
01:04:53,728 --> 01:04:55,958
People can find out more docker.com.
1238
01:04:56,298 --> 01:05:00,368
docs.docker.com. Docker's got videos
on YouTube now they're putting up
1239
01:05:00,368 --> 01:05:02,618
YouTube videos, so check that out.
1240
01:05:02,768 --> 01:05:05,443
I saw Michael putting up some
videos recently on LinkedIn.
1241
01:05:06,443 --> 01:05:07,203
it's all over the place.
1242
01:05:07,203 --> 01:05:10,063
You can follow Michael Irwin on LinkedIn.
1243
01:05:10,113 --> 01:05:11,013
he's on BlueSky.
1244
01:05:11,013 --> 01:05:11,943
I think you're on BlueSky.
1246
01:05:14,256 --> 01:05:16,416
Um, or, or, where, yeah,
1247
01:05:16,441 --> 01:05:17,451
you'll figure out where I'm hanging out.
1248
01:05:17,883 --> 01:05:18,753
Thanks so much for being here.
1249
01:05:18,934 --> 01:05:19,474
Thank you, Michael.
1250
01:05:19,474 --> 01:05:21,694
Thank you, Nirmal, for
joining and staying so long.
1251
01:05:22,087 --> 01:05:22,867
I'll see you in the next one.