
Is Docker Building the Best AI Stack?

Episode Transcript

This is DevOps and Docker Talk, and I am your host, Bret Fisher. This was a fun episode this week. Nirmal Mehta is back from AWS, and our guest this week was Michael Irwin of Docker. He is a recurring guest if you've been listening to this podcast for any length of time; I think we had him on earlier this year. He's been a friend for a decade: former Docker Captain, now Docker employee, advocating for Docker everywhere. He's all over the place, and I always have him on because he breaks stuff down to help us all understand what's going on at Docker.

There is a giant list of things we had to talk about this week, all AI related, all separate pieces of the puzzle that can be used independently or together. So in this episode we cover Docker Model Runner, which I've talked about a lot on this show over the last four months, for running open or free models locally or remotely, on servers or on your machine, wherever. Basically, you get to run your own models; Docker will host them for you, encapsulate them, and let you use the Docker CLI to manage them. Then we have the Hub model catalog, so you can pick one of the dozens of models available on Docker Hub. And we talk about Gordon AI, which is their chatbot built into Docker Desktop and the Docker CLI.
We then get into the MCP Toolkit and the Hub's MCP catalog, and how to bring all our tools into our local AI, or, if you're using some other AI, how to use your own MCP tools with it. We talk about how Docker manages all that using the MCP Gateway they recently open sourced, which fronts all those tools and helps you manage them.

We also get into Compose, and how you add models, the MCP Gateway, and MCP tools into your Compose files. Then we talk a little bit about how to use that for building agents. And then there's Offload, which allows you to build and run containers and run Docker models, all in Docker's cloud. They call it Offload, which is actually a pretty unique name; most other companies would have just called it Docker Cloud, but they call it Docker Offload. Great name. You just flip a toggle inside your Docker UI and you're good to go; everything's remote.

And then, at the very end, we talk about how Compose, at least the Compose YAML files, now works inside Google Cloud Run, probably coming to other places soon. So we go for almost 90 minutes on this show, for good reason: we talk about the use cases of all these different AI parts of the puzzle.
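The Compose integration mentioned here, declaring models and the MCP Gateway alongside your services, looks roughly like the sketch below. This is a minimal illustration, not a file from the episode; the image, model tag, and server list are assumptions chosen just to show the shape:

```yaml
# Sketch of a compose.yaml wiring an app to a local model and MCP tools.
services:
  app:
    build: .
    models:
      - llm                           # Compose injects the model's endpoint as env vars
  mcp-gateway:
    image: docker/mcp-gateway         # the recently open-sourced gateway
    command: ["--servers=duckduckgo"] # which MCP servers to front

models:
  llm:
    model: ai/smollm2                 # pulled from the Docker Hub model catalog
```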
I'm excited because I hope it paints a complete picture for you of what Docker has released, how it all works together, and when you would choose each part for solving different problems. So please enjoy this episode with Nirmal and Michael Irwin of Docker.

Hi!

Hey. Awesome. I'm doing all right. We have a special guest: Michael, another Virginian. Hello.

Hello.

We've all known each other a decade. So Michael, tell us who you are.

So I'm Michael Irwin. I work at Docker on our developer success team, and I do a lot of teaching, training, and education, breaking the stuff down and trying to make it somewhat understandable. I've been with Docker for almost three and a half years now, so it's gone pretty quickly, but it's fun to spend time with the community and just help developers out: learn about this stuff, but also how do you actually take advantage of it and do cool stuff with it?

That's awesome. I mean, this is like a new wave of, essentially, development tooling, and it's pretty exciting.

Yeah. We've got a list, people. I only make lists on this show a couple of times a year.
Yeah, it does seem like every time I'm on the show, there's a list involved.

Yeah, that's a pretty decent list. Let's break it down, because we've got a lot to get through, and we realize that on Docker's channel and on this channel they've been talking about all these fun features. I've talked at length about Docker Model Runner and the Hub catalog. I don't have a bunch of videos on MCP or the MCP tools, but we've had streams on them before. I think the MCP Toolkit and the MCP Gateway are what I'm most excited about, so we're going to get to that in a little bit.

I think, technically, the first thing out of the gate was Gordon AI. To me, from a user's perspective, it's a Docker-focused, or maybe a developer-focused, AI chatbot, essentially similar to ChatGPT, but it's in my Docker interface. If I'm staring at the Docker Desktop interface, there's an Ask Gordon button. And if you've never clicked that, or if you clicked it once a couple of years ago, it has changed a lot. There have been enhancements, so now we have memory and threads, and it saves me from having to go to ChatGPT when I'm working, specifically around Docker stuff, though it feels like I can ask it anything developer related.

Can you tell me now, is this free for everyone?
How does this play out in terms of who can use this?

Yeah, so everybody can access it right now. The only limitations may be in some of our business orgs; those orgs have to enable it. We're just not going to roll out all the AI stuff to them by default, as most organizations are pretty cautious about that. But yeah, Gordon's available to everybody. It actually started off mostly as a documentation helper: just help me answer questions and keep up with new features and that kind of stuff. But, as you've noted, we've added new capabilities and new tools along the way. I was doing some experiments just the other day: hey Gordon, help me convert this Dockerfile to use the new Docker Hardened Images. And it would do the conversion and find the images my organization has that match the one I'm using in this Dockerfile. You're starting to see more and more capabilities built into it; it's a pretty fun little assistant.

Yeah. I highly recommend, for folks that are listening, if you've not touched Docker and you just open it up, check out Gordon and ask those questions you probably have. "Compose all the things": you could probably put that in there, and Gordon will probably try to compose all the things.
You know, one of my common uses for this is that I need to translate a docker run command into a Compose file, or back and forth: please give me the docker run equivalent of this Compose file, or please turn these docker run commands or this docker build command into a Compose file. And it saves me time. I mean, I've been doing this stuff almost every day for a decade, so it's not like I needed it done for me, but it's still faster than a pro typing it in from memory. I'm at the point now where I rarely need to refer to the docs, but it's still faster than me at writing a Compose file out and saving it to the hard drive.

Sometimes we debate whether AI tools are useful for senior versus junior developers, and blah, blah, blah. I don't know; as a senior, it saves me keystrokes and it saves me time. And unlike when it first came out a couple of years ago, I have tracked it and given feedback to the team, and it no longer makes Compose files with outdated Compose information. At least in the...

The version tag.

Yeah. Because of the version tag, yeah. I was a big proponent of, hey, this hasn't been in there for five years, why is it recommending it? And they fixed that.
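The docker run-to-Compose translation described here is mechanical enough to sketch. This is a rough illustration of the idea only, not how Gordon actually does it, and it handles just a few common flags:

```python
import shlex

def docker_run_to_compose(cmd: str) -> dict:
    """Rough sketch of a docker run -> Compose translation.

    Only -p/-e/-v flags and the image name are handled; real tooling
    covers far more of the docker run surface area.
    """
    tokens = shlex.split(cmd)
    assert tokens[:2] == ["docker", "run"], "expected a 'docker run ...' command"
    svc: dict = {"ports": [], "environment": [], "volumes": []}
    image = None
    i = 2
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-p", "--publish"):
            svc["ports"].append(tokens[i + 1]); i += 2
        elif tok in ("-e", "--env"):
            svc["environment"].append(tokens[i + 1]); i += 2
        elif tok in ("-v", "--volume"):
            svc["volumes"].append(tokens[i + 1]); i += 2
        elif tok in ("-d", "--rm", "-it"):
            i += 1  # run-time flags with no Compose-file equivalent here
        else:
            image = tok; i += 1  # assume the first bare token is the image
    if image:
        svc["image"] = image
    # Drop empty sections so the result looks like a hand-written file
    return {"services": {"app": {k: v for k, v in svc.items() if v}}}

# docker_run_to_compose("docker run -d -p 8080:80 -e DEBUG=1 nginx")
# -> {'services': {'app': {'ports': ['8080:80'],
#                          'environment': ['DEBUG=1'], 'image': 'nginx'}}}
```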
So I'm very appreciative of that. We're going to save it: we're going to talk about MCP tools, and then we're going to come back to this, because it's gotten better, and one of the reasons it's gotten better is that it can talk to tools. So I don't want to spoil that. But yeah, I love the suggestions, because sometimes when you're staring at a blank chat box, it feels kind of like being a writer staring at a blank document. It's like, I know this is supposed to be a cool tool, but I don't know what to do with it. Where do I start?

Yeah. Good analogy.

There are even some cool spots. If you go to the Containers tab, you'll see on the right side a little magic icon, and then you can ask questions about that container. So if you have a container that's failing to start or whatever, you can basically start a thread around the context of why this thing isn't starting.

Interesting. I have not tried that. It actually reminds me of the little star we get for Gemini that's in every single Google app I open.
I guess we're all slowly converging on the starlight AI icons now. I would have thought it would have been a robot face, but we chose this vague collection of stars and pluses. And that is pretty cool. So is there more? Is it on images as well?

Yeah, anywhere you see that icon, you can start a conversation around that thing.

I'm going to start clicking on that more often to see what it tells me. Like, hey, what's the... what data...

Ooh, how do I back up this volume? That is pretty cool.

That is the number one question.

I love that they're context aware. They know that this is a volume, or at least the prompts do. It's like prompt suggestions, which feels a little meta: is the AI suggesting prompts for the AI?

It's not in this case, but it certainly could.

All right. So that's Ask Gordon.

Woo!

Alright. And that is, as you said, available to everyone running Docker Desktop. To be clear, that's not available for people running Docker Engine on Linux, right?

That is not.

There is a CLI to it, isn't there?
There's Docker AI.

But that's part of the Docker Desktop installation. That's a CLI plugin, and it's going to talk to the components that are bundled in Docker Desktop.

Yeah. So this is an AI that's outsourced to Docker; it's not running locally, it's just calling APIs from the Docker Desktop interface. So then, what if I don't want to use Gordon AI, but I want to run my own AI models locally? Which I feel is pretty niche, because a lot of people I talk to have an underwhelming GPU, and they don't have an M4 Mac or an NVIDIA desktop tower with a giant GPU in it. Granted, I guess there are models that run on CPUs, but I am not the model expert here. I've been trying to catch up this year, because I actually do have a decent laptop now with an M4 in it, so I can run at least some of the Apple models. But this Docker Model Runner is a pretty cool feature, and you can pick your models, so you can pick a really small one.

On Windows, it has to be an NVIDIA GPU to work. Is that right?

So it supports both NVIDIA and Adreno; if you've got a Qualcomm chip, it'll work there.
Okay. And on Mac, it just uses system memory, because that's how Macs work; they have unified memory, right? And for me it's been great, because I have a brand-new machine with 48 gigs of RAM. I know that is not normal. But these models on a Mac can be a little problematic, because if I have a bunch of browser tabs open, I can't run the full-size model I want to run; that's one of the problems, it's all using the same memory. So I have to do like we did back with VMs: I have to shut all my apps down and then run it, because I always want to run the biggest model possible. So I have this, like, 46-gig model; I think it's the new Devstral model that's supposed to be great for local development.

For those of you listening, if you want to go in depth, we're not going to have time for that, but Docker Model Runner lets you run models locally. You can technically run this now on Docker infrastructure too, right? Which we're going to get to in a little bit. Dun dun.

Okay. Yeah, there is this thing called Offload. Michael's going to know way more about that, because that's a really new feature. But you don't have to actually run these models locally now.
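Docker Model Runner exposes an OpenAI-compatible API, which is a big part of why it "just works" with existing tooling. A minimal Python sketch of calling it from the host; the base URL and port here are assumptions for a setup with host-side TCP access enabled in Docker Desktop's settings, and the model tag is just an example, so check your own configuration:

```python
import json
import urllib.request

# Assumed local endpoint for Docker Model Runner's OpenAI-compatible API
# (host-side TCP access enabled in Docker Desktop settings).
BASE_URL = "http://localhost:12434/engines/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

def chat(model: str, prompt: str, base_url: str = BASE_URL) -> str:
    """POST the payload to the local endpoint and return the reply text."""
    data = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(f"{base_url}/chat/completions", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage, after pulling a model (e.g. `docker model pull ai/smollm2`):
#   print(chat("ai/smollm2", "Summarize Docker Compose in one sentence."))
```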
Like, you could offload them, right?

Right.

Okay. What are you seeing? Are people coming to Docker trying to run the biggest models possible, with multiple GPUs, or are you just seeing people kind of tinkering? What are some analogies there?

There's a little bit of everything. I'd say a lot of folks are still exploring the space and figuring out the right ways of doing things, too. One of the interesting things, and one of the things to keep in mind: let's break out of the AI space for a second. Take, for example, a database. Okay, I'm using a Postgres database. If I'm using a Postgres database that's a managed offering out in the cloud, let's just say RDS, then during development I can run a Postgres container, and yeah, it's a smaller version and it's not a managed offering, but that works, because all the binary protocols are the same. I can basically just swap out my connection URL, and it just works.

But models are a little different.
And so I've seen some folks say: I'm going to use a smaller version of a model during local development, but then when I deploy, I'm going to use the larger hosted model that my cloud provider provides.

Okay, so maybe it's a parameter change: I'm using fewer parameters so it can actually run locally, and then I use the larger version.

But the analogy between me running a local PostgreSQL container and me running a local model breaks down: if it's a different model, it's a different model, and the results you get from that model are going to vary quite a bit. That's one of the things we have to keep reminding people of: if you use, for example, the four-billion-parameter version of a model during local development, yes, it can fit on your machine. But if you deploy and you're using the 32-billion-parameter version in production, those are very different models, and you're going to get very different interactions and different outputs from them. It's something to keep in mind as folks are looking at building their applications.

So, where are we seeing folks use this? Of course, if you can use the same model across your entire software development life cycle, that works out pretty well.
But we're also starting to see a bit of a rise in using fine-tuned, use-case-specific models, or folks training their own models, and using those for their specific application. Those tend to be a little smaller and more use-case specific, and then, yes, it makes sense: okay, I need to be able to run that on my own, et cetera. Again, I think a lot of folks are still feeling out the space and figuring out exactly how they should think about this and how they should use it. And of course, the tooling has to exist before you can actually do a lot of those experiments. So, in many ways, we've been building out that tooling to help support that experimentation. But I think folks are still figuring out exactly what this is going to look like for them going forward.

Yeah. I'm trying to imagine what enterprises are doing and building out, and I'm imagining it's not like this, but it reminds me of 20 years ago, buying a Google box, which I don't remember the name of, but it was this appliance you would put in your data center. It was yellow. Racked.

The Google Search Appliance.

There you go.
Yeah. And I don't know, Nirmal, if you had any customers back then, if you were a consultant back then...

I can't name those customers.

Well, I can. I was at the City of Virginia Beach, running IT and the data center there, or at least running the engineering groups. Those were back in the days when we didn't want Google indexing our internal infrastructure; you don't want your internal data to be accessed or used, potentially, by this big IT conglomerate. So they sell you an on-prem box, you put it in your data center, and it would scan and have access to everything you could give it. Back when apps didn't really have their own search, Google was providing that for us, and it would index our email, and index our file servers and databases if we wanted to give it access to them. I'm not sure that's around anymore, but at the time, Google wasn't going to give away their software, and we didn't all know how to run it. And I'm sure it was running on Linux, and in the mid-2000s we weren't yet running Linux at the city, for different reasons.

This kind of feels like the same moment for enterprise.
They're going to have to buy GPUs, probably for the first time, if they're going to run it on prem, and they're going to want to keep it separate. They're not training models; they just want to run things internally to access their internal data. Or maybe they're doing it in the cloud, and Nirmal's company, AWS, is providing them the GPUs, and then, presumably, because they're getting dedicated hardware, they won't have to worry about OpenAI or Anthropic having access to all their stuff.

So, with respect to Model Runner, and again, just a reminder, these are my own opinions and not those of my employer, Amazon Web Services: there are still a lot of use cases.
With what Michael was talking about, with respect to choosing lots of different models for different types of tasks, I think there's probably a hybrid model at some point, where folks are using different fine-tuned niche models for specific tasks locally. And then, as hardware improves, hopefully the three-year-old developer laptop you get at your corporation has enough, or the models get optimized enough that they can run on the CPU or the GPUs you have on a corporate laptop, and there'll be some tooling, probably embedded into the development tooling, or you can choose your own models. And then there'll be other models where you need to reach out to the hyperscalers, because you're just not going to get the depth of reasoning or the depth of knowledge from a quantized DeepSeek running on your Mac that you will from something like Claude in the cloud. But again, this is all changing very, very fast. Right now we're in a state where folks have to spend money to access those larger models, which hasn't been the pattern in software development in a long time, right? Not everyone has that advantage.
I mean, if you want to use Claude Code and not hit any major limits, you've got to pay $200 a month, and not everyone's going to be able to afford $2,400 a year to do software development. There are also edge use cases, right? IoT devices, just trying to figure it out. And just kicking the tires: like Michael said, I think everyone is just trying to kick the tires as cheaply as possible.

Model Runner feels like a gateway, like a gateway drug, to get me hooked on the idea. I mean, I wasn't paying attention; I wasn't an ML person, and I wasn't building AI infrastructure. But Docker Model Runner, and, to a lesser extent, Ollama... Ollama always felt like it was more for the people who were doing that, but bringing this capability into a tool I'm already using actually felt like, okay, this is meant for me now. I don't have to understand weights and all the intricacies of how models are built and work; I don't have to understand the different file formats and whether they work on my particular machine. It just all kind of works. It's Docker-easy at that point for me.
I think that's the key point here: let's try to increase access to these capabilities and let folks start to experiment. Again, we as an industry are still trying to figure out exactly how to use a lot of these tools: okay, what is the right size model for the job at hand? And in order to be able to do that experimentation, you have to increase access to it. That's still kind of where we are in many ways.

So did we check off another thing on the list?

About to. I'm going to have to keep saying this: we're not a news podcast. We don't talk about the latest things, in general, in tech. But I've got to give a shout-out to Fireship, because between that and a few other channels, I've really learned a lot this year about models. There's this little diagram of the state of a lot of the open-weight, or open, or free, I'm just going to say free, models. They're free in some capacity, and you can download them. And I was using Devstral. We talked about it a couple of times on this show already; it's a model that came out in May. I actually had a newsletter issue on that. Shout out to Bret.news: there's this guy, Bret, he makes a newsletter, Bret.news, you can go check that out.
384 00:19:54,852 --> 00:19:56,002 news, you can go check that out. 385 00:19:56,282 --> 00:20:01,372 And I talked about this, that maybe this was the sweet spot because it was small 386 00:20:01,372 --> 00:20:07,212 enough, you could run it with a modern GPU or modern Mac, and it wasn't the 387 00:20:07,212 --> 00:20:11,632 worst, nothing like the frontier models that we get with OpenAI and Anthropic, 388 00:20:11,632 --> 00:20:13,682 but it was something that was better. 389 00:20:15,342 --> 00:20:16,792 And, that's called Devstral. 390 00:20:17,302 --> 00:20:19,962 And then we had Qwen 3 just come out, I don't know, a 391 00:20:19,962 --> 00:20:22,872 week ago or something that is 392 00:20:23,198 --> 00:20:24,478 I think it was earlier this week. 393 00:20:24,892 --> 00:20:27,222 Oh, see, time warp of AI, 394 00:20:27,378 --> 00:20:27,778 I think it 395 00:20:27,922 --> 00:20:29,262 three days is like three weeks. 396 00:20:29,746 --> 00:20:30,926 It was two days ago. 397 00:20:31,198 --> 00:20:31,648 Yeah, 398 00:20:31,912 --> 00:20:32,752 Oh gosh. 399 00:20:33,242 --> 00:20:35,972 I always assume that I'm seeing Fireship videos late, but, yeah, 400 00:20:35,972 --> 00:20:37,432 one day ago, so yeah, that's true. 401 00:20:37,432 --> 00:20:41,387 One day ago, there's a newer model coming out from Alibaba, right? 402 00:20:41,467 --> 00:20:45,687 And it is even better, although it does take more GPU memory, I 403 00:20:45,918 --> 00:20:48,438 that's hard to run like on a laptop. 404 00:20:48,598 --> 00:20:54,901 I don't think you can, so that's a perfect encapsulation of like why Docker Model 405 00:20:54,901 --> 00:20:59,951 Runner is there, but also why it's going to be a mix of models going forward. 406 00:21:00,421 --> 00:21:05,751 So we've, we have an AI that helps you use Docker and get started. 
407 00:21:05,871 --> 00:21:11,121 We have a tool that helps you run models locally, and understand that 408 00:21:11,481 --> 00:21:13,611 what's the next step in this journey? 409 00:21:14,088 --> 00:21:17,378 By the way, you go to Docker Hub to look for models, or you can 410 00:21:17,378 --> 00:21:20,388 do it in Docker Desktop, or you can look at things in the CLI. 411 00:21:20,398 --> 00:21:22,148 There's many ways to see the models. 412 00:21:22,508 --> 00:21:24,128 You can pull things from HuggingFace now. 413 00:21:24,351 --> 00:21:24,981 That's pretty sweet. 414 00:21:25,090 --> 00:21:26,660 Can we build, can we make our own yet? 415 00:21:26,660 --> 00:21:27,500 Or do we have that? 416 00:21:28,480 --> 00:21:28,730 We can 417 00:21:28,874 --> 00:21:32,464 so, you can package, but you would have to already have the GGUF 418 00:21:32,484 --> 00:21:33,914 file and all that kind of stuff. 419 00:21:33,914 --> 00:21:36,174 So we don't have a lot of the tooling there to help you 420 00:21:36,174 --> 00:21:37,814 actually create the model itself. 421 00:21:38,004 --> 00:21:41,554 Although a lot of folks will use container based environments to do that. 422 00:21:41,914 --> 00:21:44,264 We don't have any specific tooling around that ourselves. 423 00:21:44,275 --> 00:21:46,385 So is there, you're saying there's a doc, oh, there, there is. 424 00:21:46,385 --> 00:21:47,375 Oh, I didn't realize. 425 00:21:47,415 --> 00:21:52,685 So there is now a docker model package CLI for creating, basically 426 00:21:52,685 --> 00:21:59,615 wrapping the GGUF or, other models or whatever, into the Docker OCI 427 00:21:59,625 --> 00:22:04,895 standard format for shipping and pulling and pushing, essentially, Docker 428 00:22:05,505 --> 00:22:08,535 models that are in Docker Hub, that are in the Docker format. 429 00:22:09,034 --> 00:22:10,004 OCI format. 430 00:22:10,044 --> 00:22:10,434 Yep. 
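The packaging the hosts are describing is essentially wrapping a weights file in an OCI-style artifact so it can be pushed and pulled like any image. Here is a minimal sketch of that idea in Python; the media types and annotation key are illustrative placeholders, not the exact ones Docker's `docker model package` tooling emits:

```python
import hashlib
import json

def package_gguf(gguf_bytes: bytes, model_name: str) -> dict:
    """Wrap a GGUF blob in a minimal OCI-style manifest.

    Illustrative only: the real tooling uses Docker's own media
    types and adds more metadata, but the shape is the same --
    content-addressed layers described by a small JSON manifest.
    """
    digest = "sha256:" + hashlib.sha256(gguf_bytes).hexdigest()
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "artifactType": "application/vnd.example.model",  # hypothetical media type
        "layers": [
            {
                # the GGUF weights ride along as one content-addressed layer
                "mediaType": "application/vnd.example.gguf",  # hypothetical
                "digest": digest,
                "size": len(gguf_bytes),
                "annotations": {"org.opencontainers.image.title": model_name},
            }
        ],
    }

manifest = package_gguf(b"GGUF...fake weights...", "my-model")
print(json.dumps(manifest, indent=2))
```

Because the result is just an OCI artifact, the existing push/pull machinery of a registry like Docker Hub works on it unchanged, which is the whole point of the format.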
431 00:22:11,484 --> 00:22:15,324 So when we look at agentic applications, step one is, you need models. 432 00:22:15,324 --> 00:22:17,174 It's kind of the brains of the operations. 433 00:22:17,469 --> 00:22:19,029 And then you need tools. 434 00:22:19,029 --> 00:22:21,439 And so you've got highlighted here the MCP toolkit. 435 00:22:21,439 --> 00:22:25,299 That was kind of the first adventure into the MCP space. 436 00:22:25,299 --> 00:22:29,419 And that one was focused a little bit more on how do we provide tools to the 437 00:22:29,429 --> 00:22:32,539 other agentic applications that you are already running on your machine. 438 00:22:32,539 --> 00:22:39,929 So Claude Desktop or using VS Code Copilot on my machine or Cursor, etc. 439 00:22:40,579 --> 00:22:44,709 How do we provide those MCP servers 440 00:22:45,599 --> 00:22:49,079 in containerized ways, secure credential injection, etc.? 441 00:22:49,259 --> 00:22:52,179 Basically manage the life cycle of those MCP servers. 442 00:22:52,559 --> 00:22:57,809 Again, in the use case of connecting them to your other agentic applications. 443 00:22:58,449 --> 00:23:00,700 And so, again, this is kind of where we started our MCP journey. 444 00:23:01,080 --> 00:23:04,520 If you see, flipping through a couple of these, actually we just released a 445 00:23:04,520 --> 00:23:09,030 Docker Hub MCP server that allows you to search for images on Hub or, you 446 00:23:09,030 --> 00:23:13,070 know, those within your organization, which is super helpful for like maybe 447 00:23:13,460 --> 00:23:15,520 a, write me a Dockerfile that does X. 448 00:23:15,550 --> 00:23:16,130 Well, cool. 449 00:23:16,130 --> 00:23:18,740 Let's go find the right image that should be used for that. 450 00:23:19,140 --> 00:23:22,060 So again, starts to open up some of these, additional capabilities here. 451 00:23:22,826 --> 00:23:23,286 Yeah. 
452 00:23:23,286 --> 00:23:28,956 And inside the Docker desktop UI, there is now a beta tab, essentially, 453 00:23:29,346 --> 00:23:30,846 that's called MCP Toolkit. 454 00:23:31,346 --> 00:23:39,796 And it is a GUI that allows me to explore and enable one of, 141 different tools 455 00:23:39,836 --> 00:23:41,636 and growing that Docker has added. 456 00:23:41,966 --> 00:23:47,676 So like a lot of the other places on the internet that either they host models 457 00:23:47,676 --> 00:23:53,346 like Anthropic or OpenAI, or They're a place where you can create AI applications 458 00:23:53,626 --> 00:23:58,036 and all those places have started to create their own little portals for 459 00:23:58,036 --> 00:24:01,516 finding tools and they may or may not, I mean, most of them all now settle on 460 00:24:01,516 --> 00:24:05,286 MCP, but before we had really MCP as the protocol standard, they were already 461 00:24:05,286 --> 00:24:09,386 doing like OpenAI was doing this before, but they were very, it was proprietary. 462 00:24:09,386 --> 00:24:14,116 You don't know how Evernote or Notion got, showed up as a tool 463 00:24:14,126 --> 00:24:16,556 feature in ChatGPT, it did. 464 00:24:17,371 --> 00:24:21,021 But we just assumed that was their custom integration and now we have this 465 00:24:21,021 --> 00:24:24,531 standard called MCP that everything should interact with everything 466 00:24:24,551 --> 00:24:26,941 properly the way that they should. 467 00:24:27,631 --> 00:24:29,491 At least right now it's the one that's winning. 468 00:24:29,491 --> 00:24:31,531 We don't know whether we'll still be talking about MCP in 469 00:24:31,531 --> 00:24:33,321 five years, but it's here now. 470 00:24:33,631 --> 00:24:34,751 It's what we're talking about now. 471 00:24:35,091 --> 00:24:39,034 And this lights up a lot of capabilities. 472 00:24:39,544 --> 00:24:42,334 In other words, you turn on an MCP tool. 
473 00:24:43,334 --> 00:24:46,184 And that sits behind something called MCP Gateway. 474 00:24:46,434 --> 00:24:48,304 So tell me, what is MCP Gateway? 475 00:24:48,749 --> 00:24:49,219 Yeah. 476 00:24:49,219 --> 00:24:53,719 So at the end of the day, the toolkit is a combination of several different things. 477 00:24:53,749 --> 00:24:57,199 The MCP gateway is actually a component that we just open 478 00:24:57,199 --> 00:24:58,539 sourced at WeAreDevelopers. 479 00:24:58,539 --> 00:25:00,054 So you can actually run this gateway directly 480 00:25:00,344 --> 00:25:01,844 in a container, completely on its own. 481 00:25:02,314 --> 00:25:05,724 And the MCP gateway is what's actually responsible for managing 482 00:25:05,734 --> 00:25:07,554 the lifecycle of these MCP servers. 483 00:25:07,711 --> 00:25:10,471 It itself is an MCP server. 484 00:25:10,511 --> 00:25:12,531 Think of it more like an MCP proxy. 485 00:25:12,731 --> 00:25:17,351 It exposes itself as an MCP server that then can connect to your applications. 486 00:25:17,661 --> 00:25:20,596 But when you ask that server, hey, what tools do you have? 487 00:25:21,056 --> 00:25:25,146 It's really delegating, or I mean, it's using cached versions of, okay, 488 00:25:25,146 --> 00:25:27,546 what are the downstream MCP servers? 489 00:25:28,316 --> 00:25:31,336 And so it's acting as basically a proxy here. 490 00:25:31,946 --> 00:25:37,141 So when requests come in and say, hey, you know, from the agentic app, if 491 00:25:37,141 --> 00:25:42,233 I want to execute this tool, go do a search on DuckDuckGo, at that point, the 492 00:25:42,233 --> 00:25:47,353 MCP gateway will actually spin up the container, that DuckDuckGo MCP server, 493 00:25:47,643 --> 00:25:52,013 and then delegate the request to that container, which then does the search, and 494 00:25:52,013 --> 00:25:54,043 then the MCP gateway returns the results. 
495 00:25:54,043 --> 00:25:58,023 So kind of think of it as a proxy that's managing the lifecycle of all 496 00:25:58,023 --> 00:26:01,803 those containers, but also, you know, injecting the credentials, configuration. 497 00:26:02,143 --> 00:26:05,203 It also does other things like actually looking at what's going in and out of 498 00:26:05,203 --> 00:26:09,233 the prompts going through the proxy to make sure, you know, secrets aren't being 499 00:26:09,233 --> 00:26:11,243 leaked or, that kind of stuff as well too. 500 00:26:11,293 --> 00:26:14,743 And we're even starting to do some further explorations of what are other 501 00:26:14,743 --> 00:26:16,883 ways to kind of secure those MCP servers. 502 00:26:16,933 --> 00:26:20,283 You know, for example, a file system one should never have network access. 503 00:26:20,483 --> 00:26:24,473 Cool, so let's, when that container starts, you get no network access, 504 00:26:24,493 --> 00:26:27,923 or, the GitHub MCP server, you know, it's talking to the GitHub APIs. 505 00:26:29,178 --> 00:26:32,898 Let's only authorize those host names that it can communicate with. 506 00:26:32,908 --> 00:26:35,988 So, you know, start to do a little bit more of a kind of permissioning 507 00:26:35,998 --> 00:26:40,238 model around these MCP servers, which is where a lot of people are kind of 508 00:26:40,238 --> 00:26:44,038 most cautious and nervous about MCP servers, because it's, they're completely 509 00:26:44,038 --> 00:26:48,458 autonomous for the most part, and you have to trust what's going on there. 510 00:26:48,478 --> 00:26:52,338 This exact feature is both necessary and also Solomon Hykes' 511 00:26:52,358 --> 00:26:54,128 prediction from a month ago. 512 00:26:54,128 --> 00:26:57,408 He was on this show and was saying that we're going to see all these 513 00:26:57,408 --> 00:27:00,398 infrastructure companies and all these tooling companies that are 514 00:27:00,398 --> 00:27:01,908 going to offer to lock this shit down. 
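Michael's description of the gateway, one MCP server in front that advertises every downstream tool and only spins up the owning server when a tool is actually called, can be sketched in a few lines. This is a toy model of the behavior, not Docker's implementation; the server names and tools are made up:

```python
class MCPGateway:
    """Toy MCP-style proxy: exposes one flat tool list to the client
    and lazily 'starts' the downstream server that owns a tool only
    when that tool is actually invoked."""

    def __init__(self, servers):
        # servers: name -> {"tools": {tool_name: callable}}
        self.servers = servers
        self.running = set()  # which downstream servers have been started

    def list_tools(self):
        # the client sees one toolbox, regardless of who owns each tool
        return [t for s in self.servers.values() for t in s["tools"]]

    def call_tool(self, name, *args):
        for srv_name, srv in self.servers.items():
            if name in srv["tools"]:
                if srv_name not in self.running:
                    self.running.add(srv_name)  # "spin up the container" on demand
                return srv["tools"][name](*args)
        raise KeyError(f"no server provides tool {name!r}")

gw = MCPGateway({
    "duckduckgo": {"tools": {"search": lambda q: f"results for {q}"}},
    "filesystem": {"tools": {"read_file": lambda p: "file contents"}},
})
print(gw.list_tools())
print(gw.call_tool("search", "docker"))
print(gw.running)  # only the server that was actually used got started
```

A real gateway layers credential injection, prompt inspection, and network policy on top of this delegation step, but the proxy shape is the core of it.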
515 00:27:01,908 --> 00:27:03,788 I think it's the quote I have to get from him. 516 00:27:04,217 --> 00:27:04,777 that sounds right. 517 00:27:04,928 --> 00:27:08,578 he compared the origin of containers and how it started with developers. 518 00:27:08,578 --> 00:27:12,268 And then eventually it took it and sort of managed it in the infrastructure 519 00:27:12,268 --> 00:27:16,708 layer and provided all the restrictions and security and limitations and 520 00:27:16,738 --> 00:27:17,858 configuration and all this stuff. 521 00:27:18,268 --> 00:27:20,618 And the same thing's happening to MCP. 522 00:27:20,828 --> 00:27:25,348 Where it started out as a developer tool to empower developers to do all these 523 00:27:25,348 --> 00:27:29,328 cool things with AI that they couldn't do and let the AI actually do stuff for us. 524 00:27:29,588 --> 00:27:34,038 And now very quickly in a matter of months, IT is coming in and saying, 525 00:27:34,038 --> 00:27:35,218 okay, we're going to lock this down. 526 00:27:35,218 --> 00:27:35,898 It's crazy. 527 00:27:35,908 --> 00:27:39,088 You can, you know, your prompts can delete your, drop your databases, 528 00:27:39,088 --> 00:27:42,748 your, as we just saw happen on the internet recently this week. 529 00:27:43,038 --> 00:27:48,598 I want to, on this gateway topic though, it can sound complicated. 530 00:27:49,013 --> 00:27:52,403 And maybe the internals are a little, and there was obviously 531 00:27:52,403 --> 00:27:54,003 code built into this program. 532 00:27:54,223 --> 00:27:58,353 But for those of us that aren't maybe building agents yet, or 533 00:27:58,353 --> 00:28:03,593 like really getting into building apps that use AIs in the app, this 534 00:28:03,593 --> 00:28:05,733 just appears as kind of magic. 
535 00:28:05,773 --> 00:28:11,263 Like it, you go into the Docker Desktop UI, I enable, I go through the toolkit, 536 00:28:11,273 --> 00:28:15,788 there's all these suggestions, everything from, the MCP server for GitHub itself to 537 00:28:15,998 --> 00:28:20,418 an MCP server that could give me access to Grafana data to accessing the Heroku 538 00:28:20,418 --> 00:28:24,248 API and you're looking at all these things and you're just like, I'm enabling them. 539 00:28:24,258 --> 00:28:25,518 It's like a kid in a candy store. 540 00:28:25,518 --> 00:28:26,348 I'm just going, check, check, check. 541 00:28:26,348 --> 00:28:27,778 Yeah, I want Notion. 542 00:28:27,778 --> 00:28:28,538 I want Stripe. 543 00:28:28,588 --> 00:28:31,518 I get them in a list, they're enabled, which means they're 544 00:28:31,518 --> 00:28:32,968 not actually running, right? 545 00:28:32,968 --> 00:28:37,238 They're waiting to be called before the gateway runs them in memory. 546 00:28:37,518 --> 00:28:40,448 That's all transparent to me, I don't realize that's happening. 547 00:28:40,838 --> 00:28:47,108 And if I choose to use this toolkit with Gordon, if I just go into Gordon, 548 00:28:47,158 --> 00:28:52,668 and in the Gordon AI, if I don't want to run a local model myself, or I'm 549 00:28:52,848 --> 00:28:57,338 not using Claude Desktop or something that gives me the ability to enable MCP 550 00:28:57,338 --> 00:29:02,318 tools, I can just go in here and say, enable all my MCP tools, all 34 of them. 551 00:29:02,418 --> 00:29:08,188 I've got Git ones and I've got, and so now what that means is the Gordon 552 00:29:08,188 --> 00:29:13,618 AI can now use these tools, which makes this free AI even smarter. 553 00:29:14,008 --> 00:29:20,592 And I can say, is there a Docker Hub image NGINX? 554 00:29:20,712 --> 00:29:21,812 I don't know if there is. 555 00:29:22,407 --> 00:29:22,787 Let's see. 556 00:29:22,817 --> 00:29:23,817 I've never even tested this. 
557 00:29:23,817 --> 00:29:25,947 So it's kind of a, what could go wrong? 558 00:29:25,977 --> 00:29:28,137 Let's use Gordon live on the internet and see what happens. 559 00:29:28,137 --> 00:29:32,037 Yeah, I was just saying, look at it, going out and checking, using the 560 00:29:32,037 --> 00:29:37,297 Docker, the newly created Docker Hub MCP tool that you had, just, released. 561 00:29:38,107 --> 00:29:46,027 So is this going through a, MCP gateway or is this not with the MCP gateway yet? 562 00:29:46,245 --> 00:29:53,165 When you, in Gordon AI, flip the switch to say, yes, I want to use the MCP toolkit, 563 00:29:53,165 --> 00:29:59,319 basically what that's doing is, in the Gordon AI application here, it's enrolling 564 00:29:59,529 --> 00:30:01,729 the MCP toolkit as an MCP server. 565 00:30:02,118 --> 00:30:05,796 And so then it's going to ask the MCP toolkit, hey, what 566 00:30:05,796 --> 00:30:06,586 tools do you have available? 567 00:30:06,586 --> 00:30:10,086 And so when you saw that list of tools, that's again coming from the gateway. 568 00:30:10,536 --> 00:30:15,961 Gordon is just simply treating the MCP toolkit as an MCP server, which in 569 00:30:15,961 --> 00:30:18,141 itself is going to launch MCP servers. 570 00:30:18,141 --> 00:30:21,301 So that's kind of why I mentioned, it's kind of thinking of it like a proxy there. 571 00:30:21,327 --> 00:30:22,707 Yeah, it does feel like one. 572 00:30:22,707 --> 00:30:23,197 Yeah. 573 00:30:23,307 --> 00:30:27,387 And this, for those watching, like it didn't actually work because I don't 574 00:30:27,397 --> 00:30:30,377 actually have access to hardened images, but, I just wanted to see what it would, 575 00:30:30,377 --> 00:30:32,117 what it'd say, but it, the, the UI 576 00:30:32,318 --> 00:30:33,648 Which is the right answer? 577 00:30:33,708 --> 00:30:34,788 Which is the right answer for 578 00:30:35,027 --> 00:30:36,317 did the right thing. 
579 00:30:36,367 --> 00:30:39,607 it didn't expose a vulnerability in the MCP server. 580 00:30:39,657 --> 00:30:43,047 But yeah, so it basically, I can give Gordon AI more. 581 00:30:43,467 --> 00:30:46,407 And I can do a lot more functionality, more abilities to do things without 582 00:30:46,417 --> 00:30:49,497 having to run my own model, without having to figure out Claude Desktop. 583 00:30:49,767 --> 00:30:54,927 But I will say, because I'm in love with this toolkit so much, because I love 584 00:30:54,927 --> 00:31:00,237 this idea of one place for my MCP tools, for me to enter in the API secrets so it 585 00:31:00,237 --> 00:31:03,987 can access my notion and my Gmail but I don't want to have to do that in Claude 586 00:31:04,007 --> 00:31:08,057 Desktop and then in warp and then in VS code and then in Docker and then in 587 00:31:08,057 --> 00:31:10,637 Ollama and like every place I might run. 588 00:31:11,367 --> 00:31:14,127 A tool that needs MCP tools or access to an LLM. 589 00:31:14,417 --> 00:31:16,807 So I did it in Docker Desktop. 590 00:31:16,827 --> 00:31:18,907 I enabled the ones and set them up the way I wanted. 591 00:31:19,347 --> 00:31:25,537 And then inside of my tools around my computer that all support AI MCPs, 592 00:31:26,537 --> 00:31:31,787 they all have now added MCP client functionality that lets me talk to 593 00:31:31,847 --> 00:31:36,364 another MCP, any MCP server that speaks proper MCP through their API. 594 00:31:36,694 --> 00:31:40,454 And in this case, what I've done in the warp terminal, because it does support 595 00:31:40,464 --> 00:31:46,494 MCP, is I just tell it, the command it's going to run is docker mcp gateway run. 596 00:31:46,904 --> 00:31:50,124 And it uses the standard in and standard out, which is one of 597 00:31:50,124 --> 00:31:51,714 the ways that you can use mcp. 598 00:31:52,124 --> 00:31:57,084 And then I suddenly have all 34 tools that we enabled in my docker desktop. 
599 00:31:57,514 --> 00:32:01,254 Available in Warp, just as long as Docker Desktop's running, that's all I gotta do. 600 00:32:01,874 --> 00:32:06,864 And then, because Warp is using Claude Sonnet 4, or whatever I told Warp to 601 00:32:06,864 --> 00:32:10,204 do, Docker doesn't care about that, because I'm not asking it to use, 602 00:32:10,614 --> 00:32:12,704 think that, we talked about this it's called bring your own key, I guess 603 00:32:12,704 --> 00:32:13,664 that's what everybody's talking about. 604 00:32:14,024 --> 00:32:19,334 Uh, bring your own key is when you want to bring your own Model to whatever 605 00:32:19,384 --> 00:32:19,854 to access, 606 00:32:20,134 --> 00:32:21,464 you're a key to access the model. 607 00:32:21,464 --> 00:32:25,234 Yeah, but in warp in particular, like this is nuanced, but in 608 00:32:25,234 --> 00:32:28,434 warp, you can't usually, you can't yet access your own models yet. 609 00:32:28,444 --> 00:32:30,394 I think they're going to make that an enterprise feature or something. 610 00:32:30,744 --> 00:32:32,754 But if I open up VS code, I could do the same thing. 611 00:32:32,754 --> 00:32:36,304 If I opened up a ChatGPT desktop, I could do the same thing. 612 00:32:36,664 --> 00:32:41,494 And, Aider or like any of the CLI tools, although anything that accepts MCP 613 00:32:41,524 --> 00:32:44,314 so far, I've gotten to work this way. 614 00:32:44,444 --> 00:32:49,914 And it's been awesome because all these, all these different IDEs and AI tools 615 00:32:50,154 --> 00:32:51,324 all set up a little bit different. 616 00:32:51,324 --> 00:32:54,374 Goose you set up differently than Claude Desktop. 617 00:32:54,374 --> 00:32:54,854 So 618 00:32:54,865 --> 00:32:55,565 Client. 619 00:32:56,225 --> 00:33:00,936 they're all, and all they, all of them have different knobs and, ways of 620 00:33:00,936 --> 00:33:05,976 controlling MCP servers and at varying degrees of control and flexibility. 
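Under the hood, pointing Warp (or any MCP client) at `docker mcp gateway run` means the client spawns that command as a child process and exchanges newline-delimited JSON-RPC 2.0 messages over its stdin and stdout — that's MCP's stdio transport. A rough sketch of what those messages look like; the protocol version string is one of the published MCP revisions, but treat the details as illustrative rather than a complete client:

```python
import json

def jsonrpc_line(method, params=None, msg_id=None):
    """Build one newline-delimited JSON-RPC 2.0 message, the framing
    MCP's stdio transport uses over a child process's stdin/stdout."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    if msg_id is not None:
        msg["id"] = msg_id
    return json.dumps(msg) + "\n"

# What a client like Warp would write to the gateway process's stdin:
init_req = jsonrpc_line("initialize", {"protocolVersion": "2025-03-26"}, msg_id=1)
tools_req = jsonrpc_line("tools/list", msg_id=2)
print(init_req.strip())
print(tools_req.strip())
```

The gateway answers `tools/list` with every tool it is proxying, which is why all 34 enabled tools show up at once in any client you point at it.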
621 00:33:06,416 --> 00:33:09,926 So this is really nice because then you can also just have 622 00:33:09,936 --> 00:33:11,706 all those tools running if you 623 00:33:11,815 --> 00:33:12,265 Right. 624 00:33:12,335 --> 00:33:15,295 They look like one giant MCP server. 625 00:33:15,295 --> 00:33:19,875 Yeah, because normally I would have to add each MCP tool as its own server, 626 00:33:20,021 --> 00:33:22,811 I think, it feels like some of the tools are all standardizing on Claude 627 00:33:22,831 --> 00:33:25,451 Desktop as, that config file, which I 628 00:33:25,563 --> 00:33:25,713 the 629 00:33:25,713 --> 00:33:27,003 mcp.json, 630 00:33:27,191 --> 00:33:31,771 Yeah, it feels like that, like everyone's settling on just using that one file. 631 00:33:32,131 --> 00:33:35,731 Which is, I guess it's kind of, feels kind of hacky, but I guess it's fine. 632 00:33:36,211 --> 00:33:38,731 It feels like every editor using my Vim settings. 633 00:33:38,731 --> 00:33:39,351 It's like, no, no, no, no. 634 00:33:39,381 --> 00:33:39,851 Calm down. 635 00:33:39,851 --> 00:33:42,041 I don't necessarily want you all to use the same file. 636 00:33:42,261 --> 00:33:45,031 I don't want you all overwriting each other and changing the same file. 637 00:33:45,081 --> 00:33:49,901 So you're bringing up a really important thing, which is, since we're new to this, 638 00:33:50,001 --> 00:33:55,451 the number of MCP servers is not too many yet, even though it does feel like there's 639 00:33:55,452 --> 00:33:56,571 probably like a lot of MCP servers. 640 00:33:56,572 --> 00:33:59,541 I've been using 10 new MCP servers since we've started this conversation. 641 00:34:00,201 --> 00:34:04,451 But it's still like a number of tools that we can still rationalize about. 
642 00:34:04,881 --> 00:34:10,319 But probably in another month or two at this rate, there is a limit to how many 643 00:34:10,329 --> 00:34:17,434 MCP tools one single client, I guess you could say, or one instantiation of 644 00:34:17,434 --> 00:34:23,824 a task that you're using like Claude Code or Cline, its context window can 645 00:34:23,844 --> 00:34:25,924 only use a certain amount of tools. 646 00:34:26,354 --> 00:34:33,424 And so, are there some ideas about breaking up in the MCP gateway, like having maybe 647 00:34:33,424 --> 00:34:37,934 like sets of tools that have like specific supersets of tools or something like that? 648 00:34:38,014 --> 00:34:38,284 Yeah. 649 00:34:38,284 --> 00:34:38,914 Good question. 650 00:34:38,934 --> 00:34:40,314 And so that's a good call out. 651 00:34:40,314 --> 00:34:43,774 And so I actually want to, zoom in on that just a tiny bit there, because 652 00:34:43,844 --> 00:34:46,584 for folks that may be new to this, they may not quite understand that 653 00:34:47,154 --> 00:34:51,584 the way that tools work is basically, it's taking all the tool descriptions. 654 00:34:51,904 --> 00:34:53,034 Okay, here's a tool name. 655 00:34:53,034 --> 00:34:54,704 Here's when I'm going to use this tool. 656 00:34:54,704 --> 00:34:57,214 Here's the parameters that are needed to invoke this tool. 657 00:34:57,594 --> 00:35:00,584 And it's sending that to the model on every request. 658 00:35:01,754 --> 00:35:05,364 And so the model's having to read all that and basically say, hey, 659 00:35:05,364 --> 00:35:07,764 based on this conversation, hey, here's a toolbox of stuff that 660 00:35:07,764 --> 00:35:09,564 I may or may not be able to use. 
661 00:35:10,234 --> 00:35:15,889 But as Nirmal just pointed out, like that takes context window there and 662 00:35:16,619 --> 00:35:21,099 granted yes some of the newer models have incredibly huge context windows but 663 00:35:21,099 --> 00:35:25,589 depending on the use case, it's going to affect your speed, it's going to 664 00:35:25,589 --> 00:35:29,559 affect the quality, and so yeah, you do want to be careful of like, okay, I'm 665 00:35:29,559 --> 00:35:32,619 not just going to go in there and just flip the box on all the MCP servers. 666 00:35:32,629 --> 00:35:33,739 Now you have access to everything. 667 00:35:33,739 --> 00:35:36,619 Like, you do want to be a little conscious of that as well. 668 00:35:36,929 --> 00:35:40,599 in fact, I found it funny in a, I was playing in Cursor not long ago, 669 00:35:40,599 --> 00:35:44,119 and you know, they have even just YOLO mode, just go crazy with it. 670 00:35:44,529 --> 00:35:48,339 But even they have A warning once you, I think it's after you enable 671 00:35:48,339 --> 00:35:54,009 the 31st tool of like, hey, heads up, you're getting a little crazy. 672 00:35:54,499 --> 00:35:58,599 So like, I'm like, for the one that has YOLO mode to call me out for being 673 00:35:58,599 --> 00:36:02,569 crazy for too many tools, like it was, it's again, kind of a reminder of just, 674 00:36:03,029 --> 00:36:05,899 okay, you do want to be conscious of the number of tools that you're using. 675 00:36:06,369 --> 00:36:07,619 so to actually answer the question. 676 00:36:07,619 --> 00:36:07,739 Yeah. 677 00:36:07,739 --> 00:36:12,099 It's been something that we've been exploring and kind of waiting to just see 678 00:36:12,099 --> 00:36:14,169 what the feedback is gonna be on that. 679 00:36:14,229 --> 00:36:17,349 Are there separate tool sets that, clients can connect to. 
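The point Michael and Nirmal are making — every enabled tool's name, description, and parameter schema rides along on every model request — is easy to see with some back-of-the-envelope Python. The tools here and the roughly-four-characters-per-token rule of thumb are invented for illustration:

```python
def schema_text(tool):
    """Flatten a tool definition roughly the way it lands in the
    model's context window on every single request."""
    params = ", ".join(f"{k}: {v}" for k, v in tool["params"].items())
    return f"{tool['name']}({params}) -- {tool['description']}"

tools = [
    {"name": "search", "description": "Search the web for a query",
     "params": {"q": "string"}},
    {"name": "read_file", "description": "Read a file from disk",
     "params": {"path": "string"}},
] * 25  # pretend we checked every box and enabled 50 tools

payload = "\n".join(schema_text(t) for t in tools)
est_tokens = len(payload) // 4  # rough rule of thumb: ~4 characters per token
print(f"{len(tools)} tools add roughly {est_tokens} tokens to every request")
```

Real tool schemas are far more verbose than these one-liners, so the actual overhead per request is considerably larger — which is exactly why flipping on every checkbox costs you speed and answer quality.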
680 00:36:18,139 --> 00:36:21,929 You know, that's certainly a possibility as well, since this MCP gateway is an 681 00:36:21,929 --> 00:36:26,779 open source container, when you run this for your application, not only can 682 00:36:26,779 --> 00:36:30,229 you say, these are the servers I want, but then you can even further filter 683 00:36:30,239 --> 00:36:33,559 through, these are the tools from those servers that I actually want to expose. 684 00:36:33,559 --> 00:36:36,069 So, for example, I think the GitHub official one is up to 685 00:36:36,069 --> 00:36:38,109 72 tools now or something. 686 00:36:38,349 --> 00:36:39,829 It's a crazy number. 687 00:36:40,179 --> 00:36:41,949 But most of the time, I only need maybe three or four of them. 688 00:36:42,269 --> 00:36:43,559 So, I want to filter that. 689 00:36:43,559 --> 00:36:46,529 And that's why you see Claude and VS Code and many of these others. 690 00:36:46,539 --> 00:36:50,309 Even though you're pulling in these MCP servers, many of those 691 00:36:50,329 --> 00:36:53,249 provide client side functionality to kind of filter that list as well. 692 00:36:53,249 --> 00:36:53,629 Yeah. 693 00:36:54,290 --> 00:36:57,730 I wonder if we get to a state because the MC, this is getting a little bit 694 00:36:57,730 --> 00:37:01,760 meta, but everything when you talk about agentic AI gets meta really quickly. 
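That two-level filtering — enable servers, then optionally allow-list tools within each — can be pictured as a small config evaluation. The server names, tool counts, and config shape below are invented for illustration (the real GitHub server's list is large, much like the 72 mentioned above), not the gateway's actual configuration schema:

```python
# Hypothetical gateway config: per-server enable flag plus an optional
# allow-list of tools; None means "expose everything from this server".
config = {
    "github": {"enabled": True, "tools": ["create_issue", "get_pull_request"]},
    "duckduckgo": {"enabled": True, "tools": None},
    "notion": {"enabled": False, "tools": None},
}

# Pretend catalog of what each server advertises.
catalog = {
    "github": [f"gh_tool_{i}" for i in range(70)] + ["create_issue", "get_pull_request"],
    "duckduckgo": ["search", "fetch_content"],
    "notion": ["search_pages"],
}

def exposed_tools(config, catalog):
    """Flatten the config into the single tool list a client would see."""
    out = []
    for name, srv in config.items():
        if not srv["enabled"]:
            continue  # disabled server contributes nothing
        allow = srv["tools"]
        out += [t for t in catalog[name] if allow is None or t in allow]
    return out

print(exposed_tools(config, catalog))
```

Instead of 73 tool schemas flooding the context window, the client sees four: two hand-picked GitHub tools plus everything from the small DuckDuckGo server.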
695 00:37:02,450 --> 00:37:07,910 So, since the MCP gateway itself is an MCP server, it can rationalize 696 00:37:07,920 --> 00:37:12,700 about itself, I wonder if we get into the pattern of, okay, there's this new 697 00:37:12,700 --> 00:37:17,850 task that I want this agent to do, and the first thing, after it comes up with 698 00:37:17,850 --> 00:37:22,210 its task list, the steps it wants to take, is go through that list and then, 699 00:37:22,690 --> 00:37:28,505 ask the MCP gateway to reconfigure itself on each task and turn on only 700 00:37:28,505 --> 00:37:32,505 the ones that it identified as likely the ones that it needs for that task. 701 00:37:32,815 --> 00:37:36,265 And just dynamically, at any given time, don't have anything 702 00:37:36,265 --> 00:37:37,485 more than five running. 703 00:37:37,865 --> 00:37:38,905 So figure it out. 704 00:37:39,005 --> 00:37:41,555 You can choose whatever five you want, but only have five. 705 00:37:41,618 --> 00:37:45,548 We've done some experiments with that, not quite to that full dynamicness, 706 00:37:45,578 --> 00:37:49,548 but I've even done some ones of a, okay, here's a tool to enable other 707 00:37:49,548 --> 00:37:51,398 tools, is basically what it is. 708 00:37:51,848 --> 00:37:55,368 And, okay, and give me parameters of, okay, do you need, GitHub? 709 00:37:55,378 --> 00:37:56,648 Do you need, Slack? 710 00:37:56,648 --> 00:37:59,768 You know, tell me what it is that you need, and then I'll 711 00:37:59,768 --> 00:38:01,288 enable those specific things. 712 00:38:01,288 --> 00:38:05,378 And then what's cool then is, as part of the MCP protocol, 713 00:38:05,378 --> 00:38:06,488 there's also notifications. 714 00:38:06,488 --> 00:38:10,393 So the MCP server can then notify the client: hey, there's a new 715 00:38:10,393 --> 00:38:13,573 list of tools available, and then the next API request to the model 716 00:38:13,863 --> 00:38:15,213 then has this new set of tools. 
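Michael's "tool to enable other tools" experiment, combined with Nirmal's only-five-at-a-time idea, might look something like this sketch. The class, server names, and cap are hypothetical; the notification method name, `notifications/tools/list_changed`, is the one the MCP protocol defines for exactly this tool-list-changed situation:

```python
class ToolEnabler:
    """Toy 'meta-tool': the agent names the servers it needs for the
    current task, and the gateway swaps its advertised tool list and
    fires an MCP tools-changed notification."""

    def __init__(self, all_tools, limit=5):
        self.all_tools = all_tools  # tool name -> owning server
        self.limit = limit          # hard cap on simultaneously live tools
        self.active = []
        self.notifications = []

    def enable(self, servers):
        wanted = [t for t, srv in self.all_tools.items() if srv in servers]
        self.active = wanted[: self.limit]  # never exceed the cap
        # tell the client its cached tool list is stale
        self.notifications.append("notifications/tools/list_changed")
        return self.active

enabler = ToolEnabler({
    "create_issue": "github", "get_pr": "github",
    "search": "duckduckgo", "send_message": "slack", "read_page": "notion",
})
print(enabler.enable({"github", "slack"}))  # only what this task needs
print(enabler.notifications[-1])
```

On the next request after the notification, the model sees only the freshly enabled subset, which is the dynamic reconfiguration loop being described.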
717 00:38:15,985 --> 00:38:16,055 I 718 00:38:16,055 --> 00:38:16,635 think we're almost 719 00:38:16,763 --> 00:38:18,353 capability is there, but, 720 00:38:19,055 --> 00:38:20,385 I think that's likely the next step. 721 00:38:21,423 --> 00:38:26,793 but it's also kind of like a, yeah, how do you safeguard that? 722 00:38:26,853 --> 00:38:31,398 So it's, Yeah, it's an interesting time period, for sure. 723 00:38:31,975 --> 00:38:35,445 we got an interesting question, is MCP Gateways intent to replace an 724 00:38:35,465 --> 00:38:37,545 API Gateway or in parallel to it? 725 00:38:37,935 --> 00:38:38,645 Great question. 726 00:38:38,645 --> 00:38:39,715 Michael, you want to take that one? 727 00:38:39,858 --> 00:38:40,638 yeah, great question. 728 00:38:40,688 --> 00:38:44,672 I'd say in some ways that there's similar functionality, but they 729 00:38:44,732 --> 00:38:45,932 serve very different purposes. 730 00:38:45,932 --> 00:38:49,822 So an API gateway, I'll just take the most basic example, but I know 731 00:38:49,822 --> 00:38:53,932 there's lots of different ones, An API gateway, single endpoint, and I may 732 00:38:53,932 --> 00:38:55,652 have lots of different microservices. 733 00:38:55,672 --> 00:38:57,092 Let's just pick a catalog. 734 00:38:57,132 --> 00:39:00,452 Okay, so for product related ones, it's going to go to this microservice. 735 00:39:00,642 --> 00:39:02,182 Users, it's going to go to this other one. 736 00:39:02,182 --> 00:39:03,862 Cart, another service, whatever. 737 00:39:04,202 --> 00:39:08,112 And the API gateway is routing all those different requests and rate 738 00:39:08,112 --> 00:39:14,751 limiting, etc. In many ways, like this MCP gateway serves in a similar fashion 739 00:39:15,371 --> 00:39:18,921 in which it's going to be routing to the right MCP server to actually 740 00:39:18,941 --> 00:39:20,521 handle the tool execution and whatnot. 741 00:39:20,931 --> 00:39:24,231 But again, it's only for the MCP protocol. 
742 00:39:24,531 --> 00:39:27,811 So it's not going to be replacing an API gateway because it's not doing 743 00:39:27,811 --> 00:39:33,451 normal API requests, etc. It's only for MCP related workloads and requests. 744 00:39:34,381 --> 00:39:35,931 different protocols at play here. 745 00:39:36,737 --> 00:39:38,607 I think that's probably the best way to describe it. 746 00:39:38,657 --> 00:39:44,357 otherwise, you could also say that MCP and API Gateway are likely 747 00:39:44,357 --> 00:39:46,407 going to be running in parallel. 748 00:39:46,767 --> 00:39:50,757 and so probably what I would see would be, I have an API gateway that routes 749 00:39:50,757 --> 00:39:55,807 a request to an endpoint, and then that particular application, let's just say 750 00:39:55,807 --> 00:40:01,977 it's an agentic application, can then have its own MCP gateway to satisfy whatever 751 00:40:01,977 --> 00:40:03,997 agentic flow it needs to use there. 752 00:40:03,997 --> 00:40:07,467 I wanted to, while you guys were having an awesome conversation, I was trying 753 00:40:07,467 --> 00:40:13,947 to draw up, just a visualization to try to represent, okay, so just so 754 00:40:13,947 --> 00:40:16,907 people understand, because this MCP, we could make a whole show on MCP 755 00:40:16,907 --> 00:40:20,107 tools, honestly, from an infrastructure perspective, how do these things talk? 756 00:40:20,117 --> 00:40:20,927 How do they integrate? 757 00:40:21,347 --> 00:40:23,367 The fact that you're talking about that they're just really adding to 758 00:40:23,367 --> 00:40:26,898 the context window is a fantastic fact that, A lot of people could go 759 00:40:27,088 --> 00:40:31,498 months or years using MCP tools day to day and never know that, right? 760 00:40:31,558 --> 00:40:35,458 a normal non engineer could use MCP tools, not understand how 761 00:40:35,458 --> 00:40:36,478 these things are all working. 
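The contrast Michael draws — path-based routing in an API gateway versus tool-based routing in an MCP gateway — can be laid side by side as two toy route tables. All service and server names here are invented; a real API gateway also handles rate limiting, auth, and more:

```python
# API gateway: routes by URL path to a backend microservice.
api_routes = {
    "/products": "catalog-service",
    "/users": "user-service",
    "/cart": "cart-service",
}

# MCP gateway: routes by tool name to the MCP server that owns it.
mcp_routes = {
    "search": "duckduckgo-mcp",
    "create_issue": "github-mcp",
}

def route_api(path):
    """Pick the backend for an ordinary HTTP request."""
    return api_routes[path]

def route_tool(tool):
    """Same lookup shape, but keyed on the MCP tool being invoked."""
    return mcp_routes[tool]

print(route_api("/cart"), "|", route_tool("search"))
```

Same routing idea, different protocols — which is why the two gateways sit in parallel rather than one replacing the other.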
For those that are into this — playing around with MCP tools elsewhere and understanding a little bit of MCP server functionality, client versus server versus host and all that stuff — before Docker's MCP gateway, you would have an MCP client. Whether it's your IDE, your terminal, or an AI chat desktop app, or whatever you've got, that is acting as an MCP client. Assuming it supports MCP servers, you can add them one at a time. So I would add GitHub's MCP server, then I would add DuckDuckGo's MCP server. I might add Notion's MCP server, since I'm a big Notion fan. And each one of those servers has one to infinity tools, which I look at as like API routes, each with its own very niche purpose.

Depending on the tool — and this is part of the frustration with the ecosystem right now; we're only months into this, but it's amazing that all these tools are starting to support each other — tools have different ways for you to manage this. Some let you disable and enable specific servers. Some let you actually choose the tools individually, which is like choosing API routes. And to me, you're always trying to get down to the smallest number of tools that you need, to prevent confusion.

Because my biggest problem is I enable all the tools, because I get tired of managing them. I just want them all to work when they need to work. So I enable them all, and I end up with 50-plus tools. And then when I'm asking AI to do things, it chooses the wrong tool, because I wasn't precise enough in my ask to trigger the right words that are written in the system prompt of that MCP server.

So actually, maybe an easier setup might be to put another layer on top of the MCP servers, an in-between: I'm connecting the MCP gateway to multiple other MCP servers. So I get — yeah, you're right, I need another layer here, and that layer is itself an MCP server. So there's now this gateway in the middle, and the only negative of this approach right now — because we don't have this futuristic utopia yet — is that to my terminal or my IDE, it all looks like one giant list of tools in one MCP server. Which is just the nature of a proxy, right? Behind one IP address is a whole bunch of websites, and you don't realize it. So the analogy still works, I believe, there.
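That "gateway in the middle" can be expressed as a single Compose service aggregating the servers mentioned above (GitHub, DuckDuckGo, Notion) behind one endpoint. Again, a sketch: the flag names and server identifiers here are assumptions, not confirmed in the episode.

```yaml
# Sketch: the client configures ONE MCP endpoint (the gateway) and sees one
# combined tool list; the gateway proxies to the individual MCP servers.
# Flag names and server identifiers are illustrative.
services:
  mcp-gateway:
    image: docker/mcp-gateway
    command:
      - --transport=streaming
      - --servers=github-official,duckduckgo,notion
    volumes:
      # lets the gateway spin MCP server containers up and down on demand
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - "8811:8811"
```

From the client's point of view there is just one server to register, which is exactly the trade-off discussed next: one endpoint, one big merged tool list.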
But in this case, it's connecting all of them together into one proxy. And the nice thing is, I can see it in the memory usage and the containers. In fact, when Michael was on weeks ago, we saw the MCP gateway spinning up servers dynamically and then shutting them down. You could see the container launch, run — you know, run the curl command or whatever — and then close. And it was so quick we could barely toggle windows fast enough to see the tools launching. I mean, it's beautiful. It's exactly what containers were for. It's ephemeral, it's wonderful.

But if your IDE or your chat desktop or whatever is acting as your MCP client — the agent thing — if that doesn't let you choose individual tools, then this approach is a little hard, because from my IDE the only option is to turn on all of Docker's tools or none of them, I think. This gets us to a conversation of: eventually we will have this. I'm thinking of it as like the plan model before the model — one that will go, okay, you used all these keywords, I'm going to pick out the right tools and hand those off to the next model, which is going to do the actual work. That's probably already here.

Solomon predicted it a month ago.

Yeah, I'm sorry. What?

So that's what Michael and I were talking about while you were drawing this.

Oh, is that what you were just talking about?

That's what Michael and I were talking about. So the gateway itself has its own MCP server that controls itself. And so we're a few months away from exactly what you were just talking about, Bret — because of context windows, because there's too many tools, because of all the challenges you just mentioned. The first step might be the client going to the MCP gateway's own MCP server first and saying, hey, these are the things I'm about to go do. Out of the list, check your MCP gateway and tell me the list of MCP tools that I actually need for that. And then only turn those on for the next task.

Yeah.

And then it'll just repeat that cycle again, and winnow down that list of MCP tools to only the things that are needed for the task at hand. So there's another layer here, which Michael and I were discussing while you were building that beautiful diagram. People are experimenting with that.

All the pieces are in place, but the pattern isn't quite there just yet. It will likely be — I'm pretty sure this is what we're going to be doing pretty soon.

Nobody wants to go manually choose every MCP server that they're going to need before every AI request. It almost feels like it takes away the speed advantage of using the MCP tool to go get the data for me, if I have to do all this work in each tool independently. Because I often will have an IDE accessing AI, acting as the MCP client, and then I'll have a terminal acting as an MCP client. At the same time, I've got ChatGPT desktop running over here, also while VS Code is running. I think a lot of us eventually evolve to the point where we've got two or three tools — multiple IDEs, I should say — all managing MCP tools at the same time. Trying to understand how all this comes together is only interesting right now; in six months, we're not going to want to be messing with all this stuff. We're just going to want this part to work so we can work on building agents.

All right. So Compose — my favorite tool, a lot of people's favorite Docker tool, other than the fact that Docker exists. You announced at WeAreDevelopers that Compose is getting more.
867 00:46:22,138 --> 00:46:24,878 There's functionality in the YAML specifically, where I guess we're talking 868 00:46:24,878 --> 00:46:30,248 about the YAML configuration that drives the Compose command line, that in just 869 00:46:30,248 --> 00:46:34,458 three months ago, you were adding model support, and that was like an early 870 00:46:34,458 --> 00:46:40,078 alpha idea of what if I could specify the model I wanted Docker Model Runner 871 00:46:40,078 --> 00:46:47,588 to run when I launch my app that maybe needs a model, a local model, and I 872 00:46:47,588 --> 00:46:51,893 use the example and I have a An actual demo over on GIST, that people can 873 00:46:51,893 --> 00:46:57,693 pick up that you simply, you, you write your Compose file, you use something 874 00:46:57,693 --> 00:47:00,548 called open web, open, what is it? 875 00:47:00,598 --> 00:47:03,398 Open web, web, open web UI, I think. 876 00:47:03,618 --> 00:47:04,908 Yeah, horrible name. 877 00:47:05,448 --> 00:47:10,308 Extremely generic name for what is a ChatGPT clone, essentially. 878 00:47:10,368 --> 00:47:14,428 the open source variant, which can use any models or more than one model. 879 00:47:14,438 --> 00:47:16,448 It actually lets you choose it in the interface. 880 00:47:16,908 --> 00:47:21,608 And all you need is a little bit of compose file. 881 00:47:23,528 --> 00:47:27,368 So, I created a 29 lines, and it probably needs to be updated because 882 00:47:27,368 --> 00:47:34,978 it's probably outdated, but, 29 lines of Compose that's half comments that allows 883 00:47:34,978 --> 00:47:40,958 me to spin up an open web UI container while also spinning up the models or 884 00:47:40,968 --> 00:47:44,768 making sure, basically, that I have the models locally that I need to run it. 885 00:47:44,998 --> 00:47:49,138 And this gives me a ChatGPT experience without ChatGPT. 886 00:47:49,138 --> 00:47:50,238 Thank you. 887 00:47:50,318 --> 00:47:52,098 And you guys, you enable this. 
888 00:47:52,098 --> 00:47:53,928 Now you're not creating the models. 889 00:47:54,088 --> 00:47:55,938 You're not creating the open web UI. 890 00:47:56,138 --> 00:48:00,358 you're simply providing the glue for it to all come together 891 00:48:00,358 --> 00:48:02,208 in a very easy way locally. 892 00:48:02,548 --> 00:48:02,788 Yeah. 893 00:48:02,788 --> 00:48:06,258 as we agentic apps need three things. 894 00:48:06,258 --> 00:48:09,578 They need models, they need tools, and then the code that glues it all together. 895 00:48:09,978 --> 00:48:13,398 What the Compose file lets us do now is define all three of 896 00:48:13,398 --> 00:48:15,788 those in a single document. 897 00:48:15,978 --> 00:48:21,658 here's the models that my app is going to need for MCP gateway that I'm just going 898 00:48:21,658 --> 00:48:23,378 to run as another containerized service. 899 00:48:23,708 --> 00:48:27,408 And then the code, the custom code, can be really any agentic framework. 900 00:48:27,418 --> 00:48:32,308 this example is, Open web UI, but that Compose snippet what we've done is 901 00:48:32,308 --> 00:48:37,728 we've evolved the specification now models are a top level element in the 902 00:48:37,728 --> 00:48:39,153 Compose file, which is pretty cool. 903 00:48:39,693 --> 00:48:42,453 This just dropped in the last couple of weeks, so this is brand new. 904 00:48:42,819 --> 00:48:43,749 Gotta update my gist. 905 00:48:44,213 --> 00:48:47,713 yep, and so where before, yeah, you had to use this provider 906 00:48:47,713 --> 00:48:49,613 syntax, and that still works. 907 00:48:49,933 --> 00:48:52,123 now it's actually part of the specification. 908 00:48:52,543 --> 00:48:55,203 defining a model, This is going to pull from Docker Hub. 909 00:48:55,213 --> 00:48:58,318 again, You can have your own models and your own container registry. 910 00:48:58,318 --> 00:48:59,608 It's just an OCI artifact. 911 00:48:59,608 --> 00:49:01,058 You can specify that anywhere. 
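A minimal sketch of the top-level models element being described — assuming the current shape of the spec rather than quoting the episode's gist; the model tag, image, and environment-variable names below are illustrative:

```yaml
# Sketch of the evolved spec: `models` is now a top-level element, and a
# service binds to a model by name. Tag and variable names are examples.
models:
  llm:
    model: ai/smollm2        # an OCI artifact — Docker Hub or your own registry

services:
  webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    models:
      llm:
        endpoint_var: OPENAI_API_BASE_URL   # injected: where to reach the model
        model_var: OPENAI_API_MODEL         # injected: which model to request
```

Swap the tag under `models.llm` and the injected variables update automatically; as long as the app reads those variables, everything keeps working.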
Then we've got the services, and then the app itself. What's cool about the model now, with the specification evolution, is you can specify: hey, this is the environment variable I want you to inject into my container — to say what the endpoint is, where's the base URL that I should use to access this model, and then what's the model name as well. So the cool thing is I can go back up to the top-level model specification and swap that out, and the environment variables will be automatically updated. Assuming my app is using those environment variables, everything just works. So again, think of Compose as the glue that's making sure everything is there for the application to actually be able to leverage it.

Yeah. The gateway part here was pretty cool to me — that I can add my MCP tools inside of the YAML file. When I saw that part, I was like, yes. That is my vision, my dream: that I can pass a Compose file to someone else and it'll use their keys. Presuming my team is all using the same provider, we would have the same variables — because it's the OpenAI base URL, the OpenAI model, and then the OpenAI API key, or whatever. If you're going to use the SaaS ones, those are all pretty generic. Even if you're not using OpenAI, they're all pretty generic environment variables. So I guess this would work across teams, or across people.

Well, and that's a good point to call out. One of the things that OpenAI did when they released their APIs was basically say, hey, here's a specification on how to interact with models — that pretty much everybody else has adopted and used. And so Docker Model Runner exposes an OpenAI-compatible API. That's why you see these environment variables using the OpenAI prefix. Because again, I can now use any agentic application that can talk to OpenAI or use the OpenAI libraries, and it's just a configuration change at this point.

All right. Now, the coup de grâce. The pièce de résistance. I can't even do my pretend French. All this stuff has been running locally — when we think of Docker Desktop, we think of everything locally. And then a year or two ago, Docker launched Docker Build Cloud, which was like getting back to Docker's roots,
I almost feel like, of providing more of a SaaS service that essentially is doing something in a container for me. And in that case, it was just building containers using an outsourced BuildKit. So it was better for parallelization and multi-architecture. It was sweet. And I love it for when I need to build enterprise tools or big business things that take 20 minutes to build. None of my little sample examples do that, but anything in the real world takes that long, and you need to build multi-architecture, and generally it's going to be faster in a cloud environment. So you provided that.

Now it feels like you've upgraded. It's beyond just building: it does any image or any container I want to run, any model I want to run — I guess maybe not any, I don't know if there's a limitation there, but bigger models than maybe I can run locally — and then also builds. So it can do building the image, hosting the container, running the model endpoint for a Qwen3 or whatever. I can now do all that in something called Offload. So tell me about that.

Docker Offload — the way I explain it to people is: hey, you need more resources? Burst into the cloud. And so it's basically, I'm going to offload this into the cloud, but everything still works as if it were local. So if I've got bind mounts, okay, great, we're going to automatically set up the synchronized file shares — all that's going to work, using Mutagen and some of the other tools behind the scenes. Port publishing still works as you would expect it to. So again, it gives that local experience, but using remote resources. I'm just offloading this to the cloud, but it's still my environment.

And to make it clear, this is not a production runtime environment. I can't share this environment out, or create a URL and say, hey, check this out, colleague, or whatever. It's still for your personal use. Now, of course, could you make a Cloudflare tunnel and make it production? Sure, but I—

I wouldn't.

You could hack that. But yeah.

Yeah. So what is the intent? What is the use case? What's the big thing I should use Docker Offload for first?

Yeah, so, okay, great — you're wanting to play around with these agentic apps, and we were talking about how not everybody has access to high-end GPUs or, you know, M4 machines and whatnot. Great: with the flip of a switch — and you had it there in Docker Desktop; at the top you just flip a switch — now you're using Offload. And so now you've got access to a pretty significant NVIDIA GPU and additional resources. We see the use case especially for the agentic applications, because that's where those resources are needed.

It does open up some interesting doors. Maybe I'm just on a super lightweight laptop that I'm using for school, and I don't have the ability to even run a lot of my containerized workloads. Great — I can offload that to the cloud. So it does open up some interesting opportunities for use cases beyond agentic apps, but that's kind of where the big focus is right now.
So if you're a Docker insider, or someone who's used Docker a while: it's the docker context command that we've had forever, augmenting or changing the DOCKER_HOST environment variable, which we've had since almost the beginning. It allows you, from your local Docker CLI — and even the GUI works this way too — because I could always set up a remote Docker Engine and then create a new context in the Docker CLI. That would use SSH tunneling to go to that server, and then I could run my Docker CLI locally, my Compose CLI locally, and it would technically be accessing and running against the remote host I had set up. But that was never really a cloud service — no one provides Docker API access as a service that I'm aware of. And the context command, while it's easy to use, and you can actually use it on any command — it's a global flag, so it's docker --context and then run, or whatever the command is — these are all things that existed. But you made it stupid easy. It's just, like you said, a toggle. It's so easy. You just click that button, and then the UI changes, the colors change, so now you know you're remote.

Yeah, and I'll go ahead and say, too: behind the scenes, it's using contexts. It's using those exact things. The tricky part — because I've done similar development environments, where I'm going to work against a Raspberry Pi at home or whatever else it might be — the tricky part is when you want to get into bind mounts, file-sharing kinds of stuff, or port publishing, where I want to be able to access that port from my machine. Automating all those different pieces is not trivial. I mean, it's possible—

With a separate tool, yeah. You've got to download ngrok or something.

And so this brings all that together into a single offering.

That's pretty amazing. There's a lot going on underneath the hood — that switch is hiding a lot of different functionality to make it all very transparent. And this supports builds too, right? So when I toggle this in the UI — or is there a CLI toggle?

Yeah, there is.

Okay. So if I toggle this — yeah, you're saying it's a context change, but it's UI-aware, and it takes in all the other little things that we don't think about until they don't work. And then we're like, oh yeah, it's not really running locally anymore, so now I can't use localhost colon whatever. Well, that all just works. I'm going to show you how this works, and you don't even have to know the rest of the Docker machinery, because you don't really have to know how it works underneath. But if you think it's too much magic, I like to break it down and say it's really just a Docker context. I haven't actually looked at any of the code — I don't really know how it's working — but when I went and checked, it does change the context for me. It actually injects it and then removes it. I did notice, from the CLI, I could change context and it would retain the context; but if I use the toggle button, it deletes the context and then re-adds it. Regardless, in the background it's doing cool things.

I think the immediate request from the captains was: can I do both?
Can I have per-workload or per-service offload, so that just my model is remote — and maybe that really big database server — and then all my apps are local? I don't know why I would care, but that's something that people ask for. I'm not sure that I care to that level; I think I'm fine with either-or. But I can understand that if I'm running some things locally already and I just want to add on something in addition, it would be neat if I could choose for just one service.

Yeah. So as of right now, it is all or nothing — you're doing everything local, or you're doing everything out in the cloud. There's not a way to split that up yet. It's something we've heard from a couple of folks, but again, it's that same thing of: tell us more about the use cases. So if that's a use case you have, feel free to reach out to us and help us better understand why you might want to split runtime hosting.

Split environment. Hybrid environment. That's the correct term.

Why do you say it like that?

And just to be clear, Offload has its own cost. This isn't free forever, for infinity — you can't just take up a bunch of GPUs. I was asking the team a little bit, and without getting too nerdy, it sounds like it isolates things: it spins up a VM, or there are maybe some hot VMs, and I get a dedicated OS, essentially, so that I can get the GPU if I need it. And you kind of get an option of: do I want servers with GPUs or not — am I going to run GPU workloads or not? — and that affects pricing. Do we get anything out of the box with a Docker subscription, or is it a completely separate thing?

So actually it's kind of a private beta, but people can sign up for it and that kind of stuff. Folks will get the 300 GPU minutes, which isn't a ton, but it's enough to experiment and play around with it, and then start giving us feedback, etc.

Yeah, if you spin up the GPU instance and then go to lunch, by the time you get back, you'll probably have used up your free minutes.

It's a long lunch.

Hey, that's my kind of lunch.

But yeah — so we went an hour, and we barely scratched the surface. Did we cover it all? Did we list at least all the announcements of major features and tools? I don't even want to say we've covered all the features, because there's probably some stuff with MCP we missed.
1123 00:59:27,188 --> 00:59:31,338 So you open source MCP gateway, but we should point out you don't actually 1124 00:59:31,338 --> 00:59:36,488 have to know, like you can just use Docker desktop and MCP tools locally. 1125 00:59:37,028 --> 00:59:40,618 But the reason you provide an MCP gateway is open source is so 1126 00:59:40,618 --> 00:59:44,478 we could put it in the compose file and then run it on servers. 1127 00:59:44,528 --> 00:59:45,498 think about it this way. 1128 00:59:45,518 --> 00:59:48,478 the MCP toolkit bundled with Docker Desktop is going to be more for, 1129 00:59:48,728 --> 00:59:52,678 I'm consuming, I'm just wanting to use MCP servers and connect them 1130 00:59:52,678 --> 00:59:54,558 to my other agentic applications. 1131 00:59:54,868 --> 00:59:57,888 And the MCP Gateway is going to be more for, now I want to build 1132 00:59:57,888 --> 01:00:01,948 my own agentic applications and connect those tools to, Those, those 1133 01:00:01,948 --> 01:00:03,508 applications that we're running there. 1134 01:00:03,949 --> 01:00:04,389 Yeah. 1135 01:00:04,729 --> 01:00:07,229 Do you see people using MCP Gateway in production? 1136 01:00:07,249 --> 01:00:10,819 Do you see that as like a. Not that you provide support or anything like 1137 01:00:10,819 --> 01:00:12,499 that, but is it designed so that 1138 01:00:13,499 --> 01:00:16,989 We've got a couple of folks that are already starting to do so. 1139 01:00:17,059 --> 01:00:19,609 stay tuned for some use case stories around that. 1140 01:00:19,659 --> 01:00:19,939 Yeah. 1141 01:00:20,569 --> 01:00:20,919 Awesome. 1142 01:00:21,419 --> 01:00:22,329 well, this is a lot. 1143 01:00:22,359 --> 01:00:28,139 I feel like I need to launch another 10 Docker YouTube uploads just 1144 01:00:28,149 --> 01:00:31,319 to cover each tool specifically, each use case specifically. 1145 01:00:31,609 --> 01:00:35,314 there's a lot here, but this is Amazing work. 
1146 01:00:35,314 --> 01:00:39,884 I mean, I don't know if you have a fleet of AI robots working for you yet, but 1147 01:00:40,118 --> 01:00:44,348 it certainly feels like a lot of different products that are all coming together very 1148 01:00:44,348 --> 01:00:48,678 quickly, that are all somehow related to each other, but also independently usable. 1149 01:00:49,143 --> 01:00:54,103 And having you on the show, as usual, is a great way to break it down 1150 01:00:54,133 --> 01:00:58,233 into the real usable bits: what do we really care about, without all the 1151 01:00:58,233 --> 01:01:03,053 marketing hype, the general AI hype, which is always a problem on the internet. 1152 01:01:03,053 --> 01:01:05,173 But this feels like really useful stuff. 1153 01:01:05,203 --> 01:01:05,793 Um, 1154 01:01:08,058 --> 01:01:09,268 Eivor, another podcast. 1155 01:01:09,268 --> 01:01:10,738 I don't know, Eivor, what's up? 1156 01:01:10,748 --> 01:01:13,038 Are you requesting yet another podcast? 1157 01:01:13,398 --> 01:01:14,028 Um, 1158 01:01:14,432 --> 01:01:15,132 a whole new show 1159 01:01:15,178 --> 01:01:17,668 about Compose provider services? 1160 01:01:18,088 --> 01:01:18,888 Oh, yes. 1161 01:01:18,928 --> 01:01:24,448 Also, you can now run Compose directly from, well, you can use Compose YAML 1162 01:01:24,468 --> 01:01:27,618 directly inside of cloud tools. 1163 01:01:28,173 --> 01:01:29,923 The first one was Google Cloud Run. 1164 01:01:30,253 --> 01:01:33,373 So I could technically spin up Google, which I love, 1165 01:01:33,383 --> 01:01:34,883 Google Cloud Run is fantastic. 1166 01:01:35,203 --> 01:01:38,403 Um, it would be my first choice for running any containers in 1167 01:01:38,403 --> 01:01:39,673 Google if I was using Google. 1168 01:01:40,193 --> 01:01:45,902 Um, and so now they're accepting the Compose YAML spec, essentially, 1169 01:01:46,422 --> 01:01:48,282 inside of their command line.
1170 01:01:48,559 --> 01:01:51,039 So this is like, this feels like the opposite of what Docker used to do. 1171 01:01:51,039 --> 01:01:54,339 Docker used to build cloud functionality into the Docker tooling. 1172 01:01:54,339 --> 01:01:57,769 But now we're saying, hey, let's partner with those tools, those companies, 1173 01:01:57,989 --> 01:02:03,489 and let them build the Compose specification into their tools. 1174 01:02:03,849 --> 01:02:06,419 So we can have basically file reuse. 1175 01:02:06,479 --> 01:02:07,209 YAML reuse. 1176 01:02:07,209 --> 01:02:07,919 Is that right? 1177 01:02:08,549 --> 01:02:08,829 Yeah. 1178 01:02:08,829 --> 01:02:13,249 So this is exactly the first time in which it's not Docker tooling that's providing 1179 01:02:13,249 --> 01:02:15,959 the cloud support; it's the cloud provider. 1180 01:02:16,216 --> 01:02:19,406 They're the ones building the tooling and consuming the Compose file. 1181 01:02:19,590 --> 01:02:21,160 Yeah, it's a big moment. 1182 01:02:21,170 --> 01:02:25,530 And as we work with Google Cloud on this, yeah, you can deploy the 1183 01:02:25,530 --> 01:02:27,020 normal container workloads, etc. 1184 01:02:27,020 --> 01:02:30,890 But they already have support for Model Runner to be able to run the models 1185 01:02:30,890 --> 01:02:34,460 there as well, so it's pretty exciting. And I know the provider services, 1186 01:02:34,520 --> 01:02:37,410 this is how we started with models. 1187 01:02:37,815 --> 01:02:39,555 Having support in Compose, where 1188 01:02:39,555 --> 01:02:44,964 that was another service in which the service wasn't 1189 01:02:45,024 --> 01:02:46,704 backed by a normal container. 1190 01:02:47,174 --> 01:02:47,984 The old method. 1191 01:02:48,034 --> 01:02:52,894 Yes, but what's cool about this is, so first off, these hooks are still in place.
1192 01:02:53,309 --> 01:02:57,259 So a Compose file can basically delegate to this additional 1193 01:02:57,279 --> 01:03:00,899 provider plugin to say, hey, this is how you're going to spin up a model. 1194 01:03:01,159 --> 01:03:04,029 But it starts to open up a whole ecosystem where anybody can make a 1195 01:03:04,029 --> 01:03:08,779 provider. Or, okay, hey, I've got this cloud-based database, just as an example. 1196 01:03:09,069 --> 01:03:12,429 And, okay, now I can still use Compose and it's going to spin up 1197 01:03:12,429 --> 01:03:17,439 my containers, but also create this cloud-based database and then inject 1198 01:03:17,629 --> 01:03:19,669 environment variables into my app. 1199 01:03:19,719 --> 01:03:22,039 Again, it starts to open up some pretty cool extensibility 1200 01:03:22,039 --> 01:03:23,589 capabilities of Compose as well. 1201 01:03:23,639 --> 01:03:28,279 I think we, yeah, we need to bring Michael back just to dig into that, because it's 1202 01:03:28,279 --> 01:03:31,359 essentially like extensions or plugins 1203 01:03:31,807 --> 01:03:32,197 Yeah. 1204 01:03:32,328 --> 01:03:32,478 for 1205 01:03:32,507 --> 01:03:35,467 Yeah, so Compose is about to get a whole lot more love. 1206 01:03:35,477 --> 01:03:38,777 It feels like it's already, I mean, it's been years since we've 1207 01:03:38,777 --> 01:03:41,007 added a root extension, or like a, 1208 01:03:41,158 --> 01:03:41,738 top level, 1209 01:03:41,947 --> 01:03:42,857 top-level build. 1210 01:03:43,497 --> 01:03:46,247 It's not every day that Docker decides there's a whole new 1211 01:03:46,807 --> 01:03:48,617 type of thing that we deploy. 1212 01:03:48,617 --> 01:03:52,127 Now we have models; we'll see if providers someday become something. 1213 01:03:52,427 --> 01:03:52,777 That'll be cool. 1214 01:03:53,237 --> 01:03:56,687 And this is all due to the Compose spec, which now allows other 1215 01:03:56,687 --> 01:03:59,877 tools to use the Compose standard.
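The provider hook described in this exchange can be sketched as a Compose file where one service isn't backed by a normal container. The `provider` element with `type: model` is the Docker Model Runner integration mentioned in the episode; the model name, service names, and the exact environment variables injected are illustrative assumptions here.

```yaml
services:
  llm:
    # not a normal container: Compose delegates this service to a
    # provider plugin (here, Docker Model Runner)
    provider:
      type: model
      options:
        model: ai/smollm2        # example model name from the Docker Hub catalog

  app:
    build: .
    depends_on:
      - llm
    # the provider injects connection details into dependent services as
    # environment variables (the exact variable names come from the provider),
    # so the app learns the model's endpoint URL and model name at startup
```

The same mechanism is what would let a hypothetical cloud-database provider create the database and inject its credentials into `app` — the extensibility point the conversation is getting at.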
1216 01:03:59,887 --> 01:04:02,157 And that's just great for everybody, because everybody uses Compose. 1217 01:04:02,167 --> 01:04:05,117 It's like the most universal YAML out there, in my opinion. 1218 01:04:05,527 --> 01:04:05,997 Great. 1219 01:04:06,087 --> 01:04:07,677 Well, I think we've covered it all. 1220 01:04:07,917 --> 01:04:10,907 Nirmal and I need another month to digest all this, and 1221 01:04:10,907 --> 01:04:11,997 then we'll invite you back on. 1222 01:04:12,367 --> 01:04:12,627 Do it. 1223 01:04:13,177 --> 01:04:16,487 But yeah, we've checked the box of everything Docker for the first 1224 01:04:16,487 --> 01:04:18,917 half of the year. Stay tuned for the second half of the year. 1225 01:04:19,137 --> 01:04:22,357 I actually sincerely hope you don't have as busy of a second half, 1226 01:04:22,387 --> 01:04:25,027 just because these are a lot of videos I've got to make. You're 1227 01:04:25,027 --> 01:04:26,727 putting a lot of work into my inbox. 1228 01:04:26,787 --> 01:04:28,607 We're helping you have content to create. 1229 01:04:28,903 --> 01:04:32,563 I know, yeah, there's no shortage of content to create right now with Docker. 1230 01:04:32,903 --> 01:04:34,813 I am very excited to play with all these things. 1231 01:04:34,813 --> 01:04:37,253 I sound excited because I am excited. 1232 01:04:37,253 --> 01:04:42,123 This is real stuff that I think is beneficial and largely free. Largely, 1233 01:04:42,438 --> 01:04:45,678 like almost all of this stuff is really just extra functionality added to 1234 01:04:45,678 --> 01:04:49,218 the tooling that already exists, without adding a whole bunch of SaaS 1235 01:04:49,218 --> 01:04:50,808 services we have to buy on top of it. 1236 01:04:51,168 --> 01:04:52,728 Yeah, so congrats. 1237 01:04:53,728 --> 01:04:55,958 People can find out more at docker.com.
1238 01:04:56,298 --> 01:05:00,368 docs.docker.com. Docker's got videos on YouTube now; they're putting up 1239 01:05:00,368 --> 01:05:02,618 YouTube videos, so check that out. 1240 01:05:02,768 --> 01:05:05,443 I saw Michael putting up some videos recently on LinkedIn. 1241 01:05:06,443 --> 01:05:07,203 It's all over the place. 1242 01:05:07,203 --> 01:05:10,063 You can follow Michael Irwin on LinkedIn. 1243 01:05:10,113 --> 01:05:11,013 He's on BlueSky. 1244 01:05:11,013 --> 01:05:11,943 I think you're on BlueSky. 1245 01:05:12,516 --> 01:05:13,446 I think you're on BlueSky. 1246 01:05:14,256 --> 01:05:16,416 Um, or, or, wherever, yeah, 1247 01:05:16,441 --> 01:05:17,451 figure out where I'm hanging out. 1248 01:05:17,883 --> 01:05:18,753 Thanks so much for being here. 1249 01:05:18,934 --> 01:05:19,474 Thank you, Michael. 1250 01:05:19,474 --> 01:05:21,694 Thank you, Nirmal, for joining and staying so long. 1251 01:05:22,087 --> 01:05:22,867 I'll see you in the next one.