The Coding Model Wars: Claude Opus 4.6 vs GPT-5.3 Codex

Episode Transcript

1 00:00:00,020 --> 00:00:05,840 Ejaaz: 48 hours ago, Anthropic dropped Claude Opus 4.6, the world's most powerful AI model. 2 00:00:06,140 --> 00:00:10,760 Ejaaz: And literally 20 minutes later, OpenAI dropped Codex 5.3, which is not only 3 00:00:10,760 --> 00:00:13,140 Ejaaz: better, but also built itself. 4 00:00:13,480 --> 00:00:17,740 Ejaaz: Now, to say both of these models are powerful would literally be the understatement of the century. 5 00:00:18,040 --> 00:00:20,880 Ejaaz: By the time I'd eaten breakfast yesterday, one of the models had discovered 6 00:00:20,880 --> 00:00:23,840 Ejaaz: 500 security flaws, which no one else had discovered before. 7 00:00:24,240 --> 00:00:28,080 Ejaaz: And by lunchtime, a bunch of software stocks were down hundreds of billions 8 00:00:28,080 --> 00:00:31,500 Ejaaz: of dollars out of fear that these models would replace entire teams. 9 00:00:31,800 --> 00:00:35,680 Ejaaz: And it's actually already happened. These models can replace a team of 50 software 10 00:00:35,680 --> 00:00:38,740 Ejaaz: engineers, rebuild Pokemon from scratch, and so much more. 11 00:00:38,840 --> 00:00:42,180 Ejaaz: And in this episode, we're going to be doing a live demo side by side to show 12 00:00:42,180 --> 00:00:43,900 Ejaaz: you which model is the best. 13 00:00:44,100 --> 00:00:47,040 Josh: Yeah, this is pretty cool. I wanted to spend a lot of time this episode kind 14 00:00:47,040 --> 00:00:50,560 Josh: of introducing people to these models, what they could do, how they work through 15 00:00:50,560 --> 00:00:52,700 Josh: demos that we're going to perform ourselves. 
16 00:00:52,980 --> 00:00:55,940 Josh: These are definitely two frontier models but i think more importantly they're 17 00:00:55,940 --> 00:00:58,760 Josh: frontier coding models and when people hear that i 18 00:00:58,760 --> 00:01:01,900 Josh: think a lot of them get turned away because it seems like this complicated 19 00:01:01,900 --> 00:01:04,800 Josh: thing like you need to be a developer in order to use them and we 20 00:01:04,800 --> 00:01:08,240 Josh: are here to tell you that is not the case as from 21 00:01:08,240 --> 00:01:11,000 Josh: one non-technical person to another i fed this 22 00:01:11,000 --> 00:01:14,020 Josh: model a prompt i fed it some assets and 23 00:01:14,020 --> 00:01:16,880 Josh: then i pressed play and what i got is a 24 00:01:16,880 --> 00:01:19,740 Josh: side-scrolling game which was exactly what i asked for so on the screen now 25 00:01:19,740 --> 00:01:23,960 Josh: you're seeing the one shot prompt that i fed this model to ask to create a side 26 00:01:23,960 --> 00:01:28,200 Josh: scroller that was like mario that we can actually play so it has coins and i 27 00:01:28,200 --> 00:01:31,560 Josh: don't think the gravity quite works what you're saying is that it understands 28 00:01:31,560 --> 00:01:37,460 Josh: physics it is able to generate graphics and it plays like a pretty solid side 29 00:01:37,460 --> 00:01:39,560 Josh: scroller and i created this in five minutes, 30 00:01:40,310 --> 00:01:43,070 Josh: with one prompt and it actually works what. 
31 00:01:43,070 --> 00:01:44,710 Ejaaz: Was the prompt that you used josh 32 00:01:44,710 --> 00:01:47,430 Josh: Yeah so i'll pause playing this game to 33 00:01:47,430 --> 00:01:50,050 Josh: actually show you the the prompt it was very simple it was this 34 00:01:50,050 --> 00:01:52,850 Josh: one paragraph i want you to make a game you can 35 00:01:52,850 --> 00:01:56,170 Josh: use python or c++ whatever you find the most convenient a 2d 36 00:01:56,170 --> 00:01:58,790 Josh: platformer that closely resembles super mario use the 37 00:01:58,790 --> 00:02:01,430 Josh: attached background image and sprites found in the 38 00:02:01,430 --> 00:02:04,490 Josh: asset folder take into account that the sprites don't come with transparent background 39 00:02:04,490 --> 00:02:07,210 Josh: but pink ones so you need to filter the background and for those who are 40 00:02:07,210 --> 00:02:10,290 Josh: watching you can actually see the sprites on my screen they were 41 00:02:10,290 --> 00:02:13,150 Josh: just a series of assets that there was no context given as 42 00:02:13,150 --> 00:02:16,090 Josh: to what each one of them was but the model reasoned through it it removed 43 00:02:16,090 --> 00:02:19,150 Josh: the background and it actually generated a pretty good representation of 44 00:02:19,150 --> 00:02:22,270 Josh: that now this was built one shot on codex which 45 00:02:22,270 --> 00:02:25,050 Josh: is the new open ai mac application that just released this 46 00:02:25,050 --> 00:02:28,410 Josh: week and i wanted to compare it to claude 47 00:02:28,410 --> 00:02:31,950 Josh: so i have another instance here on the screen with claude this is using opus 48 00:02:31,950 --> 00:02:35,490 Josh: 4.6 the newest frontier model that they just released this week and i want to 49 00:02:35,490 --> 00:02:39,310 Josh: do an exact one-to-one comparison so i'm gonna launch the same exact prompt 50 00:02:39,310 --> 00:02:42,390 Josh: we're gonna have that cook on codex or we're gonna have 
that cook in claude 51 00:02:42,390 --> 00:02:46,350 Josh: code and in the meantime you just maybe we can kind of talk about more of what 52 00:02:46,350 --> 00:02:48,110 Josh: these models do and how they work well. 53 00:02:48,110 --> 00:02:53,650 Ejaaz: Before we do that actually um as you set this game up i ran it on claude opus 54 00:02:53,650 --> 00:02:57,050 Ejaaz: 4.6 as well but with a slight twist okay 55 00:02:57,050 --> 00:02:58,790 Josh: Let's see your output what do we have okay. 56 00:02:58,790 --> 00:03:01,650 Ejaaz: Uh i don't know if you can see my screen 57 00:03:01,650 --> 00:03:06,190 Ejaaz: but it is the exact game that you just created but i don't know if those characters 58 00:03:06,190 --> 00:03:10,490 Ejaaz: look uh kind of familiar to you we have the uh hero protagonist character which 59 00:03:10,490 --> 00:03:15,750 Ejaaz: is uh my beautiful face and my beautiful person ejaz um and we have uh who's 60 00:03:15,750 --> 00:03:18,930 Ejaaz: this enemy over here that looks a lot like the bear guy 61 00:03:20,490 --> 00:03:26,930 Ejaaz: and listen we can double jump here josh and i think yep i can crush you but every time i mean this 62 00:03:27,440 --> 00:03:33,840 Ejaaz: Kind of jokes aside, this is insane. This took me like around three minutes to build end-to-end. 63 00:03:34,020 --> 00:03:36,080 Ejaaz: I used the exact same prompt that you gave me. 64 00:03:36,380 --> 00:03:39,700 Ejaaz: And we didn't have sprites ready-made of ourselves, right? 65 00:03:39,800 --> 00:03:43,340 Ejaaz: We didn't have like cartoon images of ourselves. So I uploaded an image that 66 00:03:43,340 --> 00:03:46,520 Ejaaz: we had taken, I don't know, like six months ago and said, hey, 67 00:03:46,640 --> 00:03:48,540 Ejaaz: can you make game avatars out of this? 68 00:03:48,680 --> 00:03:51,900 Ejaaz: It did it in 20 seconds. 
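[Editor's aside] The prompt quoted above notes that the sprites come with pink backgrounds instead of transparent ones, so the background has to be filtered out. A minimal sketch of that idea (color keying) follows. It is not the code either model produced: the exact pink value, the function name apply_color_key, and the representation of pixels as plain tuples are all assumptions made for illustration.

```python
# Hypothetical sketch of color keying. Pixels are plain (R, G, B) tuples,
# and the exact "pink" key color is an assumption; the transcript does not
# say which shade the sprites used.
KEY_COLOR = (255, 0, 255)


def apply_color_key(pixels, key=KEY_COLOR, tolerance=0):
    """Return RGBA pixels with alpha 0 wherever a pixel matches the key color."""
    result = []
    for r, g, b in pixels:
        # A pixel counts as background if every channel is within
        # `tolerance` of the corresponding key channel.
        is_key = all(abs(c - k) <= tolerance for c, k in zip((r, g, b), key))
        result.append((r, g, b, 0 if is_key else 255))
    return result


# One row of a made-up sprite: pink, a red pixel, pink.
sprite_row = [(255, 0, 255), (200, 30, 30), (255, 0, 255)]
print(apply_color_key(sprite_row))
```

A nonzero tolerance is useful when the background is not one exact shade, for example after lossy compression.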
And then I said, could you add these to the game and 69 00:03:51,900 --> 00:03:57,100 Ejaaz: replace the enemy with Josh and the protagonist with Ejaaz? And it did it in a minute. 70 00:03:57,440 --> 00:04:00,180 Josh: So here we go. That's pretty amazing. And these are really, these are just using 71 00:04:00,180 --> 00:04:03,300 Josh: standard desktop applications. So what you're using right here, 72 00:04:03,460 --> 00:04:05,280 Josh: this was done in Claude Code, right? 73 00:04:05,680 --> 00:04:09,900 Josh: You just went onto Claude, the Mac app. You downloaded it. You put in the prompt. 74 00:04:10,180 --> 00:04:13,720 Josh: You shared some assets. And now it built this amazing game in one single prompt. 75 00:04:13,840 --> 00:04:17,040 Josh: And we're actually going to experiment further in this episode where we're going 76 00:04:17,040 --> 00:04:21,000 Josh: to create a trading room that does actual real-time stock analysis. 77 00:04:21,400 --> 00:04:24,840 Josh: So as I'm curating the prompts and as we're getting ready for that second demo, 78 00:04:24,980 --> 00:04:27,360 Josh: maybe we could walk through what makes these models so exceptional. 79 00:04:27,440 --> 00:04:32,100 Ejaaz: Yeah, well, you might actually notice the first difference on screen right now. 80 00:04:32,220 --> 00:04:36,120 Ejaaz: If you notice, if you look closely, my avatar is kind of glitching out, right? 81 00:04:36,440 --> 00:04:39,680 Ejaaz: And if you compare it to your Codex game that you just coded up, 82 00:04:39,840 --> 00:04:41,740 Ejaaz: there are no glitches. It runs super smoothly. 83 00:04:41,980 --> 00:04:47,740 Ejaaz: And the main takeaway here is Codex 5.3 is a superior coding model to Anthropic's. 84 00:04:48,020 --> 00:04:51,360 Ejaaz: And that's a sentence I never thought I would say, at least for the next couple 85 00:04:51,360 --> 00:04:55,780 Ejaaz: of years, because Anthropic has held that prestige and title for so long. 
86 00:04:55,900 --> 00:04:59,080 Ejaaz: But since Code Red was initiated at OpenAI around three months ago, 87 00:04:59,500 --> 00:05:04,180 Ejaaz: Sam has devoted pretty much all his resources towards building the best coding model. 88 00:05:04,320 --> 00:05:08,280 Ejaaz: And the benchmarks don't lie. It is a full 12 points on the software engineering 89 00:05:08,280 --> 00:05:10,980 Ejaaz: benchmark ahead of Claude Opus 4.6. 90 00:05:11,180 --> 00:05:12,460 Josh: That's a pretty significant difference. 91 00:05:12,700 --> 00:05:17,260 Ejaaz: So I've actually pulled up a more general comparison between the two models here. 92 00:05:17,400 --> 00:05:20,600 Ejaaz: And it summarizes it really well. So if we look at Claude's model, 93 00:05:20,740 --> 00:05:22,560 Ejaaz: Opus 4.6, what's good about it? 94 00:05:22,840 --> 00:05:26,120 Ejaaz: Well, they've 5x'd the context window. 95 00:05:26,140 --> 00:05:30,660 Ejaaz: So it's gone up to a million tokens that you can put in 96 00:05:30,660 --> 00:05:33,580 Ejaaz: a single prompt, which if you want to understand how powerful this is, 97 00:05:33,800 --> 00:05:36,540 Ejaaz: you can just put way more information into your initial prompt. 98 00:05:36,660 --> 00:05:40,760 Ejaaz: It has much better context and memory. So you can end up cooking up much better 99 00:05:40,760 --> 00:05:44,640 Ejaaz: products overall, which is very, very impressive and important to have. 100 00:05:44,940 --> 00:05:49,640 Ejaaz: Number two, I would think about this as an orchestration model. 101 00:05:49,780 --> 00:05:54,480 Ejaaz: So if you look at specific benchmarks, it has beaten OpenAI at GDPval. 102 00:05:54,620 --> 00:05:58,640 Ejaaz: GDPval is a benchmark where they go out and they test a model's performance 103 00:05:58,640 --> 00:06:04,400 Ejaaz: at a really complex task versus a professional human that would normally do that task. 
104 00:06:04,660 --> 00:06:08,120 Ejaaz: And the decision is, would you use the AI model or would you use the human? 105 00:06:08,300 --> 00:06:12,620 Ejaaz: And in this case, you would choose Claude 4.6 over humans way more than you 106 00:06:12,620 --> 00:06:16,260 Ejaaz: would choose OpenAI's latest model. So that's a really important thing. 107 00:06:16,400 --> 00:06:21,300 Ejaaz: And the point around Claude's latest model is that it doesn't code as well as 108 00:06:21,300 --> 00:06:26,960 Ejaaz: Codex, but it can orchestrate a bunch of agents and overall activity better than OpenAI. 109 00:06:27,200 --> 00:06:30,860 Ejaaz: Now, if you look at Codex and OpenAI's new models specifically, 110 00:06:31,390 --> 00:06:36,050 Ejaaz: it wins on the software engineering. It is simply a better software engineer 111 00:06:36,050 --> 00:06:40,410 Ejaaz: than Claude is, which is a massive turnaround and a testament 112 00:06:40,410 --> 00:06:44,490 Ejaaz: to how much resources and fine-tuning OpenAI has put into it. 113 00:06:44,750 --> 00:06:49,690 Josh: And to the note on the quality of the models here, my prompt is done in Claude 114 00:06:49,690 --> 00:06:52,530 Josh: Code, the same one that we used in Codex. And I'm going to run it 115 00:06:52,530 --> 00:06:53,470 Josh: here for the first time now. 116 00:06:53,610 --> 00:06:56,670 Josh: You can see on screen and we'll see what it looks like. 117 00:06:56,810 --> 00:06:59,550 Josh: So underneath, we have our Codex version, which looks beautiful. 
118 00:06:59,550 --> 00:07:03,870 Josh: On top we have our brand new version that was just made by opus now i haven't 119 00:07:03,870 --> 00:07:06,170 Josh: tried this yet so we're going to see what happens when i press space to start, 120 00:07:09,590 --> 00:07:12,310 Josh: so it looks like opus has failed to create a 121 00:07:12,310 --> 00:07:18,310 Josh: floor so i am just falling through the floor until the game ends um okay so 122 00:07:18,310 --> 00:07:21,510 Josh: just based on this one demo alone this is a fairly significant difference where 123 00:07:21,510 --> 00:07:25,930 Josh: gpt's codex has created a beautiful side scroller it doesn't have gravity but 124 00:07:25,930 --> 00:07:28,950 Josh: i could just ask it to or it has gravity it's a little too much i could ask 125 00:07:28,950 --> 00:07:30,930 Josh: it to lower it opus doesn't even work at all, 126 00:07:31,550 --> 00:07:34,650 Josh: And again, the test was just a one-shot prompt. So I'm going to get back to 127 00:07:34,650 --> 00:07:38,410 Josh: work prompting it again to build this new application, the trading application. 128 00:07:38,610 --> 00:07:42,150 Josh: We'll follow up with that. But I think that's a funny kind of demo just to showcase 129 00:07:42,150 --> 00:07:46,450 Josh: that one actually is kind of superior in the other in this one use case, at least. 130 00:07:46,950 --> 00:07:53,130 Ejaaz: Yeah, I mean, you said it pretty clearly, which is Codex is the best coding AI model. 131 00:07:53,290 --> 00:07:57,130 Ejaaz: And I have to like, I can't emphasize that enough because OpenAI for a long 132 00:07:57,130 --> 00:08:01,690 Ejaaz: time was behind Anthropic and by a massive margin. and in some way, 133 00:08:01,770 --> 00:08:03,430 Ejaaz: shape, or form, they've been able to catch up. 134 00:08:03,690 --> 00:08:09,970 Ejaaz: Now, what's interesting here is both companies have focused on each other's goals. 
135 00:08:10,310 --> 00:08:15,350 Ejaaz: So when Anthropic was typically meant to be the leading frontier model in coding, 136 00:08:15,890 --> 00:08:18,730 Ejaaz: it now has decided to focus on what OpenAI was really good at, 137 00:08:18,870 --> 00:08:23,390 Ejaaz: which is overall orchestration and being a better generalized model, right? 138 00:08:23,610 --> 00:08:25,610 Josh: They're taking each other's lunch. Yeah, exactly. 139 00:08:25,790 --> 00:08:27,530 Ejaaz: OpenAI has decided to eat Anthropic's 140 00:08:27,530 --> 00:08:30,490 Ejaaz: lunch and say, okay, we've got the generalized stuff sorted out. 141 00:08:30,690 --> 00:08:34,490 Ejaaz: Let's try and figure out the coding specific niche, highly defined, 142 00:08:34,630 --> 00:08:37,610 Ejaaz: professionalized functions. And it's produced the best coding model. 143 00:08:37,770 --> 00:08:41,650 Ejaaz: So it's kind of a weird win-win for both labs. 144 00:08:41,710 --> 00:08:46,090 Ejaaz: And what's awesome about this is they both now have really well-rounded, 145 00:08:46,310 --> 00:08:48,630 Ejaaz: but also very specialized models. 146 00:08:48,830 --> 00:08:53,470 Ejaaz: And the reason why this is important is, and this is like kind of maybe my hot take, 147 00:08:54,140 --> 00:08:59,120 Ejaaz: I don't think the coding models matter, Josh. I actually don't think the generalized models matter either. 148 00:08:59,400 --> 00:09:03,200 Ejaaz: I think they're both going off to something much bigger, which is creating the 149 00:09:03,200 --> 00:09:06,060 Ejaaz: operating system for the future of work. 150 00:09:06,200 --> 00:09:10,080 Ejaaz: They know that AI models and AI agents are gonna automate a ton of different 151 00:09:10,080 --> 00:09:14,120 Ejaaz: industries and the industries are only gonna pick you if you can do both generalized 152 00:09:14,120 --> 00:09:17,020 Ejaaz: work and hyper-specific work really well. 
153 00:09:17,160 --> 00:09:20,340 Ejaaz: That is coding and orchestration and managing your data. 154 00:09:20,680 --> 00:09:24,020 Ejaaz: And now we have two amazing models dropped within 20 minutes of each other 155 00:09:24,140 --> 00:09:28,560 Ejaaz: that do exactly that to the highest performance metric that we've ever seen before. 156 00:09:29,180 --> 00:09:32,380 Josh: They're pretty exceptional. So now for this next demo, I have it queued up here. 157 00:09:32,720 --> 00:09:36,440 Josh: What we're going to do is, what I did is ask the model itself to build me a 158 00:09:36,440 --> 00:09:40,440 Josh: prompt for this. So I wanted it to create me an AI stock portfolio war room. 159 00:09:40,620 --> 00:09:44,700 Josh: And I asked, hey, I want to create this, create me a fully fleshed out prompt 160 00:09:44,700 --> 00:09:48,240 Josh: that kind of should solve this problem with one shot. 161 00:09:48,320 --> 00:09:52,200 Josh: So what I did is I loaded it up here in our Claude Code app. 
162 00:09:52,240 --> 00:09:55,020 Josh: And then I also loaded it up into the Codex app. I created its own 163 00:09:55,020 --> 00:09:57,660 Josh: project folder, and now I'm going to hit send. So both of 164 00:09:57,660 --> 00:10:00,520 Josh: these things are thinking in real time. We will check back 165 00:10:00,520 --> 00:10:03,340 Josh: in once their outputs are done and we'll compare again the second version, 166 00:10:03,340 --> 00:10:06,460 Josh: which is more of a robust one. I mean, you'll see, uh, on 167 00:10:06,460 --> 00:10:10,220 Josh: the Claude screen it has this whole list of to-dos that it wants to do. It has 168 00:10:10,220 --> 00:10:14,260 Josh: an entire plan. There's nine different panels that it's going to build. It's going 169 00:10:14,260 --> 00:10:18,200 Josh: to do risk analysis matrix and portfolio action bars and all this stuff. So we'll 170 00:10:18,200 --> 00:10:21,260 Josh: let that cook, and let's get back to what separates these, what people have been 171 00:10:21,260 --> 00:10:24,420 Josh: freaking out about on the internet more as these things get going. 172 00:10:24,420 --> 00:10:27,320 Ejaaz: Could I take three minutes, show you some wild demos? 173 00:10:27,320 --> 00:10:30,220 Josh: Yeah, let's see what the internet's been demoing while we wait for ours to cook. 174 00:10:30,220 --> 00:10:35,960 Ejaaz: Okay, cool. Like, listen, our 2D Mario-inspired game was cool, but imagine if I told you 175 00:10:35,960 --> 00:10:41,300 Ejaaz: you could recreate the entire Pokemon game, including levels, cities, characters, 176 00:10:41,300 --> 00:10:45,740 Ejaaz: and creatures that you fight, from scratch in about an hour and 30 minutes. 177 00:10:46,380 --> 00:10:47,940 Ejaaz: That's pretty impressive. That's what we're looking at right now. 178 00:10:47,940 --> 00:10:49,540 Josh: Wow, it even has the fighting. 179 00:10:49,980 --> 00:10:53,320 Ejaaz: Yeah, yeah, yeah. And buttons and the multimodal gameplay. 
180 00:10:53,620 --> 00:10:56,840 Ejaaz: And obviously this looks like it's been made by a child image wise, 181 00:10:57,040 --> 00:10:59,860 Ejaaz: but it's probably going to take you, what, another couple of hours to make a 182 00:10:59,860 --> 00:11:03,400 Ejaaz: really high fidelity game that you could probably run on your Nintendo Switch or whatever. 183 00:11:03,820 --> 00:11:06,760 Ejaaz: It is just so impressive that we can do these things. 184 00:11:07,160 --> 00:11:10,400 Ejaaz: Anyone can do these things with no previous background. Just upload a few images 185 00:11:10,400 --> 00:11:14,580 Ejaaz: or generate a few images and you can create childhood nostalgic games that are 186 00:11:14,580 --> 00:11:17,560 Ejaaz: worth billions of dollars, which is just super cool to see. 187 00:11:17,740 --> 00:11:21,620 Josh: Yeah, one of the cool things that I think it's really important to note is how approachable this is. 188 00:11:21,720 --> 00:11:25,180 Josh: Like for the recent example that we're having run right now on my screen, 189 00:11:25,640 --> 00:11:29,420 Josh: all I did was tell it what I wanted and ask it to develop the prompt with me. 190 00:11:29,540 --> 00:11:32,320 Josh: So even if it feels overwhelming, like you don't really know how to code, 191 00:11:32,380 --> 00:11:35,720 Josh: you don't know how to prompt things, you can actually just ask the model to 192 00:11:35,720 --> 00:11:38,480 Josh: help you generate the prompt, help explain to you how it works. 193 00:11:38,600 --> 00:11:41,720 Josh: And it's a really easy way to build basically anything you can imagine. 194 00:11:41,920 --> 00:11:45,040 Josh: It's not just games. It's productivity tools. It's CRM tracking. 195 00:11:45,040 --> 00:11:48,060 Josh: It's whatever you want it to be so i think that's really interesting but it 196 00:11:48,060 --> 00:11:52,260 Josh: also goes much more technical right i saw another crazy example with the compiler. 
197 00:11:52,260 --> 00:11:55,260 Ejaaz: Okay so for for the tech nerds 198 00:11:55,260 --> 00:11:57,880 Ejaaz: out there that's been a lot of time coding you are going to 199 00:11:57,880 --> 00:12:04,700 Ejaaz: be wowed by this um for one of their uh flagship demos for uh opus 4.6 the anthropic 200 00:12:04,700 --> 00:12:11,600 Ejaaz: team decided to task the model with building a c compiler which is an incredibly 201 00:12:11,600 --> 00:12:16,780 Ejaaz: complicated execution tool that is required to code up some of the most craziest types of apps. 202 00:12:17,300 --> 00:12:20,980 Ejaaz: And they just walked away. And they just kind of like looked at it, 203 00:12:21,320 --> 00:12:23,460 Ejaaz: monitored it, made sure that it wasn't going awry. 204 00:12:23,780 --> 00:12:26,460 Ejaaz: And in two weeks, let me emphasize that, 205 00:12:26,930 --> 00:12:31,970 Ejaaz: Two whole weeks, 14 days, it coded nonstop and built this compiler. 206 00:12:32,370 --> 00:12:36,330 Ejaaz: Now, you might think two weeks is quite a long time. I want my thing done in an hour and a half. 207 00:12:36,490 --> 00:12:40,630 Ejaaz: Well, let me hearken back to history where previously, if you wanted to create 208 00:12:40,630 --> 00:12:44,550 Ejaaz: something like this, in today's world, it would take a team of around 50 or 209 00:12:44,550 --> 00:12:49,230 Ejaaz: so humans, and it would take them a few months to build from scratch. That's today. 210 00:12:49,510 --> 00:12:54,730 Ejaaz: But back in the day, it would technically have taken them around a decade to 211 00:12:54,730 --> 00:12:56,790 Ejaaz: build and like thousands of people. 212 00:12:56,930 --> 00:13:01,770 Ejaaz: So we have just kind of condensed the timeline to create really complicated 213 00:13:01,770 --> 00:13:05,490 Ejaaz: tools in a matter of hours or weeks in this case. 
214 00:13:05,810 --> 00:13:09,610 Ejaaz: Now, the second thing I want to point out is the fact that these models can 215 00:13:09,610 --> 00:13:13,250 Ejaaz: go untouched for two weeks is just insane. 216 00:13:13,590 --> 00:13:17,330 Ejaaz: There was another stat that was released today by OpenAI with, 217 00:13:17,410 --> 00:13:21,650 Ejaaz: sorry, yesterday with OpenAI is 5.2, I think, 5.2 high, I believe, 218 00:13:21,850 --> 00:13:27,970 Ejaaz: where it can go pretty much 50% hit rate for 6.6 hours. a time horizon. 219 00:13:28,230 --> 00:13:31,250 Ejaaz: So that means if you gave it any kind of complicated coding task, 220 00:13:31,610 --> 00:13:35,930 Ejaaz: 50% of the time in 6.6 hours, it would get that done, completely done. 221 00:13:36,010 --> 00:13:39,750 Ejaaz: And it would nail it 50% of the time, which is just such an impressive track 222 00:13:39,750 --> 00:13:41,290 Ejaaz: record when you look back a year. 223 00:13:41,410 --> 00:13:45,210 Ejaaz: And that time was, what was it like 30 minutes, maybe an hour. 224 00:13:45,570 --> 00:13:49,150 Ejaaz: So every iteration, we see this thing double. It's just so insane. 
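[Editor's aside] The time-horizon figures quoted above can be turned into a quick back-of-the-envelope check. The sketch below uses only the numbers as spoken (6.6 hours now, versus "30 minutes, maybe an hour" a year earlier), which are the speaker's recollection and are not verified here. The helper name doublings and the 12-month gap are assumptions for illustration.

```python
import math


def doublings(start_hours, end_hours):
    """Number of times start_hours must double to reach end_hours."""
    return math.log2(end_hours / start_hours)


current_hours = 6.6  # figure quoted in the episode, not independently verified
months_elapsed = 12  # "when you look back a year", taken at face value

for label, earlier_hours in (("30 minutes", 0.5), ("1 hour", 1.0)):
    n = doublings(earlier_hours, current_hours)
    print(
        f"from {label}: {current_hours / earlier_hours:.1f}x longer, "
        f"about {n:.1f} doublings, one every {months_elapsed / n:.1f} months"
    )
```

Under these assumptions the horizon doubled somewhere between roughly 2.7 and 3.7 times over the year, that is, about once every three to four and a half months.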
225 00:13:49,630 --> 00:13:52,810 Josh: Yeah, it's really, it's unbelievable and almost like intimidating how 226 00:13:52,810 --> 00:13:55,770 Josh: capable and competent it is, even for someone who 227 00:13:55,770 --> 00:13:58,730 Josh: is a novice at writing code. It's not about writing 228 00:13:58,730 --> 00:14:03,650 Josh: code. It's about being able to generate whatever you want it to. So like if you 229 00:14:03,650 --> 00:14:07,050 Josh: think of it, you kind of, in a way it abstracts the code away and allows you to 230 00:14:07,050 --> 00:14:11,110 Josh: just speak the English language and get what you want from speaking English 231 00:14:11,110 --> 00:14:14,070 Josh: and in a way that you understand, and it will help walk you through the way. One 232 00:14:14,070 --> 00:14:16,950 Josh: of the things that I love about Claude Code in particular is the plan mode. 233 00:14:17,690 --> 00:14:20,830 Josh: If you leave a lot of things out of your prompt, it'll actually just continue 234 00:14:20,830 --> 00:14:23,530 Josh: to prompt you with additional questions to understand what you want. 235 00:14:23,610 --> 00:14:29,330 Josh: And one of the most fascinating things that I read about GPT-5.3 Codex in 236 00:14:29,330 --> 00:14:33,070 Josh: particular is, like you mentioned in the intro, it helped build itself. 237 00:14:33,470 --> 00:14:37,230 Josh: And I don't think that can be overstated because this is the first model in 238 00:14:37,230 --> 00:14:42,050 Josh: the history of OpenAI that has helped with the building and construction of itself. 239 00:14:42,490 --> 00:14:47,150 Josh: And what happens as that starts to ramp up, right? If you think of each model 240 00:14:47,150 --> 00:14:49,730 Josh: iteration as a flywheel, what is the constraint? 
241 00:14:49,950 --> 00:14:54,250 Josh: The two constraints are the speed at which a developer can actually build it 242 00:14:54,250 --> 00:14:57,090 Josh: and then create the test for it and make sure that it's safe to ready to deploy. 243 00:14:57,430 --> 00:15:00,430 Josh: And then it's the hardware that's required to actually train the model. 244 00:15:00,570 --> 00:15:04,550 Josh: What we're seeing with Codex and Opus, which I really believe was kind of Sonnet, 245 00:15:04,850 --> 00:15:06,170 Josh: is the incremental improvements. 246 00:15:06,430 --> 00:15:09,410 Josh: Now, for the incremental improvements that don't require an entirely new training 247 00:15:09,410 --> 00:15:13,350 Josh: run, the real constraint is the actual software and what you could squeeze out of it. 248 00:15:13,350 --> 00:15:16,290 Josh: And when you have a model that's helping you build this 249 00:15:16,290 --> 00:15:19,390 Josh: software that can think for 6 12 24 hours 250 00:15:19,390 --> 00:15:22,230 Josh: at a time even longer and that is it kind 251 00:15:22,230 --> 00:15:25,310 Josh: of creates this like self-fulfilling loop right where the models use the 252 00:15:25,310 --> 00:15:28,190 Josh: new models to make the new models the future models 253 00:15:28,190 --> 00:15:31,070 Josh: stronger and more powerful and better and i thought that was a really interesting 254 00:15:31,070 --> 00:15:35,330 Josh: thing to note is that this is the first self propagating model where it ran 255 00:15:35,330 --> 00:15:38,910 Josh: a lot of the test for itself it introduced new code that made itself better 256 00:15:38,910 --> 00:15:43,310 Josh: and as we continue to see that you can start to imagine that vertical that like 257 00:15:43,310 --> 00:15:46,690 Josh: exponential progress line going pretty close to vertical and things getting 258 00:15:46,690 --> 00:15:48,550 Josh: really good like really really quick. 
259 00:15:48,970 --> 00:15:53,390 Ejaaz: I think what most people listening to this might think is that, 260 00:15:53,930 --> 00:15:55,970 Ejaaz: well, what was different before? 261 00:15:56,370 --> 00:16:00,450 Ejaaz: Well, previously, models would just kind of work in a very analog mode. 262 00:16:00,630 --> 00:16:02,390 Ejaaz: You would just point it at a problem 263 00:16:02,390 --> 00:16:05,550 Ejaaz: and it would just understand what the problem was and then solve it. 264 00:16:05,630 --> 00:16:10,530 Ejaaz: But it lacked that awareness and wider context as to like what the wider vision 265 00:16:10,530 --> 00:16:13,630 Ejaaz: and goal was to achieve and then figuring out stuff for itself. 266 00:16:13,630 --> 00:16:17,890 Ejaaz: You always had to kind of handhold it. But now with its ability to kind of like 267 00:16:17,890 --> 00:16:21,130 Ejaaz: understand what it's trying to do and look internally and say, 268 00:16:21,270 --> 00:16:24,110 Ejaaz: huh, I made that mistake because of this error in my code. 269 00:16:24,250 --> 00:16:26,930 Ejaaz: I'm going to now like rewrite my code and then I'll be better at it. 270 00:16:27,030 --> 00:16:31,590 Ejaaz: It kind of functions similarly to a human. Now, I actually saw a great analogy. 271 00:16:32,110 --> 00:16:34,230 Ejaaz: I forgot who wrote it, but it's 272 00:16:34,230 --> 00:16:38,790 Ejaaz: fantastic. where if you imagine yourself standing on a sidewalk, right? 273 00:16:39,110 --> 00:16:45,350 Ejaaz: And a Bugatti Veyron drives super fast by you at let's say 200 miles an hour, 274 00:16:45,510 --> 00:16:47,450 Ejaaz: you'll be like, wow, that's kind of fast. 275 00:16:47,830 --> 00:16:52,850 Ejaaz: And then two minutes later, another Bugatti drives by you at 300 miles an hour. 276 00:16:53,090 --> 00:16:56,630 Ejaaz: You'll be like, wow, that's kind of fast. But you wouldn't really notice the 277 00:16:56,630 --> 00:16:59,890 Ejaaz: difference between that 100 mile an hour difference, right? 
278 00:17:00,050 --> 00:17:04,070 Ejaaz: But if you were in the car strapped in, you would notice it is significantly 279 00:17:04,070 --> 00:17:06,890 Ejaaz: improved. And that's how software engineers feel right now. 280 00:17:07,090 --> 00:17:10,190 Ejaaz: Now, if you're someone that doesn't code all the time, you're not necessarily 281 00:17:10,190 --> 00:17:13,050 Ejaaz: going to understand these impacts, but it's really important for those of you 282 00:17:13,050 --> 00:17:17,770 Ejaaz: listening to this to figure out that this is massively impactful and will change 283 00:17:17,770 --> 00:17:19,890 Ejaaz: the way that a lot of things are happening today. 284 00:17:19,990 --> 00:17:24,270 Ejaaz: I mean, just take a look at this, right? This is a direct quote from someone 285 00:17:24,270 --> 00:17:27,010 Ejaaz: who is building at a major tech company, Rakuten. 286 00:17:27,290 --> 00:17:32,950 Ejaaz: And the quote here says, Claude Opus 4.6 autonomously closed 13 issues and assigned 287 00:17:32,950 --> 00:17:38,490 Ejaaz: 12 issues to the right team members in a single day, managing a 50-person organization 288 00:17:38,490 --> 00:17:40,510 Ejaaz: across six repositories. 289 00:17:40,730 --> 00:17:43,870 Ejaaz: Josh, do you know who else is responsible for doing that? 290 00:17:44,050 --> 00:17:48,550 Ejaaz: An entire team of product managers that each get paid a quarter of a million 291 00:17:48,550 --> 00:17:50,110 Ejaaz: dollars in compensation automatically. 292 00:17:50,330 --> 00:17:52,890 Josh: Minimum per year at least yeah their. 
293 00:17:52,890 --> 00:17:53,830 Ejaaz: Jobs are automated now 294 00:17:53,830 --> 00:17:56,870 Josh: Well, one of the earlier moments in 295 00:17:56,870 --> 00:17:59,870 Josh: which I realized this was pretty profound is when Claude Cowork, they 296 00:17:59,870 --> 00:18:03,890 Josh: said they built it with, what, just a handful, like four people over the course of 297 00:18:03,890 --> 00:18:09,450 Josh: 10 days, and it was 100% built by the current model of Claude, which is Opus 4.5 298 00:18:09,450 --> 00:18:14,510 Josh: at the time. Like, the amount of leverage from these tools is so high, but 299 00:18:14,510 --> 00:18:19,570 Josh: it cuts both ways. It's like if you can design and develop a product in 10 days, 300 00:18:19,810 --> 00:18:23,430 Josh: then that means another company can probably do that in five. 301 00:18:23,690 --> 00:18:28,770 Josh: And it starts to lower the competitive threshold for these companies to catch up. 302 00:18:28,890 --> 00:18:32,090 Josh: And it starts to raise the bar of what is possible. 303 00:18:32,210 --> 00:18:35,790 Josh: Like if you could build something that profound in 10 days, what can you build 304 00:18:35,790 --> 00:18:37,210 Josh: over the course of six months? 305 00:18:37,350 --> 00:18:42,050 Josh: Like, can you really build something fantastic that has a moat that like actually 306 00:18:42,050 --> 00:18:46,350 Josh: delivers on the total power that you have by leveraging this AI? 
307 00:18:46,510 --> 00:18:49,810 Josh: It's going to be interesting to see because, I mean, what we're finding even with 308 00:18:49,810 --> 00:18:53,370 Josh: the Codex and Opus dual launch is that these companies are right next to 309 00:18:53,370 --> 00:18:55,150 Josh: each other, and if one publishes something 310 00:18:55,870 --> 00:18:59,210 Josh: profound or something that attracts a lot of users, they're just a few days and 311 00:18:59,210 --> 00:19:03,330 Josh: a few prompts away from copying it, and that's like a pretty difficult thing 312 00:19:03,330 --> 00:19:06,310 Josh: to compete against on the software front. 313 00:19:06,310 --> 00:19:10,070 Ejaaz: Well, that's why if we look at the stock market over the last couple of days, like 314 00:19:10,070 --> 00:19:13,810 Ejaaz: it's down trillions of dollars, and I'm not exaggerating. If you look at Microsoft 315 00:19:13,810 --> 00:19:19,910 Ejaaz: over the last two weeks, the stock is down 20%. It's trading like a meme stock, which is just insane. 316 00:19:20,530 --> 00:19:26,190 Ejaaz: And the reason why that is, is a lot of investors are anticipating that these models, 317 00:19:27,290 --> 00:19:34,210 Ejaaz: specifically Opus 4.6 and Codex 5.3, will just create the tools that these billions 318 00:19:34,210 --> 00:19:38,790 Ejaaz: of dollars worth of SaaS companies have spent or valued their entire lives on 319 00:19:38,790 --> 00:19:40,610 Ejaaz: in a couple of seconds, just as you described. 320 00:19:40,810 --> 00:19:45,410 Ejaaz: Now, the counter argument to this, Josh, is, and Jensen Huang actually kind 321 00:19:45,410 --> 00:19:48,450 Ejaaz: of went live at a conference and spoke about this and made this point, 322 00:19:49,310 --> 00:19:54,110 Ejaaz: if you're an AI agent or AI model that is capable of building these tools, right? 323 00:19:54,310 --> 00:19:59,710 Ejaaz: Why would you rebuild the tool every single time you do a function?
324 00:20:00,090 --> 00:20:03,330 Ejaaz: Surely you would just access the best tool and use it. 325 00:20:03,590 --> 00:20:07,690 Ejaaz: So there's a bit more nuance where AI models aren't just gonna recreate your 326 00:20:07,690 --> 00:20:10,430 Ejaaz: entire software stack if you are at a Fortune 500 company. 327 00:20:10,570 --> 00:20:14,290 Ejaaz: That kind of doesn't make any sense. There are a bunch of tools that are hyper-optimized to do that. 328 00:20:14,330 --> 00:20:20,570 Ejaaz: But what it will do is it will connect all of these tools and silos in a much more effective way. 329 00:20:20,810 --> 00:20:22,990 Ejaaz: And maybe that requires rebuilding parts of it. 330 00:20:23,250 --> 00:20:26,610 Ejaaz: Maybe it requires kind of connecting different ways, but not rebuilding the entire tools. 331 00:20:26,870 --> 00:20:31,610 Ejaaz: And whatever operating system that ends up becoming will be the most sticky 332 00:20:31,610 --> 00:20:32,970 Ejaaz: and valuable company ever. 333 00:20:33,090 --> 00:20:36,670 Ejaaz: Now, that could be Salesforce, or it could be someone completely different, 334 00:20:36,770 --> 00:20:38,990 Ejaaz: a startup that we haven't even heard of. And I think that's really important 335 00:20:38,990 --> 00:20:40,930 Ejaaz: to understand, but people are experimenting. 336 00:20:41,170 --> 00:20:44,450 Ejaaz: And if you look at this graph right here, which may not look insane to some, 337 00:20:44,530 --> 00:20:50,230 Ejaaz: but is insane to me at least, 4% of daily GitHub commits are now Claude Code. 338 00:20:50,590 --> 00:20:56,870 Ejaaz: That was, I think, 5% of what it is today two months ago. 339 00:20:57,070 --> 00:21:02,010 Ejaaz: So the ascent has just been insane. These companies are adopting it and they are using it. 340 00:21:02,510 --> 00:21:05,570 Josh: Yeah, the number is just going to keep going up and there's no reason why it 341 00:21:05,570 --> 00:21:08,430 Josh: wouldn't. It's such a testament. One, the speed.
342 00:21:08,710 --> 00:21:11,090 Josh: It feels like we're strapped in that car and now we're flying. 343 00:21:11,450 --> 00:21:14,190 Josh: Two, an outsider might not look like it. It certainly feels like that 344 00:21:14,190 --> 00:21:17,790 Josh: on the inside and i think a lot of people are starting to notice this and get 345 00:21:17,790 --> 00:21:20,690 Josh: a little nervous about it too like look at this example on the screen right 346 00:21:20,690 --> 00:21:27,310 Josh: now this is a prompt from gpt 5.3 codex which basically created an entire minecraft 347 00:21:27,310 --> 00:21:32,830 Josh: clone in a single prompt and it looks awesome and it works really fast and it 348 00:21:32,830 --> 00:21:33,870 Josh: was super lightweight and 349 00:21:34,120 --> 00:21:38,260 Josh: And it says, I also tried on Opus 4.6, but for some reason it got stuck. 350 00:21:38,740 --> 00:21:42,180 Josh: But you can build anything that you want very, very quickly, 351 00:21:42,640 --> 00:21:44,040 Josh: like very cheaply as well. 352 00:21:44,520 --> 00:21:48,720 Josh: What Opus 5.3, or Opus 5.3, I'm getting them all mixed up. 353 00:21:49,000 --> 00:21:55,260 Josh: What GPT 5.3 Codex offered is double the rates, the double the token rates for 354 00:21:55,260 --> 00:21:56,080 Josh: the next couple of months. 355 00:21:56,220 --> 00:21:59,960 Josh: So you actually have the freedom for their $20 a month plan to go and build whatever you want. 356 00:22:00,100 --> 00:22:02,640 Ejaaz: Can I maybe deliver a hot take, Josh? 357 00:22:02,900 --> 00:22:03,460 Josh: Yeah, what do you got? 358 00:22:03,460 --> 00:22:08,000 Ejaaz: I think the most exciting part about these model releases aren't the models themselves. 359 00:22:08,580 --> 00:22:11,840 Ejaaz: Largely, I think the models are kind of similar in capabilities. 360 00:22:12,280 --> 00:22:16,680 Ejaaz: They are around the same coding benchmarks, and they can roughly do the same 361 00:22:16,680 --> 00:22:19,120 Ejaaz: things. 
They can spin up a bunch of agents and orchestrate themselves. 362 00:22:19,580 --> 00:22:24,300 Ejaaz: The bigger picture, which I think a lot of people missed, was both companies, 363 00:22:24,380 --> 00:22:27,120 Ejaaz: Anthropic and OpenAI, are at war with each other. 364 00:22:27,400 --> 00:22:31,360 Ejaaz: And they're trying to basically build and own the operating system for work, 365 00:22:31,440 --> 00:22:34,140 Ejaaz: which isn't just a model. it's a software suite. 366 00:22:34,340 --> 00:22:37,260 Ejaaz: So this week alone, OpenAI didn't just release this new model. 367 00:22:37,460 --> 00:22:42,380 Ejaaz: They released the Codex app, which is a desktop Mac app, which is kind of like 368 00:22:42,380 --> 00:22:45,360 Ejaaz: a command line interface, which makes the coding experience way better. 369 00:22:45,500 --> 00:22:48,380 Ejaaz: And they also launched an enterprise platform called Frontier, 370 00:22:48,640 --> 00:22:54,300 Ejaaz: which allows Fortune 500 companies to basically take this magical model and 371 00:22:54,300 --> 00:22:57,800 Ejaaz: give it to non-coders and let them do magical things. Now, 372 00:22:58,480 --> 00:23:02,660 Ejaaz: All of these products together creates a very sticky experience where it starts 373 00:23:02,660 --> 00:23:07,320 Ejaaz: to make sense for software engineers and non-software engineers to use these products. 374 00:23:07,420 --> 00:23:10,900 Ejaaz: And it becomes incredibly sticky, which results in billion-dollar contracts, right? 375 00:23:11,440 --> 00:23:14,320 Ejaaz: Anthropic has done the same thing over the last two weeks. 376 00:23:14,420 --> 00:23:18,540 Ejaaz: They released Claude Cowork, they released agent teams this week, 377 00:23:18,560 --> 00:23:19,900 Ejaaz: and then they released this new model. 
378 00:23:20,040 --> 00:23:23,120 Ejaaz: They're going after the same thing, which kind of makes sense why they're 379 00:23:23,120 --> 00:23:25,960 Ejaaz: releasing Super Bowl ads that are kind of shitting on each other now. 380 00:23:26,360 --> 00:23:31,280 Ejaaz: It makes a lot of sense. And so the point is, if they can own this operating 381 00:23:31,280 --> 00:23:35,500 Ejaaz: system, this future of work, they will basically be the most valuable company. 382 00:23:35,580 --> 00:23:36,980 Ejaaz: And I think it's going to be winner takes most. 383 00:23:37,160 --> 00:23:39,880 Josh: I have to interrupt you here. We have some developments on our prompts that 384 00:23:39,880 --> 00:23:42,480 Josh: we've been working on, our AI stock war room. Let's go. That I'm going to have 385 00:23:42,480 --> 00:23:43,680 Josh: to share on the screen right now. 386 00:23:44,040 --> 00:23:48,180 Josh: So currently what it's doing is it's asking to do some quality assurance testing. 387 00:23:48,380 --> 00:23:52,880 Josh: So you'll see it's actually taking over control of my browser and 388 00:23:52,880 --> 00:23:56,600 Josh: it's asking to make prompts on the screen. So you can see all of this that you're 389 00:23:56,600 --> 00:24:00,540 Josh: seeing right here is generated live, and it's doing an actual real-time debug 390 00:24:00,540 --> 00:24:02,440 Josh: of the product that it made. 391 00:24:02,600 --> 00:24:05,860 Josh: It's clicking around, it's resizing things, it's going through the links, 392 00:24:05,880 --> 00:24:09,200 Josh: and it's running real quality assurance testing on the actual product. 393 00:24:09,340 --> 00:24:11,360 Josh: It's really amazing to see.
394 00:24:12,110 --> 00:24:15,130 Josh: This was all just built all these visual charts and they're all accurate so 395 00:24:15,130 --> 00:24:17,850 Josh: right now we're looking at nvidia we have a chart and i'm not going to mess 396 00:24:17,850 --> 00:24:20,870 Josh: with it because it's doing the real-time manipulation to do quality assurance 397 00:24:20,870 --> 00:24:23,610 Josh: checks but it's actually clicking through it's making sure the 398 00:24:23,610 --> 00:24:28,510 Josh: stats are accurate it's making sure all the widgets work and look it has this 399 00:24:28,510 --> 00:24:32,570 Josh: amazing graphs already it has sentiment analysis 85 percent of people are bullish 400 00:24:32,570 --> 00:24:38,210 Josh: on nvidia it has recent signals from the news it has the assessment a risk assessment 401 00:24:38,210 --> 00:24:41,870 Josh: matrix where it shows the like export controls and chip controls. 402 00:24:41,990 --> 00:24:46,490 Josh: It has revenue and earnings every single quarter, charted, competitive moats. 403 00:24:46,690 --> 00:24:49,830 Josh: It has sector comparisons. It's like, this is unbelievable. 404 00:24:50,090 --> 00:24:53,230 Josh: And it just generated this in a single prompt. And I just find it really funny 405 00:24:53,230 --> 00:24:55,330 Josh: that we can actually watch this do it in real time. 406 00:24:55,450 --> 00:25:00,490 Josh: So you'll see in this prompt, it's clicking through, it's taking screenshots of what it's seeing. 407 00:25:00,630 --> 00:25:04,430 Josh: And then it's digesting, analyzing, and understanding what it made, 408 00:25:04,650 --> 00:25:07,150 Josh: what it messed up and what it actually still has left to finish. 409 00:25:07,150 --> 00:25:11,570 Josh: And it generated everything, all of this in real time as we're recording this episode. 410 00:25:12,990 --> 00:25:13,710 Josh: So fascinating. 
411 00:25:14,210 --> 00:25:19,070 Ejaaz: Wow, it reminds me of some of the research platforms at the former companies 412 00:25:19,070 --> 00:25:22,710 Ejaaz: that I used to work at and they would pay, I'm not joking, millions of dollars 413 00:25:22,710 --> 00:25:26,670 Ejaaz: a year to get access to these types of platforms that would give them analysis 414 00:25:26,670 --> 00:25:28,450 Ejaaz: like what you're showing on the screen right now. 415 00:25:28,750 --> 00:25:32,390 Josh: And you just built it from scratch. From scratch, and look, it's doing this. 416 00:25:32,510 --> 00:25:35,750 Josh: I'm not even touching my keyboard. I just searched for Apple and now I'm sure 417 00:25:35,750 --> 00:25:36,750 Josh: if I go over to the prompt, 418 00:25:36,750 --> 00:25:39,710 Josh: it's taking screenshots of apple it says apple dashboard 419 00:25:39,710 --> 00:25:42,990 Josh: looking great let me scroll to see the new three column button row layout and 420 00:25:42,990 --> 00:25:47,550 Josh: it's checking the button rows and it's really unbelievable like we have the 421 00:25:47,550 --> 00:25:51,190 Josh: investment thesis the bull case for it the bear case for it catalyst and timelines 422 00:25:51,190 --> 00:25:56,810 Josh: it has wwdc built in it has the iphone 18 launch props um set up for september, 423 00:25:57,440 --> 00:26:01,500 Josh: It's like so cool. It's absolutely unbelievable. And now this is a real tool 424 00:26:01,500 --> 00:26:03,880 Josh: that I'll be able to use to type 425 00:26:03,880 --> 00:26:07,200 Josh: in whatever stock I want to look at and actually get some analysis on it. 426 00:26:07,340 --> 00:26:12,580 Josh: Now, I'll go over to Codex over here and it looks like Codex is taking its sweet time. 427 00:26:12,740 --> 00:26:16,480 Josh: It's still zero out of six tasks completed. 
So it might take a little while 428 00:26:16,480 --> 00:26:19,780 Josh: for us to get a visual on that, but it's just amazing to watch this happen in 429 00:26:19,780 --> 00:26:23,420 Josh: real time as at least Claude Code and Opus 4.6 430 00:26:24,020 --> 00:26:27,760 Josh: do some quality assurance testing live by taking over my browser and running 431 00:26:27,760 --> 00:26:30,740 Josh: it for itself. I just think this is like, this is amazing. 432 00:26:31,120 --> 00:26:37,160 Ejaaz: It's magic. Something I just noticed in your Opus chatbot screen when it's going 433 00:26:37,160 --> 00:26:41,640 Ejaaz: through its thinking, it seems to have like spun up a few different agents or 434 00:26:41,640 --> 00:26:44,240 Ejaaz: instances of its own self to pull this off. 435 00:26:44,420 --> 00:26:47,380 Ejaaz: Like I think if you scroll up, like I saw a few kind of like prompts that like 436 00:26:47,380 --> 00:26:49,120 Ejaaz: suggested that that's what it was doing, 437 00:26:49,600 --> 00:26:53,880 Ejaaz: which I think underscores a very important point about what both of these models 438 00:26:53,880 --> 00:26:59,980 Ejaaz: can do, which is they can spin up multiple versions of the same model and task 439 00:26:59,980 --> 00:27:02,020 Ejaaz: it with different things to run in parallel. 440 00:27:02,440 --> 00:27:05,920 Ejaaz: What this means is you can get a really complicated product like what you're 441 00:27:05,920 --> 00:27:11,380 Ejaaz: seeing on the screen right now in a matter of minutes because it's running in parallel. 442 00:27:11,600 --> 00:27:15,440 Ejaaz: So imagine having a bunch of computer science geniuses that you can just duplicate 443 00:27:15,440 --> 00:27:20,260 Ejaaz: immediately and run at a fraction of the cost of electricity, the cost of inference. 444 00:27:20,640 --> 00:27:23,980 Ejaaz: And now you start to see why all these NVIDIA chips and stuff are worth so much.
445 00:27:24,260 --> 00:27:26,320 Ejaaz: Because you want to do cool stuff like this. This is insane. 446 00:27:26,440 --> 00:27:28,600 Josh: It's actually incredible. Okay, so now I want to test it on Tesla. 447 00:27:29,240 --> 00:27:31,640 Josh: So I'm going to choose Tesla and see if it actually can do it in. 448 00:27:31,640 --> 00:27:33,160 Ejaaz: A non-controlled environment. This UI is so cool. 449 00:27:33,400 --> 00:27:37,000 Josh: It's very pretty. What the hell? This looks great. Okay, so here we have Tesla. 450 00:27:37,240 --> 00:27:39,900 Josh: It has the charts. We're going to click through the charts. It has the one-week 451 00:27:39,900 --> 00:27:43,840 Josh: chart, the one-month chart, the three-month chart. That looks fairly accurate. 452 00:27:44,340 --> 00:27:48,160 Josh: It has the price-to-earnings ratio, the 52-week high, 52-week low. 453 00:27:48,840 --> 00:27:52,460 Josh: So it looks like at one point it was trading at $488, now it's trading at $389. 454 00:27:52,900 --> 00:27:57,400 Josh: The bull case for Tesla, RoboTaxi and FSD driving licenses could unlock $500 455 00:27:57,400 --> 00:27:59,060 Josh: billion in revenue by 2030. 456 00:27:59,820 --> 00:28:03,280 Josh: It has the RoboTaxi service launch in Austin that it's preparing for. 457 00:28:03,580 --> 00:28:08,940 Josh: And let's see the sector comparison. So it's comparing it to Rivian, Baidu, Toyota, Ford. 458 00:28:09,300 --> 00:28:13,180 Josh: It has the competitive moat where it says it's most strong in brand power, 459 00:28:13,340 --> 00:28:14,920 Josh: IP patents, and cost advantages. 460 00:28:15,300 --> 00:28:17,900 Josh: You can see the revenue, the estimate per share earnings. 461 00:28:19,060 --> 00:28:23,620 Josh: Sentiment is much worse on Tesla than it was on Apple. It's at 52% right now.
462 00:28:24,020 --> 00:28:29,960 Josh: And it looks like, as it relates to the risk assessment, valuation and competition 463 00:28:29,960 --> 00:28:32,880 Josh: and execution are all very high risk. 464 00:28:33,060 --> 00:28:36,800 Josh: And that's probably an accurate assessment, although I'm not sure the competition 465 00:28:36,800 --> 00:28:39,500 Josh: is really a problem. The execution is certainly going to be an issue. 466 00:28:39,620 --> 00:28:44,440 Josh: But it's just amazing to see how well it does. And it even gives it a verdict. 467 00:28:44,640 --> 00:28:46,500 Josh: So the AI verdict on Tesla is, 468 00:28:47,150 --> 00:28:51,190 Josh: It's a hold. Tesla's optionality is enormous, but current valuation already 469 00:28:51,190 --> 00:28:52,890 Josh: prices in multiple moonshots. 470 00:28:53,310 --> 00:28:57,010 Josh: Execution on RoboTaxi will be the key catalyst. That sounds about right. 471 00:28:57,190 --> 00:29:02,910 Josh: And it's amazing that we just built this with a single prompt without any oversight from me. 472 00:29:03,070 --> 00:29:07,970 Josh: And it works. It actually works. It's really just unbelievable how capable these things are. 473 00:29:08,110 --> 00:29:10,690 Josh: And now I have a dashboard that anytime I want to make a decision, 474 00:29:10,870 --> 00:29:16,010 Josh: I can type in the ticker and get all this, um, optionality. It even has menus that 475 00:29:16,010 --> 00:29:21,130 Josh: work. Look at this: profit margins, P/E ratios, market cap. Wow, pretty unbelievable. It's.
476 00:29:21,130 --> 00:29:26,690 Ejaaz: It's a reactive, real-time Bloomberg terminal, oh wait, for the modern age 477 00:29:26,690 --> 00:29:30,710 Josh: There's, um, there's another feature here that looks like you could compare stocks. 478 00:29:30,710 --> 00:29:35,890 Josh: Let's see if this actually works here. So if I type in, let's say, Apple's ticker 479 00:29:35,890 --> 00:29:40,310 Josh: and I hit go, will that compare the two? Now it looks like that doesn't work very 480 00:29:40,310 --> 00:29:43,730 Josh: well. Oh my god, but it has moving average lines and everything. This is pretty robust. 481 00:29:43,950 --> 00:29:47,050 Ejaaz: I know, it's like the trader's and investor's dream. Just crazy. 482 00:29:48,070 --> 00:29:50,070 Ejaaz: Kind of like a side note on this, but like, 483 00:29:50,840 --> 00:29:53,860 Ejaaz: The fact that Tesla's down and everyone's kind of like bearish on this company, 484 00:29:54,020 --> 00:29:56,880 Ejaaz: even though they're like rumored to be merging and stuff like this. 485 00:29:57,200 --> 00:30:02,800 Ejaaz: Like the point being, there's an asymmetry between what the market is seeing 486 00:30:02,800 --> 00:30:06,260 Ejaaz: and what these inventors and builders are seeing. 487 00:30:06,600 --> 00:30:12,220 Ejaaz: These AI labs have created what they define as pretty much a low form of AGI. 488 00:30:12,600 --> 00:30:16,320 Ejaaz: You literally have an AI model that is building the next version of itself. 489 00:30:16,320 --> 00:30:21,580 Ejaaz: That by description is like a super genius and it's only limited by the function 490 00:30:21,580 --> 00:30:24,160 Ejaaz: of energy and compute, right? 491 00:30:24,580 --> 00:30:28,700 Ejaaz: And then investors are looking at this and saying, huh, Amazon and Google are 492 00:30:28,700 --> 00:30:32,380 Ejaaz: about to spend a combined $500 billion worth of CapEx this year. 493 00:30:32,920 --> 00:30:36,600 Ejaaz: Kind of bearish, that's a lot of money.
So there is a real investment opportunity 494 00:30:36,600 --> 00:30:39,780 Ejaaz: here to really understand the difference of what these things can actually do. 495 00:30:40,100 --> 00:30:43,000 Ejaaz: And that might lead to a lot of like opportunities to invest. 496 00:30:43,160 --> 00:30:46,780 Ejaaz: I don't know, but I know that I'm buying Tesla today and a bunch of Google stock 497 00:30:46,780 --> 00:30:50,620 Josh: Yeah, I mean, look at this Google valuation. One, this chart looks absolutely gorgeous, 498 00:30:50,620 --> 00:30:54,900 Josh: but two, um, the AI verdict is a buy. Even the AI thinks Google is a buy because 499 00:30:54,900 --> 00:30:59,260 Josh: they just have, um, Alphabet offers the best value in mega cap tech: dominant AI 500 00:30:59,260 --> 00:31:03,340 Josh: capabilities, diversified growth, and a cheap valuation if search moat holds and. 501 00:31:03,340 --> 00:31:05,320 Ejaaz: Yeah give me the week give me the week 502 00:31:05,320 --> 00:31:08,320 Josh: Let's see the weekly chart here. Do you want some moving average lines as well? 503 00:31:08,320 --> 00:31:10,300 Josh: Because we could drop those in. Please, let's. 504 00:31:10,300 --> 00:31:15,500 Ejaaz: See, let's see. I'm actually super, yeah, look, see, it's had a slight dip. Markets are so reactive. Crazy. 505 00:31:15,940 --> 00:31:22,640 Josh: Yeah, and I think to the point of the CapEx, markets are viewing that as a scary, high-risk statement. 506 00:31:22,880 --> 00:31:27,480 Josh: But while that's true, I also think it's a testament to the fact that scaling 507 00:31:27,480 --> 00:31:31,520 Josh: laws are going to work, and the largest companies in the world are betting on 508 00:31:31,520 --> 00:31:32,860 Josh: the continuation of them working.
509 00:31:33,140 --> 00:31:37,860 Josh: And the shared consensus between all of these large-cap companies deciding to 510 00:31:37,860 --> 00:31:40,080 Josh: spend record CapEx this year, 511 00:31:40,700 --> 00:31:43,980 Josh: is a testament to the fact that things are only going to go faster. 512 00:31:44,300 --> 00:31:47,160 Josh: And they believe that the more money they put in, the more outputs they will get. 513 00:31:47,300 --> 00:31:51,080 Josh: And they're going to continue to put their foot on the gas. So I think any question 514 00:31:51,080 --> 00:31:55,020 Josh: that anyone had, if these scaling laws could continue to hold up and we could 515 00:31:55,020 --> 00:31:58,120 Josh: continue to be on the path to whatever AGI looks like and beyond, 516 00:31:58,280 --> 00:32:00,300 Josh: I think that was answered this week through these earnings reports. 517 00:32:00,400 --> 00:32:02,580 Josh: And the overwhelming answer is yes, it's true. 518 00:32:02,840 --> 00:32:06,520 Josh: It is likely that this is going to happen and everyone is betting their entire company on it? 519 00:32:06,740 --> 00:32:11,700 Ejaaz: I think we have done a great job, if I pat ourselves on the back virtually, 520 00:32:11,880 --> 00:32:13,980 Ejaaz: Josh, of showing what these models are capable of. 521 00:32:14,100 --> 00:32:18,500 Ejaaz: And remember, it's been less than 48 hours that these models have been alive. 522 00:32:18,840 --> 00:32:23,840 Ejaaz: In fact, I think it's been like 36 hours. So if any of you are interested in 523 00:32:23,840 --> 00:32:28,400 Ejaaz: trying these out, I cannot urge you enough to go out and try these things. 524 00:32:28,760 --> 00:32:32,640 Ejaaz: Try to solve a problem that you're finding at work or try to solve a problem 525 00:32:32,640 --> 00:32:36,140 Ejaaz: that you're finding just in your casual leisure time to code up a hobby or a 526 00:32:36,140 --> 00:32:39,320 Ejaaz: project in a matter of seconds. It's so, so easy. 
527 00:32:39,480 --> 00:32:43,260 Ejaaz: And it'll put you at an advantage to understand how these tools work and why 528 00:32:43,260 --> 00:32:46,320 Ejaaz: they're really changing the world as we see it around us, why stocks are dumping, 529 00:32:46,680 --> 00:32:47,920 Ejaaz: why some stocks are pumping. 530 00:32:48,600 --> 00:32:51,760 Ejaaz: But yes, go demo it. Let us know what you actually end up building. 531 00:32:52,400 --> 00:32:56,720 Ejaaz: Josh and I are trying to give you more live demos in a lot of the episodes that we put out. 532 00:32:56,860 --> 00:32:59,540 Ejaaz: And with every other model release and feature that drops, we are going to be 533 00:32:59,540 --> 00:33:03,600 Ejaaz: trying and testing these things so we can bring to you exactly what these things 534 00:33:03,600 --> 00:33:06,300 Ejaaz: can do and show you kind of like the benefits and disadvantages, 535 00:33:06,300 --> 00:33:08,000 Ejaaz: what's real and what's really not. 536 00:33:08,440 --> 00:33:12,040 Josh: Yeah. And I can't stress this enough. The best way to stay on top of things, 537 00:33:12,180 --> 00:33:15,260 Josh: the best way to feel like you're not being left behind is just to use the tools 538 00:33:15,260 --> 00:33:17,920 Josh: as they come out and to understand them and what makes them different. 539 00:33:18,140 --> 00:33:24,200 Josh: And for a single subscription to ChatGPT or to Claude, you can access tools 540 00:33:24,200 --> 00:33:26,040 Josh: just like this and build stuff just like this. 541 00:33:26,340 --> 00:33:29,660 Josh: I'm not, this wasn't like an incredibly difficult technical challenge. 542 00:33:29,680 --> 00:33:32,140 Josh: You just ask it what you want and you ask it to help you. 543 00:33:32,280 --> 00:33:35,740 Josh: And it will actually walk through and help you through the process and build whatever you want. 
544 00:33:35,860 --> 00:33:41,300 Josh: So the most important thing for anyone listening is just to train that muscle and to get familiar with 545 00:33:41,790 --> 00:33:45,410 Josh: these tools and these skills so that you're able to leverage them to your advantage, 546 00:33:45,410 --> 00:33:47,950 Josh: however it may best fit in your life. 547 00:33:48,070 --> 00:33:50,410 Josh: And that's what kind of we wanted to share with you. 548 00:33:50,530 --> 00:33:52,890 Josh: Like, it's simple. You download the app, you log into your account, 549 00:33:52,970 --> 00:33:54,150 Josh: and you're on your way. It's really 550 00:33:54,150 --> 00:33:58,410 Josh: not as difficult as I think a lot of people make it seem like it is. 551 00:33:58,630 --> 00:34:02,090 Josh: And I mean, this beautiful dashboard is a testament to that. 552 00:34:02,450 --> 00:34:05,510 Josh: Okay, so Ejaaz, it also looks like our Codex output 553 00:34:05,510 --> 00:34:08,230 Josh: has finished itself. So we have here on the 554 00:34:08,230 --> 00:34:11,010 Josh: screen, we have Opus, which we saw, which is 555 00:34:11,010 --> 00:34:14,130 Josh: really a lovely dashboard, but it seems like Codex 556 00:34:14,130 --> 00:34:17,430 Josh: now has its own version that we could quickly compare. So maybe we'll try, we'll 557 00:34:17,430 --> 00:34:20,490 Josh: go to our favorite, Google. We'll type Google in and we'll click analyze and kind 558 00:34:20,490 --> 00:34:24,390 Josh: of see how this compares. I find it funny how they've merged on the same 559 00:34:24,390 --> 00:34:29,310 Josh: type of design style, but yeah. Oh, okay, this, whoa, this is interesting. This is 560 00:34:29,310 --> 00:34:33,210 Josh: different. So it has the moving averages select. Oh, is that, 561 00:34:34,320 --> 00:34:35,760 Josh: Okay, yeah, so it has the charts. 562 00:34:35,940 --> 00:34:36,620 Ejaaz: Is that accurate? 563 00:34:36,820 --> 00:34:39,780 Josh: It has the PE ratio. Yeah, that's what I was looking at.
Let's go to that one-week chart and see. 564 00:34:40,580 --> 00:34:45,360 Josh: I have some questions about these. It looks pretty right. 565 00:34:46,040 --> 00:34:48,320 Ejaaz: Okay. That looks very wrong. 566 00:34:48,500 --> 00:34:52,920 Josh: Yeah, the one-year I'm a little confused about. Let's compare it to Claude here. 567 00:34:53,100 --> 00:34:56,700 Josh: Let's go to Google and we'll analyze that. Well, it thinks we can look at the 568 00:34:56,700 --> 00:34:58,620 Josh: rest. So it looks like it emulated pretty well. 569 00:34:59,200 --> 00:35:01,460 Josh: It has the verdict. It has the same stats. 570 00:35:02,660 --> 00:35:06,560 Josh: The risk assessment matrix is... good, but you could see like some of the text 571 00:35:06,560 --> 00:35:11,140 Josh: you can't really read because it's black on black, um, but nonetheless pretty 572 00:35:11,140 --> 00:35:12,700 Josh: interesting. They both succeeded. 573 00:35:12,700 --> 00:35:18,120 Ejaaz: Yeah, I mean, as we said before, like these models are very equally capable, and 574 00:35:18,120 --> 00:35:22,040 Ejaaz: you know, maybe it's just the way that you prompt something or, uh, the way that 575 00:35:22,040 --> 00:35:25,180 Ejaaz: some of these things work, but largely they kind of achieve the same goal and 576 00:35:25,180 --> 00:35:31,120 Ejaaz: same quality, um, and like listen, like we're talking about like minor discrepancies here. 577 00:35:31,660 --> 00:35:34,760 Ejaaz: I can't wait to see what we will build with this. Like, this is insane. 578 00:35:34,960 --> 00:35:38,180 Josh: It's amazing. Both of these one-shot prompts didn't touch anything. 579 00:35:38,180 --> 00:35:40,500 Josh: And here we are. I do think that Google one-year chart is wrong, 580 00:35:40,600 --> 00:35:41,580 Josh: I think Claude got that one right. 581 00:35:41,700 --> 00:35:44,360 Josh: But overall both succeeded in the mission. Both look great. 582 00:35:44,520 --> 00:35:45,780 Josh: And both are just excellent models.
583 00:35:46,400 --> 00:35:49,720 Ejaaz: Amazing. Okay, well, that's it. Wherever you're listening to this, 584 00:35:49,920 --> 00:35:53,060 Ejaaz: if it is on YouTube and you're watching our lovely faces, or if you're listening 585 00:35:53,060 --> 00:35:56,080 Ejaaz: to us on Spotify, Apple Music, or wherever you listen to us, 586 00:35:56,740 --> 00:35:59,860 Ejaaz: please subscribe, give us a rating, leave us some comments. 587 00:35:59,860 --> 00:36:04,960 Ejaaz: We love your feedback and we respond to pretty much every single comment because 588 00:36:04,960 --> 00:36:07,340 Ejaaz: we're trying to figure out how to make this show better and bring you the content 589 00:36:07,340 --> 00:36:09,220 Ejaaz: that you guys deserve and want. 590 00:36:09,400 --> 00:36:13,240 Ejaaz: Turn on notifications because we are releasing more and more videos every week 591 00:36:13,240 --> 00:36:15,680 Ejaaz: on the hottest topics as they come out. 592 00:36:15,860 --> 00:36:20,280 Ejaaz: We also have the sickest newsletter ever where one of us will either write an 593 00:36:20,280 --> 00:36:22,600 Ejaaz: essay or give you the five top highlights of the week. 594 00:36:22,700 --> 00:36:24,980 Ejaaz: So if you don't want to watch any of these videos, you can just read and digest 595 00:36:24,980 --> 00:36:28,580 Ejaaz: that and you'll know everything that you need to know in AI and frontier tech. 596 00:36:29,060 --> 00:36:31,860 Ejaaz: Thank you for listening, and we will see you on the next one. 597 00:36:31,940 --> 00:36:33,020 Josh: See you in the next one. Peace.
