
Limitless Podcast
·E122
The Coding Model Wars: Claude Opus 4.6 vs GPT-5.3 Codex
Episode Transcript
1
00:00:00,020 --> 00:00:05,840
Ejaaz:
48 hours ago, Anthropic dropped Claude Opus 4.6, the world's most powerful AI model.
2
00:00:06,140 --> 00:00:10,760
Ejaaz:
And literally 20 minutes later, OpenAI dropped Codex 5.3, which is not only
3
00:00:10,760 --> 00:00:13,140
Ejaaz:
better, but also built itself.
4
00:00:13,480 --> 00:00:17,740
Ejaaz:
Now, to say both of these models are powerful would literally be the understatement of the century.
5
00:00:18,040 --> 00:00:20,880
Ejaaz:
By the time I'd eaten breakfast yesterday, one of the models had discovered
6
00:00:20,880 --> 00:00:23,840
Ejaaz:
500 security flaws, which no one else had discovered before.
7
00:00:24,240 --> 00:00:28,080
Ejaaz:
And by lunchtime, a bunch of software stocks were down hundreds of billions
8
00:00:28,080 --> 00:00:31,500
Ejaaz:
of dollars out of fear that these models would replace entire teams.
9
00:00:31,800 --> 00:00:35,680
Ejaaz:
And it's actually already happened. These models can replace a team of 50 software
10
00:00:35,680 --> 00:00:38,740
Ejaaz:
engineers, rebuild Pokemon from scratch, and so much more.
11
00:00:38,840 --> 00:00:42,180
Ejaaz:
And in this episode, we're going to be doing a live demo side by side to show
12
00:00:42,180 --> 00:00:43,900
Ejaaz:
you which model is the best.
13
00:00:44,100 --> 00:00:47,040
Josh:
Yeah, this is pretty cool. I wanted to spend a lot of time this episode kind
14
00:00:47,040 --> 00:00:50,560
Josh:
of introducing people to these models, what they could do, how they work through
15
00:00:50,560 --> 00:00:52,700
Josh:
demos that we're going to perform ourselves.
16
00:00:52,980 --> 00:00:55,940
Josh:
These are definitely two frontier models, but I think more importantly they're
17
00:00:55,940 --> 00:00:58,760
Josh:
frontier coding models. And when people hear that, I
18
00:00:58,760 --> 00:01:01,900
Josh:
think a lot of them get turned away, because it seems like this complicated
19
00:01:01,900 --> 00:01:04,800
Josh:
thing, like you need to be a developer in order to use them. And we
20
00:01:04,800 --> 00:01:08,240
Josh:
are here to tell you that is not the case. From
21
00:01:08,240 --> 00:01:11,000
Josh:
one non-technical person to another: I fed this
22
00:01:11,000 --> 00:01:14,020
Josh:
model a prompt, I fed it some assets, and
23
00:01:14,020 --> 00:01:16,880
Josh:
then I pressed play. And what I got is a
24
00:01:16,880 --> 00:01:19,740
Josh:
side-scrolling game, which was exactly what I asked for. So on the screen now
25
00:01:19,740 --> 00:01:23,960
Josh:
you're seeing the one-shot prompt that I fed this model to ask it to create a side
26
00:01:23,960 --> 00:01:28,200
Josh:
scroller that was like Mario that we can actually play. So it has coins, and I
27
00:01:28,200 --> 00:01:31,560
Josh:
don't think the gravity quite works. What you're seeing is that it understands
28
00:01:31,560 --> 00:01:37,460
Josh:
physics, it is able to generate graphics, and it plays like a pretty solid side
29
00:01:37,460 --> 00:01:39,560
Josh:
scroller. And I created this in five minutes,
30
00:01:40,310 --> 00:01:43,070
Josh:
with one prompt, and it actually works.
31
00:01:43,070 --> 00:01:44,710
Ejaaz:
What was the prompt that you used, Josh?
32
00:01:44,710 --> 00:01:47,430
Josh:
Yeah, so I'll pause playing this game to
33
00:01:47,430 --> 00:01:50,050
Josh:
actually show you the prompt. It was very simple. It was this
34
00:01:50,050 --> 00:01:52,850
Josh:
one paragraph: I want you to make a game. You can
35
00:01:52,850 --> 00:01:56,170
Josh:
use Python or C++, whatever you find the most convenient. A 2D
36
00:01:56,170 --> 00:01:58,790
Josh:
platformer that closely resembles Super Mario. Use the
37
00:01:58,790 --> 00:02:01,430
Josh:
attached background image and sprites found in the
38
00:02:01,430 --> 00:02:04,490
Josh:
asset folder. Take into account that the sprites don't come with transparent backgrounds
39
00:02:04,490 --> 00:02:07,210
Josh:
but pink ones, so you need to filter the background. And for those who are
40
00:02:07,210 --> 00:02:10,290
Josh:
watching, you can actually see the sprites on my screen. They were
41
00:02:10,290 --> 00:02:13,150
Josh:
just a series of assets, and there was no context given as
42
00:02:13,150 --> 00:02:16,090
Josh:
to what each one of them was. But the model reasoned through it, it removed
43
00:02:16,090 --> 00:02:19,150
Josh:
the background, and it actually generated a pretty good representation of
44
00:02:19,150 --> 00:02:22,270
Josh:
that. Now, this was built one-shot on Codex, which
45
00:02:22,270 --> 00:02:25,050
Josh:
is the new OpenAI Mac application that just released this
46
00:02:25,050 --> 00:02:28,410
Josh:
week. And I wanted to compare it to Claude,
47
00:02:28,410 --> 00:02:31,950
Josh:
so I have another instance here on the screen with Claude. This is using Opus
48
00:02:31,950 --> 00:02:35,490
Josh:
4.6, the newest frontier model that they just released this week. And I want to
49
00:02:35,490 --> 00:02:39,310
Josh:
do an exact one-to-one comparison, so I'm gonna launch the same exact prompt.
50
00:02:39,310 --> 00:02:42,390
Josh:
We're gonna have that cook on Codex, or, we're gonna have that cook in Claude
51
00:02:42,390 --> 00:02:46,350
Josh:
Code. And in the meantime, Ejaaz, maybe we can kind of talk about more of what
52
00:02:46,350 --> 00:02:48,110
Josh:
these models do and how they work.
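
For anyone wondering what "filter the background" in that prompt means in practice, here is a minimal sketch of the idea, often called color keying. It is not the code either model produced. The exact pink value, the tolerance, and the function name `remove_background` are assumptions for illustration; the episode only says the sprite backgrounds were pink.

```python
# Hypothetical key colour: the episode only says the backgrounds are "pink".
KEY_COLOR = (255, 0, 255)


def remove_background(pixels, key=KEY_COLOR, tolerance=30):
    """Return RGBA rows where pixels close to `key` become fully transparent.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    """
    result = []
    for row in pixels:
        new_row = []
        for r, g, b in row:
            # A pixel counts as background if every channel is within
            # `tolerance` of the key colour.
            is_background = all(
                abs(channel - target) <= tolerance
                for channel, target in zip((r, g, b), key)
            )
            # Alpha 0 is transparent, alpha 255 is fully opaque.
            new_row.append((r, g, b, 0 if is_background else 255))
        result.append(new_row)
    return result


# A tiny 2x2 "sprite": three pinkish background pixels and one red pixel.
sprite = [
    [(255, 0, 255), (200, 30, 30)],
    [(250, 10, 245), (255, 0, 255)],
]
cleaned = remove_background(sprite)
```

A real game would load the image with a graphics library and apply the same test per pixel; the tolerance is there because exported sprites rarely use one exact shade.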
53
00:02:48,110 --> 00:02:53,650
Ejaaz:
Before we do that, actually, as you set this game up, I ran it on Claude Opus
54
00:02:53,650 --> 00:02:57,050
Ejaaz:
4.6 as well, but with a slight twist.
55
00:02:57,050 --> 00:02:58,790
Josh:
Okay, let's see your output. What do we have?
56
00:02:58,790 --> 00:03:01,650
Ejaaz:
Okay, I don't know if you can see my screen,
57
00:03:01,650 --> 00:03:06,190
Ejaaz:
but it is the exact game that you just created. But I don't know if those characters
58
00:03:06,190 --> 00:03:10,490
Ejaaz:
look kind of familiar to you. We have the hero protagonist character, which
59
00:03:10,490 --> 00:03:15,750
Ejaaz:
is my beautiful face and my beautiful person, Ejaaz. And we have, who's
60
00:03:15,750 --> 00:03:18,930
Ejaaz:
this enemy over here that looks a lot like the bear guy
61
00:03:20,490 --> 00:03:26,930
Ejaaz:
And listen, we can double jump here, Josh. And I think, yep, I can crush you. But every time, I mean, this
62
00:03:27,440 --> 00:03:33,840
Ejaaz:
Kind of jokes aside, this is insane. This took me like around three minutes to build end-to-end.
63
00:03:34,020 --> 00:03:36,080
Ejaaz:
I used the exact same prompt that you gave me.
64
00:03:36,380 --> 00:03:39,700
Ejaaz:
And we didn't have sprites ready-made of ourselves, right?
65
00:03:39,800 --> 00:03:43,340
Ejaaz:
We didn't have like cartoon images of ourselves. So I uploaded an image that
66
00:03:43,340 --> 00:03:46,520
Ejaaz:
we had taken, I don't know, like six months ago and said, hey,
67
00:03:46,640 --> 00:03:48,540
Ejaaz:
can you make game avatars out of this?
68
00:03:48,680 --> 00:03:51,900
Ejaaz:
It did it in 20 seconds. And then I said, could you add these to the game and
69
00:03:51,900 --> 00:03:57,100
Ejaaz:
replace the enemy with Josh and the protagonist with Ejaaz? And it did it in a minute.
70
00:03:57,440 --> 00:04:00,180
Josh:
So here we go. That's pretty amazing. And these are really, these are just using
71
00:04:00,180 --> 00:04:03,300
Josh:
standard desktop applications. So what you're using right here,
72
00:04:03,460 --> 00:04:05,280
Josh:
this was done in Claude Code, right?
73
00:04:05,680 --> 00:04:09,900
Josh:
You just went onto Claude, the MacBook, the Mac app. You downloaded it. You put in the prompt.
74
00:04:10,180 --> 00:04:13,720
Josh:
You shared some assets. And now it built this amazing game in one single prompt.
75
00:04:13,840 --> 00:04:17,040
Josh:
And we're actually going to experiment further in this episode where we're going
76
00:04:17,040 --> 00:04:21,000
Josh:
to create a trading room that does actual real-time stock analysis.
77
00:04:21,400 --> 00:04:24,840
Josh:
So as I'm curating the prompts and as we're getting ready for that second demo,
78
00:04:24,980 --> 00:04:27,360
Josh:
maybe we could walk through what makes these models so exceptional.
79
00:04:27,440 --> 00:04:32,100
Ejaaz:
Yeah, well, you might actually notice the first difference on screen right now.
80
00:04:32,220 --> 00:04:36,120
Ejaaz:
If you notice, if you look closely, my avatar is kind of glitching out, right?
81
00:04:36,440 --> 00:04:39,680
Ejaaz:
And if you compare it to your Codex game that you just coded up,
82
00:04:39,840 --> 00:04:41,740
Ejaaz:
there's no glitches. It runs super smoothly.
83
00:04:41,980 --> 00:04:47,740
Ejaaz:
And the main takeaway here is Codex 5.3 is a superior coding model to Anthropic.
84
00:04:48,020 --> 00:04:51,360
Ejaaz:
And that's a sentence I never thought I would say, at least for the next couple
85
00:04:51,360 --> 00:04:55,780
Ejaaz:
of years, because Anthropic has held that prestige and title for so long.
86
00:04:55,900 --> 00:04:59,080
Ejaaz:
But since Code Red was initiated at OpenAI around three months ago,
87
00:04:59,500 --> 00:05:04,180
Ejaaz:
Sam has devoted pretty much all his resources towards building the best coding model.
88
00:05:04,320 --> 00:05:08,280
Ejaaz:
And the benchmarks don't lie. It is a full 12 points on the software engineering
89
00:05:08,280 --> 00:05:10,980
Ejaaz:
benchmark ahead of Claude Opus 4.6.
90
00:05:11,180 --> 00:05:12,460
Josh:
That's a pretty significant difference.
91
00:05:12,700 --> 00:05:17,260
Ejaaz:
So I've actually pulled up a more general comparison between the two models here.
92
00:05:17,400 --> 00:05:20,600
Ejaaz:
And it summarizes it really well. So if we look at Claude's model,
93
00:05:20,740 --> 00:05:22,560
Ejaaz:
Opus 4.6, what's good about it?
94
00:05:22,840 --> 00:05:26,120
Ejaaz:
Well, they've 5x'd the context window.
95
00:05:26,140 --> 00:05:30,660
Ejaaz:
So it's gone up to a million tokens or rather characters that you can put in
96
00:05:30,660 --> 00:05:33,580
Ejaaz:
a single prompt, which if you want to understand how powerful this is,
97
00:05:33,800 --> 00:05:36,540
Ejaaz:
you can just put way more information into your initial prompt.
98
00:05:36,660 --> 00:05:40,760
Ejaaz:
It has much better context and memory. So you can end up cooking up much better
99
00:05:40,760 --> 00:05:44,640
Ejaaz:
products overall, which is very, very impressive and important to have.
100
00:05:44,940 --> 00:05:49,640
Ejaaz:
Number two, I would think about this as an orchestration model.
101
00:05:49,780 --> 00:05:54,480
Ejaaz:
So if you look at specific benchmarks, it has beaten OpenAI at GDPval.
102
00:05:54,620 --> 00:05:58,640
Ejaaz:
GDPval is a benchmark where they go out and they test a model's performance
103
00:05:58,640 --> 00:06:04,400
Ejaaz:
at a really complex task versus a professional human that would normally do that task.
104
00:06:04,660 --> 00:06:08,120
Ejaaz:
And the decision is, would you use the AI model or would you use the human?
105
00:06:08,300 --> 00:06:12,620
Ejaaz:
And in this case, you would choose Claude 4.6 over humans way more than you
106
00:06:12,620 --> 00:06:16,260
Ejaaz:
would choose OpenAI's latest model. So that's a really important thing.
107
00:06:16,400 --> 00:06:21,300
Ejaaz:
And the point around Claude's latest model is that it doesn't code as well as
108
00:06:21,300 --> 00:06:26,960
Ejaaz:
Codex, but it can orchestrate a bunch of agents and overall activity better than OpenAI.
109
00:06:27,200 --> 00:06:30,860
Ejaaz:
Now, if you look at Codex and OpenAI's new models specifically,
110
00:06:31,390 --> 00:06:36,050
Ejaaz:
It wins on the software engineering. It is simply a better software engineer
111
00:06:36,050 --> 00:06:40,410
Ejaaz:
than Claude is, which is a massive flip around and shows that it's a testament
112
00:06:40,410 --> 00:06:44,490
Ejaaz:
to how much resources and fine-tuning that OpenAI has been able to achieve.
113
00:06:44,750 --> 00:06:49,690
Josh:
And to the note on the quality of the models here, my prompt is done in Claude
114
00:06:49,690 --> 00:06:52,530
Josh:
Code that I used, the same one that we used in Codex. And I'm going to run it
115
00:06:52,530 --> 00:06:53,470
Josh:
here for the first time now.
116
00:06:53,610 --> 00:06:56,670
Josh:
You can see on screen and we'll see what it looks like.
117
00:06:56,810 --> 00:06:59,550
Josh:
So underneath, we have our Codex version, which looks beautiful.
118
00:06:59,550 --> 00:07:03,870
Josh:
On top, we have our brand new version that was just made by Opus. Now, I haven't
119
00:07:03,870 --> 00:07:06,170
Josh:
tried this yet, so we're going to see what happens when I press space to start.
120
00:07:09,590 --> 00:07:12,310
Josh:
So it looks like Opus has failed to create a
121
00:07:12,310 --> 00:07:18,310
Josh:
floor, so I am just falling through the floor until the game ends. Okay, so
122
00:07:18,310 --> 00:07:21,510
Josh:
just based on this one demo alone, this is a fairly significant difference, where
123
00:07:21,510 --> 00:07:25,930
Josh:
GPT's Codex has created a beautiful side scroller. It doesn't have gravity, but
124
00:07:25,930 --> 00:07:28,950
Josh:
I could just ask it to... or, it has gravity, it's a little too much. I could ask
125
00:07:28,950 --> 00:07:30,930
Josh:
it to lower it. Opus doesn't even work at all.
126
00:07:31,550 --> 00:07:34,650
Josh:
And again, the test was just a one-shot prompt. So I'm going to get back to
127
00:07:34,650 --> 00:07:38,410
Josh:
work prompting it again to build this new application, the trading application.
128
00:07:38,610 --> 00:07:42,150
Josh:
We'll follow up with that. But I think that's a funny kind of demo just to showcase
129
00:07:42,150 --> 00:07:46,450
Josh:
that one actually is kind of superior to the other in this one use case, at least.
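
The two bugs described in this demo, gravity that feels too strong and a character that falls through the floor, both live in a few lines of a platformer's update loop. The following is a rough sketch of that loop, not the code either model generated. All names and numbers here (`GRAVITY`, `FLOOR_Y`, `step`, the 32-pixel sprite height) are hypothetical.

```python
GRAVITY = 1800.0  # pixels per second squared (hypothetical; lower = floatier jumps)
FLOOR_Y = 400.0   # y coordinate of the top of the floor (y grows downward)


def step(y, vy, dt, height=32.0, floor_y=FLOOR_Y, gravity=GRAVITY):
    """Advance a falling character by one frame and resolve floor contact."""
    # Gravity accelerates the character downward every frame.
    vy += gravity * dt
    y += vy * dt

    on_ground = False
    # Without this check the character keeps falling forever,
    # which matches the "no floor" behaviour described above.
    if y + height >= floor_y:
        y = floor_y - height  # snap the feet back onto the floor
        vy = 0.0
        on_ground = True
    return y, vy, on_ground


# Drop the character from y = 100 and simulate two seconds at 60 frames per second.
y, vy, grounded = 100.0, 0.0, False
for _ in range(120):
    y, vy, grounded = step(y, vy, 1 / 60)
```

Lowering `GRAVITY` is the kind of one-line change a follow-up prompt such as "make the gravity weaker" would amount to, while a missing floor check means the game cannot be played at all.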
130
00:07:46,950 --> 00:07:53,130
Ejaaz:
Yeah, I mean, you said it pretty clearly, which is Codex is the best coding AI model.
131
00:07:53,290 --> 00:07:57,130
Ejaaz:
And I have to like, I can't emphasize that enough because OpenAI for a long
132
00:07:57,130 --> 00:08:01,690
Ejaaz:
time was behind Anthropic and by a massive margin. And in some way,
133
00:08:01,770 --> 00:08:03,430
Ejaaz:
shape, or form, they've been able to catch up.
134
00:08:03,690 --> 00:08:09,970
Ejaaz:
Now, what's interesting here is both companies have focused on each other's goals.
135
00:08:10,310 --> 00:08:15,350
Ejaaz:
So when Anthropic was typically meant to be the leading frontier model in coding,
136
00:08:15,890 --> 00:08:18,730
Ejaaz:
it now has decided to focus on what OpenAI was really good at,
137
00:08:18,870 --> 00:08:23,390
Ejaaz:
which is overall orchestration and being a better generalized model, right?
138
00:08:23,610 --> 00:08:25,610
Josh:
They're taking each other's lunch. Yeah, exactly.
139
00:08:25,790 --> 00:08:27,530
Ejaaz:
OpenAI has decided to eat Anthropic's
140
00:08:27,530 --> 00:08:30,490
Ejaaz:
lunch and say, okay, we've got the generalized stuff sorted out.
141
00:08:30,690 --> 00:08:34,490
Ejaaz:
Let's try and figure out the coding specific niche, highly defined,
142
00:08:34,630 --> 00:08:37,610
Ejaaz:
professionalized functions. And it's produced the best coding model.
143
00:08:37,770 --> 00:08:41,650
Ejaaz:
So it's kind of a weird win-win for both labs.
144
00:08:41,710 --> 00:08:46,090
Ejaaz:
And what's awesome about this is they both now have really well-rounded,
145
00:08:46,310 --> 00:08:48,630
Ejaaz:
but also very specialized models.
146
00:08:48,830 --> 00:08:53,470
Ejaaz:
And the reason why this is important is, and this is like kind of maybe my hot take,
147
00:08:54,140 --> 00:08:59,120
Ejaaz:
I don't think the coding models matter, Josh. I actually don't think the generalized models matter either.
148
00:08:59,400 --> 00:09:03,200
Ejaaz:
I think they're both going off to something much bigger, which is creating the
149
00:09:03,200 --> 00:09:06,060
Ejaaz:
operating system for the future of work.
150
00:09:06,200 --> 00:09:10,080
Ejaaz:
They know that AI models and AI agents are gonna automate a ton of different
151
00:09:10,080 --> 00:09:14,120
Ejaaz:
industries and the industries are only gonna pick you if you can do both generalized
152
00:09:14,120 --> 00:09:17,020
Ejaaz:
work and hyper-specific work really well.
153
00:09:17,160 --> 00:09:20,340
Ejaaz:
That is coding and orchestration and managing your data.
154
00:09:20,680 --> 00:09:24,020
Ejaaz:
And now we have two amazing models dropped within 20 minutes of each other.
155
00:09:24,140 --> 00:09:28,560
Ejaaz:
That does exactly that to the highest performance metric that we've ever seen before.
156
00:09:29,180 --> 00:09:32,380
Josh:
They're pretty exceptional. So now for this next demo, I have it queued up here.
157
00:09:32,720 --> 00:09:36,440
Josh:
What we're going to do is, what I did is ask the model itself to build me a
158
00:09:36,440 --> 00:09:40,440
Josh:
prompt for this. So I wanted it to create me an AI stock portfolio war room.
159
00:09:40,620 --> 00:09:44,700
Josh:
And I asked, hey, I want to create this, create me a fully fleshed out prompt
160
00:09:44,700 --> 00:09:48,240
Josh:
that kind of should solve this problem with one shot.
161
00:09:48,320 --> 00:09:52,200
Josh:
So what I did is I loaded it up here in our Claude Code app.
162
00:09:52,240 --> 00:09:55,020
Josh:
And then I also loaded it up into the Codex app. I created its own
163
00:09:55,020 --> 00:09:57,660
Josh:
project folder, and now I'm going to hit send. So both of
164
00:09:57,660 --> 00:10:00,520
Josh:
these things are thinking in real time. We will check back
165
00:10:00,520 --> 00:10:03,340
Josh:
in once their outputs are done, and we'll compare again the second version,
166
00:10:03,340 --> 00:10:06,460
Josh:
which is more of a robust one. I mean, you'll see, on
167
00:10:06,460 --> 00:10:10,220
Josh:
the Claude screen, it has this whole list of to-dos that it wants to do. It has
168
00:10:10,220 --> 00:10:14,260
Josh:
an entire plan. There's nine different panels that it's going to build. It's going
169
00:10:14,260 --> 00:10:18,200
Josh:
to do risk analysis matrix and portfolio action bars and all this stuff. So we'll
170
00:10:18,200 --> 00:10:21,260
Josh:
let that cook, and let's get back to what separates these, what people have been
171
00:10:21,260 --> 00:10:24,420
Josh:
freaking out about on the internet more as these things get going.
172
00:10:24,420 --> 00:10:27,320
Ejaaz:
Could I take three minutes to show you some wild demos?
173
00:10:27,320 --> 00:10:30,220
Josh:
Yeah, let's see what the internet's been demoing while we wait for ours to cook.
174
00:10:30,220 --> 00:10:35,960
Ejaaz:
Okay, cool. Like, listen, our 2D Mario-inspired game was cool, but imagine if I told you
175
00:10:35,960 --> 00:10:41,300
Ejaaz:
you could recreate the entire Pokemon game, including levels, cities, characters,
176
00:10:41,300 --> 00:10:45,740
Ejaaz:
and creatures that you fight, from scratch, in about an hour and 30 minutes.
177
00:10:46,380 --> 00:10:47,940
Ejaaz:
That's pretty impressive. That's what we're looking at right now.
178
00:10:47,940 --> 00:10:49,540
Josh:
Wow, it even has the fighting.
179
00:10:49,980 --> 00:10:53,320
Ejaaz:
Yeah, yeah, yeah. And buttons and the multimodal gameplay.
180
00:10:53,620 --> 00:10:56,840
Ejaaz:
And obviously this looks like it's been made by a child image wise,
181
00:10:57,040 --> 00:10:59,860
Ejaaz:
but it's probably going to take you, what, another couple of hours to make a
182
00:10:59,860 --> 00:11:03,400
Ejaaz:
really high fidelity game that you could probably run on your Nintendo Switch or whatever.
183
00:11:03,820 --> 00:11:06,760
Ejaaz:
It is just so impressive that we can do these things.
184
00:11:07,160 --> 00:11:10,400
Ejaaz:
Anyone can do these things with no previous background. Just upload a few images
185
00:11:10,400 --> 00:11:14,580
Ejaaz:
or generate a few images and you can create childhood nostalgic games that are
186
00:11:14,580 --> 00:11:17,560
Ejaaz:
worth billions of dollars, which is just super cool to see.
187
00:11:17,740 --> 00:11:21,620
Josh:
Yeah, one of the cool things that I think it's really important to note is how approachable this is.
188
00:11:21,720 --> 00:11:25,180
Josh:
Like for the recent example that we're having run right now on my screen,
189
00:11:25,640 --> 00:11:29,420
Josh:
all I did was tell it what I wanted and ask it to develop the prompt with me.
190
00:11:29,540 --> 00:11:32,320
Josh:
So even if it feels overwhelming, like you don't really know how to code,
191
00:11:32,380 --> 00:11:35,720
Josh:
you don't know how to prompt things, you can actually just ask the model to
192
00:11:35,720 --> 00:11:38,480
Josh:
help you generate the prompt, help explain to you how it works.
193
00:11:38,600 --> 00:11:41,720
Josh:
And it's a really easy way to build basically anything you can imagine.
194
00:11:41,920 --> 00:11:45,040
Josh:
It's not just games. It's productivity tools. It's CRM tracking.
195
00:11:45,040 --> 00:11:48,060
Josh:
It's whatever you want it to be. So I think that's really interesting, but it
196
00:11:48,060 --> 00:11:52,260
Josh:
also goes much more technical, right? I saw another crazy example with the compiler.
197
00:11:52,260 --> 00:11:55,260
Ejaaz:
Okay, so for the tech nerds
198
00:11:55,260 --> 00:11:57,880
Ejaaz:
out there that have spent a lot of time coding, you are going to
199
00:11:57,880 --> 00:12:04,700
Ejaaz:
be wowed by this. For one of their flagship demos for Opus 4.6, the Anthropic
200
00:12:04,700 --> 00:12:11,600
Ejaaz:
team decided to task the model with building a C compiler, which is an incredibly
201
00:12:11,600 --> 00:12:16,780
Ejaaz:
complicated execution tool that is required to code up some of the craziest types of apps.
202
00:12:17,300 --> 00:12:20,980
Ejaaz:
And they just walked away. And they just kind of like looked at it,
203
00:12:21,320 --> 00:12:23,460
Ejaaz:
monitored it, made sure that it wasn't going awry.
204
00:12:23,780 --> 00:12:26,460
Ejaaz:
And in two weeks, let me emphasize that,
205
00:12:26,930 --> 00:12:31,970
Ejaaz:
Two whole weeks, 14 days, it coded nonstop and built this compiler.
206
00:12:32,370 --> 00:12:36,330
Ejaaz:
Now, you might think two weeks is quite a long time. I want my thing done in an hour and a half.
207
00:12:36,490 --> 00:12:40,630
Ejaaz:
Well, let me hearken back to history where previously, if you wanted to create
208
00:12:40,630 --> 00:12:44,550
Ejaaz:
something like this, in today's world, it would take a team of around 50 or
209
00:12:44,550 --> 00:12:49,230
Ejaaz:
so humans, and it would take them a few months to build from scratch. That's today.
210
00:12:49,510 --> 00:12:54,730
Ejaaz:
But back in the day, it would technically have taken them around a decade to
211
00:12:54,730 --> 00:12:56,790
Ejaaz:
build and like thousands of people.
212
00:12:56,930 --> 00:13:01,770
Ejaaz:
So we have just kind of condensed the timeline to create really complicated
213
00:13:01,770 --> 00:13:05,490
Ejaaz:
tools in a matter of hours or weeks in this case.
214
00:13:05,810 --> 00:13:09,610
Ejaaz:
Now, the second thing I want to point out is the fact that these models can
215
00:13:09,610 --> 00:13:13,250
Ejaaz:
go untouched for two weeks is just insane.
216
00:13:13,590 --> 00:13:17,330
Ejaaz:
There was another stat that was released today by OpenAI with,
217
00:13:17,410 --> 00:13:21,650
Ejaaz:
sorry, yesterday with OpenAI's 5.2, I think, 5.2 high, I believe,
218
00:13:21,850 --> 00:13:27,970
Ejaaz:
where it can go pretty much 50% hit rate for a 6.6-hour time horizon.
219
00:13:28,230 --> 00:13:31,250
Ejaaz:
So that means if you gave it any kind of complicated coding task,
220
00:13:31,610 --> 00:13:35,930
Ejaaz:
50% of the time in 6.6 hours, it would get that done, completely done.
221
00:13:36,010 --> 00:13:39,750
Ejaaz:
And it would nail it 50% of the time, which is just such an impressive track
222
00:13:39,750 --> 00:13:41,290
Ejaaz:
record when you look back a year.
223
00:13:41,410 --> 00:13:45,210
Ejaaz:
And that time was, what was it like 30 minutes, maybe an hour.
224
00:13:45,570 --> 00:13:49,150
Ejaaz:
So every iteration, we see this thing double. It's just so insane.
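
To put the "we see this thing double" remark into numbers, here is a quick back-of-the-envelope check. It takes the figures quoted in the conversation at face value (a 6.6-hour horizon now, versus "30 minutes, maybe an hour" a year earlier); those are recalled on air, not verified measurements, and the helper name `doublings` is just for illustration.

```python
import math


def doublings(start_hours, end_hours):
    """How many times start_hours must double to reach end_hours."""
    return math.log2(end_hours / start_hours)


current_horizon = 6.6  # hours, as quoted in the episode

# The two starting points mentioned on air: 30 minutes and one hour.
for earlier_horizon in (0.5, 1.0):
    n = doublings(earlier_horizon, current_horizon)
    print(f"{earlier_horizon} h -> {current_horizon} h: about {n:.1f} doublings")
```

Under those assumptions the horizon doubled somewhere between roughly three and four times over the year, depending on which starting figure you believe.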
225
00:13:49,630 --> 00:13:52,810
Josh:
Yeah, it's really, it's unbelievable and almost like intimidating how
226
00:13:52,810 --> 00:13:55,770
Josh:
capable and competent it is, even for someone who
227
00:13:55,770 --> 00:13:58,730
Josh:
is a novice at writing code. It's not about writing
228
00:13:58,730 --> 00:14:03,650
Josh:
code. It's about being able to generate whatever you want it to. So, like, if you
229
00:14:03,650 --> 00:14:07,050
Josh:
think of it, in a way it abstracts the code away and allows you to
230
00:14:07,050 --> 00:14:11,110
Josh:
just speak the English language and get what you want from speaking English,
231
00:14:11,110 --> 00:14:14,070
Josh:
and in a way that you understand. And it will help walk you through the way. One
232
00:14:14,070 --> 00:14:16,950
Josh:
of the things that I love about Claude Code in particular is the plan mode.
233
00:14:17,690 --> 00:14:20,830
Josh:
If you leave a lot of things out of your prompt, it'll actually just continue
234
00:14:20,830 --> 00:14:23,530
Josh:
to prompt you with additional questions to understand what you want.
235
00:14:23,610 --> 00:14:29,330
Josh:
And one of the most fascinating things that I read about GPT's 5.3 Codex in
236
00:14:29,330 --> 00:14:33,070
Josh:
particular is, like you mentioned in the intro, it helped build itself.
237
00:14:33,470 --> 00:14:37,230
Josh:
And I don't think that can be overstated because this is the first model in
238
00:14:37,230 --> 00:14:42,050
Josh:
the history of OpenAI that has helped with the building and construction of itself.
239
00:14:42,490 --> 00:14:47,150
Josh:
And what happens as that starts to ramp up, right? If you think of each model
240
00:14:47,150 --> 00:14:49,730
Josh:
iteration as a flywheel, what is the constraint?
241
00:14:49,950 --> 00:14:54,250
Josh:
The two constraints are the speed at which a developer can actually build it
242
00:14:54,250 --> 00:14:57,090
Josh:
and then create the tests for it and make sure that it's safe and ready to deploy.
243
00:14:57,430 --> 00:15:00,430
Josh:
And then it's the hardware that's required to actually train the model.
244
00:15:00,570 --> 00:15:04,550
Josh:
What we're seeing with Codex and Opus, which I really believe was kind of Sonnet,
245
00:15:04,850 --> 00:15:06,170
Josh:
is the incremental improvements.
246
00:15:06,430 --> 00:15:09,410
Josh:
Now, for the incremental improvements that don't require an entirely new training
247
00:15:09,410 --> 00:15:13,350
Josh:
run, the real constraint is the actual software and what you could squeeze out of it.
248
00:15:13,350 --> 00:15:16,290
Josh:
And when you have a model that's helping you build this
249
00:15:16,290 --> 00:15:19,390
Josh:
software that can think for 6, 12, 24 hours
250
00:15:19,390 --> 00:15:22,230
Josh:
at a time, even longer, it kind
251
00:15:22,230 --> 00:15:25,310
Josh:
of creates this, like, self-fulfilling loop, right, where the models use the
252
00:15:25,310 --> 00:15:28,190
Josh:
new models to make the new models, the future models,
253
00:15:28,190 --> 00:15:31,070
Josh:
stronger and more powerful and better. And I thought that was a really interesting
254
00:15:31,070 --> 00:15:35,330
Josh:
thing to note, that this is the first self-propagating model, where it ran
255
00:15:35,330 --> 00:15:38,910
Josh:
a lot of the tests for itself, it introduced new code that made itself better.
256
00:15:38,910 --> 00:15:43,310
Josh:
And as we continue to see that, you can start to imagine that vertical, that, like,
257
00:15:43,310 --> 00:15:46,690
Josh:
exponential progress line going pretty close to vertical, and things getting
258
00:15:46,690 --> 00:15:48,550
Josh:
really good, like, really, really quick.
259
00:15:48,970 --> 00:15:53,390
Ejaaz:
I think what most people listening to this might think is that,
260
00:15:53,930 --> 00:15:55,970
Ejaaz:
well, what was different before?
261
00:15:56,370 --> 00:16:00,450
Ejaaz:
Well, previously, models would just kind of work in a very analog mode.
262
00:16:00,630 --> 00:16:02,390
Ejaaz:
You would just point it at a problem
263
00:16:02,390 --> 00:16:05,550
Ejaaz:
and it would just understand what the problem was and then solve it.
264
00:16:05,630 --> 00:16:10,530
Ejaaz:
But it lacked that awareness and wider context as to like what the wider vision
265
00:16:10,530 --> 00:16:13,630
Ejaaz:
and goal was to achieve and then figuring out stuff for itself.
266
00:16:13,630 --> 00:16:17,890
Ejaaz:
You always had to kind of handhold it. But now with its ability to kind of like
267
00:16:17,890 --> 00:16:21,130
Ejaaz:
understand what it's trying to do and look internally and say,
268
00:16:21,270 --> 00:16:24,110
Ejaaz:
huh, I made that mistake because of this error in my code.
269
00:16:24,250 --> 00:16:26,930
Ejaaz:
I'm going to now like rewrite my code and then I'll be better at it.
270
00:16:27,030 --> 00:16:31,590
Ejaaz:
It kind of functions similarly to a human. Now, I actually saw a great analogy.
271
00:16:32,110 --> 00:16:34,230
Ejaaz:
I forgot who wrote it, but it's
272
00:16:34,230 --> 00:16:38,790
Ejaaz:
fantastic. where if you imagine yourself standing on a sidewalk, right?
273
00:16:39,110 --> 00:16:45,350
Ejaaz:
And a Bugatti Veyron drives super fast by you at let's say 200 miles an hour,
274
00:16:45,510 --> 00:16:47,450
Ejaaz:
you'll be like, wow, that's kind of fast.
275
00:16:47,830 --> 00:16:52,850
Ejaaz:
And then two minutes later, another Bugatti drives by you at 300 miles an hour.
276
00:16:53,090 --> 00:16:56,630
Ejaaz:
You'll be like, wow, that's kind of fast. But you wouldn't really notice the
277
00:16:56,630 --> 00:16:59,890
Ejaaz:
difference between that 100 mile an hour difference, right?
278
00:17:00,050 --> 00:17:04,070
Ejaaz:
But if you were in the car strapped in, you would notice it is significantly
279
00:17:04,070 --> 00:17:06,890
Ejaaz:
improved. And that's how software engineers feel right now.
280
00:17:07,090 --> 00:17:10,190
Ejaaz:
Now, if you're someone that doesn't code all the time, you're not necessarily
281
00:17:10,190 --> 00:17:13,050
Ejaaz:
going to understand these impacts, but it's really important for those of you
282
00:17:13,050 --> 00:17:17,770
Ejaaz:
listening to this to figure out that this is massively impactful and will change
283
00:17:17,770 --> 00:17:19,890
Ejaaz:
the way that a lot of things are happening today.
284
00:17:19,990 --> 00:17:24,270
Ejaaz:
I mean, just take a look at this, right? This is a direct quote from someone
285
00:17:24,270 --> 00:17:27,010
Ejaaz:
who is building at a major tech company, Rakuten.
286
00:17:27,290 --> 00:17:32,950
Ejaaz:
And the quote here says, Claude Opus 4.6 autonomously closed 13 issues and assigned
287
00:17:32,950 --> 00:17:38,490
Ejaaz:
12 issues to the right team members in a single day, managing a 50-person organization
288
00:17:38,490 --> 00:17:40,510
Ejaaz:
across six repositories.
289
00:17:40,730 --> 00:17:43,870
Ejaaz:
Josh, do you know who else is responsible for doing that?
290
00:17:44,050 --> 00:17:48,550
Ejaaz:
An entire team of product managers that each get paid a quarter of a million
291
00:17:48,550 --> 00:17:50,110
Ejaaz:
dollars in compensation automatically.
292
00:17:50,330 --> 00:17:52,890
Josh:
Minimum, per year, at least.
293
00:17:52,890 --> 00:17:53,830
Ejaaz:
Yeah, their jobs are automated now.
294
00:17:53,830 --> 00:17:56,870
Josh:
Well, one of the earlier moments in
295
00:17:56,870 --> 00:17:59,870
Josh:
which I realized this was pretty profound is when Claude Cowork, they
296
00:17:59,870 --> 00:18:03,890
Josh:
said they built it with, what, just a handful, like four people, over the course of
297
00:18:03,890 --> 00:18:09,450
Josh:
10 days, and it was 100% built by the current model of Claude, which is Opus 4.5
298
00:18:09,450 --> 00:18:14,510
Josh:
at the time like the the amount of leverage from these tools is so high but
299
00:18:14,510 --> 00:18:19,570
Josh:
it cuts both ways it's like if you can design and develop a product in 10 days,
300
00:18:19,810 --> 00:18:23,430
Josh:
then that means another company can probably do that in five.
301
00:18:23,690 --> 00:18:28,770
Josh:
And it starts to lower the competitive threshold for these companies to catch up.
302
00:18:28,890 --> 00:18:32,090
Josh:
And it starts to raise the bar of what is possible.
303
00:18:32,210 --> 00:18:35,790
Josh:
Like if you could build something that profound in 10 days, what can you build
304
00:18:35,790 --> 00:18:37,210
Josh:
over the course of six months?
305
00:18:37,350 --> 00:18:42,050
Josh:
Like, can you really build something fantastic that has a moat that like actually
306
00:18:42,050 --> 00:18:46,350
Josh:
delivers on the total power that you have by leveraging this AI?
307
00:18:46,510 --> 00:18:49,810
Josh:
It's going to be interesting to see because, I mean, what we're finding, even with
308
00:18:49,810 --> 00:18:53,370
Josh:
the Codex and Opus dual launch, is that these companies are right next to
309
00:18:53,370 --> 00:18:55,150
Josh:
each other and if one publishes something,
310
00:18:55,870 --> 00:18:59,210
Josh:
profound or something that attracts a lot of users they're just a few days and
311
00:18:59,210 --> 00:19:03,330
Josh:
a few prompts away from copying it and that's like a pretty difficult thing
312
00:19:03,330 --> 00:19:06,310
Josh:
to compete against on the software front. Well.
313
00:19:06,310 --> 00:19:10,070
Ejaaz:
That's why if we look at the stock market over the last couple of days like
314
00:19:10,070 --> 00:19:13,810
Ejaaz:
it's down trillions of dollars, and I'm not exaggerating. If you look at Microsoft
315
00:19:13,810 --> 00:19:19,910
Ejaaz:
over the last two weeks, the stock is down 20%. It's trading like a meme stock, which is just insane.
316
00:19:20,530 --> 00:19:26,190
Ejaaz:
And the reason why that is, is a lot of investors are anticipating that these models,
317
00:19:27,290 --> 00:19:34,210
Ejaaz:
Specifically Opus 4.6 and Codex 5.3, will just create the tools that these billions
318
00:19:34,210 --> 00:19:38,790
Ejaaz:
of dollars worth of SaaS companies have spent or valued their entire lives on
319
00:19:38,790 --> 00:19:40,610
Ejaaz:
in a couple of seconds, just as you described.
320
00:19:40,810 --> 00:19:45,410
Ejaaz:
Now, the counterargument to this, Josh, is, and Jensen Huang actually kind
321
00:19:45,410 --> 00:19:48,450
Ejaaz:
of went live at a conference and spoke about this and made this point,
322
00:19:49,310 --> 00:19:54,110
Ejaaz:
If you're an AI agent or AI model that is capable of building these tools, right?
323
00:19:54,310 --> 00:19:59,710
Ejaaz:
Why would you rebuild the tool every single time you do a function?
324
00:20:00,090 --> 00:20:03,330
Ejaaz:
Surely you would just access the best tool and use it.
325
00:20:03,590 --> 00:20:07,690
Ejaaz:
So there's a bit more nuance where AI models aren't just gonna recreate your
326
00:20:07,690 --> 00:20:10,430
Ejaaz:
entire software stack if you are at a Fortune 500 company.
327
00:20:10,570 --> 00:20:14,290
Ejaaz:
That kind of doesn't make any sense. There are a bunch of tools that are hyper-optimized to do that.
328
00:20:14,330 --> 00:20:20,570
Ejaaz:
But what it will do is it will connect all of these tools and silos in a much more effective way.
329
00:20:20,810 --> 00:20:22,990
Ejaaz:
And maybe that requires rebuilding parts of it.
330
00:20:23,250 --> 00:20:26,610
Ejaaz:
Maybe it requires kind of connecting different ways, but not rebuilding the entire tools.
331
00:20:26,870 --> 00:20:31,610
Ejaaz:
And whatever operating system that ends up becoming will be the most sticky
332
00:20:31,610 --> 00:20:32,970
Ejaaz:
and valuable company ever.
333
00:20:33,090 --> 00:20:36,670
Ejaaz:
Now, that could be Salesforce, or it could be someone completely different,
334
00:20:36,770 --> 00:20:38,990
Ejaaz:
a startup that we haven't even heard of. And I think that's really important
335
00:20:38,990 --> 00:20:40,930
Ejaaz:
to understand, but people are experimenting.
336
00:20:41,170 --> 00:20:44,450
Ejaaz:
And if you look at this graph right here, which may not look insane to some,
337
00:20:44,530 --> 00:20:50,230
Ejaaz:
but is insane to me at least, 4% of daily GitHub commits are now Claude Code.
338
00:20:50,590 --> 00:20:56,870
Ejaaz:
That was, I think, 5% of what it is today two months ago.
339
00:20:57,070 --> 00:21:02,010
Ejaaz:
So the ascent has just been insane. These companies are adopting it and they are using it.
340
00:21:02,510 --> 00:21:05,570
Josh:
Yeah, the number is just going to keep going up and there's no reason why it
341
00:21:05,570 --> 00:21:08,430
Josh:
wouldn't. It's such a testament to the speed.
342
00:21:08,710 --> 00:21:11,090
Josh:
It feels like we're strapped in that car and now we're flying.
343
00:21:11,450 --> 00:21:14,190
Josh:
To an outsider, it might not look like it. It certainly feels like that
344
00:21:14,190 --> 00:21:17,790
Josh:
on the inside, and I think a lot of people are starting to notice this and get
345
00:21:17,790 --> 00:21:20,690
Josh:
a little nervous about it too like look at this example on the screen right
346
00:21:20,690 --> 00:21:27,310
Josh:
now. This is a prompt from GPT 5.3 Codex, which basically created an entire Minecraft
347
00:21:27,310 --> 00:21:32,830
Josh:
clone in a single prompt and it looks awesome and it works really fast and it
348
00:21:32,830 --> 00:21:33,870
Josh:
was super lightweight.
349
00:21:34,120 --> 00:21:38,260
Josh:
And it says, I also tried on Opus 4.6, but for some reason it got stuck.
350
00:21:38,740 --> 00:21:42,180
Josh:
But you can build anything that you want very, very quickly,
351
00:21:42,640 --> 00:21:44,040
Josh:
like very cheaply as well.
352
00:21:44,520 --> 00:21:48,720
Josh:
What Opus 5.3, or Opus 5.3, I'm getting them all mixed up.
353
00:21:49,000 --> 00:21:55,260
Josh:
What GPT 5.3 Codex offered is double the rates, double the token rates, for
354
00:21:55,260 --> 00:21:56,080
Josh:
the next couple of months.
355
00:21:56,220 --> 00:21:59,960
Josh:
So you actually have the freedom for their $20 a month plan to go and build whatever you want.
356
00:22:00,100 --> 00:22:02,640
Ejaaz:
Can I maybe deliver a hot take, Josh?
357
00:22:02,900 --> 00:22:03,460
Josh:
Yeah, what do you got?
358
00:22:03,460 --> 00:22:08,000
Ejaaz:
I think the most exciting part about these model releases isn't the models themselves.
359
00:22:08,580 --> 00:22:11,840
Ejaaz:
Largely, I think the models are kind of similar in capabilities.
360
00:22:12,280 --> 00:22:16,680
Ejaaz:
They are around the same coding benchmarks, and they can roughly do the same
361
00:22:16,680 --> 00:22:19,120
Ejaaz:
things. They can spin up a bunch of agents and orchestrate themselves.
362
00:22:19,580 --> 00:22:24,300
Ejaaz:
The bigger picture, which I think a lot of people missed, was both companies,
363
00:22:24,380 --> 00:22:27,120
Ejaaz:
Anthropic and OpenAI, are at war with each other.
364
00:22:27,400 --> 00:22:31,360
Ejaaz:
And they're trying to basically build and own the operating system for work,
365
00:22:31,440 --> 00:22:34,140
Ejaaz:
which isn't just a model. it's a software suite.
366
00:22:34,340 --> 00:22:37,260
Ejaaz:
So this week alone, OpenAI didn't just release this new model.
367
00:22:37,460 --> 00:22:42,380
Ejaaz:
They released the Codex app, which is a desktop Mac app, which is kind of like
368
00:22:42,380 --> 00:22:45,360
Ejaaz:
a command line interface, which makes the coding experience way better.
369
00:22:45,500 --> 00:22:48,380
Ejaaz:
And they also launched an enterprise platform called Frontier,
370
00:22:48,640 --> 00:22:54,300
Ejaaz:
which allows Fortune 500 companies to basically take this magical model and
371
00:22:54,300 --> 00:22:57,800
Ejaaz:
give it to non-coders and let them do magical things. Now,
372
00:22:58,480 --> 00:23:02,660
Ejaaz:
All of these products together create a very sticky experience where it starts
373
00:23:02,660 --> 00:23:07,320
Ejaaz:
to make sense for software engineers and non-software engineers to use these products.
374
00:23:07,420 --> 00:23:10,900
Ejaaz:
And it becomes incredibly sticky, which results in billion-dollar contracts, right?
375
00:23:11,440 --> 00:23:14,320
Ejaaz:
Anthropic has done the same thing over the last two weeks.
376
00:23:14,420 --> 00:23:18,540
Ejaaz:
They released Claude Cowork, they released agent teams this week,
377
00:23:18,560 --> 00:23:19,900
Ejaaz:
and then they released this new model.
378
00:23:20,040 --> 00:23:23,120
Ejaaz:
They're going after the same thing, which it kind of makes sense why they're
379
00:23:23,120 --> 00:23:25,960
Ejaaz:
releasing Super Bowl ads that are kind of shitting on each other now.
380
00:23:26,360 --> 00:23:31,280
Ejaaz:
It makes a lot of sense. And so the point is, if they can own this operating
381
00:23:31,280 --> 00:23:35,500
Ejaaz:
system, this future of work, they will basically be the most valuable company.
382
00:23:35,580 --> 00:23:36,980
Ejaaz:
And I think it's going to be winner takes most.
383
00:23:37,160 --> 00:23:39,880
Josh:
I have to interrupt you here. We have some developments on our prompts that
384
00:23:39,880 --> 00:23:42,480
Josh:
we've been working on, our AI stock war room. Let's go. That I'm going to have
385
00:23:42,480 --> 00:23:43,680
Josh:
to share on the screen right now.
386
00:23:44,040 --> 00:23:48,180
Josh:
So currently what it's doing is it's asking to do some quality assurance testing.
387
00:23:48,380 --> 00:23:52,880
Josh:
So you'll see it's actually taking over control of my browser and
388
00:23:52,880 --> 00:23:56,600
Josh:
it's asking to make prompts on the screen. So you can see all of this that you're
389
00:23:56,600 --> 00:24:00,540
Josh:
seeing right here is generated live, and it's doing an actual real-time debug
390
00:24:00,540 --> 00:24:02,440
Josh:
of the product that it made.
391
00:24:02,600 --> 00:24:05,860
Josh:
It's clicking around, it's resizing things, it's going through the links,
392
00:24:05,880 --> 00:24:09,200
Josh:
and it's running real quality assurance testing on the actual product.
393
00:24:09,340 --> 00:24:11,360
Josh:
It's really amazing to see.
394
00:24:12,110 --> 00:24:15,130
Josh:
This was all just built, all these visual charts, and they're all accurate. So
395
00:24:15,130 --> 00:24:17,850
Josh:
right now we're looking at NVIDIA. We have a chart, and I'm not going to mess
396
00:24:17,850 --> 00:24:20,870
Josh:
with it because it's doing the real-time manipulation to do quality assurance
397
00:24:20,870 --> 00:24:23,610
Josh:
checks but it's actually clicking through it's making sure the
398
00:24:23,610 --> 00:24:28,510
Josh:
stats are accurate it's making sure all the widgets work and look it has this
399
00:24:28,510 --> 00:24:32,570
Josh:
amazing graphs already. It has sentiment analysis: 85 percent of people are bullish
400
00:24:32,570 --> 00:24:38,210
Josh:
on NVIDIA. It has recent signals from the news. It has a risk assessment
401
00:24:38,210 --> 00:24:41,870
Josh:
matrix where it shows the like export controls and chip controls.
402
00:24:41,990 --> 00:24:46,490
Josh:
It has revenue and earnings every single quarter, charted, competitive moats.
403
00:24:46,690 --> 00:24:49,830
Josh:
It has sector comparisons. It's like, this is unbelievable.
404
00:24:50,090 --> 00:24:53,230
Josh:
And it just generated this in a single prompt. And I just find it really funny
405
00:24:53,230 --> 00:24:55,330
Josh:
that we can actually watch this do it in real time.
406
00:24:55,450 --> 00:25:00,490
Josh:
So you'll see in this prompt, it's clicking through, it's taking screenshots of what it's seeing.
407
00:25:00,630 --> 00:25:04,430
Josh:
And then it's digesting, analyzing, and understanding what it made,
408
00:25:04,650 --> 00:25:07,150
Josh:
what it messed up and what it actually still has left to finish.
409
00:25:07,150 --> 00:25:11,570
Josh:
And it generated everything, all of this in real time as we're recording this episode.
410
00:25:12,990 --> 00:25:13,710
Josh:
So fascinating.
411
00:25:14,210 --> 00:25:19,070
Ejaaz:
Wow, it reminds me of some of the research platforms at the former companies
412
00:25:19,070 --> 00:25:22,710
Ejaaz:
that I used to work at and they would pay, I'm not joking, millions of dollars
413
00:25:22,710 --> 00:25:26,670
Ejaaz:
a year to get access to these types of platforms that would give them analysis
414
00:25:26,670 --> 00:25:28,450
Ejaaz:
like what you're showing on the screen right now.
415
00:25:28,750 --> 00:25:32,390
Josh:
And you just built it from scratch. From scratch, and look, it's doing this.
416
00:25:32,510 --> 00:25:35,750
Josh:
I'm not even touching my keyboard. I just searched for Apple and now I'm sure
417
00:25:35,750 --> 00:25:36,750
Josh:
if I go over to the prompt,
418
00:25:36,750 --> 00:25:39,710
Josh:
it's taking screenshots of Apple. It says, Apple dashboard
419
00:25:39,710 --> 00:25:42,990
Josh:
looking great, let me scroll to see the new three-column button row layout. And
420
00:25:42,990 --> 00:25:47,550
Josh:
it's checking the button rows and it's really unbelievable like we have the
421
00:25:47,550 --> 00:25:51,190
Josh:
investment thesis the bull case for it the bear case for it catalyst and timelines
422
00:25:51,190 --> 00:25:56,810
Josh:
it has WWDC built in, it has the iPhone 18 launch props, um, set up for September,
423
00:25:57,440 --> 00:26:01,500
Josh:
It's like so cool. It's absolutely unbelievable. And now this is a real tool
424
00:26:01,500 --> 00:26:03,880
Josh:
that I'll be able to use to type
425
00:26:03,880 --> 00:26:07,200
Josh:
in whatever stock I want to look at and actually get some analysis on it.
426
00:26:07,340 --> 00:26:12,580
Josh:
Now, I'll go over to Codex over here and it looks like Codex is taking its sweet time.
427
00:26:12,740 --> 00:26:16,480
Josh:
It's still zero out of six tasks completed. So it might take a little while
428
00:26:16,480 --> 00:26:19,780
Josh:
for us to get a visual on that, but it's just amazing to watch this happen in
429
00:26:19,780 --> 00:26:23,420
Josh:
real time as at least Claude Code and Opus 4.6,
430
00:26:24,020 --> 00:26:27,760
Josh:
does some quality assurance testing live by taking over my browser and running
431
00:26:27,760 --> 00:26:30,740
Josh:
it for itself. I just think this is like, this is amazing.
432
00:26:31,120 --> 00:26:37,160
Ejaaz:
It's magic. Something I just noticed in your Opus chatbot screen when it's going
433
00:26:37,160 --> 00:26:41,640
Ejaaz:
through its thinking, it seems to have like spun up a few different agents or
434
00:26:41,640 --> 00:26:44,240
Ejaaz:
instances of its own self to pull this off.
435
00:26:44,420 --> 00:26:47,380
Ejaaz:
Like I think if you scroll up, like I saw a few kind of like prompts that like
436
00:26:47,380 --> 00:26:49,120
Ejaaz:
suggested that that's what it was doing,
437
00:26:49,600 --> 00:26:53,880
Ejaaz:
which I think underscores a very important point that both of these models
438
00:26:53,880 --> 00:26:59,980
Ejaaz:
can do, which is they can spin up multiple versions of the same model and task
439
00:26:59,980 --> 00:27:02,020
Ejaaz:
it with different things to run in parallel.
440
00:27:02,440 --> 00:27:05,920
Ejaaz:
What this means is you can get a really complicated product like what you're
441
00:27:05,920 --> 00:27:11,380
Ejaaz:
seeing on the screen right now in a matter of minutes because it's running in parallel.
442
00:27:11,600 --> 00:27:15,440
Ejaaz:
So imagine having a bunch of computer science geniuses that you can just duplicate
443
00:27:15,440 --> 00:27:20,260
Ejaaz:
immediately and run at a fraction of the cost of electricity, the cost of inference.
444
00:27:20,640 --> 00:27:23,980
Ejaaz:
And now you start to see why all these NVIDIA chips and stuff are worth so much.
445
00:27:24,260 --> 00:27:26,320
Ejaaz:
Because you want to do cool stuff like this. This is insane.
446
00:27:26,440 --> 00:27:28,600
Josh:
It's actually incredible. Okay, so now I want to test it on Tesla.
447
00:27:29,240 --> 00:27:31,640
Josh:
So I'm going to choose Tesla and see if it actually can do it in.
448
00:27:31,640 --> 00:27:33,160
Ejaaz:
A non-controlled environment. This UI is so cool.
449
00:27:33,400 --> 00:27:37,000
Josh:
It's very pretty. What the hell? This looks great. Okay, so here we have Tesla.
450
00:27:37,240 --> 00:27:39,900
Josh:
It has the charts. We're going to click through the charts. It has the one-week
451
00:27:39,900 --> 00:27:43,840
Josh:
chart, the one-month chart, the three-month chart. That looks fairly accurate.
452
00:27:44,340 --> 00:27:48,160
Josh:
It has the price-to-earnings ratio, the 52-week high, 52-week low.
453
00:27:48,840 --> 00:27:52,460
Josh:
So it looks like at one point it was trading at $488, now it's trading at $389.
454
00:27:52,900 --> 00:27:57,400
Josh:
The bull case for Tesla, RoboTaxi and FSD driving licenses could unlock $500
455
00:27:57,400 --> 00:27:59,060
Josh:
billion in revenue by 2030.
456
00:27:59,820 --> 00:28:03,280
Josh:
It has the RoboTaxi service launch in Austin that it's preparing for.
457
00:28:03,580 --> 00:28:08,940
Josh:
And let's see the sector comparison. So it's comparing it to Rivian, Baidu, Toyota, Ford.
458
00:28:09,300 --> 00:28:13,180
Josh:
It has the competitive moat where it says it's most strong in brand power,
459
00:28:13,340 --> 00:28:14,920
Josh:
IP patents, and cost advantages.
460
00:28:15,300 --> 00:28:17,900
Josh:
You can see the revenue, the estimate per share earnings.
461
00:28:19,060 --> 00:28:23,620
Josh:
Sentiment is much worse on Tesla than it was on Apple. It's at 52% right now.
462
00:28:24,020 --> 00:28:29,960
Josh:
And it looks like, as it relates to the risk assessment, valuation and competition
463
00:28:29,960 --> 00:28:32,880
Josh:
and execution are all very high risk.
464
00:28:33,060 --> 00:28:36,800
Josh:
And that's probably an accurate assessment, although I'm not sure the competition
465
00:28:36,800 --> 00:28:39,500
Josh:
is really a problem. The execution is certainly going to be an issue.
466
00:28:39,620 --> 00:28:44,440
Josh:
But it's just amazing to see how well it does. And it even gives it a verdict.
467
00:28:44,640 --> 00:28:46,500
Josh:
So the AI verdict on Tesla is,
468
00:28:47,150 --> 00:28:51,190
Josh:
It's a hold. Tesla's optionality is enormous, but current valuation already
469
00:28:51,190 --> 00:28:52,890
Josh:
prices in multiple moonshots.
470
00:28:53,310 --> 00:28:57,010
Josh:
Execution on RoboTaxi will be the key catalyst. That sounds about right.
471
00:28:57,190 --> 00:29:02,910
Josh:
And it's amazing that we just built this with a single prompt without any oversight from me.
472
00:29:03,070 --> 00:29:07,970
Josh:
And it works. It actually works. It's really just unbelievable how capable these things are.
473
00:29:08,110 --> 00:29:10,690
Josh:
And now I have a dashboard that anytime I want to make a decision,
474
00:29:10,870 --> 00:29:16,010
Josh:
I can type in the ticker and get all this, um, optionality. It even has menus that
475
00:29:16,010 --> 00:29:21,130
Josh:
work. Look at this: profit margins, P/E ratios, market cap. Wow, pretty unbelievable. It's.
476
00:29:21,130 --> 00:29:26,690
Ejaaz:
It's a reactive, real-time Bloomberg terminal, oh wait, for the modern age.
477
00:29:26,690 --> 00:29:30,710
Josh:
There's um there's another feature here that looks like you could compare stocks
478
00:29:30,710 --> 00:29:35,890
Josh:
Let's see if this actually works here. So if I type in, let's say, Apple's ticker
479
00:29:35,890 --> 00:29:40,310
Josh:
and I hit go, will that compare the two? Now, it looks like that doesn't work very
480
00:29:40,310 --> 00:29:43,730
Josh:
well. Oh my god, but it has moving average lines and everything. This is pretty robust.
481
00:29:43,950 --> 00:29:47,050
Ejaaz:
I know, it's like the trader and investor's dream. Just crazy.
482
00:29:48,070 --> 00:29:50,070
Ejaaz:
Kind of like a side note on this, but like,
483
00:29:50,840 --> 00:29:53,860
Ejaaz:
The fact that Tesla's down and everyone's kind of like bearish on this company,
484
00:29:54,020 --> 00:29:56,880
Ejaaz:
even though they're like rumored to be merging and stuff like this.
485
00:29:57,200 --> 00:30:02,800
Ejaaz:
Like the point being is there's an asymmetry between what the market is seeing
486
00:30:02,800 --> 00:30:06,260
Ejaaz:
and what these inventors and builders are seeing.
487
00:30:06,600 --> 00:30:12,220
Ejaaz:
These AI labs have created what they define as pretty much a low form of AGI.
488
00:30:12,600 --> 00:30:16,320
Ejaaz:
You literally have an AI model that is building the next version of itself.
489
00:30:16,320 --> 00:30:21,580
Ejaaz:
That by description is like a super genius and it's only limited by the function
490
00:30:21,580 --> 00:30:24,160
Ejaaz:
of energy and compute, right?
491
00:30:24,580 --> 00:30:28,700
Ejaaz:
And then investors are looking at this and saying, huh, Amazon and Google are
492
00:30:28,700 --> 00:30:32,380
Ejaaz:
about to spend a combined $500 billion worth of CapEx this year.
493
00:30:32,920 --> 00:30:36,600
Ejaaz:
Kind of bearish, that's a lot of money. So there is a real investment opportunity
494
00:30:36,600 --> 00:30:39,780
Ejaaz:
here to really understand the difference of what these things can actually do.
495
00:30:40,100 --> 00:30:43,000
Ejaaz:
And that might lead to a lot of like opportunities to invest.
496
00:30:43,160 --> 00:30:46,780
Ejaaz:
I don't know, but I know that I'm buying Tesla today and a bunch of Google stock.
497
00:30:46,780 --> 00:30:50,620
Josh:
Yeah, I mean, look at this Google valuation. One, this chart looks absolutely gorgeous,
498
00:30:50,620 --> 00:30:54,900
Josh:
but two, um, the AI verdict is a buy. Even the AI thinks Google is a buy, because
499
00:30:54,900 --> 00:30:59,260
Josh:
they just have, um: Alphabet offers the best value in mega-cap tech, dominant AI
500
00:30:59,260 --> 00:31:03,340
Josh:
capabilities, diversified growth, and a cheap valuation if search moat holds. And.
501
00:31:03,340 --> 00:31:05,320
Ejaaz:
Yeah, give me the week. Give me the week.
502
00:31:05,320 --> 00:31:08,320
Josh:
Let's see the weekly chart here do you want some moving average lines as well
503
00:31:08,320 --> 00:31:10,300
Josh:
because we could drop those in please let's.
504
00:31:10,300 --> 00:31:15,500
Ejaaz:
See, let's see. I'm actually super... yeah, look, see, it's had a slight dip. Markets are so reactive. Crazy.
505
00:31:15,940 --> 00:31:22,640
Josh:
Yeah, and I think to the point of the CapEx, markets are viewing that as a scary, high-risk statement.
506
00:31:22,880 --> 00:31:27,480
Josh:
But while that's true, I also think it's a testament to the fact that scaling
507
00:31:27,480 --> 00:31:31,520
Josh:
laws are going to work, and the largest companies in the world are betting on
508
00:31:31,520 --> 00:31:32,860
Josh:
the continuation of them working.
509
00:31:33,140 --> 00:31:37,860
Josh:
And the shared consensus between all of these large-cap companies deciding to
510
00:31:37,860 --> 00:31:40,080
Josh:
spend record CapEx this year,
511
00:31:40,700 --> 00:31:43,980
Josh:
is a testament to the fact that things are only going to go faster.
512
00:31:44,300 --> 00:31:47,160
Josh:
And they believe that the more money they put in, the more outputs they will get.
513
00:31:47,300 --> 00:31:51,080
Josh:
And they're going to continue to put their foot on the gas. So I think any question
514
00:31:51,080 --> 00:31:55,020
Josh:
that anyone had, if these scaling laws could continue to hold up and we could
515
00:31:55,020 --> 00:31:58,120
Josh:
continue to be on the path to whatever AGI looks like and beyond,
516
00:31:58,280 --> 00:32:00,300
Josh:
I think that was answered this week through these earnings reports.
517
00:32:00,400 --> 00:32:02,580
Josh:
And the overwhelming answer is yes, it's true.
518
00:32:02,840 --> 00:32:06,520
Josh:
It is likely that this is going to happen and everyone is betting their entire company on it.
519
00:32:06,740 --> 00:32:11,700
Ejaaz:
I think we have done a great job, if I pat ourselves on the back virtually,
520
00:32:11,880 --> 00:32:13,980
Ejaaz:
Josh, of showing what these models are capable of.
521
00:32:14,100 --> 00:32:18,500
Ejaaz:
And remember, it's been less than 48 hours that these models have been alive.
522
00:32:18,840 --> 00:32:23,840
Ejaaz:
In fact, I think it's been like 36 hours. So if any of you are interested in
523
00:32:23,840 --> 00:32:28,400
Ejaaz:
trying these out, I cannot urge you enough to go out and try these things.
524
00:32:28,760 --> 00:32:32,640
Ejaaz:
Try to solve a problem that you're finding at work or try to solve a problem
525
00:32:32,640 --> 00:32:36,140
Ejaaz:
that you're finding just in your casual leisure time to code up a hobby or a
526
00:32:36,140 --> 00:32:39,320
Ejaaz:
project in a matter of seconds. It's so, so easy.
527
00:32:39,480 --> 00:32:43,260
Ejaaz:
And it'll put you at an advantage to understand how these tools work and why
528
00:32:43,260 --> 00:32:46,320
Ejaaz:
they're really changing the world as we see it around us, why stocks are dumping,
529
00:32:46,680 --> 00:32:47,920
Ejaaz:
why some stocks are pumping.
530
00:32:48,600 --> 00:32:51,760
Ejaaz:
But yes, go demo it. Let us know what you actually end up building.
531
00:32:52,400 --> 00:32:56,720
Ejaaz:
Josh and I are trying to give you more live demos in a lot of the episodes that we put out.
532
00:32:56,860 --> 00:32:59,540
Ejaaz:
And with every other model release and feature that drops, we are going to be
533
00:32:59,540 --> 00:33:03,600
Ejaaz:
trying and testing these things so we can bring to you exactly what these things
534
00:33:03,600 --> 00:33:06,300
Ejaaz:
can do and show you kind of like the benefits and disadvantages,
535
00:33:06,300 --> 00:33:08,000
Ejaaz:
what's real and what's really not.
536
00:33:08,440 --> 00:33:12,040
Josh:
Yeah. And I can't stress this enough. The best way to stay on top of things,
537
00:33:12,180 --> 00:33:15,260
Josh:
the best way to feel like you're not being left behind is just to use the tools
538
00:33:15,260 --> 00:33:17,920
Josh:
as they come out and to understand them and what makes them different.
539
00:33:18,140 --> 00:33:24,200
Josh:
And for a single subscription to ChatGPT or to Claude, you can access tools
540
00:33:24,200 --> 00:33:26,040
Josh:
just like this and build stuff just like this.
541
00:33:26,340 --> 00:33:29,660
Josh:
I'm not, this wasn't like an incredibly difficult technical challenge.
542
00:33:29,680 --> 00:33:32,140
Josh:
You just ask it what you want and you ask it to help you.
543
00:33:32,280 --> 00:33:35,740
Josh:
And it will actually walk through and help you through the process and build whatever you want.
544
00:33:35,860 --> 00:33:41,300
Josh:
So the most important thing for anyone listening is just to train that muscle and to get familiar with,
545
00:33:41,790 --> 00:33:45,410
Josh:
these tools and these skills so that you're able to leverage them to your advantage,
546
00:33:45,410 --> 00:33:47,950
Josh:
however it may best fit in your life.
547
00:33:48,070 --> 00:33:50,410
Josh:
And that's kind of what we wanted to share with you.
548
00:33:50,530 --> 00:33:52,890
Josh:
Like, it's simple. You download the app, you log into your account,
549
00:33:52,970 --> 00:33:54,150
Josh:
and you're on your way. It's really
550
00:33:54,150 --> 00:33:58,410
Josh:
not as difficult as I think a lot of people make it seem like it is.
551
00:33:58,630 --> 00:34:02,090
Josh:
And I mean, this beautiful dashboard is a testament to that.
552
00:34:02,450 --> 00:34:05,510
Josh:
Okay, so Ejaaz, it also looks like our Codex output
553
00:34:05,510 --> 00:34:08,230
Josh:
has finished itself. So we have here on the
554
00:34:08,230 --> 00:34:11,010
Josh:
screen, we have Opus, which we saw, which is
555
00:34:11,010 --> 00:34:14,130
Josh:
really a lovely dashboard, but it seems like Codex
556
00:34:14,130 --> 00:34:17,430
Josh:
now has its own version that we could quickly compare. So maybe we'll try, we'll
557
00:34:17,430 --> 00:34:20,490
Josh:
go to our favorite, Google. We'll type Google in and we'll click analyze and kind
558
00:34:20,490 --> 00:34:24,390
Josh:
of see how this compares. I find it funny how they've merged on the same
559
00:34:24,390 --> 00:34:29,310
Josh:
type of design style. But yeah. Oh, okay, this, whoa, this is interesting. This is
560
00:34:29,310 --> 00:34:33,210
Josh:
different. So it has the moving averages select. Oh, is that,
561
00:34:34,320 --> 00:34:35,760
Josh:
Okay, yeah, so it has the charts.
562
00:34:35,940 --> 00:34:36,620
Ejaaz:
Is that accurate?
563
00:34:36,820 --> 00:34:39,780
Josh:
It has the PE ratio. Yeah, that's what I was looking at. Let's go to that one-week chart and see.
564
00:34:40,580 --> 00:34:45,360
Josh:
I have some questions about these. It looks pretty right.
565
00:34:46,040 --> 00:34:48,320
Ejaaz:
Okay. That looks very wrong.
566
00:34:48,500 --> 00:34:52,920
Josh:
Yeah, the one-year I'm a little confused about. Let's compare it to Claude here.
567
00:34:53,100 --> 00:34:56,700
Josh:
Let's go to Google and we'll analyze that. Well, it thinks we can look at the
568
00:34:56,700 --> 00:34:58,620
Josh:
rest. So it looks like it emulated pretty well.
569
00:34:59,200 --> 00:35:01,460
Josh:
It has the verdict. It has the same stats.
570
00:35:02,660 --> 00:35:06,560
Josh:
The risk assessment matrix is... good, but you can see, like, some of the text
571
00:35:06,560 --> 00:35:11,140
Josh:
you can't really read because it's black on black. Um, but nonetheless, pretty
572
00:35:11,140 --> 00:35:12,700
Josh:
interesting they both succeeded.
573
00:35:12,700 --> 00:35:18,120
Ejaaz:
Yeah, I mean, as we said before, these models are very equally capable, and
574
00:35:18,120 --> 00:35:22,040
Ejaaz:
you know, maybe it's just the way that you prompt something, or, uh, the way that
575
00:35:22,040 --> 00:35:25,180
Ejaaz:
some of these things work, but largely they kind of achieve the same goal and
576
00:35:25,180 --> 00:35:31,120
Ejaaz:
same quality. Um, and listen, we're talking about minor discrepancies here.
577
00:35:31,660 --> 00:35:34,760
Ejaaz:
I can't wait to see what we will build with this. Like, this is insane.
578
00:35:34,960 --> 00:35:38,180
Josh:
It's amazing. Both of these one-shot prompts didn't touch anything.
579
00:35:38,180 --> 00:35:40,500
Josh:
And here we are. I do think that Google one-year chart is wrong;
580
00:35:40,600 --> 00:35:41,580
Josh:
I think Claude got that one right.
581
00:35:41,700 --> 00:35:44,360
Josh:
But we overall both succeeded in the mission. Both look great.
582
00:35:44,520 --> 00:35:45,780
Josh:
And both are just excellent models.
583
00:35:46,400 --> 00:35:49,720
Ejaaz:
Amazing. Okay, well, that's it. Wherever you're listening to this,
584
00:35:49,920 --> 00:35:53,060
Ejaaz:
if it is on YouTube and you're watching our lovely faces, or if you're listening
585
00:35:53,060 --> 00:35:56,080
Ejaaz:
to us on Spotify, Apple Music, or wherever you listen to us,
586
00:35:56,740 --> 00:35:59,860
Ejaaz:
please subscribe, give us a rating, leave us some comments.
587
00:35:59,860 --> 00:36:04,960
Ejaaz:
We love your feedback and we respond to pretty much every single comment because
588
00:36:04,960 --> 00:36:07,340
Ejaaz:
we're trying to figure out how to make this show better and bring you the content
589
00:36:07,340 --> 00:36:09,220
Ejaaz:
that you guys deserve and want.
590
00:36:09,400 --> 00:36:13,240
Ejaaz:
Turn on notifications because we are releasing more and more videos every week
591
00:36:13,240 --> 00:36:15,680
Ejaaz:
on the hottest topics as they come out.
592
00:36:15,860 --> 00:36:20,280
Ejaaz:
We also have the sickest newsletter ever where one of us will either write an
593
00:36:20,280 --> 00:36:22,600
Ejaaz:
essay or give you the five top highlights of the week.
594
00:36:22,700 --> 00:36:24,980
Ejaaz:
So if you don't want to watch any of these videos, you can just read and digest
595
00:36:24,980 --> 00:36:28,580
Ejaaz:
that and you'll know everything that you need to know in AI and frontier tech.
596
00:36:29,060 --> 00:36:31,860
Ejaaz:
Thank you for listening, and we will see you on the next one.
597
00:36:31,940 --> 00:36:33,020
Josh:
See you in the next one. Peace.