Writing bugs with K.S. Bhaskar

Episode Transcript

1 00:00:03,804 --> 00:00:13,577 Hello and welcome to Fork Around and Find Out, the podcast all about taking you back in time on what it's like to write bugs for the last five decades. 2 00:00:13,697 --> 00:00:19,129 Today on the show we have Bhaskar, who we met in person like many of the guests, which is great. 3 00:00:19,129 --> 00:00:25,521 I met you at dinner and you said something that has stuck with me to this day, where you said you've been writing the finest bugs since the 1970s. 4 00:00:25,521 --> 00:00:26,882 And I love that. 5 00:00:26,882 --> 00:00:28,462 So thank you for coming on the show. 6 00:00:28,791 --> 00:00:29,703 Thank you for having me. 7 00:00:29,703 --> 00:00:30,554 It's good to be here. 8 00:00:30,554 --> 00:00:34,804 And of course, the software business, we create bugs, right? 9 00:00:34,804 --> 00:00:35,772 That's what we do. 10 00:00:35,772 --> 00:00:38,352 That's it, no features, just bugs. 11 00:00:38,352 --> 00:00:40,041 And sometimes they're usable bugs. 12 00:00:40,041 --> 00:00:41,145 Exactly. 13 00:00:41,930 --> 00:00:48,256 I wish that we could just say that quote to computer science classes because I think it would help a lot with imposter syndrome. 14 00:00:48,256 --> 00:00:49,956 uh 15 00:00:49,956 --> 00:00:53,534 You will write this wrong five times and then you will write it right. 16 00:00:53,534 --> 00:00:55,027 But it'll still be wrong in the future. 17 00:00:55,027 --> 00:00:58,574 So don't worry. It's either a bug right now or it's a bug later. 18 00:00:58,594 --> 00:01:04,718 The number of engineers, I think especially women, that I talk to and they're just like, is it normal to feel bad so much? 19 00:01:04,718 --> 00:01:07,890 And I'm like, yeah. 20 00:01:08,331 --> 00:01:14,455 There's a lot of failing, and then you feel like a god for like two seconds, and then there's a lot of failing. 21 00:01:15,169 --> 00:01:16,110 Yeah. 
22 00:01:16,110 --> 00:01:19,124 When the build works and you're like, I understand that piece of it. 23 00:01:19,124 --> 00:01:20,706 And then the next thing is just like, I don't get it. 24 00:01:20,706 --> 00:01:21,147 Anything. 25 00:01:21,147 --> 00:01:22,112 Nothing works. 26 00:01:22,112 --> 00:01:30,686 I think that should be part of teaching engineering though, like we're almost doing them a disservice when we're just like, and then it'll build, because, I mean, that 27 00:01:30,686 --> 00:01:33,017 happens a lot less than the failing part. 28 00:01:33,017 --> 00:01:41,950 So if we were more, I guess, transparent about that cycle, I think that people could find more joy in being an engineer, you know? 29 00:01:42,527 --> 00:01:45,959 In the old days we used to say if it compiled, ship it. 30 00:01:46,096 --> 00:01:46,887 Yeah. 31 00:01:47,851 --> 00:01:50,758 Yeah, if it compiled, that's the test right there. 32 00:01:50,758 --> 00:01:54,188 If I have something that executes, right, that's good enough. 33 00:01:54,188 --> 00:02:01,063 Which is wild with the amount of testing and pipelines that we use now, you know, someone's dying when they're hearing that. 34 00:02:01,650 --> 00:02:03,690 Oh man, like a flaky test. 35 00:02:03,690 --> 00:02:05,690 You're like, oh, we can't just stop everyone. Stop. 36 00:02:05,690 --> 00:02:07,770 Because this doesn't work all the time. 37 00:02:08,090 --> 00:02:18,810 And one of the things I want to talk to you about is just how software has changed over the period of time that you've been able to experience, and how 38 00:02:18,810 --> 00:02:20,630 you've kind of changed with it, right? 39 00:02:20,630 --> 00:02:28,350 Like the industry, even in the last 10 years, is different. 20 years ago, 30 years ago, it's completely not even recognizable. 40 00:02:28,718 --> 00:02:33,662 Well, here's something too, which is when I got my first job programming. 
41 00:02:33,662 --> 00:02:37,605 I was a student at my undergraduate university in India. 42 00:02:37,605 --> 00:02:45,211 And I got my first job as a programmer in 1971 part-time, and I was paid two rupees an hour. 43 00:02:45,332 --> 00:02:57,490 Now, at that time, the biggest computer that we had on campus was an IBM 7044, which today would be dwarfed computationally by a smart doorbell. 44 00:02:57,490 --> 00:02:58,506 You 45 00:02:58,850 --> 00:03:03,371 Computer time on that sold for 2000 rupees an hour. 46 00:03:03,592 --> 00:03:06,392 Okay, so a thousand times more than I was paid. 47 00:03:06,573 --> 00:03:17,906 And so the lesson always was, what a professor told us is, if you can spend two hours debugging or fixing something to save a couple of seconds of computer time, it's worth 48 00:03:17,906 --> 00:03:18,616 it. 49 00:03:19,297 --> 00:03:22,566 And, you know, today it's completely the other way around. 50 00:03:22,566 --> 00:03:29,516 Yeah, that's probably the biggest change, because now one person's time is worth so much more than the compute. 51 00:03:29,570 --> 00:03:30,370 Right. 52 00:03:31,031 --> 00:03:33,542 You know, and I think nothing of, you know, I make a change. 53 00:03:33,542 --> 00:03:37,616 I just run a compile to see if I got the syntax right. 54 00:03:37,616 --> 00:03:47,273 And, you know, in those days I wouldn't even think of doing that because, you know, we had punch cards and you wrote your program on punch 55 00:03:47,273 --> 00:03:47,553 cards. 56 00:03:47,553 --> 00:03:52,066 You turned it in and you would basically get a printout the next day. 57 00:03:53,227 --> 00:03:57,440 And so if you didn't get it right, then you had to wait yet another day for the program to run. 58 00:04:00,127 --> 00:04:02,980 Has that made us worse as engineers? 
59 00:04:02,980 --> 00:04:13,321 Like, I feel like because we don't value the computer, we're just like, either throw money at it or try it again or do little things over and over again. 60 00:04:13,321 --> 00:04:15,713 Is that, is that as an engineering culture? 61 00:04:16,514 --> 00:04:18,045 Well, it's good and bad. 62 00:04:18,045 --> 00:04:22,957 I mean, there are things where, you know, certainly I'm a lot more productive now. 63 00:04:22,958 --> 00:04:36,565 And I think that the code I write today is probably a lot better than the code I wrote 50 years ago just because, you know, I can test it better, I can run more cycles on it and so 64 00:04:36,565 --> 00:04:37,305 on. 65 00:04:37,626 --> 00:04:45,110 On the other hand, you know, there are certain things where, let's say, if you're doing something that is really critical, 66 00:04:45,994 --> 00:04:54,639 it really does pay to stop and think about it and think through it before you just write the code. 67 00:04:54,799 --> 00:05:03,143 And certainly in the business that YottaDB is in, we go into a lot of mission-critical applications in banking and healthcare. 68 00:05:03,504 --> 00:05:08,757 And so it's absolutely critical to us to get the code right. 69 00:05:08,757 --> 00:05:14,114 Of course, it works one way in the sense that we can do a lot more testing now than we 70 00:05:14,114 --> 00:05:15,135 could have done years ago. 71 00:05:15,135 --> 00:05:22,221 But on the other hand, it also means we have to spend a lot more time thinking about the code and thinking about the design even before we start coding. 72 00:05:22,542 --> 00:05:27,466 And in the old days, we didn't think about it that much because, again, programs were also simpler. 73 00:05:28,114 --> 00:05:28,914 Sure. 
74 00:05:29,696 --> 00:05:41,113 Was that simplicity because they were not used in as many places, or was that just because they were isolated and not as dependent on other pieces of software 75 00:05:41,113 --> 00:05:41,644 and things? 76 00:05:41,644 --> 00:05:44,916 Well, you literally couldn't create a big program. 77 00:05:44,916 --> 00:05:53,863 So like the IBM 7044 computer, its memory was something like 32K words, and each word was 36 bits. 78 00:05:54,264 --> 00:06:01,649 So if it didn't fit in memory, you essentially had to sort of overlay parts of the program, bring in bits and pieces. 79 00:06:01,769 --> 00:06:08,674 So what you actually did was write your program in lots of little things, and you went from one to the next. 80 00:06:09,274 --> 00:06:12,508 You process something and then put it on tape and then you process something else. 81 00:06:12,508 --> 00:06:15,380 So you couldn't do anything that is particularly big. 82 00:06:15,622 --> 00:06:20,867 And I guess in some sense what comes around goes around because that's what microservices is. 83 00:06:21,488 --> 00:06:27,066 Except in our case, we did one service at a time instead of running them all in parallel. 84 00:06:27,066 --> 00:06:32,275 Maybe microservices would be better if we counted how much memory they used in words. 85 00:06:33,618 --> 00:06:36,743 Like, you get 50 words of memory. 86 00:06:38,206 --> 00:06:40,170 Just a... 87 00:06:40,170 --> 00:06:49,421 Think about the fact that we were counting words and that kind of processing and then like the huge data sets we're processing right now for like machine learning. 88 00:06:49,421 --> 00:06:54,997 Those two, like the fact that they're in the span of your career is absolutely amazing. 89 00:06:56,750 --> 00:06:58,610 It's been a lot of fun. 90 00:06:58,610 --> 00:07:06,250 What I tell people is, you know, I'm 70 and I got into computers. 91 00:07:06,250 --> 00:07:08,870 So, okay, so here's the reason I got into computers. 
92 00:07:09,050 --> 00:07:12,290 The computer center was the only air-conditioned building on campus. 93 00:07:14,763 --> 00:07:26,850 That is why? Okay, but like, have you ever heard anyone that's come on this show that's really done great things, everybody's got like the most unique reason for starting and I 94 00:07:26,850 --> 00:07:29,291 love that, like nobody's just because like 95 00:07:30,450 --> 00:07:38,636 I got into computers because the computer lab was across the hall from my dorm room and I could do my homework while I got paid. 96 00:07:38,636 --> 00:07:40,768 And it was just like, I could get paid and do homework. 97 00:07:40,768 --> 00:07:42,619 And that's the perfect thing as a college student. 98 00:07:42,619 --> 00:07:43,600 And I just sit there. 99 00:07:43,600 --> 00:07:47,492 But air conditioning, that is the best example. 100 00:07:47,863 --> 00:07:53,605 On the campus there, routinely in summer, the temperature would get to over 100 degrees. 101 00:07:54,006 --> 00:07:57,997 And again, this is India, and it was less developed in those days than it is today. 102 00:07:57,997 --> 00:08:04,751 So we would often have power cuts in the dorm from like 5 o'clock in the morning till 11 o'clock in the morning. 103 00:08:04,751 --> 00:08:06,571 So no fan even. 104 00:08:06,672 --> 00:08:08,963 So the computer center has got electricity. 105 00:08:08,963 --> 00:08:09,983 It's got air conditioning. 106 00:08:09,983 --> 00:08:12,234 It's a great place to go hang out. 107 00:08:12,270 --> 00:08:17,037 How did the university get a computer at that time? 108 00:08:17,037 --> 00:08:18,744 I did not grow up around computers. 109 00:08:18,744 --> 00:08:20,662 I was born in the 80s. 110 00:08:20,662 --> 00:08:29,226 So that seems like a really big privilege for that university to have that big of a computer and air conditioning and constant power at that time. 
111 00:08:29,226 --> 00:08:33,446 So this was the Indian Institute of Technology in Kanpur. 112 00:08:33,586 --> 00:08:45,666 And it was set up with aid from a consortium of nine American universities, you know, MIT, Ohio State, Purdue, I forget, there was a list of nine. 113 00:08:46,006 --> 00:08:54,226 And somewhere in like the mid 1960s, they had a spare IBM 1620 computer. 114 00:08:54,226 --> 00:08:56,194 Now the IBM 1620, by the way, 115 00:08:56,194 --> 00:08:58,175 goes back, it's a 1950s design. 116 00:08:58,175 --> 00:09:00,697 It was IBM's first scientific computer. 117 00:09:00,697 --> 00:09:04,610 So anyway, they had one to spare and they shipped it and installed it on campus. 118 00:09:04,610 --> 00:09:08,762 So that was the computer that I used to write my first program. 119 00:09:09,183 --> 00:09:16,668 And at that time in India, there were probably about, if I remember correctly, somewhere around 120 or 130 computers. 120 00:09:16,909 --> 00:09:21,591 And my university had three, so we were quite privileged. 121 00:09:23,133 --> 00:09:25,094 But it was... 122 00:09:25,270 --> 00:09:34,136 one of those things where the government of India set up these institutes to train the next generation of engineers. 123 00:09:34,377 --> 00:09:37,419 So ours was set up with American collaboration. 124 00:09:37,419 --> 00:09:39,650 There was one that was set up with British collaboration. 125 00:09:39,650 --> 00:09:42,813 Another one, you know, German, another one Soviet. 126 00:09:42,813 --> 00:09:45,215 And then there was one that was set up with UNESCO collaboration. 127 00:09:45,215 --> 00:09:52,972 And so essentially in each of these universities, the country as part of their foreign aid program would provide expertise and... 128 00:09:52,972 --> 00:09:56,084 and equipment and so on to set up the university. 
129 00:09:56,672 --> 00:10:02,105 Were there a lot of differences between the different countries, I guess, and their... 130 00:10:02,105 --> 00:10:02,890 uh 131 00:10:02,890 --> 00:10:09,184 I think the big difference was what they specialized in, what were some of the prestige courses. 132 00:10:09,184 --> 00:10:13,957 So in my university, electrical engineering was considered the prestige course. 133 00:10:14,237 --> 00:10:23,022 In the Indian Institute of Technology in Bombay, which was set up with Soviet collaboration, chemical engineering was a prestige course. 134 00:10:23,283 --> 00:10:28,756 In the one that was set up with German collaboration, for example, mechanical engineering was considered a prestige course. 135 00:10:29,562 --> 00:10:36,498 All the universities had all the different branches of engineering and we had a very competitive exam to get in. 136 00:10:36,498 --> 00:10:46,655 You know, for the entrance exam, something like 40,000 people would sit for the exam and out of those maybe a couple thousand would get in. 137 00:10:46,696 --> 00:10:54,602 And depending on your rank in the exam, you got to select which course at which university you got into. 138 00:10:55,735 --> 00:10:58,735 Wow, you must have done really well. 139 00:11:00,378 --> 00:11:01,619 I did pretty well. 140 00:11:01,619 --> 00:11:13,412 So I actually, you know, I was the only one I remember in that year that came in the first hundred in both the entrance exam for the Indian Institute of Technology as well as, there 141 00:11:13,412 --> 00:11:23,525 was another thing called the National Science Talent Search, which is kind of modeled on what used to be the Westinghouse Science Talent Search and then the Intel Science Talent Search. 142 00:11:23,585 --> 00:11:26,086 So I actually... 143 00:11:26,092 --> 00:11:28,295 That particular year, I came first in the country on that. 
144 00:11:28,295 --> 00:11:32,695 So I was the only one that managed to come in the first hundred on both. 145 00:11:32,695 --> 00:11:37,211 Did you have previous experience in electrical engineering before that? 146 00:11:37,684 --> 00:11:42,656 No, no, actually it was a long story, it's a short story actually. 147 00:11:42,656 --> 00:11:46,977 My passion in high school was physics, that's what I wanted to do in college. 148 00:11:47,357 --> 00:11:51,638 And so when I came first in the science talent search, that's really what I wanted to do. 149 00:11:51,719 --> 00:11:57,660 And then my dad said, well, you're going to do electrical engineering and you're going to do electrical engineering at this university. 150 00:11:58,021 --> 00:12:02,102 And I argued with him for several days, but ultimately I had no choice. 151 00:12:03,308 --> 00:12:03,628 Oh man. 152 00:12:03,628 --> 00:12:05,886 And then you still picked the building that had AC. 153 00:12:06,422 --> 00:12:15,658 I picked the building that had AC and again, at that time there was no undergraduate computer science program, so electrical engineering was the closest you could get to 154 00:12:15,658 --> 00:12:16,909 computer science. 155 00:12:17,990 --> 00:12:25,165 And in fact, that is a time when I actually had the opportunity to sort of unite both. 156 00:12:25,165 --> 00:12:34,441 So Stanford University shipped us an old PDP-1 computer that they had worked on and they had heavily modified. 157 00:12:34,441 --> 00:12:36,034 So they shipped it to us when... 158 00:12:36,034 --> 00:12:37,595 they didn't have any further use for it. 159 00:12:37,595 --> 00:12:40,836 And a bunch of us spent like a year putting it together. 160 00:12:40,836 --> 00:12:42,697 It filled a whole room. 161 00:12:42,897 --> 00:12:44,428 And Stanford had modified it. 162 00:12:44,428 --> 00:12:46,709 So here's something for debugging software. 
163 00:12:46,709 --> 00:12:51,581 One of the instructions that Stanford had added to it was called fiddle following. 164 00:12:51,741 --> 00:13:00,715 So if you executed a fiddle following instruction, the next instruction could be something completely different from the official instructions of the instruction 165 00:13:00,715 --> 00:13:01,505 set. 166 00:13:02,426 --> 00:13:03,946 So it was a lot of fun. 167 00:13:05,123 --> 00:13:15,413 What kind of, maybe the other people know what this is, but what did it entail to take a whole year, or that whole time, to put together a computer? 168 00:13:16,822 --> 00:13:24,289 Well, I mean, you know, when it was shipped to us, it wasn't really like Lego where you had to put it together. 169 00:13:24,289 --> 00:13:27,451 The computer was kind of created and put together. 170 00:13:27,451 --> 00:13:35,839 But what it did mean was that if the program didn't work, you had to go look at the hardware, not just the software. 171 00:13:35,839 --> 00:13:40,062 So a flip-flop was essentially a circuit board about this size. 172 00:13:40,143 --> 00:13:44,556 It used germanium PNP transistors and 173 00:13:44,812 --> 00:13:49,106 if the flip-flop circuit was loose, your program might not run properly. 174 00:13:50,153 --> 00:13:52,177 Wow, that makes debugging way more complicated. 175 00:13:52,177 --> 00:13:57,525 Like I feel like I should be more grateful for the things I have to check at this point. 176 00:13:57,651 --> 00:13:59,229 It just goes into like... 177 00:13:59,229 --> 00:14:02,891 how much software engineers were also hardware engineers, right? 178 00:14:02,891 --> 00:14:04,882 You had to know how the hardware worked. 179 00:14:04,882 --> 00:14:09,964 Autumn, I finished the book you bought me for Christmas, The Supermen, which is the story of Seymour Cray. 
180 00:14:10,085 --> 00:14:17,984 And a lot of the things that it talked about, like them deciding, do they want silicon transistors or germanium transistors? 181 00:14:17,984 --> 00:14:21,501 It was like a big deal for them to decide because everything else was germanium at the time. 182 00:14:21,501 --> 00:14:23,225 Like, no, we're gonna go with silicon, right? 183 00:14:23,225 --> 00:14:27,934 And it's just fascinating how those ripples have kind of affected so many things. 184 00:14:28,844 --> 00:14:34,788 And I remember one of the big problems they kept having in the Cray computers that they were building was just the heat dissipation. 185 00:14:34,788 --> 00:14:38,001 They're like, well, we were hand soldering all these transistors. 186 00:14:38,001 --> 00:14:39,342 These were not integrated circuits. 187 00:14:39,342 --> 00:14:41,764 These were big pieces of equipment. 188 00:14:41,764 --> 00:14:49,629 And they had to have like AC specialists on hand to be able to figure out, how do we cool this much heat with this much electricity? 189 00:14:49,870 --> 00:14:50,710 Awesome. 190 00:14:53,090 --> 00:15:08,124 And this also points out to me, I think you are living proof of investment in other countries and, not allowing people to, but just giving people access 191 00:15:08,124 --> 00:15:10,646 to things they may not have had access to before. 192 00:15:10,736 --> 00:15:21,114 Like your career is proof that that works and that that is something that is continued even today, that should be invested in long term for all people, not just to like 193 00:15:21,114 --> 00:15:25,167 say, you know, only people that live in this area can have access to things. 194 00:15:25,167 --> 00:15:34,314 And you didn't get access to the top tier hardware at the time, like you were getting secondhand computers, but it still allowed you to do so much more than what Stanford 195 00:15:34,314 --> 00:15:36,846 was gonna do with that PDP-1 anymore, right? 
196 00:15:36,846 --> 00:15:37,836 Sure, absolutely. 197 00:15:37,836 --> 00:15:52,113 And in fact, I remember back in, I forget, the 1980s and 1990s, one of the things that we were doing was getting used computers and then shipping them to places like Central America and 198 00:15:52,113 --> 00:15:57,055 Africa and so on, where one of those things would actually be very useful. 199 00:15:57,055 --> 00:16:02,998 And I recently ran across a kid that was getting old laptops and sending them to India. 200 00:16:04,087 --> 00:16:12,507 I would argue it's not only an investment because look at how great it is that you got a career. I mean, look at how much you've given back to the industry. 201 00:16:12,507 --> 00:16:13,507 You know what I mean? 202 00:16:13,747 --> 00:16:16,087 So it's not like they just gave things to you. 203 00:16:16,087 --> 00:16:20,146 You've given back so much, which shows that it is such a great... 204 00:16:20,146 --> 00:16:30,127 People just look at helping others, or making that democratization of tech, as like, oh, well, we're helping people, but they help us back. 205 00:16:30,127 --> 00:16:34,005 We have had so many technological breakthroughs, science breakthroughs, 206 00:16:34,005 --> 00:16:35,399 that add to everyone. 207 00:16:35,399 --> 00:16:37,905 Like this is a team sport of discoveries. 208 00:16:37,905 --> 00:16:40,531 We do so much more when we work together. 209 00:16:41,068 --> 00:16:42,598 Oh, absolutely. 210 00:16:42,999 --> 00:16:51,142 And I think that helping other people be productive helps us be productive as well. 211 00:16:51,142 --> 00:16:55,074 And certainly living in the US, it's easy to forget that. 212 00:16:55,074 --> 00:17:05,558 And certainly living in the US under the, I don't know what your particular political leanings are, but certainly living in the US at the present, where we tend to say, let's 213 00:17:05,558 --> 00:17:07,749 not worry about people outside. 
214 00:17:07,749 --> 00:17:10,630 I think it's a mistake because ultimately, 215 00:17:10,892 --> 00:17:12,126 you're absolutely right. 216 00:17:12,126 --> 00:17:16,138 We do live together and we sort of build on each other. 217 00:17:16,191 --> 00:17:25,576 I was actually thinking that this is such a great conversation for the time that we're in because how many cancer researchers, how many, just all kinds of researchers, were saying that, 218 00:17:25,576 --> 00:17:28,598 you know, all these wonderful students got here on merit. 219 00:17:28,598 --> 00:17:31,229 They got here because they were amazing at their field. 220 00:17:31,229 --> 00:17:39,114 And we knew that it was this great investment because they were going to give 10 times more than what they've put in and that we were going to make these breakthroughs. 221 00:17:39,114 --> 00:17:44,977 And it just makes me so sad that we are going to lose so much advancement because we can't 222 00:17:45,855 --> 00:17:50,280 look at this in a community kind of way and we're looking at it in a selfish way. 223 00:17:51,705 --> 00:17:58,958 The desire to save a dollar is just ridiculous, to think that we don't make that money back. 224 00:17:58,958 --> 00:18:00,601 uh Yeah. 225 00:18:00,601 --> 00:18:01,665 ah 226 00:18:01,665 --> 00:18:11,211 Like taxes when you're not a citizen, it's actually, we're going to spend more money to get rid of people that we really need, that make things better. 227 00:18:12,142 --> 00:18:14,856 Sure, I absolutely agree. 228 00:18:15,282 --> 00:18:22,082 So in 71, you started writing bugs and then you caught the bug for programming. 229 00:18:22,602 --> 00:18:24,462 How did that go on? 230 00:18:24,462 --> 00:18:28,722 What was it like for you, your career in, let's say, the 80s and 90s? 231 00:18:28,722 --> 00:18:31,022 What was the next kind of wave that you were doing? 232 00:18:31,362 --> 00:18:32,242 In a nutshell. 233 00:18:32,322 --> 00:18:34,502 So actually I started programming in 1970. 
234 00:18:34,502 --> 00:18:37,042 1971 was when I got my first job. 235 00:18:37,762 --> 00:18:42,422 So I came to the US as a grad student in 1975. 236 00:18:43,502 --> 00:18:50,562 And that was like this great big shift in technology because I was using these old computers at college in India. 237 00:18:50,562 --> 00:18:55,222 And I came to the Carnegie Mellon Computer Science Department in the US. 238 00:18:55,302 --> 00:18:59,206 And there we had DEC-10s and then 239 00:18:59,690 --> 00:19:04,552 using them, using a terminal interactively, that was something that was completely new to me. 240 00:19:05,133 --> 00:19:11,956 And then actually I went on to the University of Nebraska where we had like a personal APL machine. 241 00:19:11,956 --> 00:19:15,217 There's a programming language called APL. 242 00:19:15,217 --> 00:19:23,280 We actually had a machine that ran APL and that was the first computer that I saw that was a standalone desktop computer. 243 00:19:23,921 --> 00:19:25,111 Yeah, so we moved on to that. 244 00:19:25,111 --> 00:19:27,630 And then I went to work for uh 245 00:19:27,630 --> 00:19:33,530 a company in Seattle called Fluke for many years, and we built electronic test equipment. 246 00:19:33,770 --> 00:19:40,490 And I actually spent a good chunk of my time working on test equipment in particular. 247 00:19:40,490 --> 00:19:48,470 For example, there's test equipment that Boeing used for the 757 and 767 to test the radio altimeter. 248 00:19:48,470 --> 00:19:50,750 So I led the software on that. 249 00:19:51,110 --> 00:19:53,590 And it was a time of significant change. 250 00:19:53,590 --> 00:19:54,630 We actually 251 00:19:54,926 --> 00:20:00,986 retargeted a compiler, and we actually did the first piece of test equipment that was written in a high-level language. 252 00:20:00,986 --> 00:20:04,146 Until then, people had done it all in assembly language. 
253 00:20:04,406 --> 00:20:11,586 So we actually took a C compiler, and then we retargeted that from the 8080 to the Z80, and then wrote that. 254 00:20:11,626 --> 00:20:18,586 And it was kind of interesting because we had 48 kilobytes of ROM and 16 kilobytes of RAM. 255 00:20:18,586 --> 00:20:24,009 And so we'd write the software, and then we'd say, oh crap, it's like six bytes bigger than 48 kilobytes. 256 00:20:24,009 --> 00:20:24,654 Jeez. 257 00:20:24,654 --> 00:20:30,941 So then we would sit at it, and we wrote a peephole optimizer, you know, and then we'd do peephole optimization. 258 00:20:30,941 --> 00:20:35,606 We'd say, okay, here's the sequence of code which can be shrunk by this little bit. 259 00:20:35,606 --> 00:20:44,024 So we'd add that to the peephole optimizer and then it would run a compilation and then we'd say, wow, okay, now we have 10 bytes to play with. 260 00:20:44,024 --> 00:20:46,857 It is different. 261 00:20:46,857 --> 00:20:48,038 oh 262 00:20:48,056 --> 00:20:51,402 At that point you're trying to out-optimize the compiler. 263 00:20:52,322 --> 00:21:02,425 We were trying to, well, basically as part of the compiler, we did a peephole optimization phase at the end of the compilation. 264 00:21:02,565 --> 00:21:03,255 So we did that. 265 00:21:03,255 --> 00:21:10,207 And then ultimately at some point, I went on to work for other hardware companies that were doing software. 266 00:21:10,207 --> 00:21:19,970 And then I realized back in the 1980s that I really needed to work for a software company because one of the problems of working for a hardware company, it's an accounting problem. 267 00:21:19,970 --> 00:21:21,272 Companies are slaves 268 00:21:21,272 --> 00:21:22,922 to their accounting systems. 
269 00:21:22,923 --> 00:21:36,288 So if you think about it, a hardware company like Fluke, they would burden their labor costs based on, because they had to have all this massive equipment for building 270 00:21:36,288 --> 00:21:37,348 equipment. 271 00:21:37,409 --> 00:21:44,691 And so let's say the labor cost they would charge is like, let's pick a number, oh, $75 an hour. 272 00:21:45,152 --> 00:21:51,072 Now, if you think about software in those days, you wrote the software, but then the manufacturing cost of the software 
271 00:21:37,409 --> 00:21:44,691 And so let's say the labor costs they would charge is like, let's pick a number, oh $75 an hour. 272 00:21:45,152 --> 00:21:51,072 Now, if you think about software in those days, you wrote the software, but then the manufacturing cost of the software. 273 00:21:51,072 --> 00:21:55,646 is just you duplicate a floppy disk, you print it manually, shrink wrap it in a box, and you ship it. 274 00:21:55,646 --> 00:21:58,709 So maybe it's 15 minutes of time. 275 00:21:58,709 --> 00:22:07,106 But if you're charging your labor at $75 enough, and there's 15 minutes of time that does not require heavy equipment, all of a sudden, you're going to conclude that software is 276 00:22:07,106 --> 00:22:09,157 not a profitable business to be in. 277 00:22:10,699 --> 00:22:19,798 And so I decided to move into the software world, went to work for a company that did electronic. 278 00:22:19,798 --> 00:22:21,249 test and measurement software. 279 00:22:21,249 --> 00:22:29,714 In fact, that company had the first fast Fourier transform that ran on a PC and then moved into databases. 280 00:22:29,714 --> 00:22:41,120 They've been working in databases for the last 30 years and that's what Garadibi is, we're a database company and you're basically making very high-end key value databases. 281 00:22:41,330 --> 00:22:47,116 So what was that switch like going from like a lot of testing software to database? 282 00:22:47,116 --> 00:22:57,328 was in, I mean, databases weren't new, but I do feel like databases have gone through a lot of changes over the last 20 to 30 years. 283 00:22:58,316 --> 00:23:10,916 Well, in our case, we've actually stayed with the key value technology just because there's a lot of software that was written back in the 1980s and 1990s. 284 00:23:11,036 --> 00:23:12,877 I we still use that today. 
285 00:23:13,318 --> 00:23:22,365 One of the things that's fashionable these days is to say that this software is old, so therefore it must be bad. 286 00:23:23,426 --> 00:23:25,127 And that doesn't make any sense. 287 00:23:25,127 --> 00:23:28,270 The analogy I use is to like bicycles, right? 288 00:23:28,270 --> 00:23:34,630 If you look at a bicycle, for one of my presentations, I have a photo of a bicycle from the 1920s. 289 00:23:34,630 --> 00:23:36,510 And I have a photo of a bicycle from today. 290 00:23:36,510 --> 00:23:42,330 And it's one of those things where if you were to see that bicycle from the 1920s, you would have no problem riding it. 291 00:23:42,550 --> 00:23:49,410 And conversely, if the guy from the 1920s were to see the bicycle today, maybe he'd have to understand how gears work, but he'd have no problem riding a bike. 292 00:23:49,810 --> 00:23:52,730 And so, old isn't necessarily bad. 293 00:23:53,730 --> 00:23:58,374 And so one of the things that, at least in the database business that I've been in, 294 00:23:58,968 --> 00:24:03,018 we run a lot of legacy applications. 295 00:24:03,018 --> 00:24:10,801 Of that code, parts of it may have been written in the 1980s, and maybe today it's 100 times bigger than what it was in the 1980s. 296 00:24:10,801 --> 00:24:16,563 But the code is something that is living, it has grown, it has a lot more functionality. 297 00:24:16,883 --> 00:24:22,805 And we continue to insist that the code that was written in the 1980s continues to run today. 298 00:24:22,805 --> 00:24:27,778 So it's very different from the type of database business that you... 299 00:24:27,778 --> 00:24:32,702 you tend to see outside where, you know, here's something new, let's pick it up. 
300 00:24:35,058 --> 00:24:46,567 I would say databases is usually a little more conservative in wanting to try the newest greatest thing because the closer you get to critical data, the more important it is that 301 00:24:46,567 --> 00:24:50,047 that code is tested well and kind of battle tested in the real world. 302 00:24:50,047 --> 00:24:51,715 It's also a pain to migrate. 303 00:24:51,715 --> 00:24:55,037 Like migrations are so painful and they take so long. 304 00:24:56,984 --> 00:25:01,895 Well, and among other things, you have to make sure that you migrated correctly, right? 305 00:25:01,895 --> 00:25:09,797 If you have terabytes of data, you can't just say, you know, cp from this file to this file, and suddenly you've got the database moved over. 306 00:25:10,318 --> 00:25:20,560 So, and you know, we run, and especially because we're running in banking and healthcare, we have to also be very conscious about security. 307 00:25:21,201 --> 00:25:27,052 So it's not just that it has to be right, because your bank balance is just a few bits on a disk. 308 00:25:27,542 --> 00:25:31,731 it also has to have other safeguards. 309 00:25:32,708 --> 00:25:44,087 So how do you, like maintaining databases and maintaining code bases for decades, like how do you take a different mindset to how you're going to maintain something that long, 310 00:25:44,087 --> 00:25:44,322 right? 311 00:25:44,322 --> 00:25:52,273 Because there's so many things today that's like, maintain some open source project for maybe six months, a year, maybe a few years, but something like, hey, if we want to 312 00:25:52,273 --> 00:25:58,285 maintain this for 30 years, what kind of things do you need set up upfront? 313 00:25:58,285 --> 00:26:00,244 how do you maintain it through all the changes? 314 00:26:00,244 --> 00:26:03,656 Because the world's changed so much, you know? 315 00:26:04,878 --> 00:26:06,878 Well, certainly the... 316 00:26:06,878 --> 00:26:08,538 Well, one is, of course, evolution. 
317 00:26:08,538 --> 00:26:13,398 I mean, it's not like suddenly things change and something is broken. 318 00:26:13,638 --> 00:26:17,358 But the, you know, the code base has evolved. 319 00:26:17,738 --> 00:26:21,678 It originally, you know, if you go back 30, 40... 320 00:26:21,678 --> 00:26:26,118 The code base actually is almost 40 years old when it was first written. 321 00:26:26,118 --> 00:26:31,938 And it ran on a Motorola 68000 VMS computer system. 322 00:26:32,288 --> 00:26:34,990 And then it was migrated to a VAX and then to an Alpha. 323 00:26:34,990 --> 00:26:39,063 And then it was migrated to UNIX. 324 00:26:39,063 --> 00:26:43,286 And then like 25 years ago, we migrated that to Linux. 325 00:26:43,286 --> 00:26:48,609 And so with each of these migrations, there's certainly evolution that comes along. 326 00:26:48,630 --> 00:26:55,584 But part of what you have to do when you maintain a code base for that long is, well, let me take a step back 327 00:26:56,320 --> 00:26:58,191 and talk about software testing, right? 328 00:26:58,191 --> 00:27:03,253 So if you want to maintain code for that long, you've got to have a lot of testing that goes with it. 329 00:27:03,313 --> 00:27:08,955 So the goal of testing is not just to prove that the software does what it's supposed to do. 330 00:27:09,175 --> 00:27:14,958 You also have to prove that the software, you have to have confidence that the software doesn't do what it's not supposed to do. 331 00:27:16,178 --> 00:27:20,900 And then the question is, how do you have that confidence that the software is not doing what it's not supposed to do? 332 00:27:20,900 --> 00:27:25,222 Because you can't possibly test for all of those things. 333 00:27:25,304 --> 00:27:34,236 So the way that you do that in practice is you test that the software does everything that it's supposed to do, as well as a few diabolical cases that you throw at it. 
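The testing approach described here (check everything the software is supposed to do, then throw a few diabolical cases at it, and rerun the whole suite after every change) can be sketched roughly as follows. This is only an illustration: `TinyStore` and `run_regression_suite` are hypothetical names invented for this example, and nothing here is YottaDB's API or test harness.

```python
class TinyStore:
    """Hypothetical in-memory key-value store, used only for this illustration."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Reject keys the store is not supposed to accept.
        if not isinstance(key, str) or key == "":
            raise ValueError("key must be a non-empty string")
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def run_regression_suite():
    """Run every check on every change, however small the change was."""
    store = TinyStore()

    # Expected behavior: a stored value comes back unchanged.
    store.set("balance:alice", 100)
    assert store.get("balance:alice") == 100

    # Expected behavior: a later write replaces the earlier one.
    store.set("balance:alice", 250)
    assert store.get("balance:alice") == 250

    # Diabolical case: a missing key must not invent a value.
    assert store.get("balance:nobody") is None

    # Diabolical case: an empty key is rejected and existing data is untouched.
    try:
        store.set("", 1)
    except ValueError:
        pass
    else:
        raise AssertionError("empty key was accepted")
    assert store.get("balance:alice") == 250

    # Diabolical case: a very long key with odd characters still round-trips.
    odd_key = "k" * 10_000 + "\u0000\n"
    store.set(odd_key, "ok")
    assert store.get(odd_key) == "ok"

    return "all checks passed"


print(run_regression_suite())
```

The point of the sketch is the shape of the suite rather than the store itself: the "does what it should" checks and the hostile inputs live together, so a small change anywhere reruns all of them.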
334 00:27:34,737 --> 00:27:46,560 And so even if you make a small change somewhere, you still have to go through the entire test cycle to convince yourself that you haven't broken something somewhere unrelated. 335 00:27:47,440 --> 00:27:49,541 So there's a certain mindset. 336 00:27:49,541 --> 00:27:52,361 There's a certain way of writing software. 337 00:27:52,382 --> 00:27:54,142 You tend to stick to 338 00:27:55,170 --> 00:28:02,775 Let's say known APIs, you don't necessarily go chase the newest shiny thing that comes along. 339 00:28:03,696 --> 00:28:08,459 So those are all the ways that we keep the code lasting for a long time. 340 00:28:08,459 --> 00:28:14,523 And I think that if it's properly written, it should still be running 100 years from now. 341 00:28:14,523 --> 00:28:19,006 I won't be around 100 years from now, but the code should still be around and still be running. 342 00:28:19,447 --> 00:28:26,527 One of my favorite parts of talking to you so far is people won't be able to see it because of podcasts, but you have just the kindest face. 343 00:28:26,527 --> 00:28:30,827 And when you talk about the technology, you still look excited. 344 00:28:30,827 --> 00:28:33,387 And I love that because people will be like, well, what do you want to do when you retire? 345 00:28:33,387 --> 00:28:34,627 And I was like, I hope I never retire. 346 00:28:34,627 --> 00:28:42,467 I don't want to work nine to five forever, but like, I hope I get to kind of always play with technology and. 347 00:28:43,647 --> 00:28:44,224 Yes. 348 00:28:44,224 --> 00:28:45,806 things and from enjoying it. 349 00:28:45,806 --> 00:28:46,737 Yeah. 350 00:28:46,918 --> 00:28:47,758 Yeah. 351 00:28:47,853 --> 00:28:48,573 and enjoyment. 352 00:28:48,573 --> 00:28:53,186 How have you, I guess, pivoted? 353 00:28:53,186 --> 00:28:57,088 I've watched your talks long ago just because being in the database world. 
354 00:28:57,088 --> 00:29:02,472 And the way that you speak of technology just makes me excited about it. 355 00:29:02,472 --> 00:29:04,323 How have you kept that excitement? 356 00:29:04,323 --> 00:29:06,344 Because you do great things. 357 00:29:06,344 --> 00:29:07,875 You advocate for open source. 358 00:29:07,875 --> 00:29:10,707 You advocate for access to like in... 359 00:29:10,707 --> 00:29:11,727 uh 360 00:29:12,137 --> 00:29:19,568 Allowing people into IT technology like you've done a lot of so many amazing things in your career and you still have that same excitement Like you just started yesterday. 361 00:29:19,568 --> 00:29:21,110 How do you keep that going? 362 00:29:22,338 --> 00:29:26,800 Well, there are two reasons to be in business, make money and have fun. 363 00:29:26,800 --> 00:29:36,854 you know, you have to, with the make money part, you need enough to keep a roof over your head and put bread on the table, but I'm not out to, you know, buy a Caribbean island 364 00:29:36,854 --> 00:29:38,204 or something like that. 365 00:29:38,344 --> 00:29:43,726 And, but the fun part is, you know, I found something that I enjoy doing. 366 00:29:44,787 --> 00:29:49,529 And once I found something that I enjoyed doing, I kind of stuck with it. 367 00:29:49,789 --> 00:29:51,990 And what I tell people is that, 368 00:29:53,102 --> 00:30:00,449 I would like to keep writing software until it's time for me to be carried out horizontally because that's what I enjoy doing. 369 00:30:00,449 --> 00:30:06,034 And that's my bucket list and I'm able to do that. 370 00:30:06,034 --> 00:30:10,528 So it's not right for everyone obviously, but it's right for me. 371 00:30:10,753 --> 00:30:22,355 They say that if you, oh, they have that saying like, if you do what you enjoy, you never work a day in your life and your face just like, I hope to have a career that's half as 372 00:30:22,355 --> 00:30:24,427 cool as yours where I still look that excited. 
373 00:30:24,427 --> 00:30:28,602 Like what is it that you love so much about databases and open source? 374 00:30:28,602 --> 00:30:33,685 Because like you are just such a proponent of Linux, databases and open source. 375 00:30:33,685 --> 00:30:36,659 And I love the quote where you always say that open source is good for business. 376 00:30:36,659 --> 00:30:39,912 And I think we're kind of in a weird spot in open source right now. 377 00:30:39,912 --> 00:30:47,591 So like, I would love to hear kind of like, what keeps bringing you back to that and like what joy you find in databases and open source in Linux. 378 00:30:48,866 --> 00:30:54,729 Well, databases just happens to be something that I stumbled into when I was looking for a job. 379 00:30:54,729 --> 00:30:56,080 And then it got me interested. 380 00:30:56,080 --> 00:30:58,391 And once I got interested, I kind of stayed interested with it. 381 00:30:58,391 --> 00:31:09,658 But what I like about open source is that it transfers power from the hands of the developer to the hands of the user. 382 00:31:09,798 --> 00:31:15,721 And that was brought home to me when I was running a small business in Massachusetts many years ago. 383 00:31:17,036 --> 00:31:24,070 For bug tracking in our software, we had this software that cost like $1,000 or $2,000 or something like that. 384 00:31:24,070 --> 00:31:27,211 I forget the amount, but it was something that a small company could afford. 385 00:31:27,472 --> 00:31:32,705 And then that company got bought by another company that got bought by a bigger company. 386 00:31:32,705 --> 00:31:41,619 And suddenly this $2,000 piece of software became an enterprise software with like a quarter of a million dollar entry price. 387 00:31:43,041 --> 00:31:46,062 And about that time, I was also 388 00:31:46,706 --> 00:31:53,261 I started using Emacs, I was influenced by Richard Stallman and some of the work that he was doing. 
389 00:31:53,261 --> 00:32:03,460 And I realized that you really need to shift the balance of power from the developer to the user. 390 00:32:03,460 --> 00:32:11,506 And the way to do that is open source or free software, free as in libre, not free as in beer, though the two go together. 391 00:32:11,547 --> 00:32:16,290 And so that's how I sort of became a convert to using open source. 392 00:32:16,290 --> 00:32:31,393 Now the transition to doing open source as a business came later and that was mostly, I was working for this company, our software was proprietary and then we got bought by this 393 00:32:31,393 --> 00:32:35,086 other company because they wanted the technology, they were our biggest customer. 394 00:32:35,587 --> 00:32:41,772 And they didn't want to market that software, they just said we'll continue supporting our old customers. 395 00:32:41,852 --> 00:32:44,394 But then what I could see was that over time, 396 00:32:44,598 --> 00:32:46,698 And this goes back to like 2000. 397 00:32:46,919 --> 00:32:59,222 Over time, what would happen is that as the bar for quality kept getting higher and higher, if the number of users stayed the same or shrank, then eventually what is going to 398 00:32:59,222 --> 00:33:06,024 happen is that you're going to be spending all of your time on quality and on testing and less of the time on the software itself. 399 00:33:06,084 --> 00:33:08,424 So we had to expand the user base. 400 00:33:09,225 --> 00:33:14,378 And the parent company was kind of skeptical about open source, but I said, hey, let's go 401 00:33:14,378 --> 00:33:18,579 open source, let's release the software, we'll get a lot more users. 402 00:33:19,320 --> 00:33:29,524 And so they went along with that and surprise, surprise, all of a sudden, we got a lot more users and some of those users turned into customers because they were running these 403 00:33:29,524 --> 00:33:32,485 critical applications and needed someone behind them. 
404 00:33:32,986 --> 00:33:38,668 And that was when I became a fan of open source as a business. 405 00:33:39,588 --> 00:33:44,150 And frankly, what we do is we sell peace of mind. 406 00:33:44,302 --> 00:33:53,722 So if someone is using our software and they're using it because their business depends on it, they need someone behind them and by having us behind them, they have peace of mind 407 00:33:53,722 --> 00:33:55,082 and that's what we're selling. 408 00:33:55,935 --> 00:33:56,399 I wish. 409 00:33:56,399 --> 00:33:57,100 a lot of these are, right? 410 00:33:57,100 --> 00:34:07,245 Especially open source where the lower in the stack and the more critical the software is, the more people need to be able to sleep at night knowing that database is gonna have my 411 00:34:07,245 --> 00:34:09,158 data tomorrow, right? 412 00:34:09,158 --> 00:34:09,809 exactly. 413 00:34:09,809 --> 00:34:21,204 people would like remember what you just said about open source because I think that everybody is in this rush to change licenses on people and they forget the reciprocal and 414 00:34:21,204 --> 00:34:28,607 like relationship people that there is with people being your customer and contributing to your code base and using your software. 415 00:34:29,784 --> 00:34:39,146 Sure, and in fact just using software has value because if you use the software and you report issues, that has value to a software developer. 416 00:34:42,694 --> 00:34:44,975 Yeah. 417 00:34:45,716 --> 00:34:55,971 You said you were influenced by Richard Stallman and I just recently read The Cathedral and the Bazaar about open source uh software as opposed to the Free Software Foundation and 418 00:34:55,971 --> 00:34:57,321 what Richard Stallman was doing. 419 00:34:57,321 --> 00:35:00,473 How did you see that play out in what businesses were thinking? 420 00:35:00,473 --> 00:35:05,145 Because when I think of open source software and kind of the boom, I think of... 
421 00:35:07,194 --> 00:35:16,047 not Mozilla, you know, like the browsers, the browser wars in the nineties where there was these open source options of like there was Internet Explorer and Microsoft, and then 422 00:35:16,047 --> 00:35:18,098 there was the open source version. 423 00:35:18,098 --> 00:35:22,300 And that was really how like people saw open source could be a business. 424 00:35:22,300 --> 00:35:24,861 Cause there was this thing that was challenging the big monopoly. 425 00:35:24,861 --> 00:35:25,841 How is that? 426 00:35:25,841 --> 00:35:28,302 How have you seen that change over the years? 427 00:35:29,560 --> 00:35:37,151 Well, actually, interesting that you mentioned The Cathedral and the Bazaar because ESR just lives a few miles down the road from where we are. 428 00:35:37,151 --> 00:35:38,577 if you know him, have him come on the show. 429 00:35:38,577 --> 00:35:40,039 I'd love to talk to him. 430 00:35:40,039 --> 00:35:42,000 Well, I sort of know him. 431 00:35:42,000 --> 00:35:46,904 I gave him a ride once to the Southeast Linux Festival many years ago. 432 00:35:46,904 --> 00:35:51,007 oh Sure, I'll mention it to him. 433 00:35:51,048 --> 00:35:53,369 send him an email. 434 00:35:53,830 --> 00:36:08,066 No, I think the big cultural change, if you remember Steve Ballmer many years ago saying Linux is a cancer, and then many years later, you 435 00:36:08,066 --> 00:36:10,487 Satya Nadella says, we love Linux. 436 00:36:11,148 --> 00:36:22,316 So I think that really summarizes the cultural shift that has happened where people saw open source as a threat. 437 00:36:22,316 --> 00:36:24,518 Now they see that as an opportunity. 438 00:36:24,518 --> 00:36:36,736 And the sad part is now they see taking the software, you have these open core licenses and openish licenses which aren't really open source, but which are source-available licenses. 439 00:36:37,048 --> 00:36:39,711 So that seems to be going in the opposite direction. 
440 00:36:39,711 --> 00:36:45,707 But I can see why people want to do that because certainly making money in the open source business is hard. 441 00:36:46,688 --> 00:36:48,670 But then making money in any business is hard. 442 00:36:48,670 --> 00:36:52,414 If I were running a restaurant, I think making money in the restaurant business would be hard. 443 00:36:52,725 --> 00:36:53,676 Very true. 444 00:36:53,676 --> 00:36:56,497 I just want to say that talking to you is a joy. 445 00:36:56,497 --> 00:37:00,249 you just, your energy just like totally makes me like so excited. 446 00:37:00,249 --> 00:37:04,591 What got, what started you with your love for Linux? 447 00:37:04,591 --> 00:37:10,143 Like what made you, what drove, what attracted you to Linux and what's kept you there for so long? 448 00:37:11,352 --> 00:37:14,214 So it's open source Unix. 449 00:37:14,214 --> 00:37:16,455 So I've used Unix for many years. 450 00:37:16,455 --> 00:37:19,867 In fact, my first personal computer was a Unix computer. 451 00:37:19,867 --> 00:37:29,873 It was an AT&T 3B1 that had like a 40 megabyte disk and two megabytes of RAM or something like that. 452 00:37:29,873 --> 00:37:31,494 And it ran Unix. 453 00:37:32,435 --> 00:37:38,222 And then when I got to Linux, the first Linux I got was actually a Linux that 454 00:37:38,222 --> 00:37:43,622 the entire Linux was on a floppy and you booted off the floppy and you ran it and it gave you a shell. 455 00:37:44,022 --> 00:37:48,062 So it is like, cool, here's this Unix system. 456 00:37:48,262 --> 00:37:54,402 I have access to all the source code, I can play with it and that kind of got me into Linux in the first place. 457 00:37:54,582 --> 00:38:00,358 And so I kicked the Windows habit probably around 1999 and... 458 00:38:00,547 --> 00:38:01,512 You missed XP. 459 00:38:01,512 --> 00:38:02,834 That was a good wave. 460 00:38:04,194 --> 00:38:09,483 Well, I actually bought a used laptop once; it had XP but then I installed Linux on it. 
461 00:38:09,483 --> 00:38:11,913 that's good. 462 00:38:14,805 --> 00:38:24,399 What do you, if you had to pick one part of your career, what was like, like what was the most exciting, I guess the highlight or what would you take away if you had to tell like 463 00:38:24,399 --> 00:38:28,213 your younger self about like this career that you could not have imagined? 464 00:38:28,415 --> 00:38:29,245 Like. 465 00:38:31,424 --> 00:38:33,085 Oh, that's hard to say. 466 00:38:33,085 --> 00:38:36,626 I think that I've enjoyed every bit of it. 467 00:38:36,626 --> 00:38:47,400 I've actually changed jobs very few times, only three or four times in my entire career because I've enjoyed doing what I do. 468 00:38:47,880 --> 00:38:54,102 So what I would tell my younger self is just to keep doing what you enjoy. 469 00:38:54,647 --> 00:38:56,751 Do you have any, oh, sorry. 470 00:38:57,358 --> 00:39:02,006 to what you said about being in business, it's you're making money or you're having fun. 471 00:39:02,006 --> 00:39:06,687 And I feel like in a lot of ways that second piece on having fun is not what you're doing. 472 00:39:06,687 --> 00:39:08,729 Like you're making fun too, right? 473 00:39:08,729 --> 00:39:11,100 Like you are creating the fun you want to have. 474 00:39:11,100 --> 00:39:14,814 You are making money and you are making fun to be able to enjoy this, right? 475 00:39:14,814 --> 00:39:19,699 Cause we can make work into a lot of different things and we can say, oh, this sucks. 476 00:39:19,699 --> 00:39:21,090 I hate everything. 477 00:39:21,212 --> 00:39:27,890 But if you kind of go into it with excitement and wanting to learn the things and wanting to push yourself, you get to make your own fun. 478 00:39:27,890 --> 00:39:28,781 And that's really cool. 479 00:39:28,781 --> 00:39:30,984 That's what my kids do all day, every day, right? 480 00:39:30,984 --> 00:39:31,845 They get to play with their friends. 
481 00:39:31,845 --> 00:39:34,158 They get to create fun out of nothing. 482 00:39:34,158 --> 00:39:35,442 Sure. 483 00:39:35,442 --> 00:39:35,992 You have to. 484 00:39:35,992 --> 00:39:38,233 a parent, I think remembering that. 485 00:39:39,118 --> 00:39:44,501 I always, you know, even when I was a kid, I would fiddle with stuff. 486 00:39:44,501 --> 00:39:50,985 So back in high school, in high school biology, I was an ace at dissection. 487 00:39:51,085 --> 00:40:00,290 And that was a time when, you know, Christiaan Barnard and Denton Cooley and others were doing their first heart transplants. 488 00:40:00,331 --> 00:40:02,111 So I said, you know, this is cool. 489 00:40:02,312 --> 00:40:04,653 I could do a heart transplant on frogs. 490 00:40:04,913 --> 00:40:06,514 So one... 491 00:40:06,794 --> 00:40:17,887 one evening I found two unfortunate frogs in my backyard and I found that it was much easier to take a frog apart than to put it back together. 492 00:40:20,987 --> 00:40:25,464 I mean, I feel the same way about my VCR when I was a kid, but man, a frog, that's... 493 00:40:26,082 --> 00:40:29,237 Well, so, you know, I've got other stories like that too. 494 00:40:29,237 --> 00:40:40,685 You know, once I opened my watch, I had an old mechanical watch from back in the 1960s and, you know, I tried putting it back together and I had enough pieces left over for the second 495 00:40:40,685 --> 00:40:41,526 watch. 496 00:40:44,115 --> 00:40:45,195 For sure. 497 00:40:46,700 --> 00:40:48,173 How does that apply? 498 00:40:48,173 --> 00:40:50,088 I feel like that happens in software too. 499 00:40:50,088 --> 00:40:55,418 In software, I go to refactor something and I'm like, why do I have so much left over here? 500 00:40:56,046 --> 00:40:58,591 Sure, that's absolutely true. 501 00:40:58,591 --> 00:41:05,422 It's understanding how things work, whether it's software or whether it's hardware or whether it's an animal. 
502 00:41:05,422 --> 00:41:08,566 It's understanding what makes things tick. 503 00:41:08,742 --> 00:41:11,504 So help me understand YottaDB a little bit, right? 504 00:41:11,504 --> 00:41:15,506 Like that's the code base and the company that you founded, you've been working on for so long. 505 00:41:15,587 --> 00:41:19,529 Like on the website, the fastest key value database, right? 506 00:41:19,529 --> 00:41:21,821 Like what makes it that fast? 507 00:41:21,821 --> 00:41:24,042 Why does it function in that way? 508 00:41:24,042 --> 00:41:26,854 And why is that something you've been doing for so long? 509 00:41:27,512 --> 00:41:35,806 So, I mean, what makes YottaDB so fast, I think, is just it's an obsession with speed. 510 00:41:35,806 --> 00:41:37,747 I mean, actually, speed is second. 511 00:41:37,747 --> 00:41:40,327 The first thing is it's an obsession with correctness. 512 00:41:40,788 --> 00:41:47,471 Because if the software doesn't have to be right, then you can make it arbitrarily fast. 513 00:41:47,471 --> 00:41:51,563 So, you know, correctness comes first. 514 00:41:51,563 --> 00:41:53,613 But we do obsess over speed. 515 00:41:55,678 --> 00:42:02,660 And even from one release to the next, if you see any kind of slowdown, we go through, we analyze why it is that way. 516 00:42:02,660 --> 00:42:06,041 It is just something that we do naturally. 517 00:42:06,041 --> 00:42:16,004 And so right now, the next release, for example, we're looking at rewriting the garbage collector just because that's a potential opportunity. 518 00:42:16,324 --> 00:42:18,804 So I think that that's where it comes from. 519 00:42:18,804 --> 00:42:24,736 And again, the way that it was developed, it was developed at a time when computers were 520 00:42:25,390 --> 00:42:27,952 you know, a thousand times slower than they are today. 
521 00:42:28,614 --> 00:42:36,992 And when you do that, then you naturally focus a lot more on performance and that kind of carries through in the code base all the way to today. 522 00:42:38,438 --> 00:42:45,602 Now key value data stores tend, like I know a lot of people that treat them like Redis, right? 523 00:42:45,602 --> 00:42:46,292 It's a cache. 524 00:42:46,292 --> 00:42:47,943 I don't actually care about the data. 525 00:42:47,943 --> 00:42:49,704 It's throw away information. 526 00:42:49,704 --> 00:42:51,385 I could rehydrate this somewhere else. 527 00:42:51,385 --> 00:43:00,469 But then I also think on the other side of that on things like Kubernetes and etcd, another key value database that's really important, but also uh intentionally 528 00:43:01,264 --> 00:43:03,908 fault tolerant for being distributed, right? 529 00:43:03,908 --> 00:43:07,414 We want something that's distributed so that we don't ever lose information. 530 00:43:07,414 --> 00:43:13,113 And on the other end, we have this, I don't really care about it, I just want it in RAM and it can go away at any time. 531 00:43:13,113 --> 00:43:16,117 Where does YottaDB sit in that sort of scale? 532 00:43:16,635 --> 00:43:26,560 There's a third dimension to that, which is that you do care about the information, but in our case, we also care about, so say, let's say, you know, a distributed database. 533 00:43:27,101 --> 00:43:31,603 You're never going to get high transaction performance with the database. 534 00:43:31,954 --> 00:43:38,207 In our case, transaction performance is absolutely important for high-end banking systems. 535 00:43:38,788 --> 00:43:42,130 And at the same time, you know, having the data be 536 00:43:43,855 --> 00:43:47,477 The integrity of the data is paramount. 
537 00:43:47,857 --> 00:44:00,096 So we have different techniques for doing that, basically replicating in real time to different replicas, but having one system be essentially the system of truth at any given 538 00:44:00,096 --> 00:44:01,066 instant. 539 00:44:03,168 --> 00:44:07,071 Comparing it to Redis, though, people often use Redis as a cache. 540 00:44:07,071 --> 00:44:11,274 Now, in our case, we're faster than Redis and we're a database, so you don't really need a cache. 541 00:44:11,274 --> 00:44:12,294 uh 542 00:44:12,376 --> 00:44:14,887 You just use the database directly. 543 00:44:15,628 --> 00:44:22,913 And I think there was a question that you had when you didn't actually articulate it, but you were kind of wondering about key value databases. 544 00:44:22,913 --> 00:44:27,256 It's important to remember that the very first databases were actually key value databases. 545 00:44:27,377 --> 00:44:30,898 In fact, the very first database was a key value database. 546 00:44:31,179 --> 00:44:41,536 It was developed by Rockwell and IBM for the Saturn V Apollo to manage the bill of materials for the moonshot. 547 00:44:41,686 --> 00:44:45,333 and it was developed in the 1960s and that was a key value database. 548 00:44:45,333 --> 00:44:48,408 And that database, by the way, is still running today. 549 00:44:48,408 --> 00:44:50,861 It's an IBM product called IMS. 550 00:44:51,864 --> 00:44:55,516 And, you know, and then, so go ahead. 551 00:44:55,516 --> 00:44:57,868 why did they write a database? 552 00:44:57,868 --> 00:44:59,719 Like I don't actually know the history. 553 00:44:59,719 --> 00:45:05,364 like I imagine we start with files on disk and we just say, we can't have two things write to the file on disk. 554 00:45:05,364 --> 00:45:11,239 And so we need something else to handle deletions and whatever, like locking, whatever the case may be. 
555 00:45:11,239 --> 00:45:16,874 And so at some point we like changed from saying this file that I'm writing to is now a database. 556 00:45:19,522 --> 00:45:22,644 Well, it's not just the access control. 557 00:45:22,644 --> 00:45:27,768 It's also the fact that you need to search and retrieve data. 558 00:45:28,749 --> 00:45:34,253 So yes, in theory, you can just take a flat file and any flat file is a database. 559 00:45:35,194 --> 00:45:44,741 But on the other hand, finding information in that flat file or updating information in the flat file can be challenging if it's a big file. 560 00:45:44,741 --> 00:45:46,622 And that's where databases come in. 561 00:45:49,307 --> 00:45:51,368 At some point we, I mean, there's still files on disk, right? 562 00:45:51,368 --> 00:46:02,094 At some point they're like, there's bits on a disk, but how I index that's how I make sure I can quickly access the right information and update or write new information. 563 00:46:02,466 --> 00:46:04,047 That's exactly what the database is. 564 00:46:04,047 --> 00:46:12,170 And ultimately, as a database developer, we rely on the integrity of the underlying file system. 565 00:46:12,670 --> 00:46:17,682 So if the file system gets corrupted somewhere, then we can't really use it. 566 00:46:17,682 --> 00:46:22,474 Because when we write data to the file system, we expect to get it back. 567 00:46:23,814 --> 00:46:28,578 I bet you have some stories there about file systems that didn't give you that data. 568 00:46:28,578 --> 00:46:35,684 Well, I mean, today there are only two file systems that we consider fully supported, ext4 and XFS. 569 00:46:36,326 --> 00:46:41,630 So we consider F2FS kind of supportable, but not necessarily supported. 570 00:46:43,052 --> 00:46:53,382 we tell our customers we don't support Btrfs, we don't support ZFS, we don't support NFS, because we have found in our testing that they're not always reliable. 
571 00:46:53,382 --> 00:47:06,576 That's fascinating because I always think of ZFS and Btrfs as having more checksums and more, you know, they have protections against bitrot and all these things that XFS and 572 00:47:06,576 --> 00:47:08,278 ext4 don't have. 573 00:47:09,602 --> 00:47:16,873 Well, in our testing, we routinely run, we've got a couple dozen computers out here, we're constantly testing. 574 00:47:17,335 --> 00:47:24,736 And we have found situations where ZFS and Btrfs basically don't give us the data that we expect. 575 00:47:25,062 --> 00:47:26,202 Hmm. 576 00:47:26,442 --> 00:47:27,863 That's fascinating. 577 00:47:28,724 --> 00:47:39,715 even thinking back on before we had journaled file systems, with whatever you were writing, you hope it wrote to disk before the power goes off, right? 578 00:47:39,715 --> 00:47:45,614 Like there's a lot of situations that we've come a long way in those areas to make sure file systems were pretty reliable. 579 00:47:45,614 --> 00:47:58,894 Oh, actually, I do have a story on that, in fact, which is that when the upstream code base first went into production back in like 1986, someone accidentally at a data center 580 00:47:58,894 --> 00:48:02,353 kicked out the power cable of the computer system. 581 00:48:03,014 --> 00:48:10,354 And they found that there was a bug and there was like two or three weeks of data which had not been recorded in the database. 582 00:48:10,974 --> 00:48:15,010 So the vendor actually sent all of the team down to 583 00:48:15,010 --> 00:48:19,676 the customer site and fortunately, at that time, they still had the paper records. 584 00:48:19,798 --> 00:48:29,734 So basically everyone had this crash project to go put the paper records back in the database and also fix the bug in the database so that everything got written out to disk. 
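The power-cable story turns on the difference between data the program has written and data that has actually reached the disk. As a rough sketch of that idea only (this is not how YottaDB or its upstream code base journals data, and `durable_append` and `read_records` are names made up for this example), an append that asks the operating system to flush to stable storage might look like this in Python:

```python
import os
import tempfile


def durable_append(path, record):
    """Append one record, then ask the OS to push it to stable storage."""
    with open(path, "ab") as f:
        f.write(record.encode("utf-8") + b"\n")
        f.flush()             # move the bytes from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cached copy to the device


def read_records(path):
    """Read back every record that was appended to the file."""
    with open(path, "rb") as f:
        return [line.decode("utf-8") for line in f.read().splitlines()]


# Demo in a throwaway directory; the file name is arbitrary.
demo_dir = tempfile.mkdtemp()
journal = os.path.join(demo_dir, "journal.log")
durable_append(journal, "deposit alice 100")
durable_append(journal, "withdraw alice 30")
print(read_records(journal))
```

Without the `flush` and `os.fsync` calls, the records could sit in memory buffers for some time, which is the window in which a lost power cable loses data. Even with them, durability still depends on the file system and hardware honoring the request, which ties back to the point about relying on the integrity of the underlying file system.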
585 00:48:34,082 --> 00:48:45,239 We started this conversation talking about having difficulty with things occasionally and still trying to learn new things and banging your head on the wall, trying to figure out 586 00:48:45,239 --> 00:48:46,890 like how things are working. 587 00:48:46,890 --> 00:48:53,334 What are the things today that you might still struggle with learning or trying to pick up when you're building it? 588 00:48:55,054 --> 00:48:59,914 Well, I mean, certainly one of the things that I'm struggling with right now is AI. 589 00:49:02,014 --> 00:49:06,094 I find that it's useful. 590 00:49:06,894 --> 00:49:18,134 sometimes when I, you know, like I had to replace my laptop battery over the weekend and I was kind of wondering why the old battery died after just two years. 591 00:49:18,194 --> 00:49:24,526 And I went to do some research and found that, you know, I shouldn't charge it to more than 80% if I... 592 00:49:24,526 --> 00:49:27,488 So then I said, how do I keep it within that limit? 593 00:49:27,488 --> 00:49:31,702 And basically I went to, I asked a couple of different AI models. 594 00:49:31,702 --> 00:49:37,837 Ultimately I find DeepSeek to be the easiest one to use for me, but it gave me good information. 595 00:49:38,338 --> 00:49:47,425 But where I have the problem is understanding what it's doing, what it's been trained on, how is it giving me that answer? 596 00:49:48,206 --> 00:49:53,760 And it doesn't seem to know its limits sometimes. 597 00:49:54,737 --> 00:49:55,309 Yeah, for sure. 598 00:49:55,309 --> 00:50:00,078 It's so confident in the information it gives without knowing where the boundaries of 599 00:50:00,078 --> 00:50:07,438 Right, then something like my laptop battery, not particularly important. 600 00:50:07,438 --> 00:50:13,138 I also asked it about recipes for pasta sauce when I was cooking dinner a couple of nights ago. 601 00:50:13,138 --> 00:50:14,938 It did all that well. 
602 00:50:14,978 --> 00:50:27,570 But then when you think about using AI to predict, some people are trying to do this, someone who is going to commit a crime before they commit a crime. 603 00:50:27,570 --> 00:50:28,224 you 604 00:50:28,224 --> 00:50:39,668 Or you have face recognition software where it's been trained well on, let's say, white males, but can't really tell other people apart. 605 00:50:39,668 --> 00:50:42,399 And there are enough cases of mistaken identity. 606 00:50:42,539 --> 00:50:46,700 So those are the kinds of things where I'm certainly concerned. 607 00:50:46,700 --> 00:50:52,542 I don't have a good grasp on it, and I don't feel that we as a society have a good grasp on it. 608 00:50:54,507 --> 00:51:01,643 Or at least the people that have any sort of grasp on it have a vested interest in making sure other people don't understand it, right? 609 00:51:01,643 --> 00:51:04,848 Because as soon as it's not magic and it's just a technology, 610 00:51:05,554 --> 00:51:10,254 they kind of lose the power of being able to train on whatever data they want. 611 00:51:10,254 --> 00:51:14,794 Because right now, looking at it, what they train on is copyrighted material. 612 00:51:14,794 --> 00:51:15,974 And they can spit out that copyright. 613 00:51:15,974 --> 00:51:20,274 They're like, oh, no, we have to have this because you don't understand how AI works. 614 00:51:20,274 --> 00:51:26,154 And then at the end of the day, it's like, no, that's what Google did back in the day, where they just said, we're going to scan every book. 615 00:51:26,154 --> 00:51:30,634 And then when you sue us to say we can't scan every book, we're going to say, oh, OK, we'll stop. 616 00:51:30,634 --> 00:51:31,794 But we already have the books. 617 00:51:31,794 --> 00:51:32,574 We already have the data. 618 00:51:32,574 --> 00:51:33,554 We're fine. 619 00:51:33,554 --> 00:51:34,534 We'll keep.
620 00:51:36,184 --> 00:51:45,937 Well, and the other thing is AI certainly has had a whole bunch of, you know, hucksters come out there that say we can, you know, here's all this magic that we can do and trust us, 621 00:51:45,937 --> 00:51:47,969 it's going to work. 622 00:51:48,464 --> 00:51:49,240 Yeah. 623 00:51:50,931 --> 00:52:00,251 Yeah, and again, a vested interest in making that money and making those promises that this will grow forever and they can do anything they can. 624 00:52:00,251 --> 00:52:07,251 But at the end of the day, LLMs, at least to some extent, are fancy databases. 625 00:52:07,904 --> 00:52:08,636 Sure. 626 00:52:09,356 --> 00:52:13,872 And rely on a lot of that data structured in certain ways. 627 00:52:13,872 --> 00:52:17,987 I've been learning a lot about vector databases and just like, what do they do? 628 00:52:17,987 --> 00:52:19,178 How do they store data? 629 00:52:19,178 --> 00:52:20,500 Why is it important? 630 00:52:20,500 --> 00:52:26,006 And why did we even need a different type of database for this sort of thing? 631 00:52:26,227 --> 00:52:28,089 So yeah, that's very fascinating. 632 00:52:28,089 --> 00:52:28,727 oh 633 00:52:28,727 --> 00:52:36,053 about because if you think about YottaDB, we have, at the core, the core technology is a key-value data store. 634 00:52:36,894 --> 00:52:40,296 And on top of that, we've, for example, we have a SQL layer. 635 00:52:40,357 --> 00:52:44,780 Ultimately, a key-value data store is the most general type of data store. 636 00:52:45,281 --> 00:52:47,093 And so we can put SQL on top of it. 637 00:52:47,093 --> 00:52:53,167 I've been thinking about what would it take to put a vector database layer on top of it? 638 00:52:53,708 --> 00:52:56,200 And which one would we want to be compatible with? 639 00:52:57,944 --> 00:53:07,309 So it certainly is something that ultimately, when you have an AI system, there's a large vector database somewhere that's actually storing the data.
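The idea of putting a vector layer on top of a key-value store can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `KVStore` and `VectorLayer` are hypothetical names, the store is an in-memory dictionary rather than YottaDB or any real product's API, and the search is brute-force cosine similarity, whereas production vector databases generally rely on approximate nearest-neighbor indexes.

```python
import math
import struct


class KVStore:
    """Toy in-memory key-value store (hypothetical stand-in for a real one)."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def scan(self, prefix: bytes):
        # Yield (key, value) pairs whose key starts with prefix, in key order.
        for key in sorted(self._data):
            if key.startswith(prefix):
                yield key, self._data[key]


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VectorLayer:
    """Hypothetical vector layer: embeddings stored as packed doubles under a key prefix."""

    PREFIX = b"vec/"

    def __init__(self, store: KVStore, dim: int):
        self.store = store
        self.dim = dim
        self._fmt = f"<{dim}d"  # little-endian, dim 8-byte floats

    def add(self, item_id: str, vector) -> None:
        if len(vector) != self.dim:
            raise ValueError("vector has the wrong dimension")
        self.store.put(self.PREFIX + item_id.encode(), struct.pack(self._fmt, *vector))

    def nearest(self, query, k: int = 1):
        # Brute force: score every stored vector, then keep the k most similar.
        scored = []
        for key, raw in self.store.scan(self.PREFIX):
            vec = struct.unpack(self._fmt, raw)
            scored.append((cosine(query, vec), key[len(self.PREFIX):].decode()))
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:k]]
```

The layer only needs `put` and a prefix `scan` from the store underneath it, which is the sense in which a key-value store is general enough to carry other data models on top.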
640 00:53:07,551 --> 00:53:11,988 And they have some efficient access to that data to make it work. 641 00:53:13,254 --> 00:53:16,221 Do you think it's a fad? 642 00:53:16,364 --> 00:53:18,368 AI vector databases? 643 00:53:18,988 --> 00:53:19,918 No, it's not a fad. 644 00:53:19,918 --> 00:53:27,240 I think that what will happen is eventually people will realize what the limits are. 645 00:53:27,240 --> 00:53:30,822 But they're still useful. 646 00:53:30,822 --> 00:53:33,422 They've proved their usefulness. 647 00:53:34,483 --> 00:53:36,923 And once something is useful, it's not going to go away. 648 00:53:36,923 --> 00:53:39,624 It's just that people will keep pushing it. 649 00:53:39,624 --> 00:53:43,995 When it gets to the limits, then people will move on to something else. 650 00:53:43,995 --> 00:53:45,926 Kind of like search engines, right? 651 00:53:46,658 --> 00:53:50,139 When search engines were great, people used them, they kept pushing them. 652 00:53:50,279 --> 00:54:02,472 Then they said, with search engines, we can now, Google came along with Gmail, and they came along with, and some things worked. Like, Yahoo came up with the idea of a portal, but that 653 00:54:02,472 --> 00:54:05,263 didn't really take off from a business point of view. 654 00:54:05,383 --> 00:54:13,635 But ultimately, we got to a point where people realized the limits of search engines, and then along comes AI, and that's something that, if you think about it, it's like something 655 00:54:13,635 --> 00:54:16,896 that sits on top of a search, on top of many search engines. 656 00:54:17,510 --> 00:54:18,795 Yeah, I've been thinking about that a lot. 657 00:54:18,795 --> 00:54:23,470 I think that to me, AIs are kind of a search engine 2.0, right? 658 00:54:24,082 --> 00:54:30,002 Where search engines, looking back even further, like early internet days, right?
659 00:54:30,002 --> 00:54:38,302 Like they were manually curated lists of websites that, you know, Yahoo kept and said like, here's the websites you should go to to find some information. 660 00:54:38,302 --> 00:54:42,342 And then AltaVista and stuff were like, oh, you can dynamically find this information. 661 00:54:42,342 --> 00:54:48,322 And we'll make that a little better by giving you the most reputable source for that information, right? 662 00:54:48,322 --> 00:54:49,562 That's where we kind of ended with Google. 663 00:54:49,562 --> 00:54:53,822 And I feel like the demise of Google as a search engine specifically 664 00:54:54,276 --> 00:55:02,582 made a place for AI being the next search engine, where I don't actually care what one website says about a pasta sauce recipe. 665 00:55:02,582 --> 00:55:11,879 I care about what a hundred websites say about a pasta recipe, and just give me the grouping of that thing. And right now they're obviously very wrong. 666 00:55:11,879 --> 00:55:13,589 There's been plenty of cases where 667 00:55:13,946 --> 00:55:23,915 Gemini says put glue on your pizza and do things that are completely absurd. But in general, that probably is, like, the weights for those systems are over-indexing on what they 668 00:55:23,915 --> 00:55:33,114 think has more authority. Like Reddit: Reddit has a lot of authority because there's lots of people there, but it's also very sarcastic, which is where that stuff usually comes from. And 669 00:55:33,114 --> 00:55:35,416 so I think that 670 00:55:35,663 --> 00:55:40,948 AI systems eventually become that sort of, I don't want to go to one link. 671 00:55:40,948 --> 00:55:51,078 I want to get a summary of 50, the next three pages of links, and you just tell me what they all said together in one summary and just give me the grouping of it, right?
672 00:55:51,078 --> 00:55:58,325 Like the nine out of 10 doctors recommend sort of thing, instead of going like, I'm going to pick this website for that specific thing. 673 00:55:58,325 --> 00:56:00,407 I just want to know, like, what do they generally all think? 674 00:56:00,558 --> 00:56:02,139 Sure, in fact that's exactly it. 675 00:56:02,139 --> 00:56:04,739 If you go back to, you know, you mentioned AltaVista. 676 00:56:06,140 --> 00:56:16,783 One of the problems with AltaVista, at least for me, when I switched from AltaVista to Google, was like, I'd do a search and it would come back and say, here's 60,000, you know, 677 00:56:16,983 --> 00:56:19,224 websites that answer your question. 678 00:56:19,224 --> 00:56:26,475 And Google on the other hand, because of the PageRank algorithm, you know, you probably got 10 links which were useful. 679 00:56:26,475 --> 00:56:29,018 Even the fact that Google put, like, they still have it. 680 00:56:29,018 --> 00:56:30,029 "I'm Feeling Lucky," right? 681 00:56:30,029 --> 00:56:33,383 If I go to Google, I haven't been to Google.com for so long. 682 00:56:33,383 --> 00:56:35,685 Just, yeah, the second button, "I'm Feeling Lucky." 683 00:56:35,685 --> 00:56:40,300 Like that was the unique thing of like, I will give you the top. 684 00:56:40,462 --> 00:56:49,806 It still exists on their website today, because you're right, they over-index. Where Yahoo said we can curate a bunch of lists, but that was a limitation on people managing or 685 00:56:49,806 --> 00:56:55,189 reading stuff, AltaVista came along and said, hey, our database is so big, you can't even believe it. 686 00:56:55,189 --> 00:56:59,511 We will give you a hundred thousand links for this thing. And Google said, I'm gonna give you one. Right.
687 00:56:59,511 --> 00:57:06,014 I used to Google stuff and say "I'm Feeling Lucky." Like, no one does that today. I don't know why that button still exists. But that was the selling point, was like, we could 688 00:57:06,014 --> 00:57:10,856 take all of the database, we do have a hundred thousand things, but we're gonna give you the one 689 00:57:10,876 --> 00:57:19,590 right thing. And maybe Gemini should have just been the "I'm Feeling Lucky" button all over again, right? Like, we're all back to that point of like, I don't care about the 690 00:57:19,590 --> 00:57:24,282 one link though, I care about the grouping of what all the links thought together. 691 00:57:24,504 --> 00:57:29,808 How do you know that Gemini isn't that "I'm Feeling Lucky," but they put a language thing in front of it? 692 00:57:29,808 --> 00:57:32,560 Yeah, it very much, it could be. 693 00:57:34,422 --> 00:57:36,524 Bhaskar, this has been so much fun. 694 00:57:36,524 --> 00:57:41,629 Thank you for coming on the show and talking to us about just everything. 695 00:57:41,629 --> 00:57:46,524 Like all of your experiences, what you've been doing with YottaDB, why it matters. 696 00:57:46,524 --> 00:57:54,071 Like why is it still, even in the year 2025, like why high-performance, flexible databases are just kind of important. 697 00:57:54,071 --> 00:57:55,342 That's been so much fun. 698 00:57:55,534 --> 00:57:57,117 Thank you for inviting me. 699 00:57:57,117 --> 00:57:59,669 It's been fun talking with Autumn and you. 700 00:57:59,846 --> 00:58:02,354 Yeah, thanks so much and thank you everyone for listening. 701 00:58:02,354 --> 00:58:03,962 We will talk to you again soon.
