
Google’s Efforts to Build Patient-Facing AI: A Conversation with Drs. Alan Karthikesalingam and Anil Palepu

Episode Transcript

It has been a consistent thread how people can interact with these systems.

Like even, I mean, we were talking earlier about medical imaging and I remember, you know, all this work from concept bottlenecks to kind of, like activation-based heat maps.

You know, that lovely work actually of other networks analyzing how classifiers work.

Like, why is a zebra a zebra?

Well, actually it turns out it's because it, it's activating all the weights for being stripey and there's other activations going on for being horse-like.

And that may be much more useful than a sort of blotchy, pixel-wise activation map.

And I think when you kind of move into this era that we are in now, systems which can actually go away and do work, like they can work for potentially even days before coming back to you, bringing back vast swaths of information, I think it's gonna be critical to sort of optimize and think about these things properly.

You know, again, look, maybe it's my bias as a doctor, but I think medicine is the most humane of things, right?

It's the most essential human condition, and so our ability to actually interact with these systems properly is gonna be very, very important.

Welcome to another episode of NEJM AI Grand Rounds.

I'm your co-host Raj Manrai.

Today, Andy and I are excited to have two guests on the podcast, Dr. Alan Karthikesalingam of Google DeepMind and Dr. Anil Palepu of Google.

Listeners of the podcast will remember Alan.

He's making his second appearance on AI Grand Rounds, and Anil joins us for the first time.

Together they told us about Google's Articulate Medical Intelligence Explorer, or AMIE, a name whose pronunciation varies from speaker to speaker.

AMIE is an AI system that's designed to directly interact with patients.

Alan and Anil took us through the project from ideation to building it to some of the evaluation studies that are happening today.

It was also a great chance to reflect with Alan on how AI has changed and how it hasn't since he was last on the podcast about two years ago.

Alan has this special ability to distill complex technical and clinical ideas into their essence, and he really is a wonderful example of a physician who's really become a leader in AI.

The NEJM AI Grand Rounds Podcast is brought to you by Microsoft, Viz.ai, Lyric, and Elevance Health.

We thank them for their support.

And with that, we bring you our conversation with doctors Alan Karthikesalingam and Anil Palepu.

Alright, well, Anil and Alan, welcome to AI Grand Rounds.

We're excited to have you today.

Yeah, thanks for having us.

Yeah.

Thanks for having us, Andy.

With the exception of Zak Kohane, I think that Alan is the first repeat guest that we've had on the podcast.

So great to have you back, Alan.

I think that's right.

Yeah, that's an honor.

Way to go to intimidate me at the start.

So as a result, Alan, we've actually already asked you the following question, so I'm gonna pose this question to Anil.

Anil, this is a question that we always get started with, that we ask all of our guests.

Could you tell us about the training procedure for your own neural network?

How did you get interested in artificial intelligence and what data and experiences led you to where you are today?

Yeah, I did my undergrad in 2016.

That's when I started.

At the time, I think I had very little idea of what I wanted to do.

I was interested in medicine, but I definitely did not know that I wanted to do AI.

I don't think I knew what AI was at that point.

I basically just knew that I didn't really think I could be a doctor.

I was a bit too lazy.

I didn't think I could study for the MCAT or do all the steps that are needed for that.

So, I was thinking maybe doing something else, perhaps something computational, would be a bit easier for me.

And that's kind of how I got started.

It's not like a glorious story of knowing I had a dream and reaching it, but as I started to do research, I really fell in love with the type of work we were doing.

And I found that I had been doing like a lot of projects alongside clinicians and driven by a clinical need.

And I was really excited that it felt like these people all had some serious problems.

What really struck me was everyone was super unsatisfied with the status quo, and there were so many things to improve.

And me, as an undergrad, I actually felt like I could apply the things I learned in class.

Like, me learning like about random forests two weeks ago, and then applying that to like a real problem and actually seemingly making a difference in the workflows of these clinicians was like crazy inspiring to me.

And that led me to be like, okay, I wanna take this to the next level.

I wanna do my Ph.D. and keep diving down this path.

And so I did that.

I applied for my Ph.D.

I went to MIT and had the pleasure of working with an amazing advisor there who is not paying me to say this, but he's on this call right now.

And he taught me a lot about deep learning and got me really excited, particularly in the tech space.

That's when I joined Google.

Met some amazing coworkers there, Alan included, and I'm still there today.

I keep learning from them every day, and I'm really excited about the mission we're pushing.

Cool. Before we come back to your world-class graduate training and mentorship, 'cause we're gonna come back and spend a lot of time on that.

I feel like this is gonna be a recurring theme.

Yeah, it's gonna be a recurring theme.

Yeah.

So, you're yet another person from this HST program at Harvard and MIT.

So, this isn't like something that we've planned, but it just seems to be that this program produces a lot of folks at this intersection.

However, before you got into this, could you take us back to the early days of Anil, when you were still much more pluripotent, before you had even thought about computer science and engineering?

What did you enjoy growing up and how did you end up in a biomedical engineering program in the first place?

As a kid, like I really looked up to, I think my family, particularly my brother, I have an older brother, and he always had a very strong vision of what he wanted to do.

And he knew he wanted to be a doctor and everyone could see it, he really had that drive in him.

And I was like, he knows what he's doing, he's doing all the work to figure out what's important and what to work on.

So maybe I'll do that.

And again, like I said, I think I knew what my skillset was, I knew what my limitations were, so I was like, I don't think I could actually do the work of seeing patients and have that day-to-day.

Like, I wanna sit on my couch and watch TV while I work.

That doesn't seem possible if you're seeing patients all day.

So, I found some path that was, I think very inspired by him.

I think I've also certainly changed a lot as well, especially as I've gotten deeper into the research, and I think I'm a little bit more driven at this point than when I started out.

Awesome.

Thanks for that, Anil.

So, I think we're gonna hop in and Alan maybe just quickly so that we don't shortchange you, do you wanna hop back on and just quickly introduce yourself again?

Yeah, of course.

Thanks again for having me.

So, I'm Alan.

I'm a clinician and a researcher at DeepMind, a company I joined kind of about eight years ago now.

And I've always worked on AI for biomedicine, doing stints basically at DeepMind itself, also in different teams in Google.

I'm very lucky to work with Anil for a good chunk of that on some very exciting work.

Awesome, thanks.

So now we're gonna transition and talk about some of the projects that you both have worked on together.

And I'll preface this by saying the first paper that we're gonna discuss, again, Zak Kohane mentioned this in our scotch-infused end-of-the-year round table.

Did you guys, wait, just as an aside, did you guys listen to that one?

So it's a good one.

We discussed AMIE quite a bit, so I think you'll enjoy that.

But anyway, go ahead.

Yeah.

Yeah.

And one of the things we discussed is actually the correct pronunciation.

Is it AMIE or is it AMIE?

I think this is, uh, a potato, potahto kind of thing?

Yeah, ask three doctors.

You'll get three answers.

One of the things about a transatlantic team is I think we find, like, almost every way to pronounce almost every word.

So, we'll leave that one a mystery for you.

Okay.

Well, uh, in my slightly southern twang, I'm gonna say AMIE here forward and Raj can go with the more cosmopolitan pronunciation of AMIE.

Okay, sounds good.

Um, so anyway, we had this end of the year episode with Zak, and one of the things that we do is round up papers that have been published at NEJM AI, and then we talk about papers that have been published elsewhere.

And Zak and I actually both pointed to the paper that we're gonna talk about next as one of the papers that we thought was most impactful, that was not published at NEJM AI.

And so, the paper is "Towards conversational diagnostic artificial intelligence." And instead of doing you both the disservice of butchering the explanation, setup, and context, how about we go to Anil and you give us the technical overview and the technical setup, and then we'll go back to Alan and discuss some of the clinical aspects of it.

Yeah.

So, this paper, "Towards conversational diagnostic AI": essentially what we were dealing with is that, up until this point, there was a lot of work showing that these LLM-based systems do exhibit some quality medical reasoning.

They were tested on these benchmarks like MedQA, and, you know, at least in these multiple-choice settings, they performed quite well, often at a superhuman level.

That being said, there was very little work investigating how well these systems can actually elicit this information from patients.

And so, this is a setting where, rather than having the entire picture provided upfront and an LLM simply interpreting it, in a real clinic a clinician will have to actually talk to a patient, get the information from them, do so in a way where they're exhibiting empathy and quality conversational skills for history taking, and ultimately still come up with the correct plan and differential for the patient.

So, this was what we really wanted to test.

Alan and others really set a really great vision for how we would test this, which is through something called an OSCE or Objective Structured Clinical Examination.

And so, we modeled our study around a text-based OSCE, and we tested whether the system that we developed could perform this kind of consultation with patient actors at the level of primary care physicians.

And so that's what we did for the study.

In terms of the actual modeling, at the time we were working with PaLM-based models and really the key innovation here was we were trying to simulate conversations between doctors and patients and use that to actually train a model to emulate the role of a doctor in those conversations.

Can I hop in here?

'cause I remember you actually presented this at a lab meeting one time, and there were a couple of technical aspects that I thought were just, like, both surprising and interesting.

If I remember correctly, you have this team of PaLM models, one is acting as the doctor, one's acting as the patient.

And you use this ensemble of Large Language Models to synthetically generate data, and then you train on that data and the model keeps getting better.

And I remember at the time I was deeply skeptical of the pulling-yourself-up-by-your-own-bootstraps aspect of what people are talking about in synthetic data, but this seemed to be a pretty compelling example of where you could have the model generate synthetic data and get better at the task that you're trying to get it to learn.

So, maybe could you talk a little bit about, like, that aspect?

Was that surprising to you or have you seen other examples of this sort of synthetic data trick working in practice?

Yeah, so our setup, as you described really well, is we had this multi-agent system where, essentially, we had that doctor and that patient, each played by PaLM at the time, and they were interacting with each other.

And we also had a separate agent, a moderator, which was just basically checking whether the conversation was done.

And really the critical agent was this critic agent, which, after an entire conversation was complete, would review the conversation, identify things that the doctor did right or wrong, and pass that to the doctor.

The doctor got to try the conversation again.

And in doing so, we were able to guide the conversation towards what was better doctor behavior, according to how we knew we'd be evaluated and what the optimal plan for this patient would've been.
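The multi-agent loop Anil describes, with doctor and patient agents in dialogue, a moderator checking for completion, and a critic whose feedback drives a retry, might be sketched roughly like this. It's a toy illustration with hypothetical names; `chat` stands in for any language-model call, and nothing here is from the actual AMIE codebase:

```python
# Toy sketch of a multi-agent self-play loop with a critic agent.
# All names here are hypothetical; `chat` stands in for an LLM call.

def chat(role_prompt, history):
    """Placeholder for a language-model call: returns the next utterance."""
    return f"[{role_prompt}] reply #{len(history)}"

def run_consultation(vignette, max_turns=10):
    """Doctor and patient agents converse; a moderator decides when to stop."""
    history = []
    for _ in range(max_turns):
        history.append(("doctor", chat("doctor: " + vignette, history)))
        history.append(("patient", chat("patient: " + vignette, history)))
        # The moderator agent checks whether the consultation is complete.
        if chat("moderator", history).endswith("DONE"):
            break
    return history

def self_play_iteration(vignette, n_rounds=2):
    """Generate a dialogue, have the critic review it, and retry with feedback."""
    feedback = ""
    dialogues = []
    for _ in range(n_rounds):
        dialogue = run_consultation(vignette + feedback)
        # The critic reviews the whole conversation and identifies what the
        # doctor agent did right or wrong; that feedback conditions the retry.
        feedback = " critique: " + chat("critic", dialogue)
        dialogues.append(dialogue)
    return dialogues  # later rounds would become fine-tuning material
```

In the real system the later-round dialogues would be curated into training data; here the placeholder `chat` just returns canned strings so the control flow is visible.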

The other aspect of this that I think is injecting new knowledge is the fact that our conditions, our vignettes, were actually conditioned on search.

And so essentially here we're able to bring in some kind of grounding that is able to be provided to the model during training.

I think the other important point to note is that these PaLM-based models initially just simply could not do this. If you tried to prompt them, they would not hold a conversation.

Certainly not over this many turns.

And so, I think a lot of our gains were really in being able to hold that conversation, like the style of that diagnostic dialogue.

Whereas like today, we don't actually need to always do significant post training.

Like in our most recent work, we're actually using the base model for most of the heavy lifting there.

So I think that is something that has also just changed over time in terms of necessity.

So cool.

A couple other things now that this is all loaded into my context window.

I remember the really surprising part of this is that you had all of this presumably expensive human OSCE data of actual conversations.

And one of the conclusions I remember reading in the paper was that actually the synthetic data ended up being more valuable than the human data.

Is that a correct recollection?

I guess valuable in what sense?

In that just training on the human data was insufficient.

And you really needed the synthetic data to be able to get the model to learn.

Oh yeah.

So, we had human data, like, we had transcripts which we were training on, and I think, yeah, there were many limitations with those transcripts.

I'm sure if we had, like, human data in the exact style of the kind of consultations we wanted, that would be a different story.

But what we're dealing with is very noisy transcripts that maybe didn't cover the range of conditions we wanted to cover.

And so, I think in that sense, by synthetically creating them, we're able to actually tailor what we're providing to the model.

But I think a large part of that was, A, that it was an audio transcript, and, B, that we were able to flexibly choose the set of conditions.

Yeah, to add a bit of color to this, you know, you can imagine, I suppose it's like one thing for imitation learning to have a worked example of exactly how you would want an agent to behave itself.

It's quite another thing to be doing SFT on some transcripts, which are literally just transcribed from video consultations in which the provider and the patient might exchange a joke, but about something that has visually just sort of happened, like a bird flew outside the window or something.

And there's a lot of what I just did, you know, a lot of "um" and "ahhh," this kind of thing, faithfully transcribed. So there's that, plus I think Anil makes a great point about the breadth and depth of conditions; and one of the many beautiful things about medicine is that it is essentially a long tail.

You know, even if you take common conditions, there's a kind of universe of ways in which even the most common or garden thing can actually present differently in different people.

And so, with the ability of utilizing self-play, I think one wonderful thing about that is the ability to permute those situations and to bring search into the loop, which means you're not relying on memory, on information that happened to be encoded at pre-training, and you can much more fully go beyond the experience, right, of what happens to have been written down in any one data set.

Alan, that sort of reminds me of with image-based models, the augmentations, right?

The sort of flips, and the permutations, and the rotations, and blurring, and shearing, other things that we do that are now just kind of standard practice, right?

You're sort of permuting all the ways to present information to the model and then trying to see what emerges out of it that's more stable and more robust.

Yes.

Is that, is that fair?

Yeah.

I love that analogy.
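Raj's augmentation analogy refers to standard practice in image models. A minimal sketch of the kind of geometric permutations he lists (flips and rotations on a toy 2-D "image" as a list of lists), not anything specific to the work being discussed:

```python
import random

def augment(image):
    """Apply one random geometric augmentation to a 2-D list-of-lists image."""
    choice = random.choice(["flip_h", "flip_v", "rotate"])
    if choice == "flip_h":   # mirror left-right
        return [row[::-1] for row in image]
    if choice == "flip_v":   # mirror top-bottom
        return image[::-1]
    # rotate 90 degrees clockwise
    return [list(row) for row in zip(*image[::-1])]
```

Each pass presents the same underlying content in a permuted form, which is the analogy to permuting how a clinical scenario can be presented.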

So, you know, there's been this kind of arc over the last eight years. I've always had this kind of dream of systems like AMIE, and it was one of the reasons why, baffling my family, I left a very lovely and amazing academic clinical practice, right?

And, like, a very nice lab and everything to come to DeepMind.

And you know, there has been this arc from kind of supervised learning and imitation learning of these point tasks, producing these systems that were useful for one thing but fragile and not generalizable.

And not robust.

And in the era just preceding the advent of these amazing LLMs, we were doing a lot of work trying to address those limitations through techniques like self-supervised learning especially, and we actually had a paper in Nature Medicine on the use of synthetic data in the radiology context and the dermatology context, specifically for doing exactly what you were saying: imagining, you know, those augmentation regimes, but actually grounded in the domain.

And even there again, we found some pretty great impacts, including non-trivial improvements to things like fairness in that context, and again, I think for medicine, this becomes important because of the long tail.

So, when you think about all the ways in which a condition can present and all the intersectional groups you actually have to account for, it feels to me like an almost impossible challenge to try and get that distribution represented properly under a supervised learning regime or under a regime of only real data; that is going to be difficult.

It's kind of like a car that has to learn how to drive by memorizing every possible road situation.

It just doesn't seem feasible.

And so being able to leverage self play, synthetic data regimes, the intelligent acquisition or like modification of those data, like AI, that feels like a very powerful paradigm for medicine and life sciences.

Can I ask a technical question on that, Alan?

And feel free to take a pass on this.

So, where does the information come from?

So, you have this long tail of conditions, you're pulling data out of the weights of the model, and so presumably, like, no new information is created.

I understand that, like, supervised fine tuning post-training argument where now you're getting the model to be in the right shape of things, meaning that it's having this like turn-by-turn dialogue.

But what's your sense of the, sort of, added information gain from doing this? Because in self-play, like Go, the model can actually play games that have never been played before, because you have this perfect simulator of Go; but in the case of generating synthetic data, you're drawing from the same implicit distribution.

You're just working with it in a different way.

I don't know if that's like too wonky of a question.

Uh, but yeah.

Maybe not.

I, and again, I don't know if this answer is correct.

Is there a chance that, you know, there's one distribution, which is the original pre-training corpora of these things?

I think there's potentially changes to those distributions, maybe in what happens if there's the kind of intentional multi-agent curation of new synthetic data that's like weighted differently into different distributions.

And which, you know, even with phenomena like hallucinations, maybe like a downside of that could be that, even then, impossible things can then be imagined and injected back into training.

But, uh, I dunno if you would agree with that?

Yeah, I would tend to agree.

I think, you know, I don't know how much of this you could say is already in distribution, but we are intentionally crafting data, and rating that data with our critic, and so on, in order to optimize it for a particular use case.

And so, I think that certainly does significantly change the behavior. Whether that could be achieved by some crazy amount of prompting on the original model, I'm not sure, but I think it's a bit doubtful.

Awesome.

So maybe I can transition a little bit and I'll direct this next question to Alan.

So you know, Alan, you said you had this dream.

I think it's pretty inspiring, right?

That you've imagined something like AMIE for some time and you're choosing, I think, a hard problem, which is directly interacting with patients, right?

Directly, sort of, interacting with patients, trying to elicit information, doing that safely, doing that robustly; and that's a hard problem from several perspectives, both in trying to do this realistically on the sort of technical side, but then even from a safety side, also running a real study where the AI is gonna interact with patients.

And what I wanna ask is, what's your approach to sort of selecting clinical problems?

There are so many possible problems that you could select, right from the very first, I'd say, kind of split.

You could have decided to do doctor-facing AI with all of the work from your team, or you could have decided to sort of focus, which it seems like you are now, on kind of patient-facing, patient-interacting AI.

And maybe you'll push back and say that AMIE is actually both, but I'm curious, first, about how you just think, from sort of a high level, about what the clinically interesting problems are for you to solve, and, with all the potential different areas where you could apply AI from your team at Google, how you chose AMIE and this set of problems.

And then afterwards, I'll just telegraph a little bit, I want to dig into some of the specialist applications that you guys have also been highlighting recently.

Yeah, of course.

I mean, look, you know, I think it's an incredible privilege, right? To be on a team like at Google and DeepMind, getting to work with people like Anil in, effectively, kind of a frontier lab that is attempting to really solve intelligence and use that to benefit humanity.

And I think that this kind of environment is very inspiring for thinking in first principles, right?

And so, I think something I've always tried to do is think about the first principles of what really medical intelligence and biomedical intelligence actually comprises.

And hopefully, you can probably see that in the thread of the work: in things like Med-PaLM, really from very first principles, we were thinking about what it takes to be useful in this domain, to have applications in this domain.

A very foundational question is what knowledge is encoded in these systems?

We then went on this journey of trying to rigorously break that down into sort of testable forms that we can subject to empirical validation and trying to do this kind of innovation like boldly, but also responsibly, right?

And doing it in ways where we build upon things that appropriately respect the extensive expertise that exists in this domain.

Like, I think the other amazing thing about medicine is how universal it is and how well studied it is.

And I actually also believe that, among the people who practice medicine, you'll find every kind of talent in the medical profession.

And then, in terms of the people it affects, it's all of us.

Like we will all come into contact with that at some point in our lives, and that means there's this incredible diversity of literature you can draw upon.

But when you move beyond kind of assessing the knowledge in these systems, I think then one of the things here, and you see this both with our work on AMIE and our work on co-scientists alike, is these kinds of very foundational questions.

In the setting of AMIE, it's about trying to understand how that knowledge can then be used empathically, conversationally, and to what end; as you said, we've shown work on AMIE both as regards the use of research systems like that in the hands of clinicians, and potentially also the use of systems like that in terms of how they might interact directly with patients, in more than one setting.

And so, for us, I think it's more about the first principles thinking.

Like, as Anil said, if you're then looking at medical conversation, it's then very natural to try and understand how can you characterize that?

And how can you conceptualize what quality is in that domain?

And luckily there exist many, many years of really rigorous thinking about that, both from the perspective of medical quality.

Right?

And, okay.

The OSCE being one particular construct.

I mean, I can't tell you how many times I've sort of nervously shuffled around a sports hall between patient actors ready to show my skills and show what I can do.

But, um, that's just one part of it.

It's not just the kind of acquisition of medical information, the right history-taking skills, and the handling of uncertainty.

It's these other attributes of like building rapport, trust, and there's very good literature on that.

We found very good ways of actually trying to characterize that.

And then I think some of the creativity is how to adapt that to the technology setting we were in.

And obviously there's a lot of limitations, right?

You know, particularly to this paper we're discussing, which was set in the PaLM model era.

So, grounded in text, right?

Exactly.

That's, that's, that was my question.

So, just to flesh that out.

So, this is a chat bot, right?

That you're typing and interacting with?

For that first paper.

That was one of the constraints we adopted and we did so for practical reasons, right?

Like, at that moment in time, I think maybe millions and millions of people were already interacting with commercially available large language models.

But they were doing so by typing and these were at that time text-only systems.

And so, we tried to sort of adapt these well-known frameworks like OSCE to that technology setting.

And that introduces obviously some very important limitations.

It means that the work has to be interpreted with a lot of caution, and hype can also sometimes end up doing more harm than the scientific findings do good.

So, we also try to go into that in a lot of depth in the paper and think about that very rigorously, because again, considering those things carefully from first principles can also be a very good way to open up the most important next research directions that need to be looked into.

Do you see the next few years as having a big focus or big emphasis on human computer interaction?

How you present information to patients and doctors?

Are these topics that you see your team studying or that, you know, you see the field moving towards, but something that is not necessarily your focus with AMIE?

I think if you care about AI in medicine, and if you look back at the last decade of work in this field, it's been a consistent thread how people can interact with these systems.

Like even, I mean, we were talking earlier about medical imaging, and I remember, you know, all this work from concept bottlenecks to kind of, like, activation-based heat maps and do those work or do those not work?

And you know, that lovely work actually of, like, other networks analyzing how classifiers work.

Like why is a zebra a zebra?

Well, actually it turns out it's because it's activating all the weights for being stripy and there's other activations going on for being horse-like.

And that may be much more useful than a sort of blotchy sort of pixel-wise activation map.
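The zebra example Alan gives is concept-level attribution, in the spirit of concept-activation methods such as TCAV. Here is a toy sketch of the idea; the vectors, concept names, and numbers below are invented purely for illustration, not taken from any of the work discussed:

```python
# Toy concept-level attribution: project an activation vector onto named
# concept directions. All vectors and numbers are made up for illustration.

def concept_scores(activation, concepts):
    """Return how strongly an activation aligns with each concept vector."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return {name: dot(activation, vec) for name, vec in concepts.items()}

zebra_activation = [0.9, 0.8, 0.1]  # hypothetical hidden-layer activation
concepts = {
    "stripey":    [1.0, 0.0, 0.0],
    "horse-like": [0.0, 1.0, 0.0],
    "blotchy":    [0.0, 0.0, 1.0],
}
scores = concept_scores(zebra_activation, concepts)
# Here the "zebra" decision is explained mostly by the stripey and
# horse-like concepts rather than the blotchy one.
```

The appeal is exactly what Alan describes: a ranked list of human-readable concepts can be more useful than a blotchy pixel-wise heat map.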

And I think when you kind of move into this era that we are in now, of interactive, very powerful, conversational and multimodal systems, but systems which can actually sometimes go away and do work, like they can work for potentially hours, potentially even days, before coming back to you, bringing back vast swaths of information.

I think it's gonna be critical to optimize and think about these things properly.

Again, look, maybe it's my bias as a doctor, but I think medicine is the most humane of things, right?

It's the most essential human condition.

And so, our ability to actually interact with these systems properly is gonna be very, very important.

Both for the good, right, in terms of getting the most out of them and fulfilling their undoubted potential, but also to mitigate risks which could otherwise be very significant indeed.

Alright, and maybe I can just get one last question.

Alan, I don't think I asked you this last time you were on, but, correct me if I'm wrong here, by clinical training you're a surgeon, right?

You were or you are a surgeon.

You're a practicing surgeon.

That's the one.

Yeah.

Yeah.

So, can I just get some last thoughts here, before we move to the lightning round, on AI's impact for surgery, outside of AMIE and internal medicine and all these other contexts in which we typically discuss it?

How do you see AI changing surgery in the next few years?

Yeah, I think it already is, actually. Like, one way to look at this is preoperative, intraoperative, postoperative.

There are probably many other ways, and foundational understanding of the conditions themselves.

If you think about the process of working people up for operations, depending on the specialty, there's immense opportunity there for multimodal analysis, patient selection, operative selection, operative technique.

The tools even that are used in theater.

These things can be revolutionized.

Like, in fact, some of my research before I came to DeepMind was in the setting of vascular surgery, with the idea of personalized stent grafts: 3D modeling of CT scans, and the design and durability of devices, right, that can seal off an aneurysm.

But you can imagine how much more personal and how much more precise some of that work is gonna become intraoperatively.

I think there's amazing potential.

There's not only potential actually for guiding operations, robotic surgery, that kind of thing, but even in terms of medical education, understanding what's going on with the procedure, quality control of the operation.

We've had systems for a while that can do operative phase identification and things like that.

But the way I think that many surgeons still train is like with logbooks, writing down the operations.

The whole field of the learning health care system, quality improvement, understanding granularly what's going right and what's going wrong: if there's a digital exhaust of sort of rich video and biomedical data, this entire thing can probably end up being much, much better, much, much more joyful end-to-end. And then postoperatively.

The way in which we track and support patients, like, getting better at home; understanding better how to do surveillance for certain kinds of operative outcomes and reintervention; or in the setting of surgical fields like cancer survivorship and things like that.

There's, like, a raft of ways. In fact, I would ask the opposite question.

Is there an aspect of surgical practice or of medical practice that we can't imagine this incredible tool being useful for?

Terrific.

Thank you.

Thank you.

That was a great answer.

Alright, Andy, are we ready for the lightning round?

We are ready for the lightning round.

So, the way that we're gonna do this is that I'm gonna ask one question to Anil, Raj is gonna ask one to Alan, and we'll just go back and forth like that.

So, Anil, first question.

You mentioned in the introduction that your brother is a role model for you and that you look up to him.

He's also a doctor.

Uh, what kind of doctor is he?

Can you remind me?

He's a pediatric ICU doctor.

That's what I thought.

So, my question for you is, if you have a medical question right now, do you go to AMIE first or do you go to your brother?

Uh hmm.

I would say I probably go to my partner who is also a doctor.

Oh, secret answer number three.

Okay.

Also very, uh, politically savvy answer there.

So, bonus points for that.

Alright, this one's for Alan.

Alan, if you weren't in medicine, what job would you be doing?

Oh, my goodness.

That's an incredible question.

I've sort of never really thought about that.

That is a very difficult question. It would probably be something in science for human benefit.

Like, I, you know, maybe it would be something in, in kind of energy or, or climate, uh, kind of science, something like that.

And it's kind of a boring answer.

I love the idea of like, using the scientific method to try and make life a bit better for other people.

And so if you don't allow me to do that through biomedicine, then I'll find some other way.

Terrific.

Great.

Uh, not, not a boring answer.

We'll, we'll take it.

Not at all.

Full disclaimer, Anil, Raj wrote this question.

Uh oh.

No, no.

Actually, my wife wrote this question yesterday, just after we put the kids down.

Yeah, go ahead, Andy.

Anil, what was the best piece of advice your Ph.D.

advisor ever gave you?

Uh hmm.

If you can't come up with anything, we're gonna have this one.

This one's real, real tough.

I'm really, really struggling to think of, uh, anything.

No.

Uh, yeah, Mike, don't, don't edit the silence out of this one.

Yeah.

The best piece of advice.

It's hard to pick.

I mean, I feel like, uh, there were so many, probably so many pieces of advice.

I think, honestly, maybe not advice, but what you've done in terms of pushing me to even take this internship and really thinking about what the best opportunity for growth was.

I think that mindset of looking for the most exciting opportunity and trying to learn the new thing and challenge myself is something that you've definitely taught me and as, as a general skill.

And so, I think that that is something that I'm really thankful for.

Yeah, thanks a lot, and thanks Rachna, for that question.

Yeah.

Nice.

Alright, the next one is for Alan.

Will AI and medicine be driven more by computer scientists or by clinicians?

Uh, it'll be driven by both.

It'll be driven by both together, and patients.

Alright, excellent.

Um, Anil, if you could have dinner with one person, dead or alive, who would it be?

I think probably my grandpa, who passed away when I was younger and I think he just really would like to see the person I grew up to be.

And I think he'd be very surprised, honestly.

Great answer.

Last question of the lightning round for Alan.

Alan, do you think things created by artificial intelligence can be considered art?

Uh, wow.

That is, like, a yes, but I'm very ignorant on this.

I'm not, I'm not a good person to ask.

We'll, we'll take it.

You guys passed the lightning round.

Well done.

Congratulations.

Yeah.

Yeah.

Alright, so for the last couple of questions, we wanna zoom out and ask some bigger picture ones.

I feel like Alan is still lost in thought.

Trying to think about if AI can be art.

I see he's looking.

I think, I think Alan is thinking about his alternative career outside of medicine.

Yeah.

Alright, so we're gonna ask one of these to each of you.

We'll start with Anil first.

Anil, how do you see AMIE being used in five years and when do you think most patients' first interaction with the health care system will be with a system like this?

Yeah, I mean, I think in five years, judging by the pace at which I've seen things improve, both in terms of the base model as well as the work that we and many others are doing in this kind of space.

I feel pretty optimistic that in five years we'll be at the level, technologically, where we would have a system that can interact with patients, provide good guidance to them, and be used probably in conjunction with overseeing clinicians. Maybe it's a setting where this kind of system would perform the intake and provide a plan or something like that.

But then a clinician would be there on the end to click a checkbox and say like, hey, I agree with, you know, AMIE's plan or whatever.

I will say, you know, of course, like our research with AMIE is research and I think it's an important distinction to make.

We're not building a product here.

But I imagine that's totally possible.

I think in five years perhaps we might even be at the point where it's more safe to actually have a system that enables access to this kind of expertise.

You know, there is like such a lack of access to high quality expertise globally.

And so, I think we might be at a stage where it's probably even better to have a semi-autonomous system like this able to interact with patients.

And so, I don't know how this works out, from a regulatory perspective and like logistically, of course there's, many barriers to work through, but I think at least like technologically, we would be at that stage where it's totally possible.

One quick follow up.

What do you think AMIE in 2030 will have, or what will it be able to do, that AMIE in 2025 cannot?

In 2030, I certainly think this will be a system that can operate in very natural conversations.

So certainly, I don't think it'll be like a text-based conversation.

I think that it will be able to very seamlessly look up the most recent guidance in, you know, any kind of specialty case.

The advantage of AI, right, it doesn't need to like study one specialty.

It can kind of have expertise in everything, and all that knowledge is, you know, complementary.

And I think the system really can perform most roles that don't require any particular physical intervention, which it can obviously recommend.

And hopefully at that point, such a system would be pretty well validated with real patients as opposed to, you know, simulated settings.

Cool.

Awesome.

Alright, this is our last question and I wanna, I wanna pose this one to Alan.

So Alan, we had you on the podcast before, and I think this was a few years ago now, maybe a little over two years ago.

And we spoke about Med-PaLM. I think you guys had pre-printed it, and we were maybe speaking right in between the pre-print and when the paper was eventually published in the journal.

The question is, what do you think has been the biggest change or maybe what's been most surprising for you to see?

In the last two years since you were last on the podcast, and then maybe also the sort of flip, the inverse of that question, what are you most surprised that hasn't really changed since May of 2023?

I think on the first one, we all knew that, you know, pace was gonna be rapid, in this field.

But still, even for me, the speed of improvement in so many different ways of these base models, particularly now being in this Gemini era of natively multimodal models, has been very surprising. And it's a big shift from some of the topics we've been talking about today, like specializing models, which is much less relevant now.

The other half of the question was what, what's the same?

What are you most, yeah, yeah, exactly.

Or what are you most surprised by being the same, or not having really changed over the last two years?

I think sometimes in the advent of new technology like this, it can be quite important how the ecosystem comes together around that.

I feel that it's still taking some time, actually.

Those who develop the technology are only one small part of this.

There are entire ecosystems who have a very big role to play in terms of how systems like this are best evaluated, how they're best benchmarked, what the most important use cases are, and how to characterize those properly.

And, you know, look, maybe it's a bit harsh to say that in two years not much has changed there, but I think there's a lot of room for growth there, for technology and technologists to be led by the domains themselves.

Terrific.

Alright.

Thank you both for being on AI Grand Rounds.

That was, that was great.

Yeah.

Thank you both.

Yep.

Thanks for having us.

Thanks.

Thanks so much.

This copyrighted podcast from the Massachusetts Medical Society may not be reproduced, distributed, or used for commercial purposes without prior written permission of the Massachusetts Medical Society.

For information on reusing NEJM Group podcasts, please visit the permission and licensing page at the NEJM website.
