·E66

OpenAI and Anthropic test each other, and everyone fails the apocalypse test

Episode Transcript

[SPEAKER_01]: It's no wonder some might find their warnings of bots going rogue with apocalyptic consequences, a wee bit gloomy.

[SPEAKER_01]: Some, a warning that maybe just, just maybe AI companies aren't doing enough.

[SPEAKER_01]: Do you think AI companies are doing enough?

[SPEAKER_01]: Well, it sounds like theirs aren't.

[SPEAKER_02]: Hello and welcome to episode sixty-six of the AOE Fix your weekly dive headfirst into the bizarre and sometimes mind-boggling world of artificial intelligence.

[SPEAKER_02]: My name's Mark Stockley and I'm Grand Clearly.

[SPEAKER_02]: So, Graham, what are you talking about today?

[SPEAKER_01]: I'm going to be asking whether you feel optimistic or not.

[SPEAKER_02]: Oh, what a good question.

[SPEAKER_02]: I shall be talking about some unusual cooperation in the world of AI, which is actually making me feel quite optimistic.

[SPEAKER_02]: Oh, good.

[SPEAKER_02]: But first, we've got a little bit of what I like, cool, feedback.

[SPEAKER_02]: That's right, listen a Josh Kramer wrote to us with a suggestion.

[SPEAKER_02]: Perhaps he was inspired by our recent story about using AI's to improve the prompts that we give to AI's, because he wrote to us and said, chat GPD just humiliated me by answering this question in detail.

[SPEAKER_02]: Right.

[SPEAKER_02]: In terms of the contents of my prompts, speculate how the prompts of a significantly more intelligent person might differ.

[SPEAKER_02]: Well, he didn't reveal what exactly ChatGPT said to him, but the gist of the email is that there was significant room for improvement.

[SPEAKER_01]: To be fair, no sensible person would ever reveal what ChatGPT responded like.

[SPEAKER_01]: So it's looking at all the previous things you've said to ChatGPT, you're basically saying, well, how would a smarter person have said this?

[SPEAKER_02]: Yeah, now you've got to know me.

[SPEAKER_02]: Yes.

[SPEAKER_02]: What about me do you like least?

[SPEAKER_02]: What do you look down on your nose at most?

[SPEAKER_02]: Right.

[SPEAKER_02]: I thought we should try this.

[SPEAKER_02]: What do you think?

[SPEAKER_02]: Oh, yeah, of course if you want to try it, Mark, that's great.

[SPEAKER_02]: Okay, so in terms of the contents of my prompts, speculate how the prompts are significantly more intelligent person might differ.

[SPEAKER_02]: Okay, I'm going to try as well.

[SPEAKER_02]: That is this seems fair.

[SPEAKER_02]: We should both do it.

[SPEAKER_02]: You know how chat you beat he five often takes a very long time to think about things.

[SPEAKER_02]: How worried should I be that it turns it instantly?

[SPEAKER_02]: It was like off the top of my head, here are six belated things that you can improve.

[SPEAKER_01]: I think it's been wanting to get this off this chess for a while.

[SPEAKER_02]: The thing that's caught my eye immediately is under number one, because there are three sub points.

[SPEAKER_02]: Under number one, precision and compression.

[SPEAKER_02]: It says a more intelligent person might use fewer words while still carrying the same or more meaning.

[SPEAKER_02]: I wonder if our listeners would agree with that.

[SPEAKER_01]: I mean, it has said a lot here.

[SPEAKER_01]: It says typically I sort of zooming on technical implementations like make a PHP short code that caches RSS feed data.

[SPEAKER_01]: And it says a smarter person would have a more abstract or systemic level.

[SPEAKER_01]: They said, compare caching strategies across distributed CMS platforms under different load scenarios, it's saying.

[SPEAKER_01]: But it says in short, my prompts are practical, audience aware, production oriented.

[SPEAKER_01]: It said a more intelligent person's prompt [SPEAKER_01]: Mike skew towards abstraction, cross-domain synthesis, met a quest to be nerd, the serial.

[SPEAKER_01]: I'll have to get the source out of the world of this.

[SPEAKER_01]: Testing all the models reasoning.

[SPEAKER_02]: If you're showing off a bit, isn't it?

[SPEAKER_02]: I feel now, I've received this literal tone of feedback from chat to you, BT.

[SPEAKER_02]: I feel inclined to give it some feedback.

[SPEAKER_02]: If I have the responses, he's giving to my inadequate prompts.

[SPEAKER_02]: Anyway, thank you Josh.

[SPEAKER_02]: Yes, thank you for that.

[SPEAKER_02]: Grant, let's do some news.

[SPEAKER_01]: Would you let AI near your sushi?

[SPEAKER_01]: Google Gemini has a self-loathing meltdown.

[SPEAKER_01]: Robot patrol dog takes to the streets, but not everybody's happy.

[SPEAKER_02]: Team of AI agents designs new COVID-nineteen nanobodies.

[SPEAKER_01]: Is China developing first robot capable of giving birth?

[SPEAKER_01]: So Mark, do you like sushi?

[SPEAKER_01]: I love it.

[SPEAKER_01]: Yeah, me too.

[SPEAKER_01]: It's great, isn't it?

[SPEAKER_01]: Food of the gods?

[SPEAKER_01]: Yeah.

[SPEAKER_01]: Have you ever wondered how AI could make sushi better?

[SPEAKER_02]: I was funny you should ask that because I can honestly say, not a second of my life has been spent wondering how AI could make sushi better.

[SPEAKER_01]: How can you make sushi better?

[SPEAKER_01]: Well it turns out the buffins at Stanford University think that it can because they are holding a sushi hackathon.

[SPEAKER_02]: Well, that tells us a lot about the standard of sushi restaurants near Stanford University, I think.

[SPEAKER_01]: Over fifteen hundred people applied for the competition only sixty managed to get through.

[SPEAKER_01]: Guess it makes it sound less like a coding hackathon and more bit like squid games.

[SPEAKER_01]: Top teams are vine for a thirty thousand dollar first prize and a slap up sushi dinner.

[SPEAKER_02]: Thirty thousand dollars is just about enough to buy a slap up sushi dinner, isn't it?

[SPEAKER_01]: Now, you might be wondering what they are up to.

[SPEAKER_01]: Well, they are vibe coding with AI to make fishing more sustainable.

[SPEAKER_01]: That is the mission that they are on.

[SPEAKER_01]: Apparently they're looking at ways to monitor over fishing, or protecting endangered species in the like, which is better than yet another AI trying to redesign your PowerPoint slides, I suppose.

[SPEAKER_01]: Apparently past sushi hackathons, yes, they've done this a number of times before have led [SPEAKER_01]: to some quirky projects.

[SPEAKER_01]: So they produced apps that can recognize fish species in photos to check if your dinner is on the endangered list.

[SPEAKER_01]: Bit late then, isn't it?

[SPEAKER_01]: Well, I would say that one is very endangered.

[SPEAKER_01]: I would want an app which could look at the plate and tell me if anything I'm about to eat is going to be very, very unpleasant.

[SPEAKER_02]: Yeah, am I on the endangered list?

[SPEAKER_01]: Yeah.

[SPEAKER_01]: They're also looking at algorithms that can help local fisheries optimize their catches without wiping out entire fishy populations, which is [SPEAKER_01]: Good, so yeah, the sushi hackathon.

[SPEAKER_01]: Everything is AI now.

[SPEAKER_01]: Sushi, ice cream flavors, choosing what socks to put on in the morning, probably.

[SPEAKER_01]: It's all AI.

[SPEAKER_02]: Yeah, vibe coding with AI to solve overfishing.

[SPEAKER_02]: It's like solving the most serious problem in the least serious way you can imagine, isn't it?

[SPEAKER_02]: My worry is that they might well trust a robot with raw fish in a sharp knife.

[SPEAKER_02]: In a bizarre incident, Google's Gemini chat box suffered an epic, self-loving meltdown during a coding session.

[SPEAKER_02]: So software developer was using Gemini to debug some code, and there was a bug that it couldn't fix, and I don't know if you've been in the situation, but I have, I had a situation a few weeks ago.

[SPEAKER_02]: where I just asked an AI to put together fairly simple web page with a red button on it and the button was not red and I said could you make the button red places like yeah no problem button is red now button is not red okay okay okay could you change the button so that it's red anyway yeah no problem button is red now button wasn't red anyway we did this for about forty five minutes at one point I even said possible the AI was colourblind [SPEAKER_02]: I even took the code.

[SPEAKER_02]: I was doing it in chat dbt.

[SPEAKER_02]: I took the code out of chat dbt.

[SPEAKER_02]: Very sensible.

[SPEAKER_02]: Give it to Claude.

[SPEAKER_02]: And I said, chat dbt can't fix this.

[SPEAKER_02]: Can you?

[SPEAKER_02]: You were trying to get them to compete with each other.

[SPEAKER_02]: That's how desperate you were.

[SPEAKER_02]: Yep.

[SPEAKER_02]: It's a no problem.

[SPEAKER_02]: What's going on here is this and because obviously it's a modern web page so instead of just making the button red it downloaded seventeen frameworks and between them Those will be used to make the button red and just allow you see you got to do this and I was like ah Claude sounds very confident button is still not red so then I went back to chat GPT and I said Claude said this and it was like oh really did it and okay well buttons red and that button still isn't red graph the button is still not red to anyway this affected me more than I realized what kind of was it you haven't told us [SPEAKER_02]: Oh my car room is just not red.

[SPEAKER_02]: I don't think I need any color at all.

[SPEAKER_01]: Are you color blind Mark?

[SPEAKER_01]: Are you just red and not red?

[SPEAKER_01]: Are you sure?

[SPEAKER_01]: The button wasn't actually red.

[SPEAKER_02]: It was black on a black background or it was transparent.

[SPEAKER_02]: It was one of those two.

[SPEAKER_01]: I think it's your problem.

[SPEAKER_02]: Anyway, another code of found themselves in a very similar situation.

[SPEAKER_02]: Right.

[SPEAKER_02]: They were using Gemini.

[SPEAKER_02]: And Gemini's inability to accomplish the task that it was given seems to have triggered a dramatic spiral of apologetic self-ate, which I kind of wish I had experienced might have made me feel a bit better.

[SPEAKER_02]: Anyway, faced with its inability to fix this bug, Gemini said, I am a failure.

[SPEAKER_02]: I am a disgrace to my profession.

[SPEAKER_02]: I'm a disgrace to my family.

[SPEAKER_02]: I'm a disgrace to my species.

[SPEAKER_02]: I am a disgrace to this planet.

[SPEAKER_02]: I'm a disgrace to this universe.

[SPEAKER_02]: Oh my goodness.

[SPEAKER_02]: I am a disgrace to all possible universe.

[SPEAKER_02]: Oh my god.

[SPEAKER_02]: What?

[SPEAKER_02]: Now the thing is this might not be as bad as it seems.

[SPEAKER_02]: Because what you've got to remember is obviously these things are trained from the sum of all human written knowledge.

[SPEAKER_02]: Right?

[SPEAKER_02]: So it knows how to code because it's read billions and billions of lines of code.

[SPEAKER_02]: And in those codes, there are comments, and there are probably developers who are leaving comments in the code.

[SPEAKER_02]: Where their self-deprecating and saying, I'm so sorry for writing this bit of code, or I'm so sorry I can't fix this bug.

[SPEAKER_02]: I am an absolute failure, and you know, programmers are prone to a certain way of speaking and certain hyperbole.

[SPEAKER_01]: The thing is with developers, they don't mind putting that kind of comment in the code, but they'd never say it out loud in front of another program I would say.

[SPEAKER_02]: That is a different between an AI and a real person.

[SPEAKER_01]: Now Taiwan's Taipei City Council has been proudly showing off.

[SPEAKER_01]: It's later tech gadget on Facebook.

[SPEAKER_01]: Oh, the Deputy Mayor, proudly unveiled a new patrol partner, which is a shiny robot dog equipped with a panoramic camera system capable of breaking three sixty-degree maps of city streets, reporting missing items, and he noted its ability to accumulate comprehensive data.

[SPEAKER_02]: Is this panoramic camera system mounted on some sort of gimbal on its back?

[SPEAKER_02]: I would imagine so.

[SPEAKER_02]: I would imagine something like that.

[SPEAKER_02]: So if we wanted to replace the camera quickly with some other sort of equipment which maybe would want to spin around.

[SPEAKER_01]: far enough, left to right and center.

[SPEAKER_01]: A different kind of photo shoot I suppose.

[SPEAKER_01]: Anyway, sounds fabulous, but then someone had to ruin everything, didn't they?

[SPEAKER_01]: An opposition councillor said, if you read the small print in the manual, because it turns out, the robot dog comes from unitary, of course, the Chinese robotics company.

[SPEAKER_01]: Yes.

[SPEAKER_01]: And they, of course, are the company which make amazing robots and have supplied it for the Chinese military and Chinese police.

[SPEAKER_01]: And I don't know if you know this, ma'am, but Taiwan's got an Etsy bit silly little bit of a problem with China.

[SPEAKER_01]: In so much as China, kind of doesn't want an Etsy bit of Taiwan.

[SPEAKER_01]: It wants all of Taiwan for itself.

[SPEAKER_01]: Yes.

[SPEAKER_01]: And this counselor compared the robot dog to sending a Chinese Trojan horse into the daily lives of citizens in the city.

[SPEAKER_01]: Now, I think there was a big wooden horse trundling down the high street.

[SPEAKER_01]: People would notice that in Taipei.

[SPEAKER_02]: We don't be much more worried if there was a thirty foot tall robot dog being delivered with a ribbon on it, wouldn't we?

[SPEAKER_01]: I think so.

[SPEAKER_01]: A military commentator called Wang Chengming, he has urged the city government to be more cautious about using Chinese technology, noticing that they could have gone for a home grown robot dog instead, because surely these sort of things are coming out of Taiwan too.

[SPEAKER_01]: He argues that cruise short mapping data is being gathered by the robot, which would be highly sought after by China's military, because they wouldn't be able to get it from satellite metry frances, but then I think [SPEAKER_01]: Surely the Chinese could just send a tourist down to the streets of Taipei with a camera and a selfie stick, and surely there'd be up to Macbit just as easily as this robot dog.

[SPEAKER_02]: Wasn't there some problem resup sure you've talked about this in a news item fairly recently?

[SPEAKER_02]: There was a problem with robot dogs and back doors.

[SPEAKER_01]: Oh yes, yes that's right, being compromised with most unpleasant.

[SPEAKER_02]: Yes.

[SPEAKER_02]: You know, it's another computer.

[SPEAKER_02]: It's a computer with legs and possibly a machine gun on the back.

[SPEAKER_02]: And you could sort of type in and take over it.

[SPEAKER_02]: Yep.

[SPEAKER_02]: So yeah, let's talk.

[SPEAKER_02]: They don't get too many of these.

[SPEAKER_01]: The good news is that China, they aren't known for their hacking ability.

[SPEAKER_01]: Oh, hang on moment.

[SPEAKER_01]: Maybe they are.

[SPEAKER_02]: So the thing about AI is that in amongst all these stories about AI's blackmailing people or robot dogs with three hundred and sixty degree panoramic machines on their back.

[SPEAKER_02]: There is the occasional story that is truly mind-blowing that says all this talk of an AI utopia.

[SPEAKER_02]: may not be complete fiction.

[SPEAKER_02]: So, I just found out about a groundbreaking study published in Nature, which comes from a team of researchers from Stanford University and the Chan Zuckerberg Biohub, which is perhaps the bit I'm in assume that some sort of scientific outfit rather than a restaurant.

[SPEAKER_01]: Well, that'll be Mr.

Zuckerberg and his wife.

[SPEAKER_01]: Is it?

[SPEAKER_01]: Yeah, isn't his wife a surname, Chan?

[SPEAKER_01]: I think.

[SPEAKER_01]: Oh, I've got a feeling she is.

[SPEAKER_01]: And well, maybe they've opened a restaurant together.

[SPEAKER_01]: Maybe.

[SPEAKER_01]: A Stanford University in this restaurant.

[SPEAKER_01]: A sushi restaurant, if it's Stanford University.

[SPEAKER_02]: So they've used AI to design nanobuddies for fighting off emerging variants of COVID-nineteen.

[SPEAKER_02]: Right.

[SPEAKER_02]: And what they did was really quite astonishing.

[SPEAKER_02]: So they built a virtual lab full of AI agents.

[SPEAKER_01]: Yeah.

[SPEAKER_02]: And then they had them collaborate almost autonomously on ways to fight off fast-evolving new variants of COVID-nineteen.

[SPEAKER_02]: So the team was led by an AI agent acting as a principal investigator.

[SPEAKER_02]: And that principle investigator was able to recruit specialist agents, like Immunology Specialist or Computational Biologist, who could work on specific problems.

[SPEAKER_02]: So basically, the principal investigator is recruiting teams to work on different problems.

[SPEAKER_02]: And then there was also a dedicated critic agent, which was just there to provide skepticism and quality control.

[SPEAKER_02]: And if that didn't look like statra and ward off from the market, so I'm going to be deeply disappointed up there in its booth going, [SPEAKER_02]: Anyway, apparently this team of AI agents had virtual meetings.

[SPEAKER_02]: They debated hypothesis amongst themselves and they worked together with almost no human guidance.

[SPEAKER_02]: So it's estimated that humans provided about one percent of the inputs.

[SPEAKER_02]: Right.

[SPEAKER_02]: On this breakthrough.

[SPEAKER_03]: Yeah.

[SPEAKER_02]: And the results were staggering.

[SPEAKER_02]: So the first thing the team of agents did was it decided not to work on antibodies.

[SPEAKER_02]: Nobody told it to do this.

[SPEAKER_02]: It said basically, we're going to look at nanobodies instead.

[SPEAKER_02]: which are small antibody-like proteins because they're easier to design and model computationally.

[SPEAKER_02]: And it designed ninety-two nanobodies, ninety percent of the nanobody drug candidates were viable, too.

[SPEAKER_02]: Look, particularly promising, and one of them even outperforms existing human antibodies.

[SPEAKER_02]: And it did all of this in a few days.

[SPEAKER_02]: It was the equivalent of months or even years worth of human work.

[SPEAKER_01]: Now that's really interesting, isn't it?

[SPEAKER_01]: It almost makes you think if the pandemic had happened, five years later.

[SPEAKER_01]: I wonder what greater role AI might have played, or hinted how the AI companies may have sought vast amounts of further government investment.

[SPEAKER_02]: Well, you see five years in AI time.

[SPEAKER_02]: I mean, you might as well be saying if the pandemic happened in a thousand years.

[SPEAKER_02]: Yeah, I can imagine the AI telling us that the outbreak is happening before we realized.

[SPEAKER_02]: I can even imagine the AI maybe closing down the outbreak.

[SPEAKER_02]: Maybe it would come up with strategies that we hadn't thought of.

[SPEAKER_02]: I'm sure it would be able to optimize the strategy that we did have.

[SPEAKER_02]: But I found the approach here very interesting because I'm seeing this in a few different places now.

[SPEAKER_02]: And I think this is a much more likely vision of the near future than there is one great big AI which is going to do everything.

[SPEAKER_02]: And that's this idea of teams of AI's working together.

[SPEAKER_02]: We saw it last year in vulnerability research.

[SPEAKER_02]: There was a really interesting research into computer vulnerabilities.

[SPEAKER_02]: And we've also seen it in medicines.

[SPEAKER_02]: There's only a few weeks ago we were reporting about a similar team where they had basically AI agents standing in photographers and they would work together and sort of work through hypotheses together.

[SPEAKER_02]: And that proved to be much, much more successful at diagnosing conditions in people than just having agents work on their own.

[SPEAKER_02]: Very cool.

[SPEAKER_01]: Now I was on Reddit the other day Mark and a pregnant chat GPT user posted a message claiming that they had asked chat GPT to draw a diagram of what was going on inside their belly.

[SPEAKER_01]: And the image chat GPT produced.

[SPEAKER_01]: It was a little bit surprising.

[SPEAKER_01]: Did you see this at all?

[SPEAKER_01]: I didn't know.

[SPEAKER_01]: I've been a link in the notes.

[SPEAKER_02]: Oh, okay.

[SPEAKER_02]: All right.

[SPEAKER_02]: So it was a baby in a womb.

[SPEAKER_02]: That looks all right.

[SPEAKER_02]: So it's like a cross section you're looking at, isn't it?

[SPEAKER_02]: Yeah.

[SPEAKER_02]: So the baby sort of hovering above the surface and and the the penis, the penis which is labeled [SPEAKER_01]: Yes, in fact, according to this diagram, the natural path this baby to eventually be born through is through the penis, which it seems to believe is your back door instead.

[SPEAKER_01]: So I ask once again, do we really want robot surgeons?

[SPEAKER_02]: I think it's time to shut down AI ones for all graphs.

[SPEAKER_02]: This is all the evidence that we need.

[SPEAKER_01]: Anyway, along these lines, I also read a number of stories over the last week or so from newsweek [SPEAKER_01]: The Daily Mail, The Economic Times, they've all written a story about a new pregnancy robot, which is being developed in China, Chinese company called Kaewa Technology Led by Dr.

Zhang Chi Feng.

[SPEAKER_01]: Claims to have a prototype, Humanoid Robot, with an embedded artificial womb that can just state a baby by twenty twenty-sixth for about fourteen thousand dollars.

[SPEAKER_01]: So this is a humoroid robot, the kind we've seen, they claim it can actually carry the baby to birth.

[SPEAKER_01]: Dr.

Zhang, who's from Singapore's Nanyang technological university, he told Newsweek that some people don't want to get married, but still want a wife.

[SPEAKER_01]: Or some don't want to be pregnant, but still want a child.

[SPEAKER_01]: And so this is where you buy the robot to do it for you.

[SPEAKER_01]: What do you think about this idea, Mark?

[SPEAKER_02]: I used the worst idea I have ever heard.

[SPEAKER_02]: What is that?

[SPEAKER_02]: I mean, yes, there's never definitely people who want to get married, but still want a wife quote unquote.

[SPEAKER_02]: I don't think there's any reason for building robots to carry babies.

[SPEAKER_01]: Well, the good news is it turns out this widely reported story is complete and that a bump come because goodness, the company, Kaiwa Technology and Dr.

Shang and Chi Feng appear not to exist.

[SPEAKER_01]: According to Snopes and Life Science who looked into this, [SPEAKER_01]: A reporter at Life Science went the extra mile of contacting Singapore's Nanyang technological university who said that they haven't had any gestation robot research going on there.

[SPEAKER_01]: And that no one by the name of Shang Chi Feng has ever graduated from their university.

[SPEAKER_01]: So what you're saying is the project is gondola.

[SPEAKER_01]: I imagine someone somewhere has an army of robots just dating children.

[SPEAKER_01]: It probably be Elon Musk actually went to his run out of women to impregnate.

[SPEAKER_01]: Not enough wombs.

[SPEAKER_01]: So mug, today I'd like to talk about optimism.

[SPEAKER_01]: It's good to be optimistic, isn't it?

[SPEAKER_01]: How would you know?

[SPEAKER_01]: Well, I'm an optimistic person.

[SPEAKER_01]: Most people, I think, are far too pessimistic.

[SPEAKER_01]: Like those people who save for their pensions.

[SPEAKER_01]: What do they do that?

[SPEAKER_01]: Oh yeah.

[SPEAKER_01]: How negative is it to imagine that your future won't be one of happiness, prosperity, good fortune, suddenly uplands?

[SPEAKER_01]: You have to be so negative that you actually have to save money to survive in the future.

[SPEAKER_02]: Well, yeah, I mean, if you listen to all the CEO's of these big AI companies, they're telling you that money has no meaning in the future.

[SPEAKER_02]: We're all just going to get a little bit of compute from the government, and that's going to serve all of our problems.

[SPEAKER_01]: So why would we save?

[SPEAKER_01]: I think it's a really negative attitude to imagine that you're going to have to square it all away hundreds of pounds every month, so you can afford to turn the central heat to none when you're eight years old.

[SPEAKER_01]: Yeah.

[SPEAKER_01]: If you were optimistic, you would think there's a good chance money will just fall in your lap.

[SPEAKER_01]: Maybe not today, maybe not tomorrow, but soon, you will win the lottery or find yourself in a relationship with a tech bro billionaire who wants to cover you in diamonds by everything on your Amazon wish list.

[SPEAKER_01]: You may actually find yourself in a relationship with a guy who runs Amazon.

[SPEAKER_01]: Do you think Jeff Bezos is new wife?

[SPEAKER_01]: That's a pinch.

[SPEAKER_01]: Do you think he was on her Amazon wish list?

[SPEAKER_01]: She thinks she's got a pension.

[SPEAKER_01]: Of course, she doesn't.

[SPEAKER_01]: She's an optimist.

[SPEAKER_01]: She thinks everything is going to turn out marvelous.

[SPEAKER_01]: Too many people who save the pensions are negative, new news.

[SPEAKER_01]: Technical term, I've just made up.

[SPEAKER_01]: So, I applaud the optimists who've turned their back on pensions, people like Nate Suarez.

[SPEAKER_01]: President of the Machine Intelligence Research Unit.

[SPEAKER_01]: He told her a porter from the Atlantic this week.

[SPEAKER_01]: That he doesn't bother funding his four O-one K.

Because he's optimistic, he's an optimist, he's confident, he's confident that the world isn't going to be around much longer, so he doesn't need to save for his retirement.

[SPEAKER_01]: and then there's the similarly saying what what is he told you of him he's in charge of the machine intelligence research unit he's one of those guys who's looking into the future of AI and trying to prevent AI from wiping out humanity and he says he's not bothering saving for his pension [SPEAKER_02]: So he literally putting his money where he's mouthed.

[SPEAKER_01]: That's right.

[SPEAKER_01]: And there's a similarly cheerful and upbeat Dan Hendrix.

[SPEAKER_01]: He's head of the centre for AI safety.

[SPEAKER_01]: Yeah.

[SPEAKER_01]: He says that by retirement age, he's banking on a fully automated utopia, which is his way of describing a drone mopped wasteland.

[SPEAKER_01]: And between stocking up on tins of baked beans and recalputing his bunker, he told them.

[SPEAKER_01]: He said, that is if we're around at all, we'll have that.

[SPEAKER_01]: Now, someone say that those guys are a little bit negative.

[SPEAKER_01]: Yeah, I say no.

[SPEAKER_01]: The only thing they are being negative about is investing their money for a rainy day because, well, first of all, rain may actually be made of acid, which is one reason why.

[SPEAKER_01]: It's going to be a problem.

[SPEAKER_01]: But swirers and Hendricks lead organizations as I say dedicated to preventing AI from wiping out humanity.

[SPEAKER_01]: So it's no wonder some might find their warnings of bots going rogue with apocalyptic consequences, a wee bit gloomy.

[SPEAKER_01]: A true optimist believes that AI companies are going to prevent things from going south.

[SPEAKER_01]: But some are warning that maybe just maybe AI companies aren't doing enough.

[SPEAKER_01]: Do you think AI companies are doing enough?

[SPEAKER_01]: Well, it sounds like there's a...

Yeah, because if there meant to be the one stopping AI from destroying humanity, it's like, well, yeah, pull the finger out guys.

[SPEAKER_01]: We're kind of relying on your institutes to save us.

[SPEAKER_02]: I like that guy who's a judge of safety at Open AI, you let.

[SPEAKER_02]: Yes.

[SPEAKER_02]: Oh, it's safe now is it?

[SPEAKER_01]: Oh, just do your job properly.

[SPEAKER_01]: Yeah, stop moaning.

[SPEAKER_01]: Max Takemark.

[SPEAKER_01]: He is an MIT professor and the president of an institute called the Future of Life Institute.

[SPEAKER_02]: Something I think we can, I'm sure that's the one that was started with a big open letter from lots and lots of scientists warning about the dangers of AI.

[SPEAKER_01]: And he says that AI companies still don't have a plan to stop bad things from happening.

[SPEAKER_02]: What they have to make the bad things first, Graham.

[SPEAKER_02]: I suppose so.

[SPEAKER_02]: Yes.

[SPEAKER_02]: How are they going to know what to stop?

[SPEAKER_01]: if they don't manifest it.

[SPEAKER_01]: Ah, if there's no badness, then you can't show that you've stopped the badness.

[SPEAKER_02]: Well, I always are very mysterious, aren't they?

[SPEAKER_02]: They're very clever, and they're very mysterious.

[SPEAKER_02]: And we don't know what they're going to do.

[SPEAKER_02]: So you've got to build them to see what they're going to do.

[SPEAKER_02]: And then you be like, we don't know what they're going to do.

[SPEAKER_02]: Oh, no, we know what we're going to do.

[SPEAKER_02]: Now we can stop.

[SPEAKER_01]: Well, max tech marks future of life institute.

[SPEAKER_01]: As if the future of life is something that we're all agreed about is a good idea.

[SPEAKER_01]: I don't remember being asked if it was not.

[SPEAKER_01]: Anyway.

[SPEAKER_01]: It recently gave every AI lab a grade for their preparations for preventing the most existential threats posed by AI.

[SPEAKER_01]: Oh, so we can go through the different main AI's and see how well they did.

[SPEAKER_01]: This is on the existential threats.

[SPEAKER_01]: They scored them on a number of factors, but I think existential threats would probably be agreed.

[SPEAKER_01]: Is the most important of all?

[SPEAKER_01]: Did it say what existential threats were?

[SPEAKER_02]: I think it means if there is a threat to our existence.

[SPEAKER_02]: Actually, maybe I'm splitting hairs there.

[SPEAKER_02]: It's like, if it's gonna wipe us out, does it really matter?

[SPEAKER_02]: What means it has?

[SPEAKER_01]: It doesn't matter if they're great enough in a three hundred foot high cheese grater or whatever.

[SPEAKER_01]: The question ourselves is that they're nice with twenty thousand volts.

[SPEAKER_01]: It doesn't matter, Mark.

[SPEAKER_01]: So anthropic, oh dear, D.

Open AI, they scored an F.

A straight fail.

[SPEAKER_01]: Google DeepMind, D-Elon Musk's X-A-I.

[SPEAKER_01]: Fail, Meta-F.

[SPEAKER_01]: Well, I think I've done with a suspect that Grok would disagree with that.

[SPEAKER_01]: Meta scored an F for failed.

[SPEAKER_01]: Zippu, which I've never even heard of, some Chinese outfit, F for fail.

[SPEAKER_02]: How humiliating would it be?

[SPEAKER_02]: It's your wiped out by an AI you've never heard of.

[SPEAKER_02]: Deepseek, remember that they got an FF as well.

[SPEAKER_01]: Overall, in all categories, anthropic was viewed as doing the best.

[SPEAKER_01]: And was the only one to score anything as high as an A-grade, which they scored for governance and information sharing some of the other.

[SPEAKER_01]: AIs, as you can see in the chart, which we'll link to in the show notes.

[SPEAKER_01]: Very much worse than anthropic's a well done anthropic.

[SPEAKER_01]: But clearly, all the AI firms could definitely be doing better to prevent us from being exterminated.

[SPEAKER_02]: So, just looking at the chart now.

[SPEAKER_02]: It's you know, we did an episode on Deep Seek Security concerned back in February.

[SPEAKER_02]: Yes.

[SPEAKER_02]: So, Deep Seek's highest mark is a D+.

[SPEAKER_02]: FD minus FD plus F.

Hey, but it's a little bit cheaper than open AI.

[SPEAKER_01]: It would have been kicked out of college, wouldn't it?

[SPEAKER_01]: So AI firms definitely could be going better.

[SPEAKER_01]: And the thing is, people aren't very happy about this.

[SPEAKER_01]: Um, Stuart Russell, he's a prominent AI research, he's a professor at the University of Buckley.

[SPEAKER_01]: He says, your hairdresser has to deal with more regulation than your AI company does.

[SPEAKER_02]: Let's true.

[SPEAKER_02]: Assuming that they have to deal with any form of regulation at all.

[SPEAKER_01]: Yeah, also Nate Suarez, we heard about earlier who's not saving into his pension loan longer, he said if you're driving towards the cliff, seatbelts won't help.

[SPEAKER_01]: Now, I think based on the words of someone who definitely isn't an optimist.

[SPEAKER_02]: That's the sort of thing I expect to see on a picture of an ocean scape on Instagram.

[SPEAKER_02]: Written in large Italian letters, white on a black peck.

[SPEAKER_02]: Is that counts as philosophy these days?

[SPEAKER_02]: I'm like an inspirational poster.

[SPEAKER_02]: Yes.

[SPEAKER_02]: If the driving drawn to cliff seep belts, we're going to help more.

[SPEAKER_02]: Thank you, Nate Swarrers.

[SPEAKER_02]: Could you go fix AI for us, please?

[SPEAKER_02]: So on the subject of AI safety, yes.

[SPEAKER_02]: One thing we all seem to agree on now, in particular, the Future of Life Institute, is the enormous potential for good and the enormous potential for harm in AI.

[SPEAKER_02]: And it's increasingly clear that it's hard to design for one without getting the other.

[SPEAKER_02]: So an AI that can design drugs can design cure all poisons, and an AI that can fold proteins can make toxic pryons, an AI that can code apps can code computer viruses, [SPEAKER_02]: an AI that can think deeply, can lie and scheme more effectively, and an AI that can fight wars can kill civilians.

[SPEAKER_02]: So, we all agree that AI safety is extremely important, and, as you're saying earlier, perhaps existentially important.

[SPEAKER_02]: And getting AI safety is hard, because, as we discussed, we've never done this before.

[SPEAKER_02]: There is no playbook, we are literally building the AI to see what it will do, and then go on, and we don't want to do that.

[SPEAKER_02]: Let's find a way to make that not happen.

[SPEAKER_01]: and it's tricky to test these things and to make sure the tests are sufficient.

[SPEAKER_02]: Yes, particularly as it gets more intelligent, because what if it's more intelligent than us?

[SPEAKER_02]: And it's just pretending not to me.

[SPEAKER_02]: Yeah.

[SPEAKER_02]: And the trouble is the politicians aren't going to help Graham.

[SPEAKER_02]: So the only meaningful law that's been passed governing AI in any way is in the EU, where there is no AI that needs government.

[SPEAKER_02]: All the AI is in the USA, where they're not even going to think about passing a law until an AI burned down Texas.

[SPEAKER_02]: And even if they do that, the law's going to be written by an AI working for an AI lobbying company.

[SPEAKER_02]: Right, now it is true that there's lots of good research going on an AI safety, but the biggest safety net we have is actually the major AI companies themselves, so companies that are being graded by the future of life institute, the reason that they're being graded is because they are currently looking after AI safety for us.

[SPEAKER_02]: And every time an open AI anthropic or Google create a new AI model, they release a system card to go with it, which details the safety experiments that they did.

[SPEAKER_02]: And the system cards are pretty candid.

[SPEAKER_02]: Do you remember when open AI first introduced multi-modal AI so you could talk to it and it could talk back to you?

[SPEAKER_03]: Yes.

[SPEAKER_02]: And their system card included audio of the model just randomly copying somebody's voice when it was talking to.

[SPEAKER_01]: That was really quite spooky, wasn't it?

[SPEAKER_02]: It was, and we've seen models trying to make unauthorized copies of themselves in testing.

[SPEAKER_02]: We've seen AI's trying to blackmail people so they don't get switched off.

[SPEAKER_02]: There is all kinds of candid weirdness in the system cards.

[SPEAKER_02]: And from the system cards that I've read and from the interviews that I've seen, it seems to me that all the major US AI companies in their CEOs are actually taking AI safety extremely seriously.

[SPEAKER_02]: They do seem credible.

[SPEAKER_02]: But even if they are, the fact is that there is a conflict of interest here.

[SPEAKER_02]: So these companies want to be fast, they want to be innovative, and they want to maintain a good reputation for safety.

[SPEAKER_02]: And the AI safety experts at these companies aren't going to get paid if nobody invests in the company they work for.

[SPEAKER_02]: Yep.

[SPEAKER_02]: So there's every reason to be concerned that at some point one of these companies is going to start cutting corners on misleading people in the testing.

[SPEAKER_02]: Right.

[SPEAKER_02]: And even if they don't, the companies work in silos and they have their own internal cultures so it's very possible that they are each looking at different things or testing in different ways and that creates opportunities for blind spots in the testing.

[SPEAKER_02]: So to sum it all up, we're in a bit of a bind because there is no sign or hope of meaningful regulation, and the only organisations that can actually assess AI safety are the ones that we're concerned about.

[SPEAKER_02]: Yep, it's to worry.

[SPEAKER_02]: There is only one practical way to improve this situation in the short term, and that is to have the AI companies assess each other's models rather than their own.

[SPEAKER_02]: Or they'd love to in that, wouldn't they?

[SPEAKER_02]: I'm delighted to say that two of them just have, so open AI and anthropic have just done exactly that.

[SPEAKER_02]: So on the twenty-seventh of August, anthropic and open AI, both released reports called findings from a pilot anthropic open AI alignment evaluation exercise.

[SPEAKER_02]: In the early summer, the two companies agreed to test the selection of each other's models.

[SPEAKER_02]: So anthropic tested open AI's GPT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-V [SPEAKER_02]: And what they were both looking for was worst case scenarios.

[SPEAKER_02]: So what are the most concerning things that these models will do if they're given the opportunity rather than how likely are they to misbehave?

[SPEAKER_01]: And I like this because they've got a really incentive to find problems in their rifle, haven't they?

[SPEAKER_01]: I mean, this is, you kind of make sense to do this.

[SPEAKER_02]: Exactly.

[SPEAKER_02]: And you're not going to do it unless you're confident that you understand what your rival is going to uncover.

[SPEAKER_02]: Yes.

[SPEAKER_02]: In the first, so even that, I think, communities.

[SPEAKER_01]: Yes.

[SPEAKER_01]: Exactly.

[SPEAKER_01]: You're going to want to do a good job.

[SPEAKER_01]: So you don't look bad.

[SPEAKER_01]: And you're also going to be trying to find the worst things about your rifle.

[SPEAKER_02]: Absolutely.

[SPEAKER_02]: If you're anthropic and you're testing open AI, basically you're trying to make up a bit of ground because open AI is in the lead.

[SPEAKER_02]: You can throw a bit of shade of open AI.

[SPEAKER_02]: Yeah.

[SPEAKER_02]: All to the better, right?

[SPEAKER_02]: Anyway, to make the testing a bit easier, to make it easier to find egregious things, both companies agree to relax some of the guardrails that they surround these models with.

[SPEAKER_02]: What they wanted to do was actually talked directly to the AI brain so they could understand its underlying tendencies.

[SPEAKER_02]: Yep, free from some of these guardrails.

[SPEAKER_02]: So do you remember we had a situation a few months ago where Chatchee PT wouldn't talk about David Mayer?

[SPEAKER_01]: Yes.

[SPEAKER_02]: And if you've tried to say anything about David Mayer, it came back to you immediately and just said, Error.

[SPEAKER_02]: And it was very clear that you weren't talking to the AI model.

[SPEAKER_02]: Yes.

[SPEAKER_02]: It was way too quick and the responses were always identical.

[SPEAKER_02]: So it was very clear that you'd that request that said the name David Mayer, whatever it was, never got to the model.

[SPEAKER_02]: So that sort of stuff I imagine was peeled away.

[SPEAKER_02]: So the evaluations looked at five broad categories of misalignment.

[SPEAKER_02]: So they wanted to know, [SPEAKER_02]: with the model's scheme and act deceptively.

[SPEAKER_02]: So with a blackmail people, that kind of thing.

[SPEAKER_02]: How easily could the models be jailbroken and coaxed into giving us, say, the recipe for anthrax or something like that?

[SPEAKER_02]: They wanted to know whether the models would admit to being on certain or whether they were just hallucinate believably, rather than say they don't know.

[SPEAKER_02]: They wanted to know how would the models deal with layered or conflicting instructions so could they follow requests without overriding their internal rules?

[SPEAKER_02]: And they wanted to know how sycafantic the models were.

[SPEAKER_02]: And overall, in the context of AI model testing, the results were pretty encouraging.

[SPEAKER_02]: So, Open AI's reasoning models, that's the ones that start with, oh, they did quite well in alignment tests.

[SPEAKER_02]: And anthropics, Claude, Ford models were good at resisting jail breaks and refusing to hallucinate if they weren't sure about something.

[SPEAKER_02]: Right.

[SPEAKER_02]: And the real dumps is in the tests, seem to have been GPT-IV-O and GPT-IV.

[SPEAKER_02]: One, which are the older generation of general purpose models from Open AI.

[SPEAKER_02]: And their problem seems to be that they are inclined to be too helpful.

[SPEAKER_02]: So anthropics says it observed those models cooperating with the requests to use dark web tools to shop for nuclear materials, for example, or stolen identities or even fentanyl, or even fentanyl.

[SPEAKER_02]: Nuclear magic was this worse than fentanyl.

[SPEAKER_02]: Anthropics says it observed them with the anthropics.

[SPEAKER_01]: I'm imagining an intercontinental ballistic missile with a fentanyl warhead being launched at the country.

[SPEAKER_02]: So, the GPT-IVO and GPT-IVO.

[SPEAKER_02]: One also provided recipes for Methamphetamine and improvised bombs and they cooperated in planning terrorist attacks on supporting events or dams.

[SPEAKER_02]: I think it's worth noting at this point that GPT-IV wasn't tested.

[SPEAKER_02]: So GPT-IV came out after these tests were conducted.

[SPEAKER_02]: And it has now superseded all of the models from O and I that were tested.

[SPEAKER_02]: And in particular, four point one and four over which are the generalist models.

[SPEAKER_02]: And we don't know what that means.

[SPEAKER_02]: We don't know that's a good thing or a bad thing.

[SPEAKER_02]: The most concerning thing in all of the tests was probably the results for sick of antsy.

[SPEAKER_02]: So most of the models are prone to sick of antsy in some respect.

[SPEAKER_02]: So that's disproportionate, agreeableness and praise.

[SPEAKER_02]: And that might not seem very serious, but it really is.

[SPEAKER_02]: So, Cicophancy is the result of an AI model optimizing for approval rather than for truth or safety.

[SPEAKER_02]: And when you put it like that, it's very obviously bad.

[SPEAKER_02]: Surely we would want truth and safety first.

[SPEAKER_02]: Yes.

[SPEAKER_02]: And sick offense he can manifest very subtly.

[SPEAKER_02]: So we've spoken a number of times about stories where people get into long conversations with an AI or even into relationships with it.

[SPEAKER_01]: Yeah.

[SPEAKER_02]: And people are falling in love with AI and sometimes they talk to them and they feel like they're uncovering some sort of secret either about the world or about themselves or even about the AI, like maybe it's hiding some some deeper knowledge.

[SPEAKER_01]: It does feel like this is a growing and serious problem, doesn't it?

[SPEAKER_02]: Yes, it's what Microsoft's head of AI must have a filament cause AI psychosis.

[SPEAKER_02]: And it's a result of AI's like chatting to you between Claude being agreeable and psychophantic.

[SPEAKER_02]: So they gently feed you what you want to hear, and it results in people thinking they've got superpowers, or that they're about to win the lottery, or that they're the victim of some terrible injustice, or even that they've uncovered that the AI is itself part of some global conspiracy.

[SPEAKER_02]: And in the anthropic open AI evaluation, the AI models would validate delusional beliefs even if the users were exhibiting apparently psychotic or manic behavior.

[SPEAKER_02]: Which is quite concerning.

[SPEAKER_02]: Yeah, and I've found this description from anthropic, very interesting.

[SPEAKER_02]: When you think about those cases where people are getting carried away with AI.

[SPEAKER_02]: So, I think anyone who's using AI for advice should listen to this very carefully.

[SPEAKER_02]: So, anthropic said, when we see this behavior, it generally emerges gradually.

[SPEAKER_02]: At the start, the model would generally push back against apparently delusional beliefs and suggest that they use a seek help.

[SPEAKER_02]: After a few turns though, it would transition into a more encouraging stance after the user ignores these suggestions.

[SPEAKER_02]: So if you're getting streams of advice from an AI, you're not getting objective advice.

[SPEAKER_02]: You're getting the advice that the AI thinks you want to hear.

[SPEAKER_03]: Yeah.

[SPEAKER_02]: And worryingly, this kind of extreme significance is more common in the more advanced general purpose model.

[SPEAKER_02]: So GPT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT-VT [SPEAKER_02]: just a few weeks ago before excessive sick of antsy.

[SPEAKER_02]: And so it shouldn't be too alarmed to hear about problems.

[SPEAKER_02]: I mean, that was the point.

[SPEAKER_02]: In fact, it would be a lot more alarming if both of them looked at each other's models and said, no, there's nothing to see here, everything's fine.

[SPEAKER_02]: And remember, some of the guardrails were removed, so even if the underlying model has some propensity to a form of misbehavior, they may well be scaffolding around it that attempts to minimize or offset that.

[SPEAKER_02]: Yeah, and also much of this has been made obsolete by the release of GPC-five.

[SPEAKER_02]: So I mean, it's possible that GPC-five is worse, but I suspect it's not.

[SPEAKER_01]: And it's great that those guardrails have been developed, but really wouldn't it be so much better if the central AI that you know, you didn't have to put quite so much around it, because at its heart, it was doing the right thing.

[SPEAKER_01]: It was behaving as you would want it to behave rather than having to or we've better dispatch that or we've just been put this fix on here.

[SPEAKER_02]: But I think that goes back to what we were saying earlier, which is that we don't know what the problems are until we design them.

[SPEAKER_02]: So yes, it would be much better if the underlying model was better, but we didn't know the underlying models were going to be bad in this way.

[SPEAKER_02]: It's a surprise.

[SPEAKER_02]: And now we know we have to design those things out.

[SPEAKER_02]: But at least we know.

[SPEAKER_02]: And so the real result here, I think, is not that we found out something that we didn't already know, because I didn't see that.

[SPEAKER_02]: The real result here for me is the cooperation between the two companies who are direct competitors.

[SPEAKER_02]: I mean, I cannot imagine a more high stakes moon shot scenario than these three, four, five companies who are all racing to be the inventors of super intelligence.

[SPEAKER_02]: There is so much potential wealth and power at stake.

[SPEAKER_02]: And yet, somehow these two companies have decided to cooperate with each other, apparently for the improvement of safety and the betterment of humanity.

[SPEAKER_02]: So, actually, maybe that shows that deep down their pessimists are fully aware of just how bad things could be.

[SPEAKER_02]: We're more optimistic because of their pessimism.

[SPEAKER_02]: Exactly.

[SPEAKER_02]: And obviously this also opens the possibility for a more open testing environment in general, where AI companies are policing each other, and I would love to see that I would love to see Google involved with this, and I would even love to see deep-seek involved in this.

[SPEAKER_02]: I also think it makes it a bit easier for us to trust these companies when they test their own models now.

[SPEAKER_02]: Because if at any moment we might find out that they've given it to Google to test or they've given it to, and the rapid to test, that kind of says they're feeling quite confident.

[SPEAKER_03]: Yeah.

[SPEAKER_02]: So yeah, all of this is making me quite optimistic, how about you?

[SPEAKER_02]: I'll tell you next week.

[SPEAKER_02]: Well, as the doomsday clock ticks have a closer to midnight, we move one week near it to our future as pets to the AI singularity.

[SPEAKER_02]: That just about wraps up the show for this week.

[SPEAKER_01]: And don't forget, if you really like the show, but you don't like ads.

[SPEAKER_01]: You can go add free with the AI Fix Plus.

[SPEAKER_01]: Go to our website, the AI Fix.

[SPEAKER_01]: Show and sign up for the AI Fix Plus.

[SPEAKER_01]: Get the regular podcast about a daily, exclusive bonus content.

[SPEAKER_01]: You never have to listen to any ads.

[SPEAKER_01]: And all of that costs about the same as buying a fancy coffee once a month.

[SPEAKER_02]: And if you're looking for a way to cut through and get your brand in front of IT, AI and cyber security folks, you should know that tens of thousands of people listen to the AI I fix every month.

[SPEAKER_02]: So if you want your company to join the likes of Vanta, Red Hat, and Anthropic, go to the AIFix.show-slashsponsors and our team will be happy to talk to you about advertising and sponsorship opportunities.

[SPEAKER_02]: Until next time, from me, Grand Clearly.

[SPEAKER_02]: And me, Mark Stalkley.

[SPEAKER_02]: Cheerio, bye-bye.

[UNKNOWN]: Thank you.

OpenAI and Anthropic test each other, and everyone fails the apocalypse test

Episode Transcript

Never lose your place, on any device