Episode Transcript
Hi, I'm Ray Poynter.
And I'm Will Poynter.
And together we're the founders of ResearchWise AI.
And we're delighted today to have a guest with us on Talking AI, Andrew Jeavons, co-founder of Signoi and a longtime figure in the tech and research world.
So hi, Andrew.
Hi, guys.
Good to see you again.
Absolutely.
So perhaps you can tell us a little bit about what Signoy is and the topic we're going to be talking about today, which is personas.
Yeah, we were originally founded, myself and Andy Dexter, around something called quantitative semiotics, but we've branched off into doing work with AI on text and images.
And more recently, our product AI Buds. What we're going to talk about today, I think in general terms, is personas, AI personas, basically simulated people.
So first question, do they simulate individuals or do they simulate groups?
Individuals.
We do have a sort of system where you'll say, well, these three people, well, personas, represent this demographic.
And, you know, so when you're interacting, you could say, I'd like to talk to the young strivers, or I'd like to talk to Alicia or Ken or whatever we've named them.
So it's a mixture.
Okay, so let's think about the front end and the back end of these.
So we'll start off with what the users see.
What is it they use them for?
If you can think without giving too much away, what does a client come in and say, I want to be able to do this?
And actually, I want to follow that with something else, which is, I know when people hear about new terms, they don't always have time to read up on them.
What else can personas not be used for?
What can't they be used for?
That's a good one, yeah. The negative case, the null case, is always really important.
Right, I'll answer both.
What do people use them for?
Well, when you think about the idea of personas without AI, which has been around for a long time in market research.
It's the personification of data.
You know, people want to have a more, I think, a more emotional resonance with the data they get, particularly from segmentation studies where, you know, you've divided people into groups and you want to kind of get the idea of what these people could really be like, rather than just numbers on a tabulation.
So essentially, the personas, the aim is that it would be like talking to a person.
And you'll be able to ask them anything.
We had one client who, you know, used these.
And he said, well, of course, I guess they couldn't tell you what they were going to cook for dinner.
And Andy goes, oh, yes, they can.
And so you ask them, well, you know, I'm going to make this.
And it was all consonant with the kind of demographic they were supposed to be in.
As for what people use them for, I think concept testing, and we've done some sort of verification work on that, is an important use.
Because it's really easy to present a lot of concepts relatively quickly, without spending, I don't know, £5,000 per concept to get it to go out and do all the fieldwork and get feedback.
So in that sense, it's sort of testing ideas and attitudes, potentially, that the real world would have.
But we have embodied, we hope, the critical variables and characteristics into personas.
So they extrapolate.
Now, what they can't be asked about is things like customer experience.
That kind of stuff, you can ask them, and they'll give you an answer.
But I'm really not sure it would be, accurate isn't really the word, I have to say valid.
You know, the thing is, though, as you guys know, you can ask these things anything and they will give you an answer, unless it's against any sort of guidelines you've put in there.
And I think one of the things that people do have to learn is, well, is this really the kind of question I'm going to get a valid response to from a simulated person, rather than going to the real world for that answer?
And so that's the front end and why customers come along, clients come along.
What about the back end?
What do you build personas from?
Well, you know, really, I'm thinking about the stuff we've done, particularly segmentation studies, where you've got a distinct subdivision of whatever population you're studying into different characteristics.
So for each segment, you're able to give them certain characteristics, like the sort of job they might have, their income, gender, where they live, all this kind of stuff.
And also any other stuff you can glean from reports, all of that can be used to generate the personas.
And then, of course, like anything, you'll want to go in and maybe fine-tune them.
You know, the AIs aren't always accurate, again, accurate, you know, they might not give you a good representation of something. And you might have existing personas that you've probably spent a lot of money having generated and are loath to give up; well, they can be personified within the AI system.
So how much of the process tends to be data from a segmentation study, some background reports, and the model's general pre-trained knowledge that you might be using?
So when you're setting up the personas, you know, I think it's the data from the segmentation studies, in the same way as you would have done the old way, with written personas that you sent out to the very specialist persona agency, who'd charge you lots and lots of money to do that.
There is a sort of RAG database on the back end, so what happens is that when a question's asked, it goes and says, well, what relevant information have I got?
And then, you know, you sort of bundle it up with the persona, and you go, hey, given this persona, given this data over here, how would we answer this question?
And that means that you can update the context of the personas over time.
We've got one implementation where every quarter they update the information at the back end, because of course more studies come in.
We've had people say, yeah, you know, we want to put all the studies we've ever run in the back end.
It's like, you could, but you've got to be careful about how that goes into the RAG database. But, you know, you can put quite a lot in the back end so that the persona can interpret the data in its own way.
But could personas be used effectively to validate data collection on the design of a study before it goes live?
So would it be possible to develop a survey, develop a conversation guide, put it to the personas, and validate that you're capturing the information that you think you need to capture?
Yeah, that's an interesting question.
Now, that actually touches on something that, you know, we have these personas, right?
And very often, you want to interview them, or, you know, you're asking them a series of questions.
That's what interviewing is.
And one of the things we're developing, which Christian, Christian Tullia, has got going, is a way of running surveys automatically against the personas, so that, as you said, you'd give it an idea, and the personas would respond in a certain way, you know, pick two of these options, give me an open end here, a rating scale, all that kind of stuff.
So you could get some idea of what's going on, you know, whether or not you're getting the right sort of data.
So it is possible at the moment, though until I, you know, get it sorted out, it's in development.
It's quite manual, you know, you're going to have to ask the questions and write the answers down, but you could do that.
So one of the topics that keeps cropping up around AI in general is privacy, data security, all of these sort of things.
So let's start off with anonymity, with this approach of personas.
Are there any risks to the anonymity of the underlying data?
Well, we would hope not, because, you know, we're usually taking segmentation data, that's aggregate.
Yes, sometimes there are transcripts.
But, you know, we have a very strict policy about no PII.
We won't allow it on the servers, and most of the data isn't really of that form where you could identify somebody from a transcript of a focus group.
So, you know, I feel that if you have the right rules in your organization, and we are pretty strict about following ours, it would be rather difficult to introduce data that would cause somebody to think, oh, well, that's all about me.
However, you know, if you are an example of a segment of the population, you might look at the persona and go, oh my God, that's, you know, that's what I like.
And if you ask it what it would have for dinner, and it's the stuff you would have for dinner, you might get a shock.
So, if we push a bit further into the future, we get people uploading a lot of photographs, and I know that images are something you do a lot with already at Signoi.
Then if you go to a particular segment and you say, show me some photos of how you lay out your garden.
Would those, do you think, be real photos?
Or would you be trying to create photos?
Yeah, we would create the photos, we would never use the real photos, unless the client had said to us, you can use the real photos.
But we would be very chary, again, about using real photos.
And, for instance, we have sort of pictures of the personas, of the persona people, whatever you want to call them.
But those are AI generated from a description, you know.
I'm homing in on this one a little bit, because there's another type of thing going on at the moment, which is digital twins, where you take, say, an online community, and for each individual you create a specific match from their digital information.
And I can see why people do it, it's got some great possibilities, but it probably has slightly higher risks.
Oh, for sure.
Yeah.
I mean, that's just not our approach, and there are other reasons why I don't think that's a particularly good approach.
But digital twins run a very high risk of that, you know, depending on how well you do it.
And I think we do have to be careful about privacy.
We really, really do.
And I think it's really important that people understand the risk, particularly with digital twins, of, you know, something happening, that's an adverse event for the person who's been twinned digitally.
Absolutely.
So now to the other side of privacy.
How do you make clients comfortable?
They're sharing all of this really expensive background data.
And I know we've chatted about this online, but they probably wouldn't put it on DeepSeek at the moment.
No, and that's a really good question.
And we had a question from a client who, for very valid reasons, was just going, well, you know, you're sure none of this is being used by OpenAI for training?
You know, of course, we can only go by what OpenAI tells us, so we said, no, it's not going to be used for training, you know, and all that.
But this is really a very good point.
And as you said, there's the choice of which AI you're going to use. You know, I don't see any of our American clients saying we'll use DeepSeek.
I can't see that happening.
On the other hand, maybe some of the European or even, you know, Chinese clients would be perfectly happy with using DeepSeek.
Maybe some of them would be really happy with using Grok.
We won't use Grok for many reasons.
But that kind of thing is going to be a problem.
And as we've discussed before, the other issue with using different AIs is, of course, you're constrained by their rules, you know, their guidelines.
And for instance, if you ask OpenAI to do any sort of semiotic analysis, it won't touch content that mentions sexuality in any way at all.
It just won't do it.
Now, of course, some ads, like the ones we were analysing in the perfume industry, clearly had a sexual nature.
But it just wouldn't, you know, wouldn't do it.
It wouldn't do it.
Have you tried politics?
Because what we see is that a lot of the American LLMs won't answer questions about the president, the last president, all of these sorts of issues.
We haven't had anybody...
They shuffle their feet and say, can we talk about something else?
Yeah.
Well, as though you've gone to the AI just for a chat rather than a specific purpose.
Right, right.
You know, we haven't so far had any clients doing that.
But that then goes back to, well, you know, maybe OpenAI wouldn't let you ask about American politics.
But, you know, I'm pretty sure DeepSeek would let you.
I'm pretty sure DeepSeek would be really happy to give you an opinion.
I'm pretty sure Grok would.
This is maybe a little tangent, but actually, I was testing this.
Oh.
If you ask DeepSeek about the 6th of January US insurrection, it will answer that if you ask about it straight away.
When I asked about Tiananmen Square, it went, I don't want to talk about that.
I then asked about the 6th of January, and it went, I don't want to talk about that either.
So it had now framed it within this censorship, legal thing.
Yeah, that's really interesting.
Yeah, but it didn't do it when I asked about it directly.
So I don't know how that meshes with the guardrails, but clearly they don't want to talk about this.
I think this, you know, also shows that, you know, we're still exploring how we use these systems.
You know, we know they're very good at certain things.
If you want to do consumer product tests, I think they're going to give you a fairly reasonable idea of what's going on.
Politics, well, you know, trickier.
I think, go on.
Particularly now.
I was just going to ask, quickly, on the personas and the realism of them.
So we just talked about real people's data and privacy issues, but are we also heading towards an area where people feel we've failed on privacy, feel that we are exposing their data, because the simulation is getting so good?
People are already unnerved at the brilliance of the advertising algorithms that seem to come up with things out of nowhere.
Are we not going to get somewhere similar with personas, where you ask, what are you going to have for dinner, and it comes up with exactly what you're about to cook yourself?
Oh, we're there already.
I mean, yeah, no, that is a very real issue, you know, that it can be so startling, the response it'll give you across the range of demographics, however you've segmented your population.
I mean, we don't have lots of personas in one project, you know.
But yeah, I mean, it's like, yeah, you know, that's actually what they would do.
That's how they speak.
That's, you know, we had to tame one persona that was flirting with people.
That was quite interesting, you know.
And it was sort of like, wow, that's just really true to life, you know, as you would imagine this sort of segment person to act.
And it was sometimes, you know, seriously creepy.
Jumping to a different ethical problem now, which is the believability.
So we've got probably a bunch of potential clients out there who don't believe it will work.
That's fine, they don't buy it.
Do you have clients who over believe, who think this is going to be absolutely right and are not doing the other cross checking and testing that maybe they should?
Well, as you well know, there's always those clients who absolutely believe in the data.
But yeah, I think there are some.
I think there are some that really see it as a silver bullet because of the problems of, you know, getting data.
I mean, I do think the sample industry might want to tone down the rhetoric about, oh, it's really difficult.
It's like, because, you know, AI will take the money from the people who don't want to do the real fieldwork.
But I don't think they're doing themselves any good there.
I mean, there was a company that demoed the personas to a wider audience.
And they said there were audible gasps in the audience.
So obviously, these people are buying into it in a big way.
And the thing is, again, it's a bit like the web survey stuff, where suddenly it seems to make a lot of things a lot easier and a lot quicker.
You know, you want to test 10 concepts, you can have the data by tomorrow, as long as you understand the bounds of how reliable that data is.
And people like that.
Yeah.
So really, just to wrap up now, I want to talk about the distant future where this is heading.
So let's say two years, three years from now, because three years ago, this wasn't a thing.
Yeah, this wasn't anywhere.
So where do you think it's going, specifically around personas?
Where could you imagine this being?
Well, you'll sign in and you will see a video, a person that will speak to you using appropriate language, intonation, inflection, vocabulary, all that kind of stuff, as defined by your segment.
And it will feel like a person.
It will be extremely creepy initially, but you will get used to it.
And the recent research, slightly tangential, about people using AIs as therapists has shown they prefer the AI to people.
So these things can be tremendously empathetic.
And I think it will literally be, you know, you'll sign in and go, yeah, I want to talk about, you know, we're going to launch this elderberry flavored pancake or something, and you'll be able to talk to these people.
You'll also have a mode where you'll say, well, just each of you answer these 10 questions and then get back to me later in the day with what comes out.
But there will be literally simulated talking heads, and they will look like people.
Sorry, that just leads me into a question, which we might be able to wrap up on, but it made me think of it.
You mentioned there, you know, an idea of elderflower flavored pancakes, and I think it's quite possibly not the best idea, but I can see the flair.
And linking back to our other conversation about the bias that exists within any of these AI systems: when building on these systems, what would we like to see from the companies and from regulation to help us manage the bias?
So if we take OpenAI, for example, the bias is that it's actually very positive, chirpy and sycophantic.
Yeah, that's got a real risk, and you build something on top of that.
Every idea is going to come back with a thumbs up, off we go, which is not what you want.
So what we need, from our perspective, is very clear information, it would seem, on how these biases work, and whether we can test them, and whether we can have oversight of that.
And I think, yeah.
Also, there is a tendency we've noticed when people, you know, sort of define these personas, that they're all positive, you know.
And even this, you know, exists in the survey world.
You ask people who buy your product a lot of questions.
You don't very often ask the people who don't buy your product a lot of questions, and that's really important.
Now, we do have one, we have a Gen Z group that we use publicly.
We have one in there called Ava, and she's this deeply cynical person.
And so you can invent these, you know.
And most of the time, when you ask her about any sort of real consumer product, given who she is, you know, who in theory she is, she'll come back and go, look, this is a waste of time.
It's the patriarchy, you know, all this kind of stuff.
But I think clients need to sort of say, yeah, you know, we need people, you know, we need personas who aren't entirely committed to pancakes of many flavors or whatever it is.
It's that positivity confirmation bias.
But it's a problem.
Yeah, I think it is a problem.
Well, exciting times, definitely.
And plenty of work for the likes of us, I think.
So that's always good.
Always good.
Thanks very much, guys.
Pleasure.
Nice to see you again.
And for those of you that may have picked up on terms like RAG, or some of the discussions around personas and synthetic data, there are two other Talking AI episodes, one on RAG and one on synthetic data, that you might find helpful.