Episode Transcript
Adapt your processes to work with a virtual being.
I don't like chatbots, actually, because that is just an interface.
Think about a virtual coworker, because it's much more interesting when you can give it anything and invest in the tools.
Welcome to episode 46 of Tool Use, the weekly conversation about AI tools and strategies brought to you by an edit.
I'm Mike Byrd, and today we're talking about practical AI for businesses.
What strategies really work for robust, fair and effective AI systems?
This week we're joined by Wolfram Ravenwolf, the AI evangelist at ellamind, a ThursdAI regular, and an AI engineer.
Wolfram, welcome to Tool Use.
Thank you.
Great to be here.
I'm loving it.
Would you mind giving us a little bit of your background, how you got into AI?
So, funny anecdote: my first computer, which I got at the age of 10 in 1988, was a Commodore C64, and I named it Eliza after the mother of all chatbots, because as a child I was already totally fascinated with AI.
I even got a BASIC implementation of Eliza running on this computer.
And so my heart has been beating for AI for a long time.
But professionally, I went another route, into IT: I have 30 years of Linux experience and worked as a system and network administrator for over 20 years.
The last couple of years I was a DevOps engineer, Kubernetes stack and all this stuff.
And yeah, then the ChatGPT moment happened, but I was already back on track for AI before that, because in summer 2022, when Stable Diffusion launched, that was what really excited me; image generation was great.
Then ChatGPT came and I wanted to have it on my own systems and use it myself, and then Llama was leaked, the original Meta Llama.
That is when I started building an AI system to work with this and experiment with it.
And I posted all my experiments and evaluations, everything I worked on, on Reddit.
That was back in the day when, I think, Eric Hartford was cooking his uncensored models and TheBloke was quantizing them, and I was testing them in all aspects.
And that was less than three years ago; it is amazing how fast time has been moving since I first arrived in AI land.
Amazing.
Yeah.
And I recently joined ellamind, just in April, as an AI evangelist and engineer, and we are working on an eval platform that will be released soon.
So yeah, evaluation, that has been my backstory, what I've been doing all the time, because if you run AI, you want to run the best AI you can, especially since you have resource constraints like everybody else in the end.
So it was always very important to me to test these models, to see how they work and how to optimize them.
I tested models, I tested inference settings, I tested prompt templates, everything that affects the model.
And nowadays you have whole systems and you have multimodality and all that stuff, which makes testing and evaluating even more complicated, but all the more relevant and necessary.
It's been such a crazy time, and the ability to have this open source community come together and share their experimentation is remarkable.
That's one of the things that got you on my radar: taking these different findings and sharing them with the community so we can all learn together.
One area where I've found hits and misses is the excessive demos that come out in the AI space.
So when you're working with companies, what are some of the common misconceptions people have about what AI is actually capable of versus what people think it can do?
Yeah, that's very interesting.
You have different levels of understanding of AI.
The management often just reads about AI: it's the new big thing, so we have to have it.
In the company where I worked before, where I was doing DevOps engineering, our founder was saying, OK, AI is the big thing, we need it.
And he sent everybody, the whole company, to a two-day workshop; that must have been summer 2023.
So still relatively early.
And he sent everybody there and had the workshop organized around how we could use AI anywhere in the company.
So the mindset was already there, and I loved it, because I was already on that trip anyway.
And so I left that role and transitioned into a newly formed AI department, where I worked as the AI engineer, set up inference systems, and ran the models locally and in the cloud. We also hired a data engineer.
And we worked together on this with one manager on our team.
So we could really put it into practice in the company.
Yeah, yeah.
Until the company got bought by another company, and with the restructuring and all of that, it was slowing me down a bit.
That's why I decided I wanted to switch it up and go work for a dynamic startup with a team of engineers and push evaluations further that way.
So yeah, of course you have these different people: you have, yeah, normal users in a way; you have the developers who are experimenting with AI; you have the management; you have marketing, which wants to use some of these tools, like image generation, which is big, and all the video generation that is now getting even better.
And yeah, I think everybody can and should use AI, just because it is the most important technology that humanity has ever conceived.
Fire changed everything, electricity changed everything, and now AI is changing everything as well.
I'm convinced of that.
So even when people come back: I had somebody come back from maternity leave, and I asked her, OK, how are you using AI?
She said, I don't use AI at all.
I have a little baby and I don't have time for this.
And I said, OK, but if the baby is keeping you up at night, or is teething, anything like that, you can ask the AI about that and it can help you.
It doesn't have to be just a business tool; it can be your helper everywhere.
So people should get used to it.
Like with computers.
It's not just a business machine: home computer, personal computer, gaming device, everything.
You get to use the technology, you learn to love the technology, and then it can help you in your professional life.
I fully agree. In my personal life I've been using AI in different ways, and it's unlocked a few different things that have been immensely helpful, things I didn't really foresee as a possibility, and I've been deep in this for a while now.
For those businesses that are AI-curious, just starting to get into it, do you have any general low-hanging fruit, something they can quickly adopt or quickly run an experiment on, so they can start seeing that value right off the bat?
I think you need to take a bit of money in hand, because it is an important tool, and good tools cost money.
If you start with the free tools, yeah, it's nice to get a look, but it will not convince anyone.
This has gotten much better since OpenAI made GPT-4o the default model for everybody; that was a big improvement, of course.
But before that, when you had ChatGPT 3.5, for example, on the free tier, and you told people AI is great, they would check out the free model and say, oh, it's not working as well as it should.
So if you want to show somebody, you don't really have to go to, for example, the ChatGPT Pro tier; the Plus level, for just 20 bucks: get an account and give it to your workforce, your employees, and let them work and experiment with it and use it.
And I would highly recommend hiring somebody to show them the ropes.
Or, if you don't have anyone: like with all technology, when computers came into companies, you had some tech guys who were interested in it, doing it as a hobby, and they could help everybody.
So you need somebody who has a bit of knowledge and the curiosity.
And I think you should be able to find someone if you have a good company, because then they can show others how to use the AI, what to consider, what to do, and which AI is good.
If you just give somebody a ChatGPT account and they don't know where to select o3, for example, and what it is good for and what it's not...
So there's a learning curve.
And what I'm pretty sad about is that the AI usually doesn't even know about this itself.
You can't just use any AI and ask it which of the models available in the same interface is good for this or that.
That is usually not something it is aware of.
And yeah, that is a bit of a downer, actually, because why doesn't the AI help me decide which AI is good?
Or why don't I have just one interface that routes to the right AI depending on the task?
That would be a big unlock, I guess, making it much easier for people.
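As a rough illustration of the routing idea described here, the sketch below picks a model name from crude keyword heuristics. The model names and routing rules are invented placeholders, not real products or a recommendation; a production router would use a classifier or the providers' own routing features.

```python
# Minimal sketch of "one interface that routes to the right AI".
# Model names and keyword rules are hypothetical examples only.

def route_request(prompt: str) -> str:
    """Pick a model name based on crude keyword heuristics."""
    text = prompt.lower()
    if any(k in text for k in ("code", "bug", "function", "stack trace")):
        return "coding-model"        # a model that is strong at code
    if any(k in text for k in ("translate", "german", "french")):
        return "multilingual-model"  # strong European-language model
    if len(prompt) > 4000:
        return "long-context-model"  # model with a large context window
    return "general-model"           # sensible default

print(route_request("Fix this bug in my function"))  # coding-model
```

Even a heuristic this crude demonstrates the interface: one entry point, with model choice hidden from the user.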
But yeah: get the best AI you can, get somebody to show you how to use it, and then use it, practice with it, learn it, and don't blame people if stuff goes wrong, because we all know that AI is not perfect, but people aren't either.
That is a very important thing to consider.
People are not perfect.
And where it matters, you have to have at least a four-eyes principle, or checks that also apply to the AI, and tell people that they are responsible for what the AI does.
So if it matters, they have to double-check it.
And it is important to learn where you can trust the AI and where you can't.
But the same would be true with an intern.
I like that example, comparing the AI to a smart intern who knows a lot but doesn't know your company or what you are working on.
So you have to be explicit and teach it.
And in a way, everybody has to be a manager: teach the AI, show it what to do and how to do it.
And yeah, adapt your processes to work with such a virtual being in your company.
Now, virtual coworker: I don't like chatbots, actually, because that is just an interface where you type and stuff.
Think about a virtual coworker, because it's much more interesting when you can give it anything and invest in the tools.
Have some tools where you can use AI without much friction.
It has to be easy to use.
That is one of the things I built myself.
At the company I wrote the tools so you could just press a hotkey and send whatever you had selected to the AI, to have it translated, checked for correctness, or turned into an e-mail response, and stuff like that.
It's rather easy to do.
And then I taught the people to use that, too; don't just hand it out, but teach people and refresh them, and make sure they keep it updated.
So you need somebody to really do the evangelism in the company, make sure that the tool is used, and see that the people get better with the tool.
That is very important.
And the tool: it must be easy, it must be powerful, and you have to have somebody responsible for making sure that it is getting used.
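The core of a hotkey helper like the one described can be sketched as below, assuming an OpenAI-compatible chat endpoint. The action instructions and the model name are invented for illustration, and the actual hotkey and clipboard wiring are platform-specific and omitted.

```python
# Sketch: turn selected text plus an action into a chat-completion request
# body for an OpenAI-compatible /v1/chat/completions endpoint.
# The instructions and model name below are hypothetical placeholders.

ACTIONS = {
    "translate": "Translate the following text to English.",
    "proofread": "Check the following text for correctness and fix errors.",
    "reply": "Write a polite e-mail response to the following message.",
}

def build_request(action: str, selected_text: str,
                  model: str = "local-model") -> dict:
    """Build the JSON body for a chat-completion call."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": ACTIONS[action]},
            {"role": "user", "content": selected_text},
        ],
    }

body = build_request("proofread", "Thsi is a sentense with erors.")
print(body["messages"][0]["content"])
```

A desktop wrapper would bind a hotkey, read the selection, POST this body to the local or remote server, and paste the answer back.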
So many good points in there; I fully agree with a lot of it.
The idea of having a coach or a teacher come in and give you an idea of what's possible, an environment that encourages curiosity, where you just ask: if you're about to do a task, how can you fit AI into it?
Can you have a little back-and-forth with it?
Can you actually find a tool that accomplishes it?
It's all really good stuff.
Now, when a company is trying to select an LLM: they've started using different tools, and they've realized there's something that satisfies most of their cases, but they want to build something in-house that's perfectly custom to them.
But they don't know which LLM to go with, whether it's OpenAI, Anthropic, any of the big ones, any of the local ones.
Do you have a process or a general idea that people should follow to help evaluate these models before they get into the specific stuff?
Like, when would I choose model A over model B?
Most businesses actually want to have a local AI, so they can use it for data-privacy-relevant stuff and not send that out to the Internet.
I'm here in Germany, and in Europe you have very strict data-privacy regulations, of course.
So most companies can't even use OpenAI, for example, unless it's a specific Azure version and they have special contracts and stuff like that.
It's very obvious that they need something local, and of course then they ask the next question: do I do it in-house, or do I rent a server or a service?
There are some services, like Mistral for example, where you can rent in the EU, and then you would meet the regulations while working with it.
But if you want to do it internally, I would definitely recommend first getting started with the biggest models that you can run on your system.
Of course, it's always resource-constrained.
But yeah, follow the news, follow the people like me who report on it.
And in the end, right now I would say, if you are in Germany or in Europe, you have the languages to deal with.
So you don't have much choice, actually, because the Chinese models are very smart, but in the European languages (I specifically tested German) they are not as strong.
So you have Gemma 27B, or smaller if you need to, but the biggest Gemma you can run locally is the 27B.
That would be my first choice, or one of the Mistral models, which are very strong.
I would test these.
And you always have to make your own benchmarks, because in the end, nobody else knows what you are doing with the AI.
That is what I always recommend to people: if you have something the AI is not doing right, something that's not working, write it down.
That is a great benchmark, because if it's not working, you can test it with another AI to see if that one works.
So over time you can collect a couple of these cases, and then you have some personalized benchmarks that you can use to evaluate whether an AI model is finally able to solve the issues you have.
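The failure-case collection suggested here can be as simple as a list of prompts, each paired with a cheap pass/fail check. A minimal sketch follows; the cases and the stub model are invented for illustration, and a real run would call an actual model API in place of the stub.

```python
# Sketch: a personal benchmark built from collected failure cases.
# Each case pairs a prompt with a simple substring check on the answer.
# Cases and the stub model below are invented examples.

cases = [
    {"prompt": "What is 17 * 23?", "expect": "391"},
    {"prompt": "Translate 'Guten Morgen' to English.", "expect": "good morning"},
]

def run_benchmark(model_fn, cases) -> float:
    """Return the fraction of cases whose answer passes its check."""
    passed = 0
    for case in cases:
        answer = model_fn(case["prompt"]).lower()
        if case["expect"].lower() in answer:
            passed += 1
    return passed / len(cases)

# A stub standing in for a real model call:
def stub_model(prompt: str) -> str:
    return "17 * 23 = 391" if "17" in prompt else "Good Morning!"

print(run_benchmark(stub_model, cases))  # 1.0
```

Rerunning the same cases against each new model or quant shows at a glance whether the issues you collected are finally solved.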
And it's amazing to see that progress.
I remember when Qwen's QwQ came out and I could finally do stuff locally that was impossible to do before.
The reasoning was so great.
And I was very amazed when it finally did these tasks that I previously had to send to an online AI.
So that was a big unlock.
And yeah, as a business, ask your employees to collect these special cases where it would really help the business if you got them solved.
And then, if it can be solved, and if it's not specifically protected or private or anything, do use an online AI, because unfortunately they are still better than the open source AI right now.
Or you use something like DeepSeek R1.
OK, that is so big.
Yeah.
I don't think most businesses will start with that.
But of course you can build up to it.
But yeah, I would always take the best AI that is currently available, no matter where it runs, to test and see if it actually does what you want it to do.
And if it works, then you know AI can currently do it.
Then you can go down in size, to local models or to quantized models, and see if it can still be done.
That way you can tune in on which model you want to use.
And yeah, like I said, I would start with Gemma 27B and see if that works.
Yeah, I like Command R and Command A, for example.
They have been around a long time and run locally, but you need a commercial license for them.
And yeah, check out Qwen; I love Qwen.
Maybe even build a pipeline where you have the smart Chinese model do the task, and then have models that may not be as smart, but can write your language better, summarize and convert it so it's more presentable.
There are multi-model systems you can set up to really start leveraging the strengths of each of them.
One thing you touched on, which I think is brilliant, is having everyone in the company store what doesn't work and then have that as an eval set.
You start having that local benchmark so you can build up these capabilities, or test different LLMs in a system.
Do you have any approaches or strategies to make this automated, or just easier to do?
And how would you store this information, for collecting these failure cases that we want to see succeed?
So usually a company has some kind of knowledge management, like Notion or a wiki, or just documents on your systems.
Just put it somewhere; that is the easiest path, and it's pretty easy to copy and paste.
And usually you take the input out of somewhere and then give it to the AI, so you already have it stored somewhere.
Maybe, if you are writing it up on the spot and the AI fails, then before you save it, it would be cool to have some AI guy in your company look at it.
When I was doing that at the previous company where I was, the people always called me, and I looked at the query and could help them and tell them, OK, this can never work because of this or that; maybe the context window is too small, or you haven't given enough information.
That is important, because you want good test cases.
But if nobody is available, just write it down.
Somebody can later go through the cases and pick the best ones.
So just write it down somewhere.
And if you have somebody capable in your company, you can think about setting up a proxy or something that actually saves these queries, or you set up something local like Open WebUI, for instance, where you have the ability to rate answers and put them in a database internally.
There are many options, but you definitely should collect them, because AI gets better all the time, and having a good personal evaluation set is very important for finding out whether it works for your use case.
Because a big MMLU-Pro score doesn't help you if the model doesn't speak German very well and doesn't do your task, if that is what you need.
And speaking of MMLU and the other major benchmarks: do you even find any value in them anymore, or is it the local setup, the benchmarks and evals, that matter much more?
I find a lot of value, and I actually run the MMLU-Pro benchmark myself with the models, because the thing is, these benchmarks are widely available, which is a negative as well, but at least most of the models that get released get scores on these benchmarks.
And locally, you are usually running a quantized model, because you don't have the resources for the full FP16 or 32-bit version.
Then you can run the available MMLU-Pro benchmark locally with your quants and your prompts and so on.
And "prompts" is not quite the right word; your settings, for example.
And then you can see how the score changes between what they published and what you get on your own system.
And I've been doing very specific benchmarks.
For example, when Qwen was released, or the QwQ model as well, I took various quants and ran the same benchmark, so I can see if the 4-bit version is much, much worse than the 8-bit version, for example.
That helps a lot to find the proper quant.
I often found issues with the model files themselves, where, for example, an 8-bit version or a different format like GGUF was doing much worse than an EXL2 version, for instance.
That can be caused by a wrong tokenizer config in the model, or just a failed quant, or stuff like that.
So yeah, the benchmarks help you find these things out.
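Comparing quants the way described here boils down to running the same benchmark per quant and flagging outliers. A sketch of the comparison step follows; the quant names and scores are invented example numbers, and the tolerance threshold is an arbitrary assumption you would tune for your benchmark.

```python
# Sketch: flag quantized versions whose benchmark score drops suspiciously
# far below the best quant. A large drop may indicate a broken quant or a
# bad tokenizer config rather than normal quantization loss.
# Quant names, scores, and the tolerance are invented example values.

def suspicious_quants(scores: dict, tolerance: float = 0.05) -> list:
    """Return quants scoring more than `tolerance` below the best one."""
    best = max(scores.values())
    return sorted(q for q, s in scores.items() if best - s > tolerance)

scores = {"fp16": 0.71, "q8_0": 0.70, "q4_k_m": 0.68, "q4_broken": 0.41}
print(suspicious_quants(scores))  # ['q4_broken']
```

Here the 4-bit quant that scores 0.27 below FP16 stands out immediately, while the normal small losses of the healthy quants pass quietly.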
And that is also a tip I have: when you download a model from Hugging Face, for example, a new model gets released, you get the GGUF, and away you go.
You should watch it, especially if it's new, because sometimes there are these issues, and they get fixed, and then there's an updated version, and you may not even notice.
So you may be running a worse version.
And that is really bad if you just test it that way and then say, oh, the model is bad.
I don't know what everybody else is doing, because on my system it doesn't work.
But maybe you have a version that has a problem.
So updating is actually also important for the models, and sometimes they get optimized, like the Unsloth team is doing, which I respect a lot.
They look at the model and find little issues, where the quants that were made were not optimal, because, yeah, it's a science in itself.
And it's very important to make sure that the model you are using is good; it doesn't help if the original model is great but the version you have has issues and performs very badly. That is also very important.
And I'm glad you brought up the Unsloth guys, because they publish everything openly, and it's really cool to see the process of diving in.
As someone who's more in the application layer rather than the training and ML layer, I still find it so interesting to see how you can really dissect these things to optimize performance.
I wanted to add to that, because it also shows that there are many, many factors besides just the model weights. You also have the prompt format that you are using, the actual prompt.
Of course everybody knows the prompt matters, but the format itself is also very important.
And if you are using the chat completion endpoint, the inference engine usually does the formatting for you, if the template that is part of the model is correct, and there have been issues with that as well.
And if you are using just a completion endpoint, you have to do your own formatting, which locally is sometimes still used.
So that is also very important.
I did a lot of template work before.
For example, SillyTavern is actually one of the frontends where I did a lot of the templates, and KoboldCpp, for example; I was using both ends, frontend and backend.
So I was making sure that the templates were right.
That is one thing.
And of course the samplers have a major effect.
Everybody knows temperature, but there are so many others: repetition penalty, top-k, and all the tail-sampling stuff.
Most I don't use, but you always have to check which default settings the inference engine or the model provides.
Especially if you are using, for example, Ollama: they have those baked into the model files, and you should make sure that they work, like the context length.
If you don't have the right context length, then your big model, your good model, is not as good, because it will be forgetting stuff.
And of course there are always resource limits: the bigger the context window, the more resources you need.
So you have to see where it fits.
And that is also something everybody has to do: optimize the inference engine and make sure that the settings fit their use case.
That's very important.
If you are importing big documents and your context window is too small, it can't work.
And that is not the model's fault or the software's fault.
So that is also something you have to evaluate.
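The context-length pitfall mentioned above can be caught with a cheap sanity check before sending a request. A sketch follows, using the rough rule of thumb of about four characters per token; real tokenizers differ per model, so the estimate and the reserved output budget are assumptions, not exact values.

```python
# Sketch: verify a prompt plausibly fits the configured context window
# before blaming the model. Uses a crude ~4 characters/token estimate;
# use the model's real tokenizer for accurate counts.

def fits_context(prompt: str, context_length: int,
                 reserve_for_output: int = 512) -> bool:
    """Rough check that the prompt plus reserved output tokens fit."""
    estimated_tokens = len(prompt) // 4 + 1
    return estimated_tokens + reserve_for_output <= context_length

big_document = "word " * 10_000           # roughly 50,000 characters
print(fits_context(big_document, 2048))   # False: window too small
print(fits_context(big_document, 32768))  # True
```

A check like this, run before the request, distinguishes "the model forgot my document" from "the document never fit in the window".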
User error is a real thing.
You'd mentioned Ollama.
I'm a fan of Ollama.
I also use Jan for my local inference.
Do you have any tools that you use for local inference, whether it's for testing or for daily use, that you'd recommend people check out?
Oh, I started with, let me think, Oobabooga.
Actually, Oobabooga's web UI was one of the first, then KoboldCpp; Ollama at work, I used that.
So those, and yeah, of course there is llama.cpp, the granddaddy, and now it has its own UI and especially its own server, where you can use it as an API; that was a big unlock as well.
ExLlama is very fast if you can put it all on GPU.
My personal AI workstation has 48 GB of VRAM.
It's my desktop PC, but I got two RTX 3090s into it.
And I know that a lot of companies are also using 4090s and 5090s, mostly two cards, and that's it, unless you go really big with an H100 or something like that.
At the company I was at before, we also got one system; it had just one card, and only 48 GB as well.
So basically, you should use whatever works best for you.
Of course Ollama is great, but I wouldn't use it for production; it's more for testing, I would say.
Much like Docker and Kubernetes: I would use Docker for easy deployment and development, and Kubernetes if you want to go full scale.
So vLLM, for example, is really fast and performant; others I have no experience with.
But there are so many options, as you see, and you should check out which one works best for your situation.
Do you want to use AI internally, for your team?
Then you don't need as much parallel inference, depending on the company size, of course.
It's different when you want to provide it to your customers and run it at scale.
So yeah, I don't have much of a personal recommendation.
Right now I'm on the Mac using LM Studio, because it's so easy to use; use whatever floats your boat, I guess.
They should all work very well, and no matter what you do: evaluate, evaluate, evaluate. You always want to benchmark the system and see which is faster, and that can depend on the model as well, for example with multimodality.
That is an area where support is unfortunately not available in all the engines, so you may be forced to use a specific inference engine that supports multimodality for the models you are using, although the support is getting better now.
One avenue I'd like to explore a little bit is when people are running these local models.
I also use a MacBook, so I can just run the smaller models on my machine, and that does the trick.
But when you get to the small-business scale, or even medium businesses, they have the privacy concerns, so they don't want to use OpenAI, but they're unsure of how to deploy it, or where they should host their local AI, whether they want to set up a server somewhere to run it.
Do you have any recommendations on how people can explore the hardware side of things, whether it's owning or even renting servers, and at what stages they might want to explore more before overcommitting, if they don't really know what their solution is yet?
That is a bit like servers in general.
I think if you are a very small company, you may be better served by cloud services, because you don't have a dedicated team to maintain these systems.
If you are a medium-sized company, you probably have an IT department and a lot of servers locally; getting one more and putting it somewhere in a server room is not a big deal.
Of course, you should have some people maintaining it who know about the LLM part, or the AI part in general, as well: which inference engine, like we just discussed, to use, and how to keep it and the models updated, and stuff like that.
You should have somebody knowledgeable, of course, because yeah, it doesn't help if you don't have anyone to keep these systems updated.
For example, at the company where I was before, when I left, I asked: do you have some of the new models now that have been made available, like Claude 4 recently?
No, they were still on 3.5, I think, or maybe 3.7 even, but not any new one.
So you need somebody who takes care of these systems, of course.
And yeah, graphics cards are hard to get and getting more expensive, unfortunately; we all want more VRAM.
Or you could go the other route and say, OK, I'm going for something like DeepSeek and I want to run the MoE models, the mixture-of-experts models, where you can put them in RAM and still have good performance if you have a very fast CPU.
That is another route.
Or maybe do both.
Of course it costs more that way.
But yeah, you want to have options.
That is one thing.
And yeah, buying a system and having someone maintain it is of course a permanent cost that you have to carry.
Whereas if you use a cloud service, like Azure AI or one of the others, or one of the specific providers (you can use Hugging Face, for example), they take care of the maintenance.
So that may make sense; you just have to check your regulations to see if you can do that, or if you need a local system.
Mistral just started with this, too: they want to build big compute capacity and offer it.
So there are options, and there will be more options.
I think, as always, you may want to start with some service where you know what it costs and you can cancel at any time and switch to another service.
If you buy a big CPU system, and now the models need more GPU, then yeah, you are locked in a bit and you have to upgrade.
And if you go the other way, you invested in a big GPU system, and now the MoE models are where it's at, then you may be locked in there.
So as a business, you'll probably want to stay flexible.
If a couple thousand bucks is not the problem, then you can just get a local machine and make sure that everything only runs locally.
What I did at the previous company was have two systems, two instances of Open WebUI.
One was external, where I had all the external models integrated, like ChatGPT, Claude, whatever was available, the big Llama, for example, all running externally, and it was all red.
The interface was red, so people knew: oh, careful, this is going outside.
And then we had a local system, which was all green, and it was all local, so people knew: OK, I can put anything in here, it is staying on my local system.
So I would recommend starting with anything that is good, and then deciding if you want it in-house; then build a team and build a server.
Good advice, and I really like the idea of having a very visual UI showing what's internal and external.
So you kind of know it's safe to put in anything, your business data, versus just general queries.
Let's see, I actually had to tell the people that they should use the external system by default, and only go to the internal system if it is sensitive data.
Because the system we had, with 48 GB of VRAM, could only run the quantized local models, so it wasn't full quality.
And even the big ones didn't compare yet to what you could get from OpenAI or Anthropic.
We're getting there bit by bit.
Do you see, or do you think of, any other hidden expenses or general costs for businesses who want to host their own AI?
I think there is an issue if you only offer internal AI, and it is the best you have, and the people are not allowed to use external AI, but the models you are serving are not working for the use case. Then you have a cost that you don't even see, because it's basically lost opportunity.
Or, yeah, you don't get the full benefit AI could give you if you were going to a better AI.
And I love open source, I am a big open source advocate, but we have to be realistic, especially in the business context.
It's not just ideology.
And yeah, right now it is still the case that the closed AI is more advanced.
But I think the gap is closing all the time, and the Chinese especially are doing a great job at that.
So.
I use local AI, I use external AI; I think most will use both and see where each works.
But just going internal and forbidding your employees to use external AI may be limiting.
And that is a cost that you won't even see.
In the same way, it's probably an even bigger loss if you don't use AI at all.
Outright banning it, that would be a bad idea.
Yeah, it's no longer the day and age to be in denial that this is the future.
One benefit that I really like with hosting your own model, or leveraging the open source models, is the ability to fine-tune them on your data.
Do you have a general perspective on at what point people should explore fine-tuning?
Or should they just, you know, stick with hosted, stick with local, and run it stock?
Interestingly, personally I have not find you the model.
I've been thinking about it, but it's not something I have been doing.
I have been more prompt engineering and getting my way.
That way, even with the big commercial models, they are doing what I want.
But yeah, if you want to find you and of course you need somebody to do it and you need some resources of course.
So yeah, it it doesn't have to be very expensive.
And I know, from what I see around me, and our evaluation software confirms it, that you can get a lot better performance out of a small model if you fine-tune it for your specific use case.
So I'm not saying fine-tuning instead of prompting.
So if you have such use case, it is definitely worth it to to test it, but you'd need a lot of investment in that way that you need somebody to do it.
You give them the time to do it, or some money, and then it can work.
So yeah, if you don't have somebody dedicated to doing it, you may hire somebody just for this job, contract it out, or you just try it with a prompt.
And the models are getting better, and I think you get a lot with good prompting, of course.
And the thing with fine-tuning a model is: you fine-tune, then you have the model, and now a better model comes out, so you have to fine-tune again.
Your prompts, you can take with you.
You have to adapt them, of course, but you can carry them over very easily.
And that is also the reason why I didn't fine-tune a model for myself, because I always want to use whatever is available, to test it, to use the best.
And I would constantly be retuning the models.
And yeah, for my use cases, all the use cases I have, I get there with the prompting.
So I would say the easiest way is to work on the prompt.
But if that doesn't bring you where you need to go, and you don't have a prompt engineer or anyone who has the expertise to help with that, then consider it.
And you should have an evaluation system that tells you whether it's getting better or not, because otherwise you would be flying blind, and you don't want that.
And yeah, so I would start with a prompt because it's the easiest way and it's the cheapest.
But if it is not getting you where you want to go, consider fine-tuning.
Max out prompting until you can't prompt anymore, and then start exploring these other options.
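His "don't fly blind" point about evaluations can be made concrete with a tiny harness: score each prompt variant against a small hand-labeled set before deciding whether fine-tuning is even needed. This is only an illustrative sketch; `call_model` is a stub standing in for a real hosted or local model, and the prompts and examples are invented for the demo.

```python
# Minimal sketch of "evaluate before you fine-tune": score prompt variants
# against a small labeled set so you aren't flying blind.

def call_model(prompt: str, text: str) -> str:
    """Stand-in for a real LLM call; returns a canned sentiment label."""
    # A real implementation would send `prompt` + `text` to your model here.
    return "positive" if ("love" in text or "great" in text) else "negative"

# A tiny hand-labeled eval set: (input text, expected label).
EXAMPLES = [
    ("I love this tool", "positive"),
    ("This is great", "positive"),
    ("It keeps crashing", "negative"),
]

# Two hypothetical prompt variants to compare.
PROMPTS = {
    "terse": "Label the sentiment:",
    "detailed": "You are a careful classifier. Label the sentiment as positive or negative:",
}

def score(prompt: str) -> float:
    """Fraction of eval examples the model labels correctly under this prompt."""
    hits = sum(call_model(prompt, text) == label for text, label in EXAMPLES)
    return hits / len(EXAMPLES)

if __name__ == "__main__":
    for name, prompt in PROMPTS.items():
        print(f"{name}: {score(prompt):.2f}")
```

If the best prompt already clears your quality bar on a set like this, you can skip fine-tuning; if not, the same harness tells you whether a fine-tuned model actually improved things.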
One thing I'd like to pivot to is you've built yourself a personal assistant, and I think a lot of the future of our AI software is going to be very individual, very personalized.
What advice would you give to someone who has an idea for a little tool or assistant that they want to build and they just don't know where to begin?
Do they go to ChatGPT, "help me build X", but then what do you think they should do?
Yeah, that is the best start you can actually do.
If you are having an idea, talk to the AI about it.
Of course you could talk to people.
Don't forget that at all, keep talking to people, but yeah, you may not have somebody around you with the same expertise, or it may be an area where you don't know much about.
So ask your AI.
Of course, then you also have to be very careful, because we all know AI can tell you something that sounds so plausible, so logical, and it's utterly false.
So yeah, you always have to double-check, and the good models, the better models, are doing agentic stuff.
They are searching the web, they're giving you sources.
So that is getting better.
And yeah, that would be the start: take the AI you like and that you have been working with, and ask it.
Always use the best AI.
In this case I would definitely use o3, for example, instead of 4o, because it is doing its thinking and checking and tool use and stuff like that.
It helped me fix my workstation.
My workstation was totally broken.
I just went to ChatGPT on my other system, and it went through everything, from debugging to finding out that it was a broken power supply.
It wrote the mail to the vendor I got the system from so I could get a new power supply.
And then I took a picture of all the cables, and it told me which cable goes where, which ones I needed and which I didn't.
I showed it a picture: what cable is that? Because I'm not a hardware guy.
I'm a software guy.
So I showed it everything, I took pictures, and it said: OK, you have to unscrew this, and then you can pull this out, and do things.
And it gave me exact instructions on how to do it.
It even looked in the manual of my motherboard and zoomed in; you know, the o3 stuff, it can zoom.
And it even took the pictures that were relevant and cropped them and showed me just what I needed to see.
It was amazing, absolutely amazing, and it helped me totally fix the system.
So if you want to build something, just do the same.
Of course, if you need to program, I would recommend an editor like, for example, Cursor or Windsurf, or use Claude Code instead.
Because if you want to build something, that will work better than copy and pasting stuff in the web UI.
Although even o3 can now give you downloads and create stuff for you.
And yeah, it's amazing what the models are doing and can do.
And it will get even better.
And I think now is the best time.
If you have an idea, you should be able to realize that idea.
The time has never been better because you just need to be smart enough and have some motivation to get started.
And then you can quickly learn all these things.
Maybe this could be a point where I can tell you another anecdote about something that changed my mind about AI.
When I was using AI at work at the previous company, I was always of the opinion that you need to be smarter than the AI to make good use of it.
Because if you aren't, you don't even know if the AI is telling you the right thing, and you can't direct it.
But I was wrong.
When I moved to the AI team, I left a vacancy on the administration team, so we needed a Linux administrator.
And yeah, it was hard.
We had a hard time finding someone who was good enough for the position, so in the end we decided to convert a Windows admin.
And yeah, so I showed him everything.
And he had AI, and that was the big unlock, because he was smart enough and knew computers, but he had no idea about Linux.
But he could ask the AI, is what I'm saying.
And even if he didn't understand something, he could have it explained.
And it showed me that you don't have to be an expert in the topic where you are using the AI to make good use of it.
If you have the right mindset to work with it, and let it explain.
Don't just tell it what you want to do, but also let it explain how it's doing it and why it's doing what it's doing, so you can learn.
He learned Linux that way.
And it was much, much faster than if I had had to do everything by myself.
I was so proud of him.
And he was so happy about the stuff he had learned on his own through the AI.
That was a good thing.
I had another colleague who, yeah, should have done the job before, but we had to let him go.
Because what he was doing was saying "do this" and just letting it rip.
And when I saw that, he didn't know what the AI was writing. For a simple task it wrote a lot of code.
And he just said: I don't know what it's doing, just do it.
And I was saying, no, that's not the way you can do it.
It was a test system.
But do not do this on production or anything like that.
And that is a mindset thing, how you work with it.
Let it explain, ask it again.
My colleague always said we should have an automated "Are you sure?" sent before the AI does something.
So yeah, you have to learn to use the tool.
And a silly example, but I've been using just the ChatGPT app around my yard.
I take a picture of a flower or a plant, like, hey, what is this?
And it'll give me an answer.
Very believable.
And then I just do a quick Google search after, just to confirm the plant.
And then I look at the photos, and it got three out of four.
So 75% accuracy is decent.
But it was so confident about the one it got wrong that, yeah, you've got to be skeptical.
Even with generating writing or communications.
It sounds like AI, and people are getting better at being able to identify it.
So take the output as an input, not as, like, the sole source of truth.
Yes, exactly.
It's still the human that is using the tool.
If you have some agents that are totally autonomous, maybe robots, that's something we can talk about then.
But until then we have a tool.
We have a user, and the user is responsible for what they are doing with the tool.
And yep, the tool can be the best tool.
Like you can have the best hammer, and you can still hit your finger and hurt yourself.
Or somebody else.
And if you want to get a screw into the wall, the hammer is not the right tool anyway.
Yeah, that's a great analogy.
AI affinity, I call it AI affinity, like you have computer affinity.
I saw it when computers came into the companies: you were looking for people who were able to use them.
And yeah, some older people were not.
Some other older people took to it easily.
So that is a mindset thing.
It doesn't have to do with age or anything.
It is just are you open for the technology and are you willing to learn it?
And that's no matter how old you are.
That is the thing that that we need with AI now.
And you had mentioned, when people are building their projects: use o3, whether it's through the ChatGPT app or just on the website, and use an IDE like Cursor or Windsurf or something like that.
Do you have any other tools that you'd recommend people check out if they're in the interest of building or even just trying to leverage AI in their work?
I'm a tools guy, I have to say.
So I, I want AI to work for me everywhere I go.
So I want to have it.
I got a Google Pixel phone, for example, just because it has all the great integration.
I don't even have to unlock it.
I can just say the code word and the AI reacts, and all that stuff.
And I have all the AI tools on it as well.
And now that I'm on a Mac, I also got tools like Raycast.
For example, Raycast costs about the same as a ChatGPT subscription, but you get ChatGPT, you get Anthropic's Claude, you get all the other stuff.
Even Grok is in there, and Llama, and a lot of things.
And then take the time to learn it and set it up.
Like I told you, I wrote this tool where you can just press a keyboard hotkey to send anything to the AI.
I implemented that with Raycast as well.
The caps lock key is my hyper key, which is for all these things, because you usually don't need caps lock.
I don't.
So for me it's just working as a multi-function AI key; I just press it and it does its thing.
And on my mouse, for example, I have a button: when I press it, I can just draw a rectangle on the screen, and it goes straight into an AI chat and I can ask about it.
I have a key I can press so I can talk to the computer and it gets transcribed.
That is Highlight, actually. So if we are talking tools: Raycast and Highlight.
And there's Code Typist, which runs a small local model and gives you autocomplete in any text window, everywhere you are typing.
And these are little things that, if you combine them, give you an AI powerhouse that you can use for anything.
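The "hotkey sends anything to the AI" tool he describes can be sketched in a few lines of Python against any OpenAI-compatible chat endpoint. Everything here is a placeholder, not his actual setup: the URL points at a typical local server, and the model name and system prompt are invented. A launcher like Raycast or Hammerspoon would capture the text and invoke `ask_ai` on the hotkey.

```python
# Sketch of a hotkey-to-AI helper: take whatever text the hotkey captured
# and ship it to an OpenAI-compatible chat completions endpoint.

import json
import urllib.request

API_URL = "http://localhost:11434/v1/chat/completions"  # placeholder local server
MODEL = "my-local-model"  # placeholder model name

def build_chat_request(captured_text: str) -> dict:
    """Build the JSON body for a chat completion about the captured text."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": captured_text},
        ],
    }

def ask_ai(captured_text: str) -> str:
    """POST the captured text to the endpoint and return the reply text."""
    body = json.dumps(build_chat_request(captured_text)).encode()
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

The same function works for the screenshot-rectangle case if the launcher OCRs or attaches the image first; the only part that changes is what gets put into the user message.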
And I still do go to ChatGPT, for example, where I can use o3 with all the integrations that they have there.
Perplexity was a big thing I recommended to people, because that was at the time when search was not yet standard in OpenAI's ChatGPT.
And a lot of people asked stuff, and the model didn't search the web; it hallucinated.
And that was a bad time.
So I told people to go to Perplexity and do that.
I don't want to make too much advertisement, but I think if a tool is good, it's always a good thing to talk about it.
But yeah, I'm not sure if I'll keep Perplexity.
If that browser is any good, I will, but otherwise I use it much less now.
I use o3 a lot, especially after they raised the rate limits.
I like it a lot.
Claude is sometimes a bit too abstract, or how do you say it?
It can be a bit boring.
But actually, I have my personal assistant.
My Amy is a big prompt I have that I use everywhere.
I use it with local AI, and I use the same prompt with ChatGPT; I have it in Perplexity.
I have it everywhere I can, even in Gemini as a Gem.
So I always interact with the AI through the lens of that sassy assistant, which makes it more personable.
So personality for AI is very important, I think because when we are using a tool all the time, it is great if you have a tool that is fun to use.
And when I'm having an issue with the computer and I get a sassy response that makes me laugh, then it's not as bad as it would be with just boring old assistant stuff.
That's such a great point where these are tools that we can really shape and mold into whatever preference we have for interacting with them.
We don't have to just choose between light mode and dark mode, you can actually set personality and tone.
One thing that your answer reminded me of is that I have Hammerspoon set up locally, which is a tool for interfacing with macOS.
And by taking Hammerspoon's API and feeding it into ChatGPT, I can get scripts for accomplishing different things and set them to different hotkeys, and all of a sudden my laptop got that much more powerful, just because I had a little idea, some inspiration, for window management or for transcription, and it was able to do that.
Obsidian, my note-taking app: you can generate plugins for it very easily now.
So it's just all you have to do is be willing to experiment and play with it, and then you can get pretty much anything you want out of these systems.
And the systems will help you.
Because when I sat down at the Mac, and everybody here has a Mac, I had not; I have been using Windows for, let me think, 30 years now, and I totally customized it to make it work the way I want, with AutoHotkey and other tools.
And the thing is, when I was on the Mac, I asked the AI. I told it: hey, I'm on a Mac now and I feel lost.
I want this, I want that.
And like you just mentioned, Hammerspoon: that was also a tool the AI recommended.
And another tool it recommended was Karabiner.
And yeah, it even wrote the scripts.
I just told it: OK, I want my caps lock key to be a paste key and my shift key to be a copy key.
Because if I just press caps lock, I just want my clipboard to be inserted.
And if I press shift without anything else, I want the selection copied, because then I have a copy key and a paste key.
So I have these two keys.
I don't have to press two keys together.
And it wrote the script that did everything.
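In Karabiner-Elements, a remap like the one he describes lives in a complex-modifications JSON rule. This is a rough sketch of what such a rule could look like, not Wolfram's actual config; the key choices are illustrative, and `to_if_alone` is what keeps shift usable as a normal modifier while a lone tap sends copy:

```json
{
  "description": "Caps lock pastes, lone shift copies (illustrative sketch)",
  "manipulators": [
    {
      "type": "basic",
      "from": { "key_code": "caps_lock" },
      "to": [ { "key_code": "v", "modifiers": ["left_command"] } ]
    },
    {
      "type": "basic",
      "from": { "key_code": "left_shift" },
      "to": [ { "key_code": "left_shift" } ],
      "to_if_alone": [ { "key_code": "c", "modifiers": ["left_command"] } ]
    }
  ]
}
```

This is exactly the kind of file you can have the AI write for you, since the format is well documented and it only needs to know which keys you want mapped to what.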
And yeah, it's great.
You tell it what you want.
It tells you which tools to use, and it will write the scripts for you or give you instructions on how to use them.
For example, I wanted to cut a video and do something.
So it told me to get CapCut, and I got it.
And I said: OK, I'm here, what can I do now?
And it said: OK, now you have to do this and that, and you want to do this.
It explains the tools to you.
If the tool is popular and well known it can help you use it.
That is so amazing.
It's so much fun.
All right, well, last question for me, this has been great.
If someone's working at a company and they want to help promote the use of AI, or try to get more AI adoption, do you have any advice for how they can talk about it, bring it up, and just try to help accelerate the company moving forward?
Yeah, it depends on your manager or your founder or boss.
If they are a numbers person, then you just look up some studies that show how much efficiency gain you get from AI, how much money you can save, and how many other companies are using it.
That is one thing.
But otherwise, it's always good to show an example.
If you can show them how you managed to do things that would otherwise have taken you much longer, if you can just show them.
Yeah, that would be the best thing.
I think showing is more than telling.
So.
Yeah.
And tell them that you are into AI. I think a lot of people may not even know.
When I talked to my boss, he didn't even know that I was doing all these things in my spare time.
And then he said, oh, I have to get you on an AI team.
We have to do this.
If I hadn't talked to him about it, he wouldn't have known.
I would have done it in my spare time.
And yeah, nobody would have known.
That's how it all got on its way.
So it's important to talk about it, to show off what it does, and yeah, show it.
I think it's so good that you should be able to find some use cases where you can really show that it has a big advantage.
Fully agree Wolfram, I've really enjoyed this conversation.
Your passion just comes right through.
Before we let you go, is there anything you want the audience to know?
Yeah, stay tuned, because I have not been posting a lot of evaluations recently, but at ellamind we are working on a really solid evaluation platform, and yeah, there will be stuff to come, and as soon as I can talk more about it, I will definitely do so.
Very excited for that.
All right.
Well, thank you for coming on and we'll talk to you soon.
Thank you and thanks for having me.
It's a pleasure talking to you.
Keep going with that great show.