
AI-Assisted Coding: Avoiding "Chair Pants"

Episode Transcript

Pia: 00:00

Hello and welcome to The Fuse and The Flint, where technology, innovation, and development intersect in exciting ways. I'm Pia Opulencia, head of product and AI at 8th Light. Joining me is Pierce Edmiston, principal crafter at 8th Light. In today's discussion, we'll explore AI-assisted coding and the importance of avoiding what we refer to as "chair pants."

Pia: 00:29

So Pierce, I love the chair pants metaphor. Can you tell me a little more about its origins and how it relates to AI and software development?

Pierce: 00:37

Sure. Yeah. So am I right, Pia, that I got you to watch Jury Duty?

Pia: 00:43

You are correct. You got me to watch Jury Duty.

Pierce: 00:46

So in the show Jury Duty, there was this character, Todd, this socially awkward inventor.

Pierce: 00:53

And there's this one episode where he shows up for jury duty wearing chair pants: these crutches attached to his waist that swing out and allow him to sit anywhere he wants, secured by the crutches. He's just a hilarious character, and I love that scene.

Pierce: 01:14

But for me, as I was thinking about this, the reason I was talking to you about it is that Todd sort of represented a caricature of the AI-enabled software engineer. Someone who thinks they don't need to learn how to code; all they need is to know how to talk to a chatbot and give it the right prompt.

Pierce: 01:35

And they'll never need to know about writing code again. So I thought the chair pants idea was an interesting point-in-time marker for where we're at with AI-assisted coding, and how a lot of the solutions out there sometimes feel to me more like inventions in search of a problem than real solutions for software engineers.

Pia: 01:58

And so that we can avoid chair pants situations, I'm wondering if you could mention a couple of pitfalls that this caricature might run into when looking at this emerging technology. When you think of AI-assisted coding, what are some common pitfalls?

Pierce: 02:13

Well, yeah, the first thing that comes to my mind is that just having a large language model that can write code is not sufficient to do everything a software engineer does. Large language models that can write code don't solve software problems more generally. And you know, I think about Todd's chair pants: they don't solve sitting in general.

Pierce: 02:33

They solve sitting in this very specific case where there are no obstacles around and he's standing on open ground. So these models are often really good at writing greenfield code, you know, new code to solve a problem. But that kind of problem, in my experience, is actually relatively rare in software engineering.

Pierce: 02:54

I think of something a fellow 8th Lighter, Aaron Leahy, once said to me: the code is the easy part; it's finding where to put the code that's the hard part. And even if a large language model is really good at generating a bunch of code from scratch, it doesn't necessarily mean it's good at integrating that code with a legacy system or whatever already exists.

Pia: 03:16

And so when that integration happens, it sometimes expands and scales existing problems within the codebase or the organization, unintended. I've heard teams really struggling with the fact that these tools can sometimes exacerbate code quality issues more than alleviate them.

Pierce: 03:38

Yeah, I mean, it really does depend on how you're trying to use it.

Pierce: 03:41

I mean, if you're approaching it as a black box that you can just trust, you're inevitably going to have a lot more work on the QA side of things. You're going to be blindly looking around for bugs because you weren't careful in how you introduced that code. These models don't mind outputting pages and pages, lines and lines of code.

Pierce: 04:06

They have no problem with very, very long solutions, and in software you can always add more complexity, and that's often not good for the long-term health of the software. Looking for opportunities to refactor, simplify, and do things differently is just as important as doing it correctly.

Pia: 04:25

I guess on that front, do you find that there's almost an inherent assumption with these LLMs that what's on main and in production is good? And so it branches out and assumes that, versus, you know, how do we interject a different pattern or a different approach to solving a problem? Can you unpack that a little bit?

Pierce: 04:45

Yeah, for sure. So when we think about how large language models are trained, we know they use some sort of masking: they're trying to predict the next word in the sentence, or predict the preceding sentence, all of these ways in which they're just trying to fill in the blank for what somebody had written.

Pierce: 05:01

And that carries an assumption that whatever filled that blank was the right word. And so if you apply that to code: just because the line between 99 and 101 in someone's open source project happened to be this particular formulation, it doesn't mean it was the best answer.

Pierce: 05:23

It just means, you know, the model filled in the blank there. So I think these large language models are good at doing things that are syntactically correct and might look like good Python code or good Java code. But the training data they're pulling from has no measure of quality other than whether the model successfully repeated the training data.
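To make that fill-in-the-blank objective concrete for readers, here's a minimal sketch using Hugging Face's fill-mask pipeline. The pipeline itself is a real library call, but the model choice and prompt are illustrative, not anything from the episode:

```python
from transformers import pipeline

# A minimal sketch of the masked "fill in the blank" objective Pierce
# describes. Assumes the Hugging Face transformers library; the model
# name and prompt are illustrative.
fill = pipeline("fill-mask", model="bert-base-uncased")

# The pipeline ranks candidate words purely by how plausible they were
# in the training data -- not by whether they are the "best" answer.
for candidate in fill("The function returns a [MASK] of results."):
    print(f"{candidate['token_str']!r}  score={candidate['score']:.3f}")
```

A high score here just means the pattern was common in the training data, which is exactly the point about repetition standing in for quality.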

Pia: 05:46

And so we end up at this notion, which you've talked about in your article: badly generated code might be worse than no code at all in some circumstances. What are the unintended consequences of that in your mind?

Pierce: 06:02

Yeah. Badly generated code can waste time. I mean, I had a recent example where I was prompting for a simple script that was going to load all of these objects from a remote file store.

Pierce: 06:13

It was doing it sequentially, one at a time, and I was thinking, well, wouldn't it be great if I could speed this up, do it in batches, do it in parallel? And the AI model came back with: yeah, you can use the plural version of this get_object, you can use get_objects.

Pierce: 06:31

And I was like, oh, that's great. So I went and refactored a bunch of my code around that plural form, and it was much cleaner and was going to be much faster. And then I discovered that the method didn't exist. There was no plural form. I agree with the AI model that it should exist, but that doesn't mean it does exist.

Pierce: 06:48

And so I ended up wasting 30 to 40 minutes on that small example, and I just had to undo all of that work that I thought was progress.
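For readers curious what the working version of that speedup looks like, one honest fix is to parallelize the real, singular call rather than trust a hallucinated batch API. A minimal sketch, where the `client` object and its `get_object(key)` method are hypothetical stand-ins from the anecdote, not a real library's API:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_all(client, keys, max_workers=8):
    """Fetch many objects by fanning out the singular call that exists.

    `client` and its `get_object(key)` method are hypothetical stand-ins
    for the remote file store in the anecdote; there is no plural
    `get_objects`, however plausible the model made it sound.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order and surfaces any worker
        # exception when the results are consumed.
        return list(pool.map(client.get_object, keys))
```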

Pia: 06:57

Hmm. Best laid plans. So we've talked a bit about pitfalls; maybe flipping that around, what are the effective ways we can use AI-assisted coding tools in our day to day?

Pierce: 07:09

Yeah, for me, my recommendation is that engineers view it at this stage more like autocomplete than a totally robot-generated PR. I much prefer looking at individual lines, small snippets of code generated by an AI, and reviewing them individually, rather than trusting an AI to come up with an entire PR all on its own.

Pierce: 07:33

And then just reviewing that as a whole. So my general recommendation is: think about it in the same category as autocomplete right now, but there's definitely room for it to grow.

Pia: 07:45

And so with chair pants, if you flip that around and you say it's going to generate a PR that I can just merge and it'll be happy and pass all the tests, that is chair pants.

Pia: 07:53

That notion that you think this might be the right solution, but it just doesn't fit the problem at hand. Is that sort of the guidance you'd give to teams tinkering with this technology?

Pierce: 08:07

Yeah, I think it's maybe a recommendation to think about the failure case. You know, for Todd's chair pants, obviously when it works, it works great.

Pierce: 08:17

But then the funniest scenes in the show are when he's getting on the bus and can't get into his seat, or he's trying to get into the juror's box and his crutches are getting in the way and it causes this big scene. And so even if we can have one very good bot-generated PR that does a certain thing, we can't rely on it in all cases.

Pierce: 08:35

You know, we need to be able to make sure it's not going to come back to bite us in some way we don't anticipate right now.

Pia: 08:44

What are some guardrails or defenses that you've found useful for experimenting with the different approaches you've been talking about? What has served as a foundation or a safety net for you as you've been learning how to use these tools?

Pierce: 08:59

I think I'm currently using these models as somewhat of an oracle. I'll be working on my day-to-day work and realize I don't understand something well enough about a framework I'm using. Normally I might Google around for a bit and find some good tutorials or blog posts, but lately I've been starting a session with a chatbot and having it explain the concepts. I guess it's more of a teacher-student sort of scenario.

Pierce: 09:34

I remember an old science quote: all models are wrong, but some are useful. And I feel like that when I get a big response from one of these chat models. I never get exactly what I need, but often I'm able to pull out the pieces that are actually valuable and then figure out how to apply them to my particular use case.

Pia: 09:54

That's super interesting. I've never thought about it like an oracle. I suppose there's the notion of AI as advisors, and that kind of tracks with that oracle interaction. You're kind of rubber-ducking with the models to figure out what to do next. I wonder, how does that change how we measure our impact as a software development team?

Pia: 10:19

Or how does the organization see its use changing how we measure outcomes?

Pierce: 10:24

It's an interesting question. Yeah. These models are intended to make it easier to deliver new functionality: taking something that would've taken me a week and now doing it in a shorter amount of time.

Pierce: 10:38

Actually measuring that is probably going to be really difficult. Even if you show short-term gains, I don't think we know yet the long-term consequences of using these models, and in particular of encouraging our teams to rely on them. It might solve problems in the short term, but what if it means engineers are less up to speed on the latest and greatest tech down the road, because they've been relying on this tool for everything they do?

Pierce: 11:06

And I don't know if I have a good answer for how to think about the long-term impact of a deep integration with some of these tools.

Pia: 11:14

Well, yeah. I think it can be looked at too much as an efficiency tool, and we're driving towards shipping faster when there are other criteria, other dimensions we could be measuring success by, like quality or innovations delivered, right?

Pia: 11:28

Because I think the prevailing tone is that all these AI-assisted coding tools are just going to save businesses billions of dollars and we won't have to hire engineers anymore. And I guess I disagree with that. I don't think that's going to be the reality.

Pia: 11:47

I think our roles might change, the tasks and how we look at the problems in front of us might change, but I don't think it's necessarily going to replace or take away from what we're doing.

Pierce: 11:57

Yeah, definitely agree. Another issue I've seen with some of these AI models and the way they're used is that they might give you the right solution, but to the wrong problem.

Pierce: 12:09

You know, I came of age as a software engineer in the heyday of Stack Overflow and spent much of my time searching it, and one of the things you find on Stack Overflow is that when you go to a post, you don't necessarily just look at the original question that was asked.

Pierce: 12:25

You go to the first answer. And why do you do that? Well, part of the reason is that on Stack Overflow, a lot of times the best answers are in response to the question that should have been asked, not the one that actually was asked. So there's a lot of Stack Overflow which is like: I can answer your question and give you a solution that solves your exact problem, but here's why that's not actually what you want to do in this scenario.

Pierce: 12:49

Instead, this is the problem you should have been after, this is the question you should have asked, and this is the solution to that problem. I think that by relying on AI for code, we have to be sure we're asking the right questions, so that we know the answer will actually follow through on the outcome we want.

Pia: 13:12

So, in closing, I suppose what I'm hearing are three main ideas coming out of this conversation. We're highlighting the danger of viewing AI as a magic bullet. Developers should leverage AI for specific tasks within a software development team, but not assume it's going to replace the developer's human judgment.

Pia: 13:32

And LLMs may generate misleading or low-quality outputs if you're not properly managing or mitigating these potential inefficiencies.

Pierce: 13:45

If you enjoyed this episode, make sure to like, subscribe, and leave us a review. You can also dive into this topic and more by visiting 8thlight.com. Be sure to follow us on LinkedIn to stay up to date with our latest work.

Pia: 13:55

And if you're interested in Pierce's article we discussed today, "Avoiding Chair Pants in the World of AI-Assisted Coding," the link is in the show notes.

Pia: 14:02

Thanks for tuning in and we'll see you next time.