Brief summary
Generative AI appears to be making an impact in a huge range of fields, but one that we're particularly interested in at Thoughtworks is its use in software development.
In recent months, there's been a lot of talk in the industry around issues like whether AI might boost developer productivity and if it can be used for pair programming, but in this episode of the Technology Podcast we try to get beneath the hype to explore the reality of generative AI and software development — how is it actually being used today? What works? And what doesn't?
To dive deeper into all this, Chief of AI Mike Mason and Global Lead for AI-assisted Software Delivery Birgitta Böckeler join hosts Prem Chandrasekaran and Neal Ford, discussing everything from the current tooling to the way GenAI is shaping developer practices and workflows.
Episode transcript
Prem Chandrasekaran: Welcome, everyone. My name is Prem Chandrasekaran. I am one of your regular co-hosts on the Thoughtworks Technology Podcast. I also have Neal Ford with me. Neal, do you want to introduce yourself?
Neal Ford: Indeed, I do. Thanks, Prem. Welcome, everybody, to the podcast. I'm Neal Ford, one of your other regular hosts. Boy, you're going to hear a lot of familiar voices on this podcast because our guests today are normally hosts but they're in the much more luxurious guest chairs today in our palatial podcasting lounge. Today, we are joined by Mike Mason and Birgitta Böckeler. I'll let them introduce themselves.
Birgitta Böckeler: Yes. Lots of nepotism always on our podcast, right!? [chuckles] When hosts become guests. Yes. Hi, everybody. My name is Birgitta Böckeler. I'm a Technical Principal with Thoughtworks in Berlin, Germany.
Mike Mason: My name's Mike Mason. I am the Chief AI Officer for Thoughtworks.
Neal: That is particularly apropos for this podcast because our topic today is AI-assisted software development. Mike seems like a perfectly good guest for that.
Prem: Definitely, and so is Birgitta.
Birgitta: Yes. I should have probably mentioned, yes, that I also have a global role at the moment at Thoughtworks where I work in Mike's team to look into exactly this. Mike is the Chief AI Officer and I'm on his team looking specifically into everything related to AI. Of course, the flavor of the day, GenAI, and how to use it for software delivery.
Prem: Thank you, Birgitta. That was great. Let's get started and be honest: tell us what's the latest and greatest here and why is AI-assisted coding such a big deal today or is it even a big deal?
Birgitta: Yes. What's different about this? I think, for me, it's two main things. I like to always look a little bit at other times when we are trying to make ourselves more effective at software delivery through code generation because one of the most obvious and, so far, most widely used forms of using GenAI for software delivery at the moment is code generation, coding assistance in the IDE with tools like GitHub Copilot, Codeium, Tabnine, Codey, all of those many names. [chuckles]
When I think about more traditional code generators, where we formally describe a structure of a language and then have a code generator generate code for us, one of the things that's different here, obviously, is that that's a very formal, structured way of doing it.
It always takes some work to actually describe a higher abstraction of a language and then build a code generator, whereas when we use a large language model to help us write code, to suggest code for us, it's a lot more unstructured and informal. We actually write something in natural language, either a code comment or a function name, and then that gets translated. We don't actually have to create the whole structure around that. It's a lot more on the fly and a lot closer to how we actually think as humans.
The other thing that's different, I think, and where rather big potential lies, is that we're not trying to further raise the abstraction level and have fewer details to deal with. We always tried that with code generators, with low-code no-code. We always try to raise the abstraction level, but then we lose some control and, not always, but in a lot of cases, [chuckles] have to go down the stack again to have more control if we need more custom things.
Now, with GenAI coding assistants, we're actually not really trying to raise the abstraction level. We're going to the side and using this assistance on all of the levels. We're using large language models to help us create low-code no-code applications. We're using them to create Java code, to create Spring code, to generate DSL code for us, whatever the abstraction level.
I think that's the thing that's new, and that's also why it's now different for us as developers to figure out how to use it. It's not this experience of "I do something and all the rest is taken care of for me by the machines." It's this thing that we have as an assistant, but we still have to be in control and we still have to figure out what to do with the suggestions.
Mike: I think I've described it to people in some ways as kind of autocomplete on steroids. It's like a super-powerful autocomplete with lots of context details available to it that very often does the right thing. It will write a block of code, 5 to 10 lines, that is roughly the code I was thinking about writing, and the super-powered autocomplete has done that for me.
It's very accessible. You can just switch it on one of these tools and suddenly you're getting better suggestions in the IDE in this-- and they look very similar to single line autocomplete that you might have seen before in an IDE. It's very accessible, very familiar to developers.
The other thing that I think is worth pointing out is because it's in the IDE, you can stay in your development flow. If I'm trying to figure out a particular algorithm that probably there's a half-right answer somewhere on Stack Overflow for that or somewhere on a search engine, I have to flip out of my IDE and go to that whereas with one of these tools, I can describe what I'm trying to do either in a comment or a method name like Birgitta said and then the autogenerated code will very often have that algorithm in it that I was searching for and help me get that within the IDE without breaking my flow.
Not just that, but it does it in a very context-specific way and adapts the textbook algorithm to my current code structure and variable names and all that kind of thing. Again, that's actually making it more powerful than going and looking up a dry academic example of something and then needing to figure out how it's going to work in my code base.
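
As an illustration of the workflow Mike describes, here is a hypothetical sketch in Java: the developer writes a comment and a method signature, and the body is the kind of suggestion an assistant might produce, adapted to the local types and names. The class, record, and method names are invented for this example, not taken from any real tool's output.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OrderReport {

    record Order(String customerEmail, double totalAmount) {}

    // Prompt written by the developer:
    // return the emails of customers that appear on more than one order
    static List<String> repeatCustomerEmails(List<Order> orders) {
        // Everything in this body is the kind of suggestion an assistant might make,
        // adapted to the local Order record and its customerEmail accessor.
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new HashSet<>();
        for (Order order : orders) {
            if (!seen.add(order.customerEmail())) {
                repeated.add(order.customerEmail());
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("a@example.com", 10.0),
                new Order("b@example.com", 20.0),
                new Order("a@example.com", 5.0));
        System.out.println(repeatCustomerEmails(orders)); // [a@example.com]
    }
}
```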
Birgitta: Yes. I think that example also shows the other thing that is, of course, totally different from traditional code generators. That was a limited analogy because, of course, this is about a lot more than code generation. It's also about information discovery and lots of other things that I think we'll get to talk about in the next half hour or so.
Neal: I have an analogy and I want to run this by you. I want to run this by a couple of experts because I'm trotting out this analogy as to what impact this is going to have on developers and this is very closely related to what Mike was talking about.
In the 1970s, accountants spent most of their day recalculating paper spreadsheets by hand with calculators or adding machines. Spreadsheets, of course, completely eliminated that and made them way more productive, and that had two instant effects. One is you could build way more complex spreadsheets and accountants became a lot more productive. That didn't necessarily make them better accountants. They just got rid of some of the busy work that they had as accountants. The other side effect is that by 1980, if you were an accountant who didn't know how to use a spreadsheet, you had a hard time getting a job.
My analogy for that is: I'm a developer and I know SQL more or less, but now I'm trying to put together this complex inner-joined thing. I try one and there's an error message, so I Google the error message, and I get it to change, hey, progress. I fiddle around with that for 45 minutes versus handing that off to a GenAI and saying, "Generate this SQL for me," and just executing it and getting on with my work. That, I think, is the productivity boost. It's not going to make you a different programmer, it's just going to eliminate a lot of the little busy work like that for you along the way. What do you think of that analogy? Is that accurate or is that too far?
Mike: [laughs] Well, I think it depends on what timescale you're thinking about because I think it's accurate. I think it may not be far enough. Then, there's all sorts of subtleties to that. The SQL query thing's interesting because you can really cause your DBA some pain if you mess up a SQL query or you try to run something that isn't using an index or whatever.
There's some important stuff to think about beyond just did I get the right answer to the business outcome that I was trying to get, which I'm translating into a SQL query and it seems to run and I seem to get the right answer from that. There's an entire question about how you incorporate that into your technique of building software. Like, how are we testing our SQL queries is a really good question that comes up here immediately because how do I know it's right?
I'm going to show my Microsoft-isms here — do I just hit F5 and run the thing a few times and see whether it seems to be working or do I have a more robust mechanism for ensuring that the stuff I've written is correct? Obviously, we would advocate for automated unit testing and stuff like that so that we know that that stuff is right and we've got all those safety nets, but I think-- That's a question and then--
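
A minimal sketch of the kind of safety net Mike is pointing at: a generated query wrapped in an automated test rather than "hit F5 a few times." This assumes JUnit 5 and the H2 in-memory database are on the classpath; the schema and the "generated" SQL are invented for illustration.

```java
import org.junit.jupiter.api.Test;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import static org.junit.jupiter.api.Assertions.assertEquals;

class GeneratedQueryTest {

    // Imagine this string came from a coding assistant, prompted with
    // "total order value per customer, highest first".
    private static final String GENERATED_SQL =
            "SELECT c.email, SUM(o.amount) AS total " +
            "FROM customers c JOIN orders o ON o.customer_id = c.id " +
            "GROUP BY c.email ORDER BY total DESC";

    @Test
    void sumsOrdersPerCustomer() throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:test");
             Statement st = conn.createStatement()) {
            // Tiny known data set so the expected result is obvious.
            st.execute("CREATE TABLE customers(id INT PRIMARY KEY, email VARCHAR(100))");
            st.execute("CREATE TABLE orders(id INT PRIMARY KEY, customer_id INT, amount DECIMAL(10,2))");
            st.execute("INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'b@example.com')");
            st.execute("INSERT INTO orders VALUES (1, 1, 10.00), (2, 1, 5.00), (3, 2, 7.00)");

            ResultSet rs = st.executeQuery(GENERATED_SQL);
            rs.next();
            assertEquals("a@example.com", rs.getString("email"));
            assertEquals(15.00, rs.getBigDecimal("total").doubleValue(), 0.001);
        }
    }
}
```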
Neal: Let me interject just a second. The same way as an accountant, when you put a formula in a field, you don't just trust that that's exact-- you go back and verify and test. There's a little bit of an analog there, but I agree that the analogy is much shallower than the capabilities we have in front of us, and the things we can do now are much more varied. Sorry, I interrupted you halfway through your thought, so I [crosstalk]...
Mike: No, no — Birgitta’s got her thoughts on it, I want to hear her thoughts more than mine! [Laughter]
Birgitta: Yes! I struggle with the analogy a little bit I think because this accountant spreadsheet thing is exactly in that orthodoxy I was describing before of raising the abstraction level of somebody's work with software. You have then a repeatable deterministic type of thing. Whereas I think this is opening up this more messy type of assistance that we get where we still have to be in the driver's seat and we have to understand the formulas that are being generated if they actually work or not.
So I think, yeah, maybe I see where you're going with it, but because it's also a software analogy, it's like I struggle with it because it's the same thing, this, like, "Oh, we're just raising the abstraction level and have this machine fill in the details for us." I think this is a different kind of-- What's the word? Yes. It's almost like a little bit of a paradigm shift for us to think about how to use a piece of software because it's actually--
It's badly behaved software, in a way. It's not really behaving like other software that we're using, so that's why, as users, we also have to approach it differently, and it doesn't always work. That's going to be interesting, how we have to change our mindset about how to approach software to help us with tasks.
Neal: Let me ask you a slight follow-up question, because you've talked about abstraction, and I think raising the abstraction is a good way to think about it, but in a different way. One of the common pieces of advice we've always had is to understand one abstraction below the abstraction you're working in, but here we can't understand a lot of it. Even the experts are not exactly sure how some of these things are producing what they produce. That's a difference in this abstraction boost: it's a lot more opaque in some ways.
Birgitta: That's another dimension of that messiness. Yes [chuckles].
Mike: It is opaque, but the stuff that it produces should not be opaque because the SQL or the JavaScript or whatever code that you are generating today, at least, we as developers need to understand that resulting code because that is our responsibility. It's our responsibility to ensure correctness, lack of security holes, performance characteristics, all that other stuff. The code that gets spit out is-- What's the opposite of opaque? It's transparent to us, I guess. Transparent. We can understand that because that's part of our job to understand how this stuff works.
The fact that the AI is opaque, I think, is quite interesting because what that leads to is a non-obvious result: better-structured and better-documented code bases produce better results when you use a GenAI coding assistant with them, because the AI has more things to hook into to understand the structure of your code and your solution.
That's interesting. There are probably things to be learned about how you prompt this stuff, whether you write a comment, and then, get it to auto-complete for you, or whether just a method name, camel case method name, or however you're doing it is enough to suggest it or whether just moving your cursor to the right line in the file and then saying, "Please, suggest something now," because you can do that as well. That might be a different prompting style.
Prem: It looks like this is quite rich in terms of the conversation that we are having. I do want us to move on and help our listeners visualize how these things manifest themselves. You're talking about AI-assisted coding but what are the most popular ways in which you interact with these tools?
Birgitta: Yes. I think it's a really interesting space right now because it's very fast-moving. I have a virtual whiteboard somewhere where I keep dumping the new tools that get mentioned to me, [chuckles] and the list of coding assistants for your IDE is just growing and growing. It's just very hard to keep up. Also, the new features that are coming up.
Obviously, I think one that is very popular and very well-known is GitHub Copilot, and we're also using that on a lot of our accounts. Then, there's Tabnine, Codeium, Cursor. There's Codeium with an E, Codium without an E, Codey with an E, Cody without an E. Also, very creative naming apparently [chuckles] in the products as they're all coming up very fast. They have to come up with names fast.
Yes. It's an interesting time, and I think for developer IDE experience as well, because we've seen that in the past when the JetBrains IDEs came up. That was a huge boost in productivity compared to the other IDEs that were around. Most of these coding assistants at the moment have two core features. One is this inline assistance while you're typing; it's doing this autocomplete on steroids that Mike was talking about.
Then, most of them also have a chat component where you basically have a large language model chatbot in your IDE and can ask it questions. Often, those chatbots then also have context of-- part of your code base. For example, the open files that you have.
Now, a lot of the other features that are coming up is-- Very common is something like, "Ask your code base." You can actually ask the chatbot something about your code base, like, "Where did I implement X, Y, Z," [chuckles] or stuff like that. The tool vendors are trying to do some indexing and turning your code base into something searchable that can be used to enhance the question to the large language model in the background. That's what a lot of them are doing.
Then, there's especially this IDE called Cursor, which is still quite new, actually. It's early days, so I think maybe [chuckles] it's not ready for productive use yet, but the ideas that they have for the user experience are really interesting. Also, for the prompting in the inline assistance, you can use a little chat component. Then, they also have abilities, when you ask a question about your code base, to more specifically point at the parts of the code base that you want answers on, or you can say, "I want you to also consider the documentation of the following library that I'm using," and then it indexes that as well. Or it has some interesting prompting approaches to help you debug something.
I've actually used this auto-debug feature with it once, and it goes into a chain-of-thought prompting loop where the model is trying to "reason" about what's going on in your error message and then says, "Ah, okay. There seems to be something wrong in your pom.xml file, so let me look for that place in your pom.xml. I couldn't find it, so I'm going to look into the pom.xml in the other submodules." It goes through that.
The one time I did that, it didn't actually work really well in terms [chuckles] of actually finding the right things and all of that, and it went into an endless loop at the end that I had to cancel, but I do think that this is a valid approach to let a large language model help me debug something. There are some really interesting ideas that are coming up there. Yes. It's interesting days. [Chuckles]
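
The "ask your code base" feature Birgitta describes is essentially retrieval before prompting: index chunks of the code base, pull out the most relevant ones for a question, and prepend them to what gets sent to the model. The sketch below is deliberately simplified, with naive word-overlap scoring standing in for the real embeddings and code-aware indexing the vendors use; all names and snippets are invented.

```java
import java.util.*;

public class AskYourCodebase {

    record Chunk(String path, String text) {}

    // Toy relevance score: how many words the question shares with the chunk.
    // Real tools would use embeddings, symbol indexes, or ASTs here.
    static double overlapScore(String question, Chunk chunk) {
        Set<String> q = new HashSet<>(Arrays.asList(question.toLowerCase().split("\\W+")));
        Set<String> c = new HashSet<>(Arrays.asList(chunk.text().toLowerCase().split("\\W+")));
        c.retainAll(q);
        return c.size();
    }

    // Pick the top-k most relevant chunks and build an enhanced prompt for the model.
    static String buildPrompt(String question, List<Chunk> index, int topK) {
        List<Chunk> ranked = new ArrayList<>(index);
        ranked.sort(Comparator.comparingDouble((Chunk ch) -> overlapScore(question, ch)).reversed());

        StringBuilder prompt = new StringBuilder("Answer using only the code snippets below.\n\n");
        for (Chunk ch : ranked.subList(0, Math.min(topK, ranked.size()))) {
            prompt.append("// ").append(ch.path()).append("\n").append(ch.text()).append("\n\n");
        }
        return prompt.append("Question: ").append(question).toString();
    }

    public static void main(String[] args) {
        List<Chunk> index = List.of(
                new Chunk("PaymentService.java", "void refund(Order order) { /* refund logic */ }"),
                new Chunk("InvoiceMailer.java", "void send(Invoice invoice) { /* email logic */ }"));
        System.out.println(buildPrompt("Where did I implement the refund logic?", index, 1));
    }
}
```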
Prem: Yes. It looks like you're saying it can generate code. It can also help you navigate code. It can help you find some things that you don't know about and things of that sort. Mike, I think you are trying to say something or maybe there are other things as well.
Mike: Well, I just actually wanted to add something about using cloud-based services versus local deployment. Most of these services are software-as-a-service type things where you pay for a subscription and when you are getting help or prompting the model, you're actually sharing some of your code base and some of your workflow data with the provider of that tool.
Now, most of us are fairly comfortable with that because lots of organizations are using GitHub Enterprise anyway, and so they trust, say, GitHub with their source code, so by implication, they must trust that organization with using a copilot-style tool.
There are companies out there who are much more averse to using cloud services and products. We actually helped a large company do an on-premise deployment using an open-source large language model for code generation and a fairly lightweight IDE plugin for VS Code. I just wanted to point out that there are options for doing this stuff that don't rely on sending your data and your code to a third party, if that's something that's going to be difficult for your organization.
Prem: Yes. I've tried a few of these as well. I've usually used them to generate small snippets of code. They seem to work relatively okay, so I say, "Okay. Generate me a stack," or, "Generate me a push method on a stack." Those very simple things seem to work. Is it possible for these tools to be used in an even more holistic way, where I want to generate an entire application, for example? Have you got any experience or thoughts on that?
Mike: I think that's one of those things that's going to evolve. Right now, I would call them good for method-level code generation, so 5, 10, 15, 20 lines of code maybe with good context, I think you can get some good outputs from that. We are starting to see small apps being built from prompts and there's various-- you can see various different systems online and videos of people doing this stuff.
I'm a little bit skeptical still. I feel like maybe we are being shown cherry-picked successes the 1 time in 10 that it worked and produced something good for us. I think we all know if you look at the trajectory of these AI systems and of technology in general, everything gets better. Something that half works today is going to work really great in six months' time.
I do think we should be thinking about what happens when the unit of code that can be produced starts to get bigger. Maybe the question to ask is if you can create a unit of code, I don't know, 500 lines big across several classes, or files, and the AI can do that in a consistent way and keep it coherent, what would you want to use that for? Does it have an impact on the kinds of things that we choose to do architecturally because AI can produce something of a certain size?
Birgitta: More promptable architecture.
Neal: See, I think architecture is the place where it's going to take the longest because it's all about doing this trade-off analysis in the current context, but where I can see whole applications being generated or things like simple CRUD applications or designing things that are really busy work like print preview dialogues and stuff that nobody wants to spend the time designing, that kind of stuff. I think it's the kind of busy work that frees up the humans to do the stuff that only a human can really do at this point. I think the scope will get bigger and bigger over time, but I'm skeptical that it'll ever be able to create really sophisticated software.
Birgitta: Again, I think something to think about is the combination of large language models with other things. It always comes back to the abstraction levels. I was talking about low-code no-code in the beginning. Those are currently the most impressive demos of using GenAI to create applications, because low-code no-code already has this huge platform under the covers that does a lot of this stuff.
Then, when you prompt it to create a low-code no-code application, you also have the constraints that you would have when you build a low-code no-code application. It needs to be a specific use case, straightforward use case, a preview dialogue, you were just saying, [chuckles] or something like that. Still, this combination I find really interesting, not to think about, "Oh, but large language models have these limitations," but what if you combine them with other things?
For example, one of the coding assistants out there right now is by a company called Sourcegraph and they have an existing product that does code search. That is a product that really unders-- where the product already understands the structure of the code. It understands the abstract syntax tree and all of that. Now, they are combining that with the large language model-based coding assistant.
Usually, the language models, they don't really understand the code. They just see the tokens, the patterns. Now, combining that with a code search that actually understands the structure then gives it an extra power. It's just one example but I think that will be interesting to see how-- not just take the model, but how do you combine it with other things, and then, make the overall result better.
Mike: I think the low-code no-code is a really interesting example because the demo is impressive because you just give it a spec in English for an application and it spins the thing up. The question is always about what is that low-code no-code platform capable of doing and what trade-offs are we making by using it? We've talked about that stuff in the past. I won't belabor the point, but usually, if a low-code platform has picked an option for you in, for example, the way the UI works or the way stuff is stored in the database, you just have to go with that. You just have to go with whatever the platform has chosen.
Now, you've also got [chuckles] the added complication that potentially people who don't really know how to use the low-code platform are specifying apps in English or other languages and then getting an AI to generate them. That person now potentially has absolutely no knowledge of what that low-code platform really does under the covers. One of the things we've always advocated for or pointed out is that somebody in the organization needs to understand what the low-code platform is doing or what the abstraction is doing, just as somebody on your dev team probably needs to know what the heck Spring is doing under the covers if you're using that as an abstraction.
It's not like this is a new problem, but the alluring capability of, "Oh, we can just have business users specifying departmental applications using English, and then, a generative AI system creates those." I don't think we're getting away from the problem of needing to really understand what that system is doing, having at least one person in the organization who knows what's there.
Then, the other question would be, if you want to change something about that, what do you do? Do you change the English that you use to specify the system and rerun the AI? If you keep the English constant and you run the AI twice, do you get the same application at the end, because these things are non-deterministic? I don't know; maybe if you turn the temperature down or give it a seed number, you get the same output. There are all sorts of actually important questions there if you really wanted to do this stuff for real rather than just having a flashy demo that makes everybody go, "Ooh, yes."
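
On the determinism question Mike raises, most model APIs expose a temperature setting and some accept a seed. Here is a minimal sketch, assuming an OpenAI-style chat completions endpoint; the URL, model name, and field names follow that API and may differ for other providers, and even with these settings providers generally describe the output as only mostly repeatable.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RepeatableGeneration {
    public static void main(String[] args) throws Exception {
        // Temperature 0 and a fixed seed push the model toward repeatable output.
        String body = """
                {
                  "model": "gpt-4o-mini",
                  "temperature": 0,
                  "seed": 42,
                  "messages": [
                    {"role": "user",
                     "content": "Generate a small CRUD app spec for a book lending library."}
                  ]
                }""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Authorization", "Bearer " + System.getenv("OPENAI_API_KEY"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```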
Neal: Well, and to use Birgitta's language about this, I think the reason it works well for low-code environments was because it's a very limited abstraction, but in some ways, the C programming language is an abstraction, too. It's just a lot less limited conceivably.
I think there's an interesting thing here about the number of possible variables going up and how fast it can cope with that. C, you can pretty much-- it's a low-level assembly language, versus some very constrained DSL. The abstraction level between those things is really high. I'm curious as to how fast they can ascend or descend that abstraction stack.
Prem: Here is a thing that I've been wondering. This whole point of abstraction that all of us have raised. Now, here's the reality of it. Even when I'm working in a more conventional environment, let's say using the Java programming language or maybe the Spring framework, there are portions of it that I may not intimately understand. That's okay because it helps me get the job done in arguably a smaller amount of time than when I'm not using it.
Wouldn't that same thing apply here? The place where I'm taking this is, okay, I've got a certain level of things that I need to do. For example, it needs to be syntactically correct. Okay. I got that. It needs to be correct from the perspective of meeting the requirements that I have. Those requirements might be, okay, it works. It seems to work when I've got three rows in the database, or will it work when I've got 3 million rows in the database?
Okay. Now, does it move us to a point where I really need to get good at expressing those test scenarios as opposed to actually coding the thing? Does it shift the balance of power? Now, if I'm able to express all of those acceptance criteria really, really well, then do I really care? I don't know how assembly language works. I really don't. I don't care. Does it move us to that level or is it too early to say?
Birgitta: One difference there, I think, would be maybe not take the assembly example, the Spring example. When I write code and I use Spring annotations, I am basically putting myself into the hands of the creators of the Spring framework. They built something that deterministically always works the same way and that they thought about and that works.
Of course, sometimes, I need to understand in detail what the annotation does, and sometimes I don't. If I get suggestions from a large language model that are based on all of the code that is out there on the internet, that sometimes works [chuckles] and sometimes doesn't, that's like a different quality.
I think that's the messiness where-- I mean, in a way you could say if you have a good quality assurance approach, if you're really confident in your tests and you let this get generated, then maybe it's a little bit more similar. It still needs to be extensible code, because as long as we humans still have to extend the code, it also needs to be somehow readable and all of that. Yes. Again, it doesn't quite work, because the traditional abstraction layers are really well-defined, deterministic, and all of that. Here, there's all of this messiness in this space. It's always slightly different.
Mike: Well, what you just said there was really interesting, Birgitta, because it still needs to be read and evolved by humans. Actually, that's one of the most remarkable things about GenAI creating code today: that code, we all know this, is actually for communicating amongst programmers and for me communicating to myself in the future because I've forgotten what the heck the thing was supposed to do.
The fact that AI can generate human-readable and human-evolvable code is remarkable and very useful to us. I do wonder if we move to a mode of specifying functionality and ascertaining correctness through good acceptance testing, whether that still needs to be true because if you have a block of impenetrable AI-generated code that is super optimized, but no human could read it, but you have all these tests around the outside to guarantee that it is doing what you want it to do, does it matter that it's not human readable anymore?
Prem: Yes. That's exactly what I was thinking. If I've got this suite of fitness functions and it basically says, okay, this is what it does, I don't really care. I don't want to see what lies underneath, because it seems to obey all of these fitness functions that I've written against it. Great. That's all that matters. Then, now, if there is a bug, I say, "Here is an additional fitness function that you need to adhere to." Then, as long as it does that, I'm like, "Yes. Bring it on. Write the most complex code. Write the ugliest, most unreadable code." I don't care. It's obeying all of my acceptance criteria.
Birgitta: Then, the key thing is you wrote those fitness functions. There's a nice blog post by Michael Feathers about whether you should generate your tests and your code, [chuckles] because if you also use these tools to generate your tests-- I wouldn't say you should never use them to generate your tests. I think that's a bit too extreme. As always, it depends, but I think you definitely have to be a lot more diligent when you generate your tests, because you at least want to be sure that you're having good test coverage, good test scenarios, have covered all of your bases. Maybe you do things like-- What's it called? You know, the testing where you generate all of those different types of--
Mike: Fuzz testing or--?
Neal: Mutation testing.
Birgitta: Mutation testing. [Chuckles] Thank you. [Laughs] Yes. Maybe something like that, property-based testing, all of those things. Yes. Maybe new ways to think about our testing approach, to maybe invest even more in testing. Then, again, we have to see how it balances out: do we actually get a net positive at the end when we invest even more in testing? [chuckles]
Neal: Well, this is a slight digression, but the Clojure community has a fascinating way of testing stuff. They do statistically-based testing. Rather than write unit tests, they write this test suite and have the system generate all these possible outcomes to see if it statistically is producing the right stuff.
I can definitely see that approach being used against something like-- something that's been generated that is maybe a little bit opaque but still does the stuff we want it to do, and we don't really care how it's getting the job done as long as it's correct, but then, using some statistics and things like that to determine that it is, in fact, doing the right thing.
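
In the spirit of the generative testing Neal mentions, here is a hand-rolled, property-style test in plain JUnit 5: instead of reading the generated implementation, it checks properties (ordering, same elements) over many random inputs. Libraries like jqwik or Clojure's test.check do this far more thoroughly; the sorted-list example and all names are invented for illustration.

```java
import org.junit.jupiter.api.Test;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

class GeneratedSortPropertyTest {

    // Stand-in for an AI-generated implementation that we treat as a black box.
    static List<Integer> generatedSort(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    @Test
    void outputIsOrderedAndKeepsAllElements() {
        Random random = new Random(1234); // fixed seed so failures are reproducible
        for (int run = 0; run < 1000; run++) {
            List<Integer> input = new ArrayList<>();
            int size = random.nextInt(50);
            for (int i = 0; i < size; i++) {
                input.add(random.nextInt(200) - 100);
            }

            List<Integer> output = generatedSort(input);

            // Property 1: every element is <= the next one.
            for (int i = 1; i < output.size(); i++) {
                assertTrue(output.get(i - 1) <= output.get(i), "output must be ordered");
            }
            // Property 2: same elements as the input, compared against a trusted reference.
            assertEquals(trustedReferenceSort(input), output, "output must keep the input elements");
        }
    }

    private static List<Integer> trustedReferenceSort(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }
}
```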
Prem: That leads us to the next topic, which is very popular, at least among Thoughtworkers, in terms of how we approach problems: test-driven development. A lot of us like doing that. How does this thing affect that flow where I'm writing a test and then writing a bit of production code, seeing that pass, and then keep doing this again and again until I can't think of anything more?
Birgitta: One of the things that you'll also notice when you use these coding assistants is that they often suggest quite a few lines of code to you. Sometimes, it's also frustrating because they only go line by line, but sometimes, they give you 15 lines at once. Usually, when we do TDD, really in a ping-pong style, we usually go step by step. What's the minimum, smallest next thing that we want to work, that we want the test to be green?
Now, there's this partner that we work with who just doesn't understand this, [chuckles] but without going too much into detail in the podcast conversation, we have a bunch of memos out on Martin Fowler's website about this. If you click on the GenAI tag on the website, you'll find them. Our colleague Paul Sobocinski did a little write-up about their experience with doing TDD in this ping-pong style, sometimes also going, okay, I adjust my test and then I actually delete my full implementation and have it regenerate, because every time I put a new assertion in, that's like a new part of the prompt.
I just delete my whole function, and then the coding assistant will pick up my test in the other file and will potentially give me a fuller implementation of what I did. He covers little things like that, going through the red-green-refactor cycle and how a coding assistant can be helpful or sometimes not helpful in that. [chuckles]
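
A small sketch of the ping-pong flow Birgitta describes, with invented names: the test is written or adjusted first, the implementation body is deleted, and the assistant regenerates it using the test in the open file as context, so each new assertion effectively extends the prompt.

```java
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

class DiscountTest {

    // Each new assertion added here extends the "prompt" the assistant sees.
    @Test
    void tenPercentOffOrdersOverOneHundred() {
        assertEquals(108.0, Discount.applyDiscount(120.0), 0.001);
        assertEquals(50.0, Discount.applyDiscount(50.0), 0.001);
    }
}

class Discount {
    static double applyDiscount(double orderTotal) {
        // The developer deletes the previous body here; the lines below are the kind
        // of implementation an assistant might regenerate from the test above.
        if (orderTotal > 100.0) {
            return orderTotal * 0.9;
        }
        return orderTotal;
    }
}
```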
Neal: It sounds like you've tiptoed right up to the inevitable question that people ask Thoughtworkers about GenAI. Is it a pair programmer? [chuckles]
Birgitta: Yes. I think with GitHub Copilot, even the tagline is "your AI pair programmer," which annoys me a little bit, [chuckles] I have to admit--
Neal: More than a little bit for a lot of-- [chuckles]
Birgitta: The product's really great, but that part annoys me. Yes. It's like a little frustrating to us, I think, because we're big proponents of pair programming and it's very often misunderstood as it's this-- when there's two people, this one person will know the syntax when the other person doesn't. I'm exaggerating but it's about when-- filling knowledge gaps or something, but it's actually--
Of course, a tool like GitHub Copilot or the others can help with some of that stuff, sometimes even better than a human maybe. Actually, the point of the pair programming practice is to make the team better, not just the individual coder. It's about having the context of what's going on: maybe this person was in the story kickoff discussion and that person wasn't, or one person is out sick tomorrow and the other person has the context, or it's also about--
It's often one of the only spaces left in remote working where we actually informally collaborate [chuckles] with each other and code and where we can have bonding and stuff like that. There are so many things that the robot cannot help us with, so it's a bit frustrating that equating those two things is doing a disservice to what the practice actually does.
Neal: It's inevitable, though, don't you think? [chuckles] That they're all going to pitch themselves this way? [chuckles]
Mike: Well, I think it depends what you are trying to get out of it. If you are clear about what you're trying to use the tool for, and the things that the tool cannot do, all of the stuff that Birgitta just talked about, the tool can be useful. There's all this discussion right now about whether AI is actually intelligent or whether it's just pattern matching, spitting stuff out, and all this kind of stuff.
Some part of me doesn't care because it's useful. I've got this useful tool and it's providing utility to me and, no, you can't have my license back. Sorry. I want that. I'm going to keep using it. Thanks very much. Some of the stuff that these tools could do is a super useful contribution to the development process like looking over your code and the changes that you've just made and determining does that have any security flaws to it, asking questions like that about the code that you've just done or reminding you, "Hey, did you run a performance test on this? It looks like you're accessing--" I don't know, it looks like you've got an n + 1 select loop going on there. Are you sure that that's what you intended to do?
Those kinds of things are useful and would otherwise require time from more senior teammates to look over your code to remind you to do that kind of stuff. I'm all for using these things when it frees us up to think about things that are more in the human domain of thinking about, like, is my architecture going in the direction that I want it to? How do these system components relate to each other? Do I really understand the requirement that's being asked of me? Is this feature conflicting with another feature that's coming down the road that we did last week? Those are things that are much harder for an AI to help you with, but the nuts and bolts of the task of coding, it seems is getting more and more accessible to AI.
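
The "n + 1 select" Mike mentions is the classic case where code issues one query for a list and then one more query per item. Below is a compact illustration using the H2 in-memory database, with invented tables, alongside the single-query version a reviewer, human or AI, might suggest instead.

```java
import java.sql.*;
import java.util.ArrayList;
import java.util.List;

public class NPlusOneDemo {

    // n + 1: one query for the customers, then one query per customer for their orders.
    static int countOrdersNPlusOne(Connection conn) throws SQLException {
        int total = 0;
        List<Integer> customerIds = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT id FROM customers")) {
            while (rs.next()) customerIds.add(rs.getInt("id"));
        }
        for (int id : customerIds) {
            try (PreparedStatement ps =
                         conn.prepareStatement("SELECT COUNT(*) FROM orders WHERE customer_id = ?")) {
                ps.setInt(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    total += rs.getInt(1);
                }
            }
        }
        return total;
    }

    // Single round trip: let the database do the join and aggregation.
    static int countOrdersJoined(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT COUNT(*) FROM orders o JOIN customers c ON o.customer_id = c.id")) {
            rs.next();
            return rs.getInt(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE customers(id INT PRIMARY KEY)");
            st.execute("CREATE TABLE orders(id INT PRIMARY KEY, customer_id INT)");
            st.execute("INSERT INTO customers VALUES (1), (2)");
            st.execute("INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2)");
            System.out.println(countOrdersNPlusOne(conn)); // 3 orders, found with 3 queries
            System.out.println(countOrdersJoined(conn));   // 3 orders, found with 1 query
        }
    }
}
```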
Prem: Makes sense. Makes sense. Birgitta and Mike, we can keep going on this, but we do have to end this. Any parting thoughts from you folks in terms of what we can expect from this and where we should use it or not use it, or anything else that people who are trying to get started on this should be aware of?
Mike: The bee in my bonnet on this is that everybody should be trying this stuff out. The capabilities of generative AI change literally every week. The only way for you to really understand what this can do and whether it can work for you is to try it and actually try it in a whole bunch of different ways.
I would always advocate for responsible experimentation. Don't go to work and install one of these tools without your employer's [chuckles] permission without them understanding that you're doing that because that leads to very bad outcomes, but if, say, you have an open-source project that you are working on or some personal code or something like that, most of these tools have a free trial available. You can get started with them and start understanding what they do.
My very strong recommendation to anyone would be to just try this stuff. It's not perfect. It's not going to replace you as a programmer. Don't worry about that, but like the accountants and the spreadsheets, you, as a programmer, should understand how you can use this tool in your craft. There are lots of people, I get the impression, making excuses not to try it. I guarantee you, once you try this stuff, your eyes will be opened, and I think you will like incorporating this tool.
Prem: Birgitta?
Birgitta: Yes. First of all, plus one to that, definitely. As I was mentioning before, this is different from other software that we've used. You can't just go to a training where somebody tells you, "These are the rules of the tool. This is how it works," and then you apply those rules. There are lots of things where you can't really explain why something is happening. [chuckles] You have to get a feeling for it. Also, for things like, oh, sometimes going in small steps helps. Yes, you have to try it out. I also feel, even though I'm in a full-time role right now looking into these things, that I have to train myself to remember to try things out, because I'm just not used to it.
I think we all-- if this is really going to be part of our lives in the future, I think we have to cognitively change a little bit how we think about things. I recently read an article about this where the author mentioned how we all collectively cognitively changed when internet search came around: from training ourselves to remember facts to training ourselves to know how to find the facts. She was saying that she expects something similar to happen with this. We just don't know yet how it's going to go.
Then, the other thing is what I mentioned before, to think about the combinations, to not just think about, "Oh, these are the limitations of the models," but to think creatively about how you can combine it with other things.
Neal: Great. That's a great way to wrap up. Thank you, Birgitta and Mike, for helping us sip from the firehose of information about AI. This is not the last podcast we're going to do about AI in general and AI-assisted software development, but we wanted to dip our toe into the ocean of information before it became too overwhelming. Thank you so much for joining us and thanks, Prem, for hosting with me today.
Prem: Thank you, folks, and we will see you next time.
[Music]