As developers transition from manual coding to AI-assisted coding, an increasing share of code is now being generated by AI. This shift has significantly boosted productivity and efficiency, but it raises important questions: how does AI-assisted development impact code quality? How can we ensure that AI-generated code maintains high quality, adheres to good style, and follows best practices? These questions have been on my mind recently, and they are the topic of this blog post.
The development flow (before GenAI)
Before the rise of Generative AI (GenAI), the typical software development flow looked roughly like this:
- Write code in an IDE with help from IntelliSense and research from documentation.
- Fix code by writing some tests and perhaps running a static code analysis tool to uncover bugs and style issues (see the sketch after this list).
- Push code to a feature branch or send a pull request.
- Review code with peer reviews from team members and possibly run more extensive static code analysis checks on the branch to further refine the code.
- Merge code to the main branch after finalizing the code & tests.
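To make the testing and static analysis step concrete, here is a minimal sketch (the function and its bug are hypothetical examples, not taken from any real code base). A simple unit test catches an off-by-one bug that is easy to miss when just reading the code:

```python
# Hypothetical example: an off-by-one bug that reads plausibly but is wrong.
def sum_first_n(numbers: list[int], n: int) -> int:
    """Return the sum of the first n elements of numbers."""
    total = 0
    for i in range(n - 1):  # Bug: stops one element early; should be range(n)
        total += numbers[i]
    return total


# A pytest-style test that exposes the bug: we expect 6 but get 3.
def test_sum_first_n():
    assert sum_first_n([1, 2, 3, 4], 3) == 6
```

Note that a static analyzer would not flag this particular bug since the code is syntactically valid, which is exactly why the flow combines tests and static analysis rather than relying on either alone.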
In this flow, we had multiple levels of checks to ensure code quality. First and foremost, developers, as the sole authors of the code, used their own judgement. Additionally, tests, static code analysis tools, and peer reviews kept us honest.
AI-assisted development
GenAI fundamentally changed the way we code. Back in the day, we used to sort through pages and pages of documentation, trying to figure out which library to use, which method to call, and how. Now, we start writing some code and use AI to fill in the rest. These tools aren’t perfect, but they’re good enough to significantly speed up coding, and they’re only getting better.
The AI coding assistants landscape is diverse, with tools offering various capabilities. Bilgin Ibryam’s AI Coding Assistants Landscape post is a good overview of the current landscape.
No matter what AI development tool you use, one thing is certain: developers are writing less and less ⬇️ code and AI is generating more and more ⬆️ code. The actual percentage of code written by AI varies from company to company, but to give you an idea, according to Sundar Pichai, the CEO of Google, over 25% of Google’s code is now written by AI.
At this point, you might be wondering: if AI is helping us write code and AI-assisted development tools are getting better and better, do we still need tests, static analysis, and peer reviews for AI-generated code? That’s a valid question and my answer is: Absolutely!
First and foremost, you can’t blindly trust AI-generated code, no matter how good it might be. Secondly, you still need to look out for the same potential problems as before, whether the code is human- or AI-generated. Thirdly, AI-generated code has its own unique challenges. Let’s dive into these in more detail.
What kind of code issues did we encounter before GenAI?
Let’s recap what kind of code issues we looked out for before GenAI:
- Clean Code: We aimed for code that is easy to read, understand, maintain, and extend.
- Code Smells: Potential problems in code that, while not necessarily bugs, may signal a need for refactoring.
- Security Vulnerabilities: Weaknesses in code that attackers could exploit (see the sketch after this list).
- Performance Issues: Code that is correct but inefficient.
- Technical Debt: An easy solution taken now that incurs a debt to be paid later through refactoring or redesign.
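To make one of these categories concrete, here is a minimal sketch of a classic security vulnerability (the table and function names are hypothetical). Building SQL queries with string formatting allows SQL injection; parameterized queries are the standard fix that both reviewers and static analyzers look for:

```python
import sqlite3


# Vulnerable: a username like "x' OR '1'='1" changes the query's meaning.
def get_user_unsafe(conn: sqlite3.Connection, username: str):
    query = f"SELECT * FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()


# Safe: the database driver handles escaping via a parameterized query.
def get_user_safe(conn: sqlite3.Connection, username: str):
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (username,)
    ).fetchall()
```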
The reality is that all these problems persist, regardless of whether the code is generated by a developer or AI. Therefore, we can still leverage our existing processes, such as peer reviews and static code analysis, to address these issues.
GenAI specific code issues
In addition to the regular code issues, AI-generated code poses some AI-specific issues, such as:
- Bad Code Structure and Style: AI tools often lack full code context awareness and, as a result, generate code that doesn’t adhere to the structure or style of the existing code base (see the sketch after this list).
- Non-Deterministic Behavior: AI tools are non-deterministic. Given the same input, they can generate different code or test outputs at different times.
- Hidden Biases: AI tools are trained on data that contains biases, so the code they generate can carry subtle biases.
- Over-Reliance: Blindly trusting AI-generated code can lead to not fully understanding it, and over time, this can result in unmaintainable code.
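To illustrate the first point, here is a hypothetical sketch of the kind of style drift AI tools can introduce. Both functions run fine, but the generated one ignores the project’s naming and typing conventions, which is exactly what a style-aware static analyzer or a human reviewer needs to catch:

```python
# Existing code base convention: snake_case names with type hints.
def calculate_order_total(prices: list[float], tax_rate: float) -> float:
    return sum(prices) * (1 + tax_rate)


# Hypothetical AI-generated addition: functionally correct, but camelCase
# and untyped, so it silently breaks the code base's conventions.
def applyDiscount(total, discountPercent):
    return total - total * discountPercent / 100
```

A linter with naming-convention checks (for example, the pep8-naming plugin for flake8) would flag applyDiscount here, turning silent style drift into a visible check.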
The development flow (after GenAI)
Given all the regular and GenAI specific issues you can still have with the code, what can we do in the age of AI-assisted development?
First of all, with most AI code generation tools, the developer types some code, the AI completes the rest, and the developer accepts, rejects, or modifies the AI-generated code to produce the final version. You, the developer, are the first line of defence. You use your own judgement and experience to guide the AI to an acceptable level of code quality.
Secondly, at pre-commit time, you can still rely on IntelliSense and a static code analyzer to watch over the AI coding assistant and catch basic code and style issues.
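As a concrete (hypothetical) example of what that pre-commit safety net catches, consider this AI-completed snippet. It runs, but a static analyzer would flag the unused import and the mutable default argument, a classic Python pitfall:

```python
import os  # Flagged: unused import (flake8 code F401).


# Flagged: mutable default argument (flake8-bugbear code B006). The same
# list object is shared across calls, so tags leak between invocations.
def add_tag(tag: str, tags: list[str] = []) -> list[str]:
    tags.append(tag)
    return tags


# The fix an analyzer or reviewer would push you toward:
def add_tag_fixed(tag: str, tags: list[str] | None = None) -> list[str]:
    tags = [] if tags is None else tags
    tags.append(tag)
    return tags
```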
Thirdly, at post-commit time, you can rely on peer reviews to double-check the code, and ideally, you have a static code analysis tool with more extensive and AI-specific checks.
In this new workflow, the development flow looks like this:
- Write code in an AI-enabled IDE, where some code is written manually and the rest is generated by the AI.
- Accept & refine AI-generated code.
- Fix code by writing some tests and perhaps running some static code analysis.
- Push code to a feature branch or send a pull request.
- Review code with peer reviews from team members and run more extensive, AI-specific static code analysis checks on the branch to further refine the code.
- Merge code to the main branch after finalizing the code & tests.
This workflow incorporates static code analysis to monitor AI-generated code. It also keeps human developers actively involved throughout the coding process, including peer reviews. This oversight helps ensure adherence to coding standards and promotes a clear understanding of the AI-generated code within the team.
Conclusion
In this blog post, I explored how GenAI changed our development process and what this means for code quality. On one hand, code is code, whether it’s written by humans or generated by AI, and the same processes and tools apply to both. On the other hand, AI poses some unique challenges that need to be handled carefully. In a future post, I’ll dive into specific tools that can help ensure AI-driven development produces reliable and maintainable code.