Coding with AI Agents: Between High Throughput and Code Quality

I recently had to present the results of my work formally, so I decided to share the material with a wider audience.

Coding with AI agents can move very fast. The challenge is not code generation itself, but maintaining code quality, consistency and control over the process.

Based on one of my projects — Project Venom — I prepared a practical summary showing that high throughput with AI agents can be combined with code quality control.

Sources and supplementary data

Project Venom repository: https://github.com/mpieniak01/Venom
Research summary: https://research.strefa.org/

The project is developed with AI coding agents. At the time of measurement, its state was as follows:

technology stack: Python 73.1%, TypeScript 24.9%, Shell 1.0%, Makefile 0.5%, JavaScript 0.3%, CSS 0.2%,
coding supported by AI agents: 138,011 LOC,
repository activity: 1,587 commits,
quality: reduction from 1,650 issues in the first SonarQube measurement to 0 issues,
test coverage: 92.2%.

Conclusion: AI agents increase throughput, but code quality is maintained by the process: pre-commit, CI, quality gate and review performed by AI agents and a human.

The data shows three phases of work

The quality and code flow chart shows three phases.

Phase I — intensive coding

In the first phase, AI agents generated a large part of the code. Quality validation mainly covered unit tests. This made it possible to quickly expand the scope of the project.

Phase II — raising the quality bar

In the second phase, quality control was tightened. The first SonarQube measurement reported 1,650 quality issues. This was the moment of transition from fast generation to controlled code cleanup.

In this phase, hard local pre-commit gates were introduced. They included:

static code validation,
checking code coverage with tests,
architecture drift control.

The agent could not complete the task without passing the quality gates. Code generated by AI did not go directly into the repository. It first had to pass a local filter.
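
A local gate of this kind can be sketched as a small runner that executes each check as a command and reports which ones failed. This is a minimal sketch, not the project's actual hook: the `Gate` type, the function names and the placeholder commands are all assumptions made for illustration.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """One local check: a label plus the command that implements it."""
    name: str
    command: list[str]


def run_gates(gates: list[Gate]) -> list[str]:
    """Run every gate and return the names of the ones that failed."""
    failed = []
    for gate in gates:
        # A non-zero exit code is treated as a failed gate.
        result = subprocess.run(gate.command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(gate.name)
    return failed


def main(gates: list[Gate]) -> int:
    """Return 0 only when all gates are green, 1 otherwise."""
    failed = run_gates(gates)
    for name in failed:
        print(f"gate failed: {name}", file=sys.stderr)
    return 1 if failed else 0


# Placeholder commands (assumptions): each one only exits with status 0.
# Swap in the project's real linter, test runner and architecture check.
EXAMPLE_GATES = [
    Gate("static validation", [sys.executable, "-c", "raise SystemExit(0)"]),
    Gate("coverage check", [sys.executable, "-c", "raise SystemExit(0)"]),
    Gate("architecture drift", [sys.executable, "-c", "raise SystemExit(0)"]),
]
```

A hook script could end with `raise SystemExit(main(EXAMPLE_GATES))`, since Git aborts a commit when the pre-commit hook exits with a non-zero status. All gates run even after a failure, so the agent sees every problem in one pass.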

Phase III — stabilization and quality maintenance

In the third phase, the process was already stabilized by pre-commit, CI, SonarQube / SonarCloud and review performed by AI agents and a human. Quality was not checked only at the end; it was maintained during the work.

The final result: 0 issues and 92.2% test coverage, backed by more than 5,000 unit tests. At the same time, the full test run was expected to finish within 10 minutes; this time budget was treated as a condition of a correct test pass.
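
Treating run time as part of a correct test pass can be expressed as a single acceptance check. The sketch below is illustrative only: the 10-minute budget comes from the text, while the 90% coverage threshold, the `SuiteResult` type and the function name are assumptions.

```python
from dataclasses import dataclass

MAX_TEST_SECONDS = 10 * 60  # the 10-minute budget mentioned in the text
MIN_COVERAGE_PCT = 90.0     # assumed threshold, not taken from the project


@dataclass(frozen=True)
class SuiteResult:
    """Outcome of one full test run (hypothetical structure)."""
    passed: bool
    coverage_pct: float
    duration_seconds: float


def evaluate_test_run(run: SuiteResult) -> list[str]:
    """Return the reasons a run is rejected; an empty list means accepted."""
    reasons = []
    if not run.passed:
        reasons.append("tests failed")
    if run.coverage_pct < MIN_COVERAGE_PCT:
        reasons.append(
            f"coverage {run.coverage_pct:.1f}% is below {MIN_COVERAGE_PCT:.1f}%"
        )
    if run.duration_seconds > MAX_TEST_SECONDS:
        reasons.append(
            f"run took {run.duration_seconds:.0f}s, budget is {MAX_TEST_SECONDS}s"
        )
    return reasons
```

With this shape, a green but slow suite is rejected in the same way as a failing one, which keeps the feedback loop short enough for agents to run the tests on every change.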

How the workflow worked

AI agents supported execution: planning the change, implementation, refactoring, generating tests, analyzing errors and updating documentation.

However, they did not take over boundary decisions. The human still defined the goal and scope, accepted the architecture, and decided on merge, deployment and risk.

The key workflow was:

Human defines the goal
   ↓
AI agent helps prepare the plan and implementation
   ↓
Pre-commit checks the change locally
   ↓
GitHub Actions runs CI
   ↓
SonarQube / SonarCloud evaluates quality
   ↓
Pull request goes to review by AI agents and a human
   ↓
Merge
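
The flow above can be sketched as an ordered list of stages in which a change stops at the first stage that has not passed. This is a simplified model under stated assumptions: the stage names follow the diagram, review is split into separate AI and human entries for illustration, and the function names are hypothetical.

```python
from typing import Optional

# Stage names follow the diagram; the order is what matters.
STAGES = ["pre-commit", "CI", "quality gate", "AI review", "human review"]


def first_blocking_stage(results: dict[str, bool]) -> Optional[str]:
    """Return the first stage that has not passed, or None if merge is allowed.

    A stage missing from `results` counts as not passed, so nothing
    is merged by default.
    """
    for stage in STAGES:
        if not results.get(stage, False):
            return stage
    return None


def can_merge(results: dict[str, bool]) -> bool:
    """A change may merge only when every stage, including the human, is green."""
    return first_blocking_stage(results) is None
```

The design choice worth noting is the default: an unreported stage blocks the merge, so a skipped check can never be mistaken for a passed one.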

Key points

The first quality gate was local. Pre-commit scripts, prepared with the help of AI agents, ran static code validation, basic quality rules, local tests and an initial coverage check. A commit was possible only after a green result.
After pushing to GitHub, the second control layer started: GitHub Actions. The pipeline worked like a set of automated reviewers. It included, among others, backend tests, frontend linting, a quick validator, OpenAPI contract checks, architecture guards, forbidden path checks and documentation validation.
The next layer was SonarQube / SonarCloud. Its role was not to create a report after the fact. It was a quality gate: it checked whether new issues appeared, whether technical debt was growing, whether the code was becoming too complex, whether test coverage was falling and whether the change degraded the state of the project.
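
The idea of a gate that compares the project state before and after a change can be sketched as follows. This is not SonarQube's API or its actual evaluation logic, where conditions are configured on the server; the `Snapshot` fields and the function name are assumptions chosen to mirror the questions listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Project metrics at one point in time (field names are illustrative)."""
    open_issues: int
    debt_minutes: int
    max_complexity: int
    coverage_pct: float


def gate_violations(before: Snapshot, after: Snapshot) -> list[str]:
    """List every way `after` is worse than `before`; empty means the gate passes."""
    violations = []
    if after.open_issues > before.open_issues:
        violations.append("new issues appeared")
    if after.debt_minutes > before.debt_minutes:
        violations.append("technical debt grew")
    if after.max_complexity > before.max_complexity:
        violations.append("complexity increased")
    if after.coverage_pct < before.coverage_pct:
        violations.append("test coverage fell")
    return violations
```

Comparing against the previous state rather than against absolute targets is what makes such a gate usable during cleanup: a change is accepted as long as it does not make the project worse.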

Why this model worked

The model worked because it combined three elements.

First: the throughput of AI agents.
Agents accelerated the transition from idea to working change. They helped with analysis, implementation, refactoring, tests and documentation.
Second: automated quality control.
Pre-commit, CI, guards, tests and SonarQube created a system of gates. The agent could generate code, but it could not decide on its own that the code was ready.
Third: human decision loops.
The human retained control over the goal, scope, architecture, change acceptance and risk.

In short:

AI generates and proposes.
Pre-commit catches basic errors.
CI checks the change in the repository.
SonarQube evaluates quality.
The human makes the decision.
This is the core of the approach.

What other teams can take from this

For teams starting work with coding agents, the most important question is not: “Which AI agent is the best?”. The answer to that question changes from day to day.

A better question is:

How do we prepare a process that can safely accept code generated by AI agents?

A minimal starting set:

GitHub repository,
pull requests,
pre-commit,
unit tests,
GitHub Actions,
SonarQube / SonarCloud,
instructions for agents,
tasks for the agent,
review: AI agents and a human.
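
Part of this starting set can be checked mechanically. The sketch below looks for files and directories that these tools conventionally use; the exact paths are assumptions (a given repository may use different names, and Project Venom's actual layout is not described here), and the function name is hypothetical.

```python
from pathlib import Path

# Conventional locations (assumptions): adjust to the repository at hand.
EXPECTED = {
    "pre-commit": ".pre-commit-config.yaml",
    "GitHub Actions": ".github/workflows",
    "Sonar configuration": "sonar-project.properties",
    "agent instructions": "AGENTS.md",
    "tests": "tests",
}


def missing_pieces(repo_root: str) -> list[str]:
    """Return the labels of expected files or directories that are absent."""
    root = Path(repo_root)
    return [label for label, rel in EXPECTED.items() if not (root / rel).exists()]
```

Calling `missing_pieces(".")` from the repository root returns an empty list when every expected entry exists. Pull requests and reviews are process rules rather than files, so they stay outside what a check like this can see.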

AI coding without a pipeline is a fast experiment. AI coding with a pipeline becomes a controlled software development process.

The most important experience from Project Venom:

A properly designed process makes it possible to combine high throughput from AI agents with a real improvement in code quality.

The agent can write code, but the pipeline must decide whether that code moves forward.

Question

How do you organize work with AI coding agents?

Do you treat them mainly as assistants in the IDE, or already as participants in the process with their own instructions, guards and quality gates?

I would be happy to compare approaches.

Disclosure

AI support was used for the article and the presentation graphics. Data is cited from the knowledge base.

About the author

Maciej Pieniak
