Engineering Quality Standards

This section defines the quality standards that all AI-augmented engineering outputs MUST meet before integration into the codebase. These standards establish measurable thresholds for code complexity, documentation, test coverage, architectural conformance, and technical debt. They serve as the definitive quality gates in the CI/CD pipeline and the benchmarks against which AI Output Verification and Human-in-the-Loop Review processes evaluate AI-generated code.

Why Elevated Standards for AI-Generated Code

AI code generation tools optimize for plausible, functional output. They do not inherently optimize for maintainability, architectural consistency, or long-term codebase health. Without explicit quality gates, AI-assisted development accelerates the accumulation of technical debt, architectural drift, and maintenance burden. The standards in this section compensate for these tendencies by establishing firm, measurable boundaries.

info

These standards apply to all code entering the codebase, whether AI-generated or human-authored. However, AI-generated code is subject to stricter enforcement because of its statistically higher defect and vulnerability rates.

Complexity Thresholds

Code complexity directly correlates with defect density, testing difficulty, and maintenance cost. AI-generated code frequently exceeds reasonable complexity thresholds because models tend to produce monolithic solutions rather than well-decomposed components.

Mandatory Complexity Limits

| Metric | Maximum for AI-Generated Code | Maximum for Human-Authored Code | Measurement Tool |
|---|---|---|---|
| Cyclomatic Complexity (per function) | 10 | 15 | SonarQube, radon, lizard, ESLint `complexity` rule |
| Cognitive Complexity (per function) | 12 | 18 | SonarQube |
| Function Length (lines) | 40 | 60 | Linter configuration |
| File Length (lines) | 300 | 500 | Linter configuration |
| Parameter Count (per function) | 4 | 5 | Linter configuration |
| Nesting Depth (maximum) | 3 | 4 | Linter configuration |
| Class Coupling (afferent + efferent) | 10 | 15 | SonarQube, JDepend |
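
Several of these limits map directly onto standard linter rules. The sketch below is a minimal ESLint flat-config excerpt mirroring the AI-generated-code column; the file name and glob are illustrative assumptions, and teams standardizing on SonarQube, radon, or lizard would encode the same thresholds in those tools instead.

```ts
// eslint.config.ts (illustrative excerpt): thresholds taken from the
// "Maximum for AI-Generated Code" column above.
export default [
  {
    files: ["**/*.ts"], // assumed glob; match your source layout
    rules: {
      // Cyclomatic complexity per function
      complexity: ["error", { max: 10 }],
      // Function length in lines
      "max-lines-per-function": ["error", { max: 40 }],
      // File length in lines
      "max-lines": ["error", { max: 300 }],
      // Parameter count per function
      "max-params": ["error", { max: 4 }],
      // Maximum nesting depth
      "max-depth": ["error", { max: 3 }],
    },
  },
];
```

Cognitive complexity has no core ESLint rule; it is measured by SonarQube directly, or in ESLint via a plugin such as eslint-plugin-sonarjs.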

Complexity Violation Protocol

  • Code that exceeds any complexity threshold MUST NOT be merged without remediation
  • If a legitimate business reason requires exceeding a threshold, a complexity waiver MUST be filed and approved by a Tier 2 or higher reviewer as defined in Human-in-the-Loop
  • Waivers MUST include: the reason the threshold cannot be met, the risk assessment, and a remediation plan with a deadline
  • Waivers are valid for a maximum of one quarter and MUST be re-evaluated at the next architectural review
warning

AI tools SHOULD be prompted to generate code below these complexity thresholds by including explicit constraints in the prompt. See Prompt Engineering Rigor for constraint specification standards.

Documentation Requirements

AI-generated code often includes minimal, generic, or misleading documentation. Documentation standards ensure that maintainers can understand the intent and behavior of the code without relying on the original AI session context.

Mandatory Documentation Elements

| Element | Requirement | Applies To |
|---|---|---|
| Function/Method Docstrings | MUST be present on all public and protected functions. MUST describe purpose, parameters, return values, exceptions, and side effects. | All code |
| Class/Module Header | MUST describe the responsibility of the class or module and its role in the architecture. | All classes and modules |
| Inline Comments | MUST be present where the implementation choice is non-obvious. MUST NOT merely restate the code. | Complex logic, workarounds, performance optimizations |
| API Documentation | MUST include OpenAPI/Swagger annotations or equivalent for all public API endpoints. | API code |
| Architecture Decision Records (ADRs) | MUST be created when AI-generated code introduces a new pattern, library, or architectural approach. | Architectural changes |
| Prompt Reference | MUST reference the prompt template or session ID used to generate the code. | All AI-generated code |
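
As a concrete illustration of the docstring requirements, the TSDoc sketch below documents purpose, parameters, return value, exceptions, and side effects in one place. The function, its behavior, and the `InsufficientStockError` it mentions are invented for the example.

```ts
/**
 * Reserves stock for an order and decrements on-hand inventory.
 *
 * Side effects: writes a reservation row and emits a StockReserved
 * event within the same transaction.
 *
 * @param orderId - Identifier of the order being fulfilled.
 * @param quantity - Units to reserve; must be a positive integer.
 * @returns The identifier of the created reservation.
 * @throws {RangeError} If `quantity` is not a positive integer.
 * @throws {Error} InsufficientStockError (hypothetical) if on-hand
 *   stock is below `quantity`.
 */
export async function reserveStock(
  orderId: string,
  quantity: number,
): Promise<string> {
  if (!Number.isInteger(quantity) || quantity <= 0) {
    throw new RangeError("quantity must be a positive integer");
  }
  // ... persistence logic elided for the example
  return `reservation-for-${orderId}`;
}
```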

Documentation Anti-Patterns

The following documentation patterns are common in AI-generated code and MUST be rejected during review (a contrasting example follows the list):

  • Restating the code: e.g., `// increment i by 1` placed directly above `i++`
  • Generic descriptions: e.g., `// This function handles the logic`, which conveys nothing specific
  • Hallucinated references: documentation that references classes, methods, or configurations that do not exist
  • Stale documentation: comments describing behavior that differs from the actual implementation
  • Missing error documentation: omitting descriptions of error conditions and exception behavior
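
The difference between a restating comment and an acceptable intent comment is easiest to see side by side; the retry counter and the gateway scenario below are invented for illustration.

```ts
let retries = 0;

// Rejected: merely restates the code.
// increment retries by 1
retries++;

// Accepted: records intent that cannot be inferred from the code alone.
// The upstream gateway counts the initial attempt toward its retry budget,
// so our counter must run one ahead of its documented limit.
retries++;
```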

Test Coverage Minimums

Test coverage requirements for AI-generated code are intentionally elevated above standard baselines. These thresholds are enforced in the CI/CD pipeline as automated gates.

Coverage Requirements

| Metric | AI-Generated Code | Human-Authored Code | Enforcement |
|---|---|---|---|
| Line Coverage | 90% minimum | 80% minimum | CI gate (blocks merge) |
| Branch Coverage | 85% minimum | 75% minimum | CI gate (blocks merge) |
| Function/Method Coverage | 95% minimum | 85% minimum | CI gate (blocks merge) |
| Mutation Score | 70% minimum | Not required | CI gate for Tier 2/3 risk |
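
One way to make these floors actually block a merge is a coverage threshold in the test runner itself. The Jest sketch below mirrors the AI-generated-code column; the choice of Jest and the file name are assumptions, and JaCoCo, coverage.py, and similar tools expose equivalent gates.

```ts
// jest.config.ts (illustrative): the run fails if any floor is missed.
import type { Config } from "jest";

const config: Config = {
  collectCoverage: true,
  coverageThreshold: {
    global: {
      lines: 90,     // Line coverage minimum
      branches: 85,  // Branch coverage minimum
      functions: 95, // Function/method coverage minimum
    },
  },
};

export default config;
```

The mutation-score gate requires a separate mutation-testing tool (for example, Stryker or PIT) configured to fail the run below the 70% floor.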

For detailed coverage measurement methodology and mutation testing requirements, see AI Output Verification & Validation.

tip

When prompting AI tools for code generation, include the coverage requirements in the prompt constraints. For example: "Generate code with corresponding unit tests achieving at least 90% line coverage and 85% branch coverage."

Architectural Conformance

AI-generated code MUST conform to the project's established architectural patterns. AI tools are not aware of an organization's architectural decisions unless explicitly informed, and they frequently generate code that violates layer boundaries, introduces unauthorized dependencies, or bypasses established patterns.

Conformance Requirements

| Requirement | Description | Enforcement |
|---|---|---|
| Layer Boundary Compliance | Code MUST respect defined architectural layers (e.g., controllers MUST NOT directly access repositories in a hexagonal architecture). | ArchUnit, Dependency Cruiser, custom lint rules |
| Dependency Direction | Dependencies MUST flow in the approved direction (e.g., domain MUST NOT depend on infrastructure). | ArchUnit, Dependency Cruiser |
| Pattern Consistency | New code MUST use established patterns (e.g., if the project uses the Repository pattern, AI-generated data access MUST use a Repository, not raw queries). | Code review + architectural fitness functions |
| No Unauthorized Dependencies | AI-generated code MUST NOT introduce new third-party dependencies without explicit approval. New dependencies require a lightweight evaluation covering license, maintenance status, security history, and size impact. | Dependency lockfile diff review in PR |
| Naming Conventions | Classes, functions, variables, and files MUST follow the project's naming conventions. | Linter rules |
| Configuration Patterns | Configuration MUST use the project's established configuration mechanisms (e.g., environment variables, config files) and MUST NOT hardcode values. | Code review + secret scanning |

Architectural Fitness Functions

Teams SHOULD implement automated architectural fitness functions that run in CI and validate conformance continuously:

```java
// Example: ArchUnit fitness functions (Java). The package name
// "com.example" is a placeholder for the project's root package.
import com.tngtech.archunit.junit.AnalyzeClasses;
import com.tngtech.archunit.junit.ArchTest;
import com.tngtech.archunit.lang.ArchRule;

import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

@AnalyzeClasses(packages = "com.example")
class ArchitectureFitnessTest {

    @ArchTest
    static final ArchRule controllers_should_not_access_repositories =
            noClasses()
                    .that().resideInAPackage("..controller..")
                    .should().accessClassesThat()
                    .resideInAPackage("..repository..");

    @ArchTest
    static final ArchRule domain_should_not_depend_on_infrastructure =
            noClasses()
                    .that().resideInAPackage("..domain..")
                    .should().dependOnClassesThat()
                    .resideInAPackage("..infrastructure..");
}
```
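
For TypeScript/JavaScript codebases, the same two rules can be expressed with Dependency Cruiser, which the conformance table names as an enforcement tool. The sketch below shows its JavaScript config file; the path patterns and rule names are placeholders for a project's real layout.

```js
// .dependency-cruiser.js (illustrative excerpt)
module.exports = {
  forbidden: [
    {
      name: "no-controller-to-repository",
      comment: "Controllers must go through the application/service layer.",
      severity: "error",
      from: { path: "^src/controllers" },
      to: { path: "^src/repositories" },
    },
    {
      name: "no-domain-to-infrastructure",
      comment: "Dependencies must point inward, toward the domain.",
      severity: "error",
      from: { path: "^src/domain" },
      to: { path: "^src/infrastructure" },
    },
  ],
};
```

Because fitness functions run on every commit, architectural drift is caught at merge time rather than at the next manual architecture review.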

Technical Debt Limits

AI-assisted development can rapidly accelerate technical debt accumulation if not actively managed. The following standards establish firm limits on debt introduction.

Technical Debt Thresholds

| Metric | Maximum Allowable | Measurement |
|---|---|---|
| SonarQube Technical Debt Ratio | 5% for AI-generated files | SonarQube |
| New Code Smells per PR | 0 Critical, 0 Major | SonarQube, CodeClimate |
| TODO/FIXME/HACK Comments | 0 in AI-generated code (resolve before merge) | Grep/lint rule |
| Duplicated Code | 0% duplication of new AI-generated code against the existing codebase; less than 3% within the PR | SonarQube, jscpd, CPD |
| Deprecated API Usage | 0 new usages of deprecated APIs | Compiler warnings, lint rules |
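
The TODO/FIXME/HACK gate, for example, reduces to a single lint rule. The excerpt below uses ESLint's built-in no-warning-comments rule; scoping it to AI-generated files is left to the team, since a linter cannot detect code provenance on its own.

```ts
// eslint.config.ts (illustrative excerpt): fail on deferred-work markers.
export default [
  {
    files: ["**/*.ts"], // assumed glob
    rules: {
      "no-warning-comments": [
        "error",
        { terms: ["todo", "fixme", "hack"], location: "anywhere" },
      ],
    },
  },
];
```

Duplication floors can be enforced similarly, for instance with jscpd's threshold option or SonarQube's duplicated-lines metric.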

Debt Management Protocol

  • AI-generated code that introduces technical debt beyond the thresholds above MUST NOT be merged
  • If AI-generated code reveals or interacts with existing technical debt, a separate ticket MUST be created to address the pre-existing debt
  • Quarterly technical debt audits MUST include analysis of AI-generated code contributions to total debt
  • The ratio of AI-originated debt to total new debt SHOULD be tracked as a leading indicator

Quality Gate Checklist

The following checklist consolidates all quality gates into a single reference. Every item MUST pass before AI-generated code is merged.

| # | Quality Gate | Threshold | Automated |
|---|---|---|---|
| 1 | All unit tests pass | 100% pass rate | Yes |
| 2 | Line coverage meets minimum | >= 90% | Yes |
| 3 | Branch coverage meets minimum | >= 85% | Yes |
| 4 | Function coverage meets minimum | >= 95% | Yes |
| 5 | Mutation score meets minimum (Tier 2/3) | >= 70% | Yes |
| 6 | SAST scan clean | 0 Critical/High | Yes |
| 7 | Dependency vulnerability scan clean | 0 Critical | Yes |
| 8 | Cyclomatic complexity within limits | <= 10 per function | Yes |
| 9 | Cognitive complexity within limits | <= 12 per function | Yes |
| 10 | No new code smells (Critical/Major) | 0 | Yes |
| 11 | No code duplication | < 3% within PR | Yes |
| 12 | No TODO/FIXME/HACK comments | 0 | Yes |
| 13 | Documentation present and correct | All public APIs documented | Partial (lint + review) |
| 14 | Architectural conformance validated | All fitness functions pass | Yes |
| 15 | No unauthorized new dependencies | 0 unapproved additions | Partial (lockfile diff) |
| 16 | Regression suite passes | 100% pass rate | Yes |
| 17 | Human review completed | Approved by qualified reviewer | No (manual) |
| 18 | Prompt reference attached | PR includes prompt/session reference | Partial (label check) |

danger

CI/CD pipelines MUST be configured to enforce gates 1-16 automatically. Gates 17-18 are enforced through branch protection rules requiring human approval. No bypass mechanism SHALL exist for AI-generated code outside the emergency procedures defined by the engineering director.

Continuous Improvement

Quality standards are living documents. They MUST be reviewed and updated based on:

  • Defect trend analysis from Incident Response post-mortems
  • Emerging research on AI code generation quality
  • Changes in AI tooling capabilities
  • Organizational maturity in AI-assisted development

Standards review cadence: quarterly, conducted by the engineering quality lead in collaboration with the architecture board.