Effective AI Pair Programming
AI pair programming is the dominant interaction pattern for AI-assisted development. Unlike traditional pair programming between two humans, AI pair programming requires the engineer to serve simultaneously as the driver (guiding the AI), the navigator (reviewing AI outputs), and the quality gatekeeper (deciding what to accept). This guide provides proven techniques for making AI pair programming sessions productive, efficient, and safe.
Conversation Structure
The structure of your conversation with an AI coding assistant dramatically affects output quality. Unstructured, stream-of-consciousness prompting produces inconsistent results. Structured conversations produce predictable, high-quality code.
The CRAFT Framework
Use the CRAFT framework for structuring AI pair programming conversations:
| Step | Action | Example |
|---|---|---|
| C - Context | Provide the architectural and codebase context | "This is a Node.js Express API using TypeORM with PostgreSQL. We follow a clean architecture with controllers, services, and repositories." |
| R - Requirements | State the specific requirements clearly | "I need a user registration endpoint that validates email format, checks for duplicates, hashes the password with bcrypt, and returns a JWT." |
| A - Assumptions | State the assumptions the AI should work with | "Assume the User entity and UserRepository already exist. Use the existing JwtService for token generation." |
| F - Format | Specify the expected output format | "Generate the service method and controller endpoint separately. Include error handling for duplicate emails and invalid input." |
| T - Tests | Define how the output will be validated | "Include unit tests for the service method covering success, duplicate email, and invalid email scenarios." |
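Assembled, a CRAFT prompt like the one above gives the assistant enough to produce a reviewable first draft. The sketch below shows the kind of service method you might reasonably expect back; the names (RegistrationService, UserRepository, JwtService) and shapes are hypothetical stand-ins for the existing code the prompt tells the AI to assume.

```typescript
import * as bcrypt from "bcrypt";

// Hypothetical stand-ins for the existing code the prompt tells the AI to assume.
interface User { id: string; email: string; passwordHash: string; }
interface UserRepository {
  findByEmail(email: string): Promise<User | null>;
  save(user: { email: string; passwordHash: string }): Promise<User>;
}
interface JwtService {
  sign(payload: { sub: string; email: string }): string;
}

const EMAIL_RE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

export class RegistrationService {
  constructor(
    private readonly users: UserRepository,
    private readonly jwt: JwtService,
  ) {}

  async register(email: string, password: string): Promise<string> {
    // Requirement: validate email format before touching the database.
    if (!EMAIL_RE.test(email)) {
      throw new Error("Invalid email format");
    }
    // Requirement: reject duplicate registrations.
    if (await this.users.findByEmail(email)) {
      throw new Error("Email already registered");
    }
    // Requirement: hash the password with bcrypt (cost factor 10 shown as an example).
    const passwordHash = await bcrypt.hash(password, 10);
    const user = await this.users.save({ email, passwordHash });
    // Requirement: return a JWT for the new user.
    return this.jwt.sign({ sub: user.id, email: user.email });
  }
}
```

Note how every requirement in the prompt maps to a visible decision in the code, which is what makes the output easy to review against the original CRAFT prompt.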
Conversation Patterns by Task Type
Different development tasks benefit from different conversation patterns:
Feature Implementation (New Code)
- Start with architecture context and design constraints
- Request interface/type definitions first (see the sketch after this list)
- Request implementation second
- Request tests third
- Review and refine at each step
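For a feature like the registration endpoint described earlier, that first request might yield only the types and the service contract, deferring the implementation to a follow-up prompt. The names and shapes below are hypothetical and show one possible first-pass output.

```typescript
// Hypothetical first-pass output: types and contracts only, no implementation yet.
export interface RegisterRequest {
  email: string;
  password: string;
}

export interface RegisterResponse {
  token: string; // JWT for the newly created user
}

export interface RegistrationService {
  register(request: RegisterRequest): Promise<RegisterResponse>;
}
```

Reviewing the contract before any implementation exists catches misunderstandings while they are still cheap to correct.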
Bug Fix
- Describe the observed behavior vs. expected behavior
- Provide the relevant code section
- Share error messages, stack traces, and logs
- Ask for analysis before requesting a fix
- Verify the fix addresses root cause, not just symptoms
Refactoring
- Provide the current code and explain why it needs refactoring
- Specify the target architecture or pattern
- State what must NOT change (behavioral contracts, public APIs)
- Request refactoring in small, reviewable steps
- Verify behavior preservation at each step
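A lightweight way to hold the "must not change" line is a characterization test pinned before the refactor begins and re-run after each step. The sketch below uses Jest and a hypothetical calculateInvoiceTotal function; the expected values are illustrative and would simply record whatever the current code does today.

```typescript
import { calculateInvoiceTotal } from "./invoice"; // hypothetical module being refactored

// Characterization tests: pin the CURRENT observable behavior before refactoring,
// then re-run after every AI-proposed step to confirm nothing changed.
describe("calculateInvoiceTotal (behavior pinned before refactor)", () => {
  it("applies the bulk discount above 10 items", () => {
    expect(calculateInvoiceTotal([{ price: 10, quantity: 12 }])).toBe(108);
  });

  it("returns 0 for an empty invoice", () => {
    expect(calculateInvoiceTotal([])).toBe(0);
  });

  it("rejects negative quantities", () => {
    expect(() => calculateInvoiceTotal([{ price: 10, quantity: -1 }])).toThrow();
  });
});
```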
Code Review Assistance
- Share the code to be reviewed
- Provide project conventions and standards
- Ask for specific categories of feedback (security, performance, maintainability)
- Use AI findings as starting points for your own analysis, not as final verdicts
Context Management
Context management is the most impactful skill in AI pair programming. See Context Window Management for detailed strategies. Key principles:
- Lead with the most relevant code. Many models attend most reliably to material near the beginning (and end) of the context window, so code buried in the middle is the most likely to be overlooked.
- Include interfaces, not implementations. When the AI needs to generate code that integrates with existing systems, provide the interfaces and types, not the full implementations (see the example after this list).
- Prune aggressively. Remove code that is not relevant to the current task. Irrelevant context dilutes the AI's attention.
- Refresh context for long sessions. After extended sessions, restate the key context to prevent drift.
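For example, when asking the AI to write a service that calls an existing repository, paste only the contract the generated code must satisfy, not the TypeORM class behind it. The OrderRepository below is a hypothetical illustration.

```typescript
// What to paste into the prompt: the contract the generated code must call...
export interface Order {
  id: string;
  customerId: string;
  totalCents: number;
  status: "pending" | "paid" | "cancelled";
}

export interface OrderRepository {
  findById(id: string): Promise<Order | null>;
  save(order: Order): Promise<Order>;
}

// ...not the full TypeORM-backed implementation, its query builders, migrations,
// and connection handling. Those details consume context without improving output.
```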
Iterative Refinement During Sessions
Rarely does AI-generated code meet all requirements on the first attempt. Effective pair programming involves systematic refinement:
The Review-Refine-Accept Loop
Generate --> Review --> [Accept | Refine | Reject]
- Accept: integrate the code into the codebase.
- Refine: provide specific feedback, then regenerate and review again.
- Reject: discard the output and write the code manually.
Review: Examine every line of AI output. Check for:
- Correctness against requirements
- Edge case handling
- Error handling completeness
- Security implications
- Architectural alignment
- Style and convention compliance
Refine: When output needs improvement, provide specific feedback:
- Bad: "This doesn't work, try again"
- Good: "The error handling is incomplete. Add a try-catch around the database call and handle ConnectionError and TimeoutError separately. Log the error and return a 503 status."
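Feedback as specific as the "good" example should converge quickly because the expected result is unambiguous. Below is a sketch of what the refined handler might look like, assuming hypothetical ConnectionError and TimeoutError classes, a hypothetical logger, and an Express-style request handler.

```typescript
import { Request, Response } from "express";
// Hypothetical error classes thrown by the data layer.
import { ConnectionError, TimeoutError } from "./database-errors";
import { logger } from "./logger"; // hypothetical structured logger
import { userRepository } from "./user-repository"; // hypothetical existing repository

export async function getUserHandler(req: Request, res: Response): Promise<void> {
  try {
    const user = await userRepository.findById(req.params.id);
    if (!user) {
      res.status(404).json({ error: "User not found" });
      return;
    }
    res.json(user);
  } catch (err) {
    // Handle the two infrastructure failures named in the feedback separately,
    // log them, and surface a 503 so callers know the failure is transient.
    if (err instanceof ConnectionError) {
      logger.error("Database connection failed", { err });
      res.status(503).json({ error: "Service temporarily unavailable" });
      return;
    }
    if (err instanceof TimeoutError) {
      logger.error("Database call timed out", { err });
      res.status(503).json({ error: "Service temporarily unavailable" });
      return;
    }
    throw err; // unknown errors propagate to the global error handler
  }
}
```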
Accept or Reject: Accept code that meets your quality bar. Reject and rewrite manually if:
- Three refinement attempts have not converged on a correct solution
- The AI is introducing regressions with each iteration
- The code is in a security-critical path and you are not confident in the AI's ability to get it right
See Iterative Refinement for advanced refinement patterns.
When to Take Over
Knowing when to stop using AI and write code manually is a critical judgment skill. Take over from AI when:
| Signal | Action |
|---|---|
| AI output is regressing after multiple refinements | Write the code manually using AI's best attempt as reference |
| You spend more time reviewing and fixing than it would take to write | Write from scratch |
| The code involves security-critical logic | Write manually per When NOT to Use AI |
| AI consistently misunderstands the requirement despite clear prompting | The task likely requires domain knowledge the AI lacks; write it yourself |
| AI is producing "almost right" code that requires subtle corrections | Treat the task as being in the AI's uncertainty zone and take over before subtle errors slip through |
| You don't understand what the AI generated | Never accept code you don't understand |
The most dangerous situation is accepting AI code that appears correct but that you have not fully verified. If you cannot explain every line of the generated code, you SHOULD NOT accept it.
Session Patterns
Short Sessions (< 30 minutes)
Best for: Individual functions, bug fixes, utility code, test generation. Pattern: Single CRAFT prompt, one or two refinement cycles, then move on.
Medium Sessions (30-120 minutes)
Best for: Feature implementation, component development, refactoring. Pattern: Multiple CRAFT prompts building on each other. Refresh context every 30-45 minutes. Save intermediate outputs.
Long Sessions (> 120 minutes)
Best for: Architecture exploration, large feature development, system migration. Pattern: Break into phases. Start a new conversation or context for each phase. Document decisions as you go per PRD-STD-005. Avoid context degradation by resetting periodically.
Anti-Patterns to Avoid
- The Rubber Stamp: Accepting AI output without thorough review. This violates PRD-STD-002 and is the primary source of AI-related defects.
- The Infinite Loop: Endlessly refining AI output hoping for perfection. Set a refinement budget (typically 3 attempts) and fall back to manual coding.
- The Context Dump: Pasting an entire codebase into the context window. This overwhelms the AI and produces worse results than carefully curated context.
- The Assumption: Assuming AI understands your project's conventions, architecture, or business rules without explicitly stating them.
- The Shortcut: Using AI to generate code in areas where you lack expertise, hoping it "just works." AI amplifies expertise; it does not replace it.
- The Solo Show: Using AI without any peer review, even for seemingly simple changes. Every AI-assisted change benefits from a second pair of human eyes.
Measuring Effectiveness
Track these metrics to measure and improve AI pair programming effectiveness:
- Acceptance rate: Percentage of AI suggestions accepted without modification (target: 40-60%)
- Iteration count: Average number of refinement cycles per task (target: 1-3)
- Takeover rate: Percentage of tasks where AI is abandoned in favor of manual coding (expected: 10-20%)
- Post-merge defect rate: Defects found in AI-assisted code after merge (compare to non-AI baseline)
- Session productivity: Story points or tasks completed per session (compare to non-AI baseline)
These metrics support the Pillar 4: Measurement & Metrics framework and help teams continuously refine their AI pair programming practices.
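The rate metrics are simple ratios over a session log. As a rough illustration, the sketch below computes the first three from hypothetical per-task records; the record shape and field names are assumptions, not part of any standard.

```typescript
// Hypothetical record kept per AI-assisted task during a session.
interface TaskRecord {
  acceptedWithoutModification: boolean;
  refinementCycles: number;
  abandonedForManualCoding: boolean;
}

function summarize(tasks: TaskRecord[]) {
  const n = tasks.length;
  if (n === 0) {
    throw new Error("No tasks recorded for this session");
  }
  return {
    // Percentage of suggestions accepted without modification (target: 40-60%).
    acceptanceRate: tasks.filter(t => t.acceptedWithoutModification).length / n,
    // Average number of refinement cycles per task (target: 1-3).
    avgIterations: tasks.reduce((sum, t) => sum + t.refinementCycles, 0) / n,
    // Percentage of tasks abandoned in favor of manual coding (expected: 10-20%).
    takeoverRate: tasks.filter(t => t.abandonedForManualCoding).length / n,
  };
}
```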