Estimation & Planning with AI

Traditional estimation models are built on the assumption that developer productivity is relatively stable and predictable. AI-assisted development violates this assumption. Some tasks are accelerated dramatically; others see no benefit; and the same task may yield different acceleration depending on context quality, prompt effectiveness, and AI tool performance on a given day. Organizations MUST recalibrate their estimation and planning approaches to account for this new reality.

The Estimation Challenge

Why Traditional Estimation Breaks Down

Consider a team that historically delivered 40 story points per sprint with consistent accuracy. After adopting AI tools:

  • Sprint 1: 52 points delivered (AI accelerated several stories)
  • Sprint 2: 38 points delivered (sprint contained mostly architecture work)
  • Sprint 3: 61 points delivered (high proportion of AI-friendly tasks)
  • Sprint 4: 35 points delivered (AI service outage, security-sensitive work)

The average velocity (46.5 points) is higher than the pre-AI baseline, but the variance is far larger. Planning based on a single velocity number produces unreliable forecasts. Stakeholders lose confidence. Teams feel pressured when they undershoot and undervalued when they overshoot.
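To make the shift concrete, here is a minimal sketch (Python standard library) that computes the mean and sample standard deviation of the four sprints above; the numbers come from the example, and the code itself is purely illustrative:

```python
from statistics import mean, stdev

# Velocities from the four post-adoption sprints listed above.
post_ai_velocities = [52, 38, 61, 35]

avg = mean(post_ai_velocities)      # 46.5 points, above the pre-AI baseline of 40
spread = stdev(post_ai_velocities)  # ~12 points of sprint-to-sprint variation

print(f"Average velocity: {avg}")
print(f"Standard deviation: {spread:.1f}")
```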

info

The core insight is that AI-assisted development does not uniformly increase velocity -- it changes the distribution of velocity from a narrow band to a wide range that depends heavily on work mix. Estimation approaches MUST account for this distribution, not just the mean.

Recalibrating Velocity

Establishing AI-Adjusted Baselines

Organizations MUST establish new velocity baselines that reflect AI-augmented capabilities. This requires a structured baselining period.

Baselining Process:

  1. Collect Data: Track velocity for a minimum of 6 sprints after AI tool adoption stabilizes (past the initial learning curve)
  2. Categorize Work: Tag each completed story with its AI acceleration category (High, Medium, Low, None)
  3. Calculate Category Velocities: Determine average velocity contribution by work category
  4. Build Work-Mix Model: Create a model that predicts sprint velocity based on planned work mix
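As an illustration of steps 2 and 3, the sketch below derives per-category velocity contributions from tagged story data. The record structure, field names, and point values are hypothetical, a minimal example of the bookkeeping rather than a prescribed schema:

```python
from collections import defaultdict

# Hypothetical tagged history: one record per completed story, carrying the
# AI acceleration category assigned in step 2.
completed_stories = [
    {"sprint": 1, "points": 5, "category": "High"},
    {"sprint": 1, "points": 8, "category": "None"},
    {"sprint": 2, "points": 3, "category": "Medium"},
    {"sprint": 2, "points": 5, "category": "High"},
    # ... extend with every completed story in the 6+ sprint baselining window
]

def category_velocities(stories, num_sprints):
    """Average points delivered per sprint, broken down by AI acceleration category."""
    totals = defaultdict(int)
    for story in stories:
        totals[story["category"]] += story["points"]
    return {category: points / num_sprints for category, points in totals.items()}

print(category_velocities(completed_stories, num_sprints=2))
# e.g. {'High': 5.0, 'None': 4.0, 'Medium': 1.5}
```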

Velocity Range Model

Rather than a single velocity number, teams MUST use a three-point velocity range:

| Velocity Estimate | Calculation | Confidence | Use |
| --- | --- | --- | --- |
| P25 (Conservative) | 25th percentile of observed sprint velocities | 75% confidence of achievement | Sprint commitment |
| P50 (Expected) | Median observed sprint velocity | 50% confidence of achievement | Planning target |
| P75 (Optimistic) | 75th percentile of observed sprint velocities | 25% confidence of achievement | Best-case scenario |

Example:

Observed velocities over 8 sprints: 35, 38, 42, 45, 48, 52, 55, 61

P25 (Conservative): 39 → Commit to this
P50 (Expected): 46.5 → Plan for this
P75 (Optimistic): 53 → Stretch target
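A minimal sketch of the same calculation using NumPy. Note that the exact P25 and P75 cut points shift by a point or two depending on the percentile interpolation convention, so small differences from the figures above are expected:

```python
import numpy as np

# Observed sprint velocities from the example above.
velocities = [35, 38, 42, 45, 48, 52, 55, 61]

# Exact P25/P75 values vary slightly with the interpolation method used.
p25, p50, p75 = np.percentile(velocities, [25, 50, 75])

print(f"Commit (P25):  {p25:.1f} points")   # conservative sprint commitment
print(f"Plan (P50):    {p50:.1f} points")   # planning target
print(f"Stretch (P75): {p75:.1f} points")   # best-case scenario
```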

Work-Mix Adjusted Estimation

For higher accuracy, teams SHOULD compute expected velocity based on the specific work mix planned for the sprint:

Expected Velocity = Σ (Story Points in Category × Category Acceleration Factor)

This approach requires the story categorization described in Sprint & Agile Adaptation and produces more accurate estimates than a blanket velocity number.
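A minimal sketch of the calculation, assuming acceleration factors calibrated during the baselining period; the specific factor values and point totals below are illustrative, not recommendations:

```python
# Hypothetical acceleration factors calibrated during the baselining period.
acceleration_factors = {"High": 1.6, "Medium": 1.25, "Low": 1.05, "None": 1.0}

# Planned work mix for the upcoming sprint: story points per category.
planned_points = {"High": 13, "Medium": 15, "Low": 8, "None": 5}

# Expected Velocity = sum over categories of (points x acceleration factor).
expected_velocity = sum(
    points * acceleration_factors[category]
    for category, points in planned_points.items()
)

print(f"Work-mix adjusted expected velocity: {expected_velocity:.1f} points")
```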

Managing Variable Productivity

Sources of Variability

Teams MUST understand and plan for the following sources of AI productivity variability:

| Source | Impact | Mitigation |
| --- | --- | --- |
| Work mix | High acceleration vs. low acceleration tasks | Work-mix adjusted estimation |
| AI tool availability | Service outages, rate limits, degraded performance | Fallback capacity planning |
| Context quality | Well-prepared context vs. ad-hoc prompting | Workflow optimization standards |
| Developer AI proficiency | Experienced vs. new AI tool users | Training programs |
| Task novelty | Routine vs. novel problems | Category-specific estimation |
| Review burden | Simple vs. complex review requirements | Review time allocation in planning |

Variability Reduction Strategies

Organizations SHOULD implement these strategies to reduce estimation variability over time:

  1. Standardize Workflows: Consistent workflow patterns reduce the variance introduced by individual approach differences
  2. Invest in Context Packs: Well-maintained context packs reduce the variance caused by context quality differences
  3. Build Prompt Libraries: Proven prompts from the prompt repository produce more consistent results than ad-hoc prompting
  4. Level Up Skills: Training programs reduce the variance between experienced and novice AI tool users
  5. Plan for Fallback: Always include non-AI-dependent tasks in the sprint backlog so that AI tool outages do not derail the sprint

tip

Variability is highest in the first 3-6 months of AI adoption and naturally decreases as teams build expertise, prompts, and workflows. Do not panic about high variability during the adoption period -- focus on collecting data and building the assets that will reduce it over time.

Communicating Timelines

The Range-Based Communication Model

When communicating timelines to stakeholders, teams MUST use range-based estimates rather than point estimates:

Instead of: "This project will take 8 sprints."

Use: "This project will take 6-10 sprints, with 8 sprints being our most likely estimate. The range reflects variability in AI-assisted development acceleration, which we are actively working to narrow."

Stakeholder Communication Framework

| Audience | Communication Style | Detail Level | Frequency |
| --- | --- | --- | --- |
| Executive Leadership | Range with confidence level | High-level milestones | Monthly |
| Product Management | Range with work-mix explanation | Feature-level estimates | Per sprint |
| Engineering Teams | Detailed estimation model | Story-level breakdown | Sprint planning |
| External Stakeholders | Conservative estimate + upside potential | Commitment dates only | As needed |

Managing Expectations

Organizations MUST proactively set expectations about AI-assisted development estimation:

  1. Acknowledge Uncertainty: Be transparent that AI assistance introduces estimation variability, especially in early adoption
  2. Show Improvement Trend: Present estimation accuracy data over time to demonstrate that the team is improving
  3. Explain the Tradeoff: Higher average velocity comes with higher variability -- this is a known tradeoff, not a failure
  4. Commit Conservatively, Deliver Frequently: Use conservative commitments but deliver increments as they complete, which often means delivering ahead of the commitment date

warning

Never promise stakeholders the optimistic estimate. The asymmetry of outcomes is severe: delivering early is a pleasant surprise; delivering late damages trust. Always commit to P25 (conservative) and let faster delivery speak for itself.

Uncertainty Management

Cone of Uncertainty (AI-Adjusted)

The traditional cone of uncertainty narrows as a project progresses and more is learned. In AI-assisted development, the cone is initially wider but can narrow faster:

| Project Phase | Traditional Uncertainty | AI-Adjusted Uncertainty | Rationale |
| --- | --- | --- | --- |
| Inception | +/- 100% | +/- 150% | AI acceleration unknown for this project |
| Elaboration | +/- 50% | +/- 60% | Initial AI effectiveness data collected |
| Construction (early) | +/- 25% | +/- 30% | Work-mix acceleration factors calibrated |
| Construction (late) | +/- 10% | +/- 10% | Sufficient data for reliable estimation |
| Transition | +/- 5% | +/- 5% | Most work complete, minimal uncertainty |
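A minimal sketch of how the AI-adjusted percentages above can turn a point estimate into a phase-dependent range. Treating the uncertainty as a symmetric percentage is a simplification of the cone, so the lower bound is clamped at zero for the very wide early-phase values:

```python
# AI-adjusted uncertainty by phase, expressed as a fraction of the estimate
# (values taken from the table above).
AI_UNCERTAINTY = {
    "inception": 1.50,
    "elaboration": 0.60,
    "construction_early": 0.30,
    "construction_late": 0.10,
    "transition": 0.05,
}

def estimate_range(point_estimate_sprints, phase):
    """Turn a point estimate into a low/high range for the current project phase."""
    u = AI_UNCERTAINTY[phase]
    low = max(point_estimate_sprints * (1 - u), 0)  # clamp: the widest early-phase
    high = point_estimate_sprints * (1 + u)         # uncertainty would go negative
    return low, high

low, high = estimate_range(8, "elaboration")
print(f"8-sprint estimate during elaboration: {low:.1f} to {high:.1f} sprints")
```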

Risk Buffers

Project plans MUST include explicit buffers for AI-related risks:

  • AI Tool Dependency Risk: 5-10% buffer for AI service outages or degraded performance
  • Quality Risk: 10-15% buffer for additional review and rework of AI-generated code
  • Learning Curve Risk: 15-20% buffer for teams in their first 3 months of AI tool adoption (decreasing to 0% after 6 months)
  • Regulatory Risk: 5% buffer for compliance-related rework if regulations change
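As an illustration, the sketch below applies the buffers above to a baseline estimate. It assumes the buffers combine additively on top of the baseline, which is an assumption rather than a prescribed rule, and uses the midpoint of each stated range purely for demonstration:

```python
# Baseline estimate before buffers (e.g. from the work-mix model), in sprints.
baseline_sprints = 8.0

# Midpoints of the buffer ranges listed above, chosen here for illustration only.
risk_buffers = {
    "ai_tool_dependency": 0.075,  # 5-10%
    "quality": 0.125,             # 10-15%
    "learning_curve": 0.175,      # 15-20%, tapering to 0 after month 6
    "regulatory": 0.05,           # 5%
}

# Assumption: buffers are applied additively on top of the baseline.
total_buffer = sum(risk_buffers.values())
buffered_sprints = baseline_sprints * (1 + total_buffer)

print(f"Combined buffer: {total_buffer:.1%}")
print(f"Buffered estimate: {buffered_sprints:.1f} sprints")
```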

Contingency Planning

For each critical path milestone, teams MUST identify:

  1. AI-dependent tasks: Which milestone tasks rely on AI acceleration?
  2. Fallback approach: How would these tasks be completed without AI assistance?
  3. Fallback timeline: How much longer would the fallback approach take?
  4. Trigger criteria: At what point should the team switch to the fallback approach?

Planning Artifacts

AI-Augmented Sprint Planning Template

## Sprint [N] Planning

### Planned Work Mix
- High AI acceleration: [X] stories, [Y] points
- Medium AI acceleration: [X] stories, [Y] points
- Low AI acceleration: [X] stories, [Y] points

### Velocity Estimates
- Conservative (P25): [X] points
- Expected (P50): [Y] points
- Optimistic (P75): [Z] points

### Sprint Commitment: [Conservative estimate] points

### AI-Related Risks
- [Risk 1]: [Mitigation]
- [Risk 2]: [Mitigation]

### Review Capacity
- AI code review allocation: [X] hours
- Standard review allocation: [Y] hours

Release Planning Adjustments

For release-level planning, organizations MUST:

  • Use the P25 (conservative) velocity for release date commitments
  • Use the P50 (expected) velocity for internal planning and resource allocation
  • Track actual velocity against all three estimates to improve future planning accuracy
  • Recalculate release projections at least every 2 sprints using updated velocity data
  • Include explicit AI risk buffers in the release plan

danger

Do not allow AI tool vendors' productivity claims to influence release commitments. Vendor claims of "2-3x productivity improvement" are marketing figures based on ideal conditions. Real-world acceleration varies dramatically by task type, team maturity, and codebase complexity. Use only your team's measured data for planning.
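Putting the release-planning rules above together, here is a minimal sketch that projects sprints-to-release from the remaining backlog using the team's own P25/P50/P75 velocities; the backlog size and velocity figures are hypothetical:

```python
import math

# Remaining backlog for the release, in story points (hypothetical figure).
remaining_points = 310

# Velocity range taken from the team's own measured data (see the range model above).
velocity = {"P25": 39, "P50": 46.5, "P75": 53}

projections = {label: math.ceil(remaining_points / v) for label, v in velocity.items()}

print(f"Release commitment (P25): {projections['P25']} sprints")  # external commitment
print(f"Internal plan (P50):      {projections['P50']} sprints")  # resource allocation
print(f"Best case (P75):          {projections['P75']} sprints")  # upside, never promised
```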

Estimation Maturity Progression

As organizations gain experience, their estimation approaches SHOULD mature:

| Maturity Stage | Approach | Expected Accuracy |
| --- | --- | --- |
| Beginner | Pre-AI baseline + flat AI buffer | +/- 40-60% |
| Developing | Category-based acceleration factors | +/- 25-40% |
| Proficient | Work-mix adjusted velocity ranges | +/- 15-25% |
| Advanced | Predictive models using historical AI task data | +/- 10-15% |
| Expert | ML-assisted estimation with continuous calibration | +/- 5-10% |

Organizations SHOULD target the "Proficient" stage within 2-3 months of AI adoption and "Advanced" within 4-6 months. The progression is driven by data collection, analysis, and the feedback loops defined in Pillar 3.