Phased Rollout – Cohort Strategy

Principle

Don’t light up 500 developers at once. Each cohort discovers different classes of issues and builds institutional knowledge for the next cohort.

Cohort 1: Power Users (25 developers, Weeks 5–6)

Selection Criteria

  • Hand-picked across 3–4 teams representing different technology stacks
  • Mix of senior engineers (who know the patterns) and enthusiastic mid-levels (who’ll push boundaries)
  • At least one developer per major codebase
  • Include developers who are already CLI/terminal-native

Objectives

  • Validate infrastructure end-to-end (Bedrock routing, gateway, PrivateLink)
  • Test managed-settings.json enforcement – do the deny rules work? Does bypass mode stay disabled?
  • Write the first project CLAUDE.md and agent_docs/ files for their repos
  • Co-create the initial 5–8 org-wide skills based on real workflows
  • Become internal champions who can support Cohort 2
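
The enforcement question above can be partly answered by checking the deployed file before anyone runs a session. The following is a minimal sketch, assuming the settings file keeps its rules under `permissions.deny` and disables bypass mode with `permissions.disableBypassPermissionsMode` set to `"disable"`; verify both key names against the current Claude Code settings documentation. The rules in `REQUIRED_DENY_RULES` are hypothetical examples, not a recommended policy.

```python
import json

# Hypothetical policy: the deny rules the security team expects to find.
REQUIRED_DENY_RULES = {"Read(./.env)", "Bash(curl:*)"}


def check_managed_settings(settings: dict) -> list[str]:
    """Return human-readable problems; an empty list means the file looks right."""
    problems = []
    permissions = settings.get("permissions", {})

    # Every required rule must appear verbatim in the deny list.
    missing = REQUIRED_DENY_RULES - set(permissions.get("deny", []))
    for rule in sorted(missing):
        problems.append(f"missing deny rule: {rule}")

    # Assumed key name and value for keeping bypass mode off.
    if permissions.get("disableBypassPermissionsMode") != "disable":
        problems.append("bypass mode is not disabled")

    return problems


example = json.loads("""
{
  "permissions": {
    "deny": ["Read(./.env)", "Bash(curl:*)"],
    "disableBypassPermissionsMode": "disable"
  }
}
""")
print(check_managed_settings(example))
```

This only inspects the file's contents. Cohort 1 still needs to attempt denied actions in real sessions to confirm the rules hold at runtime.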

Expected Discoveries

  • Bedrock inference profiles required for on-demand usage (common first gotcha)
  • Prompt caching behavior differs from direct API
  • Some model versions lag behind on Bedrock
  • Specific CLAUDE.md instructions that Claude follows well vs. ignores
  • MCP servers that are most valuable for the org’s toolchain
  • Skills that need more or fewer steps than initially designed
  • Edge cases in deny rules (false positives blocking legitimate work)

Success Metrics

  • All 25 developers able to use Claude Code through the enterprise infrastructure
  • At least 3 project CLAUDE.md files written and reviewed
  • At least 5 org-wide skills tested and iterated
  • Zero security policy violations (deny rules holding)
  • Qualitative feedback: “This makes me faster” vs. “This gets in the way”

Platform Team Commitment

  • Dedicated Slack channel for issues, monitored daily during the first week
  • 30-minute stand-up twice per week with Cohort 1
  • Same-day turnaround on configuration issues

Cohort 2: Full Teams (100 developers, Weeks 7–9)

Selection Criteria

  • Onboard complete teams rather than hand-picked individuals
  • Include at least one team that’s skeptical – they’ll surface real objections
  • Include the team with the most complex codebase

Objectives

  • Stress-test the LLM gateway’s rate limiting and budget model
  • Validate that team-level CLAUDE.md and skills work across a full team
  • Discover the actual cost model (tokens per developer per day)
  • Instrument OpenTelemetry metrics

New Infrastructure Requirements

  • Gateway rate limiting tuned based on Cohort 1 usage patterns
  • Per-team token budgets configured
  • CloudWatch dashboards showing per-user token consumption, latency percentiles, error rates
  • Cost allocation tags in AWS for team-level billing
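
As an illustration of the per-team token budget idea, here is a minimal in-memory sketch. The `TeamBudget` class, its field names, and the daily granularity are all hypothetical; a real gateway would persist counters, reset them on a schedule, and likely use its own built-in budget features instead.

```python
from dataclasses import dataclass


@dataclass
class TeamBudget:
    """Hypothetical per-team daily token budget tracked by the gateway."""
    team: str
    daily_token_limit: int
    used_today: int = 0

    def try_consume(self, tokens: int) -> bool:
        """Record the usage and return True if it fits; otherwise reject it."""
        if self.used_today + tokens > self.daily_token_limit:
            return False  # over budget: usage is not recorded
        self.used_today += tokens
        return True


budget = TeamBudget(team="payments", daily_token_limit=1_000_000)
print(budget.try_consume(900_000))  # fits within the limit
print(budget.try_consume(200_000))  # would exceed the limit, so it is rejected
print(budget.used_today)
```

Rejecting a request outright is only one possible policy. Alerting the team lead or routing to a cheaper model are alternatives worth comparing against Cohort 1 usage patterns.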

Expected Discoveries

  • Real-world token consumption rates (expect 50K–200K tokens/developer/day depending on usage intensity)
  • Teams that use Claude Code very differently (some use it for code gen, others for review, others for documentation)
  • Teams that need different model access (Opus for architecture work, Sonnet for routine coding)
  • Edge cases in project CLAUDE.md that only surface with diverse usage patterns
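
To make the 50K–200K tokens/developer/day range concrete, the arithmetic for a cohort can be sketched as below. The single blended price per million tokens and the 21 working days per month are assumptions for illustration only; actual Bedrock pricing differs by model and by input, output, and cached tokens.

```python
def daily_token_range(developers: int,
                      low_per_dev: int = 50_000,
                      high_per_dev: int = 200_000) -> tuple[int, int]:
    """Total tokens per day across the cohort, as a (low, high) pair."""
    return developers * low_per_dev, developers * high_per_dev


def monthly_cost_range(developers: int,
                       price_per_million_tokens: float,
                       working_days: int = 21) -> tuple[float, float]:
    """Rough monthly cost bounds using one blended (hypothetical) token price."""
    def to_cost(tokens_per_day: int) -> float:
        return tokens_per_day * working_days / 1_000_000 * price_per_million_tokens

    low, high = daily_token_range(developers)
    return to_cost(low), to_cost(high)


# Cohort 2: 100 developers.
print(daily_token_range(100))        # (5000000, 20000000)
# 5.0 per million tokens is a placeholder, not a quoted price.
print(monthly_cost_range(100, 5.0))  # (525.0, 2100.0)
```

The fourfold gap between the low and high bounds is the reason the cost model has to be measured in Cohort 2 rather than estimated up front.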

Success Metrics

  • Gateway handles 100 concurrent developers without latency spikes
  • Per-team budgets prevent runaway costs
  • At least 80% of developers reporting increased productivity
  • Zero infrastructure outages
  • Cost model validated and predictable

Platform Team Commitment

  • Slack channel continues, monitored 8am–6pm
  • Weekly office hours for questions
  • Bi-weekly feedback surveys

Cohort 3: Full Organization (375 developers, Weeks 10–12)

Prerequisites (Must Be True Before Launching)

  • Gateway proven at 100-developer scale
  • managed-settings.json deployed to all developer machines via Mobile Device Management (MDM)
  • Internal documentation written by Cohort 1 champions
  • Onboarding guide tested with Cohort 2 (developers who weren’t hand-picked)
  • Cost projections validated and approved by finance
  • CISO sign-off on the security architecture still current

Rollout Approach

  • Department-by-department, not all-at-once
  • Each department gets a 15-minute onboarding session led by a Cohort 1/2 champion
  • Self-service documentation available for async onboarding
  • Platform team monitors gateway metrics for capacity issues

Expected Discoveries

  • Long-tail support issues from developers with non-standard setups
  • Teams that need custom skills not anticipated during Cohort 1/2
  • Organizational patterns in how different teams use Claude Code
  • Real productivity data at scale for leadership reporting

Success Metrics

  • 90%+ of developers have used Claude Code at least once within 2 weeks of access
  • Weekly active usage rate > 60% after first month
  • Support ticket volume declining (not growing) after first 2 weeks
  • Total cost within 10% of projection
  • No security incidents
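
The metrics above are all simple ratios and comparisons, so they can be evaluated mechanically from a periodic snapshot. The sketch below is one interpretation: the `Cohort3Snapshot` fields are hypothetical, the denominator is assumed to be developers with access, and "declining" is read as strictly decreasing weekly ticket counts from week 2 onward.

```python
from dataclasses import dataclass


@dataclass
class Cohort3Snapshot:
    """Hypothetical inputs gathered from telemetry, billing, and ticketing."""
    developers_with_access: int
    used_at_least_once: int    # within 2 weeks of getting access
    weekly_active: int         # measured after the first month
    actual_cost: float
    projected_cost: float
    weekly_tickets: list[int]  # ticket counts per week, oldest first
    security_incidents: int


def evaluate(s: Cohort3Snapshot) -> dict[str, bool]:
    """Map each success metric to pass (True) or fail (False)."""
    after_ramp_up = s.weekly_tickets[1:]  # week 2 is the comparison baseline
    return {
        "adoption_at_least_90pct": s.used_at_least_once / s.developers_with_access >= 0.90,
        "weekly_active_over_60pct": s.weekly_active / s.developers_with_access > 0.60,
        "tickets_declining": all(
            later < earlier for earlier, later in zip(after_ramp_up, after_ramp_up[1:])
        ),
        "cost_within_10pct": abs(s.actual_cost - s.projected_cost) / s.projected_cost <= 0.10,
        "no_security_incidents": s.security_incidents == 0,
    }


snapshot = Cohort3Snapshot(
    developers_with_access=375,
    used_at_least_once=345,
    weekly_active=240,
    actual_cost=10_500.0,
    projected_cost=10_000.0,
    weekly_tickets=[40, 55, 38, 30],
    security_incidents=0,
)
print(evaluate(snapshot))
```

With fewer than three weeks of ticket data the `tickets_declining` check passes trivially, so it is only meaningful once enough weeks have accumulated.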

Rollback Plan

Principle

Rollback is cohort-level, not all-or-nothing. If a cohort encounters serious issues, pause that cohort while earlier cohorts continue operating.

Rollback Triggers

  • Security incident (deny rule bypass, data exposure)
  • Infrastructure instability (gateway outages, Bedrock throttling) affecting developer productivity
  • Cost significantly exceeding projections (>25% over forecast)
  • Widespread negative developer feedback (>40% reporting “gets in the way” rather than “makes me faster”)
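
The two quantitative triggers (more than 25% over forecast, more than 40% negative feedback) lend themselves to an automated check alongside the two judgment-based ones. A minimal sketch follows, with hypothetical function and parameter names; the security and stability flags are assumed to be set by a human, since the source gives no numeric threshold for them.

```python
def rollback_triggers(security_incident: bool,
                      infra_unstable: bool,
                      actual_cost: float,
                      forecast_cost: float,
                      negative_feedback_share: float) -> list[str]:
    """Return the names of every rollback trigger that currently fires."""
    fired = []
    if security_incident:
        fired.append("security_incident")
    if infra_unstable:
        fired.append("infrastructure_instability")
    if actual_cost > forecast_cost * 1.25:   # more than 25% over forecast
        fired.append("cost_overrun")
    if negative_feedback_share > 0.40:       # more than 40% say "gets in the way"
        fired.append("negative_feedback")
    return fired


# Cost is 30% over forecast; feedback is below the 40% threshold.
print(rollback_triggers(False, False, 13_000, 10_000, 0.35))  # ['cost_overrun']
```

Returning every fired trigger, rather than stopping at the first, keeps the decision about which rollback action to take with the platform team.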

Rollback Actions by Severity

Pause (reversible): Disable the gateway route for the affected cohort’s users. Developers can’t reach Claude Code but nothing is uninstalled. Resume when the issue is resolved.

Reconfigure: Push updated managed-settings.json via MDM to tighten deny rules, change model routing, or adjust budgets. No developer action required.

Full rollback: Remove Claude Code from developer machines via MDM. Remove managed-settings.json and managed CLAUDE.md. This is the nuclear option – use only if a fundamental security concern is discovered.

What Rollback Does NOT Require

  • Reverting code already written with Claude Code – it’s in git like any other code
  • Removing CLAUDE.md files from repos – they’re inert without the CLI
  • Decommissioning the Bedrock setup – infrastructure can stay warm for re-engagement

Re-Engagement After Rollback

If you roll back a cohort, fix the root cause and re-launch with a smaller pilot (5–10 developers) before re-expanding. Don’t return to the previous cohort size until the pilot has validated the fix.

Timeline Summary

Weeks 1-4:   Phase 0 -- Infrastructure build
Weeks 3-6:   Phase 1 -- Platform engineering (overlapping)
Weeks 5-6:   Cohort 1 -- 25 power users
Weeks 7-9:   Cohort 2 -- 100 developers (full teams)
Weeks 10-12: Cohort 3 -- 375 remaining developers
Week 12+:    Phase 3 -- Ongoing governance and optimization