# Amazon Bedrock Fundamentals

## What Is Amazon Bedrock?
Bedrock is AWS’s managed API gateway for foundation models. Instead of going directly to Anthropic for model access, Bedrock gives you a single AWS-native service that brokers access to foundation models through the same IAM, billing, networking, and compliance infrastructure you already use for everything else in AWS.
Analogy for Cloud Foundry practitioners: If Cloud Foundry abstracts away infrastructure for app developers, Bedrock does the same for model inference. Developers don’t think about where Claude is running – they just call the API. The platform team controls the networking, access, cost, and compliance layer underneath.
## Key Properties for Enterprise Deployment

### Managed Inference, Not Model Hosting
You don’t deploy models, manage GPU clusters, or deal with scaling. You call InvokeModel or InvokeModelWithResponseStream, and Bedrock handles the rest. AWS runs the Claude models in their infrastructure under a contractual relationship with Anthropic.
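The call pattern above can be sketched with boto3's `bedrock-runtime` client. This is a minimal illustration, not a production client: the model ID is the Sonnet inference profile used elsewhere in this guide, and the region and `max_tokens` values are placeholder assumptions.

```python
import json

# Inference-profile ID for Sonnet (substitute whatever is enabled in your
# account/region -- bare model IDs won't work for on-demand throughput).
MODEL_ID = "us.anthropic.claude-sonnet-4-5-20250929-v1:0"

def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build the Anthropic-format request body Bedrock expects."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

def invoke(prompt: str, region: str = "us-east-1") -> str:
    """Call InvokeModel via the Bedrock runtime (requires AWS credentials)."""
    import boto3  # imported lazily so build_request stays dependency-free

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=json.dumps(build_request(prompt)),
    )
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]
```

The streaming variant, `InvokeModelWithResponseStream`, is exposed in boto3 as `invoke_model_with_response_stream` and takes the same `modelId`/`body` arguments.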
### Data Boundary Guarantees
- Customer inputs and outputs are not used to train or improve foundation models
- Data is not shared with model providers (Anthropic)
- Data is not stored beyond immediate request processing (unless you explicitly enable logging)
- Your code/prompts go to AWS, not to Anthropic directly
### Data Retention Policy
AWS Bedrock does not store prompt or completion data, but logs are retained for approximately 30 days by default through CloudTrail and CloudWatch. This retention period is customer-configurable but cannot be reduced to zero.
Critical difference from direct API: Anthropic’s Zero Data Retention (ZDR) offering is not available when using Claude through AWS Bedrock or GCP Vertex AI. Organizations with compliance requirements for immediate data deletion must use either:
- Anthropic’s direct API with ZDR enabled (7-day standard retention, or immediate discard with ZDR contract)
- Azure Foundry (where Anthropic is the data processor and ZDR terms apply)
For Bedrock deployments, the 30-day log retention is the minimum. Configure CloudTrail and CloudWatch retention policies to meet your compliance requirements, but understand that some logging window is inherent to the AWS integration.
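As a sketch of that configuration step, retention can be enforced with the CloudWatch Logs `PutRetentionPolicy` API. The log group name below is a hypothetical example, and the allowed-values set shown is only a subset of what CloudWatch actually accepts.

```python
# Subset of the retention periods (in days) CloudWatch Logs accepts.
ALLOWED_RETENTION_DAYS = {1, 3, 5, 7, 14, 30, 60, 90, 180, 365}

def validate_retention(days: int) -> int:
    """Reject values CloudWatch would refuse before making the API call."""
    if days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"{days} is not a valid CloudWatch retention period")
    return days

def set_log_retention(log_group: str, days: int = 30) -> None:
    """Apply a retention policy to a log group (requires AWS credentials)."""
    import boto3

    logs = boto3.client("logs")
    logs.put_retention_policy(
        logGroupName=log_group,
        retentionInDays=validate_retention(days),
    )

# Hypothetical log group for Bedrock invocation logging:
# set_log_retention("/aws/bedrock/invocation-logs", days=30)
```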
### Inherits the AWS Security Stack
- IAM policies control who can invoke which models
- CloudTrail logs every API call
- VPC endpoints via PrivateLink keep traffic off the public internet
- KMS encryption for data at rest
- Compliance certifications: SOC 2, HIPAA, FedRAMP, etc.
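To illustrate the IAM point, a least-privilege policy can scope invocation to just the two Claude models Claude Code needs. The ARN patterns below are placeholder assumptions, not copied from a real deployment.

```python
import json

# Illustrative least-privilege policy: allow invoking only the primary and
# fast models. The wildcard ARN patterns are placeholders -- tighten them to
# your account's actual model/inference-profile ARNs.
bedrock_invoke_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": [
                "arn:aws:bedrock:*::foundation-model/anthropic.claude-sonnet-4-5*",
                "arn:aws:bedrock:*::foundation-model/anthropic.claude-haiku-4-5*",
            ],
        }
    ],
}

print(json.dumps(bedrock_invoke_policy, indent=2))
```

Attach a policy like this to the role or group your developers assume, and CloudTrail will then show exactly who invoked which model.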
### AWS Trainium Infrastructure Optimization
Claude models on AWS Bedrock run on AWS Trainium (Amazon’s custom ML accelerator chips), not Nvidia GPUs. This architectural choice delivers cost and efficiency advantages:
- Lower inference costs: Trainium is purpose-built for transformer workloads and more cost-efficient than GPU-based alternatives
- Better economics at scale: For 500-developer deployments with high token throughput, Trainium’s cost structure compounds savings
- Tight AWS integration: No cross-provider latency or egress costs
This is a genuine AWS-specific advantage. Organizations comparing cloud providers should factor this into total cost of ownership calculations. See Anthropic’s announcement on Amazon Trainium for technical details.
### Pricing Model
- On-demand: Pay-per-token, no upfront commitment
- Provisioned throughput: Guaranteed capacity – at 500 developers, you’ll likely need this for at least Sonnet
- Prompt caching: Reduces cost and latency for repeated context patterns
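A back-of-envelope sizing helper makes the on-demand vs. provisioned decision concrete. The per-million-token prices and usage figures below are placeholders for illustration, not actual Bedrock pricing.

```python
def monthly_token_cost(
    developers: int,
    input_tokens_per_dev_day: int,
    output_tokens_per_dev_day: int,
    price_in_per_mtok: float,   # USD per million input tokens (placeholder)
    price_out_per_mtok: float,  # USD per million output tokens (placeholder)
    workdays: int = 21,
) -> float:
    """Rough monthly on-demand cost in USD, before prompt-caching discounts."""
    tokens_in = developers * input_tokens_per_dev_day * workdays
    tokens_out = developers * output_tokens_per_dev_day * workdays
    return (
        tokens_in * price_in_per_mtok + tokens_out * price_out_per_mtok
    ) / 1_000_000

# 500 developers at placeholder prices of $3 in / $15 out per million tokens:
cost = monthly_token_cost(500, 2_000_000, 200_000, 3.0, 15.0)
print(f"${cost:,.0f}/month")  # → $94,500/month
```

Run this with your own measured token volumes; if the on-demand figure exceeds the provisioned-throughput commitment for the same capacity, provisioned wins.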
## How Claude Code Talks to Bedrock
Normally, Claude Code calls api.anthropic.com directly. Setting CLAUDE_CODE_USE_BEDROCK=1 switches it to the AWS SDK’s Bedrock runtime client. The Claude models are the same – same Sonnet, same Opus, same capabilities – but the request path changes:
Without Bedrock:

```
developer laptop → internet → api.anthropic.com → Claude model
```

With Bedrock:

```
developer laptop → corporate network → Bedrock endpoint → Claude model (in AWS)
```

With Bedrock + PrivateLink:

```
developer laptop → corporate network → VPC endpoint (PrivateLink) → Bedrock → Claude
```

Nothing touches the public internet at any point.

### Dual-Model Usage
Claude Code uses two models simultaneously:
- Primary model (Sonnet or Opus): Heavy reasoning, code generation, analysis
- Fast model (Haiku): Lightweight tasks – summarization, classification, quick checks
Both must be enabled in Bedrock’s model access settings.
## Known Gotchas

### Inference Profiles Required
Bedrock requires inference profiles (cross-region model identifiers) rather than bare model IDs for on-demand throughput:
```
# Won't work:
anthropic.claude-sonnet-4-5

# Correct:
us.anthropic.claude-sonnet-4-5-20250929-v1:0
```

This trips up almost everyone on initial setup.
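A quick sanity check in a bootstrap script can catch the bare-model-ID mistake before the first failed request. The region-group prefixes (`us`, `eu`, `apac`) are an assumption about the profile naming scheme; extend the set if your profiles use others.

```python
import re

# Heuristic: cross-region inference profiles are prefixed with a region
# group ("us.", "eu.", "apac.") and end in a version suffix like "-v1:0".
PROFILE_RE = re.compile(r"^(us|eu|apac)\.[\w.-]+-v\d+:\d+$")

def is_inference_profile(model_id: str) -> bool:
    """True if model_id looks like an inference profile, not a bare model ID."""
    return bool(PROFILE_RE.match(model_id))

is_inference_profile("anthropic.claude-sonnet-4-5")                   # → False
is_inference_profile("us.anthropic.claude-sonnet-4-5-20250929-v1:0")  # → True
```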
### Model Availability Lag
Bedrock sometimes lags Anthropic’s direct API in model availability by days or weeks when new versions release.
### Haiku Version Pinning
For Bedrock users, Claude Code will not automatically upgrade from Haiku 3.5 to Haiku 4.5. Set ANTHROPIC_DEFAULT_HAIKU_MODEL explicitly:

```
export ANTHROPIC_DEFAULT_HAIKU_MODEL='us.anthropic.claude-haiku-4-5-20251001-v1:0'
```

### Prompt Caching Differences
Prompt caching behavior differs between direct API and Bedrock. Test caching behavior during Cohort 1 and adjust configuration accordingly.
## Provider Comparison
Bedrock is the recommended path for AWS-native organizations, but Claude Code also supports two other cloud providers that offer the same “no code leaves your network” guarantee. Each has a dedicated guide in this section:
- Google Vertex AI Fundamentals – VPC Service Controls, Private Google Access, Workload Identity, Cloud Interconnect
- Azure Foundry Fundamentals – Private Endpoints, VNet integration, managed identities, ExpressRoute
### Provider Selection
| Factor | Bedrock | Vertex AI | Azure Foundry |
|---|---|---|---|
| Set via | CLAUDE_CODE_USE_BEDROCK=1 | CLAUDE_CODE_USE_VERTEX=1 | CLAUDE_CODE_USE_FOUNDRY=1 |
| Auth skip (for gateway) | CLAUDE_CODE_SKIP_BEDROCK_AUTH | CLAUDE_CODE_SKIP_VERTEX_AUTH | CLAUDE_CODE_SKIP_FOUNDRY_AUTH |
| Network isolation | VPC PrivateLink | VPC Service Controls | Private Endpoints |
| Best for | AWS-native orgs | GCP-native orgs | Azure-native orgs |
The platform engineering layer (Phase 1) and rollout strategy (Phase 2) are provider-agnostic – only Phase 0 infrastructure changes if you use a different provider.