Amazon Bedrock charges users based on model usage, data processing, and optional features. You pay primarily for the number of tokens (text units) processed by generative AI models, with costs varying by model type and AWS region. For example, using Anthropic’s Claude 2 model in the US East region might cost $0.008 per 1,000 input tokens and $0.024 per 1,000 output tokens, while Amazon Titan Lite could cost $0.0003 per 1,000 input tokens. API requests to Bedrock are included in token costs, so there’s no separate per-call fee. Additional charges may apply for data storage in connected AWS services like S3 or for transferring data outside AWS regions.
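To make the per-token arithmetic concrete, here is a minimal sketch of how a single request might be priced. It assumes the illustrative rates quoted above rather than current published prices, and the model keys (`claude-2`, `titan-lite`) and function names are hypothetical labels for this example, not Bedrock model IDs or API calls.

```python
# Illustrative on-demand rates in USD per 1,000 tokens, copied from the
# example figures in the text. Real prices vary by model and region.
RATES_PER_1K = {
    "claude-2": {"input": 0.008, "output": 0.024},
    "titan-lite": {"input": 0.0003},  # the text gives no output rate
}


def token_cost(model: str, direction: str, tokens: int) -> float:
    """Return the USD cost of `tokens` tokens in one direction."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    rate = RATES_PER_1K[model][direction]
    return tokens / 1000 * rate


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request: input cost plus output cost."""
    return (token_cost(model, "input", input_tokens)
            + token_cost(model, "output", output_tokens))


if __name__ == "__main__":
    # A hypothetical request: 2,000-token prompt, 500-token reply.
    cost = request_cost("claude-2", input_tokens=2000, output_tokens=500)
    print(f"Claude 2 request: ${cost:.4f}")
```

Because there is no separate per-call fee, the request cost in this sketch is just the sum of the two token charges.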
Costs depend on three main factors: model selection, task type, and data volume. Larger models like Claude 2 handle complex tasks but cost more per token, while smaller models like Titan Lite offer budget-friendly options for simpler tasks. Tasks that produce long outputs, such as long-form text generation, cost more than short responses because output tokens are billed at a higher rate than input tokens. Data volume also drives pricing: at the example rates above, sending a 10,000-token document to Claude 2 would cost $0.08 in input tokens, and generating 10,000 tokens of output would cost $0.24. If your application handles 1 billion tokens monthly, costs could range from about $300 (Titan Lite, counting input tokens only) to $8,000 or more (Claude 2), depending on the input/output split.
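The effect of the input/output split on a monthly bill can be sketched as follows. This is a rough estimate under the illustrative per-1,000-token rates quoted earlier, not a pricing tool; the function name and the one-billion-token volume are assumptions chosen for the example. At those rates, a figure of about $300 on Titan Lite corresponds to roughly one billion input tokens.

```python
# Illustrative rates in USD per 1,000 tokens, taken from the text.
CLAUDE2_INPUT_PER_1K = 0.008
CLAUDE2_OUTPUT_PER_1K = 0.024
TITAN_LITE_INPUT_PER_1K = 0.0003


def blended_cost(total_tokens: float, output_share: float,
                 input_rate: float, output_rate: float) -> float:
    """Cost of `total_tokens` when `output_share` of them are output tokens."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    return (input_tokens / 1000 * input_rate
            + output_tokens / 1000 * output_rate)


if __name__ == "__main__":
    monthly_tokens = 1_000_000_000  # assumed monthly volume

    # Claude 2 cost rises as more of the volume is billed as output.
    for share in (0.0, 0.25, 0.5, 1.0):
        cost = blended_cost(monthly_tokens, share,
                            CLAUDE2_INPUT_PER_1K, CLAUDE2_OUTPUT_PER_1K)
        print(f"Claude 2, {share:.0%} output: ${cost:,.2f}")

    # Titan Lite: only the input rate is given in the text.
    titan = monthly_tokens / 1000 * TITAN_LITE_INPUT_PER_1K
    print(f"Titan Lite, input only: ${titan:,.2f}")
```

The total scales linearly with volume, so the same function can be called with a smaller token count to estimate a single document or a single day of traffic.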
AWS provides tools to manage and predict costs. The Bedrock console displays real-time token usage, and developers can set budget alerts via AWS Budgets. For consistent workloads, provisioned throughput offers discounted rates in exchange for committing to reserved capacity (as an illustration, $1.25 per hour for 100 tokens/second). Data transfer within the same AWS region is typically free, but transferring 1 TB of data out to the internet each month could add $90-$120. Using Cost Explorer, teams can analyze spending patterns, such as spotting a 30% cost spike after switching models, and adjust their architecture accordingly. This granular control helps balance performance needs with budget constraints.
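Whether a capacity commitment pays off depends on how busy the workload keeps it. The sketch below compares the illustrative provisioned figure from the text ($1.25 per hour for 100 tokens/second) with on-demand billing at the example Claude 2 input rate of $0.008 per 1,000 tokens. Both numbers are the article's examples, not published prices, and the function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600


def on_demand_hourly_cost(tokens_per_second: float, rate_per_1k: float) -> float:
    """USD per hour when a sustained token rate is billed on demand."""
    return tokens_per_second * SECONDS_PER_HOUR / 1000 * rate_per_1k


def break_even_utilization(provisioned_hourly_price: float,
                           capacity_tokens_per_second: float,
                           rate_per_1k: float) -> float:
    """Fraction of reserved capacity that must be used before the
    hourly commitment becomes cheaper than on-demand billing."""
    full_load = on_demand_hourly_cost(capacity_tokens_per_second, rate_per_1k)
    if full_load <= 0:
        raise ValueError("capacity and rate must be positive")
    return provisioned_hourly_price / full_load


# Illustrative figures from the text (assumptions, not real prices).
PROVISIONED_HOURLY_USD = 1.25
CAPACITY_TOKENS_PER_SECOND = 100
ON_DEMAND_RATE_PER_1K = 0.008

full_load_cost = on_demand_hourly_cost(CAPACITY_TOKENS_PER_SECOND,
                                       ON_DEMAND_RATE_PER_1K)
threshold = break_even_utilization(PROVISIONED_HOURLY_USD,
                                   CAPACITY_TOKENS_PER_SECOND,
                                   ON_DEMAND_RATE_PER_1K)

if __name__ == "__main__":
    print(f"On-demand at full load: ${full_load_cost:.2f}/hour")
    print(f"Break-even utilization: {threshold:.1%}")
```

Under these assumed numbers, the commitment is cheaper only when the workload sustains more than about 43% of the reserved throughput, which is why it suits steady traffic rather than bursty traffic.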