🚀 Try Zilliz Cloud, the fully managed Milvus, for free—experience 10x faster performance! Try Now>>

Milvus
Zilliz
  • Home
  • AI Reference
  • Is it possible to get token usage metrics or other usage details from Amazon Bedrock after making a request (to track costs or performance)?

Is it possible to get token usage metrics or other usage details from Amazon Bedrock after making a request (to track costs or performance)?

Yes, Amazon Bedrock provides mechanisms to track token usage and other metrics, which helps developers monitor costs and performance. When you make a request to Bedrock, the service does not return token counts directly in the API response. However, it integrates with AWS CloudWatch, where usage metrics like InvocationCount (number of API calls) and TokenCount (total tokens processed) are logged. These metrics are available at both the account and model-ID levels, allowing you to filter data by specific regions, models (e.g., anthropic.claude-v2), or use cases. For example, if you run 100 inference calls with Claude, CloudWatch would aggregate token consumption, enabling you to estimate costs based on Bedrock’s per-token pricing model.

To track costs, you can use AWS Cost Explorer alongside CloudWatch. Bedrock’s pricing is based on tokens processed (input and output), and Cost Explorer provides itemized billing data. By filtering costs by the “Bedrock” service tag, you can view expenses per model, region, or usage type. For instance, if your application uses both Claude and Jurassic models, Cost Explorer breaks down costs for each, helping identify high-expense areas. Additionally, AWS Budgets can alert you when spending approaches predefined thresholds. While Bedrock doesn’t expose per-request token data in real time, these tools offer aggregated insights for cost management.

For performance monitoring, CloudWatch provides metrics like ModelLatency (time taken per inference) and Errors (failed requests). Developers can create dashboards to track latency trends or set alarms for sudden spikes. For detailed request-level logging, enable AWS CloudTrail to capture Bedrock API activity. Though CloudTrail doesn’t log tokens, it records metadata like timestamps, model IDs, and regions, which can correlate with CloudWatch metrics for troubleshooting. For example, if a Claude model’s latency increases, you could cross-reference CloudTrail logs to check for regional issues or throttling. While token-level tracking requires combining multiple tools, Bedrock’s native AWS integrations provide a robust framework for cost and performance analysis.

Like the article? Spread the word