Amazon Bedrock is a fully managed service on AWS that provides access to foundation models from various AI companies through a single API.
We recommend configuring Claude 3.7 Sonnet as your chat model.
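For example, you can add an entry like the following to your config file. Treat it as a minimal sketch: the region, profile name, and cross-region inference profile ID are placeholders to adjust for your account.

```yaml
models:
  - name: Claude 3.7 Sonnet
    provider: bedrock
    # Cross-region inference profile ID; adjust the "us." prefix for your region group
    model: us.anthropic.claude-3-7-sonnet-20250219-v1:0
    env:
      region: us-east-1
      profile: bedrock
    roles:
      - chat
```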
If you run into the following error when connecting to the new Claude 3.5 Sonnet v2 models from AWS:
400 Invocation of model ID anthropic.claude-3-5-sonnet-20241022-v2:0 with on-demand throughput isn't supported. Retry your request with the ID or ARN of an inference profile that contains this model.
You can fix this using the following config:
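The config below references the cross-region inference profile ID (the model ID prefixed with a region group such as `us.`) instead of the bare model ID, which is what the error asks for. Treat it as a sketch: the `us.` prefix, region, and profile name assume a US-region deployment and a `bedrock` credentials profile.

```yaml
models:
  - name: Claude 3.5 Sonnet v2
    provider: bedrock
    # Inference profile ID: note the "us." prefix in front of the model ID
    model: us.anthropic.claude-3-5-sonnet-20241022-v2:0
    env:
      region: us-east-1
      profile: bedrock
    roles:
      - chat
```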
Bedrock currently does not offer any autocomplete models. However, Codestral from Mistral and Point from Poolside will be supported in the near future.
In the meantime, you can view a list of autocomplete model providers here.
We recommend configuring `amazon.titan-embed-text-v2:0` as your embeddings model.
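For example, a config entry along these lines (a minimal sketch; region and profile are placeholders):

```yaml
models:
  - name: Titan Embeddings v2
    provider: bedrock
    model: amazon.titan-embed-text-v2:0
    env:
      region: us-east-1
      profile: bedrock
    roles:
      - embed
```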
We recommend configuring `cohere.rerank-v3-5:0` as your reranking model; `amazon.rerank-v1:0` is also supported.
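For example (a minimal sketch; region and profile are placeholders to adjust for your account):

```yaml
models:
  - name: Cohere Rerank v3.5
    provider: bedrock
    model: cohere.rerank-v3-5:0
    env:
      region: us-east-1
      profile: bedrock
    roles:
      - rerank
```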
Bedrock allows Claude models to cache tool payloads, system messages, and chat messages between requests. Enable this behavior by adding `promptCaching: true` under `defaultCompletionOptions` in your model configuration.
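For example, a sketch that reuses the chat model entry from above; only the `defaultCompletionOptions` block is new:

```yaml
models:
  - name: Claude 3.7 Sonnet
    provider: bedrock
    model: us.anthropic.claude-3-7-sonnet-20250219-v1:0
    defaultCompletionOptions:
      # Enables Bedrock prompt caching for this model
      promptCaching: true
    env:
      region: us-east-1
      profile: bedrock
    roles:
      - chat
```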
Prompt caching is generally available for Claude 3.7 Sonnet, Claude 3.5 Haiku, and the Amazon Nova models. Customers who were granted access to Claude 3.5 Sonnet v2 during the prompt caching preview will retain that access, but it cannot be enabled for new users on that model.
Prompt caching is not supported in JSON configuration files, so use the YAML syntax above to enable it.
Authentication will be through temporary or long-term credentials in `~/.aws/credentials` under a configured profile (e.g. “bedrock”).
To set up Bedrock using custom imported models, add the following to your config file:
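One possible shape for such an entry is sketched below. The `bedrockimport` provider name, the `modelArn` field, and its placement under `env` are assumptions to verify against your setup; the ARN and model name are hypothetical placeholders.

```yaml
models:
  - name: My Imported Model
    # Assumption: custom imported models use a dedicated "bedrockimport" provider
    provider: bedrockimport
    model: my-imported-model
    env:
      region: us-west-2
      profile: bedrock
      # ARN of the model you imported into Bedrock (hypothetical placeholder)
      modelArn: arn:aws:bedrock:us-west-2:123456789012:imported-model/abc123
```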
Authentication will be through temporary or long-term credentials in ~/.aws/credentials under a configured profile (e.g. “bedrock”).