Amazon Web Services (AWS) has been riding the generative AI hype train hard lately. Roughly one third of the sessions at the annual AWS re:Invent conference are currently tagged as AI/ML. Surely, then, AWS must provide some great AI offerings in its massive catalog of services, right?
AWS offers a service called Amazon Bedrock:
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
That's a lot of buzzwords. In practice, the core feature of Bedrock is that it lets developers access models from many different providers through a common API. Here's a simple Python example from the Bedrock docs:
import boto3
import json

# Create a Bedrock Runtime client (assumes AWS credentials and region are configured)
brt = boto3.client(service_name='bedrock-runtime')

body = json.dumps({
    "prompt": "\n\nHuman: explain black holes to 8th graders\n\nAssistant:",
    "max_tokens_to_sample": 300,
    "temperature": 0.1,
    "top_p": 0.9,
})

modelId = 'anthropic.claude-v2'
accept = 'application/json'
contentType = 'application/json'

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
# The response body is a stream; read and decode the JSON payload
completion = json.loads(response['body'].read())['completion']
Switching to a different model is as simple as changing the modelId. You do need to request access to each model in the AWS console, but that is a pretty simple and quick process. This all sounds great so far, right? It's all fun and games until you start building something and it suddenly stops working.
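For instance, staying within the same provider's family, swapping in a different Claude model is a one-line change. Here's a minimal sketch reusing the brt client and body from above (it assumes you've also been granted access to Claude Instant):

# Same request body, different model: just point modelId at another granted model
modelId = 'anthropic.claude-instant-v1'
response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)

One caveat: each provider defines its own request body schema, so moving between providers usually means adjusting the body too, not just the modelId.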
Disclaimer before the complaining starts: I usually enjoy building software hosted on AWS quite a bit. Any negativity expressed here is coming from a place of me wanting them to do better.
I've been seeing daily posts across both Twitter/X and the AWS subreddit from users whose Bedrock service suddenly stopped working without warning.
AWS manages load on its services by putting quotas on almost everything, which is a completely sensible way to build an outrageously large distributed system. Sometimes these quotas are adjustable (upon request) and sometimes they aren't.
When you first request access to a model in Bedrock, your account is assigned a default quota for that model, measured in both requests per minute and tokens per minute. AWS has been resetting customers' Bedrock quotas to 0. This is happening despite AWS initially granting access to models on Bedrock and assigning a working (non-zero) quota. This causes live customer applications built on top of Bedrock to stop working without any notification whatsoever. For a company that prides itself on "five nines" uptime, it is pretty unusual to see something like this. It definitely does not instill a lot of confidence in Bedrock as a service.
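If you want to see what your account's quotas actually are right now, the Service Quotas API exposes them. Here's a minimal sketch (assuming boto3 with configured credentials, and that Bedrock's quotas live under the 'bedrock' service code, which is how they appear in my account):

import boto3

# Service Quotas client (assumes AWS credentials and region are configured)
sq = boto3.client('service-quotas')

# Quotas are paginated, so walk every page for the Bedrock service
paginator = sq.get_paginator('list_service_quotas')
for page in paginator.paginate(ServiceCode='bedrock'):
    for quota in page['Quotas']:
        # A Value of 0.0 here is exactly the silent reset described above
        print(f"{quota['QuotaName']}: {quota['Value']}")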
This happened to me both in my personal AWS account, where I was testing out Bedrock, and in several AWS accounts belonging to the startup where I serve as CTO and where I was actively building an AI application on top of Bedrock. I've seen varying reports on social media about this getting resolved in a satisfying manner by AWS support. For me, I ended up just switching over to using a provider's API directly rather than waiting several days to hear back from AWS support. In my case this was Anthropic, and I haven't had any remotely similar problems with them.
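If you're in the same spot, the failover I ended up with looks roughly like this. This is a minimal sketch, assuming the anthropic Python SDK with an ANTHROPIC_API_KEY in the environment; complete is a hypothetical helper name, and the fallback model is my choice, not anything prescribed:

import json

import anthropic
import boto3
from botocore.exceptions import ClientError

brt = boto3.client(service_name='bedrock-runtime')
direct = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def complete(prompt: str) -> str:
    """Try Bedrock first; fall back to Anthropic's own API on throttling."""
    try:
        response = brt.invoke_model(
            body=json.dumps({
                "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
                "max_tokens_to_sample": 300,
            }),
            modelId='anthropic.claude-v2',
            accept='application/json',
            contentType='application/json',
        )
        return json.loads(response['body'].read())['completion']
    except ClientError as err:
        # A zeroed quota surfaces as a ThrottlingException on every call
        if err.response['Error']['Code'] != 'ThrottlingException':
            raise
        message = direct.messages.create(
            model='claude-3-haiku-20240307',
            max_tokens=300,
            messages=[{"role": "user", "content": prompt}],
        )
        return message.content[0].text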
It's not clear whether this is all happening because of a bug or because AWS is struggling with capacity to serve Bedrock. Either way, it is unacceptable to kill access without warning. At least for now, you might want to think twice about building on top of Bedrock.