12 Aug, 2024

Anthropic's 12-hour outage

Last Thursday, the Anthropic API was unstable or down for almost 12 hours.

Orgs that had set up fallbacks for their Anthropic requests saw almost no failures: 99.86% of their requests succeeded, and only 0.14% failed.

These users had set up fallbacks to route their requests to (1) OpenAI, (2) Azure, (3) Gemini, and (4) a bunch of hosted Llama models.

There are some key learnings from the 0.14% of requests that failed, though:

Some users hadn't configured fallbacks for the 529 status code (which had spiked the most that day)
A few had improperly set up fallback targets (expired keys, non-existent targets)
In rare cases, even the fallback target failed (pro tip: always have multiple options!)
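
To make these learnings concrete, here is a minimal, provider-agnostic sketch of a fallback chain in plain Python. It is not Portkey's implementation or config format: `ProviderError`, `Target`, `call_with_fallbacks`, and the exact set of status codes are hypothetical names and choices made for illustration. It shows 529 listed explicitly as a fallback trigger and more than one backup target in the chain.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Status codes that should trigger a fallback (an assumed set for illustration).
# 529 is listed explicitly, since it was the code that spiked most in the outage.
FALLBACK_STATUS_CODES = frozenset({429, 500, 502, 503, 504, 529})


class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status a provider returned."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(f"{status_code} {message}".strip())
        self.status_code = status_code


@dataclass
class Target:
    name: str
    call: Callable[[str], str]  # hypothetical: takes a prompt, returns text


def call_with_fallbacks(prompt: str, targets: Sequence[Target]) -> tuple[str, str]:
    """Try each target in order and return (target_name, response)."""
    if not targets:
        raise ValueError("at least one target is required")
    last_error = None
    for target in targets:
        try:
            return target.name, target.call(prompt)
        except ProviderError as err:
            if err.status_code not in FALLBACK_STATUS_CODES:
                raise  # e.g. 400 or 401: surface it instead of masking it
            last_error = err  # remember the failure and move to the next target
    raise RuntimeError("all fallback targets failed") from last_error


# Stand-in providers for the example; real ones would make API calls.
def overloaded(prompt: str) -> str:
    raise ProviderError(529, "overloaded")


def healthy(prompt: str) -> str:
    return f"echo: {prompt}"


chain = [
    Target("primary", overloaded),
    Target("backup-1", overloaded),  # even a fallback target can fail
    Target("backup-2", healthy),     # so keep more than one option
]
print(call_with_fallbacks("hi", chain))  # ('backup-2', 'echo: hi')
```

In this sketch, errors outside the fallback set (such as a 401 from an expired key) are re-raised rather than skipped, so a misconfigured target shows up the first time it is exercised. Sending a test request through each fallback target before relying on it is one way to catch expired keys and non-existent targets ahead of an outage.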

Check out our fallback documentation to protect your app from going down with an LLM API failure again → Portkey Fallback Docs