Announcements

Stay informed about Portkey's newest releases, features, and improvements

21 Oct, 2025
Rate limit support for both tokens and requests
You can now set rate limits for both tokens and requests under an API key.
- Request-based limits: Control the number of requests per minute, hour, or day.
- Token-based limits: Control the number of tokens consumed per minute, hour, or day.
This gives you more precise control over usage, budgets, and traffic patterns across teams and applications.
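To make the difference between the two limit types concrete, here is a minimal sketch of a fixed-window limiter that tracks requests and tokens side by side. It only illustrates the idea and is not how Portkey implements rate limits; the class name `WindowLimiter` and its parameters are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WindowLimiter:
    """Hypothetical fixed-window limiter with separate request and token caps."""

    max_requests: int
    max_tokens: int
    window_seconds: float = 60.0  # 60 = per minute, 3600 = per hour, 86400 = per day
    _window_start: Optional[float] = field(default=None, init=False)
    _requests: int = field(default=0, init=False)
    _tokens: int = field(default=0, init=False)

    def allow(self, tokens: int, now: Optional[float] = None) -> bool:
        """Return True and record usage if the call fits both limits."""
        now = time.monotonic() if now is None else now
        # Start a fresh window on first use or once the current one has elapsed.
        if self._window_start is None or now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._requests = 0
            self._tokens = 0
        if self._requests + 1 > self.max_requests:
            return False  # request-based limit hit
        if self._tokens + tokens > self.max_tokens:
            return False  # token-based limit hit
        self._requests += 1
        self._tokens += tokens
        return True
```

A key with a generous request limit can still be throttled by a few very large prompts, which is why tracking both counters is useful.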
16 Oct, 2025
Day 0 support for Claude Haiku 4.5!
Claude Haiku 4.5 is Anthropic’s fastest, most efficient model yet, delivering near-frontier performance at a fraction of the cost. It’s ideal for real-time agents, chatbots, and reasoning-heavy workloads where speed matters.
With Portkey, you can bring Haiku 4.5 into production with:
✅ Unified access via Bedrock, Vertex AI, and Anthropic providers
✅ Guardrails for safe, compliant usage
✅ Full observability across logs, latency, and spend
✅ Budget and rate-limit controls for teams and apps
7 Oct, 2025
Use OpenAI’s AgentKit with any provider 🚀
You can now extend OpenAI’s AgentKit to work across different LLM providers, all through Portkey, with:
- End-to-end observability
- Guardrails for every interaction
- Cost tracking across executions
- Reliability features like fallbacks and retries
Bring production-grade visibility and control to your agents.
Read more about this here -> https://portkey.ai/docs/integrations/libraries/openai-agent-builder
1 Oct, 2025
🚨 Model Deprecation Alert: Llama 3.1 & 3.2 on Vertex AI
Llama 3.1 and 3.2 models on the Vertex AI Managed API will be deactivated on January 15, 2026.
What to do:
- Migrate to Llama 3.3 or Llama 4 (recommended)
- Update model names in your integration
- Test before the cutoff date
30 Sep, 2025
⚡ Claude Sonnet 4.5 is live on Portkey!
Anthropic’s latest Claude Sonnet 4.5 is now available through the Portkey AI Gateway.
With Portkey, you can bring Claude Sonnet 4.5 into production with:
✅ Unified access via the AI Gateway
✅ Full observability: logs, latency, and spend
✅ Budget and rate-limit controls across teams and apps
29 Sep, 2025
Regex replace guardrail ✏️
Your data policies aren’t always one-size-fits-all.
Alongside our PII guardrail, we’ve introduced a Regex Replace Guardrail for specific redactions.
Define custom regex patterns that automatically rewrite matching text before it reaches the model, so that you can:
- Mask sensitive data (e.g., emails, phone numbers)
- Normalize inputs before they reach the model
- Enforce consistent formatting across requests
Read more here -> https://lnkd.in/dgkPGrQF
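As a rough illustration of what a regex-replace step does, the sketch below masks emails and phone numbers with Python's standard `re` module. The patterns, the placeholder strings, and the `redact` helper are assumptions made for this example; they are not Portkey's guardrail configuration format.

```python
import re

# Hypothetical (pattern, replacement) rules, applied in order.
RULES = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "[PHONE]"),
]


def redact(text: str) -> str:
    """Apply every rule to the text and return the masked result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(redact("Mail jane.doe@example.com or call +1 415-555-0100."))
# prints "Mail [EMAIL] or call [PHONE]."
```

The same loop covers normalization and formatting rules too: only the patterns and replacement strings change.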
29 Sep, 2025
Metadata-based model access guardrail 🛡️
We’ve introduced a new guardrail that lets you restrict model access based on metadata key–value pairs at runtime.
This means you can:
- Enforce tenant, region, or environment-based restrictions dynamically
- Ensure compliance policies are checked per request
- Control model usage without changing your app logic
Granular governance, enforced in real time.
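The check this guardrail performs can be pictured as a lookup from request metadata to an allowed set of models. The sketch below is an assumption about the general shape of such a rule, not Portkey's implementation; the rule table, the model identifiers, and `is_model_allowed` are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical rules: metadata key -> metadata value -> models allowed for it.
ACCESS_RULES: Dict[str, Dict[str, Set[str]]] = {
    "environment": {
        "production": {"model-large", "model-small"},
        "staging": {"model-small"},
    },
}


def is_model_allowed(model: str, metadata: Dict[str, str],
                     rules: Dict[str, Dict[str, Set[str]]] = ACCESS_RULES) -> bool:
    """Allow the request only if every rule key matches and permits the model."""
    for key, allowed_by_value in rules.items():
        value = metadata.get(key)
        if value not in allowed_by_value:
            return False  # missing or unknown metadata value: deny by default
        if model not in allowed_by_value[value]:
            return False
    return True
```

Because the decision depends only on the metadata sent with each request, the application code that picks a model does not need to change when the policy does.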
25 Sep, 2025
Happening today: LLMs in Prod, San Francisco!
We’re teaming up with Exa to host the next edition of LLMs in Prod. Expect talks, panels, and conversations with PG&E, Postman, Palo Alto Networks, LinkedIn, Cerebras, and more, focused on running LLMs in production at scale.
📍 Exa HQ, San Francisco
📅 Today, 5–8 PM
Will we see you there today? (Only a few spots left!)
19 Sep, 2025
Unified Count Tokens Endpoint!
We’ve introduced a single endpoint for counting tokens across AWS Bedrock, Vertex AI, and Anthropic.
This makes it easier to:
- Get an estimate of tokens before sending the request
- Manage rate limits and costs
- Enforce routing and quota rules inside your app
- Optimize prompts to be a specific length
Read more here ->
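One way a token estimate can be used inside an app is to trim a conversation until it fits a budget before sending it. The sketch below assumes a `count_tokens` callable that returns an estimate for a piece of text; in practice that would wrap a call to the count-tokens endpoint, whose request shape is not shown here. The helper name `trim_to_budget` and the whitespace-based stand-in counter are hypothetical.

```python
from typing import Callable, List


def trim_to_budget(messages: List[str], budget: int,
                   count_tokens: Callable[[str], int]) -> List[str]:
    """Drop the oldest messages until the estimated total fits the budget."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > budget:
        kept.pop(0)  # discard the oldest message first
    return kept


def rough_count(text: str) -> int:
    """Stand-in counter for this example only: one token per whitespace-separated word."""
    return len(text.split())
```

The same estimate can feed routing decisions, for example sending requests above a threshold to a model with a larger context window.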
18 Sep, 2025
Unified finish_reason parameter across providers ✅
By default, the finish_reason parameter is now mapped to an OpenAI-compatible value for consistency across providers. To keep the original provider-returned value, set the x-portkey-strict-openai-compliance header to false.
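To show what such a mapping looks like in principle, the sketch below translates Anthropic-style stop reasons into OpenAI-style finish_reason values. The mapping table, the fallback to "stop" for unknown values, and the function name are assumptions for illustration, not Portkey's actual mapping.

```python
# Assumed mapping for illustration only; not Portkey's actual table.
ANTHROPIC_TO_OPENAI = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}


def normalize_finish_reason(provider_value: str,
                            strict_openai_compliance: bool = True) -> str:
    """Map a provider stop reason to an OpenAI-style value unless strict mode is off."""
    if not strict_openai_compliance:
        return provider_value  # keep the original provider-returned value
    # Unknown values fall back to "stop" (an assumption made for this sketch).
    return ANTHROPIC_TO_OPENAI.get(provider_value, "stop")
```

With a unified value, client code can branch on a single set of finish reasons regardless of which provider served the request.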