Announcements

Stay informed about Portkey's newest releases, features, and improvements

  1. Rate limit support for both tokens and requests

    You can now set rate limits for both tokens and requests under an API key.

    Request-based limits: Control the number of requests per minute, hour, or day.

    Token-based limits: Control the number of tokens consumed per minute, hour, or day.

    This gives you more precise control over usage, budgets, and traffic patterns across teams and applications.
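    To make the dual-limit semantics concrete, here is a minimal, illustrative sliding-window limiter that enforces both a request count and a token budget per window. This is a sketch of the behavior described above, not Portkey's implementation; limits are configured on the API key in the dashboard, not in application code.

    ```python
    from collections import deque

    class WindowLimiter:
        """Illustrative sliding-window limiter enforcing both a request
        count and a token budget per window (e.g. per minute)."""

        def __init__(self, max_requests, max_tokens, window_seconds=60):
            self.max_requests = max_requests
            self.max_tokens = max_tokens
            self.window = window_seconds
            self.events = deque()  # (timestamp, tokens) per admitted request

        def allow(self, tokens, now):
            # Drop events that have fallen out of the window.
            while self.events and now - self.events[0][0] >= self.window:
                self.events.popleft()
            used_tokens = sum(t for _, t in self.events)
            if len(self.events) >= self.max_requests:
                return False  # request-based limit hit
            if used_tokens + tokens > self.max_tokens:
                return False  # token-based limit hit
            self.events.append((now, tokens))
            return True

    limiter = WindowLimiter(max_requests=3, max_tokens=1000, window_seconds=60)
    results = [limiter.allow(400, now=0), limiter.allow(400, now=1),
               limiter.allow(400, now=2), limiter.allow(10, now=3)]
    # The third call exceeds the token budget (800 + 400 > 1000) even though
    # the request limit still has headroom; the small fourth call fits.
    ```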

  2. Day 0 support for Claude Haiku 4.5!

    Claude Haiku 4.5 is Anthropic’s fastest, most efficient model yet, delivering near-frontier performance at a fraction of the cost. It’s ideal for real-time agents, chatbots, and reasoning-heavy workloads where speed matters.

    With Portkey, you can bring Haiku 4.5 into production with:

    ✅ Unified access via Bedrock, Vertex AI, and Anthropic providers

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Budget and rate-limit controls for teams and apps

  3. ⚡ Claude Sonnet 4.5 is live on Portkey!

    Anthropic’s latest Claude Sonnet 4.5 is now available through the Portkey AI Gateway.

    With Portkey, you can bring Claude Sonnet 4.5 into production with:

    ✅ Unified access via the AI Gateway

    ✅ Full observability: logs, latency, and spend

    ✅ Budget and rate-limit controls across teams and apps

  4. Regex replace guardrail ✏️

    Your data policies aren’t always one-size-fits-all.

    Alongside our PII guardrail, we’ve introduced a Regex Replace Guardrail for specific redactions.

    Define custom regex patterns that are applied automatically before the request reaches the model, so you can:

    • Mask sensitive data (e.g., emails, phone numbers)
    • Normalize inputs before they reach the model
    • Enforce consistent formatting across requests
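    As a sketch of what a regex replacement does to a prompt, here is plain Python `re` doing the same kind of masking. The patterns below are illustrative only; with Portkey you register patterns in the guardrail configuration rather than in application code.

    ```python
    import re

    # Illustrative patterns -- tune these to your own data policies.
    PATTERNS = [
        (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
        (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "<PHONE>"),
    ]

    def mask(text):
        """Apply each regex replacement before the prompt reaches the model."""
        for pattern, replacement in PATTERNS:
            text = pattern.sub(replacement, text)
        return text

    masked = mask("Contact jane.doe@example.com or +1 (555) 010-9999.")
    # → "Contact <EMAIL> or <PHONE>."
    ```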

    Read more here -> https://lnkd.in/dgkPGrQF

  5. Metadata-based model access guardrail 🛡️

    We’ve introduced a new guardrail that lets you restrict model access based on metadata key–value pairs at runtime.

    This means you can:

    • Enforce tenant, region, or environment-based restrictions dynamically
    • Ensure compliance policies are checked per request
    • Control model usage without changing your app logic
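    The check itself is simple: match the request's metadata key–value pairs against configured rules and allow or deny the model. A minimal sketch of that logic (the rule shape here is hypothetical; in Portkey you configure rules in the guardrail UI and send metadata with the request):

    ```python
    # Hypothetical rule shape for illustration only.
    RULES = [
        {"key": "environment", "value": "production", "allowed_models": {"gpt-4o"}},
        {"key": "environment", "value": "staging", "allowed_models": {"gpt-4o", "gpt-4o-mini"}},
    ]

    def model_allowed(metadata, model):
        """Return True if a rule matching the request metadata permits the model."""
        for rule in RULES:
            if metadata.get(rule["key"]) == rule["value"]:
                return model in rule["allowed_models"]
        return False  # no matching rule: deny by default

    ok = model_allowed({"environment": "staging"}, "gpt-4o-mini")        # allowed
    blocked = model_allowed({"environment": "production"}, "gpt-4o-mini")  # denied
    ```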

    Granular governance, enforced in real time.

  6. Happening today: LLMs in Prod, San Francisco!

    We’re teaming up with Exa to host the next edition of LLMs in Prod. Expect talks, panels, and conversations with PG&E, Postman, Palo Alto Networks, LinkedIn, Cerebras, and more, focused on running LLMs in production at scale.

    📍 Exa HQ, San Francisco

    📅 Today, 5–8 PM

    Will we see you there? (A few spots left!)

  7. Unified Count Tokens Endpoint!

    We’ve introduced a single endpoint for counting tokens across AWS Bedrock, Vertex AI, and Anthropic.

    This makes it easier to:

    • Get an estimate of tokens before sending the request
    • Manage rate limits and costs
    • Enforce routing and quota rules inside your app
    • Optimize prompts to be a specific length
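    A sketch of what a request to the endpoint looks like. The path `/v1/messages/count_tokens`, the model slug, and the header names below follow Portkey's Anthropic-compatible messages API but should be treated as assumptions; provider authentication (e.g. via a Portkey virtual key) is omitted, so check the docs for your setup.

    ```python
    import json

    headers = {
        "x-portkey-api-key": "PORTKEY_API_KEY",
        "x-portkey-provider": "anthropic",
        "content-type": "application/json",
    }
    payload = {
        "model": "claude-sonnet-4-5",
        "messages": [{"role": "user", "content": "How many tokens is this?"}],
    }
    body = json.dumps(payload)
    # import requests
    # resp = requests.post("https://api.portkey.ai/v1/messages/count_tokens",
    #                      headers=headers, data=body)
    # resp.json() contains the estimated input token count
    ```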

  8. Unified finish_reason parameter across providers ✅

    By default, the finish_reason parameter is now mapped to an OpenAI-compatible value for consistency. To keep the original provider-returned value instead, set x-portkey-strict-openai-compliance = false.
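    As an illustration of the kind of normalization this implies, here is a sketch mapping Anthropic's stop_reason values onto OpenAI-style finish_reason values. Portkey's actual table covers more providers and cases; this only shows the shape of the behavior.

    ```python
    # Illustrative mapping from Anthropic stop_reason values to
    # OpenAI-compatible finish_reason values.
    ANTHROPIC_TO_OPENAI = {
        "end_turn": "stop",
        "stop_sequence": "stop",
        "max_tokens": "length",
        "tool_use": "tool_calls",
    }

    def normalize_finish_reason(provider_value, strict_openai_compliance=True):
        """With strict compliance on (the default), return the OpenAI-style
        value; with it off, pass the provider's value through unchanged."""
        if not strict_openai_compliance:
            return provider_value
        return ANTHROPIC_TO_OPENAI.get(provider_value, provider_value)

    mapped = normalize_finish_reason("max_tokens")  # "length"
    raw = normalize_finish_reason("max_tokens", strict_openai_compliance=False)  # "max_tokens"
    ```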