Announcements

Stay informed about Portkey's newest releases, features, and improvements

  1. Anthropic models on Azure are now accessible via Portkey!

    You can now access Anthropic models on Azure via the native /messages endpoint, so you can keep the standard Anthropic request format while getting:

    ✅ Unified access through the Portkey Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls
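
    Because the gateway keeps the native /messages format, a request body is built exactly as it would be for Anthropic directly. The sketch below illustrates this; the gateway base URL, header name, and model slug are assumptions for illustration, not verified values — check the Portkey docs for the real ones.

```python
# Sketch: calling an Azure-hosted Anthropic model through the Portkey
# gateway's native /messages endpoint. The request body is the standard
# Anthropic Messages format, untouched by the gateway.
import json
import urllib.request

PORTKEY_BASE_URL = "https://api.portkey.ai/v1"  # assumed gateway URL


def build_messages_request(model: str, prompt: str) -> dict:
    """Standard Anthropic /messages request body."""
    return {
        "model": model,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }


def send(body: dict, api_key: str) -> bytes:
    """Illustrative POST to the gateway; the header name is an assumption."""
    req = urllib.request.Request(
        f"{PORTKEY_BASE_URL}/messages",
        data=json.dumps(body).encode(),
        headers={
            "content-type": "application/json",
            "x-portkey-api-key": api_key,  # assumed header name
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


# "claude-sonnet-4" is a placeholder model slug, not a confirmed deployment name.
body = build_messages_request("claude-sonnet-4", "Hello from Azure via Portkey!")
```

    Only the URL and auth header change; existing Anthropic client code keeps working as-is.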

  2. Oracle LLMs are now available on Portkey!

    You can now start using LLMs hosted on Oracle Cloud Infrastructure (OCI) in production with:

    ✅ Support for Cohere and Meta Llama models via OCI Generative AI

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

    See how you can add Oracle models to Portkey.


  3. Gemini 3 Flash Preview model support!

    Gemini 3 Flash delivers fast reasoning and multimodal understanding at lower latency and cost compared to larger models, while still handling complex text, image, audio, and video tasks.

    With Portkey, you can bring this model into production with:

    ✅ Unified access via the Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

    Try it today!

  4. Portkey now integrates with Amazon Bedrock AgentCore!

    Because AgentCore supports OpenAI-compatible frameworks, you can integrate Portkey without modifying your agent code while keeping AgentCore’s runtime, gateway, and memory services intact.

    With this setup, you get:

    • A unified gateway for 1600+ models across providers
    • Production telemetry (traces, logs, metrics) for AgentCore invocations
    • Reliability controls such as fallbacks, load balancing, and timeouts
    • Centralized governance over provider access, spend, and policies using Portkey API keys
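
    Since AgentCore speaks to OpenAI-compatible endpoints, pointing an agent at Portkey amounts to swapping the base URL and auth settings on the client. A minimal sketch, assuming Portkey's usual gateway URL and header conventions (the exact names are assumptions, not verified values):

```python
# Sketch: client settings that redirect an OpenAI-compatible agent
# through the Portkey gateway without touching the agent code itself.

def portkey_client_config(portkey_api_key: str, provider_slug: str) -> dict:
    """Keyword arguments for an OpenAI-compatible client (e.g. openai.OpenAI)."""
    return {
        "base_url": "https://api.portkey.ai/v1",  # assumed gateway URL
        "api_key": portkey_api_key,               # Portkey key, not a provider key
        "default_headers": {
            "x-portkey-provider": provider_slug,  # assumed header name
        },
    }


cfg = portkey_client_config("PORTKEY_API_KEY", "bedrock")
# client = openai.OpenAI(**cfg)  # the rest of the agent stays unchanged
```

    The agent's runtime, gateway, and memory services in AgentCore are untouched; only the model endpoint the client talks to changes.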

    See how you can implement it here.

  5. Smarter routing capabilities!

    Sticky load balancing ensures that requests sharing the same identifier are consistently routed to the same target.

    This is useful for:

    • Maintaining conversation context across multiple requests
    • Ensuring consistent model behavior during A/B testing
    • Supporting session-based or user-specific routing
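
    The routing idea behind sticky load balancing can be sketched as deterministic hashing: the same identifier always hashes to the same target. This illustrates the concept only, not Portkey's actual configuration schema:

```python
# Conceptual sketch of sticky routing: hash a stable identifier
# (session id, user id) so repeated requests land on the same target.
import hashlib

TARGETS = ["target-a", "target-b", "target-c"]


def pick_target(sticky_id: str, targets=TARGETS) -> str:
    # md5 is stable across processes, unlike Python's built-in hash().
    digest = hashlib.md5(sticky_id.encode()).hexdigest()
    return targets[int(digest, 16) % len(targets)]
```

    Because the mapping is a pure function of the identifier, a multi-turn conversation keyed on one session id keeps hitting the same model deployment.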

    Read more about sticky load balancing here.

  6. Provider updates — Gemini, Vertex AI, Azure OpenAI

    We’ve shipped a set of provider updates to improve parity and cost visibility across models:

    • Gemini / Vertex AI: Added support for the reasoning_effort parameter to control model thinking behavior. OpenAI values (minimal, low, medium, high) are mapped to Gemini’s thinkingLevel.

    • Azure OpenAI:
    1. Added support for the v1 preview API version across Azure OpenAI endpoints.
    2. Added pricing support for the batch /responses endpoint when used with deployments.
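
    The reasoning_effort mapping for Gemini / Vertex AI can be sketched as a simple lookup. The target thinkingLevel values below are illustrative assumptions; the point is that the OpenAI-style parameter is translated rather than passed through verbatim:

```python
# Sketch: translating OpenAI's reasoning_effort values to Gemini's
# thinkingLevel. The right-hand values are assumptions for illustration.
EFFORT_TO_THINKING_LEVEL = {
    "minimal": "low",  # assumed: lowest two tiers collapse to "low"
    "low": "low",
    "medium": "medium",
    "high": "high",
}


def to_thinking_level(reasoning_effort: str) -> str:
    try:
        return EFFORT_TO_THINKING_LEVEL[reasoning_effort]
    except KeyError:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
```

    Unknown values are rejected rather than silently passed through, so a typo surfaces at request time.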

  7. GPT-5.2 is now available on Portkey!

    You can now start using GPT-5.2 in production with:

    ✅ Unified access through the Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

  8. Usage limits + rate limit policies

    You can now apply rate limits and usage budgets across your entire organization, and update them instantly at a team or group level instead of handling them one at a time. You can define policy conditions based on API keys, metadata, or workspaces.

    This makes it easier to:

    ✔ Adjust budgets org-wide in one move

    ✔ Roll out policy changes across teams at once

    ✔ Create grouped or multi-condition enforcement rules

    ✔ Scale access safely without config overload

    See how you can implement these policies.

  9. Claude Opus 4.5 is now available on Portkey!

    You can now start using Opus 4.5 in production with:

    ✅ Unified access through the Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls
