Announcements

Stay informed about Portkey's newest releases, features, and improvements

  1. Open-sourcing the LLM pricing database!!

    Keeping LLM pricing accurate is difficult because it isn’t static. Models change, pricing shifts, and new billing dimensions are introduced over time.

    To handle this, we built a never-goes-old database that powers cost calculations ~$250,000 in LLM spend, every day.

    We’ve now open-sourced the pricing database produced by this system so others can inspect how pricing is tracked, reuse the data, and understand the assumptions behind cost calculations.

    The database covers 2,000+ models across 40+ providers, including thinking tokens, cache pricing, context tiers, and non-token fees. It’s available via a free public API with no authentication required.

    Take a look ->https://portkey.ai/models

    llms.new
  2. OpenCode Integration!

    Running OpenCode in shared or production environments?

    By using Portkey alongside OpenCode, teams can add the controls needed to run it at scale, without changing how developers work.

    You get:

    ✅ Access control, budgets, and usage limits at the platform layer

    ✅ Centralized governance without modifying OpenCode itself

    ✅ Full observability across usage, latency, and spend

    ✅ Guardrails to keep usage predictable in shared environments

    See how you can run OpenCode with governance

    2

  3. Anthropic models on Azure are now accessible via Portkey!

    You can now access Anthropic models on Azure via the native /messages endpoint, so you can keep the standard Anthropic request format while getting:

    ✅ Unified access through the Portkey Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

  4. Oracle LLMs are now available on Portkey!

    You can now start using LLMs hosted on Oracle Cloud Infrastructure (OCI) in production with:

    ✅ Support for Cohere, Meta and Llama models via OCI Generative AI

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

    See how you can add Oracle models to Portkey.

    oracle

  5. Gemini 3 Flash Preview model support!

    Gemini 3 Flash delivers fast reasoning and multimodal understanding at lower latency and cost compared to larger models, while still handling complex text, image, audio, and video tasks.

    With Portkey, you can bring this model into production with:

    ✅ Unified access via the Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

    Try it today!

  6. Portkey now integrates with Amazon Bedrock AgentCore!

    Because AgentCore supports OpenAI-compatible frameworks, you can integrate Portkey without modifying your agent code while keeping AgentCore’s runtime, gateway, and memory services intact.

    With this setup, you get:

    • A unified gateway for 1600+ models across providers
    • Production telemetry (traces, logs, metrics) for AgentCore invocations
    • Reliability controls such as fallbacks, load balancing, and timeouts
    • Centralized governance over provider access, spend, and policies using Portkey API keys

    See how you can implement it here.

  7. Smarter routing capabilities!

    Sticky load balancing ensures that requests sharing the same identifier are consistently routed to the same target.

    This is useful for:

    • Maintaining conversation context across multiple requests
    • Ensuring consistent model behavior during A/B testing
    • Supporting session-based or user-specific routing

    Read more about this here ->

    sticky-load-balancing
  8. Provider updates — Gemini, Vertex AI, Azure OpenAI

    We’ve shipped a set of provider updates to improve parity and cost visibility across models:

    • Gemini / Vertex AI: Added support for the reasoning_effort parameter to control model thinking behavior. OpenAI values (minimal, low, medium, high) are mapped to Gemini’s thinkingLevel.

    • Azure OpenAI:
    1. Added support for the v1 preview API version across Azure OpenAI endpoints.
    2. Added pricing support for the batch /responses endpoint when used with deployments.

  9. GPT-5.2 is now available on Portkey!

    You can now start using GPT-5.2 in production with:

    ✅Unified access through the Gateway

    ✅Guardrails for safe, compliant usage

    ✅Full observability across logs, latency, and spend

    ✅Team-level budgets and rate-limit controls

    gpt-5.2