Announcements

Stay informed about Portkey's newest releases, features, and improvements

  1. Heads up if you’re using Gemini models 👀

    Google is deprecating Gemini 2.0 Flash and Gemini 2.0 Flash Lite on March 31, 2026 across the Gemini API and Google AI Studio.

    If your apps or agents rely on these models, you’ll need to migrate to supported alternatives like Gemini 2.5 Flash or 2.5 Flash Lite before that date.

  2. 🚀 The MCP Gateway is now GA!

    MCP adoption is accelerating rapidly, with servers and tools now being used across teams, environments, and agent workflows.

    As MCP usage scales, teams start running into a consistent set of challenges:

    • authentication varies across MCP servers
    • access to tools becomes difficult to govern
    • visibility into MCP usage is limited

    Portkey’s MCP Gateway solves this!

    It is a control layer that lets you securely connect MCP servers, standardize authentication, manage tool access, and get clear visibility into MCP usage, all without changing existing agents or MCP servers.

    Check it out ->

  3. Open-sourcing the LLM pricing database!!

    Keeping LLM pricing accurate is difficult because it isn’t static. Models change, pricing shifts, and new billing dimensions are introduced over time.

    To handle this, we built a database that never goes stale and powers cost calculations for ~$250,000 in LLM spend every day.

    We’ve now open-sourced the pricing database produced by this system so others can inspect how pricing is tracked, reuse the data, and understand the assumptions behind cost calculations.

    The database covers 2,000+ models across 40+ providers, including thinking tokens, cache pricing, context tiers, and non-token fees. It’s available via a free public API with no authentication required.

    Take a look -> https://portkey.ai/models
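
    If you want to use the data programmatically, here is a minimal sketch of fetching a model's pricing and estimating a request's cost. The endpoint path, query parameter, and field names (input_price_per_token, output_price_per_token) are assumptions for illustration only; see https://portkey.ai/models for the actual API shape.

    ```python
    import requests

    # Hypothetical endpoint and response fields, shown only to illustrate the idea;
    # the real, unauthenticated API is documented at https://portkey.ai/models.
    PRICING_URL = "https://portkey.ai/models"  # replace with the documented API endpoint

    def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
        pricing = requests.get(PRICING_URL, params={"model": model_id}, timeout=10).json()
        # Assumed per-token fields; real entries may also carry cache pricing,
        # thinking tokens, context tiers, and non-token fees.
        return (
            input_tokens * pricing["input_price_per_token"]
            + output_tokens * pricing["output_price_per_token"]
        )

    print(estimate_cost("gpt-4o-mini", input_tokens=1200, output_tokens=300))
    ```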

  4. OpenCode Integration!

    Running OpenCode in shared or production environments?

    By using Portkey alongside OpenCode, teams can add the controls needed to run it at scale, without changing how developers work.

    You get:

    ✅ Access control, budgets, and usage limits at the platform layer

    ✅ Centralized governance without modifying OpenCode itself

    ✅ Full observability across usage, latency, and spend

    ✅ Guardrails to keep usage predictable in shared environments

    See how you can run OpenCode with governance

  5. Anthropic models on Azure are now accessible via Portkey!

    You can now access Anthropic models on Azure via the native /messages endpoint, so you can keep the standard Anthropic request format while getting:

    ✅ Unified access through the Portkey Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls
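
    As a rough sketch of what this looks like from the client side, the standard Anthropic SDK can point at the Portkey Gateway and keep using messages.create unchanged. The virtual key name and model ID below are placeholders for whatever you configure in Portkey for Anthropic on Azure.

    ```python
    from anthropic import Anthropic

    # Point the Anthropic SDK at the Portkey Gateway; requests keep the native
    # /messages format. Virtual key and model ID below are placeholders.
    client = Anthropic(
        api_key="placeholder",  # provider credentials live in Portkey, not here
        base_url="https://api.portkey.ai",
        default_headers={
            "x-portkey-api-key": "YOUR_PORTKEY_API_KEY",
            "x-portkey-virtual-key": "YOUR_AZURE_ANTHROPIC_KEY",
        },
    )

    message = client.messages.create(
        model="claude-sonnet-4-5",  # example model ID; use your Azure deployment's model
        max_tokens=512,
        messages=[{"role": "user", "content": "Summarize this incident report."}],
    )
    print(message.content[0].text)
    ```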

  6. Oracle LLMs are now available on Portkey!

    You can now start using LLMs hosted on Oracle Cloud Infrastructure (OCI) in production with:

    ✅ Support for Cohere and Meta Llama models via OCI Generative AI

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

    See how you can add Oracle models to Portkey.
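
    For instance, once an OCI Generative AI provider is configured in Portkey, calls go through the usual Portkey SDK. The virtual key and model ID in this sketch are placeholders for whatever you set up in your workspace.

    ```python
    from portkey_ai import Portkey

    # Placeholder credentials: create an OCI Generative AI provider in Portkey
    # and use its API key / virtual key here.
    client = Portkey(api_key="YOUR_PORTKEY_API_KEY", virtual_key="YOUR_OCI_KEY")

    completion = client.chat.completions.create(
        model="cohere.command-r-plus",  # example OCI model ID; adjust to your tenancy
        messages=[{"role": "user", "content": "List three OCI regions in Europe."}],
    )
    print(completion.choices[0].message.content)
    ```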

  7. Gemini 3 Flash Preview model support!

    Gemini 3 Flash delivers fast reasoning and multimodal understanding at lower latency and cost compared to larger models, while still handling complex text, image, audio, and video tasks.

    With Portkey, you can bring this model into production with:

    ✅ Unified access via the Gateway

    ✅ Guardrails for safe, compliant usage

    ✅ Full observability across logs, latency, and spend

    ✅ Team-level budgets and rate-limit controls

    Try it today!
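
    As a quick, illustrative sketch (the virtual key and model ID below are placeholders), a multimodal request through the Gateway can use familiar OpenAI-style content parts:

    ```python
    from portkey_ai import Portkey

    # Placeholder key and model ID; point these at your Google provider in Portkey.
    client = Portkey(api_key="YOUR_PORTKEY_API_KEY", virtual_key="YOUR_GOOGLE_KEY")

    response = client.chat.completions.create(
        model="gemini-3-flash-preview",  # assumed model ID for the preview
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }],
    )
    print(response.choices[0].message.content)
    ```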

  8. Portkey now integrates with Amazon Bedrock AgentCore!

    Because AgentCore supports OpenAI-compatible frameworks, you can integrate Portkey without modifying your agent code while keeping AgentCore’s runtime, gateway, and memory services intact.

    With this setup, you get:

    • A unified gateway for 1600+ models across providers
    • Production telemetry (traces, logs, metrics) for AgentCore invocations
    • Reliability controls such as fallbacks, load balancing, and timeouts
    • Centralized governance over provider access, spend, and policies using Portkey API keys

    See how you can implement it here.
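
    Concretely, any OpenAI-compatible client inside an AgentCore agent can be pointed at the Portkey Gateway. The sketch below assumes the portkey-ai Python package and a placeholder virtual key; the rest of your agent code stays the same.

    ```python
    from openai import OpenAI
    from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

    # Inside your AgentCore agent: swap the base URL and headers, nothing else.
    client = OpenAI(
        api_key="unused",  # provider credentials are managed in Portkey
        base_url=PORTKEY_GATEWAY_URL,
        default_headers=createHeaders(
            api_key="YOUR_PORTKEY_API_KEY",
            virtual_key="YOUR_PROVIDER_KEY",  # placeholder for any configured provider
        ),
    )

    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # example model; routed and logged by Portkey
        messages=[{"role": "user", "content": "Plan the next agent step."}],
    )
    print(reply.choices[0].message.content)
    ```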

  9. Smarter routing capabilities!

    Sticky load balancing ensures that requests sharing the same identifier are consistently routed to the same target.

    This is useful for:

    • Maintaining conversation context across multiple requests
    • Ensuring consistent model behavior during A/B testing
    • Supporting session-based or user-specific routing

    Read more about this here ->
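
    Conceptually, sticky routing means the routing decision is a deterministic function of the identifier rather than a random draw. The snippet below only illustrates that idea; it is not Portkey's actual configuration or algorithm.

    ```python
    import hashlib

    def pick_target(sticky_id: str, targets: list[str]) -> str:
        """Map the same identifier to the same target on every request."""
        digest = hashlib.sha256(sticky_id.encode()).hexdigest()
        return targets[int(digest, 16) % len(targets)]

    targets = ["gpt-4o", "claude-sonnet-4-5"]
    # A given session or user always lands on the same target...
    assert pick_target("session-42", targets) == pick_target("session-42", targets)
    # ...while different identifiers still spread across targets.
    print(pick_target("session-42", targets), pick_target("user-7", targets))
    ```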
