Announcements

Stay informed about Portkey's newest releases, features, and improvements

  1. Better availability, fewer 429s.

    Vertex AI has rolled out global endpoints, and they’re now fully supported on Portkey. If you’ve hit “resource exhausted” errors before, switching to the global endpoint can significantly improve availability and reduce throttling under load.

     

    To enable it on Portkey, just set the region to global in your Vertex AI virtual key. Everything else works the same.
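As a sketch of what "everything else works the same" means in practice: the request you send through the gateway is unchanged, since the region lives on the virtual key itself. The header names below follow Portkey's REST API; the key values are placeholders for your own credentials.

```python
# Hedged sketch: a chat completion request routed through a Vertex AI
# virtual key whose region is set to "global" in the Portkey dashboard.
# The request body and headers are identical to a regional setup.
import json

PORTKEY_API_KEY = "YOUR_PORTKEY_API_KEY"        # placeholder
VERTEX_VIRTUAL_KEY = "YOUR_VERTEX_VIRTUAL_KEY"  # region: global (set in the dashboard)

headers = {
    "Content-Type": "application/json",
    "x-portkey-api-key": PORTKEY_API_KEY,
    "x-portkey-virtual-key": VERTEX_VIRTUAL_KEY,
}

payload = {
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Hello from the global endpoint"}],
}

# POST this to https://api.portkey.ai/v1/chat/completions with any HTTP
# client, e.g. requests.post(url, headers=headers, data=json.dumps(payload)).
print(json.dumps(payload, indent=2))
```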

     

    Models supported via global endpoints:

    – Gemini 2.5 Flash-Lite

    – Gemini 2.5 Pro

    – Gemini 2.5 Flash

    – Gemini 2.0 Flash

    – Gemini 2.0 Flash-Lite

     

  2. ⚡ Day 0 support for Gemini 2.5 Pro, Flash, and the new Flash-Lite!

    Gemini 2.5 Pro and 2.5 Flash models are now stable and generally available. 2.5 Flash-Lite is the most cost-efficient and fastest Gemini 2.5 model yet.

    These models offer long context lengths, making them especially useful for coding.

    You can now access these models via Portkey and get:

     

    ✅ Smart routing and fallback options

    ✅ Full logs, usage metrics, and cost tracking

    ✅ Org-wide rate limits, retries, and guardrails

    ✅ Prompt playground to test, version, and deploy prompts

  3. 🔥 Now available on Portkey

    The gemini-2.5-pro-preview-05-06 and gemini-2.5-flash-preview-05-20 models are now live and ready to use via the AI Gateway, along with:

     

    ✅ Org-wide guardrails, retries, and rate limits

    ✅ Full logging and metadata for every call

    ✅ Usage, latency, and cost tracking across all providers

  4. Langroid now supports Portkey out of the box!

    Langroid is a powerful Python framework purpose-built for agentic workflows with multi-agent programming.

    Teams building on Langroid can now easily plug into Portkey's AI Gateway and get:

     

    ✅ Unified access to 1600+ models

    ✅ Caching, retries, and fallbacks

    ✅ Prompt-level safety guardrails

    ✅ Logs, cost metrics, and full observability

    ✅ Budget enforcement and routing

     

    Explore the integration

     

  5. 🚨 Model Deprecation Alert

    OpenAI is deprecating the gpt-4o-realtime-preview-2024-10-01 model.

    The endpoint will stop responding entirely after September 10, 2025.

     

    Quick action items:

    ✅ Audit any usage of this specific model

    ✅ Migrate to the gpt-4o-realtime-preview release

    ✅ Complete all tests before the cutoff

     


  6. Magistral is now available on Portkey!

    You can now connect to Magistral, Mistral’s first reasoning model, via Portkey. It is purpose-built for domain-specific, transparent, and multilingual tasks.

    With Portkey’s integration, you can run Magistral in production with:

     

    ✅ Smart routing + model fallback

    ✅ Guardrails for safe, consistent outputs

    ✅ Logs, latency, and cost observability

    ✅ Rate limits and budget enforcement


     

     

  7. ⚡ Day 0 support for o3-pro!

    You can now start using OpenAI's o3-pro via Portkey's AI Gateway.

     

    A defining feature of o3-pro is its ability to understand and generate highly human-like responses, maintaining context even in complex, multi-turn conversations.

     

    With Portkey, bring o3-pro into production with:

    ✅ Smart routing, failover, and retries built in

    ✅ Guardrails for safe, compliant interactions

    ✅ Full observability with logging, latency, and cost insights

    ✅ Budget and rate-limit controls across use cases


  8. Strengthening our AI agent support with Strands Agents

    We’re continuing to deepen our agent infrastructure, now with Strands Agents, the lightweight framework built by AWS.

     

    With Portkey, you can enhance Strands Agents with:

     

    ✅ Full observability of every step, tool use, and output

    ✅ Built-in reliability—retries, fallbacks, and load balancing

    ✅ Access to 1600+ LLMs through a single API

    ✅ Cost tracking, usage logging, and metadata tagging

    ✅ Guardrails for safe, compliant behavior across environments

     

    Explore the integration here 

     

  9. 📣 Portkey now integrates with Roo Code!

    Roo Code is built for fast development; Portkey makes it enterprise-ready.

    With Portkey, you can:

     

    ✅ Trace every request by user, tool, or team

    ✅ Manage LLM spend with rate limits and budgets

    ✅ Enforce RBAC for users, teams, and environments

    ✅ Add retries, fallbacks, and safety guardrails

    ✅ Get full logging and metrics across all models

     

    Explore the Roo Code integration

     

  10. Portkey now integrates with Cline!

    With Portkey, you can turn your Cline setup into an enterprise-grade system with:

     

    • Real-time usage and spend tracking
    • Budget limits per team or project
    • RBAC for fine-grained access and model management
    • Full observability—trace every call across 1600+ LLMs
    • Fallbacks, retries, and guardrails with zero code changes

    Keep building without interruptions. Check out the documentation here.