
Cloudflare AI Gateway

Expert guidance for implementing Cloudflare AI Gateway - a universal gateway for AI model providers with analytics, caching, rate limiting, and routing capabilities.

When to Use This Reference

  • Setting up AI Gateway for any AI provider (OpenAI, Anthropic, Workers AI, etc.)
  • Implementing caching, rate limiting, or request retry/fallback
  • Configuring dynamic routing with A/B testing or model fallbacks
  • Managing provider API keys securely with BYOK
  • Adding security features (guardrails, DLP)
  • Setting up observability with logging and custom metadata
  • Debugging AI Gateway requests or optimizing configurations

Quick Start

What's your setup?

Pattern 1: Vercel AI SDK

The most current pattern: the official ai-gateway-provider package, which supports automatic fallbacks.

import { createAiGateway } from 'ai-gateway-provider';
import { createOpenAI } from '@ai-sdk/openai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

const gateway = createAiGateway({
  accountId: process.env.CF_ACCOUNT_ID,
  gateway: process.env.CF_GATEWAY_ID,
});

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

// Needed only for the fallback example below
const anthropic = createAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY
});

// Single model
const { text } = await generateText({
  model: gateway(openai('gpt-4o')),
  prompt: 'Hello'
});

// Automatic fallback array: models are tried in order
const { text: fallbackText } = await generateText({
  model: gateway([
    openai('gpt-4o'),              // Try first
    anthropic('claude-sonnet-4-5'), // Fallback
  ]),
  prompt: 'Hello'
});

Install: npm install ai-gateway-provider ai @ai-sdk/openai @ai-sdk/anthropic

Pattern 2: OpenAI SDK

Drop-in replacement for OpenAI API with multi-provider support.

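import OpenAI from 'openai';

// Environment variable names are illustrative; use whatever your deployment defines
const accountId = process.env.CF_ACCOUNT_ID;
const gatewayId = process.env.CF_GATEWAY_ID;
const cfToken = process.env.CF_AIG_TOKEN; // Cloudflare API token (hypothetical variable name)

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat`,
  defaultHeaders: {
    'cf-aig-authorization': `Bearer ${cfToken}` // For authenticated gateways
  }
});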

// Switch providers by changing model format: {provider}/{model}
const response = await client.chat.completions.create({
  model: 'openai/gpt-4o', // or 'anthropic/claude-sonnet-4-5'
  messages: [{ role: 'user', content: 'Hello!' }]
});

Pattern 3: Workers AI Binding

For Cloudflare Workers using Workers AI.

export default {
  async fetch(request, env, ctx) {
    const response = await env.AI.run(
      '@cf/meta/llama-3-8b-instruct',
      { messages: [{ role: 'user', content: 'Hello!' }] },
      { 
        gateway: { 
          id: 'my-gateway',
          metadata: { userId: '123', team: 'engineering' }
        } 
      }
    );
    
    return Response.json(response);
  }
};

Headers Quick Reference

| Header | Purpose | Example | Notes |
| --- | --- | --- | --- |
| cf-aig-authorization | Gateway auth | Bearer {token} | Required for authenticated gateways |
| cf-aig-metadata | Tracking | {"userId":"x"} | Max 5 entries, flat structure |
| cf-aig-cache-ttl | Cache duration | 3600 | Seconds, min 60, max 2592000 (30 days) |
| cf-aig-skip-cache | Bypass cache | true | - |
| cf-aig-cache-key | Custom cache key | my-key | Must be unique per response |
| cf-aig-collect-log | Toggle logging | false | Set to false to skip logging; default: true |
| cf-aig-cache-status | Cache hit/miss | HIT or MISS | Response header only |
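
As a minimal sketch of how the request headers above might be assembled in application code, the helper below builds a header map and enforces the limits listed in the table (at most 5 metadata entries, TTL between 60 and 2592000 seconds). The function name `buildGatewayHeaders` and its option names are hypothetical and not part of any Cloudflare SDK; only the header names and limits come from the table.

```typescript
// Hypothetical helper (not a Cloudflare API): assembles cf-aig-* request headers.
interface GatewayHeaderOptions {
  token?: string;                                          // Cloudflare API token
  metadata?: Record<string, string | number | boolean>;    // flat key/value pairs only
  cacheTtlSeconds?: number;
  skipCache?: boolean;
  cacheKey?: string;
  collectLog?: boolean;
}

function buildGatewayHeaders(opts: GatewayHeaderOptions): Record<string, string> {
  const result: Record<string, string> = {};

  if (opts.token) {
    result['cf-aig-authorization'] = `Bearer ${opts.token}`;
  }

  if (opts.metadata) {
    // The table lists a maximum of 5 entries
    if (Object.keys(opts.metadata).length > 5) {
      throw new Error('cf-aig-metadata allows at most 5 entries');
    }
    result['cf-aig-metadata'] = JSON.stringify(opts.metadata);
  }

  if (opts.cacheTtlSeconds !== undefined) {
    // The table lists a range of 60 seconds to 2592000 seconds (30 days)
    if (opts.cacheTtlSeconds < 60 || opts.cacheTtlSeconds > 2592000) {
      throw new RangeError('cf-aig-cache-ttl must be between 60 and 2592000');
    }
    result['cf-aig-cache-ttl'] = String(opts.cacheTtlSeconds);
  }

  if (opts.skipCache) result['cf-aig-skip-cache'] = 'true';
  if (opts.cacheKey) result['cf-aig-cache-key'] = opts.cacheKey;
  if (opts.collectLog === false) result['cf-aig-collect-log'] = 'false';

  return result;
}

// Example: cache for one hour and tag the request with a user id
const exampleHeaders = buildGatewayHeaders({
  token: 'example-token',
  metadata: { userId: 'x' },
  cacheTtlSeconds: 3600,
});
```

The resulting object can be passed as `defaultHeaders` to the OpenAI client or as `headers` in a `fetch` call. Validating on the client side is a design choice of this sketch; how the gateway itself responds to out-of-range values is not covered here.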

In This Reference

| File | Purpose |
| --- | --- |
| sdk-integration.md | Vercel AI SDK, OpenAI SDK, Workers binding patterns |
| configuration.md | Dashboard setup, wrangler, API tokens |
| features.md | Caching, rate limits, guardrails, DLP, BYOK, unified billing |
| dynamic-routing.md | Fallbacks, A/B testing, conditional routing |
| troubleshooting.md | Debugging, errors, observability, gotchas |

Reading Order

| Task | Files |
| --- | --- |
| First-time setup | README + configuration.md |
| SDK integration | README + sdk-integration.md |
| Enable caching | README + features.md |
| Set up fallbacks | README + dynamic-routing.md |
| Debug errors | README + troubleshooting.md |

Architecture

AI Gateway acts as a proxy between your application and AI providers:

Your App → AI Gateway → AI Provider (OpenAI, Anthropic, etc.)
         ↓
    Analytics, Caching, Rate Limiting, Logging

Key URL patterns:

  • Unified API (OpenAI-compatible): https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions
  • Provider-specific: https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}/{endpoint}
  • Dynamic routes: Use route name instead of model: dynamic/{route-name}
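
The URL patterns above can be sketched as small string builders. The function names (`unifiedUrl`, `providerUrl`, `dynamicRouteModel`) are hypothetical helpers for illustration, and the account, gateway, and route names in the example are placeholders.

```typescript
// Base URL taken from the patterns listed above
const GATEWAY_BASE = 'https://gateway.ai.cloudflare.com/v1';

// Unified (OpenAI-compatible) chat completions endpoint
function unifiedUrl(accountId: string, gatewayId: string): string {
  return `${GATEWAY_BASE}/${accountId}/${gatewayId}/compat/chat/completions`;
}

// Provider-specific endpoint; leading slashes on `endpoint` are stripped
function providerUrl(
  accountId: string,
  gatewayId: string,
  provider: string,
  endpoint: string,
): string {
  const path = endpoint.replace(/^\/+/, '');
  return `${GATEWAY_BASE}/${accountId}/${gatewayId}/${provider}/${path}`;
}

// Dynamic routes are addressed through the model field, not the URL
function dynamicRouteModel(routeName: string): string {
  return `dynamic/${routeName}`;
}

// Placeholder identifiers for illustration only
const exampleUnified = unifiedUrl('my-account', 'my-gateway');
const exampleProvider = providerUrl('my-account', 'my-gateway', 'openai', 'chat/completions');
const exampleModel = dynamicRouteModel('my-route');
```

With the OpenAI SDK pattern, the value from `dynamicRouteModel` would go in the `model` field of the request in place of a `{provider}/{model}` string.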

Gateway Types

  1. Unauthenticated Gateway: Open access (not recommended for production)
  2. Authenticated Gateway: Requires cf-aig-authorization header with Cloudflare API token (recommended)

Provider Authentication Options

  1. Unified Billing: Use AI Gateway billing to pay for inference (keyless mode - no provider API key needed)
  2. BYOK (Store Keys): Store provider API keys in Cloudflare dashboard
  3. Request Headers: Include provider API key in each request
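
One way to picture the difference between the three options is by which credentials travel with each request. The sketch below assumes an authenticated gateway and the OpenAI-compatible endpoint, and it assumes that under Unified Billing and BYOK the request carries only the gateway token while the gateway supplies the provider credential. The `ProviderAuthMode` type and `providerAuthHeaders` function are hypothetical, and other providers may expect a different header than `Authorization` for their key.

```typescript
// Hypothetical names for the three options listed above
type ProviderAuthMode = 'unified-billing' | 'byok' | 'request-headers';

function providerAuthHeaders(
  mode: ProviderAuthMode,
  cfToken: string,
  providerApiKey?: string,
): Record<string, string> {
  // Gateway authentication is the same in every mode
  const result: Record<string, string> = {
    'cf-aig-authorization': `Bearer ${cfToken}`,
  };

  if (mode === 'request-headers') {
    // Only this mode sends the provider key with each request
    if (!providerApiKey) {
      throw new Error('request-headers mode needs a provider API key');
    }
    // Assumption: OpenAI-style bearer header for the provider key
    result['Authorization'] = `Bearer ${providerApiKey}`;
  }

  // 'unified-billing' and 'byok': no provider key in the request (assumed)
  return result;
}

// Placeholder token values for illustration only
const byokHeaders = providerAuthHeaders('byok', 'example-cf-token');
const directHeaders = providerAuthHeaders('request-headers', 'example-cf-token', 'example-provider-key');
```

Keeping provider keys out of application code is the main reason to prefer the first two options; the third is the simplest to adopt when keys are already managed elsewhere.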

Resources