2.7 KiB
Features & Capabilities
Caching
Dashboard: Settings → Cache Responses → Enable
// Custom TTL (1 hour)
headers: { 'cf-aig-cache-ttl': '3600' }
// Skip cache
headers: { 'cf-aig-skip-cache': 'true' }
// Custom cache key
headers: { 'cf-aig-cache-key': 'greeting-en' }
Limits: TTL 60s - 30 days. Does NOT work with streaming.
Rate Limiting
Dashboard: Settings → Rate-limiting → Enable
- Fixed window: Resets at intervals
- Sliding window: Rolling window (more accurate)
- Returns
429when exceeded
Guardrails
Dashboard: Settings → Guardrails → Enable
Filter prompts/responses for inappropriate content. Actions: Flag (log) or Block (reject).
Data Loss Prevention (DLP)
Dashboard: Settings → DLP → Enable
Detect PII (emails, SSNs, credit cards). Actions: Flag, Block, or Redact.
Billing Modes
| Mode | Description | Setup |
|---|---|---|
| Unified Billing | Pay through Cloudflare, no provider keys | Use cf-aig-authorization header only |
| BYOK | Store provider keys in dashboard | Add keys in Provider Keys section |
| Pass-through | Send provider key with each request | Include provider's auth header |
Zero Data Retention
Dashboard: Settings → Privacy → Zero Data Retention
No prompts/responses stored. Request counts and costs still tracked.
Logging
Dashboard: Settings → Logs → Enable (up to 10M logs)
Each entry: prompt, response, provider, model, tokens, cost, duration, cache status, metadata.
// Skip logging for request
headers: { 'cf-aig-collect-log': 'false' }
Export: Use Logpush to S3, GCS, Datadog, Splunk, etc.
Custom Cost Tracking
For models not in Cloudflare's pricing database:
Dashboard: Gateway → Settings → Custom Costs
Or via API: set model, input_cost, output_cost.
Supported Providers (22+)
| Provider | Unified API | Notes |
|---|---|---|
| OpenAI | openai/gpt-4o |
Full support |
| Anthropic | anthropic/claude-sonnet-4-5 |
Full support |
| Google AI | google-ai-studio/gemini-2.0-flash |
Full support |
| Workers AI | workersai/@cf/meta/llama-3 |
Native |
| Azure OpenAI | azure-openai/* |
Deployment names |
| AWS Bedrock | Provider endpoint only | /bedrock/* |
| Groq | groq/* |
Fast inference |
| Mistral, Cohere, Perplexity, xAI, DeepSeek, Cerebras | Full support | - |
Best Practices
- Enable caching for deterministic prompts
- Set rate limits to prevent abuse
- Use guardrails for user-facing AI
- Enable DLP for sensitive data
- Use unified billing or BYOK for simpler key management
- Enable logging for debugging
- Use zero data retention when privacy required