
Cloudflare AI Search Reference

Expert guidance for implementing Cloudflare AI Search (formerly AutoRAG), Cloudflare's managed semantic search and RAG service.

Overview

AI Search is a managed RAG (Retrieval-Augmented Generation) pipeline that combines:

  • Automatic semantic indexing of your content
  • Vector similarity search
  • Built-in LLM generation

Key value propositions:

  • Zero vector management - No manual embedding, indexing, or storage
  • Auto-indexing - Content automatically re-indexed every 6 hours
  • Built-in generation - Optional AI response generation from retrieved context
  • Multi-source - Index from R2 buckets or website crawls

Data source options:

  • R2 bucket - Index files from Cloudflare R2 (supports MD, TXT, HTML, PDF, DOC, CSV, JSON)
  • Website - Crawl and index website content (requires Cloudflare-hosted domain)

Indexing lifecycle:

  • Automatic 6-hour refresh cycle
  • Manual "Force Sync" available (30s rate limit)
  • Not designed for real-time updates

Quick Start

1. Create AI Search instance in dashboard:

  • Go to Cloudflare Dashboard → AI Search → Create
  • Choose data source (R2 or website)
  • Configure instance name and settings

2. Configure Worker:

```jsonc
// wrangler.jsonc
{
  "ai": {
    "binding": "AI"
  }
}
```

3. Use in Worker:

```javascript
export default {
  async fetch(request, env) {
    const answer = await env.AI.autorag("my-search-instance").aiSearch({
      query: "How do I configure caching?",
      model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast"
    });

    return Response.json({ answer: answer.response });
  }
};
```
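For production use, it helps to wrap the call so failures return a stable shape instead of throwing. A minimal sketch, assuming the same `AI` binding and instance name as above (the helper name `askDocs` is illustrative):

```javascript
// Hypothetical helper: wraps an AI Search query with basic error handling.
// The instance name "my-search-instance" matches the Quick Start example.
async function askDocs(env, query) {
  try {
    const result = await env.AI.autorag("my-search-instance").aiSearch({ query });
    return { ok: true, answer: result.response };
  } catch (err) {
    // Surface a stable shape to callers instead of propagating the exception
    return { ok: false, error: String(err) };
  }
}
```

A caller can then branch on `ok` rather than wrapping every request in its own try/catch.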

AI Search vs Vectorize

| Factor | AI Search | Vectorize |
|---|---|---|
| Management | Fully managed | Manual embedding + indexing |
| Use when | Want zero-ops RAG pipeline | Need custom embeddings/control |
| Indexing | Automatic (6-hour cycle) | Manual via API |
| Generation | Built-in (optional) | Bring your own LLM |
| Data sources | R2 or website | Manual insert |
| Best for | Docs, support, enterprise search | Custom ML pipelines, real-time |

AI Search vs Direct Workers AI

| Factor | AI Search | Workers AI (direct) |
|---|---|---|
| Context | Automatic retrieval | Manual context building |
| Use when | Need RAG (search + generate) | Simple generation tasks |
| Indexing | Built-in | Not applicable |
| Best for | Knowledge bases, docs | Simple chat, transformations |
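For contrast, here is what the "direct" Workers AI path looks like when you build the context yourself. This is a hedged sketch: the model name is the one used in Quick Start above, the `env.AI.run(model, { messages })` call shape follows Workers AI's chat-style models, and the prompt wording and `manualRag` name are illustrative:

```javascript
// Manual RAG: retrieve chunks yourself (e.g. from Vectorize or KV),
// assemble them into a prompt, and call the model directly.
async function manualRag(env, query, contextChunks) {
  const messages = [
    {
      role: "system",
      content: `Answer using only this context:\n${contextChunks.join("\n---\n")}`,
    },
    { role: "user", content: query },
  ];
  // env.AI.run() invokes a Workers AI model with the assembled messages
  return env.AI.run("@cf/meta/llama-3.3-70b-instruct-fp8-fast", { messages });
}
```

With AI Search, the retrieval and prompt assembly above are handled for you by `aiSearch()`.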

search() vs aiSearch()

| Method | Returns | Use When |
|---|---|---|
| search() | Search results only | Building custom UI, need raw chunks |
| aiSearch() | AI response + results | Need ready-to-use answer (chatbot, Q&A) |
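A sketch of the `search()` path for building a custom UI from raw chunks. The `max_num_results` parameter and the `data` array in the response follow the documented AI Search response shape, but the per-chunk field names (`score`, `filename`) and the `topSources` helper are assumptions here:

```javascript
// Return the filenames of chunks scoring above a threshold,
// e.g. to render a "Sources" list in a custom search UI.
async function topSources(env, query, minScore = 0.5) {
  const { data } = await env.AI.autorag("my-search-instance").search({
    query,
    max_num_results: 5,
  });
  return data
    .filter((chunk) => chunk.score >= minScore)
    .map((chunk) => chunk.filename);
}
```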

Real-time Updates Consideration

AI Search is NOT a good fit when:

  • You need real-time content updates (fresher than the 6-hour cycle)
  • Content changes multiple times per hour
  • You have strict freshness requirements

AI Search IS a good fit when:

  • Content is relatively stable (docs, policies, knowledge bases)
  • A 6-hour refresh cycle is acceptable
  • You prefer zero-ops over real-time indexing
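If you mostly fit the second list but have a small set of frequently edited documents, one workaround is to overlay fresh copies at query time. This is only a sketch of the idea, not a documented pattern: it assumes recent edits are mirrored into a KV namespace (bound here as `FRESH`, keyed by filename) and that each chunk carries `filename` and `content` fields:

```javascript
// Serve index results, but swap in newer KV copies for documents
// edited since the last ~6-hour index cycle.
async function freshAwareSearch(env, query) {
  const result = await env.AI.autorag("my-search-instance").search({ query });
  for (const chunk of result.data) {
    // If a newer version of this document exists in KV, prefer it
    const fresh = await env.FRESH.get(chunk.filename);
    if (fresh !== null) chunk.content = fresh;
  }
  return result;
}
```

This keeps retrieval (which tolerates slightly stale embeddings) on AI Search while the text shown to users or the LLM stays current.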

Platform Limits

| Limit | Value |
|---|---|
| Max instances per account | 10 |
| Max files per instance | 100,000 |
| Max file size | 4 MB |
| Index frequency | Every 6 hours |
| Force Sync rate limit | Once per 30 seconds |
| Filter nesting depth | 2 levels |
| Filters per compound | 10 |
| Score threshold range | 0.0 - 1.0 |
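The filter limits above constrain how compound metadata filters are built. A hedged sketch of a one-level compound filter that stays within the 10-filters-per-compound limit; the `{ type, key, value }` comparison shape and `or` compound follow AI Search's documented filter format, while the `folder` key and helper name are illustrative:

```javascript
// Build an OR filter matching any of the given folder prefixes,
// capped at the 10-filter compound limit.
function buildFolderFilter(folders) {
  return {
    type: "or",
    filters: folders.slice(0, 10).map((prefix) => ({
      type: "eq",
      key: "folder",
      value: prefix,
    })),
  };
}
```

The result can be passed as the `filters` option to `search()` or `aiSearch()`; nesting another compound inside it would use up the second (and last) allowed level.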

Reading Order

Navigate these references based on your task:

| Task | Read | Est. Time |
|---|---|---|
| Understand AI Search | README only | 5 min |
| Implement basic search | README → api.md | 10 min |
| Configure data source | README → configuration.md | 10 min |
| Production patterns | patterns.md | 15 min |
| Debug issues | gotchas.md | 10 min |
| Full implementation | README → api.md → patterns.md | 30 min |

In This Reference

  • api.md - API endpoints, methods, TypeScript interfaces
  • configuration.md - Setup, data sources, wrangler config
  • patterns.md - Common patterns, decision guidance, code examples
  • gotchas.md - Troubleshooting, code-level gotchas, limits

See Also