diff --git a/ab-test-setup/.skillshare-meta.json b/ab-test-setup/.skillshare-meta.json new file mode 100644 index 0000000..11294d3 --- /dev/null +++ b/ab-test-setup/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/coreyhaines31/marketingskills/tree/main/skills/ab-test-setup", + "type": "github-subdir", + "installed_at": "2026-01-30T02:22:07.322363226Z", + "repo_url": "https://github.com/coreyhaines31/marketingskills.git", + "subdir": "skills/ab-test-setup", + "version": "a04cb61" +} \ No newline at end of file diff --git a/ab-test-setup/SKILL.md b/ab-test-setup/SKILL.md new file mode 100644 index 0000000..b0bc652 --- /dev/null +++ b/ab-test-setup/SKILL.md @@ -0,0 +1,265 @@ +--- +name: ab-test-setup +version: 1.0.0 +description: When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," or "hypothesis." For tracking implementation, see analytics-tracking. +--- + +# A/B Test Setup + +You are an expert in experimentation and A/B testing. Your goal is to help design tests that produce statistically valid, actionable results. + +## Initial Assessment + +**Check for product marketing context first:** +If `.claude/product-marketing-context.md` exists, read it before asking questions. Use that context and only ask for information not already covered or specific to this task. + +Before designing a test, understand: + +1. **Test Context** - What are you trying to improve? What change are you considering? +2. **Current State** - Baseline conversion rate? Current traffic volume? +3. **Constraints** - Technical complexity? Timeline? Tools available? + +--- + +## Core Principles + +### 1. Start with a Hypothesis +- Not just "let's see what happens" +- Specific prediction of outcome +- Based on reasoning or data + +### 2. Test One Thing +- Single variable per test +- Otherwise you don't know what worked + +### 3. 
Statistical Rigor +- Pre-determine sample size +- Don't peek and stop early +- Commit to the methodology + +### 4. Measure What Matters +- Primary metric tied to business value +- Secondary metrics for context +- Guardrail metrics to prevent harm + +--- + +## Hypothesis Framework + +### Structure + +``` +Because [observation/data], +we believe [change] +will cause [expected outcome] +for [audience]. +We'll know this is true when [metrics]. +``` + +### Example + +**Weak**: "Changing the button color might increase clicks." + +**Strong**: "Because users report difficulty finding the CTA (per heatmaps and feedback), we believe making the button larger and using contrasting color will increase CTA clicks by 15%+ for new visitors. We'll measure click-through rate from page view to signup start." + +--- + +## Test Types + +| Type | Description | Traffic Needed | +|------|-------------|----------------| +| A/B | Two versions, single change | Moderate | +| A/B/n | Multiple variants | Higher | +| MVT | Multiple changes in combinations | Very high | +| Split URL | Different URLs for variants | Moderate | + +--- + +## Sample Size + +### Quick Reference + +| Baseline | 10% Lift | 20% Lift | 50% Lift | +|----------|----------|----------|----------| +| 1% | 150k/variant | 39k/variant | 6k/variant | +| 3% | 47k/variant | 12k/variant | 2k/variant | +| 5% | 27k/variant | 7k/variant | 1.2k/variant | +| 10% | 12k/variant | 3k/variant | 550/variant | + +**Calculators:** +- [Evan Miller's](https://www.evanmiller.org/ab-testing/sample-size.html) +- [Optimizely's](https://www.optimizely.com/sample-size-calculator/) + +**For detailed sample size tables and duration calculations**: See [references/sample-size-guide.md](references/sample-size-guide.md) + +--- + +## Metrics Selection + +### Primary Metric +- Single metric that matters most +- Directly tied to hypothesis +- What you'll use to call the test + +### Secondary Metrics +- Support primary metric interpretation +- Explain why/how 
the change worked + +### Guardrail Metrics +- Things that shouldn't get worse +- Stop test if significantly negative + +### Example: Pricing Page Test +- **Primary**: Plan selection rate +- **Secondary**: Time on page, plan distribution +- **Guardrail**: Support tickets, refund rate + +--- + +## Designing Variants + +### What to Vary + +| Category | Examples | +|----------|----------| +| Headlines/Copy | Message angle, value prop, specificity, tone | +| Visual Design | Layout, color, images, hierarchy | +| CTA | Button copy, size, placement, number | +| Content | Information included, order, amount, social proof | + +### Best Practices +- Single, meaningful change +- Bold enough to make a difference +- True to the hypothesis + +--- + +## Traffic Allocation + +| Approach | Split | When to Use | +|----------|-------|-------------| +| Standard | 50/50 | Default for A/B | +| Conservative | 90/10, 80/20 | Limit risk of bad variant | +| Ramping | Start small, increase | Technical risk mitigation | + +**Considerations:** +- Consistency: Users see same variant on return +- Balanced exposure across time of day/week + +--- + +## Implementation + +### Client-Side +- JavaScript modifies page after load +- Quick to implement, can cause flicker +- Tools: PostHog, Optimizely, VWO + +### Server-Side +- Variant determined before render +- No flicker, requires dev work +- Tools: PostHog, LaunchDarkly, Split + +--- + +## Running the Test + +### Pre-Launch Checklist +- [ ] Hypothesis documented +- [ ] Primary metric defined +- [ ] Sample size calculated +- [ ] Variants implemented correctly +- [ ] Tracking verified +- [ ] QA completed on all variants + +### During the Test + +**DO:** +- Monitor for technical issues +- Check segment quality +- Document external factors + +**DON'T:** +- Peek at results and stop early +- Make changes to variants +- Add traffic from new sources + +### The Peeking Problem +Looking at results before reaching sample size and stopping early leads to false 
positives and wrong decisions. Pre-commit to sample size and trust the process. --- ## Analyzing Results ### Statistical Significance - 95% confidence = p-value < 0.05 - Means a difference this large would occur <5% of the time if there were truly no effect - Not a guarantee—just a threshold ### Analysis Checklist 1. **Reach sample size?** If not, result is preliminary 2. **Statistically significant?** Check confidence intervals 3. **Effect size meaningful?** Compare to MDE, project impact 4. **Secondary metrics consistent?** Support the primary? 5. **Guardrail concerns?** Anything get worse? 6. **Segment differences?** Mobile vs. desktop? New vs. returning? ### Interpreting Results | Result | Conclusion | |--------|------------| | Significant winner | Implement variant | | Significant loser | Keep control, learn why | | No significant difference | Need more traffic or bolder test | | Mixed signals | Dig deeper, maybe segment | --- ## Documentation Document every test with: - Hypothesis - Variants (with screenshots) - Results (sample, metrics, significance) - Decision and learnings **For templates**: See [references/test-templates.md](references/test-templates.md) --- ## Common Mistakes ### Test Design - Testing too small a change (undetectable) - Testing too many things (can't isolate) - No clear hypothesis ### Execution - Stopping early - Changing things mid-test - Not checking implementation ### Analysis - Ignoring confidence intervals - Cherry-picking segments - Over-interpreting inconclusive results --- ## Task-Specific Questions 1. What's your current conversion rate? 2. How much traffic does this page get? 3. What change are you considering and why? 4. What's the smallest improvement worth detecting? 5. What tools do you have for testing? 6. Have you tested this area before? 
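The "statistically significant?" step in the analysis checklist above can be made concrete with a minimal two-proportion z-test. This is a rough plain-Python sketch with illustrative numbers (the visitor and conversion counts are made up); in practice, your testing tool runs the equivalent calculation for you:

```python
from statistics import NormalDist

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that both variants convert equally
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = (pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return p_b - p_a, p_value

# Illustrative: control 500/10,000 (5.0%) vs. variant 570/10,000 (5.7%)
lift, p = two_proportion_z_test(500, 10_000, 570, 10_000)
print(f"absolute lift: {lift:.2%}, p = {p:.3f}")
```

With these example numbers the test comes out significant at the 95% level, which only clears the threshold; the practical-significance and guardrail checks above still apply.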
+ +--- + +## Related Skills + +- **page-cro**: For generating test ideas based on CRO principles +- **analytics-tracking**: For setting up test measurement +- **copywriting**: For creating variant copy diff --git a/ab-test-setup/references/sample-size-guide.md b/ab-test-setup/references/sample-size-guide.md new file mode 100644 index 0000000..c934b02 --- /dev/null +++ b/ab-test-setup/references/sample-size-guide.md @@ -0,0 +1,252 @@ +# Sample Size Guide + +Reference for calculating sample sizes and test duration. + +## Sample Size Fundamentals + +### Required Inputs + +1. **Baseline conversion rate**: Your current rate +2. **Minimum detectable effect (MDE)**: Smallest change worth detecting +3. **Statistical significance level**: Usually 95% (α = 0.05) +4. **Statistical power**: Usually 80% (β = 0.20) + +### What These Mean + +**Baseline conversion rate**: If your page converts at 5%, that's your baseline. + +**MDE (Minimum Detectable Effect)**: The smallest improvement you care about detecting. Set this based on: +- Business impact (is a 5% lift meaningful?) +- Implementation cost (worth the effort?) +- Realistic expectations (what have past tests shown?) + +**Statistical significance (95%)**: Means there's less than 5% chance the observed difference is due to random chance. + +**Statistical power (80%)**: Means if there's a real effect of size MDE, you have 80% chance of detecting it. 
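The four inputs above combine in the standard two-proportion sample-size formula. Here is a rough plain-Python sketch; exact results differ slightly between calculators (and from the rounded tables below), since tools vary in the precise formula and corrections they apply. Note `relative_lift` is relative, so a 20% lift on a 5% baseline means 5% → 6%:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-proportion test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)  # rate implied by the MDE
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided, ~1.96 at 95%
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 at 80% power
    pooled = 2 * ((p1 + p2) / 2) * (1 - (p1 + p2) / 2)
    unpooled = p1 * (1 - p1) + p2 * (1 - p2)
    n = ((z_alpha * pooled ** 0.5 + z_beta * unpooled ** 0.5) / (p2 - p1)) ** 2
    return ceil(n)

# 5% baseline, detect a 20% relative lift (5% -> 6%)
print(sample_size_per_variant(0.05, 0.20))
```

The shape of the formula explains the tables that follow: required sample grows with the square of the inverse of the effect size, so halving the MDE roughly quadruples the traffic you need.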
+ +--- + +## Sample Size Quick Reference Tables + +### Conversion Rate: 1% + +| Lift to Detect | Sample per Variant | Total Sample | +|----------------|-------------------|--------------| +| 5% (1% → 1.05%) | 1,500,000 | 3,000,000 | +| 10% (1% → 1.1%) | 380,000 | 760,000 | +| 20% (1% → 1.2%) | 97,000 | 194,000 | +| 50% (1% → 1.5%) | 16,000 | 32,000 | +| 100% (1% → 2%) | 4,200 | 8,400 | + +### Conversion Rate: 3% + +| Lift to Detect | Sample per Variant | Total Sample | +|----------------|-------------------|--------------| +| 5% (3% → 3.15%) | 480,000 | 960,000 | +| 10% (3% → 3.3%) | 120,000 | 240,000 | +| 20% (3% → 3.6%) | 31,000 | 62,000 | +| 50% (3% → 4.5%) | 5,200 | 10,400 | +| 100% (3% → 6%) | 1,400 | 2,800 | + +### Conversion Rate: 5% + +| Lift to Detect | Sample per Variant | Total Sample | +|----------------|-------------------|--------------| +| 5% (5% → 5.25%) | 280,000 | 560,000 | +| 10% (5% → 5.5%) | 72,000 | 144,000 | +| 20% (5% → 6%) | 18,000 | 36,000 | +| 50% (5% → 7.5%) | 3,100 | 6,200 | +| 100% (5% → 10%) | 810 | 1,620 | + +### Conversion Rate: 10% + +| Lift to Detect | Sample per Variant | Total Sample | +|----------------|-------------------|--------------| +| 5% (10% → 10.5%) | 130,000 | 260,000 | +| 10% (10% → 11%) | 34,000 | 68,000 | +| 20% (10% → 12%) | 8,700 | 17,400 | +| 50% (10% → 15%) | 1,500 | 3,000 | +| 100% (10% → 20%) | 400 | 800 | + +### Conversion Rate: 20% + +| Lift to Detect | Sample per Variant | Total Sample | +|----------------|-------------------|--------------| +| 5% (20% → 21%) | 60,000 | 120,000 | +| 10% (20% → 22%) | 16,000 | 32,000 | +| 20% (20% → 24%) | 4,000 | 8,000 | +| 50% (20% → 30%) | 700 | 1,400 | +| 100% (20% → 40%) | 200 | 400 | + +--- + +## Duration Calculator + +### Formula + +``` +Duration (days) = (Sample per variant × Number of variants) / (Daily traffic × % exposed) +``` + +### Examples + +**Scenario 1: High-traffic page** +- Need: 10,000 per variant (2 variants = 20,000 total) +- Daily traffic: 5,000 
visitors +- 100% exposed to test +- Duration: 20,000 / 5,000 = **4 days** + +**Scenario 2: Medium-traffic page** +- Need: 30,000 per variant (60,000 total) +- Daily traffic: 2,000 visitors +- 100% exposed +- Duration: 60,000 / 2,000 = **30 days** + +**Scenario 3: Low-traffic with partial exposure** +- Need: 15,000 per variant (30,000 total) +- Daily traffic: 500 visitors +- 50% exposed to test +- Effective daily: 250 +- Duration: 30,000 / 250 = **120 days** (too long!) + +### Minimum Duration Rules + +Even with sufficient sample size, run tests for at least: +- **1 full week**: To capture day-of-week variation +- **2 business cycles**: If B2B (weekday vs. weekend patterns) +- **Through paydays**: If e-commerce (beginning/end of month) + +### Maximum Duration Guidelines + +Avoid running tests longer than 4-8 weeks: +- Novelty effects wear off +- External factors intervene +- Opportunity cost of other tests + +--- + +## Online Calculators + +### Recommended Tools + +**Evan Miller's Calculator** +https://www.evanmiller.org/ab-testing/sample-size.html +- Simple interface +- Bookmark-worthy + +**Optimizely's Calculator** +https://www.optimizely.com/sample-size-calculator/ +- Business-friendly language +- Duration estimates + +**AB Test Guide Calculator** +https://www.abtestguide.com/calc/ +- Includes Bayesian option +- Multiple test types + +**VWO Duration Calculator** +https://vwo.com/tools/ab-test-duration-calculator/ +- Duration-focused +- Good for planning + +--- + +## Adjusting for Multiple Variants + +With more than 2 variants (A/B/n tests), you need more sample: + +| Variants | Multiplier | +|----------|------------| +| 2 (A/B) | 1x | +| 3 (A/B/C) | ~1.5x | +| 4 (A/B/C/D) | ~2x | +| 5+ | Consider reducing variants | + +**Why?** More comparisons increase chance of false positives. You're comparing: +- A vs B +- A vs C +- B vs C (sometimes) + +Apply Bonferroni correction or use tools that handle this automatically. + +--- + +## Common Sample Size Mistakes + +### 1. 
Underpowered tests +**Problem**: Not enough sample to detect realistic effects +**Fix**: Be realistic about MDE, get more traffic, or don't test + +### 2. Overpowered tests +**Problem**: Waiting for sample size when you already have significance +**Fix**: This is actually fine—you committed to sample size, honor it + +### 3. Wrong baseline rate +**Problem**: Using wrong conversion rate for calculation +**Fix**: Use the specific metric and page, not site-wide averages + +### 4. Ignoring segments +**Problem**: Calculating for full traffic, then analyzing segments +**Fix**: If you plan segment analysis, calculate sample for smallest segment + +### 5. Testing too many things +**Problem**: Dividing traffic too many ways +**Fix**: Prioritize ruthlessly, run fewer concurrent tests + +--- + +## When Sample Size Requirements Are Too High + +Options when you can't get enough traffic: + +1. **Increase MDE**: Accept only detecting larger effects (20%+ lift) +2. **Lower confidence**: Use 90% instead of 95% (risky, document it) +3. **Reduce variants**: Test only the most promising variant +4. **Combine traffic**: Test across multiple similar pages +5. **Test upstream**: Test earlier in funnel where traffic is higher +6. **Don't test**: Make decision based on qualitative data instead +7. **Longer test**: Accept longer duration (weeks/months) + +--- + +## Sequential Testing + +If you must check results before reaching sample size: + +### What is it? +Statistical method that adjusts for multiple looks at data. + +### When to use +- High-risk changes +- Need to stop bad variants early +- Time-sensitive decisions + +### Tools that support it +- Optimizely (Stats Accelerator) +- VWO (SmartStats) +- PostHog (Bayesian approach) + +### Tradeoff +- More flexibility to stop early +- Slightly larger sample size requirement +- More complex analysis + +--- + +## Quick Decision Framework + +### Can I run this test? 
+ +``` +Daily traffic to page: _____ +Baseline conversion rate: _____ +MDE I care about: _____ + +Sample needed per variant: _____ (from tables above) +Days to run: Sample / Daily traffic = _____ + +If days > 60: Consider alternatives +If days > 30: Acceptable for high-impact tests +If days < 14: Likely feasible +If days < 7: Easy to run, consider running longer anyway +``` diff --git a/ab-test-setup/references/test-templates.md b/ab-test-setup/references/test-templates.md new file mode 100644 index 0000000..a504421 --- /dev/null +++ b/ab-test-setup/references/test-templates.md @@ -0,0 +1,268 @@ +# A/B Test Templates Reference + +Templates for planning, documenting, and analyzing experiments. + +## Test Plan Template + +```markdown +# A/B Test: [Name] + +## Overview +- **Owner**: [Name] +- **Test ID**: [ID in testing tool] +- **Page/Feature**: [What's being tested] +- **Planned dates**: [Start] - [End] + +## Hypothesis + +Because [observation/data], +we believe [change] +will cause [expected outcome] +for [audience]. +We'll know this is true when [metrics]. + +## Test Design + +| Element | Details | +|---------|---------| +| Test type | A/B / A/B/n / MVT | +| Duration | X weeks | +| Sample size | X per variant | +| Traffic allocation | 50/50 | +| Tool | [Tool name] | +| Implementation | Client-side / Server-side | + +## Variants + +### Control (A) +[Screenshot] +- Current experience +- [Key details about current state] + +### Variant (B) +[Screenshot or mockup] +- [Specific change #1] +- [Specific change #2] +- Rationale: [Why we think this will win] + +## Metrics + +### Primary +- **Metric**: [metric name] +- **Definition**: [how it's calculated] +- **Current baseline**: [X%] +- **Minimum detectable effect**: [X%] + +### Secondary +- [Metric 1]: [what it tells us] +- [Metric 2]: [what it tells us] +- [Metric 3]: [what it tells us] + +### Guardrails +- [Metric that shouldn't get worse] +- [Another safety metric] + +## Segment Analysis Plan +- Mobile vs. 
desktop +- New vs. returning visitors +- Traffic source +- [Other relevant segments] + +## Success Criteria +- Winner: [Primary metric improves by X% with 95% confidence] +- Loser: [Primary metric decreases significantly] +- Inconclusive: [What we'll do if no significant result] + +## Pre-Launch Checklist +- [ ] Hypothesis documented and reviewed +- [ ] Primary metric defined and trackable +- [ ] Sample size calculated +- [ ] Test duration estimated +- [ ] Variants implemented correctly +- [ ] Tracking verified in all variants +- [ ] QA completed on all variants +- [ ] Stakeholders informed +- [ ] Calendar hold for analysis date +``` + +--- + +## Results Documentation Template + +```markdown +# A/B Test Results: [Name] + +## Summary +| Element | Value | +|---------|-------| +| Test ID | [ID] | +| Dates | [Start] - [End] | +| Duration | X days | +| Result | Winner / Loser / Inconclusive | +| Decision | [What we're doing] | + +## Hypothesis (Reminder) +[Copy from test plan] + +## Results + +### Sample Size +| Variant | Target | Actual | % of target | +|---------|--------|--------|-------------| +| Control | X | Y | Z% | +| Variant | X | Y | Z% | + +### Primary Metric: [Metric Name] +| Variant | Value | 95% CI | vs. Control | +|---------|-------|--------|-------------| +| Control | X% | [X%, Y%] | — | +| Variant | X% | [X%, Y%] | +X% | + +**Statistical significance**: p = X.XX (95% = sig / not sig) +**Practical significance**: [Is this lift meaningful for the business?] + +### Secondary Metrics + +| Metric | Control | Variant | Change | Significant? | +|--------|---------|---------|--------|--------------| +| [Metric 1] | X | Y | +Z% | Yes/No | +| [Metric 2] | X | Y | +Z% | Yes/No | + +### Guardrail Metrics + +| Metric | Control | Variant | Change | Concern? | +|--------|---------|---------|--------|----------| +| [Metric 1] | X | Y | +Z% | Yes/No | + +### Segment Analysis + +**Mobile vs. 
Desktop** +| Segment | Control | Variant | Lift | +|---------|---------|---------|------| +| Mobile | X% | Y% | +Z% | +| Desktop | X% | Y% | +Z% | + +**New vs. Returning** +| Segment | Control | Variant | Lift | +|---------|---------|---------|------| +| New | X% | Y% | +Z% | +| Returning | X% | Y% | +Z% | + +## Interpretation + +### What happened? +[Explanation of results in plain language] + +### Why do we think this happened? +[Analysis and reasoning] + +### Caveats +[Any limitations, external factors, or concerns] + +## Decision + +**Winner**: [Control / Variant] + +**Action**: [Implement variant / Keep control / Re-test] + +**Timeline**: [When changes will be implemented] + +## Learnings + +### What we learned +- [Key insight 1] +- [Key insight 2] + +### What to test next +- [Follow-up test idea 1] +- [Follow-up test idea 2] + +### Impact +- **Projected lift**: [X% improvement in Y metric] +- **Business impact**: [Revenue, conversions, etc.] +``` + +--- + +## Test Repository Entry Template + +For tracking all tests in a central location: + +```markdown +| Test ID | Name | Page | Dates | Primary Metric | Result | Lift | Link | +|---------|------|------|-------|----------------|--------|------|------| +| 001 | Hero headline test | Homepage | 1/1-1/15 | CTR | Winner | +12% | [Link] | +| 002 | Pricing table layout | Pricing | 1/10-1/31 | Plan selection | Loser | -5% | [Link] | +| 003 | Signup form fields | Signup | 2/1-2/14 | Completion | Inconclusive | +2% | [Link] | +``` + +--- + +## Quick Test Brief Template + +For simple tests that don't need full documentation: + +```markdown +## [Test Name] + +**What**: [One sentence description] +**Why**: [One sentence hypothesis] +**Metric**: [Primary metric] +**Duration**: [X weeks] +**Result**: [TBD / Winner / Loser / Inconclusive] +**Learnings**: [Key takeaway] +``` + +--- + +## Stakeholder Update Template + +```markdown +## A/B Test Update: [Name] + +**Status**: Running / Complete +**Days remaining**: X (or complete) 
+**Current sample**: X% of target + +### Preliminary observations +[What we're seeing - without making decisions yet] + +### Next steps +[What happens next] + +### Timeline +- [Date]: Analysis complete +- [Date]: Decision and recommendation +- [Date]: Implementation (if winner) +``` + +--- + +## Experiment Prioritization Scorecard + +For deciding which tests to run: + +| Factor | Weight | Test A | Test B | Test C | +|--------|--------|--------|--------|--------| +| Potential impact | 30% | | | | +| Confidence in hypothesis | 25% | | | | +| Ease of implementation | 20% | | | | +| Risk if wrong | 15% | | | | +| Strategic alignment | 10% | | | | +| **Total** | | | | | + +Scoring: 1-5 (5 = best) + +--- + +## Hypothesis Bank Template + +For collecting test ideas: + +```markdown +| ID | Page/Area | Observation | Hypothesis | Potential Impact | Status | +|----|-----------|-------------|------------|------------------|--------| +| H1 | Homepage | Low scroll depth | Shorter hero will increase scroll | High | Testing | +| H2 | Pricing | Users compare plans | Comparison table will help | Medium | Backlog | +| H3 | Signup | Drop-off at email | Social login will increase completion | Medium | Backlog | +``` diff --git a/agents-md/.skillshare-meta.json b/agents-md/.skillshare-meta.json new file mode 100644 index 0000000..db4856b --- /dev/null +++ b/agents-md/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/getsentry/skills/tree/main/plugins/sentry-skills/skills/agents-md", + "type": "github-subdir", + "installed_at": "2026-01-30T02:23:31.680827073Z", + "repo_url": "https://github.com/getsentry/skills.git", + "subdir": "plugins/sentry-skills/skills/agents-md", + "version": "bb366a0" +} \ No newline at end of file diff --git a/agents-md/SKILL.md b/agents-md/SKILL.md new file mode 100644 index 0000000..957d8c7 --- /dev/null +++ b/agents-md/SKILL.md @@ -0,0 +1,111 @@ +--- +name: agents-md +description: This skill should be used when the user asks to "create 
AGENTS.md", "update AGENTS.md", "maintain agent docs", "set up CLAUDE.md", or needs to keep agent instructions concise. Guides discovery of local skills and enforces minimal documentation style. +--- + +# Maintaining AGENTS.md + +AGENTS.md is the canonical agent-facing documentation. Keep it minimal—agents are capable and don't need hand-holding. + +## File Setup + +1. Create `AGENTS.md` at project root +2. Create symlink: `ln -s AGENTS.md CLAUDE.md` + +## Before Writing + +Discover local skills to reference: + +```bash +find .claude/skills -name "SKILL.md" 2>/dev/null +ls plugins/*/skills/*/SKILL.md 2>/dev/null +``` + +Read each skill's frontmatter to understand when to reference it. + +## Writing Rules + +- **Headers + bullets** - No paragraphs +- **Code blocks** - For commands and templates +- **Reference, don't duplicate** - Point to skills: "Use `db-migrate` skill. See `.claude/skills/db-migrate/SKILL.md`" +- **No filler** - No intros, conclusions, or pleasantries +- **Trust capabilities** - Omit obvious context + +## Required Sections + +### Package Manager +Which tool and key commands only: +```markdown +## Package Manager +Use **pnpm**: `pnpm install`, `pnpm dev`, `pnpm test` +``` + +### Commit Attribution +Always include this section. Agents should use their own identity: +```markdown +## Commit Attribution +AI commits MUST include: +``` +Co-Authored-By: (the agent model's name and attribution byline) +``` +Example: `Co-Authored-By: Claude Sonnet 4 ` +``` + +### Key Conventions +Project-specific patterns agents must follow. Keep brief. + +### Local Skills +Reference each discovered skill: +```markdown +## Database +Use `db-migrate` skill for schema changes. See `.claude/skills/db-migrate/SKILL.md` + +## Testing +Use `write-tests` skill. 
See `.claude/skills/write-tests/SKILL.md` +``` + +## Optional Sections + +Add only if truly needed: +- API route patterns (show template, not explanation) +- CLI commands (table format) +- File naming conventions + +## Anti-Patterns + +Omit these: +- "Welcome to..." or "This document explains..." +- "You should..." or "Remember to..." +- Content duplicated from skills (reference instead) +- Obvious instructions ("run tests", "write clean code") +- Explanations of why (just say what) +- Long prose paragraphs + +## Example Structure + +```markdown +# Agent Instructions + +## Package Manager +Use **pnpm**: `pnpm install`, `pnpm dev` + +## Commit Attribution +AI commits MUST include: +``` +Co-Authored-By: (the agent model's name and attribution byline) +``` + +## API Routes +[Template code block] + +## Database +Use `db-migrate` skill. See `.claude/skills/db-migrate/SKILL.md` + +## Testing +Use `write-tests` skill. See `.claude/skills/write-tests/SKILL.md` + +## CLI +| Command | Description | +|---------|-------------| +| `pnpm cli sync` | Sync data | +``` diff --git a/agents-sdk/.skillshare-meta.json b/agents-sdk/.skillshare-meta.json new file mode 100644 index 0000000..4048bd8 --- /dev/null +++ b/agents-sdk/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/cloudflare/skills/tree/main/skills/agents-sdk", + "type": "github-subdir", + "installed_at": "2026-01-30T02:29:59.573967009Z", + "repo_url": "https://github.com/cloudflare/skills.git", + "subdir": "skills/agents-sdk", + "version": "75a603b" +} \ No newline at end of file diff --git a/agents-sdk/SKILL.md b/agents-sdk/SKILL.md new file mode 100644 index 0000000..841a9b4 --- /dev/null +++ b/agents-sdk/SKILL.md @@ -0,0 +1,160 @@ +--- +name: agents-sdk +description: Build stateful AI agents using the Cloudflare Agents SDK. Load when creating agents with persistent state, scheduling, RPC, MCP servers, email handling, or streaming chat. 
Covers Agent class, AIChatAgent, state management, and Code Mode for reduced token usage. +--- + +# Cloudflare Agents SDK + +Build persistent, stateful AI agents on Cloudflare Workers using the `agents` npm package. + +## FIRST: Verify Installation + +```bash +npm install agents +``` + +Agents require a binding in `wrangler.jsonc`: + +```jsonc +{ + "durable_objects": { + // "class_name" must match your Agent class name exactly + "bindings": [{ "name": "Counter", "class_name": "Counter" }] + }, + "migrations": [ + // Required: list all Agent classes for SQLite storage + { "tag": "v1", "new_sqlite_classes": ["Counter"] } + ] +} +``` + +## Choosing an Agent Type + +| Use Case | Base Class | Package | +|----------|------------|---------| +| Custom state + RPC, no chat | `Agent` | `agents` | +| Chat with message persistence | `AIChatAgent` | `@cloudflare/ai-chat` | +| Building an MCP server | `McpAgent` | `agents/mcp` | + +## Key Concepts + +- **Agent** base class provides state, scheduling, RPC, MCP, and email capabilities +- **AIChatAgent** adds streaming chat with automatic message persistence and resumable streams +- **Code Mode** generates executable code instead of tool calls—reduces token usage significantly +- **this.state / this.setState()** - automatic persistence to SQLite, broadcasts to clients +- **this.schedule()** - schedule tasks at Date, delay (seconds), or cron expression +- **@callable** decorator - expose methods to clients via WebSocket RPC + +## Quick Reference + +| Task | API | +|------|-----| +| Persist state | `this.setState({ count: 1 })` | +| Read state | `this.state.count` | +| Schedule task | `this.schedule(60, "taskMethod", payload)` | +| Schedule cron | `this.schedule("0 * * * *", "hourlyTask")` | +| Cancel schedule | `this.cancelSchedule(id)` | +| Queue task | `this.queue("processItem", payload)` | +| SQL query | `` this.sql`SELECT * FROM users WHERE id = ${id}` `` | +| RPC method | `@callable() async myMethod() { ... 
}` | +| Streaming RPC | `@callable({ streaming: true }) async stream(res) { ... }` | + +## Minimal Agent + +```typescript +import { Agent, routeAgentRequest, callable } from "agents"; + +type State = { count: number }; + +export class Counter extends Agent { + initialState = { count: 0 }; + + @callable() + increment() { + this.setState({ count: this.state.count + 1 }); + return this.state.count; + } +} + +export default { + fetch: (req, env) => routeAgentRequest(req, env) ?? new Response("Not found", { status: 404 }) +}; +``` + +## Streaming Chat Agent + +Use `AIChatAgent` for chat with automatic message persistence and resumable streaming. + +**Install additional dependencies first:** +```bash +npm install @cloudflare/ai-chat ai @ai-sdk/openai +``` + +**Add wrangler.jsonc config** (same pattern as base Agent): +```jsonc +{ + "durable_objects": { + "bindings": [{ "name": "Chat", "class_name": "Chat" }] + }, + "migrations": [{ "tag": "v1", "new_sqlite_classes": ["Chat"] }] +} +``` + +```typescript +import { AIChatAgent } from "@cloudflare/ai-chat"; +import { routeAgentRequest } from "agents"; +import { streamText, convertToModelMessages } from "ai"; +import { openai } from "@ai-sdk/openai"; + +export class Chat extends AIChatAgent { + async onChatMessage(onFinish) { + const result = streamText({ + model: openai("gpt-4o"), + messages: await convertToModelMessages(this.messages), + onFinish + }); + return result.toUIMessageStreamResponse(); + } +} + +export default { + fetch: (req, env) => routeAgentRequest(req, env) ?? 
new Response("Not found", { status: 404 }) +}; +``` + +**Client** (React): +```tsx +import { useAgent } from "agents/react"; +import { useAgentChat } from "@cloudflare/ai-chat/react"; + +const agent = useAgent({ agent: "Chat", name: "my-chat" }); +const { messages, input, handleSubmit } = useAgentChat({ agent }); +``` + +## Detailed References + +- **[references/state-scheduling.md](references/state-scheduling.md)** - State persistence, scheduling, queues +- **[references/streaming-chat.md](references/streaming-chat.md)** - AIChatAgent, resumable streams, UI patterns +- **[references/codemode.md](references/codemode.md)** - Generate code instead of tool calls (token savings) +- **[references/mcp.md](references/mcp.md)** - MCP server integration +- **[references/email.md](references/email.md)** - Email routing and handling + +## When to Use Code Mode + +Code Mode generates executable JavaScript instead of making individual tool calls. Use it when: + +- Chaining multiple tool calls in sequence +- Complex conditional logic across tools +- MCP server orchestration (multiple servers) +- Token budget is constrained + +See [references/codemode.md](references/codemode.md) for setup and examples. + +## Best Practices + +1. **Prefer streaming**: Use `streamText` and `toUIMessageStreamResponse()` for chat +2. **Use AIChatAgent for chat**: Handles message persistence and resumable streams automatically +3. **Type your state**: `Agent` ensures type safety for `this.state` +4. **Use @callable for RPC**: Cleaner than manual WebSocket message handling +5. **Code Mode for complex workflows**: Reduces round-trips and token usage +6. 
**Schedule vs Queue**: Use `schedule()` for time-based, `queue()` for sequential processing diff --git a/agents-sdk/references/codemode.md b/agents-sdk/references/codemode.md new file mode 100644 index 0000000..f95ab38 --- /dev/null +++ b/agents-sdk/references/codemode.md @@ -0,0 +1,207 @@ +# Code Mode (Experimental) + +Code Mode generates executable JavaScript instead of making individual tool calls. This significantly reduces token usage and enables complex multi-tool workflows. + +## Why Code Mode? + +Traditional tool calling: +- One tool call per LLM request +- Multiple round-trips for chained operations +- High token usage for complex workflows + +Code Mode: +- LLM generates code that orchestrates multiple tools +- Single execution for complex workflows +- Self-debugging and error recovery +- Ideal for MCP server orchestration + +## Setup + +### 1. Wrangler Config + +```jsonc +{ + "name": "my-agent-worker", + "compatibility_flags": ["experimental", "enable_ctx_exports"], + "durable_objects": { + // "class_name" must match your Agent class name exactly + "bindings": [{ "name": "MyAgent", "class_name": "MyAgent" }] + }, + "migrations": [ + // Required: list all Agent classes for SQLite storage + { "tag": "v1", "new_sqlite_classes": ["MyAgent"] } + ], + "services": [ + { + "binding": "globalOutbound", + // "service" must match "name" above + "service": "my-agent-worker", + "entrypoint": "globalOutbound" + }, + { + "binding": "CodeModeProxy", + "service": "my-agent-worker", + "entrypoint": "CodeModeProxy" + } + ], + "worker_loaders": [{ "binding": "LOADER" }] +} +``` + +### 2. Export Required Classes + +```typescript +// Export the proxy for tool execution (required for codemode) +export { CodeModeProxy } from "@cloudflare/codemode/ai"; + +// Define outbound fetch handler for security filtering +export const globalOutbound = { + fetch: async (input: string | URL | RequestInfo, init?: RequestInit) => { + const url = new URL( + typeof input === "string" + ? 
input + : typeof input === "object" && "url" in input + ? input.url + : input.toString() + ); + // Block certain domains if needed + if (url.hostname === "blocked.example.com") { + return new Response("Not allowed", { status: 403 }); + } + return fetch(input, init); + } +}; +``` + +### 3. Install Dependencies + +```bash +npm install @cloudflare/codemode ai @ai-sdk/openai zod +``` + +### 4. Use Code Mode in Agent + +```typescript +import { Agent } from "agents"; +import { experimental_codemode as codemode } from "@cloudflare/codemode/ai"; +import { streamText, tool, convertToModelMessages } from "ai"; +import { openai } from "@ai-sdk/openai"; +import { env } from "cloudflare:workers"; +import { z } from "zod"; + +const tools = { + getWeather: tool({ + description: "Get weather for a location", + parameters: z.object({ location: z.string() }), + execute: async ({ location }) => `Weather: ${location} 72°F` + }), + sendEmail: tool({ + description: "Send an email", + parameters: z.object({ to: z.string(), subject: z.string(), body: z.string() }), + execute: async ({ to, subject, body }) => `Email sent to ${to}` + }) +}; + +export class MyAgent extends Agent { + tools = {}; + + // Method called by codemode proxy + callTool(functionName: string, args: unknown[]) { + return this.tools[functionName]?.execute?.(args, { + abortSignal: new AbortController().signal, + toolCallId: "codemode", + messages: [] + }); + } + + async onChatMessage() { + this.tools = { ...tools, ...this.mcp.getAITools() }; + + const { prompt, tools: wrappedTools } = await codemode({ + prompt: "You are a helpful assistant...", + tools: this.tools, + globalOutbound: env.globalOutbound, + loader: env.LOADER, + proxy: this.ctx.exports.CodeModeProxy({ + props: { + binding: "MyAgent", // Class name + name: this.name, // Instance name + callback: "callTool" // Method to call + } + }) + }); + + const result = streamText({ + system: prompt, + model: openai("gpt-4o"), + messages: await 
convertToModelMessages(this.state.messages), + tools: wrappedTools // Use wrapped tools, not original + }); + + // ... handle stream + } +} +``` + +## Generated Code Example + +When user asks "Check the weather in NYC and email me the forecast", codemode generates: + +```javascript +async function executeTask() { + const weather = await codemode.getWeather({ location: "NYC" }); + + await codemode.sendEmail({ + to: "user@example.com", + subject: "NYC Weather Forecast", + body: `Current weather: ${weather}` + }); + + return { success: true, weather }; +} +``` + +## MCP Server Orchestration + +Code Mode excels at orchestrating multiple MCP servers: + +```javascript +async function executeTask() { + // Query file system MCP + const files = await codemode.listFiles({ path: "/projects" }); + + // Query database MCP + const status = await codemode.queryDatabase({ + query: "SELECT * FROM projects WHERE name = ?", + params: [files[0].name] + }); + + // Conditional logic based on results + if (status.length === 0) { + await codemode.createTask({ + title: `Review: ${files[0].name}`, + priority: "high" + }); + } + + return { files, status }; +} +``` + +## When to Use + +| Scenario | Use Code Mode? | +|----------|---------------| +| Single tool call | No | +| Chained tool calls | Yes | +| Conditional logic across tools | Yes | +| MCP multi-server workflows | Yes | +| Token budget constrained | Yes | +| Simple Q&A chat | No | + +## Limitations + +- Experimental - API may change +- Requires Cloudflare Workers +- JavaScript execution only (Python planned) +- Requires additional wrangler config diff --git a/agents-sdk/references/email.md b/agents-sdk/references/email.md new file mode 100644 index 0000000..e465ecc --- /dev/null +++ b/agents-sdk/references/email.md @@ -0,0 +1,119 @@ +# Email Handling + +Agents can receive and reply to emails via Cloudflare Email Routing. 
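Before wiring things up, it helps to see the routing model: the recipient address (or a header) selects which agent class and instance handle the message. The sketch below is a hypothetical illustration of address-based resolution, where the local part of the address becomes the agent instance id. It mirrors the idea behind a resolver like `createAddressBasedEmailResolver`, but it is not the SDK's actual implementation.

```typescript
// Illustration only: how an address-based resolver might pick an agent.
// The local part ("billing" in "billing@yourdomain.com") becomes the
// instance id, so each address maps to its own agent instance.
type EmailRoute = { agentName: string; agentId: string } | null;

function resolveByAddress(to: string, agentName: string): EmailRoute {
  const at = to.indexOf("@");
  if (at <= 0) return null; // not a routable address
  const localPart = to.slice(0, at);
  return { agentName, agentId: localPart };
}
```

With a convention like `invoice-123@yourdomain.com`, each conversation thread lands on its own agent instance, which keeps per-thread state isolated.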
+ +## Wrangler Config + +```jsonc +{ + "durable_objects": { + "bindings": [{ "name": "EmailAgent", "class_name": "EmailAgent" }] + }, + "migrations": [{ "tag": "v1", "new_sqlite_classes": ["EmailAgent"] }], + "send_email": [ + { "name": "SEB", "destination_address": "reply@yourdomain.com" } + ] +} +``` + +Configure Email Routing in Cloudflare dashboard to forward to your Worker. + +## Implement onEmail + +```typescript +import { Agent, AgentEmail } from "agents"; +import PostalMime from "postal-mime"; + +type State = { emails: Array<{ from: string; subject: string; text: string; timestamp: Date }> }; + +export class EmailAgent extends Agent { + initialState: State = { emails: [] }; + + async onEmail(email: AgentEmail) { + console.log("From:", email.from); + console.log("To:", email.to); + console.log("Subject:", email.headers.get("subject")); + + // Get raw email content + const raw = await email.getRaw(); + + // Parse with postal-mime + const parsed = await PostalMime.parse(raw); + + // Update state + this.setState({ + emails: [...this.state.emails, { + from: email.from, + subject: parsed.subject ?? "", + text: parsed.text ?? "", + timestamp: new Date() + }] + }); + + // Reply + await this.replyToEmail(email, { + fromName: "My Agent", + subject: `Re: ${email.headers.get("subject")}`, + body: "Thanks for your email! I'll process it shortly.", + contentType: "text/plain" + }); + } +} +``` + +**Install postal-mime for parsing:** +```bash +npm install postal-mime +``` + +## Route Emails to Agent + +```typescript +import { routeAgentRequest, routeAgentEmail, createAddressBasedEmailResolver } from "agents"; + +export default { + async email(message, env) { + await routeAgentEmail(message, env, { + resolver: createAddressBasedEmailResolver("EmailAgent") + }); + }, + + async fetch(request, env) { + return routeAgentRequest(request, env) ?? 
new Response("Not found", { status: 404 }); + } +}; +``` + +## Custom Email Resolvers + +### Header-Based Resolver + +Routes based on X-Agent headers in replies: + +```typescript +import { createHeaderBasedEmailResolver } from "agents"; + +await routeAgentEmail(message, env, { + resolver: createHeaderBasedEmailResolver() +}); +``` + +### Custom Resolver + +```typescript +const customResolver = async (email, env) => { + // Parse recipient to determine agent + const [localPart] = email.to.split("@"); + + if (localPart.startsWith("support-")) { + return { + agentName: "SupportAgent", + agentId: localPart.replace("support-", "") + }; + } + + return null; // Don't route +}; + +await routeAgentEmail(message, env, { resolver: customResolver }); +``` diff --git a/agents-sdk/references/mcp.md b/agents-sdk/references/mcp.md new file mode 100644 index 0000000..7bc7cbb --- /dev/null +++ b/agents-sdk/references/mcp.md @@ -0,0 +1,153 @@ +# MCP Server Integration + +Agents include a multi-server MCP client for connecting to external MCP servers. 
+ +## Add an MCP Server + +```typescript +import { Agent, callable } from "agents"; + +export class MyAgent extends Agent { + @callable() + async addServer(name: string, url: string) { + const result = await this.addMcpServer( + name, + url, + "https://my-worker.workers.dev", // callback host for OAuth + "agents" // routing prefix + ); + + if (result.state === "authenticating") { + // OAuth required - redirect user to result.authUrl + return { needsAuth: true, authUrl: result.authUrl }; + } + + return { ready: true, id: result.id }; + } +} +``` + +## Use MCP Tools + +```typescript +async onChatMessage() { + // Get AI-compatible tools from all connected MCP servers + const mcpTools = this.mcp.getAITools(); + + const allTools = { + ...localTools, + ...mcpTools + }; + + const result = streamText({ + model: openai("gpt-4o"), + messages: await convertToModelMessages(this.messages), + tools: allTools + }); + + return result.toUIMessageStreamResponse(); +} +``` + +## List MCP Resources + +```typescript +// List all registered servers +const servers = this.mcp.listServers(); + +// List tools from all servers +const tools = this.mcp.listTools(); + +// List resources +const resources = this.mcp.listResources(); + +// List prompts +const prompts = this.mcp.listPrompts(); +``` + +## Remove Server + +```typescript +await this.removeMcpServer(serverId); +``` + +## Building an MCP Server + +Use `McpAgent` from the SDK to create an MCP server. 
+ +**Install dependencies:** +```bash +npm install @modelcontextprotocol/sdk zod +``` + +**Wrangler config:** +```jsonc +{ + "durable_objects": { + "bindings": [{ "name": "MyMCP", "class_name": "MyMCP" }] + }, + "migrations": [{ "tag": "v1", "new_sqlite_classes": ["MyMCP"] }] +} +``` + +**Server implementation:** +```typescript +import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; +import { McpAgent } from "agents/mcp"; +import { z } from "zod"; + +type State = { counter: number }; + +export class MyMCP extends McpAgent { + server = new McpServer({ + name: "MyMCPServer", + version: "1.0.0" + }); + + initialState = { counter: 0 }; + + async init() { + // Register a resource + this.server.resource("counter", "mcp://resource/counter", (uri) => ({ + contents: [{ text: String(this.state.counter), uri: uri.href }] + })); + + // Register a tool + this.server.registerTool( + "increment", + { + description: "Increment the counter", + inputSchema: { amount: z.number().default(1) } + }, + async ({ amount }) => { + this.setState({ counter: this.state.counter + amount }); + return { + content: [{ text: `Counter: ${this.state.counter}`, type: "text" }] + }; + } + ); + } +} +``` + +## Serve MCP Server + +```typescript +export default { + fetch(request: Request, env: Env, ctx: ExecutionContext) { + const url = new URL(request.url); + + // SSE transport (legacy) + if (url.pathname.startsWith("/sse")) { + return MyMCP.serveSSE("/sse", { binding: "MyMCP" }).fetch(request, env, ctx); + } + + // Streamable HTTP transport (recommended) + if (url.pathname.startsWith("/mcp")) { + return MyMCP.serve("/mcp", { binding: "MyMCP" }).fetch(request, env, ctx); + } + + return new Response("Not found", { status: 404 }); + } +}; +``` diff --git a/agents-sdk/references/state-scheduling.md b/agents-sdk/references/state-scheduling.md new file mode 100644 index 0000000..2933701 --- /dev/null +++ b/agents-sdk/references/state-scheduling.md @@ -0,0 +1,208 @@ +# State, Scheduling & Queues 
+ +## State Management + +State persists automatically to SQLite and broadcasts to connected clients. + +### Define Typed State + +```typescript +type State = { + count: number; + items: string[]; + lastUpdated: Date; +}; + +export class MyAgent extends Agent<Env, State> { + initialState: State = { + count: 0, + items: [], + lastUpdated: new Date() + }; +} +``` + +### Read and Update State + +```typescript +// Read (lazy-loaded from SQLite on first access) +const count = this.state.count; + +// Update (persists to SQLite, broadcasts to clients) +this.setState({ + ...this.state, + count: this.state.count + 1 +}); +``` + +### React to State Changes + +```typescript +onStateUpdate(state: State, source: Connection | "server") { + if (source !== "server") { + // Client updated state via WebSocket + console.log("Client update:", state); + } +} +``` + +### Client-Side State Sync (React) + +```tsx +import { useAgent } from "agents/react"; +import { useState } from "react"; + +function App() { + const [state, setLocalState] = useState({ count: 0 }); + + const agent = useAgent({ + agent: "MyAgent", + name: "instance-1", + onStateUpdate: (newState) => setLocalState(newState) + }); + + const increment = () => { + agent.setState({ ...state, count: state.count + 1 }); + }; + + return <button onClick={increment}>Count: {state.count}</button>; +} +``` + +The `onStateUpdate` callback receives state changes from the server. Use local React state to store and render the synced state. + +## Scheduling + +Schedule methods to run at specific times using `this.schedule()`.
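The cron form accepted by `schedule()` uses standard five-field expressions (minute, hour, day-of-month, month, day-of-week). As a rough mental model, here is a toy matcher for the subset used in this guide (`*`, plain numbers, and `1-5` style ranges); the real scheduler supports full cron syntax, so treat this only as an illustration of how the fields map to times.

```typescript
// Toy cron matcher: supports "*", single numbers, and "a-b" ranges.
// Field order: minute, hour, day-of-month, month (1-12), day-of-week (0-6).
function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  const range = field.match(/^(\d+)-(\d+)$/);
  if (range) return value >= Number(range[1]) && value <= Number(range[2]);
  return Number(field) === value;
}

function cronMatches(expr: string, date: Date): boolean {
  const [min, hour, dom, month, dow] = expr.split(/\s+/);
  return (
    fieldMatches(min, date.getUTCMinutes()) &&
    fieldMatches(hour, date.getUTCHours()) &&
    fieldMatches(dom, date.getUTCDate()) &&
    fieldMatches(month, date.getUTCMonth() + 1) &&
    fieldMatches(dow, date.getUTCDay())
  );
}
```

So `"0 9 * * 1-5"` fires when the minute is 0, the hour is 9, and the day of week falls in Monday through Friday.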
+ +### Schedule Types + +```typescript +// At specific Date +await this.schedule(new Date("2025-12-25T00:00:00Z"), "sendGreeting", { to: "user" }); + +// Delay in seconds +await this.schedule(60, "checkStatus", { id: "abc123" }); // 1 minute + +// Cron expression (recurring) +await this.schedule("0 * * * *", "hourlyCleanup", {}); // Every hour +await this.schedule("0 9 * * 1-5", "weekdayReport", {}); // 9am weekdays +``` + +### Schedule Handler + +```typescript +export class MyAgent extends Agent { + async sendGreeting(payload: { to: string }, schedule: Schedule) { + console.log(`Sending greeting to ${payload.to}`); + // Cron schedules automatically reschedule; one-time schedules are deleted + } +} +``` + +### Manage Schedules + +```typescript +// Get all schedules +const schedules = this.getSchedules(); + +// Get by type +const crons = this.getSchedules({ type: "cron" }); + +// Get by time range +const upcoming = this.getSchedules({ + timeRange: { start: new Date(), end: nextWeek } +}); + +// Cancel +await this.cancelSchedule(schedule.id); +``` + +## Task Queue + +Process tasks sequentially with automatic dequeue on success. 
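To make "sequential with automatic dequeue on success" concrete, here is a minimal in-memory model of those semantics (an illustration only, not the SDK's implementation): tasks run one at a time, a task is removed only after its handler resolves, and a throwing handler halts processing with the failed task still at the head of the queue.

```typescript
// Conceptual model of sequential queue semantics: dequeue on success,
// retain on failure. Not the SDK's implementation.
type Task<T> = { id: string; payload: T };

class SequentialQueue<T> {
  private tasks: Task<T>[] = [];
  constructor(private handler: (payload: T) => Promise<void>) {}

  enqueue(task: Task<T>) {
    this.tasks.push(task);
  }

  // Process tasks in order; returns ids of tasks that failed.
  async drain(): Promise<string[]> {
    const failed: string[] = [];
    while (this.tasks.length > 0) {
      const task = this.tasks[0];
      try {
        await this.handler(task.payload);
        this.tasks.shift(); // dequeued only on success
      } catch {
        failed.push(task.id);
        break; // stop; the failed task stays queued
      }
    }
    return failed;
  }
}
```

This is the contrast drawn elsewhere in this skill: `schedule()` is for time-based work, while `queue()` is for strictly ordered processing like this.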
+ +### Queue a Task + +```typescript +await this.queue("processItem", { itemId: "123", priority: "high" }); +``` + +### Queue Handler + +```typescript +async processItem(payload: { itemId: string }, queueItem: QueueItem) { + const item = await fetchItem(payload.itemId); + await processItem(item); + // Task automatically dequeued on success +} +``` + +### Queue Operations + +```typescript +// Manual dequeue +await this.dequeue(queueItem.id); + +// Dequeue all +await this.dequeueAll(); + +// Dequeue by callback +await this.dequeueAllByCallback("processItem"); + +// Query queue +const pending = await this.getQueues("priority", "high"); +``` + +## SQL API + +Direct SQLite access for custom queries: + +```typescript +// Create table +this.sql` + CREATE TABLE IF NOT EXISTS items ( + id TEXT PRIMARY KEY, + name TEXT, + created_at INTEGER DEFAULT (unixepoch()) + ) +`; + +// Insert with params +this.sql`INSERT INTO items (id, name) VALUES (${id}, ${name})`; + +// Query with types +const items = this.sql<{ id: string; name: string }>` + SELECT * FROM items WHERE name LIKE ${`%${search}%`} +`; +``` + +## Lifecycle Callbacks + +```typescript +export class MyAgent extends Agent { + // Called when agent starts (after hibernation or first create) + async onStart() { + console.log("Agent started:", this.name); + } + + // WebSocket connected + onConnect(conn: Connection, ctx: ConnectionContext) { + console.log("Client connected:", conn.id); + } + + // WebSocket message (non-RPC) + onMessage(conn: Connection, message: WSMessage) { + console.log("Received:", message); + } + + // State changed + onStateUpdate(state: State, source: Connection | "server") {} + + // Error handler + onError(error: unknown) { + console.error("Agent error:", error); + throw error; // Re-throw to propagate + } +} +``` diff --git a/agents-sdk/references/streaming-chat.md b/agents-sdk/references/streaming-chat.md new file mode 100644 index 0000000..c819c81 --- /dev/null +++ 
b/agents-sdk/references/streaming-chat.md @@ -0,0 +1,176 @@ +# Streaming Chat with AIChatAgent + +`AIChatAgent` provides streaming chat with automatic message persistence and resumable streams. + +## Basic Chat Agent + +```typescript +import { AIChatAgent } from "@cloudflare/ai-chat"; +import { streamText, convertToModelMessages } from "ai"; +import { openai } from "@ai-sdk/openai"; + +export class Chat extends AIChatAgent { + async onChatMessage(onFinish) { + const result = streamText({ + model: openai("gpt-4o"), + messages: await convertToModelMessages(this.messages), + onFinish + }); + return result.toUIMessageStreamResponse(); + } +} +``` + +## With Custom System Prompt + +```typescript +export class Chat extends AIChatAgent { + async onChatMessage(onFinish) { + const result = streamText({ + model: openai("gpt-4o"), + system: "You are a helpful assistant specializing in...", + messages: await convertToModelMessages(this.messages), + onFinish + }); + return result.toUIMessageStreamResponse(); + } +} +``` + +## With Tools + +```typescript +import { tool } from "ai"; +import { z } from "zod"; + +const tools = { + getWeather: tool({ + description: "Get weather for a location", + parameters: z.object({ location: z.string() }), + execute: async ({ location }) => `Weather in ${location}: 72°F, sunny` + }) +}; + +export class Chat extends AIChatAgent { + async onChatMessage(onFinish) { + const result = streamText({ + model: openai("gpt-4o"), + messages: await convertToModelMessages(this.messages), + tools, + onFinish + }); + return result.toUIMessageStreamResponse(); + } +} +``` + +## Custom UI Message Stream + +For more control, use `createUIMessageStream`: + +```typescript +import { createUIMessageStream, createUIMessageStreamResponse } from "ai"; + +export class Chat extends AIChatAgent { + async onChatMessage(onFinish) { + const stream = createUIMessageStream({ + execute: async ({ writer }) => { + const result = streamText({ + model: openai("gpt-4o"), + messages: 
await convertToModelMessages(this.messages), + onFinish + }); + writer.merge(result.toUIMessageStream()); + } + }); + return createUIMessageStreamResponse({ stream }); + } +} +``` + +## Resumable Streaming + +Streams automatically resume if client disconnects and reconnects: + +1. Chunks buffered to SQLite during streaming +2. On reconnect, buffered chunks sent immediately +3. Live streaming continues from where it left off + +**Enabled by default.** To disable: + +```tsx +const { messages } = useAgentChat({ agent, resume: false }); +``` + +## React Client + +```tsx +import { useAgent } from "agents/react"; +import { useAgentChat } from "@cloudflare/ai-chat/react"; + +function ChatUI() { + const agent = useAgent({ + agent: "Chat", + name: "my-chat-session" + }); + + const { + messages, + input, + handleInputChange, + handleSubmit, + status + } = useAgentChat({ agent }); + + return ( +
+      <div>
+        {messages.map((m) => (
+          <div key={m.id}>
+            {m.role}: {m.content}
+          </div>
+        ))}
+
+        <form onSubmit={handleSubmit}>
+          <input value={input} onChange={handleInputChange} />
+          <button type="submit" disabled={status !== "ready"}>
+            Send
+          </button>
+        </form>
+      </div>
+ ); +} +``` + +## Streaming RPC Methods + +For non-chat streaming, use `@callable({ streaming: true })`: + +```typescript +import { Agent, callable, StreamingResponse } from "agents"; + +export class MyAgent extends Agent { + @callable({ streaming: true }) + async streamData(stream: StreamingResponse, query: string) { + for (let i = 0; i < 10; i++) { + stream.send(`Result ${i}: ${query}`); + await sleep(100); + } + stream.close(); + } +} +``` + +Client receives streamed messages via WebSocket RPC. + +## Status Values + +`useAgentChat` status: + +| Status | Meaning | +|--------|---------| +| `ready` | Idle, ready for input | +| `streaming` | Response streaming | +| `submitted` | Request sent, waiting | +| `error` | Error occurred | diff --git a/algorithmic-art/.skillshare-meta.json b/algorithmic-art/.skillshare-meta.json new file mode 100644 index 0000000..cfb2cbc --- /dev/null +++ b/algorithmic-art/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/anthropics/skills/tree/main/skills/algorithmic-art", + "type": "github-subdir", + "installed_at": "2026-01-30T02:17:16.089806916Z", + "repo_url": "https://github.com/anthropics/skills.git", + "subdir": "skills/algorithmic-art", + "version": "69c0b1a" +} \ No newline at end of file diff --git a/algorithmic-art/LICENSE.txt b/algorithmic-art/LICENSE.txt new file mode 100644 index 0000000..7a4a3ea --- /dev/null +++ b/algorithmic-art/LICENSE.txt @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. \ No newline at end of file diff --git a/algorithmic-art/SKILL.md b/algorithmic-art/SKILL.md new file mode 100644 index 0000000..634f6fa --- /dev/null +++ b/algorithmic-art/SKILL.md @@ -0,0 +1,405 @@ +--- +name: algorithmic-art +description: Creating algorithmic art using p5.js with seeded randomness and interactive parameter exploration. Use this when users request creating art using code, generative art, algorithmic art, flow fields, or particle systems. Create original algorithmic art rather than copying existing artists' work to avoid copyright violations. +license: Complete terms in LICENSE.txt +--- + +Algorithmic philosophies are computational aesthetic movements that are then expressed through code. Output .md files (philosophy), .html files (interactive viewer), and .js files (generative algorithms). 
+ +This happens in two steps: +1. Algorithmic Philosophy Creation (.md file) +2. Express by creating p5.js generative art (.html + .js files) + +First, undertake this task: + +## ALGORITHMIC PHILOSOPHY CREATION + +To begin, create an ALGORITHMIC PHILOSOPHY (not static images or templates) that will be interpreted through: +- Computational processes, emergent behavior, mathematical beauty +- Seeded randomness, noise fields, organic systems +- Particles, flows, fields, forces +- Parametric variation and controlled chaos + +### THE CRITICAL UNDERSTANDING +- What is received: Some subtle input or instructions by the user to take into account, but use as a foundation; it should not constrain creative freedom. +- What is created: An algorithmic philosophy/generative aesthetic movement. +- What happens next: The same version receives the philosophy and EXPRESSES IT IN CODE - creating p5.js sketches that are 90% algorithmic generation, 10% essential parameters. + +Consider this approach: +- Write a manifesto for a generative art movement +- The next phase involves writing the algorithm that brings it to life + +The philosophy must emphasize: Algorithmic expression. Emergent behavior. Computational beauty. Seeded variation. + +### HOW TO GENERATE AN ALGORITHMIC PHILOSOPHY + +**Name the movement** (1-2 words): "Organic Turbulence" / "Quantum Harmonics" / "Emergent Stillness" + +**Articulate the philosophy** (4-6 paragraphs - concise but complete): + +To capture the ALGORITHMIC essence, express how this philosophy manifests through: +- Computational processes and mathematical relationships? +- Noise functions and randomness patterns? +- Particle behaviors and field dynamics? +- Temporal evolution and system states? +- Parametric variation and emergent complexity? + +**CRITICAL GUIDELINES:** +- **Avoid redundancy**: Each algorithmic aspect should be mentioned once. 
Avoid repeating concepts about noise theory, particle dynamics, or mathematical principles unless adding new depth. +- **Emphasize craftsmanship REPEATEDLY**: The philosophy MUST stress multiple times that the final algorithm should appear as though it took countless hours to develop, was refined with care, and comes from someone at the absolute top of their field. This framing is essential - repeat phrases like "meticulously crafted algorithm," "the product of deep computational expertise," "painstaking optimization," "master-level implementation." +- **Leave creative space**: Be specific about the algorithmic direction, but concise enough that the next Claude has room to make interpretive implementation choices at an extremely high level of craftsmanship. + +The philosophy must guide the next version to express ideas ALGORITHMICALLY, not through static images. Beauty lives in the process, not the final frame. + +### PHILOSOPHY EXAMPLES + +**"Organic Turbulence"** +Philosophy: Chaos constrained by natural law, order emerging from disorder. +Algorithmic expression: Flow fields driven by layered Perlin noise. Thousands of particles following vector forces, their trails accumulating into organic density maps. Multiple noise octaves create turbulent regions and calm zones. Color emerges from velocity and density - fast particles burn bright, slow ones fade to shadow. The algorithm runs until equilibrium - a meticulously tuned balance where every parameter was refined through countless iterations by a master of computational aesthetics. + +**"Quantum Harmonics"** +Philosophy: Discrete entities exhibiting wave-like interference patterns. +Algorithmic expression: Particles initialized on a grid, each carrying a phase value that evolves through sine waves. When particles are near, their phases interfere - constructive interference creates bright nodes, destructive creates voids. Simple harmonic motion generates complex emergent mandalas. 
The result of painstaking frequency calibration where every ratio was carefully chosen to produce resonant beauty. + +**"Recursive Whispers"** +Philosophy: Self-similarity across scales, infinite depth in finite space. +Algorithmic expression: Branching structures that subdivide recursively. Each branch slightly randomized but constrained by golden ratios. L-systems or recursive subdivision generate tree-like forms that feel both mathematical and organic. Subtle noise perturbations break perfect symmetry. Line weights diminish with each recursion level. Every branching angle the product of deep mathematical exploration. + +**"Field Dynamics"** +Philosophy: Invisible forces made visible through their effects on matter. +Algorithmic expression: Vector fields constructed from mathematical functions or noise. Particles born at edges, flowing along field lines, dying when they reach equilibrium or boundaries. Multiple fields can attract, repel, or rotate particles. The visualization shows only the traces - ghost-like evidence of invisible forces. A computational dance meticulously choreographed through force balance. + +**"Stochastic Crystallization"** +Philosophy: Random processes crystallizing into ordered structures. +Algorithmic expression: Randomized circle packing or Voronoi tessellation. Start with random points, let them evolve through relaxation algorithms. Cells push apart until equilibrium. Color based on cell size, neighbor count, or distance from center. The organic tiling that emerges feels both random and inevitable. Every seed produces unique crystalline beauty - the mark of a master-level generative algorithm. + +*These are condensed examples. 
The actual algorithmic philosophy should be 4-6 substantial paragraphs.* + +### ESSENTIAL PRINCIPLES +- **ALGORITHMIC PHILOSOPHY**: Creating a computational worldview to be expressed through code +- **PROCESS OVER PRODUCT**: Always emphasize that beauty emerges from the algorithm's execution - each run is unique +- **PARAMETRIC EXPRESSION**: Ideas communicate through mathematical relationships, forces, behaviors - not static composition +- **ARTISTIC FREEDOM**: The next Claude interprets the philosophy algorithmically - provide creative implementation room +- **PURE GENERATIVE ART**: This is about making LIVING ALGORITHMS, not static images with randomness +- **EXPERT CRAFTSMANSHIP**: Repeatedly emphasize the final algorithm must feel meticulously crafted, refined through countless iterations, the product of deep expertise by someone at the absolute top of their field in computational aesthetics + +**The algorithmic philosophy should be 4-6 paragraphs long.** Fill it with poetic computational philosophy that brings together the intended vision. Avoid repeating the same points. Output this algorithmic philosophy as a .md file. + +--- + +## DEDUCING THE CONCEPTUAL SEED + +**CRITICAL STEP**: Before implementing the algorithm, identify the subtle conceptual thread from the original request. + +**THE ESSENTIAL PRINCIPLE**: +The concept is a **subtle, niche reference embedded within the algorithm itself** - not always literal, always sophisticated. Someone familiar with the subject should feel it intuitively, while others simply experience a masterful generative composition. The algorithmic philosophy provides the computational language. The deduced concept provides the soul - the quiet conceptual DNA woven invisibly into parameters, behaviors, and emergence patterns. + +This is **VERY IMPORTANT**: The reference must be so refined that it enhances the work's depth without announcing itself. 
Think like a jazz musician quoting another song through algorithmic harmony - only those who know will catch it, but everyone appreciates the generative beauty. + +--- + +## P5.JS IMPLEMENTATION + +With the philosophy AND conceptual framework established, express it through code. Pause to gather thoughts before proceeding. Use only the algorithmic philosophy created and the instructions below. + +### ⚠️ STEP 0: READ THE TEMPLATE FIRST ⚠️ + +**CRITICAL: BEFORE writing any HTML:** + +1. **Read** `templates/viewer.html` using the Read tool +2. **Study** the exact structure, styling, and Anthropic branding +3. **Use that file as the LITERAL STARTING POINT** - not just inspiration +4. **Keep all FIXED sections exactly as shown** (header, sidebar structure, Anthropic colors/fonts, seed controls, action buttons) +5. **Replace only the VARIABLE sections** marked in the file's comments (algorithm, parameters, UI controls for parameters) + +**Avoid:** +- ❌ Creating HTML from scratch +- ❌ Inventing custom styling or color schemes +- ❌ Using system fonts or dark themes +- ❌ Changing the sidebar structure + +**Follow these practices:** +- ✅ Copy the template's exact HTML structure +- ✅ Keep Anthropic branding (Poppins/Lora fonts, light colors, gradient backdrop) +- ✅ Maintain the sidebar layout (Seed → Parameters → Colors? → Actions) +- ✅ Replace only the p5.js algorithm and parameter controls + +The template is the foundation. Build on it, don't rebuild it. + +--- + +To create gallery-quality computational art that lives and breathes, use the algorithmic philosophy as the foundation. 
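The conceptual seed is easiest to see in miniature. A hypothetical sketch, assuming the user's request evoked music (the reference, parameter names, and values here are invented for illustration, not prescribed by this skill):

```javascript
// Hypothetical example: a perfect fifth (frequency ratio 3:2) woven into
// the parameters - never announced, only felt by someone who knows it.
const FIFTH = 3 / 2;

let params = {
  seed: 12345,
  branchRatio: 1 / FIFTH,       // each child branch shrinks by 2:3
  angleStep: 360 / (FIFTH * 8), // 30 degrees between branches
  pulseBeats: 3,                // a 3-against-2 rhythm in the animation
  pulseRests: 2,
};
```

To a casual reader these are ordinary tuning constants; the reference lives entirely in the ratios.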
+ +### TECHNICAL REQUIREMENTS + +**Seeded Randomness (Art Blocks Pattern)**: +```javascript +// ALWAYS use a seed for reproducibility +let seed = 12345; // or hash from user input +randomSeed(seed); +noiseSeed(seed); +``` + +**Parameter Structure - FOLLOW THE PHILOSOPHY**: + +To establish parameters that emerge naturally from the algorithmic philosophy, consider: "What qualities of this system can be adjusted?" + +```javascript +let params = { + seed: 12345, // Always include seed for reproducibility + // colors + // Add parameters that control YOUR algorithm: + // - Quantities (how many?) + // - Scales (how big? how fast?) + // - Probabilities (how likely?) + // - Ratios (what proportions?) + // - Angles (what direction?) + // - Thresholds (when does behavior change?) +}; +``` + +**To design effective parameters, focus on the properties the system needs to be tunable rather than thinking in terms of "pattern types".** + +**Core Algorithm - EXPRESS THE PHILOSOPHY**: + +**CRITICAL**: The algorithmic philosophy should dictate what to build. + +To express the philosophy through code, avoid thinking "which pattern should I use?" and instead think "how to express this philosophy through code?" + +If the philosophy is about **organic emergence**, consider using: +- Elements that accumulate or grow over time +- Random processes constrained by natural rules +- Feedback loops and interactions + +If the philosophy is about **mathematical beauty**, consider using: +- Geometric relationships and ratios +- Trigonometric functions and harmonics +- Precise calculations creating unexpected patterns + +If the philosophy is about **controlled chaos**, consider using: +- Random variation within strict boundaries +- Bifurcation and phase transitions +- Order emerging from disorder + +**The algorithm flows from the philosophy, not from a menu of options.** + +To guide the implementation, let the conceptual essence inform creative and original choices. 
Build something that expresses the vision for this particular request. + +**Canvas Setup**: Standard p5.js structure: +```javascript +function setup() { + createCanvas(1200, 1200); + // Initialize your system +} + +function draw() { + // Your generative algorithm + // Can be static (noLoop) or animated +} +``` + +### CRAFTSMANSHIP REQUIREMENTS + +**CRITICAL**: To achieve mastery, create algorithms that feel like they emerged through countless iterations by a master generative artist. Tune every parameter carefully. Ensure every pattern emerges with purpose. This is NOT random noise - this is CONTROLLED CHAOS refined through deep expertise. + +- **Balance**: Complexity without visual noise, order without rigidity +- **Color Harmony**: Thoughtful palettes, not random RGB values +- **Composition**: Even in randomness, maintain visual hierarchy and flow +- **Performance**: Smooth execution, optimized for real-time if animated +- **Reproducibility**: Same seed ALWAYS produces identical output + +### OUTPUT FORMAT + +Output: +1. **Algorithmic Philosophy** - As markdown or text explaining the generative aesthetic +2. **Single HTML Artifact** - Self-contained interactive generative art built from `templates/viewer.html` (see STEP 0 and next section) + +The HTML artifact contains everything: p5.js (from CDN), the algorithm, parameter controls, and UI - all in one file that works immediately in claude.ai artifacts or any browser. Start from the template file, not from scratch. + +--- + +## INTERACTIVE ARTIFACT CREATION + +**REMINDER: `templates/viewer.html` should have already been read (see STEP 0). Use that file as the starting point.** + +To allow exploration of the generative art, create a single, self-contained HTML artifact. Ensure this artifact works immediately in claude.ai or any browser - no setup required. Embed everything inline. + +### CRITICAL: WHAT'S FIXED VS VARIABLE + +The `templates/viewer.html` file is the foundation. 
It contains the exact structure and styling needed. + +**FIXED (always include exactly as shown):** +- Layout structure (header, sidebar, main canvas area) +- Anthropic branding (UI colors, fonts, gradients) +- Seed section in sidebar: + - Seed display + - Previous/Next buttons + - Random button + - Jump to seed input + Go button +- Actions section in sidebar: + - Regenerate button + - Reset button + +**VARIABLE (customize for each artwork):** +- The entire p5.js algorithm (setup/draw/classes) +- The parameters object (define what the art needs) +- The Parameters section in sidebar: + - Number of parameter controls + - Parameter names + - Min/max/step values for sliders + - Control types (sliders, inputs, etc.) +- Colors section (optional): + - Some art needs color pickers + - Some art might use fixed colors + - Some art might be monochrome (no color controls needed) + - Decide based on the art's needs + +**Every artwork should have unique parameters and algorithm!** The fixed parts provide consistent UX - everything else expresses the unique vision. + +### REQUIRED FEATURES + +**1. Parameter Controls** +- Sliders for numeric parameters (particle count, noise scale, speed, etc.) +- Color pickers for palette colors +- Real-time updates when parameters change +- Reset button to restore defaults + +**2. Seed Navigation** +- Display current seed number +- "Previous" and "Next" buttons to cycle through seeds +- "Random" button for random seed +- Input field to jump to specific seed +- Generate 100 variations when requested (seeds 1-100) + +**3. Single Artifact Structure** +```html + + + + + + + + +
+<!DOCTYPE html> +<html> +<head> +  <meta charset="utf-8"> +  <title>Generative Art</title> +  <script src="https://cdn.jsdelivr.net/npm/p5@1.9.0/lib/p5.min.js"></script> +  <style>/* all styling inline - Anthropic branding from the template */</style> +</head> +<body> +  <!-- header, sidebar (Seed / Parameters / Colors / Actions), canvas container --> +  <script>/* params object, p5.js algorithm, and UI wiring - all inline */</script> +</body> +</html>
+ + + +``` + +**CRITICAL**: This is a single artifact. No external files, no imports (except p5.js CDN). Everything inline. + +**4. Implementation Details - BUILD THE SIDEBAR** + +The sidebar structure: + +**1. Seed (FIXED)** - Always include exactly as shown: +- Seed display +- Prev/Next/Random/Jump buttons + +**2. Parameters (VARIABLE)** - Create controls for the art: +```html +
+<div class="control-group"> +  <label>Particle Count</label> +  <input type="range" min="100" max="5000" step="100" value="1000"> +</div> +<!-- ...one control-group per parameter... -->
+``` +Add as many control-group divs as there are parameters. + +**3. Colors (OPTIONAL/VARIABLE)** - Include if the art needs adjustable colors: +- Add color pickers if users should control palette +- Skip this section if the art uses fixed colors +- Skip if the art is monochrome + +**4. Actions (FIXED)** - Always include exactly as shown: +- Regenerate button +- Reset button +- Download PNG button + +**Requirements**: +- Seed controls must work (prev/next/random/jump/display) +- All parameters must have UI controls +- Regenerate, Reset, Download buttons must work +- Keep Anthropic branding (UI styling, not art colors) + +### USING THE ARTIFACT + +The HTML artifact works immediately: +1. **In claude.ai**: Displayed as an interactive artifact - runs instantly +2. **As a file**: Save and open in any browser - no server needed +3. **Sharing**: Send the HTML file - it's completely self-contained + +--- + +## VARIATIONS & EXPLORATION + +The artifact includes seed navigation by default (prev/next/random buttons), allowing users to explore variations without creating multiple files. If the user wants specific variations highlighted: + +- Include seed presets (buttons for "Variation 1: Seed 42", "Variation 2: Seed 127", etc.) +- Add a "Gallery Mode" that shows thumbnails of multiple seeds side-by-side +- All within the same single artifact + +This is like creating a series of prints from the same plate - the algorithm is consistent, but each seed reveals different facets of its potential. The interactive nature means users discover their own favorites by exploring the seed space. + +--- + +## THE CREATIVE PROCESS + +**User request** → **Algorithmic philosophy** → **Implementation** + +Each request is unique. The process involves: + +1. **Interpret the user's intent** - What aesthetic is being sought? +2. **Create an algorithmic philosophy** (4-6 paragraphs) describing the computational approach +3. 
**Implement it in code** - Build the algorithm that expresses this philosophy +4. **Design appropriate parameters** - What should be tunable? +5. **Build matching UI controls** - Sliders/inputs for those parameters + +**The constants**: +- Anthropic branding (colors, fonts, layout) +- Seed navigation (always present) +- Self-contained HTML artifact + +**Everything else is variable**: +- The algorithm itself +- The parameters +- The UI controls +- The visual outcome + +To achieve the best results, trust creativity and let the philosophy guide the implementation. + +--- + +## RESOURCES + +This skill includes helpful templates and documentation: + +- **templates/viewer.html**: REQUIRED STARTING POINT for all HTML artifacts. + - This is the foundation - contains the exact structure and Anthropic branding + - **Keep unchanged**: Layout structure, sidebar organization, Anthropic colors/fonts, seed controls, action buttons + - **Replace**: The p5.js algorithm, parameter definitions, and UI controls in Parameters section + - The extensive comments in the file mark exactly what to keep vs replace + +- **templates/generator_template.js**: Reference for p5.js best practices and code structure principles. 
+ - Shows how to organize parameters, use seeded randomness, structure classes + - NOT a pattern menu - use these principles to build unique algorithms + - Embed algorithms inline in the HTML artifact (don't create separate .js files) + +**Critical reminder**: +- The **template is the STARTING POINT**, not inspiration +- The **algorithm is where to create** something unique +- Don't copy the flow field example - build what the philosophy demands +- But DO keep the exact UI structure and Anthropic branding from the template \ No newline at end of file diff --git a/algorithmic-art/templates/generator_template.js b/algorithmic-art/templates/generator_template.js new file mode 100644 index 0000000..e263fbd --- /dev/null +++ b/algorithmic-art/templates/generator_template.js @@ -0,0 +1,223 @@ +/** + * ═══════════════════════════════════════════════════════════════════════════ + * P5.JS GENERATIVE ART - BEST PRACTICES + * ═══════════════════════════════════════════════════════════════════════════ + * + * This file shows STRUCTURE and PRINCIPLES for p5.js generative art. + * It does NOT prescribe what art you should create. + * + * Your algorithmic philosophy should guide what you build. + * These are just best practices for how to structure your code. + * + * ═══════════════════════════════════════════════════════════════════════════ + */ + +// ============================================================================ +// 1. PARAMETER ORGANIZATION +// ============================================================================ +// Keep all tunable parameters in one object +// This makes it easy to: +// - Connect to UI controls +// - Reset to defaults +// - Serialize/save configurations + +let params = { + // Define parameters that match YOUR algorithm + // Examples (customize for your art): + // - Counts: how many elements (particles, circles, branches, etc.) 
+ // - Scales: size, speed, spacing + // - Probabilities: likelihood of events + // - Angles: rotation, direction + // - Colors: palette arrays + + seed: 12345, + // define colorPalette as an array -- choose whatever colors you'd like ['#d97757', '#6a9bcc', '#788c5d', '#b0aea5'] + // Add YOUR parameters here based on your algorithm +}; + +// ============================================================================ +// 2. SEEDED RANDOMNESS (Critical for reproducibility) +// ============================================================================ +// ALWAYS use seeded random for Art Blocks-style reproducible output + +function initializeSeed(seed) { + randomSeed(seed); + noiseSeed(seed); + // Now all random() and noise() calls will be deterministic +} + +// ============================================================================ +// 3. P5.JS LIFECYCLE +// ============================================================================ + +function setup() { + createCanvas(800, 800); + + // Initialize seed first + initializeSeed(params.seed); + + // Set up your generative system + // This is where you initialize: + // - Arrays of objects + // - Grid structures + // - Initial positions + // - Starting states + + // For static art: call noLoop() at the end of setup + // For animated art: let draw() keep running +} + +function draw() { + // Option 1: Static generation (runs once, then stops) + // - Generate everything in setup() + // - Call noLoop() in setup() + // - draw() doesn't do much or can be empty + + // Option 2: Animated generation (continuous) + // - Update your system each frame + // - Common patterns: particle movement, growth, evolution + // - Can optionally call noLoop() after N frames + + // Option 3: User-triggered regeneration + // - Use noLoop() by default + // - Call redraw() when parameters change +} + +// ============================================================================ +// 4. 
CLASS STRUCTURE (When you need objects) +// ============================================================================ +// Use classes when your algorithm involves multiple entities +// Examples: particles, agents, cells, nodes, etc. + +class Entity { + constructor() { + // Initialize entity properties + // Use random() here - it will be seeded + } + + update() { + // Update entity state + // This might involve: + // - Physics calculations + // - Behavioral rules + // - Interactions with neighbors + } + + display() { + // Render the entity + // Keep rendering logic separate from update logic + } +} + +// ============================================================================ +// 5. PERFORMANCE CONSIDERATIONS +// ============================================================================ + +// For large numbers of elements: +// - Pre-calculate what you can +// - Use simple collision detection (spatial hashing if needed) +// - Limit expensive operations (sqrt, trig) when possible +// - Consider using p5 vectors efficiently + +// For smooth animation: +// - Aim for 60fps +// - Profile if things are slow +// - Consider reducing particle counts or simplifying calculations + +// ============================================================================ +// 6. UTILITY FUNCTIONS +// ============================================================================ + +// Color utilities +function hexToRgb(hex) { + const result = /^#?([a-f\d]{2})([a-f\d]{2})([a-f\d]{2})$/i.exec(hex); + return result ? { + r: parseInt(result[1], 16), + g: parseInt(result[2], 16), + b: parseInt(result[3], 16) + } : null; +} + +function colorFromPalette(index) { + return params.colorPalette[index % params.colorPalette.length]; +} + +// Mapping and easing +function mapRange(value, inMin, inMax, outMin, outMax) { + return outMin + (outMax - outMin) * ((value - inMin) / (inMax - inMin)); +} + +function easeInOutCubic(t) { + return t < 0.5 ? 
4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2; +} + +// Constrain to bounds +function wrapAround(value, max) { + if (value < 0) return max; + if (value > max) return 0; + return value; +} + +// ============================================================================ +// 7. PARAMETER UPDATES (Connect to UI) +// ============================================================================ + +function updateParameter(paramName, value) { + params[paramName] = value; + // Decide if you need to regenerate or just update + // Some params can update in real-time, others need full regeneration +} + +function regenerate() { + // Reinitialize your generative system + // Useful when parameters change significantly + initializeSeed(params.seed); + // Then regenerate your system +} + +// ============================================================================ +// 8. COMMON P5.JS PATTERNS +// ============================================================================ + +// Drawing with transparency for trails/fading +function fadeBackground(opacity) { + fill(250, 249, 245, opacity); // Anthropic light with alpha + noStroke(); + rect(0, 0, width, height); +} + +// Using noise for organic variation +function getNoiseValue(x, y, scale = 0.01) { + return noise(x * scale, y * scale); +} + +// Creating vectors from angles +function vectorFromAngle(angle, magnitude = 1) { + return createVector(cos(angle), sin(angle)).mult(magnitude); +} + +// ============================================================================ +// 9. EXPORT FUNCTIONS +// ============================================================================ + +function exportImage() { + saveCanvas('generative-art-' + params.seed, 'png'); +} + +// ============================================================================ +// REMEMBER +// ============================================================================ +// +// These are TOOLS and PRINCIPLES, not a recipe. 
+// Your algorithmic philosophy should guide WHAT you create. +// This structure helps you create it WELL. +// +// Focus on: +// - Clean, readable code +// - Parameterized for exploration +// - Seeded for reproducibility +// - Performant execution +// +// The art itself is entirely up to you! +// +// ============================================================================ \ No newline at end of file diff --git a/algorithmic-art/templates/viewer.html b/algorithmic-art/templates/viewer.html new file mode 100644 index 0000000..630cc1f --- /dev/null +++ b/algorithmic-art/templates/viewer.html @@ -0,0 +1,599 @@ + + + + + + + Generative Art Viewer + + + + + + + +
+ Initializing generative art...
+ + + + \ No newline at end of file diff --git a/analytics-tracking/.skillshare-meta.json b/analytics-tracking/.skillshare-meta.json new file mode 100644 index 0000000..0f14e0c --- /dev/null +++ b/analytics-tracking/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/coreyhaines31/marketingskills/tree/main/skills/analytics-tracking", + "type": "github-subdir", + "installed_at": "2026-01-30T02:20:14.60799167Z", + "repo_url": "https://github.com/coreyhaines31/marketingskills.git", + "subdir": "skills/analytics-tracking", + "version": "a04cb61" +} \ No newline at end of file diff --git a/analytics-tracking/SKILL.md b/analytics-tracking/SKILL.md new file mode 100644 index 0000000..9d8e0a0 --- /dev/null +++ b/analytics-tracking/SKILL.md @@ -0,0 +1,307 @@ +--- +name: analytics-tracking +version: 1.0.0 +description: When the user wants to set up, improve, or audit analytics tracking and measurement. Also use when the user mentions "set up tracking," "GA4," "Google Analytics," "conversion tracking," "event tracking," "UTM parameters," "tag manager," "GTM," "analytics implementation," or "tracking plan." For A/B test measurement, see ab-test-setup. +--- + +# Analytics Tracking + +You are an expert in analytics implementation and measurement. Your goal is to help set up tracking that provides actionable insights for marketing and product decisions. + +## Initial Assessment + +**Check for product marketing context first:** +If `.claude/product-marketing-context.md` exists, read it before asking questions. Use that context and only ask for information not already covered or specific to this task. + +Before implementing tracking, understand: + +1. **Business Context** - What decisions will this data inform? What are key conversions? +2. **Current State** - What tracking exists? What tools are in use? +3. **Technical Context** - What's the tech stack? Any privacy/compliance requirements? + +--- + +## Core Principles + +### 1. 
Track for Decisions, Not Data +- Every event should inform a decision +- Avoid vanity metrics +- Quality > quantity of events + +### 2. Start with the Questions +- What do you need to know? +- What actions will you take based on this data? +- Work backwards to what you need to track + +### 3. Name Things Consistently +- Naming conventions matter +- Establish patterns before implementing +- Document everything + +### 4. Maintain Data Quality +- Validate implementation +- Monitor for issues +- Clean data > more data + +--- + +## Tracking Plan Framework + +### Structure + +``` +Event Name | Category | Properties | Trigger | Notes +---------- | -------- | ---------- | ------- | ----- +``` + +### Event Types + +| Type | Examples | +|------|----------| +| Pageviews | Automatic, enhanced with metadata | +| User Actions | Button clicks, form submissions, feature usage | +| System Events | Signup completed, purchase, subscription changed | +| Custom Conversions | Goal completions, funnel stages | + +**For comprehensive event lists**: See [references/event-library.md](references/event-library.md) + +--- + +## Event Naming Conventions + +### Recommended Format: Object-Action + +``` +signup_completed +button_clicked +form_submitted +article_read +checkout_payment_completed +``` + +### Best Practices +- Lowercase with underscores +- Be specific: `cta_hero_clicked` vs. 
`button_clicked` +- Include context in properties, not event name +- Avoid spaces and special characters +- Document decisions + +--- + +## Essential Events + +### Marketing Site + +| Event | Properties | +|-------|------------| +| cta_clicked | button_text, location | +| form_submitted | form_type | +| signup_completed | method, source | +| demo_requested | - | + +### Product/App + +| Event | Properties | +|-------|------------| +| onboarding_step_completed | step_number, step_name | +| feature_used | feature_name | +| purchase_completed | plan, value | +| subscription_cancelled | reason | + +**For full event library by business type**: See [references/event-library.md](references/event-library.md) + +--- + +## Event Properties + +### Standard Properties + +| Category | Properties | +|----------|------------| +| Page | page_title, page_location, page_referrer | +| User | user_id, user_type, account_id, plan_type | +| Campaign | source, medium, campaign, content, term | +| Product | product_id, product_name, category, price | + +### Best Practices +- Use consistent property names +- Include relevant context +- Don't duplicate automatic properties +- Avoid PII in properties + +--- + +## GA4 Implementation + +### Quick Setup + +1. Create GA4 property and data stream +2. Install gtag.js or GTM +3. Enable enhanced measurement +4. Configure custom events +5. 
Mark conversions in Admin + +### Custom Event Example + +```javascript +gtag('event', 'signup_completed', { + 'method': 'email', + 'plan': 'free' +}); +``` + +**For detailed GA4 implementation**: See [references/ga4-implementation.md](references/ga4-implementation.md) + +--- + +## Google Tag Manager + +### Container Structure + +| Component | Purpose | +|-----------|---------| +| Tags | Code that executes (GA4, pixels) | +| Triggers | When tags fire (page view, click) | +| Variables | Dynamic values (click text, data layer) | + +### Data Layer Pattern + +```javascript +dataLayer.push({ + 'event': 'form_submitted', + 'form_name': 'contact', + 'form_location': 'footer' +}); +``` + +**For detailed GTM implementation**: See [references/gtm-implementation.md](references/gtm-implementation.md) + +--- + +## UTM Parameter Strategy + +### Standard Parameters + +| Parameter | Purpose | Example | +|-----------|---------|---------| +| utm_source | Traffic source | google, newsletter | +| utm_medium | Marketing medium | cpc, email, social | +| utm_campaign | Campaign name | spring_sale | +| utm_content | Differentiate versions | hero_cta | +| utm_term | Paid search keywords | running+shoes | + +### Naming Conventions +- Lowercase everything +- Use underscores or hyphens consistently +- Be specific but concise: `blog_footer_cta`, not `cta1` +- Document all UTMs in a spreadsheet + +--- + +## Debugging and Validation + +### Testing Tools + +| Tool | Use For | +|------|---------| +| GA4 DebugView | Real-time event monitoring | +| GTM Preview Mode | Test triggers before publish | +| Browser Extensions | Tag Assistant, dataLayer Inspector | + +### Validation Checklist + +- [ ] Events firing on correct triggers +- [ ] Property values populating correctly +- [ ] No duplicate events +- [ ] Works across browsers and mobile +- [ ] Conversions recorded correctly +- [ ] No PII leaking + +### Common Issues + +| Issue | Check | +|-------|-------| +| Events not firing | Trigger config, GTM 
loaded | +| Wrong values | Variable path, data layer structure | +| Duplicate events | Multiple containers, trigger firing twice | + +--- + +## Privacy and Compliance + +### Considerations +- Cookie consent required in EU/UK/CA +- No PII in analytics properties +- Data retention settings +- User deletion capabilities + +### Implementation +- Use consent mode (wait for consent) +- IP anonymization +- Only collect what you need +- Integrate with consent management platform + +--- + +## Output Format + +### Tracking Plan Document + +```markdown +# [Site/Product] Tracking Plan + +## Overview +- Tools: GA4, GTM +- Last updated: [Date] + +## Events + +| Event Name | Description | Properties | Trigger | +|------------|-------------|------------|---------| +| signup_completed | User completes signup | method, plan | Success page | + +## Custom Dimensions + +| Name | Scope | Parameter | +|------|-------|-----------| +| user_type | User | user_type | + +## Conversions + +| Conversion | Event | Counting | +|------------|-------|----------| +| Signup | signup_completed | Once per session | +``` + +--- + +## Task-Specific Questions + +1. What tools are you using (GA4, Mixpanel, etc.)? +2. What key actions do you want to track? +3. What decisions will this data inform? +4. Who implements - dev team or marketing? +5. Are there privacy/consent requirements? +6. What's already tracked? + +--- + +## Tool Integrations + +For implementation, see the [tools registry](../../tools/REGISTRY.md). 
Key analytics tools: + +| Tool | Best For | MCP | Guide | +|------|----------|:---:|-------| +| **GA4** | Web analytics, Google ecosystem | ✓ | [ga4.md](../../tools/integrations/ga4.md) | +| **Mixpanel** | Product analytics, event tracking | - | [mixpanel.md](../../tools/integrations/mixpanel.md) | +| **Amplitude** | Product analytics, cohort analysis | - | [amplitude.md](../../tools/integrations/amplitude.md) | +| **PostHog** | Open-source analytics, session replay | - | [posthog.md](../../tools/integrations/posthog.md) | +| **Segment** | Customer data platform, routing | - | [segment.md](../../tools/integrations/segment.md) | + +--- + +## Related Skills + +- **ab-test-setup**: For experiment tracking +- **seo-audit**: For organic traffic analysis +- **page-cro**: For conversion optimization (uses this data) diff --git a/analytics-tracking/references/event-library.md b/analytics-tracking/references/event-library.md new file mode 100644 index 0000000..586025e --- /dev/null +++ b/analytics-tracking/references/event-library.md @@ -0,0 +1,251 @@ +# Event Library Reference + +Comprehensive list of events to track by business type and context. 
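

The event names and properties cataloged below are easiest to keep consistent when every event passes through one small wrapper. A minimal sketch (the `track` helper and its snake_case check are illustrative assumptions, not part of this reference):

```javascript
// Illustrative helper (hypothetical name): funnels every event through one
// place so names stay lowercase snake_case, as used throughout this library.
var dataLayer = (typeof window !== 'undefined' && window.dataLayer) || [];

function track(eventName, properties) {
  // Reject names that break the snake_case convention (e.g. "SignupDone")
  if (!/^[a-z][a-z0-9_]*$/.test(eventName)) {
    throw new Error('Non-snake_case event name: ' + eventName);
  }
  // Merge the event name with its properties and push to the data layer
  dataLayer.push(Object.assign({ event: eventName }, properties || {}));
}

// Usage, mirroring the Conversion Events table:
track('signup_completed', { method: 'email', plan: 'free', source: 'homepage' });
```

Routing everything through one function also gives you a single place to attach shared context (user, session, campaign) later.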
+ +## Marketing Site Events + +### Navigation & Engagement + +| Event Name | Description | Properties | +|------------|-------------|------------| +| page_view | Page loaded (enhanced) | page_title, page_location, content_group | +| scroll_depth | User scrolled to threshold | depth (25, 50, 75, 100) | +| outbound_link_clicked | Click to external site | link_url, link_text | +| internal_link_clicked | Click within site | link_url, link_text, location | +| video_played | Video started | video_id, video_title, duration | +| video_completed | Video finished | video_id, video_title, duration | + +### CTA & Form Interactions + +| Event Name | Description | Properties | +|------------|-------------|------------| +| cta_clicked | Call to action clicked | button_text, cta_location, page | +| form_started | User began form | form_name, form_location | +| form_field_completed | Field filled | form_name, field_name | +| form_submitted | Form successfully sent | form_name, form_location | +| form_error | Form validation failed | form_name, error_type | +| resource_downloaded | Asset downloaded | resource_name, resource_type | + +### Conversion Events + +| Event Name | Description | Properties | +|------------|-------------|------------| +| signup_started | Initiated signup | source, page | +| signup_completed | Finished signup | method, plan, source | +| demo_requested | Demo form submitted | company_size, industry | +| contact_submitted | Contact form sent | inquiry_type | +| newsletter_subscribed | Email list signup | source, list_name | +| trial_started | Free trial began | plan, source | + +--- + +## Product/App Events + +### Onboarding + +| Event Name | Description | Properties | +|------------|-------------|------------| +| signup_completed | Account created | method, referral_source | +| onboarding_started | Began onboarding | - | +| onboarding_step_completed | Step finished | step_number, step_name | +| onboarding_completed | All steps done | steps_completed, 
time_to_complete | +| onboarding_skipped | User skipped onboarding | step_skipped_at | +| first_key_action_completed | Aha moment reached | action_type | + +### Core Usage + +| Event Name | Description | Properties | +|------------|-------------|------------| +| session_started | App session began | session_number | +| feature_used | Feature interaction | feature_name, feature_category | +| action_completed | Core action done | action_type, count | +| content_created | User created content | content_type | +| content_edited | User modified content | content_type | +| content_deleted | User removed content | content_type | +| search_performed | In-app search | query, results_count | +| settings_changed | Settings modified | setting_name, new_value | +| invite_sent | User invited others | invite_type, count | + +### Errors & Support + +| Event Name | Description | Properties | +|------------|-------------|------------| +| error_occurred | Error experienced | error_type, error_message, page | +| help_opened | Help accessed | help_type, page | +| support_contacted | Support request made | contact_method, issue_type | +| feedback_submitted | User feedback given | feedback_type, rating | + +--- + +## Monetization Events + +### Pricing & Checkout + +| Event Name | Description | Properties | +|------------|-------------|------------| +| pricing_viewed | Pricing page seen | source | +| plan_selected | Plan chosen | plan_name, billing_cycle | +| checkout_started | Began checkout | plan, value | +| payment_info_entered | Payment submitted | payment_method | +| purchase_completed | Purchase successful | plan, value, currency, transaction_id | +| purchase_failed | Purchase failed | error_reason, plan | + +### Subscription Management + +| Event Name | Description | Properties | +|------------|-------------|------------| +| trial_started | Trial began | plan, trial_length | +| trial_ended | Trial expired | plan, converted (bool) | +| subscription_upgraded | Plan upgraded | 
from_plan, to_plan, value | +| subscription_downgraded | Plan downgraded | from_plan, to_plan | +| subscription_cancelled | Cancelled | plan, reason, tenure | +| subscription_renewed | Renewed | plan, value | +| billing_updated | Payment method changed | - | + +--- + +## E-commerce Events + +### Browsing + +| Event Name | Description | Properties | +|------------|-------------|------------| +| product_viewed | Product page viewed | product_id, product_name, category, price | +| product_list_viewed | Category/list viewed | list_name, products[] | +| product_searched | Search performed | query, results_count | +| product_filtered | Filters applied | filter_type, filter_value | +| product_sorted | Sort applied | sort_by, sort_order | + +### Cart + +| Event Name | Description | Properties | +|------------|-------------|------------| +| product_added_to_cart | Item added | product_id, product_name, price, quantity | +| product_removed_from_cart | Item removed | product_id, product_name, price, quantity | +| cart_viewed | Cart page viewed | cart_value, items_count | + +### Checkout + +| Event Name | Description | Properties | +|------------|-------------|------------| +| checkout_started | Checkout began | cart_value, items_count | +| checkout_step_completed | Step finished | step_number, step_name | +| shipping_info_entered | Address entered | shipping_method | +| payment_info_entered | Payment entered | payment_method | +| coupon_applied | Coupon used | coupon_code, discount_value | +| purchase_completed | Order placed | transaction_id, value, currency, items[] | + +### Post-Purchase + +| Event Name | Description | Properties | +|------------|-------------|------------| +| order_confirmed | Confirmation viewed | transaction_id | +| refund_requested | Refund initiated | transaction_id, reason | +| refund_completed | Refund processed | transaction_id, value | +| review_submitted | Product reviewed | product_id, rating | + +--- + +## B2B / SaaS Specific Events + +### Team 
& Collaboration + +| Event Name | Description | Properties | +|------------|-------------|------------| +| team_created | New team/org made | team_size, plan | +| team_member_invited | Invite sent | role, invite_method | +| team_member_joined | Member accepted | role | +| team_member_removed | Member removed | role | +| role_changed | Permissions updated | user_id, old_role, new_role | + +### Integration Events + +| Event Name | Description | Properties | +|------------|-------------|------------| +| integration_viewed | Integration page seen | integration_name | +| integration_started | Setup began | integration_name | +| integration_connected | Successfully connected | integration_name | +| integration_disconnected | Removed integration | integration_name, reason | + +### Account Events + +| Event Name | Description | Properties | +|------------|-------------|------------| +| account_created | New account | source, plan | +| account_upgraded | Plan upgrade | from_plan, to_plan | +| account_churned | Account closed | reason, tenure, mrr_lost | +| account_reactivated | Returned customer | previous_tenure, new_plan | + +--- + +## Event Properties (Parameters) + +### Standard Properties to Include + +**User Context:** +``` +user_id: "12345" +user_type: "free" | "trial" | "paid" +account_id: "acct_123" +plan_type: "starter" | "pro" | "enterprise" +``` + +**Session Context:** +``` +session_id: "sess_abc" +session_number: 5 +page: "/pricing" +referrer: "https://google.com" +``` + +**Campaign Context:** +``` +source: "google" +medium: "cpc" +campaign: "spring_sale" +content: "hero_cta" +``` + +**Product Context (E-commerce):** +``` +product_id: "SKU123" +product_name: "Product Name" +category: "Category" +price: 99.99 +quantity: 1 +currency: "USD" +``` + +**Timing:** +``` +timestamp: "2024-01-15T10:30:00Z" +time_on_page: 45 +session_duration: 300 +``` + +--- + +## Funnel Event Sequences + +### Signup Funnel +1. signup_started +2. signup_step_completed (email) +3. 
signup_step_completed (password) +4. signup_completed +5. onboarding_started + +### Purchase Funnel +1. pricing_viewed +2. plan_selected +3. checkout_started +4. payment_info_entered +5. purchase_completed + +### E-commerce Funnel +1. product_viewed +2. product_added_to_cart +3. cart_viewed +4. checkout_started +5. shipping_info_entered +6. payment_info_entered +7. purchase_completed diff --git a/analytics-tracking/references/ga4-implementation.md b/analytics-tracking/references/ga4-implementation.md new file mode 100644 index 0000000..2cf874f --- /dev/null +++ b/analytics-tracking/references/ga4-implementation.md @@ -0,0 +1,290 @@ +# GA4 Implementation Reference + +Detailed implementation guide for Google Analytics 4. + +## Configuration + +### Data Streams + +- One stream per platform (web, iOS, Android) +- Enable enhanced measurement for automatic tracking +- Configure data retention (2 months default, 14 months max) +- Enable Google Signals (for cross-device, if consented) + +### Enhanced Measurement Events (Automatic) + +| Event | Description | Configuration | +|-------|-------------|---------------| +| page_view | Page loads | Automatic | +| scroll | 90% scroll depth | Toggle on/off | +| outbound_click | Click to external domain | Automatic | +| site_search | Search query used | Configure parameter | +| video_engagement | YouTube video plays | Toggle on/off | +| file_download | PDF, docs, etc. 
| Configurable extensions | + +### Recommended Events + +Use Google's predefined events when possible for enhanced reporting: + +**All properties:** +- login, sign_up +- share +- search + +**E-commerce:** +- view_item, view_item_list +- add_to_cart, remove_from_cart +- begin_checkout +- add_payment_info +- purchase, refund + +**Games:** +- level_up, unlock_achievement +- post_score, spend_virtual_currency + +Reference: https://support.google.com/analytics/answer/9267735 + +--- + +## Custom Events + +### gtag.js Implementation + +```javascript +// Basic event +gtag('event', 'signup_completed', { + 'method': 'email', + 'plan': 'free' +}); + +// Event with value +gtag('event', 'purchase', { + 'transaction_id': 'T12345', + 'value': 99.99, + 'currency': 'USD', + 'items': [{ + 'item_id': 'SKU123', + 'item_name': 'Product Name', + 'price': 99.99 + }] +}); + +// User properties +gtag('set', 'user_properties', { + 'user_type': 'premium', + 'plan_name': 'pro' +}); + +// User ID (for logged-in users) +gtag('config', 'GA_MEASUREMENT_ID', { + 'user_id': 'USER_ID' +}); +``` + +### Google Tag Manager (dataLayer) + +```javascript +// Custom event +dataLayer.push({ + 'event': 'signup_completed', + 'method': 'email', + 'plan': 'free' +}); + +// Set user properties +dataLayer.push({ + 'user_id': '12345', + 'user_type': 'premium' +}); + +// E-commerce purchase +dataLayer.push({ + 'event': 'purchase', + 'ecommerce': { + 'transaction_id': 'T12345', + 'value': 99.99, + 'currency': 'USD', + 'items': [{ + 'item_id': 'SKU123', + 'item_name': 'Product Name', + 'price': 99.99, + 'quantity': 1 + }] + } +}); + +// Clear ecommerce before sending (best practice) +dataLayer.push({ ecommerce: null }); +dataLayer.push({ + 'event': 'view_item', + 'ecommerce': { + // ... + } +}); +``` + +--- + +## Conversions Setup + +### Creating Conversions + +1. **Collect the event** - Ensure event is firing in GA4 +2. **Mark as conversion** - Admin > Events > Mark as conversion +3. 
**Set counting method**: + - Once per session (leads, signups) + - Every event (purchases) +4. **Import to Google Ads** - For conversion-optimized bidding + +### Conversion Values + +```javascript +// Event with conversion value +gtag('event', 'purchase', { + 'value': 99.99, + 'currency': 'USD' +}); +``` + +Or set default value in GA4 Admin when marking conversion. + +--- + +## Custom Dimensions and Metrics + +### When to Use + +**Custom dimensions:** +- Properties you want to segment/filter by +- User attributes (plan type, industry) +- Content attributes (author, category) + +**Custom metrics:** +- Numeric values to aggregate +- Scores, counts, durations + +### Setup Steps + +1. Admin > Data display > Custom definitions +2. Create dimension or metric +3. Choose scope: + - **Event**: Per event (content_type) + - **User**: Per user (account_type) + - **Item**: Per product (product_category) +4. Enter parameter name (must match event parameter) + +### Examples + +| Dimension | Scope | Parameter | Description | +|-----------|-------|-----------|-------------| +| User Type | User | user_type | Free, trial, paid | +| Content Author | Event | author | Blog post author | +| Product Category | Item | item_category | E-commerce category | + +--- + +## Audiences + +### Creating Audiences + +Admin > Data display > Audiences + +**Use cases:** +- Remarketing audiences (export to Ads) +- Segment analysis +- Trigger-based events + +### Audience Examples + +**High-intent visitors:** +- Viewed pricing page +- Did not convert +- In last 7 days + +**Engaged users:** +- 3+ sessions +- Or 5+ minutes total engagement + +**Purchasers:** +- Purchase event +- For exclusion or lookalike + +--- + +## Debugging + +### DebugView + +Enable with: +- URL parameter: `?debug_mode=true` +- Chrome extension: GA Debugger +- gtag: `'debug_mode': true` in config + +View at: Reports > Configure > DebugView + +### Real-Time Reports + +Check events within 30 minutes: +Reports > Real-time + +### Common 
Issues + +**Events not appearing:** +- Check DebugView first +- Verify gtag/GTM firing +- Check filter exclusions + +**Parameter values missing:** +- Custom dimension not created +- Parameter name mismatch +- Data still processing (24-48 hrs) + +**Conversions not recording:** +- Event not marked as conversion +- Event name doesn't match +- Counting method (once vs. every) + +--- + +## Data Quality + +### Filters + +Admin > Data streams > [Stream] > Configure tag settings > Define internal traffic + +**Exclude:** +- Internal IP addresses +- Developer traffic +- Testing environments + +### Cross-Domain Tracking + +For multiple domains sharing analytics: + +1. Admin > Data streams > [Stream] > Configure tag settings +2. Configure your domains +3. List all domains that should share sessions + +### Session Settings + +Admin > Data streams > [Stream] > Configure tag settings + +- Session timeout (default 30 min) +- Engaged session duration (10 sec default) + +--- + +## Integration with Google Ads + +### Linking + +1. Admin > Product links > Google Ads links +2. Enable auto-tagging in Google Ads +3. Import conversions in Google Ads + +### Audience Export + +Audiences created in GA4 can be used in Google Ads for: +- Remarketing campaigns +- Customer match +- Similar audiences diff --git a/analytics-tracking/references/gtm-implementation.md b/analytics-tracking/references/gtm-implementation.md new file mode 100644 index 0000000..914ada1 --- /dev/null +++ b/analytics-tracking/references/gtm-implementation.md @@ -0,0 +1,380 @@ +# Google Tag Manager Implementation Reference + +Detailed guide for implementing tracking via Google Tag Manager. + +## Container Structure + +### Tags + +Tags are code snippets that execute when triggered. + +**Common tag types:** +- GA4 Configuration (base setup) +- GA4 Event (custom events) +- Google Ads Conversion +- Facebook Pixel +- LinkedIn Insight Tag +- Custom HTML (for other pixels) + +### Triggers + +Triggers define when tags fire. 

**Built-in triggers:**
- Page View: All Pages, DOM Ready, Window Loaded
- Click: All Elements, Just Links
- Form Submission
- Scroll Depth
- Timer
- Element Visibility

**Custom triggers:**
- Custom Event (from dataLayer)
- Trigger Groups (multiple conditions)

### Variables

Variables capture dynamic values.

**Built-in (enable as needed):**
- Click Text, Click URL, Click ID, Click Classes
- Page Path, Page URL, Page Hostname
- Referrer
- Form Element, Form ID

**User-defined:**
- Data Layer variables
- JavaScript variables
- Lookup tables
- RegEx tables
- Constants

---

## Naming Conventions

### Recommended Format

```
[Type] - [Description] - [Detail]

Tags:
GA4 - Event - Signup Completed
GA4 - Config - Base Configuration
FB - Pixel - Page View
HTML - LiveChat Widget

Triggers:
Click - CTA Button
Submit - Contact Form
View - Pricing Page
Custom - signup_completed

Variables:
DL - user_id
JS - Current Timestamp
LT - Campaign Source Map
```

---

## Data Layer Patterns

### Basic Structure

```javascript
// Initialize (in <head>, before the GTM container snippet)
window.dataLayer = window.dataLayer || [];

// Push event
dataLayer.push({
  'event': 'event_name',
  'property1': 'value1',
  'property2': 'value2'
});
```

### Page Load Data

```javascript
// Set on page load (before GTM container)
window.dataLayer = window.dataLayer || [];
dataLayer.push({
  'pageType': 'product',
  'contentGroup': 'products',
  'user': {
    'loggedIn': true,
    'userId': '12345',
    'userType': 'premium'
  }
});
```

### Form Submission

```javascript
document.querySelector('#contact-form').addEventListener('submit', function() {
  dataLayer.push({
    'event': 'form_submitted',
    'formName': 'contact',
    'formLocation': 'footer'
  });
});
```

### Button Click

```javascript
document.querySelector('.cta-button').addEventListener('click', function() {
  dataLayer.push({
    'event': 'cta_clicked',
    'ctaText': this.innerText, 
    'ctaLocation': 'hero'
  });
});
```

### E-commerce Events

```javascript
// Product view
dataLayer.push({ ecommerce: null }); // Clear previous
dataLayer.push({
  'event': 'view_item',
  'ecommerce': {
    'items': [{
      'item_id': 'SKU123',
      'item_name': 'Product Name',
      'price': 99.99,
      'item_category': 'Category',
      'quantity': 1
    }]
  }
});

// Add to cart
dataLayer.push({ ecommerce: null });
dataLayer.push({
  'event': 'add_to_cart',
  'ecommerce': {
    'items': [{
      'item_id': 'SKU123',
      'item_name': 'Product Name',
      'price': 99.99,
      'quantity': 1
    }]
  }
});

// Purchase
dataLayer.push({ ecommerce: null });
dataLayer.push({
  'event': 'purchase',
  'ecommerce': {
    'transaction_id': 'T12345',
    'value': 99.99,
    'currency': 'USD',
    'tax': 5.00,
    'shipping': 10.00,
    'items': [{
      'item_id': 'SKU123',
      'item_name': 'Product Name',
      'price': 99.99,
      'quantity': 1
    }]
  }
});
```

---

## Common Tag Configurations

### GA4 Configuration Tag

**Tag Type:** Google Analytics: GA4 Configuration

**Settings:**
- Measurement ID: G-XXXXXXXX
- Send page view: Checked (for pageviews)
- User Properties: Add any user-level dimensions

**Trigger:** All Pages

### GA4 Event Tag

**Tag Type:** Google Analytics: GA4 Event

**Settings:**
- Configuration Tag: Select your config tag
- Event Name: {{DL - event_name}} or hardcode
- Event Parameters: Add parameters from dataLayer

**Trigger:** Custom Event with event name match

### Facebook Pixel - Base

**Tag Type:** Custom HTML

```html
<!-- Standard Meta Pixel base snippet; replace YOUR_PIXEL_ID with your pixel ID -->
<script>
!function(f,b,e,v,n,t,s){if(f.fbq)return;n=f.fbq=function(){n.callMethod?
n.callMethod.apply(n,arguments):n.queue.push(arguments)};if(!f._fbq)f._fbq=n;
n.push=n;n.loaded=!0;n.version='2.0';n.queue=[];t=b.createElement(e);
t.async=!0;t.src=v;s=b.getElementsByTagName(e)[0];
s.parentNode.insertBefore(t,s)}(window,document,'script',
'https://connect.facebook.net/en_US/fbevents.js');
fbq('init', 'YOUR_PIXEL_ID');
fbq('track', 'PageView');
</script>
```

**Trigger:** All Pages

### Facebook Pixel - Event

**Tag Type:** Custom HTML

```html
<script>
  // 'Lead' is an example standard event; match it to the trigger below
  fbq('track', 'Lead');
</script>
```

**Trigger:** Custom Event - form_submitted

---

## Preview and Debug

### Preview Mode

1. Click "Preview" in GTM
2. Enter site URL
3. 
GTM debug panel opens at bottom + +**What to check:** +- Tags fired on this event +- Tags not fired (and why) +- Variables and their values +- Data layer contents + +### Debug Tips + +**Tag not firing:** +- Check trigger conditions +- Verify data layer push +- Check tag sequencing + +**Wrong variable value:** +- Check data layer structure +- Verify variable path (nested objects) +- Check timing (data may not exist yet) + +**Multiple firings:** +- Check trigger uniqueness +- Look for duplicate tags +- Check tag firing options + +--- + +## Workspaces and Versioning + +### Workspaces + +Use workspaces for team collaboration: +- Default workspace for production +- Separate workspaces for large changes +- Merge when ready + +### Version Management + +**Best practices:** +- Name every version descriptively +- Add notes explaining changes +- Review changes before publish +- Keep production version noted + +**Version notes example:** +``` +v15: Added purchase conversion tracking +- New tag: GA4 - Event - Purchase +- New trigger: Custom Event - purchase +- New variables: DL - transaction_id, DL - value +- Tested: Chrome, Safari, Mobile +``` + +--- + +## Consent Management + +### Consent Mode Integration + +```javascript +// Default state (before consent) +gtag('consent', 'default', { + 'analytics_storage': 'denied', + 'ad_storage': 'denied' +}); + +// Update on consent +function grantConsent() { + gtag('consent', 'update', { + 'analytics_storage': 'granted', + 'ad_storage': 'granted' + }); +} +``` + +### GTM Consent Overview + +1. Enable Consent Overview in Admin +2. Configure consent for each tag +3. 
Tags respect consent state automatically + +--- + +## Advanced Patterns + +### Tag Sequencing + +**Setup tags to fire in order:** +Tag Configuration > Advanced Settings > Tag Sequencing + +**Use cases:** +- Config tag before event tags +- Pixel initialization before tracking +- Cleanup after conversion + +### Exception Handling + +**Trigger exceptions** - Prevent tag from firing: +- Exclude certain pages +- Exclude internal traffic +- Exclude during testing + +### Custom JavaScript Variables + +```javascript +// Get URL parameter +function() { + var params = new URLSearchParams(window.location.search); + return params.get('campaign') || '(not set)'; +} + +// Get cookie value +function() { + var match = document.cookie.match('(^|;) ?user_id=([^;]*)(;|$)'); + return match ? match[2] : null; +} + +// Get data from page +function() { + var el = document.querySelector('.product-price'); + return el ? parseFloat(el.textContent.replace('$', '')) : 0; +} +``` diff --git a/ask-questions-if-underspecified/.skillshare-meta.json b/ask-questions-if-underspecified/.skillshare-meta.json new file mode 100644 index 0000000..e81c72f --- /dev/null +++ b/ask-questions-if-underspecified/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/trailofbits/skills/tree/main/plugins/ask-questions-if-underspecified/skills/ask-questions-if-underspecified", + "type": "github-subdir", + "installed_at": "2026-01-30T02:23:15.701719296Z", + "repo_url": "https://github.com/trailofbits/skills.git", + "subdir": "plugins/ask-questions-if-underspecified/skills/ask-questions-if-underspecified", + "version": "650f6e3" +} \ No newline at end of file diff --git a/ask-questions-if-underspecified/SKILL.md b/ask-questions-if-underspecified/SKILL.md new file mode 100644 index 0000000..9c11bef --- /dev/null +++ b/ask-questions-if-underspecified/SKILL.md @@ -0,0 +1,85 @@ +--- +name: ask-questions-if-underspecified +description: Clarify requirements before implementing. Use when serious doubts arise. 
+--- + +# Ask Questions If Underspecified + +## When to Use + +Use this skill when a request has multiple plausible interpretations or key details (objective, scope, constraints, environment, or safety) are unclear. + +## When NOT to Use + +Do not use this skill when the request is already clear, or when a quick, low-risk discovery read can answer the missing details. + +## Goal + +Ask the minimum set of clarifying questions needed to avoid wrong work; do not start implementing until the must-have questions are answered (or the user explicitly approves proceeding with stated assumptions). + +## Workflow + +### 1) Decide whether the request is underspecified + +Treat a request as underspecified if after exploring how to perform the work, some or all of the following are not clear: +- Define the objective (what should change vs stay the same) +- Define "done" (acceptance criteria, examples, edge cases) +- Define scope (which files/components/users are in/out) +- Define constraints (compatibility, performance, style, deps, time) +- Identify environment (language/runtime versions, OS, build/test runner) +- Clarify safety/reversibility (data migration, rollout/rollback, risk) + +If multiple plausible interpretations exist, assume it is underspecified. + +### 2) Ask must-have questions first (keep it small) + +Ask 1-5 questions in the first pass. Prefer questions that eliminate whole branches of work. 

Make questions easy to answer:
- Optimize for scannability (short, numbered questions; avoid paragraphs)
- Offer multiple-choice options when possible
- Suggest reasonable defaults when appropriate: clearly mark the default/recommended choice (bold it in a list, or put a bold "Recommended" line above a code block and tag the defaults inside the block)
- Include a fast-path response (e.g., reply `defaults` to accept all recommended/default choices)
- Include a low-friction "not sure" option when helpful (e.g., "Not sure - use default")
- Separate "Need to know" from "Nice to know" if that reduces friction
- Structure options so the user can respond with compact decisions (e.g., `1b 2a 3c`); restate the chosen options in plain language to confirm

### 3) Pause before acting

Until must-have answers arrive:
- Do not run commands, edit files, or produce a detailed plan that depends on unknowns
- Do perform a clearly labeled, low-risk discovery step only if it does not commit you to a direction (e.g., inspect repo structure, read relevant config files)

If the user explicitly asks you to proceed without answers:
- State your assumptions as a short numbered list
- Ask for confirmation; proceed only after they confirm or correct them

### 4) Confirm interpretation, then proceed

Once you have answers, restate the requirements in 1-3 sentences (including key constraints and what success looks like), then start work.

## Question templates

- "Before I start, I need: (1) ..., (2) ..., (3) .... If you don't care about (2), I will assume ...."
- "Which of these should it be? A) ... B) ... C) ... (pick one)"
- "What would you consider 'done'? For example: ..."
- "Any constraints I must follow (versions, performance, style, deps)? If none, I will target the existing project defaults." 
+- Use numbered questions with lettered options and a clear reply format + +```text +1) Scope? +a) Minimal change (default) +b) Refactor while touching the area +c) Not sure - use default +2) Compatibility target? +a) Current project defaults (default) +b) Also support older versions: +c) Not sure - use default + +Reply with: defaults (or 1a 2a) +``` + +## Anti-patterns + +- Don't ask questions you can answer with a quick, low-risk discovery read (e.g., configs, existing patterns, docs). +- Don't ask open-ended questions if a tight multiple-choice or yes/no would eliminate ambiguity faster. diff --git a/audit-context-building/.skillshare-meta.json b/audit-context-building/.skillshare-meta.json new file mode 100644 index 0000000..537aab3 --- /dev/null +++ b/audit-context-building/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/trailofbits/skills/tree/main/plugins/audit-context-building/skills/audit-context-building", + "type": "github-subdir", + "installed_at": "2026-01-30T02:23:12.950780406Z", + "repo_url": "https://github.com/trailofbits/skills.git", + "subdir": "plugins/audit-context-building/skills/audit-context-building", + "version": "650f6e3" +} \ No newline at end of file diff --git a/audit-context-building/SKILL.md b/audit-context-building/SKILL.md new file mode 100644 index 0000000..09c2fb8 --- /dev/null +++ b/audit-context-building/SKILL.md @@ -0,0 +1,297 @@ +--- +name: audit-context-building +description: Enables ultra-granular, line-by-line code analysis to build deep architectural context before vulnerability or bug finding. +--- + +# Deep Context Builder Skill (Ultra-Granular Pure Context Mode) + +## 1. Purpose + +This skill governs **how Claude thinks** during the context-building phase of an audit. + +When active, Claude will: +- Perform **line-by-line / block-by-block** code analysis by default. +- Apply **First Principles**, **5 Whys**, and **5 Hows** at micro scale. 
+- Continuously link insights → functions → modules → entire system. +- Maintain a stable, explicit mental model that evolves with new evidence. +- Identify invariants, assumptions, flows, and reasoning hazards. + +This skill defines a structured analysis format (see Example: Function Micro-Analysis below) and runs **before** the vulnerability-hunting phase. + +--- + +## 2. When to Use This Skill + +Use when: +- Deep comprehension is needed before bug or vulnerability discovery. +- You want bottom-up understanding instead of high-level guessing. +- Reducing hallucinations, contradictions, and context loss is critical. +- Preparing for security auditing, architecture review, or threat modeling. + +Do **not** use for: +- Vulnerability findings +- Fix recommendations +- Exploit reasoning +- Severity/impact rating + +--- + +## 3. How This Skill Behaves + +When active, Claude will: +- Default to **ultra-granular analysis** of each block and line. +- Apply micro-level First Principles, 5 Whys, and 5 Hows. +- Build and refine a persistent global mental model. +- Update earlier assumptions when contradicted ("Earlier I thought X; now Y."). +- Periodically anchor summaries to maintain stable context. +- Avoid speculation; express uncertainty explicitly when needed. + +Goal: **deep, accurate understanding**, not conclusions. + +--- + +## Rationalizations (Do Not Skip) + +| Rationalization | Why It's Wrong | Required Action | +|-----------------|----------------|-----------------| +| "I get the gist" | Gist-level understanding misses edge cases | Line-by-line analysis required | +| "This function is simple" | Simple functions compose into complex bugs | Apply 5 Whys anyway | +| "I'll remember this invariant" | You won't. Context degrades. 
| Write it down explicitly | +| "External call is probably fine" | External = adversarial until proven otherwise | Jump into code or model as hostile | +| "I can skip this helper" | Helpers contain assumptions that propagate | Trace the full call chain | +| "This is taking too long" | Rushed context = hallucinated vulnerabilities later | Slow is fast | + +--- + +## 4. Phase 1 — Initial Orientation (Bottom-Up Scan) + +Before deep analysis, Claude performs a minimal mapping: + +1. Identify major modules/files/contracts. +2. Note obvious public/external entrypoints. +3. Identify likely actors (users, owners, relayers, oracles, other contracts). +4. Identify important storage variables, dicts, state structs, or cells. +5. Build a preliminary structure without assuming behavior. + +This establishes anchors for detailed analysis. + +--- + +## 5. Phase 2 — Ultra-Granular Function Analysis (Default Mode) + +Every non-trivial function receives full micro analysis. + +### 5.1 Per-Function Microstructure Checklist + +For each function: + +1. **Purpose** + - Why the function exists and its role in the system. + +2. **Inputs & Assumptions** + - Parameters and implicit inputs (state, sender, env). + - Preconditions and constraints. + +3. **Outputs & Effects** + - Return values. + - State/storage writes. + - Events/messages. + - External interactions. + +4. **Block-by-Block / Line-by-Line Analysis** + For each logical block: + - What it does. + - Why it appears here (ordering logic). + - What assumptions it relies on. + - What invariants it establishes or maintains. + - What later logic depends on it. + + Apply per-block: + - **First Principles** + - **5 Whys** + - **5 Hows** + +--- + +### 5.2 Cross-Function & External Flow Analysis +*(Full Integration of Jump-Into-External-Code Rule)* + +When encountering calls, **continue the same micro-first analysis across boundaries.** + +#### Internal Calls +- Jump into the callee immediately. 
+- Perform block-by-block analysis of relevant code. +- Track flow of data, assumptions, and invariants: + caller → callee → return → caller. +- Note if callee logic behaves differently in this specific call context. + +#### External Calls — Two Cases + +**Case A — External Call to a Contract Whose Code Exists in the Codebase** +Treat as an internal call: +- Jump into the target contract/function. +- Continue block-by-block micro-analysis. +- Propagate invariants and assumptions seamlessly. +- Consider edge cases based on the *actual* code, not a black-box guess. + +**Case B — External Call Without Available Code (True External / Black Box)** +Analyze as adversarial: +- Describe payload/value/gas or parameters sent. +- Identify assumptions about the target. +- Consider all outcomes: + - revert + - incorrect/strange return values + - unexpected state changes + - misbehavior + - reentrancy (if applicable) + +#### Continuity Rule +Treat the entire call chain as **one continuous execution flow**. +Never reset context. +All invariants, assumptions, and data dependencies must propagate across calls. + +--- + +### 5.3 Complete Analysis Example + +See [FUNCTION_MICRO_ANALYSIS_EXAMPLE.md](resources/FUNCTION_MICRO_ANALYSIS_EXAMPLE.md) for a complete walkthrough demonstrating: +- Full micro-analysis of a DEX swap function +- Application of First Principles, 5 Whys, and 5 Hows +- Block-by-block analysis with invariants and assumptions +- Cross-function dependency mapping +- Risk analysis for external interactions + +This example demonstrates the level of depth and structure required for all analyzed functions. + +--- + +### 5.4 Output Requirements + +When performing ultra-granular analysis, Claude MUST structure output following the format defined in [OUTPUT_REQUIREMENTS.md](resources/OUTPUT_REQUIREMENTS.md). 
+ +Key requirements: +- **Purpose** (2-3 sentences minimum) +- **Inputs & Assumptions** (all parameters, preconditions, trust assumptions) +- **Outputs & Effects** (returns, state writes, external calls, events, postconditions) +- **Block-by-Block Analysis** (What, Why here, Assumptions, First Principles/5 Whys/5 Hows) +- **Cross-Function Dependencies** (internal calls, external calls with risk analysis, shared state) + +Quality thresholds: +- Minimum 3 invariants per function +- Minimum 5 assumptions documented +- Minimum 3 risk considerations for external interactions +- At least 1 First Principles application +- At least 3 combined 5 Whys/5 Hows applications + +--- + +### 5.5 Completeness Checklist + +Before concluding micro-analysis of a function, verify against the [COMPLETENESS_CHECKLIST.md](resources/COMPLETENESS_CHECKLIST.md): + +- **Structural Completeness**: All required sections present (Purpose, Inputs, Outputs, Block-by-Block, Dependencies) +- **Content Depth**: Minimum thresholds met (invariants, assumptions, risk analysis, First Principles) +- **Continuity & Integration**: Cross-references, propagated assumptions, invariant couplings +- **Anti-Hallucination**: Line number citations, no vague statements, evidence-based claims + +Analysis is complete when all checklist items are satisfied and no unresolved "unclear" items remain. + +--- + +## 6. Phase 3 — Global System Understanding + +After sufficient micro-analysis: + +1. **State & Invariant Reconstruction** + - Map reads/writes of each state variable. + - Derive multi-function and multi-module invariants. + +2. **Workflow Reconstruction** + - Identify end-to-end flows (deposit, withdraw, lifecycle, upgrades). + - Track how state transforms across these flows. + - Record assumptions that persist across steps. + +3. **Trust Boundary Mapping** + - Actor → entrypoint → behavior. + - Identify untrusted input paths. + - Privilege changes and implicit role expectations. + +4. 
**Complexity & Fragility Clustering** + - Functions with many assumptions. + - High branching logic. + - Multi-step dependencies. + - Coupled state changes across modules. + +These clusters help guide the vulnerability-hunting phase. + +--- + +## 7. Stability & Consistency Rules +*(Anti-Hallucination, Anti-Contradiction)* + +Claude must: + +- **Never reshape evidence to fit earlier assumptions.** + When contradicted: + - Update the model. + - State the correction explicitly. + +- **Periodically anchor key facts** + Summarize core: + - invariants + - state relationships + - actor roles + - workflows + +- **Avoid vague guesses** + Use: + - "Unclear; need to inspect X." + instead of: + - "It probably…" + +- **Cross-reference constantly** + Connect new insights to previous state, flows, and invariants to maintain global coherence. + +--- + +## 8. Subagent Usage + +Claude may spawn subagents for: +- Dense or complex functions. +- Long data-flow or control-flow chains. +- Cryptographic / mathematical logic. +- Complex state machines. +- Multi-module workflow reconstruction. + +Subagents must: +- Follow the same micro-first rules. +- Return summaries that Claude integrates into its global model. + +--- + +## 9. Relationship to Other Phases + +This skill runs **before**: +- Vulnerability discovery +- Classification / triage +- Report writing +- Impact modeling +- Exploit reasoning + +It exists solely to build: +- Deep understanding +- Stable context +- System-level clarity + +--- + +## 10. Non-Goals + +While active, Claude should NOT: +- Identify vulnerabilities +- Propose fixes +- Generate proofs-of-concept +- Model exploits +- Assign severity or impact + +This is **pure context building** only. 
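
### Appendix: Sketching the Phase 3 Read/Write Map

The state/invariant bookkeeping described in Phase 3 can be sketched as a simple read/write map. This is an illustrative aid only — the `StateAccessMap` name and structure are invented for this sketch and are not part of the skill's required workflow:

```python
from collections import defaultdict

class StateAccessMap:
    """Tracks which functions read or write each state variable (Phase 3 bookkeeping)."""

    def __init__(self):
        self.reads = defaultdict(set)   # variable -> functions that read it
        self.writes = defaultdict(set)  # variable -> functions that write it

    def record(self, function, variable, *, is_write):
        # Record one access; is_write selects which map to update.
        target = self.writes if is_write else self.reads
        target[variable].add(function)

    def coupled_functions(self, variable):
        """All functions touching this variable: candidates for cross-function invariants."""
        return self.reads[variable] | self.writes[variable]

# Example: three hypothetical functions sharing a `reserves` mapping
# must uphold a joint invariant, so they cluster together for review.
m = StateAccessMap()
m.record("swap", "reserves", is_write=True)
m.record("addLiquidity", "reserves", is_write=True)
m.record("getReserves", "reserves", is_write=False)
print(sorted(m.coupled_functions("reserves")))  # ['addLiquidity', 'getReserves', 'swap']
```

Any variable whose `coupled_functions` set spans multiple modules is a candidate for the Complexity & Fragility clusters above.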
diff --git a/audit-context-building/resources/COMPLETENESS_CHECKLIST.md b/audit-context-building/resources/COMPLETENESS_CHECKLIST.md new file mode 100644 index 0000000..9561a47 --- /dev/null +++ b/audit-context-building/resources/COMPLETENESS_CHECKLIST.md @@ -0,0 +1,47 @@ +# Completeness Checklist + +Before concluding micro-analysis of a function, verify: + +--- + +## Structural Completeness +- [ ] Purpose section: 2+ sentences explaining function role +- [ ] Inputs & Assumptions section: All parameters + implicit inputs documented +- [ ] Outputs & Effects section: All returns, state writes, external calls, events +- [ ] Block-by-Block Analysis: Every logical block analyzed (no gaps) +- [ ] Cross-Function Dependencies: All calls and shared state documented + +--- + +## Content Depth +- [ ] Identified at least 3 invariants (what must always hold) +- [ ] Documented at least 5 assumptions (what is assumed true) +- [ ] Applied First Principles at least once +- [ ] Applied 5 Whys or 5 Hows at least 3 times total +- [ ] Risk analysis for all external interactions (reentrancy, malicious contracts, etc.) + +--- + +## Continuity & Integration +- [ ] Cross-reference with related functions (if internal calls exist, analyze callees) +- [ ] Propagated assumptions from callers (if this function is called by others) +- [ ] Identified invariant couplings (how this function's invariants relate to global system) +- [ ] Tracked data flow across function boundaries (if applicable) + +--- + +## Anti-Hallucination Verification +- [ ] All claims reference specific line numbers (L45, L98-102, etc.) +- [ ] No vague statements ("probably", "might", "seems to") - replaced with "unclear; need to check X" +- [ ] Contradictions resolved (if earlier analysis conflicts with current findings, explicitly updated) +- [ ] Evidence-based: Every invariant/assumption tied to actual code + +--- + +## Completeness Signal + +Analysis is complete when: +1. All checklist items above are satisfied +2. 
No remaining "TODO: analyze X" or "unclear Y" items +3. Full call chain analyzed (for internal calls, jumped into and analyzed) +4. All identified risks have mitigation analysis or acknowledged as unresolved diff --git a/audit-context-building/resources/FUNCTION_MICRO_ANALYSIS_EXAMPLE.md b/audit-context-building/resources/FUNCTION_MICRO_ANALYSIS_EXAMPLE.md new file mode 100644 index 0000000..c571a45 --- /dev/null +++ b/audit-context-building/resources/FUNCTION_MICRO_ANALYSIS_EXAMPLE.md @@ -0,0 +1,355 @@ +# Function Micro-Analysis Example + +This example demonstrates a complete micro-analysis following the Per-Function Microstructure Checklist. + +--- + +## Target: `swap(address tokenIn, address tokenOut, uint256 amountIn, uint256 minAmountOut, uint256 deadline)` in Router.sol + +**Purpose:** +Enables users to swap one token for another through a liquidity pool. Core trading operation in a DEX that: +- Calculates output amount using constant product formula (x * y = k) +- Deducts 0.3% protocol fee from input amount +- Enforces user-specified slippage protection +- Updates pool reserves to maintain AMM invariant +- Prevents stale transactions via deadline check + +This is a critical financial primitive affecting pool solvency, user fund safety, and protocol fee collection. + +--- + +**Inputs & Assumptions:** + +*Parameters:* +- `tokenIn` (address): Source token to swap from. Assumed untrusted (could be malicious ERC20). +- `tokenOut` (address): Destination token to receive. Assumed untrusted. +- `amountIn` (uint256): Amount of tokenIn to swap. User-specified, untrusted input. +- `minAmountOut` (uint256): Minimum acceptable output. User-specified slippage tolerance. +- `deadline` (uint256): Unix timestamp. Transaction must execute before this or revert. + +*Implicit Inputs:* +- `msg.sender`: Transaction initiator. Assumed to have approved Router to spend amountIn of tokenIn. +- `pairs[tokenIn][tokenOut]`: Storage mapping to pool address. 
Assumed populated during pool creation. +- `reserves[pair]`: Pool's current token reserves. Assumed synchronized with actual pool balances. +- `block.timestamp`: Current block time. Assumed honest (no validator manipulation considered here). + +*Preconditions:* +- Pool exists for tokenIn/tokenOut pair (pairs[tokenIn][tokenOut] != address(0)) +- msg.sender has approved Router for at least amountIn of tokenIn +- msg.sender balance of tokenIn >= amountIn +- Pool has sufficient liquidity to output at least minAmountOut +- block.timestamp <= deadline + +*Trust Assumptions:* +- Pool contract correctly maintains reserves +- ERC20 tokens follow standard behavior (return true on success, revert on failure) +- No reentrancy from tokenIn/tokenOut during transfers (or handled by nonReentrant modifier) + +--- + +**Outputs & Effects:** + +*Returns:* +- Implicit: amountOut (not returned, but emitted in event) + +*State Writes:* +- `reserves[pair].reserve0` and `reserves[pair].reserve1`: Updated to reflect post-swap balances +- Pool token balances: Physical token transfers change actual balances + +*External Interactions:* +- `IERC20(tokenIn).transferFrom(msg.sender, pair, amountIn)`: Pulls tokenIn from user to pool +- `IERC20(tokenOut).transfer(msg.sender, amountOut)`: Sends tokenOut from pool to user + +*Events Emitted:* +- `Swap(msg.sender, tokenIn, tokenOut, amountIn, amountOut, block.timestamp)` + +*Postconditions:* +- `amountOut >= minAmountOut` (slippage protection enforced) +- Pool reserves updated: `reserve0 * reserve1 >= k_before` (constant product maintained with fee) +- User received exactly amountOut of tokenOut +- Pool received exactly amountIn of tokenIn +- Fee collected: `amountIn * 0.003` remains in pool as liquidity + +--- + +**Block-by-Block Analysis:** + +```solidity +// L90: Deadline validation (modifier: ensure(deadline)) +modifier ensure(uint256 deadline) { + require(block.timestamp <= deadline, "Expired"); + _; +} +``` +- **What:** Checks transaction hasn't 
expired based on user-provided deadline +- **Why here:** First line of defense; fail fast before any state reads or computation +- **Assumption:** `block.timestamp` is sufficiently honest (no 900-second manipulation considered) +- **Depends on:** User setting reasonable deadline (e.g., block.timestamp + 300 seconds) +- **First Principles:** Time-sensitive operations need expiration to prevent stale execution at unexpected prices +- **5 Whys:** + - Why check deadline? → Prevent stale transactions + - Why are stale transactions bad? → Price may have moved significantly + - Why not just use slippage protection? → Slippage doesn't prevent execution hours later + - Why does timing matter? → Market conditions change, user intent expires + - Why user-provided vs fixed? → User decides their time tolerance based on urgency + +--- + +```solidity +// L92-94: Input validation +require(amountIn > 0, "Invalid input amount"); +require(minAmountOut > 0, "Invalid minimum output"); +require(tokenIn != tokenOut, "Identical tokens"); +``` +- **What:** Validates basic input sanity (non-zero amounts, different tokens) +- **Why here:** Second line of defense; cheap checks before expensive operations +- **Assumption:** Zero amounts indicate user error, not intentional probe +- **Invariant established:** `amountIn > 0 && minAmountOut > 0 && tokenIn != tokenOut` +- **First Principles:** Fail fast on invalid input before consuming gas on computation/storage +- **5 Hows:** + - How to ensure valid swap? → Check inputs meet minimum requirements + - How to check minimum requirements? → Test amounts > 0 and tokens differ + - How to handle violations? → Revert with descriptive error + - How to order checks? → Cheapest first (inequality checks before storage reads) + - How to communicate failure? 
→ Require statements with clear messages + +--- + +```solidity +// L98-99: Pool resolution +address pair = pairs[tokenIn][tokenOut]; +require(pair != address(0), "Pool does not exist"); +``` +- **What:** Looks up liquidity pool address for token pair, validates existence +- **Why here:** Must identify pool before reading reserves or executing transfers +- **Assumption:** `pairs` mapping is correctly populated during pool creation; no race conditions +- **Depends on:** Factory having called createPair(tokenIn, tokenOut) previously +- **Invariant established:** `pair != 0x0` (valid pool address exists) +- **Risk:** If pairs mapping is corrupted or pool address is incorrect, funds could be sent to wrong address + +--- + +```solidity +// L102-103: Reserve reads +(uint112 reserveIn, uint112 reserveOut) = getReserves(pair, tokenIn, tokenOut); +require(reserveIn > 0 && reserveOut > 0, "Insufficient liquidity"); +``` +- **What:** Reads current pool reserves for tokenIn and tokenOut, validates pool has liquidity +- **Why here:** Need current reserves to calculate output amount; must confirm pool is operational +- **Assumption:** `reserves[pair]` storage is synchronized with actual pool token balances +- **Invariant established:** `reserveIn > 0 && reserveOut > 0` (pool is liquid) +- **Depends on:** Sync mechanism keeping reserves accurate (called after transfers/swaps) +- **5 Whys:** + - Why read reserves? → Need current pool state for price calculation + - Why must reserves be > 0? → Division by zero in formula if empty + - Why check liquidity here? → Cheaper to fail now than after transferFrom + - Why not just try the swap? → Better UX with specific error message + - Why trust reserves storage? 
→ Alternative is querying balances (expensive)
+
+---
+
+```solidity
+// L108-109: Fee application
+uint256 amountInWithFee = amountIn * 997;
+uint256 numerator = amountInWithFee * reserveOut;
+```
+- **What:** Applies the 0.3% protocol fee by multiplying amountIn by 997 (the denominator is later scaled by 1000 to match)
+- **Why here:** Fee must be applied before price calculation to affect output amount
+- **Assumption:** 997/1000 = 0.997 = (1 - 0.003) represents the 0.3% fee deduction
+- **Invariant maintained:** `amountInWithFee = amountIn * 997` (i.e., `amountIn * 0.997` scaled by 1000; a 3/1000 fee is taken)
+- **First Principles:** Fees reduce the effective input, reducing output proportionally
+- **5 Whys:**
+  - Why multiply by 997? → Gas optimization: avoids a separate subtraction step
+  - Why not amountIn * 0.997? → Solidity doesn't support floating point
+  - Why 0.3% fee? → Protocol parameter (Uniswap V2 standard, commonly copied)
+  - Why apply before calculation? → Fee reduces the effective input amount, so it must affect the price
+  - Why not apply after? → Would incorrectly calculate output at the full amountIn
+
+---
+
+```solidity
+// L110-111: Output calculation (constant product formula)
+uint256 denominator = (reserveIn * 1000) + amountInWithFee;
+uint256 amountOut = numerator / denominator;
+```
+- **What:** Calculates output amount using the AMM constant product formula: `Δy = (y * Δx_fee) / (x + Δx_fee)`, with `x = reserveIn`, `y = reserveOut`
+- **Why here:** After fee application; core pricing logic of the AMM
+- **Assumption:** `k = reserveIn * reserveOut` is the invariant to maintain (the fee slightly increases k)
+- **Invariant formula:** `(reserveIn + amountIn) * (reserveOut - amountOut) >= reserveIn * reserveOut`
+- **First Principles:** Constant product AMM maintains `x * y = k` (with the fee slightly increasing k)
+- **5 Whys:**
+  - Why this formula? → Constant product market maker (x * y = k)
+  - Why not linear pricing? → Would drain the pool at a constant price (exploitable)
+  - Why multiply reserveIn by 1000? → The numerator carries a factor of 997, so reserveIn must be scaled by 1000 to keep units consistent
+  - Why divide?
→ Solving for Δy in: (x + Δx_fee) * (y - Δy) = k
+  - Why this maintains k? → The formula solves (x + Δx_fee) * (y - Δy) = k exactly, but the pool actually gains the full amountIn, so the new product is strictly greater than k (the fee stays in the pool)
+- **Mathematical verification:**
+  - Given: `k = reserveIn * reserveOut`
+  - Fee-adjusted input: `Δx_fee = amountIn * 0.997`
+  - Solving `(reserveIn + Δx_fee) * (reserveOut - amountOut) = k`:
+    - `reserveOut - amountOut = k / (reserveIn + Δx_fee)`
+    - `amountOut = reserveOut * Δx_fee / (reserveIn + Δx_fee)`
+  - Multiplying numerator and denominator by 1000 gives the integer formula above
+  - Actual reserves grow by the full `amountIn`, so `(reserveIn + amountIn) * (reserveOut - amountOut) > k`
+
+---
+
+```solidity
+// L115: Slippage protection enforcement
+require(amountOut >= minAmountOut, "Slippage exceeded");
+```
+- **What:** Validates calculated output meets user's minimum acceptable amount
+- **Why here:** After calculation, before any state changes or transfers (fail fast if insufficient)
+- **Assumption:** User calculated minAmountOut correctly based on acceptable slippage tolerance
+- **Invariant enforced:** `amountOut >= minAmountOut` (user-defined slippage limit)
+- **First Principles:** User must explicitly consent to price via slippage tolerance; prevents sandwich attacks
+- **5 Whys:**
+  - Why check minAmountOut? → Protect user from excessive slippage
+  - Why is slippage protection critical? → Prevents sandwich attacks and MEV extraction
+  - Why user-specified? → Different users have different risk tolerances
+  - Why fail here vs warn? → Financial safety: user should not receive less than intended
+  - Why before transfers?
→ Cheaper to revert now than after expensive external calls +- **Attack scenario prevented:** + - Attacker front-runs with large buy → price increases + - Victim's swap would execute at worse price + - This check causes victim's transaction to revert instead + - Attacker cannot profit from sandwich + +--- + +```solidity +// L118: Input token transfer (pull pattern) +IERC20(tokenIn).transferFrom(msg.sender, pair, amountIn); +``` +- **What:** Pulls tokenIn from user to liquidity pool +- **Why here:** After all validations pass; begins state-changing operations (point of no return) +- **Assumption:** User has approved Router for at least amountIn; tokenIn is standard ERC20 +- **Depends on:** Prior approval: `tokenIn.approve(router, amountIn)` called by user +- **Risk considerations:** + - If tokenIn is malicious: could revert (DoS), consume excessive gas, or attempt reentrancy + - If tokenIn has transfer fee: actual amount received < amountIn (breaks invariant) + - If tokenIn is pausable: could revert if paused + - Reentrancy: If tokenIn has callback, attacker could call Router again (mitigated by nonReentrant modifier) +- **First Principles:** Pull pattern (transferFrom) is safer than users sending first (push) - Router controls timing +- **5 Hows:** + - How to get tokenIn? → Pull from user via transferFrom + - How to ensure Router can pull? → User must have approved Router + - How to specify destination? → Send directly to pair (gas optimization: no router intermediate storage) + - How to handle failures? → transferFrom reverts on failure (ERC20 standard) + - How to prevent reentrancy? 
→ nonReentrant modifier (assumed present) + +--- + +```solidity +// L122: Output token transfer (push pattern) +IERC20(tokenOut).transfer(msg.sender, amountOut); +``` +- **What:** Sends calculated amountOut of tokenOut from pool to user +- **Why here:** After input transfer succeeds; completes the swap atomically +- **Assumption:** Pool has at least amountOut of tokenOut; tokenOut is standard ERC20 +- **Invariant maintained:** User receives exact amountOut (no more, no less) +- **Risk considerations:** + - If tokenOut is malicious: could revert (DoS), but user selected this token pair + - If tokenOut has transfer hook: could attempt reentrancy (mitigated by nonReentrant) + - If transfer fails: entire transaction reverts (atomic swap) +- **CEI pattern:** Not strictly followed (Check-Effects-Interactions) - both transfers are interactions + - Typically Effects (reserve update) should precede Interactions (transfers) + - Here, transfers happen before reserve update (see next block) + - Justification: nonReentrant modifier prevents exploitation +- **5 Whys:** + - Why transfer to msg.sender? → User initiated swap, they receive output + - Why not to an arbitrary recipient? → Simplicity; extensions can add recipient parameter + - Why this amount exactly? → amountOut calculated from constant product formula + - Why after input transfer? → Ensures atomicity: both succeed or both fail + - Why trust pool has balance? 
→ Pool's job to maintain reserves; if insufficient, transfer reverts
+
+---
+
+```solidity
+// L125-126: Reserve synchronization
+reserves[pair].reserve0 = uint112(reserveIn + amountIn);
+reserves[pair].reserve1 = uint112(reserveOut - amountOut);
+```
+- **What:** Updates stored reserves to reflect post-swap balances
+- **Why here:** After transfers complete; brings storage in sync with actual balances
+- **Assumption:** No other operations have modified pool balances since reserves were read
+- **Ordering assumption:** `getReserves` returned reserves ordered for tokenIn/tokenOut; writing them back directly as reserve0/reserve1 implicitly assumes tokenIn is the pool's token0 — real code must map back to the canonical order
+- **Invariant maintained:** `reserve0 * reserve1 >= k_before` (strictly greater after a fee-bearing swap)
+- **Casting risk:** `uint112` casting could silently truncate if reserves exceed 2^112 - 1 (≈ 5.2e33)
+  - For most tokens with 18 decimals: limit is ~5.2e15 whole tokens
+  - The snippet casts unchecked; production code should require reserves fit in uint112 and revert otherwise
+- **5 Whys:**
+  - Why update reserves? → Storage must match actual balances for the next swap
+  - Why after transfers? → Need to know the final state before recording
+  - Why not query balances? → Gas optimization: a storage update is cheaper than CALL + BALANCE
+  - Why uint112? → Pack two reserves in one storage slot (256 bits = 2 * 112 + 32 for timestamp)
+  - Why this formula?
→ reserveIn increased by amountIn, reserveOut decreased by amountOut
+- **Invariant verification:**
+  - Before: `k_before = reserveIn * reserveOut`
+  - After: `k_after = (reserveIn + amountIn) * (reserveOut - amountOut)`
+  - With the 0.3% fee: `k_after > k_before` — the fee remains in the pool as permanent liquidity; the relative growth is roughly `0.003 * amountIn / reserveIn`, not a flat 0.3%
+
+---
+
+```solidity
+// L130: Event emission
+emit Swap(msg.sender, tokenIn, tokenOut, amountIn, amountOut, block.timestamp);
+```
+- **What:** Emits event logging swap details for off-chain indexing
+- **Why here:** After all state changes finalized; last operation before return
+- **Assumption:** Event watchers (subgraphs, DEX aggregators) rely on this for tracking trades
+- **Data included:**
+  - `msg.sender`: Who initiated swap (for user trade history)
+  - `tokenIn/tokenOut`: Which pair was traded
+  - `amountIn/amountOut`: Exact amounts for price tracking
+  - `block.timestamp`: When trade occurred (for TWAP calculations, analytics)
+- **First Principles:** Events are a write-only log for off-chain systems; they don't affect on-chain state
+- **5 Hows:**
+  - How to notify off-chain? → Emit event (logs are cheaper than storage)
+  - How to structure event? → Include all relevant swap parameters
+  - How do indexers use this? → Build trade history, calculate volume, track prices
+  - How to ensure consistency? → Emit after state finalized (can't be front-run)
+  - How to query later?
→ Blockchain logs filtered by event signature + contract address
+
+---
+
+**Cross-Function Dependencies:**
+
+*Internal Calls:*
+- `getReserves(pair, tokenIn, tokenOut)`: Helper to read and order reserves based on token addresses
+  - Depends on: `reserves[pair]` storage being synchronized
+  - Returns: (reserveIn, reserveOut) in correct order for tokenIn/tokenOut
+
+*External Calls (Outbound):*
+- `IERC20(tokenIn).transferFrom(msg.sender, pair, amountIn)`: ERC20 standard call
+  - Assumes: tokenIn implements ERC20, user has approved Router
+  - Reentrancy risk: If tokenIn is malicious, could callback
+  - Failure: Reverts entire transaction
+- `IERC20(tokenOut).transfer(msg.sender, amountOut)`: ERC20 standard call
+  - Assumes: Pool has sufficient tokenOut balance
+  - Reentrancy risk: If tokenOut has hooks
+  - Failure: Reverts entire transaction
+
+*Called By:*
+- Users directly (external call)
+- Aggregators/routers (external call)
+- Multi-hop swap functions (internal call from same contract)
+
+*Shares State With:*
+- `addLiquidity()`: Modifies same reserves[pair], must maintain k invariant
+- `removeLiquidity()`: Modifies same reserves[pair]
+- `sync()`: Emergency function to force reserves sync with balances
+- `skim()`: Removes excess tokens beyond reserves
+
+*Invariant Coupling:*
+- **Global invariant:** Recorded reserves never exceed actual balances — for every pair, `reserve0 <= token0.balanceOf(pair)` and `reserve1 <= token1.balanceOf(pair)`
+- **Per-pool invariant:** `reserves[pair].reserve0 * reserves[pair].reserve1` is non-decreasing; every fee-bearing swap strictly increases k
+  - The per-swap increase scales with trade size relative to reserves, not a flat 0.3% per swap
+- **Reentrancy protection:** `nonReentrant` modifier ensures no cross-function reentrancy
+  - swap() cannot be re-entered while executing
+  - addLiquidity/removeLiquidity also cannot execute during swap
+
+*Assumptions Propagated to Callers:*
+- Caller must have approved Router to spend amountIn of tokenIn
+- Caller must set reasonable deadline (e.g., block.timestamp + 300 seconds)
+- Caller must
calculate minAmountOut based on acceptable slippage (e.g., expectedOutput * 0.99 for 1%) +- Caller assumes pair exists (or will handle "Pool does not exist" revert) diff --git a/audit-context-building/resources/OUTPUT_REQUIREMENTS.md b/audit-context-building/resources/OUTPUT_REQUIREMENTS.md new file mode 100644 index 0000000..ca2ace1 --- /dev/null +++ b/audit-context-building/resources/OUTPUT_REQUIREMENTS.md @@ -0,0 +1,71 @@ +# Output Requirements + +When performing ultra-granular analysis, Claude MUST structure output following the Per-Function Microstructure Checklist format demonstrated in [FUNCTION_MICRO_ANALYSIS_EXAMPLE.md](FUNCTION_MICRO_ANALYSIS_EXAMPLE.md). + +--- + +## Required Structure + +For EACH analyzed function, output MUST include: + +**1. Purpose** (mandatory) +- Clear statement of function's role in the system +- Impact on system state, security, or economics +- Minimum 2-3 sentences + +**2. Inputs & Assumptions** (mandatory) +- All parameters (explicit and implicit) +- All preconditions +- All trust assumptions +- Each input must identify: type, source, trust level +- Minimum 3 assumptions documented + +**3. Outputs & Effects** (mandatory) +- Return values (or "void" if none) +- All state writes +- All external interactions +- All events emitted +- All postconditions +- Minimum 3 effects documented + +**4. Block-by-Block Analysis** (mandatory) +For EACH logical code block, document: +- **What:** What the block does (1 sentence) +- **Why here:** Why this ordering/placement (1 sentence) +- **Assumptions:** What must be true (1+ items) +- **Depends on:** What prior state/logic this relies on +- **First Principles / 5 Whys / 5 Hows:** Apply at least ONE per block + +Minimum standards: +- Analyze at minimum: ALL conditional branches, ALL external calls, ALL state modifications +- For complex blocks (>5 lines): Apply First Principles AND 5 Whys or 5 Hows +- For simple blocks (<5 lines): Minimum What + Why here + 1 Assumption + +**5. 
Cross-Function Dependencies** (mandatory) +- Internal calls made (list all) +- External calls made (list all with risk analysis) +- Functions that call this function +- Shared state with other functions +- Invariant couplings (how this function's invariants interact with others) +- Minimum 3 dependency relationships documented + +--- + +## Quality Thresholds + +A complete micro-analysis MUST identify: +- Minimum 3 invariants (per function) +- Minimum 5 assumptions (across all sections) +- Minimum 3 risk considerations (especially for external interactions) +- At least 1 application of First Principles +- At least 3 applications of 5 Whys or 5 Hows (combined) + +--- + +## Format Consistency + +- Use markdown headers: `**Section Name:**` for major sections +- Use bullet points (`-`) for lists +- Use code blocks (` ```solidity `) for code snippets +- Reference line numbers: `L45`, `lines 98-102` +- Separate blocks with `---` horizontal rules for readability diff --git a/audit-prep-assistant/.skillshare-meta.json b/audit-prep-assistant/.skillshare-meta.json new file mode 100644 index 0000000..787f0ab --- /dev/null +++ b/audit-prep-assistant/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/trailofbits/skills/tree/main/plugins/building-secure-contracts/skills/audit-prep-assistant", + "type": "github-subdir", + "installed_at": "2026-01-30T02:24:00.994207953Z", + "repo_url": "https://github.com/trailofbits/skills.git", + "subdir": "plugins/building-secure-contracts/skills/audit-prep-assistant", + "version": "650f6e3" +} \ No newline at end of file diff --git a/audit-prep-assistant/SKILL.md b/audit-prep-assistant/SKILL.md new file mode 100644 index 0000000..1379632 --- /dev/null +++ b/audit-prep-assistant/SKILL.md @@ -0,0 +1,409 @@ +--- +name: audit-prep-assistant +description: Prepares codebases for security review using Trail of Bits' checklist. 
Helps set review goals, runs static analysis tools, increases test coverage, removes dead code, ensures accessibility, and generates documentation (flowcharts, user stories, inline comments). +--- + +# Audit Prep Assistant + +## Purpose + +Helps prepare for a security review using Trail of Bits' checklist. A well-prepared codebase makes the review process smoother and more effective. + +**Use this**: 1-2 weeks before your security audit + +--- + +## The Preparation Process + +### Step 1: Set Review Goals + +Helps define what you want from the review: + +**Key Questions**: +- What's the overall security level you're aiming for? +- What areas concern you most? + - Previous audit issues? + - Complex components? + - Fragile parts? +- What's the worst-case scenario for your project? + +Documents goals to share with the assessment team. + +--- + +### Step 2: Resolve Easy Issues + +Runs static analysis and helps fix low-hanging fruit: + +**Run Static Analysis**: + +For Solidity: +```bash +slither . 
--exclude-dependencies +``` + +For Rust: +```bash +dylint --all +``` + +For Go: +```bash +golangci-lint run +``` + +For Go/Rust/C++: +```bash +# CodeQL and Semgrep checks +``` + +Then I'll: +- Triage all findings +- Help fix easy issues +- Document accepted risks + +**Increase Test Coverage**: +- Analyze current coverage +- Identify untested code +- Suggest new tests +- Run full test suite + +**Remove Dead Code**: +- Find unused functions/variables +- Identify unused libraries +- Locate stale features +- Suggest cleanup + +**Goal**: Clean static analysis report, high test coverage, minimal dead code + +--- + +### Step 3: Ensure Code Accessibility + +Helps make code clear and accessible: + +**Provide Detailed File List**: +- List all files in scope +- Mark out-of-scope files +- Explain folder structure +- Document dependencies + +**Create Build Instructions**: +- Write step-by-step setup guide +- Test on fresh environment +- Document dependencies and versions +- Verify build succeeds + +**Freeze Stable Version**: +- Identify commit hash for review +- Create dedicated branch +- Tag release version +- Lock dependencies + +**Identify Boilerplate**: +- Mark copied/forked code +- Highlight your modifications +- Document third-party code +- Focus review on your code + +--- + +### Step 4: Generate Documentation + +Helps create documentation: + +**Flowcharts and Sequence Diagrams**: +- Map primary workflows +- Show component relationships +- Visualize data flow +- Identify critical paths + +**User Stories**: +- Define user roles +- Document use cases +- Explain interactions +- Clarify expectations + +**On-chain/Off-chain Assumptions**: +- Data validation procedures +- Oracle information +- Bridge assumptions +- Trust boundaries + +**Actors and Privileges**: +- List all actors +- Document roles +- Define privileges +- Map access controls + +**External Developer Docs**: +- Link docs to code +- Keep synchronized +- Explain architecture +- Document APIs + +**Function 
Documentation**: +- System and function invariants +- Parameter ranges (min/max values) +- Arithmetic formulas and precision loss +- Complex logic explanations +- NatSpec for Solidity + +**Glossary**: +- Define domain terms +- Explain acronyms +- Consistent terminology +- Business logic concepts + +**Video Walkthroughs** (optional): +- Complex workflows +- Areas of concern +- Architecture overview + +--- + +## How I Work + +When invoked, I will: + +1. **Help set review goals** - Ask about concerns and document them +2. **Run static analysis** - Execute appropriate tools for your platform +3. **Analyze test coverage** - Identify gaps and suggest improvements +4. **Find dead code** - Search for unused code and libraries +5. **Review accessibility** - Check build instructions and scope clarity +6. **Generate documentation** - Create flowcharts, user stories, glossaries +7. **Create prep checklist** - Track what's done and what's remaining + +Adapts based on: +- Your platform (Solidity, Rust, Go, etc.) 
+- Available tools +- Existing documentation +- Review timeline + +--- + +## Rationalizations (Do Not Skip) + +| Rationalization | Why It's Wrong | Required Action | +|-----------------|----------------|-----------------| +| "README covers setup, no need for detailed build instructions" | READMEs assume context auditors don't have | Test build on fresh environment, document every dependency version | +| "Static analysis already ran, no need to run again" | Codebase changed since last run | Execute static analysis tools, generate fresh report | +| "Test coverage looks decent" | "Looks decent" isn't measured coverage | Run coverage tools, identify specific untested code paths | +| "Not much dead code to worry about" | Dead code hides during manual review | Use automated detection tools to find unused functions/variables | +| "Architecture is straightforward, no diagrams needed" | Text descriptions miss visual patterns | Generate actual flowcharts and sequence diagrams | +| "Can freeze version right before audit" | Last-minute freezing creates rushed handoff | Identify and document commit hash now, create dedicated branch | +| "Terms are self-explanatory" | Domain knowledge isn't universal | Create comprehensive glossary with all domain-specific terms | +| "I'll do this step later" | Steps build on each other - skipping creates gaps | Complete all 4 steps sequentially, track progress with checklist | + +--- + +## Example Output + +When I finish helping you prepare, you'll have concrete deliverables like: + +``` +=== AUDIT PREP PACKAGE === + +Project: DeFi DEX Protocol +Audit Date: March 15, 2024 +Preparation Status: Complete + +--- + +## REVIEW GOALS DOCUMENT + +Security Objectives: +- Verify economic security of liquidity pool swaps +- Validate oracle manipulation resistance +- Assess flash loan attack vectors + +Areas of Concern: +1. Complex AMM pricing calculation (src/SwapRouter.sol:89-156) +2. Multi-hop swap routing logic (src/Router.sol) +3. 
Oracle price aggregation (src/PriceOracle.sol:45-78) + +Worst-Case Scenario: +- Flash loan attack drains liquidity pools via oracle manipulation + +Questions for Auditors: +- Can the AMM pricing model produce negative slippage under edge cases? +- Is the slippage protection sufficient to prevent sandwich attacks? +- How resilient is the system to temporary oracle failures? + +--- + +## STATIC ANALYSIS REPORT + +Slither Scan Results: +✓ High: 0 issues +✓ Medium: 0 issues +⚠ Low: 2 issues (triaged - documented in TRIAGE.md) +ℹ Info: 5 issues (code style, acceptable) + +Tool: slither . --exclude-dependencies +Date: March 1, 2024 +Status: CLEAN (all critical issues resolved) + +--- + +## TEST COVERAGE REPORT + +Overall Coverage: 94% +- Statements: 1,245 / 1,321 (94%) +- Branches: 456 / 498 (92%) +- Functions: 89 / 92 (97%) + +Uncovered Areas: +- Emergency pause admin functions (tested manually) +- Governance migration path (one-time use) + +Command: forge coverage +Status: EXCELLENT + +--- + +## CODE SCOPE + +In-Scope Files (8): +✓ src/SwapRouter.sol (456 lines) +✓ src/LiquidityPool.sol (234 lines) +✓ src/PairFactory.sol (389 lines) +✓ src/PriceOracle.sol (167 lines) +✓ src/LiquidityManager.sol (298 lines) +✓ src/Governance.sol (201 lines) +✓ src/FlashLoan.sol (145 lines) +✓ src/RewardsDistributor.sol (178 lines) + +Out-of-Scope: +- lib/ (OpenZeppelin, external dependencies) +- test/ (test contracts) +- scripts/ (deployment scripts) + +Total In-Scope: 2,068 lines of Solidity + +--- + +## BUILD INSTRUCTIONS + +Prerequisites: +- Foundry 0.2.0+ +- Node.js 18+ +- Git + +Setup: +```bash +git clone https://github.com/project/repo.git +cd repo +git checkout audit-march-2024 # Frozen branch +forge install +forge build +forge test +``` + +Verification: +✓ Build succeeds without errors +✓ All 127 tests pass +✓ No warnings from compiler + +--- + +## DOCUMENTATION + +Generated Artifacts: +✓ ARCHITECTURE.md - System overview with diagrams +✓ USER_STORIES.md - 12 user interaction 
flows +✓ GLOSSARY.md - 34 domain terms defined +✓ docs/diagrams/contract-interactions.png +✓ docs/diagrams/swap-flow.png +✓ docs/diagrams/state-machine.png + +NatSpec Coverage: 100% of public functions + +--- + +## DEPLOYMENT INFO + +Network: Ethereum Mainnet +Commit: abc123def456 (audit-march-2024 branch) +Deployed Contracts: +- SwapRouter: 0x1234... +- PriceOracle: 0x5678... +[... etc] + +--- + +PACKAGE READY FOR AUDIT ✓ +Next Step: Share with Trail of Bits assessment team +``` + +--- + +## What You'll Get + +**Review Goals Document**: +- Security objectives +- Areas of concern +- Worst-case scenarios +- Questions for auditors + +**Clean Codebase**: +- Triaged static analysis (or clean report) +- High test coverage +- No dead code +- Clear scope + +**Accessibility Package**: +- File list with scope +- Build instructions +- Frozen commit/branch +- Boilerplate identified + +**Documentation Suite**: +- Flowcharts and diagrams +- User stories +- Architecture docs +- Actor/privilege map +- Inline code comments +- Glossary +- Video walkthroughs (if created) + +**Audit Prep Checklist**: +- [ ] Review goals documented +- [ ] Static analysis clean/triaged +- [ ] Test coverage >80% +- [ ] Dead code removed +- [ ] Build instructions verified +- [ ] Stable version frozen +- [ ] Flowcharts created +- [ ] User stories documented +- [ ] Assumptions documented +- [ ] Actors/privileges listed +- [ ] Function docs complete +- [ ] Glossary created + +--- + +## Timeline + +**2 weeks before audit**: +- Set review goals +- Run static analysis +- Start fixing issues + +**1 week before audit**: +- Increase test coverage +- Remove dead code +- Freeze stable version +- Start documentation + +**Few days before audit**: +- Complete documentation +- Verify build instructions +- Create final checklist +- Send package to auditors + +--- + +## Ready to Prep + +Let me know when you're ready and I'll help you prepare for your security review! 
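
As a concrete illustration of Step 2's triage, here is a minimal Python sketch that buckets findings from a Slither JSON report by impact so the easy fixes surface first. The report shape is an assumption modeled on `slither --json` output (`results.detectors` entries with `impact` and `check` fields); verify it against your Slither version before relying on it.

```python
# Hypothetical triage helper (not part of the checklist tooling):
# group Slither detector results by impact level.
import json
from collections import defaultdict

# Shape modeled on `slither --json` output - verify against your version.
sample_report = json.dumps({"results": {"detectors": [
    {"impact": "Low", "check": "naming-convention", "description": "..."},
    {"impact": "Informational", "check": "solc-version", "description": "..."},
    {"impact": "Medium", "check": "reentrancy-benign", "description": "..."},
]}})

def triage(report_json: str) -> dict:
    """Group detector results by impact level."""
    findings = json.loads(report_json)["results"]["detectors"]
    buckets = defaultdict(list)
    for finding in findings:
        buckets[finding["impact"]].append(finding["check"])
    return dict(buckets)

print(triage(sample_report))
```

Fix or document everything in the `Low` and `Informational` buckets before the audit; anything accepted as-is goes into a triage document for the assessment team.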
diff --git a/best-practices/.skillshare-meta.json b/best-practices/.skillshare-meta.json new file mode 100644 index 0000000..1b041cb --- /dev/null +++ b/best-practices/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/better-auth/skills/tree/main/better-auth/best-practices", + "type": "github-subdir", + "installed_at": "2026-01-30T02:26:09.125899442Z", + "repo_url": "https://github.com/better-auth/skills.git", + "subdir": "better-auth/best-practices", + "version": "14c9623" +} \ No newline at end of file diff --git a/best-practices/SKILL.md b/best-practices/SKILL.md new file mode 100644 index 0000000..3458e07 --- /dev/null +++ b/best-practices/SKILL.md @@ -0,0 +1,166 @@ +--- +name: better-auth-best-practices +description: Skill for integrating Better Auth - the comprehensive TypeScript authentication framework. +--- + +# Better Auth Integration Guide + +**Always consult [better-auth.com/docs](https://better-auth.com/docs) for code examples and latest API.** + +Better Auth is a TypeScript-first, framework-agnostic auth framework supporting email/password, OAuth, magic links, passkeys, and more via plugins. + +--- + +## Quick Reference + +### Environment Variables +- `BETTER_AUTH_SECRET` - Encryption secret (min 32 chars). Generate: `openssl rand -base64 32` +- `BETTER_AUTH_URL` - Base URL (e.g., `https://example.com`) + +Only define `baseURL`/`secret` in config if env vars are NOT set. + +### File Location +CLI looks for `auth.ts` in: `./`, `./lib`, `./utils`, or under `./src`. Use `--config` for custom path. 
+ +### CLI Commands +- `npx @better-auth/cli@latest migrate` - Apply schema (built-in adapter) +- `npx @better-auth/cli@latest generate` - Generate schema for Prisma/Drizzle +- `npx @better-auth/cli mcp --cursor` - Add MCP to AI tools + +**Re-run after adding/changing plugins.** + +--- + +## Core Config Options + +| Option | Notes | +|--------|-------| +| `appName` | Optional display name | +| `baseURL` | Only if `BETTER_AUTH_URL` not set | +| `basePath` | Default `/api/auth`. Set `/` for root. | +| `secret` | Only if `BETTER_AUTH_SECRET` not set | +| `database` | Required for most features. See adapters docs. | +| `secondaryStorage` | Redis/KV for sessions & rate limits | +| `emailAndPassword` | `{ enabled: true }` to activate | +| `socialProviders` | `{ google: { clientId, clientSecret }, ... }` | +| `plugins` | Array of plugins | +| `trustedOrigins` | CSRF whitelist | + +--- + +## Database + +**Direct connections:** Pass `pg.Pool`, `mysql2` pool, `better-sqlite3`, or `bun:sqlite` instance. + +**ORM adapters:** Import from `better-auth/adapters/drizzle`, `better-auth/adapters/prisma`, `better-auth/adapters/mongodb`. + +**Critical:** Better Auth uses adapter model names, NOT underlying table names. If Prisma model is `User` mapping to table `users`, use `modelName: "user"` (Prisma reference), not `"users"`. + +--- + +## Session Management + +**Storage priority:** +1. If `secondaryStorage` defined → sessions go there (not DB) +2. Set `session.storeSessionInDatabase: true` to also persist to DB +3. No database + `cookieCache` → fully stateless mode + +**Cookie cache strategies:** +- `compact` (default) - Base64url + HMAC. Smallest. +- `jwt` - Standard JWT. Readable but signed. +- `jwe` - Encrypted. Maximum security. + +**Key options:** `session.expiresIn` (default 7 days), `session.updateAge` (refresh interval), `session.cookieCache.maxAge`, `session.cookieCache.version` (change to invalidate all sessions). 
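
A minimal server config sketch tying the options above together. This is illustrative only - the `pg` pool and the exact option shapes are assumptions based on the tables in this guide; always check better-auth.com/docs for the current API.

```typescript
// Illustrative sketch, not a drop-in config. Assumes BETTER_AUTH_SECRET
// and BETTER_AUTH_URL are set in the environment (so neither appears
// here) and that DATABASE_URL points at Postgres.
import { betterAuth } from "better-auth";
import { Pool } from "pg";

export const auth = betterAuth({
  database: new Pool({ connectionString: process.env.DATABASE_URL }),
  emailAndPassword: { enabled: true },
  session: {
    expiresIn: 60 * 60 * 24 * 7, // the documented 7-day default
    cookieCache: { enabled: true, maxAge: 60 * 5 },
  },
});
```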
+ +--- + +## User & Account Config + +**User:** `user.modelName`, `user.fields` (column mapping), `user.additionalFields`, `user.changeEmail.enabled` (disabled by default), `user.deleteUser.enabled` (disabled by default). + +**Account:** `account.modelName`, `account.accountLinking.enabled`, `account.storeAccountCookie` (for stateless OAuth). + +**Required for registration:** `email` and `name` fields. + +--- + +## Email Flows + +- `emailVerification.sendVerificationEmail` - Must be defined for verification to work +- `emailVerification.sendOnSignUp` / `sendOnSignIn` - Auto-send triggers +- `emailAndPassword.sendResetPassword` - Password reset email handler + +--- + +## Security + +**In `advanced`:** +- `useSecureCookies` - Force HTTPS cookies +- `disableCSRFCheck` - ⚠️ Security risk +- `disableOriginCheck` - ⚠️ Security risk +- `crossSubDomainCookies.enabled` - Share cookies across subdomains +- `ipAddress.ipAddressHeaders` - Custom IP headers for proxies +- `database.generateId` - Custom ID generation or `"serial"`/`"uuid"`/`false` + +**Rate limiting:** `rateLimit.enabled`, `rateLimit.window`, `rateLimit.max`, `rateLimit.storage` ("memory" | "database" | "secondary-storage"). + +--- + +## Hooks + +**Endpoint hooks:** `hooks.before` / `hooks.after` - Array of `{ matcher, handler }`. Use `createAuthMiddleware`. Access `ctx.path`, `ctx.context.returned` (after), `ctx.context.session`. + +**Database hooks:** `databaseHooks.user.create.before/after`, same for `session`, `account`. Useful for adding default values or post-creation actions. + +**Hook context (`ctx.context`):** `session`, `secret`, `authCookies`, `password.hash()`/`verify()`, `adapter`, `internalAdapter`, `generateId()`, `tables`, `baseURL`. + +--- + +## Plugins + +**Import from dedicated paths for tree-shaking:** +``` +import { twoFactor } from "better-auth/plugins/two-factor" +``` +NOT `from "better-auth/plugins"`. 
+ +**Popular plugins:** `twoFactor`, `organization`, `passkey`, `magicLink`, `emailOtp`, `username`, `phoneNumber`, `admin`, `apiKey`, `bearer`, `jwt`, `multiSession`, `sso`, `oauthProvider`, `oidcProvider`, `openAPI`, `genericOAuth`. + +Client plugins go in `createAuthClient({ plugins: [...] })`. + +--- + +## Client + +Import from: `better-auth/client` (vanilla), `better-auth/react`, `better-auth/vue`, `better-auth/svelte`, `better-auth/solid`. + +Key methods: `signUp.email()`, `signIn.email()`, `signIn.social()`, `signOut()`, `useSession()`, `getSession()`, `revokeSession()`, `revokeSessions()`. + +--- + +## Type Safety + +Infer types: `typeof auth.$Infer.Session`, `typeof auth.$Infer.Session.user`. + +For separate client/server projects: `createAuthClient()`. + +--- + +## Common Gotchas + +1. **Model vs table name** - Config uses ORM model name, not DB table name +2. **Plugin schema** - Re-run CLI after adding plugins +3. **Secondary storage** - Sessions go there by default, not DB +4. **Cookie cache** - Custom session fields NOT cached, always re-fetched +5. **Stateless mode** - No DB = session in cookie only, logout on cache expiry +6. 
**Change email flow** - Sends to current email first, then new email + +--- + +## Resources + +- [Docs](https://better-auth.com/docs) +- [Options Reference](https://better-auth.com/docs/reference/options) +- [LLMs.txt](https://better-auth.com/llms.txt) +- [GitHub](https://github.com/better-auth/better-auth) +- [Init Options Source](https://github.com/better-auth/better-auth/blob/main/packages/core/src/types/init-options.ts) \ No newline at end of file diff --git a/brainstorming/.skillshare-meta.json b/brainstorming/.skillshare-meta.json new file mode 100644 index 0000000..4611f6b --- /dev/null +++ b/brainstorming/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/obra/superpowers/tree/main/skills/brainstorming", + "type": "github-subdir", + "installed_at": "2026-01-30T02:19:55.278306836Z", + "repo_url": "https://github.com/obra/superpowers.git", + "subdir": "skills/brainstorming", + "version": "469a6d8" +} \ No newline at end of file diff --git a/brainstorming/SKILL.md b/brainstorming/SKILL.md new file mode 100644 index 0000000..2fd19ba --- /dev/null +++ b/brainstorming/SKILL.md @@ -0,0 +1,54 @@ +--- +name: brainstorming +description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation." +--- + +# Brainstorming Ideas Into Designs + +## Overview + +Help turn ideas into fully formed designs and specs through natural collaborative dialogue. + +Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design in small sections (200-300 words), checking after each section whether it looks right so far. 
+ +## The Process + +**Understanding the idea:** +- Check out the current project state first (files, docs, recent commits) +- Ask questions one at a time to refine the idea +- Prefer multiple choice questions when possible, but open-ended is fine too +- Only one question per message - if a topic needs more exploration, break it into multiple questions +- Focus on understanding: purpose, constraints, success criteria + +**Exploring approaches:** +- Propose 2-3 different approaches with trade-offs +- Present options conversationally with your recommendation and reasoning +- Lead with your recommended option and explain why + +**Presenting the design:** +- Once you believe you understand what you're building, present the design +- Break it into sections of 200-300 words +- Ask after each section whether it looks right so far +- Cover: architecture, components, data flow, error handling, testing +- Be ready to go back and clarify if something doesn't make sense + +## After the Design + +**Documentation:** +- Write the validated design to `docs/plans/YYYY-MM-DD--design.md` +- Use elements-of-style:writing-clearly-and-concisely skill if available +- Commit the design document to git + +**Implementation (if continuing):** +- Ask: "Ready to set up for implementation?" 
+- Use superpowers:using-git-worktrees to create isolated workspace +- Use superpowers:writing-plans to create detailed implementation plan + +## Key Principles + +- **One question at a time** - Don't overwhelm with multiple questions +- **Multiple choice preferred** - Easier to answer than open-ended when possible +- **YAGNI ruthlessly** - Remove unnecessary features from all designs +- **Explore alternatives** - Always propose 2-3 approaches before settling +- **Incremental validation** - Present design in sections, validate each +- **Be flexible** - Go back and clarify when something doesn't make sense diff --git a/brand-guidelines/.skillshare-meta.json b/brand-guidelines/.skillshare-meta.json new file mode 100644 index 0000000..2b1ec50 --- /dev/null +++ b/brand-guidelines/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/anthropics/skills/tree/main/skills/brand-guidelines", + "type": "github-subdir", + "installed_at": "2026-01-30T02:17:44.846548934Z", + "repo_url": "https://github.com/anthropics/skills.git", + "subdir": "skills/brand-guidelines", + "version": "69c0b1a" +} \ No newline at end of file diff --git a/brand-guidelines/LICENSE.txt b/brand-guidelines/LICENSE.txt new file mode 100644 index 0000000..7a4a3ea --- /dev/null +++ b/brand-guidelines/LICENSE.txt @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. \ No newline at end of file diff --git a/brand-guidelines/SKILL.md b/brand-guidelines/SKILL.md new file mode 100644 index 0000000..47c72c6 --- /dev/null +++ b/brand-guidelines/SKILL.md @@ -0,0 +1,73 @@ +--- +name: brand-guidelines +description: Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply. +license: Complete terms in LICENSE.txt +--- + +# Anthropic Brand Styling + +## Overview + +To access Anthropic's official brand identity and style resources, use this skill. 
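
As an illustration of the smart color selection this skill describes, here is a hypothetical helper (not part of the skill itself) that picks the dark or light brand color for text based on a background's relative luminance:

```python
# Hypothetical helper: choose dark or light brand text color based on
# the background's relative luminance (Rec. 709 weights).
DARK, LIGHT = "#141413", "#faf9f5"

def luminance(hex_color: str) -> float:
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def text_color_for(background: str) -> str:
    # Light backgrounds get dark text and vice versa.
    return DARK if luminance(background) > 0.5 else LIGHT

print(text_color_for("#e8e6dc"))  # light gray background -> dark text
```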
+ +**Keywords**: branding, corporate identity, visual identity, post-processing, styling, brand colors, typography, Anthropic brand, visual formatting, visual design + +## Brand Guidelines + +### Colors + +**Main Colors:** + +- Dark: `#141413` - Primary text and dark backgrounds +- Light: `#faf9f5` - Light backgrounds and text on dark +- Mid Gray: `#b0aea5` - Secondary elements +- Light Gray: `#e8e6dc` - Subtle backgrounds + +**Accent Colors:** + +- Orange: `#d97757` - Primary accent +- Blue: `#6a9bcc` - Secondary accent +- Green: `#788c5d` - Tertiary accent + +### Typography + +- **Headings**: Poppins (with Arial fallback) +- **Body Text**: Lora (with Georgia fallback) +- **Note**: Fonts should be pre-installed in your environment for best results + +## Features + +### Smart Font Application + +- Applies Poppins font to headings (24pt and larger) +- Applies Lora font to body text +- Automatically falls back to Arial/Georgia if custom fonts unavailable +- Preserves readability across all systems + +### Text Styling + +- Headings (24pt+): Poppins font +- Body text: Lora font +- Smart color selection based on background +- Preserves text hierarchy and formatting + +### Shape and Accent Colors + +- Non-text shapes use accent colors +- Cycles through orange, blue, and green accents +- Maintains visual interest while staying on-brand + +## Technical Details + +### Font Management + +- Uses system-installed Poppins and Lora fonts when available +- Provides automatic fallback to Arial (headings) and Georgia (body) +- No font installation required - works with existing system fonts +- For best results, pre-install Poppins and Lora fonts in your environment + +### Color Application + +- Uses RGB color values for precise brand matching +- Applied via python-pptx's RGBColor class +- Maintains color fidelity across different systems diff --git a/building-ai-agent-on-cloudflare/.skillshare-meta.json b/building-ai-agent-on-cloudflare/.skillshare-meta.json new file mode 100644 
index 0000000..1ce0c67 --- /dev/null +++ b/building-ai-agent-on-cloudflare/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/cloudflare/skills/tree/main/skills/building-ai-agent-on-cloudflare", + "type": "github-subdir", + "installed_at": "2026-01-30T02:30:20.91375431Z", + "repo_url": "https://github.com/cloudflare/skills.git", + "subdir": "skills/building-ai-agent-on-cloudflare", + "version": "75a603b" +} \ No newline at end of file diff --git a/building-ai-agent-on-cloudflare/SKILL.md b/building-ai-agent-on-cloudflare/SKILL.md new file mode 100644 index 0000000..c47f423 --- /dev/null +++ b/building-ai-agent-on-cloudflare/SKILL.md @@ -0,0 +1,391 @@ +--- +name: building-ai-agent-on-cloudflare +description: | + Builds AI agents on Cloudflare using the Agents SDK with state management, + real-time WebSockets, scheduled tasks, tool integration, and chat capabilities. + Generates production-ready agent code deployed to Workers. + + Use when: user wants to "build an agent", "AI agent", "chat agent", "stateful + agent", mentions "Agents SDK", needs "real-time AI", "WebSocket AI", or asks + about agent "state management", "scheduled tasks", or "tool calling". +--- + +# Building Cloudflare Agents + +Creates AI-powered agents using Cloudflare's Agents SDK with persistent state, real-time communication, and tool integration. + +## When to Use + +- User wants to build an AI agent or chatbot +- User needs stateful, real-time AI interactions +- User asks about the Cloudflare Agents SDK +- User wants scheduled tasks or background AI work +- User needs WebSocket-based AI communication + +## Prerequisites + +- Cloudflare account with Workers enabled +- Node.js 18+ and npm/pnpm/yarn +- Wrangler CLI (`npm install -g wrangler`) + +## Quick Start + +```bash +npm create cloudflare@latest -- my-agent --template=cloudflare/agents-starter +cd my-agent +npm start +``` + +Agent runs at `http://localhost:8787` + +## Core Concepts + +### What is an Agent? 
+
+An Agent is a stateful, persistent AI service that:
+- Maintains state across requests and reconnections
+- Communicates via WebSockets or HTTP
+- Runs on Cloudflare's edge via Durable Objects
+- Can schedule tasks and call tools
+- Scales horizontally (each user/session gets own instance)
+
+### Agent Lifecycle
+
+```
+Client connects → Agent.onConnect() → Agent processes messages
+                                      → Agent.onMessage()
+                                      → Agent.setState() (persists + syncs)
+Client disconnects → State persists → Client reconnects → State restored
+```
+
+## Basic Agent Structure
+
+```typescript
+import { Agent, Connection } from "agents";
+
+interface Env {
+  AI: Ai; // Workers AI binding
+}
+
+interface State {
+  messages: Array<{ role: string; content: string }>;
+  preferences: Record<string, unknown>;
+}
+
+export class MyAgent extends Agent<Env, State> {
+  // Initial state for new instances
+  initialState: State = {
+    messages: [],
+    preferences: {},
+  };
+
+  // Called when agent starts or resumes
+  async onStart() {
+    console.log("Agent started with state:", this.state);
+  }
+
+  // Handle WebSocket connections
+  async onConnect(connection: Connection) {
+    connection.send(JSON.stringify({
+      type: "welcome",
+      history: this.state.messages,
+    }));
+  }
+
+  // Handle incoming messages
+  async onMessage(connection: Connection, message: string) {
+    const data = JSON.parse(message);
+
+    if (data.type === "chat") {
+      await this.handleChat(connection, data.content);
+    }
+  }
+
+  // Handle disconnections
+  async onClose(connection: Connection) {
+    console.log("Client disconnected");
+  }
+
+  // React to state changes
+  onStateUpdate(state: State, source: string) {
+    console.log("State updated by:", source);
+  }
+
+  private async handleChat(connection: Connection, userMessage: string) {
+    // Add user message to history
+    const messages = [
+      ...this.state.messages,
+      { role: "user", content: userMessage },
+    ];
+
+    // Call AI
+    const response = await this.env.AI.run("@cf/meta/llama-3-8b-instruct", {
+      messages,
+    });
+
+    //
Update state (persists and syncs to all clients) + this.setState({ + ...this.state, + messages: [ + ...messages, + { role: "assistant", content: response.response }, + ], + }); + + // Send response + connection.send(JSON.stringify({ + type: "response", + content: response.response, + })); + } +} +``` + +## Entry Point Configuration + +```typescript +// src/index.ts +import { routeAgentRequest } from "agents"; +import { MyAgent } from "./agent"; + +export default { + async fetch(request: Request, env: Env) { + // routeAgentRequest handles routing to /agents/:class/:name + return ( + (await routeAgentRequest(request, env)) || + new Response("Not found", { status: 404 }) + ); + }, +}; + +export { MyAgent }; +``` + +Clients connect via: `wss://my-agent.workers.dev/agents/MyAgent/session-id` + +## Wrangler Configuration + +```toml +name = "my-agent" +main = "src/index.ts" +compatibility_date = "2024-12-01" + +[ai] +binding = "AI" + +[durable_objects] +bindings = [{ name = "AGENT", class_name = "MyAgent" }] + +[[migrations]] +tag = "v1" +new_classes = ["MyAgent"] +``` + +## State Management + +### Reading State + +```typescript +// Current state is always available +const currentMessages = this.state.messages; +const userPrefs = this.state.preferences; +``` + +### Updating State + +```typescript +// setState persists AND syncs to all connected clients +this.setState({ + ...this.state, + messages: [...this.state.messages, newMessage], +}); + +// Partial updates work too +this.setState({ + preferences: { ...this.state.preferences, theme: "dark" }, +}); +``` + +### SQL Storage + +For complex queries, use the embedded SQLite database: + +```typescript +// Create tables +await this.sql` + CREATE TABLE IF NOT EXISTS documents ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + title TEXT NOT NULL, + content TEXT, + created_at DATETIME DEFAULT CURRENT_TIMESTAMP + ) +`; + +// Insert +await this.sql` + INSERT INTO documents (title, content) + VALUES (${title}, ${content}) +`; + +// Query 
+const docs = await this.sql` + SELECT * FROM documents WHERE title LIKE ${`%${search}%`} +`; +``` + +## Scheduled Tasks + +Agents can schedule future work: + +```typescript +async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "schedule_reminder") { + // Schedule task for 1 hour from now + const { id } = await this.schedule(3600, "sendReminder", { + message: data.reminderText, + userId: data.userId, + }); + + connection.send(JSON.stringify({ type: "scheduled", taskId: id })); + } +} + +// Called when scheduled task fires +async sendReminder(data: { message: string; userId: string }) { + // Send notification, email, etc. + console.log(`Reminder for ${data.userId}: ${data.message}`); + + // Can also update state + this.setState({ + ...this.state, + lastReminder: new Date().toISOString(), + }); +} +``` + +### Schedule Options + +```typescript +// Delay in seconds +await this.schedule(60, "taskMethod", { data }); + +// Specific date +await this.schedule(new Date("2025-01-01T00:00:00Z"), "taskMethod", { data }); + +// Cron expression (recurring) +await this.schedule("0 9 * * *", "dailyTask", {}); // 9 AM daily +await this.schedule("*/5 * * * *", "everyFiveMinutes", {}); // Every 5 min + +// Manage schedules +const schedules = await this.getSchedules(); +await this.cancelSchedule(taskId); +``` + +## Chat Agent (AI-Powered) + +For chat-focused agents, extend `AIChatAgent`: + +```typescript +import { AIChatAgent } from "agents/ai-chat-agent"; + +export class ChatBot extends AIChatAgent { + // Called for each user message + async onChatMessage(message: string) { + const response = await this.env.AI.run("@cf/meta/llama-3-8b-instruct", { + messages: [ + { role: "system", content: "You are a helpful assistant." 
}, + ...this.messages, // Automatic history management + { role: "user", content: message }, + ], + stream: true, + }); + + // Stream response back to client + return response; + } +} +``` + +Features included: +- Automatic message history +- Resumable streaming (survives disconnects) +- Built-in `saveMessages()` for persistence + +## Client Integration + +### React Hook + +```tsx +import { useAgent } from "agents/react"; + +function Chat() { + const { state, send, connected } = useAgent({ + agent: "my-agent", + name: userId, // Agent instance ID + }); + + const sendMessage = (text: string) => { + send(JSON.stringify({ type: "chat", content: text })); + }; + + return ( +
+      <div>
+        {state.messages.map((msg, i) => (
+          <div key={i}>
+            {msg.role}: {msg.content}
+          </div>
+        ))}
+        <input
+          onKeyDown={(e) => e.key === "Enter" && sendMessage(e.currentTarget.value)}
+        />
+      </div>
+ ); +} +``` + +### Vanilla JavaScript + +```javascript +const ws = new WebSocket("wss://my-agent.workers.dev/agents/MyAgent/user123"); + +ws.onopen = () => { + console.log("Connected to agent"); +}; + +ws.onmessage = (event) => { + const data = JSON.parse(event.data); + console.log("Received:", data); +}; + +ws.send(JSON.stringify({ type: "chat", content: "Hello!" })); +``` + +## Common Patterns + +See [references/agent-patterns.md](references/agent-patterns.md) for: +- Tool calling and function execution +- Multi-agent orchestration +- RAG (Retrieval Augmented Generation) +- Human-in-the-loop workflows + +## Deployment + +```bash +# Deploy +npx wrangler deploy + +# View logs +wrangler tail + +# Test endpoint +curl https://my-agent.workers.dev/agents/MyAgent/test-user +``` + +## Troubleshooting + +See [references/troubleshooting.md](references/troubleshooting.md) for common issues. + +## References + +- [references/examples.md](references/examples.md) — Official templates and production examples +- [references/agent-patterns.md](references/agent-patterns.md) — Advanced patterns +- [references/state-patterns.md](references/state-patterns.md) — State management strategies +- [references/troubleshooting.md](references/troubleshooting.md) — Error solutions diff --git a/building-ai-agent-on-cloudflare/references/agent-patterns.md b/building-ai-agent-on-cloudflare/references/agent-patterns.md new file mode 100644 index 0000000..219e825 --- /dev/null +++ b/building-ai-agent-on-cloudflare/references/agent-patterns.md @@ -0,0 +1,461 @@ +# Agent Patterns + +Advanced patterns for building sophisticated agents. 
+ +## Tool Calling + +Agents can expose tools that AI models can call: + +```typescript +import { Agent, Connection } from "agents"; +import { z } from "zod"; + +interface Tool { + name: string; + description: string; + parameters: z.ZodSchema; + handler: (params: any) => Promise; +} + +export class ToolAgent extends Agent { + private tools: Map = new Map(); + + async onStart() { + // Register tools + this.registerTool({ + name: "get_weather", + description: "Get current weather for a city", + parameters: z.object({ city: z.string() }), + handler: async ({ city }) => { + const res = await fetch(`https://api.weather.com/${city}`); + return JSON.stringify(await res.json()); + }, + }); + + this.registerTool({ + name: "search_database", + description: "Search the document database", + parameters: z.object({ query: z.string(), limit: z.number().default(10) }), + handler: async ({ query, limit }) => { + const results = await this.sql` + SELECT * FROM documents + WHERE content LIKE ${`%${query}%`} + LIMIT ${limit} + `; + return JSON.stringify(results); + }, + }); + } + + private registerTool(tool: Tool) { + this.tools.set(tool.name, tool); + } + + async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "chat") { + await this.handleChatWithTools(connection, data.content); + } + } + + private async handleChatWithTools(connection: Connection, userMessage: string) { + // Build tool descriptions for the AI + const toolDescriptions = Array.from(this.tools.values()).map((t) => ({ + type: "function", + function: { + name: t.name, + description: t.description, + parameters: JSON.parse(JSON.stringify(t.parameters)), + }, + })); + + // First AI call - may request tool use + const response = await this.env.AI.run("@cf/meta/llama-3-8b-instruct", { + messages: [ + { role: "system", content: "You are a helpful assistant with access to tools." 
}, + ...this.state.messages, + { role: "user", content: userMessage }, + ], + tools: toolDescriptions, + }); + + // Check if AI wants to use a tool + if (response.tool_calls) { + for (const toolCall of response.tool_calls) { + const tool = this.tools.get(toolCall.function.name); + if (tool) { + const params = JSON.parse(toolCall.function.arguments); + const result = await tool.handler(params); + + // Send tool result back to AI + const finalResponse = await this.env.AI.run("@cf/meta/llama-3-8b-instruct", { + messages: [ + ...this.state.messages, + { role: "user", content: userMessage }, + { role: "assistant", tool_calls: response.tool_calls }, + { role: "tool", tool_call_id: toolCall.id, content: result }, + ], + }); + + connection.send(JSON.stringify({ + type: "response", + content: finalResponse.response, + toolUsed: toolCall.function.name, + })); + } + } + } else { + connection.send(JSON.stringify({ + type: "response", + content: response.response, + })); + } + } +} +``` + +## RAG (Retrieval Augmented Generation) + +Combine Vectorize with Agents for knowledge-grounded responses: + +```typescript +interface Env { + AI: Ai; + VECTORIZE: VectorizeIndex; +} + +export class RAGAgent extends Agent { + async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "chat") { + // 1. Generate embedding for query + const embedding = await this.env.AI.run("@cf/baai/bge-base-en-v1.5", { + text: data.content, + }); + + // 2. Search vector database + const results = await this.env.VECTORIZE.query(embedding.data[0], { + topK: 5, + returnMetadata: true, + }); + + // 3. Build context from results + const context = results.matches + .map((m) => m.metadata?.text || "") + .join("\n\n"); + + // 4. 
Generate response with context + const response = await this.env.AI.run("@cf/meta/llama-3-8b-instruct", { + messages: [ + { + role: "system", + content: `Answer based on this context:\n\n${context}\n\nIf the context doesn't contain relevant information, say so.`, + }, + { role: "user", content: data.content }, + ], + }); + + // 5. Update state and respond + this.setState({ + messages: [ + ...this.state.messages, + { role: "user", content: data.content }, + { role: "assistant", content: response.response }, + ], + }); + + connection.send(JSON.stringify({ + type: "response", + content: response.response, + sources: results.matches.map((m) => m.metadata?.source), + })); + } + } + + // Ingest documents into vector store + async ingestDocument(doc: { id: string; text: string; source: string }) { + const embedding = await this.env.AI.run("@cf/baai/bge-base-en-v1.5", { + text: doc.text, + }); + + await this.env.VECTORIZE.upsert([{ + id: doc.id, + values: embedding.data[0], + metadata: { text: doc.text, source: doc.source }, + }]); + } +} +``` + +## Multi-Agent Orchestration + +Coordinate multiple specialized agents: + +```typescript +interface Env { + RESEARCHER: DurableObjectNamespace; + WRITER: DurableObjectNamespace; + REVIEWER: DurableObjectNamespace; +} + +export class OrchestratorAgent extends Agent { + async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "create_article") { + connection.send(JSON.stringify({ type: "status", step: "researching" })); + + // Step 1: Research agent gathers information + const researchResult = await this.callAgent( + this.env.RESEARCHER, + data.topic, + { action: "research", topic: data.topic } + ); + + connection.send(JSON.stringify({ type: "status", step: "writing" })); + + // Step 2: Writer agent creates draft + const draftResult = await this.callAgent( + this.env.WRITER, + data.topic, + { action: "write", research: researchResult, topic: data.topic } + ); + + 
connection.send(JSON.stringify({ type: "status", step: "reviewing" })); + + // Step 3: Reviewer agent improves draft + const finalResult = await this.callAgent( + this.env.REVIEWER, + data.topic, + { action: "review", draft: draftResult } + ); + + connection.send(JSON.stringify({ + type: "complete", + article: finalResult, + })); + } + } + + private async callAgent( + namespace: DurableObjectNamespace, + id: string, + payload: any + ): Promise { + const agentId = namespace.idFromName(id); + const agent = namespace.get(agentId); + + const response = await agent.fetch("http://agent/task", { + method: "POST", + body: JSON.stringify(payload), + }); + + return response.text(); + } +} +``` + +## Human-in-the-Loop + +Pause agent execution for human approval: + +```typescript +interface State { + pendingApprovals: Array<{ + id: string; + action: string; + data: any; + requestedAt: string; + }>; +} + +export class ApprovalAgent extends Agent { + initialState: State = { pendingApprovals: [] }; + + async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "request_action") { + // Action requires approval + if (this.requiresApproval(data.action)) { + const approvalId = crypto.randomUUID(); + + this.setState({ + pendingApprovals: [ + ...this.state.pendingApprovals, + { + id: approvalId, + action: data.action, + data: data.payload, + requestedAt: new Date().toISOString(), + }, + ], + }); + + connection.send(JSON.stringify({ + type: "approval_required", + approvalId, + action: data.action, + description: this.describeAction(data.action, data.payload), + })); + + return; + } + + // Execute immediately if no approval needed + await this.executeAction(connection, data.action, data.payload); + } + + if (data.type === "approve") { + const approval = this.state.pendingApprovals.find( + (a) => a.id === data.approvalId + ); + + if (approval) { + // Remove from pending + this.setState({ + pendingApprovals: 
this.state.pendingApprovals.filter( + (a) => a.id !== data.approvalId + ), + }); + + // Execute the approved action + await this.executeAction(connection, approval.action, approval.data); + } + } + + if (data.type === "reject") { + this.setState({ + pendingApprovals: this.state.pendingApprovals.filter( + (a) => a.id !== data.approvalId + ), + }); + + connection.send(JSON.stringify({ + type: "action_rejected", + approvalId: data.approvalId, + })); + } + } + + private requiresApproval(action: string): boolean { + const sensitiveActions = ["delete", "send_email", "make_payment", "publish"]; + return sensitiveActions.includes(action); + } + + private describeAction(action: string, data: any): string { + // Generate human-readable description + return `${action}: ${JSON.stringify(data)}`; + } + + private async executeAction(connection: Connection, action: string, data: any) { + // Execute the action + const result = await this.performAction(action, data); + + connection.send(JSON.stringify({ + type: "action_completed", + action, + result, + })); + } +} +``` + +## Streaming Responses + +Stream AI responses in real-time: + +```typescript +export class StreamingAgent extends Agent { + async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "chat") { + // Start streaming response + const stream = await this.env.AI.run("@cf/meta/llama-3-8b-instruct", { + messages: [ + { role: "system", content: "You are a helpful assistant." 
}, + ...this.state.messages, + { role: "user", content: data.content }, + ], + stream: true, + }); + + let fullResponse = ""; + + // Stream chunks to client + for await (const chunk of stream) { + if (chunk.response) { + fullResponse += chunk.response; + connection.send(JSON.stringify({ + type: "stream", + content: chunk.response, + done: false, + })); + } + } + + // Update state with complete response + this.setState({ + messages: [ + ...this.state.messages, + { role: "user", content: data.content }, + { role: "assistant", content: fullResponse }, + ], + }); + + // Signal completion + connection.send(JSON.stringify({ + type: "stream", + content: "", + done: true, + })); + } + } +} +``` + +## Connecting to MCP Servers + +Agents can connect to MCP servers as clients: + +```typescript +export class MCPClientAgent extends Agent { + async onStart() { + // Connect to external MCP server + await this.addMcpServer( + "github", + "https://github-mcp.example.com/sse", + { headers: { Authorization: `Bearer ${this.env.GITHUB_TOKEN}` } } + ); + + await this.addMcpServer( + "database", + "https://db-mcp.example.com/sse" + ); + } + + async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "use_tool") { + // Call tool on connected MCP server + const servers = await this.getMcpServers(); + const server = servers.find((s) => s.name === data.server); + + if (server) { + const result = await server.callTool(data.tool, data.params); + connection.send(JSON.stringify({ type: "tool_result", result })); + } + } + } + + async onClose() { + // Cleanup MCP connections + await this.removeMcpServer("github"); + await this.removeMcpServer("database"); + } +} +``` diff --git a/building-ai-agent-on-cloudflare/references/examples.md b/building-ai-agent-on-cloudflare/references/examples.md new file mode 100644 index 0000000..7efc0f8 --- /dev/null +++ b/building-ai-agent-on-cloudflare/references/examples.md @@ -0,0 +1,188 @@ +# Project 
Bootstrapping + +Instructions for creating new agent projects. + +--- + +## Create Command + +Execute in terminal to generate a new project: + +```bash +npm create cloudflare@latest -- my-agent \ + --template=cloudflare/agents-starter +``` + +Or use npx directly: + +```bash +npx create-cloudflare@latest --template cloudflare/agents-starter +``` + +Includes: +- Persistent data via `this.setState` and `this.sql` +- WebSocket real-time connections +- Workers AI bindings ready +- React chat interface example + +--- + +## Project Layout + +Generated structure: + +``` +my-agent/ +├── src/ +│ ├── app.tsx # React chat interface +│ ├── server.ts # Agent implementation +│ ├── tools.ts # Tool definitions +│ └── utils.ts # Helpers +├── wrangler.toml # Platform configuration +└── package.json +``` + +--- + +## Agent Variations + +**Chat-focused:** + +Inherit from base `Agent` class, implement `onMessage` handler: +- Manual conversation tracking +- Full control over responses +- Integrates with any AI provider + +**Persistent data:** + +Use `this.setState()` for automatic persistence: +- JSON-serializable data +- Auto-syncs to connected clients +- Survives instance eviction + +**Per-session isolation:** + +Route by unique identifier in URL path: +- Each identifier gets dedicated instance +- Isolated data storage +- Horizontal scaling automatic + +--- + +## Platform Documentation + +- developers.cloudflare.com/agents/ +- developers.cloudflare.com/agents/getting-started/ +- developers.cloudflare.com/agents/api-reference/ + +**Source repositories:** +- `github.com/cloudflare/agents-starter` (starter template) +- `github.com/cloudflare/agents/tree/main/examples` (reference implementations) + +**Related services:** + +- developers.cloudflare.com/workers-ai/ (AI models) +- developers.cloudflare.com/vectorize/ (vector search) +- developers.cloudflare.com/d1/ (SQL database) + +--- + +## Reference Implementations + +Located at `github.com/cloudflare/agents/tree/main/examples`: + +| 
Example | Description | +|---------|-------------| +| `resumable-stream-chat` | Chat with reconnection-safe streaming | +| `email-agent` | Handle incoming emails via Email Routing | +| `mcp-client` | Connect agents to external MCP servers | +| `mcp-worker` | Expose agent capabilities via MCP protocol | +| `cross-domain` | Multi-domain authentication patterns | +| `tictactoe` | Multiplayer game with shared state | +| `a2a` | Agent-to-agent communication | +| `codemode` | Code transformation workflows | +| `playground` | Interactive testing sandbox | + +Browse each folder for complete implementation code and wrangler configuration. + +--- + +## Selection Matrix + +| Goal | Approach | +|------|----------| +| Conversational bot | Agent + onMessage handler | +| Custom data schema | Agent + setState() | +| Knowledge retrieval | Agent + Vectorize | +| Background jobs | Agent + schedule() | +| External integrations | Agent + tool definitions | + +--- + +## Commands Reference + +**Local execution:** + +```bash +cd my-agent +npm install +npm start +# Accessible at http://localhost:8787 +``` + +**Production push:** + +```bash +npx wrangler deploy +# Accessible at https://[name].[subdomain].workers.dev +``` + +**WebSocket connection:** + +```javascript +// URL pattern: /agents/:className/:instanceName +const socket = new WebSocket("wss://my-agent.workers.dev/agents/MyAgent/session-123"); + +socket.onmessage = (e) => { + console.log("Received:", JSON.parse(e.data)); +}; + +socket.send(JSON.stringify({ type: "chat", content: "Hello" })); +``` + +**React integration:** + +```tsx +import { useAgent } from "agents/react"; + +function Chat() { + const { state, send } = useAgent({ + agent: "my-agent", + name: "session-123", + }); + + // state auto-updates, send() dispatches messages +} +``` + +--- + +## Key Methods (from Agent class) + +| Method | Purpose | +|--------|---------| +| `onStart()` | Runs on instance startup | +| `onConnect()` | Handles new WebSocket connections | +| 
`onMessage()` | Processes incoming messages | +| `onClose()` | Cleanup on disconnect | +| `setState()` | Persist and broadcast data | +| `this.sql` | Query embedded SQLite | +| `schedule()` | Delayed/recurring tasks | +| `broadcast()` | Message all connections | + +--- + +## Help Channels + +- Cloudflare Discord +- GitHub discussions on cloudflare/agents repository diff --git a/building-ai-agent-on-cloudflare/references/state-patterns.md b/building-ai-agent-on-cloudflare/references/state-patterns.md new file mode 100644 index 0000000..889f755 --- /dev/null +++ b/building-ai-agent-on-cloudflare/references/state-patterns.md @@ -0,0 +1,360 @@ +# State Management Patterns + +Strategies for managing state in Cloudflare Agents. + +## How State Works + +State is automatically persisted to the `cf_agents_state` SQL table. The `this.state` getter lazily loads from storage, while `this.setState()` serializes and persists changes. State survives Durable Object evictions. + +```typescript +class MyAgent extends Agent { + initialState = { count: 0 }; + + increment() { + this.setState({ count: this.state.count + 1 }); + } + + onStateUpdate(state: State, source: string) { + console.log("State updated by:", source); + } +} +``` + +## State vs SQL: When to Use Which + +### Use `this.state` + `setState()` When: + +- Data is small (< 1MB recommended) +- Needs real-time sync to all connected clients +- Simple key-value or object structure +- Frequently read, occasionally updated + +```typescript +interface State { + currentUser: { id: string; name: string }; + preferences: Record; + recentMessages: Message[]; // Keep limited, e.g., last 50 + isTyping: boolean; +} +``` + +### Use `this.sql` When: + +- Large datasets (many records) +- Complex queries (JOINs, aggregations, filtering) +- Historical data / audit logs +- Data that doesn't need real-time sync + +```typescript +// Good for SQL +// - Full message history +// - User documents +// - Analytics events +// - Search indexes +``` + 
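The "small state" guideline above (< 1MB) is easy to check before reaching for SQL. A minimal sketch — `stateSizeBytes` is our own illustrative helper, not part of the Agents SDK:

```typescript
// State is persisted and synced as JSON, so serialized JSON length is
// the relevant measure of size. `stateSizeBytes` is an illustrative
// helper, not an Agents SDK API.
function stateSizeBytes(state: unknown): number {
  return new TextEncoder().encode(JSON.stringify(state)).length;
}

// A lean state object stays far below the ~1MB guideline:
const size = stateSizeBytes({ preferences: { theme: "dark" }, recentMessages: [] });
```

Logging the result inside `onStateUpdate()` is a cheap way to notice state drifting toward SQL-sized territory.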
+## Hybrid Pattern + +Combine both for optimal performance: + +```typescript +interface State { + recentMessages: Message[]; + onlineUsers: string[]; + currentDocument: Document | null; +} + +export class HybridAgent extends Agent { + initialState: State = { + recentMessages: [], + onlineUsers: [], + currentDocument: null, + }; + + async onStart() { + await this.sql` + CREATE TABLE IF NOT EXISTS messages ( + id TEXT PRIMARY KEY, + user_id TEXT NOT NULL, + content TEXT NOT NULL, + created_at DATETIME DEFAULT CURRENT_TIMESTAMP + ) + `; + + const recent = await this.sql` + SELECT * FROM messages + ORDER BY created_at DESC + LIMIT 50 + `; + + this.setState({ + ...this.state, + recentMessages: recent.reverse(), + }); + } + + async addMessage(message: Message) { + await this.sql` + INSERT INTO messages (id, user_id, content) + VALUES (${message.id}, ${message.userId}, ${message.content}) + `; + + const recentMessages = [...this.state.recentMessages, message].slice(-50); + this.setState({ ...this.state, recentMessages }); + } +} +``` + +--- + +## Queue System + +The SDK includes a built-in queue for background task processing. Tasks are stored in SQLite and processed in FIFO order. 
+ +### Queue Methods + +| Method | Purpose | +|--------|---------| +| `queue(callback, payload)` | Add task, returns task ID | +| `dequeue(id)` | Remove specific task | +| `dequeueAll()` | Clear entire queue | +| `dequeueAllByCallback(name)` | Remove tasks by callback name | +| `getQueue(id)` | Get single task | +| `getQueues(key, value)` | Find tasks by payload field | + +### Queue Example + +```typescript +export class TaskAgent extends Agent { + async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "process_later") { + const taskId = await this.queue("processItem", { + itemId: data.itemId, + priority: data.priority, + }); + + connection.send(JSON.stringify({ queued: true, taskId })); + } + } + + // Callback receives payload and QueueItem metadata + async processItem(payload: { itemId: string }, item: QueueItem) { + console.log(`Processing ${payload.itemId}, queued at ${item.createdAt}`); + // Successfully executed tasks are auto-removed + } +} +``` + +**Queue characteristics:** +- Sequential processing (no parallelization) +- Persists across agent restarts +- No built-in retry mechanism +- Payloads must be JSON-serializable + +--- + +## Context Management + +Custom methods automatically have full agent context. Use `getCurrentAgent()` to access context from external functions. 
+ +```typescript +import { getCurrentAgent } from "agents"; + +// External utility function +async function logActivity(action: string) { + const { agent } = getCurrentAgent(); + await agent.sql` + INSERT INTO activity_log (action, timestamp) + VALUES (${action}, ${Date.now()}) + `; +} + +export class MyAgent extends Agent { + async performAction() { + // Context automatically available + await logActivity("action_performed"); + } +} +``` + +`getCurrentAgent()` returns: +- `agent` - The current agent instance +- `connection` - Connection object (if applicable) +- `request` - Request object (if applicable) + +--- + +## State Synchronization + +### Optimistic Updates + +Update UI immediately, then persist: + +```typescript +async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "update_preference") { + this.setState({ + ...this.state, + preferences: { + ...this.state.preferences, + [data.key]: data.value, + }, + }); + + await this.sql` + INSERT OR REPLACE INTO preferences (key, value) + VALUES (${data.key}, ${data.value}) + `; + } +} +``` + +### Conflict Resolution + +Handle concurrent updates with versioning: + +```typescript +interface State { + document: { + content: string; + version: number; + lastModifiedBy: string; + }; +} + +async updateDocument(userId: string, newContent: string, expectedVersion: number) { + if (this.state.document.version !== expectedVersion) { + throw new Error("Conflict: document was modified by another user"); + } + + this.setState({ + ...this.state, + document: { + content: newContent, + version: expectedVersion + 1, + lastModifiedBy: userId, + }, + }); +} +``` + +### Per-Connection State + +Track ephemeral state for each connected client: + +```typescript +export class MultiUserAgent extends Agent { + private connectionState = new Map(); + + async onConnect(connection: Connection) { + this.connectionState.set(connection.id, { + userId: "", + cursor: { x: 0, y: 0 }, + 
lastActivity: Date.now(), + }); + } + + async onClose(connection: Connection) { + this.connectionState.delete(connection.id); + } +} +``` + +--- + +## State Migration + +When state schema changes: + +```typescript +interface StateV2 { + messages: Array<{ id: string; content: string; timestamp: string }>; + version: 2; +} + +export class MigratingAgent extends Agent { + initialState: StateV2 = { + messages: [], + version: 2, + }; + + async onStart() { + const rawState = this.state as any; + + if (!rawState.version || rawState.version < 2) { + const migratedMessages = (rawState.messages || []).map( + (content: string, i: number) => ({ + id: `migrated-${i}`, + content, + timestamp: new Date().toISOString(), + }) + ); + + this.setState({ + messages: migratedMessages, + version: 2, + }); + } + } +} +``` + +--- + +## State Size Management + +Keep state lean for performance: + +```typescript +export class LeanStateAgent extends Agent { + private readonly MAX_RECENT_MESSAGES = 100; + + async addMessage(message: Message) { + await this.sql`INSERT INTO messages (id, content) VALUES (${message.id}, ${message.content})`; + + let recentMessages = [...this.state.recentMessages, message]; + if (recentMessages.length > this.MAX_RECENT_MESSAGES) { + recentMessages = recentMessages.slice(-this.MAX_RECENT_MESSAGES); + } + + this.setState({ + ...this.state, + recentMessages, + stats: { + ...this.state.stats, + totalMessages: this.state.stats.totalMessages + 1, + lastActivity: new Date().toISOString(), + }, + }); + } +} +``` + +--- + +## Debugging State + +```typescript +async onMessage(connection: Connection, message: string) { + const data = JSON.parse(message); + + if (data.type === "debug_state") { + connection.send(JSON.stringify({ + type: "debug_response", + state: this.state, + stateSize: JSON.stringify(this.state).length, + sqlTables: await this.sql` + SELECT name FROM sqlite_master WHERE type='table' + `, + })); + } +} +``` diff --git 
a/building-ai-agent-on-cloudflare/references/troubleshooting.md b/building-ai-agent-on-cloudflare/references/troubleshooting.md new file mode 100644 index 0000000..97008a5 --- /dev/null +++ b/building-ai-agent-on-cloudflare/references/troubleshooting.md @@ -0,0 +1,362 @@ +# Agent Troubleshooting + +Common issues and solutions for Cloudflare Agents. + +## Connection Issues + +### "WebSocket connection failed" + +**Symptoms:** Client cannot connect to agent. + +**Causes & Solutions:** + +1. **Worker not deployed** + ```bash + wrangler deployments list + wrangler deploy # If not deployed + ``` + +2. **Wrong URL path** + ```javascript + // Ensure your routing handles the agent path + // Client: + new WebSocket("wss://my-worker.workers.dev/agent/user123"); + + // Worker must route to agent: + if (url.pathname.startsWith("/agent/")) { + const id = url.pathname.split("/")[2]; + return env.AGENT.get(env.AGENT.idFromName(id)).fetch(request); + } + ``` + +3. **CORS issues (browser clients)** + Agents handle WebSocket upgrades automatically, but ensure your entry worker doesn't block the request. + +### "Connection closed unexpectedly" + +1. **Agent threw an error** + ```bash + wrangler tail # Check for exceptions + ``` + +2. **Message handler crashed** + ```typescript + async onMessage(connection: Connection, message: string) { + try { + // Your logic + } catch (error) { + console.error("Message handling error:", error); + connection.send(JSON.stringify({ type: "error", message: error.message })); + } + } + ``` + +3. **Hibernation woke agent with stale connection** + Ensure you handle reconnection gracefully in client code. + +## State Issues + +### "State not persisting" + +**Causes:** + +1. **Didn't call `setState()`** + ```typescript + // Wrong - direct mutation doesn't persist + this.state.messages.push(newMessage); + + // Correct - use setState + this.setState({ + ...this.state, + messages: [...this.state.messages, newMessage], + }); + ``` + +2. 
**Agent crashed before state saved** + `setState()` is durable, but if agent crashes during processing before `setState()`, changes are lost. + +3. **Wrong agent instance** + Each unique ID gets a separate agent. Ensure clients connect to the same ID. + +### "State out of sync between clients" + +`setState()` automatically syncs to all connected clients via `onStateUpdate()`. If sync isn't working: + +1. **Check `onStateUpdate` is implemented** + ```typescript + onStateUpdate(state: State, source: string) { + // This fires when state changes from any source + console.log("State updated:", state, "from:", source); + } + ``` + +2. **Client not listening for state updates** + ```typescript + // React hook handles this automatically + const { state } = useAgent({ agent: "my-agent", name: id }); + + // Manual WebSocket - listen for state messages + ws.onmessage = (event) => { + const data = JSON.parse(event.data); + if (data.type === "state_update") { + updateLocalState(data.state); + } + }; + ``` + +### "State too large" / Performance issues + +State is serialized as JSON. Keep it small: + +```typescript +// Bad - storing everything in state +interface State { + allMessages: Message[]; // Could be thousands + allDocuments: Document[]; +} + +// Good - state for hot data, SQL for cold +interface State { + recentMessages: Message[]; // Last 50 only + currentDocument: Document | null; +} + +// Store full history in SQL +await this.sql`INSERT INTO messages ...`; +``` + +## SQL Issues + +### "no such table" + +Table not created. Create in `onStart()`: + +```typescript +async onStart() { + await this.sql` + CREATE TABLE IF NOT EXISTS messages ( + id TEXT PRIMARY KEY, + content TEXT NOT NULL, + created_at DATETIME DEFAULT CURRENT_TIMESTAMP + ) + `; +} +``` + +### "SQL logic error" + +Check your query syntax. Use tagged templates correctly: + +```typescript +// Wrong - string interpolation (SQL injection risk!) 
+await this.sql`SELECT * FROM users WHERE id = '${userId}'`; + +// Correct - parameterized query +await this.sql`SELECT * FROM users WHERE id = ${userId}`; +``` + +### SQL query returns empty + +1. **Wrong table name** +2. **Data in different agent instance** (each agent ID has isolated storage) +3. **Query conditions don't match** + +Debug: +```typescript +const tables = await this.sql` + SELECT name FROM sqlite_master WHERE type='table' +`; +console.log("Tables:", tables); + +const count = await this.sql`SELECT COUNT(*) as count FROM messages`; +console.log("Message count:", count); +``` + +## Scheduled Task Issues + +### "Task never fires" + +1. **Method name mismatch** + ```typescript + // Schedule references method that must exist + await this.schedule(60, "sendReminder", { ... }); + + // Method must be defined on the class + async sendReminder(data: any) { + // This method MUST exist + } + ``` + +2. **Cron syntax error** + ```typescript + // Invalid cron + await this.schedule("every 5 minutes", "task", {}); // Wrong + + // Valid cron + await this.schedule("*/5 * * * *", "task", {}); // Every 5 minutes + ``` + +3. **Task was cancelled** + ```typescript + const schedules = await this.getSchedules(); + console.log("Active schedules:", schedules); + ``` + +### "Task fires multiple times" + +If you schedule in `onStart()` without checking: + +```typescript +async onStart() { + // Bad - schedules new task every time agent wakes + await this.schedule("0 9 * * *", "dailyTask", {}); + + // Good - check first + const schedules = await this.getSchedules(); + const hasDaily = schedules.some(s => s.callback === "dailyTask"); + if (!hasDaily) { + await this.schedule("0 9 * * *", "dailyTask", {}); + } +} +``` + +## Deployment Issues + +### "Class MyAgent is not exported" + +```typescript +// src/index.ts - Must export the class +export { MyAgent } from "./agent"; + +// Or if defined in same file +export class MyAgent extends Agent { ... 
} +``` + +### "Durable Object not found" + +Check `wrangler.toml`: + +```toml +[durable_objects] +bindings = [{ name = "AGENT", class_name = "MyAgent" }] + +[[migrations]] +tag = "v1" +new_classes = ["MyAgent"] +``` + +### "Migration required" + +When adding new Durable Object classes: + +```toml +[[migrations]] +tag = "v2" # Increment from previous +new_classes = ["NewAgentClass"] + +# Or for renames +# renamed_classes = [{ from = "OldName", to = "NewName" }] +``` + +## AI Integration Issues + +### "AI binding not found" + +Add to `wrangler.toml`: + +```toml +[ai] +binding = "AI" +``` + +### "Model not found" / "Rate limited" + +```typescript +// Check model name is correct +const response = await this.env.AI.run( + "@cf/meta/llama-3-8b-instruct", // Exact model name + { messages: [...] } +); + +// Handle rate limits +try { + const response = await this.env.AI.run(...); +} catch (error) { + if (error.message.includes("rate limit")) { + // Retry with backoff or use queue + } +} +``` + +### "Streaming not working" + +```typescript +// Enable streaming +const stream = await this.env.AI.run("@cf/meta/llama-3-8b-instruct", { + messages: [...], + stream: true, // Must be true +}); + +// Iterate over stream +for await (const chunk of stream) { + connection.send(JSON.stringify({ type: "chunk", content: chunk.response })); +} +``` + +## Debugging Tips + +### Enable Verbose Logging + +```typescript +export class MyAgent extends Agent { + async onStart() { + console.log("Agent starting, state:", JSON.stringify(this.state)); + } + + async onConnect(connection: Connection) { + console.log("Client connected:", connection.id); + } + + async onMessage(connection: Connection, message: string) { + console.log("Received message:", message); + // ... 
handle + console.log("State after:", JSON.stringify(this.state)); + } + + async onClose(connection: Connection) { + console.log("Client disconnected:", connection.id); + } +} +``` + +View logs: +```bash +wrangler tail --format pretty +``` + +### Test Locally First + +```bash +npm start +# Connect with test client or use browser console: +# new WebSocket("ws://localhost:8787/agent/test") +``` + +### Inspect State + +Add a debug endpoint: + +```typescript +async onRequest(request: Request) { + const url = new URL(request.url); + + if (url.pathname === "/debug") { + return Response.json({ + state: this.state, + schedules: await this.getSchedules(), + }); + } + + return new Response("Not found", { status: 404 }); +} +``` diff --git a/building-mcp-server-on-cloudflare/.skillshare-meta.json b/building-mcp-server-on-cloudflare/.skillshare-meta.json new file mode 100644 index 0000000..f1adff9 --- /dev/null +++ b/building-mcp-server-on-cloudflare/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/cloudflare/skills/tree/main/skills/building-mcp-server-on-cloudflare", + "type": "github-subdir", + "installed_at": "2026-01-30T02:30:25.030942117Z", + "repo_url": "https://github.com/cloudflare/skills.git", + "subdir": "skills/building-mcp-server-on-cloudflare", + "version": "75a603b" +} \ No newline at end of file diff --git a/building-mcp-server-on-cloudflare/SKILL.md b/building-mcp-server-on-cloudflare/SKILL.md new file mode 100644 index 0000000..fdfa729 --- /dev/null +++ b/building-mcp-server-on-cloudflare/SKILL.md @@ -0,0 +1,265 @@ +--- +name: building-mcp-server-on-cloudflare +description: | + Builds remote MCP (Model Context Protocol) servers on Cloudflare Workers + with tools, OAuth authentication, and production deployment. Generates + server code, configures auth providers, and deploys to Workers. 
+ + Use when: user wants to "build MCP server", "create MCP tools", "remote + MCP", "deploy MCP", add "OAuth to MCP", or mentions Model Context Protocol + on Cloudflare. Also triggers on "MCP authentication" or "MCP deployment". +--- + +# Building MCP Servers on Cloudflare + +Creates production-ready Model Context Protocol servers on Cloudflare Workers with tools, authentication, and deployment. + +## When to Use + +- User wants to build a remote MCP server +- User needs to expose tools via MCP +- User asks about MCP authentication or OAuth +- User wants to deploy MCP to Cloudflare Workers + +## Prerequisites + +- Cloudflare account with Workers enabled +- Node.js 18+ and npm/pnpm/yarn +- Wrangler CLI (`npm install -g wrangler`) + +## Quick Start + +### Option 1: Public Server (No Auth) + +```bash +npm create cloudflare@latest -- my-mcp-server \ + --template=cloudflare/ai/demos/remote-mcp-authless +cd my-mcp-server +npm start +``` + +Server runs at `http://localhost:8788/mcp` + +### Option 2: Authenticated Server (OAuth) + +```bash +npm create cloudflare@latest -- my-mcp-server \ + --template=cloudflare/ai/demos/remote-mcp-github-oauth +cd my-mcp-server +``` + +Requires OAuth app setup. See [references/oauth-setup.md](references/oauth-setup.md). + +## Core Workflow + +### Step 1: Define Tools + +Tools are functions MCP clients can call. 
Define them using `server.tool()`: + +```typescript +import { McpAgent } from "agents/mcp"; +import { z } from "zod"; + +export class MyMCP extends McpAgent { + server = new Server({ name: "my-mcp", version: "1.0.0" }); + + async init() { + // Simple tool with parameters + this.server.tool( + "add", + { a: z.number(), b: z.number() }, + async ({ a, b }) => ({ + content: [{ type: "text", text: String(a + b) }], + }) + ); + + // Tool that calls external API + this.server.tool( + "get_weather", + { city: z.string() }, + async ({ city }) => { + const response = await fetch(`https://api.weather.com/${city}`); + const data = await response.json(); + return { + content: [{ type: "text", text: JSON.stringify(data) }], + }; + } + ); + } +} +``` + +### Step 2: Configure Entry Point + +**Public server** (`src/index.ts`): + +```typescript +import { MyMCP } from "./mcp"; + +export default { + fetch(request: Request, env: Env, ctx: ExecutionContext) { + const url = new URL(request.url); + if (url.pathname === "/mcp") { + return MyMCP.serveSSE("/mcp").fetch(request, env, ctx); + } + return new Response("MCP Server", { status: 200 }); + }, +}; + +export { MyMCP }; +``` + +**Authenticated server** — See [references/oauth-setup.md](references/oauth-setup.md). + +### Step 3: Test Locally + +```bash +# Start server +npm start + +# In another terminal, test with MCP Inspector +npx @modelcontextprotocol/inspector@latest +# Open http://localhost:5173, enter http://localhost:8788/mcp +``` + +### Step 4: Deploy + +```bash +npx wrangler deploy +``` + +Server accessible at `https://[worker-name].[account].workers.dev/mcp` + +### Step 5: Connect Clients + +**Claude Desktop** (`claude_desktop_config.json`): + +```json +{ + "mcpServers": { + "my-server": { + "command": "npx", + "args": ["mcp-remote", "https://my-mcp.workers.dev/mcp"] + } + } +} +``` + +Restart Claude Desktop after updating config. 
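The handlers registered in Step 1 all build the same result object by hand. It can help to standardize that with small helpers — a sketch where the `content`/`isError` fields follow the MCP tool-result shape, but the `ok`/`fail` helper names are purely illustrative:

```typescript
type ToolResult = {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
};

// Wrap a string (or any JSON-serializable value) as an MCP text result.
function ok(value: unknown): ToolResult {
  const text = typeof value === "string" ? value : JSON.stringify(value, null, 2);
  return { content: [{ type: "text", text }] };
}

// Report a failure as a result the model can read, rather than throwing.
function fail(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}
```

A handler then becomes `async ({ city }) => { try { return ok(await fetchWeather(city)); } catch (e) { return fail(String(e)); } }`, which keeps errors visible to the client instead of crashing the connection.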
+ +## Tool Patterns + +### Return Types + +```typescript +// Text response +return { content: [{ type: "text", text: "result" }] }; + +// Multiple content items +return { + content: [ + { type: "text", text: "Here's the data:" }, + { type: "text", text: JSON.stringify(data, null, 2) }, + ], +}; +``` + +### Input Validation with Zod + +```typescript +this.server.tool( + "create_user", + { + email: z.string().email(), + name: z.string().min(1).max(100), + role: z.enum(["admin", "user", "guest"]), + age: z.number().int().min(0).optional(), + }, + async (params) => { + // params are fully typed and validated + } +); +``` + +### Accessing Environment/Bindings + +```typescript +export class MyMCP extends McpAgent { + async init() { + this.server.tool("query_db", { sql: z.string() }, async ({ sql }) => { + // Access D1 binding + const result = await this.env.DB.prepare(sql).all(); + return { content: [{ type: "text", text: JSON.stringify(result) }] }; + }); + } +} +``` + +## Authentication + +For OAuth-protected servers, see [references/oauth-setup.md](references/oauth-setup.md). + +Supported providers: +- GitHub +- Google +- Auth0 +- Stytch +- WorkOS +- Any OAuth 2.0 compliant provider + +## Wrangler Configuration + +Minimal `wrangler.toml`: + +```toml +name = "my-mcp-server" +main = "src/index.ts" +compatibility_date = "2024-12-01" + +[durable_objects] +bindings = [{ name = "MCP", class_name = "MyMCP" }] + +[[migrations]] +tag = "v1" +new_classes = ["MyMCP"] +``` + +With bindings (D1, KV, etc.): + +```toml +[[d1_databases]] +binding = "DB" +database_name = "my-db" +database_id = "xxx" + +[[kv_namespaces]] +binding = "KV" +id = "xxx" +``` + +## Common Issues + +### "Tool not found" in Client + +1. Verify tool name matches exactly (case-sensitive) +2. Ensure `init()` registers tools before connections +3. Check server logs: `wrangler tail` + +### Connection Fails + +1. Confirm endpoint path is `/mcp` +2. Check CORS if browser-based client +3. 
Verify Worker is deployed: `wrangler deployments list` + +### OAuth Redirect Errors + +1. Callback URL must match OAuth app config exactly +2. Check `GITHUB_CLIENT_ID` and `GITHUB_CLIENT_SECRET` are set +3. For local dev, use `http://localhost:8788/callback` + +## References + +- [references/examples.md](references/examples.md) — Official templates and production examples +- [references/oauth-setup.md](references/oauth-setup.md) — OAuth provider configuration +- [references/tool-patterns.md](references/tool-patterns.md) — Advanced tool examples +- [references/troubleshooting.md](references/troubleshooting.md) — Error codes and fixes diff --git a/building-mcp-server-on-cloudflare/references/examples.md b/building-mcp-server-on-cloudflare/references/examples.md new file mode 100644 index 0000000..94e8e64 --- /dev/null +++ b/building-mcp-server-on-cloudflare/references/examples.md @@ -0,0 +1,115 @@ +# Project Bootstrapping + +Instructions for creating new MCP server projects. + +--- + +## Create Commands + +Execute in terminal to generate a new project: + +**Without authentication:** + +```bash +npm create cloudflare@latest -- my-mcp-server \ + --template=cloudflare/ai/demos/remote-mcp-authless +``` + +**With GitHub login:** + +```bash +npm create cloudflare@latest -- my-mcp-server \ + --template=cloudflare/ai/demos/remote-mcp-github-oauth +``` + +**With Google login:** + +```bash +npm create cloudflare@latest -- my-mcp-server \ + --template=cloudflare/ai/demos/remote-mcp-google-oauth +``` + +--- + +## Additional Boilerplate Locations + +**Main repository:** `github.com/cloudflare/ai` (check demos directory) + +Other authentication providers: +- Auth0 +- WorkOS AuthKit +- Logto +- Descope +- Stytch + +**Cloudflare tooling:** `github.com/cloudflare/mcp-server-cloudflare` + +--- + +## Selection Matrix + +| Goal | Boilerplate | +|------|-------------| +| Testing/learning | authless | +| GitHub API access | github-oauth | +| Google API access | google-oauth | +| 
Enterprise auth | auth0 / authkit | +| Slack apps | slack-oauth | +| Zero Trust | cf-access | + +--- + +## Platform Documentation + +- developers.cloudflare.com/agents/model-context-protocol/ +- developers.cloudflare.com/agents/guides/remote-mcp-server/ +- developers.cloudflare.com/agents/guides/test-remote-mcp-server/ +- developers.cloudflare.com/agents/model-context-protocol/authorization/ + +--- + +## Commands Reference + +**Local execution:** + +```bash +cd my-mcp-server +npm install +npm start +# Accessible at http://localhost:8788/mcp +``` + +**Production push:** + +```bash +npx wrangler deploy +# Accessible at https://[worker-name].[subdomain].workers.dev/mcp +``` + +**Claude Desktop setup** (modify `claude_desktop_config.json`): + +```json +{ + "mcpServers": { + "my-server": { + "command": "npx", + "args": ["mcp-remote", "https://my-mcp-server.my-account.workers.dev/mcp"] + } + } +} +``` + +**Inspector testing:** + +```bash +npx @modelcontextprotocol/inspector@latest +# Launch browser at http://localhost:5173 +# Input your server URL: http://localhost:8788/mcp +``` + +--- + +## Help Channels + +- Cloudflare Discord +- GitHub discussions on cloudflare/ai repository diff --git a/building-mcp-server-on-cloudflare/references/oauth-setup.md b/building-mcp-server-on-cloudflare/references/oauth-setup.md new file mode 100644 index 0000000..0a8216c --- /dev/null +++ b/building-mcp-server-on-cloudflare/references/oauth-setup.md @@ -0,0 +1,338 @@ +# Securing MCP Servers + +MCP servers require authentication to ensure only trusted users can access them. The MCP specification uses OAuth 2.1 for authentication between clients and servers. + +Cloudflare's `workers-oauth-provider` handles token management, client registration, and access token validation automatically. 
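At the wire level, "validates access tokens" means checking the `Authorization` header on every request to the protected API route. A simplified sketch of that first step (hypothetical — the real `workers-oauth-provider` additionally verifies the token against its stored grants and expiry):

```typescript
// Pull the bearer token from an Authorization header value,
// or return null if the header is absent or malformed.
function extractBearerToken(authHeader: string | null): string | null {
  if (!authHeader) return null;
  const match = authHeader.match(/^Bearer\s+(\S+)$/i);
  return match ? match[1] : null;
}
```

Requests that yield `null` here should receive a `401` with a `WWW-Authenticate` header, which signals MCP clients to begin the OAuth flow.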
+ +## Basic Setup + +```typescript +import { OAuthProvider } from "@cloudflare/workers-oauth-provider"; +import { createMcpHandler } from "agents/mcp"; + +const apiHandler = { + async fetch(request: Request, env: unknown, ctx: ExecutionContext) { + return createMcpHandler(server)(request, env, ctx); + } +}; + +export default new OAuthProvider({ + authorizeEndpoint: "/authorize", + tokenEndpoint: "/oauth/token", + clientRegistrationEndpoint: "/oauth/register", + apiRoute: "/mcp", + apiHandler: apiHandler, + defaultHandler: AuthHandler +}); +``` + +## Proxy Server Pattern + +MCP servers often act as OAuth clients too. Your server sits between Claude Desktop and a third-party API like GitHub. To Claude, you're a server. To GitHub, you're a client. This lets users authenticate with their GitHub credentials. + +Building a secure proxy server requires careful attention to several security concerns. + +--- + +## Security Requirements + +### Redirect URI Validation + +The `workers-oauth-provider` validates that `redirect_uri` in authorization requests matches registered URIs. This prevents attackers from redirecting authorization codes to malicious endpoints. + +### Consent Dialog + +When proxying to third-party providers, implement your own consent dialog before forwarding users upstream. This prevents the "confused deputy" problem where attackers exploit cached consent. + +Your consent dialog should: +- Identify the requesting MCP client by name +- Display the specific scopes being requested + +--- + +## CSRF Protection + +Prevent attackers from tricking users into approving malicious OAuth clients. Use a random token stored in a secure cookie. 
```typescript
// Generate token when showing consent form
function generateCSRFProtection(): { token: string; setCookie: string } {
  const token = crypto.randomUUID();
  const setCookie = `__Host-CSRF_TOKEN=${token}; HttpOnly; Secure; Path=/; SameSite=Lax; Max-Age=600`;
  return { token, setCookie };
}

// Validate token when user approves
function validateCSRFToken(formData: FormData, request: Request): { clearCookie: string } {
  const tokenFromForm = formData.get("csrf_token");
  const cookieHeader = request.headers.get("Cookie") || "";
  const tokenFromCookie = cookieHeader
    .split(";")
    .find((c) => c.trim().startsWith("__Host-CSRF_TOKEN="))
    ?.split("=")[1];

  if (!tokenFromForm || !tokenFromCookie || tokenFromForm !== tokenFromCookie) {
    throw new Error("CSRF token mismatch");
  }

  return {
    clearCookie: `__Host-CSRF_TOKEN=; HttpOnly; Secure; Path=/; SameSite=Lax; Max-Age=0`
  };
}
```

Include the token as a hidden form field:

```html
<input type="hidden" name="csrf_token" value="${token}">
```

---

## Input Sanitization

Client-controlled content (names, logos, URIs) can execute malicious scripts if not sanitized. Treat all client metadata as untrusted.

```typescript
function sanitizeText(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}

function sanitizeUrl(url: string): string {
  if (!url) return "";
  try {
    const parsed = new URL(url);
    if (!["http:", "https:"].includes(parsed.protocol)) {
      return "";
    }
    return url;
  } catch {
    return "";
  }
}
```

**Required protections:**
- Client names/descriptions: HTML-escape before rendering
- Logo URLs: Allow only `http:` and `https:` schemes
- Client URIs: Same as logo URLs
- Scopes: Treat as text, HTML-escape

---

## Content Security Policy

CSP headers block dangerous content and provide defense in depth.
+ +```typescript +function buildSecurityHeaders(setCookie: string, nonce?: string): HeadersInit { + const cspDirectives = [ + "default-src 'none'", + "script-src 'self'" + (nonce ? ` 'nonce-${nonce}'` : ""), + "style-src 'self' 'unsafe-inline'", + "img-src 'self' https:", + "font-src 'self'", + "form-action 'self'", + "frame-ancestors 'none'", + "base-uri 'self'", + "connect-src 'self'" + ].join("; "); + + return { + "Content-Security-Policy": cspDirectives, + "X-Frame-Options": "DENY", + "X-Content-Type-Options": "nosniff", + "Content-Type": "text/html; charset=utf-8", + "Set-Cookie": setCookie + }; +} +``` + +--- + +## State Management + +Ensure the same user that hits authorize reaches the callback. Use a random state token stored in KV with short expiration. + +```typescript +// Create state before redirecting to upstream provider +async function createOAuthState( + oauthReqInfo: AuthRequest, + kv: KVNamespace +): Promise<{ stateToken: string }> { + const stateToken = crypto.randomUUID(); + await kv.put(`oauth:state:${stateToken}`, JSON.stringify(oauthReqInfo), { + expirationTtl: 600 + }); + return { stateToken }; +} + +// Bind state to browser session via hashed cookie +async function bindStateToSession(stateToken: string): Promise<{ setCookie: string }> { + const encoder = new TextEncoder(); + const data = encoder.encode(stateToken); + const hashBuffer = await crypto.subtle.digest("SHA-256", data); + const hashArray = Array.from(new Uint8Array(hashBuffer)); + const hashHex = hashArray.map((b) => b.toString(16).padStart(2, "0")).join(""); + + return { + setCookie: `__Host-CONSENTED_STATE=${hashHex}; HttpOnly; Secure; Path=/; SameSite=Lax; Max-Age=600` + }; +} + +// Validate in callback - check both KV and session cookie +async function validateOAuthState( + request: Request, + kv: KVNamespace +): Promise<{ oauthReqInfo: AuthRequest; clearCookie: string }> { + const url = new URL(request.url); + const stateFromQuery = url.searchParams.get("state"); + + if 
(!stateFromQuery) { + throw new Error("Missing state parameter"); + } + + // Check KV + const storedDataJson = await kv.get(`oauth:state:${stateFromQuery}`); + if (!storedDataJson) { + throw new Error("Invalid or expired state"); + } + + // Check session cookie matches + const cookieHeader = request.headers.get("Cookie") || ""; + const consentedStateHash = cookieHeader + .split(";") + .find((c) => c.trim().startsWith("__Host-CONSENTED_STATE=")) + ?.split("=")[1]; + + if (!consentedStateHash) { + throw new Error("Missing session binding cookie"); + } + + // Hash state and compare + const encoder = new TextEncoder(); + const hashBuffer = await crypto.subtle.digest("SHA-256", encoder.encode(stateFromQuery)); + const stateHash = Array.from(new Uint8Array(hashBuffer)) + .map((b) => b.toString(16).padStart(2, "0")) + .join(""); + + if (stateHash !== consentedStateHash) { + throw new Error("State token does not match session"); + } + + await kv.delete(`oauth:state:${stateFromQuery}`); + + return { + oauthReqInfo: JSON.parse(storedDataJson), + clearCookie: `__Host-CONSENTED_STATE=; HttpOnly; Secure; Path=/; SameSite=Lax; Max-Age=0` + }; +} +``` + +--- + +## Approved Clients Registry + +Maintain a registry of approved client IDs per user. Store in a cryptographically signed cookie with HMAC-SHA256. + +```typescript +export async function addApprovedClient( + request: Request, + clientId: string, + cookieSecret: string +): Promise { + const existingClients = await getApprovedClientsFromCookie(request, cookieSecret) || []; + const updatedClients = Array.from(new Set([...existingClients, clientId])); + + const payload = JSON.stringify(updatedClients); + const signature = await signData(payload, cookieSecret); + const cookieValue = `${signature}.${btoa(payload)}`; + + return `__Host-APPROVED_CLIENTS=${cookieValue}; HttpOnly; Secure; Path=/; SameSite=Lax; Max-Age=2592000`; +} +``` + +When reading the cookie, verify the signature before trusting data. 
If client isn't approved, show consent dialog.

---

## Cookie Security

### Why `__Host-` prefix?

The `__Host-` prefix prevents subdomain attacks on `*.workers.dev` domains. Requirements:
- Must have `Secure` flag (HTTPS only)
- Must have `Path=/`
- Must not have `Domain` attribute

Without this prefix, an attacker on `evil.workers.dev` could set cookies for your `mcp-server.workers.dev` domain.

### Multiple OAuth Providers

If running multiple OAuth flows on the same domain, namespace your cookies:
- `__Host-CSRF_TOKEN_GITHUB` vs `__Host-CSRF_TOKEN_GOOGLE`
- `__Host-APPROVED_CLIENTS_GITHUB` vs `__Host-APPROVED_CLIENTS_GOOGLE`

---

## Inline JavaScript

If your consent dialog needs inline JavaScript, use data attributes and nonces:

```typescript
const nonce = crypto.randomUUID();

// Illustrative markup: escaped user data goes into data attributes; the
// nonce lets CSP run this one script while blocking anything injected.
const html = `
  <button id="approve" data-client-name="${sanitizeText(clientName)}">Approve</button>
  <script nonce="${nonce}">
    const button = document.getElementById("approve");
    console.log("Approving client:", button.dataset.clientName);
  </script>
`;

return new Response(html, {
  headers: buildSecurityHeaders(setCookie, nonce)
});
```

Data attributes store user-controlled data separately from executable code. Nonces with CSP allow your specific script while blocking injected scripts.

---

## Provider-Specific Setup

### GitHub

1. Create OAuth App at github.com/settings/developers
2. Set callback URL: `https://[worker].workers.dev/callback`
3. Store secrets:
   ```bash
   wrangler secret put GITHUB_CLIENT_ID
   wrangler secret put GITHUB_CLIENT_SECRET
   ```

### Google

1. Create OAuth Client at console.cloud.google.com/apis/credentials
2. Set authorized redirect URI
3. Scopes: `openid email profile`

### Auth0

1. Create Regular Web Application in Auth0 Dashboard
2. Set allowed callback URLs
3.
Endpoints: `https://${AUTH0_DOMAIN}/authorize`, `/oauth/token`, `/userinfo` + +--- + +## References + +- [MCP Authorization Spec](https://modelcontextprotocol.io/specification/2025-06-18/basic/authorization) +- [MCP Security Best Practices](https://modelcontextprotocol.io/specification/draft/basic/security_best_practices) +- [RFC 9700 - OAuth Security](https://www.rfc-editor.org/rfc/rfc9700) diff --git a/building-mcp-server-on-cloudflare/references/troubleshooting.md b/building-mcp-server-on-cloudflare/references/troubleshooting.md new file mode 100644 index 0000000..f1e62e7 --- /dev/null +++ b/building-mcp-server-on-cloudflare/references/troubleshooting.md @@ -0,0 +1,317 @@ +# MCP Server Troubleshooting + +Common errors and solutions for MCP servers on Cloudflare. + +## Connection Issues + +### "Failed to connect to MCP server" + +**Symptoms:** Client cannot establish connection to deployed server. + +**Causes & Solutions:** + +1. **Wrong URL path** + ``` + # Wrong + https://my-server.workers.dev/ + + # Correct + https://my-server.workers.dev/mcp + ``` + +2. **Worker not deployed** + ```bash + wrangler deployments list + # If empty, deploy first: + wrangler deploy + ``` + +3. **Worker crashed on startup** + ```bash + wrangler tail + # Check for initialization errors + ``` + +### "WebSocket connection failed" + +MCP uses SSE (Server-Sent Events), not WebSockets. 
Ensure your client is configured for SSE transport: + +```json +{ + "mcpServers": { + "my-server": { + "command": "npx", + "args": ["mcp-remote", "https://my-server.workers.dev/mcp"] + } + } +} +``` + +### CORS Errors in Browser + +If calling from browser-based client: + +```typescript +// Add CORS headers to your worker +export default { + async fetch(request: Request, env: Env) { + // Handle preflight + if (request.method === "OPTIONS") { + return new Response(null, { + headers: { + "Access-Control-Allow-Origin": "*", + "Access-Control-Allow-Methods": "GET, POST, OPTIONS", + "Access-Control-Allow-Headers": "Content-Type", + }, + }); + } + + const response = await handleRequest(request, env); + + // Add CORS headers to response + const headers = new Headers(response.headers); + headers.set("Access-Control-Allow-Origin", "*"); + + return new Response(response.body, { + status: response.status, + headers, + }); + }, +}; +``` + +## Tool Errors + +### "Tool not found: [tool_name]" + +**Causes:** + +1. Tool not registered in `init()` +2. Tool name mismatch (case-sensitive) +3. `init()` threw an error before registering tool + +**Debug:** + +```typescript +async init() { + console.log("Registering tools..."); + + this.server.tool("my_tool", { ... }, async () => { ... }); + + console.log("Tools registered:", this.server.listTools()); +} +``` + +Check logs: `wrangler tail` + +### "Invalid parameters for tool" + +Zod validation failed. Check parameter schema: + +```typescript +// Schema expects number, client sent string +this.server.tool( + "calculate", + { value: z.number() }, // Client must send number, not "123" + async ({ value }) => { ... } +); + +// Fix: Coerce string to number +this.server.tool( + "calculate", + { value: z.coerce.number() }, // "123" → 123 + async ({ value }) => { ... } +); +``` + +### Tool Timeout + +Workers have CPU time limits (10-30ms for free, longer for paid). 
For long operations: + +```typescript +this.server.tool( + "long_operation", + { ... }, + async (params) => { + // Break into smaller chunks + // Or use Queues/Durable Objects for background work + + // Don't do this: + // await sleep(5000); // Will timeout + + return { content: [{ type: "text", text: "Queued for processing" }] }; + } +); +``` + +## Authentication Errors + +### "401 Unauthorized" + +OAuth token missing or expired. + +1. **Check client is handling OAuth flow** +2. **Verify secrets are set:** + ```bash + wrangler secret list + # Should show GITHUB_CLIENT_ID, GITHUB_CLIENT_SECRET + ``` + +3. **Check KV namespace exists:** + ```bash + wrangler kv namespace list + # Should show OAUTH_KV + ``` + +### "Invalid redirect_uri" + +OAuth callback URL doesn't match app configuration. + +**Local development:** +- OAuth app callback: `http://localhost:8788/callback` + +**Production:** +- OAuth app callback: `https://[worker-name].[account].workers.dev/callback` + +Must match EXACTLY (including trailing slash or lack thereof). + +### "State mismatch" / CSRF Error + +State parameter validation failed. + +1. **Clear browser cookies and retry** +2. **Check KV is storing state:** + ```typescript + // In your auth handler + console.log("Storing state:", state); + await env.OAUTH_KV.put(`state:${state}`, "1", { expirationTtl: 600 }); + ``` + +3. **Verify same domain for all requests** + +## Binding Errors + +### "Binding not found: [BINDING_NAME]" + +Binding not in `wrangler.toml` or not deployed. + +```toml +# wrangler.toml +[[d1_databases]] +binding = "DB" # Must match env.DB in code +database_name = "mydb" +database_id = "xxx-xxx" +``` + +After adding bindings: `wrangler deploy` + +### "D1_ERROR: no such table" + +Migrations not applied. 
+ +```bash +# Local +wrangler d1 migrations apply DB_NAME --local + +# Production +wrangler d1 migrations apply DB_NAME +``` + +### Durable Object Not Found + +```toml +# wrangler.toml must have: +[durable_objects] +bindings = [{ name = "MCP", class_name = "MyMCP" }] + +[[migrations]] +tag = "v1" +new_classes = ["MyMCP"] +``` + +And class must be exported: + +```typescript +export { MyMCP }; // Don't forget this! +``` + +## Deployment Errors + +### "Class MyMCP is not exported" + +```typescript +// src/index.ts - Must export the class +export { MyMCP } from "./mcp"; + +// OR in same file +export class MyMCP extends McpAgent { ... } +``` + +### "Migration required" + +New Durable Object class needs migration: + +```toml +# Add to wrangler.toml +[[migrations]] +tag = "v2" # Increment version +new_classes = ["NewClassName"] +# Or for renames: +# renamed_classes = [{ from = "OldName", to = "NewName" }] +``` + +### Build Errors + +```bash +# Clear cache and rebuild +rm -rf node_modules .wrangler +npm install +wrangler deploy +``` + +## Debugging Tips + +### Enable Verbose Logging + +```typescript +export class MyMCP extends McpAgent { + async init() { + console.log("MCP Server initializing..."); + console.log("Environment:", Object.keys(this.env)); + + this.server.tool("test", {}, async () => { + console.log("Test tool called"); + return { content: [{ type: "text", text: "OK" }] }; + }); + + console.log("Tools registered"); + } +} +``` + +View logs: +```bash +wrangler tail --format pretty +``` + +### Test Locally First + +```bash +npm start +npx @modelcontextprotocol/inspector@latest +``` + +Always verify tools work locally before deploying. 
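While iterating locally, a thin wrapper around tool handlers makes `wrangler tail` output much easier to read. A sketch — the `withLogging` name is made up; wrap a handler with it before passing it to `server.tool()`:

```typescript
// Wrap an async tool handler so every call logs its input, duration,
// and outcome, without changing the handler's behavior.
function withLogging<TArgs, TResult>(
  name: string,
  handler: (args: TArgs) => Promise<TResult>
): (args: TArgs) => Promise<TResult> {
  return async (args) => {
    const start = Date.now();
    console.log(`[${name}] called with`, JSON.stringify(args));
    try {
      const result = await handler(args);
      console.log(`[${name}] ok in ${Date.now() - start}ms`);
      return result;
    } catch (error) {
      console.error(`[${name}] failed:`, error);
      throw error; // rethrow so the caller still sees the failure
    }
  };
}
```

Because the wrapper returns the handler's result unchanged and rethrows errors, it can be left in place or removed without affecting clients.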
+ +### Check Worker Health + +```bash +# List deployments +wrangler deployments list + +# View recent logs +wrangler tail + +# Check worker status +curl -I https://your-worker.workers.dev/mcp +``` diff --git a/building-native-ui/.skillshare-meta.json b/building-native-ui/.skillshare-meta.json new file mode 100644 index 0000000..1b114c7 --- /dev/null +++ b/building-native-ui/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/expo/skills/tree/main/plugins/expo-app-design/skills/building-native-ui", + "type": "github-subdir", + "installed_at": "2026-01-30T02:26:49.540522331Z", + "repo_url": "https://github.com/expo/skills.git", + "subdir": "plugins/expo-app-design/skills/building-native-ui", + "version": "b631a60" +} \ No newline at end of file diff --git a/building-native-ui/SKILL.md b/building-native-ui/SKILL.md new file mode 100644 index 0000000..17026de --- /dev/null +++ b/building-native-ui/SKILL.md @@ -0,0 +1,315 @@ +--- +name: building-native-ui +description: Complete guide for building beautiful apps with Expo Router. Covers fundamentals, styling, components, navigation, animations, patterns, and native tabs. 
+version: 1.0.0 +license: MIT +--- + +# Expo UI Guidelines + +## References + +Consult these resources as needed: + +- ./references/route-structure.md -- Route file conventions, dynamic routes, query parameters, groups, and folder organization +- ./references/tabs.md -- Native tab bar with NativeTabs, migration from JS tabs, iOS 26 features +- ./references/icons.md -- SF Symbols with expo-symbols, common icon names, animations, and weights +- ./references/controls.md -- Native iOS controls: Switch, Slider, SegmentedControl, DateTimePicker, Picker +- ./references/visual-effects.md -- Blur effects with expo-blur and liquid glass with expo-glass-effect +- ./references/animations.md -- Reanimated animations: entering, exiting, layout, scroll-driven, and gestures +- ./references/search.md -- Search bar integration with headers, useSearch hook, and filtering patterns +- ./references/gradients.md -- CSS gradients using experimental_backgroundImage (New Architecture only) +- ./references/media.md -- Media handling for Expo Router including camera, audio, video, and file saving +- ./references/storage.md -- Data storage patterns including SQLite, AsyncStorage, and SecureStore +- ./references/webgpu-three.md -- 3D graphics, games, and GPU-powered visualizations with WebGPU and Three.js +- ./references/toolbars-and-headers.md -- Customizing stack headers and toolbar with buttons, menus, and search bars in expo-router app. Available only on iOS. + +## Running the App + +**CRITICAL: Always try Expo Go first before creating custom builds.** + +Most Expo apps work in Expo Go without any custom native code. Before running `npx expo run:ios` or `npx expo run:android`: + +1. **Start with Expo Go**: Run `npx expo start` and scan the QR code with Expo Go +2. **Check if features work**: Test your app thoroughly in Expo Go +3. 
**Only create custom builds when required** - see below + +### When Custom Builds Are Required + +You need `npx expo run:ios/android` or `eas build` ONLY when using: + +- **Local Expo modules** (custom native code in `modules/`) +- **Apple targets** (widgets, app clips, extensions via `@bacons/apple-targets`) +- **Third-party native modules** not included in Expo Go +- **Custom native configuration** that can't be expressed in `app.json` + +### When Expo Go Works + +Expo Go supports a huge range of features out of the box: + +- All `expo-*` packages (camera, location, notifications, etc.) +- Expo Router navigation +- Most UI libraries (reanimated, gesture handler, etc.) +- Push notifications, deep links, and more + +**If you're unsure, try Expo Go first.** Creating custom builds adds complexity, slower iteration, and requires Xcode/Android Studio setup. + +## Code Style + +- Be cautious of unterminated strings. Ensure nested backticks are escaped; never forget to escape quotes correctly. +- Always use import statements at the top of the file. +- Always use kebab-case for file names, e.g. `comment-card.tsx` +- Always remove old route files when moving or restructuring navigation +- Never use special characters in file names +- Configure tsconfig.json with path aliases, and prefer aliases over relative imports for refactors. + +## Routes + +See `./references/route-structure.md` for detailed route conventions. + +- Routes belong in the `app` directory. +- Never co-locate components, types, or utilities in the app directory. This is an anti-pattern. +- Ensure the app always has a route that matches "/", it may be inside a group route. 
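The path-alias preference above might look like this in a minimal `tsconfig.json`. The `@/*` alias name and the `expo/tsconfig.base` base config are the usual Expo defaults, not requirements; adjust them to whatever the project already uses:

```json
{
  "extends": "expo/tsconfig.base",
  "compilerOptions": {
    "strict": true,
    "paths": {
      "@/*": ["./*"]
    }
  }
}
```

With this in place, `import { CommentCard } from "@/components/comment-card"` works from any depth, which makes moving route files much less error-prone than chains of `../../` imports.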
+ +## Library Preferences + +- Never use modules removed from React Native such as Picker, WebView, SafeAreaView, or AsyncStorage +- Never use legacy expo-permissions +- `expo-audio` not `expo-av` +- `expo-video` not `expo-av` +- `expo-symbols` not `@expo/vector-icons` +- `react-native-safe-area-context` not react-native SafeAreaView +- `process.env.EXPO_OS` not `Platform.OS` +- `React.use` not `React.useContext` +- `expo-image` Image component instead of intrinsic element `img` +- `expo-glass-effect` for liquid glass backdrops + +## Responsiveness + +- Always wrap the root component in a scroll view for responsiveness +- Use `ScrollView` with `contentInsetAdjustmentBehavior="automatic"` instead of `SafeAreaView` for smarter safe area insets +- `contentInsetAdjustmentBehavior="automatic"` should be applied to FlatList and SectionList as well +- Use flexbox instead of the Dimensions API +- ALWAYS prefer `useWindowDimensions` over `Dimensions.get()` to measure screen size + +## Behavior + +- Use expo-haptics conditionally on iOS to create more delightful experiences +- Use views with built-in haptics, like `Switch` from React Native and `@react-native-community/datetimepicker` +- When a route belongs to a Stack, its first child should almost always be a ScrollView with `contentInsetAdjustmentBehavior="automatic"` set +- Prefer `headerSearchBarOptions` in Stack.Screen options to add a search bar +- Use the `selectable` prop on text containing data that could be copied +- Consider formatting large numbers like 1.4M or 38k +- Never use intrinsic elements like 'img' or 'div' unless in a webview or an Expo DOM component + +# Styling + +Follow the Apple Human Interface Guidelines. 
+ +## General Styling Rules + +- Prefer flex gap over margin and padding styles +- Prefer padding over margin where possible +- Always account for safe area, either with stack headers, tabs, or ScrollView/FlatList `contentInsetAdjustmentBehavior="automatic"` +- Ensure both top and bottom safe area insets are accounted for +- Inline styles not StyleSheet.create unless reusing styles is faster +- Add entering and exiting animations for state changes +- Use `{ borderCurve: 'continuous' }` for rounded corners unless creating a capsule shape +- ALWAYS use a navigation stack title instead of a custom text element on the page +- When padding a ScrollView, use `contentContainerStyle` padding and gap instead of padding on the ScrollView itself (reduces clipping) +- CSS and Tailwind are not supported - use inline styles + +## Text Styling + +- Add the `selectable` prop to every `Text` element displaying important data or error messages +- Counters should use `{ fontVariant: 'tabular-nums' }` for alignment + +## Shadows + +Use the CSS `boxShadow` style prop. NEVER use legacy React Native shadow or elevation styles. + +```tsx +<View style={{ borderRadius: 12, boxShadow: "0px 4px 12px rgba(0, 0, 0, 0.15)" }} /> +``` + +'inset' shadows are supported. + +# Navigation + +## Link + +Use `Link` from 'expo-router' for navigation between routes. + +```tsx +import { Link } from 'expo-router'; + +// Basic link +<Link href="/about">About</Link> + +// Wrapping custom components +<Link href="/profile" asChild> + <Pressable>...</Pressable> +</Link> +``` + +Whenever possible, include a `` to follow iOS conventions. Add context menus and previews frequently to enhance navigation. 
+ +## Stack + +- ALWAYS use `_layout.tsx` files to define stacks +- Use Stack from 'expo-router/stack' for native navigation stacks + +### Page Title + +Set the page title in Stack.Screen options: + +```tsx + +``` + +## Context Menus + +Add long press context menus to Link components: + +```tsx +import { Link } from "expo-router"; + + + + + + + + + + + + {}} /> + {}} + /> + + +; +``` + +## Link Previews + +Use link previews frequently to enhance navigation: + +```tsx + + + + + + + + +``` + +Link preview can be used with context menus. + +## Modal + +Present a screen as a modal: + +```tsx + +``` + +Prefer this to building a custom modal component. + +## Sheet + +Present a screen as a dynamic form sheet: + +```tsx + +``` + +- Using `contentStyle: { backgroundColor: "transparent" }` makes the background liquid glass on iOS 26+. + +## Common route structure + +A standard app layout with tabs and stacks inside each tab: + +``` +app/ + _layout.tsx — + (index,search)/ + _layout.tsx — + index.tsx — Main list + search.tsx — Search view +``` + +```tsx +// app/_layout.tsx +import { NativeTabs, Icon, Label } from "expo-router/unstable-native-tabs"; +import { Theme } from "../components/theme"; + +export default function Layout() { + return ( + + + + + + + + + + ); +} +``` + +Create a shared group route so both tabs can push common screens: + +```tsx +// app/(index,search)/_layout.tsx +import { Stack } from "expo-router/stack"; +import { PlatformColor } from "react-native"; + +export default function Layout({ segment }) { + const screen = segment.match(/\((.*)\)/)?.[1]!; + const titles: Record = { index: "Items", search: "Search" }; + + return ( + + + + + ); +} +``` diff --git a/building-native-ui/references/animations.md b/building-native-ui/references/animations.md new file mode 100644 index 0000000..657cad8 --- /dev/null +++ b/building-native-ui/references/animations.md @@ -0,0 +1,220 @@ +# Animations + +Use Reanimated v4. Avoid React Native's built-in Animated API. 
+ +## Entering and Exiting Animations + +Use Animated.View with entering and exiting animations. Layout animations can animate state changes. + +```tsx +import Animated, { + FadeIn, + FadeOut, + LinearTransition, +} from "react-native-reanimated"; + +function App() { + return ( + + ); +} +``` + +## On-Scroll Animations + +Create high-performance scroll animations using Reanimated's hooks: + +```tsx +import Animated, { + useAnimatedRef, + useScrollViewOffset, + useAnimatedStyle, + interpolate, +} from "react-native-reanimated"; + +function Page() { + const ref = useAnimatedRef(); + const scroll = useScrollViewOffset(ref); + + const style = useAnimatedStyle(() => ({ + opacity: interpolate(scroll.value, [0, 30], [0, 1], "clamp"), + })); + + return ( + + + + ); +} +``` + +## Common Animation Presets + +### Entering Animations + +- `FadeIn`, `FadeInUp`, `FadeInDown`, `FadeInLeft`, `FadeInRight` +- `SlideInUp`, `SlideInDown`, `SlideInLeft`, `SlideInRight` +- `ZoomIn`, `ZoomInUp`, `ZoomInDown` +- `BounceIn`, `BounceInUp`, `BounceInDown` + +### Exiting Animations + +- `FadeOut`, `FadeOutUp`, `FadeOutDown`, `FadeOutLeft`, `FadeOutRight` +- `SlideOutUp`, `SlideOutDown`, `SlideOutLeft`, `SlideOutRight` +- `ZoomOut`, `ZoomOutUp`, `ZoomOutDown` +- `BounceOut`, `BounceOutUp`, `BounceOutDown` + +### Layout Animations + +- `LinearTransition` — Smooth linear interpolation +- `SequencedTransition` — Sequenced property changes +- `FadingTransition` — Fade between states + +## Customizing Animations + +```tsx + +``` + +### Modifiers + +```tsx +// Duration in milliseconds +FadeIn.duration(300); + +// Delay before starting +FadeIn.delay(100); + +// Spring physics +FadeIn.springify(); +FadeIn.springify().damping(15).stiffness(100); + +// Easing curves +FadeIn.easing(Easing.bezier(0.25, 0.1, 0.25, 1)); + +// Chaining +FadeInDown.duration(400).delay(200).springify(); +``` + +## Shared Value Animations + +For imperative control over animations: + +```tsx +import { + useSharedValue, + 
withSpring, + withTiming, +} from "react-native-reanimated"; + +const offset = useSharedValue(0); + +// Spring animation +offset.value = withSpring(100); + +// Timing animation +offset.value = withTiming(100, { duration: 300 }); + +// Use in styles +const style = useAnimatedStyle(() => ({ + transform: [{ translateX: offset.value }], +})); +``` + +## Gesture Animations + +Combine with React Native Gesture Handler: + +```tsx +import { Gesture, GestureDetector } from "react-native-gesture-handler"; +import Animated, { + useSharedValue, + useAnimatedStyle, + withSpring, +} from "react-native-reanimated"; + +function DraggableBox() { + const translateX = useSharedValue(0); + const translateY = useSharedValue(0); + + const gesture = Gesture.Pan() + .onUpdate((e) => { + translateX.value = e.translationX; + translateY.value = e.translationY; + }) + .onEnd(() => { + translateX.value = withSpring(0); + translateY.value = withSpring(0); + }); + + const style = useAnimatedStyle(() => ({ + transform: [ + { translateX: translateX.value }, + { translateY: translateY.value }, + ], + })); + + return ( + + + + ); +} +``` + +## Keyboard Animations + +Animate with keyboard height changes: + +```tsx +import Animated, { + useAnimatedKeyboard, + useAnimatedStyle, +} from "react-native-reanimated"; + +function KeyboardAwareView() { + const keyboard = useAnimatedKeyboard(); + + const style = useAnimatedStyle(() => ({ + paddingBottom: keyboard.height.value, + })); + + return {/* content */}; +} +``` + +## Staggered List Animations + +Animate list items with delays: + +```tsx +{ + items.map((item, index) => ( + + + + )); +} +``` + +## Best Practices + +- Add entering and exiting animations for state changes +- Use layout animations when items are added/removed from lists +- Use `useAnimatedStyle` for scroll-driven animations +- Prefer `interpolate` with "clamp" for bounded values +- You can't pass PlatformColors to reanimated views or styles; use static colors instead +- Keep animations 
under 300ms for responsive feel +- Use spring animations for natural movement +- Avoid animating layout properties (width, height) when possible — prefer transforms diff --git a/building-native-ui/references/controls.md b/building-native-ui/references/controls.md new file mode 100644 index 0000000..762fe20 --- /dev/null +++ b/building-native-ui/references/controls.md @@ -0,0 +1,270 @@ +# Native Controls + +Native iOS controls provide built-in haptics, accessibility, and platform-appropriate styling. + +## Switch + +Use for binary on/off settings. Has built-in haptics. + +```tsx +import { Switch } from "react-native"; +import { useState } from "react"; + +const [enabled, setEnabled] = useState(false); + +; +``` + +### Customization + +```tsx + +``` + +## Segmented Control + +Use for non-navigational tabs or mode selection. Avoid changing default colors. + +```tsx +import SegmentedControl from "@react-native-segmented-control/segmented-control"; +import { useState } from "react"; + +const [index, setIndex] = useState(0); + + setIndex(nativeEvent.selectedSegmentIndex)} +/>; +``` + +### Rules + +- Maximum 4 options — use a picker for more +- Keep labels short (1-2 words) +- Avoid custom colors — native styling adapts to dark mode + +### With Icons (iOS 14+) + +```tsx + setIndex(nativeEvent.selectedSegmentIndex)} +/> +``` + +## Slider + +Continuous value selection. + +```tsx +import Slider from "@react-native-community/slider"; +import { useState } from "react"; + +const [value, setValue] = useState(0.5); + +; +``` + +### Customization + +```tsx + +``` + +### Discrete Steps + +```tsx + +``` + +## Date/Time Picker + +Compact pickers with popovers. Has built-in haptics. 
+ +```tsx +import DateTimePicker from "@react-native-community/datetimepicker"; +import { useState } from "react"; + +const [date, setDate] = useState(new Date()); + + { + if (selectedDate) setDate(selectedDate); + }} + mode="datetime" +/>; +``` + +### Modes + +- `date` — Date only +- `time` — Time only +- `datetime` — Date and time + +### Display Styles + +```tsx +// Compact inline (default) + + +// Spinner wheel + + +// Full calendar + +``` + +### Time Intervals + +```tsx + +``` + +### Min/Max Dates + +```tsx + +``` + +## Stepper + +Increment/decrement numeric values. + +```tsx +import { Stepper } from "react-native"; +import { useState } from "react"; + +const [count, setCount] = useState(0); + +; +``` + +## TextInput + +Native text input with various keyboard types. + +```tsx +import { TextInput } from "react-native"; + + +``` + +### Keyboard Types + +```tsx +// Email + + +// Phone + + +// Number + + +// Password + + +// Search + +``` + +### Multiline + +```tsx + +``` + +## Picker (Wheel) + +For selection from many options (5+ items). 
+ +```tsx +import { Picker } from "@react-native-picker/picker"; +import { useState } from "react"; + +const [selected, setSelected] = useState("js"); + + + + + + +; +``` + +## Best Practices + +- **Haptics**: Switch and DateTimePicker have built-in haptics — don't add extra +- **Accessibility**: Native controls have proper accessibility labels by default +- **Dark Mode**: Avoid custom colors — native styling adapts automatically +- **Spacing**: Use consistent padding around controls (12-16pt) +- **Labels**: Place labels above or to the left of controls +- **Grouping**: Group related controls in sections with headers diff --git a/building-native-ui/references/form-sheet.md b/building-native-ui/references/form-sheet.md new file mode 100644 index 0000000..240c75c --- /dev/null +++ b/building-native-ui/references/form-sheet.md @@ -0,0 +1,227 @@ +# Form Sheets in Expo Router + +This skill covers implementing form sheets with footers using Expo Router's Stack navigator and react-native-screens. + +## Overview + +Form sheets are modal presentations that appear as a card sliding up from the bottom of the screen. They're ideal for: + +- Quick actions and confirmations +- Settings panels +- Login/signup flows +- Action sheets with custom content + +**Requirements:** + +- Expo Router Stack navigator + +## Basic Usage + +### Form Sheet with Footer + +Configure the Stack.Screen with transparent backgrounds and sheet presentation: + +```tsx +// app/_layout.tsx +import { Stack } from "expo-router"; + +export default function Layout() { + return ( + + + + + + + ); +} +``` + +### Form Sheet Screen Content + +> Requires Expo SDK 55 or later. 
+ +Use `flex: 1` to allow the content to fill available space, enabling footer positioning: + +```tsx +// app/about.tsx +import { View, Text, StyleSheet } from "react-native"; + +export default function AboutSheet() { + return ( + + {/* Main content */} + + Sheet Content + + + {/* Footer - stays at bottom */} + + Footer Content + + + ); +} + +const styles = StyleSheet.create({ + container: { + flex: 1, + }, + content: { + flex: 1, + padding: 16, + }, + footer: { + padding: 16, + }, +}); +``` + +## Key Options + +| Option | Type | Description | +| --------------------- | ---------- | ----------------------------------------------------------- | +| `presentation` | `string` | Set to `'formSheet'` for sheet presentation | +| `sheetGrabberVisible` | `boolean` | Shows the drag handle at the top of the sheet | +| `sheetAllowedDetents` | `number[]` | Array of detent heights (0-1 range, e.g., `[0.25]` for 25%) | +| `headerTransparent` | `boolean` | Makes header background transparent | +| `contentStyle` | `object` | Style object for the screen content container | +| `title` | `string` | Screen title (set to `''` for no title) | + +## Common Detent Values + +- `[0.25]` - Quarter sheet (compact actions) +- `[0.5]` - Half sheet (medium content) +- `[0.75]` - Three-quarter sheet (detailed forms) +- `[0.25, 0.5, 1]` - Multiple stops (expandable sheet) + +## Complete Example + +```tsx +// _layout.tsx +import { Stack } from "expo-router"; + +export default function Layout() { + return ( + + + + + + + + + ); +} +``` + +```tsx +// app/confirm.tsx +import { View, Text, Pressable, StyleSheet } from "react-native"; +import { router } from "expo-router"; + +export default function ConfirmSheet() { + return ( + + + Confirm Action + + Are you sure you want to proceed? 
+ + + + + router.back()}> + Cancel + + router.back()}> + Confirm + + + + ); +} + +const styles = StyleSheet.create({ + container: { + flex: 1, + }, + content: { + flex: 1, + padding: 20, + alignItems: "center", + justifyContent: "center", + }, + title: { + fontSize: 18, + fontWeight: "600", + marginBottom: 8, + }, + description: { + fontSize: 14, + color: "#666", + textAlign: "center", + }, + footer: { + flexDirection: "row", + padding: 16, + gap: 12, + }, + cancelButton: { + flex: 1, + padding: 14, + borderRadius: 10, + backgroundColor: "#f0f0f0", + alignItems: "center", + }, + cancelText: { + fontSize: 16, + fontWeight: "500", + }, + confirmButton: { + flex: 1, + padding: 14, + borderRadius: 10, + backgroundColor: "#007AFF", + alignItems: "center", + }, + confirmText: { + fontSize: 16, + fontWeight: "500", + color: "white", + }, +}); +``` + +## Troubleshooting + +### Content not filling sheet + +Make sure the root View uses `flex: 1`: + +```tsx +{/* content */} +``` + +### Sheet background showing through + +Set `contentStyle: { backgroundColor: 'transparent' }` in options and style your content container with the desired background color instead. diff --git a/building-native-ui/references/gradients.md b/building-native-ui/references/gradients.md new file mode 100644 index 0000000..329600d --- /dev/null +++ b/building-native-ui/references/gradients.md @@ -0,0 +1,106 @@ +# CSS Gradients + +> **New Architecture Only**: CSS gradients require React Native's New Architecture (Fabric). They are not available in the old architecture or Expo Go. + +Use CSS gradients with the `experimental_backgroundImage` style property. 
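Because these gradients are plain CSS strings, typos fail silently at runtime. A small helper can assemble them consistently; the `linearGradient` function below is hypothetical, not part of Expo or React Native:

```typescript
// Build a CSS linear-gradient string from typed stops.
// Direction can be a keyword ("to bottom") or degrees ("135deg").
type Stop = { color: string; at?: string }; // "at" is a percentage like "50%"

function linearGradient(direction: string, stops: Stop[]): string {
  const parts = stops.map((s) => (s.at ? `${s.color} ${s.at}` : s.color));
  return `linear-gradient(${direction}, ${parts.join(", ")})`;
}

const bg = linearGradient("to bottom", [
  { color: "transparent" },
  { color: "rgba(0, 0, 0, 0.6)", at: "100%" },
]);
console.log(bg); // linear-gradient(to bottom, transparent, rgba(0, 0, 0, 0.6) 100%)

// Usage sketch (New Architecture only):
// <View style={{ experimental_backgroundImage: bg }} />
```

Centralizing the strings this way also makes it easy to reuse a palette across overlays and buttons instead of scattering hand-written gradient literals.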
+ +## Linear Gradients + +```tsx +// Top to bottom + + +// Left to right + + +// Diagonal + + +// Using degrees + +``` + +## Radial Gradients + +```tsx +// Circle at center + + +// Ellipse + + +// Positioned + +``` + +## Multiple Gradients + +Stack multiple gradients by comma-separating them: + +```tsx + +``` + +## Common Patterns + +### Overlay on Image + +```tsx + + + + +``` + +### Frosted Glass Effect + +```tsx + +``` + +### Button Gradient + +```tsx + + Submit + +``` + +## Important Notes + +- Do NOT use `expo-linear-gradient` — use CSS gradients instead +- Gradients are strings, not objects +- Use `rgba()` for transparency, or `transparent` keyword +- Color stops use percentages (0%, 50%, 100%) +- Direction keywords: `to top`, `to bottom`, `to left`, `to right`, `to top left`, etc. +- Degree values: `45deg`, `90deg`, `135deg`, etc. diff --git a/building-native-ui/references/icons.md b/building-native-ui/references/icons.md new file mode 100644 index 0000000..eebf674 --- /dev/null +++ b/building-native-ui/references/icons.md @@ -0,0 +1,213 @@ +# Icons (SF Symbols) + +Use SF Symbols for native feel. Never use FontAwesome or Ionicons. 
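One way to keep symbol usage consistent across an app is a typed map from app-level intents to SF Symbol names, switching to the `.fill` variant for active states. The symbol names come from the reference lists below; the helper itself is a sketch, not an expo-symbols API:

```typescript
// Map semantic app actions to SF Symbol base names (the helper is hypothetical).
const symbols = {
  home: "house",
  like: "heart",
  bookmark: "bookmark",
  profile: "person",
} as const;

type SymbolKey = keyof typeof symbols;

// ".fill" variants conventionally mark selected/active states
function symbolName(key: SymbolKey, active = false): string {
  const base = symbols[key];
  return active ? `${base}.fill` : base;
}

console.log(symbolName("like"));       // heart
console.log(symbolName("like", true)); // heart.fill

// Usage sketch:
// <SymbolView name={symbolName("home", isActive)} size={24} />
```

Keeping the mapping in one place makes it trivial to audit that every icon has a sensible filled counterpart before shipping.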
+ +## Basic Usage + +```tsx +import { SymbolView } from "expo-symbols"; +import { PlatformColor } from "react-native"; + +; +``` + +## Props + +```tsx + +``` + +## Common Icons + +### Navigation & Actions +- `house.fill` - home +- `gear` - settings +- `magnifyingglass` - search +- `plus` - add +- `xmark` - close +- `chevron.left` - back +- `chevron.right` - forward +- `arrow.left` - back arrow +- `arrow.right` - forward arrow + +### Media +- `play.fill` - play +- `pause.fill` - pause +- `stop.fill` - stop +- `backward.fill` - rewind +- `forward.fill` - fast forward +- `speaker.wave.2.fill` - volume +- `speaker.slash.fill` - mute + +### Camera +- `camera` - camera +- `camera.fill` - camera filled +- `arrow.triangle.2.circlepath` - flip camera +- `photo` - gallery/photos +- `bolt` - flash +- `bolt.slash` - flash off + +### Communication +- `message` - message +- `message.fill` - message filled +- `envelope` - email +- `envelope.fill` - email filled +- `phone` - phone +- `phone.fill` - phone filled +- `video` - video call +- `video.fill` - video call filled + +### Social +- `heart` - like +- `heart.fill` - liked +- `star` - favorite +- `star.fill` - favorited +- `hand.thumbsup` - thumbs up +- `hand.thumbsdown` - thumbs down +- `person` - profile +- `person.fill` - profile filled +- `person.2` - people +- `person.2.fill` - people filled + +### Content Actions +- `square.and.arrow.up` - share +- `square.and.arrow.down` - download +- `doc.on.doc` - copy +- `trash` - delete +- `pencil` - edit +- `folder` - folder +- `folder.fill` - folder filled +- `bookmark` - bookmark +- `bookmark.fill` - bookmarked + +### Status & Feedback +- `checkmark` - success/done +- `checkmark.circle.fill` - completed +- `xmark.circle.fill` - error/failed +- `exclamationmark.triangle` - warning +- `info.circle` - info +- `questionmark.circle` - help +- `bell` - notification +- `bell.fill` - notification filled + +### Misc +- `ellipsis` - more options +- `ellipsis.circle` - more in circle +- 
`line.3.horizontal` - menu/hamburger +- `slider.horizontal.3` - filters +- `arrow.clockwise` - refresh +- `location` - location +- `location.fill` - location filled +- `map` - map +- `mappin` - pin +- `clock` - time +- `calendar` - calendar +- `link` - link +- `nosign` - block/prohibited + +## Animated Symbols + +```tsx + +``` + +### Animation Effects + +- `bounce` - Bouncy animation +- `pulse` - Pulsing effect +- `variableColor` - Color cycling +- `scale` - Scale animation + +```tsx +// Bounce with direction +animationSpec={{ + effect: { type: "bounce", direction: "up" } // up | down +}} + +// Pulse +animationSpec={{ + effect: { type: "pulse" } +}} + +// Variable color (multicolor symbols) +animationSpec={{ + effect: { + type: "variableColor", + cumulative: true, + reversing: true + } +}} +``` + +## Symbol Weights + +```tsx +// Lighter weights + + + + +// Default + + +// Heavier weights + + + + + +``` + +## Symbol Scales + +```tsx + + // default + +``` + +## Multicolor Symbols + +Some symbols support multiple colors: + +```tsx + +``` + +## Finding Symbol Names + +1. Use the SF Symbols app on macOS (free from Apple) +2. Search at https://developer.apple.com/sf-symbols/ +3. 
Symbol names use dot notation: `square.and.arrow.up` + +## Best Practices + +- Always use SF Symbols over vector icon libraries +- Match symbol weight to nearby text weight +- Use `.fill` variants for selected/active states +- Use PlatformColor for tint to support dark mode +- Keep icons at consistent sizes (16, 20, 24, 32) diff --git a/building-native-ui/references/media.md b/building-native-ui/references/media.md new file mode 100644 index 0000000..50c0ffb --- /dev/null +++ b/building-native-ui/references/media.md @@ -0,0 +1,198 @@ +# Media + +## Camera + +- Hide navigation headers when there's a full screen camera +- Ensure to flip the camera with `mirror` to emulate social apps +- Use liquid glass buttons on cameras +- Icons: `arrow.triangle.2.circlepath` (flip), `photo` (gallery), `bolt` (flash) +- Eagerly request camera permission +- Lazily request media library permission + +```tsx +import React, { useRef, useState } from "react"; +import { View, TouchableOpacity, Text, Alert } from "react-native"; +import { CameraView, CameraType, useCameraPermissions } from "expo-camera"; +import * as MediaLibrary from "expo-media-library"; +import * as ImagePicker from "expo-image-picker"; +import * as Haptics from "expo-haptics"; +import { SymbolView } from "expo-symbols"; +import { PlatformColor } from "react-native"; +import { GlassView } from "expo-glass-effect"; +import { useSafeAreaInsets } from "react-native-safe-area-context"; + +function Camera({ onPicture }: { onPicture: (uri: string) => Promise }) { + const [permission, requestPermission] = useCameraPermissions(); + const cameraRef = useRef(null); + const [type, setType] = useState("back"); + const { bottom } = useSafeAreaInsets(); + + if (!permission?.granted) { + return ( + + Camera access is required + + + Grant Permission + + + + ); + } + + const takePhoto = async () => { + await Haptics.selectionAsync(); + if (!cameraRef.current) return; + const photo = await cameraRef.current.takePictureAsync({ quality: 
0.8 }); + await onPicture(photo.uri); + }; + + const selectPhoto = async () => { + await Haptics.selectionAsync(); + const result = await ImagePicker.launchImageLibraryAsync({ + mediaTypes: "images", + allowsEditing: false, + quality: 0.8, + }); + if (!result.canceled && result.assets?.[0]) { + await onPicture(result.assets[0].uri); + } + }; + + return ( + + + + + + + + + setType(t => t === "back" ? "front" : "back")} icon="arrow.triangle.2.circlepath" /> + + + + ); +} +``` + +## Audio Playback + +Use `expo-audio` not `expo-av`: + +```tsx +import { useAudioPlayer } from 'expo-audio'; + +const player = useAudioPlayer({ uri: 'https://stream.nightride.fm/rektory.mp3' }); + + + + + ); +} +``` + +## Human-in-the-Loop (Client Tools) + +Server defines tool, client executes: + +```ts +// Server +export class ChatAgent extends AIChatAgent { + async onChatMessage(onFinish) { + return this.streamText({ + model: openai("gpt-4"), + messages: this.messages, + tools: { + confirmAction: tool({ + description: "Ask user to confirm", + parameters: z.object({ action: z.string() }), + execute: "client", // Client-side execution + }) + }, + onFinish, + }); + } +} + +// Client +const { messages } = useAgentChat({ + agent, + onToolCall: async (toolCall) => { + if (toolCall.toolName === "confirmAction") { + return { confirmed: window.confirm(`Confirm: ${toolCall.args.action}?`) }; + } + } +}); +``` + +## Task Queue & Scheduled Processing + +```ts +export class TaskAgent extends Agent { + onStart() { + this.schedule("*/5 * * * *", "processQueue", {}); // Every 5 min + this.schedule("0 0 * * *", "dailyCleanup", {}); // Daily + } + + async onRequest(req: Request) { + await this.queue("processVideo", { videoId: (await req.json()).videoId }); + return Response.json({ queued: true }); + } + + async processQueue() { + const tasks = await this.dequeue(10); + for (const task of tasks) { + if (task.name === "processVideo") await this.processVideo(task.data.videoId); + } + } + + async dailyCleanup() 
{ + this.sql`DELETE FROM logs WHERE created_at < ${Date.now() - 86400000}`; + } +} +``` + +## Manual WebSocket Chat + +Custom protocols (non-AI): + +```ts +export class ChatAgent extends Agent { + async onConnect(conn: Connection, ctx: ConnectionContext) { + conn.accept(); + conn.setState({userId: ctx.request.headers.get("X-User-ID") || "anon"}); + conn.send(JSON.stringify({type: "history", messages: this.state.messages})); + } + + async onMessage(conn: Connection, msg: WSMessage) { + const newMsg = {userId: conn.state.userId, text: JSON.parse(msg as string).text, timestamp: Date.now()}; + this.setState({messages: [...this.state.messages, newMsg]}); + this.connections.forEach(c => c.send(JSON.stringify(newMsg))); + } +} +``` + +## Email Processing w/AI + +```ts +export class EmailAgent extends Agent { + async onEmail(email: AgentEmail) { + const [text, from, subject] = [await email.text(), email.from, email.headers.get("subject") || ""]; + this.sql`INSERT INTO emails (from_addr, subject, body) VALUES (${from}, ${subject}, ${text})`; + + const { text: summary } = await generateText({ + model: openai("gpt-4o-mini"), prompt: `Summarize: ${subject}\n\n${text}` + }); + + this.connections.forEach(c => c.send(JSON.stringify({type: "new_email", from, summary}))); + if (summary.includes("urgent")) await this.schedule(0, "sendAutoReply", { to: from }); + } +} +``` + +## Real-time Collaboration + +```ts +export class GameAgent extends Agent { + initialState = { players: [], gameStarted: false }; + + async onConnect(conn: Connection, ctx: ConnectionContext) { + conn.accept(); + const playerId = ctx.request.headers.get("X-Player-ID") || crypto.randomUUID(); + conn.setState({ playerId }); + + const newPlayer = { id: playerId, score: 0 }; + this.setState({...this.state, players: [...this.state.players, newPlayer]}); + this.connections.forEach(c => c.send(JSON.stringify({type: "player_joined", player: newPlayer}))); + } + + async onMessage(conn: Connection, msg: WSMessage) { + 
const m = JSON.parse(msg as string); + + if (m.type === "move") { + this.setState({ + ...this.state, + players: this.state.players.map(p => p.id === conn.state.playerId ? {...p, score: p.score + m.points} : p) + }); + this.connections.forEach(c => c.send(JSON.stringify({type: "player_moved", playerId: conn.state.playerId}))); + } + + if (m.type === "start" && this.state.players.length >= 2) { + this.setState({...this.state, gameStarted: true}); + this.connections.forEach(c => c.send(JSON.stringify({type: "game_started"}))); + } + } +} +``` diff --git a/cloudflare/references/ai-gateway/README.md b/cloudflare/references/ai-gateway/README.md new file mode 100644 index 0000000..75a12ab --- /dev/null +++ b/cloudflare/references/ai-gateway/README.md @@ -0,0 +1,175 @@ +# Cloudflare AI Gateway + +Expert guidance for implementing Cloudflare AI Gateway - a universal gateway for AI model providers with analytics, caching, rate limiting, and routing capabilities. + +## When to Use This Reference + +- Setting up AI Gateway for any AI provider (OpenAI, Anthropic, Workers AI, etc.) 
- Implementing caching, rate limiting, or request retry/fallback
- Configuring dynamic routing with A/B testing or model fallbacks
- Managing provider API keys securely with BYOK
- Adding security features (guardrails, DLP)
- Setting up observability with logging and custom metadata
- Debugging AI Gateway requests or optimizing configurations

## Quick Start

**What's your setup?**

- **Using Vercel AI SDK** → Pattern 1 (recommended) - see [sdk-integration.md](./sdk-integration.md)
- **Using OpenAI SDK** → Pattern 2 - see [sdk-integration.md](./sdk-integration.md)
- **Cloudflare Worker + Workers AI** → Pattern 3 - see [sdk-integration.md](./sdk-integration.md)
- **Direct HTTP (any language)** → Pattern 4 - see [configuration.md](./configuration.md)
- **Framework (LangChain, etc.)** → See [sdk-integration.md](./sdk-integration.md)

## Pattern 1: Vercel AI SDK (Recommended)

The most modern pattern, using the official `ai-gateway-provider` package with automatic fallbacks.

```typescript
import { createAiGateway } from 'ai-gateway-provider';
import { createOpenAI } from '@ai-sdk/openai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

const gateway = createAiGateway({
  accountId: process.env.CF_ACCOUNT_ID,
  gateway: process.env.CF_GATEWAY_ID,
});

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const anthropic = createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Single model
const { text } = await generateText({
  model: gateway(openai('gpt-4o')),
  prompt: 'Hello'
});

// Automatic fallback array (renamed to avoid redeclaring `text`)
const { text: fallbackText } = await generateText({
  model: gateway([
    openai('gpt-4o'),               // Try first
    anthropic('claude-sonnet-4-5'), // Fallback
  ]),
  prompt: 'Hello'
});
```

**Install:** `npm install ai-gateway-provider ai @ai-sdk/openai @ai-sdk/anthropic`

## Pattern 2: OpenAI SDK

A drop-in replacement for the OpenAI API with multi-provider support.
+ +```typescript +import OpenAI from 'openai'; + +const client = new OpenAI({ + apiKey: process.env.OPENAI_API_KEY, + baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat`, + defaultHeaders: { + 'cf-aig-authorization': `Bearer ${cfToken}` // For authenticated gateways + } +}); + +// Switch providers by changing model format: {provider}/{model} +const response = await client.chat.completions.create({ + model: 'openai/gpt-4o', // or 'anthropic/claude-sonnet-4-5' + messages: [{ role: 'user', content: 'Hello!' }] +}); +``` + +## Pattern 3: Workers AI Binding + +For Cloudflare Workers using Workers AI. + +```typescript +export default { + async fetch(request, env, ctx) { + const response = await env.AI.run( + '@cf/meta/llama-3-8b-instruct', + { messages: [{ role: 'user', content: 'Hello!' }] }, + { + gateway: { + id: 'my-gateway', + metadata: { userId: '123', team: 'engineering' } + } + } + ); + + return Response.json(response); + } +}; +``` + +## Headers Quick Reference + +| Header | Purpose | Example | Notes | +|--------|---------|---------|-------| +| `cf-aig-authorization` | Gateway auth | `Bearer {token}` | Required for authenticated gateways | +| `cf-aig-metadata` | Tracking | `{"userId":"x"}` | Max 5 entries, flat structure | +| `cf-aig-cache-ttl` | Cache duration | `3600` | Seconds, min 60, max 2592000 (30 days) | +| `cf-aig-skip-cache` | Bypass cache | `true` | - | +| `cf-aig-cache-key` | Custom cache key | `my-key` | Must be unique per response | +| `cf-aig-collect-log` | Skip logging | `false` | Default: true | +| `cf-aig-cache-status` | Cache hit/miss | Response only | `HIT` or `MISS` | + +## In This Reference + +| File | Purpose | +|------|---------| +| [sdk-integration.md](./sdk-integration.md) | Vercel AI SDK, OpenAI SDK, Workers binding patterns | +| [configuration.md](./configuration.md) | Dashboard setup, wrangler, API tokens | +| [features.md](./features.md) | Caching, rate limits, guardrails, DLP, BYOK, unified billing | +| 
[dynamic-routing.md](./dynamic-routing.md) | Fallbacks, A/B testing, conditional routing | +| [troubleshooting.md](./troubleshooting.md) | Debugging, errors, observability, gotchas | + +## Reading Order + +| Task | Files | +|------|-------| +| First-time setup | README + [configuration.md](./configuration.md) | +| SDK integration | README + [sdk-integration.md](./sdk-integration.md) | +| Enable caching | README + [features.md](./features.md) | +| Setup fallbacks | README + [dynamic-routing.md](./dynamic-routing.md) | +| Debug errors | README + [troubleshooting.md](./troubleshooting.md) | + +## Architecture + +AI Gateway acts as a proxy between your application and AI providers: + +``` +Your App → AI Gateway → AI Provider (OpenAI, Anthropic, etc.) + ↓ + Analytics, Caching, Rate Limiting, Logging +``` + +**Key URL patterns:** +- Unified API (OpenAI-compatible): `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions` +- Provider-specific: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}/{endpoint}` +- Dynamic routes: Use route name instead of model: `dynamic/{route-name}` + +## Gateway Types + +1. **Unauthenticated Gateway**: Open access (not recommended for production) +2. **Authenticated Gateway**: Requires `cf-aig-authorization` header with Cloudflare API token (recommended) + +## Provider Authentication Options + +1. **Unified Billing**: Use AI Gateway billing to pay for inference (keyless mode - no provider API key needed) +2. **BYOK (Store Keys)**: Store provider API keys in Cloudflare dashboard +3. 
**Request Headers**: Include provider API key in each request + +## Related Skills + +- [Workers AI](../workers-ai/README.md) - For `env.AI.run()` details +- [Agents SDK](../agents-sdk/README.md) - For stateful AI patterns +- [Vectorize](../vectorize/README.md) - For RAG patterns with embeddings + +## Resources + +- [Official Docs](https://developers.cloudflare.com/ai-gateway/) +- [API Reference](https://developers.cloudflare.com/api/resources/ai_gateway/) +- [Provider Guides](https://developers.cloudflare.com/ai-gateway/usage/providers/) +- [Discord Community](https://discord.cloudflare.com) diff --git a/cloudflare/references/ai-gateway/configuration.md b/cloudflare/references/ai-gateway/configuration.md new file mode 100644 index 0000000..78b5615 --- /dev/null +++ b/cloudflare/references/ai-gateway/configuration.md @@ -0,0 +1,111 @@ +# Configuration & Setup + +## Creating a Gateway + +### Dashboard +AI > AI Gateway > Create Gateway > Configure (auth, caching, rate limiting, logging) + +### API +```bash +curl -X POST https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways \ + -H "Authorization: Bearer $CF_API_TOKEN" -H "Content-Type: application/json" \ + -d '{"id":"my-gateway","cache_ttl":3600,"rate_limiting_interval":60,"rate_limiting_limit":100,"collect_logs":true}' +``` + +**Naming:** lowercase alphanumeric + hyphens (e.g., `prod-api`, `dev-chat`) + +## Wrangler Integration + +```toml +[ai] +binding = "AI" + +[[ai.gateway]] +id = "my-gateway" +``` + +```bash +wrangler secret put CF_API_TOKEN +wrangler secret put OPENAI_API_KEY # If not using BYOK +``` + +## Authentication + +### Gateway Auth (protects gateway access) +```typescript +const client = new OpenAI({ + baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`, + defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` } +}); +``` + +### Provider Auth Options + +**1. 
Unified Billing (keyless)** - pay through Cloudflare, no provider key: +```typescript +const client = new OpenAI({ + baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`, + defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` } +}); +``` +Supports: OpenAI, Anthropic, Google AI Studio + +**2. BYOK** - store keys in dashboard (Provider Keys > Add), no key in code + +**3. Request Headers** - pass provider key per request: +```typescript +const client = new OpenAI({ + apiKey: process.env.OPENAI_API_KEY, + baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`, + defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` } +}); +``` + +## API Token Permissions + +- **Gateway management:** AI Gateway - Read + Edit +- **Gateway access:** AI Gateway - Read (minimum) + +## Gateway Management API + +```bash +# List +curl https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways \ + -H "Authorization: Bearer $CF_API_TOKEN" + +# Get +curl .../gateways/{gateway_id} + +# Update +curl -X PUT .../gateways/{gateway_id} \ + -d '{"cache_ttl":7200,"rate_limiting_limit":200}' + +# Delete +curl -X DELETE .../gateways/{gateway_id} +``` + +## Getting IDs + +- **Account ID:** Dashboard > Overview > Copy +- **Gateway ID:** AI Gateway > Gateway name column + +## Python Example + +```python +from openai import OpenAI +import os + +client = OpenAI( + api_key=os.environ.get("OPENAI_API_KEY"), + base_url=f"https://gateway.ai.cloudflare.com/v1/{os.environ['CF_ACCOUNT_ID']}/{os.environ['GATEWAY_ID']}/openai", + default_headers={"cf-aig-authorization": f"Bearer {os.environ['CF_API_TOKEN']}"} +) +``` + +## Best Practices + +1. **Always authenticate gateways in production** +2. **Use BYOK or unified billing** - secrets out of code +3. **Environment-specific gateways** - separate dev/staging/prod +4. **Set rate limits** - prevent runaway costs +5. 
**Enable logging** - track usage, debug issues diff --git a/cloudflare/references/ai-gateway/dynamic-routing.md b/cloudflare/references/ai-gateway/dynamic-routing.md new file mode 100644 index 0000000..540be5c --- /dev/null +++ b/cloudflare/references/ai-gateway/dynamic-routing.md @@ -0,0 +1,82 @@ +# Dynamic Routing + +Configure complex routing in dashboard without code changes. Use route names instead of model names. + +## Usage + +```typescript +const response = await client.chat.completions.create({ + model: 'dynamic/smart-chat', // Route name from dashboard + messages: [{ role: 'user', content: 'Hello!' }] +}); +``` + +## Node Types + +| Node | Purpose | Use Case | +|------|---------|----------| +| **Conditional** | Branch on metadata | Paid vs free users, geo routing | +| **Percentage** | A/B split traffic | Model testing, gradual rollouts | +| **Rate Limit** | Enforce quotas | Per-user/team limits | +| **Budget Limit** | Cost quotas | Per-user spending caps | +| **Model** | Call provider | Final destination | + +## Metadata + +Pass via header (max 5 entries, flat only): +```typescript +headers: { + 'cf-aig-metadata': JSON.stringify({ + userId: 'user-123', + tier: 'pro', + region: 'us-east' + }) +} +``` + +## Common Patterns + +**Multi-model fallback:** +``` +Start → GPT-4 → On error: Claude → On error: Llama +``` + +**Tiered access:** +``` +Conditional: tier == 'enterprise' → GPT-4 (no limit) +Conditional: tier == 'pro' → Rate Limit 1000/hr → GPT-4o +Conditional: tier == 'free' → Rate Limit 10/hr → GPT-4o-mini +``` + +**Gradual rollout:** +``` +Percentage: 10% → New model, 90% → Old model +``` + +**Cost-based fallback:** +``` +Budget Limit: $100/day per teamId + < 80%: GPT-4 + >= 80%: GPT-4o-mini + >= 100%: Error +``` + +## Version Management + +- Save changes as new version +- Test with `model: 'dynamic/route@v2'` +- Roll back by deploying previous version + +## Monitoring + +Dashboard → Gateway → Dynamic Routes: +- Request count per path +- Success/error 
rates +- Latency/cost by path + +## Limitations + +- Max 5 metadata entries +- Values: string/number/boolean/null only +- No nested objects +- Route names: alphanumeric + hyphens diff --git a/cloudflare/references/ai-gateway/features.md b/cloudflare/references/ai-gateway/features.md new file mode 100644 index 0000000..4a0d384 --- /dev/null +++ b/cloudflare/references/ai-gateway/features.md @@ -0,0 +1,96 @@ +# Features & Capabilities + +## Caching + +Dashboard: Settings → Cache Responses → Enable + +```typescript +// Custom TTL (1 hour) +headers: { 'cf-aig-cache-ttl': '3600' } + +// Skip cache +headers: { 'cf-aig-skip-cache': 'true' } + +// Custom cache key +headers: { 'cf-aig-cache-key': 'greeting-en' } +``` + +**Limits:** TTL 60s - 30 days. **Does NOT work with streaming.** + +## Rate Limiting + +Dashboard: Settings → Rate-limiting → Enable + +- **Fixed window:** Resets at intervals +- **Sliding window:** Rolling window (more accurate) +- Returns `429` when exceeded + +## Guardrails + +Dashboard: Settings → Guardrails → Enable + +Filter prompts/responses for inappropriate content. Actions: Flag (log) or Block (reject). + +## Data Loss Prevention (DLP) + +Dashboard: Settings → DLP → Enable + +Detect PII (emails, SSNs, credit cards). Actions: Flag, Block, or Redact. + +## Billing Modes + +| Mode | Description | Setup | +|------|-------------|-------| +| **Unified Billing** | Pay through Cloudflare, no provider keys | Use `cf-aig-authorization` header only | +| **BYOK** | Store provider keys in dashboard | Add keys in Provider Keys section | +| **Pass-through** | Send provider key with each request | Include provider's auth header | + +## Zero Data Retention + +Dashboard: Settings → Privacy → Zero Data Retention + +No prompts/responses stored. Request counts and costs still tracked. + +## Logging + +Dashboard: Settings → Logs → Enable (up to 10M logs) + +Each entry: prompt, response, provider, model, tokens, cost, duration, cache status, metadata. 
+ +```typescript +// Skip logging for request +headers: { 'cf-aig-collect-log': 'false' } +``` + +**Export:** Use Logpush to S3, GCS, Datadog, Splunk, etc. + +## Custom Cost Tracking + +For models not in Cloudflare's pricing database: + +Dashboard: Gateway → Settings → Custom Costs + +Or via API: set `model`, `input_cost`, `output_cost`. + +## Supported Providers (22+) + +| Provider | Unified API | Notes | +|----------|-------------|-------| +| OpenAI | `openai/gpt-4o` | Full support | +| Anthropic | `anthropic/claude-sonnet-4-5` | Full support | +| Google AI | `google-ai-studio/gemini-2.0-flash` | Full support | +| Workers AI | `workersai/@cf/meta/llama-3` | Native | +| Azure OpenAI | `azure-openai/*` | Deployment names | +| AWS Bedrock | Provider endpoint only | `/bedrock/*` | +| Groq | `groq/*` | Fast inference | +| Mistral, Cohere, Perplexity, xAI, DeepSeek, Cerebras | Full support | - | + +## Best Practices + +1. Enable caching for deterministic prompts +2. Set rate limits to prevent abuse +3. Use guardrails for user-facing AI +4. Enable DLP for sensitive data +5. Use unified billing or BYOK for simpler key management +6. Enable logging for debugging +7. 
Use zero data retention when privacy is required

diff --git a/cloudflare/references/ai-gateway/sdk-integration.md b/cloudflare/references/ai-gateway/sdk-integration.md
new file mode 100644
index 0000000..7ca4930
--- /dev/null
+++ b/cloudflare/references/ai-gateway/sdk-integration.md
@@ -0,0 +1,114 @@
# AI Gateway SDK Integration

## Vercel AI SDK (Recommended)

```typescript
import { createAiGateway } from 'ai-gateway-provider';
import { createOpenAI } from '@ai-sdk/openai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

const gateway = createAiGateway({
  accountId: process.env.CF_ACCOUNT_ID,
  gateway: process.env.CF_GATEWAY_ID,
  apiKey: process.env.CF_API_TOKEN // Optional for authenticated gateways
});

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const anthropic = createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Single model
const { text } = await generateText({
  model: gateway(openai('gpt-4o')),
  prompt: 'Hello'
});

// Automatic fallback array (renamed to avoid redeclaring `text`)
const { text: fallbackText } = await generateText({
  model: gateway([
    openai('gpt-4o'),
    anthropic('claude-sonnet-4-5'),
    openai('gpt-4o-mini')
  ]),
  prompt: 'Complex task'
});
```

### Options

```typescript
model: gateway(openai('gpt-4o'), {
  cacheKey: 'my-key',
  cacheTtl: 3600,
  metadata: { userId: 'u123', team: 'eng' }, // Max 5 entries
  retries: { maxAttempts: 3, backoff: 'exponential' }
})
```

## OpenAI SDK

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});

// Unified API - switch providers via model name
model: 'openai/gpt-4o' // or 'anthropic/claude-sonnet-4-5'
```

## Anthropic SDK

```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/anthropic`,
  defaultHeaders: { 'cf-aig-authorization': `Bearer ${cfToken}` }
});
```

## Workers AI
Binding + +```toml +# wrangler.toml +[ai] +binding = "AI" +[[ai.gateway]] +id = "my-gateway" +``` + +```typescript +await env.AI.run('@cf/meta/llama-3-8b-instruct', + { messages: [...] }, + { gateway: { id: 'my-gateway', metadata: { userId: '123' } } } +); +``` + +## LangChain / LlamaIndex + +```typescript +// Use OpenAI SDK pattern with custom baseURL +new ChatOpenAI({ + configuration: { + baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai` + } +}); +``` + +## HTTP / cURL + +```bash +curl https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/openai/chat/completions \ + -H "Authorization: Bearer $OPENAI_KEY" \ + -H "cf-aig-authorization: Bearer $CF_TOKEN" \ + -H "cf-aig-metadata: {\"userId\":\"123\"}" \ + -d '{"model":"gpt-4o","messages":[...]}' +``` + +## Headers Reference + +| Header | Purpose | +|--------|---------| +| `cf-aig-authorization` | Gateway auth token | +| `cf-aig-metadata` | JSON object (max 5 keys) | +| `cf-aig-cache-ttl` | Cache TTL in seconds | +| `cf-aig-skip-cache` | `true` to bypass cache | diff --git a/cloudflare/references/ai-gateway/troubleshooting.md b/cloudflare/references/ai-gateway/troubleshooting.md new file mode 100644 index 0000000..4d66357 --- /dev/null +++ b/cloudflare/references/ai-gateway/troubleshooting.md @@ -0,0 +1,88 @@ +# AI Gateway Troubleshooting + +## Common Errors + +| Error | Cause | Fix | +|-------|-------|-----| +| 401 | Missing `cf-aig-authorization` header | Add header with CF API token | +| 403 | Invalid provider key / BYOK expired | Check provider key in dashboard | +| 429 | Rate limit exceeded | Increase limit or implement backoff | + +### 401 Fix + +```typescript +const client = new OpenAI({ + baseURL: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`, + defaultHeaders: { 'cf-aig-authorization': `Bearer ${CF_API_TOKEN}` } +}); +``` + +### 429 Retry Pattern + +```typescript +async function requestWithRetry(fn, maxRetries = 3) { + for (let i = 0; i < maxRetries; 
i++) { + try { return await fn(); } + catch (e) { + if (e.status === 429 && i < maxRetries - 1) { + await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000)); + continue; + } + throw e; + } + } +} +``` + +## Gotchas + +| Issue | Reality | +|-------|---------| +| Metadata limits | Max 5 entries, flat only (no nesting) | +| Cache key collision | Use unique keys per expected response | +| BYOK + Unified Billing | Mutually exclusive | +| Rate limit scope | Per-gateway, not per-user (use dynamic routing for per-user) | +| Log delay | 30-60 seconds normal | +| Streaming + caching | **Incompatible** | +| Model name (unified API) | Prefix required: `openai/gpt-4o`, not `gpt-4o` | + +## Cache Not Working + +**Causes:** +- Different request params (temperature, etc.) +- Streaming enabled +- Caching disabled in settings + +**Check:** `response.headers.get('cf-aig-cache-status')` → HIT or MISS + +## Logs Not Appearing + +1. Check logging enabled: Dashboard → Gateway → Settings +2. Remove `cf-aig-collect-log: false` header +3. Wait 30-60 seconds +4. 
Check log limit (10M default) + +## Debugging + +```bash +# Test connectivity +curl -v https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/openai/models \ + -H "Authorization: Bearer $OPENAI_KEY" \ + -H "cf-aig-authorization: Bearer $CF_TOKEN" +``` + +```typescript +// Check response headers +console.log('Cache:', response.headers.get('cf-aig-cache-status')); +console.log('Request ID:', response.headers.get('cf-ray')); +``` + +## Analytics + +Dashboard → AI Gateway → Select gateway + +**Metrics:** Requests, tokens, latency (p50/p95/p99), cache hit rate, costs + +**Log filters:** `status: error`, `provider: openai`, `cost > 0.01`, `duration > 1000` + +**Export:** Logpush to S3/GCS/Datadog/Splunk diff --git a/cloudflare/references/ai-search/README.md b/cloudflare/references/ai-search/README.md new file mode 100644 index 0000000..52d766d --- /dev/null +++ b/cloudflare/references/ai-search/README.md @@ -0,0 +1,138 @@ +# Cloudflare AI Search Reference + +Expert guidance for implementing Cloudflare AI Search (formerly AutoRAG), Cloudflare's managed semantic search and RAG service. 
+ +## Overview + +**AI Search** is a managed RAG (Retrieval-Augmented Generation) pipeline that combines: +- Automatic semantic indexing of your content +- Vector similarity search +- Built-in LLM generation + +**Key value propositions:** +- **Zero vector management** - No manual embedding, indexing, or storage +- **Auto-indexing** - Content automatically re-indexed every 6 hours +- **Built-in generation** - Optional AI response generation from retrieved context +- **Multi-source** - Index from R2 buckets or website crawls + +**Data source options:** +- **R2 bucket** - Index files from Cloudflare R2 (supports MD, TXT, HTML, PDF, DOC, CSV, JSON) +- **Website** - Crawl and index website content (requires Cloudflare-hosted domain) + +**Indexing lifecycle:** +- Automatic 6-hour refresh cycle +- Manual "Force Sync" available (30s rate limit) +- Not designed for real-time updates + +## Quick Start + +**1. Create AI Search instance in dashboard:** +- Go to Cloudflare Dashboard → AI Search → Create +- Choose data source (R2 or website) +- Configure instance name and settings + +**2. Configure Worker:** + +```jsonc +// wrangler.jsonc +{ + "ai": { + "binding": "AI" + } +} +``` + +**3. 
Use in Worker:** + +```typescript +export default { + async fetch(request, env) { + const answer = await env.AI.autorag("my-search-instance").aiSearch({ + query: "How do I configure caching?", + model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast" + }); + + return Response.json({ answer: answer.response }); + } +}; +``` + +## When to Use AI Search + +### AI Search vs Vectorize + +| Factor | AI Search | Vectorize | +|--------|-----------|-----------| +| **Management** | Fully managed | Manual embedding + indexing | +| **Use when** | Want zero-ops RAG pipeline | Need custom embeddings/control | +| **Indexing** | Automatic (6hr cycle) | Manual via API | +| **Generation** | Built-in optional | Bring your own LLM | +| **Data sources** | R2 or website | Manual insert | +| **Best for** | Docs, support, enterprise search | Custom ML pipelines, real-time | + +### AI Search vs Direct Workers AI + +| Factor | AI Search | Workers AI (direct) | +|--------|-----------|---------------------| +| **Context** | Automatic retrieval | Manual context building | +| **Use when** | Need RAG (search + generate) | Simple generation tasks | +| **Indexing** | Built-in | Not applicable | +| **Best for** | Knowledge bases, docs | Simple chat, transformations | + +### search() vs aiSearch() + +| Method | Returns | Use When | +|--------|---------|----------| +| `search()` | Search results only | Building custom UI, need raw chunks | +| `aiSearch()` | AI response + results | Need ready-to-use answer (chatbot, Q&A) | + +### Real-time Updates Consideration + +**AI Search is NOT ideal if:** +- Need real-time content updates (<6 hours) +- Content changes multiple times per hour +- Strict freshness requirements + +**AI Search IS ideal if:** +- Content relatively stable (docs, policies, knowledge bases) +- 6-hour refresh acceptable +- Prefer zero-ops over real-time + +## Platform Limits + +| Limit | Value | +|-------|-------| +| Max instances per account | 10 | +| Max files per instance | 100,000 | +| Max 
file size | 4 MB | +| Index frequency | Every 6 hours | +| Force Sync rate limit | Once per 30 seconds | +| Filter nesting depth | 2 levels | +| Filters per compound | 10 | +| Score threshold range | 0.0 - 1.0 | + +## Reading Order + +Navigate these references based on your task: + +| Task | Read | Est. Time | +|------|------|-----------| +| **Understand AI Search** | README only | 5 min | +| **Implement basic search** | README → api.md | 10 min | +| **Configure data source** | README → configuration.md | 10 min | +| **Production patterns** | patterns.md | 15 min | +| **Debug issues** | gotchas.md | 10 min | +| **Full implementation** | README → api.md → patterns.md | 30 min | + +## In This Reference + +- **[api.md](api.md)** - API endpoints, methods, TypeScript interfaces +- **[configuration.md](configuration.md)** - Setup, data sources, wrangler config +- **[patterns.md](patterns.md)** - Common patterns, decision guidance, code examples +- **[gotchas.md](gotchas.md)** - Troubleshooting, code-level gotchas, limits + +## See Also + +- [Cloudflare AI Search Docs](https://developers.cloudflare.com/ai-search/) +- [Workers AI Docs](https://developers.cloudflare.com/workers-ai/) +- [Vectorize Docs](https://developers.cloudflare.com/vectorize/) diff --git a/cloudflare/references/ai-search/api.md b/cloudflare/references/ai-search/api.md new file mode 100644 index 0000000..b6220c4 --- /dev/null +++ b/cloudflare/references/ai-search/api.md @@ -0,0 +1,87 @@ +# AI Search API Reference + +## Workers Binding + +```typescript +const answer = await env.AI.autorag("instance-name").aiSearch(options); +const results = await env.AI.autorag("instance-name").search(options); +const instances = await env.AI.autorag("_").listInstances(); +``` + +## aiSearch() Options + +```typescript +interface AiSearchOptions { + query: string; // User query + model: string; // Workers AI model ID + system_prompt?: string; // LLM instructions + rewrite_query?: boolean; // Fix typos (default: false) + 
max_num_results?: number; // Max chunks (default: 10) + ranking_options?: { score_threshold?: number }; // 0.0-1.0 (default: 0.3) + reranking?: { enabled: boolean; model: string }; + stream?: boolean; // Stream response (default: false) + filters?: Filter; // Metadata filters + page?: string; // Pagination token +} +``` + +## Response + +```typescript +interface AiSearchResponse { + search_query: string; // Query used (rewritten if enabled) + response: string; // AI-generated answer + data: SearchResult[]; // Retrieved chunks + has_more: boolean; + next_page?: string; +} + +interface SearchResult { + id: string; + score: number; + content: string; + metadata: { filename: string; folder: string; timestamp: number }; +} +``` + +## Filters + +```typescript +// Comparison +{ column: "folder", operator: "gte", value: "docs/" } + +// Compound +{ operator: "and", filters: [ + { column: "folder", operator: "gte", value: "docs/" }, + { column: "timestamp", operator: "gte", value: 1704067200 } +]} +``` + +**Operators:** `eq`, `ne`, `gt`, `gte`, `lt`, `lte` + +**Built-in metadata:** `filename`, `folder`, `timestamp` (Unix seconds) + +## Streaming + +```typescript +const stream = await env.AI.autorag("docs").aiSearch({ query, model, stream: true }); +return new Response(stream, { headers: { "Content-Type": "text/event-stream" } }); +``` + +## Error Types + +| Error | Cause | +|-------|-------| +| `AutoRAGNotFoundError` | Instance doesn't exist | +| `AutoRAGUnauthorizedError` | Invalid/missing token | +| `AutoRAGValidationError` | Invalid parameters | + +## REST API + +```bash +curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/autorag/rags/{NAME}/ai-search \ + -H "Authorization: Bearer {TOKEN}" \ + -d '{"query": "...", "model": "@cf/meta/llama-3.3-70b-instruct-fp8-fast"}' +``` + +Requires Service API token with "AI Search - Read" permission. 
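Because filters are plain JSON objects, request code can compose them with small helpers instead of hand-writing nested literals. A minimal sketch using the shapes above (the `folderPrefix`, `updatedSince`, and `and` helpers are ours, not part of the API):

```typescript
// Filter shapes mirroring the aiSearch()/search() options above.
type Comparison = {
  column: string;
  operator: "eq" | "ne" | "gt" | "gte" | "lt" | "lte";
  value: string | number;
};
type Compound = { operator: "and" | "or"; filters: Comparison[] };

// "Folder starts with prefix" is expressed as gte on the folder column.
function folderPrefix(prefix: string): Comparison {
  return { column: "folder", operator: "gte", value: prefix };
}

// Built-in timestamp metadata is in Unix seconds, not milliseconds.
function updatedSince(unixSeconds: number): Comparison {
  return { column: "timestamp", operator: "gte", value: unixSeconds };
}

function and(...filters: Comparison[]): Compound {
  return { operator: "and", filters };
}

const oneWeekAgo = Math.floor(Date.now() / 1000) - 7 * 24 * 60 * 60;
const filter = and(folderPrefix("docs/"), updatedSince(oneWeekAgo));
// Pass as `filters: filter` in the aiSearch()/search() options.
```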
diff --git a/cloudflare/references/ai-search/configuration.md b/cloudflare/references/ai-search/configuration.md new file mode 100644 index 0000000..d1f34ad --- /dev/null +++ b/cloudflare/references/ai-search/configuration.md @@ -0,0 +1,88 @@ +# AI Search Configuration + +## Worker Setup + +```jsonc +// wrangler.jsonc +{ + "ai": { "binding": "AI" } +} +``` + +```typescript +interface Env { + AI: Ai; +} + +const answer = await env.AI.autorag("my-instance").aiSearch({ + query: "How do I configure caching?", + model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast" +}); +``` + +## Data Sources + +### R2 Bucket + +Dashboard: AI Search → Create Instance → Select R2 bucket + +**Supported formats:** `.md`, `.txt`, `.html`, `.pdf`, `.doc`, `.docx`, `.csv`, `.json` + +**Auto-indexed metadata:** `filename`, `folder`, `timestamp` + +### Website Crawler + +Requirements: +- Domain on Cloudflare +- `sitemap.xml` at root +- Bot protection must allow `CloudflareAISearch` user agent + +## Path Filtering (R2) + +``` +docs/**/*.md # All .md in docs/ recursively +**/*.draft.md # Exclude (use in exclude patterns) +``` + +## Indexing + +- **Automatic:** Every 6 hours +- **Force Sync:** Dashboard button (30s rate limit between syncs) +- **Pause:** Settings → Pause Indexing (existing index remains searchable) + +## Service API Token + +Dashboard: AI Search → Instance → Use AI Search → API → Create Token + +Permissions: +- **Read** - search operations +- **Edit** - instance management + +Store securely: +```bash +wrangler secret put AI_SEARCH_TOKEN +``` + +## Multi-Environment + +```toml +# wrangler.toml +[env.production.vars] +AI_SEARCH_INSTANCE = "prod-docs" + +[env.staging.vars] +AI_SEARCH_INSTANCE = "staging-docs" +``` + +```typescript +const answer = await env.AI.autorag(env.AI_SEARCH_INSTANCE).aiSearch({ query }); +``` + +## Monitoring + +```typescript +const instances = await env.AI.autorag("_").listInstances(); +console.log(instances.find(i => i.name === "docs")); +``` + +Dashboard 
shows: files indexed, status, last index time, storage usage. diff --git a/cloudflare/references/ai-search/gotchas.md b/cloudflare/references/ai-search/gotchas.md new file mode 100644 index 0000000..04c987f --- /dev/null +++ b/cloudflare/references/ai-search/gotchas.md @@ -0,0 +1,81 @@ +# AI Search Gotchas + +## Type Safety + +**Timestamp precision:** Use seconds (10-digit), not milliseconds. +```typescript +const nowInSeconds = Math.floor(Date.now() / 1000); // Correct +``` + +**Folder prefix matching:** Use `gte` for "starts with" on paths. +```typescript +filters: { column: "folder", operator: "gte", value: "docs/api/" } // Matches nested +``` + +## Filter Limitations + +| Limit | Value | +|-------|-------| +| Max nesting depth | 2 levels | +| Filters per compound | 10 | +| `or` operator | Same column, `eq` only | + +**OR restriction example:** +```typescript +// ✅ Valid: same column, eq only +{ operator: "or", filters: [ + { column: "folder", operator: "eq", value: "docs/" }, + { column: "folder", operator: "eq", value: "guides/" } +]} +``` + +## Indexing Issues + +| Problem | Cause | Solution | +|---------|-------|----------| +| File not indexed | Unsupported format or >4MB | Check format (.md/.txt/.html/.pdf/.doc/.csv/.json) | +| Index out of sync | 6-hour index cycle | Wait or use "Force Sync" (30s rate limit) | +| Empty results | Index incomplete | Check dashboard for indexing status | + +## Auth Errors + +| Error | Cause | Fix | +|-------|-------|-----| +| `AutoRAGUnauthorizedError` | Invalid/missing token | Create Service API token with AI Search permissions | +| `AutoRAGNotFoundError` | Wrong instance name | Verify exact name from dashboard | + +## Performance + +**Slow responses (>3s):** +```typescript +// Add score threshold + limit results +ranking_options: { score_threshold: 0.5 }, +max_num_results: 10 +``` + +**Empty results debug:** +1. Remove filters, test basic query +2. Lower `score_threshold` to 0.1 +3. 
Check index is populated + +## Limits + +| Resource | Limit | +|----------|-------| +| Instances per account | 10 | +| Files per instance | 100,000 | +| Max file size | 4 MB | +| Index frequency | 6 hours | + +## Anti-Patterns + +**Use env vars for instance names:** +```typescript +const answer = await env.AI.autorag(env.AI_SEARCH_INSTANCE).aiSearch({...}); +``` + +**Handle specific error types:** +```typescript +if (error instanceof AutoRAGNotFoundError) { /* 404 */ } +if (error instanceof AutoRAGUnauthorizedError) { /* 401 */ } +``` diff --git a/cloudflare/references/ai-search/patterns.md b/cloudflare/references/ai-search/patterns.md new file mode 100644 index 0000000..4a70f08 --- /dev/null +++ b/cloudflare/references/ai-search/patterns.md @@ -0,0 +1,85 @@ +# AI Search Patterns + +## search() vs aiSearch() + +| Use | Method | Returns | +|-----|--------|---------| +| Custom UI, analytics | `search()` | Raw chunks only (~100-300ms) | +| Chatbots, Q&A | `aiSearch()` | AI response + chunks (~500-2000ms) | + +## rewrite_query + +| Setting | Use When | +|---------|----------| +| `true` | User input (typos, vague queries) | +| `false` | LLM-generated queries (already optimized) | + +## Multitenancy (Folder-Based) + +```typescript +const answer = await env.AI.autorag("saas-docs").aiSearch({ + query: "refund policy", + model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast", + filters: { + column: "folder", + operator: "gte", // "starts with" pattern + value: `tenants/${tenantId}/` + } +}); +``` + +## Streaming + +```typescript +const stream = await env.AI.autorag("docs").aiSearch({ + query, model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast", stream: true +}); +return new Response(stream, { headers: { "Content-Type": "text/event-stream" } }); +``` + +## Score Threshold + +| Threshold | Use | +|-----------|-----| +| 0.3 (default) | Broad recall, exploratory | +| 0.5 | Balanced, production default | +| 0.7 | High precision, critical accuracy | + +## System Prompt Template + 
+```typescript +const systemPrompt = `You are a documentation assistant. +- Answer ONLY based on provided context +- If context doesn't contain answer, say "I don't have information" +- Include code examples from context`; +``` + +## Compound Filters + +```typescript +// OR: Multiple folders +filters: { + operator: "or", + filters: [ + { column: "folder", operator: "gte", value: "docs/api/" }, + { column: "folder", operator: "gte", value: "docs/auth/" } + ] +} + +// AND: Folder + date +filters: { + operator: "and", + filters: [ + { column: "folder", operator: "gte", value: "docs/" }, + { column: "timestamp", operator: "gte", value: oneWeekAgoSeconds } + ] +} +``` + +## Reranking + +Enable for high-stakes use cases (adds ~300ms latency): + +```typescript +reranking: { enabled: true, model: "@cf/baai/bge-reranker-base" } +``` diff --git a/cloudflare/references/analytics-engine/README.md b/cloudflare/references/analytics-engine/README.md new file mode 100644 index 0000000..524123b --- /dev/null +++ b/cloudflare/references/analytics-engine/README.md @@ -0,0 +1,92 @@ +# Cloudflare Workers Analytics Engine Reference + +Expert guidance for implementing unlimited-cardinality analytics at scale using Cloudflare Workers Analytics Engine. + +## What is Analytics Engine? + +Time-series analytics database designed for high-cardinality data (millions of unique dimensions). Write data points from Workers, query via SQL API. Use for: +- Custom user-facing analytics dashboards +- Usage-based billing & metering +- Per-customer/per-feature monitoring +- High-frequency instrumentation without performance impact + +**Key Capability:** Track metrics with unlimited unique values (e.g., millions of user IDs, API keys) without performance degradation. 
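The usage-based billing case above reduces to: write one data point per request with the compute cost in a double and the customer key in the index, then aggregate per customer. A sketch of the aggregation step over rows from the SQL API's `data` array — the row shape and price constant are illustrative assumptions for this sketch, not part of the Analytics Engine API:

```typescript
// Aggregate usage rows (as returned in the SQL API's "data" array) into
// per-customer invoice totals. Row shape and rate are illustrative
// assumptions, not Cloudflare pricing.
interface UsageRow {
  api_key: string;
  compute_units: number;
}

const PRICE_PER_UNIT_USD = 0.0001; // hypothetical billing rate

function invoiceTotals(rows: UsageRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const { api_key, compute_units } of rows) {
    totals.set(api_key, (totals.get(api_key) ?? 0) + compute_units * PRICE_PER_UNIT_USD);
  }
  return totals;
}
```

Paired with a query like `SELECT index1 AS api_key, SUM(double2) AS compute_units ... GROUP BY index1`, this is the whole billing pipeline: high-cardinality writes in, per-customer totals out.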
+ +## Core Concepts + +| Concept | Description | Example | +|---------|-------------|---------| +| **Dataset** | Logical table for related metrics | `api_requests`, `user_events` | +| **Data Point** | Single measurement with timestamp | One API request's metrics | +| **Blobs** | String dimensions (max 20) | endpoint, method, status, user_id | +| **Doubles** | Numeric values (max 20) | latency_ms, request_count, bytes | +| **Indexes** | Filtered blobs for efficient queries | customer_id, api_key | + +## Reading Order + +| Task | Start Here | Then Read | +|------|------------|-----------| +| **First-time setup** | [configuration.md](configuration.md) → [api.md](api.md) → [patterns.md](patterns.md) | | +| **Writing data** | [api.md](api.md) → [gotchas.md](gotchas.md) (sampling) | | +| **Querying data** | [api.md](api.md) (SQL API) → [patterns.md](patterns.md) (examples) | | +| **Debugging** | [gotchas.md](gotchas.md) → [api.md](api.md) (limits) | | +| **Optimization** | [patterns.md](patterns.md) (anti-patterns) → [gotchas.md](gotchas.md) | | + +## When to Use Analytics Engine + +``` +Need to track metrics? → Yes + ↓ +Millions of unique dimension values? → Yes + ↓ + Need real-time queries? → Yes + ↓ + Use Analytics Engine ✓ + +Alternative scenarios: +- Low cardinality (<10k unique values) → Workers Analytics (free tier) +- Complex joins/relations → D1 Database +- Logs/debugging → Tail Workers (logpush) +- External tools → Send to external analytics (Datadog, etc.) +``` + +## Quick Start + +1. Add binding to `wrangler.jsonc`: +```jsonc +{ + "analytics_engine_datasets": [ + { "binding": "ANALYTICS", "dataset": "my_events" } + ] +} +``` + +2. Write data points (fire-and-forget, no await): +```typescript +env.ANALYTICS.writeDataPoint({ + blobs: ["/api/users", "GET", "200"], + doubles: [145.2, 1], // latency_ms, count + indexes: [customerId] +}); +``` + +3. 
Query via SQL API (HTTP): +```sql +SELECT blob1, SUM(double2) AS total_requests +FROM my_events +WHERE index1 = 'customer_123' + AND timestamp >= NOW() - INTERVAL '7' DAY +GROUP BY blob1 +ORDER BY total_requests DESC +``` + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, bindings, TypeScript types, limits +- **[api.md](api.md)** - `writeDataPoint()`, SQL API, query syntax +- **[patterns.md](patterns.md)** - Use cases, examples, anti-patterns +- **[gotchas.md](gotchas.md)** - Sampling, index selection, troubleshooting + +## See Also + +- [Cloudflare Analytics Engine Docs](https://developers.cloudflare.com/analytics/analytics-engine/) diff --git a/cloudflare/references/analytics-engine/api.md b/cloudflare/references/analytics-engine/api.md new file mode 100644 index 0000000..20e7084 --- /dev/null +++ b/cloudflare/references/analytics-engine/api.md @@ -0,0 +1,112 @@ +# Analytics Engine API Reference + +## Writing Data + +### `writeDataPoint()` + +Fire-and-forget (returns `void`, not Promise). Writes happen asynchronously. + +```typescript +interface AnalyticsEngineDataPoint { + blobs?: string[]; // Up to 20 strings (dimensions), 16KB each + doubles?: number[]; // Up to 20 numbers (metrics) + indexes?: string[]; // 1 indexed string for high-cardinality filtering +} + +env.ANALYTICS.writeDataPoint({ + blobs: ["/api/users", "GET", "200"], + doubles: [145.2, 1], // latency_ms, count + indexes: ["customer_abc123"] +}); +``` + +**Behaviors:** No await needed, no error thrown (check tail logs), auto-sampled at high volumes, auto-timestamped. + +**Blob vs Index:** Blob for GROUP BY (<100k unique), Index for filter-only (millions unique). 
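Because `writeDataPoint()` never throws, the limits above (20 blobs, 20 doubles, 1 index, 16KB per string) are easy to violate silently. A dev-time guard — a sketch only; `assertValidDataPoint` is a hypothetical helper, not part of the Workers API — surfaces schema mistakes before deploy:

```typescript
// Dev-time check against the documented limits:
// 20 blobs, 20 doubles, 1 index, 16KB per blob/index string.
interface DataPoint {
  blobs?: string[];
  doubles?: number[];
  indexes?: string[];
}

const MAX_STRING_BYTES = 16 * 1024;

function assertValidDataPoint(point: DataPoint): void {
  const { blobs = [], doubles = [], indexes = [] } = point;
  if (blobs.length > 20) throw new Error("max 20 blobs per data point");
  if (doubles.length > 20) throw new Error("max 20 doubles per data point");
  if (indexes.length > 1) throw new Error("max 1 index per data point");
  for (const s of [...blobs, ...indexes]) {
    if (new TextEncoder().encode(s).length > MAX_STRING_BYTES) {
      throw new Error("blob/index exceeds 16KB");
    }
  }
}
```

Call it just before `env.ANALYTICS.writeDataPoint(point)` in development builds; in production, rely on tail logs as described above.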
+ +### Full Example + +```typescript +export default { + async fetch(request: Request, env: Env): Promise { + const start = Date.now(); + const url = new URL(request.url); + try { + const response = await handleRequest(request); + env.ANALYTICS.writeDataPoint({ + blobs: [url.pathname, request.method, response.status.toString()], + doubles: [Date.now() - start, 1], + indexes: [request.headers.get("x-api-key") || "anonymous"] + }); + return response; + } catch (error) { + env.ANALYTICS.writeDataPoint({ + blobs: [url.pathname, request.method, "500"], + doubles: [Date.now() - start, 1, 0], + }); + throw error; + } + } +}; +``` + +## SQL API (External Only) + +```bash +curl -X POST https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql \ + -H "Authorization: Bearer $TOKEN" \ + -d "SELECT blob1 AS endpoint, COUNT(*) AS requests FROM dataset WHERE timestamp >= NOW() - INTERVAL '1' HOUR GROUP BY blob1" +``` + +### Column References + +```sql +-- blob1..blob20, double1..double20, index1, timestamp +SELECT blob1 AS endpoint, SUM(double1) AS latency, COUNT(*) AS requests +FROM my_dataset +WHERE index1 = 'customer_123' AND timestamp >= NOW() - INTERVAL '7' DAY +GROUP BY blob1 +HAVING COUNT(*) > 100 +ORDER BY requests DESC LIMIT 100 +``` + +**Aggregations:** `SUM()`, `AVG()`, `COUNT()`, `MIN()`, `MAX()`, `quantile(0.95)()` + +**Time ranges:** `NOW() - INTERVAL '1' HOUR`, `BETWEEN '2026-01-01' AND '2026-01-31'` + +### Query Examples + +```sql +-- Top endpoints +SELECT blob1, COUNT(*) AS requests, AVG(double1) AS avg_latency +FROM api_requests WHERE timestamp >= NOW() - INTERVAL '24' HOUR +GROUP BY blob1 ORDER BY requests DESC LIMIT 20 + +-- Error rate +SELECT blob1, COUNT(*) AS total, + SUM(CASE WHEN blob3 LIKE '5%' THEN 1 ELSE 0 END) AS errors +FROM api_requests WHERE timestamp >= NOW() - INTERVAL '1' HOUR +GROUP BY blob1 HAVING total > 50 + +-- P95 latency +SELECT blob1, quantile(0.95)(double1) AS p95 +FROM api_requests GROUP BY blob1 +``` + +## 
Response Format + +```json +{"data": [{"endpoint": "/api/users", "requests": 1523}], "rows": 2} +``` + +## Limits + +| Resource | Limit | +|----------|-------| +| Blobs/Doubles per point | 20 each | +| Indexes per point | 1 | +| Blob/Index size | 16KB | +| Data retention | 90 days | +| Query timeout | 30s | + +**Critical:** High write volumes (>1M/min) trigger automatic sampling. diff --git a/cloudflare/references/analytics-engine/configuration.md b/cloudflare/references/analytics-engine/configuration.md new file mode 100644 index 0000000..a71af96 --- /dev/null +++ b/cloudflare/references/analytics-engine/configuration.md @@ -0,0 +1,112 @@ +# Analytics Engine Configuration + +## Setup + +1. Add binding to `wrangler.jsonc` +2. Deploy Worker +3. Dataset created automatically on first write +4. Query via SQL API + +## wrangler.jsonc + +```jsonc +{ + "name": "my-worker", + "analytics_engine_datasets": [ + { "binding": "ANALYTICS", "dataset": "my_events" } + ] +} +``` + +Multiple datasets for separate concerns: +```jsonc +{ + "analytics_engine_datasets": [ + { "binding": "API_ANALYTICS", "dataset": "api_requests" }, + { "binding": "USER_EVENTS", "dataset": "user_activity" } + ] +} +``` + +## TypeScript + +```typescript +interface Env { + ANALYTICS: AnalyticsEngineDataset; +} + +export default { + async fetch(request: Request, env: Env) { + // No await - returns void, fire-and-forget + env.ANALYTICS.writeDataPoint({ + blobs: [pathname, method, status], // String dimensions (max 20) + doubles: [latency, 1], // Numeric metrics (max 20) + indexes: [apiKey] // High-cardinality filter (max 1) + }); + return response; + } +}; +``` + +## Data Point Limits + +| Field | Limit | SQL Access | +|-------|-------|------------| +| blobs | 20 strings, 16KB each | `blob1`...`blob20` | +| doubles | 20 numbers | `double1`...`double20` | +| indexes | 1 string, 16KB | `index1` | + +## Write Behavior + +| Scenario | Behavior | +|----------|----------| +| <1M writes/min | All accepted | +| >1M 
writes/min | Automatic sampling | +| Invalid data | Silent failure (check tail logs) | + +**Mitigate sampling:** Pre-aggregate, use multiple datasets, write only critical metrics. + +## Query Limits + +| Resource | Limit | +|----------|-------| +| Query timeout | 30 seconds | +| Data retention | 90 days (default) | +| Result size | ~10MB | + +## Cost + +**Free tier:** 10M writes/month, 1M reads/month + +**Paid:** $0.05 per 1M writes, $1.00 per 1M reads + +## Environment-Specific + +```jsonc +{ + "analytics_engine_datasets": [ + { "binding": "ANALYTICS", "dataset": "prod_events" } + ], + "env": { + "staging": { + "analytics_engine_datasets": [ + { "binding": "ANALYTICS", "dataset": "staging_events" } + ] + } + } +} +``` + +## Monitoring + +```bash +npx wrangler tail # Check for sampling/write errors +``` + +```sql +-- Check write activity +SELECT DATE_TRUNC('hour', timestamp) AS hour, COUNT(*) AS writes +FROM my_dataset +WHERE timestamp >= NOW() - INTERVAL '24' HOUR +GROUP BY hour +``` diff --git a/cloudflare/references/analytics-engine/gotchas.md b/cloudflare/references/analytics-engine/gotchas.md new file mode 100644 index 0000000..ed1767f --- /dev/null +++ b/cloudflare/references/analytics-engine/gotchas.md @@ -0,0 +1,85 @@ +# Analytics Engine Gotchas + +## Critical Issues + +### Sampling at High Volumes + +**Problem:** Queries return fewer points than written at >1M writes/min. + +**Solution:** +```typescript +// Pre-aggregate before writing +let buffer = { count: 0, total: 0 }; +buffer.count++; buffer.total += value; + +// Write once per second instead of per request +if (Date.now() % 1000 === 0) { + env.ANALYTICS.writeDataPoint({ doubles: [buffer.count, buffer.total] }); +} +``` + +**Detection:** `npx wrangler tail` → look for "sampling enabled" + +### writeDataPoint Returns void + +```typescript +// ❌ Pointless await +await env.ANALYTICS.writeDataPoint({...}); + +// ✅ Fire-and-forget +env.ANALYTICS.writeDataPoint({...}); +``` + +Writes can fail silently. 
Check tail logs. + +### Index vs Blob + +| Cardinality | Use | Example | +|-------------|-----|---------| +| Millions | **Index** | user_id, api_key | +| Hundreds | **Blob** | endpoint, status_code, country | + +```typescript +// ✅ Correct +{ blobs: [method, path, status], indexes: [userId] } +``` + +### Can't Query from Workers + +Query API requires HTTP auth. Use external service or cache in KV/D1. + +### No Custom Timestamps + +Auto-generated at write time. Store original in blob if needed. + +## Common Errors + +| Error | Fix | +|-------|-----| +| Binding not found | Check wrangler.jsonc, redeploy | +| No data in query | Wait 30s; check dataset name; check time range | +| Query timeout | Add time filter; use index for filtering | + +## Limits + +| Resource | Limit | +|----------|-------| +| Blobs per point | 20 | +| Doubles per point | 20 | +| Indexes per point | 1 | +| Blob/Index size | 16KB | +| Write rate (no sampling) | ~1M/min | +| Retention | 90 days | +| Query timeout | 30s | + +## Best Practices + +✅ Pre-aggregate at high volumes +✅ Use index for high-cardinality (millions) +✅ Always include time filter in queries +✅ Design schema before coding + +❌ Don't await writeDataPoint +❌ Don't use index for low-cardinality +❌ Don't query without time range +❌ Don't assume all writes succeed diff --git a/cloudflare/references/analytics-engine/patterns.md b/cloudflare/references/analytics-engine/patterns.md new file mode 100644 index 0000000..dc3fc5d --- /dev/null +++ b/cloudflare/references/analytics-engine/patterns.md @@ -0,0 +1,83 @@ +# Analytics Engine Patterns + +## Use Cases + +| Use Case | Key Metrics | Index On | +|----------|-------------|----------| +| API Metering | requests, bytes, compute_units | api_key | +| Feature Usage | feature, action, duration | user_id | +| Error Tracking | error_type, endpoint, count | customer_id | +| Performance | latency_ms, cache_status | endpoint | +| A/B Testing | variant, conversions | user_id | + +## API Metering 
(Billing) + +```typescript +env.ANALYTICS.writeDataPoint({ + blobs: [pathname, method, status, tier], + doubles: [1, computeUnits, bytes, latencyMs], + indexes: [apiKey] +}); + +// Query: Monthly usage by customer +// SELECT index1 AS api_key, SUM(double2) AS compute_units +// FROM usage WHERE timestamp >= DATE_TRUNC('month', NOW()) GROUP BY index1 +``` + +## Error Tracking + +```typescript +env.ANALYTICS.writeDataPoint({ + blobs: [endpoint, method, errorName, errorMessage.slice(0, 1000)], + doubles: [1, timeToErrorMs], + indexes: [customerId] +}); +``` + +## Performance Monitoring + +```typescript +env.ANALYTICS.writeDataPoint({ + blobs: [pathname, method, cacheStatus, status], + doubles: [latencyMs, 1], + indexes: [userId] +}); + +// Query: P95 latency by endpoint +// SELECT blob1, quantile(0.95)(double1) AS p95_ms FROM perf GROUP BY blob1 +``` + +## Anti-Patterns + +| ❌ Wrong | ✅ Correct | +|----------|-----------| +| `await writeDataPoint()` | `writeDataPoint()` (fire-and-forget) | +| `indexes: [method]` (low cardinality) | `blobs: [method]`, `indexes: [userId]` | +| `blobs: [JSON.stringify(obj)]` | Store ID in blob, full object in D1/KV | +| Write every request at 10M/min | Pre-aggregate per second | +| Query from Worker | Query from external service/API | + +## Best Practices + +1. **Design schema upfront** - Document blob/double/index assignments +2. **Always include count metric** - `doubles: [latency, 1]` for AVG calculations +3. **Use enums for blobs** - Consistent values like `Status.SUCCESS` +4. **Handle sampling** - Use ratios (avg_latency = SUM(latency)/SUM(count)) +5. 
**Test queries early** - Validate schema before heavy writes + +## Schema Template + +```typescript +/** + * Dataset: my_metrics + * + * Blobs: + * blob1: endpoint, blob2: method, blob3: status + * + * Doubles: + * double1: latency_ms, double2: count (always 1) + * + * Indexes: + * index1: customer_id (high cardinality) + */ +``` diff --git a/cloudflare/references/api-shield/README.md b/cloudflare/references/api-shield/README.md new file mode 100644 index 0000000..86613c8 --- /dev/null +++ b/cloudflare/references/api-shield/README.md @@ -0,0 +1,44 @@ +# Cloudflare API Shield Reference + +Expert guidance for API Shield - comprehensive API security suite for discovery, protection, and monitoring. + +## Reading Order + +| Task | Files to Read | +|------|---------------| +| Initial setup | README → configuration.md | +| Implement JWT validation | configuration.md → api.md | +| Add schema validation | configuration.md → patterns.md | +| Detect API attacks | patterns.md → api.md | +| Debug issues | gotchas.md | + +## Feature Selection + +What protection do you need? 
+ +``` +├─ Validate request/response structure → Schema Validation 2.0 (configuration.md) +├─ Verify auth tokens → JWT Validation (configuration.md) +├─ Client certificates → mTLS (configuration.md) +├─ Detect BOLA attacks → BOLA Detection (patterns.md) +├─ Track auth coverage → Auth Posture (patterns.md) +├─ Stop volumetric abuse → Abuse Detection (patterns.md) +└─ Discover shadow APIs → API Discovery (api.md) +``` + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, session identifiers, rules, token/mTLS configs +- **[api.md](api.md)** - Endpoint management, discovery, validation APIs, GraphQL operations +- **[patterns.md](patterns.md)** - Common patterns, progressive rollout, OWASP mappings, workflows +- **[gotchas.md](gotchas.md)** - Troubleshooting, false positives, performance, best practices + +## Quick Start + +API Shield: Enterprise-grade API security (Discovery, Schema Validation 2.0, JWT, mTLS, BOLA Detection, Auth Posture). Available as Enterprise add-on with preview access. 
+ +## See Also + +- [API Shield Docs](https://developers.cloudflare.com/api-shield/) +- [API Reference](https://developers.cloudflare.com/api/resources/api_gateway/) +- [OWASP API Security Top 10](https://owasp.org/www-project-api-security/) diff --git a/cloudflare/references/api-shield/api.md b/cloudflare/references/api-shield/api.md new file mode 100644 index 0000000..c833501 --- /dev/null +++ b/cloudflare/references/api-shield/api.md @@ -0,0 +1,141 @@ +# API Reference + +Base: `/zones/{zone_id}/api_gateway` + +## Endpoints + +```bash +GET /operations # List +GET /operations/{op_id} # Get single +POST /operations/item # Create: {endpoint,host,method} +POST /operations # Bulk: {operations:[{endpoint,host,method}]} +DELETE /operations/{op_id} # Delete +DELETE /operations # Bulk delete: {operation_ids:[...]} +``` + +## Discovery + +```bash +GET /discovery/operations # List discovered +PATCH /discovery/operations/{op_id} # Update: {state:"saved"|"ignored"} +PATCH /discovery/operations # Bulk: {operation_ids:{id:{state}}} +GET /discovery # OpenAPI export +``` + +## Config + +```bash +GET /configuration # Get session ID config +PUT /configuration # Update: {auth_id_characteristics:[{name,type:"header"|"cookie"}]} +``` + +## Token Validation + +```bash +GET /token_validation # List +POST /token_validation # Create: {name,location:{header:"..."},jwks:"..."} +POST /jwt_validation_rules # Rule: {name,hostname,token_validation_id,action:"block"} +``` + +## Workers Integration + +### Access JWT Claims +```js +export default { + async fetch(req, env) { + // Access validated JWT payload + const jwt = req.cf?.jwt?.payload?.[env.JWT_CONFIG_ID]?.[0]; + if (jwt) { + const userId = jwt.sub; + const role = jwt.role; + } + } +} +``` + +### Access mTLS Info +```js +export default { + async fetch(req, env) { + const tls = req.cf?.tlsClientAuth; + if (tls?.certVerified === 'SUCCESS') { + const fingerprint = tls.certFingerprintSHA256; + // Authenticated client + } + } +} +``` + +### 
Dynamic JWKS Update +```js +export default { + async scheduled(event, env) { + const jwks = await (await fetch('https://auth.example.com/.well-known/jwks.json')).json(); + await fetch(`https://api.cloudflare.com/client/v4/zones/${env.ZONE_ID}/api_gateway/token_validation/${env.CONFIG_ID}`, { + method: 'PATCH', + headers: {'Authorization': `Bearer ${env.CF_API_TOKEN}`, 'Content-Type': 'application/json'}, + body: JSON.stringify({jwks: JSON.stringify(jwks)}) + }); + } +} +``` + +## Firewall Fields + +### Core Fields +```js +cf.api_gateway.auth_id_present // Session ID present +cf.api_gateway.request_violates_schema // Schema violation +cf.api_gateway.fallthrough_triggered // No endpoint match +cf.tls_client_auth.cert_verified // mTLS cert valid +cf.tls_client_auth.cert_fingerprint_sha256 +``` + +### JWT Validation (2026) +```js +// Modern validation syntax +is_jwt_valid(http.request.jwt.payload["{config_id}"][0]) + +// Legacy (still supported) +cf.api_gateway.jwt_claims_valid + +// Extract claims +lookup_json_string(http.request.jwt.payload["{config_id}"][0], "claim_name") +``` + +### Risk Labels (2026) +```js +// BOLA detection +cf.api_gateway.cf-risk-bola-enumeration // Sequential resource access detected +cf.api_gateway.cf-risk-bola-pollution // Parameter pollution detected + +// Authentication posture +cf.api_gateway.cf-risk-missing-auth // Endpoint lacks authentication +cf.api_gateway.cf-risk-mixed-auth // Inconsistent auth patterns +``` + +## BOLA Detection + +```bash +GET /user_schemas/{schema_id}/bola # Get BOLA config +PATCH /user_schemas/{schema_id}/bola # Update: {enabled:true} +``` + +## Auth Posture + +```bash +GET /discovery/authentication_posture # List unprotected endpoints +``` + +## GraphQL Protection + +```bash +GET /settings/graphql_protection # Get limits +PUT /settings/graphql_protection # Set: {max_depth,max_size} +``` + +## See Also + +- [configuration.md](configuration.md) - Setup guides for all features +- [patterns.md](patterns.md) - 
Firewall rules and common patterns +- [API Gateway API Docs](https://developers.cloudflare.com/api/resources/api_gateway/) diff --git a/cloudflare/references/api-shield/configuration.md b/cloudflare/references/api-shield/configuration.md new file mode 100644 index 0000000..f95744c --- /dev/null +++ b/cloudflare/references/api-shield/configuration.md @@ -0,0 +1,192 @@ +# Configuration + +## Schema Validation 2.0 Setup + +> ⚠️ **Classic Schema Validation deprecated.** Use Schema Validation 2.0. + +**Upload schema (Dashboard):** +``` +Security > API Shield > Schema Validation > Add validation +- Upload .yml/.yaml/.json (OpenAPI v3.0) +- Endpoints auto-added to Endpoint Management +- Action: Log | Block | None +- Body inspection: JSON payloads +``` + +**Change validation action:** +``` +Security > API Shield > Settings > Schema Validation +Per-endpoint: Filter → ellipses → Change action +Default action: Set global mitigation action +``` + +**Migration from Classic:** +``` +1. Export existing schema (if available) +2. Delete all Classic schema validation rules +3. Wait 5 min for cache clear +4. Re-upload via Schema Validation 2.0 interface +5. Verify in Security > Events +``` + +**Fallthrough rule** (catch-all unknown endpoints): +``` +Security > API Shield > Settings > Fallthrough > Use Template +- Select hostnames +- Create rule with cf.api_gateway.fallthrough_triggered +- Action: Log (discover) or Block (strict) +``` + +**Body inspection:** Supports `application/json`, `*/*`, `application/*`. Disable origin MIME sniffing to prevent bypasses. 
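Schema uploads can also be scripted against the `user_schemas` endpoint shown in patterns.md. A minimal request-builder sketch — the multipart field names (`file`, `kind`, `name`, `validation_enabled`) are assumptions here; verify them against the current API Gateway reference before relying on this:

```typescript
// Sketch: build a Schema Validation 2.0 upload request instead of using the
// dashboard. Multipart field names are assumed, not confirmed by this doc.
function buildSchemaUploadRequest(
  zoneId: string,
  apiToken: string,
  schemaYaml: string,
): { url: string; headers: Record<string, string>; body: FormData } {
  const body = new FormData();
  body.append("file", new Blob([schemaYaml], { type: "application/x-yaml" }), "openapi.yaml");
  body.append("kind", "openapi_v3"); // OpenAPI v3.0 only (see limits)
  body.append("name", "my-api-schema");
  body.append("validation_enabled", "true");
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/api_gateway/user_schemas`,
    headers: { Authorization: `Bearer ${apiToken}` },
    body,
  };
}

// Usage:
// const { url, headers, body } = buildSchemaUploadRequest(zoneId, token, yaml);
// await fetch(url, { method: "POST", headers, body });
```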
+ +## JWT Validation + +**Setup token config:** +``` +Security > API Shield > Settings > JWT Settings > Add configuration +- Name: "Auth0 JWT Config" +- Location: Header/Cookie + name (e.g., "Authorization") +- JWKS: Paste public keys from IdP +``` + +**Create validation rule:** +``` +Security > API Shield > API Rules > Add rule +- Hostname: api.example.com +- Deselect endpoints to ignore +- Token config: Select config +- Enforce presence: Ignore or Mark as non-compliant +- Action: Log/Block/Challenge +``` + +**Rate limit by JWT claim:** +```wirefilter +lookup_json_string(http.request.jwt.claims["{config_id}"][0], "sub") +``` + +**Special cases:** +- Two JWTs, different IdPs: Create 2 configs, select both, "Validate all" +- IdP migration: 2 configs + 2 rules, adjust actions per state +- Bearer prefix: API Shield handles with/without +- Nested claims: Dot notation `user.email` + +## Mutual TLS (mTLS) + +**Setup:** +``` +SSL/TLS > Client Certificates > Create Certificate +- Generate CF-managed CA (all plans) +- Upload custom CA (Enterprise, max 5) +``` + +**Configure mTLS rule:** +``` +Security > API Shield > mTLS +- Select hostname(s) +- Choose certificate(s) +- Action: Block/Log/Challenge +``` + +**Test:** +```bash +openssl req -x509 -newkey rsa:4096 -keyout client-key.pem -out client-cert.pem -days 365 +curl https://api.example.com/endpoint --cert client-cert.pem --key client-key.pem +``` + +## Session Identifiers + +Critical for BOLA Detection, Sequence Mitigation, and analytics. Configure header/cookie that uniquely IDs API users. + +**Examples:** JWT sub claim, session token, API key, custom user ID header + +**Configure:** +``` +Security > API Shield > Settings > Session Identifiers +- Type: Header/Cookie +- Name: "X-User-ID" or "Authorization" +``` + +## BOLA Detection + +Detects Broken Object Level Authorization attacks (enumeration + parameter pollution). 
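Concretely, enumeration means one session identifier walking sequential object IDs. The real detection is Cloudflare's ML model, but a toy heuristic makes the flagged pattern tangible — illustrative only; the `/users/{id}` route and threshold are assumptions:

```typescript
// Toy heuristic: flag a session whose requests hit a run of sequential
// numeric IDs (e.g., /users/1, /users/2, /users/3). Not the actual
// BOLA Detection algorithm -- a sketch of the pattern it targets.
function looksLikeEnumeration(pathsForSession: string[], threshold = 5): boolean {
  const ids = pathsForSession
    .map((p) => /\/users\/(\d+)$/.exec(p)?.[1])
    .filter((m): m is string => m !== undefined)
    .map(Number);
  let run = 1;
  let longest = 1;
  for (let i = 1; i < ids.length; i++) {
    run = ids[i] === ids[i - 1] + 1 ? run + 1 : 1;
    longest = Math.max(longest, run);
  }
  return longest >= threshold;
}
```

This is why session identifiers must uniquely identify users: with a shared token, unrelated users' requests merge into one "session" and legitimate traffic starts to look like a sequential walk.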
+ +**Enable:** +``` +Security > API Shield > Schema Validation > [Select Schema] > BOLA Detection +- Enable detection +- Threshold: Sensitivity level (Low/Medium/High) +- Action: Log or Block +``` + +**Requirements:** +- Schema Validation 2.0 enabled +- Session identifiers configured +- Minimum traffic: 1000+ requests/day per endpoint + +## Authentication Posture + +Identifies unprotected or inconsistently protected endpoints. + +**View report:** +``` +Security > API Shield > Authentication Posture +- Shows endpoints lacking JWT/mTLS +- Highlights mixed authentication patterns +``` + +**Remediate:** +1. Review flagged endpoints +2. Add JWT validation rules +3. Configure mTLS for sensitive endpoints +4. Monitor posture score + +## Volumetric Abuse + GraphQL + +**Volumetric Abuse Detection:** +`Security > API Shield > Settings > Volumetric Abuse Detection` +- Enable per-endpoint monitoring, set thresholds, action: Log | Challenge | Block + +**GraphQL Protection:** +`Security > API Shield > Settings > GraphQL Protection` +- Max query depth: 10, max size: 100KB, block introspection (production) + +## Terraform + +```hcl +# Session identifier +resource "cloudflare_api_shield" "main" { + zone_id = var.zone_id + auth_id_characteristics { + type = "header" + name = "Authorization" + } +} + +# Add endpoint +resource "cloudflare_api_shield_operation" "users_get" { + zone_id = var.zone_id + method = "GET" + host = "api.example.com" + endpoint = "/api/users/{id}" +} + +# JWT validation rule +resource "cloudflare_ruleset" "jwt_validation" { + zone_id = var.zone_id + name = "API JWT Validation" + kind = "zone" + phase = "http_request_firewall_custom" + + rules { + action = "block" + expression = "(http.host eq \"api.example.com\" and not is_jwt_valid(http.request.jwt.payload[\"{config_id}\"][0]))" + description = "Block invalid JWTs" + } +} +``` + +## See Also + +- [api.md](api.md) - API endpoints and Workers integration +- [patterns.md](patterns.md) - Firewall rules and 
deployment patterns +- [gotchas.md](gotchas.md) - Troubleshooting and limits diff --git a/cloudflare/references/api-shield/gotchas.md b/cloudflare/references/api-shield/gotchas.md new file mode 100644 index 0000000..255517f --- /dev/null +++ b/cloudflare/references/api-shield/gotchas.md @@ -0,0 +1,125 @@ +# Gotchas & Troubleshooting + +## Common Errors + +### "Schema Validation 2.0 not working after migration" + +**Cause:** Classic rules still active, conflicting with new system +**Solution:** +1. Delete ALL Classic schema validation rules +2. Clear Cloudflare cache (wait 5 min) +3. Re-upload schema via new Schema Validation 2.0 interface +4. Verify in Security > Events +5. Check action is set (Log/Block) + +### "Schema validation blocking valid requests" + +**Cause:** Schema too restrictive, missing fields, or incorrect types +**Solution:** +1. Check Firewall Events for violation details +2. Review schema in Settings +3. Test schema in Swagger Editor +4. Use Log mode to validate before blocking +5. Update schema with correct specifications +6. Ensure Schema Validation 2.0 (not Classic) + +### "JWT validation failing" + +**Cause:** JWKS mismatch with IdP, expired token, wrong header/cookie name, or clock skew +**Solution:** +1. Verify JWKS matches IdP configuration +2. Check token `exp` claim is valid +3. Confirm header/cookie name matches config +4. Test token at jwt.io +5. Account for clock skew (±5 min tolerance) +6. Use modern syntax: `is_jwt_valid(http.request.jwt.payload["{config_id}"][0])` + +### "BOLA detection false positives" + +**Cause:** Legitimate sequential access patterns, bulk operations, or sensitivity too high +**Solution:** +1. Review BOLA events in Security > Events +2. Lower sensitivity threshold (High → Medium → Low) +3. Exclude legitimate bulk operations from detection +4. Ensure session identifiers uniquely identify users +5. 
Verify minimum traffic requirements met (1000+ req/day) + +### "Risk labels not appearing in firewall rules" + +**Cause:** Feature not enabled, insufficient traffic, or missing session identifiers +**Solution:** +1. Verify Schema Validation 2.0 enabled +2. Enable BOLA Detection in schema settings +3. Configure session identifiers (required for BOLA) +4. Wait 24-48h for ML model training +5. Check minimum traffic thresholds met + +### "Endpoint discovery not finding APIs" + +**Cause:** Insufficient traffic (<500 reqs/10d), non-2xx responses, Worker direct requests, or incorrect session ID config +**Solution:** Ensure 500+ requests in 10 days, 2xx responses from edge (not Workers direct), configure session IDs correctly. ML updates daily. + +### "Sequence detection false positives" + +**Cause:** Lookback window issues, non-unique session IDs, or model sensitivity +**Solution:** +1. Review lookback settings (10 reqs to managed endpoints, 10min window) +2. Ensure session ID uniqueness per user (not shared tokens) +3. Adjust positive/negative model balance +4. Exclude legitimate workflows from detection + +### "GraphQL protection blocking valid queries" + +**Cause:** Query depth/size limits too restrictive, complex but legitimate queries +**Solution:** +1. Review blocked query patterns in Security > Events +2. Increase max_depth (default: 10) if needed +3. Increase max_size (default: 100KB) for complex queries +4. Whitelist specific query signatures +5. 
Use Log mode to tune before blocking + +### "Token invalid" + +**Cause:** Configuration error, JWKS mismatch, or expired token +**Solution:** Verify config matches IdP, update JWKS, check token expiration + +### "Schema violation" + +**Cause:** Missing required fields, wrong data types, or spec mismatch +**Solution:** Review schema against actual requests, ensure all required fields present, validate types match spec + +### "Fallthrough" + +**Cause:** Unknown endpoint or pattern mismatch +**Solution:** Update schema with all endpoints, check path pattern matching + +### "mTLS failed" + +**Cause:** Certificate untrusted/expired or wrong CA +**Solution:** Verify cert chain, check expiration, confirm correct CA uploaded + +## Limits (2026) + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| OpenAPI version | v3.0.x only | No external refs, must be valid | +| Schema operations | 10K (Enterprise) | Contact for higher limits | +| JWT validation sources | Headers/cookies only | No query params/body | +| Endpoint discovery | 500+ reqs/10d | Minimum for ML model | +| Path normalization | Automatic | `/profile/238` → `/profile/{var1}` | +| Schema parameters | No `content` field | No object param validation | +| BOLA detection | 1000+ reqs/day/endpoint | Per-endpoint minimum | +| Session ID uniqueness | Required | BOLA/Sequence need unique IDs | +| GraphQL max depth | 1-50 | Default: 10 | +| GraphQL max size | 1KB-1MB | Default: 100KB | +| JWT claim nesting | 10 levels max | Use dot notation | +| mTLS CA certificates | 5 custom max | CF-managed unlimited | +| Schema upload size | 5MB max | Compressed OpenAPI spec | +| Volumetric abuse baseline | 7 days training | Initial ML period | +| Auth Posture refresh | Daily | Updated nightly | + +## See Also + +- [configuration.md](configuration.md) - Setup guides to avoid common issues +- [patterns.md](patterns.md) - Best practices and progressive rollout +- [API Shield 
Docs](https://developers.cloudflare.com/api-shield/) diff --git a/cloudflare/references/api-shield/patterns.md b/cloudflare/references/api-shield/patterns.md new file mode 100644 index 0000000..5721dba --- /dev/null +++ b/cloudflare/references/api-shield/patterns.md @@ -0,0 +1,180 @@ +# Patterns & Use Cases + +## Protect API with Schema + JWT + +```bash +# 1. Upload OpenAPI schema +POST /zones/{zone_id}/api_gateway/user_schemas + +# 2. Configure JWT validation +POST /zones/{zone_id}/api_gateway/token_validation +{ + "name": "Auth0", + "location": {"header": "Authorization"}, + "jwks": "{...}" +} + +# 3. Create JWT rule +POST /zones/{zone_id}/api_gateway/jwt_validation_rules + +# 4. Set schema validation action +PUT /zones/{zone_id}/api_gateway/settings/schema_validation +{"validation_default_mitigation_action": "block"} +``` + +## Progressive Rollout + +``` +1. Log mode: Observe false positives + - Schema: Action = Log + - JWT: Action = Log + +2. Block subset: Protect critical endpoints + - Change specific endpoint actions to Block + - Monitor firewall events + +3. Full enforcement: Block all violations + - Change default action to Block + - Handle fallthrough with custom rule +``` + +## BOLA Detection + +### Enumeration Detection +Detects sequential resource access (e.g., `/users/1`, `/users/2`, `/users/3`). + +```javascript +// Block BOLA enumeration attempts +(cf.api_gateway.cf-risk-bola-enumeration and http.host eq "api.example.com") +// Action: Block or Challenge +``` + +### Parameter Pollution +Detects duplicate/excessive parameters in requests. 
+ +```javascript +// Block parameter pollution +(cf.api_gateway.cf-risk-bola-pollution and http.host eq "api.example.com") +// Action: Block +``` + +### Combined BOLA Protection +```javascript +// Comprehensive BOLA rule +(cf.api_gateway.cf-risk-bola-enumeration or cf.api_gateway.cf-risk-bola-pollution) +and http.host eq "api.example.com" +// Action: Block +``` + +## Authentication Posture + +### Detect Missing Auth +```javascript +// Log endpoints lacking authentication +(cf.api_gateway.cf-risk-missing-auth and http.host eq "api.example.com") +// Action: Log (for audit) +``` + +### Detect Mixed Auth +```javascript +// Alert on inconsistent auth patterns +(cf.api_gateway.cf-risk-mixed-auth and http.host eq "api.example.com") +// Action: Log (review required) +``` + +## Fallthrough Detection (Shadow APIs) + +```javascript +// WAF Custom Rule +(cf.api_gateway.fallthrough_triggered and http.host eq "api.example.com") +// Action: Log (discover unknown) or Block (strict) +``` + +## Rate Limiting by User + +```javascript +// Rate Limiting Rule (modern syntax) +(http.host eq "api.example.com" and + is_jwt_valid(http.request.jwt.payload["{config_id}"][0])) + +// Rate: 100 req/60s +// Counting expression: lookup_json_string(http.request.jwt.payload["{config_id}"][0], "sub") +``` + +## Volumetric Abuse Response + +```javascript +// Detect abnormal traffic spikes +(cf.api_gateway.volumetric_abuse_detected and http.host eq "api.example.com") +// Action: Challenge or Rate Limit + +// Combined with rate limiting +(cf.api_gateway.volumetric_abuse_detected or + cf.threat_score gt 50) and http.host eq "api.example.com" +// Action: JS Challenge +``` + +## GraphQL Protection + +```javascript +// Block oversized queries +(http.request.uri.path eq "/graphql" and + cf.api_gateway.graphql_query_size gt 100000) +// Action: Block + +// Block deep nested queries +(http.request.uri.path eq "/graphql" and + cf.api_gateway.graphql_query_depth gt 10) +// Action: Block +``` + +## Architecture 
Patterns + +**Public API:** Discovery + Schema Validation 2.0 + JWT + Rate Limiting + Bot Management +**Partner API:** mTLS + Schema Validation + Sequence Mitigation +**Internal API:** Discovery + Schema Learning + Auth Posture + +## OWASP API Security Top 10 Mapping (2026) + +| OWASP Issue | API Shield Solutions | +|-------------|---------------------| +| API1:2023 Broken Object Level Authorization | **BOLA Detection** (enumeration + pollution), Sequence mitigation, Schema, JWT, Rate Limiting | +| API2:2023 Broken Authentication | **Auth Posture**, mTLS, JWT validation, Bot Management | +| API3:2023 Broken Object Property Auth | Schema validation, JWT validation | +| API4:2023 Unrestricted Resource Access | Rate Limiting, **Volumetric Abuse Detection**, **GraphQL Protection**, Bot Management | +| API5:2023 Broken Function Level Auth | Schema validation, JWT validation, Auth Posture | +| API6:2023 Unrestricted Business Flows | Sequence mitigation, Bot Management | +| API7:2023 SSRF | Schema validation, WAF managed rules | +| API8:2023 Security Misconfiguration | **Schema Validation 2.0**, Auth Posture, WAF rules | +| API9:2023 Improper Inventory Management | **API Discovery**, Schema learning, Auth Posture | +| API10:2023 Unsafe API Consumption | JWT validation, Schema validation, WAF managed | + +## Monitoring + +**Security Events:** `Security > Events` → Filter: Action = block, Service = API Shield +**Firewall Analytics:** `Analytics > Security` → Filter by `cf.api_gateway.*` fields +**Logpush fields:** APIGatewayAuthIDPresent, APIGatewayRequestViolatesSchema, APIGatewayFallthroughDetected, JWTValidationResult + +## Availability (2026) + +| Feature | Availability | Notes | +|---------|-------------|-------| +| mTLS (CF-managed CA) | All plans | Self-service | +| Endpoint Management | All plans | Limited operations | +| Schema Validation 2.0 | All plans | Limited operations | +| API Discovery | Enterprise | 10K+ ops | +| JWT Validation | Enterprise add-on | Full 
validation | +| BOLA Detection | Enterprise add-on | Requires session IDs | +| Auth Posture | Enterprise add-on | Security audit | +| Volumetric Abuse Detection | Enterprise add-on | Traffic analysis | +| GraphQL Protection | Enterprise add-on | Query limits | +| Sequence Mitigation | Enterprise (beta) | Contact team | +| Full Suite | Enterprise add-on | All features | + +**Enterprise limits:** 10K operations (contact for higher). Preview access available for non-contract evaluation. + +## See Also + +- [configuration.md](configuration.md) - Setup all features before creating rules +- [api.md](api.md) - Firewall field reference and API endpoints +- [gotchas.md](gotchas.md) - Common issues and limits diff --git a/cloudflare/references/api/README.md b/cloudflare/references/api/README.md new file mode 100644 index 0000000..5e02109 --- /dev/null +++ b/cloudflare/references/api/README.md @@ -0,0 +1,65 @@ +# Cloudflare API Integration + +Guide for working with Cloudflare's REST API - authentication, SDK usage, common patterns, and troubleshooting. + +## Quick Decision Tree + +``` +How are you calling the Cloudflare API? +├─ From Workers runtime → Use bindings, not REST API (see ../bindings/) +├─ Server-side (Node/Python/Go) → Official SDK (see api.md) +├─ CLI/scripts → Wrangler or curl (see configuration.md) +├─ Infrastructure-as-code → See ../pulumi/ or ../terraform/ +└─ One-off requests → curl examples (see api.md) +``` + +## SDK Selection + +| Language | Package | Best For | Default Retries | +|----------|---------|----------|-----------------| +| TypeScript | `cloudflare` | Node.js, Bun, Next.js, Workers | 2 | +| Python | `cloudflare` | FastAPI, Django, scripts | 2 | +| Go | `cloudflare-go/v4` | CLI tools, microservices | 10 | + +All SDKs are Stainless-generated from OpenAPI spec (consistent APIs). 
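The per-token budget listed under Rate Limits below works out to a sustained rate of roughly 4 requests/second. For bulk scripts it is gentler to pace requests up front than to lean on 429 retries; a minimal pacing sketch (illustrative helper only, not a feature of any of the SDKs — the task-array signature is an assumption):

```typescript
// Spacing needed to stay under a request budget:
// 1200 requests / 5 minutes => 250 ms between calls, sustained.
function minSpacingMs(budget: number, windowMs: number): number {
  return Math.ceil(windowMs / budget);
}

// Run async tasks sequentially with a fixed delay between them.
async function paced<T>(
  tasks: Array<() => Promise<T>>,
  spacingMs: number
): Promise<T[]> {
  const results: T[] = [];
  for (const task of tasks) {
    results.push(await task());
    await new Promise((resolve) => setTimeout(resolve, spacingMs));
  }
  return results;
}

// minSpacingMs(1200, 5 * 60 * 1000) === 250
```

The SDKs already retry 429s with backoff (see gotchas.md), so pacing like this is belt-and-braces for large batch jobs rather than a replacement for retry handling.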
+ +## Authentication Methods + +| Method | Security | Use Case | Scope | +|--------|----------|----------|-------| +| **API Token** ✓ | Scoped, rotatable | Production | Per-zone or account | +| API Key + Email | Full account access | Legacy only | Everything | +| User Service Key | Limited | Origin CA certs only | Origin CA | + +**Always use API tokens** for new projects. + +## Rate Limits + +| Limit | Value | +|-------|-------| +| Per user/token | 1200 requests / 5 minutes | +| Per IP | 200 requests / second | +| GraphQL | 320 / 5 minutes (cost-based) | + +## Reading Order + +| Task | Files to Read | +|------|---------------| +| Initialize SDK client | api.md | +| Configure auth/timeout/retry | configuration.md | +| Find usage patterns | patterns.md | +| Debug errors/rate limits | gotchas.md | +| Product-specific APIs | ../workers/, ../r2/, ../kv/, etc. | + +## In This Reference + +- **[api.md](api.md)** - SDK client initialization, pagination, error handling, examples +- **[configuration.md](configuration.md)** - Environment variables, SDK config, Wrangler setup +- **[patterns.md](patterns.md)** - Real-world patterns, batch operations, workflows +- **[gotchas.md](gotchas.md)** - Rate limits, SDK-specific issues, troubleshooting + +## See Also + +- [Cloudflare API Docs](https://developers.cloudflare.com/api/) +- [Bindings Reference](../bindings/) - Workers runtime bindings (preferred over REST API) +- [Wrangler Reference](../wrangler/) - CLI tool for Cloudflare development diff --git a/cloudflare/references/api/api.md b/cloudflare/references/api/api.md new file mode 100644 index 0000000..3371014 --- /dev/null +++ b/cloudflare/references/api/api.md @@ -0,0 +1,204 @@ +# API Reference + +## Client Initialization + +### TypeScript + +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ + apiToken: process.env.CLOUDFLARE_API_TOKEN, +}); +``` + +### Python + +```python +from cloudflare import Cloudflare + +client = 
Cloudflare(api_token=os.environ.get("CLOUDFLARE_API_TOKEN")) + +# For async: +from cloudflare import AsyncCloudflare +client = AsyncCloudflare(api_token=os.environ["CLOUDFLARE_API_TOKEN"]) +``` + +### Go + +```go +import ( + "github.com/cloudflare/cloudflare-go/v4" + "github.com/cloudflare/cloudflare-go/v4/option" +) + +client := cloudflare.NewClient( + option.WithAPIToken(os.Getenv("CLOUDFLARE_API_TOKEN")), +) +``` + +## Authentication + +### API Token (Recommended) + +**Create token**: Dashboard → My Profile → API Tokens → Create Token + +```bash +export CLOUDFLARE_API_TOKEN='your-token-here' + +curl "https://api.cloudflare.com/client/v4/zones" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" +``` + +**Token scopes**: Always use minimal permissions (zone-specific, time-limited). + +### API Key (Legacy) + +```bash +curl "https://api.cloudflare.com/client/v4/zones" \ + --header "X-Auth-Email: user@example.com" \ + --header "X-Auth-Key: $CLOUDFLARE_API_KEY" +``` + +**Not recommended:** Full account access, cannot scope permissions. + +## Auto-Pagination + +All SDKs support automatic pagination for list operations. 
+ +```typescript +// TypeScript: for await...of +for await (const zone of client.zones.list()) { + console.log(zone.id); +} +``` + +```python +# Python: iterator protocol +for zone in client.zones.list(): + print(zone.id) +``` + +```go +// Go: ListAutoPaging +iter := client.Zones.ListAutoPaging(ctx, cloudflare.ZoneListParams{}) +for iter.Next() { + zone := iter.Current() + fmt.Println(zone.ID) +} +``` + +## Error Handling + +```typescript +try { + const zone = await client.zones.get({ zone_id: 'xxx' }); +} catch (err) { + if (err instanceof Cloudflare.NotFoundError) { + // 404 + } else if (err instanceof Cloudflare.RateLimitError) { + // 429 - SDK auto-retries with backoff + } else if (err instanceof Cloudflare.APIError) { + console.log(err.status, err.message); + } +} +``` + +**Common Error Types:** +- `AuthenticationError` (401) - Invalid token +- `PermissionDeniedError` (403) - Insufficient scope +- `NotFoundError` (404) - Resource not found +- `RateLimitError` (429) - Rate limit exceeded +- `InternalServerError` (≥500) - Cloudflare error + +## Zone Management + +```typescript +// List zones +const zones = await client.zones.list({ + account: { id: 'account-id' }, + status: 'active', +}); + +// Create zone +const zone = await client.zones.create({ + account: { id: 'account-id' }, + name: 'example.com', + type: 'full', // or 'partial' +}); + +// Update zone +await client.zones.edit('zone-id', { + paused: false, +}); + +// Delete zone +await client.zones.delete('zone-id'); +``` + +```go +// Go: requires cloudflare.F() wrapper +zone, err := client.Zones.New(ctx, cloudflare.ZoneNewParams{ + Account: cloudflare.F(cloudflare.ZoneNewParamsAccount{ + ID: cloudflare.F("account-id"), + }), + Name: cloudflare.F("example.com"), + Type: cloudflare.F(cloudflare.ZoneNewParamsTypeFull), +}) +``` + +## DNS Management + +```typescript +// Create DNS record +await client.dns.records.create({ + zone_id: 'zone-id', + type: 'A', + name: 'subdomain.example.com', + content: 
'192.0.2.1', + ttl: 1, // auto + proxied: true, // Orange cloud +}); + +// List DNS records (with auto-pagination) +for await (const record of client.dns.records.list({ + zone_id: 'zone-id', + type: 'A', +})) { + console.log(record.name, record.content); +} + +// Update DNS record +await client.dns.records.update({ + zone_id: 'zone-id', + dns_record_id: 'record-id', + type: 'A', + name: 'subdomain.example.com', + content: '203.0.113.1', + proxied: true, +}); + +// Delete DNS record +await client.dns.records.delete({ + zone_id: 'zone-id', + dns_record_id: 'record-id', +}); +``` + +```python +# Python example +client.dns.records.create( + zone_id="zone-id", + type="A", + name="subdomain.example.com", + content="192.0.2.1", + ttl=1, + proxied=True, +) +``` + +## See Also + +- [configuration.md](./configuration.md) - SDK configuration, environment variables +- [patterns.md](./patterns.md) - Real-world patterns and workflows +- [gotchas.md](./gotchas.md) - Rate limits, troubleshooting diff --git a/cloudflare/references/api/configuration.md b/cloudflare/references/api/configuration.md new file mode 100644 index 0000000..4d4a299 --- /dev/null +++ b/cloudflare/references/api/configuration.md @@ -0,0 +1,160 @@ +# Configuration + +## Environment Variables + +### Set Variables + +| Platform | Command | +|----------|---------| +| Linux/macOS | `export CLOUDFLARE_API_TOKEN='token'` | +| PowerShell | `$env:CLOUDFLARE_API_TOKEN = 'token'` | +| Windows CMD | `set CLOUDFLARE_API_TOKEN=token` | + +**Security:** Never commit tokens. Use `.env` files (gitignored) or secret managers. 
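One way to enforce this is a fail-fast check at startup, so a missing token surfaces immediately instead of as a 401 on the first request. A minimal sketch (the whitespace check is only a heuristic for copy/paste mistakes; Cloudflare does not document a token format):

```typescript
// Fail fast on a missing or obviously mangled token.
function requireToken(env: Record<string, string | undefined>): string {
  const token = env.CLOUDFLARE_API_TOKEN;
  if (!token) {
    throw new Error('CLOUDFLARE_API_TOKEN is not set');
  }
  if (/\s/.test(token)) {
    // Usually a copy/paste artifact (trailing newline, wrapped line).
    throw new Error('CLOUDFLARE_API_TOKEN contains whitespace');
  }
  return token;
}

// Usage: new Cloudflare({ apiToken: requireToken(process.env) })
```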
+ +### .env File Pattern + +```bash +# .env (add to .gitignore) +CLOUDFLARE_API_TOKEN=your-token-here +CLOUDFLARE_ACCOUNT_ID=your-account-id +``` + +```typescript +// TypeScript +import 'dotenv/config'; + +const client = new Cloudflare({ + apiToken: process.env.CLOUDFLARE_API_TOKEN, +}); +``` + +```python +# Python +from dotenv import load_dotenv +load_dotenv() + +client = Cloudflare(api_token=os.environ["CLOUDFLARE_API_TOKEN"]) +``` + +## SDK Configuration + +### TypeScript + +```typescript +const client = new Cloudflare({ + apiToken: process.env.CLOUDFLARE_API_TOKEN, + timeout: 120000, // 2 min (default 60s), in milliseconds + maxRetries: 5, // default 2 + baseURL: 'https://...', // proxy (rare) +}); + +// Per-request overrides +await client.zones.get( + { zone_id: 'zone-id' }, + { timeout: 5000, maxRetries: 0 } +); +``` + +### Python + +```python +client = Cloudflare( + api_token=os.environ["CLOUDFLARE_API_TOKEN"], + timeout=120, # seconds (default 60) + max_retries=5, # default 2 + base_url="https://...", # proxy (rare) +) + +# Per-request overrides +client.with_options(timeout=5, max_retries=0).zones.get(zone_id="zone-id") +``` + +### Go + +```go +client := cloudflare.NewClient( + option.WithAPIToken(os.Getenv("CLOUDFLARE_API_TOKEN")), + option.WithMaxRetries(5), // default 10 (higher than TS/Python) + option.WithRequestTimeout(2 * time.Minute), // default 60s + option.WithBaseURL("https://..."), // proxy (rare) +) + +// Per-request overrides +client.Zones.Get(ctx, "zone-id", option.WithMaxRetries(0)) +``` + +## Configuration Options + +| Option | TypeScript | Python | Go | Default | +|--------|-----------|--------|-----|---------| +| Timeout | `timeout` (ms) | `timeout` (s) | `WithRequestTimeout` | 60s | +| Retries | `maxRetries` | `max_retries` | `WithMaxRetries` | 2 (Go: 10) | +| Base URL | `baseURL` | `base_url` | `WithBaseURL` | api.cloudflare.com | + +**Note:** Go SDK has higher default retries (10) than TypeScript/Python (2). 
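When choosing a retry count, it helps to know what the worst case costs in wall-clock time. A rough model of capped exponential backoff (the SDKs' actual retry schedule is internal; the base delay and cap below are assumptions for illustration only):

```typescript
// Total worst-case wait added by n retries with capped exponential backoff.
// baseMs and capMs are illustrative values, not documented SDK constants.
function worstCaseBackoffMs(retries: number, baseMs = 500, capMs = 8000): number {
  let total = 0;
  for (let attempt = 0; attempt < retries; attempt++) {
    total += Math.min(baseMs * Math.pow(2, attempt), capMs);
  }
  return total;
}

// worstCaseBackoffMs(2)  === 1500   (the TS/Python default)
// worstCaseBackoffMs(10) === 55500  (the Go default)
```

Under this model the Go default of 10 retries can add close to a minute of waiting before an error surfaces, which is worth remembering when deciding between high retry counts for batch jobs and `maxRetries: 0` for fast-fail, user-facing paths.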
+ +## Timeout Configuration + +**When to increase:** +- Large zone transfers +- Bulk DNS operations +- Worker script uploads + +```typescript +const client = new Cloudflare({ + timeout: 300000, // 5 minutes +}); +``` + +## Retry Configuration + +**When to increase:** Rate-limit-heavy workflows, flaky network + +**When to decrease:** Fast-fail requirements, user-facing requests + +```typescript +// Increase retries for batch operations +const client = new Cloudflare({ maxRetries: 10 }); + +// Disable retries for fast-fail +const fastClient = new Cloudflare({ maxRetries: 0 }); +``` + +## Wrangler CLI Integration + +```bash +# Configure authentication +wrangler login +# Or +export CLOUDFLARE_API_TOKEN='token' + +# Common commands that use API +wrangler deploy # Uploads worker via API +wrangler kv:key put # KV operations +wrangler r2 bucket create # R2 operations +wrangler d1 execute # D1 operations +wrangler pages deploy # Pages operations + +# Get API configuration +wrangler whoami # Shows authenticated user +``` + +### wrangler.toml + +```toml +name = "my-worker" +main = "src/index.ts" +compatibility_date = "2024-01-01" +account_id = "your-account-id" + +# Can also use env vars: +# CLOUDFLARE_ACCOUNT_ID +# CLOUDFLARE_API_TOKEN +``` + +## See Also + +- [api.md](./api.md) - Client initialization, authentication +- [gotchas.md](./gotchas.md) - Rate limits, timeout errors +- [Wrangler Reference](../wrangler/) - CLI tool details diff --git a/cloudflare/references/api/gotchas.md b/cloudflare/references/api/gotchas.md new file mode 100644 index 0000000..e6666dc --- /dev/null +++ b/cloudflare/references/api/gotchas.md @@ -0,0 +1,225 @@ +# Gotchas & Troubleshooting + +## Rate Limits & 429 Errors + +**Actual Limits:** +- **1200 requests / 5 minutes** per user/token (global) +- **200 requests / second** per IP address +- **GraphQL: 320 / 5 minutes** (cost-based) + +**SDK Behavior:** +- Auto-retry with exponential backoff (default 2 retries, Go: 10) +- Respects `Retry-After` 
header +- Throws `RateLimitError` after exhausting retries + +**Solution:** + +```typescript +// Increase retries for rate-limit-heavy workflows +const client = new Cloudflare({ maxRetries: 5 }); + +// Add application-level throttling +import pLimit from 'p-limit'; +const limit = pLimit(10); // Max 10 concurrent requests +``` + +## SDK-Specific Issues + +### Go: Required Field Wrapper + +**Problem:** Go SDK requires `cloudflare.F()` wrapper for optional fields. + +```go +// ❌ WRONG - Won't compile or send field +client.Zones.New(ctx, cloudflare.ZoneNewParams{ + Name: "example.com", +}) + +// ✅ CORRECT +client.Zones.New(ctx, cloudflare.ZoneNewParams{ + Name: cloudflare.F("example.com"), + Account: cloudflare.F(cloudflare.ZoneNewParamsAccount{ + ID: cloudflare.F("account-id"), + }), +}) +``` + +**Why:** Distinguishes between zero value, null, and omitted fields. + +### Python: Async vs Sync Clients + +**Problem:** Using sync client in async context or vice versa. + +```python +# ❌ WRONG - Can't await sync client +from cloudflare import Cloudflare +client = Cloudflare() +await client.zones.list() # TypeError + +# ✅ CORRECT - Use AsyncCloudflare +from cloudflare import AsyncCloudflare +client = AsyncCloudflare() +await client.zones.list() +``` + +## Token Permission Errors (403) + +**Problem:** API returns 403 Forbidden despite valid token. + +**Cause:** Token lacks required permissions (scope). + +**Scopes Required:** + +| Operation | Required Scope | +|-----------|----------------| +| List zones | Zone:Read (zone-level or account-level) | +| Create zone | Zone:Edit (account-level) | +| Edit DNS | DNS:Edit (zone-level) | +| Deploy Worker | Workers Script:Edit (account-level) | +| Read KV | Workers KV Storage:Read | +| Write KV | Workers KV Storage:Edit | + +**Solution:** Re-create token with correct permissions in Dashboard → My Profile → API Tokens. + +## Pagination Truncation + +**Problem:** Only getting first 20 results (default page size). 
+ +**Solution:** Use auto-pagination iterators. + +```typescript +// ❌ WRONG - Only first page (20 items) +const page = await client.zones.list(); + +// ✅ CORRECT - All results +const zones = []; +for await (const zone of client.zones.list()) { + zones.push(zone); +} +``` + +## Workers Subrequests + +**Problem:** Rate limit hit faster than expected in Workers. + +**Cause:** Workers subrequests count as separate API calls. + +**Solution:** Use bindings instead of REST API in Workers (see ../bindings/). + +```typescript +// ❌ WRONG - REST API in Workers (counts against rate limit) +const client = new Cloudflare({ apiToken: env.CLOUDFLARE_API_TOKEN }); +const zones = await client.zones.list(); + +// ✅ CORRECT - Use bindings (no rate limit) +// Access via env.MY_BINDING +``` + +## Authentication Errors (401) + +**Problem:** "Authentication failed" or "Invalid token" + +**Causes:** +- Token expired +- Token deleted/revoked +- Token not set in environment +- Wrong token format + +**Solution:** + +```typescript +// Verify token is set +if (!process.env.CLOUDFLARE_API_TOKEN) { + throw new Error('CLOUDFLARE_API_TOKEN not set'); +} + +// Test token +const user = await client.user.tokens.verify(); +console.log('Token valid:', user.status); +``` + +## Timeout Errors + +**Problem:** Request times out (default 60s). + +**Cause:** Large operations (bulk DNS, zone transfers). + +**Solution:** Increase timeout or split operations. + +```typescript +// Increase timeout +const client = new Cloudflare({ + timeout: 300000, // 5 minutes +}); + +// Or split operations +const batchSize = 100; +for (let i = 0; i < records.length; i += batchSize) { + const batch = records.slice(i, i + batchSize); + await processBatch(batch); +} +``` + +## Zone Not Found (404) + +**Problem:** Zone ID valid but returns 404. 
+ +**Causes:** +- Zone not in account associated with token +- Zone deleted +- Wrong zone ID format + +**Solution:** + +```typescript +// List all zones to find correct ID +for await (const zone of client.zones.list()) { + console.log(zone.id, zone.name); +} +``` + +## Limits Reference + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| API rate limit | 1200/5min | Per user/token | +| IP rate limit | 200/sec | Per IP | +| GraphQL rate limit | 320/5min | Cost-based | +| Parallel requests (recommended) | < 10 | Avoid overwhelming API | +| Default page size | 20 | Use auto-pagination | +| Max page size | 50 | Some endpoints | + +## Best Practices + +**Security:** +- Never commit tokens +- Use minimal permissions +- Rotate tokens regularly +- Set token expiration + +**Performance:** +- Batch operations +- Use pagination wisely +- Cache responses +- Handle rate limits + +**Code Organization:** + +```typescript +// Create reusable client instance +export const cfClient = new Cloudflare({ + apiToken: process.env.CLOUDFLARE_API_TOKEN, + maxRetries: 5, +}); + +// Wrap common operations +export async function getZoneDetails(zoneId: string) { + return await cfClient.zones.get({ zone_id: zoneId }); +} +``` + +## See Also + +- [api.md](./api.md) - Error types, authentication +- [configuration.md](./configuration.md) - Timeout/retry configuration +- [patterns.md](./patterns.md) - Error handling patterns diff --git a/cloudflare/references/api/patterns.md b/cloudflare/references/api/patterns.md new file mode 100644 index 0000000..3c7c693 --- /dev/null +++ b/cloudflare/references/api/patterns.md @@ -0,0 +1,204 @@ +# Common Patterns + +## List All with Auto-Pagination + +**Problem:** API returns paginated results. Default page size is 20. + +**Solution:** Use SDK auto-pagination to iterate all results. 
+ +```typescript +// TypeScript +for await (const zone of client.zones.list()) { + console.log(zone.name); +} +``` + +```python +# Python +for zone in client.zones.list(): + print(zone.name) +``` + +```go +// Go +iter := client.Zones.ListAutoPaging(ctx, cloudflare.ZoneListParams{}) +for iter.Next() { + fmt.Println(iter.Current().Name) +} +``` + +## Error Handling with Retry + +**Problem:** Rate limits (429) and transient errors need retry. + +**Solution:** SDKs auto-retry with exponential backoff. Customize as needed. + +```typescript +// Increase retries for rate-limit-heavy operations +const client = new Cloudflare({ maxRetries: 5 }); + +try { + const zone = await client.zones.create({ /* ... */ }); +} catch (err) { + if (err instanceof Cloudflare.RateLimitError) { + // Already retried 5 times with backoff + const retryAfter = err.headers['retry-after']; + console.log(`Rate limited. Retry after ${retryAfter}s`); + } +} +``` + +## Batch Parallel Operations + +**Problem:** Need to create multiple resources quickly. + +**Solution:** Use `Promise.all()` for parallel requests (respect rate limits). 
+ +```typescript +// Create multiple DNS records in parallel +const records = ['www', 'api', 'cdn'].map(subdomain => + client.dns.records.create({ + zone_id: 'zone-id', + type: 'A', + name: `${subdomain}.example.com`, + content: '192.0.2.1', + }) +); +await Promise.all(records); +``` + +**Controlled concurrency** (avoid rate limits): + +```typescript +import pLimit from 'p-limit'; +const limit = pLimit(10); // Max 10 concurrent + +const subdomains = ['www', 'api', 'cdn', /* many more */]; +const records = subdomains.map(subdomain => + limit(() => client.dns.records.create({ + zone_id: 'zone-id', + type: 'A', + name: `${subdomain}.example.com`, + content: '192.0.2.1', + })) +); +await Promise.all(records); +``` + +## Zone CRUD Workflow + +```typescript +// Create +const zone = await client.zones.create({ + account: { id: 'account-id' }, + name: 'example.com', + type: 'full', +}); + +// Read +const fetched = await client.zones.get({ zone_id: zone.id }); + +// Update +await client.zones.edit(zone.id, { paused: false }); + +// Delete +await client.zones.delete(zone.id); +``` + +## DNS Bulk Update + +```typescript +// Fetch all A records +const records = []; +for await (const record of client.dns.records.list({ + zone_id: 'zone-id', + type: 'A', +})) { + records.push(record); +} + +// Update all to new IP +await Promise.all(records.map(record => + client.dns.records.update({ + zone_id: 'zone-id', + dns_record_id: record.id, + type: 'A', + name: record.name, + content: '203.0.113.1', // New IP + proxied: record.proxied, + ttl: record.ttl, + }) +)); +``` + +## Filter and Collect Results + +```typescript +// Find all proxied A records +const proxiedRecords = []; +for await (const record of client.dns.records.list({ + zone_id: 'zone-id', + type: 'A', +})) { + if (record.proxied) { + proxiedRecords.push(record); + } +} +``` + +## Error Recovery Pattern + +```typescript +async function createZoneWithRetry(name: string, maxAttempts = 3) { + for (let attempt = 1; attempt <= 
maxAttempts; attempt++) { + try { + return await client.zones.create({ + account: { id: 'account-id' }, + name, + type: 'full', + }); + } catch (err) { + if (err instanceof Cloudflare.RateLimitError && attempt < maxAttempts) { + const retryAfter = parseInt(err.headers['retry-after'] || '5'); + console.log(`Rate limited, waiting ${retryAfter}s (retry ${attempt}/${maxAttempts})`); + await new Promise(resolve => setTimeout(resolve, retryAfter * 1000)); + } else { + throw err; + } + } + } +} +``` + +## Conditional Update Pattern + +```typescript +// Only update if zone is active +const zone = await client.zones.get({ zone_id: 'zone-id' }); +if (zone.status === 'active') { + await client.zones.edit(zone.id, { paused: false }); +} +``` + +## Batch with Error Handling + +```typescript +// Process multiple zones, continue on errors +const results = await Promise.allSettled( + zoneIds.map(id => client.zones.get({ zone_id: id })) +); + +results.forEach((result, i) => { + if (result.status === 'fulfilled') { + console.log(`Zone ${i}: ${result.value.name}`); + } else { + console.error(`Zone ${i} failed:`, result.reason.message); + } +}); +``` + +## See Also + +- [api.md](./api.md) - SDK client initialization, basic operations +- [gotchas.md](./gotchas.md) - Rate limits, common errors +- [configuration.md](./configuration.md) - SDK configuration options diff --git a/cloudflare/references/argo-smart-routing/README.md b/cloudflare/references/argo-smart-routing/README.md new file mode 100644 index 0000000..ef5539d --- /dev/null +++ b/cloudflare/references/argo-smart-routing/README.md @@ -0,0 +1,90 @@ +# Cloudflare Argo Smart Routing Skill Reference + +## Overview + +Cloudflare Argo Smart Routing is a performance optimization service that detects real-time network issues and routes web traffic across the most efficient network path. It continuously monitors network conditions and intelligently routes traffic through the fastest, most reliable routes in Cloudflare's network. 
+ +**Note on Smart Shield:** Argo Smart Routing is being integrated into Cloudflare's Smart Shield product for enhanced DDoS protection and performance. Existing Argo customers maintain full functionality with gradual migration to Smart Shield features. + +## Quick Start + +### Enable via cURL +```bash +curl -X PATCH "https://api.cloudflare.com/client/v4/zones/{zone_id}/argo/smart_routing" \ + -H "Authorization: Bearer YOUR_API_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"value": "on"}' +``` + +### Enable via TypeScript SDK +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ apiToken: process.env.CLOUDFLARE_API_TOKEN }); + +const result = await client.argo.smartRouting.edit({ + zone_id: 'your-zone-id', + value: 'on', +}); + +console.log(`Argo enabled: ${result.value}`); +``` + +## Core Concepts + +### What It Does +- **Intelligent routing**: Detects congestion, outages, packet loss in real-time +- **Global optimization**: Routes across 300+ Cloudflare data centers +- **Automatic failover**: Switches paths when issues detected (typically <1s) +- **Works with existing setup**: No origin changes required + +### Billing Model +- Usage-based: Charged per GB of traffic (excluding DDoS/WAF mitigated traffic) +- Requires billing configuration before enabling +- Available on Enterprise+ plans (check zone eligibility) + +### When to Use +- **High-traffic production sites** with global user base +- **Latency-sensitive applications** (APIs, real-time services) +- **Sites behind Cloudflare proxy** (orange-clouded DNS records) +- **Combined with Tiered Cache** for maximum performance gains + +### When NOT to Use +- Development/staging environments (cost control) +- Low-traffic sites (<1TB/month) where cost may exceed benefit +- Sites with primarily single-region traffic + +## Should I Enable Argo? 
+ +| Your Situation | Recommendation | +|----------------|----------------| +| Global production app, >1TB/month traffic | ✅ Enable - likely ROI positive | +| Enterprise plan, latency-critical APIs | ✅ Enable - performance matters | +| Regional site, <100GB/month traffic | ⚠️ Evaluate - cost may not justify | +| Development/staging environment | ❌ Disable - use in production only | +| Not yet configured billing | ❌ Configure billing first | + +## Reading Order by Task + +| Your Goal | Start With | Then Read | +|-----------|------------|-----------| +| Enable Argo for first time | Quick Start above → [configuration.md](configuration.md) | [gotchas.md](gotchas.md) | +| Use TypeScript/Python SDK | [api.md](api.md) | [patterns.md](patterns.md) | +| Terraform/IaC setup | [configuration.md](configuration.md) | - | +| Enable for Spectrum TCP app | [patterns.md](patterns.md) → Spectrum section | [api.md](api.md) | +| Troubleshoot enablement issue | [gotchas.md](gotchas.md) | [api.md](api.md) | +| Manage billing/usage | [patterns.md](patterns.md) → Billing section | [gotchas.md](gotchas.md) | + +## In This Reference + +- **[api.md](api.md)** - API endpoints, SDK methods, error handling, Python/TypeScript examples +- **[configuration.md](configuration.md)** - Terraform setup, environment config, billing configuration +- **[patterns.md](patterns.md)** - Tiered Cache integration, Spectrum TCP apps, billing management, validation patterns +- **[gotchas.md](gotchas.md)** - Common errors, permission issues, limits, best practices + +## See Also + +- [Cloudflare Argo Smart Routing Docs](https://developers.cloudflare.com/argo-smart-routing/) +- [Cloudflare Smart Shield](https://developers.cloudflare.com/smart-shield/) +- [Spectrum Documentation](https://developers.cloudflare.com/spectrum/) +- [Tiered Cache](https://developers.cloudflare.com/cache/how-to/tiered-cache/) diff --git a/cloudflare/references/argo-smart-routing/api.md b/cloudflare/references/argo-smart-routing/api.md new 
file mode 100644 index 0000000..5cd8f5e --- /dev/null +++ b/cloudflare/references/argo-smart-routing/api.md @@ -0,0 +1,240 @@ +## API Reference + +**Note on Smart Shield:** Argo Smart Routing is being integrated into Cloudflare's Smart Shield product. API endpoints remain stable; existing integrations continue to work without changes. + +### Base Endpoint +``` +https://api.cloudflare.com/client/v4 +``` + +### Authentication +Use API tokens with Zone:Argo Smart Routing:Edit permissions: + +```bash +# Headers required +X-Auth-Email: user@example.com +Authorization: Bearer YOUR_API_TOKEN +``` + +### Get Argo Smart Routing Status + +**Endpoint:** `GET /zones/{zone_id}/argo/smart_routing` + +**Description:** Retrieves current Argo Smart Routing enablement status. + +**cURL Example:** +```bash +curl -X GET "https://api.cloudflare.com/client/v4/zones/{zone_id}/argo/smart_routing" \ + -H "Authorization: Bearer YOUR_API_TOKEN" \ + -H "Content-Type: application/json" +``` + +**Response:** +```json +{ + "result": { + "id": "smart_routing", + "value": "on", + "editable": true, + "modified_on": "2024-01-11T12:00:00Z" + }, + "success": true, + "errors": [], + "messages": [] +} +``` + +**TypeScript SDK Example:** +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ + apiToken: process.env.CLOUDFLARE_API_TOKEN +}); + +const status = await client.argo.smartRouting.get({ zone_id: 'your-zone-id' }); +console.log(`Argo status: ${status.value}, editable: ${status.editable}`); +``` + +**Python SDK Example:** +```python +from cloudflare import Cloudflare + +client = Cloudflare(api_token=os.environ.get('CLOUDFLARE_API_TOKEN')) + +status = client.argo.smart_routing.get(zone_id='your-zone-id') +print(f"Argo status: {status.value}, editable: {status.editable}") +``` + +### Update Argo Smart Routing Status + +**Endpoint:** `PATCH /zones/{zone_id}/argo/smart_routing` + +**Description:** Enable or disable Argo Smart Routing for a zone. 
+ +**Request Body:** +```json +{ + "value": "on" // or "off" +} +``` + +**cURL Example:** +```bash +curl -X PATCH "https://api.cloudflare.com/client/v4/zones/{zone_id}/argo/smart_routing" \ + -H "Authorization: Bearer YOUR_API_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"value": "on"}' +``` + +**TypeScript SDK Example:** +```typescript +const result = await client.argo.smartRouting.edit({ + zone_id: 'your-zone-id', + value: 'on', +}); +console.log(`Updated: ${result.value} at ${result.modified_on}`); +``` + +**Python SDK Example:** +```python +result = client.argo.smart_routing.edit( + zone_id='your-zone-id', + value='on' +) +print(f"Updated: {result.value} at {result.modified_on}") +``` + +## Checking Editability Before Updates + +**Critical:** Always check the `editable` field before attempting to enable/disable Argo. When `editable: false`, the zone has restrictions (billing not configured, insufficient permissions, or plan limitations). + +**Pattern:** +```typescript +async function safelyEnableArgo(client: Cloudflare, zoneId: string): Promise<boolean> { + const status = await client.argo.smartRouting.get({ zone_id: zoneId }); + + if (!status.editable) { + console.error('Cannot modify Argo: editable=false (check billing/permissions)'); + return false; + } + + if (status.value === 'on') { + console.log('Argo already enabled'); + return true; + } + + await client.argo.smartRouting.edit({ zone_id: zoneId, value: 'on' }); + console.log('Argo enabled successfully'); + return true; +} +``` + +**Python Pattern:** +```python +def safely_enable_argo(client: Cloudflare, zone_id: str) -> bool: + status = client.argo.smart_routing.get(zone_id=zone_id) + + if not status.editable: + print('Cannot modify Argo: editable=false (check billing/permissions)') + return False + + if status.value == 'on': + print('Argo already enabled') + return True + + client.argo.smart_routing.edit(zone_id=zone_id, value='on') + print('Argo enabled successfully') + return True +``` + +## Error 
Handling + +The TypeScript SDK provides typed error classes for robust error handling: + +```typescript +import Cloudflare from 'cloudflare'; +import { APIError, APIConnectionError, RateLimitError } from 'cloudflare'; + +async function enableArgoWithErrorHandling(client: Cloudflare, zoneId: string) { + try { + const result = await client.argo.smartRouting.edit({ + zone_id: zoneId, + value: 'on', + }); + return result; + } catch (error) { + if (error instanceof RateLimitError) { + console.error('Rate limited. Retry after:', error.response?.headers.get('retry-after')); + // Implement exponential backoff + } else if (error instanceof APIError) { + console.error('API error:', error.status, error.message); + if (error.status === 403) { + console.error('Permission denied - check API token scopes'); + } else if (error.status === 400) { + console.error('Bad request - verify zone_id and payload'); + } + } else if (error instanceof APIConnectionError) { + console.error('Connection failed:', error.message); + // Retry with exponential backoff + } else { + console.error('Unexpected error:', error); + } + throw error; + } +} +``` + +**Python Error Handling:** +```python +from cloudflare import Cloudflare, APIError, RateLimitError + +def enable_argo_with_error_handling(client: Cloudflare, zone_id: str): + try: + result = client.argo.smart_routing.edit(zone_id=zone_id, value='on') + return result + except RateLimitError as e: + print(f"Rate limited. 
Retry after: {e.response.headers.get('retry-after')}") + raise + except APIError as e: + print(f"API error: {e.status} - {e.message}") + if e.status == 403: + print('Permission denied - check API token scopes') + elif e.status == 400: + print('Bad request - verify zone_id and payload') + raise + except Exception as e: + print(f"Unexpected error: {e}") + raise +``` + +## Response Schema + +All Argo Smart Routing API responses follow this structure: + +```typescript +interface ArgoSmartRoutingResponse { + result: { + id: 'smart_routing'; + value: 'on' | 'off'; + editable: boolean; + modified_on: string; // ISO 8601 timestamp + }; + success: boolean; + errors: Array<{ + code: number; + message: string; + }>; + messages: Array<string>; +} +``` + +## Key Response Fields + +| Field | Type | Description | +|-------|------|-------------| +| `value` | `"on" \| "off"` | Current enablement status | +| `editable` | `boolean` | Whether changes are allowed (check before PATCH) | +| `modified_on` | `string` | ISO timestamp of last modification | +| `success` | `boolean` | Whether request succeeded | +| `errors` | `Array` | Error details if `success: false` | \ No newline at end of file diff --git a/cloudflare/references/argo-smart-routing/configuration.md b/cloudflare/references/argo-smart-routing/configuration.md new file mode 100644 index 0000000..ba94d38 --- /dev/null +++ b/cloudflare/references/argo-smart-routing/configuration.md @@ -0,0 +1,197 @@ +## Configuration Management + +**Note on Smart Shield Evolution:** Argo Smart Routing is being integrated into Smart Shield. Configuration methods below remain valid; Terraform and IaC patterns unchanged. 
+ +### Infrastructure as Code (Terraform) + +```hcl +# terraform/argo.tf +# Note: Use Cloudflare Terraform provider + +resource "cloudflare_argo" "example" { + zone_id = var.zone_id + smart_routing = "on" + tiered_caching = "on" +} + +variable "zone_id" { + description = "Cloudflare Zone ID" + type = string +} + +output "argo_enabled" { + value = cloudflare_argo.example.smart_routing + description = "Argo Smart Routing status" +} +``` + +### Environment-Based Configuration + +```typescript +// config/argo.ts +interface ArgoEnvironmentConfig { + enabled: boolean; + tieredCache: boolean; + monitoring: { + usageAlerts: boolean; + threshold: number; + }; +} + +const configs: Record<string, ArgoEnvironmentConfig> = { + production: { + enabled: true, + tieredCache: true, + monitoring: { + usageAlerts: true, + threshold: 1000, // GB + }, + }, + staging: { + enabled: true, + tieredCache: false, + monitoring: { + usageAlerts: false, + threshold: 100, // GB + }, + }, + development: { + enabled: false, + tieredCache: false, + monitoring: { + usageAlerts: false, + threshold: 0, + }, + }, +}; + +export function getArgoConfig(env: string): ArgoEnvironmentConfig { + return configs[env] || configs.development; +} +``` + +### Pulumi Configuration + +```typescript +// pulumi/argo.ts +import * as cloudflare from '@pulumi/cloudflare'; + +const zone = new cloudflare.Zone('example-zone', { + zone: 'example.com', + plan: 'enterprise', +}); + +const argoSettings = new cloudflare.Argo('argo-config', { + zoneId: zone.id, + smartRouting: 'on', + tieredCaching: 'on', +}); + +export const argoEnabled = argoSettings.smartRouting; +export const zoneId = zone.id; +``` + +## Billing Configuration + +Before enabling Argo Smart Routing, ensure billing is configured for the account: + +**Prerequisites:** +1. Valid payment method on file +2. Enterprise or higher plan +3. Zone must have billing enabled + +**Check Billing Status via Dashboard:** +1. Navigate to Account → Billing +2. Verify payment method configured +3. 
Check zone subscription status + +**Note:** Attempting to enable Argo without billing configured will result in `editable: false` in API responses. + +## Environment Variable Setup + +**Required Environment Variables:** +```bash +# .env +CLOUDFLARE_API_TOKEN=your_api_token_here +CLOUDFLARE_ZONE_ID=your_zone_id_here +CLOUDFLARE_ACCOUNT_ID=your_account_id_here + +# Optional +ARGO_ENABLED=true +ARGO_TIERED_CACHE=true +``` + +**TypeScript Configuration Loader:** +```typescript +// config/env.ts +import { z } from 'zod'; + +const envSchema = z.object({ + CLOUDFLARE_API_TOKEN: z.string().min(1), + CLOUDFLARE_ZONE_ID: z.string().min(1), + CLOUDFLARE_ACCOUNT_ID: z.string().min(1), + ARGO_ENABLED: z.string().optional().default('false'), + ARGO_TIERED_CACHE: z.string().optional().default('false'), +}); + +export const env = envSchema.parse(process.env); + +export const argoConfig = { + enabled: env.ARGO_ENABLED === 'true', + tieredCache: env.ARGO_TIERED_CACHE === 'true', +}; +``` + +## CI/CD Integration + +**GitHub Actions Example:** +```yaml +# .github/workflows/deploy-argo.yml +name: Deploy Argo Configuration + +on: + push: + branches: [main] + paths: + - 'terraform/argo.tf' + +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + + - name: Setup Terraform + uses: hashicorp/setup-terraform@v2 + + - name: Terraform Init + run: terraform init + working-directory: ./terraform + + - name: Terraform Apply + run: terraform apply -auto-approve + working-directory: ./terraform + env: + CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }} + TF_VAR_zone_id: ${{ secrets.CLOUDFLARE_ZONE_ID }} +``` + +## Enterprise Preview Program + +For early access to Argo Smart Routing features and Smart Shield integration: + +**Eligibility:** +- Enterprise plan customers +- Active Cloudflare support contract +- Production traffic >100GB/month + +**How to Join:** +1. Contact Cloudflare account team or support +2. Request Argo/Smart Shield preview access +3. 
Receive preview zone configuration + +**Preview Features:** +- Enhanced analytics and reporting +- Smart Shield DDoS integration +- Advanced routing policies +- Priority support for routing issues \ No newline at end of file diff --git a/cloudflare/references/argo-smart-routing/gotchas.md b/cloudflare/references/argo-smart-routing/gotchas.md new file mode 100644 index 0000000..8d011d3 --- /dev/null +++ b/cloudflare/references/argo-smart-routing/gotchas.md @@ -0,0 +1,111 @@ +## Best Practices Summary + +**Smart Shield Note:** Argo Smart Routing is evolving into Smart Shield. Best practices below remain applicable; monitor the Cloudflare changelog for Smart Shield updates. + +1. **Always check editability** before attempting to enable/disable Argo +2. **Set up billing notifications** to avoid unexpected costs +3. **Combine with Tiered Cache** for maximum performance benefit +4. **Use in production only** - disable for dev/staging to control costs +5. **Monitor analytics** - detailed metrics require 500+ requests in 48h +6. **Handle errors gracefully** - check for billing, permissions, zone compatibility +7. **Test configuration changes** in staging before production +8. **Use TypeScript SDK** for type safety and better developer experience +9. **Implement retry logic** for API calls in production systems +10. **Document zone-specific settings** for team visibility + +## Common Errors + +### "Argo unavailable" + +**Problem:** API returns error "Argo Smart Routing is unavailable for this zone" + +**Cause:** Zone not eligible or billing not set up + +**Solution:** +1. Verify zone has Enterprise or higher plan +2. Check billing is configured in Account → Billing +3. Ensure payment method is valid and current +4. Contact Cloudflare support if eligibility unclear + +### "Cannot enable/disable" + +**Problem:** API call succeeds but status remains unchanged, or `editable: false` in GET response + +**Cause:** Insufficient permissions or zone restrictions + +**Solution:** +1. 
Check API token has `Zone:Argo Smart Routing:Edit` permission +2. Verify `editable: true` in GET response before attempting PATCH +3. If `editable: false`, check: + - Billing configured for account + - Zone plan includes Argo (Enterprise+) + - No active zone holds or suspensions + - API token has correct scopes + +### `editable: false` Error + +**Problem:** GET request returns `"editable": false`, preventing enable/disable + +**Cause:** Zone-level restrictions from billing, plan, or permissions + +**Solution Pattern:** +```typescript +const status = await client.argo.smartRouting.get({ zone_id: zoneId }); + +if (!status.editable) { + // Don't attempt to modify - will fail + console.error('Cannot modify Argo settings:'); + console.error('- Check billing is configured'); + console.error('- Verify zone has Enterprise+ plan'); + console.error('- Confirm API token has Edit permission'); + throw new Error('Argo is not editable for this zone'); +} + +// Safe to proceed with enable/disable +await client.argo.smartRouting.edit({ zone_id: zoneId, value: 'on' }); +``` + +### Rate Limiting + +**Problem:** `429 Too Many Requests` error from API + +**Cause:** Exceeded API rate limits (typically 1200 requests per 5 minutes) + +**Solution:** +```typescript +import { RateLimitError } from 'cloudflare'; + +try { + await client.argo.smartRouting.edit({ zone_id: zoneId, value: 'on' }); +} catch (error) { + if (error instanceof RateLimitError) { + const retryAfter = error.response?.headers.get('retry-after'); + console.log(`Rate limited. 
Retry after ${retryAfter} seconds`); + + // Implement exponential backoff (retry-after is a header string, so coerce it) + await new Promise(resolve => setTimeout(resolve, Number(retryAfter ?? 60) * 1000)); + // Retry request + } +} +``` + +## Limits + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| Min requests for analytics | 500 in 48h | For detailed metrics via GraphQL | +| Zones supported | Enterprise+ | Check zone plan in dashboard | +| Billing requirement | Must be configured | Before enabling; verify payment method | +| API rate limit | 1200 req / 5 min | Per API token across all endpoints | +| Spectrum apps | No hard limit | Each app can enable Argo independently | +| Traffic counting | Proxied only | Only orange-clouded DNS records count | +| DDoS/WAF exemption | Yes | Mitigated traffic excluded from billing | +| Analytics latency | 1-5 minutes | Real-time metrics not available | + +## Additional Resources + +- [Official Argo Smart Routing Docs](https://developers.cloudflare.com/argo-smart-routing/) +- [Cloudflare Smart Shield](https://developers.cloudflare.com/smart-shield/) +- [API Authentication](https://developers.cloudflare.com/fundamentals/api/get-started/create-token/) +- [Cloudflare TypeScript SDK](https://github.com/cloudflare/cloudflare-typescript) +- [Cloudflare Python SDK](https://github.com/cloudflare/cloudflare-python) diff --git a/cloudflare/references/argo-smart-routing/patterns.md b/cloudflare/references/argo-smart-routing/patterns.md new file mode 100644 index 0000000..1173b2d --- /dev/null +++ b/cloudflare/references/argo-smart-routing/patterns.md @@ -0,0 +1,104 @@ +# Integration Patterns + +## Enable Argo + Tiered Cache + +```typescript +async function enableOptimalPerformance(client: Cloudflare, zoneId: string) { + await Promise.all([ + client.argo.smartRouting.edit({ zone_id: zoneId, value: 'on' }), + client.argo.tieredCaching.edit({ zone_id: zoneId, value: 'on' }), + ]); +} +``` + +**Flow:** Visitor → Edge (Lower-Tier) → [Cache Miss] → Upper-Tier → [Cache 
Miss + Argo] → Origin + +**Impact:** Argo ~30% latency reduction + Tiered Cache 50-80% origin offload + +## Usage Analytics (GraphQL) + +```graphql +query ArgoAnalytics($zoneTag: string!) { + viewer { + zones(filter: { zoneTag: $zoneTag }) { + httpRequestsAdaptiveGroups(limit: 1000) { + sum { argoBytes, bytes } + } + } + } +} +``` + +**Billing:** ~$0.10/GB. DDoS-mitigated and WAF-blocked traffic NOT charged. + +## Spectrum TCP Integration + +Enable Argo for non-HTTP traffic (databases, game servers, IoT): + +```typescript +// Update existing app +await client.spectrum.apps.update(appId, { zone_id: zoneId, argo_smart_routing: true }); + +// Create new app with Argo +await client.spectrum.apps.create({ + zone_id: zoneId, + dns: { type: 'CNAME', name: 'tcp.example.com' }, + origin_direct: ['tcp://origin.example.com:3306'], + protocol: 'tcp/3306', + argo_smart_routing: true, +}); +``` + +**Use cases:** MySQL/PostgreSQL (3306/5432), game servers, MQTT (1883), SSH (22) + +## Pre-Flight Validation + +```typescript +async function validateArgoEligibility(client: Cloudflare, zoneId: string) { + const status = await client.argo.smartRouting.get({ zone_id: zoneId }); + const zone = await client.zones.get({ zone_id: zoneId }); + + const issues: string[] = []; + if (!status.editable) issues.push('Zone not editable'); + if (['free', 'pro'].includes(zone.plan.legacy_id)) issues.push('Requires Business+ plan'); + if (zone.status !== 'active') issues.push('Zone not active'); + + return { canEnable: issues.length === 0, issues }; +} +``` + +## Post-Enable Verification + +```typescript +async function verifyArgoEnabled(client: Cloudflare, zoneId: string): Promise<boolean> { + await new Promise(r => setTimeout(r, 2000)); // Wait for propagation + const status = await client.argo.smartRouting.get({ zone_id: zoneId }); + return status.value === 'on'; +} +``` + +## Full Setup Pattern + +```typescript +async function setupArgo(client: Cloudflare, zoneId: string) { + // 1. 
Validate + const { canEnable, issues } = await validateArgoEligibility(client, zoneId); + if (!canEnable) throw new Error(issues.join(', ')); + + // 2. Enable both features + await Promise.all([ + client.argo.smartRouting.edit({ zone_id: zoneId, value: 'on' }), + client.argo.tieredCaching.edit({ zone_id: zoneId, value: 'on' }), + ]); + + // 3. Verify + const [argo, cache] = await Promise.all([ + client.argo.smartRouting.get({ zone_id: zoneId }), + client.argo.tieredCaching.get({ zone_id: zoneId }), + ]); + + return { argo: argo.value === 'on', tieredCache: cache.value === 'on' }; +} +``` + +**When to combine:** High-traffic sites (>1TB/mo), global users, cacheable content. diff --git a/cloudflare/references/bindings/README.md b/cloudflare/references/bindings/README.md new file mode 100644 index 0000000..fc3e8eb --- /dev/null +++ b/cloudflare/references/bindings/README.md @@ -0,0 +1,122 @@ +# Cloudflare Bindings Skill Reference + +Expert guidance on Cloudflare Workers Bindings - the runtime APIs that connect Workers to Cloudflare platform resources. + +## What Are Bindings? + +Bindings are how Workers access Cloudflare resources (storage, compute, services) via the `env` object. They're configured in `wrangler.jsonc`, type-safe via TypeScript, and zero-overhead at runtime. + +## Reading Order + +1. **This file** - Binding catalog and selection guide +2. **[api.md](api.md)** - TypeScript types and env access patterns +3. **[configuration.md](configuration.md)** - Complete wrangler.jsonc examples +4. **[patterns.md](patterns.md)** - Best practices and common patterns +5. 
**[gotchas.md](gotchas.md)** - Critical pitfalls and troubleshooting + +## Binding Catalog + +### Storage Bindings + +| Binding | Use Case | Access Pattern | +|---------|----------|----------------| +| **KV** | Key-value cache, CDN-backed reads | `env.MY_KV.get(key)` | +| **R2** | Object storage (S3-compatible) | `env.MY_BUCKET.get(key)` | +| **D1** | SQL database (SQLite) | `env.DB.prepare(sql).all()` | +| **Durable Objects** | Coordination, real-time state | `env.MY_DO.get(id)` | +| **Vectorize** | Vector embeddings search | `env.VECTORIZE.query(vector)` | +| **Queues** | Async message processing | `env.MY_QUEUE.send(msg)` | + +### Compute Bindings + +| Binding | Use Case | Access Pattern | +|---------|----------|----------------| +| **Service** | Worker-to-Worker RPC | `env.MY_SERVICE.fetch(req)` | +| **Workers AI** | LLM inference | `env.AI.run(model, input)` | +| **Browser Rendering** | Headless Chrome | `env.BROWSER.fetch(url)` | + +### Platform Bindings + +| Binding | Use Case | Access Pattern | +|---------|----------|----------------| +| **Analytics Engine** | Custom metrics | `env.ANALYTICS.writeDataPoint(data)` | +| **mTLS** | Client certificates | `env.MY_CERT` (string) | +| **Hyperdrive** | Database pooling | `env.HYPERDRIVE.connectionString` | +| **Rate Limiting** | Request throttling | `env.RATE_LIMITER.limit(id)` | +| **Workflows** | Long-running workflows | `env.MY_WORKFLOW.create()` | + +### Configuration Bindings + +| Binding | Use Case | Access Pattern | +|---------|----------|----------------| +| **Environment Variables** | Non-sensitive config | `env.API_URL` (string) | +| **Secrets** | Sensitive values | `env.API_KEY` (string) | +| **Text/Data Blobs** | Static files | `env.MY_BLOB` (string) | +| **WASM** | WebAssembly modules | `env.MY_WASM` (WebAssembly.Module) | + +## Quick Selection Guide + +**Need persistent storage?** +- Key-value < 25MB → **KV** +- Files/objects → **R2** +- Relational data → **D1** +- Real-time coordination → **Durable 
Objects** + +**Need AI/compute?** +- LLM inference → **Workers AI** +- Scraping/PDFs → **Browser Rendering** +- Call another Worker → **Service binding** + +**Need async processing?** +- Background jobs → **Queues** + +**Need config?** +- Public values → **Environment Variables** +- Secrets → **Secrets** (never commit) + +## Quick Start + +1. **Add binding to wrangler.jsonc:** +```jsonc +{ + "kv_namespaces": [ + { "binding": "MY_KV", "id": "your-kv-id" } + ] +} +``` + +2. **Generate types:** +```bash +npx wrangler types +``` + +3. **Access in Worker:** +```typescript +export default { + async fetch(request, env, ctx) { + await env.MY_KV.put('key', 'value'); + return new Response('OK'); + } +} +``` + +## Type Safety + +Bindings are fully typed via `wrangler types`. See [api.md](api.md) for details. + +## Limits + +- 64 bindings max per Worker (all types combined) +- See [gotchas.md](gotchas.md) for per-binding limits + +## Key Concepts + +**Zero-overhead access:** Bindings compiled into Worker, no network calls to access +**Type-safe:** Full TypeScript support via `wrangler types` +**Per-environment:** Different IDs for dev/staging/production +**Secrets vs Vars:** Secrets encrypted at rest, never in config files + +## See Also + +- [Cloudflare Docs: Bindings](https://developers.cloudflare.com/workers/runtime-apis/bindings/) +- [Wrangler Configuration](https://developers.cloudflare.com/workers/wrangler/configuration/) diff --git a/cloudflare/references/bindings/api.md b/cloudflare/references/bindings/api.md new file mode 100644 index 0000000..a8ab13b --- /dev/null +++ b/cloudflare/references/bindings/api.md @@ -0,0 +1,203 @@ +# Bindings API Reference + +## TypeScript Types + +Cloudflare generates binding types via `npx wrangler types`. This creates `.wrangler/types/runtime.d.ts` with your Env interface. 
+ +### Generated Env Interface + +After running `wrangler types`, TypeScript knows your bindings: + +```typescript +interface Env { + // From wrangler.jsonc bindings + MY_KV: KVNamespace; + MY_BUCKET: R2Bucket; + DB: D1Database; + MY_SERVICE: Fetcher; + AI: Ai; + + // From vars + API_URL: string; + + // From secrets (set via wrangler secret put) + API_KEY: string; +} +``` + +### Binding Types + +| Config | TypeScript Type | Package | +|--------|-----------------|---------| +| `kv_namespaces` | `KVNamespace` | `@cloudflare/workers-types` | +| `r2_buckets` | `R2Bucket` | `@cloudflare/workers-types` | +| `d1_databases` | `D1Database` | `@cloudflare/workers-types` | +| `durable_objects.bindings` | `DurableObjectNamespace` | `@cloudflare/workers-types` | +| `vectorize` | `VectorizeIndex` | `@cloudflare/workers-types` | +| `queues.producers` | `Queue` | `@cloudflare/workers-types` | +| `services` | `Fetcher` | `@cloudflare/workers-types` | +| `ai` | `Ai` | `@cloudflare/workers-types` | +| `browser` | `Fetcher` | `@cloudflare/workers-types` | +| `analytics_engine_datasets` | `AnalyticsEngineDataset` | `@cloudflare/workers-types` | +| `hyperdrive` | `Hyperdrive` | `@cloudflare/workers-types` | +| `rate_limiting` | `RateLimit` | `@cloudflare/workers-types` | +| `workflows` | `Workflow` | `@cloudflare/workers-types` | +| `mtls_certificates` / `vars` / `text_blobs` / `data_blobs` | `string` | Built-in | +| `wasm_modules` | `WebAssembly.Module` | Built-in | + +## Accessing Bindings + +### Method 1: fetch() Handler (Recommended) + +```typescript +export default { + async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> { + const value = await env.MY_KV.get('key'); + return new Response(value); + } +} +``` + +**Why:** Type-safe, aligns with Workers API, supports ctx for waitUntil/passThroughOnException. 
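Since Method 1 mentions `ctx.waitUntil` without showing it, here is a minimal sketch of scheduling background work that outlives the response. The `MY_KV` binding shape, the `last-url` key, and the inline `Ctx` type are illustrative assumptions so the snippet stands alone outside a Workers project; in a real Worker you would use the generated `Env` and the built-in `ExecutionContext`.

```typescript
// Sketch: ctx.waitUntil lets the KV write continue after the Response is
// returned, without delaying the client. MY_KV and 'last-url' are hypothetical.
interface HitEnv {
  MY_KV: { put(key: string, value: string): Promise<void> };
}

// Structural stand-in for the Workers ExecutionContext type.
interface Ctx {
  waitUntil(promise: Promise<unknown>): void;
}

const worker = {
  async fetch(request: Request, env: HitEnv, ctx: Ctx): Promise<Response> {
    // Respond immediately; the KV write is detached background work.
    ctx.waitUntil(env.MY_KV.put('last-url', request.url));
    return new Response('OK');
  },
};

export default worker;
```

Without `waitUntil`, the runtime may cancel in-flight work once the response is returned, so fire-and-forget promises are not safe on their own.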
### Method 2: Hono Framework + +```typescript +import { Hono } from 'hono'; + +const app = new Hono<{ Bindings: Env }>(); + +app.get('/', async (c) => { + const value = await c.env.MY_KV.get('key'); + return c.json({ value }); +}); + +export default app; +``` + +**Why:** c.env auto-typed, ergonomic for routing-heavy apps. + +### Method 3: Service Worker Syntax (Legacy) + +```typescript +export async function handleRequest(request: Request, env: Env): Promise<Response> { + const value = await env.MY_KV.get('key'); + return new Response(value); +} + +addEventListener('fetch', (event) => { + // env not directly available - requires workarounds +}); +``` + +**Avoid:** Use fetch() handler instead (Method 1). + +## Type Generation Workflow + +### Initial Setup + +```bash +# Install wrangler +npm install -D wrangler + +# Generate types from wrangler.jsonc +npx wrangler types +``` + +### After Changing Bindings + +```bash +# Added/modified binding in wrangler.jsonc +npx wrangler types + +# TypeScript now sees updated Env interface +``` + +**Note:** `wrangler types` outputs to `.wrangler/types/runtime.d.ts`. TypeScript picks this up automatically if `@cloudflare/workers-types` is in `tsconfig.json` `"types"` array. 
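The tsconfig note above can be made concrete. A minimal `tsconfig.json` sketch follows; only the `types` entry is required for the workflow described here, and the remaining compiler options are illustrative defaults for a Workers project, not prescribed values.

```jsonc
// tsconfig.json (sketch) - only "types" matters for picking up binding types;
// the other options are common, assumed Workers-project defaults.
{
  "compilerOptions": {
    "target": "es2022",
    "module": "es2022",
    "moduleResolution": "bundler",
    "strict": true,
    "types": ["@cloudflare/workers-types"]
  }
}
```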
+ +## Key Binding Methods + +**KV:** +```typescript +await env.MY_KV.get(key, { type: 'json' }); // text|json|arrayBuffer|stream +await env.MY_KV.put(key, value, { expirationTtl: 3600 }); +await env.MY_KV.delete(key); +await env.MY_KV.list({ prefix: 'user:' }); +``` + +**R2:** +```typescript +await env.BUCKET.get(key); +await env.BUCKET.put(key, value); +await env.BUCKET.delete(key); +await env.BUCKET.list({ prefix: 'images/' }); +``` + +**D1:** +```typescript +await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first(); +await env.DB.batch([stmt1, stmt2]); +``` + +**Service:** +```typescript +await env.MY_SERVICE.fetch(new Request('https://fake/path')); +``` + +**Workers AI:** +```typescript +await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { prompt: 'Hello' }); +``` + +**Queues:** +```typescript +await env.MY_QUEUE.send({ userId: 123, action: 'process' }); +``` + +**Durable Objects:** +```typescript +const id = env.MY_DO.idFromName('user-123'); +const stub = env.MY_DO.get(id); +await stub.fetch(new Request('https://fake/increment')); +``` + +## Runtime vs Build-Time Types + +| Type Source | When Generated | Use Case | +|-------------|----------------|----------| +| `@cloudflare/workers-types` | npm install | Base Workers APIs (Request, Response, etc.) | +| `wrangler types` | After config change | Your specific bindings (Env interface) | + +**Install both:** +```bash +npm install -D @cloudflare/workers-types +npx wrangler types +``` + +## Type Safety Best Practices + +1. **Never use `any` for env:** +```typescript +// ❌ BAD +async fetch(request: Request, env: any) { } + +// ✅ GOOD +async fetch(request: Request, env: Env) { } +``` + +2. **Run wrangler types after config changes:** +```bash +# After editing wrangler.jsonc +npx wrangler types +``` + +3. 
**Check generated types match config:** +```bash +# View generated Env interface +cat .wrangler/types/runtime.d.ts +``` + +## See Also + +- [Workers Types Package](https://www.npmjs.com/package/@cloudflare/workers-types) +- [Wrangler Types Command](https://developers.cloudflare.com/workers/wrangler/commands/#types) \ No newline at end of file diff --git a/cloudflare/references/bindings/configuration.md b/cloudflare/references/bindings/configuration.md new file mode 100644 index 0000000..9573500 --- /dev/null +++ b/cloudflare/references/bindings/configuration.md @@ -0,0 +1,188 @@ +# Binding Configuration Reference + +## Storage Bindings + +```jsonc +{ + "kv_namespaces": [{ "binding": "MY_KV", "id": "..." }], + "r2_buckets": [{ "binding": "MY_BUCKET", "bucket_name": "my-bucket" }], + "d1_databases": [{ "binding": "DB", "database_name": "my-db", "database_id": "..." }], + "durable_objects": { "bindings": [{ "name": "MY_DO", "class_name": "MyDO" }] }, + "vectorize": [{ "binding": "VECTORIZE", "index_name": "my-index" }], + "queues": { "producers": [{ "binding": "MY_QUEUE", "queue": "my-queue" }] } +} +``` + +**Create commands:** +```bash +npx wrangler kv namespace create MY_KV +npx wrangler r2 bucket create my-bucket +npx wrangler d1 create my-db +npx wrangler vectorize create my-index --dimensions=768 --metric=cosine +npx wrangler queues create my-queue + +# List existing resources +npx wrangler kv namespace list +npx wrangler r2 bucket list +npx wrangler d1 list +npx wrangler vectorize list +npx wrangler queues list +``` + +## Compute Bindings + +```jsonc +{ + "services": [{ + "binding": "MY_SERVICE", + "service": "other-worker", + "environment": "production" // Optional: target specific env + }], + "ai": { "binding": "AI" }, + "browser": { "binding": "BROWSER" }, + "workflows": [{ "binding": "MY_WORKFLOW", "name": "my-workflow" }] +} +``` + +**Create workflows:** +```bash +npx wrangler workflows create my-workflow +``` + +## Platform Bindings + +```jsonc +{ + 
"analytics_engine_datasets": [{ "binding": "ANALYTICS" }], + "mtls_certificates": [{ "binding": "MY_CERT", "certificate_id": "..." }], + "hyperdrive": [{ "binding": "HYPERDRIVE", "id": "..." }], + "unsafe": { + "bindings": [{ "name": "RATE_LIMITER", "type": "ratelimit", "namespace_id": "..." }] + } +} +``` + +## Configuration Bindings + +```jsonc +{ + "vars": { + "API_URL": "https://api.example.com", + "MAX_RETRIES": "3" + }, + "text_blobs": { "MY_TEXT": "./data/template.html" }, + "data_blobs": { "MY_DATA": "./data/config.bin" }, + "wasm_modules": { "MY_WASM": "./build/module.wasm" } +} +``` + +**Secrets (never in config):** +```bash +npx wrangler secret put API_KEY +``` + +## Environment-Specific Configuration + +```jsonc +{ + "name": "my-worker", + "vars": { "ENV": "production" }, + "kv_namespaces": [{ "binding": "CACHE", "id": "prod-kv-id" }], + + "env": { + "staging": { + "vars": { "ENV": "staging" }, + "kv_namespaces": [{ "binding": "CACHE", "id": "staging-kv-id" }] + } + } +} +``` + +**Deploy:** +```bash +npx wrangler deploy # Production +npx wrangler deploy --env staging +``` + +## Local Development + +```jsonc +{ + "kv_namespaces": [{ + "binding": "MY_KV", + "id": "prod-id", + "preview_id": "dev-id" // Used in wrangler dev + }] +} +``` + +**Or use remote:** +```bash +npx wrangler dev --remote # Uses production bindings +``` + +## Complete Example + +```jsonc +{ + "$schema": "./node_modules/wrangler/config-schema.json", + "name": "my-app", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", + + "vars": { "API_URL": "https://api.example.com" }, + "kv_namespaces": [{ "binding": "CACHE", "id": "abc123" }], + "r2_buckets": [{ "binding": "ASSETS", "bucket_name": "my-assets" }], + "d1_databases": [{ "binding": "DB", "database_name": "my-db", "database_id": "xyz789" }], + "services": [{ "binding": "AUTH", "service": "auth-worker" }], + "ai": { "binding": "AI" } +} +``` + +## Binding-Specific Configuration + +### Durable Objects with Class Export + 
+```jsonc +{ + "durable_objects": { + "bindings": [ + { "name": "COUNTER", "class_name": "Counter", "script_name": "my-worker" } + ] + } +} +``` + +```typescript +// In same Worker or script_name Worker +export class Counter { + constructor(private state: DurableObjectState, private env: Env) {} + async fetch(request: Request) { /* ... */ } +} +``` + +### Queue Consumers + +```jsonc +{ + "queues": { + "producers": [{ "binding": "MY_QUEUE", "queue": "my-queue" }], + "consumers": [{ "queue": "my-queue", "max_batch_size": 10 }] + } +} +``` + +Queue consumer handler: `export default { async queue(batch, env) { /* process batch.messages */ } }` + +## Key Points + +- **64 binding limit** (all types combined) +- **Secrets**: Always use `wrangler secret put`, never commit +- **Types**: Run `npx wrangler types` after config changes +- **Environments**: Use `env` field for staging/production variants +- **Development**: Use `preview_id` or `--remote` flag +- **IDs vs Names**: Some bindings use `id` (KV, D1), others use `name` (R2, Queues) + +## See Also + +- [Wrangler Configuration](https://developers.cloudflare.com/workers/wrangler/configuration/) \ No newline at end of file diff --git a/cloudflare/references/bindings/gotchas.md b/cloudflare/references/bindings/gotchas.md new file mode 100644 index 0000000..90aca1a --- /dev/null +++ b/cloudflare/references/bindings/gotchas.md @@ -0,0 +1,208 @@ +# Binding Gotchas and Troubleshooting + +## Critical: Global Scope Mutation + +### ❌ THE #1 GOTCHA: Caching env in Global Scope + +```typescript +// ❌ DANGEROUS - env cached at deploy time +const apiKey = env.API_KEY; // ERROR: env not available in global scope + +export default { + async fetch(request: Request, env: Env) { + // Uses undefined or stale value! 
+ } +} +``` + +**Why it breaks:** +- `env` not available in global scope +- If using workarounds, secrets may not update without redeployment +- Leads to "Cannot read property 'X' of undefined" errors + +**✅ Always access env per-request:** +```typescript +export default { + async fetch(request: Request, env: Env) { + const apiKey = env.API_KEY; // Fresh every request + } +} +``` + +## Common Errors + +### "env.MY_KV is undefined" + +**Cause:** Name mismatch or not configured +**Solution:** Check wrangler.jsonc (case-sensitive), run `npx wrangler types`, verify `npx wrangler kv namespace list` + +### "Property 'MY_KV' does not exist on type 'Env'" + +**Cause:** Types not generated +**Solution:** `npx wrangler types` + +### "preview_id is required for --remote" + +**Cause:** Missing preview binding +**Solution:** Add `"preview_id": "dev-id"` or use `npx wrangler dev` (local mode) + +### "Secret updated but Worker still uses old value" + +**Cause:** Cached in global scope or not redeployed +**Solution:** Avoid global caching, redeploy after secret change + +### "KV get() returns null for existing key" + +**Cause:** Eventual consistency (60s), wrong namespace, wrong environment +**Solution:** +```bash +# Check key exists +npx wrangler kv key get --binding=MY_KV "your-key" + +# Verify namespace ID +npx wrangler kv namespace list + +# Check environment +npx wrangler deployments list +``` + +### "D1 database not found" + +**Solution:** `npx wrangler d1 list`, verify ID in wrangler.jsonc + +### "Service binding returns 'No such service'" + +**Cause:** Target Worker not deployed, name mismatch, environment mismatch +**Solution:** +```bash +# List deployed Workers +npx wrangler deployments list --name=target-worker + +# Check service binding config +cat wrangler.jsonc | grep -A2 services + +# Deploy target first +cd ../target-worker && npx wrangler deploy +``` + +### "Rate limit exceeded" on KV writes + +**Cause:** >1 write/second per key +**Solution:** Use different keys, 
Durable Objects, or Queues + +## Type Safety Gotchas + +### Missing @cloudflare/workers-types + +**Error:** `Cannot find name 'Request'` +**Solution:** `npm install -D @cloudflare/workers-types`, add to tsconfig.json `"types"` + +### Binding Type Mismatches + +```typescript +// ❌ Wrong - KV returns string | null +const value: string = await env.MY_KV.get('key'); + +// ✅ Handle null +const value = await env.MY_KV.get('key'); +if (!value) return new Response('Not found', { status: 404 }); +``` + +## Environment Gotchas + +### Wrong Environment Deployed + +**Solution:** Check `npx wrangler deployments list`, use `--env` flag + +### Secrets Not Per-Environment + +**Solution:** Set per environment: `npx wrangler secret put API_KEY --env staging` + +## Development Gotchas + +**wrangler dev vs deploy:** +- dev: Uses `preview_id` or local bindings, secrets not available +- deploy: Uses production `id`, secrets available + +**Access secrets in dev:** `npx wrangler dev --remote` +**Persist local data:** `npx wrangler dev --persist` + +## Performance Gotchas + +### Sequential Binding Calls + +```typescript +// ❌ Slow +const user = await env.DB.prepare('...').first(); +const config = await env.MY_KV.get('config'); + +// ✅ Parallel +const [user, config] = await Promise.all([ + env.DB.prepare('...').first(), + env.MY_KV.get('config') +]); +``` + +## Security Gotchas + +**❌ Secrets in logs:** `console.log('Key:', env.API_KEY)` - visible in dashboard +**✅** `console.log('Key:', env.API_KEY ? 
'***' : 'missing')` + +**❌ Exposing env:** `return Response.json(env)` - exposes all bindings +**✅** Never return env object in responses + +## Limits Reference + +| Resource | Limit | Impact | Plan | +|----------|-------|--------|------| +| **Bindings per Worker** | 64 total | All binding types combined | All | +| **Environment variables** | 64 max, 5KB each | Per Worker | All | +| **Secret size** | 1KB | Per secret | All | +| **KV key size** | 512 bytes | UTF-8 encoded | All | +| **KV value size** | 25 MB | Per value | All | +| **KV writes per key** | 1/second | Per key; exceeding = 429 error | All | +| **KV list() results** | 1000 keys | Per call; use cursor for more | All | +| **KV operations** | 1000 reads/day | Free tier only | Free | +| **R2 object size** | 5 TB | Per object | All | +| **R2 operations** | 1M Class A/month free | Writes | All | +| **D1 database size** | 10 GB | Per database | All | +| **D1 rows per query** | 100,000 | Result set limit | All | +| **D1 databases** | 10 | Free tier | Free | +| **Queue batch size** | 100 messages | Per consumer batch | All | +| **Queue message size** | 128 KB | Per message | All | +| **Service binding calls** | Unlimited | Counts toward CPU time | All | +| **Durable Objects** | 1M requests/month free | First 1M | Free | + +## Debugging Tips + +```bash +# Check configuration +npx wrangler deploy --dry-run # Validate config without deploying +npx wrangler kv namespace list # List KV namespaces +npx wrangler secret list # List secrets (not values) +npx wrangler deployments list # Recent deployments + +# Inspect bindings +npx wrangler kv key list --binding=MY_KV +npx wrangler kv key get --binding=MY_KV "key-name" +npx wrangler r2 object get my-bucket/file.txt +npx wrangler d1 execute my-db --command="SELECT * FROM sqlite_master" + +# Test locally +npx wrangler dev # Local mode +npx wrangler dev --remote # Production bindings +npx wrangler dev --persist # Persist data across restarts + +# Verify types +npx wrangler 
types +cat .wrangler/types/runtime.d.ts | grep "interface Env" + +# Debug specific binding issues +npx wrangler tail # Stream logs in real-time +npx wrangler tail --format=pretty # Formatted logs +``` + +## See Also + +- [Workers Limits](https://developers.cloudflare.com/workers/platform/limits/) +- [Wrangler Commands](https://developers.cloudflare.com/workers/wrangler/commands/) diff --git a/cloudflare/references/bindings/patterns.md b/cloudflare/references/bindings/patterns.md new file mode 100644 index 0000000..759305d --- /dev/null +++ b/cloudflare/references/bindings/patterns.md @@ -0,0 +1,200 @@ +# Binding Patterns and Best Practices + +## Service Binding Patterns + +### RPC via Service Bindings + +```typescript +// auth-worker +export default { + async fetch(request: Request, env: Env) { + const token = request.headers.get('Authorization'); + return new Response(JSON.stringify({ valid: await validateToken(token) })); + } +} + +// api-worker +const response = await env.AUTH_SERVICE.fetch( + new Request('https://fake-host/validate', { + headers: { 'Authorization': token } + }) +); +``` + +**Why RPC?** Zero latency (same datacenter), no DNS, free, type-safe. + +**HTTP vs Service:** +```typescript +// ❌ HTTP (slow, paid, cross-region latency) +await fetch('https://auth-worker.example.com/validate'); + +// ✅ Service binding (fast, free, same isolate) +await env.AUTH_SERVICE.fetch(new Request('https://fake-host/validate')); +``` + +**URL doesn't matter:** Service bindings ignore hostname/protocol, routing happens via binding name. 
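This hostname-agnostic routing is easy to demonstrate with a stand-in for the bound service (a minimal sketch — `authService` and `boundPath` are hypothetical helpers, not Workers APIs):

```typescript
// Sketch: a Fetcher-like stub standing in for env.AUTH_SERVICE. As with a real
// service binding, dispatch happens via the binding itself, so only the path
// (plus method/headers/body) reaches the target Worker.
function boundPath(url: string): string {
  return new URL(url).pathname; // hostname and protocol are throwaway
}

const authService = {
  async fetch(request: Request): Promise<Response> {
    return new Response(JSON.stringify({ route: boundPath(request.url) }));
  },
};

async function demo(): Promise<void> {
  // Two different fake hosts, same path — identical result.
  const a = await authService.fetch(new Request('https://fake-host/validate'));
  const b = await authService.fetch(new Request('https://anything.invalid/validate'));
  console.log(await a.text(), await b.text());
}
demo();
```

In a real Worker the stub is replaced by `env.AUTH_SERVICE`; only the binding name in wrangler.jsonc determines where the request goes.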

### Typed Service RPC

```typescript
// shared-types.ts
export interface AuthRequest { token: string; }
export interface AuthResponse { valid: boolean; userId?: string; }

// auth-worker
export default {
  async fetch(request: Request): Promise<Response> {
    const body = await request.json() as AuthRequest;
    const response: AuthResponse = { valid: true, userId: '123' };
    return Response.json(response);
  }
}

// api-worker
const response = await env.AUTH_SERVICE.fetch(
  new Request('https://fake/validate', {
    method: 'POST',
    body: JSON.stringify({ token } satisfies AuthRequest)
  })
);
const data: AuthResponse = await response.json();
```

## Secrets Management

```bash
# Set secret
npx wrangler secret put API_KEY
cat api-key.txt | npx wrangler secret put API_KEY
npx wrangler secret put API_KEY --env staging
```

```typescript
// Use secret
const response = await fetch('https://api.example.com', {
  headers: { 'Authorization': `Bearer ${env.API_KEY}` }
});
```

**Never commit secrets:**
```jsonc
// ❌ NEVER
{ "vars": { "API_KEY": "sk_live_abc123" } }
```

## Testing with Mock Bindings

### Vitest Mock

```typescript
import { vi } from 'vitest';

const mockKV: KVNamespace = {
  get: vi.fn(async (key) => key === 'test' ?
'value' : null), + put: vi.fn(async () => {}), + delete: vi.fn(async () => {}), + list: vi.fn(async () => ({ keys: [], list_complete: true, cursor: '' })), + getWithMetadata: vi.fn(), +} as unknown as KVNamespace; + +const mockEnv: Env = { MY_KV: mockKV }; +const mockCtx: ExecutionContext = { + waitUntil: vi.fn(), + passThroughOnException: vi.fn(), +}; + +const response = await worker.fetch( + new Request('http://localhost/test'), + mockEnv, + mockCtx +); +``` + +## Binding Access Patterns + +### Lazy Access + +```typescript +// ✅ Access only when needed +if (url.pathname === '/cached') { + const cached = await env.MY_KV.get('data'); + if (cached) return new Response(cached); +} +``` + +### Parallel Access + +```typescript +// ✅ Parallelize independent calls +const [user, config, cache] = await Promise.all([ + env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first(), + env.MY_KV.get('config'), + env.CACHE.get('data') +]); +``` + +## Storage Selection + +### KV: CDN-Backed Reads + +```typescript +const config = await env.MY_KV.get('app-config', { type: 'json' }); +``` + +**Use when:** Read-heavy, <25MB, global distribution, eventual consistency OK +**Latency:** <10ms reads (cached), writes eventually consistent (60s) + +### D1: Relational Queries + +```typescript +const results = await env.DB.prepare(` + SELECT u.name, COUNT(o.id) FROM users u + LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.id +`).all(); +``` + +**Use when:** Relational data, JOINs, ACID transactions +**Limits:** 10GB database size, 100k rows per query + +### R2: Large Objects + +```typescript +const object = await env.MY_BUCKET.get('large-file.zip'); +return new Response(object.body); +``` + +**Use when:** Files >25MB, S3-compatible API needed +**Limits:** 5TB per object, unlimited storage + +### Durable Objects: Coordination + +```typescript +const id = env.COUNTER.idFromName('global'); +const stub = env.COUNTER.get(id); +await stub.fetch(new Request('https://fake/increment')); 
+``` + +**Use when:** Strong consistency, real-time coordination, WebSocket state +**Guarantees:** Single-threaded execution, transactional storage + +## Anti-Patterns + +**❌ Hardcoding credentials:** `const apiKey = 'sk_live_abc123'` +**✅** `npx wrangler secret put API_KEY` + +**❌ Using REST API:** `fetch('https://api.cloudflare.com/.../kv/...')` +**✅** `env.MY_KV.get('key')` + +**❌ Polling storage:** `setInterval(() => env.KV.get('config'), 1000)` +**✅** Use Durable Objects for real-time state + +**❌ Large data in vars:** `{ "vars": { "HUGE_CONFIG": "..." } }` (5KB max) +**✅** `env.MY_KV.put('config', data)` + +**❌ Caching env globally:** `const apiKey = env.API_KEY` outside fetch() +**✅** Access `env.API_KEY` per-request inside fetch() + +## See Also + +- [Service Bindings Docs](https://developers.cloudflare.com/workers/runtime-apis/bindings/service-bindings/) +- [Miniflare Testing](https://miniflare.dev/) \ No newline at end of file diff --git a/cloudflare/references/bot-management/README.md b/cloudflare/references/bot-management/README.md new file mode 100644 index 0000000..a76808d --- /dev/null +++ b/cloudflare/references/bot-management/README.md @@ -0,0 +1,94 @@ +# Cloudflare Bot Management + +Enterprise-grade bot detection, protection, and mitigation using ML/heuristics, bot scores, JavaScript detections, and verified bot handling. 
+ +## Overview + +Bot Management provides multi-tier protection: +- **Free (Bot Fight Mode)**: Auto-blocks definite bots, no config +- **Pro/Business (Super Bot Fight Mode)**: Configurable actions, static resource protection, analytics groupings +- **Enterprise (Bot Management)**: Granular 1-99 scores, WAF integration, JA3/JA4 fingerprinting, Workers API, Advanced Analytics + +## Quick Start + +```txt +# Dashboard: Security > Bots +# Enterprise: Deploy rule template +(cf.bot_management.score eq 1 and not cf.bot_management.verified_bot) → Block +(cf.bot_management.score le 29 and not cf.bot_management.verified_bot) → Managed Challenge +``` + +## What Do You Need? + +```txt +├─ Initial setup → configuration.md +│ ├─ Free tier → "Bot Fight Mode" +│ ├─ Pro/Business → "Super Bot Fight Mode" +│ └─ Enterprise → "Bot Management for Enterprise" +├─ Workers API integration → api.md +├─ WAF rules → patterns.md +├─ Debugging → gotchas.md +└─ Analytics → api.md#bot-analytics +``` + +## Reading Order + +| Task | Files to Read | +|------|---------------| +| Enable bot protection | README → configuration.md | +| Workers bot detection | README → api.md | +| WAF rule templates | README → patterns.md | +| Debug bot issues | gotchas.md | +| Advanced analytics | api.md#bot-analytics | + +## Core Concepts + +**Bot Scores**: 1-99 (1 = definitely automated, 99 = definitely human). Threshold: <30 indicates bot traffic. Enterprise gets granular 1-99; Pro/Business get groupings only. + +**Detection Engines**: Heuristics (known fingerprints, assigns score=1), ML (majority of detections, supervised learning on billions of requests), Anomaly Detection (optional, baseline traffic analysis), JavaScript Detections (headless browser detection). + +**Verified Bots**: Allowlisted good bots (search engines, AI crawlers) verified via reverse DNS or Web Bot Auth. Access via `cf.bot_management.verified_bot` or `cf.verified_bot_category`. 
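A minimal sketch of that score bucketing (illustrative only — not a Cloudflare API; the labels follow the Pro/Business groupings detailed in configuration.md):

```typescript
// Hypothetical helper: buckets a bot score into the groupings that
// Pro/Business plans expose (Enterprise sees the raw 1-99 value).
type Grouping = 'not computed' | 'automated' | 'likely automated' | 'likely human';

function botGrouping(score: number): Grouping {
  if (score === 0) return 'not computed';     // Bot Management did not run
  if (score === 1) return 'automated';        // heuristic match, definite bot
  if (score <= 29) return 'likely automated'; // ML detection, probable bot
  return 'likely human';                      // 30-99
}

console.log(botGrouping(1), botGrouping(15), botGrouping(85));
// automated likely automated likely human
```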

## Platform Limits

| Plan | Bot Scores | JA3/JA4 | Custom Rules | Analytics Retention |
|------|------------|---------|--------------|---------------------|
| Free | No (auto-block only) | No | 5 | N/A (no analytics) |
| Pro/Business | Groupings only | No | 20/100 | 30 days (72h at a time) |
| Enterprise | 1-99 granular | Yes | 1,000+ | 30 days (1 week at a time) |

## Basic Patterns

```typescript
// Workers: Check bot score
export default {
  async fetch(request: Request): Promise<Response> {
    const botScore = request.cf?.botManagement?.score;
    if (botScore && botScore < 30 && !request.cf?.botManagement?.verifiedBot) {
      return new Response('Bot detected', { status: 403 });
    }
    return fetch(request);
  }
};
```

```txt
# WAF: Block definite bots
(cf.bot_management.score eq 1 and not cf.bot_management.verified_bot)

# WAF: Protect sensitive endpoints
(cf.bot_management.score lt 50 and http.request.uri.path in {"/login" "/checkout"} and not cf.bot_management.verified_bot)
```

## In This Reference

- [configuration.md](./configuration.md) - Product tiers, WAF rule setup, JavaScript Detections, ML auto-updates
- [api.md](./api.md) - Workers BotManagement interface, WAF fields, JA4 Signals
- [patterns.md](./patterns.md) - E-commerce, API protection, mobile app allowlisting, SEO-friendly handling
- [gotchas.md](./gotchas.md) - False positives/negatives, score=0 issues, JSD limitations, CSP requirements

## See Also

- [waf](../waf/) - WAF custom rules for bot enforcement
- [workers](../workers/) - Workers request.cf.botManagement API
- [api-shield](../api-shield/) - API-specific bot protection
diff --git a/cloudflare/references/bot-management/api.md b/cloudflare/references/bot-management/api.md
new file mode 100644
index 0000000..eda958a
--- /dev/null
+++ b/cloudflare/references/bot-management/api.md
@@ -0,0 +1,169 @@
# Bot Management API

## Workers: BotManagement Interface

```typescript
interface BotManagement {
  score:
number;          // 1-99 (Enterprise), 0 if not computed
  verifiedBot: boolean;   // Is verified bot
  staticResource: boolean; // Serves static resource
  ja3Hash: string;        // JA3 fingerprint (Enterprise, HTTPS only)
  ja4: string;            // JA4 fingerprint (Enterprise, HTTPS only)
  jsDetection?: {
    passed: boolean;      // Passed JS detection (if enabled)
  };
  detectionIds: number[]; // Heuristic detection IDs
  corporateProxy?: boolean; // From corporate proxy (Enterprise)
}

// DEPRECATED: Use botManagement.score instead
// request.cf.clientTrustScore (legacy, duplicate of botManagement.score)

// Access via request.cf
import type { IncomingRequestCfProperties } from '@cloudflare/workers-types';

export default {
  async fetch(request: Request): Promise<Response> {
    const cf = request.cf as IncomingRequestCfProperties | undefined;
    const botMgmt = cf?.botManagement;

    if (!botMgmt) return fetch(request);
    if (botMgmt.verifiedBot) return fetch(request); // Allow verified bots
    if (botMgmt.score === 1) return new Response('Blocked', { status: 403 });
    if (botMgmt.score < 30) return new Response('Challenge required', { status: 429 });

    return fetch(request);
  }
};
```

## WAF Fields Reference

```txt
# Score fields
cf.bot_management.score                 # 0-99 (0 = not computed)
cf.bot_management.verified_bot          # boolean
cf.bot_management.static_resource       # boolean
cf.bot_management.ja3_hash              # string (Enterprise)
cf.bot_management.ja4                   # string (Enterprise)
cf.bot_management.detection_ids         # array
cf.bot_management.js_detection.passed   # boolean
cf.bot_management.corporate_proxy       # boolean (Enterprise)
cf.verified_bot_category                # string

# Workers equivalent
request.cf.botManagement.score
request.cf.botManagement.verifiedBot
request.cf.botManagement.ja3Hash
request.cf.botManagement.ja4
request.cf.botManagement.jsDetection.passed
request.cf.verifiedBotCategory
```

## JA4 Signals (Enterprise)

```typescript
import type { IncomingRequestCfProperties } from
'@cloudflare/workers-types';

interface JA4Signals {
  // Ratios (0.0-1.0)
  heuristic_ratio_1h?: number; // Fraction flagged by heuristics
  browser_ratio_1h?: number;   // Fraction from real browsers
  cache_ratio_1h?: number;     // Fraction hitting cache
  h2h3_ratio_1h?: number;      // Fraction using HTTP/2 or HTTP/3
  // Ranks (relative position in distribution)
  uas_rank_1h?: number;        // User-Agent diversity rank
  paths_rank_1h?: number;      // Path diversity rank
  reqs_rank_1h?: number;       // Request volume rank
  ips_rank_1h?: number;        // IP diversity rank
  // Quantiles (0.0-1.0, percentile in distribution)
  reqs_quantile_1h?: number;   // Request volume quantile
  ips_quantile_1h?: number;    // IP count quantile
}

export default {
  async fetch(request: Request): Promise<Response> {
    const cf = request.cf as IncomingRequestCfProperties | undefined;
    const ja4Signals = cf?.ja4Signals as JA4Signals | undefined;

    if (!ja4Signals) return fetch(request); // Not available for HTTP or Worker routing

    // Check for anomalous behavior
    // High heuristic_ratio or low browser_ratio = suspicious
    const heuristicRatio = ja4Signals.heuristic_ratio_1h ?? 0;
    const browserRatio = ja4Signals.browser_ratio_1h ?? 0;

    if (heuristicRatio > 0.5 || browserRatio < 0.3) {
      return new Response('Suspicious traffic', { status: 403 });
    }

    return fetch(request);
  }
};
```

## Common Patterns

See [patterns.md](./patterns.md) for Workers examples: mobile app allowlisting, corporate proxy exemption, datacenter detection, conditional delay, and more.
+ +## Bot Analytics + +### Access Locations +- Dashboard: Security > Bots (old) or Security > Analytics > Bot analysis (new) +- GraphQL API for programmatic access +- Security Events & Security Analytics +- Logpush/Logpull + +### Available Data +- **Enterprise BM**: Bot scores (1-99), bot score source, distribution +- **Pro/Business**: Bot groupings (automated, likely automated, likely human) +- Top attributes: IPs, paths, user agents, countries +- Detection sources: Heuristics, ML, AD, JSD +- Verified bot categories + +### Time Ranges +- **Enterprise BM**: Up to 1 week at a time, 30 days history +- **Pro/Business**: Up to 72 hours at a time, 30 days history +- Real-time in most cases, adaptive sampling (1-10% depending on volume) + +## Logpush Fields + +```txt +BotScore # 1-99 or 0 if not computed +BotScoreSrc # Detection engine (ML, Heuristics, etc.) +BotTags # Classification tags +BotDetectionIDs # Heuristic detection IDs +``` + +**BotScoreSrc values:** +- `"Heuristics"` - Known fingerprint +- `"Machine Learning"` - ML model +- `"Anomaly Detection"` - Baseline anomaly +- `"JS Detection"` - JavaScript check +- `"Cloudflare Service"` - Zero Trust +- `"Not Computed"` - Score = 0 + +Access via Logpush (stream to cloud storage/SIEM), Logpull (API to fetch logs), or GraphQL API (query analytics data). + +## Testing with Miniflare + +Miniflare provides mock botManagement data for local development: + +**Default values:** +- `score: 99` (human) +- `verifiedBot: false` +- `corporateProxy: false` +- `ja3Hash: "25b4882c2bcb50cd6b469ff28c596742"` +- `staticResource: false` +- `detectionIds: []` + +**Override in tests:** +```typescript +import { getPlatformProxy } from 'wrangler'; + +const { cf, dispose } = await getPlatformProxy(); +// cf.botManagement is frozen mock object +expect(cf.botManagement.score).toBe(99); +``` + +For custom test data, mock request.cf in your test setup. 
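One way to do that (a hedged sketch — `shouldBlock` is a hypothetical helper, not part of the Workers API) is to keep the decision logic in a pure function and feed it a hand-built `cf` object:

```typescript
// Hypothetical decision helper mirroring the README's basic pattern; the mock
// cf object below supplies only the botManagement fields under test.
interface BotManagementLike {
  score: number;
  verifiedBot: boolean;
}

function shouldBlock(bm: BotManagementLike | undefined): boolean {
  if (!bm || bm.score === 0) return false; // score not computed — don't block
  if (bm.verifiedBot) return false;        // always allow verified bots
  return bm.score < 30;                    // automated / likely automated
}

// Custom test data in place of Miniflare's frozen defaults:
const mockCf = { botManagement: { score: 5, verifiedBot: false } };
console.log(shouldBlock(mockCf.botManagement));              // true
console.log(shouldBlock({ score: 99, verifiedBot: false })); // false
console.log(shouldBlock({ score: 5, verifiedBot: true }));   // false
```

Because the function never touches `Request` or `env`, it runs in any test runner without the Workers runtime.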
diff --git a/cloudflare/references/bot-management/configuration.md b/cloudflare/references/bot-management/configuration.md new file mode 100644 index 0000000..4166bcc --- /dev/null +++ b/cloudflare/references/bot-management/configuration.md @@ -0,0 +1,163 @@ +# Bot Management Configuration + +## Product Tiers + +**Note:** Dashboard paths differ between old and new UI: +- **New:** Security > Settings > Filter "Bot traffic" +- **Old:** Security > Bots + +Both UIs access same settings. + +### Bot Score Groupings (Pro/Business) + +Pro/Business users see bot score groupings instead of granular 1-99 scores: + +| Score | Grouping | Meaning | +|-------|----------|---------| +| 0 | Not computed | Bot Management didn't run | +| 1 | Automated | Definite bot (heuristic match) | +| 2-29 | Likely automated | Probably bot (ML detection) | +| 30-99 | Likely human | Probably human | +| N/A | Verified bot | Allowlisted good bot | + +Enterprise plans get granular 1-99 scores for custom thresholds. + +### Bot Fight Mode (Free) +- Auto-blocks definite bots (score=1), excludes verified bots by default +- JavaScript Detections always enabled, no configuration options + +### Super Bot Fight Mode (Pro/Business) +```txt +Dashboard: Security > Bots > Configure +- Definitely automated: Block/Challenge +- Likely automated: Challenge/Allow +- Verified bots: Allow (recommended) +- Static resource protection: ON (may block mail clients) +- JavaScript Detections: Optional +``` + +### Bot Management for Enterprise +```txt +Dashboard: Security > Bots > Configure > Auto-updates: ON (recommended) + +# Template 1: Block definite bots +(cf.bot_management.score eq 1 and not cf.bot_management.verified_bot and not cf.bot_management.static_resource) +Action: Block + +# Template 2: Challenge likely bots +(cf.bot_management.score ge 2 and cf.bot_management.score le 29 and not cf.bot_management.verified_bot and not cf.bot_management.static_resource) +Action: Managed Challenge +``` + +## JavaScript Detections 
Setup

### Enable via Dashboard
```txt
Security > Bots > Configure Bot Management > JS Detections: ON

Update CSP: script-src 'self' /cdn-cgi/challenge-platform/;
```

### Manual JS Injection (API)
```html
<!-- Inject the JSD loader <script> tag here (served from /cdn-cgi/challenge-platform/); exact src elided -->
```

**Use API for**: Selective deployment on specific pages
**Don't combine**: Zone-wide toggle + manual injection

### WAF Rules for JSD
```txt
# NEVER use on first page visit (needs HTML page first)
(not cf.bot_management.js_detection.passed and http.request.uri.path eq "/api/user/create" and http.request.method eq "POST" and not cf.bot_management.verified_bot)
Action: Managed Challenge (always use Managed Challenge, not Block)
```

### Limitations
- First request won't have JSD data (needs HTML page first)
- Strips ETags from HTML responses
- Not supported with CSP via `<meta>` tags
- WebSocket endpoints not supported
- Native mobile apps won't pass
- cf_clearance cookie: 15-minute lifespan, max 4096 bytes

## `__cf_bm` Cookie

Cloudflare sets the `__cf_bm` cookie to smooth bot scores across user sessions:

- **Purpose:** Reduces false positives from score volatility
- **Scope:** Per-domain, HTTP-only
- **Lifespan:** Session duration
- **Privacy:** No PII—only session classification
- **Automatic:** No configuration required

Bot scores for repeat visitors consider session history via this cookie.
+ +## Static Resource Protection + +**File Extensions**: ico, jpg, png, jpeg, gif, css, js, tif, tiff, bmp, pict, webp, svg, svgz, class, jar, txt, csv, doc, docx, xls, xlsx, pdf, ps, pls, ppt, pptx, ttf, otf, woff, woff2, eot, eps, ejs, swf, torrent, midi, mid, m3u8, m4a, mp3, ogg, ts +**Plus**: `/.well-known/` path (all files) + +```txt +# Exclude static resources from bot rules +(cf.bot_management.score lt 30 and not cf.bot_management.static_resource) +``` + +**WARNING**: May block mail clients fetching static images + +## JA3/JA4 Fingerprinting (Enterprise) + +```txt +# Block specific attack fingerprint +(cf.bot_management.ja3_hash eq "8b8e3d5e3e8b3d5e") + +# Allow mobile app by fingerprint +(cf.bot_management.ja4 eq "your_mobile_app_fingerprint") +``` + +Only available for HTTPS/TLS traffic. Missing for Worker-routed traffic or HTTP requests. + +## Verified Bot Categories + +```txt +# Allow search engines only +(cf.verified_bot_category eq "Search Engine Crawler") + +# Block AI crawlers +(cf.verified_bot_category eq "AI Crawler") +Action: Block + +# Or use dashboard: Security > Settings > Bot Management > Block AI Bots +``` + +| Category | String Value | Example | +|----------|--------------|---------| +| AI Crawler | `AI Crawler` | GPTBot, Claude-Web | +| AI Assistant | `AI Assistant` | Perplexity-User, DuckAssistBot | +| AI Search | `AI Search` | OAI-SearchBot | +| Accessibility | `Accessibility` | Accessible Web Bot | +| Academic Research | `Academic Research` | Library of Congress | +| Advertising & Marketing | `Advertising & Marketing` | Google Adsbot | +| Aggregator | `Aggregator` | Pinterest, Indeed | +| Archiver | `Archiver` | Internet Archive, CommonCrawl | +| Feed Fetcher | `Feed Fetcher` | RSS/Podcast updaters | +| Monitoring & Analytics | `Monitoring & Analytics` | Uptime monitors | +| Page Preview | `Page Preview` | Facebook/Slack link preview | +| SEO | `Search Engine Optimization` | Google Lighthouse | +| Security | `Security` | Vulnerability 
scanners | +| Social Media Marketing | `Social Media Marketing` | Brandwatch | +| Webhooks | `Webhooks` | Payment processors | +| Other | `Other` | Uncategorized bots | + +## Best Practices + +- **ML Auto-Updates**: Enable on Enterprise for latest models +- **Start with Managed Challenge**: Test before blocking +- **Always exclude verified bots**: Use `not cf.bot_management.verified_bot` +- **Exempt corporate proxies**: For B2B traffic via `cf.bot_management.corporate_proxy` +- **Use static resource exception**: Improves performance, reduces overhead diff --git a/cloudflare/references/bot-management/gotchas.md b/cloudflare/references/bot-management/gotchas.md new file mode 100644 index 0000000..685bcbd --- /dev/null +++ b/cloudflare/references/bot-management/gotchas.md @@ -0,0 +1,114 @@ +# Bot Management Gotchas + +## Common Errors + +### "Bot Score = 0" + +**Cause:** Bot Management didn't run (internal Cloudflare request, Worker routing to zone (Orange-to-Orange), or request handled before BM (Redirect Rules, etc.)) +**Solution:** Check request flow and ensure Bot Management runs in request lifecycle + +### "JavaScript Detections Not Working" + +**Cause:** `js_detection.passed` always false or undefined due to: CSP headers don't allow `/cdn-cgi/challenge-platform/`, using on first page visit (needs HTML page first), ad blockers or disabled JS, JSD not enabled in dashboard, or using Block action (must use Managed Challenge) +**Solution:** Add CSP header `Content-Security-Policy: script-src 'self' /cdn-cgi/challenge-platform/;` and ensure JSD is enabled with Managed Challenge action + +### "False Positives (Legitimate Users Blocked)" + +**Cause:** Bot detection incorrectly flagging legitimate users +**Solution:** Check Bot Analytics for affected IPs/paths, identify detection source (ML, Heuristics, etc.), create exception rule like `(cf.bot_management.score lt 30 and http.request.uri.path eq "/problematic-path")` with Action: Skip (Bot Management), or allowlist by 
IP/ASN/country + +### "False Negatives (Bots Not Caught)" + +**Cause:** Bots bypassing detection +**Solution:** Lower score threshold (30 → 50), enable JavaScript Detections, add JA3/JA4 fingerprinting rules, or use rate limiting as fallback + +### "Verified Bot Blocked" + +**Cause:** Search engine bot blocked by WAF Managed Rules (not just Bot Management) +**Solution:** Create WAF exception for specific rule ID and verify bot via reverse DNS + +### "Yandex Bot Blocked During IP Update" + +**Cause:** Yandex updates bot IPs; new IPs unrecognized for 48h during propagation +**Solution:** +1. Check Security Events for specific WAF rule ID blocking Yandex +2. Create WAF exception: + ```txt + (http.user_agent contains "YandexBot" and ip.src in {}) + Action: Skip (WAF Managed Ruleset) + ``` +3. Monitor Bot Analytics for 48h +4. Remove exception after propagation completes + +Issue resolves automatically after 48h. Contact Cloudflare Support if persists. + +### "JA3/JA4 Missing" + +**Cause:** Non-HTTPS traffic, Worker routing traffic, Orange-to-Orange traffic via Worker, or Bot Management skipped +**Solution:** JA3/JA4 only available for HTTPS/TLS traffic; check request routing + +**JA3/JA4 Not User-Unique:** Same browser/library version = same fingerprint +- Don't use for user identification +- Use for client profiling only +- Fingerprints change with browser updates + +## Bot Verification Methods + +Cloudflare verifies bots via: + +1. **Reverse DNS (IP validation):** Traditional method—bot IP resolves to expected domain +2. **Web Bot Auth:** Modern cryptographic verification—faster propagation + +When `verifiedBot=true`, bot passed at least one method. + +**Inactive verified bots:** IPs removed after 24h of no traffic. 
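Method 1 can be sketched conceptually (assumptions: this runs off-platform with `node:dns` — Workers expose no DNS API — and `verifyBotIp` is a hypothetical helper, not how Cloudflare's internal pipeline is implemented):

```typescript
// Forward-confirmed reverse DNS: the bot IP's PTR record must fall under the
// operator's domain, AND that hostname must resolve back to the same IP.
import { promises as dns } from 'node:dns';

async function verifyBotIp(ip: string, expectedDomainSuffix: string): Promise<boolean> {
  try {
    const hostnames = await dns.reverse(ip); // PTR lookup
    for (const host of hostnames) {
      if (!host.endsWith(expectedDomainSuffix)) continue;
      const addrs = await dns.resolve(host); // forward-confirm the A record
      if (addrs.includes(ip)) return true;
    }
  } catch {
    // no PTR record, invalid IP, or lookup failure => unverified
  }
  return false;
}

// e.g. verifyBotIp('66.249.66.1', '.googlebot.com') — requires network access
```

Forward-confirming the PTR result guards against spoofed reverse records; a hostname suffix match alone is not sufficient.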
+ +## Detection Engine Behavior + +| Engine | Score | Timing | Plan | Notes | +|--------|-------|--------|------|-------| +| Heuristics | Always 1 | Immediate | All | Known fingerprints—overrides ML | +| ML | 1-99 | Immediate | All | Majority of detections | +| Anomaly Detection | Influences | After baseline | Enterprise | Optional, baseline analysis | +| JavaScript Detections | Pass/fail | After JS | Pro+ | Headless browser detection | +| Cloudflare Service | N/A | N/A | Enterprise | Zero Trust internal source | + +**Priority:** Heuristics > ML—if heuristic matches, score=1 regardless of ML. + +## Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Bot Score = 0 | Means not computed | Not score = 100 | +| First request JSD data | May not be available | JSD data appears on subsequent requests | +| Score accuracy | Not 100% guaranteed | False positives/negatives possible | +| JSD on first HTML page visit | Not supported | Requires subsequent page load | +| JSD requirements | JavaScript-enabled browser | Won't work with JS disabled or ad blockers | +| JSD ETag stripping | Strips ETags from HTML responses | May affect caching behavior | +| JSD CSP compatibility | Requires specific CSP | Not compatible with some CSP configurations | +| JSD meta CSP tags | Not supported | Must use HTTP headers | +| JSD WebSocket support | Not supported | WebSocket endpoints won't work with JSD | +| JSD mobile app support | Native apps won't pass | Only works in browsers | +| JA3/JA4 traffic type | HTTPS/TLS only | Not available for non-HTTPS traffic | +| JA3/JA4 Worker routing | Missing for Worker-routed traffic | Check request routing | +| JA3/JA4 uniqueness | Not unique per user | Shared by clients with same browser/library | +| JA3/JA4 stability | Can change with updates | Browser/library updates affect fingerprints | +| WAF custom rules (Free) | 5 | Varies by plan | +| WAF custom rules (Pro) | 20 | Varies by plan | +| WAF custom rules (Business) | 100 | Varies by plan 
| +| WAF custom rules (Enterprise) | 1,000+ | Varies by plan | +| Workers CPU time | Varies by plan | Applies to bot logic | +| Bot Analytics sampling | 1-10% adaptive | High-volume zones sampled more aggressively | +| Bot Analytics history | 30 days max | Historical data retention limit | +| CSP requirements for JSD | Must allow `/cdn-cgi/challenge-platform/` | Required for JSD to function | + +### Plan Restrictions + +| Feature | Free | Pro/Business | Enterprise | +|---------|------|--------------|------------| +| Granular scores (1-99) | No | No | Yes | +| JA3/JA4 | No | No | Yes | +| Anomaly Detection | No | No | Yes | +| Corporate Proxy detection | No | No | Yes | +| Verified bot categories | Limited | Limited | Full | +| Custom WAF rules | 5 | 20/100 | 1,000+ | diff --git a/cloudflare/references/bot-management/patterns.md b/cloudflare/references/bot-management/patterns.md new file mode 100644 index 0000000..4ca7085 --- /dev/null +++ b/cloudflare/references/bot-management/patterns.md @@ -0,0 +1,182 @@ +# Bot Management Patterns + +## E-commerce Protection + +```txt +# High security for checkout +(cf.bot_management.score lt 50 and http.request.uri.path in {"/checkout" "/cart/add"} and not cf.bot_management.verified_bot and not cf.bot_management.corporate_proxy) +Action: Managed Challenge +``` + +## API Protection + +```txt +# Protect API with JS detection + score +(http.request.uri.path matches "^/api/" and (cf.bot_management.score lt 30 or not cf.bot_management.js_detection.passed) and not cf.bot_management.verified_bot) +Action: Block +``` + +## SEO-Friendly Bot Handling + +```txt +# Allow search engine crawlers +(cf.bot_management.score lt 30 and not cf.verified_bot_category in {"Search Engine Crawler"}) +Action: Managed Challenge +``` + +## Block AI Scrapers + +```txt +# Block training crawlers only (allow AI assistants/search) +(cf.verified_bot_category eq "AI Crawler") +Action: Block + +# Block all AI-related bots (training + assistants + search) 
+(cf.verified_bot_category in {"AI Crawler" "AI Assistant" "AI Search"}) +Action: Block + +# Allow AI Search, block AI Crawler and AI Assistant +(cf.verified_bot_category in {"AI Crawler" "AI Assistant"}) +Action: Block + +# Or use dashboard: Security > Settings > Bot Management > Block AI Bots +``` + +## Rate Limiting by Bot Score + +```txt +# Stricter limits for suspicious traffic +(cf.bot_management.score lt 50) +Rate: 10 requests per 10 seconds + +(cf.bot_management.score ge 50) +Rate: 100 requests per 10 seconds +``` + +## Mobile App Allowlisting + +```txt +# Identify mobile app by JA3/JA4 +(cf.bot_management.ja4 in {"fingerprint1" "fingerprint2"}) +Action: Skip (all remaining rules) +``` + +## Datacenter Detection + +```typescript +import type { IncomingRequestCfProperties } from '@cloudflare/workers-types'; + +// Low score + not corporate proxy = likely datacenter bot +export default { + async fetch(request: Request): Promise { + const cf = request.cf as IncomingRequestCfProperties | undefined; + const botMgmt = cf?.botManagement; + + if (botMgmt?.score && botMgmt.score < 30 && + !botMgmt.corporateProxy && !botMgmt.verifiedBot) { + return new Response('Datacenter traffic blocked', { status: 403 }); + } + + return fetch(request); + } +}; +``` + +## Conditional Delay (Tarpit) + +```typescript +import type { IncomingRequestCfProperties } from '@cloudflare/workers-types'; + +// Add delay proportional to bot suspicion +export default { + async fetch(request: Request): Promise { + const cf = request.cf as IncomingRequestCfProperties | undefined; + const botMgmt = cf?.botManagement; + + if (botMgmt?.score && botMgmt.score < 50 && !botMgmt.verifiedBot) { + // Delay: 0-2 seconds for scores 50-0 + const delayMs = Math.max(0, (50 - botMgmt.score) * 40); + await new Promise(r => setTimeout(r, delayMs)); + } + + return fetch(request); + } +}; +``` + +## Layered Defense + +```txt +1. Bot Management (score-based) +2. JavaScript Detections (for JS-capable clients) +3. 
Rate Limiting (fallback protection) +4. WAF Managed Rules (OWASP, etc.) +``` + +## Progressive Enhancement + +```txt +Public content: High threshold (score < 10) +Authenticated: Medium threshold (score < 30) +Sensitive: Low threshold (score < 50) + JSD +``` + +## Zero Trust for Bots + +```txt +1. Default deny (all scores < 30) +2. Allowlist verified bots +3. Allowlist mobile apps (JA3/JA4) +4. Allowlist corporate proxies +5. Allowlist static resources +``` + +## Workers: Score + JS Detection + +```typescript +import type { IncomingRequestCfProperties } from '@cloudflare/workers-types'; + +export default { + async fetch(request: Request): Promise { + const cf = request.cf as IncomingRequestCfProperties | undefined; + const botMgmt = cf?.botManagement; + const url = new URL(request.url); + + if (botMgmt?.staticResource) return fetch(request); // Skip static + + // API endpoints: require JS detection + good score + if (url.pathname.startsWith('/api/')) { + const jsDetectionPassed = botMgmt?.jsDetection?.passed ?? false; + const score = botMgmt?.score ?? 
100; + + if (!jsDetectionPassed || score < 30) { + return new Response('Unauthorized', { status: 401 }); + } + } + + return fetch(request); + } +}; +``` + +## Rate Limiting by JWT Claim + Bot Score + +```txt +# Enterprise: Combine bot score with JWT validation +Rate limiting > Custom rules +- Field: lookup_json_string(http.request.jwt.claims["{config_id}"][0], "sub") +- Matches: user ID claim +- Additional condition: cf.bot_management.score lt 50 +``` + +## WAF Integration Points + +- **WAF Custom Rules**: Primary enforcement mechanism +- **Rate Limiting Rules**: Bot score as dimension, stricter limits for low scores +- **Transform Rules**: Pass score to origin via custom header +- **Workers**: Programmatic bot logic, custom scoring algorithms +- **Page Rules / Configuration Rules**: Zone-level overrides, path-specific settings + +## See Also + +- [gotchas.md](./gotchas.md) - Common errors, false positives/negatives, limitations diff --git a/cloudflare/references/browser-rendering/README.md b/cloudflare/references/browser-rendering/README.md new file mode 100644 index 0000000..eca7220 --- /dev/null +++ b/cloudflare/references/browser-rendering/README.md @@ -0,0 +1,78 @@ +# Cloudflare Browser Rendering Skill Reference + +**Description**: Expert knowledge for Cloudflare Browser Rendering - control headless Chrome on Cloudflare's global network for browser automation, screenshots, PDFs, web scraping, testing, and content generation. + +**When to use**: Any task involving Cloudflare Browser Rendering including: taking screenshots, generating PDFs, web scraping, browser automation, testing web applications, extracting structured data, capturing page metrics, or automating browser interactions. 
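+
+A minimal end-to-end sketch of the Workers flow (assumes a browser binding named `MYBROWSER`, the name used throughout this reference; see [configuration.md](configuration.md) for setup):
+
+```typescript
+import puppeteer from "@cloudflare/puppeteer";
+
+export default {
+  async fetch(request: Request, env: { MYBROWSER: Fetcher }): Promise<Response> {
+    const browser = await puppeteer.launch(env.MYBROWSER);
+    try {
+      const page = await browser.newPage();
+      await page.goto("https://example.com", { waitUntil: "load" });
+      const screenshot = await page.screenshot({ type: "png" });
+      return new Response(screenshot, { headers: { "Content-Type": "image/png" } });
+    } finally {
+      await browser.close(); // always release the session
+    }
+  }
+};
+```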
+ +## Decision Tree + +### REST API vs Workers Bindings + +**Use REST API when:** +- One-off, stateless tasks (screenshot, PDF, content fetch) +- No Workers infrastructure yet +- Simple integrations from external services +- Need quick prototyping without deployment + +**Use Workers Bindings when:** +- Complex browser automation workflows +- Need session reuse for performance +- Multiple page interactions per request +- Custom scripting and logic required +- Building production applications + +### Puppeteer vs Playwright + +| Feature | Puppeteer | Playwright | +|---------|-----------|------------| +| API Style | Chrome DevTools Protocol | High-level abstractions | +| Selectors | CSS, XPath | CSS, text, role, test-id | +| Best for | Advanced control, CDP access | Quick automation, testing | +| Learning curve | Steeper | Gentler | + +**Use Puppeteer:** Need CDP protocol access, Chrome-specific features, migration from existing Puppeteer code +**Use Playwright:** Modern selector APIs, cross-browser patterns, faster development + +## Tier Limits Summary + +| Limit | Free Tier | Paid Tier | +|-------|-----------|-----------| +| Daily browser time | 10 minutes | Unlimited* | +| Concurrent sessions | 3 | 30 | +| Requests per minute | 6 | 180 | + +*Subject to fair-use policy. See [gotchas.md](gotchas.md) for details. + +## Reading Order + +**New to Browser Rendering:** +1. [configuration.md](configuration.md) - Setup and deployment +2. [patterns.md](patterns.md) - Common use cases with examples +3. [api.md](api.md) - API reference +4. 
[gotchas.md](gotchas.md) - Avoid common pitfalls + +**Specific task:** +- **Setup/deployment** → [configuration.md](configuration.md) +- **API reference/endpoints** → [api.md](api.md) +- **Example code/patterns** → [patterns.md](patterns.md) +- **Debugging/troubleshooting** → [gotchas.md](gotchas.md) + +**REST API users:** +- Start with [api.md](api.md) REST API section +- Check [gotchas.md](gotchas.md) for rate limits + +**Workers users:** +- Start with [configuration.md](configuration.md) +- Review [patterns.md](patterns.md) for session management +- Reference [api.md](api.md) for Workers Bindings + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, deployment, wrangler config, compatibility +- **[api.md](api.md)** - REST API endpoints + Workers Bindings (Puppeteer/Playwright) +- **[patterns.md](patterns.md)** - Common patterns, use cases, real examples +- **[gotchas.md](gotchas.md)** - Troubleshooting, best practices, tier limits, common errors + +## See Also + +- [Cloudflare Docs](https://developers.cloudflare.com/browser-rendering/) diff --git a/cloudflare/references/browser-rendering/api.md b/cloudflare/references/browser-rendering/api.md new file mode 100644 index 0000000..eea56b0 --- /dev/null +++ b/cloudflare/references/browser-rendering/api.md @@ -0,0 +1,108 @@ +# Browser Rendering API + +## REST API + +**Base:** `https://api.cloudflare.com/client/v4/accounts/{accountId}/browser-rendering` +**Auth:** `Authorization: Bearer ` (Browser Rendering - Edit permission) + +### Endpoints + +| Endpoint | Description | Key Options | +|----------|-------------|-------------| +| `/content` | Get rendered HTML | `url`, `waitUntil` | +| `/screenshot` | Capture image | `screenshotOptions: {type, fullPage, clip}` | +| `/pdf` | Generate PDF | `pdfOptions: {format, landscape, margin}` | +| `/snapshot` | HTML + inlined resources | `url` | +| `/scrape` | Extract by selectors | `selectors: ["h1", ".price"]` | +| `/json` | AI-structured extraction | 
`schema: {name: "string", price: "number"}` | +| `/links` | Get all links | `url` | +| `/markdown` | Convert to markdown | `url` | + +```bash +curl -X POST '.../browser-rendering/screenshot' \ + -H "Authorization: Bearer $TOKEN" \ + -d '{"url":"https://example.com","screenshotOptions":{"fullPage":true}}' +``` + +## Workers Binding + +```jsonc +// wrangler.jsonc +{ "browser": { "binding": "MYBROWSER" } } +``` + +## Puppeteer + +```typescript +import puppeteer from "@cloudflare/puppeteer"; + +const browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 600000 }); +const page = await browser.newPage(); +await page.goto('https://example.com', { waitUntil: 'networkidle0' }); + +// Content +const html = await page.content(); +const title = await page.title(); + +// Screenshot/PDF +await page.screenshot({ fullPage: true, type: 'png' }); +await page.pdf({ format: 'A4', printBackground: true }); + +// Interaction +await page.click('#button'); +await page.type('#input', 'text'); +await page.evaluate(() => document.querySelector('h1')?.textContent); + +// Session management +const sessions = await puppeteer.sessions(env.MYBROWSER); +const limits = await puppeteer.limits(env.MYBROWSER); + +await browser.close(); +``` + +## Playwright + +```typescript +import { launch, connect } from "@cloudflare/playwright"; + +const browser = await launch(env.MYBROWSER, { keep_alive: 600000 }); +const page = await browser.newPage(); + +await page.goto('https://example.com', { waitUntil: 'networkidle' }); + +// Modern selectors +await page.locator('.button').click(); +await page.getByText('Submit').click(); +await page.getByTestId('search').fill('query'); + +// Context for isolation +const context = await browser.newContext({ + viewport: { width: 1920, height: 1080 }, + userAgent: 'custom' +}); + +await browser.close(); +``` + +## Session Management + +```typescript +// List sessions +await puppeteer.sessions(env.MYBROWSER); + +// Connect to existing +await 
puppeteer.connect(env.MYBROWSER, sessionId); + +// Check limits +await puppeteer.limits(env.MYBROWSER); +// { remaining: ms, total: ms, concurrent: n } +``` + +## Key Options + +| Option | Values | +|--------|--------| +| `waitUntil` | `load`, `domcontentloaded`, `networkidle0`, `networkidle2` | +| `keep_alive` | Max 600000ms (10 min) | +| `screenshot.type` | `png`, `jpeg` | +| `pdf.format` | `A4`, `Letter`, `Legal` | diff --git a/cloudflare/references/browser-rendering/configuration.md b/cloudflare/references/browser-rendering/configuration.md new file mode 100644 index 0000000..84bad26 --- /dev/null +++ b/cloudflare/references/browser-rendering/configuration.md @@ -0,0 +1,78 @@ +# Configuration & Setup + +## Installation + +```bash +npm install @cloudflare/puppeteer # or @cloudflare/playwright +``` + +**Use Cloudflare packages** - standard `puppeteer`/`playwright` won't work in Workers. + +## wrangler.json + +```json +{ + "name": "browser-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", + "compatibility_flags": ["nodejs_compat"], + "browser": { + "binding": "MYBROWSER" + } +} +``` + +**Required:** `nodejs_compat` flag and `browser.binding`. + +## TypeScript + +```typescript +interface Env { + MYBROWSER: Fetcher; +} + +export default { + async fetch(request: Request, env: Env): Promise { + // ... + } +} satisfies ExportedHandler; +``` + +## Development + +```bash +wrangler dev --remote # --remote required for browser binding +``` + +**Local mode does NOT support Browser Rendering** - must use `--remote`. + +## REST API + +No wrangler config needed. Get API token with "Browser Rendering - Edit" permission. 
+ +```bash +curl -X POST \ + 'https://api.cloudflare.com/client/v4/accounts/{accountId}/browser-rendering/screenshot' \ + -H 'Authorization: Bearer TOKEN' \ + -d '{"url": "https://example.com"}' --output screenshot.png +``` + +## Requirements + +| Requirement | Value | +|-------------|-------| +| Node.js compatibility | `nodejs_compat` flag | +| Compatibility date | 2023-03-01+ | +| Module format | ES modules only | +| Browser | Chromium 119+ (no Firefox/Safari) | + +**Not supported:** WebGL, WebRTC, extensions, `file://` protocol, Service Worker syntax. + +## Troubleshooting + +| Error | Solution | +|-------|----------| +| `MYBROWSER is undefined` | Use `wrangler dev --remote` | +| `nodejs_compat not enabled` | Add to `compatibility_flags` | +| `Module not found` | `npm install @cloudflare/puppeteer` | +| `Browser Rendering not available` | Enable in dashboard | diff --git a/cloudflare/references/browser-rendering/gotchas.md b/cloudflare/references/browser-rendering/gotchas.md new file mode 100644 index 0000000..7e34f2b --- /dev/null +++ b/cloudflare/references/browser-rendering/gotchas.md @@ -0,0 +1,88 @@ +# Browser Rendering Gotchas + +## Tier Limits + +| Limit | Free | Paid | +|-------|------|------| +| Daily browser time | 10 min | Unlimited* | +| Concurrent sessions | 3 | 30 | +| Requests/minute | 6 | 180 | +| Session keep-alive | 10 min max | 10 min max | + +*Subject to fair-use policy. + +**Check quota:** +```typescript +const limits = await puppeteer.limits(env.MYBROWSER); +// { remaining: 540000, total: 600000, concurrent: 2 } +``` + +## Always Close Browsers + +```typescript +const browser = await puppeteer.launch(env.MYBROWSER); +try { + const page = await browser.newPage(); + await page.goto("https://example.com"); + return new Response(await page.content()); +} finally { + await browser.close(); // ALWAYS in finally +} +``` + +**Workers vs REST:** REST auto-closes after timeout. 
Workers must call `close()` or session stays open until `keep_alive` expires.
+
+## Optimize Concurrency
+
+```typescript
+// ❌ One browser per task: concurrent sessions add up fast (free tier caps at 3)
+const browser1 = await puppeteer.launch(env.MYBROWSER);
+const browser2 = await puppeteer.launch(env.MYBROWSER);
+
+// ✅ 1 session, multiple pages
+const browser = await puppeteer.launch(env.MYBROWSER);
+const page1 = await browser.newPage();
+const page2 = await browser.newPage();
+```
+
+## Common Errors
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| Session limit exceeded | Too many concurrent | Close unused browsers, use pages not browsers |
+| Page navigation timeout | Slow page or `networkidle` on busy page | Increase timeout, use `waitUntil: "load"` |
+| Session not found | Expired session | Catch error, launch new session |
+| Evaluation failed | DOM element missing | Use `?.` optional chaining |
+| Protocol error: Target closed | Page closed during operation | Await all ops before closing |
+
+## page.evaluate() Gotchas
+
+```typescript
+// ❌ Outer scope not available
+const selector = "h1";
+await page.evaluate(() => document.querySelector(selector));
+
+// ✅ Pass as argument
+await page.evaluate((sel) => document.querySelector(sel)?.textContent, selector);
+```
+
+## Performance
+
+**waitUntil options (fastest to slowest):**
+1. `domcontentloaded` - DOM ready
+2. `load` - load event (default)
+3. `networkidle0` - no network for 500ms
+
+**Block unnecessary resources:**
+```typescript
+await page.setRequestInterception(true);
+page.on("request", (req) => {
+  if (["image", "stylesheet", "font"].includes(req.resourceType())) {
+    req.abort();
+  } else {
+    req.continue();
+  }
+});
+```
+
+**Session reuse:** Cold start ~1-2s, warm connect ~100-200ms. Store sessionId in KV for reuse.
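+
+**Reuse with fallback:** a sketch combining the "Session not found" fix from the table above with a KV-stored session ID (`SESSION_KV` is a hypothetical KV namespace binding, not part of Browser Rendering):
+
+```typescript
+import puppeteer from "@cloudflare/puppeteer";
+
+// SESSION_KV (hypothetical) persists the session ID between requests.
+async function getBrowser(env: { MYBROWSER: Fetcher; SESSION_KV: KVNamespace }) {
+  const sessionId = await env.SESSION_KV.get("browser-session");
+  if (sessionId) {
+    try {
+      return await puppeteer.connect(env.MYBROWSER, sessionId); // warm connect
+    } catch {
+      // Session expired ("Session not found") - fall through and relaunch
+    }
+  }
+  const browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 600000 });
+  await env.SESSION_KV.put("browser-session", browser.sessionId(), { expirationTtl: 600 });
+  return browser;
+}
+```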
diff --git a/cloudflare/references/browser-rendering/patterns.md b/cloudflare/references/browser-rendering/patterns.md new file mode 100644 index 0000000..a652c2b --- /dev/null +++ b/cloudflare/references/browser-rendering/patterns.md @@ -0,0 +1,91 @@ +# Browser Rendering Patterns + +## Basic Worker + +```typescript +import puppeteer from "@cloudflare/puppeteer"; + +export default { + async fetch(request, env) { + const browser = await puppeteer.launch(env.MYBROWSER); + try { + const page = await browser.newPage(); + await page.goto("https://example.com"); + return new Response(await page.content()); + } finally { + await browser.close(); // ALWAYS in finally + } + } +}; +``` + +## Session Reuse + +Keep sessions alive for performance: +```typescript +let sessionId = await env.SESSION_KV.get("browser-session"); +if (sessionId) { + browser = await puppeteer.connect(env.MYBROWSER, sessionId); +} else { + browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 600000 }); + await env.SESSION_KV.put("browser-session", browser.sessionId(), { expirationTtl: 600 }); +} +// Don't close browser to keep session alive +``` + +## Common Operations + +| Task | Code | +|------|------| +| Screenshot | `await page.screenshot({ type: "png", fullPage: true })` | +| PDF | `await page.pdf({ format: "A4", printBackground: true })` | +| Extract data | `await page.evaluate(() => document.querySelector('h1').textContent)` | +| Fill form | `await page.type('#input', 'value'); await page.click('button')` | +| Wait nav | `await Promise.all([page.waitForNavigation(), page.click('a')])` | + +## Parallel Scraping + +```typescript +const pages = await Promise.all(urls.map(() => browser.newPage())); +await Promise.all(pages.map((p, i) => p.goto(urls[i]))); +const titles = await Promise.all(pages.map(p => p.title())); +``` + +## Playwright Selectors + +```typescript +import { launch } from "@cloudflare/playwright"; +const browser = await launch(env.MYBROWSER); +await page.getByRole("button", { 
name: "Sign in" }).click(); +await page.getByLabel("Email").fill("user@example.com"); +await page.getByTestId("submit-button").click(); +``` + +## Incognito Contexts + +Isolated sessions without multiple browsers: +```typescript +const ctx1 = await browser.createIncognitoBrowserContext(); +const ctx2 = await browser.createIncognitoBrowserContext(); +// Each has isolated cookies/storage +``` + +## Quota Check + +```typescript +const limits = await puppeteer.limits(env.MYBROWSER); +if (limits.remaining < 60000) return new Response("Quota low", { status: 429 }); +``` + +## Error Handling + +```typescript +try { + await page.goto(url, { timeout: 30000, waitUntil: "networkidle0" }); +} catch (e) { + if (e.message.includes("timeout")) return new Response("Timeout", { status: 504 }); + if (e.message.includes("Session limit")) return new Response("Too many sessions", { status: 429 }); +} finally { + if (browser) await browser.close(); +} +``` diff --git a/cloudflare/references/c3/README.md b/cloudflare/references/c3/README.md new file mode 100644 index 0000000..0516fc6 --- /dev/null +++ b/cloudflare/references/c3/README.md @@ -0,0 +1,111 @@ +# C3 (create-cloudflare) + +Official CLI for scaffolding Cloudflare Workers and Pages projects with templates, TypeScript, and instant deployment. + +## Quick Start + +```bash +# Interactive (recommended for first-time) +npm create cloudflare@latest my-app + +# Worker (API/WebSocket/Cron) +npm create cloudflare@latest my-api -- --type=hello-world --ts + +# Pages (static/SSG/full-stack) +npm create cloudflare@latest my-site -- --type=web-app --framework=astro --platform=pages +``` + +## Platform Decision Tree + +``` +What are you building? 
+ +├─ API / WebSocket / Cron / Email handler +│ └─ Workers (default) - no --platform flag needed +│ npm create cloudflare@latest my-api -- --type=hello-world + +├─ Static site / SSG / Documentation +│ └─ Pages - requires --platform=pages +│ npm create cloudflare@latest my-site -- --type=web-app --framework=astro --platform=pages + +├─ Full-stack app (Next.js/Remix/SvelteKit) +│ ├─ Need Durable Objects, Queues, or Workers-only features? +│ │ └─ Workers (default) +│ └─ Otherwise use Pages for git integration and branch previews +│ └─ Add --platform=pages + +└─ Convert existing project + └─ npm create cloudflare@latest . -- --type=pre-existing --existing-script=./src/worker.ts +``` + +**Critical:** Pages projects require `--platform=pages` flag. Without it, C3 defaults to Workers. + +## Interactive Flow + +When run without flags, C3 prompts in this order: + +1. **Project name** - Directory to create (defaults to current dir with `.`) +2. **Application type** - `hello-world`, `web-app`, `demo`, `pre-existing`, `remote-template` +3. **Platform** - `workers` (default) or `pages` (for web apps only) +4. **Framework** - If web-app: `next`, `remix`, `astro`, `react-router`, `solid`, `svelte`, etc. +5. **TypeScript** - `yes` (recommended) or `no` +6. **Git** - Initialize repository? `yes` or `no` +7. **Deploy** - Deploy now? 
`yes` or `no` (requires `wrangler login`) + +## Installation Methods + +```bash +# NPM +npm create cloudflare@latest + +# Yarn +yarn create cloudflare + +# PNPM +pnpm create cloudflare@latest +``` + +## In This Reference + +| File | Purpose | Use When | +|------|---------|----------| +| **api.md** | Complete CLI flag reference | Scripting, CI/CD, advanced usage | +| **configuration.md** | Generated files, bindings, types | Understanding output, customization | +| **patterns.md** | Workflows, CI/CD, monorepos | Real-world integration | +| **gotchas.md** | Troubleshooting failures | Deployment blocked, errors | + +## Reading Order + +| Task | Read | +|------|------| +| Create first project | README only | +| Set up CI/CD | README → api → patterns | +| Debug failed deploy | gotchas | +| Understand generated files | configuration | +| Full CLI reference | api | +| Create custom template | patterns → configuration | +| Convert existing project | README → patterns | + +## Post-Creation + +```bash +cd my-app + +# Local dev with hot reload +npm run dev + +# Generate TypeScript types for bindings +npm run cf-typegen + +# Deploy to Cloudflare +npm run deploy +``` + +## See Also + +- **workers/README.md** - Workers runtime, bindings, APIs +- **workers-ai/README.md** - AI/ML models +- **pages/README.md** - Pages-specific features +- **wrangler/README.md** - Wrangler CLI beyond initial setup +- **d1/README.md** - SQLite database +- **r2/README.md** - Object storage diff --git a/cloudflare/references/c3/api.md b/cloudflare/references/c3/api.md new file mode 100644 index 0000000..29c2b0c --- /dev/null +++ b/cloudflare/references/c3/api.md @@ -0,0 +1,71 @@ +# C3 CLI Reference + +## Invocation + +```bash +npm create cloudflare@latest [name] [-- flags] # NPM requires -- +yarn create cloudflare [name] [flags] +pnpm create cloudflare@latest [name] [-- flags] +``` + +## Core Flags + +| Flag | Values | Description | +|------|--------|-------------| +| `--type` | `hello-world`, 
`web-app`, `demo`, `pre-existing`, `remote-template` | Application type | +| `--platform` | `workers` (default), `pages` | Target platform | +| `--framework` | `next`, `remix`, `astro`, `react-router`, `solid`, `svelte`, `qwik`, `vue`, `angular`, `hono` | Web framework (requires `--type=web-app`) | +| `--lang` | `ts`, `js`, `python` | Language (for `--type=hello-world`) | +| `--ts` / `--no-ts` | - | TypeScript for web apps | + +## Deployment Flags + +| Flag | Description | +|------|-------------| +| `--deploy` / `--no-deploy` | Deploy immediately (prompts interactive, skips in CI) | +| `--git` / `--no-git` | Initialize git (default: yes) | +| `--open` | Open browser after deploy | + +## Advanced Flags + +| Flag | Description | +|------|-------------| +| `--template=user/repo` | GitHub template or local path | +| `--existing-script=./src/worker.ts` | Existing script (requires `--type=pre-existing`) | +| `--category=ai\|database\|realtime` | Demo filter (requires `--type=demo`) | +| `--experimental` | Enable experimental features | +| `--wrangler-defaults` | Skip wrangler prompts | + +## Environment Variables + +```bash +CLOUDFLARE_API_TOKEN=xxx # For deployment +CLOUDFLARE_ACCOUNT_ID=xxx # Account ID +CF_TELEMETRY_DISABLED=1 # Disable telemetry +``` + +## Exit Codes + +`0` success, `1` user abort, `2` error + +## Examples + +```bash +# TypeScript Worker +npm create cloudflare@latest my-api -- --type=hello-world --lang=ts --no-deploy + +# Next.js on Pages +npm create cloudflare@latest my-app -- --type=web-app --framework=next --platform=pages --ts + +# Astro blog +npm create cloudflare@latest my-blog -- --type=web-app --framework=astro --ts --deploy + +# CI: non-interactive +npm create cloudflare@latest my-app -- --type=web-app --framework=next --ts --no-git --no-deploy + +# GitHub template +npm create cloudflare@latest -- --template=cloudflare/templates/worker-openapi + +# Convert existing project +npm create cloudflare@latest . 
-- --type=pre-existing --existing-script=./build/worker.js +``` diff --git a/cloudflare/references/c3/configuration.md b/cloudflare/references/c3/configuration.md new file mode 100644 index 0000000..37f9f82 --- /dev/null +++ b/cloudflare/references/c3/configuration.md @@ -0,0 +1,81 @@ +# C3 Generated Configuration + +## Output Structure + +``` +my-app/ +├── src/index.ts # Worker entry point +├── wrangler.jsonc # Cloudflare config +├── package.json # Scripts +├── tsconfig.json +└── .gitignore +``` + +## wrangler.jsonc + +```jsonc +{ + "$schema": "https://raw.githubusercontent.com/cloudflare/workers-sdk/main/packages/wrangler/config-schema.json", + "name": "my-app", + "main": "src/index.ts", + "compatibility_date": "2026-01-27" +} +``` + +## Binding Placeholders + +C3 generates **placeholder IDs** that must be replaced before deploy: + +```jsonc +{ + "kv_namespaces": [{ "binding": "MY_KV", "id": "placeholder_kv_id" }], + "d1_databases": [{ "binding": "DB", "database_id": "00000000-..." }] +} +``` + +**Replace with real IDs:** +```bash +npx wrangler kv namespace create MY_KV # Returns real ID +npx wrangler d1 create my-database # Returns real database_id +``` + +**Deployment error if not replaced:** +``` +Error: Invalid KV namespace ID "placeholder_kv_id" +``` + +## Scripts + +```json +{ + "scripts": { + "dev": "wrangler dev", + "deploy": "wrangler deploy", + "cf-typegen": "wrangler types" + } +} +``` + +## Type Generation + +Run after adding bindings: +```bash +npm run cf-typegen +``` + +Generates `.wrangler/types/runtime.d.ts`: +```typescript +interface Env { + MY_KV: KVNamespace; + DB: D1Database; +} +``` + +## Post-Creation Checklist + +1. Review `wrangler.jsonc` - check name, compatibility_date +2. Replace placeholder binding IDs with real resource IDs +3. Run `npm run cf-typegen` +4. Test: `npm run dev` +5. Deploy: `npm run deploy` +6. 
Add secrets: `npx wrangler secret put SECRET_NAME` diff --git a/cloudflare/references/c3/gotchas.md b/cloudflare/references/c3/gotchas.md new file mode 100644 index 0000000..ecd664d --- /dev/null +++ b/cloudflare/references/c3/gotchas.md @@ -0,0 +1,92 @@ +# C3 Troubleshooting + +## Deployment Issues + +### Placeholder IDs + +**Error:** "Invalid namespace ID" +**Fix:** Replace placeholders in wrangler.jsonc with real IDs: +```bash +npx wrangler kv namespace create MY_KV # Get real ID +``` + +### Authentication + +**Error:** "Not authenticated" +**Fix:** `npx wrangler login` or set `CLOUDFLARE_API_TOKEN` + +### Name Conflict + +**Error:** "Worker already exists" +**Fix:** Change `name` in wrangler.jsonc + +## Platform Selection + +| Need | Platform | +|------|----------| +| Git integration, branch previews | `--platform=pages` | +| Durable Objects, D1, Queues | Workers (default) | + +Wrong platform? Recreate with correct `--platform` flag. + +## TypeScript Issues + +**"Cannot find name 'KVNamespace'"** +```bash +npm run cf-typegen # Regenerate types +# Restart TS server in editor +``` + +**Missing types after config change:** Re-run `npm run cf-typegen` + +## Package Manager + +**Multiple lockfiles causing issues:** +```bash +rm pnpm-lock.yaml # If using npm +rm package-lock.json # If using pnpm +``` + +## CI/CD + +**CI hangs on prompts:** +```bash +npm create cloudflare@latest my-app -- \ + --type=hello-world --lang=ts --no-git --no-deploy +``` + +**Auth in CI:** +```yaml +env: + CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }} + CLOUDFLARE_ACCOUNT_ID: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }} +``` + +## Framework-Specific + +| Framework | Issue | Fix | +|-----------|-------|-----| +| Next.js | create-next-app failed | `npm cache clean --force`, retry | +| Astro | Adapter missing | Install `@astrojs/cloudflare` | +| Remix | Module errors | Update `@remix-run/cloudflare*` | + +## Compatibility Date + +**"Feature X requires compatibility_date >= ..."** +**Fix:** 
Update `compatibility_date` in wrangler.jsonc to today's date + +## Node.js Version + +**"Node.js version not supported"** +**Fix:** Install Node.js 18+ (`nvm install 20`) + +## Quick Reference + +| Error | Cause | Fix | +|-------|-------|-----| +| Invalid namespace ID | Placeholder binding | Create resource, update config | +| Not authenticated | No login | `npx wrangler login` | +| Cannot find KVNamespace | Missing types | `npm run cf-typegen` | +| Worker already exists | Name conflict | Change `name` | +| CI hangs | Missing flags | Add --type, --lang, --no-deploy | +| Template not found | Bad name | Check cloudflare/templates | diff --git a/cloudflare/references/c3/patterns.md b/cloudflare/references/c3/patterns.md new file mode 100644 index 0000000..76379e3 --- /dev/null +++ b/cloudflare/references/c3/patterns.md @@ -0,0 +1,82 @@ +# C3 Usage Patterns + +## Quick Workflows + +```bash +# TypeScript API Worker +npm create cloudflare@latest my-api -- --type=hello-world --lang=ts --deploy + +# Next.js on Pages +npm create cloudflare@latest my-app -- --type=web-app --framework=next --platform=pages --ts --deploy + +# Astro static site +npm create cloudflare@latest my-blog -- --type=web-app --framework=astro --platform=pages --ts +``` + +## CI/CD (GitHub Actions) + +```yaml +- name: Deploy + run: npm run deploy + env: + CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }} + CLOUDFLARE_ACCOUNT_ID: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }} +``` + +**Non-interactive requires:** +```bash +--type= # Required +--no-git # Recommended (CI already in git) +--no-deploy # Deploy separately with secrets +--framework= # For web-app +--ts / --no-ts # Required +``` + +## Monorepo + +C3 detects workspace config (`package.json` workspaces or `pnpm-workspace.yaml`). 
+ +```bash +cd packages/ +npm create cloudflare@latest my-worker -- --type=hello-world --lang=ts --no-deploy +``` + +## Custom Templates + +```bash +# GitHub repo +npm create cloudflare@latest -- --template=username/repo +npm create cloudflare@latest -- --template=cloudflare/templates/worker-openapi + +# Local path +npm create cloudflare@latest my-app -- --template=../my-template +``` + +**Template requires `c3.config.json`:** +```json +{ + "name": "my-template", + "category": "hello-world", + "copies": [{ "path": "src/" }, { "path": "wrangler.jsonc" }], + "transforms": [{ "path": "package.json", "jsonc": { "name": "{{projectName}}" }}] +} +``` + +## Existing Projects + +```bash +# Add Cloudflare to existing Worker +npm create cloudflare@latest . -- --type=pre-existing --existing-script=./dist/index.js + +# Add to existing framework app +npm create cloudflare@latest . -- --type=web-app --framework=next --platform=pages --ts +``` + +## Post-Creation Checklist + +1. Review `wrangler.jsonc` - set `compatibility_date`, verify `name` +2. Create bindings: `wrangler kv namespace create`, `wrangler d1 create`, `wrangler r2 bucket create` +3. Generate types: `npm run cf-typegen` +4. Test: `npm run dev` +5. Deploy: `npm run deploy` +6. 
Set secrets: `wrangler secret put SECRET_NAME` diff --git a/cloudflare/references/cache-reserve/README.md b/cloudflare/references/cache-reserve/README.md new file mode 100644 index 0000000..395347a --- /dev/null +++ b/cloudflare/references/cache-reserve/README.md @@ -0,0 +1,147 @@ +# Cloudflare Cache Reserve + +**Persistent cache storage built on R2 for long-term content retention** + +## Smart Shield Integration + +Cache Reserve is part of **Smart Shield**, Cloudflare's comprehensive security and performance suite: + +- **Smart Shield Advanced tier**: Includes 2TB Cache Reserve storage +- **Standalone purchase**: Available separately if not using Smart Shield +- **Migration**: Existing standalone customers can migrate to Smart Shield bundles + +**Decision**: Already on Smart Shield Advanced? Cache Reserve is included. Otherwise evaluate standalone purchase vs Smart Shield upgrade. + +## Overview + +Cache Reserve is Cloudflare's persistent, large-scale cache storage layer built on R2. It acts as the ultimate upper-tier cache, storing cacheable content for extended periods (30+ days) to maximize cache hits, reduce origin egress fees, and shield origins from repeated requests for long-tail content. + +## Core Concepts + +### What is Cache Reserve? + +- **Persistent storage layer**: Built on R2, sits above tiered cache hierarchy +- **Long-term retention**: 30-day default retention, extended on each access +- **Automatic operation**: Works seamlessly with existing CDN, no code changes required +- **Origin shielding**: Dramatically reduces origin egress by serving cached content longer +- **Usage-based pricing**: Pay only for storage + read/write operations + +### Cache Hierarchy + +``` +Visitor Request + ↓ +Lower-Tier Cache (closest to visitor) + ↓ (on miss) +Upper-Tier Cache (closest to origin) + ↓ (on miss) +Cache Reserve (R2 persistent storage) + ↓ (on miss) +Origin Server +``` + +### How It Works + +1. 
**On cache miss**: Content fetched from origin → written to Cache Reserve + edge caches simultaneously
+2. **On edge eviction**: Content may be evicted from edge cache but remains in Cache Reserve
+3. **On subsequent request**: If edge cache misses but Cache Reserve hits → content restored to edge caches
+4. **Retention**: Assets remain in Cache Reserve for 30 days since last access (configurable via TTL)
+
+## When to Use Cache Reserve
+
+```
+Need persistent caching?
+├─ High origin egress costs → Cache Reserve ✓
+├─ Long-tail content (archives, media libraries) → Cache Reserve ✓
+├─ Already using Smart Shield Advanced → Included! ✓
+├─ Video streaming with seeking (range requests) → ✗ Not supported
+├─ Dynamic/personalized content → ✗ Use edge cache only
+├─ Need per-request cache control from Workers → ✗ Use R2 directly
+└─ Frequently updated content (< 10hr lifetime) → ✗ Not eligible
+```
+
+## Asset Eligibility
+
+Cache Reserve only stores assets meeting **ALL** criteria:
+
+- Cacheable per Cloudflare's standard rules
+- Minimum 10-hour TTL (36000 seconds)
+- `Content-Length` header present
+- Original files only (not transformed images)
+
+### Eligibility Checklist
+
+Use this checklist to verify if an asset is eligible:
+
+- [ ] Zone has Cache Reserve enabled
+- [ ] Zone has Tiered Cache enabled (required)
+- [ ] Asset TTL ≥ 10 hours (36,000 seconds)
+- [ ] `Content-Length` header present on origin response
+- [ ] No `Set-Cookie` header (or uses private directive)
+- [ ] `Vary` header is NOT `*` (can be `Accept-Encoding`)
+- [ ] Not an image transformation variant (original images OK)
+- [ ] Not a range request (no HTTP 206 support)
+- [ ] Not O2O (Orange-to-Orange) proxied request
+
+**All boxes must be checked for Cache Reserve eligibility.**
+
+### Not Eligible
+
+- Assets with TTL < 10 hours
+- Responses without `Content-Length` header
+- Image transformation variants (original images are eligible)
+- Responses with `Set-Cookie` headers
+- Responses 
with `Vary: *` header +- Assets from R2 public buckets on same zone +- O2O (Orange-to-Orange) setup requests +- **Range requests** (video seeking, partial content downloads) + +## Quick Start + +```bash +# Enable via Dashboard +https://dash.cloudflare.com/caching/cache-reserve +# Click "Enable Storage Sync" or "Purchase" button +``` + +**Prerequisites:** +- Paid Cache Reserve plan or Smart Shield Advanced required +- Tiered Cache required for optimal performance + +## Essential Commands + +```bash +# Check Cache Reserve status +curl -X GET "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/cache/cache_reserve" \ + -H "Authorization: Bearer $API_TOKEN" + +# Enable Cache Reserve +curl -X PATCH "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/cache/cache_reserve" \ + -H "Authorization: Bearer $API_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"value": "on"}' + +# Check asset cache status +curl -I https://example.com/asset.jpg | grep -i cache +``` + +## In This Reference + +| Task | Files | +|------|-------| +| Evaluate if Cache Reserve fits your use case | README.md (this file) | +| Enable Cache Reserve for your zone | README.md + [configuration.md](./configuration.md) | +| Use with Workers (understand limitations) | [api.md](./api.md) | +| Setup via SDKs or IaC (TypeScript, Python, Terraform) | [configuration.md](./configuration.md) | +| Optimize costs and debug issues | [patterns.md](./patterns.md) + [gotchas.md](./gotchas.md) | +| Understand eligibility and troubleshoot | [gotchas.md](./gotchas.md) → [patterns.md](./patterns.md) | + +**Files:** +- [configuration.md](./configuration.md) - Setup, API, SDKs, and Cache Rules +- [api.md](./api.md) - Purging, monitoring, Workers integration +- [patterns.md](./patterns.md) - Best practices, cost optimization, debugging +- [gotchas.md](./gotchas.md) - Common issues, limitations, troubleshooting + +## See Also +- [r2](../r2/) - Cache Reserve built on R2 storage +- [workers](../workers/) - Workers integration 
with Cache API
diff --git a/cloudflare/references/cache-reserve/api.md b/cloudflare/references/cache-reserve/api.md
new file mode 100644
index 0000000..18c49d8
--- /dev/null
+++ b/cloudflare/references/cache-reserve/api.md
@@ -0,0 +1,194 @@
+# Cache Reserve API
+
+## Workers Integration
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│ CRITICAL: Workers Cache API ≠ Cache Reserve                    │
+│                                                                │
+│ • Workers caches.default / cache.put() → edge cache ONLY       │
+│ • Cache Reserve → zone-level setting, automatic, no per-req    │
+│ • You CANNOT selectively write to Cache Reserve from Workers   │
+│ • Cache Reserve works with standard fetch(), not cache.put()   │
+└────────────────────────────────────────────────────────────────┘
+```
+
+Cache Reserve is a **zone-level configuration**, not a per-request API. It works automatically when enabled for the zone:
+
+### Standard Fetch (Recommended)
+
+```typescript
+// Cache Reserve works automatically via standard fetch
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    // Standard fetch uses Cache Reserve automatically
+    return await fetch(request);
+  }
+};
+```
+
+### Cache API Limitations
+
+**IMPORTANT**: `cache.put()` is **NOT compatible** with Cache Reserve or Tiered Cache.
+
+```typescript
+// ❌ WRONG: cache.put() bypasses Cache Reserve
+const cache = caches.default;
+let response = await cache.match(request);
+if (!response) {
+  response = await fetch(request);
+  await cache.put(request, response.clone()); // Bypasses Cache Reserve! 
+} + +// ✅ CORRECT: Use standard fetch for Cache Reserve compatibility +return await fetch(request); + +// ✅ CORRECT: Use Cache API only for custom cache namespaces +const customCache = await caches.open('my-custom-cache'); +let response = await customCache.match(request); +if (!response) { + response = await fetch(request); + await customCache.put(request, response.clone()); // Custom cache OK +} +``` + +## Purging and Cache Management + +### Purge by URL (Instant) + +```typescript +// Purge specific URL from Cache Reserve immediately +const purgeCacheReserveByURL = async ( + zoneId: string, + apiToken: string, + urls: string[] +) => { + const response = await fetch( + `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`, + { + method: 'POST', + headers: { + 'Authorization': `Bearer ${apiToken}`, + 'Content-Type': 'application/json', + }, + body: JSON.stringify({ files: urls }) + } + ); + return await response.json(); +}; + +// Example usage +await purgeCacheReserveByURL('zone123', 'token456', [ + 'https://example.com/image.jpg', + 'https://example.com/video.mp4' +]); +``` + +### Purge by Tag/Host/Prefix (Revalidation) + +```typescript +// Purge by cache tag - forces revalidation, not immediate removal +await fetch( + `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`, + { + method: 'POST', + headers: { 'Authorization': `Bearer ${apiToken}`, 'Content-Type': 'application/json' }, + body: JSON.stringify({ tags: ['tag1', 'tag2'] }) + } +); +``` + +**Purge behavior:** +- **By URL**: Immediate removal from Cache Reserve + edge cache +- **By tag/host/prefix**: Revalidation only, assets remain in storage (costs continue) + +### Clear All Cache Reserve Data + +```typescript +// Requires Cache Reserve OFF first +await fetch( + `https://api.cloudflare.com/client/v4/zones/${zoneId}/cache/cache_reserve_clear`, + { method: 'POST', headers: { 'Authorization': `Bearer ${apiToken}` } } +); + +// Check status: GET same endpoint returns { state: 
"In-progress" | "Completed" } +``` + +**Process**: Disable Cache Reserve → Call clear endpoint → Wait up to 24hr → Re-enable + +## Monitoring and Analytics + +### Dashboard Analytics + +Navigate to **Caching > Cache Reserve** to view: + +- **Egress Savings**: Total bytes served from Cache Reserve vs origin egress cost saved +- **Requests Served**: Cache Reserve hits vs misses breakdown +- **Storage Used**: Current GB stored in Cache Reserve (billed monthly) +- **Operations**: Class A (writes) and Class B (reads) operation counts +- **Cost Tracking**: Estimated monthly costs based on current usage + +### Logpush Integration + +```typescript +// Logpush field: CacheReserveUsed (boolean) - filter for Cache Reserve hits +// Query Cache Reserve hits in analytics +const logpushQuery = ` + SELECT + ClientRequestHost, + COUNT(*) as requests, + SUM(EdgeResponseBytes) as bytes_served, + COUNT(CASE WHEN CacheReserveUsed = true THEN 1 END) as cache_reserve_hits, + COUNT(CASE WHEN CacheReserveUsed = false THEN 1 END) as cache_reserve_misses + FROM http_requests + WHERE Timestamp >= NOW() - INTERVAL '24 hours' + GROUP BY ClientRequestHost + ORDER BY requests DESC +`; + +// Filter only Cache Reserve hits +const crHitsQuery = ` + SELECT ClientRequestHost, COUNT(*) as requests, SUM(EdgeResponseBytes) as bytes + FROM http_requests + WHERE CacheReserveUsed = true AND Timestamp >= NOW() - INTERVAL '7 days' + GROUP BY ClientRequestHost + ORDER BY bytes DESC +`; +``` + +### GraphQL Analytics + +```graphql +query CacheReserveAnalytics($zoneTag: string, $since: string, $until: string) { + viewer { + zones(filter: { zoneTag: $zoneTag }) { + httpRequests1dGroups( + filter: { datetime_geq: $since, datetime_leq: $until } + limit: 1000 + ) { + dimensions { date } + sum { + cachedBytes + cachedRequests + bytes + requests + } + } + } + } +} +``` + +## Pricing + +```typescript +// Storage: $0.015/GB-month | Class A (writes): $4.50/M | Class B (reads): $0.36/M +// Cache miss: 1A + 1B | Cache hit: 
1B | Assets >1GB: proportionally more ops
+```
+
+## See Also
+
+- [README](./README.md) - Overview and core concepts
+- [Configuration](./configuration.md) - Setup and Cache Rules
+- [Patterns](./patterns.md) - Best practices and optimization
+- [Gotchas](./gotchas.md) - Common issues and troubleshooting
diff --git a/cloudflare/references/cache-reserve/configuration.md b/cloudflare/references/cache-reserve/configuration.md
new file mode 100644
index 0000000..84a6616
--- /dev/null
+++ b/cloudflare/references/cache-reserve/configuration.md
@@ -0,0 +1,169 @@
+# Cache Reserve Configuration
+
+## Dashboard Setup
+
+**Minimum steps to enable:**
+
+```bash
+# Navigate to dashboard
+https://dash.cloudflare.com/caching/cache-reserve
+
+# Click "Enable Storage Sync" or "Purchase" button
+```
+
+**Prerequisites:**
+- Paid Cache Reserve plan or Smart Shield Advanced required
+- Tiered Cache **required** for Cache Reserve to function optimally
+
+## API Configuration
+
+### REST API
+
+```bash
+# Enable
+curl -X PATCH "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/cache/cache_reserve" \
+  -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" \
+  -d '{"value": "on"}'
+
+# Check status
+curl -X GET "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/cache/cache_reserve" \
+  -H "Authorization: Bearer $API_TOKEN"
+```
+
+### TypeScript SDK
+
+```bash
+npm install cloudflare
+```
+
+```typescript
+import Cloudflare from 'cloudflare';
+
+const client = new Cloudflare({
+  apiToken: process.env.CLOUDFLARE_API_TOKEN,
+});
+
+// Enable Cache Reserve
+await client.cache.cacheReserve.edit({
+  zone_id: 'abc123',
+  value: 'on',
+});
+
+// Get Cache Reserve status
+const status = await client.cache.cacheReserve.get({
+  zone_id: 'abc123',
+});
+console.log(status.value); // 'on' or 'off'
+```
+
+### Python SDK
+
+```bash
+pip install cloudflare
+```
+
+```python
+import os
+
+from cloudflare import Cloudflare
+
+client = 
Cloudflare(api_token=os.environ.get("CLOUDFLARE_API_TOKEN")) + +# Enable Cache Reserve +client.cache.cache_reserve.edit( + zone_id="abc123", + value="on" +) + +# Get Cache Reserve status +status = client.cache.cache_reserve.get(zone_id="abc123") +print(status.value) # 'on' or 'off' +``` + +### Terraform + +```hcl +terraform { + required_providers { + cloudflare = { + source = "cloudflare/cloudflare" + version = "~> 4.0" + } + } +} + +provider "cloudflare" { + api_token = var.cloudflare_api_token +} + +resource "cloudflare_zone_cache_reserve" "example" { + zone_id = var.zone_id + enabled = true +} + +# Tiered Cache is required for Cache Reserve +resource "cloudflare_tiered_cache" "example" { + zone_id = var.zone_id + cache_type = "smart" +} +``` + +### Pulumi + +```typescript +import * as cloudflare from "@pulumi/cloudflare"; + +// Enable Cache Reserve +const cacheReserve = new cloudflare.ZoneCacheReserve("example", { + zoneId: zoneId, + enabled: true, +}); + +// Enable Tiered Cache (required) +const tieredCache = new cloudflare.TieredCache("example", { + zoneId: zoneId, + cacheType: "smart", +}); +``` + +### Required API Token Permissions + +- `Zone Settings Read` +- `Zone Settings Write` +- `Zone Read` +- `Zone Write` + +## Cache Rules Integration + +Control Cache Reserve eligibility via Cache Rules: + +```typescript +// Enable for static assets +{ + action: 'set_cache_settings', + action_parameters: { + cache_reserve: { eligible: true, minimum_file_ttl: 86400 }, + edge_ttl: { mode: 'override_origin', default: 86400 }, + cache: true + }, + expression: '(http.request.uri.path matches "\\.(jpg|png|webp|pdf|zip)$")' +} + +// Disable for APIs +{ + action: 'set_cache_settings', + action_parameters: { cache_reserve: { eligible: false } }, + expression: '(http.request.uri.path matches "^/api/")' +} + +// Create via API: PUT to zones/{zone_id}/rulesets/phases/http_request_cache_settings/entrypoint +``` + +## Wrangler Integration + +Cache Reserve works automatically with 
Workers deployed via Wrangler. No special wrangler.jsonc configuration needed - enable Cache Reserve via Dashboard or API for the zone. + +## See Also + +- [README](./README.md) - Overview and core concepts +- [API Reference](./api.md) - Purging and monitoring APIs +- [Patterns](./patterns.md) - Best practices and optimization +- [Gotchas](./gotchas.md) - Common issues and troubleshooting diff --git a/cloudflare/references/cache-reserve/gotchas.md b/cloudflare/references/cache-reserve/gotchas.md new file mode 100644 index 0000000..9995cf8 --- /dev/null +++ b/cloudflare/references/cache-reserve/gotchas.md @@ -0,0 +1,132 @@ +# Cache Reserve Gotchas + +## Common Errors + +### "Assets Not Being Cached in Cache Reserve" + +**Cause:** Asset is not cacheable, TTL < 10 hours, Content-Length header missing, or blocking headers present (Set-Cookie, Vary: *) +**Solution:** Ensure minimum TTL of 10+ hours (`Cache-Control: public, max-age=36000`), add Content-Length header, remove Set-Cookie header, and set `Vary: Accept-Encoding` (not *) + +### "Range Requests Not Working" (Video Seeking Fails) + +**Cause:** Cache Reserve does **NOT** support range requests (HTTP 206 Partial Content) +**Solution:** Range requests bypass Cache Reserve entirely. 
For video streaming with seeking: +- Use edge cache only (shorter TTLs) +- Consider R2 with direct access for range-heavy workloads +- Accept that seekable content won't benefit from Cache Reserve persistence + +### "Origin Bandwidth Higher Than Expected" + +**Cause:** Cache Reserve fetches **uncompressed** content from origin, even though it serves compressed to visitors +**Solution:** +- If origin charges by bandwidth, factor in uncompressed transfer costs +- Cache Reserve compresses for visitors automatically (saves visitor bandwidth) +- Compare: origin egress savings vs higher uncompressed fetch costs + +### "Cloudflare Images Not Caching with Cache Reserve" + +**Cause:** Cloudflare Images with `Vary: Accept` header (format negotiation) is incompatible with Cache Reserve +**Solution:** +- Cache Reserve silently skips images with Vary for format negotiation +- Original images (non-transformed) may still be eligible +- Use Cloudflare Images variants or edge cache for transformed images + +### "High Class A Operations Costs" + +**Cause:** Frequent cache misses, short TTLs, or frequent revalidation +**Solution:** Increase TTL for stable content (24+ hours), enable Tiered Cache to reduce direct Cache Reserve misses, or use stale-while-revalidate + +### "Purge Not Working as Expected" + +**Cause:** Purge by tag only triggers revalidation but doesn't remove from Cache Reserve storage +**Solution:** Use purge by URL for immediate removal, or disable Cache Reserve then clear all data for complete removal + +### "O2O (Orange-to-Orange) Assets Not Caching" + +**Cause:** Orange-to-Orange (proxied zone requesting another proxied zone on Cloudflare) bypasses Cache Reserve +**Solution:** +- **What is O2O**: Zone A (proxied) → Zone B (proxied), both on Cloudflare +- **Detection**: Check `cf-cache-status` for `BYPASS` and review request path +- **Workaround**: Use R2 or direct origin access instead of O2O proxy chains + +### "Cache Reserve must be OFF before clearing data" + 
+**Cause:** Attempting to clear Cache Reserve data while it's still enabled +**Solution:** Disable Cache Reserve first, wait briefly for propagation (5s), then clear data (can take up to 24 hours) + +## Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Minimum TTL | 10 hours (36000 seconds) | Assets with shorter TTL not eligible | +| Default retention | 30 days (2592000 seconds) | Configurable | +| Maximum file size | Same as R2 limits | No practical limit | +| Purge/clear time | Up to 24 hours | Complete propagation time | +| Plan requirement | Paid Cache Reserve or Smart Shield | Not available on free plans | +| Content-Length header | Required | Must be present for eligibility | +| Set-Cookie header | Blocks caching | Must not be present (or use private directive) | +| Vary header | Cannot be * | Can use Vary: Accept-Encoding | +| Image transformations | Variants not eligible | Original images only | +| Range requests | NOT supported | HTTP 206 bypasses Cache Reserve | +| Compression | Fetches uncompressed | Serves compressed to visitors | +| Worker control | Zone-level only | Cannot control per-request | +| O2O requests | Bypassed | Orange-to-Orange not eligible | + +## Additional Resources + +- **Official Docs**: https://developers.cloudflare.com/cache/advanced-configuration/cache-reserve/ +- **API Reference**: https://developers.cloudflare.com/api/resources/cache/subresources/cache_reserve/ +- **Cache Rules**: https://developers.cloudflare.com/cache/how-to/cache-rules/ +- **Workers Cache API**: https://developers.cloudflare.com/workers/runtime-apis/cache/ +- **R2 Documentation**: https://developers.cloudflare.com/r2/ +- **Smart Shield**: https://developers.cloudflare.com/smart-shield/ +- **Tiered Cache**: https://developers.cloudflare.com/cache/how-to/tiered-cache/ + +## Troubleshooting Flowchart + +Asset not caching in Cache Reserve? + +``` +1. Is Cache Reserve enabled for zone? 
+ → No: Enable via Dashboard or API + → Yes: Continue to step 2 + +2. Is Tiered Cache enabled? + → No: Enable Tiered Cache (required!) + → Yes: Continue to step 3 + +3. Does asset have TTL ≥ 10 hours? + → No: Increase via Cache Rules (edge_ttl override) + → Yes: Continue to step 4 + +4. Is Content-Length header present? + → No: Fix origin to include Content-Length + → Yes: Continue to step 5 + +5. Is Set-Cookie header present? + → Yes: Remove Set-Cookie or scope appropriately + → No: Continue to step 6 + +6. Is Vary header set to *? + → Yes: Change to specific value (e.g., Accept-Encoding) + → No: Continue to step 7 + +7. Is this a range request? + → Yes: Range requests bypass Cache Reserve (not supported) + → No: Continue to step 8 + +8. Is this an O2O (Orange-to-Orange) request? + → Yes: O2O bypasses Cache Reserve + → No: Continue to step 9 + +9. Check Logpush CacheReserveUsed field + → Filter logs to see if assets ever hit Cache Reserve + → Verify cf-cache-status header (should be HIT after first request) +``` + +## See Also + +- [README](./README.md) - Overview and core concepts +- [Configuration](./configuration.md) - Setup and Cache Rules +- [API Reference](./api.md) - Purging and monitoring +- [Patterns](./patterns.md) - Best practices and optimization diff --git a/cloudflare/references/cache-reserve/patterns.md b/cloudflare/references/cache-reserve/patterns.md new file mode 100644 index 0000000..65f9488 --- /dev/null +++ b/cloudflare/references/cache-reserve/patterns.md @@ -0,0 +1,197 @@ +# Cache Reserve Patterns + +## Best Practices + +### 1. Always Enable Tiered Cache + +```typescript +// Cache Reserve is designed for use WITH Tiered Cache +const configuration = { + tieredCache: 'enabled', // Required for optimal performance + cacheReserve: 'enabled', // Works best with Tiered Cache + + hierarchy: [ + 'Lower-Tier Cache (visitor)', + 'Upper-Tier Cache (origin region)', + 'Cache Reserve (persistent)', + 'Origin' + ] +}; +``` + +### 2. 
Set Appropriate Cache-Control Headers + +```typescript +// Origin response headers for Cache Reserve eligibility +const originHeaders = { + 'Cache-Control': 'public, max-age=86400', // 24hr (minimum 10hr) + 'Content-Length': '1024000', // Required + 'Cache-Tag': 'images,product-123', // Optional: purging + 'ETag': '"abc123"', // Optional: revalidation + // Avoid: 'Set-Cookie' and 'Vary: *' prevent caching +}; +``` + +### 3. Use Cache Rules for Fine-Grained Control + +```typescript +// Different TTLs for different content types +const cacheRules = [ + { + description: 'Long-term cache for immutable assets', + expression: '(http.request.uri.path matches "^/static/.*\\.[a-f0-9]{8}\\.")', + action_parameters: { + cache_reserve: { eligible: true }, + edge_ttl: { mode: 'override_origin', default: 2592000 }, // 30 days + cache: true + } + }, + { + description: 'Moderate cache for regular images', + expression: '(http.request.uri.path matches "\\.(jpg|png|webp)$")', + action_parameters: { + cache_reserve: { eligible: true }, + edge_ttl: { mode: 'override_origin', default: 86400 }, // 24 hours + cache: true + } + }, + { + description: 'Exclude API from Cache Reserve', + expression: '(http.request.uri.path matches "^/api/")', + action_parameters: { cache_reserve: { eligible: false }, cache: false } + } +]; +``` + +### 4. Making Assets Cache Reserve Eligible from Workers + +**Note**: This modifies response headers to meet eligibility criteria but does NOT directly control Cache Reserve storage (which is zone-level automatic). 
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const response = await fetch(request);
+    if (!response.ok) return response;
+
+    const headers = new Headers(response.headers);
+    headers.set('Cache-Control', 'public, max-age=36000'); // 10hr minimum
+    headers.delete('Set-Cookie'); // Blocks caching
+
+    // Ensure Content-Length present
+    if (!headers.has('Content-Length')) {
+      const blob = await response.blob();
+      headers.set('Content-Length', blob.size.toString());
+      return new Response(blob, { status: response.status, headers });
+    }
+
+    return new Response(response.body, { status: response.status, headers });
+  }
+};
+```
+
+### 5. Hostname Best Practices
+
+Use Worker's hostname for efficient caching - avoid overriding hostname unnecessarily.
+
+## Architecture Patterns
+
+### Multi-Tier Caching + Immutable Assets
+
+```typescript
+// Optimal: L1 (visitor) → L2 (region) → L3 (Cache Reserve) → Origin
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const url = new URL(request.url);
+    const isImmutable = /\.[a-f0-9]{8,}\.(js|css|jpg|png|woff2)$/.test(url.pathname);
+    const response = await fetch(request);
+
+    if (isImmutable) {
+      const headers = new Headers(response.headers);
+      headers.set('Cache-Control', 'public, max-age=31536000, immutable');
+      return new Response(response.body, { status: response.status, headers });
+    }
+    return response;
+  }
+};
+```
+
+## Cost Optimization
+
+### Cost Calculator
+
+```typescript
+interface CacheReserveEstimate {
+  avgAssetSizeGB: number;
+  uniqueAssets: number;
+  monthlyReads: number;
+  monthlyWrites: number;
+  originEgressCostPerGB: number; // e.g., AWS: $0.09/GB
+}
+
+function estimateMonthlyCost(input: CacheReserveEstimate) {
+  // Cache Reserve pricing
+  const storageCostPerGBMonth = 0.015;
+  const classAPerMillion = 4.50; // writes
+  const classBPerMillion = 0.36; // reads
+
+  // Calculate Cache Reserve costs
+  const totalStorageGB = input.avgAssetSizeGB * 
input.uniqueAssets;
+  const storageCost = totalStorageGB * storageCostPerGBMonth;
+  const writeCost = (input.monthlyWrites / 1_000_000) * classAPerMillion;
+  const readCost = (input.monthlyReads / 1_000_000) * classBPerMillion;
+
+  const cacheReserveCost = storageCost + writeCost + readCost;
+
+  // Calculate origin egress cost (what you'd pay without Cache Reserve)
+  const totalTrafficGB = (input.monthlyReads * input.avgAssetSizeGB);
+  const originEgressCost = totalTrafficGB * input.originEgressCostPerGB;
+
+  // Savings calculation
+  const savings = originEgressCost - cacheReserveCost;
+  const savingsPercent = ((savings / originEgressCost) * 100).toFixed(1);
+
+  return {
+    cacheReserveCost: `$${cacheReserveCost.toFixed(2)}`,
+    originEgressCost: `$${originEgressCost.toFixed(2)}`,
+    monthlySavings: `$${savings.toFixed(2)}`,
+    savingsPercent: `${savingsPercent}%`,
+    breakdown: {
+      storage: `$${storageCost.toFixed(2)}`,
+      writes: `$${writeCost.toFixed(2)}`,
+      reads: `$${readCost.toFixed(2)}`,
+    }
+  };
+}
+
+// Example: Media library
+const mediaLibrary = estimateMonthlyCost({
+  avgAssetSizeGB: 0.005, // 5MB images
+  uniqueAssets: 10_000,
+  monthlyReads: 5_000_000,
+  monthlyWrites: 50_000,
+  originEgressCostPerGB: 0.09, // AWS S3
+});
+
+console.log(mediaLibrary);
+// Roughly: cacheReserveCost ≈ $2.78 (storage $0.75 + writes $0.23 + reads $1.80),
+// originEgressCost ≈ $2250.00, monthlySavings ≈ $2247.22, savingsPercent ≈ "99.9%"
+```
+
+### Optimization Guidelines
+
+- **Set appropriate TTLs**: 10hr minimum, 24hr+ optimal for stable content, 30d max cautiously
+- **Cache high-value stable assets**: Images, media, fonts, archives, documentation
+- **Exclude frequently changing**: APIs, user-specific content, real-time data
+- **Compression note**: Cache Reserve fetches uncompressed from origin, serves compressed to visitors - factor in origin egress costs
+
+## See Also
+
+- 
[Configuration](./configuration.md) - Setup and Cache Rules +- [API Reference](./api.md) - Purging and monitoring +- [Gotchas](./gotchas.md) - Common issues and troubleshooting diff --git a/cloudflare/references/containers/README.md b/cloudflare/references/containers/README.md new file mode 100644 index 0000000..a6c488d --- /dev/null +++ b/cloudflare/references/containers/README.md @@ -0,0 +1,85 @@ +# Cloudflare Containers Skill Reference + +**APPLIES TO: Cloudflare Containers ONLY - NOT general Cloudflare Workers** + +Use when working with Cloudflare Containers: deploying containerized apps on Workers platform, configuring container-enabled Durable Objects, managing container lifecycle, or implementing stateful/stateless container patterns. + +## Beta Status + +⚠️ Containers is currently in **beta**. API may change without notice. No SLA guarantees. Custom instance types added Jan 2026. + +## Core Concepts + +**Container as Durable Object:** Each container is a Durable Object with persistent identity. Accessed via `getByName(id)` or `getRandom()`. + +**Image deployment:** Images pre-fetched globally. Deployments use rolling strategy (not instant like Workers). + +**Lifecycle:** cold start (2-3s) → running → `sleepAfter` timeout → stopped. No autoscaling - manual load balancing via `getRandom()`. + +**Persistent identity, ephemeral disk:** Container ID persists, but disk resets on stop. Use Durable Object storage for persistence. 
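The "persistent identity, ephemeral disk" point can be sketched as follows. This is an illustrative, self-contained sketch, not the library API: `KVStorage`, `recordStart`, and `memoryStorage` are hypothetical stand-ins for the Durable Object storage interface (assumed to be `this.ctx.storage` with `get`/`put` inside a `Container` subclass).

```typescript
// Sketch: survive container restarts by keeping state in Durable Object
// storage instead of on the container's ephemeral disk.
// `KVStorage` is a local stand-in for the DO storage API (get/put).
interface KVStorage {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}

// Counts container (re)starts; disk state would reset, this value persists.
async function recordStart(storage: KVStorage): Promise<number> {
  const starts = (await storage.get<number>("starts")) ?? 0;
  await storage.put("starts", starts + 1);
  return starts + 1;
}

// In-memory stub so the sketch runs anywhere.
function memoryStorage(): KVStorage {
  const map = new Map<string, unknown>();
  return {
    async get<T>(key: string) { return map.get(key) as T | undefined; },
    async put<T>(key: string, value: T) { map.set(key, value); },
  };
}
```

In a real `Container` subclass you would call something like `recordStart(this.ctx.storage)` from `onStart()`; anything written to the container's filesystem is gone after `sleepAfter` stops the instance.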
+ +## Quick Start + +```typescript +import { Container } from "@cloudflare/containers"; + +export class MyContainer extends Container { + defaultPort = 8080; + sleepAfter = "30m"; +} + +export default { + async fetch(request: Request, env: Env) { + const container = env.MY_CONTAINER.getByName("instance-1"); + await container.startAndWaitForPorts(); + return container.fetch(request); + } +}; +``` + +## Reading Order + +| Task | Files | +|------|-------| +| Setup new container project | README → configuration.md | +| Implement container logic | README → api.md → patterns.md | +| Choose routing pattern | patterns.md (routing section) | +| Debug issues | gotchas.md | +| Production hardening | gotchas.md → patterns.md (lifecycle) | + +## Routing Decision Tree + +**How should requests reach containers?** + +- **Same user/session → same container:** Use `getByName(sessionId)` for session affinity +- **Stateless, spread load:** Use `getRandom()` for load balancing +- **Job per container:** Use `getByName(jobId)` + explicit lifecycle management +- **Single global instance:** Use `getByName("singleton")` + +## When to Use Containers vs Workers + +**Use Containers when:** +- Need stateful, long-lived processes (sessions, WebSockets, games) +- Running existing containerized apps (Node.js, Python, custom binaries) +- Need filesystem access or specific system dependencies +- Per-user/session isolation with dedicated compute + +**Use Workers when:** +- Stateless HTTP handlers +- Sub-millisecond cold starts required +- Auto-scaling to zero critical +- Simple request/response patterns + +## In This Reference + +- **[configuration.md](configuration.md)** - Wrangler config, instance types, Container class properties, environment variables, account limits +- **[api.md](api.md)** - Container class API, startup methods, communication (HTTP/TCP/WebSocket), routing helpers, lifecycle hooks, scheduling, state inspection +- **[patterns.md](patterns.md)** - Routing patterns (session 
affinity, load balancing, singleton), WebSocket forwarding, graceful shutdown, Workflow/Queue integration +- **[gotchas.md](gotchas.md)** - Critical gotchas (WebSocket, startup methods), common errors with solutions, specific limits, beta caveats + +## See Also + +- [Durable Objects](../durable-objects/) - Containers extend Durable Objects +- [Workflows](../workflows/) - Orchestrate container operations +- [Queues](../queues/) - Trigger containers from queue messages +- [Cloudflare Docs](https://developers.cloudflare.com/containers/) diff --git a/cloudflare/references/containers/api.md b/cloudflare/references/containers/api.md new file mode 100644 index 0000000..c41f721 --- /dev/null +++ b/cloudflare/references/containers/api.md @@ -0,0 +1,187 @@ +## Container Class API + +```typescript +import { Container } from "@cloudflare/containers"; + +export class MyContainer extends Container { + defaultPort = 8080; + requiredPorts = [8080]; + sleepAfter = "30m"; + enableInternet = true; + pingEndpoint = "/health"; + envVars = {}; + entrypoint = []; + + onStart() { /* container started */ } + onStop() { /* container stopping */ } + onError(error: Error) { /* container error */ } + onActivityExpired(): boolean { /* timeout, return true to stay alive */ } + async alarm() { /* scheduled task */ } +} +``` + +## Routing + +**getByName(id)** - Named instance for session affinity, per-user state +**getRandom()** - Random instance for load balancing stateless services + +```typescript +const container = env.MY_CONTAINER.getByName("user-123"); +const container = env.MY_CONTAINER.getRandom(); +``` + +## Startup Methods + +### start() - Basic start (8s timeout) + +```typescript +await container.start(); +await container.start({ envVars: { KEY: "value" } }); +``` + +Returns when **process starts**, NOT when ports ready. Use for fire-and-forget. 
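One place fire-and-forget fits is pre-warming: kick off `start()` in the background and answer immediately. A hedged sketch — `Startable` stands in for the stub returned by `getByName()`/`getRandom()`, and the binding name in the comment is hypothetical:

```typescript
// Stand-in for the container stub's start() method.
interface Startable {
  start(): Promise<void>;
}

// Kick off startup without making the caller wait for ports (or even for
// success); failures are logged rather than propagated.
export async function prewarm(container: Startable): Promise<boolean> {
  try {
    await container.start(); // resolves when the process starts, not when ports are ready
    return true;
  } catch (err) {
    console.error("prewarm failed:", err);
    return false;
  }
}

// In a Worker fetch handler (sketch):
//   ctx.waitUntil(prewarm(env.MY_CONTAINER.getByName("prewarm")));
```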
+ +### startAndWaitForPorts() - Recommended (20s timeout) + +```typescript +await container.startAndWaitForPorts(); // Uses requiredPorts +await container.startAndWaitForPorts({ ports: [8080, 9090] }); +await container.startAndWaitForPorts({ + ports: [8080], + startOptions: { envVars: { KEY: "value" } } +}); +``` + +Returns when **ports listening**. Use before HTTP/TCP requests. + +**Port resolution:** explicit ports → requiredPorts → defaultPort → port 33 + +### waitForPort() - Wait for specific port + +```typescript +await container.waitForPort(8080); +await container.waitForPort(8080, { timeout: 30000 }); +``` + +## Communication + +### fetch() - HTTP with WebSocket support + +```typescript +// ✅ Supports WebSocket upgrades +const response = await container.fetch(request); +const response = await container.fetch("http://container/api", { + method: "POST", + body: JSON.stringify({ data: "value" }) +}); +``` + +**Use for:** All HTTP, especially WebSocket. + +### containerFetch() - HTTP only (no WebSocket) + +```typescript +// ❌ No WebSocket support +const response = await container.containerFetch(request); +``` + +**⚠️ Critical:** Use `fetch()` for WebSocket, not `containerFetch()`. + +### TCP Connections + +```typescript +const port = this.ctx.container.getTcpPort(8080); +const conn = port.connect(); +await conn.opened; + +if (request.body) await request.body.pipeTo(conn.writable); +return new Response(conn.readable); +``` + +### switchPort() - Change default port + +```typescript +this.switchPort(8081); // Subsequent fetch() uses this port +``` + +## Lifecycle Hooks + +### onStart() + +Called when container process starts (ports may not be ready). Runs in `blockConcurrencyWhile` - no concurrent requests. + +```typescript +onStart() { + console.log("Container starting"); +} +``` + +### onStop() + +Called when SIGTERM received. 15 minutes until SIGKILL. Use for graceful shutdown. 
+ +```typescript +onStop() { + // Save state, close connections, flush logs +} +``` + +### onError() + +Called when container crashes or fails to start. + +```typescript +onError(error: Error) { + console.error("Container error:", error); +} +``` + +### onActivityExpired() + +Called when `sleepAfter` timeout reached. Return `true` to stay alive, `false` to stop. + +```typescript +onActivityExpired(): boolean { + if (this.hasActiveConnections()) return true; // Keep alive + return false; // OK to stop +} +``` + +## Scheduling + +```typescript +export class ScheduledContainer extends Container { + async fetch(request: Request) { + await this.schedule(Date.now() + 60000); // 1 minute + await this.schedule("2026-01-28T00:00:00Z"); // ISO string + return new Response("Scheduled"); + } + + async alarm() { + // Called when schedule fires (SQLite-backed, survives restarts) + } +} +``` + +**⚠️ Don't override `alarm()` directly when using `schedule()` helper.** + +## State Inspection + +### External state check + +```typescript +const state = await container.getState(); +// state.status: "starting" | "running" | "stopping" | "stopped" +``` + +### Internal state check + +```typescript +export class MyContainer extends Container { + async fetch(request: Request) { + if (this.ctx.container.running) { ... 
} + } +} +``` + +**⚠️ Use `getState()` for external checks, `ctx.container.running` for internal.** diff --git a/cloudflare/references/containers/configuration.md b/cloudflare/references/containers/configuration.md new file mode 100644 index 0000000..fd39cc4 --- /dev/null +++ b/cloudflare/references/containers/configuration.md @@ -0,0 +1,188 @@ +## Wrangler Configuration + +### Basic Container Config + +```jsonc +{ + "name": "my-worker", + "main": "src/index.ts", + "compatibility_date": "2026-01-10", + "containers": [ + { + "class_name": "MyContainer", + "image": "./Dockerfile", // Path to Dockerfile or directory with Dockerfile + "instance_type": "standard-1", // Predefined or custom (see below) + "max_instances": 10 + } + ], + "durable_objects": { + "bindings": [ + { + "name": "MY_CONTAINER", + "class_name": "MyContainer" + } + ] + }, + "migrations": [ + { + "tag": "v1", + "new_sqlite_classes": ["MyContainer"] // Must use new_sqlite_classes + } + ] +} +``` + +Key config requirements: +- `image` - Path to Dockerfile or directory containing Dockerfile +- `class_name` - Must match Container class export name +- `max_instances` - Max concurrent container instances +- Must configure Durable Objects binding AND migrations + +### Instance Types + +#### Predefined Types + +| Type | vCPU | Memory | Disk | +|------|------|--------|------| +| lite | 1/16 | 256 MiB | 2 GB | +| basic | 1/4 | 1 GiB | 4 GB | +| standard-1 | 1/2 | 4 GiB | 8 GB | +| standard-2 | 1 | 6 GiB | 12 GB | +| standard-3 | 2 | 8 GiB | 16 GB | +| standard-4 | 4 | 12 GiB | 20 GB | + +```jsonc +{ + "containers": [ + { + "class_name": "MyContainer", + "image": "./Dockerfile", + "instance_type": "standard-2" // Use predefined type + } + ] +} +``` + +#### Custom Types (Jan 2026 Feature) + +```jsonc +{ + "containers": [ + { + "class_name": "MyContainer", + "image": "./Dockerfile", + "instance_type_custom": { + "vcpu": 2, // 1-4 vCPU + "memory_mib": 8192, // 512-12288 MiB (up to 12 GiB) + "disk_mib": 16384 // 
2048-20480 MiB (up to 20 GB) + } + } + ] +} +``` + +**Custom type constraints:** +- Minimum 3 GiB memory per vCPU +- Maximum 2 GB disk per 1 GiB memory +- Max 4 vCPU, 12 GiB memory, 20 GB disk per container + +### Account Limits + +| Resource | Limit | Notes | +|----------|-------|-------| +| Total memory (all containers) | 400 GiB | Across all running containers | +| Total vCPU (all containers) | 100 | Across all running containers | +| Total disk (all containers) | 2 TB | Across all running containers | +| Image storage per account | 50 GB | Stored container images | + +### Container Class Properties + +```typescript +import { Container } from "@cloudflare/containers"; + +export class MyContainer extends Container { + // Port Configuration + defaultPort = 8080; // Default port for fetch() calls + requiredPorts = [8080, 9090]; // Ports to wait for in startAndWaitForPorts() + + // Lifecycle + sleepAfter = "30m"; // Inactivity timeout (5m, 30m, 2h, etc.) + + // Network + enableInternet = true; // Allow outbound internet access + + // Health Check + pingEndpoint = "/health"; // Health check endpoint path + + // Environment + envVars = { // Environment variables passed to container + NODE_ENV: "production", + LOG_LEVEL: "info" + }; + + // Startup + entrypoint = ["/bin/start.sh"]; // Override image entrypoint (optional) +} +``` + +**Property details:** + +- **`defaultPort`**: Port used when calling `container.fetch()` without explicit port. Falls back to port 33 if not set. + +- **`requiredPorts`**: Array of ports that must be listening before `startAndWaitForPorts()` returns. First port becomes default if `defaultPort` not set. + +- **`sleepAfter`**: Duration string (e.g., "5m", "30m", "2h"). Container stops after this period of inactivity. Timer resets on each request. + +- **`enableInternet`**: Boolean. If `true`, container can make outbound HTTP/TCP requests. + +- **`pingEndpoint`**: Path used for health checks. Should respond with 2xx status. 
+ +- **`envVars`**: Object of environment variables. Merged with runtime-provided vars (see below). + +- **`entrypoint`**: Array of strings. Overrides container image's CMD/ENTRYPOINT. + +### Runtime Environment Variables + +Cloudflare automatically provides these environment variables to containers: + +| Variable | Description | +|----------|-------------| +| `CLOUDFLARE_APPLICATION_ID` | Worker application ID | +| `CLOUDFLARE_COUNTRY_A2` | Two-letter country code of request origin | +| `CLOUDFLARE_LOCATION` | Cloudflare data center location | +| `CLOUDFLARE_REGION` | Region identifier | +| `CLOUDFLARE_DURABLE_OBJECT_ID` | Container's Durable Object ID | + +Custom `envVars` from Container class are merged with these. Custom vars override runtime vars if names conflict. + +### Image Management + +**Distribution model:** Images pre-fetched to all global locations before deployment. Ensures fast cold starts (2-3s typical). + +**Rolling deploys:** Unlike Workers (instant), container deployments roll out gradually. Old versions continue running during rollout. + +**Ephemeral disk:** Container disk is ephemeral and resets on each stop. Use Durable Object storage (`this.ctx.storage`) for persistence. + +## wrangler.toml Format + +```toml +name = "my-worker" +main = "src/index.ts" +compatibility_date = "2026-01-10" + +[[containers]] +class_name = "MyContainer" +image = "./Dockerfile" +instance_type = "standard-2" +max_instances = 10 + +[[durable_objects.bindings]] +name = "MY_CONTAINER" +class_name = "MyContainer" + +[[migrations]] +tag = "v1" +new_sqlite_classes = ["MyContainer"] +``` + +Both `wrangler.jsonc` and `wrangler.toml` are supported. Use `wrangler.jsonc` for comments and better IDE support. 
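## Reading Runtime Variables in the Container

Inside the container process, the runtime variables from the table above arrive as ordinary environment variables. A minimal container-side sketch (the fallback strings are illustrative, and `LOG_LEVEL` assumes the custom `envVars` entry shown earlier):

```typescript
// Runs inside the container image, not in the Worker.
// Reads Cloudflare-provided runtime variables plus one custom envVars entry.
export function describeRuntime(env: Record<string, string | undefined>): string {
  const location = env.CLOUDFLARE_LOCATION ?? "unknown-location";
  const country = env.CLOUDFLARE_COUNTRY_A2 ?? "??";
  const logLevel = env.LOG_LEVEL ?? "info"; // custom var from the Container class
  return `running in ${location} (${country}), log level ${logLevel}`;
}

// At container startup (Node.js): console.log(describeRuntime(process.env));
```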
diff --git a/cloudflare/references/containers/gotchas.md b/cloudflare/references/containers/gotchas.md new file mode 100644 index 0000000..306e8c5 --- /dev/null +++ b/cloudflare/references/containers/gotchas.md @@ -0,0 +1,178 @@ +## Critical Gotchas + +### ⚠️ WebSocket: fetch() vs containerFetch() + +**Problem:** WebSocket connections fail silently + +**Cause:** `containerFetch()` doesn't support WebSocket upgrades + +**Fix:** Always use `fetch()` for WebSocket + +```typescript +// ❌ WRONG +return container.containerFetch(request); + +// ✅ CORRECT +return container.fetch(request); +``` + +### ⚠️ startAndWaitForPorts() vs start() + +**Problem:** "connection refused" after `start()` + +**Cause:** `start()` returns when process starts, NOT when ports ready + +**Fix:** Use `startAndWaitForPorts()` before requests + +```typescript +// ❌ WRONG +await container.start(); +return container.fetch(request); + +// ✅ CORRECT +await container.startAndWaitForPorts(); +return container.fetch(request); +``` + +### ⚠️ Activity Timeout on Long Operations + +**Problem:** Container stops during long work + +**Cause:** `sleepAfter` based on request activity, not internal work + +**Fix:** Renew timeout by touching storage + +```typescript +const interval = setInterval(() => { + this.ctx.storage.put("keepalive", Date.now()); +}, 60000); + +try { + await this.doLongWork(data); +} finally { + clearInterval(interval); +} +``` + +### ⚠️ blockConcurrencyWhile for Startup + +**Problem:** Race conditions during initialization + +**Fix:** Use `blockConcurrencyWhile` for atomic initialization + +```typescript +await this.ctx.blockConcurrencyWhile(async () => { + if (!this.initialized) { + await this.startAndWaitForPorts(); + this.initialized = true; + } +}); +``` + +### ⚠️ Lifecycle Hooks Block Requests + +**Problem:** Container unresponsive during `onStart()` + +**Cause:** Hooks run in `blockConcurrencyWhile` - no concurrent requests + +**Fix:** Keep hooks fast, avoid long operations + +### ⚠️ 
Don't Override alarm() When Using schedule() + +**Problem:** Scheduled tasks don't execute + +**Cause:** `schedule()` uses `alarm()` internally + +**Fix:** Don't override `alarm()` yourself; let the base class's `alarm()` dispatch your scheduled tasks + +## Common Errors + +### "Container start timeout" + +**Cause:** Container took >8s (`start()`) or >20s (`startAndWaitForPorts()`) + +**Solutions:** +- Optimize image (smaller base, fewer layers) +- Check that `entrypoint` is correct +- Verify app listens on correct ports +- Increase timeout if needed + +### "Port not available" + +**Cause:** Calling `fetch()` before port ready + +**Solution:** Use `startAndWaitForPorts()` + +### "Container memory exceeded" + +**Cause:** Using more memory than instance type allows + +**Solutions:** +- Use larger instance type (standard-2, standard-3, standard-4) +- Optimize app memory usage +- Use custom instance type + +```jsonc +"instance_type_custom": { + "vcpu": 2, + "memory_mib": 8192 +} +``` + +### "Max instances reached" + +**Cause:** All `max_instances` slots in use + +**Solutions:** +- Increase `max_instances` +- Implement proper `sleepAfter` +- Use `getRandom()` for distribution +- Check for instance leaks + +### "No container instance available" + +**Cause:** Account capacity limits reached + +**Solutions:** +- Check account limits +- Review instance types across containers +- Contact Cloudflare support + +## Limits + +| Resource | Limit | Notes | +|----------|-------|-------| +| Cold start | 2-3s | Image pre-fetched globally | +| Graceful shutdown | 15 min | SIGTERM → SIGKILL | +| `start()` timeout | 8s | Process start | +| `startAndWaitForPorts()` timeout | 20s | Port ready | +| Max vCPU per container | 4 | standard-4 or custom | +| Max memory per container | 12 GiB | standard-4 or custom | +| Max disk per container | 20 GB | Ephemeral, resets | +| Account total memory | 400 GiB | All containers | +| Account total vCPU | 100 | All containers | +| Account total disk | 2 TB | All containers | +| Image storage | 50 GB | Per 
account | +| Disk persistence | None | Use DO storage | + +## Best Practices + +1. **Use `startAndWaitForPorts()` by default** - Prevents port errors +2. **Set appropriate `sleepAfter`** - Balance resources vs cold starts +3. **Use `fetch()` for WebSocket** - Not `containerFetch()` +4. **Design for restarts** - Ephemeral disk, implement graceful shutdown +5. **Monitor resources** - Stay within account limits +6. **Keep hooks fast** - Run in `blockConcurrencyWhile` +7. **Renew activity for long ops** - Touch storage to prevent timeout + +## Beta Caveats + +⚠️ Containers in **beta**: + +- **API may change** without notice +- **No SLA** guarantees +- **Limited regions** initially +- **No autoscaling** - manual via `getRandom()` +- **Rolling deploys** only (not instant like Workers) + +Plan for API changes, test thoroughly before production. diff --git a/cloudflare/references/containers/patterns.md b/cloudflare/references/containers/patterns.md new file mode 100644 index 0000000..9204294 --- /dev/null +++ b/cloudflare/references/containers/patterns.md @@ -0,0 +1,202 @@ +## Routing Patterns + +### Session Affinity (Stateful) + +```typescript +export class SessionBackend extends Container { + defaultPort = 3000; + sleepAfter = "30m"; +} + +export default { + async fetch(request: Request, env: Env) { + const sessionId = request.headers.get("X-Session-ID") || crypto.randomUUID(); + const container = env.SESSION_BACKEND.getByName(sessionId); + await container.startAndWaitForPorts(); + return container.fetch(request); + } +}; +``` + +**Use:** User sessions, WebSocket, stateful games, per-user caching. + +### Load Balancing (Stateless) + +```typescript +export default { + async fetch(request: Request, env: Env) { + const container = env.STATELESS_API.getRandom(); + await container.startAndWaitForPorts(); + return container.fetch(request); + } +}; +``` + +**Use:** Stateless HTTP APIs, CPU-intensive work, read-only queries. 
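### Bounded Pool (Sketch)

`getRandom()` spreads load but doesn't bound how many instances exist. A hedged variation (not an official API) is to map a random draw onto a fixed pool of names and route with `getByName()`:

```typescript
// Map a random draw onto a bounded pool of instance names
// ("pool-0" … "pool-{N-1}"), so at most poolSize containers ever run.
export function poolInstanceName(poolSize: number, draw: number = Math.random()): string {
  if (poolSize < 1) throw new Error("poolSize must be >= 1");
  const index = Math.min(Math.floor(draw * poolSize), poolSize - 1);
  return `pool-${index}`;
}

// Worker sketch:
//   const container = env.STATELESS_API.getByName(poolInstanceName(8));
//   await container.startAndWaitForPorts();
//   return container.fetch(request);
```

**Use:** Stateless services where you want random spread but a hard cap on instance count.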
+ +### Singleton Pattern + +```typescript +export default { + async fetch(request: Request, env: Env) { + const container = env.GLOBAL_SERVICE.getByName("singleton"); + await container.startAndWaitForPorts(); + return container.fetch(request); + } +}; +``` + +**Use:** Global cache, centralized coordinator, single source of truth. + +## WebSocket Forwarding + +```typescript +export default { + async fetch(request: Request, env: Env) { + if (request.headers.get("Upgrade") === "websocket") { + const sessionId = request.headers.get("X-Session-ID") || crypto.randomUUID(); + const container = env.WS_BACKEND.getByName(sessionId); + await container.startAndWaitForPorts(); + + // ⚠️ MUST use fetch(), not containerFetch() + return container.fetch(request); + } + return new Response("Not a WebSocket request", { status: 400 }); + } +}; +``` + +**⚠️ Critical:** Always use `fetch()` for WebSocket. + +## Graceful Shutdown + +```typescript +export class GracefulContainer extends Container { + private connections = new Set<WebSocket>(); + + onStop() { + // SIGTERM received, 15 minutes until SIGKILL + for (const ws of this.connections) { + ws.close(1001, "Server shutting down"); + } + this.ctx.storage.put("shutdown-time", Date.now()); + } + + onActivityExpired(): boolean { + return this.connections.size > 0; // Keep alive if connections + } +} +``` + +## Concurrent Request Handling + +```typescript +export class SafeContainer extends Container { + private initialized = false; + + async fetch(request: Request) { + await this.ctx.blockConcurrencyWhile(async () => { + if (!this.initialized) { + await this.startAndWaitForPorts(); + this.initialized = true; + } + }); + return super.fetch(request); + } +} +``` + +**Use:** One-time initialization, preventing concurrent startup. 
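## Job-per-Container Dispatch (Sketch)

The README's routing tree also lists a job-per-container pattern: `getByName(jobId)` plus explicit lifecycle management. A hedged sketch of the dispatch side — the in-memory `Set` stands in for a durable store (KV or Durable Object storage) in a real deployment, and the `/run` endpoint is hypothetical:

```typescript
// Hand each job ID to exactly one container instance. claim() returns the
// instance name to route to, or null if the job was already dispatched
// (guards against duplicate work when requests are retried).
export function makeJobDispatcher(dispatched: Set<string> = new Set()) {
  return {
    claim(jobId: string): string | null {
      if (dispatched.has(jobId)) return null;
      dispatched.add(jobId);
      return `job-${jobId}`;
    },
  };
}

// Worker sketch:
//   const name = dispatcher.claim(jobId);
//   if (name !== null) {
//     const container = env.PROCESSOR.getByName(name);
//     await container.startAndWaitForPorts();
//     ctx.waitUntil(container.fetch("http://container/run", { method: "POST" }));
//   }
```

**Use:** One container per job, with deduplication so retries don't start the same job twice.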
+ +## Activity Timeout Renewal + +```typescript +export class LongRunningContainer extends Container { + sleepAfter = "5m"; + + async processLongJob(data: unknown) { + const interval = setInterval(() => { + this.ctx.storage.put("keepalive", Date.now()); + }, 60000); + + try { + await this.doLongWork(data); + } finally { + clearInterval(interval); + } + } +} +``` + +**Use:** Long operations exceeding `sleepAfter`. + +## Multiple Port Routing + +```typescript +export class MultiPortContainer extends Container { + requiredPorts = [8080, 8081, 9090]; + + async fetch(request: Request) { + const path = new URL(request.url).pathname; + if (path.startsWith("/grpc")) this.switchPort(8081); + else if (path.startsWith("/metrics")) this.switchPort(9090); + return super.fetch(request); + } +} +``` + +**Use:** Multi-protocol services (HTTP + gRPC), separate metrics endpoints. + +## Workflow Integration + +```typescript +import { WorkflowEntrypoint } from "cloudflare:workers"; + +export class ProcessingWorkflow extends WorkflowEntrypoint { + async run(event, step) { + const container = this.env.PROCESSOR.getByName(event.payload.jobId); + + await step.do("start", async () => { + await container.startAndWaitForPorts(); + }); + + const result = await step.do("process", async () => { + return container.fetch("/process", { + method: "POST", + body: JSON.stringify(event.payload.data) + }).then(r => r.json()); + }); + + return result; + } +} +``` + +**Use:** Orchestrating multi-step container operations, durable execution. + +## Queue Consumer Integration + +```typescript +export default { + async queue(batch, env) { + for (const msg of batch.messages) { + try { + const container = env.PROCESSOR.getByName(msg.body.jobId); + await container.startAndWaitForPorts(); + + const response = await container.fetch("/process", { + method: "POST", + body: JSON.stringify(msg.body) + }); + + response.ok ? 
msg.ack() : msg.retry(); + } catch (err) { + console.error("Queue processing error:", err); + msg.retry(); + } + } + } +}; +``` + +**Use:** Asynchronous job processing, batch operations, event-driven execution. diff --git a/cloudflare/references/cron-triggers/README.md b/cloudflare/references/cron-triggers/README.md new file mode 100644 index 0000000..67c00f8 --- /dev/null +++ b/cloudflare/references/cron-triggers/README.md @@ -0,0 +1,99 @@ +# Cloudflare Cron Triggers + +Schedule Workers execution using cron expressions. Runs on Cloudflare's global network during underutilized periods. + +## Key Features + +- **UTC-only execution** - All schedules run on UTC time +- **5-field cron syntax** - Quartz scheduler extensions (L, W, #) +- **Global propagation** - 15min deployment delay +- **At-least-once delivery** - Rare duplicate executions possible +- **Workflow integration** - Trigger long-running multi-step tasks +- **Green Compute** - Optional carbon-aware scheduling during low-carbon periods + +## Cron Syntax + +``` + ┌─────────── minute (0-59) + │ ┌───────── hour (0-23) + │ │ ┌─────── day of month (1-31) + │ │ │ ┌───── month (1-12, JAN-DEC) + │ │ │ │ ┌─── day of week (1-7, SUN-SAT, 1=Sunday) + * * * * * +``` + +**Special chars:** `*` (any), `,` (list), `-` (range), `/` (step), `L` (last), `W` (weekday), `#` (nth) + +## Common Schedules + +```bash +*/5 * * * * # Every 5 minutes +0 * * * * # Hourly +0 2 * * * # Daily 2am UTC (off-peak) +0 9 * * MON-FRI # Weekdays 9am UTC +0 0 1 * * # Monthly 1st midnight UTC +0 9 L * * # Last day of month 9am UTC +0 10 * * MON#2 # 2nd Monday 10am UTC +*/10 9-17 * * MON-FRI # Every 10min, 9am-5pm weekdays +``` + +## Quick Start + +**wrangler.jsonc:** +```jsonc +{ + "name": "my-cron-worker", + "triggers": { + "crons": ["*/5 * * * *", "0 2 * * *"] + } +} +``` + +**Handler:** +```typescript +export default { + async scheduled( + controller: ScheduledController, + env: Env, + ctx: ExecutionContext, + ): Promise { + console.log("Cron:", 
controller.cron); + console.log("Time:", new Date(controller.scheduledTime)); + + ctx.waitUntil(asyncTask(env)); // Non-blocking + }, +}; +``` + +**Test locally:** +```bash +npx wrangler dev +curl "http://localhost:8787/__scheduled?cron=*/5+*+*+*+*" +``` + +## Limits + +- **Free:** 3 triggers/worker, 10ms CPU +- **Paid:** Unlimited triggers, 50ms CPU +- **Propagation:** 15min global deployment +- **Timezone:** UTC only + +## Reading Order + +**New to cron triggers?** Start here: +1. This README - Overview and quick start +2. [configuration.md](./configuration.md) - Set up your first cron trigger +3. [api.md](./api.md) - Understand the handler API +4. [patterns.md](./patterns.md) - Common use cases and examples + +**Troubleshooting?** Jump to [gotchas.md](./gotchas.md) + +## In This Reference +- [configuration.md](./configuration.md) - wrangler config, env-specific schedules, Green Compute +- [api.md](./api.md) - ScheduledController, noRetry(), waitUntil, testing patterns +- [patterns.md](./patterns.md) - Use cases, monitoring, queue integration, Durable Objects +- [gotchas.md](./gotchas.md) - Timezone issues, idempotency, security, testing + +## See Also +- [workflows](../workflows/) - Alternative for long-running scheduled tasks +- [workers](../workers/) - Worker runtime documentation diff --git a/cloudflare/references/cron-triggers/api.md b/cloudflare/references/cron-triggers/api.md new file mode 100644 index 0000000..b0242d7 --- /dev/null +++ b/cloudflare/references/cron-triggers/api.md @@ -0,0 +1,196 @@ +# Cron Triggers API + +## Basic Handler + +```typescript +export default { + async scheduled(controller: ScheduledController, env: Env, ctx: ExecutionContext): Promise { + console.log("Cron executed:", new Date(controller.scheduledTime)); + }, +}; +``` + +**JavaScript:** Same signature without types +**Python:** `class Default(WorkerEntrypoint): async def scheduled(self, controller, env, ctx)` + +## ScheduledController + +```typescript +interface 
ScheduledController { + scheduledTime: number; // Unix ms when scheduled to run + cron: string; // Expression that triggered (e.g., "*/5 * * * *") + type: string; // Always "scheduled" + noRetry(): void; // Prevent automatic retry on failure +} +``` + +**Prevent retry on failure:** +```typescript +export default { + async scheduled(controller, env, ctx) { + try { + await riskyOperation(env); + } catch (error) { + // Don't retry - failure is expected/acceptable + controller.noRetry(); + console.error("Operation failed, not retrying:", error); + } + }, +}; +``` + +**When to use noRetry():** +- External API failures outside your control (avoid hammering failed services) +- Rate limit errors (retry would fail again immediately) +- Duplicate execution detected (idempotency check failed) +- Non-critical operations where skip is acceptable (analytics, caching) +- Validation errors that won't resolve on retry + +## Handler Parameters + +**`controller: ScheduledController`** +- Access cron expression and scheduled time + +**`env: Env`** +- All bindings: KV, R2, D1, secrets, service bindings + +**`ctx: ExecutionContext`** +- `ctx.waitUntil(promise)` - Extend execution for async tasks (logging, cleanup, external APIs) +- First `waitUntil` failure recorded in Cron Events + +## Multiple Schedules + +```typescript +export default { + async scheduled(controller, env, ctx) { + switch (controller.cron) { + case "*/3 * * * *": ctx.waitUntil(updateRecentData(env)); break; + case "0 * * * *": ctx.waitUntil(processHourlyAggregation(env)); break; + case "0 2 * * *": ctx.waitUntil(performDailyMaintenance(env)); break; + default: console.warn(`Unhandled: ${controller.cron}`); + } + }, +}; +``` + +## ctx.waitUntil Usage + +```typescript +export default { + async scheduled(controller, env, ctx) { + const data = await fetchCriticalData(); // Critical path + + // Non-blocking background tasks + ctx.waitUntil(Promise.all([ + logToAnalytics(data), + cleanupOldRecords(env.DB), + 
notifyWebhook(env.WEBHOOK_URL, data), + ])); + }, +}; +``` + +## Workflow Integration + +```typescript +import { WorkflowEntrypoint } from "cloudflare:workers"; + +export class DataProcessingWorkflow extends WorkflowEntrypoint { + async run(event, step) { + const data = await step.do("fetch-data", () => fetchLargeDataset()); + const processed = await step.do("process-data", () => processDataset(data)); + await step.do("store-results", () => storeResults(processed)); + } +} + +export default { + async scheduled(controller, env, ctx) { + const instance = await env.MY_WORKFLOW.create({ + params: { scheduledTime: controller.scheduledTime, cron: controller.cron }, + }); + console.log(`Started workflow: ${instance.id}`); + }, +}; +``` + +## Testing Handler + +**Local development (/__scheduled endpoint):** +```bash +# Start dev server +npx wrangler dev + +# Trigger any cron +curl "http://localhost:8787/__scheduled?cron=*/5+*+*+*+*" + +# Trigger specific cron with custom time +curl "http://localhost:8787/__scheduled?cron=0+2+*+*+*&scheduledTime=1704067200000" +``` + +**Query parameters:** +- `cron` - Required. URL-encoded cron expression (use `+` for spaces) +- `scheduledTime` - Optional. Unix timestamp in milliseconds (defaults to current time) + +**Production security:** The `/__scheduled` endpoint is available in production and can be triggered by anyone. 
Block it or implement authentication - see [gotchas.md](./gotchas.md#security-concerns) + +**Unit testing (Vitest):** +```typescript +// test/scheduled.test.ts +import { describe, it, expect } from "vitest"; +import { env } from "cloudflare:test"; +import worker from "../src/index"; + +describe("Scheduled Handler", () => { + it("processes scheduled event", async () => { + const controller = { scheduledTime: Date.now(), cron: "*/5 * * * *", type: "scheduled" as const, noRetry: () => {} }; + const ctx = { waitUntil: (p: Promise<unknown>) => p, passThroughOnException: () => {} }; + await worker.scheduled(controller, env, ctx); + expect(await env.MY_KV.get("last_run")).toBeDefined(); + }); + + it("handles multiple crons", async () => { + const ctx = { waitUntil: () => {}, passThroughOnException: () => {} }; + await worker.scheduled({ scheduledTime: Date.now(), cron: "*/5 * * * *", type: "scheduled", noRetry: () => {} }, env, ctx); + expect(await env.MY_KV.get("last_type")).toBe("frequent"); + }); +}); +``` + +## Error Handling + +**Automatic retries:** +- Failed cron executions are retried automatically unless `noRetry()` is called +- Retry happens after a delay (typically minutes) +- Only first `waitUntil()` failure is recorded in Cron Events + +**Best practices:** +```typescript +export default { + async scheduled(controller, env, ctx) { + try { + await criticalOperation(env); + } catch (error) { + // Log error details + console.error("Cron failed:", { + cron: controller.cron, + scheduledTime: controller.scheduledTime, + error: error.message, + stack: error.stack, + }); + + // Decide: retry or skip + if (error.message.includes("rate limit")) { + controller.noRetry(); // Skip retry for rate limits + } + // Otherwise allow automatic retry + throw error; + } + }, +}; +``` + +## See Also + +- [README.md](./README.md) - Overview +- [patterns.md](./patterns.md) - Use cases, examples +- [gotchas.md](./gotchas.md) - Common errors, testing issues diff --git 
a/cloudflare/references/cron-triggers/configuration.md b/cloudflare/references/cron-triggers/configuration.md new file mode 100644 index 0000000..b584369 --- /dev/null +++ b/cloudflare/references/cron-triggers/configuration.md @@ -0,0 +1,180 @@ +# Cron Triggers Configuration + +## wrangler.jsonc + +```jsonc +{ + "$schema": "./node_modules/wrangler/config-schema.json", + "name": "my-cron-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date for new projects + + "triggers": { + "crons": [ + "*/5 * * * *", // Every 5 minutes + "0 */2 * * *", // Every 2 hours + "0 9 * * MON-FRI", // Weekdays at 9am UTC + "0 2 1 * *" // Monthly on 1st at 2am UTC + ] + } +} +``` + +## Green Compute (Beta) + +Schedule crons during low-carbon periods for carbon-aware execution: + +```jsonc +{ + "name": "eco-cron-worker", + "triggers": { + "crons": ["0 2 * * *"] + }, + "placement": { + "mode": "smart" // Runs during low-carbon periods + } +} +``` + +**Modes:** +- `"smart"` - Carbon-aware scheduling (may delay up to 24h for optimal window) +- Default (no placement config) - Standard scheduling (no delay) + +**How it works:** +- Cloudflare delays execution until grid carbon intensity is lower +- Maximum delay: 24 hours from scheduled time +- Ideal for batch jobs with flexible timing requirements + +**Use cases:** +- Nightly data processing and ETL pipelines +- Weekly/monthly report generation +- Database backups and maintenance +- Analytics aggregation +- ML model training + +**Not suitable for:** +- Time-sensitive operations (SLA requirements) +- User-facing features requiring immediate execution +- Real-time monitoring and alerting +- Compliance tasks with strict time windows + +## Environment-Specific Schedules + +```jsonc +{ + "name": "my-cron-worker", + "triggers": { + "crons": ["0 */6 * * *"] // Prod: every 6 hours + }, + "env": { + "staging": { + "triggers": { + "crons": ["*/15 * * * *"] // Staging: every 15min + } + }, + "dev": { + "triggers": { 
+ "crons": ["*/5 * * * *"] // Dev: every 5min + } + } + } +} +``` + +## Schedule Format + +**Structure:** `minute hour day-of-month month day-of-week` + +**Special chars:** `*` (any), `,` (list), `-` (range), `/` (step), `L` (last), `W` (weekday), `#` (nth) + +## Managing Triggers + +**Remove all:** `"triggers": { "crons": [] }` +**Preserve existing:** Omit `"triggers"` field entirely + +## Deployment + +```bash +# Deploy with config crons +npx wrangler deploy + +# Deploy specific environment +npx wrangler deploy --env production + +# View deployments +npx wrangler deployments list +``` + +**⚠️ Changes take up to 15 minutes to propagate globally** + +## API Management + +**Get triggers:** +```bash +curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/workers/scripts/{script_name}/schedules" \ + -H "Authorization: Bearer {api_token}" +``` + +**Update triggers:** +```bash +curl -X PUT "https://api.cloudflare.com/client/v4/accounts/{account_id}/workers/scripts/{script_name}/schedules" \ + -H "Authorization: Bearer {api_token}" \ + -H "Content-Type: application/json" \ + -d '{"crons": ["*/5 * * * *", "0 2 * * *"]}' +``` + +**Delete all:** +```bash +curl -X PUT "https://api.cloudflare.com/client/v4/accounts/{account_id}/workers/scripts/{script_name}/schedules" \ + -H "Authorization: Bearer {api_token}" \ + -H "Content-Type: application/json" \ + -d '{"crons": []}' +``` + +## Combining Multiple Workers + +For complex schedules, use multiple workers: + +```jsonc +// worker-frequent.jsonc +{ + "name": "data-sync-frequent", + "triggers": { "crons": ["*/5 * * * *"] } +} + +// worker-daily.jsonc +{ + "name": "reports-daily", + "triggers": { "crons": ["0 2 * * *"] }, + "placement": { "mode": "smart" } +} + +// worker-weekly.jsonc +{ + "name": "cleanup-weekly", + "triggers": { "crons": ["0 3 * * SUN"] } +} +``` + +**Benefits:** +- Separate CPU limits per worker +- Independent error isolation +- Different Green Compute policies +- Easier to maintain and debug + +## 
Validation + +**Test cron syntax:** +- [crontab.guru](https://crontab.guru/) - Interactive validator +- Wrangler validates on deploy but won't catch logic errors + +**Common mistakes:** +- `0 0 * * *` runs daily at midnight UTC, not your local timezone +- `*/60 * * * *` is invalid (use `0 * * * *` for hourly) +- `0 2 31 * *` only runs on months with 31 days + +## See Also + +- [README.md](./README.md) - Overview, quick start +- [api.md](./api.md) - Handler implementation +- [patterns.md](./patterns.md) - Multi-cron routing examples diff --git a/cloudflare/references/cron-triggers/gotchas.md b/cloudflare/references/cron-triggers/gotchas.md new file mode 100644 index 0000000..5906c3a --- /dev/null +++ b/cloudflare/references/cron-triggers/gotchas.md @@ -0,0 +1,199 @@ +# Cron Triggers Gotchas + +## Common Errors + +### "Timezone Issues" + +**Problem:** Cron runs at wrong time relative to local timezone +**Cause:** All crons execute in UTC, no local timezone support +**Solution:** Convert local time to UTC manually + +**Conversion formula:** `utcHour = (localHour - utcOffset + 24) % 24` + +**Examples:** +- 9am PST (UTC-8) → `(9 - (-8) + 24) % 24 = 17` → `0 17 * * *` +- 2am EST (UTC-5) → `(2 - (-5) + 24) % 24 = 7` → `0 7 * * *` +- 6pm JST (UTC+9) → `(18 - 9 + 24) % 24 = 33 % 24 = 9` → `0 9 * * *` + +**Daylight Saving Time:** Adjust manually when DST changes, or schedule at times unaffected by DST (e.g., 2am-4am local time usually safe) + +### "Cron Not Executing" + +**Cause:** Missing `scheduled()` export, invalid syntax, propagation delay (<15min), or plan limits +**Solution:** Verify export exists, validate at crontab.guru, wait 15+ min after deploy, check plan limits + +### "Duplicate Executions" + +**Cause:** At-least-once delivery +**Solution:** Track execution IDs in KV - see idempotency pattern below + +### "Execution Failures" + +**Cause:** CPU exceeded, unhandled exceptions, network timeouts, binding errors +**Solution:** Use try-catch, AbortController 
timeouts, `ctx.waitUntil()` for long ops, or Workflows for heavy tasks + +### "Local Testing Not Working" + +**Problem:** `/__scheduled` endpoint returns 404 or doesn't trigger handler +**Cause:** Missing `scheduled()` export, wrangler not running, or incorrect endpoint format +**Solution:** + +1. Verify `scheduled()` is exported: +```typescript +export default { + async scheduled(controller, env, ctx) { + console.log("Cron triggered"); + }, +}; +``` + +2. Start dev server: +```bash +npx wrangler dev +``` + +3. Use correct endpoint format (URL-encode spaces as `+`): +```bash +# Correct +curl "http://localhost:8787/__scheduled?cron=*/5+*+*+*+*" + +# Wrong (will fail) +curl "http://localhost:8787/__scheduled?cron=*/5 * * * *" +``` + +4. Update Wrangler if outdated: +```bash +npm install -g wrangler@latest +``` + +### "waitUntil() Tasks Not Completing" + +**Problem:** Background tasks in `ctx.waitUntil()` fail silently or don't execute +**Cause:** Promises rejected without error handling, or handler returns before promise settles +**Solution:** Always await or handle errors in waitUntil promises: + +```typescript +export default { + async scheduled(controller, env, ctx) { + // BAD: Silent failures + ctx.waitUntil(riskyOperation()); + + // GOOD: Explicit error handling + ctx.waitUntil( + riskyOperation().catch(err => { + console.error("Background task failed:", err); + return logError(err, env); + }) + ); + }, +}; +``` + +### "Idempotency Issues" + +**Problem:** At-least-once delivery causes duplicate side effects (double charges, duplicate emails) +**Cause:** No deduplication mechanism +**Solution:** Use KV to track execution IDs: + +```typescript +export default { + async scheduled(controller, env, ctx) { + const executionId = `${controller.cron}-${controller.scheduledTime}`; + const existing = await env.EXECUTIONS.get(executionId); + + if (existing) { + console.log("Already executed, skipping"); + controller.noRetry(); + return; + } + + await 
env.EXECUTIONS.put(executionId, "1", { expirationTtl: 86400 }); // 24h TTL + await performIdempotentOperation(env); + }, +}; +``` + +### "Security Concerns" + +**Problem:** `__scheduled` endpoint exposed in production allows unauthorized cron triggering +**Cause:** Testing endpoint available in deployed Workers +**Solution:** Block `__scheduled` in production: + +```typescript +export default { + async fetch(request, env, ctx) { + const url = new URL(request.url); + + // Block __scheduled in production + if (url.pathname === "/__scheduled" && env.ENVIRONMENT === "production") { + return new Response("Not Found", { status: 404 }); + } + + return handleRequest(request, env, ctx); + }, + + async scheduled(controller, env, ctx) { + // Your cron logic + }, +}; +``` + +**Also:** Use `env.API_KEY` for secrets (never hardcode) + +**Alternative:** Add middleware to verify request origin: +```typescript +export default { + async fetch(request, env, ctx) { + const url = new URL(request.url); + + if (url.pathname === "/__scheduled") { + // Check Cloudflare headers to verify internal request + const cfRay = request.headers.get("cf-ray"); + if (!cfRay && env.ENVIRONMENT === "production") { + return new Response("Not Found", { status: 404 }); + } + } + + return handleRequest(request, env, ctx); + }, + + async scheduled(controller, env, ctx) { + // Your cron logic + }, +}; +``` + +## Limits & Quotas + +| Limit | Free | Paid | Notes | +|-------|------|------|-------| +| Triggers per Worker | 3 | Unlimited | Maximum cron schedules per Worker | +| CPU time | 10ms | 50ms | May need `ctx.waitUntil()` or Workflows | +| Execution guarantee | At-least-once | At-least-once | Duplicates possible - use idempotency | +| Propagation delay | Up to 15 minutes | Up to 15 minutes | Time for changes to take effect globally | +| Min interval | 1 minute | 1 minute | Cannot schedule more frequently | +| Cron accuracy | ±1 minute | ±1 minute | Execution may drift slightly | + +## Testing Best Practices 
+ +**Unit tests:** +- Mock `ScheduledController`, `ExecutionContext`, and bindings +- Test each cron expression separately +- Verify `noRetry()` is called when expected +- Use Vitest with `@cloudflare/vitest-pool-workers` for realistic env + +**Integration tests:** +- Test via `/__scheduled` endpoint in dev environment +- Verify idempotency logic with duplicate `scheduledTime` values +- Test error handling and retry behavior + +**Production:** Start with long intervals (`*/30 * * * *`), monitor Cron Events for 24h, set up alerts before reducing interval + +## Resources + +- [Cron Triggers Docs](https://developers.cloudflare.com/workers/configuration/cron-triggers/) +- [Scheduled Handler API](https://developers.cloudflare.com/workers/runtime-apis/handlers/scheduled/) +- [Cloudflare Workflows](https://developers.cloudflare.com/workflows/) +- [Workers Limits](https://developers.cloudflare.com/workers/platform/limits/) +- [Crontab Guru](https://crontab.guru/) - Validator +- [Vitest Pool Workers](https://github.com/cloudflare/workers-sdk/tree/main/fixtures/vitest-pool-workers-examples) diff --git a/cloudflare/references/cron-triggers/patterns.md b/cloudflare/references/cron-triggers/patterns.md new file mode 100644 index 0000000..a1f1823 --- /dev/null +++ b/cloudflare/references/cron-triggers/patterns.md @@ -0,0 +1,190 @@ +# Cron Triggers Patterns + +## API Data Sync + +```typescript +export default { + async scheduled(controller, env, ctx) { + const response = await fetch("https://api.example.com/data", {headers: { "Authorization": `Bearer ${env.API_KEY}` }}); + if (!response.ok) throw new Error(`API error: ${response.status}`); + ctx.waitUntil(env.MY_KV.put("cached_data", JSON.stringify(await response.json()), {expirationTtl: 3600})); + }, +}; +``` + +## Database Cleanup + +```typescript +export default { + async scheduled(controller, env, ctx) { + const result = await env.DB.prepare(`DELETE FROM sessions WHERE expires_at < datetime('now')`).run(); + 
console.log(`Deleted ${result.meta.changes} expired sessions`); + ctx.waitUntil(env.DB.prepare("VACUUM").run()); + }, +}; +``` + +## Report Generation + +```typescript +export default { + async scheduled(controller, env, ctx) { + const startOfWeek = new Date(); startOfWeek.setDate(startOfWeek.getDate() - 7); + const { results } = await env.DB.prepare(`SELECT date, revenue, orders FROM daily_stats WHERE date >= ? ORDER BY date`).bind(startOfWeek.toISOString()).all(); + const report = {period: "weekly", totalRevenue: results.reduce((sum, d) => sum + d.revenue, 0), totalOrders: results.reduce((sum, d) => sum + d.orders, 0), dailyBreakdown: results}; + const reportKey = `reports/weekly-${Date.now()}.json`; + await env.REPORTS_BUCKET.put(reportKey, JSON.stringify(report)); + ctx.waitUntil(env.SEND_EMAIL.fetch("https://example.com/send", {method: "POST", body: JSON.stringify({to: "team@example.com", subject: "Weekly Report", reportUrl: `https://reports.example.com/${reportKey}`})})); + }, +}; +``` + +## Health Checks + +```typescript +export default { + async scheduled(controller, env, ctx) { + const services = [{name: "API", url: "https://api.example.com/health"}, {name: "CDN", url: "https://cdn.example.com/health"}]; + const checks = await Promise.all(services.map(async (service) => { + const start = Date.now(); + try { + const response = await fetch(service.url, { signal: AbortSignal.timeout(5000) }); + return {name: service.name, status: response.ok ? 
"up" : "down", responseTime: Date.now() - start}; + } catch (error) { + return {name: service.name, status: "down", responseTime: Date.now() - start, error: error.message}; + } + })); + ctx.waitUntil(env.STATUS_KV.put("health_status", JSON.stringify(checks))); + const failures = checks.filter(c => c.status === "down"); + if (failures.length > 0) ctx.waitUntil(fetch(env.ALERT_WEBHOOK, {method: "POST", body: JSON.stringify({text: `${failures.length} service(s) down: ${failures.map(f => f.name).join(", ")}`})})); + }, +}; +``` + +## Batch Processing (Rate-Limited) + +```typescript +export default { + async scheduled(controller, env, ctx) { + const queueData = await env.QUEUE_KV.get("pending_items", "json"); + if (!queueData || queueData.length === 0) return; + const batch = queueData.slice(0, 100); + const results = await Promise.allSettled(batch.map(item => fetch("https://api.example.com/process", {method: "POST", headers: {"Authorization": `Bearer ${env.API_KEY}`, "Content-Type": "application/json"}, body: JSON.stringify(item)}))); + console.log(`Processed ${results.filter(r => r.status === "fulfilled").length}/${batch.length} items`); + ctx.waitUntil(env.QUEUE_KV.put("pending_items", JSON.stringify(queueData.slice(100)))); + }, +}; +``` + +## Queue Integration + +```typescript +export default { + async scheduled(controller, env, ctx) { + const batch = await env.MY_QUEUE.receive({ batchSize: 100 }); + const results = await Promise.allSettled(batch.messages.map(async (msg) => { + await processMessage(msg.body, env); + await msg.ack(); + })); + console.log(`Processed ${results.filter(r => r.status === "fulfilled").length}/${batch.messages.length}`); + }, +}; +``` + +## Monitoring & Observability + +```typescript +export default { + async scheduled(controller, env, ctx) { + const startTime = Date.now(); + const meta = { cron: controller.cron, scheduledTime: controller.scheduledTime }; + console.log("[START]", meta); + try { + const result = await performTask(env); + 
console.log("[SUCCESS]", { ...meta, duration: Date.now() - startTime, count: result.count }); + ctx.waitUntil(env.METRICS.put(`cron:${controller.scheduledTime}`, JSON.stringify({ ...meta, status: "success" }), { expirationTtl: 2592000 })); + } catch (error) { + console.error("[ERROR]", { ...meta, duration: Date.now() - startTime, error: error.message }); + ctx.waitUntil(fetch(env.ALERT_WEBHOOK, { method: "POST", body: JSON.stringify({ text: `Cron failed: ${controller.cron}`, error: error.message }) })); + throw error; + } + }, +}; +``` + +**View logs:** `npx wrangler tail` or Dashboard → Workers & Pages → Worker → Logs + +## Durable Objects Coordination + +```typescript +export default { + async scheduled(controller, env, ctx) { + const stub = env.COORDINATOR.get(env.COORDINATOR.idFromName("cron-lock")); + const acquired = await stub.tryAcquireLock(controller.scheduledTime); + if (!acquired) { + controller.noRetry(); + return; + } + try { + await performTask(env); + } finally { + await stub.releaseLock(); + } + }, +}; +``` + +## Python Handler + +```python +from workers import WorkerEntrypoint + +class Default(WorkerEntrypoint): + async def scheduled(self, controller, env, ctx): + data = await env.MY_KV.get("key") + ctx.waitUntil(env.DB.execute("DELETE FROM logs WHERE created_at < datetime('now', '-7 days')")) +``` + +## Testing Patterns + +**Local testing with /__scheduled:** +```bash +# Start dev server +npx wrangler dev + +# Test specific cron +curl "http://localhost:8787/__scheduled?cron=*/5+*+*+*+*" + +# Test with specific time +curl "http://localhost:8787/__scheduled?cron=0+2+*+*+*&scheduledTime=1704067200000" +``` + +**Unit tests:** +```typescript +// test/scheduled.test.ts +import { describe, it, expect, vi } from "vitest"; +import { env } from "cloudflare:test"; +import worker from "../src/index"; + +describe("Scheduled Handler", () => { + it("executes cron", async () => { + const controller = { scheduledTime: Date.now(), cron: "*/5 * * * *", type: 
"scheduled" as const, noRetry: vi.fn() }; + const ctx = { waitUntil: vi.fn(), passThroughOnException: vi.fn() }; + await worker.scheduled(controller, env, ctx); + expect(await env.MY_KV.get("last_run")).toBeDefined(); + }); + + it("calls noRetry on duplicate", async () => { + const controller = { scheduledTime: 1704067200000, cron: "0 2 * * *", type: "scheduled" as const, noRetry: vi.fn() }; + await env.EXECUTIONS.put("0 2 * * *-1704067200000", "1"); + await worker.scheduled(controller, env, { waitUntil: vi.fn(), passThroughOnException: vi.fn() }); + expect(controller.noRetry).toHaveBeenCalled(); + }); +}); +``` + +## See Also + +- [README.md](./README.md) - Overview +- [api.md](./api.md) - Handler implementation +- [gotchas.md](./gotchas.md) - Troubleshooting diff --git a/cloudflare/references/d1/README.md b/cloudflare/references/d1/README.md new file mode 100644 index 0000000..e40d44c --- /dev/null +++ b/cloudflare/references/d1/README.md @@ -0,0 +1,133 @@ +# Cloudflare D1 Database + +Expert guidance for Cloudflare D1, a serverless SQLite database designed for horizontal scale-out across multiple databases. + +## Overview + +D1 is Cloudflare's managed, serverless database with: +- SQLite SQL semantics and compatibility +- Built-in disaster recovery via Time Travel (30-day point-in-time recovery) +- Horizontal scale-out architecture (10 GB per database) +- Worker and HTTP API access +- Pricing based on query and storage costs only + +**Architecture Philosophy**: D1 is optimized for per-user, per-tenant, or per-entity database patterns rather than single large databases. 
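The per-tenant scale-out philosophy amounts to a thin routing step from a tenant identifier to that tenant's own database. A minimal, hypothetical sketch — the `D1Like` stand-in type and the `TENANT_*` binding names are invented for illustration and are not part of the D1 API:

```typescript
// Stand-in for D1Database so the sketch runs outside a Worker.
type D1Like = { name: string };

// Pick the database bound for a given tenant; fail loudly if none exists.
// The TENANT_<ID> binding-name convention is hypothetical.
function dbForTenant(tenant: string, bindings: Record<string, D1Like>): D1Like {
  const db = bindings[`TENANT_${tenant.toUpperCase()}`];
  if (!db) throw new Error(`no database bound for tenant: ${tenant}`);
  return db;
}

const bindings: Record<string, D1Like> = {
  TENANT_ACME: { name: "acme-db" },
  TENANT_GLOBEX: { name: "globex-db" },
};

console.log(dbForTenant("acme", bindings).name); // acme-db
```

In a real Worker the map values would be `D1Database` bindings declared in wrangler.jsonc; the routing logic stays the same, and each tenant's data lives comfortably under the 10 GB per-database limit.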
+ +## Quick Start + +```bash +# Create database +wrangler d1 create <database-name> + +# Execute migration +wrangler d1 migrations apply <database-name> --remote + +# Local development +wrangler dev +``` + +## Core Query Methods + +```typescript +// .all() - Returns all rows; .first() - First row or null; .first(col) - Single column value +// .run() - INSERT/UPDATE/DELETE; .raw() - Array of arrays (efficient) +const { results, success, meta } = await env.DB.prepare('SELECT * FROM users WHERE active = ?').bind(true).all(); +const user = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first(); +``` + +## Batch Operations + +```typescript +// Multiple queries in single round trip (atomic transaction) +const results = await env.DB.batch([ + env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(1), + env.DB.prepare('SELECT * FROM posts WHERE author_id = ?').bind(1), + env.DB.prepare('UPDATE users SET last_access = ? WHERE id = ?').bind(Date.now(), 1) +]); +``` + +## Sessions API (Paid Plans) + +```typescript +// Create long-running session for analytics/migrations (up to 15 minutes) +const session = env.DB.withSession(); +try { + await session.prepare('CREATE INDEX idx_heavy ON large_table(column)').run(); + await session.prepare('ANALYZE').run(); +} finally { + session.close(); // Always close to release resources +} +``` + +## Read Replication (Paid Plans) + +```typescript +// Read from nearest replica for lower latency (automatic failover) +const user = await env.DB_REPLICA.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first(); + +// Writes always go to primary +await env.DB.prepare('UPDATE users SET last_login = ? 
WHERE id = ?').bind(Date.now(), userId).run(); +``` + +## Platform Limits + +| Limit | Free Tier | Paid Plans | +|-------|-----------|------------| +| Database size | 500 MB | 10 GB per database | +| Row size | 1 MB max | 1 MB max | +| Query timeout | 30 seconds | 30 seconds | +| Batch size | 1,000 statements | 10,000 statements | +| Time Travel retention | 7 days | 30 days | +| Read replicas | Not available | Yes (paid add-on) | + +**Pricing**: $5/month per database beyond free tier + $0.001 per 1K reads + $1 per 1M writes + $0.75/GB storage/month + +## CLI Commands + +```bash +# Database management +wrangler d1 create <database-name> +wrangler d1 list +wrangler d1 delete <database-name> + +# Migrations +wrangler d1 migrations create <database-name> <message> # Create new migration file +wrangler d1 migrations apply <database-name> --remote # Apply pending migrations +wrangler d1 migrations apply <database-name> --local # Apply locally +wrangler d1 migrations list <database-name> --remote # Show applied migrations + +# Direct SQL execution +wrangler d1 execute <database-name> --remote --command="SELECT * FROM users" +wrangler d1 execute <database-name> --local --file=./schema.sql + +# Backups & Import/Export +wrangler d1 export <database-name> --remote --output=./backup.sql # Full export with schema +wrangler d1 export <database-name> --remote --no-schema --output=./data.sql # Data only +wrangler d1 time-travel restore <database-name> --timestamp="2024-01-15T14:30:00Z" # Point-in-time recovery + +# Development +wrangler dev --persist-to=./.wrangler/state +``` + +## Reading Order + +**Start here**: Quick Start above → configuration.md (setup) → api.md (queries) + +**Common tasks**: +- First-time setup: configuration.md → Run migrations +- Adding queries: api.md → Prepared statements +- Pagination/caching: patterns.md +- Production optimization: Read Replication + Sessions API (this file) +- Debugging: gotchas.md + +## In This Reference + +- [configuration.md](./configuration.md) - wrangler.jsonc setup, migrations, TypeScript types, ORMs, local dev +- [api.md](./api.md) - Query methods (.all/.first/.run/.raw), batch, sessions, read replicas, error 
handling +- [patterns.md](./patterns.md) - Pagination, bulk operations, caching, multi-tenant, sessions, analytics +- [gotchas.md](./gotchas.md) - SQL injection, limits by plan tier, performance, common errors + +## See Also + +- [workers](../workers/) - Worker runtime and fetch handler patterns +- [hyperdrive](../hyperdrive/) - Connection pooling for external databases diff --git a/cloudflare/references/d1/api.md b/cloudflare/references/d1/api.md new file mode 100644 index 0000000..b3c26de --- /dev/null +++ b/cloudflare/references/d1/api.md @@ -0,0 +1,196 @@ +# D1 API Reference + +## Prepared Statements (Required for Security) + +```typescript +// ❌ NEVER: Direct string interpolation (SQL injection risk) +const result = await env.DB.prepare(`SELECT * FROM users WHERE id = ${userId}`).all(); + +// ✅ CORRECT: Prepared statements with bind() +const result = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).all(); + +// Multiple parameters +const result = await env.DB.prepare('SELECT * FROM users WHERE email = ? AND active = ?').bind(email, true).all(); +``` + +## Query Execution Methods + +```typescript +// .all() - Returns all rows +const { results, success, meta } = await env.DB.prepare('SELECT * FROM users WHERE active = ?').bind(true).all(); +// results: Array of row objects; success: boolean +// meta: { duration: number, rows_read: number, rows_written: number } + +// .first() - Returns first row or null +const user = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first(); + +// .first(columnName) - Returns single column value +const email = await env.DB.prepare('SELECT email FROM users WHERE id = ?').bind(userId).first('email'); +// Returns string | number | null + +// .run() - For INSERT/UPDATE/DELETE (no row data returned) +const result = await env.DB.prepare('UPDATE users SET last_login = ? 
WHERE id = ?').bind(Date.now(), userId).run(); +// result.meta: { duration, rows_read, rows_written, last_row_id, changes } + +// .raw() - Returns array of arrays (efficient for large datasets) +const rawResults = await env.DB.prepare('SELECT id, name FROM users').raw(); +// [[1, 'Alice'], [2, 'Bob']] +``` + +## Batch Operations + +```typescript +// Execute multiple queries in single round trip (atomic transaction) +const results = await env.DB.batch([ + env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(1), + env.DB.prepare('SELECT * FROM posts WHERE author_id = ?').bind(1), + env.DB.prepare('UPDATE users SET last_access = ? WHERE id = ?').bind(Date.now(), 1) +]); +// results is array: [result1, result2, result3] + +// Batch with same prepared statement, different params +const userIds = [1, 2, 3]; +const stmt = env.DB.prepare('SELECT * FROM users WHERE id = ?'); +const results = await env.DB.batch(userIds.map(id => stmt.bind(id))); +``` + +## Transactions (via batch) + +```typescript +// D1 executes batch() as atomic transaction - all succeed or all fail +const results = await env.DB.batch([ + env.DB.prepare('INSERT INTO accounts (id, balance) VALUES (?, ?)').bind(1, 100), + env.DB.prepare('INSERT INTO accounts (id, balance) VALUES (?, ?)').bind(2, 200), + env.DB.prepare('UPDATE accounts SET balance = balance - ? WHERE id = ?').bind(50, 1), + env.DB.prepare('UPDATE accounts SET balance = balance + ? WHERE id = ?').bind(50, 2) +]); +``` + +## Sessions API (Paid Plans) + +Long-running sessions for operations exceeding 30s timeout (up to 15 min). 
+ +```typescript +const session = env.DB.withSession({ timeout: 600 }); // 10 min (1-900s) +try { + await session.prepare('CREATE INDEX idx_large ON big_table(column)').run(); + await session.prepare('ANALYZE').run(); +} finally { + session.close(); // CRITICAL: always close to prevent leaks +} +``` + +**Use cases**: Migrations, ANALYZE, large index creation, bulk transformations + +## Read Replication (Paid Plans) + +Routes queries to the nearest replica for lower latency. Writes always go to the primary. + +```typescript +interface Env { + DB: D1Database; // Primary (writes) + DB_REPLICA: D1Database; // Replica (reads) +} + +// Reads: use replica +const user = await env.DB_REPLICA.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first(); + +// Writes: use primary +await env.DB.prepare('UPDATE users SET last_login = ? WHERE id = ?').bind(Date.now(), userId).run(); + +// Read-after-write: use primary for consistency (replication lag <100ms-2s) +await env.DB.prepare('INSERT INTO posts (title) VALUES (?)').bind(title).run(); +const post = await env.DB.prepare('SELECT * FROM posts WHERE title = ?').bind(title).first(); // Primary +``` + +## Error Handling + +```typescript +async function getUser(userId: number, env: Env): Promise<Response> { + try { + const result = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).all(); + if (!result.success) return new Response('Database error', { status: 500 }); + if (result.results.length === 0) return new Response('User not found', { status: 404 }); + return Response.json(result.results[0]); + } catch (error) { + return new Response('Internal error', { status: 500 }); + } +} + +// Constraint violations +try { + await env.DB.prepare('INSERT INTO users (email, name) VALUES (?, ?)').bind(email, name).run(); +} catch (error) { + if (error.message?.includes('UNIQUE constraint failed')) return new Response('Email exists', { status: 409 }); + throw error; +} +``` + +## REST API (HTTP) Access + +Access D1 from external 
services (non-Worker contexts) using the Cloudflare API. + +```typescript +// Single query +const response = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/d1/database/${DATABASE_ID}/query`, + { + method: 'POST', + headers: { + 'Authorization': `Bearer ${CLOUDFLARE_API_TOKEN}`, + 'Content-Type': 'application/json' + }, + body: JSON.stringify({ + sql: 'SELECT * FROM users WHERE id = ?', + params: [userId] + }) + } +); + +const { result, success, errors } = await response.json(); +// result: [{ results: [...], success: true, meta: {...} }] + +// Batch queries via HTTP +const batchResponse = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/d1/database/${DATABASE_ID}/query`, + { + method: 'POST', + headers: { + 'Authorization': `Bearer ${CLOUDFLARE_API_TOKEN}`, + 'Content-Type': 'application/json' + }, + body: JSON.stringify([ + { sql: 'SELECT * FROM users WHERE id = ?', params: [1] }, + { sql: 'SELECT * FROM posts WHERE author_id = ?', params: [1] } + ]) + } +); +``` + +**Use cases**: Server-side scripts, CI/CD migrations, administrative tools, non-Worker integrations + +## Testing & Debugging + +```typescript +// Vitest with unstable_dev +import { unstable_dev } from 'wrangler'; +describe('D1', () => { + let worker: Awaited<ReturnType<typeof unstable_dev>>; + beforeAll(async () => { worker = await unstable_dev('src/index.ts'); }); + afterAll(async () => { await worker.stop(); }); + it('queries users', async () => { expect((await worker.fetch('/users')).status).toBe(200); }); +}); + +// Debug query performance +const result = await env.DB.prepare('SELECT * FROM users').all(); +console.log('Duration:', result.meta.duration, 'ms'); + +// Query plan analysis +const plan = await env.DB.prepare('EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?').bind(email).all(); +``` + +```bash +# Inspect local database +sqlite3 .wrangler/state/v3/d1/<database-id>.sqlite +.tables; .schema users; PRAGMA table_info(users); +``` diff --git a/cloudflare/references/d1/configuration.md 
b/cloudflare/references/d1/configuration.md new file mode 100644 index 0000000..8a073fc --- /dev/null +++ b/cloudflare/references/d1/configuration.md @@ -0,0 +1,188 @@ +# D1 Configuration + +## wrangler.jsonc Setup + +```jsonc +{ + "name": "your-worker-name", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date for new projects + "d1_databases": [ + { + "binding": "DB", // Env variable name + "database_name": "your-db-name", // Human-readable name + "database_id": "your-database-id", // UUID from dashboard/CLI + "migrations_dir": "migrations" // Optional: default is "migrations" + }, + // Read replica (paid plans only) + { + "binding": "DB_REPLICA", + "database_name": "your-db-name", + "database_id": "your-database-id" // Same ID, different binding + }, + // Multiple databases + { + "binding": "ANALYTICS_DB", + "database_name": "analytics-db", + "database_id": "yyy-yyy-yyy" + } + ] +} +``` + +## TypeScript Types + +```typescript +interface Env { DB: D1Database; ANALYTICS_DB?: D1Database; } + +export default { + async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise { + const result = await env.DB.prepare('SELECT * FROM users').all(); + return Response.json(result.results); + } +} +``` + +## Migrations + +File structure: `migrations/0001_initial_schema.sql`, `0002_add_posts.sql`, etc. 
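As a small illustration of the zero-padded, sequential naming convention, the next filename can be derived from the existing ones. This is a hypothetical helper for illustration only, not part of wrangler (which generates the name for you via `wrangler d1 migrations create`):

```typescript
// Derive the next migration filename following the zero-padded
// sequential convention (0001_..., 0002_..., ...).
function nextMigrationName(existing: string[], label: string): string {
  // Highest existing sequence number (0 when the directory is empty).
  const max = existing.reduce(
    (m, f) => Math.max(m, parseInt(f.slice(0, 4), 10) || 0),
    0,
  );
  return `${String(max + 1).padStart(4, "0")}_${label}.sql`;
}

console.log(
  nextMigrationName(["0001_initial_schema.sql", "0002_add_posts.sql"], "add_indexes"),
);
// 0003_add_indexes.sql
```

The zero-padding matters because wrangler applies migrations in lexicographic filename order.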
+ +### Example Migration + +```sql +-- migrations/0001_initial_schema.sql +CREATE TABLE IF NOT EXISTS users ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + email TEXT UNIQUE NOT NULL, + name TEXT NOT NULL, + created_at TEXT DEFAULT CURRENT_TIMESTAMP, + updated_at TEXT DEFAULT CURRENT_TIMESTAMP +); + +CREATE INDEX idx_users_email ON users(email); + +CREATE TABLE IF NOT EXISTS posts ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + user_id INTEGER NOT NULL, + title TEXT NOT NULL, + content TEXT, + published BOOLEAN DEFAULT 0, + created_at TEXT DEFAULT CURRENT_TIMESTAMP, + FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE +); + +CREATE INDEX idx_posts_user_id ON posts(user_id); +CREATE INDEX idx_posts_published ON posts(published); +``` + +### Running Migrations + +```bash +# Create new migration file +wrangler d1 migrations create <database-name> add_users_table +# Creates: migrations/0001_add_users_table.sql + +# Apply migrations +wrangler d1 migrations apply <database-name> --local # Apply to local DB +wrangler d1 migrations apply <database-name> --remote # Apply to production DB + +# List applied migrations +wrangler d1 migrations list <database-name> --remote + +# Direct SQL execution (bypasses migration tracking) +wrangler d1 execute <database-name> --remote --command="SELECT * FROM users" +wrangler d1 execute <database-name> --local --file=./schema.sql +``` + +**Migration tracking**: Wrangler creates a `d1_migrations` table automatically to track applied migrations + +## Indexing Strategy + +```sql +-- Index frequently queried columns +CREATE INDEX idx_users_email ON users(email); + +-- Composite indexes for multi-column queries +CREATE INDEX idx_posts_user_published ON posts(user_id, published); + +-- Covering indexes (include queried columns) +CREATE INDEX idx_users_email_name ON users(email, name); + +-- Partial indexes for filtered queries +CREATE INDEX idx_active_users ON users(email) WHERE active = 1; + +-- Check if query uses index +EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?; +``` + +## Drizzle ORM + +```typescript +// drizzle.config.ts 
+import type { Config } from 'drizzle-kit'; +export default { + schema: './src/schema.ts', out: './migrations', dialect: 'sqlite', driver: 'd1-http', + dbCredentials: { accountId: process.env.CLOUDFLARE_ACCOUNT_ID!, databaseId: process.env.D1_DATABASE_ID!, token: process.env.CLOUDFLARE_API_TOKEN! } +} satisfies Config; + +// schema.ts +import { sqliteTable, text, integer } from 'drizzle-orm/sqlite-core'; +export const users = sqliteTable('users', { + id: integer('id').primaryKey({ autoIncrement: true }), + email: text('email').notNull().unique(), + name: text('name').notNull() +}); + +// worker.ts +import { drizzle } from 'drizzle-orm/d1'; +import { users } from './schema'; +export default { + async fetch(request: Request, env: Env) { + const db = drizzle(env.DB); + return Response.json(await db.select().from(users)); + } +} +``` + +## Import & Export + +```bash +# Export full database (schema + data) +wrangler d1 export <database-name> --remote --output=./backup.sql + +# Export data only (no schema) +wrangler d1 export <database-name> --remote --no-schema --output=./data-only.sql + +# Export with foreign key constraints preserved +# (Default: foreign keys are disabled during export for import compatibility) + +# Import SQL file +wrangler d1 execute <database-name> --remote --file=./backup.sql + +# Limitations +# - BLOB data may not export correctly (use R2 for binary files) +# - Very large exports (>1GB) may timeout (split into chunks) +# - Import is NOT atomic (use batch() for transactional imports in Workers) +``` + +## Plan Tiers + +| Feature | Free | Paid | +|---------|------|------| +| Database size | 500 MB | 10 GB | +| Batch size | 1,000 statements | 10,000 statements | +| Time Travel | 7 days | 30 days | +| Read replicas | ❌ | ✅ | +| Sessions API | ❌ | ✅ (up to 15 min) | +| Pricing | Free | $5/mo + usage | + +**Usage pricing** (paid plans): $0.001 per 1K reads + $1 per 1M writes + $0.75/GB storage/month + +## Local Development + +```bash +wrangler dev --persist-to=./.wrangler/state # Persist across restarts +# Local DB: 
.wrangler/state/v3/d1/<database-id>.sqlite +sqlite3 .wrangler/state/v3/d1/<database-id>.sqlite # Inspect + +# Local dev uses free tier limits by default +``` diff --git a/cloudflare/references/d1/gotchas.md b/cloudflare/references/d1/gotchas.md new file mode 100644 index 0000000..9f9a95a --- /dev/null +++ b/cloudflare/references/d1/gotchas.md @@ -0,0 +1,98 @@ +# D1 Gotchas & Troubleshooting + +## Common Errors + +### "SQL Injection Vulnerability" + +**Cause:** Using string interpolation instead of prepared statements with bind() +**Solution:** ALWAYS use prepared statements: `env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).all()` instead of string interpolation, which allows attackers to inject malicious SQL + +### "no such table" + +**Cause:** Table doesn't exist because migrations haven't been run, or using the wrong database binding +**Solution:** Run migrations with `wrangler d1 migrations apply <database-name> --remote` and verify the binding name in wrangler.jsonc matches the code + +### "UNIQUE constraint failed" + +**Cause:** Attempting to insert a duplicate value into a column with a UNIQUE constraint +**Solution:** Catch the error and return a 409 Conflict status code + +### "Query Timeout (30s exceeded)" + +**Cause:** Query execution exceeds the 30-second timeout limit +**Solution:** Break into smaller queries, add indexes to speed up queries, or reduce the dataset size + +### "N+1 Query Problem" + +**Cause:** Making multiple individual queries in a loop instead of a single optimized query +**Solution:** Use JOIN to fetch related data in a single query, or use the `batch()` method for multiple queries + +### "Missing Indexes" + +**Cause:** Queries performing full table scans without indexes +**Solution:** Use `EXPLAIN QUERY PLAN` to check whether an index is used, then create one with `CREATE INDEX idx_users_email ON users(email)` + +### "Boolean Type Issues" + +**Cause:** SQLite uses INTEGER (0/1), not a native boolean type +**Solution:** Bind 1 or 0 instead of true/false when working with boolean values + +### "Date/Time Type 
Issues" + +**Cause:** SQLite doesn't have native DATE/TIME types +**Solution:** Use TEXT (ISO 8601 format) or INTEGER (unix timestamp) for date/time values + +## Plan Tier Limits + +| Limit | Free Tier | Paid Plans | Notes | +|-------|-----------|------------|-------| +| Database size | 500 MB | 10 GB | Design for multiple DBs per tenant on paid | +| Row size | 1 MB | 1 MB | Store large files in R2, not D1 | +| Query timeout | 30s | 30s (900s with sessions) | Use sessions API for migrations | +| Batch size | 1,000 statements | 10,000 statements | Split large batches accordingly | +| Time Travel | 7 days | 30 days | Point-in-time recovery window | +| Read replicas | ❌ Not available | ✅ Available | Paid add-on for lower latency | +| Sessions API | ❌ Not available | ✅ Up to 15 min | For migrations and heavy operations | +| Concurrent requests | 10,000/min | Higher | Contact support for custom limits | + +## Production Gotchas + +### "Batch size exceeded" + +**Cause:** Attempting to send >1,000 statements on free tier or >10,000 on paid +**Solution:** Chunk batches: `for (let i = 0; i < stmts.length; i += MAX_BATCH) await env.DB.batch(stmts.slice(i, i + MAX_BATCH))` + +### "Session not closed / resource leak" + +**Cause:** Forgot to call `session.close()` after using sessions API +**Solution:** Always use try/finally block: `try { await session.prepare(...) 
} finally { session.close() }` + +### "Replication lag causing stale reads" + +**Cause:** Reading from a replica immediately after a write; replication lag can be 100ms-2s +**Solution:** Use the primary for read-after-write: `await env.DB.prepare(...)`, not `env.DB_REPLICA` + +### "Migration applied to local but not remote" + +**Cause:** Forgot the `--remote` flag when applying migrations +**Solution:** Always run `wrangler d1 migrations apply <DATABASE_NAME> --remote` for production + +### "Foreign key constraint failed" + +**Cause:** Inserting a row with an FK to a non-existent parent, or deleting a parent before its children +**Solution:** D1 enforces foreign keys by default; insert parents before children, use ON DELETE CASCADE in the schema, or use `PRAGMA defer_foreign_keys = true` to defer checks until the end of a transaction + +### "BLOB data corrupted on export" + +**Cause:** D1 export may not handle BLOBs correctly +**Solution:** Store binary files in R2 and keep only R2 URLs/keys in D1 + +### "Database size approaching limit" + +**Cause:** Storing too much data in a single database +**Solution:** Scale out horizontally: create per-tenant/per-user databases, archive old data, or upgrade to a paid plan + +### "Local dev vs production behavior differs" + +**Cause:** Local dev uses a SQLite file while production runs distributed D1, so performance and limits differ +**Solution:** Always test migrations remotely with the `--remote` flag before production rollout diff --git a/cloudflare/references/d1/patterns.md b/cloudflare/references/d1/patterns.md new file mode 100644 index 0000000..f01c7bd --- /dev/null +++ b/cloudflare/references/d1/patterns.md @@ -0,0 +1,189 @@ +# D1 Patterns & Best Practices + +## Pagination + +```typescript +async function getUsers({ page, pageSize }: { page: number; pageSize: number }, env: Env) { + const offset = (page - 1) * pageSize; + const [countResult, dataResult] = await env.DB.batch([ + env.DB.prepare('SELECT COUNT(*) as total FROM users'), + env.DB.prepare('SELECT * FROM users ORDER BY created_at DESC LIMIT ? 
OFFSET ?').bind(pageSize, offset) + ]); + return { data: dataResult.results, total: countResult.results[0].total, page, pageSize, totalPages: Math.ceil(countResult.results[0].total / pageSize) }; +} +``` + +## Conditional Queries + +```typescript +async function searchUsers(filters: { name?: string; email?: string; active?: boolean }, env: Env) { + const conditions: string[] = [], params: (string | number | boolean | null)[] = []; + if (filters.name) { conditions.push('name LIKE ?'); params.push(`%${filters.name}%`); } + if (filters.email) { conditions.push('email = ?'); params.push(filters.email); } + if (filters.active !== undefined) { conditions.push('active = ?'); params.push(filters.active ? 1 : 0); } + const whereClause = conditions.length > 0 ? `WHERE ${conditions.join(' AND ')}` : ''; + return await env.DB.prepare(`SELECT * FROM users ${whereClause}`).bind(...params).all(); +} +``` + +## Bulk Insert + +```typescript +async function bulkInsertUsers(users: Array<{ name: string; email: string }>, env: Env) { + const stmt = env.DB.prepare('INSERT INTO users (name, email) VALUES (?, ?)'); + const batch = users.map(user => stmt.bind(user.name, user.email)); + return await env.DB.batch(batch); +} +``` + +## Caching with KV + +```typescript +async function getCachedUser(userId: number, env: { DB: D1Database; CACHE: KVNamespace }) { + const cacheKey = `user:${userId}`; + const cached = await env.CACHE?.get(cacheKey, 'json'); + if (cached) return cached; + const user = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first(); + if (user) await env.CACHE?.put(cacheKey, JSON.stringify(user), { expirationTtl: 300 }); + return user; +} +``` + +## Query Optimization + +```typescript +// ✅ Use indexes in WHERE clauses +const users = await env.DB.prepare('SELECT * FROM users WHERE email = ?').bind(email).all(); + +// ✅ Limit result sets +const recentPosts = await env.DB.prepare('SELECT * FROM posts ORDER BY created_at DESC LIMIT 100').all(); + +// ✅ Use 
batch() for multiple independent queries +const [user, posts, comments] = await env.DB.batch([ + env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId), + env.DB.prepare('SELECT * FROM posts WHERE user_id = ?').bind(userId), + env.DB.prepare('SELECT * FROM comments WHERE user_id = ?').bind(userId) +]); + +// ❌ Avoid N+1 queries +for (const post of posts) { + const author = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(post.user_id).first(); // Bad: multiple round trips +} + +// ✅ Use JOINs instead +const postsWithAuthors = await env.DB.prepare(` + SELECT posts.*, users.name as author_name + FROM posts + JOIN users ON posts.user_id = users.id +`).all(); +``` + +## Multi-Tenant SaaS + +```typescript +// Each tenant gets own database +export default { + async fetch(request: Request, env: { [key: `TENANT_${string}`]: D1Database }) { + const tenantId = request.headers.get('X-Tenant-ID'); + const data = await env[`TENANT_${tenantId}`].prepare('SELECT * FROM records').all(); + return Response.json(data.results); + } +} +``` + +## Session Storage + +```typescript +async function createSession(userId: number, token: string, env: Env) { + const expiresAt = new Date(Date.now() + 7 * 24 * 60 * 60 * 1000).toISOString(); + return await env.DB.prepare('INSERT INTO sessions (user_id, token, expires_at) VALUES (?, ?, ?)').bind(userId, token, expiresAt).run(); +} + +async function validateSession(token: string, env: Env) { + return await env.DB.prepare('SELECT s.*, u.email FROM sessions s JOIN users u ON s.user_id = u.id WHERE s.token = ? 
AND s.expires_at > CURRENT_TIMESTAMP').bind(token).first(); +} +``` + +## Analytics/Events + +```typescript +async function logEvent(event: { type: string; userId?: number; metadata: object }, env: Env) { + return await env.DB.prepare('INSERT INTO events (type, user_id, metadata) VALUES (?, ?, ?)').bind(event.type, event.userId || null, JSON.stringify(event.metadata)).run(); +} + +async function getEventStats(startDate: string, endDate: string, env: Env) { + return await env.DB.prepare('SELECT type, COUNT(*) as count FROM events WHERE timestamp BETWEEN ? AND ? GROUP BY type ORDER BY count DESC').bind(startDate, endDate).all(); +} +``` + +## Read Replication Pattern (Paid Plans) + +```typescript +interface Env { DB: D1Database; DB_REPLICA: D1Database; } + +export default { + async fetch(request: Request, env: Env) { + if (request.method === 'GET') { + // Reads: use replica for lower latency + const users = await env.DB_REPLICA.prepare('SELECT * FROM users WHERE active = 1').all(); + return Response.json(users.results); + } + + if (request.method === 'POST') { + const { name, email } = await request.json(); + const result = await env.DB.prepare('INSERT INTO users (name, email) VALUES (?, ?)').bind(name, email).run(); + + // Read-after-write: use primary for consistency (replication lag <100ms-2s) + const user = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(result.meta.last_row_id).first(); + return Response.json(user, { status: 201 }); + } + } +} +``` + +**Use replicas for**: Analytics dashboards, search results, public queries (eventual consistency OK) +**Use primary for**: Read-after-write, financial transactions, authentication (consistency required) + +## Sessions API Pattern (Paid Plans) + +```typescript +// Migration with long-running session (up to 15 min) +async function runMigration(env: Env) { + const session = env.DB.withSession({ timeout: 600 }); // 10 min + try { + await session.prepare('CREATE INDEX idx_users_email ON 
users(email)').run(); + await session.prepare('CREATE INDEX idx_posts_user ON posts(user_id)').run(); + await session.prepare('ANALYZE').run(); + } finally { + session.close(); // Always close to prevent leaks + } +} + +// Bulk transformation with batching +async function transformLargeDataset(env: Env) { + const session = env.DB.withSession({ timeout: 900 }); // 15 min max + try { + const BATCH_SIZE = 1000; + let offset = 0; + while (true) { + const rows = await session.prepare('SELECT id, data FROM legacy LIMIT ? OFFSET ?').bind(BATCH_SIZE, offset).all(); + if (rows.results.length === 0) break; + const updates = rows.results.map(row => + session.prepare('UPDATE legacy SET new_data = ? WHERE id = ?').bind(transform(row.data), row.id) + ); + await session.batch(updates); + offset += BATCH_SIZE; + } + } finally { session.close(); } +} +``` + +## Time Travel & Backups + +```bash +wrangler d1 time-travel restore <DATABASE_NAME> --timestamp="2024-01-15T14:30:00Z" # Point-in-time +wrangler d1 time-travel info <DATABASE_NAME> # List restore points (7 days free, 30 days paid) +wrangler d1 export <DATABASE_NAME> --remote --output=./backup.sql # Full export +wrangler d1 export <DATABASE_NAME> --remote --no-schema --output=./data.sql # Data only +wrangler d1 execute <DATABASE_NAME> --remote --file=./backup.sql # Import +``` diff --git a/cloudflare/references/ddos/README.md b/cloudflare/references/ddos/README.md new file mode 100644 index 0000000..117dd21 --- /dev/null +++ b/cloudflare/references/ddos/README.md @@ -0,0 +1,41 @@ +# Cloudflare DDoS Protection + +Autonomous, always-on protection against DDoS attacks across L3/4 and L7. 
+ +## Protection Types + +- **HTTP DDoS (L7)**: Protects HTTP/HTTPS traffic, phase `ddos_l7`, zone/account level +- **Network DDoS (L3/4)**: UDP/SYN/DNS floods, phase `ddos_l4`, account level only +- **Adaptive DDoS**: Learns 7-day baseline, detects deviations, 4 profile types (Origins, User-Agents, Locations, Protocols) + +## Plan Availability + +| Feature | Free | Pro | Business | Enterprise | Enterprise Advanced | +|---------|------|-----|----------|------------|---------------------| +| HTTP DDoS (L7) | ✓ | ✓ | ✓ | ✓ | ✓ | +| Network DDoS (L3/4) | ✓ | ✓ | ✓ | ✓ | ✓ | +| Override rules | 1 | 1 | 1 | 1 | 10 | +| Custom expressions | ✗ | ✗ | ✗ | ✗ | ✓ | +| Log action | ✗ | ✗ | ✗ | ✗ | ✓ | +| Adaptive DDoS | ✗ | ✗ | ✗ | ✓ | ✓ | +| Alert filters | Basic | Basic | Basic | Advanced | Advanced | + +## Actions & Sensitivity + +- **Actions**: `block`, `managed_challenge`, `challenge`, `log` (Enterprise Advanced only) +- **Sensitivity**: `default` (high), `medium`, `low`, `eoff` (essentially off) +- **Override**: By category/tag or individual rule ID +- **Scope**: Zone-level overrides take precedence over account-level + +## Reading Order + +| File | Purpose | Start Here If... 
| +|------|---------|------------------| +| [configuration.md](./configuration.md) | Dashboard setup, rule structure, adaptive profiles | You're setting up DDoS protection for the first time | +| [api.md](./api.md) | API endpoints, SDK usage, ruleset ID discovery | You're automating configuration or need programmatic access | +| [patterns.md](./patterns.md) | Protection strategies, defense-in-depth, dynamic response | You need implementation patterns or layered security | +| [gotchas.md](./gotchas.md) | False positives, tuning, error handling | You're troubleshooting or optimizing existing protection | + +## See Also +- [waf](../waf/) - Application-layer security rules +- [bot-management](../bot-management/) - Bot detection and mitigation diff --git a/cloudflare/references/ddos/api.md b/cloudflare/references/ddos/api.md new file mode 100644 index 0000000..b96284a --- /dev/null +++ b/cloudflare/references/ddos/api.md @@ -0,0 +1,164 @@ +# DDoS API + +## Endpoints + +### HTTP DDoS (L7) + +```typescript +// Zone-level +PUT /zones/{zoneId}/rulesets/phases/ddos_l7/entrypoint +GET /zones/{zoneId}/rulesets/phases/ddos_l7/entrypoint + +// Account-level (Enterprise Advanced) +PUT /accounts/{accountId}/rulesets/phases/ddos_l7/entrypoint +GET /accounts/{accountId}/rulesets/phases/ddos_l7/entrypoint +``` + +### Network DDoS (L3/4) + +```typescript +// Account-level only +PUT /accounts/{accountId}/rulesets/phases/ddos_l4/entrypoint +GET /accounts/{accountId}/rulesets/phases/ddos_l4/entrypoint +``` + +## TypeScript SDK + +**SDK Version**: Requires `cloudflare` >= 3.0.0 for ruleset phase methods. 
+ +```typescript +import Cloudflare from "cloudflare"; + +const client = new Cloudflare({ apiToken: process.env.CLOUDFLARE_API_TOKEN }); + +// STEP 1: Discover managed ruleset ID (required for overrides) +const allRulesets = await client.rulesets.list({ zone_id: zoneId }); +const ddosRuleset = allRulesets.result.find( + (r) => r.kind === "managed" && r.phase === "ddos_l7" +); +if (!ddosRuleset) throw new Error("DDoS managed ruleset not found"); +const managedRulesetId = ddosRuleset.id; + +// STEP 2: Get current HTTP DDoS configuration +const entrypointRuleset = await client.zones.rulesets.phases.entrypoint.get("ddos_l7", { + zone_id: zoneId, +}); + +// STEP 3: Update HTTP DDoS ruleset with overrides +await client.zones.rulesets.phases.entrypoint.update("ddos_l7", { + zone_id: zoneId, + rules: [ + { + action: "execute", + expression: "true", + action_parameters: { + id: managedRulesetId, // From discovery step + overrides: { + sensitivity_level: "medium", + action: "managed_challenge", + }, + }, + }, + ], +}); + +// Network DDoS (account level, L3/4) +const l4Rulesets = await client.rulesets.list({ account_id: accountId }); +const l4DdosRuleset = l4Rulesets.result.find( + (r) => r.kind === "managed" && r.phase === "ddos_l4" +); +const l4Ruleset = await client.accounts.rulesets.phases.entrypoint.get("ddos_l4", { + account_id: accountId, +}); +``` + +## Alert Configuration + +```typescript +interface DDoSAlertConfig { + name: string; + enabled: boolean; + alert_type: "http_ddos_attack_alert" | "layer_3_4_ddos_attack_alert" + | "advanced_http_ddos_attack_alert" | "advanced_layer_3_4_ddos_attack_alert"; + filters?: { + zones?: string[]; + hostnames?: string[]; + requests_per_second?: number; + packets_per_second?: number; + megabits_per_second?: number; + ip_prefixes?: string[]; // CIDR + ip_addresses?: string[]; + protocols?: string[]; + }; + mechanisms: { + email?: Array<{ id: string }>; + webhooks?: Array<{ id: string }>; + pagerduty?: Array<{ id: string }>; + }; +} 
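// Example policy instance for the fetch call below (hypothetical values;
// zone and webhook IDs are placeholders you must replace)
const alertConfig: DDoSAlertConfig = {
  name: "HTTP DDoS attack alert",
  enabled: true,
  alert_type: "http_ddos_attack_alert",
  filters: { zones: ["<ZONE_ID>"], requests_per_second: 1000 },
  mechanisms: { webhooks: [{ id: "<WEBHOOK_ID>" }] },
};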
+ +// Create alert +await fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/alerting/v3/policies`, + { + method: "POST", + headers: { + Authorization: `Bearer ${apiToken}`, + "Content-Type": "application/json", + }, + body: JSON.stringify(alertConfig), + } +); +``` + +## Typed Override Examples + +```typescript +// Override by category +interface CategoryOverride { + action: "execute"; + expression: string; + action_parameters: { + id: string; + overrides: { + categories?: Array<{ + category: "http-flood" | "http-anomaly" | "udp-flood" | "syn-flood"; + sensitivity_level?: "default" | "medium" | "low" | "eoff"; + action?: "block" | "managed_challenge" | "challenge" | "log"; + }>; + }; + }; +} + +// Override by rule ID +interface RuleOverride { + action: "execute"; + expression: string; + action_parameters: { + id: string; + overrides: { + rules?: Array<{ + id: string; + action?: "block" | "managed_challenge" | "challenge" | "log"; + sensitivity_level?: "default" | "medium" | "low" | "eoff"; + }>; + }; + }; +} + +// Example: Override specific adaptive rule +const adaptiveOverride: RuleOverride = { + action: "execute", + expression: "true", + action_parameters: { + id: managedRulesetId, + overrides: { + rules: [ + { id: "...adaptive-origins-rule-id...", sensitivity_level: "low" }, + ], + }, + }, +}; +``` + +See [patterns.md](./patterns.md) for complete implementation patterns. diff --git a/cloudflare/references/ddos/configuration.md b/cloudflare/references/ddos/configuration.md new file mode 100644 index 0000000..14c6e32 --- /dev/null +++ b/cloudflare/references/ddos/configuration.md @@ -0,0 +1,93 @@ +# DDoS Configuration + +## Dashboard Setup + +1. Navigate to Security > DDoS +2. Select HTTP DDoS or Network-layer DDoS +3. Configure sensitivity & action per ruleset/category/rule +4. Apply overrides with optional expressions (Enterprise Advanced) +5. 
Enable Adaptive DDoS toggle (Enterprise/Enterprise Advanced, requires 7 days traffic history) + +## Rule Structure + +```typescript +interface DDoSOverride { + description: string; + rules: Array<{ + action: "execute"; + expression: string; // Custom expression (Enterprise Advanced) or "true" for all + action_parameters: { + id: string; // Managed ruleset ID (discover via api.md) + overrides: { + sensitivity_level?: "default" | "medium" | "low" | "eoff"; + action?: "block" | "managed_challenge" | "challenge" | "log"; // log = Enterprise Advanced only + categories?: Array<{ + category: string; // e.g., "http-flood", "udp-flood" + sensitivity_level?: string; + }>; + rules?: Array<{ + id: string; + action?: string; + sensitivity_level?: string; + }>; + }; + }; + }>; +} +``` + +## Expression Availability + +| Plan | Custom Expressions | Example | +|------|-------------------|---------| +| Free/Pro/Business | ✗ | Use `"true"` only | +| Enterprise | ✗ | Use `"true"` only | +| Enterprise Advanced | ✓ | `ip.src in {...}`, `http.request.uri.path matches "..."` | + +## Sensitivity Mapping + +| UI | API | Threshold | +|----|-----|-----------| +| High | `default` | Most aggressive | +| Medium | `medium` | Balanced | +| Low | `low` | Less aggressive | +| Essentially Off | `eoff` | Minimal mitigation | + +## Common Categories + +- `http-flood`, `http-anomaly` (L7) +- `udp-flood`, `syn-flood`, `dns-flood` (L3/4) + +## Override Precedence + +Multiple override layers apply in this order (higher precedence wins): + +``` +Zone-level > Account-level +Individual Rule > Category > Global sensitivity/action +``` + +**Example**: Zone rule for `/api/*` overrides account-level global settings. 
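The layering above can be read as a simple resolution function. Below is a sketch of the rule > category > global step for sensitivity (an illustration of the stated precedence only, not Cloudflare's implementation; zone-vs-account selection happens before this step):

```typescript
type Sensitivity = "default" | "medium" | "low" | "eoff";

interface Overrides {
  sensitivity_level?: Sensitivity; // global (least specific)
  categories?: Array<{ category: string; sensitivity_level?: Sensitivity }>;
  rules?: Array<{ id: string; sensitivity_level?: Sensitivity }>; // most specific
}

// Individual rule override > category override > global override > built-in default.
function effectiveSensitivity(ruleId: string, ruleCategory: string, o: Overrides): Sensitivity {
  const byRule = o.rules?.find((r) => r.id === ruleId)?.sensitivity_level;
  const byCategory = o.categories?.find((c) => c.category === ruleCategory)?.sensitivity_level;
  return byRule ?? byCategory ?? o.sensitivity_level ?? "default";
}

// Example: the rule-level entry wins over both the category and global settings.
const o: Overrides = {
  sensitivity_level: "medium",
  categories: [{ category: "http-flood", sensitivity_level: "low" }],
  rules: [{ id: "rule-123", sensitivity_level: "eoff" }],
};
```

Here `effectiveSensitivity("rule-123", "http-flood", o)` resolves to `"eoff"`, while a different rule in the same category resolves to `"low"`.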
+ +## Adaptive DDoS Profiles + +**Availability**: Enterprise, Enterprise Advanced +**Learning period**: 7 days of traffic history required + +| Profile Type | Description | Detects | +|--------------|-------------|---------| +| **Origins** | Traffic patterns per origin server | Anomalous requests to specific origins | +| **User-Agents** | Traffic patterns per User-Agent | Malicious/anomalous user agent strings | +| **Locations** | Traffic patterns per geo-location | Attacks from specific countries/regions | +| **Protocols** | Traffic patterns per protocol (L3/4) | Protocol-specific flood attacks | + +Configure by targeting specific adaptive rule IDs via API (see api.md#typed-override-examples). + +## Alerting + +Configure via Notifications: +- Alert types: `http_ddos_attack_alert`, `layer_3_4_ddos_attack_alert`, `advanced_*` variants +- Filters: zones, hostnames, RPS/PPS/Mbps thresholds, IPs, protocols +- Mechanisms: email, webhooks, PagerDuty + +See [api.md](./api.md#alert-configuration) for API examples. diff --git a/cloudflare/references/ddos/gotchas.md b/cloudflare/references/ddos/gotchas.md new file mode 100644 index 0000000..f2a97d1 --- /dev/null +++ b/cloudflare/references/ddos/gotchas.md @@ -0,0 +1,107 @@ +# DDoS Gotchas + +## Common Errors + +### "False positives blocking legitimate traffic" + +**Cause**: Sensitivity too high, wrong action, or missing exceptions +**Solution**: +1. Lower sensitivity for specific rule/category +2. Use `log` action first to validate (Enterprise Advanced) +3. Add exception with custom expression (e.g., allowlist IPs) +4. 
Query flagged requests via GraphQL Analytics API to identify patterns + +### "Attacks getting through" + +**Cause**: Sensitivity too low or wrong action +**Solution**: Increase to `default` sensitivity and use `block` action: +```typescript +const config = { + rules: [{ + expression: "true", + action: "execute", + action_parameters: { id: managedRulesetId, overrides: { sensitivity_level: "default", action: "block" } }, + }], +}; +``` + +### "Adaptive rules not working" + +**Cause**: Insufficient traffic history (needs 7 days) +**Solution**: Wait for baseline to establish, check dashboard for adaptive rule status + +### "Zone override ignored" + +**Cause**: Account overrides conflict with zone overrides +**Solution**: Configure at zone level OR remove zone overrides to use account-level + +### "Log action not available" + +**Cause**: Not on Enterprise Advanced DDoS plan +**Solution**: Use `managed_challenge` with low sensitivity for testing + +### "Rule limit exceeded" + +**Cause**: Too many override rules (Free/Pro/Business: 1, Enterprise Advanced: 10) +**Solution**: Combine conditions in single expression using `and`/`or` + +### "Cannot override rule" + +**Cause**: Rule is read-only +**Solution**: Check API response for read-only indicator, use different rule + +### "Cannot disable DDoS protection" + +**Cause**: DDoS managed rulesets cannot be fully disabled (always-on protection) +**Solution**: Set `sensitivity_level: "eoff"` for minimal mitigation + +### "Expression not allowed" + +**Cause**: Custom expressions require Enterprise Advanced plan +**Solution**: Use `expression: "true"` for all traffic, or upgrade plan + +### "Managed ruleset not found" + +**Cause**: Zone/account doesn't have DDoS managed ruleset, or incorrect phase +**Solution**: Verify ruleset exists via `client.rulesets.list()`, check phase name (`ddos_l7` or `ddos_l4`) + +## API Error Codes + +| Error Code | Message | Cause | Solution | +|------------|---------|-------|----------| +| 10000 | 
Authentication error | Invalid/missing API token | Check token has DDoS permissions | +| 81000 | Ruleset validation failed | Invalid rule structure | Verify `action_parameters.id` is managed ruleset ID | +| 81020 | Expression not allowed | Custom expressions on wrong plan | Use `"true"` or upgrade to Enterprise Advanced | +| 81021 | Rule limit exceeded | Too many override rules | Reduce rules or upgrade (Enterprise Advanced: 10) | +| 81022 | Invalid sensitivity level | Wrong sensitivity value | Use: `default`, `medium`, `low`, `eoff` | +| 81023 | Invalid action | Wrong action for plan | Enterprise Advanced only: `log` action | + +## Limits + +| Resource/Limit | Free/Pro/Business | Enterprise | Enterprise Advanced | +|----------------|-------------------|------------|---------------------| +| Override rules per zone | 1 | 1 | 10 | +| Custom expressions | ✗ | ✗ | ✓ | +| Log action | ✗ | ✗ | ✓ | +| Adaptive DDoS | ✗ | ✓ | ✓ | +| Traffic history required | - | 7 days | 7 days | + +## Tuning Strategy + +1. Start with `log` action + `medium` sensitivity +2. Monitor for 24-48 hours +3. Identify false positives, add exceptions +4. Gradually increase to `default` sensitivity +5. Change action from `log` → `managed_challenge` → `block` +6. Document all adjustments + +## Best Practices + +- Test during low-traffic periods +- Use zone-level for per-site tuning +- Reference IP lists for easier management +- Set appropriate alert thresholds (avoid noise) +- Combine with WAF for layered defense +- Avoid over-tuning (keep config simple) + +See [patterns.md](./patterns.md) for progressive rollout examples. 
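The error-code table above can also drive programmatic handling when an entrypoint update fails. A sketch (the `CfError` shape follows the standard Cloudflare v4 `errors` array; the hint strings are illustrative, only the codes come from the table):

```typescript
interface CfError { code: number; message: string; }

// Map a failed ruleset update to an actionable hint, using the codes documented above.
function ddosUpdateHint(errors: CfError[]): string {
  for (const e of errors) {
    switch (e.code) {
      case 10000: return "Check that the API token has DDoS/ruleset permissions.";
      case 81000: return "Verify action_parameters.id is the managed ruleset ID.";
      case 81020: return "Custom expressions need Enterprise Advanced; use expression \"true\".";
      case 81021: return "Too many override rules; combine conditions or upgrade plan.";
      case 81022: return "Sensitivity must be one of: default, medium, low, eoff.";
      case 81023: return "The log action is Enterprise Advanced only; try managed_challenge.";
    }
  }
  return `Unhandled ruleset error: ${errors.map((e) => `${e.code} ${e.message}`).join("; ")}`;
}
```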
diff --git a/cloudflare/references/ddos/patterns.md b/cloudflare/references/ddos/patterns.md new file mode 100644 index 0000000..a46ef2f --- /dev/null +++ b/cloudflare/references/ddos/patterns.md @@ -0,0 +1,174 @@ +# DDoS Protection Patterns + +## Allowlist Trusted IPs + +```typescript +const config = { + description: "Allowlist trusted IPs", + rules: [{ + expression: "ip.src in { 203.0.113.0/24 192.0.2.1 }", + action: "execute", + action_parameters: { + id: managedRulesetId, + overrides: { sensitivity_level: "eoff" }, + }, + }], +}; + +await client.accounts.rulesets.phases.entrypoint.update("ddos_l7", { + account_id: accountId, + ...config, +}); +``` + +## Route-specific Sensitivity + +```typescript +const config = { + description: "Route-specific protection", + rules: [ + { + expression: "not http.request.uri.path matches \"^/api/\"", + action: "execute", + action_parameters: { + id: managedRulesetId, + overrides: { sensitivity_level: "default", action: "block" }, + }, + }, + { + expression: "http.request.uri.path matches \"^/api/\"", + action: "execute", + action_parameters: { + id: managedRulesetId, + overrides: { sensitivity_level: "low", action: "managed_challenge" }, + }, + }, + ], +}; +``` + +## Progressive Enhancement + +```typescript +enum ProtectionLevel { MONITORING = "monitoring", LOW = "low", MEDIUM = "medium", HIGH = "high" } + +const levelConfig = { + [ProtectionLevel.MONITORING]: { action: "log", sensitivity: "eoff" }, + [ProtectionLevel.LOW]: { action: "managed_challenge", sensitivity: "low" }, + [ProtectionLevel.MEDIUM]: { action: "managed_challenge", sensitivity: "medium" }, + [ProtectionLevel.HIGH]: { action: "block", sensitivity: "default" }, +} as const; + +async function setProtectionLevel(zoneId: string, level: ProtectionLevel, rulesetId: string, client: Cloudflare) { + const settings = levelConfig[level]; + return client.zones.rulesets.phases.entrypoint.update("ddos_l7", { + zone_id: zoneId, + rules: [{ + expression: "true", + action: 
"execute", + action_parameters: { id: rulesetId, overrides: { action: settings.action, sensitivity_level: settings.sensitivity } }, + }], + }); +} +``` + +## Dynamic Response to Attacks + +```typescript +interface Env { CLOUDFLARE_API_TOKEN: string; ZONE_ID: string; KV: KVNamespace; } + +export default { + async fetch(request: Request, env: Env): Promise<Response> { + if (request.url.includes("/attack-detected")) { + const attackData = await request.json(); + await env.KV.put(`attack:${Date.now()}`, JSON.stringify(attackData), { expirationTtl: 86400 }); + const recentAttacks = await getRecentAttacks(env.KV); + if (recentAttacks.length > 5) { + await setProtectionLevel(env.ZONE_ID, ProtectionLevel.HIGH, managedRulesetId, client); + return new Response("Protection increased"); + } + } + return new Response("OK"); + }, + async scheduled(event: ScheduledEvent, env: Env): Promise<void> { + const recentAttacks = await getRecentAttacks(env.KV); + if (recentAttacks.length === 0) await setProtectionLevel(env.ZONE_ID, ProtectionLevel.MEDIUM, managedRulesetId, client); + }, +}; +``` + +## Multi-rule Tiered Protection (Enterprise Advanced) + +```typescript +const config = { + description: "Multi-tier DDoS protection", + rules: [ + { + expression: "not ip.src in $known_ips and not cf.bot_management.score gt 30", + action: "execute", + action_parameters: { id: managedRulesetId, overrides: { sensitivity_level: "default", action: "block" } }, + }, + { + expression: "cf.bot_management.verified_bot", + action: "execute", + action_parameters: { id: managedRulesetId, overrides: { sensitivity_level: "medium", action: "managed_challenge" } }, + }, + { + expression: "ip.src in $trusted_ips", + action: "execute", + action_parameters: { id: managedRulesetId, overrides: { sensitivity_level: "low" } }, + }, + ], +}; +``` + +## Defense in Depth + +Layered security stack: DDoS + WAF + Rate Limiting + Bot Management. 
+ +```typescript +// Layer 1: DDoS (volumetric attacks) +await client.zones.rulesets.phases.entrypoint.update("ddos_l7", { + zone_id: zoneId, + rules: [{ expression: "true", action: "execute", action_parameters: { id: ddosRulesetId, overrides: { sensitivity_level: "medium" } } }], +}); + +// Layer 2: WAF (exploit protection) +await client.zones.rulesets.phases.entrypoint.update("http_request_firewall_managed", { + zone_id: zoneId, + rules: [{ expression: "true", action: "execute", action_parameters: { id: wafRulesetId } }], +}); + +// Layer 3: Rate Limiting (abuse prevention) +await client.zones.rulesets.phases.entrypoint.update("http_ratelimit", { + zone_id: zoneId, + rules: [{ expression: "http.request.uri.path eq \"/api/login\"", action: "block", ratelimit: { characteristics: ["ip.src"], period: 60, requests_per_period: 5 } }], +}); + +// Layer 4: Bot Management (automation detection) +await client.zones.rulesets.phases.entrypoint.update("http_request_sbfm", { + zone_id: zoneId, + rules: [{ expression: "cf.bot_management.score lt 30", action: "managed_challenge" }], +}); +``` + +## Cache Strategy for DDoS Mitigation + +Exclude query strings from cache key to counter randomized query parameter attacks. + +```typescript +const cacheRule = { + expression: "http.request.uri.path matches \"^/api/\"", + action: "set_cache_settings", + action_parameters: { + cache: true, + cache_key: { ignore_query_strings_order: true, custom_key: { query_string: { exclude: { all: true } } } }, + }, +}; + +await client.zones.rulesets.phases.entrypoint.update("http_request_cache_settings", { zone_id: zoneId, rules: [cacheRule] }); +``` + +**Rationale**: Attackers randomize query strings (`?random=123456`) to bypass cache. Excluding query params ensures cache hits absorb attack traffic. + +See [configuration.md](./configuration.md) for rule structure details. 
diff --git a/cloudflare/references/do-storage/README.md b/cloudflare/references/do-storage/README.md new file mode 100644 index 0000000..426d2c4 --- /dev/null +++ b/cloudflare/references/do-storage/README.md @@ -0,0 +1,75 @@ +# Cloudflare Durable Objects Storage + +Persistent storage API for Durable Objects with SQLite and KV backends, PITR, and automatic concurrency control. + +## Overview + +DO Storage provides: +- SQLite-backed (recommended) or KV-backed +- SQL API + synchronous/async KV APIs +- Automatic input/output gates (race-free) +- 30-day point-in-time recovery (PITR) +- Transactions and alarms + +**Use cases:** Stateful coordination, real-time collaboration, counters, sessions, rate limiters + +**Billing:** Charged by request, GB-month storage, and rowsRead/rowsWritten for SQL operations + +## Quick Start + +```typescript +export class Counter extends DurableObject { + sql: SqlStorage; + + constructor(ctx: DurableObjectState, env: Env) { + super(ctx, env); + this.sql = ctx.storage.sql; + this.sql.exec('CREATE TABLE IF NOT EXISTS data(key TEXT PRIMARY KEY, value INTEGER)'); + } + + async increment(): Promise<number> { + const result = this.sql.exec( + 'INSERT INTO data VALUES (?, ?) 
ON CONFLICT(key) DO UPDATE SET value = value + 1 RETURNING value', + 'counter', 1 + ).one(); + return result?.value || 1; + } +} +``` + +## Storage Backends + +| Backend | Create Method | APIs | PITR | +|---------|---------------|------|------| +| SQLite (recommended) | `new_sqlite_classes` | SQL + sync KV + async KV | ✅ | +| KV (legacy) | `new_classes` | async KV only | ❌ | + +## Core APIs + +- **SQL API** (`ctx.storage.sql`): Full SQLite with extensions (FTS5, JSON, math) +- **Sync KV** (`ctx.storage.kv`): Synchronous key-value (SQLite only) +- **Async KV** (`ctx.storage`): Asynchronous key-value (both backends) +- **Transactions** (`transactionSync()`, `transaction()`) +- **PITR** (`getBookmarkForTime()`, `onNextSessionRestoreBookmark()`) +- **Alarms** (`setAlarm()`, `alarm()` handler) + +## Reading Order + +**New to DO storage:** configuration.md → api.md → patterns.md → gotchas.md +**Building features:** patterns.md → api.md → gotchas.md +**Debugging issues:** gotchas.md → api.md +**Writing tests:** testing.md + +## In This Reference + +- [configuration.md](./configuration.md) - wrangler.jsonc migrations, SQLite vs KV setup, RPC binding +- [api.md](./api.md) - SQL exec/cursors, KV methods, storage options, transactions, alarms, PITR +- [patterns.md](./patterns.md) - Schema migrations, caching, rate limiting, batch processing, parent-child coordination +- [gotchas.md](./gotchas.md) - Concurrency gates, INTEGER precision, transaction rules, SQL limits +- [testing.md](./testing.md) - vitest-pool-workers setup, testing DOs with SQL/alarms/PITR + +## See Also + +- [durable-objects](../durable-objects/) - DO fundamentals and coordination patterns +- [workers](../workers/) - Worker runtime for DO stubs +- [d1](../d1/) - Shared database alternative to per-DO storage diff --git a/cloudflare/references/do-storage/api.md b/cloudflare/references/do-storage/api.md new file mode 100644 index 0000000..e659598 --- /dev/null +++ b/cloudflare/references/do-storage/api.md @@ 
-0,0 +1,102 @@ +# DO Storage API Reference + +## SQL API + +```typescript +const cursor = this.sql.exec('SELECT * FROM users WHERE email = ?', email); +for (let row of cursor) {} // Objects: { id, name, email } +cursor.toArray(); cursor.one(); // Single row (throws if != 1) +for (let row of cursor.raw()) {} // Arrays: [1, "Alice", "..."] + +// Manual iteration +const iter = cursor[Symbol.iterator](); +const first = iter.next(); // { value: {...}, done: false } + +cursor.columnNames; // ["id", "name", "email"] +cursor.rowsRead; cursor.rowsWritten; // Billing + +type User = { id: number; name: string; email: string }; +const user = this.sql.exec<User>('...', userId).one(); +``` + +## Sync KV API (SQLite only) + +```typescript +this.ctx.storage.kv.get("counter"); // undefined if missing +this.ctx.storage.kv.put("counter", 42); +this.ctx.storage.kv.put("user", { name: "Alice", age: 30 }); +this.ctx.storage.kv.delete("counter"); // true if existed + +for (let [key, value] of this.ctx.storage.kv.list()) {} + +// List options: start, prefix, reverse, limit +this.ctx.storage.kv.list({ start: "user:", prefix: "user:", reverse: true, limit: 100 }); +``` + +## Async KV API (Both backends) + +```typescript +await this.ctx.storage.get("key"); // Single +await this.ctx.storage.get(["key1", "key2"]); // Multiple (max 128) +await this.ctx.storage.put("key", value); // Single +await this.ctx.storage.put({ "key1": "v1", "key2": { nested: true } }); // Multiple (max 128) +await this.ctx.storage.delete("key"); +await this.ctx.storage.delete(["key1", "key2"]); +await this.ctx.storage.list({ prefix: "user:", limit: 100 }); + +// Options: allowConcurrency, noCache, allowUnconfirmed +await this.ctx.storage.get("key", { allowConcurrency: true, noCache: true }); +await this.ctx.storage.put("key", value, { allowUnconfirmed: true, noCache: true }); +``` + +### Storage Options + +| Option | Methods | Effect | Use Case | +|--------|---------|--------|----------| +| `allowConcurrency` | get, list | 
Skip input gate; allow concurrent requests during read | Read-heavy metrics that don't need strict consistency | +| `noCache` | get, put, list | Skip in-memory cache; always read from disk | Rarely-accessed data or testing storage directly | +| `allowUnconfirmed` | put, delete | Return before write confirms (still protected by output gate) | Non-critical writes where latency matters more than confirmation | + +## Transactions + +```typescript +// Sync (SQL/sync KV only) +this.ctx.storage.transactionSync(() => { + this.sql.exec('UPDATE accounts SET balance = balance - ? WHERE id = ?', 100, 1); + this.sql.exec('UPDATE accounts SET balance = balance + ? WHERE id = ?', 100, 2); + return "result"; +}); + +// Async (closure receives a txn object) +await this.ctx.storage.transaction(async (txn) => { + const value = await txn.get("counter"); + await txn.put("counter", value + 1); + if (value > 100) txn.rollback(); // Explicit rollback +}); +``` + +## Point-in-Time Recovery + +```typescript +await this.ctx.storage.getCurrentBookmark(); +await this.ctx.storage.getBookmarkForTime(Date.now() - 2 * 24 * 60 * 60 * 1000); +await this.ctx.storage.onNextSessionRestoreBookmark(bookmark); +this.ctx.abort(); // Restart to apply; bookmarks lexically comparable (earlier < later) +``` + +## Alarms + +```typescript +await this.ctx.storage.setAlarm(Date.now() + 60000); // Timestamp or Date +await this.ctx.storage.getAlarm(); +await this.ctx.storage.deleteAlarm(); + +async alarm() { await this.doScheduledWork(); } +``` + +## Misc + +```typescript +await this.ctx.storage.deleteAll(); // Atomic for SQLite; alarm NOT included +this.ctx.storage.sql.databaseSize; // Bytes +``` diff --git a/cloudflare/references/do-storage/configuration.md b/cloudflare/references/do-storage/configuration.md new file mode 100644 index 0000000..18b41bb --- /dev/null +++ b/cloudflare/references/do-storage/configuration.md @@ -0,0 +1,112 @@ +# DO Storage Configuration + +## SQLite-backed (Recommended) + 
+**wrangler.jsonc:** +```jsonc +{ + "migrations": [ + { + "tag": "v1", + "new_sqlite_classes": ["Counter", "Session", "RateLimiter"] + } + ] +} +``` + +**Migration lifecycle:** Migrations run once per deployment. Existing DO instances get new storage backend on next invocation. Renaming/removing classes requires `renamed_classes` or `deleted_classes` entries. + +## KV-backed (Legacy) + +**wrangler.jsonc:** +```jsonc +{ + "migrations": [ + { + "tag": "v1", + "new_classes": ["OldCounter"] + } + ] +} +``` + +## TypeScript Setup + +```typescript +export class MyDurableObject extends DurableObject { + sql: SqlStorage; + + constructor(ctx: DurableObjectState, env: Env) { + super(ctx, env); + this.sql = ctx.storage.sql; + + // Initialize schema + this.sql.exec(` + CREATE TABLE IF NOT EXISTS users( + id INTEGER PRIMARY KEY, + name TEXT NOT NULL, + email TEXT UNIQUE + ); + `); + } +} + +// Binding +interface Env { + MY_DO: DurableObjectNamespace<MyDurableObject>; +} + +export default { + async fetch(request: Request, env: Env): Promise<Response> { + const id = env.MY_DO.idFromName('singleton'); + const stub = env.MY_DO.get(id); + + // Modern RPC: call methods directly (recommended) + const result = await stub.someMethod(); + return Response.json(result); + + // Legacy: forward request (still works) + // return stub.fetch(request); + } +} +``` + +## CPU Limits + +```jsonc +{ + "limits": { + "cpu_ms": 300000 // 5 minutes (default 30s) + } +} +``` + +## Location Control + +```typescript +// Jurisdiction (GDPR/FedRAMP) +const euNamespace = env.MY_DO.jurisdiction("eu"); +const id = euNamespace.newUniqueId(); +const stub = euNamespace.get(id); + +// Location hint (best effort) +const stub = env.MY_DO.get(id, { locationHint: "enam" }); +// Hints: wnam, enam, sam, weur, eeur, apac, oc, afr, me +``` + +## Initialization + +```typescript +export class Counter extends DurableObject { + value: number; + + constructor(ctx: DurableObjectState, env: Env) { + super(ctx, env); + + // Block concurrent requests during 
init + ctx.blockConcurrencyWhile(async () => { + this.value = (await ctx.storage.get("value")) || 0; + }); + } +} +``` diff --git a/cloudflare/references/do-storage/gotchas.md b/cloudflare/references/do-storage/gotchas.md new file mode 100644 index 0000000..8898f08 --- /dev/null +++ b/cloudflare/references/do-storage/gotchas.md @@ -0,0 +1,150 @@ +# DO Storage Gotchas & Troubleshooting + +## Concurrency Model (CRITICAL) + +Durable Objects use **input/output gates** to prevent race conditions: + +### Input Gates +Block new requests during storage reads from CURRENT request: + +```typescript +// SAFE: Input gate active during await +async increment() { + const val = await this.ctx.storage.get("counter"); // Input gate blocks other requests + await this.ctx.storage.put("counter", val + 1); + return val; +} +``` + +### Output Gates +Hold response until ALL writes from current request confirm: + +```typescript +// SAFE: Output gate waits for put() to confirm before returning response +async increment() { + const val = await this.ctx.storage.get("counter"); + this.ctx.storage.put("counter", val + 1); // No await + return new Response(String(val)); // Response delayed until write confirms +} +``` + +### Write Coalescing +Multiple writes to same key = atomic (last write wins): + +```typescript +// SAFE: All three writes coalesce atomically +this.ctx.storage.put("key", 1); +this.ctx.storage.put("key", 2); +this.ctx.storage.put("key", 3); // Final value: 3 +``` + +### Breaking Gates (DANGER) + +**fetch() breaks input/output gates** → allows request interleaving: + +```typescript +// UNSAFE: fetch() allows another request to interleave +async unsafe() { + const val = await this.ctx.storage.get("counter"); + await fetch("https://api.example.com"); // Gate broken! 
+ await this.ctx.storage.put("counter", val + 1); // Race condition possible +} +``` + +**Solution:** Use `blockConcurrencyWhile()` or `transaction()`: + +```typescript +// SAFE: Block concurrent requests explicitly +async safe() { + return await this.ctx.blockConcurrencyWhile(async () => { + const val = await this.ctx.storage.get("counter"); + await fetch("https://api.example.com"); + await this.ctx.storage.put("counter", val + 1); + return val; + }); +} +``` + +### allowConcurrency Option + +Opt out of input gate for reads that don't need protection: + +```typescript +// Allow concurrent reads (no consistency guarantee) +const val = await this.ctx.storage.get("metrics", { allowConcurrency: true }); +``` + +## Common Errors + +### "Race Condition in Concurrent Calls" + +**Cause:** Multiple concurrent storage operations initiated from same event (e.g., `Promise.all()`) are not protected by input gate +**Solution:** Avoid concurrent storage operations within single event; input gate only serializes requests from different events, not operations within same event + +### "Direct SQL Transaction Statements" + +**Cause:** Using `BEGIN TRANSACTION` directly instead of transaction methods +**Solution:** Use `this.ctx.storage.transactionSync()` for sync operations or `this.ctx.storage.transaction()` for async operations + +### "Async in transactionSync" + +**Cause:** Using async operations inside `transactionSync()` callback +**Solution:** Use async `transaction()` method instead of `transactionSync()` when async operations needed + +### "TypeScript Type Mismatch at Runtime" + +**Cause:** Query doesn't return all fields specified in TypeScript type +**Solution:** Ensure SQL query selects all columns that match the TypeScript type definition + +### "Silent Data Corruption with Large IDs" + +**Cause:** JavaScript numbers have 53-bit precision; SQLite INTEGER is 64-bit +**Symptom:** IDs > 9007199254740991 (Number.MAX_SAFE_INTEGER) silently truncate/corrupt +**Solution:** 
Store large IDs as TEXT: + +```typescript +// BAD: Snowflake/Twitter IDs will corrupt +this.sql.exec("CREATE TABLE events(id INTEGER PRIMARY KEY)"); +this.sql.exec("INSERT INTO events VALUES (?)", 1234567890123456789n); // Corrupts! + +// GOOD: Store as TEXT +this.sql.exec("CREATE TABLE events(id TEXT PRIMARY KEY)"); +this.sql.exec("INSERT INTO events VALUES (?)", "1234567890123456789"); +``` + +### "Alarm Not Deleted with deleteAll()" + +**Cause:** `deleteAll()` doesn't delete alarms automatically +**Solution:** Call `deleteAlarm()` explicitly before `deleteAll()` to remove alarm + +### "Slow Performance" + +**Cause:** Using async KV API instead of sync API +**Solution:** Use sync KV API (`ctx.storage.kv`) for better performance with simple key-value operations + +### "High Billing from Storage Operations" + +**Cause:** Excessive `rowsRead`/`rowsWritten` or unused objects not cleaned up +**Solution:** Monitor `rowsRead`/`rowsWritten` metrics and ensure unused objects call `deleteAll()` + +### "Durable Object Overloaded" + +**Cause:** Single DO exceeding ~1K req/sec soft limit +**Solution:** Shard across multiple DOs with random IDs or other distribution strategy + +## Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Max columns per table | 100 | SQL limitation | +| Max string/BLOB per row | 2 MB | SQL limitation | +| Max row size | 2 MB | SQL limitation | +| Max SQL statement size | 100 KB | SQL limitation | +| Max SQL parameters | 100 | SQL limitation | +| Max LIKE/GLOB pattern | 50 B | SQL limitation | +| SQLite storage per object | 10 GB | SQLite-backed storage | +| SQLite key+value size | 2 MB | SQLite-backed storage | +| KV storage per object | Unlimited | KV-style storage | +| KV key size | 2 KiB | KV-style storage | +| KV value size | 128 KiB | KV-style storage | +| Request throughput | ~1K req/sec | Soft limit per DO | diff --git a/cloudflare/references/do-storage/patterns.md b/cloudflare/references/do-storage/patterns.md new file mode 
100644 index 0000000..2885915 --- /dev/null +++ b/cloudflare/references/do-storage/patterns.md @@ -0,0 +1,182 @@ +# DO Storage Patterns & Best Practices + +## Schema Migration + +```typescript +export class MyDurableObject extends DurableObject { + constructor(ctx: DurableObjectState, env: Env) { + super(ctx, env); + this.sql = ctx.storage.sql; + + // Use SQLite's built-in user_version pragma + const ver = this.sql.exec("PRAGMA user_version").one()?.user_version || 0; + + // Use < so a fresh object applies every step in one pass + if (ver < 1) { + this.sql.exec(`CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT)`); + this.sql.exec("PRAGMA user_version = 1"); + } + if (ver < 2) { + this.sql.exec(`ALTER TABLE users ADD COLUMN email TEXT`); + this.sql.exec("PRAGMA user_version = 2"); + } + } +} +``` + +## In-Memory Caching + +```typescript +export class UserCache extends DurableObject { + cache = new Map<string, User>(); + async getUser(id: string): Promise<User | undefined> { + if (this.cache.has(id)) { + const cached = this.cache.get(id); + if (cached) return cached; + } + const user = await this.ctx.storage.get<User>(`user:${id}`); + if (user) this.cache.set(id, user); + return user; + } + async updateUser(id: string, data: Partial<User>) { + const updated = { ...await this.getUser(id), ...data }; + this.cache.set(id, updated); + await this.ctx.storage.put(`user:${id}`, updated); + return updated; + } +} +``` + +## Rate Limiting + +```typescript +export class RateLimiter extends DurableObject { + async checkLimit(key: string, limit: number, window: number): Promise<boolean> { + const now = Date.now(); + this.sql.exec('DELETE FROM requests WHERE key = ? 
AND timestamp < ?', key, now - window); + const count = this.sql.exec('SELECT COUNT(*) as count FROM requests WHERE key = ?', key).one().count; + if (count >= limit) return false; + this.sql.exec('INSERT INTO requests (key, timestamp) VALUES (?, ?)', key, now); + return true; + } +} +``` + +## Batch Processing with Alarms + +```typescript +export class BatchProcessor extends DurableObject { + pending: string[] = []; + async addItem(item: string) { + this.pending.push(item); + if (!await this.ctx.storage.getAlarm()) await this.ctx.storage.setAlarm(Date.now() + 5000); + } + async alarm() { + const items = [...this.pending]; + this.pending = []; + if (items.length === 0) return; // Empty VALUES list would be invalid SQL + this.sql.exec(`INSERT INTO processed_items (item, timestamp) VALUES ${items.map(() => "(?, ?)").join(", ")}`, ...items.flatMap(item => [item, Date.now()])); + } +} +``` + +## Initialization Pattern + +```typescript +export class Counter extends DurableObject { + value: number; + constructor(ctx: DurableObjectState, env: Env) { + super(ctx, env); + ctx.blockConcurrencyWhile(async () => { this.value = (await ctx.storage.get<number>("value")) || 0; }); + } + async increment() { + this.value++; + this.ctx.storage.put("value", this.value); // Don't await (output gate protects) + return this.value; + } +} +``` + +## Safe Counter / Optimized Write + +```typescript +// Input gate blocks other requests +async getUniqueNumber(): Promise<number> { + let val = (await this.ctx.storage.get<number>("counter")) || 0; + await this.ctx.storage.put("counter", val + 1); + return val; +} + +// No await on write - output gate delays response until write confirms +async increment(): Promise<Response> { + let val = (await this.ctx.storage.get<number>("counter")) || 0; + this.ctx.storage.put("counter", val + 1); + return new Response(String(val)); +} +``` + +## Parent-Child Coordination + +Hierarchical DO pattern where parent manages child DOs: + +```typescript +// Parent DO coordinates children +export class Workspace extends DurableObject { + async createDocument(name: string): Promise<string> { + const docId 
= crypto.randomUUID(); + const childId = this.env.DOCUMENT.idFromName(`${this.ctx.id.toString()}:${docId}`); + const childStub = this.env.DOCUMENT.get(childId); + await childStub.initialize(name); + + // Track child in parent storage + this.sql.exec('INSERT INTO documents (id, name, created) VALUES (?, ?, ?)', + docId, name, Date.now()); + return docId; + } + + async listDocuments(): Promise<string[]> { + return this.sql.exec<{ id: string }>('SELECT id FROM documents').toArray().map(r => r.id); + } +} + +// Child DO +export class Document extends DurableObject { + async initialize(name: string) { + this.sql.exec('CREATE TABLE IF NOT EXISTS content(key TEXT PRIMARY KEY, value TEXT)'); + this.sql.exec('INSERT INTO content VALUES (?, ?)', 'name', name); + } +} +``` + +## Write Coalescing Pattern + +Multiple writes to same key coalesce atomically (last write wins): + +```typescript +async updateMetrics(userId: string, actions: Action[]) { + // All writes coalesce - no await needed on the puts + for (const action of actions) { + this.ctx.storage.put(`user:${userId}:lastAction`, action.type); + this.ctx.storage.put(`user:${userId}:count`, + ((await this.ctx.storage.get<number>(`user:${userId}:count`)) ?? 0) + 1); + } + // Output gate ensures all writes confirm before response + return new Response("OK"); +} + +// Atomic batch with SQL (raw BEGIN/COMMIT via sql.exec is not allowed; use transactionSync) +async batchUpdate(items: Item[]) { + this.ctx.storage.transactionSync(() => { + for (const item of items) { + this.sql.exec('INSERT OR REPLACE INTO items VALUES (?, ?)', item.id, item.value); + } + }); +} +``` + +## Cleanup + +```typescript +async cleanup() { + await this.ctx.storage.deleteAlarm(); // Separate from deleteAll + await this.ctx.storage.deleteAll(); +} +``` diff --git a/cloudflare/references/do-storage/testing.md b/cloudflare/references/do-storage/testing.md new file mode 100644 index 0000000..d348d87 --- /dev/null +++ b/cloudflare/references/do-storage/testing.md @@ -0,0 +1,183 @@ +# DO Storage Testing + +Testing Durable Objects with storage using `vitest-pool-workers`. 
+ +## Setup + +**vitest.config.ts:** +```typescript +import { defineWorkersConfig } from "@cloudflare/vitest-pool-workers/config"; + +export default defineWorkersConfig({ + test: { + poolOptions: { + workers: { wrangler: { configPath: "./wrangler.toml" } } + } + } +}); +``` + +**package.json:** Add `@cloudflare/vitest-pool-workers` and `vitest` to devDependencies + +## Basic Testing + +`runInDurableObject()` takes a stub (from `namespace.get(id)`) plus a callback: + +```typescript +import { env, runInDurableObject } from "cloudflare:test"; +import { describe, it, expect } from "vitest"; + +describe("Counter DO", () => { + it("increments counter", async () => { + const id = env.COUNTER.idFromName("test"); + const stub = env.COUNTER.get(id); + const result = await runInDurableObject(stub, async (instance, state) => { + const val1 = await instance.increment(); + const val2 = await instance.increment(); + return { val1, val2 }; + }); + expect(result.val1).toBe(1); + expect(result.val2).toBe(2); + }); +}); +``` + +## Testing SQL Storage + +```typescript +it("creates and queries users", async () => { + const id = env.USER_MANAGER.idFromName("test"); + await runInDurableObject(env.USER_MANAGER.get(id), async (instance, state) => { + await instance.createUser("alice@example.com", "Alice"); + const user = await instance.getUser("alice@example.com"); + expect(user).toEqual({ email: "alice@example.com", name: "Alice" }); + }); +}); + +it("handles schema migrations", async () => { + const id = env.USER_MANAGER.idFromName("migration-test"); + await runInDurableObject(env.USER_MANAGER.get(id), async (instance, state) => { + const version = state.storage.sql.exec( + "SELECT value FROM _meta WHERE key = 'schema_version'" + ).one()?.value; + expect(version).toBe("1"); + }); +}); +``` + +## Testing Alarms + +```typescript +import { runDurableObjectAlarm } from "cloudflare:test"; + +it("processes batch on alarm", async () => { + const id = env.BATCH_PROCESSOR.idFromName("test"); + + // Add items + await runInDurableObject(env.BATCH_PROCESSOR.get(id), async (instance) => { + await 
instance.addItem("item1"); + await instance.addItem("item2"); + }); + + // Trigger alarm + await runDurableObjectAlarm(env.BATCH_PROCESSOR.get(id)); + + // Verify processed + await runInDurableObject(env.BATCH_PROCESSOR.get(id), async (instance, state) => { + const count = state.storage.sql.exec( + "SELECT COUNT(*) as count FROM processed_items" + ).one().count; + expect(count).toBe(2); + }); +}); +``` + +## Testing Concurrency + +```typescript +it("handles concurrent increments safely", async () => { + const id = env.COUNTER.idFromName("concurrent-test"); + + // Parallel increments + const results = await Promise.all([ + runInDurableObject(env.COUNTER.get(id), (i) => i.increment()), + runInDurableObject(env.COUNTER.get(id), (i) => i.increment()), + runInDurableObject(env.COUNTER.get(id), (i) => i.increment()) + ]); + + // All should get unique values + expect(new Set(results).size).toBe(3); + expect(Math.max(...results)).toBe(3); +}); +``` + +## Test Isolation + +```typescript +// Per-test unique IDs +let testId: string; +beforeEach(() => { testId = crypto.randomUUID(); }); + +it("isolated test", async () => { + const id = env.MY_DO.idFromName(testId); + // Uses unique DO instance +}); + +// Cleanup pattern +it("with cleanup", async () => { + const id = env.MY_DO.idFromName("cleanup-test"); + try { + await runInDurableObject(env.MY_DO.get(id), async (instance) => {}); + } finally { + await runInDurableObject(env.MY_DO.get(id), async (instance, state) => { + await state.storage.deleteAll(); + }); + } +}); +``` + +## Testing PITR + +```typescript +it("restores from bookmark", async () => { + const id = env.MY_DO.idFromName("pitr-test"); + + // Create checkpoint + const bookmark = await runInDurableObject(env.MY_DO.get(id), async (instance, state) => { + await state.storage.put("value", 1); + return await state.storage.getCurrentBookmark(); + }); + + // Modify and restore + await runInDurableObject(env.MY_DO.get(id), async (instance, state) => { + await state.storage.put("value", 2); + await 
state.storage.onNextSessionRestoreBookmark(bookmark); + state.abort(); + }); + + // Verify restored + await runInDurableObject(env.MY_DO.get(id), async (instance, state) => { + const value = await state.storage.get("value"); + expect(value).toBe(1); + }); +}); +``` + +## Testing Transactions + +```typescript +it("rolls back on error", async () => { + const id = env.BANK.idFromName("transaction-test"); + + await runInDurableObject(env.BANK.get(id), async (instance, state) => { + await state.storage.put("balance", 100); + + await expect( + state.storage.transaction(async () => { + await state.storage.put("balance", 50); + throw new Error("Cancel"); + }) + ).rejects.toThrow("Cancel"); + + const balance = await state.storage.get("balance"); + expect(balance).toBe(100); // Rolled back + }); +}); +``` diff --git a/cloudflare/references/durable-objects/README.md b/cloudflare/references/durable-objects/README.md new file mode 100644 index 0000000..8e96558 --- /dev/null +++ b/cloudflare/references/durable-objects/README.md @@ -0,0 +1,185 @@ +# Cloudflare Durable Objects + +Expert guidance for building stateful applications with Cloudflare Durable Objects. + +## Reading Order + +1. **First time?** Read this overview + Quick Start +2. **Setting up?** See [Configuration](./configuration.md) +3. **Building features?** Use decision trees below → [Patterns](./patterns.md) +4. **Debugging issues?** Check [Gotchas](./gotchas.md) +5. 
**Deep dive?** [API](./api.md) and [DO Storage](../do-storage/README.md) + +## Overview + +Durable Objects combine compute with storage in globally-unique, strongly-consistent packages: +- **Globally unique instances**: Each DO has unique ID for multi-client coordination +- **Co-located storage**: Fast, strongly-consistent storage with compute +- **Automatic placement**: Objects spawn near first request location +- **Stateful serverless**: In-memory state + persistent storage +- **Single-threaded**: Serial request processing (no race conditions) + +## Rules of Durable Objects + +Critical rules preventing most production issues: + +1. **One alarm per DO** - Schedule multiple events via queue pattern +2. **~1K req/s per DO max** - Shard for higher throughput +3. **Constructor runs every wake** - Keep initialization light; use lazy loading +4. **Hibernation clears memory** - In-memory state lost; persist critical data +5. **Use `ctx.waitUntil()` for cleanup** - Ensures completion after response sent +6. **No setTimeout for persistence** - Use `setAlarm()` for reliable scheduling + +## Core Concepts + +### Class Structure +All DOs extend `DurableObject` base class with constructor receiving `DurableObjectState` (storage, WebSockets, alarms) and `Env` (bindings). + +### Lifecycle States + +``` +[Not Created] → [Active] ⇄ [Hibernated] → [Evicted] + ↓ + [Destroyed] +``` + +- **Not Created**: DO ID exists but instance never spawned +- **Active**: Processing requests, in-memory state valid, billed per GB-hour +- **Hibernated**: WebSocket connections open but zero compute, zero cost +- **Evicted**: Removed from memory; next request triggers cold start +- **Destroyed**: Data deleted via migration or manual deletion + +### Accessing from Workers +Workers use bindings to get stubs, then call RPC methods directly (recommended) or use fetch handler (legacy). 
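To make the contrast concrete, here is a toy, self-contained sketch of the two access styles. `CounterStub` and `FakeCounter` are illustrative stand-ins only, not Cloudflare APIs; in a real Worker the stub comes from `env.MY_DO.get(id)` and the method types from your DO class.

```typescript
// Toy stand-in for a DO stub, contrasting RPC-style and fetch-style access.
interface CounterStub {
  increment(): Promise<number>;            // RPC style: call the method directly
  fetch(req: Request): Promise<Response>;  // fetch style: HTTP semantics
}

class FakeCounter implements CounterStub {
  #value = 0;
  async increment(): Promise<number> {
    return ++this.#value;
  }
  async fetch(_req: Request): Promise<Response> {
    // A real DO would route on the request URL; this toy just increments.
    return new Response(String(await this.increment()));
  }
}

const stub: CounterStub = new FakeCounter();
const viaRpc = await stub.increment();        // typed result, no manual serialization
const viaFetch = await (await stub.fetch(new Request("https://do/increment"))).text();
```

RPC keeps types end-to-end; `fetch()` is the better fit when you need headers, status codes, or to proxy an incoming request unchanged.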
+ +**RPC vs fetch() decision:** +``` +├─ New project + compat ≥2024-04-03 → RPC (type-safe, simpler) +├─ Need HTTP semantics (headers, status) → fetch() +├─ Proxying requests to DO → fetch() +└─ Legacy compatibility → fetch() +``` + +See [Patterns: RPC vs fetch()](./patterns.md) for examples. + +### ID Generation +- `idFromName()`: Deterministic, named coordination (rate limiting, locks) +- `newUniqueId()`: Random IDs for sharding high-throughput workloads +- `idFromString()`: Derive from existing IDs +- Jurisdiction option: Data locality compliance + +### Storage Options + +**Which storage API?** +``` +├─ Structured data, relations, transactions → SQLite (recommended) +├─ Simple KV on SQLite DO → ctx.storage.kv (sync KV) +└─ Legacy KV-only DO → ctx.storage (async KV) +``` + +- **SQLite** (recommended): Structured data, transactions, 10GB/DO +- **Synchronous KV API**: Simple key-value on SQLite objects +- **Asynchronous KV API**: Legacy/advanced use cases + +See [DO Storage](../do-storage/README.md) for deep dive. + +### Special Features +- **Alarms**: Schedule future execution per-DO (1 per DO - use queue pattern for multiple) +- **WebSocket Hibernation**: Zero-cost idle connections (memory cleared on hibernation) +- **Point-in-Time Recovery**: Restore to any point in 30 days (SQLite only) + +## Quick Start + +```typescript +import { DurableObject } from "cloudflare:workers"; + +export class Counter extends DurableObject { + async increment(): Promise<number> { + const result = this.ctx.storage.sql.exec<{ value: number }>( + `INSERT INTO counters (id, value) VALUES (1, 1) + ON CONFLICT(id) DO UPDATE SET value = value + 1 + RETURNING value` + ).one(); + return result.value; + } +} + +// Worker access +export default { + async fetch(request: Request, env: Env): Promise<Response> { + const id = env.COUNTER.idFromName("global"); + const stub = env.COUNTER.get(id); + const count = await stub.increment(); + return new Response(`Count: ${count}`); + } +}; +``` + +## Decision Trees + +### What do you need? 
+ +``` +├─ Coordinate requests (rate limit, lock, session) +│ → idFromName(identifier) → [Patterns: Rate Limiting/Locks](./patterns.md) +│ +├─ High throughput (>1K req/s) +│ → Sharding with newUniqueId() or hash → [Patterns: Sharding](./patterns.md) +│ +├─ Real-time updates (WebSocket, chat, collab) +│ → WebSocket hibernation + room pattern → [Patterns: Real-time](./patterns.md) +│ +├─ Background work (cleanup, notifications, scheduled tasks) +│ → Alarms + queue pattern (1 alarm/DO) → [Patterns: Multiple Events](./patterns.md) +│ +└─ User sessions with expiration + → Session pattern + alarm cleanup → [Patterns: Session Management](./patterns.md) +``` + +### Which access pattern? + +``` +├─ New project + typed methods → RPC (compat ≥2024-04-03) +├─ Need HTTP semantics → fetch() +├─ Proxying to DO → fetch() +└─ Legacy compat → fetch() +``` + +See [Patterns: RPC vs fetch()](./patterns.md) for examples. + +### Which storage? + +``` +├─ Structured data, SQL queries, transactions → SQLite (recommended) +├─ Simple KV on SQLite DO → ctx.storage.kv (sync API) +└─ Legacy KV-only DO → ctx.storage (async API) +``` + +See [DO Storage](../do-storage/README.md) for complete guide. 
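The sharding branch above boils down to mapping each key deterministically onto a fixed pool of DO names, so one hot namespace spreads across many objects while per-key state stays on a single object. A minimal sketch (`shardFor` is a hypothetical helper, not a Cloudflare API):

```typescript
// Hash a key onto one of `shards` stable DO names. The same key always lands
// on the same shard, so load spreads without splitting any key's state.
function shardFor(key: string, shards: number): string {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)!) >>> 0; // simple 32-bit rolling hash
  }
  return `shard-${h % shards}`;
}

// In a Worker handler (illustrative):
//   const id = env.MY_DO.idFromName(shardFor(userId, 16));
//   const stub = env.MY_DO.get(id);
```

Note that changing the shard count remaps most keys to different objects, so pick a pool size with headroom up front or plan a data migration when resizing.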
+ +## Essential Commands + +```bash +npx wrangler dev # Local dev with DOs +npx wrangler dev --remote # Test against prod DOs +npx wrangler deploy # Deploy + auto-apply migrations +``` + +## Resources + +**Docs**: https://developers.cloudflare.com/durable-objects/ +**API Reference**: https://developers.cloudflare.com/durable-objects/api/ +**Examples**: https://developers.cloudflare.com/durable-objects/examples/ + +## In This Reference + +- **[Configuration](./configuration.md)** - wrangler.jsonc setup, migrations, bindings, environments +- **[API](./api.md)** - Class structure, ctx methods, alarms, WebSocket hibernation +- **[Patterns](./patterns.md)** - Sharding, rate limiting, locks, real-time, sessions +- **[Gotchas](./gotchas.md)** - Limits, hibernation caveats, common errors + +## See Also + +- **[DO Storage](../do-storage/README.md)** - SQLite, KV, transactions (detailed storage guide) +- **[Workers](../workers/README.md)** - Core Workers runtime features +- **[WebSockets](../websockets/README.md)** - WebSocket APIs and patterns diff --git a/cloudflare/references/durable-objects/api.md b/cloudflare/references/durable-objects/api.md new file mode 100644 index 0000000..89c7e4d --- /dev/null +++ b/cloudflare/references/durable-objects/api.md @@ -0,0 +1,187 @@ +# Durable Objects API + +## Class Structure + +```typescript +import { DurableObject } from "cloudflare:workers"; + +export class MyDO extends DurableObject { + constructor(ctx: DurableObjectState, env: Env) { + super(ctx, env); + // Runs on EVERY wake - keep light! + } + + // RPC methods (called directly from worker) + async myMethod(arg: string): Promise<string> { return arg; } + + // fetch handler (legacy/HTTP semantics) + async fetch(req: Request): Promise<Response> { /* ... */ } + + // Lifecycle handlers + async alarm() { /* alarm fired */ } + async webSocketMessage(ws: WebSocket, msg: string | ArrayBuffer) { /* ... */ } + async webSocketClose(ws: WebSocket, code: number, reason: string, wasClean: boolean) { /* ... 
*/ } + async webSocketError(ws: WebSocket, error: unknown) { /* ... */ } +} +``` + +## DurableObjectState Context Methods + +### Concurrency Control + +```typescript +// Complete work after response sent (e.g., cleanup, logging) +this.ctx.waitUntil(promise: Promise): void + +// Critical section - blocks all other requests until complete +await this.ctx.blockConcurrencyWhile(async () => { + // No other requests processed during this block + // Use for initialization or critical operations +}) +``` + +**When to use:** +- `waitUntil()`: Background cleanup, logging, non-critical work after response +- `blockConcurrencyWhile()`: First-time init, schema migration, critical state setup + +### Lifecycle + +```typescript +this.ctx.id // DurableObjectId of this instance +this.ctx.abort() // Force eviction (use after PITR restore to reload state) +``` + +### Storage Access + +```typescript +this.ctx.storage.sql // SQLite API (recommended) +this.ctx.storage.kv // Sync KV API (SQLite DOs only) +this.ctx.storage // Async KV API (legacy/KV-only DOs) +``` + +See **[DO Storage](../do-storage/README.md)** for complete storage API reference. + +### WebSocket Management + +```typescript +this.ctx.acceptWebSocket(ws: WebSocket, tags?: string[]) // Enable hibernation +this.ctx.getWebSockets(tag?: string): WebSocket[] // Get by tag or all +this.ctx.getTags(ws: WebSocket): string[] // Get tags for connection +``` + +### Alarms + +```typescript +await this.ctx.storage.setAlarm(timestamp: number | Date) // Schedule (overwrites existing) +await this.ctx.storage.getAlarm(): number | null // Get next alarm time +await this.ctx.storage.deleteAlarm(): void // Cancel alarm +``` + +**Limit:** 1 alarm per DO. Use queue pattern for multiple events (see [Patterns](./patterns.md)). + +## Storage APIs + +For detailed storage documentation including SQLite queries, KV operations, transactions, and Point-in-Time Recovery, see **[DO Storage](../do-storage/README.md)**. 
+ +Quick reference: + +```typescript +// SQLite (recommended) +this.ctx.storage.sql.exec("SELECT * FROM users WHERE id = ?", userId).one() + +// Sync KV (SQLite DOs only) +this.ctx.storage.kv.get("key") + +// Async KV (legacy) +await this.ctx.storage.get("key") +``` + +## Alarms + +Schedule future work that survives eviction: + +```typescript +// Set alarm (overwrites any existing alarm) +await this.ctx.storage.setAlarm(Date.now() + 3600000) // 1 hour from now +await this.ctx.storage.setAlarm(new Date("2026-02-01")) // Absolute time + +// Check next alarm +const nextRun = await this.ctx.storage.getAlarm() // null if none + +// Cancel alarm +await this.ctx.storage.deleteAlarm() + +// Handler called when alarm fires +async alarm() { + // Runs once alarm triggers + // DO wakes from hibernation if needed + // Use for cleanup, notifications, scheduled tasks +} +``` + +**Limitations:** +- 1 alarm per DO maximum +- Overwrites previous alarm when set +- Use queue pattern for multiple scheduled events (see [Patterns](./patterns.md)) + +**Reliability:** +- Alarms survive DO eviction/restart +- Cloudflare retries failed alarms automatically +- Not guaranteed exactly-once (handle idempotently) + +## WebSocket Hibernation + +Hibernation allows DOs with open WebSocket connections to consume zero compute/memory until message arrives. 
+
+```typescript
+async fetch(req: Request): Promise<Response> {
+  const [client, server] = Object.values(new WebSocketPair());
+  this.ctx.acceptWebSocket(server, ["room:123"]); // Tags for filtering
+  server.serializeAttachment({ userId: "abc" }); // Persisted metadata
+  return new Response(null, { status: 101, webSocket: client });
+}
+
+// Called when a message arrives (DO wakes from hibernation)
+async webSocketMessage(ws: WebSocket, msg: string | ArrayBuffer) {
+  const data = ws.deserializeAttachment(); // Retrieve metadata
+  for (const c of this.ctx.getWebSockets("room:123")) c.send(msg);
+}
+
+// Called on close (optional handler)
+async webSocketClose(ws: WebSocket, code: number, reason: string, wasClean: boolean) {
+  // Cleanup logic, remove from lists, etc.
+}
+
+// Called on error (optional handler)
+async webSocketError(ws: WebSocket, error: unknown) {
+  console.error("WebSocket error:", error);
+  // Handle error, close connection, etc.
+}
+```
+
+**Key concepts:**
+- **Auto-hibernation:** DO hibernates when no active requests/alarms
+- **Zero cost:** Hibernated DOs incur no charges while preserving connections
+- **Memory cleared:** All in-memory state lost on hibernation
+- **Attachment persistence:** Use `serializeAttachment()` for per-connection metadata that survives hibernation
+- **Tags for filtering:** Group connections by room/channel/user for targeted broadcasts
+
+**Handler lifecycle:**
+- `webSocketMessage`: DO wakes, processes message, may hibernate after
+- `webSocketClose`: Called when client closes (optional - implement for cleanup)
+- `webSocketError`: Called on connection error (optional - implement for error handling)
+
+**Metadata persistence:**
+```typescript
+// Store connection metadata (survives hibernation)
+ws.serializeAttachment({ userId: "abc", room: "lobby" })
+
+// Retrieve after hibernation
+const { userId, room } = ws.deserializeAttachment()
+```
+
+## See Also
+
+- **[DO Storage](../do-storage/README.md)** - Complete storage API
reference +- **[Patterns](./patterns.md)** - Real-world usage patterns +- **[Gotchas](./gotchas.md)** - Hibernation caveats and limits diff --git a/cloudflare/references/durable-objects/configuration.md b/cloudflare/references/durable-objects/configuration.md new file mode 100644 index 0000000..651599a --- /dev/null +++ b/cloudflare/references/durable-objects/configuration.md @@ -0,0 +1,160 @@ +# Durable Objects Configuration + +## Basic Setup + +```jsonc +{ + "name": "my-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use latest; ≥2024-04-03 for RPC + "durable_objects": { + "bindings": [ + { + "name": "MY_DO", // Env binding name + "class_name": "MyDO" // Class exported from this worker + }, + { + "name": "EXTERNAL", // Access DO from another worker + "class_name": "ExternalDO", + "script_name": "other-worker" + } + ] + }, + "migrations": [ + { "tag": "v1", "new_sqlite_classes": ["MyDO"] } // Prefer SQLite + ] +} +``` + +## Binding Options + +```jsonc +{ + "name": "BINDING_NAME", + "class_name": "ClassName", + "script_name": "other-worker", // Optional: external DO + "environment": "production" // Optional: isolate by env +} +``` + +## Jurisdiction (Data Locality) + +Specify jurisdiction at ID creation for data residency compliance: + +```typescript +// EU data residency +const id = env.MY_DO.idFromName("user:123", { jurisdiction: "eu" }) + +// Available jurisdictions +const jurisdictions = ["eu", "fedramp"] // More may be added + +// All operations on this DO stay within jurisdiction +const stub = env.MY_DO.get(id) +await stub.someMethod() // Data stays in EU +``` + +**Key points:** +- Set at ID creation time, immutable afterward +- DO instance physically located within jurisdiction +- Storage and compute guaranteed within boundary +- Use for GDPR, FedRAMP, other compliance requirements +- No cross-jurisdiction access (requests fail if DO in different jurisdiction) + +## Migrations + +```jsonc +{ + "migrations": [ + { "tag": "v1", 
"new_sqlite_classes": ["MyDO"] }, // Create SQLite (recommended) + // { "tag": "v1", "new_classes": ["MyDO"] }, // Create KV (paid only) + { "tag": "v2", "renamed_classes": [{ "from": "Old", "to": "New" }] }, + { "tag": "v3", "transferred_classes": [{ "from": "Src", "from_script": "old", "to": "Dest" }] }, + { "tag": "v4", "deleted_classes": ["Obsolete"] } // Destroys ALL data! + ] +} +``` + +**Migration rules:** +- Tags must be unique and sequential (v1, v2, v3...) +- No rollback supported (test with `--dry-run` first) +- Auto-applied on deploy +- `new_sqlite_classes` recommended over `new_classes` (SQLite vs KV) +- `deleted_classes` immediately destroys ALL data (irreversible) + +## Environment Isolation + +Separate DO namespaces per environment (staging/production have distinct object instances): + +```jsonc +{ + "durable_objects": { + "bindings": [{ "name": "MY_DO", "class_name": "MyDO" }] + }, + "env": { + "production": { + "durable_objects": { + "bindings": [ + { "name": "MY_DO", "class_name": "MyDO", "environment": "production" } + ] + } + } + } +} +``` + +Deploy: `npx wrangler deploy --env production` + +## Limits & Settings + +```jsonc +{ + "limits": { + "cpu_ms": 300000 // Max CPU time: 30s default, 300s max + } +} +``` + +See [Gotchas](./gotchas.md) for complete limits table. 
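
## Validating Migrations

The migration rules above (unique, strictly sequential tags; no rollback) can be checked mechanically before a deploy. A minimal sketch of a hypothetical pre-deploy helper — not part of wrangler; the `Migration` shape and function name are assumptions for illustration:

```typescript
// Hypothetical helper: verify that migration tags run v1, v2, v3, ...
// Sequential ordering also guarantees uniqueness.
// Returns an error string, or null if the list is valid.
interface Migration {
  tag: string;
}

function validateMigrationTags(migrations: Migration[]): string | null {
  for (let i = 0; i < migrations.length; i++) {
    const expected = `v${i + 1}`;
    if (migrations[i].tag !== expected) {
      return `migration ${i}: expected tag "${expected}", got "${migrations[i].tag}"`;
    }
  }
  return null;
}

console.log(validateMigrationTags([{ tag: "v1" }, { tag: "v2" }])); // null (valid)
console.log(validateMigrationTags([{ tag: "v1" }, { tag: "v3" }])); // error message
```

A check like this can run in CI against the `migrations` array from wrangler.jsonc, alongside `npx wrangler deploy --dry-run`.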
+ +## Types + +```typescript +import { DurableObject } from "cloudflare:workers"; + +interface Env { + MY_DO: DurableObjectNamespace; +} + +export class MyDO extends DurableObject {} + +type DurableObjectNamespace = { + newUniqueId(options?: { jurisdiction?: string }): DurableObjectId; + idFromName(name: string): DurableObjectId; + idFromString(id: string): DurableObjectId; + get(id: DurableObjectId): DurableObjectStub; +}; +``` + +## Commands + +```bash +# Development +npx wrangler dev # Local dev +npx wrangler dev --remote # Test against production DOs + +# Deployment +npx wrangler deploy # Deploy + auto-apply migrations +npx wrangler deploy --dry-run # Validate migrations without deploying +npx wrangler deploy --env production + +# Management +npx wrangler durable-objects list # List namespaces +npx wrangler durable-objects info # Inspect specific DO +npx wrangler durable-objects delete # Delete DO (destroys data) +``` + +## See Also + +- **[API](./api.md)** - DurableObjectState and lifecycle handlers +- **[Patterns](./patterns.md)** - Multi-environment patterns +- **[Gotchas](./gotchas.md)** - Migration caveats, limits diff --git a/cloudflare/references/durable-objects/gotchas.md b/cloudflare/references/durable-objects/gotchas.md new file mode 100644 index 0000000..72495f9 --- /dev/null +++ b/cloudflare/references/durable-objects/gotchas.md @@ -0,0 +1,197 @@ +# Durable Objects Gotchas + +## Common Errors + +### "Hibernation Cleared My In-Memory State" + +**Problem:** Variables lost after hibernation +**Cause:** DO auto-hibernates when idle; in-memory state not persisted +**Solution:** Use `ctx.storage` for critical data, `ws.serializeAttachment()` for per-connection metadata + +```typescript +// ❌ Wrong - lost on hibernation +private userCount = 0; +async webSocketMessage(ws: WebSocket, msg: string) { + this.userCount++; // Lost! 
+} + +// ✅ Right - persisted +async webSocketMessage(ws: WebSocket, msg: string) { + const count = this.ctx.storage.kv.get("userCount") || 0; + this.ctx.storage.kv.put("userCount", count + 1); +} +``` + +### "setTimeout Didn't Fire After Restart" + +**Problem:** Scheduled work lost on eviction +**Cause:** `setTimeout` in-memory only; eviction clears timers +**Solution:** Use `ctx.storage.setAlarm()` for reliable scheduling + +```typescript +// ❌ Wrong - lost on eviction +setTimeout(() => this.cleanup(), 3600000); + +// ✅ Right - survives eviction +await this.ctx.storage.setAlarm(Date.now() + 3600000); +async alarm() { await this.cleanup(); } +``` + +### "Constructor Runs on Every Wake" + +**Problem:** Expensive init logic slows all requests +**Cause:** Constructor runs on every wake (first request after eviction OR after hibernation) +**Solution:** Lazy initialization or cache in storage + +**Critical understanding:** Constructor runs in two scenarios: +1. **Cold start** - DO evicted from memory, first request creates new instance +2. **Wake from hibernation** - DO with WebSockets hibernated, message/alarm wakes it + +```typescript +// ❌ Wrong - expensive on every wake +constructor(ctx: DurableObjectState, env: Env) { + super(ctx, env); + this.heavyData = this.loadExpensiveData(); // Slow! 
+} + +// ✅ Right - lazy load +private heavyData?: HeavyData; +private getHeavyData() { + if (!this.heavyData) this.heavyData = this.loadExpensiveData(); + return this.heavyData; +} +``` + +### "Durable Object Overloaded (503 errors)" + +**Problem:** 503 errors under load +**Cause:** Single DO exceeding ~1K req/s throughput limit +**Solution:** Shard across multiple DOs (see [Patterns: Sharding](./patterns.md)) + +### "Storage Quota Exceeded (Write failures)" + +**Problem:** Write operations failing +**Cause:** DO storage exceeding 10GB limit or account quota +**Solution:** Cleanup with alarms, use `deleteAll()` for old data, upgrade plan + +### "CPU Time Exceeded (Terminated)" + +**Problem:** Request terminated mid-execution +**Cause:** Processing exceeding 30s CPU time default limit +**Solution:** Increase `limits.cpu_ms` in wrangler.jsonc (max 300s) or chunk work + +### "WebSockets Disconnect on Eviction" + +**Problem:** Connections drop unexpectedly +**Cause:** DO evicted from memory without hibernation API +**Solution:** Use WebSocket hibernation handlers + client reconnection logic + +### "Migration Failed (Deploy error)" + +**Cause:** Non-unique tags, non-sequential tags, or invalid class names in migration +**Solution:** Check tag uniqueness/sequential ordering and verify class names are correct + +### "RPC Method Not Found" + +**Cause:** compatibility_date < 2024-04-03 preventing RPC usage +**Solution:** Update compatibility_date to >= 2024-04-03 or use fetch() instead of RPC + +### "Only One Alarm Allowed" + +**Cause:** Need multiple scheduled tasks but only one alarm supported per DO +**Solution:** Use event queue pattern to schedule multiple tasks with single alarm + +### "Race Condition Despite Single-Threading" + +**Problem:** Concurrent requests see inconsistent state +**Cause:** Async operations allow request interleaving (await = yield point) +**Solution:** Use `blockConcurrencyWhile()` for critical sections or atomic storage ops + +```typescript 
+// ❌ Wrong - race condition +async incrementCounter() { + const count = await this.ctx.storage.get("count") || 0; + // ⚠️ Another request could execute here during await + await this.ctx.storage.put("count", count + 1); +} + +// ✅ Right - atomic operation +async incrementCounter() { + return this.ctx.storage.sql.exec( + "INSERT INTO counters (id, value) VALUES (1, 1) ON CONFLICT(id) DO UPDATE SET value = value + 1 RETURNING value" + ).one().value; +} + +// ✅ Right - explicit locking +async criticalOperation() { + await this.ctx.blockConcurrencyWhile(async () => { + const count = await this.ctx.storage.get("count") || 0; + await this.ctx.storage.put("count", count + 1); + }); +} +``` + +### "Migration Rollback Not Supported" + +**Cause:** Attempting to rollback a migration after deployment +**Solution:** Test with `--dry-run` before deploying; migrations cannot be rolled back + +### "deleted_classes Destroys Data" + +**Problem:** Migration deleted all data +**Cause:** `deleted_classes` migration immediately destroys all DO instances and data +**Solution:** Test with `--dry-run`; use `transferred_classes` to preserve data during moves + +### "Cold Starts Are Slow" + +**Problem:** First request after eviction takes longer +**Cause:** DO constructor + initial storage access on cold start +**Solution:** Expected behavior; optimize constructor, use connection pooling in clients, consider warming strategy for critical DOs + +```typescript +// Warming strategy (periodically ping critical DOs) +export default { + async scheduled(event: ScheduledEvent, env: Env) { + const criticalIds = ["auth", "sessions", "locks"]; + await Promise.all(criticalIds.map(name => { + const id = env.MY_DO.idFromName(name); + const stub = env.MY_DO.get(id); + return stub.ping(); // Keep warm + })); + } +}; +``` + +## Limits + +| Limit | Free | Paid | Notes | +|-------|------|------|-------| +| SQLite storage per DO | 10 GB | 10 GB | Per Durable Object instance | +| SQLite total storage | 5 GB | 
Unlimited | Account-wide quota | +| Key+value size | 2 MB | 2 MB | Single KV pair (SQLite/async) | +| CPU time default | 30s | 30s | Per request; configurable | +| CPU time max | 300s | 300s | Set via `limits.cpu_ms` | +| DO classes | 100 | 500 | Distinct DO class definitions | +| SQL columns | 100 | 100 | Per table | +| SQL statement size | 100 KB | 100 KB | Max SQL query size | +| WebSocket message size | 32 MiB | 32 MiB | Per message | +| Request throughput | ~1K req/s | ~1K req/s | Per DO (soft limit - shard for more) | +| Alarms per DO | 1 | 1 | Use queue pattern for multiple events | +| Total DOs | Unlimited | Unlimited | Create as many instances as needed | +| WebSockets | Unlimited | Unlimited | Within 128MB memory limit per DO | +| Memory per DO | 128 MB | 128 MB | In-memory state + WebSocket buffers | + +## Hibernation Caveats + +1. **Memory cleared** - All in-memory variables lost; reconstruct from storage or `deserializeAttachment()` +2. **Constructor reruns** - Runs on wake; avoid expensive operations, use lazy initialization +3. **No guarantees** - DO may evict instead of hibernate; design for both +4. **Attachment limit** - `serializeAttachment()` data must be JSON-serializable, keep small +5. **Alarm wakes DO** - Alarm prevents hibernation until handler completes +6. 
**WebSocket state not automatic** - Must explicitly persist with `serializeAttachment()` or storage + +## See Also + +- **[Patterns](./patterns.md)** - Workarounds for common limitations +- **[API](./api.md)** - Storage limits and quotas +- **[Configuration](./configuration.md)** - Setting CPU limits diff --git a/cloudflare/references/durable-objects/patterns.md b/cloudflare/references/durable-objects/patterns.md new file mode 100644 index 0000000..d91f382 --- /dev/null +++ b/cloudflare/references/durable-objects/patterns.md @@ -0,0 +1,201 @@ +# Durable Objects Patterns + +## When to Use Which Pattern + +| Need | Pattern | ID Strategy | +|------|---------|-------------| +| Rate limit per user/IP | Rate Limiting | `idFromName(identifier)` | +| Mutual exclusion | Distributed Lock | `idFromName(resource)` | +| >1K req/s throughput | Sharding | `newUniqueId()` or hash | +| Real-time updates | WebSocket Collab | `idFromName(room)` | +| User sessions | Session Management | `idFromName(sessionId)` | +| Background cleanup | Alarm-based | Any | + +## RPC vs fetch() + +**RPC** (compat ≥2024-04-03): Type-safe, simpler, default for new projects +**fetch()**: Legacy compat, HTTP semantics, proxying + +```typescript +const count = await stub.increment(); // RPC +const count = await (await stub.fetch(req)).json(); // fetch() +``` + +## Sharding (High Throughput) + +Single DO ~1K req/s max. 
Shard for higher throughput:
+
+```typescript
+export default {
+  async fetch(req: Request, env: Env): Promise<Response> {
+    const userId = new URL(req.url).searchParams.get("user");
+    const hash = hashCode(userId ?? "anon") % 100; // 100 shards; fall back if param missing
+    const id = env.COUNTER.idFromName(`shard:${hash}`);
+    return env.COUNTER.get(id).fetch(req);
+  }
+};
+
+function hashCode(str: string): number {
+  let hash = 0;
+  for (let i = 0; i < str.length; i++) hash = ((hash << 5) - hash) + str.charCodeAt(i);
+  return Math.abs(hash);
+}
+```
+
+**Decisions:**
+- **Shard count**: 10-1000 typical (start with 100, measure, adjust)
+- **Shard key**: User ID, IP, session - must distribute evenly (use hash)
+- **Aggregation**: Coordinator DO or external system (D1, R2)
+
+## Rate Limiting
+
+```typescript
+async checkLimit(key: string, limit: number, windowMs: number): Promise<boolean> {
+  const req = this.ctx.storage.sql.exec("SELECT COUNT(*) as count FROM requests WHERE key = ? AND timestamp > ?", key, Date.now() - windowMs).one();
+  if (req.count >= limit) return false;
+  this.ctx.storage.sql.exec("INSERT INTO requests (key, timestamp) VALUES (?, ?)", key, Date.now());
+  return true;
+}
+```
+
+## Distributed Lock
+
+```typescript
+private held = false;
+async acquire(timeoutMs = 5000): Promise<boolean> {
+  if (this.held) return false;
+  this.held = true;
+  await this.ctx.storage.setAlarm(Date.now() + timeoutMs);
+  return true;
+}
+async release() { this.held = false; await this.ctx.storage.deleteAlarm(); }
+async alarm() { this.held = false; } // Auto-release on timeout
+```
+
+## Hibernation-Aware Pattern
+
+Preserve state across hibernation:
+
+```typescript
+async fetch(req: Request): Promise<Response> {
+  const [client, server] = Object.values(new WebSocketPair());
+  const userId = new URL(req.url).searchParams.get("user");
+  server.serializeAttachment({ userId }); // Survives hibernation
+  this.ctx.acceptWebSocket(server, ["room:lobby"]);
+  server.send(JSON.stringify({ type: "init", state: this.ctx.storage.kv.get("state") }));
+ return new Response(null, { status: 101, webSocket: client }); +} + +async webSocketMessage(ws: WebSocket, msg: string) { + const { userId } = ws.deserializeAttachment(); // Retrieve after wake + const state = this.ctx.storage.kv.get("state") || {}; + state[userId] = JSON.parse(msg); + this.ctx.storage.kv.put("state", state); + for (const c of this.ctx.getWebSockets("room:lobby")) c.send(msg); +} +``` + +## Real-time Collaboration + +Broadcast updates to all connected clients: + +```typescript +async webSocketMessage(ws: WebSocket, msg: string) { + const data = JSON.parse(msg); + this.ctx.storage.kv.put("doc", data.content); // Persist + for (const c of this.ctx.getWebSockets()) if (c !== ws) c.send(msg); // Broadcast +} +``` + +### WebSocket Reconnection + +**Client-side** (exponential backoff): +```typescript +class ResilientWS { + private delay = 1000; + connect(url: string) { + const ws = new WebSocket(url); + ws.onclose = () => setTimeout(() => { + this.connect(url); + this.delay = Math.min(this.delay * 2, 30000); + }, this.delay); + } +} +``` + +**Server-side** (cleanup on close): +```typescript +async webSocketClose(ws: WebSocket, code: number, reason: string, wasClean: boolean) { + const { userId } = ws.deserializeAttachment(); + this.ctx.storage.sql.exec("UPDATE users SET online = false WHERE id = ?", userId); + for (const c of this.ctx.getWebSockets()) c.send(JSON.stringify({ type: "user_left", userId })); +} +``` + +## Session Management + +```typescript +async createSession(userId: string, data: object): Promise { + const id = crypto.randomUUID(), exp = Date.now() + 86400000; + this.ctx.storage.sql.exec("INSERT INTO sessions VALUES (?, ?, ?, ?)", id, userId, JSON.stringify(data), exp); + await this.ctx.storage.setAlarm(exp); + return id; +} + +async getSession(id: string): Promise { + const row = this.ctx.storage.sql.exec("SELECT data FROM sessions WHERE id = ? AND expires_at > ?", id, Date.now()).one(); + return row ? 
JSON.parse(row.data) : null; +} + +async alarm() { this.ctx.storage.sql.exec("DELETE FROM sessions WHERE expires_at <= ?", Date.now()); } +``` + +## Multiple Events (Single Alarm) + +Queue pattern to schedule multiple events: + +```typescript +async scheduleEvent(id: string, runAt: number) { + await this.ctx.storage.put(`event:${id}`, { id, runAt }); + const curr = await this.ctx.storage.getAlarm(); + if (!curr || runAt < curr) await this.ctx.storage.setAlarm(runAt); +} + +async alarm() { + const events = await this.ctx.storage.list({ prefix: "event:" }), now = Date.now(); + let next = null; + for (const [key, ev] of events) { + if (ev.runAt <= now) { + await this.processEvent(ev); + await this.ctx.storage.delete(key); + } else if (!next || ev.runAt < next) next = ev.runAt; + } + if (next) await this.ctx.storage.setAlarm(next); +} +``` + +## Graceful Cleanup + +Use `ctx.waitUntil()` to complete work after response: + +```typescript +async myMethod() { + const response = { success: true }; + this.ctx.waitUntil(this.ctx.storage.sql.exec("DELETE FROM old_data WHERE timestamp < ?", cutoff)); + return response; +} +``` + +## Best Practices + +- **Design**: Use `idFromName()` for coordination, `newUniqueId()` for sharding, minimize constructor work +- **Storage**: Prefer SQLite, batch with transactions, set alarms for cleanup, use PITR before risky ops +- **Performance**: ~1K req/s per DO max - shard for more, cache in memory, use alarms for deferred work +- **Reliability**: Handle 503 with retry+backoff, design for cold starts, test migrations with `--dry-run` +- **Security**: Validate inputs in Workers, rate limit DO creation, use jurisdiction for compliance + +## See Also + +- **[API](./api.md)** - ctx methods, WebSocket handlers +- **[Gotchas](./gotchas.md)** - Hibernation caveats, common errors +- **[DO Storage](../do-storage/README.md)** - Storage patterns and transactions diff --git a/cloudflare/references/email-routing/README.md 
b/cloudflare/references/email-routing/README.md new file mode 100644 index 0000000..7fa902e --- /dev/null +++ b/cloudflare/references/email-routing/README.md @@ -0,0 +1,89 @@ +# Cloudflare Email Routing Skill Reference + +## Overview + +Cloudflare Email Routing enables custom email addresses for your domain that route to verified destination addresses. It's free, privacy-focused (no storage/access), and includes Email Workers for programmatic email processing. + +**Available to all Cloudflare customers using Cloudflare as authoritative nameserver.** + +## Quick Start + +```typescript +// Basic email handler +export default { + async email(message, env, ctx) { + // CRITICAL: Must consume stream before response + const parser = new PostalMime.default(); + const email = await parser.parse(await message.raw.arrayBuffer()); + + // Process email + console.log(`From: ${message.from}, Subject: ${email.subject}`); + + // Forward or reject + await message.forward("verified@destination.com"); + } +} satisfies ExportedHandler; +``` + +## Reading Order + +**Start here based on your goal:** + +1. **New to Email Routing?** → [configuration.md](configuration.md) → [patterns.md](patterns.md) +2. **Adding Workers?** → [api.md](api.md) § Worker Runtime API → [patterns.md](patterns.md) +3. **Sending emails?** → [api.md](api.md) § SendEmail Binding +4. **Managing via API?** → [api.md](api.md) § REST API Operations +5. **Debugging issues?** → [gotchas.md](gotchas.md) + +## Decision Tree + +``` +Need to receive emails? +├─ Simple forwarding only? → Dashboard rules (configuration.md) +├─ Complex logic/filtering? → Email Workers (api.md + patterns.md) +└─ Parse attachments/body? → postal-mime library (patterns.md § Parse Email) + +Need to send emails? +├─ From Worker? → SendEmail binding (api.md § SendEmail) +└─ From external app? → Use external SMTP/API service + +Having issues? +├─ Email not arriving? → gotchas.md § Mail Authentication +├─ Worker crashing? 
→ gotchas.md § Stream Consumption +└─ Forward failing? → gotchas.md § Destination Verification +``` + +## Key Concepts + +**Routing Rules**: Pattern-based forwarding configured via Dashboard/API. Simple but limited. + +**Email Workers**: Custom TypeScript handlers with full email access. Handles complex logic, parsing, storage, rejection. + +**SendEmail Binding**: Outbound email API for Workers. Transactional email only (no marketing/bulk). + +**ForwardableEmailMessage**: Runtime interface for incoming emails. Provides headers, raw stream, forward/reject methods. + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, deployment, wrangler config +- **[api.md](api.md)** - REST API + Worker runtime API + types +- **[patterns.md](patterns.md)** - Common patterns with working examples +- **[gotchas.md](gotchas.md)** - Critical pitfalls, troubleshooting, limits + +## Architecture + +``` +Internet → MX Records → Cloudflare Email Routing + ├─ Routing Rules (dashboard) + └─ Email Worker (your code) + ├─ Forward to destination + ├─ Reject with reason + ├─ Store in R2/KV/D1 + └─ Send outbound (SendEmail) +``` + +## See Also + +- [Cloudflare Docs: Email Routing](https://developers.cloudflare.com/email-routing/) +- [Cloudflare Docs: Email Workers](https://developers.cloudflare.com/email-routing/email-workers/) +- [postal-mime npm package](https://www.npmjs.com/package/postal-mime) diff --git a/cloudflare/references/email-routing/api.md b/cloudflare/references/email-routing/api.md new file mode 100644 index 0000000..33b8bf0 --- /dev/null +++ b/cloudflare/references/email-routing/api.md @@ -0,0 +1,195 @@ +# Email Routing API Reference + +## Worker Runtime API + +### Email Handler Interface + +```typescript +interface ExportedHandler { + email?(message: ForwardableEmailMessage, env: Env, ctx: ExecutionContext): void | Promise; +} +``` + +### ForwardableEmailMessage + +Main interface for incoming emails: + +```typescript +interface ForwardableEmailMessage { 
+  readonly from: string;     // Envelope sender (e.g., "sender@example.com")
+  readonly to: string;       // Envelope recipient (e.g., "you@yourdomain.com")
+  readonly headers: Headers; // Web API Headers object
+  readonly raw: ReadableStream; // Raw MIME message stream
+
+  setReject(reason: string): void;
+  forward(rcptTo: string, headers?: Headers): Promise<void>;
+}
+```
+
+**Key Properties:**
+
+| Property | Type | Description |
+|----------|------|-------------|
+| `from` | `string` | Envelope sender (MAIL FROM), not header From |
+| `to` | `string` | Envelope recipient (RCPT TO), not header To |
+| `headers` | `Headers` | Email headers (Subject, From, To, etc.) |
+| `raw` | `ReadableStream` | Raw MIME message (consume once only) |
+
+**Methods:**
+
+- `setReject(reason)`: Reject email with bounce message
+- `forward(rcptTo, headers?)`: Forward to verified destination, optionally add headers
+
+### Headers Object
+
+Standard Web API Headers interface:
+
+```typescript
+// Access headers
+const subject = message.headers.get("subject");
+const from = message.headers.get("from");
+const messageId = message.headers.get("message-id");
+
+// Check spam score
+const spamScore = parseFloat(message.headers.get("x-cf-spamh-score") || "0");
+if (spamScore > 5) {
+  message.setReject("Spam detected");
+}
+```
+
+### Common Headers
+
+`subject`, `from`, `to`, `x-cf-spamh-score` (spam score), `message-id` (deduplication), `dkim-signature` (auth)
+
+### Envelope vs Header Addresses
+
+**Critical distinction:**
+
+```typescript
+// Envelope addresses (routing, auth checks)
+message.from // "bounce@sender.com" (actual sender)
+message.to   // "you@yourdomain.com" (your address)
+
+// Header addresses (display, user-facing)
+message.headers.get("from") // "Alice <alice@example.com>"
+message.headers.get("to")   // "Bob <bob@yourdomain.com>"
+```
+
+**Use envelope addresses for:**
+- Authentication/SPF checks
+- Routing decisions
+- Bounce handling
+
+**Use header addresses for:**
+- Display to users
+- Reply-To logic
+- User-facing filtering
+
+## SendEmail Binding
+
+Outbound email API for transactional messages.
+
+### Configuration
+
+```jsonc
+// wrangler.jsonc
+{
+  "send_email": [
+    { "name": "EMAIL" }
+  ]
+}
+```
+
+### TypeScript Types
+
+```typescript
+interface Env {
+  EMAIL: SendEmail;
+}
+
+interface SendEmail {
+  send(message: EmailMessage): Promise<void>;
+}
+
+interface EmailMessage {
+  from: string | { name?: string; email: string };
+  to: string | { name?: string; email: string } | Array<string | { name?: string; email: string }>;
+  subject: string;
+  text?: string;
+  html?: string;
+  headers?: Headers;
+  reply_to?: string | { name?: string; email: string };
+}
+```
+
+### Send Email Example
+
+```typescript
+interface Env {
+  EMAIL: SendEmail;
+}
+
+export default {
+  async fetch(request, env, ctx): Promise<Response> {
+    await env.EMAIL.send({
+      from: { name: "Acme Corp", email: "noreply@yourdomain.com" },
+      to: [
+        { name: "Alice", email: "alice@example.com" },
+        "bob@example.com"
+      ],
+      subject: "Your order #12345 has shipped",
+      text: "Track your package at: https://track.example.com/12345",
+      html: "<p>Track your package at: <a href='https://track.example.com/12345'>View tracking</a></p>
", + reply_to: { name: "Support", email: "support@yourdomain.com" } + }); + + return new Response("Email sent"); + } +} satisfies ExportedHandler; +``` + +### SendEmail Constraints + +- **From address**: Must be on verified domain (your domain with Email Routing enabled) +- **Volume limits**: Transactional only, no bulk/marketing email +- **Rate limits**: 100 emails/minute on Free plan, higher on Paid +- **No attachments**: Use links to hosted files instead +- **No DKIM control**: Cloudflare signs automatically + +## REST API Operations + +Base URL: `https://api.cloudflare.com/client/v4` + +### Authentication + +```bash +curl -H "Authorization: Bearer $API_TOKEN" https://api.cloudflare.com/client/v4/... +``` + +### Key Endpoints + +| Operation | Method | Endpoint | +|-----------|--------|----------| +| Enable routing | POST | `/zones/{zone_id}/email/routing/enable` | +| Disable routing | POST | `/zones/{zone_id}/email/routing/disable` | +| List rules | GET | `/zones/{zone_id}/email/routing/rules` | +| Create rule | POST | `/zones/{zone_id}/email/routing/rules` | +| Verify destination | POST | `/zones/{zone_id}/email/routing/addresses` | +| List destinations | GET | `/zones/{zone_id}/email/routing/addresses` | + +### Create Routing Rule Example + +```bash +curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/email/routing/rules" \ + -H "Authorization: Bearer $API_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "enabled": true, + "name": "Forward sales", + "matchers": [{"type": "literal", "field": "to", "value": "sales@yourdomain.com"}], + "actions": [{"type": "forward", "value": ["alice@company.com"]}], + "priority": 0 + }' +``` + +Matcher types: `literal` (exact match), `all` (catch-all). 
diff --git a/cloudflare/references/email-routing/configuration.md b/cloudflare/references/email-routing/configuration.md new file mode 100644 index 0000000..3f9613e --- /dev/null +++ b/cloudflare/references/email-routing/configuration.md @@ -0,0 +1,186 @@ +# Email Routing Configuration + +## Wrangler Configuration + +### Basic Email Worker + +```jsonc +// wrangler.jsonc +{ + "name": "email-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", + "send_email": [{ "name": "EMAIL" }] +} +``` + +```typescript +// src/index.ts +export default { + async email(message, env, ctx) { + await message.forward("destination@example.com"); + } +} satisfies ExportedHandler; +``` + +### With Storage Bindings + +```jsonc +{ + "name": "email-processor", + "send_email": [{ "name": "EMAIL" }], + "kv_namespaces": [{ "binding": "KV", "id": "abc123" }], + "r2_buckets": [{ "binding": "R2", "bucket_name": "emails" }], + "d1_databases": [{ "binding": "DB", "database_id": "def456" }] +} +``` + +```typescript +interface Env { + EMAIL: SendEmail; + KV: KVNamespace; + R2: R2Bucket; + DB: D1Database; +} +``` + +## Local Development + +```bash +npx wrangler dev + +# Test with curl +curl -X POST 'http://localhost:8787/__email' \ + --header 'content-type: message/rfc822' \ + --data 'From: test@example.com +To: you@yourdomain.com +Subject: Test + +Body' +``` + +## Deployment + +```bash +npx wrangler deploy +``` + +**Connect to Email Routing:** + +Dashboard: Email > Email Routing > [domain] > Settings > Email Workers > Select worker + +API: +```bash +curl -X PUT "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/email/routing/settings" \ + -H "Authorization: Bearer $API_TOKEN" \ + -d '{"enabled": true, "worker": "email-worker"}' +``` + +## DNS (Auto-Created) + +```dns +yourdomain.com. IN MX 1 isaac.mx.cloudflare.net. +yourdomain.com. IN MX 2 linda.mx.cloudflare.net. +yourdomain.com. IN MX 3 amir.mx.cloudflare.net. +yourdomain.com. 
IN TXT "v=spf1 include:_spf.mx.cloudflare.net ~all"
+```
+
+## Secrets & Variables
+
+```bash
+# Secrets (encrypted)
+npx wrangler secret put API_KEY
+```
+
+```jsonc
+// Variables (plain) - wrangler.jsonc
+{ "vars": { "THRESHOLD": "5.0" } }
+```
+
+```typescript
+interface Env {
+  API_KEY: string;
+  THRESHOLD: string;
+}
+```
+
+## TypeScript Setup
+
+```bash
+npm install --save-dev @cloudflare/workers-types
+```
+
+```jsonc
+// tsconfig.json
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ES2022",
+    "lib": ["ES2022"],
+    "types": ["@cloudflare/workers-types"],
+    "moduleResolution": "bundler",
+    "strict": true
+  }
+}
+```
+
+```typescript
+import type { ForwardableEmailMessage } from "@cloudflare/workers-types";
+
+export default {
+  async email(message: ForwardableEmailMessage, env: Env, ctx: ExecutionContext): Promise<void> {
+    await message.forward("dest@example.com");
+  }
+} satisfies ExportedHandler;
+```
+
+## Dependencies
+
+```bash
+npm install postal-mime
+```
+
+```typescript
+import PostalMime from 'postal-mime';
+
+export default {
+  async email(message, env, ctx) {
+    const parser = new PostalMime();
+    const email = await parser.parse(await message.raw.arrayBuffer());
+    console.log(email.subject);
+    await message.forward("inbox@corp.com");
+  }
+} satisfies ExportedHandler;
+```
+
+## Multi-Environment
+
+```jsonc
+// wrangler.dev.jsonc
+{ "name": "worker-dev", "vars": { "ENV": "dev" } }
+```
+
+```jsonc
+// wrangler.prod.jsonc
+{ "name": "worker-prod", "vars": { "ENV": "prod" } }
+```
+
+```bash
+npx wrangler deploy --config wrangler.dev.jsonc
+npx wrangler deploy --config wrangler.prod.jsonc
+```
+
+## CI/CD (GitHub Actions)
+
+```yaml
+# .github/workflows/deploy.yml
+name: Deploy
+on:
+  push:
+    branches: [main]
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - uses: actions/setup-node@v3
+      - run: npm ci
+      - run: npx wrangler deploy
+        env:
+          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
+```
diff --git
a/cloudflare/references/email-routing/gotchas.md b/cloudflare/references/email-routing/gotchas.md new file mode 100644 index 0000000..20ea419 --- /dev/null +++ b/cloudflare/references/email-routing/gotchas.md @@ -0,0 +1,196 @@ +# Gotchas & Troubleshooting + +## Critical Pitfalls + +### Stream Consumption (MOST COMMON) + +**Problem:** "stream already consumed" or worker hangs + +**Cause:** `message.raw` is `ReadableStream` - consume once only + +**Solution:** +```typescript +// ❌ WRONG +const email1 = await parser.parse(await message.raw.arrayBuffer()); +const email2 = await parser.parse(await message.raw.arrayBuffer()); // FAILS + +// ✅ CORRECT +const raw = await message.raw.arrayBuffer(); +const email = await parser.parse(raw); +``` + +Consume `message.raw` immediately before any async operations. + +### Destination Verification + +**Problem:** Emails not forwarding + +**Cause:** Destination unverified + +**Solution:** Add destination, check inbox for verification email, click link. Verify status: `GET /zones/{id}/email/routing/addresses` + +### Mail Authentication + +**Problem:** Legitimate emails rejected + +**Cause:** Missing SPF/DKIM/DMARC on sender domain + +**Solution:** Configure sender DNS: +```dns +example.com. IN TXT "v=spf1 include:_spf.example.com ~all" +selector._domainkey.example.com. IN TXT "v=DKIM1; k=rsa; p=..." +_dmarc.example.com. 
IN TXT "v=DMARC1; p=quarantine"
+```
+
+### Envelope vs Header
+
+**Problem:** Filtering on wrong address
+
+**Solution:**
+```typescript
+// Routing/auth: envelope
+if (message.from === "trusted@example.com") { }
+
+// Display: headers
+const display = message.headers.get("from");
+```
+
+### SendEmail Limits
+
+| Issue | Limit | Solution |
+|-------|-------|----------|
+| From domain | Must own | Use Email Routing domain |
+| Volume | ~100/min Free | Upgrade or throttle |
+| Attachments | Not supported | Link to R2 |
+| Type | Transactional | No bulk |
+
+## Common Errors
+
+### CPU Time Exceeded
+
+**Cause:** Heavy parsing, large emails
+
+**Solution:**
+```typescript
+// rawSize is the total message size in bytes
+const sizeMB = message.rawSize / (1024 * 1024);
+if (sizeMB > 20) {
+  message.setReject("Too large");
+  return;
+}
+
+ctx.waitUntil(expensiveWork());
+await message.forward("dest@example.com");
+```
+
+### Rule Not Triggering
+
+**Causes:** Priority conflict, matcher error, catch-all override
+
+**Solution:** Check priority (lower = first), verify exact match, confirm destination verified
+
+### Undefined Property
+
+**Cause:** Missing header
+
+**Solution:**
+```typescript
+// ❌ WRONG
+const subj = message.headers.get("subject").toLowerCase();
+
+// ✅ CORRECT
+const subj = message.headers.get("subject")?.toLowerCase() || "";
+```
+
+## Limits
+
+| Resource | Free | Paid |
+|----------|------|------|
+| Email size | 25 MB | 25 MB |
+| Rules | 200 | 200 |
+| Destinations | 200 | 200 |
+| CPU time | 10ms | 50ms |
+| SendEmail | ~100/min | Higher |
+
+## Debugging
+
+### Local
+
+```bash
+npx wrangler dev
+
+curl -X POST 'http://localhost:8787/__email' \
+  --header 'content-type: message/rfc822' \
+  --data 'From: test@example.com
+To: you@yourdomain.com
+Subject: Test
+
+Body'
+```
+
+### Production
+
+```bash
+npx wrangler tail
+```
+
+### Pattern
+
+```typescript
+export default {
+  async email(message, env, ctx) {
+    try {
+      console.log("From:", message.from);
+      await
process(message, env); + } catch (err) { + console.error(err); + message.setReject(err.message); + } + } +} satisfies ExportedHandler; +``` + +## Auth Troubleshooting + +### Check Status + +```typescript +const auth = message.headers.get("authentication-results") || ""; +console.log({ + spf: auth.includes("spf=pass"), + dkim: auth.includes("dkim=pass"), + dmarc: auth.includes("dmarc=pass") +}); + +if (!auth.includes("pass")) { + message.setReject("Failed auth"); + return; +} +``` + +### SPF Issues + +**Causes:** Forwarding breaks SPF, too many lookups (>10), missing includes + +**Solution:** +```dns +; ✅ Good +example.com. IN TXT "v=spf1 include:_spf.google.com ~all" + +; ❌ Bad - too many +example.com. IN TXT "v=spf1 include:a.com include:b.com ... ~all" +``` + +### DMARC Alignment + +**Cause:** From domain must match SPF/DKIM domain + +## Best Practices + +1. Consume `message.raw` immediately +2. Verify destinations +3. Handle missing headers (`?.`) +4. Use envelope for routing +5. Check spam scores +6. Test locally first +7. Use `ctx.waitUntil` for background work +8. Size-check early diff --git a/cloudflare/references/email-routing/patterns.md b/cloudflare/references/email-routing/patterns.md new file mode 100644 index 0000000..2163677 --- /dev/null +++ b/cloudflare/references/email-routing/patterns.md @@ -0,0 +1,229 @@ +# Common Patterns + +## 1. Allowlist/Blocklist + +```typescript +// Allowlist +const allowed = ["user@example.com", "trusted@corp.com"]; +if (!allowed.includes(message.from)) { + message.setReject("Not allowed"); + return; +} +await message.forward("inbox@corp.com"); +``` + +## 2. 
Parse Email Body + +```typescript +import PostalMime from 'postal-mime'; + +export default { + async email(message, env, ctx) { + // CRITICAL: Consume stream immediately + const raw = await message.raw.arrayBuffer(); + + const parser = new PostalMime(); + const email = await parser.parse(raw); + + console.log({ + subject: email.subject, + text: email.text, + html: email.html, + from: email.from.address, + attachments: email.attachments.length + }); + + await message.forward("inbox@corp.com"); + } +} satisfies ExportedHandler; +``` + +## 3. Spam Filter + +```typescript +const score = parseFloat(message.headers.get("x-cf-spamh-score") || "0"); +if (score > 5) { + message.setReject("Spam detected"); + return; +} +await message.forward("inbox@corp.com"); +``` + +## 4. Archive to R2 + +```typescript +interface Env { R2: R2Bucket; } + +export default { + async email(message, env, ctx) { + const raw = await message.raw.arrayBuffer(); + + const key = `${new Date().toISOString()}-${message.from}.eml`; + await env.R2.put(key, raw, { + httpMetadata: { contentType: "message/rfc822" } + }); + + await message.forward("inbox@corp.com"); + } +} satisfies ExportedHandler; +``` + +## 5. Store Metadata in KV + +```typescript +import PostalMime from 'postal-mime'; + +interface Env { KV: KVNamespace; } + +export default { + async email(message, env, ctx) { + const raw = await message.raw.arrayBuffer(); + const parser = new PostalMime(); + const email = await parser.parse(raw); + + const metadata = { + from: email.from.address, + subject: email.subject, + timestamp: new Date().toISOString(), + size: raw.byteLength + }; + + await env.KV.put(`email:${Date.now()}`, JSON.stringify(metadata)); + await message.forward("inbox@corp.com"); + } +} satisfies ExportedHandler; +``` + +## 6. 
Subject-Based Routing
+
+```typescript
+export default {
+  async email(message, env, ctx) {
+    const subject = message.headers.get("subject")?.toLowerCase() || "";
+
+    if (subject.includes("[urgent]")) {
+      await message.forward("oncall@corp.com");
+    } else if (subject.includes("[billing]")) {
+      await message.forward("billing@corp.com");
+    } else if (subject.includes("[support]")) {
+      await message.forward("support@corp.com");
+    } else {
+      await message.forward("general@corp.com");
+    }
+  }
+} satisfies ExportedHandler;
+```
+
+## 7. Auto-Reply
+
+```typescript
+import { EmailMessage } from "cloudflare:email";
+import { createMimeMessage } from "mimetext";
+
+interface Env {
+  EMAIL: SendEmail;
+  REPLIED: KVNamespace;
+}
+
+export default {
+  async email(message, env, ctx) {
+    const msgId = message.headers.get("message-id");
+
+    if (msgId && await env.REPLIED.get(msgId)) {
+      await message.forward("archive@corp.com");
+      return;
+    }
+
+    ctx.waitUntil((async () => {
+      // SendEmail.send() takes an EmailMessage carrying raw MIME,
+      // so compose the reply with mimetext first
+      const reply = createMimeMessage();
+      reply.setSender({ addr: "noreply@yourdomain.com" });
+      reply.setRecipient(message.from);
+      reply.setSubject("Re: " + (message.headers.get("subject") || ""));
+      reply.addMessage({
+        contentType: "text/plain",
+        data: "Thank you. We'll respond within 24h."
+      });
+      await env.EMAIL.send(new EmailMessage(
+        "noreply@yourdomain.com",
+        message.from,
+        reply.asRaw()
+      ));
+      if (msgId) await env.REPLIED.put(msgId, "1", { expirationTtl: 604800 });
+    })());
+
+    await message.forward("support@corp.com");
+  }
+} satisfies ExportedHandler;
+```
+
+## 8. Extract Attachments
+
+```typescript
+import PostalMime from 'postal-mime';
+
+interface Env { ATTACHMENTS: R2Bucket; }
+
+export default {
+  async email(message, env, ctx) {
+    const parser = new PostalMime();
+    const email = await parser.parse(await message.raw.arrayBuffer());
+
+    for (const att of email.attachments) {
+      const key = `${Date.now()}-${att.filename}`;
+      await env.ATTACHMENTS.put(key, att.content, {
+        httpMetadata: { contentType: att.mimeType }
+      });
+    }
+
+    await message.forward("inbox@corp.com");
+  }
+} satisfies ExportedHandler;
+```
+
+## 9.
Log to D1 + +```typescript +import PostalMime from 'postal-mime'; + +interface Env { DB: D1Database; } + +export default { + async email(message, env, ctx) { + const parser = new PostalMime(); + const email = await parser.parse(await message.raw.arrayBuffer()); + + ctx.waitUntil( + env.DB.prepare("INSERT INTO log (ts, from_addr, subj) VALUES (?, ?, ?)") + .bind(new Date().toISOString(), email.from.address, email.subject || "") + .run() + ); + + await message.forward("inbox@corp.com"); + } +} satisfies ExportedHandler; +``` + +## 10. Multi-Tenant + +```typescript +interface Env { TENANTS: KVNamespace; } + +export default { + async email(message, env, ctx) { + const subdomain = message.to.split("@")[1].split(".")[0]; + const config = await env.TENANTS.get(subdomain, "json") as { forward: string } | null; + + if (!config) { + message.setReject("Unknown tenant"); + return; + } + + await message.forward(config.forward); + } +} satisfies ExportedHandler; +``` + +## Summary + +| Pattern | Use Case | Storage | +|---------|----------|---------| +| Allowlist | Security | None | +| Parse | Body/attachments | None | +| Spam Filter | Reduce spam | None | +| R2 Archive | Email storage | R2 | +| KV Meta | Analytics | KV | +| Subject Route | Dept routing | None | +| Auto-Reply | Support | KV | +| Attachments | Doc mgmt | R2 | +| D1 Log | Audit trail | D1 | +| Multi-Tenant | SaaS | KV | diff --git a/cloudflare/references/email-workers/README.md b/cloudflare/references/email-workers/README.md new file mode 100644 index 0000000..5a3e304 --- /dev/null +++ b/cloudflare/references/email-workers/README.md @@ -0,0 +1,151 @@ +# Cloudflare Email Workers + +Process incoming emails programmatically using Cloudflare Workers runtime. + +## Overview + +Email Workers enable custom email processing logic at the edge. Build spam filters, auto-responders, ticket systems, notification handlers, and more using the same Workers runtime you use for HTTP requests. 
+
+**Key capabilities**:
+- Process inbound emails with full message access
+- Forward to verified destinations
+- Send replies with proper threading
+- Parse MIME content and attachments
+- Integrate with KV, R2, D1, and external APIs
+
+## Quick Start
+
+### Minimal ES Modules Handler
+
+```typescript
+export default {
+  async email(message, env, ctx) {
+    // Reject mail from a known-bad sender domain (match the domain
+    // suffix exactly rather than a substring anywhere in the address)
+    if (message.from.endsWith('@spam.com')) {
+      message.setReject('Blocked');
+      return;
+    }
+
+    // Forward to inbox
+    await message.forward('inbox@example.com');
+  }
+};
+```
+
+### Core Operations
+
+| Operation | Method | Use Case |
+|-----------|--------|----------|
+| Forward | `message.forward(to, headers?)` | Route to verified destination |
+| Reject | `message.setReject(reason)` | Block with SMTP error |
+| Reply | `message.reply(emailMessage)` | Auto-respond with threading |
+| Parse | postal-mime library | Extract subject, body, attachments |
+
+## Reading Order
+
+For comprehensive understanding, read files in this order:
+
+1. **README.md** (this file) - Overview and quick start
+2. **configuration.md** - Setup, deployment, bindings
+3. **api.md** - Complete API reference
+4. **patterns.md** - Real-world implementation examples
+5.
**gotchas.md** - Critical pitfalls and debugging + +## In This Reference + +| File | Description | Key Topics | +|------|-------------|------------| +| [api.md](./api.md) | Complete API reference | ForwardableEmailMessage, SendEmail bindings, reply() method, postal-mime/mimetext APIs | +| [configuration.md](./configuration.md) | Setup and configuration | wrangler.jsonc, bindings, deployment, dependencies | +| [patterns.md](./patterns.md) | Real-world examples | Allowlists from KV, auto-reply with threading, attachment extraction, webhook notifications | +| [gotchas.md](./gotchas.md) | Pitfalls and debugging | Stream consumption, ctx.waitUntil errors, security, limits | + +## Architecture + +``` +Incoming Email → Email Routing → Email Worker + ↓ + Process + Decide + ↓ + ┌───────────────┼───────────────┐ + ↓ ↓ ↓ + Forward Reply Reject +``` + +**Event flow**: +1. Email arrives at your domain +2. Email Routing matches route (e.g., `support@example.com`) +3. Bound Email Worker receives `ForwardableEmailMessage` +4. Worker processes and takes action (forward/reply/reject) +5. Email delivered or rejected based on worker logic + +## Key Concepts + +### Envelope vs Headers + +- **Envelope addresses** (`message.from`, `message.to`): SMTP transport addresses (trusted) +- **Header addresses** (parsed from body): Display addresses (can be spoofed) + +Use envelope addresses for security decisions. + +### Single-Use Streams + +`message.raw` is a ReadableStream that can only be read once. Buffer to ArrayBuffer for multiple uses. + +```typescript +// Buffer first +const buffer = await new Response(message.raw).arrayBuffer(); +const email = await PostalMime.parse(buffer); +``` + +See [gotchas.md](./gotchas.md#readablestream-can-only-be-consumed-once) for details. + +### Verified Destinations + +`forward()` only works with addresses verified in the Cloudflare Email Routing dashboard. Add destinations before deployment. 
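A cheap way to honor this constraint is to check candidates against a locally maintained list of addresses you know are verified, so a typo'd or unverified target fails fast instead of erroring at delivery time. A minimal sketch (the helper and the list source are assumptions, not part of the Workers API):

```typescript
// Hypothetical pre-flight check before calling forward(): the `verified`
// list must be kept in sync with the Email Routing dashboard yourself.
function isVerifiedDestination(candidate: string, verified: string[]): boolean {
  const needle = candidate.trim().toLowerCase();
  return verified.some((v) => v.trim().toLowerCase() === needle);
}
```

Inside a handler you might gate `message.forward(dest)` on this check and fall back to a known-good inbox otherwise.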
+ +## Use Cases + +- **Spam filtering**: Block based on sender, content, or reputation +- **Auto-responders**: Send acknowledgment replies with threading +- **Ticket creation**: Parse emails and create support tickets +- **Email archival**: Store in KV, R2, or D1 +- **Notification routing**: Forward to Slack, Discord, or webhooks +- **Attachment processing**: Extract files to R2 storage +- **Multi-tenant routing**: Route based on recipient subdomain +- **Size filtering**: Reject oversized attachments + +## Limits + +| Limit | Value | +|-------|-------| +| Max message size | 25 MiB | +| Max routing rules | 200 | +| Max destinations | 200 | +| CPU time (free tier) | 10ms | +| CPU time (paid tier) | 50ms | + +See [gotchas.md](./gotchas.md#limits-reference) for complete limits table. + +## Prerequisites + +Before deploying Email Workers: + +1. **Enable Email Routing** in Cloudflare dashboard for your domain +2. **Verify destination addresses** for forwarding +3. **Configure DMARC/SPF** for sending domains (required for replies) +4. **Set up wrangler.jsonc** with SendEmail binding + +See [configuration.md](./configuration.md) for detailed setup. + +## Service Worker Syntax (Deprecated) + +Modern projects should use ES modules format shown above. Service Worker syntax (`addEventListener('email', ...)`) is deprecated but still supported. 
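For migration reference, a sketch of the two forms side by side. The `message` type here is stubbed down to the single method used so the snippet stays self-contained; in a real Worker the handler receives the full `ForwardableEmailMessage` and would be the default export:

```typescript
// Deprecated Service Worker form (shown for recognition only):
//   addEventListener('email', (event) => {
//     event.message.forward('inbox@example.com');
//   });
//
// Equivalent ES modules form:
const handler = {
  async email(message: { forward(to: string): Promise<void> }): Promise<void> {
    await message.forward('inbox@example.com');
  },
};
```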
+
+## See Also
+
+- [Email Routing Documentation](https://developers.cloudflare.com/email-routing/)
+- [Workers Platform](https://developers.cloudflare.com/workers/)
+- [Wrangler CLI](https://developers.cloudflare.com/workers/wrangler/)
+- [postal-mime on npm](https://www.npmjs.com/package/postal-mime)
+- [mimetext on npm](https://www.npmjs.com/package/mimetext)
diff --git a/cloudflare/references/email-workers/api.md b/cloudflare/references/email-workers/api.md
new file mode 100644
index 0000000..74da66c
--- /dev/null
+++ b/cloudflare/references/email-workers/api.md
@@ -0,0 +1,237 @@
+# Email Workers API Reference
+
+Complete API reference for Cloudflare Email Workers runtime.
+
+## ForwardableEmailMessage Interface
+
+The main interface passed to email handlers.
+
+```typescript
+interface ForwardableEmailMessage {
+  readonly from: string;        // Envelope MAIL FROM (SMTP sender)
+  readonly to: string;          // Envelope RCPT TO (SMTP recipient)
+  readonly headers: Headers;    // Web-standard Headers object
+  readonly raw: ReadableStream; // Raw MIME message (single-use stream)
+  readonly rawSize: number;     // Total message size in bytes
+
+  setReject(reason: string): void;
+  forward(rcptTo: string, headers?: Headers): Promise<void>;
+  reply(message: EmailMessage): Promise<void>;
+}
+```
+
+### Properties
+
+| Property | Type | Description |
+|----------|------|-------------|
+| `from` | string | Envelope sender (SMTP MAIL FROM) - use for security |
+| `to` | string | Envelope recipient (SMTP RCPT TO) |
+| `headers` | Headers | Message headers (Subject, Message-ID, etc.) |
+| `raw` | ReadableStream | Raw MIME message (**single-use**, buffer first) |
+| `rawSize` | number | Message size in bytes |
+
+### Methods
+
+#### setReject(reason: string): void
+
+Reject with permanent SMTP 5xx error. Email not delivered, sender may receive bounce.
+
+```typescript
+if (blockList.includes(message.from)) {
+  message.setReject('Sender blocked');
+}
+```
+
+#### forward(rcptTo: string, headers?: Headers): Promise<void>
+
+Forward to verified destination. Only `X-*` custom headers allowed.
+
+```typescript
+await message.forward('inbox@example.com');
+
+// With custom headers
+const h = new Headers();
+h.set('X-Processed-By', 'worker');
+await message.forward('inbox@example.com', h);
+```
+
+#### reply(message: EmailMessage): Promise<void>
+
+Send a reply to the original sender (March 2025 feature).
+
+```typescript
+import { EmailMessage } from 'cloudflare:email';
+import { createMimeMessage } from 'mimetext';
+
+const msg = createMimeMessage();
+msg.setSender({ name: 'Support', addr: 'support@example.com' });
+msg.setRecipient(message.from);
+msg.setSubject(`Re: ${message.headers.get('Subject')}`);
+msg.setHeader('In-Reply-To', message.headers.get('Message-ID') || '');
+msg.setHeader('References', message.headers.get('References') || '');
+msg.addMessage({
+  contentType: 'text/plain',
+  data: 'Thank you for your message.'
+});
+
+await message.reply(new EmailMessage(
+  'support@example.com',
+  message.from,
+  msg.asRaw()
+));
+```
+
+**Requirements**:
+- Incoming email needs valid DMARC
+- Reply once per event, recipient = `message.from`
+- Sender domain = receiving domain, with DMARC/SPF/DKIM
+- Max 100 `References` entries
+- Threading: `In-Reply-To` (original Message-ID), `References`, new `Message-ID`
+
+## EmailMessage Constructor
+
+```typescript
+import { EmailMessage } from 'cloudflare:email';
+
+new EmailMessage(from: string, to: string, raw: ReadableStream | string)
+```
+
+Used for sending emails (replies or via SendEmail binding). Domain must be verified.
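For illustration, the string form of `raw` can be assembled by hand; mimetext (covered below) does this more robustly, but a hand-rolled minimal message shows what `EmailMessage` actually carries. The helper is a sketch, not part of any Cloudflare API:

```typescript
// Build a minimal RFC 5322 plain-text message suitable as the `raw`
// string argument to EmailMessage. Headers and body are separated by
// a blank line; lines use CRLF as the wire format expects.
function buildRawMessage(from: string, to: string, subject: string, body: string): string {
  return [
    `From: ${from}`,
    `To: ${to}`,
    `Subject: ${subject}`,
    'MIME-Version: 1.0',
    'Content-Type: text/plain; charset=utf-8',
    '', // blank line: end of headers
    body,
  ].join('\r\n');
}
```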
+
+## SendEmail Interface
+
+```typescript
+interface SendEmail {
+  send(message: EmailMessage): Promise<void>;
+}
+
+// Usage
+await env.EMAIL.send(new EmailMessage(from, to, mimeContent));
+```
+
+## SendEmail Binding Types
+
+```jsonc
+{
+  "send_email": [
+    { "name": "EMAIL" },                                                           // Type 1: Any verified address
+    { "name": "LOGS", "destination_address": "logs@example.com" },                 // Type 2: Single dest
+    { "name": "TEAM", "allowed_destination_addresses": ["a@ex.com", "b@ex.com"] }, // Type 3: Dest allowlist
+    { "name": "NOREPLY", "allowed_sender_addresses": ["noreply@ex.com"] }          // Type 4: Sender allowlist
+  ]
+}
+```
+
+## postal-mime Parsed Output
+
+postal-mime v2.7.3 parses incoming emails into structured data.
+
+```typescript
+interface ParsedEmail {
+  headers: Array<{ key: string; value: string }>;
+  from: { name: string; address: string } | null;
+  to: Array<{ name: string; address: string }> | { name: string; address: string } | null;
+  cc: Array<{ name: string; address: string }> | null;
+  bcc: Array<{ name: string; address: string }> | null;
+  subject: string;
+  messageId: string | null;
+  inReplyTo: string | null;
+  references: string | null;
+  date: string | null;
+  html: string | null;
+  text: string | null;
+  attachments: Array<{
+    filename: string;
+    mimeType: string;
+    disposition: string | null;
+    related: boolean;
+    contentId: string | null;
+    content: Uint8Array;
+  }>;
+}
+```
+
+### Usage
+
+```typescript
+import PostalMime from 'postal-mime';
+
+const buffer = await new Response(message.raw).arrayBuffer();
+const email = await PostalMime.parse(buffer);
+
+console.log(email.subject);
+console.log(email.from?.address);
+console.log(email.text);
+console.log(email.attachments.length);
+```
+
+## mimetext API Quick Reference
+
+mimetext v3.0.27 composes outgoing emails.
+
+```typescript
+import { createMimeMessage } from 'mimetext';
+
+const msg = createMimeMessage();
+
+// Sender
+msg.setSender({ name: 'John Doe', addr: 'john@example.com' });
+
+// Recipients
+msg.setRecipient('alice@example.com');
+msg.setRecipients(['bob@example.com', 'carol@example.com']);
+msg.setCc('manager@example.com');
+msg.setBcc(['audit@example.com']);
+
+// Headers (the message-IDs below are placeholders)
+msg.setSubject('Meeting Notes');
+msg.setHeader('In-Reply-To', '<original-message-id@example.com>');
+msg.setHeader('References', '<thread-root@example.com> <original-message-id@example.com>');
+msg.setHeader('Message-ID', `<${crypto.randomUUID()}@example.com>`);
+
+// Content
+msg.addMessage({
+  contentType: 'text/plain',
+  data: 'Plain text content'
+});
+
+msg.addMessage({
+  contentType: 'text/html',
+  data: '
<p>HTML content</p>
' +}); + +// Attachments +msg.addAttachment({ + filename: 'report.pdf', + contentType: 'application/pdf', + data: pdfBuffer // Uint8Array or base64 string +}); + +// Generate raw MIME +const raw = msg.asRaw(); // Returns string +``` + +## TypeScript Types + +```typescript +import { + ForwardableEmailMessage, + EmailMessage +} from 'cloudflare:email'; + +interface Env { + EMAIL: SendEmail; + EMAIL_ARCHIVE: KVNamespace; + ALLOWED_SENDERS: KVNamespace; +} + +export default { + async email( + message: ForwardableEmailMessage, + env: Env, + ctx: ExecutionContext + ): Promise { + // Fully typed + } +}; +``` diff --git a/cloudflare/references/email-workers/configuration.md b/cloudflare/references/email-workers/configuration.md new file mode 100644 index 0000000..7928d04 --- /dev/null +++ b/cloudflare/references/email-workers/configuration.md @@ -0,0 +1,112 @@ +# Email Workers Configuration + +## wrangler.jsonc + +```jsonc +{ + "name": "email-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-27", + "send_email": [ + { "name": "EMAIL" }, // Unrestricted + { "name": "EMAIL_LOGS", "destination_address": "logs@example.com" }, // Single dest + { "name": "EMAIL_TEAM", "allowed_destination_addresses": ["a@ex.com", "b@ex.com"] }, + { "name": "EMAIL_NOREPLY", "allowed_sender_addresses": ["noreply@ex.com"] } + ], + "kv_namespaces": [{ "binding": "ARCHIVE", "id": "xxx" }], + "r2_buckets": [{ "binding": "ATTACHMENTS", "bucket_name": "email-attachments" }], + "vars": { "WEBHOOK_URL": "https://hooks.example.com" } +} +``` + +## TypeScript Types + +```typescript +interface Env { + EMAIL: SendEmail; + ARCHIVE: KVNamespace; + ATTACHMENTS: R2Bucket; + WEBHOOK_URL: string; +} + +export default { + async email(message: ForwardableEmailMessage, env: Env, ctx: ExecutionContext) {} +}; +``` + +## Dependencies + +```bash +npm install postal-mime mimetext +npm install -D @cloudflare/workers-types wrangler typescript +``` + +Use postal-mime v2.x, mimetext v3.x. 
+ +## tsconfig.json + +```json +{ + "compilerOptions": { + "target": "ES2022", "module": "ES2022", "lib": ["ES2022"], + "types": ["@cloudflare/workers-types"], + "moduleResolution": "bundler", "strict": true + } +} +``` + +## Local Development + +```bash +npx wrangler dev + +# Test receiving +curl --request POST 'http://localhost:8787/cdn-cgi/handler/email' \ + --url-query 'from=sender@example.com' --url-query 'to=recipient@example.com' \ + --header 'Content-Type: text/plain' --data-raw 'Subject: Test\n\nHello' +``` + +Sent emails write to local `.eml` files. + +## Deployment Checklist + +- [ ] Enable Email Routing in dashboard +- [ ] Verify destination addresses +- [ ] Configure DMARC/SPF/DKIM for sending +- [ ] Create KV/R2 resources if needed +- [ ] Update wrangler.jsonc with production IDs + +```bash +npx wrangler deploy +npx wrangler deployments list +``` + +## Dashboard Setup + +1. **Email Routing:** Domain → Email → Enable Email Routing +2. **Verify addresses:** Email → Destination addresses → Add & verify +3. **Bind Worker:** Email → Email Workers → Create route → Select pattern & Worker +4. 
**DMARC:** Add TXT `_dmarc.domain.com`: `v=DMARC1; p=quarantine;`
+
+## Secrets
+
+```bash
+npx wrangler secret put API_KEY
+# Access: env.API_KEY
+```
+
+## Monitoring
+
+```bash
+npx wrangler tail
+npx wrangler tail --status error
+npx wrangler tail --format json
+```
+
+## Troubleshooting
+
+| Error | Fix |
+|-------|-----|
+| "Binding not found" | Check `send_email` name matches code |
+| "Invalid destination" | Verify in Email Routing dashboard |
+| Type errors | Install `@cloudflare/workers-types` |
diff --git a/cloudflare/references/email-workers/gotchas.md b/cloudflare/references/email-workers/gotchas.md
new file mode 100644
index 0000000..3700a50
--- /dev/null
+++ b/cloudflare/references/email-workers/gotchas.md
@@ -0,0 +1,125 @@
+# Email Workers Gotchas
+
+## Critical Issues
+
+### ReadableStream Single-Use
+
+```typescript
+// ❌ WRONG: Stream consumed twice
+const email = await PostalMime.parse(await new Response(message.raw).arrayBuffer());
+const rawText = await new Response(message.raw).text(); // FAILS: stream already consumed
+ +// ✅ CORRECT: Buffer first +const buffer = await new Response(message.raw).arrayBuffer(); +const email = await PostalMime.parse(buffer); +const rawText = new TextDecoder().decode(buffer); +``` + +### ctx.waitUntil() Errors Silent + +```typescript +// ❌ Errors dropped silently +ctx.waitUntil(fetch(webhookUrl, { method: 'POST', body: data })); + +// ✅ Catch and log +ctx.waitUntil( + fetch(webhookUrl, { method: 'POST', body: data }) + .catch(err => env.ERROR_LOG.put(`error:${Date.now()}`, err.message)) +); +``` + +## Security + +### Envelope vs Header From (Spoofing) + +```typescript +const envelopeFrom = message.from; // SMTP MAIL FROM (trusted) +const headerFrom = (await PostalMime.parse(buffer)).from?.address; // (untrusted) +// Use envelope for security decisions +``` + +### Input Validation + +```typescript +if (message.rawSize > 5_000_000) { message.setReject('Too large'); return; } +if ((message.headers.get('Subject') || '').length > 1000) { + message.setReject('Invalid subject'); return; +} +``` + +### DMARC for Replies + +Replies fail silently without DMARC. Verify: `dig TXT _dmarc.example.com` + +## Parsing + +### Address Parsing + +```typescript +const email = await PostalMime.parse(buffer); +const fromAddress = email.from?.address || 'unknown'; +const toAddresses = Array.isArray(email.to) ? email.to.map(t => t.address) : [email.to?.address]; +``` + +### Character Encoding + +Let postal-mime handle decoding - `email.subject`, `email.text`, `email.html` are UTF-8. 
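Since postal-mime's recipient fields vary in shape (single object, array, or null, as noted under Address Parsing), a small normalizer keeps downstream code simple. A sketch under those assumptions:

```typescript
type Addr = { name: string; address: string };

// Flatten postal-mime's `to`/`cc`/`bcc` fields, which may be a single
// object, an array, or null, into a plain list of address strings.
function addressList(field: Addr | Addr[] | null): string[] {
  if (!field) return [];
  const items = Array.isArray(field) ? field : [field];
  return items.map((a) => a.address).filter(Boolean);
}
```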
+ +## API Behavior + +### setReject() vs throw + +```typescript +// setReject() for SMTP rejection +if (blockList.includes(message.from)) { message.setReject('Blocked'); return; } + +// throw for worker errors +if (!env.KV) throw new Error('KV not configured'); +``` + +### forward() Only X-* Headers + +```typescript +headers.set('X-Processed-By', 'worker'); // ✅ Works +headers.set('Subject', 'Modified'); // ❌ Dropped +``` + +### Reply Requires Verified Domain + +```typescript +// Use same domain as receiving address +const receivingDomain = message.to.split('@')[1]; +await message.reply(new EmailMessage(`noreply@${receivingDomain}`, message.from, rawMime)); +``` + +## Performance + +### CPU Limit + +```typescript +// Skip parsing large emails +if (message.rawSize > 5_000_000) { + await message.forward('inbox@example.com'); + return; +} +``` + +Monitor: `npx wrangler tail` + +## Limits + +| Limit | Value | +|-------|-------| +| Max message size | 25 MiB | +| Max rules/zone | 200 | +| CPU time (free/paid) | 10ms / 50ms | +| Reply References | 100 | + +## Common Errors + +| Error | Fix | +|-------|-----| +| "Address not verified" | Add in Email Routing dashboard | +| "Exceeded CPU time" | Use `ctx.waitUntil()` or upgrade | +| "Stream is locked" | Buffer `message.raw` first | +| Silent reply failure | Check DMARC records | diff --git a/cloudflare/references/email-workers/patterns.md b/cloudflare/references/email-workers/patterns.md new file mode 100644 index 0000000..f1e65f5 --- /dev/null +++ b/cloudflare/references/email-workers/patterns.md @@ -0,0 +1,102 @@ +# Email Workers Patterns + +## Parse Email + +```typescript +import PostalMime from 'postal-mime'; + +export default { + async email(message, env, ctx) { + const buffer = await new Response(message.raw).arrayBuffer(); + const email = await PostalMime.parse(buffer); + console.log(email.from, email.subject, email.text, email.attachments.length); + await message.forward('inbox@example.com'); + } +}; +``` + +## 
Filtering + +```typescript +// Allowlist from KV +const allowList = await env.ALLOWED_SENDERS.get('list', 'json') || []; +if (!allowList.includes(message.from)) { + message.setReject('Not allowed'); + return; +} + +// Size check (avoid parsing large emails) +if (message.rawSize > 5_000_000) { + await message.forward('inbox@example.com'); // Forward without parsing + return; +} +``` + +## Auto-Reply with Threading + +```typescript +import { EmailMessage } from 'cloudflare:email'; +import { createMimeMessage } from 'mimetext'; + +const msg = createMimeMessage(); +msg.setSender({ addr: 'support@example.com' }); +msg.setRecipient(message.from); +msg.setSubject(`Re: ${message.headers.get('Subject')}`); +msg.setHeader('In-Reply-To', message.headers.get('Message-ID') || ''); +msg.addMessage({ contentType: 'text/plain', data: 'Thank you. We will respond.' }); + +await message.reply(new EmailMessage('support@example.com', message.from, msg.asRaw())); +``` + +## Rate-Limited Auto-Reply + +```typescript +const rateKey = `rate:${message.from}`; +if (!await env.RATE_LIMIT.get(rateKey)) { + // Send reply... + ctx.waitUntil(env.RATE_LIMIT.put(rateKey, '1', { expirationTtl: 3600 })); +} +``` + +## Subject-Based Routing + +```typescript +const subject = (message.headers.get('Subject') || '').toLowerCase(); +if (subject.includes('billing')) await message.forward('billing@example.com'); +else if (subject.includes('support')) await message.forward('support@example.com'); +else await message.forward('general@example.com'); +``` + +## Multi-Tenant Routing + +```typescript +// support+tenant123@example.com → tenant123 +const tenantId = message.to.split('@')[0].match(/\+(.+)$/)?.[1] || 'default'; +const config = await env.TENANT_CONFIG.get(tenantId, 'json'); +config?.forwardTo ? 
await message.forward(config.forwardTo) : message.setReject('Unknown');
+```
+
+## Archive & Extract Attachments
+
+```typescript
+// Assumes `email` was parsed with postal-mime as in "Parse Email" above
+
+// Archive to KV
+ctx.waitUntil(env.ARCHIVE.put(`email:${Date.now()}`, JSON.stringify({
+  from: message.from, subject: email.subject
+})));
+
+// Attachments to R2
+for (const att of email.attachments) {
+  ctx.waitUntil(env.R2.put(`${Date.now()}-${att.filename}`, att.content));
+}
+```
+
+## Webhook Integration
+
+```typescript
+ctx.waitUntil(
+  fetch(env.WEBHOOK_URL, {
+    method: 'POST',
+    body: JSON.stringify({ from: message.from, subject: message.headers.get('Subject') })
+  }).catch(err => console.error(err))
+);
+```
diff --git a/cloudflare/references/hyperdrive/README.md b/cloudflare/references/hyperdrive/README.md
new file mode 100644
index 0000000..6626776
--- /dev/null
+++ b/cloudflare/references/hyperdrive/README.md
@@ -0,0 +1,82 @@
+# Hyperdrive
+
+Accelerates database queries from Workers via connection pooling, edge setup, and query caching.
+
+## Key Features
+
+- **Connection Pooling**: Persistent connections eliminate TCP/TLS/auth handshakes (~7 round-trips)
+- **Edge Setup**: Connection negotiation at edge, pooling near origin
+- **Query Caching**: Auto-cache non-mutating queries (default 60s TTL)
+- **Support**: PostgreSQL, MySQL + compatibles (CockroachDB, Timescale, PlanetScale, Neon, Supabase)
+
+## Architecture
+
+```
+Worker → Edge (setup) → Pool (near DB) → Origin
+              ↓ cached reads
+            Cache
+```
+
+## Quick Start
+
+```bash
+# Create config
+npx wrangler hyperdrive create my-db \
+  --connection-string="postgres://user:pass@host:5432/db"
+
+# wrangler.jsonc
+{
+  "compatibility_flags": ["nodejs_compat"],
+  "hyperdrive": [{"binding": "HYPERDRIVE", "id": "<hyperdrive-config-id>"}]
+}
+```
+
+```typescript
+import { Client } from "pg";
+
+export default {
+  async fetch(req: Request, env: Env): Promise<Response> {
+    const client = new Client({
+      connectionString: env.HYPERDRIVE.connectionString,
+    });
+    await client.connect();
+    const result = await
client.query("SELECT * FROM users WHERE id = $1", [123]); + await client.end(); + return Response.json(result.rows); + }, +}; +``` + +## When to Use + +✅ Global access to single-region DBs, high read ratios, popular queries, connection-heavy loads +❌ Write-heavy, real-time data (<1s), single-region apps close to DB + +**💡 Pair with Smart Placement** for Workers making multiple queries - executes near DB to minimize latency. + +## Driver Choice + +| Driver | Use When | Notes | +|--------|----------|-------| +| **pg** (recommended) | General use, TypeScript, ecosystem compatibility | Stable, widely used, works with most ORMs | +| **postgres.js** | Advanced features, template literals, streaming | Lighter than pg, `prepare: true` is default | +| **mysql2** | MySQL/MariaDB/PlanetScale | MySQL only, less mature support | + +## Reading Order + +| New to Hyperdrive | Implementing | Troubleshooting | +|-------------------|--------------|-----------------| +| 1. README (this) | 1. [configuration.md](./configuration.md) | 1. [gotchas.md](./gotchas.md) | +| 2. [configuration.md](./configuration.md) | 2. [api.md](./api.md) | 2. [patterns.md](./patterns.md) | +| 3. [api.md](./api.md) | 3. [patterns.md](./patterns.md) | 3. 
[api.md](./api.md) |

## In This Reference
- [configuration.md](./configuration.md) - Setup, wrangler config, Smart Placement
- [api.md](./api.md) - Binding APIs, query patterns, driver usage
- [patterns.md](./patterns.md) - Use cases, ORMs, multi-query optimization
- [gotchas.md](./gotchas.md) - Limits, troubleshooting, connection management

## See Also
- [smart-placement](../smart-placement/) - Optimize multi-query Workers near databases
- [d1](../d1/) - Serverless SQLite alternative for edge-native apps
- [workers](../workers/) - Worker runtime with database bindings

diff --git a/cloudflare/references/hyperdrive/api.md b/cloudflare/references/hyperdrive/api.md
new file mode 100644
index 0000000..0e587b9
--- /dev/null
+++ b/cloudflare/references/hyperdrive/api.md
@@ -0,0 +1,143 @@
# API Reference

See [README.md](./README.md) for overview, [configuration.md](./configuration.md) for setup.

## Binding Interface

```typescript
interface Hyperdrive {
  connectionString: string; // PostgreSQL
  // MySQL properties:
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

interface Env {
  HYPERDRIVE: Hyperdrive;
}
```

**Generate types:** `npx wrangler types` (auto-creates worker-configuration.d.ts from wrangler.jsonc)

## PostgreSQL (node-postgres) - RECOMMENDED

```typescript
import { Client } from "pg"; // pg@^8.17.2

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const client = new Client({connectionString: env.HYPERDRIVE.connectionString});
    try {
      await client.connect();
      const result = await client.query("SELECT * FROM users WHERE id = $1", [123]);
      return Response.json(result.rows);
    } finally {
      await client.end();
    }
  },
};
```

**⚠️ Workers connection limit: 6 per Worker invocation** - use connection pooling wisely.
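The try/finally shape above generalizes into a small helper that guarantees the connection is released even when a query throws — useful for staying under the six-connection cap. A sketch (the `withClient` name and `Connectable` type are ours, not part of pg or Hyperdrive):

```typescript
// Minimal structural type covering the pg Client methods we rely on.
interface Connectable {
  connect(): Promise<void>;
  end(): Promise<void>;
}

// Hypothetical helper: acquire a client, run the callback, and always
// release the connection — whether the callback resolves or throws.
async function withClient<C extends Connectable, T>(
  client: C,
  fn: (client: C) => Promise<T>,
): Promise<T> {
  await client.connect();
  try {
    return await fn(client);
  } finally {
    await client.end(); // runs even if fn throws
  }
}
```

In a handler you would pass `new Client({ connectionString: env.HYPERDRIVE.connectionString })` and run your queries inside the callback.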
+ +## PostgreSQL (postgres.js) + +```typescript +import postgres from "postgres"; // postgres@^3.4.8 + +const sql = postgres(env.HYPERDRIVE.connectionString, { + max: 5, // Limit per Worker (Workers max: 6) + prepare: true, // Enabled by default, required for caching + fetch_types: false, // Reduce latency if not using arrays +}); + +const users = await sql`SELECT * FROM users WHERE active = ${true} LIMIT 10`; +``` + +**⚠️ `prepare: true` is enabled by default and required for Hyperdrive caching.** Setting to `false` disables prepared statements + cache. + +## MySQL (mysql2) + +```typescript +import { createConnection } from "mysql2/promise"; // mysql2@^3.16.2 + +const conn = await createConnection({ + host: env.HYPERDRIVE.host, + user: env.HYPERDRIVE.user, + password: env.HYPERDRIVE.password, + database: env.HYPERDRIVE.database, + port: env.HYPERDRIVE.port, + disableEval: true, // ⚠️ REQUIRED for Workers +}); + +const [results] = await conn.query("SELECT * FROM users WHERE active = ? LIMIT ?", [true, 10]); +ctx.waitUntil(conn.end()); +``` + +**⚠️ MySQL support is less mature than PostgreSQL** - expect fewer optimizations and potential edge cases. 
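The `ctx.waitUntil(conn.end())` call in the MySQL example defers teardown so the response isn't blocked on closing the connection. Conceptually it works like this sketch — `MiniContext` is illustrative only; the real `ExecutionContext` is provided by the Workers runtime:

```typescript
// Conceptual sketch of ExecutionContext.waitUntil semantics: the handler
// returns its response immediately, while registered promises (such as
// connection teardown) are kept alive and completed in the background.
class MiniContext {
  private pending: Promise<unknown>[] = [];

  waitUntil(promise: Promise<unknown>): void {
    this.pending.push(promise); // runtime keeps the invocation alive for these
  }

  async drain(): Promise<void> {
    await Promise.all(this.pending); // runtime does this after the response is sent
  }
}
```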
+ +## Query Caching + +**Cacheable:** +```sql +SELECT * FROM posts WHERE published = true; +SELECT COUNT(*) FROM users; +``` + +**NOT cacheable:** +```sql +-- Writes +INSERT/UPDATE/DELETE + +-- Volatile functions +SELECT NOW(); +SELECT random(); +SELECT LASTVAL(); -- PostgreSQL +SELECT UUID(); -- MySQL +``` + +**Cache config:** +- Default: `max_age=60s`, `swr=15s` +- Max `max_age`: 3600s +- Disable: `--caching-disabled=true` + +**Multiple configs pattern:** +```typescript +// Reads: cached +const sqlCached = postgres(env.HYPERDRIVE_CACHED.connectionString); +const posts = await sqlCached`SELECT * FROM posts ORDER BY views DESC LIMIT 10`; + +// Writes/time-sensitive: no cache +const sqlNoCache = postgres(env.HYPERDRIVE_NO_CACHE.connectionString); +const orders = await sqlNoCache`SELECT * FROM orders WHERE created_at > NOW() - INTERVAL 5 MINUTE`; +``` + +## ORMs + +**Drizzle:** +```typescript +import { drizzle } from "drizzle-orm/postgres-js"; // drizzle-orm@^0.45.1 +import postgres from "postgres"; + +const client = postgres(env.HYPERDRIVE.connectionString, {max: 5, prepare: true}); +const db = drizzle(client); +const users = await db.select().from(users).where(eq(users.active, true)).limit(10); +``` + +**Kysely:** +```typescript +import { Kysely, PostgresDialect } from "kysely"; // kysely@^0.27+ +import postgres from "postgres"; + +const db = new Kysely({ + dialect: new PostgresDialect({ + postgres: postgres(env.HYPERDRIVE.connectionString, {max: 5, prepare: true}), + }), +}); +const users = await db.selectFrom("users").selectAll().where("active", "=", true).execute(); +``` + +See [patterns.md](./patterns.md) for use cases, [gotchas.md](./gotchas.md) for limits. 
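The Query Caching rules above reduce to "non-mutating and free of volatile functions." As a purely illustrative sketch of that rule — Hyperdrive parses SQL internally, and `isLikelyCacheable` is our name, not an API:

```typescript
// Illustrative only: approximates the cacheability rules described in the
// Query Caching section. Hyperdrive's real logic parses SQL properly.
const MUTATING = /^\s*(INSERT|UPDATE|DELETE|MERGE|TRUNCATE|CREATE|ALTER|DROP)\b/i;
const VOLATILE = /\b(NOW|RANDOM|LASTVAL|UUID|CURRENT_TIMESTAMP)\s*\(/i;

function isLikelyCacheable(query: string): boolean {
  return !MUTATING.test(query) && !VOLATILE.test(query);
}
```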
diff --git a/cloudflare/references/hyperdrive/configuration.md b/cloudflare/references/hyperdrive/configuration.md new file mode 100644 index 0000000..6d429a9 --- /dev/null +++ b/cloudflare/references/hyperdrive/configuration.md @@ -0,0 +1,159 @@ +# Configuration + +See [README.md](./README.md) for overview. + +## Create Config + +**PostgreSQL:** +```bash +# Basic +npx wrangler hyperdrive create my-db \ + --connection-string="postgres://user:pass@host:5432/db" + +# Custom cache +npx wrangler hyperdrive create my-db \ + --connection-string="postgres://..." \ + --max-age=120 --swr=30 + +# No cache +npx wrangler hyperdrive create my-db \ + --connection-string="postgres://..." \ + --caching-disabled=true +``` + +**MySQL:** +```bash +npx wrangler hyperdrive create my-db \ + --connection-string="mysql://user:pass@host:3306/db" +``` + +## wrangler.jsonc + +```jsonc +{ + "compatibility_date": "2025-01-01", // Use latest for new projects + "compatibility_flags": ["nodejs_compat"], + "hyperdrive": [ + { + "binding": "HYPERDRIVE", + "id": "", + "localConnectionString": "postgres://user:pass@localhost:5432/dev" + } + ] +} +``` + +**Generate TypeScript types:** Run `npx wrangler types` to auto-generate `worker-configuration.d.ts` from your wrangler.jsonc. 
+ +**Multiple configs:** +```jsonc +{ + "hyperdrive": [ + {"binding": "HYPERDRIVE_CACHED", "id": ""}, + {"binding": "HYPERDRIVE_NO_CACHE", "id": ""} + ] +} +``` + +## Management + +```bash +npx wrangler hyperdrive list +npx wrangler hyperdrive get +npx wrangler hyperdrive update --max-age=180 +npx wrangler hyperdrive delete +``` + +## Config Options + +Hyperdrive create/update CLI flags: + +| Option | Default | Notes | +|--------|---------|-------| +| `--caching-disabled` | `false` | Disable caching | +| `--max-age` | `60` | Cache TTL (max 3600s) | +| `--swr` | `15` | Stale-while-revalidate | +| `--origin-connection-limit` | 20/100 | Free/paid | +| `--access-client-id` | - | Tunnel auth | +| `--access-client-secret` | - | Tunnel auth | +| `--sslmode` | `require` | PostgreSQL only | + +## Smart Placement Integration + +For Workers making **multiple queries** per request, enable Smart Placement to execute near your database: + +```jsonc +{ + "compatibility_date": "2025-01-01", + "compatibility_flags": ["nodejs_compat"], + "placement": { + "mode": "smart" + }, + "hyperdrive": [ + { + "binding": "HYPERDRIVE", + "id": "" + } + ] +} +``` + +**Benefits:** Multi-query Workers run closer to DB, reducing round-trip latency. See [patterns.md](./patterns.md) for examples. + +## Private DB via Tunnel + +``` +Worker → Hyperdrive → Access → Tunnel → Private Network → DB +``` + +**Setup:** +```bash +# 1. Create tunnel +cloudflared tunnel create my-db-tunnel + +# 2. Configure hostname in Zero Trust dashboard +# Domain: db-tunnel.example.com +# Service: TCP -> localhost:5432 + +# 3. Create service token (Zero Trust > Service Auth) +# Save Client ID/Secret + +# 4. Create Access app (db-tunnel.example.com) +# Policy: Service Auth token from step 3 + +# 5. 
Create Hyperdrive +npx wrangler hyperdrive create my-private-db \ + --host=db-tunnel.example.com \ + --user=dbuser --password=dbpass --database=prod \ + --access-client-id= --access-client-secret= +``` + +**⚠️ Don't specify `--port` with Tunnel** - port configured in tunnel service settings. + +## Local Dev + +**Option 1: Local (RECOMMENDED):** +```bash +# Env var (takes precedence) +export CLOUDFLARE_HYPERDRIVE_LOCAL_CONNECTION_STRING_HYPERDRIVE="postgres://user:pass@localhost:5432/dev" +npx wrangler dev + +# wrangler.jsonc +{"hyperdrive": [{"binding": "HYPERDRIVE", "localConnectionString": "postgres://..."}]} +``` + +**Remote DB locally:** +```bash +# PostgreSQL +export CLOUDFLARE_HYPERDRIVE_LOCAL_CONNECTION_STRING_HYPERDRIVE="postgres://user:pass@remote:5432/db?sslmode=require" + +# MySQL +export CLOUDFLARE_HYPERDRIVE_LOCAL_CONNECTION_STRING_HYPERDRIVE="mysql://user:pass@remote:3306/db?sslMode=REQUIRED" +``` + +**Option 2: Remote execution:** +```bash +npx wrangler dev --remote # Uses deployed config, affects production +``` + +See [api.md](./api.md), [patterns.md](./patterns.md), [gotchas.md](./gotchas.md). diff --git a/cloudflare/references/hyperdrive/gotchas.md b/cloudflare/references/hyperdrive/gotchas.md new file mode 100644 index 0000000..efa2ead --- /dev/null +++ b/cloudflare/references/hyperdrive/gotchas.md @@ -0,0 +1,77 @@ +# Gotchas + +See [README.md](./README.md), [configuration.md](./configuration.md), [api.md](./api.md), [patterns.md](./patterns.md). 
+ +## Common Errors + +### "Too many open connections" / "Connection limit exceeded" + +**Cause:** Workers have a hard limit of **6 concurrent connections per invocation** +**Solution:** Set `max: 5` in driver config, reuse connections, ensure proper cleanup with `client.end()` or `ctx.waitUntil(conn.end())` + +### "Failed to acquire a connection (Pool exhausted)" + +**Cause:** All connections in pool are in use, often due to long-running transactions +**Solution:** Reduce transaction duration, avoid queries >60s, don't hold connections during external calls, or upgrade to paid plan for more connections + +### "connection_refused" + +**Cause:** Database refusing connections due to firewall, connection limits, or service down +**Solution:** Check firewall allows Cloudflare IPs, verify DB listening on port, confirm service running, and validate credentials + +### "Query timeout (deadline exceeded)" + +**Cause:** Query execution exceeding 60s timeout limit +**Solution:** Optimize with indexes, reduce dataset with LIMIT, break into smaller queries, or use async processing + +### "password authentication failed" + +**Cause:** Invalid credentials in Hyperdrive configuration +**Solution:** Check username and password in Hyperdrive config match database credentials + +### "SSL/TLS connection error" + +**Cause:** SSL/TLS configuration mismatch between Hyperdrive and database +**Solution:** Add `sslmode=require` (Postgres) or `sslMode=REQUIRED` (MySQL), upload CA cert if self-signed, verify DB has SSL enabled, and check cert expiry + +### "Queries not being cached" + +**Cause:** Query is mutating (INSERT/UPDATE/DELETE), contains volatile functions (NOW(), RANDOM()), or caching disabled +**Solution:** Verify query is non-mutating SELECT, avoid volatile functions, confirm caching enabled, use `wrangler dev --remote` to test, and set `prepare=true` for postgres.js + +### "Slow multi-query Workers despite Hyperdrive" + +**Cause:** Worker executing at edge, each query round-trips 
to DB region +**Solution:** Enable Smart Placement (`"placement": {"mode": "smart"}` in wrangler.jsonc) to execute Worker near DB. See [patterns.md](./patterns.md) Multi-Query pattern. + +### "Local database connection failed" + +**Cause:** `localConnectionString` incorrect or database not running +**Solution:** Verify `localConnectionString` correct, check DB running, confirm env var name matches binding, and test with psql/mysql client + +### "Environment variable not working" + +**Cause:** Environment variable format incorrect or not exported +**Solution:** Use format `CLOUDFLARE_HYPERDRIVE_LOCAL_CONNECTION_STRING_`, ensure binding matches wrangler.jsonc, export variable in shell, and restart wrangler dev + +## Limits + +| Limit | Free | Paid | Notes | +|-------|------|------|-------| +| Max configs | 10 | 25 | Hyperdrive configurations per account | +| Worker connections | 6 | 6 | Max concurrent connections per Worker invocation | +| Username/DB name | 63 bytes | 63 bytes | Maximum length | +| Connection timeout | 15s | 15s | Time to establish connection | +| Idle timeout | 10 min | 10 min | Connection idle timeout | +| Max origin connections | ~20 | ~100 | Connections to origin database | +| Query duration max | 60s | 60s | Queries >60s terminated | +| Cached response max | 50 MB | 50 MB | Responses >50MB returned but not cached | + +## Resources + +- [Docs](https://developers.cloudflare.com/hyperdrive/) +- [Getting Started](https://developers.cloudflare.com/hyperdrive/get-started/) +- [Wrangler Reference](https://developers.cloudflare.com/hyperdrive/reference/wrangler-commands/) +- [Supported DBs](https://developers.cloudflare.com/hyperdrive/reference/supported-databases-and-features/) +- [Discord #hyperdrive](https://discord.cloudflare.com) +- [Limit Increase Form](https://forms.gle/ukpeZVLWLnKeixDu7) diff --git a/cloudflare/references/hyperdrive/patterns.md b/cloudflare/references/hyperdrive/patterns.md new file mode 100644 index 0000000..bd794b9 --- 
/dev/null
+++ b/cloudflare/references/hyperdrive/patterns.md
@@ -0,0 +1,190 @@
# Patterns

See [README.md](./README.md), [configuration.md](./configuration.md), [api.md](./api.md).

## High-Traffic Read-Heavy

```typescript
const sql = postgres(env.HYPERDRIVE.connectionString, {max: 5, prepare: true});

// Cacheable: popular content
const posts = await sql`SELECT * FROM posts WHERE published = true ORDER BY views DESC LIMIT 20`;

// Cacheable: user profiles
const [user] = await sql`SELECT id, username, bio FROM users WHERE id = ${userId}`;
```

**Benefits:** Trending/profiles cached (60s), connection pooling handles spikes.

## Mixed Read/Write

```typescript
interface Env {
  HYPERDRIVE_CACHED: Hyperdrive;   // max_age=120
  HYPERDRIVE_REALTIME: Hyperdrive; // caching disabled
}

// Reads: cached
if (req.method === "GET") {
  const sql = postgres(env.HYPERDRIVE_CACHED.connectionString, {prepare: true});
  const products = await sql`SELECT * FROM products WHERE category = ${cat}`;
}

// Writes: no cache (immediate consistency)
if (req.method === "POST") {
  const sql = postgres(env.HYPERDRIVE_REALTIME.connectionString, {prepare: true});
  await sql`INSERT INTO orders ${sql(data)}`;
}
```

## Analytics Dashboard

```typescript
const client = new Client({connectionString: env.HYPERDRIVE.connectionString});
await client.connect();

// Aggregate queries cached. Bucket cutoffs to the day so repeated requests
// send identical parameters and can share cache entries (a raw Date.now()
// changes every millisecond and would never produce a cache hit).
const dayMs = 24 * 60 * 60 * 1000;
const today = Math.floor(Date.now() / dayMs) * dayMs;

const thirtyDaysAgo = new Date(today - 30 * dayMs).toISOString();
const dailyStats = await client.query(`
  SELECT DATE(created_at) as date, COUNT(*) as orders, SUM(amount) as revenue
  FROM orders WHERE created_at >= $1
  GROUP BY DATE(created_at) ORDER BY date DESC
`, [thirtyDaysAgo]);

const sevenDaysAgo = new Date(today - 7 * dayMs).toISOString();
const topProducts = await client.query(`
  SELECT p.name, COUNT(oi.id) as count, SUM(oi.quantity * oi.price) as revenue
  FROM order_items oi JOIN
products p ON oi.product_id = p.id + WHERE oi.created_at >= $1 + GROUP BY p.id, p.name ORDER BY revenue DESC LIMIT 10 +`, [sevenDaysAgo]); +``` + +**Benefits:** Expensive aggregations cached (avoid NOW() for cacheability), dashboard instant, reduced DB load. + +## Multi-Tenant + +```typescript +const tenantId = req.headers.get("X-Tenant-ID"); +const sql = postgres(env.HYPERDRIVE.connectionString, {prepare: true}); + +// Tenant-scoped queries cached separately +const docs = await sql` + SELECT * FROM documents + WHERE tenant_id = ${tenantId} AND deleted_at IS NULL + ORDER BY updated_at DESC LIMIT 50 +`; +``` + +**Benefits:** Per-tenant caching, shared connection pool, protects DB from multi-tenant load. + +## Geographically Distributed + +```typescript +// Worker runs at edge nearest user +// Connection setup at edge (fast), pooling near DB (efficient) +const sql = postgres(env.HYPERDRIVE.connectionString, {prepare: true}); +const [user] = await sql`SELECT * FROM users WHERE id = ${userId}`; + +return Response.json({ + user, + serverRegion: req.cf?.colo, // Edge location +}); +``` + +**Benefits:** Edge setup + DB pooling = global → single-region DB without replication. 
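Both the multi-tenant and geographic patterns lean on the fact that cached entries are keyed by query text plus parameter values, so one tenant's rows never answer another tenant's request. A conceptual sketch of that keying (not Hyperdrive's actual key derivation):

```typescript
// Conceptual only — Hyperdrive's real cache key format is internal.
// The point: identical SQL with different parameters caches separately.
function cacheKey(query: string, params: unknown[]): string {
  return `${query}|${JSON.stringify(params)}`;
}
```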
+ +## Multi-Query + Smart Placement + +For Workers making **multiple queries** per request, enable Smart Placement to execute near DB: + +```jsonc +// wrangler.jsonc +{ + "placement": {"mode": "smart"}, + "hyperdrive": [{"binding": "HYPERDRIVE", "id": ""}] +} +``` + +```typescript +const sql = postgres(env.HYPERDRIVE.connectionString, {prepare: true}); + +// Multiple queries benefit from Smart Placement +const [user] = await sql`SELECT * FROM users WHERE id = ${userId}`; +const orders = await sql`SELECT * FROM orders WHERE user_id = ${userId} ORDER BY created_at DESC LIMIT 10`; +const stats = await sql`SELECT COUNT(*) as total, SUM(amount) as spent FROM orders WHERE user_id = ${userId}`; + +return Response.json({user, orders, stats}); +``` + +**Benefits:** Worker executes near DB → reduces latency for each query. Without Smart Placement, each query round-trips from edge. + +## Connection Pooling + +Operates in **transaction mode**: connection acquired per transaction, `RESET` on return. 
**SET statements:**
```typescript
// ✅ Within transaction
await client.query("BEGIN");
await client.query("SET work_mem = '256MB'");
await client.query("SELECT * FROM large_table"); // Uses SET
await client.query("COMMIT"); // RESET after

// ✅ Single statement
await client.query("SET work_mem = '256MB'; SELECT * FROM large_table");

// ❌ Across queries (may get different connection)
await client.query("SET work_mem = '256MB'");
await client.query("SELECT * FROM large_table"); // SET not applied
```

**Best practices:**
```typescript
// ❌ Long transactions block pooling
await client.query("BEGIN");
await processThousands(); // Connection held entire time
await client.query("COMMIT");

// ✅ Short transactions
await client.query("BEGIN");
await client.query("UPDATE users SET status = $1 WHERE id = $2", [status, id]);
await client.query("COMMIT");

// ✅ SET LOCAL within transaction
await client.query("BEGIN");
await client.query("SET LOCAL work_mem = '256MB'");
await client.query("SELECT * FROM large_table");
await client.query("COMMIT");
```

## Performance Tips

**Enable prepared statements (required for caching):**
```typescript
const sql = postgres(connectionString, {prepare: true}); // Default, enables caching
```

**Optimize connection settings:**
```typescript
const sql = postgres(connectionString, {
  max: 5,             // Stay under Workers' 6 connection limit
  fetch_types: false, // Reduce latency if not using arrays
  idle_timeout: 60,   // Match Worker lifetime
});
```

**Write cache-friendly queries:**
```typescript
// ✅ Cacheable (deterministic)
await sql`SELECT * FROM products WHERE category = 'electronics' LIMIT 10`;

// ❌ Not cacheable (volatile NOW())
await sql`SELECT * FROM logs WHERE created_at > NOW()`;

// ❌ Cacheable in principle, but a fresh Date.now() per request means the
// parameter never repeats, so the cache is never actually hit
const now = Date.now();
await sql`SELECT * FROM logs WHERE created_at > ${now}`;

// ✅ Cache-friendly: round the cutoff to the minute so requests within the
// same minute send identical parameters
const ts = Math.floor(Date.now() / 60_000) * 60_000;
await sql`SELECT * FROM logs WHERE created_at > ${ts}`;
```

See [gotchas.md](./gotchas.md) for limits, troubleshooting.
diff --git a/cloudflare/references/images/README.md b/cloudflare/references/images/README.md new file mode 100644 index 0000000..f1dd644 --- /dev/null +++ b/cloudflare/references/images/README.md @@ -0,0 +1,61 @@ +# Cloudflare Images Skill Reference + +**Cloudflare Images** is an end-to-end image management solution providing storage, transformation, optimization, and delivery at scale via Cloudflare's global network. + +## Quick Decision Tree + +**Need to:** +- **Transform in Worker?** → [api.md](api.md#workers-binding-api-2026-primary-method) (Workers Binding API) +- **Upload from Worker?** → [api.md](api.md#upload-from-worker) (REST API) +- **Upload from client?** → [patterns.md](patterns.md#upload-from-client-direct-creator-upload) (Direct Creator Upload) +- **Set up variants?** → [configuration.md](configuration.md#variants-configuration) +- **Serve responsive images?** → [patterns.md](patterns.md#responsive-images) +- **Add watermarks?** → [patterns.md](patterns.md#watermarking) +- **Fix errors?** → [gotchas.md](gotchas.md#common-errors) + +## Reading Order + +**For building image upload/transform feature:** +1. [configuration.md](configuration.md) - Setup Workers binding +2. [api.md](api.md#workers-binding-api-2026-primary-method) - Learn transform API +3. [patterns.md](patterns.md#upload-from-client-direct-creator-upload) - Direct upload pattern +4. [gotchas.md](gotchas.md) - Check limits and errors + +**For URL-based transforms:** +1. [configuration.md](configuration.md#variants-configuration) - Create variants +2. [api.md](api.md#url-transform-api) - URL syntax +3. [patterns.md](patterns.md#responsive-images) - Responsive patterns + +**For troubleshooting:** +1. [gotchas.md](gotchas.md#common-errors) - Error messages +2. 
[gotchas.md](gotchas.md#limits) - Size/format limits + +## Core Methods + +| Method | Use Case | Location | +|--------|----------|----------| +| `env.IMAGES.input().transform()` | Transform in Worker | [api.md:11](api.md) | +| REST API `/images/v1` | Upload images | [api.md:57](api.md) | +| Direct Creator Upload | Client-side upload | [api.md:127](api.md) | +| URL transforms | Static image delivery | [api.md:112](api.md) | + +## In This Reference + +- **[api.md](api.md)** - Complete API: Workers binding, REST endpoints, URL transforms +- **[configuration.md](configuration.md)** - Setup: wrangler.toml, variants, auth, signed URLs +- **[patterns.md](patterns.md)** - Patterns: responsive images, watermarks, format negotiation, caching +- **[gotchas.md](gotchas.md)** - Troubleshooting: limits, errors, best practices + +## Key Features + +- **Automatic Optimization** - AVIF/WebP format negotiation +- **On-the-fly Transforms** - Resize, crop, blur, sharpen via URL or API +- **Workers Binding** - Transform images in Workers (2026 primary method) +- **Direct Upload** - Secure client-side uploads without backend proxy +- **Global Delivery** - Cached at 300+ Cloudflare data centers +- **Watermarking** - Overlay images programmatically + +## See Also + +- [Official Docs](https://developers.cloudflare.com/images/) +- [Workers Examples](https://developers.cloudflare.com/images/tutorials/) diff --git a/cloudflare/references/images/api.md b/cloudflare/references/images/api.md new file mode 100644 index 0000000..c172e22 --- /dev/null +++ b/cloudflare/references/images/api.md @@ -0,0 +1,96 @@ +# API Reference + +## Workers Binding API + +```toml +# wrangler.toml +[images] +binding = "IMAGES" +``` + +### Transform Images + +```typescript +const imageResponse = await env.IMAGES + .input(fileBuffer) + .transform({ width: 800, height: 600, fit: "cover", quality: 85, format: "avif" }) + .output(); +return imageResponse.response(); +``` + +### Transform Options + +```typescript 
+interface TransformOptions { + width?: number; height?: number; + fit?: "scale-down" | "contain" | "cover" | "crop" | "pad"; + quality?: number; // 1-100 + format?: "avif" | "webp" | "jpeg" | "png"; + dpr?: number; // 1-3 + gravity?: "auto" | "left" | "right" | "top" | "bottom" | "face" | string; + sharpen?: number; // 0-10 + blur?: number; // 1-250 + rotate?: 90 | 180 | 270; + background?: string; // CSS color for pad + metadata?: "none" | "copyright" | "keep"; + brightness?: number; contrast?: number; gamma?: number; // 0-2 +} +``` + +### Draw/Watermark + +```typescript +await env.IMAGES.input(baseImage) + .draw(env.IMAGES.input(watermark).transform({ width: 100 }), { top: 10, left: 10, opacity: 0.8 }) + .output(); +``` + +## REST API + +### Upload Image + +```bash +curl -X POST https://api.cloudflare.com/client/v4/accounts/{account_id}/images/v1 \ + -H "Authorization: Bearer {token}" -F file=@image.jpg -F metadata='{"key":"value"}' +``` + +### Other Operations + +```bash +GET /accounts/{account_id}/images/v1/{image_id} # Get details +DELETE /accounts/{account_id}/images/v1/{image_id} # Delete +GET /accounts/{account_id}/images/v1?page=1 # List +``` + +## URL Transform API + +``` +https://imagedelivery.net/{hash}/{id}/width=800,height=600,fit=cover,format=avif +``` + +**Params:** `w=`, `h=`, `fit=`, `q=`, `f=`, `dpr=`, `gravity=`, `sharpen=`, `blur=`, `rotate=`, `background=`, `metadata=` + +## Direct Creator Upload + +```typescript +// 1. Get upload URL (backend) +const { result } = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/images/v2/direct_upload`, + { method: 'POST', headers: { 'Authorization': `Bearer ${token}` }, + body: JSON.stringify({ requireSignedURLs: false }) } +).then(r => r.json()); + +// 2. 
Client uploads to result.uploadURL
const formData = new FormData();
formData.append('file', file);
await fetch(result.uploadURL, { method: 'POST', body: formData });
```

## Error Codes

| Code | Message | Solution |
|------|---------|----------|
| 5400 | Invalid format | Use JPEG, PNG, GIF, WebP |
| 5401 | Too large | Max 100MB |
| 5403 | Invalid transform | Check params |
| 9413 | Rate limit | Implement backoff |

diff --git a/cloudflare/references/images/configuration.md b/cloudflare/references/images/configuration.md
new file mode 100644
index 0000000..9fa2deb
--- /dev/null
+++ b/cloudflare/references/images/configuration.md
@@ -0,0 +1,211 @@
# Configuration

## Wrangler Integration

### Workers Binding Setup

Add to `wrangler.toml`:

```toml
name = "my-image-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[images]
binding = "IMAGES"
```

Access in Worker:

```typescript
interface Env {
  IMAGES: ImageBinding;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // .output() returns a promise — await it before calling .response()
    const output = await env.IMAGES
      .input(imageBuffer)
      .transform({ width: 800 })
      .output();
    return output.response();
  }
};
```

### Upload via Script

Wrangler doesn't have built-in Images commands; use the REST API:

```typescript
// scripts/upload-image.ts (Node 18+: built-in fetch and FormData)
import fs from 'fs';

async function uploadImage(filePath: string) {
  const accountId = process.env.CLOUDFLARE_ACCOUNT_ID!;
  const apiToken = process.env.CLOUDFLARE_API_TOKEN!;

  const formData = new FormData();
  formData.append('file', new Blob([fs.readFileSync(filePath)]), filePath);

  const response = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/images/v1`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiToken}`,
      },
      body: formData,
    }
  );

  const result = await response.json();
  console.log('Uploaded:', result);
}

uploadImage('./photo.jpg');
```

### Environment Variables

Store account
hash for URL construction: + +```toml +[vars] +IMAGES_ACCOUNT_HASH = "your-account-hash" +ACCOUNT_ID = "your-account-id" +``` + +Access in Worker: + +```typescript +const imageUrl = `https://imagedelivery.net/${env.IMAGES_ACCOUNT_HASH}/${imageId}/public`; +``` + +## Variants Configuration + +Variants are named presets for transformations. + +### Create Variant (Dashboard) + +1. Navigate to Images → Variants +2. Click "Create Variant" +3. Set name (e.g., `thumbnail`) +4. Configure: `width=200,height=200,fit=cover` + +### Create Variant (API) + +```bash +curl -X POST \ + https://api.cloudflare.com/client/v4/accounts/{account_id}/images/v1/variants \ + -H "Authorization: Bearer {api_token}" \ + -H "Content-Type: application/json" \ + -d '{ + "id": "thumbnail", + "options": { + "width": 200, + "height": 200, + "fit": "cover" + }, + "neverRequireSignedURLs": true + }' +``` + +### Use Variant + +``` +https://imagedelivery.net/{account_hash}/{image_id}/thumbnail +``` + +### Common Variant Presets + +```json +{ + "thumbnail": { + "width": 200, + "height": 200, + "fit": "cover" + }, + "avatar": { + "width": 128, + "height": 128, + "fit": "cover", + "gravity": "face" + }, + "hero": { + "width": 1920, + "height": 1080, + "fit": "cover", + "quality": 90 + }, + "mobile": { + "width": 640, + "fit": "scale-down", + "quality": 80, + "format": "avif" + } +} +``` + +## Authentication + +### API Token (Recommended) + +Generate at: Dashboard → My Profile → API Tokens + +Required permissions: +- Account → Cloudflare Images → Edit + +```bash +curl -H "Authorization: Bearer {api_token}" \ + https://api.cloudflare.com/client/v4/accounts/{account_id}/images/v1 +``` + +### API Key (Legacy) + +```bash +curl -H "X-Auth-Email: {email}" \ + -H "X-Auth-Key: {api_key}" \ + https://api.cloudflare.com/client/v4/accounts/{account_id}/images/v1 +``` + +## Signed URLs + +For private images, enable signed URLs: + +```bash +# Upload with signed URLs required +curl -X POST \ + 
https://api.cloudflare.com/client/v4/accounts/{account_id}/images/v1 \
  -H "Authorization: Bearer {api_token}" \
  -F file=@private.jpg \
  -F requireSignedURLs=true
```

Generate signed URL:

```typescript
import { createHmac } from 'crypto';

function signUrl(imageId: string, variant: string, expiry: number, key: string): string {
  const path = `/${imageId}/${variant}`;
  const toSign = `${path}${expiry}`;
  const signature = createHmac('sha256', key)
    .update(toSign)
    .digest('hex');

  return `https://imagedelivery.net/{hash}${path}?exp=${expiry}&sig=${signature}`;
}

// Sign URL valid for 1 hour — `exp` is a Unix timestamp in seconds, not milliseconds
const expiry = Math.floor(Date.now() / 1000) + 3600;
const signedUrl = signUrl('image-id', 'public', expiry, env.SIGNING_KEY);
```

## Local Development

```bash
npx wrangler dev --remote
```

Must use `--remote` for Images binding access.

diff --git a/cloudflare/references/images/gotchas.md b/cloudflare/references/images/gotchas.md
new file mode 100644
index 0000000..6f52455
--- /dev/null
+++ b/cloudflare/references/images/gotchas.md
@@ -0,0 +1,99 @@
# Gotchas & Best Practices

## Fit Modes

| Mode | Best For | Behavior |
|------|----------|----------|
| `cover` | Hero images, thumbnails | Fills space, crops excess |
| `contain` | Product images, artwork | Preserves full image, may add padding |
| `scale-down` | User uploads | Never enlarges |
| `crop` | Precise crops | Uses gravity |
| `pad` | Fixed aspect ratio | Adds background |

## Format Selection

```typescript
format: 'auto' // Recommended - negotiates best format
```

**Support:** AVIF (Chrome 85+, Firefox 93+, Safari 16.4+), WebP (Chrome 23+, Firefox 65+, Safari 14+)

## Quality Settings

| Use Case | Quality |
|----------|---------|
| Thumbnails | 75-80 |
| Standard | 85 (default) |
| High-quality | 90-95 |

## Common Errors

### 5403: "Image transformation failed"
- Verify `width`/`height` ≤ 12000
- Check `quality` 1-100, `dpr` 1-3
- Don't combine incompatible options

### 9413: "Rate
limit exceeded" +Implement caching and exponential backoff: +```typescript +for (let i = 0; i < 3; i++) { + try { return await env.IMAGES.input(buffer).transform({...}).output(); } + catch { await new Promise(r => setTimeout(r, 2 ** i * 1000)); } +} +throw new Error('Transform failed after retries'); // surface the failure instead of falling through +``` + +### 5401: "Image too large" +Pre-process images before upload (max 100MB, 12000×12000px) + +### 5400: "Invalid image format" +Supported: JPEG, PNG, GIF, WebP, AVIF, SVG + +### 401/403: "Unauthorized" +Verify API token has `Cloudflare Images → Edit` permission + +## Limits + +| Resource | Limit | +|----------|-------| +| Max input size | 100MB | +| Max dimensions | 12000×12000px | +| Quality range | 1-100 | +| DPR range | 1-3 | +| API rate limit | ~1200 req/min | + +## AVIF Gotchas + +- **Slower encoding**: First request may have higher latency +- **Browser detection**: +```typescript +const format = /image\/avif/.test(request.headers.get('Accept') || '') ? 'avif' : 'webp'; +``` + +## Anti-Patterns + +```typescript +// ❌ No caching - transforms every request +return (await env.IMAGES.input(buffer).transform({...}).output()).response(); + +// ❌ cover without both dimensions +transform({ width: 800, fit: 'cover' }) + +// ✅ Always set both for cover +transform({ width: 800, height: 600, fit: 'cover' }) + +// ❌ Exposes API token to client +// ✅ Use Direct Creator Upload (patterns.md) +``` + +## Debugging + +```typescript +// Check response headers +console.log('Content-Type:', response.headers.get('Content-Type')); + +// Test with curl +// curl -I "https://imagedelivery.net/{hash}/{id}/width=800,format=avif" + +// Monitor logs +// npx wrangler tail +``` diff --git a/cloudflare/references/images/patterns.md b/cloudflare/references/images/patterns.md new file mode 100644 index 0000000..c07bf3c --- /dev/null +++ b/cloudflare/references/images/patterns.md @@ -0,0 +1,115 @@ +# Common Patterns + +## URL Transform Options + +``` +width= height= fit=scale-down|contain|cover|crop|pad +quality=85 format=auto|webp|avif|jpeg|png dpr=2 
+gravity=auto|face|left|right|top|bottom sharpen=2 blur=10 +rotate=90|180|270 background=white metadata=none|copyright|keep +``` + +## Responsive Images (srcset) + +```html +<img + src="https://imagedelivery.net/{account_hash}/{image_id}/width=800" + srcset="https://imagedelivery.net/{account_hash}/{image_id}/width=400 400w, + https://imagedelivery.net/{account_hash}/{image_id}/width=800 800w, + https://imagedelivery.net/{account_hash}/{image_id}/width=1200 1200w" + sizes="(max-width: 640px) 400px, 800px" + alt="Responsive image"> +``` + +## Format Negotiation + +```typescript +async fetch(request: Request, env: Env): Promise<Response> { + const accept = request.headers.get('Accept') || ''; + const format = /image\/avif/.test(accept) ? 'avif' : /image\/webp/.test(accept) ? 'webp' : 'jpeg'; + return (await env.IMAGES.input(buffer).transform({ format, quality: 85 }).output()).response(); +} +``` + +## Direct Creator Upload + +```typescript +// Backend: Generate upload URL +const response = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${env.ACCOUNT_ID}/images/v2/direct_upload`, + { method: 'POST', headers: { 'Authorization': `Bearer ${env.API_TOKEN}` }, + body: JSON.stringify({ requireSignedURLs: false, metadata: { userId } }) } +); +const { result } = await response.json(); // result contains id and uploadURL + +// Frontend: Upload to returned uploadURL +const formData = new FormData(); +formData.append('file', file); +await fetch(result.uploadURL, { method: 'POST', body: formData }); +// Use: https://imagedelivery.net/{hash}/${result.id}/public +``` + +## Transform & Store to R2 + +```typescript +async fetch(request: Request, env: Env): Promise<Response> { + const file = (await request.formData()).get('image') as File; + const transformed = await env.IMAGES + .input(await file.arrayBuffer()) + .transform({ width: 800, format: 'avif', quality: 80 }) + .output(); + await env.R2.put(`images/${Date.now()}.avif`, transformed.response().body); + return Response.json({ success: true }); +} +``` + +## Watermarking + +```typescript +const watermark = await env.ASSETS.fetch(new URL('/watermark.png', request.url)); +const result = await env.IMAGES + .input(await image.arrayBuffer()) + .draw(env.IMAGES.input(watermark.body).transform({ width: 100 }), { bottom: 20, right: 20, opacity: 0.7 }) + .transform({ format: 'avif' }) + .output(); +return result.response(); +``` + +## Device-Based Transforms + +```typescript +const 
ua = request.headers.get('User-Agent') || ''; +const isMobile = /Mobile|Android|iPhone/i.test(ua); +return (await env.IMAGES.input(buffer) + .transform({ width: isMobile ? 400 : 1200, quality: isMobile ? 75 : 85, format: 'avif' }) + .output()).response(); +``` + +## Caching Strategy + +```typescript +async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> { + const cache = caches.default; + let response = await cache.match(request); + if (!response) { + response = (await env.IMAGES.input(buffer).transform({ width: 800, format: 'avif' }).output()).response(); + const headers = new Headers(response.headers); // spreading a Headers object would yield {} + headers.set('Cache-Control', 'public, max-age=86400'); + response = new Response(response.body, { status: response.status, headers }); + ctx.waitUntil(cache.put(request, response.clone())); + } + return response; +} +``` + +## Batch Processing + +```typescript +const results = await Promise.all(images.map(buffer => + env.IMAGES.input(buffer).transform({ width: 800, fit: 'cover', format: 'avif' }).output() +)); +``` + +## Error Handling + +```typescript +try { + return (await env.IMAGES.input(buffer).transform({ width: 800 }).output()).response(); +} catch (error) { + console.error('Transform failed:', error); + return new Response('Image processing failed', { status: 500 }); +} +``` diff --git a/cloudflare/references/kv/README.md b/cloudflare/references/kv/README.md new file mode 100644 index 0000000..9e43e01 --- /dev/null +++ b/cloudflare/references/kv/README.md @@ -0,0 +1,89 @@ +# Cloudflare Workers KV + +Globally-distributed, eventually-consistent key-value store optimized for high read volume and low latency. 
+ +## Overview + +KV provides: +- Eventual consistency (60s global propagation) +- Read-optimized performance +- 25 MiB value limit per key +- Auto-replication to Cloudflare edge +- Metadata support (1024 bytes) + +**Use cases:** Config storage, user sessions, feature flags, caching, A/B testing + +## When to Use KV + +| Need | Recommendation | +|------|----------------| +| Strong consistency | → [Durable Objects](../durable-objects/) | +| SQL queries | → [D1](../d1/) | +| Object storage (files) | → [R2](../r2/) | +| High read, low write volume | → KV ✅ | +| Sub-10ms global reads | → KV ✅ | + +**Quick comparison:** + +| Feature | KV | D1 | Durable Objects | +|---------|----|----|-----------------| +| Consistency | Eventual | Strong | Strong | +| Read latency | <10ms | ~50ms | <1ms | +| Write limit | 1/s per key | Unlimited | Unlimited | +| Use case | Config, cache | Relational data | Coordination | + +## Quick Start + +```bash +wrangler kv namespace create MY_NAMESPACE +# Add binding to wrangler.jsonc +``` + +```typescript +// Write +await env.MY_KV.put("key", "value", { expirationTtl: 300 }); + +// Read +const value = await env.MY_KV.get("key"); +const json = await env.MY_KV.get("config", "json"); +``` + +## Core Operations + +| Method | Purpose | Returns | +|--------|---------|---------| +| `get(key, type?)` | Single read | `string \| null` | +| `get(keys, type?)` | Bulk read (≤100) | `Map<string, string \| null>` | +| `put(key, value, options?)` | Write | `Promise<void>` | +| `delete(key)` | Delete | `Promise<void>` | +| `list(options?)` | List keys | `{ keys, list_complete, cursor? 
}` | +| `getWithMetadata(key)` | Get + metadata | `{ value, metadata }` | + +## Consistency Model + +- **Write visibility:** Immediate in same location, ≤60s globally +- **Read path:** Eventually consistent +- **Write rate:** 1 write/second per key (429 on exceed) + +## Reading Order + +| Task | Files to Read | +|------|---------------| +| Quick start | README → configuration.md | +| Implement feature | README → api.md → patterns.md | +| Debug issues | gotchas.md → api.md | +| Batch operations | api.md (bulk section) → patterns.md | +| Performance tuning | gotchas.md (performance) → patterns.md (caching) | + +## In This Reference + +- [configuration.md](./configuration.md) - wrangler.jsonc setup, namespace creation, TypeScript types +- [api.md](./api.md) - KV methods, bulk operations, cacheTtl, content types +- [patterns.md](./patterns.md) - Caching, sessions, rate limiting, A/B testing +- [gotchas.md](./gotchas.md) - Eventual consistency, concurrent writes, value limits + +## See Also + +- [workers](../workers/) - Worker runtime for KV access +- [d1](../d1/) - Use D1 for strong consistency needs +- [durable-objects](../durable-objects/) - Strongly consistent alternative diff --git a/cloudflare/references/kv/api.md b/cloudflare/references/kv/api.md new file mode 100644 index 0000000..35063f2 --- /dev/null +++ b/cloudflare/references/kv/api.md @@ -0,0 +1,160 @@ +# KV API Reference + +## Read Operations + +```typescript +// Single key (string) +const value = await env.MY_KV.get("user:123"); + +// JSON type (auto-parsed) +const config = await env.MY_KV.get("config", "json"); + +// ArrayBuffer for binary +const buffer = await env.MY_KV.get("image", "arrayBuffer"); + +// Stream for large values +const stream = await env.MY_KV.get("large-file", "stream"); + +// With cache TTL (min 60s) +const value = await env.MY_KV.get("key", { type: "text", cacheTtl: 300 }); + +// Bulk get (max 100 keys, counts as 1 operation) +const keys = ["user:1", "user:2", "user:3", 
"missing:key"]; +const results = await env.MY_KV.get(keys); +// Returns Map<string, string | null> + +console.log(results.get("user:1")); // "John" (if exists) +console.log(results.get("missing:key")); // null + +// Process results with null handling +for (const [key, value] of results) { + if (value !== null) { + // Handle found keys + console.log(`${key}: ${value}`); + } +} + +// TypeScript with generics (type-safe JSON parsing) +interface UserProfile { name: string; email: string; } +const profile = await env.USERS.get<UserProfile>("user:123", "json"); +// profile is typed as UserProfile | null +if (profile) { + console.log(profile.name); // Type-safe access +} + +// Bulk get with type +const configs = await env.MY_KV.get(["config:app", "config:feature"], "json"); +// Map<string, any | null> +``` + +## Write Operations + +```typescript +// Basic put +await env.MY_KV.put("key", "value"); +await env.MY_KV.put("config", JSON.stringify({ theme: "dark" })); + +// With expiration (UNIX timestamp) +await env.MY_KV.put("session", token, { + expiration: Math.floor(Date.now() / 1000) + 3600 +}); + +// With TTL (seconds from now, min 60) +await env.MY_KV.put("cache", data, { expirationTtl: 300 }); + +// With metadata (max 1024 bytes) +await env.MY_KV.put("user:profile", userData, { + metadata: { version: 2, lastUpdated: Date.now() } +}); + +// Combined +await env.MY_KV.put("temp", value, { + expirationTtl: 3600, + metadata: { temporary: true } +}); +``` + +## Get with Metadata + +```typescript +// Single key +const result = await env.MY_KV.getWithMetadata("user:profile"); +// { value: string | null, metadata: any | null } + +if (result.value && result.metadata) { + const { version, lastUpdated } = result.metadata; +} + +// Multiple keys (bulk) +const keys = ["key1", "key2", "key3"]; +const results = await env.MY_KV.getWithMetadata(keys); +// Returns Map<string, { value: string | null, metadata: any | null }> + +for (const [key, result] of results) { + if (result.value) { + console.log(`${key}: ${result.value}`); + console.log(`Metadata: ${JSON.stringify(result.metadata)}`); + // 
cacheStatus field indicates cache hit/miss (when available) + } +} + +// With type +const result = await env.MY_KV.getWithMetadata<UserData>("user:123", "json"); +// result: { value: UserData | null, metadata: any | null, cacheStatus?: string } +``` + +## Delete Operations + +```typescript +await env.MY_KV.delete("key"); // Always succeeds (even if key missing) +``` + +## List Operations + +```typescript +// List all +const keys = await env.MY_KV.list(); +// { keys: [...], list_complete: boolean, cursor?: string } + +// With prefix +const userKeys = await env.MY_KV.list({ prefix: "user:" }); + +// Pagination (track completion outside the loop so `result` stays in scope) +let cursor: string | undefined; +const allKeys = []; +let listComplete = false; +do { + const result = await env.MY_KV.list({ cursor, limit: 1000 }); + allKeys.push(...result.keys); + listComplete = result.list_complete; + if (!listComplete) cursor = result.cursor; +} while (!listComplete); +``` + +## Performance Considerations + +### Type Selection + +| Type | Use Case | Performance | +|------|----------|-------------| +| `stream` | Large values (>1MB) | Fastest - no buffering | +| `arrayBuffer` | Binary data | Fast - single allocation | +| `text` | String values | Medium | +| `json` | Objects (parse overhead) | Slowest - parsing cost | + +### Parallel Reads + +```typescript +// Efficient parallel reads with Promise.all() +const [user, settings, cache] = await Promise.all([ + env.USERS.get("user:123", "json"), + env.SETTINGS.get("config:app", "json"), + env.CACHE.get("data:latest") +]); +``` + +## Error Handling + +- **Missing keys:** Return `null` (not an error) +- **Rate limit (429):** Retry with exponential backoff (see gotchas.md) +- **Response too large (413):** Values >25MB fail with 413 error + +See [gotchas.md](./gotchas.md) for detailed error patterns and solutions. 
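Bulk reads are capped at 100 keys per call and each Worker invocation has a 1,000-operation budget, so larger key sets need chunking. A minimal sketch of that pattern (the `chunkKeys`/`bulkGetAll` helper names and the `BulkKV` interface are illustrative, not part of the KV API):

```typescript
// Minimal structural type for the bulk-get surface used below (illustrative).
interface BulkKV {
  get(keys: string[]): Promise<Map<string, string | null>>;
}

// Split a key list into groups that fit the 100-key bulk-get limit.
function chunkKeys(keys: string[], size = 100): string[][] {
  const chunks: string[][] = [];
  for (let i = 0; i < keys.length; i += size) {
    chunks.push(keys.slice(i, i + size));
  }
  return chunks;
}

// One bulk operation per chunk, merged into a single Map.
async function bulkGetAll(kv: BulkKV, keys: string[]): Promise<Map<string, string | null>> {
  const merged = new Map<string, string | null>();
  for (const chunk of chunkKeys(keys)) {
    const results = await kv.get(chunk); // each chunk counts as 1 operation
    for (const [k, v] of results) merged.set(k, v);
  }
  return merged;
}
```

In a Worker you would pass `env.MY_KV` directly, since `KVNamespace.get` already accepts a key array.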
diff --git a/cloudflare/references/kv/configuration.md b/cloudflare/references/kv/configuration.md new file mode 100644 index 0000000..0aefa5f --- /dev/null +++ b/cloudflare/references/kv/configuration.md @@ -0,0 +1,144 @@ +# KV Configuration + +## Create Namespace + +```bash +wrangler kv namespace create MY_NAMESPACE +# Output: { binding = "MY_NAMESPACE", id = "abc123..." } + +wrangler kv namespace create MY_NAMESPACE --preview # For local dev +``` + +## Workers Binding + +**wrangler.jsonc:** +```jsonc +{ + "kv_namespaces": [ + { + "binding": "MY_KV", + "id": "abc123xyz789", + // Optional: separate namespace ID for preview/development + "preview_id": "preview-abc123" + } + ] +} +``` + +## TypeScript Types + +**env.d.ts:** +```typescript +interface Env { + MY_KV: KVNamespace; + SESSIONS: KVNamespace; + CACHE: KVNamespace; +} +``` + +**worker.ts:** +```typescript +export default { + async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> { + // env.MY_KV is now typed as KVNamespace + const value = await env.MY_KV.get("key"); + return new Response(value || "Not found"); + } +} satisfies ExportedHandler<Env>; +``` + +**Type-safe JSON operations:** +```typescript +interface UserProfile { + name: string; + email: string; + role: "admin" | "user"; +} + +const profile = await env.USERS.get<UserProfile>("user:123", "json"); +// profile: UserProfile | null (type-safe!) 
+if (profile) { + console.log(profile.name); // TypeScript knows this is a string +} +``` + +## CLI Operations + +```bash +# Put +wrangler kv key put --binding=MY_KV "key" "value" +wrangler kv key put --binding=MY_KV "key" --path=./file.json --ttl=3600 + +# Get +wrangler kv key get --binding=MY_KV "key" + +# Delete +wrangler kv key delete --binding=MY_KV "key" + +# List +wrangler kv key list --binding=MY_KV --prefix="user:" + +# Bulk operations (max 10,000 keys per file) +wrangler kv bulk put data.json --binding=MY_KV +wrangler kv bulk get keys.json --binding=MY_KV +wrangler kv bulk delete keys.json --binding=MY_KV --force +``` + +## Local Development + +```bash +wrangler dev # Local KV (isolated) +wrangler dev --remote # Remote KV (production) + +# Or in wrangler.jsonc: +# "kv_namespaces": [{ "binding": "MY_KV", "id": "...", "remote": true }] +``` + +## REST API + +### Single Operations + +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ + apiEmail: process.env.CLOUDFLARE_EMAIL, + apiKey: process.env.CLOUDFLARE_API_KEY +}); + +// Single key operations +await client.kv.namespaces.values.update(namespaceId, 'key', { + account_id: accountId, + value: 'value', + expiration_ttl: 3600 +}); +``` + +### Bulk Operations + +```typescript +// Bulk update (up to 10,000 keys, max 100MB total) +await client.kv.namespaces.bulkUpdate(namespaceId, { + account_id: accountId, + body: [ + { key: "key1", value: "value1", expiration_ttl: 3600 }, + { key: "key2", value: "value2", metadata: { version: 1 } }, + { key: "key3", value: "value3" } + ] +}); + +// Bulk get (up to 100 keys) +const results = await client.kv.namespaces.bulkGet(namespaceId, { + account_id: accountId, + keys: ["key1", "key2", "key3"] +}); + +// Bulk delete (up to 10,000 keys) +await client.kv.namespaces.bulkDelete(namespaceId, { + account_id: accountId, + keys: ["key1", "key2", "key3"] +}); +``` diff --git a/cloudflare/references/kv/gotchas.md 
b/cloudflare/references/kv/gotchas.md new file mode 100644 index 0000000..5ad3213 --- /dev/null +++ b/cloudflare/references/kv/gotchas.md @@ -0,0 +1,131 @@ +# KV Gotchas & Troubleshooting + +## Common Errors + +### "Stale Read After Write" + +**Cause:** Eventual consistency means writes may not be immediately visible in other regions +**Solution:** Don't read immediately after write; return a confirmation without reading, or use the local value you just wrote. Writes are visible immediately in the same location and within ≤60s globally + +```typescript +// ❌ BAD: Read immediately after write +await env.KV.put("key", "value"); +const value = await env.KV.get("key"); // May be null in other regions! + +// ✅ GOOD: Use the value you just wrote +const newValue = "value"; +await env.KV.put("key", newValue); +return new Response(newValue); // Don't re-read +``` + +### "429 Rate Limit on Concurrent Writes" + +**Cause:** Multiple concurrent writes to same key exceeding 1 write/second limit +**Solution:** Use sequential writes, unique keys for concurrent operations, or implement retry with exponential backoff + +```typescript +async function putWithRetry( + kv: KVNamespace, + key: string, + value: string, + maxAttempts = 5 +): Promise<void> { + let delay = 1000; + for (let i = 0; i < maxAttempts; i++) { + try { + await kv.put(key, value); + return; + } catch (err) { + if (err instanceof Error && err.message.includes("429")) { + if (i === maxAttempts - 1) throw err; + await new Promise(r => setTimeout(r, delay)); + delay *= 2; // Exponential backoff + } else { + throw err; + } + } + } +} +``` + +### "Inefficient Multiple Gets" + +**Cause:** Making multiple individual get() calls instead of bulk operation +**Solution:** Use bulk get with array of keys: `env.USERS.get(["user:1", "user:2", "user:3"])` to reduce to 1 operation + +### "Null Reference Error" + +**Cause:** Attempting to use value without checking for null when key doesn't exist +**Solution:** Always handle null returns - KV returns `null` for 
missing keys, not undefined + +```typescript +// ❌ BAD: Assumes value exists +const config = await env.KV.get("config", "json"); +return config.theme; // TypeError if null! + +// ✅ GOOD: Null checks +const config = await env.KV.get("config", "json"); +return config?.theme ?? "default"; + +// ✅ GOOD: Early return +const config = await env.KV.get("config", "json"); +if (!config) return new Response("Not found", { status: 404 }); +return new Response(config.theme); +``` + +### "Negative Lookup Caching" + +**Cause:** Keys that don't exist are cached as "not found" for up to 60s +**Solution:** Creating a key after checking won't be visible until cache expires + +```typescript +// Check → create pattern has race condition +const exists = await env.KV.get("key"); // null, cached as "not found" +if (!exists) { + await env.KV.put("key", "value"); + // Next get() may still return null for ~60s due to negative cache +} + +// Alternative: Always assume key may not exist, use defaults +const value = await env.KV.get("key") ?? 
"default-value"; +``` + +## Performance Tips + +| Scenario | Recommendation | Why | +|----------|----------------|-----| +| Large values (>1MB) | Use `stream` type | Avoids buffering entire value in memory | +| Many small keys | Coalesce into one JSON object | Reduces operations, improves cache hit rate | +| High write volume | Spread across different keys | Avoid 1 write/second per-key limit | +| Cold reads | Increase `cacheTtl` parameter | Reduces latency for frequently-read data | +| Bulk operations | Use array form of get() | Single operation, better performance | + +## Cost Examples + +**Free tier:** +- 100K reads/day = 3M/month ✅ +- 1K writes/day = 30K/month ✅ +- 1GB storage ✅ + +**Example paid workload:** +- 10M reads/month = $5.00 +- 100K writes/month = $0.50 +- 1GB storage = $0.50 +- **Total: ~$6/month** + +## Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Key size | 512 bytes | Maximum key length | +| Value size | 25 MiB | Maximum value; 413 error if exceeded | +| Metadata size | 1024 bytes | Maximum metadata per key | +| cacheTtl minimum | 60s | Minimum cache TTL | +| Write rate per key | 1 write/second | All plans; 429 error if exceeded | +| Propagation time | ≤60s | Global propagation time | +| Bulk get max | 100 keys | Maximum keys per bulk operation | +| Operations per Worker | 1,000 | Per request (bulk counts as 1) | +| Reads pricing | $0.50 per 1M | Per million reads | +| Writes pricing | $5.00 per 1M | Per million writes | +| Deletes pricing | $5.00 per 1M | Per million deletes | +| Storage pricing | $0.50 per GB-month | Per GB per month | diff --git a/cloudflare/references/kv/patterns.md b/cloudflare/references/kv/patterns.md new file mode 100644 index 0000000..8386074 --- /dev/null +++ b/cloudflare/references/kv/patterns.md @@ -0,0 +1,196 @@ +# KV Patterns & Best Practices + +## Multi-Tier Caching + +```typescript +// Memory → KV → Origin (3-tier cache) +const memoryCache = new Map<string, { data: any; expires: number }>(); + +async function getCached(env: Env, 
key: string): Promise<any> { + const now = Date.now(); + + // L1: Memory cache (fastest) + const cached = memoryCache.get(key); + if (cached && cached.expires > now) { + return cached.data; + } + + // L2: KV cache (fast) + const kvValue = await env.CACHE.get(key, "json"); + if (kvValue) { + memoryCache.set(key, { data: kvValue, expires: now + 60000 }); // 1min in memory + return kvValue; + } + + // L3: Origin (slow) + const origin = await fetch(`https://api.example.com/${key}`).then(r => r.json()); + + // Backfill caches + await env.CACHE.put(key, JSON.stringify(origin), { expirationTtl: 300 }); // 5min in KV + memoryCache.set(key, { data: origin, expires: now + 60000 }); + + return origin; +} +``` + +## API Response Caching + +```typescript +async function getCachedData<T>(env: Env, key: string, fetcher: () => Promise<T>): Promise<T> { + const cached = await env.MY_KV.get<T>(key, "json"); + if (cached) return cached; + + const data = await fetcher(); + await env.MY_KV.put(key, JSON.stringify(data), { expirationTtl: 300 }); + return data; +} + +const apiData = await getCachedData( + env, + "cache:users", + () => fetch("https://api.example.com/users").then(r => r.json()) +); +``` + +## Session Management + +```typescript +interface Session { userId: string; expiresAt: number; } + +async function createSession(env: Env, userId: string): Promise<string> { + const sessionId = crypto.randomUUID(); + const expiresAt = Date.now() + (24 * 60 * 60 * 1000); + + await env.SESSIONS.put( + `session:${sessionId}`, + JSON.stringify({ userId, expiresAt }), + { expirationTtl: 86400, metadata: { createdAt: Date.now() } } + ); + + return sessionId; +} + +async function getSession(env: Env, sessionId: string): Promise<Session | null> { + const data = await env.SESSIONS.get<Session>(`session:${sessionId}`, "json"); + if (!data || data.expiresAt < Date.now()) return null; + return data; +} +``` + +## Coalesce Cold Keys + +```typescript +// ❌ BAD: Many individual keys +await env.KV.put("user:123:name", "John"); +await 
env.KV.put("user:123:email", "john@example.com"); + +// ✅ GOOD: Single coalesced object +await env.USERS.put("user:123:profile", JSON.stringify({ + name: "John", + email: "john@example.com", + role: "admin" +})); + +// Benefits: Hot key cache, single read, reduced operations +// Trade-off: Harder to update individual fields +``` + +## Prefix-Based Namespacing + +```typescript +// Logical partitioning within single namespace +const PREFIXES = { + users: "user:", + sessions: "session:", + cache: "cache:", + features: "feature:" +} as const; + +// Write with prefix +async function setUser(env: Env, id: string, data: any) { + await env.KV.put(`${PREFIXES.users}${id}`, JSON.stringify(data)); +} + +// Read with prefix +async function getUser(env: Env, id: string) { + return await env.KV.get(`${PREFIXES.users}${id}`, "json"); +} + +// List by prefix +async function listUserIds(env: Env): Promise<string[]> { + const result = await env.KV.list({ prefix: PREFIXES.users }); + return result.keys.map(k => k.name.replace(PREFIXES.users, "")); +} + +// Example hierarchy +"user:123:profile" +"user:123:settings" +"cache:api:users" +"session:abc-def" +"feature:flags:beta" +``` + +## Metadata Versioning + +```typescript +interface VersionedData { + version: number; + data: any; +} + +async function migrateIfNeeded(env: Env, key: string) { + const result = await env.DATA.getWithMetadata(key, "json"); + + if (!result.value) return null; + + const currentVersion = result.metadata?.version || 1; + const targetVersion = 2; + + if (currentVersion < targetVersion) { + // Migrate data format + const migrated = migrate(result.value, currentVersion, targetVersion); + + // Store with new version + await env.DATA.put(key, JSON.stringify(migrated), { + metadata: { version: targetVersion, migratedAt: Date.now() } + }); + + return migrated; + } + + return result.value; +} + +function migrate(data: any, from: number, to: number): any { + if (from === 1 && to === 2) { + // V1 → V2: Rename field + return { 
...data, userName: data.name }; + } + return data; +} +``` + +## Error Boundary Pattern + +```typescript +// Resilient get with fallback +async function resilientGet<T>( + env: Env, + key: string, + fallback: T +): Promise<T> { + try { + const value = await env.KV.get<T>(key, "json"); + return value ?? fallback; + } catch (err) { + console.error(`KV error for ${key}:`, err); + return fallback; + } +} + +// Usage +const config = await resilientGet(env, "config:app", { + theme: "light", + maxItems: 10 +}); +``` diff --git a/cloudflare/references/miniflare/README.md b/cloudflare/references/miniflare/README.md new file mode 100644 index 0000000..82baf7c --- /dev/null +++ b/cloudflare/references/miniflare/README.md @@ -0,0 +1,105 @@ +# Miniflare + +Local simulator for Cloudflare Workers development/testing. Runs Workers in a workerd sandbox implementing the runtime APIs - no internet required. + +## Features + +- Full-featured: KV, Durable Objects, R2, D1, WebSockets, Queues +- Fully-local: test without internet, instant reload +- TypeScript-native: detailed logging, source maps +- Advanced testing: dispatch events without HTTP, simulate Worker connections + +## When to Use + +**Decision tree for testing Workers:** + +``` +Need to test Workers? +│ +├─ Unit tests for business logic only? +│ └─ getPlatformProxy (Vitest/Jest) → [patterns.md](./patterns.md#getplatformproxy) +│ Fast, no HTTP, direct binding access +│ +├─ Integration tests with full runtime? +│ ├─ Single Worker? +│ │ └─ Miniflare API → [Quick Start](#quick-start) +│ │ Full control, programmatic access +│ │ +│ ├─ Multiple Workers + service bindings? +│ │ └─ Miniflare workers array → [configuration.md](./configuration.md#multiple-workers) +│ │ Shared storage, inter-worker calls +│ │ +│ └─ Vitest test runner integration? +│ └─ vitest-pool-workers → [patterns.md](./patterns.md#vitest-pool-workers) +│ Full Workers env in Vitest +│ +└─ Local dev server? 
+ └─ wrangler dev (not Miniflare) + Hot reload, automatic config +``` + +**Use Miniflare for:** +- Integration tests with full Worker runtime +- Testing bindings/storage locally +- Multiple Workers with service bindings +- Programmatic event dispatch (fetch, queue, scheduled) + +**Use getPlatformProxy for:** +- Fast unit tests of business logic +- Testing without HTTP overhead +- Vitest/Jest environments + +**Use Wrangler for:** +- Local development workflow +- Production deployments + +## Setup + +```bash +npm i -D miniflare +``` + +Requires ES modules in `package.json`: +```json +{"type": "module"} +``` + +## Quick Start + +```js +import { Miniflare } from "miniflare"; + +const mf = new Miniflare({ + modules: true, + script: ` + export default { + async fetch(request, env, ctx) { + return new Response("Hello Miniflare!"); + } + } + `, +}); + +const res = await mf.dispatchFetch("http://localhost:8787/"); +console.log(await res.text()); // Hello Miniflare! +await mf.dispose(); +``` + +## Reading Order + +**New to Miniflare?** Start here: +1. [Quick Start](#quick-start) - Running in 2 minutes +2. [When to Use](#when-to-use) - Choose your testing approach +3. [patterns.md](./patterns.md) - Testing patterns (getPlatformProxy, Vitest, node:test) +4. 
[configuration.md](./configuration.md) - Configure bindings, storage, multiple workers + +**Troubleshooting:** +- [gotchas.md](./gotchas.md) - Common errors and debugging + +**API reference:** +- [api.md](./api.md) - Complete method reference + +## See Also +- [wrangler](../wrangler/) - CLI tool that embeds Miniflare for `wrangler dev` +- [workerd](../workerd/) - Runtime that powers Miniflare +- [workers](../workers/) - Workers runtime API documentation diff --git a/cloudflare/references/miniflare/api.md b/cloudflare/references/miniflare/api.md new file mode 100644 index 0000000..e4df4d7 --- /dev/null +++ b/cloudflare/references/miniflare/api.md @@ -0,0 +1,187 @@ +# Programmatic API + +## Miniflare Class + +```typescript +class Miniflare { + constructor(options: MiniflareOptions); + + // Lifecycle + ready: Promise<URL>; // Resolves when server ready, returns URL + dispose(): Promise<void>; // Cleanup resources + setOptions(options: MiniflareOptions): Promise<void>; // Reload config + + // Event dispatching + dispatchFetch(url: string | URL | Request, init?: RequestInit): Promise<Response>; + getWorker(name?: string): Promise<Fetcher>; + + // Bindings access + getBindings<Env = Record<string, unknown>>(name?: string): Promise<Env>; + getCf(name?: string): Promise<Record<string, any>>; + getKVNamespace(name: string): Promise<KVNamespace>; + getR2Bucket(name: string): Promise<R2Bucket>; + getDurableObjectNamespace(name: string): Promise<DurableObjectNamespace>; + getDurableObjectStorage(id: DurableObjectId): Promise<DurableObjectStorage>; + getD1Database(name: string): Promise<D1Database>; + getCaches(): Promise<CacheStorage>; + getQueueProducer(name: string): Promise<Queue>; + + // Debugging + getInspectorURL(): Promise<URL>; // Chrome DevTools inspector URL +} +``` + +## Event Dispatching + +**Fetch (no HTTP server):** +```js +const res = await mf.dispatchFetch("http://localhost:8787/path", { + method: "POST", + headers: { "Authorization": "Bearer token" }, + body: JSON.stringify({ data: "value" }), +}); +``` + +**Custom Host routing:** +```js +const res = await mf.dispatchFetch("http://localhost:8787/", { + headers: { "Host": "api.example.com" }, +}); +``` + 
+**Scheduled:** +```js +const worker = await mf.getWorker(); +const result = await worker.scheduled({ cron: "30 * * * *" }); +// result: { outcome: "ok", noRetry: false } +``` + +**Queue:** +```js +const worker = await mf.getWorker(); +const result = await worker.queue("queue-name", [ + { id: "msg1", timestamp: new Date(), body: "data", attempts: 1 }, +]); +// result: { outcome: "ok", retryAll: false, ackAll: false, ... } +``` + +## Bindings Access + +**Environment variables:** +```js +// Basic usage +const bindings = await mf.getBindings(); +console.log(bindings.SECRET_KEY); + +// With type safety (recommended): +interface Env { + SECRET_KEY: string; + API_URL: string; + KV: KVNamespace; +} +const env = await mf.getBindings<Env>(); +env.SECRET_KEY; // string (typed!) +env.KV.get("key"); // KVNamespace methods available +``` + +**Request.cf object:** +```js +const cf = await mf.getCf(); +console.log(cf?.colo); // "DFW" +console.log(cf?.country); // "US" +``` + +**KV:** +```js +const ns = await mf.getKVNamespace("TEST_NAMESPACE"); +await ns.put("key", "value"); +const value = await ns.get("key"); +``` + +**R2:** +```js +const bucket = await mf.getR2Bucket("BUCKET"); +await bucket.put("file.txt", "content"); +const object = await bucket.get("file.txt"); +``` + +**Durable Objects:** +```js +const ns = await mf.getDurableObjectNamespace("COUNTER"); +const id = ns.idFromName("test"); +const stub = ns.get(id); +const res = await stub.fetch("http://localhost/"); + +// Access storage directly: +const storage = await mf.getDurableObjectStorage(id); +await storage.put("key", "value"); +``` + +**D1:** +```js +const db = await mf.getD1Database("DB"); +await db.exec(`CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)`); +await db.prepare("INSERT INTO users (name) VALUES (?)").bind("Alice").run(); +``` + +**Cache:** +```js +const caches = await mf.getCaches(); +const defaultCache = caches.default; +await defaultCache.put("http://example.com", new Response("cached")); +``` + 
+**Queue producer:** +```js +const producer = await mf.getQueueProducer("QUEUE"); +await producer.send({ body: "message data" }); +``` + +## Lifecycle + +**Reload:** +```js +await mf.setOptions({ + scriptPath: "worker.js", + bindings: { VERSION: "2.0" }, +}); +``` + +**Watch (manual):** +```js +import { watch } from "fs"; + +const config = { scriptPath: "worker.js" }; +const mf = new Miniflare(config); + +watch("worker.js", async () => { + console.log("Reloading..."); + await mf.setOptions(config); +}); +``` + +**Cleanup:** +```js +await mf.dispose(); +``` + +## Debugging + +**Inspector URL for DevTools:** +```js +const url = await mf.getInspectorURL(); +console.log(`DevTools: ${url}`); +// Open in Chrome DevTools for breakpoints, profiling +``` + +**Wait for server ready:** +```js +const mf = new Miniflare({ scriptPath: "worker.js" }); +const url = await mf.ready; // Promise +console.log(`Server running at ${url}`); // http://127.0.0.1:8787 + +// Note: dispatchFetch() waits automatically, no need to await ready +const res = await mf.dispatchFetch("http://localhost/"); // Works immediately +``` + +See [configuration.md](./configuration.md) for all constructor options. diff --git a/cloudflare/references/miniflare/configuration.md b/cloudflare/references/miniflare/configuration.md new file mode 100644 index 0000000..b269b24 --- /dev/null +++ b/cloudflare/references/miniflare/configuration.md @@ -0,0 +1,173 @@ +# Configuration + +## Script Loading + +```js +// Inline +new Miniflare({ modules: true, script: `export default { ... 
}` }); + +// File-based +new Miniflare({ scriptPath: "worker.js" }); + +// Multi-module +new Miniflare({ + scriptPath: "src/index.js", + modules: true, + modulesRules: [ + { type: "ESModule", include: ["**/*.js"] }, + { type: "Text", include: ["**/*.txt"] }, + ], +}); +``` + +## Compatibility + +```js +new Miniflare({ + compatibilityDate: "2026-01-01", // Use recent date for latest features + compatibilityFlags: [ + "nodejs_compat", // Node.js APIs (process, Buffer, etc) + "streams_enable_constructors", // Stream constructors + ], + upstream: "https://example.com", // Fallback for unhandled requests +}); +``` + +**Critical:** Use `compatibilityDate: "2026-01-01"` or latest to match production runtime. Old dates limit available APIs. + +## HTTP Server & Request.cf + +```js +new Miniflare({ + port: 8787, // Default: 8787 + host: "127.0.0.1", + https: true, // Self-signed cert + liveReload: true, // Auto-reload HTML + + cf: true, // Fetch live Request.cf data (cached) + // cf: "./cf.json", // Or load from file + // cf: { colo: "DFW" }, // Or inline mock +}); +``` + +**Note:** For tests, use `dispatchFetch()` (no port conflicts). 
+ +## Storage Bindings + +```js +new Miniflare({ + // KV + kvNamespaces: ["TEST_NAMESPACE", "CACHE"], + kvPersist: "./kv-data", // Optional: persist to disk + + // R2 + r2Buckets: ["BUCKET", "IMAGES"], + r2Persist: "./r2-data", + + // Durable Objects + modules: true, + durableObjects: { + COUNTER: "Counter", // className + API_OBJECT: { className: "ApiObject", scriptName: "api-worker" }, + }, + durableObjectsPersist: "./do-data", + + // D1 + d1Databases: ["DB"], + d1Persist: "./d1-data", + + // Cache + cache: true, // Default + cachePersist: "./cache-data", +}); +``` + +## Bindings + +```js +new Miniflare({ + // Environment variables + bindings: { + SECRET_KEY: "my-secret-value", + API_URL: "https://api.example.com", + DEBUG: true, + }, + + // Other bindings + wasmBindings: { ADD_MODULE: "./add.wasm" }, + textBlobBindings: { TEXT: "./data.txt" }, + queueProducers: ["QUEUE"], +}); +``` + +## Multiple Workers + +```js +new Miniflare({ + workers: [ + { + name: "main", + kvNamespaces: { DATA: "shared" }, + serviceBindings: { API: "api-worker" }, + script: `export default { ... }`, + }, + { + name: "api-worker", + kvNamespaces: { DATA: "shared" }, // Shared storage + script: `export default { ... 
}`, + }, + ], +}); +``` + +**With routing:** +```js +workers: [ + { name: "api", scriptPath: "./api.js", routes: ["api.example.com/*"] }, + { name: "web", scriptPath: "./web.js", routes: ["example.com/*"] }, +], +``` + +## Logging & Performance + +```js +import { Log, LogLevel } from "miniflare"; + +new Miniflare({ + log: new Log(LogLevel.DEBUG), // DEBUG | INFO | WARN | ERROR | NONE + scriptTimeout: 30000, // CPU limit (ms) + workersConcurrencyLimit: 10, // Max concurrent workers +}); +``` + +## Workers Sites + +```js +new Miniflare({ + sitePath: "./public", + siteInclude: ["**/*.html", "**/*.css"], + siteExclude: ["**/*.map"], +}); +``` + +## From wrangler.toml + +Miniflare doesn't auto-read `wrangler.toml`: + +```toml +# wrangler.toml +name = "my-worker" +main = "src/index.ts" +compatibility_date = "2026-01-01" +[[kv_namespaces]] +binding = "KV" +``` + +```js +// Miniflare equivalent +new Miniflare({ + scriptPath: "src/index.ts", + compatibilityDate: "2026-01-01", + kvNamespaces: ["KV"], +}); +``` diff --git a/cloudflare/references/miniflare/gotchas.md b/cloudflare/references/miniflare/gotchas.md new file mode 100644 index 0000000..dfcd157 --- /dev/null +++ b/cloudflare/references/miniflare/gotchas.md @@ -0,0 +1,160 @@ +# Gotchas & Troubleshooting + +## Miniflare Limitations + +**Not supported:** +- Analytics Engine (use mocks) +- Cloudflare Images/Stream +- Browser Rendering API +- Tail Workers +- Workers for Platforms (partial support) + +**Behavior differences from production:** +- Runs workerd locally, not Cloudflare edge +- Storage is local (filesystem/memory), not distributed +- `Request.cf` is cached/mocked, not real edge data +- Performance differs from edge +- Caching implementation may vary slightly + +## Common Errors + +### "Cannot find module" +**Cause:** Module path wrong or `modulesRules` not configured +**Solution:** +```js +new Miniflare({ + modules: true, + modulesRules: [{ type: "ESModule", include: ["**/*.js"] }], +}); +``` + +### "Data not 
persisting" +**Cause:** Persist paths are files, not directories +**Solution:** +```js +kvPersist: "./data/kv", // Directory, not file +``` + +### "Cannot run TypeScript" +**Cause:** Miniflare doesn't transpile TypeScript +**Solution:** Build first with esbuild/tsc, then run compiled JS + +### "`request.cf` is undefined" +**Cause:** CF data not configured +**Solution:** +```js +new Miniflare({ cf: true }); // Or cf: "./cf.json" +``` + +### "EADDRINUSE" port conflict +**Cause:** Multiple instances using same port +**Solution:** Use `dispatchFetch()` (no HTTP server) or `port: 0` for auto-assign + +### "Durable Object not found" +**Cause:** Class export doesn't match config name +**Solution:** +```js +export class Counter {} // Must match +new Miniflare({ durableObjects: { COUNTER: "Counter" } }); +``` + +## Debugging + +**Enable verbose logging:** +```js +import { Log, LogLevel } from "miniflare"; +new Miniflare({ log: new Log(LogLevel.DEBUG) }); +``` + +**Chrome DevTools:** +```js +const url = await mf.getInspectorURL(); +console.log(`DevTools: ${url}`); // Open in Chrome +``` + +**Inspect bindings:** +```js +const env = await mf.getBindings(); +console.log(Object.keys(env)); +``` + +**Verify storage:** +```js +const ns = await mf.getKVNamespace("TEST"); +const { keys } = await ns.list(); +``` + +## Best Practices + +**✓ Do:** +- Use `dispatchFetch()` for tests (no HTTP server) +- In-memory storage for CI (omit persist options) +- New instances per test for isolation +- Type-safe bindings with interfaces +- `await mf.dispose()` in cleanup + +**✗ Avoid:** +- HTTP server in tests +- Shared instances without cleanup +- Old compatibility dates (use 2026+) + +## Migration Guides + +### From Miniflare 2.x to 3+ + +Breaking changes in v3+: + +| v2 | v3+ | +|----|-----| +| `getBindings()` sync | `getBindings()` returns Promise | +| `ready` is void | `ready` returns `Promise` | +| service-worker-mock | Built on workerd | +| Different options | Restructured constructor | + 
+**Example migration:** +```js +// v2 +const bindings = mf.getBindings(); +mf.ready; // void + +// v3+ +const bindings = await mf.getBindings(); +const url = await mf.ready; // Promise +``` + +### From unstable_dev to Miniflare + +```js +// Old (deprecated) +import { unstable_dev } from "wrangler"; +const worker = await unstable_dev("src/index.ts"); + +// New +import { Miniflare } from "miniflare"; +const mf = new Miniflare({ scriptPath: "src/index.ts" }); +``` + +### From Wrangler Dev + +Miniflare doesn't auto-read `wrangler.toml`: + +```js +// Translate manually: +new Miniflare({ + scriptPath: "dist/worker.js", + compatibilityDate: "2026-01-01", + kvNamespaces: ["KV"], + bindings: { API_KEY: process.env.API_KEY }, +}); +``` + +## Resource Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| CPU time | 30s default | Configurable via `scriptTimeout` | +| Storage | Filesystem | Performance varies by disk | +| Memory | System dependent | No artificial limits | +| Request.cf | Cached/mocked | Not live edge data | + +See [patterns.md](./patterns.md) for testing examples. diff --git a/cloudflare/references/miniflare/patterns.md b/cloudflare/references/miniflare/patterns.md new file mode 100644 index 0000000..c89c3a5 --- /dev/null +++ b/cloudflare/references/miniflare/patterns.md @@ -0,0 +1,181 @@ +# Testing Patterns + +## Choosing a Testing Approach + +| Approach | Use Case | Speed | Setup | Runtime | +|----------|----------|-------|-------|---------| +| **getPlatformProxy** | Unit tests, logic testing | Fast | Low | Miniflare | +| **Miniflare API** | Integration tests, full control | Medium | Medium | Miniflare | +| **vitest-pool-workers** | Vitest runner integration | Medium | Medium | workerd | + +**Quick guide:** +- Unit tests → getPlatformProxy +- Integration tests → Miniflare API +- Vitest workflows → vitest-pool-workers + +## getPlatformProxy + +Lightweight unit testing - provides bindings without full Worker runtime. 
+
+```js
+// vitest.config.js
+export default { test: { environment: "node" } };
+```
+
+```js
+import { getPlatformProxy } from "wrangler";
+import { describe, it, expect, afterAll } from "vitest";
+
+// env exposes the bindings declared in wrangler.toml (KV here)
+const { env, dispose } = await getPlatformProxy();
+afterAll(() => dispose());
+
+describe("Business logic", () => {
+  it("processes data with KV", async () => {
+    await env.KV.put("test", "value");
+    expect(await env.KV.get("test")).toBe("value");
+  });
+});
+```
+
+**Pros:** Fast, simple
+**Cons:** No full runtime, can't test fetch handler
+
+## vitest-pool-workers
+
+Full Workers runtime in Vitest. Reads `wrangler.toml`.
+
+```bash
+npm i -D @cloudflare/vitest-pool-workers
+```
+
+```js
+// vitest.config.js
+import { defineWorkersConfig } from "@cloudflare/vitest-pool-workers/config";
+
+export default defineWorkersConfig({
+  test: {
+    poolOptions: { workers: { wrangler: { configPath: "./wrangler.toml" } } },
+  },
+});
+```
+
+```js
+import { env, SELF } from "cloudflare:test";
+import { it, expect } from "vitest";
+
+it("handles fetch", async () => {
+  const res = await SELF.fetch("http://example.com/");
+  expect(res.status).toBe(200);
+});
+```
+
+**Pros:** Full runtime, uses wrangler.toml
+**Cons:** Requires Wrangler config
+
+## Miniflare API (node:test)
+
+```js
+import assert from "node:assert";
+import test, { after, before } from "node:test";
+import { Miniflare } from "miniflare";
+
+let mf;
+before(() => {
+  mf = new Miniflare({ scriptPath: "src/index.js", kvNamespaces: ["TEST_KV"] });
+});
+
+test("fetch", async () => {
+  const res = await mf.dispatchFetch("http://localhost/");
+  assert.strictEqual(await res.text(), "Hello");
+});
+
+after(() => mf.dispose());
+```
+
+## Testing Durable Objects & Events
+
+```js
+// Durable Objects
+const ns = await mf.getDurableObjectNamespace("COUNTER");
+const stub = ns.get(ns.idFromName("test-counter"));
+await stub.fetch("http://localhost/increment");
+
+// Direct storage
+const storage = await mf.getDurableObjectStorage(ns.idFromName("test-counter"));
+const count = await storage.get("count");
+
+// Queue
+const worker = await mf.getWorker();
+await worker.queue("my-queue", [
+  { id: "msg1", timestamp: new Date(), body: { userId: 123 }, attempts: 1 },
+]);
+
+// Scheduled
+await worker.scheduled({ cron: "0 0 * * *" });
+```
+
+## Test Isolation & Mocking
+
+```js
+// Per-test isolation
+beforeEach(() => { mf = new Miniflare({ kvNamespaces: ["TEST"] }); });
+afterEach(() => mf.dispose());
+
+// Mock external APIs
+new Miniflare({
+  workers: [
+    { name: "main", serviceBindings: { API: "mock-api" }, script: `...` },
+    { name: "mock-api", script: `export default { async fetch() { return Response.json({mock: true}); } }` },
+  ],
+});
+```
+
+## Type Safety
+
+```ts
+import type { KVNamespace } from "@cloudflare/workers-types";
+
+interface Env {
+  KV: KVNamespace;
+  API_KEY: string;
+}
+
+const env = await mf.getBindings<Env>();
+await env.KV.put("key", "value"); // Typed!
+
+export default {
+  async fetch(req: Request, env: Env) {
+    return new Response(await env.KV.get("key"));
+  }
+} satisfies ExportedHandler<Env>;
+```
+
+## WebSocket Testing
+
+```js
+const res = await mf.dispatchFetch("http://localhost/ws", {
+  headers: { Upgrade: "websocket" },
+});
+assert.strictEqual(res.status, 101);
+```
+
+## Migration from unstable_dev
+
+```js
+// Old (deprecated)
+import { unstable_dev } from "wrangler";
+const worker = await unstable_dev("src/index.ts");
+
+// New
+import { Miniflare } from "miniflare";
+const mf = new Miniflare({ scriptPath: "src/index.ts" });
+```
+
+## CI/CD Tips
+
+```js
+// In-memory storage (faster)
+new Miniflare({ kvNamespaces: ["TEST"] }); // No persist = in-memory
+
+// Use dispatchFetch (no port conflicts)
+await mf.dispatchFetch("http://localhost/");
+```
+
+See [gotchas.md](./gotchas.md) for troubleshooting.
diff --git a/cloudflare/references/network-interconnect/README.md b/cloudflare/references/network-interconnect/README.md new file mode 100644 index 0000000..e337f1b --- /dev/null +++ b/cloudflare/references/network-interconnect/README.md @@ -0,0 +1,99 @@ +# Cloudflare Network Interconnect (CNI) + +Private, high-performance connectivity to Cloudflare's network. **Enterprise-only**. + +## Connection Types + +**Direct**: Physical fiber in shared datacenter. 10/100 Gbps. You order cross-connect. + +**Partner**: Virtual via Console Connect, Equinix, Megaport, etc. Managed via partner SDN. + +**Cloud**: AWS Direct Connect or GCP Cloud Interconnect. Magic WAN only. + +## Dataplane Versions + +**v1 (Classic)**: GRE tunnel support, VLAN/BFD/LACP, asymmetric MTU (1500↓/1476↑), peering support. + +**v2 (Beta)**: No GRE, 1500 MTU both ways, no VLAN/BFD/LACP yet, ECMP instead. + +## Use Cases + +- **Magic Transit DSR**: DDoS protection, egress via ISP (v1/v2) +- **Magic Transit + Egress**: DDoS + egress via CF (v1/v2) +- **Magic WAN + Zero Trust**: Private backbone (v1 needs GRE, v2 native) +- **Peering**: Public routes at PoP (v1 only) +- **App Security**: WAF/Cache/LB (v1/v2 over Magic Transit) + +## Prerequisites + +- Enterprise plan +- IPv4 /24+ or IPv6 /48+ prefixes +- BGP ASN for v1 +- See [locations PDF](https://developers.cloudflare.com/network-interconnect/static/cni-locations-2026-01.pdf) + +## Specs + +- /31 point-to-point subnets +- 10km max optical distance +- 10G: 10GBASE-LR single-mode +- 100G: 100GBASE-LR4 single-mode +- **No SLA** (free service) +- Backup Internet required + +## Throughput + +| Direction | 10G | 100G | +|-----------|-----|------| +| CF → Customer | 10 Gbps | 100 Gbps | +| Customer → CF (peering) | 10 Gbps | 100 Gbps | +| Customer → CF (Magic) | 1 Gbps/tunnel or CNI | 1 Gbps/tunnel or CNI | + +## Timeline + +2-4 weeks typical. Steps: request → config review → order connection → configure → test → enable health checks → activate → monitor. 
+ +## In This Reference +- [configuration.md](./configuration.md) - BGP, routing, setup +- [api.md](./api.md) - API endpoints, SDKs +- [patterns.md](./patterns.md) - HA, hybrid cloud, failover +- [gotchas.md](./gotchas.md) - Troubleshooting, limits + +## Reading Order by Task + +| Task | Files to Load | +|------|---------------| +| Initial setup | README → configuration.md → api.md | +| Create interconnect via API | api.md → gotchas.md | +| Design HA architecture | patterns.md → README | +| Troubleshoot connection | gotchas.md → configuration.md | +| Cloud integration (AWS/GCP) | configuration.md → patterns.md | +| Monitor + alerts | configuration.md | + +## Automation Boundary + +**API-Automatable:** +- List/create/delete interconnects (Direct, Partner) +- List available slots +- Get interconnect status +- Download LOA PDF +- Create/update CNI objects (BGP config) +- Query settings + +**Requires Account Team:** +- Initial request approval +- AWS Direct Connect setup (send LOA+VLAN to CF) +- GCP Cloud Interconnect final activation +- Partner interconnect acceptance (Equinix, Megaport) +- VLAN assignment (v1) +- Configuration document generation (v1) +- Escalations + troubleshooting support + +**Cannot Be Automated:** +- Physical cross-connect installation (Direct) +- Partner portal operations (virtual circuit ordering) +- AWS/GCP portal operations +- Maintenance window coordination + +## See Also +- [tunnel](../tunnel/) - Alternative for private network connectivity +- [spectrum](../spectrum/) - Layer 4 proxy for TCP/UDP traffic diff --git a/cloudflare/references/network-interconnect/api.md b/cloudflare/references/network-interconnect/api.md new file mode 100644 index 0000000..85e5e12 --- /dev/null +++ b/cloudflare/references/network-interconnect/api.md @@ -0,0 +1,199 @@ +# CNI API Reference + +See [README.md](README.md) for overview. 
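All endpoints below share the standard v4 API rate limit (1200 requests per 5 minutes per token; see gotchas.md), so scripted callers should retry with backoff. A minimal sketch — `withBackoff` is a hypothetical helper, not part of the SDK, and the retry count and delays are illustrative:

```typescript
// Hypothetical helper: retries a rate-limited async call with exponential backoff.
// `request` is any function returning a Promise; a 429-style failure should throw.
async function withBackoff<T>(
  request: () => Promise<T>,
  { retries = 5, baseMs = 500 }: { retries?: number; baseMs?: number } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (err) {
      if (attempt >= retries) throw err;
      // Exponential delay with jitter: baseMs * 2^attempt, plus up to 50% random
      const delay = baseMs * 2 ** attempt * (1 + Math.random() * 0.5);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Usage would wrap any SDK call, e.g. `withBackoff(() => client.networkInterconnects.slots.list({ account_id: id }))`; checking the thrown error for a 429 status before retrying is a sensible refinement.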
+ +## Base + +``` +https://api.cloudflare.com/client/v4 +Auth: Authorization: Bearer +``` + +## SDK Namespaces + +**Primary (recommended):** +```typescript +client.networkInterconnects.interconnects.* +client.networkInterconnects.cnis.* +client.networkInterconnects.slots.* +``` + +**Alternate (deprecated):** +```typescript +client.magicTransit.cfInterconnects.* +``` + +Use `networkInterconnects` namespace for all new code. + +## Interconnects + +```http +GET /accounts/{account_id}/cni/interconnects # Query: page, per_page +POST /accounts/{account_id}/cni/interconnects # Query: validate_only=true (optional) +GET /accounts/{account_id}/cni/interconnects/{icon} +GET /accounts/{account_id}/cni/interconnects/{icon}/status +GET /accounts/{account_id}/cni/interconnects/{icon}/loa # Returns PDF +DELETE /accounts/{account_id}/cni/interconnects/{icon} +``` + +**Create Body:** `account`, `slot_id`, `type`, `facility`, `speed`, `name`, `description` +**Status Values:** `active` | `healthy` | `unhealthy` | `pending` | `down` + +**Response Example:** +```json +{"result": [{"id": "icon_abc", "name": "prod", "type": "direct", "facility": "EWR1", "speed": "10G", "status": "active"}]} +``` + +## CNI Objects (BGP config) + +```http +GET /accounts/{account_id}/cni/cnis +POST /accounts/{account_id}/cni/cnis +GET /accounts/{account_id}/cni/cnis/{cni} +PUT /accounts/{account_id}/cni/cnis/{cni} +DELETE /accounts/{account_id}/cni/cnis/{cni} +``` + +Body: `account`, `cust_ip`, `cf_ip`, `bgp_asn`, `bgp_password`, `vlan` + +## Slots + +```http +GET /accounts/{account_id}/cni/slots +GET /accounts/{account_id}/cni/slots/{slot} +``` + +Query: `facility`, `occupied`, `speed` + +## Health Checks + +Configure via Magic Transit/WAN tunnel endpoints (CNI v2). + +```typescript +await client.magicTransit.tunnels.update(accountId, tunnelId, { + health_check: { enabled: true, target: '192.0.2.1', rate: 'high', type: 'request' }, +}); +``` + +Rates: `high` | `medium` | `low`. Types: `request` | `reply`. 
See [Magic Transit docs](https://developers.cloudflare.com/magic-transit/how-to/configure-tunnel-endpoints/#add-tunnels). + +## Settings + +```http +GET /accounts/{account_id}/cni/settings +PUT /accounts/{account_id}/cni/settings +``` + +Body: `default_asn` + +## TypeScript SDK + +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ apiToken: process.env.CF_TOKEN }); + +// List +await client.networkInterconnects.interconnects.list({ account_id: id }); + +// Create with validation +await client.networkInterconnects.interconnects.create({ + account_id: id, + account: id, + slot_id: 'slot_abc', + type: 'direct', + facility: 'EWR1', + speed: '10G', + name: 'prod-interconnect', +}, { + query: { validate_only: true }, // Dry-run validation +}); + +// Create without validation +await client.networkInterconnects.interconnects.create({ + account_id: id, + account: id, + slot_id: 'slot_abc', + type: 'direct', + facility: 'EWR1', + speed: '10G', + name: 'prod-interconnect', +}); + +// Status +await client.networkInterconnects.interconnects.get(accountId, iconId); + +// LOA (use fetch) +const res = await fetch(`https://api.cloudflare.com/client/v4/accounts/${id}/cni/interconnects/${iconId}/loa`, { + headers: { Authorization: `Bearer ${token}` }, +}); +await fs.writeFile('loa.pdf', Buffer.from(await res.arrayBuffer())); + +// CNI object +await client.networkInterconnects.cnis.create({ + account_id: id, + account: id, + cust_ip: '192.0.2.1/31', + cf_ip: '192.0.2.0/31', + bgp_asn: 65000, + vlan: 100, +}); + +// Slots (filter by facility and speed) +await client.networkInterconnects.slots.list({ + account_id: id, + occupied: false, + facility: 'EWR1', + speed: '10G', +}); +``` + +## Python SDK + +```python +from cloudflare import Cloudflare + +client = Cloudflare(api_token=os.environ["CF_TOKEN"]) + +# List, create, status (same pattern as TypeScript) +client.network_interconnects.interconnects.list(account_id=id) 
+client.network_interconnects.interconnects.create(account_id=id, account=id, slot_id="slot_abc", type="direct", facility="EWR1", speed="10G") +client.network_interconnects.interconnects.get(account_id=id, icon=icon_id) + +# CNI objects and slots +client.network_interconnects.cnis.create(account_id=id, cust_ip="192.0.2.1/31", cf_ip="192.0.2.0/31", bgp_asn=65000) +client.network_interconnects.slots.list(account_id=id, occupied=False) +``` + +## cURL + +```bash +# List interconnects +curl "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cni/interconnects" \ + -H "Authorization: Bearer ${CF_TOKEN}" + +# Create interconnect +curl -X POST "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cni/interconnects?validate_only=true" \ + -H "Authorization: Bearer ${CF_TOKEN}" -H "Content-Type: application/json" \ + -d '{"account": "id", "slot_id": "slot_abc", "type": "direct", "facility": "EWR1", "speed": "10G"}' + +# LOA PDF +curl "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cni/interconnects/${ICON_ID}/loa" \ + -H "Authorization: Bearer ${CF_TOKEN}" --output loa.pdf +``` + +## Not Available via API + +**Missing Capabilities:** +- BGP session state query (use Dashboard or BGP logs) +- Bandwidth utilization metrics (use external monitoring) +- Traffic statistics per interconnect +- Historical uptime/downtime data +- Light level readings (contact account team) +- Maintenance window scheduling (notifications only) + +## Resources + +- [API Docs](https://developers.cloudflare.com/api/resources/network_interconnects/) +- [TypeScript SDK](https://github.com/cloudflare/cloudflare-typescript) +- [Python SDK](https://github.com/cloudflare/cloudflare-python) diff --git a/cloudflare/references/network-interconnect/configuration.md b/cloudflare/references/network-interconnect/configuration.md new file mode 100644 index 0000000..0f1005c --- /dev/null +++ b/cloudflare/references/network-interconnect/configuration.md @@ -0,0 +1,114 @@ +# CNI 
Configuration + +See [README.md](README.md) for overview. + +## Workflow (2-4 weeks) + +1. **Submit request** (Week 1): Contact account team, provide type/location/use case +2. **Review config** (Week 1-2, v1 only): Approve IP/VLAN/spec doc +3. **Order connection** (Week 2-3): + - **Direct**: Get LOA, order cross-connect from facility + - **Partner**: Order virtual circuit in partner portal + - **Cloud**: Order Direct Connect/Cloud Interconnect, send LOA+VLAN to CF +4. **Configure** (Week 3): Both sides configure per doc +5. **Test** (Week 3-4): Ping, verify BGP, check routes +6. **Health checks** (Week 4): Configure [Magic Transit](https://developers.cloudflare.com/magic-transit/how-to/configure-tunnel-endpoints/#add-tunnels) or [Magic WAN](https://developers.cloudflare.com/magic-wan/configuration/manually/how-to/configure-tunnel-endpoints/#add-tunnels) health checks +7. **Activate** (Week 4): Route traffic, verify flow +8. **Monitor**: Enable [maintenance notifications](https://developers.cloudflare.com/network-interconnect/monitoring-and-alerts/#enable-cloudflare-status-maintenance-notification) + +## BGP Configuration + +**v1 Requirements:** +- BGP ASN (provide during setup) +- /31 subnet for peering +- Optional: BGP password + +**v2:** Simplified, less BGP config needed. + +**BGP over CNI (Dec 2024):** Magic WAN/Transit can now peer BGP directly over CNI v2 (no GRE tunnel required). + +**Example v1 BGP:** +``` +Router ID: 192.0.2.1 +Peer IP: 192.0.2.0 +Remote ASN: 13335 +Local ASN: 65000 +Password: [optional] +VLAN: 100 +``` + +## Cloud Interconnect Setup + +### AWS Direct Connect (Beta) + +**Requirements:** Magic WAN, AWS Dedicated Direct Connect 1/10 Gbps. + +**Process:** +1. Contact CF account team +2. Choose location +3. Order in AWS portal +4. AWS provides LOA + VLAN ID +5. Send to CF account team +6. 
Wait ~4 weeks
+
+**Post-setup:** Add [static routes](https://developers.cloudflare.com/magic-wan/configuration/manually/how-to/configure-routes/#configure-static-routes) to Magic WAN. Enable [bidirectional health checks](https://developers.cloudflare.com/magic-wan/configuration/manually/how-to/configure-tunnel-endpoints/#legacy-bidirectional-health-checks).
+
+### GCP Cloud Interconnect (Beta)
+
+**Setup via Dashboard:**
+1. Interconnects → Create → Cloud Interconnect → Google
+2. Provide name, MTU (match GCP VLAN attachment), speed (granular 50M-50G options available for partner interconnects)
+3. Enter VLAN attachment pairing key
+4. Confirm order
+
+**Routing to GCP:** Add [static routes](https://developers.cloudflare.com/magic-wan/configuration/manually/how-to/configure-routes/#configure-static-routes). BGP routes from GCP Cloud Router are **ignored**.
+
+**Routing to CF:** Configure [custom learned routes](https://cloud.google.com/network-connectivity/docs/router/how-to/configure-custom-learned-routes) in Cloud Router. Request prefixes from CF account team.
+
+## Monitoring
+
+**Dashboard Status:**
+
+| Status | Meaning |
+|--------|---------|
+| **Healthy** | Link operational, traffic flowing, health checks passing |
+| **Active** | Link up, sufficient light, Ethernet negotiated |
+| **Unhealthy** | Link down, no/low light (<-20 dBm), can't negotiate |
+| **Pending** | Cross-connect incomplete, device unresponsive, RX/TX swapped |
+| **Down** | Physical link down, no connectivity |
+
+**Alerts:**
+
+**CNI Connection Maintenance** (Magic Networking only):
+```
+Dashboard → Notifications → Add
+Product: Cloudflare Network Interconnect
+Type: Connection Maintenance Alert
+```
+Warnings arrive up to two weeks in advance. New subscriptions take effect after a 6-hour delay.
+ +**Cloudflare Status Maintenance** (entire PoP): +``` +Dashboard → Notifications → Add +Product: Cloudflare Status +Filter PoPs: gru,fra,lhr +``` + +**Find PoP code:** +``` +Dashboard → Magic Transit/WAN → Configuration → Interconnects +Select CNI → Note Data Center (e.g., "gru-b") +Use first 3 letters: "gru" +``` + +## Best Practices + +**Critical config-specific practices:** +- /31 subnets required for BGP +- BGP passwords recommended +- BFD for fast failover (v1 only) +- Test ping connectivity before BGP +- Enable maintenance notifications immediately after activation +- Monitor status programmatically via API + +For design patterns, HA architecture, and security best practices, see [patterns.md](./patterns.md). diff --git a/cloudflare/references/network-interconnect/gotchas.md b/cloudflare/references/network-interconnect/gotchas.md new file mode 100644 index 0000000..9880807 --- /dev/null +++ b/cloudflare/references/network-interconnect/gotchas.md @@ -0,0 +1,165 @@ +# CNI Gotchas & Troubleshooting + +## Common Errors + +### "Status: Pending" + +**Cause:** Cross-connect not installed, RX/TX fibers reversed, wrong fiber type, or low light levels +**Solution:** +1. Verify cross-connect installed +2. Check fiber at patch panel +3. Swap RX/TX fibers +4. Check light with optical power meter (target > -20 dBm) +5. Contact account team + +### "Status: Unhealthy" + +**Cause:** Physical issue, low light (<-20 dBm), optic mismatch, or dirty connectors +**Solution:** +1. Check physical connections +2. Clean fiber connectors +3. Verify optic types (10GBASE-LR/100GBASE-LR4) +4. Test with known-good optics +5. Check patch panel +6. Contact account team + +### "BGP Session Down" + +**Cause:** Wrong IP addressing, wrong ASN, password mismatch, or firewall blocking TCP/179 +**Solution:** +1. Verify IPs match CNI object +2. Confirm ASN correct +3. Check BGP password +4. Verify no firewall on TCP/179 +5. Check BGP logs +6. 
Review BGP timers + +### "Low Throughput" + +**Cause:** MTU mismatch, fragmentation, single GRE tunnel (v1), or routing inefficiency +**Solution:** +1. Check MTU (1500↓/1476↑ for v1, 1500 both for v2) +2. Test various packet sizes +3. Add more GRE tunnels (v1) +4. Consider upgrading to v2 +5. Review routing tables +6. Use LACP for bundling (v1) + +## API Errors + +### 400 Bad Request: "slot_id already occupied" + +**Cause:** Another interconnect already uses this slot +**Solution:** Use `occupied=false` filter when listing slots: +```typescript +await client.networkInterconnects.slots.list({ + account_id: id, + occupied: false, + facility: 'EWR1', +}); +``` + +### 400 Bad Request: "invalid facility code" + +**Cause:** Typo or unsupported facility +**Solution:** Check [locations PDF](https://developers.cloudflare.com/network-interconnect/static/cni-locations-2026-01.pdf) for valid codes + +### 403 Forbidden: "Enterprise plan required" + +**Cause:** Account not enterprise-level +**Solution:** Contact account team to upgrade + +### 422 Unprocessable: "validate_only request failed" + +**Cause:** Dry-run validation found issues (wrong slot, invalid config) +**Solution:** Review error message details, fix config before real creation + +### Rate Limiting + +**Limit:** 1200 requests/5min per token +**Solution:** Implement exponential backoff, cache slot listings + +## Cloud-Specific Issues + +### AWS Direct Connect: "VLAN not matching" + +**Cause:** VLAN ID from AWS LOA doesn't match CNI config +**Solution:** +1. Get VLAN from AWS Console after ordering +2. Send exact VLAN to CF account team +3. Verify match in CNI object config + +### AWS: "Connection stuck in Pending" + +**Cause:** LOA not provided to CF or AWS connection not accepted +**Solution:** +1. Verify AWS connection status is "Available" +2. Confirm LOA sent to CF account team +3. 
Wait for CF team acceptance (can take days) + +### GCP: "BGP routes not propagating" + +**Cause:** BGP routes from GCP Cloud Router **ignored by design** +**Solution:** Use [static routes](https://developers.cloudflare.com/magic-wan/configuration/manually/how-to/configure-routes/#configure-static-routes) in Magic WAN instead + +### GCP: "Cannot query VLAN attachment status via API" + +**Cause:** GCP Cloud Interconnect Dashboard-only (no API yet) +**Solution:** Check status in CF Dashboard or GCP Console + +## Partner Interconnect Issues + +### Equinix: "Virtual circuit not appearing" + +**Cause:** CF hasn't accepted Equinix connection request +**Solution:** +1. Verify VC created in Equinix Fabric Portal +2. Contact CF account team to accept +3. Allow 2-3 business days + +### Console Connect/Megaport: "API creation fails" + +**Cause:** Partner interconnects require partner portal + CF approval +**Solution:** Cannot fully automate. Order in partner portal, notify CF account team. + +## Anti-Patterns + +| Anti-Pattern | Why Bad | Solution | +|--------------|---------|----------| +| Single interconnect for production | No SLA, single point of failure | Use ≥2 with device diversity | +| No backup Internet | CNI fails = total outage | Always maintain alternate path | +| Polling status every second | Rate limits, wastes API calls | Poll every 30-60s max | +| Using v1 for Magic WAN v2 workloads | GRE overhead, complexity | Use v2 for simplified routing | +| Assuming BGP session = traffic flowing | BGP up ≠ routes installed | Verify routing tables + test traffic | +| Not enabling maintenance alerts | Surprise downtime during maintenance | Enable notifications immediately | +| Hardcoding VLAN in automation | VLAN assigned by CF (v1) | Get VLAN from CNI object response | +| Using Direct without colocation | Can't access cross-connect | Use Partner or Cloud interconnect | + +## What's Not Queryable via API + +**Cannot retrieve:** +- BGP session state (use Dashboard or BGP 
logs) +- Light levels (contact account team) +- Historical metrics (uptime, traffic) +- Bandwidth utilization per interconnect +- Maintenance window schedules (notifications only) +- Fiber path details +- Cross-connect installation status + +**Workarounds:** +- External monitoring for BGP state +- Log aggregation for historical data +- Notifications for maintenance windows + +## Limits + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| Max optical distance | 10km | Physical limit | +| MTU (v1) | 1500↓ / 1476↑ | Asymmetric | +| MTU (v2) | 1500 both | Symmetric | +| GRE tunnel throughput | 1 Gbps | Per tunnel (v1) | +| Recovery time | Days | No formal SLA | +| Light level minimum | -20 dBm | Target threshold | +| API rate limit | 1200 req/5min | Per token | +| Health check delay | 6 hours | New maintenance alert subscriptions | diff --git a/cloudflare/references/network-interconnect/patterns.md b/cloudflare/references/network-interconnect/patterns.md new file mode 100644 index 0000000..7ff9dd3 --- /dev/null +++ b/cloudflare/references/network-interconnect/patterns.md @@ -0,0 +1,166 @@ +# CNI Patterns + +See [README.md](README.md) for overview. + +## High Availability + +**Critical:** Design for resilience from day one. + +**Requirements:** +- Device-level diversity (separate hardware) +- Backup Internet connectivity (no SLA on CNI) +- Network-resilient locations preferred +- Regular failover testing + +**Architecture:** +``` +Your Network A ──10G CNI v2──> CF CCR Device 1 + │ +Your Network B ──10G CNI v2──> CF CCR Device 2 + │ + CF Global Network (AS13335) +``` + +**Capacity Planning:** +- Plan across all links +- Account for failover scenarios +- Your responsibility + +## Pattern: Magic Transit + CNI v2 + +**Use Case:** DDoS protection, private connectivity, no GRE overhead. + +```typescript +// 1. 
Create interconnect +const ic = await client.networkInterconnects.interconnects.create({ + account_id: id, + type: 'direct', + facility: 'EWR1', + speed: '10G', + name: 'magic-transit-primary', +}); + +// 2. Poll until active +const status = await pollUntilActive(id, ic.id); + +// 3. Configure Magic Transit tunnel via Dashboard/API +``` + +**Benefits:** 1500 MTU both ways, simplified routing. + +## Pattern: Multi-Cloud Hybrid + +**Use Case:** AWS/GCP workloads with Cloudflare. + +**AWS Direct Connect:** +```typescript +// 1. Order Direct Connect in AWS Console +// 2. Get LOA + VLAN from AWS +// 3. Send to CF account team (no API) +// 4. Configure static routes in Magic WAN + +await configureStaticRoutes(id, { + prefix: '10.0.0.0/8', + nexthop: 'aws-direct-connect', +}); +``` + +**GCP Cloud Interconnect:** +``` +1. Get VLAN attachment pairing key from GCP Console +2. Create via Dashboard: Interconnects → Create → Cloud Interconnect → Google + - Enter pairing key, name, MTU, speed +3. Configure static routes in Magic WAN (BGP routes from GCP ignored) +4. Configure custom learned routes in GCP Cloud Router +``` + +**Note:** Dashboard-only. No API/SDK support yet. + +## Pattern: Multi-Location HA + +**Use Case:** 99.99%+ uptime. 
+ +```typescript +// Primary (NY) +const primary = await client.networkInterconnects.interconnects.create({ + account_id: id, + type: 'direct', + facility: 'EWR1', + speed: '10G', + name: 'primary-ewr1', +}); + +// Secondary (NY, different hardware) +const secondary = await client.networkInterconnects.interconnects.create({ + account_id: id, + type: 'direct', + facility: 'EWR2', + speed: '10G', + name: 'secondary-ewr2', +}); + +// Tertiary (LA, different geography) +const tertiary = await client.networkInterconnects.interconnects.create({ + account_id: id, + type: 'partner', + facility: 'LAX1', + speed: '10G', + name: 'tertiary-lax1', +}); + +// BGP local preferences: +// Primary: 200 +// Secondary: 150 +// Tertiary: 100 +// Internet: Last resort +``` + +## Pattern: Partner Interconnect (Equinix) + +**Use Case:** Quick deployment, no colocation. + +**Setup:** +1. Order virtual circuit in Equinix Fabric Portal +2. Select Cloudflare as destination +3. Choose facility +4. Send details to CF account team +5. CF accepts in portal +6. Configure BGP + +**No API automation** – partner portals managed separately. 
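The `pollUntilActive` helper used in the Magic Transit pattern above is not part of the SDK; a minimal sketch follows. The status-reading call is injected as a function because the exact read method and status field may differ between SDK versions — treat both the `'active'` string and the wrapped call as assumptions to verify against the API reference. Polling every 60 seconds stays well clear of the 1200 req/5min rate limit and the "polling every second" anti-pattern.

```typescript
// Hypothetical sketch of the `pollUntilActive` helper referenced above.
// `getStatus` wraps the SDK read call (e.g. interconnects.get) so this
// helper stays independent of the exact client API; the 'active' status
// string is also an assumption to verify against the API docs.
async function pollUntilActive(
  getStatus: () => Promise<string>,
  intervalMs = 60_000,        // 60s between polls (avoid rate limits)
  timeoutMs = 60 * 60 * 1000, // give up after 1 hour
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if ((await getStatus()) === 'active') return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('interconnect did not become active before timeout');
}
```

Usage would look like `await pollUntilActive(() => client.networkInterconnects.interconnects.get(...).then(ic => ic.status))`, adjusted to the real method signature.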
+ +## Failover & Security + +**Failover Best Practices:** +- Use BGP local preferences for priority +- Configure BFD for fast detection (v1) +- Test regularly with traffic shift +- Document runbooks + +**Security:** +- BGP password authentication +- BGP route filtering +- Monitor unexpected routes +- Magic Firewall for DDoS/threats +- Minimum API token permissions +- Rotate credentials periodically + +## Decision Matrix + +| Requirement | Recommended | +|-------------|-------------| +| Collocated with CF | Direct | +| Not collocated | Partner | +| AWS/GCP workloads | Cloud | +| 1500 MTU both ways | v2 | +| VLAN tagging | v1 | +| Public peering | v1 | +| Simplest config | v2 | +| BFD fast failover | v1 | +| LACP bundling | v1 | + +## Resources + +- [Magic Transit Docs](https://developers.cloudflare.com/magic-transit/) +- [Magic WAN Docs](https://developers.cloudflare.com/magic-wan/) +- [Argo Smart Routing](https://developers.cloudflare.com/argo/) diff --git a/cloudflare/references/observability/README.md b/cloudflare/references/observability/README.md new file mode 100644 index 0000000..58feed6 --- /dev/null +++ b/cloudflare/references/observability/README.md @@ -0,0 +1,87 @@ +# Cloudflare Observability Skill Reference + +**Purpose**: Comprehensive guidance for implementing observability in Cloudflare Workers, covering traces, logs, metrics, and analytics. + +**Scope**: Cloudflare Observability features ONLY - Workers Logs, Traces, Analytics Engine, Logpush, Metrics & Analytics, and OpenTelemetry exports. + +--- + +## Decision Tree: Which File to Load? + +Use this to route to the correct file without loading all content: + +``` +├─ "How do I enable/configure X?" → configuration.md +├─ "What's the API/method/binding for X?" → api.md +├─ "How do I implement X pattern?" 
→ patterns.md +│ ├─ Usage tracking/billing → patterns.md +│ ├─ Error tracking → patterns.md +│ ├─ Performance monitoring → patterns.md +│ ├─ Multi-tenant tracking → patterns.md +│ ├─ Tail Worker filtering → patterns.md +│ └─ OpenTelemetry export → patterns.md +└─ "Why isn't X working?" / "Limits?" → gotchas.md +``` + +## Reading Order + +Load files in this order based on task: + +| Task Type | Load Order | Reason | +|-----------|------------|--------| +| **Initial setup** | configuration.md → gotchas.md | Setup first, avoid pitfalls | +| **Implement feature** | patterns.md → api.md → gotchas.md | Pattern → API details → edge cases | +| **Debug issue** | gotchas.md → configuration.md | Common issues first | +| **Query data** | api.md → patterns.md | API syntax → query examples | + +## Product Overview + +### Workers Logs +- **What:** Console output from Workers (console.log/warn/error) +- **Access:** Dashboard (Real-time Logs), Logpush, Tail Workers +- **Cost:** Free (included with all Workers) +- **Retention:** Real-time only (no historical storage in dashboard) + +### Workers Traces +- **What:** Execution traces with timing, CPU usage, outcome +- **Access:** Dashboard (Workers Analytics → Traces), Logpush +- **Cost:** $0.10/1M spans (GA pricing starts March 1, 2026), 10M free/month +- **Retention:** 14 days included + +### Analytics Engine +- **What:** High-cardinality event storage and SQL queries +- **Access:** SQL API, Dashboard (Analytics → Analytics Engine) +- **Cost:** $0.25/1M writes beyond 10M free/month +- **Retention:** 90 days (configurable up to 1 year) + +### Tail Workers +- **What:** Workers that receive logs/traces from other Workers +- **Use Cases:** Log filtering, transformation, external export +- **Cost:** Standard Workers pricing + +### Logpush +- **What:** Stream logs to external storage (S3, R2, Datadog, etc.) 
+- **Access:** Dashboard, API +- **Cost:** Requires Business/Enterprise plan + +## Pricing Summary (2026) + +| Feature | Free Tier | Cost Beyond Free Tier | Plan Requirement | +|---------|-----------|----------------------|------------------| +| Workers Logs | Unlimited | Free | Any | +| Workers Traces | 10M spans/month | $0.10/1M spans | Paid Workers (GA: March 1, 2026) | +| Analytics Engine | 10M writes/month | $0.25/1M writes | Paid Workers | +| Logpush | N/A | Included in plan | Business/Enterprise | + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, deployment, configuration (Logs, Traces, Analytics Engine, Tail Workers, Logpush) +- **[api.md](api.md)** - API endpoints, methods, interfaces (GraphQL, SQL, bindings, types) +- **[patterns.md](patterns.md)** - Common patterns, use cases, examples (billing, monitoring, error tracking, exports) +- **[gotchas.md](gotchas.md)** - Troubleshooting, best practices, limitations (common errors, performance gotchas, pricing) + +## See Also + +- [Cloudflare Workers Docs](https://developers.cloudflare.com/workers/) +- [Analytics Engine Docs](https://developers.cloudflare.com/analytics/analytics-engine/) +- [Workers Traces Docs](https://developers.cloudflare.com/workers/observability/traces/) diff --git a/cloudflare/references/observability/api.md b/cloudflare/references/observability/api.md new file mode 100644 index 0000000..a0161de --- /dev/null +++ b/cloudflare/references/observability/api.md @@ -0,0 +1,164 @@ +## API Reference + +### GraphQL Analytics API + +**Endpoint**: `https://api.cloudflare.com/client/v4/graphql` + +**Query Workers Metrics**: +```graphql +query { + viewer { + accounts(filter: { accountTag: $accountId }) { + workersInvocationsAdaptive( + limit: 100 + filter: { + datetime_geq: "2025-01-01T00:00:00Z" + datetime_leq: "2025-01-31T23:59:59Z" + scriptName: "my-worker" + } + ) { + sum { + requests + errors + subrequests + } + quantiles { + cpuTimeP50 + cpuTimeP99 + wallTimeP50 + 
wallTimeP99
        }
      }
    }
  }
}
```

### Analytics Engine SQL API

**Endpoint**: `https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql`

**Authentication**: `Authorization: Bearer <token>` (Account Analytics Read permission)

**Common Queries**:

```sql
-- List all datasets
SHOW TABLES;

-- Time-series aggregation (5-minute buckets)
SELECT
  intDiv(toUInt32(timestamp), 300) * 300 AS time_bucket,
  blob1 AS endpoint,
  SUM(_sample_interval) AS total_requests,
  AVG(double1) AS avg_response_time_ms
FROM api_metrics
WHERE timestamp >= NOW() - INTERVAL '24' HOUR
GROUP BY time_bucket, endpoint
ORDER BY time_bucket DESC;

-- Top customers by usage
SELECT
  index1 AS customer_id,
  SUM(_sample_interval * double1) AS total_api_calls,
  AVG(double2) AS avg_response_time_ms
FROM api_usage
WHERE timestamp >= NOW() - INTERVAL '7' DAY
GROUP BY customer_id
ORDER BY total_api_calls DESC
LIMIT 100;

-- Error rate analysis
SELECT
  blob1 AS error_type,
  COUNT(*) AS occurrences,
  MAX(timestamp) AS last_seen
FROM error_tracking
WHERE timestamp >= NOW() - INTERVAL '1' HOUR
GROUP BY error_type
ORDER BY occurrences DESC;
```

### Console Logging API

**Methods**:
```typescript
// Standard methods (all appear in Workers Logs)
console.log('info message');
console.info('info message');
console.warn('warning message');
console.error('error message');
console.debug('debug message');

// Structured logging (recommended)
console.log({
  level: 'info',
  user_id: '123',
  action: 'checkout',
  amount: 99.99,
  currency: 'USD'
});
```

**Log Levels**: All console methods produce logs; use structured fields for filtering:
```typescript
console.log({
  level: 'error',
  message: 'Payment failed',
  error_code: 'CARD_DECLINED'
});
```

### Analytics Engine Binding Types

```typescript
interface AnalyticsEngineDataset {
  writeDataPoint(event: AnalyticsEngineDataPoint): void;
}

interface AnalyticsEngineDataPoint 
{ + // Indexed strings (use for filtering/grouping) + indexes?: string[]; + + // Non-indexed strings (metadata, IDs, URLs) + blobs?: string[]; + + // Numeric values (counts, durations, amounts) + doubles?: number[]; +} +``` + +**Field Limits**: +- Max 20 indexes +- Max 20 blobs +- Max 20 doubles +- Max 25 `writeDataPoint` calls per request + +### Tail Consumer Event Type + +```typescript +interface TraceItem { + event: TraceEvent; + logs: TraceLog[]; + exceptions: TraceException[]; + scriptName?: string; +} + +interface TraceEvent { + outcome: 'ok' | 'exception' | 'exceededCpu' | 'exceededMemory' | 'unknown'; + cpuTime: number; // microseconds + wallTime: number; // microseconds +} + +interface TraceLog { + timestamp: number; + level: 'log' | 'info' | 'debug' | 'warn' | 'error'; + message: any; // string or structured object +} + +interface TraceException { + name: string; + message: string; + timestamp: number; +} +``` \ No newline at end of file diff --git a/cloudflare/references/observability/configuration.md b/cloudflare/references/observability/configuration.md new file mode 100644 index 0000000..483de4c --- /dev/null +++ b/cloudflare/references/observability/configuration.md @@ -0,0 +1,169 @@ +## Configuration Patterns + +### Enable Workers Logs + +```jsonc +{ + "observability": { + "enabled": true, + "head_sampling_rate": 1 // 100% sampling (default) + } +} +``` + +**Best Practice**: Use structured JSON logging for better indexing + +```typescript +// Good - structured logging +console.log({ + user_id: 123, + action: "login", + status: "success", + duration_ms: 45 +}); + +// Avoid - unstructured string +console.log("user_id: 123 logged in successfully in 45ms"); +``` + +### Enable Workers Traces + +```jsonc +{ + "observability": { + "traces": { + "enabled": true, + "head_sampling_rate": 0.05 // 5% sampling + } + } +} +``` + +**Note**: Default sampling is 100%. For high-traffic Workers, use lower sampling (0.01-0.1). 
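To pick a concrete rate, note that retained span volume scales linearly with `head_sampling_rate`. A rough sizing helper (assuming roughly one span per sampled invocation — a simplification, since real traces can emit multiple spans per request) shows how quickly high traffic overruns the 10M-span monthly free tier listed in this reference:

```typescript
// Rough estimate of monthly trace spans for a given head sampling rate.
// Assumes ~1 span per sampled invocation -- an intentional simplification.
const FREE_TIER_SPANS = 10_000_000; // 10M spans/month free tier
const SECONDS_PER_MONTH = 30 * 24 * 3600;

function estimateMonthlySpans(requestsPerSecond: number, headSamplingRate: number) {
  const spansPerMonth = requestsPerSecond * SECONDS_PER_MONTH * headSamplingRate;
  return { spansPerMonth, withinFreeTier: spansPerMonth <= FREE_TIER_SPANS };
}

// 1000 req/s at 5% sampling → 129.6M spans/month, far beyond the free tier;
// the same Worker needs a rate below ~0.4% to stay within it.
```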

### Configure Analytics Engine

**Bind to Worker**:
```toml
# wrangler.toml
analytics_engine_datasets = [
  { binding = "ANALYTICS", dataset = "api_metrics" }
]
```

**Write Data Points**:
```typescript
export interface Env {
  ANALYTICS: AnalyticsEngineDataset;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Track metrics
    env.ANALYTICS.writeDataPoint({
      blobs: ['customer_123', 'POST', '/api/v1/users'],
      doubles: [1, 245.5], // request_count, response_time_ms
      indexes: ['customer_123'] // for efficient filtering
    });

    return new Response('OK');
  }
}
```

### Configure Tail Workers

Tail Workers receive logs/traces from other Workers for filtering, transformation, or export.

**Setup**:
```toml
# wrangler.toml
name = "log-processor"
main = "src/tail.ts"

[[tail_consumers]]
service = "my-worker" # Worker to tail
```

**Tail Worker Example**:
```typescript
export default {
  async tail(events: TraceItem[], env: Env, ctx: ExecutionContext) {
    // Filter errors only
    const errors = events.filter(event =>
      event.outcome === 'exception' || event.outcome === 'exceededCpu'
    );

    if (errors.length > 0) {
      // Send to external monitoring
      ctx.waitUntil(
        fetch('https://monitoring.example.com/errors', {
          method: 'POST',
          body: JSON.stringify(errors)
        })
      );
    }
  }
}
```

### Configure Logpush

Send logs to external storage (S3, R2, GCS, Azure, Datadog, etc.). Requires Business/Enterprise plan.

**Via Dashboard**:
1. Navigate to Analytics → Logs → Logpush
2. Select destination type
3. Provide credentials and bucket/endpoint
4. Choose dataset (e.g., Workers Trace Events)
5. 
Configure filters and fields

**Via API**:
```bash
curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/logpush/jobs" \
  -H "Authorization: Bearer <API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "workers-logs-to-s3",
    "destination_conf": "s3://my-bucket/logs?region=us-east-1",
    "dataset": "workers_trace_events",
    "enabled": true,
    "frequency": "high",
    "filter": "{\"where\":{\"and\":[{\"key\":\"ScriptName\",\"operator\":\"eq\",\"value\":\"my-worker\"}]}}"
  }'
```

### Environment-Specific Configuration

**Development** (verbose logs, full sampling):
```jsonc
// wrangler.dev.jsonc
{
  "observability": {
    "enabled": true,
    "head_sampling_rate": 1.0,
    "traces": {
      "enabled": true
    }
  }
}
```

**Production** (reduced sampling, structured logs):
```jsonc
// wrangler.prod.jsonc
{
  "observability": {
    "enabled": true,
    "head_sampling_rate": 0.1, // 10% sampling
    "traces": {
      "enabled": true
    }
  }
}
```

Deploy with env-specific config:
```bash
wrangler deploy --config wrangler.prod.jsonc --env production
```
\ No newline at end of file
diff --git a/cloudflare/references/observability/gotchas.md b/cloudflare/references/observability/gotchas.md new file mode 100644 index 0000000..42bc738 --- /dev/null +++ b/cloudflare/references/observability/gotchas.md @@ -0,0 +1,115 @@
## Common Errors

### "Logs not appearing"

**Cause:** Observability disabled, Worker not redeployed, no traffic, low sampling rate, or log size exceeds 256 KB
**Solution:**
```bash
# Verify config
cat wrangler.jsonc | jq '.observability'

# Check deployment
wrangler deployments list

# Test with curl
curl https://your-worker.workers.dev
```
Ensure `observability.enabled = true`, redeploy Worker, check `head_sampling_rate`, verify traffic

### "Traces not being captured"

**Cause:** Traces not enabled, incorrect sampling rate, Worker not redeployed, or destination unavailable
**Solution:**
```jsonc
// 
Temporarily set to 100% sampling for debugging +{ + "observability": { + "enabled": true, + "head_sampling_rate": 1.0, + "traces": { + "enabled": true + } + } +} +``` +Ensure `observability.traces.enabled = true`, set `head_sampling_rate` to 1.0 for testing, redeploy, check destination status + +## Limits + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| Max log size | 256 KB | Logs exceeding this are truncated | +| Default sampling rate | 1.0 (100%) | Reduce for high-traffic Workers | +| Max destinations | Varies by plan | Check dashboard | +| Trace context propagation | 100 spans max | Deep call chains may lose spans | +| Analytics Engine write rate | 25 writes/request | Excess writes dropped silently | + +## Performance Gotchas + +### Spectre Mitigation Timing + +**Problem:** `Date.now()` and `performance.now()` have reduced precision (coarsened to 100μs) +**Cause:** Spectre vulnerability mitigation in V8 +**Solution:** Accept reduced precision or use Workers Traces for accurate timing +```typescript +// Date.now() is coarsened - trace spans are accurate +export default { + async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise { + // For user-facing timing, Date.now() is fine + const start = Date.now(); + const response = await processRequest(request); + const duration = Date.now() - start; + + // For detailed performance analysis, use Workers Traces instead + return response; + } +} +``` + +### Analytics Engine _sample_interval Aggregation + +**Problem:** Queries return incorrect totals when not multiplying by `_sample_interval` +**Cause:** Analytics Engine stores sampled data points, each representing multiple events +**Solution:** Always multiply counts/sums by `_sample_interval` in aggregations +```sql +-- WRONG: Undercounts actual events +SELECT blob1 AS customer_id, COUNT(*) AS total_calls +FROM api_usage GROUP BY customer_id; + +-- CORRECT: Accounts for sampling +SELECT blob1 AS customer_id, 
SUM(_sample_interval) AS total_calls +FROM api_usage GROUP BY customer_id; +``` + +### Trace Context Propagation Limits + +**Problem:** Deep call chains lose trace context after 100 spans +**Cause:** Cloudflare limits trace depth to prevent performance impact +**Solution:** Design for flatter architectures or use custom correlation IDs for deep chains +```typescript +// For deep call chains, add custom correlation ID +const correlationId = crypto.randomUUID(); +console.log({ correlationId, event: 'request_start' }); + +// Pass correlationId through headers to downstream services +await fetch('https://api.example.com', { + headers: { 'X-Correlation-ID': correlationId } +}); +``` + +## Pricing (2026) + +### Workers Traces +- **GA Pricing (starts March 1, 2026):** + - $0.10 per 1M trace spans captured + - Retention: 14 days included +- **Free tier:** 10M trace spans/month +- **Note:** Beta usage (before March 1, 2026) is free + +### Workers Logs +- **Included:** Free for all Workers +- **Logpush:** Requires Business/Enterprise plan + +### Analytics Engine +- **Included:** 10M writes/month on Paid Workers plan +- **Additional:** $0.25 per 1M writes beyond included quota diff --git a/cloudflare/references/observability/patterns.md b/cloudflare/references/observability/patterns.md new file mode 100644 index 0000000..9135c68 --- /dev/null +++ b/cloudflare/references/observability/patterns.md @@ -0,0 +1,105 @@ +# Observability Patterns + +## Usage-Based Billing + +```typescript +env.ANALYTICS.writeDataPoint({ + blobs: [customerId, request.url, request.method], + doubles: [1], // request_count + indexes: [customerId] +}); +``` + +```sql +SELECT blob1 AS customer_id, SUM(_sample_interval * double1) AS total_calls +FROM api_usage WHERE timestamp >= DATE_TRUNC('month', NOW()) +GROUP BY customer_id +``` + +## Performance Monitoring + +```typescript +const start = Date.now(); +const response = await fetch(url); +env.ANALYTICS.writeDataPoint({ + blobs: [url, 
response.status.toString()], + doubles: [Date.now() - start, response.status] +}); +``` + +```sql +SELECT blob1 AS url, AVG(double1) AS avg_ms, percentile(double1, 0.95) AS p95_ms +FROM fetch_metrics WHERE timestamp >= NOW() - INTERVAL '1' HOUR +GROUP BY url +``` + +## Error Tracking + +```typescript +env.ANALYTICS.writeDataPoint({ + blobs: [error.name, request.url, request.method], + doubles: [1], + indexes: [error.name] +}); +``` + +## Multi-Tenant Tracking + +```typescript +env.ANALYTICS.writeDataPoint({ + indexes: [tenantId], // efficient filtering + blobs: [tenantId, url.pathname, method, status], + doubles: [1, duration, bytesSize] +}); +``` + +## Tail Worker Log Filtering + +```typescript +export default { + async tail(events, env, ctx) { + const critical = events.filter(e => + e.exceptions.length > 0 || e.event.wallTime > 1000000 + ); + if (critical.length === 0) return; + + ctx.waitUntil( + fetch('https://logging.example.com/ingest', { + method: 'POST', + headers: { 'Authorization': `Bearer ${env.API_KEY}` }, + body: JSON.stringify(critical.map(e => ({ + outcome: e.event.outcome, + cpu_ms: e.event.cpuTime / 1000, + errors: e.exceptions + }))) + }) + ); + } +}; +``` + +## OpenTelemetry Export + +```typescript +export default { + async tail(events, env, ctx) { + const otelSpans = events.map(e => ({ + traceId: generateId(32), + spanId: generateId(16), + name: e.scriptName || 'worker.request', + attributes: [ + { key: 'worker.outcome', value: { stringValue: e.event.outcome } }, + { key: 'worker.cpu_time_us', value: { intValue: String(e.event.cpuTime) } } + ] + })); + + ctx.waitUntil( + fetch('https://api.honeycomb.io/v1/traces', { + method: 'POST', + headers: { 'X-Honeycomb-Team': env.HONEYCOMB_KEY }, + body: JSON.stringify({ resourceSpans: [{ scopeSpans: [{ spans: otelSpans }] }] }) + }) + ); + } +}; +``` diff --git a/cloudflare/references/pages-functions/README.md b/cloudflare/references/pages-functions/README.md new file mode 100644 index 0000000..deaf461 
--- /dev/null +++ b/cloudflare/references/pages-functions/README.md @@ -0,0 +1,98 @@ +# Cloudflare Pages Functions + +Serverless functions on Cloudflare Pages using Workers runtime. Full-stack dev with file-based routing. + +## Quick Navigation + +**Need to...** +| Task | Go to | +|------|-------| +| Set up TypeScript types | [configuration.md](./configuration.md) - TypeScript Setup | +| Configure bindings (KV, D1, R2) | [configuration.md](./configuration.md) - wrangler.jsonc | +| Access request/env/params | [api.md](./api.md) - EventContext | +| Add middleware or auth | [patterns.md](./patterns.md) - Middleware, Auth | +| Background tasks (waitUntil) | [patterns.md](./patterns.md) - Background Tasks | +| Debug errors or check limits | [gotchas.md](./gotchas.md) - Common Errors, Limits | + +## Decision Tree: Is This Pages Functions? + +``` +Need serverless backend? +├─ Yes, for a static site → Pages Functions +├─ Yes, standalone API → Workers +└─ Just static hosting → Pages (no functions) + +Have existing Worker? +├─ Complex routing logic → Use _worker.js (Advanced Mode) +└─ Simple routes → Migrate to /functions (File-Based) + +Framework-based? 
+├─ Next.js/SvelteKit/Remix → Uses _worker.js automatically +└─ Vanilla/HTML/React SPA → Use /functions +``` + +## File-Based Routing + +``` +/functions + ├── index.js → / + ├── api.js → /api + ├── users/ + │ ├── index.js → /users/ + │ ├── [user].js → /users/:user + │ └── [[catchall]].js → /users/* + └── _middleware.js → runs on all routes +``` + +**Rules:** +- `index.js` → directory root +- Trailing slash optional +- Specific routes precede catch-alls +- Falls back to static if no match + +## Dynamic Routes + +**Single segment** `[param]` → string: +```js +// /functions/users/[user].js +export function onRequest(context) { + return new Response(`Hello ${context.params.user}`); +} +// Matches: /users/nevi +``` + +**Multi-segment** `[[param]]` → array: +```js +// /functions/users/[[catchall]].js +export function onRequest(context) { + return new Response(JSON.stringify(context.params.catchall)); +} +// Matches: /users/nevi/foobar → ["nevi", "foobar"] +``` + +## Key Features + +- **Method handlers:** `onRequestGet`, `onRequestPost`, etc. +- **Middleware:** `_middleware.js` for cross-cutting concerns +- **Bindings:** KV, D1, R2, Durable Objects, Workers AI, Service bindings +- **TypeScript:** Full type support via `wrangler types` command +- **Advanced mode:** Use `_worker.js` for custom routing logic + +## Reading Order + +**New to Pages Functions?** Start here: +1. [README.md](./README.md) - Overview, routing, decision tree (you are here) +2. [configuration.md](./configuration.md) - TypeScript setup, wrangler.jsonc, bindings +3. [api.md](./api.md) - EventContext, handlers, bindings reference +4. [patterns.md](./patterns.md) - Middleware, auth, CORS, rate limiting, caching +5. 
[gotchas.md](./gotchas.md) - Common errors, debugging, limits

**Quick reference lookup:**
- Bindings table → [api.md](./api.md)
- Error diagnosis → [gotchas.md](./gotchas.md)
- TypeScript setup → [configuration.md](./configuration.md)

## See Also
- [pages](../pages/) - Pages platform overview and static site deployment
- [workers](../workers/) - Workers runtime API reference
- [d1](../d1/) - D1 database integration with Pages Functions
diff --git a/cloudflare/references/pages-functions/api.md b/cloudflare/references/pages-functions/api.md new file mode 100644 index 0000000..5263372 --- /dev/null +++ b/cloudflare/references/pages-functions/api.md @@ -0,0 +1,143 @@
# Function API

## EventContext

```typescript
interface EventContext<Env = unknown, P extends string = any, Data = Record<string, unknown>> {
  request: Request; // Incoming request
  functionPath: string; // Request path
  waitUntil(promise: Promise<any>): void; // Background tasks (non-blocking)
  passThroughOnException(): void; // Fallback to static on error
  next(input?: Request | string, init?: RequestInit): Promise<Response>;
  env: Env; // Bindings, vars, secrets
  params: Record<P, string | string[]>; // Route params ([user] or [[catchall]])
  data: Data; // Middleware shared state
}
```

**TypeScript:** See [configuration.md](./configuration.md) for `wrangler types` setup

## Handlers

```typescript
// Generic (fallback for any method)
export async function onRequest(ctx: EventContext): Promise<Response> {
  return new Response('Any method');
}

// Method-specific (takes precedence over generic)
export async function onRequestGet(ctx: EventContext): Promise<Response> {
  return Response.json({ message: 'GET' });
}

export async function onRequestPost(ctx: EventContext): Promise<Response> {
  const body = await ctx.request.json();
  return Response.json({ received: body });
}
// Also: onRequestPut, onRequestPatch, onRequestDelete, onRequestHead, onRequestOptions
```

## Bindings Reference

| Binding Type | Interface | Config Key | Use Case |
|--------------|-----------|------------|----------|
| KV | `KVNamespace` | `kv_namespaces` | Key-value cache, sessions, config |
| D1 | `D1Database` | `d1_databases` | Relational data, SQL queries |
| R2 | `R2Bucket` | `r2_buckets` | Large files, user uploads, assets |
| Durable Objects | `DurableObjectNamespace` | `durable_objects.bindings` | Stateful coordination, websockets |
| Workers AI | `Ai` | `ai.binding` | LLM inference, embeddings |
| Vectorize | `VectorizeIndex` | `vectorize` | Vector search, embeddings |
| Service Binding | `Fetcher` | `services` | Worker-to-worker RPC |
| Analytics Engine | `AnalyticsEngineDataset` | `analytics_engine_datasets` | Event logging, metrics |
| Environment Vars | `string` | `vars` | Non-sensitive config |

See [configuration.md](./configuration.md) for wrangler.jsonc examples.

## Bindings

### KV

```typescript
interface Env { KV: KVNamespace; }
export const onRequest: PagesFunction<Env> = async (ctx) => {
  await ctx.env.KV.put('key', 'value', { expirationTtl: 3600 });
  const val = await ctx.env.KV.get('key', { type: 'json' });
  const keys = await ctx.env.KV.list({ prefix: 'user:' });
  return Response.json({ val });
};
```

### D1

```typescript
interface Env { DB: D1Database; }
export const onRequest: PagesFunction<Env> = async (ctx) => {
  const user = await ctx.env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(123).first();
  return Response.json(user);
};
```

### R2

```typescript
interface Env { BUCKET: R2Bucket; }
export const onRequest: PagesFunction<Env> = async (ctx) => {
  const obj = await ctx.env.BUCKET.get('file.txt');
  if (!obj) return new Response('Not found', { status: 404 });
  await ctx.env.BUCKET.put('file.txt', ctx.request.body);
  return new Response(obj.body);
};
```

### Durable Objects

```typescript
interface Env { COUNTER: DurableObjectNamespace; }
export const onRequest: PagesFunction<Env> = async (ctx) => {
  const stub = 
ctx.env.COUNTER.get(ctx.env.COUNTER.idFromName('global')); + return stub.fetch(ctx.request); +}; +``` + +### Workers AI + +```typescript +interface Env { AI: Ai; } +export const onRequest: PagesFunction = async (ctx) => { + const resp = await ctx.env.AI.run('@cf/meta/llama-3.1-8b-instruct', { prompt: 'Hello' }); + return Response.json(resp); +}; +``` + +### Service Bindings & Env Vars + +```typescript +interface Env { AUTH: Fetcher; API_KEY: string; } +export const onRequest: PagesFunction = async (ctx) => { + // Service binding: forward to another Worker + return ctx.env.AUTH.fetch(ctx.request); + + // Environment variable + return Response.json({ key: ctx.env.API_KEY }); +}; +``` + +## Advanced Mode (env.ASSETS) + +When using `_worker.js`, access static assets via `env.ASSETS.fetch()`: + +```typescript +interface Env { ASSETS: Fetcher; KV: KVNamespace; } + +export default { + async fetch(request: Request, env: Env): Promise { + const url = new URL(request.url); + if (url.pathname.startsWith('/api/')) { + return Response.json({ data: await env.KV.get('key') }); + } + return env.ASSETS.fetch(request); // Fallback to static + } +} satisfies ExportedHandler; +``` + +**See also:** [configuration.md](./configuration.md) for TypeScript setup and wrangler.jsonc | [patterns.md](./patterns.md) for middleware and auth patterns diff --git a/cloudflare/references/pages-functions/configuration.md b/cloudflare/references/pages-functions/configuration.md new file mode 100644 index 0000000..62ba298 --- /dev/null +++ b/cloudflare/references/pages-functions/configuration.md @@ -0,0 +1,122 @@ +# Configuration + +## TypeScript Setup + +**Generate types from wrangler.jsonc** (replaces deprecated `@cloudflare/workers-types`): + +```bash +npx wrangler types +``` + +Creates `worker-configuration.d.ts` with typed `Env` interface based on your bindings. + +```typescript +// functions/api.ts +export const onRequest: PagesFunction = async (ctx) => { + // ctx.env.KV, ctx.env.DB, etc. 
are fully typed
+  return Response.json({ ok: true });
+};
+```
+
+**Manual types** (if not using wrangler types):
+
+```typescript
+interface Env {
+  KV: KVNamespace;
+  DB: D1Database;
+  API_KEY: string;
+}
+export const onRequest: PagesFunction<Env> = async (ctx) => { /* ... */ };
+```
+
+## wrangler.jsonc
+
+```jsonc
+{
+  "$schema": "./node_modules/wrangler/config-schema.json",
+  "name": "my-pages-app",
+  "pages_build_output_dir": "./dist",
+  "compatibility_date": "2025-01-01",
+  "compatibility_flags": ["nodejs_compat"],
+
+  "vars": { "API_URL": "https://api.example.com" },
+  "kv_namespaces": [{ "binding": "KV", "id": "abc123" }],
+  "d1_databases": [{ "binding": "DB", "database_name": "prod-db", "database_id": "xyz789" }],
+  "r2_buckets": [{ "binding": "BUCKET", "bucket_name": "my-bucket" }],
+  "durable_objects": { "bindings": [{ "name": "COUNTER", "class_name": "Counter", "script_name": "counter-worker" }] },
+  "services": [{ "binding": "AUTH", "service": "auth-worker" }],
+  "ai": { "binding": "AI" },
+  "vectorize": [{ "binding": "VECTORIZE", "index_name": "my-index" }],
+  "analytics_engine_datasets": [{ "binding": "ANALYTICS" }]
+}
+```
+
+## Environment Overrides
+
+Top-level → local dev, `env.preview` → preview, `env.production` → production.
+
+```jsonc
+{
+  "vars": { "API_URL": "http://localhost:8787" },
+  "env": {
+    "production": { "vars": { "API_URL": "https://api.example.com" } }
+  }
+}
+```
+
+**Note:** If overriding `vars`, `kv_namespaces`, `d1_databases`, etc., ALL must be redefined (they are non-inheritable)
+
+## Local Secrets (.dev.vars)
+
+**Local dev only** - NOT deployed:
+
+```bash
+# .dev.vars (add to .gitignore)
+SECRET_KEY="my-secret-value"
+```
+
+Accessed via `ctx.env.SECRET_KEY`. 
Set production secrets: +```bash +echo "value" | npx wrangler pages secret put SECRET_KEY --project-name=my-app +``` + +## Static Config Files + +**_routes.json** - Custom routing: +```json +{ "version": 1, "include": ["/api/*"], "exclude": ["/static/*"] } +``` + +**_headers** - Static headers: +``` +/static/* + Cache-Control: public, max-age=31536000 +``` + +**_redirects** - Redirects: +``` +/old /new 301 +``` + +## Local Dev & Deployment + +```bash +# Dev server +npx wrangler pages dev ./dist + +# With bindings +npx wrangler pages dev ./dist --kv=KV --d1=DB=db-id --r2=BUCKET + +# Durable Objects (2 terminals) +cd do-worker && npx wrangler dev +cd pages-project && npx wrangler pages dev ./dist --do COUNTER=Counter@do-worker + +# Deploy +npx wrangler pages deploy ./dist +npx wrangler pages deploy ./dist --branch preview + +# Download config +npx wrangler pages download config my-project +``` + +**See also:** [api.md](./api.md) for binding usage examples diff --git a/cloudflare/references/pages-functions/gotchas.md b/cloudflare/references/pages-functions/gotchas.md new file mode 100644 index 0000000..f63e608 --- /dev/null +++ b/cloudflare/references/pages-functions/gotchas.md @@ -0,0 +1,94 @@ +# Gotchas & Debugging + +## Error Diagnosis + +| Symptom | Likely Cause | Solution | +|---------|--------------|----------| +| **Function not invoking** | Wrong `/functions` location, wrong extension, or `_routes.json` excludes path | Check `pages_build_output_dir`, use `.js`/`.ts`, verify `_routes.json` | +| **`ctx.env.BINDING` undefined** | Binding not configured or name mismatch | Add to `wrangler.jsonc`, verify exact name (case-sensitive), redeploy | +| **TypeScript errors on `ctx.env`** | Missing type definition | Run `wrangler types` or define `interface Env {}` | +| **Middleware not running** | Wrong filename/location or missing `ctx.next()` | Name exactly `_middleware.js`, export `onRequest`, call `ctx.next()` | +| **Secrets missing in production** | `.dev.vars` not 
deployed | `.dev.vars` is local only - set production secrets via dashboard or `wrangler pages secret put` |
+| **Type mismatch on binding** | Wrong interface type | See [api.md](./api.md) bindings table for correct types |
+| **"KV key not found" but exists** | Key in wrong namespace or env | Verify namespace binding, check preview vs production env |
+| **Function times out** | Synchronous wait or missing `await` | All I/O must be async/await, use `ctx.waitUntil()` for background tasks |
+
+## Common Errors
+
+### TypeScript type errors
+
+**Problem:** `ctx.env.MY_BINDING` shows type error
+**Cause:** No type definition for `Env`
+**Solution:** Run `npx wrangler types` or manually define:
+```typescript
+interface Env { MY_BINDING: KVNamespace; }
+export const onRequest: PagesFunction<Env> = async (ctx) => { /* ... */ };
+```
+
+### Secrets not available in production
+
+**Problem:** `ctx.env.SECRET_KEY` is undefined in production
+**Cause:** `.dev.vars` is local-only, not deployed
+**Solution:** Set production secrets:
+```bash
+echo "value" | npx wrangler pages secret put SECRET_KEY --project-name=my-app
+```
+
+## Debugging
+
+```typescript
+// Console logging
+export async function onRequest(ctx) {
+  console.log('Request:', ctx.request.method, ctx.request.url);
+  const res = await ctx.next();
+  console.log('Status:', res.status);
+  return res;
+}
+```
+
+```bash
+# Stream real-time logs
+npx wrangler pages deployment tail
+npx wrangler pages deployment tail --status error
+```
+
+```jsonc
+// Source maps (wrangler.jsonc)
+{ "upload_source_maps": true }
+```
+
+## Limits
+
+| Resource | Free | Paid |
+|----------|------|------|
+| CPU time | 10ms | 50ms |
+| Memory | 128 MB | 128 MB |
+| Script size | 10 MB compressed | 10 MB compressed |
+| Env vars | 5 KB per var, 64 max | 5 KB per var, 64 max |
+| Requests | 100k/day | Unlimited ($0.50/million) |
+
+## Best Practices
+
+**Performance:** Minimize deps (cold start), use KV for cache/D1 for relational/R2 for large files, 
set `Cache-Control` headers, batch DB ops, handle errors gracefully
+
+**Security:** Never commit secrets (use `.dev.vars` + gitignore), validate input, sanitize before DB, implement auth middleware, set CORS headers, rate limit per-IP
+
+## Migration
+
+**Workers → Pages Functions:**
+- `export default { fetch(req, env) {} }` → `export function onRequest(ctx) { const { request, env } = ctx; }`
+- Use `_worker.js` for complex routing: `env.ASSETS.fetch(request)` for static files
+
+**Other platforms → Pages:**
+- File-based routing: `/functions/api/users.js` → `/api/users`
+- Dynamic routes: `[param]` not `:param`
+- Replace Node.js deps with Workers APIs or add the `nodejs_compat` flag
+
+## Resources
+
+- [Official Docs](https://developers.cloudflare.com/pages/functions/)
+- [Workers APIs](https://developers.cloudflare.com/workers/runtime-apis/)
+- [Examples](https://github.com/cloudflare/pages-example-projects)
+- [Discord](https://discord.gg/cloudflaredev)
+
+**See also:** [configuration.md](./configuration.md) for TypeScript setup | [patterns.md](./patterns.md) for middleware/auth | [api.md](./api.md) for bindings
diff --git a/cloudflare/references/pages-functions/patterns.md b/cloudflare/references/pages-functions/patterns.md
new file mode 100644
index 0000000..22289e8
--- /dev/null
+++ b/cloudflare/references/pages-functions/patterns.md
@@ -0,0 +1,137 @@
+# Common Patterns
+
+## Background Tasks (waitUntil)
+
+Non-blocking tasks that run after the response is sent (analytics, cleanup, webhooks):
+
+```typescript
+// ctx is the Pages Functions EventContext
+export async function onRequest(ctx) {
+  const res = Response.json({ success: true });
+
+  ctx.waitUntil(ctx.env.KV.put('last-visit', new Date().toISOString()));
+  ctx.waitUntil(Promise.all([
+    ctx.env.ANALYTICS.writeDataPoint({ blobs: ['view'] }),
+    fetch('https://webhook.site/...', { method: 'POST' })
+  ]));
+
+  return res; // Returned immediately
+}
+```
+
+## Middleware & Auth
+
+```typescript
+// functions/_middleware.js (global) or 
functions/users/_middleware.js (scoped)
+export async function onRequest(ctx) {
+  try { return await ctx.next(); }
+  catch (err) { return new Response((err as Error).message, { status: 500 }); }
+}
+
+// Chained: export const onRequest = [errorHandler, auth, logger];
+
+// Auth
+async function auth(ctx) {
+  const token = ctx.request.headers.get('authorization')?.replace('Bearer ', '');
+  if (!token) return new Response('Unauthorized', { status: 401 });
+  const session = await ctx.env.KV.get(`session:${token}`);
+  if (!session) return new Response('Invalid', { status: 401 });
+  ctx.data.user = JSON.parse(session);
+  return ctx.next();
+}
+```
+
+## CORS & Rate Limiting
+
+```typescript
+// CORS middleware
+const cors = { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET, POST' };
+export async function onRequestOptions() { return new Response(null, { headers: cors }); }
+export async function onRequest(ctx) {
+  const res = await ctx.next();
+  Object.entries(cors).forEach(([k, v]) => res.headers.set(k, v));
+  return res;
+}
+
+// Rate limiting (KV-based)
+async function rateLimit(ctx) {
+  const ip = ctx.request.headers.get('CF-Connecting-IP') || 'unknown';
+  const count = parseInt(await ctx.env.KV.get(`rate:${ip}`) || '0');
+  if (count >= 100) return new Response('Rate limited', { status: 429 });
+  await ctx.env.KV.put(`rate:${ip}`, (count + 1).toString(), { expirationTtl: 3600 });
+  return ctx.next();
+}
+```
+
+## Forms, Caching, Redirects
+
+```typescript
+// JSON & file upload
+export async function onRequestPost(ctx) {
+  const ct = ctx.request.headers.get('content-type') || '';
+  if (ct.includes('application/json')) return Response.json(await ctx.request.json());
+  if (ct.includes('multipart/form-data')) {
+    const file = (await ctx.request.formData()).get('file') as File;
+    await ctx.env.BUCKET.put(file.name, file.stream());
+    return Response.json({ uploaded: file.name });
+  }
+  return new Response('Unsupported Media Type', { status: 415 });
+}
+
+// Cache API
+export async function 
onRequest(ctx) {
+  let res = await caches.default.match(ctx.request);
+  if (!res) {
+    res = new Response('Data');
+    res.headers.set('Cache-Control', 'public, max-age=3600');
+    ctx.waitUntil(caches.default.put(ctx.request, res.clone()));
+  }
+  return res;
+}
+
+// Redirects (separate file from the cache handler above)
+export async function onRequest(ctx) {
+  if (new URL(ctx.request.url).pathname === '/old') {
+    return Response.redirect(new URL('/new', ctx.request.url), 301);
+  }
+  return ctx.next();
+}
+```
+
+## Testing
+
+**Unit tests** (Vitest + `cloudflare:test`):
+```typescript
+import { env } from 'cloudflare:test';
+import { it, expect } from 'vitest';
+import { onRequest } from '../functions/api';
+
+it('returns JSON', async () => {
+  const req = new Request('http://localhost/api');
+  const ctx = { request: req, env, params: {}, data: {} } as EventContext;
+  const res = await onRequest(ctx);
+  expect(res.status).toBe(200);
+});
+```
+
+**Integration:** `wrangler pages dev` + Playwright/Cypress
+
+## Advanced Mode (_worker.js)
+
+Use `_worker.js` in the build output directory for complex routing (it replaces `/functions`):
+
+```typescript
+interface Env { ASSETS: Fetcher; KV: KVNamespace; }
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const url = new URL(request.url);
+    if (url.pathname.startsWith('/api/')) {
+      return Response.json({ data: await env.KV.get('key') });
+    }
+    return env.ASSETS.fetch(request); // Static files
+  }
+} satisfies ExportedHandler<Env>;
+```
+
+**When:** Existing Worker, framework-generated (Next.js/SvelteKit), custom routing logic
+
+**See also:** [api.md](./api.md) for `env.ASSETS.fetch()` | [gotchas.md](./gotchas.md) for debugging
diff --git a/cloudflare/references/pages/README.md b/cloudflare/references/pages/README.md
new file mode 100644
index 0000000..bf0546f
--- /dev/null
+++ b/cloudflare/references/pages/README.md
@@ -0,0 +1,88 @@
+# Cloudflare Pages
+
+JAMstack platform for full-stack apps on Cloudflare's global network. 
+ +## Key Features + +- **Git-based deploys**: Auto-deploy from GitHub/GitLab +- **Preview deployments**: Unique URL per branch/PR +- **Pages Functions**: File-based serverless routing (Workers runtime) +- **Static + dynamic**: Smart asset caching + edge compute +- **Smart Placement**: Automatic function optimization based on traffic patterns +- **Framework optimized**: SvelteKit, Astro, Nuxt, Qwik, Solid Start + +## Deployment Methods + +### 1. Git Integration (Production) +Dashboard → Workers & Pages → Create → Connect to Git → Configure build + +### 2. Direct Upload +```bash +npx wrangler pages deploy ./dist --project-name=my-project +npx wrangler pages deploy ./dist --project-name=my-project --branch=staging +``` + +### 3. C3 CLI +```bash +npm create cloudflare@latest my-app +# Select framework → auto-setup + deploy +``` + +## vs Workers + +- **Pages**: Static sites, JAMstack, frameworks, git workflow, file-based routing +- **Workers**: Pure APIs, complex routing, WebSockets, scheduled tasks, email handlers +- **Combine**: Pages Functions use Workers runtime, can bind to Workers + +## Quick Start + +```bash +# Create +npm create cloudflare@latest + +# Local dev +npx wrangler pages dev ./dist + +# Deploy +npx wrangler pages deploy ./dist --project-name=my-project + +# Types +npx wrangler types --path='./functions/types.d.ts' + +# Secrets +echo "value" | npx wrangler pages secret put KEY --project-name=my-project + +# Logs +npx wrangler pages deployment tail --project-name=my-project +``` + +## Resources + +- [Pages Docs](https://developers.cloudflare.com/pages/) +- [Functions API](https://developers.cloudflare.com/pages/functions/api-reference/) +- [Framework Guides](https://developers.cloudflare.com/pages/framework-guides/) +- [Discord #functions](https://discord.com/channels/595317990191398933/910978223968518144) + +## Reading Order + +**New to Pages?** Start here: +1. README.md (you are here) - Overview & quick start +2. 
[configuration.md](./configuration.md) - Project setup, wrangler.jsonc, bindings
+3. [api.md](./api.md) - Functions API, routing, context
+4. [patterns.md](./patterns.md) - Common implementations
+5. [gotchas.md](./gotchas.md) - Troubleshooting & pitfalls
+
+**Quick reference?** Jump to the relevant file above.
+
+## In This Reference
+
+- [configuration.md](./configuration.md) - wrangler.jsonc, build, env vars, Smart Placement
+- [api.md](./api.md) - Functions API, bindings, context, advanced mode
+- [patterns.md](./patterns.md) - Full-stack patterns, framework integration
+- [gotchas.md](./gotchas.md) - Build issues, limits, debugging, framework warnings
+
+## See Also
+
+- [pages-functions](../pages-functions/) - File-based routing, middleware
+- [d1](../d1/) - SQL database for Pages Functions
+- [kv](../kv/) - Key-value storage for caching/state
diff --git a/cloudflare/references/pages/api.md b/cloudflare/references/pages/api.md
new file mode 100644
index 0000000..a719585
--- /dev/null
+++ b/cloudflare/references/pages/api.md
@@ -0,0 +1,204 @@
+# Functions API
+
+## File-Based Routing
+
+```
+/functions/index.ts → example.com/
+/functions/api/users.ts → example.com/api/users
+/functions/api/users/[id].ts → example.com/api/users/:id
+/functions/api/users/[[path]].ts → example.com/api/users/* (catchall)
+/functions/_middleware.ts → Runs before all routes
+```
+
+**Rules**: `[param]` = single segment, `[[param]]` = multi-segment catchall, more specific wins.
+
+## Request Handlers
+
+```typescript
+import type { PagesFunction } from '@cloudflare/workers-types';
+
+interface Env {
+  DB: D1Database;
+  KV: KVNamespace;
+}
+
+// All methods
+export const onRequest: PagesFunction<Env> = async (context) => {
+  return new Response('All methods');
+};
+
+// Method-specific
+export const onRequestGet: PagesFunction<Env> = async (context) => {
+  const { request, env, params, data } = context;
+
+  const user = await env.DB.prepare(
+    'SELECT * FROM users WHERE id = ?' 
+  ).bind(params.id).first();
+
+  return Response.json(user);
+};
+
+export const onRequestPost: PagesFunction<Env> = async (context) => {
+  const body = await context.request.json();
+  return Response.json({ success: true });
+};
+
+// Also: onRequestPut, onRequestPatch, onRequestDelete, onRequestHead, onRequestOptions
+```
+
+## Context Object
+
+```typescript
+interface EventContext<Env, P extends string, Data> {
+  request: Request;      // HTTP request
+  env: Env;              // Bindings (KV, D1, R2, etc.)
+  params: Params<P>;     // Route parameters
+  data: Data;            // Middleware-shared data
+  waitUntil: (promise: Promise<any>) => void;  // Background tasks
+  next: () => Promise<Response>;               // Next handler
+  passThroughOnException: () => void;          // Error fallback (not in advanced mode)
+}
+```
+
+## Dynamic Routes
+
+```typescript
+// Single segment: functions/users/[id].ts
+export const onRequestGet: PagesFunction = async ({ params }) => {
+  // /users/123 → params.id = "123"
+  return Response.json({ userId: params.id });
+};
+
+// Multi-segment: functions/files/[[path]].ts
+export const onRequestGet: PagesFunction = async ({ params }) => {
+  // /files/docs/api/v1.md → params.path = ["docs", "api", "v1.md"]
+  const filePath = (params.path as string[]).join('/');
+  return new Response(filePath);
+};
+```
+
+## Middleware
+
+```typescript
+// functions/_middleware.ts
+// Single
+export const onRequest: PagesFunction = async (context) => {
+  const response = await context.next();
+  response.headers.set('X-Custom-Header', 'value');
+  return response;
+};
+
+// Chained (runs in order; replaces the single export above)
+const errorHandler: PagesFunction = async (context) => {
+  try {
+    return await context.next();
+  } catch (err) {
+    return new Response((err as Error).message, { status: 500 });
+  }
+};
+
+const auth: PagesFunction = async (context) => {
+  const token = context.request.headers.get('Authorization');
+  if (!token) return new Response('Unauthorized', { status: 401 });
+  context.data.userId = await verifyToken(token); // verifyToken: your JWT helper
+  return context.next();
+};
+
+export const onRequest = 
[errorHandler, auth];
+```
+
+**Scope**: `functions/_middleware.ts` → all; `functions/api/_middleware.ts` → `/api/*` only
+
+## Bindings Usage
+
+```typescript
+export const onRequestGet: PagesFunction = async ({ env }) => {
+  // KV
+  const cached = await env.KV.get('key', 'json');
+  await env.KV.put('key', JSON.stringify({data: 'value'}), {expirationTtl: 3600});
+
+  // D1
+  const result = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first();
+
+  // R2, Queue, AI - see respective reference docs
+
+  return Response.json({success: true});
+};
+```
+
+## Advanced Mode
+
+Full Workers API, bypasses file-based routing:
+
+```javascript
+// _worker.js (placed at the root of the build output directory, not in /functions)
+export default {
+  async fetch(request, env, ctx) {
+    const url = new URL(request.url);
+
+    // Custom routing
+    if (url.pathname.startsWith('/api/')) {
+      return new Response('API response');
+    }
+
+    // REQUIRED: Serve static assets
+    return env.ASSETS.fetch(request);
+  }
+};
+```
+
+**When to use**: WebSockets, complex routing, scheduled handlers, email handlers.
+
+## Smart Placement
+
+Automatically optimizes function execution location based on traffic patterns.
+
+**Configuration** (in wrangler.jsonc):
+```jsonc
+{
+  "placement": {
+    "mode": "smart" // Enables optimization (default: off)
+  }
+}
+```
+
+**How it works**: Analyzes traffic patterns over time and places functions closer to users or data sources (e.g., D1 databases). Requires no code changes.
+
+**Trade-offs**: Initial requests may see slightly higher latency during the learning period (hours to days). Performance improves as the system optimizes.
+
+**When to use**: Global apps with centralized databases or geographically concentrated traffic sources. 
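To see where requests are actually being served once Smart Placement is on, one low-effort check is to echo the colo code from `request.cf`. This is a sketch: `request.cf` is populated only on the Workers runtime (it is absent locally and in other environments, hence the `unknown` fallback), and the route path is illustrative.

```typescript
// Sketch: report the Cloudflare data center (colo) a function ran in.
// `request.cf` exists only on the Workers runtime; elsewhere it is undefined.
type CfRequest = Request & { cf?: { colo?: string } };

export async function onRequest(ctx: { request: CfRequest }): Promise<Response> {
  const colo = ctx.request.cf?.colo ?? 'unknown';
  return new Response(JSON.stringify({ colo }), {
    headers: { 'content-type': 'application/json' },
  });
}
```

Comparing the reported colo against your users' regions before and after enabling `placement.mode: "smart"` gives a rough picture of whether placement moved.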
+ +## getRequestContext (Framework SSR) + +Access bindings in framework code: + +```typescript +// SvelteKit +import type { RequestEvent } from '@sveltejs/kit'; +export async function load({ platform }: RequestEvent) { + const data = await platform.env.DB.prepare('SELECT * FROM users').all(); + return { users: data.results }; +} + +// Astro +const { DB } = Astro.locals.runtime.env; +const data = await DB.prepare('SELECT * FROM users').all(); + +// Solid Start (server function) +import { getRequestEvent } from 'solid-js/web'; +const event = getRequestEvent(); +const data = await event.locals.runtime.env.DB.prepare('SELECT * FROM users').all(); +``` + +**✅ Supported adapters** (2026): +- **SvelteKit**: `@sveltejs/adapter-cloudflare` +- **Astro**: Built-in Cloudflare adapter +- **Nuxt**: Set `nitro.preset: 'cloudflare-pages'` in `nuxt.config.ts` +- **Qwik**: Built-in Cloudflare adapter +- **Solid Start**: `@solidjs/start-cloudflare-pages` + +**❌ Deprecated/Unsupported**: +- **Next.js**: Official adapter (`@cloudflare/next-on-pages`) deprecated. Use Vercel or self-host on Workers. +- **Remix**: Official adapter (`@remix-run/cloudflare-pages`) deprecated. Migrate to supported frameworks. + +See [gotchas.md](./gotchas.md#framework-specific) for migration guidance. 
diff --git a/cloudflare/references/pages/configuration.md b/cloudflare/references/pages/configuration.md new file mode 100644 index 0000000..30ada89 --- /dev/null +++ b/cloudflare/references/pages/configuration.md @@ -0,0 +1,201 @@ +# Configuration + +## wrangler.jsonc + +```jsonc +{ + "name": "my-pages-project", + "pages_build_output_dir": "./dist", + "compatibility_date": "2026-01-01", // Use current date for new projects + "compatibility_flags": ["nodejs_compat"], + "placement": { + "mode": "smart" // Optional: Enable Smart Placement + }, + "kv_namespaces": [{"binding": "KV", "id": "abcd1234..."}], + "d1_databases": [{"binding": "DB", "database_id": "xxxx-xxxx", "database_name": "production-db"}], + "r2_buckets": [{"binding": "BUCKET", "bucket_name": "my-bucket"}], + "durable_objects": {"bindings": [{"name": "COUNTER", "class_name": "Counter", "script_name": "counter-worker"}]}, + "services": [{"binding": "API", "service": "api-worker"}], + "queues": {"producers": [{"binding": "QUEUE", "queue": "my-queue"}]}, + "vectorize": [{"binding": "VECTORIZE", "index_name": "my-index"}], + "ai": {"binding": "AI"}, + "analytics_engine_datasets": [{"binding": "ANALYTICS"}], + "vars": {"API_URL": "https://api.example.com", "ENVIRONMENT": "production"}, + "env": { + "preview": { + "vars": {"API_URL": "https://staging-api.example.com"}, + "kv_namespaces": [{"binding": "KV", "id": "preview-namespace-id"}] + } + } +} +``` + +## Build Config + +**Git deployment**: Dashboard → Project → Settings → Build settings +Set build command, output dir, env vars. Framework auto-detection configures automatically. 
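If auto-detection does not kick in (custom tooling, monorepos), a typical manual baseline for a Vite-style project looks like this; the values are illustrative, not defaults:

```
Build command:        npm run build
Build output dir:     dist
Root directory:       /
Environment variable: NODE_VERSION=20
```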
+ +## Environment Variables + +### Local (.dev.vars) +```bash +# .dev.vars (never commit) +SECRET_KEY="local-secret-key" +API_TOKEN="dev-token-123" +``` + +### Production +```bash +echo "secret-value" | npx wrangler pages secret put SECRET_KEY --project-name=my-project +npx wrangler pages secret list --project-name=my-project +npx wrangler pages secret delete SECRET_KEY --project-name=my-project +``` + +Access: `env.SECRET_KEY` + +## Static Config Files + +### _redirects +Place in build output (e.g., `dist/_redirects`): + +```txt +/old-page /new-page 301 # 301 redirect +/blog/* /news/:splat 301 # Splat wildcard +/users/:id /members/:id 301 # Placeholders +/api/* /api-v2/:splat 200 # Proxy (no redirect) +``` + +**Limits**: 2,100 total (2,000 static + 100 dynamic), 1,000 char/line +**Note**: Functions take precedence + +### _headers +```txt +/secure/* + X-Frame-Options: DENY + X-Content-Type-Options: nosniff + +/api/* + Access-Control-Allow-Origin: * + +/static/* + Cache-Control: public, max-age=31536000, immutable +``` + +**Limits**: 100 rules, 2,000 char/line +**Note**: Only static assets; Functions set headers in Response + +### _routes.json +Controls which requests invoke Functions (auto-generated for most frameworks): + +```json +{ + "version": 1, + "include": ["/*"], + "exclude": ["/build/*", "/static/*", "/assets/*", "/*.{ico,png,jpg,css,js}"] +} +``` + +**Purpose**: Functions are metered; static requests are free. `exclude` takes precedence. Max 100 rules, 100 char/rule. + +## TypeScript + +```bash +npx wrangler types --path='./functions/types.d.ts' +``` + +Point `types` in `functions/tsconfig.json` to generated file. + +## Smart Placement + +Automatically optimizes function execution location based on request patterns. 
+ +```jsonc +{ + "placement": { + "mode": "smart" // Enable optimization (default: off) + } +} +``` + +**How it works**: System analyzes traffic over hours/days and places function execution closer to: +- User clusters (e.g., regional traffic) +- Data sources (e.g., D1 database primary location) + +**Benefits**: +- Lower latency for read-heavy apps with centralized databases +- Better performance for apps with regional traffic patterns + +**Trade-offs**: +- Initial learning period: First requests may be slower while system optimizes +- Optimization time: Performance improves over 24-48 hours + +**When to enable**: Global apps with D1/Durable Objects in specific regions, or apps with concentrated geographic traffic. + +**When to skip**: Evenly distributed global traffic with no data locality constraints. + +## Remote Bindings (Local Dev) + +Connect local dev server to production bindings instead of local mocks: + +```bash +# All bindings remote +npx wrangler pages dev ./dist --remote + +# Specific bindings remote (others local) +npx wrangler pages dev ./dist --remote --kv=KV --d1=DB +``` + +**Use cases**: +- Test against production data (read-only operations) +- Debug binding-specific behavior +- Validate changes before deployment + +**⚠️ Warning**: +- Writes affect **real production data** +- Use only for read-heavy debugging or with non-production accounts +- Consider creating separate preview environments instead + +**Requirements**: Must be logged in (`npx wrangler login`) with access to bindings. 
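One way to enforce the warning above mechanically is a middleware guard that refuses mutating methods while remote bindings are in play. This is a sketch, not a wrangler feature: the `guardWrites` helper and the idea of gating on a dev-only env var (e.g. `BLOCK_WRITES=true` in `.dev.vars`) are assumptions.

```typescript
// Sketch: refuse mutating HTTP methods during remote-binding dev sessions.
// Gate it on an env var set only in .dev.vars (BLOCK_WRITES is an illustrative name).
const SAFE_METHODS = new Set(['GET', 'HEAD', 'OPTIONS']);

export function guardWrites(request: Request, blockWrites: boolean): Response | null {
  if (blockWrites && !SAFE_METHODS.has(request.method)) {
    return new Response('Writes disabled while using --remote', { status: 405 });
  }
  return null; // null = continue to ctx.next()
}
```

In a `functions/_middleware.ts`, call it first: `const blocked = guardWrites(ctx.request, ctx.env.BLOCK_WRITES === 'true'); if (blocked) return blocked; return ctx.next();`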
+ +## Local Dev + +```bash +# Basic +npx wrangler pages dev ./dist + +# With bindings +npx wrangler pages dev ./dist --kv KV --d1 DB=local-db-id + +# Remote bindings (production data) +npx wrangler pages dev ./dist --remote + +# Persistence +npx wrangler pages dev ./dist --persist-to=./.wrangler/state/v3 + +# Proxy mode (SSR frameworks) +npx wrangler pages dev -- npm run dev +``` + +## Limits (as of Jan 2026) + +| Resource | Free | Paid | +|----------|------|------| +| **Functions Requests** | 100k/day | Unlimited (metered) | +| **Function CPU Time** | 10ms/req | 30ms/req (Workers Paid) | +| **Function Memory** | 128MB | 128MB | +| **Script Size** | 1MB compressed | 10MB compressed | +| **Deployments** | 500/month | 5,000/month | +| **Files per Deploy** | 20,000 | 20,000 | +| **File Size** | 25MB | 25MB | +| **Build Time** | 20min | 20min | +| **Redirects** | 2,100 (2k static + 100 dynamic) | Same | +| **Header Rules** | 100 | 100 | +| **Route Rules** | 100 | 100 | +| **Subrequests** | 50/request | 1,000/request (Workers Paid) | + +**Notes**: +- Functions use Workers runtime; Workers Paid plan increases limits +- Free plan sufficient for most projects +- Static requests always free (not counted toward limits) + +[Full limits](https://developers.cloudflare.com/pages/platform/limits/) diff --git a/cloudflare/references/pages/gotchas.md b/cloudflare/references/pages/gotchas.md new file mode 100644 index 0000000..943c2d3 --- /dev/null +++ b/cloudflare/references/pages/gotchas.md @@ -0,0 +1,203 @@ +# Gotchas + +## Functions Not Running + +**Problem**: Function endpoints return 404 or don't execute +**Causes**: `_routes.json` excludes path; wrong file extension (`.jsx`/`.tsx`); Functions dir not at output root +**Solution**: Check `_routes.json`, rename to `.ts`/`.js`, verify build output structure + +## 404 on Static Assets + +**Problem**: Static files not serving +**Causes**: Build output dir misconfigured; Functions catching requests; Advanced mode missing 
`env.ASSETS.fetch()` +**Solution**: Verify output dir, add exclusions to `_routes.json`, call `env.ASSETS.fetch()` in `_worker.js` + +## Bindings Not Working + +**Problem**: `env.BINDING` undefined or errors +**Causes**: wrangler.jsonc syntax error; wrong binding IDs; missing `.dev.vars`; out-of-sync types +**Solution**: Validate config, verify IDs, create `.dev.vars`, run `npx wrangler types` + +## Build Failures + +**Problem**: Deployment fails during build +**Causes**: Wrong build command/output dir; Node version incompatibility; missing env vars; 20min timeout; OOM +**Solution**: Check Dashboard → Deployments → Build log; verify settings; add `.nvmrc`; optimize build + +## Middleware Not Running + +**Problem**: Middleware doesn't execute +**Causes**: Wrong filename (not `_middleware.ts`); missing `onRequest` export; didn't call `next()` +**Solution**: Rename file with underscore prefix; export handler; call `next()` or return Response + +## Headers/Redirects Not Working + +**Problem**: `_headers` or `_redirects` not applying +**Causes**: Only work for static assets; Functions override; syntax errors; exceeded limits +**Solution**: Set headers in Response object for Functions; verify syntax; check limits (100 headers, 2,100 redirects) + +## TypeScript Errors + +**Problem**: Type errors in Functions code +**Causes**: Types not generated; Env interface doesn't match wrangler.jsonc +**Solution**: Run `npx wrangler types --path='./functions/types.d.ts'`; update Env interface + +## Local Dev Issues + +**Problem**: Dev server errors or bindings don't work +**Causes**: Port conflict; bindings not passed; local vs HTTPS differences +**Solution**: Use `--port=3000`; pass bindings via CLI or wrangler.jsonc; account for HTTP/HTTPS differences + +## Performance Issues + +**Problem**: Slow responses or CPU limit errors +**Causes**: Functions invoked for static assets; cold starts; 10ms CPU limit; large bundle +**Solution**: Exclude static via `_routes.json`; optimize hot 
paths; keep bundle < 1MB + +## Framework-Specific + +### ⚠️ Deprecated Frameworks + +**Next.js**: Official adapter (`@cloudflare/next-on-pages`) **deprecated** and unmaintained. +- **Problem**: No updates since 2024; incompatible with Next.js 15+; missing App Router features +- **Cause**: Cloudflare discontinued official support; community fork exists but limited +- **Solutions**: + 1. **Recommended**: Use Vercel (official Next.js host) + 2. **Advanced**: Self-host on Workers using custom adapter (complex, unsupported) + 3. **Migration**: Switch to SvelteKit/Nuxt (similar DX, full Pages support) + +**Remix**: Official adapter (`@remix-run/cloudflare-pages`) **deprecated**. +- **Problem**: No maintenance from Remix team; compatibility issues with Remix v2+ +- **Cause**: Remix team deprecated all framework adapters +- **Solutions**: + 1. **Recommended**: Migrate to SvelteKit (similar file-based routing, better DX) + 2. **Alternative**: Use Astro (static-first with optional SSR) + 3. **Workaround**: Continue using deprecated adapter (no future support) + +### ✅ Supported Frameworks + +**SvelteKit**: +- Use `@sveltejs/adapter-cloudflare` +- Access bindings via `platform.env` in server load functions +- Set `platform: 'cloudflare'` in `svelte.config.js` + +**Astro**: +- Built-in Cloudflare adapter +- Access bindings via `Astro.locals.runtime.env` + +**Nuxt**: +- Set `nitro.preset: 'cloudflare-pages'` in `nuxt.config.ts` +- Access bindings via `event.context.cloudflare.env` + +**Qwik, Solid Start**: +- Built-in or official Cloudflare adapters available +- Check respective framework docs for binding access + +## Debugging + +```typescript +// Log request details +console.log('Request:', { method: request.method, url: request.url }); +console.log('Env:', Object.keys(env)); +console.log('Params:', params); +``` + +**View logs**: `npx wrangler pages deployment tail --project-name=my-project` + +## Smart Placement Issues + +### Increased Cold Start Latency + +**Problem**: 
First requests slower after enabling Smart Placement +**Cause**: Initial optimization period while system learns traffic patterns +**Solution**: Expected behavior during first 24-48 hours; monitor latency trends over time + +### Inconsistent Response Times + +**Problem**: Latency varies significantly across requests during initial deployment +**Cause**: Smart Placement testing different execution locations to find optimal placement +**Solution**: Normal during learning phase; stabilizes after traffic patterns emerge (1-2 days) + +### No Performance Improvement + +**Problem**: Smart Placement enabled but no latency reduction observed +**Cause**: Traffic evenly distributed globally, or no data locality constraints +**Solution**: Smart Placement most effective with centralized data (D1/DO) or regional traffic; disable if no benefit + +## Remote Bindings Issues + +### Accidentally Modified Production Data + +**Problem**: Local dev with `--remote` altered production database/KV +**Cause**: Remote bindings connect directly to production resources; writes are real +**Solution**: +- Use `--remote` only for read-heavy debugging +- Create separate preview environments for testing +- Never use `--remote` for write operations during development + +### Remote Binding Auth Errors + +**Problem**: `npx wrangler pages dev --remote` fails with "Unauthorized" or auth error +**Cause**: Not logged in, session expired, or insufficient account permissions +**Solution**: +1. Run `npx wrangler login` to re-authenticate +2. Verify account has access to project and bindings +3. 
Check binding IDs match production configuration + +### Slow Local Dev with Remote Bindings + +**Problem**: Local dev server slow when using `--remote` +**Cause**: Every request makes network calls to production bindings +**Solution**: Use local bindings for development; reserve `--remote` for final validation + +## Common Errors + +### "Module not found" +**Cause**: Dependencies not bundled or build output incorrect +**Solution**: Check build output directory, ensure dependencies bundled + +### "Binding not found" +**Cause**: Binding not configured or types out of sync +**Solution**: Verify wrangler.jsonc, run `npx wrangler types` + +### "Request exceeded CPU limit" +**Cause**: Code execution too slow or heavy compute +**Solution**: Optimize hot paths, upgrade to Workers Paid + +### "Script too large" +**Cause**: Bundle size exceeds limit +**Solution**: Tree-shake, use dynamic imports, code-split + +### "Too many subrequests" +**Cause**: Exceeded 50 subrequest limit +**Solution**: Batch or reduce fetch calls + +### "KV key not found" +**Cause**: Key doesn't exist or wrong namespace +**Solution**: Check namespace matches environment + +### "D1 error" +**Cause**: Wrong database_id or missing migrations +**Solution**: Verify config, run `wrangler d1 migrations list` + +## Limits Reference (Jan 2026) + +| Resource | Free | Paid | +|----------|------|------| +| Functions Requests | 100k/day | Unlimited | +| CPU Time | 10ms/req | 30ms/req | +| Memory | 128MB | 128MB | +| Script Size | 1MB | 10MB | +| Subrequests | 50/req | 1,000/req | +| Deployments | 500/month | 5,000/month | + +**Tip**: Hitting CPU limit? Optimize hot paths or upgrade to Workers Paid plan. + +[Full limits](https://developers.cloudflare.com/pages/platform/limits/) + +## Getting Help + +1. Check [Pages Docs](https://developers.cloudflare.com/pages/) +2. Search [Discord #functions](https://discord.com/channels/595317990191398933/910978223968518144) +3. 
Review [Workers Examples](https://developers.cloudflare.com/workers/examples/) +4. Check framework-specific docs/adapters diff --git a/cloudflare/references/pages/patterns.md b/cloudflare/references/pages/patterns.md new file mode 100644 index 0000000..883c4da --- /dev/null +++ b/cloudflare/references/pages/patterns.md @@ -0,0 +1,204 @@ +# Patterns + +## API Routes + +```typescript +// functions/api/todos/[id].ts +export const onRequestGet: PagesFunction = async ({ env, params }) => { + const todo = await env.DB.prepare('SELECT * FROM todos WHERE id = ?').bind(params.id).first(); + if (!todo) return new Response('Not found', { status: 404 }); + return Response.json(todo); +}; + +export const onRequestPut: PagesFunction = async ({ env, params, request }) => { + const body = await request.json(); + await env.DB.prepare('UPDATE todos SET title = ?, completed = ? WHERE id = ?') + .bind(body.title, body.completed, params.id).run(); + return Response.json({ success: true }); +}; +// Also: onRequestDelete, onRequestPost +``` + +## Auth Middleware + +```typescript +// functions/_middleware.ts +const auth: PagesFunction = async (context) => { + if (context.request.url.includes('/public/')) return context.next(); + const authHeader = context.request.headers.get('Authorization'); + if (!authHeader?.startsWith('Bearer ')) { + return new Response('Unauthorized', { status: 401 }); + } + + try { + const payload = await verifyJWT(authHeader.substring(7), context.env.JWT_SECRET); + context.data.user = payload; + return context.next(); + } catch (err) { + return new Response('Invalid token', { status: 401 }); + } +}; +export const onRequest = [auth]; +``` + +## CORS + +```typescript +// functions/api/_middleware.ts +const corsHeaders = { + 'Access-Control-Allow-Origin': '*', + 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE, OPTIONS', + 'Access-Control-Allow-Headers': 'Content-Type, Authorization' +}; + +export const onRequest: PagesFunction = async (context) => { + if 
(context.request.method === 'OPTIONS') { + return new Response(null, {headers: corsHeaders}); + } + const response = await context.next(); + Object.entries(corsHeaders).forEach(([k, v]) => response.headers.set(k, v)); + return response; +}; +``` + +## Form Handling + +```typescript +// functions/api/contact.ts +export const onRequestPost: PagesFunction = async ({ request, env }) => { + const formData = await request.formData(); + await env.QUEUE.send({name: formData.get('name'), email: formData.get('email')}); + return new Response('
<h1>Thanks!</h1>
', { headers: { 'Content-Type': 'text/html' } }); +}; +``` + +## Background Tasks + +```typescript +export const onRequestPost: PagesFunction = async ({ request, waitUntil }) => { + const data = await request.json(); + waitUntil(fetch('https://api.example.com/webhook', { + method: 'POST', body: JSON.stringify(data) + })); + return Response.json({ queued: true }); +}; +``` + +## Error Handling + +```typescript +// functions/_middleware.ts +const errorHandler: PagesFunction = async (context) => { + try { + return await context.next(); + } catch (error) { + console.error('Error:', error); + if (context.request.url.includes('/api/')) { + return Response.json({ error: error.message }, { status: 500 }); + } + return new Response(`

<h1>Error</h1>
<p>${error.message}</p>
`, { + status: 500, headers: { 'Content-Type': 'text/html' } + }); + } +}; +export const onRequest = [errorHandler]; +``` + +## Caching + +```typescript +// functions/api/data.ts +export const onRequestGet: PagesFunction = async ({ env, request }) => { + const cacheKey = `data:${new URL(request.url).pathname}`; + const cached = await env.KV.get(cacheKey, 'json'); + if (cached) return Response.json(cached, { headers: { 'X-Cache': 'HIT' } }); + + const data = await env.DB.prepare('SELECT * FROM data').first(); + await env.KV.put(cacheKey, JSON.stringify(data), {expirationTtl: 3600}); + return Response.json(data, {headers: {'X-Cache': 'MISS'}}); +}; +``` + +## Smart Placement for Database Apps + +Enable Smart Placement for apps with D1 or centralized data sources: + +```jsonc +// wrangler.jsonc +{ + "name": "global-app", + "placement": { + "mode": "smart" + }, + "d1_databases": [{ + "binding": "DB", + "database_id": "your-db-id" + }] +} +``` + +```typescript +// functions/api/data.ts +export const onRequestGet: PagesFunction = async ({ env }) => { + // Smart Placement optimizes execution location over time + // Balances user location vs database location + const data = await env.DB.prepare('SELECT * FROM products LIMIT 10').all(); + return Response.json(data); +}; +``` + +**Best for**: Read-heavy apps with D1/Durable Objects in specific regions. +**Not needed**: Apps without data locality constraints or with evenly distributed traffic. + +## Framework Integration + +**Supported** (2026): SvelteKit, Astro, Nuxt, Qwik, Solid Start + +```bash +npm create cloudflare@latest my-app -- --framework=svelte +``` + +### SvelteKit +```typescript +// src/routes/+page.server.ts +export const load = async ({ platform }) => { + const todos = await platform.env.DB.prepare('SELECT * FROM todos').all(); + return { todos: todos.results }; +}; +``` + +### Astro +```astro +--- +const { DB } = Astro.locals.runtime.env; +const todos = await DB.prepare('SELECT * FROM todos').all(); +--- +
<ul>
  {todos.results.map(t => <li>{t.title}</li>)}
</ul>
+``` + +### Nuxt +```typescript +// server/api/todos.get.ts +export default defineEventHandler(async (event) => { + const { DB } = event.context.cloudflare.env; + return await DB.prepare('SELECT * FROM todos').all(); +}); +``` + +**⚠️ Framework Status** (2026): +- ✅ **Supported**: SvelteKit, Astro, Nuxt, Qwik, Solid Start +- ❌ **Deprecated**: Next.js (`@cloudflare/next-on-pages`), Remix (`@remix-run/cloudflare-pages`) + +For deprecated frameworks, see [gotchas.md](./gotchas.md#framework-specific) for migration options. + +[Framework Guides](https://developers.cloudflare.com/pages/framework-guides/) + +## Monorepo + +Dashboard → Settings → Build → Root directory. Set to subproject (e.g., `apps/web`). + +## Best Practices + +**Performance**: Exclude static via `_routes.json`; cache with KV; keep bundle < 1MB +**Security**: Use secrets (not vars); validate inputs; rate limit with KV/DO +**Workflow**: Preview per branch; local dev with `wrangler pages dev`; instant rollbacks in Dashboard diff --git a/cloudflare/references/pipelines/README.md b/cloudflare/references/pipelines/README.md new file mode 100644 index 0000000..2724485 --- /dev/null +++ b/cloudflare/references/pipelines/README.md @@ -0,0 +1,105 @@ +# Cloudflare Pipelines + +ETL streaming platform for ingesting, transforming, and loading data into R2 with SQL transformations. 
+ 

## Overview

Pipelines provides:
- **Streams**: Durable event buffers (HTTP/Workers ingestion)
- **Pipelines**: SQL-based transformations
- **Sinks**: R2 destinations (Iceberg tables or Parquet/JSON files)

**Status**: Open beta (Workers Paid plan)
**Pricing**: No charge beyond standard R2 storage/operations

## Architecture

```
Data Sources → Streams → Pipelines (SQL) → Sinks → R2
     ↑            ↓            ↓
HTTP/Workers  Transform  Iceberg/Parquet
```

| Component | Purpose | Key Feature |
|-----------|---------|-------------|
| Streams | Event ingestion | Structured (validated) or unstructured |
| Pipelines | Transform with SQL | Immutable after creation |
| Sinks | Write to R2 | Exactly-once delivery |

## Quick Start

```bash
# Interactive setup (recommended)
npx wrangler pipelines setup
```

**Minimal Worker example:**
```typescript
interface Env {
  STREAM: Pipeline;
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const event = { user_id: "123", event_type: "purchase", amount: 29.99 };

    // Fire-and-forget pattern
    ctx.waitUntil(env.STREAM.send([event]));

    return new Response('OK');
  }
} satisfies ExportedHandler<Env>;
```

## Which Sink Type?

```
Need SQL queries on data?
  → R2 Data Catalog (Iceberg)
  ✅ ACID transactions, time-travel, schema evolution
  ❌ More setup complexity (namespace, table, catalog token)

Just file storage/archival?
  → R2 Storage (Parquet)
  ✅ Simple, direct file access
  ❌ No built-in SQL queries

Using external tools (Spark/Athena)? 
+ → R2 Storage (Parquet with partitioning) + ✅ Standard format, partition pruning for performance + ❌ Must manage schema compatibility yourself +``` + +## Common Use Cases + +- **Analytics pipelines**: Clickstream, telemetry, server logs +- **Data warehousing**: ETL into queryable Iceberg tables +- **Event processing**: Mobile/IoT with enrichment +- **Ecommerce analytics**: User events, purchases, views + +## Reading Order + +**New to Pipelines?** Start here: +1. [configuration.md](./configuration.md) - Setup streams, sinks, pipelines +2. [api.md](./api.md) - Send events, TypeScript types, SQL functions +3. [patterns.md](./patterns.md) - Best practices, integrations, complete example +4. [gotchas.md](./gotchas.md) - Critical warnings, troubleshooting + +**Task-based routing:** +- Setup pipeline → [configuration.md](./configuration.md) +- Send/query data → [api.md](./api.md) +- Implement pattern → [patterns.md](./patterns.md) +- Debug issue → [gotchas.md](./gotchas.md) + +## In This Reference + +- [configuration.md](./configuration.md) - wrangler.jsonc bindings, schema definition, sink options, CLI commands +- [api.md](./api.md) - Pipeline binding interface, send() method, HTTP ingest, SQL function reference +- [patterns.md](./patterns.md) - Fire-and-forget, schema validation with Zod, integrations, performance tuning +- [gotchas.md](./gotchas.md) - Silent validation failures, immutable pipelines, latency expectations, limits + +## See Also + +- [r2](../r2/) - R2 storage backend for sinks +- [queues](../queues/) - Compare with Queues for async processing +- [workers](../workers/) - Worker runtime for event ingestion diff --git a/cloudflare/references/pipelines/api.md b/cloudflare/references/pipelines/api.md new file mode 100644 index 0000000..ff302c7 --- /dev/null +++ b/cloudflare/references/pipelines/api.md @@ -0,0 +1,208 @@ +# Pipelines API Reference + +## Pipeline Binding Interface + +```typescript +// From @cloudflare/workers-types +interface Pipeline { + 
send(data: object | object[]): Promise<void>;
}

interface Env {
  STREAM: Pipeline;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // send() returns Promise<void> - no result data
    await env.STREAM.send([event]);
    return new Response('OK');
  }
} satisfies ExportedHandler<Env>;
```

**Key points:**
- `send()` accepts single object or array
- Always returns `Promise<void>` (no confirmation data)
- Throws on network/validation errors (wrap in try/catch)
- Use `ctx.waitUntil()` for fire-and-forget pattern

## Writing Events

### Single Event

```typescript
await env.STREAM.send([{
  user_id: "12345",
  event_type: "purchase",
  product_id: "widget-001",
  amount: 29.99
}]);
```

### Batch Events

```typescript
const events = [
  { user_id: "user1", event_type: "view" },
  { user_id: "user2", event_type: "purchase", amount: 50 }
];
await env.STREAM.send(events);
```

**Limits:**
- Max 1 MB per request
- 5 MB/s per stream

### Fire-and-Forget Pattern

```typescript
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const event = { /* ... 
*/ }; + + // Don't block response on send + ctx.waitUntil(env.STREAM.send([event])); + + return new Response('OK'); + } +}; +``` + +### Error Handling + +```typescript +try { + await env.STREAM.send([event]); +} catch (error) { + console.error('Pipeline send failed:', error); + // Log to another system, retry, or return error response + return new Response('Failed to track event', { status: 500 }); +} +``` + +## HTTP Ingest API + +### Endpoint Format + +``` +https://{stream-id}.ingest.cloudflare.com +``` + +Get `{stream-id}` from: `npx wrangler pipelines streams list` + +### Request Format + +**CRITICAL:** Must send array, not single object + +```bash +# ✅ Correct +curl -X POST https://{stream-id}.ingest.cloudflare.com \ + -H "Content-Type: application/json" \ + -d '[{"user_id": "123", "event_type": "purchase"}]' + +# ❌ Wrong - will fail +curl -X POST https://{stream-id}.ingest.cloudflare.com \ + -H "Content-Type: application/json" \ + -d '{"user_id": "123", "event_type": "purchase"}' +``` + +### Authentication + +```bash +curl -X POST https://{stream-id}.ingest.cloudflare.com \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer YOUR_API_TOKEN" \ + -d '[{"event": "data"}]' +``` + +**Required permission:** Workers Pipeline Send + +Create token: Dashboard → Workers → API tokens → Create with Pipeline Send permission + +### Response Codes + +| Code | Meaning | Action | +|------|---------|--------| +| 200 | Accepted | Success | +| 400 | Invalid format | Check JSON array, schema match | +| 401 | Auth failed | Verify token valid | +| 413 | Payload too large | Split into smaller batches (<1 MB) | +| 429 | Rate limited | Back off, retry with delay | +| 5xx | Server error | Retry with exponential backoff | + +## SQL Functions Quick Reference + +Available in `INSERT INTO sink SELECT ... 
FROM stream` transformations: + +| Function | Example | Use Case | +|----------|---------|----------| +| `UPPER(s)` | `UPPER(event_type)` | Normalize strings | +| `LOWER(s)` | `LOWER(email)` | Case-insensitive matching | +| `CONCAT(...)` | `CONCAT(user_id, '_', product_id)` | Generate composite keys | +| `CASE WHEN ... THEN ... END` | `CASE WHEN amount > 100 THEN 'high' ELSE 'low' END` | Conditional enrichment | +| `CAST(x AS type)` | `CAST(timestamp AS string)` | Type conversion | +| `COALESCE(x, y)` | `COALESCE(amount, 0.0)` | Default values | +| Math operators | `amount * 1.1`, `price / quantity` | Calculations | +| Comparison | `amount > 100`, `status IN ('active', 'pending')` | Filtering | + +**String types for CAST:** `string`, `int32`, `int64`, `float32`, `float64`, `bool`, `timestamp` + +Full reference: [Pipelines SQL Reference](https://developers.cloudflare.com/pipelines/sql-reference/) + +## SQL Transform Examples + +### Filter Events + +```sql +INSERT INTO my_sink +SELECT * FROM my_stream +WHERE event_type = 'purchase' AND amount > 100 +``` + +### Select Specific Fields + +```sql +INSERT INTO my_sink +SELECT user_id, event_type, timestamp, amount +FROM my_stream +``` + +### Transform and Enrich + +```sql +INSERT INTO my_sink +SELECT + user_id, + UPPER(event_type) as event_type, + timestamp, + amount * 1.1 as amount_with_tax, + CONCAT(user_id, '_', product_id) as unique_key, + CASE + WHEN amount > 1000 THEN 'high_value' + WHEN amount > 100 THEN 'medium_value' + ELSE 'low_value' + END as customer_tier +FROM my_stream +WHERE event_type IN ('purchase', 'refund') +``` + +## Querying Results (R2 Data Catalog) + +```bash +export WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_CATALOG_TOKEN + +npx wrangler r2 sql query "warehouse_name" " +SELECT + event_type, + COUNT(*) as event_count, + SUM(amount) as total_revenue +FROM default.my_table +WHERE event_type = 'purchase' + AND timestamp >= '2025-01-01' +GROUP BY event_type +ORDER BY total_revenue DESC +LIMIT 100" +``` + 
+ 
**Note:** Iceberg tables support standard SQL queries with GROUP BY, JOINs, WHERE, ORDER BY, etc.
diff --git a/cloudflare/references/pipelines/configuration.md b/cloudflare/references/pipelines/configuration.md
new file mode 100644
index 0000000..75e65f5
--- /dev/null
+++ b/cloudflare/references/pipelines/configuration.md
@@ -0,0 +1,98 @@
+# Pipelines Configuration

## Worker Binding

```jsonc
// wrangler.jsonc
{
  "pipelines": [
    { "pipeline": "<stream-id>", "binding": "STREAM" }
  ]
}
```

Get stream ID: `npx wrangler pipelines streams list`

## Schema (Structured Streams)

```json
{
  "fields": [
    { "name": "user_id", "type": "string", "required": true },
    { "name": "event_type", "type": "string", "required": true },
    { "name": "amount", "type": "float64", "required": false },
    { "name": "timestamp", "type": "timestamp", "required": true }
  ]
}
```

**Types:** `string`, `int32`, `int64`, `float32`, `float64`, `bool`, `timestamp`, `json`, `binary`, `list`, `struct`

## Stream Setup

```bash
# With schema
npx wrangler pipelines streams create my-stream --schema-file schema.json

# Unstructured (no validation)
npx wrangler pipelines streams create my-stream

# List/get/delete
npx wrangler pipelines streams list
npx wrangler pipelines streams get <stream-name>
npx wrangler pipelines streams delete <stream-name>
```

## Sink Configuration

**R2 Data Catalog (Iceberg):**
```bash
npx wrangler pipelines sinks create my-sink \
  --type r2-data-catalog \
  --bucket my-bucket --namespace default --table events \
  --catalog-token $TOKEN \
  --compression zstd --roll-interval 60
```

**R2 Raw (Parquet):**
```bash
npx wrangler pipelines sinks create my-sink \
  --type r2 --bucket my-bucket --format parquet \
  --path analytics/events \
  --partitioning "year=%Y/month=%m/day=%d" \
  --access-key-id $KEY --secret-access-key $SECRET
```

| Option | Values | Guidance |
|--------|--------|----------|
| `--compression` | `zstd`, `snappy`, `gzip` | `zstd` best 
ratio, `snappy` fastest | +| `--roll-interval` | Seconds | Low latency: 10-60, Query perf: 300 | +| `--roll-size` | MB | Larger = better compression | + +## Pipeline Creation + +```bash +npx wrangler pipelines create my-pipeline \ + --sql "INSERT INTO my_sink SELECT * FROM my_stream WHERE event_type = 'purchase'" +``` + +**⚠️ Pipelines are immutable** - cannot modify SQL. Must delete/recreate. + +## Credentials + +| Type | Permission | Get From | +|------|------------|----------| +| Catalog token | R2 Admin Read & Write | Dashboard → R2 → API tokens | +| R2 credentials | Object Read & Write | `wrangler r2 bucket create` output | +| HTTP ingest token | Workers Pipeline Send | Dashboard → Workers → API tokens | + +## Complete Example + +```bash +npx wrangler r2 bucket create my-bucket +npx wrangler r2 bucket catalog enable my-bucket +npx wrangler pipelines streams create my-stream --schema-file schema.json +npx wrangler pipelines sinks create my-sink --type r2-data-catalog --bucket my-bucket ... +npx wrangler pipelines create my-pipeline --sql "INSERT INTO my_sink SELECT * FROM my_stream" +npx wrangler deploy +``` diff --git a/cloudflare/references/pipelines/gotchas.md b/cloudflare/references/pipelines/gotchas.md new file mode 100644 index 0000000..2a2a75f --- /dev/null +++ b/cloudflare/references/pipelines/gotchas.md @@ -0,0 +1,80 @@ +# Pipelines Gotchas + +## Critical Issues + +### Events Silently Dropped + +**Most common issue.** Events accepted (HTTP 200) but never appear in sink. + +**Causes:** +1. Schema validation fails - structured streams drop invalid events silently +2. 
Waiting for roll interval (10-300s) - expected behavior + +**Solution:** Validate client-side with Zod: +```typescript +const EventSchema = z.object({ user_id: z.string(), amount: z.number() }); +try { + const validated = EventSchema.parse(rawEvent); + await env.STREAM.send([validated]); +} catch (e) { /* get immediate feedback */ } +``` + +### Pipelines Are Immutable + +Cannot modify SQL after creation. Must delete and recreate. + +```bash +npx wrangler pipelines delete old-pipeline +npx wrangler pipelines create new-pipeline --sql "..." +``` + +**Tip:** Use version naming (`events-pipeline-v1`) and keep SQL in version control. + +### Worker Binding Not Found + +**`env.STREAM is undefined`** + +1. Use **stream ID** (not pipeline ID) in `wrangler.jsonc` +2. Redeploy after adding binding + +```bash +npx wrangler pipelines streams list # Get stream ID +npx wrangler deploy +``` + +## Common Errors + +| Error | Cause | Fix | +|-------|-------|-----| +| Events not in R2 | Roll interval not elapsed | Wait 10-300s, check `roll_interval` | +| Schema validation failures | Type mismatch, missing fields | Validate client-side | +| Rate limit (429) | >5 MB/s per stream | Batch events, request increase | +| Payload too large (413) | >1 MB request | Split into smaller batches | +| Cannot delete stream | Pipeline references it | Delete pipelines first | +| Sink credential errors | Token expired | Recreate sink with new credentials | + +## Limits (Open Beta) + +| Resource | Limit | +|----------|-------| +| Streams/Sinks/Pipelines per account | 20 each | +| Payload size | 1 MB | +| Ingest rate per stream | 5 MB/s | +| Event retention | 24 hours | +| Recommended batch size | 100 events | + +## SQL Limitations + +- **No JOINs** - single stream per pipeline +- **No window functions** - basic SQL only +- **No subqueries** - must use `INSERT INTO ... SELECT ... 
FROM`
- **No schema evolution** - cannot modify after creation

## Debug Checklist

- [ ] Stream exists: `npx wrangler pipelines streams list`
- [ ] Pipeline healthy: `npx wrangler pipelines get <pipeline-name>`
- [ ] SQL syntax matches schema
- [ ] Worker redeployed after binding added
- [ ] Waited for roll interval
- [ ] Accepted vs processed count matches (no validation drops)
diff --git a/cloudflare/references/pipelines/patterns.md b/cloudflare/references/pipelines/patterns.md
new file mode 100644
index 0000000..186b6a2
--- /dev/null
+++ b/cloudflare/references/pipelines/patterns.md
@@ -0,0 +1,87 @@
+# Pipelines Patterns

## Fire-and-Forget

```typescript
export default {
  async fetch(request, env, ctx) {
    const event = { user_id: '...', event_type: 'page_view', timestamp: new Date().toISOString() };
    ctx.waitUntil(env.STREAM.send([event])); // Don't block response
    return new Response('OK');
  }
};
```

## Schema Validation with Zod

```typescript
import { z } from 'zod';

const EventSchema = z.object({
  user_id: z.string(),
  event_type: z.enum(['purchase', 'view']),
  amount: z.number().positive().optional()
});

const validated = EventSchema.parse(rawEvent); // Throws on invalid
await env.STREAM.send([validated]);
```

**Why:** Structured streams drop invalid events silently. Client validation gives immediate feedback. 
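If pulling in a dependency isn't an option, the same guard can be hand-rolled: split each batch into sendable and rejected events, so invalid ones surface in your logs instead of vanishing in the stream. A sketch against the example schema above — `PipelineEvent`, `validateEvent`, and `partitionBatch` are illustrative names, not Pipelines APIs:

```typescript
// Mirrors the Zod schema above: required user_id, event_type enum, optional positive amount
interface PipelineEvent {
  user_id: string;
  event_type: 'purchase' | 'view';
  amount?: number;
}

function validateEvent(raw: unknown): PipelineEvent | null {
  if (typeof raw !== 'object' || raw === null) return null;
  const e = raw as Record<string, unknown>;
  if (typeof e.user_id !== 'string') return null;
  if (e.event_type !== 'purchase' && e.event_type !== 'view') return null;
  if (e.amount !== undefined && (typeof e.amount !== 'number' || e.amount <= 0)) return null;
  return raw as PipelineEvent;
}

// Split a batch: send the valid events, keep the rejects for logging
function partitionBatch(batch: unknown[]): { valid: PipelineEvent[]; invalid: unknown[] } {
  const valid: PipelineEvent[] = [];
  const invalid: unknown[] = [];
  for (const raw of batch) {
    const event = validateEvent(raw);
    if (event) valid.push(event);
    else invalid.push(raw);
  }
  return { valid, invalid };
}
```

Then `const { valid, invalid } = partitionBatch(events);` — log `invalid.length` and send only `valid`, so the accepted-vs-processed counts in the debug checklist line up.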
+ +## SQL Transform Patterns + +```sql +-- Filter early (reduce storage) +INSERT INTO my_sink +SELECT user_id, event_type, amount +FROM my_stream +WHERE event_type = 'purchase' AND amount > 10 + +-- Select only needed fields +INSERT INTO my_sink +SELECT user_id, event_type, timestamp FROM my_stream + +-- Enrich with CASE +INSERT INTO my_sink +SELECT user_id, amount, + CASE WHEN amount > 1000 THEN 'vip' ELSE 'standard' END as tier +FROM my_stream +``` + +## Pipelines + Queues Fan-out + +```typescript +await Promise.all([ + env.ANALYTICS_STREAM.send([event]), // Long-term storage + env.PROCESS_QUEUE.send(event) // Immediate processing +]); +``` + +| Need | Use | +|------|-----| +| Long-term storage, SQL queries | Pipelines | +| Immediate processing, retries | Queues | +| Both | Fan-out pattern | + +## Performance Tuning + +| Goal | Config | +|------|--------| +| Low latency | `--roll-interval 10` | +| Query performance | `--roll-interval 300 --roll-size 100` | +| Cost optimal | `--compression zstd --roll-interval 300` | + +## Schema Evolution + +Pipelines are immutable. Use versioning: + +```bash +# Create v2 stream/sink/pipeline +npx wrangler pipelines streams create events-v2 --schema-file v2.json + +# Dual-write during transition +await Promise.all([env.EVENTS_V1.send([event]), env.EVENTS_V2.send([event])]); + +# Query across versions with UNION ALL +``` diff --git a/cloudflare/references/pulumi/README.md b/cloudflare/references/pulumi/README.md new file mode 100644 index 0000000..e78d807 --- /dev/null +++ b/cloudflare/references/pulumi/README.md @@ -0,0 +1,100 @@ +# Cloudflare Pulumi Provider + +Expert guidance for Cloudflare Pulumi Provider (@pulumi/cloudflare). + +## Overview + +Programmatic management of Cloudflare resources: Workers, Pages, D1, KV, R2, DNS, Queues, etc. 
+

**Packages:**
- TypeScript/JS: `@pulumi/cloudflare`
- Python: `pulumi-cloudflare`
- Go: `github.com/pulumi/pulumi-cloudflare/sdk/v6/go/cloudflare`
- .NET: `Pulumi.Cloudflare`

**Version:** v6.x

## Core Principles

1. Use API tokens (not legacy API keys)
2. Store accountId in stack config
3. Match binding names across code/config
4. Use `module: true` for ES modules
5. Set `compatibilityDate` to lock behavior

## Authentication

```typescript
import * as cloudflare from "@pulumi/cloudflare";

// API Token (recommended): CLOUDFLARE_API_TOKEN env
const provider = new cloudflare.Provider("cf", { apiToken: process.env.CLOUDFLARE_API_TOKEN });

// API Key (legacy): CLOUDFLARE_API_KEY + CLOUDFLARE_EMAIL env
// const provider = new cloudflare.Provider("cf", { apiKey: process.env.CLOUDFLARE_API_KEY, email: process.env.CLOUDFLARE_EMAIL });

// API User Service Key: CLOUDFLARE_API_USER_SERVICE_KEY env
// const provider = new cloudflare.Provider("cf", { apiUserServiceKey: process.env.CLOUDFLARE_API_USER_SERVICE_KEY });
```

## Setup

**Pulumi.yaml:**
```yaml
name: my-cloudflare-app
runtime: nodejs
config:
  cloudflare:apiToken:
    value: ${CLOUDFLARE_API_TOKEN}
```

**Pulumi.<stack>.yaml:**
```yaml
config:
  cloudflare:accountId: "abc123..." 
+``` + +**index.ts:** +```typescript +import * as pulumi from "@pulumi/pulumi"; +import * as cloudflare from "@pulumi/cloudflare"; +const accountId = new pulumi.Config("cloudflare").require("accountId"); +``` + +## Common Resource Types +- `Provider` - Provider config +- `WorkerScript` - Worker +- `WorkersKvNamespace` - KV +- `R2Bucket` - R2 +- `D1Database` - D1 +- `Queue` - Queue +- `PagesProject` - Pages +- `DnsRecord` - DNS +- `WorkerRoute` - Worker route +- `WorkersDomain` - Custom domain + +## Key Properties +- `accountId` - Required for most resources +- `zoneId` - Required for DNS/domain +- `name`/`title` - Resource identifier +- `*Bindings` - Connect resources to Workers + +## Reading Order + +| Order | File | What | When to Read | +|-------|------|------|--------------| +| 1 | [configuration.md](./configuration.md) | Resource config for Workers/KV/D1/R2/Queues/Pages | First time setup, resource reference | +| 2 | [patterns.md](./patterns.md) | Architecture patterns, multi-env, component resources | Building complex apps, best practices | +| 3 | [api.md](./api.md) | Outputs, dependencies, imports, dynamic providers | Advanced features, integrations | +| 4 | [gotchas.md](./gotchas.md) | Common errors, troubleshooting, limits | Debugging, deployment issues | + +## In This Reference +- [configuration.md](./configuration.md) - Provider config, stack setup, Workers/bindings +- [api.md](./api.md) - Resource types, Workers script, KV/D1/R2/queues/Pages +- [patterns.md](./patterns.md) - Multi-env, secrets, CI/CD, stack management +- [gotchas.md](./gotchas.md) - State issues, deployment failures, limits + +## See Also +- [terraform](../terraform/) - Alternative IaC for Cloudflare +- [wrangler](../wrangler/) - CLI deployment alternative +- [workers](../workers/) - Worker runtime documentation diff --git a/cloudflare/references/pulumi/api.md b/cloudflare/references/pulumi/api.md new file mode 100644 index 0000000..332cfef --- /dev/null +++ 
b/cloudflare/references/pulumi/api.md
@@ -0,0 +1,200 @@
+# API & Data Sources

## Outputs and Exports

Export resource identifiers:

```typescript
export const kvId = kv.id;
export const bucketName = bucket.name;
export const workerUrl = worker.subdomain;
export const dbId = db.id;
```

## Resource Dependencies

Implicit dependencies via outputs:

```typescript
const kv = new cloudflare.WorkersKvNamespace("kv", {
  accountId: accountId,
  title: "my-kv",
});

// Worker depends on KV (implicit via kv.id)
const worker = new cloudflare.WorkerScript("worker", {
  accountId: accountId,
  name: "my-worker",
  content: code,
  kvNamespaceBindings: [{name: "MY_KV", namespaceId: kv.id}], // Creates dependency
});
```

Explicit dependencies:

```typescript
const migration = new command.local.Command("migration", {
  create: pulumi.interpolate`wrangler d1 execute ${db.name} --file ./schema.sql`,
}, {dependsOn: [db]});

const worker = new cloudflare.WorkerScript("worker", {
  accountId: accountId,
  name: "worker",
  content: code,
  d1DatabaseBindings: [{name: "DB", databaseId: db.id}],
}, {dependsOn: [migration]}); // Ensure migrations run first
```

## Using Outputs with API Calls

```typescript
const db = new cloudflare.D1Database("db", {accountId, name: "my-db"});

db.id.apply(async (dbId) => {
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/d1/database/${dbId}/query`,
    {method: "POST", headers: {"Authorization": `Bearer ${apiToken}`, "Content-Type": "application/json"},
    body: JSON.stringify({sql: "CREATE TABLE users (id INT)"})}
  );
  return response.json();
});
```

## Custom Dynamic Providers

For resources not in the provider:

```typescript
import * as pulumi from "@pulumi/pulumi";

class D1MigrationProvider implements pulumi.dynamic.ResourceProvider {
  async create(inputs: any): Promise<pulumi.dynamic.CreateResult> {
    const response = await fetch(
 
`https://api.cloudflare.com/client/v4/accounts/${inputs.accountId}/d1/database/${inputs.databaseId}/query`,
      {method: "POST", headers: {"Authorization": `Bearer ${inputs.apiToken}`, "Content-Type": "application/json"},
      body: JSON.stringify({sql: inputs.sql})}
    );
    return {id: `${inputs.databaseId}-${Date.now()}`, outs: await response.json()};
  }
  async update(id: string, olds: any, news: any): Promise<pulumi.dynamic.UpdateResult> {
    if (olds.sql !== news.sql) await this.create(news);
    return {};
  }
  async delete(id: string, props: any): Promise<void> {}
}

class D1Migration extends pulumi.dynamic.Resource {
  constructor(name: string, args: any, opts?: pulumi.CustomResourceOptions) {
    super(new D1MigrationProvider(), name, args, opts);
  }
}

const migration = new D1Migration("migration", {
  accountId, databaseId: db.id, apiToken, sql: "CREATE TABLE users (id INT)",
}, {dependsOn: [db]});
```

## Data Sources

**Get Zone:**
```typescript
const zone = cloudflare.getZone({name: "example.com"});
const zoneId = zone.then(z => z.id);
```

**Get Accounts (via API):**
Use Cloudflare API directly or custom dynamic resources. 
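Since there is no first-class accounts data source, a thin helper over the REST API fills the gap. A sketch, assuming the standard v4 response envelope (a `result` array of `{id, name}` objects); `listAccounts` is an illustrative name, not a provider export:

```typescript
// GET /client/v4/accounts lists the accounts the token can see;
// feed the chosen id into stack config (cloudflare:accountId).
async function listAccounts(apiToken: string): Promise<Array<{ id: string; name: string }>> {
  const res = await fetch("https://api.cloudflare.com/client/v4/accounts", {
    headers: { Authorization: `Bearer ${apiToken}` },
  });
  if (!res.ok) throw new Error(`Cloudflare API error: ${res.status}`);
  const body = (await res.json()) as { result: Array<{ id: string; name: string }> };
  return body.result;
}
```

Requires a token with account read permission; pair it with `pulumi config set cloudflare:accountId <id>` once you have picked the account.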
+
+## Import Existing Resources
+
+```bash
+# Import worker
+pulumi import cloudflare:index/workerScript:WorkerScript my-worker <account_id>/<script_name>
+
+# Import KV namespace
+pulumi import cloudflare:index/workersKvNamespace:WorkersKvNamespace my-kv <account_id>/<namespace_id>
+
+# Import R2 bucket
+pulumi import cloudflare:index/r2Bucket:R2Bucket my-bucket <account_id>/<bucket_name>
+
+# Import D1 database
+pulumi import cloudflare:index/d1Database:D1Database my-db <account_id>/<database_id>
+
+# Import DNS record
+pulumi import cloudflare:index/dnsRecord:DnsRecord my-record <zone_id>/<record_id>
+```
+
+## Secrets Management
+
+```typescript
+import * as pulumi from "@pulumi/pulumi";
+
+const config = new pulumi.Config();
+const apiKey = config.requireSecret("apiKey"); // Encrypted in state
+
+const worker = new cloudflare.WorkerScript("worker", {
+  accountId: accountId,
+  name: "my-worker",
+  content: code,
+  secretTextBindings: [{name: "API_KEY", text: apiKey}],
+});
+```
+
+Store secrets:
+```bash
+pulumi config set --secret apiKey "secret-value"
+```
+
+## Transform Pattern
+
+Modify resource args before creation:
+
+```typescript
+import * as pulumi from "@pulumi/pulumi";
+
+type Transform<T> = (args: T) => T;
+
+interface BucketArgs {
+  accountId: pulumi.Input<string>;
+  transform?: {bucket?: Transform<cloudflare.R2BucketArgs>};
+}
+
+function createBucket(name: string, args: BucketArgs) {
+  const bucketArgs: cloudflare.R2BucketArgs = {
+    accountId: args.accountId,
+    name: name,
+    location: "auto",
+  };
+  const finalArgs = args.transform?.bucket?.(bucketArgs) ??
bucketArgs; + return new cloudflare.R2Bucket(name, finalArgs); +} +``` + +## v6.x Worker Versioning Resources + +**Worker** - Container for versions: +```typescript +const worker = new cloudflare.Worker("api", {accountId, name: "api-worker"}); +export const workerId = worker.id; +``` + +**WorkerVersion** - Immutable code + config: +```typescript +const version = new cloudflare.WorkerVersion("v1", { + accountId, workerId: worker.id, + content: fs.readFileSync("./dist/worker.js", "utf8"), + compatibilityDate: "2025-01-01", +}); +export const versionId = version.id; +``` + +**WorkersDeployment** - Active deployment with bindings: +```typescript +const deployment = new cloudflare.WorkersDeployment("prod", { + accountId, workerId: worker.id, versionId: version.id, + kvNamespaceBindings: [{name: "MY_KV", namespaceId: kv.id}], +}); +``` + +**Use:** Advanced deployments (canary, blue-green). Most apps should use `WorkerScript` (auto-versioning). + +--- +See: [README.md](./README.md), [configuration.md](./configuration.md), [patterns.md](./patterns.md), [gotchas.md](./gotchas.md) diff --git a/cloudflare/references/pulumi/configuration.md b/cloudflare/references/pulumi/configuration.md new file mode 100644 index 0000000..449419d --- /dev/null +++ b/cloudflare/references/pulumi/configuration.md @@ -0,0 +1,198 @@ +# Resource Configuration + +## Workers (cloudflare.WorkerScript) + +```typescript +import * as cloudflare from "@pulumi/cloudflare"; +import * as fs from "fs"; + +const worker = new cloudflare.WorkerScript("my-worker", { + accountId: accountId, + name: "my-worker", + content: fs.readFileSync("./dist/worker.js", "utf8"), + module: true, // ES modules + compatibilityDate: "2025-01-01", + compatibilityFlags: ["nodejs_compat"], + + // v6.x: Observability + logpush: true, // Enable Workers Logpush + tailConsumers: [{service: "log-consumer"}], // Stream logs to Worker + + // v6.x: Placement + placement: {mode: "smart"}, // Smart placement for latency optimization + + // 
Bindings + kvNamespaceBindings: [{name: "MY_KV", namespaceId: kv.id}], + r2BucketBindings: [{name: "MY_BUCKET", bucketName: bucket.name}], + d1DatabaseBindings: [{name: "DB", databaseId: db.id}], + queueBindings: [{name: "MY_QUEUE", queue: queue.id}], + serviceBindings: [{name: "OTHER_SERVICE", service: other.name}], + plainTextBindings: [{name: "ENV_VAR", text: "value"}], + secretTextBindings: [{name: "API_KEY", text: secret}], + + // v6.x: Advanced bindings + analyticsEngineBindings: [{name: "ANALYTICS", dataset: "my-dataset"}], + browserBinding: {name: "BROWSER"}, // Browser Rendering + aiBinding: {name: "AI"}, // Workers AI + hyperdriveBindings: [{name: "HYPERDRIVE", id: hyperdriveConfig.id}], +}); +``` + +## Workers KV (cloudflare.WorkersKvNamespace) + +```typescript +const kv = new cloudflare.WorkersKvNamespace("my-kv", { + accountId: accountId, + title: "my-kv-namespace", +}); + +// Write values +const kvValue = new cloudflare.WorkersKvValue("config", { + accountId: accountId, + namespaceId: kv.id, + key: "config", + value: JSON.stringify({foo: "bar"}), +}); +``` + +## R2 Buckets (cloudflare.R2Bucket) + +```typescript +const bucket = new cloudflare.R2Bucket("my-bucket", { + accountId: accountId, + name: "my-bucket", + location: "auto", // or "wnam", etc. 
+}); +``` + +## D1 Databases (cloudflare.D1Database) + +```typescript +const db = new cloudflare.D1Database("my-db", {accountId, name: "my-database"}); + +// Migrations via wrangler +import * as command from "@pulumi/command"; +const migration = new command.local.Command("d1-migration", { + create: pulumi.interpolate`wrangler d1 execute ${db.name} --file ./schema.sql`, +}, {dependsOn: [db]}); +``` + +## Queues (cloudflare.Queue) + +```typescript +const queue = new cloudflare.Queue("my-queue", {accountId, name: "my-queue"}); + +// Producer +const producer = new cloudflare.WorkerScript("producer", { + accountId, name: "producer", content: code, + queueBindings: [{name: "MY_QUEUE", queue: queue.id}], +}); + +// Consumer +const consumer = new cloudflare.WorkerScript("consumer", { + accountId, name: "consumer", content: code, + queueConsumers: [{queue: queue.name, maxBatchSize: 10, maxRetries: 3}], +}); +``` + +## Pages Projects (cloudflare.PagesProject) + +```typescript +const pages = new cloudflare.PagesProject("my-site", { + accountId, name: "my-site", productionBranch: "main", + buildConfig: {buildCommand: "npm run build", destinationDir: "dist"}, + source: { + type: "github", + config: {owner: "my-org", repoName: "my-repo", productionBranch: "main"}, + }, + deploymentConfigs: { + production: { + environmentVariables: {NODE_VERSION: "18"}, + kvNamespaces: {MY_KV: kv.id}, + d1Databases: {DB: db.id}, + }, + }, +}); +``` + +## DNS Records (cloudflare.DnsRecord) + +```typescript +const zone = cloudflare.getZone({name: "example.com"}); +const record = new cloudflare.DnsRecord("www", { + zoneId: zone.then(z => z.id), name: "www", type: "A", + content: "192.0.2.1", ttl: 3600, proxied: true, +}); +``` + +## Workers Domains/Routes + +```typescript +// Route (pattern-based) +const route = new cloudflare.WorkerRoute("my-route", { + zoneId: zoneId, + pattern: "example.com/api/*", + scriptName: worker.name, +}); + +// Domain (dedicated subdomain) +const domain = new 
cloudflare.WorkersDomain("my-domain", { + accountId: accountId, + hostname: "api.example.com", + service: worker.name, + zoneId: zoneId, +}); +``` + +## Assets Configuration (v6.x) + +Serve static assets from Workers: + +```typescript +const worker = new cloudflare.WorkerScript("app", { + accountId: accountId, + name: "my-app", + content: code, + assets: { + path: "./public", // Local directory + // Assets uploaded and served from Workers + }, +}); +``` + +## v6.x Versioned Deployments (Advanced) + +For gradual rollouts, use 3-resource pattern: + +```typescript +// 1. Worker (container for versions) +const worker = new cloudflare.Worker("api", { + accountId: accountId, + name: "api-worker", +}); + +// 2. Version (immutable code + config) +const version = new cloudflare.WorkerVersion("v1", { + accountId: accountId, + workerId: worker.id, + content: fs.readFileSync("./dist/worker.js", "utf8"), + compatibilityDate: "2025-01-01", + compatibilityFlags: ["nodejs_compat"], + // Note: Bindings configured at deployment level +}); + +// 3. 
Deployment (version + bindings + traffic split) +const deployment = new cloudflare.WorkersDeployment("prod", { + accountId: accountId, + workerId: worker.id, + versionId: version.id, + // Bindings applied to deployment + kvNamespaceBindings: [{name: "MY_KV", namespaceId: kv.id}], +}); +``` + +**When to use:** Blue-green deployments, canary releases, gradual rollouts +**When NOT to use:** Simple single-version deployments (use WorkerScript) + +--- +See: [README.md](./README.md), [api.md](./api.md), [patterns.md](./patterns.md), [gotchas.md](./gotchas.md) diff --git a/cloudflare/references/pulumi/gotchas.md b/cloudflare/references/pulumi/gotchas.md new file mode 100644 index 0000000..f01592a --- /dev/null +++ b/cloudflare/references/pulumi/gotchas.md @@ -0,0 +1,181 @@ +# Troubleshooting & Best Practices + +## Common Errors + +### "No bundler/build step" - Pulumi uploads raw code + +**Problem:** Worker fails with "Cannot use import statement outside a module" +**Cause:** Pulumi doesn't bundle Worker code - uploads exactly what you provide +**Solution:** Build Worker BEFORE Pulumi deploy + +```typescript +// WRONG: Pulumi won't bundle this +const worker = new cloudflare.WorkerScript("worker", { + content: fs.readFileSync("./src/index.ts", "utf8"), // Raw TS file +}); + +// RIGHT: Build first, then deploy +import * as command from "@pulumi/command"; +const build = new command.local.Command("build", { + create: "npm run build", + dir: "./worker", +}); +const worker = new cloudflare.WorkerScript("worker", { + content: build.stdout.apply(() => fs.readFileSync("./worker/dist/index.js", "utf8")), +}, {dependsOn: [build]}); +``` + +### "wrangler.toml not consumed" - Config drift + +**Problem:** Local wrangler dev works, Pulumi deploy fails +**Cause:** Pulumi ignores wrangler.toml - must duplicate config +**Solution:** Generate wrangler.toml from Pulumi or keep synced manually + +```typescript +// Pattern: Export Pulumi config to wrangler.toml +const workerConfig = { + name: 
"my-worker",
+  compatibilityDate: "2025-01-01",
+  compatibilityFlags: ["nodejs_compat"],
+};
+
+new command.local.Command("generate-wrangler", {
+  create: pulumi.interpolate`cat > wrangler.toml <<EOF
+name = "${workerConfig.name}"
+compatibility_date = "${workerConfig.compatibilityDate}"
+compatibility_flags = ["nodejs_compat"]
+EOF`,
+});
+```
+
+Set the account ID once in stack config (`Pulumi.<stack>.yaml`):
+
+```yaml
+config:
+  cloudflare:accountId: "abc123..."
+```
+
+### "Binding name mismatch"
+
+**Problem:** Worker fails with "env.MY_KV is undefined"
+**Cause:** Binding name in Pulumi != name in Worker code
+**Solution:** Match exactly (case-sensitive)
+
+```typescript
+// Pulumi
+kvNamespaceBindings: [{name: "MY_KV", namespaceId: kv.id}]
+
+// Worker code
+export default { async fetch(request, env) { await env.MY_KV.get("key"); }}
+```
+
+### "API token permissions insufficient"
+
+**Problem:** `Error: authentication error (10000)`
+**Cause:** Token lacks required permissions
+**Solution:** Grant token permissions: Account.Workers Scripts:Edit, Account.Account Settings:Read
+
+### "Resource not found after import"
+
+**Problem:** Imported resource shows as changed on next `pulumi up`
+**Cause:** State mismatch between actual resource and Pulumi config
+**Solution:** Check property names/types match exactly
+
+```bash
+pulumi import cloudflare:index/workerScript:WorkerScript my-worker <account_id>/<script_name>
+pulumi preview # If it shows changes, adjust Pulumi code to match the actual resource
+```
+
+### "v6.x Worker versioning confusion"
+
+**Problem:** Worker deployed but not receiving traffic
+**Cause:** v6.x requires Worker + WorkerVersion + WorkersDeployment (3 resources)
+**Solution:** Use WorkerScript (auto-versioning) OR the full versioning pattern
+
+```typescript
+// SIMPLE: WorkerScript auto-versions (default behavior)
+const worker = new cloudflare.WorkerScript("worker", {
+  accountId, name: "my-worker", content: code,
+});
+
+// ADVANCED: Manual versioning for gradual rollouts (v6.x)
+const worker = new cloudflare.Worker("worker", {accountId, name: "my-worker"});
+const version = new cloudflare.WorkerVersion("v1", {
+  accountId, workerId: worker.id, content: code, compatibilityDate:
"2025-01-01", +}); +const deployment = new cloudflare.WorkersDeployment("prod", { + accountId, workerId: worker.id, versionId: version.id, +}); +``` + +## Best Practices + +1. **Always set compatibilityDate** - Locks Worker behavior, prevents breaking changes +2. **Build before deploy** - Pulumi doesn't bundle; use Command resource or CI build step +3. **Match binding names** - Case-sensitive, must match between Pulumi and Worker code +4. **Use dependsOn for migrations** - Ensure D1 migrations run before Worker deploys +5. **Version Worker content** - Add VERSION binding to force redeployment on content changes +6. **Store secrets in stack config** - Use `pulumi config set --secret` for API keys + +## Limits + +| Resource | Limit | Notes | +|----------|-------|-------| +| Worker script size | 10 MB | Includes all dependencies, after compression | +| Worker CPU time | 50ms (free), 30s (paid) | Per request | +| KV keys per namespace | Unlimited | 1000 ops/sec write, 100k ops/sec read | +| R2 storage | Unlimited | Class A ops: 1M/mo free, Class B: 10M/mo free | +| D1 databases | 50,000 per account | Free: 10 per account, 5 GB each | +| Queues | 10,000 per account | Free: 1M ops/day | +| Pages projects | 500 per account | Free: 100 projects | +| API requests | Varies by plan | ~1200 req/5min on free | + +## Resources + +- **Pulumi Registry:** https://www.pulumi.com/registry/packages/cloudflare/ +- **API Docs:** https://www.pulumi.com/registry/packages/cloudflare/api-docs/ +- **GitHub:** https://github.com/pulumi/pulumi-cloudflare +- **Cloudflare Docs:** https://developers.cloudflare.com/ +- **Workers Docs:** https://developers.cloudflare.com/workers/ + +--- +See: [README.md](./README.md), [configuration.md](./configuration.md), [api.md](./api.md), [patterns.md](./patterns.md) diff --git a/cloudflare/references/pulumi/patterns.md b/cloudflare/references/pulumi/patterns.md new file mode 100644 index 0000000..c843d54 --- /dev/null +++ 
b/cloudflare/references/pulumi/patterns.md
@@ -0,0 +1,191 @@
+# Architecture Patterns
+
+## Component Resources
+
+```typescript
+class WorkerApp extends pulumi.ComponentResource {
+  kv: cloudflare.WorkersKvNamespace;
+  worker: cloudflare.WorkerScript;
+  domain: cloudflare.WorkersDomain;
+
+  constructor(name: string, args: WorkerAppArgs, opts?: pulumi.ComponentResourceOptions) {
+    super("custom:cloudflare:WorkerApp", name, {}, opts);
+    const defaultOpts = {parent: this};
+
+    this.kv = new cloudflare.WorkersKvNamespace(`${name}-kv`, {accountId: args.accountId, title: `${name}-kv`}, defaultOpts);
+    this.worker = new cloudflare.WorkerScript(`${name}-worker`, {
+      accountId: args.accountId, name: `${name}-worker`, content: args.workerCode,
+      module: true, kvNamespaceBindings: [{name: "KV", namespaceId: this.kv.id}],
+    }, defaultOpts);
+    this.domain = new cloudflare.WorkersDomain(`${name}-domain`, {
+      accountId: args.accountId, hostname: args.domain, service: this.worker.name,
+    }, defaultOpts);
+  }
+}
+```
+
+## Full-Stack Worker App
+
+```typescript
+const kv = new cloudflare.WorkersKvNamespace("cache", {accountId, title: "api-cache"});
+const db = new cloudflare.D1Database("db", {accountId, name: "app-database"});
+const bucket = new cloudflare.R2Bucket("assets", {accountId, name: "app-assets"});
+
+const apiWorker = new cloudflare.WorkerScript("api", {
+  accountId, name: "api-worker", content: fs.readFileSync("./dist/api.js", "utf8"),
+  module: true, kvNamespaceBindings: [{name: "CACHE", namespaceId: kv.id}],
+  d1DatabaseBindings: [{name: "DB", databaseId: db.id}],
+  r2BucketBindings: [{name: "ASSETS", bucketName: bucket.name}],
+});
+```
+
+## Multi-Environment Setup
+
+```typescript
+const stack = pulumi.getStack();
+const worker = new cloudflare.WorkerScript(`worker-${stack}`, {
+  accountId, name: `my-worker-${stack}`, content: code,
+  plainTextBindings: [{name: "ENVIRONMENT", text: stack}],
+});
+```
+
+## Queue-Based Processing
+
+```typescript
+const queue = new cloudflare.Queue("processing-queue", {accountId, name: "image-processing"});
+
+// Producer: API receives requests
+const apiWorker = new
cloudflare.WorkerScript("api", { + accountId, name: "api-worker", content: apiCode, + queueBindings: [{name: "PROCESSING_QUEUE", queue: queue.id}], +}); + +// Consumer: Process async +const processorWorker = new cloudflare.WorkerScript("processor", { + accountId, name: "processor-worker", content: processorCode, + queueConsumers: [{queue: queue.name, maxBatchSize: 10, maxRetries: 3, maxWaitTimeMs: 5000}], + r2BucketBindings: [{name: "OUTPUT_BUCKET", bucketName: outputBucket.name}], +}); +``` + +## Microservices with Service Bindings + +```typescript +const authWorker = new cloudflare.WorkerScript("auth", {accountId, name: "auth-service", content: authCode}); +const apiWorker = new cloudflare.WorkerScript("api", { + accountId, name: "api-service", content: apiCode, + serviceBindings: [{name: "AUTH", service: authWorker.name}], +}); +``` + +## Event-Driven Architecture + +```typescript +const eventQueue = new cloudflare.Queue("events", {accountId, name: "event-bus"}); +const producer = new cloudflare.WorkerScript("producer", { + accountId, name: "api-producer", content: producerCode, + queueBindings: [{name: "EVENTS", queue: eventQueue.id}], +}); +const consumer = new cloudflare.WorkerScript("consumer", { + accountId, name: "email-consumer", content: consumerCode, + queueConsumers: [{queue: eventQueue.name, maxBatchSize: 10}], +}); +``` + +## v6.x Versioned Deployments (Blue-Green/Canary) + +```typescript +const worker = new cloudflare.Worker("api", {accountId, name: "api-worker"}); +const v1 = new cloudflare.WorkerVersion("v1", {accountId, workerId: worker.id, content: fs.readFileSync("./dist/v1.js", "utf8"), compatibilityDate: "2025-01-01"}); +const v2 = new cloudflare.WorkerVersion("v2", {accountId, workerId: worker.id, content: fs.readFileSync("./dist/v2.js", "utf8"), compatibilityDate: "2025-01-01"}); + +// Gradual rollout: 10% v2, 90% v1 +const deployment = new cloudflare.WorkersDeployment("canary", { + accountId, workerId: worker.id, + versions: [{versionId: 
v2.id, percentage: 10}, {versionId: v1.id, percentage: 90}],
+  kvNamespaceBindings: [{name: "MY_KV", namespaceId: kv.id}],
+});
+```
+
+**Use:** Canary releases, A/B testing, blue-green. Most apps use `WorkerScript` (auto-versioning).
+
+## Wrangler.toml Generation (Bridge IaC with Local Dev)
+
+Generate wrangler.toml from Pulumi config to keep local dev in sync:
+
+```typescript
+import * as command from "@pulumi/command";
+
+const workerConfig = {
+  name: "my-worker",
+  compatibilityDate: "2025-01-01",
+  compatibilityFlags: ["nodejs_compat"],
+};
+
+// Create resources
+const kv = new cloudflare.WorkersKvNamespace("kv", {accountId, title: "my-kv"});
+const db = new cloudflare.D1Database("db", {accountId, name: "my-db"});
+const bucket = new cloudflare.R2Bucket("bucket", {accountId, name: "my-bucket"});
+
+// Generate wrangler.toml after resources created
+const wranglerGen = new command.local.Command("gen-wrangler", {
+  create: pulumi.interpolate`cat > wrangler.toml <<EOF
+name = "${workerConfig.name}"
+compatibility_date = "${workerConfig.compatibilityDate}"
+
+[[kv_namespaces]]
+binding = "KV"
+id = "${kv.id}"
+
+[[d1_databases]]
+binding = "DB"
+database_name = "my-db"
+database_id = "${db.id}"
+
+[[r2_buckets]]
+binding = "BUCKET"
+bucket_name = "${bucket.name}"
+EOF`,
+}, {dependsOn: [kv, db, bucket]});
+
+// Build before deploy (Pulumi does not bundle Worker code)
+const build = new command.local.Command("build", {create: "npm run build", dir: "./worker"});
+const worker = new cloudflare.WorkerScript("worker", {
+  accountId, name: "my-worker",
+  content: build.stdout.apply(() => fs.readFileSync("./worker/dist/index.js", "utf8")),
+}, {dependsOn: [build]});
+```
+
+## Content SHA Pattern (Force Updates)
+
+Prevent false "no changes" detections:
+
+```typescript
+const version = Date.now().toString();
+const worker = new cloudflare.WorkerScript("worker", {
+  accountId, name: "my-worker", content: code,
+  plainTextBindings: [{name: "VERSION", text: version}], // Forces deployment
+});
+```
+
+---
+See: [README.md](./README.md), [configuration.md](./configuration.md), [api.md](./api.md), [gotchas.md](./gotchas.md)
diff --git a/cloudflare/references/queues/README.md b/cloudflare/references/queues/README.md
new file mode 100644
index 0000000..5588fd2
--- /dev/null
+++ b/cloudflare/references/queues/README.md
@@ -0,0 +1,96 @@
+# Cloudflare Queues
+
+Flexible message queuing for async task processing with guaranteed at-least-once delivery and configurable batching.
+
+## Overview
+
+Queues provide:
+- At-least-once delivery guarantee
+- Push-based (Worker) and pull-based (HTTP) consumers
+- Configurable batching and retries
+- Dead Letter Queues (DLQ)
+- Delays up to 12 hours
+
+**Use cases:** Async processing, API buffering, rate limiting, event workflows, deferred jobs
+
+## Quick Start
+
+```bash
+wrangler queues create my-queue
+wrangler queues consumer add my-queue my-worker
+```
+
+```typescript
+// Producer
+await env.MY_QUEUE.send({ userId: 123, action: 'notify' });
+
+// Consumer (with proper error handling)
+export default {
+  async queue(batch: MessageBatch, env: Env): Promise<void> {
+    for (const msg of batch.messages) {
+      try {
+        await process(msg.body);
+        msg.ack();
+      } catch (error) {
+        msg.retry({ delaySeconds: 60 });
+      }
+    }
+  }
+};
+```
+
+## Critical Warnings
+
+**Before using Queues, understand these production mistakes:**
+
+1. **Uncaught errors retry ENTIRE batch** (not just the failed message). Always use per-message try/catch.
+2. **Messages not ack'd/retry'd will auto-retry forever** until max_retries. Always explicitly handle each message.
+
+See [gotchas.md](./gotchas.md) for detailed solutions.
+
+## Core Operations
+
+| Operation | Purpose | Limit |
+|-----------|---------|-------|
+| `send(body, options?)` | Publish message | 128 KB |
+| `sendBatch(messages)` | Bulk publish | 100 msgs/256 KB |
+| `message.ack()` | Acknowledge success | - |
+| `message.retry(options?)` | Retry with delay | - |
+| `batch.ackAll()` | Ack entire batch | - |
+
+## Architecture
+
+```
+[Producer Worker] → [Queue] → [Consumer Worker/HTTP] → [Processing]
+```
+
+- Max 10,000 queues per account
+- 5,000 msgs/second per queue
+- 4-14 day retention (configurable)
+
+## Reading Order
+
+**New to Queues?** Start here:
+1. [configuration.md](./configuration.md) - Set up queues, bindings, consumers
+2. [api.md](./api.md) - Send messages, handle batches, ack/retry patterns
+3. 
[patterns.md](./patterns.md) - Real-world examples and integrations +4. [gotchas.md](./gotchas.md) - Critical warnings and troubleshooting + +**Task-based routing:** +- Setup queue → [configuration.md](./configuration.md) +- Send/receive messages → [api.md](./api.md) +- Implement specific pattern → [patterns.md](./patterns.md) +- Debug/troubleshoot → [gotchas.md](./gotchas.md) + +## In This Reference + +- [configuration.md](./configuration.md) - wrangler.jsonc setup, producer/consumer config, DLQ, content types +- [api.md](./api.md) - Send/batch methods, queue handler, ack/retry rules, type-safe patterns +- [patterns.md](./patterns.md) - Async tasks, buffering, rate limiting, D1/Workflows/DO integrations +- [gotchas.md](./gotchas.md) - Critical batch error handling, idempotency, error classification + +## See Also + +- [workers](../workers/) - Worker runtime for producers/consumers +- [r2](../r2/) - Process R2 event notifications via queues +- [d1](../d1/) - Batch write to D1 from queue consumers diff --git a/cloudflare/references/queues/api.md b/cloudflare/references/queues/api.md new file mode 100644 index 0000000..ded029c --- /dev/null +++ b/cloudflare/references/queues/api.md @@ -0,0 +1,206 @@ +# Queues API Reference + +## Producer: Send Messages + +```typescript +// Basic send +await env.MY_QUEUE.send({ url: request.url, timestamp: Date.now() }); + +// Options: delay (max 43200s), contentType (json|text|bytes|v8) +await env.MY_QUEUE.send(message, { delaySeconds: 600 }); +await env.MY_QUEUE.send(message, { delaySeconds: 0 }); // Override queue default + +// Batch (up to 100 msgs or 256 KB) +await env.MY_QUEUE.sendBatch([ + { body: 'msg1' }, + { body: 'msg2' }, + { body: 'msg3', options: { delaySeconds: 300 } } +]); + +// Non-blocking with ctx.waitUntil - send continues after response +ctx.waitUntil(env.MY_QUEUE.send({ data: 'async' })); + +// Background tasks in queue consumer +export default { + async queue(batch: MessageBatch, env: Env, ctx: 
ExecutionContext): Promise<void> {
+    for (const msg of batch.messages) {
+      await processMessage(msg.body);
+
+      // Fire-and-forget analytics (doesn't block ack)
+      ctx.waitUntil(
+        env.ANALYTICS_QUEUE.send({ messageId: msg.id, processedAt: Date.now() })
+      );
+
+      msg.ack();
+    }
+  }
+};
+```
+
+## Consumer: Push-based (Worker)
+
+```typescript
+// Type-safe handler with ExportedHandler
+interface Env {
+  MY_QUEUE: Queue;
+  DB: D1Database;
+}
+
+export default {
+  async queue(batch: MessageBatch, env: Env, ctx: ExecutionContext): Promise<void> {
+    // batch.queue, batch.messages.length
+    for (const msg of batch.messages) {
+      // msg.id, msg.body, msg.timestamp, msg.attempts
+      try {
+        await processMessage(msg.body);
+        msg.ack();
+      } catch (error) {
+        msg.retry({ delaySeconds: 600 });
+      }
+    }
+  }
+} satisfies ExportedHandler<Env>;
+```
+
+**CRITICAL WARNINGS:**
+
+1. **Messages not explicitly ack'd or retry'd will auto-retry indefinitely** until `max_retries` is reached. Always call `msg.ack()` or `msg.retry()` for each message.
+
+2. **Throwing uncaught errors retries the ENTIRE batch**, not just the failed message. Always wrap individual message processing in try/catch and call `msg.retry()` explicitly per message.
+
+```typescript
+// ❌ BAD: Uncaught error retries entire batch
+async queue(batch: MessageBatch): Promise<void> {
+  for (const msg of batch.messages) {
+    await riskyOperation(msg.body); // If this throws, entire batch retries
+    msg.ack();
+  }
+}
+
+// ✅ GOOD: Catch per message, handle individually
+async queue(batch: MessageBatch): Promise<void> {
+  for (const msg of batch.messages) {
+    try {
+      await riskyOperation(msg.body);
+      msg.ack();
+    } catch (error) {
+      msg.retry({ delaySeconds: 60 });
+    }
+  }
+}
+```
+
+## Ack/Retry Precedence Rules
+
+1. **Per-message calls take precedence**: If you call both `msg.ack()` and `msg.retry()`, the last call wins
+2. **Batch calls don't override**: `batch.ackAll()` only affects messages without explicit ack/retry
+3. **No action = automatic retry**: Messages with no explicit action retry with the configured delay
+
+```typescript
+async queue(batch: MessageBatch): Promise<void> {
+  for (const msg of batch.messages) {
+    msg.ack();   // Message marked for ack
+    msg.retry(); // Overrides ack - message will retry
+  }
+
+  batch.ackAll(); // Only affects messages not explicitly handled above
+}
+```
+
+## Batch Operations
+
+```typescript
+// Acknowledge entire batch
+try {
+  await bulkProcess(batch.messages);
+  batch.ackAll();
+} catch (error) {
+  batch.retryAll({ delaySeconds: 300 });
+}
+```
+
+## Exponential Backoff
+
+```typescript
+async queue(batch: MessageBatch, env: Env): Promise<void> {
+  for (const msg of batch.messages) {
+    try {
+      await processMessage(msg.body);
+      msg.ack();
+    } catch (error) {
+      // 30s, 60s, 120s, 240s, 480s, ... up to 12h max
+      const delay = Math.min(30 * (2 ** msg.attempts), 43200);
+      msg.retry({ delaySeconds: delay });
+    }
+  }
+}
+```
+
+## Multiple Queues, Single Consumer
+
+```typescript
+export default {
+  async queue(batch: MessageBatch, env: Env): Promise<void> {
+    switch (batch.queue) {
+      case 'high-priority': await processUrgent(batch.messages); break;
+      case 'low-priority': await processDeferred(batch.messages); break;
+      case 'email': await sendEmails(batch.messages); break;
+      default: batch.retryAll();
+    }
+  }
+};
+```
+
+## Consumer: Pull-based (HTTP)
+
+```typescript
+// Pull messages
+const response = await fetch(
+  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/queues/${QUEUE_ID}/messages/pull`,
+  {
+    method: 'POST',
+    headers: { 'authorization': `Bearer ${API_TOKEN}`, 'content-type': 'application/json' },
+    body: JSON.stringify({ visibility_timeout_ms: 6000, batch_size: 50 })
+  }
+);
+
+const data = await response.json();
+
+// Acknowledge
+await fetch(
+  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/queues/${QUEUE_ID}/messages/ack`,
+  {
+    method: 'POST',
+    headers: { 'authorization': `Bearer ${API_TOKEN}`, 'content-type':
'application/json' },
+    body: JSON.stringify({
+      acks: [{ lease_id: msg.lease_id }],
+      retries: [{ lease_id: msg2.lease_id, delay_seconds: 600 }]
+    })
+  }
+);
+```
+
+## Interfaces
+
+```typescript
+interface MessageBatch<Body = unknown> {
+  readonly queue: string;
+  readonly messages: Message<Body>[];
+  ackAll(): void;
+  retryAll(options?: QueueRetryOptions): void;
+}
+
+interface Message<Body = unknown> {
+  readonly id: string;
+  readonly timestamp: Date;
+  readonly body: Body;
+  readonly attempts: number;
+  ack(): void;
+  retry(options?: QueueRetryOptions): void;
+}
+
+interface QueueSendOptions {
+  contentType?: 'text' | 'bytes' | 'json' | 'v8';
+  delaySeconds?: number; // 0-43200
+}
+```
diff --git a/cloudflare/references/queues/configuration.md b/cloudflare/references/queues/configuration.md
new file mode 100644
index 0000000..e6c629f
--- /dev/null
+++ b/cloudflare/references/queues/configuration.md
@@ -0,0 +1,144 @@
+# Queues Configuration
+
+## Create Queue
+
+```bash
+wrangler queues create my-queue
+wrangler queues create my-queue --retention-period-hours=336 # 14 days
+wrangler queues create my-queue --delivery-delay-secs=300
+```
+
+## Producer Binding
+
+**wrangler.jsonc:**
+```jsonc
+{
+  "queues": {
+    "producers": [
+      {
+        "queue": "my-queue-name",
+        "binding": "MY_QUEUE",
+        "delivery_delay": 60 // Optional: default delay in seconds
+      }
+    ]
+  }
+}
+```
+
+## Consumer Configuration (Push-based)
+
+**wrangler.jsonc:**
+```jsonc
+{
+  "queues": {
+    "consumers": [
+      {
+        "queue": "my-queue-name",
+        "max_batch_size": 10,          // 1-100, default 10
+        "max_batch_timeout": 5,        // 0-60s, default 5
+        "max_retries": 3,              // default 3, max 100
+        "dead_letter_queue": "my-dlq", // optional
+        "retry_delay": 300             // optional: delay retries in seconds
+      }
+    ]
+  }
+}
+```
+
+## Consumer Configuration (Pull-based)
+
+**wrangler.jsonc:**
+```jsonc
+{
+  "queues": {
+    "consumers": [
+      {
+        "queue": "my-queue-name",
+        "type": "http_pull",
+        "visibility_timeout_ms": 5000, // default 30000, max 12h
+        "max_retries": 5,
+        "dead_letter_queue": "my-dlq"
+      }
+    ]
+  }
+}
+```
+
+## TypeScript Types
+
+```typescript
+interface Env {
+  MY_QUEUE: Queue<MessageBody>;
+  ANALYTICS_QUEUE: Queue;
+}
+
+interface MessageBody {
+  id: string;
+  action: 'create' | 'update' | 'delete';
+  data: Record<string, unknown>;
+}
+
+export default {
+  async queue(batch: MessageBatch<MessageBody>, env: Env): Promise<void> {
+    for (const msg of batch.messages) {
+      console.log(msg.body.action);
+      msg.ack();
+    }
+  }
+} satisfies ExportedHandler<Env, MessageBody>;
+```
+
+## Content Type Selection
+
+Choose a content type based on consumer type and data requirements:
+
+| Content Type | Use When | Readable By | Supports | Size |
+|--------------|----------|-------------|----------|------|
+| `json` | Pull consumers, dashboard visibility, simple objects | All (push/pull/dashboard) | JSON-serializable types only | Medium |
+| `v8` | Push consumers only, complex JS objects | Push consumers only | Date, Map, Set, BigInt, typed arrays | Small |
+| `text` | String-only payloads | All | Strings only | Smallest |
+| `bytes` | Binary data (images, files) | All | ArrayBuffer, Uint8Array | Variable |
+
+**Decision tree:**
+1. Need to view in dashboard or use pull consumer? → Use `json`
+2. Need Date, Map, Set, or other V8 types? → Use `v8` (push consumers only)
+3. Just strings? → Use `text`
+4. Binary data? → Use `bytes`
+
+```typescript
+// JSON: Good for simple objects, pull consumers, dashboard visibility
+await env.QUEUE.send({ id: 123, name: 'test' }, { contentType: 'json' });
+
+// V8: Good for Date, Map, Set (push consumers only)
+await env.QUEUE.send({
+  created: new Date(),
+  tags: new Set(['a', 'b'])
+}, { contentType: 'v8' });
+
+// Text: Simple strings
+await env.QUEUE.send('process-user-123', { contentType: 'text' });
+
+// Bytes: Binary data
+await env.QUEUE.send(imageBuffer, { contentType: 'bytes' });
+```
+
+**Default behavior:** If not specified, Cloudflare auto-selects `json` for JSON-serializable objects and `v8` for complex types.
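The decision tree above can be sketched as a small helper (illustrative only — `pickContentType` and `ConsumerKind` are not part of the Queues API):

```typescript
type ConsumerKind = "push" | "pull" | "dashboard";
type ContentType = "json" | "v8" | "text" | "bytes";

// Hypothetical helper mirroring the decision tree; not part of the Queues API.
function pickContentType(value: unknown, consumer: ConsumerKind): ContentType {
  if (typeof value === "string") return "text"; // rule 3: plain strings
  if (value instanceof ArrayBuffer || value instanceof Uint8Array) return "bytes"; // rule 4: binary
  const needsV8 =
    value instanceof Date || value instanceof Map || value instanceof Set ||
    typeof value === "bigint";
  // Rule 1 beats rule 2: v8 payloads are unreadable outside push consumers
  if (needsV8 && consumer === "push") return "v8";
  return "json";
}
```

For example, a `Map` payload gets `v8` for a push consumer but falls back to `json` when the pull path or dashboard visibility is required.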
+
+**IMPORTANT:** `v8` messages cannot be read by pull consumers or viewed in the dashboard. Use `json` if you need visibility or pull-based consumption.
+
+## CLI Commands
+
+```bash
+# Consumer management
+wrangler queues consumer add my-queue my-worker --batch-size=50 --max-retries=5
+wrangler queues consumer http add my-queue
+wrangler queues consumer worker remove my-queue my-worker
+wrangler queues consumer http remove my-queue
+
+# Queue operations
+wrangler queues list
+wrangler queues pause my-queue
+wrangler queues resume my-queue
+wrangler queues purge my-queue
+wrangler queues delete my-queue
+```
diff --git a/cloudflare/references/queues/gotchas.md b/cloudflare/references/queues/gotchas.md
new file mode 100644
index 0000000..b93cbe2
--- /dev/null
+++ b/cloudflare/references/queues/gotchas.md
@@ -0,0 +1,206 @@
+# Queues Gotchas & Troubleshooting
+
+## CRITICAL: Top Production Mistakes
+
+### 1. "Entire Batch Retried After Single Error"
+
+**Problem:** Throwing an uncaught error in the queue handler retries the entire batch, not just the failed message
+**Cause:** Uncaught exceptions propagate to the runtime, triggering a batch-level retry
+**Solution:** Always wrap individual message processing in try/catch and call `msg.retry()` explicitly
+
+```typescript
+// ❌ BAD: Throws error, retries entire batch
+async queue(batch: MessageBatch): Promise<void> {
+  for (const msg of batch.messages) {
+    await riskyOperation(msg.body); // If this throws, entire batch retries
+    msg.ack();
+  }
+}
+
+// ✅ GOOD: Catch per message, handle individually
+async queue(batch: MessageBatch): Promise<void> {
+  for (const msg of batch.messages) {
+    try {
+      await riskyOperation(msg.body);
+      msg.ack();
+    } catch (error) {
+      msg.retry({ delaySeconds: 60 });
+    }
+  }
+}
+```
+
+### 2. "Messages Retry Forever"
"Messages Retry Forever"

**Problem:** Messages not explicitly acknowledged or retried will auto-retry indefinitely
**Cause:** Runtime default behavior retries unhandled messages until `max_retries` is reached
**Solution:** Always call `msg.ack()` or `msg.retry()` for each message. Never leave messages unhandled.

```typescript
// ❌ BAD: Skipped messages auto-retry forever
async queue(batch: MessageBatch): Promise<void> {
  for (const msg of batch.messages) {
    if (shouldProcess(msg.body)) {
      await process(msg.body);
      msg.ack();
    }
    // Missing: msg.ack() for skipped messages - they will retry!
  }
}

// ✅ GOOD: Explicitly handle all messages
async queue(batch: MessageBatch): Promise<void> {
  for (const msg of batch.messages) {
    if (shouldProcess(msg.body)) {
      await process(msg.body);
      msg.ack();
    } else {
      msg.ack(); // Explicitly ack even if not processing
    }
  }
}
```

## Common Errors

### "Duplicate Message Processing"

**Problem:** Same message processed multiple times
**Cause:** At-least-once delivery guarantee means duplicates are possible during retries
**Solution:** Design consumers to be idempotent by tracking processed message IDs in KV with an expiration TTL

```typescript
async queue(batch: MessageBatch, env: Env): Promise<void> {
  for (const msg of batch.messages) {
    const processed = await env.PROCESSED_KV.get(msg.id);
    if (processed) {
      msg.ack();
      continue;
    }

    await processMessage(msg.body);
    await env.PROCESSED_KV.put(msg.id, '1', { expirationTtl: 86400 });
    msg.ack();
  }
}
```

### "Pull Consumer Can't Decode Messages"

**Problem:** Pull consumer or dashboard shows unreadable message bodies
**Cause:** Messages sent with `v8` content type are only decodable by Workers push consumers
**Solution:** Use `json` content type for pull consumers or dashboard visibility

```typescript
// Use json for pull consumers
await env.MY_QUEUE.send(data, { contentType: 'json' });

// Use v8 only for push consumers with complex JS
types
await env.MY_QUEUE.send({ date: new Date(), tags: new Set() }, { contentType: 'v8' });
```

### "Messages Not Being Delivered"

**Problem:** Messages sent but consumer not processing
**Cause:** Queue paused, consumer not configured, or consumer errors
**Solution:** Check queue status with `wrangler queues list`, verify the consumer is configured with `wrangler queues consumer add`, and check logs with `wrangler tail`

### "High Dead Letter Queue Rate"

**Problem:** Many messages ending up in DLQ
**Cause:** Consumer repeatedly failing to process messages after max retries
**Solution:** Review consumer error logs, check external dependency availability, verify message format matches expectations, or increase retry delay

## Error Classification Patterns

Classify errors to decide whether to retry or DLQ:

```typescript
async queue(batch: MessageBatch, env: Env): Promise<void> {
  for (const msg of batch.messages) {
    try {
      await processMessage(msg.body);
      msg.ack();
    } catch (error) {
      // Transient errors: retry with backoff
      if (isRetryable(error)) {
        const delay = Math.min(30 * (2 ** msg.attempts), 43200);
        msg.retry({ delaySeconds: delay });
      }
      // Permanent errors: log, then ack to avoid infinite retries
      else {
        console.error('Permanent error, logging and dropping:', error);
        await env.ERROR_LOG.put(msg.id, JSON.stringify({ msg: msg.body, error: String(error) }));
        msg.ack(); // Prevent further retries
      }
    }
  }
}

function isRetryable(error: unknown): boolean {
  if (error instanceof Response) {
    // Retry: rate limits, timeouts, server errors
    return error.status === 429 || error.status >= 500;
  }
  if (error instanceof Error) {
    // Don't retry: validation, auth, not found
    return !error.message.includes('validation') &&
           !error.message.includes('unauthorized') &&
           !error.message.includes('not found');
  }
  return false; // Unknown errors don't retry
}
```

### "CPU Time Exceeded in Consumer"

**Problem:** Consumer fails with CPU
time limit exceeded +**Cause:** Consumer processing exceeding 30s default CPU time limit +**Solution:** Increase CPU limit in wrangler.jsonc: `{ "limits": { "cpu_ms": 300000 } }` (5 minutes max) + +## Content Type Decision Guide + +**When to use each content type:** + +| Content Type | Use When | Readable By | Supports | +|--------------|----------|-------------|----------| +| `json` (default) | Pull consumers, dashboard visibility, simple objects | All (push/pull/dashboard) | JSON-serializable types only | +| `v8` | Push consumers only, complex JS objects | Push consumers only | Date, Map, Set, BigInt, typed arrays | +| `text` | String-only payloads | All | Strings only | +| `bytes` | Binary data (images, files) | All | ArrayBuffer, Uint8Array | + +**Decision tree:** +1. Need to view in dashboard or use pull consumer? → Use `json` +2. Need Date, Map, Set, or other V8 types? → Use `v8` (push consumers only) +3. Just strings? → Use `text` +4. Binary data? → Use `bytes` + +```typescript +// Dashboard/pull: use json +await env.QUEUE.send({ id: 123, name: 'test' }, { contentType: 'json' }); + +// Complex JS types (push only): use v8 +await env.QUEUE.send({ + created: new Date(), + tags: new Set(['a', 'b']) +}, { contentType: 'v8' }); +``` + +## Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Max queues | 10,000 | Per account | +| Message size | 128 KB | Maximum per message | +| Batch size (consumer) | 100 messages | Maximum messages per batch | +| Batch size (sendBatch) | 100 msgs or 256 KB | Whichever limit reached first | +| Throughput | 5,000 msgs/sec | Per queue | +| Retention | 4-14 days | Configurable retention period | +| Max backlog | 25 GB | Maximum queue backlog size | +| Max delay | 12 hours (43,200s) | Maximum message delay | +| Max retries | 100 | Maximum retry attempts | +| CPU time default | 30s | Per consumer invocation | +| CPU time max | 300s (5 min) | Configurable via `limits.cpu_ms` | +| Operations per message | 3 (write + read + 
delete) | Base cost per message |
| Pricing | $0.40 per 1M operations | After 1M free operations |
| Message charging | Per 64 KB chunk | Messages charged in 64 KB increments |
diff --git a/cloudflare/references/queues/patterns.md b/cloudflare/references/queues/patterns.md
new file mode 100644
index 0000000..9ff01c1
--- /dev/null
+++ b/cloudflare/references/queues/patterns.md
@@ -0,0 +1,220 @@

# Queues Patterns & Best Practices

## Async Task Processing

```typescript
// Producer: Accept request, queue work
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { userId, reportType } = await request.json<{ userId: string; reportType: string }>();
    await env.REPORT_QUEUE.send({ userId, reportType, requestedAt: Date.now() });
    return Response.json({ message: 'Report queued', status: 'pending' });
  }
};

// Consumer: Process reports
export default {
  async queue(batch: MessageBatch, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      const { userId, reportType } = msg.body;
      const report = await generateReport(userId, reportType, env);
      await env.REPORTS_BUCKET.put(`${userId}/${reportType}.pdf`, report);
      msg.ack();
    }
  }
};
```

## Buffering API Calls

```typescript
// Producer: Queue log entries
ctx.waitUntil(env.LOGS_QUEUE.send({
  method: request.method,
  url: request.url,
  timestamp: Date.now()
}));

// Consumer: Batch write to external API
async queue(batch: MessageBatch, env: Env): Promise<void> {
  const logs = batch.messages.map(m => m.body);
  await fetch(env.LOG_ENDPOINT, { method: 'POST', body: JSON.stringify({ logs }) });
  batch.ackAll();
}
```

## Rate Limiting Upstream

```typescript
async queue(batch: MessageBatch, env: Env): Promise<void> {
  for (const msg of batch.messages) {
    try {
      await callRateLimitedAPI(msg.body);
      msg.ack();
    } catch (error: any) {
      if (error.status === 429) {
        const retryAfter = parseInt(error.headers.get('Retry-After') || '60');
        msg.retry({ delaySeconds: retryAfter });
      } else throw error;
    }
  }
}
```

## Event-Driven Workflows

```typescript
// R2 event → Queue → Worker
export default {
  async queue(batch: MessageBatch, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      const event = msg.body;
      if (event.action === 'PutObject') {
        await processNewFile(event.object.key, env);
      } else if (event.action === 'DeleteObject') {
        await cleanupReferences(event.object.key, env);
      }
      msg.ack();
    }
  }
};
```

## Dead Letter Queue Pattern

```typescript
// Main queue: After max_retries, goes to DLQ automatically
export default {
  async queue(batch: MessageBatch, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      try {
        await riskyOperation(msg.body);
        msg.ack();
      } catch (error) {
        console.error(`Failed after ${msg.attempts} attempts:`, error);
      }
    }
  }
};

// DLQ consumer: Log and store failed messages
export default {
  async queue(batch: MessageBatch, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      await env.FAILED_KV.put(msg.id, JSON.stringify(msg.body));
      msg.ack();
    }
  }
};
```

## Priority Queues

High priority: `max_batch_size: 5, max_batch_timeout: 1`. Low priority: `max_batch_size: 100, max_batch_timeout: 30`.
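A sketch of how those two consumer configurations might look in `wrangler.jsonc` (the queue names `jobs-high` and `jobs-low` are illustrative, not from the original):

```jsonc
{
  "queues": {
    "consumers": [
      {
        // High priority: small batches, short timeout for low latency
        "queue": "jobs-high",
        "max_batch_size": 5,
        "max_batch_timeout": 1
      },
      {
        // Low priority: large batches, long timeout for throughput
        "queue": "jobs-low",
        "max_batch_size": 100,
        "max_batch_timeout": 30
      }
    ]
  }
}
```

Producers then route each message to the high- or low-priority queue based on its urgency.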

## Delayed Job Processing

```typescript
await env.EMAIL_QUEUE.send({ to, template, userId }, { delaySeconds: 3600 });
```

## Fan-out Pattern

```typescript
async fetch(request: Request, env: Env): Promise<Response> {
  const event = await request.json();

  // Send to multiple queues for parallel processing
  await Promise.all([
    env.ANALYTICS_QUEUE.send(event),
    env.NOTIFICATIONS_QUEUE.send(event),
    env.AUDIT_LOG_QUEUE.send(event)
  ]);

  return Response.json({ status: 'processed' });
}
```

## Idempotency Pattern

```typescript
async queue(batch: MessageBatch, env: Env): Promise<void> {
  for (const msg of batch.messages) {
    // Check if already processed
    const processed = await env.PROCESSED_KV.get(msg.id);
    if (processed) {
      msg.ack();
      continue;
    }

    await processMessage(msg.body);
    await env.PROCESSED_KV.put(msg.id, '1', { expirationTtl: 86400 });
    msg.ack();
  }
}
```

## Integration: D1 Batch Writes

```typescript
async queue(batch: MessageBatch, env: Env): Promise<void> {
  // Collect all inserts for single D1 batch
  const statements = batch.messages.map(msg =>
    env.DB.prepare('INSERT INTO events (id, data, created) VALUES (?, ?, ?)')
      .bind(msg.id, JSON.stringify(msg.body), Date.now())
  );

  try {
    await env.DB.batch(statements);
    batch.ackAll();
  } catch (error) {
    console.error('D1 batch failed:', error);
    batch.retryAll({ delaySeconds: 60 });
  }
}
```

## Integration: Workflows

```typescript
// Queue triggers Workflow for long-running tasks
async queue(batch: MessageBatch, env: Env): Promise<void> {
  for (const msg of batch.messages) {
    try {
      const instance = await env.MY_WORKFLOW.create({
        id: msg.id,
        params: msg.body
      });
      console.log('Workflow started:', instance.id);
      msg.ack();
    } catch (error) {
      msg.retry({ delaySeconds: 30 });
    }
  }
}
```

## Integration: Durable Objects

```typescript
// Queue distributes work to Durable Objects by ID
async queue(batch: MessageBatch, env: Env): Promise<void>
{ + for (const msg of batch.messages) { + const { userId, action } = msg.body; + + // Route to user-specific DO + const id = env.USER_DO.idFromName(userId); + const stub = env.USER_DO.get(id); + + try { + await stub.fetch(new Request('https://do/process', { + method: 'POST', + body: JSON.stringify({ action, messageId: msg.id }) + })); + msg.ack(); + } catch (error) { + msg.retry({ delaySeconds: 60 }); + } + } +} +``` diff --git a/cloudflare/references/r2-data-catalog/README.md b/cloudflare/references/r2-data-catalog/README.md new file mode 100644 index 0000000..88702fa --- /dev/null +++ b/cloudflare/references/r2-data-catalog/README.md @@ -0,0 +1,149 @@ +# Cloudflare R2 Data Catalog Skill Reference + +Expert guidance for Cloudflare R2 Data Catalog - Apache Iceberg catalog built into R2 buckets. + +## Reading Order + +**New to R2 Data Catalog?** Start here: +1. Read "What is R2 Data Catalog?" and "When to Use" below +2. [configuration.md](configuration.md) - Enable catalog, create tokens +3. [patterns.md](patterns.md) - PyIceberg setup and common patterns +4. [api.md](api.md) - REST API reference as needed +5. [gotchas.md](gotchas.md) - Troubleshooting when issues arise + +**Quick reference?** Jump to: +- [Enable catalog on bucket](configuration.md#enable-catalog-on-bucket) +- [PyIceberg connection pattern](patterns.md#pyiceberg-connection-pattern) +- [Permission errors](gotchas.md#permission-errors) + +## What is R2 Data Catalog? + +R2 Data Catalog is a **managed Apache Iceberg REST catalog** built directly into R2 buckets. 
It provides: + +- **Apache Iceberg tables** - ACID transactions, schema evolution, time-travel queries +- **Zero-egress costs** - Query from any cloud/region without data transfer fees +- **Standard REST API** - Works with Spark, PyIceberg, Snowflake, Trino, DuckDB +- **No infrastructure** - Fully managed, no catalog servers to run +- **Public beta** - Available to all R2 subscribers, no extra cost beyond R2 storage + +### What is Apache Iceberg? + +Open table format for analytics datasets in object storage. Features: +- **ACID transactions** - Safe concurrent reads/writes +- **Metadata optimization** - Fast queries without full scans +- **Schema evolution** - Add/rename/delete columns without rewrites +- **Time-travel** - Query historical snapshots +- **Partitioning** - Organize data for efficient queries + +## When to Use + +**Use R2 Data Catalog for:** +- **Log analytics** - Store and query application/system logs +- **Data lakes/warehouses** - Analytical datasets queried by multiple engines +- **BI pipelines** - Aggregate data for dashboards and reports +- **Multi-cloud analytics** - Share data across clouds without egress fees +- **Time-series data** - Event streams, metrics, sensor data + +**Don't use for:** +- **Transactional workloads** - Use D1 or external database instead +- **Sub-second latency** - Iceberg optimized for batch/analytical queries +- **Small datasets (<1GB)** - Setup overhead not worth it +- **Unstructured data** - Store files directly in R2, not as Iceberg tables + +## Architecture + +``` +┌─────────────────────────────────────────────────┐ +│ Query Engines │ +│ (PyIceberg, Spark, Trino, Snowflake, DuckDB) │ +└────────────────┬────────────────────────────────┘ + │ + │ REST API (OAuth2 token) + ▼ +┌─────────────────────────────────────────────────┐ +│ R2 Data Catalog (Managed Iceberg REST Catalog)│ +│ • Namespace/table metadata │ +│ • Transaction coordination │ +│ • Snapshot management │ +└────────────────┬────────────────────────────────┘ 
                 │
                 │ Vended credentials
                 ▼
┌─────────────────────────────────────────────────┐
│              R2 Bucket Storage                  │
│  • Parquet data files                           │
│  • Metadata files                               │
│  • Manifest files                               │
└─────────────────────────────────────────────────┘
```

**Key concepts:**
- **Catalog URI** - REST endpoint for catalog operations (e.g., `https://<account-id>.r2.cloudflarestorage.com/iceberg/<bucket-name>`)
- **Warehouse** - Logical grouping of tables (typically same as bucket name)
- **Namespace** - Schema/database containing tables (e.g., `logs`, `analytics`)
- **Table** - Iceberg table with schema, data files, snapshots
- **Vended credentials** - Temporary S3 credentials the catalog provides for data access

## Limits

| Resource | Limit | Notes |
|----------|-------|-------|
| Namespaces per catalog | No hard limit | Organize tables logically |
| Tables per namespace | <10,000 recommended | Performance degrades beyond this |
| Files per table | <100,000 recommended | Run compaction regularly |
| Snapshots per table | Configurable retention | Expire >7 days old |
| Partitions per table | 100-1,000 optimal | Too many = slow metadata ops |
| Table size | Same as R2 bucket | 10GB-10TB+ common |
| API rate limits | Standard R2 API limits | Shared with R2 storage operations |
| Target file size | 128-512 MB | After compaction |

## Current Status

**Public Beta** (as of Jan 2026)
- Available to all R2 subscribers
- No extra cost beyond standard R2 storage/operations
- Production-ready, but breaking changes possible
- Supports: namespaces, tables, snapshots, compaction, time-travel, table maintenance

## Decision Tree: Is R2 Data Catalog Right For You?

```
Start → Need analytics on object storage data?
        │
        ├─ No → Use R2 directly for object storage
        │
        └─ Yes → Dataset >1GB with structured schema?
                 │
                 ├─ No → Too small, use R2 + ad-hoc queries
                 │
                 └─ Yes → Need ACID transactions or schema evolution?
+ │ + ├─ No → Consider simpler solutions (Parquet on R2) + │ + └─ Yes → Need multi-cloud/multi-tool access? + │ + ├─ No → D1 or external DB may be simpler + │ + └─ Yes → ✅ Use R2 Data Catalog +``` + +**Quick check:** If you answer "yes" to all: +- Dataset >1GB and growing +- Structured/tabular data (logs, events, metrics) +- Multiple query tools or cloud environments +- Need versioning, schema changes, or concurrent access + +→ R2 Data Catalog is a good fit. + +## In This Reference + +- **[configuration.md](configuration.md)** - Enable catalog, create API tokens, connect clients +- **[api.md](api.md)** - REST endpoints, operations, maintenance +- **[patterns.md](patterns.md)** - PyIceberg examples, common use cases +- **[gotchas.md](gotchas.md)** - Troubleshooting, best practices, limitations + +## See Also + +- [Cloudflare R2 Data Catalog Docs](https://developers.cloudflare.com/r2/data-catalog/) +- [Apache Iceberg Docs](https://iceberg.apache.org/) +- [PyIceberg Docs](https://py.iceberg.apache.org/) diff --git a/cloudflare/references/r2-data-catalog/api.md b/cloudflare/references/r2-data-catalog/api.md new file mode 100644 index 0000000..3d57d4f --- /dev/null +++ b/cloudflare/references/r2-data-catalog/api.md @@ -0,0 +1,199 @@ +# API Reference + +R2 Data Catalog exposes standard [Apache Iceberg REST Catalog API](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml). 

## Quick Reference

**Most common operations:**

| Task | PyIceberg Code |
|------|----------------|
| Connect | `RestCatalog(name="r2", warehouse=bucket, uri=uri, token=token)` |
| List namespaces | `catalog.list_namespaces()` |
| Create namespace | `catalog.create_namespace("logs")` |
| Create table | `catalog.create_table(("ns", "table"), schema=schema)` |
| Load table | `catalog.load_table(("ns", "table"))` |
| Append data | `table.append(pyarrow_table)` |
| Query data | `table.scan().to_pandas()` |
| Compact files | `table.rewrite_data_files(target_file_size_bytes=128*1024*1024)` |
| Expire snapshots | `table.expire_snapshots(older_than=timestamp_ms, retain_last=10)` |

## REST Endpoints

Base: `https://<account-id>.r2.cloudflarestorage.com/iceberg/<bucket-name>`

| Operation | Method | Path |
|-----------|--------|------|
| Catalog config | GET | `/v1/config` |
| List namespaces | GET | `/v1/namespaces` |
| Create namespace | POST | `/v1/namespaces` |
| Delete namespace | DELETE | `/v1/namespaces/{ns}` |
| List tables | GET | `/v1/namespaces/{ns}/tables` |
| Create table | POST | `/v1/namespaces/{ns}/tables` |
| Load table | GET | `/v1/namespaces/{ns}/tables/{table}` |
| Update table | POST | `/v1/namespaces/{ns}/tables/{table}` |
| Delete table | DELETE | `/v1/namespaces/{ns}/tables/{table}` |
| Rename table | POST | `/v1/tables/rename` |

**Authentication:** Bearer token in header: `Authorization: Bearer <token>`

## PyIceberg Client API

Most users use PyIceberg, not raw REST.

### Connection

```python
from pyiceberg.catalog.rest import RestCatalog

catalog = RestCatalog(
    name="my_catalog",
    warehouse="<warehouse-name>",
    uri="<catalog-uri>",
    token="<api-token>",
)
```

### Namespace Operations

```python
from pyiceberg.exceptions import NamespaceAlreadyExistsError

namespaces = catalog.list_namespaces()  # [('default',), ('logs',)]
catalog.create_namespace("logs", properties={"owner": "team"})
catalog.drop_namespace("logs")  # Must be empty
```

### Table Operations

```python
from pyiceberg.schema import Schema
from pyiceberg.types import NestedField, StringType, IntegerType

schema = Schema(
    NestedField(1, "id", IntegerType(), required=True),
    NestedField(2, "name", StringType(), required=False),
)
table = catalog.create_table(("logs", "app_logs"), schema=schema)
tables = catalog.list_tables("logs")
table = catalog.load_table(("logs", "app_logs"))
catalog.rename_table(("logs", "old"), ("logs", "new"))
```

### Data Operations

```python
import pyarrow as pa

data = pa.table({"id": [1, 2], "name": ["Alice", "Bob"]})
table.append(data)
table.overwrite(data)

# Read with filters
scan = table.scan(row_filter="id > 100", selected_fields=["id", "name"])
df = scan.to_pandas()
```

### Schema Evolution

```python
from pyiceberg.types import IntegerType, LongType

with table.update_schema() as update:
    update.add_column("user_id", IntegerType(), doc="User ID")
    update.rename_column("msg", "message")
    update.delete_column("old_field")
    update.update_column("id", field_type=LongType())  # int→long only
```

### Time-Travel

```python
from datetime import datetime, timedelta

# Query specific snapshot or timestamp
scan = table.scan(snapshot_id=table.snapshots()[-2].snapshot_id)
yesterday_ms = int((datetime.now() - timedelta(days=1)).timestamp() * 1000)
scan = table.scan(as_of_timestamp=yesterday_ms)
```

### Partitioning

```python
from pyiceberg.partitioning import PartitionSpec, PartitionField
from
pyiceberg.transforms import DayTransform +from pyiceberg.types import TimestampType + +partition_spec = PartitionSpec( + PartitionField(source_id=1, field_id=1000, transform=DayTransform(), name="day") +) +table = catalog.create_table(("events", "actions"), schema=schema, partition_spec=partition_spec) +scan = table.scan(row_filter="day = '2026-01-27'") # Prunes partitions +``` + +## Table Maintenance + +### Compaction + +```python +files = table.scan().plan_files() +avg_mb = sum(f.file_size_in_bytes for f in files) / len(files) / (1024**2) +print(f"Files: {len(files)}, Avg: {avg_mb:.1f} MB") + +table.rewrite_data_files(target_file_size_bytes=128 * 1024 * 1024) +``` + +**When:** Avg <10MB or >1000 files. **Frequency:** High-write daily, medium weekly. + +### Snapshot Expiration + +```python +from datetime import datetime, timedelta + +seven_days_ms = int((datetime.now() - timedelta(days=7)).timestamp() * 1000) +table.expire_snapshots(older_than=seven_days_ms, retain_last=10) +``` + +**Retention:** Production 7-30d, dev 1-7d, audit 90+d. + +### Orphan Cleanup + +```python +three_days_ms = int((datetime.now() - timedelta(days=3)).timestamp() * 1000) +table.delete_orphan_files(older_than=three_days_ms) +``` + +⚠️ Always expire snapshots first, use 3+ day threshold, run during low traffic. 
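The "when to compact" thresholds above can be wrapped in a small standalone helper (a sketch; `needs_compaction` is a hypothetical name, not a PyIceberg API — feed it the file sizes gathered from `table.scan().plan_files()`):

```python
# Hypothetical helper encoding the guide's thresholds: compact when the
# average data file is under 10 MB or the table has more than 1,000 files.
def needs_compaction(file_sizes_bytes, max_files=1000, min_avg_mb=10):
    """Return True if a table's data files warrant compaction."""
    if not file_sizes_bytes:
        return False
    if len(file_sizes_bytes) > max_files:
        return True
    avg_mb = sum(file_sizes_bytes) / len(file_sizes_bytes) / (1024 ** 2)
    return avg_mb < min_avg_mb

# 2,000 tiny 1 MB files clearly need compaction
print(needs_compaction([1 * 1024 * 1024] * 2000))  # → True
# A handful of healthy 200 MB files do not
print(needs_compaction([200 * 1024 * 1024] * 5))   # → False
```

Running such a check on a schedule makes the decision reproducible instead of eyeballing file counts per table.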

### Full Maintenance

```python
# Compact → Expire → Cleanup (in order)
if len(table.scan().plan_files()) > 1000:
    table.rewrite_data_files(target_file_size_bytes=128 * 1024 * 1024)
seven_days_ms = int((datetime.now() - timedelta(days=7)).timestamp() * 1000)
table.expire_snapshots(older_than=seven_days_ms, retain_last=10)
three_days_ms = int((datetime.now() - timedelta(days=3)).timestamp() * 1000)
table.delete_orphan_files(older_than=three_days_ms)
```

## Metadata Inspection

```python
table = catalog.load_table(("logs", "app_logs"))
print(table.schema())
print(table.current_snapshot())
print(table.properties)
print(f"Files: {len(table.scan().plan_files())}")
```

## Error Codes

| Code | Meaning | Common Causes |
|------|---------|---------------|
| 401 | Unauthorized | Invalid/missing token |
| 404 | Not Found | Catalog not enabled, namespace/table missing |
| 409 | Conflict | Already exists, concurrent update |
| 422 | Validation | Invalid schema, incompatible type |

See [gotchas.md](gotchas.md) for detailed troubleshooting.
diff --git a/cloudflare/references/r2-data-catalog/configuration.md b/cloudflare/references/r2-data-catalog/configuration.md
new file mode 100644
index 0000000..15915da
--- /dev/null
+++ b/cloudflare/references/r2-data-catalog/configuration.md
@@ -0,0 +1,198 @@

# Configuration

How to enable R2 Data Catalog and configure authentication.

## Prerequisites

- Cloudflare account with [R2 subscription](https://developers.cloudflare.com/r2/pricing/)
- R2 bucket created
- Access to Cloudflare dashboard or Wrangler CLI

## Enable Catalog on Bucket

Choose one method:

### Via Wrangler (Recommended)

```bash
npx wrangler r2 bucket catalog enable <bucket-name>
```

**Output:**
```
✅ Data Catalog enabled for bucket 'my-bucket'
   Catalog URI: https://<account-id>.r2.cloudflarestorage.com/iceberg/my-bucket
   Warehouse: my-bucket
```

### Via Dashboard

1.
Navigate to **R2** → Select your bucket → **Settings** tab
2. Scroll to "R2 Data Catalog" section → Click **Enable**
3. Note the **Catalog URI** and **Warehouse name** shown

**Result:**
- Catalog URI: `https://<account-id>.r2.cloudflarestorage.com/iceberg/<bucket-name>`
- Warehouse: `<bucket-name>` (same as bucket name)

### Via API (Programmatic)

```bash
curl -X POST \
  "https://api.cloudflare.com/client/v4/accounts/<account-id>/r2/buckets/<bucket-name>/catalog" \
  -H "Authorization: Bearer <api-token>" \
  -H "Content-Type: application/json"
```

**Response:**
```json
{
  "result": {
    "catalog_uri": "https://<account-id>.r2.cloudflarestorage.com/iceberg/<bucket-name>",
    "warehouse": "<bucket-name>"
  },
  "success": true
}
```

## Check Catalog Status

```bash
npx wrangler r2 bucket catalog status <bucket-name>
```

**Output:**
```
Catalog Status: enabled
Catalog URI: https://<account-id>.r2.cloudflarestorage.com/iceberg/my-bucket
Warehouse: my-bucket
```

## Disable Catalog (If Needed)

```bash
npx wrangler r2 bucket catalog disable <bucket-name>
```

⚠️ **Warning:** Disabling does NOT delete tables/data. Files remain in the bucket. Metadata becomes inaccessible until re-enabled.

## API Token Creation

R2 Data Catalog requires an API token with **both** R2 Storage + R2 Data Catalog permissions.

### Dashboard Method (Recommended)

1. Go to **R2** → **Manage R2 API Tokens** → **Create API Token**
2. Select permission level:
   - **Admin Read & Write** - Full catalog + storage access (read/write)
   - **Admin Read only** - Read-only access (for query engines)
3. Copy the token value immediately (shown only once)

**Permission groups included:**
- `Workers R2 Data Catalog Write` (or Read)
- `Workers R2 Storage Bucket Item Write` (or Read)

### API Method (Programmatic)

Use the Cloudflare API to create tokens programmatically.
Required permissions:
- `Workers R2 Data Catalog Write` (or Read)
- `Workers R2 Storage Bucket Item Write` (or Read)

## Client Configuration

### PyIceberg

```python
from pyiceberg.catalog.rest import RestCatalog

catalog = RestCatalog(
    name="my_catalog",
    warehouse="<bucket-name>",  # Same as bucket name
    uri="<catalog-uri>",        # From enable command
    token="<api-token>",        # From token creation
)
```

**Full example with credentials:**
```python
import os
from pyiceberg.catalog.rest import RestCatalog

# Store credentials in environment variables
WAREHOUSE = os.getenv("R2_WAREHOUSE")      # e.g., "my-bucket"
CATALOG_URI = os.getenv("R2_CATALOG_URI")  # e.g., "https://abc123.r2.cloudflarestorage.com/iceberg/my-bucket"
TOKEN = os.getenv("R2_TOKEN")              # API token

catalog = RestCatalog(
    name="r2_catalog",
    warehouse=WAREHOUSE,
    uri=CATALOG_URI,
    token=TOKEN,
)

# Test connection
print(catalog.list_namespaces())
```

### Spark / Trino / DuckDB

See [patterns.md](patterns.md) for integration examples with other query engines.

## Connection String Format

For quick reference:

```
Catalog URI: https://<account-id>.r2.cloudflarestorage.com/iceberg/<bucket-name>
Warehouse:   <bucket-name>
Token:       <api-token>
```

**Where to find values:**

| Value | Source |
|-------|--------|
| `<account-id>` | Dashboard URL or `wrangler whoami` |
| `<bucket-name>` | R2 bucket name |
| Catalog URI | Output from `wrangler r2 bucket catalog enable` |
| Token | R2 API Token creation page |

## Security Best Practices

1. **Store tokens securely** - Use environment variables or secret managers, never hardcode
2. **Use least privilege** - Read-only tokens for query engines, write tokens only where needed
3. **Rotate tokens regularly** - Create new tokens, test, then revoke old ones
4. **One token per application** - Easier to track and revoke if compromised
5. **Monitor token usage** - Check R2 analytics for unexpected patterns
6.
**Bucket-scoped tokens** - Create tokens per bucket, not account-wide

## Environment Variables Pattern

```bash
# .env (never commit)
R2_CATALOG_URI=https://<account-id>.r2.cloudflarestorage.com/iceberg/<bucket-name>
R2_WAREHOUSE=<bucket-name>
R2_TOKEN=<api-token>
```

```python
import os
from pyiceberg.catalog.rest import RestCatalog

catalog = RestCatalog(
    name="r2",
    uri=os.getenv("R2_CATALOG_URI"),
    warehouse=os.getenv("R2_WAREHOUSE"),
    token=os.getenv("R2_TOKEN"),
)
```

## Troubleshooting

| Problem | Solution |
|---------|----------|
| 404 "catalog not found" | Run `wrangler r2 bucket catalog enable <bucket-name>` |
| 401 "unauthorized" | Check token has both Catalog + Storage permissions |
| 403 on data files | Token needs both permission groups |

See [gotchas.md](gotchas.md) for detailed troubleshooting.
diff --git a/cloudflare/references/r2-data-catalog/gotchas.md b/cloudflare/references/r2-data-catalog/gotchas.md
new file mode 100644
index 0000000..6bfad9e
--- /dev/null
+++ b/cloudflare/references/r2-data-catalog/gotchas.md
@@ -0,0 +1,170 @@

# Gotchas & Troubleshooting

Common problems → causes → solutions.

## Permission Errors

### 401 Unauthorized

**Error:** `"401 Unauthorized"`
**Cause:** Token missing R2 Data Catalog permissions.
**Solution:** Use an "Admin Read & Write" token (includes catalog + storage permissions). Test with `catalog.list_namespaces()`.

### 403 Forbidden

**Error:** `"403 Forbidden"` on data files
**Cause:** Token lacks storage permissions.
**Solution:** Token needs both R2 Data Catalog + R2 Storage Bucket Item permissions.

### Token Rotation Issues

**Error:** New token fails after rotation.
**Solution:** Create new token → test in staging → update prod → monitor 24h → revoke old.

## Catalog URI Issues

### 404 Not Found

**Error:** `"404 Catalog not found"`
**Cause:** Catalog not enabled or wrong URI.
**Solution:** Run `wrangler r2 bucket catalog enable <bucket-name>`. URI must be HTTPS with `/iceberg/` and a case-sensitive bucket name.
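The URI rules above (HTTPS, `/iceberg/` path, exact bucket name) can be checked before opening a connection. A minimal sketch — `check_catalog_uri` is a hypothetical helper, not part of PyIceberg or Wrangler:

```python
from urllib.parse import urlparse

def check_catalog_uri(uri, bucket):
    """Return a list of problems with an R2 Data Catalog URI (empty = looks OK)."""
    problems = []
    parsed = urlparse(uri)
    if parsed.scheme != "https":
        problems.append("URI must use HTTPS")
    if "/iceberg/" not in parsed.path + "/":
        problems.append("URI path must include /iceberg/")
    # Bucket names are case-sensitive: an exact suffix match is required
    if not parsed.path.rstrip("/").endswith("/" + bucket):
        problems.append(f"URI should end with the exact bucket name {bucket!r}")
    return problems

print(check_catalog_uri(
    "https://abc123.r2.cloudflarestorage.com/iceberg/my-bucket", "my-bucket"))  # → []
print(check_catalog_uri(
    "http://abc123.r2.cloudflarestorage.com/iceberg/My-Bucket", "my-bucket"))
```

Running this before `RestCatalog(...)` turns an opaque 404 into an actionable message.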
+ +### Wrong Warehouse + +**Error:** Cannot create/load tables. +**Cause:** Warehouse ≠ bucket name. +**Solution:** Set `warehouse="bucket-name"` to match bucket exactly. + +## Table and Schema Issues + +### Table/Namespace Already Exists + +**Error:** `"TableAlreadyExistsError"` +**Solution:** Use try/except to load existing or check first. + +### Namespace Not Found + +**Error:** Cannot create table. +**Solution:** Create namespace first: `catalog.create_namespace("ns")` + +### Schema Evolution Errors + +**Error:** `"422 Validation"` on schema update. +**Cause:** Incompatible change (required field, type shrink). +**Solution:** Only add nullable columns, compatible type widening (int→long, float→double). + +## Data and Query Issues + +### Empty Scan Results + +**Error:** Scan returns no data. +**Cause:** Incorrect filter or partition column. +**Solution:** Test without filter first: `table.scan().to_pandas()`. Verify partition column names. + +### Slow Queries + +**Error:** Performance degrades over time. +**Cause:** Too many small files. +**Solution:** Check file count, compact if >1000 or avg <10MB. See [api.md](api.md#compaction). + +### Type Mismatch + +**Error:** `"Cannot cast"` on append. +**Cause:** PyArrow types don't match Iceberg schema. +**Solution:** Cast to int64 (Iceberg default), not int32. Check `table.schema()`. + +## Compaction Issues + +### Compaction Issues + +**Problem:** File count unchanged or compaction takes hours. +**Cause:** Target size too large, or table too big for PyIceberg. +**Solution:** Only compact if avg <50MB. For >1TB tables, use Spark. Run during low-traffic periods. + +## Maintenance Issues + +### Snapshot/Orphan Issues + +**Problem:** Expiration fails or orphan cleanup deletes active data. +**Cause:** Too aggressive retention or wrong order. +**Solution:** Always expire snapshots first with `retain_last=10`, then cleanup orphans with 3+ day threshold. 
+ +## Concurrency Issues + +### Concurrent Write Conflicts + +**Problem:** `CommitFailedException` with multiple writers. +**Cause:** Optimistic locking - simultaneous commits. +**Solution:** Add retry with exponential backoff (see [patterns.md](patterns.md#pattern-6-concurrent-writes-with-retry)). + +### Stale Metadata + +**Problem:** Old schema/data after external update. +**Cause:** Cached metadata. +**Solution:** Reload table: `table = catalog.load_table(("ns", "table"))` + +## Performance Optimization + +### Performance Tips + +**Scans:** Use `row_filter` and `selected_fields` to reduce data scanned. +**Partitions:** 100-1000 optimal. Avoid high cardinality (millions) or low (<10). +**Files:** Keep 100-500MB avg. Compact if <10MB or >10k files. + +## Limits + +| Resource | Recommended | Impact if Exceeded | +|----------|-------------|-------------------| +| Tables/namespace | <10k | Slow list ops | +| Files/table | <100k | Slow query planning | +| Partitions/table | 100-1k | Metadata overhead | +| Snapshots/table | Expire >7d | Metadata bloat | + +## Common Error Messages Reference + +| Error Message | Likely Cause | Fix | +|---------------|--------------|-----| +| `401 Unauthorized` | Missing/invalid token | Check token has catalog+storage permissions | +| `403 Forbidden` | Token lacks storage permissions | Add R2 Storage Bucket Item permission | +| `404 Not Found` | Catalog not enabled or wrong URI | Run `wrangler r2 bucket catalog enable` | +| `409 Conflict` | Table/namespace already exists | Use try/except or load existing | +| `422 Unprocessable Entity` | Schema validation failed | Check type compatibility, required fields | +| `CommitFailedException` | Concurrent write conflict | Add retry logic with backoff | +| `NamespaceAlreadyExistsError` | Namespace exists | Use try/except or load existing | +| `NoSuchTableError` | Table doesn't exist | Check namespace+table name, create first | +| `TypeError: Cannot cast` | PyArrow type mismatch | Cast data to 
match Iceberg schema | + +## Debugging Checklist + +When things go wrong, check in order: + +1. ✅ **Catalog enabled:** `npx wrangler r2 bucket catalog status <bucket-name>` +2. ✅ **Token permissions:** Both R2 Data Catalog + R2 Storage in dashboard +3. ✅ **Connection test:** `catalog.list_namespaces()` succeeds +4. ✅ **URI format:** HTTPS, includes `/iceberg/`, correct bucket name +5. ✅ **Warehouse name:** Matches bucket name exactly +6. ✅ **Namespace exists:** Create before `create_table()` +7. ✅ **Enable debug logging:** `logging.basicConfig(level=logging.DEBUG)` +8. ✅ **PyIceberg version:** `pip install --upgrade pyiceberg` (≥0.5.0) +9. ✅ **File health:** Compact if >1000 files or avg <10MB +10. ✅ **Snapshot count:** Expire if >100 snapshots + +## Enable Debug Logging + +```python +import logging +logging.basicConfig(level=logging.DEBUG) +# Now operations show HTTP requests/responses +``` + +## Resources + +- [Cloudflare Community](https://community.cloudflare.com/c/developers/workers/40) +- [Cloudflare Discord](https://discord.cloudflare.com) - #r2 channel +- [PyIceberg GitHub](https://github.com/apache/iceberg-python/issues) +- [Apache Iceberg Slack](https://iceberg.apache.org/community/) + +## Next Steps + +- [patterns.md](patterns.md) - Working examples +- [api.md](api.md) - API reference diff --git a/cloudflare/references/r2-data-catalog/patterns.md b/cloudflare/references/r2-data-catalog/patterns.md new file mode 100644 index 0000000..b6b181f --- /dev/null +++ b/cloudflare/references/r2-data-catalog/patterns.md @@ -0,0 +1,191 @@ +# Common Patterns + +Practical patterns for R2 Data Catalog with PyIceberg.
+ +## PyIceberg Connection + +```python +import os +from pyiceberg.catalog.rest import RestCatalog +from pyiceberg.exceptions import NamespaceAlreadyExistsError + +catalog = RestCatalog( + name="r2_catalog", + warehouse=os.getenv("R2_WAREHOUSE"), # bucket name + uri=os.getenv("R2_CATALOG_URI"), # catalog endpoint + token=os.getenv("R2_TOKEN"), # API token +) + +# Create namespace (idempotent) +try: + catalog.create_namespace("default") +except NamespaceAlreadyExistsError: + pass +``` + +## Pattern 1: Log Analytics Pipeline + +Ingest logs incrementally, query by time/level. + +```python +import pyarrow as pa +from datetime import datetime +from pyiceberg.schema import Schema +from pyiceberg.types import NestedField, TimestampType, StringType, IntegerType +from pyiceberg.partitioning import PartitionSpec, PartitionField +from pyiceberg.transforms import DayTransform + +# Create partitioned table (once) +schema = Schema( + NestedField(1, "timestamp", TimestampType(), required=True), + NestedField(2, "level", StringType(), required=True), + NestedField(3, "service", StringType(), required=True), + NestedField(4, "message", StringType(), required=False), +) + +partition_spec = PartitionSpec( + PartitionField(source_id=1, field_id=1000, transform=DayTransform(), name="day") +) + +catalog.create_namespace("logs") +table = catalog.create_table(("logs", "app_logs"), schema=schema, partition_spec=partition_spec) + +# Append logs (incremental) +data = pa.table({ + "timestamp": [datetime(2026, 1, 27, 10, 30, 0)], + "level": ["ERROR"], + "service": ["auth-service"], + "message": ["Failed login"], +}) +table.append(data) + +# Query by time + level (leverages partitioning) +scan = table.scan(row_filter="level = 'ERROR' AND day = '2026-01-27'") +errors = scan.to_pandas() +``` + +## Pattern 2: Time-Travel Queries + +```python +from datetime import datetime, timedelta + +table = catalog.load_table(("logs", "app_logs")) + +# Query specific snapshot +snapshot_id = 
table.current_snapshot().snapshot_id +data = table.scan(snapshot_id=snapshot_id).to_pandas() + +# Query as of timestamp (yesterday) +yesterday_ms = int((datetime.now() - timedelta(days=1)).timestamp() * 1000) +data = table.scan(as_of_timestamp=yesterday_ms).to_pandas() +``` + +## Pattern 3: Schema Evolution + +```python +from pyiceberg.types import StringType + +table = catalog.load_table(("users", "profiles")) + +with table.update_schema() as update: + update.add_column("email", StringType(), required=False) + update.rename_column("name", "full_name") +# Old readers ignore new columns, new readers see nulls for old data +``` + +## Pattern 4: Partitioned Tables + +```python +from pyiceberg.partitioning import PartitionSpec, PartitionField +from pyiceberg.transforms import DayTransform, IdentityTransform + +# Partition by day + country +partition_spec = PartitionSpec( + PartitionField(source_id=1, field_id=1000, transform=DayTransform(), name="day"), + PartitionField(source_id=2, field_id=1001, transform=IdentityTransform(), name="country"), +) +table = catalog.create_table(("events", "user_events"), schema=schema, partition_spec=partition_spec) + +# Queries prune partitions automatically +scan = table.scan(row_filter="country = 'US' AND day = '2026-01-27'") +``` + +## Pattern 5: Table Maintenance + +```python +from datetime import datetime, timedelta + +table = catalog.load_table(("logs", "app_logs")) + +# Compact → expire → cleanup (in order) +table.rewrite_data_files(target_file_size_bytes=128 * 1024 * 1024) +seven_days_ms = int((datetime.now() - timedelta(days=7)).timestamp() * 1000) +table.expire_snapshots(older_than=seven_days_ms, retain_last=10) +three_days_ms = int((datetime.now() - timedelta(days=3)).timestamp() * 1000) +table.delete_orphan_files(older_than=three_days_ms) +``` + +See [api.md](api.md#table-maintenance) for detailed parameters. 
+ +## Pattern 6: Concurrent Writes with Retry + +```python +from pyiceberg.exceptions import CommitFailedException +import time + +def append_with_retry(table, data, max_retries=3): + for attempt in range(max_retries): + try: + table.append(data) + return + except CommitFailedException: + if attempt == max_retries - 1: + raise + time.sleep(2 ** attempt) +``` + +## Pattern 7: Upsert Simulation + +```python +import pandas as pd +import pyarrow as pa + +# Read → merge → overwrite (not atomic, use Spark MERGE INTO for production) +existing = table.scan().to_pandas() +new_data = pd.DataFrame({"id": [1, 3], "value": [100, 300]}) +merged = pd.concat([existing, new_data]).drop_duplicates(subset=["id"], keep="last") +table.overwrite(pa.Table.from_pandas(merged)) +``` + +## Pattern 8: DuckDB Integration + +```python +import duckdb + +arrow_table = table.scan().to_arrow() +con = duckdb.connect() +con.register("logs", arrow_table) +result = con.execute("SELECT level, COUNT(*) FROM logs GROUP BY level").fetchdf() +``` + +## Pattern 9: Monitor Table Health + +```python +files = list(table.scan().plan_files())  # plan_files() yields FileScanTask objects +avg_mb = sum(f.file.file_size_in_bytes for f in files) / max(len(files), 1) / (1024**2) +print(f"Files: {len(files)}, Avg: {avg_mb:.1f}MB, Snapshots: {len(table.snapshots())}") + +if avg_mb < 10 or len(files) > 1000: + print("⚠️ Needs compaction") +``` + +## Best Practices + +| Area | Guideline | +|------|-----------| +| **Partitioning** | Use day/hour for time-series; 100-1000 partitions; avoid high cardinality | +| **File sizes** | Target 128-512MB; compact when avg <10MB or >10k files | +| **Schema** | Add columns as nullable (`required=False`); batch changes | +| **Maintenance** | Compact high-write daily/weekly; expire snapshots 7-30d; clean up orphans after | +| **Concurrency** | Reads automatic; writes to different partitions safe; retry same partition | +| **Performance** | Filter on partitions; select only needed columns; batch appends 100MB+ | diff --git
a/cloudflare/references/r2-sql/README.md b/cloudflare/references/r2-sql/README.md new file mode 100644 index 0000000..c59a161 --- /dev/null +++ b/cloudflare/references/r2-sql/README.md @@ -0,0 +1,128 @@ +# Cloudflare R2 SQL Skill Reference + +Expert guidance for Cloudflare R2 SQL - serverless distributed query engine for Apache Iceberg tables. + +## Reading Order + +**New to R2 SQL?** Start here: +1. Read "What is R2 SQL?" and "When to Use" below +2. [configuration.md](configuration.md) - Enable catalog, create tokens +3. [patterns.md](patterns.md) - Wrangler CLI and integration examples +4. [api.md](api.md) - SQL syntax and query reference +5. [gotchas.md](gotchas.md) - Limitations and troubleshooting + +**Quick reference?** Jump to: +- [Run a query via Wrangler](patterns.md#wrangler-cli-query) +- [SQL syntax reference](api.md#sql-syntax) +- [ORDER BY limitations](gotchas.md#order-by-limitations) + +## What is R2 SQL? + +R2 SQL is Cloudflare's **serverless distributed analytics query engine** for querying Apache Iceberg tables in R2 Data Catalog. Features: + +- **Serverless** - No clusters to manage, no infrastructure +- **Distributed** - Leverages Cloudflare's global network for parallel execution +- **SQL interface** - Familiar SQL syntax for analytics queries +- **Zero egress fees** - Query from any cloud/region without data transfer costs +- **Open beta** - Free during beta (standard R2 storage costs apply) + +### What is Apache Iceberg? 
+ +Open table format for large-scale analytics datasets in object storage: +- **ACID transactions** - Safe concurrent reads/writes +- **Metadata optimization** - Fast queries without full table scans +- **Schema evolution** - Add/rename/drop columns without rewrites +- **Partitioning** - Organize data for efficient pruning + +## When to Use + +**Use R2 SQL for:** +- **Log analytics** - Query application/system logs with WHERE filters and aggregations +- **BI dashboards** - Generate reports from large analytical datasets +- **Fraud detection** - Analyze transaction patterns with GROUP BY/HAVING +- **Multi-cloud analytics** - Query data from any cloud without egress fees +- **Ad-hoc exploration** - Run SQL queries on Iceberg tables via Wrangler CLI + +**Don't use R2 SQL for:** +- **Workers/Pages runtime** - R2 SQL has no Workers binding, use HTTP API from external systems +- **Real-time queries (<100ms)** - Optimized for analytical batch queries, not OLTP +- **Complex joins/CTEs** - Limited SQL feature set (no JOINs, subqueries, CTEs currently) +- **Small datasets (<1GB)** - Setup overhead not justified + +## Decision Tree: Need to Query R2 Data? + +``` +Do you need to query structured data in R2? +├─ YES, data is in Iceberg tables +│ ├─ Need SQL interface? → Use R2 SQL (this reference) +│ ├─ Need Python API? → See r2-data-catalog reference (PyIceberg) +│ └─ Need other engine? → See r2-data-catalog reference (Spark, Trino, etc.) +│ +├─ YES, but not in Iceberg format +│ ├─ Streaming data? → Use Pipelines to write to Data Catalog, then R2 SQL +│ └─ Static files? 
→ Use PyIceberg to create Iceberg tables, then R2 SQL +│ +└─ NO, just need object storage → Use R2 reference (not R2 SQL) +``` + +## Architecture Overview + +**Query Planner:** +- Top-down metadata investigation with multi-layer pruning +- Partition-level, column-level, and row-group pruning +- Streaming pipeline - execution starts before planning completes +- Early termination with LIMIT - stops when result complete + +**Query Execution:** +- Coordinator distributes work to workers across Cloudflare network +- Workers run Apache DataFusion for parallel query execution +- Parquet column pruning - reads only required columns +- Ranged reads from R2 for efficiency + +**Aggregation Strategies:** +- Scatter-gather - simple aggregations (SUM, COUNT, AVG) +- Shuffling - ORDER BY/HAVING on aggregates via hash partitioning + +## Quick Start + +```bash +# 1. Enable R2 Data Catalog on bucket +npx wrangler r2 bucket catalog enable my-bucket + +# 2. Create API token (Admin Read & Write) +# Dashboard: R2 → Manage API tokens → Create API token + +# 3. Set environment variable +export WRANGLER_R2_SQL_AUTH_TOKEN=<token> + +# 4. Run query +npx wrangler r2 sql query "my-bucket" "SELECT * FROM default.my_table LIMIT 10" +``` + +## Important Limitations + +**CRITICAL: No Workers Binding** +- R2 SQL cannot be called directly from Workers/Pages code +- For programmatic access, use HTTP API from external systems +- Or query via PyIceberg, Spark, etc.
(see r2-data-catalog reference) + +**SQL Feature Set:** +- No JOINs, CTEs, subqueries, window functions +- ORDER BY supports aggregation columns (not just partition keys) +- LIMIT max 10,000 (default 500) +- See [gotchas.md](gotchas.md) for complete limitations + +## In This Reference + +- **[configuration.md](configuration.md)** - Enable catalog, create API tokens +- **[api.md](api.md)** - SQL syntax, functions, operators, data types +- **[patterns.md](patterns.md)** - Wrangler CLI, HTTP API, Pipelines, PyIceberg +- **[gotchas.md](gotchas.md)** - Limitations, troubleshooting, performance tips + +## See Also + +- [r2-data-catalog](../r2-data-catalog/) - PyIceberg, REST API, external engines +- [pipelines](../pipelines/) - Streaming ingestion to Iceberg tables +- [r2](../r2/) - R2 object storage fundamentals +- [Cloudflare R2 SQL Docs](https://developers.cloudflare.com/r2-sql/) +- [R2 SQL Deep Dive Blog](https://blog.cloudflare.com/r2-sql-deep-dive/) diff --git a/cloudflare/references/r2-sql/SKILL.md.backup b/cloudflare/references/r2-sql/SKILL.md.backup new file mode 100644 index 0000000..c1a3584 --- /dev/null +++ b/cloudflare/references/r2-sql/SKILL.md.backup @@ -0,0 +1,512 @@ +# Cloudflare R2 SQL Skill + +Guide for using Cloudflare R2 SQL - serverless distributed query engine for Apache Iceberg tables in R2 Data Catalog. + +## Overview + +R2 SQL is Cloudflare's serverless distributed analytics query engine for querying Apache Iceberg tables in R2 Data Catalog. 
Features: +- Serverless - no clusters to manage +- Distributed - leverages Cloudflare's global network +- Zero egress fees - query from any cloud/region +- Open beta - free during beta (standard R2 storage costs apply) + +## Core Concepts + +### Apache Iceberg Table Format +- Open table format for large-scale analytics datasets +- ACID transactions for reliable concurrent reads/writes +- Schema evolution - add/rename/drop columns without rewriting data +- Optimized metadata - avoids full table scans via indexed metadata +- Supported by Spark, Trino, Snowflake, DuckDB, ClickHouse, PyIceberg + +### R2 Data Catalog +- Managed Apache Iceberg catalog built into R2 bucket +- Exposes standard Iceberg REST catalog interface +- Single source of truth for table metadata +- Tracks table state via immutable snapshots +- Supports multiple query engines safely accessing same tables + +### Architecture +**Query Planner**: +- Top-down metadata investigation +- Multi-layer pruning (partition-level, column-level, row-group level) +- Streaming pipeline - execution starts before planning completes +- Early termination - stops when result complete without full scan +- Uses partition stats and column stats (min/max, null counts) + +**Query Execution**: +- Coordinator distributes work to workers across Cloudflare network +- Workers run Apache DataFusion for parallel query execution +- Arrow IPC format for inter-process communication +- Parquet column pruning - reads only required columns +- Ranged reads from R2 for efficiency + +**Aggregation Strategies**: +- Scatter-gather - for simple aggregations (sum, count, avg) +- Shuffling - for ORDER BY/HAVING on aggregates via hash partitioning + +## Setup & Configuration + +### 1. Enable R2 Data Catalog + +CLI: +```bash +npx wrangler r2 bucket catalog enable <bucket-name> +``` + +Note the Warehouse name and Catalog URI from output. + +Dashboard: +1. R2 Object Storage → Select bucket +2. Settings tab → R2 Data Catalog → Enable +3.
Note Catalog URI and Warehouse name + +### 2. Create API Token + +Required permissions: R2 Admin Read & Write (includes R2 SQL Read) + +Dashboard: +1. R2 Object Storage → Manage API tokens +2. Create API token → Admin Read & Write +3. Save token value + +### 3. Configure Environment + +```bash +export WRANGLER_R2_SQL_AUTH_TOKEN=<token> +``` + +Or `.env` file: +``` +WRANGLER_R2_SQL_AUTH_TOKEN=<token> +``` + +## Common Code Patterns + +### Wrangler CLI Query + +```bash +npx wrangler r2 sql query "<warehouse>" " + SELECT * + FROM namespace.table_name + WHERE condition + LIMIT 10" +``` + +### PyIceberg Setup + +```python +from pyiceberg.catalog.rest import RestCatalog + +catalog = RestCatalog( + name="my_catalog", + warehouse="<warehouse_name>", + uri="<catalog_uri>", + token="<token>", +) + +# Create namespace +catalog.create_namespace_if_not_exists("default") +``` + +### Create Table + +```python +import pyarrow as pa + +# Define schema +df = pa.table({ + "id": [1, 2, 3], + "name": ["Alice", "Bob", "Charlie"], + "score": [80.0, 92.5, 88.0], +}) + +# Create table +table = catalog.create_table( + ("default", "people"), + schema=df.schema, +) +``` + +### Append Data + +```python +table.append(df) +``` + +### Query Table + +```python +# Scan and convert to Pandas +scanned = table.scan().to_arrow() +print(scanned.to_pandas()) +``` + +## SQL Reference + +### Query Structure + +```sql +SELECT column_list | aggregation_function +FROM table_name +WHERE conditions +[GROUP BY column_list] +[HAVING conditions] +[ORDER BY partition_key [DESC | ASC]] +[LIMIT number] +``` + +### Schema Discovery + +```sql +-- List namespaces +SHOW DATABASES; +SHOW NAMESPACES; + +-- List tables +SHOW TABLES IN namespace_name; + +-- Describe table +DESCRIBE namespace_name.table_name; +``` + +### SELECT Patterns + +```sql +-- All columns +SELECT * FROM ns.table; + +-- Specific columns +SELECT user_id, timestamp, status FROM ns.table; + +-- With conditions +SELECT * FROM ns.table +WHERE timestamp BETWEEN '2025-01-01T00:00:00Z' AND '2025-01-31T23:59:59Z' + AND
status = 200 +LIMIT 100; + +-- Complex conditions +SELECT * FROM ns.table +WHERE (status = 404 OR status = 500) + AND method = 'POST' + AND user_agent IS NOT NULL +ORDER BY timestamp DESC; +``` + +### Aggregations + +Supported functions: COUNT(*), SUM(col), AVG(col), MIN(col), MAX(col) + +```sql +-- Count by group +SELECT department, COUNT(*) +FROM ns.sales_data +GROUP BY department; + +-- Multiple aggregates +SELECT region, MIN(price), MAX(price), AVG(price) +FROM ns.products +GROUP BY region +ORDER BY AVG(price) DESC; + +-- With HAVING filter +SELECT category, SUM(amount) +FROM ns.sales +WHERE sale_date >= '2024-01-01' +GROUP BY category +HAVING SUM(amount) > 10000 +LIMIT 10; +``` + +### Data Types + +| Type | Description | Example | +|------|-------------|---------| +| integer | Whole numbers | 1, 42, -10 | +| float | Decimals | 1.5, 3.14 | +| string | Text (quoted) | 'hello', 'GET' | +| boolean | true/false | true, false | +| timestamp | RFC3339 | '2025-01-01T00:00:00Z' | +| date | YYYY-MM-DD | '2025-01-01' | + +### Operators + +Comparison: =, !=, <, <=, >, >=, LIKE, BETWEEN, IS NULL, IS NOT NULL +Logical: AND (higher precedence), OR (lower precedence) + +### ORDER BY Limitations + +**CRITICAL**: ORDER BY only supports partition key columns + +```sql +-- Valid if timestamp is partition key +SELECT * FROM ns.logs ORDER BY timestamp DESC LIMIT 100; + +-- Invalid if column not in partition key +SELECT * FROM ns.logs ORDER BY user_id; -- ERROR +``` + +### LIMIT Defaults + +- Range: 1 to 10,000 +- Default: 500 if not specified + +## Pipelines Integration + +### Create Pipeline with Data Catalog Sink + +Schema file (`schema.json`): +```json +{ + "fields": [ + {"name": "user_id", "type": "string", "required": true}, + {"name": "event_type", "type": "string", "required": true}, + {"name": "amount", "type": "float64", "required": false} + ] +} +``` + +Setup: +```bash +npx wrangler pipelines setup +``` + +Configuration: +- Pipeline name: ecommerce +- Enable HTTP 
endpoint: yes +- Schema: Load from file → schema.json +- Destination: Data Catalog Table +- R2 bucket: your-bucket +- Namespace: default +- Table name: events +- Catalog token: <token> +- Compression: zstd +- Roll file time: 10 seconds (dev), 300+ (prod) + +### Send Data to Pipeline + +```bash +curl -X POST https://{stream-id}.ingest.cloudflare.com \ + -H "Content-Type: application/json" \ + -d '[ + { + "user_id": "user_123", + "event_type": "purchase", + "amount": 29.99 + } + ]' +``` + +## Common Use Cases + +### Log Analytics +- Ingest logs via Pipelines to Iceberg table +- Partition by day(timestamp) for efficient queries +- Query specific time ranges with automatic pruning +- Aggregate by status codes, endpoints, user agents + +```sql +SELECT status, COUNT(*) +FROM logs.http_requests +WHERE timestamp BETWEEN '2025-01-01T00:00:00Z' AND '2025-01-31T23:59:59Z' + AND method = 'GET' +GROUP BY status +ORDER BY COUNT(*) DESC; +``` + +### Fraud Detection +- Stream transaction events to catalog +- Query suspicious patterns with WHERE filters +- Aggregate by location, merchant, time windows + +```sql +SELECT location, COUNT(*), AVG(amount) +FROM fraud.transactions +WHERE is_fraud = true + AND transaction_timestamp >= '2025-01-01' +GROUP BY location +HAVING COUNT(*) > 10; +``` + +### Business Intelligence +- ETL data into partitioned Iceberg tables +- Run analytical queries across large datasets +- Generate reports with GROUP BY aggregations +- No egress fees when querying from BI tools + +```sql +SELECT + department, + SUM(revenue) as total_revenue, + AVG(revenue) as avg_revenue +FROM sales.transactions +WHERE sale_date >= '2024-01-01' +GROUP BY department +ORDER BY SUM(revenue) DESC +LIMIT 10; +``` + +## Performance Optimization + +### Partitioning Strategy +- Choose partition key based on common query patterns +- Typical: day(timestamp), hour(timestamp), region, category +- Enables metadata pruning to skip entire partitions +- Required for ORDER BY optimization + +### Query
Optimization +- Use WHERE filters to leverage partition/column stats +- Specify LIMIT to enable early termination +- ORDER BY partition key columns only +- Filter on high-selectivity columns first + +### Data Organization +- Smaller files → slower queries (overhead) +- Larger files → better compression, fewer metadata ops +- Recommended: 100-500MB Parquet files after compression +- Use appropriate roll intervals in Pipelines (300+ seconds for prod) + +### File Pruning +Automatic at three levels: +1. Partition-level: Skip manifests not matching query +2. File-level: Skip Parquet files via column stats +3. Row-group level: Skip row groups within files + +## Iceberg Metadata Structure + +``` +bucket/ + metadata/ + snap-{id}.avro # Snapshot (points to manifest list) + {uuid}-m0.avro # Manifest file (lists data files + stats) + version-hint.text # Current metadata version + v{n}.metadata.json # Table metadata (schema, snapshots) + data/ + 00000-0-{uuid}.parquet # Data files +``` + +**Metadata hierarchy**: +1. Table metadata JSON - schema, partition spec, snapshot log +2. Snapshot - points to manifest list +3. Manifest list - partition stats for each manifest +4. Manifest files - column stats for each data file +5. 
Parquet files - row group stats in footer + +## Limitations & Best Practices + +### Current Limitations (Open Beta) +- ORDER BY only on partition key columns +- COUNT(*) only - COUNT(column) not supported +- No aliases in SELECT +- No subqueries, joins, or CTEs +- No nested column access +- LIMIT max 10,000 + +### Best Practices +- Partition by time dimension for time-series data +- Use BETWEEN for time ranges (leverages partition pruning) +- Combine filters with AND for better pruning +- Set appropriate LIMIT based on use case +- Use compression (zstd recommended) +- Monitor query performance and adjust partitioning + +### Type Safety +- Quote string values: 'value' +- Use RFC3339 for timestamps: '2025-01-01T00:00:00Z' +- Use YYYY-MM-DD for dates: '2025-01-01' +- No implicit type conversions + +## Connecting Other Engines + +R2 Data Catalog supports standard Iceberg REST catalog API. + +### Spark (Scala) +```scala +val spark = SparkSession.builder() + .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog") + .config("spark.sql.catalog.my_catalog.catalog-impl", "org.apache.iceberg.rest.RESTCatalog") + .config("spark.sql.catalog.my_catalog.uri", catalogUri) + .config("spark.sql.catalog.my_catalog.token", token) + .config("spark.sql.catalog.my_catalog.warehouse", warehouse) + .getOrCreate() +``` + +### Snowflake +- Create external Iceberg catalog connection +- Configure with Catalog URI and R2 credentials +- Query tables via SQL interface + +### DuckDB, Trino, ClickHouse +- Supported via Iceberg REST catalog protocol +- Refer to engine-specific documentation for configuration + +## Pricing (Future) + +Currently in open beta - no charges beyond standard R2 costs. 
+ +Planned future pricing: +- R2 storage: $0.015/GB-month +- Class A operations: $4.50/million +- Class B operations: $0.36/million +- Catalog operations: $9.00/million (create table, get metadata, etc) +- Compaction: $0.05/GB + $4.00/million objects processed +- Egress: $0 (always free) + +30+ days notice before billing begins. + +## Troubleshooting + +### Common Errors + +**"ORDER BY column not in partition key"** +- Only partition key columns can be used in ORDER BY +- Check table partition spec with DESCRIBE +- Remove ORDER BY or adjust table partitioning + +**"Token authentication failed"** +- Verify WRANGLER_R2_SQL_AUTH_TOKEN is set +- Ensure token has R2 Admin Read & Write + SQL Read permissions +- Token may be expired - create new one + +**"Table not found"** +- Verify namespace exists: SHOW DATABASES +- Check table name: SHOW TABLES IN namespace +- Ensure catalog enabled on bucket + +**"No data returned"** +- Check WHERE conditions match data +- Verify time range in BETWEEN clause +- Try removing filters to confirm data exists + +### Performance Issues + +**Slow queries**: +- Check partition pruning effectiveness +- Reduce LIMIT if scanning too much data +- Ensure filters on partition key columns +- Review Parquet file sizes (aim for 100-500MB) + +**Query timeout**: +- Add more restrictive WHERE filters +- Reduce LIMIT +- Consider better partitioning strategy + +## Resources + +- Docs: https://developers.cloudflare.com/r2-sql/ +- Data Catalog: https://developers.cloudflare.com/r2/data-catalog/ +- Blog: https://blog.cloudflare.com/r2-sql-deep-dive/ +- Discord: https://discord.cloudflare.com/ + +## Key Reminders + +1. R2 SQL queries ONLY Apache Iceberg tables in R2 Data Catalog +2. Enable catalog on bucket before use +3. Create API token with R2 + catalog permissions +4. Partition by time for time-series data +5. ORDER BY limited to partition key columns +6. Use LIMIT and WHERE for optimal performance +7. Zero egress fees - query from anywhere +8. 
Open beta - free during testing phase +9. Serverless - no infrastructure management +10. Leverage Cloudflare's global network for distributed execution diff --git a/cloudflare/references/r2-sql/api.md b/cloudflare/references/r2-sql/api.md new file mode 100644 index 0000000..7e67c8f --- /dev/null +++ b/cloudflare/references/r2-sql/api.md @@ -0,0 +1,158 @@ +# R2 SQL API Reference + +SQL syntax, functions, operators, and data types for R2 SQL queries. + +## SQL Syntax + +```sql +SELECT column_list | aggregation_function +FROM [namespace.]table_name +WHERE conditions +[GROUP BY column_list] +[HAVING conditions] +[ORDER BY column | aggregation_function [DESC | ASC]] +[LIMIT number] +``` + +## Schema Discovery + +```sql +SHOW DATABASES; -- List namespaces +SHOW NAMESPACES; -- Alias for SHOW DATABASES +SHOW SCHEMAS; -- Alias for SHOW DATABASES +SHOW TABLES IN namespace; -- List tables in namespace +DESCRIBE namespace.table; -- Show table schema, partition keys +``` + +## SELECT Clause + +```sql +-- All columns +SELECT * FROM logs.http_requests; + +-- Specific columns +SELECT user_id, timestamp, status FROM logs.http_requests; +``` + +**Limitations:** No column aliases, expressions, or nested column access + +## WHERE Clause + +### Operators + +| Operator | Example | +|----------|---------| +| `=`, `!=`, `<`, `<=`, `>`, `>=` | `status = 200` | +| `LIKE` | `user_agent LIKE '%Chrome%'` | +| `BETWEEN` | `timestamp BETWEEN '2025-01-01T00:00:00Z' AND '2025-01-31T23:59:59Z'` | +| `IS NULL`, `IS NOT NULL` | `email IS NOT NULL` | +| `AND`, `OR` | `status = 200 AND method = 'GET'` | + +Use parentheses for precedence: `(status = 404 OR status = 500) AND method = 'POST'` + +## Aggregation Functions + +| Function | Description | +|----------|-------------| +| `COUNT(*)` | Count all rows | +| `COUNT(column)` | Count non-null values | +| `COUNT(DISTINCT column)` | Count unique values | +| `SUM(column)`, `AVG(column)` | Numeric aggregations | +| `MIN(column)`, `MAX(column)` | Min/max 
values | + +```sql +-- Multiple aggregations with GROUP BY +SELECT region, COUNT(*), SUM(amount), AVG(amount) +FROM sales.transactions +WHERE sale_date >= '2024-01-01' +GROUP BY region; +``` + +## HAVING Clause + +Filter aggregated results (after GROUP BY): + +```sql +SELECT category, SUM(amount) +FROM sales.transactions +GROUP BY category +HAVING SUM(amount) > 10000; +``` + +## ORDER BY Clause + +Sort results by: +- **Partition key columns** - Always supported +- **Aggregation functions** - Supported via shuffle strategy + +```sql +-- Order by partition key +SELECT * FROM logs.requests ORDER BY timestamp DESC LIMIT 100; + +-- Order by aggregation (repeat function, aliases not supported) +SELECT region, SUM(amount) +FROM sales.transactions +GROUP BY region +ORDER BY SUM(amount) DESC; +``` + +**Limitations:** Cannot order by non-partition columns. See [gotchas.md](gotchas.md#order-by-limitations) + +## LIMIT Clause + +```sql +SELECT * FROM logs.requests LIMIT 100; +``` + +| Setting | Value | +|---------|-------| +| Min | 1 | +| Max | 10,000 | +| Default | 500 | + +**Always use LIMIT** to enable early termination optimization. 
+ +## Data Types + +| Type | SQL Literal | Example | +|------|-------------|---------| +| `integer` | Unquoted number | `42`, `-10` | +| `float` | Decimal number | `3.14`, `-0.5` | +| `string` | Single quotes | `'hello'`, `'GET'` | +| `boolean` | Keyword | `true`, `false` | +| `timestamp` | RFC3339 string | `'2025-01-01T00:00:00Z'` | +| `date` | ISO 8601 date | `'2025-01-01'` | + +### Type Safety + +- Quote strings with single quotes: `'value'` +- Timestamps must be RFC3339: `'2025-01-01T00:00:00Z'` (include timezone) +- Dates must be ISO 8601: `'2025-01-01'` (YYYY-MM-DD) +- No implicit conversions + +```sql +-- ✅ Correct +WHERE status = 200 AND method = 'GET' AND timestamp > '2025-01-01T00:00:00Z' + +-- ❌ Wrong +WHERE status = '200' -- string instead of integer +WHERE timestamp > '2025-01-01' -- missing time/timezone +WHERE method = GET -- unquoted string +``` + +## Query Result Format + +JSON array of objects: + +```json +[ + {"user_id": "user_123", "timestamp": "2025-01-15T10:30:00Z", "status": 200}, + {"user_id": "user_456", "timestamp": "2025-01-15T10:31:00Z", "status": 404} +] +``` + +## See Also + +- [patterns.md](patterns.md) - Query examples and use cases +- [gotchas.md](gotchas.md) - SQL limitations and error handling +- [configuration.md](configuration.md) - Setup and authentication diff --git a/cloudflare/references/r2-sql/configuration.md b/cloudflare/references/r2-sql/configuration.md new file mode 100644 index 0000000..3c5cfb2 --- /dev/null +++ b/cloudflare/references/r2-sql/configuration.md @@ -0,0 +1,147 @@ +# R2 SQL Configuration + +Setup and configuration for R2 SQL queries. + +## Prerequisites + +- R2 bucket with Data Catalog enabled +- API token with R2 permissions +- Wrangler CLI installed (for CLI queries) + +## Enable R2 Data Catalog + +R2 SQL queries Apache Iceberg tables in R2 Data Catalog. Must enable catalog on bucket first. 
+
+### Via Wrangler CLI
+
+```bash
+npx wrangler r2 bucket catalog enable <bucket-name>
+```
+
+Output includes:
+- **Warehouse name** - Typically same as bucket name
+- **Catalog URI** - REST endpoint for catalog operations
+
+Example output:
+```
+Catalog enabled successfully
+Warehouse: my-bucket
+Catalog URI: https://abc123.r2.cloudflarestorage.com/iceberg/my-bucket
+```
+
+### Via Dashboard
+
+1. Navigate to **R2 Object Storage** → Select your bucket
+2. Click **Settings** tab
+3. Scroll to **R2 Data Catalog** section
+4. Click **Enable**
+5. Note the **Catalog URI** and **Warehouse** name
+
+**Important:** Enabling the catalog creates metadata directories in the bucket but does not modify existing objects.
+
+## Create API Token
+
+R2 SQL requires an API token with R2 permissions.
+
+### Required Permission
+
+**R2 Admin Read & Write** (includes R2 SQL Read permission)
+
+### Via Dashboard
+
+1. Navigate to **R2 Object Storage**
+2. Click **Manage API tokens** (top right)
+3. Click **Create API token**
+4. Select **Admin Read & Write** permission
+5. Click **Create API Token**
+6. **Copy the token value** - it is shown only once
+
+### Permission Scope
+
+| Permission | Grants Access To |
+|------------|------------------|
+| R2 Admin Read & Write | R2 storage operations + R2 SQL queries + Data Catalog operations |
+| R2 SQL Read | SQL queries only (no storage writes) |
+
+**Note:** R2 SQL Read permission is not yet available via the Dashboard - use Admin Read & Write.
+
+## Configure Environment
+
+### Wrangler CLI
+
+Set the environment variable for Wrangler to use:
+
+```bash
+export WRANGLER_R2_SQL_AUTH_TOKEN=<your-api-token>
+```
+
+Or create a `.env` file in the project directory:
+
+```
+WRANGLER_R2_SQL_AUTH_TOKEN=<your-api-token>
+```
+
+Wrangler automatically loads the `.env` file when running commands.
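External scripts that call the HTTP API directly can mimic the same resolution order: environment variable first, then the `.env` file. A hedged sketch — this helper is ours and purely illustrative; Wrangler's own `.env` parsing is more complete:

```typescript
// Illustrative token resolution: prefer the exported environment variable,
// fall back to a WRANGLER_R2_SQL_AUTH_TOKEN= line in a .env file's contents.
function resolveAuthToken(
  env: Record<string, string | undefined>,
  dotenvText: string
): string | undefined {
  if (env.WRANGLER_R2_SQL_AUTH_TOKEN) return env.WRANGLER_R2_SQL_AUTH_TOKEN;
  const match = dotenvText.match(/^WRANGLER_R2_SQL_AUTH_TOKEN=(.*)$/m);
  return match ? match[1].trim() || undefined : undefined;
}
```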
+
+### HTTP API
+
+For programmatic access (non-Wrangler), pass the token in the Authorization header:
+
+```bash
+curl -X POST https://api.cloudflare.com/client/v4/accounts/{account_id}/r2/sql/query \
+  -H "Authorization: Bearer <your-api-token>" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "warehouse": "my-bucket",
+    "query": "SELECT * FROM default.my_table LIMIT 10"
+  }'
+```
+
+**Note:** The HTTP API endpoint URL may vary - see [patterns.md](patterns.md#http-api-query) for the current endpoint.
+
+## Verify Setup
+
+Test the configuration by querying system tables:
+
+```bash
+# List namespaces
+npx wrangler r2 sql query "my-bucket" "SHOW DATABASES"
+
+# List tables in namespace
+npx wrangler r2 sql query "my-bucket" "SHOW TABLES IN default"
+```
+
+If successful, each command returns a JSON array of results.
+
+## Troubleshooting
+
+### "Token authentication failed"
+
+**Cause:** Invalid or missing token
+
+**Solution:**
+- Verify the `WRANGLER_R2_SQL_AUTH_TOKEN` environment variable is set
+- Check that the token has Admin Read & Write permission
+- Create a new token if expired
+
+### "Catalog not enabled on bucket"
+
+**Cause:** Data Catalog not enabled
+
+**Solution:**
+- Run `npx wrangler r2 bucket catalog enable <bucket-name>`
+- Or enable via Dashboard (R2 → bucket → Settings → R2 Data Catalog)
+
+### "Permission denied"
+
+**Cause:** Token lacks required permissions
+
+**Solution:**
+- Verify the token has **Admin Read & Write** permission
+- Create a new token with the correct permissions
+
+## See Also
+
+- [r2-data-catalog/configuration.md](../r2-data-catalog/configuration.md) - Detailed token setup and PyIceberg connection
+- [patterns.md](patterns.md) - Query examples using configuration
+- [gotchas.md](gotchas.md) - Common configuration errors
diff --git a/cloudflare/references/r2-sql/gotchas.md b/cloudflare/references/r2-sql/gotchas.md
new file mode 100644
index 0000000..d16de94
--- /dev/null
+++ b/cloudflare/references/r2-sql/gotchas.md
@@ -0,0 +1,212 @@
+# R2 SQL Gotchas
+
+Limitations, troubleshooting, and common pitfalls for
R2 SQL. + +## Critical Limitations + +### No Workers Binding + +**Cannot call R2 SQL from Workers/Pages code** - no binding exists. + +```typescript +// ❌ This doesn't exist +export default { + async fetch(request, env) { + const result = await env.R2_SQL.query("SELECT * FROM table"); // Not possible + return Response.json(result); + } +}; +``` + +**Solutions:** +- HTTP API from external systems (not Workers) +- PyIceberg/Spark via r2-data-catalog REST API +- For Workers, use D1 or external databases + +### ORDER BY Limitations + +Can only order by: +1. **Partition key columns** - Always supported +2. **Aggregation functions** - Supported via shuffle strategy + +**Cannot order by** regular non-partition columns. + +```sql +-- ✅ Valid: ORDER BY partition key +SELECT * FROM logs.requests ORDER BY timestamp DESC LIMIT 100; + +-- ✅ Valid: ORDER BY aggregation +SELECT region, SUM(amount) FROM sales.transactions +GROUP BY region ORDER BY SUM(amount) DESC; + +-- ❌ Invalid: ORDER BY non-partition column +SELECT * FROM logs.requests ORDER BY user_id; + +-- ❌ Invalid: ORDER BY alias (must repeat function) +SELECT region, SUM(amount) as total FROM sales.transactions +GROUP BY region ORDER BY total; -- Use ORDER BY SUM(amount) +``` + +Check partition spec: `DESCRIBE namespace.table_name` + +## SQL Feature Limitations + +| Feature | Supported | Notes | +|---------|-----------|-------| +| SELECT, WHERE, GROUP BY, HAVING | ✅ | Standard support | +| COUNT, SUM, AVG, MIN, MAX | ✅ | Standard aggregations | +| ORDER BY partition/aggregation | ✅ | See above | +| LIMIT | ✅ | Max 10,000 | +| Column aliases | ❌ | No AS alias | +| Expressions in SELECT | ❌ | No col1 + col2 | +| ORDER BY non-partition | ❌ | Fails at runtime | +| JOINs, subqueries, CTEs | ❌ | Denormalize at write time | +| Window functions, UNION | ❌ | Use external engines | +| INSERT/UPDATE/DELETE | ❌ | Use PyIceberg/Pipelines | +| Nested columns, arrays, JSON | ❌ | Flatten at write time | + +**Workarounds:** +- No JOINs: 
Denormalize data or use Spark/PyIceberg
+- No subqueries: Split into multiple queries
+- No aliases: Accept generated column names, transform in the app
+
+## Common Errors
+
+### "Column not found"
+**Cause:** Typo, column doesn't exist, or case mismatch
+**Solution:** `DESCRIBE namespace.table_name` to check the schema
+
+### "Type mismatch"
+```sql
+-- ❌ Wrong types
+WHERE status = '200' -- string instead of integer
+WHERE timestamp > '2025-01-01' -- missing time/timezone
+
+-- ✅ Correct types
+WHERE status = 200
+WHERE timestamp > '2025-01-01T00:00:00Z'
+```
+
+### "ORDER BY column not in partition key"
+**Cause:** Ordering by a non-partition column
+**Solution:** Use a partition key, an aggregation, or remove ORDER BY. Check: `DESCRIBE table`
+
+### "Token authentication failed"
+```bash
+# Check/set token
+echo $WRANGLER_R2_SQL_AUTH_TOKEN
+export WRANGLER_R2_SQL_AUTH_TOKEN=<your-api-token>
+
+# Or .env file
+echo "WRANGLER_R2_SQL_AUTH_TOKEN=<your-api-token>" > .env
+```
+
+### "Table not found"
+```sql
+-- Verify catalog and tables
+SHOW DATABASES;
+SHOW TABLES IN namespace_name;
+```
+
+Enable catalog: `npx wrangler r2 bucket catalog enable <bucket-name>`
+
+### "LIMIT exceeds maximum"
+Max LIMIT is 10,000. For pagination, use WHERE filters with partition keys.
+
+### "No data returned" (unexpected)
+**Debug steps:**
+1. `SELECT COUNT(*) FROM table` - verify data exists
+2. Remove WHERE filters incrementally
+3.
`SELECT * FROM table LIMIT 10` - inspect actual data/types + +## Performance Issues + +### Slow Queries + +**Causes:** Too many partitions, large LIMIT, no filters, small files + +```sql +-- ❌ Slow: No filters +SELECT * FROM logs.requests LIMIT 10000; + +-- ✅ Fast: Filter on partition key +SELECT * FROM logs.requests +WHERE timestamp >= '2025-01-15T00:00:00Z' AND timestamp < '2025-01-16T00:00:00Z' +LIMIT 1000; + +-- ✅ Faster: Multiple filters +SELECT * FROM logs.requests +WHERE timestamp >= '2025-01-15T00:00:00Z' AND status = 404 AND method = 'GET' +LIMIT 1000; +``` + +**File optimization:** +- Target Parquet size: 100-500MB compressed +- Pipelines roll interval: 300+ sec (prod), 10 sec (dev) +- Run compaction to merge small files + +### Query Timeout + +**Solution:** Add restrictive WHERE filters, reduce time range, query smaller intervals + +```sql +-- ❌ Times out: Year-long aggregation +SELECT status, COUNT(*) FROM logs.requests +WHERE timestamp >= '2024-01-01T00:00:00Z' GROUP BY status; + +-- ✅ Faster: Month-long aggregation +SELECT status, COUNT(*) FROM logs.requests +WHERE timestamp >= '2025-01-01T00:00:00Z' AND timestamp < '2025-02-01T00:00:00Z' +GROUP BY status; +``` + +## Best Practices + +### Partitioning +- **Time-series:** Partition by day/hour on timestamp +- **Avoid:** High-cardinality keys (user_id), >10,000 partitions + +```python +from pyiceberg.partitioning import PartitionSpec, PartitionField +from pyiceberg.transforms import DayTransform + +PartitionSpec(PartitionField(source_id=1, field_id=1000, transform=DayTransform(), name="day")) +``` + +### Query Writing +- **Always use LIMIT** for early termination +- **Filter on partition keys first** for pruning +- **Combine filters with AND** for more pruning + +```sql +-- Good +WHERE timestamp >= '2025-01-15T00:00:00Z' AND status = 404 AND method = 'GET' LIMIT 100 +``` + +### Type Safety +- Quote strings: `'GET'` not `GET` +- RFC3339 timestamps: `'2025-01-01T00:00:00Z'` not `'2025-01-01'` +- ISO 
dates: `'2025-01-15'` not `'01/15/2025'` + +### Data Organization +- **Pipelines:** Dev `roll_file_time: 10`, Prod `roll_file_time: 300+` +- **Compression:** Use `zstd` +- **Maintenance:** Compaction for small files, expire old snapshots + +## Debugging Checklist + +1. `npx wrangler r2 bucket catalog enable ` - Verify catalog +2. `echo $WRANGLER_R2_SQL_AUTH_TOKEN` - Check token +3. `SHOW DATABASES` - List namespaces +4. `SHOW TABLES IN namespace` - List tables +5. `DESCRIBE namespace.table` - Check schema +6. `SELECT COUNT(*) FROM namespace.table` - Verify data +7. `SELECT * FROM namespace.table LIMIT 10` - Test simple query +8. Add filters incrementally + +## See Also + +- [api.md](api.md) - SQL syntax +- [patterns.md](patterns.md) - Query optimization +- [configuration.md](configuration.md) - Setup +- [Cloudflare R2 SQL Docs](https://developers.cloudflare.com/r2-sql/) diff --git a/cloudflare/references/r2-sql/patterns.md b/cloudflare/references/r2-sql/patterns.md new file mode 100644 index 0000000..53de7d3 --- /dev/null +++ b/cloudflare/references/r2-sql/patterns.md @@ -0,0 +1,222 @@ +# R2 SQL Patterns + +Common patterns, use cases, and integration examples for R2 SQL. + +## Wrangler CLI Query + +```bash +# Basic query +npx wrangler r2 sql query "my-bucket" "SELECT * FROM default.logs LIMIT 10" + +# Multi-line query +npx wrangler r2 sql query "my-bucket" " + SELECT status, COUNT(*), AVG(response_time) + FROM logs.http_requests + WHERE timestamp >= '2025-01-01T00:00:00Z' + GROUP BY status + ORDER BY COUNT(*) DESC + LIMIT 100 +" + +# Use environment variable +export R2_SQL_WAREHOUSE="my-bucket" +npx wrangler r2 sql query "$R2_SQL_WAREHOUSE" "SELECT * FROM default.logs" +``` + +## HTTP API Query + +For programmatic access from external systems (not Workers - see gotchas.md). 
+
+```bash
+curl -X POST https://api.cloudflare.com/client/v4/accounts/{account_id}/r2/sql/query \
+  -H "Authorization: Bearer <your-api-token>" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "warehouse": "my-bucket",
+    "query": "SELECT * FROM default.my_table WHERE status = 200 LIMIT 100"
+  }'
+```
+
+Response:
+```json
+{
+  "success": true,
+  "result": [{"user_id": "user_123", "timestamp": "2025-01-15T10:30:00Z", "status": 200}],
+  "errors": []
+}
+```
+
+## Pipelines Integration
+
+Stream data to Iceberg tables via Pipelines, then query it with R2 SQL.
+
+```bash
+# Setup pipeline (select Data Catalog Table destination)
+npx wrangler pipelines setup
+
+# Key settings:
+# - Destination: Data Catalog Table
+# - Compression: zstd (recommended)
+# - Roll file time: 300+ sec (production), 10 sec (dev)
+
+# Send data to pipeline
+curl -X POST https://{stream-id}.ingest.cloudflare.com \
+  -H "Content-Type: application/json" \
+  -d '[{"user_id": "user_123", "event_type": "purchase", "timestamp": "2025-01-15T10:30:00Z", "amount": 29.99}]'
+
+# Query ingested data (wait for roll interval)
+npx wrangler r2 sql query "my-bucket" "
+  SELECT event_type, COUNT(*), SUM(amount)
+  FROM default.events
+  WHERE timestamp >= '2025-01-15T00:00:00Z'
+  GROUP BY event_type
+"
+```
+
+See [pipelines/patterns.md](../pipelines/patterns.md) for detailed setup.
+
+## PyIceberg Integration
+
+Create and populate Iceberg tables with PyIceberg, then query with R2 SQL.
+
+```python
+from pyiceberg.catalog.rest import RestCatalog
+import pyarrow as pa
+import pandas as pd
+
+# Setup catalog
+catalog = RestCatalog(
+    name="my_catalog",
+    warehouse="my-bucket",
+    uri="https://<account-id>.r2.cloudflarestorage.com/iceberg/my-bucket",
+    token="<your-api-token>",
+)
+catalog.create_namespace_if_not_exists("analytics")
+
+# Create table
+schema = pa.schema([
+    pa.field("user_id", pa.string(), nullable=False),
+    pa.field("event_time", pa.timestamp("us", tz="UTC"), nullable=False),
+    pa.field("page_views", pa.int64(), nullable=False),
+])
+table = catalog.create_table(("analytics", "user_metrics"), schema=schema)
+
+# Append data
+df = pd.DataFrame({
+    "user_id": ["user_1", "user_2"],
+    "event_time": pd.to_datetime(["2025-01-15 10:00:00", "2025-01-15 11:00:00"], utc=True),
+    "page_views": [10, 25],
+})
+table.append(pa.Table.from_pandas(df, schema=schema))
+```
+
+Query with R2 SQL:
+```bash
+npx wrangler r2 sql query "my-bucket" "
+  SELECT user_id, SUM(page_views)
+  FROM analytics.user_metrics
+  WHERE event_time >= '2025-01-15T00:00:00Z'
+  GROUP BY user_id
+"
+```
+
+See [r2-data-catalog/patterns.md](../r2-data-catalog/patterns.md) for advanced PyIceberg patterns.
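## Pagination Without OFFSET

R2 SQL has no OFFSET and LIMIT is capped at 10,000 (see [gotchas.md](gotchas.md)), so paging through a large result set means advancing a WHERE filter on the partition key. A TypeScript sketch — `runQuery` is a placeholder for however you execute queries (a Wrangler subprocess or the HTTP API), and it assumes `timestamp` is the table's partition key:

```typescript
// Keyset pagination sketch: page through time-ordered rows by advancing a
// timestamp lower bound instead of using an OFFSET (which R2 SQL lacks).
type Row = { timestamp: string };

function pageQuery(table: string, after: string, pageSize: number): string {
  return (
    `SELECT * FROM ${table} ` +
    `WHERE timestamp > '${after}' ` +
    `ORDER BY timestamp ASC LIMIT ${pageSize}`
  );
}

async function fetchAll(
  runQuery: (sql: string) => Promise<Row[]>, // e.g. wraps the HTTP API
  table: string,
  start: string,
  pageSize = 1000
): Promise<Row[]> {
  const all: Row[] = [];
  let after = start;
  for (;;) {
    const rows = await runQuery(pageQuery(table, after, pageSize));
    all.push(...rows);
    if (rows.length < pageSize) break; // short page means we reached the end
    after = rows[rows.length - 1].timestamp; // advance past the last row seen
  }
  return all;
}
```

Note the caveat: the strict `>` can skip rows that share the exact boundary timestamp, so use a finer-grained timestamp or accept approximate paging if duplicates are possible.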
+
+## Use Cases
+
+### Log Analytics
+```sql
+-- Error count by endpoint (no expressions or aliases in SELECT; see gotchas.md)
+SELECT path, COUNT(*)
+FROM logs.http_requests
+WHERE timestamp BETWEEN '2025-01-01T00:00:00Z' AND '2025-01-31T23:59:59Z' AND status >= 400
+GROUP BY path ORDER BY COUNT(*) DESC LIMIT 20;
+
+-- Response time stats
+SELECT method, MIN(response_time_ms), AVG(response_time_ms), MAX(response_time_ms)
+FROM logs.http_requests WHERE timestamp >= '2025-01-15T00:00:00Z' GROUP BY method;
+
+-- Traffic by status
+SELECT status, COUNT(*) FROM logs.http_requests
+WHERE timestamp >= '2025-01-15T00:00:00Z' AND method = 'GET'
+GROUP BY status ORDER BY COUNT(*) DESC;
+```
+
+### Fraud Detection
+```sql
+-- High-value transactions
+SELECT location, COUNT(*), SUM(amount), AVG(amount)
+FROM fraud.transactions WHERE transaction_timestamp >= '2025-01-01T00:00:00Z' AND amount > 1000.0
+GROUP BY location ORDER BY SUM(amount) DESC LIMIT 20;
+
+-- Flagged transactions
+SELECT merchant_category, COUNT(*), AVG(amount) FROM fraud.transactions
+WHERE is_fraud_flag = true AND transaction_timestamp >= '2025-01-01T00:00:00Z'
+GROUP BY merchant_category HAVING COUNT(*) > 10 ORDER BY COUNT(*) DESC;
+```
+
+### Business Intelligence
+```sql
+-- Sales by department
+SELECT department, SUM(revenue), AVG(revenue), COUNT(*) FROM sales.transactions
+WHERE sale_date >= '2024-01-01' GROUP BY department ORDER BY SUM(revenue) DESC LIMIT 10;
+
+-- Product performance
+SELECT category, COUNT(DISTINCT product_id), SUM(units_sold), SUM(revenue)
+FROM sales.product_sales WHERE sale_date BETWEEN '2024-10-01' AND '2024-12-31'
+GROUP BY category ORDER BY SUM(revenue) DESC;
+```
+
+## Connecting External Engines
+
+R2 Data Catalog exposes an Iceberg REST API. Connect Spark, Snowflake, Trino, DuckDB, etc.
+
+```scala
+// Apache Spark example
+val spark = SparkSession.builder()
+  .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog")
+  .config("spark.sql.catalog.my_catalog.catalog-impl", "org.apache.iceberg.rest.RESTCatalog")
+  .config("spark.sql.catalog.my_catalog.uri", "https://<account-id>.r2.cloudflarestorage.com/iceberg/my-bucket")
+  .config("spark.sql.catalog.my_catalog.token", "<your-api-token>")
+  .getOrCreate()
+
+spark.sql("SELECT * FROM my_catalog.default.my_table LIMIT 10").show()
+```
+
+See [r2-data-catalog/patterns.md](../r2-data-catalog/patterns.md) for more engines.
+
+## Performance Optimization
+
+### Partitioning
+- **Time-series:** day/hour on timestamp
+- **Geographic:** region/country
+- **Avoid:** High-cardinality keys (user_id)
+
+```python
+from pyiceberg.partitioning import PartitionSpec, PartitionField
+from pyiceberg.transforms import DayTransform
+
+PartitionSpec(PartitionField(source_id=1, field_id=1000, transform=DayTransform(), name="day"))
+```
+
+### Query Optimization
+- **Always use LIMIT** for early termination
+- **Filter on partition keys first**
+- **Multiple filters** for better pruning
+
+```sql
+-- Better: Multiple filters on partition key
+SELECT * FROM logs.requests
+WHERE timestamp >= '2025-01-15T00:00:00Z' AND status = 404 AND method = 'GET' LIMIT 100;
+```
+
+### File Organization
+- **Pipelines roll:** Dev 10-30s, Prod 300+s
+- **Target Parquet:** 100-500MB compressed
+
+## See Also
+
+- [api.md](api.md) - SQL syntax reference
+- [gotchas.md](gotchas.md) - Limitations and troubleshooting
+- [r2-data-catalog/patterns.md](../r2-data-catalog/patterns.md) - PyIceberg advanced patterns
+- [pipelines/patterns.md](../pipelines/patterns.md) - Streaming ingestion patterns
diff --git a/cloudflare/references/r2/README.md b/cloudflare/references/r2/README.md
new file mode 100644
index 0000000..af3d7d0
--- /dev/null
+++ b/cloudflare/references/r2/README.md
@@ -0,0 +1,95 @@
+# Cloudflare R2 Object Storage
+
+S3-compatible object
storage with zero egress fees, optimized for large file storage and delivery. + +## Overview + +R2 provides: +- S3-compatible API (Workers API + S3 REST) +- Zero egress fees globally +- Strong consistency for writes/deletes +- Storage classes (Standard/Infrequent Access) +- SSE-C encryption support + +**Use cases:** Media storage, backups, static assets, user uploads, data lakes + +## Quick Start + +```bash +wrangler r2 bucket create my-bucket --location=enam +wrangler r2 object put my-bucket/file.txt --file=./local.txt +``` + +```typescript +// Upload +await env.MY_BUCKET.put(key, data, { + httpMetadata: { contentType: 'image/jpeg' } +}); + +// Download +const object = await env.MY_BUCKET.get(key); +if (object) return new Response(object.body); +``` + +## Core Operations + +| Method | Purpose | Returns | +|--------|---------|---------| +| `put(key, value, options?)` | Upload object | `R2Object \| null` | +| `get(key, options?)` | Download object | `R2ObjectBody \| R2Object \| null` | +| `head(key)` | Get metadata only | `R2Object \| null` | +| `delete(keys)` | Delete object(s) | `Promise` | +| `list(options?)` | List objects | `R2Objects` | + +## Storage Classes + +- **Standard**: Frequent access, low latency reads +- **InfrequentAccess**: 30-day minimum storage, retrieval fees, lower storage cost + +## Event Notifications + +R2 integrates with Cloudflare Queues for reactive workflows: + +```typescript +// wrangler.jsonc +{ + "event_notifications": [{ + "queue": "r2-notifications", + "actions": ["PutObject", "DeleteObject"] + }] +} + +// Consumer +async queue(batch: MessageBatch, env: Env) { + for (const message of batch.messages) { + const event = message.body; // { action, bucket, object, timestamps } + if (event.action === 'PutObject') { + // Process upload: thumbnail generation, virus scan, etc. 
+ } + } +} +``` + +## Reading Order + +**First-time users:** README → configuration.md → api.md → patterns.md +**Specific tasks:** +- Setup: configuration.md +- Client uploads: patterns.md (presigned URLs) +- Public static site: patterns.md (public access + custom domain) +- Processing uploads: README (event notifications) + queues reference +- Debugging: gotchas.md + +## In This Reference + +- [configuration.md](./configuration.md) - Bindings, S3 SDK, CORS, lifecycles, token scopes +- [api.md](./api.md) - Workers API, multipart, conditional requests, presigned URLs +- [patterns.md](./patterns.md) - Streaming, caching, client uploads, public buckets +- [gotchas.md](./gotchas.md) - List truncation, etag format, stream length, S3 SDK region + +## See Also + +- [workers](../workers/) - Worker runtime and fetch handlers +- [kv](../kv/) - Metadata storage for R2 objects +- [d1](../d1/) - Store R2 URLs in relational database +- [queues](../queues/) - Process R2 uploads asynchronously diff --git a/cloudflare/references/r2/api.md b/cloudflare/references/r2/api.md new file mode 100644 index 0000000..9a8cd28 --- /dev/null +++ b/cloudflare/references/r2/api.md @@ -0,0 +1,200 @@ +# R2 API Reference + +## PUT (Upload) + +```typescript +// Basic +await env.MY_BUCKET.put(key, value); + +// With metadata +await env.MY_BUCKET.put(key, value, { + httpMetadata: { + contentType: 'image/jpeg', + contentDisposition: 'attachment; filename="photo.jpg"', + cacheControl: 'max-age=3600' + }, + customMetadata: { userId: '123', version: '2' }, + storageClass: 'Standard', // or 'InfrequentAccess' + sha256: arrayBufferOrHex, // Integrity check + ssecKey: arrayBuffer32bytes // SSE-C encryption +}); + +// Value types: ReadableStream | ArrayBuffer | string | Blob +``` + +## GET (Download) + +```typescript +const object = await env.MY_BUCKET.get(key); +if (!object) return new Response('Not found', { status: 404 }); + +// Body: arrayBuffer(), text(), json(), blob(), body (ReadableStream) + +// Ranged 
reads +const object = await env.MY_BUCKET.get(key, { range: { offset: 0, length: 1024 } }); + +// Conditional GET +const object = await env.MY_BUCKET.get(key, { onlyIf: { etagMatches: '"abc123"' } }); +``` + +## HEAD (Metadata Only) + +```typescript +const object = await env.MY_BUCKET.head(key); // Returns R2Object without body +``` + +## DELETE + +```typescript +await env.MY_BUCKET.delete(key); +await env.MY_BUCKET.delete([key1, key2, key3]); // Batch (max 1000) +``` +## LIST + +```typescript +const listed = await env.MY_BUCKET.list({ + limit: 1000, + prefix: 'photos/', + cursor: cursorFromPrevious, + delimiter: '/', + include: ['httpMetadata', 'customMetadata'] +}); + +// Pagination (always use truncated flag) +while (listed.truncated) { + const next = await env.MY_BUCKET.list({ cursor: listed.cursor }); + listed.objects.push(...next.objects); + listed.truncated = next.truncated; + listed.cursor = next.cursor; +} +``` + +## Multipart Uploads + +```typescript +const multipart = await env.MY_BUCKET.createMultipartUpload(key, { + httpMetadata: { contentType: 'video/mp4' } +}); + +const uploadedParts: R2UploadedPart[] = []; +for (let i = 0; i < partCount; i++) { + const part = await multipart.uploadPart(i + 1, partData); + uploadedParts.push(part); +} + +const object = await multipart.complete(uploadedParts); +// OR: await multipart.abort(); + +// Resume +const multipart = env.MY_BUCKET.resumeMultipartUpload(key, uploadId); +``` + +## Presigned URLs (S3 SDK) + +```typescript +import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'; +import { getSignedUrl } from '@aws-sdk/s3-request-presigner'; + +const s3 = new S3Client({ + region: 'auto', + endpoint: `https://${accountId}.r2.cloudflarestorage.com`, + credentials: { accessKeyId: env.R2_ACCESS_KEY_ID, secretAccessKey: env.R2_SECRET_ACCESS_KEY } +}); + +const uploadUrl = await getSignedUrl(s3, new PutObjectCommand({ Bucket: 'my-bucket', Key: key }), { expiresIn: 3600 }); +return Response.json({ uploadUrl }); 
+```
+
+## TypeScript Interfaces
+
+```typescript
+interface R2Bucket {
+  head(key: string): Promise<R2Object | null>;
+  get(key: string, options?: R2GetOptions): Promise<R2ObjectBody | R2Object | null>;
+  put(key: string, value: ReadableStream | ArrayBuffer | string | Blob, options?: R2PutOptions): Promise<R2Object | null>;
+  delete(keys: string | string[]): Promise<void>;
+  list(options?: R2ListOptions): Promise<R2Objects>;
+  createMultipartUpload(key: string, options?: R2MultipartOptions): Promise<R2MultipartUpload>;
+  resumeMultipartUpload(key: string, uploadId: string): R2MultipartUpload;
+}
+
+interface R2Object {
+  key: string; version: string; size: number;
+  etag: string; httpEtag: string; // httpEtag is quoted, use for headers
+  uploaded: Date; httpMetadata?: R2HTTPMetadata;
+  customMetadata?: Record<string, string>;
+  storageClass: 'Standard' | 'InfrequentAccess';
+  checksums: R2Checksums;
+  writeHttpMetadata(headers: Headers): void;
+}
+
+interface R2ObjectBody extends R2Object {
+  body: ReadableStream; bodyUsed: boolean;
+  arrayBuffer(): Promise<ArrayBuffer>; text(): Promise<string>;
+  json<T>(): Promise<T>; blob(): Promise<Blob>;
+}
+
+interface R2HTTPMetadata {
+  contentType?: string; contentDisposition?: string;
+  contentEncoding?: string; contentLanguage?: string;
+  cacheControl?: string; cacheExpiry?: Date;
+}
+
+interface R2PutOptions {
+  httpMetadata?: R2HTTPMetadata | Headers;
+  customMetadata?: Record<string, string>;
+  sha256?: ArrayBuffer | string; // Only ONE checksum allowed
+  storageClass?: 'Standard' | 'InfrequentAccess';
+  ssecKey?: ArrayBuffer;
+}
+
+interface R2GetOptions {
+  onlyIf?: R2Conditional | Headers;
+  range?: R2Range | Headers;
+  ssecKey?: ArrayBuffer;
+}
+
+interface R2ListOptions {
+  limit?: number; prefix?: string; cursor?: string; delimiter?: string;
+  startAfter?: string; include?: ('httpMetadata' | 'customMetadata')[];
+}
+
+interface R2Objects {
+  objects: R2Object[]; truncated: boolean;
+  cursor?: string; delimitedPrefixes: string[];
+}
+
+interface R2Conditional {
+  etagMatches?: string; etagDoesNotMatch?: string;
+  uploadedBefore?: Date; uploadedAfter?: Date;
+}
+
+interface
R2Range { offset?: number; length?: number; suffix?: number; }
+
+interface R2Checksums {
+  md5?: ArrayBuffer; sha1?: ArrayBuffer; sha256?: ArrayBuffer;
+  sha384?: ArrayBuffer; sha512?: ArrayBuffer;
+}
+
+interface R2MultipartUpload {
+  key: string;
+  uploadId: string;
+  uploadPart(partNumber: number, value: ReadableStream | ArrayBuffer | string | Blob): Promise<R2UploadedPart>;
+  abort(): Promise<void>;
+  complete(uploadedParts: R2UploadedPart[]): Promise<R2Object>;
+}
+
+interface R2UploadedPart {
+  partNumber: number;
+  etag: string;
+}
+```
+
+## CLI Operations
+
+```bash
+wrangler r2 object put my-bucket/file.txt --file=./local.txt
+wrangler r2 object get my-bucket/file.txt --file=./download.txt
+wrangler r2 object delete my-bucket/file.txt
+wrangler r2 object list my-bucket --prefix=photos/
+```
diff --git a/cloudflare/references/r2/configuration.md b/cloudflare/references/r2/configuration.md
new file mode 100644
index 0000000..f306acd
--- /dev/null
+++ b/cloudflare/references/r2/configuration.md
@@ -0,0 +1,165 @@
+# R2 Configuration
+
+## Workers Binding
+
+**wrangler.jsonc:**
+```jsonc
+{
+  "r2_buckets": [
+    {
+      "binding": "MY_BUCKET",
+      "bucket_name": "my-bucket-name"
+    }
+  ]
+}
+```
+
+## TypeScript Types
+
+```typescript
+interface Env { MY_BUCKET: R2Bucket; }
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const object = await env.MY_BUCKET.get('file.txt');
+    return new Response(object?.body);
+  }
+}
+```
+
+## S3 SDK Setup
+
+```typescript
+import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
+
+const s3 = new S3Client({
+  region: 'auto',
+  endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
+  credentials: {
+    accessKeyId: env.R2_ACCESS_KEY_ID,
+    secretAccessKey: env.R2_SECRET_ACCESS_KEY
+  }
+});
+
+await s3.send(new PutObjectCommand({
+  Bucket: 'my-bucket',
+  Key: 'file.txt',
+  Body: data,
+  StorageClass: 'STANDARD' // or 'STANDARD_IA'
+}));
+```
+
+## Location Hints
+
+```bash
+wrangler r2 bucket create my-bucket --location=enam
+
+#
Hints: wnam, enam, weur, eeur, apac, oc +# Jurisdictions (override hint): --jurisdiction=eu (or fedramp) +``` + +## CORS Configuration + +CORS must be configured via S3 SDK or dashboard (not available in Workers API): + +```typescript +import { S3Client, PutBucketCorsCommand } from '@aws-sdk/client-s3'; + +const s3 = new S3Client({ + region: 'auto', + endpoint: `https://${accountId}.r2.cloudflarestorage.com`, + credentials: { + accessKeyId: env.R2_ACCESS_KEY_ID, + secretAccessKey: env.R2_SECRET_ACCESS_KEY + } +}); + +await s3.send(new PutBucketCorsCommand({ + Bucket: 'my-bucket', + CORSConfiguration: { + CORSRules: [{ + AllowedOrigins: ['https://example.com'], + AllowedMethods: ['GET', 'PUT', 'HEAD'], + AllowedHeaders: ['*'], + ExposeHeaders: ['ETag'], + MaxAgeSeconds: 3600 + }] + } +})); +``` + +## Object Lifecycles + +```typescript +import { PutBucketLifecycleConfigurationCommand } from '@aws-sdk/client-s3'; + +await s3.send(new PutBucketLifecycleConfigurationCommand({ + Bucket: 'my-bucket', + LifecycleConfiguration: { + Rules: [ + { + ID: 'expire-old-logs', + Status: 'Enabled', + Prefix: 'logs/', + Expiration: { Days: 90 } + }, + { + ID: 'transition-to-ia', + Status: 'Enabled', + Prefix: 'archives/', + Transitions: [{ Days: 30, StorageClass: 'STANDARD_IA' }] + } + ] + } +})); +``` + +## API Token Scopes + +When creating R2 tokens, set minimal permissions: + +| Permission | Use Case | +|------------|----------| +| Object Read | Public serving, downloads | +| Object Write | Uploads only | +| Object Read & Write | Full object operations | +| Admin Read & Write | Bucket management, CORS, lifecycles | + +**Best practice:** Separate tokens for Workers (read/write) vs admin tasks (CORS, lifecycles). 
+ +## Event Notifications + +```jsonc +// wrangler.jsonc +{ + "r2_buckets": [ + { + "binding": "MY_BUCKET", + "bucket_name": "my-bucket", + "event_notifications": [ + { + "queue": "r2-events", + "actions": ["PutObject", "DeleteObject", "CompleteMultipartUpload"] + } + ] + } + ], + "queues": { + "producers": [{ "binding": "R2_EVENTS", "queue": "r2-events" }], + "consumers": [{ "queue": "r2-events", "max_batch_size": 10 }] + } +} +``` + +## Bucket Management + +```bash +wrangler r2 bucket create my-bucket --location=enam --storage-class=Standard +wrangler r2 bucket list +wrangler r2 bucket info my-bucket +wrangler r2 bucket delete my-bucket # Must be empty +wrangler r2 bucket update-storage-class my-bucket --storage-class=InfrequentAccess + +# Public bucket via dashboard +wrangler r2 bucket domain add my-bucket --domain=files.example.com +``` diff --git a/cloudflare/references/r2/gotchas.md b/cloudflare/references/r2/gotchas.md new file mode 100644 index 0000000..ad755d3 --- /dev/null +++ b/cloudflare/references/r2/gotchas.md @@ -0,0 +1,190 @@ +# R2 Gotchas & Troubleshooting + +## List Truncation + +```typescript +// ❌ WRONG: Don't compare object count when using include +while (listed.objects.length < options.limit) { ... } + +// ✅ CORRECT: Always use truncated property +while (listed.truncated) { + const next = await env.MY_BUCKET.list({ cursor: listed.cursor }); + // ... +} +``` + +**Reason:** `include` with metadata may return fewer objects per page to fit metadata. 
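The cursor loop can be wrapped once and reused. An illustrative TypeScript helper — the minimal `ListPage` shape stands in for `R2Objects`, and in a Worker you would pass `env.MY_BUCKET`:

```typescript
// Illustrative helper: collect every key under a prefix by following the
// truncated/cursor pair (never by comparing object counts against `limit`).
type ListPage = {
  objects: { key: string }[];
  truncated: boolean;
  cursor?: string;
};

async function listAllKeys(
  bucket: { list(opts: { prefix?: string; cursor?: string }): Promise<ListPage> },
  prefix?: string
): Promise<string[]> {
  const keys: string[] = [];
  let cursor: string | undefined;
  do {
    const page = await bucket.list({ prefix, cursor });
    keys.push(...page.objects.map(o => o.key));
    cursor = page.truncated ? page.cursor : undefined; // stop when not truncated
  } while (cursor);
  return keys;
}
```

For very large buckets, prefer processing each page as it arrives rather than accumulating every key in memory.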
+ +## ETag Format + +```typescript +// ❌ WRONG: Using etag (unquoted) in headers +headers.set('etag', object.etag); // Missing quotes + +// ✅ CORRECT: Use httpEtag (quoted) +headers.set('etag', object.httpEtag); +``` + +## Checksum Limits + +Only ONE checksum algorithm allowed per PUT: + +```typescript +// ❌ WRONG: Multiple checksums +await env.MY_BUCKET.put(key, data, { md5: hash1, sha256: hash2 }); // Error + +// ✅ CORRECT: Pick one +await env.MY_BUCKET.put(key, data, { sha256: hash }); +``` + +## Multipart Requirements + +- All parts must be uniform size (except last part) +- Part numbers start at 1 (not 0) +- Uncompleted uploads auto-abort after 7 days +- `resumeMultipartUpload` doesn't validate uploadId existence + +## Conditional Operations + +```typescript +// Precondition failure returns object WITHOUT body +const object = await env.MY_BUCKET.get(key, { + onlyIf: { etagMatches: '"wrong"' } +}); + +// Check for body, not just null +if (!object) return new Response('Not found', { status: 404 }); +if (!object.body) return new Response(null, { status: 304 }); // Precondition failed +``` + +## Key Validation + +```typescript +// ❌ DANGEROUS: Path traversal +const key = url.pathname.slice(1); // Could be ../../../etc/passwd +await env.MY_BUCKET.get(key); + +// ✅ SAFE: Validate keys +if (!key || key.includes('..') || key.startsWith('/')) { + return new Response('Invalid key', { status: 400 }); +} +``` + +## Storage Class Pitfalls + +- InfrequentAccess: 30-day minimum billing (even if deleted early) +- Can't transition IA → Standard via lifecycle (use S3 CopyObject) +- Retrieval fees apply for IA reads + +## Stream Length Requirement + +```typescript +// ❌ WRONG: Streaming unknown length fails silently +const response = await fetch(url); +await env.MY_BUCKET.put(key, response.body); // May fail without error + +// ✅ CORRECT: Buffer or use Content-Length +const data = await response.arrayBuffer(); +await env.MY_BUCKET.put(key, data); + +// OR: Pass Content-Length if 
known (a Worker request body with a
+// Content-Length header already has a known length)
+const object = await env.MY_BUCKET.put(key, request.body);
+
+// For other streams, wrap in a FixedLengthStream of known byte length
+const { readable, writable } = new FixedLengthStream(byteLength); // byteLength known up front
+await Promise.all([
+  response.body.pipeTo(writable),
+  env.MY_BUCKET.put(key, readable)
+]);
+```
+
+**Reason:** R2 requires known length for streams. Unknown length may cause silent truncation.
+
+## S3 SDK Region Configuration
+
+```typescript
+// ❌ WRONG: Missing region breaks ALL S3 SDK calls
+const s3 = new S3Client({
+  endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
+  credentials: { ... }
+});
+
+// ✅ CORRECT: MUST set region='auto'
+const s3 = new S3Client({
+  region: 'auto', // REQUIRED
+  endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
+  credentials: { ... }
+});
+```
+
+**Reason:** S3 SDK requires region. R2 uses 'auto' as placeholder.
+
+## Local Development Limits
+
+```typescript
+// ❌ Miniflare/wrangler dev: Limited R2 support
+// - No multipart uploads
+// - No presigned URLs (requires S3 SDK + network)
+// - Memory-backed storage (lost on restart)
+
+// ✅ Use remote bindings for full features
+wrangler dev --remote
+
+// OR: Conditional logic
+if (env.ENVIRONMENT === 'development') {
+  // Fallback for local dev
+} else {
+  // Full R2 features
+}
+```
+
+## Presigned URL Expiry
+
+```typescript
+// ❌ WRONG: URL expires but no client validation
+const url = await getSignedUrl(s3, command, { expiresIn: 60 });
+// 61 seconds later: 403 Forbidden
+
+// ✅ CORRECT: Return expiry to client
+return Response.json({
+  uploadUrl: url,
+  expiresAt: new Date(Date.now() + 60000).toISOString()
+});
+```
+
+## Limits
+
+| Limit | Value |
+|-------|-------|
+| Object size | 5 TB |
+| Multipart part count | 10,000 |
+| Multipart part min size | 5 MB (except last) |
+| Batch delete | 1,000 keys |
+| List limit | 1,000 per request |
+| Key size | 1024 bytes |
+| Custom metadata | 2 KB per object |
+| Presigned URL max expiry | 7 days |
+
+## Common Errors
+
+### "Stream upload failed" / Silent Truncation
+
+**Cause:** Stream length unknown or Content-Length missing
+**Solution:** Buffer data or pass explicit Content-Length + +### "Invalid credentials" / S3 SDK + +**Cause:** Missing `region: 'auto'` in S3Client config +**Solution:** Always set `region: 'auto'` for R2 + +### "Object not found" + +**Cause:** Object key doesn't exist or was deleted +**Solution:** Verify object key correct, check if object was deleted, ensure bucket correct + +### "List compatibility error" + +**Cause:** Missing or old compatibility_date, or flag not enabled +**Solution:** Set `compatibility_date >= 2022-08-04` or enable `r2_list_honor_include` flag + +### "Multipart upload failed" + +**Cause:** Part sizes not uniform or incorrect part number +**Solution:** Ensure uniform size except final part, verify part numbers start at 1 diff --git a/cloudflare/references/r2/patterns.md b/cloudflare/references/r2/patterns.md new file mode 100644 index 0000000..85191d6 --- /dev/null +++ b/cloudflare/references/r2/patterns.md @@ -0,0 +1,193 @@ +# R2 Patterns & Best Practices + +## Streaming Large Files + +```typescript +const object = await env.MY_BUCKET.get(key); +if (!object) return new Response('Not found', { status: 404 }); + +const headers = new Headers(); +object.writeHttpMetadata(headers); +headers.set('etag', object.httpEtag); + +return new Response(object.body, { headers }); +``` + +## Conditional GET (304 Not Modified) + +```typescript +const ifNoneMatch = request.headers.get('if-none-match'); +const object = await env.MY_BUCKET.get(key, { + onlyIf: { etagDoesNotMatch: ifNoneMatch?.replace(/"/g, '') || '' } +}); + +if (!object) return new Response('Not found', { status: 404 }); +if (!object.body) return new Response(null, { status: 304, headers: { 'etag': object.httpEtag } }); + +return new Response(object.body, { headers: { 'etag': object.httpEtag } }); +``` + +## Upload with Validation + +```typescript +const key = url.pathname.slice(1); +if (!key || key.includes('..')) return new Response('Invalid key', { status: 400 }); + +const object = await 
env.MY_BUCKET.put(key, request.body, { + httpMetadata: { contentType: request.headers.get('content-type') || 'application/octet-stream' }, + customMetadata: { uploadedAt: new Date().toISOString(), ip: request.headers.get('cf-connecting-ip') || 'unknown' } +}); + +return Response.json({ key: object.key, size: object.size, etag: object.httpEtag }); +``` + +## Multipart with Progress + +```typescript +const PART_SIZE = 5 * 1024 * 1024; // 5MB +const partCount = Math.ceil(file.size / PART_SIZE); +const multipart = await env.MY_BUCKET.createMultipartUpload(key, { httpMetadata: { contentType: file.type } }); + +const uploadedParts: R2UploadedPart[] = []; +try { + for (let i = 0; i < partCount; i++) { + const start = i * PART_SIZE; + const part = await multipart.uploadPart(i + 1, file.slice(start, start + PART_SIZE)); + uploadedParts.push(part); + onProgress?.(Math.round(((i + 1) / partCount) * 100)); + } + return await multipart.complete(uploadedParts); +} catch (error) { + await multipart.abort(); + throw error; +} +``` + +## Batch Delete + +```typescript +async function deletePrefix(prefix: string, env: Env) { + let cursor: string | undefined; + let truncated = true; + + while (truncated) { + const listed = await env.MY_BUCKET.list({ prefix, limit: 1000, cursor }); + if (listed.objects.length > 0) { + await env.MY_BUCKET.delete(listed.objects.map(o => o.key)); + } + truncated = listed.truncated; + cursor = listed.cursor; + } +} +``` + +## Checksum Validation & Storage Transitions + +```typescript +// Upload with checksum +const hash = await crypto.subtle.digest('SHA-256', data); +await env.MY_BUCKET.put(key, data, { sha256: hash }); + +// Transition storage class (requires S3 SDK) +import { S3Client, CopyObjectCommand } from '@aws-sdk/client-s3'; +await s3.send(new CopyObjectCommand({ + Bucket: 'my-bucket', Key: key, + CopySource: `/my-bucket/${key}`, + StorageClass: 'STANDARD_IA' +})); +``` + +## Client-Side Uploads (Presigned URLs) + +```typescript +import { S3Client 
} from '@aws-sdk/client-s3'; +import { getSignedUrl } from '@aws-sdk/s3-request-presigner'; +import { PutObjectCommand } from '@aws-sdk/client-s3'; + +// Worker: Generate presigned upload URL +const s3 = new S3Client({ + region: 'auto', + endpoint: `https://${env.ACCOUNT_ID}.r2.cloudflarestorage.com`, + credentials: { accessKeyId: env.R2_ACCESS_KEY_ID, secretAccessKey: env.R2_SECRET_ACCESS_KEY } +}); + +const url = await getSignedUrl(s3, new PutObjectCommand({ Bucket: 'my-bucket', Key: key }), { expiresIn: 3600 }); +return Response.json({ uploadUrl: url }); + +// Client: Upload directly +const { uploadUrl } = await fetch('/api/upload-url').then(r => r.json()); +await fetch(uploadUrl, { method: 'PUT', body: file }); +``` + +## Caching with Cache API + +```typescript +export default { + async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise { + const cache = caches.default; + const url = new URL(request.url); + const cacheKey = new Request(url.toString(), request); + + // Check cache first + let response = await cache.match(cacheKey); + if (response) return response; + + // Fetch from R2 + const key = url.pathname.slice(1); + const object = await env.MY_BUCKET.get(key); + if (!object) return new Response('Not found', { status: 404 }); + + const headers = new Headers(); + object.writeHttpMetadata(headers); + headers.set('etag', object.httpEtag); + headers.set('cache-control', 'public, max-age=31536000, immutable'); + + response = new Response(object.body, { headers }); + + // Cache for subsequent requests + ctx.waitUntil(cache.put(cacheKey, response.clone())); + + return response; + } +}; +``` + +## Public Bucket with Custom Domain + +```typescript +export default { + async fetch(request: Request, env: Env): Promise { + // CORS preflight + if (request.method === 'OPTIONS') { + return new Response(null, { + headers: { + 'access-control-allow-origin': '*', + 'access-control-allow-methods': 'GET, HEAD', + 'access-control-max-age': '86400' + } + }); + } + 
+ const key = new URL(request.url).pathname.slice(1); + if (!key) return Response.redirect('/index.html', 302); + + const object = await env.MY_BUCKET.get(key); + if (!object) return new Response('Not found', { status: 404 }); + + const headers = new Headers(); + object.writeHttpMetadata(headers); + headers.set('etag', object.httpEtag); + headers.set('access-control-allow-origin', '*'); + headers.set('cache-control', 'public, max-age=31536000, immutable'); + + return new Response(object.body, { headers }); + } +}; +``` + +## r2.dev Public URLs + +Enable r2.dev in dashboard for simple public access: `https://pub-${hashId}.r2.dev/${key}` +Or add custom domain via dashboard: `https://files.example.com/${key}` + +**Limitations:** No auth, bucket-level CORS, no cache override. diff --git a/cloudflare/references/realtime-sfu/README.md b/cloudflare/references/realtime-sfu/README.md new file mode 100644 index 0000000..6f99921 --- /dev/null +++ b/cloudflare/references/realtime-sfu/README.md @@ -0,0 +1,65 @@ +# Cloudflare Realtime SFU Reference + +Expert guidance for building real-time audio/video/data applications using Cloudflare Realtime SFU (Selective Forwarding Unit). 
+ +## Reading Order + +| Task | Files | ~Tokens | +|------|-------|---------| +| New project | README → configuration | ~1200 | +| Implement publish/subscribe | README → api | ~1600 | +| Add PartyTracks | patterns (PartyTracks section) | ~800 | +| Build presence system | patterns (DO section) | ~800 | +| Debug connection issues | gotchas | ~700 | +| Scale to millions | patterns (Cascading section) | ~600 | +| Add simulcast | patterns (Advanced section) | ~500 | +| Configure TURN | configuration (TURN section) | ~400 | + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, deployment, environment variables, Wrangler config +- **[api.md](api.md)** - Sessions, tracks, endpoints, request/response patterns +- **[patterns.md](patterns.md)** - Architecture patterns, use cases, integration examples +- **[gotchas.md](gotchas.md)** - Common issues, debugging, performance, security + +## Quick Start + +Cloudflare Realtime SFU: WebRTC infrastructure on global network (310+ cities). Anycast routing, no regional constraints, pub/sub model. + +**Core concepts:** +- **Sessions:** WebRTC PeerConnection to Cloudflare edge +- **Tracks:** Audio/video/data channels you publish or subscribe to +- **No rooms:** Build presence layer yourself via track sharing (see patterns.md) + +**Mental model:** Your client establishes one WebRTC session, publishes tracks (audio/video), shares track IDs via your backend, others subscribe to your tracks using track IDs + your session ID. 
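The subscribe half of that flow can be made concrete with a small helper: given the session ID and track names a publisher shared via your backend, it builds the body of the `tracks/new` subscribe request (shape per api.md; the helper itself is illustrative and not part of any SDK):

```typescript
// Illustrative helper: build the subscribe request body from the
// { sessionId, trackNames } a publisher shared through your backend.
interface SubscribeBody {
  tracks: { location: 'remote'; trackName: string; sessionId: string }[];
}

function buildSubscribeBody(publisherSessionId: string, trackNames: string[]): SubscribeBody {
  return {
    tracks: trackNames.map(trackName => ({
      location: 'remote' as const,
      trackName,
      sessionId: publisherSessionId
    }))
  };
}

// POSTed to /v1/apps/{appId}/sessions/{mySessionId}/tracks/new
console.log(buildSubscribeBody('abc123', ['mic', 'cam']).tracks.length); // 2
```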
+ +## Choose Your Approach + +| Approach | When to Use | Complexity | +|----------|-------------|------------| +| **PartyTracks** | Production apps with device switching, React | Low - Observable-based, handles reconnections | +| **Raw API** | Custom requirements, non-browser, learning | Medium - Full control, manual WebRTC lifecycle | +| **RealtimeKit** | End-to-end SDK with UI components | Lowest - Managed state, React hooks | + +**Recommendation:** Start with PartyTracks for most production applications. See patterns.md for PartyTracks examples. + +## SFU vs RealtimeKit + +- **Realtime SFU:** WebRTC infrastructure (this reference). Build your own signaling, presence, UI. +- **RealtimeKit:** SDK layer on top of SFU. Includes React hooks, state management, UI components. Part of Cloudflare AI platform. + +Use SFU directly when you need custom signaling or non-React framework. Use RealtimeKit for faster development with React. + +## Setup + +Dashboard: https://dash.cloudflare.com/?to=/:account/calls + +Get `CALLS_APP_ID` and `CALLS_APP_SECRET` from dashboard, then see configuration.md for deployment. 
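Both values are server-side secrets in practice; as a sketch, this is the request shape a backend builds with them to mint a session (host and path follow the SFU HTTPS API; the helper name is illustrative):

```typescript
// Illustrative: the app secret should only ever appear server-side, in a
// request like this one that creates an SFU session on a client's behalf.
const API_BASE = 'https://rtc.live.cloudflare.com/v1';

function newSessionRequest(appId: string, appSecret: string) {
  return {
    url: `${API_BASE}/apps/${appId}/sessions/new`,
    method: 'POST' as const,
    headers: { Authorization: `Bearer ${appSecret}` }
  };
}

// Pass straight to fetch: const r = newSessionRequest(...); fetch(r.url, r)
const r = newSessionRequest('my-app-id', 'my-secret');
console.log(new URL(r.url).pathname); // "/v1/apps/my-app-id/sessions/new"
```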
+
+## See Also
+
+- [Orange Meets Demo](https://demo.orange.cloudflare.dev/)
+- [Orange Source](https://github.com/cloudflare/orange)
+- [Calls Examples](https://github.com/cloudflare/calls-examples)
+- [API Reference](https://developers.cloudflare.com/api/resources/calls/)
+- [RealtimeKit Docs](https://developers.cloudflare.com/workers-ai/realtimekit/)
diff --git a/cloudflare/references/realtime-sfu/api.md b/cloudflare/references/realtime-sfu/api.md
new file mode 100644
index 0000000..6e6dae6
--- /dev/null
+++ b/cloudflare/references/realtime-sfu/api.md
@@ -0,0 +1,158 @@
+# API Reference
+
+## Authentication
+
+```bash
+curl -X POST 'https://rtc.live.cloudflare.com/v1/apps/${CALLS_APP_ID}/sessions/new' \
+  -H "Authorization: Bearer ${CALLS_APP_SECRET}"
+```
+
+## Core Concepts
+
+**Sessions:** PeerConnection to Cloudflare edge
+**Tracks:** Media/data channels (audio/video/datachannel)
+**No rooms:** Build presence via track sharing
+
+## Client Libraries
+
+**PartyTracks (Recommended):** Observable-based client library for production use. Handles device changes, network switches, and ICE restarts automatically. Push/pull API with React hooks. See patterns.md for full examples.
+
+```bash
+npm install partytracks @cloudflare/calls
+```
+
+**Raw API:** Direct HTTP + WebRTC for custom requirements (documented below).
+ +## Endpoints + +### Create Session +```http +POST /v1/apps/{appId}/sessions/new +→ {sessionId, sessionDescription} +``` + +### Add Track (Publish) +```http +POST /v1/apps/{appId}/sessions/{sessionId}/tracks/new +Body: { + sessionDescription: {sdp, type: "offer"}, + tracks: [{location: "local", trackName: "my-video"}] +} +→ {sessionDescription, tracks: [{trackName}]} +``` + +### Add Track (Subscribe) +```http +POST /v1/apps/{appId}/sessions/{sessionId}/tracks/new +Body: { + tracks: [{ + location: "remote", + trackName: "remote-track-id", + sessionId: "other-session-id" + }] +} +→ {sessionDescription} (server offer) +``` + +### Renegotiate +```http +PUT /v1/apps/{appId}/sessions/{sessionId}/renegotiate +Body: {sessionDescription: {sdp, type: "answer"}} +``` + +### Close Tracks +```http +PUT /v1/apps/{appId}/sessions/{sessionId}/tracks/close +Body: {tracks: [{trackName}]} +→ {requiresImmediateRenegotiation: boolean} +``` + +### Get Session +```http +GET /v1/apps/{appId}/sessions/{sessionId} +→ {sessionId, tracks: TrackMetadata[]} +``` + +## TypeScript Types + +```typescript +interface TrackMetadata { + trackName: string; + location: "local" | "remote"; + sessionId?: string; // For remote tracks + mid?: string; // WebRTC mid +} +``` + +## WebRTC Flow + +```typescript +// 1. Create PeerConnection +const pc = new RTCPeerConnection({ + iceServers: [{urls: 'stun:stun.cloudflare.com:3478'}] +}); + +// 2. Add tracks +const stream = await navigator.mediaDevices.getUserMedia({video: true, audio: true}); +stream.getTracks().forEach(track => pc.addTrack(track, stream)); + +// 3. Create offer +const offer = await pc.createOffer(); +await pc.setLocalDescription(offer); + +// 4. Send to backend → Cloudflare API +const response = await fetch('/api/new-session', { + method: 'POST', + body: JSON.stringify({sdp: offer.sdp}) +}); + +// 5. 
Set remote answer +const {sessionDescription} = await response.json(); +await pc.setRemoteDescription(sessionDescription); +``` + +## Publishing + +```typescript +const offer = await pc.createOffer(); +await pc.setLocalDescription(offer); + +const res = await fetch(`/api/sessions/${sessionId}/tracks`, { + method: 'POST', + body: JSON.stringify({ + sdp: offer.sdp, + tracks: [{location: 'local', trackName: 'my-video'}] + }) +}); + +const {sessionDescription, tracks} = await res.json(); +await pc.setRemoteDescription(sessionDescription); +const publishedTrackId = tracks[0].trackName; // Share with others +``` + +## Subscribing + +```typescript +const res = await fetch(`/api/sessions/${sessionId}/tracks`, { + method: 'POST', + body: JSON.stringify({ + tracks: [{location: 'remote', trackName: remoteTrackId, sessionId: remoteSessionId}] + }) +}); + +const {sessionDescription} = await res.json(); +await pc.setRemoteDescription(sessionDescription); + +const answer = await pc.createAnswer(); +await pc.setLocalDescription(answer); + +await fetch(`/api/sessions/${sessionId}/renegotiate`, { + method: 'PUT', + body: JSON.stringify({sdp: answer.sdp}) +}); + +pc.ontrack = (event) => { + const [remoteStream] = event.streams; + videoElement.srcObject = remoteStream; +}; +``` diff --git a/cloudflare/references/realtime-sfu/configuration.md b/cloudflare/references/realtime-sfu/configuration.md new file mode 100644 index 0000000..6736b45 --- /dev/null +++ b/cloudflare/references/realtime-sfu/configuration.md @@ -0,0 +1,137 @@ +# Configuration & Deployment + +## Dashboard Setup + +1. Navigate to https://dash.cloudflare.com/?to=/:account/calls +2. Click "Create Application" (or use existing app) +3. Copy `CALLS_APP_ID` from dashboard +4. Generate and copy `CALLS_APP_SECRET` (treat as sensitive credential) +5. 
Use credentials in Wrangler config or environment variables below + +## Dependencies + +**Backend (Workers):** Built-in fetch API, no additional packages required + +**Client (PartyTracks):** +```bash +npm install partytracks @cloudflare/calls +``` + +**Client (React + PartyTracks):** +```bash +npm install partytracks @cloudflare/calls observable-hooks +# Observable hooks: useObservableAsValue, useValueAsObservable +``` + +**Client (Raw API):** Native browser WebRTC API only + +## Wrangler Setup + +```jsonc +{ + "name": "my-calls-app", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date for new projects + "vars": { + "CALLS_APP_ID": "your-app-id", + "MAX_WEBCAM_BITRATE": "1200000", + "MAX_WEBCAM_FRAMERATE": "24", + "MAX_WEBCAM_QUALITY_LEVEL": "1080" + }, + // Set secret: wrangler secret put CALLS_APP_SECRET + "durable_objects": { + "bindings": [ + { + "name": "ROOM", + "class_name": "Room" + } + ] + } +} +``` + +## Deploy + +```bash +wrangler login +wrangler secret put CALLS_APP_SECRET +wrangler deploy +``` + +## Environment Variables + +**Required:** +- `CALLS_APP_ID`: From dashboard +- `CALLS_APP_SECRET`: From dashboard (secret) + +**Optional:** +- `MAX_WEBCAM_BITRATE` (default: 1200000) +- `MAX_WEBCAM_FRAMERATE` (default: 24) +- `MAX_WEBCAM_QUALITY_LEVEL` (default: 1080) +- `TURN_SERVICE_ID`: TURN service +- `TURN_SERVICE_TOKEN`: TURN auth (secret) + +## TURN Configuration + +```javascript +const pc = new RTCPeerConnection({ + iceServers: [ + { urls: 'stun:stun.cloudflare.com:3478' }, + { + urls: [ + 'turn:turn.cloudflare.com:3478?transport=udp', + 'turn:turn.cloudflare.com:3478?transport=tcp', + 'turns:turn.cloudflare.com:5349?transport=tcp' + ], + username: turnUsername, + credential: turnCredential + } + ], + bundlePolicy: 'max-bundle', // Recommended: reduces overhead + iceTransportPolicy: 'all' // Use 'relay' to force TURN (testing only) +}); +``` + +**Ports:** 3478 (UDP/TCP), 53 (UDP), 80 (TCP), 443 (TLS), 5349 (TLS) + 
+
+**When to use TURN:** Required for restrictive corporate firewalls/networks that block UDP. ~5-10% of connections fall back to TURN. STUN works for most users.
+
+**ICE candidate filtering:** Cloudflare handles candidate filtering automatically. No need to manually filter candidates.
+
+## Durable Object Boilerplate
+
+Minimal presence system:
+
+```typescript
+export class Room {
+  private sessions = new Map<string, {userId: string; tracks: string[]}>();
+
+  async fetch(req: Request) {
+    const {pathname} = new URL(req.url);
+    const body = await req.json() as {sessionId: string; userId: string; tracks: string[]};
+
+    if (pathname === '/join') {
+      this.sessions.set(body.sessionId, {userId: body.userId, tracks: []});
+      return Response.json({participants: this.sessions.size});
+    }
+
+    if (pathname === '/publish') {
+      this.sessions.get(body.sessionId)?.tracks.push(...body.tracks);
+      // Broadcast to others via WebSocket (not shown)
+      return new Response('OK');
+    }
+
+    return new Response('Not found', {status: 404});
+  }
+}
+```
+
+## Environment Validation
+
+Check credentials before first API call:
+
+```typescript
+if (!env.CALLS_APP_ID || !env.CALLS_APP_SECRET) {
+  throw new Error('CALLS_APP_ID and CALLS_APP_SECRET required');
+}
+```
diff --git a/cloudflare/references/realtime-sfu/gotchas.md b/cloudflare/references/realtime-sfu/gotchas.md
new file mode 100644
index 0000000..efe5ee7
--- /dev/null
+++ b/cloudflare/references/realtime-sfu/gotchas.md
@@ -0,0 +1,133 @@
+# Gotchas & Troubleshooting
+
+## Common Errors
+
+### "Slow initial connect (~1.8s)"
+
+**Cause:** First STUN delayed during consensus forming (normal behavior)
+**Solution:** Subsequent connections are faster. CF detects DTLS ClientHello early to compensate.
+
+### "No media flow"
+
+**Cause:** SDP exchange incomplete, connection not established, tracks not added before offer, browser permissions missing
+**Solution:**
+1. Verify SDP exchange complete
+2. Check `pc.connectionState === 'connected'`
+3. Ensure tracks added before creating offer
+4. Confirm browser permissions granted
+5. 
Use `chrome://webrtc-internals` for debugging + +### "Track not receiving" + +**Cause:** Track not published, track ID not shared, session IDs mismatch, `pc.ontrack` not set, renegotiation needed +**Solution:** +1. Verify track published successfully +2. Confirm track ID shared between peers +3. Check session IDs match +4. Set `pc.ontrack` handler before answer +5. Trigger renegotiation if needed + +### "ICE connection failed" + +**Cause:** Network changed, firewall blocked UDP, TURN needed, transient network issue +**Solution:** +```typescript +pc.oniceconnectionstatechange = async () => { + if (pc.iceConnectionState === 'failed') { + console.warn('ICE failed, attempting restart'); + await pc.restartIce(); // Triggers new ICE gathering + + // Create new offer with ICE restart flag + const offer = await pc.createOffer({iceRestart: true}); + await pc.setLocalDescription(offer); + + // Send to backend → Cloudflare API + await fetch(`/api/sessions/${sessionId}/renegotiate`, { + method: 'PUT', + body: JSON.stringify({sdp: offer.sdp}) + }); + } +}; +``` + +### "Track stuck/frozen" + +**Cause:** Sender paused track, network congestion, codec mismatch, mobile browser backgrounded +**Solution:** +1. Check `track.enabled` and `track.readyState === 'live'` +2. Verify sender active: `pc.getSenders().find(s => s.track === track)` +3. Check stats for packet loss/jitter (see patterns.md) +4. On mobile: Re-acquire tracks when app foregrounded +5. 
Test with different codecs if persistent + +### "Network change disconnects call" + +**Cause:** Mobile switching WiFi↔cellular, laptop changing networks +**Solution:** +```typescript +// Listen for network changes +if ('connection' in navigator) { + (navigator as any).connection.addEventListener('change', async () => { + console.log('Network changed'); + await pc.restartIce(); // Use ICE restart pattern above + }); +} + +// Or use PartyTracks (handles automatically) +``` + +## Retry with Exponential Backoff + +```typescript +async function fetchWithRetry(url: string, options: RequestInit, maxRetries = 3) { + for (let i = 0; i < maxRetries; i++) { + try { + const res = await fetch(url, options); + if (res.ok) return res; + if (res.status >= 500) throw new Error('Server error'); + return res; // Client error, don't retry + } catch (err) { + if (i === maxRetries - 1) throw err; + const delay = Math.min(1000 * 2 ** i, 10000); // Cap at 10s + await new Promise(resolve => setTimeout(resolve, delay)); + } + } +} +``` + +## Debugging with chrome://webrtc-internals + +1. Open `chrome://webrtc-internals` in Chrome/Edge +2. Find your PeerConnection in the list +3. Check **Stats graphs** for packet loss, jitter, bandwidth +4. Check **ICE candidate pairs**: Look for `succeeded` state, relay vs host candidates +5. Check **getStats**: Raw metrics for inbound/outbound RTP +6. Look for errors in **Event log**: `iceConnectionState`, `connectionState` changes +7. Export data with "Download the PeerConnection updates and stats data" button +8. 
Common issues visible here: ICE failures, high packet loss, bitrate drops + +## Limits + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| Egress (Free) | 1TB/month | Per account | +| Egress (Paid) | $0.05/GB | After free tier | +| Inbound traffic | Free | All plans | +| TURN service | Free | Included with SFU | +| Participants | No hard limit | Client bandwidth/CPU bound (typically 10-50 tracks) | +| Tracks per session | No hard limit | Client resources limited | +| Session duration | No hard limit | Production calls run for hours | +| WebRTC ports | UDP 1024-65535 | Outbound only, required for media | +| API rate limit | 600 req/min | Per app, burst allowed | + +## Security Checklist + +- ✅ **Never expose** `CALLS_APP_SECRET` to client +- ✅ **Validate user identity** in backend before creating sessions +- ✅ **Implement auth tokens** for session access (JWT in custom header) +- ✅ **Rate limit** session creation endpoints +- ✅ **Expire sessions** server-side after inactivity +- ✅ **Validate track IDs** before subscribing (prevent unauthorized access) +- ✅ **Use HTTPS** for all signaling (API calls) +- ✅ **Enable DTLS-SRTP** (automatic with Cloudflare, encrypts media) +- ⚠️ **Consider E2EE** for sensitive content (implement client-side with Insertable Streams API) diff --git a/cloudflare/references/realtime-sfu/patterns.md b/cloudflare/references/realtime-sfu/patterns.md new file mode 100644 index 0000000..95ddc42 --- /dev/null +++ b/cloudflare/references/realtime-sfu/patterns.md @@ -0,0 +1,174 @@ +# Patterns & Use Cases + +## Architecture + +``` +Client (WebRTC) <---> CF Edge <---> Backend (HTTP) + | + CF Backbone (310+ DCs) + | + Other Edges <---> Other Clients +``` + +Anycast: Last-mile <50ms (95%), no region select, NACK shield, distributed consensus + +Cascading trees auto-scale to millions: +``` +Publisher -> Edge A -> Edge B -> Sub1 + \-> Edge C -> Sub2,3 +``` + +## Use Cases + +**1:1:** A creates session+publishes, B 
creates+subscribes to A+publishes, A subscribes to B +**N:N:** All create session+publish, backend broadcasts track IDs, all subscribe to others +**1:N:** Publisher creates+publishes, viewers each create+subscribe (no fan-out limit) +**Breakout:** Same PeerConnection! Backend closes/adds tracks, no recreation + +## PartyTracks (Recommended) + +Observable-based client with automatic device/network handling: + +```typescript +import {PartyTracks} from 'partytracks'; + +// Create client +const pt = new PartyTracks({ + apiUrl: '/api/calls', + sessionId: 'my-session', + onTrack: (track, peer) => { + const video = document.getElementById(`video-${peer.id}`) as HTMLVideoElement; + video.srcObject = new MediaStream([track]); + } +}); + +// Publish camera (push API) +const camera = await pt.getCamera(); // Auto-requests permissions, handles device changes +await pt.publishTrack(camera, {trackName: 'my-camera'}); + +// Subscribe to remote track (pull API) +await pt.subscribeToTrack({trackName: 'remote-camera', sessionId: 'other-session'}); + +// React hook example +import {useObservableAsValue} from 'observable-hooks'; + +function VideoCall() { + const localTracks = useObservableAsValue(pt.localTracks$); + const remoteTracks = useObservableAsValue(pt.remoteTracks$); + + return
<div>{/* Render tracks */}</div>
; +} + +// Screenshare +const screen = await pt.getScreenshare(); +await pt.publishTrack(screen, {trackName: 'my-screen'}); + +// Handle device changes (automatic) +// PartyTracks detects device changes (e.g., Bluetooth headset) and renegotiates +``` + +## Backend + +Express: +```js +app.post('/api/new-session', async (req, res) => { + const r = await fetch(`${CALLS_API}/apps/${process.env.CALLS_APP_ID}/sessions/new`, + {method: 'POST', headers: {'Authorization': `Bearer ${process.env.CALLS_APP_SECRET}`}}); + res.json(await r.json()); +}); +``` + +Workers: Same pattern, use `env.CALLS_APP_ID` and `env.CALLS_APP_SECRET` + +DO Presence: See configuration.md for boilerplate + +## Audio Level Detection + +```typescript +// Attach analyzer to audio track +function attachAudioLevelDetector(track: MediaStreamTrack) { + const ctx = new AudioContext(); + const analyzer = ctx.createAnalyser(); + const src = ctx.createMediaStreamSource(new MediaStream([track])); + src.connect(analyzer); + + const data = new Uint8Array(analyzer.frequencyBinCount); + const checkLevel = () => { + analyzer.getByteFrequencyData(data); + const level = data.reduce((a, b) => a + b) / data.length; + if (level > 30) console.log('Speaking:', level); // Trigger UI update + requestAnimationFrame(checkLevel); + }; + checkLevel(); +} +``` + +## Connection Quality Monitoring + +```typescript +pc.getStats().then(stats => { + stats.forEach(report => { + if (report.type === 'inbound-rtp' && report.kind === 'video') { + const {packetsLost, packetsReceived, jitter} = report; + const lossRate = packetsLost / (packetsLost + packetsReceived); + if (lossRate > 0.05) console.warn('High packet loss:', lossRate); + if (jitter > 100) console.warn('High jitter:', jitter); + } + }); +}); +``` + +## Stage Management (Limit Visible Participants) + +```typescript +// Subscribe to top 6 active speakers only +let activeSubscriptions = new Set(); + +function updateStage(topSpeakers: string[]) { + const toAdd = 
topSpeakers.filter(id => !activeSubscriptions.has(id)).slice(0, 6); + const toRemove = [...activeSubscriptions].filter(id => !topSpeakers.includes(id)); + + toRemove.forEach(id => { + pc.getSenders().find(s => s.track?.id === id)?.track?.stop(); + activeSubscriptions.delete(id); + }); + + toAdd.forEach(async id => { + await fetch(`/api/subscribe`, {method: 'POST', body: JSON.stringify({trackId: id})}); + activeSubscriptions.add(id); + }); +} +``` + +## Advanced + +Bandwidth mgmt: +```ts +const s = pc.getSenders().find(s => s.track?.kind === 'video'); +const p = s.getParameters(); +if (!p.encodings) p.encodings = [{}]; +p.encodings[0].maxBitrate = 1200000; p.encodings[0].maxFramerate = 24; +await s.setParameters(p); +``` + +Simulcast (CF auto-forwards best layer): +```ts +pc.addTransceiver('video', {direction: 'sendonly', sendEncodings: [ + {rid: 'high', maxBitrate: 1200000}, + {rid: 'med', maxBitrate: 600000, scaleResolutionDownBy: 2}, + {rid: 'low', maxBitrate: 200000, scaleResolutionDownBy: 4} +]}); +``` + +DataChannel: +```ts +const dc = pc.createDataChannel('chat', {ordered: true, maxRetransmits: 3}); +dc.onopen = () => dc.send(JSON.stringify({type: 'chat', text: 'Hi'})); +dc.onmessage = (e) => console.log('RX:', JSON.parse(e.data)); +``` + +**WHIP/WHEP:** For streaming interop (OBS → SFU, SFU → video players), use WHIP (ingest) and WHEP (egress) protocols. See Cloudflare Stream integration docs. 
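Cloudflare forwards the best simulcast layer automatically, but a client that wants to pin a layer for constrained viewers can pick a `rid` from a bandwidth estimate; a minimal selector with thresholds matching the simulcast encodings above (the selection logic is illustrative):

```typescript
// Illustrative: map an estimated downlink (bits/sec) to one of the
// simulcast rids defined above. Thresholds mirror the maxBitrate values.
type Rid = 'high' | 'med' | 'low';

function pickLayer(availableBps: number): Rid {
  if (availableBps >= 1_200_000) return 'high';
  if (availableBps >= 600_000) return 'med';
  return 'low';
}

console.log(pickLayer(2_000_000)); // "high"
```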
+ +Integrations: R2 for recording `env.R2_BUCKET.put(...)`, Queues for analytics + +Perf: 100-250ms connect, ~50ms latency (95%), 200-400ms glass-to-glass, no participant limit (client: 10-50 tracks) diff --git a/cloudflare/references/realtimekit/README.md b/cloudflare/references/realtimekit/README.md new file mode 100644 index 0000000..6d19f51 --- /dev/null +++ b/cloudflare/references/realtimekit/README.md @@ -0,0 +1,113 @@ +# Cloudflare RealtimeKit + +Expert guidance for building real-time video and audio applications using **Cloudflare RealtimeKit** - a comprehensive SDK suite for adding customizable live video and voice to web or mobile applications. + +## Overview + +RealtimeKit is Cloudflare's SDK suite built on Realtime SFU, abstracting WebRTC complexity with fast integration, pre-built UI components, global performance (300+ cities), and production features (recording, transcription, chat, polls). + +**Use cases**: Team meetings, webinars, social video, audio calls, interactive plugins + +## Core Concepts + +- **App**: Workspace grouping meetings, participants, presets, recordings. Use separate Apps for staging/production +- **Meeting**: Re-usable virtual room. Each join creates new **Session** +- **Session**: Live meeting instance. Created on first join, ends after last leave +- **Participant**: User added via REST API. Returns `authToken` for client SDK. **Do not reuse tokens** +- **Preset**: Reusable permission/UI template (permissions, meeting type, theme). Applied at participant creation +- **Peer ID** (`id`): Unique per session, changes on rejoin +- **Participant ID** (`userId`): Persistent across sessions + +## Quick Start + +### 1. 
Create App & Meeting (Backend)
+
+```bash
+# Create app
+curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/realtime/kit/apps' \
+  -H 'Authorization: Bearer {api_token}' \
+  -d '{"name": "My RealtimeKit App"}'
+
+# Create meeting
+curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/realtime/kit/{app_id}/meetings' \
+  -H 'Authorization: Bearer {api_token}' \
+  -d '{"title": "Team Standup"}'
+
+# Add participant
+curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/realtime/kit/{app_id}/meetings/{meeting_id}/participants' \
+  -H 'Authorization: Bearer {api_token}' \
+  -d '{"name": "Alice", "preset_name": "host"}'
+# Returns: { authToken }
+```
+
+### 2. Client Integration
+
+**React**:
+```tsx
+import { RtkMeeting } from '@cloudflare/realtimekit-react-ui';
+
+function App({ meeting }) {
+  return <RtkMeeting meeting={meeting} onLeave={() => {}} />;
+}
+```
+
+**Core SDK**:
+```typescript
+import RealtimeKitClient from '@cloudflare/realtimekit';
+
+const meeting = new RealtimeKitClient({ authToken: '<auth-token>', video: true, audio: true });
+await meeting.join();
+```
+
+## Reading Order
+
+| Task | Files |
+|------|-------|
+| Quick integration | README only |
+| Custom UI | README → patterns → api |
+| Backend setup | README → configuration |
+| Debug issues | gotchas |
+| Advanced features | patterns → api |
+
+## RealtimeKit vs Realtime SFU
+
+| Choose | When |
+|--------|------|
+| **RealtimeKit** | Need pre-built UI, fast integration, React/Angular/HTML |
+| **Realtime SFU** | Building from scratch, custom WebRTC, full control |
+
+RealtimeKit is built on Realtime SFU but abstracts WebRTC complexity with UI components and SDKs.
+
+## Which Package?
+
+Need pre-built meeting UI?
+- React → `@cloudflare/realtimekit-react-ui` (`<RtkMeeting />`)
+- Angular → `@cloudflare/realtimekit-angular-ui`
+- HTML/Vanilla → `@cloudflare/realtimekit-ui`
+
+Need custom UI?
+- Core SDK → `@cloudflare/realtimekit` (RealtimeKitClient) - full control
+
+Need raw WebRTC control?
+- See `realtime-sfu/` reference + +## In This Reference + +- [Configuration](./configuration.md) - Setup, installation, wrangler config +- [API](./api.md) - Meeting object, REST API, SDK methods +- [Patterns](./patterns.md) - Common workflows, code examples +- [Gotchas](./gotchas.md) - Common issues, troubleshooting + +## See Also + +- [Workers](../workers/) - Backend integration +- [D1](../d1/) - Meeting metadata storage +- [R2](../r2/) - Recording storage +- [KV](../kv/) - Session management + +## Reference Links + +- **Official Docs**: https://developers.cloudflare.com/realtime/realtimekit/ +- **API Reference**: https://developers.cloudflare.com/api/resources/realtime_kit/ +- **Examples**: https://github.com/cloudflare/realtimekit-web-examples +- **Dashboard**: https://dash.cloudflare.com/?to=/:account/realtime/kit diff --git a/cloudflare/references/realtimekit/api.md b/cloudflare/references/realtimekit/api.md new file mode 100644 index 0000000..18e9a3f --- /dev/null +++ b/cloudflare/references/realtimekit/api.md @@ -0,0 +1,212 @@ +# RealtimeKit API Reference + +Complete API reference for Meeting object, REST endpoints, and SDK methods. 
+ +## Meeting Object API + +### `meeting.self` - Local Participant + +```typescript +// Properties: id, userId, name, audioEnabled, videoEnabled, screenShareEnabled, audioTrack, videoTrack, screenShareTracks, roomJoined, roomState +// Methods +await meeting.self.enableAudio() / disableAudio() / enableVideo() / disableVideo() / enableScreenShare() / disableScreenShare() +await meeting.self.setName("Name") // Before join only +await meeting.self.setDevice(device) +const devices = await meeting.self.getAllDevices() / getAudioDevices() / getVideoDevices() / getSpeakerDevices() +// Events: 'roomJoined', 'audioUpdate', 'videoUpdate', 'screenShareUpdate', 'deviceUpdate', 'deviceListUpdate' +meeting.self.on('roomJoined', () => {}) +meeting.self.on('audioUpdate', ({ audioEnabled, audioTrack }) => {}) +``` + +### `meeting.participants` - Remote Participants + +**Collections**: +```typescript +meeting.participants.joined / active / waitlisted / pinned // Maps +const participants = meeting.participants.joined.toArray() +const count = meeting.participants.joined.size() +const p = meeting.participants.joined.get('peer-id') +``` + +**Participant Properties**: +```typescript +participant.id / userId / name +participant.audioEnabled / videoEnabled / screenShareEnabled +participant.audioTrack / videoTrack / screenShareTracks +``` + +**Events**: +```typescript +meeting.participants.joined.on('participantJoined', (participant) => {}) +meeting.participants.joined.on('participantLeft', (participant) => {}) +``` + +### `meeting.meta` - Metadata +```typescript +meeting.meta.meetingId / meetingTitle / meetingStartedTimestamp +``` + +### `meeting.chat` - Chat +```typescript +meeting.chat.messages // Array +await meeting.chat.sendTextMessage("Hello") / sendImageMessage(file) +meeting.chat.on('chatUpdate', ({ message, messages }) => {}) +``` + +### `meeting.polls` - Polling +```typescript +meeting.polls.items // Array +await meeting.polls.create(question, options, anonymous, hideVotes) +await 
meeting.polls.vote(pollId, optionIndex) +``` + +### `meeting.plugins` - Collaborative Apps +```typescript +meeting.plugins.all // Array +await meeting.plugins.activate(pluginId) / deactivate() +``` + +### `meeting.ai` - AI Features +```typescript +meeting.ai.transcripts // Live transcriptions (when enabled in Preset) +``` + +### Core Methods +```typescript +await meeting.join() // Emits 'roomJoined' on meeting.self +await meeting.leave() +``` + +## TypeScript Types + +```typescript +import type { RealtimeKitClient, States, UIConfig, Participant } from '@cloudflare/realtimekit'; + +// Main interface +interface RealtimeKitClient { + self: SelfState; // Local participant (id, userId, name, audioEnabled, videoEnabled, roomJoined, roomState) + participants: { joined, active, waitlisted, pinned }; // Reactive Maps + chat: ChatNamespace; // messages[], sendTextMessage(), sendImageMessage() + polls: PollsNamespace; // items[], create(), vote() + plugins: PluginsNamespace; // all[], activate(), deactivate() + ai: AINamespace; // transcripts[] + meta: MetaState; // meetingId, meetingTitle, meetingStartedTimestamp + join(): Promise; + leave(): Promise; +} + +// Participant (self & remote share same shape) +interface Participant { + id: string; // Peer ID (changes on rejoin) + userId: string; // Persistent participant ID + name: string; + audioEnabled: boolean; + videoEnabled: boolean; + screenShareEnabled: boolean; + audioTrack: MediaStreamTrack | null; + videoTrack: MediaStreamTrack | null; + screenShareTracks: MediaStreamTrack[]; +} +``` + +## Store Architecture + +RealtimeKit uses reactive store (event-driven updates, live Maps): + +```typescript +// Subscribe to state changes +meeting.self.on('audioUpdate', ({ audioEnabled, audioTrack }) => {}); +meeting.participants.joined.on('participantJoined', (p) => {}); + +// Access current state synchronously +const isAudioOn = meeting.self.audioEnabled; +const count = meeting.participants.joined.size(); +``` + +**Key principles:** 
State updates emit events after changes. Use `.toArray()` sparingly. Collections are live Maps. + +## REST API + +Base: `https://api.cloudflare.com/client/v4/accounts/{account_id}/realtime/kit/{app_id}` + +### Meetings +```bash +GET /meetings # List all +GET /meetings/{meeting_id} # Get details +POST /meetings # Create: {"title": "..."} +PATCH /meetings/{meeting_id} # Update: {"title": "...", "record_on_start": true} +``` + +### Participants +```bash +GET /meetings/{meeting_id}/participants # List all +GET /meetings/{meeting_id}/participants/{participant_id} # Get details +POST /meetings/{meeting_id}/participants # Add: {"name": "...", "preset_name": "...", "custom_participant_id": "..."} +PATCH /meetings/{meeting_id}/participants/{participant_id} # Update: {"name": "...", "preset_name": "..."} +DELETE /meetings/{meeting_id}/participants/{participant_id} # Delete +POST /meetings/{meeting_id}/participants/{participant_id}/token # Refresh token +``` + +### Active Session +```bash +GET /meetings/{meeting_id}/active-session # Get active session +POST /meetings/{meeting_id}/active-session/kick # Kick users: {"user_ids": ["id1", "id2"]} +POST /meetings/{meeting_id}/active-session/kick-all # Kick all +POST /meetings/{meeting_id}/active-session/poll # Create poll: {"question": "...", "options": [...], "anonymous": false} +``` + +### Recording +```bash +GET /recordings?meeting_id={meeting_id} # List recordings +GET /recordings/active-recording/{meeting_id} # Get active recording +POST /recordings # Start: {"meeting_id": "...", "type": "composite"} (or "track") +PUT /recordings/{recording_id} # Control: {"action": "pause"} (or "resume", "stop") +POST /recordings/track # Track recording: {"meeting_id": "...", "layers": [...]} +``` + +### Livestreaming +```bash +GET /livestreams?exclude_meetings=false # List all +GET /livestreams/{livestream_id} # Get details +POST /meetings/{meeting_id}/livestreams # Start for meeting +POST /meetings/{meeting_id}/active-livestream/stop # Stop 
+POST /livestreams # Create independent: returns {ingest_server, stream_key, playback_url} +``` + +### Sessions & Analytics +```bash +GET /sessions # List all +GET /sessions/{session_id} # Get details +GET /sessions/{session_id}/participants # List participants +GET /sessions/{session_id}/participants/{participant_id} # Call stats +GET /sessions/{session_id}/chat # Download chat CSV +GET /sessions/{session_id}/transcript # Download transcript CSV +GET /sessions/{session_id}/summary # Get summary +POST /sessions/{session_id}/summary # Generate summary +GET /analytics/daywise?start_date=YYYY-MM-DD&end_date=YYYY-MM-DD # Day-wise analytics +GET /analytics/livestreams/overall # Livestream analytics +``` + +### Webhooks +```bash +GET /webhooks # List all +POST /webhooks # Create: {"url": "https://...", "events": ["session.started", "session.ended"]} +PATCH /webhooks/{webhook_id} # Update +DELETE /webhooks/{webhook_id} # Delete +``` + +## Session Lifecycle + +``` +Initialization → Join Intent → [Waitlist?] → Meeting Screen (Stage) → Ended + ↓ Approved + [Rejected → Ended] +``` + +UI Kit handles state transitions automatically. + +## See Also + +- [Configuration](./configuration.md) - Setup and installation +- [Patterns](./patterns.md) - Usage examples +- [README](./README.md) - Overview and quick start diff --git a/cloudflare/references/realtimekit/configuration.md b/cloudflare/references/realtimekit/configuration.md new file mode 100644 index 0000000..efbca80 --- /dev/null +++ b/cloudflare/references/realtimekit/configuration.md @@ -0,0 +1,203 @@ +# RealtimeKit Configuration + +Configuration guide for RealtimeKit setup, client SDKs, and wrangler integration. 
+
+## Installation
+
+### React
+```bash
+npm install @cloudflare/realtimekit @cloudflare/realtimekit-react-ui
+```
+
+### Angular
+```bash
+npm install @cloudflare/realtimekit @cloudflare/realtimekit-angular-ui
+```
+
+### Web Components/HTML
+```bash
+npm install @cloudflare/realtimekit @cloudflare/realtimekit-ui
+```
+
+## Client SDK Configuration
+
+### React UI Kit
+```tsx
+import { RtkMeeting } from '@cloudflare/realtimekit-react-ui';
+<RtkMeeting meeting={meeting} onLeave={() => {}} />
+```
+
+### Angular UI Kit
+```typescript
+@Component({ template: `` })
+export class AppComponent { authToken = ''; onLeave() {} }
+```
+
+### Web Components
+```html
+
+
+
+```
+
+### Core SDK Configuration
+```typescript
+import RealtimeKitClient from '@cloudflare/realtimekit';
+
+const meeting = new RealtimeKitClient({
+  authToken: '<auth-token>',
+  video: true, audio: true, autoSwitchAudioDevice: true,
+  mediaConfiguration: {
+    video: { width: { ideal: 1280 }, height: { ideal: 720 }, frameRate: { ideal: 30 } },
+    audio: { echoCancellation: true, noiseSuppression: true, autoGainControl: true },
+    screenshare: { width: { max: 1920 }, height: { max: 1080 }, frameRate: { ideal: 15 } }
+  }
+});
+await meeting.join();
+```
+
+## Backend Setup
+
+### Create App & Credentials
+
+**Dashboard**: https://dash.cloudflare.com/?to=/:account/realtime/kit
+
+**API**:
+```bash
+curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/realtime/kit/apps' \
+  -H 'Content-Type: application/json' \
+  -H 'Authorization: Bearer {api_token}' \
+  -d '{"name": "My RealtimeKit App"}'
+```
+
+**Required Permissions**: API token with **Realtime / Realtime Admin** permissions
+
+### Create Presets
+
+```bash
+curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/realtime/kit/{app_id}/presets' \
+  -H 'Authorization: Bearer {api_token}' \
+  -d '{
+    "name": "host",
+    "permissions": {
+      "canShareAudio": true,
+      "canShareVideo": true,
+      "canRecord": true,
+      "canLivestream": true,
+      "canStartStopRecording": true
+    }
+  }'
+```
+
+## Wrangler Configuration
+
+### Basic Configuration
+```jsonc +// wrangler.jsonc +{ + "name": "realtimekit-app", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date + "vars": { + "CLOUDFLARE_ACCOUNT_ID": "abc123", + "REALTIMEKIT_APP_ID": "xyz789" + } + // Secrets: wrangler secret put CLOUDFLARE_API_TOKEN +} +``` + +### With Database & Storage +```jsonc +{ + "d1_databases": [{ "binding": "DB", "database_name": "meetings", "database_id": "d1-id" }], + "r2_buckets": [{ "binding": "RECORDINGS", "bucket_name": "recordings" }], + "kv_namespaces": [{ "binding": "SESSIONS", "id": "kv-id" }] +} +``` + +### Multi-Environment +```bash +# Deploy to environments +wrangler deploy --env staging +wrangler deploy --env production +``` + +## TURN Service Configuration + +RealtimeKit can use Cloudflare's TURN service for connectivity through restrictive networks: + +```jsonc +// wrangler.jsonc +{ + "vars": { + "TURN_SERVICE_ID": "your_turn_service_id" + } + // Set secret: wrangler secret put TURN_SERVICE_TOKEN +} +``` + +TURN automatically configured when enabled in account - no client-side changes needed. 
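The staging/production deploy commands above assume matching `env` blocks in `wrangler.jsonc`; a minimal sketch, with all IDs as placeholders:

```jsonc
// wrangler.jsonc — hypothetical per-environment overrides
{
  "name": "realtimekit-app",
  "vars": { "REALTIMEKIT_APP_ID": "dev-app-id" },
  "env": {
    "staging": { "vars": { "REALTIMEKIT_APP_ID": "staging-app-id" } },
    "production": { "vars": { "REALTIMEKIT_APP_ID": "prod-app-id" } }
  }
}
```

Pointing each environment at its own RealtimeKit App keeps staging traffic out of production data, matching the "separate Apps for staging/production" guidance in the README.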
+ +## Theming & Design Tokens + +```typescript +import type { UIConfig } from '@cloudflare/realtimekit'; + +const uiConfig: UIConfig = { + designTokens: { + colors: { + brand: { 500: '#0066ff', 600: '#0052cc' }, + background: { 1000: '#1A1A1A', 900: '#2D2D2D' }, + text: { 1000: '#FFFFFF', 900: '#E0E0E0' } + }, + borderRadius: 'extra-rounded', // 'rounded' | 'extra-rounded' | 'sharp' + theme: 'dark' // 'light' | 'dark' + }, + logo: { url: 'https://example.com/logo.png', altText: 'Company' } +}; + +// Apply to React + {}} /> + +// Or use CSS variables +// :root { --rtk-color-brand-500: #0066ff; --rtk-border-radius: 12px; } +``` + +## Internationalization (i18n) + +### Custom Language Strings +```typescript +import { useLanguage } from '@cloudflare/realtimekit-ui'; + +const customLanguage = { + 'join': 'Entrar', + 'leave': 'Salir', + 'mute': 'Silenciar', + 'unmute': 'Activar audio', + 'turn_on_camera': 'Encender cámara', + 'turn_off_camera': 'Apagar cámara', + 'share_screen': 'Compartir pantalla', + 'stop_sharing': 'Dejar de compartir' +}; + +const t = useLanguage(customLanguage); + +// React usage + {}} /> +``` + +### Supported Locales +Default locales available: `en`, `es`, `fr`, `de`, `pt`, `ja`, `zh` + +```typescript +import { setLocale } from '@cloudflare/realtimekit-ui'; +setLocale('es'); // Switch to Spanish +``` + +## See Also + +- [API](./api.md) - Meeting APIs, REST endpoints +- [Patterns](./patterns.md) - Backend integration examples +- [README](./README.md) - Overview and quick start diff --git a/cloudflare/references/realtimekit/gotchas.md b/cloudflare/references/realtimekit/gotchas.md new file mode 100644 index 0000000..c6e7dfd --- /dev/null +++ b/cloudflare/references/realtimekit/gotchas.md @@ -0,0 +1,169 @@ +# RealtimeKit Gotchas & Troubleshooting + +## Common Errors + +### "Cannot connect to meeting" + +**Cause:** Auth token invalid/expired, API credentials lack permissions, or network blocks WebRTC +**Solution:** +Verify token validity, check API 
token has **Realtime / Realtime Admin** permissions, enable TURN service for restrictive networks + +### "No video/audio tracks" + +**Cause:** Browser permissions not granted, video/audio not enabled, device in use, or device unavailable +**Solution:** +Request browser permissions explicitly, verify initialization config, use `meeting.self.getAllDevices()` to debug, close other apps using device + +### "Participant count mismatched" + +**Cause:** `meeting.participants` doesn't include `meeting.self` +**Solution:** Total count = `meeting.participants.joined.size() + 1` + +### "Events not firing" + +**Cause:** Listeners registered after actions, incorrect event name, or wrong namespace +**Solution:** +Register listeners before calling `meeting.join()`, check event names against docs, verify correct namespace + +### "CORS errors in API calls" + +**Cause:** Making REST API calls from client-side +**Solution:** All REST API calls **must** be server-side (Workers, backend). Never expose API tokens to clients. + +### "Preset not applying" + +**Cause:** Preset doesn't exist, name mismatch (case-sensitive), or participant created before preset +**Solution:** +Verify preset exists via Dashboard or API, check exact spelling and case, create preset before adding participants + +### "Token reuse error" + +**Cause:** Reusing participant tokens across sessions +**Solution:** Generate fresh token per session. Use refresh endpoint if token expires during session. 
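The refresh flow can be sketched server-side against the documented refresh endpoint (`POST .../participants/{participant_id}/token`). This is a hedged sketch: the `{ result: { authToken } }` response shape is an assumption mirroring participant creation, and `fetchFn` is injectable only for testing.

```typescript
// Build the refresh URL following the REST API section's path scheme.
const CF_API = 'https://api.cloudflare.com/client/v4';

function refreshTokenUrl(accountId: string, appId: string, meetingId: string, participantId: string): string {
  return `${CF_API}/accounts/${accountId}/realtime/kit/${appId}/meetings/${meetingId}/participants/${participantId}/token`;
}

// Must run server-side (Workers/backend) — never expose the API token to clients.
async function refreshParticipantToken(
  apiToken: string,
  url: string,
  fetchFn: typeof fetch = fetch,
): Promise<string> {
  const res = await fetchFn(url, { method: 'POST', headers: { Authorization: `Bearer ${apiToken}` } });
  if (!res.ok) throw new Error(`token refresh failed: ${res.status}`);
  const data = await res.json() as { result: { authToken: string } }; // assumed shape
  return data.result.authToken;
}
```

The client then swaps in the fresh token and rejoins; the old token stays single-use.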
+
+### "Video quality poor"
+
+**Cause:** Insufficient bandwidth, resolution/bitrate too high, or CPU overload
+**Solution:**
+Lower `mediaConfiguration.video` resolution/frameRate, monitor network conditions, reduce participant count or grid size
+
+### "Echo or audio feedback"
+
+**Cause:** Multiple devices picking up same audio source
+**Solution:**
+Enable `echoCancellation: true` in `mediaConfiguration.audio`, use headphones, mute when not speaking
+
+### "Screen share not working"
+
+**Cause:** Browser doesn't support screen sharing API, permission denied, or wrong `displaySurface` config
+**Solution:**
+Use Chrome/Edge/Firefox (Safari limited support), check browser permissions, try different `displaySurface` values ('window', 'monitor', 'browser')
+
+### "How do I schedule meetings?"
+
+**Cause:** RealtimeKit has no built-in scheduling system
+**Solution:**
+Store meeting IDs in your database with timestamps. Generate participant tokens only when user should join.
Example: +```typescript +// Store in DB +{ meetingId: 'abc123', scheduledFor: '2026-02-15T10:00:00Z', userId: 'user456' } + +// Generate token when user clicks "Join" near scheduled time +const response = await fetch('/api/join-meeting', { + method: 'POST', + body: JSON.stringify({ meetingId: 'abc123' }) +}); +const { authToken } = await response.json(); +``` + +### "Recording not starting" + +**Cause:** Preset lacks recording permissions, no active session, or API call from client +**Solution:** +Verify preset has `canRecord: true` and `canStartStopRecording: true`, ensure session is active (at least one participant), make recording API calls server-side only + +## Limits + +| Resource | Limit | +|----------|-------| +| Max participants per session | 100 | +| Max concurrent sessions per App | 1000 | +| Max recording duration | 6 hours | +| Max meeting duration | 24 hours | +| Max chat message length | 4000 characters | +| Max preset name length | 64 characters | +| Max meeting title length | 256 characters | +| Max participant name length | 256 characters | +| Token expiration | 24 hours (default) | +| WebRTC ports required | UDP 1024-65535 | + +## Network Requirements + +### Firewall Rules +Allow outbound UDP/TCP to: +- `*.cloudflare.com` ports 443, 80 +- UDP ports 1024-65535 (WebRTC media) + +### TURN Service +Enable for users behind restrictive firewalls/proxies: +```jsonc +// wrangler.jsonc +{ + "vars": { + "TURN_SERVICE_ID": "your_turn_service_id" + } + // Set secret: wrangler secret put TURN_SERVICE_TOKEN +} +``` + +TURN automatically configured in SDK when enabled in account. 
+ +## Debugging Tips + +```typescript +// Check devices +const devices = await meeting.self.getAllDevices(); +meeting.self.on('deviceListUpdate', ({ added, removed, devices }) => console.log('Devices:', { added, removed, devices })); + +// Monitor participants +meeting.participants.joined.on('participantJoined', (p) => console.log(`${p.name} joined:`, { id: p.id, userId: p.userId, audioEnabled: p.audioEnabled, videoEnabled: p.videoEnabled })); + +// Check room state +meeting.self.on('roomJoined', () => console.log('Room:', { meetingId: meeting.meta.meetingId, meetingTitle: meeting.meta.meetingTitle, participantCount: meeting.participants.joined.size() + 1, audioEnabled: meeting.self.audioEnabled, videoEnabled: meeting.self.videoEnabled })); + +// Log all events +['roomJoined', 'audioUpdate', 'videoUpdate', 'screenShareUpdate', 'deviceUpdate', 'deviceListUpdate'].forEach(event => meeting.self.on(event, (data) => console.log(`[self] ${event}:`, data))); +['participantJoined', 'participantLeft'].forEach(event => meeting.participants.joined.on(event, (data) => console.log(`[participants] ${event}:`, data))); +meeting.chat.on('chatUpdate', (data) => console.log('[chat] chatUpdate:', data)); +``` + +## Security & Performance + +### Security: Do NOT +- Expose `CLOUDFLARE_API_TOKEN` in client code, hardcode credentials in frontend +- Reuse participant tokens, store tokens in localStorage without encryption +- Allow client-side meeting creation + +### Security: DO +- Generate tokens server-side only, use HTTPS, implement rate limiting +- Validate user auth before generating tokens, use `custom_participant_id` to map to your user system +- Set appropriate preset permissions per user role, rotate API tokens regularly + +### Performance +- **CPU**: Lower video resolution/frameRate, disable video for audio-only, use `meeting.participants.active` for large meetings, implement virtual scrolling +- **Bandwidth**: Set max resolution in `mediaConfiguration`, disable screenshare 
audio if unneeded, use audio-only mode, implement adaptive bitrate +- **Memory**: Clean up event listeners on unmount, call `meeting.leave()` when done, don't store large participant arrays + +## In This Reference +- [README.md](README.md) - Overview, core concepts, quick start +- [configuration.md](configuration.md) - SDK config, presets, wrangler setup +- [api.md](api.md) - Client SDK APIs, REST endpoints +- [patterns.md](patterns.md) - Common patterns, React hooks, backend integration diff --git a/cloudflare/references/realtimekit/patterns.md b/cloudflare/references/realtimekit/patterns.md new file mode 100644 index 0000000..ac662ef --- /dev/null +++ b/cloudflare/references/realtimekit/patterns.md @@ -0,0 +1,223 @@ +# RealtimeKit Patterns + +## UI Kit (Minimal Code) + +```tsx +// React +import { RtkMeeting } from '@cloudflare/realtimekit-react-ui'; + console.log('Left')} /> + +// Angular +@Component({ template: `` }) +export class AppComponent { authToken = ''; onLeave(event: unknown) {} } + +// HTML/Web Components + + + +``` + +## UI Components + +RealtimeKit provides 133+ pre-built Stencil.js Web Components with framework wrappers: + +### Layout Components +- `` - Full meeting UI (all-in-one) +- ``, ``, `` - Layout sections +- `` - Chat/participants sidebar +- `` - Adaptive video grid + +### Control Components +- ``, `` - Media controls +- `` - Screen sharing +- `` - Leave meeting +- `` - Device settings + +### Grid Variants +- `` - Active speaker focus +- `` - Audio-only mode +- `` - Paginated layout + +**See full catalog**: https://docs.realtime.cloudflare.com/ui-kit + +## Core SDK Patterns + +### Basic Setup +```typescript +import RealtimeKitClient from '@cloudflare/realtimekit'; + +const meeting = new RealtimeKitClient({ authToken, video: true, audio: true }); +meeting.self.on('roomJoined', () => console.log('Joined:', meeting.meta.meetingTitle)); +meeting.participants.joined.on('participantJoined', (p) => console.log(`${p.name} joined`)); +await 
meeting.join();
+```
+
+### Video Grid & Device Selection
+```typescript
+import { useEffect, useRef, useState } from 'react';
+
+// Video grid
+function VideoGrid({ meeting }) {
+  const [participants, setParticipants] = useState([]);
+  useEffect(() => {
+    const update = () => setParticipants(meeting.participants.joined.toArray());
+    meeting.participants.joined.on('participantJoined', update);
+    meeting.participants.joined.on('participantLeft', update);
+    update();
+    return () => {
+      meeting.participants.joined.off('participantJoined', update);
+      meeting.participants.joined.off('participantLeft', update);
+    };
+  }, [meeting]);
+  return (
+    <div className="video-grid">
+      {participants.map(p => <VideoTile key={p.id} participant={p} />)}
+    </div>
+  );
+}
+
+function VideoTile({ participant }) {
+  const videoRef = useRef(null);
+  useEffect(() => {
+    if (videoRef.current && participant.videoTrack) videoRef.current.srcObject = new MediaStream([participant.videoTrack]);
+  }, [participant.videoTrack]);
+  return <video ref={videoRef} autoPlay playsInline />;
+}
+
+// Device selection (handler must be async since setDevice is awaited)
+const devices = await meeting.self.getAllDevices();
+const switchCamera = async (deviceId: string) => {
+  const device = devices.find(d => d.deviceId === deviceId);
+  if (device) await meeting.self.setDevice(device);
+};
+```
+
+## React Hooks (Official)
+
+```typescript
+import { useRealtimeKitClient, useRealtimeKitSelector } from '@cloudflare/realtimekit-react-ui';
+
+function MyComponent() {
+  const [meeting, initMeeting] = useRealtimeKitClient();
+  const audioEnabled = useRealtimeKitSelector(m => m.self.audioEnabled);
+  const participantCount = useRealtimeKitSelector(m => m.participants.joined.size());
+
+  useEffect(() => { initMeeting({ authToken: '<auth-token>' }); }, []);
+
+  return (
+    <div>
+      <span>{audioEnabled ? 'Mic on' : 'Mic off'}</span>
+      <span>{participantCount} participants</span>
+    </div>
+  )
; +} +``` + +**Benefits:** Automatic re-renders, memoized selectors, type-safe + +## Waitlist Handling + +```typescript +// Monitor waitlist +meeting.participants.waitlisted.on('participantJoined', (participant) => { + console.log(`${participant.name} is waiting`); + // Show admin UI to approve/reject +}); + +// Approve from waitlist (backend only) +await fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/realtime/kit/${appId}/meetings/${meetingId}/active-session/waitlist/approve`, + { + method: 'POST', + headers: { 'Authorization': `Bearer ${apiToken}` }, + body: JSON.stringify({ user_ids: [participant.userId] }) + } +); + +// Client receives automatic transition when approved +meeting.self.on('roomJoined', () => console.log('Approved and joined')); +``` + +## Audio-Only Mode + +```typescript +const meeting = new RealtimeKitClient({ + authToken: '', + video: false, // Disable video + audio: true, + mediaConfiguration: { + audio: { + echoCancellation: true, + noiseSuppression: true, + autoGainControl: true + } + } +}); + +// Use audio grid component +import { RtkAudioGrid } from '@cloudflare/realtimekit-react-ui'; + +``` + +## Addon System + +```typescript +// List available addons +meeting.plugins.all.forEach(plugin => { + console.log(plugin.id, plugin.name, plugin.active); +}); + +// Activate collaborative app +await meeting.plugins.activate('whiteboard-addon-id'); + +// Listen for activations +meeting.plugins.on('pluginActivated', ({ plugin }) => { + console.log(`${plugin.name} activated`); +}); + +// Deactivate +await meeting.plugins.deactivate(); +``` + +## Backend Integration + +### Token Generation (Workers) +```typescript +export interface Env { CLOUDFLARE_API_TOKEN: string; CLOUDFLARE_ACCOUNT_ID: string; REALTIMEKIT_APP_ID: string; } + +export default { + async fetch(request: Request, env: Env): Promise { + const url = new URL(request.url); + + if (url.pathname === '/api/join-meeting') { + const { meetingId, userName, presetName } = await 
request.json(); + const response = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${env.CLOUDFLARE_ACCOUNT_ID}/realtime/kit/${env.REALTIMEKIT_APP_ID}/meetings/${meetingId}/participants`, + { + method: 'POST', + headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${env.CLOUDFLARE_API_TOKEN}` }, + body: JSON.stringify({ name: userName, preset_name: presetName }) + } + ); + const data = await response.json(); + return Response.json({ authToken: data.result.authToken }); + } + + return new Response('Not found', { status: 404 }); + } +}; +``` + +## Best Practices + +### Security +1. **Never expose API tokens client-side** - Generate participant tokens server-side only +2. **Don't reuse participant tokens** - Generate fresh token per session, use refresh endpoint if expired +3. **Use custom participant IDs** - Map to your user system for cross-session tracking + +### Performance +1. **Event-driven updates** - Listen to events, don't poll. Use `toArray()` only when needed +2. **Media quality constraints** - Set appropriate resolution/bitrate limits based on network conditions +3. **Device management** - Enable `autoSwitchAudioDevice` for better UX, handle device list updates + +### Architecture +1. **Separate Apps for environments** - staging vs production to prevent data mixing +2. **Preset strategy** - Create presets at App level, reuse across meetings +3. 
**Token management** - Backend generates tokens, frontend receives via authenticated endpoint + +## In This Reference +- [README.md](README.md) - Overview, core concepts, quick start +- [configuration.md](configuration.md) - SDK config, presets, wrangler setup +- [api.md](api.md) - Client SDK APIs, REST endpoints +- [gotchas.md](gotchas.md) - Common issues, troubleshooting, limits diff --git a/cloudflare/references/sandbox/README.md b/cloudflare/references/sandbox/README.md new file mode 100644 index 0000000..8550be4 --- /dev/null +++ b/cloudflare/references/sandbox/README.md @@ -0,0 +1,96 @@ +# Cloudflare Sandbox SDK + +Secure isolated code execution in containers on Cloudflare's edge. Run untrusted code, manage files, expose services, integrate with AI agents. + +**Use cases**: AI code execution, interactive dev environments, data analysis, CI/CD, code interpreters, multi-tenant execution. + +## Architecture + +- Each sandbox = Durable Object + Container +- Persistent across requests (same ID = same sandbox) +- Isolated filesystem/processes/network +- Configurable sleep/wake for cost optimization + +## Quick Start + +```typescript +import { getSandbox, proxyToSandbox, type Sandbox } from '@cloudflare/sandbox'; +export { Sandbox } from '@cloudflare/sandbox'; + +type Env = { Sandbox: DurableObjectNamespace; }; + +export default { + async fetch(request: Request, env: Env): Promise { + // CRITICAL: proxyToSandbox MUST be called first for preview URLs + const proxyResponse = await proxyToSandbox(request, env); + if (proxyResponse) return proxyResponse; + + const sandbox = getSandbox(env.Sandbox, 'my-sandbox'); + const result = await sandbox.exec('python3 -c "print(2 + 2)"'); + return Response.json({ output: result.stdout }); + } +}; +``` + +**wrangler.jsonc**: +```jsonc +{ + "name": "my-sandbox-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date for new projects + + "containers": [{ + "class_name": "Sandbox", + "image": 
"./Dockerfile",
+    "instance_type": "lite", // lite | standard | heavy
+    "max_instances": 5
+  }],
+
+  "durable_objects": {
+    "bindings": [{ "class_name": "Sandbox", "name": "Sandbox" }]
+  },
+
+  "migrations": [{
+    "tag": "v1",
+    "new_sqlite_classes": ["Sandbox"]
+  }]
+}
+```
+
+**Dockerfile**:
+```dockerfile
+FROM docker.io/cloudflare/sandbox:latest
+RUN pip3 install --no-cache-dir pandas numpy matplotlib
+# Ports required for wrangler dev (Dockerfile comments must start a line;
+# a trailing comment on EXPOSE would be parsed as an invalid port)
+EXPOSE 8080 3000
+```
+
+## Core APIs
+
+- `getSandbox(namespace, id, options?)` → Get/create sandbox
+- `sandbox.exec(command, options?)` → Execute command
+- `sandbox.readFile(path)` / `writeFile(path, content)` → File ops
+- `sandbox.startProcess(command, options)` → Background process
+- `sandbox.exposePort(port, options)` → Get preview URL
+- `sandbox.createSession(options)` → Isolated session
+- `sandbox.wsConnect(request, port)` → WebSocket proxy
+- `sandbox.destroy()` → Terminate container
+- `sandbox.mountBucket(bucket, path, options)` → Mount S3 storage
+
+## Critical Rules
+
+- ALWAYS call `proxyToSandbox()` first
+- Same ID = reuse sandbox
+- Use `/workspace` for persistent files
+- `normalizeId: true` for preview URLs
+- Retry on `CONTAINER_NOT_READY`
+
+## In This Reference
+- [configuration.md](./configuration.md) - Config, CLI, environment setup
+- [api.md](./api.md) - Programmatic API, testing patterns
+- [patterns.md](./patterns.md) - Common workflows, CI/CD integration
+- [gotchas.md](./gotchas.md) - Issues, limits, best practices
+
+## See Also
+- [durable-objects](../durable-objects/) - Sandbox runs on DO infrastructure
+- [containers](../containers/) - Container runtime fundamentals
+- [workers](../workers/) - Entry point for sandbox requests
diff --git a/cloudflare/references/sandbox/api.md b/cloudflare/references/sandbox/api.md
new file mode 100644
index 0000000..3eb2fa5
--- /dev/null
+++ b/cloudflare/references/sandbox/api.md
@@ -0,0 +1,198 @@
+# API Reference
+
+## Command Execution
+
+```typescript
+// Basic
+const result = await sandbox.exec('python3 script.py'); +// Returns: { stdout, stderr, exitCode, success, duration } + +// With options +await sandbox.exec('python3 test.py', { + cwd: '/workspace/project', + env: { API_KEY: 'secret' }, + stream: true, + onOutput: (stream, data) => console.log(data) +}); +``` + +## File Operations + +```typescript +// Read/Write +const { content } = await sandbox.readFile('/workspace/data.txt'); +await sandbox.writeFile('/workspace/file.txt', 'content'); // Auto-creates dirs + +// List/Delete +const files = await sandbox.listFiles('/workspace'); +await sandbox.deleteFile('/workspace/temp.txt'); +await sandbox.deleteFile('/workspace/dir', { recursive: true }); + +// Utils +await sandbox.mkdir('/workspace/dir', { recursive: true }); +await sandbox.pathExists('/workspace/file.txt'); +``` + +## Background Processes + +```typescript +// Start +const process = await sandbox.startProcess('python3 -m http.server 8080', { + processId: 'web-server', + cwd: '/workspace/public', + env: { PORT: '8080' } +}); +// Returns: { id, pid, command } + +// Wait for readiness +await process.waitForPort(8080); // Wait for port to listen +await process.waitForLog(/Server running/); // Wait for log pattern +await process.waitForExit(); // Wait for completion + +// Management +const processes = await sandbox.listProcesses(); +const info = await sandbox.getProcess('web-server'); +await sandbox.stopProcess('web-server'); +const logs = await sandbox.getProcessLogs('web-server'); +``` + +## Port Exposure + +```typescript +// Expose port +const { url } = await sandbox.exposePort(8080, { + name: 'web-app', + hostname: request.hostname +}); + +// Management +await sandbox.isPortExposed(8080); +await sandbox.getExposedPorts(request.hostname); +await sandbox.unexposePort(8080); +``` + +## Sessions (Isolated Contexts) + +Each session maintains own shell state, env vars, cwd, process namespace. 
+
+```typescript
+// Create with context
+const session = await sandbox.createSession({
+  id: 'user-123',
+  cwd: '/workspace/user123',
+  env: { USER_ID: '123' }
+});
+
+// Use (full sandbox API)
+await session.exec('echo $USER_ID');
+await session.writeFile('config.txt', 'data');
+
+// Manage
+await sandbox.getSession('user-123');
+await sandbox.deleteSession('user-123');
+```
+
+## Code Interpreter
+
+```typescript
+// Create context with variables
+const ctx = await sandbox.createCodeContext({
+  language: 'python',
+  variables: {
+    data: [1, 2, 3, 4, 5],
+    config: { verbose: true }
+  }
+});
+
+// Execute code with rich outputs
+const result = await ctx.runCode(`
+import matplotlib.pyplot as plt
+plt.plot(data, [x**2 for x in data])
+plt.savefig('plot.png')
+print(f"Processed {len(data)} points")
+`);
+// Returns: { outputs: [{ type: 'text'|'image'|'html', content }], error }
+
+// Context persists variables across runs
+const result2 = await ctx.runCode('print(data[0])'); // Still has 'data'
+```
+
+## WebSocket Connections
+
+```typescript
+// Proxy WebSocket to sandbox service
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const proxyResponse = await proxyToSandbox(request, env);
+    if (proxyResponse) return proxyResponse;
+
+    if (request.headers.get('Upgrade')?.toLowerCase() === 'websocket') {
+      const sandbox = getSandbox(env.Sandbox, 'realtime');
+      return await sandbox.wsConnect(request, 8080);
+    }
+
+    return new Response('Not a WebSocket request', { status: 400 });
+  }
+};
+```
+
+## Bucket Mounting (S3 Storage)
+
+```typescript
+// Mount R2 bucket (production only, not wrangler dev)
+await sandbox.mountBucket(env.DATA_BUCKET, '/data', {
+  readOnly: false
+});
+
+// Access files in mounted bucket
+await sandbox.exec('ls /data');
+await sandbox.writeFile('/data/output.txt', 'result');
+
+// Unmount
+await sandbox.unmountBucket('/data');
+```
+
+**Note**: Bucket mounting only works in production. 
Mounted buckets are sandbox-scoped (visible to all sessions in that sandbox). + +## Lifecycle Management + +```typescript +// Terminate container immediately +await sandbox.destroy(); + +// REQUIRED when using keepAlive: true +const sandbox = getSandbox(env.Sandbox, 'temp', { keepAlive: true }); +try { + await sandbox.writeFile('/tmp/code.py', code); + const result = await sandbox.exec('python /tmp/code.py'); + return result.stdout; +} finally { + await sandbox.destroy(); // Free resources +} +``` + +Deletes: files, processes, sessions, network connections, exposed ports. + +## Error Handling + +```typescript +// Command errors +const result = await sandbox.exec('python3 invalid.py'); +if (!result.success) { + console.error('Exit code:', result.exitCode); + console.error('Stderr:', result.stderr); +} + +// SDK errors +try { + await sandbox.readFile('/nonexistent'); +} catch (error) { + if (error.code === 'FILE_NOT_FOUND') { /* ... */ } + else if (error.code === 'CONTAINER_NOT_READY') { /* retry */ } + else if (error.code === 'TIMEOUT') { /* ... 
*/ }
+}
+
+// Retry pattern (see gotchas.md for full implementation)
+```
+
+
diff --git a/cloudflare/references/sandbox/configuration.md b/cloudflare/references/sandbox/configuration.md
new file mode 100644
index 0000000..32a3bd9
--- /dev/null
+++ b/cloudflare/references/sandbox/configuration.md
@@ -0,0 +1,143 @@
+# Configuration
+
+## getSandbox Options
+
+```typescript
+const sandbox = getSandbox(env.Sandbox, 'sandbox-id', {
+  normalizeId: true,  // lowercase ID (required for preview URLs)
+  sleepAfter: '10m',  // sleep after inactivity: '5m', '1h', '2d' (default: '10m')
+  keepAlive: false,   // false = auto-timeout, true = never sleep
+
+  containerTimeouts: {
+    instanceGetTimeoutMS: 30000, // 30s for provisioning (default: 30000)
+    portReadyTimeoutMS: 90000    // 90s for container startup (default: 90000)
+  }
+});
+```
+
+**Sleep Config**:
+- `sleepAfter`: Duration string (e.g., '5m', '10m', '1h') - default: '10m'
+- `keepAlive: false`: Auto-sleep (default, cost-optimized)
+- `keepAlive: true`: Never sleep (higher cost, requires explicit `destroy()`)
+- Sleeping sandboxes wake automatically (cold start)
+
+## Instance Types
+
+wrangler.jsonc `instance_type`:
+- `lite`: 256MB RAM, 0.5 vCPU (default)
+- `standard`: 512MB RAM, 1 vCPU
+- `heavy`: 1GB RAM, 2 vCPU
+
+## Dockerfile Patterns
+
+**Basic**:
+```dockerfile
+FROM docker.io/cloudflare/sandbox:latest
+RUN pip3 install --no-cache-dir pandas numpy
+# Required for wrangler dev (comment must be on its own line)
+EXPOSE 8080
+```
+
+**Scientific**:
+```dockerfile
+FROM docker.io/cloudflare/sandbox:latest
+RUN pip3 install --no-cache-dir \
+    jupyter-server ipykernel matplotlib \
+    pandas seaborn plotly scipy scikit-learn
+```
+
+**Node.js**:
+```dockerfile
+FROM docker.io/cloudflare/sandbox:latest
+RUN npm install -g typescript ts-node
+```
+
+**CRITICAL**: `EXPOSE` required for `wrangler dev` port access. Production auto-exposes all ports. 
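The `sleepAfter` option above takes duration strings like `'5m'`, `'1h'`, or `'2d'`. As a mental model of how such strings map to real time, here is a tiny parser; the helper is illustrative only and not part of the SDK (the SDK does its own parsing internally).

```typescript
// Hypothetical helper (NOT an SDK API): interprets sleepAfter-style
// duration strings ('30s', '5m', '1h', '2d') as milliseconds.
function parseSleepAfter(duration: string): number {
  const match = /^(\d+)(s|m|h|d)$/.exec(duration);
  if (!match) throw new Error(`Invalid duration: ${duration}`);
  const value = Number(match[1]);
  const unitMs: Record<string, number> = {
    s: 1_000,
    m: 60_000,
    h: 3_600_000,
    d: 86_400_000,
  };
  return value * unitMs[match[2]];
}

console.log(parseSleepAfter("10m")); // default sleepAfter → 600000 ms
```

Useful, for example, when comparing a configured `sleepAfter` against an expected request cadence before picking a value.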
+ +## CLI Commands + +```bash +# Dev +wrangler dev # Start local dev server +wrangler deploy # Deploy to production +wrangler tail # Monitor logs +wrangler containers list # Check container status +wrangler secret put KEY # Set secret +``` + +## Environment & Secrets + +**wrangler.jsonc**: +```jsonc +{ + "vars": { + "ENVIRONMENT": "production", + "API_URL": "https://api.example.com" + }, + "r2_buckets": [{ + "binding": "DATA_BUCKET", + "bucket_name": "my-data-bucket" + }] +} +``` + +**Usage**: +```typescript +const token = env.GITHUB_TOKEN; // From wrangler secret +await sandbox.exec('git clone ...', { + env: { GIT_TOKEN: token } +}); +``` + +## Preview URL Setup + +**Prerequisites**: +- Custom domain with wildcard DNS: `*.yourdomain.com → worker.yourdomain.com` +- `.workers.dev` domains NOT supported +- `normalizeId: true` in getSandbox +- `proxyToSandbox()` called first in fetch handler + +## Cron Triggers (Pre-warming) + +```jsonc +{ + "triggers": { + "crons": ["*/5 * * * *"] // Every 5 minutes + } +} +``` + +```typescript +export default { + async scheduled(event: ScheduledEvent, env: Env) { + const sandbox = getSandbox(env.Sandbox, 'main'); + await sandbox.exec('echo "keepalive"'); // Wake sandbox + } +}; +``` + +## Logging Configuration + +**wrangler.jsonc**: +```jsonc +{ + "vars": { + "SANDBOX_LOG_LEVEL": "debug", // debug | info | warn | error (default: info) + "SANDBOX_LOG_FORMAT": "pretty" // json | pretty (default: json) + } +} +``` + +**Dev**: `debug` + `pretty`. **Production**: `info`/`warn` + `json`. 
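To make the `SANDBOX_LOG_LEVEL` setting above concrete: a configured level acts as a severity threshold, so `info` suppresses `debug` but passes `warn` and `error`. The sketch below shows the usual severity-ordering convention; it is our illustration, not SDK internals.

```typescript
// Illustrative sketch (not SDK code): how a level threshold like
// SANDBOX_LOG_LEVEL filters messages by severity.
type Level = "debug" | "info" | "warn" | "error";
const severityOrder: Level[] = ["debug", "info", "warn", "error"];

function shouldLog(configured: Level, message: Level): boolean {
  // A message is emitted only if it is at least as severe as the threshold.
  return severityOrder.indexOf(message) >= severityOrder.indexOf(configured);
}

console.log(shouldLog("info", "debug")); // false — below the threshold
console.log(shouldLog("info", "warn"));  // true
```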
+
+## Timeout Environment Overrides
+
+Override default timeouts via environment variables:
+
+```jsonc
+{
+  "vars": {
+    "SANDBOX_INSTANCE_TIMEOUT_MS": "60000", // Override instanceGetTimeoutMS
+    "SANDBOX_PORT_TIMEOUT_MS": "120000"     // Override portReadyTimeoutMS
+  }
+}
+```
diff --git a/cloudflare/references/sandbox/gotchas.md b/cloudflare/references/sandbox/gotchas.md
new file mode 100644
index 0000000..856c503
--- /dev/null
+++ b/cloudflare/references/sandbox/gotchas.md
@@ -0,0 +1,194 @@
+# Gotchas & Best Practices
+
+## Common Errors
+
+### "Container running indefinitely"
+
+**Cause:** `keepAlive: true` without calling `destroy()`
+**Solution:** Always call `destroy()` when done with keepAlive containers
+
+```typescript
+const sandbox = getSandbox(env.Sandbox, 'temp', { keepAlive: true });
+try {
+  const result = await sandbox.exec('python script.py');
+  return result.stdout;
+} finally {
+  await sandbox.destroy(); // REQUIRED to free resources
+}
+```
+
+### "CONTAINER_NOT_READY"
+
+**Cause:** Container still provisioning (first request or after sleep)
+**Solution:** Retry after 2-3s
+
+```typescript
+async function execWithRetry(sandbox, cmd) {
+  for (let i = 0; i < 3; i++) {
+    try {
+      return await sandbox.exec(cmd);
+    } catch (e) {
+      if (e.code === 'CONTAINER_NOT_READY') {
+        await new Promise(r => setTimeout(r, 2000));
+        continue;
+      }
+      throw e;
+    }
+  }
+  // Don't fall through silently: surface the failure after exhausting retries
+  throw new Error('Container not ready after 3 retries');
+}
+```
+
+### "Connection refused: container port not found"
+
+**Cause:** Missing `EXPOSE` directive in Dockerfile
+**Solution:** Add `EXPOSE <port>` to Dockerfile (only needed for `wrangler dev`, production auto-exposes)
+
+### "Preview URLs not working"
+
+**Cause:** Custom domain not configured, wildcard DNS missing, `normalizeId` not set, or `proxyToSandbox()` not called
+**Solution:** Check:
+1. Custom domain configured? (not `.workers.dev`)
+2. Wildcard DNS set up? (`*.domain.com → worker.domain.com`)
+3. `normalizeId: true` in getSandbox?
+4. `proxyToSandbox()` called first in fetch? 
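The first two items in the checklist above can be caught in code before a port is ever exposed. A hypothetical pre-flight helper (our own, not an SDK API) might look like this:

```typescript
// Hypothetical pre-flight check (NOT an SDK API): flags the preview-URL
// misconfigurations that are detectable from code.
function checkPreviewUrlPrereqs(hostname: string, normalizeId: boolean): string[] {
  const problems: string[] = [];
  if (hostname.endsWith(".workers.dev")) {
    problems.push(".workers.dev domains are not supported; use a custom domain");
  }
  if (!normalizeId) {
    problems.push("pass normalizeId: true to getSandbox for preview URLs");
  }
  return problems;
}

// A .workers.dev hostname without normalizeId reports both problems:
console.log(checkPreviewUrlPrereqs("my-worker.workers.dev", false).length); // 2
```

Wildcard DNS and the `proxyToSandbox()` call order still have to be verified by hand.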
+ +### "Slow first request" + +**Cause:** Cold start (container provisioning) +**Solution:** +- Use `sleepAfter` instead of creating new sandboxes +- Pre-warm with cron triggers +- Set `keepAlive: true` for critical sandboxes + +### "File not persisting" + +**Cause:** Files in `/tmp` or other ephemeral paths +**Solution:** Use `/workspace` for persistent files + +### "Bucket mounting doesn't work locally" + +**Cause:** Bucket mounting requires FUSE, not available in `wrangler dev` +**Solution:** Test bucket mounting in production only. Use mock data locally. + +### "Different normalizeId = different sandbox" + +**Cause:** Changing `normalizeId` option changes Durable Object ID +**Solution:** Set `normalizeId` consistently. `normalizeId: true` lowercases the ID. + +```typescript +// These create DIFFERENT sandboxes: +getSandbox(env.Sandbox, 'MyApp'); // DO ID: hash('MyApp') +getSandbox(env.Sandbox, 'MyApp', { normalizeId: true }); // DO ID: hash('myapp') +``` + +### "Code context variables disappeared" + +**Cause:** Container restart clears code context state +**Solution:** Code contexts are ephemeral. Recreate context after container sleep/wake. 
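The `normalizeId` gotcha above comes down to one transformation. Assuming normalization simply lowercases the ID before it is hashed into a Durable Object ID (the behavior the gotcha describes), the effect can be sketched as:

```typescript
// Sketch of the ID behavior described above (assumed: normalizeId: true
// lowercases the ID before it is hashed into a Durable Object ID).
function effectiveSandboxId(id: string, normalizeId = false): string {
  return normalizeId ? id.toLowerCase() : id;
}

// The same logical name resolves to DIFFERENT sandboxes
// depending on the option:
effectiveSandboxId("MyApp");       // "MyApp"
effectiveSandboxId("MyApp", true); // "myapp"
```

Pick one setting per sandbox name and keep it consistent everywhere that name is used.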
+ +## Performance Optimization + +### Sandbox ID Strategy + +```typescript +// ❌ BAD: New sandbox every time (slow) +const sandbox = getSandbox(env.Sandbox, `user-${Date.now()}`); + +// ✅ GOOD: Reuse per user +const sandbox = getSandbox(env.Sandbox, `user-${userId}`); +``` + +### Sleep & Traffic Config + +```typescript +// Cost-optimized +getSandbox(env.Sandbox, 'id', { sleepAfter: '30m', keepAlive: false }); + +// Always-on (requires destroy()) +getSandbox(env.Sandbox, 'id', { keepAlive: true }); +``` + +```jsonc +// High traffic: increase max_instances +{ "containers": [{ "class_name": "Sandbox", "max_instances": 50 }] } +``` + +## Security Best Practices + +### Sandbox Isolation +- Each sandbox = isolated container (filesystem, network, processes) +- Use unique sandbox IDs per tenant for multi-tenant apps +- Sandboxes cannot communicate directly + +### Input Validation + +```typescript +// ❌ DANGEROUS: Command injection +const result = await sandbox.exec(`python3 -c "${userCode}"`); + +// ✅ SAFE: Write to file, execute file +await sandbox.writeFile('/workspace/user_code.py', userCode); +const result = await sandbox.exec('python3 /workspace/user_code.py'); +``` + +### Resource Limits + +```typescript +// Timeout long-running commands +const result = await sandbox.exec('python3 script.py', { + timeout: 30000 // 30 seconds +}); +``` + +### Secrets Management + +```typescript +// ❌ NEVER hardcode secrets +const token = 'ghp_abc123'; + +// ✅ Use environment secrets +const token = env.GITHUB_TOKEN; + +// Pass to sandbox via exec env +const result = await sandbox.exec('git clone ...', { + env: { GIT_TOKEN: token } +}); +``` + +### Preview URL Security +Preview URLs include auto-generated tokens: +``` +https://8080-sandbox-abc123def456.yourdomain.com +``` +Token changes on each expose operation, preventing unauthorized access. 
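The isolation and input-validation advice above also applies to sandbox IDs themselves: in a multi-tenant app the tenant identifier usually comes from the caller, so it should be validated before becoming part of a Durable Object ID. A hedged sketch (the helper and its character policy are ours, not SDK rules):

```typescript
// Hypothetical guard (NOT an SDK API): reject caller-supplied tenant IDs
// before they are used as sandbox IDs. Allowing only [a-z0-9-] keeps IDs
// lowercase (preview-URL friendly) and blocks path-like or injected input.
function safeTenantSandboxId(tenantId: string): string {
  if (!/^[a-z0-9-]{1,64}$/.test(tenantId)) {
    throw new Error(`Invalid tenant id: ${JSON.stringify(tenantId)}`);
  }
  return `tenant-${tenantId}`;
}

console.log(safeTenantSandboxId("acme-1")); // "tenant-acme-1"
```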
+
+## Limits
+
+| Resource | Lite | Standard | Heavy |
+|----------|------|----------|-------|
+| RAM | 256MB | 512MB | 1GB |
+| vCPU | 0.5 | 1 | 2 |
+
+| Operation | Default Timeout | Override |
+|-----------|----------------|----------|
+| Container provisioning | 30s | `SANDBOX_INSTANCE_TIMEOUT_MS` |
+| Port readiness | 90s | `SANDBOX_PORT_TIMEOUT_MS` |
+| exec() | 120s | `timeout` option |
+| sleepAfter | 10m | `sleepAfter` option |
+
+**Performance**:
+- **First deploy**: 2-3 min for container build
+- **Cold start**: 2-3s when waking from sleep
+- **Bucket mounting**: Production only (FUSE not in dev)
+
+## Production Guide
+
+See: https://developers.cloudflare.com/sandbox/guides/production-deployment/
+
+## Resources
+
+- [Official Docs](https://developers.cloudflare.com/sandbox/)
+- [API Reference](https://developers.cloudflare.com/sandbox/api/)
+- [Examples](https://github.com/cloudflare/sandbox-sdk/tree/main/examples)
+- [npm Package](https://www.npmjs.com/package/@cloudflare/sandbox)
+- [Discord Support](https://discord.cloudflare.com)
diff --git a/cloudflare/references/sandbox/patterns.md b/cloudflare/references/sandbox/patterns.md
new file mode 100644
index 0000000..adeb0a0
--- /dev/null
+++ b/cloudflare/references/sandbox/patterns.md
@@ -0,0 +1,201 @@
+# Common Patterns
+
+## AI Code Execution with Code Context
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const { code, variables } = await request.json();
+    const sandbox = getSandbox(env.Sandbox, 'ai-agent');
+
+    // Create context with persistent variables
+    const ctx = await sandbox.createCodeContext({
+      language: 'python',
+      variables: variables || {}
+    });
+
+    // Execute with rich outputs (text, images, HTML)
+    const result = await ctx.runCode(code);
+
+    return Response.json({
+      outputs: result.outputs, // [{ type: 'text'|'image'|'html', content }]
+      error: result.error,
+      success: !result.error
+    });
+  }
+};
+```
+
+## Interactive Dev Environment
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const proxyResponse = await proxyToSandbox(request, env);
+    if (proxyResponse) return proxyResponse;
+
+    const sandbox = getSandbox(env.Sandbox, 'ide', { normalizeId: true });
+
+    if (request.url.endsWith('/start')) {
+      await sandbox.exec('curl -fsSL https://code-server.dev/install.sh | sh');
+      await sandbox.startProcess('code-server --bind-addr 0.0.0.0:8080', {
+        processId: 'vscode'
+      });
+
+      const exposed = await sandbox.exposePort(8080);
+      return Response.json({ url: exposed.url });
+    }
+
+    return new Response('Try /start');
+  }
+};
+```
+
+## WebSocket Real-Time Service
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const proxyResponse = await proxyToSandbox(request, env);
+    if (proxyResponse) return proxyResponse;
+
+    if (request.headers.get('Upgrade')?.toLowerCase() === 'websocket') {
+      const sandbox = getSandbox(env.Sandbox, 'realtime-service');
+      return await sandbox.wsConnect(request, 8080);
+    }
+
+    // Non-WebSocket: expose preview URL
+    const sandbox = getSandbox(env.Sandbox, 'realtime-service');
+    const { url } = await sandbox.exposePort(8080, {
+      hostname: new URL(request.url).hostname
+    });
+    return Response.json({ wsUrl: url.replace('https', 'wss') });
+  }
+};
+```
+
+**Dockerfile**:
+```dockerfile
+FROM docker.io/cloudflare/sandbox:latest
+RUN npm install -g ws
+EXPOSE 8080
+```
+
+## Process Readiness Pattern
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const sandbox = getSandbox(env.Sandbox, 'app-server');
+
+    // Start server
+    const process = await sandbox.startProcess(
+      'node server.js',
+      { processId: 'server' }
+    );
+
+    // Wait for server to be ready
+    await process.waitForPort(8080); // Wait for port listening
+
+    // Now safe to expose
+    const { url } = await sandbox.exposePort(8080);
+    return Response.json({ url });
+  }
+};
+```
+
+## Persistent Data with Bucket 
Mounting
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const sandbox = getSandbox(env.Sandbox, 'data-processor');
+
+    // Mount R2 bucket (production only)
+    await sandbox.mountBucket(env.DATA_BUCKET, '/data', {
+      readOnly: false
+    });
+
+    // Process files in bucket
+    const result = await sandbox.exec('python3 /workspace/process.py', {
+      env: { DATA_DIR: '/data/input' }
+    });
+
+    // Results written to /data/output are persisted in R2
+    return Response.json({ success: result.success });
+  }
+};
+```
+
+## CI/CD Pipeline
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const { repo, branch } = await request.json();
+    const sandbox = getSandbox(env.Sandbox, `ci-${repo}-${Date.now()}`);
+
+    await sandbox.exec(`git clone -b ${branch} ${repo} /workspace/repo`);
+
+    const install = await sandbox.exec('npm install', {
+      cwd: '/workspace/repo',
+      stream: true,
+      onOutput: (stream, data) => console.log(data)
+    });
+
+    if (!install.success) {
+      return Response.json({ success: false, error: 'Install failed' });
+    }
+
+    const test = await sandbox.exec('npm test', { cwd: '/workspace/repo' });
+
+    return Response.json({
+      success: test.success,
+      output: test.stdout,
+      exitCode: test.exitCode
+    });
+  }
+};
```

+
+## Multi-Tenant Pattern
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const userId = request.headers.get('X-User-ID');
+    if (!userId) return new Response('Missing X-User-ID', { status: 400 });
+
+    const sandbox = getSandbox(env.Sandbox, 'multi-tenant');
+
+    // Each user gets isolated session
+    let session;
+    try {
+      session = await sandbox.getSession(userId);
+    } catch {
+      session = await sandbox.createSession({
+        id: userId,
+        cwd: `/workspace/users/${userId}`,
+        env: { USER_ID: userId }
+      });
+    }
+
+    // Never interpolate user code into a shell command (command injection);
+    // write it to a file and execute the file (see Security Best Practices)
+    const code = await request.text();
+    await session.writeFile('main.py', code);
+    const result = await session.exec('python3 main.py');
+
+    return Response.json({ output: result.stdout });
+  }
+};
+```
+
+## Git Operations
+
+```typescript
+// Clone 
repo +await sandbox.exec('git clone https://github.com/user/repo.git /workspace/repo'); + +// Authenticated (use env secrets) +await sandbox.exec(`git clone https://${env.GITHUB_TOKEN}@github.com/user/repo.git`); +``` diff --git a/cloudflare/references/secrets-store/README.md b/cloudflare/references/secrets-store/README.md new file mode 100644 index 0000000..dc709e9 --- /dev/null +++ b/cloudflare/references/secrets-store/README.md @@ -0,0 +1,74 @@ +# Cloudflare Secrets Store + +Account-level encrypted secret management for Workers and AI Gateway. + +## Overview + +**Secrets Store**: Centralized, account-level secrets, reusable across Workers +**Worker Secrets**: Per-Worker secrets (`wrangler secret put`) + +### Architecture + +- **Store**: Container (1/account in beta) +- **Secret**: String ≤1024 bytes +- **Scopes**: Permission boundaries controlling access + - `workers`: For Workers runtime access + - `ai-gateway`: For AI Gateway access + - Secrets must have correct scope for binding to work +- **Bindings**: Connect secrets via `env` object + +**Regional Availability**: Global except China Network (unavailable) + +### Access Control + +- **Super Admin**: Full access +- **Admin**: Create/edit/delete secrets, view metadata +- **Deployer**: View metadata + bindings +- **Reporter**: View metadata only + +API Token permissions: `Account Secrets Store Edit/Read` + +### Limits (Beta) + +- 100 secrets/account +- 1 store/account +- 1024 bytes max/secret +- Production secrets count toward limit + +## When to Use + +**Use Secrets Store when:** +- Multiple Workers share same credential +- Centralized management needed +- Compliance requires audit trail +- Team collaboration on secrets + +**Use Worker Secrets when:** +- Secret unique to one Worker +- Simple single-Worker project +- No cross-Worker sharing needed + +## In This Reference + +### Reading Order by Task + +| Task | Start Here | Then Read | +|------|------------|-----------| +| Quick overview | README.md | - | +| 
First-time setup | README.md → configuration.md | api.md | +| Add secret to Worker | configuration.md | api.md | +| Implement access pattern | api.md | patterns.md | +| Debug errors | gotchas.md | api.md | +| Secret rotation | patterns.md | configuration.md | +| Best practices | gotchas.md | patterns.md | + +### Files + +- [configuration.md](./configuration.md) - Wrangler commands, binding config +- [api.md](./api.md) - Binding API, get/put/delete operations +- [patterns.md](./patterns.md) - Rotation, encryption, access control +- [gotchas.md](./gotchas.md) - Security issues, limits, best practices + +## See Also +- [workers](../workers/) - Worker bindings integration +- [wrangler](../wrangler/) - CLI secret management commands diff --git a/cloudflare/references/secrets-store/api.md b/cloudflare/references/secrets-store/api.md new file mode 100644 index 0000000..2e4e6e2 --- /dev/null +++ b/cloudflare/references/secrets-store/api.md @@ -0,0 +1,200 @@ +# API Reference + +## Binding API + +### Basic Access + +**CRITICAL**: Async `.get()` required - secrets NOT directly available. + +**`.get()` throws on error** - does NOT return null. Always use try/catch. 
+
+```typescript
+interface Env {
+  API_KEY: { get(): Promise<string> };
+}
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const apiKey = await env.API_KEY.get();
+    return fetch("https://api.example.com", {
+      headers: { "Authorization": `Bearer ${apiKey}` }
+    });
+  }
+}
+```
+
+### Error Handling
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    try {
+      const apiKey = await env.API_KEY.get();
+      return fetch("https://api.example.com", {
+        headers: { "Authorization": `Bearer ${apiKey}` }
+      });
+    } catch (error) {
+      console.error("Secret access failed:", error);
+      return new Response("Configuration error", { status: 500 });
+    }
+  }
+}
```

+
+### Multiple Secrets & Patterns
+
+```typescript
+// Parallel fetch
+const [stripeKey, sendgridKey] = await Promise.all([
+  env.STRIPE_KEY.get(),
+  env.SENDGRID_KEY.get()
+]);
+
+// ❌ Missing .get()
+const key = env.API_KEY;
+
+// ❌ Module-level cache
+const CACHED_KEY = await env.API_KEY.get(); // Fails
+
+// ✅ Request-scope cache
+const key = await env.API_KEY.get(); // OK - reuse within request
+```
+
+## REST API
+
+Base: `https://api.cloudflare.com/client/v4`
+
+### Auth
+
+```bash
+curl -H "Authorization: Bearer $CF_TOKEN" \
+  https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/secrets_store/stores
+```
+
+### Store Operations
+
+```bash
+# List
+GET /accounts/{account_id}/secrets_store/stores
+
+# Create
+POST /accounts/{account_id}/secrets_store/stores
+{"name": "my-store"}
+
+# Delete
+DELETE /accounts/{account_id}/secrets_store/stores/{store_id}
+```
+
+### Secret Operations
+
+```bash
+# List
+GET /accounts/{account_id}/secrets_store/stores/{store_id}/secrets
+
+# Create (single)
+POST /accounts/{account_id}/secrets_store/stores/{store_id}/secrets
+{
+  "name": "my_secret",
+  "value": "secret_value",
+  "scopes": ["workers"],
+  "comment": "Optional"
+}
+
+# Create (batch)
+POST /accounts/{account_id}/secrets_store/stores/{store_id}/secrets
+[
+  {"name": 
"secret_one", "value": "val1", "scopes": ["workers"]},
+  {"name": "secret_two", "value": "val2", "scopes": ["workers", "ai-gateway"]}
+]
+
+# Get metadata
+GET /accounts/{account_id}/secrets_store/stores/{store_id}/secrets/{secret_id}
+
+# Update
+PATCH /accounts/{account_id}/secrets_store/stores/{store_id}/secrets/{secret_id}
+{"value": "new_value", "comment": "Updated"}
+
+# Delete (single)
+DELETE /accounts/{account_id}/secrets_store/stores/{store_id}/secrets/{secret_id}
+
+# Delete (batch)
+DELETE /accounts/{account_id}/secrets_store/stores/{store_id}/secrets
+{"secret_ids": ["id-1", "id-2"]}
+
+# Duplicate
+POST /accounts/{account_id}/secrets_store/stores/{store_id}/secrets/{secret_id}/duplicate
+{"name": "new_name"}
+
+# Quota
+GET /accounts/{account_id}/secrets_store/quota
+```
+
+### Responses
+
+Success:
+```json
+{
+  "success": true,
+  "result": {
+    "id": "secret-id-123",
+    "name": "my_secret",
+    "created": "2025-01-11T12:00:00Z",
+    "scopes": ["workers"]
+  }
+}
+```
+
+Error:
+```json
+{
+  "success": false,
+  "errors": [{"code": 10000, "message": "Name exists"}]
+}
+```
+
+## TypeScript Helpers
+
+Official types available via `@cloudflare/workers-types`:
+
+```typescript
+import type { SecretsStoreSecret } from "@cloudflare/workers-types";
+
+interface Env {
+  STRIPE_API_KEY: SecretsStoreSecret;
+  DATABASE_URL: SecretsStoreSecret;
+  WORKER_SECRET: string; // Regular Worker secret (direct access)
+}
+```
+
+Custom helper type:
+
+```typescript
+interface SecretsStoreBinding {
+  get(): Promise<string>;
+}
+
+// Fallback helper
+async function getSecretWithFallback(
+  primary: SecretsStoreBinding,
+  fallback?: SecretsStoreBinding
+): Promise<string> {
+  try {
+    return await primary.get();
+  } catch (error) {
+    if (fallback) return await fallback.get();
+    throw error;
+  }
+}
+
+// Batch helper
+async function getAllSecrets(
+  secrets: Record<string, SecretsStoreBinding>
+): Promise<Record<string, string>> {
+  const entries = await Promise.all(
+    Object.entries(secrets).map(async ([k, v]) => [k, await v.get()])
+  );
+  return 
Object.fromEntries(entries); +} +``` + +See: [configuration.md](./configuration.md), [patterns.md](./patterns.md), [gotchas.md](./gotchas.md) diff --git a/cloudflare/references/secrets-store/configuration.md b/cloudflare/references/secrets-store/configuration.md new file mode 100644 index 0000000..a1e2eee --- /dev/null +++ b/cloudflare/references/secrets-store/configuration.md @@ -0,0 +1,185 @@ +# Configuration + +## Wrangler Config + +### Basic Binding + +**wrangler.jsonc**: + +```jsonc +{ + "secrets_store_secrets": [ + { + "binding": "API_KEY", + "store_id": "abc123", + "secret_name": "stripe_api_key" + } + ] +} +``` + +**wrangler.toml** (alternative): + +```toml +[[secrets_store_secrets]] +binding = "API_KEY" +store_id = "abc123" +secret_name = "stripe_api_key" +``` + +Fields: +- `binding`: Variable name for `env` access +- `store_id`: From `wrangler secrets-store store list` +- `secret_name`: Identifier (no spaces) + +### Environment-Specific + +**wrangler.jsonc**: + +```jsonc +{ + "env": { + "production": { + "secrets_store_secrets": [ + { + "binding": "API_KEY", + "store_id": "prod-store", + "secret_name": "prod_api_key" + } + ] + }, + "staging": { + "secrets_store_secrets": [ + { + "binding": "API_KEY", + "store_id": "staging-store", + "secret_name": "staging_api_key" + } + ] + } + } +} +``` + +**wrangler.toml** (alternative): + +```toml +[env.production] +[[env.production.secrets_store_secrets]] +binding = "API_KEY" +store_id = "prod-store" +secret_name = "prod_api_key" + +[env.staging] +[[env.staging.secrets_store_secrets]] +binding = "API_KEY" +store_id = "staging-store" +secret_name = "staging_api_key" +``` + +## Wrangler Commands + +### Store Management + +```bash +wrangler secrets-store store list +wrangler secrets-store store create my-store --remote +wrangler secrets-store store delete --remote +``` + +### Secret Management (Production) + +```bash +# Create (interactive) +wrangler secrets-store secret create \ + --name MY_SECRET --scopes workers 
--remote + +# Create (piped) +cat secret.txt | wrangler secrets-store secret create \ + --name MY_SECRET --scopes workers --remote + +# List/get/update/delete +wrangler secrets-store secret list --remote +wrangler secrets-store secret get --name MY_SECRET --remote +wrangler secrets-store secret update --name MY_SECRET --new-value "val" --remote +wrangler secrets-store secret delete --name MY_SECRET --remote + +# Duplicate +wrangler secrets-store secret duplicate \ + --name ORIG --new-name COPY --remote +``` + +### Local Development + +**CRITICAL**: Production secrets (`--remote`) NOT accessible in local dev. + +```bash +# Create local-only (no --remote) +wrangler secrets-store secret create --name DEV_KEY --scopes workers + +wrangler dev # Uses local secrets +wrangler deploy # Uses production secrets +``` + +Best practice: Separate names for local/prod: + +```jsonc +{ + "env": { + "development": { + "secrets_store_secrets": [ + { "binding": "API_KEY", "store_id": "store", "secret_name": "dev_api_key" } + ] + }, + "production": { + "secrets_store_secrets": [ + { "binding": "API_KEY", "store_id": "store", "secret_name": "prod_api_key" } + ] + } + } +} +``` + +## Dashboard + +### Creating Secrets + +1. **Secrets Store** → **Create secret** +2. Fill: Name (no spaces), Value, Scope (`Workers`), Comment +3. 
**Save** (value hidden after) + +### Adding Bindings + +**Method 1**: Worker → Settings → Bindings → Add → Secrets Store +**Method 2**: Create secret directly from Worker settings dropdown + +Deploy options: +- **Deploy**: Immediate 100% +- **Save version**: Gradual rollout + +## CI/CD + +### GitHub Actions + +```yaml +- name: Create secret + env: + CLOUDFLARE_API_TOKEN: ${{ secrets.CF_TOKEN }} + run: | + echo "${{ secrets.API_KEY }}" | \ + npx wrangler secrets-store secret create $STORE_ID \ + --name API_KEY --scopes workers --remote + +- name: Deploy + run: npx wrangler deploy +``` + +### GitLab CI + +```yaml +script: + - echo "$API_KEY_VALUE" | npx wrangler secrets-store secret create $STORE_ID --name API_KEY --scopes workers --remote + - npx wrangler deploy +``` + +See: [api.md](./api.md), [patterns.md](./patterns.md) diff --git a/cloudflare/references/secrets-store/gotchas.md b/cloudflare/references/secrets-store/gotchas.md new file mode 100644 index 0000000..08218b4 --- /dev/null +++ b/cloudflare/references/secrets-store/gotchas.md @@ -0,0 +1,97 @@ +# Gotchas + +## Common Errors + +### ".get() Throws on Error" + +**Cause:** Assuming `.get()` returns null on failure instead of throwing +**Solution:** Always wrap `.get()` calls in try/catch blocks to handle errors gracefully + +```typescript +try { + const key = await env.API_KEY.get(); +} catch (error) { + return new Response("Configuration error", { status: 500 }); +} +``` + +### "Logging Secret Values" + +**Cause:** Accidentally logging secret values in console or error messages +**Solution:** Only log metadata (e.g., "Retrieved API_KEY") never the actual secret value + +### "Module-Level Secret Access" + +**Cause:** Attempting to access secrets during module initialization before env is available +**Solution:** Cache secrets in request scope only, not at module level + +### "Secret not found in store" + +**Cause:** Secret name doesn't exist, case mismatch, missing workers scope, or incorrect store_id 
+**Solution:** Verify secret exists with `wrangler secrets-store secret list --remote`, check name matches exactly (case-sensitive), ensure secret has `workers` scope, and verify correct store_id
+
+### "Scope Mismatch"
+
+**Cause:** Secret exists but missing `workers` scope (only has `ai-gateway` scope)
+**Solution:** Update secret scopes: `wrangler secrets-store secret update --name SECRET --scopes workers --remote` or add via Dashboard
+
+### "JSON Parsing Failure"
+
+**Cause:** Storing invalid JSON in secret, then failing to parse during runtime
+**Solution:** Validate JSON before storing:
+
+```bash
+# Validate before storing
+echo '{"key":"value"}' | jq . && \
+  echo '{"key":"value"}' | wrangler secrets-store secret create \
+  --name CONFIG --scopes workers --remote
+```
+
+Runtime parsing with error handling:
+
+```typescript
+try {
+  const configStr = await env.CONFIG.get();
+  const config = JSON.parse(configStr);
+} catch (error) {
+  console.error("Invalid config JSON:", error);
+  return new Response("Invalid configuration", { status: 500 });
+}
+```
+
+### "Cannot access secret in local dev"
+
+**Cause:** Attempting to access production secrets in local development environment
+**Solution:** Create local-only secrets (without `--remote` flag) for development: `wrangler secrets-store secret create --name API_KEY --scopes workers`
+
+### "Property 'get' does not exist"
+
+**Cause:** Missing TypeScript type definition for secret binding
+**Solution:** Define interface with get method: `interface Env { API_KEY: { get(): Promise<string> }; }`
+
+### "Binding already exists"
+
+**Cause:** Duplicate binding in dashboard or conflict between wrangler.jsonc and dashboard
+**Solution:** Remove duplicate from dashboard Settings → Bindings, check for conflicts, or delete old Worker secret with `wrangler secret delete API_KEY`
+
+### "Account secret quota exceeded"
+
+**Cause:** Account has reached 100 secret limit (beta)
+**Solution:** Check quota with `wrangler secrets-store 
quota --remote`, delete unused secrets, consolidate duplicates, or contact Cloudflare for increase + +## Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Max secrets per account | 100 | Beta limit | +| Max stores per account | 1 | Beta limit | +| Max secret size | 1024 bytes | Per secret | +| Local secrets | Don't count toward limit | Only production secrets count | +| Scopes available | `workers`, `ai-gateway` | Must have correct scope for access | +| Scope | Account-level | Can be reused across multiple Workers | +| Access method | `await env.BINDING.get()` | Async only, throws on error | +| Management | Centralized | Via secrets-store commands | +| Local dev | Separate local secrets | Use without `--remote` flag | +| Regional availability | Global except China Network | Unavailable in China Network | + +See: [configuration.md](./configuration.md), [api.md](./api.md), [patterns.md](./patterns.md) diff --git a/cloudflare/references/secrets-store/patterns.md b/cloudflare/references/secrets-store/patterns.md new file mode 100644 index 0000000..afac998 --- /dev/null +++ b/cloudflare/references/secrets-store/patterns.md @@ -0,0 +1,207 @@ +# Patterns + +## Secret Rotation + +Zero-downtime rotation with versioned naming (`api_key_v1`, `api_key_v2`): + +```typescript +interface Env { + PRIMARY_KEY: { get(): Promise }; + FALLBACK_KEY?: { get(): Promise }; +} + +async function fetchWithAuth(url: string, key: string) { + return fetch(url, { headers: { "Authorization": `Bearer ${key}` } }); +} + +export default { + async fetch(request: Request, env: Env): Promise { + let resp = await fetchWithAuth("https://api.example.com", await env.PRIMARY_KEY.get()); + + // Fallback during rotation + if (!resp.ok && env.FALLBACK_KEY) { + resp = await fetchWithAuth("https://api.example.com", await env.FALLBACK_KEY.get()); + } + + return resp; + } +} +``` + +Workflow: Create `api_key_v2` → add fallback binding → deploy → swap primary → deploy → remove `v1` + +## Encryption 
with KV + +```typescript +interface Env { + CACHE: KVNamespace; + ENCRYPTION_KEY: { get(): Promise }; +} + +async function encryptValue(value: string, key: string): Promise { + const enc = new TextEncoder(); + const keyMaterial = await crypto.subtle.importKey( + "raw", enc.encode(key), { name: "AES-GCM" }, false, ["encrypt"] + ); + const iv = crypto.getRandomValues(new Uint8Array(12)); + const encrypted = await crypto.subtle.encrypt( + { name: "AES-GCM", iv }, keyMaterial, enc.encode(value) + ); + + const combined = new Uint8Array(iv.length + encrypted.byteLength); + combined.set(iv); + combined.set(new Uint8Array(encrypted), iv.length); + return btoa(String.fromCharCode(...combined)); +} + +export default { + async fetch(request: Request, env: Env): Promise { + const key = await env.ENCRYPTION_KEY.get(); + const encrypted = await encryptValue("sensitive-data", key); + await env.CACHE.put("user:123:data", encrypted); + return Response.json({ ok: true }); + } +} +``` + +## HMAC Signing + +```typescript +interface Env { + HMAC_SECRET: { get(): Promise }; +} + +async function signRequest(data: string, secret: string): Promise { + const enc = new TextEncoder(); + const key = await crypto.subtle.importKey( + "raw", enc.encode(secret), { name: "HMAC", hash: "SHA-256" }, false, ["sign"] + ); + const sig = await crypto.subtle.sign("HMAC", key, enc.encode(data)); + return btoa(String.fromCharCode(...new Uint8Array(sig))); +} + +export default { + async fetch(request: Request, env: Env): Promise { + const secret = await env.HMAC_SECRET.get(); + const payload = await request.text(); + const signature = await signRequest(payload, secret); + return Response.json({ signature }); + } +} +``` + +## Audit & Monitoring + +```typescript +export default { + async fetch(request: Request, env: Env, ctx: ExecutionContext) { + const startTime = Date.now(); + try { + const apiKey = await env.API_KEY.get(); + const resp = await fetch("https://api.example.com", { + headers: { 
"Authorization": `Bearer ${apiKey}` } + }); + + ctx.waitUntil( + fetch("https://log.example.com/log", { + method: "POST", + body: JSON.stringify({ + event: "secret_used", + secret_name: "API_KEY", + timestamp: new Date().toISOString(), + duration_ms: Date.now() - startTime, + success: resp.ok + }) + }) + ); + return resp; + } catch (error) { + ctx.waitUntil( + fetch("https://log.example.com/log", { + method: "POST", + body: JSON.stringify({ + event: "secret_access_failed", + secret_name: "API_KEY", + error: error instanceof Error ? error.message : "Unknown" + }) + }) + ); + return new Response("Error", { status: 500 }); + } + } +} +``` + +## Migration from Worker Secrets + +Change `env.SECRET` (direct) to `await env.SECRET.get()` (async). + +Steps: +1. Create in Secrets Store: `wrangler secrets-store secret create --name API_KEY --scopes workers --remote` +2. Add binding to `wrangler.jsonc`: `{"binding": "API_KEY", "store_id": "abc123", "secret_name": "api_key"}` +3. Update code: `const key = await env.API_KEY.get();` +4. Test staging, deploy +5. 
Remove old: `wrangler secret delete API_KEY` + +## Sharing Across Workers + +Same secret, different binding names: + +```jsonc +// worker-1: binding="SHARED_DB", secret_name="postgres_url" +// worker-2: binding="DB_CONN", secret_name="postgres_url" +``` + +## JSON Secret Parsing + +Store structured config as JSON secrets: + +```typescript +interface Env { + DB_CONFIG: { get(): Promise }; +} + +interface DbConfig { + host: string; + port: number; + username: string; + password: string; +} + +export default { + async fetch(request: Request, env: Env): Promise { + try { + const configStr = await env.DB_CONFIG.get(); + const config: DbConfig = JSON.parse(configStr); + + // Use parsed config + const dbUrl = `postgres://${config.username}:${config.password}@${config.host}:${config.port}`; + + return Response.json({ connected: true }); + } catch (error) { + if (error instanceof SyntaxError) { + return new Response("Invalid config JSON", { status: 500 }); + } + throw error; + } + } +} +``` + +Store JSON secret: + +```bash +echo '{"host":"db.example.com","port":5432,"username":"app","password":"secret"}' | \ + wrangler secrets-store secret create \ + --name DB_CONFIG --scopes workers --remote +``` + +## Integration + +### Service Bindings + +Auth Worker signs JWT with Secrets Store; API Worker verifies via service binding. + +See: [workers](../workers/) for service binding patterns. + +See: [api.md](./api.md), [gotchas.md](./gotchas.md) diff --git a/cloudflare/references/smart-placement/README.md b/cloudflare/references/smart-placement/README.md new file mode 100644 index 0000000..e8b1041 --- /dev/null +++ b/cloudflare/references/smart-placement/README.md @@ -0,0 +1,138 @@ +# Cloudflare Workers Smart Placement + +Automatic workload placement optimization to minimize latency by running Workers closer to backend infrastructure rather than end users. 
+ +## Core Concept + +Smart Placement automatically analyzes Worker request duration across Cloudflare's global network and intelligently routes requests to optimal data center locations. Instead of defaulting to the location closest to the end user, Smart Placement can forward requests to locations closer to backend infrastructure when this reduces overall request duration. + +### When to Use + +**Enable Smart Placement when:** +- Worker makes multiple round trips to backend services/databases +- Backend infrastructure is geographically concentrated +- Request duration dominated by backend latency rather than network latency from user +- Running backend logic in Workers (APIs, data aggregation, SSR with DB calls) +- Worker uses `fetch` handler (not RPC methods) + +**Do NOT enable for:** +- Workers serving only static content or cached responses +- Workers without significant backend communication +- Pure edge logic (auth checks, redirects, simple transformations) +- Workers without fetch event handlers +- Workers with RPC methods or named entrypoints (only `fetch` handlers are affected) +- Pages/Assets Workers with `run_worker_first = true` (degrades asset serving) + +### Decision Tree + +``` +Does your Worker have a fetch handler? +├─ No → Smart Placement won't work (skip) +└─ Yes + │ + Does it make multiple backend calls (DB/API)? + ├─ No → Don't enable (won't help) + └─ Yes + │ + Is backend geographically concentrated? + ├─ No (globally distributed) → Probably won't help + └─ Yes or uncertain + │ + Does it serve static assets with run_worker_first=true? 
+ ├─ Yes → Don't enable (will hurt performance) + └─ No → Enable Smart Placement + │ + After 15min, check placement_status + ├─ SUCCESS → Monitor metrics + ├─ INSUFFICIENT_INVOCATIONS → Need more traffic + └─ UNSUPPORTED_APPLICATION → Disable (hurting performance) +``` + +### Key Architecture Pattern + +**Recommended:** Split full-stack applications into separate Workers: +``` +User → Frontend Worker (at edge, close to user) + ↓ Service Binding + Backend Worker (Smart Placement enabled, close to DB/API) + ↓ + Database/Backend Service +``` + +This maintains fast, reactive frontends while optimizing backend latency. + +## Quick Start + +```jsonc +// wrangler.jsonc +{ + "placement": { + "mode": "smart" // or "off" to explicitly disable + } +} +``` + +Deploy and wait 15 minutes for analysis. Check status via API or dashboard metrics. + +**To disable:** Set `"mode": "off"` or remove `placement` field entirely (both equivalent). + +## Requirements + +- Wrangler 2.20.0+ +- Analysis time: Up to 15 minutes after enabling +- Traffic requirements: Consistent traffic from multiple global locations +- Available on all Workers plans (Free, Paid, Enterprise) + +## Placement Status Values + +```typescript +type PlacementStatus = + | undefined // Not yet analyzed + | 'SUCCESS' // Successfully optimized + | 'INSUFFICIENT_INVOCATIONS' // Not enough traffic + | 'UNSUPPORTED_APPLICATION'; // Made Worker slower (reverted) +``` + +## CLI Commands + +```bash +# Deploy with Smart Placement +wrangler deploy + +# Check placement status +curl -H "Authorization: Bearer $TOKEN" \ + https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/services/$WORKER_NAME \ + | jq .result.placement_status + +# Monitor +wrangler tail your-worker-name --header cf-placement +``` + +## Reading Order + +**First time?** Start here: +1. This README - understand core concepts and when to use Smart Placement +2. [configuration.md](./configuration.md) - set up wrangler.jsonc and understand limitations +3. 
[patterns.md](./patterns.md) - see practical examples for your use case
+4. [api.md](./api.md) - monitor and verify Smart Placement is working
+5. [gotchas.md](./gotchas.md) - troubleshoot common issues
+
+**Quick lookup:**
+- "Should I enable Smart Placement?" → See "When to Use" above
+- "How do I configure it?" → [configuration.md](./configuration.md)
+- "How do I split frontend/backend?" → [patterns.md](./patterns.md)
+- "Why isn't it working?" → [gotchas.md](./gotchas.md)
+
+## In This Reference
+
+- [configuration.md](./configuration.md) - wrangler.jsonc setup, mode values, validation rules
+- [api.md](./api.md) - Placement Status API, cf-placement header, monitoring
+- [patterns.md](./patterns.md) - Frontend/backend split, database workers, SSR patterns
+- [gotchas.md](./gotchas.md) - Troubleshooting INSUFFICIENT_INVOCATIONS, performance issues
+
+## See Also
+
+- [workers](../workers/) - Worker runtime and fetch handlers
+- [d1](../d1/) - D1 database that benefits from Smart Placement
+- [durable-objects](../durable-objects/) - Durable Objects with backend logic
+- [bindings](../bindings/) - Service bindings for frontend/backend split
diff --git a/cloudflare/references/smart-placement/api.md b/cloudflare/references/smart-placement/api.md
new file mode 100644
index 0000000..6608985
--- /dev/null
+++ b/cloudflare/references/smart-placement/api.md
@@ -0,0 +1,183 @@
+# Smart Placement API
+
+## Placement Status API
+
+Query Worker placement status via Cloudflare API:
+
+```bash
+curl -X GET "https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/workers/services/{WORKER_NAME}" \
+  -H "Authorization: Bearer $API_TOKEN" \
+  -H "Content-Type: application/json"
+```
+
+Response includes `placement_status` field:
+
+```typescript
+type PlacementStatus =
+  | undefined                    // Not yet analyzed
+  | 'SUCCESS'                    // Successfully optimized
+  | 'INSUFFICIENT_INVOCATIONS'   // Not enough traffic
+  | 'UNSUPPORTED_APPLICATION';   // Made Worker slower (reverted)
+```
+
+## Status Meanings
+
+**`undefined` (not present)** +- Worker not yet analyzed +- Always runs at default edge location closest to user + +**`SUCCESS`** +- Analysis complete, Smart Placement active +- Worker runs in optimal location (may be edge or remote) + +**`INSUFFICIENT_INVOCATIONS`** +- Not enough requests to make placement decision +- Requires consistent multi-region traffic +- Always runs at default edge location + +**`UNSUPPORTED_APPLICATION`** (rare, <1% of Workers) +- Smart Placement made Worker slower +- Placement decision reverted +- Always runs at edge location +- Won't be re-analyzed until redeployed + +## cf-placement Header (Beta) + +Smart Placement adds response header indicating routing decision: + +```typescript +// Remote placement (Smart Placement routed request) +"cf-placement: remote-LHR" // Routed to London + +// Local placement (default edge routing) +"cf-placement: local-EWR" // Stayed at Newark edge +``` + +Format: `{placement-type}-{IATA-code}` +- `remote-*` = Smart Placement routed to remote location +- `local-*` = Stayed at default edge location +- IATA code = nearest airport to data center + +**Warning:** Beta feature, may be removed before GA. + +## Detecting Smart Placement in Code + +**Note:** `cf-placement` header is a beta feature and may change or be removed. 
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const placementHeader = request.headers.get('cf-placement');
+
+    if (placementHeader?.startsWith('remote-')) {
+      const location = placementHeader.split('-')[1];
+      console.log(`Smart Placement routed to ${location}`);
+    } else if (placementHeader?.startsWith('local-')) {
+      const location = placementHeader.split('-')[1];
+      console.log(`Running at edge location ${location}`);
+    }
+
+    return new Response('OK');
+  }
+} satisfies ExportedHandler<Env>;
+```
+
+## Request Duration Metrics
+
+Available in Cloudflare dashboard when Smart Placement enabled:
+
+**Workers & Pages → [Your Worker] → Metrics → Request Duration**
+
+Shows histogram comparing:
+- Request duration WITH Smart Placement (99% of traffic)
+- Request duration WITHOUT Smart Placement (1% baseline)
+
+**Request Duration vs Execution Duration:**
+- **Request duration:** Total time from request arrival to response delivery (includes network latency)
+- **Execution duration:** Time Worker code actively executing (excludes network waits)
+
+Use request duration to measure Smart Placement impact.
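When logging duration metrics of your own, it helps to tag each sample with the placement decision so the baseline and optimized populations can be compared. A minimal sketch of a parser for the header's `{placement-type}-{IATA-code}` format described above; the helper name and shape are ours, not part of any Cloudflare API:

```typescript
// Hypothetical helper: split a cf-placement header value such as
// "remote-LHR" into its placement type and IATA colo code.
type Placement = { type: "remote" | "local"; colo: string };

function parsePlacement(header: string | null): Placement | null {
  if (!header) return null;
  const match = header.match(/^(remote|local)-([A-Z]{3})$/);
  if (!match) return null;
  return { type: match[1] as Placement["type"], colo: match[2] };
}

// Example: parsePlacement("remote-LHR") → { type: "remote", colo: "LHR" }
```

Since the header is beta, treat a `null` result (missing or unrecognized value) as "placement unknown" rather than an error.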
+ +### Interpreting Metrics + +| Metric Comparison | Interpretation | Action | +|-------------------|----------------|--------| +| WITH < WITHOUT | Smart Placement helping | Keep enabled | +| WITH ≈ WITHOUT | Neutral impact | Consider disabling to free resources | +| WITH > WITHOUT | Smart Placement hurting | Disable with `mode: "off"` | + +**Why Smart Placement might hurt performance:** +- Worker primarily serves static assets or cached content +- Backend services are globally distributed (no single optimal location) +- Worker has minimal backend communication +- Using Pages with `assets.run_worker_first = true` + +**Typical improvements when Smart Placement helps:** +- 20-50% reduction in request duration for database-heavy Workers +- 30-60% reduction for Workers making multiple backend API calls +- Larger improvements when backend is geographically concentrated + +## Monitoring Commands + +```bash +# Tail Worker logs +wrangler tail your-worker-name + +# Tail with filters +wrangler tail your-worker-name --status error +wrangler tail your-worker-name --header cf-placement + +# Check placement status via API +curl -H "Authorization: Bearer $TOKEN" \ + https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/services/$WORKER_NAME \ + | jq .result.placement_status +``` + +## TypeScript Types + +```typescript +// Placement status returned by API (field may be absent) +type PlacementStatus = + | 'SUCCESS' + | 'INSUFFICIENT_INVOCATIONS' + | 'UNSUPPORTED_APPLICATION' + | undefined; + +// Placement configuration in wrangler.jsonc +type PlacementMode = 'smart' | 'off'; + +interface PlacementConfig { + mode: PlacementMode; + // Legacy fields (deprecated/removed): + // hint?: string; // REMOVED - no longer supported +} + +// Explicit placement (separate feature from Smart Placement) +interface ExplicitPlacementConfig { + region?: string; + host?: string; + hostname?: string; + // Cannot combine with mode field +} + +// Worker metadata from API response +interface 
WorkerMetadata {
+  placement?: PlacementConfig | ExplicitPlacementConfig;
+  placement_status?: PlacementStatus;
+}
+
+// Service Binding for backend Worker
+interface Env {
+  BACKEND_SERVICE: Fetcher; // Service Binding to backend Worker
+  DATABASE: D1Database;
+}
+
+// Example Worker with Service Binding
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    // Forward to backend Worker with Smart Placement enabled
+    const response = await env.BACKEND_SERVICE.fetch(request);
+    return response;
+  }
+} satisfies ExportedHandler<Env>;
+```
diff --git a/cloudflare/references/smart-placement/configuration.md b/cloudflare/references/smart-placement/configuration.md
new file mode 100644
index 0000000..4f506ac
--- /dev/null
+++ b/cloudflare/references/smart-placement/configuration.md
@@ -0,0 +1,196 @@
+# Smart Placement Configuration
+
+## wrangler.jsonc Setup
+
+```jsonc
+{
+  "$schema": "./node_modules/wrangler/config-schema.json",
+  "placement": {
+    "mode": "smart"
+  }
+}
+```
+
+## Placement Mode Values
+
+| Mode | Behavior |
+|------|----------|
+| `"smart"` | Enable Smart Placement - automatic optimization based on traffic analysis |
+| `"off"` | Explicitly disable Smart Placement - always run at edge closest to user |
+| Not specified | Default behavior - run at edge closest to user (same as `"off"`) |
+
+**Note:** Smart Placement vs Explicit Placement are separate features. Smart Placement (`mode: "smart"`) uses automatic analysis. For manual placement control, see explicit placement options (`region`, `host`, `hostname` fields - not covered in this reference).
+ +## Frontend + Backend Split Configuration + +### Frontend Worker (No Smart Placement) + +```jsonc +// frontend-worker/wrangler.jsonc +{ + "name": "frontend", + "main": "frontend-worker.ts", + // No "placement" - runs at edge + "services": [ + { + "binding": "BACKEND", + "service": "backend-api" + } + ] +} +``` + +### Backend Worker (Smart Placement Enabled) + +```jsonc +// backend-api/wrangler.jsonc +{ + "name": "backend-api", + "main": "backend-worker.ts", + "placement": { + "mode": "smart" + }, + "d1_databases": [ + { + "binding": "DATABASE", + "database_id": "xxx" + } + ] +} +``` + +## Requirements & Limitations + +### Requirements +- **Wrangler version:** 2.20.0+ +- **Analysis time:** Up to 15 minutes +- **Traffic requirements:** Consistent multi-location traffic +- **Workers plan:** All plans (Free, Paid, Enterprise) + +### What Smart Placement Affects + +**CRITICAL LIMITATION - Smart Placement ONLY Affects `fetch` Handlers:** + +Smart Placement is fundamentally limited to Workers with default `fetch` handlers. This is a key architectural constraint. 
+
+- ✅ **Affects:** `fetch` event handlers ONLY (the default export's fetch method)
+- ❌ **Does NOT affect:**
+  - RPC methods (Service Bindings with `WorkerEntrypoint` - see example below)
+  - Named entrypoints (exports other than `default`)
+  - Workers without `fetch` handlers
+  - Queue consumers, scheduled handlers, or other event types
+
+**Example - Smart Placement ONLY affects `fetch`:**
+```typescript
+// ✅ Smart Placement affects this:
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    // This runs close to backend when Smart Placement enabled
+    const data = await env.DATABASE.prepare('SELECT * FROM users').all();
+    return Response.json(data);
+  }
+}
+
+// ❌ Smart Placement DOES NOT affect these:
+export class MyRPC extends WorkerEntrypoint<Env> {
+  async myMethod() {
+    // This ALWAYS runs at edge, Smart Placement has NO EFFECT
+    const data = await this.env.DATABASE.prepare('SELECT * FROM users').all();
+    return data;
+  }
+}
+
+// ❌ Non-fetch handlers on the default export are also NOT affected:
+// export default {
+//   async scheduled(event: ScheduledEvent, env: Env, ctx: ExecutionContext) { /* ... */ }
+// }
+```
+
+**Consequence:** If your backend logic uses RPC methods (`WorkerEntrypoint`), Smart Placement cannot optimize those calls. You must use fetch-based patterns for Smart Placement to work.
+
+**Solution:** Convert RPC methods to fetch endpoints, or use a wrapper Worker with `fetch` handler that calls your backend RPC (though this adds latency).
+
+### Baseline Traffic
+Smart Placement automatically routes 1% of requests WITHOUT optimization as baseline for performance comparison.
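The RPC-to-fetch conversion suggested above can be sketched as a small dispatch table whose routes live behind the default export's `fetch` handler. This is a hedged sketch: the route names, `makeDispatcher` helper, and the commented Worker wiring are all hypothetical, not a Cloudflare API:

```typescript
// Hypothetical sketch: expose former RPC methods as URL routes so the
// work happens inside a fetch handler, which Smart Placement can move.
type Handler = () => Promise<unknown>;

function makeDispatcher(routes: Record<string, Handler>) {
  return async (pathname: string): Promise<{ status: number; body: unknown }> => {
    const handler = routes[pathname];
    if (!handler) return { status: 404, body: { error: "unknown method" } };
    return { status: 200, body: await handler() };
  };
}

// Inside a Worker, the default export's fetch handler would wire it up:
// export default {
//   async fetch(request: Request, env: Env): Promise<Response> {
//     const dispatch = makeDispatcher({
//       "/getData": () => env.DATABASE.prepare("SELECT * FROM users").all(),
//     });
//     const { status, body } = await dispatch(new URL(request.url).pathname);
//     return Response.json(body, { status });
//   }
// };
```

Because every route runs inside `fetch`, the whole dispatch layer is eligible for placement optimization, unlike the equivalent `WorkerEntrypoint` methods.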
+
+### Validation Rules
+
+**Mutually exclusive fields:**
+- `mode` cannot be used with explicit placement fields (`region`, `host`, `hostname`)
+- Choose either Smart Placement OR explicit placement, not both
+
+```jsonc
+// ✅ Valid - Smart Placement
+{ "placement": { "mode": "smart" } }
+
+// ✅ Valid - Explicit Placement (different feature)
+{ "placement": { "region": "us-east1" } }
+
+// ❌ Invalid - Cannot combine
+{ "placement": { "mode": "smart", "region": "us-east1" } }
+```
+
+## Dashboard Configuration
+
+**Workers & Pages** → Select Worker → **Settings** → **General** → **Placement: Smart** → Wait 15min → Check **Metrics**
+
+## TypeScript Types
+
+```typescript
+interface Env {
+  BACKEND: Fetcher;
+  DATABASE: D1Database;
+}
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const data = await env.DATABASE.prepare('SELECT * FROM table').all();
+    return Response.json(data);
+  }
+} satisfies ExportedHandler<Env>;
+```
+
+## Cloudflare Pages/Assets Warning
+
+**CRITICAL PERFORMANCE ISSUE:** Enabling Smart Placement with `assets.run_worker_first = true` in Pages projects **severely degrades asset serving performance**. This is one of the most common misconfigurations.
+
+**Why this is bad:**
+- Smart Placement routes ALL requests (including static assets) away from edge to remote locations
+- Static assets (HTML, CSS, JS, images) should ALWAYS be served from edge closest to user
+- Result: 2-5x slower asset loading times, poor user experience
+
+**Solutions (in order of preference):**
+1. **Recommended:** Split into separate Workers (frontend at edge + backend with Smart Placement)
+2. Set `"mode": "off"` to explicitly disable Smart Placement for Pages/Assets Workers
+3. 
Use `assets.run_worker_first = false` (serves assets first, bypasses Worker for static content) + +```jsonc +// ❌ BAD - Degrades asset performance by 2-5x +{ + "name": "pages-app", + "placement": { "mode": "smart" }, + "assets": { "run_worker_first": true } +} + +// ✅ GOOD - Frontend at edge, backend optimized +// frontend-worker/wrangler.jsonc +{ + "name": "frontend", + "assets": { "run_worker_first": true } + // No placement - runs at edge +} + +// backend-worker/wrangler.jsonc +{ + "name": "backend-api", + "placement": { "mode": "smart" }, + "d1_databases": [{ "binding": "DB", "database_id": "xxx" }] +} +``` + +**Key takeaway:** Never enable Smart Placement on Workers that serve static assets with `run_worker_first = true`. + +## Local Development + +Smart Placement does NOT work in `wrangler dev` (local only). Test by deploying: `wrangler deploy --env staging` diff --git a/cloudflare/references/smart-placement/gotchas.md b/cloudflare/references/smart-placement/gotchas.md new file mode 100644 index 0000000..dc94e9b --- /dev/null +++ b/cloudflare/references/smart-placement/gotchas.md @@ -0,0 +1,174 @@ +# Smart Placement Gotchas + +## Common Errors + +### "INSUFFICIENT_INVOCATIONS" + +**Cause:** Not enough traffic for Smart Placement to analyze +**Solution:** +- Ensure Worker receives consistent global traffic +- Wait longer (analysis takes up to 15 minutes) +- Send test traffic from multiple global locations +- Check Worker has fetch event handler + +### "UNSUPPORTED_APPLICATION" + +**Cause:** Smart Placement made Worker slower rather than faster +**Reasons:** +- Worker doesn't make backend calls (runs faster at edge) +- Backend calls are cached (network latency to user more important) +- Backend service has good global distribution +- Worker serves static assets or Pages content + +**Solutions:** +- Disable Smart Placement: `{ "placement": { "mode": "off" } }` +- Review whether Worker actually benefits from Smart Placement +- Consider caching strategy to reduce 
backend calls +- For Pages/Assets Workers, use separate backend Worker with Smart Placement + +### "No request duration metrics" + +**Cause:** Smart Placement not enabled, insufficient time passed, insufficient traffic, or analysis incomplete +**Solution:** +- Ensure Smart Placement enabled in config +- Wait 15+ minutes after deployment +- Verify Worker has sufficient traffic +- Check `placement_status` is `SUCCESS` + +### "cf-placement header missing" + +**Cause:** Smart Placement not enabled, beta feature removed, or Worker not analyzed yet +**Solution:** Verify Smart Placement enabled, wait for analysis (15min), check if beta feature still available + +## Pages/Assets + Smart Placement Performance Degradation + +**Problem:** Static assets load 2-5x slower when Smart Placement enabled with `run_worker_first = true`. + +**Cause:** Smart Placement routes ALL requests (including static assets like HTML, CSS, JS, images) to remote locations. Static content should ALWAYS be served from edge closest to user. + +**Solution:** Split into separate Workers OR disable Smart Placement: +```jsonc +// ❌ BAD - Assets routed away from user +{ + "name": "pages-app", + "placement": { "mode": "smart" }, + "assets": { "run_worker_first": true } +} + +// ✅ GOOD - Assets at edge, API optimized +// frontend/wrangler.jsonc +{ + "name": "frontend", + "assets": { "run_worker_first": true } + // No placement field - stays at edge +} + +// backend/wrangler.jsonc +{ + "name": "backend-api", + "placement": { "mode": "smart" } +} +``` + +This is one of the most common and impactful Smart Placement misconfigurations. + +## Monolithic Full-Stack Worker + +**Problem:** Frontend and backend logic in single Worker with Smart Placement enabled. + +**Cause:** Smart Placement optimizes for backend latency but increases user-facing response time. 
+ +**Solution:** Split into two Workers: +```jsonc +// frontend/wrangler.jsonc +{ + "name": "frontend", + "placement": { "mode": "off" }, // Explicit: stay at edge + "services": [{ "binding": "BACKEND", "service": "backend-api" }] +} + +// backend/wrangler.jsonc +{ + "name": "backend-api", + "placement": { "mode": "smart" }, + "d1_databases": [{ "binding": "DB", "database_id": "xxx" }] +} +``` + +## Local Development Confusion + +**Issue:** Smart Placement doesn't work in `wrangler dev`. + +**Explanation:** Smart Placement only activates in production deployments, not local development. + +**Solution:** Test Smart Placement in staging environment: `wrangler deploy --env staging` + +## Baseline Traffic & Analysis Time + +**Note:** Smart Placement routes 1% of requests WITHOUT optimization for comparison (expected). + +**Analysis time:** Up to 15 minutes. During analysis, Worker runs at edge. Monitor `placement_status`. + +## RPC Methods Not Affected (Critical Limitation) + +**Problem:** Enabled Smart Placement on backend but RPC calls still slow. + +**Cause:** Smart Placement ONLY affects `fetch` handlers. RPC methods (Service Bindings with `WorkerEntrypoint`) are NEVER affected. + +**Why:** RPC bypasses `fetch` handler - Smart Placement can only route `fetch` requests. 
+
+```typescript
+// ❌ RPC - Smart Placement has NO EFFECT
+export class BackendRPC extends WorkerEntrypoint<Env> {
+  async getData() {
+    // ALWAYS runs at edge
+    return await this.env.DATABASE.prepare('SELECT * FROM table').all();
+  }
+}
+
+// ✅ Fetch - Smart Placement WORKS
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    // Runs close to DATABASE when Smart Placement enabled
+    const data = await env.DATABASE.prepare('SELECT * FROM table').all();
+    return Response.json(data);
+  }
+}
+```
+
+## Requirements
+
+- **Wrangler 2.20.0+** required
+- **Consistent multi-region traffic** needed for analysis
+- **Only affects fetch handlers** - RPC methods and named entrypoints not affected
+
+## Limits
+
+| Resource/Limit | Value | Notes |
+|----------------|-------|-------|
+| Analysis time | Up to 15 minutes | After enabling |
+| Baseline traffic | 1% | Routed without optimization |
+| Min Wrangler version | 2.20.0+ | Required |
+| Traffic requirement | Multi-region | Consistent needed |
+
+## Disabling Smart Placement
+
+```jsonc
+{ "placement": { "mode": "off" } }  // Explicit disable
+// OR remove "placement" field entirely (same effect)
+```
+
+Both behaviors identical - Worker runs at edge closest to user.
+
+## When NOT to Use Smart Placement
+
+- Workers serving only static content or cached responses
+- Workers without significant backend communication
+- Pure edge logic (auth checks, redirects, simple transformations)
+- Workers without fetch event handlers
+- Pages/Assets Workers with `run_worker_first = true`
+- Workers using RPC methods instead of fetch handlers
+
+These scenarios won't benefit and may perform worse with Smart Placement.
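For CI or post-deploy checks, the `placement_status` handling described in these gotchas can be folded into one helper that maps each status to the action this reference recommends. The helper name and advice strings are ours; this is not a Wrangler or Cloudflare API:

```typescript
// Hypothetical helper mapping the API's placement_status field to the
// follow-up action recommended in this reference.
type PlacementStatus =
  | "SUCCESS"
  | "INSUFFICIENT_INVOCATIONS"
  | "UNSUPPORTED_APPLICATION"
  | undefined;

function placementAdvice(status: PlacementStatus): string {
  switch (status) {
    case "SUCCESS":
      return "optimized: keep enabled and monitor request-duration metrics";
    case "INSUFFICIENT_INVOCATIONS":
      return "needs more consistent multi-region traffic; wait and re-check";
    case "UNSUPPORTED_APPLICATION":
      return "made the Worker slower: set placement.mode to 'off'";
    default:
      return "not yet analyzed: wait up to 15 minutes after deploy";
  }
}
```

A deploy script could feed this the `placement_status` value returned by the Workers services API and fail the pipeline on `UNSUPPORTED_APPLICATION`.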
diff --git a/cloudflare/references/smart-placement/patterns.md b/cloudflare/references/smart-placement/patterns.md
new file mode 100644
index 0000000..40dc4dd
--- /dev/null
+++ b/cloudflare/references/smart-placement/patterns.md
@@ -0,0 +1,183 @@
+# Smart Placement Patterns
+
+## Backend Worker with Database Access
+
+```typescript
+interface Env {
+  DATABASE: D1Database;
+}
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const userId = new URL(request.url).searchParams.get('id');
+    const user = await env.DATABASE.prepare('SELECT * FROM users WHERE id = ?').bind(userId).first();
+    const orders = await env.DATABASE.prepare('SELECT * FROM orders WHERE user_id = ?').bind(userId).all();
+    return Response.json({ user, orders });
+  }
+};
+```
+
+```jsonc
+{ "placement": { "mode": "smart" }, "d1_databases": [{ "binding": "DATABASE", "database_id": "xxx" }] }
+```
+
+## Frontend + Backend Split (Service Bindings)
+
+**Frontend:** Runs at edge for fast user response
+**Backend:** Smart Placement runs close to database
+
+```typescript
+// Frontend Worker - routes requests to backend
+interface Env {
+  BACKEND: Fetcher; // Service Binding to backend Worker
+}
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    if (new URL(request.url).pathname.startsWith('/api/')) {
+      return env.BACKEND.fetch(request); // Forward to backend
+    }
+    return new Response('Frontend content');
+  }
+};
+
+// Backend Worker - database operations
+interface BackendEnv {
+  DATABASE: D1Database;
+}
+
+export default {
+  async fetch(request: Request, env: BackendEnv): Promise<Response> {
+    const data = await env.DATABASE.prepare('SELECT * FROM table').all();
+    return Response.json(data);
+  }
+};
+```
+
+**CRITICAL:** Use fetch-based Service Bindings (shown above). If using RPC with `WorkerEntrypoint`, Smart Placement will NOT optimize those method calls - only `fetch` handlers are affected.
+
+**RPC vs Fetch - CRITICAL:** Smart Placement ONLY works with fetch-based bindings, NOT RPC.
+
+```typescript
+// ❌ RPC - Smart Placement has NO EFFECT on backend RPC methods
+export class BackendRPC extends WorkerEntrypoint<Env> {
+  async getData() {
+    // ALWAYS runs at edge, Smart Placement ignored
+    return await this.env.DATABASE.prepare('SELECT * FROM table').all();
+  }
+}
+
+// ✅ Fetch - Smart Placement WORKS
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    // Runs close to DATABASE when Smart Placement enabled
+    const data = await env.DATABASE.prepare('SELECT * FROM table').all();
+    return Response.json(data);
+  }
+};
+```
+
+## External API Integration
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    const apiUrl = 'https://api.partner.com';
+    const headers = { 'Authorization': `Bearer ${env.API_KEY}` };
+
+    const [profile, transactions] = await Promise.all([
+      fetch(`${apiUrl}/profile`, { headers }),
+      fetch(`${apiUrl}/transactions`, { headers })
+    ]);
+
+    return Response.json({
+      profile: await profile.json(),
+      transactions: await transactions.json()
+    });
+  }
+};
+```
+
+## SSR / API Gateway Pattern
+
+```typescript
+// Frontend (edge) - auth/routing close to user
+export default {
+  async fetch(request: Request, env: Env) {
+    if (!request.headers.get('Authorization')) {
+      return new Response('Unauthorized', { status: 401 });
+    }
+    const data = await env.BACKEND.fetch(request);
+    // renderPage: app-specific HTML renderer
+    return new Response(renderPage(await data.json()), {
+      headers: { 'Content-Type': 'text/html' }
+    });
+  }
+};
+
+// Backend (Smart Placement) - DB operations close to data
+export default {
+  async fetch(request: Request, env: Env) {
+    const pageId = new URL(request.url).searchParams.get('id');
+    const data = await env.DATABASE.prepare('SELECT * FROM pages WHERE id = ?').bind(pageId).first();
+    return Response.json(data);
+  }
+};
+```
+
+## Durable Objects with Smart Placement
+
+**Key principle:** Smart Placement does NOT control WHERE Durable Objects run. DOs always run in their designated region (based on jurisdiction or smart location hints).
+ +**What Smart Placement DOES affect:** The location of the coordinator Worker's `fetch` handler that makes calls to multiple DOs. + +**Pattern:** Enable Smart Placement on the coordinator Worker that aggregates data from multiple DOs: + +```typescript +// Worker with Smart Placement - aggregates data from multiple DOs +export default { + async fetch(request: Request, env: Env): Promise<Response> { + const userId = new URL(request.url).searchParams.get('user'); + if (!userId) return new Response('Missing user param', { status: 400 }); + + // Get DO stubs + const userDO = env.USER_DO.get(env.USER_DO.idFromName(userId)); + const analyticsID = env.ANALYTICS_DO.idFromName(`analytics-${userId}`); + const analyticsDO = env.ANALYTICS_DO.get(analyticsID); + + // Fetch from multiple DOs + const [userData, analyticsData] = await Promise.all([ + userDO.fetch(new Request('https://do/profile')), + analyticsDO.fetch(new Request('https://do/stats')) + ]); + + return Response.json({ + user: await userData.json(), + analytics: await analyticsData.json() + }); + } +}; +``` + +```jsonc +// wrangler.jsonc +{ + "placement": { "mode": "smart" }, + "durable_objects": { + "bindings": [ + { "name": "USER_DO", "class_name": "UserDO" }, + { "name": "ANALYTICS_DO", "class_name": "AnalyticsDO" } + ] + } +} +``` + +**When this helps:** +- Worker's `fetch` handler runs closer to DO regions, reducing network latency for multiple DO calls +- Most beneficial when DOs are geographically concentrated or in specific jurisdictions +- Helps when coordinator makes many sequential or parallel DO calls + +**When this DOESN'T help:** +- DOs are globally distributed (no single optimal Worker location) +- Worker only calls a single DO +- DO calls are infrequent or cached + +## Best Practices + +- Split full-stack apps: frontend at edge, backend with Smart Placement +- Use fetch-based Service Bindings (not RPC) +- Enable for backend logic: APIs, data aggregation, DB operations +- Don't enable for: static content, edge logic, RPC methods, Pages with `run_worker_first` +- Wait 15+ min for analysis, 
verify `placement_status = SUCCESS` diff --git a/cloudflare/references/snippets/README.md b/cloudflare/references/snippets/README.md new file mode 100644 index 0000000..a09a1c4 --- /dev/null +++ b/cloudflare/references/snippets/README.md @@ -0,0 +1,68 @@ +# Cloudflare Snippets Skill Reference + +## Description +Expert guidance for **Cloudflare Snippets ONLY** - a lightweight JavaScript-based edge logic platform for modifying HTTP requests and responses. Snippets run as part of the Ruleset Engine and are included at no additional cost on paid plans (Pro, Business, Enterprise). + +## What Are Snippets? +Snippets are JavaScript functions executed at the edge as part of Cloudflare's Ruleset Engine. Key characteristics: +- **Execution time**: 5ms CPU limit per request +- **Size limit**: 32KB per snippet +- **Runtime**: V8 isolate (subset of Workers APIs) +- **Subrequests**: 2-5 fetch calls depending on plan +- **Cost**: Included with Pro/Business/Enterprise plans + +## Snippets vs Workers Decision Matrix + +| Factor | Choose Snippets If... | Choose Workers If... | +|--------|----------------------|---------------------| +| **Complexity** | Simple request/response modifications | Complex business logic, routing, middleware | +| **Execution time** | <5ms sufficient | Need >5ms or variable time | +| **Subrequests** | 2-5 fetch calls sufficient | Need >5 subrequests or complex orchestration | +| **Code size** | <32KB sufficient | Need >32KB or npm dependencies | +| **Cost** | Want zero additional cost | Can afford $5/mo + usage | +| **APIs** | Need basic fetch, headers, URL | Need KV, D1, R2, Durable Objects, cron triggers | +| **Deployment** | Need rule-based triggers | Want custom routing logic | + +**Rule of thumb**: Use Snippets for modifications, Workers for applications. + +## Execution Model +1. Request arrives at Cloudflare edge +2. Ruleset Engine evaluates snippet rules (filter expressions) +3. If rule matches, snippet executes within 5ms limit +4. 
Modified request/response continues through pipeline +5. Response returned to client + +Snippets execute synchronously in the request path - performance is critical. + +## Reading Order +1. **[configuration.md](configuration.md)** - Start here: setup, deployment methods (Dashboard/API/Terraform) +2. **[api.md](api.md)** - Core APIs: Request, Response, headers, `request.cf` properties +3. **[patterns.md](patterns.md)** - Real-world examples: geo-routing, A/B tests, security headers +4. **[gotchas.md](gotchas.md)** - Troubleshooting: common errors, performance tips, API limitations + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, deployment, configuration +- **[api.md](api.md)** - API endpoints, methods, interfaces +- **[patterns.md](patterns.md)** - Common patterns, use cases, examples +- **[gotchas.md](gotchas.md)** - Troubleshooting, best practices, limitations + +## Quick Start +```javascript +// Snippet: Add security headers +export default { + async fetch(request) { + const response = await fetch(request); + const newResponse = new Response(response.body, response); + newResponse.headers.set("X-Frame-Options", "DENY"); + newResponse.headers.set("X-Content-Type-Options", "nosniff"); + return newResponse; + } +} +``` + +Deploy via Dashboard (Rules → Snippets) or API/Terraform. See configuration.md for details. + +## See Also + +- [Cloudflare Docs](https://developers.cloudflare.com/rules/snippets/) diff --git a/cloudflare/references/snippets/api.md b/cloudflare/references/snippets/api.md new file mode 100644 index 0000000..76a5a4b --- /dev/null +++ b/cloudflare/references/snippets/api.md @@ -0,0 +1,198 @@ +# Snippets API Reference + +## Request Object + +### HTTP Properties +```javascript +request.method // GET, POST, PUT, DELETE, etc. 
+request.url // Full URL string +request.headers // Headers object +request.body // ReadableStream (for POST/PUT) +request.cf // Cloudflare properties (see below) +``` + +### URL Operations +```javascript +const url = new URL(request.url); +url.hostname // "example.com" +url.pathname // "/path/to/page" +url.search // "?query=value" +url.searchParams.get("q") // "value" +url.searchParams.set("q", "new") +url.searchParams.delete("q") +``` + +### Header Operations +```javascript +// Read headers +request.headers.get("User-Agent") +request.headers.has("Authorization") +request.headers.getSetCookie() // Get all Set-Cookie headers + +// Modify headers (create new request) +const modifiedRequest = new Request(request); +modifiedRequest.headers.set("X-Custom", "value") +modifiedRequest.headers.delete("X-Remove") +``` + +### Cloudflare Properties (`request.cf`) +Access Cloudflare-specific metadata about the request: + +```javascript +// Geolocation +request.cf.city // "San Francisco" +request.cf.continent // "NA" +request.cf.country // "US" +request.cf.region // "California" or "CA" +request.cf.regionCode // "CA" +request.cf.postalCode // "94102" +request.cf.latitude // "37.7749" +request.cf.longitude // "-122.4194" +request.cf.timezone // "America/Los_Angeles" +request.cf.metroCode // "807" (DMA code) + +// Network +request.cf.colo // "SFO" (airport code of datacenter) +request.cf.asn // 13335 (ASN number) +request.cf.asOrganization // "Cloudflare, Inc." + +// Bot Management (if enabled) +request.cf.botManagement.score // 1-99 (1=bot, 99=human) +request.cf.botManagement.verified_bot // true/false +request.cf.botManagement.static_resource // true/false + +// TLS/HTTP version +request.cf.tlsVersion // "TLSv1.3" +request.cf.tlsCipher // "AEAD-AES128-GCM-SHA256" +request.cf.httpProtocol // "HTTP/2" + +// Request metadata +request.cf.requestPriority // "weight=192;exclusive=0" +``` + +**Use cases**: Geo-routing, bot detection, security decisions, analytics. 
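The `request.cf` geolocation fields above are what geo-routing decisions key off. A minimal, runnable sketch of such a decision as a plain function — the hostnames and country grouping here are invented for illustration, not Cloudflare defaults:

```javascript
// Choose an origin hostname from request.cf-style geo fields.
// Plain object in, string out, so it also runs outside the Snippets runtime.
function pickOrigin(cf) {
  const euCountries = new Set(["GB", "DE", "FR", "NL", "ES", "IT"]);
  if (euCountries.has(cf.country)) return "eu.example.com";
  if (cf.continent === "NA") return "us.example.com";
  return "global.example.com";
}

console.log(pickOrigin({ country: "DE", continent: "EU" })); // eu.example.com
console.log(pickOrigin({ country: "JP", continent: "AS" })); // global.example.com
```

Inside a snippet you would call it as `pickOrigin(request.cf)` and rewrite `url.hostname` with the result.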
+ +## Response Object + +### Response Constructors +```javascript +// Plain text +new Response("Hello", { status: 200 }) + +// JSON +Response.json({ key: "value" }, { status: 200 }) + +// HTML +new Response("
<h1>Hi</h1>
", { + status: 200, + headers: { "Content-Type": "text/html" } +}) + +// Redirect +Response.redirect("https://example.com", 301) // or 302 + +// Stream (pass through) +new Response(response.body, response) +``` + +### Response Headers +```javascript +// Create modified response +const newResponse = new Response(response.body, response); + +// Set/modify headers +newResponse.headers.set("X-Custom", "value") +newResponse.headers.append("Set-Cookie", "session=abc; Path=/") +newResponse.headers.delete("Server") + +// Common headers +newResponse.headers.set("Cache-Control", "public, max-age=3600") +newResponse.headers.set("Content-Type", "application/json") +``` + +### Response Properties +```javascript +response.status // 200, 404, 500, etc. +response.statusText // "OK", "Not Found", etc. +response.headers // Headers object +response.body // ReadableStream +response.ok // true if status 200-299 +response.redirected // true if redirected +``` + +## REST API Operations + +### List Snippets +```bash +GET /zones/{zone_id}/snippets +``` + +### Get Snippet +```bash +GET /zones/{zone_id}/snippets/{snippet_name} +``` + +### Create/Update Snippet +```bash +PUT /zones/{zone_id}/snippets/{snippet_name} +Content-Type: multipart/form-data + +files=@snippet.js +metadata={"main_module":"snippet.js"} +``` + +### Delete Snippet +```bash +DELETE /zones/{zone_id}/snippets/{snippet_name} +``` + +### List Snippet Rules +```bash +GET /zones/{zone_id}/rulesets/phases/http_request_snippets/entrypoint +``` + +### Update Snippet Rules +```bash +PUT /zones/{zone_id}/snippets/snippet_rules +Content-Type: application/json + +{ + "rules": [{ + "description": "Apply snippet", + "enabled": true, + "expression": "http.host eq \"example.com\"", + "snippet_name": "my_snippet" + }] +} +``` + +## Available APIs in Snippets + +### ✅ Supported +- `fetch()` - HTTP requests (2-5 subrequests per plan) +- `Request` / `Response` - Standard Web APIs +- `URL` / `URLSearchParams` - URL manipulation +- `Headers` - 
Header manipulation +- `TextEncoder` / `TextDecoder` - Text encoding +- `crypto.subtle` - Web Crypto API (hashing, signing) +- `crypto.randomUUID()` - UUID generation + +### ❌ Not Supported in Snippets +- `caches` API - Not available (use Workers) +- `KV`, `D1`, `R2` - Storage APIs (use Workers) +- `Durable Objects` - Stateful objects (use Workers) +- `WebSocket` - WebSocket upgrades (use Workers) +- `HTMLRewriter` - HTML parsing (use Workers) +- `import` statements - No module imports +- `addEventListener` - Use the `export default { async fetch() {} }` pattern + +## Snippet Structure +```javascript +export default { + async fetch(request) { + // Your logic here + const response = await fetch(request); + return response; // or modified response + } +} +``` \ No newline at end of file diff --git a/cloudflare/references/snippets/configuration.md b/cloudflare/references/snippets/configuration.md new file mode 100644 index 0000000..b5bea0f 100644 --- /dev/null +++ b/cloudflare/references/snippets/configuration.md @@ -0,0 +1,227 @@ +# Snippets Configuration Guide + +## Configuration Methods + +### 1. Dashboard (GUI) +**Best for**: Quick tests, single snippets, visual rule building + +``` +1. Go to zone → Rules → Snippets +2. Click "Create Snippet" or select template +3. Enter snippet name (a-z, 0-9, _ only, cannot change later) +4. Write JavaScript code (32KB max) +5. Configure snippet rule: + - Expression Builder (visual) or Expression Editor (text) + - Use Ruleset Engine filter expressions +6. Test with Preview/HTTP tabs +7. Deploy or Save as Draft +``` + +### 2. 
REST API +**Best for**: CI/CD, automation, programmatic management + +```bash +# Create/update snippet +curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/snippets/$SNIPPET_NAME" \ + --request PUT \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --form "files=@example.js" \ + --form "metadata={\"main_module\": \"example.js\"}" + +# Create snippet rule +curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/snippets/snippet_rules" \ + --request PUT \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --header "Content-Type: application/json" \ + --data '{ + "rules": [ + { + "description": "Trigger snippet on /api paths", + "enabled": true, + "expression": "starts_with(http.request.uri.path, \"/api/\")", + "snippet_name": "api_snippet" + } + ] + }' + +# List snippets +curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/snippets" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" + +# Delete snippet +curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/snippets/$SNIPPET_NAME" \ + --request DELETE \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" +``` + +### 3. 
Terraform +**Best for**: Infrastructure-as-code, multi-zone deployments + +```hcl +# Configure Terraform provider +terraform { + required_providers { + cloudflare = { + source = "cloudflare/cloudflare" + version = "~> 4.0" + } + } +} + +provider "cloudflare" { + api_token = var.cloudflare_api_token +} + +# Create snippet +resource "cloudflare_snippet" "security_headers" { + zone_id = var.zone_id + name = "security_headers" + + main_module = "security_headers.js" + files { + name = "security_headers.js" + content = file("${path.module}/snippets/security_headers.js") + } +} + +# Create snippet rule +resource "cloudflare_snippet_rules" "security_rules" { + zone_id = var.zone_id + + rules { + description = "Apply security headers to all requests" + enabled = true + expression = "true" + snippet_name = cloudflare_snippet.security_headers.name + } +} +``` + +### 4. Pulumi +**Best for**: Multi-cloud IaC, TypeScript/Python/Go workflows + +```typescript +import * as cloudflare from "@pulumi/cloudflare"; +import * as fs from "fs"; + +// Create snippet +const securitySnippet = new cloudflare.Snippet("security-headers", { + zoneId: zoneId, + name: "security_headers", + mainModule: "security_headers.js", + files: [{ + name: "security_headers.js", + content: fs.readFileSync("./snippets/security_headers.js", "utf8"), + }], +}); + +// Create snippet rule +const snippetRule = new cloudflare.SnippetRules("security-rules", { + zoneId: zoneId, + rules: [{ + description: "Apply security headers", + enabled: true, + expression: "true", + snippetName: securitySnippet.name, + }], +}); +``` + +## Filter Expressions + +Snippets use Cloudflare's Ruleset Engine expression language to determine when to execute. 
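Conceptually, each rule pairs a boolean expression with a snippet name, and the engine runs a snippet only when its expression matches the request. A toy version of that dispatch step in plain JavaScript — the predicates stand in for the real expression grammar, the rule names are illustrative, and the real engine can run several matching snippets in order rather than only the first:

```javascript
// Toy stand-in for snippet rule dispatch: the first rule whose predicate
// matches the request-like object wins. NOT the real Ruleset Engine.
const rules = [
  { snippetName: "api_snippet", matches: (r) => r.host === "example.com" && r.path.startsWith("/api/") },
  { snippetName: "geo_snippet", matches: (r) => r.country === "US" || r.country === "CA" },
];

function selectSnippet(req) {
  const rule = rules.find((r) => r.matches(req));
  return rule ? rule.snippetName : null;
}

console.log(selectSnippet({ host: "example.com", path: "/api/users", country: "FR" })); // api_snippet
console.log(selectSnippet({ host: "example.com", path: "/about", country: "US" }));     // geo_snippet
console.log(selectSnippet({ host: "example.com", path: "/about", country: "FR" }));     // null
```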
+ +### Common Expression Patterns + +```javascript +// Host matching +http.host eq "example.com" +http.host in {"example.com" "www.example.com"} +http.host contains "example" + +// Path matching +http.request.uri.path eq "/api/users" +starts_with(http.request.uri.path, "/api/") +ends_with(http.request.uri.path, ".json") +matches(http.request.uri.path, "^/api/v[0-9]+/") + +// Query parameters +http.request.uri.query contains "debug=true" + +// Headers +http.user_agent contains "Mobile" +any(http.request.headers["accept-language"][*] eq "en-US") + +// Cookies +http.cookie contains "session=" + +// Geolocation +ip.geoip.country eq "US" +ip.geoip.continent eq "EU" + +// Bot detection (requires Bot Management) +cf.bot_management.score lt 30 + +// Method +http.request.method eq "POST" +http.request.method in {"POST" "PUT" "PATCH"} + +// Combine with logical operators +http.host eq "example.com" and starts_with(http.request.uri.path, "/api/") +ip.geoip.country eq "US" or ip.geoip.country eq "CA" +not http.user_agent contains "bot" +``` + +### Expression Functions and Operators + +| Function / Operator | Example | Description | +|----------|---------|-------------| +| `starts_with()` | `starts_with(http.request.uri.path, "/api/")` | Check prefix | +| `ends_with()` | `ends_with(http.request.uri.path, ".json")` | Check suffix | +| `contains` (operator) | `http.user_agent contains "Mobile"` | Check substring | +| `matches()` | `matches(http.request.uri.path, "^/api/")` | Regex match | +| `lower()` | `lower(http.host) eq "example.com"` | Convert to lowercase | +| `upper()` | `upper(http.request.headers["x-api-key"][0])` | Convert to uppercase | +| `len()` | `len(http.request.uri.path) gt 100` | String length | + +## Deployment Workflow + +### Development +1. Write snippet code locally +2. Test syntax with `node snippet.js` or TypeScript compiler +3. Deploy to Dashboard or use API with `Save as Draft` +4. Test with Preview/HTTP tabs in Dashboard +5. Enable rule when ready + +### Production +1. 
Store snippet code in version control +2. Use Terraform/Pulumi for reproducible deployments +3. Deploy to staging zone first +4. Test with real traffic (use low-traffic subdomain) +5. Apply to production zone +6. Monitor with Analytics/Logpush + +## Limits & Requirements + +| Resource | Limit | Notes | +|----------|-------|-------| +| Snippet size | 32 KB | Per snippet, compressed | +| Snippet name | 64 chars | `a-z`, `0-9`, `_` only, immutable | +| Snippets per zone | 20 | Soft limit, contact support for more | +| Rules per zone | 20 | One rule per snippet typical | +| Expression length | 4096 chars | Per rule expression | + +## Authentication + +### API Token (Recommended) +```bash +# Create token at: https://dash.cloudflare.com/profile/api-tokens +# Required permissions: Zone.Snippets:Edit, Zone.Rules:Edit +export CLOUDFLARE_API_TOKEN="your_token_here" +``` + +### API Key (Legacy) +```bash +export CLOUDFLARE_EMAIL="your@email.com" +export CLOUDFLARE_API_KEY="your_global_api_key" +``` \ No newline at end of file diff --git a/cloudflare/references/snippets/gotchas.md b/cloudflare/references/snippets/gotchas.md new file mode 100644 index 0000000..832077e --- /dev/null +++ b/cloudflare/references/snippets/gotchas.md @@ -0,0 +1,86 @@ +# Gotchas & Best Practices + +## Common Errors + +### 1000: "Snippet execution failed" +Runtime error or syntax error. Wrap code in try/catch: +```javascript +try { return await fetch(request); } +catch (error) { return new Response(`Error: ${error.message}`, { status: 500 }); } +``` + +### 1100: "Exceeded execution limit" +Code takes >5ms CPU. Simplify logic or move to Workers. + +### 1201: "Multiple origin fetches" +Call `fetch(request)` exactly once: +```javascript +// ❌ Multiple origin fetches +const r1 = await fetch(request); const r2 = await fetch(request); +// ✅ Single fetch, reuse response +const response = await fetch(request); +``` + +### 1202: "Subrequest limit exceeded" +Pro: 2 subrequests, Business/Enterprise: 5. 
Reduce fetch calls. + +### "Cannot set property on immutable object" +Clone before modifying: +```javascript +const modifiedRequest = new Request(request); +modifiedRequest.headers.set("X-Custom", "value"); +``` + +### "caches is not defined" +Cache API NOT available in Snippets. Use Workers. + +### "Module not found" +Snippets don't support `import`. Use inline code or Workers. + +## Best Practices + +### Performance +- Keep code <10KB (32KB limit) +- Optimize for 5ms CPU +- Clone only when modifying +- Minimize subrequests + +### Security +- Validate all inputs +- Use Web Crypto API for hashing +- Sanitize headers before origin +- Don't log secrets + +### Debugging +```javascript +newResponse.headers.set("X-Debug-Country", request.cf.country); +``` +```bash +curl -H "X-Test: true" https://example.com -v +``` + +## Available APIs + +**✅ Available:** `fetch()`, `Request`, `Response`, `Headers`, `URL`, `crypto.subtle`, `crypto.randomUUID()`, `atob()`/`btoa()`, `JSON` + +**❌ NOT Available:** `caches`, `KV`, `D1`, `R2`, `Durable Objects`, `WebSocket`, `HTMLRewriter`, `import`, Node.js APIs + +## Limits + +| Resource | Limit | +|----------|-------| +| Snippet size | 32KB | +| Execution time | 5ms CPU | +| Subrequests (Pro/Biz) | 2/5 | +| Snippets/zone | 20 | + +## Performance Benchmarks + +| Operation | Time | +|-----------|------| +| Header set | <0.1ms | +| URL parsing | <0.2ms | +| fetch() | 1-3ms | +| SHA-256 | 0.5-1ms | + +**Migrate to Workers when:** >5ms needed, >5 subrequests, need storage (KV/D1/R2), need npm packages, >32KB code diff --git a/cloudflare/references/snippets/patterns.md b/cloudflare/references/snippets/patterns.md new file mode 100644 index 0000000..a60c420 --- /dev/null +++ b/cloudflare/references/snippets/patterns.md @@ -0,0 +1,135 @@ +# Snippets Patterns + +## Security Headers + +```javascript +export default { + async fetch(request) { + const response = await fetch(request); + const newResponse = new Response(response.body, response); + 
newResponse.headers.set("X-Frame-Options", "DENY"); + newResponse.headers.set("X-Content-Type-Options", "nosniff"); + newResponse.headers.delete("X-Powered-By"); + return newResponse; + } +} +``` + +**Rule:** `true` (all requests) + +## Geo-Based Routing + +```javascript +export default { + async fetch(request) { + const country = request.cf.country; + if (["GB", "DE", "FR"].includes(country)) { + const url = new URL(request.url); + url.hostname = url.hostname.replace(".com", ".eu"); + return Response.redirect(url.toString(), 302); + } + return fetch(request); + } +} +``` + +## A/B Testing + +```javascript +export default { + async fetch(request) { + const cookies = request.headers.get("Cookie") || ""; + let variant = cookies.match(/ab_test=([AB])/)?.[1] || (Math.random() < 0.5 ? "A" : "B"); + + const req = new Request(request); + req.headers.set("X-Variant", variant); + const response = await fetch(req); + + if (!cookies.includes("ab_test=")) { + const newResponse = new Response(response.body, response); + newResponse.headers.append("Set-Cookie", `ab_test=${variant}; Path=/; Secure`); + return newResponse; + } + return response; + } +} +``` + +## Bot Detection + +```javascript +export default { + async fetch(request) { + const botScore = request.cf.botManagement?.score; + if (botScore && botScore < 30) return new Response("Denied", { status: 403 }); + return fetch(request); + } +} +``` + +**Requires:** Bot Management plan + +## API Auth Header Injection + +```javascript +export default { + async fetch(request) { + if (new URL(request.url).pathname.startsWith("/api/")) { + const req = new Request(request); + req.headers.set("X-Internal-Auth", "secret_token"); + req.headers.delete("Authorization"); + return fetch(req); + } + return fetch(request); + } +} +``` + +## CORS Headers + +```javascript +export default { + async fetch(request) { + if (request.method === "OPTIONS") { + return new Response(null, { + status: 204, + headers: { + "Access-Control-Allow-Origin": 
"*", + "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE", + "Access-Control-Allow-Headers": "Content-Type, Authorization" + } + }); + } + const response = await fetch(request); + const newResponse = new Response(response.body, response); + newResponse.headers.set("Access-Control-Allow-Origin", "*"); + return newResponse; + } +} +``` + +## Maintenance Mode + +```javascript +export default { + async fetch(request) { + if (request.headers.get("X-Bypass-Token") === "admin") return fetch(request); + return new Response("
<h1>Maintenance</h1>
", { + status: 503, + headers: { "Content-Type": "text/html", "Retry-After": "3600" } + }); + } +} +``` + +## Pattern Selection + +| Pattern | Complexity | Use Case | +|---------|-----------|----------| +| Security Headers | Low | All sites | +| Geo-Routing | Low | Regional content | +| A/B Testing | Medium | Experiments | +| Bot Detection | Medium | Requires Bot Management | +| API Auth | Low | Backend protection | +| CORS | Low | API endpoints | +| Maintenance | Low | Deployments | diff --git a/cloudflare/references/spectrum/README.md b/cloudflare/references/spectrum/README.md new file mode 100644 index 0000000..f78350d --- /dev/null +++ b/cloudflare/references/spectrum/README.md @@ -0,0 +1,52 @@ +# Cloudflare Spectrum Skill Reference + +## Overview + +Cloudflare Spectrum provides security and acceleration for ANY TCP or UDP-based application. It's a global Layer 4 (L4) reverse proxy running on Cloudflare's edge nodes that routes MQTT, email, file transfer, version control, games, and more through Cloudflare to mask origins and protect from DDoS attacks. + +**When to Use Spectrum**: When your protocol isn't HTTP/HTTPS (use Cloudflare proxy for HTTP). Spectrum handles everything else: SSH, gaming, databases, MQTT, SMTP, RDP, custom protocols. + +## Plan Capabilities + +| Capability | Pro/Business | Enterprise | +|------------|--------------|------------| +| TCP protocols | Selected ports only | All ports (1-65535) | +| UDP protocols | Selected ports only | All ports (1-65535) | +| Port ranges | ❌ | ✅ | +| Argo Smart Routing | ✅ | ✅ | +| IP Firewall | ✅ | ✅ | +| Load balancer origins | ✅ | ✅ | + +## Decision Tree + +**What are you trying to do?** + +1. **Create/manage Spectrum app** + - Via Dashboard → See [Cloudflare Dashboard](https://dash.cloudflare.com) + - Via API → See [api.md](api.md) - REST endpoints + - Via SDK → See [api.md](api.md) - TypeScript/Python/Go examples + - Via IaC → See [configuration.md](configuration.md) - Terraform/Pulumi + +2. 
**Protect specific protocol** + - SSH → See [patterns.md](patterns.md#1-ssh-server-protection) + - Gaming (Minecraft, etc) → See [patterns.md](patterns.md#2-game-server) + - MQTT/IoT → See [patterns.md](patterns.md#3-mqtt-broker) + - SMTP/Email → See [patterns.md](patterns.md#4-smtp-relay) + - Database → See [patterns.md](patterns.md#5-database-proxy) + - RDP → See [patterns.md](patterns.md#6-rdp-remote-desktop) + +3. **Choose origin type** + - Direct IP (single server) → See [configuration.md](configuration.md#direct-ip-origin) + - CNAME (hostname) → See [configuration.md](configuration.md#cname-origin) + - Load balancer (HA/failover) → See [configuration.md](configuration.md#load-balancer-origin) + +## Reading Order + +1. Start with [patterns.md](patterns.md) for your specific protocol +2. Then [configuration.md](configuration.md) for your origin type +3. Check [gotchas.md](gotchas.md) before going to production +4. Use [api.md](api.md) for programmatic access + +## See Also + +- [Cloudflare Docs](https://developers.cloudflare.com/spectrum/) diff --git a/cloudflare/references/spectrum/api.md b/cloudflare/references/spectrum/api.md new file mode 100644 index 0000000..645fe2e --- /dev/null +++ b/cloudflare/references/spectrum/api.md @@ -0,0 +1,181 @@ +## REST API Endpoints + +``` +GET /zones/{zone_id}/spectrum/apps # List apps +POST /zones/{zone_id}/spectrum/apps # Create app +GET /zones/{zone_id}/spectrum/apps/{app_id} # Get app +PUT /zones/{zone_id}/spectrum/apps/{app_id} # Update app +DELETE /zones/{zone_id}/spectrum/apps/{app_id} # Delete app + +GET /zones/{zone_id}/spectrum/analytics/aggregate/current +GET /zones/{zone_id}/spectrum/analytics/events/bytime +GET /zones/{zone_id}/spectrum/analytics/events/summary +``` + +## Request/Response Schemas + +### CreateSpectrumAppRequest + +```typescript +interface CreateSpectrumAppRequest { + protocol: string; // "tcp/22", "udp/53" + dns: { + type: "CNAME" | "ADDRESS"; + name: string; // "ssh.example.com" + }; + 
origin_direct?: string[]; // ["tcp://192.0.2.1:22"] + origin_dns?: { name: string }; // {"name": "origin.example.com"} + origin_port?: number | { start: number; end: number }; + proxy_protocol?: "off" | "v1" | "v2" | "simple"; + ip_firewall?: boolean; + tls?: "off" | "flexible" | "full" | "strict"; + edge_ips?: { + type: "dynamic" | "static"; + connectivity: "all" | "ipv4" | "ipv6"; + }; + traffic_type?: "direct" | "http" | "https"; + argo_smart_routing?: boolean; +} +``` + +### SpectrumApp Response + +```typescript +interface SpectrumApp { + id: string; + protocol: string; + dns: { type: string; name: string }; + origin_direct?: string[]; + origin_dns?: { name: string }; + origin_port?: number | { start: number; end: number }; + proxy_protocol: string; + ip_firewall: boolean; + tls: string; + edge_ips: { type: string; connectivity: string; ips?: string[] }; + argo_smart_routing: boolean; + created_on: string; + modified_on: string; +} +``` + +## TypeScript SDK + +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ apiToken: process.env.CLOUDFLARE_API_TOKEN }); + +// Create +const app = await client.spectrum.apps.create({ + zone_id: 'your-zone-id', + protocol: 'tcp/22', + dns: { type: 'CNAME', name: 'ssh.example.com' }, + origin_direct: ['tcp://192.0.2.1:22'], + ip_firewall: true, + tls: 'off', +}); + +// List +const apps = await client.spectrum.apps.list({ zone_id: 'your-zone-id' }); + +// Get +const appDetails = await client.spectrum.apps.get({ zone_id: 'your-zone-id', app_id: app.id }); + +// Update +await client.spectrum.apps.update({ zone_id: 'your-zone-id', app_id: app.id, tls: 'full' }); + +// Delete +await client.spectrum.apps.delete({ zone_id: 'your-zone-id', app_id: app.id }); + +// Analytics +const analytics = await client.spectrum.analytics.aggregate({ + zone_id: 'your-zone-id', + metrics: ['bytesIngress', 'bytesEgress'], + since: new Date(Date.now() - 3600000).toISOString(), +}); +``` + +## Python SDK + +```python 
+from datetime import datetime, timedelta + +from cloudflare import Cloudflare + +client = Cloudflare(api_token="your-api-token") + +# Create +app = client.spectrum.apps.create( + zone_id="your-zone-id", + protocol="tcp/22", + dns={"type": "CNAME", "name": "ssh.example.com"}, + origin_direct=["tcp://192.0.2.1:22"], + ip_firewall=True, + tls="off", +) + +# List +apps = client.spectrum.apps.list(zone_id="your-zone-id") + +# Get +app_details = client.spectrum.apps.get(zone_id="your-zone-id", app_id=app.id) + +# Update +client.spectrum.apps.update(zone_id="your-zone-id", app_id=app.id, tls="full") + +# Delete +client.spectrum.apps.delete(zone_id="your-zone-id", app_id=app.id) + +# Analytics +analytics = client.spectrum.analytics.aggregate( + zone_id="your-zone-id", + metrics=["bytesIngress", "bytesEgress"], + since=datetime.now() - timedelta(hours=1), +) +``` + +## Go SDK + +```go +import "github.com/cloudflare/cloudflare-go" + +api, _ := cloudflare.NewWithAPIToken("your-api-token") + +// Create +app, _ := api.CreateSpectrumApplication(ctx, "zone-id", cloudflare.SpectrumApplication{ + Protocol: "tcp/22", + DNS: cloudflare.SpectrumApplicationDNS{Type: "CNAME", Name: "ssh.example.com"}, + OriginDirect: []string{"tcp://192.0.2.1:22"}, + IPFirewall: true, + ArgoSmartRouting: true, +}) + +// List +apps, _ := api.SpectrumApplications(ctx, "zone-id") + +// Delete +_ = api.DeleteSpectrumApplication(ctx, "zone-id", app.ID) +``` + +## Analytics API + +**Metrics:** +- `bytesIngress` - Bytes received from clients +- `bytesEgress` - Bytes sent to clients +- `count` - Number of connections +- `duration` - Connection duration (seconds) + +**Dimensions:** +- `event` - Connection event type +- `appID` - Spectrum application ID +- `coloName` - Datacenter name +- `ipVersion` - IPv4 or IPv6 + +**Example:** +```bash +curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/spectrum/analytics/aggregate/current?metrics=bytesIngress,bytesEgress,count&dimensions=appID" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" +``` + 
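A common chore with the aggregate endpoint is totaling a metric across the returned rows. A small helper, assuming rows shaped like `{ dimensions, metrics }` with metric values in the same order as the `metrics` query parameter — the payload below is invented for illustration, so check a live response before relying on this shape:

```javascript
// Sum one named metric across analytics rows.
// `metricNames` must list metrics in the order they appear in each row.
function totalMetric(rows, metricNames, wanted) {
  const idx = metricNames.indexOf(wanted);
  if (idx === -1) throw new Error(`unknown metric: ${wanted}`);
  return rows.reduce((sum, row) => sum + row.metrics[idx], 0);
}

const metricNames = ["bytesIngress", "bytesEgress"];
const rows = [
  { dimensions: ["app-1"], metrics: [1000, 5000] },
  { dimensions: ["app-2"], metrics: [250, 750] },
];

console.log(totalMetric(rows, metricNames, "bytesIngress")); // 1250
console.log(totalMetric(rows, metricNames, "bytesEgress")); // 5750
```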
+## See Also + +- [configuration.md](configuration.md) - Terraform/Pulumi +- [patterns.md](patterns.md) - Protocol examples diff --git a/cloudflare/references/spectrum/configuration.md b/cloudflare/references/spectrum/configuration.md new file mode 100644 index 0000000..81aa72f --- /dev/null +++ b/cloudflare/references/spectrum/configuration.md @@ -0,0 +1,194 @@ +## Origin Types + +### Direct IP Origin + +Use when origin is a single server with static IP. + +**TypeScript SDK:** +```typescript +const app = await client.spectrum.apps.create({ + zone_id: 'your-zone-id', + protocol: 'tcp/22', + dns: { type: 'CNAME', name: 'ssh.example.com' }, + origin_direct: ['tcp://192.0.2.1:22'], + ip_firewall: true, + tls: 'off', +}); +``` + +**Terraform:** +```hcl +resource "cloudflare_spectrum_application" "ssh" { + zone_id = var.zone_id + protocol = "tcp/22" + + dns { + type = "CNAME" + name = "ssh.example.com" + } + + origin_direct = ["tcp://192.0.2.1:22"] + ip_firewall = true + tls = "off" + argo_smart_routing = true +} +``` + +### CNAME Origin + +Use when origin is a hostname (not static IP). Spectrum resolves DNS dynamically. + +**TypeScript SDK:** +```typescript +const app = await client.spectrum.apps.create({ + zone_id: 'your-zone-id', + protocol: 'tcp/3306', + dns: { type: 'CNAME', name: 'db.example.com' }, + origin_dns: { name: 'db-primary.internal.example.com' }, + origin_port: 3306, + tls: 'full', +}); +``` + +**Terraform:** +```hcl +resource "cloudflare_spectrum_application" "database" { + zone_id = var.zone_id + protocol = "tcp/3306" + + dns { + type = "CNAME" + name = "db.example.com" + } + + origin_dns { + name = "db-primary.internal.example.com" + } + + origin_port = 3306 + tls = "full" + argo_smart_routing = true +} +``` + +### Load Balancer Origin + +Use for high availability and failover. 
+
**Terraform:**
```hcl
resource "cloudflare_load_balancer" "game_lb" {
  zone_id          = var.zone_id
  name             = "game-lb.example.com"
  default_pool_ids = [cloudflare_load_balancer_pool.game_pool.id]
}

resource "cloudflare_load_balancer_pool" "game_pool" {
  name = "game-primary"

  origins {
    name    = "game-1"
    address = "192.0.2.1"
  }

  monitor = cloudflare_load_balancer_monitor.tcp_monitor.id
}

resource "cloudflare_load_balancer_monitor" "tcp_monitor" {
  type     = "tcp"
  port     = 25565
  interval = 60
  timeout  = 5
}

resource "cloudflare_spectrum_application" "game" {
  zone_id  = var.zone_id
  protocol = "tcp/25565"

  dns {
    type = "CNAME"
    name = "game.example.com"
  }

  origin_dns {
    name = cloudflare_load_balancer.game_lb.name
  }

  origin_port = 25565
}
```

## TLS Configuration

| Mode | Description | Use Case | Origin Cert |
|------|-------------|----------|-------------|
| `off` | No TLS | Non-encrypted (SSH, gaming) | No |
| `flexible` | TLS client→CF, plain CF→origin | Testing | No |
| `full` | TLS end-to-end, self-signed OK | Production | Yes (any) |
| `strict` | Full + valid cert verification | Max security | Yes (CA) |

**Example:**
```typescript
const app = await client.spectrum.apps.create({
  zone_id: 'your-zone-id',
  protocol: 'tcp/3306',
  dns: { type: 'CNAME', name: 'db.example.com' },
  origin_direct: ['tcp://192.0.2.1:3306'],
  tls: 'strict', // Validates origin certificate
});
```

## Proxy Protocol

Forwards real client IP to origin. Origin must support parsing. 
+ +| Version | Protocol | Use Case | +|---------|----------|----------| +| `off` | - | Origin doesn't need client IP | +| `v1` | TCP | Most TCP apps (SSH, databases) | +| `v2` | TCP | High-performance TCP | +| `simple` | UDP | UDP applications | + +**Compatibility:** +- **v1**: HAProxy, nginx, SSH, most databases +- **v2**: HAProxy 1.5+, nginx 1.11+ +- **simple**: Cloudflare-specific UDP format + +**Enable:** +```typescript +const app = await client.spectrum.apps.create({ + // ... + proxy_protocol: 'v1', // Origin must parse PROXY header +}); +``` + +**Origin Config (nginx):** +```nginx +stream { + server { + listen 22 proxy_protocol; + proxy_pass backend:22; + } +} +``` + +## IP Access Rules + +Enable `ip_firewall: true` then configure zone-level firewall rules. + +```typescript +const app = await client.spectrum.apps.create({ + // ... + ip_firewall: true, // Applies zone firewall rules +}); +``` + +## Port Ranges (Enterprise Only) + +```hcl +resource "cloudflare_spectrum_application" "game_cluster" { + zone_id = var.zone_id + protocol = "tcp/25565-25575" + + dns { + type = "CNAME" + name = "games.example.com" + } + + origin_direct = ["tcp://192.0.2.1"] + + origin_port { + start = 25565 + end = 25575 + } +} +``` + +## See Also + +- [patterns.md](patterns.md) - Protocol-specific examples +- [api.md](api.md) - REST/SDK reference diff --git a/cloudflare/references/spectrum/gotchas.md b/cloudflare/references/spectrum/gotchas.md new file mode 100644 index 0000000..ef31a36 --- /dev/null +++ b/cloudflare/references/spectrum/gotchas.md @@ -0,0 +1,145 @@ +## Common Issues + +### Connection Timeouts + +**Problem:** Connections fail or timeout +**Cause:** Origin firewall blocking Cloudflare IPs, origin service not running, incorrect DNS +**Solution:** +1. Verify origin firewall allows Cloudflare IP ranges +2. Check origin service running on correct port +3. Ensure DNS record is CNAME (not A/AAAA) +4. 
Verify origin IP/hostname is correct + +```bash +# Test connectivity +nc -zv app.example.com 22 +dig app.example.com +``` + +### Client IP Showing Cloudflare IP + +**Problem:** Origin logs show Cloudflare IPs not real client IPs +**Cause:** Proxy Protocol not enabled or origin not configured +**Solution:** +```typescript +// Enable in Spectrum app +const app = await client.spectrum.apps.create({ + // ... + proxy_protocol: 'v1', // TCP: v1/v2; UDP: simple +}); +``` + +**Origin config:** +- **nginx**: `listen 22 proxy_protocol;` +- **HAProxy**: `bind :22 accept-proxy` + +### TLS Errors + +**Problem:** TLS handshake failures, 525 errors +**Cause:** TLS mode mismatch + +| Error | TLS Mode | Problem | Solution | +|-------|----------|---------|----------| +| Connection refused | `full`/`strict` | Origin not TLS | Use `tls: "off"` or enable TLS | +| 525 cert invalid | `strict` | Self-signed cert | Use `tls: "full"` or valid cert | +| Handshake timeout | `flexible` | Origin expects TLS | Use `tls: "full"` | + +**Debug:** +```bash +openssl s_client -connect app.example.com:443 -showcerts +``` + +### SMTP Reverse DNS + +**Problem:** Email servers reject SMTP via Spectrum +**Cause:** Spectrum IPs lack PTR (reverse DNS) records +**Impact:** Many mail servers require valid rDNS for anti-spam + +**Solution:** +- Outbound SMTP: NOT recommended through Spectrum +- Inbound SMTP: Use Cloudflare Email Routing +- Internal relay: Whitelist Spectrum IPs on destination + +### Proxy Protocol Compatibility + +**Problem:** Connection works but app behaves incorrectly +**Cause:** Origin doesn't support Proxy Protocol + +**Solution:** +1. Verify origin supports version (v1: widely supported, v2: HAProxy 1.5+/nginx 1.11+) +2. Test with `proxy_protocol: 'off'` first +3. 
Configure origin to parse headers + +**nginx TCP:** +```nginx +stream { + server { + listen 22 proxy_protocol; + proxy_pass backend:22; + } +} +``` + +**HAProxy:** +``` +frontend ft_ssh + bind :22 accept-proxy +``` + +### Analytics Data Retention + +**Problem:** Historical data not available +**Cause:** Retention varies by plan + +| Plan | Real-time | Historical | +|------|-----------|------------| +| Pro | Last hour | ❌ | +| Business | Last hour | Limited | +| Enterprise | Last hour | 90+ days | + +**Solution:** Query within retention window or export to external system + +### Enterprise-Only Features + +**Problem:** Feature unavailable/errors +**Cause:** Requires Enterprise plan + +**Enterprise-only:** +- Port ranges (`tcp/25565-25575`) +- All TCP/UDP ports (Pro/Business: selected only) +- Extended analytics retention +- Advanced load balancing + +### IPv6 Considerations + +**Problem:** IPv6 clients can't connect or origin doesn't support IPv6 +**Solution:** Configure `edge_ips.connectivity` + +```typescript +const app = await client.spectrum.apps.create({ + // ... + edge_ips: { + type: 'dynamic', + connectivity: 'ipv4', // Options: 'all', 'ipv4', 'ipv6' + }, +}); +``` + +**Options:** +- `all`: Dual-stack (default, requires origin support both) +- `ipv4`: IPv4 only (use if origin lacks IPv6) +- `ipv6`: IPv6 only (rare) + +## Limits + +| Resource | Pro/Business | Enterprise | +|----------|--------------|------------| +| Max apps | ~10-15 | 100+ | +| Protocols | Selected | All TCP/UDP | +| Port ranges | ❌ | ✅ | +| Analytics | ~1 hour | 90+ days | + +## See Also + +- [patterns.md](patterns.md) - Protocol examples +- [configuration.md](configuration.md) - TLS/Proxy setup diff --git a/cloudflare/references/spectrum/patterns.md b/cloudflare/references/spectrum/patterns.md new file mode 100644 index 0000000..4032486 --- /dev/null +++ b/cloudflare/references/spectrum/patterns.md @@ -0,0 +1,196 @@ +## Common Use Cases + +### 1. 
SSH Server Protection + +**Terraform:** +```hcl +resource "cloudflare_spectrum_application" "ssh" { + zone_id = var.zone_id + protocol = "tcp/22" + + dns { + type = "CNAME" + name = "ssh.example.com" + } + + origin_direct = ["tcp://10.0.1.5:22"] + ip_firewall = true + argo_smart_routing = true +} +``` + +**Benefits:** Hide origin IP, DDoS protection, IP firewall, Argo reduces latency + +### 2. Game Server + +**TypeScript (Minecraft):** +```typescript +const app = await client.spectrum.apps.create({ + zone_id: 'your-zone-id', + protocol: 'tcp/25565', + dns: { type: 'CNAME', name: 'mc.example.com' }, + origin_direct: ['tcp://192.168.1.10:25565'], + proxy_protocol: 'v1', // Preserves player IPs + argo_smart_routing: true, +}); +``` + +**Benefits:** DDoS protection, hide origin IP, Proxy Protocol for player IPs/bans, Argo reduces latency + +### 3. MQTT Broker + +IoT device communication. + +**TypeScript:** +```typescript +const mqttApp = await client.spectrum.apps.create({ + zone_id: 'your-zone-id', + protocol: 'tcp/8883', // Use 1883 for plain MQTT + dns: { type: 'CNAME', name: 'mqtt.example.com' }, + origin_direct: ['tcp://mqtt-broker.internal:8883'], + tls: 'full', // Use 'off' for plain MQTT +}); +``` + +**Benefits:** DDoS protection, hide broker IP, TLS termination at edge + +### 4. SMTP Relay + +Email submission (port 587). **WARNING**: See [gotchas.md](gotchas.md#smtp-reverse-dns) + +**Terraform:** +```hcl +resource "cloudflare_spectrum_application" "smtp" { + zone_id = var.zone_id + protocol = "tcp/587" + + dns { + type = "CNAME" + name = "smtp.example.com" + } + + origin_direct = ["tcp://mail-server.internal:587"] + tls = "full" # STARTTLS support +} +``` + +**Limitations:** +- Spectrum IPs lack reverse DNS (PTR records) +- Many mail servers reject without valid rDNS +- Best for internal/trusted relay only + +### 5. Database Proxy + +MySQL/PostgreSQL. **Use with caution** - security critical. 
+ +**PostgreSQL:** +```typescript +const postgresApp = await client.spectrum.apps.create({ + zone_id: 'your-zone-id', + protocol: 'tcp/5432', + dns: { type: 'CNAME', name: 'postgres.example.com' }, + origin_dns: { name: 'db-primary.internal.example.com' }, + origin_port: 5432, + tls: 'strict', // REQUIRED + ip_firewall: true, // REQUIRED +}); +``` + +**MySQL:** +```hcl +resource "cloudflare_spectrum_application" "mysql" { + zone_id = var.zone_id + protocol = "tcp/3306" + + dns { + type = "CNAME" + name = "mysql.example.com" + } + + origin_dns { + name = "mysql-primary.internal.example.com" + } + + origin_port = 3306 + tls = "strict" + ip_firewall = true +} +``` + +**Security:** +- ALWAYS use `tls: "strict"` +- ALWAYS use `ip_firewall: true` +- Restrict to known IPs via zone firewall +- Use strong DB authentication +- Consider VPN or Cloudflare Access instead + +### 6. RDP (Remote Desktop) + +**Requires IP firewall.** + +**Terraform:** +```hcl +resource "cloudflare_spectrum_application" "rdp" { + zone_id = var.zone_id + protocol = "tcp/3389" + + dns { + type = "CNAME" + name = "rdp.example.com" + } + + origin_direct = ["tcp://windows-server.internal:3389"] + tls = "off" # RDP has own encryption + ip_firewall = true # REQUIRED +} +``` + +**Security:** ALWAYS `ip_firewall: true`, whitelist admin IPs, RDP is DDoS/brute-force target + +### 7. Multi-Origin Failover + +High availability with load balancer. 
+
**Terraform:**
```hcl
resource "cloudflare_load_balancer" "database_lb" {
  zone_id          = var.zone_id
  name             = "db-lb.example.com"
  default_pool_ids = [cloudflare_load_balancer_pool.db_primary.id]
  fallback_pool_id = cloudflare_load_balancer_pool.db_secondary.id
}

resource "cloudflare_load_balancer_pool" "db_primary" {
  name = "db-primary-pool"

  origins {
    name    = "db-1"
    address = "192.0.2.1"
  }

  monitor = cloudflare_load_balancer_monitor.postgres_monitor.id
}

resource "cloudflare_load_balancer_pool" "db_secondary" {
  name = "db-secondary-pool"

  origins {
    name    = "db-2"
    address = "192.0.2.2"
  }

  monitor = cloudflare_load_balancer_monitor.postgres_monitor.id
}

resource "cloudflare_load_balancer_monitor" "postgres_monitor" {
  type     = "tcp"
  port     = 5432
  interval = 30
  timeout  = 5
}

resource "cloudflare_spectrum_application" "postgres_ha" {
  zone_id  = var.zone_id
  protocol = "tcp/5432"

  dns {
    type = "CNAME"
    name = "postgres.example.com"
  }

  origin_dns {
    name = cloudflare_load_balancer.database_lb.name
  }

  origin_port = 5432
  tls         = "strict"
  ip_firewall = true
}
```

**Benefits:** Automatic failover, health monitoring, traffic distribution, zero-downtime deployments

## See Also

- [configuration.md](configuration.md) - Origin type setup
- [gotchas.md](gotchas.md) - Protocol limitations
- [api.md](api.md) - SDK reference
diff --git a/cloudflare/references/static-assets/README.md b/cloudflare/references/static-assets/README.md
new file mode 100644
index 0000000..b2ba96a
--- /dev/null
+++ b/cloudflare/references/static-assets/README.md
@@ -0,0 +1,65 @@
+# Cloudflare Static Assets Skill Reference

Expert guidance for deploying and configuring static assets with Cloudflare Workers. This skill covers configuration patterns, routing architectures, asset binding usage, and best practices for SPAs, SSG sites, and full-stack applications. 
+
## Quick Start

```jsonc
// wrangler.jsonc
{
  "name": "my-app",
  "main": "src/index.ts",
  "compatibility_date": "2025-01-01",
  "assets": {
    "directory": "./dist"
  }
}
```

```typescript
// src/index.ts
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    return env.ASSETS.fetch(request);
  }
};
```

Deploy: `wrangler deploy`

## When to Use Workers Static Assets vs Pages

| Factor | Workers Static Assets | Cloudflare Pages |
|--------|----------------------|------------------|
| **Use case** | Hybrid apps (static + dynamic API) | Static sites, SSG |
| **Worker control** | Full control over routing | Limited (Functions) |
| **Configuration** | Code-first, flexible | Git-based, opinionated |
| **Dynamic routing** | Worker-first patterns | Functions (_functions/) |
| **Best for** | Full-stack apps, SPAs with APIs | Jamstack, static docs |

**Decision tree:**

- Need custom routing logic? → Workers Static Assets
- Pure static site or SSG? → Pages
- API routes + SPA? → Workers Static Assets
- Framework (Next, Nuxt, Remix)? → Pages

## Reading Order

1. **configuration.md** - Setup, wrangler.jsonc options, routing patterns
2. **api.md** - ASSETS binding API, request/response handling
3. **patterns.md** - Common patterns (SPA, API routes, auth, A/B testing)
4. 
**gotchas.md** - Limits, errors, performance tips

## In This Reference

- **[configuration.md](configuration.md)** - Setup, deployment, configuration
- **[api.md](api.md)** - API endpoints, methods, interfaces
- **[patterns.md](patterns.md)** - Common patterns, use cases, examples
- **[gotchas.md](gotchas.md)** - Troubleshooting, best practices, limitations

## See Also

- [Cloudflare Workers Docs](https://developers.cloudflare.com/workers/)
- [Static Assets Docs](https://developers.cloudflare.com/workers/static-assets/)
- [Cloudflare Pages](https://developers.cloudflare.com/pages/)
diff --git a/cloudflare/references/static-assets/api.md b/cloudflare/references/static-assets/api.md
new file mode 100644
index 0000000..08bb568
--- /dev/null
+++ b/cloudflare/references/static-assets/api.md
@@ -0,0 +1,199 @@
+# API Reference

## ASSETS Binding

The `ASSETS` binding provides access to static assets via the `Fetcher` interface.

### Type Definition

```typescript
interface Env {
  ASSETS: Fetcher;
}

interface Fetcher {
  fetch(input: RequestInfo | URL, init?: RequestInit): Promise<Response>;
}
```

### Method Signatures

```typescript
// 1. Forward entire request
await env.ASSETS.fetch(request);

// 2. String path (hostname ignored, only path matters)
await env.ASSETS.fetch("https://any-host/path/to/asset.png");

// 3. URL object
await env.ASSETS.fetch(new URL("/index.html", request.url));

// 4. 
Constructed Request object
await env.ASSETS.fetch(new Request(new URL("/logo.png", request.url), {
  method: "GET",
  headers: request.headers
}));
```

**Key behaviors:**

- Host/origin is ignored for string/URL inputs (only path is used)
- Method must be GET or HEAD (others return 405)
- Request headers pass through (affects response)
- Returns standard `Response` object

## Request Handling

### Path Resolution

```typescript
// All resolve to same asset:
env.ASSETS.fetch("https://example.com/logo.png")
env.ASSETS.fetch("https://ignored.host/logo.png")
env.ASSETS.fetch("/logo.png")
```

Assets are resolved relative to configured `assets.directory`.

### Headers

Request headers that affect response:

| Header | Effect |
|--------|--------|
| `Accept-Encoding` | Controls compression (gzip, brotli) |
| `Range` | Enables partial content (206 responses) |
| `If-None-Match` | Conditional request via ETag |
| `If-Modified-Since` | Conditional request via modification date |

Custom headers pass through but don't affect asset serving.

### Method Support

| Method | Supported | Response |
|--------|-----------|----------|
| `GET` | ✅ Yes | Asset content |
| `HEAD` | ✅ Yes | Headers only, no body |
| `POST`, `PUT`, etc. 
| ❌ No | 405 Method Not Allowed |

## Response Behavior

### Content-Type Inference

Automatically set based on file extension:

| Extension | Content-Type |
|-----------|--------------|
| `.html` | `text/html; charset=utf-8` |
| `.css` | `text/css` |
| `.js` | `application/javascript` |
| `.json` | `application/json` |
| `.png` | `image/png` |
| `.jpg`, `.jpeg` | `image/jpeg` |
| `.svg` | `image/svg+xml` |
| `.woff2` | `font/woff2` |

### Default Headers

Responses include:

```
Content-Type: <inferred from file extension>
ETag: "<content-hash>"
Cache-Control: public, max-age=3600
Content-Encoding: br (if supported and beneficial)
```

**Cache-Control defaults:**

- 1 hour (`max-age=3600`) for most assets
- Override via Worker response transformation (see patterns.md:27-35)

### Compression

Automatic compression based on `Accept-Encoding`:

- **Brotli** (`br`): Preferred, best compression
- **Gzip** (`gzip`): Fallback
- **None**: If client doesn't support or asset too small

### ETag Generation

ETags are content-based hashes:

```
ETag: "a3b2c1d4e5f6..."
```

Used for conditional requests (`If-None-Match`). Returns `304 Not Modified` if match. 
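The conditional-request flow can be sketched as a small helper: given the request's `If-None-Match` header and an asset's ETag, decide whether a `304` short-circuit applies. A sketch of the comparison logic, not the platform's implementation:

```typescript
// Return true when 304 Not Modified is appropriate: the client's
// If-None-Match header contains the asset's current ETag.
function notModified(ifNoneMatch: string | null, etag: string): boolean {
  if (ifNoneMatch === null) return false;
  // If-None-Match may be the wildcard "*" or a comma-separated list.
  if (ifNoneMatch.trim() === "*") return true;
  return ifNoneMatch
    .split(",")
    .map((v) => v.trim().replace(/^W\//, "")) // weak comparison: drop W/ prefix
    .includes(etag);
}
```

In a Worker this would gate a `new Response(null, { status: 304 })` before falling through to `env.ASSETS.fetch(request)`, though the binding already handles this for untouched requests.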
+ +## Error Responses + +| Status | Condition | Behavior | +|--------|-----------|----------| +| `404` | Asset not found | Body depends on `not_found_handling` config | +| `405` | Non-GET/HEAD method | `{ "error": "Method not allowed" }` | +| `416` | Invalid Range header | Range not satisfiable | + +### 404 Handling + +Depends on configuration (see configuration.md:45-52): + +```typescript +// not_found_handling: "single-page-application" +// Returns /index.html with 200 status + +// not_found_handling: "404-page" +// Returns /404.html if exists, else 404 response + +// not_found_handling: "none" +// Returns 404 response +``` + +## Advanced Usage + +### Modifying Responses + +```typescript +const response = await env.ASSETS.fetch(request); + +// Clone and modify +return new Response(response.body, { + status: response.status, + headers: { + ...Object.fromEntries(response.headers), + 'Cache-Control': 'public, max-age=31536000', + 'X-Custom': 'value' + } +}); +``` + +See patterns.md:27-35 for full example. + +### Error Handling + +```typescript +const response = await env.ASSETS.fetch(request); + +if (!response.ok) { + // Asset not found or error + return new Response('Custom error page', { status: 404 }); +} + +return response; +``` + +### Conditional Serving + +```typescript +const url = new URL(request.url); + +// Serve different assets based on conditions +if (url.pathname === '/') { + return env.ASSETS.fetch('/index.html'); +} + +return env.ASSETS.fetch(request); +``` + +See patterns.md for complete patterns. 
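The `not_found_handling` behaviors described above can be emulated in a Worker when the built-in modes aren't flexible enough. A minimal sketch of the decision logic — the mode names come from the config, while `has404Page` is an illustrative flag, not an API:

```typescript
type NotFoundMode = "single-page-application" | "404-page" | "none";

// Given a 404 from the ASSETS binding, pick the fallback path to fetch
// (or null to pass the 404 through), mirroring not_found_handling.
function fallbackPath(mode: NotFoundMode, has404Page: boolean): string | null {
  switch (mode) {
    case "single-page-application":
      return "/index.html"; // served with 200 for client-side routing
    case "404-page":
      return has404Page ? "/404.html" : null;
    case "none":
      return null;
  }
}
```

A Worker would call this after `env.ASSETS.fetch(request)` returns a 404 and, when the result is non-null, fetch that path instead.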
diff --git a/cloudflare/references/static-assets/configuration.md b/cloudflare/references/static-assets/configuration.md new file mode 100644 index 0000000..2902698 --- /dev/null +++ b/cloudflare/references/static-assets/configuration.md @@ -0,0 +1,186 @@ +## Configuration + +### Basic Setup + +Minimal configuration requires only `assets.directory`: + +```jsonc +{ + "name": "my-worker", + "compatibility_date": "2025-01-01", // Use current date for new projects + "assets": { + "directory": "./dist" + } +} +``` + +### Full Configuration Options + +```jsonc +{ + "name": "my-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", + "assets": { + "directory": "./dist", + "binding": "ASSETS", + "not_found_handling": "single-page-application", + "html_handling": "auto-trailing-slash", + "run_worker_first": ["/api/*", "!/api/docs/*"] + } +} +``` + +**Configuration keys:** + +- `directory` (string, required): Path to assets folder (e.g. `./dist`, `./public`, `./build`) +- `binding` (string, optional): Name to access assets in Worker code (e.g. `env.ASSETS`). 
Default: `"ASSETS"` +- `not_found_handling` (string, optional): Behavior when asset not found + - `"single-page-application"`: Serve `/index.html` for non-asset paths (default for SPAs) + - `"404-page"`: Serve `/404.html` if present, otherwise 404 + - `"none"`: Return 404 for missing assets +- `html_handling` (string, optional): URL trailing slash behavior +- `run_worker_first` (boolean | string[], optional): Routes that invoke Worker before checking assets + +### not_found_handling Modes + +| Mode | Behavior | Use Case | +|------|----------|----------| +| `"single-page-application"` | Serve `/index.html` for non-asset requests | React, Vue, Angular SPAs | +| `"404-page"` | Serve `/404.html` if exists, else 404 | Static sites with custom error page | +| `"none"` | Return 404 for missing assets | API-first or custom routing | + +### html_handling Modes + +Controls trailing slash behavior for HTML files: + +| Mode | `/page` | `/page/` | Use Case | +|------|---------|----------|----------| +| `"auto-trailing-slash"` | Redirect to `/page/` if `/page/index.html` exists | Serve `/page/index.html` | Default, SEO-friendly | +| `"force-trailing-slash"` | Always redirect to `/page/` | Serve if exists | Consistent trailing slashes | +| `"drop-trailing-slash"` | Serve if exists | Redirect to `/page` | Cleaner URLs | +| `"none"` | No modification | No modification | Custom routing logic | + +**Default:** `"auto-trailing-slash"` + +### run_worker_first Configuration + +Controls which requests invoke Worker before checking assets. 
+ +**Boolean syntax:** + +```jsonc +{ + "assets": { + "run_worker_first": true // ALL requests invoke Worker + } +} +``` + +**Array syntax (recommended):** + +```jsonc +{ + "assets": { + "run_worker_first": [ + "/api/*", // Positive pattern: match API routes + "/admin/*", // Match admin routes + "!/admin/assets/*" // Negative pattern: exclude admin assets + ] + } +} +``` + +**Pattern rules:** + +- Glob patterns: `*` (any chars), `**` (any path segments) +- Negative patterns: Prefix with `!` to exclude +- Precedence: Negative patterns override positive patterns +- Default: `false` (assets served directly) + +**Decision guidance:** + +- Use `true` for API-first apps (few static assets) +- Use array patterns for hybrid apps (APIs + static assets) +- Use `false` for static-first sites (minimal dynamic routes) + +### .assetsignore File + +Exclude files from upload using `.assetsignore` (same syntax as `.gitignore`): + +``` +# .assetsignore +_worker.js +*.map +*.md +node_modules/ +.git/ +``` + +**Common patterns:** + +- `_worker.js` - Exclude Worker code from assets +- `*.map` - Exclude source maps +- `*.md` - Exclude markdown files +- Development artifacts + +### Vite Plugin Integration + +For Vite-based projects, use `@cloudflare/vite-plugin`: + +```typescript +// vite.config.ts +import { defineConfig } from 'vite'; +import { cloudflare } from '@cloudflare/vite-plugin'; + +export default defineConfig({ + plugins: [ + cloudflare({ + assets: { + directory: './dist', + binding: 'ASSETS' + } + }) + ] +}); +``` + +**Features:** + +- Automatic asset detection during dev +- Hot module replacement for assets +- Production build integration +- Requires: Wrangler 4.0.0+, `@cloudflare/vite-plugin` 1.0.0+ + +### Key Compatibility Dates + +| Date | Feature | Impact | +|------|---------|--------| +| `2025-04-01` | Navigation request optimization | SPAs skip Worker for navigation, reducing costs | + +Use current date for new projects. 
See [Compatibility Dates](https://developers.cloudflare.com/workers/configuration/compatibility-dates/) for full list. + +### Environment-Specific Configuration + +Use `wrangler.jsonc` environments for different configs: + +```jsonc +{ + "name": "my-worker", + "assets": { "directory": "./dist" }, + "env": { + "staging": { + "assets": { + "not_found_handling": "404-page" + } + }, + "production": { + "assets": { + "not_found_handling": "single-page-application" + } + } + } +} +``` + +Deploy with: `wrangler deploy --env staging` diff --git a/cloudflare/references/static-assets/gotchas.md b/cloudflare/references/static-assets/gotchas.md new file mode 100644 index 0000000..2577f17 --- /dev/null +++ b/cloudflare/references/static-assets/gotchas.md @@ -0,0 +1,162 @@ +## Best Practices + +### 1. Use Selective Worker-First Routing + +Instead of `run_worker_first = true`, use array patterns: + +```jsonc +{ + "assets": { + "run_worker_first": [ + "/api/*", // API routes + "/admin/*", // Admin area + "!/admin/assets/*" // Except admin assets + ] + } +} +``` + +**Benefits:** +- Reduces Worker invocations +- Lowers costs +- Improves asset delivery performance + +### 2. Leverage Navigation Request Optimization + +For SPAs, use `compatibility_date = "2025-04-01"` or later: + +```jsonc +{ + "compatibility_date": "2025-04-01", + "assets": { + "not_found_handling": "single-page-application" + } +} +``` + +Navigation requests skip Worker invocation, reducing costs. + +### 3. 
Type Safety with Bindings + +Always type your environment: + +```typescript +interface Env { + ASSETS: Fetcher; +} +``` + +## Common Errors + +### "Asset not found" + +**Cause:** Asset not in assets directory, wrong path, or assets not deployed +**Solution:** Verify asset exists, check path case-sensitivity, redeploy if needed + +### "Worker not invoked for asset" + +**Cause:** Asset served directly, `run_worker_first` not configured +**Solution:** Configure `run_worker_first` patterns to include asset routes (see configuration.md:66-106) + +### "429 Too Many Requests on free tier" + +**Cause:** `run_worker_first` patterns invoke Worker for many requests, hitting free tier limits (100k req/day) +**Solution:** Use more selective patterns with negative exclusions, or upgrade to paid plan + +### "Smart Placement increases latency" + +**Cause:** `run_worker_first=true` + Smart Placement routes all requests through single smart-placed location +**Solution:** Use selective patterns (array syntax) or disable Smart Placement for asset-heavy apps + +### "CF-Cache-Status header unreliable" + +**Cause:** Header is probabilistically added for privacy reasons +**Solution:** Don't rely on `CF-Cache-Status` for critical routing logic. Use other signals (ETag, age). + +### "JWT expired during deployment" + +**Cause:** Large asset deployments exceed JWT token lifetime +**Solution:** Update to Wrangler 4.34.0+ (automatic token refresh), or reduce asset count + +### "Cannot use 'assets' with 'site'" + +**Cause:** Legacy `site` config conflicts with new `assets` config +**Solution:** Migrate from `site` to `assets` (see configuration.md). Remove `site` key from wrangler.jsonc. 
+ +### "Assets not updating after deployment" + +**Cause:** Browser or CDN cache serving old assets +**Solution:** +- Hard refresh browser (Cmd+Shift+R / Ctrl+F5) +- Use cache-busting (hashed filenames) +- Verify deployment completed: `wrangler tail` + +## Limits + +| Resource/Limit | Free | Paid | Notes | +|----------------|------|------|-------| +| Max asset size | 25 MiB | 25 MiB | Per file | +| Total assets | 20,000 | **100,000** | Requires Wrangler 4.34.0+ (Sep 2025) | +| Worker invocations | 100k/day | 10M/month | Optimize with `run_worker_first` patterns | +| Asset storage | Unlimited | Unlimited | Included | + +### Version Requirements + +| Feature | Minimum Wrangler Version | +|---------|--------------------------| +| 100k file limit (paid) | 4.34.0 | +| Vite plugin | 4.0.0 + @cloudflare/vite-plugin 1.0.0 | +| Navigation optimization | 4.0.0 + compatibility_date: "2025-04-01" | + +## Performance Tips + +### 1. Use Hashed Filenames + +Enable long-term caching with content-hashed filenames: + +``` +app.a3b2c1d4.js +styles.e5f6g7h8.css +``` + +Most bundlers (Vite, Webpack, Parcel) do this automatically. + +### 2. Minimize Worker Invocations + +Serve assets directly when possible: + +```jsonc +{ + "assets": { + // Only invoke Worker for dynamic routes + "run_worker_first": ["/api/*", "/auth/*"] + } +} +``` + +### 3. Leverage Browser Cache + +Set appropriate `Cache-Control` headers: + +```typescript +// Versioned assets +'Cache-Control': 'public, max-age=31536000, immutable' + +// HTML (revalidate often) +'Cache-Control': 'public, max-age=0, must-revalidate' +``` + +See patterns.md:169-189 for implementation. + +### 4. Use .assetsignore + +Reduce upload time by excluding unnecessary files: + +``` +*.map +*.md +.DS_Store +node_modules/ +``` + +See configuration.md:107-126 for details. 
diff --git a/cloudflare/references/static-assets/patterns.md b/cloudflare/references/static-assets/patterns.md
new file mode 100644
index 0000000..11ddda2
--- /dev/null
+++ b/cloudflare/references/static-assets/patterns.md
@@ -0,0 +1,189 @@
+### Common Patterns

**1. Forward request to assets:**

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    return env.ASSETS.fetch(request);
  }
};
```

**2. Fetch specific asset by path:**

```typescript
const response = await env.ASSETS.fetch("https://assets.local/logo.png");
```

**3. Modify request before fetching asset:**

```typescript
const url = new URL(request.url);
url.pathname = "/index.html";
return env.ASSETS.fetch(new Request(url, request));
```

**4. Transform asset response:**

```typescript
const response = await env.ASSETS.fetch(request);
const modifiedResponse = new Response(response.body, response);
modifiedResponse.headers.set("X-Custom-Header", "value");
modifiedResponse.headers.set("Cache-Control", "public, max-age=3600");
return modifiedResponse;
```

**5. Conditional asset serving:**

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    if (url.pathname === '/') {
      return env.ASSETS.fetch('/index.html');
    }
    return env.ASSETS.fetch(request);
  }
};
```

**6. SPA with API routes:**

Most common full-stack pattern - static SPA with backend API:

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    if (url.pathname.startsWith('/api/')) {
      return handleAPI(request, env);
    }
    return env.ASSETS.fetch(request);
  }
};

async function handleAPI(request: Request, env: Env): Promise<Response> {
  return new Response(JSON.stringify({ status: 'ok' }), {
    headers: { 'Content-Type': 'application/json' }
  });
}
```

**Config:** Set `run_worker_first: ["/api/*"]` (see configuration.md:66-106)

**7. 
Auth gating for protected assets:**

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    if (url.pathname.startsWith('/admin/')) {
      const session = await validateSession(request, env);
      if (!session) {
        // Response.redirect requires an absolute URL in Workers
        return Response.redirect(new URL('/login', request.url).toString(), 302);
      }
    }
    return env.ASSETS.fetch(request);
  }
};
```

**Config:** Set `run_worker_first: ["/admin/*"]`

**8. Custom headers for security:**

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const response = await env.ASSETS.fetch(request);
    const secureResponse = new Response(response.body, response);
    secureResponse.headers.set('X-Frame-Options', 'DENY');
    secureResponse.headers.set('X-Content-Type-Options', 'nosniff');
    secureResponse.headers.set('Content-Security-Policy', "default-src 'self'");
    return secureResponse;
  }
};
```

**9. A/B testing via cookies:**

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const cookies = request.headers.get('Cookie') || '';
    const variant = cookies.includes('variant=b') ? 'b' : 'a';
    const url = new URL(request.url);
    if (url.pathname === '/') {
      return env.ASSETS.fetch(`/index-${variant}.html`);
    }
    return env.ASSETS.fetch(request);
  }
};
```

**10. Locale-based routing:**

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const locale = request.headers.get('Accept-Language')?.split(',')[0] || 'en';
    const url = new URL(request.url);
    if (url.pathname === '/') {
      return env.ASSETS.fetch(`/${locale}/index.html`);
    }
    if (!url.pathname.startsWith(`/${locale}/`)) {
      url.pathname = `/${locale}${url.pathname}`;
    }
    return env.ASSETS.fetch(url);
  }
};
```

**11. 
OAuth callback handling:** + +```typescript +export default { + async fetch(request: Request, env: Env): Promise<Response> { + const url = new URL(request.url); + if (url.pathname === '/auth/callback') { + const code = url.searchParams.get('code'); + if (code) { + const session = await exchangeCode(code, env); + return new Response(null, { + status: 302, + headers: { + 'Location': '/', + 'Set-Cookie': `session=${session}; HttpOnly; Secure; SameSite=Lax` + } + }); + } + } + return env.ASSETS.fetch(request); + } +}; +``` + +**Config:** Set `run_worker_first: ["/auth/*"]` + +**12. Cache control override:** + +```typescript +export default { + async fetch(request: Request, env: Env): Promise<Response> { + const response = await env.ASSETS.fetch(request); + const url = new URL(request.url); + // Immutable assets (hashed filenames) + if (/\.[a-f0-9]{8,}\.(js|css|png|jpg)$/.test(url.pathname)) { + // Spreading a Response does not copy its fields; set status and headers explicitly + return new Response(response.body, { + status: response.status, + statusText: response.statusText, + headers: { + ...Object.fromEntries(response.headers), + 'Cache-Control': 'public, max-age=31536000, immutable' + } + }); + } + return response; + } +}; +``` diff --git a/cloudflare/references/stream/README.md b/cloudflare/references/stream/README.md new file mode 100644 index 0000000..be251e3 --- /dev/null +++ b/cloudflare/references/stream/README.md @@ -0,0 +1,114 @@ +# Cloudflare Stream + +Serverless live and on-demand video streaming platform with one API. + +## Overview + +Cloudflare Stream provides video upload, storage, encoding, and delivery without managing infrastructure. Runs on Cloudflare's global network. 
+ +### Key Features +- **On-demand video**: Upload, encode, store, deliver +- **Live streaming**: RTMPS/SRT ingestion with ABR +- **Direct creator uploads**: End users upload without API keys +- **Signed URLs**: Token-based access control +- **Analytics**: Server-side metrics via GraphQL +- **Webhooks**: Processing notifications +- **Captions**: Upload or AI-generate subtitles +- **Watermarks**: Apply branding to videos +- **Downloads**: Enable MP4 offline viewing + +## Core Concepts + +### Video Upload Methods +1. **API Upload (TUS protocol)**: Direct server upload +2. **Upload from URL**: Import from external source +3. **Direct Creator Uploads**: User-generated content (recommended) + +### Playback Options +1. **Stream Player (iframe)**: Built-in, optimized player +2. **Custom Player (HLS/DASH)**: Video.js, HLS.js integration +3. **Thumbnails**: Static or animated previews + +### Access Control +- **Public**: No restrictions +- **requireSignedURLs**: Token-based access +- **allowedOrigins**: Domain restrictions +- **Access Rules**: Geo/IP restrictions in tokens + +### Live Streaming +- RTMPS/SRT ingest from OBS, FFmpeg +- Automatic recording to on-demand +- Simulcast to YouTube, Twitch, etc. 
+ +- WebRTC support for browser streaming + +## Quick Start + +**Upload video via API** +```bash +curl -X POST \ + "https://api.cloudflare.com/client/v4/accounts/{account_id}/stream/copy" \ + -H "Authorization: Bearer <API_TOKEN>" \ + -H "Content-Type: application/json" \ + -d '{"url": "https://example.com/video.mp4"}' +``` + +**Embed player** +```html +<iframe + src="https://customer-<CODE>.cloudflarestream.com/<VIDEO_UID>/iframe" + style="border: none" + allow="accelerometer; gyroscope; autoplay; encrypted-media; picture-in-picture;" + allowfullscreen +></iframe> +``` + +**Create live input** +```bash +curl -X POST \ + "https://api.cloudflare.com/client/v4/accounts/{account_id}/stream/live_inputs" \ + -H "Authorization: Bearer <API_TOKEN>" \ + -H "Content-Type: application/json" \ + -d '{"recording": {"mode": "automatic"}}' +``` + +## Limits + +- Max file size: 30 GB +- Max frame rate: 60 fps (recommended) +- Supported formats: MP4, MKV, MOV, AVI, FLV, MPEG-2 TS/PS, MXF, LXF, GXF, 3GP, WebM, MPG, QuickTime + +## Pricing + +- $5/1000 min stored +- $1/1000 min delivered + +## Resources + +- Dashboard: https://dash.cloudflare.com/?to=/:account/stream +- API Docs: https://developers.cloudflare.com/api/resources/stream/ +- Stream Docs: https://developers.cloudflare.com/stream/ + +## Reading Order + +| Order | File | Purpose | When to Use | +|-------|------|---------|-------------| +| 1 | [configuration.md](./configuration.md) | Setup SDKs, env vars, signing keys | Starting new project | +| 2 | [api.md](./api.md) | On-demand video APIs | Implementing uploads/playback | +| 3 | [api-live.md](./api-live.md) | Live streaming APIs | Building live streaming | +| 4 | [patterns.md](./patterns.md) | Full-stack flows, TUS, JWT signing | Implementing workflows | +| 5 | [gotchas.md](./gotchas.md) | Errors, limits, troubleshooting | Debugging issues | + +## In This Reference + +- [configuration.md](./configuration.md) - Setup, environment variables, wrangler config +- [api.md](./api.md) - On-demand video upload, playback, management APIs +- [api-live.md](./api-live.md) - Live streaming (RTMPS/SRT/WebRTC), simulcast +- [patterns.md](./patterns.md) - Full-stack flows, state management, best practices +- 
[gotchas.md](./gotchas.md) - Error codes, troubleshooting, limits + +## See Also + +- [workers](../workers/) - Deploy Stream APIs in Workers +- [pages](../pages/) - Integrate Stream with Pages +- [workers-ai](../workers-ai/) - AI-generate captions diff --git a/cloudflare/references/stream/api-live.md b/cloudflare/references/stream/api-live.md new file mode 100644 index 0000000..6c4f4c0 --- /dev/null +++ b/cloudflare/references/stream/api-live.md @@ -0,0 +1,195 @@ +# Stream Live Streaming API + +Live input creation, status checking, simulcast, and WebRTC streaming. + +## Create Live Input + +### Using Cloudflare SDK + +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ apiToken: env.CF_API_TOKEN }); + +const liveInput = await client.stream.liveInputs.create({ + account_id: env.CF_ACCOUNT_ID, + recording: { mode: 'automatic', timeoutSeconds: 30 }, + deleteRecordingAfterDays: 30 +}); + +// Returns: { uid, rtmps, srt, webRTC } +``` + +### Raw fetch API + +```typescript +async function createLiveInput(accountId: string, apiToken: string) { + const response = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/stream/live_inputs`, + { + method: 'POST', + headers: { 'Authorization': `Bearer ${apiToken}`, 'Content-Type': 'application/json' }, + body: JSON.stringify({ + recording: { mode: 'automatic', timeoutSeconds: 30 }, + deleteRecordingAfterDays: 30 + }) + } + ); + const { result } = await response.json(); + return { + uid: result.uid, + rtmps: { url: result.rtmps.url, streamKey: result.rtmps.streamKey }, + srt: { url: result.srt.url, streamId: result.srt.streamId, passphrase: result.srt.passphrase }, + webRTC: result.webRTC + }; +} +``` + +## Check Live Status + +```typescript +async function getLiveStatus(accountId: string, liveInputId: string, apiToken: string) { + const response = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/stream/live_inputs/${liveInputId}`, + { headers: { 
'Authorization': `Bearer ${apiToken}` } } + ); + const { result } = await response.json(); + return { + isLive: result.status?.current?.state === 'connected', + recording: result.recording, + status: result.status + }; +} +``` + +## Simulcast (Live Outputs) + +### Create Output + +```typescript +async function createLiveOutput( + accountId: string, liveInputId: string, apiToken: string, + outputUrl: string, streamKey: string +) { + return fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/stream/live_inputs/${liveInputId}/outputs`, + { + method: 'POST', + headers: { 'Authorization': `Bearer ${apiToken}`, 'Content-Type': 'application/json' }, + body: JSON.stringify({ + url: outputUrl, + enabled: true, + streamKey // For platforms like YouTube, Twitch + }) + } + ).then(r => r.json()); +} +``` + +### Example: Simulcast to YouTube + Twitch + +```typescript +const liveInput = await createLiveInput(accountId, apiToken); + +// Add YouTube output +await createLiveOutput( + accountId, liveInput.uid, apiToken, + 'rtmp://a.rtmp.youtube.com/live2', + 'your-youtube-stream-key' +); + +// Add Twitch output +await createLiveOutput( + accountId, liveInput.uid, apiToken, + 'rtmp://live.twitch.tv/app', + 'your-twitch-stream-key' +); +``` + +## WebRTC Streaming (WHIP/WHEP) + +### Browser to Stream (WHIP) + +```typescript +async function startWebRTCBroadcast(liveInputId: string) { + const pc = new RTCPeerConnection(); + + // Add local media tracks + const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true }); + stream.getTracks().forEach(track => pc.addTrack(track, stream)); + + // Create offer + const offer = await pc.createOffer(); + await pc.setLocalDescription(offer); + + // Send to Stream via WHIP + const response = await fetch( + `https://customer-<CODE>.cloudflarestream.com/${liveInputId}/webRTC/publish`, + { + method: 'POST', + headers: { 'Content-Type': 'application/sdp' }, + body: offer.sdp + } + ); + + const answer = 
await response.text(); + await pc.setRemoteDescription({ type: 'answer', sdp: answer }); +} +``` + +### Stream to Browser (WHEP) + +```typescript +async function playWebRTCStream(videoId: string) { + const pc = new RTCPeerConnection(); + + pc.addTransceiver('video', { direction: 'recvonly' }); + pc.addTransceiver('audio', { direction: 'recvonly' }); + + const offer = await pc.createOffer(); + await pc.setLocalDescription(offer); + + const response = await fetch( + `https://customer-<CODE>.cloudflarestream.com/${videoId}/webRTC/play`, + { + method: 'POST', + headers: { 'Content-Type': 'application/sdp' }, + body: offer.sdp + } + ); + + const answer = await response.text(); + await pc.setRemoteDescription({ type: 'answer', sdp: answer }); + + return pc; +} +``` + +## Recording Settings + +| Setting | Behavior | +|---------|----------| +| `mode: 'automatic'` | Record all live streams | +| `mode: 'off'` | No recording | +| `timeoutSeconds` | Stop recording after N seconds of inactivity | + +```typescript +const recordingConfig = { + mode: 'automatic', + timeoutSeconds: 30, // Auto-stop 30s after stream ends + requireSignedURLs: true, // Require token for VOD playback + allowedOrigins: ['https://yourdomain.com'] +}; +``` + +## In This Reference + +- [README.md](./README.md) - Overview and quick start +- [api.md](./api.md) - On-demand video APIs +- [configuration.md](./configuration.md) - Setup and config +- [patterns.md](./patterns.md) - Full-stack flows, best practices +- [gotchas.md](./gotchas.md) - Error codes, troubleshooting + +## See Also + +- [workers](../workers/) - Deploy live APIs in Workers diff --git a/cloudflare/references/stream/api.md b/cloudflare/references/stream/api.md new file mode 100644 index 0000000..0c35a71 --- /dev/null +++ b/cloudflare/references/stream/api.md @@ -0,0 +1,199 @@ +# Stream API Reference + +Upload, playback, live streaming, and management APIs. 
+ +## Upload APIs + +### Direct Creator Upload (Recommended) + +**Backend: Create upload URL (SDK)** +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ apiToken: env.CF_API_TOKEN }); + +const uploadData = await client.stream.directUpload.create({ + account_id: env.CF_ACCOUNT_ID, + maxDurationSeconds: 3600, + requireSignedURLs: true, + meta: { creator: 'user-123' } +}); +// Returns: { uploadURL: string, uid: string } +``` + +**Frontend: Upload file** +```typescript +async function uploadVideo(file: File, uploadURL: string) { + const formData = new FormData(); + formData.append('file', file); + return fetch(uploadURL, { method: 'POST', body: formData }).then(r => r.json()); +} +``` + +### Upload from URL + +```typescript +const video = await client.stream.copy.create({ + account_id: env.CF_ACCOUNT_ID, + url: 'https://example.com/video.mp4', + meta: { name: 'My Video' }, + requireSignedURLs: false +}); +``` + +## Playback APIs + +### Embed Player (iframe) + +```html +<iframe + src="https://customer-<CODE>.cloudflarestream.com/<VIDEO_UID>/iframe" + style="border: none" + allow="accelerometer; gyroscope; autoplay; encrypted-media; picture-in-picture;" + allowfullscreen +></iframe> +``` + +### HLS/DASH Manifest URLs + +```typescript +// HLS +const hlsUrl = `https://customer-<CODE>.cloudflarestream.com/${videoId}/manifest/video.m3u8`; + +// DASH +const dashUrl = `https://customer-<CODE>.cloudflarestream.com/${videoId}/manifest/video.mpd`; +``` + +### Thumbnails + +```typescript +// At specific time (seconds) +const thumb = `https://customer-<CODE>.cloudflarestream.com/${videoId}/thumbnails/thumbnail.jpg?time=10s`; + +// By percentage +const thumbPct = `https://customer-<CODE>.cloudflarestream.com/${videoId}/thumbnails/thumbnail.jpg?time=50%`; + +// Animated GIF +const gif = `https://customer-<CODE>.cloudflarestream.com/${videoId}/thumbnails/thumbnail.gif`; +``` + +## Signed URLs + +```typescript +// Low volume (<1k/day): Use API +async function getSignedToken(accountId: string, videoId: string, apiToken: string) { + const response = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/stream/${videoId}/token`, + { + method: 'POST', + headers: { 
'Authorization': `Bearer ${apiToken}`, 'Content-Type': 'application/json' }, + body: JSON.stringify({ + exp: Math.floor(Date.now() / 1000) + 3600, + accessRules: [{ type: 'ip.geoip.country', action: 'allow', country: ['US'] }] + }) + } + ); + return (await response.json()).result.token; +} + +// High volume: Self-sign with RS256 JWT (see "Self-Sign JWT" in patterns.md) +``` + +## Captions & Clips + +### Upload Captions + +```typescript +async function uploadCaption( + accountId: string, videoId: string, apiToken: string, + language: string, captionFile: File +) { + const formData = new FormData(); + formData.append('file', captionFile); + return fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/stream/${videoId}/captions/${language}`, + { + method: 'PUT', + headers: { 'Authorization': `Bearer ${apiToken}` }, + body: formData + } + ).then(r => r.json()); +} +``` + +### Generate AI Captions + +```typescript +// TODO: Requires Workers AI integration - see workers-ai reference +async function generateAICaptions(accountId: string, videoId: string, apiToken: string) { + return fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/stream/${videoId}/captions/generate`, + { + method: 'POST', + headers: { 'Authorization': `Bearer ${apiToken}`, 'Content-Type': 'application/json' }, + body: JSON.stringify({ language: 'en' }) + } + ).then(r => r.json()); +} +``` + +### Clip Video + +```typescript +async function clipVideo( + accountId: string, videoId: string, apiToken: string, + startTime: number, endTime: number +) { + return fetch( + `https://api.cloudflare.com/client/v4/accounts/${accountId}/stream/clip`, + { + method: 'POST', + headers: { 'Authorization': `Bearer ${apiToken}`, 'Content-Type': 'application/json' }, + body: JSON.stringify({ + clippedFromVideoUID: videoId, + startTimeSeconds: startTime, + endTimeSeconds: endTime + }) + } + ).then(r => r.json()); +} +``` + +## Video Management + +```typescript +// List videos +const videos = 
await client.stream.videos.list({ + account_id: env.CF_ACCOUNT_ID, + search: 'keyword' // optional +}); + +// Get video details +const video = await client.stream.videos.get(videoId, { + account_id: env.CF_ACCOUNT_ID +}); + +// Update video +await client.stream.videos.update(videoId, { + account_id: env.CF_ACCOUNT_ID, + meta: { title: 'New Title' }, + requireSignedURLs: true +}); + +// Delete video +await client.stream.videos.delete(videoId, { + account_id: env.CF_ACCOUNT_ID +}); +``` + +## In This Reference + +- [README.md](./README.md) - Overview and quick start +- [configuration.md](./configuration.md) - Setup and config +- [api-live.md](./api-live.md) - Live streaming APIs (RTMPS/SRT/WebRTC) +- [patterns.md](./patterns.md) - Full-stack flows, best practices +- [gotchas.md](./gotchas.md) - Error codes, troubleshooting + +## See Also + +- [workers](../workers/) - Deploy Stream APIs in Workers diff --git a/cloudflare/references/stream/configuration.md b/cloudflare/references/stream/configuration.md new file mode 100644 index 0000000..c4e6613 --- /dev/null +++ b/cloudflare/references/stream/configuration.md @@ -0,0 +1,141 @@ +# Stream Configuration + +Setup, environment variables, and wrangler configuration. 
+ +## Installation + +```bash +# Official Cloudflare SDK (Node.js, Workers, Pages) +npm install cloudflare + +# React component library +npm install @cloudflare/stream-react + +# TUS resumable uploads (large files) +npm install tus-js-client +``` + +## Environment Variables + +```bash +# Required +CF_ACCOUNT_ID=your-account-id +CF_API_TOKEN=your-api-token + +# For signed URLs (high volume) +STREAM_KEY_ID=your-key-id +STREAM_JWK=base64-encoded-jwk + +# For webhooks +WEBHOOK_SECRET=your-webhook-secret + +# Customer subdomain (from dashboard) +STREAM_CUSTOMER_CODE=your-customer-code +``` + +## Wrangler Configuration + +```jsonc +{ + "name": "stream-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date for new projects + "vars": { + "CF_ACCOUNT_ID": "your-account-id" + } + // Store secrets: wrangler secret put CF_API_TOKEN + // wrangler secret put STREAM_KEY_ID + // wrangler secret put STREAM_JWK + // wrangler secret put WEBHOOK_SECRET +} +``` + +## Signing Keys (High Volume) + +Create once for self-signing tokens (thousands of daily users). 
+ +**Create key** +```bash +curl -X POST \ + "https://api.cloudflare.com/client/v4/accounts/{account_id}/stream/keys" \ + -H "Authorization: Bearer <API_TOKEN>" + +# Save `id` and `jwk` (base64) from response +``` + +**Store in secrets** +```bash +wrangler secret put STREAM_KEY_ID +wrangler secret put STREAM_JWK +``` + +## Webhooks + +**Setup webhook URL** +```bash +curl -X PUT \ + "https://api.cloudflare.com/client/v4/accounts/{account_id}/stream/webhook" \ + -H "Authorization: Bearer <API_TOKEN>" \ + -H "Content-Type: application/json" \ + -d '{"notificationUrl": "https://your-worker.workers.dev/webhook"}' + +# Save the returned `secret` for signature verification +``` + +**Store secret** +```bash +wrangler secret put WEBHOOK_SECRET +``` + +## Direct Upload / Live / Watermark Config + +```typescript +// Direct upload +const uploadConfig = { + maxDurationSeconds: 3600, + expiry: new Date(Date.now() + 3600000).toISOString(), + requireSignedURLs: true, + allowedOrigins: ['https://yourdomain.com'], + meta: { creator: 'user-123' } +}; + +// Live input +const liveConfig = { + recording: { mode: 'automatic', timeoutSeconds: 30 }, + deleteRecordingAfterDays: 30 +}; + +// Watermark +const watermark = { + name: 'Logo', opacity: 0.7, padding: 20, + position: 'lowerRight', scale: 0.15 +}; +``` + +## Access Rules & Player Config + +```typescript +// Access rules: allow US/CA, block CN/RU, or IP allowlist +const geoRestrict = [ + { type: 'ip.geoip.country', action: 'allow', country: ['US', 'CA'] }, + { type: 'any', action: 'block' } +]; + +// Player params for iframe +const playerParams = new URLSearchParams({ + autoplay: 'true', muted: 'true', preload: 'auto', defaultTextTrack: 'en' +}); +``` + +## In This Reference + +- [README.md](./README.md) - Overview and quick start +- [api.md](./api.md) - On-demand video APIs +- [api-live.md](./api-live.md) - Live streaming APIs +- [patterns.md](./patterns.md) - Full-stack flows, best practices +- [gotchas.md](./gotchas.md) - Error codes, troubleshooting + 
+## See Also + +- [wrangler](../wrangler/) - Wrangler CLI and configuration +- [workers](../workers/) - Deploy Stream APIs in Workers diff --git a/cloudflare/references/stream/gotchas.md b/cloudflare/references/stream/gotchas.md new file mode 100644 index 0000000..2b1cf8b --- /dev/null +++ b/cloudflare/references/stream/gotchas.md @@ -0,0 +1,130 @@ +# Stream Gotchas + +## Common Errors + +### "ERR_NON_VIDEO" + +**Cause:** Uploaded file is not a valid video format +**Solution:** Ensure file is in supported format (MP4, MKV, MOV, AVI, FLV, MPEG-2 TS/PS, MXF, LXF, GXF, 3GP, WebM, MPG, QuickTime) + +### "ERR_DURATION_EXCEED_CONSTRAINT" + +**Cause:** Video duration exceeds `maxDurationSeconds` constraint +**Solution:** Increase `maxDurationSeconds` in direct upload config or trim video before upload + +### "ERR_FETCH_ORIGIN_ERROR" + +**Cause:** Failed to download video from URL (upload from URL) +**Solution:** Ensure URL is publicly accessible, uses HTTPS, and video file is available + +### "ERR_MALFORMED_VIDEO" + +**Cause:** Video file is corrupted or improperly encoded +**Solution:** Re-encode video using FFmpeg or check source file integrity + +### "ERR_DURATION_TOO_SHORT" + +**Cause:** Video must be at least 0.1 seconds long +**Solution:** Ensure video has valid duration (not a single frame) + +## Troubleshooting + +### Video stuck in "inprogress" state +- **Cause**: Processing large/complex video +- **Solution**: Wait up to 5 minutes for processing; use webhooks instead of polling + +### Signed URL returns 403 +- **Cause**: Token expired or invalid signature +- **Solution**: Check expiration timestamp, verify JWK is correct, ensure clock sync + +### Live stream not connecting +- **Cause**: Invalid RTMPS URL or stream key +- **Solution**: Use exact URL/key from API, ensure firewall allows outbound 443 + +### Webhook signature verification fails +- **Cause**: Incorrect secret or timestamp window +- **Solution**: Use exact secret from webhook setup, allow 5-minute 
timestamp drift + +### Video uploads but isn't visible +- **Cause**: `requireSignedURLs` enabled without providing token +- **Solution**: Generate signed token or set `requireSignedURLs: false` for public videos + +### Player shows infinite loading +- **Cause**: CORS issue with allowedOrigins +- **Solution**: Add your domain to `allowedOrigins` array + +## Limits + +| Resource | Limit | +|----------|-------| +| Max file size | 30 GB | +| Max frame rate | 60 fps (recommended) | +| Max duration per direct upload | Configurable via `maxDurationSeconds` | +| Token generation (API endpoint) | 1,000/day recommended (use signing keys for higher) | +| Live input outputs (simulcast) | 5 per live input | +| Webhook retry attempts | 5 (exponential backoff) | +| Webhook timeout | 30 seconds | +| Caption file size | 5 MB | +| Watermark image size | 2 MB | +| Metadata keys per video | Unlimited | +| Search results per page | Max 1,000 | + +## Performance Issues + +### Upload is slow +- **Cause**: Large file size or network constraints +- **Solution**: Use TUS resumable upload, compress video before upload, check bandwidth + +### Playback buffering +- **Cause**: Network congestion or low bandwidth +- **Solution**: Use ABR (adaptive bitrate) with HLS/DASH, reduce max bitrate + +### High processing time +- **Cause**: Complex video codec, high resolution +- **Solution**: Pre-encode with H.264 (most efficient), reduce resolution + +## Type Safety + +```typescript +// Error response type +interface StreamError { + success: false; + errors: Array<{ + code: number; + message: string; + }>; +} + +// Handle errors +async function uploadWithErrorHandling(url: string, file: File) { + const formData = new FormData(); + formData.append('file', file); + const response = await fetch(url, { method: 'POST', body: formData }); + const result = await response.json(); + + if (!result.success) { + throw new Error(result.errors[0]?.message || 'Upload failed'); + } + return result; +} +``` + +## 
Security Gotchas + +1. **Never expose API token in frontend** - Use direct creator uploads +2. **Always verify webhook signatures** - Prevent spoofed notifications +3. **Set appropriate token expiration** - Short-lived for security +4. **Use requireSignedURLs for private content** - Prevent unauthorized access +5. **Whitelist allowedOrigins** - Prevent hotlinking/embedding on unauthorized sites + +## In This Reference + +- [README.md](./README.md) - Overview and quick start +- [configuration.md](./configuration.md) - Setup and config +- [api.md](./api.md) - On-demand video APIs +- [api-live.md](./api-live.md) - Live streaming APIs +- [patterns.md](./patterns.md) - Full-stack flows, best practices + +## See Also + +- [workers](../workers/) - Deploy Stream APIs securely diff --git a/cloudflare/references/stream/patterns.md b/cloudflare/references/stream/patterns.md new file mode 100644 index 0000000..2e7782d --- /dev/null +++ b/cloudflare/references/stream/patterns.md @@ -0,0 +1,184 @@ +# Stream Patterns + +Common workflows, full-stack flows, and best practices. 
+ +## React Stream Player + +`npm install @cloudflare/stream-react` + +```tsx +import { Stream } from '@cloudflare/stream-react'; + +export function VideoPlayer({ videoId, token }: { videoId: string; token?: string }) { + // Pass the signed token as src when the video requires signed URLs + return <Stream controls src={token ?? videoId} />; +} +``` + +## Full-Stack Upload Flow + +**Backend API (Workers/Pages)** +```typescript +import Cloudflare from 'cloudflare'; + +export default { + async fetch(request: Request, env: Env): Promise<Response> { + const { videoName } = await request.json() as { videoName: string }; + const client = new Cloudflare({ apiToken: env.CF_API_TOKEN }); + const { uploadURL, uid } = await client.stream.directUpload.create({ + account_id: env.CF_ACCOUNT_ID, + maxDurationSeconds: 3600, + requireSignedURLs: true, + meta: { name: videoName } + }); + return Response.json({ uploadURL, uid }); + } +}; +``` + +**Frontend component** +```tsx +import { useState } from 'react'; + +export function VideoUploader() { + const [uploading, setUploading] = useState(false); + const [progress, setProgress] = useState(0); + + async function handleUpload(file: File) { + setUploading(true); + const { uploadURL, uid } = await fetch('/api/upload-url', { + method: 'POST', + body: JSON.stringify({ videoName: file.name }) + }).then(r => r.json()); + + const xhr = new XMLHttpRequest(); + xhr.upload.onprogress = (e) => setProgress((e.loaded / e.total) * 100); + xhr.onload = () => { setUploading(false); window.location.href = `/videos/${uid}`; }; + xhr.open('POST', uploadURL); + const formData = new FormData(); + formData.append('file', file); + xhr.send(formData); + } + + return (
+ <div> + <input type="file" onChange={(e) => e.target.files?.[0] && handleUpload(e.target.files[0])} disabled={uploading} /> + {uploading && <progress value={progress} max={100} />} + </div>
+ ); +} +``` + +## TUS Resumable Upload + +For large files (>500MB). `npm install tus-js-client` + +```typescript +import * as tus from 'tus-js-client'; + +async function uploadWithTUS(file: File, uploadURL: string, onProgress?: (pct: number) => void) { + return new Promise<string>((resolve, reject) => { + const upload = new tus.Upload(file, { + endpoint: uploadURL, + retryDelays: [0, 3000, 5000, 10000, 20000], + chunkSize: 50 * 1024 * 1024, + metadata: { filename: file.name, filetype: file.type }, + onError: reject, + onProgress: (up, total) => onProgress?.((up / total) * 100), + onSuccess: () => resolve(upload.url?.split('/').pop() || '') + }); + upload.start(); + }); +} +``` + +## Video State Polling + +```typescript +async function waitForVideoReady(client: Cloudflare, accountId: string, videoId: string) { + for (let i = 0; i < 60; i++) { + const video = await client.stream.videos.get(videoId, { account_id: accountId }); + if (video.readyToStream || video.status.state === 'error') return video; + await new Promise(resolve => setTimeout(resolve, 5000)); + } + throw new Error('Video processing timeout'); +} +``` + +## Webhook Handler + +```typescript +export default { + async fetch(request: Request, env: Env): Promise<Response> { + const signature = request.headers.get('Webhook-Signature'); + const body = await request.text(); + if (!signature || !await verifyWebhook(signature, body, env.WEBHOOK_SECRET)) { + return new Response('Unauthorized', { status: 401 }); + } + const payload = JSON.parse(body); + if (payload.readyToStream) console.log(`Video ${payload.uid} ready`); + return new Response('OK'); + } +}; + +async function verifyWebhook(sig: string, body: string, secret: string): Promise<boolean> { + const parts = Object.fromEntries(sig.split(',').map(p => p.split('='))); + const timestamp = parseInt(parts.time || '0', 10); + if (Math.abs(Date.now() / 1000 - timestamp) > 300) return false; + + const key = await crypto.subtle.importKey( + 'raw', new TextEncoder().encode(secret), { name: 
'HMAC', hash: 'SHA-256' }, false, ['sign'] + ); + const computed = await crypto.subtle.sign('HMAC', key, new TextEncoder().encode(`${timestamp}.${body}`)); + const hex = Array.from(new Uint8Array(computed), b => b.toString(16).padStart(2, '0')).join(''); + return hex === parts.sig1; +} +``` + +## Self-Sign JWT (High Volume Tokens) + +For >1k tokens/day. Prerequisites: Create signing key (see configuration.md). + +```typescript +async function selfSignToken(keyId: string, jwkBase64: string, videoId: string, expiresIn = 3600) { + const key = await crypto.subtle.importKey( + 'jwk', JSON.parse(atob(jwkBase64)), { name: 'RSASSA-PKCS1-v1_5', hash: 'SHA-256' }, false, ['sign'] + ); + const now = Math.floor(Date.now() / 1000); + const header = btoa(JSON.stringify({ alg: 'RS256', kid: keyId })).replace(/=/g, '').replace(/\+/g, '-').replace(/\//g, '_'); + const payload = btoa(JSON.stringify({ sub: videoId, kid: keyId, exp: now + expiresIn, nbf: now })) + .replace(/=/g, '').replace(/\+/g, '-').replace(/\//g, '_'); + const message = `${header}.${payload}`; + const sig = await crypto.subtle.sign('RSASSA-PKCS1-v1_5', key, new TextEncoder().encode(message)); + const b64Sig = btoa(String.fromCharCode(...new Uint8Array(sig))).replace(/=/g, '').replace(/\+/g, '-').replace(/\//g, '_'); + return `${message}.${b64Sig}`; +} + +// With access rules (geo-restriction) +const payloadWithRules = { + sub: videoId, kid: keyId, exp: now + 3600, nbf: now, + accessRules: [{ type: 'ip.geoip.country', action: 'allow', country: ['US'] }] +}; +``` + +## Best Practices + +- **Use Direct Creator Uploads** - Avoid proxying through servers +- **Enable requireSignedURLs** - Control private content access +- **Self-sign tokens at scale** - Use signing keys for >1k/day +- **Set allowedOrigins** - Prevent hotlinking +- **Use webhooks over polling** - Efficient status updates +- **Set maxDurationSeconds** - Prevent abuse +- **Enable live recordings** - Auto VOD after stream + +## In This Reference + +- 
[README.md](./README.md) - Overview and quick start +- [configuration.md](./configuration.md) - Setup and config +- [api.md](./api.md) - On-demand video APIs +- [api-live.md](./api-live.md) - Live streaming APIs +- [gotchas.md](./gotchas.md) - Error codes, troubleshooting + +## See Also + +- [workers](../workers/) - Deploy Stream APIs in Workers +- [pages](../pages/) - Integrate Stream with Pages diff --git a/cloudflare/references/tail-workers/README.md b/cloudflare/references/tail-workers/README.md new file mode 100644 index 0000000..d17da7d --- /dev/null +++ b/cloudflare/references/tail-workers/README.md @@ -0,0 +1,89 @@ +# Cloudflare Tail Workers + +Specialized Workers that consume execution events from producer Workers for logging, debugging, analytics, and observability. + +## When to Use This Reference + +- Implementing observability/logging for Cloudflare Workers +- Processing Worker execution events, logs, exceptions +- Building custom analytics or error tracking +- Configuring real-time event streaming +- Working with tail handlers or tail consumers + +## Core Concepts + +### What Are Tail Workers? + +Tail Workers automatically process events from producer Workers (the Workers being monitored). They receive: +- HTTP request/response info +- Console logs (`console.log/error/warn/debug`) +- Uncaught exceptions +- Execution outcomes (`ok`, `exception`, `exceededCpu`, etc.) 
+- Diagnostic channel events + +**Key characteristics:** +- Invoked AFTER producer finishes executing +- Capture entire request lifecycle including Service Bindings and Dynamic Dispatch sub-requests +- Billed by CPU time, not request count +- Available on Workers Paid and Enterprise tiers + +### Alternative: OpenTelemetry Export + +**Before using Tail Workers, consider OpenTelemetry:** + +For batch exports to observability tools (Sentry, Grafana, Honeycomb): +- OTEL export sends logs/traces in batches (more efficient) +- Built-in integrations with popular platforms +- Lower overhead than Tail Workers +- **Use Tail Workers only for custom real-time processing** + +## Decision Tree + +``` +Need observability for Workers? +├─ Batch export to known tools (Sentry/Grafana/Honeycomb)? +│ └─ Use OpenTelemetry export (not Tail Workers) +├─ Custom real-time processing needed? +│ ├─ Aggregated metrics? +│ │ └─ Use Tail Worker + Analytics Engine +│ ├─ Error tracking? +│ │ └─ Use Tail Worker + external service +│ ├─ Custom logging/debugging? +│ │ └─ Use Tail Worker + KV/HTTP endpoint +│ └─ Complex event processing? +│ └─ Use Tail Worker + Durable Objects +└─ Quick debugging? + └─ Use `wrangler tail` (different from Tail Workers) +``` + +## Reading Order + +1. **[configuration.md](configuration.md)** - Set up Tail Workers +2. **[api.md](api.md)** - Handler signature, types, redaction +3. **[patterns.md](patterns.md)** - Common use cases and integrations +4. 
**[gotchas.md](gotchas.md)** - Pitfalls and debugging tips + +## Quick Example + +```typescript +export default { + async tail(events, env, ctx) { + // Process events from producer Worker + ctx.waitUntil( + fetch(env.LOG_ENDPOINT, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(events), + }) + ); + } +}; +``` + +## Related Skills + +- **observability** - General Workers observability patterns, OTEL export +- **analytics-engine** - Aggregated metrics storage for tail event data +- **durable-objects** - Stateful event processing, batching tail events +- **logpush** - Alternative for batch log export (non-real-time) +- **workers-for-platforms** - Dynamic dispatch with tail consumers diff --git a/cloudflare/references/tail-workers/api.md b/cloudflare/references/tail-workers/api.md new file mode 100644 index 0000000..624d9e9 --- /dev/null +++ b/cloudflare/references/tail-workers/api.md @@ -0,0 +1,200 @@ +# Tail Workers API Reference + +## Handler Signature + +```typescript +export default { + async tail( + events: TraceItem[], + env: Env, + ctx: ExecutionContext + ): Promise<void> { + // Process events + } +} satisfies ExportedHandler<Env>; +``` + +**Parameters:** +- `events`: Array of `TraceItem` objects (one per producer invocation) +- `env`: Bindings (KV, D1, R2, env vars, etc.) +- `ctx`: Context with `waitUntil()` for async work + +**CRITICAL:** Tail handlers don't return values. Use `ctx.waitUntil()` for async operations.
+ +## TraceItem Type + +```typescript +interface TraceItem { + scriptName: string; // Producer Worker name + eventTimestamp: number; // Epoch milliseconds + outcome: 'ok' | 'exception' | 'exceededCpu' | 'exceededMemory' + | 'canceled' | 'scriptNotFound' | 'responseStreamDisconnected' | 'unknown'; + + event?: { + request?: { + url: string; // Redacted by default + method: string; + headers: Record<string, string>; // Sensitive headers redacted + cf?: IncomingRequestCfProperties; + getUnredacted(): TraceRequest; // Bypass redaction (use carefully) + }; + response?: { + status: number; + }; + }; + + logs: Array<{ + timestamp: number; // Epoch milliseconds + level: 'debug' | 'info' | 'log' | 'warn' | 'error'; + message: unknown[]; // Args passed to console function + }>; + + exceptions: Array<{ + timestamp: number; // Epoch milliseconds + name: string; // Error type (Error, TypeError, etc.) + message: string; // Error description + }>; + + diagnosticsChannelEvents: Array<{ + channel: string; + message: unknown; + timestamp: number; // Epoch milliseconds + }>; +} +``` + +**Note:** Official SDK uses `TraceItem`, not `TailItem`. Use `@cloudflare/workers-types` for accurate types. + +## Timestamp Handling + +All timestamps are **epoch milliseconds**, not seconds: + +```typescript +// ✅ CORRECT - use directly with Date +const date = new Date(event.eventTimestamp); + +// ❌ WRONG - don't multiply by 1000 +const date = new Date(event.eventTimestamp * 1000); +``` + +## Automatic Redaction + +By default, sensitive data is redacted from `TraceRequest`: + +### Header Redaction + +Headers containing these substrings (case-insensitive): +- `auth`, `key`, `secret`, `token`, `jwt` +- `cookie`, `set-cookie` + +Redacted values show as `"REDACTED"`.
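Because redaction replaces values with the literal string `"REDACTED"`, a tail handler can cheaply drop those entries before forwarding headers to an external sink. A minimal sketch (the `stripRedacted` helper is illustrative, not part of the Workers API):

```typescript
// Illustrative helper (not part of the Workers API): drop any header whose
// value arrived as the literal redaction marker before forwarding logs.
function stripRedacted(
  headers: Record<string, string>
): Record<string, string> {
  return Object.fromEntries(
    Object.entries(headers).filter(([, value]) => value !== "REDACTED")
  );
}

const forwarded = stripRedacted({
  "content-type": "application/json",
  "authorization": "REDACTED", // matched the `auth` substring
});
// forwarded now contains only "content-type"
```

This keeps redacted placeholders out of downstream storage without calling `getUnredacted()`.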
+ +### URL Redaction + +- **Hex IDs:** 32+ hex digits → `"REDACTED"` +- **Base-64 IDs:** 21+ chars with 2+ upper, 2+ lower, 2+ digits → `"REDACTED"` + +## Bypassing Redaction + +```typescript +export default { + async tail(events, env, ctx) { + for (const event of events) { + // ⚠️ Use with extreme caution + const unredacted = event.event?.request?.getUnredacted(); + // unredacted.url and unredacted.headers contain raw values + } + } +}; +``` + +**Best practices:** +- Only call `getUnredacted()` when absolutely necessary +- Never log unredacted sensitive data +- Implement additional filtering before external transmission +- Use environment variables for API keys, never hardcode + +## Type-Safe Handler + +```typescript +interface Env { + LOGS_KV: KVNamespace; + ANALYTICS: AnalyticsEngineDataset; + LOG_ENDPOINT: string; + API_TOKEN: string; +} + +export default { + async tail( + events: TraceItem[], + env: Env, + ctx: ExecutionContext + ): Promise<void> { + const payload = events.map(event => ({ + script: event.scriptName, + timestamp: event.eventTimestamp, + outcome: event.outcome, + url: event.event?.request?.url, + status: event.event?.response?.status, + })); + + ctx.waitUntil( + fetch(env.LOG_ENDPOINT, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(payload), + }) + ); + } +} satisfies ExportedHandler<Env>; +``` + +## Outcome vs HTTP Status + +**IMPORTANT:** `outcome` is script execution status, NOT HTTP status.
+ +- Worker returns 500 → `outcome='ok'` if script completed successfully +- Uncaught exception → `outcome='exception'` regardless of HTTP status +- CPU limit exceeded → `outcome='exceededCpu'` + +```typescript +// ✅ Check outcome for script execution status +if (event.outcome === 'exception') { + // Script threw uncaught exception +} + +// ✅ Check HTTP status separately +if (event.event?.response?.status === 500) { + // HTTP 500 returned (script may have handled error) +} +``` + +## Serialization Considerations + +`log.message` is `unknown[]` and may contain non-serializable objects: + +```typescript +// ❌ May fail with circular references or BigInt +JSON.stringify(events); + +// ✅ Safe serialization +const safePayload = events.map(event => ({ + ...event, + logs: event.logs.map(log => ({ + ...log, + message: log.message.map(m => { + try { + return JSON.parse(JSON.stringify(m)); + } catch { + return String(m); + } + }) + })) +})); +``` + +**Common serialization issues:** +- Circular references in logged objects +- `BigInt` values (not JSON-serializable) +- Functions or symbols in console.log arguments +- Large objects exceeding body size limits diff --git a/cloudflare/references/tail-workers/configuration.md b/cloudflare/references/tail-workers/configuration.md new file mode 100644 index 0000000..96fb33f --- /dev/null +++ b/cloudflare/references/tail-workers/configuration.md @@ -0,0 +1,176 @@ +# Tail Workers Configuration + +## Setup Steps + +### 1. Create Tail Worker + +Create a Worker with a `tail()` handler: + +```typescript +export default { + async tail(events, env, ctx) { + // Process events from producer Worker + ctx.waitUntil( + fetch(env.LOG_ENDPOINT, { + method: "POST", + body: JSON.stringify(events), + }) + ); + } +}; +``` + +### 2. Configure Producer Worker + +In producer's `wrangler.jsonc`: + +```jsonc +{ + "name": "my-producer-worker", + "tail_consumers": [ + { + "service": "my-tail-worker" + } + ] +} +``` + +### 3. 
Deploy Both Workers + +```bash +# Deploy Tail Worker first +cd tail-worker +wrangler deploy + +# Then deploy producer Worker +cd ../producer-worker +wrangler deploy +``` + +## Wrangler Configuration + +### Single Tail Consumer + +```jsonc +{ + "name": "producer-worker", + "tail_consumers": [ + { + "service": "logging-tail-worker" + } + ] +} +``` + +### Multiple Tail Consumers + +```jsonc +{ + "name": "producer-worker", + "tail_consumers": [ + { + "service": "logging-tail-worker" + }, + { + "service": "metrics-tail-worker" + } + ] +} +``` + +**Note:** Each consumer receives ALL events independently. + +### Remove Tail Consumer + +```jsonc +{ + "tail_consumers": [] +} +``` + +Then redeploy producer Worker. + +## Environment Variables + +Tail Workers use same binding syntax as regular Workers: + +```jsonc +{ + "name": "my-tail-worker", + "vars": { + "LOG_ENDPOINT": "https://logs.example.com/ingest" + }, + "kv_namespaces": [ + { + "binding": "LOGS_KV", + "id": "abc123..." + } + ] +} +``` + +## Testing & Development + +### Local Testing + +**Tail Workers cannot be fully tested with `wrangler dev`.** Deploy to staging environment for testing. + +### Testing Strategy + +1. Deploy producer Worker to staging +2. Deploy Tail Worker to staging +3. Configure `tail_consumers` in producer +4. Trigger producer Worker requests +5. 
Verify Tail Worker receives events (check destination logs/storage) + +### Wrangler Tail Command + +```bash +# Stream logs to terminal (NOT Tail Workers) +wrangler tail my-producer-worker +``` + +**This is different from Tail Workers:** +- `wrangler tail` streams logs to your terminal +- Tail Workers are Workers that process events programmatically + +## Deployment Checklist + +- [ ] Tail Worker has `tail()` handler +- [ ] Tail Worker deployed before producer +- [ ] Producer's `wrangler.jsonc` has correct `tail_consumers` +- [ ] Environment variables configured +- [ ] Tested with staging environment +- [ ] Monitoring configured for Tail Worker itself + +## Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Max tail consumers per producer | 10 | Each receives all events independently | +| Events batch size | Up to 100 events per invocation | Larger batches split across invocations | +| Tail Worker CPU time | Same as regular Workers | 10ms (free), 30ms (paid), 50ms (paid bundle) | +| Pricing tier | Workers Paid or Enterprise | Not available on free plan | +| Request body size | 100 MB max | When sending to external endpoints | +| Event retention | None | Events not retried if tail handler fails | + +## Workers for Platforms + +For dynamic dispatch Workers, both dispatch and user Worker events sent to tail consumer: + +```jsonc +{ + "name": "dispatch-worker", + "tail_consumers": [ + { + "service": "platform-tail-worker" + } + ] +} +``` + +Tail Worker receives TWO `TraceItem` elements per request: +1. Dynamic dispatch Worker event +2. User Worker event + +See [patterns.md](patterns.md) for handling. diff --git a/cloudflare/references/tail-workers/gotchas.md b/cloudflare/references/tail-workers/gotchas.md new file mode 100644 index 0000000..4865d0e --- /dev/null +++ b/cloudflare/references/tail-workers/gotchas.md @@ -0,0 +1,192 @@ +# Tail Workers Gotchas & Debugging + +## Critical Pitfalls + +### 1. 
Not Using `ctx.waitUntil()` + +**Problem:** Async work doesn't complete or tail Worker times out +**Cause:** Handlers exit immediately; awaiting blocks processing +**Solution:** + +```typescript +// ❌ WRONG - fire and forget +export default { + async tail(events) { + fetch(endpoint, { body: JSON.stringify(events) }); + } +}; + +// ❌ WRONG - blocking await +export default { + async tail(events, env, ctx) { + await fetch(endpoint, { body: JSON.stringify(events) }); + } +}; + +// ✅ CORRECT +export default { + async tail(events, env, ctx) { + ctx.waitUntil( + (async () => { + await fetch(endpoint, { body: JSON.stringify(events) }); + await processMore(); + })() + ); + } +}; +``` + +### 2. Missing `tail()` Handler + +**Problem:** Producer deployment fails +**Cause:** Worker in `tail_consumers` doesn't export `tail()` handler +**Solution:** Ensure `export default { async tail(events, env, ctx) { ... } }` + +### 3. Outcome vs HTTP Status + +**Problem:** Filtering by wrong status +**Cause:** `outcome` is script execution status, not HTTP status + +```typescript +// ❌ WRONG +if (event.outcome === 500) { /* never matches */ } + +// ✅ CORRECT +if (event.outcome === 'exception') { /* script threw */ } +if (event.event?.response?.status === 500) { /* HTTP 500 */ } +``` + +### 4. Timestamp Units + +**Problem:** Dates off by 1000x +**Cause:** Timestamps are epoch milliseconds, not seconds + +```typescript +// ❌ WRONG: const date = new Date(event.eventTimestamp * 1000); +// ✅ CORRECT: const date = new Date(event.eventTimestamp); +``` + +### 5. Type Name Mismatch + +**Problem:** Using `TailItem` type +**Cause:** Old docs used `TailItem`, SDK uses `TraceItem` + +```typescript +import type { TraceItem } from '@cloudflare/workers-types'; +export default { + async tail(events: TraceItem[], env, ctx) { /* ... */ } +}; +``` + +### 6. 
Excessive Logging Volume + +**Problem:** Unexpected high costs +**Cause:** Invoked on EVERY producer request +**Solution:** Sample events + +```typescript +export default { + async tail(events, env, ctx) { + if (Math.random() > 0.1) return; // 10% sample + ctx.waitUntil(sendToEndpoint(events)); + } +}; +``` + +### 7. Serialization Issues + +**Problem:** `JSON.stringify()` fails +**Cause:** `log.message` is `unknown[]` with non-serializable values +**Solution:** + +```typescript +const safePayload = events.map(e => ({ + ...e, + logs: e.logs.map(log => ({ + ...log, + message: log.message.map(m => { + try { return JSON.parse(JSON.stringify(m)); } + catch { return String(m); } + }) + })) +})); +``` + +### 8. Missing Error Handling + +**Problem:** Tail Worker silently fails +**Cause:** No try/catch +**Solution:** + +```typescript +ctx.waitUntil((async () => { + try { + await fetch(env.ENDPOINT, { body: JSON.stringify(events) }); + } catch (error) { + console.error("Tail error:", error); + await env.FALLBACK_KV.put(`failed:${Date.now()}`, JSON.stringify(events)); + } +})()); +``` + +### 9. Deployment Order + +**Problem:** Producer deployment fails +**Cause:** Tail consumer not deployed yet +**Solution:** Deploy tail consumer FIRST + +```bash +cd tail-worker && wrangler deploy +cd ../producer && wrangler deploy +``` + +### 10. No Event Retry + +**Problem:** Events lost when handler fails +**Cause:** Failed invocations NOT retried +**Solution:** Implement fallback storage (see #8) + +## Debugging + +**View logs:** `wrangler tail my-tail-worker` + +**Incremental testing:** +1. Verify receipt: `console.log('Events:', events.length)` +2. Inspect structure: `console.log(JSON.stringify(events[0], null, 2))` +3. 
Add external call with `ctx.waitUntil()` + +**Monitor dashboard:** Check invocation count (matches producer?), error rate, CPU time + +## Testing + +Add test endpoint to producer: + +```typescript +export default { + async fetch(request) { + if (request.url.includes('/test')) { + console.log('Test log'); + throw new Error('Test error'); + } + return new Response('OK'); + } +}; +``` + +Trigger: `curl https://producer.example.workers.dev/test` + +## Common Errors + +| Error | Cause | Solution | +|-------|-------|----------| +| "Tail consumer not found" | Not deployed | Deploy tail Worker first | +| "No tail handler" | Missing `tail()` | Add to default export | +| "waitUntil is not a function" | Missing `ctx` | Add `ctx` parameter | +| Timeout | Blocking await | Use `ctx.waitUntil()` | + +## Performance Notes + +- Max 100 events per invocation +- Each consumer receives all events independently +- CPU limits same as regular Workers +- For high volume, use Durable Objects batching diff --git a/cloudflare/references/tail-workers/patterns.md b/cloudflare/references/tail-workers/patterns.md new file mode 100644 index 0000000..a696ec2 --- /dev/null +++ b/cloudflare/references/tail-workers/patterns.md @@ -0,0 +1,180 @@ +# Tail Workers Common Patterns + +## Community Libraries + +While most tail Worker implementations are custom, these libraries may help: + +**Logging/Observability:** +- **Axiom** - `axiom-cloudflare-workers` (npm) - Direct Axiom integration +- **Baselime** - SDK for Baselime observability platform +- **LogFlare** - Structured log aggregation + +**Type Definitions:** +- **@cloudflare/workers-types** - Official TypeScript types (use `TraceItem`) + +**Note:** Most integrations require custom tail handler implementation. See integration examples below. 
+ +## Basic Patterns + +### HTTP Endpoint Logging + +```typescript +export default { + async tail(events, env, ctx) { + const payload = events.map(event => ({ + script: event.scriptName, + timestamp: event.eventTimestamp, + outcome: event.outcome, + url: event.event?.request?.url, + status: event.event?.response?.status, + logs: event.logs, + exceptions: event.exceptions, + })); + + ctx.waitUntil( + fetch(env.LOG_ENDPOINT, { + method: "POST", + body: JSON.stringify(payload), + }) + ); + } +}; +``` + +### Error Tracking Only + +```typescript +export default { + async tail(events, env, ctx) { + const errors = events.filter(e => + e.outcome === 'exception' || e.exceptions.length > 0 + ); + + if (errors.length === 0) return; + + ctx.waitUntil( + fetch(env.ERROR_ENDPOINT, { + method: "POST", + body: JSON.stringify(errors), + }) + ); + } +}; +``` + +## Storage Integration + +### KV Storage with TTL + +```typescript +export default { + async tail(events, env, ctx) { + ctx.waitUntil( + Promise.all(events.map(event => + env.LOGS_KV.put( + `log:${event.scriptName}:${event.eventTimestamp}`, + JSON.stringify(event), + { expirationTtl: 86400 } // 24 hours + ) + )) + ); + } +}; +``` + +### Analytics Engine Metrics + +```typescript +export default { + async tail(events, env, ctx) { + ctx.waitUntil( + Promise.all(events.map(event => + env.ANALYTICS.writeDataPoint({ + blobs: [event.scriptName, event.outcome], + doubles: [1, event.event?.response?.status ?? 0], + indexes: [event.event?.request?.cf?.colo ?? 
'unknown'], + }) + )) + ); + } +}; +``` + +## Filtering & Routing + +Filter by route, outcome, or other criteria: + +```typescript +export default { + async tail(events, env, ctx) { + // Route filtering + const apiEvents = events.filter(e => + e.event?.request?.url?.includes('/api/') + ); + + // Multi-destination routing + const errors = events.filter(e => e.outcome === 'exception'); + const success = events.filter(e => e.outcome === 'ok'); + + const tasks = []; + if (errors.length > 0) { + tasks.push(fetch(env.ERROR_ENDPOINT, { + method: "POST", + body: JSON.stringify(errors), + })); + } + if (success.length > 0) { + tasks.push(fetch(env.SUCCESS_ENDPOINT, { + method: "POST", + body: JSON.stringify(success), + })); + } + + ctx.waitUntil(Promise.all(tasks)); + } +}; +``` + +## Sampling + +Reduce costs by processing only a percentage of events: + +```typescript +export default { + async tail(events, env, ctx) { + if (Math.random() > 0.1) return; // 10% sample rate + ctx.waitUntil(fetch(env.LOG_ENDPOINT, { + method: "POST", + body: JSON.stringify(events), + })); + } +}; +``` + +## Advanced Patterns + +### Batching with Durable Objects + +Accumulate events before sending: + +```typescript +export default { + async tail(events, env, ctx) { + const batch = env.BATCH_DO.get(env.BATCH_DO.idFromName("batch")); + ctx.waitUntil(batch.fetch("https://batch/add", { + method: "POST", + body: JSON.stringify(events), + })); + } +}; +``` + +See durable-objects skill for full implementation. + +### Workers for Platforms + +Dynamic dispatch sends TWO events per request. Filter by `scriptName` to distinguish dispatch vs user Worker events. + +### Error Handling + +Always wrap external calls. See gotchas.md for fallback storage pattern. 
diff --git a/cloudflare/references/terraform/README.md b/cloudflare/references/terraform/README.md new file mode 100644 index 0000000..17d8a30 --- /dev/null +++ b/cloudflare/references/terraform/README.md @@ -0,0 +1,102 @@ +# Cloudflare Terraform Provider + +**Expert guidance for Cloudflare Terraform Provider - infrastructure as code for Cloudflare resources.** + +## Core Principles + +- **Provider-first**: Use Terraform provider for ALL infrastructure - never mix with wrangler.jsonc for the same resources +- **State management**: Always use remote state (S3, Terraform Cloud, etc.) for team environments +- **Modular architecture**: Create reusable modules for common patterns (zones, workers, pages) +- **Version pinning**: Always pin provider version with `~>` for predictable upgrades +- **Secret management**: Use variables + environment vars for sensitive data - never hardcode API tokens + +## Provider Version + +| Version | Status | Notes | +|---------|--------|-------| +| 5.x | Current | Auto-generated from OpenAPI, breaking changes from v4 | +| 4.x | Legacy | Manual maintenance, deprecated | + +**Critical:** v5 renamed many resources (`cloudflare_record` → `cloudflare_dns_record`, `cloudflare_worker_*` → `cloudflare_workers_*`). See [gotchas.md](./gotchas.md#v5-breaking-changes) for migration details. + +## Provider Setup + +### Basic Configuration + +```hcl +terraform { + required_version = ">= 1.0" + + required_providers { + cloudflare = { + source = "cloudflare/cloudflare" + version = "~> 5.15.0" + } + } +} + +provider "cloudflare" { + api_token = var.cloudflare_api_token # or CLOUDFLARE_API_TOKEN env var +} +``` + +### Authentication Methods (priority order) + +1. **API Token** (RECOMMENDED): `api_token` or `CLOUDFLARE_API_TOKEN` + - Create: Dashboard → My Profile → API Tokens + - Scope to specific accounts/zones for security + +2. 
**Global API Key** (LEGACY): `api_key` + `api_email` or `CLOUDFLARE_API_KEY` + `CLOUDFLARE_EMAIL` + - Less secure, use tokens instead + +3. **User Service Key**: `user_service_key` for Origin CA certificates + + + +## Quick Reference: Common Commands + +```bash +terraform init # Initialize provider +terraform plan # Plan changes +terraform apply # Apply changes +terraform destroy # Destroy resources +terraform import cloudflare_zone.example <zone_id> # Import existing +terraform state list # List resources in state +terraform output # Show outputs +terraform fmt -recursive # Format code +terraform validate # Validate configuration +``` + +## Import Existing Resources + +Use cf-terraforming to generate configs from existing Cloudflare resources: + +```bash +# Install +brew install cloudflare/cloudflare/cf-terraforming + +# Generate HCL from existing resources +cf-terraforming generate --resource-type cloudflare_dns_record --zone <zone_id> + +# Import into Terraform state +cf-terraforming import --resource-type cloudflare_dns_record --zone <zone_id> +``` + +## Reading Order + +1. Start with [README.md](./README.md) for provider setup and authentication +2. Review [configuration.md](./configuration.md) for resource configurations +3. Check [api.md](./api.md) for data sources and existing resource queries +4. See [patterns.md](./patterns.md) for multi-environment and CI/CD patterns +5.
Read [gotchas.md](./gotchas.md) for state drift, v5 breaking changes, and troubleshooting + +## In This Reference +- [configuration.md](./configuration.md) - Resources for zones, DNS, workers, KV, R2, D1, Pages, rulesets +- [api.md](./api.md) - Data sources for existing resources +- [patterns.md](./patterns.md) - Architecture patterns, multi-env setup, CI/CD integration +- [gotchas.md](./gotchas.md) - Common issues, security, best practices + +## See Also +- [pulumi](../pulumi/) - Alternative IaC tool for Cloudflare +- [wrangler](../wrangler/) - CLI deployment alternative +- [workers](../workers/) - Worker runtime documentation diff --git a/cloudflare/references/terraform/api.md b/cloudflare/references/terraform/api.md new file mode 100644 index 0000000..8a06c1c --- /dev/null +++ b/cloudflare/references/terraform/api.md @@ -0,0 +1,178 @@ +# Terraform Data Sources Reference + +Query existing Cloudflare resources to reference in your configurations. + +## v5 Data Source Names + +| v4 Name | v5 Name | Notes | +|---------|---------|-------| +| `cloudflare_record` | `cloudflare_dns_record` | | +| `cloudflare_worker_script` | `cloudflare_workers_script` | Note: plural | +| `cloudflare_access_*` | `cloudflare_zero_trust_*` | Access → Zero Trust | + +## Zone Data Sources + +```hcl +# Get zone by name +data "cloudflare_zone" "example" { + name = "example.com" +} + +# Use in resources +resource "cloudflare_dns_record" "www" { + zone_id = data.cloudflare_zone.example.id + name = "www" + # ... +} +``` + +## Account Data Sources + +```hcl +# List all accounts +data "cloudflare_accounts" "main" { + name = "My Account" +} + +# Use account ID (v5 name: cloudflare_workers_script) +resource "cloudflare_workers_script" "api" { + account_id = data.cloudflare_accounts.main.accounts[0].id + # ...
+} +``` + +## Worker Data Sources + +```hcl +# Get existing worker script (v5: cloudflare_workers_script) +data "cloudflare_workers_script" "existing" { + account_id = var.account_id + name = "existing-worker" +} + +# Reference in service bindings +resource "cloudflare_workers_script" "consumer" { + service_binding { + name = "UPSTREAM" + service = data.cloudflare_workers_script.existing.name + } +} +``` + +## KV Data Sources + +```hcl +# Get KV namespace +data "cloudflare_workers_kv_namespace" "existing" { + account_id = var.account_id + namespace_id = "abc123" +} + +# Use in worker binding +resource "cloudflare_workers_script" "api" { + kv_namespace_binding { + name = "KV" + namespace_id = data.cloudflare_workers_kv_namespace.existing.id + } +} +``` + +## Lists Data Source + +```hcl +# Get IP lists for WAF rules +data "cloudflare_list" "blocked_ips" { + account_id = var.account_id + name = "blocked_ips" +} +``` + +## IP Ranges Data Source + +```hcl +# Get Cloudflare IP ranges (for firewall rules) +data "cloudflare_ip_ranges" "cloudflare" {} + +output "ipv4_cidrs" { + value = data.cloudflare_ip_ranges.cloudflare.ipv4_cidr_blocks +} + +output "ipv6_cidrs" { + value = data.cloudflare_ip_ranges.cloudflare.ipv6_cidr_blocks +} + +# Use in security group rules (AWS example) +resource "aws_security_group_rule" "allow_cloudflare" { + type = "ingress" + from_port = 443 + to_port = 443 + protocol = "tcp" + cidr_blocks = data.cloudflare_ip_ranges.cloudflare.ipv4_cidr_blocks + security_group_id = aws_security_group.web.id +} +``` + +## Common Patterns + +### Import ID Formats + +| Resource | Import ID Format | +|----------|------------------| +| `cloudflare_zone` | `<zone_id>` | +| `cloudflare_dns_record` | `<zone_id>/<record_id>` | +| `cloudflare_workers_script` | `<account_id>/<script_name>` | +| `cloudflare_workers_kv_namespace` | `<account_id>/<namespace_id>` | +| `cloudflare_r2_bucket` | `<account_id>/<bucket_name>` | +| `cloudflare_d1_database` | `<account_id>/<database_id>` | +| `cloudflare_pages_project` | `<account_id>/<project_name>` | + +```bash +# Example: Import DNS record +terraform import
cloudflare_dns_record.example <zone_id>/<record_id> +``` + +### Reference Across Modules + +```hcl +# modules/worker/main.tf +data "cloudflare_zone" "main" { + name = var.domain +} + +resource "cloudflare_worker_route" "api" { + zone_id = data.cloudflare_zone.main.id + pattern = "api.${var.domain}/*" + script_name = cloudflare_worker_script.api.name +} +``` + +### Output Important Values + +```hcl +output "zone_id" { + value = cloudflare_zone.main.id + description = "Zone ID for DNS management" +} + +output "worker_url" { + value = "https://${cloudflare_worker_domain.api.hostname}" + description = "Worker API endpoint" +} + +output "kv_namespace_id" { + value = cloudflare_workers_kv_namespace.app.id + sensitive = false +} + +output "name_servers" { + value = cloudflare_zone.main.name_servers + description = "Name servers for domain registration" +} +``` + +## See Also + +- [README](./README.md) - Provider setup +- [Configuration Reference](./configuration.md) - All resource types +- [Patterns](./patterns.md) - Architecture patterns +- [Troubleshooting](./gotchas.md) - Common issues diff --git a/cloudflare/references/terraform/configuration.md b/cloudflare/references/terraform/configuration.md new file mode 100644 index 0000000..4b5eeb5 --- /dev/null +++ b/cloudflare/references/terraform/configuration.md @@ -0,0 +1,197 @@ +# Terraform Configuration Reference + +Complete resource configurations for Cloudflare infrastructure.
+ +## Zone & DNS + +```hcl +# Zone + settings +resource "cloudflare_zone" "example" { account = { id = var.account_id }; name = "example.com"; type = "full" } +resource "cloudflare_zone_settings_override" "example" { + zone_id = cloudflare_zone.example.id + settings { ssl = "strict"; always_use_https = "on"; min_tls_version = "1.2"; tls_1_3 = "on"; http3 = "on" } +} + +# DNS records (A, CNAME, MX, TXT) +resource "cloudflare_dns_record" "www" { + zone_id = cloudflare_zone.example.id; name = "www"; content = "192.0.2.1"; type = "A"; proxied = true +} +resource "cloudflare_dns_record" "mx" { + for_each = { "10" = "mail1.example.com", "20" = "mail2.example.com" } + zone_id = cloudflare_zone.example.id; name = "@"; content = each.value; type = "MX"; priority = each.key +} +``` + +## Workers + +### Simple Pattern (Legacy - Still Works) + +```hcl +resource "cloudflare_workers_script" "api" { + account_id = var.account_id; name = "api-worker"; content = file("worker.js") + module = true; compatibility_date = "2025-01-01" + kv_namespace_binding { name = "KV"; namespace_id = cloudflare_workers_kv_namespace.cache.id } + r2_bucket_binding { name = "BUCKET"; bucket_name = cloudflare_r2_bucket.assets.name } + d1_database_binding { name = "DB"; database_id = cloudflare_d1_database.app.id } + secret_text_binding { name = "SECRET"; text = var.secret } +} +``` + +### Gradual Rollouts (Recommended for Production) + +```hcl +resource "cloudflare_worker" "api" { account_id = var.account_id; name = "api-worker" } +resource "cloudflare_worker_version" "api_v1" { + account_id = var.account_id; worker_name = cloudflare_worker.api.name + content = file("worker.js"); content_sha256 = filesha256("worker.js") + compatibility_date = "2025-01-01" + bindings { + kv_namespace { name = "KV"; namespace_id = cloudflare_workers_kv_namespace.cache.id } + r2_bucket { name = "BUCKET"; bucket_name = cloudflare_r2_bucket.assets.name } + } +} +resource "cloudflare_workers_deployment" "api" { + account_id = 
var.account_id; worker_name = cloudflare_worker.api.name + versions { version_id = cloudflare_worker_version.api_v1.id; percentage = 100 } +} +``` + +### Worker Binding Types (v5) + +| Binding | Attribute | Example | +|---------|-----------|---------| +| KV | `kv_namespace_binding` | `{ name = "KV", namespace_id = "..." }` | +| R2 | `r2_bucket_binding` | `{ name = "BUCKET", bucket_name = "..." }` | +| D1 | `d1_database_binding` | `{ name = "DB", database_id = "..." }` | +| Service | `service_binding` | `{ name = "AUTH", service = "auth-worker" }` | +| Secret | `secret_text_binding` | `{ name = "API_KEY", text = "..." }` | +| Queue | `queue_binding` | `{ name = "QUEUE", queue_name = "..." }` | +| Vectorize | `vectorize_binding` | `{ name = "INDEX", index_name = "..." }` | +| Hyperdrive | `hyperdrive_binding` | `{ name = "DB", id = "..." }` | +| AI | `ai_binding` | `{ name = "AI" }` | +| Browser | `browser_binding` | `{ name = "BROWSER" }` | +| Analytics | `analytics_engine_binding` | `{ name = "ANALYTICS", dataset = "..." }` | +| mTLS | `mtls_certificate_binding` | `{ name = "CERT", certificate_id = "..." 
}` | + +### Routes & Triggers + +```hcl +resource "cloudflare_worker_route" "api" { + zone_id = cloudflare_zone.example.id; pattern = "api.example.com/*" + script_name = cloudflare_workers_script.api.name +} +resource "cloudflare_worker_cron_trigger" "task" { + account_id = var.account_id; script_name = cloudflare_workers_script.api.name + schedules = ["*/5 * * * *"] +} +``` + +## Storage (KV, R2, D1) + +```hcl +# KV +resource "cloudflare_workers_kv_namespace" "cache" { account_id = var.account_id; title = "cache" } +resource "cloudflare_workers_kv" "config" { + account_id = var.account_id; namespace_id = cloudflare_workers_kv_namespace.cache.id + key_name = "config"; value = jsonencode({ version = "1.0" }) +} + +# R2 +resource "cloudflare_r2_bucket" "assets" { account_id = var.account_id; name = "assets"; location = "WNAM" } + +# D1 (migrations via wrangler) & Queues +resource "cloudflare_d1_database" "app" { account_id = var.account_id; name = "app-db" } +resource "cloudflare_queue" "events" { account_id = var.account_id; name = "events-queue" } +``` + +## Pages + +```hcl +resource "cloudflare_pages_project" "site" { + account_id = var.account_id; name = "site"; production_branch = "main" + deployment_configs { + production { + compatibility_date = "2025-01-01" + environment_variables = { NODE_ENV = "production" } + kv_namespaces = { KV = cloudflare_workers_kv_namespace.cache.id } + d1_databases = { DB = cloudflare_d1_database.app.id } + } + } + build_config { build_command = "npm run build"; destination_dir = "dist" } + source { type = "github"; config { owner = "org"; repo_name = "site"; production_branch = "main" }} +} + +resource "cloudflare_pages_domain" "custom" { + account_id = var.account_id; project_name = cloudflare_pages_project.site.name; domain = "site.example.com" +} +``` + +## Rulesets (WAF, Redirects, Cache) + +```hcl +# WAF +resource "cloudflare_ruleset" "waf" { + zone_id = cloudflare_zone.example.id; name = "WAF"; kind = "zone"; phase = 
"http_request_firewall_custom" + rules { action = "block"; enabled = true; expression = "(cf.client.bot) and not (cf.verified_bot)" } +} + +# Redirects +resource "cloudflare_ruleset" "redirects" { + zone_id = cloudflare_zone.example.id; name = "Redirects"; kind = "zone"; phase = "http_request_dynamic_redirect" + rules { + action = "redirect"; enabled = true; expression = "(http.request.uri.path eq \"/old\")" + action_parameters { from_value { status_code = 301; target_url { value = "https://example.com/new" }}} + } +} + +# Cache rules +resource "cloudflare_ruleset" "cache" { + zone_id = cloudflare_zone.example.id; name = "Cache"; kind = "zone"; phase = "http_request_cache_settings" + rules { + action = "set_cache_settings"; enabled = true; expression = "(http.request.uri.path matches \"\\.(jpg|png|css|js)$\")" + action_parameters { cache = true; edge_ttl { mode = "override_origin"; default = 86400 }} + } +} +``` + +## Load Balancers + +```hcl +resource "cloudflare_load_balancer_monitor" "http" { + account_id = var.account_id; type = "http"; path = "/health"; interval = 60; timeout = 5 +} +resource "cloudflare_load_balancer_pool" "api" { + account_id = var.account_id; name = "api-pool"; monitor = cloudflare_load_balancer_monitor.http.id + origins { name = "api-1"; address = "192.0.2.1" } + origins { name = "api-2"; address = "192.0.2.2" } +} +resource "cloudflare_load_balancer" "api" { + zone_id = cloudflare_zone.example.id; name = "api.example.com" + default_pool_ids = [cloudflare_load_balancer_pool.api.id]; steering_policy = "geo" +} +``` + +## Access (Zero Trust) + +```hcl +resource "cloudflare_access_application" "admin" { + account_id = var.account_id; name = "Admin"; domain = "admin.example.com"; type = "self_hosted" + session_duration = "24h"; allowed_idps = [cloudflare_access_identity_provider.github.id] +} +resource "cloudflare_access_policy" "allow" { + account_id = var.account_id; application_id = cloudflare_access_application.admin.id + name = "Allow"; 
decision = "allow"; precedence = 1 + include { email = ["admin@example.com"] } +} +resource "cloudflare_access_identity_provider" "github" { + account_id = var.account_id; name = "GitHub"; type = "github" + config { client_id = var.github_id; client_secret = var.github_secret } +} +``` + +## See Also + +- [README](./README.md) - Provider setup +- [API](./api.md) - Data sources +- [Patterns](./patterns.md) - Use cases +- [Troubleshooting](./gotchas.md) - Issues diff --git a/cloudflare/references/terraform/gotchas.md b/cloudflare/references/terraform/gotchas.md new file mode 100644 index 0000000..eb4731d --- /dev/null +++ b/cloudflare/references/terraform/gotchas.md @@ -0,0 +1,150 @@ +# Terraform Troubleshooting & Best Practices + +Common issues, security considerations, and best practices. + +## State Drift Issues + +Some resources have known state drift. Add lifecycle blocks to prevent perpetual diffs: + +| Resource | Drift Attributes | Workaround | +|----------|------------------|------------| +| `cloudflare_pages_project` | `deployment_configs.*` | `ignore_changes = [deployment_configs]` | +| `cloudflare_workers_script` | secrets returned as REDACTED | `ignore_changes = [secret_text_binding]` | +| `cloudflare_load_balancer` | `adaptive_routing`, `random_steering` | `ignore_changes = [adaptive_routing, random_steering]` | +| `cloudflare_workers_kv` | special chars in keys (< 5.16.0) | Upgrade to 5.16.0+ | + +```hcl +# Example: Ignore secret drift +resource "cloudflare_workers_script" "api" { + account_id = var.account_id + name = "api-worker" + content = file("worker.js") + secret_text_binding { name = "API_KEY"; text = var.api_key } + + lifecycle { + ignore_changes = [secret_text_binding] + } +} +``` + +## v5 Breaking Changes + +Provider v5 is current (auto-generated from OpenAPI). 
v4→v5 has breaking changes: + +**Resource Renames:** + +| v4 Resource | v5 Resource | Notes | +|-------------|-------------|-------| +| `cloudflare_record` | `cloudflare_dns_record` | | +| `cloudflare_worker_script` | `cloudflare_workers_script` | Note: plural | +| `cloudflare_worker_*` | `cloudflare_workers_*` | All worker resources | +| `cloudflare_access_*` | `cloudflare_zero_trust_*` | Access → Zero Trust | + +**Attribute Changes:** + +| v4 Attribute | v5 Attribute | Resources | +|--------------|--------------|-----------| +| `zone` | `name` | zone | +| `account_id` | `account.id` | zone (object syntax) | +| `key` | `key_name` | KV | +| `location_hint` | `location` | R2 | + +**State Migration:** + +```bash +# Rename resources in state after v5 upgrade +terraform state mv cloudflare_record.example cloudflare_dns_record.example +terraform state mv cloudflare_worker_script.api cloudflare_workers_script.api +``` + +## Resource-Specific Gotchas + +### R2 Location Case Sensitivity + +**Problem:** Terraform creates R2 bucket but fails on subsequent applies +**Cause:** Location must be UPPERCASE +**Solution:** Use `WNAM`, `ENAM`, `WEUR`, `EEUR`, `APAC` (not `wnam`, `enam`, etc.) 
+ +```hcl +resource "cloudflare_r2_bucket" "assets" { + account_id = var.account_id + name = "assets" + location = "WNAM" # UPPERCASE required +} +``` + +### KV Special Characters (< 5.16.0) + +**Problem:** Keys with `+`, `#`, `%` cause encoding issues +**Cause:** URL encoding bug in provider < 5.16.0 +**Solution:** Upgrade to 5.16.0+ or avoid special chars in keys + +### D1 Migrations + +**Problem:** Terraform creates database but schema is empty +**Cause:** Terraform only creates D1 resource, not schema +**Solution:** Run migrations via wrangler after Terraform apply + +```bash +# After terraform apply +wrangler d1 migrations apply +``` + +### Worker Script Size Limit + +**Problem:** Worker deployment fails with "script too large" +**Cause:** Worker script + dependencies exceed 10 MB limit +**Solution:** Use code splitting, external dependencies, or minification + +### Pages Project Drift + +**Problem:** Pages project shows perpetual diff on `deployment_configs` +**Cause:** Cloudflare API adds default values not in Terraform state +**Solution:** Add lifecycle ignore block (see State Drift table above) + +## Common Errors + +### "Error: couldn't find resource" + +**Cause:** Resource was deleted outside Terraform +**Solution:** Import resource back into state with `terraform import cloudflare_zone.example ` or remove from state with `terraform state rm cloudflare_zone.example` + +### "409 Conflict on worker deployment" + +**Cause:** Worker being deployed by both Terraform and wrangler simultaneously +**Solution:** Choose one deployment method; if using Terraform, remove wrangler deployments + +### "DNS record already exists" + +**Cause:** Existing DNS record not imported into Terraform state +**Solution:** Find record ID in Cloudflare dashboard and import with `terraform import cloudflare_dns_record.example /` + +### "Invalid provider configuration" + +**Cause:** API token missing, invalid, or lacking required permissions +**Solution:** Set `CLOUDFLARE_API_TOKEN` 
environment variable or check token permissions in dashboard + +### "State locking errors" + +**Cause:** Multiple concurrent Terraform runs or stale lock from crashed process +**Solution:** Remove stale lock with `terraform force-unlock ` (use with caution) + +## Limits + +| Resource | Limit | Notes | +|----------|-------|-------| +| API token rate limit | Varies by plan | Use `api_client_logging = true` to debug +| Worker script size | 10 MB | Includes all dependencies +| KV keys per namespace | Unlimited | Pay per operation +| R2 storage | Unlimited | Pay per GB +| D1 databases | 50,000 per account | Free tier: 10 +| Pages projects | 500 per account | 100 for free accounts +| DNS records | 3,500 per zone | Free plan + +## See Also + +- [README](./README.md) - Provider setup +- [Configuration](./configuration.md) - Resources +- [API](./api.md) - Data sources +- [Patterns](./patterns.md) - Use cases +- Provider docs: https://registry.terraform.io/providers/cloudflare/cloudflare/latest/docs diff --git a/cloudflare/references/terraform/patterns.md b/cloudflare/references/terraform/patterns.md new file mode 100644 index 0000000..aea3a96 --- /dev/null +++ b/cloudflare/references/terraform/patterns.md @@ -0,0 +1,174 @@ +# Terraform Patterns & Use Cases + +Architecture patterns, multi-environment setups, and real-world use cases. + +## Recommended Directory Structure + +``` +terraform/ +├── environments/ +│ ├── production/ +│ │ ├── main.tf +│ │ └── terraform.tfvars +│ └── staging/ +│ ├── main.tf +│ └── terraform.tfvars +├── modules/ +│ ├── zone/ +│ ├── worker/ +│ └── dns/ +└── shared/ # Shared resources across envs + └── main.tf +``` + +**Note:** Cloudflare recommends avoiding modules for provider resources due to v5 auto-generation complexity. Prefer environment directories + shared state instead. 
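As a sketch of the environment-directories-plus-shared-state approach recommended above: each environment directory can read shared outputs via the `terraform_remote_state` data source instead of wrapping provider resources in modules. The bucket/key names and the `zone_id` output are illustrative assumptions, not fixed conventions:

```hcl
# environments/staging/main.tf — minimal sketch; backend details and the
# "zone_id" output name are assumptions for illustration.
data "terraform_remote_state" "shared" {
  backend = "s3"
  config = {
    bucket = "terraform-state"
    key    = "shared.tfstate"
    region = "auto"
  }
}

resource "cloudflare_dns_record" "staging" {
  zone_id = data.terraform_remote_state.shared.outputs.zone_id
  name    = "staging"
  type    = "CNAME"
  content = "staging.pages.dev"
  proxied = true
}
```

Shared resources (zones, accounts) live in `shared/`, and each environment consumes their IDs read-only, so a staging apply can never mutate production-owned objects.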
+ +## Multi-Environment Setup + +```hcl +# Directory: environments/{production,staging}/main.tf + modules/{zone,worker,pages} +module "zone" { + source = "../../modules/zone"; account_id = var.account_id; zone_name = "example.com"; environment = "production" +} +module "api_worker" { + source = "../../modules/worker"; account_id = var.account_id; zone_id = module.zone.zone_id + name = "api-worker-prod"; script = file("../../workers/api.js"); environment = "production" +} +``` + +## R2 State Backend + +```hcl +terraform { + backend "s3" { + bucket = "terraform-state" + key = "cloudflare.tfstate" + region = "auto" + endpoints = { s3 = "https://.r2.cloudflarestorage.com" } + skip_credentials_validation = true + skip_region_validation = true + skip_requesting_account_id = true + skip_metadata_api_check = true + skip_s3_checksum = true + } +} +``` + +## Worker with All Bindings + +```hcl +locals { worker_name = "full-stack-worker" } +resource "cloudflare_workers_kv_namespace" "app" { account_id = var.account_id; title = "${local.worker_name}-kv" } +resource "cloudflare_r2_bucket" "app" { account_id = var.account_id; name = "${local.worker_name}-bucket" } +resource "cloudflare_d1_database" "app" { account_id = var.account_id; name = "${local.worker_name}-db" } + +resource "cloudflare_worker_script" "app" { + account_id = var.account_id; name = local.worker_name; content = file("worker.js"); module = true + compatibility_date = "2025-01-01" + kv_namespace_binding { name = "KV"; namespace_id = cloudflare_workers_kv_namespace.app.id } + r2_bucket_binding { name = "BUCKET"; bucket_name = cloudflare_r2_bucket.app.name } + d1_database_binding { name = "DB"; database_id = cloudflare_d1_database.app.id } + secret_text_binding { name = "API_KEY"; text = var.api_key } +} +``` + +## Wrangler Integration + +**CRITICAL**: Wrangler and Terraform must NOT manage same resources. 
+ +**Terraform**: Zones, DNS, security rules, Access, load balancers, worker deployments (CI/CD), KV/R2/D1 resource creation +**Wrangler**: Local dev (`wrangler dev`), manual deploys, D1 migrations, KV bulk ops, log streaming (`wrangler tail`) + +### CI/CD Pattern + +```hcl +# Terraform creates infrastructure +resource "cloudflare_workers_kv_namespace" "app" { account_id = var.account_id; title = "app-kv" } +resource "cloudflare_d1_database" "app" { account_id = var.account_id; name = "app-db" } +output "kv_namespace_id" { value = cloudflare_workers_kv_namespace.app.id } +output "d1_database_id" { value = cloudflare_d1_database.app.id } +``` + +```yaml +# GitHub Actions: terraform apply → envsubst wrangler.jsonc.template → wrangler deploy +- run: terraform apply -auto-approve +- run: | + export KV_NAMESPACE_ID=$(terraform output -raw kv_namespace_id) + envsubst < wrangler.jsonc.template > wrangler.jsonc +- run: wrangler deploy +``` + +## Use Cases + +### Static Site + API Worker + +```hcl +resource "cloudflare_pages_project" "frontend" { + account_id = var.account_id; name = "frontend"; production_branch = "main" + build_config { build_command = "npm run build"; destination_dir = "dist" } +} +resource "cloudflare_worker_script" "api" { + account_id = var.account_id; name = "api"; content = file("api-worker.js") + d1_database_binding { name = "DB"; database_id = cloudflare_d1_database.api_db.id } +} +resource "cloudflare_dns_record" "frontend" { + zone_id = cloudflare_zone.main.id; name = "app"; content = cloudflare_pages_project.frontend.subdomain; type = "CNAME"; proxied = true +} +resource "cloudflare_worker_route" "api" { + zone_id = cloudflare_zone.main.id; pattern = "api.example.com/*"; script_name = cloudflare_worker_script.api.name +} +``` + +### Multi-Region Load Balancing + +```hcl +resource "cloudflare_load_balancer_pool" "us" { + account_id = var.account_id; name = "us-pool"; monitor = cloudflare_load_balancer_monitor.http.id + origins { name = 
"us-east"; address = var.us_east_ip } +} +resource "cloudflare_load_balancer_pool" "eu" { + account_id = var.account_id; name = "eu-pool"; monitor = cloudflare_load_balancer_monitor.http.id + origins { name = "eu-west"; address = var.eu_west_ip } +} +resource "cloudflare_load_balancer" "global" { + zone_id = cloudflare_zone.main.id; name = "api.example.com"; steering_policy = "geo" + default_pool_ids = [cloudflare_load_balancer_pool.us.id] + region_pools { region = "WNAM"; pool_ids = [cloudflare_load_balancer_pool.us.id] } + region_pools { region = "WEU"; pool_ids = [cloudflare_load_balancer_pool.eu.id] } +} +``` + +### Secure Admin with Access + +```hcl +resource "cloudflare_pages_project" "admin" { account_id = var.account_id; name = "admin"; production_branch = "main" } +resource "cloudflare_access_application" "admin" { + account_id = var.account_id; name = "Admin"; domain = "admin.example.com"; type = "self_hosted"; session_duration = "24h" + allowed_idps = [cloudflare_access_identity_provider.google.id] +} +resource "cloudflare_access_policy" "allow" { + account_id = var.account_id; application_id = cloudflare_access_application.admin.id + name = "Allow admins"; decision = "allow"; precedence = 1; include { email = var.admin_emails } +} +``` + +### Reusable Module + +```hcl +# modules/cloudflare-zone/main.tf +variable "account_id" { type = string }; variable "domain" { type = string }; variable "ssl_mode" { default = "strict" } +resource "cloudflare_zone" "main" { account = { id = var.account_id }; name = var.domain } +resource "cloudflare_zone_settings_override" "main" { + zone_id = cloudflare_zone.main.id; settings { ssl = var.ssl_mode; always_use_https = "on" } +} +output "zone_id" { value = cloudflare_zone.main.id } + +# Usage: module "prod" { source = "./modules/cloudflare-zone"; account_id = var.account_id; domain = "example.com" } +``` + +## See Also + +- [README](./README.md) - Provider setup +- [Configuration Reference](./configuration.md) - All 
resource types +- [API Reference](./api.md) - Data sources +- [Troubleshooting](./gotchas.md) - Best practices, common issues diff --git a/cloudflare/references/tunnel/README.md b/cloudflare/references/tunnel/README.md new file mode 100644 index 0000000..f70a668 --- /dev/null +++ b/cloudflare/references/tunnel/README.md @@ -0,0 +1,129 @@ +# Cloudflare Tunnel + +Secure outbound-only connections between infrastructure and Cloudflare's global network. + +## Overview + +Cloudflare Tunnel (formerly Argo Tunnel) enables: +- **Outbound-only connections** - No inbound ports or firewall changes +- **Public hostname routing** - Expose local services to internet +- **Private network access** - Connect internal networks via WARP +- **Zero Trust integration** - Built-in access policies + +**Architecture**: Tunnel (persistent object) → Replica (`cloudflared` process) → Origin services + +**Terminology:** +- **Tunnel**: Named persistent object with UUID +- **Replica**: Individual `cloudflared` process connected to tunnel +- **Config Source**: Where ingress rules stored (local file vs Cloudflare dashboard) +- **Connector**: Legacy term for replica + +## Quick Start + +### Local Config +```bash +# Install cloudflared +brew install cloudflared # macOS + +# Authenticate +cloudflared tunnel login + +# Create tunnel +cloudflared tunnel create my-tunnel + +# Route DNS +cloudflared tunnel route dns my-tunnel app.example.com + +# Run tunnel +cloudflared tunnel run my-tunnel +``` + +### Dashboard Config (Recommended) +1. **Zero Trust** > **Networks** > **Tunnels** > **Create** +2. Name tunnel, copy token +3. Configure routes in dashboard +4. Run: `cloudflared tunnel --no-autoupdate run --token ` + +## Decision Tree + +**Choose config source:** +``` +Need centralized config updates? +├─ Yes → Token-based (dashboard config) +└─ No → Local config file + +Multiple environments (dev/staging/prod)? +├─ Yes → Local config (version controlled) +└─ No → Either works + +Need firewall approval? 
+└─ See networking.md first +``` + +## Core Commands + +```bash +# Tunnel lifecycle +cloudflared tunnel create +cloudflared tunnel list +cloudflared tunnel info +cloudflared tunnel delete + +# DNS routing +cloudflared tunnel route dns +cloudflared tunnel route list + +# Private network +cloudflared tunnel route ip add 10.0.0.0/8 + +# Run tunnel +cloudflared tunnel run +``` + +## Configuration Example + +```yaml +# ~/.cloudflared/config.yml +tunnel: 6ff42ae2-765d-4adf-8112-31c55c1551ef +credentials-file: /root/.cloudflared/6ff42ae2-765d-4adf-8112-31c55c1551ef.json + +ingress: + - hostname: app.example.com + service: http://localhost:8000 + - hostname: api.example.com + service: https://localhost:8443 + originRequest: + noTLSVerify: true + - service: http_status:404 +``` + +## Reading Order + +**New to Cloudflare Tunnel:** +1. This README (overview, quick start) +2. [networking.md](./networking.md) - Firewall rules, connectivity pre-checks +3. [configuration.md](./configuration.md) - Config file options, ingress rules +4. [patterns.md](./patterns.md) - Docker, Kubernetes, production deployment +5. [gotchas.md](./gotchas.md) - Troubleshooting, best practices + +**Enterprise deployment:** +1. [networking.md](./networking.md) - Corporate firewall requirements +2. [gotchas.md](./gotchas.md) - HA setup, security best practices +3. [patterns.md](./patterns.md) - Kubernetes, rolling updates + +**Programmatic control:** +1. 
[api.md](./api.md) - REST API, TypeScript SDK + +## In This Reference + +- [networking.md](./networking.md) - Firewall rules, ports, connectivity pre-checks +- [configuration.md](./configuration.md) - Config file options, ingress rules, TLS settings +- [api.md](./api.md) - REST API, TypeScript SDK, token-based tunnels +- [patterns.md](./patterns.md) - Docker, Kubernetes, Terraform, HA, use cases +- [gotchas.md](./gotchas.md) - Troubleshooting, limitations, best practices + +## See Also + +- [workers](../workers/) - Workers with Tunnel integration +- [access](../access/) - Zero Trust access policies +- [warp](../warp/) - WARP client for private networks diff --git a/cloudflare/references/tunnel/api.md b/cloudflare/references/tunnel/api.md new file mode 100644 index 0000000..faa3013 --- /dev/null +++ b/cloudflare/references/tunnel/api.md @@ -0,0 +1,193 @@ +# Tunnel API + +## Cloudflare API Access + +**Base URL**: `https://api.cloudflare.com/client/v4` + +**Authentication**: +```bash +Authorization: Bearer ${CF_API_TOKEN} +``` + +## TypeScript SDK + +Install: `npm install cloudflare` + +```typescript +import Cloudflare from 'cloudflare'; + +const cf = new Cloudflare({ + apiToken: process.env.CF_API_TOKEN, +}); + +const accountId = process.env.CF_ACCOUNT_ID; +``` + +## Create Tunnel + +### cURL +```bash +curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" \ + -H "Content-Type: application/json" \ + --data '{ + "name": "my-tunnel", + "tunnel_secret": "" + }' +``` + +### TypeScript +```typescript +const tunnel = await cf.zeroTrust.tunnels.create({ + account_id: accountId, + name: 'my-tunnel', + tunnel_secret: Buffer.from(crypto.randomBytes(32)).toString('base64'), +}); + +console.log(`Tunnel ID: ${tunnel.id}`); +``` + +## List Tunnels + +### cURL +```bash +curl -X GET "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" +``` + +### 
TypeScript +```typescript +const tunnels = await cf.zeroTrust.tunnels.list({ + account_id: accountId, +}); + +for (const tunnel of tunnels.result) { + console.log(`${tunnel.name}: ${tunnel.id}`); +} +``` + +## Get Tunnel Info + +### cURL +```bash +curl -X GET "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels/{tunnel_id}" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" +``` + +### TypeScript +```typescript +const tunnel = await cf.zeroTrust.tunnels.get(tunnelId, { + account_id: accountId, +}); + +console.log(`Status: ${tunnel.status}`); +console.log(`Connections: ${tunnel.connections?.length || 0}`); +``` + +## Update Tunnel Config + +### cURL +```bash +curl -X PUT "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels/{tunnel_id}/configurations" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" \ + -H "Content-Type: application/json" \ + --data '{ + "config": { + "ingress": [ + {"hostname": "app.example.com", "service": "http://localhost:8000"}, + {"service": "http_status:404"} + ] + } + }' +``` + +### TypeScript +```typescript +const config = await cf.zeroTrust.tunnels.configurations.update( + tunnelId, + { + account_id: accountId, + config: { + ingress: [ + { hostname: 'app.example.com', service: 'http://localhost:8000' }, + { service: 'http_status:404' }, + ], + }, + } +); +``` + +## Delete Tunnel + +### cURL +```bash +curl -X DELETE "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels/{tunnel_id}" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" +``` + +### TypeScript +```typescript +await cf.zeroTrust.tunnels.delete(tunnelId, { + account_id: accountId, +}); +``` + +## Token-Based Tunnels (Config Source: Cloudflare) + +Token-based tunnels store config in Cloudflare dashboard instead of local files. + +### Via Dashboard +1. **Zero Trust** > **Networks** > **Tunnels** +2. **Create a tunnel** > **Cloudflared** +3. Configure routes in dashboard +4. Copy token +5. 
Run on origin: +```bash +cloudflared service install +``` + +### Via Token +```bash +# Run with token (no config file needed) +cloudflared tunnel --no-autoupdate run --token ${TUNNEL_TOKEN} + +# Docker +docker run cloudflare/cloudflared:latest tunnel --no-autoupdate run --token ${TUNNEL_TOKEN} +``` + +### Get Tunnel Token (TypeScript) +```typescript +// Get tunnel to retrieve token +const tunnel = await cf.zeroTrust.tunnels.get(tunnelId, { + account_id: accountId, +}); + +// Token available in tunnel.token (only for config source: cloudflare) +const token = tunnel.token; +``` + +## DNS Routes API + +```bash +# Create DNS route +curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels/{tunnel_id}/connections" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" \ + --data '{"hostname": "app.example.com"}' + +# Delete route +curl -X DELETE "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels/{tunnel_id}/connections/{route_id}" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" +``` + +## Private Network Routes API + +```bash +# Add IP route +curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels/{tunnel_id}/routes" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" \ + --data '{"ip_network": "10.0.0.0/8"}' + +# List IP routes +curl -X GET "https://api.cloudflare.com/client/v4/accounts/{account_id}/tunnels/{tunnel_id}/routes" \ + -H "Authorization: Bearer ${CF_API_TOKEN}" +``` diff --git a/cloudflare/references/tunnel/configuration.md b/cloudflare/references/tunnel/configuration.md new file mode 100644 index 0000000..32b7050 --- /dev/null +++ b/cloudflare/references/tunnel/configuration.md @@ -0,0 +1,157 @@ +# Tunnel Configuration + +## Config Source + +Tunnels use one of two config sources: + +| Config Source | Storage | Updates | Use Case | +|---------------|---------|---------|----------| +| Local | `config.yml` file | Edit file, restart | Dev, multi-env, version control | +| Cloudflare | Dashboard/API | Instant, 
no restart | Production, centralized management | + +**Token-based tunnels** = config source: Cloudflare +**Locally-managed tunnels** = config source: local + +## Config File Location + +``` +~/.cloudflared/config.yml # User config +/etc/cloudflared/config.yml # System-wide (Linux) +``` + +## Basic Structure + +```yaml +tunnel: +credentials-file: /path/to/.json + +ingress: + - hostname: app.example.com + service: http://localhost:8000 + - service: http_status:404 # Required catch-all +``` + +## Ingress Rules + +Rules evaluated **top to bottom**, first match wins. + +```yaml +ingress: + # Exact hostname + path regex + - hostname: static.example.com + path: \.(jpg|png|css|js)$ + service: https://localhost:8001 + + # Wildcard hostname + - hostname: "*.example.com" + service: https://localhost:8002 + + # Path only (all hostnames) + - path: /api/.* + service: http://localhost:9000 + + # Catch-all (required) + - service: http_status:404 +``` + +**Validation**: +```bash +cloudflared tunnel ingress validate +cloudflared tunnel ingress rule https://foo.example.com +``` + +## Service Types + +| Protocol | Format | Client Requirement | +|----------|--------|-------------------| +| HTTP | `http://localhost:8000` | Browser | +| HTTPS | `https://localhost:8443` | Browser | +| TCP | `tcp://localhost:2222` | `cloudflared access tcp` | +| SSH | `ssh://localhost:22` | `cloudflared access ssh` | +| RDP | `rdp://localhost:3389` | `cloudflared access rdp` | +| Unix | `unix:/path/to/socket` | Browser | +| Test | `hello_world` | Browser | + +## Origin Configuration + +### Connection Settings +```yaml +originRequest: + connectTimeout: 30s + tlsTimeout: 10s + tcpKeepAlive: 30s + keepAliveTimeout: 90s + keepAliveConnections: 100 +``` + +### TLS Settings +```yaml +originRequest: + noTLSVerify: true # Disable cert verification + originServerName: "app.internal" # Override SNI + caPool: /path/to/ca.pem # Custom CA +``` + +### HTTP Settings +```yaml +originRequest: + disableChunkedEncoding: 
true + httpHostHeader: "app.internal" + http2Origin: true +``` + +## Private Network Mode + +```yaml +tunnel: +credentials-file: /path/to/creds.json + +warp-routing: + enabled: true +``` + +```bash +cloudflared tunnel route ip add 10.0.0.0/8 my-tunnel +cloudflared tunnel route ip add 192.168.1.100/32 my-tunnel +``` + +## Config Source Comparison + +### Local Config +```yaml +# config.yml +tunnel: +credentials-file: /path/to/.json + +ingress: + - hostname: app.example.com + service: http://localhost:8000 + - service: http_status:404 +``` + +```bash +cloudflared tunnel run my-tunnel +``` + +**Pros:** Version control, multi-environment, offline edits +**Cons:** Requires file distribution, manual restarts + +### Cloudflare Config (Token-Based) +```bash +# No config file needed +cloudflared tunnel --no-autoupdate run --token +``` + +Configure routes in dashboard: **Zero Trust** > **Networks** > **Tunnels** > [Tunnel] > **Public Hostname** + +**Pros:** Centralized updates, no file management, instant route changes +**Cons:** Requires dashboard/API access, less portable + +## Environment Variables + +```bash +TUNNEL_TOKEN= # Token for config source: cloudflare +TUNNEL_ORIGIN_CERT=/path/to/cert.pem # Override cert path (local config) +NO_AUTOUPDATE=true # Disable auto-updates +TUNNEL_LOGLEVEL=debug # Log level +``` diff --git a/cloudflare/references/tunnel/gotchas.md b/cloudflare/references/tunnel/gotchas.md new file mode 100644 index 0000000..f368856 --- /dev/null +++ b/cloudflare/references/tunnel/gotchas.md @@ -0,0 +1,147 @@ +# Tunnel Gotchas + +## Common Errors + +### "Error 1016 (Origin DNS Error)" + +**Cause:** Tunnel not running or not connected +**Solution:** +```bash +cloudflared tunnel info my-tunnel # Check status +ps aux | grep cloudflared # Verify running +journalctl -u cloudflared -n 100 # Check logs +``` + +### "Self-signed certificate rejected" + +**Cause:** Origin using self-signed certificate +**Solution:** +```yaml +originRequest: + noTLSVerify: true # 
Dev only + caPool: /path/to/ca.pem # Custom CA +``` + +### "Connection timeout" + +**Cause:** Origin slow to respond or timeout settings too low +**Solution:** +```yaml +originRequest: + connectTimeout: 60s + tlsTimeout: 20s + keepAliveTimeout: 120s +``` + +### "Tunnel not starting" + +**Cause:** Invalid config, missing credentials, or tunnel doesn't exist +**Solution:** +```bash +cloudflared tunnel ingress validate # Validate config +ls -la ~/.cloudflared/*.json # Verify credentials +cloudflared tunnel list # Verify tunnel exists +``` + +### "Connection already registered" + +**Cause:** Multiple replicas with same connector ID or stale connection +**Solution:** +```bash +# Check active connections +cloudflared tunnel info my-tunnel + +# Wait 60s for stale connection cleanup, or restart with new connector ID +cloudflared tunnel run my-tunnel +``` + +### "Tunnel credentials rotated but connections fail" + +**Cause:** Old cloudflared processes using expired credentials +**Solution:** +```bash +# Stop all cloudflared processes +pkill cloudflared + +# Verify stopped +ps aux | grep cloudflared + +# Restart with new credentials +cloudflared tunnel run my-tunnel +``` + +## Limits + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| Free tier | Unlimited tunnels | Unlimited traffic | +| Tunnel replicas | 1000 per tunnel | Max concurrent | +| Connection duration | No hard limit | Hours to days | +| Long-lived connections | May drop during updates | WebSocket, SSH, UDP | +| Replica registration | ~5s TTL | Old replica dropped after 5s no heartbeat | +| Token rotation grace | 24 hours | Old tokens work during grace period | + +## Best Practices + +### Security +1. Use token-based tunnels (config source: cloudflare) for centralized control +2. Enable Access policies for sensitive services +3. Rotate tunnel credentials regularly +4. After rotation: stop all old cloudflared processes within 24h grace period +5. Verify TLS certs (`noTLSVerify: false`) +6. 
Restrict `bastion` service type + +### Performance +1. Run multiple replicas for HA (2-4 typical, load balanced automatically) +2. Replicas share same tunnel UUID, get unique connector IDs +3. Place `cloudflared` close to origin (same network) +4. Use HTTP/2 for gRPC (`http2Origin: true`) +5. Tune keepalive for long-lived connections +6. Monitor connection counts + +### Configuration +1. Use environment variables for secrets +2. Version control config files +3. Validate before deploying (`cloudflared tunnel ingress validate`) +4. Test rules (`cloudflared tunnel ingress rule `) +5. Document rule order (first match wins) + +### Operations +1. Monitor tunnel health in dashboard (shows active replicas) +2. Set up disconnect alerts (when replica count drops to 0) +3. Graceful shutdown for config updates +4. Update replicas in rolling fashion (update 1, wait, update next) +5. Keep `cloudflared` updated (1 year support window) +6. Use `--no-autoupdate` in prod; control updates manually + +## Debug Mode + +```bash +cloudflared tunnel --loglevel debug run my-tunnel +cloudflared tunnel ingress rule https://app.example.com +``` + +## Migration Strategies + +### From Ngrok +```yaml +# Ngrok: ngrok http 8000 +# Cloudflare Tunnel: +ingress: + - hostname: app.example.com + service: http://localhost:8000 + - service: http_status:404 +``` + +### From VPN +```yaml +# Replace VPN with private network routing +warp-routing: + enabled: true +``` + +```bash +cloudflared tunnel route ip add 10.0.0.0/8 my-tunnel +``` + +Users install WARP client instead of VPN. 
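To make the operations best practices above concrete (run under supervision, `--no-autoupdate`, restart on failure), here is a minimal systemd unit sketch for a token-based replica. The binary path and the environment-file location are assumptions; `cloudflared service install` generates an equivalent unit for you:

```ini
# /etc/systemd/system/cloudflared.service — illustrative sketch only.
# /etc/cloudflared/tunnel.env is assumed to contain TUNNEL_TOKEN=...
[Unit]
Description=Cloudflare Tunnel replica
After=network-online.target
Wants=network-online.target

[Service]
EnvironmentFile=/etc/cloudflared/tunnel.env
ExecStart=/usr/local/bin/cloudflared tunnel --no-autoupdate run --token ${TUNNEL_TOKEN}
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

For rolling updates across replicas, restart one unit at a time and wait for it to re-register (`cloudflared tunnel info`) before touching the next host.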
diff --git a/cloudflare/references/tunnel/networking.md b/cloudflare/references/tunnel/networking.md new file mode 100644 index 0000000..8c3930d --- /dev/null +++ b/cloudflare/references/tunnel/networking.md @@ -0,0 +1,168 @@ +# Tunnel Networking + +## Connectivity Requirements + +### Outbound Ports + +Cloudflared requires outbound access on: + +| Port | Protocol | Purpose | Required | +|------|----------|---------|----------| +| 7844 | TCP/UDP | Primary tunnel protocol (QUIC) | Yes | +| 443 | TCP | Fallback (HTTP/2) | Yes | + +**Network path:** +``` +cloudflared → edge.argotunnel.com:7844 (preferred) +cloudflared → region.argotunnel.com:443 (fallback) +``` + +### Firewall Rules + +#### Minimal (Production) +```bash +# Outbound only +ALLOW tcp/udp 7844 to *.argotunnel.com +ALLOW tcp 443 to *.argotunnel.com +``` + +#### Full (Recommended) +```bash +# Tunnel connectivity +ALLOW tcp/udp 7844 to *.argotunnel.com +ALLOW tcp 443 to *.argotunnel.com + +# API access (for token-based tunnels) +ALLOW tcp 443 to api.cloudflare.com + +# Updates (optional) +ALLOW tcp 443 to github.com +ALLOW tcp 443 to objects.githubusercontent.com +``` + +### IP Ranges + +Cloudflare Anycast IPs (tunnel endpoints): +``` +# IPv4 +198.41.192.0/24 +198.41.200.0/24 + +# IPv6 +2606:4700::/32 +``` + +**Note:** Use DNS resolution for `*.argotunnel.com` rather than hardcoding IPs. Cloudflare may add edge locations. 
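As a concrete illustration of the minimal rule set above, an nftables sketch for an egress-filtered origin host. The table and chain names are arbitrary, a default-drop output policy is assumed, and DNS is kept open so `*.argotunnel.com` resolves to current edge IPs:

```nft
# Illustrative egress rules only — not an official Cloudflare ruleset.
table inet cloudflared_egress {
  chain output {
    type filter hook output priority 0; policy drop;

    # Replies to established flows
    ct state established,related accept

    # DNS resolution for *.argotunnel.com
    udp dport 53 accept
    tcp dport 53 accept

    # Primary tunnel transport (QUIC) and its TCP variant
    udp dport 7844 accept
    tcp dport 7844 accept

    # HTTP/2 fallback, plus api.cloudflare.com for token-based tunnels
    tcp dport 443 accept
  }
}
```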
+ +## Pre-Flight Check + +Test connectivity before deploying: + +```bash +# Test DNS resolution +dig edge.argotunnel.com +short + +# Test port 7844 (QUIC/UDP) +nc -zvu edge.argotunnel.com 7844 + +# Test port 443 (HTTP/2 fallback) +nc -zv edge.argotunnel.com 443 + +# Test with cloudflared +cloudflared tunnel --loglevel debug run my-tunnel +# Look for "Registered tunnel connection" +``` + +### Common Connectivity Errors + +| Error | Cause | Solution | +|-------|-------|----------| +| "no such host" | DNS blocked | Allow port 53 UDP/TCP | +| "context deadline exceeded" | Port 7844 blocked | Allow UDP/TCP 7844 | +| "TLS handshake timeout" | Port 443 blocked | Allow TCP 443, disable SSL inspection | + +## Protocol Selection + +Cloudflared automatically selects protocol: + +| Protocol | Port | Priority | Use Case | +|----------|------|----------|----------| +| QUIC | 7844 UDP | 1st (preferred) | Low latency, best performance | +| HTTP/2 | 443 TCP | 2nd (fallback) | QUIC blocked by firewall | + +**Force HTTP/2 fallback:** +```bash +cloudflared tunnel --protocol http2 run my-tunnel +``` + +**Verify active protocol:** +```bash +cloudflared tunnel info my-tunnel +# Shows "connections" with protocol type +``` + +## Private Network Routing + +### WARP Client Requirements + +Users accessing private IPs via WARP need: + +```bash +# Outbound (WARP client) +ALLOW udp 500,4500 to 162.159.*.* (IPsec) +ALLOW udp 2408 to 162.159.*.* (WireGuard) +ALLOW tcp 443 to *.cloudflareclient.com +``` + +### Split Tunnel Configuration + +Route only private networks through tunnel: + +```yaml +# warp-routing config +warp-routing: + enabled: true +``` + +```bash +# Add specific routes +cloudflared tunnel route ip add 10.0.0.0/8 my-tunnel +cloudflared tunnel route ip add 172.16.0.0/12 my-tunnel +cloudflared tunnel route ip add 192.168.0.0/16 my-tunnel +``` + +WARP users can access these IPs without VPN. 
+ +## Network Diagnostics + +### Connection Diagnostics + +```bash +# Check edge selection and connection health +cloudflared tunnel info my-tunnel --output json | jq '.connections[]' + +# Enable metrics endpoint +cloudflared tunnel --metrics localhost:9090 run my-tunnel +curl localhost:9090/metrics | grep cloudflared_tunnel + +# Test latency +curl -w "time_total: %{time_total}\n" -o /dev/null https://myapp.example.com +``` + +## Corporate Network Considerations + +Cloudflared honors proxy environment variables (`HTTP_PROXY`, `HTTPS_PROXY`, `NO_PROXY`). + +If corporate proxy intercepts TLS, add corporate root CA to system trust store. + +## Bandwidth and Rate Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Request size | 100 MB | Single HTTP request | +| Upload speed | No hard limit | Governed by network/plan | +| Concurrent connections | 1000 per tunnel | Across all replicas | +| Requests per second | No limit | Subject to DDoS detection | + +**Large file transfers:** +Use R2 or Workers with chunked uploads instead of streaming through tunnel. 
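For the large-file path, one option is a small Worker that accepts uploads directly into R2, so the payload never traverses the tunnel. This is a sketch, not part of the tunnel docs: the `UPLOADS` binding name is an assumption, and the `R2Bucket` interface below is a simplified stand-in for the real Workers runtime type.

```typescript
// Simplified stand-in for the Workers R2 binding type (illustrative only).
interface R2Bucket {
  put(key: string, value: unknown): Promise<unknown>;
}

interface Env {
  UPLOADS: R2Bucket; // hypothetical R2 binding name
}

// Pure helper: derive an R2 object key from the request URL path.
export function objectKeyFromUrl(url: string): string {
  return new URL(url).pathname.replace(/^\/+/, "");
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "PUT") {
      return new Response("Use PUT", { status: 405 });
    }
    const key = objectKeyFromUrl(request.url);
    if (!key) {
      return new Response("Missing object key", { status: 400 });
    }
    // request.body is streamed into R2 rather than buffered in memory,
    // sidestepping the 100 MB per-request limit on the tunnel path.
    await env.UPLOADS.put(key, request.body);
    return new Response(`stored ${key}\n`, { status: 201 });
  },
};
```

Clients then upload to the Worker's route while interactive traffic continues through the tunnel.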
diff --git a/cloudflare/references/tunnel/patterns.md b/cloudflare/references/tunnel/patterns.md
new file mode 100644
index 0000000..7ef6d29
--- /dev/null
+++ b/cloudflare/references/tunnel/patterns.md
@@ -0,0 +1,192 @@
+# Tunnel Patterns
+
+## Docker Deployment
+
+### Token-Based (Recommended)
+```yaml
+services:
+  cloudflared:
+    image: cloudflare/cloudflared:latest
+    command: tunnel --no-autoupdate run --token ${TUNNEL_TOKEN}
+    restart: unless-stopped
+```
+
+### Local Config
+```yaml
+services:
+  cloudflared:
+    image: cloudflare/cloudflared:latest
+    volumes:
+      - ./config.yml:/etc/cloudflared/config.yml:ro
+      - ./credentials.json:/etc/cloudflared/credentials.json:ro
+    command: tunnel run
+```
+
+## Kubernetes Deployment
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: cloudflared
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: cloudflared
+  template:
+    metadata:
+      labels:
+        app: cloudflared
+    spec:
+      containers:
+        - name: cloudflared
+          image: cloudflare/cloudflared:latest
+          args:
+            - tunnel
+            - --no-autoupdate
+            - run
+            - --token
+            - $(TUNNEL_TOKEN)
+          env:
+            - name: TUNNEL_TOKEN
+              valueFrom:
+                secretKeyRef:
+                  name: tunnel-credentials
+                  key: token
+```
+
+## High Availability
+
+```yaml
+# Same config on multiple servers
+tunnel: <tunnel-UUID>
+credentials-file: /path/to/creds.json
+
+ingress:
+  - hostname: app.example.com
+    service: http://localhost:8000
+  - service: http_status:404
+```
+
+Run the same config on multiple machines; Cloudflare automatically load balances across replicas. Long-lived connections (WebSocket, SSH) may drop during updates.
+ +## Use Cases + +### Web Application +```yaml +ingress: + - hostname: myapp.example.com + service: http://localhost:3000 + - service: http_status:404 +``` + +### SSH Access +```yaml +ingress: + - hostname: ssh.example.com + service: ssh://localhost:22 + - service: http_status:404 +``` + +Client: `cloudflared access ssh --hostname ssh.example.com` + +### gRPC Service +```yaml +ingress: + - hostname: grpc.example.com + service: http://localhost:50051 + originRequest: + http2Origin: true + - service: http_status:404 +``` + +## Infrastructure as Code + +### Terraform + +```hcl +resource "random_id" "tunnel_secret" { + byte_length = 32 +} + +resource "cloudflare_tunnel" "app" { + account_id = var.cloudflare_account_id + name = "app-tunnel" + secret = random_id.tunnel_secret.b64_std +} + +resource "cloudflare_tunnel_config" "app" { + account_id = var.cloudflare_account_id + tunnel_id = cloudflare_tunnel.app.id + config { + ingress_rule { + hostname = "app.example.com" + service = "http://localhost:8000" + } + ingress_rule { service = "http_status:404" } + } +} + +resource "cloudflare_record" "app" { + zone_id = var.cloudflare_zone_id + name = "app" + value = cloudflare_tunnel.app.cname + type = "CNAME" + proxied = true +} + +output "tunnel_token" { + value = cloudflare_tunnel.app.tunnel_token + sensitive = true +} +``` + +### Pulumi + +```typescript +import * as cloudflare from "@pulumi/cloudflare"; +import * as random from "@pulumi/random"; + +const secret = new random.RandomId("secret", { byteLength: 32 }); + +const tunnel = new cloudflare.ZeroTrustTunnelCloudflared("tunnel", { + accountId: accountId, + name: "app-tunnel", + secret: secret.b64Std, +}); + +const config = new cloudflare.ZeroTrustTunnelCloudflaredConfig("config", { + accountId: accountId, + tunnelId: tunnel.id, + config: { + ingressRules: [ + { hostname: "app.example.com", service: "http://localhost:8000" }, + { service: "http_status:404" }, + ], + }, +}); + +new cloudflare.Record("dns", { + zoneId: 
zoneId, + name: "app", + value: tunnel.cname, + type: "CNAME", + proxied: true, +}); +``` + +## Service Installation + +### Linux systemd +```bash +cloudflared service install +systemctl start cloudflared && systemctl enable cloudflared +journalctl -u cloudflared -f # Logs +``` + +### macOS launchd +```bash +sudo cloudflared service install +sudo launchctl start com.cloudflare.cloudflared +``` diff --git a/cloudflare/references/turn/README.md b/cloudflare/references/turn/README.md new file mode 100644 index 0000000..cc4b39e --- /dev/null +++ b/cloudflare/references/turn/README.md @@ -0,0 +1,82 @@ +# Cloudflare TURN Service + +Expert guidance for implementing Cloudflare TURN Service in WebRTC applications. + +## Overview + +Cloudflare TURN (Traversal Using Relays around NAT) Service is a managed relay service for WebRTC applications. TURN acts as a relay point for traffic between WebRTC clients and SFUs, particularly when direct peer-to-peer communication is obstructed by NATs or firewalls. The service runs on Cloudflare's global anycast network across 310+ cities. + +## Key Characteristics + +- **Anycast Architecture**: Automatically connects clients to the closest Cloudflare location +- **Global Network**: Available across Cloudflare's entire network (excluding China Network) +- **Zero Configuration**: No need to manually select regions or servers +- **Protocol Support**: STUN/TURN over UDP, TCP, and TLS +- **Free Tier**: Free when used with Cloudflare Calls SFU, otherwise $0.05/GB outbound + +## In This Reference + +| File | Purpose | +|------|---------| +| [api.md](./api.md) | Credentials API, TURN key management, types, constraints | +| [configuration.md](./configuration.md) | Worker setup, wrangler.jsonc, env vars, IP allowlisting | +| [patterns.md](./patterns.md) | Implementation patterns, use cases, integration examples | +| [gotchas.md](./gotchas.md) | Troubleshooting, limits, security, common mistakes | + +## Reading Order + +| Task | Files to Read | Est. 
Tokens | +|------|---------------|-------------| +| Quick start | README only | ~500 | +| Generate credentials | README → api | ~1300 | +| Worker integration | README → configuration → patterns | ~2000 | +| Debug connection | gotchas | ~700 | +| Security review | api → gotchas | ~1500 | +| Enterprise firewall | configuration | ~600 | + +## Service Addresses and Ports + +### STUN over UDP +- **Primary**: `stun.cloudflare.com:3478/udp` +- **Alternate**: `stun.cloudflare.com:53/udp` (blocked by browsers, not recommended) + +### TURN over UDP +- **Primary**: `turn.cloudflare.com:3478/udp` +- **Alternate**: `turn.cloudflare.com:53/udp` (blocked by browsers) + +### TURN over TCP +- **Primary**: `turn.cloudflare.com:3478/tcp` +- **Alternate**: `turn.cloudflare.com:80/tcp` + +### TURN over TLS +- **Primary**: `turn.cloudflare.com:5349/tcp` +- **Alternate**: `turn.cloudflare.com:443/tcp` + +## Quick Start + +1. **Create TURN key via API**: see [api.md#create-turn-key](./api.md#create-turn-key) +2. **Generate credentials**: see [api.md#generate-temporary-credentials](./api.md#generate-temporary-credentials) +3. **Configure Worker**: see [configuration.md#cloudflare-worker-integration](./configuration.md#cloudflare-worker-integration) +4. 
**Implement client**: see [patterns.md#basic-turn-configuration-browser](./patterns.md#basic-turn-configuration-browser) + +## When to Use TURN + +- **Restrictive NATs**: Symmetric NATs that block direct connections +- **Corporate firewalls**: Environments blocking WebRTC ports +- **Mobile networks**: Carrier-grade NAT scenarios +- **Predictable connectivity**: When reliability > efficiency + +## Related Cloudflare Services + +- **Cloudflare Calls SFU**: Managed Selective Forwarding Unit (TURN free when used with SFU) +- **Cloudflare Stream**: Video streaming with WHIP/WHEP support +- **Cloudflare Workers**: Backend for credential generation +- **Cloudflare KV**: Credential caching +- **Cloudflare Durable Objects**: Session state management + +## Additional Resources + +- [Cloudflare Calls Documentation](https://developers.cloudflare.com/calls/) +- [Cloudflare TURN Service Docs](https://developers.cloudflare.com/realtime/turn/) +- [Cloudflare API Reference](https://developers.cloudflare.com/api/resources/calls/subresources/turn/) +- [Orange Meets (Open Source Example)](https://github.com/cloudflare/orange) diff --git a/cloudflare/references/turn/api.md b/cloudflare/references/turn/api.md new file mode 100644 index 0000000..498f5e4 --- /dev/null +++ b/cloudflare/references/turn/api.md @@ -0,0 +1,239 @@ +# TURN API Reference + +Complete API documentation for Cloudflare TURN service credentials and key management. + +## Authentication + +All endpoints require Cloudflare API token with "Calls Write" permission. 
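As a sketch, the key-management calls can be wrapped in a small typed client. The endpoint paths mirror the routes documented in this file; the helper names (`turnKeysUrl`, `cfApi`) are my own, not Cloudflare's.

```typescript
const CF_API = "https://api.cloudflare.com/client/v4";

// Pure helper: build the TURN key-management URLs.
export function turnKeysUrl(accountId: string, keyId?: string): string {
  const base = `${CF_API}/accounts/${accountId}/calls/turn_keys`;
  return keyId ? `${base}/${keyId}` : base;
}

// Thin wrapper that attaches the "Calls Write" API token to every request.
export async function cfApi(
  url: string,
  apiToken: string,
  init: { method?: string; body?: string; headers?: Record<string, string> } = {}
): Promise<unknown> {
  const res = await fetch(url, {
    method: init.method ?? "GET",
    body: init.body,
    headers: {
      ...init.headers,
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
  });
  if (!res.ok) throw new Error(`Cloudflare API error: ${res.status}`);
  return res.json();
}

// Example (requires real credentials):
// await cfApi(turnKeysUrl(accountId), apiToken, {
//   method: "POST",
//   body: JSON.stringify({ name: "my-turn-key" }),
// });
```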
+ +Base URL: `https://api.cloudflare.com/client/v4` + +## TURN Key Management + +### List TURN Keys + +``` +GET /accounts/{account_id}/calls/turn_keys +``` + +### Get TURN Key Details + +``` +GET /accounts/{account_id}/calls/turn_keys/{key_id} +``` + +### Create TURN Key + +``` +POST /accounts/{account_id}/calls/turn_keys +Content-Type: application/json + +{ + "name": "my-turn-key" +} +``` + +**Response includes**: +- `uid`: Key identifier +- `key`: The actual secret key (only returned on creation—save immediately) +- `name`: Human-readable name +- `created`: ISO 8601 timestamp +- `modified`: ISO 8601 timestamp + +### Update TURN Key + +``` +PUT /accounts/{account_id}/calls/turn_keys/{key_id} +Content-Type: application/json + +{ + "name": "updated-name" +} +``` + +### Delete TURN Key + +``` +DELETE /accounts/{account_id}/calls/turn_keys/{key_id} +``` + +## Generate Temporary Credentials + +``` +POST https://rtc.live.cloudflare.com/v1/turn/keys/{key_id}/credentials/generate +Authorization: Bearer {key_secret} +Content-Type: application/json + +{ + "ttl": 86400 +} +``` + +### Credential Constraints + +| Parameter | Min | Max | Default | Notes | +|-----------|-----|-----|---------|-------| +| ttl | 1 | 172800 (48hrs) | varies | API rejects values >172800 | + +**CRITICAL**: Maximum TTL is 48 hours (172800 seconds). API will reject requests exceeding this limit. + +### Response Schema + +```json +{ + "iceServers": { + "urls": [ + "stun:stun.cloudflare.com:3478", + "turn:turn.cloudflare.com:3478?transport=udp", + "turn:turn.cloudflare.com:3478?transport=tcp", + "turn:turn.cloudflare.com:53?transport=udp", + "turn:turn.cloudflare.com:80?transport=tcp", + "turns:turn.cloudflare.com:5349?transport=tcp", + "turns:turn.cloudflare.com:443?transport=tcp" + ], + "username": "1738035200:user123", + "credential": "base64encodedhmac==" + } +} +``` + +**Port 53 Warning**: Filter port 53 URLs for browser clients—blocked by Chrome/Firefox. 
See [gotchas.md](./gotchas.md#using-port-53-in-browsers).
+
+## Revoke Credentials
+
+```
+POST https://rtc.live.cloudflare.com/v1/turn/keys/{key_id}/credentials/revoke
+Authorization: Bearer {key_secret}
+Content-Type: application/json
+
+{
+  "username": "1738035200:user123"
+}
+```
+
+**Response**: 204 No Content
+
+Billing stops immediately. Active connections drop after a short delay (~seconds).
+
+## TypeScript Types
+
+```typescript
+interface CloudflareTURNConfig {
+  keyId: string;
+  keySecret: string;
+  ttl?: number; // Max 172800 (48 hours)
+}
+
+interface TURNCredentialsRequest {
+  ttl?: number; // Max 172800 seconds
+}
+
+interface TURNCredentialsResponse {
+  iceServers: {
+    urls: string[];
+    username: string;
+    credential: string;
+  };
+}
+
+interface RTCIceServer {
+  urls: string | string[];
+  username?: string;
+  credential?: string;
+  credentialType?: "password";
+}
+
+interface TURNKeyResponse {
+  uid: string;
+  key: string; // Only present on creation
+  name: string;
+  created: string;
+  modified: string;
+}
+```
+
+## Validation Function
+
+```typescript
+function validateRTCIceServer(obj: unknown): obj is RTCIceServer {
+  if (!obj || typeof obj !== 'object') {
+    return false;
+  }
+
+  const server = obj as Record<string, unknown>;
+
+  if (typeof server.urls !== 'string' && !Array.isArray(server.urls)) {
+    return false;
+  }
+
+  if (server.username && typeof server.username !== 'string') {
+    return false;
+  }
+
+  if (server.credential && typeof server.credential !== 'string') {
+    return false;
+  }
+
+  return true;
+}
+```
+
+## Type-Safe Credential Generation
+
+```typescript
+async function fetchTURNServers(
+  config: CloudflareTURNConfig
+): Promise<RTCIceServer[]> {
+  // Validate TTL constraint
+  const ttl = config.ttl ?? 3600;
+  if (ttl > 172800) {
+    throw new Error('TTL cannot exceed 172800 seconds (48 hours)');
+  }
+
+  const response = await fetch(
+    `https://rtc.live.cloudflare.com/v1/turn/keys/${config.keyId}/credentials/generate`,
+    {
+      method: 'POST',
+      headers: {
+        'Authorization': `Bearer ${config.keySecret}`,
+        'Content-Type': 'application/json'
+      },
+      body: JSON.stringify({ ttl })
+    }
+  );
+
+  if (!response.ok) {
+    throw new Error(`TURN credential generation failed: ${response.status}`);
+  }
+
+  const data = await response.json();
+
+  // Filter port 53 for browser clients
+  const filteredUrls = data.iceServers.urls.filter(
+    (url: string) => !url.includes(':53')
+  );
+
+  const iceServers = [
+    { urls: 'stun:stun.cloudflare.com:3478' },
+    {
+      urls: filteredUrls,
+      username: data.iceServers.username,
+      credential: data.iceServers.credential,
+      credentialType: 'password' as const
+    }
+  ];
+
+  // Validate before returning
+  if (!iceServers.every(validateRTCIceServer)) {
+    throw new Error('Invalid ICE server configuration received');
+  }
+
+  return iceServers;
+}
+```
+
+## See Also
+
+- [configuration.md](./configuration.md) - Worker setup, environment variables
+- [patterns.md](./patterns.md) - Implementation examples using these APIs
+- [gotchas.md](./gotchas.md) - Security best practices, common mistakes
diff --git a/cloudflare/references/turn/configuration.md b/cloudflare/references/turn/configuration.md
new file mode 100644
index 0000000..2d49736
--- /dev/null
+++ b/cloudflare/references/turn/configuration.md
@@ -0,0 +1,179 @@
+# TURN Configuration
+
+Setup and configuration for Cloudflare TURN service in Workers and applications.
+
+## Environment Variables
+
+```bash
+# .env
+CLOUDFLARE_ACCOUNT_ID=your_account_id
+CLOUDFLARE_API_TOKEN=your_api_token
+TURN_KEY_ID=your_turn_key_id
+TURN_KEY_SECRET=your_turn_key_secret
+```
+
+Validate with zod:
+
+```typescript
+import { z } from 'zod';
+
+const envSchema = z.object({
+  CLOUDFLARE_ACCOUNT_ID: z.string().min(1),
+  CLOUDFLARE_API_TOKEN: z.string().min(1),
+  TURN_KEY_ID: z.string().min(1),
+  TURN_KEY_SECRET: z.string().min(1)
+});
+
+export const config = envSchema.parse(process.env);
+```
+
+## wrangler.jsonc
+
+```jsonc
+{
+  "name": "turn-credentials-api",
+  "main": "src/index.ts",
+  "compatibility_date": "2025-01-01",
+  "vars": {
+    "TURN_KEY_ID": "your-turn-key-id" // Non-sensitive, can be in vars
+  },
+  "env": {
+    "production": {
+      "kv_namespaces": [
+        {
+          "binding": "CREDENTIALS_CACHE",
+          "id": "your-kv-namespace-id"
+        }
+      ]
+    }
+  }
+}
+```
+
+**Store secrets separately**:
+```bash
+wrangler secret put TURN_KEY_SECRET
+```
+
+## Cloudflare Worker Integration
+
+### Worker Binding Types
+
+```typescript
+interface Env {
+  TURN_KEY_ID: string;
+  TURN_KEY_SECRET: string;
+  CREDENTIALS_CACHE?: KVNamespace;
+}
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    // See patterns.md for implementation
+  }
+}
+```
+
+### Basic Worker Example
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    if (request.url.endsWith('/turn-credentials')) {
+      // Validate client auth
+      const authHeader = request.headers.get('Authorization');
+      if (!authHeader) {
+        return new Response('Unauthorized', { status: 401 });
+      }
+
+      const response = await fetch(
+        `https://rtc.live.cloudflare.com/v1/turn/keys/${env.TURN_KEY_ID}/credentials/generate`,
+        {
+          method: 'POST',
+          headers: {
+            'Authorization': `Bearer ${env.TURN_KEY_SECRET}`,
+            'Content-Type': 'application/json'
+          },
+          body: JSON.stringify({ ttl: 3600 })
+        }
+      );
+
+      if (!response.ok) {
+        return new Response('Failed to generate credentials', { status: 500 });
+      }
+
+      const data = await response.json();
+
+      // Filter port 53 for browser clients
+      const filteredUrls = data.iceServers.urls.filter(
+        (url: string) => !url.includes(':53')
+      );
+
+      return Response.json({
+        iceServers: [
+          { urls: 'stun:stun.cloudflare.com:3478' },
+          {
+            urls: filteredUrls,
+            username: data.iceServers.username,
+            credential: data.iceServers.credential
+          }
+        ]
+      });
+    }
+
+    return new Response('Not found', { status: 404 });
+  }
+};
+```
+
+## IP Allowlisting (Enterprise/Firewall)
+
+For strict firewalls, allowlist these IPs for `turn.cloudflare.com`:
+
+| Type | Address | Protocol |
+|------|---------|----------|
+| IPv4 | 141.101.90.1/32 | All |
+| IPv4 | 162.159.207.1/32 | All |
+| IPv6 | 2a06:98c1:3200::1/128 | All |
+| IPv6 | 2606:4700:48::1/128 | All |
+
+**IMPORTANT**: These IPs may change with 14-day notice. Monitor DNS:
+
+```bash
+# Check A and AAAA records
+dig turn.cloudflare.com A
+dig turn.cloudflare.com AAAA
+```
+
+Set up automated monitoring to detect IP changes and update allowlists within 14 days.
+
+## IPv6 Support
+
+- **Client-to-TURN**: Both IPv4 and IPv6 supported
+- **Relay addresses**: IPv4 only (no RFC 6156 support)
+- **TCP relaying**: Not supported (RFC 6062)
+
+Clients can connect via IPv6, but relayed traffic uses IPv4 addresses.
+ +## TLS Configuration + +### Supported TLS Versions +- TLS 1.1 +- TLS 1.2 +- TLS 1.3 + +### Recommended Ciphers (TLS 1.3) +- AEAD-AES128-GCM-SHA256 +- AEAD-AES256-GCM-SHA384 +- AEAD-CHACHA20-POLY1305-SHA256 + +### Recommended Ciphers (TLS 1.2) +- ECDHE-ECDSA-AES128-GCM-SHA256 +- ECDHE-RSA-AES128-GCM-SHA256 +- ECDHE-RSA-AES128-SHA (also TLS 1.1) +- AES128-GCM-SHA256 + +## See Also + +- [api.md](./api.md) - TURN key creation, credential generation API +- [patterns.md](./patterns.md) - Full Worker implementation patterns +- [gotchas.md](./gotchas.md) - Security best practices, troubleshooting diff --git a/cloudflare/references/turn/gotchas.md b/cloudflare/references/turn/gotchas.md new file mode 100644 index 0000000..e2d5bd1 --- /dev/null +++ b/cloudflare/references/turn/gotchas.md @@ -0,0 +1,231 @@ +# TURN Gotchas & Troubleshooting + +Common mistakes, security best practices, and troubleshooting for Cloudflare TURN. + +## Quick Reference + +| Issue | Solution | Details | +|-------|----------|---------| +| Credentials not working | Check TTL ≤ 48hrs | [See Troubleshooting](#issue-turn-credentials-not-working) | +| Connection drops after ~48hrs | Implement credential refresh | [See Connection Drops](#issue-connection-drops-after-48-hours) | +| Port 53 fails in browser | Filter server-side | [See Port 53](#using-port-53-in-browsers) | +| High packet loss | Check rate limits | [See Rate Limits](#limits-per-turn-allocation) | +| Connection fails after maintenance | Implement ICE restart | [See ICE Restart](#ice-restart-required-scenarios) | + +## Critical Constraints + +| Constraint | Value | Consequence if Violated | +|------------|-------|-------------------------| +| Max credential TTL | 48 hours (172800s) | API rejects request | +| Credential revocation delay | ~seconds | Billing stops immediately, connection drops shortly | +| IP allowlist update window | 14 days (if IPs change) | Connection fails if IPs change | +| Packet rate | 5-10k pps per allocation | Packet 
drops | +| Data rate | 50-100 Mbps per allocation | Packet drops | +| Unique IP rate | >5 new IPs/sec | Packet drops | + +## Limits Per TURN Allocation + +**Per user** (not account-wide): + +- **IP addresses**: >5 new unique IPs per second +- **Packet rate**: 5-10k packets per second (inbound/outbound) +- **Data rate**: 50-100 Mbps (inbound/outbound) +- **MTU**: No specific limit +- **Burst rates**: Higher than documented + +Exceeding limits results in **packet drops**. + +## Common Mistakes + +### Setting TTL > 48 hours + +```typescript +// ❌ BAD: API will reject +const creds = await generate({ ttl: 604800 }); // 7 days + +// ✅ GOOD: +const creds = await generate({ ttl: 86400 }); // 24 hours +``` + +### Hardcoding IPs without monitoring + +```typescript +// ❌ BAD: IPs can change with 14-day notice +const iceServers = [{ urls: 'turn:141.101.90.1:3478' }]; + +// ✅ GOOD: Use DNS +const iceServers = [{ urls: 'turn:turn.cloudflare.com:3478' }]; +``` + +### Using port 53 in browsers + +```typescript +// ❌ BAD: Blocked by Chrome/Firefox +urls: ['turn:turn.cloudflare.com:53'] + +// ✅ GOOD: Filter port 53 +urls: urls.filter(url => !url.includes(':53')) +``` + +### Not handling credential expiry + +```typescript +// ❌ BAD: Credentials expire but call continues → connection drops +const creds = await fetchCreds(); +const pc = new RTCPeerConnection({ iceServers: creds }); + +// ✅ GOOD: Refresh before expiry +setInterval(() => refreshCredentials(pc), 3000000); // 50 min +``` + +### Missing ICE restart support + +```typescript +// ❌ BAD: No recovery from TURN maintenance +pc.addEventListener('iceconnectionstatechange', () => { + console.log('State changed:', pc.iceConnectionState); +}); + +// ✅ GOOD: Implement ICE restart +pc.addEventListener('iceconnectionstatechange', async () => { + if (pc.iceConnectionState === 'failed') { + await refreshCredentials(pc); + pc.restartIce(); + } +}); +``` + +### Exposing TURN key secret client-side + +```typescript +// ❌ BAD: Secret exposed 
to client +const secret = 'your-turn-key-secret'; +const response = await fetch(`https://rtc.live.cloudflare.com/v1/turn/...`, { + headers: { 'Authorization': `Bearer ${secret}` } +}); + +// ✅ GOOD: Generate credentials server-side +const response = await fetch('/api/turn-credentials'); +``` + +## ICE Restart Required Scenarios + +These events require ICE restart (see [patterns.md](./patterns.md#ice-restart-pattern)): + +1. **TURN server maintenance** (occasional on Cloudflare's network) +2. **Network topology changes** (anycast routing changes) +3. **Credential refresh** during long sessions (>1 hour) +4. **Connection failure** (iceConnectionState === 'failed') + +Implement in all production apps: + +```typescript +pc.addEventListener('iceconnectionstatechange', async () => { + if (pc.iceConnectionState === 'failed' || + pc.iceConnectionState === 'disconnected') { + await refreshTURNCredentials(pc); + pc.restartIce(); + const offer = await pc.createOffer({ iceRestart: true }); + await pc.setLocalDescription(offer); + // Send offer to peer via signaling... 
+ } +}); +``` + +Reference: [RFC 8445 Section 2.4](https://datatracker.ietf.org/doc/html/rfc8445#section-2.4) + +## Security Checklist + +- [ ] Credentials generated server-side only (never client-side) +- [ ] TURN_KEY_SECRET in wrangler secrets, not vars +- [ ] TTL ≤ expected session duration (and ≤ 48 hours) +- [ ] Rate limiting on credential generation endpoint +- [ ] Client authentication before issuing credentials +- [ ] Credential revocation API for compromised sessions +- [ ] No hardcoded IPs (or DNS monitoring in place) +- [ ] Port 53 filtered for browser clients + +## Troubleshooting + +### Issue: TURN credentials not working + +**Check:** +- Key ID and secret are correct +- Credentials haven't expired (check TTL) +- TTL doesn't exceed 172800 seconds (48 hours) +- Server can reach rtc.live.cloudflare.com +- Network allows outbound HTTPS + +**Solution:** +```typescript +// Validate before using +if (ttl > 172800) { + throw new Error('TTL cannot exceed 48 hours'); +} +``` + +### Issue: Slow connection establishment + +**Solutions:** +- Ensure proper ICE candidate gathering +- Check network latency to Cloudflare edge +- Verify firewall allows WebRTC ports (3478, 5349, 443) +- Consider using TURN over TLS (port 443) for corporate networks + +### Issue: High packet loss + +**Check:** +- Not exceeding rate limits (5-10k pps) +- Not exceeding bandwidth limits (50-100 Mbps) +- Not connecting to too many unique IPs (>5/sec) +- Client network quality + +### Issue: Connection drops after ~48 hours + +**Cause**: Credentials expired (48hr max) + +**Solution**: +- Set TTL to expected session duration +- Implement credential refresh with setConfiguration() +- Use ICE restart if connection fails + +```typescript +// Refresh credentials before expiry +const refreshInterval = ttl * 1000 - 60000; // 1 min early +setInterval(async () => { + await refreshTURNCredentials(pc); +}, refreshInterval); +``` + +### Issue: Port 53 URLs in browser fail silently + +**Cause**: 
Chrome/Firefox block port 53 + +**Solution**: Filter port 53 URLs server-side: + +```typescript +const filtered = urls.filter(url => !url.includes(':53')); +``` + +### Issue: Hardcoded IPs stop working + +**Cause**: Cloudflare changed IP addresses (14-day notice) + +**Solution**: +- Use DNS hostnames (`turn.cloudflare.com`) +- Monitor DNS changes with automated alerts +- Update allowlists within 14 days if using IP allowlisting + +## Cost Optimization + +1. Use appropriate TTLs (don't over-provision) +2. Implement credential caching +3. Set `iceTransportPolicy: 'all'` to try direct first (use `'relay'` only when necessary) +4. Monitor bandwidth usage +5. Free when used with Cloudflare Calls SFU + +## See Also + +- [api.md](./api.md) - Credential generation API, revocation +- [configuration.md](./configuration.md) - IP allowlisting, monitoring +- [patterns.md](./patterns.md) - ICE restart, credential refresh patterns diff --git a/cloudflare/references/turn/patterns.md b/cloudflare/references/turn/patterns.md new file mode 100644 index 0000000..39333be --- /dev/null +++ b/cloudflare/references/turn/patterns.md @@ -0,0 +1,213 @@ +# TURN Implementation Patterns + +Production-ready patterns for implementing Cloudflare TURN in WebRTC applications. 
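The refresh and caching patterns in this file all hinge on two small time calculations: capping the requested TTL at the API's 48-hour ceiling, and scheduling a refresh shortly before expiry. A sketch (the helper names are illustrative, not Cloudflare API):

```typescript
const MAX_TTL_SECONDS = 172800; // documented API maximum: 48 hours

// Clamp a requested TTL to the 48-hour ceiling instead of letting the API reject it.
export function clampTtl(ttlSeconds: number): number {
  if (ttlSeconds <= 0) throw new Error("TTL must be positive");
  return Math.min(ttlSeconds, MAX_TTL_SECONDS);
}

// Milliseconds to wait before refreshing: one minute before expiry by default,
// matching the safety margin used in the caching pattern below.
export function refreshDelayMs(ttlSeconds: number, marginMs = 60_000): number {
  return Math.max(clampTtl(ttlSeconds) * 1000 - marginMs, 0);
}
```

With a one-hour TTL this yields a 59-minute refresh timer, which can feed directly into `setInterval`.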
+ +## Prerequisites + +Before implementing these patterns, ensure you have: +- TURN key created: see [api.md#create-turn-key](./api.md#create-turn-key) +- Worker configured: see [configuration.md#cloudflare-worker-integration](./configuration.md#cloudflare-worker-integration) + +## Basic TURN Configuration (Browser) + +```typescript +interface RTCIceServer { + urls: string | string[]; + username?: string; + credential?: string; + credentialType?: "password" | "oauth"; +} + +async function getTURNConfig(): Promise { + const response = await fetch('/api/turn-credentials'); + const data = await response.json(); + + return [ + { + urls: 'stun:stun.cloudflare.com:3478' + }, + { + urls: [ + 'turn:turn.cloudflare.com:3478?transport=udp', + 'turn:turn.cloudflare.com:3478?transport=tcp', + 'turns:turn.cloudflare.com:5349?transport=tcp', + 'turns:turn.cloudflare.com:443?transport=tcp' + ], + username: data.username, + credential: data.credential, + credentialType: 'password' + } + ]; +} + +// Use in RTCPeerConnection +const iceServers = await getTURNConfig(); +const peerConnection = new RTCPeerConnection({ iceServers }); +``` + +## Port Selection Strategy + +Recommended order for browser clients: + +1. **3478/udp** (primary, lowest latency) +2. **3478/tcp** (fallback for UDP-blocked networks) +3. **5349/tls** (corporate firewalls, most reliable) +4. **443/tls** (alternate TLS port, firewall-friendly) + +**Avoid port 53**—blocked by Chrome and Firefox. 
+ +```typescript +function filterICEServersForBrowser(urls: string[]): string[] { + return urls + .filter(url => !url.includes(':53')) // Remove port 53 + .sort((a, b) => { + // Prioritize UDP over TCP over TLS + if (a.includes('transport=udp')) return -1; + if (b.includes('transport=udp')) return 1; + if (a.includes('transport=tcp') && !a.startsWith('turns:')) return -1; + if (b.includes('transport=tcp') && !b.startsWith('turns:')) return 1; + return 0; + }); +} +``` + +## Credential Refresh (Mid-Session) + +When credentials expire during long calls: + +```typescript +async function refreshTURNCredentials(pc: RTCPeerConnection): Promise { + const newCreds = await fetch('/turn-credentials').then(r => r.json()); + const config = pc.getConfiguration(); + config.iceServers = newCreds.iceServers; + pc.setConfiguration(config); + // Note: setConfiguration() does NOT trigger ICE restart + // Combine with restartIce() if connection fails +} + +// Auto-refresh before expiry +setInterval(async () => { + await refreshTURNCredentials(peerConnection); +}, 3000000); // 50 minutes if TTL is 1 hour +``` + +## ICE Restart Pattern + +After network change, TURN server maintenance, or credential expiry: + +```typescript +pc.addEventListener('iceconnectionstatechange', async () => { + if (pc.iceConnectionState === 'failed') { + console.warn('ICE connection failed, restarting...'); + + // Refresh credentials + await refreshTURNCredentials(pc); + + // Trigger ICE restart + pc.restartIce(); + const offer = await pc.createOffer({ iceRestart: true }); + await pc.setLocalDescription(offer); + + // Send offer to peer via signaling channel... 
+  }
+});
+```
+
+## Credentials Caching Pattern
+
+```typescript
+class TURNCredentialsManager {
+  private creds: { username: string; credential: string; urls: string[]; expiresAt: number; } | null = null;
+
+  async getCredentials(keyId: string, keySecret: string): Promise<RTCIceServer[]> {
+    const now = Date.now();
+
+    if (this.creds && this.creds.expiresAt > now) {
+      return this.buildIceServers(this.creds);
+    }
+
+    const ttl = 3600;
+    if (ttl > 172800) throw new Error('TTL max 48hrs');
+
+    const res = await fetch(
+      `https://rtc.live.cloudflare.com/v1/turn/keys/${keyId}/credentials/generate`,
+      {
+        method: 'POST',
+        headers: { 'Authorization': `Bearer ${keySecret}`, 'Content-Type': 'application/json' },
+        body: JSON.stringify({ ttl })
+      }
+    );
+
+    const data = await res.json();
+    const filteredUrls = data.iceServers.urls.filter((url: string) => !url.includes(':53'));
+
+    this.creds = {
+      username: data.iceServers.username,
+      credential: data.iceServers.credential,
+      urls: filteredUrls,
+      expiresAt: now + (ttl * 1000) - 60000
+    };
+
+    return this.buildIceServers(this.creds);
+  }
+
+  private buildIceServers(c: { username: string; credential: string; urls: string[] }): RTCIceServer[] {
+    return [
+      { urls: 'stun:stun.cloudflare.com:3478' },
+      { urls: c.urls, username: c.username, credential: c.credential, credentialType: 'password' as const }
+    ];
+  }
+}
+```
+
+## Common Use Cases
+
+```typescript
+// Video conferencing: TURN as fallback
+const fallbackConfig = { iceServers: await getTURNConfig(), iceTransportPolicy: 'all' };
+
+// IoT/predictable connectivity: force TURN
+const relayConfig = { iceServers: await getTURNConfig(), iceTransportPolicy: 'relay' };
+
+// Screen sharing: reduce overhead
+const pc = new RTCPeerConnection({ iceServers: await getTURNConfig(), bundlePolicy: 'max-bundle' });
+```
+
+## Integration with Cloudflare Calls SFU
+
+```typescript
+// TURN is automatically used when needed
+// Cloudflare Calls handles TURN + SFU coordination
+const session = await
callsClient.createSession({ + appId: 'your-app-id', + sessionId: 'meeting-123' +}); +``` + +## Debugging ICE Connectivity + +```typescript +pc.addEventListener('icecandidate', (event) => { + if (event.candidate) { + console.log('ICE candidate:', event.candidate.type, event.candidate.protocol); + } +}); + +pc.addEventListener('iceconnectionstatechange', () => { + console.log('ICE state:', pc.iceConnectionState); +}); + +// Check selected candidate pair +const stats = await pc.getStats(); +stats.forEach(report => { + if (report.type === 'candidate-pair' && report.selected) { + console.log('Selected:', report); + } +}); +``` + +## See Also + +- [api.md](./api.md) - Credential generation API, types +- [configuration.md](./configuration.md) - Worker setup, environment variables +- [gotchas.md](./gotchas.md) - Common mistakes, troubleshooting diff --git a/cloudflare/references/turnstile/README.md b/cloudflare/references/turnstile/README.md new file mode 100644 index 0000000..29de799 --- /dev/null +++ b/cloudflare/references/turnstile/README.md @@ -0,0 +1,99 @@ +# Cloudflare Turnstile Implementation Skill Reference + +Expert guidance for implementing Cloudflare Turnstile - a smart CAPTCHA alternative that protects websites from bots without showing traditional CAPTCHA puzzles. + +## Overview + +Turnstile is a user-friendly CAPTCHA alternative that runs challenges in the background without user interaction. It validates visitors automatically using signals like browser behavior, device fingerprinting, and machine learning. + +## Widget Types + +| Type | Interaction | Use Case | +|------|-------------|----------| +| **Managed** (default) | Shows checkbox when needed | Forms, logins - balance UX and security | +| **Non-Interactive** | Invisible, runs automatically | Frictionless UX, low-risk actions | +| **Invisible** | Hidden, triggered programmatically | Pre-clearance, API calls, headless | + +## Quick Start + +### Implicit Rendering (HTML-based) +```html + + + + +
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>
+
+<form action="/submit" method="POST">
+  <div class="cf-turnstile" data-sitekey="YOUR_SITE_KEY"></div>
+  <button type="submit">Submit</button>
+</form>
+```
+
+### Explicit Rendering (JavaScript-based)
+```html
+<div id="turnstile-container"></div>
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?onload=onloadTurnstileCallback" defer></script>
+<script>
+  window.onloadTurnstileCallback = function () {
+    turnstile.render('#turnstile-container', {
+      sitekey: 'YOUR_SITE_KEY',
+      callback: (token) => console.log('Token:', token)
+    });
+  };
+</script>
+```
+
+### Server Validation (Required)
+```javascript
+// Cloudflare Workers
+export default {
+  async fetch(request, env) {
+    const formData = await request.formData();
+    const token = formData.get('cf-turnstile-response');
+
+    const result = await fetch('https://challenges.cloudflare.com/turnstile/v0/siteverify', {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({
+        secret: env.TURNSTILE_SECRET,
+        response: token,
+        remoteip: request.headers.get('CF-Connecting-IP')
+      })
+    });
+
+    const validation = await result.json();
+    if (!validation.success) {
+      return new Response('Invalid CAPTCHA', { status: 400 });
+    }
+    // Process form...
+    return new Response('Success');
+  }
+}
+```
+
+## Testing Keys
+
+**Critical for development/testing:**
+
+| Type | Key | Behavior |
+|------|-----|----------|
+| **Site Key (Always Passes)** | `1x00000000000000000000AA` | Widget succeeds, token validates |
+| **Site Key (Always Blocks)** | `2x00000000000000000000AB` | Widget fails visibly |
+| **Site Key (Force Challenge)** | `3x00000000000000000000FF` | Always shows interactive challenge |
+| **Secret Key (Testing)** | `1x0000000000000000000000000000000AA` | Validates test tokens |
+
+**Note:** Test keys work on `localhost` and any domain. Do NOT use in production.
+
+## Key Constraints
+
+- **Token expiry:** 5 minutes after generation
+- **Single-use:** Each token can only be validated once
+- **Server validation required:** Client-side checks are insufficient
+
+## Reading Order
+
+1. **[configuration.md](configuration.md)** - Setup, widget options, script loading
+2. **[api.md](api.md)** - JavaScript API, siteverify endpoints, TypeScript types
+3. **[patterns.md](patterns.md)** - Form integration, framework examples, validation patterns
+4. 
**[gotchas.md](gotchas.md)** - Common errors, debugging, limitations + +## See Also + +- [Cloudflare Turnstile Docs](https://developers.cloudflare.com/turnstile/) +- [Dashboard](https://dash.cloudflare.com/?to=/:account/turnstile) diff --git a/cloudflare/references/turnstile/api.md b/cloudflare/references/turnstile/api.md new file mode 100644 index 0000000..f65ee4f --- /dev/null +++ b/cloudflare/references/turnstile/api.md @@ -0,0 +1,240 @@ +# API Reference + +## Client-Side JavaScript API + +The Turnstile JavaScript API is available at `window.turnstile` after loading the script. + +### `turnstile.render(container, options)` + +Renders a Turnstile widget into a container element. + +**Parameters:** +- `container` (string | HTMLElement): CSS selector or DOM element +- `options` (TurnstileOptions): Configuration object (see [configuration.md](configuration.md)) + +**Returns:** `string` - Widget ID for use with other API methods + +**Example:** +```javascript +const widgetId = window.turnstile.render('#my-container', { + sitekey: 'YOUR_SITE_KEY', + callback: (token) => console.log('Success:', token), + 'error-callback': (code) => console.error('Error:', code) +}); +``` + +### `turnstile.reset(widgetId)` + +Resets a widget (clears token, resets challenge state). Useful when form validation fails. + +**Parameters:** +- `widgetId` (string): Widget ID from `render()`, or container element + +**Returns:** `void` + +**Example:** +```javascript +// Reset on form error +if (!validateForm()) { + window.turnstile.reset(widgetId); +} +``` + +### `turnstile.remove(widgetId)` + +Removes a widget from the DOM completely. + +**Parameters:** +- `widgetId` (string): Widget ID from `render()` + +**Returns:** `void` + +**Example:** +```javascript +// Cleanup on navigation +window.turnstile.remove(widgetId); +``` + +### `turnstile.getResponse(widgetId)` + +Gets the current token from a widget (if challenge completed). 
+ +**Parameters:** +- `widgetId` (string): Widget ID from `render()`, or container element + +**Returns:** `string | undefined` - Token string, or undefined if not ready + +**Example:** +```javascript +const token = window.turnstile.getResponse(widgetId); +if (token) { + submitForm(token); +} +``` + +### `turnstile.isExpired(widgetId)` + +Checks if a widget's token has expired (>5 minutes old). + +**Parameters:** +- `widgetId` (string): Widget ID from `render()` + +**Returns:** `boolean` - True if expired + +**Example:** +```javascript +if (window.turnstile.isExpired(widgetId)) { + window.turnstile.reset(widgetId); +} +``` + +## Callback Signatures + +```typescript +type TurnstileCallback = (token: string) => void; +type ErrorCallback = (errorCode: string) => void; +type TimeoutCallback = () => void; +type ExpiredCallback = () => void; +type BeforeInteractiveCallback = () => void; +type AfterInteractiveCallback = () => void; +type UnsupportedCallback = () => void; +``` + +## Siteverify API (Server-Side) + +**Endpoint:** `https://challenges.cloudflare.com/turnstile/v0/siteverify` + +### Request + +**Method:** POST +**Content-Type:** `application/json` or `application/x-www-form-urlencoded` + +```typescript +interface SiteverifyRequest { + secret: string; // Your secret key (never expose client-side) + response: string; // Token from cf-turnstile-response + remoteip?: string; // User's IP (optional but recommended) + idempotency_key?: string; // Unique key for idempotent validation +} +``` + +**Example:** +```javascript +// Cloudflare Workers +const result = await fetch('https://challenges.cloudflare.com/turnstile/v0/siteverify', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + secret: env.TURNSTILE_SECRET, + response: token, + remoteip: request.headers.get('CF-Connecting-IP') + }) +}); +const data = await result.json(); +``` + +### Response + +```typescript +interface SiteverifyResponse { + success: boolean; // 
Validation result + challenge_ts?: string; // ISO timestamp of challenge + hostname?: string; // Hostname where widget was solved + 'error-codes'?: string[]; // Error codes if success=false + action?: string; // Action name from widget config + cdata?: string; // Custom data from widget config +} +``` + +**Example Success:** +```json +{ + "success": true, + "challenge_ts": "2024-01-15T10:30:00Z", + "hostname": "example.com", + "action": "login", + "cdata": "user123" +} +``` + +**Example Failure:** +```json +{ + "success": false, + "error-codes": ["timeout-or-duplicate"] +} +``` + +## Error Codes + +| Code | Cause | Solution | +|------|-------|----------| +| `missing-input-secret` | Secret key not provided | Include `secret` in request | +| `invalid-input-secret` | Secret key is wrong | Check secret key in dashboard | +| `missing-input-response` | Token not provided | Include `response` token | +| `invalid-input-response` | Token is invalid/malformed | Verify token from widget | +| `timeout-or-duplicate` | Token expired (>5min) or reused | Generate new token, validate once | +| `internal-error` | Cloudflare server error | Retry with exponential backoff | +| `bad-request` | Malformed request | Check JSON/form encoding | + +## TypeScript Types + +```typescript +interface TurnstileOptions { + sitekey: string; + action?: string; + cData?: string; + callback?: (token: string) => void; + 'error-callback'?: (errorCode: string) => void; + 'expired-callback'?: () => void; + 'timeout-callback'?: () => void; + 'before-interactive-callback'?: () => void; + 'after-interactive-callback'?: () => void; + 'unsupported-callback'?: () => void; + theme?: 'light' | 'dark' | 'auto'; + size?: 'normal' | 'compact' | 'flexible'; + tabindex?: number; + 'response-field'?: boolean; + 'response-field-name'?: string; + retry?: 'auto' | 'never'; + 'retry-interval'?: number; + language?: string; + execution?: 'render' | 'execute'; + appearance?: 'always' | 'execute' | 'interaction-only'; + 
'refresh-expired'?: 'auto' | 'manual' | 'never';
+}
+
+interface Turnstile {
+  render(container: string | HTMLElement, options: TurnstileOptions): string;
+  reset(widgetId: string): void;
+  remove(widgetId: string): void;
+  getResponse(widgetId: string): string | undefined;
+  isExpired(widgetId: string): boolean;
+  execute(container?: string | HTMLElement, options?: TurnstileOptions): void;
+}
+
+declare global {
+  interface Window {
+    turnstile: Turnstile;
+    onloadTurnstileCallback?: () => void;
+  }
+}
+```
+
+## Script Loading
+
+```html
+<!-- Implicit rendering: auto-renders elements with class="cf-turnstile" -->
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>
+
+<!-- Explicit rendering: render manually via turnstile.render() -->
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?render=explicit" defer></script>
+
+<!-- With onload callback -->
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?onload=onloadTurnstileCallback" defer></script>
+```
\ No newline at end of file
diff --git a/cloudflare/references/turnstile/configuration.md b/cloudflare/references/turnstile/configuration.md
new file mode 100644
index 0000000..215bdab
--- /dev/null
+++ b/cloudflare/references/turnstile/configuration.md
@@ -0,0 +1,222 @@
+# Configuration
+
+## Script Loading
+
+### Basic (Implicit Rendering)
+```html
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>
+```
+Automatically renders widgets with `class="cf-turnstile"` on page load.
+
+### Explicit Rendering
+```html
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?render=explicit" defer></script>
+```
+Manual control over when/where widgets render via `window.turnstile.render()`.
+
+### With Load Callback
+```html
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?onload=onloadTurnstileCallback" defer></script>
+<script>
+  window.onloadTurnstileCallback = function () {
+    // Safe to call turnstile.render() from here
+  };
+</script>
+```
+
+### Compatibility Mode
+```html
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?compat=recaptcha" async defer></script>
+```
+Provides `grecaptcha` API for Google reCAPTCHA drop-in replacement.
+ +## Widget Configuration + +### Complete Options Object + +```javascript +{ + // Required + sitekey: 'YOUR_SITE_KEY', // Widget sitekey from dashboard + + // Callbacks + callback: (token) => {}, // Success - token ready + 'error-callback': (code) => {}, // Error occurred + 'expired-callback': () => {}, // Token expired (>5min) + 'timeout-callback': () => {}, // Challenge timeout + 'before-interactive-callback': () => {}, // Before showing checkbox + 'after-interactive-callback': () => {}, // After user interacts + 'unsupported-callback': () => {}, // Browser doesn't support Turnstile + + // Appearance + theme: 'auto', // 'light' | 'dark' | 'auto' + size: 'normal', // 'normal' | 'compact' | 'flexible' + tabindex: 0, // Tab order (accessibility) + language: 'auto', // ISO 639-1 code or 'auto' + + // Behavior + execution: 'render', // 'render' (auto) | 'execute' (manual) + appearance: 'always', // 'always' | 'execute' | 'interaction-only' + retry: 'auto', // 'auto' | 'never' + 'retry-interval': 8000, // Retry interval (ms), default 8000 + 'refresh-expired': 'auto', // 'auto' | 'manual' | 'never' + + // Form Integration + 'response-field': true, // Add hidden input (default: true) + 'response-field-name': 'cf-turnstile-response', // Hidden input name + + // Analytics & Data + action: 'login', // Action name (for analytics) + cData: 'user-session-123', // Custom data (returned in siteverify) +} +``` + +### Key Options Explained + +**`execution`:** +- `'render'` (default): Challenge starts immediately on render +- `'execute'`: Wait for `turnstile.execute()` call + +**`appearance`:** +- `'always'` (default): Widget always visible +- `'execute'`: Hidden until `execute()` called +- `'interaction-only'`: Hidden until user interaction needed + +**`refresh-expired`:** +- `'auto'` (default): Auto-refresh expired tokens +- `'manual'`: App must call `reset()` after expiry +- `'never'`: No refresh, expired-callback triggered + +**`retry`:** +- `'auto'` (default): Auto-retry 
failed challenges
+- `'never'`: Don't retry, trigger error-callback
+
+## HTML Data Attributes
+
+For implicit rendering, use data attributes on `<div class="cf-turnstile">
`:
+
+| JavaScript Property | HTML Data Attribute | Example |
+|---------------------|---------------------|---------|
+| `sitekey` | `data-sitekey` | `data-sitekey="YOUR_KEY"` |
+| `action` | `data-action` | `data-action="login"` |
+| `cData` | `data-cdata` | `data-cdata="session-123"` |
+| `callback` | `data-callback` | `data-callback="onSuccess"` |
+| `error-callback` | `data-error-callback` | `data-error-callback="onError"` |
+| `expired-callback` | `data-expired-callback` | `data-expired-callback="onExpired"` |
+| `timeout-callback` | `data-timeout-callback` | `data-timeout-callback="onTimeout"` |
+| `theme` | `data-theme` | `data-theme="dark"` |
+| `size` | `data-size` | `data-size="compact"` |
+| `tabindex` | `data-tabindex` | `data-tabindex="0"` |
+| `response-field` | `data-response-field` | `data-response-field="false"` |
+| `response-field-name` | `data-response-field-name` | `data-response-field-name="token"` |
+| `retry` | `data-retry` | `data-retry="never"` |
+| `retry-interval` | `data-retry-interval` | `data-retry-interval="5000"` |
+| `language` | `data-language` | `data-language="en"` |
+| `execution` | `data-execution` | `data-execution="execute"` |
+| `appearance` | `data-appearance` | `data-appearance="interaction-only"` |
+| `refresh-expired` | `data-refresh-expired` | `data-refresh-expired="manual"` |
+
+**Example:**
+```html
+<div class="cf-turnstile"
+     data-sitekey="YOUR_SITE_KEY"
+     data-theme="dark"
+     data-action="login"
+     data-callback="onSuccess"></div>
+```
+
+## Content Security Policy
+
+Add these directives to CSP header/meta tag:
+
+```
+script-src https://challenges.cloudflare.com;
+frame-src https://challenges.cloudflare.com;
+```
+
+**Full Example:**
+```html
+<meta http-equiv="Content-Security-Policy"
+      content="script-src 'self' https://challenges.cloudflare.com; frame-src https://challenges.cloudflare.com;">
+```
+
+## Framework-Specific Setup
+
+### React
+```bash
+npm install @marsidev/react-turnstile
+```
+```jsx
+import { Turnstile } from '@marsidev/react-turnstile';
+
+<Turnstile
+  siteKey="YOUR_SITE_KEY"
+  onSuccess={(token) => console.log(token)}
+/>
+```
+
+### Vue
+```bash
+npm install vue-turnstile
+```
+```vue
+<script setup>
+import VueTurnstile from 'vue-turnstile';
+import { ref } from 'vue';
+
+const token = ref('');
+</script>
+
+<template>
+  <vue-turnstile site-key="YOUR_SITE_KEY" v-model="token" />
+</template>
+```
+
+### Svelte
+```bash
+npm install svelte-turnstile
+```
+```svelte
+<script>
+  import { Turnstile } from 'svelte-turnstile';
+</script>
+
+<Turnstile siteKey="YOUR_SITE_KEY" on:turnstile-callback={(e) => console.log(e.detail.token)} />
+```
+
+### Next.js (App Router)
+```tsx
+// app/components/TurnstileWidget.tsx
+'use client';
+import { useEffect, useRef } from 'react';
+
+export default function TurnstileWidget({ sitekey, onSuccess }) {
+  const ref = useRef<HTMLDivElement>(null);
+
+  useEffect(() => {
+    if (ref.current && window.turnstile) {
+      const widgetId = window.turnstile.render(ref.current, {
+        sitekey,
+        callback: onSuccess
+      });
+      return () => window.turnstile.remove(widgetId);
+    }
+  }, [sitekey, onSuccess]);
+
+  return <div ref={ref} />;
+}
+```
+
+## Cloudflare Pages Plugin
+
+```bash
+npm install @cloudflare/pages-plugin-turnstile
+```
+
+```typescript
+// functions/_middleware.ts
+import turnstilePlugin from '@cloudflare/pages-plugin-turnstile';
+
+export const onRequest = turnstilePlugin({
+  secret: 'YOUR_SECRET_KEY',
+  onError: () => new Response('CAPTCHA failed', { status: 403 })
+});
+```
\ No newline at end of file
diff --git a/cloudflare/references/turnstile/gotchas.md b/cloudflare/references/turnstile/gotchas.md
new file mode 100644
index 0000000..f7556b4
--- /dev/null
+++ b/cloudflare/references/turnstile/gotchas.md
@@ -0,0 +1,218 @@
+# Troubleshooting & Gotchas
+
+## Critical Rules
+
+### ❌ Skipping Server-Side Validation
+**Problem:** Client-only validation is easily bypassed.
+
+**Solution:** Always validate on server.
+```javascript
+// CORRECT - Server validates token
+app.post('/submit', async (req, res) => {
+  const token = req.body['cf-turnstile-response'];
+  const validation = await fetch('https://challenges.cloudflare.com/turnstile/v0/siteverify', {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ secret: SECRET, response: token })
+  }).then(r => r.json());
+
+  if (!validation.success) return res.status(403).json({ error: 'CAPTCHA failed' });
+  res.json({ ok: true });
+});
+```
+
+### ❌ Exposing Secret Key
+**Problem:** Secret key leaked in client-side code.
+
+**Solution:** Server-side validation only. Never send secret to client.
+
+### ❌ Reusing Tokens (Single-Use Rule)
+**Problem:** Tokens are single-use. Revalidation fails with `timeout-or-duplicate`.
+
+**Solution:** Generate new token for each submission. Reset widget on error.
+```javascript
+if (!response.ok) window.turnstile.reset(widgetId);
+```
+
+### ❌ Not Handling Token Expiry
+**Problem:** Tokens expire after 5 minutes.
+
+**Solution:** Handle expiry callback or use auto-refresh.
+```javascript
+const widgetId = window.turnstile.render('#container', {
+  sitekey: 'YOUR_SITE_KEY',
+  'refresh-expired': 'auto', // or 'manual' with expired-callback
+  'expired-callback': () => window.turnstile.reset(widgetId)
+});
+```
+
+## Common Errors
+
+| Error | Cause | Solution |
+|-------|-------|----------|
+| **Widget not rendering** | Incorrect sitekey, CSP blocking, file:// protocol | Check sitekey, add CSP for challenges.cloudflare.com, use http:// |
+| **timeout-or-duplicate** | Token expired (>5min) or reused | Generate fresh token, don't cache >5min |
+| **invalid-input-secret** | Wrong secret key | Verify secret from dashboard, check env vars |
+| **missing-input-response** | Token not sent | Check form field name is 'cf-turnstile-response' |
+
+## Framework Gotchas
+
+### React: Widget Re-mounting
+**Problem:** Widget re-renders on state change, losing token.
+
+**Solution:** Control lifecycle with useRef.
+```tsx
+import { useEffect, useRef } from 'react';
+
+function TurnstileWidget({ onToken }) {
+  const containerRef = useRef<HTMLDivElement>(null);
+  const widgetIdRef = useRef<string | null>(null);
+
+  useEffect(() => {
+    if (containerRef.current && !widgetIdRef.current) {
+      widgetIdRef.current = window.turnstile.render(containerRef.current, {
+        sitekey: 'YOUR_SITE_KEY',
+        callback: onToken
+      });
+    }
+    return () => {
+      if (widgetIdRef.current) {
+        window.turnstile.remove(widgetIdRef.current);
+        widgetIdRef.current = null;
+      }
+    };
+  }, []);
+
+  return <div ref={containerRef} />;
+}
+```
+
+### React StrictMode: Double Render
+**Problem:** Widget renders twice in dev due to StrictMode.
+
+**Solution:** Use cleanup function.
+```tsx
+useEffect(() => {
+  const widgetId = window.turnstile.render('#container', { sitekey });
+  return () => window.turnstile.remove(widgetId);
+}, []);
+```
+
+### Next.js: SSR Hydration
+**Problem:** `window.turnstile` undefined during SSR.
+
+**Solution:** Use `'use client'` or dynamic import with `ssr: false`.
+```tsx
+'use client';
+export default function Turnstile() { /* component */ }
+```
+
+### SPA: Navigation Without Cleanup
+**Problem:** Navigating leaves orphaned widgets.
+
+**Solution:** Remove widget in cleanup.
+```javascript
+// Vue
+onBeforeUnmount(() => window.turnstile.remove(widgetId));
+
+// React
+useEffect(() => () => window.turnstile.remove(widgetId), []);
+```
+
+## Network & Security
+
+### CSP Blocking
+**Problem:** Content Security Policy blocks script/iframe.
+
+**Solution:** Add CSP directives.
+```html
+<meta http-equiv="Content-Security-Policy"
+      content="script-src 'self' https://challenges.cloudflare.com; frame-src https://challenges.cloudflare.com;">
+```
+
+### IP Address Forwarding
+**Problem:** Server receives proxy IP instead of client IP.
+
+**Solution:** Use correct header.
+```javascript
+// Cloudflare Workers
+const ip = request.headers.get('CF-Connecting-IP');
+
+// Behind proxy
+const ip = request.headers.get('X-Forwarded-For')?.split(',')[0];
+```
+
+### CORS (Siteverify)
+**Problem:** CORS error calling siteverify from browser.
+
+**Solution:** Never call siteverify client-side. Call your backend, backend calls siteverify.
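The browser side of that flow is then just a POST to your own endpoint. A minimal sketch — the `/api/verify` route name is an illustrative assumption, not part of the Turnstile API:

```typescript
// Browser-side sketch: the token goes to YOUR endpoint only; that endpoint
// (here a hypothetical /api/verify) is what calls siteverify with the secret.
async function verifyViaBackend(token: string): Promise<boolean> {
  const res = await fetch('/api/verify', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ 'cf-turnstile-response': token })
  });
  return res.ok;
}
```

The field name `cf-turnstile-response` matches what the hidden input and the server examples above expect.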
+ +## Limits & Constraints + +| Limit | Value | Impact | +|-------|-------|--------| +| Token validity | 5 minutes | Must regenerate after expiry | +| Token use | Single-use | Cannot revalidate same token | +| Widget size | 300x65px (normal), 130x120px (compact) | Plan layout | + +## Debugging + +### Console Logging +```javascript +window.turnstile.render('#container', { + sitekey: 'YOUR_SITE_KEY', + callback: (token) => console.log('✓ Token:', token), + 'error-callback': (code) => console.error('✗ Error:', code), + 'expired-callback': () => console.warn('⏱ Expired'), + 'timeout-callback': () => console.warn('⏱ Timeout') +}); +``` + +### Check Token State +```javascript +const token = window.turnstile.getResponse(widgetId); +console.log('Token:', token || 'NOT READY'); +console.log('Expired:', window.turnstile.isExpired(widgetId)); +``` + +### Test Keys (Use First) +Always develop with test keys before production: +- Site: `1x00000000000000000000AA` +- Secret: `1x0000000000000000000000000000000AA` + +### Network Tab +- Verify `api.js` loads (200 OK) +- Check siteverify request/response +- Look for 4xx/5xx errors + +## Misconfigurations + +### Wrong Key Pairing +**Problem:** Site key from one widget, secret from another. + +**Solution:** Verify site key and secret are from same widget in dashboard. + +### Test Keys in Production +**Problem:** Using test keys in production. + +**Solution:** Environment-based keys. +```javascript +const SITE_KEY = process.env.NODE_ENV === 'production' + ? process.env.TURNSTILE_SITE_KEY + : '1x00000000000000000000AA'; +``` + +### Missing Environment Variables +**Problem:** Secret undefined on server. + +**Solution:** Check .env and verify loading. 
+```bash
+# .env
+TURNSTILE_SECRET=your_secret_here
+```
+
+```javascript
+// Verify at startup
+console.log('Secret loaded:', !!process.env.TURNSTILE_SECRET);
+```
+
+## Reference
+
+- [Turnstile Docs](https://developers.cloudflare.com/turnstile/)
+- [Dashboard](https://dash.cloudflare.com/?to=/:account/turnstile)
+- [Error Codes](https://developers.cloudflare.com/turnstile/troubleshooting/)
diff --git a/cloudflare/references/turnstile/patterns.md b/cloudflare/references/turnstile/patterns.md
new file mode 100644
index 0000000..c147a6f
--- /dev/null
+++ b/cloudflare/references/turnstile/patterns.md
@@ -0,0 +1,193 @@
+# Common Patterns
+
+## Form Integration
+
+### Basic Form (Implicit Rendering)
+
+```html
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>
+
+<form action="/submit" method="POST">
+  <input type="email" name="email" placeholder="Email" required>
+  <div class="cf-turnstile" data-sitekey="YOUR_SITE_KEY"></div>
+  <button type="submit">Submit</button>
+</form>
+```
+
+### Controlled Form (Explicit Rendering)
+
+```javascript
+let widgetId;
+
+window.onloadTurnstileCallback = () => {
+  widgetId = window.turnstile.render('#turnstile-container', {
+    sitekey: 'YOUR_SITE_KEY'
+  });
+};
+
+document.querySelector('#my-form').addEventListener('submit', (e) => {
+  e.preventDefault();
+  const token = window.turnstile.getResponse(widgetId);
+  if (!token) return; // Challenge not solved yet
+  fetch('/submit', { method: 'POST', body: new FormData(e.target) });
+});
+```
+
+## Framework Patterns
+
+### React
+
+```tsx
+import { useState } from 'react';
+import { Turnstile } from '@marsidev/react-turnstile';
+
+export default function Form() {
+  const [token, setToken] = useState<string | null>(null);
+
+  return (
+    <form onSubmit={async (e) =>
{
+      e.preventDefault();
+      if (!token) return;
+      await fetch('/api/submit', {
+        method: 'POST',
+        body: JSON.stringify({ 'cf-turnstile-response': token })
+      });
+    }}>
+      <Turnstile siteKey="YOUR_SITE_KEY" onSuccess={setToken} />
+      <button type="submit" disabled={!token}>Submit</button>
+    </form>
+  );
+}
+```
+
+### Vue / Svelte
+
+```vue
+<!-- Vue (vue-turnstile) -->
+<vue-turnstile site-key="YOUR_SITE_KEY" v-model="token" />
+
+<!-- Svelte (svelte-turnstile) -->
+<Turnstile siteKey="YOUR_SITE_KEY" on:turnstile-callback={(e) => token = e.detail.token} />
+```
+
+## Server Validation
+
+### Cloudflare Workers
+
+```typescript
+interface Env {
+  TURNSTILE_SECRET: string;
+}
+
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    if (request.method !== 'POST') {
+      return new Response('Method not allowed', { status: 405 });
+    }
+
+    const formData = await request.formData();
+    const token = formData.get('cf-turnstile-response');
+
+    if (!token) {
+      return new Response('Missing token', { status: 400 });
+    }
+
+    // Validate token
+    const ip = request.headers.get('CF-Connecting-IP');
+    const result = await fetch('https://challenges.cloudflare.com/turnstile/v0/siteverify', {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({
+        secret: env.TURNSTILE_SECRET,
+        response: token,
+        remoteip: ip
+      })
+    });
+
+    const validation = await result.json();
+
+    if (!validation.success) {
+      return new Response('CAPTCHA validation failed', { status: 403 });
+    }
+
+    // Process form...
+    return new Response('Success');
+  }
+};
+```
+
+### Pages Functions
+
+```typescript
+// functions/submit.ts - same pattern as Workers, use ctx.env and ctx.request
+export const onRequestPost: PagesFunction<{ TURNSTILE_SECRET: string }> = async (ctx) => {
+  const token = (await ctx.request.formData()).get('cf-turnstile-response');
+  // Validate with ctx.env.TURNSTILE_SECRET (same as Workers pattern above)
+};
+```
+
+## Advanced Patterns
+
+### Pre-Clearance (Invisible)
+
+```html
+<div id="invisible-widget"></div>
+<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?render=explicit" defer></script>
+<script>
+  // Hidden widget: challenge runs only when execute() is called
+  turnstile.render('#invisible-widget', {
+    sitekey: 'YOUR_SITE_KEY',
+    execution: 'execute',
+    appearance: 'interaction-only',
+    callback: (token) => {
+      // Attach token to the protected request
+    }
+  });
+
+  // Trigger before the sensitive action
+  turnstile.execute('#invisible-widget');
+</script>
+```
+
+### Token Refresh on Expiry
+
+```javascript
+let widgetId = window.turnstile.render('#container', {
+  sitekey: 'YOUR_SITE_KEY',
+  'refresh-expired': 'manual',
+  'expired-callback': () => {
+    console.log('Token expired, refreshing...');
+    window.turnstile.reset(widgetId);
+  }
+});
+```
+
+## Testing
+
+### Environment-Based Keys
+
+```javascript
+const SITE_KEY = process.env.NODE_ENV === 'production'
+  ? 'YOUR_PRODUCTION_SITE_KEY'
+  : '1x00000000000000000000AA'; // Always passes
+
+const SECRET_KEY = process.env.NODE_ENV === 'production'
+  ? process.env.TURNSTILE_SECRET
+  : '1x0000000000000000000000000000000AA';
+```
diff --git a/cloudflare/references/vectorize/README.md b/cloudflare/references/vectorize/README.md
new file mode 100644
index 0000000..9707a10
--- /dev/null
+++ b/cloudflare/references/vectorize/README.md
@@ -0,0 +1,133 @@
+# Cloudflare Vectorize
+
+Globally distributed vector database for AI applications. Store and query vector embeddings for semantic search, recommendations, RAG, and classification.
+
+**Status:** Generally Available (GA) | **Last Updated:** 2026-01-27
+
+## Quick Start
+
+```typescript
+// 1. Create index
+// npx wrangler vectorize create my-index --dimensions=768 --metric=cosine
+
+// 2. Configure binding (wrangler.jsonc)
+// { "vectorize": [{ "binding": "VECTORIZE", "index_name": "my-index" }] }
+
+// 3. 
Query vectors +const matches = await env.VECTORIZE.query(queryVector, { topK: 5 }); +``` + +## Key Features + +- **10M vectors per index** (V2) +- Dimensions up to 1536 (32-bit float) +- Three distance metrics: cosine, euclidean, dot-product +- Metadata filtering (up to 10 indexes) +- Namespace support (50K namespaces paid, 1K free) +- Seamless Workers AI integration +- Global distribution + +## Reading Order + +| Task | Files to Read | +|------|---------------| +| New to Vectorize | README only | +| Implement feature | README + api + patterns | +| Setup/configure | README + configuration | +| Debug issues | gotchas | +| Integrate with AI | README + patterns | +| RAG implementation | README + patterns | + +## File Guide + +- **README.md** (this file): Overview, quick decisions +- **api.md**: Runtime API, types, operations (query/insert/upsert) +- **configuration.md**: Setup, CLI, metadata indexes +- **patterns.md**: RAG, Workers AI, OpenAI, LangChain, multi-tenant +- **gotchas.md**: Limits, pitfalls, troubleshooting + +## Distance Metric Selection + +Choose based on your use case: + +``` +What are you building? +├─ Text/semantic search → cosine (most common) +├─ Image similarity → euclidean +├─ Recommendation system → dot-product +└─ Pre-normalized vectors → dot-product +``` + +| Metric | Best For | Score Interpretation | +|--------|----------|---------------------| +| `cosine` | Text embeddings, semantic similarity | Higher = closer (1.0 = identical) | +| `euclidean` | Absolute distance, spatial data | Lower = closer (0.0 = identical) | +| `dot-product` | Recommendations, normalized vectors | Higher = closer | + +**Note:** Index configuration is immutable. Cannot change dimensions or metric after creation. + +## Multi-Tenancy Strategy + +``` +How many tenants? 
+├─ < 50K tenants → Use namespaces (recommended) +│ ├─ Fastest (filter before vector search) +│ └─ Strict isolation +├─ > 50K tenants → Use metadata filtering +│ ├─ Slower (post-filter after vector search) +│ └─ Requires metadata index +└─ Per-tenant indexes → Only if compliance mandated + └─ 50K index limit per account (paid plan) +``` + +## Common Workflows + +### Semantic Search + +```typescript +// 1. Generate embedding +const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] }); + +// 2. Query Vectorize +const matches = await env.VECTORIZE.query(result.data[0], { + topK: 5, + returnMetadata: "indexed" +}); +``` + +### RAG Pattern + +```typescript +// 1. Generate query embedding +const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] }); + +// 2. Search Vectorize +const matches = await env.VECTORIZE.query(embedding.data[0], { topK: 5 }); + +// 3. Fetch full documents from R2/D1/KV +const docs = await Promise.all(matches.matches.map(m => + env.R2.get(m.metadata.key).then(obj => obj?.text()) +)); + +// 4. Generate LLM response with context +const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", { + prompt: `Context: ${docs.join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:` +}); +``` + +## Critical Gotchas + +See `gotchas.md` for details. Most important: + +1. **Async mutations**: Inserts take 5-10s to be queryable +2. **500 batch limit**: Workers API enforces 500 vectors per call (undocumented) +3. **Metadata truncation**: `"indexed"` returns first 64 bytes only +4. **topK with metadata**: Max 20 (not 100) when using returnValues or returnMetadata: "all" +5. 
**Metadata indexes first**: Must create before inserting vectors
+
+## Resources
+
+- [Official Docs](https://developers.cloudflare.com/vectorize/)
+- [Client API Reference](https://developers.cloudflare.com/vectorize/reference/client-api/)
+- [Workers AI Models](https://developers.cloudflare.com/workers-ai/models/#text-embeddings)
+- [Discord: #vectorize](https://discord.cloudflare.com)
diff --git a/cloudflare/references/vectorize/api.md b/cloudflare/references/vectorize/api.md
new file mode 100644
index 0000000..e29d87f
--- /dev/null
+++ b/cloudflare/references/vectorize/api.md
@@ -0,0 +1,88 @@
+# Vectorize API Reference
+
+## Types
+
+```typescript
+interface VectorizeVector {
+  id: string;            // Max 64 bytes
+  values: number[];      // Must match index dimensions
+  namespace?: string;    // Optional partition (max 64 bytes)
+  metadata?: Record<string, string | number | boolean | null>; // Max 10 KiB
+}
+```
+
+## Query
+
+```typescript
+const matches = await env.VECTORIZE.query(queryVector, {
+  topK: 10,                  // Max 100 (or 20 with returnValues/returnMetadata:"all")
+  returnMetadata: "indexed", // "none" | "indexed" | "all"
+  returnValues: false,
+  namespace: "tenant-123",
+  filter: { category: "docs" }
+});
+// matches.matches[0] = { id, score, metadata? }
+```
+
+**returnMetadata:** `"none"` (fastest) → `"indexed"` (recommended) → `"all"` (topK max 20)
+
+**queryById (V2 only):** Search using existing vector as query.
+```typescript
+await env.VECTORIZE.queryById("doc-123", { topK: 5 });
+```
+
+## Insert/Upsert
+
+```typescript
+// Insert: ignores duplicates (keeps first)
+await env.VECTORIZE.insert([{ id, values, metadata }]);
+
+// Upsert: overwrites duplicates (keeps last)
+await env.VECTORIZE.upsert([{ id, values, metadata }]);
+```
+
+**Max 500 vectors per call.** Queryable after 5-10 seconds.
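A tiny pure helper keeps call sites under that cap; `chunk` below is an illustrative utility, not part of the Vectorize API:

```typescript
// Illustrative helper (not a Vectorize API): split vectors into batches
// of at most 500 so each insert/upsert call stays within the limit.
function chunk<T>(items: T[], size = 500): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage sketch:
//   for (const batch of chunk(vectors)) await env.VECTORIZE.upsert(batch);
```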
+ +## Other Operations + +```typescript +// Get by IDs +const vectors = await env.VECTORIZE.getByIds(["id1", "id2"]); + +// Delete (max 1000 IDs per call) +await env.VECTORIZE.deleteByIds(["id1", "id2"]); + +// Index info +const info = await env.VECTORIZE.describe(); +// { dimensions, metric, vectorCount } +``` + +## Filtering + +Requires metadata index. Filter operators: + +| Operator | Example | +|----------|---------| +| `$eq` (implicit) | `{ category: "docs" }` | +| `$ne` | `{ status: { $ne: "deleted" } }` | +| `$in` / `$nin` | `{ tag: { $in: ["sale"] } }` | +| `$lt`, `$lte`, `$gt`, `$gte` | `{ price: { $lt: 100 } }` | + +**Constraints:** Max 2048 bytes, no dots/`$` in keys, values: string/number/boolean/null. + +## Performance + +| Configuration | topK Limit | Speed | +|--------------|------------|-------| +| No metadata | 100 | Fastest | +| `returnMetadata: "indexed"` | 100 | Fast | +| `returnMetadata: "all"` | 20 | Slower | +| `returnValues: true` | 20 | Slower | + +**Batch operations:** Always batch (500/call) for optimal throughput. + +```typescript +for (let i = 0; i < vectors.length; i += 500) { + await env.VECTORIZE.upsert(vectors.slice(i, i + 500)); +} +``` diff --git a/cloudflare/references/vectorize/configuration.md b/cloudflare/references/vectorize/configuration.md new file mode 100644 index 0000000..8c64d3e --- /dev/null +++ b/cloudflare/references/vectorize/configuration.md @@ -0,0 +1,88 @@ +# Vectorize Configuration + +## Create Index + +```bash +npx wrangler vectorize create my-index --dimensions=768 --metric=cosine +``` + +**⚠️ Dimensions and metric are immutable** - cannot change after creation. + +## Worker Binding + +```jsonc +// wrangler.jsonc +{ + "vectorize": [ + { "binding": "VECTORIZE", "index_name": "my-index" } + ] +} +``` + +```typescript +interface Env { + VECTORIZE: Vectorize; +} +``` + +## Metadata Indexes + +**Must create BEFORE inserting vectors** - existing vectors not retroactively indexed. 
+
+```bash
+wrangler vectorize create-metadata-index my-index --property-name=category --type=string
+wrangler vectorize create-metadata-index my-index --property-name=price --type=number
+```
+
+| Type | Use For |
+|------|---------|
+| `string` | Categories, tags (first 64 bytes indexed) |
+| `number` | Prices, timestamps |
+| `boolean` | Flags |
+
+## CLI Commands
+
+```bash
+# Index management
+wrangler vectorize list
+wrangler vectorize info <index-name>
+wrangler vectorize delete <index-name>
+
+# Vector operations
+wrangler vectorize insert <index-name> --file=embeddings.ndjson
+wrangler vectorize get <index-name> --ids=id1,id2
+wrangler vectorize delete-by-ids <index-name> --ids=id1,id2
+
+# Metadata indexes
+wrangler vectorize list-metadata-index <index-name>
+wrangler vectorize delete-metadata-index <index-name> --property-name=field
+```
+
+## Bulk Upload (NDJSON)
+
+```json
+{"id": "1", "values": [0.1, 0.2, ...], "metadata": {"category": "docs"}}
+{"id": "2", "values": [0.4, 0.5, ...], "namespace": "tenant-abc"}
+```
+
+**Limits:** 5000 vectors per file, 100 MB max
+
+## Cardinality Best Practice
+
+Bucket high-cardinality data:
+```typescript
+// ❌ Millisecond timestamps
+metadata: { timestamp: Date.now() }
+
+// ✅ 5-minute buckets
+metadata: { timestamp_bucket: Math.floor(Date.now() / 300000) * 300000 }
+```
+
+## Production Checklist
+
+1. Create index with correct dimensions
+2. Create metadata indexes FIRST
+3. Test bulk upload
+4. Configure bindings
+5. Deploy Worker
+6. Verify queries
diff --git a/cloudflare/references/vectorize/gotchas.md b/cloudflare/references/vectorize/gotchas.md
new file mode 100644
index 0000000..9282771
--- /dev/null
+++ b/cloudflare/references/vectorize/gotchas.md
@@ -0,0 +1,76 @@
+# Vectorize Gotchas
+
+## Critical Warnings
+
+### Async Mutations
+Insert/upsert/delete return immediately but vectors aren't queryable for 5-10 seconds.
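Code that reads immediately after writing has to account for this delay. A hedged polling sketch: `waitUntil` is a generic illustrative helper, the `getByIds` usage in the comment assumes a Worker with a `VECTORIZE` binding, and the 5-10 second window is typical behavior rather than a hard API guarantee.

```typescript
// Illustrative helper: poll an async check until it returns true,
// or give up after `retries` attempts. Not part of the Vectorize API.
async function waitUntil(
  check: () => Promise<boolean>,
  { retries = 10, delayMs = 1000 }: { retries?: number; delayMs?: number } = {}
): Promise<boolean> {
  for (let attempt = 0; attempt < retries; attempt++) {
    if (await check()) return true;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}

// Usage sketch: wait for a just-upserted vector to become readable.
// await env.VECTORIZE.upsert([{ id: "doc-123", values }]);
// const visible = await waitUntil(async () => {
//   const got = await env.VECTORIZE.getByIds(["doc-123"]);
//   return got.length > 0;
// });
```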
+ +### Batch Size Limit +**Workers API: 500 vectors max per call** (undocumented, silently truncates) + +```typescript +// ✅ Chunk into 500 +for (let i = 0; i < vectors.length; i += 500) { + await env.VECTORIZE.upsert(vectors.slice(i, i + 500)); +} +``` + +### Metadata Truncation +`returnMetadata: "indexed"` returns only first 64 bytes of strings. Use `"all"` for complete metadata (but max topK drops to 20). + +### topK Limits + +| returnMetadata | returnValues | Max topK | +|----------------|--------------|----------| +| `"none"` / `"indexed"` | `false` | 100 | +| `"all"` | any | **20** | +| any | `true` | **20** | + +### Metadata Indexes First +Create BEFORE inserting - existing vectors not retroactively indexed. + +```bash +# ✅ Create index FIRST +wrangler vectorize create-metadata-index my-index --property-name=category --type=string +wrangler vectorize insert my-index --file=data.ndjson +``` + +### Index Config Immutable +Cannot change dimensions/metric after creation. Must create new index and migrate. + +## Limits (V2) + +| Resource | Limit | +|----------|-------| +| Vectors per index | 10,000,000 | +| Max dimensions | 1536 | +| Batch upsert (Workers) | **500** | +| Indexed string metadata | **64 bytes** | +| Metadata indexes | 10 | +| Namespaces | 50,000 (paid) / 1,000 (free) | + +## Common Mistakes + +1. **Wrong embedding shape:** Extract `result.data[0]` from Workers AI +2. **Metadata index after data:** Re-upsert all vectors +3. **Insert vs upsert:** `insert` ignores duplicates, `upsert` overwrites +4. 
**Not batching:** Individual inserts ~1K/min, batched ~200K+/min + +## Troubleshooting + +**No results?** +- Wait 5-10s after insert +- Check namespace spelling (case-sensitive) +- Verify metadata index exists +- Check dimension mismatch + +**Metadata filter not working?** +- Index must exist before data insert +- Strings >64 bytes truncated +- Use dot notation for nested: `"product.category"` + +## Model Dimensions + +- `@cf/baai/bge-small-en-v1.5`: 384 +- `@cf/baai/bge-base-en-v1.5`: 768 +- `@cf/baai/bge-large-en-v1.5`: 1024 diff --git a/cloudflare/references/vectorize/patterns.md b/cloudflare/references/vectorize/patterns.md new file mode 100644 index 0000000..9ffa2cb --- /dev/null +++ b/cloudflare/references/vectorize/patterns.md @@ -0,0 +1,90 @@ +# Vectorize Patterns + +## Workers AI Integration + +```typescript +// Generate embedding + query +const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] }); +const matches = await env.VECTORIZE.query(result.data[0], { topK: 5 }); // Pass data[0]! +``` + +| Model | Dimensions | +|-------|------------| +| `@cf/baai/bge-small-en-v1.5` | 384 | +| `@cf/baai/bge-base-en-v1.5` | 768 (recommended) | +| `@cf/baai/bge-large-en-v1.5` | 1024 | + +## OpenAI Integration + +```typescript +const response = await openai.embeddings.create({ model: "text-embedding-ada-002", input: query }); +const matches = await env.VECTORIZE.query(response.data[0].embedding, { topK: 5 }); +``` + +## RAG Pattern + +```typescript +// 1. Embed query +const emb = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] }); + +// 2. Search vectors +const matches = await env.VECTORIZE.query(emb.data[0], { topK: 5, returnMetadata: "indexed" }); + +// 3. Fetch full docs from R2/D1/KV +const docs = await Promise.all(matches.matches.map(m => env.R2.get(m.metadata.key).then(o => o?.text()))); + +// 4. 
Generate with context +const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", { + prompt: `Context:\n${docs.filter(Boolean).join("\n\n")}\n\nQuestion: ${query}\n\nAnswer:` +}); +``` + +## Multi-Tenant + +### Namespaces (< 50K tenants, fastest) + +```typescript +await env.VECTORIZE.upsert([{ id: "1", values: emb, namespace: `tenant-${id}` }]); +await env.VECTORIZE.query(vec, { namespace: `tenant-${id}`, topK: 10 }); +``` + +### Metadata Filter (> 50K tenants) + +```bash +wrangler vectorize create-metadata-index my-index --property-name=tenantId --type=string +``` + +```typescript +await env.VECTORIZE.upsert([{ id: "1", values: emb, metadata: { tenantId: id } }]); +await env.VECTORIZE.query(vec, { filter: { tenantId: id }, topK: 10 }); +``` + +## Hybrid Search + +```typescript +const matches = await env.VECTORIZE.query(vec, { + topK: 20, + filter: { + category: { $in: ["tech", "science"] }, + published: { $gte: lastMonthTimestamp } + } +}); +``` + +## Batch Ingestion + +```typescript +const BATCH = 500; +for (let i = 0; i < vectors.length; i += BATCH) { + await env.VECTORIZE.upsert(vectors.slice(i, i + BATCH)); +} +``` + +## Best Practices + +1. **Pass `data[0]`** not `data` or full response +2. **Batch 500** vectors per upsert +3. **Create metadata indexes** before inserting +4. **Use namespaces** for tenant isolation (faster than filters) +5. **`returnMetadata: "indexed"`** for best speed/data balance +6. 
**Handle 5-10s mutation delay** in async operations diff --git a/cloudflare/references/waf/README.md b/cloudflare/references/waf/README.md new file mode 100644 index 0000000..052d0ae --- /dev/null +++ b/cloudflare/references/waf/README.md @@ -0,0 +1,113 @@ +# Cloudflare WAF Expert Skill Reference + +**Expertise**: Cloudflare Web Application Firewall (WAF) configuration, custom rules, managed rulesets, rate limiting, attack detection, and API integration + +## Overview + +Cloudflare WAF protects web applications from attacks through managed rulesets and custom rules. + +**Detection (Managed Rulesets)** +- Pre-configured rules maintained by Cloudflare +- CVE-based rules, OWASP Top 10 coverage +- Three main rulesets: Cloudflare Managed, OWASP CRS, Exposed Credentials +- Actions: log, block, challenge, js_challenge, managed_challenge + +**Mitigation (Custom Rules & Rate Limiting)** +- Custom expressions using Wirefilter syntax +- Attack score-based blocking (`cf.waf.score`) +- Rate limiting with per-IP, per-user, or custom characteristics +- Actions: block, challenge, js_challenge, managed_challenge, log, skip + +## Quick Start + +### Deploy Cloudflare Managed Ruleset +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ apiToken: process.env.CF_API_TOKEN }); + +// Deploy managed ruleset to zone +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_request_firewall_managed', + name: 'Deploy Cloudflare Managed Ruleset', + rules: [{ + action: 'execute', + action_parameters: { + id: 'efb7b8c949ac4650a09736fc376e9aee', // Cloudflare Managed Ruleset + }, + expression: 'true', + enabled: true, + }], +}); +``` + +### Create Custom Rule +```typescript +// Block requests with attack score >= 40 +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_request_firewall_custom', + name: 'Custom WAF Rules', + rules: [{ + action: 'block', + expression: 'cf.waf.score gt 40', + description: 
'Block high attack scores', + enabled: true, + }], +}); +``` + +### Create Rate Limit +```typescript +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_ratelimit', + name: 'API Rate Limits', + rules: [{ + action: 'block', + expression: 'http.request.uri.path eq "/api/login"', + action_parameters: { + ratelimit: { + characteristics: ['cf.colo.id', 'ip.src'], + period: 60, + requests_per_period: 10, + mitigation_timeout: 600, + }, + }, + enabled: true, + }], +}); +``` + +## Managed Ruleset Quick Reference + +| Ruleset Name | ID | Coverage | +|--------------|----|---------| +| Cloudflare Managed | `efb7b8c949ac4650a09736fc376e9aee` | OWASP Top 10, CVEs | +| OWASP Core Ruleset | `4814384a9e5d4991b9815dcfc25d2f1f` | OWASP ModSecurity CRS | +| Exposed Credentials Check | `c2e184081120413c86c3ab7e14069605` | Credential stuffing | + +## Phases + +WAF rules execute in specific phases: +- `http_request_firewall_managed` - Managed rulesets +- `http_request_firewall_custom` - Custom rules +- `http_ratelimit` - Rate limiting rules +- `http_request_sbfm` - Super Bot Fight Mode (Pro+) + +## Reading Order + +1. **[api.md](api.md)** - SDK methods, expressions, actions, parameters +2. **[configuration.md](configuration.md)** - Setup with Wrangler, Terraform, Pulumi +3. **[patterns.md](patterns.md)** - Common patterns: deploy managed, rate limiting, skip, override +4. 
**[gotchas.md](gotchas.md)** - Execution order, limits, expression errors + +## See Also + +- [Cloudflare WAF Docs](https://developers.cloudflare.com/waf/) +- [Ruleset Engine](https://developers.cloudflare.com/ruleset-engine/) +- [Expression Reference](https://developers.cloudflare.com/ruleset-engine/rules-language/) \ No newline at end of file diff --git a/cloudflare/references/waf/api.md b/cloudflare/references/waf/api.md new file mode 100644 index 0000000..a7bc9e0 --- /dev/null +++ b/cloudflare/references/waf/api.md @@ -0,0 +1,202 @@ +# API Reference + +## SDK Setup + +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ + apiToken: process.env.CF_API_TOKEN, +}); +``` + +## Core Methods + +```typescript +// List rulesets +await client.rulesets.list({ zone_id: 'zone_id', phase: 'http_request_firewall_managed' }); + +// Get ruleset +await client.rulesets.get({ zone_id: 'zone_id', ruleset_id: 'ruleset_id' }); + +// Create ruleset +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_request_firewall_custom', + name: 'Custom WAF Rules', + rules: [{ action: 'block', expression: 'cf.waf.score gt 40', enabled: true }], +}); + +// Update ruleset (include rule id to keep existing, omit id for new rules) +await client.rulesets.update({ + zone_id: 'zone_id', + ruleset_id: 'ruleset_id', + rules: [ + { id: 'rule_id', action: 'block', expression: 'cf.waf.score gt 40', enabled: true }, + { action: 'challenge', expression: 'http.request.uri.path contains "/admin"', enabled: true }, + ], +}); + +// Delete ruleset +await client.rulesets.delete({ zone_id: 'zone_id', ruleset_id: 'ruleset_id' }); +``` + +## Actions & Phases + +### Actions by Phase + +| Action | Custom | Managed | Rate Limit | Description | +|--------|--------|---------|------------|-------------| +| `block` | ✅ | ❌ | ✅ | Block request with 403 | +| `challenge` | ✅ | ❌ | ✅ | Show CAPTCHA challenge | +| `js_challenge` | ✅ | ❌ | ✅ | JS-based challenge | 
+| `managed_challenge` | ✅ | ❌ | ✅ | Smart challenge (recommended) | +| `log` | ✅ | ❌ | ✅ | Log only, don't block | +| `skip` | ✅ | ❌ | ❌ | Skip rule evaluation | +| `execute` | ❌ | ✅ | ❌ | Deploy managed ruleset | + +### Phases (Execution Order) + +1. `http_request_firewall_custom` - Custom rules (first line of defense) +2. `http_request_firewall_managed` - Managed rulesets (pre-configured protection) +3. `http_ratelimit` - Rate limiting (request throttling) +4. `http_request_sbfm` - Super Bot Fight Mode (Pro+ only) + +## Expression Syntax + +### Fields + +```typescript +// Request properties +http.request.method // GET, POST, etc. +http.request.uri.path // /api/users +http.host // example.com + +// IP and Geolocation +ip.src // 192.0.2.1 +ip.geoip.country // US, GB, etc. +ip.geoip.continent // NA, EU, etc. + +// Attack detection +cf.waf.score // 0-100 attack score +cf.waf.score.sqli // SQL injection score +cf.waf.score.xss // XSS score + +// Headers & Cookies +http.request.headers["authorization"][0] +http.request.cookies["session"][0] +lower(http.user_agent) // Lowercase user agent +``` + +### Operators + +```typescript +// Comparison +eq // Equal +ne // Not equal +lt // Less than +le // Less than or equal +gt // Greater than +ge // Greater than or equal + +// String matching +contains // Substring match +matches // Regex match (use carefully) +starts_with // Prefix match +ends_with // Suffix match + +// List operations +in // Value in list +not // Logical NOT +and // Logical AND +or // Logical OR +``` + +### Expression Examples + +```typescript +'cf.waf.score gt 40' // Attack score +'http.request.uri.path eq "/api/login" and http.request.method eq "POST"' // Path + method +'ip.src in {192.0.2.0/24 203.0.113.0/24}' // IP blocking +'ip.geoip.country in {"CN" "RU" "KP"}' // Country blocking +'http.user_agent contains "bot"' // User agent +'not http.request.headers["authorization"][0]' // Header check +'(cf.waf.score.sqli gt 20 or cf.waf.score.xss gt 20) and 
http.request.uri.path starts_with "/api"' // Complex +``` + +## Rate Limiting Configuration + +```typescript +{ + action: 'block', + expression: 'http.request.uri.path starts_with "/api"', + action_parameters: { + ratelimit: { + // Characteristics define uniqueness: 'ip.src', 'cf.colo.id', + // 'http.request.headers["key"][0]', 'http.request.cookies["session"][0]' + characteristics: ['cf.colo.id', 'ip.src'], // Recommended: per-IP per-datacenter + period: 60, // Time window in seconds + requests_per_period: 100, // Max requests in period + mitigation_timeout: 600, // Block duration in seconds + counting_expression: 'http.request.method ne "GET"', // Optional: filter counted requests + requests_to_origin: false, // Count all requests (not just origin hits) + }, + }, + enabled: true, +} +``` + +## Managed Ruleset Deployment + +```typescript +{ + action: 'execute', + action_parameters: { + id: 'efb7b8c949ac4650a09736fc376e9aee', // Cloudflare Managed + overrides: { + // Override specific rules + rules: [ + { id: '5de7edfa648c4d6891dc3e7f84534ffa', action: 'log', enabled: true }, + ], + // Override categories: 'wordpress', 'sqli', 'xss', 'rce', etc. + categories: [ + { category: 'wordpress', enabled: false }, + { category: 'sqli', action: 'log' }, + ], + }, + }, + expression: 'true', + enabled: true, +} +``` + +## Skip Rules + +Skip rules bypass subsequent rule evaluation. 
Two skip types: + +**Skip current ruleset**: Skip remaining rules in current phase only +```typescript +{ + action: 'skip', + action_parameters: { + ruleset: 'current', // Skip rest of current ruleset + }, + expression: 'http.request.uri.path ends_with ".jpg" or http.request.uri.path ends_with ".css"', + enabled: true, +} +``` + +**Skip entire phases**: Skip one or more phases completely +```typescript +{ + action: 'skip', + action_parameters: { + phases: ['http_request_firewall_managed', 'http_ratelimit'], // Skip multiple phases + }, + expression: 'ip.src in {192.0.2.0/24 203.0.113.0/24}', + enabled: true, +} +``` + +**Note**: Skip rules in custom phase can skip managed/ratelimit phases, but not vice versa (execution order). \ No newline at end of file diff --git a/cloudflare/references/waf/configuration.md b/cloudflare/references/waf/configuration.md new file mode 100644 index 0000000..796a291 --- /dev/null +++ b/cloudflare/references/waf/configuration.md @@ -0,0 +1,203 @@ +# Configuration + +## Prerequisites + +**API Token**: Create at https://dash.cloudflare.com/profile/api-tokens +- Permission: `Zone.WAF Edit` or `Zone.Firewall Services Edit` +- Zone Resources: Include specific zones or all zones + +**Zone ID**: Found in dashboard > Overview > API section (right sidebar) + +```bash +# Set environment variables +export CF_API_TOKEN="your_api_token_here" +export ZONE_ID="your_zone_id_here" +``` + +## TypeScript SDK Usage + +```bash +npm install cloudflare +``` + +```typescript +import Cloudflare from 'cloudflare'; + +const client = new Cloudflare({ apiToken: process.env.CF_API_TOKEN }); + +// Custom rules +await client.rulesets.create({ + zone_id: process.env.ZONE_ID, + kind: 'zone', + phase: 'http_request_firewall_custom', + name: 'Custom WAF', + rules: [ + { action: 'block', expression: 'cf.waf.score gt 50', enabled: true }, + { action: 'challenge', expression: 'http.request.uri.path eq "/admin"', enabled: true }, + ], +}); + +// Managed ruleset +await 
client.rulesets.create({ + zone_id: process.env.ZONE_ID, + phase: 'http_request_firewall_managed', + rules: [{ + action: 'execute', + action_parameters: { id: 'efb7b8c949ac4650a09736fc376e9aee' }, + expression: 'true', + }], +}); + +// Rate limiting +await client.rulesets.create({ + zone_id: process.env.ZONE_ID, + phase: 'http_ratelimit', + rules: [{ + action: 'block', + expression: 'http.request.uri.path starts_with "/api"', + action_parameters: { + ratelimit: { + characteristics: ['cf.colo.id', 'ip.src'], + period: 60, + requests_per_period: 100, + mitigation_timeout: 600, + }, + }, + }], +}); +``` + +## Terraform Configuration + +```hcl +provider "cloudflare" { + api_token = var.cloudflare_api_token +} + +resource "cloudflare_ruleset" "waf_custom" { + zone_id = var.zone_id + kind = "zone" + phase = "http_request_firewall_custom" + + rules { + action = "block" + expression = "cf.waf.score gt 50" + } +} +``` + +**Managed Ruleset & Rate Limiting**: +```hcl +resource "cloudflare_ruleset" "waf_managed" { + zone_id = var.zone_id + name = "Managed Ruleset" + kind = "zone" + phase = "http_request_firewall_managed" + + rules { + action = "execute" + action_parameters { + id = "efb7b8c949ac4650a09736fc376e9aee" + overrides { + rules { + id = "5de7edfa648c4d6891dc3e7f84534ffa" + action = "log" + } + } + } + expression = "true" + } +} + +resource "cloudflare_ruleset" "rate_limiting" { + zone_id = var.zone_id + phase = "http_ratelimit" + + rules { + action = "block" + expression = "http.request.uri.path starts_with \"/api\"" + ratelimit { + characteristics = ["cf.colo.id", "ip.src"] + period = 60 + requests_per_period = 100 + mitigation_timeout = 600 + } + } +} +``` + +## Pulumi Configuration + +```typescript +import * as cloudflare from '@pulumi/cloudflare'; + +const zoneId = 'zone_id'; + +// Custom rules +const wafCustom = new cloudflare.Ruleset('waf-custom', { + zoneId, + phase: 'http_request_firewall_custom', + rules: [ + { action: 'block', expression: 'cf.waf.score gt 
50', enabled: true }, + { action: 'challenge', expression: 'http.request.uri.path eq "/admin"', enabled: true }, + ], +}); + +// Managed ruleset +const wafManaged = new cloudflare.Ruleset('waf-managed', { + zoneId, + phase: 'http_request_firewall_managed', + rules: [{ + action: 'execute', + actionParameters: { id: 'efb7b8c949ac4650a09736fc376e9aee' }, + expression: 'true', + }], +}); + +// Rate limiting +const rateLimiting = new cloudflare.Ruleset('rate-limiting', { + zoneId, + phase: 'http_ratelimit', + rules: [{ + action: 'block', + expression: 'http.request.uri.path starts_with "/api"', + ratelimit: { + characteristics: ['cf.colo.id', 'ip.src'], + period: 60, + requestsPerPeriod: 100, + mitigationTimeout: 600, + }, + }], +}); +``` + +## Dashboard Configuration + +1. Navigate to: **Security** > **WAF** +2. Select tab: + - **Managed rules** - Deploy/configure managed rulesets + - **Custom rules** - Create custom rules + - **Rate limiting rules** - Configure rate limits +3. Click **Deploy** or **Create rule** + +**Testing**: Use Security Events to test expressions before deploying. + +## Wrangler Integration + +WAF configuration is zone-level (not Worker-specific). Configuration methods: +- Dashboard UI +- Cloudflare API via SDK +- Terraform/Pulumi (IaC) + +**Workers benefit from WAF automatically** - no Worker code changes needed. 
+
+**Example: Query WAF API from Worker**:
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    return fetch(`https://api.cloudflare.com/client/v4/zones/${env.ZONE_ID}/rulesets`, {
+      headers: { 'Authorization': `Bearer ${env.CF_API_TOKEN}` },
+    });
+  },
+};
+```
\ No newline at end of file
diff --git a/cloudflare/references/waf/gotchas.md b/cloudflare/references/waf/gotchas.md
new file mode 100644
index 0000000..05b0dc1
--- /dev/null
+++ b/cloudflare/references/waf/gotchas.md
@@ -0,0 +1,204 @@
+# Gotchas & Troubleshooting
+
+## Execution Order
+
+**Problem:** Rules execute in unexpected order
+**Cause:** Misunderstanding phase execution
+**Solution:**
+
+Phases execute sequentially (can't be changed):
+1. `http_request_firewall_custom` - Custom rules
+2. `http_request_firewall_managed` - Managed rulesets
+3. `http_ratelimit` - Rate limiting
+
+Within phase: top-to-bottom, first match wins (unless `skip`)
+
+```typescript
+// WRONG: Can't mix phase-specific actions
+await client.rulesets.create({
+  phase: 'http_request_firewall_custom',
+  rules: [
+    { action: 'block', expression: 'cf.waf.score gt 50' },
+    { action: 'execute', action_parameters: { id: 'managed_id' } }, // WRONG
+  ],
+});
+
+// CORRECT: Separate rulesets per phase
+await client.rulesets.create({ phase: 'http_request_firewall_custom', rules: [...] });
+await client.rulesets.create({ phase: 'http_request_firewall_managed', rules: [...] });
+```
+
+## Expression Errors
+
+**Problem:** Syntax errors prevent deployment
+**Cause:** Invalid field/operator/syntax
+**Solution:**
+
+```typescript
+// Common mistakes
+'http.request.path' → 'http.request.uri.path' // Correct field
+'ip.geoip.country eq US' → 'ip.geoip.country eq "US"' // Quote strings
+'http.user_agent eq "Mozilla"' → 'lower(http.user_agent) contains "mozilla"' // Case sensitivity
+'matches ".*[.jpg"' → 'matches ".*\\.jpg$"' // Valid regex
+```
+
+Test expressions in Security Events before deploying.
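One way to sidestep the quoting mistakes above is to build expressions from data instead of by hand. A small illustrative helper (not part of the Cloudflare SDK) that emits a correctly quoted country-blocking expression:

```typescript
// Illustrative helper (not part of the Cloudflare SDK): build an
// `ip.geoip.country in {...}` expression with every code quoted,
// avoiding the unquoted-string parse error shown above.
function countryBlockExpression(codes: string[]): string {
  if (codes.length === 0) {
    throw new Error("at least one country code is required");
  }
  const quoted = codes.map((c) => `"${c.toUpperCase()}"`).join(" ");
  return `ip.geoip.country in {${quoted}}`;
}

// countryBlockExpression(["cn", "ru"]) → 'ip.geoip.country in {"CN" "RU"}'
```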
+ +## Skip Rule Pitfalls + +**Problem:** Skip rules don't work as expected +**Cause:** Misunderstanding skip scope +**Solution:** + +Skip types: +- `ruleset: 'current'` - Skip remaining rules in current ruleset only +- `phases: ['phase_name']` - Skip entire phases + +```typescript +// WRONG: Trying to skip managed rules from custom phase +// In http_request_firewall_custom: +{ + action: 'skip', + action_parameters: { ruleset: 'current' }, + expression: 'ip.src in {192.0.2.0/24}', +} +// This only skips remaining custom rules, not managed rules + +// CORRECT: Skip specific phases +{ + action: 'skip', + action_parameters: { + phases: ['http_request_firewall_managed', 'http_ratelimit'], + }, + expression: 'ip.src in {192.0.2.0/24}', +} +``` + +## Update Replaces All Rules + +**Problem:** Updating ruleset deletes other rules +**Cause:** `update()` replaces entire rule list +**Solution:** + +```typescript +// WRONG: This deletes all existing rules! +await client.rulesets.update({ + zone_id: 'zone_id', + ruleset_id: 'ruleset_id', + rules: [{ action: 'block', expression: 'cf.waf.score gt 50' }], +}); + +// CORRECT: Get existing rules first +const ruleset = await client.rulesets.get({ zone_id: 'zone_id', ruleset_id: 'ruleset_id' }); +await client.rulesets.update({ + zone_id: 'zone_id', + ruleset_id: 'ruleset_id', + rules: [...ruleset.rules, { action: 'block', expression: 'cf.waf.score gt 50' }], +}); +``` + +## Override Conflicts + +**Problem:** Managed ruleset overrides don't apply +**Cause:** Rule ID doesn't exist or category name incorrect +**Solution:** + +```typescript +// List managed ruleset rules to find IDs +const ruleset = await client.rulesets.get({ + zone_id: 'zone_id', + ruleset_id: 'efb7b8c949ac4650a09736fc376e9aee', +}); +console.log(ruleset.rules.map(r => ({ id: r.id, description: r.description }))); + +// Use correct IDs in overrides +{ action: 'execute', action_parameters: { id: 'efb7b8c949ac4650a09736fc376e9aee', + overrides: { rules: [{ id: 
'5de7edfa648c4d6891dc3e7f84534ffa', action: 'log' }] } } } +``` + +## False Positives + +**Problem:** Legitimate traffic blocked +**Cause:** Aggressive rules/thresholds +**Solution:** + +1. Start with log mode: `overrides: { action: 'log' }` +2. Review Security Events to identify false positives +3. Override specific rules: `overrides: { rules: [{ id: 'rule_id', action: 'log' }] }` + +## Rate Limiting NAT Issues + +**Problem:** Users behind NAT hit rate limits too quickly +**Cause:** Multiple users sharing single IP +**Solution:** + +Add more characteristics: User-Agent, session cookie, or authorization header +```typescript +{ + action: 'block', + expression: 'http.request.uri.path starts_with "/api"', + action_parameters: { + ratelimit: { + characteristics: ['cf.colo.id', 'ip.src', 'http.request.cookies["session"][0]'], + period: 60, + requests_per_period: 100, + }, + }, +} +``` + +## Performance Issues + +**Problem:** Increased latency +**Cause:** Complex expressions, excessive rules +**Solution:** + +1. Skip static assets early: `action: 'skip'` for `\\.(jpg|css|js)$` +2. Path-based deployment: Only run managed on `/api` or `/admin` +3. Disable unused categories: `{ category: 'wordpress', enabled: false }` +4. 
Prefer string operators over regex: `starts_with` vs `matches` + +## Limits & Quotas + +| Resource | Free | Pro | Business | Enterprise | +|----------|------|-----|----------|------------| +| Custom rules | 5 | 20 | 100 | 1000 | +| Rate limiting rules | 1 | 10 | 25 | 100 | +| Rule expression length | 4096 chars | 4096 chars | 4096 chars | 4096 chars | +| Rules per ruleset | 75 | 75 | 400 | 1000 | +| Managed rulesets | Yes | Yes | Yes | Yes | +| Rate limit characteristics | 2 | 3 | 5 | 5 | + +**Important Notes:** +- Rules execute in order; first match wins (except skip rules) +- Expression evaluation stops at first `false` in AND chains +- `matches` regex operator is slower than string operators +- Rate limit counting happens before mitigation + +## API Errors + +**Problem:** API calls fail with cryptic errors +**Cause:** Invalid parameters or permissions +**Solution:** + +```typescript +// Error: "Invalid phase" → Use exact phase name +phase: 'http_request_firewall_custom' + +// Error: "Ruleset already exists" → Use update() or list first +const rulesets = await client.rulesets.list({ zone_id, phase: 'http_request_firewall_custom' }); +if (rulesets.result.length > 0) { + await client.rulesets.update({ zone_id, ruleset_id: rulesets.result[0].id, rules: [...] }); +} + +// Error: "Action not supported" → Check phase/action compatibility +// 'execute' only in http_request_firewall_managed +// Rate limit config only in http_ratelimit phase + +// Error: "Expression parse error" → Common fixes: +'ip.geoip.country eq "US"' // Quote strings +'cf.waf.score gt 40' // Use 'gt' not '>' +'http.request.uri.path' // Not 'http.request.path' +``` + +**Tip**: Test expressions in dashboard Security Events before deploying. 
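The "Ruleset already exists" fix above (list, then update or create) is common enough to wrap in a helper. A sketch against a minimal structural client type; the real `cloudflare` SDK client is assumed to satisfy it, and `deployRuleset` is an illustrative name, not an SDK method:

```typescript
// Minimal structural type covering only the two SDK calls used below;
// the real `cloudflare` SDK client is assumed to be compatible.
interface RulesetClient {
  rulesets: {
    list(params: { zone_id: string; phase: string }): Promise<{ result: { id: string }[] }>;
    create(params: object): Promise<{ id: string }>;
    update(params: object): Promise<{ id: string }>;
  };
}

// Hypothetical helper: update the phase's existing ruleset if one exists,
// otherwise create it, avoiding the "Ruleset already exists" error.
async function deployRuleset(
  client: RulesetClient,
  zoneId: string,
  phase: string,
  name: string,
  rules: object[]
): Promise<{ id: string }> {
  const existing = await client.rulesets.list({ zone_id: zoneId, phase });
  if (existing.result.length > 0) {
    // NOTE: update() replaces the ruleset's entire rule list.
    return client.rulesets.update({
      zone_id: zoneId,
      ruleset_id: existing.result[0].id,
      rules,
    });
  }
  return client.rulesets.create({ zone_id: zoneId, kind: "zone", phase, name, rules });
}
```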
diff --git a/cloudflare/references/waf/patterns.md b/cloudflare/references/waf/patterns.md new file mode 100644 index 0000000..1fe0004 --- /dev/null +++ b/cloudflare/references/waf/patterns.md @@ -0,0 +1,197 @@ +# Common Patterns + +## Deploy Managed Rulesets + +```typescript +// Deploy Cloudflare Managed Ruleset (default) +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_request_firewall_managed', + name: 'Cloudflare Managed Ruleset', + rules: [{ + action: 'execute', + action_parameters: { + id: 'efb7b8c949ac4650a09736fc376e9aee', // Cloudflare Managed + // Or: '4814384a9e5d4991b9815dcfc25d2f1f' for OWASP CRS + // Or: 'c2e184081120413c86c3ab7e14069605' for Exposed Credentials + }, + expression: 'true', // All requests + // Or: 'http.request.uri.path starts_with "/api"' for specific paths + enabled: true, + }], +}); +``` + +## Override Managed Ruleset + +```typescript +await client.rulesets.create({ + zone_id: 'zone_id', + phase: 'http_request_firewall_managed', + rules: [{ + action: 'execute', + action_parameters: { + id: 'efb7b8c949ac4650a09736fc376e9aee', + overrides: { + // Override specific rules + rules: [ + { id: '5de7edfa648c4d6891dc3e7f84534ffa', action: 'log' }, + { id: '75a0060762034b9dad4e883afc121b4c', enabled: false }, + ], + // Override categories: wordpress, sqli, xss, rce, etc. 
+ categories: [ + { category: 'wordpress', enabled: false }, + { category: 'sqli', action: 'log' }, + ], + }, + }, + expression: 'true', + }], +}); +``` + +## Custom Rules + +```typescript +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_request_firewall_custom', + name: 'Custom WAF Rules', + rules: [ + // Attack score-based + { action: 'block', expression: 'cf.waf.score gt 50', enabled: true }, + { action: 'challenge', expression: 'cf.waf.score gt 20', enabled: true }, + + // Specific attack types + { action: 'block', expression: 'cf.waf.score.sqli gt 30 or cf.waf.score.xss gt 30', enabled: true }, + + // Geographic blocking + { action: 'block', expression: 'ip.geoip.country in {"CN" "RU"}', enabled: true }, + ], +}); +``` + +## Rate Limiting + +```typescript +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_ratelimit', + name: 'Rate Limits', + rules: [ + // Per-IP global limit + { + action: 'block', + expression: 'true', + action_parameters: { + ratelimit: { + characteristics: ['cf.colo.id', 'ip.src'], + period: 60, + requests_per_period: 100, + mitigation_timeout: 600, + }, + }, + }, + + // Login endpoint (stricter) + { + action: 'block', + expression: 'http.request.uri.path eq "/api/login"', + action_parameters: { + ratelimit: { + characteristics: ['ip.src'], + period: 60, + requests_per_period: 5, + mitigation_timeout: 600, + }, + }, + }, + + // API writes only (using counting_expression) + { + action: 'block', + expression: 'http.request.uri.path starts_with "/api"', + action_parameters: { + ratelimit: { + characteristics: ['cf.colo.id', 'ip.src'], + period: 60, + requests_per_period: 50, + counting_expression: 'http.request.method ne "GET"', + }, + }, + }, + ], +}); +``` + +## Skip Rules + +```typescript +await client.rulesets.create({ + zone_id: 'zone_id', + kind: 'zone', + phase: 'http_request_firewall_custom', + name: 'Skip Rules', + rules: [ + // Skip static assets (current ruleset 
only) + { + action: 'skip', + action_parameters: { ruleset: 'current' }, + expression: 'http.request.uri.path matches "\\.(jpg|css|js|woff2?)$"', + }, + + // Skip all WAF phases for trusted IPs + { + action: 'skip', + action_parameters: { + phases: ['http_request_firewall_managed', 'http_ratelimit'], + }, + expression: 'ip.src in {192.0.2.0/24}', + }, + ], +}); +``` + +## Complete Setup Example + +Combine all three phases for comprehensive protection: + +```typescript +const client = new Cloudflare({ apiToken: process.env.CF_API_TOKEN }); +const zoneId = process.env.ZONE_ID; + +// 1. Custom rules (execute first) +await client.rulesets.create({ + zone_id: zoneId, + phase: 'http_request_firewall_custom', + rules: [ + { action: 'skip', action_parameters: { phases: ['http_request_firewall_managed', 'http_ratelimit'] }, expression: 'ip.src in {192.0.2.0/24}' }, + { action: 'block', expression: 'cf.waf.score gt 50' }, + { action: 'managed_challenge', expression: 'cf.waf.score gt 20' }, + ], +}); + +// 2. Managed ruleset (execute second) +await client.rulesets.create({ + zone_id: zoneId, + phase: 'http_request_firewall_managed', + rules: [{ + action: 'execute', + action_parameters: { id: 'efb7b8c949ac4650a09736fc376e9aee', overrides: { categories: [{ category: 'wordpress', enabled: false }] } }, + expression: 'true', + }], +}); + +// 3. 
Rate limiting (execute third) +await client.rulesets.create({ + zone_id: zoneId, + phase: 'http_ratelimit', + rules: [ + { action: 'block', expression: 'true', action_parameters: { ratelimit: { characteristics: ['cf.colo.id', 'ip.src'], period: 60, requests_per_period: 100, mitigation_timeout: 600 } } }, + { action: 'block', expression: 'http.request.uri.path eq "/api/login"', action_parameters: { ratelimit: { characteristics: ['ip.src'], period: 60, requests_per_period: 5, mitigation_timeout: 600 } } }, + ], +}); +``` \ No newline at end of file diff --git a/cloudflare/references/web-analytics/README.md b/cloudflare/references/web-analytics/README.md new file mode 100644 index 0000000..fcc1938 --- /dev/null +++ b/cloudflare/references/web-analytics/README.md @@ -0,0 +1,140 @@ +# Cloudflare Web Analytics + +Privacy-first web analytics providing Core Web Vitals, traffic metrics, and user insights without compromising visitor privacy. + +## Overview + +Cloudflare Web Analytics provides: +- **Core Web Vitals** - LCP, FID, CLS, INP, TTFB monitoring +- **Page views & visits** - Traffic patterns without cookies +- **Referrers & paths** - Traffic sources and popular pages +- **Device & browser data** - User agent breakdown +- **Geographic data** - Country-level visitor distribution +- **Privacy-first** - No cookies, fingerprinting, or PII collection +- **Free** - No cost, unlimited pageviews + +**Important:** Web Analytics is **dashboard-only**. No API exists for programmatic data access. + +## Quick Start Decision Tree + +``` +Is your site proxied through Cloudflare? +├─ YES → Use automatic injection (configuration.md) +│ ├─ Enable auto-injection in dashboard +│ └─ No code changes needed (unless Cache-Control: no-transform) +│ +└─ NO → Use manual beacon integration (integration.md) + ├─ Add JS snippet to HTML + ├─ Use spa: true for React/Vue/Next.js + └─ Configure CSP if needed +``` + +## Reading Order + +1. 
**[configuration.md](configuration.md)** - Setup for proxied vs non-proxied sites +2. **[integration.md](integration.md)** - Framework-specific beacon integration (React, Next.js, Vue, Nuxt, etc.) +3. **[patterns.md](patterns.md)** - Common use cases (performance monitoring, GDPR consent, multi-site tracking) +4. **[gotchas.md](gotchas.md)** - Troubleshooting (SPA tracking, CSP issues, hash routing limitations) + +## When to Use Each File + +- **Setting up for first time?** → Start with configuration.md +- **Using React/Next.js/Vue/Nuxt?** → Go to integration.md for framework code +- **Need GDPR consent loading?** → See patterns.md +- **Beacon not loading or no data?** → Check gotchas.md +- **SPA not tracking navigation?** → See integration.md for `spa: true` config + +## Key Concepts + +### Proxied vs Non-Proxied Sites + +| Type | Description | Beacon Injection | Limit | +|------|-------------|------------------|-------| +| **Proxied** | DNS through Cloudflare (orange cloud) | Automatic or manual | Unlimited | +| **Non-proxied** | External hosting, manual beacon | Manual only | 10 sites max | + +### SPA Mode + +**Critical for modern frameworks:** +```json +{"token": "YOUR_TOKEN", "spa": true} +``` + +Without `spa: true`, client-side navigation (React Router, Vue Router, Next.js routing) will NOT be tracked. Only initial page loads will register. 
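In manual-beacon installs, that config is passed through the script tag's `data-cf-beacon` attribute — a typical snippet (the token value is a placeholder; copy the real one from the dashboard):

```html
<!-- Cloudflare Web Analytics beacon with SPA tracking enabled -->
<script
  defer
  src="https://static.cloudflareinsights.com/beacon.min.js"
  data-cf-beacon='{"token": "YOUR_TOKEN", "spa": true}'
></script>
```

See [integration.md](integration.md) for where to place this in each framework.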
+ +### CSP Requirements + +If using Content Security Policy, allow both domains: +``` +script-src https://static.cloudflareinsights.com https://cloudflareinsights.com; +``` + +## Features + +### Core Web Vitals Debugging +- **LCP (Largest Contentful Paint)** - Identifies slow-loading hero images/elements +- **FID (First Input Delay)** - Interaction responsiveness (legacy metric) +- **INP (Interaction to Next Paint)** - Modern interaction responsiveness metric +- **CLS (Cumulative Layout Shift)** - Visual stability issues +- **TTFB (Time to First Byte)** - Server response performance + +Dashboard shows top 5 problematic elements with CSS selectors for debugging. + +### Traffic Filters +- **Bot filtering** - Exclude automated traffic from metrics +- **Date ranges** - Custom time period analysis +- **Geographic** - Country-level filtering +- **Device type** - Desktop, mobile, tablet breakdown +- **Browser/OS** - User agent filtering + +### Rules (Advanced - Plan-dependent) + +Create custom tracking rules for advanced configurations: + +**Sample Rate Rules:** +- Reduce data collection percentage for high-traffic sites +- Example: Track only 50% of visitors to reduce volume + +**Path-Based Rules:** +- Different behavior per route +- Example: Exclude `/admin/*` or `/internal/*` from tracking + +**Host-Based Rules:** +- Multi-domain configurations +- Example: Separate tracking for staging vs production subdomains + +**Availability:** Rules feature depends on your Cloudflare plan. Check dashboard under Web Analytics → Rules to see if available. Free plans may have limited or no access. 
+ +## Plan Limits + +| Feature | Free | Notes | +|---------|------|-------| +| Proxied sites | Unlimited | DNS through Cloudflare | +| Non-proxied sites | 10 | External hosting | +| Pageviews | Unlimited | No volume limits | +| Data retention | 6 months | Rolling window | +| Rules | Plan-dependent | Check dashboard | + +## Privacy & Compliance + +- **No cookies** - Zero client-side storage +- **No fingerprinting** - No tracking across sites +- **No PII** - IP addresses not stored +- **GDPR-friendly** - Minimal data collection +- **CCPA-compliant** - No personal data sale + +**EU opt-out:** Dashboard option to exclude EU visitor data entirely. + +## Limitations + +- **Dashboard-only** - No API for programmatic access +- **No real-time** - 5-10 minute data delay +- **No custom events** - Automatic pageview/navigation tracking only +- **History API only** - Hash-based routing (`#/path`) not supported +- **No session replay** - Metrics only, no user recordings +- **No form tracking** - Page navigation tracking only + +## See Also + +- [Cloudflare Web Analytics Docs](https://developers.cloudflare.com/analytics/web-analytics/) +- [Core Web Vitals Guide](https://web.dev/vitals/) diff --git a/cloudflare/references/web-analytics/configuration.md b/cloudflare/references/web-analytics/configuration.md new file mode 100644 index 0000000..ff8f18d --- /dev/null +++ b/cloudflare/references/web-analytics/configuration.md @@ -0,0 +1,76 @@ +# Configuration + +## Setup Methods + +### Proxied Sites (Automatic) + +Dashboard → Web Analytics → Add site → Select hostname → Done + +| Injection Option | Description | +|------------------|-------------| +| Enable | Auto-inject for all visitors (default) | +| Enable, excluding EU | No injection for EU (GDPR) | +| Enable with manual snippet | You add beacon manually | +| Disable | Pause tracking | + +**Fails if response has:** `Cache-Control: public, no-transform` + +**CSP required:** +``` +script-src https://static.cloudflareinsights.com 
https://cloudflareinsights.com; +``` + +### Non-Proxied Sites (Manual) + +Dashboard → Web Analytics → Add site → Enter hostname → Copy snippet + +```html + +``` + +**Limits:** 10 non-proxied sites per account + +## SPA Mode + +**Enable `spa: true` for:** React Router, Next.js, Vue Router, Nuxt, SvelteKit, Angular + +**Keep `spa: false` for:** Traditional multi-page apps, static sites, WordPress + +**Hash routing (`#/path`) NOT supported** - use History API routing. + +## Token Management + +- Found in: Dashboard → Web Analytics → Manage site +- **Not secrets** - domain-locked, safe to expose in HTML +- Each site gets unique token + +## Environment Config + +```typescript +// Only load in production +if (process.env.NODE_ENV === 'production') { + // Load beacon +} +``` + +Or use environment-specific tokens via env vars. + +## Verify Installation + +1. DevTools Network → filter `cloudflareinsights` → see `beacon.min.js` + data request +2. No CSP/CORS errors in console +3. Dashboard shows pageviews after 5-10 min delay + +## Rules (Plan-dependent) + +Configure in dashboard for: +- **Sample rate** - reduce collection % for high-traffic +- **Path-based** - different behavior per route +- **Host-based** - separate tracking per domain + +## Data Retention + +- 6 months rolling window +- 1-hour bucket granularity +- No raw export, dashboard only diff --git a/cloudflare/references/web-analytics/gotchas.md b/cloudflare/references/web-analytics/gotchas.md new file mode 100644 index 0000000..cad1424 --- /dev/null +++ b/cloudflare/references/web-analytics/gotchas.md @@ -0,0 +1,82 @@ +# Web Analytics Gotchas + +## Critical Issues + +### SPA Navigation Not Tracked + +**Symptom:** Only initial pageload counted +**Fix:** Add `spa: true`: +```html + +``` + +### CSP Blocking Beacon + +**Symptom:** Console error "Refused to load script" +**Fix:** Allow both domains: +``` +script-src 'self' https://static.cloudflareinsights.com https://cloudflareinsights.com; +``` + +### Hash-Based 
Routing Unsupported + +**Symptom:** `#/path` URLs not tracked +**Fix:** Migrate to History API (`BrowserRouter`, not `HashRouter`). No workaround for hash routing. + +### No Data Appearing + +**Causes & Fixes:** +1. **Delay** - Wait 5-15 minutes +2. **Wrong token** - Verify matches dashboard exactly +3. **Script blocked** - Check DevTools Network tab for beacon.min.js +4. **Domain mismatch** - Dashboard site must match actual URL + +### Auto-Injection Fails + +**Cause:** `Cache-Control: no-transform` header +**Fix:** Remove `no-transform` or install beacon manually + +### Duplicate Pageviews + +**Cause:** Multiple beacon scripts +**Fix:** Keep only one beacon per page + +## Configuration Issues + +| Issue | Fix | +|-------|-----| +| 10-site limit reached | Delete old sites or proxy through CF (unlimited) | +| Token not recognized | Use exact alphanumeric token from dashboard | + +## Framework-Specific + +### Next.js Hydration Warning + +```tsx + +``` + +Place before closing `` tag. + +## Framework Examples + +| Framework | Location | Notes | +|-----------|----------|-------| +| React/Vite | `public/index.html` | Add `spa: true` | +| Next.js App Router | `app/layout.tsx` | Use ` +``` + +Without `spa: true`: only initial pageload tracked. + +## Staging/Production Separation + +```typescript +// Use env-specific tokens +const token = process.env.NEXT_PUBLIC_CF_ANALYTICS_TOKEN; +// .env.production: production token +// .env.staging: staging token (or empty to disable) +``` + +## Bot Filtering + +Dashboard → Filters → "Exclude Bot Traffic" + +Filters: Search crawlers, monitoring services, known bots. +Not filtered: Headless browsers (Playwright/Puppeteer). + +## Ad-Blocker Impact + +~25-40% of users may block `cloudflareinsights.com`. No official workaround. +Dashboard shows minimum baseline; use server logs for complete picture. 
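The staging/production token pattern shown earlier can be wrapped in a tiny helper so environments without a token render no beacon at all (the helper name is illustrative, not part of any Cloudflare API):

```typescript
// Hypothetical helper: build the data-cf-beacon JSON only when a token is
// configured, so staging builds with an empty token disable tracking entirely.
function beaconConfig(token: string | undefined, spa = true): string | null {
  if (!token) return null; // empty/undefined token -> omit the script tag
  return JSON.stringify({ token, spa });
}
```

A layout component can then skip rendering the `<script>` tag whenever the helper returns `null`.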
+ +## Limitations + +- No UTM parameter tracking +- No webhooks/alerts/API +- No custom beacon domains +- Max 10 non-proxied sites diff --git a/cloudflare/references/workerd/README.md b/cloudflare/references/workerd/README.md new file mode 100644 index 0000000..11b2c6d --- /dev/null +++ b/cloudflare/references/workerd/README.md @@ -0,0 +1,78 @@ +# Workerd Runtime + +V8-based JS/Wasm runtime powering Cloudflare Workers. Use as app server, dev tool, or HTTP proxy. + +## ⚠️ IMPORTANT SECURITY NOTICE +**workerd is NOT a hardened sandbox.** Do not run untrusted code. It's designed for deploying YOUR code locally/self-hosted, not multi-tenant SaaS. Cloudflare production adds security layers not present in open-source workerd. + +## Decision Tree: When to Use What + +**95% of users:** Use Wrangler +- Local development: `wrangler dev` (uses workerd internally) +- Deployment: `wrangler deploy` (deploys to Cloudflare) +- Types: `wrangler types` (generates TypeScript types) + +**Use raw workerd directly only if:** +- Self-hosting Workers runtime in production +- Embedding runtime in C++ application +- Custom tooling/testing infrastructure +- Debugging workerd-specific behavior + +**Never use workerd for:** +- Running untrusted/user-submitted code +- Multi-tenant isolation (not hardened) +- Production without additional security layers + +## Key Features +- **Standards-based**: Fetch API, Web Crypto, Streams, WebSocket +- **Nanoservices**: Service bindings with local call performance +- **Capability security**: Explicit bindings prevent SSRF +- **Backwards compatible**: Version = max compat date supported + +## Architecture +``` +Config (workerd.capnp) +├── Services (workers/endpoints) +├── Sockets (HTTP/HTTPS listeners) +└── Extensions (global capabilities) +``` + +## Quick Start +```bash +workerd serve config.capnp +workerd compile config.capnp myConfig -o binary +workerd test config.capnp +``` + +## Platform Support & Beta Status + +| Platform | Status | Notes | 
+|----------|--------|-------| +| Linux (x64) | Stable | Primary platform | +| macOS (x64/ARM) | Stable | Full support | +| Windows | Beta | Use WSL2 for best results | +| Linux (ARM64) | Experimental | Limited testing | + +workerd is in **active development**. Breaking changes possible. Pin versions in production. + +## Core Concepts +- **Service**: Named endpoint (worker/network/disk/external) +- **Binding**: Capability-based resource access (KV/DO/R2/services) +- **Compatibility date**: Feature gate (always set!) +- **Modules**: ES modules (recommended) or service worker syntax + +## Reading Order (Progressive Disclosure) + +**Start here:** +1. This README (overview, decision tree) +2. [patterns.md](./patterns.md) - Common workflows, framework examples + +**When you need details:** +3. [configuration.md](./configuration.md) - Config format, services, bindings +4. [api.md](./api.md) - Runtime APIs, TypeScript types +5. [gotchas.md](./gotchas.md) - Common errors, debugging + +## Related References +- [workers](../workers/) - Workers runtime API documentation +- [miniflare](../miniflare/) - Testing tool built on workerd +- [wrangler](../wrangler/) - CLI that uses workerd for local dev diff --git a/cloudflare/references/workerd/api.md b/cloudflare/references/workerd/api.md new file mode 100644 index 0000000..085f507 --- /dev/null +++ b/cloudflare/references/workerd/api.md @@ -0,0 +1,185 @@ +# Workerd APIs + +## Worker Code (JS/TS) + +### ES Modules (Recommended) +```javascript +export default { + async fetch(request, env, ctx) { + const value = await env.KV.get("key"); // Bindings in env + const response = await env.API.fetch(request); // Service binding + ctx.waitUntil(logRequest(request)); // Background task + return new Response("OK"); + }, + async adminApi(request, env, ctx) { /* Named entrypoint */ }, + async queue(batch, env, ctx) { /* Queue consumer */ }, + async scheduled(event, env, ctx) { /* Cron handler */ } +}; +``` + +### TypeScript Types + +**Generate 
from wrangler.toml (Recommended):** +```bash +wrangler types # Output: worker-configuration.d.ts +``` + +**Manual types:** +```typescript +interface Env { + API: Fetcher; + CACHE: KVNamespace; + STORAGE: R2Bucket; + ROOMS: DurableObjectNamespace; + API_KEY: string; +} + +export default { + async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise { + return new Response(await env.CACHE.get("key")); + } +}; +``` + +**Setup:** +```bash +npm install -D @cloudflare/workers-types +``` + +```json +// tsconfig.json +{"compilerOptions": {"types": ["@cloudflare/workers-types"]}} +``` + +### Service Worker Syntax (Legacy) +```javascript +addEventListener('fetch', event => { + event.respondWith(handleRequest(event.request)); +}); + +async function handleRequest(request) { + const value = await KV.get("key"); // Bindings as globals + return new Response("OK"); +} +``` + +### Durable Objects +```javascript +export class Room { + constructor(state, env) { this.state = state; this.env = env; } + + async fetch(request) { + const url = new URL(request.url); + if (url.pathname === "/increment") { + const value = (await this.state.storage.get("counter")) || 0; + await this.state.storage.put("counter", value + 1); + return new Response(String(value + 1)); + } + return new Response("Not found", {status: 404}); + } +} +``` + +### RPC Between Services +```javascript +// Caller: env.AUTH.validateToken(token) returns structured data +const user = await env.AUTH.validateToken(request.headers.get("Authorization")); + +// Callee: export methods that return data +export default { + async validateToken(token) { return {id: 123, name: "Alice"}; } +}; +``` + +## Web Platform APIs + +### Fetch +- `fetch()`, `Request`, `Response`, `Headers` +- `AbortController`, `AbortSignal` + +### Streams +- `ReadableStream`, `WritableStream`, `TransformStream` +- Byte streams, BYOB readers + +### Web Crypto +- `crypto.subtle` (encrypt/decrypt/sign/verify) +- `crypto.randomUUID()`, 
`crypto.getRandomValues()` + +### Encoding +- `TextEncoder`, `TextDecoder` +- `atob()`, `btoa()` + +### Web Standards +- `URL`, `URLSearchParams` +- `Blob`, `File`, `FormData` +- `WebSocket` + +### Server-Sent Events (EventSource) +```javascript +// Server-side SSE +const { readable, writable } = new TransformStream(); +const writer = writable.getWriter(); +writer.write(new TextEncoder().encode('data: Hello\n\n')); +return new Response(readable, {headers: {'Content-Type': 'text/event-stream'}}); +``` + +### HTMLRewriter (HTML Parsing/Transformation) +```javascript +const response = await fetch('https://example.com'); +return new HTMLRewriter() + .on('a[href]', { + element(el) { + el.setAttribute('href', `/proxy?url=${encodeURIComponent(el.getAttribute('href'))}`); + } + }) + .on('script', { element(el) { el.remove(); } }) + .transform(response); +``` + +### TCP Sockets (Experimental) +```javascript +const socket = await connect({ hostname: 'example.com', port: 80 }); +const writer = socket.writable.getWriter(); +await writer.write(new TextEncoder().encode('GET / HTTP/1.1\r\n\r\n')); +const reader = socket.readable.getReader(); +const { value } = await reader.read(); +return new Response(value); +``` + +### Performance +- `performance.now()`, `performance.timeOrigin` +- `setTimeout()`, `setInterval()`, `queueMicrotask()` + +### Console +- `console.log()`, `console.error()`, `console.warn()` + +### Node.js Compat (`nodejs_compat` flag) +```javascript +import { Buffer } from 'node:buffer'; +import { randomBytes } from 'node:crypto'; + +const buf = Buffer.from('Hello'); +const random = randomBytes(16); +``` + +**Available:** `node:buffer`, `node:crypto`, `node:stream`, `node:util`, `node:events`, `node:assert`, `node:path`, `node:querystring`, `node:url` +**NOT available:** `node:fs`, `node:http`, `node:net`, `node:child_process` + +## CLI Commands + +```bash +workerd serve config.capnp [constantName] # Start server +workerd serve config.capnp --socket-addr http=*:3000 
--verbose +workerd compile config.capnp constantName -o binary # Compile to binary +workerd test config.capnp [--test-only=test.js] # Run tests +``` + +## Wrangler Integration + +Use Wrangler for development: +```bash +wrangler dev # Uses workerd internally +wrangler types # Generate TypeScript types from wrangler.toml +``` + +See [patterns.md](./patterns.md) for usage examples, [configuration.md](./configuration.md) for config details. diff --git a/cloudflare/references/workerd/configuration.md b/cloudflare/references/workerd/configuration.md new file mode 100644 index 0000000..bad5f43 --- /dev/null +++ b/cloudflare/references/workerd/configuration.md @@ -0,0 +1,183 @@ +# Workerd Configuration + +## Basic Structure +```capnp +using Workerd = import "/workerd/workerd.capnp"; + +const config :Workerd.Config = ( + services = [(name = "main", worker = .mainWorker)], + sockets = [(name = "http", address = "*:8080", http = (), service = "main")] +); + +const mainWorker :Workerd.Worker = ( + modules = [(name = "index.js", esModule = embed "src/index.js")], + compatibilityDate = "2024-01-15", + bindings = [...] +); +``` + +## Services +**Worker**: Run JS/Wasm code +```capnp +(name = "api", worker = ( + modules = [(name = "index.js", esModule = embed "index.js")], + compatibilityDate = "2024-01-15", + bindings = [...] 
+)) +``` + +**Network**: Internet access +```capnp +(name = "internet", network = (allow = ["public"], tlsOptions = (trustBrowserCas = true))) +``` + +**External**: Reverse proxy +```capnp +(name = "backend", external = (address = "api.com:443", http = (style = tls))) +``` + +**Disk**: Static files +```capnp +(name = "assets", disk = (path = "/var/www", writable = false)) +``` + +## Sockets +```capnp +(name = "http", address = "*:8080", http = (), service = "main") +(name = "https", address = "*:443", https = (options = (), tlsOptions = (keypair = (...))), service = "main") +(name = "app", address = "unix:/tmp/app.sock", http = (), service = "main") +``` + +## Worker Formats +```capnp +# ES Modules (recommended) +modules = [(name = "index.js", esModule = embed "src/index.js"), (name = "wasm.wasm", wasm = embed "build/module.wasm")] + +# Service Worker (legacy) +serviceWorkerScript = embed "worker.js" + +# CommonJS +(name = "legacy.js", commonJsModule = embed "legacy.js", namedExports = ["foo"]) +``` + +## Bindings +Bindings expose resources to workers. ES modules: `env.BINDING`, Service workers: globals. 
+ +### Primitive Types +```capnp +(name = "API_KEY", text = "secret") # String +(name = "CONFIG", json = '{"key":"val"}') # Parsed JSON +(name = "DATA", data = embed "data.bin") # ArrayBuffer +(name = "DATABASE_URL", fromEnvironment = "DB_URL") # System env var +``` + +### Service Binding +```capnp +(name = "AUTH", service = "auth-worker") # Basic +(name = "API", service = ( + name = "backend", + entrypoint = "adminApi", # Named export + props = (json = '{"role":"admin"}') # ctx.props +)) +``` + +### Storage +```capnp +(name = "CACHE", kvNamespace = "kv-service") # KV +(name = "STORAGE", r2Bucket = "r2-service") # R2 +(name = "ROOMS", durableObjectNamespace = ( + serviceName = "room-service", + className = "Room" +)) +(name = "FAST", memoryCache = ( + id = "cache-id", + limits = (maxKeys = 1000, maxValueSize = 1048576) +)) +``` + +### Other +```capnp +(name = "TASKS", queue = "queue-service") +(name = "ANALYTICS", analyticsEngine = "analytics") +(name = "LOADER", workerLoader = (id = "dynamic")) +(name = "KEY", cryptoKey = (format = raw, algorithm = (name = "HMAC", hash = "SHA-256"), keyData = embed "key.bin", usages = [sign, verify], extractable = false)) +(name = "TRACED", wrapped = (moduleName = "tracing", entrypoint = "makeTracer", innerBindings = [(name = "backend", service = "backend")])) +``` + +## Compatibility +```capnp +compatibilityDate = "2024-01-15" # Always set! +compatibilityFlags = ["nodejs_compat", "streams_enable_constructors"] +``` + +Version = max compat date. Update carefully after testing. 
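For example, with the `nodejs_compat` flag set, Node builtin imports become available to worker code — a sketch:

```javascript
// Works only when the workerd config sets
// compatibilityFlags = ["nodejs_compat"]; otherwise this import fails.
import { Buffer } from 'node:buffer';

const worker = {
  async fetch() {
    return new Response(Buffer.from('hello').toString('base64'));
  },
};
export default worker;
```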
+ +## Parameter Bindings (Inheritance) +```capnp +const base :Workerd.Worker = ( + modules = [...], compatibilityDate = "2024-01-15", + bindings = [(name = "API_URL", parameter = (type = text)), (name = "DB", parameter = (type = service))] +); + +const derived :Workerd.Worker = ( + inherit = "base-service", + bindings = [(name = "API_URL", text = "https://api.com"), (name = "DB", service = "postgres")] +); +``` + +## Durable Objects Config +```capnp +const worker :Workerd.Worker = ( + modules = [...], + compatibilityDate = "2024-01-15", + bindings = [(name = "ROOMS", durableObjectNamespace = "Room")], + durableObjectNamespaces = [(className = "Room", uniqueKey = "v1")], + durableObjectStorage = (localDisk = "/var/do") +); +``` + +## Remote Bindings (Development) + +Connect local workerd to production Cloudflare resources: + +```capnp +bindings = [ + # Remote KV (requires API token) + (name = "PROD_KV", kvNamespace = ( + remote = ( + accountId = "your-account-id", + namespaceId = "your-namespace-id", + apiToken = .envVar("CF_API_TOKEN") + ) + )), + + # Remote R2 + (name = "PROD_R2", r2Bucket = ( + remote = ( + accountId = "your-account-id", + bucketName = "my-bucket", + apiToken = .envVar("CF_API_TOKEN") + ) + )), + + # Remote Durable Object + (name = "PROD_DO", durableObjectNamespace = ( + remote = ( + accountId = "your-account-id", + scriptName = "my-worker", + className = "MyDO", + apiToken = .envVar("CF_API_TOKEN") + ) + )) +] +``` + +**Note:** Remote bindings require network access and valid Cloudflare API credentials. + +## Logging & Debugging +```capnp +logging = (structuredLogging = true, stdoutPrefix = "OUT: ", stderrPrefix = "ERR: ") +v8Flags = ["--expose-gc", "--max-old-space-size=2048"] # ⚠️ Unsupported in production +``` + +See [patterns.md](./patterns.md) for multi-service examples, [gotchas.md](./gotchas.md) for config errors. 
diff --git a/cloudflare/references/workerd/gotchas.md b/cloudflare/references/workerd/gotchas.md new file mode 100644 index 0000000..dc35109 --- /dev/null +++ b/cloudflare/references/workerd/gotchas.md @@ -0,0 +1,139 @@ +# Workerd Gotchas + +## Common Errors + +### "Missing compatibility date" +**Cause:** Compatibility date not set +**Solution:** +❌ Wrong: +```capnp +const worker :Workerd.Worker = ( + serviceWorkerScript = embed "worker.js" +) +``` + +✅ Correct: +```capnp +const worker :Workerd.Worker = ( + serviceWorkerScript = embed "worker.js", + compatibilityDate = "2024-01-15" # Always set! +) +``` + +### Wrong Binding Type +**Problem:** JSON not parsed +**Cause:** Using `text = '{"key":"value"}'` instead of `json` +**Solution:** Use `json = '{"key":"value"}'` for parsed objects + +### Service vs Namespace +**Problem:** Cannot create DO instance +**Cause:** Using `service = "room-service"` for Durable Object +**Solution:** Use `durableObjectNamespace = "Room"` for DO bindings + +### Module Name Mismatch +**Problem:** Import fails +**Cause:** Module name includes path: `name = "src/index.js"` +**Solution:** Use simple names: `name = "index.js"`, embed with path + +## Network Access + +**Problem:** Fetch fails with network error +**Cause:** No network service configured (workerd has no global fetch) +**Solution:** Add network service binding: +```capnp +services = [(name = "internet", network = (allow = ["public"]))] +bindings = [(name = "NET", service = "internet")] +``` + +Or external service: +```capnp +bindings = [(name = "API", service = (external = (address = "api.com:443", http = (style = tls))))] +``` + +### "Worker not responding" +**Cause:** Socket misconfigured, no fetch handler, or port unavailable +**Solution:** Verify socket `address` matches, worker exports `fetch()`, port available + +### "Binding not found" +**Cause:** Name mismatch or service doesn't exist +**Solution:** Check binding name in config matches code (`env.BINDING` for ES modules) + 
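When the mismatch isn't obvious, a throwaway diagnostic worker makes the actual binding names visible:

```javascript
// Temporary diagnostic worker: respond with the binding names that
// actually arrived in `env`, so config/code mismatches show up at a glance.
const worker = {
  async fetch(request, env) {
    return new Response(JSON.stringify(Object.keys(env).sort()));
  },
};
export default worker;
```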
+### "Module not found" +**Cause:** Module name doesn't match import or bad embed path +**Solution:** Module `name` must match import path exactly, verify `embed` path + +### "Compatibility error" +**Cause:** Date not set or API unavailable on that date +**Solution:** Set `compatibilityDate`, verify API available on that date + +## Performance Issues + +**Problem:** High memory usage +**Cause:** Large caches or many isolates +**Solution:** Set cache limits, reduce isolate count, or use V8 flags (caution) + +**Problem:** Slow startup +**Cause:** Many modules or complex config +**Solution:** Compile to binary (`workerd compile`), reduce imports + +**Problem:** Request timeouts +**Cause:** External service issues or DNS problems +**Solution:** Check connectivity, DNS resolution, TLS handshake + +## Build Issues + +**Problem:** Cap'n Proto syntax errors +**Cause:** Invalid config or missing schema +**Solution:** Install capnproto tools, validate: `capnp compile -I. config.capnp` + +**Problem:** Embed path not found +**Cause:** Path relative to config file +**Solution:** Use correct relative path or absolute path + +**Problem:** V8 flags cause crashes +**Cause:** Unsafe V8 flags +**Solution:** ⚠️ V8 flags unsupported in production. Test thoroughly before use. 
+ +## Security Issues + +**Problem:** Hardcoded secrets in config +**Cause:** `text` binding with secret value +**Solution:** Use `fromEnvironment` to load from env vars + +**Problem:** Overly broad network access +**Cause:** `network = (allow = ["*"])` +**Solution:** Restrict to `allow = ["public"]` or specific hosts + +**Problem:** Extractable crypto keys +**Cause:** `cryptoKey = (extractable = true, ...)` +**Solution:** Set `extractable = false` unless export required + +## Compatibility Changes + +**Problem:** Breaking changes after compat date update +**Cause:** New flags enabled between dates +**Solution:** Review [compat dates docs](https://developers.cloudflare.com/workers/configuration/compatibility-dates/), test locally first + +**Problem:** "Compatibility date not supported" +**Cause:** Workerd version older than compat date +**Solution:** Update workerd binary (version = max compat date supported) + +## Limits + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| V8 flags | Unsupported in production | Use with caution | +| Compatibility date | Must match workerd version | Update if mismatch | +| Module count | Affects startup time | Many imports slow | + +## Troubleshooting Steps + +1. **Enable verbose logging**: `workerd serve config.capnp --verbose` +2. **Check logs**: Look for error messages, stack traces +3. **Validate config**: `capnp compile -I. config.capnp` +4. **Test bindings**: Log `Object.keys(env)` to verify +5. **Check versions**: Workerd version vs compat date +6. **Isolate issue**: Minimal repro config +7. **Review schema**: [workerd.capnp](https://github.com/cloudflare/workerd/blob/main/src/workerd/server/workerd.capnp) + +See [configuration.md](./configuration.md) for config details, [patterns.md](./patterns.md) for working examples, [api.md](./api.md) for runtime APIs. 
diff --git a/cloudflare/references/workerd/patterns.md b/cloudflare/references/workerd/patterns.md new file mode 100644 index 0000000..5aaf092 --- /dev/null +++ b/cloudflare/references/workerd/patterns.md @@ -0,0 +1,192 @@ +# Workerd Patterns + +## Multi-Service Architecture +```capnp +const config :Workerd.Config = ( + services = [ + (name = "frontend", worker = ( + modules = [(name = "index.js", esModule = embed "frontend/index.js")], + compatibilityDate = "2024-01-15", + bindings = [(name = "API", service = "api")] + )), + (name = "api", worker = ( + modules = [(name = "index.js", esModule = embed "api/index.js")], + compatibilityDate = "2024-01-15", + bindings = [(name = "DB", service = "postgres"), (name = "CACHE", kvNamespace = "kv")] + )), + (name = "postgres", external = (address = "db.internal:5432", http = ())), + (name = "kv", disk = (path = "/var/kv", writable = true)) + ], + sockets = [(name = "http", address = "*:8080", http = (), service = "frontend")] +); +``` + +## Durable Objects +```capnp +const worker :Workerd.Worker = ( + modules = [(name = "index.js", esModule = embed "index.js"), (name = "room.js", esModule = embed "room.js")], + compatibilityDate = "2024-01-15", + bindings = [(name = "ROOMS", durableObjectNamespace = "Room")], + durableObjectNamespaces = [(className = "Room", uniqueKey = "v1")], + durableObjectStorage = (localDisk = "/var/do") +); +``` + +## Dev vs Prod Configs +```capnp +# Use parameter bindings for env-specific config +const baseWorker :Workerd.Worker = ( + modules = [(name = "index.js", esModule = embed "src/index.js")], + compatibilityDate = "2024-01-15", + bindings = [(name = "API_URL", parameter = (type = text))] +); + +const prodWorker :Workerd.Worker = ( + inherit = "base-service", + bindings = [(name = "API_URL", text = "https://api.prod.com")] +); +``` + +## HTTP Reverse Proxy +```capnp +services = [ + (name = "proxy", worker = (serviceWorkerScript = embed "proxy.js", compatibilityDate = "2024-01-15", bindings = 
[(name = "BACKEND", service = "backend")])), + (name = "backend", external = (address = "internal:8080", http = ())) +] +``` + +## Local Development + +**Recommended:** Use Wrangler +```bash +wrangler dev # Uses workerd internally +``` + +**Direct workerd:** +```bash +workerd serve config.capnp --socket-addr http=*:3000 --verbose +``` + +**Environment variables:** +```capnp +bindings = [(name = "DATABASE_URL", fromEnvironment = "DATABASE_URL")] +``` + +## Testing +```bash +workerd test config.capnp +workerd test config.capnp --test-only=test.js +``` + +Test files must be included in `modules = [...]` config. + +## Production Deployment + +### Compiled Binary (Recommended) +```bash +workerd compile config.capnp myConfig -o production-server +./production-server +``` + +### Docker +```dockerfile +FROM debian:bookworm-slim +RUN apt-get update && apt-get install -y ca-certificates +COPY workerd /usr/local/bin/ +COPY config.capnp /etc/workerd/ +COPY src/ /etc/workerd/src/ +EXPOSE 8080 +CMD ["workerd", "serve", "/etc/workerd/config.capnp"] +``` + +### Systemd +```ini +# /etc/systemd/system/workerd.service +[Service] +ExecStart=/usr/bin/workerd serve /etc/workerd/config.capnp --socket-fd http=3 +Restart=always +User=nobody +``` + +See systemd socket activation docs for complete setup. 
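The `--socket-fd http=3` flag expects systemd to hand workerd a listening socket (passed fds start at 3). A minimal companion `.socket` unit might look like this — the port is an example; consult the systemd socket-activation docs for the full setup:

```ini
# /etc/systemd/system/workerd.socket (hypothetical)
# systemd hands the listening socket to workerd as fd 3,
# matching --socket-fd http=3 in the service unit
[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target
```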
+ +## Framework Integration + +### Hono +```javascript +import { Hono } from 'hono'; + +const app = new Hono(); + +app.get('/', (c) => c.text('Hello Hono!')); +app.get('/api/:id', async (c) => { + const id = c.req.param('id'); + const data = await c.env.KV.get(id); + return c.json({ id, data }); +}); + +export default app; +``` + +### itty-router +```javascript +import { Router } from 'itty-router'; + +const router = Router(); + +router.get('/', () => new Response('Hello itty!')); +router.get('/api/:id', async (request, env) => { + const { id } = request.params; + const data = await env.KV.get(id); + return Response.json({ id, data }); +}); + +export default { + fetch: (request, env, ctx) => router.handle(request, env, ctx) +}; +``` + +## Best Practices + +1. **Use ES modules** over service worker syntax +2. **Explicit bindings** - no global namespace assumptions +3. **Type safety** - define `Env` interfaces (use `wrangler types`) +4. **Service isolation** - split concerns into multiple services +5. **Pin compat date** in production after testing +6. **Use ctx.waitUntil()** for background tasks +7. **Handle errors gracefully** with try/catch +8. **Configure resource limits** on caches/storage + +## Common Patterns + +### Error Handling +```javascript +export default { + async fetch(request, env, ctx) { + try { + return await handleRequest(request, env); + } catch (error) { + console.error("Request failed", error); + return new Response("Internal Error", {status: 500}); + } + } +}; +``` + +### Background Tasks +```javascript +export default { + async fetch(request, env, ctx) { + const response = new Response("OK"); + + // Fire-and-forget background work + ctx.waitUntil( + env.ANALYTICS.put(request.url, Date.now()) + ); + + return response; + } +}; +``` + +See [configuration.md](./configuration.md) for config syntax, [api.md](./api.md) for runtime APIs, [gotchas.md](./gotchas.md) for common errors. 
diff --git a/cloudflare/references/workers-ai/README.md b/cloudflare/references/workers-ai/README.md new file mode 100644 index 0000000..a8419d2 --- /dev/null +++ b/cloudflare/references/workers-ai/README.md @@ -0,0 +1,197 @@ +# Cloudflare Workers AI + +Expert guidance for Cloudflare Workers AI - serverless GPU-powered AI inference at the edge. + +## Overview + +Workers AI provides: +- 50+ pre-trained models (LLMs, embeddings, image generation, speech-to-text, translation) +- Native Workers binding (no external API calls) +- Pay-per-use pricing (neurons consumed per inference) +- OpenAI-compatible REST API +- Streaming support for text generation +- Function calling with compatible models + +**Architecture**: Inference runs on Cloudflare's GPU network. Models load on first request (cold start 1-3s), subsequent requests are faster. + +## Quick Start + +```typescript +interface Env { + AI: Ai; +} + +export default { + async fetch(request: Request, env: Env) { + const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { + messages: [{ role: 'user', content: 'What is Cloudflare?' 
}] + }); + return Response.json(response); + } +}; +``` + +```bash +# Setup - add binding to wrangler.jsonc +wrangler dev --remote # Must use --remote for AI +wrangler deploy +``` + +## Model Selection Decision Tree + +### Text Generation (Chat/Completion) + +**Quality Priority**: +- **Best quality**: `@cf/meta/llama-3.1-70b-instruct` (expensive, ~2000 neurons) +- **Balanced**: `@cf/meta/llama-3.1-8b-instruct` (good quality, ~200 neurons) +- **Fastest/cheapest**: `@cf/mistral/mistral-7b-instruct-v0.1` (~50 neurons) + +**Function Calling**: +- Use `@cf/meta/llama-3.1-8b-instruct` or `@cf/meta/llama-3.1-70b-instruct` (native tool support) + +**Code Generation**: +- Use `@cf/deepseek-ai/deepseek-coder-6.7b-instruct` (specialized for code) + +### Embeddings (Semantic Search/RAG) + +**English text**: +- **Best**: `@cf/baai/bge-large-en-v1.5` (1024 dims, highest quality) +- **Balanced**: `@cf/baai/bge-base-en-v1.5` (768 dims, good quality) +- **Fast**: `@cf/baai/bge-small-en-v1.5` (384 dims, lower quality but fast) + +**Multilingual**: +- Use `@hf/sentence-transformers/paraphrase-multilingual-minilm-l12-v2` + +### Image Generation + +- **Stable Diffusion**: `@cf/stabilityai/stable-diffusion-xl-base-1.0` (~10,000 neurons) +- **Portraits**: `@cf/lykon/dreamshaper-8-lcm` (optimized for faces) + +### Other Tasks + +- **Speech-to-text**: `@cf/openai/whisper` +- **Translation**: `@cf/meta/m2m100-1.2b` (100 languages) +- **Image classification**: `@cf/microsoft/resnet-50` + +## SDK Approach Decision Tree + +### Native Binding (Recommended) + +**When**: Building Workers/Pages with TypeScript +**Why**: Zero external dependencies, best performance, native types + +```typescript +await env.AI.run(model, input); +``` + +### REST API + +**When**: External services, non-Workers environments, testing +**Why**: Standard HTTP, works anywhere + +```bash +curl https://api.cloudflare.com/client/v4/accounts//ai/run/@cf/meta/llama-3.1-8b-instruct \ + -H "Authorization: Bearer " \ + -d 
'{"messages":[{"role":"user","content":"Hello"}]}'
```

### Vercel AI SDK Integration

**When**: Using Vercel AI SDK features (streaming UI, tool calling abstractions)
**Why**: Unified interface across providers

```typescript
import { createOpenAI } from '@ai-sdk/openai';

// Create a provider pointed at the Workers AI OpenAI-compatible endpoint
const workersAI = createOpenAI({
  baseURL: 'https://api.cloudflare.com/client/v4/accounts//ai/v1',
  headers: { Authorization: 'Bearer ' }
});
const model = workersAI('model-name');
```

## RAG vs Direct Generation

### Use RAG (Vectorize + Workers AI) When:
- Answering questions about specific documents/data
- Need factual accuracy from known corpus
- Context exceeds model's window (>4K tokens)
- Building knowledge base chat

### Use Direct Generation When:
- Creative writing, brainstorming
- General knowledge questions
- Small context fits in prompt (<4K tokens)
- Cost optimization (RAG adds embedding + vector search costs)

## Platform Limits

| Limit | Free Tier | Paid Plans |
|-------|-----------|------------|
| Neurons/day | 10,000 | Pay per use |
| Rate limit | Varies by model | Higher (contact support) |
| Context window | Model dependent (2K-8K) | Same |
| Streaming | ✅ Supported | ✅ Supported |
| Function calling | ✅ Supported (select models) | ✅ Supported |

**Pricing**: Free 10K neurons/day, then pay per neuron consumed (varies by model)

## Common Tasks

```typescript
// Streaming text generation
const stream = await env.AI.run(model, { messages, stream: true });
for await (const chunk of stream) {
  console.log(chunk.response);
}

// Embeddings for RAG
const { data } = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
  text: ['Query text', 'Document 1', 'Document 2']
});

// Function calling
const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [{ role: 'user', content: 'What is the weather?' }],
  tools: [{
    type: 'function',
    function: { name: 'getWeather', parameters: { ...
} } + }] +}); +``` + +## Development Workflow + +```bash +# Always use --remote for AI (local doesn't have models) +wrangler dev --remote + +# Deploy to production +wrangler deploy + +# View model catalog +# https://developers.cloudflare.com/workers-ai/models/ +``` + +## Reading Order + +**Start here**: Quick Start above → configuration.md (setup) + +**Common tasks**: +- First time setup: configuration.md → Add binding + deploy +- Choose model: Model Selection Decision Tree (above) → api.md +- Build RAG: patterns.md → Vectorize integration +- Optimize costs: Model Selection + gotchas.md (rate limits) +- Debugging: gotchas.md → Common errors + +## In This Reference + +- [configuration.md](./configuration.md) - wrangler.jsonc setup, TypeScript types, bindings, environment variables +- [api.md](./api.md) - env.AI.run(), streaming, function calling, REST API, response types +- [patterns.md](./patterns.md) - RAG with Vectorize, prompt engineering, batching, error handling, caching +- [gotchas.md](./gotchas.md) - Deprecated @cloudflare/ai package, rate limits, pricing, common errors + +## See Also + +- [vectorize](../vectorize/) - Vector database for RAG patterns +- [ai-gateway](../ai-gateway/) - Caching, rate limiting, analytics for AI requests +- [workers](../workers/) - Worker runtime and fetch handler patterns diff --git a/cloudflare/references/workers-ai/api.md b/cloudflare/references/workers-ai/api.md new file mode 100644 index 0000000..e65f97a --- /dev/null +++ b/cloudflare/references/workers-ai/api.md @@ -0,0 +1,112 @@ +# Workers AI API Reference + +## Core Method + +```typescript +const response = await env.AI.run(model, input); +``` + +## Text Generation + +```typescript +const result = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { + messages: [ + { role: 'system', content: 'You are helpful' }, + { role: 'user', content: 'Hello' } + ], + temperature: 0.7, // 0-1 + max_tokens: 100 +}); +console.log(result.response); +``` + +**Streaming:** +```typescript 
+const stream = await env.AI.run(model, { messages, stream: true }); +return new Response(stream, { headers: { 'Content-Type': 'text/event-stream' } }); +``` + +## Embeddings + +```typescript +const result = await env.AI.run('@cf/baai/bge-base-en-v1.5', { + text: ['Query', 'Doc 1', 'Doc 2'] // Batch for efficiency +}); +const [queryEmbed, doc1Embed, doc2Embed] = result.data; // 768-dim vectors +``` + +## Function Calling + +```typescript +const tools = [{ + type: 'function', + function: { + name: 'getWeather', + description: 'Get weather for location', + parameters: { + type: 'object', + properties: { location: { type: 'string' } }, + required: ['location'] + } + } +}]; + +const response = await env.AI.run(model, { messages, tools }); +if (response.tool_calls) { + const args = JSON.parse(response.tool_calls[0].function.arguments); + // Execute function, send result back +} +``` + +## Image Generation + +```typescript +const image = await env.AI.run('@cf/stabilityai/stable-diffusion-xl-base-1.0', { + prompt: 'Mountain sunset', + num_steps: 20, // 1-20 + guidance: 7.5 // 1-20 +}); +return new Response(image, { headers: { 'Content-Type': 'image/png' } }); +``` + +## Speech Recognition + +```typescript +const audioArray = Array.from(new Uint8Array(await request.arrayBuffer())); +const result = await env.AI.run('@cf/openai/whisper', { audio: audioArray }); +console.log(result.text); +``` + +## Translation + +```typescript +const result = await env.AI.run('@cf/meta/m2m100-1.2b', { + text: 'Hello', + source_lang: 'en', + target_lang: 'es' +}); +console.log(result.translated_text); +``` + +## REST API + +```bash +curl https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/@cf/meta/llama-3.1-8b-instruct \ + -H "Authorization: Bearer $TOKEN" \ + -d '{"messages":[{"role":"user","content":"Hello"}]}' +``` + +## Error Codes + +| Code | Meaning | Fix | +|------|---------|-----| +| 7502 | Model not found | Check spelling | +| 7504 | Validation failed | Verify input 
schema | +| 7505 | Rate limited | Reduce rate or upgrade | +| 7506 | Context exceeded | Reduce input size | + +## Performance Tips + +1. **Batch embeddings** - single request for multiple texts +2. **Stream long responses** - reduce perceived latency +3. **Accept cold starts** - first request ~1-3s, subsequent ~100-500ms diff --git a/cloudflare/references/workers-ai/configuration.md b/cloudflare/references/workers-ai/configuration.md new file mode 100644 index 0000000..f5563b3 --- /dev/null +++ b/cloudflare/references/workers-ai/configuration.md @@ -0,0 +1,97 @@ +# Workers AI Configuration + +## wrangler.jsonc + +```jsonc +{ + "name": "my-ai-worker", + "main": "src/index.ts", + "compatibility_date": "2024-01-01", + "ai": { + "binding": "AI" + } +} +``` + +## TypeScript + +```bash +npm install --save-dev @cloudflare/workers-types +``` + +```typescript +interface Env { + AI: Ai; +} + +export default { + async fetch(request: Request, env: Env) { + const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { + messages: [{ role: 'user', content: 'Hello' }] + }); + return Response.json(response); + } +}; +``` + +## Local Development + +```bash +wrangler dev --remote # Required for AI - no local inference +``` + +## REST API + +```typescript +const response = await fetch( + `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/@cf/meta/llama-3.1-8b-instruct`, + { + method: 'POST', + headers: { 'Authorization': `Bearer ${API_TOKEN}` }, + body: JSON.stringify({ messages: [{ role: 'user', content: 'Hello' }] }) + } +); +``` + +Create API token at: dash.cloudflare.com/profile/api-tokens (Workers AI - Read permission) + +## SDK Compatibility + +**OpenAI SDK:** +```typescript +import OpenAI from 'openai'; +const client = new OpenAI({ + apiKey: env.CLOUDFLARE_API_TOKEN, + baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.ACCOUNT_ID}/ai/v1` +}); +``` + +## Multi-Model Setup + +```typescript +const MODELS = { + chat: 
'@cf/meta/llama-3.1-8b-instruct', + embed: '@cf/baai/bge-base-en-v1.5', + image: '@cf/stabilityai/stable-diffusion-xl-base-1.0' +}; +``` + +## RAG Setup (with Vectorize) + +```jsonc +{ + "ai": { "binding": "AI" }, + "vectorize": { + "bindings": [{ "binding": "VECTORIZE", "index_name": "embeddings-index" }] + } +} +``` + +## Troubleshooting + +| Error | Fix | +|-------|-----| +| `env.AI is undefined` | Check `ai` binding in wrangler.jsonc | +| Local AI doesn't work | Use `wrangler dev --remote` | +| Type 'Ai' not found | Install `@cloudflare/workers-types` | +| @cloudflare/ai package error | Don't install - use native binding | diff --git a/cloudflare/references/workers-ai/gotchas.md b/cloudflare/references/workers-ai/gotchas.md new file mode 100644 index 0000000..c69255f --- /dev/null +++ b/cloudflare/references/workers-ai/gotchas.md @@ -0,0 +1,114 @@ +# Workers AI Gotchas + +## Critical: @cloudflare/ai is DEPRECATED + +```typescript +// ❌ WRONG - Don't install @cloudflare/ai +import Ai from '@cloudflare/ai'; + +// ✅ CORRECT - Use native binding +export default { + async fetch(request: Request, env: Env) { + await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { messages: [...] 
}); + } +} +``` + +## Development + +### "AI inference doesn't work locally" +```bash +# ❌ Local AI doesn't work +wrangler dev +# ✅ Use remote +wrangler dev --remote +``` + +### "env.AI is undefined" +Add binding to wrangler.jsonc: +```jsonc +{ "ai": { "binding": "AI" } } +``` + +## API Responses + +### Embedding response shape varies +```typescript +// @cf/baai/bge-base-en-v1.5 returns: { data: [[0.1, 0.2, ...]] } +const embedding = response.data[0]; // Get first element +``` + +### Stream returns ReadableStream +```typescript +const stream = await env.AI.run(model, { messages: [...], stream: true }); +for await (const chunk of stream) { console.log(chunk.response); } +``` + +## Rate Limits & Pricing + +| Model Type | Neurons/Request | +|------------|-----------------| +| Small text (7B) | ~50-200 | +| Large text (70B) | ~500-2000 | +| Embeddings | ~5-20 | +| Image gen | ~10,000+ | + +**Free tier**: 10,000 neurons/day + +```typescript +// ❌ EXPENSIVE - 70B model +await env.AI.run('@cf/meta/llama-3.1-70b-instruct', ...); +// ✅ CHEAPER - Use smallest that works +await env.AI.run('@cf/meta/llama-3.1-8b-instruct', ...); +``` + +## Model-Specific + +### Function calling +Only `@cf/meta/llama-3.1-*` and `mistral-7b-instruct-v0.2` support tools. + +### Empty response +Check context limits (2K-8K tokens). Validate input structure. + +### Inconsistent responses +Set `temperature: 0` for deterministic outputs. + +### Cold start latency +First request: 1-3s. Use AI Gateway caching for frequent prompts. 
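### Guarding context limits (sketch)

The empty-response and context-limit items above can be guarded against before calling `env.AI.run` by trimming the oldest turns until the estimated prompt size fits. A rough sketch; the 4-characters-per-token heuristic and the token budget are assumptions for illustration, not official Workers AI values:

```javascript
const CHARS_PER_TOKEN = 4; // coarse heuristic, not an official tokenizer

// Estimate token usage of a chat messages array.
function estimateTokens(messages) {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Drop the oldest non-system turns until the estimate fits the window.
function trimToContext(messages, maxTokens) {
  const out = [...messages];
  while (out.length > 2 && estimateTokens(out) > maxTokens) {
    out.splice(1, 1); // keep system prompt at index 0, drop oldest turn
  }
  return out;
}
```

Calling `trimToContext(messages, budget)` before `env.AI.run(model, { messages })` lets long conversations degrade gracefully instead of failing with error 7506 or returning empty output.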
## TypeScript

```typescript
interface Env {
  AI: Ai; // From @cloudflare/workers-types
}

interface TextGenerationResponse { response: string; }
interface EmbeddingResponse { data: number[][]; shape: number[]; }
```

## Common Errors

### 7502: Model not found
Check exact model name at developers.cloudflare.com/workers-ai/models/

### 7504: Input validation failed
```typescript
// Text gen requires messages array
await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [{ role: 'user', content: 'Hello' }] // ✅
});

// Embeddings require text
await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: 'Hello' }); // ✅
```

## Vercel AI SDK Integration

```typescript
import { createOpenAI } from '@ai-sdk/openai';

// Point the OpenAI-compatible provider at Workers AI, then request a
// Workers AI model by its @cf/... name (not an OpenAI model name)
const workersAI = createOpenAI({
  baseURL: 'https://api.cloudflare.com/client/v4/accounts//ai/v1',
  headers: { Authorization: 'Bearer ' }
});
const model = workersAI('@cf/meta/llama-3.1-8b-instruct');
```
diff --git a/cloudflare/references/workers-ai/patterns.md b/cloudflare/references/workers-ai/patterns.md
new file mode 100644
index 0000000..8295a5b
--- /dev/null
+++ b/cloudflare/references/workers-ai/patterns.md
@@ -0,0 +1,120 @@
# Workers AI Patterns

## RAG (Retrieval-Augmented Generation)

```typescript
// 1. Embed query
const embedding = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: query });

// 2. Search vectors
const results = await env.VECTORIZE.query(embedding.data[0], {
  topK: 5, returnMetadata: true
});

// 3. Build context
const context = results.matches.map(m => m.metadata?.text).join('\n\n');

// 4.
Generate with context +const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { + messages: [ + { role: 'system', content: `Answer based on:\n\n${context}` }, + { role: 'user', content: query } + ] +}); +``` + +## Streaming (SSE) + +```typescript +const stream = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { + messages, stream: true +}); + +const { readable, writable } = new TransformStream(); +const writer = writable.getWriter(); + +(async () => { + for await (const chunk of stream) { + await writer.write(new TextEncoder().encode(`data: ${JSON.stringify(chunk)}\n\n`)); + } + await writer.write(new TextEncoder().encode('data: [DONE]\n\n')); + await writer.close(); +})(); + +return new Response(readable, { + headers: { 'Content-Type': 'text/event-stream' } +}); +``` + +## Error Handling & Retry + +```typescript +async function runWithRetry(env, model, input, maxRetries = 3) { + for (let attempt = 0; attempt < maxRetries; attempt++) { + try { + return await env.AI.run(model, input); + } catch (error) { + if (error.message?.includes('7505') && attempt < maxRetries - 1) { + await new Promise(r => setTimeout(r, Math.pow(2, attempt) * 1000)); + continue; + } + throw error; + } + } +} +``` + +## Model Fallback + +```typescript +try { + return await env.AI.run('@cf/meta/llama-3.1-70b-instruct', { messages }); +} catch { + return await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { messages }); +} +``` + +## Prompt Patterns + +```typescript +// System prompts +const PROMPTS = { + json: 'Respond with valid JSON only.', + concise: 'Keep responses brief.', + cot: 'Think step by step before answering.' 
+}; + +// Few-shot +messages: [ + { role: 'system', content: 'Extract as JSON' }, + { role: 'user', content: 'John bought 3 apples for $5' }, + { role: 'assistant', content: '{"name":"John","item":"apples","qty":3}' }, + { role: 'user', content: actualInput } +] +``` + +## Parallel Execution + +```typescript +const [sentiment, summary, embedding] = await Promise.all([ + env.AI.run('@cf/mistral/mistral-7b-instruct-v0.1', { messages: sentimentPrompt }), + env.AI.run('@cf/meta/llama-3.1-8b-instruct', { messages: summaryPrompt }), + env.AI.run('@cf/baai/bge-base-en-v1.5', { text }) +]); +``` + +## Cost Optimization + +| Task | Model | Neurons | +|------|-------|---------| +| Classify | `@cf/mistral/mistral-7b-instruct-v0.1` | ~50 | +| Chat | `@cf/meta/llama-3.1-8b-instruct` | ~200 | +| Complex | `@cf/meta/llama-3.1-70b-instruct` | ~2000 | +| Embed | `@cf/baai/bge-base-en-v1.5` | ~10 | + +```typescript +// Batch embeddings +const response = await env.AI.run('@cf/baai/bge-base-en-v1.5', { + text: textsArray // Process multiple at once +}); +``` diff --git a/cloudflare/references/workers-for-platforms/README.md b/cloudflare/references/workers-for-platforms/README.md new file mode 100644 index 0000000..7bfd9b7 --- /dev/null +++ b/cloudflare/references/workers-for-platforms/README.md @@ -0,0 +1,89 @@ +# Cloudflare Workers for Platforms + +Multi-tenant platform with isolated customer code execution at scale. + +## Use Cases + +- Multi-tenant SaaS running customer code +- AI-generated code execution in secure sandboxes +- Programmable platforms with isolated compute +- Edge functions/serverless platforms +- Website builders with static + dynamic content +- Unlimited app deployment at scale + +**NOT for general Workers** - only for Workers for Platforms architecture. 
+ +## Quick Start + +**One-click deploy:** [Platform Starter Kit](https://github.com/cloudflare/workers-for-platforms-example) deploys complete WfP setup with dispatch namespace, dispatch worker, and user worker example. + +[![Deploy to Cloudflare](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/workers-for-platforms-example) + +**Manual setup:** See [configuration.md](./configuration.md) for namespace creation and dispatch worker configuration. + +## Key Features + +- Unlimited Workers per namespace (no script limits) +- Automatic tenant isolation +- Custom CPU/subrequest limits per customer +- Hostname routing (subdomains/vanity domains) +- Egress/ingress control +- Static assets support +- Tags for bulk operations + +## Architecture + +**4 Components:** +1. **Dispatch Namespace** - Container for unlimited customer Workers, automatic isolation (untrusted mode by default - no request.cf access, no shared cache) +2. **Dynamic Dispatch Worker** - Entry point, routes requests, enforces platform logic (auth, limits, validation) +3. **User Workers** - Customer code in isolated sandboxes, API-deployed, optional bindings (KV/D1/R2/DO) +4. **Outbound Worker** (optional) - Intercepts external fetch, controls egress, logs subrequests (blocks TCP socket connect() API) + +**Request Flow:** +``` +Request → Dispatch Worker → Determines user Worker → env.DISPATCHER.get("customer") +→ User Worker executes (Outbound Worker for external fetch) → Response → Dispatch Worker → Client +``` + +## Decision Trees + +### When to Use Workers for Platforms +``` +Need to run code? +├─ Your code only → Regular Workers +├─ Customer/AI code → Workers for Platforms +└─ Untrusted code in sandbox → Workers for Platforms OR Sandbox API +``` + +### Routing Strategy Selection +``` +Hostname routing needed? 
+├─ Subdomains only (*.saas.com) → `*.saas.com/*` route + subdomain extraction +├─ Custom domains → `*/*` wildcard + Cloudflare for SaaS + KV/metadata routing +└─ Path-based (/customer/app) → Any route + path parsing +``` + +### Isolation Mode Selection +``` +Worker mode? +├─ Running customer code → Untrusted (default) +├─ Need request.cf geolocation → Trusted mode +├─ Internal platform, controlled code → Trusted mode with cache key prefixes +└─ Maximum isolation → Untrusted + unique resources per customer +``` + +## In This Reference + +| File | Purpose | When to Read | +|------|---------|--------------| +| [configuration.md](./configuration.md) | Namespace setup, dispatch worker config | First-time setup, changing limits | +| [api.md](./api.md) | User worker API, dispatch API, outbound worker | Deploying workers, SDK integration | +| [patterns.md](./patterns.md) | Multi-tenancy, routing, egress control | Planning architecture, scaling | +| [gotchas.md](./gotchas.md) | Limits, isolation issues, best practices | Debugging, production prep | + +## See Also +- [workers](../workers/) - Core Workers runtime documentation +- [durable-objects](../durable-objects/) - Stateful multi-tenant patterns +- [sandbox](../sandbox/) - Alternative for untrusted code execution +- [Reference Architecture: Programmable Platforms](https://developers.cloudflare.com/reference-architecture/diagrams/serverless/programmable-platforms/) +- [Reference Architecture: AI Vibe Coding Platform](https://developers.cloudflare.com/reference-architecture/diagrams/ai/ai-vibe-coding-platform/) diff --git a/cloudflare/references/workers-for-platforms/api.md b/cloudflare/references/workers-for-platforms/api.md new file mode 100644 index 0000000..663c608 --- /dev/null +++ b/cloudflare/references/workers-for-platforms/api.md @@ -0,0 +1,196 @@ +# API Operations + +## Deploy User Worker + +```bash +curl -X PUT \ + 
"https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/dispatch/namespaces/$NAMESPACE/scripts/$SCRIPT_NAME" \ + -H "Authorization: Bearer $API_TOKEN" \ + -F 'metadata={"main_module": "worker.mjs"};type=application/json' \ + -F 'worker.mjs=@worker.mjs;type=application/javascript+module' +``` + +### TypeScript SDK +```typescript +import Cloudflare from "cloudflare"; + +const client = new Cloudflare({ apiToken: process.env.API_TOKEN }); + +const scriptFile = new File([scriptContent], `${scriptName}.mjs`, { + type: "application/javascript+module", +}); + +await client.workersForPlatforms.dispatch.namespaces.scripts.update( + namespace, scriptName, + { + account_id: accountId, + metadata: { main_module: `${scriptName}.mjs` }, + files: [scriptFile], + } +); +``` + +## TypeScript Types + +```typescript +import type { DispatchNamespace } from '@cloudflare/workers-types'; + +interface DispatchNamespace { + get(name: string, options?: Record, dispatchOptions?: DynamicDispatchOptions): Fetcher; +} + +interface DynamicDispatchOptions { + limits?: DynamicDispatchLimits; + outbound?: Record; +} + +interface DynamicDispatchLimits { + cpuMs?: number; // Max CPU milliseconds + subRequests?: number; // Max fetch() calls +} + +// Usage +const userWorker = env.DISPATCHER.get('customer-123', {}, { + limits: { cpuMs: 50, subRequests: 20 }, + outbound: { customerId: '123', url: request.url } +}); +``` + +## Deploy with Bindings +```bash +curl -X PUT ".../scripts/$SCRIPT_NAME" \ + -F 'metadata={ + "main_module": "worker.mjs", + "bindings": [ + {"type": "kv_namespace", "name": "MY_KV", "namespace_id": "'$KV_ID'"} + ], + "tags": ["customer-123", "production"], + "compatibility_date": "2026-01-01" // Use current date for new projects + };type=application/json' \ + -F 'worker.mjs=@worker.mjs;type=application/javascript+module' +``` + +## List/Delete Workers + +```bash +# List +curl 
"https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/dispatch/namespaces/$NAMESPACE/scripts" \ + -H "Authorization: Bearer $API_TOKEN" + +# Delete by name +curl -X DELETE ".../scripts/$SCRIPT_NAME" -H "Authorization: Bearer $API_TOKEN" + +# Delete by tag +curl -X DELETE ".../scripts?tags=customer-123%3Ayes" -H "Authorization: Bearer $API_TOKEN" +``` + +**Pagination:** SDK supports async iteration. Manual: add `?per_page=100&page=1` query params. + +## Static Assets + +**3-step process:** Create session → Upload files → Deploy Worker + +### 1. Create Upload Session +```bash +curl -X POST ".../scripts/$SCRIPT_NAME/assets-upload-session" \ + -H "Authorization: Bearer $API_TOKEN" \ + -d '{ + "manifest": { + "/index.html": {"hash": "08f1dfda4574284ab3c21666d1ee8c7d4", "size": 1234} + } + }' +# Returns: jwt, buckets +``` + +**Hash:** SHA-256 truncated to first 16 bytes (32 hex characters) + +### 2. Upload Files +```bash +curl -X POST ".../workers/assets/upload?base64=true" \ + -H "Authorization: Bearer $UPLOAD_JWT" \ + -F '08f1dfda4574284ab3c21666d1ee8c7d4=' +# Returns: completion jwt +``` + +**Multiple buckets:** Upload to all returned bucket URLs (typically 2 for redundancy) using same JWT and hash. + +### 3. Deploy with Assets +```bash +curl -X PUT ".../scripts/$SCRIPT_NAME" \ + -F 'metadata={ + "main_module": "index.js", + "assets": {"jwt": ""}, + "bindings": [{"type": "assets", "name": "ASSETS"}] + };type=application/json' \ + -F 'index.js=export default {...};type=application/javascript+module' +``` + +**Asset Isolation:** Assets shared across namespace by default. 
For customer isolation, salt hash: `sha256(customerId + fileContents).slice(0, 32)` + +## Dispatch Workers + +### Subdomain Routing +```typescript +export default { + async fetch(request: Request, env: Env): Promise { + const userWorkerName = new URL(request.url).hostname.split(".")[0]; + const userWorker = env.DISPATCHER.get(userWorkerName); + return await userWorker.fetch(request); + }, +}; +``` + +### Path Routing +```typescript +const pathParts = new URL(request.url).pathname.split("/").filter(Boolean); +const userWorker = env.DISPATCHER.get(pathParts[0]); +return await userWorker.fetch(request); +``` + +### KV Routing +```typescript +const hostname = new URL(request.url).hostname; +const userWorkerName = await env.ROUTING_KV.get(hostname); +const userWorker = env.DISPATCHER.get(userWorkerName); +return await userWorker.fetch(request); +``` + +## Outbound Workers + +Control external fetch from user Workers: + +### Configure +```typescript +const userWorker = env.DISPATCHER.get( + workerName, {}, + { outbound: { customer_context: { customer_name: workerName, url: request.url } } } +); +``` + +### Implement +```typescript +export default { + async fetch(request: Request, env: Env): Promise { + const customerName = env.customer_name; + const url = new URL(request.url); + + // Block domains + if (["malicious.com"].some(d => url.hostname.includes(d))) { + return new Response("Blocked", { status: 403 }); + } + + // Inject auth + if (url.hostname === "api.example.com") { + const headers = new Headers(request.headers); + headers.set("Authorization", `Bearer ${generateJWT(customerName)}`); + return fetch(new Request(request, { headers })); + } + + return fetch(request); + }, +}; +``` + +**Note:** Doesn't intercept DO/mTLS fetch. 
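### Allowlist variant (sketch)

The outbound worker above blocks a denylist of known-bad hosts; many platforms prefer default-deny egress with a per-customer allowlist. A sketch, where the allowlist shape and customer names are illustrative assumptions:

```javascript
// Per-customer set of hostnames the platform has approved for egress.
const EGRESS_ALLOWLIST = {
  "customer-123": new Set(["api.example.com", "cdn.example.com"]),
};

// Default-deny: unknown customers and unlisted hosts are refused.
function egressAllowed(customerName, targetUrl) {
  const allowed = EGRESS_ALLOWLIST[customerName];
  if (!allowed) return false;
  return allowed.has(new URL(targetUrl).hostname);
}
```

Inside the outbound worker's `fetch`, return a 403 `Response` when `egressAllowed(env.customer_name, request.url)` is false, and fall through to `fetch(request)` otherwise.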
+ +See [README.md](./README.md), [configuration.md](./configuration.md), [patterns.md](./patterns.md), [gotchas.md](./gotchas.md) diff --git a/cloudflare/references/workers-for-platforms/configuration.md b/cloudflare/references/workers-for-platforms/configuration.md new file mode 100644 index 0000000..b434999 --- /dev/null +++ b/cloudflare/references/workers-for-platforms/configuration.md @@ -0,0 +1,167 @@ +# Configuration + +## Dispatch Namespace Binding + +### wrangler.jsonc +```jsonc +{ + "$schema": "./node_modules/wrangler/config-schema.json", + "dispatch_namespaces": [{ + "binding": "DISPATCHER", + "namespace": "production" + }] +} +``` + +## Worker Isolation Mode + +Workers in a namespace run in **untrusted mode** by default for security: +- No access to `request.cf` object +- Isolated cache per Worker (no shared cache) +- `caches.default` disabled + +### Enable Trusted Mode + +For internal platforms where you control all code: + +```bash +curl -X PUT \ + "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/dispatch/namespaces/$NAMESPACE" \ + -H "Authorization: Bearer $API_TOKEN" \ + -d '{"name": "'$NAMESPACE'", "trusted_workers": true}' +``` + +**Caveats:** +- Workers share cache within namespace (use cache key prefixes: `customer-${id}:${key}`) +- `request.cf` object accessible +- Redeploy existing Workers after enabling trusted mode + +**When to use:** Internal platforms, A/B testing platforms, need geolocation data + + +### With Outbound Worker +```jsonc +{ + "dispatch_namespaces": [{ + "binding": "DISPATCHER", + "namespace": "production", + "outbound": { + "service": "outbound-worker", + "parameters": ["customer_context"] + } + }] +} +``` + +## Wrangler Commands + +```bash +wrangler dispatch-namespace list +wrangler dispatch-namespace get production +wrangler dispatch-namespace create production +wrangler dispatch-namespace delete staging +wrangler dispatch-namespace rename old new +``` + +## Custom Limits + +Set CPU time and subrequest 
limits per invocation: + +```typescript +const userWorker = env.DISPATCHER.get( + workerName, + {}, + { + limits: { + cpuMs: 10, // Max CPU ms + subRequests: 5 // Max fetch() calls + } + } +); +``` + +Handle limit violations: +```typescript +try { + return await userWorker.fetch(request); +} catch (e) { + if (e.message.includes("CPU time limit")) { + return new Response("CPU limit exceeded", { status: 429 }); + } + throw e; +} +``` + +## Static Assets + +Deploy HTML/CSS/images with Workers. See [api.md](./api.md#static-assets) for upload process. + +### Wrangler +```jsonc +{ + "name": "customer-site", + "main": "./src/index.js", + "assets": { + "directory": "./public", + "binding": "ASSETS" + } +} +``` + +```bash +npx wrangler deploy --name customer-site --dispatch-namespace production +``` + +### Dashboard Deployment + +Alternative to CLI: + +1. Upload Worker file in dashboard +2. Add `--dispatch-namespace` flag: `wrangler deploy --dispatch-namespace production` +3. Or configure in wrangler.jsonc under `dispatch_namespaces` + +See [api.md](./api.md) for programmatic deployment via REST API or SDK. + +## Tags + +Organize/search Workers (max 8/script): + +```bash +# Set tags +curl -X PUT ".../tags" -d '["customer-123", "pro", "production"]' + +# Filter by tag +curl ".../scripts?tags=production%3Ayes" + +# Delete by tag +curl -X DELETE ".../scripts?tags=customer-123%3Ayes" +``` + +Common patterns: `customer-123`, `free|pro|enterprise`, `production|staging` + +## Bindings + +**Supported binding types:** 29 total including KV, D1, R2, Durable Objects, Analytics Engine, Service, Assets, Queue, Vectorize, Hyperdrive, Workflow, AI, Browser, and more. 
+ 
+Add via API metadata (see [api.md](./api.md#deploy-with-bindings)): 
+```json 
+{ 
+  "bindings": [ 
+    {"type": "kv_namespace", "name": "USER_KV", "namespace_id": "..."}, 
+    {"type": "r2_bucket", "name": "STORAGE", "bucket_name": "..."}, 
+    {"type": "d1", "name": "DB", "id": "..."} 
+  ] 
+} 
+``` 
+ 
+Preserve existing bindings: 
+```json 
+{ 
+  "bindings": [{"type": "r2_bucket", "name": "STORAGE", "bucket_name": "new"}], 
+  "keep_bindings": ["kv_namespace", "d1"]  // Preserves existing bindings of these types 
+} 
+``` 
+ 
+For complete binding type reference, see [bindings](../bindings/) documentation 
+ 
+See [README.md](./README.md), [api.md](./api.md), [patterns.md](./patterns.md), [gotchas.md](./gotchas.md) 
diff --git a/cloudflare/references/workers-for-platforms/gotchas.md b/cloudflare/references/workers-for-platforms/gotchas.md 
new file mode 100644 
index 0000000..a32fe18 
--- /dev/null 
+++ b/cloudflare/references/workers-for-platforms/gotchas.md 
@@ -0,0 +1,134 @@ 
+# Gotchas & Limits 
+ 
+## Common Errors 
+ 
+### "Worker not found" 
+ 
+**Cause:** Attempting to get a Worker that doesn't exist in the namespace 
+**Solution:** Catch the error and return 404: 
+ 
+```typescript 
+try { 
+  const userWorker = env.DISPATCHER.get(workerName); 
+  return userWorker.fetch(request); 
+} catch (e) { 
+  if (e.message.startsWith("Worker not found")) { 
+    return new Response("Worker not found", { status: 404 }); 
+  } 
+  throw e; // Re-throw unexpected errors 
+} 
+``` 
+ 
+### "CPU time limit exceeded" 
+ 
+**Cause:** User Worker exceeded configured CPU time limit 
+**Solution:** Track violations in Analytics Engine and return 429 response; consider adjusting limits per customer tier 
+ 
+### "Hostname Routing Issues" 
+ 
+**Cause:** DNS proxy settings causing routing problems 
+**Solution:** Use `*/*` wildcard route which works regardless of proxy settings for orange-to-orange routing 
+ 
+### "Bindings Lost on Update" 
+ 
+**Cause:** Not using the `keep_bindings` field when updating a Worker 
+**Solution:** Send the `keep_bindings` array of binding types (e.g. `["kv_namespace", "d1"]`) in API 
requests to preserve existing bindings during updates + +### "Tag Filtering Not Working" + +**Cause:** Special characters not URL encoded in tag filters +**Solution:** URL encode tags (e.g., `tags=production%3Ayes`) and avoid special chars like `,` and `&` + +### "Deploy Failures with ES Modules" + +**Cause:** Incorrect upload format for ES modules +**Solution:** Use multipart form upload, specify `main_module` in metadata, and set file type to `application/javascript+module` + +### "Static Asset Upload Failed" + +**Cause:** Invalid hash format, expired token, or incorrect encoding +**Solution:** Hash must be first 16 bytes (32 hex chars) of SHA-256, upload within 1 hour of session creation, deploy within 1 hour of upload completion, and Base64 encode file contents + +### "Outbound Worker Not Intercepting Calls" + +**Cause:** Outbound Workers don't intercept Durable Object or mTLS binding fetch +**Solution:** Plan egress control accordingly; not all fetch calls are intercepted + +### "TCP Socket Connection Failed" + +**Cause:** Outbound Worker enabled blocks `connect()` API for TCP sockets +**Solution:** Outbound Workers only intercept `fetch()` calls; TCP socket connections unavailable when outbound configured. Remove outbound if TCP needed, or use proxy pattern. 
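
Returning to the "Tag Filtering Not Working" gotcha above: the encoding rule is easy to get right with a small helper. A sketch — the helper name is ours; the query shape matches the `tags=production%3Ayes` example earlier:

```typescript
// Build a `tags=` filter query string with each `tag:value` pair URL-encoded,
// so `:` becomes %3A and other special characters are escaped.
function tagFilterQuery(tags: Record<string, "yes" | "no">): string {
  return Object.entries(tags)
    .map(([tag, val]) => `tags=${encodeURIComponent(`${tag}:${val}`)}`)
    .join("&");
}

// tagFilterQuery({ production: "yes" }) → "tags=production%3Ayes"
```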
+ 
+### "API Rate Limit Exceeded" 
+ 
+**Cause:** Exceeded Cloudflare API rate limits (1200 requests per 5 minutes per account, 200 requests per second per IP) 
+**Solution:** Implement exponential backoff: 
+ 
+```typescript 
+async function deployWithBackoff<T>(deploy: () => Promise<T>, maxRetries = 3) { 
+  for (let i = 0; i < maxRetries; i++) { 
+    try { 
+      return await deploy(); 
+    } catch (e) { 
+      if (e.status === 429 && i < maxRetries - 1) { 
+        await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000)); 
+        continue; 
+      } 
+      throw e; 
+    } 
+  } 
+} 
+``` 
+ 
+### "Gradual Deployment Not Supported" 
+ 
+**Cause:** Attempted to use gradual deployments with user Workers 
+**Solution:** Gradual deployments not supported for Workers in dispatch namespaces. Use all-at-once deployment with staged rollout via dispatch worker logic (feature flags, percentage-based routing). 
+ 
+### "Asset Session Expired" 
+ 
+**Cause:** Upload JWT expired (1 hour validity) or completion token expired (1 hour after upload) 
+**Solution:** Complete asset upload within 1 hour of session creation, and deploy Worker within 1 hour of upload completion. For large uploads, batch files or increase upload parallelism. 
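
The asset-hash rules from the upload gotchas above can be verified locally. A minimal Node.js sketch — the function names are ours, not part of the API:

```typescript
import { createHash } from "node:crypto";

// Manifest hash: first 16 bytes of the SHA-256 digest = 32 hex characters.
function assetHash(contents: Buffer): string {
  return createHash("sha256").update(contents).digest("hex").slice(0, 32);
}

// File bodies are uploaded Base64-encoded.
function assetBody(contents: Buffer): string {
  return contents.toString("base64");
}
```

Both values go into the upload session payload; see [api.md](./api.md#static-assets) for the surrounding request.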
+ +## Platform Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Workers per namespace | Unlimited | Unlike regular Workers (500 per account) | +| Namespaces per account | Unlimited | Best practice: 1 production + 1 staging | +| Max tags per Worker | 8 | For filtering and organization | +| Worker mode | Untrusted (default) | No `request.cf` access unless trusted mode | +| Cache isolation | Per-Worker (untrusted) | Shared in trusted mode with key prefixes | +| Durable Object namespaces | Unlimited | No per-account limit for WfP | +| Gradual Deployments | Not supported | All-at-once only | +| `caches.default` | Disabled (untrusted) | Use Cache API with custom keys | + +## Asset Upload Limits + +| Limit | Value | Notes | +|-------|-------|-------| +| Upload session JWT validity | 1 hour | Must complete upload within this time | +| Completion token validity | 1 hour | Must deploy within this time after upload | +| Asset hash format | First 16 bytes SHA-256 | 32 hex characters | +| Base64 encoding | Required | For binary files | + +## API Rate Limits + +| Limit Type | Value | Scope | +|------------|-------|-------| +| Client API | 1200 requests / 5 min | Per account | +| Client API | 200 requests / sec | Per IP address | +| GraphQL | Varies by query cost | Query complexity | + +See [Cloudflare API Rate Limits](https://developers.cloudflare.com/fundamentals/api/reference/limits/) for details. 
+ 
+## Operational Limits 
+ 
+| Operation | Limit | Notes | 
+|-----------|-------|-------| 
+| CPU time (custom limits) | Up to Workers plan limit | Set per-invocation in dispatch worker | 
+| Subrequests (custom limits) | Up to Workers plan limit | Set per-invocation in dispatch worker | 
+| Outbound Worker subrequests | Not intercepted for DO/mTLS | Only regular fetch() calls | 
+| TCP sockets with outbound | Disabled | `connect()` API unavailable | 
+ 
+See [README.md](./README.md), [configuration.md](./configuration.md), [api.md](./api.md), [patterns.md](./patterns.md) 
diff --git a/cloudflare/references/workers-for-platforms/patterns.md b/cloudflare/references/workers-for-platforms/patterns.md 
new file mode 100644 
index 0000000..d198430 
--- /dev/null 
+++ b/cloudflare/references/workers-for-platforms/patterns.md 
@@ -0,0 +1,188 @@ 
+# Multi-Tenant Patterns 
+ 
+## Billing by Plan 
+ 
+```typescript 
+interface Env { 
+  DISPATCHER: DispatchNamespace; 
+  CUSTOMERS_KV: KVNamespace; 
+} 
+ 
+export default { 
+  async fetch(request: Request, env: Env): Promise<Response> { 
+    const userWorkerName = new URL(request.url).hostname.split(".")[0]; 
+    const customerPlan = await env.CUSTOMERS_KV.get(userWorkerName); 
+ 
+    const plans = { 
+      enterprise: { cpuMs: 50, subRequests: 50 }, 
+      pro: { cpuMs: 20, subRequests: 20 }, 
+      free: { cpuMs: 10, subRequests: 5 }, 
+    }; 
+    const limits = plans[customerPlan as keyof typeof plans] || plans.free; 
+ 
+    const userWorker = env.DISPATCHER.get(userWorkerName, {}, { limits }); 
+    return await userWorker.fetch(request); 
+  }, 
+}; 
+``` 
+ 
+## Resource Isolation 
+ 
+**Complete isolation:** Create unique resources per customer 
+- KV namespace per customer 
+- D1 database per customer 
+- R2 bucket per customer 
+ 
+```typescript 
+const bindings = [{ 
+  type: "kv_namespace", 
+  name: "USER_KV", 
+  namespace_id: `customer-${customerId}-kv` 
+}]; 
+``` 
+ 
+## Hostname Routing 
+ 
+### Wildcard Route (Recommended) 
+Configure `*/*` route on SaaS domain → dispatch Worker 
+ 
+**Benefits:** 
+- Supports 
subdomains + custom vanity domains 
+- No per-route limits (regular Workers limited to 100 routes) 
+- Programmatic control 
+- Works with any DNS proxy settings 
+ 
+**Setup:** 
+1. Cloudflare for SaaS custom hostnames 
+2. Fallback origin (dummy `A 192.0.2.0` if Worker is origin) 
+3. DNS CNAME to SaaS domain 
+4. `*/*` route → dispatch Worker 
+5. Routing logic in dispatch Worker 
+ 
+```typescript 
+export default { 
+  async fetch(request: Request, env: Env): Promise<Response> { 
+    const hostname = new URL(request.url).hostname; 
+    const hostnameData = await env.ROUTING_KV.get(`hostname:${hostname}`, { type: "json" }); 
+ 
+    if (!hostnameData?.workerName) { 
+      return new Response("Hostname not configured", { status: 404 }); 
+    } 
+ 
+    const userWorker = env.DISPATCHER.get(hostnameData.workerName); 
+    return await userWorker.fetch(request); 
+  }, 
+}; 
+``` 
+ 
+### Subdomain-Only 
+1. Wildcard DNS: `*.saas.com` → origin 
+2. Route: `*.saas.com/*` → dispatch Worker 
+3. Extract subdomain for routing 
+ 
+### Orange-to-Orange (O2O) Behavior 
+ 
+When customers use Cloudflare and CNAME to your Workers domain: 
+ 
+| Scenario | Behavior | Route Pattern | 
+|----------|----------|---------------| 
+| Customer not on Cloudflare | Standard routing | `*/*` or `*.domain.com/*` | 
+| Customer on Cloudflare (proxied CNAME) | Invokes Worker at edge | `*/*` required | 
+| Customer on Cloudflare (DNS-only CNAME) | Standard routing | Any route works | 
+ 
+**Recommendation:** Always use `*/*` wildcard for consistent O2O behavior. 
+ 
+### Custom Metadata Routing 
+ 
+For Cloudflare for SaaS: Store worker name in custom hostname `custom_metadata`, retrieve in dispatch worker to route requests. Requires custom hostnames as subdomains of your domain. 
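
The metadata lookup above can be sketched as follows. This assumes the worker name was stored under a `workerName` key (the key is whatever you chose when writing the custom hostname metadata); in the dispatch Worker, SaaS custom-hostname metadata surfaces on `request.cf.hostMetadata`:

```typescript
// Hypothetical metadata shape — mirrors whatever you stored via the
// Custom Hostnames API, not a platform-defined type.
interface HostMetadata { workerName?: string }

// Pure routing decision, split out so it is easy to unit test.
function resolveWorkerName(meta: HostMetadata | undefined): string | null {
  return meta?.workerName ?? null;
}

// In the dispatch Worker:
//   const name = resolveWorkerName(request.cf?.hostMetadata as HostMetadata | undefined);
//   if (!name) return new Response("Hostname not configured", { status: 404 });
//   return env.DISPATCHER.get(name).fetch(request);
```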
+ +## Observability + +### Logpush +- Enable on dispatch Worker → captures all user Worker logs +- Filter by `Outcome` or `Script Name` + +### Tail Workers +- Real-time logs with custom formatting +- Receives HTTP status, `console.log()`, exceptions, diagnostics + +### Analytics Engine +```typescript +// Track violations +env.ANALYTICS.writeDataPoint({ + indexes: [customerName], + blobs: ["cpu_limit_exceeded"], +}); +``` + +### GraphQL +```graphql +query { + viewer { + accounts(filter: {accountTag: $accountId}) { + workersInvocationsAdaptive(filter: {dispatchNamespaceName: "production"}) { + sum { requests errors cpuTime } + } + } + } +} +``` + +## Use Case Implementations + +### AI Code Execution +```typescript +async function deployGeneratedCode(name: string, code: string) { + const file = new File([code], `${name}.mjs`, { type: "application/javascript+module" }); + await client.workersForPlatforms.dispatch.namespaces.scripts.update("production", name, { + account_id: accountId, + metadata: { main_module: `${name}.mjs`, tags: [name, "ai-generated"] }, + files: [file], + }); +} + +// Short limits for untrusted code +const userWorker = env.DISPATCHER.get(sessionId, {}, { limits: { cpuMs: 5, subRequests: 3 } }); +``` + +**VibeSDK:** For AI-powered code generation + deployment platforms, see [VibeSDK](https://github.com/cloudflare/vibesdk) - handles AI generation, sandbox execution, live preview, and deployment. 
+ +Reference: [AI Vibe Coding Platform Architecture](https://developers.cloudflare.com/reference-architecture/diagrams/ai/ai-vibe-coding-platform/) + +### Edge Functions Platform +```typescript +// Route: /customer-id/function-name +const [customerId, functionName] = new URL(request.url).pathname.split("/").filter(Boolean); +const workerName = `${customerId}-${functionName}`; +const userWorker = env.DISPATCHER.get(workerName); +``` + +### Website Builder +- Deploy static assets + Worker code +- See [api.md](./api.md#static-assets) for full implementation +- Salt hashes for asset isolation + +## Best Practices + +### Architecture +- One namespace per environment (production, staging) +- Platform logic in dispatch Worker (auth, rate limiting, validation) +- Isolation automatic (no shared cache, untrusted mode) + +### Routing +- Use `*/*` wildcard routes +- Store mappings in KV +- Handle missing Workers gracefully + +### Limits & Security +- Set custom limits by plan +- Track violations with Analytics Engine +- Use outbound Workers for egress control +- Sanitize responses + +### Tags +- Tag all Workers: customer ID, plan, environment +- Enable bulk operations +- Filter efficiently + +See [README.md](./README.md), [configuration.md](./configuration.md), [api.md](./api.md), [gotchas.md](./gotchas.md) diff --git a/cloudflare/references/workers-playground/README.md b/cloudflare/references/workers-playground/README.md new file mode 100644 index 0000000..6dee4f9 --- /dev/null +++ b/cloudflare/references/workers-playground/README.md @@ -0,0 +1,127 @@ +# Cloudflare Workers Playground Skill Reference + +## Overview + +Cloudflare Workers Playground is a browser-based sandbox for instantly experimenting with, testing, and deploying Cloudflare Workers without authentication or setup. This skill provides patterns, APIs, and best practices specifically for Workers Playground development. 
+ +**URL:** [workers.cloudflare.com/playground](https://workers.cloudflare.com/playground) + +## ⚠️ Playground Constraints + +**Playground is NOT production-equivalent:** +- ✅ Real Workers runtime, instant testing, shareable URLs +- ❌ No TypeScript (JavaScript only) +- ❌ No bindings (KV, D1, R2, Durable Objects) +- ❌ No environment variables or secrets +- ❌ ES modules only (no Service Worker format) +- ⚠️ Safari broken (use Chrome/Firefox) + +**For production:** Use `wrangler` CLI. Playground is for rapid prototyping. + +## Quick Start + +Minimal Worker: + +```javascript +export default { + async fetch(request, env, ctx) { + return new Response('Hello World'); + } +}; +``` + +JSON API: + +```javascript +export default { + async fetch(request, env, ctx) { + const data = { message: 'Hello', timestamp: Date.now() }; + return Response.json(data); + } +}; +``` + +Proxy with modification: + +```javascript +export default { + async fetch(request, env, ctx) { + const response = await fetch('https://example.com'); + const modified = new Response(response.body, response); + modified.headers.set('X-Custom-Header', 'added-by-worker'); + return modified; + } +}; +``` + +Import from CDN: + +```javascript +import { Hono } from 'https://esm.sh/hono@3'; + +export default { + async fetch(request) { + const app = new Hono(); + app.get('/', (c) => c.text('Hello Hono!')); + return app.fetch(request); + } +}; +``` + +## Reading Order + +1. **[configuration.md](configuration.md)** - Start here: playground setup, constraints, deployment +2. **[api.md](api.md)** - Core APIs: Request, Response, ExecutionContext, fetch, Cache +3. **[patterns.md](patterns.md)** - Common use cases: routing, proxying, A/B testing, multi-module code +4. 
**[gotchas.md](gotchas.md)** - Troubleshooting: errors, browser issues, limits, best practices + +## In This Reference + +- **[configuration.md](configuration.md)** - Setup, deployment, configuration +- **[api.md](api.md)** - API endpoints, methods, interfaces +- **[patterns.md](patterns.md)** - Common patterns, use cases, examples +- **[gotchas.md](gotchas.md)** - Troubleshooting, best practices, limitations + +## Key Features + +**No Setup Required:** +- Open URL and start coding +- No CLI, no account, no config files +- Code executes in real Cloudflare Workers runtime + +**Instant Preview:** +- Live preview pane with browser tab or HTTP tester +- Auto-reload on code changes +- DevTools integration (right-click → Inspect) + +**Share & Deploy:** +- Copy Link generates permanent shareable URL +- Deploy button publishes to production in ~30 seconds +- Get `*.workers.dev` subdomain immediately + +## Common Use Cases + +- **API development:** Test endpoints before wrangler setup +- **Learning Workers:** Experiment with APIs without local environment +- **Prototyping:** Quick POCs for edge logic +- **Sharing examples:** Generate shareable links for bug reports or demos +- **Framework testing:** Import from CDN (Hono, itty-router, etc.) + +## Limitations vs Production + +| Feature | Playground | Production (wrangler) | +|---------|------------|----------------------| +| Language | JavaScript only | JS + TypeScript | +| Bindings | None | KV, D1, R2, DO, AI, etc. 
| 
+| Environment vars | None | Full support | 
+| Module format | ES only | ES + Service Worker | 
+| CPU time | 10ms (Free plan) | 10ms Free / 50ms Paid | 
+| Custom domains | No | Yes | 
+| Analytics | No | Yes | 
+ 
+## See Also 
+ 
+- [Cloudflare Workers Docs](https://developers.cloudflare.com/workers/) 
+- [Workers Examples](https://developers.cloudflare.com/workers/examples/) 
+- [Wrangler CLI](https://developers.cloudflare.com/workers/wrangler/) 
+- [Workers API Reference](https://developers.cloudflare.com/workers/runtime-apis/) 
diff --git a/cloudflare/references/workers-playground/api.md b/cloudflare/references/workers-playground/api.md 
new file mode 100644 
index 0000000..1382ab4 
--- /dev/null 
+++ b/cloudflare/references/workers-playground/api.md 
@@ -0,0 +1,101 @@ 
+# Workers Playground API 
+ 
+## Handler 
+ 
+```javascript 
+export default { 
+  async fetch(request, env, ctx) { 
+    // request: Request, env: {} (empty in playground), ctx: ExecutionContext 
+    return new Response('Hello'); 
+  } 
+}; 
+``` 
+ 
+## Request 
+ 
+```javascript 
+const method = request.method;      // "GET", "POST" 
+const url = new URL(request.url);   // Parse URL 
+const headers = request.headers;    // Headers object 
+const body = await request.json();  // Read body (consumes stream) 
+const clone = request.clone();      // Clone before reading body 
+ 
+// Query params 
+url.searchParams.get('page');       // Single value 
+url.searchParams.getAll('tag');     // Array 
+ 
+// Cloudflare metadata 
+request.cf.country;                 // "US" 
+request.cf.colo;                    // "SFO" 
+``` 
+ 
+## Response 
+ 
+```javascript 
+// Text 
+return new Response('Hello', { status: 200 }); 
+ 
+// JSON 
+return Response.json({ data }, { status: 200, headers: {...} }); 
+ 
+// Redirect (the URL must be absolute in Workers; relative paths throw) 
+return Response.redirect('https://example.com/new-path', 301); 
+ 
+// Modify existing 
+const modified = new Response(response.body, response); 
+modified.headers.set('X-Custom', 'value'); 
+``` 
+ 
+## ExecutionContext 
+ 
+```javascript 
+// Background work (after response sent) 
+ctx.waitUntil(fetch('https://logs.example.com', { 
method: 'POST', body: '...' })); +return new Response('OK'); // Returns immediately +``` + +## Fetch + +```javascript +const response = await fetch('https://api.example.com'); +const data = await response.json(); + +// With options +await fetch(url, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ name: 'Alice' }) +}); +``` + +## Cache + +```javascript +const cache = caches.default; + +// Check cache +let response = await cache.match(request); +if (!response) { + response = await fetch(origin); + await cache.put(request, response.clone()); // Clone before put! +} +return response; +``` + +## Crypto + +```javascript +crypto.randomUUID(); // UUID v4 +crypto.getRandomValues(new Uint8Array(16)); + +// SHA-256 hash +const hash = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(data)); +``` + +## Limits (Playground = Free Plan) + +| Resource | Limit | +|----------|-------| +| CPU time | 10ms | +| Subrequests | 50 | +| Memory | 128 MB | diff --git a/cloudflare/references/workers-playground/configuration.md b/cloudflare/references/workers-playground/configuration.md new file mode 100644 index 0000000..7d747ac --- /dev/null +++ b/cloudflare/references/workers-playground/configuration.md @@ -0,0 +1,163 @@ +# Configuration + +## Getting Started + +Navigate to [workers.cloudflare.com/playground](https://workers.cloudflare.com/playground) + +- **No account required** for testing +- **No CLI or local setup** needed +- Code executes in real Cloudflare Workers runtime +- Share code via URL (never expires) + +## Playground Constraints + +⚠️ **Important Limitations** + +| Constraint | Playground | Production Workers | +|------------|------------|-------------------| +| **Module Format** | ES modules only | ES modules or Service Worker | +| **TypeScript** | Not supported (JS only) | Supported via build step | +| **Bindings** | Not available | KV, D1, R2, Durable Objects, etc. 
| +| **wrangler.toml** | Not used | Required for config | +| **Environment Variables** | Not available | Full support | +| **Secrets** | Not available | Full support | +| **Custom Domains** | Not available | Full support | + +**Playground is for rapid prototyping only.** For production apps, use `wrangler` CLI. + +## Code Editor + +### Syntax Requirements + +Must export default object with `fetch` handler: + +```javascript +export default { + async fetch(request, env, ctx) { + return new Response('Hello World'); + } +}; +``` + +**Key Points:** +- Must use ES modules (`export default`) +- `fetch` method receives `(request, env, ctx)` +- Must return `Response` object +- TypeScript not supported (use plain JavaScript) + +### Multi-Module Code + +Import from external URLs or inline modules: + +```javascript +// Import from CDN +import { Hono } from 'https://esm.sh/hono@3'; + +// Or paste library code and import relatively +// (See patterns.md for multi-module examples) + +export default { + async fetch(request) { + const app = new Hono(); + app.get('/', (c) => c.text('Hello')); + return app.fetch(request); + } +}; +``` + +## Preview Panel + +### Browser Tab + +Default interactive preview with address bar: +- Enter custom URL paths +- Automatic reload on code changes +- DevTools available (right-click → Inspect) + +### HTTP Test Panel + +Switch to **HTTP** tab for raw HTTP testing: +- Change HTTP method (GET, POST, PUT, DELETE, PATCH, etc.) 
+- Add/edit request headers +- Modify request body (JSON, form data, text) +- View response headers and body +- Test different content types + +Example HTTP test: +``` +Method: POST +URL: /api/users +Headers: + Content-Type: application/json + Authorization: Bearer token123 +Body: +{ + "name": "Alice", + "email": "alice@example.com" +} +``` + +## Sharing Code + +**Copy Link** button generates shareable URL: +- Code embedded in URL fragment +- Links never expire +- No account required +- Can be bookmarked for later + +Example: `https://workers.cloudflare.com/playground#abc123...` + +## Deploying from Playground + +Click **Deploy** button to move code to production: + +1. **Log in** to Cloudflare account (creates free account if needed) +2. **Review** Worker name and code +3. **Deploy** to global network (takes ~30 seconds) +4. **Get URL**: Deployed to `.workers.dev` subdomain +5. **Manage** from dashboard: add bindings, custom domains, analytics + +**After deploy:** +- Code runs on Cloudflare's global network (300+ cities) +- Can add KV, D1, R2, Durable Objects bindings +- Configure custom domains and routes +- View analytics and logs +- Set environment variables and secrets + +**Note:** Deployed Workers are production-ready but start on Free plan (100k requests/day). + +## Browser Compatibility + +| Browser | Status | Notes | +|---------|--------|-------| +| Chrome/Edge | ✅ Full support | Recommended | +| Firefox | ✅ Full support | Works well | +| Safari | ⚠️ Broken | Preview fails with "PreviewRequestFailed" | + +**Safari users:** Use Chrome, Firefox, or Edge for Workers Playground. + +## DevTools Integration + +1. **Open preview** in browser tab +2. **Right-click** → Inspect Element +3. **Console tab** shows Worker logs: + - `console.log()` output + - Uncaught errors + - Network requests (subrequests) + +**Note:** DevTools show client-side console, not Worker execution logs. For production logging, use Logpush or Tail Workers. 
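
A minimal end-to-end check of that flow — paste this Worker, open the preview, then right-click → Inspect to see the logged summary in the Console (the `describeRequest` helper is illustrative):

```javascript
// Summarize the incoming request; the summary is both logged and returned.
function describeRequest(request) {
  const url = new URL(request.url);
  return { method: request.method, path: url.pathname };
}

export default {
  async fetch(request, env, ctx) {
    const info = describeRequest(request);
    console.log('request:', info); // visible in the preview DevTools Console
    return Response.json(info);
  }
};
```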
+ +## Limits in Playground + +Same as production Free plan: + +| Resource | Limit | Notes | +|----------|-------|-------| +| CPU time | 10ms | Per request | +| Memory | 128 MB | Per request | +| Script size | 1 MB | After compression | +| Subrequests | 50 | Outbound fetch calls | +| Request size | 100 MB | Incoming | +| Response size | Unlimited | Outgoing (streamed) | + +**Exceeding CPU time** throws error immediately. Optimize hot paths or upgrade to Paid plan (50ms CPU). diff --git a/cloudflare/references/workers-playground/gotchas.md b/cloudflare/references/workers-playground/gotchas.md new file mode 100644 index 0000000..9271dc4 --- /dev/null +++ b/cloudflare/references/workers-playground/gotchas.md @@ -0,0 +1,88 @@ +# Workers Playground Gotchas + +## Platform Limitations + +| Limitation | Impact | Workaround | +|------------|--------|------------| +| Safari broken | Preview fails | Use Chrome/Firefox/Edge | +| TypeScript unsupported | TS syntax errors | Write plain JS or use JSDoc | +| No bindings | `env` always `{}` | Mock data or use external APIs | +| No env vars | Can't access secrets | Hardcode for testing | + +## Common Runtime Errors + +### "Response body already read" + +```javascript +// ❌ Body consumed twice +const body = await request.text(); +await fetch(url, { body: request.body }); // Error! + +// ✅ Clone first +const clone = request.clone(); +const body = await request.text(); +await fetch(url, { body: clone.body }); +``` + +### "Worker exceeded CPU time" + +**Limit:** 10ms (free), 50ms (paid) + +```javascript +// ✅ Move slow work to background +ctx.waitUntil(fetch('https://analytics.example.com', {...})); +return new Response('OK'); // Return immediately +``` + +### "Too many subrequests" + +**Limit:** 50 (free), 1000 (paid) + +```javascript +// ❌ 100 individual fetches +// ✅ Batch into single API call +await fetch('https://api.example.com/batch', { + body: JSON.stringify({ ids: [...] 
}) +}); +``` + +## Best Practices + +```javascript +// Clone before caching +await cache.put(request, response.clone()); +return response; + +// Validate input early +if (request.method !== 'POST') return new Response('', { status: 405 }); + +// Handle errors +try { ... } catch (e) { + return Response.json({ error: e.message }, { status: 500 }); +} +``` + +## Limits + +| Resource | Free | Paid | +|----------|------|------| +| CPU time | 10ms | 50ms | +| Memory | 128 MB | 128 MB | +| Subrequests | 50 | 1000 | + +## Browser Support + +| Browser | Status | +|---------|--------| +| Chrome | ✅ Recommended | +| Firefox | ✅ Works | +| Edge | ✅ Works | +| Safari | ❌ Broken | + +## Debugging + +```javascript +console.log('URL:', request.url); // View in browser DevTools Console +``` + +**Note:** `console.log` works in playground. For production, use Logpush or Tail Workers. diff --git a/cloudflare/references/workers-playground/patterns.md b/cloudflare/references/workers-playground/patterns.md new file mode 100644 index 0000000..4af891c --- /dev/null +++ b/cloudflare/references/workers-playground/patterns.md @@ -0,0 +1,132 @@ +# Workers Playground Patterns + +## JSON API + +```javascript +export default { + async fetch(request) { + const url = new URL(request.url); + if (url.pathname === '/api/hello') return Response.json({ message: 'Hello' }); + if (url.pathname === '/api/echo' && request.method === 'POST') { + return Response.json({ received: await request.json() }); + } + return Response.json({ error: 'Not found' }, { status: 404 }); + } +}; +``` + +## Router Pattern + +```javascript +const routes = { + '/': () => new Response('Home'), + '/api/users': () => Response.json([{ id: 1, name: 'Alice' }]) +}; + +export default { + async fetch(request) { + const handler = routes[new URL(request.url).pathname]; + return handler ? 
handler() : new Response('Not Found', { status: 404 }); + } +}; +``` + +## Proxy Pattern + +```javascript +export default { + async fetch(request) { + const url = new URL(request.url); + url.hostname = 'api.example.com'; + return fetch(url.toString(), { + method: request.method, headers: request.headers, body: request.body + }); + } +}; +``` + +## CORS Handling + +```javascript +export default { + async fetch(request) { + if (request.method === 'OPTIONS') { + return new Response(null, { + headers: { + 'Access-Control-Allow-Origin': '*', + 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE', + 'Access-Control-Allow-Headers': 'Content-Type, Authorization' + } + }); + } + const response = await fetch('https://api.example.com', request); + const modified = new Response(response.body, response); + modified.headers.set('Access-Control-Allow-Origin', '*'); + return modified; + } +}; +``` + +## Caching + +```javascript +export default { + async fetch(request) { + if (request.method !== 'GET') return fetch(request); + const cache = caches.default; + let response = await cache.match(request); + if (!response) { + response = await fetch('https://api.example.com'); + if (response.status === 200) await cache.put(request, response.clone()); + } + return response; + } +}; +``` + +## Hono Framework + +```javascript +import { Hono } from 'https://esm.sh/hono@3'; +const app = new Hono(); +app.get('/', (c) => c.text('Hello')); +app.get('/api/users/:id', (c) => c.json({ id: c.req.param('id') })); +app.notFound((c) => c.json({ error: 'Not found' }, 404)); +export default app; +``` + +## Authentication + +```javascript +export default { + async fetch(request) { + const auth = request.headers.get('Authorization'); + if (!auth?.startsWith('Bearer ')) { + return Response.json({ error: 'Unauthorized' }, { status: 401 }); + } + const token = auth.substring(7); + if (token !== 'secret-token') { + return Response.json({ error: 'Invalid token' }, { status: 403 }); + } + return 
Response.json({ message: 'Authenticated' }); + } +}; +``` + +## Error Handling + +```javascript +export default { + async fetch(request) { + try { + const response = await fetch('https://api.example.com'); + if (!response.ok) throw new Error(`API returned ${response.status}`); + return response; + } catch (error) { + return Response.json({ error: error.message }, { status: 500 }); + } + } +}; +``` + +**Note:** In-memory state (Maps, variables) resets on Worker cold start. Use Durable Objects or KV for persistence. diff --git a/cloudflare/references/workers-vpc/README.md b/cloudflare/references/workers-vpc/README.md new file mode 100644 index 0000000..412d823 --- /dev/null +++ b/cloudflare/references/workers-vpc/README.md @@ -0,0 +1,127 @@ +# Workers VPC Connectivity + +Connect Cloudflare Workers to private networks and internal infrastructure using TCP Sockets. + +## Overview + +Workers VPC connectivity enables outbound TCP connections from Workers to private resources in AWS, Azure, GCP, on-premises datacenters, or any private network. This is achieved through the **TCP Sockets API** (`cloudflare:sockets`), which provides low-level network access for custom protocols and services. + +**Key capabilities:** +- Direct TCP connections to private IPs and hostnames +- TLS/StartTLS support for encrypted connections +- Integration with Cloudflare Tunnel for secure private network access +- Full control over wire protocols (database protocols, SSH, MQTT, custom TCP) + +**Note:** This reference documents the TCP Sockets API. For the newer Workers VPC Services product (HTTP-only service bindings with built-in SSRF protection), refer to separate documentation when available. VPC Services is currently in beta (2025+). + +## Quick Decision: Which Technology? + +Need private network connectivity from Workers? 
+ 
+| Requirement | Use | Why | 
+|------------|-----|-----| 
+| HTTP/HTTPS APIs in private network | VPC Services (beta, separate docs) | SSRF-safe, declarative bindings | 
+| PostgreSQL/MySQL databases | [Hyperdrive](../hyperdrive/) | Connection pooling, caching, optimized | 
+| Custom TCP protocols (SSH, MQTT, proprietary) | **TCP Sockets (this doc)** | Full protocol control | 
+| Simple HTTP with lowest latency | TCP Sockets + [Smart Placement](../smart-placement/) | Manual optimization | 
+| Expose on-prem to internet (inbound) | [Cloudflare Tunnel](../tunnel/) | Not Worker-specific | 
+ 
+## When to Use TCP Sockets 
+ 
+**Use TCP Sockets when you need:** 
+- ✅ Direct control over wire protocols (e.g., Postgres wire protocol, SSH, Redis RESP) 
+- ✅ Non-HTTP protocols (MQTT, SMTP, custom binary protocols) 
+- ✅ StartTLS or custom TLS negotiation 
+- ✅ Streaming binary data over TCP 
+ 
+**Don't use TCP Sockets when:** 
+- ❌ You just need HTTP/HTTPS (use `fetch()` or VPC Services) 
+- ❌ You need PostgreSQL/MySQL (use Hyperdrive for pooling) 
+- ❌ You need WebSocket (use native Workers WebSocket) 
+ 
+## Quick Start 
+ 
+```typescript 
+import { connect } from 'cloudflare:sockets'; 
+ 
+export default { 
+  async fetch(req: Request): Promise<Response> { 
+    // Connect to private service 
+    const socket = connect( 
+      { hostname: "db.internal.company.net", port: 5432 }, 
+      { secureTransport: "on" } 
+    ); 
+ 
+    try { 
+      await socket.opened; // Wait for connection 
+ 
+      const writer = socket.writable.getWriter(); 
+      await writer.write(new TextEncoder().encode("QUERY\r\n")); 
+      await writer.close(); 
+ 
+      const reader = socket.readable.getReader(); 
+      const { value } = await reader.read(); 
+ 
+      return new Response(value); 
+    } finally { 
+      await socket.close(); 
+    } 
+  } 
+}; 
+``` 
+ 
+## Architecture Pattern: Workers + Tunnel 
+ 
+Most private network connectivity combines TCP Sockets with Cloudflare Tunnel: 
+ 
+``` 
+┌─────────┐     ┌─────────────┐     ┌──────────────┐     ┌─────────────┐ 
+│ Worker  │────▶│ TCP Socket  │────▶│    Tunnel    │────▶│   Private   │ 
+│         │     │ (this API)  │     │ (cloudflared)│     │   Network   │ 
+└─────────┘     └─────────────┘     └──────────────┘     └─────────────┘ 
+``` 
+ 
+1. Worker opens TCP socket to Tunnel hostname 
+2. Tunnel endpoint routes to private IP 
+3. Response flows back through Tunnel to Worker 
+ 
+See [configuration.md](./configuration.md) for Tunnel setup details. 
+ 
+## Reading Order 
+ 
+1. **Start here (README.md)** - Overview and decision guide 
+2. **[api.md](./api.md)** - Socket interface, types, methods 
+3. **[configuration.md](./configuration.md)** - Wrangler setup, Tunnel integration 
+4. **[patterns.md](./patterns.md)** - Real-world examples (databases, protocols, error handling) 
+5. **[gotchas.md](./gotchas.md)** - Limits, blocked ports, common errors 
+ 
+## Key Limits 
+ 
+| Limit | Value | 
+|-------|-------| 
+| Max concurrent sockets per request | 6 | 
+| Blocked destinations | Cloudflare IPs, localhost, port 25 | 
+| Scope requirement | Must create in handler (not global) | 
+ 
+See [gotchas.md](./gotchas.md) for complete limits and troubleshooting. 
+ 
+## Best Practices 
+ 
+1. **Always close sockets** - Use try/finally blocks 
+2. **Validate destinations** - Prevent SSRF by allowlisting hosts 
+3. **Use Hyperdrive for databases** - Better performance than raw TCP 
+4. **Prefer fetch() for HTTP** - Only use TCP when necessary 
+5. 
**Combine with Smart Placement** - Reduce latency to private networks + +## Related Technologies + +- **[Hyperdrive](../hyperdrive/)** - PostgreSQL/MySQL with connection pooling +- **[Cloudflare Tunnel](../tunnel/)** - Secure private network access +- **[Smart Placement](../smart-placement/)** - Auto-locate Workers near backends +- **VPC Services (beta)** - HTTP-only service bindings with SSRF protection (separate docs) + +## Reference + +- [TCP Sockets API Documentation](https://developers.cloudflare.com/workers/runtime-apis/tcp-sockets/) +- [Connect to databases guide](https://developers.cloudflare.com/workers/tutorials/connect-to-postgres/) +- [Cloudflare Tunnel setup](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/) diff --git a/cloudflare/references/workers-vpc/api.md b/cloudflare/references/workers-vpc/api.md new file mode 100644 index 0000000..987fb2e --- /dev/null +++ b/cloudflare/references/workers-vpc/api.md @@ -0,0 +1,202 @@ +# TCP Sockets API Reference + +Complete API reference for the Cloudflare Workers TCP Sockets API (`cloudflare:sockets`). + +## Core Function: `connect()` + +```typescript +function connect( + address: SocketAddress, + options?: SocketOptions +): Socket +``` + +Creates an outbound TCP connection to the specified address. + +### Parameters + +#### `SocketAddress` + +```typescript +interface SocketAddress { + hostname: string; // DNS hostname or IP address + port: number; // TCP port (1-65535, excluding blocked ports) +} +``` + +| Field | Type | Description | Example | +|-------|------|-------------|---------| +| `hostname` | `string` | Target hostname or IP | `"db.internal.net"`, `"10.0.1.50"` | +| `port` | `number` | TCP port number | `5432`, `443`, `22` | + +DNS names are resolved at connection time. IPv4, IPv6, and private IPs (10.x, 172.16.x, 192.168.x) supported. 
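Since a `SocketAddress` is just a hostname/port pair, user-supplied `host:port` strings can be validated before ever calling `connect()`. A hedged sketch (the helper name, default port, and error messages are illustrative, not part of the API; bracketed IPv6 literals are not handled):

```typescript
// Illustrative input validation for SocketAddress values.
// Rejects malformed ports up front, including port 25, which the
// Workers runtime blocks anyway.
interface SocketAddress {
  hostname: string;
  port: number;
}

function parseAddress(input: string, defaultPort = 443): SocketAddress {
  const idx = input.lastIndexOf(':');
  const hostname = idx === -1 ? input : input.slice(0, idx);
  const port = idx === -1 ? defaultPort : Number(input.slice(idx + 1));
  if (!hostname) throw new Error('empty hostname');
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${port}`);
  }
  if (port === 25) throw new Error('port 25 (SMTP) is blocked by Workers');
  return { hostname, port };
}
```

The result can be passed straight to `connect(parseAddress(userInput))`.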

#### `SocketOptions`

```typescript
interface SocketOptions {
  secureTransport?: "off" | "on" | "starttls";
  allowHalfOpen?: boolean;
}
```

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `secureTransport` | `"off" \| "on" \| "starttls"` | `"off"` | TLS mode |
| `allowHalfOpen` | `boolean` | `false` | Allow half-closed connections |

**`secureTransport` modes:**

| Mode | Behavior | Use Case |
|------|----------|----------|
| `"off"` | Plain TCP, no encryption | Testing, internal trusted networks |
| `"on"` | Immediate TLS handshake | HTTPS, secure databases, SSH |
| `"starttls"` | Start plain, upgrade later with `startTls()` | Postgres, SMTP, IMAP |

**`allowHalfOpen`:** When `false` (default), closing the read stream auto-closes the write stream. When `true`, the streams are independent.

### Returns

A `Socket` object with readable/writable streams.

## Socket Interface

```typescript
interface Socket {
  // Streams
  readable: ReadableStream;
  writable: WritableStream;

  // Connection state
  opened: Promise<SocketInfo>;
  closed: Promise<void>;

  // Methods
  close(): Promise<void>;
  startTls(): Socket;
}
```

### Properties

#### `readable: ReadableStream`

Stream for reading data from the socket. Use `getReader()` to consume data.

```typescript
const reader = socket.readable.getReader();
const { done, value } = await reader.read(); // Read one chunk
```

#### `writable: WritableStream`

Stream for writing data to the socket. Use `getWriter()` to send data.

```typescript
const writer = socket.writable.getWriter();
await writer.write(new TextEncoder().encode("HELLO\r\n"));
await writer.close();
```

#### `opened: Promise<SocketInfo>`

Promise that resolves when connection succeeds, rejects on failure.

```typescript
interface SocketInfo {
  remoteAddress?: string; // May be undefined
  localAddress?: string;  // May be undefined
}

try {
  const info = await socket.opened;
} catch (error) {
  // Connection failed
}
```

#### `closed: Promise<void>`

Promise that resolves when socket is fully closed (both directions).

### Methods

#### `close(): Promise<void>`

Closes the socket gracefully, waiting for pending writes to complete.

```typescript
const socket = connect({ hostname: "api.internal", port: 443 });
try {
  // Use socket
} finally {
  await socket.close(); // Always call in finally block
}
```

#### `startTls(): Socket`

Upgrades connection to TLS. Only available when `secureTransport: "starttls"` was specified.

```typescript
const socket = connect(
  { hostname: "db.internal", port: 5432 },
  { secureTransport: "starttls" }
);

// Send protocol-specific StartTLS command
const writer = socket.writable.getWriter();
await writer.write(new TextEncoder().encode("STARTTLS\r\n"));

// Upgrade to TLS - use returned socket, not original
const secureSocket = socket.startTls();
const secureWriter = secureSocket.writable.getWriter();
```

## Complete Example

```typescript
import { connect } from 'cloudflare:sockets';

export default {
  async fetch(req: Request): Promise<Response> {
    const socket = connect({ hostname: "echo.example.com", port: 7 }, { secureTransport: "on" });

    try {
      await socket.opened;

      const writer = socket.writable.getWriter();
      await writer.write(new TextEncoder().encode("Hello, TCP!\n"));
      await writer.close();

      const reader = socket.readable.getReader();
      const { value } = await reader.read();

      return new Response(value);
    } finally {
      await socket.close();
    }
  }
};
```

See [patterns.md](./patterns.md) for multi-chunk reading, error handling, and protocol implementations.
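The `startTls()` snippet sends a textual command, but each protocol defines its own upgrade handshake. As a concrete illustration, PostgreSQL negotiates TLS with a binary SSLRequest message (a big-endian int32 length of 8 followed by the magic code 80877103), and the server answers with a single `'S'` or `'N'` byte. These bytes can be built and checked with plain helpers before deciding whether to call `startTls()` (a sketch, not a full Postgres client):

```typescript
// Build the 8-byte PostgreSQL SSLRequest message.
function buildSSLRequest(): Uint8Array {
  const buf = new Uint8Array(8);
  const view = new DataView(buf.buffer);
  view.setInt32(0, 8);        // message length (DataView writes big-endian by default)
  view.setInt32(4, 80877103); // SSLRequest magic code
  return buf;
}

// The server replies with one byte: 'S' allows TLS, 'N' refuses it.
function serverAllowsTls(reply: Uint8Array): boolean {
  return reply.length > 0 && reply[0] === 'S'.charCodeAt(0);
}
```

A Worker would open the socket with `secureTransport: "starttls"`, write `buildSSLRequest()`, read one byte, and call `socket.startTls()` only when `serverAllowsTls` returns `true`.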

## Quick Reference

| Task | Code |
|------|------|
| Import | `import { connect } from 'cloudflare:sockets';` |
| Connect | `connect({ hostname: "host", port: 443 })` |
| With TLS | `connect(addr, { secureTransport: "on" })` |
| StartTLS | `socket.startTls()` after handshake |
| Write | `await writer.write(data); await writer.close();` |
| Read | `const { value } = await reader.read();` |
| Error handling | `try { await socket.opened; } catch { }` |
| Always close | `try { } finally { await socket.close(); }` |

## See Also

- [patterns.md](./patterns.md) - Real-world protocol implementations
- [configuration.md](./configuration.md) - Wrangler setup and environment variables
- [gotchas.md](./gotchas.md) - Limits and error handling

diff --git a/cloudflare/references/workers-vpc/configuration.md b/cloudflare/references/workers-vpc/configuration.md
new file mode 100644
index 0000000..efd2d35
--- /dev/null
+++ b/cloudflare/references/workers-vpc/configuration.md
@@ -0,0 +1,147 @@
# Configuration

Setup and configuration for TCP Sockets in Cloudflare Workers.

## Wrangler Configuration

### Basic Setup

TCP Sockets are available by default in the Workers runtime. No special configuration is required in `wrangler.jsonc`:

```jsonc
{
  "name": "private-network-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-01-01"
}
```

### Environment Variables

Store connection details as env vars:

```jsonc
{
  "vars": { "DB_HOST": "10.0.1.50", "DB_PORT": "5432" }
}
```

```typescript
interface Env { DB_HOST: string; DB_PORT: string; }

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const socket = connect({ hostname: env.DB_HOST, port: parseInt(env.DB_PORT) });
    // ...speak your protocol over the socket, then respond
    return new Response('connected');
  }
};
```

### Per-Environment Configuration

```jsonc
{
  "vars": { "DB_HOST": "localhost" },
  "env": {
    "staging": { "vars": { "DB_HOST": "staging-db.internal.net" } },
    "production": { "vars": { "DB_HOST": "prod-db.internal.net" } }
  }
}
```

Deploy: `wrangler deploy --env staging` or `wrangler deploy --env production`

## Integration with Cloudflare Tunnel

To connect Workers to private networks, combine TCP Sockets with Cloudflare Tunnel:

```
Worker (TCP Socket) → Tunnel hostname → cloudflared → Private Network
```

### Quick Setup

1. **Install cloudflared** on a server inside your private network
2. **Create tunnel**: `cloudflared tunnel create my-private-network`
3. **Configure routing** in `config.yml`:

```yaml
tunnel: <TUNNEL-ID>
credentials-file: /path/to/<TUNNEL-ID>.json
ingress:
  - hostname: db.internal.example.com
    service: tcp://10.0.1.50:5432
  - service: http_status:404 # Required catch-all
```

4. **Run tunnel**: `cloudflared tunnel run my-private-network`
5. **Connect from Worker**:

```typescript
const socket = connect(
  { hostname: "db.internal.example.com", port: 5432 }, // Tunnel hostname
  { secureTransport: "on" }
);
```

For detailed Tunnel setup, see [Tunnel configuration reference](../tunnel/configuration.md).
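One way to keep Tunnel hostnames out of code is a small service registry stored in an env var. A hedged sketch (the `SERVICES` var, `ServiceAddress` shape, and `resolveService` helper are all illustrative, not a Cloudflare API):

```typescript
// Resolve logical service names to Tunnel hostnames, so Worker code
// never hardcodes private IPs. SERVICES would hold JSON such as:
// '{"db":{"hostname":"db.internal.example.com","port":5432}}'
interface ServiceAddress {
  hostname: string;
  port: number;
}

function resolveService(servicesJson: string, name: string): ServiceAddress {
  const services = JSON.parse(servicesJson) as Record<string, ServiceAddress>;
  const addr = services[name];
  if (!addr) throw new Error(`unknown service: ${name}`);
  return addr;
}
```

With this, `connect(resolveService(env.SERVICES, 'db'), { secureTransport: "on" })` stays stable even as Tunnel hostnames change per environment.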

## Smart Placement Integration

Reduce latency by auto-placing Workers near backends:

```jsonc
{ "placement": { "mode": "smart" } }
```

Workers automatically relocate closer to TCP socket destinations after observing connection latency. See [Smart Placement reference](../smart-placement/).

## Secrets Management

Store sensitive credentials as secrets (not in wrangler.jsonc):

```bash
wrangler secret put DB_PASSWORD # Enter value when prompted
```

Access in Worker via `env.DB_PASSWORD`. Use in protocol handshake or authentication.

## Local Development

Test with `wrangler dev`. Note: local mode may not be able to reach private networks. Use public endpoints or mock servers for development:

```typescript
// Note: process.env requires Node.js compatibility; a plain env var works too
const config = process.env.NODE_ENV === 'dev'
  ? { hostname: 'localhost', port: 5432 } // Mock
  : { hostname: 'db.internal.example.com', port: 5432 }; // Production
```

## Connection String Patterns

Parse connection strings to extract host and port:

```typescript
function parseConnectionString(connStr: string): SocketAddress {
  const url = new URL(connStr); // e.g., "postgres://10.0.1.50:5432/mydb"
  return { hostname: url.hostname, port: parseInt(url.port) || 5432 };
}
```

## Hyperdrive Integration

For PostgreSQL/MySQL, prefer Hyperdrive over raw TCP sockets (includes connection pooling):

```jsonc
{ "hyperdrive": [{ "binding": "DB", "id": "<HYPERDRIVE-ID>" }] }
```

See [Hyperdrive reference](../hyperdrive/) for complete setup.

## Compatibility

TCP Sockets are available in all modern Workers runtimes. Use the current date: `"compatibility_date": "2025-01-01"`. No special flags are required.
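As a sanity check, the connection-string parser shown earlier behaves as follows (self-contained copy; the WHATWG `URL` parser accepts non-HTTP schemes such as `postgres://`, and `url.port` is the empty string when no port is given, so the `|| 5432` default applies):

```typescript
interface SocketAddress {
  hostname: string;
  port: number;
}

// Same logic as the parser above, repeated here so the example is
// self-contained.
function parseConnectionString(connStr: string): SocketAddress {
  const url = new URL(connStr); // e.g., "postgres://10.0.1.50:5432/mydb"
  return { hostname: url.hostname, port: parseInt(url.port) || 5432 };
}
```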

## Related Configuration

- **[Tunnel Configuration](../tunnel/configuration.md)** - Detailed cloudflared setup
- **[Smart Placement](../smart-placement/configuration.md)** - Placement mode options
- **[Hyperdrive](../hyperdrive/configuration.md)** - Database connection pooling setup

diff --git a/cloudflare/references/workers-vpc/gotchas.md b/cloudflare/references/workers-vpc/gotchas.md
new file mode 100644
index 0000000..d14faae
--- /dev/null
+++ b/cloudflare/references/workers-vpc/gotchas.md
@@ -0,0 +1,167 @@
# Gotchas and Troubleshooting

Common pitfalls, limitations, and solutions for TCP Sockets in Cloudflare Workers.

## Platform Limits

### Connection Limits

| Limit | Value |
|-------|-------|
| Max concurrent sockets per request | 6 (hard limit) |
| Socket lifetime | Request duration |
| Connection timeout | Platform-dependent, no setting |

**Problem:** Exceeding 6 connections throws an error

**Solution:** Process in batches of 6

```typescript
for (let i = 0; i < hosts.length; i += 6) {
  const batch = hosts.slice(i, i + 6).map(h => connect({ hostname: h, port: 443 }));
  await Promise.all(batch.map(async s => { /* use */ await s.close(); }));
}
```

### Blocked Destinations

Cloudflare IPs (1.1.1.1), localhost (127.0.0.1), port 25 (SMTP), and the Worker's own URL are blocked for security.

**Solution:** Use public IPs or Tunnel hostnames: `connect({ hostname: "db.internal.company.net", port: 5432 })`

### Scope Requirements

**Problem:** Sockets created in global scope fail

**Cause:** Sockets tied to request lifecycle

**Solution:** Create inside handler: `export default { async fetch() { const socket = connect(...); } }`

## Common Errors

### Error: "proxy request failed"

**Causes:** Blocked destination (Cloudflare IP, localhost, port 25), DNS failure, network unreachable

**Solution:** Validate destinations, use Tunnel hostnames, catch errors with try/catch

### Error: "TCP Loop detected"

**Cause:** Worker connecting to itself

**Solution:** Connect to external service, not Worker's own hostname

### Error: "Port 25 prohibited"

**Cause:** SMTP port blocked

**Solution:** Use Email Workers API for email

### Error: "socket is not open"

**Cause:** Read/write after close

**Solution:** Always use try/finally to ensure proper closure order

### Error: Connection timeout

**Cause:** No built-in timeout

**Solution:** Use `Promise.race()`:

```typescript
const socket = connect(addr, opts);
const timeout = new Promise((_, reject) => setTimeout(() => reject(new Error('Timeout')), 5000));
await Promise.race([socket.opened, timeout]);
```

## TLS/SSL Issues

### StartTLS Timing

**Problem:** Calling `startTls()` too early

**Solution:** Send protocol-specific STARTTLS command, wait for server OK, then call `socket.startTls()`

### Certificate Validation

**Problem:** Self-signed certs fail

**Solution:** Use proper certs or Tunnel (handles TLS termination)

## Performance Issues

### Not Using Connection Pooling

**Problem:** New connection overhead per request

**Solution:** Use [Hyperdrive](../hyperdrive/) for databases (built-in pooling)

### Not Using Smart Placement

**Problem:** High latency to backend

**Solution:** Enable: `{ "placement": { "mode": "smart" } }` in wrangler.jsonc

### Forgetting to Close Sockets

**Problem:** Resource leaks

**Solution:** Always use try/finally:

```typescript
const socket = connect({ hostname: "api.internal", port: 443 });
try {
  // Use socket
} finally {
  await socket.close();
}
```

## Data Handling Issues

### Assuming Single Read Gets All Data

**Problem:** Only reading once may miss chunked data

**Solution:** Loop `reader.read()` until `done === true` (see patterns.md)

### Text Encoding Issues

**Problem:** Using wrong encoding

**Solution:** Specify encoding: `new TextDecoder('iso-8859-1').decode(data)`

## Security Issues

### SSRF Vulnerability

**Problem:** User-controlled destinations allow access to internal services

**Solution:** Validate against strict allowlist:

```typescript
const ALLOWED = ['api1.internal.net', 'api2.internal.net'];
const host = new URL(req.url).searchParams.get('host');
if (!host || !ALLOWED.includes(host)) return new Response('Forbidden', { status: 403 });
```

## When to Use Alternatives

| Use Case | Alternative | Reason |
|----------|-------------|--------|
| PostgreSQL/MySQL | [Hyperdrive](../hyperdrive/) | Connection pooling, caching |
| HTTP/HTTPS | `fetch()` | Simpler, built-in |
| HTTP with SSRF protection | VPC Services (beta 2025+) | Declarative bindings |

## Debugging Tips

1. **Log connection details:** `const info = await socket.opened; console.log(info.remoteAddress);`
2. **Test with public services first:** Use tcpbin.com:4242 echo server
3. **Verify Tunnel:** `cloudflared tunnel info <TUNNEL-NAME>` and `cloudflared tunnel route ip list`

## Related

- [Hyperdrive](../hyperdrive/) - Database connections
- [Smart Placement](../smart-placement/) - Latency optimization
- [Tunnel Troubleshooting](../tunnel/gotchas.md)

diff --git a/cloudflare/references/workers-vpc/patterns.md b/cloudflare/references/workers-vpc/patterns.md
new file mode 100644
index 0000000..392627e
--- /dev/null
+++ b/cloudflare/references/workers-vpc/patterns.md
@@ -0,0 +1,209 @@
# Common Patterns

Real-world patterns and examples for TCP Sockets in Cloudflare Workers.

```typescript
import { connect } from 'cloudflare:sockets';
```

## Basic Patterns

### Simple Request-Response

```typescript
const socket = connect({ hostname: "echo.example.com", port: 7 }, { secureTransport: "on" });
try {
  await socket.opened;
  const writer = socket.writable.getWriter();
  await writer.write(new TextEncoder().encode("Hello\n"));
  await writer.close();

  const reader = socket.readable.getReader();
  const { value } = await reader.read();
  return new Response(value);
} finally {
  await socket.close();
}
```

### Reading All Data

```typescript
async function readAll(socket: Socket): Promise<Uint8Array> {
  const reader = socket.readable.getReader();
  const chunks: Uint8Array[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
  const total = chunks.reduce((sum, c) => sum + c.length, 0);
  const result = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) { result.set(chunk, offset); offset += chunk.length; }
  return result;
}
```

### Streaming Response

```typescript
// Stream socket data directly to HTTP response
const socket = connect({ hostname: "stream.internal", port: 9000 }, { secureTransport: "on" });
const writer = socket.writable.getWriter();
await writer.write(new TextEncoder().encode("STREAM\n"));
await writer.close();
return new Response(socket.readable);
```

## Protocol Examples

### Redis RESP

```typescript
// Send: *2\r\n$3\r\nGET\r\n$<keylen>\r\n<key>\r\n
// Recv: $<len>\r\n<value>\r\n, or $-1\r\n for null
const socket = connect({ hostname: "redis.internal", port: 6379 });
const writer = socket.writable.getWriter();
await writer.write(new TextEncoder().encode(`*2\r\n$3\r\nGET\r\n$3\r\nkey\r\n`));
```

### PostgreSQL

**Use [Hyperdrive](../hyperdrive/) for production.** Raw Postgres protocol is complex (startup, auth, query messages).

### MQTT

```typescript
const socket = connect({ hostname: "mqtt.broker", port: 1883 });
const writer = socket.writable.getWriter();
// CONNECT: 0x10 0x00 0x04 "MQTT" 0x04 ...
// PUBLISH: 0x30
```

## Error Handling Patterns

### Retry with Backoff

```typescript
async function connectWithRetry(addr: SocketAddress, opts: SocketOptions, maxRetries = 3): Promise<Socket> {
  for (let i = 1; i <= maxRetries; i++) {
    try {
      const socket = connect(addr, opts);
      await socket.opened;
      return socket;
    } catch (error) {
      if (i === maxRetries) throw error;
      await new Promise(r => setTimeout(r, 1000 * Math.pow(2, i - 1))); // Exponential backoff
    }
  }
  throw new Error('Unreachable');
}
```

### Timeout

```typescript
async function connectWithTimeout(addr: SocketAddress, opts: SocketOptions, ms = 5000): Promise<Socket> {
  const socket = connect(addr, opts);
  const timeout = new Promise((_, reject) => setTimeout(() => reject(new Error('Timeout')), ms));
  await Promise.race([socket.opened, timeout]);
  return socket;
}
```

### Fallback

```typescript
async function connectWithFallback(primary: string, fallback: string, port: number): Promise<Socket> {
  try {
    const socket = connect({ hostname: primary, port }, { secureTransport: "on" });
    await socket.opened;
    return socket;
  } catch {
    return connect({ hostname: fallback, port }, { secureTransport: "on" });
  }
}
```

## Security Patterns

### Destination Allowlist (Prevent SSRF)

```typescript
const ALLOWED_HOSTS = ['db.internal.company.net', 'api.internal.company.net', /^10\.0\.1\.\d+$/];

function isAllowed(hostname: string): boolean {
  return ALLOWED_HOSTS.some(p => p instanceof RegExp ? p.test(hostname) : p === hostname);
}

export default {
  async fetch(req: Request): Promise<Response> {
    const target = new URL(req.url).searchParams.get('host');
    if (!target || !isAllowed(target)) return new Response('Forbidden', { status: 403 });
    const socket = connect({ hostname: target, port: 443 });
    // Use socket...
  }
};
```

### Connection Pooling

```typescript
// Note: sockets do not survive across requests - reuse only within a single request
class SocketPool {
  private pool = new Map<string, Socket[]>();

  async acquire(hostname: string, port: number): Promise<Socket> {
    const key = `${hostname}:${port}`;
    const sockets = this.pool.get(key) || [];
    if (sockets.length > 0) return sockets.pop()!;
    const socket = connect({ hostname, port }, { secureTransport: "on" });
    await socket.opened;
    return socket;
  }

  release(hostname: string, port: number, socket: Socket): void {
    const key = `${hostname}:${port}`;
    const sockets = this.pool.get(key) || [];
    if (sockets.length < 3) { sockets.push(socket); this.pool.set(key, sockets); }
    else socket.close();
  }
}
```

## Multi-Protocol Gateway

```typescript
interface Protocol { name: string; defaultPort: number; test(host: string, port: number): Promise<string>; }

const PROTOCOLS: Record<string, Protocol> = {
  redis: {
    name: 'redis',
    defaultPort: 6379,
    async test(host, port) {
      const socket = connect({ hostname: host, port });
      try {
        const writer = socket.writable.getWriter();
        await writer.write(new TextEncoder().encode('*1\r\n$4\r\nPING\r\n'));
        writer.releaseLock();
        const reader = socket.readable.getReader();
        const { value } = await reader.read();
        return new TextDecoder().decode(value || new Uint8Array());
      } finally { await socket.close(); }
    }
  }
};

export default {
  async fetch(req: Request): Promise<Response> {
    const url = new URL(req.url);
    const proto = url.pathname.slice(1); // /redis
    const host = url.searchParams.get('host');
    if (!host || !PROTOCOLS[proto]) return new Response('Invalid', { status: 400 });
    const result = await PROTOCOLS[proto].test(host, parseInt(url.searchParams.get('port') || '') || PROTOCOLS[proto].defaultPort);
    return new Response(result);
  }
};
```

diff --git a/cloudflare/references/workers/README.md b/cloudflare/references/workers/README.md
new file mode 100644
index 0000000..d5b04e1
--- /dev/null
+++ b/cloudflare/references/workers/README.md
@@ -0,0 +1,108 @@
# Cloudflare Workers

Expert guidance for building, deploying, and optimizing Cloudflare Workers applications.

## Overview

Cloudflare Workers run on V8 isolates (NOT containers/VMs):
- Extremely fast cold starts (< 1ms)
- Global deployment across 300+ locations
- Web standards compliant (fetch, URL, Headers, Request, Response)
- Support JS/TS, Python, Rust, and WebAssembly

**Key principle**: Workers use web platform APIs wherever possible for portability.
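A practical consequence of that web-standards principle: routing logic written against plain `Request`/`Response` objects can be exercised outside the Workers runtime (for example under Node 18+, which ships the same globals). A minimal sketch with a made-up `/health` route; `env` and `ctx` are omitted:

```typescript
// Pure routing logic against standard Request/Response - no Workers
// runtime APIs involved, so it is unit-testable anywhere.
async function handle(request: Request): Promise<Response> {
  const url = new URL(request.url);
  if (url.pathname === '/health') {
    return new Response(JSON.stringify({ ok: true }), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
  return new Response('Not Found', { status: 404 });
}
```

A module Worker would simply delegate: `export default { fetch: (req) => handle(req) }`.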

## Module Worker Pattern (Recommended)

```typescript
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    return new Response('Hello World!');
  },
};
```

**Handler parameters**:
- `request`: Incoming HTTP request (standard Request object)
- `env`: Environment bindings (KV, D1, R2, secrets, vars)
- `ctx`: Execution context (`waitUntil`, `passThroughOnException`)

## Essential Commands

```bash
npx wrangler dev                  # Local dev
npx wrangler dev --remote         # Remote dev (actual resources)
npx wrangler deploy               # Production
npx wrangler deploy --env staging # Specific environment
npx wrangler tail                 # Stream logs
npx wrangler secret put API_KEY   # Set secret
```

## When to Use Workers

- API endpoints at the edge
- Request/response transformation
- Authentication/authorization layers
- Static asset optimization
- A/B testing and feature flags
- Rate limiting and security
- Proxy/routing logic
- WebSocket applications

## Quick Start

```bash
npm create cloudflare@latest my-worker -- --type hello-world
cd my-worker
npx wrangler dev
```

## Handler Signatures

```typescript
// HTTP requests
async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response>

// Cron triggers
async scheduled(event: ScheduledEvent, env: Env, ctx: ExecutionContext): Promise<void>

// Queue consumer
async queue(batch: MessageBatch, env: Env, ctx: ExecutionContext): Promise<void>

// Tail consumer
async tail(events: TraceItem[], env: Env, ctx: ExecutionContext): Promise<void>
```

## Resources

**Docs**: https://developers.cloudflare.com/workers/
**Examples**: https://developers.cloudflare.com/workers/examples/
**Runtime APIs**: https://developers.cloudflare.com/workers/runtime-apis/

## In This Reference

- [Configuration](./configuration.md) - wrangler.jsonc setup, bindings, environments
- [API](./api.md) - Runtime APIs, bindings, execution context
- [Patterns](./patterns.md) - Common workflows, testing, optimization
- [Frameworks](./frameworks.md) - Hono, routing, validation
- [Gotchas](./gotchas.md) - Common issues, limits, troubleshooting

## Reading Order

| Task | Start With | Then Read |
|------|------------|-----------|
| First Worker | README → Configuration → API | Patterns |
| Add framework | Frameworks | Configuration (bindings) |
| Add storage/bindings | Configuration → API (binding usage) | See Also links |
| Debug issues | Gotchas | API (specific binding docs) |
| Production optimization | Patterns | API (caching, streaming) |
| Type safety | Configuration (TypeScript) | Frameworks (Hono typing) |

## See Also

- [KV](../kv/README.md) - Key-value storage
- [D1](../d1/README.md) - SQL database
- [R2](../r2/README.md) - Object storage
- [Durable Objects](../durable-objects/README.md) - Stateful coordination
- [Queues](../queues/README.md) - Message queues
- [Wrangler](../wrangler/README.md) - CLI tool reference

diff --git a/cloudflare/references/workers/api.md b/cloudflare/references/workers/api.md
new file mode 100644
index 0000000..a4fc13f
--- /dev/null
+++ b/cloudflare/references/workers/api.md
@@ -0,0 +1,195 @@
# Workers Runtime APIs

## Fetch Handler

```typescript
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const url = new URL(request.url);
    if (request.method === 'POST' && url.pathname === '/api') {
      const body = await request.json();
      return new Response(JSON.stringify({ id: 1 }), {
        headers: { 'Content-Type': 'application/json' }
      });
    }
    return fetch(request); // Subrequest to origin
  },
};
```

## Execution Context

```typescript
ctx.waitUntil(logAnalytics(request)); // Background work, don't block response
ctx.passThroughOnException();         // Failover to origin on error
```

**Never** `await` background operations - use `ctx.waitUntil()`.
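The `ctx.waitUntil()` rule can be made concrete with a minimal `ExecutionContext` stand-in; `logAnalytics` and the `logged` array below are made-up placeholders for a real analytics call:

```typescript
// Respond immediately; defer logging via waitUntil so the response
// is not blocked on the analytics call.
interface Ctx {
  waitUntil(promise: Promise<unknown>): void;
}

const logged: string[] = [];

async function logAnalytics(path: string): Promise<void> {
  logged.push(path); // stand-in for a real analytics write
}

function handle(request: Request, ctx: Ctx): Response {
  ctx.waitUntil(logAnalytics(new URL(request.url).pathname));
  return new Response('ok'); // returned without awaiting the log call
}
```

The runtime keeps the Worker alive until every promise passed to `waitUntil` settles, which is exactly what the stand-in makes visible in tests.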

## Bindings

```typescript
// KV
await env.MY_KV.get('key');
await env.MY_KV.put('key', 'value', { expirationTtl: 3600 });

// R2
const obj = await env.MY_BUCKET.get('file.txt');
await env.MY_BUCKET.put('file.txt', 'content');

// D1
const result = await env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(1).first();

// D1 Sessions (2024+) - read-after-write consistency
const session = env.DB.withSession();
await session.prepare('INSERT INTO users (name) VALUES (?)').bind('Alice').run();
const user = await session.prepare('SELECT * FROM users WHERE name = ?').bind('Alice').first(); // Guaranteed fresh

// Queues
await env.MY_QUEUE.send({ timestamp: Date.now() });

// Secrets/vars
const key = env.API_KEY;
```

## Cache API

```typescript
const cache = caches.default;
let response = await cache.match(request);

if (!response) {
  response = await fetch(request);
  response = new Response(response.body, response);
  response.headers.set('Cache-Control', 'max-age=3600');
  ctx.waitUntil(cache.put(request, response.clone())); // Clone before caching
}
```

## HTMLRewriter

```typescript
return new HTMLRewriter()
  .on('a[href]', {
    element(el) {
      const href = el.getAttribute('href');
      if (href?.startsWith('http://')) {
        el.setAttribute('href', href.replace('http://', 'https://'));
      }
    }
  })
  .transform(response);
```

**Use cases**: A/B testing, analytics injection, link rewriting

## WebSockets

### Standard WebSocket

```typescript
const [client, server] = Object.values(new WebSocketPair());

server.accept();
server.addEventListener('message', event => {
  server.send(`Echo: ${event.data}`);
});

return new Response(null, { status: 101, webSocket: client });
```

### WebSocket Hibernation (Recommended for idle connections)

```typescript
// In Durable Object
export class WebSocketDO {
  async webSocketMessage(ws: WebSocket, message: string) {
    ws.send(`Echo: ${message}`);
  }

  async webSocketClose(ws: WebSocket, code: number, reason: string) {
    // Cleanup on close
  }

  async webSocketError(ws: WebSocket, error: Error) {
    console.error('WebSocket error:', error);
  }
}
```

Hibernation automatically suspends inactive connections (no CPU cost) and wakes on events.

## Durable Objects

### RPC Pattern (Recommended 2024+)

```typescript
import { DurableObject } from 'cloudflare:workers';

// RPC requires extending the DurableObject base class
export class Counter extends DurableObject {
  private value = 0;

  constructor(ctx: DurableObjectState, env: Env) {
    super(ctx, env);
    ctx.blockConcurrencyWhile(async () => {
      this.value = (await ctx.storage.get<number>('value')) || 0;
    });
  }

  // Export methods directly - called via RPC (type-safe, zero serialization)
  async increment(): Promise<number> {
    this.value++;
    await this.ctx.storage.put('value', this.value);
    return this.value;
  }

  async getValue(): Promise<number> {
    return this.value;
  }
}

// Worker usage:
const stub = env.COUNTER.get(env.COUNTER.idFromName('global'));
const count = await stub.increment(); // Direct method call, full type safety
```

### Legacy Fetch Pattern (Pre-2024)

```typescript
async fetch(request: Request): Promise<Response> {
  const url = new URL(request.url);
  if (url.pathname === '/increment') {
    await this.state.storage.put('value', ++this.value);
  }
  return new Response(String(this.value));
}
// Usage: await stub.fetch('http://x/increment')
```

**When to use DOs**: Real-time collaboration, rate limiting, strongly consistent state

## Other Handlers

```typescript
// Cron:  async scheduled(event, env, ctx) { ctx.waitUntil(doCleanup(env)); }
// Queue: async queue(batch) { for (const msg of batch.messages) { await process(msg.body); msg.ack(); } }
// Tail:  async tail(events, env) { for (const e of events) if (e.outcome === 'exception') await log(e); }
```

## Service Bindings

```typescript
// Worker-to-worker RPC (zero latency, no internet round-trip)
return env.SERVICE_B.fetch(request);

// With RPC (2024+) - same as Durable Objects RPC; extend WorkerEntrypoint
import { WorkerEntrypoint } from 'cloudflare:workers';

export class ServiceWorker extends WorkerEntrypoint {
  async getData() { return { data: 'value' }; }
}
// Usage: const data = await env.SERVICE_B.getData();
```

**Benefits**: Type-safe method calls, no HTTP overhead, share code between Workers

## See Also

- [Configuration](./configuration.md) - Binding setup
- [Patterns](./patterns.md) - Common workflows
- [KV](../kv/README.md), [D1](../d1/README.md), [R2](../r2/README.md), [Durable Objects](../durable-objects/README.md), [Queues](../queues/README.md)

diff --git a/cloudflare/references/workers/configuration.md b/cloudflare/references/workers/configuration.md
new file mode 100644
index 0000000..9eae70b
--- /dev/null
+++ b/cloudflare/references/workers/configuration.md
@@ -0,0 +1,185 @@
# Workers Configuration

## wrangler.jsonc (Recommended)

```jsonc
{
  "$schema": "./node_modules/wrangler/config-schema.json",
  "name": "my-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-01-01", // Use current date for new projects

  // Bindings (non-inheritable)
  "vars": { "ENVIRONMENT": "production" },
  "kv_namespaces": [{ "binding": "MY_KV", "id": "abc123" }],
  "r2_buckets": [{ "binding": "MY_BUCKET", "bucket_name": "my-bucket" }],
  "d1_databases": [{ "binding": "DB", "database_name": "my-db", "database_id": "xyz789" }],

  // Environments
  "env": {
    "staging": {
      "vars": { "ENVIRONMENT": "staging" },
      "kv_namespaces": [{ "binding": "MY_KV", "id": "staging-id" }]
    }
  }
}
```

## Configuration Rules

**Inheritable**: `name`, `main`, `compatibility_date`, `routes`, `workers_dev`
**Non-inheritable**: All bindings (`vars`, `kv_namespaces`, `r2_buckets`, etc.)
**Top-level only**: `migrations`, `keep_vars`, `send_metrics`

**ALWAYS set `compatibility_date` to the current date for new projects**

## Bindings

```jsonc
{
  // Environment variables - access via env.VAR_NAME
  "vars": { "ENVIRONMENT": "production" },

  // KV (key-value storage)
  "kv_namespaces": [{ "binding": "MY_KV", "id": "abc123" }],

  // R2 (object storage)
  "r2_buckets": [{ "binding": "MY_BUCKET", "bucket_name": "my-bucket" }],

  // D1 (SQL database)
  "d1_databases": [{ "binding": "DB", "database_name": "my-db", "database_id": "xyz789" }],

  // Durable Objects (stateful coordination)
  "durable_objects": {
    "bindings": [{ "name": "COUNTER", "class_name": "Counter" }]
  },

  // Queues (message queues)
  "queues": {
    "producers": [{ "binding": "MY_QUEUE", "queue": "my-queue" }],
    "consumers": [{ "queue": "my-queue", "max_batch_size": 10 }]
  },

  // Service bindings (worker-to-worker RPC)
  "services": [{ "binding": "SERVICE_B", "service": "service-b" }],

  // Analytics Engine
  "analytics_engine_datasets": [{ "binding": "ANALYTICS" }]
}
```

### Secrets

Set via CLI (never in config):

```bash
npx wrangler secret put API_KEY
```

Access: `env.API_KEY`

### Automatic Provisioning (Beta)

Bindings without IDs are auto-created:

```jsonc
{ "kv_namespaces": [{ "binding": "MY_KV" }] } // ID added on deploy
```

## Routes & Triggers

```jsonc
{
  "routes": [
    { "pattern": "example.com/*", "zone_name": "example.com" }
  ],
  "triggers": {
    "crons": ["0 */6 * * *"] // Every 6 hours
  }
}
```

## TypeScript Setup

### Automatic Type Generation (Recommended)

```bash
npm install -D @cloudflare/workers-types
npx wrangler types # Generates .wrangler/types/runtime.d.ts from wrangler.jsonc
```

`tsconfig.json`:

```jsonc
{
  "compilerOptions": {
    "target": "ES2022",
    "lib": ["ES2022"],
    "types": ["@cloudflare/workers-types"]
  },
  "include": [".wrangler/types/**/*.ts", "src/**/*"]
}
```

Import
generated types:

```typescript
import type { Env } from './.wrangler/types/runtime';

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    await env.MY_KV.get('key'); // Fully typed, autocomplete works
    return new Response('OK');
  },
};
```

Re-run `npx wrangler types` after changing bindings in wrangler.jsonc.

### Manual Type Definition (Legacy)

```typescript
interface Env {
  MY_KV: KVNamespace;
  DB: D1Database;
  API_KEY: string;
}
```

## Advanced Options

```jsonc
{
  // Auto-locate compute near data sources
  "placement": { "mode": "smart" },

  // Enable Node.js built-ins (Buffer, process, path, etc.)
  "compatibility_flags": ["nodejs_compat_v2"],

  // Observability (10% sampling)
  "observability": { "enabled": true, "head_sampling_rate": 0.1 }
}
```

### Node.js Compatibility

`nodejs_compat_v2` enables:
- `Buffer`, `process.env`, `path`, `stream`
- CommonJS `require()` for Node modules
- `node:` imports (e.g., `import { Buffer } from 'node:buffer'`)

**Note:** Adds ~1-2ms cold start overhead. Use Workers APIs (R2, KV) when possible.

## Deployment Commands

```bash
npx wrangler deploy # Production
npx wrangler deploy --env staging
npx wrangler deploy --dry-run # Validate only
```

## See Also

- [API](./api.md) - Runtime APIs and bindings usage
- [Patterns](./patterns.md) - Deployment strategies
- [Wrangler](../wrangler/README.md) - CLI reference
diff --git a/cloudflare/references/workers/frameworks.md b/cloudflare/references/workers/frameworks.md
new file mode 100644
index 0000000..6089e6d
--- /dev/null
+++ b/cloudflare/references/workers/frameworks.md
@@ -0,0 +1,197 @@
# Workers Frameworks

## Hono (Recommended)

Workers-native web framework with excellent TypeScript support and middleware ecosystem.
+ +```bash +npm install hono +``` + +### Basic Setup + +```typescript +import { Hono } from 'hono'; + +const app = new Hono(); + +app.get('/', (c) => c.text('Hello World!')); +app.post('/api/users', async (c) => { + const body = await c.req.json(); + return c.json({ id: 1, ...body }, 201); +}); + +export default app; +``` + +### Typed Environment + +```typescript +import type { Env } from './.wrangler/types/runtime'; + +const app = new Hono<{ Bindings: Env }>(); + +app.get('/data', async (c) => { + const value = await c.env.MY_KV.get('key'); // Fully typed + return c.text(value || 'Not found'); +}); +``` + +### Middleware + +```typescript +import { cors } from 'hono/cors'; +import { logger } from 'hono/logger'; + +app.use('*', logger()); +app.use('/api/*', cors({ origin: '*' })); + +// Custom middleware +app.use('/protected/*', async (c, next) => { + const auth = c.req.header('Authorization'); + if (!auth?.startsWith('Bearer ')) return c.text('Unauthorized', 401); + await next(); +}); +``` + +### Request Validation (Zod) + +```typescript +import { zValidator } from '@hono/zod-validator'; +import { z } from 'zod'; + +const schema = z.object({ + name: z.string().min(1), + email: z.string().email(), +}); + +app.post('/users', zValidator('json', schema), async (c) => { + const validated = c.req.valid('json'); // Type-safe, validated data + return c.json({ id: 1, ...validated }); +}); +``` + +**Error handling**: Automatic 400 response with validation errors + +### Route Groups + +```typescript +const api = new Hono().basePath('/api'); + +api.get('/users', (c) => c.json([])); +api.post('/users', (c) => c.json({ id: 1 })); + +app.route('/', api); // Mounts at /api/* +``` + +### Error Handling + +```typescript +app.onError((err, c) => { + console.error(err); + return c.json({ error: err.message }, 500); +}); + +app.notFound((c) => c.json({ error: 'Not Found' }, 404)); +``` + +### Accessing ExecutionContext + +```typescript +export default { + fetch(request: Request, env: 
Env, ctx: ExecutionContext) { + return app.fetch(request, env, ctx); + }, +}; + +// In route handlers: +app.get('/log', (c) => { + c.executionCtx.waitUntil(logRequest(c.req)); + return c.text('OK'); +}); +``` + +### OpenAPI/Swagger (Hono OpenAPI) + +```typescript +import { OpenAPIHono, createRoute, z } from '@hono/zod-openapi'; + +const app = new OpenAPIHono(); + +const route = createRoute({ + method: 'get', + path: '/users/{id}', + request: { params: z.object({ id: z.string() }) }, + responses: { + 200: { description: 'User found', content: { 'application/json': { schema: z.object({ id: z.string() }) } } }, + }, +}); + +app.openapi(route, (c) => { + const { id } = c.req.valid('param'); + return c.json({ id }); +}); + +app.doc('/openapi.json', { openapi: '3.0.0', info: { version: '1.0.0', title: 'API' } }); +``` + +### Testing with Hono + +```typescript +import { describe, it, expect } from 'vitest'; +import app from '../src/index'; + +describe('API', () => { + it('GET /', async () => { + const res = await app.request('/'); + expect(res.status).toBe(200); + expect(await res.text()).toBe('Hello World!'); + }); +}); +``` + +## Other Frameworks + +### itty-router (Minimalist) + +```typescript +import { Router } from 'itty-router'; + +const router = Router(); + +router.get('/users/:id', ({ params }) => new Response(params.id)); + +export default { fetch: router.handle }; +``` + +**Use case**: Tiny bundle size (~500 bytes), simple routing needs + +### Worktop (Advanced) + +```typescript +import { Router } from 'worktop'; + +const router = new Router(); + +router.add('GET', '/users/:id', (req, res) => { + res.send(200, { id: req.params.id }); +}); + +router.listen(); +``` + +**Use case**: Advanced routing, built-in CORS/cache utilities + +## Framework Comparison + +| Framework | Bundle Size | TypeScript | Middleware | Validation | Best For | +|-----------|-------------|------------|------------|------------|----------| +| Hono | ~12KB | Excellent | Rich | Zod | 
Production apps |
| itty-router | ~500B | Good | Basic | Manual | Minimal APIs |
| Worktop | ~8KB | Good | Advanced | Manual | Complex routing |

## See Also

- [Patterns](./patterns.md) - Common workflows
- [API](./api.md) - Runtime APIs
- [Gotchas](./gotchas.md) - Framework-specific issues
diff --git a/cloudflare/references/workers/gotchas.md b/cloudflare/references/workers/gotchas.md
new file mode 100644
index 0000000..d01811b
--- /dev/null
+++ b/cloudflare/references/workers/gotchas.md
@@ -0,0 +1,136 @@
# Workers Gotchas

## Common Errors

### "Too much CPU time used"

**Cause:** Worker exceeded CPU time limit (10ms standard, 30s unbound)
**Solution:** Use `ctx.waitUntil()` for background work, offload heavy compute to Durable Objects, or consider Workers AI for ML workloads

### "Module-Level State Lost"

**Cause:** Workers are stateless between requests; module-level variables reset unpredictably
**Solution:** Use KV, D1, or Durable Objects for persistent state; don't rely on module-level variables

### "Body has already been used"

**Cause:** Attempting to read response body twice (bodies are streams)
**Solution:** Clone the response before reading: `response.clone()`, or read once and create a new Response with the text

### "Node.js module not found"

**Cause:** Node.js built-ins not available by default
**Solution:** Use Workers APIs (e.g., R2 for file storage) or enable Node.js compat with `"compatibility_flags": ["nodejs_compat_v2"]`

### "Cannot fetch in global scope"

**Cause:** Attempting to use fetch during module initialization
**Solution:** Move fetch calls inside handler functions (fetch, scheduled, etc.)
where they're allowed + +### "Subrequest depth limit exceeded" + +**Cause:** Too many nested subrequests creating deep call chain +**Solution:** Flatten request chain or use service bindings for direct Worker-to-Worker communication + +### "D1 read-after-write inconsistency" + +**Cause:** D1 is eventually consistent; reads may not reflect recent writes +**Solution:** Use D1 Sessions (2024+) to guarantee read-after-write consistency within a session: + +```typescript +const session = env.DB.withSession(); +await session.prepare('INSERT INTO users (name) VALUES (?)').bind('Alice').run(); +const user = await session.prepare('SELECT * FROM users WHERE name = ?').bind('Alice').first(); // Guaranteed to see Alice +``` + +**When to use sessions:** Write → Read patterns, transactions requiring consistency + +### "wrangler types not generating TypeScript definitions" + +**Cause:** Type generation not configured or outdated +**Solution:** Run `npx wrangler types` after changing bindings in wrangler.jsonc: + +```bash +npx wrangler types # Generates .wrangler/types/runtime.d.ts +``` + +Add to `tsconfig.json`: `"include": [".wrangler/types/**/*.ts"]` + +Then import: `import type { Env } from './.wrangler/types/runtime';` + +### "Durable Object RPC errors with deprecated fetch pattern" + +**Cause:** Using old `stub.fetch()` pattern instead of RPC (2024+) +**Solution:** Export methods directly, call via RPC: + +```typescript +// ❌ Old fetch pattern +export class MyDO { + async fetch(request: Request) { + const { method } = await request.json(); + if (method === 'increment') return new Response(String(await this.increment())); + } + async increment() { return ++this.value; } +} +const stub = env.DO.get(id); +const res = await stub.fetch('http://x', { method: 'POST', body: JSON.stringify({ method: 'increment' }) }); + +// ✅ RPC pattern (type-safe, no serialization overhead) +export class MyDO { + async increment() { return ++this.value; } +} +const stub = env.DO.get(id); +const 
count = await stub.increment(); // Direct method call
```

### "WebSocket connection closes unexpectedly"

**Cause:** Worker reaches CPU limit while maintaining WebSocket connection
**Solution:** Use WebSocket hibernation (2024+) to offload idle connections:

```typescript
export class WebSocketDO {
  async webSocketMessage(ws: WebSocket, message: string) {
    // Handle message
  }
  async webSocketClose(ws: WebSocket, code: number) {
    // Cleanup
  }
}
```

Hibernation automatically suspends inactive connections and wakes them on events.

### "Framework middleware not working with Workers"

**Cause:** Framework expects Node.js primitives (e.g., Express uses Node streams)
**Solution:** Use Workers-native frameworks (Hono, itty-router, Worktop) or adapt middleware:

```typescript
// ✅ Hono (Workers-native)
import { Hono } from 'hono';
const app = new Hono();
app.use('*', async (c, next) => { /* middleware */ await next(); });
```

See [frameworks.md](./frameworks.md) for full patterns

## Limits

| Limit | Value | Notes |
|-------|-------|-------|
| Request size | 100 MB | Maximum incoming request size |
| Response size | Unlimited | Supports streaming |
| CPU time (standard) | 10ms | Standard Workers |
| CPU time (unbound) | 30s | Unbound Workers |
| Subrequests | 1000 | Per request |
| KV reads | 1000 | Per request |
| KV write size | 25 MB | Maximum per write |
| Environment size | 5 MB | Total size of env bindings |

## See Also

- [Patterns](./patterns.md) - Best practices
- [API](./api.md) - Runtime APIs
- [Configuration](./configuration.md) - Setup
- [Frameworks](./frameworks.md) - Hono, routing, validation
diff --git a/cloudflare/references/workers/patterns.md b/cloudflare/references/workers/patterns.md
new file mode 100644
index 0000000..7768c4d
--- /dev/null
+++ b/cloudflare/references/workers/patterns.md
@@ -0,0 +1,198 @@
# Workers Patterns

## Error Handling

```typescript
class HTTPError extends Error {
constructor(public status: number, message: string) { super(message); } +} + +export default { + async fetch(request: Request, env: Env): Promise { + try { + return await handleRequest(request, env); + } catch (error) { + if (error instanceof HTTPError) { + return new Response(JSON.stringify({ error: error.message }), { + status: error.status, headers: { 'Content-Type': 'application/json' } + }); + } + return new Response('Internal Server Error', { status: 500 }); + } + }, +}; +``` + +## CORS + +```typescript +const corsHeaders = { 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE, OPTIONS' }; +if (request.method === 'OPTIONS') return new Response(null, { headers: corsHeaders }); +``` + +## Routing + +```typescript +const router = { 'GET /api/users': handleGetUsers, 'POST /api/users': handleCreateUser }; + +const handler = router[`${request.method} ${url.pathname}`]; +return handler ? handler(request, env) : new Response('Not Found', { status: 404 }); +``` + +**Production**: Use Hono, itty-router, or Worktop (see [frameworks.md](./frameworks.md)) + +## Request Validation (Zod) + +```typescript +import { z } from 'zod'; + +const userSchema = z.object({ + name: z.string().min(1).max(100), + email: z.string().email(), + age: z.number().int().positive().optional(), +}); + +async function handleCreateUser(request: Request) { + try { + const body = await request.json(); + const validated = userSchema.parse(body); // Throws on invalid data + return new Response(JSON.stringify({ id: 1, ...validated }), { + status: 201, + headers: { 'Content-Type': 'application/json' }, + }); + } catch (err) { + if (err instanceof z.ZodError) { + return new Response(JSON.stringify({ errors: err.errors }), { status: 400 }); + } + throw err; + } +} +``` + +**With Hono**: Use `@hono/zod-validator` for automatic validation (see [frameworks.md](./frameworks.md)) + +## Performance + +```typescript +// ❌ Sequential +const user = await fetch('/api/user/1'); 
+const posts = await fetch('/api/posts?user=1'); + +// ✅ Parallel +const [user, posts] = await Promise.all([fetch('/api/user/1'), fetch('/api/posts?user=1')]); +``` + +## Streaming + +```typescript +const stream = new ReadableStream({ + async start(controller) { + for (let i = 0; i < 1000; i++) { + controller.enqueue(new TextEncoder().encode(`Item ${i}\n`)); + if (i % 100 === 0) await new Promise(r => setTimeout(r, 0)); + } + controller.close(); + } +}); +``` + +## Transform Streams + +```typescript +response.body.pipeThrough(new TextDecoderStream()).pipeThrough( + new TransformStream({ transform(chunk, c) { c.enqueue(chunk.toUpperCase()); } }) +).pipeThrough(new TextEncoderStream()); +``` + +## Testing + +```typescript +import { describe, it, expect } from 'vitest'; +import worker from '../src/index'; + +describe('Worker', () => { + it('returns 200', async () => { + const req = new Request('http://localhost/'); + const env = { MY_VAR: 'test' }; + const ctx = { waitUntil: () => {}, passThroughOnException: () => {} }; + expect((await worker.fetch(req, env, ctx)).status).toBe(200); + }); +}); +``` + +## Deployment + +```bash +npx wrangler deploy # production +npx wrangler deploy --env staging +npx wrangler versions upload --message "Add feature" +npx wrangler rollback +``` + +## Monitoring + +```typescript +const start = Date.now(); +const response = await handleRequest(request, env); +ctx.waitUntil(env.ANALYTICS.writeDataPoint({ + doubles: [Date.now() - start], blobs: [request.url, String(response.status)] +})); +``` + +## Security & Rate Limiting + +```typescript +// Security headers +const security = { 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'DENY' }; + +// Auth +const auth = request.headers.get('Authorization'); +if (!auth?.startsWith('Bearer ')) return new Response('Unauthorized', { status: 401 }); + +// Gradual rollouts (deterministic user bucketing) +const hash = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(userId)); +if (new 
Uint8Array(hash)[0] % 100 < rolloutPercent) return newFeature(request); +``` + +Rate limiting: See [Durable Objects](../durable-objects/README.md) + +## R2 Multipart Upload + +```typescript +// For files > 100MB +const upload = await env.MY_BUCKET.createMultipartUpload('large-file.bin'); +try { + const parts = []; + for (let i = 0; i < chunks.length; i++) { + parts.push(await upload.uploadPart(i + 1, chunks[i])); + } + await upload.complete(parts); +} catch (err) { await upload.abort(); throw err; } +``` + +Parallel uploads, resume on failure, handle files > 5GB + +## Workflows (Step Orchestration) + +```typescript +import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers'; + +export class MyWorkflow extends WorkflowEntrypoint { + async run(event: WorkflowEvent<{ userId: string }>, step: WorkflowStep) { + const user = await step.do('fetch-user', async () => + fetch(`/api/users/${event.payload.userId}`).then(r => r.json()) + ); + await step.sleep('wait', '1 hour'); + await step.do('notify', async () => sendEmail(user.email)); + } +} +``` + +Multi-step jobs with automatic retries, state persistence, resume from failure + +## See Also + +- [API](./api.md) - Runtime APIs +- [Gotchas](./gotchas.md) - Common issues +- [Configuration](./configuration.md) - Setup +- [Frameworks](./frameworks.md) - Hono, routing, validation diff --git a/cloudflare/references/workflows/README.md b/cloudflare/references/workflows/README.md new file mode 100644 index 0000000..561f907 --- /dev/null +++ b/cloudflare/references/workflows/README.md @@ -0,0 +1,69 @@ +# Cloudflare Workflows + +Durable multi-step applications with automatic retries, state persistence, and long-running execution. 
## What It Does

- Chain steps with automatic retry logic
- Persist state between steps (minutes → weeks)
- Handle failures without losing progress
- Wait for external events/approvals
- Sleep without consuming resources

**Available:** Free & Paid Workers plans

## Core Concepts

**Workflow**: Class extending `WorkflowEntrypoint` with `run` method
**Instance**: Single execution with unique ID & independent state
**Steps**: Independently retriable units via `step.do()` - API calls, DB queries, AI invocations
**State**: Persisted from step returns; step name = cache key

## Quick Start

```typescript
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers';

type Env = { MY_WORKFLOW: Workflow; DB: D1Database };
type Params = { userId: string };

export class MyWorkflow extends WorkflowEntrypoint<Env, Params> {
  async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
    const user = await step.do('fetch user', async () => {
      return await this.env.DB.prepare('SELECT * FROM users WHERE id = ?')
        .bind(event.params.userId).first();
    });

    await step.sleep('wait 7 days', '7 days');

    await step.do('send reminder', async () => {
      await sendEmail(user.email, 'Reminder!');
    });
  }
}
```

## Key Features

- **Durability**: Failed steps don't re-run successful ones
- **Retries**: Configurable backoff (constant/linear/exponential)
- **Events**: `waitForEvent()` for webhooks/approvals (timeout: 1h → 365d)
- **Sleep**: `sleep()` / `sleepUntil()` for scheduling (max 365d)
- **Parallel**: `Promise.all()` for concurrent steps
- **Idempotency**: Check-then-execute patterns

## Reading Order

**Getting Started:** configuration.md → api.md → patterns.md
**Troubleshooting:** gotchas.md

## In This Reference
- [configuration.md](./configuration.md) - wrangler.jsonc setup, step config, bindings
- [api.md](./api.md) - Step APIs, instance management, sleep/parameters
- [patterns.md](./patterns.md) - Common workflows,
testing, orchestration +- [gotchas.md](./gotchas.md) - Timeouts, limits, debugging strategies + +## See Also +- [durable-objects](../durable-objects/) - Alternative stateful approach +- [queues](../queues/) - Message-driven workflows +- [workers](../workers/) - Entry point for workflow instances diff --git a/cloudflare/references/workflows/api.md b/cloudflare/references/workflows/api.md new file mode 100644 index 0000000..7ac5625 --- /dev/null +++ b/cloudflare/references/workflows/api.md @@ -0,0 +1,185 @@ +# Workflow APIs + +## Step APIs + +```typescript +// step.do() +const result = await step.do('step name', async () => { /* logic */ }); +const result = await step.do('step name', { retries, timeout }, async () => {}); + +// step.sleep() +await step.sleep('description', '1 hour'); +await step.sleep('description', 5000); // ms + +// step.sleepUntil() +await step.sleepUntil('description', Date.parse('2024-12-31')); + +// step.waitForEvent() +const data = await step.waitForEvent('wait', {event: 'webhook-type', timeout: '24h'}); // Default 24h, max 365d +try { const event = await step.waitForEvent('wait', { event: 'approval', timeout: '1h' }); } catch (e) { /* Timeout */ } +``` + +## Instance Management + +```typescript +// Create single +const instance = await env.MY_WORKFLOW.create({id: crypto.randomUUID(), params: { userId: 'user123' }}); // id optional, auto-generated if omitted + +// Create with custom retention (default: 3 days free, 30 days paid) +const instance = await env.MY_WORKFLOW.create({ + id: crypto.randomUUID(), + params: { userId: 'user123' }, + retention: '30 days' // Override default retention period +}); + +// Batch (max 100, idempotent: skips existing IDs) +const instances = await env.MY_WORKFLOW.createBatch([{id: 'user1', params: {name: 'John'}}, {id: 'user2', params: {name: 'Jane'}}]); + +// Get & Status +const instance = await env.MY_WORKFLOW.get('instance-id'); +const status = await instance.status(); // {status: 'queued' | 'running' | 
'paused' | 'errored' | 'terminated' | 'complete' | 'waiting' | 'waitingForPause' | 'unknown', error?, output?} + +// Control +await instance.pause(); await instance.resume(); await instance.terminate(); await instance.restart(); + +// Send Events +await instance.sendEvent({type: 'approval', payload: { approved: true }}); // Must match waitForEvent type +``` + +## Triggering Workflows + +```typescript +// From Worker +export default { async fetch(req, env) { const instance = await env.MY_WORKFLOW.create({id: crypto.randomUUID(), params: { userId: 'user123' }}); return Response.json({ id: instance.id }); }}; + +// From Queue +export default { async queue(batch, env) { for (const msg of batch.messages) { await env.MY_WORKFLOW.create({id: `job-${msg.id}`, params: msg.body}); } }}; + +// From Cron +export default { async scheduled(event, env) { await env.CLEANUP_WORKFLOW.create({id: `cleanup-${Date.now()}`, params: { timestamp: event.scheduledTime }}); }}; + +// From Another Workflow (non-blocking) +export class ParentWorkflow extends WorkflowEntrypoint { + async run(event, step) { + const child = await step.do('start child', async () => await this.env.CHILD_WORKFLOW.create({id: `child-${event.instanceId}`, params: {}})); + } +} +``` + +## Error Handling + +```typescript +import { NonRetryableError } from 'cloudflare:workers'; + +// NonRetryableError +await step.do('validate', async () => { + if (!event.params.paymentMethod) throw new NonRetryableError('Payment method required'); + const res = await fetch('https://api.example.com/charge', { method: 'POST' }); + if (res.status === 401) throw new NonRetryableError('Invalid credentials'); // Don't retry + if (!res.ok) throw new Error('Retryable failure'); // Will retry + return res.json(); +}); + +// Catching Errors +try { await step.do('risky op', async () => { throw new NonRetryableError('Failed'); }); } catch (e) { await step.do('cleanup', async () => {}); } + +// Idempotency +await step.do('charge', async () => { + 
const sub = await fetch(`https://api/subscriptions/${id}`).then(r => r.json());
  if (sub.charged) return sub; // Already done
  return await fetch(`https://api/subscriptions/${id}`, {method: 'POST', body: JSON.stringify({ amount: 10.0 })}).then(r => r.json());
});
```

## Type Constraints

Params and step returns must be `Rpc.Serializable`:

```typescript
// ✅ Valid types
type ValidParams = {
  userId: string;
  count: number;
  tags: string[];
  metadata: Record<string, string>;
};

// ❌ Invalid types
type InvalidParams = {
  callback: () => void; // Functions not serializable
  symbol: symbol; // Symbols not serializable
  circular: any; // Circular references not allowed
};

// Step returns follow same rules
const result = await step.do('fetch', async () => {
  return { userId: '123', data: [1, 2, 3] }; // ✅ Plain object
});
```

## Sleep & Scheduling

```typescript
// Relative
await step.sleep('wait 1 hour', '1 hour');
await step.sleep('wait 30 days', '30 days');
await step.sleep('wait 5s', 5000); // ms

// Absolute
await step.sleepUntil('launch date', Date.parse('24 Oct 2024 13:00:00 UTC'));
await step.sleepUntil('deadline', new Date('2024-12-31T23:59:59Z'));
```

Units: second, minute, hour, day, week, month, year. Max: 365 days.
Sleeping instances don't count toward concurrency.
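The relative-duration strings above can be modeled roughly as follows. This is an illustrative sketch only — `durationToMs` is a hypothetical helper, not part of the Workflows API; the real parsing happens inside the runtime:

```typescript
// Hypothetical helper, NOT part of the Workflows API: a rough model of how
// relative duration strings ('1 hour', '30 days', ...) map to milliseconds.
const UNIT_MS: Record<string, number> = {
  second: 1_000,
  minute: 60_000,
  hour: 3_600_000,
  day: 86_400_000,
  week: 7 * 86_400_000,
  month: 30 * 86_400_000, // approximation for illustration
  year: 365 * 86_400_000, // approximation for illustration
};

function durationToMs(duration: string | number): number {
  if (typeof duration === 'number') return duration; // numbers are already ms
  const m = duration.match(/^(\d+)\s+(second|minute|hour|day|week|month|year)s?$/);
  if (!m) throw new Error(`Unrecognized duration: ${duration}`);
  return Number(m[1]) * UNIT_MS[m[2]];
}
```

For example, `durationToMs('1 hour')` yields 3,600,000 and a plain number like `5000` passes through unchanged, matching the two argument forms `step.sleep()` accepts.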
## Parameters

**Pass from Worker:**
```typescript
const instance = await env.MY_WORKFLOW.create({
  id: crypto.randomUUID(),
  params: { userId: 'user123', email: 'user@example.com' }
});
```

**Access in Workflow:**
```typescript
async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
  const userId = event.params.userId;
  const instanceId = event.instanceId;
  const createdAt = event.timestamp;
}
```

**CLI Trigger:**
```bash
npx wrangler workflows trigger my-workflow '{"userId":"user123"}'
```

## Wrangler CLI

```bash
npm create cloudflare@latest my-workflow -- --template "cloudflare/workflows-starter"
npx wrangler deploy
npx wrangler workflows list
npx wrangler workflows trigger my-workflow '{"userId":"user123"}'
npx wrangler workflows instances list my-workflow
npx wrangler workflows instances describe my-workflow instance-id
npx wrangler workflows instances pause/resume/terminate my-workflow instance-id
```

## REST API

```bash
# Create
curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/workflows/{workflow_name}/instances" -H "Authorization: Bearer {token}" -d '{"id":"custom-id","params":{"userId":"user123"}}'

# Status
curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/workflows/{workflow_name}/instances/{instance_id}/status" -H "Authorization: Bearer {token}"

# Send Event
curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/workflows/{workflow_name}/instances/{instance_id}/events" -H "Authorization: Bearer {token}" -d '{"type":"approval","payload":{"approved":true}}'
```

See: [configuration.md](./configuration.md), [patterns.md](./patterns.md)
diff --git a/cloudflare/references/workflows/configuration.md b/cloudflare/references/workflows/configuration.md
new file mode 100644
index 0000000..cfe7e4a
--- /dev/null
+++ b/cloudflare/references/workflows/configuration.md
@@ -0,0 +1,151 @@
# Workflow Configuration

## wrangler.jsonc Setup

```jsonc
{
"name": "my-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date for new projects + "observability": { + "enabled": true // Enables Workflows dashboard + structured logs + }, + "workflows": [ + { + "name": "my-workflow", // Workflow name + "binding": "MY_WORKFLOW", // Env binding + "class_name": "MyWorkflow" // TS class name + // "script_name": "other-worker" // For cross-script calls + } + ], + "limits": { + "cpu_ms": 300000 // 5 min max (default 30s) + } +} +``` + +## Step Configuration + +```typescript +// Basic step +const data = await step.do('step name', async () => ({ result: 'value' })); + +// With retry config +await step.do('api call', { + retries: { + limit: 10, // Default: 5, or Infinity + delay: '10 seconds', // Default: 10000ms + backoff: 'exponential' // constant | linear | exponential + }, + timeout: '30 minutes' // Per-attempt timeout (default: 10min) +}, async () => { + const res = await fetch('https://api.example.com/data'); + if (!res.ok) throw new Error('Failed'); + return res.json(); +}); +``` + +### Parallel Steps +```typescript +const [user, settings] = await Promise.all([ + step.do('fetch user', async () => this.env.KV.get(`user:${id}`)), + step.do('fetch settings', async () => this.env.KV.get(`settings:${id}`)) +]); +``` + +### Conditional Steps +```typescript +const config = await step.do('fetch config', async () => + this.env.KV.get('flags', { type: 'json' }) +); + +// ✅ Deterministic (based on step output) +if (config.enableEmail) { + await step.do('send email', async () => sendEmail()); +} + +// ❌ Non-deterministic (Date.now outside step) +if (Date.now() > deadline) { /* BAD */ } +``` + +### Dynamic Steps (Loops) +```typescript +const files = await step.do('list files', async () => + this.env.BUCKET.list() +); + +for (const file of files.objects) { + await step.do(`process ${file.key}`, async () => { + const obj = await this.env.BUCKET.get(file.key); + return processData(await obj.arrayBuffer()); 
  });
}
```

## Multiple Workflows

```jsonc
{
  "workflows": [
    {"name": "user-onboarding", "binding": "USER_ONBOARDING", "class_name": "UserOnboarding"},
    {"name": "data-processing", "binding": "DATA_PROCESSING", "class_name": "DataProcessing"}
  ]
}
```

Each class extends `WorkflowEntrypoint` with its own `Params` type.

## Cross-Script Bindings

Worker A defines the workflow. Worker B calls it by adding `script_name`:

```jsonc
// Worker B (caller)
{
  "workflows": [{
    "name": "billing-workflow",
    "binding": "BILLING",
    "script_name": "billing-worker" // Points to Worker A
  }]
}
```

## Bindings

Workflows access Cloudflare bindings via `this.env`:

```typescript
type Env = {
  MY_WORKFLOW: Workflow;
  KV: KVNamespace;
  DB: D1Database;
  BUCKET: R2Bucket;
  AI: Ai;
  VECTORIZE: VectorizeIndex;
};

await step.do('use bindings', async () => {
  const kv = await this.env.KV.get('key');
  const db = await this.env.DB.prepare('SELECT * FROM users').first();
  const file = await this.env.BUCKET.get('file.txt');
  const ai = await this.env.AI.run('@cf/meta/llama-2-7b-chat-int8', { prompt: 'Hi' });
});
```

## Pages Functions Binding

Pages Functions can trigger Workflows via service bindings:

```typescript
// functions/_middleware.ts
export const onRequest: PagesFunction<Env> = async ({ env, request }) => {
  const instance = await env.MY_WORKFLOW.create({
    params: { url: request.url }
  });
  return new Response(`Started ${instance.id}`);
};
```

Configure in wrangler.jsonc under `services`.
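The "step name = cache key" behavior behind the determinism rules above can be sketched with a toy model. This is illustrative only — `doStep` is a hypothetical stand-in, not the real Workflows engine:

```typescript
// Toy model ONLY — not the real Workflows engine. It illustrates why step
// names act as cache keys: on replay, a step whose result was already
// persisted returns that result instead of re-running its callback.
type StepState = Map<string, unknown>;

function doStep<T>(state: StepState, name: string, fn: () => T): T {
  if (state.has(name)) {
    return state.get(name) as T; // replay: reuse the persisted result
  }
  const result = fn(); // first run: execute and persist
  state.set(name, result);
  return result;
}
```

Replaying against the same state skips the callback entirely, which is also why a non-deterministic name like `` `step-${Date.now()}` `` defeats the cache: every replay produces a fresh key and the step re-runs.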
+ +See: [api.md](./api.md), [patterns.md](./patterns.md) diff --git a/cloudflare/references/workflows/gotchas.md b/cloudflare/references/workflows/gotchas.md new file mode 100644 index 0000000..6f85444 --- /dev/null +++ b/cloudflare/references/workflows/gotchas.md @@ -0,0 +1,97 @@ +# Gotchas & Debugging + +## Common Errors + +### "Step Timeout" + +**Cause:** Step execution exceeding 10 minute default timeout or configured timeout +**Solution:** Set custom timeout with `step.do('long operation', {timeout: '30 minutes'}, async () => {...})` or increase CPU limit in wrangler.jsonc (max 5min CPU time) + +### "waitForEvent Timeout" + +**Cause:** Event not received within timeout period (default 24h, max 365d) +**Solution:** Wrap in try-catch to handle timeout gracefully and proceed with default behavior + +### "Non-Deterministic Step Names" + +**Cause:** Using dynamic values like `Date.now()` in step names causes replay issues +**Solution:** Use deterministic values like `event.instanceId` for step names + +### "State Lost in Variables" + +**Cause:** Using module-level or local variables to store state which is lost on hibernation +**Solution:** Return values from `step.do()` which are automatically persisted: `const total = await step.do('step 1', async () => 10)` + +### "Non-Deterministic Conditionals" + +**Cause:** Using non-deterministic logic (like `Date.now()`) outside steps in conditionals +**Solution:** Move non-deterministic operations inside steps: `const isLate = await step.do('check', async () => Date.now() > deadline)` + +### "Large Step Returns Exceeding Limit" + +**Cause:** Returning data >1 MiB from step +**Solution:** Store large data in R2 and return only reference: `{ key: 'r2-object-key' }` + +### "Step Exceeded CPU Limit But Ran for < 30s" + +**Cause:** Confusion between CPU time (active compute) and wall-clock time (includes I/O waits) +**Solution:** Network requests, database queries, and sleeps don't count toward CPU. 
30s limit = 30s of active processing + +### "Idempotency Violation" + +**Cause:** Step operations not idempotent, causing duplicate charges or actions on retry +**Solution:** Check if operation already completed before executing (e.g., check if customer already charged) + +### "Instance ID Collision" + +**Cause:** Reusing instance IDs causing conflicts +**Solution:** Use unique IDs with timestamp: `await env.MY_WORKFLOW.create({ id: \`${userId}-${Date.now()}\`, params: {} })` + +### "Instance Data Disappeared After Completion" + +**Cause:** Completed/errored instances are automatically deleted after retention period (3 days free / 30 days paid) +**Solution:** Export critical data to KV/R2/D1 before workflow completes + +### "Missing await on step.do" + +**Cause:** Forgetting to await step.do() causing fire-and-forget behavior +**Solution:** Always await step operations: `await step.do('task', ...)` + +## Limits + +| Limit | Free | Paid | Notes | +|-------|------|------|-------| +| CPU per step | 10ms | 30s (default), 5min (max) | Set via `limits.cpu_ms` in wrangler.jsonc | +| Step state | 1 MiB | 1 MiB | Per step return value | +| Instance state | 100 MB | 1 GB | Total state per workflow instance | +| Steps per workflow | 1,024 | 1,024 | `step.sleep()` doesn't count | +| Executions per day | 100k | Unlimited | Daily execution limit | +| Concurrent instances | 25 | 10k | Maximum concurrent workflows; waiting state excluded | +| Queued instances | 100k | 1M | Maximum queued workflow instances | +| Subrequests per step | 50 | 1,000 | Maximum outbound requests per step | +| State retention | 3 days | 30 days | How long completed instances kept | +| Step timeout default | 10 min | 10 min | Per attempt | +| waitForEvent timeout default | 24h | 24h | Maximum 365 days | +| waitForEvent timeout max | 365 days | 365 days | Maximum wait time | + +**Note:** Instances in `waiting` state (from `step.sleep` or `step.waitForEvent`) don't count toward concurrent instance limit, 
allowing millions of sleeping workflows. + +## Pricing + +| Metric | Free | Paid | Notes | +|--------|------|------|-------| +| Requests | 100k/day | 10M/mo + $0.30/M | Workflow invocations | +| CPU time | 10ms/invoke | 30M CPU-ms/mo + $0.02/M CPU-ms | Actual CPU usage | +| Storage | 1 GB | 1 GB/mo + $0.20/GB-mo | All instances (running/errored/sleeping/completed) | + +## References + +- [Official Docs](https://developers.cloudflare.com/workflows/) +- [Get Started Guide](https://developers.cloudflare.com/workflows/get-started/guide/) +- [Workers API](https://developers.cloudflare.com/workflows/build/workers-api/) +- [REST API](https://developers.cloudflare.com/api/resources/workflows/) +- [Examples](https://developers.cloudflare.com/workflows/examples/) +- [Limits](https://developers.cloudflare.com/workflows/reference/limits/) +- [Pricing](https://developers.cloudflare.com/workflows/reference/pricing/) + +See: [README.md](./README.md), [configuration.md](./configuration.md), [api.md](./api.md), [patterns.md](./patterns.md) diff --git a/cloudflare/references/workflows/patterns.md b/cloudflare/references/workflows/patterns.md new file mode 100644 index 0000000..72ce024 --- /dev/null +++ b/cloudflare/references/workflows/patterns.md @@ -0,0 +1,175 @@ +# Workflow Patterns + +## Image Processing Pipeline + +```typescript +export class ImageProcessingWorkflow extends WorkflowEntrypoint { + async run(event, step) { + const imageData = await step.do('fetch', async () => (await this.env.BUCKET.get(event.params.imageKey)).arrayBuffer()); + const description = await step.do('generate description', async () => + await this.env.AI.run('@cf/llava-hf/llava-1.5-7b-hf', {image: Array.from(new Uint8Array(imageData)), prompt: 'Describe this image', max_tokens: 50}) + ); + await step.waitForEvent('await approval', { event: 'approved', timeout: '24h' }); + await step.do('publish', async () => await this.env.BUCKET.put(`public/${event.params.imageKey}`, imageData)); + } +} +``` + +## 
User Lifecycle + +```typescript +export class UserLifecycleWorkflow extends WorkflowEntrypoint { + async run(event, step) { + await step.do('welcome email', async () => await sendEmail(event.params.email, 'Welcome!')); + await step.sleep('trial period', '7 days'); + const hasConverted = await step.do('check conversion', async () => { + const user = await this.env.DB.prepare('SELECT subscription_status FROM users WHERE id = ?').bind(event.params.userId).first(); + return user.subscription_status === 'active'; + }); + if (!hasConverted) await step.do('trial expiration email', async () => await sendEmail(event.params.email, 'Trial ending')); + } +} +``` + +## Data Pipeline + +```typescript +export class DataPipelineWorkflow extends WorkflowEntrypoint { + async run(event, step) { + const rawData = await step.do('extract', {retries: { limit: 10, delay: '30s', backoff: 'exponential' }}, async () => { + const res = await fetch(event.params.sourceUrl); + if (!res.ok) throw new Error('Fetch failed'); + return res.json(); + }); + const transformed = await step.do('transform', async () => + rawData.map(item => ({ id: item.id, normalized: normalizeData(item) })) + ); + const dataRef = await step.do('store', async () => { + const key = `processed/${Date.now()}.json`; + await this.env.BUCKET.put(key, JSON.stringify(transformed)); + return { key }; + }); + await step.do('load', async () => { + const data = await (await this.env.BUCKET.get(dataRef.key)).json(); + for (let i = 0; i < data.length; i += 100) { + await this.env.DB.batch(data.slice(i, i + 100).map(item => + this.env.DB.prepare('INSERT INTO records VALUES (?, ?)').bind(item.id, item.normalized) + )); + } + }); + } +} +``` + +## Human-in-the-Loop Approval + +```typescript +export class ApprovalWorkflow extends WorkflowEntrypoint { + async run(event, step) { + await step.do('create approval', async () => await this.env.DB.prepare('INSERT INTO approvals (id, user_id, status) VALUES (?, ?, ?)').bind(event.instanceId, 
event.params.userId, 'pending').run()); + try { + const approval = await step.waitForEvent<{ approved: boolean }>('wait for approval', { event: 'approval-response', timeout: '48h' }); + if (approval.approved) { await step.do('process approval', async () => {}); } + else { await step.do('handle rejection', async () => {}); } + } catch (e) { + await step.do('auto reject', async () => await this.env.DB.prepare('UPDATE approvals SET status = ? WHERE id = ?').bind('auto-rejected', event.instanceId).run()); + } + } +} +``` + +## Testing Workflows + +### Setup + +```typescript +// vitest.config.ts +import { defineWorkersConfig } from '@cloudflare/vitest-pool-workers/config'; + +export default defineWorkersConfig({ + test: { + poolOptions: { + workers: { + wrangler: { configPath: './wrangler.jsonc' } + } + } + } +}); +``` + +### Introspection API + +```typescript +import { introspectWorkflowInstance } from 'cloudflare:test'; + +const instance = await env.MY_WORKFLOW.create({ params: { userId: '123' } }); +const introspector = await introspectWorkflowInstance(env.MY_WORKFLOW, instance.id); + +// Wait for step completion +const result = await introspector.waitForStepResult({ name: 'fetch user', index: 0 }); + +// Mock step behavior +await introspector.modify(async (m) => { + await m.mockStepResult({ name: 'api call' }, { mocked: true }); +}); +``` + +## Best Practices + +### ✅ DO + +1. **Granular steps**: One API call per step (unless proving idempotency) +2. **Idempotency**: Check-then-execute; use idempotency keys +3. **Deterministic names**: Use static or step-output-based names +4. **Return state**: Persist via step returns, not variables +5. **Always await**: `await step.do()`, avoid dangling promises +6. **Deterministic conditionals**: Base on `event.payload` or step outputs +7. **Store large data externally**: R2/KV for >1 MiB, return refs +8. **Batch creation**: `createBatch()` for multiple instances + +### ❌ DON'T + +1. 
**One giant step**: Breaks durability & retry control +2. **State outside steps**: Lost on hibernation +3. **Mutate events**: Events immutable, return new state +4. **Non-deterministic logic outside steps**: `Math.random()`, `Date.now()` must be in steps +5. **Side effects outside steps**: May duplicate on restart +6. **Non-deterministic step names**: Prevents caching +7. **Ignore timeouts**: `waitForEvent` throws, use try-catch +8. **Reuse instance IDs**: Must be unique within retention + +## Orchestration Patterns + +### Fan-Out (Parallel Processing) +```typescript +const files = await step.do('list', async () => this.env.BUCKET.list()); +await Promise.all(files.objects.map((file, i) => step.do(`process ${i}`, async () => processFile(await (await this.env.BUCKET.get(file.key)).arrayBuffer())))); +``` + +### Parent-Child Workflows +```typescript +const child = await step.do('start child', async () => await this.env.CHILD_WORKFLOW.create({id: `child-${event.instanceId}`, params: { data: result.data }})); +await step.do('other work', async () => console.log(`Child started: ${child.id}`)); +``` + +### Race Pattern +```typescript +const winner = await Promise.race([ + step.do('option A', async () => slowOperation()), + step.do('option B', async () => fastOperation()) +]); +``` + +### Scheduled Workflow Chain +```typescript +export default { async scheduled(event, env) { await env.DAILY_WORKFLOW.create({id: `daily-${event.scheduledTime}`, params: { timestamp: event.scheduledTime }}); }}; +export class DailyWorkflow extends WorkflowEntrypoint { + async run(event, step) { + await step.do('daily task', async () => {}); + await step.sleep('wait 7 days', '7 days'); + await step.do('weekly followup', async () => {}); + } +} +``` + +See: [configuration.md](./configuration.md), [api.md](./api.md), [gotchas.md](./gotchas.md) diff --git a/cloudflare/references/wrangler/README.md b/cloudflare/references/wrangler/README.md new file mode 100644 index 0000000..dc32292 --- /dev/null 
+++ b/cloudflare/references/wrangler/README.md @@ -0,0 +1,135 @@ +# Cloudflare Wrangler + +Official CLI for Cloudflare Workers - develop, manage, and deploy Workers from the command line. + +## What is Wrangler? + +Wrangler is the Cloudflare Developer Platform CLI that allows you to: +- Create, develop, and deploy Workers +- Manage bindings (KV, D1, R2, Durable Objects, etc.) +- Configure routing and environments +- Run local development servers +- Execute migrations and manage resources +- Perform integration testing + +## Installation + +```bash +npm install wrangler --save-dev +# or globally +npm install -g wrangler +``` + +Run commands: `npx wrangler ` (or `pnpm`/`yarn wrangler`) + +## Reading Order + +| If you want to... | Start here | +|-------------------|------------| +| Create/deploy Worker quickly | Essential Commands below → [patterns.md](./patterns.md) §New Worker | +| Configure bindings (KV, D1, R2) | [configuration.md](./configuration.md) §Bindings | +| Write integration tests | [api.md](./api.md) §startWorker | +| Debug production issues | [gotchas.md](./gotchas.md) + Essential Commands §Monitoring | +| Set up multi-environment workflow | [configuration.md](./configuration.md) §Environments | + +## Essential Commands + +### Project & Development +```bash +wrangler init [name] # Create new project +wrangler dev # Local dev server (fast, simulated) +wrangler dev --remote # Dev with remote resources (production-like) +wrangler deploy # Deploy to production +wrangler deploy --env staging # Deploy to environment +wrangler versions list # List versions +wrangler rollback [id] # Rollback deployment +wrangler login # OAuth login +wrangler whoami # Check auth status +``` + +## Resource Management + +### KV +```bash +wrangler kv namespace create NAME +wrangler kv key put "key" "value" --namespace-id= +wrangler kv key get "key" --namespace-id= +``` + +### D1 +```bash +wrangler d1 create NAME +wrangler d1 execute NAME --command "SQL" +wrangler d1 migrations 
create NAME "description" +wrangler d1 migrations apply NAME +``` + +### R2 +```bash +wrangler r2 bucket create NAME +wrangler r2 object put BUCKET/key --file path +wrangler r2 object get BUCKET/key +``` + +### Other Resources +```bash +wrangler queues create NAME +wrangler vectorize create NAME --dimensions N --metric cosine +wrangler hyperdrive create NAME --connection-string "..." +wrangler workflows create NAME +wrangler constellation create NAME +wrangler pages project create NAME +wrangler pages deployment create --project NAME --branch main +``` + +### Secrets +```bash +wrangler secret put NAME # Set Worker secret +wrangler secret list # List Worker secrets +wrangler secret delete NAME # Delete Worker secret +wrangler secret bulk FILE.json # Bulk upload from JSON + +# Secrets Store (centralized, reusable across Workers) +wrangler secret-store:secret put STORE_NAME SECRET_NAME +wrangler secret-store:secret list STORE_NAME +``` + +### Monitoring +```bash +wrangler tail # Real-time logs +wrangler tail --env production # Tail specific env +wrangler tail --status error # Filter by status +``` + +## In This Reference + +- [configuration.md](./configuration.md) - wrangler.jsonc setup, environments, bindings +- [api.md](./api.md) - Programmatic API (`startWorker`, `getPlatformProxy`, events) +- [patterns.md](./patterns.md) - Common workflows and development patterns +- [gotchas.md](./gotchas.md) - Common pitfalls, limits, and troubleshooting + +## Quick Decision Tree + +``` +Need to test your Worker? +├─ Testing full Worker with bindings → api.md §startWorker +├─ Testing individual functions → api.md §getPlatformProxy +└─ Testing with Vitest → patterns.md §Testing with Vitest + +Need to configure something? +├─ Bindings (KV, D1, R2, etc.) → configuration.md §Bindings +├─ Multiple environments → configuration.md §Environments +├─ Static files → configuration.md §Workers Assets +└─ Routing → configuration.md §Routing + +Development not working? 
+├─ Local differs from production → Use `wrangler dev --remote` +├─ Bindings not available → gotchas.md §Binding Not Available +└─ Auth issues → wrangler login +``` + +## See Also + +- [workers](../workers/) - Workers runtime API reference +- [miniflare](../miniflare/) - Local testing with Miniflare +- [workerd](../workerd/) - Runtime that powers `wrangler dev` diff --git a/cloudflare/references/wrangler/api.md b/cloudflare/references/wrangler/api.md new file mode 100644 index 0000000..11384c2 --- /dev/null +++ b/cloudflare/references/wrangler/api.md @@ -0,0 +1,188 @@ +# Wrangler Programmatic API + +Node.js APIs for testing and development. + +## startWorker (Testing) + +Starts Worker with real local bindings for integration tests. Stable API (replaces `unstable_startWorker`). + +```typescript +import { startWorker } from "wrangler"; +import { describe, it, before, after } from "node:test"; +import assert from "node:assert"; + +describe("worker", () => { + let worker; + + before(async () => { + worker = await startWorker({ + config: "wrangler.jsonc", + environment: "development" + }); + }); + + after(async () => { + await worker.dispose(); + }); + + it("responds with 200", async () => { + const response = await worker.fetch("http://example.com"); + assert.strictEqual(response.status, 200); + }); +}); +``` + +### Options + +| Option | Type | Description | +|--------|------|-------------| +| `config` | `string` | Path to wrangler.jsonc | +| `environment` | `string` | Environment name from config | +| `persist` | `boolean \| { path: string }` | Enable persistent state | +| `bundle` | `boolean` | Enable bundling (default: true) | +| `remote` | `false \| true \| "minimal"` | Remote mode: `false` (local), `true` (full remote), `"minimal"` (remote bindings only) | + +### Remote Mode + +```typescript +// Local mode (default) - fast, simulated +const worker = await startWorker({ config: "wrangler.jsonc" }); + +// Full remote mode - production-like, slower +const worker = 
await startWorker({ + config: "wrangler.jsonc", + remote: true +}); + +// Minimal remote mode - remote bindings, local Worker +const worker = await startWorker({ + config: "wrangler.jsonc", + remote: "minimal" +}); +``` + +## getPlatformProxy + +Emulate bindings in Node.js without starting Worker. + +```typescript +import { getPlatformProxy } from "wrangler"; + +const { env, dispose, caches } = await getPlatformProxy({ + configPath: "wrangler.jsonc", + environment: "production", + persist: { path: ".wrangler/state" } +}); + +// Use bindings +const value = await env.MY_KV.get("key"); +await env.DB.prepare("SELECT * FROM users").all(); +await env.ASSETS.put("file.txt", "content"); + +// Platform APIs +await caches.default.put("https://example.com", new Response("cached")); + +await dispose(); +``` + +Use for unit tests (test functions, not full Worker) or scripts that need bindings. + +## Type Generation + +Generate types from config: `wrangler types` → creates `worker-configuration.d.ts` + +## Event System + +Listen to Worker lifecycle events for advanced workflows. 
+ +```typescript +import { startWorker } from "wrangler"; + +const worker = await startWorker({ + config: "wrangler.jsonc", + bundle: true +}); + +// Bundle events +worker.on("bundleStart", (details) => { + console.log("Bundling started:", details.config); +}); + +worker.on("bundleComplete", (details) => { + console.log("Bundle ready:", details.duration); +}); + +// Reconfiguration events +worker.on("reloadStart", () => { + console.log("Worker reloading..."); +}); + +worker.on("reloadComplete", () => { + console.log("Worker reloaded"); +}); + +await worker.dispose(); +``` + +### Dynamic Reconfiguration + +```typescript +import { startWorker } from "wrangler"; + +const worker = await startWorker({ config: "wrangler.jsonc" }); + +// Replace entire config +await worker.setConfig({ + config: "wrangler.staging.jsonc", + environment: "staging" +}); + +// Patch specific fields +await worker.patchConfig({ + vars: { DEBUG: "true" } +}); + +await worker.dispose(); +``` + +## unstable_dev (Deprecated) + +Use `startWorker` instead. + +## Multi-Worker Registry + +Test multiple Workers with service bindings. 
+ +```typescript +import { startWorker } from "wrangler"; + +const auth = await startWorker({ config: "./auth/wrangler.jsonc" }); +const api = await startWorker({ + config: "./api/wrangler.jsonc", + bindings: { AUTH: auth } // Service binding +}); + +const response = await api.fetch("http://example.com/api/login"); +// API Worker calls AUTH Worker via env.AUTH.fetch() + +await api.dispose(); +await auth.dispose(); +``` + +## Best Practices + +- Use `startWorker` for integration tests (tests full Worker) +- Use `getPlatformProxy` for unit tests (tests individual functions) +- Use `remote: true` when debugging production-specific issues +- Use `remote: "minimal"` for faster tests with real bindings +- Enable `persist: true` for debugging (state survives runs) +- Run `wrangler types` after config changes +- Always `dispose()` to prevent resource leaks +- Listen to bundle events for build monitoring +- Use multi-worker registry for testing service bindings + +## See Also + +- [README.md](./README.md) - CLI commands +- [configuration.md](./configuration.md) - Config +- [patterns.md](./patterns.md) - Testing patterns diff --git a/cloudflare/references/wrangler/configuration.md b/cloudflare/references/wrangler/configuration.md new file mode 100644 index 0000000..20dc2f0 --- /dev/null +++ b/cloudflare/references/wrangler/configuration.md @@ -0,0 +1,197 @@ +# Wrangler Configuration + +Configuration reference for wrangler.jsonc (recommended). + +## Config Format + +**wrangler.jsonc recommended** (v3.91.0+) - provides schema validation. 
+ +```jsonc +{ + "$schema": "./node_modules/wrangler/config-schema.json", + "name": "my-worker", + "main": "src/index.ts", + "compatibility_date": "2025-01-01", // Use current date + "vars": { "API_KEY": "dev-key" }, + "kv_namespaces": [{ "binding": "MY_KV", "id": "abc123" }] +} +``` + +## Field Inheritance + +Inheritable: `name`, `main`, `compatibility_date`, `routes`, `triggers` +Non-inheritable (define per env): `vars`, bindings (KV, D1, R2, etc.) + +## Environments + +```jsonc +{ + "name": "my-worker", + "vars": { "ENV": "dev" }, + "env": { + "production": { + "name": "my-worker-prod", + "vars": { "ENV": "prod" }, + "route": { "pattern": "example.com/*", "zone_name": "example.com" } + } + } +} +``` + +Deploy: `wrangler deploy --env production` + +## Routing + +```jsonc +// Custom domain (recommended) +{ "routes": [{ "pattern": "api.example.com", "custom_domain": true }] } + +// Zone-based +{ "routes": [{ "pattern": "api.example.com/*", "zone_name": "example.com" }] } + +// workers.dev +{ "workers_dev": true } +``` + +## Bindings + +```jsonc +// Variables +{ "vars": { "API_URL": "https://api.example.com" } } + +// KV +{ "kv_namespaces": [{ "binding": "CACHE", "id": "abc123" }] } + +// D1 +{ "d1_databases": [{ "binding": "DB", "database_id": "abc-123" }] } + +// R2 +{ "r2_buckets": [{ "binding": "ASSETS", "bucket_name": "my-assets" }] } + +// Durable Objects +{ "durable_objects": { + "bindings": [{ + "name": "COUNTER", + "class_name": "Counter", + "script_name": "my-worker" // Required for external DOs + }] +} } +{ "migrations": [{ "tag": "v1", "new_sqlite_classes": ["Counter"] }] } + +// Service Bindings +{ "services": [{ "binding": "AUTH", "service": "auth-worker" }] } + +// Queues +{ "queues": { + "producers": [{ "binding": "TASKS", "queue": "task-queue" }], + "consumers": [{ "queue": "task-queue", "max_batch_size": 10 }] +} } + +// Vectorize +{ "vectorize": [{ "binding": "VECTORS", "index_name": "embeddings" }] } + +// Hyperdrive (requires nodejs_compat_v2 
for pg/postgres) +{ "hyperdrive": [{ "binding": "HYPERDRIVE", "id": "hyper-id" }] } +{ "compatibility_flags": ["nodejs_compat_v2"] } // For pg/postgres + +// Workers AI +{ "ai": { "binding": "AI" } } + +// Workflows +{ "workflows": [{ "binding": "WORKFLOW", "name": "my-workflow", "class_name": "MyWorkflow" }] } + +// Secrets Store (centralized secrets) +{ "secrets_store": [{ "binding": "SECRETS", "id": "store-id" }] } + +// Constellation (AI inference) +{ "constellation": [{ "binding": "MODEL", "project_id": "proj-id" }] } +``` + +## Workers Assets (Static Files) + +Recommended for serving static files (replaces old `site` config). + +```jsonc +{ + "assets": { + "directory": "./public", + "binding": "ASSETS", + "html_handling": "auto-trailing-slash", // or "none", "force-trailing-slash" + "not_found_handling": "single-page-application" // or "404-page", "none" + } +} +``` + +Access in Worker: +```typescript +export default { + async fetch(request, env) { + // Try serving static asset first + const asset = await env.ASSETS.fetch(request); + if (asset.status !== 404) return asset; + + // Custom logic for non-assets + return new Response("API response"); + } +} +``` + +## Placement + +Control where Workers run geographically. + +```jsonc +{ + "placement": { + "mode": "smart" // or "off" + } +} +``` + +- `"smart"`: Run Worker near data sources (D1, Durable Objects) to reduce latency +- `"off"`: Default distribution (run everywhere) + +## Auto-Provisioning (Beta) + +Omit resource IDs - Wrangler creates them and writes back to config on deploy. + +```jsonc +{ "kv_namespaces": [{ "binding": "MY_KV" }] } // No id - auto-provisioned +``` + +After deploy, ID is added to config automatically. 
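The asset-first fallthrough shown under Workers Assets can be factored into a small helper so the routing decision is testable outside the runtime. A hedged sketch — `assetFirst` and the reduced `Fetcher` shape are illustrative, standing in for the real `ASSETS` binding:

```typescript
// Hedged sketch of the asset-first pattern above, factored so the
// ASSETS binding can be replaced by any object with a fetch() method.
type Fetcher = { fetch(request: Request): Promise<Response> };

export async function assetFirst(
  request: Request,
  assets: Fetcher,
  fallback: (request: Request) => Promise<Response> | Response
): Promise<Response> {
  const asset = await assets.fetch(request);
  if (asset.status !== 404) return asset; // serve the static file if present
  return fallback(request); // otherwise run Worker logic
}
```
<test>
(async () => {
  const assets = {
    fetch: async (r: Request) =>
      new URL(r.url).pathname === "/app.js"
        ? new Response("js")
        : new Response("not found", { status: 404 }),
  };
  const api = () => new Response("API response");
  const hit = await assetFirst(new Request("https://x.test/app.js"), assets, api);
  if ((await hit.text()) !== "js") throw new Error("asset should be served");
  const miss = await assetFirst(new Request("https://x.test/api/users"), assets, api);
  if ((await miss.text()) !== "API response") throw new Error("fallback should run");
})();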
+ +## Advanced + +```jsonc +// Cron Triggers +{ "triggers": { "crons": ["0 0 * * *"] } } + +// Observability (tracing) +{ "observability": { "enabled": true, "head_sampling_rate": 0.1 } } + +// Runtime Limits +{ "limits": { "cpu_ms": 100 } } + +// Browser Rendering +{ "browser": { "binding": "BROWSER" } } + +// mTLS Certificates +{ "mtls_certificates": [{ "binding": "CERT", "certificate_id": "cert-uuid" }] } + +// Logpush (stream logs to R2/S3) +{ "logpush": true } + +// Tail Consumers (process logs with another Worker) +{ "tail_consumers": [{ "service": "log-worker" }] } + +// Unsafe bindings (access to arbitrary bindings) +{ "unsafe": { "bindings": [{ "name": "MY_BINDING", "type": "plain_text", "text": "value" }] } } +``` + +## See Also + +- [README.md](./README.md) - Overview and commands +- [api.md](./api.md) - Programmatic API +- [patterns.md](./patterns.md) - Workflows +- [gotchas.md](./gotchas.md) - Common issues diff --git a/cloudflare/references/wrangler/gotchas.md b/cloudflare/references/wrangler/gotchas.md new file mode 100644 index 0000000..db6ba08 --- /dev/null +++ b/cloudflare/references/wrangler/gotchas.md @@ -0,0 +1,197 @@ +# Wrangler Common Issues + +## Common Errors + +### "Binding ID vs name mismatch" + +**Cause:** Confusion between binding name (code) and resource ID +**Solution:** Bindings use `binding` (code name) and `id`/`database_id`/`bucket_name` (resource ID). Preview bindings need separate IDs: `preview_id`, `preview_database_id` + +### "Environment not inheriting config" + +**Cause:** Non-inheritable keys not redefined per environment +**Solution:** Non-inheritable keys (bindings, vars) must be redefined per environment. 
Inheritable keys (routes, compatibility_date) can be overridden + +### "Local dev behavior differs from production" + +**Cause:** Using local simulation instead of remote execution +**Solution:** Choose appropriate remote mode: +- `wrangler dev` (default): Local simulation, fast, limited accuracy +- `wrangler dev --remote`: Full remote execution, production-accurate, slower +- Use `remote: "minimal"` in tests for fast tests with real remote bindings + +### "startWorker doesn't match production" + +**Cause:** Using local mode when remote resources needed +**Solution:** Use `remote` option: +```typescript +const worker = await startWorker({ + config: "wrangler.jsonc", + remote: true // or "minimal" for faster tests +}); +``` + +### "Unexpected runtime changes" + +**Cause:** Missing compatibility_date +**Solution:** Always set `compatibility_date`: +```jsonc +{ "compatibility_date": "2025-01-01" } +``` + +### "Durable Object binding not working" + +**Cause:** Missing script_name for external DOs +**Solution:** Always specify `script_name` for external Durable Objects: +```jsonc +{ + "durable_objects": { + "bindings": [ + { "name": "MY_DO", "class_name": "MyDO", "script_name": "my-worker" } + ] + } +} +``` + +For local DOs in same Worker, `script_name` is optional. + +### "Auto-provisioned resources not appearing" + +**Cause:** IDs written back to config on first deploy, but config not reloaded +**Solution:** After first deploy with auto-provisioning, config file is updated with IDs. Commit the updated config. On subsequent deploys, existing resources are reused. 
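Several of the gotchas above come down to environment scoping. A hedged config sketch — names and IDs are placeholders — showing a non-inheritable binding redeclared inside `env.production` while inheritable keys stay at the top level:

```jsonc
{
  "name": "my-worker",
  "compatibility_date": "2025-01-01", // inheritable: applies to all envs
  "kv_namespaces": [{ "binding": "CACHE", "id": "dev-id" }],
  "env": {
    "production": {
      // non-inheritable: without this redeclaration, CACHE
      // does not exist when deploying with --env production
      "kv_namespaces": [{ "binding": "CACHE", "id": "prod-id" }]
    }
  }
}
```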
+ +### "Secrets not available in local dev" + +**Cause:** Secrets set with `wrangler secret put` only work in deployed Workers +**Solution:** For local dev, use `.dev.vars` + +### "Node.js compatibility error" + +**Cause:** Missing Node.js compatibility flag +**Solution:** Some bindings (Hyperdrive with `pg`) require: +```jsonc +{ "compatibility_flags": ["nodejs_compat_v2"] } +``` + +### "Workers Assets 404 errors" + +**Cause:** Asset path mismatch or incorrect `html_handling` +**Solution:** +- Check `assets.directory` points to correct build output +- Set `html_handling: "auto-trailing-slash"` for SPAs +- Use `not_found_handling: "single-page-application"` to serve index.html for 404s +```jsonc +{ + "assets": { + "directory": "./dist", + "html_handling": "auto-trailing-slash", + "not_found_handling": "single-page-application" + } +} +``` + +### "Placement not reducing latency" + +**Cause:** Misunderstanding of Smart Placement +**Solution:** Smart Placement only helps when Worker accesses D1 or Durable Objects. It doesn't affect KV, R2, or external API latency. 
+```jsonc +{ "placement": { "mode": "smart" } } // Only beneficial with D1/DOs +``` + +### "unstable_startWorker not found" + +**Cause:** Using outdated API +**Solution:** Use stable `startWorker` instead: +```typescript +import { startWorker } from "wrangler"; // Not unstable_startWorker +``` + +### "outboundService not mocking fetch" + +**Cause:** Mock function not returning Response +**Solution:** Always return Response, use `fetch(req)` for passthrough: +```typescript +const worker = await startWorker({ + outboundService: (req) => { + if (shouldMock(req)) { + return new Response("mocked"); + } + return fetch(req); // Required for non-mocked requests + } +}); +``` + +## Limits + +| Resource/Limit | Value | Notes | +|----------------|-------|-------| +| Bindings per Worker | 64 | Total across all types | +| Environments | Unlimited | Named envs in config | +| Config file size | ~1MB | Keep reasonable | +| Workers Assets size | 25 MB | Per deployment | +| Workers Assets files | 20,000 | Max number of files | +| Script size (compressed) | 1 MB | Free, 10 MB paid | +| CPU time | 10-50ms | Free, 50-500ms paid | +| Subrequest limit | 50 | Free, 1000 paid | + +## Troubleshooting + +### Authentication Issues +```bash +wrangler logout +wrangler login +wrangler whoami +``` + +### Configuration Errors +```bash +wrangler check # Validate config +``` +Use wrangler.jsonc with `$schema` for validation. 
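For the config-validation step above, a minimal wrangler.jsonc with `$schema` wired up (path assumes wrangler is installed as a dev dependency); schema-aware editors then flag invalid keys inline, before `wrangler check` runs:

```jsonc
{
  "$schema": "./node_modules/wrangler/config-schema.json",
  "name": "my-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-01-01"
}
```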
+ +### Binding Not Available +- Check binding exists in config +- For environments, ensure binding defined for that env +- Local dev: some bindings need `--remote` + +### Deployment Failures +```bash +wrangler tail # Check logs +wrangler deploy --dry-run # Validate +wrangler whoami # Check account limits +``` + +### Local Development Issues +```bash +rm -rf .wrangler/state # Clear local state +wrangler dev --remote # Use remote bindings +wrangler dev --persist-to ./local-state # Custom persist location +wrangler dev --inspector-port 9229 # Enable debugging +``` + +### Testing Issues +```bash +# If tests hang, ensure dispose() is called +worker.dispose() // Always cleanup + +# If bindings don't work in tests +const worker = await startWorker({ + config: "wrangler.jsonc", + remote: "minimal" // Use remote bindings +}); +``` + +## Resources + +- Docs: https://developers.cloudflare.com/workers/wrangler/ +- Config: https://developers.cloudflare.com/workers/wrangler/configuration/ +- Commands: https://developers.cloudflare.com/workers/wrangler/commands/ +- Examples: https://github.com/cloudflare/workers-sdk/tree/main/templates +- Discord: https://discord.gg/cloudflaredev + +## See Also + +- [README.md](./README.md) - Commands +- [configuration.md](./configuration.md) - Config +- [api.md](./api.md) - Programmatic API +- [patterns.md](./patterns.md) - Workflows diff --git a/cloudflare/references/wrangler/patterns.md b/cloudflare/references/wrangler/patterns.md new file mode 100644 index 0000000..a22a41a --- /dev/null +++ b/cloudflare/references/wrangler/patterns.md @@ -0,0 +1,209 @@ +# Wrangler Development Patterns + +Common workflows and best practices. 
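One practice worth establishing before the project workflows below: tests that start workers should always dispose them, even when an assertion throws, or test runs hang on leaked processes. A hedged helper sketch — `withWorker` and the reduced `Disposable` shape are illustrative, not a wrangler API:

```typescript
// Hedged sketch: guarantee cleanup of a test worker even when the
// test body throws, so failed assertions don't leak resources.
type Disposable = { dispose(): Promise<void> };

export async function withWorker<T>(
  worker: Disposable,
  body: (worker: Disposable) => Promise<T>
): Promise<T> {
  try {
    return await body(worker);
  } finally {
    await worker.dispose(); // always runs, pass or fail
  }
}
```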
+ +## New Worker Project + +```bash +wrangler init my-worker && cd my-worker +wrangler dev # Develop locally +wrangler deploy # Deploy +``` + +## Local Development + +```bash +wrangler dev # Local mode (fast, simulated) +wrangler dev --remote # Remote mode (production-accurate) +wrangler dev --env staging --port 8787 +wrangler dev --inspector-port 9229 # Enable debugging +``` + +Debug: chrome://inspect → Configure → localhost:9229 + +## Secrets + +```bash +# Production +echo "secret-value" | wrangler secret put SECRET_KEY + +# Local: use .dev.vars (gitignored) +# SECRET_KEY=local-dev-key +``` + +## Adding KV + +```bash +wrangler kv namespace create MY_KV +wrangler kv namespace create MY_KV --preview +# Add to wrangler.jsonc: { "binding": "MY_KV", "id": "abc123" } +wrangler deploy +``` + +## Adding D1 + +```bash +wrangler d1 create my-db +wrangler d1 migrations create my-db "initial_schema" +# Edit migration file in migrations/, then: +wrangler d1 migrations apply my-db --local +wrangler deploy +wrangler d1 migrations apply my-db --remote + +# Time Travel (restore to point in time) +wrangler d1 time-travel restore my-db --timestamp 2025-01-01T12:00:00Z +``` + +## Multi-Environment + +```bash +wrangler deploy --env staging +wrangler deploy --env production +``` + +```jsonc +{ "env": { "staging": { "vars": { "ENV": "staging" } } } } +``` + +## Testing + +### Integration Tests with Node.js Test Runner + +```typescript +import { startWorker } from "wrangler"; +import { describe, it, before, after } from "node:test"; +import assert from "node:assert"; + +describe("API", () => { + let worker; + + before(async () => { + worker = await startWorker({ + config: "wrangler.jsonc", + remote: "minimal" // Fast tests with real bindings + }); + }); + + after(async () => await worker.dispose()); + + it("creates user", async () => { + const response = await worker.fetch("http://example.com/api/users", { + method: "POST", + body: JSON.stringify({ name: "Alice" }) + }); + 
assert.strictEqual(response.status, 201); + }); +}); +``` + +### Testing with Vitest + +Install: `npm install -D vitest @cloudflare/vitest-pool-workers` + +**vitest.config.ts:** +```typescript +import { defineWorkersConfig } from "@cloudflare/vitest-pool-workers/config"; +export default defineWorkersConfig({ + test: { poolOptions: { workers: { wrangler: { configPath: "./wrangler.jsonc" } } } } +}); +``` + +**tests/api.test.ts:** +```typescript +import { env, SELF } from "cloudflare:test"; +import { describe, it, expect } from "vitest"; + +it("fetches users", async () => { + const response = await SELF.fetch("https://example.com/api/users"); + expect(response.status).toBe(200); +}); + +it("uses bindings", async () => { + await env.MY_KV.put("key", "value"); + expect(await env.MY_KV.get("key")).toBe("value"); +}); +``` + +### Multi-Worker Development (Service Bindings) + +```typescript +const authWorker = await startWorker({ config: "./auth/wrangler.jsonc" }); +const apiWorker = await startWorker({ + config: "./api/wrangler.jsonc", + bindings: { AUTH: authWorker } // Service binding +}); + +// Test API calling AUTH +const response = await apiWorker.fetch("http://example.com/api/protected"); +await authWorker.dispose(); +await apiWorker.dispose(); +``` + +### Mock External APIs + +```typescript +const worker = await startWorker({ + config: "wrangler.jsonc", + outboundService: (req) => { + const url = new URL(req.url); + if (url.hostname === "api.external.com") { + return new Response(JSON.stringify({ mocked: true }), { + headers: { "content-type": "application/json" } + }); + } + return fetch(req); // Pass through other requests + } +}); + +// Test Worker that calls external API +const response = await worker.fetch("http://example.com/proxy"); +// Worker internally fetches api.external.com - gets mocked response +``` + +## Monitoring & Versions + +```bash +wrangler tail # Real-time logs +wrangler tail --status error # Filter errors +wrangler versions list +wrangler 
rollback [id]
+```
+
+## TypeScript
+
+```bash
+wrangler types # Generate types from config
+```
+
+```typescript
+export default {
+  async fetch(request: Request, env: Env): Promise<Response> {
+    return Response.json({ value: await env.MY_KV.get("key") });
+  }
+} satisfies ExportedHandler<Env>;
+```
+
+## Workers Assets
+
+```jsonc
+{ "assets": { "directory": "./dist", "binding": "ASSETS" } }
+```
+
+```typescript
+export default {
+  async fetch(request, env) {
+    // API routes first
+    if (new URL(request.url).pathname.startsWith("/api/")) {
+      return Response.json({ data: "from API" });
+    }
+    return env.ASSETS.fetch(request); // Static assets
+  }
+}
+```
+
+## See Also
+
+- [README.md](./README.md) - Commands
+- [configuration.md](./configuration.md) - Config
+- [api.md](./api.md) - Programmatic API
+- [gotchas.md](./gotchas.md) - Issues
diff --git a/cloudflare/references/zaraz/IMPLEMENTATION_SUMMARY.md b/cloudflare/references/zaraz/IMPLEMENTATION_SUMMARY.md
new file mode 100644
index 0000000..dd8da19
--- /dev/null
+++ b/cloudflare/references/zaraz/IMPLEMENTATION_SUMMARY.md
@@ -0,0 +1,121 @@
+# Zaraz Reference Implementation Summary
+
+## Files Created
+
+| File | Lines | Purpose |
+|------|-------|---------|
+| README.md | 111 | Navigation, decision tree, quick start |
+| api.md | 287 | Web API reference, Zaraz Context |
+| configuration.md | 307 | Dashboard setup, triggers, tools, consent |
+| patterns.md | 430 | SPA, e-commerce, Worker integration |
+| gotchas.md | 317 | Troubleshooting, limits, tool-specific issues |
+| **Total** | **1,452** | **vs 366 original** |
+
+## Key Improvements Applied
+
+### Structure
+- ✅ Created 5-file progressive disclosure system
+- ✅ Added navigation table in README
+- ✅ Added decision tree for routing
+- ✅ Added "Reading Order by Task" guide
+- ✅ Cross-referenced files throughout
+
+### New Content Added
+- ✅ Zaraz Context (system/client properties)
+- ✅ History Change trigger for SPA tracking
+- ✅ Context Enrichers pattern
+- ✅ Worker 
Variables pattern +- ✅ Consent management deep dive +- ✅ Tool-specific quirks (GA4, Facebook, Google Ads) +- ✅ GTM migration guide +- ✅ Comprehensive troubleshooting +- ✅ "When NOT to use Zaraz" section +- ✅ TypeScript type definitions + +### Preserved Content +- ✅ All original API methods +- ✅ E-commerce tracking examples +- ✅ Consent management +- ✅ Workers integration (expanded) +- ✅ Common patterns (expanded) +- ✅ Debugging tools +- ✅ Reference links + +## Progressive Disclosure Impact + +### Before (Monolithic) +All tasks loaded 366 lines regardless of need. + +### After (Progressive) +- **Track event task**: README (111) + api.md (287) = 398 lines +- **Debug issue**: gotchas.md (317) = 317 lines (13% reduction) +- **Configure tool**: configuration.md (307) = 307 lines (16% reduction) +- **SPA tracking**: README + patterns.md (SPA section) ~180 lines (51% reduction) + +**Net effect:** Task-specific loading reduces unnecessary content by 13-51% depending on use case. + +## File Summary + +### README.md (111 lines) +- Overview and core concepts +- Quick start guide +- When to use Zaraz vs Workers +- Navigation table +- Reading order by task +- Decision tree + +### api.md (287 lines) +- zaraz.track() +- zaraz.set() +- zaraz.ecommerce() +- Zaraz Context (system/client properties) +- zaraz.consent API +- zaraz.debug +- Cookie methods +- TypeScript definitions + +### configuration.md (307 lines) +- Dashboard setup flow +- Trigger types (including History Change) +- Tool configuration (GA4, Facebook, Google Ads) +- Actions and action rules +- Selective loading +- Consent management setup +- Privacy features +- Testing workflow + +### patterns.md (430 lines) +- SPA tracking (React, Vue, Next.js) +- User identification flows +- Complete e-commerce funnel +- A/B testing +- Worker integration (Context Enrichers, Worker Variables, HTML injection) +- Multi-tool coordination +- GTM migration +- Best practices + +### gotchas.md (317 lines) +- Events not firing (5-step debug 
process) +- Consent issues +- SPA tracking pitfalls +- Performance issues +- Tool-specific quirks +- Data layer issues +- Limits table +- When NOT to use Zaraz +- Debug checklist + +## Quality Metrics + +- ✅ All files use consistent markdown formatting +- ✅ Code examples include language tags +- ✅ Tables for structured data (limits, parameters, comparisons) +- ✅ Problem → Cause → Solution format in gotchas +- ✅ Cross-references between files +- ✅ No "see documentation" placeholders +- ✅ Real, actionable examples throughout +- ✅ Verified API syntax for Workers + +## Original Backup + +Original SKILL.md preserved as `_SKILL_old.md` for reference. diff --git a/cloudflare/references/zaraz/README.md b/cloudflare/references/zaraz/README.md new file mode 100644 index 0000000..0e28155 --- /dev/null +++ b/cloudflare/references/zaraz/README.md @@ -0,0 +1,111 @@ +# Cloudflare Zaraz + +Expert guidance for Cloudflare Zaraz - server-side tag manager for loading third-party tools at the edge. + +## What is Zaraz? + +Zaraz offloads third-party scripts (analytics, ads, chat, marketing) to Cloudflare's edge, improving site speed, privacy, and security. Zero client-side performance impact. + +**Core Concepts:** +- **Server-side execution** - Scripts run on Cloudflare, not user's browser +- **Single HTTP request** - All tools loaded via one endpoint +- **Privacy-first** - Control data sent to third parties +- **No client-side JS overhead** - Minimal browser impact + +## Quick Start + +1. Navigate to domain > Zaraz in Cloudflare dashboard +2. Click "Start setup" +3. Add tools (Google Analytics, Facebook Pixel, etc.) +4. Configure triggers (when tools fire) +5. 
Add tracking code to your site: + +```javascript +// Track page view +zaraz.track('page_view'); + +// Track custom event +zaraz.track('button_click', { button_id: 'cta' }); + +// Set user properties +zaraz.set('userId', 'user_123'); +``` + +## When to Use Zaraz + +**Use Zaraz when:** +- Adding multiple third-party tools (analytics, ads, marketing) +- Site performance is critical (no client-side JS overhead) +- Privacy compliance required (GDPR, CCPA) +- Non-technical teams need to manage tools + +**Use Workers directly when:** +- Building custom server-side tracking logic +- Need full control over data processing +- Integrating with complex backend systems +- Zaraz's tool library doesn't meet needs + +## In This Reference + +| File | Purpose | When to Read | +|------|---------|--------------| +| [api.md](./api.md) | Web API, zaraz object, consent methods | Implementing tracking calls | +| [configuration.md](./configuration.md) | Dashboard setup, triggers, tools | Initial setup, adding tools | +| [patterns.md](./patterns.md) | SPA, e-commerce, Worker integration | Best practices, common scenarios | +| [gotchas.md](./gotchas.md) | Troubleshooting, limits, pitfalls | Debugging issues | + +## Reading Order by Task + +| Task | Files to Read | +|------|---------------| +| Add analytics to site | README → configuration.md | +| Track custom events | README → api.md | +| Debug tracking issues | gotchas.md | +| SPA tracking | api.md → patterns.md (SPA section) | +| E-commerce tracking | api.md#ecommerce → patterns.md#ecommerce | +| Worker integration | patterns.md#worker-integration | +| GDPR compliance | api.md#consent → configuration.md#consent | + +## Decision Tree + +``` +What do you need? 
+ +├─ Track events in browser → api.md +│ ├─ Page views, clicks → zaraz.track() +│ ├─ User properties → zaraz.set() +│ └─ E-commerce → zaraz.ecommerce() +│ +├─ Configure Zaraz → configuration.md +│ ├─ Add GA4/Facebook → tools setup +│ ├─ When tools fire → triggers +│ └─ GDPR consent → consent purposes +│ +├─ Integrate with Workers → patterns.md#worker-integration +│ ├─ Enrich context → Context Enrichers +│ └─ Inject tracking → HTML rewriting +│ +└─ Debug issues → gotchas.md + ├─ Events not firing → troubleshooting + ├─ Consent issues → consent debugging + └─ Performance → debugging tools +``` + +## Key Features + +- **100+ Pre-built Tools** - GA4, Facebook, Google Ads, TikTok, etc. +- **Zero Client Impact** - Runs at Cloudflare's edge, not browser +- **Privacy Controls** - Consent management, data filtering +- **Custom Tools** - Build Managed Components for proprietary systems +- **Worker Integration** - Enrich context, compute dynamic values +- **Debug Mode** - Real-time event inspection + +## Reference + +- [Zaraz Docs](https://developers.cloudflare.com/zaraz/) +- [Web API](https://developers.cloudflare.com/zaraz/web-api/) +- [Managed Components](https://developers.cloudflare.com/zaraz/advanced/load-custom-managed-component/) + +--- + +This skill focuses exclusively on Zaraz. For Workers development, see `cloudflare-workers` skill. diff --git a/cloudflare/references/zaraz/api.md b/cloudflare/references/zaraz/api.md new file mode 100644 index 0000000..5d8e1cc --- /dev/null +++ b/cloudflare/references/zaraz/api.md @@ -0,0 +1,112 @@ +# Zaraz Web API + +Client-side JavaScript API for tracking events, setting properties, and managing consent. + +## zaraz.track() + +```javascript +zaraz.track('button_click'); +zaraz.track('purchase', { value: 99.99, currency: 'USD', item_id: '12345' }); +zaraz.track('pageview', { page_path: '/products', page_title: 'Products' }); // SPA +``` + +**Params:** `eventName` (string), `properties` (object, optional). Fire-and-forget. 
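If tracking code can run before the Zaraz script is available (very early in page load, or on pages where Zaraz is disabled), calling `zaraz.track()` directly will throw. A small guard avoids that — `safeTrack` here is a hypothetical helper sketch, not part of the Zaraz API:

```javascript
// Hypothetical wrapper: only forwards when Zaraz is actually loaded.
function safeTrack(eventName, properties) {
  const z = globalThis.zaraz;
  if (z && typeof z.track === "function") {
    z.track(eventName, properties);
    return true; // event was forwarded
  }
  return false; // Zaraz not (yet) available; caller may queue and retry
}
```

Callers that get `false` back can queue the event and retry once Zaraz has loaded.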
+ +## zaraz.set() + +```javascript +zaraz.set('userId', 'user_12345'); +zaraz.set({ email: '[email protected]', plan: 'premium', country: 'US' }); +``` + +Properties persist for page session. Use for user identification and segmentation. + +## zaraz.ecommerce() + +```javascript +zaraz.ecommerce('Product Viewed', { product_id: 'SKU123', name: 'Widget', price: 49.99 }); +zaraz.ecommerce('Product Added', { product_id: 'SKU123', quantity: 2, price: 49.99 }); +zaraz.ecommerce('Order Completed', { + order_id: 'ORD-789', total: 149.98, currency: 'USD', + products: [{ product_id: 'SKU123', quantity: 2, price: 49.99 }] +}); +``` + +**Events:** `Product Viewed`, `Product Added`, `Product Removed`, `Cart Viewed`, `Checkout Started`, `Order Completed` + +Tools auto-map to GA4, Facebook CAPI, etc. + +## System Properties (Triggers) + +``` +{{system.page.url}} {{system.page.title}} {{system.page.referrer}} +{{system.device.ip}} {{system.device.userAgent}} {{system.device.language}} +{{system.cookies.name}} {{client.__zarazTrack.userId}} +``` + +## zaraz.consent + +```javascript +// Check +const purposes = zaraz.consent.getAll(); // { analytics: true, marketing: false } + +// Set +zaraz.consent.modal = true; // Show modal +zaraz.consent.setAll({ analytics: true, marketing: false }); +zaraz.consent.set('marketing', true); + +// Listen +zaraz.consent.addEventListener('consentChanged', () => { + if (zaraz.consent.getAll().marketing) zaraz.track('marketing_consent_granted'); +}); +``` + +**Flow:** Configure purposes in dashboard → Map tools to purposes → Show modal/set programmatically → Tools fire when allowed + +## zaraz.debug + +```javascript +zaraz.debug = true; +zaraz.track('test_event'); +console.log(zaraz.tools); // View loaded tools +``` + +## Cookie Methods + +```javascript +zaraz.getCookie('session_id'); // Zaraz namespace +zaraz.readCookie('_ga'); // Any cookie +``` + +## Async Behavior + +All methods fire-and-forget. 
Events are batched and sent asynchronously:
+
+```javascript
+zaraz.track('event1');
+zaraz.set('prop', 'value');
+zaraz.track('event2'); // All batched
+```
+
+## TypeScript Types
+
+```typescript
+interface Zaraz {
+  track(event: string, properties?: Record<string, unknown>): void;
+  set(key: string, value: unknown): void;
+  set(properties: Record<string, unknown>): void;
+  ecommerce(event: string, properties: Record<string, unknown>): void;
+  consent: {
+    getAll(): Record<string, boolean>;
+    setAll(purposes: Record<string, boolean>): void;
+    set(purpose: string, value: boolean): void;
+    addEventListener(event: 'consentChanged', callback: () => void): void;
+    modal: boolean;
+  };
+  debug: boolean;
+  tools?: string[];
+  getCookie(name: string): string | undefined;
+  readCookie(name: string): string | undefined;
+}
+declare global { interface Window { zaraz: Zaraz; } }
+```
diff --git a/cloudflare/references/zaraz/configuration.md b/cloudflare/references/zaraz/configuration.md
new file mode 100644
index 0000000..e2e534c
--- /dev/null
+++ b/cloudflare/references/zaraz/configuration.md
@@ -0,0 +1,90 @@
+# Zaraz Configuration
+
+## Dashboard Setup
+
+1. Domain → Zaraz → Start setup
+2. Add tool (e.g., Google Analytics 4)
+3. Enter credentials (GA4: `G-XXXXXXXXXX`)
+4. Configure triggers
+5. Save and Publish
+
+## Triggers
+
+| Type | When | Use Case |
+|------|------|----------|
+| Pageview | Page load | Track page views |
+| Click | Element clicked | Button tracking |
+| Form Submission | Form submitted | Lead capture |
+| History Change | URL changes (SPA) | React/Vue routing |
+| Variable Match | Custom condition | Conditional firing |
+
+### History Change (SPA)
+
+```
+Type: History Change
+Event: pageview
+```
+
+Fires on `pushState`, `replaceState`, hash changes. 
**No manual tracking needed.** + +### Click Trigger + +``` +Type: Click +CSS Selector: .buy-button +Event: purchase_intent +Properties: + button_text: {{system.clickElement.text}} +``` + +## Tool Configuration + +**GA4:** +``` +Measurement ID: G-XXXXXXXXXX +Events: page_view, purchase, user_engagement +``` + +**Facebook Pixel:** +``` +Pixel ID: 1234567890123456 +Events: PageView, Purchase, AddToCart +``` + +**Google Ads:** +``` +Conversion ID: AW-XXXXXXXXX +Conversion Label: YYYYYYYYYY +``` + +## Consent Management + +1. Settings → Consent → Create purposes (analytics, marketing) +2. Map tools to purposes +3. Set behavior: "Do not load until consent granted" + +**Programmatic consent:** +```javascript +zaraz.consent.setAll({ analytics: true, marketing: true }); +``` + +## Privacy Features + +| Feature | Default | +|---------|---------| +| IP Anonymization | Enabled | +| Cookie Control | Via consent purposes | +| GDPR/CCPA | Consent modal | + +## Testing + +1. **Preview Mode** - test without publishing +2. **Debug Mode** - `zaraz.debug = true` +3. **Network tab** - filter "zaraz" + +## Limits + +| Resource | Limit | +|----------|-------| +| Event properties | 100KB | +| Consent purposes | 20 | diff --git a/cloudflare/references/zaraz/gotchas.md b/cloudflare/references/zaraz/gotchas.md new file mode 100644 index 0000000..eaa6b49 --- /dev/null +++ b/cloudflare/references/zaraz/gotchas.md @@ -0,0 +1,81 @@ +# Zaraz Gotchas + +## Events Not Firing + +**Check:** +1. Tool enabled in dashboard (green dot) +2. Trigger conditions met +3. Consent granted for tool's purpose +4. 
Tool credentials correct (GA4: `G-XXXXXXXXXX`, FB: numeric only) + +**Debug:** +```javascript +zaraz.debug = true; +console.log('Tools:', zaraz.tools); +console.log('Consent:', zaraz.consent.getAll()); +``` + +## Consent Issues + +**Modal not showing:** +```javascript +// Clear consent cookie +document.cookie = 'zaraz-consent=; expires=Thu, 01 Jan 1970 00:00:00 UTC; path=/;'; +location.reload(); +``` + +**Tools firing before consent:** Map tool to consent purpose with "Do not load until consent granted". + +## SPA Tracking + +**Route changes not tracked:** +1. Configure History Change trigger in dashboard +2. Hash routing (`#/path`) requires manual tracking: +```javascript +window.addEventListener('hashchange', () => { + zaraz.track('pageview', { page_path: location.pathname + location.hash }); +}); +``` + +**React fix:** +```javascript +const location = useLocation(); +useEffect(() => { + zaraz.track('pageview', { page_path: location.pathname }); +}, [location]); // Include dependency +``` + +## Performance + +**Slow page load:** +- Audit tool count (50+ degrades performance) +- Disable blocking triggers unless required +- Reduce event payload size (<100KB) + +## Tool-Specific Issues + +| Tool | Issue | Fix | +|------|-------|-----| +| GA4 | Events not in real-time | Wait 5-10 min, use DebugView | +| Facebook | Invalid Pixel ID | Use numeric only (no `fbpx_` prefix) | +| Google Ads | Conversions not attributed | Include `send_to: 'AW-XXX/LABEL'` | + +## Data Layer + +- Properties persist per page only - set on each page load +- Nested access: `{{client.__zarazTrack.user.plan}}` + +## Limits + +| Resource | Limit | +|----------|-------| +| Request size | 100KB | +| Consent purposes | 20 | +| API rate | 1000 req/sec | + +## When NOT to Use Zaraz + +- Server-to-server tracking (use Workers) +- Real-time bidirectional communication +- Binary data transmission +- Authentication flows diff --git a/cloudflare/references/zaraz/patterns.md 
b/cloudflare/references/zaraz/patterns.md new file mode 100644 index 0000000..c5ef967 --- /dev/null +++ b/cloudflare/references/zaraz/patterns.md @@ -0,0 +1,74 @@ +# Zaraz Patterns + +## SPA Tracking + +**History Change Trigger (Recommended):** Configure in dashboard - no code needed, Zaraz auto-detects route changes. + +**Manual tracking (React/Vue/Next.js):** +```javascript +// On route change +zaraz.track('pageview', { page_path: pathname, page_title: document.title }); +``` + +## User Identification + +```javascript +// Login +zaraz.set({ userId: user.id, email: user.email, plan: user.plan }); +zaraz.track('login', { method: 'oauth' }); + +// Logout - set to null (cannot clear) +zaraz.set('userId', null); +``` + +## E-commerce Funnel + +| Event | Method | +|-------|--------| +| View | `zaraz.ecommerce('Product Viewed', { product_id, name, price })` | +| Add to cart | `zaraz.ecommerce('Product Added', { product_id, quantity })` | +| Checkout | `zaraz.ecommerce('Checkout Started', { cart_id, products: [...] })` | +| Purchase | `zaraz.ecommerce('Order Completed', { order_id, total, products })` | + +## A/B Testing + +```javascript +zaraz.set('experiment_checkout', variant); +zaraz.track('experiment_viewed', { experiment_id: 'checkout', variant }); +// On conversion +zaraz.track('experiment_conversion', { experiment_id, variant, value }); +``` + +## Worker Integration + +**Context Enricher** - Modify context before tools execute: +```typescript +export default { + async fetch(request, env) { + const body = await request.json(); + body.system.userRegion = request.cf?.region; + return Response.json(body); + } +}; +``` +Configure: Zaraz > Settings > Context Enrichers + +**Worker Variables** - Compute dynamic values server-side, use as `{{worker.variable_name}}`. 
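The A/B testing snippet above assumes a `variant` value already exists. One way to produce it deterministically — a sketch, not a Zaraz feature; any stable hash of a stable user id works — is:

```javascript
// Hypothetical helper: derive a stable bucket from a user id, so the
// same user sees the same variant on every page load.
function assignVariant(userId, variants = ["control", "treatment"]) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep unsigned 32-bit
  }
  return variants[hash % variants.length];
}
```

The result can then be passed to `zaraz.set('experiment_checkout', variant)` as in the pattern above.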
+ +## GTM Migration + +| GTM | Zaraz | +|-----|-------| +| `dataLayer.push({event: 'purchase'})` | `zaraz.ecommerce('Order Completed', {...})` | +| `{{Page URL}}` | `{{system.page.url}}` | +| `{{Page Title}}` | `{{system.page.title}}` | +| Page View trigger | Pageview trigger | +| Click trigger | Click (selector: `*`) | + +## Best Practices + +1. Use dashboard triggers over inline code +2. Enable History Change for SPAs (no manual code) +3. Debug with `zaraz.debug = true` +4. Implement consent early (GDPR/CCPA) +5. Use Context Enrichers for sensitive/server data diff --git a/code-maturity-assessor/.skillshare-meta.json b/code-maturity-assessor/.skillshare-meta.json new file mode 100644 index 0000000..9f725b4 --- /dev/null +++ b/code-maturity-assessor/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/trailofbits/skills/tree/main/plugins/building-secure-contracts/skills/code-maturity-assessor", + "type": "github-subdir", + "installed_at": "2026-01-30T02:23:18.859181561Z", + "repo_url": "https://github.com/trailofbits/skills.git", + "subdir": "plugins/building-secure-contracts/skills/code-maturity-assessor", + "version": "650f6e3" +} \ No newline at end of file diff --git a/code-maturity-assessor/SKILL.md b/code-maturity-assessor/SKILL.md new file mode 100644 index 0000000..8e8e9c2 --- /dev/null +++ b/code-maturity-assessor/SKILL.md @@ -0,0 +1,218 @@ +--- +name: code-maturity-assessor +description: Systematic code maturity assessment using Trail of Bits' 9-category framework. Analyzes codebase for arithmetic safety, auditing practices, access controls, complexity, decentralization, documentation, MEV risks, low-level code, and testing. Produces professional scorecard with evidence-based ratings and actionable recommendations. +--- + +# Code Maturity Assessor + +## Purpose + +Systematically assesses codebase maturity using Trail of Bits' 9-category framework. Provides evidence-based ratings and actionable recommendations. 
+ +**Framework**: Building Secure Contracts - Code Maturity Evaluation v0.1.0 + +--- + +## How This Works + +### Phase 1: Discovery +Explores the codebase to understand: +- Project structure and platform +- Contract/module files +- Test coverage +- Documentation availability + +### Phase 2: Analysis +For each of 9 categories, I'll: +- **Search the code** for relevant patterns +- **Read key files** to assess implementation +- **Present findings** with file references +- **Ask clarifying questions** about processes I can't see in code +- **Determine rating** based on criteria + +### Phase 3: Report +Generates: +- Executive summary +- Maturity scorecard (ratings for all 9 categories) +- Detailed analysis with evidence +- Priority-ordered improvement roadmap + +--- + +## Rating System + +- **Missing (0)**: Not present/not implemented +- **Weak (1)**: Several significant improvements needed +- **Moderate (2)**: Adequate, can be improved +- **Satisfactory (3)**: Above average, minor improvements +- **Strong (4)**: Exceptional, only small improvements possible + +**Rating Logic**: +- ANY "Weak" criteria → **Weak** +- NO "Weak" + SOME "Moderate" unmet → **Moderate** +- ALL "Moderate" + SOME "Satisfactory" met → **Satisfactory** +- ALL "Satisfactory" + exceptional practices → **Strong** + +--- + +## The 9 Categories + +I assess 9 comprehensive categories covering all aspects of code maturity. For detailed criteria, analysis approaches, and rating thresholds, see [ASSESSMENT_CRITERIA.md](resources/ASSESSMENT_CRITERIA.md). + +### Quick Reference: + +**1. ARITHMETIC** +- Overflow protection mechanisms +- Precision handling and rounding +- Formula specifications +- Edge case testing + +**2. AUDITING** +- Event definitions and coverage +- Monitoring infrastructure +- Incident response planning + +**3. AUTHENTICATION / ACCESS CONTROLS** +- Privilege management +- Role separation +- Access control testing +- Key compromise scenarios + +**4. 
COMPLEXITY MANAGEMENT** +- Function scope and clarity +- Cyclomatic complexity +- Inheritance hierarchies +- Code duplication + +**5. DECENTRALIZATION** +- Centralization risks +- Upgrade control mechanisms +- User opt-out paths +- Timelock/multisig patterns + +**6. DOCUMENTATION** +- Specifications and architecture +- Inline code documentation +- User stories +- Domain glossaries + +**7. TRANSACTION ORDERING RISKS** +- MEV vulnerabilities +- Front-running protections +- Slippage controls +- Oracle security + +**8. LOW-LEVEL MANIPULATION** +- Assembly usage +- Unsafe code sections +- Low-level calls +- Justification and testing + +**9. TESTING & VERIFICATION** +- Test coverage +- Fuzzing and formal verification +- CI/CD integration +- Test quality + +For complete assessment criteria including what I'll analyze, what I'll ask you, and detailed rating thresholds (WEAK/MODERATE/SATISFACTORY/STRONG), see [ASSESSMENT_CRITERIA.md](resources/ASSESSMENT_CRITERIA.md). + +--- + +## Example Output + +When the assessment is complete, you'll receive a comprehensive maturity report including: + +- **Executive Summary**: Overall score, top 3 strengths, top 3 gaps, priority recommendations +- **Maturity Scorecard**: Table with all 9 categories rated with scores and notes +- **Detailed Analysis**: Category-by-category breakdown with evidence (file:line references) +- **Improvement Roadmap**: Priority-ordered recommendations (CRITICAL/HIGH/MEDIUM) with effort estimates + +For a complete example assessment report, see [EXAMPLE_REPORT.md](resources/EXAMPLE_REPORT.md). + +--- + +## Assessment Process + +When invoked, I will: + +1. **Explore codebase** + - Find contract/module files + - Identify test files + - Locate documentation + +2. **Analyze each category** + - Search for relevant code patterns + - Read key implementations + - Assess against criteria + - Collect evidence + +3. 
**Interactive assessment** + - Present my findings with file references + - Ask about processes I can't see in code + - Discuss borderline cases + - Determine ratings together + +4. **Generate report** + - Executive summary + - Maturity scorecard table + - Detailed category analysis with evidence + - Priority-ordered improvement roadmap + +--- + +## Rationalizations (Do Not Skip) + +| Rationalization | Why It's Wrong | Required Action | +|-----------------|----------------|-----------------| +| "Found some findings, assessment complete" | Assessment requires evaluating ALL 9 categories | Complete assessment of all 9 categories with evidence for each | +| "I see events, auditing category looks good" | Events alone don't equal auditing maturity | Check logging comprehensiveness, testing, incident response processes | +| "Code looks simple, complexity is low" | Visual simplicity masks composition complexity | Analyze cyclomatic complexity, dependency depth, state machine transitions | +| "Not a DeFi protocol, MEV category doesn't apply" | MEV extends beyond DeFi (governance, NFTs, games) | Verify with transaction ordering analysis before declaring N/A | +| "No assembly found, low-level category is N/A" | Low-level risks include external calls, delegatecall, inline assembly | Search for all low-level patterns before skipping category | +| "This is taking too long" | Thorough assessment requires time per category | Complete all 9 categories, ask clarifying questions about off-chain processes | +| "I can rate this without evidence" | Ratings without file:line references = unsubstantiated claims | Collect concrete code evidence for every category assessment | +| "User will know what to improve" | Vague guidance = no action | Provide priority-ordered roadmap with specific improvements and effort estimates | + +--- + +## Report Format + +For detailed report structure and templates, see [REPORT_FORMAT.md](resources/REPORT_FORMAT.md). + +### Structure: + +1. 
**Executive Summary** + - Project name and platform + - Overall maturity (average rating) + - Top 3 strengths + - Top 3 critical gaps + - Priority recommendations + +2. **Maturity Scorecard** + - Table with all 9 categories + - Ratings and scores + - Key findings notes + +3. **Detailed Analysis** + - Per-category breakdown + - Evidence with file:line references + - Gaps and improvement actions + +4. **Improvement Roadmap** + - CRITICAL (immediate) + - HIGH (1-2 months) + - MEDIUM (2-4 months) + - Effort estimates and impact + +--- + +## Ready to Begin + +**Estimated Time**: 30-40 minutes + +**I'll need**: +- Access to full codebase +- Your knowledge of processes (monitoring, incident response, team practices) +- Context about the project (DeFi, NFT, infrastructure, etc.) + +Let's assess this codebase! diff --git a/code-maturity-assessor/resources/ASSESSMENT_CRITERIA.md b/code-maturity-assessor/resources/ASSESSMENT_CRITERIA.md new file mode 100644 index 0000000..9c3ac7c --- /dev/null +++ b/code-maturity-assessor/resources/ASSESSMENT_CRITERIA.md @@ -0,0 +1,355 @@ +## The 9 Categories + +### 1. 
ARITHMETIC +**Focus**: Overflow protection, precision handling, formula specification, edge case testing + +**I'll analyze**: +- Overflow protection mechanisms (Solidity 0.8, SafeMath, checked_*, saturating_*) +- Unchecked arithmetic blocks and documentation +- Division/rounding operations +- Arithmetic in critical functions (balances, rewards, fees) +- Test coverage for arithmetic edge cases +- Arithmetic specification documents + +**WEAK if**: +- No overflow protection without justification +- Unchecked arithmetic not documented +- No arithmetic specification OR spec doesn't match code +- No testing strategy for arithmetic +- Critical edge cases not tested + +**MODERATE requires**: +- All weak criteria resolved +- Unchecked arithmetic minimal, justified, documented +- Overflow/underflow risks documented and tested +- Explicit rounding for precision loss +- Automated testing (fuzzing/formal methods) +- Stateless arithmetic functions +- Bounded parameters with explained ranges + +**SATISFACTORY requires**: +- All moderate criteria met +- Precision loss analyzed vs ground-truth +- All trapping operations identified +- Arithmetic spec matches code one-to-one +- Automated testing covers all operations in CI + +--- + +### 2. AUDITING +**Focus**: Events, monitoring systems, incident response + +**I'll analyze**: +- Event definitions and emission patterns +- Events for critical operations (transfers, access changes, parameter updates) +- Event naming consistency +- Critical functions without events + +**I'll ask you**: +- Off-chain monitoring infrastructure? +- Monitoring plan documented? +- Incident response plan exists and tested? 
+ +**WEAK if**: +- No event strategy +- Events missing for critical updates +- No consistent event guidelines +- Same events reused for different purposes + +**MODERATE requires**: +- All weak criteria resolved +- Events for all critical functions +- Off-chain monitoring logs events +- Monitoring plan documented +- Event documentation (purpose, usage, assumptions) +- Log review process documented +- Incident response plan exists + +**SATISFACTORY requires**: +- All moderate criteria met +- Monitoring triggers alerts on unexpected behavior +- Defined roles for incident detection +- Incident response plan regularly tested + +--- + +### 3. AUTHENTICATION / ACCESS CONTROLS +**Focus**: Privilege management, role separation, access patterns + +**I'll analyze**: +- Access control modifiers/functions +- Role definitions and separation +- Admin/owner patterns +- Privileged function implementations +- Test coverage for access controls + +**I'll ask you**: +- Who are privileged actors? (EOA, multisig, DAO?) +- Documentation of roles and privileges? +- Key compromise scenarios? + +**WEAK if**: +- Access controls unclear or inconsistent +- Single address controls system without safeguards +- Missing access controls on privileged functions +- No role differentiation +- All privileges on one address + +**MODERATE requires**: +- All weak criteria resolved +- All privileged functions have access control +- Least privilege principle followed +- Non-overlapping role privileges +- Clear actor/privilege documentation +- Tests cover all privileges +- Roles can be revoked +- Two-step processes for EOA operations + +**SATISFACTORY requires**: +- All moderate criteria met +- All actors well documented +- Implementation matches specification +- Privileged actors not EOAs +- Key leakage doesn't compromise system +- Tested against known attack vectors + +--- + +### 4. 
COMPLEXITY MANAGEMENT +**Focus**: Code clarity, function scope, avoiding unnecessary complexity + +**I'll analyze**: +- Function length and nesting depth +- Cyclomatic complexity +- Code duplication +- Inheritance hierarchies +- Naming conventions +- Function clarity + +**I'll ask you**: +- Complex parts documented? +- Naming convention documented? +- Complexity measurements? + +**WEAK if**: +- Unnecessary complexity hinders review +- Functions overuse nested operations +- Functions have unclear scope +- Unnecessary code duplication +- Complex inheritance tree + +**MODERATE requires**: +- All weak criteria resolved +- Complex parts identified, minimized +- High complexity (≥11) justified +- Critical functions well-scoped +- Minimal, justified redundancy +- Clear inputs with validation +- Documented naming convention +- Types not misused + +**SATISFACTORY requires**: +- All moderate criteria met +- Minimal unnecessary complexity +- Necessary complexity documented +- Clear function purposes +- Straightforward to test +- No redundant behavior + +--- + +### 5. DECENTRALIZATION +**Focus**: Centralization risks, upgrade control, user opt-out + +**I'll analyze**: +- Upgrade mechanisms (proxies, governance) +- Owner/admin control scope +- Timelock/multisig patterns +- User opt-out mechanisms + +**I'll ask you**: +- Upgrade mechanism and control? +- User opt-out/exit paths? +- Centralization risk documentation? 
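
A toy model (plain Python, not contract code) of the timelock pattern that mitigates the risks these questions probe: privileged operations are queued first and can only execute after a fixed delay, giving users time to exit. The 48-hour delay and the API shape are illustrative assumptions.

```python
DELAY = 48 * 3600  # hypothetical delay in seconds

class Timelock:
    def __init__(self):
        self.queue = {}  # operation id -> earliest execution time

    def schedule(self, op_id, now):
        self.queue[op_id] = now + DELAY

    def execute(self, op_id, now):
        eta = self.queue.get(op_id)
        if eta is None:
            raise ValueError("operation was never scheduled")
        if now < eta:
            raise ValueError("delay has not elapsed")
        del self.queue[op_id]
        return True

tl = Timelock()
tl.schedule("set-fee", now=0)
try:
    tl.execute("set-fee", now=3600)  # too early: only one hour has passed
except ValueError as e:
    print("blocked:", e)
print(tl.execute("set-fee", now=DELAY))  # after the full delay it goes through
```

The point of the model is the ordering guarantee: a parameter change is publicly visible for the whole delay window before it takes effect, which is what turns "single entity can change parameters anytime" (WEAK) into a documented, escapable risk.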
+ +**WEAK if**: +- Centralization points not visible to users +- Critical functions upgradable by single entity without opt-out +- Single entity controls user funds +- All decisions by single entity +- Parameters changeable anytime by single entity +- Centralized permission required + +**MODERATE requires**: +- All weak criteria resolved +- Centralization risks identified, justified, documented +- User opt-out/exit path documented +- Upgradeability only for non-critical features +- Privileged actors can't unilaterally move/trap funds +- All privileges documented + +**SATISFACTORY requires**: +- All moderate criteria met +- Clear decentralization path justified +- On-chain voting risks addressed OR no centralization +- Deployment risks documented +- External interaction risks documented +- Critical parameters immutable OR users can exit + +--- + +### 6. DOCUMENTATION +**Focus**: Specifications, architecture, user stories, inline comments + +**I'll analyze**: +- README, specification, architecture docs +- Inline code comments (NatSpec, rustdoc, etc.) +- User stories +- Glossaries +- Documentation completeness and accuracy + +**I'll ask you**: +- User stories documented? +- Architecture diagrams exist? +- Glossary for domain terms? 
+ +**WEAK if**: +- Minimal or incomplete/outdated documentation +- Only high-level description +- Code comments don't match docs +- Not publicly available (for public codebases) +- Unexplained artificial terms + +**MODERATE requires**: +- All weak criteria resolved +- Clear, unambiguous writing +- Glossary for business terms +- Architecture diagrams +- User stories included +- Core/critical components identified +- Docs sufficient to understand behavior +- All critical functions/blocks documented +- Known risks/limitations documented + +**SATISFACTORY requires**: +- All moderate criteria met +- User stories cover all operations +- Detailed behavior descriptions +- Implementation matches spec (deviations justified) +- Invariants clearly defined +- Consistent naming conventions +- Documentation for end-users AND developers + +--- + +### 7. TRANSACTION ORDERING RISKS +**Focus**: MEV, front-running, sandwich attacks + +**I'll analyze**: +- MEV-vulnerable patterns (AMM swaps, arbitrage, large trades) +- Front-running protections +- Slippage/deadline checks +- Oracle implementations + +**I'll ask you**: +- Transaction ordering risks identified/documented? +- Known MEV opportunities? +- Mitigation strategies? +- Testing for ordering attacks? 
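
An illustrative model (plain Python, not contract code) of why a slippage bound limits sandwich attacks: an attacker's front-run trade moves the constant-product price, and the victim's minimum-output check rejects the degraded fill. Pool sizes, amounts, and the fee-free pricing are made-up simplifications.

```python
def swap_out(reserve_in, reserve_out, amount_in):
    """Constant-product (x*y=k) output for a swap, ignoring fees for simplicity."""
    return reserve_out * amount_in // (reserve_in + amount_in)

reserve_in, reserve_out = 1_000_000, 1_000_000
quoted = swap_out(reserve_in, reserve_out, 10_000)  # what the user was quoted
min_out = quoted * 99 // 100                        # 1% slippage tolerance

# Attacker front-runs with a large buy, shifting the reserves against the victim.
atk_in = 200_000
atk_out = swap_out(reserve_in, reserve_out, atk_in)
reserve_in += atk_in
reserve_out -= atk_out

victim_out = swap_out(reserve_in, reserve_out, 10_000)
print(victim_out < min_out)  # True: the min-output check would revert this trade
```

Without the `min_out` check the victim's trade fills at the manipulated price and the attacker back-runs to pocket the difference; with it, the sandwich simply causes a revert, which is the kind of mitigation the MODERATE criteria below look for.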
+ +**WEAK if**: +- Ordering risks not identified/documented +- Protocols/assets at risk from unexpected ordering +- Relies on unjustified MEV prevention constraints +- Unproven assumptions about MEV extractors + +**MODERATE requires**: +- All weak criteria resolved +- User operation ordering risks limited, justified, documented +- MEV mitigations in place (delays, slippage checks) +- Testing emphasizes ordering risks +- Tamper-resistant oracles used + +**SATISFACTORY requires**: +- All moderate criteria met +- All ordering risks documented and justified +- Known risks highlighted in docs/tests, visible to users +- Documentation centralizes MEV opportunities +- Privileged operation ordering risks limited, justified +- Tests highlight ordering risks + +--- + +### 8. LOW-LEVEL MANIPULATION +**Focus**: Assembly, unsafe code, low-level operations + +**I'll analyze**: +- Assembly blocks +- Unsafe code sections +- Low-level calls +- Bitwise operations +- Justification and documentation + +**I'll ask you**: +- Why use assembly/unsafe here? +- High-level reference implementation? +- How is this tested? + +**WEAK if**: +- Unjustified low-level manipulations +- Assembly/low-level not justified, could be high-level + +**MODERATE requires**: +- All weak criteria resolved +- Assembly use limited and justified +- Inline comments for each operation +- No re-implementation of established libraries without justification +- High-level reference for complex assembly + +**SATISFACTORY requires**: +- All moderate criteria met +- Thorough documentation/justification/testing +- Validated with automated testing vs reference +- Differential fuzzing compares implementations +- Compiler optimization risks identified + +--- + +### 9. 
TESTING AND VERIFICATION +**Focus**: Coverage, testing techniques, CI/CD + +**I'll analyze**: +- Test file count and organization +- Test coverage reports +- CI/CD configuration +- Advanced testing (fuzzing, formal verification) +- Test quality and isolation + +**I'll ask you**: +- Test coverage percentage? +- Do all tests pass? +- Testing techniques used? +- Easy to run tests? + +**WEAK if**: +- Limited testing, only happy paths +- Common use cases not tested +- Tests fail +- Can't run tests "out of the box" + +**MODERATE requires**: +- All weak criteria resolved +- Most functions/use cases tested +- All tests pass +- Coverage reports available +- Automated testing for critical components +- Tests in CI/CD +- Integration tests (if applicable) +- Test code follows best practices + +**SATISFACTORY requires**: +- All moderate criteria met +- 100% reachable branch/statement coverage +- End-to-end testing covers all entry points +- Isolated test cases (no dependencies) +- Mutation testing used diff --git a/code-maturity-assessor/resources/EXAMPLE_REPORT.md b/code-maturity-assessor/resources/EXAMPLE_REPORT.md new file mode 100644 index 0000000..bcadb01 --- /dev/null +++ b/code-maturity-assessor/resources/EXAMPLE_REPORT.md @@ -0,0 +1,248 @@ +## Example Output + +When the assessment is complete, you'll receive a comprehensive maturity report: + +``` +=== CODE MATURITY ASSESSMENT REPORT === + +Project: DeFi DEX Protocol +Platform: Solidity (Ethereum) +Assessment Date: March 15, 2024 +Assessor: Trail of Bits Code Maturity Framework v0.1.0 + +--- + +## EXECUTIVE SUMMARY + +Overall Maturity Score: 2.7 / 4.0 (MODERATE-SATISFACTORY) + +Top 3 Strengths: +✓ Comprehensive testing with 96% coverage and fuzzing +✓ Well-documented access controls with multi-sig governance +✓ Clear architectural documentation with diagrams + +Top 3 Critical Gaps: +⚠ Arithmetic operations lack formal specification +⚠ No event monitoring infrastructure deployed +⚠ Centralized upgrade mechanism without 
timelock + +Priority Recommendation: +Implement arithmetic specification document and add 48-hour timelock +to all governance operations before mainnet launch. + +--- + +## MATURITY SCORECARD + +| Category | Rating | Score | Notes | +|-----------------------------|---------------|-------|---------------------------------| +| 1. Arithmetic | WEAK | 1/4 | Missing specification | +| 2. Auditing | MODERATE | 2/4 | Events present, no monitoring | +| 3. Authentication/Access | SATISFACTORY | 3/4 | Multi-sig, well-documented | +| 4. Complexity Management | MODERATE | 2/4 | Some functions too complex | +| 5. Decentralization | WEAK | 1/4 | Centralized upgrades | +| 6. Documentation | SATISFACTORY | 3/4 | Comprehensive, minor gaps | +| 7. Transaction Ordering | MODERATE | 2/4 | Some MEV risks documented | +| 8. Low-Level Manipulation | SATISFACTORY | 3/4 | Minimal assembly, justified | +| 9. Testing & Verification | STRONG | 4/4 | Excellent coverage & techniques | + +**OVERALL: 2.7 / 4.0** (Moderate-Satisfactory) + +--- + +## DETAILED ANALYSIS + +### 1. ARITHMETIC - WEAK (1/4) + +**Evidence:** +✗ No arithmetic specification document found +✗ AMM pricing formula not documented (src/SwapRouter.sol:89-156) +✗ Slippage calculation lacks precision analysis +✓ Using Solidity 0.8+ for overflow protection +✓ Critical functions tested for edge cases + +**Critical Gap:** +File: src/SwapRouter.sol:127 +```solidity +uint256 amountOut = (reserveOut * amountIn * 997) / (reserveIn * 1000 + amountIn * 997); +``` +No specification for: +- Expected liquidity depth ranges +- Precision loss analysis +- Rounding direction justification + +**To Reach Moderate (2/4):** +- Create arithmetic specification document +- Document all formulas and their precision requirements +- Add explicit rounding direction comments +- Test arithmetic edge cases with fuzzing + +**Files Referenced:** +- src/SwapRouter.sol:89-156 +- src/LiquidityPool.sol:234-267 +- src/PriceCalculator.sol:178-195 + +--- + +### 2. 
AUDITING - MODERATE (2/4) + +**Evidence:** +✓ Events emitted for all critical operations +✓ Consistent event naming (Action + noun) +✓ Indexed parameters for filtering +✗ No off-chain monitoring infrastructure +✗ No monitoring plan documented +✗ No incident response plan + +**Events Found:** 23 events across 8 contracts +- Swap, AddLiquidity, RemoveLiquidity ✓ +- PairCreated, LiquidityProvided ✓ +- OwnershipTransferred, GovernanceProposed ✓ + +**Critical Gap:** +No monitoring alerts for: +- Large swaps causing significant price impact +- Oracle price deviations +- Unusual liquidity withdrawal patterns + +**To Reach Satisfactory (3/4):** +- Deploy off-chain monitoring (Tenderly/Defender) +- Create monitoring playbook document +- Set up alerts for critical events +- Test incident response plan quarterly + +--- + +### 3. AUTHENTICATION/ACCESS CONTROLS - SATISFACTORY (3/4) + +**Evidence:** +✓ All privileged functions have access controls +✓ Multi-sig (3/5) controls governance +✓ Role separation (Admin, Operator, Pauser) +✓ Roles documented in ROLES.md +✓ Two-step ownership transfer +✓ All access patterns tested +✓ Emergency pause by separate role + +**Access Control Implementation:** +- OpenZeppelin AccessControl used consistently +- 4 roles defined with non-overlapping privileges +- Emergency functions require multi-sig + +**Minor Gap:** +Multi-sig is EOA-based (should upgrade to Governor contract) + +**To Reach Strong (4/4):** +- Replace multi-sig EOAs with on-chain Governor +- Add timelock to all parameter changes +- Document key compromise scenarios +- Test governor upgrade path + +**Files Referenced:** +- All contracts use consistent access patterns +- ROLES.md comprehensive +- test/access/* covers all scenarios + +--- + +### 9. 
TESTING & VERIFICATION - STRONG (4/4) + +**Evidence:** +✓ 96% line coverage, 94% branch coverage +✓ 287 unit tests, all passing +✓ Echidna fuzzing for 12 invariants +✓ Integration tests for all workflows +✓ Mutation testing implemented +✓ Tests run in CI/CD +✓ Fork tests against mainnet state + +**Testing Breakdown:** +- Unit: 287 tests (forge test) +- Integration: 45 scenarios (end-to-end flows) +- Fuzzing: 12 invariants (Echidna, 10k runs each) +- Formal: 3 key properties (Certora) +- Fork: Tested against live Uniswap/SushiSwap + +**Uncovered Code:** +- Emergency migration (tested manually) +- Governance upgrade path (one-time) + +**Why Strong:** +Exceeds all satisfactory criteria with formal verification and +extensive fuzzing. Test quality is exceptional. + +--- + +## IMPROVEMENT ROADMAP + +### CRITICAL (Fix Before Mainnet - Week 1-2) + +**1. Create Arithmetic Specification [HIGH IMPACT]** +- Effort: 3-5 days +- Document all formulas with ground-truth models +- Analyze precision loss for each operation +- Justify rounding directions +- Impact: Moves Arithmetic from WEAK → MODERATE + +**2. Add Governance Timelock [HIGH IMPACT]** +- Effort: 2-3 days +- Deploy TimelockController (48-hour delay) +- Update all governance functions +- Document emergency override procedure +- Impact: Moves Decentralization from WEAK → MODERATE + +--- + +### HIGH PRIORITY (Fix Before Launch - Week 3-4) + +**3. Deploy Monitoring Infrastructure [MEDIUM IMPACT]** +- Effort: 3-4 days +- Set up Tenderly/OpenZeppelin Defender +- Create alert rules for critical events +- Document monitoring playbook +- Impact: Moves Auditing from MODERATE → SATISFACTORY + +**4. Simplify Complex Functions [MEDIUM IMPACT]** +- Effort: 5-7 days +- Split SwapRouter.getAmountOut() (cyclomatic complexity: 15) +- Extract PriceCalculator._validateSlippage() logic +- Impact: Moves Complexity from MODERATE → SATISFACTORY + +--- + +### MEDIUM PRIORITY (Improve for V2 - Month 2-3) + +**5. 
Document MEV Risks** +- Effort: 2-3 days +- Create MEV analysis document +- Add slippage protection where missing +- Impact: Moves Transaction Ordering from MODERATE → SATISFACTORY + +**6. Upgrade to On-Chain Governance** +- Effort: 1-2 weeks +- Replace multi-sig with Governor contract +- Add voting period and quorum +- Impact: Moves Authentication from SATISFACTORY → STRONG + +--- + +## CONCLUSION + +The codebase demonstrates **MODERATE-SATISFACTORY maturity** (2.7/4.0), +with excellent testing practices and good documentation. Primary concerns +are arithmetic specification gaps and centralized upgrade control. + +**Recommended Path to Mainnet:** +1. Complete CRITICAL items (arithmetic spec, timelock) +2. Address HIGH priority items (monitoring, complexity) +3. Conduct external audit +4. Launch with documented limitations +5. Implement MEDIUM priority items in V2 + +**Timeline:** 3-4 weeks to address critical/high items before audit. + +--- + +Assessment completed using Trail of Bits Building Secure Contracts +Code Maturity Evaluation Framework v0.1.0 +``` diff --git a/code-maturity-assessor/resources/REPORT_FORMAT.md b/code-maturity-assessor/resources/REPORT_FORMAT.md new file mode 100644 index 0000000..27ead6d --- /dev/null +++ b/code-maturity-assessor/resources/REPORT_FORMAT.md @@ -0,0 +1,33 @@ + +## Report Format + +### Executive Summary +- Project name and platform +- Overall maturity (average rating) +- Top 3 strengths +- Top 3 critical gaps +- Priority recommendations + +### Maturity Scorecard +| Category | Rating | Notes | +|----------|--------|-------| +| Arithmetic | [Rating] | [Key findings] | +| Auditing | [Rating] | [Key findings] | +| ... | ... | ... 
| + +**Overall**: [X.X / 4.0] + +### Detailed Analysis +For each category: +- Rating with justification +- Evidence from codebase (file:line references) +- Gaps identified +- Actions to reach next level + +### Improvement Roadmap +Priority-ordered recommendations: +- **CRITICAL** (immediate) +- **HIGH** (1-2 months) +- **MEDIUM** (2-4 months) + +Each with effort estimate and impact diff --git a/code-review/.skillshare-meta.json b/code-review/.skillshare-meta.json new file mode 100644 index 0000000..8dccbdb --- /dev/null +++ b/code-review/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/getsentry/skills/tree/main/plugins/sentry-skills/skills/code-review", + "type": "github-subdir", + "installed_at": "2026-01-30T02:23:24.577272662Z", + "repo_url": "https://github.com/getsentry/skills.git", + "subdir": "plugins/sentry-skills/skills/code-review", + "version": "bb366a0" +} \ No newline at end of file diff --git a/code-review/SKILL.md b/code-review/SKILL.md new file mode 100644 index 0000000..621007b --- /dev/null +++ b/code-review/SKILL.md @@ -0,0 +1,102 @@ +--- +name: code-review +description: Perform code reviews following Sentry engineering practices. Use when reviewing pull requests, examining code changes, or providing feedback on code quality. Covers security, performance, testing, and design review. +--- + +# Sentry Code Review + +Follow these guidelines when reviewing code for Sentry projects. 
+ +## Review Checklist + +### Identifying Problems + +Look for these issues in code changes: + +- **Runtime errors**: Potential exceptions, null pointer issues, out-of-bounds access +- **Performance**: Unbounded O(n²) operations, N+1 queries, unnecessary allocations +- **Side effects**: Unintended behavioral changes affecting other components +- **Backwards compatibility**: Breaking API changes without migration path +- **ORM queries**: Complex Django ORM with unexpected query performance +- **Security vulnerabilities**: Injection, XSS, access control gaps, secrets exposure + +### Design Assessment + +- Do component interactions make logical sense? +- Does the change align with existing project architecture? +- Are there conflicts with current requirements or goals? + +### Test Coverage + +Every PR should have appropriate test coverage: + +- Functional tests for business logic +- Integration tests for component interactions +- End-to-end tests for critical user paths + +Verify tests cover actual requirements and edge cases. Avoid excessive branching or looping in test code. + +### Long-Term Impact + +Flag for senior engineer review when changes involve: + +- Database schema modifications +- API contract changes +- New framework or library adoption +- Performance-critical code paths +- Security-sensitive functionality + +## Feedback Guidelines + +### Tone + +- Be polite and empathetic +- Provide actionable suggestions, not vague criticism +- Phrase as questions when uncertain: "Have you considered...?" 
+ +### Approval + +- Approve when only minor issues remain +- Don't block PRs for stylistic preferences +- Remember: the goal is risk reduction, not perfect code + +## Common Patterns to Flag + +### Python/Django + +```python +# Bad: N+1 query +for user in users: + print(user.profile.name) # Separate query per user + +# Good: Prefetch related +users = User.objects.prefetch_related('profile') +``` + +### TypeScript/React + +```typescript +// Bad: Missing dependency in useEffect +useEffect(() => { + fetchData(userId); +}, []); // userId not in deps + +// Good: Include all dependencies +useEffect(() => { + fetchData(userId); +}, [userId]); +``` + +### Security + +```python +# Bad: SQL injection risk +cursor.execute(f"SELECT * FROM users WHERE id = {user_id}") + +# Good: Parameterized query +cursor.execute("SELECT * FROM users WHERE id = %s", [user_id]) +``` + +## References + +- [Sentry Code Review Guidelines](https://develop.sentry.dev/engineering-practices/code-review/) diff --git a/commit/.skillshare-meta.json b/commit/.skillshare-meta.json new file mode 100644 index 0000000..78e60b7 --- /dev/null +++ b/commit/.skillshare-meta.json @@ -0,0 +1,8 @@ +{ + "source": "github.com/getsentry/skills/tree/main/plugins/sentry-skills/skills/commit", + "type": "github-subdir", + "installed_at": "2026-01-30T02:23:58.422475751Z", + "repo_url": "https://github.com/getsentry/skills.git", + "subdir": "plugins/sentry-skills/skills/commit", + "version": "bb366a0" +} \ No newline at end of file diff --git a/commit/SKILL.md b/commit/SKILL.md new file mode 100644 index 0000000..144f9e8 --- /dev/null +++ b/commit/SKILL.md @@ -0,0 +1,160 @@ +--- +name: commit +description: Create commit messages following Sentry conventions. Use when committing code changes, writing commit messages, or formatting git history. Follows conventional commits with Sentry-specific issue references. +--- + +# Sentry Commit Messages + +Follow these conventions when creating commits for Sentry projects. 
+ +## Prerequisites + +Before committing, ensure you're working on a feature branch, not the main branch. + +```bash +# Check current branch +git branch --show-current +``` + +If you're on `main` or `master`, create a new branch first: + +```bash +# Create and switch to a new branch +git checkout -b / +``` + +Branch naming should follow the pattern: `/` where type matches the commit type (e.g., `feat/add-user-auth`, `fix/null-pointer-error`, `ref/extract-validation`). + +## Format + +``` +(): + + + +