- Add pre-deploy-check.sh script to validate that .env doesn't contain secrets
- Add clean-env-secrets.sh script to remove secrets from .env before deployment
- Update deploy:firebase script to run validation automatically
- Add sync-secrets npm script for local development
- Add deploy:firebase:force for deployments that skip validation

This prevents 'Secret environment variable overlaps non secret environment variable' errors by ensuring that secrets defined via defineSecret() are not also present in the .env file.

## Completed Todos

- ✅ Test financial extraction with Stax Holding Company CIM
  - All values correct (FY-3: $64M, FY-2: $71M, FY-1: $71M, LTM: $76M)
- ✅ Implement deterministic parser fallback
  - Integrated into simpleDocumentProcessor
- ✅ Implement few-shot examples
  - Added comprehensive examples for PRIMARY table identification
- ✅ Fix primary table identification
  - Financial extraction now correctly identifies PRIMARY table (millions) vs subsidiary tables (thousands)

## Pending Todos

1. Review older commits (1-2 months ago) to see how financial extraction was working then
   - Check commits: 185c780 (Claude 3.7), 5b3b1bf (Document AI fixes), 0ec3d14 (multi-pass extraction)
   - Compare prompt simplicity: older versions may have had simpler, more effective prompts
   - Check if deterministic parser was being used more effectively
2. Review best practices for structured financial data extraction from PDFs/CIMs
   - Research: LLM prompt engineering for tabular data (few-shot examples, chain-of-thought)
   - Period identification strategies
   - Validation techniques
   - Hybrid approaches (deterministic + LLM)
   - Error handling patterns
   - Check academic papers and industry case studies
3. Determine how to reduce processing time without sacrificing accuracy
   - Options:
     1) Use Claude Haiku 4.5 for initial extraction, Sonnet 4.5 for validation
     2) Parallel extraction of different sections
     3) Caching common patterns
     4) Streaming responses
     5) Incremental processing with early validation
     6) Reduce prompt verbosity while maintaining clarity
4. Add unit tests for financial extraction validation logic
   - Test: invalid value rejection, cross-period validation, numeric extraction
   - Period identification from various formats (years, FY-X, mixed)
   - Include edge cases: missing periods, projections mixed with historical, inconsistent formatting
5. Monitor production financial extraction accuracy
   - Track: extraction success rate, validation rejection rate, common error patterns
   - User feedback on extracted financial data
   - Set up alerts for validation failures and extraction inconsistencies
6. Optimize prompt size for financial extraction
   - Current prompts may be too verbose
   - Test shorter, more focused prompts that maintain accuracy
   - Consider: removing redundant instructions, using more concise examples, focusing on critical rules only
7. Add financial data visualization
   - Consider adding a financial data preview/validation step in the UI
   - Allow users to verify/correct extracted values if needed
   - Provides human-in-the-loop validation for critical financial data
8. Document extraction strategies
   - Document the different financial table formats found in CIMs
   - Create a reference guide for common patterns (years format, FY-X format, mixed format, etc.)
   - This will help with prompt engineering and parser improvements
9. Compare RAG-based extraction vs simple full-document extraction for financial accuracy
   - Determine which approach produces more accurate financial data and why
   - May need a hybrid approach
10. Add confidence scores to financial extraction results
    - Flag low-confidence extractions for manual review
    - Helps identify when extraction may be incorrect and needs human validation
49 lines
1.1 KiB
Bash
Executable File
#!/bin/bash
# Pre-deployment validation script
# Checks for environment variable conflicts before deploying Firebase Functions

set -e

echo "🔍 Pre-deployment validation..."

# List of secrets that should NOT be in .env
SECRETS=(
  "ANTHROPIC_API_KEY"
  "OPENAI_API_KEY"
  "OPENROUTER_API_KEY"
  "DATABASE_URL"
  "SUPABASE_SERVICE_KEY"
  "SUPABASE_ANON_KEY"
  "EMAIL_PASS"
)

CONFLICTS=0

if [ -f .env ]; then
  echo "Checking .env file for secret conflicts..."

  for secret in "${SECRETS[@]}"; do
    if grep -q "^${secret}=" .env; then
      echo "⚠️ CONFLICT: ${secret} is in .env but should only be a Firebase Secret"
      CONFLICTS=$((CONFLICTS + 1))
    fi
  done

  if [ $CONFLICTS -gt 0 ]; then
    echo ""
    echo "❌ Found ${CONFLICTS} conflict(s). Please remove these from .env:"
    echo ""
    echo "For local development, use: npm run sync-secrets"
    echo "This will temporarily add secrets to .env for local testing."
    echo ""
    echo "To fix now, run: npm run clean-env-secrets"
    exit 1
  fi
else
  echo "✅ No .env file found (this is fine for deployment)"
fi

echo "✅ Pre-deployment check passed!"
exit 0
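
The check above points users to `npm run clean-env-secrets`, but the contents of clean-env-secrets.sh are not shown in this view. The following is a minimal sketch of what such a cleanup step might look like, not the actual script from this commit. It assumes the same secret list as pre-deploy-check.sh and the same `NAME=` line format that the check greps for; the function name `clean_env_secrets` and its optional file argument are hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of a clean-env-secrets step; the real
# clean-env-secrets.sh is not shown in this commit view.

# Assumed to mirror the list in pre-deploy-check.sh.
SECRETS=(
  "ANTHROPIC_API_KEY"
  "OPENAI_API_KEY"
  "OPENROUTER_API_KEY"
  "DATABASE_URL"
  "SUPABASE_SERVICE_KEY"
  "SUPABASE_ANON_KEY"
  "EMAIL_PASS"
)

# Remove every "NAME=..." line for the listed secrets from an env file.
# Usage: clean_env_secrets [path]   (path defaults to .env)
clean_env_secrets() {
  local env_file="${1:-.env}"
  local pattern tmp

  if [ ! -f "$env_file" ]; then
    echo "No ${env_file} file found, nothing to clean"
    return 0
  fi

  # Join the names into one anchored alternation: ^(A|B|C)=
  pattern="^($(IFS='|'; echo "${SECRETS[*]}"))="

  tmp="$(mktemp)"
  # grep -v exits with status 1 when every line is filtered out;
  # an empty result is acceptable here, so that status is ignored.
  grep -Ev "$pattern" "$env_file" > "$tmp" || true
  mv "$tmp" "$env_file"

  echo "Removed secret entries from ${env_file}"
}
```

Calling `clean_env_secrets` with no argument would rewrite `.env` in the current directory. The pattern is anchored at the start of the line and ends with `=`, matching the check in pre-deploy-check.sh, so a variable whose name merely contains a secret name (for example a hypothetical `MY_OPENAI_API_KEY_NOTE`) is left alone. Because the file is replaced by a `mktemp` file, its permissions become owner-only, which seems a reasonable default for an env file but is a design choice of this sketch.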