# Agentic RAG Database Integration

## Overview

This document describes the comprehensive database integration for the agentic RAG system, including session management, performance tracking, analytics, and quality metrics persistence.

## Architecture

### Database Schema

The agentic RAG system uses the following database tables:

#### Core Tables

- `agentic_rag_sessions` - Main session tracking
- `agent_executions` - Individual agent execution steps
- `processing_quality_metrics` - Quality assessment metrics

#### Performance & Analytics Tables

- `performance_metrics` - Performance tracking data
- `session_events` - Session-level audit trail
- `execution_events` - Execution-level audit trail

### Key Features

1. **Atomic Transactions** - All database operations use transactions for data consistency
2. **Performance Tracking** - Comprehensive metrics for processing time, API calls, and costs
3. **Quality Metrics** - Automated quality assessment and scoring
4. **Analytics** - Historical data analysis and reporting
5. **Health Monitoring** - Real-time system health status
6. **Audit Trail** - Complete event logging for debugging and compliance

## Usage

### Basic Session Management

```typescript
import { agenticRAGDatabaseService } from './services/agenticRAGDatabaseService';

// Create a new session
const session = await agenticRAGDatabaseService.createSessionWithTransaction(
  'document-id-123',
  'user-id-456',
  'agentic_rag'
);

// Update session with performance metrics
await agenticRAGDatabaseService.updateSessionWithMetrics(
  session.id,
  {
    status: 'completed',
    completedAgents: 6,
    overallValidationScore: 0.92
  },
  {
    processingTime: 45000,
    apiCalls: 12,
    cost: 0.85
  }
);
```

### Agent Execution Tracking

```typescript
// Create agent execution
const execution = await agenticRAGDatabaseService.createExecutionWithTransaction(
  session.id,
  'document_understanding',
  { text: 'Document content...' }
);

// Update execution with results
await agenticRAGDatabaseService.updateExecutionWithTransaction(
  execution.id,
  {
    status: 'completed',
    outputData: { analysis: 'Analysis result...' },
    processingTimeMs: 5000,
    validationResult: true
  }
);
```

### Quality Metrics Persistence

```typescript
const qualityMetrics = [
  {
    documentId: 'doc-123',
    sessionId: session.id,
    metricType: 'completeness',
    metricValue: 0.85,
    metricDetails: { score: 0.85, missingFields: ['field1'] }
  },
  {
    documentId: 'doc-123',
    sessionId: session.id,
    metricType: 'accuracy',
    metricValue: 0.92,
    metricDetails: { score: 0.92, issues: [] }
  }
];

await agenticRAGDatabaseService.saveQualityMetricsWithTransaction(
  session.id,
  qualityMetrics
);
```

### Analytics and Reporting

```typescript
// Get session metrics
const sessionMetrics = await agenticRAGDatabaseService.getSessionMetrics(sessionId);

// Generate performance report
const startDate = new Date('2024-01-01');
const endDate = new Date('2024-01-31');
const performanceReport = await agenticRAGDatabaseService.generatePerformanceReport(
  startDate,
  endDate
);

// Get health status
const healthStatus = await agenticRAGDatabaseService.getHealthStatus();

// Get analytics data
const analyticsData = await agenticRAGDatabaseService.getAnalyticsData(30); // Last 30 days
```

## Performance Considerations

### Database Indexes

The system includes optimized indexes for common query patterns:

```sql
-- Session queries
CREATE INDEX idx_agentic_rag_sessions_document_id ON agentic_rag_sessions(document_id);
CREATE INDEX idx_agentic_rag_sessions_user_id ON agentic_rag_sessions(user_id);
CREATE INDEX idx_agentic_rag_sessions_status ON agentic_rag_sessions(status);
CREATE INDEX idx_agentic_rag_sessions_created_at ON agentic_rag_sessions(created_at);

-- Execution queries
CREATE INDEX idx_agent_executions_session_id ON agent_executions(session_id);
CREATE INDEX idx_agent_executions_agent_name ON agent_executions(agent_name);
CREATE INDEX idx_agent_executions_status ON agent_executions(status);

-- Performance metrics
CREATE INDEX idx_performance_metrics_session_id ON performance_metrics(session_id);
CREATE INDEX idx_performance_metrics_metric_type ON performance_metrics(metric_type);
```

### Query Optimization

1. **Batch Operations** - Use transactions for multiple related operations
2. **Connection Pooling** - Reuse database connections efficiently
3. **Async Operations** - Non-blocking database operations
4. **Error Handling** - Graceful degradation on database failures

### Data Retention

```typescript
// Clean up old data (default: 30 days)
const cleanupResult = await agenticRAGDatabaseService.cleanupOldData(30);
console.log(`Cleaned up ${cleanupResult.sessionsDeleted} sessions and ${cleanupResult.metricsDeleted} metrics`);
```

## Monitoring and Alerting

### Health Checks

The system provides comprehensive health monitoring:

```typescript
const healthStatus = await agenticRAGDatabaseService.getHealthStatus();

// Check overall health
if (healthStatus.status === 'unhealthy') {
  // Send alert
  await sendAlert('Agentic RAG system is unhealthy', healthStatus);
}

// Check individual agents
Object.entries(healthStatus.agents).forEach(([agentName, metrics]) => {
  if (metrics.status === 'unhealthy') {
    console.log(`Agent ${agentName} is unhealthy: ${metrics.successRate * 100}% success rate`);
  }
});
```

### Performance Thresholds

Configure alerts based on performance metrics:

```typescript
const report = await agenticRAGDatabaseService.generatePerformanceReport(
  new Date(Date.now() - 24 * 60 * 60 * 1000), // Last 24 hours
  new Date()
);

// Alert on high processing time
if (report.averageProcessingTime > 120000) { // 2 minutes
  await sendAlert('High processing time detected', report);
}

// Alert on low success rate
if (report.successRate < 0.9) { // 90%
  await sendAlert('Low success rate detected', report);
}

// Alert on high costs
if (report.averageCost > 5.0) { // $5 per document
  await sendAlert('High cost per document detected', report);
}
```

## Error Handling

### Database Connection Failures

```typescript
try {
  const session = await agenticRAGDatabaseService.createSessionWithTransaction(
    documentId,
    userId,
    strategy
  );
} catch (error) {
  // `error` is typed `unknown` under strict TypeScript, so narrow it before reading `code`
  const code = (error as { code?: string } | null)?.code;
  if (code === 'ECONNREFUSED') {
    // Database connection failed
    logger.error('Database connection failed', { error });

    // Implement fallback strategy
    return await fallbackProcessing(documentId, userId);
  }
  throw error;
}
```

### Transaction Rollbacks

The system automatically handles transaction rollbacks on errors:

```typescript
// If any operation in the transaction fails, all changes are rolled back
const client = await db.connect();
try {
  await client.query('BEGIN');
  // ... operations ...
  await client.query('COMMIT');
} catch (error) {
  await client.query('ROLLBACK');
  throw error;
} finally {
  client.release();
}
```

## Testing

### Running Database Integration Tests

```bash
# Run the comprehensive test suite
node test-agentic-rag-database-integration.js
```

The test suite covers:

- Session creation and management
- Agent execution tracking
- Quality metrics persistence
- Performance tracking
- Analytics and reporting
- Health monitoring
- Data cleanup

### Test Data Management

```typescript
// Clean up test data after tests
await agenticRAGDatabaseService.cleanupOldData(0); // Clean today's data
```

## Maintenance

### Regular Maintenance Tasks

1. **Data Cleanup** - Remove old sessions and metrics
2. **Index Maintenance** - Rebuild indexes for optimal performance
3. **Performance Monitoring** - Track query performance and optimize
4. **Backup Verification** - Ensure data integrity

### Backup Strategy

```bash
# Backup agentic RAG tables
pg_dump -t agentic_rag_sessions -t agent_executions -t processing_quality_metrics \
  -t performance_metrics -t session_events -t execution_events \
  your_database > agentic_rag_backup.sql
```

### Migration Management

```bash
# Run migrations
psql -d your_database -f src/models/migrations/009_create_agentic_rag_tables.sql
psql -d your_database -f src/models/migrations/010_add_performance_metrics_and_events.sql
```

## Configuration

### Environment Variables

```bash
# Agentic RAG Database Configuration
AGENTIC_RAG_ENABLED=true
AGENTIC_RAG_MAX_AGENTS=6
AGENTIC_RAG_PARALLEL_PROCESSING=true
AGENTIC_RAG_VALIDATION_STRICT=true
AGENTIC_RAG_RETRY_ATTEMPTS=3
AGENTIC_RAG_TIMEOUT_PER_AGENT=60000

# Quality Control
AGENTIC_RAG_QUALITY_THRESHOLD=0.8
AGENTIC_RAG_COMPLETENESS_THRESHOLD=0.9
AGENTIC_RAG_CONSISTENCY_CHECK=true

# Monitoring and Logging
AGENTIC_RAG_DETAILED_LOGGING=true
AGENTIC_RAG_PERFORMANCE_TRACKING=true
AGENTIC_RAG_ERROR_REPORTING=true
```

## Troubleshooting

### Common Issues

1. **High Processing Times**
   - Check database connection pool size
   - Monitor query performance
   - Consider database optimization

2. **Memory Usage**
   - Monitor JSONB field sizes
   - Implement data archiving
   - Optimize query patterns

3. **Connection Pool Exhaustion**
   - Increase connection pool size
   - Implement connection timeout
   - Add connection health checks

### Debugging

```typescript
// Enable detailed logging
process.env.AGENTIC_RAG_DETAILED_LOGGING = 'true';

// Check session events
const events = await db.query(
  'SELECT * FROM session_events WHERE session_id = $1 ORDER BY created_at',
  [sessionId]
);

// Check execution events
const executionEvents = await db.query(
  'SELECT * FROM execution_events WHERE execution_id = $1 ORDER BY created_at',
  [executionId]
);
```

## Best Practices

1. **Use Transactions** - Always use transactions for related operations
2. **Monitor Performance** - Regularly check performance metrics
3. **Implement Cleanup** - Schedule regular data cleanup
4. **Handle Errors Gracefully** - Implement proper error handling and fallbacks
5. **Backup Regularly** - Maintain regular backups of agentic RAG data
6. **Monitor Health** - Set up health checks and alerting
7. **Optimize Queries** - Monitor and optimize slow queries
8. **Scale Appropriately** - Plan for database scaling as usage grows

## Future Enhancements

1. **Real-time Analytics** - Implement real-time dashboard
2. **Advanced Metrics** - Add more sophisticated performance metrics
3. **Data Archiving** - Implement automatic data archiving
4. **Multi-region Support** - Support for distributed databases
5. **Advanced Monitoring** - Integration with external monitoring tools
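
## Appendix: Threshold Evaluation Sketch

The alert checks in the Performance Thresholds section can be collected into one pure function, which makes the limits configurable and easy to unit-test without a database. The following is a minimal sketch, not part of the service itself: `PerformanceReport`, `AlertThresholds`, `DEFAULT_THRESHOLDS`, and `evaluateThresholds` are hypothetical names introduced for illustration. The report field names and the default limits are copied from the Performance Thresholds example; the real report type returned by `generatePerformanceReport` may carry more fields.

```typescript
// Assumed shape: only the three report fields used in the Performance Thresholds example.
interface PerformanceReport {
  averageProcessingTime: number; // milliseconds
  successRate: number;           // fraction between 0 and 1
  averageCost: number;           // dollars per document
}

// Hypothetical configuration type for the alert limits.
interface AlertThresholds {
  maxProcessingTimeMs: number;
  minSuccessRate: number;
  maxAverageCost: number;
}

// Defaults mirror the literal values used in the Performance Thresholds example.
const DEFAULT_THRESHOLDS: AlertThresholds = {
  maxProcessingTimeMs: 120000, // 2 minutes
  minSuccessRate: 0.9,         // 90%
  maxAverageCost: 5.0,         // $5 per document
};

// Returns one message per violated threshold; an empty array means nothing to alert on.
function evaluateThresholds(
  report: PerformanceReport,
  thresholds: AlertThresholds = DEFAULT_THRESHOLDS
): string[] {
  const alerts: string[] = [];
  if (report.averageProcessingTime > thresholds.maxProcessingTimeMs) {
    alerts.push('High processing time detected');
  }
  if (report.successRate < thresholds.minSuccessRate) {
    alerts.push('Low success rate detected');
  }
  if (report.averageCost > thresholds.maxAverageCost) {
    alerts.push('High cost per document detected');
  }
  return alerts;
}

// Example with made-up numbers: too slow and failing too often, but within the cost limit.
const sampleAlerts = evaluateThresholds({
  averageProcessingTime: 150000,
  successRate: 0.85,
  averageCost: 1.2,
});
console.log(sampleAlerts);
```

Each returned message could then be passed to whatever alerting function the deployment uses, such as the `sendAlert` call shown earlier. Keeping the comparison logic separate from the alert delivery means the limits can be tested with plain objects.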