Add Bluepoint logo integration to PDF reports and web navigation
TROUBLESHOOTING_GUIDE.md (normal file, 606 lines added)

# Troubleshooting Guide
## Complete Problem Resolution for CIM Document Processor

### 🎯 Overview

This guide covers common issues in the CIM Document Processor. Each issue lists its symptoms and solutions, with diagnostic steps and prevention strategies where they apply.

---

## 🔍 Diagnostic Procedures

### System Health Check

#### **Quick Health Assessment**
```bash
# -f makes curl exit with a non-zero status on HTTP 4xx/5xx responses

# Check application health
curl -f http://localhost:5000/health

# Check database connectivity
curl -f http://localhost:5000/api/documents

# Check authentication service
curl -f http://localhost:5000/api/auth/status
```

#### **Comprehensive Health Check**
```typescript
// utils/diagnostics.ts
// The check*Health helpers are assumed to be defined elsewhere in the project.
export const runSystemDiagnostics = async () => {
  const diagnostics = {
    timestamp: new Date().toISOString(),
    services: {
      database: await checkDatabaseHealth(),
      storage: await checkStorageHealth(),
      auth: await checkAuthHealth(),
      ai: await checkAIHealth()
    },
    resources: {
      memory: process.memoryUsage(),
      cpu: process.cpuUsage(),
      uptime: process.uptime()
    }
  };

  return diagnostics;
};
```
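
The report above is easier to act on when it is reduced to a single status. The following is a minimal sketch, assuming each `check*Health` helper resolves to an object with a boolean `healthy` field; that shape, the `ServiceHealth` type, and the name `summarizeDiagnostics` are illustrative assumptions, not part of the project.

```typescript
// Assumed result shape for each health probe (hypothetical).
type ServiceHealth = { healthy: boolean; error?: string };
type OverallStatus = 'healthy' | 'degraded' | 'down';

// Collapse per-service results into one status plus the names of failing services.
const summarizeDiagnostics = (services: Record<string, ServiceHealth>) => {
  const entries = Object.entries(services);
  const failing = entries
    .filter(([, health]) => !health.healthy)
    .map(([name]) => name);

  let status: OverallStatus = 'healthy';
  if (failing.length > 0) {
    // "down" only when every service fails; otherwise some functionality remains
    status = failing.length === entries.length ? 'down' : 'degraded';
  }

  return { status, failing };
};
```

Passing `diagnostics.services` to this function would give an alerting job one field to check, with `failing` pointing at the section of this guide to read first.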

---

## 🚨 Common Issues and Solutions

### Authentication Issues

#### **Problem**: User cannot log in
**Symptoms**:
- Login form shows "Invalid credentials"
- Firebase authentication errors
- Token validation failures

**Diagnostic Steps**:
1. Check Firebase project configuration
2. Verify authentication tokens
3. Check network connectivity to Firebase
4. Review authentication logs

**Solutions**:
```typescript
// Assumes admin.initializeApp() has already been called at startup
import * as admin from 'firebase-admin';

// Check Firebase configuration
const firebaseConfig = {
  apiKey: process.env.FIREBASE_API_KEY,
  authDomain: process.env.FIREBASE_AUTH_DOMAIN,
  projectId: process.env.FIREBASE_PROJECT_ID
};

// Verify token validation
const verifyToken = async (token: string) => {
  try {
    const decodedToken = await admin.auth().verifyIdToken(token);
    return { valid: true, user: decodedToken };
  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
    const message = error instanceof Error ? error.message : String(error);
    logger.error('Token verification failed', { error: message });
    return { valid: false, error: message };
  }
};
```

**Prevention**:
- Regular Firebase configuration validation
- Token refresh mechanism
- Proper error handling in authentication flow

#### **Problem**: Token expiration issues
**Symptoms**:
- Users logged out unexpectedly
- API requests returning 401 errors
- Authentication state inconsistencies

**Solutions**:
```typescript
// Implement token refresh
const apiKey = process.env.FIREBASE_API_KEY;

const refreshToken = async (currentRefreshToken: string) => {
  try {
    // The secure token endpoint is documented as taking a form-encoded body
    const response = await fetch(`https://securetoken.googleapis.com/v1/token?key=${apiKey}`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({
        grant_type: 'refresh_token',
        refresh_token: currentRefreshToken
      })
    });

    // fetch resolves on HTTP error statuses too, so check the status explicitly
    if (!response.ok) {
      return { success: false, error: `Token refresh failed with status ${response.status}` };
    }

    const data = await response.json();
    return { success: true, token: data.id_token };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { success: false, error: message };
  }
};
```
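
Unexpected logouts and 401 responses often come from refreshing only after a token has already expired. As a hedged sketch, the check below decides whether to refresh ahead of time from the token's `exp` claim (the standard JWT expiry, in seconds since the Unix epoch, which the decoded Firebase ID token exposes). The function name and the five-minute window are assumptions chosen for illustration.

```typescript
// Returns true when the token is expired or will expire within the refresh window.
const tokenNeedsRefresh = (
  expSeconds: number,                // `exp` claim of the decoded ID token
  nowMs: number = Date.now(),        // injectable clock, handy for tests
  refreshWindowMs = 5 * 60 * 1000    // hypothetical default: refresh 5 minutes early
): boolean => expSeconds * 1000 - nowMs <= refreshWindowMs;
```

A client could run this check before each API call and refresh the token only when it returns `true`, which avoids both needless refreshes and requests sent with a token that expires in flight.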

### Document Upload Issues

#### **Problem**: File upload fails
**Symptoms**:
- Upload progress stops
- Error messages about file size or type
- Storage service errors

**Diagnostic Steps**:
1. Check file size and type validation
2. Verify Firebase Storage configuration
3. Check network connectivity
4. Review storage permissions

**Solutions**:
```typescript
// Enhanced file validation
const validateFile = (file: File) => {
  const maxSize = 100 * 1024 * 1024; // 100MB
  const allowedTypes = [
    'application/pdf',
    'application/msword',
    'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
  ];

  if (file.size > maxSize) {
    return { valid: false, error: 'File too large' };
  }

  if (!allowedTypes.includes(file.type)) {
    return { valid: false, error: 'Invalid file type' };
  }

  return { valid: true };
};

// Storage error handling
const uploadWithRetry = async (file: File, maxRetries = 3) => {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const result = await uploadToStorage(file);
      return result;
    } catch (error) {
      if (attempt === maxRetries) throw error;
      // Linear backoff: wait 1s, then 2s, before the next attempt
      await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
    }
  }
};
```

#### **Problem**: Upload progress stalls
**Symptoms**:
- Progress bar stops advancing
- No error messages
- Upload appears to hang

**Solutions**:
```typescript
// Implement upload timeout
const uploadWithTimeout = async (file: File, timeoutMs = 300000) => {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeoutPromise = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('Upload timeout')), timeoutMs);
  });

  try {
    // The race rejects on timeout but does not cancel the underlying upload
    return await Promise.race([uploadToStorage(file), timeoutPromise]);
  } finally {
    // Clear the timer so it cannot fire after the upload has settled
    clearTimeout(timer);
  }
};

// Add progress monitoring
const monitorUploadProgress = (uploadTask: any, onProgress: (progress: number) => void) => {
  uploadTask.on('state_changed',
    (snapshot: any) => {
      const progress = (snapshot.bytesTransferred / snapshot.totalBytes) * 100;
      onProgress(progress);
    },
    (error: any) => {
      console.error('Upload error:', error);
    },
    () => {
      onProgress(100);
    }
  );
};
```
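
A fixed overall timeout only fires after the full five minutes, while the symptom here is a transfer that stops advancing. One possible complement, sketched under assumptions, is a stall detector that tracks when the transferred byte count last changed. `createStallDetector` and its injectable clock are hypothetical names introduced for this example.

```typescript
// Tracks the last time upload progress changed and reports a stall after a quiet period.
const createStallDetector = (stallAfterMs: number, now: () => number = Date.now) => {
  let lastBytes = -1;
  let lastChangeAt = now();

  return {
    // Call from the 'state_changed' handler with snapshot.bytesTransferred
    record(bytesTransferred: number): void {
      if (bytesTransferred !== lastBytes) {
        lastBytes = bytesTransferred;
        lastChangeAt = now();
      }
    },
    // Call periodically (for example from setInterval) to decide whether to cancel and retry
    isStalled(): boolean {
      return now() - lastChangeAt >= stallAfterMs;
    }
  };
};
```

In use, `record` would be called inside the progress callback shown above, and a timer would poll `isStalled()` to surface an error to the user instead of leaving the progress bar frozen.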

### Document Processing Issues

#### **Problem**: Document processing fails
**Symptoms**:
- Documents stuck in "processing" status
- AI processing errors
- PDF generation failures

**Diagnostic Steps**:
1. Check Document AI service status
2. Verify LLM API credentials
3. Review processing logs
4. Check system resources

**Solutions**:
```typescript
// Caught values are `unknown` under strict TypeScript, so narrow them before logging
const errorMessage = (e: unknown) => (e instanceof Error ? e.message : String(e));

// Enhanced error handling for Document AI
const processWithFallback = async (document: Document) => {
  try {
    // Try Document AI first
    const result = await processWithDocumentAI(document);
    return result;
  } catch (error) {
    logger.warn('Document AI failed, trying fallback', { error: errorMessage(error) });

    // Fallback to local processing
    try {
      const result = await processWithLocalParser(document);
      return result;
    } catch (fallbackError) {
      logger.error('Both Document AI and fallback failed', {
        documentAIError: errorMessage(error),
        fallbackError: errorMessage(fallbackError)
      });
      throw new Error('Document processing failed');
    }
  }
};

// LLM service error handling
const callLLMWithRetry = async (prompt: string, maxRetries = 3) => {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await callLLM(prompt);
      return response;
    } catch (error) {
      if (attempt === maxRetries) throw error;

      // Exponential backoff: 2s after the first failure, 4s after the second
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
};
```
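
Plain exponential backoff grows without bound and makes concurrent workers retry in lockstep. The sketch below is one common variation, not the project's implementation: it caps the delay and adds "equal jitter" (half fixed, half random). The name `backoffDelayMs`, the 30-second cap, and the injectable `random` parameter are assumptions made for the example.

```typescript
// Capped exponential backoff with equal jitter.
const backoffDelayMs = (
  attempt: number,                         // 1-based attempt number, as in callLLMWithRetry
  baseMs = 1000,
  capMs = 30000,                           // hypothetical upper bound on the delay
  random: () => number = Math.random       // injectable so the result can be tested
): number => {
  const exponential = Math.min(capMs, baseMs * Math.pow(2, attempt));
  const half = exponential / 2;
  // Keep half of the delay fixed and randomize the other half
  return half + Math.floor(random() * half);
};
```

Swapping `Math.pow(2, attempt) * 1000` for `backoffDelayMs(attempt)` in the retry loop would keep the same growth pattern while spreading retries from parallel documents over time.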

#### **Problem**: PDF generation fails
**Symptoms**:
- PDF generation errors
- Missing PDF files
- Generation timeout

**Solutions**:
```typescript
// Assumes the PDF renderer is Puppeteer; adjust the import for another headless-browser library
import puppeteer, { Browser } from 'puppeteer';

let browser: Browser;

// PDF generation with error handling
const generatePDFWithRetry = async (content: string, maxRetries = 3) => {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const pdf = await generatePDF(content);
      return pdf;
    } catch (error) {
      if (attempt === maxRetries) throw error;

      // Clear browser cache and retry
      await clearBrowserCache();
      await new Promise(resolve => setTimeout(resolve, 2000));
    }
  }
};

// Browser resource management
// Restarting the browser discards its in-memory cache along with any open pages
const clearBrowserCache = async () => {
  try {
    await browser.close();
    // launch() belongs to the library object, not to a Browser instance
    browser = await puppeteer.launch();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    logger.error('Failed to clear browser cache', { error: message });
  }
};
```

### Database Issues

#### **Problem**: Database connection failures
**Symptoms**:
- API errors with database connection messages
- Slow response times
- Connection pool exhaustion

**Diagnostic Steps**:
1. Check Supabase service status
2. Verify database credentials
3. Check connection pool settings
4. Review query performance

**Solutions**:
```typescript
import { Pool } from 'pg';

// Connection pool management
const createConnectionPool = () => {
  return new Pool({
    connectionString: process.env.DATABASE_URL,
    max: 20, // Maximum number of connections
    idleTimeoutMillis: 30000, // Close idle connections after 30 seconds
    connectionTimeoutMillis: 2000, // Return an error after 2 seconds if connection could not be established
  });
};

const pool = createConnectionPool();

// Query timeout handling
const executeQueryWithTimeout = async (query: string, params: any[], timeoutMs = 5000) => {
  const client = await pool.connect();
  let timer: ReturnType<typeof setTimeout> | undefined;

  try {
    // On timeout the query keeps running on the server; the race only stops waiting for it
    const result = await Promise.race([
      client.query(query, params),
      new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error('Query timeout')), timeoutMs);
      })
    ]);

    return result;
  } finally {
    // Stop the timer so it cannot fire after the query has settled
    clearTimeout(timer);
    client.release();
  }
};
```
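
To tell pool exhaustion apart from a slow database, it can help to look at the pool's own counters. node-postgres exposes `totalCount`, `idleCount`, and `waitingCount` on a `Pool`; the sketch below interprets them. The function `describePoolPressure` and its definition of "exhausted" are assumptions for illustration, and it accepts a plain object so it can be tried without a database.

```typescript
// Structural subset of the counters a node-postgres Pool exposes.
type PoolCounts = { totalCount: number; idleCount: number; waitingCount: number };

// Summarize how close the pool is to its configured maximum.
const describePoolPressure = (counts: PoolCounts, max: number) => {
  const active = counts.totalCount - counts.idleCount;
  // Treated as exhausted when the pool is at capacity, nothing is idle, and callers are queued
  const exhausted =
    counts.totalCount >= max && counts.idleCount === 0 && counts.waitingCount > 0;

  return { active, utilization: max > 0 ? active / max : 0, exhausted };
};
```

Calling `describePoolPressure(pool, 20)` with the pool configured above and logging the result periodically would show whether timeouts line up with queued callers or with slow queries on an otherwise idle pool.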

#### **Problem**: Slow database queries
**Symptoms**:
- Long response times
- Database timeout errors
- High CPU usage

**Solutions**:
```typescript
// Query optimization
// Placeholder: returns the query unchanged. The comments list the work to do by hand.
const optimizeQuery = (query: string) => {
  // Add proper indexes
  // Use query planning
  // Implement pagination
  return query;
};

// Implement query caching
const queryCache = new Map<string, { data: any; timestamp: number }>();

const cachedQuery = async (key: string, queryFn: () => Promise<any>, ttlMs = 300000) => {
  const cached = queryCache.get(key);
  if (cached && Date.now() - cached.timestamp < ttlMs) {
    return cached.data;
  }

  const data = await queryFn();
  queryCache.set(key, { data, timestamp: Date.now() });
  return data;
};
```
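
The "implement pagination" step can be made concrete with a small query builder. This is a sketch under assumptions: it appends parameterized `LIMIT`/`OFFSET` clauses to a query that already has a deterministic `ORDER BY`, and returns the `{ text, values }` shape that node-postgres accepts. The name `buildPageQuery`, the page-size cap of 100, and the `documents` table in the usage note are hypothetical.

```typescript
// Append parameterized LIMIT/OFFSET to a base query that already has an ORDER BY.
const buildPageQuery = (
  baseQuery: string,
  page: number,                     // 1-based page number
  pageSize: number,
  existingParams: unknown[] = []    // values for placeholders already in baseQuery
) => {
  const safeSize = Math.min(Math.max(1, Math.floor(pageSize)), 100); // hypothetical cap
  const safePage = Math.max(1, Math.floor(page));
  const limitIndex = existingParams.length + 1;

  return {
    text: `${baseQuery} LIMIT $${limitIndex} OFFSET $${limitIndex + 1}`,
    values: [...existingParams, safeSize, (safePage - 1) * safeSize]
  };
};
```

For example, `buildPageQuery('SELECT id FROM documents WHERE user_id = $1 ORDER BY id', 3, 25, ['u1'])` yields placeholders `$2` and `$3` with values `25` and `50`. Large offsets still force the database to skip rows, so keyset pagination may be a better fit for deep pages.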

### Performance Issues

#### **Problem**: Slow application response
**Symptoms**:
- High response times
- Timeout errors
- User complaints about slowness

**Diagnostic Steps**:
1. Monitor CPU and memory usage
2. Check database query performance
3. Review external service response times
4. Analyze request patterns

**Solutions**:
```typescript
import { Request, Response, NextFunction } from 'express';

// Performance monitoring
const performanceMiddleware = (req: Request, res: Response, next: NextFunction) => {
  const start = Date.now();

  res.on('finish', () => {
    const duration = Date.now() - start;

    if (duration > 5000) {
      logger.warn('Slow request detected', {
        method: req.method,
        path: req.path,
        duration,
        userAgent: req.get('User-Agent')
      });
    }
  });

  next();
};

// Implement caching
// The key does not include the user, so apply this only to routes whose
// responses are the same for every caller.
const cacheMiddleware = (ttlMs = 300000) => {
  const cache = new Map<string, { data: unknown; timestamp: number }>();

  return (req: Request, res: Response, next: NextFunction) => {
    // Only cache reads; replaying a cached body for a write would skip the write
    if (req.method !== 'GET') return next();

    const key = `${req.method}:${req.path}:${JSON.stringify(req.query)}`;
    const cached = cache.get(key);

    if (cached && Date.now() - cached.timestamp < ttlMs) {
      return res.json(cached.data);
    }

    const originalJson = res.json.bind(res);
    res.json = (data: unknown) => {
      // Do not cache error responses
      if (res.statusCode < 400) {
        cache.set(key, { data, timestamp: Date.now() });
      }
      return originalJson(data);
    };

    next();
  };
};
```
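
A single 5000 ms threshold only catches outliers; when analyzing request patterns, percentiles over the recorded durations tend to show gradual slowdowns earlier. The helper below is a minimal sketch using the nearest-rank method. The name `percentile` and the idea of feeding it durations collected by the middleware are assumptions for illustration.

```typescript
// Nearest-rank percentile of a list of durations (in any unit).
const percentile = (values: number[], p: number): number => {
  if (values.length === 0) return NaN;

  // Copy before sorting so the caller's array is left untouched
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
};
```

Comparing `percentile(durations, 50)` with `percentile(durations, 95)` over a time window separates a uniformly slow service from one where only a few requests, such as large uploads, are slow.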

---

## 🔧 Debugging Tools

### Log Analysis

#### **Structured Logging**
```typescript
import winston from 'winston';

// Enhanced logging
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  defaultMeta: {
    service: 'cim-processor',
    version: process.env.APP_VERSION,
    environment: process.env.NODE_ENV
  },
  transports: [
    // Paths match the logs/ directory used by the log analysis commands
    new winston.transports.File({ filename: 'logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: 'logs/combined.log' }),
    new winston.transports.Console({
      format: winston.format.simple()
    })
  ]
});
```

#### **Log Analysis Commands**
```bash
# Find errors in logs
grep -i "error" logs/combined.log | tail -20

# Find slow requests (5000 ms or more; assumes compact JSON such as "duration":5234)
grep -E '"duration":([5-9][0-9]{3}|[0-9]{5,})' logs/combined.log

# Find authentication failures
grep -i "auth.*fail" logs/combined.log

# Monitor real-time logs
tail -f logs/combined.log | grep -E "(error|warn|critical)"
```
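
Pattern matching on digits is fragile for numeric thresholds, so the slow-request search can also be done by parsing each JSON log line and comparing numbers. This is a sketch, assuming one JSON object per line with the `duration` and `path` fields that the slow-request log entries carry; `findSlowRequests` is a hypothetical helper name.

```typescript
type SlowRequest = { path: string; duration: number };

// Scan JSON log lines and return the requests slower than the threshold.
const findSlowRequests = (logLines: string[], thresholdMs = 5000): SlowRequest[] => {
  const slow: SlowRequest[] = [];

  for (const line of logLines) {
    try {
      const entry = JSON.parse(line);
      if (typeof entry.duration === 'number' && entry.duration > thresholdMs) {
        slow.push({ path: String(entry.path), duration: entry.duration });
      }
    } catch {
      // Skip lines that are not usable JSON objects (partial writes, plain-text output)
    }
  }

  return slow;
};
```

The input could come from splitting the contents of `logs/combined.log` on newlines; sorting the result by `duration` then gives a quick list of the worst endpoints.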

### Debug Endpoints

#### **Debug Information Endpoint**
```typescript
// routes/debug.ts
router.get('/debug/info', async (req: Request, res: Response) => {
  const debugInfo = {
    timestamp: new Date().toISOString(),
    environment: process.env.NODE_ENV,
    version: process.env.APP_VERSION,
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    cpu: process.cpuUsage(),
    services: {
      database: await checkDatabaseHealth(),
      storage: await checkStorageHealth(),
      auth: await checkAuthHealth()
    }
  };

  res.json(debugInfo);
});
```

---

## 📋 Troubleshooting Checklist

### Pre-Incident Preparation
- [ ] Set up monitoring and alerting
- [ ] Configure structured logging
- [ ] Create runbooks for common issues
- [ ] Establish escalation procedures
- [ ] Document system architecture

### During Incident Response
- [ ] Assess impact and scope
- [ ] Check system health endpoints
- [ ] Review recent logs and metrics
- [ ] Identify root cause
- [ ] Implement immediate fix
- [ ] Communicate with stakeholders
- [ ] Monitor system recovery

### Post-Incident Review
- [ ] Document incident timeline
- [ ] Analyze root cause
- [ ] Review response effectiveness
- [ ] Update procedures and documentation
- [ ] Implement preventive measures
- [ ] Schedule follow-up review

---

## 🛠️ Maintenance Procedures

### Regular Maintenance Tasks

#### **Daily Tasks**
- [ ] Review system health metrics
- [ ] Check error logs for new issues
- [ ] Monitor performance trends
- [ ] Verify backup systems

#### **Weekly Tasks**
- [ ] Review alert effectiveness
- [ ] Analyze performance metrics
- [ ] Update monitoring thresholds
- [ ] Review security logs

#### **Monthly Tasks**
- [ ] Performance optimization review
- [ ] Capacity planning assessment
- [ ] Security audit
- [ ] Documentation updates

### Preventive Maintenance

#### **System Optimization**
```typescript
// Regular cleanup tasks
const performMaintenance = async () => {
  // Clean up old logs
  await cleanupOldLogs();

  // Clear expired cache entries
  await clearExpiredCache();

  // Optimize database
  await optimizeDatabase();

  // Update system metrics
  await updateSystemMetrics();
};
```
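
The helpers called by `performMaintenance` are not shown in this guide. As one illustration of what the log cleanup step might involve, the sketch below selects log files older than a retention period. The `LogFile` shape, the 14-day default, and the name `selectExpiredLogs` are assumptions; the actual `cleanupOldLogs` may work differently.

```typescript
// Minimal description of a log file: its name and last-modified time in milliseconds.
type LogFile = { name: string; modifiedAtMs: number };

// Return the names of files last modified before the retention cutoff.
const selectExpiredLogs = (
  files: LogFile[],
  nowMs: number,
  maxAgeDays = 14 // hypothetical retention period
): string[] => {
  const cutoff = nowMs - maxAgeDays * 24 * 60 * 60 * 1000;
  return files.filter((file) => file.modifiedAtMs < cutoff).map((file) => file.name);
};
```

Keeping the selection separate from the deletion makes it possible to log or review the list before anything is removed.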

---

## 📞 Support and Escalation

### Support Levels

#### **Level 1: Basic Support**
- User authentication issues
- Basic configuration problems
- Common error messages

#### **Level 2: Technical Support**
- System performance issues
- Database problems
- Integration issues

#### **Level 3: Advanced Support**
- Complex system failures
- Security incidents
- Architecture problems

### Escalation Procedures

#### **Escalation Criteria**
- System downtime > 15 minutes
- Data loss or corruption
- Security breaches
- Performance degradation > 50%
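
The criteria above can be encoded so that an alerting job applies them consistently. This is a sketch under stated assumptions: "performance degradation" is interpreted as the relative increase in latency over a baseline, and the `IncidentSnapshot` fields and the `escalationReasons` function are hypothetical names.

```typescript
// Hypothetical snapshot of an ongoing incident.
type IncidentSnapshot = {
  downtimeMinutes: number;
  dataLossOrCorruption: boolean;
  securityBreach: boolean;
  baselineLatencyMs: number;
  currentLatencyMs: number;
};

// Return the escalation criteria that the incident currently meets.
const escalationReasons = (incident: IncidentSnapshot): string[] => {
  const reasons: string[] = [];

  if (incident.downtimeMinutes > 15) reasons.push('downtime > 15 minutes');
  if (incident.dataLossOrCorruption) reasons.push('data loss or corruption');
  if (incident.securityBreach) reasons.push('security breach');

  // Assumption: degradation is measured as latency increase relative to the baseline
  const degradation =
    incident.baselineLatencyMs > 0
      ? (incident.currentLatencyMs - incident.baselineLatencyMs) / incident.baselineLatencyMs
      : 0;
  if (degradation > 0.5) reasons.push('performance degradation > 50%');

  return reasons;
};
```

An empty result means the incident stays with the current support level; any entry is a reason to contact the escalation chain.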

#### **Escalation Contacts**
- **Primary**: Operations Team Lead
- **Secondary**: System Administrator
- **Emergency**: CTO/Technical Director

---

Use the diagnostics, solutions, and checklists in this guide to identify and resolve issues in the CIM Document Processor quickly.