API Documentation Guide
Complete API Reference for CIM Document Processor
🎯 Overview
This document provides comprehensive API documentation for the CIM Document Processor, including all endpoints, authentication, error handling, and usage examples.
🔐 Authentication
Firebase JWT Authentication
All API endpoints require Firebase JWT authentication. Include the JWT token in the Authorization header:
Authorization: Bearer <firebase_jwt_token>
Token Validation
- Tokens are validated on every request
- Invalid or expired tokens return 401 Unauthorized
- User context is extracted from the token for data isolation
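A minimal sketch of attaching the token on the client side. The helper name `authHeaders` is hypothetical and not part of the API; it only assembles the Authorization header shown above, plus a JSON content type.

```typescript
// Hypothetical client-side helper: builds the headers for an authenticated JSON request.
function authHeaders(firebaseJwt: string): Record<string, string> {
  if (firebaseJwt.trim() === '') {
    // Fail early instead of sending a request that would return 401.
    throw new Error('A Firebase JWT is required for every API call');
  }
  return {
    Authorization: `Bearer ${firebaseJwt}`,
    'Content-Type': 'application/json',
  };
}
```

The returned object can be passed as the `headers` option of an HTTP client such as axios.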
📊 Base URL
Development
http://localhost:5001/api
Production
https://your-domain.com/api
🔌 API Endpoints
Document Management
POST /documents/upload-url
Get a signed upload URL for direct file upload to Google Cloud Storage.
Request Body:
{
"fileName": "sample_cim.pdf",
"fileType": "application/pdf",
"fileSize": 2500000
}
Response:
{
"success": true,
"uploadUrl": "https://storage.googleapis.com/...",
"filePath": "uploads/user-123/doc-456/sample_cim.pdf",
"correlationId": "req-789"
}
Error Responses:
- 400 Bad Request - Invalid file type or size
- 401 Unauthorized - Missing or invalid authentication
- 500 Internal Server Error - Upload URL generation failed
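Because this endpoint rejects invalid file types and sizes with 400, a client may want to check the request body before sending it. The sketch below assumes that only `application/pdf` is accepted and that `fileSize` is in bytes. This guide states neither the accepted types nor the size limit, so `maxBytes` is a caller-supplied value and `validateUploadRequest` is a hypothetical helper.

```typescript
interface UploadUrlRequest {
  fileName: string;
  fileType: string;
  fileSize: number; // assumed to be bytes
}

// Returns a list of problems; an empty list means the body looks sendable.
// Assumption: PDFs only, and the size limit is whatever the caller passes in.
function validateUploadRequest(req: UploadUrlRequest, maxBytes: number): string[] {
  const problems: string[] = [];
  if (req.fileName.trim() === '') {
    problems.push('fileName is empty');
  }
  if (req.fileType !== 'application/pdf') {
    problems.push(`unsupported fileType: ${req.fileType}`);
  }
  if (!isFinite(req.fileSize) || req.fileSize <= 0 || Math.floor(req.fileSize) !== req.fileSize) {
    problems.push('fileSize must be a positive whole number');
  } else if (req.fileSize > maxBytes) {
    problems.push(`fileSize exceeds ${maxBytes} bytes`);
  }
  return problems;
}
```

The server remains the authority on what is valid; a check like this only saves a round trip.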
POST /documents/:id/confirm-upload
Confirm file upload and start document processing.
Path Parameters:
- id (string, required) - Document ID (UUID)
Request Body:
{
"filePath": "uploads/user-123/doc-456/sample_cim.pdf",
"fileSize": 2500000,
"fileName": "sample_cim.pdf"
}
Response:
{
"success": true,
"documentId": "doc-456",
"status": "processing",
"message": "Document processing started",
"correlationId": "req-789"
}
Error Responses:
- 400 Bad Request - Invalid document ID or file path
- 401 Unauthorized - Missing or invalid authentication
- 404 Not Found - Document not found
- 500 Internal Server Error - Processing failed to start
POST /documents/:id/process-optimized-agentic-rag
Trigger AI processing using the optimized agentic RAG strategy.
Path Parameters:
- id (string, required) - Document ID (UUID)
Request Body:
{
"strategy": "optimized_agentic_rag",
"options": {
"enableSemanticChunking": true,
"enableMetadataEnrichment": true
}
}
Response:
{
"success": true,
"processingStrategy": "optimized_agentic_rag",
"processingTime": 180000,
"apiCalls": 25,
"summary": "Comprehensive CIM analysis completed...",
"analysisData": {
"dealOverview": { ... },
"businessDescription": { ... },
"financialSummary": { ... }
},
"correlationId": "req-789"
}
Error Responses:
- 400 Bad Request - Invalid strategy or options
- 401 Unauthorized - Missing or invalid authentication
- 404 Not Found - Document not found
- 500 Internal Server Error - Processing failed
GET /documents/:id/download
Download the processed PDF report.
Path Parameters:
- id (string, required) - Document ID (UUID)
Response:
- 200 OK - PDF file stream
- Content-Type: application/pdf
- Content-Disposition: attachment; filename="cim_report.pdf"
Error Responses:
- 401 Unauthorized - Missing or invalid authentication
- 404 Not Found - Document or PDF not found
- 500 Internal Server Error - Download failed
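The download response names the file in its Content-Disposition header. As an illustration, the hypothetical helper below extracts that name so a client can save the report under it. It handles only the quoted `filename="..."` form used by this endpoint's response and falls back to a caller-supplied name otherwise.

```typescript
// Extracts the quoted filename from a Content-Disposition header value.
// Returns `fallback` when the header is missing or has no quoted filename.
function filenameFromContentDisposition(header: string | null, fallback: string): string {
  if (header === null) return fallback;
  const match = /filename="([^"]+)"/i.exec(header);
  const quoted = match?.[1];
  return quoted ?? fallback;
}
```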
DELETE /documents/:id
Delete a document and all associated data.
Path Parameters:
- id (string, required) - Document ID (UUID)
Response:
{
"success": true,
"message": "Document deleted successfully",
"correlationId": "req-789"
}
Error Responses:
- 401 Unauthorized - Missing or invalid authentication
- 404 Not Found - Document not found
- 500 Internal Server Error - Deletion failed
Analytics & Monitoring
GET /documents/analytics
Get processing analytics for the current user.
Query Parameters:
- days (number, optional) - Number of days to analyze (default: 30)
Response:
{
"success": true,
"analytics": {
"totalDocuments": 150,
"processingSuccessRate": 0.95,
"averageProcessingTime": 180000,
"totalApiCalls": 3750,
"estimatedCost": 45.50,
"documentsByStatus": {
"completed": 142,
"processing": 5,
"failed": 3
},
"processingTrends": [
{
"date": "2024-12-20",
"documentsProcessed": 8,
"averageTime": 175000
}
]
},
"correlationId": "req-789"
}
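In the sample response the three `documentsByStatus` counts sum to 150, matching `totalDocuments`, and 142 of 150 completed is about 0.947, which is consistent with the reported `processingSuccessRate` of 0.95. The sketch below turns such counts into fractions. Whether the service computes its success rate this way is an assumption, and `statusShares` is a hypothetical helper.

```typescript
// Converts per-status counts into fractions of the total (0 when there are no documents).
function statusShares(counts: Record<string, number>): Record<string, number> {
  const statuses = Object.keys(counts);
  let total = 0;
  for (const status of statuses) {
    total += counts[status] ?? 0;
  }
  const shares: Record<string, number> = {};
  for (const status of statuses) {
    shares[status] = total === 0 ? 0 : (counts[status] ?? 0) / total;
  }
  return shares;
}
```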
GET /documents/processing-stats
Get real-time processing statistics.
Response:
{
"success": true,
"stats": {
"totalDocuments": 150,
"documentAiAgenticRagSuccess": 142,
"averageProcessingTime": {
"documentAiAgenticRag": 180000
},
"averageApiCalls": {
"documentAiAgenticRag": 25
},
"activeProcessing": 3,
"queueLength": 2
},
"correlationId": "req-789"
}
GET /documents/:id/agentic-rag-sessions
Get agentic RAG processing sessions for a document.
Path Parameters:
- id (string, required) - Document ID (UUID)
Response:
{
"success": true,
"sessions": [
{
"id": "session-123",
"strategy": "optimized_agentic_rag",
"status": "completed",
"totalAgents": 6,
"completedAgents": 6,
"failedAgents": 0,
"overallValidationScore": 0.92,
"processingTimeMs": 180000,
"apiCallsCount": 25,
"totalCost": 0.35,
"createdAt": "2024-12-20T10:30:00Z",
"completedAt": "2024-12-20T10:33:00Z"
}
],
"correlationId": "req-789"
}
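A sketch of how a client might aggregate the `sessions` array, with field names taken from the sample response. The `SessionSummary` shape and `summarizeSessions` are hypothetical. In the sample, `completedAt` minus `createdAt` is three minutes, which equals the reported `processingTimeMs` of 180000.

```typescript
interface AgenticRagSession {
  id: string;
  strategy: string;
  status: string;
  totalAgents: number;
  completedAgents: number;
  failedAgents: number;
  overallValidationScore: number;
  processingTimeMs: number;
  apiCallsCount: number;
  totalCost: number;
  createdAt: string;   // ISO 8601 timestamp
  completedAt: string; // ISO 8601 timestamp
}

interface SessionSummary {
  sessionCount: number;
  totalCost: number;
  totalApiCalls: number;
  meanValidationScore: number | null; // null when there are no sessions
  wallClockMs: number;                // sum of completedAt - createdAt
}

function summarizeSessions(sessions: AgenticRagSession[]): SessionSummary {
  let totalCost = 0;
  let totalApiCalls = 0;
  let scoreSum = 0;
  let wallClockMs = 0;
  for (const session of sessions) {
    totalCost += session.totalCost;
    totalApiCalls += session.apiCallsCount;
    scoreSum += session.overallValidationScore;
    const elapsed = Date.parse(session.completedAt) - Date.parse(session.createdAt);
    // Skip sessions whose timestamps do not parse (elapsed is NaN).
    if (!isNaN(elapsed)) wallClockMs += elapsed;
  }
  return {
    sessionCount: sessions.length,
    totalCost,
    totalApiCalls,
    meanValidationScore: sessions.length === 0 ? null : scoreSum / sessions.length,
    wallClockMs,
  };
}
```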
Monitoring Endpoints
GET /monitoring/upload-metrics
Get upload metrics for a specified time period.
Query Parameters:
- hours (number, required) - Number of hours to analyze (1-168)
Response:
{
"success": true,
"data": {
"totalUploads": 45,
"successfulUploads": 43,
"failedUploads": 2,
"successRate": 0.956,
"averageFileSize": 2500000,
"totalDataTransferred": 112500000,
"uploadTrends": [
{
"hour": "2024-12-20T10:00:00Z",
"uploads": 8,
"successRate": 1.0
}
]
},
"correlationId": "req-789"
}
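Since `hours` is required and limited to 1-168 (up to seven days), a client can reject out-of-range values before making the call. A minimal sketch follows; `uploadMetricsPath` is a hypothetical helper, and the returned path is meant to be appended to the base URL.

```typescript
// Builds the path and query for GET /monitoring/upload-metrics,
// rejecting values outside the documented 1-168 range.
function uploadMetricsPath(hours: number): string {
  if (!isFinite(hours) || hours < 1 || hours > 168) {
    throw new RangeError(`hours must be between 1 and 168, got ${hours}`);
  }
  return `/monitoring/upload-metrics?hours=${encodeURIComponent(String(hours))}`;
}
```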
GET /monitoring/upload-health
Get upload pipeline health status.
Response:
{
"success": true,
"data": {
"status": "healthy",
"successRate": 0.956,
"averageResponseTime": 1500,
"errorRate": 0.044,
"activeConnections": 12,
"lastError": null,
"lastErrorTime": null,
"uptime": 86400000
},
"correlationId": "req-789"
}
GET /monitoring/real-time-stats
Get real-time upload statistics.
Response:
{
"success": true,
"data": {
"currentUploads": 3,
"queueLength": 2,
"processingRate": 8.5,
"averageProcessingTime": 180000,
"memoryUsage": 45.2,
"cpuUsage": 23.1,
"activeUsers": 15,
"systemLoad": 0.67
},
"correlationId": "req-789"
}
Vector Database Endpoints
GET /vector/document-chunks/:documentId
Get document chunks for a specific document.
Path Parameters:
- documentId (string, required) - Document ID (UUID)
Response:
{
"success": true,
"chunks": [
{
"id": "chunk-123",
"content": "Document chunk content...",
"embedding": [0.1, 0.2, 0.3, ...],
"metadata": {
"sectionType": "financial",
"confidence": 0.95
},
"createdAt": "2024-12-20T10:30:00Z"
}
],
"correlationId": "req-789"
}
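One way a client might use the chunk metadata is to pick out the chunks of a given section and drop low-confidence ones. The sketch below assumes every chunk carries `metadata.sectionType` and `metadata.confidence` as in the sample. The threshold and the helper name `chunksForSection` are illustrative only.

```typescript
interface DocumentChunk {
  id: string;
  content: string;
  embedding: number[];
  metadata: { sectionType: string; confidence: number };
  createdAt: string;
}

// Returns chunks of one section at or above a confidence threshold,
// highest confidence first. The input array is not modified.
function chunksForSection(
  chunks: DocumentChunk[],
  sectionType: string,
  minConfidence: number,
): DocumentChunk[] {
  return chunks
    .filter((chunk) =>
      chunk.metadata.sectionType === sectionType &&
      chunk.metadata.confidence >= minConfidence)
    .sort((a, b) => b.metadata.confidence - a.metadata.confidence);
}
```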
GET /vector/analytics
Get search analytics for the current user.
Query Parameters:
- days (number, optional) - Number of days to analyze (default: 30)
Response:
{
"success": true,
"analytics": {
"totalSearches": 125,
"averageSearchTime": 250,
"searchSuccessRate": 0.98,
"popularQueries": [
"financial performance",
"market analysis",
"management team"
],
"searchTrends": [
{
"date": "2024-12-20",
"searches": 8,
"averageTime": 245
}
]
},
"correlationId": "req-789"
}
GET /vector/stats
Get vector database statistics.
Response:
{
"success": true,
"stats": {
"totalChunks": 1500,
"totalDocuments": 150,
"averageChunkSize": 4000,
"embeddingDimensions": 1536,
"indexSize": 2500000,
"queryPerformance": {
"averageQueryTime": 250,
"cacheHitRate": 0.85
}
},
"correlationId": "req-789"
}
🚨 Error Handling
Standard Error Response Format
All error responses follow this format:
{
"success": false,
"error": "Error message description",
"errorCode": "ERROR_CODE",
"correlationId": "req-789",
"details": {
"field": "Additional error details"
}
}
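Since every error shares this shape, a client can recognize it with a single check before reading `errorCode`. The following is a sketch; the names `ApiErrorBody`, `isApiErrorBody`, and `describeApiError` are hypothetical, and `details` is treated as optional here as an assumption.

```typescript
interface ApiErrorBody {
  success: false;
  error: string;
  errorCode: string;
  correlationId: string;
  details?: Record<string, unknown>; // assumed optional
}

// Type guard: true when `body` has the standard error fields.
function isApiErrorBody(body: unknown): body is ApiErrorBody {
  if (typeof body !== 'object' || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return candidate['success'] === false
    && typeof candidate['error'] === 'string'
    && typeof candidate['errorCode'] === 'string'
    && typeof candidate['correlationId'] === 'string';
}

// Produces a one-line description suitable for logs or user-facing messages.
function describeApiError(body: unknown): string {
  if (!isApiErrorBody(body)) return 'Unrecognized error response';
  return `[${body.errorCode}] ${body.error} (correlationId: ${body.correlationId})`;
}
```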
Common Error Codes
400 Bad Request
- INVALID_INPUT - Invalid request parameters
- MISSING_REQUIRED_FIELD - Required field is missing
- INVALID_FILE_TYPE - Unsupported file type
- FILE_TOO_LARGE - File size exceeds limit
401 Unauthorized
- MISSING_TOKEN - Authentication token is missing
- INVALID_TOKEN - Authentication token is invalid
- EXPIRED_TOKEN - Authentication token has expired
404 Not Found
- DOCUMENT_NOT_FOUND - Document does not exist
- SESSION_NOT_FOUND - Processing session not found
- FILE_NOT_FOUND - File does not exist
500 Internal Server Error
- PROCESSING_FAILED - Document processing failed
- STORAGE_ERROR - File storage operation failed
- DATABASE_ERROR - Database operation failed
- EXTERNAL_SERVICE_ERROR - External service unavailable
Error Recovery Strategies
Retry Logic
- Transient Errors: Automatically retry with exponential backoff
- Rate Limiting: Respect rate limits and implement backoff
- Service Unavailable: Retry with increasing delays
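The retry behavior above could be sketched on the client as follows. This is not the service's own implementation. The base delay, the cap, the absence of jitter, and the choice to treat 429 and 5xx as transient are all assumptions, and `withRetry` takes an injected `sleep` so it can be exercised without real waiting.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs: number, capMs: number): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Assumption: 429 and 5xx are transient; this guide does not list retryable statuses.
function isTransientStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Runs `operation`, retrying up to `maxRetries` times while `shouldRetry` approves the error.
async function withRetry<T>(
  operation: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  maxRetries: number,
  sleep: (ms: number) => Promise<void>,
  baseMs: number = 500,
  capMs: number = 8000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= maxRetries || !shouldRetry(error)) throw error;
      await sleep(backoffDelayMs(attempt, baseMs, capMs));
    }
  }
}
```

In real use, `sleep` would typically be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and `shouldRetry` would inspect the HTTP status with `isTransientStatus`.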
Fallback Strategies
- Primary Strategy: Optimized agentic RAG processing
- Fallback Strategy: Basic processing without advanced features
- Degradation Strategy: Simple text extraction only
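This guide does not say whether the service applies these fallbacks itself or expects the caller to. If a client had to do it, an ordered chain could look like the sketch below. Only `optimized_agentic_rag` is a strategy name taken from this guide; every other name, including `runWithFallback`, is hypothetical.

```typescript
interface StrategyAttempt<T> {
  name: string;
  run: () => Promise<T>;
}

interface FallbackResult<T> {
  usedStrategy: string;
  value: T;
  failures: string[]; // one entry per strategy that failed before success
}

// Tries each strategy in order and returns the first success.
// Throws only when every strategy has failed.
async function runWithFallback<T>(attempts: StrategyAttempt<T>[]): Promise<FallbackResult<T>> {
  const failures: string[] = [];
  for (const attempt of attempts) {
    try {
      const value = await attempt.run();
      return { usedStrategy: attempt.name, value, failures };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${attempt.name}: ${reason}`);
    }
  }
  throw new Error(`All strategies failed: ${failures.join('; ')}`);
}
```

The `failures` list keeps a record of why earlier strategies were skipped, which is useful to log alongside the result.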
📊 Rate Limiting
Limits
- Upload Endpoints: 10 requests per minute per user
- Processing Endpoints: 5 requests per minute per user
- Analytics Endpoints: 30 requests per minute per user
- Download Endpoints: 20 requests per minute per user
Rate Limit Headers
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 1640000000
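A client can read these headers to decide whether to pause before the next call. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but this guide does not state, and that header names arrive lower-cased, as axios exposes them. `parseRateLimit` and `msToWait` are hypothetical helpers.

```typescript
interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetAtMs: number; // reset time in milliseconds since the Unix epoch
}

// Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
// Returns null when any of the three headers is missing or not numeric.
function parseRateLimit(headers: Record<string, string | undefined>): RateLimitInfo | null {
  const limit = Number(headers['x-ratelimit-limit']);
  const remaining = Number(headers['x-ratelimit-remaining']);
  const resetSeconds = Number(headers['x-ratelimit-reset']);
  if (isNaN(limit) || isNaN(remaining) || isNaN(resetSeconds)) return null;
  return { limit, remaining, resetAtMs: resetSeconds * 1000 };
}

// Milliseconds to wait before the next request: 0 while quota remains,
// otherwise the time left until the reset.
function msToWait(info: RateLimitInfo, nowMs: number): number {
  return info.remaining > 0 ? 0 : Math.max(0, info.resetAtMs - nowMs);
}
```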
Rate Limit Exceeded Response (HTTP 429)
{
"success": false,
"error": "Rate limit exceeded",
"errorCode": "RATE_LIMIT_EXCEEDED",
"retryAfter": 60,
"correlationId": "req-789"
}
📋 Usage Examples
Complete Document Processing Workflow
1. Get Upload URL
curl -X POST http://localhost:5001/api/documents/upload-url \
-H "Authorization: Bearer <firebase_jwt_token>" \
-H "Content-Type: application/json" \
-d '{
"fileName": "sample_cim.pdf",
"fileType": "application/pdf",
"fileSize": 2500000
}'
2. Upload File to GCS
curl -X PUT "<upload_url>" \
-H "Content-Type: application/pdf" \
--upload-file sample_cim.pdf
3. Confirm Upload
curl -X POST http://localhost:5001/api/documents/doc-123/confirm-upload \
-H "Authorization: Bearer <firebase_jwt_token>" \
-H "Content-Type: application/json" \
-d '{
"filePath": "uploads/user-123/doc-123/sample_cim.pdf",
"fileSize": 2500000,
"fileName": "sample_cim.pdf"
}'
4. Trigger AI Processing
curl -X POST http://localhost:5001/api/documents/doc-123/process-optimized-agentic-rag \
-H "Authorization: Bearer <firebase_jwt_token>" \
-H "Content-Type: application/json" \
-d '{
"strategy": "optimized_agentic_rag",
"options": {
"enableSemanticChunking": true,
"enableMetadataEnrichment": true
}
}'
5. Download PDF Report
curl -X GET http://localhost:5001/api/documents/doc-123/download \
-H "Authorization: Bearer <firebase_jwt_token>" \
--output cim_report.pdf
JavaScript/TypeScript Examples
Document Upload and Processing
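import axios from 'axios';
import { readFile } from 'node:fs/promises';
const API_BASE = 'http://localhost:5001/api';
const AUTH_TOKEN = 'firebase_jwt_token'; // replace with a real Firebase JWT
// Placeholders: the ID of the document being uploaded and the local PDF to send
const documentId = 'doc-456';
const fileBuffer = await readFile('sample_cim.pdf');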
// Get upload URL
const uploadUrlResponse = await axios.post(`${API_BASE}/documents/upload-url`, {
fileName: 'sample_cim.pdf',
fileType: 'application/pdf',
fileSize: 2500000
}, {
headers: { Authorization: `Bearer ${AUTH_TOKEN}` }
});
const { uploadUrl, filePath } = uploadUrlResponse.data;
// Upload file to GCS
await axios.put(uploadUrl, fileBuffer, {
headers: { 'Content-Type': 'application/pdf' }
});
// Confirm upload
await axios.post(`${API_BASE}/documents/${documentId}/confirm-upload`, {
filePath,
fileSize: 2500000,
fileName: 'sample_cim.pdf'
}, {
headers: { Authorization: `Bearer ${AUTH_TOKEN}` }
});
// Trigger AI processing
const processingResponse = await axios.post(
`${API_BASE}/documents/${documentId}/process-optimized-agentic-rag`,
{
strategy: 'optimized_agentic_rag',
options: {
enableSemanticChunking: true,
enableMetadataEnrichment: true
}
},
{
headers: { Authorization: `Bearer ${AUTH_TOKEN}` }
}
);
console.log('Processing result:', processingResponse.data);
Error Handling
try {
const response = await axios.post(`${API_BASE}/documents/upload-url`, {
fileName: 'sample_cim.pdf',
fileType: 'application/pdf',
fileSize: 2500000
}, {
headers: { Authorization: `Bearer ${AUTH_TOKEN}` }
});
console.log('Upload URL:', response.data.uploadUrl);
} catch (error) {
if (error.response) {
const { status, data } = error.response;
switch (status) {
case 400:
console.error('Bad request:', data.error);
break;
case 401:
console.error('Authentication failed:', data.error);
break;
case 429:
console.error('Rate limit exceeded, retry after:', data.retryAfter, 'seconds');
break;
case 500:
console.error('Server error:', data.error);
break;
default:
console.error('Unexpected error:', data.error);
}
} else {
console.error('Network error:', error.message);
}
}
🔍 Monitoring and Debugging
Correlation IDs
All API responses include a correlationId for request tracking:
{
"success": true,
"data": { ... },
"correlationId": "req-789"
}
Request Logging
Include the correlation ID in your logs so that a client-side entry can be matched to the server-side request:
logger.info('API request', {
correlationId: response.data.correlationId,
endpoint: '/documents/upload-url',
userId: 'user-123'
});
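Failed requests carry a correlation ID too, but with axios it sits under `error.response.data` rather than `response.data`. The hypothetical helper below looks in a parsed body, an axios-style response, or an axios-style error, so that one logging call can cover all three cases.

```typescript
// Finds correlationId on a parsed body, on `.data` (axios response),
// or on `.response.data` (axios error). Returns null when absent.
function correlationIdFrom(source: unknown, depth: number = 0): string | null {
  // The depth guard keeps the search shallow and safe against cyclic objects.
  if (depth > 2 || typeof source !== 'object' || source === null) return null;
  const candidate = source as Record<string, unknown>;
  const direct = candidate['correlationId'];
  if (typeof direct === 'string') return direct;
  return correlationIdFrom(candidate['data'], depth + 1)
    ?? correlationIdFrom(candidate['response'], depth + 1);
}
```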
Health Checks
Check upload pipeline health with the monitoring endpoint. Like every other response, the result includes a correlationId:
curl -X GET http://localhost:5001/api/monitoring/upload-health \
-H "Authorization: Bearer <firebase_jwt_token>"
This comprehensive API documentation provides all the information needed to integrate with the CIM Document Processor API, including authentication, endpoints, error handling, and usage examples.