Go to file

Jonathan Pressnell cf0b596449 Initial commit: sales analysis template

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-02-06 09:16:34 -05:00

.cursor/rules

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

examples

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

tests

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

analysis_template.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

analysis_utils.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

config_validator.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

config.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

data_loader.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

data_processing.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

data_quality.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

EXAMPLES.md

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

export_utils.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

generate_sample_data.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

logger_config.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

QUICK_START.md

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

README.md

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

report_generator.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

requirements.txt

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

run_all_analyses.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

SETUP_CHECKLIST.md

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

setup_wizard.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

statistical_utils.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

TEMPLATE_OVERVIEW.md

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

TEMPLATE_SUMMARY.md

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

validate_revenue.py

Initial commit: sales analysis template

2026-02-06 09:16:34 -05:00

README.md

Sales Analysis Template

A best-in-class, reusable template for sales invoice detail analysis

Optimized for Cursor AI - Just ask the AI to create analyses and it handles everything automatically.

This template provides a complete framework for analyzing sales data from any company. It's designed to be:

Flexible: Works with different column names, date formats, and data structures
Automated: Interactive setup wizard configures everything for your company
AI-Optimized: Fully optimized for Cursor - AI knows all patterns and generates code automatically
Production-Ready: Includes error handling, validation, and best practices

🚀 Quick Start

1. Setup (Automated)

Run the interactive setup wizard:

python setup_wizard.py

The wizard will ask you about:

Company name and analysis date
Data file location
Column names in your CSV
Date range and LTM configuration
Exclusion filters (if needed)

2. Manual Setup (Alternative)

If you prefer to configure manually:

Update config.py with your company-specific settings:
- COMPANY_NAME: Your company name
- DATA_FILE: Your CSV filename
- REVENUE_COLUMN: Your revenue/amount column name
- DATE_COLUMN: Your primary date column
- Column mappings for Customer, Item, etc.
- Date range and LTM settings
Place your data file in the template directory (or update DATA_DIR in config.py)

3. Test Data Loading

Verify your configuration works:

python -c "from data_loader import load_sales_data; from config import get_data_path; df = load_sales_data(get_data_path()); print(f'Loaded {len(df):,} rows')"

4. Create Your First Analysis

Copy the template and customize:

cp analysis_template.py my_first_analysis.py
# Edit my_first_analysis.py with your analysis logic
python my_first_analysis.py

📁 Project Structure

sales_analysis_template/
├── README.md                    # This file
├── QUICK_START.md               # Quick start guide
├── TEMPLATE_OVERVIEW.md         # High-level overview
├── TEMPLATE_SUMMARY.md          # Comprehensive template summary
├── EXAMPLES.md                  # Example scripts guide
├── SETUP_CHECKLIST.md           # Setup verification checklist
├── requirements.txt             # Python dependencies
├── setup_wizard.py              # Interactive setup wizard
│
├── config.py                    # ⭐ Configuration (customize for your company)
├── config_validator.py          # Configuration validation utility
│
├── data_loader.py               # ⭐ Data loading with fallback logic
├── data_quality.py              # Data quality reporting
├── data_processing.py           # Data transformation utilities
│
├── analysis_utils.py           # ⭐ Common utilities (formatters, LTM, helpers)
├── statistical_utils.py         # Statistical analysis utilities
├── validate_revenue.py          # Revenue validation utility
│
├── export_utils.py              # Export to CSV/Excel
├── report_generator.py          # PDF report generation
├── logger_config.py             # Logging configuration
│
├── analysis_template.py         # Template for creating new analyses
├── run_all_analyses.py          # Batch runner for all scripts
├── generate_sample_data.py      # Generate sample data for testing
│
├── examples/                    # Example analysis scripts
│   ├── annual_revenue_trend.py  # Simple annual revenue analysis
│   ├── customer_segmentation.py # RFM customer segmentation
│   ├── cohort_analysis.py       # Customer cohort analysis
│   └── product_performance.py   # Product performance analysis
│
├── tests/                       # Unit tests
│   ├── test_data_loader.py      # Data loader tests
│   ├── test_analysis_utils.py   # Analysis utils tests
│   └── test_config_validator.py # Config validator tests
│
└── .cursor/
    └── rules/                   # Cursor IDE rules (auto-loaded)
        ├── ai_assistant_guide.md # Complete AI assistant guide
        ├── advanced_analysis_patterns.md # Advanced techniques
        ├── analysis_patterns.md  # Common analysis patterns
        ├── chart_formatting.md   # Chart formatting rules
        ├── code_quality.md       # Code quality standards
        ├── common_errors.md      # Error troubleshooting
        ├── data_loading.md       # Data loading patterns
        ├── error_handling.md     # Error handling patterns
        └── ltm_methodology.md    # LTM methodology

🔧 Configuration Guide

Required Configuration

In config.py, you MUST configure:

Company Information:
```
COMPANY_NAME = "Your Company Name"
```
Data File:
```
DATA_FILE = 'your_sales_data.csv'
```

Column Mappings:

REVENUE_COLUMN = 'USD'  # Your revenue column name
DATE_COLUMN = 'InvoiceDate'  # Your date column name
CUSTOMER_COLUMN = 'Customer'  # Your customer column name

Date Range:

MIN_YEAR = 2021
MAX_DATE = pd.Timestamp('2025-09-30')
ANALYSIS_YEARS = [2021, 2022, 2023, 2024, 2025]

Optional Configuration

LTM (Last Twelve Months):

LTM_ENABLED = True  # Set to False if all years are complete
LTM_START_MONTH = 10
LTM_START_YEAR = 2024
LTM_END_MONTH = 9
LTM_END_YEAR = 2025

Exclusion Filters:

EXCLUSION_FILTERS = {
    'enabled': True,
    'exclude_by_column': 'Country',
    'exclude_values': ['Test', 'KVT']
}

See config.py for all available options and detailed comments.

📊 Data Requirements

Required Columns

Your CSV file must have:

Revenue column: A numeric column with sales amounts (configured as REVENUE_COLUMN)
Date column: At least one date column (configured as DATE_COLUMN)

Recommended Columns

For full analysis capabilities, include:

Customer/Account: For customer segmentation and analysis
Item/Product: For product analysis
Quantity: For price calculations
Geographic: Region, Country for geographic analysis
Segments: Technology, EndMarket, ProductGroup for segmentation

Date Column Fallback

The data loader supports fallback logic:

Primary: Uses DATE_COLUMN (e.g., InvoiceDate)
Fallback 1: Uses columns in DATE_FALLBACK_COLUMNS (e.g., Month, Year)
Fallback 2: Constructs from Year column if available

This ensures maximum date coverage even if some rows have missing dates.

💻 Creating Analysis Scripts

Using the Template

Copy the template:
```
cp analysis_template.py my_analysis.py
```

Update configuration:

ANALYSIS_NAME = "My Analysis"
DESCRIPTION = "Description of what this analysis does"

Implement your logic:
- Use calculate_annual_metrics() for annual aggregations
- Use setup_revenue_chart() and save_chart() for visualizations
- Follow patterns from .cursor/rules/analysis_patterns.md
Run your analysis:
```
python my_analysis.py
```

Standard Pattern

from data_loader import load_sales_data, validate_data_structure
from analysis_utils import (
    get_ltm_period_config, calculate_annual_metrics,
    setup_revenue_chart, save_chart, apply_exclusion_filters
)
from config import get_data_path, REVENUE_COLUMN, CHART_SIZES

# Load and validate
df = load_sales_data(get_data_path())
is_valid, msg = validate_data_structure(df)
if not is_valid:
    print(f"ERROR: {msg}")
    return

# Apply filters
df = apply_exclusion_filters(df)

# Calculate metrics
ltm_start, ltm_end = get_ltm_period_config()
annual_df = calculate_annual_metrics(df, calculate_metrics, ltm_start, ltm_end)

# Create charts
fig, ax = plt.subplots(figsize=CHART_SIZES['medium'])
ax.plot(data / 1e6, ...)
setup_revenue_chart(ax)
save_chart(fig, 'chart.png')

🎯 Key Features

1. Flexible Data Loading

Handles different column names via configuration
Fallback logic for date parsing (100% coverage)
Automatic validation and error reporting

2. LTM (Last Twelve Months) Support

Automatic LTM calculation for partial years
Apples-to-apples comparison with full calendar years
Configurable LTM periods

3. Standardized Chart Formatting

Automatic millions formatter for revenue charts
Consistent styling and sizing
Professional output ready for reports
Optional interactive charts with Plotly

4. Exclusion Filters

Easy configuration for excluding segments
Useful for excluding test accounts, business units, etc.

5. Revenue Validation

Automatic validation after each analysis
Ensures data loading is working correctly
Optional validation against expected values

6. Example Scripts

Working examples for common analyses
Demonstrates best practices
Easy to customize and extend

7. Data Export

Export results to CSV and Excel
Formatted summary tables
Multiple sheet support

8. Data Quality Reporting

Comprehensive data quality checks
Missing value analysis
Outlier detection
Data profiling

9. Configuration Validation

Early error detection
Validates column mappings
Checks date ranges and LTM configuration

10. Statistical Utilities

Year-over-year growth calculations
CAGR (Compound Annual Growth Rate)
Correlation analysis
Statistical significance testing

11. Report Generation

Combine multiple charts into PDF reports
Professional formatting
Summary tables and metadata

12. Logging Infrastructure

Structured logging with file and console output
Analysis execution tracking
Configurable log levels

📚 Documentation

For AI Agents (Cursor IDE)

The .cursor/rules/ directory contains comprehensive rules that are automatically loaded by Cursor:

ai_assistant_guide.md: Complete guide with ready-to-use prompts
advanced_analysis_patterns.md: Advanced techniques (cohort, PVM, forecasting, etc.)
analysis_patterns.md: Standard patterns for creating analyses
data_loading.md: Always use data_loader.py, never pd.read_csv() directly
chart_formatting.md: How to format charts correctly
ltm_methodology.md: LTM implementation and usage
common_errors.md: Troubleshooting guide
code_quality.md: Code quality standards and Cursor best practices
error_handling.md: How to write AI-friendly error messages

For Developers

config.py: Heavily commented with all configuration options
analysis_template.py: Template with examples and comments
analysis_utils.py: Well-documented utility functions

🔍 Common Analysis Types

This template supports all standard sales analyses:

Revenue Analyses

Annual revenue trends
Monthly revenue analysis
Revenue by segment/product/geography

Customer Analyses

Customer segmentation (RFM)
Customer concentration
Churn analysis
Cohort analysis
Customer lifetime value (CLV)

Product Analyses

Product performance
Product lifecycle
BCG matrix
Market basket analysis

Financial Analyses

Price elasticity
Contribution margin
Price vs volume analysis

Advanced Analyses

Seasonality analysis
Time series forecasting
Customer churn prediction

See examples/ directory for working example scripts, or the original Dukane project for 24+ production analysis scripts.

🛠️ Dependencies

Install required packages:

pip install -r requirements.txt

Core dependencies:

pandas - Data manipulation
numpy - Numerical operations
matplotlib - Charting
seaborn - Enhanced visualizations

Optional dependencies (uncomment in requirements.txt if needed):

openpyxl - Excel export (export_utils.py)
plotly - Interactive charts (analysis_utils.py)
reportlab - PDF reports (report_generator.py)
scipy - Statistical analysis (statistical_utils.py)
pytest - Unit testing
pmdarima - Time series forecasting
mlxtend - Market basket analysis
scikit-learn - Machine learning

⚠️ Important Notes

Always Use Utilities

✅ DO:

from data_loader import load_sales_data
from analysis_utils import setup_revenue_chart, save_chart
from config import REVENUE_COLUMN, CHART_SIZES

❌ DON'T:

df = pd.read_csv('data.csv')  # Use data_loader instead
ax.plot(revenue, ...)  # Divide by 1e6 first, use setup_revenue_chart()

Chart Formatting

ALWAYS divide revenue by 1e6 before plotting:

ax.plot(revenue / 1e6, ...)  # Convert to millions
setup_revenue_chart(ax)  # Apply formatter

LTM Labeling

ALWAYS label LTM years correctly:

from config import get_ltm_label
ltm_label = get_ltm_label()  # Returns "2025 (LTM 9/2025)" or None
if ltm_label:
    title += f'\n({ltm_label})'

🐛 Troubleshooting

Data Loading Issues

Problem: "Data file not found"

Solution: Check DATA_FILE path in config.py
Solution: Ensure file is in template directory or update DATA_DIR

Problem: "Required column 'USD' not found"

Solution: Update REVENUE_COLUMN in config.py to match your CSV
Solution: Check all column mappings in config.py

Problem: "All dates are NaN"

Solution: Add fallback date columns to DATE_FALLBACK_COLUMNS
Solution: Check date format in your CSV

Analysis Issues

Problem: Charts show scientific notation (1e8)

Solution: Divide by 1e6 before plotting: ax.plot(data / 1e6, ...)
Solution: Use setup_revenue_chart(ax) to apply formatter

Problem: "DataFrame is empty" after filtering

Solution: Check MIN_YEAR and MAX_DATE in config.py
Solution: Verify ANALYSIS_YEARS includes years in your data

See .cursor/rules/common_errors.md for more troubleshooting help.

📝 Example Workflow

Complete Analysis Workflow

Setup:
```
python setup_wizard.py
```

Test data loading:

python -c "from data_loader import load_sales_data; from config import get_data_path; df = load_sales_data(get_data_path()); print(f'✓ Loaded {len(df):,} rows')"

Create analysis:

cp analysis_template.py revenue_analysis.py
# Edit revenue_analysis.py

Run analysis:
```
python revenue_analysis.py
```

Add to batch runner:

# In run_all_analyses.py:
ANALYSIS_SCRIPTS = [
    'revenue_analysis.py',
    # ... other analyses
]

Run all analyses:
```
python run_all_analyses.py
```

🤝 Best Practices

Always validate data after loading:

is_valid, msg = validate_data_structure(df)

Use configuration values instead of hardcoding:

from config import REVENUE_COLUMN  # ✅
revenue = df['USD'].sum()  # ❌ Hardcoded

Apply exclusion filters if configured:
```
df = apply_exclusion_filters(df)
```
Validate revenue at end of each analysis:
```
validate_revenue(df, "Analysis Name")
```

Use utility functions for consistency:

from analysis_utils import calculate_annual_metrics, setup_revenue_chart

📄 License

This template is provided as-is for use in sales analysis projects.

🙏 Acknowledgments

This template is based on best practices developed during the Dukane Corporation sales analysis project, which included 24+ production-ready analysis scripts and comprehensive documentation.

📞 Support

For questions or issues:

Check .cursor/rules/ for detailed patterns and troubleshooting
Review config.py comments for configuration options
See example analyses in the original Dukane project

Last Updated: January 2026
Template Version: 1.0
Status: Production Ready