# Code Quality & Best Practices

**Comprehensive guide for writing Cursor-optimized code in the sales analysis template.**

This document combines code quality standards and Cursor best practices to ensure AI assistants can effectively understand, modify, and extend the codebase.

## Type Hints

### When to Use Type Hints

Use type hints for:
- Function parameters
- Return values
- Class attributes
- Complex data structures

### Example Pattern

```python
from typing import Callable, Dict, Optional

import pandas as pd

def calculate_annual_metrics(
    df: pd.DataFrame,
    metrics_func: Callable[[pd.DataFrame], Dict],
    ltm_start: Optional[pd.Period] = None,
    ltm_end: Optional[pd.Period] = None
) -> pd.DataFrame:
    """
    Calculate annual metrics for all years.

    Args:
        df: DataFrame with 'Year' and 'YearMonth' columns
        metrics_func: Function that takes a DataFrame and returns a dict of metrics
        ltm_start: LTM start period (defaults to config if None)
        ltm_end: LTM end period (defaults to config if None)

    Returns:
        DataFrame with 'Year' index and metric columns
    """
    # Implementation
```
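
The pattern above covers function parameters and return values. For the other two items in the list, class attributes and complex data structures, a minimal sketch might look like the following. `LtmWindow`, `MetricsByYear`, and `revenue_for_year` are hypothetical names used only for illustration; they are not part of the template.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Type alias for a nested structure: year -> metric name -> value (hypothetical)
MetricsByYear = Dict[int, Dict[str, float]]


@dataclass
class LtmWindow:
    """Hypothetical container showing type hints on class attributes."""

    start: str                   # e.g. "2024-10"
    end: str                     # e.g. "2025-09"
    label: Optional[str] = None  # display label, if any


def revenue_for_year(metrics: MetricsByYear, year: int) -> float:
    """Return the 'Revenue' metric for a year, or 0.0 if it is missing."""
    return metrics.get(year, {}).get("Revenue", 0.0)


metrics: MetricsByYear = {2024: {"Revenue": 1200.0, "Count": 3.0}}
window = LtmWindow(start="2024-10", end="2025-09")
```

Naming the nested structure with an alias keeps function signatures short while still telling readers (and AI assistants) what shape the data has.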

## Docstrings

### Docstring Format

All functions should use Google-style docstrings:

```python
def function_name(param1: type, param2: type) -> return_type:
    """
    Brief description of what the function does.

    More detailed explanation if needed. Can span multiple lines.
    Explain any complex logic or important considerations.

    Args:
        param1: Description of param1
        param2: Description of param2

    Returns:
        Description of return value

    Raises:
        ValueError: When and why this exception is raised

    Example:
        >>> result = function_name(value1, value2)
        >>> print(result)
        expected_output
    """
```

### Required Elements

- Brief one-line summary
- Detailed description (if needed)
- Args section (all parameters)
- Returns section (return value)
- Raises section (if exceptions are raised)
- Example section (for complex functions)
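
As a concrete illustration of these elements, here is a small sketch of a fully documented function. `percent_change` is a hypothetical helper invented for this example, not a function from the template.

```python
def percent_change(previous: float, current: float) -> float:
    """
    Calculate the percentage change between two values.

    Args:
        previous: Baseline value (for example, prior-year revenue)
        current: Value to compare against the baseline

    Returns:
        Percentage change from previous to current

    Raises:
        ValueError: If previous is zero, because the change is undefined

    Example:
        >>> percent_change(100.0, 125.0)
        25.0
    """
    if previous == 0:
        raise ValueError(
            "Cannot compute percent change: 'previous' is zero.\n"
            "Pass a non-zero baseline value."
        )
    return (current - previous) / previous * 100
```

Because the `Example` section uses doctest syntax, it can also be checked automatically with Python's standard `doctest` module.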

## Variable Naming

### Conventions

- **Descriptive names:** `customer_revenue` not `cr`
- **Consistent prefixes:** `df_` for DataFrames, `annual_` for annual metrics
- **Clear abbreviations:** `ltm` for Last Twelve Months (well-known)
- **Avoid single letters:** Except for loop variables (`i`, `j`, `k`)

### Good Examples

```python
# Good
customer_revenue_by_year = df.groupby(['Customer', 'Year'])[REVENUE_COLUMN].sum()
annual_metrics_df = calculate_annual_metrics(df, metrics_func)
ltm_start_period, ltm_end_period = get_ltm_period_config()

# Bad
cr = df.groupby(['C', 'Y'])['R'].sum()
am = calc(df, mf)
s, e = get_ltm()
```

## Error Messages

### Structure

Error messages should be:
1. **Specific:** What exactly went wrong
2. **Actionable:** How to fix it
3. **Contextual:** Where it occurred
4. **Helpful:** Reference to documentation

### Good Error Messages

```python
# Good
raise ValueError(
    f"Required column '{REVENUE_COLUMN}' not found in data.\n"
    f"Available columns: {list(df.columns)}\n"
    "Please update config.py REVENUE_COLUMN to match your data.\n"
    "See .cursor/rules/data_loading.md for more help."
)

# Bad
raise ValueError("Column not found")
```
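
To keep such messages consistent across checks, the four-part structure can be wrapped in a small helper. The sketch below is one possible shape, assuming a hypothetical `require_column` function that takes plain column names; it is not an existing template utility.

```python
from typing import Iterable


def require_column(columns: Iterable[str], column: str, config_name: str) -> None:
    """Raise a ValueError with a structured message if `column` is missing."""
    available = list(columns)
    if column not in available:
        raise ValueError(
            # Specific: what went wrong
            f"Required column '{column}' not found in data.\n"
            # Contextual: what the data actually contains
            f"Available columns: {available}\n"
            # Actionable: how to fix it
            f"Please update config.py {config_name} to match your data.\n"
            # Helpful: where to read more
            "See .cursor/rules/data_loading.md for more help."
        )


# Usage: capture the message produced for a missing column
try:
    require_column(["Customer", "Date"], "USD", "REVENUE_COLUMN")
except ValueError as exc:
    error_text = str(exc)
```

With a DataFrame, the first argument would typically be `df.columns`.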

## Code Comments

### When to Comment

- Complex logic that isn't immediately obvious
- Business rules or domain-specific knowledge
- Workarounds or non-obvious solutions
- Performance considerations
- TODO items with context

### Comment Style

```python
# Good: Explains WHY, not WHAT
# Use LTM for most recent year to enable apples-to-apples comparison
# with full calendar years (avoids partial year bias)
if year == LTM_END_YEAR and LTM_ENABLED:
    year_data = get_ltm_data(df, ltm_start, ltm_end)

# Bad: States the obvious
# Check if year equals LTM_END_YEAR
if year == LTM_END_YEAR:
    year_data = get_ltm_data(df, ltm_start, ltm_end)
```

## Function Design

### Single Responsibility

Each function should do one thing well:

```python
# Good: Single responsibility
def calculate_revenue(df: pd.DataFrame) -> float:
    """Calculate total revenue from DataFrame"""
    return df[REVENUE_COLUMN].sum()

def calculate_customer_count(df: pd.DataFrame) -> int:
    """Calculate unique customer count"""
    return df[CUSTOMER_COLUMN].nunique()

# Bad: Multiple responsibilities
def calculate_metrics(df):
    """Calculate revenue and customer count"""
    revenue = df[REVENUE_COLUMN].sum()
    customers = df[CUSTOMER_COLUMN].nunique()
    return revenue, customers
```

### Function Length

- Keep functions under 50 lines when possible
- Break complex functions into smaller helper functions
- Use descriptive function names that explain purpose
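
The sketch below shows one way to split a longer routine into short, descriptively named helpers. It uses plain dictionaries instead of DataFrames to stay self-contained; `Row`, `filter_positive_revenue`, `total_revenue_by_year`, and `summarize_revenue` are hypothetical names, not template functions.

```python
from typing import Dict, List

Row = Dict[str, float]  # hypothetical row shape: {"year": ..., "revenue": ...}


def filter_positive_revenue(rows: List[Row]) -> List[Row]:
    """Keep only rows with revenue above zero."""
    return [row for row in rows if row["revenue"] > 0]


def total_revenue_by_year(rows: List[Row]) -> Dict[int, float]:
    """Sum revenue per year."""
    totals: Dict[int, float] = {}
    for row in rows:
        year = int(row["year"])
        totals[year] = totals.get(year, 0.0) + row["revenue"]
    return totals


def summarize_revenue(rows: List[Row]) -> Dict[int, float]:
    """Orchestrate the steps; each helper stays short and easy to test."""
    return total_revenue_by_year(filter_positive_revenue(rows))


sample_rows = [
    {"year": 2024, "revenue": 100.0},
    {"year": 2024, "revenue": -20.0},  # dropped by the filter
    {"year": 2025, "revenue": 50.0},
]
summary = summarize_revenue(sample_rows)
```

The top-level function reads as a list of steps, and each helper can be tested or replaced independently.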

## Import Organization

### Standard Order

1. Standard library imports
2. Third-party imports (pandas, numpy, matplotlib)
3. Local/template imports (data_loader, analysis_utils, config)

### Example

```python
# Standard library
from pathlib import Path
from typing import Dict, Optional
from datetime import datetime

# Third-party
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Template imports
from data_loader import load_sales_data, validate_data_structure
from analysis_utils import calculate_annual_metrics, setup_revenue_chart
from config import REVENUE_COLUMN, CHART_SIZES, COMPANY_NAME
```

## Constants and Configuration

### Use Config Values

```python
# Good: From config
from config import REVENUE_COLUMN, DATE_COLUMN
revenue = df[REVENUE_COLUMN].sum()

# Bad: Hardcoded
revenue = df['USD'].sum()
```

### Magic Numbers

Avoid magic numbers - use named constants or config:

```python
# Good: Named constant
MILLIONS_DIVISOR = 1e6
revenue_millions = revenue / MILLIONS_DIVISOR

# Or from config
CHART_DPI = 300  # In config.py

# Bad: Magic number
revenue_millions = revenue / 1000000
```

## Testing Considerations

### Testable Code

Write code that's easy to test:
- Pure functions when possible (no side effects)
- Dependency injection for external dependencies
- Clear inputs and outputs

### Example

```python
# Good: Testable
def calculate_metrics(year_data: pd.DataFrame, revenue_col: str) -> Dict:
    """Calculate metrics - easy to test with sample data"""
    return {
        'Revenue': year_data[revenue_col].sum(),
        'Count': len(year_data)
    }

# Harder to test: Depends on global config
def calculate_metrics(year_data):
    """Uses global REVENUE_COLUMN - harder to test"""
    return {'Revenue': year_data[REVENUE_COLUMN].sum()}
```
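
To show why the injected-column version is easier to test, here is a minimal sketch of a test that builds a small in-memory DataFrame. The column name `'Sales'` and the test function name are illustrative assumptions; the test can be collected by pytest or called directly.

```python
from typing import Dict

import pandas as pd


def calculate_metrics(year_data: pd.DataFrame, revenue_col: str) -> Dict[str, float]:
    """Calculate metrics; the revenue column is passed in rather than read from config."""
    return {
        'Revenue': year_data[revenue_col].sum(),
        'Count': len(year_data),
    }


def test_calculate_metrics() -> None:
    """No config file or real data needed: the inputs fully determine the output."""
    sample = pd.DataFrame({'Sales': [100.0, 250.0, 50.0]})
    result = calculate_metrics(sample, revenue_col='Sales')
    assert result['Revenue'] == 400.0
    assert result['Count'] == 3


test_calculate_metrics()
```

The global-config variant would need `REVENUE_COLUMN` patched or a matching column name in every test fixture.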

## AI-Friendly Patterns

### Clear Intent

Code should clearly express intent:

```python
# Good: Intent is clear
customers_with_revenue = df[df[REVENUE_COLUMN] > 0][CUSTOMER_COLUMN].unique()

# Less clear: Requires understanding of pandas `.loc` indexing
# (when assigning values, prefer `.loc` because it avoids chained indexing)
customers_with_revenue = df.loc[df[REVENUE_COLUMN] > 0, CUSTOMER_COLUMN].unique()
```

### Explicit Over Implicit

```python
# Good: Explicit
if LTM_ENABLED and ltm_start is not None and ltm_end is not None:
    use_ltm = True
else:
    use_ltm = False

# Less clear: Implicit truthiness
# (the result is one of the operands, not necessarily a bool)
use_ltm = LTM_ENABLED and ltm_start and ltm_end
```
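
The difference is more than stylistic: a chained `and` returns one of its operands, so the implicit form can leave `use_ltm` holding a period value or `None` instead of a boolean. The sketch below demonstrates this with plain strings standing in for period objects; both function names are hypothetical.

```python
from typing import Optional


def use_ltm_explicit(enabled: bool, start: Optional[str], end: Optional[str]) -> bool:
    """Explicit checks: the result is always a real bool."""
    return enabled and start is not None and end is not None


def use_ltm_implicit(enabled: bool, start: Optional[str], end: Optional[str]):
    """Implicit truthiness: returns the first falsy operand, or the last operand."""
    return enabled and start and end


explicit_result = use_ltm_explicit(True, "2024-10", "2025-09")
implicit_result = use_ltm_implicit(True, "2024-10", "2025-09")
implicit_missing = use_ltm_implicit(True, None, "2025-09")
```

Here `explicit_result` is `True`, while `implicit_result` is the string `"2025-09"` and `implicit_missing` is `None`.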

## Documentation for AI

### Help AI Understand Context

Add comments that help AI understand business context:

```python
# LTM (Last Twelve Months) is used for the most recent partial year
# to enable fair comparison with full calendar years.
# Example: If latest data is through Sep 2025, use Oct 2024 - Sep 2025
if year == LTM_END_YEAR and LTM_ENABLED:
    # Use 12-month rolling period instead of partial calendar year
    year_data = get_ltm_data(df, ltm_start, ltm_end)
```
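
The window arithmetic in that comment can be made concrete with pandas periods. This is a minimal sketch under stated assumptions: `ltm_window` and `select_ltm_rows` are hypothetical helpers written for this example, and the template's own `get_ltm_data` may work differently. It assumes a `'YearMonth'` column of monthly `pd.Period` values.

```python
from typing import Tuple

import pandas as pd


def ltm_window(latest_month: str) -> Tuple[pd.Period, pd.Period]:
    """Return (start, end) monthly periods for the 12 months ending at latest_month."""
    end = pd.Period(latest_month, freq='M')
    start = end - 11  # 11 months back plus the end month itself = 12 months
    return start, end


def select_ltm_rows(df: pd.DataFrame, start: pd.Period, end: pd.Period) -> pd.DataFrame:
    """Keep rows whose 'YearMonth' period falls inside the window (inclusive)."""
    in_window = (df['YearMonth'] >= start) & (df['YearMonth'] <= end)
    return df.loc[in_window]


# Sample data: one row per month from Jan 2024 through Sep 2025
sample = pd.DataFrame({'YearMonth': pd.period_range('2024-01', '2025-09', freq='M')})
ltm_start, ltm_end = ltm_window('2025-09')
ltm_rows = select_ltm_rows(sample, ltm_start, ltm_end)
```

For data through Sep 2025, the window runs from Oct 2024 to Sep 2025 and selects exactly twelve monthly rows.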

## Cursor-Specific Optimizations

### AI-Friendly Code Structure

Code should be structured so Cursor AI can:
1. **Understand intent** - Clear function names and comments
2. **Generate code** - Follow established patterns
3. **Fix errors** - Actionable error messages
4. **Extend functionality** - Modular, reusable functions

### Example: AI-Generated Code Pattern

When AI generates code, it should automatically follow the template's standard workflow:
```python
# AI recognizes this pattern and replicates it
def main():
    # 1. Load data (AI knows to use data_loader)
    df = load_sales_data(get_data_path())

    # 2. Validate (AI knows to check structure)
    is_valid, msg = validate_data_structure(df)
    if not is_valid:
        print(f"ERROR: {msg}")
        return

    # 3. Apply filters (AI knows exclusion filters)
    df = apply_exclusion_filters(df)

    # 4. Analysis logic (AI follows template patterns)
    # ...

    # 5. Create charts (AI knows formatting rules)
    # ...

    # 6. Validate revenue (AI knows to validate)
    validate_revenue(df, ANALYSIS_NAME)
```

### Help AI Generate Better Code

Add context comments that help AI:
```python
# LTM (Last Twelve Months) is used for the most recent partial year
# to enable fair comparison with full calendar years.
# Example: If latest data is through Sep 2025, use Oct 2024 - Sep 2025
# This avoids partial-year bias in year-over-year comparisons.
if year == LTM_END_YEAR and LTM_ENABLED:
    # Use 12-month rolling period instead of partial calendar year
    year_data = get_ltm_data(df, ltm_start, ltm_end)
    year_label = get_ltm_label()  # Returns "2025 (LTM 9/2025)"
```

## Summary Checklist

For Cursor-optimized code:
- ✅ Comprehensive docstrings with examples
- ✅ Type hints on functions
- ✅ Descriptive variable names
- ✅ Clear comments for business logic
- ✅ Structured error messages
- ✅ Consistent code patterns
- ✅ Use config values (never hardcode)
- ✅ Follow template utilities
- ✅ Include validation steps
- ✅ Reference documentation

## Summary

Follow these standards to ensure:
1. AI can understand code structure
2. AI can modify code safely
3. AI can generate new code following patterns
4. Code is maintainable and readable
5. Errors are clear and actionable
6. Cursor AI can assist effectively

---

**Last Updated:** January 2026
**For:** Cursor AI optimization and human developers