Code Quality & Best Practices
Comprehensive guide for writing Cursor-optimized code in the sales analysis template.
This document combines code quality standards and Cursor best practices to ensure AI assistants can effectively understand, modify, and extend the codebase.
Type Hints
When to Use Type Hints
Use type hints for:
- Function parameters
- Return values
- Class attributes
- Complex data structures
Example Pattern
from typing import Callable, Dict, List, Optional, Tuple
import pandas as pd

def calculate_annual_metrics(
    df: pd.DataFrame,
    metrics_func: Callable[[pd.DataFrame], dict],
    ltm_start: Optional[pd.Period] = None,
    ltm_end: Optional[pd.Period] = None,
) -> pd.DataFrame:
    """
    Calculate annual metrics for all years.

    Args:
        df: DataFrame with 'Year' and 'YearMonth' columns
        metrics_func: Function that takes a DataFrame and returns a dict of metrics
        ltm_start: LTM start period (defaults to config if None)
        ltm_end: LTM end period (defaults to config if None)

    Returns:
        DataFrame with 'Year' index and metric columns
    """
    ...  # Implementation
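The pattern above covers parameters and return values. For the other two items in the list, class attributes and complex data structures, a minimal sketch might look like the following. `YearMetrics`, `AnalysisConfig`, and `summarize` are hypothetical names used only for illustration, and plain dict rows stand in for a DataFrame so the example runs without third-party packages.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, TypedDict


class YearMetrics(TypedDict):
    """Shape of the metrics dict for a single year (hypothetical)."""
    revenue: float
    customer_count: int


@dataclass
class AnalysisConfig:
    """Typed class attributes (hypothetical config container)."""
    revenue_column: str
    customer_column: str
    ltm_end_year: Optional[int] = None


def summarize(rows: List[Dict[str, object]], config: AnalysisConfig) -> YearMetrics:
    """Aggregate plain-dict rows into a typed metrics dict."""
    revenue = sum(float(row[config.revenue_column]) for row in rows)
    customers = {row[config.customer_column] for row in rows}
    return {"revenue": revenue, "customer_count": len(customers)}
```

Naming the dict shape with `TypedDict` lets both readers and tools see which keys a metrics dict is expected to carry.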
Docstrings
Docstring Format
All functions should use Google-style docstrings:
def function_name(param1: type, param2: type) -> return_type:
    """
    Brief description of what the function does.

    More detailed explanation if needed. Can span multiple lines.
    Explain any complex logic or important considerations.

    Args:
        param1: Description of param1
        param2: Description of param2

    Returns:
        Description of return value

    Raises:
        ValueError: When and why this exception is raised

    Example:
        >>> result = function_name(value1, value2)
        >>> print(result)
        expected_output
    """
Required Elements
- Brief one-line summary
- Detailed description (if needed)
- Args section (all parameters)
- Returns section (return value)
- Raises section (if exceptions raised)
- Example section (for complex functions)
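As a concrete instance of these elements, here is a small sketch. `calculate_growth_rate` is a hypothetical function and is not claimed to be part of the template; it is only meant to show every required section filled in.

```python
def calculate_growth_rate(current: float, previous: float) -> float:
    """
    Calculate period-over-period growth as a fraction.

    Growth is undefined when the previous value is zero, so that case
    raises instead of returning infinity.

    Args:
        current: Value for the current period
        previous: Value for the prior period

    Returns:
        Growth as a fraction (0.25 means 25% growth)

    Raises:
        ValueError: If previous is zero

    Example:
        >>> calculate_growth_rate(125.0, 100.0)
        0.25
    """
    if previous == 0:
        raise ValueError("previous must be non-zero to compute a growth rate")
    return (current - previous) / previous
```

Because the Example section uses doctest syntax, it can be checked with `python -m doctest` against the file that contains it.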
Variable Naming
Conventions
- Descriptive names: customer_revenue not cr
- Consistent prefixes: df_ for DataFrames, annual_ for annual metrics
- Clear abbreviations: ltm for Last Twelve Months (well-known)
- Avoid single letters: Except for loop variables (i, j, k)
Good Examples
# Good
customer_revenue_by_year = df.groupby(['Customer', 'Year'])[REVENUE_COLUMN].sum()
annual_metrics_df = calculate_annual_metrics(df, metrics_func)
ltm_start_period, ltm_end_period = get_ltm_period_config()
# Bad
cr = df.groupby(['C', 'Y'])['R'].sum()
am = calc(df, mf)
s, e = get_ltm()
Error Messages
Structure
Error messages should be:
- Specific: What exactly went wrong
- Actionable: How to fix it
- Contextual: Where it occurred
- Helpful: Reference to documentation
Good Error Messages
# Good
raise ValueError(
    f"Required column '{REVENUE_COLUMN}' not found in data.\n"
    f"Available columns: {list(df.columns)}\n"
    "Please update config.py REVENUE_COLUMN to match your data.\n"
    "See .cursor/rules/data_loading.md for more help."
)
# Bad
raise ValueError("Column not found")
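One way to keep such messages consistent is to build them in a single helper. The sketch below assumes a hypothetical `require_column` function (not confirmed to exist in the template) that takes the available column names as a plain sequence.

```python
from typing import Sequence


def require_column(column: str, available: Sequence[str], config_name: str) -> None:
    """Raise a specific, actionable error if `column` is missing."""
    if column in available:
        return
    raise ValueError(
        f"Required column '{column}' not found in data.\n"
        f"Available columns: {list(available)}\n"
        f"Please update config.py {config_name} to match your data."
    )
```

With a DataFrame, a call would presumably look like `require_column(REVENUE_COLUMN, df.columns, "REVENUE_COLUMN")`.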
Code Comments
When to Comment
- Complex logic that isn't immediately obvious
- Business rules or domain-specific knowledge
- Workarounds or non-obvious solutions
- Performance considerations
- TODO items with context
Comment Style
# Good: Explains WHY, not WHAT
# Use LTM for most recent year to enable apples-to-apples comparison
# with full calendar years (avoids partial year bias)
if year == LTM_END_YEAR and LTM_ENABLED:
    year_data = get_ltm_data(df, ltm_start, ltm_end)
# Bad: States the obvious
# Check if year equals LTM_END_YEAR
if year == LTM_END_YEAR:
    ...
Function Design
Single Responsibility
Each function should do one thing well:
# Good: Single responsibility
def calculate_revenue(df: pd.DataFrame) -> float:
    """Calculate total revenue from DataFrame"""
    return df[REVENUE_COLUMN].sum()

def calculate_customer_count(df: pd.DataFrame) -> int:
    """Calculate unique customer count"""
    return df[CUSTOMER_COLUMN].nunique()

# Bad: Multiple responsibilities
def calculate_metrics(df):
    """Calculate revenue and customer count"""
    revenue = df[REVENUE_COLUMN].sum()
    customers = df[CUSTOMER_COLUMN].nunique()
    return revenue, customers
Function Length
- Keep functions under 50 lines when possible
- Break complex functions into smaller helper functions
- Use descriptive function names that explain purpose
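A minimal sketch of splitting one larger function into named helpers, under the assumption that each record is a plain dict with `Year`, `Revenue`, and `Customer` keys. All names here are hypothetical; the template itself works on DataFrames.

```python
from typing import Dict, List

Row = Dict[str, object]  # hypothetical stand-in for one record of sales data


def filter_year(rows: List[Row], year: int) -> List[Row]:
    """Keep only the rows that belong to the given year."""
    return [row for row in rows if row["Year"] == year]


def total_revenue(rows: List[Row]) -> float:
    """Sum the Revenue field across rows."""
    return sum(float(row["Revenue"]) for row in rows)


def unique_customers(rows: List[Row]) -> int:
    """Count distinct Customer values."""
    return len({row["Customer"] for row in rows})


def summarize_year(rows: List[Row], year: int) -> Dict[str, float]:
    """Orchestrate the helpers; each step stays short and separately testable."""
    year_rows = filter_year(rows, year)
    return {
        "Revenue": total_revenue(year_rows),
        "Customers": unique_customers(year_rows),
    }
```

The orchestrating function reads as a list of steps, while each helper can be checked in isolation.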
Import Organization
Standard Order
- Standard library imports
- Third-party imports (pandas, numpy, matplotlib)
- Local/template imports (data_loader, analysis_utils, config)
Example
# Standard library
from pathlib import Path
from typing import Dict, Optional
from datetime import datetime
# Third-party
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Template imports
from data_loader import load_sales_data, validate_data_structure
from analysis_utils import calculate_annual_metrics, setup_revenue_chart
from config import REVENUE_COLUMN, CHART_SIZES, COMPANY_NAME
Constants and Configuration
Use Config Values
# Good: From config
from config import REVENUE_COLUMN, DATE_COLUMN
revenue = df[REVENUE_COLUMN].sum()
# Bad: Hardcoded
revenue = df['USD'].sum()
Magic Numbers
Avoid magic numbers - use named constants or config:
# Good: Named constant
MILLIONS_DIVISOR = 1e6
revenue_millions = revenue / MILLIONS_DIVISOR
# Or from config
CHART_DPI = 300 # In config.py
# Bad: Magic number
revenue_millions = revenue / 1000000
Testing Considerations
Testable Code
Write code that's easy to test:
- Pure functions when possible (no side effects)
- Dependency injection for external dependencies
- Clear inputs and outputs
Example
# Good: Testable
def calculate_metrics(year_data: pd.DataFrame, revenue_col: str) -> Dict:
    """Calculate metrics - easy to test with sample data"""
    return {
        'Revenue': year_data[revenue_col].sum(),
        'Count': len(year_data)
    }

# Harder to test: Depends on global config
def calculate_metrics(year_data):
    """Uses global REVENUE_COLUMN - harder to test"""
    return {'Revenue': year_data[REVENUE_COLUMN].sum()}
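Dependency injection can be sketched in the same spirit: the data source is passed in as an argument, so a test can substitute a fake. The loader signature and the names below are assumptions made for illustration, not the template's actual API.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def total_revenue(load_rows: Callable[[], List[Row]], revenue_col: str) -> float:
    """Sum revenue from whatever rows the injected loader returns."""
    return sum(row[revenue_col] for row in load_rows())


def test_total_revenue() -> None:
    # A fake loader replaces file or database access in the test.
    def fake_loader() -> List[Row]:
        return [{"USD": 100.0}, {"USD": 250.0}]

    assert total_revenue(fake_loader, "USD") == 350.0
```

In production code the same function would receive a real loader, while the test never touches the filesystem.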
AI-Friendly Patterns
Clear Intent
Code should clearly express intent:
# Good: Intent is clear
customers_with_revenue = df[df[REVENUE_COLUMN] > 0][CUSTOMER_COLUMN].unique()
# Less clear: Requires understanding of pandas
customers_with_revenue = df.loc[df[REVENUE_COLUMN] > 0, CUSTOMER_COLUMN].unique()
Explicit Over Implicit
# Good: Explicit
if LTM_ENABLED and ltm_start is not None and ltm_end is not None:
    use_ltm = True
else:
    use_ltm = False
# Less clear: Implicit truthiness
use_ltm = LTM_ENABLED and ltm_start and ltm_end
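The difference is more than stylistic: Python's `and` returns one of its operands, so the implicit form may not produce a bool at all. The short demonstration below uses strings as stand-in values for the LTM periods (an assumption; the template appears to use period objects).

```python
LTM_ENABLED = True
ltm_start = "2024-10"  # stand-in values for illustration only
ltm_end = "2025-09"

# Implicit truthiness: the expression evaluates to the last operand.
implicit = LTM_ENABLED and ltm_start and ltm_end

# Explicit comparison: every operand is already a bool.
explicit = LTM_ENABLED and ltm_start is not None and ltm_end is not None

print(repr(implicit))  # '2025-09' (the last operand, not a bool)
print(repr(explicit))  # True
```

If `ltm_start` were `None`, the implicit form would evaluate to `None` rather than `False`, which can surprise code that later compares the flag with `is False` or serializes it.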
Documentation for AI
Help AI Understand Context
Add comments that help AI understand business context:
# LTM (Last Twelve Months) is used for the most recent partial year
# to enable fair comparison with full calendar years.
# Example: If latest data is through Sep 2025, use Oct 2024 - Sep 2025
if year == LTM_END_YEAR and LTM_ENABLED:
    # Use 12-month rolling period instead of partial calendar year
    year_data = get_ltm_data(df, ltm_start, ltm_end)
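The window arithmetic described in the comment can be sketched as follows. `ltm_window` is a hypothetical helper working on plain year and month integers; it is not the template's `get_ltm_data`, whose implementation is not shown here.

```python
from typing import Tuple

YearMonth = Tuple[int, int]


def ltm_window(end_year: int, end_month: int) -> Tuple[YearMonth, YearMonth]:
    """Return ((start_year, start_month), (end_year, end_month)) for a 12-month window."""
    if not 1 <= end_month <= 12:
        raise ValueError(f"end_month must be 1-12, got {end_month}")
    # Count months from year 0, then step back 11 months to reach the first month.
    start_index = end_year * 12 + (end_month - 1) - 11
    return (start_index // 12, start_index % 12 + 1), (end_year, end_month)


print(ltm_window(2025, 9))  # ((2024, 10), (2025, 9))
```

For data through Sep 2025 this yields Oct 2024 through Sep 2025, matching the example in the comment.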
Cursor-Specific Optimizations
AI-Friendly Code Structure
Code should be structured so Cursor AI can:
- Understand intent - Clear function names and comments
- Generate code - Follow established patterns
- Fix errors - Actionable error messages
- Extend functionality - Modular, reusable functions
Example: AI-Generated Code Pattern
When AI generates code, it should automatically follow this workflow:
# AI recognizes this pattern and replicates it
def main():
    # 1. Load data (AI knows to use data_loader)
    df = load_sales_data(get_data_path())

    # 2. Validate (AI knows to check structure)
    is_valid, msg = validate_data_structure(df)
    if not is_valid:
        print(f"ERROR: {msg}")
        return

    # 3. Apply filters (AI knows exclusion filters)
    df = apply_exclusion_filters(df)

    # 4. Analysis logic (AI follows template patterns)
    # ...

    # 5. Create charts (AI knows formatting rules)
    # ...

    # 6. Validate revenue (AI knows to validate)
    validate_revenue(df, ANALYSIS_NAME)
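Step 2 relies on a validator that returns an `(is_valid, msg)` pair rather than raising. A minimal sketch of that contract is shown below; `validate_columns` is a hypothetical stand-in, not the template's `validate_data_structure`, and it checks only column names.

```python
from typing import Iterable, Sequence, Tuple


def validate_columns(columns: Iterable[str], required: Sequence[str]) -> Tuple[bool, str]:
    """Return (is_valid, message) instead of raising, so callers decide how to stop."""
    present = set(columns)
    missing = [name for name in required if name not in present]
    if missing:
        return False, f"Missing required columns: {missing}. Available: {sorted(present)}"
    return True, "OK"
```

Returning a pair keeps the early-exit logic in `main()` explicit, which is the pattern the workflow above asks AI-generated code to replicate.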
Help AI Generate Better Code
Add context comments that help AI:
# LTM (Last Twelve Months) is used for the most recent partial year
# to enable fair comparison with full calendar years.
# Example: If latest data is through Sep 2025, use Oct 2024 - Sep 2025
# This avoids partial-year bias in year-over-year comparisons.
if year == LTM_END_YEAR and LTM_ENABLED:
    # Use 12-month rolling period instead of partial calendar year
    year_data = get_ltm_data(df, ltm_start, ltm_end)
    year_label = get_ltm_label()  # Returns "2025 (LTM 9/2025)"
Summary Checklist
For Cursor-optimized code:
- ✅ Comprehensive docstrings with examples
- ✅ Type hints on functions
- ✅ Descriptive variable names
- ✅ Clear comments for business logic
- ✅ Structured error messages
- ✅ Consistent code patterns
- ✅ Use config values (never hardcode)
- ✅ Follow template utilities
- ✅ Include validation steps
- ✅ Reference documentation
Summary
Follow these standards to ensure:
- AI can understand code structure
- AI can modify code safely
- AI can generate new code following patterns
- Code is maintainable and readable
- Errors are clear and actionable
- Cursor AI can assist effectively
Last Updated: January 2026
For: Cursor AI optimization and human developers