# Fourier Analysis for Seasonal Pattern Detection

Based on conversation with Nate Shenkute, 2025-11-05

[[The Fourier Transform]]

## Motivation

> "So if I understand it more correctly then, what I could do is I could actually transform my, you know, my demo marketing data, transform it onto some other instead of time space, some other space that has frequency. And then apply these things if I wanna detect those spectral points that are cyclical. Or repeating."

## Mathematical Framework

### The Discrete Fourier Transform

For a discrete time series x = {x₀, x₁, ..., x_{N-1}} of length N, the DFT is defined as:

```
X_k = Σ_{n=0}^{N-1} x_n · e^(-2πikn/N)    for k = 0, 1, ..., N-1
```

Where:
- X_k is the k-th frequency component
- i is the imaginary unit (i² = -1)
- k/N represents the frequency in cycles per sample

The magnitude |X_k| reveals the strength of the frequency component at k/N cycles per unit time.

### Frequency Domain Representation

**Time domain**: x(t) = marketing volume at time t

**Frequency domain**: X(f) = amplitude of cyclical component at frequency f

**Power spectrum**: P(f) = |X(f)|² quantifies energy at each frequency

**Dominant frequencies**: argmax_f P(f), taken over f > 0, identifies the strongest cyclical patterns

### Application to Marketing Data

For a time series of length N = 22 months:

```
Frequencies sampled: f_k = k/N cycles per month, k = 0, 1, ..., N/2

Key frequencies of interest:
- f = 1/12 (annual cycle)
- f = 1/6  (semi-annual cycle)
- f = 1/3  (quarterly cycle)
- f = 1/1  (monthly cycle): above the Nyquist limit of 1/2 cycles per month,
           so it cannot be detected from monthly data
```

With N = 22, none of these frequencies falls exactly on a bin k/22: 1/12, 1/6, and 1/3 correspond to k ≈ 1.83, 3.67, and 7.33. Each cycle therefore appears spread across the neighboring bins (see Spectral Leakage and Windowing).

## Detection Strategy

> "So when I would use the Fourier transform is if I actually want to detect a certain pattern of seasonality, like, if the seasonality comes in 3 month frequency or 6 month frequency, etcetera. Whatever that is is just that a Fourier transform will help me identify where those frequencies are."

### Peak Detection Algorithm

1. Compute FFT: X = FFT(marketing_volume_data), after subtracting the mean (and any trend) so that the k = 0 term does not dominate
2. Calculate power spectrum: P(f) = |X(f)|²
3. Identify peaks: {f_i} where P(f_i) > threshold
4. Rank by magnitude: Sort {f_i} by P(f_i) descending

### Interpretation

**If peak at f = 1/12**: Strong annual seasonality (12-month cycle)

**If peak at f = 1/6**: Semi-annual pattern (6-month cycle, e.g. summer/winter effects)

**If peak at f = 1/3**: Quarterly business cycles

**Multiple peaks**: Superposition of cyclical patterns

## Inverse Transform and Reconstruction

Given detected frequencies {f₁, f₂, ..., f_m}, reconstruct the seasonal component:

```
s(t) = Σ_{i=1}^m A_i · cos(2πf_i t + φ_i)
```

Where:
- A_i = 2·|X(f_i)|/N (amplitude from FFT magnitude; the factor 2/N applies to a real series for 0 < f_i < 1/2)
- φ_i = arg(X(f_i)) (phase from FFT complex angle)
- t = time index in months

### Integration with Linear Model

Full decomposition becomes:

```
y(t) = μ + βt + Σ_{i=1}^m [A_i · cos(2πf_i t + φ_i)] + ε(t)
```

Where:
- μ + βt: Trend component
- Σ[...]: Seasonal component from detected frequencies
- ε(t): Residual noise

## Implementation

```python
import numpy as np
from scipy.fft import rfft, rfftfreq

# demo_volume_timeseries: 1-D sequence of monthly volumes (defined elsewhere)
x = np.asarray(demo_volume_timeseries, dtype=float)
N = len(x)

# Transform to frequency domain. Subtract the mean so the k = 0 (DC) term
# does not swamp the seasonal peaks; rfft keeps only non-negative frequencies.
spectrum = rfft(x - x.mean())
freq_bins = rfftfreq(N, d=1)  # d=1 for monthly sampling -> cycles per month

# Compute power spectrum
power_spectrum = np.abs(spectrum) ** 2

# Identify dominant frequencies (skip the DC bin at index 0)
threshold = np.percentile(power_spectrum[1:], 95)
dominant_idx = np.where(power_spectrum[1:] > threshold)[0] + 1
dominant_freqs = freq_bins[dominant_idx]

# Extract amplitudes and phases. The 2/N factor rescales a one-sided bin to a
# cosine amplitude (the Nyquist bin at f = 1/2, if selected, needs 1/N instead).
amplitudes = 2 * np.abs(spectrum[dominant_idx]) / N
phases = np.angle(spectrum[dominant_idx])
```

## Advantages Over Assumed Seasonality

**Traditional approach**: Assume monthly dummy variables for 12-month seasonality

**FFT approach**:
- Discovers actual periodicity from data
- May reveal 6-month or 3-month cycles that an assumed annual pattern would not isolate
- Quantifies relative strength of each cyclical component
- Provides phase information (when cycles peak)

**Mathematical benefit**: The Fourier basis functions at the bin frequencies k/N are orthogonal over the N samples, so each component is estimated independently of the others

## Nyquist Frequency Considerations

With N = 22 monthly observations:

```
f_nyquist = 1/(2·Δt) = 1/2 cycles per month
```

**Detectable periods**: T ≥ 2 months

**Reliable detection**: Requires at least 2-3 full cycles
- 12-month cycle: 22/12 ≈ 1.8 cycles (marginal)
- 6-month cycle: 22/6 ≈ 3.7 cycles (good)
- 3-month cycle: 22/3 ≈ 7.3 cycles (excellent)

## Spectral Leakage and Windowing

For finite time series, spectral leakage occurs when true frequencies don't align with FFT bins.

**Mitigation**: Apply a window function before the FFT

```
x_windowed(n) = x(n) · w(n)

Common windows:
- Hann: w(n) = 0.5(1 - cos(2πn/N))
- Hamming: w(n) = 0.54 - 0.46·cos(2πn/N)
- Blackman: w(n) = 0.42 - 0.5·cos(2πn/N) + 0.08·cos(4πn/N)
```

Trade-off: Reduced leakage vs. broader peaks

## Current Project Application

> "Actually, my attention is now slightly shifted. Curious about the spectral analysis as you framed it."

**For immediate forecasting** (Friday deadline): Use simple monthly seasonal factors

**For future refinement**: FFT analysis to discover true cyclical structure once more data accumulates

**Recommendation**: Run the FFT as a validation check, either to confirm that the assumed 12-month seasonality is the dominant frequency or to reveal other patterns worth modeling

## Mathematical Properties

**Parseval's Theorem**: Energy conservation

```
Σ_{n=0}^{N-1} |x_n|² = (1/N) Σ_{k=0}^{N-1} |X_k|²
```

**Convolution Theorem**: Filtering in the frequency domain, where x * h denotes circular convolution and · is element-wise multiplication

```
FFT(x * h) = FFT(x) · FFT(h)
```

**Linearity**: Superposition principle

```
FFT(ax + by) = a·FFT(x) + b·FFT(y)
```

These properties enable efficient computation and interpretation of cyclical components.

---

**Note**: For a 22-month dataset, the FFT provides marginal benefit over domain knowledge. It is valuable for longer time series or when the seasonality structure is unknown. Consider it for model validation and future iterations.
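
As a minimal sketch of the detection and reconstruction steps described above, the following runs on a synthetic 24-month series. The series is an assumption, chosen so that the 12-month and 6-month cycles fall exactly on FFT bins; the real 22-month data would not be this clean. The helper names `detect_cycles` and `reconstruct` are hypothetical and not part of the original analysis.

```python
import numpy as np

def detect_cycles(x, n_peaks=2):
    """Return (freqs, amplitudes, phases) for the strongest non-DC bins."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x - x.mean())   # remove the mean so DC does not dominate
    freqs = np.fft.rfftfreq(n, d=1.0)      # cycles per month for monthly data
    power = np.abs(spectrum) ** 2
    power[0] = 0.0                         # never select the DC bin
    idx = np.sort(np.argsort(power)[::-1][:n_peaks])
    # One-sided bins carry half the cosine amplitude, except the Nyquist bin.
    scale = np.full(idx.shape, 2.0)
    if n % 2 == 0:
        scale[idx == n // 2] = 1.0
    amplitudes = scale * np.abs(spectrum[idx]) / n
    phases = np.angle(spectrum[idx])
    return freqs[idx], amplitudes, phases

def reconstruct(freqs, amplitudes, phases, t):
    """Evaluate s(t) = sum_i A_i * cos(2*pi*f_i*t + phi_i)."""
    t = np.asarray(t, dtype=float)[:, None]
    return np.sum(amplitudes * np.cos(2 * np.pi * freqs * t + phases), axis=1)

# Synthetic example (assumed data): level 10, annual and semi-annual cycles.
t = np.arange(24)
series = (10
          + 3.0 * np.cos(2 * np.pi * t / 12 + 0.5)
          + 1.5 * np.cos(2 * np.pi * t / 6 - 1.0))

freqs, amps, phases = detect_cycles(series, n_peaks=2)
seasonal = reconstruct(freqs, amps, phases, t)
```

Because the synthetic cycles are bin-aligned, the recovered frequencies, amplitudes, and phases match the ones used to build the series. With 22 months of data the same cycles would sit between bins, so the recovered amplitudes would be smeared across neighbors and should be treated as rough estimates.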