Module 03: Fat Tails & Stylized Facts
Universal patterns in financial returns that every statistician should know
1. Introduction: Universal Patterns in Financial Returns
In Module 02 we established that returns, not prices, are the proper object of statistical analysis. We also hinted that returns are not normally distributed. This module dives deep into how returns deviate from normality and introduces the stylized facts — a set of empirical regularities that appear across nearly all financial assets, markets, and time periods.
As a statistician, you will recognize these as distributional properties and serial dependence structures. What makes them remarkable is their universality: the same patterns appear in equities, currencies, commodities, and interest rates, across New York, London, Tokyo, and Mumbai, from the 1920s to today.
2. Heavy Tails: Extreme Events Are Not Rare
2.1 Kurtosis: Measuring Tail Heaviness
The kurtosis of a distribution measures the weight of its tails relative to the normal distribution. The normal distribution has a kurtosis of 3 (or an excess kurtosis of 0). Financial returns typically have excess kurtosis between 5 and 50 for daily data.
An excess kurtosis of 10 does not mean the distribution is “slightly” non-normal. It means the probability of a 5-sigma event is orders of magnitude larger than the normal distribution predicts.
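To make "orders of magnitude" concrete, here is a minimal sketch comparing the probability of a 5-sigma day under the normal distribution and under a unit-variance Student-t with ν = 4 (an illustrative value in the range typically fitted to daily returns):

```python
import numpy as np
from scipy import stats

# Illustrative heavy-tailed alternative: Student-t with nu = 4 degrees of
# freedom, rescaled so both distributions have unit variance
# (the variance of t_nu is nu / (nu - 2)).
nu = 4
scale = np.sqrt((nu - 2) / nu)

p_normal = 2 * stats.norm.sf(5)         # two-tail P(|Z| > 5) under N(0, 1)
p_t = 2 * stats.t.sf(5 / scale, df=nu)  # same event under unit-variance t(4)

print(f"Normal: {p_normal:.2e}")
print(f"t(4):   {p_t:.2e}")
print(f"Ratio:  {p_t / p_normal:,.0f}x")  # several thousand times more likely
```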
```python
import yfinance as yf
import numpy as np
import pandas as pd
import scipy.stats as stats

# Download several assets to show universality
tickers = {
    "S&P 500": "^GSPC",
    "Apple": "AAPL",
    "Euro/USD": "EURUSD=X",
    "Gold": "GC=F",
    "10Y Treasury": "^TNX",
    "Bitcoin": "BTC-USD",
}

results = []
for name, ticker in tickers.items():
    # auto_adjust=False keeps the "Adj Close" column in newer yfinance versions;
    # squeeze() collapses the single-ticker DataFrame column to a Series
    data = yf.download(ticker, start="2015-01-01", end="2025-01-01",
                       progress=False, auto_adjust=False)
    close = data["Adj Close"].squeeze()
    ret = np.log(close / close.shift(1)).dropna()
    results.append({
        "Asset": name,
        "N": len(ret),
        "Mean (bps)": ret.mean() * 10000,
        "Std (%)": ret.std() * 100,
        "Skewness": ret.skew(),
        "Excess Kurtosis": ret.kurtosis(),
        "Min (%)": ret.min() * 100,
        "Max (%)": ret.max() * 100,
    })

df = pd.DataFrame(results)
print(df.to_string(index=False))
# Notice: EVERY asset has excess kurtosis >> 0
```
2.2 How Much More Likely Are Extreme Events?
Under a normal distribution, a daily return exceeding 4 standard deviations should occur roughly once every 126 years. In practice, the S&P 500 produces one every year or two. This table quantifies the discrepancy:
| Event Size | Normal Prob (one-tail) | Expected Freq (daily) | Actual Freq (S&P 500) | Ratio |
|---|---|---|---|---|
| 3σ move | 0.135% | Once per 3 years | Several per year | ~5–10x |
| 4σ move | 0.003% | Once per 126 years | Once per 1–2 years | ~100x |
| 5σ move | 0.00003% | Once per 13,000 years | Once per 5–10 years | ~1,000x |
| 6σ move | ~0.0000001% | Once per 4 million years | Several times in 100 years | ~10,000x+ |
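The "Expected Freq (daily)" column can be reproduced directly from the normal survival function. A quick check, assuming 252 trading days per year:

```python
import scipy.stats as stats

# Expected waiting time (in years) for a k-sigma daily move under normality,
# assuming 252 trading days per year
waiting_years = {}
for k in [3, 4, 5, 6]:
    p = stats.norm.sf(k)              # one-tail P(Z > k)
    waiting_years[k] = 1 / (p * 252)
    print(f"{k} sigma: p = {p:.2e}, once every {waiting_years[k]:,.0f} years")
```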
2.3 Comparing to the Student-t Distribution
A much better model for the unconditional distribution of returns is the Student-t distribution with low degrees of freedom (ν ≈ 3–7). This naturally produces heavier tails.
```python
import matplotlib.pyplot as plt

# Fit a t-distribution to AAPL log returns
aapl = yf.download("AAPL", start="2015-01-01", end="2025-01-01",
                   progress=False, auto_adjust=False)
close = aapl["Adj Close"].squeeze()
log_ret = np.log(close / close.shift(1)).dropna()

# Fit Student-t (returns df, loc, scale)
t_params = stats.t.fit(log_ret)
print(f"Fitted t-distribution: df={t_params[0]:.2f}, "
      f"loc={t_params[1]:.6f}, scale={t_params[2]:.6f}")

# Compare fits visually
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
x = np.linspace(log_ret.min(), log_ret.max(), 500)

# Linear scale
axes[0].hist(log_ret, bins=150, density=True, alpha=0.6, color='#3182ce', label='Data')
axes[0].plot(x, stats.norm.pdf(x, log_ret.mean(), log_ret.std()), 'r-', lw=2, label='Normal')
axes[0].plot(x, stats.t.pdf(x, *t_params), 'g-', lw=2, label=f'Student-t (df={t_params[0]:.1f})')
axes[0].set_title('Density Comparison (Linear Scale)')
axes[0].legend()

# Log scale — crucial for seeing tail behavior
axes[1].hist(log_ret, bins=200, density=True, alpha=0.6, color='#3182ce', label='Data')
axes[1].plot(x, stats.norm.pdf(x, log_ret.mean(), log_ret.std()), 'r-', lw=2, label='Normal')
axes[1].plot(x, stats.t.pdf(x, *t_params), 'g-', lw=2, label=f'Student-t (df={t_params[0]:.1f})')
axes[1].set_yscale('log')
axes[1].set_title('Density Comparison (Log Scale — Shows Tails)')
axes[1].legend()

plt.tight_layout()
plt.show()
```
3. QQ Plots: The Visual Signature of Fat Tails
3.1 Reading a QQ Plot
The quantile-quantile (QQ) plot is the single most informative diagnostic for distributional assumptions. It plots the quantiles of your data against the quantiles of a theoretical distribution. Points on the 45-degree line mean perfect agreement.
- S-shaped deviation: the sample quantiles bend above the line in the right tail and below it in the left tail; this is the signature of heavy tails (leptokurtosis).
- Concave or convex pattern: Indicates skewness.
- Staircase pattern: Indicates discreteness in the data.
3.2 QQ Plots Against Different Reference Distributions
```python
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# 1. QQ plot against Normal
stats.probplot(log_ret, dist="norm", plot=axes[0])
axes[0].set_title('QQ Plot: Data vs Normal')
axes[0].get_lines()[0].set_markerfacecolor('#3182ce')
axes[0].get_lines()[0].set_markersize(2)

# 2. QQ plot against t-distribution with fitted df
stats.probplot(log_ret, dist=stats.t, sparams=(t_params[0],), plot=axes[1])
axes[1].set_title(f'QQ Plot: Data vs t(df={t_params[0]:.1f})')
axes[1].get_lines()[0].set_markerfacecolor('#38a169')
axes[1].get_lines()[0].set_markersize(2)

# 3. QQ plot against Laplace (double exponential)
stats.probplot(log_ret, dist=stats.laplace, plot=axes[2])
axes[2].set_title('QQ Plot: Data vs Laplace')
axes[2].get_lines()[0].set_markerfacecolor('#d69e2e')
axes[2].get_lines()[0].set_markersize(2)

plt.tight_layout()
plt.show()
# The normal QQ will show clear S-shaped deviation
# The t-distribution QQ will be much closer to the diagonal
# The Laplace may overshoot the tails slightly
```
4. Volatility Clustering: Large Moves Follow Large Moves
4.1 The Phenomenon
If you plot the time series of daily returns, you will notice something striking: large moves (positive or negative) tend to cluster together, and calm periods tend to cluster together. This is called volatility clustering.
Mandelbrot described it: "large changes tend to be followed by large changes — of either sign — and small changes tend to be followed by small changes."
4.2 Visualizing Volatility Clustering
```python
fig, axes = plt.subplots(3, 1, figsize=(14, 10), sharex=True)

# Panel 1: Returns
axes[0].plot(log_ret.index, log_ret.values, color='#1a365d', linewidth=0.4)
axes[0].set_ylabel('Log Return')
axes[0].set_title('AAPL: Returns and Volatility Clustering')
axes[0].axhline(y=0, color='gray', linestyle='--', linewidth=0.5)

# Panel 2: Absolute returns (proxy for volatility)
axes[1].plot(log_ret.index, log_ret.abs().values, color='#e53e3e', linewidth=0.4)
axes[1].set_ylabel('|Log Return|')
axes[1].set_title('Absolute Returns (Volatility Proxy)')

# Panel 3: Squared returns (another volatility proxy)
axes[2].plot(log_ret.index, (log_ret ** 2).values, color='#d69e2e', linewidth=0.4)
axes[2].set_ylabel('Return²')
axes[2].set_title('Squared Returns (Variance Proxy)')

plt.tight_layout()
plt.show()
# The clustering in panels 2 and 3 is visually obvious
```
4.3 ARCH Effects: Testing for Volatility Clustering
The Engle ARCH test formally tests whether squared residuals exhibit serial correlation — the statistical signature of volatility clustering.
```python
from statsmodels.stats.diagnostic import het_arch

# Engle's ARCH test on residuals (or demeaned returns)
demeaned = log_ret - log_ret.mean()
arch_test = het_arch(demeaned, nlags=10)

print("=== Engle's ARCH Test ===")
print(f"LM statistic: {arch_test[0]:.4f}")
print(f"p-value: {arch_test[1]:.2e}")
print(f"F-statistic: {arch_test[2]:.4f}")
print(f"F p-value: {arch_test[3]:.2e}")
print(f"Conclusion: {'ARCH effects present' if arch_test[1] < 0.05 else 'No ARCH effects'}")
# Will almost certainly show significant ARCH effects
```
5. Autocorrelation: Returns vs Absolute Returns
5.1 The Key Stylized Fact
Returns themselves show little or no serial correlation. This is consistent with weak-form market efficiency. However, absolute returns and squared returns show strong, persistent positive autocorrelation that decays slowly. This is the autocorrelation signature of volatility clustering.
- Corr(r_t, r_{t−k}) ≈ 0 for k ≥ 1 (writing r_t = σ_t z_t, the innovations z_t are independent)
- Corr(|r_t|, |r_{t−k}|) > 0 (because the volatility σ_t is persistent)
5.2 Autocorrelation Plots
```python
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.stattools import acf

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# ACF of returns
plot_acf(log_ret, lags=50, ax=axes[0, 0], title='ACF of Returns',
         color='#1a365d', alpha=0.05)

# ACF of absolute returns
plot_acf(log_ret.abs(), lags=50, ax=axes[0, 1], title='ACF of |Returns|',
         color='#e53e3e', alpha=0.05)

# ACF of squared returns
plot_acf(log_ret ** 2, lags=50, ax=axes[1, 0], title='ACF of Returns²',
         color='#d69e2e', alpha=0.05)

# Compare ACF decay: absolute vs squared
acf_abs = acf(log_ret.abs(), nlags=100)
acf_sq = acf(log_ret ** 2, nlags=100)
axes[1, 1].plot(acf_abs, label='|Returns|', color='#e53e3e')
axes[1, 1].plot(acf_sq, label='Returns²', color='#d69e2e')
axes[1, 1].set_title('ACF Comparison: |r| vs r²')
axes[1, 1].set_xlabel('Lag')
axes[1, 1].legend()

plt.tight_layout()
plt.show()
# Key observation: |returns| has HIGHER autocorrelation than returns²
# This is called the "Taylor effect" — another stylized fact
```
5.3 The Ljung-Box Test for Serial Correlation
```python
from statsmodels.stats.diagnostic import acorr_ljungbox

# Test serial correlation in returns
lb_returns = acorr_ljungbox(log_ret, lags=[5, 10, 20], return_df=True)
print("=== Ljung-Box Test: Returns ===")
print(lb_returns)
print()

# Test serial correlation in squared returns
lb_squared = acorr_ljungbox(log_ret ** 2, lags=[5, 10, 20], return_df=True)
print("=== Ljung-Box Test: Squared Returns ===")
print(lb_squared)
# Returns: may or may not show significance
# Squared returns: will almost certainly show HIGHLY significant autocorrelation
```
5.4 Long Memory in Volatility
The autocorrelation of absolute returns decays very slowly — much slower than the exponential decay of a GARCH(1,1) model. This is evidence of long memory in volatility, sometimes modeled with fractional integration (FIGARCH) or hyperbolic decay functions.
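The distinction can be sketched with two illustrative decay curves (φ = 0.9 and d = 0.3 below are arbitrary values chosen for the picture, not estimates from any data):

```python
import numpy as np

lags = np.arange(1, 101)
exponential = 0.9 ** lags    # short-memory (GARCH-style) ACF decay
hyperbolic = lags ** -0.3    # long-memory (hyperbolic) ACF decay

# After 100 lags the exponential ACF is effectively zero, while the
# hyperbolic ACF is still far from negligible
print(f"lag 100: exponential = {exponential[-1]:.2e}, hyperbolic = {hyperbolic[-1]:.2f}")
```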
6. Formal Normality Tests for Financial Data
6.1 Overview of Tests
| Test | Tests What | Best For | Limitations |
|---|---|---|---|
| Jarque-Bera | Skewness + kurtosis jointly | Large samples, financial data | Only uses 3rd and 4th moments |
| Shapiro-Wilk | Overall distributional shape | Small to medium samples (n < 5000) | Not valid for n > 5000 |
| Anderson-Darling | Tail behavior (more weight on tails) | Detecting tail deviations | scipy reports critical values rather than a p-value |
| Lilliefors | Overall CDF comparison | When parameters are estimated | Less powerful than AD for tails |
| D’Agostino-Pearson | Skewness + kurtosis omnibus | General purpose | Requires n > 20 |
6.2 Running All Tests
```python
# Comprehensive normality testing
print("=== Normality Tests on AAPL Log Returns ===")
print(f"Sample size: {len(log_ret)}\n")

# 1. Jarque-Bera (ideal for financial data)
jb_stat, jb_p = stats.jarque_bera(log_ret)
print(f"Jarque-Bera:      stat={jb_stat:>10.2f}, p={jb_p:.2e}")

# 2. D'Agostino-Pearson omnibus
dp_stat, dp_p = stats.normaltest(log_ret)
print(f"D'Agostino-K2:    stat={dp_stat:>10.2f}, p={dp_p:.2e}")

# 3. Anderson-Darling
ad_result = stats.anderson(log_ret, dist='norm')
print(f"Anderson-Darling: stat={ad_result.statistic:>10.4f}")
for sl, cv in zip(ad_result.significance_level, ad_result.critical_values):
    reject = "REJECT" if ad_result.statistic > cv else "fail to reject"
    print(f"  At {sl}% level: critical={cv:.4f} -> {reject}")

# 4. Shapiro-Wilk (subsample for large n)
sw_sample = log_ret.sample(min(5000, len(log_ret)), random_state=42)
sw_stat, sw_p = stats.shapiro(sw_sample)
print(f"Shapiro-Wilk:     stat={sw_stat:>10.6f}, p={sw_p:.2e} (n={len(sw_sample)})")

# 5. Lilliefors (KS test with estimated parameters)
from statsmodels.stats.diagnostic import lilliefors
lf_stat, lf_p = lilliefors(log_ret, dist='norm')
print(f"Lilliefors:       stat={lf_stat:>10.6f}, p={lf_p:.4f}")

print("\nAll tests will reject normality. This is not surprising — it is a fact.")
```
Warning: Avoid the plain Kolmogorov-Smirnov test (stats.kstest) for normality testing when you estimate μ and σ from the data. The KS test assumes the parameters are known a priori; estimating them from the same sample makes the test overly conservative (reduced power). Use the Lilliefors modification instead, which corrects for parameter estimation.
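The conservatism is easy to demonstrate by simulation. The sketch below draws genuinely normal samples, estimates μ and σ from each, and feeds them to stats.kstest; under a correctly calibrated test the p-values would be Uniform(0, 1) and the 5% rejection rate would be about 5%:

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(42)
pvals = []
for _ in range(500):
    x = rng.normal(size=200)
    # Naive: plug the sample mean and std into the reference distribution
    _, p = stats.kstest(x, 'norm', args=(x.mean(), x.std(ddof=1)))
    pvals.append(p)

pvals = np.array(pvals)
print(f"Mean p-value:     {pvals.mean():.2f}  (a calibrated test gives ~0.50)")
print(f"Rejections at 5%: {np.mean(pvals < 0.05):.3f}  (a calibrated test gives ~0.05)")
# The p-values pile up near 1: the naive test almost never rejects
```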
7. The Leverage Effect: Asymmetric Volatility
7.1 What Is the Leverage Effect?
The leverage effect is the empirical observation that negative returns tend to increase future volatility more than positive returns of the same magnitude. When a stock drops 5%, subsequent volatility tends to be higher than after a 5% gain.
7.2 Measuring the Leverage Effect
```python
# Cross-correlation between returns and future squared returns
# Negative returns at time t should predict higher squared returns at t+1, t+2, ...
max_lag = 20
leverage_corr = []
for k in range(1, max_lag + 1):
    corr = log_ret.corr(log_ret.shift(-k) ** 2)
    leverage_corr.append(corr)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: cross-correlation (leverage effect)
axes[0].bar(range(1, max_lag + 1), leverage_corr, color='#e53e3e')
axes[0].set_xlabel('Lag k (days ahead)')
axes[0].set_ylabel('Corr(r_t, r²_{t+k})')
axes[0].set_title('Leverage Effect: Return vs Future Volatility')
axes[0].axhline(y=0, color='gray', linestyle='--')

# Right: scatter of returns vs next-day absolute returns
axes[1].scatter(log_ret.iloc[:-1].values, log_ret.abs().iloc[1:].values,
                alpha=0.15, s=5, color='#1a365d')
axes[1].set_xlabel('Return today')
axes[1].set_ylabel('|Return| tomorrow')
axes[1].set_title('Asymmetric Volatility Response')

# Add binned means to show the asymmetry
bins = pd.qcut(log_ret.iloc[:-1], 20)
binned_mean = log_ret.abs().iloc[1:].groupby(bins.values).mean()
bin_centers = [interval.mid for interval in binned_mean.index]
axes[1].plot(bin_centers, binned_mean.values, 'r-o', linewidth=2, markersize=4)

plt.tight_layout()
plt.show()
# The cross-correlations should be negative (negative return -> higher future vol)
# The scatter should show a "smirk" — higher abs returns following negative returns
```
8. Summary: The Complete List of Stylized Facts
Here is the canonical list of stylized facts, each mapped to its statistical interpretation:
| # | Stylized Fact | Statistical Description | Implication |
|---|---|---|---|
| 1 | Heavy tails | Excess kurtosis >> 0; power-law tails | Normal distribution underestimates extreme events |
| 2 | Absence of return autocorrelation | ACF(r_t) ≈ 0 for lags k ≥ 1 | Returns are approximately unpredictable |
| 3 | Volatility clustering | ACF(|r_t|) > 0, slowly decaying | Conditional heteroscedasticity; need GARCH |
| 4 | Leverage effect | Corr(r_t, σ²_{t+1}) < 0 | Negative returns increase volatility more |
| 5 | Volume-volatility correlation | Corr(V_t, |r_t|) > 0 | High volume accompanies large price moves |
| 6 | Aggregational Gaussianity | Kurtosis decreases at longer horizons | CLT applies; monthly returns are more normal than daily |
| 7 | Taylor effect | ACF(|r_t|) > ACF(r_t²) | Absolute returns are more predictable than squared |
| 8 | Gain/loss asymmetry | Negative skewness in stock returns | Large drops more likely than equivalent large gains |
| 9 | Long memory in volatility | ACF of |r_t| decays hyperbolically, not exponentially | GARCH(1,1) may be insufficient; consider FIGARCH |
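Aggregational Gaussianity (fact 6) can be illustrated with a quick simulation. The sketch below uses i.i.d. Student-t(5) draws as stand-in daily returns (real returns also cluster, but the CLT's pull on kurtosis is the same): summing over longer horizons drives excess kurtosis toward 0.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
daily = rng.standard_t(df=5, size=252 * 400)   # 400 "years" of simulated daily returns

kurt = {}
for horizon in [1, 5, 21]:                      # daily, weekly, monthly
    n = len(daily) // horizon * horizon
    agg = daily[:n].reshape(-1, horizon).sum(axis=1)
    kurt[horizon] = stats.kurtosis(agg)         # excess kurtosis
    print(f"horizon {horizon:2d}: excess kurtosis = {kurt[horizon]:.2f}")
# For i.i.d. draws, excess kurtosis shrinks roughly like 1/horizon
```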
In the next module, we will turn to the multivariate structure of financial data: how correlations between assets behave, why they break down during crises, and why Pearson correlation is often the wrong tool for the job.