Learn Without Walls

Module 08: CAPM & Factor Models = Regression

Asset pricing models are regressions you already know how to run

Part 2 of 5 · Module 08 of 22

1. The Capital Asset Pricing Model Is a Regression

The Capital Asset Pricing Model (CAPM), developed independently by Sharpe (1964), Lintner (1965), and Mossin (1966), is the single most important model in asset pricing. For a statistician, it has a beautifully simple interpretation: it is a simple linear regression.

ri,t − rf,t = αi + βi (rm,t − rf,t) + εi,t

This is an OLS regression where:

Finance Symbol | Regression Symbol | Meaning
ri,t − rf,t | Yt | Dependent variable: the stock's excess return over the risk-free rate
rm,t − rf,t | Xt | Independent variable: the market's excess return
αi | Intercept | Excess return not explained by market exposure (“alpha”)
βi | Slope | Sensitivity to market movements (“beta”)
εi,t | Residual | Idiosyncratic (stock-specific) return
Stats Bridge

The CAPM regression is exactly what you'd write in statsmodels: Y ~ X. Beta is the OLS slope, alpha is the OLS intercept, epsilon is the residual, and R-squared tells you what fraction of the stock's return variance is explained by the market. Every diagnostic you know — residual plots, heteroscedasticity tests, influence measures — applies directly.
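As a quick sanity check, here is that regression on simulated data. The column names `excess_ret` and `mkt_excess` and the true parameters are illustrative, not taken from this module's dataset:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate excess returns with a known beta of 1.2 and zero alpha
rng = np.random.default_rng(0)
mkt = rng.normal(0.0005, 0.01, 1000)             # market excess returns
stock = 1.2 * mkt + rng.normal(0, 0.015, 1000)   # stock excess returns

df = pd.DataFrame({"excess_ret": stock, "mkt_excess": mkt})
fit = smf.ols("excess_ret ~ mkt_excess", data=df).fit()

print(fit.params)    # Intercept ≈ 0 (alpha), mkt_excess ≈ 1.2 (beta)
print(fit.rsquared)  # fraction of variance explained by the market
```

The fitted slope recovers the true beta up to sampling error, and the intercept is statistically indistinguishable from the true alpha of zero.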

1.1 The CAPM Prediction

The theoretical CAPM makes a strong prediction: αi = 0 for all assets. If markets are efficient and CAPM is the correct model, no asset should earn a return above or below what its market exposure (β) predicts. A non-zero alpha means the asset is mispriced.

Finance Term

Alpha (α): The holy grail of active management. Fund managers are judged by whether they generate “alpha” — returns in excess of what a passive market exposure would produce. If a fund earns 12% while its beta to the market is 1.2 and the market returned 10%, then (ignoring the risk-free rate for simplicity) the CAPM-predicted return is 1.2 × 10% = 12%. Alpha = 12% − 12% = 0%. The manager added no value beyond market exposure.
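The arithmetic generalizes: with an explicit risk-free rate, the CAPM-predicted return is rf + β(rm − rf), and alpha is the realized return minus that prediction. A small illustrative helper (not part of any library):

```python
def capm_alpha(fund_return, beta, market_return, rf=0.0):
    """Alpha = realized return minus the CAPM-predicted return."""
    predicted = rf + beta * (market_return - rf)
    return fund_return - predicted

# The example from the text (risk-free rate set to zero for simplicity)
print(capm_alpha(0.12, 1.2, 0.10))           # ≈ 0.0, no alpha
# With a 2% risk-free rate the prediction is 0.02 + 1.2 * 0.08 = 11.6%
print(capm_alpha(0.12, 1.2, 0.10, rf=0.02))  # ≈ 0.004, a small positive alpha
```

Note that including the risk-free rate flips the verdict slightly: a high-beta fund is predicted to earn less once rf is netted out, so the same 12% return now carries a small positive alpha.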

2. Beta: The Regression Slope Coefficient

Beta has a clean formula that should be immediately recognizable:

βi = Cov(ri, rm) / Var(rm)
Stats Bridge

This is the formula for the OLS slope coefficient in simple linear regression: β̂ = Cov(Y, X) / Var(X). You've derived this a hundred times. Beta is the regression coefficient of the stock's excess return on the market's excess return. It measures how many percentage points the stock moves for each 1% move in the market.
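You can confirm the equivalence numerically; the simulated series below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
mkt = rng.normal(0, 0.01, 500)                  # market excess returns
stock = 0.9 * mkt + rng.normal(0, 0.02, 500)    # stock excess returns

# Beta via the covariance formula (matching ddof in cov and var)
beta_cov = np.cov(stock, mkt)[0, 1] / np.var(mkt, ddof=1)

# Beta via OLS: least-squares fit of stock on [1, mkt]
X = np.column_stack([np.ones_like(mkt), mkt])
beta_ols = np.linalg.lstsq(X, stock, rcond=None)[0][1]

print(beta_cov, beta_ols)  # identical up to floating-point error
```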

2.1 Interpreting Beta Values

Beta Value | Interpretation | Example
β = 1.0 | Moves with the market | S&P 500 index fund (by definition)
β > 1.0 | More volatile than market (amplifies movements) | Tech stocks, small-cap growth stocks
0 < β < 1.0 | Less volatile than market (dampens movements) | Utilities, consumer staples
β = 0 | Uncorrelated with market | Market-neutral hedge fund (by design)
β < 0 | Moves opposite to market | Gold (sometimes), volatility products

2.2 Beta Decomposition of Total Risk

The CAPM regression decomposes total variance into two components:

Var(ri) = βi² Var(rm) + Var(εi)

Total Risk = Systematic Risk + Idiosyncratic Risk

The fraction of variance explained by the market is the R-squared:

R² = βi² Var(rm) / Var(ri) = 1 − Var(εi) / Var(ri)
Key Insight

For a typical individual stock, R² is only 20–40%, meaning 60–80% of the stock's return variance is idiosyncratic (unexplained by the market). For a diversified portfolio, R² is much higher (often 90%+) because diversification averages away idiosyncratic risk. This is why β matters more for portfolio management than for individual stock analysis.
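A simulation makes the decomposition concrete. The beta and volatilities below are made up, and the cross term Cov(rm, ε) is only approximately zero in a finite sample, so the identity holds up to sampling noise:

```python
import numpy as np

rng = np.random.default_rng(2)
T, beta = 10_000, 1.1
mkt = rng.normal(0, 0.01, T)     # market excess returns
eps = rng.normal(0, 0.015, T)    # idiosyncratic returns
stock = beta * mkt + eps

total = np.var(stock)
systematic = beta**2 * np.var(mkt)
idiosyncratic = np.var(eps)

print(total, systematic + idiosyncratic)  # ≈ equal, up to sampling noise
print(systematic / total)                 # the regression R², roughly 0.35 here
```

With these illustrative volatilities, the market explains about a third of the variance, right in the 20–40% range quoted above for individual stocks.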

3. Running the CAPM Regression in Python

Let's run the CAPM regression for Apple (AAPL) against the market (S&P 500), with full regression diagnostics.

Python

import numpy as np
import pandas as pd
import yfinance as yf
import statsmodels.api as sm
import matplotlib.pyplot as plt

# ── Download data ──────────────────────────────────────────
# Stock: Apple; Market proxy: S&P 500 ETF (SPY)
# Risk-free rate: approximated below as a constant 2% annual
tickers = {"AAPL": "Apple", "SPY": "S&P 500 (SPY)"}
data = yf.download(list(tickers.keys()), start="2019-01-01",
                   end="2024-01-01", auto_adjust=False)["Adj Close"]

# Compute daily returns
returns = data.pct_change().dropna()

# Risk-free rate: approximate as constant 2% annual
rf_daily = 0.02 / 252

# Excess returns
excess_stock = returns["AAPL"] - rf_daily
excess_market = returns["SPY"] - rf_daily

# ── Run the CAPM regression ───────────────────────────────
X = sm.add_constant(excess_market)  # adds intercept column
model = sm.OLS(excess_stock, X).fit()

print(model.summary())

# Extract key parameters
alpha = model.params["const"]
beta = model.params["SPY"]
r_squared = model.rsquared
alpha_tstat = model.tvalues["const"]
alpha_pval = model.pvalues["const"]
beta_tstat = model.tvalues["SPY"]
beta_se = model.bse["SPY"]

print(f"\n{'='*50}")
print(f"  CAPM Results for AAPL")
print(f"{'='*50}")
print(f"  Alpha (daily):       {alpha:.6f}")
print(f"  Alpha (annualized):  {alpha*252*100:.2f}%")
print(f"  Alpha t-stat:        {alpha_tstat:.4f}")
print(f"  Alpha p-value:       {alpha_pval:.4f}")
print(f"  Alpha significant?   {'Yes' if alpha_pval < 0.05 else 'No'}")
print(f"  {'─'*48}")
print(f"  Beta:                {beta:.4f}")
print(f"  Beta SE:             {beta_se:.4f}")
print(f"  Beta 95% CI:         [{beta-1.96*beta_se:.4f}, {beta+1.96*beta_se:.4f}]")
print(f"  Beta t-stat:         {beta_tstat:.4f}")
print(f"  {'─'*48}")
print(f"  R-squared:           {r_squared:.4f}")
print(f"  Residual std (daily):{model.resid.std():.6f}")
print(f"  Residual std (ann.): {model.resid.std()*np.sqrt(252)*100:.2f}%")
print(f"{'='*50}")

3.1 The Security Characteristic Line

The scatter plot of the stock's excess return versus the market's excess return, with the OLS regression line, is called the Security Characteristic Line (SCL).

Python

# Plot the Security Characteristic Line
fig, ax = plt.subplots(figsize=(10, 7))

# Scatter plot of excess returns
ax.scatter(excess_market*100, excess_stock*100, alpha=0.3,
           s=10, color='#1a365d', label='Daily returns')

# Regression line
x_range = np.linspace(excess_market.min(), excess_market.max(), 100)
y_pred = (alpha + beta * x_range) * 100
ax.plot(x_range*100, y_pred, color='#e53e3e', linewidth=2,
        label=f'SCL: y = {alpha*100:.4f} + {beta:.2f}x')

# Reference lines
ax.axhline(y=0, color='gray', linewidth=0.5, linestyle='-')
ax.axvline(x=0, color='gray', linewidth=0.5, linestyle='-')

# Annotations
ax.set_xlabel("Market Excess Return (%)")
ax.set_ylabel("AAPL Excess Return (%)")
ax.set_title(f"Security Characteristic Line: AAPL\n"
             f"Beta = {beta:.3f}, Alpha (ann.) = {alpha*252*100:.2f}%, "
             f"R² = {r_squared:.3f}")
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig("scl_aapl.png", dpi=150, bbox_inches='tight')
plt.show()
Common Pitfall

Beta is not constant. A stock's sensitivity to the market changes over time as the company's leverage, business mix, and market conditions change. A rolling 60-day beta for AAPL can range from 0.8 to 1.8 depending on the period. Always report the estimation window alongside the beta estimate.

4. Testing Alpha: Is the Manager Skilled or Lucky?

The central question of active management is whether a fund's alpha is statistically significantly different from zero. This is a t-test on the regression intercept.

H0: α = 0   (no skill)       H1: α ≠ 0   (skill exists)

t = α̂ / SE(α̂)

4.1 The Power Problem

Even if a manager truly generates 2% annual alpha, detecting it is extremely difficult. With daily data, the daily alpha is 2%/252 = 0.008%, while daily idiosyncratic volatility might be 1.5%. The signal-to-noise ratio is abysmal.

Python

from scipy.stats import t as t_dist

def alpha_power_analysis(true_alpha_annual, idio_vol_annual,
                         n_years_list=[1, 3, 5, 10, 20]):
    """
    How many years of data do we need to detect alpha?
    """
    true_alpha_daily = true_alpha_annual / 252
    idio_vol_daily = idio_vol_annual / np.sqrt(252)

    print(f"True alpha: {true_alpha_annual*100:.1f}% annual")
    print(f"Idiosyncratic volatility: {idio_vol_annual*100:.1f}% annual")
    print(f"\n{'Years':>6s} {'T':>6s} {'t-stat':>8s} {'Power':>8s}")
    print(f"{'-'*30}")

    for n_years in n_years_list:
        T = int(n_years * 252)
        # Expected t-statistic under the alternative
        se_alpha = idio_vol_daily / np.sqrt(T)
        expected_t = true_alpha_daily / se_alpha

        # Power: probability of rejecting H0 at the 5% level
        # (shifted central-t approximation to the noncentral t)
        critical_value = t_dist.ppf(0.975, T - 2)
        power = 1 - t_dist.cdf(critical_value - expected_t, T - 2)

        print(f"{n_years:>6d} {T:>6d} {expected_t:>8.2f} {power:>8.1%}")

# Scenario 1: Strong alpha (3% per year), typical stock vol
print("Scenario 1: Strong alpha, individual stock")
alpha_power_analysis(0.03, 0.25)

print("\n")

# Scenario 2: Moderate alpha (1% per year), diversified fund
print("Scenario 2: Moderate alpha, diversified fund")
alpha_power_analysis(0.01, 0.08)
Key Insight

To detect a 2% annual alpha with 80% power at the 5% significance level, you need roughly 15–20 years of daily data for a diversified fund, and even more for individual stocks. This means that even genuinely skilled managers will look indistinguishable from luck for many years. The “is this alpha real?” question is fundamentally a low-power hypothesis test.

4.2 Multiple Testing and Survivorship Bias

When evaluating thousands of fund managers, the multiple comparisons problem is severe. If 1,000 managers have zero alpha and you test each at α = 5%, you expect 50 false positives. These “star managers” will be profiled in magazines, given more capital, and then revert to the mean.

Stats Bridge

This is the multiple testing problem (Bonferroni, Benjamini-Hochberg FDR). The finance industry should apply false discovery rate control when evaluating fund performance. Harvey, Liu, and Zhu (2016) argued that the t-statistic threshold for a “new factor” should be 3.0, not 2.0, to account for the hundreds of factors that have been tested. This is the FDR correction applied to asset pricing.
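A quick simulation of 1,000 zero-alpha managers shows the scale of the problem (all returns are synthetic; every manager has no skill by construction):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_managers, T = 1_000, 252 * 5   # 5 years of daily data per manager

false_positives = 0
best_t = 0.0
for _ in range(n_managers):
    resid = rng.normal(0.0, 0.01, T)   # true alpha = 0 by construction
    t_stat = resid.mean() / (resid.std(ddof=1) / np.sqrt(T))
    if stats.t.sf(abs(t_stat), T - 1) * 2 < 0.05:   # two-sided test at 5%
        false_positives += 1
    best_t = max(best_t, t_stat)

print(false_positives)  # roughly 50 "skilled" managers by chance alone
print(best_t)           # the luckiest zero-skill manager looks significant
```

The luckiest of the thousand typically posts a t-statistic above 3, which is exactly why Harvey, Liu, and Zhu argue for raising the evidentiary bar.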

5. Fama-French Three-Factor Model: Multiple Regression

The CAPM says the only risk that matters is market risk. But empirically, two other factors have significant explanatory power: size (small stocks outperform large) and value (cheap stocks outperform expensive). The Fama-French three-factor model (1993) extends the CAPM to a multiple regression:

ri,t − rf,t = αi + βi,MKT (rm,t − rf,t) + βi,SMB SMBt + βi,HML HMLt + εi,t
Finance Term

SMB (Small Minus Big): The return difference between portfolios of small and large stocks. Captures the size premium.
HML (High Minus Low): The return difference between portfolios of high book-to-market (value) and low book-to-market (growth) stocks. Captures the value premium.

Stats Bridge

This is multiple linear regression with three predictors. The interpretation is identical to any multiple regression: each β measures the partial effect of that factor, controlling for the others. The model's R² is typically 5–15 percentage points higher than the single-factor CAPM's.

5.1 Extended Factor Models

The factor zoo has expanded considerably since Fama-French:

Model | Factors | Regression Analogue
CAPM (1964) | Market | Simple regression (1 predictor)
Fama-French 3 (1993) | Market + SMB + HML | Multiple regression (3 predictors)
Carhart 4 (1997) | + Momentum (UMD) | Multiple regression (4 predictors)
Fama-French 5 (2015) | + Profitability (RMW) + Investment (CMA) | Multiple regression (5 predictors)
q-factor (2015) | Market + Size + Investment + Profitability | Multiple regression (4 predictors)
Python

import pandas as pd
import statsmodels.api as sm

# ── Download Fama-French factors from Ken French's website ─
# Using pandas_datareader (or download CSV manually)
# For demonstration, we'll simulate or use the famafrench package

try:
    import pandas_datareader.data as web
    ff_factors = web.DataReader('F-F_Research_Data_Factors_daily',
                                'famafrench',
                                start='2019-01-01',
                                end='2024-01-01')[0]
    ff_factors = ff_factors / 100  # Convert from percent to decimal
except Exception:
    # Fall back to synthetic factors if pandas_datareader is missing
    # or the download fails (e.g., no network access)
    print("Note: using synthetic factor data for illustration")
    dates = returns.index
    np.random.seed(42)
    ff_factors = pd.DataFrame({
        'Mkt-RF': excess_market,
        'SMB': np.random.normal(0.0002, 0.005, len(dates)),
        'HML': np.random.normal(0.0001, 0.005, len(dates)),
        'RF': rf_daily * np.ones(len(dates))
    }, index=dates)

# ── Merge stock returns with factors ──────────────────────
merged = pd.concat([
    excess_stock.rename('AAPL_excess'),
    ff_factors[['Mkt-RF', 'SMB', 'HML']]
], axis=1).dropna()

# ── Run Fama-French 3-factor regression ───────────────────
Y = merged['AAPL_excess']
X_capm = sm.add_constant(merged[['Mkt-RF']])
X_ff3 = sm.add_constant(merged[['Mkt-RF', 'SMB', 'HML']])

model_capm = sm.OLS(Y, X_capm).fit()
model_ff3 = sm.OLS(Y, X_ff3).fit()

# ── Compare models ────────────────────────────────────────
print("="*60)
print("  Model Comparison: CAPM vs Fama-French 3-Factor")
print("="*60)
print(f"\n{'Metric':<30s} {'CAPM':>12s} {'FF3':>12s}")
print(f"{'-'*54}")
print(f"{'Alpha (daily)':<30s} {model_capm.params['const']:>12.6f} "
      f"{model_ff3.params['const']:>12.6f}")
print(f"{'Alpha (ann. %)':<30s} "
      f"{model_capm.params['const']*252*100:>12.2f} "
      f"{model_ff3.params['const']*252*100:>12.2f}")
print(f"{'Alpha t-stat':<30s} {model_capm.tvalues['const']:>12.4f} "
      f"{model_ff3.tvalues['const']:>12.4f}")
print(f"{'Alpha p-value':<30s} {model_capm.pvalues['const']:>12.4f} "
      f"{model_ff3.pvalues['const']:>12.4f}")
print(f"{'Beta (Market)':<30s} {model_capm.params['Mkt-RF']:>12.4f} "
      f"{model_ff3.params['Mkt-RF']:>12.4f}")
print(f"{'Beta (SMB)':<30s} {'—':>12s} "
      f"{model_ff3.params['SMB']:>12.4f}")
print(f"{'Beta (HML)':<30s} {'—':>12s} "
      f"{model_ff3.params['HML']:>12.4f}")
print(f"{'R-squared':<30s} {model_capm.rsquared:>12.4f} "
      f"{model_ff3.rsquared:>12.4f}")
print(f"{'Adj. R-squared':<30s} {model_capm.rsquared_adj:>12.4f} "
      f"{model_ff3.rsquared_adj:>12.4f}")
print(f"{'AIC':<30s} {model_capm.aic:>12.1f} "
      f"{model_ff3.aic:>12.1f}")
print(f"{'BIC':<30s} {model_capm.bic:>12.1f} "
      f"{model_ff3.bic:>12.1f}")

6. Factor Models as Dimension Reduction (PCA Connection)

Factor models decompose the N-dimensional space of asset returns into a low-dimensional factor structure plus idiosyncratic noise. This is conceptually identical to Principal Component Analysis (PCA).

r = Bf + ε

where r is the N×1 return vector, B is the N×K factor loading matrix, f is the K×1 factor vector, and ε is the idiosyncratic noise. The covariance matrix then factors as:

Σ = B Σf B' + D

where Σf is the K×K factor covariance and D is a diagonal matrix of idiosyncratic variances. Instead of estimating N(N+1)/2 parameters, you estimate NK + K(K+1)/2 + N parameters — a massive reduction.
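The bookkeeping is worth checking once. For, say, N = 500 stocks and K = 5 factors:

```python
def cov_params(N):
    """Free parameters in an unrestricted N x N covariance matrix."""
    return N * (N + 1) // 2

def factor_params(N, K):
    """Loadings (N*K) + factor covariance (K*(K+1)/2) + idiosyncratic variances (N)."""
    return N * K + K * (K + 1) // 2 + N

N, K = 500, 5
print(cov_params(N))        # 125250 free parameters, unrestricted
print(factor_params(N, K))  # 3015 parameters under the factor structure
```

The factor structure cuts the estimation problem by a factor of roughly forty, which is why factor models are the workhorse for large-universe risk estimation.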

Stats Bridge

The statistical factor model (PCA on the return covariance matrix) often recovers factors that look like the Fama-French factors. The first principal component is almost always a “market” factor (all loadings positive, roughly equal). The second often looks like a “size” factor (small vs. large loadings). The third resembles a “value” factor. The economic factor models and statistical factor models converge.

Python

from sklearn.decomposition import PCA

# Download a broader set of stocks
broad_tickers = ["AAPL", "MSFT", "GOOGL", "AMZN", "META",
                 "JPM", "BAC", "WFC", "GS", "C",
                 "JNJ", "PFE", "UNH", "MRK", "ABBV",
                 "XOM", "CVX", "COP", "SLB", "BKR",
                 "PG", "KO", "PEP", "WMT", "COST"]

broad_data = yf.download(broad_tickers, start="2020-01-01",
                         end="2024-01-01", auto_adjust=False)["Adj Close"]
broad_returns = broad_data.pct_change().dropna()

# Run PCA
pca = PCA(n_components=10)
pca.fit(broad_returns.values)

# Variance explained
var_explained = pca.explained_variance_ratio_
cumvar = np.cumsum(var_explained)

print("Principal Components - Variance Explained:")
print(f"{'PC':>4s} {'Var Explained':>14s} {'Cumulative':>12s}")
print(f"{'-'*32}")
for i in range(10):
    print(f"{'PC'+str(i+1):>4s} {var_explained[i]:>14.4f} {cumvar[i]:>12.4f}")

# Plot variance explained
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

ax1.bar(range(1, 11), var_explained * 100, color='#1a365d')
ax1.set_xlabel("Principal Component")
ax1.set_ylabel("Variance Explained (%)")
ax1.set_title("Scree Plot")

ax2.plot(range(1, 11), cumvar * 100, 'o-', color='#1a365d')
ax2.axhline(y=80, color='#e53e3e', linestyle='--',
            label='80% threshold')
ax2.set_xlabel("Number of Components")
ax2.set_ylabel("Cumulative Variance Explained (%)")
ax2.set_title("Cumulative Variance Explained")
ax2.legend()

plt.tight_layout()
plt.savefig("pca_factors.png", dpi=150, bbox_inches='tight')
plt.show()

# Examine the first 3 factor loadings
loadings = pd.DataFrame(
    pca.components_[:3].T,
    columns=['PC1 (Market?)', 'PC2 (Sector?)', 'PC3 (Value?)'],
    index=broad_returns.columns  # use the actual column order, not the
)                                # ticker list (yf.download sorts columns)
print("\nFactor Loadings (first 3 PCs):")
print(loadings.round(4).to_string())
Key Insight

Typically, the first 3–5 principal components explain 60–80% of the variance in a panel of stock returns. This means that the effective dimensionality of stock returns is much lower than N. Factor models exploit this low-rank structure, just as PCA compresses high-dimensional data into a few latent dimensions.

7. Regression Diagnostics for Factor Models

Since the CAPM and factor models are regressions, all standard regression diagnostics apply. Financial returns, however, have special properties that make some diagnostics particularly important.

7.1 Heteroscedasticity

Returns exhibit volatility clustering (ARCH/GARCH effects), which means the residual variance is not constant over time. This doesn't bias OLS estimates, but it invalidates the usual standard errors.

Python

# ── Regression diagnostics for the CAPM model ─────────────
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.diagnostic import acorr_ljungbox
from scipy.stats import jarque_bera

residuals = model_capm.resid

# 1. Heteroscedasticity test (Breusch-Pagan)
bp_stat, bp_pval, _, _ = het_breuschpagan(residuals, X_capm)
print(f"Breusch-Pagan test: stat={bp_stat:.4f}, p={bp_pval:.4f}")
if bp_pval < 0.05:
    print("  => Heteroscedasticity detected! Use HAC standard errors.")

# 2. Autocorrelation of residuals (Ljung-Box)
lb_result = acorr_ljungbox(residuals, lags=[5, 10, 20],
                           return_df=True)
print(f"\nLjung-Box test for residual autocorrelation:")
print(lb_result)

# 3. Normality of residuals (Jarque-Bera)
jb_stat, jb_pval = jarque_bera(residuals)
print(f"\nJarque-Bera normality test: stat={jb_stat:.2f}, p={jb_pval:.6f}")

# 4. Re-run with Heteroscedasticity and Autocorrelation
#    Consistent (HAC) standard errors (Newey-West)
model_hac = sm.OLS(Y, X_capm).fit(cov_type='HAC',
                                    cov_kwds={'maxlags': 10})
print("\n\nComparison: OLS vs HAC (Newey-West) standard errors:")
print(f"{'Parameter':<12s} {'OLS SE':>10s} {'HAC SE':>10s} {'Ratio':>8s}")
print(f"{'-'*42}")
for param in ['const', 'Mkt-RF']:
    ols_se = model_capm.bse[param]
    hac_se = model_hac.bse[param]
    print(f"{param:<12s} {ols_se:>10.6f} {hac_se:>10.6f} "
          f"{hac_se/ols_se:>8.2f}")
Common Pitfall

Always use HAC (Newey-West) standard errors for financial regressions. Volatility clustering means OLS standard errors are typically too small (by 10–30%), making alpha and beta look more significant than they really are. In finance, the correction matters for inference, even though the point estimates are unbiased.

7.2 Rolling Betas

Python

# Rolling beta estimation
window = 126  # ~6 months

rolling_beta = excess_stock.rolling(window).cov(excess_market) / \
               excess_market.rolling(window).var()

rolling_alpha = excess_stock.rolling(window).mean() - \
                rolling_beta * excess_market.rolling(window).mean()

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8), sharex=True)

ax1.plot(rolling_beta, color='#1a365d', linewidth=1)
ax1.axhline(y=beta, color='#e53e3e', linestyle='--',
            label=f'Full-sample beta = {beta:.3f}')
ax1.axhline(y=1, color='gray', linestyle=':', alpha=0.5)
ax1.set_ylabel("Beta")
ax1.set_title(f"AAPL Rolling {window}-Day Beta")
ax1.legend()
ax1.grid(True, alpha=0.3)

ax2.plot(rolling_alpha * 252 * 100, color='#38a169', linewidth=1)
ax2.axhline(y=0, color='gray', linestyle='-', alpha=0.5)
ax2.set_ylabel("Annualized Alpha (%)")
ax2.set_title(f"AAPL Rolling {window}-Day Alpha")
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig("rolling_beta_alpha.png", dpi=150, bbox_inches='tight')
plt.show()

8. Practical Applications of Factor Models

8.1 Performance Attribution

Factor models decompose a portfolio's return into factor exposures and alpha:

Rportfolio = α + βMKT ⋅ RMKT + βSMB ⋅ RSMB + βHML ⋅ RHML + ε

This answers the question: “Did the fund manager generate returns through skill (α), or just by loading on known risk factors?”

Component | Source | Should You Pay For It?
βMKT contribution | Market exposure (easy to replicate with index fund) | No — costs 0.03% via Vanguard
βSMB contribution | Small-cap tilt (replicable with small-cap ETF) | No — costs 0.05% via small-cap ETF
βHML contribution | Value tilt (replicable with value ETF) | No — costs 0.06% via value ETF
α | Genuine skill (if statistically significant) | Maybe — if α > fee
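The attribution itself is simple arithmetic once the betas are estimated. A sketch with hypothetical annual numbers (none of these figures are estimates from real data):

```python
# Hypothetical annual factor returns and estimated fund exposures
factor_returns = {"MKT": 0.08, "SMB": 0.02, "HML": 0.03}
betas = {"MKT": 1.0, "SMB": 0.4, "HML": 0.5}
fund_return = 0.11   # the fund's realized excess return

# Each factor's contribution is beta times the factor's return
factor_contrib = {f: betas[f] * factor_returns[f] for f in betas}
alpha = fund_return - sum(factor_contrib.values())

for f, c in factor_contrib.items():
    print(f"{f}: {c:.2%}")
print(f"alpha: {alpha:.2%}")
```

Here the three factor tilts account for 10.3 of the fund's 11 percentage points, leaving only 0.7% of alpha, which is what the fee should be judged against.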

8.2 Risk Decomposition

For a portfolio with factor exposures β, the variance decomposes as:

Var(Rp) = β' Σf β + σε²

= (Systematic Risk) + (Idiosyncratic Risk)

This tells you exactly where the portfolio's risk comes from: how much is from market exposure, how much from size tilts, how much from value tilts, and how much is stock-specific.
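In code, the decomposition is one quadratic form. The exposures and covariance matrix below are hypothetical annualized values chosen only to illustrate the calculation:

```python
import numpy as np

# Hypothetical portfolio exposures to MKT, SMB, HML
beta = np.array([1.0, 0.3, 0.2])

# Hypothetical annualized factor covariance matrix (symmetric, PSD)
Sigma_f = np.array([[0.0225, 0.0020, 0.0015],
                    [0.0020, 0.0100, 0.0010],
                    [0.0015, 0.0010, 0.0090]])
sigma_eps2 = 0.0016   # idiosyncratic variance (4% annual vol)

systematic = beta @ Sigma_f @ beta   # the quadratic form beta' Sigma_f beta
total = systematic + sigma_eps2

print(f"Systematic share:    {systematic / total:.1%}")
print(f"Idiosyncratic share: {sigma_eps2 / total:.1%}")
```

For a diversified portfolio like this one, systematic risk dominates, consistent with the high R² for portfolios noted earlier.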

Key Insight

Factor models reveal that many “hedge funds” and “active managers” are simply selling exposure to well-known factors at high fees. A fund that loads heavily on the size and value factors is not demonstrating skill — it's selling something you can buy for 5 basis points in an ETF. True alpha is the residual after accounting for all known factor exposures.

9. Chapter Summary

The CAPM and factor models are regressions — the most fundamental tool in a statistician's toolkit, applied to the most important problem in finance:

Stats Bridge

The entire field of empirical asset pricing is, at its core, applied regression analysis with careful attention to standard errors, model selection, and the multiple testing problem. Your training in regression, hypothesis testing, and model diagnostics is directly transferable — finance just uses different variable names.