Learn Without Walls

Module 02: Returns, Not Prices

The most fundamental transformation in quantitative finance

Part 1 of 5 · Module 02 of 22

1. Why Raw Prices Are Useless for Statistical Analysis

If you were handed a time series of daily temperatures and asked to model it, your first instinct would be to check stationarity. You know that most standard statistical methods — OLS regression, correlation, spectral analysis — assume or require stationarity. Financial prices violate this assumption catastrophically.

A raw price, say Apple at $175 today, tells you almost nothing in isolation. Is that high or low? Is the stock rising or falling? Is it more volatile than Google? You cannot answer any of these questions from the raw price alone. Worse, any regression of one price on another will almost certainly produce a high R-squared and a statistically significant coefficient — even when the two series are completely unrelated.

Stats Bridge
This is the spurious regression problem that Granger and Newbold (1974) demonstrated: regressing one random walk on another produces absurdly high t-statistics and R-squared values. The standard errors are wrong because they assume stationary residuals. If you have taken a time series course, you know this as the reason we difference nonstationary series before modeling.
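To make the Granger and Newbold result concrete, here is a minimal self-contained simulation (synthetic data, no market prices involved): two random walks built from independent noise, compared in levels and in differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Two completely independent random walks (cumulative sums of white noise)
x = np.cumsum(rng.standard_normal(n))
y = np.cumsum(rng.standard_normal(n))

# R-squared of a naive regression of y on x, in levels
r2_levels = np.corrcoef(x, y)[0, 1] ** 2

# R-squared after differencing (i.e., using the stationary increments)
r2_diffs = np.corrcoef(np.diff(x), np.diff(y))[0, 1] ** 2

print(f"R-squared in levels:      {r2_levels:.3f}")
print(f"R-squared in differences: {r2_diffs:.3f}")
```

Across seeds, the levels R-squared varies wildly and is frequently large, while the differenced R-squared stays pinned near zero; that contrast is the spurious regression problem in miniature.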

1.1 Nonstationarity of Prices

Stock prices exhibit several forms of nonstationarity: a stochastic trend (the unit root we test for in Section 5), a positive drift from the equity risk premium, and time-varying volatility (the subject of Module 03).

Key Insight
The fundamental transformation in quantitative finance is: never analyze prices; always analyze returns. This is the financial equivalent of differencing a time series to achieve stationarity. It is so universal that when a finance person says “data,” they almost always mean returns, not prices.

1.2 A Visual Demonstration

Python
import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Download two unrelated assets (auto_adjust=False preserves the "Adj Close"
# column in recent yfinance versions; squeeze() collapses a 1-column frame to a Series)
aapl = yf.download("AAPL", start="2015-01-01", end="2025-01-01", auto_adjust=False)["Adj Close"].squeeze()
gold = yf.download("GC=F", start="2015-01-01", end="2025-01-01", auto_adjust=False)["Adj Close"].squeeze()

# Align on common dates
combined = pd.DataFrame({"AAPL": aapl, "Gold": gold}).dropna()

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: prices (misleading correlation)
axes[0].scatter(combined["AAPL"], combined["Gold"], alpha=0.3, s=5)
axes[0].set_xlabel("AAPL Price ($)")
axes[0].set_ylabel("Gold Price ($)")
r_prices = combined["AAPL"].corr(combined["Gold"])
axes[0].set_title(f"Prices: r = {r_prices:.3f} (SPURIOUS)")

# Right: returns (real correlation)
ret = combined.pct_change().dropna()
axes[1].scatter(ret["AAPL"], ret["Gold"], alpha=0.3, s=5)
axes[1].set_xlabel("AAPL Return")
axes[1].set_ylabel("Gold Return")
r_returns = ret["AAPL"].corr(ret["Gold"])
axes[1].set_title(f"Returns: r = {r_returns:.3f} (REAL)")

plt.tight_layout()
plt.show()
# The price correlation is typically very high; the return correlation is near zero

2. Simple (Arithmetic) Returns

2.1 Definition

The simple return (also called arithmetic return) over one period is the percentage change in price:

Rt = (Pt − Pt−1) / Pt−1 = Pt / Pt−1 − 1
Finance Term
Simple Return: The fractional change in the value of an investment over one period. A return of 0.02 means a 2% gain; a return of −0.03 means a 3% loss. This is the most intuitive measure: if you invested $100 and earned a 5% simple return, you now have $105.
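As a quick sanity check on the definition, using the $100 to $105 example from the callout:

```python
p_prev, p_now = 100.0, 105.0

# Simple return: fractional change in the price over one period
R = p_now / p_prev - 1
print(round(R, 6))  # 0.05, i.e. a 5% gain
```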

2.2 Properties of Simple Returns

Property | Simple Returns | Statistical Note
Lower bound | −1 (100% loss) | Bounded below; cannot lose more than 100% (for stocks)
Upper bound | Unbounded above | Asymmetric distribution by construction
Multi-period aggregation | Multiplicative: (1+R1)(1+R2)…(1+RT) − 1 | Not additive — you cannot simply sum daily returns
Cross-sectional aggregation | Additive (portfolio return = weighted sum) | Portfolio return is a linear combination
Distribution | Slightly right-skewed | Due to the −1 lower bound
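Cross-sectional additivity from the table deserves one tiny illustration; the 60/40 weights and the two one-period returns below are hypothetical.

```python
import numpy as np

weights = np.array([0.6, 0.4])           # hypothetical portfolio weights (sum to 1)
asset_returns = np.array([0.02, -0.01])  # hypothetical one-period simple returns

# The portfolio's simple return is the weighted sum of the asset returns
portfolio_return = float(weights @ asset_returns)
print(round(portfolio_return, 6))  # 0.6(0.02) + 0.4(-0.01) = 0.008
```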

2.3 Computing Simple Returns in Python

Python
import pandas as pd
import numpy as np
import yfinance as yf

# auto_adjust=False preserves the "Adj Close" column in recent yfinance versions
aapl = yf.download("AAPL", start="2020-01-01", end="2025-01-01", auto_adjust=False)

# Method 1: Using pct_change()
simple_returns = aapl["Adj Close"].pct_change()

# Method 2: Manual calculation (equivalent)
prices = aapl["Adj Close"]
simple_returns_manual = (prices - prices.shift(1)) / prices.shift(1)

# Verify they are identical
print("Max difference:", (simple_returns - simple_returns_manual).abs().max())

# Drop the first NaN value
simple_returns = simple_returns.dropna()

# Summary statistics
print(f"Mean daily return:   {simple_returns.mean():.6f}")
print(f"Std daily return:    {simple_returns.std():.6f}")
print(f"Min daily return:    {simple_returns.min():.6f}")
print(f"Max daily return:    {simple_returns.max():.6f}")
print(f"Annualized mean:     {simple_returns.mean() * 252:.4f}")
print(f"Annualized vol:      {simple_returns.std() * np.sqrt(252):.4f}")
Stats Bridge
The annualization formulas above assume returns are i.i.d. — a strong assumption we will relax later. The mean scales by T (number of trading days) and the standard deviation scales by √T. This is just the standard result for the mean and variance of a sum of i.i.d. random variables: if Xi has variance σ2, then the sum of 252 of them has variance 252σ2, so the standard deviation is σ√252.
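The √T rule is easy to verify by simulation under the i.i.d. assumption: summing 252 independent daily draws with σ = 1% per day should give an annual standard deviation close to 0.01 × √252 ≈ 0.159. A quick Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_daily = 0.01

# 20,000 simulated "years", each the sum of 252 i.i.d. daily returns
daily = rng.normal(0.0, sigma_daily, size=(20_000, 252))
annual = daily.sum(axis=1)

print(f"Empirical annual std:        {annual.std():.4f}")
print(f"Theoretical sigma*sqrt(252): {sigma_daily * np.sqrt(252):.4f}")
```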

2.4 Multi-Period Simple Returns

To compute the cumulative return over multiple periods, you must compound:

Rcumulative = ∏t=1…T (1 + Rt) − 1

Python
# Cumulative return over the entire period
cumulative = ((1 + simple_returns).cumprod() - 1)
print(f"Total cumulative return: {cumulative.iloc[-1]:.4f}")
print(f"Meaning: ${100 * (1 + cumulative.iloc[-1]):.2f} from $100 invested")

# WRONG way: simply summing returns
wrong_total = simple_returns.sum()
print(f"\nIncorrect (summed) return: {wrong_total:.4f}")
print(f"Correct (compounded) return: {cumulative.iloc[-1]:.4f}")
print(f"Difference: {wrong_total - cumulative.iloc[-1]:.4f}")
Common Pitfall
Never sum simple returns to get a multi-period return. This is one of the most common errors. Simple returns compound multiplicatively, not additively. The error grows with the number of periods and the magnitude of individual returns. Over a year of daily returns, summing vs. compounding can differ by several percentage points.
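A two-period example makes the pitfall concrete: a +10% day followed by a −10% day sums to zero but compounds to a 1% loss.

```python
returns = [0.10, -0.10]

# Wrong: summing simple returns
summed = sum(returns)              # 0.0, which suggests you broke even

# Right: compounding
compounded = 1.0
for r in returns:
    compounded *= 1 + r
compounded -= 1                    # 1.10 * 0.90 - 1 = -0.01, a 1% loss

print(summed, round(compounded, 6))
```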

3. Log (Continuously Compounded) Returns

3.1 Definition

The log return (or continuously compounded return) is the natural logarithm of the price ratio:

rt = ln(Pt / Pt−1) = ln(Pt) − ln(Pt−1)
Finance Term
Log Return (Continuously Compounded Return): The rate of return that, if applied continuously (infinitely many compounding periods), would produce the observed price change. Denoted with lowercase r to distinguish from the simple return R.

3.2 Why Log Returns Are Preferred for Statistical Analysis

Log returns have several properties that make them far more convenient for statistical work:

Property | Log Returns | Why It Matters
Time additivity | r1:T = r1 + r2 + … + rT | Multi-period return is a simple sum — CLT applies directly
Symmetry | A +5% move and a −5% move are symmetric around zero | Distribution is more symmetric; better approximation to normal
Domain | (−∞, +∞) | No bounded support issues; compatible with the normal distribution
Differencing | rt = Δ ln(Pt) | Log returns are literally first differences of log prices
Approximation | rt ≈ Rt for small returns | For daily returns (<2%), the difference is negligible
Stats Bridge
The time-additivity of log returns is the key property. If daily log returns are i.i.d. with mean μ and variance σ2, then the T-period log return is the sum of T i.i.d. random variables, which has mean Tμ and variance Tσ2. By the Central Limit Theorem, the T-period log return is approximately normal for large T — even if individual daily returns are not. This is why the geometric Brownian motion model uses log returns.
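Time additivity can be verified directly on a short hypothetical price path: the one-period log returns sum exactly to the log return over the whole span.

```python
import numpy as np

p0, p1, p2 = 100.0, 104.0, 99.0  # hypothetical three-day price path

r1 = np.log(p1 / p0)  # first-period log return
r2 = np.log(p2 / p1)  # second-period log return

# The sum of the one-period log returns equals the two-period log return
total = np.log(p2 / p0)
print(np.isclose(r1 + r2, total))  # True
```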

3.3 Computing Log Returns in Python

Python
import numpy as np

# Method 1: Using numpy log
prices = aapl["Adj Close"]
log_returns = np.log(prices / prices.shift(1)).dropna()

# Method 2: Difference of log prices (equivalent)
log_returns_v2 = np.log(prices).diff().dropna()

# Verify equivalence
print("Max difference:", (log_returns - log_returns_v2).abs().max())

# Compare log returns vs simple returns
print(f"\nSimple return mean:  {simple_returns.mean():.6f}")
print(f"Log return mean:     {log_returns.mean():.6f}")
print(f"Difference:          {simple_returns.mean() - log_returns.mean():.6f}")
# Log return mean is always slightly lower (Jensen's inequality)

# Multi-period: just sum log returns!
total_log_return = log_returns.sum()
print(f"\nTotal log return (summed): {total_log_return:.4f}")
print(f"Equivalent simple return:  {np.exp(total_log_return) - 1:.4f}")

3.4 The Relationship Between Simple and Log Returns

rt = ln(1 + Rt)     and     Rt = ert − 1

For small returns (typical of daily data), the Taylor expansion gives:

rt = ln(1 + Rt) ≈ Rt − Rt²/2 + …

The Rt²/2 term explains why the mean log return is always slightly below the mean simple return: taking expectations, E[r] ≈ E[R] − Var(R)/2 when the mean return is small, since E[R²] ≈ Var(R). This half-variance correction is negligible day to day but becomes important over long horizons.
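The quality of the second-order approximation can be checked numerically; the 1% move below is just an illustrative daily-sized return.

```python
import numpy as np

R = 0.01                   # an illustrative 1% simple return
r_exact = np.log(1 + R)    # exact log return
r_approx = R - R**2 / 2    # second-order Taylor approximation

print(f"exact:  {r_exact:.8f}")
print(f"approx: {r_approx:.8f}")
# For a 1% move, the two agree to six decimal places
```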

Key Insight
The difference between simple and log returns matters most for large individual returns, long horizons, and highly volatile assets. In practice, for daily equity data, the difference is usually less than 0.01%.

3.5 When the Approximation Breaks Down

Python
# Show when simple and log returns diverge
import numpy as np
import pandas as pd

simple_ret_values = [0.001, 0.01, 0.05, 0.10, 0.20, 0.50, 1.00, -0.01, -0.05, -0.10, -0.50]
comparison = pd.DataFrame({
    "Simple Return (R)": simple_ret_values,
    "Log Return (r)": [np.log(1 + r) for r in simple_ret_values],
    "Abs Difference": [abs(r - np.log(1 + r)) for r in simple_ret_values],
    "Relative Difference (%)": [abs(r - np.log(1 + r)) / abs(r) * 100
                                  for r in simple_ret_values]
})
print(comparison.to_string(index=False))
# For R = 0.01 (1%), the absolute gap is about 0.005 percentage points (negligible)
# For R = 0.50 (50%), it is about 9.5 percentage points, a ~19% relative error (huge)

4. The Random Walk Hypothesis

4.1 Prices as a Random Walk

The simplest model for stock prices is the random walk:

Pt = Pt−1 + εt,     εt ~ WN(0, σ2)

In log-price form (the version we will use from here on):

ln(Pt) = ln(Pt−1) + εt

This says that the best forecast for tomorrow's (log) price is today's. Returns — the first differences of log prices — are white noise. The log price is therefore a unit root process, integrated of order one: I(1).

Stats Bridge
The random walk is an ARIMA(0,1,0) process with possible drift. If you difference it once, you get white noise (the returns). This is the simplest member of the unit root family. The Efficient Market Hypothesis (EMH) in its weak form is essentially the statement that returns are unpredictable — i.e., prices follow something close to a random walk.
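The ARIMA(0,1,0) claim can be sanity-checked on synthetic data: difference a simulated random walk and the lag-1 autocorrelation of the increments should be indistinguishable from zero.

```python
import numpy as np

rng = np.random.default_rng(7)

# A pure random walk: the cumulative sum of Gaussian white noise
walk = np.cumsum(rng.standard_normal(2000))

# Differencing once recovers the white-noise increments
incr = np.diff(walk)

# Lag-1 sample autocorrelation of the increments (expected near zero)
ac1 = np.corrcoef(incr[:-1], incr[1:])[0, 1]
print(f"Lag-1 autocorrelation of increments: {ac1:.4f}")
```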

4.2 Implications of the Random Walk

If log prices follow a random walk, three implications follow immediately: returns are serially uncorrelated, so past returns carry no information about future returns (the weak-form EMH); the variance of the T-period log return grows linearly in T, so forecast uncertainty fans out with the horizon; and the best point forecast at any horizon is today's price plus the accumulated drift.

4.3 Random Walk with Drift

In practice, stock prices tend to increase over time (the equity risk premium). The random walk with drift adds a constant:

ln(Pt) = μ + ln(Pt−1) + εt

where μ > 0 represents the average growth rate. Differencing gives log returns with a positive mean: rt = μ + εt.

Python
# Simulate a random walk with drift
np.random.seed(42)
n_days = 252 * 5  # 5 years of trading days
mu = 0.0003         # daily drift (~7.6% annualized: 0.0003 × 252)
sigma = 0.015       # daily volatility (~24% annualized)

# Generate log returns
log_ret_sim = np.random.normal(mu, sigma, n_days)

# Construct price path
log_prices = np.cumsum(np.concatenate([[np.log(100)], log_ret_sim]))
prices_sim = np.exp(log_prices)

# Plot simulated vs real AAPL
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
axes[0].plot(prices_sim, color='#1a365d')
axes[0].set_title('Simulated Random Walk with Drift')
axes[0].set_ylabel('Price ($)')
axes[1].plot(log_ret_sim, color='#e53e3e', alpha=0.6, linewidth=0.5)
axes[1].set_title('Simulated Log Returns (White Noise + Drift)')
axes[1].set_ylabel('Log Return')
axes[1].axhline(y=mu, color='black', linestyle='--', label=f'drift = {mu}')
axes[1].legend()
plt.tight_layout()
plt.show()

5. Testing for Nonstationarity: Unit Root Tests

5.1 The Augmented Dickey-Fuller (ADF) Test

The ADF test is the workhorse of unit root testing. The null hypothesis is that the series has a unit root (is nonstationary). Rejecting the null means you have evidence of stationarity.

H0: φ = 1 (unit root present; the series is I(1))
H1: φ < 1 (no unit root; the series is stationary, I(0))
Python
from statsmodels.tsa.stattools import adfuller, kpss
import pandas as pd

prices = aapl["Adj Close"]
log_returns = np.log(prices / prices.shift(1)).dropna()

# ADF test on PRICES (expect to NOT reject H0 — prices are nonstationary)
adf_prices = adfuller(prices, autolag='AIC')
print("=== ADF Test on Prices ===")
print(f"Test statistic: {adf_prices[0]:.4f}")
print(f"p-value:        {adf_prices[1]:.4f}")
print(f"Lags used:      {adf_prices[2]}")
print(f"Critical values: {adf_prices[4]}")
print(f"Conclusion: {'Stationary' if adf_prices[1] < 0.05 else 'Nonstationary'}")

print()

# ADF test on LOG RETURNS (expect to reject H0 — returns are stationary)
adf_returns = adfuller(log_returns, autolag='AIC')
print("=== ADF Test on Log Returns ===")
print(f"Test statistic: {adf_returns[0]:.4f}")
print(f"p-value:        {adf_returns[1]:.6f}")
print(f"Lags used:      {adf_returns[2]}")
print(f"Conclusion: {'Stationary' if adf_returns[1] < 0.05 else 'Nonstationary'}")

5.2 The KPSS Test

The KPSS test reverses the hypotheses: the null is stationarity. This makes it a useful complement to the ADF test — using both together provides stronger evidence.

H0: Series is stationary (or trend-stationary)
H1: Series has a unit root
Python
# KPSS test on PRICES (expect to reject H0 — prices are not stationary)
# Note: statsmodels interpolates KPSS p-values only within [0.01, 0.10]
kpss_prices = kpss(prices, regression='ct', nlags='auto')
print("=== KPSS Test on Prices ===")
print(f"Test statistic: {kpss_prices[0]:.4f}")
print(f"p-value:        {kpss_prices[1]:.4f}")
print(f"Conclusion: {'Stationary' if kpss_prices[1] > 0.05 else 'Nonstationary'}")

print()

# KPSS test on LOG RETURNS (expect to NOT reject H0 — returns are stationary)
kpss_returns = kpss(log_returns, regression='c', nlags='auto')
print("=== KPSS Test on Log Returns ===")
print(f"Test statistic: {kpss_returns[0]:.4f}")
print(f"p-value:        {kpss_returns[1]:.4f}")
print(f"Conclusion: {'Stationary' if kpss_returns[1] > 0.05 else 'Nonstationary'}")

5.3 The Confirmatory Strategy

Best practice is to use both tests together:

ADF Result | KPSS Result | Conclusion
Reject H0 (no unit root) | Fail to reject H0 (stationary) | Strong evidence of stationarity
Fail to reject H0 | Reject H0 | Strong evidence of a unit root
Reject H0 | Reject H0 | Ambiguous — may be trend-stationary
Fail to reject H0 | Fail to reject H0 | Ambiguous — low power, need more data
Key Insight
For typical stock price data, the ADF test will fail to reject (evidence of unit root) and the KPSS test will reject (evidence against stationarity) — both pointing to nonstationary prices. For returns, both tests will agree on stationarity. This confirmation gives you much stronger evidence than either test alone.
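The decision table can be wrapped in a small helper function. Note that `unit_root_verdict` is not part of statsmodels; it is a sketch that applies the confirmatory logic above to the two p-values at a chosen significance level.

```python
def unit_root_verdict(adf_pvalue: float, kpss_pvalue: float,
                      alpha: float = 0.05) -> str:
    """Combine ADF (H0: unit root) and KPSS (H0: stationary) p-values."""
    adf_rejects = adf_pvalue < alpha    # evidence of stationarity
    kpss_rejects = kpss_pvalue < alpha  # evidence against stationarity
    if adf_rejects and not kpss_rejects:
        return "stationary"
    if not adf_rejects and kpss_rejects:
        return "unit root"
    if adf_rejects and kpss_rejects:
        return "ambiguous (possibly trend-stationary)"
    return "inconclusive (low power)"

print(unit_root_verdict(0.001, 0.10))  # stationary (typical for returns)
print(unit_root_verdict(0.90, 0.01))   # unit root (typical for prices)
```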

6. The Distribution of Returns: Normality and Its Failures

6.1 Are Returns Normally Distributed?

Many financial models assume returns follow a normal distribution. This assumption is convenient — it leads to closed-form solutions for option pricing, portfolio optimization, and risk measurement. But is it true?

The short answer: no. Returns are approximately normal in the center of the distribution, but the tails are much heavier than the normal distribution predicts. We will explore this in great detail in Module 03. For now, let us see how to test the assumption.

6.2 Visual Tests: Histogram and QQ Plot

Python
import scipy.stats as stats

fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# 1. Histogram with normal overlay
axes[0].hist(log_returns, bins=100, density=True, alpha=0.7,
            color='#3182ce', edgecolor='white')
x = np.linspace(log_returns.min(), log_returns.max(), 200)
axes[0].plot(x, stats.norm.pdf(x, log_returns.mean(), log_returns.std()),
            'r-', linewidth=2, label='Normal fit')
axes[0].set_title('Histogram vs Normal')
axes[0].legend()

# 2. QQ plot
stats.probplot(log_returns, dist="norm", plot=axes[1])
axes[1].set_title('QQ Plot Against Normal')
axes[1].get_lines()[0].set_markerfacecolor('#3182ce')
axes[1].get_lines()[0].set_markersize(3)

# 3. Log-scale density comparison
axes[2].hist(log_returns, bins=200, density=True, alpha=0.7,
            color='#3182ce', edgecolor='white', log=True)
axes[2].plot(x, stats.norm.pdf(x, log_returns.mean(), log_returns.std()),
            'r-', linewidth=2)
axes[2].set_title('Log-Scale Density (shows tail behavior)')

plt.tight_layout()
plt.show()

6.3 Formal Normality Tests

Python
# Shapiro-Wilk test (SciPy notes the p-value may be inaccurate for n > 5000)
if len(log_returns) > 5000:
    sample = log_returns.sample(5000, random_state=42)
else:
    sample = log_returns

sw_stat, sw_pval = stats.shapiro(sample)
print(f"Shapiro-Wilk: W={sw_stat:.6f}, p={sw_pval:.2e}")

# Jarque-Bera test (based on skewness and kurtosis)
jb_stat, jb_pval = stats.jarque_bera(log_returns)
print(f"Jarque-Bera:  JB={jb_stat:.2f}, p={jb_pval:.2e}")

# D'Agostino-Pearson omnibus test
dp_stat, dp_pval = stats.normaltest(log_returns)
print(f"D'Agostino:   K2={dp_stat:.2f}, p={dp_pval:.2e}")

# Moment comparison
print(f"\nMoment comparison:")
print(f"  Skewness: {log_returns.skew():.4f}  (normal = 0)")
print(f"  Kurtosis: {log_returns.kurtosis():.4f}  (normal = 0, excess)")
print(f"  Note: Excess kurtosis > 0 indicates heavier tails than normal")
Common Pitfall
All formal normality tests will reject for financial return data with enough observations. This is not a statistical failure — it is a genuine feature of the data. Returns really are non-normal. The practical question is not “are returns normal?” (they are not) but “how badly does the normal approximation fail, and does it matter for my application?”
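The point is easy to demonstrate from the other direction with synthetic data: 5,000 draws from a Student-t with 5 degrees of freedom (a heavy-tailed stand-in for daily returns, chosen purely for illustration) get rejected by Jarque-Bera just as decisively as real return data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Heavy-tailed sample: Student-t with df=5 has excess kurtosis of 6
heavy = rng.standard_t(df=5, size=5000)

jb_stat, jb_pvalue = stats.jarque_bera(heavy)
print(f"JB = {jb_stat:.1f}, p = {jb_pvalue:.2e}")
```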

7. Practical Considerations and Common Operations

7.1 Annualizing Returns and Volatility

Annualized Mean Return = μdaily × 252

Annualized Volatility = σdaily × √252
Python
# Standard annualization
mean_daily = log_returns.mean()
std_daily = log_returns.std()

annualized_return = mean_daily * 252
annualized_vol = std_daily * np.sqrt(252)
sharpe_ratio = annualized_return / annualized_vol

print(f"Daily mean:          {mean_daily:.6f}")
print(f"Daily std:           {std_daily:.6f}")
print(f"Annualized return:   {annualized_return:.4f} ({annualized_return*100:.2f}%)")
print(f"Annualized vol:      {annualized_vol:.4f} ({annualized_vol*100:.2f}%)")
print(f"Sharpe ratio:        {sharpe_ratio:.4f}")

7.2 Rolling Statistics

Python
# Rolling 21-day (1 month) statistics
rolling_mean = log_returns.rolling(21).mean() * 252
rolling_vol = log_returns.rolling(21).std() * np.sqrt(252)
rolling_sharpe = rolling_mean / rolling_vol

fig, axes = plt.subplots(3, 1, figsize=(12, 10), sharex=True)

axes[0].plot(rolling_mean, color='#1a365d', linewidth=0.8)
axes[0].axhline(y=0, color='gray', linestyle='--')
axes[0].set_ylabel('Annualized Return')
axes[0].set_title('21-Day Rolling Statistics (AAPL)')

axes[1].plot(rolling_vol, color='#e53e3e', linewidth=0.8)
axes[1].set_ylabel('Annualized Volatility')

axes[2].plot(rolling_sharpe, color='#38a169', linewidth=0.8)
axes[2].axhline(y=0, color='gray', linestyle='--')
axes[2].set_ylabel('Sharpe Ratio')

plt.tight_layout()
plt.show()

7.3 Multi-Period Returns at Different Horizons

Python
# Compute returns at different horizons
horizons = {
    "Daily": 1,
    "Weekly": 5,
    "Monthly": 21,
    "Quarterly": 63,
    "Annual": 252
}

log_price = np.log(aapl["Adj Close"])
print(f"{'Horizon':<12} {'Mean':>8} {'Std':>8} {'Skew':>8} {'Kurt':>8} {'N':>6}")
print("-" * 55)

for name, h in horizons.items():
    ret_h = (log_price - log_price.shift(h)).dropna()
    print(f"{name:<12} {ret_h.mean():>8.5f} {ret_h.std():>8.5f} "
          f"{ret_h.skew():>8.3f} {ret_h.kurtosis():>8.3f} {len(ret_h):>6}")

# Notice: kurtosis tends to decrease at longer horizons (the CLT in action!)
# Also: skewness can change sign across horizons
Stats Bridge
The decreasing kurtosis at longer horizons is the Central Limit Theorem in action. Monthly returns are sums of ~21 daily returns. If daily returns were truly i.i.d., the sum would converge to normality. The fact that monthly returns are still non-normal (kurtosis > 0) tells you that the i.i.d. assumption is violated — there are serial dependencies in volatility (the topic of Module 03).
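The mechanism is easy to reproduce with simulated i.i.d. heavy-tailed daily returns (a Student-t with 10 degrees of freedom here, chosen purely for illustration): aggregating them into 21-day sums sharply reduces the excess kurtosis, as the CLT predicts.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# 4,000 simulated "months", each made of 21 i.i.d. heavy-tailed daily returns
daily = rng.standard_t(df=10, size=(4000, 21)) * 0.01
monthly = daily.sum(axis=1)

k_daily = stats.kurtosis(daily.ravel())  # excess kurtosis; population value is 1 for t(10)
k_monthly = stats.kurtosis(monthly)      # roughly 1/21 of that under i.i.d. aggregation

print(f"Daily excess kurtosis:   {k_daily:.3f}")
print(f"Monthly excess kurtosis: {k_monthly:.3f}")
```

Real monthly returns retain more kurtosis than this i.i.d. benchmark, which is the volatility-clustering signature the Stats Bridge describes.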

8. Chapter Summary

Concept | Key Formula / Rule | Statistical Analogue
Simple return | Rt = Pt/Pt−1 − 1 | Percentage change; additive across assets
Log return | rt = ln(Pt/Pt−1) | First difference of log series; additive across time
Nonstationarity | Prices are I(1), returns are I(0) | Unit root; difference to achieve stationarity
ADF test | H0: unit root | Fails to reject for prices; rejects for returns
KPSS test | H0: stationary | Rejects for prices; fails to reject for returns
Annualization | μ × 252, σ × √252 | Scaling rules for i.i.d. sums
Normality | Returns are approximately but not exactly normal | Heavy tails, excess kurtosis (explored in Module 03)

You now understand the most fundamental transformation in quantitative finance: converting raw prices into returns. In the next module, we will explore why returns are not normal — the so-called “stylized facts” of financial data — and what distributional models work better.