Learn Without Walls

Module 14: Commodities, Gold & Oil

Futures vs spot prices, contango/backwardation, gold as inflation hedge, and ties to predictive models

Part 3 of 5 · Module 14 of 22

1. Commodities: Physical Assets vs. Financial Assets

Commodities are physical goods that are traded on markets: metals (gold, silver, copper), energy (crude oil, natural gas), agriculture (wheat, corn, soybeans), and livestock. They differ fundamentally from stocks and bonds because they are real, tangible assets that must be stored, transported, and physically delivered.

Stats Bridge

Stocks and bonds are claims on cash flow processes (stochastic or deterministic). Commodities are priced by supply-demand equilibrium for physical goods. This means commodity prices are driven by a fundamentally different data-generating process: weather, geopolitics, extraction costs, storage constraints, and inventory levels. While financial asset returns are often modeled as random walks, commodity prices tend to be mean-reverting — high prices incentivize new supply and reduce demand, pushing prices back toward the cost of production.
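To make the contrast between the two data-generating processes concrete, here is a minimal simulation sketch (not part of the original course code). It compares a random walk with a discrete-time mean-reverting AR(1) process. The persistence value `phi = 0.9`, the unit shock variance, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 2000, 200
phi = 0.9  # assumed persistence of the mean-reverting process (hypothetical value)

shocks = rng.normal(0.0, 1.0, size=(n_paths, n_steps))

# Random walk: x_t = x_{t-1} + e_t, so shocks accumulate forever
random_walk = np.cumsum(shocks, axis=1)

# Mean-reverting AR(1): x_t = phi * x_{t-1} + e_t, so shocks decay geometrically
mean_reverting = np.zeros((n_paths, n_steps))
for t in range(1, n_steps):
    mean_reverting[:, t] = phi * mean_reverting[:, t - 1] + shocks[:, t]

# Cross-sectional variance across paths at the final step
rw_var = random_walk[:, -1].var()
mr_var = mean_reverting[:, -1].var()
mr_var_theory = 1.0 / (1.0 - phi**2)  # stationary variance of an AR(1) with unit shocks

print(f"Random walk variance at t={n_steps}: {rw_var:.1f} (grows roughly like t)")
print(f"AR(1) variance at t={n_steps}:       {mr_var:.2f} (theory: {mr_var_theory:.2f})")
```

The random walk's variance keeps growing with the horizon, while the mean-reverting process settles at a finite variance. That bounded long-run uncertainty is what the supply-response argument implies for commodity prices.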

1.1 Key Differences from Financial Assets

| Property | Financial Assets (Stocks, Bonds) | Commodities |
| --- | --- | --- |
| Nature | Claims on future cash flows (paper) | Physical goods with intrinsic use |
| Storage | Costless (electronic entries) | Costly (warehouses, refrigeration, tanks) |
| Yield | Dividends, coupons | No yield; instead a "convenience yield" |
| Price dynamics | Tend toward random walk (efficient market) | Tend toward mean-reversion (supply response) |
| Supply elasticity | Instantaneous (issue new shares) | Slow (takes years to build mines, wells) |
| Expiration | Perpetual (stocks) or fixed maturity (bonds) | Continuous production/consumption cycle |
| Return distribution | Slight negative skew (stocks), near-normal (bonds) | Positive skew, supply-shock driven fat tails |

Finance Term

Convenience yield = the non-monetary benefit of holding the physical commodity rather than a futures contract. A refinery holding crude oil inventory can keep running even during supply disruptions; this operational flexibility has value. The convenience yield is like an implicit dividend that physical commodity holders earn.

1.2 Major Commodity Categories

| Category | Examples | Key Price Drivers | Volatility Character |
| --- | --- | --- | --- |
| Precious metals | Gold, silver, platinum | Monetary policy, inflation, safe-haven demand | Moderate; gold is least volatile |
| Energy | Crude oil, natural gas, gasoline | OPEC, geopolitics, economic growth, weather | High; supply shocks create extreme tails |
| Industrial metals | Copper, aluminum, zinc, iron ore | Industrial demand (especially China), mining supply | Moderate to high; cyclical |
| Agriculture | Wheat, corn, soybeans, coffee | Weather, crop yields, government policy | Seasonal patterns; weather-driven spikes |
| Livestock | Live cattle, lean hogs | Feed costs, disease outbreaks, demand | Moderate; biological production constraints |

2. Spot Price vs. Futures Price

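Every commodity has two types of prices that coexist in the market:

- Spot price (S): the price for immediate delivery of the physical commodity.
- Futures price (F): the price agreed today for delivery at a specified future date T.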

Stats Bridge

The spot price is an observed value (current measurement). The futures price is a conditional expectation — the market's consensus forecast of the future spot price, adjusted for risk premiums and carrying costs. The relationship between spot and futures prices across different delivery dates forms the term structure of commodity prices, analogous to the yield curve for bonds.

2.1 Contango and Backwardation

The relationship between spot and futures prices defines two market regimes:

Finance Term

Contango: Futures price > Spot price (F > S). The forward curve slopes upward. This is the "normal" state for storable commodities where carrying costs dominate.

Backwardation: Futures price < Spot price (F < S). The forward curve slopes downward. This occurs when there is strong immediate demand or supply scarcity, making the physical commodity more valuable now than later.

| Feature | Contango (F > S) | Backwardation (F < S) |
| --- | --- | --- |
| Forward curve | Upward sloping | Downward sloping |
| What it signals | Adequate supply; storage costs dominate | Supply scarcity; immediate need is high |
| Convenience yield | Low (no urgency to hold physical) | High (physical inventory is very valuable) |
| Roll return for futures holders | Negative (buy high, sell low when rolling) | Positive (buy low, sell high when rolling) |
| Common for | Gold, financial futures | Oil during supply crises, perishable goods |
| Risk-premium view | Long futures holders expect a loss (negative risk premium) only if F also exceeds the expected future spot price E[S_T]; F > S alone does not imply this | Long futures holders earn a positive risk premium if F is below E[S_T] ("normal backwardation"); F < S alone does not imply this |

Common Pitfall

Many investors assume contango means the market expects prices to rise. This is wrong. Contango primarily reflects carrying costs (storage + financing), not price expectations. A market can be in contango even if participants expect the spot price to fall. Similarly, backwardation does not necessarily mean the market expects prices to drop. The futures-spot relationship is determined by carrying costs and convenience yields, not pure expectations.

3. The Cost of Carry Model

The fundamental pricing relationship for commodity futures is the cost of carry model. It links the futures price to the spot price through the economics of physically holding the commodity:

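$$F = S \cdot e^{(r + u - y)\,T}$$

where:

- S = spot price of the commodity
- F = futures price for delivery at time T
- r = financing (risk-free) rate, annualized and continuously compounded
- u = storage cost rate, annualized, as a fraction of the spot price
- y = convenience yield, annualized
- T = time to delivery, in years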

Stats Bridge

The cost of carry model is a no-arbitrage relationship, not a forecasting model. It says: the futures price must equal the spot price adjusted for the cost of "carrying" the physical commodity until delivery. If F is too high relative to this formula, an arbitrageur would buy the physical commodity, store it, and sell the futures — locking in a riskless profit. This arbitrage keeps futures prices tied to the carry model, just as no-arbitrage keeps put-call parity in the options world.
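The arbitrage argument can be sketched numerically. The example below is a simplified illustration, assuming continuous compounding, a proportional storage cost rate, and zero convenience yield; the function name `cash_and_carry_profit` and the gold-like inputs are hypothetical.

```python
import math

def cash_and_carry_profit(spot, futures, r, u, T):
    """Profit per unit at delivery from: borrow, buy spot, store, short futures.

    Assumes continuous compounding, a proportional storage cost rate u,
    and no convenience yield (simplifying assumptions for illustration).
    """
    # Cost of buying at spot today and carrying to T, financed at r with storage u
    carried_cost = spot * math.exp((r + u) * T)
    # The short futures position locks in the sale price `futures` at T
    return futures - carried_cost

spot, r, u, T = 2000.0, 0.05, 0.002, 1.0  # illustrative gold-like inputs
fair = spot * math.exp((r + u) * T)       # futures price implied by cost of carry

for quoted in (fair - 20, fair, fair + 20):
    profit = cash_and_carry_profit(spot, quoted, r, u, T)
    print(f"Quoted F = {quoted:8.2f}  ->  cash-and-carry profit = {profit:+7.2f}")
```

When the quoted futures price sits above the carry-implied price, the profit is positive and locked in today. A negative value means the reverse trade (sell physical, buy futures) would be the profitable side, which is only open to those who already hold inventory; this asymmetry is where the convenience yield enters.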

3.1 Components of the Carry Cost

| Component | Description | Effect on Futures Price | Typical Magnitude |
| --- | --- | --- | --- |
| Financing cost (r) | Opportunity cost of capital tied up in inventory | Increases F (contango) | 4–6% per year |
| Storage cost (u) | Warehousing, insurance, maintenance | Increases F (contango) | 0.1% (gold) to 5%+ (natural gas) |
| Convenience yield (y) | Benefit of holding physical supply | Decreases F (toward backwardation) | 0% (gold) to 20%+ (scarce oil) |

Key Insight

For gold, the convenience yield is essentially zero (gold in a vault provides no operational benefit), storage costs are minimal, and the cost of carry is dominated by the interest rate. This is why gold is almost always in contango: $F \approx S \cdot e^{rT}$. For crude oil, convenience yields can be enormous during supply crises, pushing the market into steep backwardation. The convenience yield is the key variable that distinguishes commodity from financial asset pricing.

3.2 Implied Convenience Yield

We can invert the cost of carry model to extract the market's implied convenience yield, just as we extracted implied volatility from option prices:

$$y_{\text{implied}} = r + u - \frac{1}{T}\,\ln\!\left(\frac{F}{S}\right)$$

Stats Bridge

This is another inverse problem: given the observables (S, F, r, u, T), solve for the latent parameter (convenience yield). A high implied convenience yield signals supply tightness — the market is placing a premium on having the physical commodity available immediately. Tracking the implied convenience yield over time provides a real-time indicator of supply-demand balance, analogous to tracking implied volatility to monitor market fear.
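A short worked example may help here. The sketch below inverts the cost of carry formula for two hypothetical six-month oil quotes; the prices, rates, and function name are illustrative assumptions, not market data.

```python
import math

def implied_convenience_yield(spot, futures, r, u, T):
    """Invert F = S * exp((r + u - y) * T) for the convenience yield y."""
    return r + u - math.log(futures / spot) / T

# Hypothetical quotes: spot at $75, six-month futures at $72 (backwardation)
y_tight = implied_convenience_yield(spot=75.0, futures=72.0, r=0.05, u=0.04, T=0.5)

# Same spot, six-month futures at $77 (contango)
y_ample = implied_convenience_yield(spot=75.0, futures=77.0, r=0.05, u=0.04, T=0.5)

print(f"Backwardated curve: implied y = {y_tight:.2%}")
print(f"Contango curve:     implied y = {y_ample:.2%}")
```

Under these assumed inputs the backwardated curve implies a convenience yield of roughly 17%, against under 4% for the contango curve, which is the "supply tightness" signal described above.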

4. Gold as an Inflation Hedge: A Hypothesis Test

One of the most persistent claims in finance is that gold is an inflation hedge — that gold returns are positively correlated with inflation, preserving purchasing power when the value of money declines. As statisticians, we should test this claim rather than accept it as folklore.

4.1 The Hypothesis

$$H_0:\ \rho(R_{\text{gold}}, \pi) = 0 \quad \text{(gold returns uncorrelated with inflation)}$$

$$H_1:\ \rho(R_{\text{gold}}, \pi) > 0 \quad \text{(gold is a positive inflation hedge)}$$

where $R_{\text{gold}}$ is the gold return and $\pi$ is the inflation rate (CPI change).

Stats Bridge

This is a standard one-sided hypothesis test for correlation. We can use the Pearson correlation coefficient with a t-test, or a regression of gold returns on CPI changes: $R_{\text{gold},t} = \alpha + \beta\,\pi_t + \varepsilon_t$. Under the inflation hedge hypothesis, $\beta > 0$, and ideally $\beta \geq 1$ (gold fully compensates for inflation, or more). But we must be careful: the result depends heavily on the time horizon. Monthly correlations may be near zero while decade-long correlations are strongly positive.
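The regression version of the test can be sketched as follows. The data are simulated, the "true" slope of 1 and the volatilities are assumptions chosen for illustration, and the one-sided p-value is computed by hand from the t-statistic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 240  # 20 years of monthly observations (simulated, not real data)

inflation = rng.normal(0.002, 0.003, n)
true_beta = 1.0  # assumed for the simulation only
gold = 0.003 + true_beta * inflation + rng.normal(0.0, 0.04, n)

# OLS of gold returns on inflation
fit = stats.linregress(inflation, gold)

# One-sided test of H0: beta = 0 against H1: beta > 0
t_stat = fit.slope / fit.stderr
p_one_sided = stats.t.sf(t_stat, df=n - 2)

print(f"beta_hat = {fit.slope:.2f} (se = {fit.stderr:.2f})")
print(f"t = {t_stat:.2f}, one-sided p = {p_one_sided:.3f}")
```

With these assumed volatilities the standard error of the slope is on the order of 0.9, so even a true β of 1 is hard to distinguish from zero at monthly frequency. This low power is one reason short-horizon tests tend to look unfavorable for the hedge hypothesis.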

4.2 Empirical Evidence by Time Horizon

| Time Horizon | Gold-Inflation Correlation | Interpretation |
| --- | --- | --- |
| Monthly | ~0.05 (near zero) | Gold is a poor short-term inflation hedge |
| Quarterly | ~0.10 | Weak relationship; not statistically significant |
| Annual | ~0.25 | Moderate positive correlation; marginally significant |
| 5-year rolling | ~0.50 | Meaningful relationship at medium horizons |
| 10-year rolling | ~0.70 | Strong long-term inflation hedge |
| Multi-century | ~1.0 | Near-perfect long-run purchasing power preservation |

Key Insight

Gold is a long-run inflation hedge but a poor short-run inflation hedge. Over centuries, gold has maintained purchasing power remarkably well. Over months or even years, gold prices are driven by many factors beyond inflation: real interest rates, dollar strength, geopolitical risk, central bank purchases, and speculative flows. This is a classic example of a relationship that exists at one time scale but not another — a phenomenon familiar to any time-series statistician as scale-dependent correlation.

4.3 Gold and Real Interest Rates

A more nuanced model recognizes that gold responds not to inflation itself but to real interest rates (nominal rate minus inflation). When real rates are low or negative, the opportunity cost of holding gold (which pays no interest) is low, making gold more attractive.

$$R_{\text{gold},t} = \alpha + \beta_1\,\pi_t + \beta_2\,r_{\text{real},t} + \beta_3\,\Delta\mathrm{USD}_t + \varepsilon_t$$

This multivariate regression typically finds $\beta_2 < 0$ (negative relationship with real rates) to be the strongest predictor, often more significant than $\beta_1$ (direct inflation effect).
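As a rough sketch of how this regression could be estimated, the example below fits it by ordinary least squares on simulated data. The coefficient values, regressor scales, and variable names are all assumptions made for the illustration; with real data you would substitute observed series.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 240  # simulated monthly observations

# Simulated regressors (hypothetical scales)
inflation = rng.normal(0.002, 0.003, n)
real_rate = rng.normal(0.01, 0.01, n)
d_usd = rng.normal(0.0, 0.02, n)

# Assumed "true" coefficients for the simulation: [alpha, beta1, beta2, beta3]
true_coefs = np.array([0.004, 0.5, -1.5, -0.8])
X = np.column_stack([np.ones(n), inflation, real_rate, d_usd])
gold = X @ true_coefs + rng.normal(0.0, 0.02, n)

# OLS estimates and conventional (homoskedastic) standard errors
beta_hat, *_ = np.linalg.lstsq(X, gold, rcond=None)
resid = gold - X @ beta_hat
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

names = ["alpha", "beta1 (inflation)", "beta2 (real rate)", "beta3 (dUSD)"]
for name, b, s in zip(names, beta_hat, se):
    print(f"{name:>18}: {b:+.3f}  (t = {b / s:+.2f})")
```

Because inflation varies much less than real rates in this setup, its coefficient is estimated far less precisely, which mirrors the pattern described above where the real-rate term is the more significant one.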

5. Oil's Unique Dynamics

Crude oil is the most economically significant commodity. Its price affects virtually every sector of the global economy. Oil prices exhibit dynamics that are distinct from other commodities due to three factors: cartel behavior, geopolitical risk, and storage constraints.

5.1 OPEC as a Cartel

The Organization of the Petroleum Exporting Countries (OPEC) is a cartel of oil-producing nations that collectively control approximately 35–40% of global oil production. OPEC's production decisions directly affect global supply and therefore prices.

Stats Bridge

OPEC introduces a strategic game-theoretic element into oil pricing. Unlike most commodities where supply responds mechanically to price signals, OPEC members make strategic production decisions. This means oil supply is not just a function of price but also of oligopoly dynamics: each member's output depends on what they believe other members will produce. In statistical terms, the data-generating process for oil has an endogenous structural break mechanism — OPEC meetings can discretely shift the supply function.

5.2 Supply Shocks and Fat Tails

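Oil prices are subject to sudden supply disruptions, such as those caused by geopolitical conflict, OPEC production decisions, and extreme weather (the energy price drivers listed in Section 1.2).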

These shocks create a return distribution with extreme tails. Oil prices have experienced moves of ±30% in a single month multiple times in the past 50 years. The 2020 COVID demand shock briefly pushed oil futures to negative prices — a previously unthinkable event.

Common Pitfall

In April 2020, the WTI crude oil May futures contract settled at −$37.63 per barrel. This was not a data error. With demand collapsing and storage facilities at capacity, holders of expiring futures faced the prospect of physical delivery with nowhere to put the oil. They literally had to pay someone to take the oil off their hands. This event demonstrated that commodity futures prices are not bounded below by zero when physical delivery and storage constraints bind — a fact that broke many pricing models that assumed positive prices.
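One concrete way a positive-price assumption fails is the log return, which is undefined once a price changes sign. The sketch below uses the −37.63 settlement cited above; the prior-day price of 18.27 is an approximate value included only for illustration.

```python
import numpy as np

# -37.63 is the settlement cited in the text; 18.27 is an approximate,
# illustrative prior-day price for the same contract.
prices = np.array([18.27, -37.63])

# Log return: the price ratio is negative, so the logarithm is undefined (nan)
with np.errstate(invalid="ignore"):
    log_return = np.log(prices[1] / prices[0])

# Arithmetic price change: still well defined when prices cross zero
price_change = prices[1] - prices[0]

print(f"Log return:   {log_return}")
print(f"Price change: {price_change:+.2f} $/bbl")
```

Any pipeline that computes log returns, or a pricing model built on lognormal prices, silently produces missing values at this point. Working in price differences rather than logs is one way to keep the calculation defined.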

5.3 Oil's Statistical Properties

| Property | Typical Value | Implication |
| --- | --- | --- |
| Annual volatility | 30–40% | Much higher than stocks (~15–20%) or bonds (~5–8%) |
| Skewness | Positive (supply-shock driven) | Large upward spikes more common than large drops |
| Kurtosis | 6–10 (excess kurtosis 3–7) | Extremely fat tails; normal distribution is a poor fit |
| Mean reversion | Half-life of 2–5 years | Extreme prices tend to revert toward production cost |
| Autocorrelation of returns | Near zero (daily) | Short-term returns are approximately unpredictable |
| Volatility clustering | Strong GARCH effects | Calm and turbulent periods cluster together |
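
The last two rows of the table can coexist: returns can be nearly uncorrelated while their squares are clearly autocorrelated. The sketch below simulates a GARCH(1,1) process to show this. The parameters `omega`, `alpha`, and `beta` are assumed values chosen to give roughly 2% daily volatility; they are not estimates from oil data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
omega, alpha, beta = 2e-5, 0.10, 0.85  # assumed GARCH(1,1) parameters

returns = np.zeros(n)
var_t = omega / (1 - alpha - beta)  # start at the unconditional variance
for t in range(n):
    returns[t] = np.sqrt(var_t) * rng.standard_normal()
    # Next period's variance reacts to today's squared return
    var_t = omega + alpha * returns[t] ** 2 + beta * var_t

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D array."""
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x @ x)

ac_returns = lag1_autocorr(returns)
ac_squared = lag1_autocorr(returns ** 2)

print(f"Lag-1 autocorrelation of returns:         {ac_returns:+.3f}")
print(f"Lag-1 autocorrelation of squared returns: {ac_squared:+.3f}")
```

The direction of returns stays close to unpredictable, but their magnitude does not, which is why volatility forecasts are feasible even when return forecasts are not.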

6. The Gold-to-Oil Ratio: A Mean-Reverting Spread

The gold-to-oil ratio — the price of one ounce of gold divided by the price of one barrel of crude oil — is one of the most watched inter-commodity ratios in finance:

Gold/Oil Ratio = Price of Gold ($/oz) / Price of Oil ($/bbl)

Historically, this ratio has averaged about 15–20, meaning one ounce of gold buys 15–20 barrels of oil. The ratio is mean-reverting: when it deviates far from its historical average, it tends to revert back.

Stats Bridge

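The gold-to-oil ratio can be treated as a cointegration spread. Gold and oil prices are individually close to non-stationary over typical sample lengths (their mean reversion is slow, so they behave like I(1) series), yet their log-difference, log(gold) − log(oil), is often modeled as stationary (I(0)). Two I(1) series with a stationary linear combination are cointegrated by definition; here the assumed cointegrating vector is (1, −1). You can test the ratio for a unit root with the Augmented Dickey-Fuller test, or estimate the cointegrating relationship with the Engle-Granger two-step procedure. If the spread is mean-reverting, it is tradeable: buy the spread when it is unusually low (gold cheap relative to oil) and sell when it is unusually high.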
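The Engle-Granger two-step procedure can be sketched with NumPy alone. The example below is a simplified illustration on simulated data: it omits lagged differences in the second step, and the simulation parameters and the helper `ols` are assumptions, not part of the course model.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 360

# Simulated cointegrated pair (hypothetical parameters): log oil is a random walk,
# and log gold equals log(16) + log oil plus a stationary AR(1) deviation.
log_oil = np.log(50) + np.cumsum(rng.normal(0.003, 0.06, n))
spread = np.zeros(n)
for t in range(1, n):
    spread[t] = 0.9 * spread[t - 1] + rng.normal(0, 0.05)
log_gold = np.log(16) + log_oil + spread

def ols(y, X):
    """Return coefficients, standard errors, and residuals from least squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, se, resid

# Step 1: cointegrating regression log_gold = a + b * log_oil + z
X1 = np.column_stack([np.ones(n), log_oil])
(a_hat, b_hat), _, z = ols(log_gold, X1)

# Step 2: Dickey-Fuller regression on the residuals: dz_t = gamma * z_{t-1} + e_t
dz, z_lag = np.diff(z), z[:-1]
(gamma_hat,), (gamma_se,), _ = ols(dz, z_lag.reshape(-1, 1))
t_stat = gamma_hat / gamma_se

print(f"Step 1: b_hat = {b_hat:.3f}")
print(f"Step 2: gamma_hat = {gamma_hat:.3f}, t = {t_stat:.2f}")
print("Compare t with Engle-Granger critical values (roughly -3.3 to -3.4 at 5%")
print("for two series with a constant), not with the standard normal table.")
```

The residual-based test needs its own critical values because the residuals come from an estimated regression. In practice a library routine that supplies those values and handles lag selection is preferable to this hand-rolled version.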

6.1 Interpreting Extreme Ratios

| Ratio Level | Interpretation | Historical Context |
| --- | --- | --- |
| < 10 | Oil expensive relative to gold | Oil supply crises (1980, 2008 price spike) |
| 10–15 | Below average; oil relatively dear | Periods of strong oil demand |
| 15–25 | Normal range | Most of the time |
| 25–40 | Above average; gold relatively dear | Oil gluts, gold fear premium |
| > 40 | Extreme; oil very cheap relative to gold | 2020 COVID crash (ratio briefly exceeded 100) |

Key Insight

The gold-to-oil ratio is a relative value indicator, not a directional predictor. A high ratio does not tell you whether gold will fall or oil will rise — only that the ratio is likely to narrow. This is important for trading: you would trade the spread (long oil, short gold, or vice versa) rather than taking a directional bet on either commodity alone. This pairs trading approach exploits mean-reversion while being market-neutral.

7. Connecting to Our Gold Prediction Model

Earlier in this course, we built a statistical model to predict gold prices. Now we can place that model in the broader context of commodity pricing theory.

7.1 Model Features Revisited

The features we used in the gold prediction model correspond to the fundamental drivers identified in commodity pricing theory:

| Model Feature | Commodity Theory Explanation | Statistical Role |
| --- | --- | --- |
| Real interest rates | Opportunity cost of holding a zero-yield asset | Primary predictor; negative coefficient expected |
| CPI / Inflation rate | Gold as inflation hedge (long-run) | Positive coefficient, but weak at short horizons |
| US dollar index (DXY) | Gold priced in USD; weaker dollar = higher gold | Negative coefficient; strong relationship |
| VIX (market fear index) | Safe-haven demand during uncertainty | Positive coefficient during crises |
| Central bank gold reserves | Official sector demand shifts | Slow-moving covariate; structural shift |
| Oil price | Cointegrated with gold; input cost correlation | Positive coefficient; captures commodity cycle |

Stats Bridge

Our gold prediction model was essentially a reduced-form regression that captures the equilibrium relationships in commodity pricing theory. Each feature corresponds to a theoretical price driver. The model works because these economic relationships are statistically stable (cointegrated). However, the model's out-of-sample performance degrades during structural breaks — periods when the relationship between gold and its drivers changes (e.g., when gold transitions from a currency-linked asset to a freely traded commodity, as it did in 1971).

8. Commodity Indices and Their Construction

Commodity indices aggregate the prices of multiple commodities into a single index, providing a benchmark for the asset class. However, the construction methodology matters enormously because it determines what the index actually measures.

8.1 Major Commodity Indices

| Index | Weighting Scheme | Number of Commodities | Energy Weight (approx.) |
| --- | --- | --- | --- |
| S&P GSCI | World production weighted | 24 | ~54% (heavy energy tilt) |
| Bloomberg Commodity (BCOM) | Production + liquidity, with caps | 23 | ~30% (more diversified) |
| CRB Index | Fixed arithmetic weights | 19 | ~39% |
| DBIQ Optimum Yield | Optimized roll strategy | 14 | ~33% |

Stats Bridge

The weighting scheme of a commodity index is analogous to the choice of weighting in a composite estimator. Production-weighted indices (like S&P GSCI) weight commodities by their economic importance, analogous to a GDP-weighted average. Equal-weighted indices treat each commodity as equally informative, like an unweighted sample mean. Cap-weighted indices with diversification constraints are like shrinkage estimators that pull extreme weights toward a balanced allocation. The choice of weighting fundamentally changes the index's statistical properties.
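The effect of the weighting choice can be seen in a small sketch. The production figures below are made-up numbers in arbitrary units, the 33% cap is an illustrative choice, and `cap_weights` is a hypothetical helper rather than any index provider's actual methodology.

```python
import numpy as np

# Hypothetical production values by sector (arbitrary units, not real index data)
sectors = ["Energy", "Agriculture", "Industrial metals", "Precious metals", "Livestock"]
production = np.array([60.0, 15.0, 12.0, 8.0, 5.0])

production_w = production / production.sum()
equal_w = np.full(len(sectors), 1.0 / len(sectors))

def cap_weights(weights, cap):
    """Cap each weight and redistribute the excess pro rata to uncapped names.

    Assumes cap >= 1 / len(weights), so a feasible allocation exists.
    """
    w = weights.copy()
    for _ in range(len(w)):  # each pass caps at least one more name, if any
        over = w > cap + 1e-12
        if not over.any():
            break
        excess = (w[over] - cap).sum()
        w[over] = cap
        free = w < cap - 1e-12
        w[free] += excess * w[free] / w[free].sum()
    return w

capped_w = cap_weights(production_w, cap=0.33)

print(f"{'Sector':<18}{'Production':>12}{'Equal':>8}{'Capped':>8}")
for name, pw, ew, cw in zip(sectors, production_w, equal_w, capped_w):
    print(f"{name:<18}{pw:>12.1%}{ew:>8.1%}{cw:>8.1%}")
```

The capped weights sit between the production-weighted and equal-weighted allocations, which is the shrinkage behavior described above: the dominant sector is pulled down and the remainder is spread across the others in proportion to their original weights.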

8.2 Return Components of Commodity Indices

The total return of a commodity futures index has three components:

| Component | Source | Magnitude |
| --- | --- | --- |
| Spot return | Change in the spot price of the underlying commodity | Varies widely; can be ±50%+ in extreme years |
| Roll return | Gain or loss from rolling expiring futures to the next contract | Positive in backwardation, negative in contango; typically −5% to +5% |
| Collateral return | Interest earned on the T-bill collateral backing futures positions | Equal to the short-term interest rate; currently ~5% |

Common Pitfall

The roll return is the most misunderstood component. When a market is in contango, an index fund must sell the expiring (lower-priced) contract and buy the next (higher-priced) contract, in effect buying high and selling low. This creates a persistent drag that can cause the index to underperform the spot price by several percent per year. During 2005–2020, the roll return on oil futures indices was approximately −5% to −10% per year, meaning futures investors earned far less than the spot price appreciation would suggest.
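The size of the roll effect is easy to compute from two points on the curve. The sketch below annualizes the log price gap between adjacent contracts and adds it to assumed spot and collateral returns; the prices, the 8% spot return, and the function name are hypothetical.

```python
import math

def annualized_roll_yield(near_price, far_price, months_between):
    """Annualized log roll yield from selling the near contract and buying the far one.

    Positive when the curve is backwardated (far < near), negative in contango.
    """
    return math.log(near_price / far_price) * 12.0 / months_between

# Hypothetical oil curves, one month between contracts
contango = annualized_roll_yield(near_price=75.00, far_price=75.50, months_between=1)
backwardation = annualized_roll_yield(near_price=75.00, far_price=74.00, months_between=1)

# Approximate total return: spot + roll + collateral (assumed annual figures)
spot_return, collateral_return = 0.08, 0.05
total_contango = spot_return + contango + collateral_return
total_backwardation = spot_return + backwardation + collateral_return

print(f"Roll yield in contango:      {contango:+.1%}")
print(f"Roll yield in backwardation: {backwardation:+.1%}")
print(f"Total return (contango):      {total_contango:+.1%}")
print(f"Total return (backwardation): {total_backwardation:+.1%}")
```

A gap of only 50 cents between monthly contracts, if it persisted all year, would offset almost all of the assumed 8% spot gain in this example. Small-looking curve slopes compound into large differences for index investors.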

9. Python: Commodity Analysis

9.1 Testing the Gold-Inflation Hedge Hypothesis

import numpy as np
from scipy import stats

# Simulated monthly data: gold returns and inflation (CPI change)
# In practice, you would download this from FRED or similar
np.random.seed(42)
n_months = 360  # 30 years of monthly data

# Independent monthly draws would have the same correlation at every horizon,
# so a scale-dependent relationship needs a shared persistent component.
# Illustrative assumption: a slow AR(1) inflation trend that gold also tracks,
# buried under large short-run noise in gold returns (monthly correlation < 0.1).
phi = 0.95
trend = np.zeros(n_months)
for t in range(1, n_months):
    trend[t] = phi * trend[t - 1] + np.random.normal(0, 0.001)

inflation = 0.002 + trend + np.random.normal(0, 0.002, n_months)
gold_returns = 0.007 + trend + np.random.normal(0, 0.04, n_months)

# Correlation at each horizon: sum monthly values over non-overlapping blocks
# (summing simple returns is an approximation; log returns add exactly).
# pearsonr reports a two-sided p-value by default.
print("=== Gold-Inflation Correlation by Time Horizon ===")
horizons = [("Monthly:", 1), ("Quarterly:", 3), ("Annual:", 12), ("5-Year:", 60)]
for label, h in horizons:
    n_blocks = n_months // h
    gold_h = gold_returns[:n_blocks * h].reshape(n_blocks, h).sum(axis=1)
    infl_h = inflation[:n_blocks * h].reshape(n_blocks, h).sum(axis=1)
    r_h, p_h = stats.pearsonr(gold_h, infl_h)
    print(f"{label:<10} r = {r_h:.4f}, p-value = {p_h:.4f}, n = {n_blocks}")

print("\nBy construction, the shared trend makes the true correlation rise with horizon.")
print("Long-horizon estimates use few blocks (n = 6 at 5 years), so they are noisy.")

9.2 Cost of Carry and Contango/Backwardation

import numpy as np

def futures_price(spot, r, storage, convenience_yield, T):
    """Cost of carry model for commodity futures."""
    return spot * np.exp((r + storage - convenience_yield) * T)

def implied_convenience_yield(spot, futures, r, storage, T):
    """Extract implied convenience yield from observed prices."""
    return r + storage - np.log(futures / spot) / T

# === Gold: Near-zero convenience yield, always in contango ===
print("=== Gold Futures Term Structure ===")
gold_spot = 2000  # $/oz
r = 0.05           # 5% risk-free rate
gold_storage = 0.002  # 0.2% storage cost
gold_cy = 0.0       # zero convenience yield for gold

print(f"{'Delivery':>10} {'Futures Price':>15} {'Basis':>10} {'State':>15}")
for months in [1, 3, 6, 12, 24]:
    T = months / 12
    F = futures_price(gold_spot, r, gold_storage, gold_cy, T)
    basis = F - gold_spot
    state = "Contango" if F > gold_spot else "Backwardation"
    print(f"  {months:>3}mo     ${F:>12.2f}   ${basis:>+8.2f}   {state}")

# === Oil: Varies between contango and backwardation ===
print("\n=== Oil Futures: Contango vs Backwardation ===")
oil_spot = 75  # $/bbl
oil_storage = 0.04  # 4% storage cost

# Scenario 1: Low convenience yield (ample supply) = Contango
print("\nScenario 1: Ample supply (convenience yield = 2%)")
print(f"{'Delivery':>10} {'Futures':>10} {'Basis':>10} {'State':>15}")
for months in [1, 3, 6, 12]:
    T = months / 12
    F = futures_price(oil_spot, r, oil_storage, 0.02, T)
    basis = F - oil_spot
    state = "Contango" if F > oil_spot else "Backwardation"
    print(f"  {months:>3}mo     ${F:>7.2f}   ${basis:>+7.2f}   {state}")

# Scenario 2: High convenience yield (supply crisis) = Backwardation
print("\nScenario 2: Supply crisis (convenience yield = 20%)")
print(f"{'Delivery':>10} {'Futures':>10} {'Basis':>10} {'State':>15}")
for months in [1, 3, 6, 12]:
    T = months / 12
    F = futures_price(oil_spot, r, oil_storage, 0.20, T)
    basis = F - oil_spot
    state = "Contango" if F > oil_spot else "Backwardation"
    print(f"  {months:>3}mo     ${F:>7.2f}   ${basis:>+7.2f}   {state}")

9.3 Gold-to-Oil Ratio and Mean Reversion

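import numpy as np
from scipy import stats

# Simulate 30 years of monthly gold and oil prices.
# Gold is non-stationary; oil is tied to gold through a mean-reverting log ratio.
np.random.seed(99)
n = 360

# Gold: random walk with drift (non-stationary)
gold_log_returns = np.random.normal(0.005, 0.04, n)
gold_prices = 800 * np.exp(np.cumsum(gold_log_returns))

# Log ratio: AR(1) around log(16). Two separately simulated random walks would
# NOT have a stationary ratio, so the mean reversion is built in explicitly.
# phi = 0.97 is an illustrative choice (half-life of roughly 23 months).
phi, long_run = 0.97, np.log(16)
log_ratio = np.empty(n)
log_ratio[0] = long_run
for t in range(1, n):
    shock = np.random.normal(0, 0.06)
    log_ratio[t] = long_run + phi * (log_ratio[t - 1] - long_run) + shock

# Oil: implied by gold and the ratio, so it inherits gold's stochastic trend
oil_prices = gold_prices / np.exp(log_ratio)

# Compute ratio
ratio = gold_prices / oil_prices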

# Summary statistics
print("=== Gold-to-Oil Ratio Analysis ===")
print(f"Mean ratio:   {ratio.mean():.1f}")
print(f"Median ratio: {np.median(ratio):.1f}")
print(f"Std dev:      {ratio.std():.1f}")
print(f"Min:          {ratio.min():.1f}")
print(f"Max:          {ratio.max():.1f}")

# Test for stationarity (simplified Dickey-Fuller regression, no lagged differences)
# H0: ratio has a unit root (non-stationary)
# H1: ratio is stationary (mean-reverting)
# Regress the change in the ratio on its lagged level
delta_ratio = np.diff(ratio)
lagged_ratio = ratio[:-1]
slope, intercept, r_val, p_val, se = stats.linregress(lagged_ratio, delta_ratio)

print(f"\n=== Mean Reversion Test ===")
print(f"Slope on lagged level (phi - 1): {slope:.4f}")
print(f"  (Negative = mean-reverting)")
print(f"t-statistic: {slope / se:.2f}")
print(f"  (Compare with Dickey-Fuller critical values, about -2.87 at 5%,")
print(f"   not the normal table; linregress's own p-value does not apply)")

if slope < 0:
    half_life = -np.log(2) / np.log(1 + slope)
    print(f"Estimated half-life of mean reversion: {half_life:.1f} months")
    print(f"  ({half_life / 12:.1f} years)")

# Trading signal: z-score of the ratio
z_score = (ratio - ratio.mean()) / ratio.std()
print(f"\nCurrent ratio:  {ratio[-1]:.1f}")
print(f"Current z-score: {z_score[-1]:+.2f}")
if z_score[-1] > 2:
    print("Signal: Gold expensive vs oil. Consider long oil / short gold.")
elif z_score[-1] < -2:
    print("Signal: Oil expensive vs gold. Consider long gold / short oil.")
else:
    print("Signal: Ratio within normal range. No trade.")

9.4 Commodity Return Distribution Analysis

import numpy as np
from scipy import stats

# Compare return distributions across asset classes
np.random.seed(55)
n = 2520  # ~10 years of daily data

# Simulate daily returns with realistic properties
stock_returns = np.random.standard_t(df=5, size=n) * 0.01 + 0.0003
bond_returns  = np.random.normal(0.0001, 0.003, n)
gold_returns  = np.random.standard_t(df=7, size=n) * 0.008 + 0.0002
oil_returns   = np.random.standard_t(df=4, size=n) * 0.015 + 0.0001

assets = {
    'Stocks (S&P 500)': stock_returns,
    'Bonds (10Y Treasury)': bond_returns,
    'Gold': gold_returns,
    'Crude Oil': oil_returns,
}

print("=== Daily Return Distribution Comparison ===")
print(f"{'Asset':>22} {'Mean':>8} {'Std':>8} {'Skew':>8} "
      f"{'Kurt':>8} {'Min':>8} {'Max':>8}")
print("-" * 74)

for name, rets in assets.items():
    print(f"{name:>22} {rets.mean():>+7.4f} {rets.std():>7.4f} "
          f"{stats.skew(rets):>+7.2f}  {stats.kurtosis(rets):>6.1f} "
          f"{rets.min():>+7.3f} {rets.max():>+7.3f}")

print("\n(Kurtosis shown is excess kurtosis; normal distribution = 0)")
print("Note: Oil has the fattest tails and highest volatility.")

# Normality test (Jarque-Bera)
print("\n=== Jarque-Bera Normality Test ===")
for name, rets in assets.items():
    jb_stat, jb_p = stats.jarque_bera(rets)
    normal = "Normal" if jb_p > 0.05 else "Non-normal"
    print(f"  {name:>22}: JB = {jb_stat:>10.1f}, p = {jb_p:.4e}  [{normal}]")

# Correlation matrix
print("\n=== Cross-Asset Correlation Matrix ===")
returns_matrix = np.column_stack([stock_returns, bond_returns, gold_returns, oil_returns])
corr = np.corrcoef(returns_matrix.T)
labels = ['Stocks', 'Bonds', 'Gold', 'Oil']

print(f"{'':>10}" + "".join(f"{l:>10}" for l in labels))
for i, label in enumerate(labels):
    print(f"{label:>10}" + "".join(f"{corr[i,j]:>10.3f}" for j in range(4)))

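print("\nNote: these series are simulated independently, so the near-zero")
print("correlations above are by construction. With real data, gold's low")
print("correlation with stocks is what makes it a diversifier.")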

10. Summary

This module has covered the statistical foundations of commodity pricing and analysis:

| Finance Concept | Statistical / Mathematical Analogue |
| --- | --- |
| Commodity price dynamics | Mean-reverting process (Ornstein-Uhlenbeck), not random walk |
| Spot vs. futures price | Observed value vs. conditional expectation (adjusted) |
| Contango / backwardation | Sign of the carry cost; term structure shape |
| Cost of carry model | No-arbitrage constraint equation |
| Convenience yield | Latent variable extracted via inverse problem |
| Gold as inflation hedge | Scale-dependent correlation; long-run cointegration |
| Oil supply shocks | Heavy-tailed innovations; structural breaks (OPEC decisions) |
| Gold-to-oil ratio | Cointegration spread; mean-reverting (ADF testable) |
| Commodity index weights | Choice of composite estimator weighting (production vs. equal) |
| Roll return drag | Systematic bias in futures-based returns vs. spot |

With stocks, bonds, options, and commodities now covered, you have a comprehensive understanding of the four major asset classes. Each has distinct statistical properties, return distributions, and pricing mechanisms. In the next module, we will explore how to combine these assets into portfolios using the tools of modern portfolio theory.