Module 22: Building Your Own Financial Dashboard
Capstone project: pull live data, compute risk metrics, visualize portfolio performance, and tie everything together
This is the capstone module. Over the preceding 21 modules, you have learned to think about financial data, risk, portfolios, derivatives, behavioral biases, macroeconomics, and financial news through the lens of statistics. Now you will build something: a personal financial dashboard that pulls live data and computes the metrics you have learned throughout the course.
The dashboard has six components, each drawing on different modules:
| Component | What It Does | Modules Referenced |
|---|---|---|
| 1. Portfolio Tracker | Track holdings, compute values, returns, weights | Modules 1–4 (data, returns, correlation) |
| 2. Risk Metrics | VaR, CVaR, max drawdown, Sharpe ratio | Modules 5–8 (risk, volatility, tail risk) |
| 3. Correlation Monitor | Rolling correlation heatmap | Module 4 (correlation), Module 9 (portfolio theory) |
| 4. Factor Exposure | CAPM and Fama-French regressions | Modules 10–12 (factor models, regression) |
| 5. Macro Context | GDP, inflation, yield curve status | Module 21 (macroeconomics) |
| 6. News Sentiment | Headline sentiment analysis | Module 20 (reading financial news) |
A financial dashboard is not just a software project — it is a statistical monitoring system. Each component is a summary statistic, a hypothesis test, or a model output that helps you make decisions. Building it forces you to operationalize everything you have learned.
22.1 — Component 1: Portfolio Tracker
The foundation of any financial dashboard is knowing what you own. The portfolio tracker takes a list of holdings (ticker, shares, purchase price) and computes current market values, portfolio weights, and returns.
Data Model
A portfolio is a collection of positions. Each position has:
- Ticker: The stock symbol (e.g., AAPL, VTI, BND)
- Shares: Number of shares held
- Cost basis: Average purchase price per share
- Current price: Latest market price (pulled live)
- Market value: Shares × Current price
- Weight: Market value / Total portfolio value
- Gain/Loss: (Current price − Cost basis) × Shares
The portfolio return is a weighted sum of individual asset returns: R_p = ∑ w_i R_i. The weights are not fixed: they drift as prices change, which means the portfolio's statistical properties (mean, variance, skewness) evolve over time even without trading. This is why periodic rebalancing is necessary to maintain a target allocation.
R_portfolio = ∑_i w_i · R_i
Total P&L = ∑_i n_i · (P_i,current − P_i,cost)
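The weight drift described above is easy to see with a small sketch (synthetic numbers, hypothetical two-asset portfolio): a 25% equity rally against flat bonds moves a 60/40 portfolio to roughly 65/35 without a single trade.

```python
# Sketch: portfolio weights drift as prices move, even with no trading.
# Hypothetical two-asset (stock/bond) portfolio; numbers are illustrative.
import numpy as np

target = np.array([0.60, 0.40])    # target stock/bond weights
values = 1000 * target             # $600 stocks, $400 bonds
growth = np.array([1.25, 1.00])    # stocks +25%, bonds flat

new_values = values * growth
new_weights = new_values / new_values.sum()
print(f"Stock weight drifted from {target[0]:.0%} to {new_weights[0]:.1%}")

# Trade needed to restore the target allocation (negative = sell):
trades = target * new_values.sum() - new_values
print(f"Rebalance: sell ${-trades[0]:.0f} of stocks, buy ${trades[1]:.0f} of bonds")
```

Running this shows the stock weight climbing to about 65.2%, and the rebalancing trade that restores the target.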
```python
# Component 1: Portfolio Tracker
import yfinance as yf
import pandas as pd
import numpy as np


class PortfolioTracker:
    """Track a portfolio of holdings with live price data."""

    def __init__(self, holdings: list[dict]):
        """
        holdings: list of dicts with keys 'ticker', 'shares', 'cost_basis'
        Example: [{'ticker': 'AAPL', 'shares': 50, 'cost_basis': 150.00}, ...]
        """
        self.holdings = pd.DataFrame(holdings)
        self.tickers = self.holdings["ticker"].tolist()

    def fetch_prices(self):
        """Fetch current prices for all holdings."""
        prices = {}
        for ticker in self.tickers:
            stock = yf.Ticker(ticker)
            hist = stock.history(period="1d")
            if not hist.empty:
                prices[ticker] = hist["Close"].iloc[-1]
        self.holdings["current_price"] = self.holdings["ticker"].map(prices)
        return self

    def compute_metrics(self):
        """Compute portfolio metrics."""
        h = self.holdings
        h["market_value"] = h["shares"] * h["current_price"]
        h["cost_value"] = h["shares"] * h["cost_basis"]
        h["gain_loss"] = h["market_value"] - h["cost_value"]
        h["return_pct"] = (h["current_price"] / h["cost_basis"] - 1) * 100
        total_value = h["market_value"].sum()
        h["weight"] = h["market_value"] / total_value * 100
        return self

    def summary(self):
        """Print portfolio summary."""
        h = self.holdings
        total_value = h["market_value"].sum()
        total_cost = h["cost_value"].sum()
        total_gain = h["gain_loss"].sum()
        total_return = (total_value / total_cost - 1) * 100
        print("Portfolio Summary")
        print("=" * 70)
        cols = ["ticker", "shares", "cost_basis", "current_price",
                "market_value", "gain_loss", "return_pct", "weight"]
        print(h[cols].to_string(index=False, float_format="%.2f"))
        print(f"\nTotal Value: ${total_value:,.2f}")
        print(f"Total Cost: ${total_cost:,.2f}")
        print(f"Total Gain: ${total_gain:+,.2f} ({total_return:+.1f}%)")
        return h

    def fetch_history(self, period="1y"):
        """Fetch historical prices for portfolio return computation."""
        prices = pd.DataFrame()
        for ticker in self.tickers:
            df = yf.download(ticker, period=period, progress=False)
            prices[ticker] = df["Close"]
        self.price_history = prices.dropna()
        self.returns = self.price_history.pct_change().dropna()
        return self


# Example usage
holdings = [
    {"ticker": "VTI", "shares": 100, "cost_basis": 200.00},
    {"ticker": "VXUS", "shares": 80, "cost_basis": 55.00},
    {"ticker": "BND", "shares": 150, "cost_basis": 75.00},
    {"ticker": "VNQ", "shares": 40, "cost_basis": 85.00},
    {"ticker": "GLD", "shares": 30, "cost_basis": 170.00},
]
tracker = PortfolioTracker(holdings)
tracker.fetch_prices().compute_metrics().summary()
tracker.fetch_history(period="1y")
```
22.2 — Component 2: Risk Metrics
Knowing your portfolio's value is necessary but not sufficient. You must also know your portfolio's risk. This component computes the key risk metrics covered throughout the course.
Risk Metrics Overview
| Metric | What It Measures | Formula | Interpretation |
|---|---|---|---|
| Annualized Volatility | Dispersion of returns | σ_ann = σ_daily × √252 | Standard deviation of returns at annual scale |
| Value at Risk (VaR) | Loss threshold at a given confidence level | VaR_α = −F⁻¹(1 − α) | "95% of the time, the daily loss will not exceed $X" |
| CVaR (Expected Shortfall) | Expected loss beyond VaR | CVaR_α = E[−R \| R ≤ −VaR_α] | Average loss on the worst 5% of days |
| Maximum Drawdown | Largest peak-to-trough decline | MDD = max_t (Peak_t − Trough_t) / Peak_t | Worst cumulative loss experienced |
| Sharpe Ratio | Risk-adjusted return | SR = (R_p − R_f) / σ_p | Excess return per unit of risk; > 1 is good |
| Sortino Ratio | Downside risk-adjusted return | SoR = (R_p − R_f) / σ_downside | Like Sharpe, but penalizes only downside volatility |
VaR is the α-quantile of the loss distribution. CVaR is the conditional expectation of losses beyond that quantile (a tail conditional expectation). The Sharpe ratio is a signal-to-noise ratio. Max drawdown is the largest decline of the cumulative-return path from its running maximum, a path-dependent statistic. Each of these maps directly to a concept from your statistics courses.
```python
# Component 2: Risk Metrics Engine
import numpy as np
import pandas as pd
from scipy import stats


class RiskMetrics:
    """Compute comprehensive risk metrics for a portfolio."""

    def __init__(self, returns: pd.Series, risk_free_rate: float = 0.05):
        """
        returns: daily portfolio returns (simple, not log)
        risk_free_rate: annualized risk-free rate
        """
        self.returns = returns.dropna()
        self.rf_daily = (1 + risk_free_rate) ** (1 / 252) - 1
        self.rf_annual = risk_free_rate

    def annualized_return(self):
        """Geometric mean annualized return."""
        total = (1 + self.returns).prod()
        n_years = len(self.returns) / 252
        return total ** (1 / n_years) - 1

    def annualized_volatility(self):
        """Annualized standard deviation."""
        return self.returns.std() * np.sqrt(252)

    def sharpe_ratio(self):
        """Annualized Sharpe ratio."""
        excess = self.returns - self.rf_daily
        return excess.mean() / excess.std() * np.sqrt(252)

    def sortino_ratio(self):
        """Sortino ratio (downside deviation only)."""
        excess = self.returns - self.rf_daily
        downside = excess[excess < 0]
        downside_std = np.sqrt((downside ** 2).mean()) * np.sqrt(252)
        return excess.mean() * 252 / downside_std

    def var_historical(self, confidence=0.95):
        """Historical VaR at given confidence level."""
        return -np.percentile(self.returns, (1 - confidence) * 100)

    def var_parametric(self, confidence=0.95):
        """Parametric (Gaussian) VaR."""
        z = stats.norm.ppf(confidence)
        return -(self.returns.mean() - z * self.returns.std())

    def cvar_historical(self, confidence=0.95):
        """Historical CVaR (Expected Shortfall)."""
        var = self.var_historical(confidence)
        tail_losses = self.returns[self.returns <= -var]
        return -tail_losses.mean()

    def max_drawdown(self):
        """Maximum drawdown (peak to trough)."""
        cumulative = (1 + self.returns).cumprod()
        running_max = cumulative.cummax()
        drawdown = (cumulative - running_max) / running_max
        return -drawdown.min()

    def calmar_ratio(self):
        """Calmar ratio: annualized return / max drawdown."""
        return self.annualized_return() / self.max_drawdown()

    def skewness(self):
        return self.returns.skew()

    def kurtosis(self):
        return self.returns.kurtosis()

    def report(self):
        """Generate full risk report."""
        print("Risk Metrics Report")
        print("=" * 55)
        print(f" Observations: {len(self.returns)} days")
        print(f" Annualized Return: {self.annualized_return():.2%}")
        print(f" Annualized Volatility: {self.annualized_volatility():.2%}")
        print(f" Sharpe Ratio: {self.sharpe_ratio():.3f}")
        print(f" Sortino Ratio: {self.sortino_ratio():.3f}")
        print(f" Calmar Ratio: {self.calmar_ratio():.3f}")
        print(f"\n VaR (95%, Historical): {self.var_historical():.2%}")
        print(f" VaR (95%, Parametric): {self.var_parametric():.2%}")
        print(f" CVaR (95%, Historical): {self.cvar_historical():.2%}")
        print(f" Maximum Drawdown: {self.max_drawdown():.2%}")
        print(f"\n Skewness: {self.skewness():.3f}")
        print(f" Excess Kurtosis: {self.kurtosis():.3f}")
        # Statistical tests
        jb_stat, jb_p = stats.jarque_bera(self.returns)
        print(f"\n Jarque-Bera statistic: {jb_stat:.1f} (p = {jb_p:.4f})")
        normality = "REJECTED" if jb_p < 0.05 else "Not rejected"
        print(f" Normality: {normality}")


# Example: compute risk for a portfolio
# (Using tracker.returns from Component 1)
# weights = tracker.holdings['weight'].values / 100
# portfolio_returns = (tracker.returns * weights).sum(axis=1)
# risk = RiskMetrics(portfolio_returns)
# risk.report()
```
Do not rely solely on parametric VaR. Financial returns are not normally distributed (you learned this in Modules 3 and 5). Parametric VaR underestimates tail risk because it assumes Gaussian tails. Always compute historical VaR alongside parametric VaR and compare the two. If they differ substantially, your returns have fat tails and the parametric estimate is dangerously optimistic.
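A quick simulation makes this warning concrete. The sketch below uses synthetic Student-t returns (3 degrees of freedom, parameters chosen purely for illustration); deep in the tail, at the 99% level, the Gaussian estimate falls visibly short of the historical one.

```python
# Sketch: parametric (Gaussian) VaR vs historical VaR on fat-tailed returns.
# Synthetic Student-t returns with 3 degrees of freedom -- illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
daily_vol = 0.01  # target ~1% daily volatility

# Draw t(3) returns and rescale to the target standard deviation
raw = stats.t.rvs(3, size=100_000, random_state=rng)
returns = raw / raw.std() * daily_vol

var_hist = -np.percentile(returns, 1)               # 99% historical VaR
z = stats.norm.ppf(0.99)
var_param = -(returns.mean() - z * returns.std())   # 99% Gaussian VaR

print(f"99% historical VaR: {var_hist:.3%}")
print(f"99% parametric VaR: {var_param:.3%}")
# The Gaussian model assumes thinner tails than the data actually has,
# so its 99% VaR sits well below the historical estimate.
```

Note the confidence level matters: at 95% the two estimates can be close or even reversed, because the fat-tailed and Gaussian quantiles cross; the understatement shows up in the deep tail.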
22.3 — Component 3: Correlation Monitor
Portfolio diversification depends on correlations between assets. But correlations are not constant — they shift over time and spike during crises (exactly when you need diversification most). The correlation monitor tracks this instability.
A rolling correlation matrix is a time-varying parameter estimate. The window length controls the bias-variance tradeoff: a short window (30 days) is responsive but noisy; a long window (252 days) is smooth but may miss regime changes. This is the same tradeoff you encounter in kernel density estimation (bandwidth selection) or in exponential smoothing (the decay parameter λ).
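The tradeoff is easy to demonstrate on synthetic data with a known, constant correlation (a sketch; the series and parameters are made up): the short-window estimate wanders far more around the truth than the long-window one.

```python
# Sketch: the window-length tradeoff for rolling correlation estimates.
# Two synthetic return series with a constant true correlation of 0.5;
# all parameters are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_obs, rho = 2000, 0.5
z = rng.standard_normal((n_obs, 2))
x = z[:, 0]
y = rho * z[:, 0] + np.sqrt(1 - rho ** 2) * z[:, 1]  # corr(x, y) = rho
df = pd.DataFrame({"x": x, "y": y})

corr_30 = df["x"].rolling(30).corr(df["y"]).dropna()
corr_252 = df["x"].rolling(252).corr(df["y"]).dropna()

print(f"True correlation: {rho}")
print(f"30-day window:  mean {corr_30.mean():.2f}, std {corr_30.std():.2f}")
print(f"252-day window: mean {corr_252.mean():.2f}, std {corr_252.std():.2f}")
# The 30-day estimates scatter widely around 0.5; the 252-day estimates
# hug it, but would also be slow to register a genuine regime change.
```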
What to Monitor
- Pairwise correlations over time: Are your assets becoming more or less correlated?
- Average portfolio correlation: High average correlation means poor diversification.
- Correlation regime changes: Sudden jumps in correlation often signal a market stress event.
- Stock-bond correlation: Historically negative (bonds diversify stocks), but it can turn positive during inflation episodes.
Average portfolio correlation: ρ̄ = (2 / (n(n − 1))) ∑_{i<j} ρ_ij
Effective number of independent bets: N_eff ≈ n / (1 + (n − 1) ρ̄)
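Plugging in numbers (the average correlation here is hypothetical): with five assets and ρ̄ = 0.6, the formula gives about 1.5 effective bets.

```python
# Quick numeric check of the effective-bets formula.
# The average pairwise correlation value is hypothetical.
n = 5
avg_corr = 0.6
n_eff = n / (1 + (n - 1) * avg_corr)
print(f"{n} assets, avg pairwise correlation {avg_corr}: {n_eff:.2f} effective bets")

# Sanity checks at the extremes:
assert n / (1 + (n - 1) * 0.0) == n   # uncorrelated assets: n bets
assert n / (1 + (n - 1) * 1.0) == 1   # perfectly correlated: a single bet
```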
```python
# Component 3: Correlation Monitor
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


class CorrelationMonitor:
    """Monitor rolling correlations in a portfolio."""

    def __init__(self, returns: pd.DataFrame, window: int = 63):
        """
        returns: DataFrame with columns = tickers, index = dates
        window: rolling window in trading days (63 ~ 3 months)
        """
        self.returns = returns
        self.window = window
        self.tickers = returns.columns.tolist()
        self.n_assets = len(self.tickers)

    def current_correlation(self):
        """Current (full-sample) correlation matrix."""
        return self.returns.corr()

    def rolling_pairwise(self, ticker1, ticker2):
        """Rolling correlation between two assets."""
        return self.returns[ticker1].rolling(self.window).corr(
            self.returns[ticker2]
        )

    def average_correlation(self):
        """Average pairwise correlation (diversification measure)."""
        corr = self.current_correlation()
        # Extract upper triangle (excluding diagonal)
        mask = np.triu(np.ones_like(corr, dtype=bool), k=1)
        upper = corr.where(mask)
        return upper.stack().mean()

    def effective_independent_bets(self):
        """Number of effective independent bets."""
        avg_corr = self.average_correlation()
        return self.n_assets / (1 + (self.n_assets - 1) * avg_corr)

    def rolling_average_correlation(self):
        """Rolling average pairwise correlation over time."""
        avg_corrs = []
        dates = []
        for end in range(self.window, len(self.returns)):
            window_data = self.returns.iloc[end - self.window:end]
            corr = window_data.corr()
            mask = np.triu(np.ones_like(corr, dtype=bool), k=1)
            avg_corrs.append(corr.where(mask).stack().mean())
            dates.append(self.returns.index[end])
        return pd.Series(avg_corrs, index=dates)

    def plot_heatmap(self):
        """Plot correlation heatmap."""
        corr = self.current_correlation()
        fig, ax = plt.subplots(figsize=(8, 6))
        sns.heatmap(corr, annot=True, fmt=".2f", cmap="RdBu_r", center=0,
                    vmin=-1, vmax=1, ax=ax, square=True, linewidths=0.5)
        ax.set_title(f"Correlation Matrix ({self.window}-day window)")
        plt.tight_layout()
        return fig

    def report(self):
        """Print correlation summary."""
        print("Correlation Monitor Report")
        print("=" * 55)
        print(f" Assets: {self.n_assets}")
        print(f" Rolling window: {self.window} days")
        print(f" Avg pairwise corr: {self.average_correlation():.3f}")
        print(f" Effective ind. bets: {self.effective_independent_bets():.1f}")
        print("\nCorrelation Matrix:")
        print(self.current_correlation().round(3).to_string())


# Example usage
# corr_monitor = CorrelationMonitor(tracker.returns, window=63)
# corr_monitor.report()
# corr_monitor.plot_heatmap()
```
The "effective number of independent bets" is a powerful diversification diagnostic. If you hold 5 assets but the effective number is only 2.1, your portfolio is really just two independent bets wearing five different labels. This happens commonly with portfolios heavy in US equities: large-cap, mid-cap, and small-cap are highly correlated, so they count as roughly one bet.
22.4 — Component 4: Factor Exposure Analysis
Understanding why your portfolio moves requires factor analysis. Are your returns driven by the broad market (beta), by value exposure, by size exposure, or by momentum? Factor regression answers this question.
CAPM: Single-Factor Model
R_p − R_f = α + β (R_m − R_f) + ε
Here α is the intercept (excess return not explained by the market), β is the market sensitivity, and ε is the idiosyncratic residual.
Fama-French Factor Models
The three-factor model adds size (SMB) and value (HML) to the market factor; the five-factor model adds profitability (RMW) and investment (CMA); momentum (MOM) is a common further extension:
| Factor | Name | What It Captures | Statistical Interpretation |
|---|---|---|---|
| MKT − Rf | Market excess return | Systematic market risk | First principal component of stock returns |
| SMB | Small Minus Big | Size premium (small caps vs large caps) | Long-short portfolio sorted on market cap |
| HML | High Minus Low | Value premium (value stocks vs growth stocks) | Long-short portfolio sorted on book-to-market |
| RMW | Robust Minus Weak | Profitability premium | Long-short portfolio sorted on operating profitability |
| CMA | Conservative Minus Aggressive | Investment premium | Long-short portfolio sorted on asset growth |
| MOM | Momentum | Trend-following premium | Long recent winners, short recent losers |
Factor regressions are multiple linear regressions. The R-squared tells you how much of your portfolio's variance is explained by systematic factors. The alpha (α) is the intercept — the return not attributable to any factor. A statistically significant positive alpha would imply genuine skill (or an omitted factor). The t-statistic on alpha is the key test: if |t| > 2, the alpha is significant at approximately 5%. Most portfolios have alphas indistinguishable from zero.
```python
# Component 4: Factor Exposure Analysis
import numpy as np
import pandas as pd
import statsmodels.api as sm
from pandas_datareader import data as pdr


class FactorAnalysis:
    """Run CAPM and Fama-French factor regressions."""

    def __init__(self, portfolio_returns: pd.Series,
                 start_date: str = "2020-01-01"):
        self.port_ret = portfolio_returns
        self.start_date = start_date
        self._load_factors()

    def _load_factors(self):
        """Load Fama-French factors from Ken French's data library."""
        ff3 = pdr.DataReader("F-F_Research_Data_Factors_daily", "famafrench",
                             start=self.start_date)
        self.factors = ff3[0] / 100  # Convert from percent to decimal

    def run_capm(self):
        """Run CAPM regression."""
        merged = pd.DataFrame({
            "excess_ret": self.port_ret - self.factors["RF"],
            "mkt_excess": self.factors["Mkt-RF"],
        }).dropna()
        X = sm.add_constant(merged["mkt_excess"])
        model = sm.OLS(merged["excess_ret"], X).fit(cov_type="HC1")
        return model

    def run_fama_french(self):
        """Run Fama-French 3-factor regression."""
        merged = pd.DataFrame({
            "excess_ret": self.port_ret - self.factors["RF"],
            "Mkt_RF": self.factors["Mkt-RF"],
            "SMB": self.factors["SMB"],
            "HML": self.factors["HML"],
        }).dropna()
        X = sm.add_constant(merged[["Mkt_RF", "SMB", "HML"]])
        model = sm.OLS(merged["excess_ret"], X).fit(cov_type="HC1")
        return model

    def report(self):
        """Generate factor analysis report."""
        capm = self.run_capm()
        ff3 = self.run_fama_french()
        print("Factor Exposure Report")
        print("=" * 60)
        # CAPM results
        print("\n--- CAPM (Single Factor) ---")
        print(f" Alpha (daily): {capm.params['const']:.6f} "
              f"(t = {capm.tvalues['const']:.2f}, p = {capm.pvalues['const']:.4f})")
        print(f" Alpha (annual): {capm.params['const'] * 252:.2%}")
        print(f" Beta: {capm.params['mkt_excess']:.3f} "
              f"(t = {capm.tvalues['mkt_excess']:.2f})")
        print(f" R-squared: {capm.rsquared:.3f}")
        # FF3 results
        print("\n--- Fama-French 3-Factor ---")
        print(f" Alpha (daily): {ff3.params['const']:.6f} "
              f"(t = {ff3.tvalues['const']:.2f}, p = {ff3.pvalues['const']:.4f})")
        print(f" Alpha (annual): {ff3.params['const'] * 252:.2%}")
        print(f" Market Beta: {ff3.params['Mkt_RF']:.3f} (t = {ff3.tvalues['Mkt_RF']:.2f})")
        print(f" SMB (size): {ff3.params['SMB']:.3f} (t = {ff3.tvalues['SMB']:.2f})")
        print(f" HML (value): {ff3.params['HML']:.3f} (t = {ff3.tvalues['HML']:.2f})")
        print(f" R-squared: {ff3.rsquared:.3f}")
        # Interpretation
        beta = ff3.params["Mkt_RF"]
        smb = ff3.params["SMB"]
        hml = ff3.params["HML"]
        print("\n--- Interpretation ---")
        print(f" Market exposure: {'Aggressive' if beta > 1 else 'Defensive'} (beta = {beta:.2f})")
        print(f" Size tilt: {'Small-cap' if smb > 0.05 else 'Large-cap' if smb < -0.05 else 'Neutral'}")
        print(f" Value tilt: {'Value' if hml > 0.05 else 'Growth' if hml < -0.05 else 'Neutral'}")


# Example usage
# factor_analysis = FactorAnalysis(portfolio_returns, start_date="2023-01-01")
# factor_analysis.report()
```
A positive alpha in a CAPM regression does not necessarily mean skill. It might mean you have exposure to omitted factors (size, value, momentum). Always run the Fama-French model as a robustness check. If the CAPM alpha disappears when you add SMB and HML, your "alpha" was just disguised factor exposure. True alpha should survive the inclusion of all known factors.
22.5 — Component 5: Macro Context
No portfolio exists in a vacuum. The macroeconomic environment determines the backdrop for all investment returns. This component pulls the key macro indicators from FRED and provides a concise summary of where the economy stands.
Key Macro Indicators for Investors
| Indicator | Why It Matters | What to Watch |
|---|---|---|
| Real GDP Growth | Overall economic health | Negative growth = recession; watch for deceleration |
| Core PCE Inflation | Fed's target measure | Above 2%: Fed likely tightening. Below 2%: Fed may ease. |
| Unemployment Rate | Labor market health | Rising unemployment often precedes equity weakness |
| Fed Funds Rate | Cost of money | Rate hikes: headwind for bonds and growth stocks |
| 10Y−2Y Spread | Recession predictor | Inversion (negative) historically precedes recessions |
| VIX | Market fear gauge | >30: high fear; <15: complacency |
```python
# Component 5: Macro Context Dashboard
import pandas as pd
import numpy as np
from pandas_datareader import data as pdr
from datetime import datetime, timedelta
import yfinance as yf


class MacroContext:
    """Pull and summarize macroeconomic context."""

    SERIES = {
        "GDP Growth (Q/Q Ann.)": "A191RL1Q225SBEA",
        "Core PCE Inflation": "PCEPILFE",
        "CPI Inflation": "CPIAUCSL",
        "Unemployment Rate": "UNRATE",
        "Fed Funds Rate": "FEDFUNDS",
        "10Y Treasury": "GS10",
        "2Y Treasury": "GS2",
        "Initial Claims": "ICSA",
    }

    def __init__(self):
        self.data = {}
        self._fetch_data()

    def _fetch_data(self):
        """Download latest data from FRED."""
        start = (datetime.now() - timedelta(days=365 * 3)).strftime("%Y-%m-%d")
        for name, code in self.SERIES.items():
            try:
                df = pdr.get_data_fred(code, start=start)
                self.data[name] = df.iloc[:, 0]
            except Exception as e:
                print(f" Warning: Could not fetch {name}: {e}")
        # VIX from Yahoo Finance
        try:
            vix = yf.download("^VIX", start=start, progress=False)
            # .squeeze() guards against MultiIndex columns in newer yfinance
            self.data["VIX"] = vix["Close"].squeeze()
        except Exception:
            pass

    def yield_curve_status(self):
        """Determine yield curve status."""
        if "10Y Treasury" in self.data and "2Y Treasury" in self.data:
            spread = (self.data["10Y Treasury"].dropna().iloc[-1]
                      - self.data["2Y Treasury"].dropna().iloc[-1])
            if spread < -0.5:
                return spread, "DEEPLY INVERTED — Strong recession signal"
            elif spread < 0:
                return spread, "INVERTED — Recession warning"
            elif spread < 0.5:
                return spread, "FLAT — Late cycle, watch closely"
            else:
                return spread, "NORMAL — No immediate recession signal"
        return None, "Data unavailable"

    def inflation_status(self):
        """Compute year-over-year inflation."""
        if "CPI Inflation" in self.data:
            cpi = self.data["CPI Inflation"].dropna()
            yoy = (cpi.iloc[-1] / cpi.iloc[-13] - 1) * 100 if len(cpi) > 13 else None
            return yoy
        return None

    def report(self):
        """Print macro context summary."""
        print("Macroeconomic Context")
        print("=" * 60)
        for name, series in self.data.items():
            s = series.dropna()
            if len(s) > 0:
                print(f" {name:30s}: {s.iloc[-1]:>8.2f} ({s.index[-1].date()})")
        spread, status = self.yield_curve_status()
        if spread is not None:
            print(f"\n Yield Curve (10Y-2Y): {spread:+.2f}pp")
            print(f" Status: {status}")
        infl = self.inflation_status()
        if infl is not None:
            target = ("ABOVE TARGET" if infl > 2.5
                      else "NEAR TARGET" if infl > 1.5 else "BELOW TARGET")
            print(f"\n CPI Inflation (YoY): {infl:.1f}% — {target}")


# Example usage
# macro = MacroContext()
# macro.report()
```
The macro dashboard is a multivariate monitoring system, analogous to a statistical process control (SPC) chart in quality engineering. Each indicator has a "normal" range, and deviations trigger alerts. The challenge is that the indicators are correlated (inflation and the Fed rate move together), so you need to think about the joint distribution, not just marginals. A macro regime is a point in this multivariate space.
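In that spirit, here is a minimal univariate sketch of an SPC-style alert rule (the data and threshold are hypothetical, and it deliberately ignores the cross-correlations just discussed): flag any indicator whose latest reading sits more than k standard deviations from its own recent baseline.

```python
# Sketch: flag macro indicators far outside their recent range, in the
# spirit of an SPC control chart. All values here are hypothetical.
import pandas as pd

history = pd.DataFrame({
    "Unemployment": [3.6, 3.7, 3.5, 3.6, 3.8, 3.7, 3.6, 4.4],  # last value jumps
    "Fed Funds":    [5.2, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3],
})

def spc_alerts(df: pd.DataFrame, k: float = 2.0) -> dict:
    """Return {indicator: z-score} for indicators whose latest value is
    more than k standard deviations from the mean of the prior values."""
    zscores = {}
    for col in df.columns:
        baseline, latest = df[col].iloc[:-1], df[col].iloc[-1]
        zscores[col] = (latest - baseline.mean()) / baseline.std()
    return {c: z for c, z in zscores.items() if abs(z) > k}

print(spc_alerts(history))
```

On this toy data only Unemployment trips the alert; a real monitor would model the joint behavior of the indicators, as noted above.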
22.6 — Component 6: News Sentiment Monitor
The final component applies the critical reading skills from Module 20. Rather than reading every headline manually, you can automate sentiment measurement across a stream of financial news.
Approach: Loughran-McDonald Financial Sentiment
General-purpose sentiment lexicons (like VADER) perform poorly on financial text because many words that sound negative or positive in everyday usage are neutral terms of art in finance. The Loughran-McDonald dictionary was developed specifically for financial text analysis.
| Word | General Sentiment | Financial Sentiment (L-M) |
|---|---|---|
| "liability" | Negative | Neutral (accounting term) |
| "outstanding" | Positive | Neutral ("shares outstanding") |
| "capital" | Neutral | Neutral (financial term) |
| "crude" | Negative | Neutral ("crude oil") |
| "tax" | Negative | Neutral (financial reporting) |
```python
# Component 6: News Sentiment Monitor
import numpy as np
import pandas as pd
import re


class SentimentMonitor:
    """Simple financial news sentiment analysis."""

    # Loughran-McDonald inspired financial sentiment words (subset)
    POSITIVE = {
        'beat', 'surge', 'rally', 'gain', 'profit', 'growth', 'strong',
        'upgrade', 'outperform', 'record', 'boom', 'bullish', 'optimistic',
        'recovery', 'rebound', 'improve', 'expand', 'opportunity', 'exceed',
        'advance', 'favorable', 'positive', 'upturn',
    }
    NEGATIVE = {
        'loss', 'crash', 'plunge', 'fear', 'decline', 'risk', 'recession',
        'downgrade', 'miss', 'weak', 'selloff', 'bearish', 'pessimistic',
        'crisis', 'default', 'bankrupt', 'turmoil', 'layoff', 'volatility',
        'slump', 'adverse', 'negative', 'downturn',
    }
    UNCERTAINTY = {
        'uncertain', 'risk', 'volatile', 'unpredictable', 'unclear',
        'turbulent', 'instability', 'doubt', 'speculation', 'caution',
    }

    def analyze_headline(self, headline: str) -> dict:
        """Analyze sentiment of a single headline."""
        words = set(re.findall(r'\b\w+\b', headline.lower()))
        pos = len(words & self.POSITIVE)
        neg = len(words & self.NEGATIVE)
        unc = len(words & self.UNCERTAINTY)
        total = pos + neg
        score = (pos - neg) / total if total > 0 else 0.0
        return {
            "headline": headline,
            "score": score,
            "positive_count": pos,
            "negative_count": neg,
            "uncertainty_count": unc,
            "label": ("Positive" if score > 0.1
                      else "Negative" if score < -0.1 else "Neutral"),
        }

    def analyze_batch(self, headlines: list[str]) -> pd.DataFrame:
        """Analyze a batch of headlines."""
        results = [self.analyze_headline(h) for h in headlines]
        return pd.DataFrame(results)

    def aggregate_sentiment(self, df: pd.DataFrame) -> dict:
        """Compute aggregate sentiment statistics."""
        return {
            "mean_score": df["score"].mean(),
            "median_score": df["score"].median(),
            "std_score": df["score"].std(),
            "pct_positive": (df["label"] == "Positive").mean(),
            "pct_negative": (df["label"] == "Negative").mean(),
            "pct_neutral": (df["label"] == "Neutral").mean(),
            "avg_uncertainty": df["uncertainty_count"].mean(),
        }


# Example usage with sample headlines
monitor = SentimentMonitor()
sample_headlines = [
    "S&P 500 surges to record high on strong earnings growth",
    "Federal Reserve signals caution amid recession fears",
    "Tech stocks rally as AI boom drives profit expansion",
    "Bank layoffs accelerate as credit risk mounts",
    "Markets advance on positive jobs data despite inflation uncertainty",
    "Oil prices plunge on weak demand outlook",
    "Housing market shows signs of recovery after slump",
    "Volatile trading session ends with modest gains",
]
results = monitor.analyze_batch(sample_headlines)
agg = monitor.aggregate_sentiment(results)

print("News Sentiment Analysis")
print("=" * 60)
for _, row in results.iterrows():
    print(f" [{row['label']:>8s}] (score: {row['score']:+.2f}) {row['headline'][:55]}")
print("\nAggregate Sentiment:")
print(f" Mean score: {agg['mean_score']:+.3f}")
print(f" Positive: {agg['pct_positive']:.0%}")
print(f" Negative: {agg['pct_negative']:.0%}")
print(f" Neutral: {agg['pct_neutral']:.0%}")
print(f" Avg uncertainty: {agg['avg_uncertainty']:.1f} words/headline")
```
Automated sentiment analysis is a noisy classifier that gives you a rough directional read, not a precise signal. Its value is in aggregation: the sentiment of one headline is meaningless, but the average sentiment across 100 headlines over a week provides a useful summary of the narrative environment. Think of it as a sufficient statistic for the mood of the financial press.
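The √n logic behind that claim can be sketched with simulated scores (all parameters hypothetical): if each headline's score is the true "mood" plus independent noise, the standard deviation of the average shrinks roughly as 1/√n.

```python
# Sketch: averaging noisy headline scores pins down the underlying mood.
# mu and noise_sd are hypothetical; each headline score = mu + noise.
import numpy as np

rng = np.random.default_rng(7)
mu, noise_sd = 0.10, 0.50
spread = {}
for n in (1, 10, 100):
    # Simulate 5000 "days", each averaging n headline scores
    daily_means = rng.normal(mu, noise_sd, size=(5000, n)).mean(axis=1)
    spread[n] = daily_means.std()
    print(f"n = {n:3d} headlines: std of mean score = {spread[n]:.3f}")
# The dispersion shrinks roughly as noise_sd / sqrt(n): one headline is
# mostly noise, while a hundred give a usable read on the mood.
```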
22.7 — Putting It All Together: The Complete Dashboard
Now we assemble all six components into a single script that generates a comprehensive financial dashboard report. This is the culmination of the entire course.
```python
#!/usr/bin/env python3
"""
Financial Dashboard for Statisticians
======================================

A comprehensive personal finance dashboard that combines:
  1. Portfolio tracking
  2. Risk metrics
  3. Correlation monitoring
  4. Factor exposure analysis
  5. Macroeconomic context
  6. News sentiment

Generates an HTML report with charts and tables.

Requirements:
    pip install yfinance pandas numpy matplotlib seaborn
    pip install statsmodels scipy pandas-datareader scikit-learn
"""
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import statsmodels.api as sm
from datetime import datetime, timedelta
import base64
from io import BytesIO
import warnings

warnings.filterwarnings("ignore")

# ─── Configuration ───────────────────────────────────────
PORTFOLIO = [
    {"ticker": "VTI", "shares": 100, "cost_basis": 200.00, "name": "US Total Market"},
    {"ticker": "VXUS", "shares": 80, "cost_basis": 55.00, "name": "Intl Stocks"},
    {"ticker": "BND", "shares": 150, "cost_basis": 75.00, "name": "US Bonds"},
    {"ticker": "VNQ", "shares": 40, "cost_basis": 85.00, "name": "REITs"},
    {"ticker": "GLD", "shares": 30, "cost_basis": 170.00, "name": "Gold"},
]
LOOKBACK_PERIOD = "2y"
RISK_FREE_RATE = 0.05
ROLLING_WINDOW = 63  # ~3 months

# ─── Helper: embed matplotlib figure as base64 ──────────
def fig_to_base64(fig):
    """Convert matplotlib figure to base64 for HTML embedding."""
    buf = BytesIO()
    fig.savefig(buf, format="png", dpi=120, bbox_inches="tight")
    buf.seek(0)
    b64 = base64.b64encode(buf.read()).decode("utf-8")
    plt.close(fig)
    return f'<img src="data:image/png;base64,{b64}" style="max-width:100%;">'

# ─── 1. Portfolio Tracker ────────────────────────────────
print("[1/6] Fetching portfolio data...")
holdings = pd.DataFrame(PORTFOLIO)
tickers = holdings["ticker"].tolist()

# Current prices
current_prices = {}
for t in tickers:
    hist = yf.Ticker(t).history(period="1d")
    if not hist.empty:
        current_prices[t] = hist["Close"].iloc[-1]

holdings["current_price"] = holdings["ticker"].map(current_prices)
holdings["market_value"] = holdings["shares"] * holdings["current_price"]
holdings["cost_value"] = holdings["shares"] * holdings["cost_basis"]
holdings["gain_loss"] = holdings["market_value"] - holdings["cost_value"]
holdings["return_pct"] = (holdings["current_price"] / holdings["cost_basis"] - 1) * 100
total_value = holdings["market_value"].sum()
holdings["weight"] = holdings["market_value"] / total_value * 100

# Historical prices
price_history = pd.DataFrame()
for t in tickers:
    df = yf.download(t, period=LOOKBACK_PERIOD, progress=False)
    price_history[t] = df["Close"]
price_history = price_history.dropna()
returns = price_history.pct_change().dropna()

# Portfolio returns (weighted)
weights = holdings.set_index("ticker")["weight"].reindex(tickers) / 100
portfolio_returns = (returns * weights).sum(axis=1)

# Allocation pie chart
fig_alloc, ax = plt.subplots(figsize=(6, 6))
colors = ['#3182ce', '#38a169', '#d69e2e', '#e53e3e', '#805ad5']
ax.pie(holdings["weight"], labels=holdings["name"], autopct='%1.1f%%',
       colors=colors, startangle=90)
ax.set_title("Portfolio Allocation")
alloc_chart = fig_to_base64(fig_alloc)

# Cumulative return chart
fig_cum, ax = plt.subplots(figsize=(10, 5))
cum = (1 + portfolio_returns).cumprod()
ax.plot(cum.index, cum.values, color='#3182ce', linewidth=1.5)
ax.fill_between(cum.index, 1, cum.values, alpha=0.1, color='#3182ce')
ax.axhline(1, color='gray', linestyle='--', alpha=0.5)
ax.set_title("Cumulative Portfolio Return")
ax.set_ylabel("Growth of $1")
cum_chart = fig_to_base64(fig_cum)

# ─── 2. Risk Metrics ─────────────────────────────────────
print("[2/6] Computing risk metrics...")
rf_daily = (1 + RISK_FREE_RATE) ** (1 / 252) - 1
excess = portfolio_returns - rf_daily
ann_return = (1 + portfolio_returns).prod() ** (252 / len(portfolio_returns)) - 1
ann_vol = portfolio_returns.std() * np.sqrt(252)
sharpe = excess.mean() / excess.std() * np.sqrt(252)
downside = excess[excess < 0]
sortino = excess.mean() * 252 / (np.sqrt((downside ** 2).mean()) * np.sqrt(252))
var_95 = -np.percentile(portfolio_returns, 5)
cvar_95 = -portfolio_returns[portfolio_returns <= -var_95].mean()
cum_ret = (1 + portfolio_returns).cumprod()
max_dd = -((cum_ret - cum_ret.cummax()) / cum_ret.cummax()).min()
skew = portfolio_returns.skew()
kurt = portfolio_returns.kurtosis()
jb_stat, jb_p = stats.jarque_bera(portfolio_returns)

# Drawdown chart
fig_dd, ax = plt.subplots(figsize=(10, 4))
drawdown = (cum_ret - cum_ret.cummax()) / cum_ret.cummax() * 100
ax.fill_between(drawdown.index, drawdown.values, 0, color='#e53e3e', alpha=0.4)
ax.plot(drawdown.index, drawdown.values, color='#e53e3e', linewidth=0.8)
ax.set_title("Portfolio Drawdown")
ax.set_ylabel("Drawdown (%)")
dd_chart = fig_to_base64(fig_dd)

# Return distribution
fig_dist, ax = plt.subplots(figsize=(10, 5))
ax.hist(portfolio_returns * 100, bins=60, density=True, alpha=0.6,
        color='#3182ce', edgecolor='white', label='Actual')
x = np.linspace(portfolio_returns.min() * 100, portfolio_returns.max() * 100, 200)
ax.plot(x, stats.norm.pdf(x, portfolio_returns.mean() * 100,
                          portfolio_returns.std() * 100),
        color='coral', linewidth=2, label='Normal fit')
ax.axvline(-var_95 * 100, color='red', linestyle='--',
           label=f'95% VaR ({var_95:.2%})')
ax.set_title("Return Distribution vs Normal")
ax.set_xlabel("Daily Return (%)")
ax.legend()
dist_chart = fig_to_base64(fig_dist)

# ─── 3. Correlation Monitor ──────────────────────────────
print("[3/6] Computing correlations...")
corr_matrix = returns.corr()
avg_corr_vals = corr_matrix.where(
    np.triu(np.ones_like(corr_matrix, dtype=bool), k=1)
).stack()
avg_corr = avg_corr_vals.mean()
n = len(tickers)
eff_bets = n / (1 + (n - 1) * avg_corr)

# Heatmap
fig_heatmap, ax = plt.subplots(figsize=(7, 6))
sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap="RdBu_r", center=0,
            vmin=-1, vmax=1, ax=ax, square=True, linewidths=0.5)
ax.set_title("Correlation Matrix")
heatmap_chart = fig_to_base64(fig_heatmap)

# Rolling average correlation
rolling_corrs = []
for end in range(ROLLING_WINDOW, len(returns)):
    window = returns.iloc[end - ROLLING_WINDOW:end]
    c = window.corr()
    mask = np.triu(np.ones_like(c, dtype=bool), k=1)
    rolling_corrs.append(c.where(mask).stack().mean())
rolling_corr_series = pd.Series(rolling_corrs, index=returns.index[ROLLING_WINDOW:])

fig_rcorr, ax = plt.subplots(figsize=(10, 4))
ax.plot(rolling_corr_series.index, rolling_corr_series.values, color='#805ad5')
ax.axhline(avg_corr, color='gray', linestyle='--', alpha=0.5,
           label=f'Mean: {avg_corr:.2f}')
ax.set_title("Rolling Average Pairwise Correlation")
ax.legend()
rcorr_chart = fig_to_base64(fig_rcorr)

# ─── 4. Factor Exposure ──────────────────────────────────
print("[4/6] Running factor regressions...")
try:
    from pandas_datareader import data as pdr_data
    ff3 = pdr_data.DataReader("F-F_Research_Data_Factors_daily", "famafrench",
                              start=price_history.index[0].strftime("%Y-%m-%d"))
    factors = ff3[0] / 100
    merged = pd.DataFrame({
        "port_excess": portfolio_returns - factors["RF"],
        "Mkt_RF": factors["Mkt-RF"],
        "SMB": factors["SMB"],
        "HML": factors["HML"],
    }).dropna()
    # CAPM
    X_capm = sm.add_constant(merged["Mkt_RF"])
    capm = sm.OLS(merged["port_excess"], X_capm).fit(cov_type="HC1")
    # Fama-French 3
    X_ff3 = sm.add_constant(merged[["Mkt_RF", "SMB", "HML"]])
    ff3_model = sm.OLS(merged["port_excess"], X_ff3).fit(cov_type="HC1")
    factor_data_available = True
except Exception as e:
    print(f" Factor data unavailable: {e}")
    factor_data_available = False

# ─── 5. Macro Context ────────────────────────────────────
print("[5/6] Fetching macro data...")
macro_data = {}
try:
    from pandas_datareader import data as pdr_data
    macro_series = {
        "GDP Growth": "A191RL1Q225SBEA",
        "Unemployment": "UNRATE",
        "Fed Funds": "FEDFUNDS",
        "10Y Yield": "GS10",
        "2Y Yield": "GS2",
        "CPI": "CPIAUCSL",
    }
    start = (datetime.now() - timedelta(days=365 * 2)).strftime("%Y-%m-%d")
    for name, code in macro_series.items():
        try:
            df = pdr_data.get_data_fred(code, start=start)
            macro_data[name] = df.iloc[:, 0].dropna().iloc[-1]
        except Exception:
            pass
    macro_available = True
except Exception:
    macro_available = False

# ─── 6.
```
Assemble HTML Report ───────────────────────────── print("[6/6] Generating HTML report...") report_date = datetime.now().strftime("%B %d, %Y at %H:%M") html = f"""<!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <title>Financial Dashboard Report</title> <style> body {{ font-family: 'Inter', sans-serif; max-width: 900px; margin: 0 auto; padding: 2rem; background: #f7fafc; color: #2d3748; }} h1 {{ color: #1a365d; border-bottom: 3px solid #e53e3e; padding-bottom: 0.5rem; }} h2 {{ color: #2c5282; margin-top: 2rem; }} table {{ width: 100%; border-collapse: collapse; margin: 1rem 0; }} th {{ background: #1a365d; color: white; padding: 0.6rem; text-align: left; }} td {{ padding: 0.5rem; border-bottom: 1px solid #e2e8f0; }} .positive {{ color: #38a169; font-weight: 600; }} .negative {{ color: #e53e3e; font-weight: 600; }} .metric {{ background: white; padding: 1rem; border-radius: 8px; border: 1px solid #e2e8f0; margin: 0.5rem 0; }} .grid {{ display: grid; grid-template-columns: 1fr 1fr; gap: 1rem; }} .chart {{ text-align: center; margin: 1.5rem 0; }} .footer {{ text-align: center; color: #718096; margin-top: 3rem; padding-top: 1rem; border-top: 1px solid #e2e8f0; }} </style> </head> <body> <h1>Financial Dashboard Report</h1> <p>Generated: {report_date}</p> <h2>1. 
Portfolio Holdings</h2> <table> <tr><th>Asset</th><th>Ticker</th><th>Shares</th><th>Price</th> <th>Value</th><th>Gain/Loss</th><th>Return</th><th>Weight</th></tr> """ for _, row in holdings.iterrows(): gl_class = "positive" if row["gain_loss"] >= 0 else "negative" html += f"""<tr> <td>{row['name']}</td><td>{row['ticker']}</td> <td>{row['shares']}</td><td>${row['current_price']:.2f}</td> <td>${row['market_value']:,.2f}</td> <td class="{gl_class}">${row['gain_loss']:+,.2f}</td> <td class="{gl_class}">{row['return_pct']:+.1f}%</td> <td>{row['weight']:.1f}%</td> </tr>""" total_gl = holdings["gain_loss"].sum() total_ret = (total_value / holdings["cost_value"].sum() - 1) * 100 gl_class = "positive" if total_gl >= 0 else "negative" html += f"""<tr style="font-weight:700; border-top:2px solid #1a365d;"> <td colspan="4">TOTAL</td> <td>${total_value:,.2f}</td> <td class="{gl_class}">${total_gl:+,.2f}</td> <td class="{gl_class}">{total_ret:+.1f}%</td> <td>100.0%</td> </tr></table> <div class="grid"> <div class="chart">{alloc_chart}</div> <div class="chart">{cum_chart}</div> </div> <h2>2. Risk Metrics</h2> <div class="grid"> <div class="metric">Annualized Return: <strong>{ann_return:.2%}</strong></div> <div class="metric">Annualized Volatility: <strong>{ann_vol:.2%}</strong></div> <div class="metric">Sharpe Ratio: <strong>{sharpe:.3f}</strong></div> <div class="metric">Sortino Ratio: <strong>{sortino:.3f}</strong></div> <div class="metric">VaR (95%): <strong>{var_95:.2%}</strong></div> <div class="metric">CVaR (95%): <strong>{cvar_95:.2%}</strong></div> <div class="metric">Max Drawdown: <strong class="negative">{max_dd:.2%}</strong></div> <div class="metric">Skewness: <strong>{skew:.3f}</strong> | Kurtosis: <strong>{kurt:.3f}</strong></div> </div> <div class="grid"> <div class="chart">{dd_chart}</div> <div class="chart">{dist_chart}</div> </div> <p>Jarque-Bera: {jb_stat:.1f} (p = {jb_p:.4f}) — Normality {'REJECTED' if jb_p < 0.05 else 'not rejected'} at 5%</p> <h2>3. 
Correlation Analysis</h2> <p>Average pairwise correlation: <strong>{avg_corr:.3f}</strong> | Effective independent bets: <strong>{eff_bets:.1f}</strong> out of {n}</p> <div class="grid"> <div class="chart">{heatmap_chart}</div> <div class="chart">{rcorr_chart}</div> </div> """ # Factor exposure section if factor_data_available: html += f""" <h2>4. Factor Exposure</h2> <table> <tr><th>Model</th><th>Factor</th><th>Coefficient</th><th>t-stat</th><th>p-value</th></tr> <tr><td>CAPM</td><td>Alpha (annual)</td> <td>{capm.params['const']*252:.2%}</td> <td>{capm.tvalues['const']:.2f}</td> <td>{capm.pvalues['const']:.4f}</td></tr> <tr><td>CAPM</td><td>Market Beta</td> <td>{capm.params['Mkt_RF']:.3f}</td> <td>{capm.tvalues['Mkt_RF']:.2f}</td> <td>{capm.pvalues['Mkt_RF']:.4f}</td></tr> <tr><td colspan="5" style="border-top:2px solid #e2e8f0;"> CAPM R-squared: {capm.rsquared:.3f}</td></tr> <tr><td>FF3</td><td>Alpha (annual)</td> <td>{ff3_model.params['const']*252:.2%}</td> <td>{ff3_model.tvalues['const']:.2f}</td> <td>{ff3_model.pvalues['const']:.4f}</td></tr> <tr><td>FF3</td><td>Market Beta</td> <td>{ff3_model.params['Mkt_RF']:.3f}</td> <td>{ff3_model.tvalues['Mkt_RF']:.2f}</td> <td>{ff3_model.pvalues['Mkt_RF']:.4f}</td></tr> <tr><td>FF3</td><td>SMB (Size)</td> <td>{ff3_model.params['SMB']:.3f}</td> <td>{ff3_model.tvalues['SMB']:.2f}</td> <td>{ff3_model.pvalues['SMB']:.4f}</td></tr> <tr><td>FF3</td><td>HML (Value)</td> <td>{ff3_model.params['HML']:.3f}</td> <td>{ff3_model.tvalues['HML']:.2f}</td> <td>{ff3_model.pvalues['HML']:.4f}</td></tr> <tr><td colspan="5" style="border-top:2px solid #e2e8f0;"> FF3 R-squared: {ff3_model.rsquared:.3f}</td></tr> </table> """ # Macro context section if macro_available and macro_data: spread_val = macro_data.get("10Y Yield", 0) - macro_data.get("2Y Yield", 0) curve_status = "INVERTED" if spread_val < 0 else "Normal" html += f""" <h2>5. 
Macroeconomic Context</h2> <div class="grid"> <div class="metric">GDP Growth: <strong>{macro_data.get('GDP Growth', 'N/A'):.1f}%</strong></div> <div class="metric">Unemployment: <strong>{macro_data.get('Unemployment', 'N/A'):.1f}%</strong></div> <div class="metric">Fed Funds Rate: <strong>{macro_data.get('Fed Funds', 'N/A'):.2f}%</strong></div> <div class="metric">Yield Curve (10Y-2Y): <strong>{spread_val:+.2f}pp</strong> ({curve_status})</div> </div> """ html += f""" <div class="footer"> <p>Financial Dashboard for Statisticians | Generated by the Module 22 Capstone Script</p> <p>Data sources: Yahoo Finance, FRED, Kenneth French Data Library</p> </div> </body></html>""" # Save the report output_file = "financial_dashboard_report.html" with open(output_file, "w") as f: f.write(html) print(f"\nDashboard report saved to: {output_file}") print(f"Open it in a browser to view the full interactive report.")
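Before trusting the numbers in a generated report, it is worth re-deriving a few of them by hand. The sketch below recomputes three of the report's statistics — historical 95% VaR, maximum drawdown, and the "effective independent bets" figure n / (1 + (n − 1) · ρ̄) — on a small synthetic return series, using the same formulas as the script. The data here is simulated, not drawn from any real portfolio.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic daily returns for a 5-asset, equal-weight portfolio (simulated)
n_days, n_assets = 500, 5
returns = pd.DataFrame(rng.normal(0.0004, 0.01, size=(n_days, n_assets)))
portfolio_returns = returns.mean(axis=1)

# Historical 95% VaR: the loss at the 5th percentile of daily returns
var_95 = -np.percentile(portfolio_returns, 5)

# Maximum drawdown: largest peak-to-trough fall of cumulative wealth
cum = (1 + portfolio_returns).cumprod()
max_dd = -((cum - cum.cummax()) / cum.cummax()).min()

# Effective independent bets: n / (1 + (n - 1) * average pairwise correlation)
corr = returns.corr()
avg_corr = corr.where(np.triu(np.ones_like(corr, dtype=bool), k=1)).stack().mean()
eff_bets = n_assets / (1 + (n_assets - 1) * avg_corr)

print(f"95% VaR: {var_95:.2%} | Max drawdown: {max_dd:.2%} | Effective bets: {eff_bets:.2f}")
```

With uncorrelated simulated assets, the effective-bets figure lands near n = 5; as ρ̄ → 1 it collapses toward 1, which is exactly the loss of diversification that Component 3 is built to monitor.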
22.8 — Deploying as a Web Application
The script above generates a static HTML report. For a more interactive experience, you can deploy it as a live web application using one of several Python frameworks.
Option A: Streamlit (Easiest)
Streamlit is a Python framework designed specifically for data dashboards. It turns Python scripts into interactive web apps with minimal code changes.
```python
# dashboard_streamlit.py
# Run with: streamlit run dashboard_streamlit.py
import streamlit as st
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

st.set_page_config(page_title="Financial Dashboard", layout="wide")
st.title("Financial Dashboard for Statisticians")

# Sidebar: portfolio input
st.sidebar.header("Portfolio Holdings")
n_assets = st.sidebar.number_input("Number of assets", 1, 20, 5)

holdings = []
for i in range(n_assets):
    col1, col2, col3 = st.sidebar.columns(3)
    ticker = col1.text_input(f"Ticker {i+1}",
                             value=["VTI", "VXUS", "BND", "VNQ", "GLD"][i] if i < 5 else "")
    shares = col2.number_input(f"Shares {i+1}", value=100, key=f"shares_{i}")
    cost = col3.number_input(f"Cost {i+1}", value=100.0, key=f"cost_{i}")
    if ticker:
        holdings.append({"ticker": ticker, "shares": shares, "cost_basis": cost})

if st.sidebar.button("Update Dashboard") and holdings:
    # ... (insert dashboard computation logic here)
    # Use st.metric(), st.dataframe(), st.pyplot() for display
    col1, col2, col3 = st.columns(3)
    col1.metric("Portfolio Value", f"${total_value:,.0f}")
    col2.metric("Sharpe Ratio", f"{sharpe:.2f}")
    col3.metric("Max Drawdown", f"{max_dd:.1%}")
    st.pyplot(fig_heatmap)
    st.dataframe(holdings_df)
```
Option B: Plotly Dash (More Control)
Dash, by Plotly, provides more control over layout and interactivity than Streamlit. It uses a callback-based architecture in the spirit of reactive programming: you declare which outputs depend on which inputs, and Dash re-renders those outputs whenever the inputs change.
```python
# dashboard_dash.py
# Run with: python dashboard_dash.py
import dash
from dash import dcc, html, dash_table
from dash.dependencies import Input, Output
import plotly.express as px
import plotly.graph_objects as go

app = dash.Dash(__name__)

app.layout = html.Div([
    html.H1("Financial Dashboard for Statisticians"),

    # Portfolio input table
    dash_table.DataTable(
        id="portfolio-table",
        columns=[
            {"name": "Ticker", "id": "ticker", "editable": True},
            {"name": "Shares", "id": "shares", "type": "numeric", "editable": True},
            {"name": "Cost Basis", "id": "cost_basis", "type": "numeric", "editable": True},
        ],
        data=[
            {"ticker": "VTI", "shares": 100, "cost_basis": 200},
            {"ticker": "BND", "shares": 150, "cost_basis": 75},
        ],
        row_deletable=True,
    ),
    html.Button("Add Row", id="add-row-btn"),

    # Charts
    dcc.Graph(id="allocation-pie"),
    dcc.Graph(id="cumulative-return"),
    dcc.Graph(id="correlation-heatmap"),

    # Auto-refresh every 5 minutes
    dcc.Interval(id="interval", interval=5*60*1000),
])

# Add callbacks for interactivity
# @app.callback(Output('allocation-pie', 'figure'),
#               Input('portfolio-table', 'data'))
# def update_pie(data): ...

if __name__ == "__main__":
    app.run(debug=True)
```
Deployment Comparison
| Feature | Static HTML (this module) | Streamlit | Plotly Dash |
|---|---|---|---|
| Setup difficulty | None (just run the script) | Easy (pip install streamlit) | Moderate |
| Interactivity | None (static report) | High (widgets, sliders) | Very high (custom callbacks) |
| Sharing | Email the HTML file | Streamlit Cloud (free) | Heroku, AWS, etc. |
| Live data | Snapshot at generation time | Refreshes on interaction | Periodic auto-refresh |
| Best for | Personal use, quick reports | Rapid prototyping, sharing | Production applications |
22.9 — Course Summary: What You Have Learned
Over 22 modules, you have built a comprehensive framework for understanding finance through statistics. Here is a summary of the journey:
Part I: Financial Data as a Statistician Sees It
- Module 1: Financial data structures (OHLCV, time series, data sources)
- Module 2: Returns vs. prices; stationarity; log returns; random walks
- Module 3: Fat tails, excess kurtosis, volatility clustering, stylized facts
- Module 4: Correlation in finance; rolling correlations; copulas
Part II: Risk and Return
- Module 5: Volatility as a measure of risk; GARCH models
- Module 6: Value at Risk and Expected Shortfall; quantile regression
- Module 7: The risk-return tradeoff; Sharpe ratio; efficient frontiers
- Module 8: Extreme value theory; tail risk; Black Swan events
Part III: Portfolio Construction and Factor Models
- Module 9: Modern Portfolio Theory; mean-variance optimization
- Module 10: CAPM; beta; the security market line
- Module 11: Fama-French factors; multi-factor models
- Module 12: Factor investing; smart beta; portfolio tilts
Part IV: Advanced Topics
- Module 13: Fixed income; bond math; duration and convexity
- Module 14: Options and derivatives; Black-Scholes; the Greeks
- Module 15: Time series models for finance (ARIMA, GARCH, regime-switching)
- Module 16: Machine learning in finance; overfitting; cross-validation
- Module 17: Behavioral finance; cognitive biases; prospect theory
- Module 18: Performance evaluation; attribution analysis
- Module 19: Personal finance; tax optimization; retirement planning
Part V: Context and Capstone
- Module 20: Reading financial news statistically; debunking headlines
- Module 21: Macroeconomics; GDP, inflation, yield curves, business cycles
- Module 22: This capstone — building your own financial dashboard
The entire course can be summarized as a mapping from statistics to finance: distributions become return profiles, estimation becomes valuation, hypothesis testing becomes performance evaluation, regression becomes factor modeling, time series becomes forecasting, Bayesian inference becomes belief updating, and causal inference becomes policy analysis. You already had the tools — now you have the financial vocabulary to apply them.
22.10 — Where to Go from Here
This course has given you the foundation. Here are pathways for deeper exploration:
For Deeper Financial Knowledge
| Topic | Resource | Prerequisite from This Course |
|---|---|---|
| Quantitative finance | Hull, Options, Futures, and Other Derivatives | Modules 13–14 |
| Financial econometrics | Campbell, Lo & MacKinlay, The Econometrics of Financial Markets | Modules 1–8, 15 |
| Factor investing | Ang, Asset Management | Modules 9–12 |
| Behavioral finance | Shiller, Irrational Exuberance | Module 17 |
| Machine learning in finance | Lopez de Prado, Advances in Financial Machine Learning | Module 16 |
For Hands-On Practice
- Build a real dashboard: Take the capstone script, customize it for your actual portfolio, and deploy it with Streamlit.
- Replicate a paper: Pick a financial economics paper and reproduce its results. Start with Fama & French (1993) or Jegadeesh & Titman (1993).
- Kaggle competitions: Financial prediction competitions let you test your models against others.
- FRED exploration: Browse the FRED database (700,000+ time series) and explore macro relationships that interest you.
- Personal finance automation: Extend the dashboard to include your bank accounts, budgeting, and tax projections.
Key Principles to Remember
- Returns are not normal. Every model that assumes normality is wrong. Know how and when it is wrong enough to matter.
- Correlations are not constant. They spike during crises, exactly when you need diversification most.
- Past performance is not predictive. Most patterns in financial data are noise, not signal. Demand out-of-sample validation.
- Costs matter. Fees, taxes, and transaction costs erode returns. The cheapest portfolio often wins.
- Risk is not just volatility. It is also tail risk, drawdown risk, liquidity risk, and the risk of not meeting your goals.
- Simplicity beats complexity. A diversified index portfolio with low fees beats most active strategies. Your statistical edge is in understanding why, not in finding the next alpha signal.
- The biggest risks are behavioral. Panic selling, overconfidence, and chasing performance destroy more wealth than any market crash.
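The first principle is easy to verify for yourself. The sketch below simulates fat-tailed returns — a Student-t distribution with 3 degrees of freedom, a common stand-in for daily equity returns — rescaled to 1% daily volatility, and compares the empirical 1% tail with what a normal model of the same volatility predicts (z₀.₀₁ ≈ −2.326). All data here is simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fat-tailed "returns": Student-t with 3 df, rescaled to 1% daily volatility.
# (A t(3) variable has variance df/(df-2) = 3, hence the sqrt(3) divisor.)
t_returns = 0.01 * rng.standard_t(df=3, size=100_000) / np.sqrt(3)

# 1% tail: what actually happened vs. what a normal model with the
# same 1% volatility predicts
empirical_q01 = np.percentile(t_returns, 1)
normal_q01 = -2.326 * 0.01

print(f"empirical 1% quantile:    {empirical_q01:.4%}")
print(f"normal-model 1% quantile: {normal_q01:.4%}")
```

For a unit-variance t(3), the true 1% quantile sits near −2.62 standard deviations, so the simulated tail comes out roughly 13% deeper than the normal model's −2.33σ — a normality-based 99% VaR understates the loss by about that much, and the gap widens further out in the tail.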
Your statistical training is a genuine competitive advantage in personal finance. Most investors cannot compute a Sharpe ratio, do not understand what a confidence interval means, and confuse correlation with causation daily. You can. Use that advantage not to chase alpha, but to make better decisions, avoid common pitfalls, and build a robust financial plan grounded in evidence rather than narrative.
Finance is one of the richest applied domains for statistics. Every tool you have learned — from the t-test to Markov chains, from maximum likelihood to bootstrapping, from linear regression to Bayesian updating — has a direct application in understanding financial markets. The mapping is complete. Now go build something with it.