Statistical Arbitrage Research Platform

Live App: statistical-arbitrage-research-platform.streamlit.app · GitHub: github.com/rcodeborg2311/Statistical-Arbitrage-Research-Platform

Overview

A production-grade statistical arbitrage research platform that implements the full pairs trading pipeline: cointegration testing, signal generation, dynamic hedge ratio estimation, and risk analytics. The app is deployed on Streamlit Cloud.

Technical Highlights

Pairs Trading Backtester: Engineered using Engle-Granger and Johansen cointegration tests, an event-driven dollar-neutral simulation, and stateful Z-score signal generation, verified to be free of look-ahead bias by hand-traced unit tests.
Kalman Filter Hedge Ratios: Implemented a Kalman filter state-space model (predict-update recursion, configurable adaptation rate δ) for dynamic hedge ratio estimation. It replaces static OLS, whose beta goes stale during regime changes (e.g., the 2020 COVID volatility spike).
OU-MLE Closed-Form Estimation: Derived a closed-form maximum-likelihood estimator (MLE) for the Ornstein-Uhlenbeck (OU) process via sufficient statistics (O(n), no numerical optimisation) to estimate the mean-reversion speed κ, the equilibrium level μ, and the theoretical Sharpe ceiling √(2κ × 252) per Avellaneda & Lee (2010).
20+ Risk Metrics: Built metrics beyond the Sharpe ratio, including CVaR/Expected Shortfall (Basel III), Omega Ratio, HAC Sharpe with Newey-West standard errors (Andrews 1991 bandwidth), Ulcer Index, and CAPM residual Sharpe decomposition for market-neutral alpha isolation.
Risk-Parity Portfolio Engine: Constructed inverse-volatility weights with volatility targeting based on the full covariance matrix, a 2,000-path Monte Carlo simulation, and 6 historical stress scenarios (including the GFC, the COVID crash, the 2022 rate shock, and the 2018 VIX spike).
6-Tab Research Platform: Delivered parameter sensitivity heatmaps, transaction cost sweep curves, and a full Knowledge Base. The interface uses theme-adaptive CSS built on Streamlit CSS variables, so it renders correctly in both light and dark mode.
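
The predict-update recursion behind the Kalman filter hedge ratio can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's own code: the function name `kalman_hedge_ratio`, the random-walk state [β, α], the mapping of δ to process noise Q = δ/(1−δ)·I, and the `obs_var` parameter are all choices made for this example.

```python
import numpy as np


def kalman_hedge_ratio(x, y, delta=1e-4, obs_var=1e-3):
    """Track y_t ~ beta_t * x_t + alpha_t, with [beta, alpha] as a random walk.

    Hypothetical sketch: delta sets the process noise Q = delta / (1 - delta) * I,
    and obs_var is the assumed observation noise variance.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    state = np.zeros(2)                       # [beta, alpha]
    P = np.eye(2)                             # state covariance
    Q = (delta / (1.0 - delta)) * np.eye(2)   # process noise covariance
    betas = np.empty(n)

    for t in range(n):
        H = np.array([x[t], 1.0])             # observation vector

        # Predict: a random-walk state keeps its mean, uncertainty grows by Q.
        P = P + Q

        # Update: correct the state with the one-step forecast error.
        err = y[t] - H @ state
        S = H @ P @ H + obs_var               # forecast error variance
        K = P @ H / S                         # Kalman gain
        state = state + K * err
        P = P - np.outer(K, H) @ P

        betas[t] = state[0]                   # filtered beta, uses y[t]
    return betas
```

A larger `delta` lets β adapt faster at the cost of noisier estimates. Note that `betas[t]` already incorporates `y[t]`, so a backtest that must stay look-ahead-free would trade on the previous step's estimate.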

Test Coverage

All 163 pytest unit tests pass across 9 test modules, covering cointegration recovery, look-ahead bias, commission edge cases, and Monte Carlo distribution invariants. The code is Flake8-clean at max-line-length=100.
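
One common way to test for look-ahead bias is to tamper with future data and check that earlier signals do not change. The sketch below illustrates that idea only. The functions `rolling_zscore` and `positions_from_zscore`, the window length, and the entry/exit thresholds are hypothetical stand-ins, not the platform's actual signal code or tests.

```python
import numpy as np


def rolling_zscore(spread, window=20):
    """Z-score of each point against the trailing window ending at that point."""
    spread = np.asarray(spread, dtype=float)
    z = np.full(spread.shape, np.nan)         # NaN during the warm-up period
    for t in range(window - 1, len(spread)):
        hist = spread[t - window + 1 : t + 1]  # data up to and including t only
        z[t] = (spread[t] - hist.mean()) / hist.std(ddof=1)
    return z


def positions_from_zscore(z, entry=2.0, exit_=0.5):
    """Stateful signal: short the spread above +entry, long below -entry,
    flatten once |z| falls under exit_. Thresholds are illustrative."""
    pos = np.zeros(len(z))
    current = 0.0
    for t, zt in enumerate(z):
        # Comparisons with NaN are False, so warm-up bars leave the state alone.
        if current == 0.0 and zt > entry:
            current = -1.0
        elif current == 0.0 and zt < -entry:
            current = 1.0
        elif current != 0.0 and abs(zt) < exit_:
            current = 0.0
        pos[t] = current
    return pos


def test_signals_ignore_future_data():
    rng = np.random.default_rng(0)
    spread = rng.normal(size=200)
    cutoff = 120

    tampered = spread.copy()
    tampered[cutoff:] += 100.0                # corrupt everything from cutoff on

    z_a, z_b = rolling_zscore(spread), rolling_zscore(tampered)

    # Everything strictly before the cutoff must be unchanged.
    np.testing.assert_array_equal(z_a[:cutoff], z_b[:cutoff])
    np.testing.assert_array_equal(
        positions_from_zscore(z_a)[:cutoff],
        positions_from_zscore(z_b)[:cutoff],
    )
    # Sanity check: the tampering is visible once it enters the window.
    assert not np.allclose(z_a[cutoff:], z_b[cutoff:])
```

If the Z-score were computed with a centred window or full-sample statistics, the first assertion would fail, which is the failure mode this kind of test is meant to catch.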

Key Metrics

Metric	Detail
Cointegration tests	Engle-Granger + Johansen
Hedge ratio model	Kalman Filter (dynamic) vs static OLS
OU parameter estimation	Closed-form MLE, O(n)
Risk metrics	20+ (CVaR, HAC Sharpe, Omega, Ulcer, CAPM α)
Monte Carlo paths	2,000
Stress scenarios	6 (GFC, COVID, 2022, 2018 VIX, …)
Test suite	163/163 passing, 9 modules
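
The closed-form OU estimation listed in the table rests on a standard fact: an exactly discretised OU process is an AR(1) process, so its MLE reduces to a least-squares fit computed from five running sums. The sketch below shows one way to write that, assuming a time step `dt` of one observation so that κ is expressed per observation. The names `ou_mle` and `simulate_ou` are hypothetical and not taken from the repository.

```python
import numpy as np


def ou_mle(series, dt=1.0):
    """Closed-form OU estimates (kappa, mu, sigma) from one O(n) pass.

    Uses the AR(1) form x[t+1] = a + b * x[t] + noise, with b = exp(-kappa * dt).
    """
    s = np.asarray(series, dtype=float)
    x, y = s[:-1], s[1:]
    n = len(x)

    # Sufficient statistics.
    sx, sy = x.sum(), y.sum()
    sxx, sxy, syy = (x * x).sum(), (x * y).sum(), (y * y).sum()

    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # AR(1) slope
    a = (sy - b * sx) / n                           # AR(1) intercept
    if not 0.0 < b < 1.0:
        raise ValueError("AR(1) coefficient outside (0, 1): no mean reversion")

    kappa = -np.log(b) / dt
    mu = a / (1.0 - b)

    # Residual variance of the AR(1) fit, expanded in the same running sums.
    resid_var = (syy - 2 * a * sy - 2 * b * sxy
                 + n * a * a + 2 * a * b * sx + b * b * sxx) / n
    sigma = np.sqrt(resid_var * 2.0 * kappa / (1.0 - b * b))
    return kappa, mu, sigma


def simulate_ou(kappa, mu, sigma, n, dt=1.0, seed=0):
    """Exact-discretisation OU path, used here only to exercise ou_mle."""
    rng = np.random.default_rng(seed)
    b = np.exp(-kappa * dt)
    step_std = sigma * np.sqrt((1.0 - b * b) / (2.0 * kappa))
    path = np.empty(n)
    path[0] = mu
    for t in range(1, n):
        path[t] = mu + b * (path[t - 1] - mu) + step_std * rng.normal()
    return path
```

Calling `ou_mle(simulate_ou(0.2, 1.5, 0.3, 20000))` should return estimates close to the simulated parameters. The estimates are in units of `dt`, so converting κ to an annual rate is a separate step.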