Project

Statistical Arbitrage Research Platform

Live App: statistical-arbitrage-research-platform.streamlit.app · GitHub: github.com/rcodeborg2311/Statistical-Arbitrage-Research-Platform

Overview

A production-grade statistical arbitrage research platform that implements the full pairs trading pipeline: cointegration testing, signal generation, dynamic hedge ratio estimation, and risk analytics. The app is deployed on Streamlit Cloud.

Technical Highlights

  • Pairs Trading Backtester: Combines Engle-Granger and Johansen cointegration tests, an event-driven dollar-neutral simulation, and stateful Z-score signal generation, verified free of look-ahead bias by hand-traced unit tests.

  • Kalman Filter Hedge Ratios: Implemented a Kalman filter state-space model (predict-update recursion, configurable δ adaptation) for dynamic hedge ratio estimation. It replaces static OLS so that the beta adapts instead of going stale during regime changes (e.g., the 2020 COVID volatility spike).

  • OU-MLE Closed-Form Estimation: Derived a closed-form maximum likelihood estimator for the Ornstein-Uhlenbeck (OU) process via sufficient statistics (O(n), no numerical optimisation) to estimate mean-reversion speed κ, equilibrium μ, and the theoretical Sharpe ceiling √(2κ × 252) per Avellaneda & Lee (2010).

  • 20+ Risk Metrics: Built metrics beyond Sharpe including CVaR/Expected Shortfall (Basel III), Omega Ratio, HAC Sharpe with Newey-West standard errors (Andrews 1991 bandwidth), Ulcer Index, and CAPM residual Sharpe decomposition for market-neutral alpha isolation.

  • Risk-Parity Portfolio Engine: Constructed inverse-volatility weights with volatility targeting via the full covariance matrix, a 2,000-path Monte Carlo simulation, and 6 historical stress scenarios (including the GFC, the COVID crash, the 2022 rate shock, and the 2018 VIX spike).

  • 6-Tab Research Platform: Delivered parameter sensitivity heatmaps, transaction cost sweep curves, and a full Knowledge Base, with theme-adaptive CSS built on Streamlit CSS variables for light and dark mode support.
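
The predict-update recursion behind the dynamic hedge ratio can be sketched as follows. This is a minimal illustration, not the platform's code: it assumes a two-element random-walk state (slope β and intercept α), process noise δ/(1−δ) on each state element, and a fixed observation variance. The function name `kalman_hedge_ratio`, the `obs_var` parameter, and the synthetic data are hypothetical.

```python
import random


def kalman_hedge_ratio(x, y, delta=1e-4, obs_var=1e-3):
    """Track y_t = beta_t * x_t + alpha_t + noise with a random-walk state.

    Assumed model (illustrative only): state [beta, alpha] follows a random
    walk with process variance delta / (1 - delta) per element.
    Returns (betas, alphas), one filtered estimate per observation.
    """
    beta, alpha = 0.0, 0.0
    # Symmetric 2x2 state covariance stored as three scalars.
    p_bb, p_ba, p_aa = 1.0, 0.0, 1.0
    q = delta / (1.0 - delta)
    betas, alphas = [], []
    for x_t, y_t in zip(x, y):
        # Predict: the state mean is unchanged, its uncertainty grows by q.
        p_bb += q
        p_aa += q
        # Update with observation vector h = [x_t, 1].
        innovation = y_t - (beta * x_t + alpha)
        ph_b = p_bb * x_t + p_ba          # first element of P @ h
        ph_a = p_ba * x_t + p_aa          # second element of P @ h
        s = x_t * ph_b + ph_a + obs_var   # innovation variance h' P h + R
        k_b, k_a = ph_b / s, ph_a / s     # Kalman gain
        beta += k_b * innovation
        alpha += k_a * innovation
        # Covariance update: P <- P - K (P h)'
        p_bb -= k_b * ph_b
        p_ba -= k_b * ph_a
        p_aa -= k_a * ph_a
        betas.append(beta)
        alphas.append(alpha)
    return betas, alphas


# Synthetic check: y = 2x + 1 plus small noise, so beta should settle near 2.
rng = random.Random(0)
x = [50.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [2.0 * v + 1.0 + rng.gauss(0.0, 0.05) for v in x]
betas, alphas = kalman_hedge_ratio(x, y)
```

A larger `delta` lets the estimate react faster to regime changes at the cost of a noisier hedge ratio; a smaller value approaches a static regression.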

Test Coverage

All 163 pytest unit tests pass across 9 test modules, covering cointegration recovery, look-ahead bias, commission edge cases, and Monte Carlo distribution invariants. The code is Flake8 clean at max-line-length=100.
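
As an illustration of what a look-ahead test can check, the sketch below pairs a small hand-traceable case with a perturbation test: changing future data must not change past positions. This is not the project's test suite. The signal function `zscore_positions`, its thresholds, and the test names are hypothetical, and the perturbation check is an additional technique alongside the hand-traced style described above.

```python
import random
import statistics


def zscore_positions(spread, window=20, entry=2.0, exit_=0.5):
    """Stateful positions from a rolling Z-score (hypothetical signal rule).

    -1 = short the spread, +1 = long the spread, 0 = flat.
    The Z-score at index t uses only spread[t - window + 1 : t + 1].
    """
    positions = []
    pos = 0
    for t in range(len(spread)):
        if t + 1 < window:
            positions.append(0)           # warm-up: not enough history yet
            continue
        hist = spread[t - window + 1 : t + 1]
        sd = statistics.pstdev(hist)
        z = 0.0 if sd == 0 else (spread[t] - statistics.fmean(hist)) / sd
        if pos == 0:
            if z > entry:
                pos = -1                  # spread is rich: short it
            elif z < -entry:
                pos = 1                   # spread is cheap: long it
        elif abs(z) < exit_:
            pos = 0                       # spread has reverted: close
        positions.append(pos)
    return positions


def test_hand_traced_entry():
    # 19 zeros then a 1: mean 0.05, population std ~0.218, z ~4.36 > 2.
    spread = [0.0] * 19 + [1.0]
    assert zscore_positions(spread) == [0] * 19 + [-1]


def test_future_data_does_not_change_past_positions():
    rng = random.Random(7)
    spread = [rng.gauss(0.0, 1.0) for _ in range(200)]
    cut = 120
    tampered = spread[:cut] + [v + 100.0 for v in spread[cut:]]
    assert zscore_positions(tampered)[:cut] == zscore_positions(spread)[:cut]
```

In a backtest, the position computed at index t would be applied to the return from t to t+1, so the signal never trades on the bar that produced it.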

Key Metrics

Metric                     Detail
Cointegration tests        Engle-Granger + Johansen
Hedge ratio model          Kalman Filter (dynamic) vs static OLS
OU parameter estimation    Closed-form MLE, O(n)
Risk metrics               20+ (CVaR, HAC Sharpe, Omega, Ulcer, CAPM α)
Monte Carlo paths          2,000
Stress scenarios           6 (GFC, COVID, 2022, 2018 VIX, …)
Test suite                 163/163 passing, 9 modules
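
The closed-form OU estimation listed above can be sketched in a single pass over the data. This is an illustration under stated assumptions, not the platform's derivation: it uses the exact AR(1) discretisation of the OU process, a conditional likelihood (the first observation is treated as fixed), and daily sampling with `dt = 1/252`, which makes κ annualised. The names `ou_mle` and `simulate_ou` are hypothetical.

```python
import math
import random


def ou_mle(x, dt=1.0 / 252.0):
    """Closed-form conditional MLE for dX = kappa * (mu - X) dt + sigma dW.

    Uses the exact discretisation X[t+1] = a * X[t] + b + eps with
    a = exp(-kappa * dt) and b = mu * (1 - a). Returns (kappa, mu, sigma).
    """
    n = len(x) - 1
    if n < 2:
        raise ValueError("need at least three observations")
    # One O(n) pass accumulates the five sufficient statistics.
    s_x = s_y = s_xx = s_xy = s_yy = 0.0
    for prev, curr in zip(x[:-1], x[1:]):
        s_x += prev
        s_y += curr
        s_xx += prev * prev
        s_xy += prev * curr
        s_yy += curr * curr
    denom = n * s_xx - s_x * s_x
    if denom == 0.0:
        raise ValueError("series is constant")
    a = (n * s_xy - s_x * s_y) / denom
    b = (s_y - a * s_x) / n
    if not 0.0 < a < 1.0:
        raise ValueError("AR(1) coefficient outside (0, 1): no mean reversion")
    # Residual variance expanded in terms of the sufficient statistics.
    resid_var = (s_yy - 2 * a * s_xy + a * a * s_xx
                 - 2 * b * s_y + 2 * a * b * s_x + n * b * b) / n
    resid_var = max(resid_var, 0.0)       # guard against rounding below zero
    kappa = -math.log(a) / dt
    mu = b / (1.0 - a)
    sigma = math.sqrt(resid_var * 2.0 * kappa / (1.0 - a * a))
    return kappa, mu, sigma


def simulate_ou(kappa, mu, sigma, n, dt=1.0 / 252.0, seed=0):
    """Simulate an OU path with the exact transition density."""
    rng = random.Random(seed)
    a = math.exp(-kappa * dt)
    sd = sigma * math.sqrt((1.0 - a * a) / (2.0 * kappa))
    path = [mu]
    for _ in range(n):
        path.append(mu + a * (path[-1] - mu) + sd * rng.gauss(0.0, 1.0))
    return path


# Recover known parameters from a simulated path.
path = simulate_ou(kappa=50.0, mu=0.5, sigma=1.0, n=10_000)
kappa_hat, mu_hat, sigma_hat = ou_mle(path)
```

Estimates of κ are noisy and biased upward on short samples, so a recovery check like this one is best run with generous tolerances.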