fast-minimum-variance: Solving Minimum Variance Portfolios Fast

Overview

fast-minimum-variance solves the long-only minimum variance portfolio without ever forming the sample covariance matrix. The key observation is that, with only the budget constraint in play, the KKT stationarity condition $2\Sigma w = \lambda\mathbf{1}$ immediately gives $w \propto \Sigma^{-1}\mathbf{1}$: the problem reduces to one symmetric positive definite linear system $\Sigma v = \mathbf{1}$, solved matrix-free by conjugate gradients. The budget constraint is recovered by a single rescaling $w = v / (\mathbf{1}^\top v)$, and the long-only constraint is enforced by an outer active-set loop around this solve.
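As a minimal sketch of this reduction, the snippet below solves $\Sigma v = \mathbf{1}$ densely with NumPy, rescales, and checks that $2\Sigma w$ is a constant vector. It covers the budget constraint only (no long-only handling), and it forms $X^\top X$ explicitly purely to verify the algebra; all variable names are illustrative and not part of the library.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))   # T = 500 observations, N = 20 assets

Sigma = X.T @ X                      # dense Gram matrix, formed here only to check the algebra
ones = np.ones(X.shape[1])

v = np.linalg.solve(Sigma, ones)     # one SPD solve: Sigma v = 1
w = v / v.sum()                      # rescale so the weights sum to one

grad = 2.0 * Sigma @ w               # stationarity: every entry should equal lambda
lam = 2.0 / v.sum()                  # budget multiplier implied by the rescaling
```

Every entry of `grad` equals `lam` up to rounding, which is the stationarity condition stated above.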

Working directly with the returns matrix $X \in \mathbb{R}^{T \times N}$, rather than the assembled covariance $X^\top X$, has two consequences. First, each conjugate gradient iteration costs $O(TN)$ (two matrix-vector products with $X$), the $O(TN^2)$ cost of assembling $X^\top X$ is avoided, and the $N \times N$ matrix is never stored. Second, Ledoit-Wolf shrinkage enters as a simple row-augmentation of $X$: stacking $[\sqrt{1-\alpha}\,X;\,\sqrt{\gamma}\,I]$ yields a matrix whose Gram matrix equals $\Sigma_{\text{LW}} = (1-\alpha)X^\top X + \gamma I$. The same CG code handles both the plain and shrunk problem without modification.
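The row-augmentation identity is easy to check numerically. The sketch below assumes the definition $\gamma = \alpha\|X\|_F^2/N$ given in the solve_cg section and uses illustrative variable names; it builds the stacked matrix, compares its Gram matrix with the explicit shrunk matrix, and applies the operator to a vector with two matrix-vector products.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 200, 15
X = rng.standard_normal((T, N))

alpha = N / (N + T)                                  # shrinkage intensity suggested in this README
gamma = alpha * np.linalg.norm(X, "fro") ** 2 / N    # ridge term (assumed definition, see solve_cg)

# Stack the scaled returns on top of a scaled identity: shape (T + N, N).
X_aug = np.vstack([np.sqrt(1.0 - alpha) * X, np.sqrt(gamma) * np.eye(N)])

# Explicit shrunk matrix, built only for comparison.
Sigma_lw = (1.0 - alpha) * X.T @ X + gamma * np.eye(N)

# Matrix-free product with the augmented matrix: two matvecs, no N x N matrix needed.
v = rng.standard_normal(N)
matvec = X_aug.T @ (X_aug @ v)
```

Here `X_aug.T @ X_aug` matches `Sigma_lw` and `matvec` matches `Sigma_lw @ v` up to rounding.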

Quick Start

import numpy as np
from fast_minimum_variance import Problem

# 500 daily returns, 20 assets
X = np.random.default_rng(42).standard_normal((500, 20))

w, iters = Problem(X).solve_cg()   # matrix-free CG — recommended
w, iters = Problem(X).solve_kkt()  # direct dense solve — exact baseline

assert abs(w.sum() - 1.0) < 1e-8
assert (w >= 0).all()

Ledoit-Wolf Shrinkage

Ledoit-Wolf shrinkage plays a dual role: statistically it reduces estimation error; numerically it compresses the eigenvalue spectrum and directly cuts CG iteration counts. Use alpha = N / (N + T) as a simple analytical estimate of the optimal shrinkage intensity:

T, N = X.shape
w, iters = Problem(X, alpha=N / (N + T)).solve_cg()

On S&P 500 equity data (495 assets, 1192 days), shrinkage cuts CG iterations from 685 to 205 and makes the matrix-free solver the fastest option by a wide margin.
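The effect on the spectrum can be illustrated on synthetic data. The sketch below is not a reproduction of the S&P 500 experiment: it uses random Gaussian returns, assumes $\gamma = \alpha\|X\|_F^2/N$ as in the solve_cg section, and the helper name `shrinkage_intensity` is hypothetical. Since standard CG convergence bounds depend on the condition number, a smaller condition number is what one would expect to translate into fewer iterations.

```python
import numpy as np

def shrinkage_intensity(n_assets, n_obs):
    """alpha = N / (N + T), the simple estimate quoted above."""
    return n_assets / (n_assets + n_obs)

rng = np.random.default_rng(2)
T, N = 120, 100                                # nearly square, so X'X is poorly conditioned
X = rng.standard_normal((T, N))

alpha = shrinkage_intensity(N, T)
gamma = alpha * np.linalg.norm(X, "fro") ** 2 / N

eig = np.linalg.eigvalsh(X.T @ X)              # eigenvalues in ascending order
eig_lw = (1.0 - alpha) * eig + gamma           # shrinkage maps each eigenvalue affinely

cond_plain = eig[-1] / eig[0]
cond_lw = eig_lw[-1] / eig_lw[0]
print(f"alpha = {alpha:.3f}, condition number {cond_plain:.1f} -> {cond_lw:.1f}")
```

For the S&P 500 dimensions quoted above, `shrinkage_intensity(495, 1192)` evaluates to roughly 0.293.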

Solvers

All solvers are methods on Problem and return (w, iters) where $w \in \mathbb{R}^N$, $\sum_i w_i = 1$, $w_i \geq 0$.

| Method | Approach | When to use |
|---|---|---|
| solve_cg() | Matrix-free conjugate gradients on the SPD reduced system | Default; fastest for large $N$, especially with shrinkage |
| solve_kkt() | Direct dense factorisation via numpy.linalg.solve | Small problems or when an exact solve is needed |
| solve_nnls() | Non-negative least squares via Lawson-Hanson | Single-shot; useful when no outer loop is desired |
| solve_clarabel() | Clarabel interior-point solver (direct API) | Comparison baseline without CVXPY overhead |
| solve_osqp() | OSQP operator-splitting QP solver (direct API) | Alternative QP baseline; faster than Clarabel on medium problems |
| solve_cvxpy() | CVXPY + Clarabel | Ground-truth reference |

solve_cg — matrix-free conjugate gradients

The inner step builds a LinearOperator that applies

$$v \;\mapsto\; (1-\alpha)\,X_a^\top(X_a v) + \gamma v, \qquad \gamma = \frac{\alpha\,\|X\|_F^2}{N}$$

to a vector using two matrix-vector products with the active-asset submatrix $X_a$, without ever forming $\Sigma_a = X_a^\top X_a$. Standard CG then solves $\Sigma_a v = \mathbf{1}$. Ledoit-Wolf shrinkage ($\alpha > 0$) compresses the eigenvalue spectrum and reduces iteration counts dramatically — from nearly 2000 iterations at $\alpha \approx 0$ to single digits at $\alpha \approx 1$ in rank-deficient settings.
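The inner step can be sketched with a hand-written CG loop in plain NumPy. This is an illustration under stated assumptions, not the library's implementation: it uses all assets as the active set, a hand-rolled CG instead of a SciPy LinearOperator, and a hypothetical function name `cg_matrix_free` with an arbitrary stopping tolerance.

```python
import numpy as np

def cg_matrix_free(X, alpha=0.0, tol=1e-10, max_iter=1000):
    """Solve ((1 - alpha) X'X + gamma I) v = 1 by conjugate gradients, never forming X'X."""
    T, N = X.shape
    gamma = alpha * np.linalg.norm(X, "fro") ** 2 / N

    def apply_sigma(p):
        # Two matrix-vector products with X, O(TN) work each.
        return (1.0 - alpha) * (X.T @ (X @ p)) + gamma * p

    b = np.ones(N)
    v = np.zeros(N)
    r = b.copy()                  # residual b - Sigma v for the zero starting guess
    p = r.copy()
    rs = r @ r
    for k in range(1, max_iter + 1):
        Ap = apply_sigma(p)
        step = rs / (p @ Ap)
        v += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.sqrt(N):   # relative to ||b|| = sqrt(N)
            return v, k
        p = r + (rs_new / rs) * p
        rs = rs_new
    raise RuntimeError("CG did not converge")

rng = np.random.default_rng(3)
X = rng.standard_normal((500, 20))
alpha = 20 / (20 + 500)
v, iters = cg_matrix_free(X, alpha=alpha)
w = v / v.sum()                   # budget-only weights; long-only needs the outer loop
```

The only access to the data is through `apply_sigma`, so swapping in a submatrix $X_a$ for an active set changes nothing else in the loop.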

solve_kkt — direct dense solve

Assembles $\Sigma_a = (1-\alpha)X_a^\top X_a + \gamma I$ explicitly and calls numpy.linalg.solve. Exact to machine precision. Scales as $O(N^3)$ in the active portfolio size, so it becomes expensive for $N \gtrsim 500$ without shrinkage (which reduces the number of active assets). With shrinkage, the active-set outer loop converges in 2–4 steps and the inner systems are small, making the direct solve competitive.

solve_nnls — non-negative least squares

Reformulates the problem as a non-negative least squares problem on an augmented matrix:

$$\min_{w \geq 0}\;\left\|\begin{pmatrix}\sqrt{1-\alpha}\,X \\ \sqrt{\gamma}\,I \\ M\mathbf{1}^\top\end{pmatrix}w - \begin{pmatrix}\mathbf{0} \\ \mathbf{0} \\ M\end{pmatrix}\right\|^2$$

where $M = \|X\|_F \cdot T$ enforces the budget constraint as a large penalty. The Lawson-Hanson algorithm handles $w \geq 0$ natively, so no outer primal-dual loop is needed. The method is single-shot but does not benefit from the matrix-free structure: Lawson-Hanson implicitly forms normal equations of the augmented matrix. With shrinkage the augmented matrix grows from $T \times N$ to $(T+N) \times N$, making solve_nnls slower with shrinkage than without.
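A small sketch of this formulation using `scipy.optimize.nnls` follows. It assumes $\gamma = \alpha\|X\|_F^2/N$ as in the solve_cg section, and the final renormalisation step is an assumption added here to remove the small budget error a finite penalty leaves behind; whether the library does the same is not stated above.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(4)
T, N = 200, 10
X = rng.standard_normal((T, N))

alpha = N / (N + T)
gamma = alpha * np.linalg.norm(X, "fro") ** 2 / N
M = np.linalg.norm(X, "fro") * T                 # penalty weight on the budget row

A = np.vstack([
    np.sqrt(1.0 - alpha) * X,                    # T rows: scaled returns
    np.sqrt(gamma) * np.eye(N),                  # N rows: shrinkage block
    M * np.ones((1, N)),                         # 1 row: budget penalty
])
rhs = np.concatenate([np.zeros(T + N), [M]])

w_raw, _ = nnls(A, rhs)                          # Lawson-Hanson NNLS, returns (solution, residual norm)
w = w_raw / w_raw.sum()                          # assumption: renormalise away the penalty-induced error
```

Because the penalty is large but finite, `w_raw.sum()` falls slightly short of one before the renormalisation.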

solve_clarabel — Clarabel direct API

Calls the Clarabel interior-point solver directly, bypassing CVXPY's problem-construction overhead. Assembles $P = 2\Sigma_{\text{LW}}$ as a sparse CSC matrix and solves the standard QP. Useful for benchmarking: on a 1000-asset synthetic problem, Clarabel direct takes 0.28 s while the CVXPY wrapper takes 8.2 s, so roughly 97% of solve_cvxpy's time is Python interface overhead rather than solving. CG is still 15× faster than Clarabel direct.

solve_osqp — OSQP operator-splitting solver

Calls the OSQP operator-splitting QP solver directly, bypassing CVXPY overhead. Assembles $P = 2\Sigma_{\text{LW}}$ as a sparse upper-triangular CSC matrix and runs ADMM iterations. About 2× faster than Clarabel direct in both benchmarks: on S&P 500 data OSQP takes 0.036 s versus Clarabel's 0.067 s; on a 1000-asset synthetic problem, 0.12 s versus 0.28 s. The iters return value is the ADMM iteration count. CG is still 4–6× faster than OSQP for large $N$, but OSQP is the fastest drop-in QP solver for problems where the matrix-free structure cannot be exploited.

The Primal-Dual Active-Set Loop

Long-only weights are enforced by an outer loop that wraps any inner solver:

  1. Primal step. Solve the budget-only equality system over the current active asset set. Drop any asset with weight below $-\varepsilon$ (multiple assets at once if violations are large).
  2. Dual step. Once all active weights are non-negative, compute the gradient $\nabla_i f(w) = 2[(1-\alpha)(X^\top X w)_i + \gamma w_i] - \rho\mu_i$ for every excluded asset. If any excluded asset has $\nabla_i f(w) < \lambda$ (the budget multiplier), it would decrease variance if added — re-insert the most-violated asset and repeat.
  3. Termination. The loop exits when primal and dual feasibility hold simultaneously. Combined with stationarity from the inner solve, this is sufficient for global optimality.

With Ledoit-Wolf shrinkage at the analytically optimal $\alpha$, the loop typically converges in 2–4 outer iterations on real equity data.
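The three steps above can be sketched as follows. This is a simplified illustration rather than the library's code: it covers only the $\rho = 0$ case, uses a dense inner solve instead of CG, and the function name, tolerances, and iteration cap are assumptions. The cap guards against non-termination, which this simplified loop does not rule out in general.

```python
import numpy as np

def long_only_min_variance(X, alpha=0.0, eps=1e-10, max_outer=100):
    """Primal-dual active-set loop with a dense inner solve (rho = 0 case only)."""
    T, N = X.shape
    gamma = alpha * np.linalg.norm(X, "fro") ** 2 / N
    Sigma = (1.0 - alpha) * X.T @ X + gamma * np.eye(N)   # dense for clarity, not matrix-free
    active = np.ones(N, dtype=bool)

    for outer in range(1, max_outer + 1):
        idx = np.flatnonzero(active)
        v = np.linalg.solve(Sigma[np.ix_(idx, idx)], np.ones(idx.size))
        w_a = v / v.sum()                      # budget-only solution on the active set

        negative = w_a < -eps
        if negative.any():                     # primal step: drop all violators at once
            active[idx[negative]] = False
            continue

        w = np.zeros(N)
        w[idx] = w_a
        lam = 2.0 / v.sum()                    # budget multiplier from 2 Sigma_a w_a = lam 1
        violation = lam - 2.0 * Sigma @ w      # positive where an excluded asset would help
        violation[active] = 0.0
        j = int(np.argmax(violation))
        if violation[j] <= eps * lam:          # dual feasible as well: terminate
            return w, outer
        active[j] = True                       # dual step: re-insert the most-violated asset

    raise RuntimeError("active-set loop did not terminate")

# One common factor with spread-out loadings, so high-loading assets tend to be dropped.
rng = np.random.default_rng(5)
T, N = 300, 12
beta = np.linspace(0.5, 1.5, N)
X = rng.standard_normal((T, 1)) * beta + 0.3 * rng.standard_normal((T, N))
w, outer_iters = long_only_min_variance(X, alpha=N / (N + T))
```

Replacing the `np.linalg.solve` call with a matrix-free CG solve on the active columns of `X` would keep the outer loop unchanged.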

Problem Variants

The same solver handles a range of portfolio construction problems by choosing $\alpha$, $\rho$, $\mu$:

| Problem | alpha | rho | mu |
|---|---|---|---|
| Minimum variance | $0$ | $0$ | |
| Mean-variance (Markowitz) | any | $> 0$ | expected returns |
| Minimum tracking error to benchmark $b$ | any | $2$ | X.T @ (X @ b) |
| LW-regularised minimum variance | $N/(N+T)$ | $0$ | |

# Mean-variance
mu = np.random.default_rng(0).standard_normal(N)  # expected returns, shape (N,)
w, _ = Problem(X, rho=1.0, mu=mu).solve_cg()

# Minimum tracking error to benchmark b
b = np.ones(N) / N  # equal-weight benchmark
mu_te = X.T @ (X @ b)
w, _ = Problem(X, rho=2.0, mu=mu_te).solve_cg()

When rho != 0, two SPD solves are performed per outer step: $\Sigma_a v_1 = \mathbf{1}$ and $\Sigma_a v_2 = \mu_a$. The budget multiplier $\lambda$ is recovered analytically from the budget constraint, avoiding the full saddle-point system.
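To make the recovery of $\lambda$ concrete: stationarity on the active set reads $2\Sigma_a w - \rho\mu_a = \lambda\mathbf{1}$, so $w = (\lambda v_1 + \rho v_2)/2$, and the budget constraint then gives $\lambda = (2 - \rho\,\mathbf{1}^\top v_2)/(\mathbf{1}^\top v_1)$. The sketch below is derived from the formulas in this README rather than taken from the library; it uses dense solves, $\alpha = 0$, all assets active, and a hypothetical helper `budget_only`. It checks the result against the full saddle-point system and against the tracking-error case, where the benchmark itself should be recovered.

```python
import numpy as np

rng = np.random.default_rng(6)
T, N = 400, 8
X = rng.standard_normal((T, N))
Sigma = X.T @ X                                  # alpha = 0, dense for checking only
ones = np.ones(N)

def budget_only(Sigma, mu, rho):
    """Minimise w'Sigma w - rho mu'w subject to 1'w = 1 using two SPD solves."""
    v1 = np.linalg.solve(Sigma, ones)            # Sigma v1 = 1
    v2 = np.linalg.solve(Sigma, mu)              # Sigma v2 = mu
    lam = (2.0 - rho * v2.sum()) / v1.sum()      # from 1'w = 1 with w = (lam v1 + rho v2) / 2
    return 0.5 * (lam * v1 + rho * v2), lam

# Tracking error: rho = 2 and mu = X'Xb should recover the benchmark itself.
b = np.full(N, 1.0 / N)
w_te, lam_te = budget_only(Sigma, X.T @ (X @ b), rho=2.0)

# Mean-variance with arbitrary expected returns, checked against the saddle-point system
# [2 Sigma, -1; 1', 0] [w; lam] = [rho mu; 1] with rho = 1.
mu = rng.standard_normal(N)
w_mv, lam_mv = budget_only(Sigma, mu, rho=1.0)
K = np.block([[2.0 * Sigma, -ones[:, None]], [ones[None, :], np.zeros((1, 1))]])
sol = np.linalg.solve(K, np.concatenate([mu, [1.0]]))
```

The first `N` entries of `sol` agree with `w_mv` and the last entry with `lam_mv`, so the two small SPD solves give the same answer as the larger indefinite system.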

Custom Constraints

For problems beyond budget + long-only (sector limits, turnover bounds, factor-exposure constraints), pass explicit constraint matrices:

A = np.ones((N, 1))   # budget: 1'w = 1
b = np.ones(1)
C = -np.eye(N)        # long-only: w >= 0
d = np.zeros(N)
w, _ = Problem(X, A=A, b=b, C=C, d=d).solve_kkt()

This routes to a general active-set solver that handles arbitrary linear equality and inequality constraints. Use this path sparingly — the default path (no A, b, C, d) is significantly faster for the standard long-only problem.

Benchmarks

All timings on Apple M4 Pro, Python 3.12, NumPy 2.4, SciPy 1.17.

Synthetic: $N=1000$, $T=2000$, i.i.d. Gaussian returns

| Method | Time (s) | Speedup vs CVXPY |
|---|---|---|
| solve_cvxpy | 8.16 | |
| solve_clarabel | 0.28 | 29× |
| solve_osqp | 0.12 | 68× |
| solve_kkt | 0.063 | 129× |
| solve_cg | 0.019 | 430× |
| solve_nnls | 1.69 | |

With Ledoit-Wolf shrinkage ($\alpha = 0.333$), 56 CG iterations.

S&P 500: $N=495$, $T=1192$ (Jul 2021–Apr 2026)

| Method | Time (s) | Speedup vs CVXPY |
|---|---|---|
| solve_cvxpy | 1.48 | |
| solve_clarabel | 0.067 | 22× |
| solve_osqp | 0.036 | 41× |
| solve_kkt | 0.018 | 84× |
| solve_cg | 0.0091 | 162× |
| solve_nnls | 0.088 | 17× |

With Ledoit-Wolf shrinkage ($\alpha = 0.293$), 205 CG iterations.

Installation

pip install fast-minimum-variance

For development:

git clone https://github.com/Jebel-Quant/fast_minimum_variance
cd fast_minimum_variance
make install

Requirements

  • Python 3.11+
  • numpy
  • scipy
  • cvxpy
  • clarabel
  • osqp

Citing

If you use this library in academic work or research, please cite:

@software{fast_minimum_variance,
  author  = {Schmelzer, Thomas},
  title   = {fast-minimum-variance: Solving Minimum Variance Portfolios Fast},
  url     = {https://github.com/Jebel-Quant/fast_minimum_variance},
  year    = {2026},
  license = {MIT}
}

License

MIT License — see LICENSE for details.