Coverage for src / basanos / math / optimizer.py: 100%

122 statements  

coverage.py v7.13.5, created at 2026-04-02 17:47 +0000

"""Correlation-aware risk position optimizer (Basanos).

This module provides utilities to compute correlation-adjusted risk positions
from price data and expected-return signals. It relies on volatility-adjusted
returns to estimate a dynamic correlation matrix (via EWM), applies shrinkage
towards identity, and solves a normalized linear system per timestamp to
obtain stable positions.

Performance characteristics
---------------------------
Let *N* be the number of assets and *T* the number of timestamps.

**Computational complexity**

+----------------------------------+------------------+--------------------------------------+
| Operation                        | Complexity       | Bottleneck                           |
+==================================+==================+======================================+
| EWM volatility (``ret_adj``,     | O(T·N)           | Linear in both T and N; negligible   |
| ``vola``)                        |                  |                                      |
+----------------------------------+------------------+--------------------------------------+
| EWM correlation (``cor``)        | O(T·N²)          | ``lfilter`` over all N² asset pairs  |
|                                  |                  | simultaneously                       |
+----------------------------------+------------------+--------------------------------------+
| Linear solve per timestamp       | O(N³)            | Cholesky / LU per row in             |
| (``cash_position``)              | * T solves       | ``cash_position``                    |
+----------------------------------+------------------+--------------------------------------+

**Memory usage** (peak, approximate)

``ewm_corr`` allocates roughly **14 float64 arrays** of shape
``(T, N, N)`` at peak (input sequences, IIR filter outputs, EWM components,
and the result tensor). Peak RAM ≈ **112 * T * N²** bytes. Typical
working sizes on a 16 GB machine:

+--------+--------------------------+------------------------------------+
| N      | T (daily rows)           | Peak memory (approx.)              |
+========+==========================+====================================+
| 50     | 252 (~1 yr)              | ~70 MB                             |
+--------+--------------------------+------------------------------------+
| 100    | 252 (~1 yr)              | ~280 MB                            |
+--------+--------------------------+------------------------------------+
| 100    | 2 520 (~10 yr)           | ~2.8 GB                            |
+--------+--------------------------+------------------------------------+
| 200    | 2 520 (~10 yr)           | ~11 GB                             |
+--------+--------------------------+------------------------------------+
| 500    | 2 520 (~10 yr)           | ~70 GB ⚠ exceeds typical RAM       |
+--------+--------------------------+------------------------------------+

**Practical limits (daily data)**

* **≤ 150 assets, ≤ 5 years** — well within reach on an 8 GB laptop.
* **≤ 200 assets, ≤ 10 years** — requires ~11-12 GB; feasible on a 16 GB
  workstation.
* **> 500 assets with multi-year history** — peak memory exceeds 16 GB;
  reduce the time range or switch to a chunked / streaming approach.
* **> 1 000 assets** — the O(N³) per-solve cost alone makes real-time
  optimization impractical even with adequate RAM.

See ``BENCHMARKS.md`` for measured wall-clock timings across representative
dataset sizes.

Internal structure
------------------
The implementation is split across focused private modules to keep each file
readable and independently testable:

* :mod:`basanos.math._config` — :class:`BasanosConfig` and all
  covariance-mode configuration classes.
* :mod:`basanos.math._ewm_corr` — :func:`ewm_corr`, the vectorised
  IIR-filter implementation of per-row EWM correlation matrices.
* :mod:`basanos.math._engine_solve` — private helpers providing the
  ``_iter_matrices`` and ``_iter_solve`` generators (per-timestamp solve
  logic).
* :mod:`basanos.math._engine_diagnostics` — private helpers providing
  matrix-quality diagnostics (condition number, effective rank, solver
  residual, signal utilisation).
* :mod:`basanos.math._engine_ic` — private helpers providing signal
  evaluation metrics (IC, Rank IC, ICIR, and summary statistics).
* This module — :class:`BasanosEngine`, a single flat class that wires
  every method together in clearly delimited sections.
"""
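The peak-memory rule of thumb in the docstring above (about 14 float64 arrays of shape ``(T, N, N)``) can be turned into a quick pre-flight check before building an engine. The sketch below only restates the ``112 * T * N²`` formula; the helper name ``estimate_ewm_corr_peak_bytes`` is hypothetical and is not part of basanos.

```python
def estimate_ewm_corr_peak_bytes(n_assets: int, n_rows: int, n_arrays: int = 14) -> int:
    """Approximate peak bytes for n_arrays float64 tensors of shape (T, N, N)."""
    bytes_per_float64 = 8
    return n_arrays * bytes_per_float64 * n_rows * n_assets**2


# Print the estimate for a few (N, T) pairs from the memory table.
for n, t in [(50, 252), (100, 2520), (200, 2520)]:
    gb = estimate_ewm_corr_peak_bytes(n, t) / 1e9
    print(f"N={n:>4}, T={t:>5}: ~{gb:.2f} GB")
```

Comparing the result with the available RAM gives an early signal for when to shorten the history or look for a chunked approach.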

import dataclasses
import datetime
import logging
from typing import TYPE_CHECKING

import numpy as np
import polars as pl
from jquantstats import Portfolio

from ..exceptions import (
    ColumnMismatchError,
    ExcessiveNullsError,
    MissingDateColumnError,
    MonotonicPricesError,
    NonPositivePricesError,
    ShapeMismatchError,
)
from ._config import (
    BasanosConfig,
    CovarianceConfig,
    CovarianceMode,
    EwmaShrinkConfig,
    SlidingWindowConfig,
)
from ._engine_diagnostics import _DiagnosticsMixin as _DiagnosticsMixin
from ._engine_ic import _SignalEvaluatorMixin as _SignalEvaluatorMixin
from ._engine_solve import _SolveMixin as _SolveMixin
from ._ewm_corr import ewm_corr as _ewm_corr_numpy
from ._signal import vol_adj

if TYPE_CHECKING:
    from ._config_report import ConfigReport

_logger = logging.getLogger(__name__)

def _validate_inputs(prices: pl.DataFrame, mu: pl.DataFrame, cfg: "BasanosConfig") -> None:
    """Validate ``prices``, ``mu``, and ``cfg`` for use with :class:`BasanosEngine`.

    Checks that both DataFrames contain a ``'date'`` column, share identical
    shapes and column sets, contain no non-positive prices, no excessive null
    fractions, and no monotonic price series. Also emits a warning when the
    dataset is too short relative to a configured sliding-window size.

    Args:
        prices: DataFrame of price levels per asset over time.
        mu: DataFrame of expected-return signals aligned with ``prices``.
        cfg: Engine configuration instance.

    Raises:
        MissingDateColumnError: If ``'date'`` is absent from either frame.
        ShapeMismatchError: If ``prices`` and ``mu`` have different shapes.
        ColumnMismatchError: If the column sets of the two frames differ.
        NonPositivePricesError: If any asset contains a non-positive price.
        ExcessiveNullsError: If any asset column exceeds ``cfg.max_nan_fraction``.
        MonotonicPricesError: If any asset price series is monotonically
            non-decreasing or non-increasing.

    Warns:
        UserWarning (via logging): If ``cfg.covariance_config`` is a
            :class:`SlidingWindowConfig` and
            ``len(prices) < 2 * cfg.covariance_config.window``, a warning is
            emitted via the module logger rather than an exception. This is a
            deliberate soft boundary: callers may intentionally supply data
            shorter than the full warm-up period. During warm-up the first
            ``window - 1`` timestamps will yield zero positions.
    """
    # ensure 'date' column exists in prices before any other validation
    if "date" not in prices.columns:
        raise MissingDateColumnError("prices")

    # ensure 'date' column exists in mu as well (kept for symmetry and downstream assumptions)
    if "date" not in mu.columns:
        raise MissingDateColumnError("mu")

    # check that prices and mu have the same shape
    if prices.shape != mu.shape:
        raise ShapeMismatchError(prices.shape, mu.shape)

    # check that the columns of prices and mu are identical
    if not set(prices.columns) == set(mu.columns):
        raise ColumnMismatchError(prices.columns, mu.columns)

    assets = [c for c in prices.columns if c != "date" and prices[c].dtype.is_numeric()]

    # check for non-positive prices: log returns require strictly positive prices
    for asset in assets:
        col = prices[asset].drop_nulls()
        if col.len() > 0 and (col <= 0).any():
            raise NonPositivePricesError(asset)

    # check for excessive nulls: more than cfg.max_nan_fraction null in any asset column
    n_rows = prices.height
    if n_rows > 0:
        for asset in assets:
            nan_frac = prices[asset].null_count() / n_rows
            if nan_frac > cfg.max_nan_fraction:
                raise ExcessiveNullsError(asset, nan_frac, cfg.max_nan_fraction)

    # check for monotonic price series: a non-decreasing or non-increasing
    # series has no variance in its return sign, indicating malformed or synthetic data
    for asset in assets:
        col = prices[asset].drop_nulls()
        if col.len() > 2:
            diffs = col.diff().drop_nulls()
            if (diffs >= 0).all() or (diffs <= 0).all():
                raise MonotonicPricesError(asset)

    # warn when the dataset is too short to benefit from the sliding window
    if cfg.covariance_mode == CovarianceMode.sliding_window and cfg.window is not None:
        w: int = cfg.window
        if n_rows < 2 * w:
            _logger.warning(
                "Dataset length (%d rows) is less than 2 * window (%d). "
                "The first %d timestamps will yield zero positions during warm-up; "
                "consider using a longer history or reducing 'window'.",
                n_rows,
                2 * w,
                w - 1,
            )
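The monotonicity check in ``_validate_inputs`` can be tried in isolation. The following is a minimal NumPy sketch of the same idea, assuming NaN stands in for Polars nulls; the helper name ``is_monotonic`` is hypothetical and does not exist in basanos.

```python
import numpy as np


def is_monotonic(prices: np.ndarray) -> bool:
    """Return True if all steps are >= 0 or all steps are <= 0 (NaNs dropped first)."""
    clean = prices[~np.isnan(prices)]
    if clean.size <= 2:
        # Mirrors the `col.len() > 2` guard: very short series are not flagged.
        return False
    diffs = np.diff(clean)
    return bool((diffs >= 0).all() or (diffs <= 0).all())


print(is_monotonic(np.array([100.0, 101.0, 103.0, 103.0])))  # never falls
print(is_monotonic(np.array([100.0, 102.0, 101.0, 104.0])))  # rises and falls
```

A series that only ever rises (or only ever falls) is flagged, which matches the condition under which ``MonotonicPricesError`` is raised above.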

# ---------------------------------------------------------------------------
# Re-export config symbols so ``from basanos.math.optimizer import …`` keeps
# working for existing callers.
# ---------------------------------------------------------------------------
__all__ = [
    "BasanosConfig",
    "BasanosEngine",
    "CovarianceConfig",
    "CovarianceMode",
    "EwmaShrinkConfig",
    "SlidingWindowConfig",
]

@dataclasses.dataclass(frozen=True)
class BasanosEngine(_DiagnosticsMixin, _SignalEvaluatorMixin, _SolveMixin):
    """Engine to compute correlation matrices and optimize risk positions.

    Encapsulates price data and configuration to build EWM-based
    correlations, apply shrinkage, and solve for normalized positions.

    Public methods are organised into clearly delimited sections (some
    inherited from the private mixin classes):

    * **Core data access** — :attr:`assets`, :attr:`ret_adj`, :attr:`vola`,
      :attr:`cor`, :attr:`cor_tensor`
    * **Solve / position logic** — :attr:`cash_position`,
      :attr:`position_status`, :attr:`risk_position`,
      :attr:`position_leverage`, :meth:`warmup_state`
      (solve helpers inherited from :class:`~._engine_solve._SolveMixin`)
    * **Portfolio and performance** — :attr:`portfolio`,
      :attr:`naive_sharpe`, :meth:`sharpe_at_shrink`,
      :meth:`sharpe_at_window_factors`
    * **Matrix diagnostics** — :attr:`condition_number`,
      :attr:`effective_rank`, :attr:`solver_residual`,
      :attr:`signal_utilisation`
      (inherited from :class:`~._engine_diagnostics._DiagnosticsMixin`)
    * **Signal evaluation** — :attr:`ic`, :attr:`rank_ic`, :attr:`ic_mean`,
      :attr:`ic_std`, :attr:`icir`, :attr:`rank_ic_mean`,
      :attr:`rank_ic_std`
      (inherited from :class:`~._engine_ic._SignalEvaluatorMixin`)
    * **Reporting** — :attr:`config_report`

    Data-flow diagram
    -----------------

    .. code-block:: text

        prices (pl.DataFrame)
          │
          ├─ vol_adj ──► ret_adj  (volatility-adjusted log returns)
          │                │
          │                ├─ ewm_corr ──► cor / cor_tensor
          │                │                    │
          │                │                    └─ shrink2id / FactorModel
          │                │                         │
          │               vola               covariance matrix
          │                │                         │
          └── mu ──────────┴── _iter_solve ──────────┘
                                    │
                              cash_position
                                    │
                           ┌────────┴────────┐
                       portfolio        diagnostics
                      (Portfolio)     (condition_number,
                                       effective_rank,
                                       solver_residual,
                                       signal_utilisation,
                                       ic, rank_ic, …)

    Attributes:
        prices: Polars DataFrame of price levels per asset over time. Must
            contain a ``'date'`` column and at least one numeric asset column
            with strictly positive values that are not monotonically
            non-decreasing or non-increasing (i.e. the price changes must
            vary in sign).
        mu: Polars DataFrame of expected-return signals aligned with *prices*.
            Must share the same shape and column names as *prices*.
        cfg: Immutable :class:`BasanosConfig` controlling EWMA half-lives,
            clipping, shrinkage intensity, and AUM.

    Examples:
        Build an engine with two synthetic assets over 30 days and inspect the
        optimized positions and diagnostic properties.

        >>> import numpy as np
        >>> import polars as pl
        >>> from basanos.math import BasanosConfig, BasanosEngine
        >>> dates = list(range(30))
        >>> rng = np.random.default_rng(42)
        >>> prices = pl.DataFrame({
        ...     "date": dates,
        ...     "A": np.cumprod(1 + rng.normal(0.001, 0.02, 30)) * 100.0,
        ...     "B": np.cumprod(1 + rng.normal(0.001, 0.02, 30)) * 150.0,
        ... })
        >>> mu = pl.DataFrame({
        ...     "date": dates,
        ...     "A": rng.normal(0.0, 0.5, 30),
        ...     "B": rng.normal(0.0, 0.5, 30),
        ... })
        >>> cfg = BasanosConfig(vola=5, corr=10, clip=2.0, shrink=0.5, aum=1_000_000)
        >>> engine = BasanosEngine(prices=prices, mu=mu, cfg=cfg)
        >>> engine.assets
        ['A', 'B']
        >>> engine.cash_position.shape
        (30, 3)
        >>> engine.position_leverage.columns
        ['date', 'leverage']
    """

    prices: pl.DataFrame
    mu: pl.DataFrame
    cfg: BasanosConfig

    def __post_init__(self) -> None:
        """Validate inputs by delegating to :func:`_validate_inputs`."""
        _validate_inputs(self.prices, self.mu, self.cfg)

    # ------------------------------------------------------------------
    # Core data-access properties
    # ------------------------------------------------------------------

    @property
    def assets(self) -> list[str]:
        """List asset column names (numeric columns excluding 'date')."""
        return [c for c in self.prices.columns if c != "date" and self.prices[c].dtype.is_numeric()]

    @property
    def ret_adj(self) -> pl.DataFrame:
        """Return per-asset volatility-adjusted log returns clipped by cfg.clip.

        Uses an EWMA volatility estimate with lookback ``cfg.vola`` to
        standardize log returns for each numeric asset column.
        """
        return self.prices.with_columns(
            [vol_adj(pl.col(asset), vola=self.cfg.vola, clip=self.cfg.clip) for asset in self.assets]
        )

    @property
    def vola(self) -> pl.DataFrame:
        """Per-asset EWMA volatility of percentage returns.

        Computes percent changes for each numeric asset column and applies an
        exponentially weighted standard deviation using the lookback specified
        by ``cfg.vola``. The result is a DataFrame aligned with ``self.prices``
        whose numeric columns hold per-asset volatility estimates.
        """
        return self.prices.with_columns(
            pl.col(asset)
            .pct_change()
            .ewm_std(com=self.cfg.vola - 1, adjust=True, min_samples=self.cfg.vola)
            .alias(asset)
            for asset in self.assets
        )
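The ``vola`` property passes ``com=cfg.vola - 1`` to the EWM estimator. Under the usual centre-of-mass convention, ``alpha = 1 / (1 + com)``, so a lookback of ``cfg.vola`` corresponds to ``alpha = 1 / cfg.vola``. The sketch below shows the normalised weights this implies for an adjusted EWM average; ``ewm_weights`` is a hypothetical helper, and any bias correction applied by ``ewm_std`` is not reproduced here.

```python
import numpy as np


def ewm_weights(com: float, n: int) -> np.ndarray:
    """Normalised weights of an adjusted EWM average over the n most recent samples."""
    alpha = 1.0 / (1.0 + com)
    raw = (1.0 - alpha) ** np.arange(n)  # index 0 is the newest observation
    return raw / raw.sum()


vola = 5  # stands in for cfg.vola
weights = ewm_weights(com=vola - 1, n=10)
print(weights.round(4))
```

Each older observation receives ``1 - alpha`` times the weight of the next newer one, which is why a larger ``cfg.vola`` produces a smoother volatility estimate.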

    @property
    def cor(self) -> dict[datetime.date, np.ndarray]:
        """Compute per-timestamp EWM correlation matrices.

        Builds volatility-adjusted returns for all assets, computes an
        exponentially weighted correlation using a pure NumPy implementation
        (with window ``cfg.corr``), and returns a mapping from each timestamp
        to the corresponding correlation matrix as a NumPy array.

        Returns:
            dict: Mapping ``date -> np.ndarray`` of shape (n_assets, n_assets).

        Performance:
            Delegates to :func:`ewm_corr`, which is O(T·N²) in both
            time and memory. The returned dict holds *T* references into the
            result tensor (one N*N view per date); no extra copies are made.
            For large *N* or *T*, prefer ``cor_tensor`` to keep a single
            contiguous array rather than building a Python dict.
        """
        index = self.prices["date"]
        ret_adj_np = self.ret_adj.select(self.assets).to_numpy()
        tensor = _ewm_corr_numpy(
            ret_adj_np,
            com=self.cfg.corr,
            min_periods=self.cfg.corr,
            min_corr_denom=self.cfg.min_corr_denom,
        )
        return {index[t]: tensor[t] for t in range(len(index))}

    @property
    def cor_tensor(self) -> np.ndarray:
        """Return all correlation matrices stacked as a 3-D tensor.

        Converts the per-timestamp correlation dict (see :py:attr:`cor`) into a
        single contiguous NumPy array so that the full history can be saved to
        a flat ``.npy`` file with :func:`numpy.save` and reloaded with
        :func:`numpy.load`.

        Returns:
            np.ndarray: Array of shape ``(T, N, N)`` where *T* is the number of
            timestamps and *N* the number of assets. ``tensor[t]`` is the
            correlation matrix for the *t*-th date (same ordering as
            ``self.prices["date"]``).

        Examples:
            >>> import tempfile, pathlib
            >>> import numpy as np
            >>> import polars as pl
            >>> from basanos.math.optimizer import BasanosConfig, BasanosEngine
            >>> dates = pl.Series("date", list(range(100)))
            >>> rng0 = np.random.default_rng(0).lognormal(size=100)
            >>> rng1 = np.random.default_rng(1).lognormal(size=100)
            >>> prices = pl.DataFrame({"date": dates, "A": rng0, "B": rng1})
            >>> rng2 = np.random.default_rng(2).normal(size=100)
            >>> rng3 = np.random.default_rng(3).normal(size=100)
            >>> mu = pl.DataFrame({"date": dates, "A": rng2, "B": rng3})
            >>> cfg = BasanosConfig(vola=10, corr=20, clip=3.0, shrink=0.5, aum=1e6)
            >>> engine = BasanosEngine(prices=prices, mu=mu, cfg=cfg)
            >>> tensor = engine.cor_tensor
            >>> with tempfile.TemporaryDirectory() as td:
            ...     path = pathlib.Path(td) / "cor.npy"
            ...     np.save(path, tensor)
            ...     loaded = np.load(path)
            >>> np.testing.assert_array_equal(tensor, loaded)
        """
        return np.stack(list(self.cor.values()), axis=0)

    # ------------------------------------------------------------------
    # Internal solve helpers — inherited from _SolveMixin
    # ------------------------------------------------------------------
    # (_compute_mask, _check_signal, _scale_to_cash, _row_early_check,
    #  _denom_guard_yield, _compute_position, _replay_positions,
    #  _iter_matrices, _iter_solve, warmup_state)
    # Implementations live in _engine_solve.py; patch targets remain in that
    # module's namespace, e.g. ``patch("basanos.math._engine_solve.solve")``.

    # ------------------------------------------------------------------
    # Position properties
    # ------------------------------------------------------------------

    @property
    def cash_position(self) -> pl.DataFrame:
        r"""Optimize correlation-aware risk positions for each timestamp.

        Supports two covariance modes controlled by ``cfg.covariance_config``:

        * :class:`EwmaShrinkConfig` (default): Computes EWMA correlations, applies
          linear shrinkage toward the identity, and solves a normalised linear
          system :math:`C\,x = \mu` per timestamp via Cholesky / LU.

        * :class:`SlidingWindowConfig`: At each timestamp uses the
          ``cfg.covariance_config.window`` most recent vol-adjusted returns to fit a
          rank-``cfg.covariance_config.n_factors`` factor model via truncated SVD and
          solves the system via the Woodbury identity at :math:`O(k^3 + kn)` rather
          than :math:`O(n^3)` per step.

        Non-finite or ill-posed cases yield zero positions for safety.

        Returns:
            pl.DataFrame: DataFrame with columns ['date'] + asset names containing
            the per-timestamp cash positions (risk divided by EWMA volatility).

        Performance:
            For ``ewma_shrink``: dominant cost is ``self.cor`` (O(T·N²) time,
            O(T·N²) memory — see :func:`ewm_corr`). The per-timestamp
            linear solve adds O(N³) per row.

            For ``sliding_window``: O(T·W·N·k) for sliding SVDs plus
            O(T·(k³ + kN)) for Woodbury solves. Memory is O(W·N) per step,
            independent of T.
        """
        assets = self.assets

        # Compute risk positions row-by-row using _replay_positions.
        prices_num = self.prices.select(assets).to_numpy()

        risk_pos_np = np.full_like(prices_num, fill_value=np.nan, dtype=float)
        cash_pos_np = np.full_like(prices_num, fill_value=np.nan, dtype=float)
        vola_np = self.vola.select(assets).to_numpy()

        self._replay_positions(risk_pos_np, cash_pos_np, vola_np)

        # Build Polars DataFrame for cash positions (numeric columns only)
        cash_position = self.prices.with_columns(
            [(pl.lit(cash_pos_np[:, i]).alias(asset)) for i, asset in enumerate(assets)]
        )

        return cash_position
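The sliding-window mode mentioned in the ``cash_position`` docstring relies on the Woodbury identity to avoid a full N×N solve. The sketch below illustrates the identity on its own, assuming a covariance of the form ``diag(d) + U Uᵀ`` with ``k`` factors; that structure and the name ``woodbury_solve`` are assumptions for illustration, and the factor model inside basanos may be parameterised differently.

```python
import numpy as np


def woodbury_solve(d: np.ndarray, u: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Solve (diag(d) + u @ u.T) x = b via the Woodbury identity.

    d: (n,) strictly positive diagonal, u: (n, k) loadings, b: (n,) right-hand side.
    """
    d_inv_b = b / d                      # D^{-1} b
    d_inv_u = u / d[:, None]             # D^{-1} U
    k = u.shape[1]
    small = np.eye(k) + u.T @ d_inv_u    # only a (k, k) system has to be solved
    correction = d_inv_u @ np.linalg.solve(small, u.T @ d_inv_b)
    return d_inv_b - correction


rng = np.random.default_rng(0)
n, k = 6, 2
d = rng.uniform(0.5, 1.5, size=n)
u = rng.normal(size=(n, k))
b = rng.normal(size=n)

x = woodbury_solve(d, u, b)
# The residual against the dense system should be at rounding-error level.
print(np.abs((np.diag(d) + u @ u.T) @ x - b).max())
```

Because the only dense solve is on a k×k matrix, the per-timestamp cost stays small when the number of factors is much lower than the number of assets.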

    @property
    def position_status(self) -> pl.DataFrame:
        """Per-timestamp reason code explaining each :attr:`cash_position` row.

        Labels every row with exactly one of four :class:`~basanos.math.SolveStatus`
        codes (which compare equal to their string equivalents):

        * ``'warmup'``: Insufficient history for the sliding-window
          covariance mode (``i + 1 < cfg.covariance_config.window``).
          Positions are ``NaN`` for all assets at this timestamp.
        * ``'zero_signal'``: The expected-return vector ``mu`` was
          all-zeros (or all-NaN) at this timestamp; the optimizer
          short-circuited and returned zero positions without solving.
        * ``'degenerate'``: The normalisation denominator was non-finite
          or below ``cfg.denom_tol``, the Cholesky / Woodbury solve
          failed, or no asset had a finite price; positions were zeroed
          for safety.
        * ``'valid'``: The linear system was solved successfully and
          the resulting positions are not trivially zero.

        The first three codes map one-to-one onto the three NaN / zero
        cases and allow downstream consumers (backtests, risk monitors)
        to distinguish data gaps from signal silence from numerical
        ill-conditioning without re-inspecting ``mu`` or the engine
        configuration.

        Returns:
            pl.DataFrame: Two-column DataFrame ``{'date': ..., 'status': ...}``
            with one row per timestamp. The ``status`` column has
            ``Polars`` dtype ``String``.
        """
        statuses = [status for _i, _t, _mask, _pos, status in self._iter_solve()]
        return pl.DataFrame({"date": self.prices["date"], "status": pl.Series(statuses, dtype=pl.String)})

    @property
    def risk_position(self) -> pl.DataFrame:
        """Risk positions (before EWMA-volatility scaling) at each timestamp.

        Derives the un-volatility-scaled position by multiplying the cash
        position by the per-asset EWMA volatility. Equivalently, this is
        the quantity solved by the correlation-adjusted linear system before
        dividing by ``vola``.

        Relationship to other properties::

            cash_position = risk_position / vola
            risk_position = cash_position * vola

        Returns:
            pl.DataFrame: DataFrame with columns ``['date'] + assets`` where
            each value is ``cash_position_i * vola_i`` at the given timestamp.
        """
        assets = self.assets
        cp_np = self.cash_position.select(assets).to_numpy()
        vola_np = self.vola.select(assets).to_numpy()
        with np.errstate(invalid="ignore"):
            risk_pos = cp_np * vola_np
        return self.prices.with_columns([pl.lit(risk_pos[:, i]).alias(asset) for i, asset in enumerate(assets)])

    @property
    def position_leverage(self) -> pl.DataFrame:
        """L1 norm of cash positions (gross leverage) at each timestamp.

        Sums the absolute values of all asset cash positions at each row.
        NaN positions are treated as zero (they contribute nothing to gross
        leverage).

        Returns:
            pl.DataFrame: Two-column DataFrame ``{'date': ..., 'leverage': ...}``
            where ``leverage`` is the L1 norm of the cash-position vector.
        """
        assets = self.assets
        cp_np = self.cash_position.select(assets).to_numpy()
        leverage = np.nansum(np.abs(cp_np), axis=1)
        return pl.DataFrame({"date": self.prices["date"], "leverage": pl.Series(leverage, dtype=pl.Float64)})
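The relationship between ``risk_position``, ``cash_position`` and ``position_leverage`` can be checked on a toy array. This is a small NumPy sketch with made-up numbers (the volatilities are chosen to be exact in binary floating point, not to be realistic); the second row imitates a warm-up row of NaN positions.

```python
import numpy as np

# Two timestamps, two assets; all values are illustrative only.
risk = np.array([[1.0, -0.5], [np.nan, np.nan]])
vola = np.array([[0.25, 0.125], [0.25, 0.125]])

cash = risk / vola                           # cash_position = risk_position / vola
leverage = np.nansum(np.abs(cash), axis=1)   # L1 norm per row, NaN counted as zero

print(cash[0])      # [ 4. -4.]
print(leverage)     # [8. 0.]
```

Using ``np.nansum`` rather than ``np.sum`` is what makes an all-NaN warm-up row report a leverage of zero instead of NaN.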

    # ------------------------------------------------------------------
    # Portfolio and performance
    # ------------------------------------------------------------------

    @property
    def portfolio(self) -> Portfolio:
        """Construct a Portfolio from the optimized cash positions.

        Converts the computed cash positions into a Portfolio using the
        configured AUM. The ``cost_per_unit`` from :attr:`cfg` is forwarded
        so that :attr:`~jquantstats.Portfolio.net_cost_nav` and
        :attr:`~jquantstats.Portfolio.position_delta_costs` work out
        of the box without any further configuration.

        Returns:
            Portfolio: Instance built from cash positions with AUM scaling.
        """
        cp = self.cash_position
        assets = [c for c in cp.columns if c != "date" and cp[c].dtype.is_numeric()]
        scaled = cp.with_columns(pl.col(a) * self.cfg.position_scale for a in assets)
        return Portfolio.from_cash_position(self.prices, scaled, aum=self.cfg.aum, cost_per_unit=self.cfg.cost_per_unit)

    def sharpe_at_shrink(self, shrink: float) -> float:
        r"""Return the annualised portfolio Sharpe ratio for the given shrinkage weight.

        Constructs a new :class:`BasanosEngine` with all parameters identical to
        ``self`` except that ``cfg.shrink`` is replaced by ``shrink``, then
        returns the annualised Sharpe ratio of the resulting portfolio.

        This is the canonical single-argument callable required by the benchmarks
        specification: ``f(λ) → Sharpe``. Use it to sweep λ across ``[0, 1]``
        and measure whether correlation adjustment adds value over the
        signal-proportional baseline (λ = 0) or the unregularised limit (λ = 1).

        Corner cases:
            * **λ = 0** — the shrunk matrix equals the identity, so the
              optimiser treats all assets as uncorrelated and positions are
              purely signal-proportional (no correlation adjustment).
            * **λ = 1** — the raw EWMA correlation matrix is used without
              shrinkage.

        Args:
            shrink: Retention weight λ ∈ [0, 1]. See
                :attr:`BasanosConfig.shrink` for full documentation.

        Returns:
            Annualised Sharpe ratio of the portfolio returns as a ``float``.
            Returns ``float("nan")`` when the Sharpe ratio cannot be computed
            (e.g. zero-variance returns).

        Raises:
            ValidationError: When ``shrink`` is outside [0, 1] (delegated to
                :class:`BasanosConfig` field validation).

        Examples:
            >>> import numpy as np
            >>> import polars as pl
            >>> from basanos.math.optimizer import BasanosConfig, BasanosEngine
            >>> dates = pl.Series("date", list(range(200)))
            >>> rng = np.random.default_rng(0)
            >>> prices = pl.DataFrame({"date": dates, "A": rng.lognormal(size=200), "B": rng.lognormal(size=200)})
            >>> mu = pl.DataFrame({"date": dates, "A": rng.normal(size=200), "B": rng.normal(size=200)})
            >>> cfg = BasanosConfig(vola=10, corr=20, clip=3.0, shrink=0.5, aum=1e6)
            >>> engine = BasanosEngine(prices=prices, mu=mu, cfg=cfg)
            >>> s = engine.sharpe_at_shrink(0.5)
            >>> isinstance(s, float)
            True
        """
        new_cfg = self.cfg.replace(shrink=shrink)
        engine = BasanosEngine(prices=self.prices, mu=self.mu, cfg=new_cfg)
        return float(engine.portfolio.stats.sharpe().get("returns") or float("nan"))
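The two corner cases in the ``sharpe_at_shrink`` docstring follow directly from linear shrinkage toward the identity with retention weight λ. A minimal sketch, assuming the shrunk matrix is ``λ·C + (1 − λ)·I``; the function name ``shrink_to_identity`` is hypothetical and stands in for whatever helper the engine uses internally.

```python
import numpy as np


def shrink_to_identity(corr: np.ndarray, lam: float) -> np.ndarray:
    """Linear shrinkage: keep a fraction lam of corr, blend the rest with I."""
    n = corr.shape[0]
    return lam * corr + (1.0 - lam) * np.eye(n)


corr = np.array([[1.0, 0.8], [0.8, 1.0]])

print(shrink_to_identity(corr, 0.0))  # identity: assets treated as uncorrelated
print(shrink_to_identity(corr, 0.5))  # off-diagonal halved to 0.4
print(shrink_to_identity(corr, 1.0))  # raw correlation matrix, no shrinkage
```

The diagonal stays at one for every λ, so only the off-diagonal correlations are damped as λ moves from 1 toward 0.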

    def sharpe_at_window_factors(self, window: int, n_factors: int) -> float:
        r"""Return the annualised portfolio Sharpe ratio for the given sliding-window parameters.

        Constructs a new :class:`BasanosEngine` with ``covariance_mode`` set to
        ``"sliding_window"`` and the supplied ``window`` / ``n_factors``, keeping
        all other configuration identical to ``self``.

        Use this method to sweep ``(W, k)`` and compare the sliding-window
        estimator against the EWMA baseline (via :meth:`sharpe_at_shrink`).

        Args:
            window: Rolling window length :math:`W \geq 1`.
                Rule of thumb: :math:`W \geq 2 \cdot n_{\text{assets}}`.
            n_factors: Number of latent factors :math:`k \geq 1`.

        Returns:
            Annualised Sharpe ratio of the portfolio returns as a ``float``.
            Returns ``float("nan")`` when the Sharpe ratio cannot be computed
            (e.g. not enough history to fill the first window).

        Raises:
            ValidationError: When ``window`` or ``n_factors`` fail field
                constraints (delegated to :class:`BasanosConfig`).

        Examples:
            >>> import numpy as np
            >>> import polars as pl
            >>> from basanos.math.optimizer import BasanosConfig, BasanosEngine
            >>> dates = pl.Series("date", list(range(200)))
            >>> rng = np.random.default_rng(0)
            >>> prices = pl.DataFrame({"date": dates, "A": rng.lognormal(size=200), "B": rng.lognormal(size=200)})
            >>> mu = pl.DataFrame({"date": dates, "A": rng.normal(size=200), "B": rng.normal(size=200)})
            >>> cfg = BasanosConfig(vola=10, corr=20, clip=3.0, shrink=0.5, aum=1e6)
            >>> engine = BasanosEngine(prices=prices, mu=mu, cfg=cfg)
            >>> s = engine.sharpe_at_window_factors(window=40, n_factors=2)
            >>> isinstance(s, float)
            True
        """
        new_cfg = self.cfg.replace(
            covariance_config=SlidingWindowConfig(window=window, n_factors=n_factors),
        )
        engine = BasanosEngine(prices=self.prices, mu=self.mu, cfg=new_cfg)
        return float(engine.portfolio.stats.sharpe().get("returns") or float("nan"))

    @property
    def naive_sharpe(self) -> float:
        r"""Sharpe ratio of the naïve equal-weight signal (μ = 1 for every asset/timestamp).

        Replaces the expected-return signal ``mu`` with a constant matrix of
        ones, then runs the optimiser with the current configuration and returns
        the annualised Sharpe ratio of the resulting portfolio.

        This provides the baseline answer to *"does the signal add value?"*:
        a real signal should produce a higher Sharpe than the naïve benchmark.
        Combined with :meth:`sharpe_at_shrink`, this yields a three-way
        comparison:

        +------------------------------------+----------------------------------------------+
        | Benchmark                          | What it measures                             |
        +====================================+==============================================+
        | ``naive_sharpe``                   | No signal skill; pure correlation routing    |
        +------------------------------------+----------------------------------------------+
        | ``sharpe_at_shrink(0.0)``          | Signal skill, no correlation adj.            |
        +------------------------------------+----------------------------------------------+
        | ``sharpe_at_shrink(cfg.shrink)``   | Signal + correlation adj.                    |
        +------------------------------------+----------------------------------------------+

        Returns:
            Annualised Sharpe ratio of the equal-weight portfolio as a ``float``.
            Returns ``float("nan")`` when the Sharpe ratio cannot be computed.

        Examples:
            >>> import numpy as np
            >>> import polars as pl
            >>> from basanos.math.optimizer import BasanosConfig, BasanosEngine
            >>> dates = pl.Series("date", list(range(200)))
            >>> rng = np.random.default_rng(0)
            >>> prices = pl.DataFrame({"date": dates, "A": rng.lognormal(size=200), "B": rng.lognormal(size=200)})
            >>> mu = pl.DataFrame({"date": dates, "A": rng.normal(size=200), "B": rng.normal(size=200)})
            >>> cfg = BasanosConfig(vola=10, corr=20, clip=3.0, shrink=0.5, aum=1e6)
            >>> engine = BasanosEngine(prices=prices, mu=mu, cfg=cfg)
            >>> s = engine.naive_sharpe
            >>> isinstance(s, float)
            True
        """
        naive_mu = self.mu.with_columns(pl.lit(1.0).alias(asset) for asset in self.assets)
        engine = BasanosEngine(prices=self.prices, mu=naive_mu, cfg=self.cfg)
        return float(engine.portfolio.stats.sharpe().get("returns") or float("nan"))
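All three Sharpe helpers above return NaN when the ratio cannot be computed. The sketch below shows one common way to annualise a Sharpe ratio with that NaN convention. It assumes 252 periods per year and a sample standard deviation; the convention used by ``jquantstats`` may differ, and ``annualised_sharpe`` is a hypothetical name.

```python
import math
import statistics


def annualised_sharpe(returns: list[float], periods_per_year: int = 252) -> float:
    """Mean over sample std of per-period returns, scaled by sqrt(periods_per_year)."""
    if len(returns) < 2:
        return float("nan")  # sample std is undefined for fewer than two points
    std = statistics.stdev(returns)
    if std == 0.0:
        return float("nan")  # zero-variance returns: ratio is undefined
    return statistics.fmean(returns) / std * math.sqrt(periods_per_year)


print(annualised_sharpe([0.01, 0.02, 0.03]))
print(annualised_sharpe([0.0, 0.0, 0.0]))
```

Returning NaN instead of raising keeps parameter sweeps such as a λ grid running even when individual settings produce a flat return series.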

    # ------------------------------------------------------------------
    # Reporting
    # ------------------------------------------------------------------

    @property
    def config_report(self) -> "ConfigReport":
        """Return a :class:`~basanos.math._config_report.ConfigReport` facade for this engine.

        Returns a :class:`~basanos.math._config_report.ConfigReport` that
        includes the full **lambda-sweep chart** — an interactive plot of the
        annualised Sharpe ratio as :attr:`~BasanosConfig.shrink` (λ) is swept
        across [0, 1] — in addition to the parameter table, shrinkage-guidance
        table, and theory section available from
        :attr:`BasanosConfig.report`.

        Returns:
            basanos.math._config_report.ConfigReport: Report facade with
            ``to_html()`` and ``save()`` methods.

        Examples:
            >>> import numpy as np
            >>> import polars as pl
            >>> from basanos.math.optimizer import BasanosConfig, BasanosEngine
            >>> dates = pl.Series("date", list(range(200)))
            >>> rng = np.random.default_rng(0)
            >>> prices = pl.DataFrame({"date": dates, "A": rng.lognormal(size=200), "B": rng.lognormal(size=200)})
            >>> mu = pl.DataFrame({"date": dates, "A": rng.normal(size=200), "B": rng.normal(size=200)})
            >>> cfg = BasanosConfig(vola=10, corr=20, clip=3.0, shrink=0.5, aum=1e6)
            >>> engine = BasanosEngine(prices=prices, mu=mu, cfg=cfg)
            >>> report = engine.config_report
            >>> html = report.to_html()
            >>> "Lambda" in html
            True
        """
        from ._config_report import ConfigReport

        return ConfigReport(config=self.cfg, engine=self)

    # ------------------------------------------------------------------
    # Matrix diagnostics — inherited from _DiagnosticsMixin
    # ------------------------------------------------------------------
    # (condition_number, effective_rank, solver_residual, signal_utilisation)
    # Implementations live in _engine_diagnostics.py; patch targets remain in
    # that module's namespace, e.g.
    # ``patch("basanos.math._engine_diagnostics.solve")``.

    # ------------------------------------------------------------------
    # Signal evaluation — inherited from _SignalEvaluatorMixin
    # ------------------------------------------------------------------
    # (_ic_series, ic, rank_ic, ic_mean, ic_std, icir,
    #  rank_ic_mean, rank_ic_std)
    # Implementations live in _engine_ic.py.