Coverage for src/basanos/math/_signal.py: 100% (10 statements). coverage.py v7.13.5, created at 2026-03-19 05:23 +0000

1"""Internal signal utilities (private to basanos.math). 

2 

3This module contains low-level helpers for building signals and 

4transformations. It is considered an internal implementation detail of 

5``basanos.math``. Do not import this module directly from outside the 

6package; instead import the public symbols from ``basanos.math``. 

7""" 

8 

9from __future__ import annotations 

10 

11import numpy as np 

12import polars as pl 

13 

14 

def shrink2id(matrix: np.ndarray, lamb: float = 1.0) -> np.ndarray:
    r"""Shrink a square matrix linearly towards the identity matrix.

    This implements the **convex linear shrinkage** estimator

    .. math::

        \hat{\Sigma}(\lambda) = \lambda \cdot M + (1 - \lambda) \cdot I_n

    where :math:`M` is the sample matrix, :math:`I_n` is the :math:`n \times n`
    identity matrix, and :math:`\lambda \in [0, 1]` is the *retention weight*
    (equivalently, ``1 - lamb`` is the *shrinkage intensity*).


    **Why shrink toward the identity?**

    Sample covariance/correlation matrices estimated from a finite number of
    observations :math:`T` are poorly conditioned when the number of assets
    :math:`n` is large relative to :math:`T`. This is the classical
    *curse of dimensionality*: extreme eigenvalues of the sample matrix are
    biased away from their population counterparts (the Marchenko-Pastur law
    describes the bias as a function of the concentration ratio :math:`n / T`).
    Shrinkage pulls eigenvalues toward a common target, here the identity's
    unit eigenvalue, reducing estimation error at the cost of a small bias [1]_.


    **Relationship to Ledoit-Wolf shrinkage**

    Ledoit and Wolf (2004) [2]_ derive the *optimal* scalar shrinkage
    intensity :math:`\alpha^*` by minimizing the expected Frobenius loss
    :math:`\mathbb{E}[\|\hat{\Sigma}(\alpha) - \Sigma\|_F^2]` under a
    general factor model. Their closed-form estimator is a special case of
    this function with ``lamb = 1 - alpha*``. The Oracle Approximating
    Shrinkage (OAS) estimator [3]_ improves on Ledoit-Wolf by accounting for
    the bias in the analytic formula, often yielding better finite-sample
    performance.


    **Basanos usage**

    In Basanos the target matrix is always the *correlation* identity (diagonal
    ones, off-diagonal zeros), and ``lamb`` is supplied via
    :attr:`~basanos.math.BasanosConfig.shrink` as a user-controlled
    hyperparameter rather than an analytically chosen optimal value. This is
    appropriate when *regularising a solver* (the system
    :math:`C x = \mu` must be well-posed at every timestamp) rather than
    *estimating a covariance matrix*; here practical stability often matters
    more than minimum Frobenius loss.


    **Empirical guidance for choosing** ``lamb`` **(= cfg.shrink)**

    The table below offers practical starting points for daily financial return
    data. All recommendations should be validated on out-of-sample data.

    +--------------------------+---------------------------+------------------+
    | Regime                   | Suggested ``lamb``        | Rationale        |
    +==========================+===========================+==================+
    | Many assets, short       | 0.3 - 0.6                 | High             |
    | lookback (n/T > 0.5)     |                           | concentration    |
    |                          |                           | ratio; sample    |
    |                          |                           | matrix is noisy. |
    +--------------------------+---------------------------+------------------+
    | Moderate assets,         | 0.5 - 0.8                 | Balanced         |
    | moderate lookback        |                           | regularisation.  |
    | (n/T ~ 0.1 - 0.5)        |                           |                  |
    +--------------------------+---------------------------+------------------+
    | Few assets, long         | 0.7 - 1.0                 | Sample matrix    |
    | lookback (n/T < 0.1)     |                           | is reliable;     |
    |                          |                           | light shrinkage  |
    |                          |                           | for robustness.  |
    +--------------------------+---------------------------+------------------+

    A simple heuristic: start with ``lamb = 1 - n / (2 * T)``, where
    ``n`` is the number of assets and ``T`` is the EWMA correlation lookback
    (``cfg.corr``), a rough approximation of the Ledoit-Wolf formula, then
    tune on held-out data.


    **Sensitivity note**

    Shrinkage is most sensitive in the range :math:`\lambda \in [0.3, 0.8]`.
    Above ~0.8 the matrix can become nearly singular for larger portfolios
    with short lookbacks (``n > 10`` with ``corr < 50``); below ~0.3 the
    off-diagonal correlations are so heavily damped that the optimizer behaves
    almost as if assets were uncorrelated.


    Args:
        matrix: Square matrix to shrink (typically a correlation matrix).
        lamb: Retention weight :math:`\lambda \in [0, 1]`. ``1.0`` returns
            the original matrix unchanged; ``0.0`` returns the identity.

    Returns:
        The shrunk matrix with the same shape as ``matrix``.

    References:
        .. [1] Stein, C. (1956). *Inadmissibility of the usual estimator for
           the mean of a multivariate normal distribution.* Proceedings
           of the Third Berkeley Symposium, 1, 197-206.
        .. [2] Ledoit, O., & Wolf, M. (2004). *A well-conditioned estimator for
           large-dimensional covariance matrices.* Journal of Multivariate
           Analysis, 88(2), 365-411.
           https://doi.org/10.1016/S0047-259X(03)00096-4
        .. [3] Chen, Y., Wiesel, A., Eldar, Y. C., & Hero, A. O. (2010).
           *Shrinkage algorithms for MMSE covariance estimation.* IEEE
           Transactions on Signal Processing, 58(10), 5016-5029.
           https://doi.org/10.1109/TSP.2010.2053029

    Examples:
        >>> import numpy as np
        >>> # Full retention: original matrix unchanged
        >>> shrink2id(np.array([[2.0, 1.0], [1.0, 3.0]]), lamb=1.0).tolist()
        [[2.0, 1.0], [1.0, 3.0]]
        >>> # Full shrinkage: identity matrix
        >>> shrink2id(np.array([[2.0, 0.0], [0.0, 2.0]]), lamb=0.0).tolist()
        [[1.0, 0.0], [0.0, 1.0]]
        >>> # Half-way: average of matrix and identity
        >>> m = np.array([[2.0, 1.0], [1.0, 3.0]])
        >>> shrink2id(m, lamb=0.5).tolist()
        [[1.5, 0.5], [0.5, 2.0]]
    """

    return matrix * lamb + (1 - lamb) * np.eye(N=matrix.shape[0])

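A standalone numeric check of the conditioning argument above (a sketch, not part of the module; ``shrink2id`` is restated inline so the snippet runs on its own, and the ``n``, ``T`` values are illustrative):

```python
import numpy as np


def shrink2id(matrix: np.ndarray, lamb: float = 1.0) -> np.ndarray:
    # Convex combination of the sample matrix and the identity.
    return matrix * lamb + (1 - lamb) * np.eye(matrix.shape[0])


rng = np.random.default_rng(0)
n, T = 20, 30  # n / T > 0.5: high concentration ratio, noisy sample matrix
X = rng.standard_normal((T, n))
corr = np.corrcoef(X, rowvar=False)  # sample correlation, badly conditioned

# Heuristic starting point from the docstring: lamb = 1 - n / (2 * T)
lamb = 1 - n / (2 * T)

cond_before = np.linalg.cond(corr)
cond_after = np.linalg.cond(shrink2id(corr, lamb=lamb))
assert cond_after < cond_before  # shrinkage improves conditioning

# Eigenvalues are interpolated toward 1 (the identity's eigenvalue):
# eig(shrunk) = lamb * eig(corr) + (1 - lamb), in the shared eigenbasis.
evals = np.linalg.eigvalsh(corr)
evals_shrunk = np.linalg.eigvalsh(shrink2id(corr, lamb=lamb))
assert np.allclose(evals_shrunk, lamb * evals + (1 - lamb))
```

Because the identity commutes with everything, shrinkage acts purely on the eigenvalues and leaves the eigenvectors of the sample matrix untouched, which is why the conditioning improvement is guaranteed whenever the smallest eigenvalue is below 1.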

def vol_adj(x: pl.Expr, vola: int, clip: float, min_samples: int = 1) -> pl.Expr:
    """Compute clipped, volatility-adjusted log returns per column.

    - ``vola`` controls the EWM std smoothing: it is converted internally to
      ``com = vola - 1``, i.e. ``alpha = 1 / vola``.
    - ``clip`` applies symmetric clipping to the standardized returns.

    Args:
        x: Polars expression (price series) to transform.
        vola: EWMA lookback for the std estimate (centre-of-mass based;
            ``com = vola - 1``).
        clip: Symmetric clipping threshold applied after standardization.
        min_samples: Minimum samples required by EWM to yield non-null values.

    Returns:
        A Polars expression with standardized and clipped log returns.

    Examples:
        >>> import polars as pl
        >>> df = pl.DataFrame({"p": [1.0, 1.1, 1.05, 1.15, 1.2]})
        >>> result = df.select(vol_adj(pl.col("p"), vola=2, clip=3.0))
        >>> result.shape
        (5, 1)
    """

    # compute the log returns
    log_returns = x.log().diff()

    # compute the EWM volatility of the log returns
    vol = log_returns.ewm_std(com=vola - 1, adjust=True, min_samples=min_samples)

    # standardize by the volatility and clip symmetrically
    vol_adj_returns = (log_returns / vol).clip(-clip, clip)

    return vol_adj_returns
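For readers without Polars at hand, the same pipeline can be sketched in plain NumPy. This is a simplified rendition, not the module's implementation: it uses the pandas/polars convention ``alpha = 1 / (1 + com)`` (so ``com = vola - 1`` gives ``alpha = 1 / vola``) and computes the *biased* EWM variance for brevity, whereas ``ewm_std``'s default ``bias=False`` debiases.

```python
import numpy as np


def vol_adj_np(prices: np.ndarray, vola: int, clip: float) -> np.ndarray:
    """NumPy sketch of vol_adj: log-diff -> EWM std -> standardize -> clip."""
    r = np.diff(np.log(prices))  # log returns (length len(prices) - 1)
    alpha = 1.0 / vola           # com = vola - 1  =>  alpha = 1 / (1 + com)
    sd = np.empty_like(r)
    for t in range(len(r)):
        # "adjust=True" weighting: explicit decaying weights, O(T^2) but clear
        w = (1.0 - alpha) ** np.arange(t, -1, -1)
        mu = np.sum(w * r[: t + 1]) / w.sum()
        sd[t] = np.sqrt(np.sum(w * (r[: t + 1] - mu) ** 2) / w.sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        z = r / sd               # standardized returns
    return np.clip(z, -clip, clip)  # symmetric clipping


z = vol_adj_np(np.array([1.0, 1.1, 1.05, 1.15, 1.2]), vola=2, clip=3.0)
assert z.shape == (4,)  # one fewer than prices (diff drops one element)
assert np.all(np.abs(z[np.isfinite(z)]) <= 3.0)
```

The sketch is for intuition only; the Polars expression above is the production path (note it keeps a leading null from ``diff``, so its output has the same length as the input, unlike ``np.diff``).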