
API Reference

The public API surface of jquantstats. All stable exports are importable directly from the top-level package:

from jquantstats import Portfolio, Data, CostModel

See API Stability for the versioning and deprecation policy.


jquantstats

jQuantStats: Portfolio analytics for quants.

Two entry points

Entry point 1 — prices + positions (recommended for active portfolios):

Use Portfolio when you have price series and position sizes. Portfolio compiles the NAV curve from raw inputs and exposes the full analytics suite via .stats, .plots, and .report.

from jquantstats import Portfolio
import polars as pl

pf = Portfolio.from_cash_position(
    prices=prices_df,
    cash_position=positions_df,
    aum=1_000_000,
)
pf.stats.sharpe()
pf.plots.snapshot()

Entry point 2 — returns series (for arbitrary return streams):

Use Data when you already have a returns series (e.g. downloaded from a data vendor) and want benchmark comparison or factor analytics.

from jquantstats import Data
import polars as pl

data = Data.from_returns(returns=returns_df, benchmark=bench_df)
data.stats.sharpe()
data.plots.snapshot(title="Performance")

The two APIs are layered: portfolio.data returns a Data object so you can always drop into the returns-series API from a Portfolio.

For more information, visit the jQuantStats Documentation: https://jebel-quant.github.io/jquantstats/book

CostModel dataclass

Unified representation of a portfolio transaction-cost model.

Eliminates the implicit "pick one" contract between the two independent cost parameters (cost_per_unit and cost_bps) on Portfolio. A CostModel instance encapsulates one model at a time and can be passed to any Portfolio factory method instead of specifying the raw float parameters.

Attributes:

Name Type Description
cost_per_unit float

One-way cost per unit of position change (Model A). Defaults to 0.0.

cost_bps float

One-way cost in basis points of AUM turnover (Model B). Defaults to 0.0.

Raises:

Type Description
ValueError

If cost_per_unit or cost_bps is negative, or if both are non-zero (which would silently double-count costs).

Examples:

>>> CostModel.per_unit(0.01)
CostModel(cost_per_unit=0.01, cost_bps=0.0)
>>> CostModel.turnover_bps(5.0)
CostModel(cost_per_unit=0.0, cost_bps=5.0)
>>> CostModel.zero()
CostModel(cost_per_unit=0.0, cost_bps=0.0)
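
The validation rule described under "Raises" can be tried out without installing the package. The following is a minimal stand-alone sketch that copies the three checks from `__post_init__` in the source listing; `CostModelSketch` and `is_valid` are hypothetical names used only for illustration and are not part of jquantstats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModelSketch:
    """Hypothetical stand-in mirroring the checks in CostModel.__post_init__."""

    cost_per_unit: float = 0.0
    cost_bps: float = 0.0

    def __post_init__(self) -> None:
        # Neither parameter may be negative.
        if self.cost_per_unit < 0:
            raise ValueError(f"cost_per_unit must be non-negative, got {self.cost_per_unit}")
        if self.cost_bps < 0:
            raise ValueError(f"cost_bps must be non-negative, got {self.cost_bps}")
        # At most one of the two cost models may be active.
        if self.cost_per_unit > 0 and self.cost_bps > 0:
            raise ValueError("Only one cost model may be active at a time")


def is_valid(cost_per_unit: float, cost_bps: float) -> bool:
    """Return True when the parameter combination passes validation."""
    try:
        CostModelSketch(cost_per_unit=cost_per_unit, cost_bps=cost_bps)
    except ValueError:
        return False
    return True


print(is_valid(0.01, 0.0))   # Model A only -> True
print(is_valid(0.0, 5.0))    # Model B only -> True
print(is_valid(0.01, 5.0))   # both active -> False
print(is_valid(-0.01, 0.0))  # negative cost -> False
```

Because the check uses `> 0` on both fields, the all-zero combination is accepted, which is what `CostModel.zero()` relies on.
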
Source code in src/jquantstats/_cost_model.py
@dataclasses.dataclass(frozen=True)
class CostModel:
    """Unified representation of a portfolio transaction-cost model.

    Eliminates the implicit "pick one" contract between the two independent
    cost parameters (``cost_per_unit`` and ``cost_bps``) on
    `Portfolio`.  A ``CostModel``
    instance encapsulates one model at a time and can be passed to any
    Portfolio factory method instead of specifying the raw float parameters.

    Attributes:
        cost_per_unit: One-way cost per unit of position change (Model A).
            Defaults to 0.0.
        cost_bps: One-way cost in basis points of AUM turnover (Model B).
            Defaults to 0.0.

    Raises:
        ValueError: If ``cost_per_unit`` or ``cost_bps`` is negative, or if
            both are non-zero (which would silently double-count costs).

    Examples:
        >>> CostModel.per_unit(0.01)
        CostModel(cost_per_unit=0.01, cost_bps=0.0)
        >>> CostModel.turnover_bps(5.0)
        CostModel(cost_per_unit=0.0, cost_bps=5.0)
        >>> CostModel.zero()
        CostModel(cost_per_unit=0.0, cost_bps=0.0)
    """

    cost_per_unit: float = 0.0
    cost_bps: float = 0.0

    def __post_init__(self) -> None:
        if self.cost_per_unit < 0:
            raise ValueError(f"cost_per_unit must be non-negative, got {self.cost_per_unit}")  # noqa: TRY003
        if self.cost_bps < 0:
            raise ValueError(f"cost_bps must be non-negative, got {self.cost_bps}")  # noqa: TRY003
        if self.cost_per_unit > 0 and self.cost_bps > 0:
            raise ValueError(  # noqa: TRY003
                "Only one cost model may be active at a time: "
                f"got cost_per_unit={self.cost_per_unit} and cost_bps={self.cost_bps}. "
                "Use CostModel.per_unit() or CostModel.turnover_bps() to make intent explicit."
            )

    # ── Named constructors ────────────────────────────────────────────────────

    @classmethod
    def per_unit(cls, cost: float) -> CostModel:
        """Create a Model A (position-delta) cost model.

        Args:
            cost: One-way cost per unit of position change.  Must be
                non-negative.

        Returns:
            A `CostModel` with ``cost_per_unit=cost`` and
            ``cost_bps=0.0``.

        Examples:
            >>> CostModel.per_unit(0.01)
            CostModel(cost_per_unit=0.01, cost_bps=0.0)
        """
        return cls(cost_per_unit=cost, cost_bps=0.0)

    @classmethod
    def turnover_bps(cls, bps: float) -> CostModel:
        """Create a Model B (turnover-bps) cost model.

        Args:
            bps: One-way cost in basis points of AUM turnover.  Must be
                non-negative.

        Returns:
            A `CostModel` with ``cost_per_unit=0.0`` and
            ``cost_bps=bps``.

        Examples:
            >>> CostModel.turnover_bps(5.0)
            CostModel(cost_per_unit=0.0, cost_bps=5.0)
        """
        return cls(cost_per_unit=0.0, cost_bps=bps)

    @classmethod
    def zero(cls) -> CostModel:
        """Create a zero-cost model (no transaction costs).

        Returns:
            A `CostModel` with both parameters set to 0.0.

        Examples:
            >>> CostModel.zero()
            CostModel(cost_per_unit=0.0, cost_bps=0.0)
        """
        return cls(cost_per_unit=0.0, cost_bps=0.0)

per_unit(cost) classmethod

Create a Model A (position-delta) cost model.

Parameters:

Name Type Description Default
cost float

One-way cost per unit of position change. Must be non-negative.

required

Returns:

Type Description
CostModel

A CostModel with cost_per_unit=cost and cost_bps=0.0.

Examples:

>>> CostModel.per_unit(0.01)
CostModel(cost_per_unit=0.01, cost_bps=0.0)
Source code in src/jquantstats/_cost_model.py
@classmethod
def per_unit(cls, cost: float) -> CostModel:
    """Create a Model A (position-delta) cost model.

    Args:
        cost: One-way cost per unit of position change.  Must be
            non-negative.

    Returns:
        A `CostModel` with ``cost_per_unit=cost`` and
        ``cost_bps=0.0``.

    Examples:
        >>> CostModel.per_unit(0.01)
        CostModel(cost_per_unit=0.01, cost_bps=0.0)
    """
    return cls(cost_per_unit=cost, cost_bps=0.0)

turnover_bps(bps) classmethod

Create a Model B (turnover-bps) cost model.

Parameters:

Name Type Description Default
bps float

One-way cost in basis points of AUM turnover. Must be non-negative.

required

Returns:

Type Description
CostModel

A CostModel with cost_per_unit=0.0 and cost_bps=bps.

Examples:

>>> CostModel.turnover_bps(5.0)
CostModel(cost_per_unit=0.0, cost_bps=5.0)
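
For orientation, one basis point is 1/10,000, so a 5 bps one-way cost on 200,000 of traded notional comes to 100. The sketch below shows only that unit conversion. `turnover_cost` is a hypothetical helper, and this section does not show how `Portfolio` measures turnover or applies the charge, so treat the formula as an assumption.

```python
def turnover_cost(turnover: float, cost_bps: float) -> float:
    """Convert a basis-point rate into a currency cost (hypothetical helper).

    Assumes cost = turnover * bps / 10_000; the real Portfolio logic may differ.
    """
    if turnover < 0 or cost_bps < 0:
        raise ValueError("turnover and cost_bps must be non-negative")
    return turnover * cost_bps / 10_000


# 5 bps charged on 200,000 of traded notional.
cost = turnover_cost(turnover=200_000.0, cost_bps=5.0)
print(cost)  # 100.0
```
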
Source code in src/jquantstats/_cost_model.py
@classmethod
def turnover_bps(cls, bps: float) -> CostModel:
    """Create a Model B (turnover-bps) cost model.

    Args:
        bps: One-way cost in basis points of AUM turnover.  Must be
            non-negative.

    Returns:
        A `CostModel` with ``cost_per_unit=0.0`` and
        ``cost_bps=bps``.

    Examples:
        >>> CostModel.turnover_bps(5.0)
        CostModel(cost_per_unit=0.0, cost_bps=5.0)
    """
    return cls(cost_per_unit=0.0, cost_bps=bps)

zero() classmethod

Create a zero-cost model (no transaction costs).

Returns:

Type Description
CostModel

A CostModel with both parameters set to 0.0.

Examples:

>>> CostModel.zero()
CostModel(cost_per_unit=0.0, cost_bps=0.0)
Source code in src/jquantstats/_cost_model.py
@classmethod
def zero(cls) -> CostModel:
    """Create a zero-cost model (no transaction costs).

    Returns:
        A `CostModel` with both parameters set to 0.0.

    Examples:
        >>> CostModel.zero()
        CostModel(cost_per_unit=0.0, cost_bps=0.0)
    """
    return cls(cost_per_unit=0.0, cost_bps=0.0)

Data dataclass

A container for financial returns data and an optional benchmark.

Provides methods for analyzing and manipulating financial returns data, including resampling, truncation, and access to statistical metrics and visualizations via the stats and plots properties.

Attributes:

Name Type Description
returns DataFrame

DataFrame containing returns data with assets as columns.

benchmark DataFrame | None

Optional benchmark returns DataFrame. Defaults to None.

index DataFrame

DataFrame containing the date index for the returns data.
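
Resampling aggregates by compounding rather than by summing: in the `resample` method of the source listing, each bucket is reduced with `(1 + r).product() - 1`. The following is a minimal stand-alone sketch of that aggregation in plain Python, assuming simple (not log) returns; `compound` is a hypothetical helper name, not part of the package.

```python
import math


def compound(returns: list[float]) -> float:
    """Compound simple returns over one bucket: prod(1 + r) - 1."""
    return math.prod(1.0 + r for r in returns) - 1.0


# Three daily returns that fall into the same resampling bucket.
daily = [0.01, -0.02, 0.03]

print(round(compound(daily), 6))  # 0.019494 (compounded)
print(round(sum(daily), 6))       # 0.02 (naive sum, shown for contrast)
```

The gap between the two numbers grows with the size and count of the returns in the bucket, which is why a monthly or yearly resample should not be approximated by a sum.
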

Source code in src/jquantstats/data.py
@dataclasses.dataclass(frozen=True, slots=True)
class Data:
    """A container for financial returns data and an optional benchmark.

    Provides methods for analyzing and manipulating financial returns data,
    including resampling, truncation, and access to statistical metrics and
    visualizations via the ``stats`` and ``plots`` properties.

    Attributes:
        returns (pl.DataFrame): DataFrame containing returns data with assets
            as columns.
        benchmark (pl.DataFrame | None): Optional benchmark returns DataFrame.
            Defaults to None.
        index (pl.DataFrame): DataFrame containing the date index for the
            returns data.

    """

    returns: pl.DataFrame
    index: pl.DataFrame
    benchmark: pl.DataFrame | None = None

    def __post_init__(self) -> None:
        """Validate the Data object after initialization."""
        # You need at least two points
        if self.index.shape[0] < 2:
            raise ValueError("Index must contain at least two timestamps.")  # noqa: TRY003

        # Check index is monotonically increasing
        datetime_col = self.index[self.index.columns[0]]
        if not datetime_col.is_sorted():
            raise ValueError("Index must be monotonically increasing.")  # noqa: TRY003

        # Check row count matches returns
        if self.returns.shape[0] != self.index.shape[0]:
            raise ValueError("Returns and index must have the same number of rows.")  # noqa: TRY003

        # Check row count matches benchmark (if provided)
        if self.benchmark is not None and self.benchmark.shape[0] != self.index.shape[0]:
            raise ValueError("Benchmark and index must have the same number of rows.")  # noqa: TRY003

    @classmethod
    def from_returns(
        cls,
        returns: NativeFrame,
        rf: NativeFrameOrScalar = 0.0,
        benchmark: NativeFrame | None = None,
        date_col: str = "Date",
        null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
    ) -> Data:
        """Create a Data object from returns and optional benchmark.

        Args:
            returns (NativeFrame): Financial returns data. First column should
                be the date column, remaining columns are asset returns.
            rf (float | NativeFrame): Risk-free rate. Defaults to 0.0 (no
                risk-free rate adjustment).

                - If float: Constant risk-free rate applied to all dates.
                - If NativeFrame: Time-varying risk-free rate with dates
                  matching returns.

            benchmark (NativeFrame | None): Benchmark returns. Defaults to
                None (no benchmark). First column should be the date column,
                remaining columns are benchmark returns.
            date_col (str): Name of the date column in the DataFrames.
                Defaults to ``"Date"``.
            null_strategy ({"raise", "drop", "forward_fill"} | None): How to
                handle ``null`` (missing) values in *returns* and *benchmark*.
                Defaults to ``None`` (nulls propagate through calculations).

                - ``None`` — no null checking; nulls propagate through all
                  downstream calculations.
                - ``"raise"`` — raise `NullsInReturnsError` if any null is
                  found.
                - ``"drop"`` — silently drop every row that contains at least
                  one null.
                - ``"forward_fill"`` — fill each null with the most recent
                  non-null value in the same column.

                Note: Affects only Polars ``null`` values (i.e. ``None`` /
                missing entries). IEEE-754 ``NaN`` values are **not** affected
                and continue to propagate as per IEEE-754 semantics.

        Returns:
            Data: Object containing excess returns and benchmark (if any),
            with methods for analysis and visualization through the ``stats``
            and ``plots`` properties.

        Raises:
            NullsInReturnsError: If *null_strategy* is ``"raise"`` and the
                data contains null values.
            ValueError: If there are no overlapping dates between returns and
                benchmark.

        Examples:
            Basic usage:

            ```python
            from jquantstats import Data
            import polars as pl

            returns = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Asset1": [0.01, -0.02, 0.03]
            }).with_columns(pl.col("Date").str.to_date())

            data = Data.from_returns(returns=returns)
            ```

            With benchmark and risk-free rate:

            ```python
            benchmark = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Market": [0.005, -0.01, 0.02]
            }).with_columns(pl.col("Date").str.to_date())

            data = Data.from_returns(returns=returns, benchmark=benchmark, rf=0.0002)
            ```

            Handling nulls automatically:

            ```python
            returns_with_nulls = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Asset1": [0.01, None, 0.03]
            }).with_columns(pl.col("Date").str.to_date())

            # Drop rows with nulls (mirrors pandas/QuantStats behaviour)
            data = Data.from_returns(returns=returns_with_nulls, null_strategy="drop")

            # Or forward-fill nulls
            data = Data.from_returns(returns=returns_with_nulls, null_strategy="forward_fill")
            ```

        """
        returns_pl = _to_polars(returns)
        benchmark_pl = _to_polars(benchmark) if benchmark is not None else None
        rf_converted: float | pl.DataFrame
        if isinstance(rf, pl.DataFrame) or (not isinstance(rf, float) and not isinstance(rf, int)):
            rf_converted = _to_polars(rf)
        else:
            rf_converted = rf  # int is not float/DataFrame: _subtract_risk_free raises TypeError

        returns_pl = _apply_null_strategy(returns_pl, date_col, "returns", null_strategy)
        if benchmark_pl is not None:
            benchmark_pl = _apply_null_strategy(benchmark_pl, date_col, "benchmark", null_strategy)

        if benchmark_pl is not None:
            joined_dates = returns_pl.join(benchmark_pl, on=date_col, how="inner").select(date_col)
            if joined_dates.is_empty():
                raise ValueError("No overlapping dates between returns and benchmark.")  # noqa: TRY003
            returns_pl = returns_pl.join(joined_dates, on=date_col, how="inner")
            benchmark_pl = benchmark_pl.join(joined_dates, on=date_col, how="inner")

        index = returns_pl.select(date_col)
        excess_returns = _subtract_risk_free(returns_pl, rf_converted, date_col).drop(date_col)
        excess_benchmark = (
            _subtract_risk_free(benchmark_pl, rf_converted, date_col).drop(date_col)
            if benchmark_pl is not None
            else None
        )

        return cls(returns=excess_returns, benchmark=excess_benchmark, index=index)

    @classmethod
    def from_prices(
        cls,
        prices: NativeFrame,
        rf: NativeFrameOrScalar = 0.0,
        benchmark: NativeFrame | None = None,
        date_col: str = "Date",
        null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
    ) -> Data:
        """Create a Data object from prices and optional benchmark.

        Converts price levels to returns via percentage change and delegates
        to `from_returns`. The first row of each asset is dropped because no
        prior price is available to compute a return.

        Args:
            prices (NativeFrame): Price-level data. First column should be
                the date column; remaining columns are asset prices.
            rf (float | NativeFrame): Risk-free rate. Forwarded unchanged to
                `from_returns`. Defaults to 0.0 (no risk-free rate
                adjustment).
            benchmark (NativeFrame | None): Benchmark prices. Converted to
                returns in the same way as ``prices`` before being forwarded
                to `from_returns`. Defaults to None (no benchmark).
            date_col (str): Name of the date column in the DataFrames.
                Defaults to ``"Date"``.
            null_strategy ({"raise", "drop", "forward_fill"} | None): How to
                handle ``null`` (missing) values after converting prices to
                returns. Forwarded unchanged to `from_returns`. Defaults to
                ``None`` (nulls propagate through calculations).

                - ``None`` — no null checking; nulls propagate.
                - ``"raise"`` — raise `NullsInReturnsError` if any null is
                  found in the derived returns.
                - ``"drop"`` — silently drop every row that contains at least
                  one null.
                - ``"forward_fill"`` — fill each null with the most recent
                  non-null value.

                Note: Prices that contain nulls will produce null returns via
                ``pct_change()``. If you expect missing price entries, pass
                ``null_strategy="drop"`` or ``null_strategy="forward_fill"``.

        Returns:
            Data: Object containing excess returns derived from the supplied
            prices, with methods for analysis and visualization through the
            ``stats`` and ``plots`` properties.

        Examples:
            ```python
            from jquantstats import Data
            import polars as pl

            prices = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Asset1": [100.0, 101.0, 99.0]
            }).with_columns(pl.col("Date").str.to_date())

            data = Data.from_prices(prices=prices)
            ```

        """
        prices_pl = _to_polars(prices)
        asset_cols = [c for c in prices_pl.columns if c != date_col]
        returns_pl = prices_pl.with_columns([pl.col(c).pct_change().alias(c) for c in asset_cols]).slice(1)

        benchmark_returns: NativeFrame | None = None
        if benchmark is not None:
            benchmark_pl = _to_polars(benchmark)
            bench_cols = [c for c in benchmark_pl.columns if c != date_col]
            benchmark_returns = benchmark_pl.with_columns([pl.col(c).pct_change().alias(c) for c in bench_cols]).slice(
                1
            )

        return cls.from_returns(
            returns=returns_pl,
            rf=rf,
            benchmark=benchmark_returns,
            date_col=date_col,
            null_strategy=null_strategy,
        )

    def __repr__(self) -> str:
        """Return a string representation of the Data object."""
        rows = len(self.index)
        date_cols = self.date_col
        if date_cols:
            date_column = date_cols[0]
            start = self.index[date_column].min()
            end = self.index[date_column].max()
            return f"Data(assets={self.assets}, rows={rows}, start={start}, end={end})"
        return f"Data(assets={self.assets}, rows={rows})"  # pragma: no cover  # __post_init__ requires ≥1 index column

    @property
    def plots(self) -> DataPlots:
        """Provides access to visualization methods for the financial data.

        Returns:
            DataPlots: An instance of the DataPlots class initialized with this data.

        """
        from ._plots import DataPlots

        return DataPlots(self)

    @property
    def stats(self) -> Stats:
        """Provides access to statistical analysis methods for the financial data.

        Returns:
            Stats: An instance of the Stats class initialized with this data.

        """
        from ._stats import Stats

        return Stats(self)

    @property
    def reports(self) -> Reports:
        """Provides access to reporting methods for the financial data.

        Returns:
            Reports: An instance of the Reports class initialized with this data.

        """
        from ._reports import Reports

        return Reports(self)

    @property
    def utils(self) -> DataUtils:
        """Provides access to utility transforms and conversions for the financial data.

        Returns:
            DataUtils: An instance of the DataUtils class initialized with this data.

        """
        from ._utils import DataUtils

        return DataUtils(self)

    @property
    def date_col(self) -> list[str]:
        """Return the column names of the index DataFrame.

        Returns:
            list[str]: List of column names in the index DataFrame, typically containing
                      the date column name.

        """
        return list(self.index.columns)

    @property
    def assets(self) -> list[str]:
        """Return the combined list of asset column names from returns and benchmark.

        Returns:
            list[str]: List of all asset column names from both returns and benchmark
                      (if available).

        """
        if self.benchmark is not None:
            return list(self.returns.columns) + list(self.benchmark.columns)
        return list(self.returns.columns)

    @property
    def all(self) -> pl.DataFrame:
        """Combine index, returns, and benchmark data into a single DataFrame.

        This property provides a convenient way to access all data in a single DataFrame,
        which is useful for analysis and visualization.

        Returns:
            pl.DataFrame: A DataFrame containing the index, all returns data, and benchmark data
                         (if available) combined horizontally.

        """
        if self.benchmark is None:
            return pl.concat([self.index, self.returns], how="horizontal")
        else:
            return pl.concat([self.index, self.returns, self.benchmark], how="horizontal")

    def resample(self, every: str = "1mo") -> Data:
        """Resample returns and benchmark to a different frequency.

        Args:
            every (str): Resampling frequency (e.g., ``'1mo'``, ``'1y'``).
                Defaults to ``'1mo'``.

        Returns:
            Data: Resampled data at the requested frequency.

        """

        def resample_frame(dframe: pl.DataFrame) -> pl.DataFrame:
            """Resample a single DataFrame to the target frequency using compound returns."""
            dframe = self.index.hstack(dframe)  # Add the date column for resampling

            return dframe.group_by_dynamic(
                index_column=self.index.columns[0], every=every, period=every, closed="right", label="right"
            ).agg(
                [
                    ((pl.col(col) + 1.0).product() - 1.0).alias(col)
                    for col in dframe.columns
                    if col != self.index.columns[0]
                ]
            )

        resampled_returns = resample_frame(self.returns)
        resampled_benchmark = resample_frame(self.benchmark) if self.benchmark is not None else None
        resampled_index = resampled_returns.select(self.index.columns[0])

        return Data(
            returns=resampled_returns.drop(self.index.columns[0]),
            benchmark=resampled_benchmark.drop(self.index.columns[0]) if resampled_benchmark is not None else None,
            index=resampled_index,
        )

    def describe(self) -> pl.DataFrame:
        """Return a tidy summary of shape, date range and asset names.

        Returns:
            pl.DataFrame: One row per asset with columns: asset, start, end,
            rows, has_benchmark.

        """
        date_column = self.date_col[0]
        start = self.index[date_column].min()
        end = self.index[date_column].max()
        rows = len(self.index)
        return pl.DataFrame(
            {
                "asset": self.returns.columns,
                "start": [start] * len(self.returns.columns),
                "end": [end] * len(self.returns.columns),
                "rows": [rows] * len(self.returns.columns),
                "has_benchmark": [self.benchmark is not None] * len(self.returns.columns),
            }
        )

    def copy(self) -> Data:
        """Create a deep copy of the Data object.

        Returns:
            Data: A new Data object with copies of the returns and benchmark.

        """
        if self.benchmark is not None:
            return Data(returns=self.returns.clone(), benchmark=self.benchmark.clone(), index=self.index.clone())
        return Data(returns=self.returns.clone(), index=self.index.clone())

    def head(self, n: int = 5) -> Data:
        """Return the first n rows of the combined returns and benchmark data.

        Args:
            n (int, optional): Number of rows to return. Defaults to 5.

        Returns:
            Data: A new Data object containing the first n rows of the combined data.

        """
        benchmark_head = self.benchmark.head(n) if self.benchmark is not None else None
        return Data(returns=self.returns.head(n), benchmark=benchmark_head, index=self.index.head(n))

    def tail(self, n: int = 5) -> Data:
        """Return the last n rows of the combined returns and benchmark data.

        Args:
            n (int, optional): Number of rows to return. Defaults to 5.

        Returns:
            Data: A new Data object containing the last n rows of the combined data.

        """
        benchmark_tail = self.benchmark.tail(n) if self.benchmark is not None else None
        return Data(returns=self.returns.tail(n), benchmark=benchmark_tail, index=self.index.tail(n))

    def truncate(
        self,
        start: date | datetime | str | int | None = None,
        end: date | datetime | str | int | None = None,
    ) -> Data:
        """Return a new Data object truncated to the inclusive [start, end] range.

        When the index is temporal (Date/Datetime), truncation is performed by
        comparing the date column against ``start`` and ``end`` values.

        When the index is integer-based, row slicing is used instead, and
        ``start`` and ``end`` must be non-negative integers.  Passing
        non-integer bounds to an integer-indexed Data raises `TypeError`.

        Args:
            start: Optional lower bound (inclusive).  A date/datetime value
                when the index is temporal; a non-negative `int` row
                index when the data has no temporal index.
            end: Optional upper bound (inclusive).  Same type rules as
                ``start``.

        Returns:
            Data: A new Data object filtered to the specified range.

        Raises:
            TypeError: When the index is not temporal and a non-integer bound
                is supplied.

        """
        date_column = self.index.columns[0]
        is_temporal = self.index[date_column].dtype.is_temporal()

        if is_temporal:
            cond = pl.lit(True)
            if start is not None:
                cond = cond & (pl.col(date_column) >= pl.lit(start))
            if end is not None:
                cond = cond & (pl.col(date_column) <= pl.lit(end))
            mask = self.index.select(cond.alias("mask"))["mask"]
            new_index = self.index.filter(mask)
            new_returns = self.returns.filter(mask)
            new_benchmark = self.benchmark.filter(mask) if self.benchmark is not None else None
        else:
            if start is not None and not isinstance(start, int):
                raise TypeError(f"start must be an integer, got {type(start).__name__}.")  # noqa: TRY003
            if end is not None and not isinstance(end, int):
                raise TypeError(f"end must be an integer, got {type(end).__name__}.")  # noqa: TRY003
            row_start = start if start is not None else 0
            row_end = end + 1 if end is not None else self.index.height
            length = max(0, row_end - row_start)
            new_index = self.index.slice(row_start, length)
            new_returns = self.returns.slice(row_start, length)
            new_benchmark = self.benchmark.slice(row_start, length) if self.benchmark is not None else None

        return Data(returns=new_returns, benchmark=new_benchmark, index=new_index)

    @property
    def _periods_per_year(self) -> float:
        """Estimate the number of periods per year based on average frequency in the index.

        For temporal (Date/Datetime) indices, computes the mean gap between observations
        and converts to an annualised period count (e.g. ~252 for daily, ~52 for weekly).

        For integer indices (date-free portfolios), falls back to 252 trading days per year
        because integer diffs have no time meaning.
        """
        datetime_col = self.index[self.index.columns[0]]

        if not datetime_col.dtype.is_temporal():
            return 252.0

        sorted_dt = datetime_col.sort()
        diffs = sorted_dt.diff().drop_nulls()
        mean_diff = diffs.mean()

        if isinstance(mean_diff, timedelta):
            seconds = mean_diff.total_seconds()
        else:  # pragma: no cover  # Polars always returns timedelta for temporal diff
            seconds = cast(float, mean_diff) if mean_diff is not None else 1.0

        return (365 * 24 * 60 * 60) / seconds

    def items(self) -> Iterator[tuple[str, pl.Series]]:
        """Iterate over all assets and their corresponding data series.

        This method provides a convenient way to iterate over all assets in the data,
        yielding each asset name and its corresponding data series.

        Yields:
            tuple[str, pl.Series]: A tuple containing the asset name and its data series.

        """
        matrix = self.all

        for col in self.assets:
            yield col, matrix.get_column(col)
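
The annualisation estimate in `_periods_per_year` above can be reproduced with the standard library alone. The following is a minimal sketch, not the library's code: `estimate_periods_per_year` is a hypothetical helper that sorts a list of dates, averages the gaps between them, and divides the number of seconds in a 365-day year by that mean gap.

```python
from datetime import date, timedelta

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # same constant as in the listing above


def estimate_periods_per_year(dates: list[date]) -> float:
    """Hypothetical helper: mean gap between sorted dates -> periods per year."""
    ordered = sorted(dates)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return SECONDS_PER_YEAR / mean_gap


# Weekly observations: every gap is 7 days, so the estimate is 365 / 7.
weekly = [date(2023, 1, 2) + timedelta(weeks=i) for i in range(10)]
weekly_estimate = estimate_periods_per_year(weekly)

# Monday-to-Friday observations over 52 weeks (no holidays removed):
# weekends stretch the mean gap beyond one day, so the estimate falls
# well below 365.
start = date(2023, 1, 2)  # a Monday
business_days = [
    d for d in (start + timedelta(days=i) for i in range(364)) if d.weekday() < 5
]
daily_estimate = estimate_periods_per_year(business_days)
```

Because the estimate depends only on the mean gap, missing observations (holidays, data gaps) lower the result; this is why a real daily series tends to land nearer 252 than the holiday-free figure produced here.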

all property

Combine index, returns, and benchmark data into a single DataFrame.

This property provides a convenient way to access all data in a single DataFrame, which is useful for analysis and visualization.

Returns:

Type Description
DataFrame

pl.DataFrame: A DataFrame containing the index, all returns data, and benchmark data (if available) combined horizontally.

assets property

Return the combined list of asset column names from returns and benchmark.

Returns:

Type Description
list[str]

list[str]: List of all asset column names from both returns and benchmark (if available).

date_col property

Return the column names of the index DataFrame.

Returns:

Type Description
list[str]

list[str]: List of column names in the index DataFrame, typically containing the date column name.

plots property

Provides access to visualization methods for the financial data.

Returns:

Name Type Description
DataPlots DataPlots

An instance of the DataPlots class initialized with this data.

reports property

Provides access to reporting methods for the financial data.

Returns:

Name Type Description
Reports Reports

An instance of the Reports class initialized with this data.

stats property

Provides access to statistical analysis methods for the financial data.

Returns:

Name Type Description
Stats Stats

An instance of the Stats class initialized with this data.

utils property

Provides access to utility transforms and conversions for the financial data.

Returns:

Name Type Description
DataUtils DataUtils

An instance of the DataUtils class initialized with this data.

__post_init__()

Validate the Data object after initialization.

Source code in src/jquantstats/data.py
def __post_init__(self) -> None:
    """Validate the Data object after initialization."""
    # You need at least two points
    if self.index.shape[0] < 2:
        raise ValueError("Index must contain at least two timestamps.")  # noqa: TRY003

    # Check index is monotonically increasing
    datetime_col = self.index[self.index.columns[0]]
    if not datetime_col.is_sorted():
        raise ValueError("Index must be monotonically increasing.")  # noqa: TRY003

    # Check row count matches returns
    if self.returns.shape[0] != self.index.shape[0]:
        raise ValueError("Returns and index must have the same number of rows.")  # noqa: TRY003

    # Check row count matches benchmark (if provided)
    if self.benchmark is not None and self.benchmark.shape[0] != self.index.shape[0]:
        raise ValueError("Benchmark and index must have the same number of rows.")  # noqa: TRY003

__repr__()

Return a string representation of the Data object.

Source code in src/jquantstats/data.py
def __repr__(self) -> str:
    """Return a string representation of the Data object."""
    rows = len(self.index)
    date_cols = self.date_col
    if date_cols:
        date_column = date_cols[0]
        start = self.index[date_column].min()
        end = self.index[date_column].max()
        return f"Data(assets={self.assets}, rows={rows}, start={start}, end={end})"
    return f"Data(assets={self.assets}, rows={rows})"  # pragma: no cover  # __post_init__ requires ≥1 index column

copy()

Create a deep copy of the Data object.

Returns:

Name Type Description
Data Data

A new Data object with copies of the returns and benchmark.

Source code in src/jquantstats/data.py
def copy(self) -> Data:
    """Create a deep copy of the Data object.

    Returns:
        Data: A new Data object with copies of the returns and benchmark.

    """
    if self.benchmark is not None:
        return Data(returns=self.returns.clone(), benchmark=self.benchmark.clone(), index=self.index.clone())
    return Data(returns=self.returns.clone(), index=self.index.clone())

describe()

Return a tidy summary of shape, date range and asset names.

Returns:

Type Description
DataFrame

pl.DataFrame: One row per asset with columns: asset, start, end, rows, has_benchmark.

Source code in src/jquantstats/data.py
def describe(self) -> pl.DataFrame:
    """Return a tidy summary of shape, date range and asset names.

    Returns:
        pl.DataFrame: One row per asset with columns: asset, start, end,
        rows, has_benchmark.

    """
    date_column = self.date_col[0]
    start = self.index[date_column].min()
    end = self.index[date_column].max()
    rows = len(self.index)
    return pl.DataFrame(
        {
            "asset": self.returns.columns,
            "start": [start] * len(self.returns.columns),
            "end": [end] * len(self.returns.columns),
            "rows": [rows] * len(self.returns.columns),
            "has_benchmark": [self.benchmark is not None] * len(self.returns.columns),
        }
    )

from_prices(prices, rf=0.0, benchmark=None, date_col='Date', null_strategy=None) classmethod

Create a Data object from prices and optional benchmark.

Converts price levels to returns via percentage change and delegates to from_returns. The first row of each asset is dropped because no prior price is available to compute a return.

Parameters:

Name Type Description Default
prices NativeFrame

Price-level data. First column should be the date column; remaining columns are asset prices.

required
rf float | NativeFrame

Risk-free rate. Forwarded unchanged to from_returns. Defaults to 0.0 (no risk-free rate adjustment).

0.0
benchmark NativeFrame | None

Benchmark prices. Converted to returns in the same way as prices before being forwarded to from_returns. Defaults to None (no benchmark).

None
date_col str

Name of the date column in the DataFrames. Defaults to "Date".

'Date'
null_strategy {'raise', 'drop', 'forward_fill'} | None

How to handle null (missing) values after converting prices to returns. Forwarded unchanged to from_returns. Defaults to None (nulls propagate through calculations).

  • None — no null checking; nulls propagate.
  • "raise" — raise NullsInReturnsError if any null is found in the derived returns.
  • "drop" — silently drop every row that contains at least one null.
  • "forward_fill" — fill each null with the most recent non-null value.

Note: Prices that contain nulls will produce null returns via pct_change(). If you expect missing price entries, pass null_strategy="drop" or null_strategy="forward_fill".

None

Returns:

Name Type Description
Data Data

Object containing excess returns derived from the supplied prices, with methods for analysis and visualization through the stats and plots properties.

Examples:

from jquantstats import Data
import polars as pl

prices = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Asset1": [100.0, 101.0, 99.0]
}).with_columns(pl.col("Date").str.to_date())

data = Data.from_prices(prices=prices)
Source code in src/jquantstats/data.py
@classmethod
def from_prices(
    cls,
    prices: NativeFrame,
    rf: NativeFrameOrScalar = 0.0,
    benchmark: NativeFrame | None = None,
    date_col: str = "Date",
    null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
) -> Data:
    """Create a Data object from prices and optional benchmark.

    Converts price levels to returns via percentage change and delegates
    to `from_returns`. The first row of each asset is dropped because no
    prior price is available to compute a return.

    Args:
        prices (NativeFrame): Price-level data. First column should be
            the date column; remaining columns are asset prices.
        rf (float | NativeFrame): Risk-free rate. Forwarded unchanged to
            `from_returns`. Defaults to 0.0 (no risk-free rate
            adjustment).
        benchmark (NativeFrame | None): Benchmark prices. Converted to
            returns in the same way as ``prices`` before being forwarded
            to `from_returns`. Defaults to None (no benchmark).
        date_col (str): Name of the date column in the DataFrames.
            Defaults to ``"Date"``.
        null_strategy ({"raise", "drop", "forward_fill"} | None): How to
            handle ``null`` (missing) values after converting prices to
            returns. Forwarded unchanged to `from_returns`. Defaults to
            ``None`` (nulls propagate through calculations).

            - ``None`` — no null checking; nulls propagate.
            - ``"raise"`` — raise `NullsInReturnsError` if any null is
              found in the derived returns.
            - ``"drop"`` — silently drop every row that contains at least
              one null.
            - ``"forward_fill"`` — fill each null with the most recent
              non-null value.

            Note: Prices that contain nulls will produce null returns via
            ``pct_change()``. If you expect missing price entries, pass
            ``null_strategy="drop"`` or ``null_strategy="forward_fill"``.

    Returns:
        Data: Object containing excess returns derived from the supplied
        prices, with methods for analysis and visualization through the
        ``stats`` and ``plots`` properties.

    Examples:
        ```python
        from jquantstats import Data
        import polars as pl

        prices = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Asset1": [100.0, 101.0, 99.0]
        }).with_columns(pl.col("Date").str.to_date())

        data = Data.from_prices(prices=prices)
        ```

    """
    prices_pl = _to_polars(prices)
    asset_cols = [c for c in prices_pl.columns if c != date_col]
    returns_pl = prices_pl.with_columns([pl.col(c).pct_change().alias(c) for c in asset_cols]).slice(1)

    benchmark_returns: NativeFrame | None = None
    if benchmark is not None:
        benchmark_pl = _to_polars(benchmark)
        bench_cols = [c for c in benchmark_pl.columns if c != date_col]
        benchmark_returns = benchmark_pl.with_columns([pl.col(c).pct_change().alias(c) for c in bench_cols]).slice(
            1
        )

    return cls.from_returns(
        returns=returns_pl,
        rf=rf,
        benchmark=benchmark_returns,
        date_col=date_col,
        null_strategy=null_strategy,
    )
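
The price-to-return conversion is plain percentage change, which is why one row is lost. As a minimal sketch in plain Python (the helper `simple_returns` is hypothetical and stands in for the `pct_change()` expression used in the listing above), applied to the `Asset1` prices from the example:

```python
def simple_returns(prices: list[float]) -> list[float]:
    """Hypothetical helper: p[t] / p[t-1] - 1 for t >= 1 (the first price has no return)."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


prices = [100.0, 101.0, 99.0]  # the Asset1 column from the example above
returns = simple_returns(prices)
# Three prices give two returns: +1% and then 99/101 - 1 (about -1.98%).
```

A null price would make both the return that ends on it and the return that starts from it undefined, which is the situation the `null_strategy` note describes.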

from_returns(returns, rf=0.0, benchmark=None, date_col='Date', null_strategy=None) classmethod

Create a Data object from returns and optional benchmark.

Parameters:

Name Type Description Default
returns NativeFrame

Financial returns data. First column should be the date column, remaining columns are asset returns.

required
rf float | NativeFrame

Risk-free rate. Defaults to 0.0 (no risk-free rate adjustment).

  • If float: Constant risk-free rate applied to all dates.
  • If NativeFrame: Time-varying risk-free rate with dates matching returns.
0.0
benchmark NativeFrame | None

Benchmark returns. Defaults to None (no benchmark). First column should be the date column, remaining columns are benchmark returns.

None
date_col str

Name of the date column in the DataFrames. Defaults to "Date".

'Date'
null_strategy {'raise', 'drop', 'forward_fill'} | None

How to handle null (missing) values in returns and benchmark. Defaults to None (nulls propagate through calculations).

  • None — no null checking; nulls propagate through all downstream calculations.
  • "raise" — raise NullsInReturnsError if any null is found.
  • "drop" — silently drop every row that contains at least one null.
  • "forward_fill" — fill each null with the most recent non-null value in the same column.

Note: Affects only Polars null values (i.e. None / missing entries). IEEE-754 NaN values are not affected and continue to propagate as per IEEE-754 semantics.

None

Returns:

Name Type Description
Data Data

Object containing excess returns and benchmark (if any), with methods for analysis and visualization through the stats and plots properties.

Raises:

Type Description
NullsInReturnsError

If null_strategy is "raise" and the data contains null values.

ValueError

If there are no overlapping dates between returns and benchmark.

Examples:

Basic usage:

from jquantstats import Data
import polars as pl

returns = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Asset1": [0.01, -0.02, 0.03]
}).with_columns(pl.col("Date").str.to_date())

data = Data.from_returns(returns=returns)

With benchmark and risk-free rate:

benchmark = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Market": [0.005, -0.01, 0.02]
}).with_columns(pl.col("Date").str.to_date())

data = Data.from_returns(returns=returns, benchmark=benchmark, rf=0.0002)

Handling nulls automatically:

returns_with_nulls = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Asset1": [0.01, None, 0.03]
}).with_columns(pl.col("Date").str.to_date())

# Drop rows with nulls (mirrors pandas/QuantStats behaviour)
data = Data.from_returns(returns=returns_with_nulls, null_strategy="drop")

# Or forward-fill nulls
data = Data.from_returns(returns=returns_with_nulls, null_strategy="forward_fill")
Source code in src/jquantstats/data.py
@classmethod
def from_returns(
    cls,
    returns: NativeFrame,
    rf: NativeFrameOrScalar = 0.0,
    benchmark: NativeFrame | None = None,
    date_col: str = "Date",
    null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
) -> Data:
    """Create a Data object from returns and optional benchmark.

    Args:
        returns (NativeFrame): Financial returns data. First column should
            be the date column, remaining columns are asset returns.
        rf (float | NativeFrame): Risk-free rate. Defaults to 0.0 (no
            risk-free rate adjustment).

            - If float: Constant risk-free rate applied to all dates.
            - If NativeFrame: Time-varying risk-free rate with dates
              matching returns.

        benchmark (NativeFrame | None): Benchmark returns. Defaults to
            None (no benchmark). First column should be the date column,
            remaining columns are benchmark returns.
        date_col (str): Name of the date column in the DataFrames.
            Defaults to ``"Date"``.
        null_strategy ({"raise", "drop", "forward_fill"} | None): How to
            handle ``null`` (missing) values in *returns* and *benchmark*.
            Defaults to ``None`` (nulls propagate through calculations).

            - ``None`` — no null checking; nulls propagate through all
              downstream calculations.
            - ``"raise"`` — raise `NullsInReturnsError` if any null is
              found.
            - ``"drop"`` — silently drop every row that contains at least
              one null.
            - ``"forward_fill"`` — fill each null with the most recent
              non-null value in the same column.

            Note: Affects only Polars ``null`` values (i.e. ``None`` /
            missing entries). IEEE-754 ``NaN`` values are **not** affected
            and continue to propagate as per IEEE-754 semantics.

    Returns:
        Data: Object containing excess returns and benchmark (if any),
        with methods for analysis and visualization through the ``stats``
        and ``plots`` properties.

    Raises:
        NullsInReturnsError: If *null_strategy* is ``"raise"`` and the
            data contains null values.
        ValueError: If there are no overlapping dates between returns and
            benchmark.

    Examples:
        Basic usage:

        ```python
        from jquantstats import Data
        import polars as pl

        returns = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Asset1": [0.01, -0.02, 0.03]
        }).with_columns(pl.col("Date").str.to_date())

        data = Data.from_returns(returns=returns)
        ```

        With benchmark and risk-free rate:

        ```python
        benchmark = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Market": [0.005, -0.01, 0.02]
        }).with_columns(pl.col("Date").str.to_date())

        data = Data.from_returns(returns=returns, benchmark=benchmark, rf=0.0002)
        ```

        Handling nulls automatically:

        ```python
        returns_with_nulls = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Asset1": [0.01, None, 0.03]
        }).with_columns(pl.col("Date").str.to_date())

        # Drop rows with nulls (mirrors pandas/QuantStats behaviour)
        data = Data.from_returns(returns=returns_with_nulls, null_strategy="drop")

        # Or forward-fill nulls
        data = Data.from_returns(returns=returns_with_nulls, null_strategy="forward_fill")
        ```

    """
    returns_pl = _to_polars(returns)
    benchmark_pl = _to_polars(benchmark) if benchmark is not None else None
    rf_converted: float | pl.DataFrame
    if isinstance(rf, pl.DataFrame) or (not isinstance(rf, float) and not isinstance(rf, int)):
        rf_converted = _to_polars(rf)
    else:
        rf_converted = rf  # int is not float/DataFrame: _subtract_risk_free raises TypeError

    returns_pl = _apply_null_strategy(returns_pl, date_col, "returns", null_strategy)
    if benchmark_pl is not None:
        benchmark_pl = _apply_null_strategy(benchmark_pl, date_col, "benchmark", null_strategy)

    if benchmark_pl is not None:
        joined_dates = returns_pl.join(benchmark_pl, on=date_col, how="inner").select(date_col)
        if joined_dates.is_empty():
            raise ValueError("No overlapping dates between returns and benchmark.")  # noqa: TRY003
        returns_pl = returns_pl.join(joined_dates, on=date_col, how="inner")
        benchmark_pl = benchmark_pl.join(joined_dates, on=date_col, how="inner")

    index = returns_pl.select(date_col)
    excess_returns = _subtract_risk_free(returns_pl, rf_converted, date_col).drop(date_col)
    excess_benchmark = (
        _subtract_risk_free(benchmark_pl, rf_converted, date_col).drop(date_col)
        if benchmark_pl is not None
        else None
    )

    return cls(returns=excess_returns, benchmark=excess_benchmark, index=index)
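
To make the four `null_strategy` options concrete, here is a single-column sketch in plain Python. It is illustrative only: `apply_null_strategy` and `NullsFound` are hypothetical names, and the library's own `_apply_null_strategy` operates on Polars frames and may differ in detail.

```python
from typing import Optional


class NullsFound(ValueError):
    """Stand-in for the library's NullsInReturnsError (illustrative only)."""


def apply_null_strategy(
    rows: list[tuple[str, Optional[float]]], strategy: Optional[str]
) -> list[tuple[str, Optional[float]]]:
    """Hypothetical single-column version of the four documented strategies."""
    if strategy is None:
        return list(rows)  # nulls are left in place and propagate
    if strategy == "raise":
        if any(value is None for _, value in rows):
            raise NullsFound("returns contain nulls")
        return list(rows)
    if strategy == "drop":
        return [(d, v) for d, v in rows if v is not None]
    if strategy == "forward_fill":
        filled, last = [], None
        for d, v in rows:
            last = v if v is not None else last  # carry the latest non-null value
            filled.append((d, last))
        return filled
    raise ValueError(f"unknown strategy: {strategy!r}")


rows = [("2023-01-01", 0.01), ("2023-01-02", None), ("2023-01-03", 0.03)]
dropped = apply_null_strategy(rows, "drop")          # two rows remain
filled = apply_null_strategy(rows, "forward_fill")   # middle row becomes 0.01
```

Note that in this sketch a null in the very first row has nothing to carry forward and therefore stays null under `"forward_fill"`.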

head(n=5)

Return the first n rows of the combined returns and benchmark data.

Parameters:

Name Type Description Default
n int

Number of rows to return. Defaults to 5.

5

Returns:

Name Type Description
Data Data

A new Data object containing the first n rows of the combined data.

Source code in src/jquantstats/data.py
def head(self, n: int = 5) -> Data:
    """Return the first n rows of the combined returns and benchmark data.

    Args:
        n (int, optional): Number of rows to return. Defaults to 5.

    Returns:
        Data: A new Data object containing the first n rows of the combined data.

    """
    benchmark_head = self.benchmark.head(n) if self.benchmark is not None else None
    return Data(returns=self.returns.head(n), benchmark=benchmark_head, index=self.index.head(n))

items()

Iterate over all assets and their corresponding data series.

This method provides a convenient way to iterate over all assets in the data, yielding each asset name and its corresponding data series.

Yields:

Type Description
tuple[str, Series]

tuple[str, pl.Series]: A tuple containing the asset name and its data series.

Source code in src/jquantstats/data.py
def items(self) -> Iterator[tuple[str, pl.Series]]:
    """Iterate over all assets and their corresponding data series.

    This method provides a convenient way to iterate over all assets in the data,
    yielding each asset name and its corresponding data series.

    Yields:
        tuple[str, pl.Series]: A tuple containing the asset name and its data series.

    """
    matrix = self.all

    for col in self.assets:
        yield col, matrix.get_column(col)

resample(every='1mo')

Resample returns and benchmark to a different frequency.

Parameters:

Name Type Description Default
every str

Resampling frequency (e.g., '1mo', '1y'). Defaults to '1mo'.

'1mo'

Returns:

Name Type Description
Data Data

Resampled data at the requested frequency.

Source code in src/jquantstats/data.py
def resample(self, every: str = "1mo") -> Data:
    """Resample returns and benchmark to a different frequency.

    Args:
        every (str): Resampling frequency (e.g., ``'1mo'``, ``'1y'``).
            Defaults to ``'1mo'``.

    Returns:
        Data: Resampled data at the requested frequency.

    """

    def resample_frame(dframe: pl.DataFrame) -> pl.DataFrame:
        """Resample a single DataFrame to the target frequency using compound returns."""
        dframe = self.index.hstack(dframe)  # Add the date column for resampling

        return dframe.group_by_dynamic(
            index_column=self.index.columns[0], every=every, period=every, closed="right", label="right"
        ).agg(
            [
                ((pl.col(col) + 1.0).product() - 1.0).alias(col)
                for col in dframe.columns
                if col != self.index.columns[0]
            ]
        )

    resampled_returns = resample_frame(self.returns)
    resampled_benchmark = resample_frame(self.benchmark) if self.benchmark is not None else None
    resampled_index = resampled_returns.select(self.index.columns[0])

    return Data(
        returns=resampled_returns.drop(self.index.columns[0]),
        benchmark=resampled_benchmark.drop(self.index.columns[0]) if resampled_benchmark is not None else None,
        index=resampled_index,
    )
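
The aggregation inside `resample_frame` compounds simple returns: the return of a bucket is the product of `(1 + r)` over its rows, minus 1. The sketch below shows that arithmetic in plain Python. It is a simplification: `compound_by_month` is a hypothetical helper that groups by calendar month, whereas the listing above uses `group_by_dynamic` with right-closed, right-labelled windows, so bucket edges and labels can differ.

```python
from collections import defaultdict
from datetime import date
from math import prod


def compound_by_month(dates: list[date], returns: list[float]) -> dict[tuple[int, int], float]:
    """Hypothetical helper: compound simple returns within each calendar month."""
    buckets: dict[tuple[int, int], list[float]] = defaultdict(list)
    for d, r in zip(dates, returns):
        buckets[(d.year, d.month)].append(r)
    # Same aggregation as the listing above: product of (1 + r), minus 1.
    return {month: prod(1.0 + r for r in rs) - 1.0 for month, rs in buckets.items()}


dates = [date(2023, 1, 30), date(2023, 1, 31), date(2023, 2, 1), date(2023, 2, 2)]
returns = [0.01, 0.02, -0.01, 0.03]
monthly = compound_by_month(dates, returns)
# January: 1.01 * 1.02 - 1 = 0.0302; February: 0.99 * 1.03 - 1 = 0.0197
```

Summing the returns instead would give 0.03 and 0.02, which ignores the compounding effect.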

tail(n=5)

Return the last n rows of the combined returns and benchmark data.

Parameters:

Name Type Description Default
n int

Number of rows to return. Defaults to 5.

5

Returns:

Name Type Description
Data Data

A new Data object containing the last n rows of the combined data.

Source code in src/jquantstats/data.py
def tail(self, n: int = 5) -> Data:
    """Return the last n rows of the combined returns and benchmark data.

    Args:
        n (int, optional): Number of rows to return. Defaults to 5.

    Returns:
        Data: A new Data object containing the last n rows of the combined data.

    """
    benchmark_tail = self.benchmark.tail(n) if self.benchmark is not None else None
    return Data(returns=self.returns.tail(n), benchmark=benchmark_tail, index=self.index.tail(n))

truncate(start=None, end=None)

Return a new Data object truncated to the inclusive [start, end] range.

When the index is temporal (Date/Datetime), truncation is performed by comparing the date column against start and end values.

When the index is integer-based, row slicing is used instead, and start and end must be non-negative integers. Passing non-integer bounds to an integer-indexed Data raises TypeError.

Parameters:

Name Type Description Default
start date | datetime | str | int | None

Optional lower bound (inclusive). A date/datetime value when the index is temporal; a non-negative int row index when the data has no temporal index.

None
end date | datetime | str | int | None

Optional upper bound (inclusive). Same type rules as start.

None

Returns:

Name Type Description
Data Data

A new Data object filtered to the specified range.

Raises:

Type Description
TypeError

When the index is not temporal and a non-integer bound is supplied.

Source code in src/jquantstats/data.py
def truncate(
    self,
    start: date | datetime | str | int | None = None,
    end: date | datetime | str | int | None = None,
) -> Data:
    """Return a new Data object truncated to the inclusive [start, end] range.

    When the index is temporal (Date/Datetime), truncation is performed by
    comparing the date column against ``start`` and ``end`` values.

    When the index is integer-based, row slicing is used instead, and
    ``start`` and ``end`` must be non-negative integers.  Passing
    non-integer bounds to an integer-indexed Data raises `TypeError`.

    Args:
        start: Optional lower bound (inclusive).  A date/datetime value
            when the index is temporal; a non-negative `int` row
            index when the data has no temporal index.
        end: Optional upper bound (inclusive).  Same type rules as
            ``start``.

    Returns:
        Data: A new Data object filtered to the specified range.

    Raises:
        TypeError: When the index is not temporal and a non-integer bound
            is supplied.

    """
    date_column = self.index.columns[0]
    is_temporal = self.index[date_column].dtype.is_temporal()

    if is_temporal:
        cond = pl.lit(True)
        if start is not None:
            cond = cond & (pl.col(date_column) >= pl.lit(start))
        if end is not None:
            cond = cond & (pl.col(date_column) <= pl.lit(end))
        mask = self.index.select(cond.alias("mask"))["mask"]
        new_index = self.index.filter(mask)
        new_returns = self.returns.filter(mask)
        new_benchmark = self.benchmark.filter(mask) if self.benchmark is not None else None
    else:
        if start is not None and not isinstance(start, int):
            raise TypeError(f"start must be an integer, got {type(start).__name__}.")  # noqa: TRY003
        if end is not None and not isinstance(end, int):
            raise TypeError(f"end must be an integer, got {type(end).__name__}.")  # noqa: TRY003
        row_start = start if start is not None else 0
        row_end = end + 1 if end is not None else self.index.height
        length = max(0, row_end - row_start)
        new_index = self.index.slice(row_start, length)
        new_returns = self.returns.slice(row_start, length)
        new_benchmark = self.benchmark.slice(row_start, length) if self.benchmark is not None else None

    return Data(returns=new_returns, benchmark=new_benchmark, index=new_index)
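
The integer-index branch turns an inclusive `[start, end]` range into a `(offset, length)` slice. The same arithmetic can be sketched on a plain list; `inclusive_slice` is a hypothetical analogue for illustration, not part of the library.

```python
from typing import Optional


def inclusive_slice(values: list, start: Optional[int] = None, end: Optional[int] = None) -> list:
    """Hypothetical list analogue of the integer-index branch of truncate."""
    for name, bound in (("start", start), ("end", end)):
        if bound is not None and not isinstance(bound, int):
            raise TypeError(f"{name} must be an integer, got {type(bound).__name__}.")
    row_start = start if start is not None else 0
    row_end = end + 1 if end is not None else len(values)  # +1 makes end inclusive
    length = max(0, row_end - row_start)  # an inverted range yields nothing
    return values[row_start : row_start + length]


rows = [10, 11, 12, 13, 14]
middle = inclusive_slice(rows, start=1, end=3)  # rows 1, 2 and 3
tail = inclusive_slice(rows, start=3)           # row 3 to the end
empty = inclusive_slice(rows, start=4, end=2)   # inverted bounds
```

Clamping `length` at zero means inverted bounds produce an empty selection rather than an error.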

Portfolio dataclass

Bases: PortfolioNavMixin, PortfolioAttributionMixin, PortfolioTurnoverMixin, PortfolioCostMixin

Portfolio analytics class for quant finance.

Stores the three raw inputs — cash positions, prices, and AUM — and exposes the standard derived data series, analytics facades, transforms, and attribution tools.

Derived data series:

  • profits — per-asset daily cash P&L
  • profit — aggregate daily portfolio profit
  • nav_accumulated — cumulative additive NAV
  • nav_compounded — compounded NAV
  • returns — daily returns (profit / AUM)
  • monthly — monthly compounded returns
  • highwater — running high-water mark
  • drawdown — drawdown from high-water mark
  • all — merged view of all derived series

  • Lazy composition accessors: stats, plots, report
  • Portfolio transforms: truncate, lag, smoothed_holding
  • Attribution: tilt, timing, tilt_timing_decomp
  • Turnover: turnover, turnover_weekly, turnover_summary
  • Cost analysis: cost_adjusted_returns, trading_cost_impact
  • Utility: correlation
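
As a rough illustration of how the derived series relate to one another, the sketch below builds a few of them from a made-up profit stream in plain Python. Only `returns = profit / AUM` is stated above; the additive NAV starting at AUM and the fractional drawdown definition are assumptions made for this example, and the library's exact definitions may differ.

```python
from itertools import accumulate

aum = 1_000_000.0
profit = [5_000.0, -12_000.0, 3_000.0, 10_000.0]  # made-up daily portfolio P&L

# Stated in the list above: daily return is profit divided by AUM.
returns = [p / aum for p in profit]

# Assumption: additive NAV starts from AUM and adds cumulative profit.
nav_accumulated = [aum + c for c in accumulate(profit)]

# Running high-water mark: the maximum NAV seen so far.
highwater = list(accumulate(nav_accumulated, max))

# Assumption: drawdown as the fractional shortfall from the high-water mark.
drawdown = [1.0 - nav / hw for nav, hw in zip(nav_accumulated, highwater)]
```

In this example the NAV dips below its first-day peak for two days, so the drawdown is non-zero on days two and three and returns to zero once a new high is set on day four.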

Attributes:

Name Type Description
cashposition DataFrame

Polars DataFrame of positions per asset over time (includes date column if present).

prices DataFrame

Polars DataFrame of prices per asset over time (includes date column if present).

aum float

Assets under management used as base NAV offset.

Analytics facades
  • .stats : delegates to the legacy Stats pipeline via .data; all 50+ metrics available.
  • .plots : portfolio-specific Plots; NAV overlays, lead-lag IR, rolling Sharpe/vol, heatmaps.
  • .report : HTML Report; self-contained portfolio performance report.
  • .data : bridge to the legacy Data / Stats / DataPlots pipeline.

.plots and .report are intentionally not delegated to the legacy path: the legacy path operates on a bare returns series, while the analytics path has access to raw prices, positions, and AUM for richer portfolio-specific visualisations.

Cost models

Two independent cost models are provided. They are not interchangeable:

Model A: position-delta (stateful, set at construction)

  • cost_per_unit: float. One-way cost per unit of position change (e.g. 0.01 per share).
  • Used by .position_delta_costs and .net_cost_nav.
  • Best for: equity portfolios where cost scales with shares traded.

Model B: turnover-bps (stateless, passed at call time)

  • cost_bps: float. One-way cost in basis points of AUM turnover (e.g. 5 bps).
  • Used by .cost_adjusted_returns(cost_bps) and .trading_cost_impact(max_bps).
  • Best for: macro / fund-of-funds portfolios where cost scales with notional traded.

To sweep a range of cost assumptions use trading_cost_impact(max_bps=20) (Model B). To compute a net-NAV curve set cost_per_unit at construction and read .net_cost_nav (Model A).
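
The difference between the two models is easiest to see as arithmetic. The sketch below is illustrative only and does not reproduce how `.position_delta_costs` or `.cost_adjusted_returns` are implemented: all numbers are made up, and the definition of turnover as traded notional divided by AUM is an assumption for this example.

```python
aum = 1_000_000.0
positions = [0.0, 1_000.0, 1_500.0, 500.0]  # made-up holdings of one asset (units)

# Absolute position change per period (the first row has no prior position).
deltas = [abs(curr - prev) for prev, curr in zip(positions, positions[1:])]

# Model A sketch: a fixed one-way cost per unit of position change.
cost_per_unit = 0.01
model_a_costs = [cost_per_unit * d for d in deltas]  # cost in currency units

# Model B sketch: a one-way cost in basis points of turnover, where turnover is
# assumed here to be traded notional divided by AUM.
cost_bps = 5.0
traded_notional = [100_000.0, 50_000.0, 100_000.0]  # made-up cash traded per period
turnover = [n / aum for n in traded_notional]
model_b_return_drag = [t * cost_bps / 10_000.0 for t in turnover]  # cost in return terms
```

Model A yields a cost in currency that depends on units traded, while Model B yields a drag on returns that depends on notional traded relative to AUM, which is why the two are not interchangeable.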

Date column requirement

Most analytics work with or without a date column. The following features either require a temporal date column (pl.Date or pl.Datetime) or behave differently without one:

  • portfolio.plots.correlation_heatmap()
  • portfolio.plots.lead_lag_ir_plot()
  • stats.monthly_win_rate() — returns NaN per column when no date is present
  • stats.annual_breakdown() — raises ValueError when no date is present
  • stats.max_drawdown_duration() — returns period count (int) instead of days

Portfolios without a date column (integer-indexed) are fully supported for NAV, returns, Sharpe, drawdown, cost analytics, and most rolling metrics.

Examples:

>>> import polars as pl
>>> from datetime import date
>>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
>>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
>>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
>>> pf.assets
['A']
Source code in src/jquantstats/portfolio.py
@dataclasses.dataclass(frozen=True, slots=True)
class Portfolio(
    PortfolioNavMixin,
    PortfolioAttributionMixin,
    PortfolioTurnoverMixin,
    PortfolioCostMixin,
):
    """Portfolio analytics class for quant finance.

    Stores the three raw inputs — cash positions, prices, and AUM — and
    exposes the standard derived data series, analytics facades, transforms,
    and attribution tools.

    Derived data series:

    - `profits` — per-asset daily cash P&L
    - `profit` — aggregate daily portfolio profit
    - `nav_accumulated` — cumulative additive NAV
    - `nav_compounded` — compounded NAV
    - `returns` — daily returns (profit / AUM)
    - `monthly` — monthly compounded returns
    - `highwater` — running high-water mark
    - `drawdown` — drawdown from high-water mark
    - `all` — merged view of all derived series

    - Lazy composition accessors: `stats`, `plots`, `report`
    - Portfolio transforms: `truncate`, `lag`,
      `smoothed_holding`
    - Attribution: `tilt`, `timing`, `tilt_timing_decomp`
    - Turnover: `turnover`, `turnover_weekly`,
      `turnover_summary`
    - Cost analysis: `cost_adjusted_returns`,
      `trading_cost_impact`
    - Utility: `correlation`

    Attributes:
        cashposition: Polars DataFrame of positions per asset over time
            (includes date column if present).
        prices: Polars DataFrame of prices per asset over time (includes date
            column if present).
        aum: Assets under management used as base NAV offset.

    Analytics facades
    -----------------
    - ``.stats``   : delegates to the legacy ``Stats`` pipeline via ``.data``; all 50+ metrics available.
    - ``.plots``   : portfolio-specific ``Plots``; NAV overlays, lead-lag IR, rolling Sharpe/vol, heatmaps.
    - ``.report``  : HTML ``Report``; self-contained portfolio performance report.
    - ``.data``    : bridge to the legacy ``Data`` / ``Stats`` / ``DataPlots`` pipeline.

    ``.plots`` and ``.report`` are intentionally *not* delegated to the legacy path: the legacy
    path operates on a bare returns series, while the analytics path has access to raw prices,
    positions, and AUM for richer portfolio-specific visualisations.

    Cost models
    -----------
    Two independent cost models are provided. They are not interchangeable:

    **Model A — position-delta (stateful, set at construction):**
        ``cost_per_unit: float``  — one-way cost per unit of position change (e.g. 0.01 per share).
        Used by ``.position_delta_costs`` and ``.net_cost_nav``.
        Best for: equity portfolios where cost scales with shares traded.

    **Model B — turnover-bps (stateless, passed at call time):**
        ``cost_bps: float``  — one-way cost in basis points of AUM turnover (e.g. 5 bps).
        Used by ``.cost_adjusted_returns(cost_bps)`` and ``.trading_cost_impact(max_bps)``.
        Best for: macro / fund-of-funds portfolios where cost scales with notional traded.

    To sweep a range of cost assumptions use ``trading_cost_impact(max_bps=20)`` (Model B).
    To compute a net-NAV curve set ``cost_per_unit`` at construction and read ``.net_cost_nav`` (Model A).

    Date column requirement
    -----------------------
    Most analytics work with or without a ``date`` column. The following features either require a
    temporal ``date`` column (``pl.Date`` or ``pl.Datetime``) or change behaviour without one:

    - ``portfolio.plots.correlation_heatmap()``
    - ``portfolio.plots.lead_lag_ir_plot()``
    - ``stats.monthly_win_rate()``      — returns NaN per column when no date is present
    - ``stats.annual_breakdown()``      — raises ``ValueError`` when no date is present
    - ``stats.max_drawdown_duration()`` — returns period count (int) instead of days

    Portfolios without a ``date`` column (integer-indexed) are fully supported for
    NAV, returns, Sharpe, drawdown, cost analytics, and most rolling metrics.

    Examples:
        >>> import polars as pl
        >>> from datetime import date
        >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
        >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
        >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
        >>> pf.assets
        ['A']
    """

    cashposition: pl.DataFrame
    prices: pl.DataFrame
    aum: float
    cost_per_unit: float = 0.0
    cost_bps: float = 0.0

    # ── Internal cache fields ─────────────────────────────────────────────────
    # All cache fields are initialised to ``None`` in ``__post_init__`` via
    # ``object.__setattr__`` (required for frozen dataclasses) and populated
    # lazily on first property access.
    #
    # Lifecycle:
    #   - Initialised: ``__post_init__`` sets every field to ``None``.
    #   - Populated: each property computes its value on the first call and
    #     writes it back via ``object.__setattr__``.
    #   - Invalidation: not required — ``Portfolio`` is a *frozen* dataclass,
    #     so its inputs never change and all derived values remain valid for the
    #     lifetime of the instance.
    _data_bridge: "Data | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _stats_cache: "Stats | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _plots_cache: "PortfolioPlots | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _report_cache: "Report | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _utils_cache: "PortfolioUtils | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _profits_cache: "pl.DataFrame | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _returns_cache: "pl.DataFrame | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _tilt_cache: "Portfolio | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _turnover_cache: "pl.DataFrame | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)

    @staticmethod
    def _build_data_bridge(ret: pl.DataFrame) -> "Data":
        """Build a `Data` bridge from a returns frame.

        Splits out the ``'date'`` column (if present) into an index and passes
        the ``'returns'`` column as returns.  Used internally to populate
        ``_data_bridge`` on the first access of ``data``, so that later
        accesses are O(1).

        Args:
            ret: Returns DataFrame, optionally with a leading ``'date'`` column.

        Returns:
            A `Data` instance backed by *ret*.
        """
        from .data import Data

        returns_only = ret.select("returns")
        if "date" in ret.columns:
            return Data(returns=returns_only, index=ret.select("date"))
        return Data(returns=returns_only, index=pl.DataFrame({"index": list(range(ret.height))}))

    def __post_init__(self) -> None:
        """Validate input types, shapes, and parameters post-initialization."""
        if not isinstance(self.prices, pl.DataFrame):
            raise InvalidPricesTypeError(type(self.prices).__name__)
        if not isinstance(self.cashposition, pl.DataFrame):
            raise InvalidCashPositionTypeError(type(self.cashposition).__name__)
        if self.cashposition.shape[0] != self.prices.shape[0]:
            raise RowCountMismatchError(self.prices.shape[0], self.cashposition.shape[0])
        if self.aum <= 0.0:
            raise NonPositiveAumError(self.aum)
        object.__setattr__(self, "_data_bridge", None)
        object.__setattr__(self, "_stats_cache", None)
        object.__setattr__(self, "_plots_cache", None)
        object.__setattr__(self, "_report_cache", None)
        object.__setattr__(self, "_utils_cache", None)
        object.__setattr__(self, "_profits_cache", None)
        object.__setattr__(self, "_returns_cache", None)
        object.__setattr__(self, "_tilt_cache", None)
        object.__setattr__(self, "_turnover_cache", None)

    def _date_range(self) -> tuple[int, date | datetime | None, date | datetime | None]:
        """Return (rows, start, end) for the portfolio's returns series.

        ``start`` and ``end`` are ``None`` when there is no ``'date'`` column.
        """
        ret = self.returns
        rows = ret.height
        if "date" in ret.columns:
            return rows, cast(date | None, ret["date"].min()), cast(date | None, ret["date"].max())
        return rows, None, None

    @property
    def cost_model(self) -> CostModel:
        """Return the active cost model as a `CostModel` instance.

        Returns:
            A `CostModel` whose ``cost_per_unit`` and ``cost_bps`` fields
            reflect the values stored on this portfolio.
        """
        return CostModel(cost_per_unit=self.cost_per_unit, cost_bps=self.cost_bps)

    def __repr__(self) -> str:
        """Return a string representation of the Portfolio object."""
        rows, start, end = self._date_range()
        if start is not None:
            return f"Portfolio(assets={self.assets}, rows={rows}, start={start}, end={end})"
        return f"Portfolio(assets={self.assets}, rows={rows})"

    def describe(self) -> pl.DataFrame:
        """Return a tidy summary of shape, date range and asset names.

        Returns:
            pl.DataFrame: One row per asset with columns: asset, start, end, rows.

        Examples:
            >>> import polars as pl
            >>> from datetime import date
            >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
            >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
            >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
            >>> df = pf.describe()
            >>> list(df.columns)
            ['asset', 'start', 'end', 'rows']
        """
        rows, start, end = self._date_range()
        return pl.DataFrame(
            {
                "asset": self.assets,
                "start": [start] * len(self.assets),
                "end": [end] * len(self.assets),
                "rows": [rows] * len(self.assets),
            }
        )

    # ── Factory classmethods ──────────────────────────────────────────────────

    @classmethod
    def from_risk_position(
        cls,
        prices: pl.DataFrame,
        risk_position: pl.DataFrame | pl.Expr,
        aum: float,
        vola: int | dict[str, int] = 32,
        vol_cap: float | None = None,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio from per-asset risk positions.

        Converts each risk position to a cash position by dividing it by an
        EWMA volatility estimate derived from the corresponding price series.

        Args:
            prices: Price levels per asset over time (may include a date column).
            risk_position: Risk units per asset aligned with prices.
            vola: EWMA lookback (span-equivalent) used to estimate volatility.
                Pass an ``int`` to apply the same span to every asset, or a
                ``dict[str, int]`` to set a per-asset span (assets absent from
                the dict default to ``32``).  Every span value must be a
                positive integer; a ``ValueError`` is raised otherwise.  Dict
                keys that do not correspond to any numeric column in *prices*
                also raise a ``ValueError``.
            vol_cap: Optional lower bound for the EWMA volatility estimate.
                When provided, the vol series is clipped from below at this
                value before dividing the risk position, preventing
                position blow-up in calm, low-volatility regimes.  For
                example, ``vol_cap=0.05`` ensures the estimate is never below
                0.05; note that the estimate is computed from per-period
                ``pct_change`` returns and is not annualised in this method.
                Must be positive when not ``None``.
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_model: Optional `CostModel`
                instance.  When supplied, its ``cost_per_unit`` and
                ``cost_bps`` values take precedence over the individual
                parameters above.

        Returns:
            A Portfolio instance whose cash positions are risk_position
            divided by EWMA volatility.

        Raises:
            ValueError: If any span value in *vola* is ≤ 0, or if a key in a
                *vola* dict does not match any numeric column in *prices*, or
                if *vol_cap* is provided but is not positive.
        """
        if isinstance(risk_position, pl.Expr):
            risk_position = prices.with_columns(risk_position)
        if cost_model is not None:
            cost_per_unit = cost_model.cost_per_unit
            cost_bps = cost_model.cost_bps
        assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]

        # ── Validate vol_cap ──────────────────────────────────────────────────
        if vol_cap is not None and vol_cap <= 0:
            raise ValueError(f"vol_cap must be a positive number when provided, got {vol_cap!r}")  # noqa: TRY003

        # ── Validate vola ─────────────────────────────────────────────────────
        if isinstance(vola, dict):
            unknown = set(vola.keys()) - set(assets)
            if unknown:
                raise ValueError(  # noqa: TRY003
                    f"vola dict contains keys that do not match any numeric column in prices: {sorted(unknown)}"
                )
            for asset, span in vola.items():
                if int(span) <= 0:
                    raise ValueError(f"vola span for '{asset}' must be a positive integer, got {span!r}")  # noqa: TRY003
        else:
            if int(vola) <= 0:
                raise ValueError(f"vola span must be a positive integer, got {vola!r}")  # noqa: TRY003

        def _span(asset: str) -> int:
            """Return the EWMA span for *asset*, falling back to 32 if not specified."""
            if isinstance(vola, dict):
                return int(vola.get(asset, 32))
            return int(vola)

        def _vol(asset: str) -> pl.Series:
            """Return the EWMA volatility series for *asset*, optionally clipped from below."""
            vol = prices[asset].pct_change().ewm_std(com=_span(asset) - 1, adjust=True, min_samples=_span(asset))
            if vol_cap is not None:
                vol = vol.clip(lower_bound=vol_cap)
            return vol

        cash_position = risk_position.with_columns((pl.col(asset) / _vol(asset)).alias(asset) for asset in assets)
        return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

    @classmethod
    def from_position(
        cls,
        prices: pl.DataFrame,
        position: pl.DataFrame | pl.Expr,
        aum: float,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio from share/unit positions.

        Converts *position* (number of units held per asset) to cash exposure
        by multiplying element-wise with *prices*, then delegates to
        `from_cash_position`.

        Args:
            prices: Price levels per asset over time (may include a date column).
            position: Number of units held per asset over time, aligned with
                *prices*.  Non-numeric columns (e.g. ``'date'``) are passed
                through unchanged.
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_model: Optional `CostModel` instance.
                When supplied, its ``cost_per_unit`` and ``cost_bps`` values
                take precedence over the individual parameters above.

        Returns:
            A Portfolio instance whose cash positions equal *position* x *prices*.

        Examples:
            >>> import polars as pl
            >>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
            >>> pos = pl.DataFrame({"A": [10.0, 10.0, 10.0]})
            >>> pf = Portfolio.from_position(prices=prices, position=pos, aum=1e6)
            >>> pf.cashposition["A"].to_list()
            [1000.0, 1100.0, 1050.0]
        """
        if isinstance(position, pl.Expr):
            position = prices.with_columns(position)
        assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]
        cash_position = position.with_columns((pl.col(asset) * prices[asset]).alias(asset) for asset in assets)
        return cls.from_cash_position(
            prices=prices,
            cash_position=cash_position,
            aum=aum,
            cost_per_unit=cost_per_unit,
            cost_bps=cost_bps,
            cost_model=cost_model,
        )

    @classmethod
    def from_cash_position(
        cls,
        prices: pl.DataFrame,
        cash_position: pl.DataFrame | pl.Expr,
        aum: float,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio directly from cash positions aligned with prices.

        Args:
            prices: Price levels per asset over time (may include a date column).
            cash_position: Cash exposure per asset over time, either as a
                DataFrame or as a Polars expression evaluated against *prices*.
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_model: Optional `CostModel`
                instance.  When supplied, its ``cost_per_unit`` and
                ``cost_bps`` values take precedence over the individual
                parameters above.

        Returns:
            A Portfolio instance with the provided cash positions.
        """
        if isinstance(cash_position, pl.Expr):
            cash_position = prices.with_columns(cash_position)
        if cost_model is not None:
            cost_per_unit = cost_model.cost_per_unit
            cost_bps = cost_model.cost_bps
        return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

    # ── Internal helpers ───────────────────────────────────────────────────────

    @staticmethod
    def _assert_clean_series(series: pl.Series, name: str = "") -> None:
        """Raise ValueError if *series* contains nulls or non-finite values."""
        if series.null_count() != 0:
            raise ValueError
        if not series.is_finite().all():
            raise ValueError

    # ── Core data properties ───────────────────────────────────────────────────

    @property
    def assets(self) -> list[str]:
        """List the asset column names from prices (numeric columns).

        Returns:
            list[str]: Names of numeric columns in prices; typically excludes
            ``'date'``.
        """
        return [c for c in self.prices.columns if self.prices[c].dtype.is_numeric()]

    # ── Lazy composition accessors ─────────────────────────────────────────────

    @property
    def data(self) -> "Data":
        """Build a legacy `Data` object from this portfolio's returns.

        This bridges the two entry points: ``Portfolio`` compiles the NAV curve from
        prices and positions; the returned `Data` object
        gives access to the full legacy analytics pipeline (``data.stats``,
        ``data.plots``, ``data.reports``).

        Returns:
            `Data`: A Data object whose ``returns`` column
            is the portfolio's daily return series and whose ``index`` holds the date
            column (or a synthetic integer index for date-free portfolios).

        Examples:
            >>> import polars as pl
            >>> from datetime import date
            >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
            >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
            >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
            >>> d = pf.data
            >>> "returns" in d.returns.columns
            True
        """
        if self._data_bridge is not None:
            return self._data_bridge
        bridge = Portfolio._build_data_bridge(self.returns)
        object.__setattr__(self, "_data_bridge", bridge)
        return bridge

    @property
    def stats(self) -> "Stats":
        """Return a Stats object built from the portfolio's daily returns.

        Delegates to the legacy `Stats` pipeline via
        `data`, so all analytics (Sharpe, drawdown, summary, etc.) are
        available through the shared implementation.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._stats_cache is None:
            object.__setattr__(self, "_stats_cache", self.data.stats)
        return self._stats_cache  # type: ignore[return-value]

    @property
    def plots(self) -> PortfolioPlots:
        """Convenience accessor returning a PortfolioPlots facade for this portfolio.

        Use this to create Plotly visualizations such as snapshots, lagged
        performance curves, and lead/lag IR charts.

        Returns:
            `PortfolioPlots`: Helper object with
            plotting methods.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._plots_cache is None:
            object.__setattr__(self, "_plots_cache", PortfolioPlots(self))
        return self._plots_cache  # type: ignore[return-value]

    @property
    def report(self) -> Report:
        """Convenience accessor returning a Report facade for this portfolio.

        Use this to generate a self-contained HTML performance report
        containing statistics tables and interactive charts.

        Returns:
            `Report`: Helper object with
            report methods.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._report_cache is None:
            object.__setattr__(self, "_report_cache", Report(self))
        return self._report_cache  # type: ignore[return-value]

    @property
    def utils(self) -> "PortfolioUtils":
        """Convenience accessor returning a PortfolioUtils facade for this portfolio.

        Use this for common data transformations such as converting returns to
        prices, computing log returns, rebasing, aggregating by period, and
        computing exponential standard deviation.

        Returns:
            `PortfolioUtils`: Helper object with
            utility transform methods.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._utils_cache is None:
            from ._utils import PortfolioUtils

            object.__setattr__(self, "_utils_cache", PortfolioUtils(self))
        return self._utils_cache  # type: ignore[return-value]

    # ── Portfolio transforms ───────────────────────────────────────────────────

    def truncate(
        self,
        start: date | datetime | str | int | None = None,
        end: date | datetime | str | int | None = None,
    ) -> "Portfolio":
        """Return a new Portfolio truncated to the inclusive [start, end] range.

        When a ``'date'`` column is present in both prices and cash positions,
        truncation is performed by comparing the ``'date'`` column against
        ``start`` and ``end`` (which should be date/datetime values or strings
        parseable by Polars).

        When the ``'date'`` column is absent, integer-based row slicing is
        used instead.  In this case ``start`` and ``end`` must be non-negative
        integers representing 0-based row indices.  Passing non-integer bounds
        to an integer-indexed portfolio raises `TypeError`.

        In all cases the ``aum`` value is preserved.

        Args:
            start: Optional lower bound (inclusive). A date/datetime or
                Polars-parseable string when a ``'date'`` column exists; a
                non-negative int row index when the data has no ``'date'``
                column.
            end: Optional upper bound (inclusive). Same type rules as
                ``start``.

        Returns:
            A new Portfolio instance with prices and cash positions filtered
            to the specified range.

        Raises:
            TypeError: When the portfolio has no ``'date'`` column and a
                non-integer bound is supplied.
        """
        has_date = "date" in self.prices.columns
        if has_date:
            cond = pl.lit(True)
            if start is not None:
                cond = cond & (pl.col("date") >= pl.lit(start))
            if end is not None:
                cond = cond & (pl.col("date") <= pl.lit(end))
            pr = self.prices.filter(cond)
            cp = self.cashposition.filter(cond)
        else:
            if start is not None and not isinstance(start, int):
                raise IntegerIndexBoundError("start", type(start).__name__)
            if end is not None and not isinstance(end, int):
                raise IntegerIndexBoundError("end", type(end).__name__)
            row_start = int(start) if start is not None else 0
            row_end = int(end) + 1 if end is not None else self.prices.height
            length = max(0, row_end - row_start)
            pr = self.prices.slice(row_start, length)
            cp = self.cashposition.slice(row_start, length)
        return Portfolio(
            prices=pr,
            cashposition=cp,
            aum=self.aum,
            cost_per_unit=self.cost_per_unit,
            cost_bps=self.cost_bps,
        )

    def lag(self, n: int) -> "Portfolio":
        """Return a new Portfolio with cash positions lagged by ``n`` steps.

        This method shifts the numeric asset columns in the cashposition
        DataFrame by ``n`` rows, preserving the ``'date'`` column and any
        non-numeric columns unchanged.  Positive ``n`` delays weights (moves
        them down); negative ``n`` leads them (moves them up); ``n == 0``
        returns the current portfolio unchanged.

        Notes:
            Missing values introduced by the shift are left as nulls;
            downstream profit computation already guards and treats nulls as
            zero when multiplying by returns.

        Args:
            n: Number of rows to shift (can be negative, zero, or positive).

        Returns:
            A new Portfolio instance with lagged cash positions and the same
            prices/AUM as the original.

        Raises:
            TypeError: If ``n`` is not an ``int``.
        """
        if not isinstance(n, int):
            raise TypeError
        if n == 0:
            return self

        assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
        cp_lagged = self.cashposition.with_columns(pl.col(c).shift(n) for c in assets)
        return Portfolio(
            prices=self.prices,
            cashposition=cp_lagged,
            aum=self.aum,
            cost_per_unit=self.cost_per_unit,
            cost_bps=self.cost_bps,
        )

    def smoothed_holding(self, n: int) -> "Portfolio":
        """Return a new Portfolio with cash positions smoothed by a rolling mean.

        Applies a trailing window average over the last ``n`` steps for each
        numeric asset column (excluding ``'date'``). The window length is
        ``n + 1`` so that:

        - n=0 returns the original weights (no smoothing),
        - n=1 averages the current and previous weights,
        - n=k averages the current and last k weights.

        Args:
            n: Non-negative integer specifying how many previous steps to
                include.

        Returns:
            A new Portfolio with smoothed cash positions and the same
            prices/AUM.

        Raises:
            TypeError: If ``n`` is not an ``int``.
            ValueError: If ``n`` is negative.
        """
        if not isinstance(n, int):
            raise TypeError
        if n < 0:
            raise ValueError
        if n == 0:
            return self

        assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
        window = n + 1
        cp_smoothed = self.cashposition.with_columns(
            pl.col(c).rolling_mean(window_size=window, min_samples=1).alias(c) for c in assets
        )
        return Portfolio(
            prices=self.prices,
            cashposition=cp_smoothed,
            aum=self.aum,
            cost_per_unit=self.cost_per_unit,
            cost_bps=self.cost_bps,
        )

    # ── Utility ────────────────────────────────────────────────────────────────

    def correlation(self, frame: pl.DataFrame, name: str = "portfolio") -> pl.DataFrame:
        """Compute a correlation matrix of asset returns plus the portfolio.

        Computes percentage changes for all numeric columns in ``frame``,
        appends the portfolio profit series under the provided ``name``, and
        returns the Pearson correlation matrix across all numeric columns.

        Args:
            frame: A Polars DataFrame containing at least the asset price
                columns (and a date column which will be ignored if
                non-numeric).
            name: The column name to use when adding the portfolio profit
                series to the input frame.

        Returns:
            A square Polars DataFrame where each cell is the correlation
            between a pair of series (values in [-1, 1]).
        """
        p = frame.with_columns(cs.by_dtype(pl.Float32, pl.Float64).pct_change())
        p = p.with_columns(pl.Series(name, self.profit["profit"]))
        corr_matrix = p.select(cs.numeric()).fill_null(0.0).corr()
        return corr_matrix
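
The three transforms in the listing above (`truncate` on a portfolio without a date column, `lag`, and `smoothed_holding`) come down to simple row arithmetic. The sketch below mirrors that arithmetic on plain Python lists instead of Polars frames, so it can be read and run without the library. The helper names `truncate_rows`, `lag_rows`, and `smooth_rows` are hypothetical and are not part of jquantstats; null handling inside the smoothing window is deliberately left out.

```python
from __future__ import annotations


def truncate_rows(rows: list[float], start: int | None = None, end: int | None = None) -> list[float]:
    """Inclusive [start, end] slice by 0-based row index (date-free branch of truncate)."""
    row_start = start if start is not None else 0
    row_end = end + 1 if end is not None else len(rows)  # +1 makes `end` inclusive
    length = max(0, row_end - row_start)  # an inverted range yields an empty result
    return rows[row_start : row_start + length]


def lag_rows(rows: list[float], n: int) -> list[float | None]:
    """Shift rows down by n (n > 0) or up by |n| (n < 0); vacated slots become None."""
    size = len(rows)
    if n == 0:
        return list(rows)
    if abs(n) >= size:
        return [None] * size
    if n > 0:
        return [None] * n + rows[: size - n]
    return rows[-n:] + [None] * (-n)


def smooth_rows(rows: list[float], n: int) -> list[float]:
    """Trailing mean over a window of n + 1 rows; early rows average what is available."""
    if n < 0:
        raise ValueError("n must be non-negative")
    smoothed = []
    for i in range(len(rows)):
        window = rows[max(0, i - n) : i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


positions = [2.0, 4.0, 6.0]
print(truncate_rows(positions, start=1, end=2))
print(lag_rows(positions, 1))
print(smooth_rows(positions, 1))
```

Averaging over the rows that are available at the start of the series corresponds to the `min_samples=1` argument in the listing, which is why the first smoothed value equals the first raw value.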

assets property

List the asset column names from prices (numeric columns).

Returns:

Type Description
list[str]

Names of numeric columns in prices; typically excludes 'date'.

cost_model property

Return the active cost model as a CostModel instance.

Returns:

Type Description
CostModel

A CostModel whose cost_per_unit and cost_bps fields reflect the values stored on this portfolio.
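
In the factory methods of the source listing (`from_cash_position` and `from_risk_position`), a supplied `cost_model` replaces both raw float parameters. A minimal sketch of that precedence rule follows. It uses a stand-in dataclass, `CostModelSketch`, and a hypothetical helper, `resolve_costs`; neither exists in jquantstats, and the real `CostModel` may perform validation that is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CostModelSketch:
    """Stand-in for CostModel holding the two cost parameters (illustrative only)."""

    cost_per_unit: float = 0.0
    cost_bps: float = 0.0


def resolve_costs(
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: CostModelSketch | None = None,
) -> tuple[float, float]:
    """Return the (cost_per_unit, cost_bps) pair a factory would pass on."""
    if cost_model is not None:
        # The model wins: both raw floats are ignored, not merged field by field.
        return cost_model.cost_per_unit, cost_model.cost_bps
    return cost_per_unit, cost_bps


print(resolve_costs(cost_per_unit=0.01))
print(resolve_costs(cost_bps=5.0, cost_model=CostModelSketch(cost_per_unit=0.02)))
```

In the second call the raw `cost_bps=5.0` is dropped because the model is taken as a whole, which matches the "Ignored when *cost_model* is given" wording in the factory docstrings.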

data property

Build a legacy Data object from this portfolio's returns.

This bridges the two entry points: Portfolio compiles the NAV curve from prices and positions; the returned Data object gives access to the full legacy analytics pipeline (data.stats, data.plots, data.reports).

Returns:

Type Description
Data

A Data object whose returns column is the portfolio's daily return series and whose index holds the date column (or a synthetic integer index for date-free portfolios).

Examples:

>>> import polars as pl
>>> from datetime import date
>>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
>>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
>>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
>>> d = pf.data
>>> "returns" in d.returns.columns
True

plots property

Convenience accessor returning a PortfolioPlots facade for this portfolio.

Use this to create Plotly visualizations such as snapshots, lagged performance curves, and lead/lag IR charts.

Returns:

Type Description
PortfolioPlots

Helper object with plotting methods.

The result is cached after first access so repeated calls are O(1).

report property

Convenience accessor returning a Report facade for this portfolio.

Use this to generate a self-contained HTML performance report containing statistics tables and interactive charts.

Returns:

Type Description
Report

Helper object with report methods.

The result is cached after first access so repeated calls are O(1).

stats property

Return a Stats object built from the portfolio's daily returns.

Delegates to the legacy Stats pipeline via data, so all analytics (Sharpe, drawdown, summary, etc.) are available through the shared implementation.

The result is cached after first access so repeated calls are O(1).

utils property

Convenience accessor returning a PortfolioUtils facade for this portfolio.

Use this for common data transformations such as converting returns to prices, computing log returns, rebasing, aggregating by period, and computing exponential standard deviation.

Returns:

Type Description
PortfolioUtils

Helper object with utility transform methods.

The result is cached after first access so repeated calls are O(1).

__post_init__()

Validate input types, shapes, and parameters post-initialization.

Source code in src/jquantstats/portfolio.py
def __post_init__(self) -> None:
    """Validate input types, shapes, and parameters post-initialization."""
    if not isinstance(self.prices, pl.DataFrame):
        raise InvalidPricesTypeError(type(self.prices).__name__)
    if not isinstance(self.cashposition, pl.DataFrame):
        raise InvalidCashPositionTypeError(type(self.cashposition).__name__)
    if self.cashposition.shape[0] != self.prices.shape[0]:
        raise RowCountMismatchError(self.prices.shape[0], self.cashposition.shape[0])
    if self.aum <= 0.0:
        raise NonPositiveAumError(self.aum)
    object.__setattr__(self, "_data_bridge", None)
    object.__setattr__(self, "_stats_cache", None)
    object.__setattr__(self, "_plots_cache", None)
    object.__setattr__(self, "_report_cache", None)
    object.__setattr__(self, "_utils_cache", None)
    object.__setattr__(self, "_profits_cache", None)
    object.__setattr__(self, "_returns_cache", None)
    object.__setattr__(self, "_tilt_cache", None)
    object.__setattr__(self, "_turnover_cache", None)
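The `object.__setattr__` calls above suggest that the class is a frozen dataclass whose accessors fill a cache slot on first use. The following is a minimal sketch of that pattern, not the library's code: `Account`, `summary`, and `_summary_cache` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Hypothetical frozen dataclass with one lazily cached accessor."""

    aum: float

    def __post_init__(self) -> None:
        # Validate first, mirroring the "fail fast" style of the listing above.
        if self.aum <= 0.0:
            raise ValueError(f"aum must be positive, got {self.aum!r}")
        # Frozen dataclasses reject normal assignment, so the cache slot is
        # initialised through object.__setattr__.
        object.__setattr__(self, "_summary_cache", None)

    @property
    def summary(self) -> dict:
        # Build the value once, then hand back the same object on every call.
        if self._summary_cache is None:
            object.__setattr__(self, "_summary_cache", {"aum": self.aum})
        return self._summary_cache


acct = Account(aum=1_000_000.0)
first = acct.summary
second = acct.summary  # served from the cache, same object as `first`
```

Because the instance is immutable from the caller's point of view, a cached value never needs to be invalidated; methods that change positions return a new instance instead.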

__repr__()

Return a string representation of the Portfolio object.

Source code in src/jquantstats/portfolio.py
def __repr__(self) -> str:
    """Return a string representation of the Portfolio object."""
    rows, start, end = self._date_range()
    if start is not None:
        return f"Portfolio(assets={self.assets}, rows={rows}, start={start}, end={end})"
    return f"Portfolio(assets={self.assets}, rows={rows})"

correlation(frame, name='portfolio')

Compute a correlation matrix of asset returns plus the portfolio.

Computes percentage changes for all numeric columns in frame, appends the portfolio profit series under the provided name, and returns the Pearson correlation matrix across all numeric columns.

Parameters:

Name Type Description Default
frame DataFrame

A Polars DataFrame containing at least the asset price columns (and a date column which will be ignored if non-numeric).

required
name str

The column name to use when adding the portfolio profit series to the input frame.

'portfolio'

Returns:

Type Description
DataFrame

A square Polars DataFrame where each cell is the correlation between a pair of series (values in [-1, 1]).

Source code in src/jquantstats/portfolio.py
def correlation(self, frame: pl.DataFrame, name: str = "portfolio") -> pl.DataFrame:
    """Compute a correlation matrix of asset returns plus the portfolio.

    Computes percentage changes for all numeric columns in ``frame``,
    appends the portfolio profit series under the provided ``name``, and
    returns the Pearson correlation matrix across all numeric columns.

    Args:
        frame: A Polars DataFrame containing at least the asset price
            columns (and a date column which will be ignored if
            non-numeric).
        name: The column name to use when adding the portfolio profit
            series to the input frame.

    Returns:
        A square Polars DataFrame where each cell is the correlation
        between a pair of series (values in [-1, 1]).
    """
    p = frame.with_columns(cs.by_dtype(pl.Float32, pl.Float64).pct_change())
    p = p.with_columns(pl.Series(name, self.profit["profit"]))
    corr_matrix = p.select(cs.numeric()).fill_null(0.0).corr()
    return corr_matrix
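The three steps in the listing (percentage change, null fill, Pearson correlation) can be traced on plain lists. This is a library-free sketch under the assumption of a single asset column and no missing prices; `pct_change`, `fill_null`, and `pearson` are illustrative helpers, not functions exported by the package.

```python
from math import sqrt


def pct_change(values):
    """Period-over-period change; the first row has no predecessor."""
    return [None] + [curr / prev - 1.0 for prev, curr in zip(values, values[1:])]


def fill_null(values, fill=0.0):
    """Replace missing entries, as fill_null(0.0) does before corr()."""
    return [fill if v is None else v for v in values]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


prices_a = [100.0, 110.0, 99.0, 108.9]
profit = [0.0, 100.0, -100.0, 100.0]  # made-up profit series, proportional to returns

returns_a = fill_null(pct_change(prices_a))
rho = pearson(returns_a, profit)  # close to 1.0 for this toy input
```

Note that the null in the first row is replaced by 0.0 rather than dropped, so that row enters the correlation as a zero-return observation.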

describe()

Return a tidy summary of shape, date range and asset names.

Returns:

Type Description
DataFrame

One row per asset with columns: asset, start, end, rows.

Examples:

>>> import polars as pl
>>> from datetime import date
>>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
>>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
>>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
>>> df = pf.describe()
>>> list(df.columns)
['asset', 'start', 'end', 'rows']
Source code in src/jquantstats/portfolio.py
def describe(self) -> pl.DataFrame:
    """Return a tidy summary of shape, date range and asset names.

    Returns:
        pl.DataFrame: One row per asset with columns: asset, start, end, rows.

    Examples:
        >>> import polars as pl
        >>> from datetime import date
        >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
        >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
        >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
        >>> df = pf.describe()
        >>> list(df.columns)
        ['asset', 'start', 'end', 'rows']
    """
    rows, start, end = self._date_range()
    return pl.DataFrame(
        {
            "asset": self.assets,
            "start": [start] * len(self.assets),
            "end": [end] * len(self.assets),
            "rows": [rows] * len(self.assets),
        }
    )

from_cash_position(prices, cash_position, aum, cost_per_unit=0.0, cost_bps=0.0, cost_model=None) classmethod

Create a Portfolio directly from cash positions aligned with prices.

Parameters:

Name Type Description Default
prices DataFrame

Price levels per asset over time (may include a date column).

required
cash_position DataFrame | Expr

Cash exposure per asset over time, either as a DataFrame or as a Polars expression evaluated against prices.

required
aum float

Assets under management used as the base NAV offset.

required
cost_per_unit float

One-way trading cost per unit of position change. Defaults to 0.0 (no cost). Ignored when cost_model is given.

0.0
cost_bps float

One-way trading cost in basis points of AUM turnover. Defaults to 0.0 (no cost). Ignored when cost_model is given.

0.0
cost_model CostModel | None

Optional CostModel instance. When supplied, its cost_per_unit and cost_bps values take precedence over the individual parameters above.

None

Returns:

Type Description
Self

A Portfolio instance with the provided cash positions.

Source code in src/jquantstats/portfolio.py
@classmethod
def from_cash_position(
    cls,
    prices: pl.DataFrame,
    cash_position: pl.DataFrame | pl.Expr,
    aum: float,
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: CostModel | None = None,
) -> Self:
    """Create a Portfolio directly from cash positions aligned with prices.

    Args:
        prices: Price levels per asset over time (may include a date column).
        cash_position: Cash exposure per asset over time, either as a
            DataFrame or as a Polars expression evaluated against *prices*.
        aum: Assets under management used as the base NAV offset.
        cost_per_unit: One-way trading cost per unit of position change.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_bps: One-way trading cost in basis points of AUM turnover.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_model: Optional `CostModel`
            instance.  When supplied, its ``cost_per_unit`` and
            ``cost_bps`` values take precedence over the individual
            parameters above.

    Returns:
        A Portfolio instance with the provided cash positions.
    """
    if isinstance(cash_position, pl.Expr):
        cash_position = prices.with_columns(cash_position)
    if cost_model is not None:
        cost_per_unit = cost_model.cost_per_unit
        cost_bps = cost_model.cost_bps
    return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)
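The precedence rule in the listing is all-or-nothing: when a cost model is supplied, both raw parameters are replaced, including one the caller set explicitly. A small sketch of that resolution step follows; `SketchCostModel` and `resolve_costs` are hypothetical stand-ins, not part of the package.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchCostModel:
    """Hypothetical stand-in with the two fields the listing reads."""

    cost_per_unit: float = 0.0
    cost_bps: float = 0.0


def resolve_costs(
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: Optional[SketchCostModel] = None,
) -> Tuple[float, float]:
    """Return the (cost_per_unit, cost_bps) pair that would be stored."""
    if cost_model is not None:
        # Both values come from the model; the raw arguments are ignored.
        return cost_model.cost_per_unit, cost_model.cost_bps
    return cost_per_unit, cost_bps


overridden = resolve_costs(cost_bps=5.0, cost_model=SketchCostModel(cost_per_unit=0.01))
plain = resolve_costs(cost_bps=5.0)
# overridden -> (0.01, 0.0): the explicit cost_bps=5.0 is dropped
# plain      -> (0.0, 5.0)
```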

from_position(prices, position, aum, cost_per_unit=0.0, cost_bps=0.0, cost_model=None) classmethod

Create a Portfolio from share/unit positions.

Converts position (number of units held per asset) to cash exposure by multiplying element-wise with prices, then delegates to from_cash_position.

Parameters:

Name Type Description Default
prices DataFrame

Price levels per asset over time (may include a date column).

required
position DataFrame | Expr

Number of units held per asset over time, aligned with prices. Non-numeric columns (e.g. 'date') are passed through unchanged.

required
aum float

Assets under management used as the base NAV offset.

required
cost_per_unit float

One-way trading cost per unit of position change. Defaults to 0.0 (no cost). Ignored when cost_model is given.

0.0
cost_bps float

One-way trading cost in basis points of AUM turnover. Defaults to 0.0 (no cost). Ignored when cost_model is given.

0.0
cost_model CostModel | None

Optional CostModel instance. When supplied, its cost_per_unit and cost_bps values take precedence over the individual parameters above.

None

Returns:

Type Description
Self

A Portfolio instance whose cash positions equal position x prices.

Examples:

>>> import polars as pl
>>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
>>> pos = pl.DataFrame({"A": [10.0, 10.0, 10.0]})
>>> pf = Portfolio.from_position(prices=prices, position=pos, aum=1e6)
>>> pf.cashposition["A"].to_list()
[1000.0, 1100.0, 1050.0]
Source code in src/jquantstats/portfolio.py
@classmethod
def from_position(
    cls,
    prices: pl.DataFrame,
    position: pl.DataFrame | pl.Expr,
    aum: float,
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: CostModel | None = None,
) -> Self:
    """Create a Portfolio from share/unit positions.

    Converts *position* (number of units held per asset) to cash exposure
    by multiplying element-wise with *prices*, then delegates to
    ``from_cash_position``.

    Args:
        prices: Price levels per asset over time (may include a date column).
        position: Number of units held per asset over time, aligned with
            *prices*.  Non-numeric columns (e.g. ``'date'``) are passed
            through unchanged.
        aum: Assets under management used as the base NAV offset.
        cost_per_unit: One-way trading cost per unit of position change.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_bps: One-way trading cost in basis points of AUM turnover.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_model: Optional `CostModel` instance.
            When supplied, its ``cost_per_unit`` and ``cost_bps`` values
            take precedence over the individual parameters above.

    Returns:
        A Portfolio instance whose cash positions equal *position* x *prices*.

    Examples:
        >>> import polars as pl
        >>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
        >>> pos = pl.DataFrame({"A": [10.0, 10.0, 10.0]})
        >>> pf = Portfolio.from_position(prices=prices, position=pos, aum=1e6)
        >>> pf.cashposition["A"].to_list()
        [1000.0, 1100.0, 1050.0]
    """
    if isinstance(position, pl.Expr):
        position = prices.with_columns(position)
    assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]
    cash_position = position.with_columns((pl.col(asset) * prices[asset]).alias(asset) for asset in assets)
    return cls.from_cash_position(
        prices=prices,
        cash_position=cash_position,
        aum=aum,
        cost_per_unit=cost_per_unit,
        cost_bps=cost_bps,
        cost_model=cost_model,
    )

from_risk_position(prices, risk_position, aum, vola=32, vol_cap=None, cost_per_unit=0.0, cost_bps=0.0, cost_model=None) classmethod

Create a Portfolio from per-asset risk positions.

De-volatizes each risk position using an EWMA volatility estimate derived from the corresponding price series.

Parameters:

Name Type Description Default
prices DataFrame

Price levels per asset over time (may include a date column).

required
risk_position DataFrame | Expr

Risk units per asset aligned with prices.

required
vola int | dict[str, int]

EWMA lookback (span-equivalent) used to estimate volatility. Pass an int to apply the same span to every asset, or a dict[str, int] to set a per-asset span (assets absent from the dict default to 32). Every span value must be a positive integer; a ValueError is raised otherwise. Dict keys that do not correspond to any numeric column in prices also raise a ValueError.

32
vol_cap float | None

Optional lower bound for the EWMA volatility estimate. When provided, the vol series is clipped from below at this value before dividing the risk position, preventing position blow-up in calm, low-volatility regimes. For example, vol_cap=0.05 keeps the estimate at or above 5%. In the source shown below the estimate is computed from per-period percentage changes of prices without an annualisation factor, so the bound is in per-period units. Must be positive when not None.

None
aum float

Assets under management used as the base NAV offset.

required
cost_per_unit float

One-way trading cost per unit of position change. Defaults to 0.0 (no cost). Ignored when cost_model is given.

0.0
cost_bps float

One-way trading cost in basis points of AUM turnover. Defaults to 0.0 (no cost). Ignored when cost_model is given.

0.0
cost_model CostModel | None

Optional CostModel instance. When supplied, its cost_per_unit and cost_bps values take precedence over the individual parameters above.

None

Returns:

Type Description
Self

A Portfolio instance whose cash positions are risk_position divided by EWMA volatility.

Raises:

Type Description
ValueError

If any span value in vola is ≤ 0, or if a key in a vola dict does not match any numeric column in prices, or if vol_cap is provided but is not positive.

Source code in src/jquantstats/portfolio.py
@classmethod
def from_risk_position(
    cls,
    prices: pl.DataFrame,
    risk_position: pl.DataFrame | pl.Expr,
    aum: float,
    vola: int | dict[str, int] = 32,
    vol_cap: float | None = None,
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: CostModel | None = None,
) -> Self:
    """Create a Portfolio from per-asset risk positions.

    De-volatizes each risk position using an EWMA volatility estimate
    derived from the corresponding price series.

    Args:
        prices: Price levels per asset over time (may include a date column).
        risk_position: Risk units per asset aligned with prices.
        vola: EWMA lookback (span-equivalent) used to estimate volatility.
            Pass an ``int`` to apply the same span to every asset, or a
            ``dict[str, int]`` to set a per-asset span (assets absent from
            the dict default to ``32``).  Every span value must be a
            positive integer; a ``ValueError`` is raised otherwise.  Dict
            keys that do not correspond to any numeric column in *prices*
            also raise a ``ValueError``.
        vol_cap: Optional lower bound for the EWMA volatility estimate.
            When provided, the vol series is clipped from below at this
            value before dividing the risk position, preventing
            position blow-up in calm, low-volatility regimes.  For
            example, ``vol_cap=0.05`` keeps the estimate at or above 5%.
            The estimate is built from per-period percentage changes, so
            the bound is per period.  Must be positive when not ``None``.
        aum: Assets under management used as the base NAV offset.
        cost_per_unit: One-way trading cost per unit of position change.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_bps: One-way trading cost in basis points of AUM turnover.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_model: Optional `CostModel`
            instance.  When supplied, its ``cost_per_unit`` and
            ``cost_bps`` values take precedence over the individual
            parameters above.

    Returns:
        A Portfolio instance whose cash positions are risk_position
        divided by EWMA volatility.

    Raises:
        ValueError: If any span value in *vola* is ≤ 0, or if a key in a
            *vola* dict does not match any numeric column in *prices*, or
            if *vol_cap* is provided but is not positive.
    """
    if isinstance(risk_position, pl.Expr):
        risk_position = prices.with_columns(risk_position)
    if cost_model is not None:
        cost_per_unit = cost_model.cost_per_unit
        cost_bps = cost_model.cost_bps
    assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]

    # ── Validate vol_cap ──────────────────────────────────────────────────
    if vol_cap is not None and vol_cap <= 0:
        raise ValueError(f"vol_cap must be a positive number when provided, got {vol_cap!r}")  # noqa: TRY003

    # ── Validate vola ─────────────────────────────────────────────────────
    if isinstance(vola, dict):
        unknown = set(vola.keys()) - set(assets)
        if unknown:
            raise ValueError(  # noqa: TRY003
                f"vola dict contains keys that do not match any numeric column in prices: {sorted(unknown)}"
            )
        for asset, span in vola.items():
            if int(span) <= 0:
                raise ValueError(f"vola span for '{asset}' must be a positive integer, got {span!r}")  # noqa: TRY003
    else:
        if int(vola) <= 0:
            raise ValueError(f"vola span must be a positive integer, got {vola!r}")  # noqa: TRY003

    def _span(asset: str) -> int:
        """Return the EWMA span for *asset*, falling back to 32 if not specified."""
        if isinstance(vola, dict):
            return int(vola.get(asset, 32))
        return int(vola)

    def _vol(asset: str) -> pl.Series:
        """Return the EWMA volatility series for *asset*, optionally clipped from below."""
        vol = prices[asset].pct_change().ewm_std(com=_span(asset) - 1, adjust=True, min_samples=_span(asset))
        if vol_cap is not None:
            vol = vol.clip(lower_bound=vol_cap)
        return vol

    cash_position = risk_position.with_columns((pl.col(asset) / _vol(asset)).alias(asset) for asset in assets)
    return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)
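Two details of the listing are easy to miss: assets absent from a `vola` dict fall back to a span of 32, and the floor is applied to the volatility before the division. The sketch below reproduces only those two steps on plain lists and takes the volatility series as given; it does not reimplement the EWMA estimator. `resolve_span` and `devolatize` are hypothetical helper names.

```python
DEFAULT_SPAN = 32  # fallback used by the listing for assets not in the dict


def resolve_span(vola, asset):
    """Pick the span for one asset from an int or a per-asset dict."""
    if isinstance(vola, dict):
        return int(vola.get(asset, DEFAULT_SPAN))
    return int(vola)


def devolatize(risk, vol, vol_floor=None):
    """Divide risk units by volatility, clipping volatility from below first."""
    out = []
    for r, v in zip(risk, vol):
        if r is None or v is None:
            # Warm-up rows without a volatility estimate stay missing.
            out.append(None)
            continue
        if vol_floor is not None:
            v = max(v, vol_floor)
        out.append(r / v)
    return out


spans = {asset: resolve_span({"A": 16}, asset) for asset in ("A", "B")}
cash = devolatize([1.0, 1.0, 1.0], [None, 0.001, 0.02], vol_floor=0.01)
# spans -> {"A": 16, "B": 32}
# Without the floor the second row would be 1.0 / 0.001 = 1000; with it, 100.
```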

lag(n)

Return a new Portfolio with cash positions lagged by n steps.

This method shifts the numeric asset columns in the cashposition DataFrame by n rows, preserving the 'date' column and any non-numeric columns unchanged. Positive n delays weights (moves them down); negative n leads them (moves them up); n == 0 returns the current portfolio unchanged.

Notes

Missing values introduced by the shift are left as nulls; downstream profit computation already guards and treats nulls as zero when multiplying by returns.

Parameters:

Name Type Description Default
n int

Number of rows to shift (can be negative, zero, or positive).

required

Returns:

Type Description
Portfolio

A new Portfolio instance with lagged cash positions and the same prices/AUM as the original.

Source code in src/jquantstats/portfolio.py
def lag(self, n: int) -> "Portfolio":
    """Return a new Portfolio with cash positions lagged by ``n`` steps.

    This method shifts the numeric asset columns in the cashposition
    DataFrame by ``n`` rows, preserving the ``'date'`` column and any
    non-numeric columns unchanged.  Positive ``n`` delays weights (moves
    them down); negative ``n`` leads them (moves them up); ``n == 0``
    returns the current portfolio unchanged.

    Notes:
        Missing values introduced by the shift are left as nulls;
        downstream profit computation already guards and treats nulls as
        zero when multiplying by returns.

    Args:
        n: Number of rows to shift (can be negative, zero, or positive).

    Returns:
        A new Portfolio instance with lagged cash positions and the same
        prices/AUM as the original.
    """
    if not isinstance(n, int):
        raise TypeError
    if n == 0:
        return self

    assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
    cp_lagged = self.cashposition.with_columns(pl.col(c).shift(n) for c in assets)
    return Portfolio(
        prices=self.prices,
        cashposition=cp_lagged,
        aum=self.aum,
        cost_per_unit=self.cost_per_unit,
        cost_bps=self.cost_bps,
    )
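The direction of the shift is the part most often confused, so here is a plain-list sketch of the same convention: positive `n` pushes values down and leaves missing entries at the top, negative `n` pulls them up. `shift` is an illustrative helper written for this example.

```python
def shift(values, n):
    """Shift a list by n rows, filling the vacated slots with None."""
    size = len(values)
    if n == 0:
        return list(values)
    if abs(n) >= size:
        return [None] * size
    if n > 0:
        # Delay: today's row shows the position from n rows earlier.
        return [None] * n + list(values[:-n])
    # Lead: today's row shows the position from |n| rows later.
    return list(values[-n:]) + [None] * (-n)


positions = [10.0, 20.0, 30.0, 40.0]
delayed = shift(positions, 1)   # [None, 10.0, 20.0, 30.0]
led = shift(positions, -1)      # [20.0, 30.0, 40.0, None]
```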

smoothed_holding(n)

Return a new Portfolio with cash positions smoothed by a rolling mean.

Applies a trailing window average over the last n steps for each numeric asset column (excluding 'date'). The window length is n + 1 so that:

  • n=0 returns the original weights (no smoothing),
  • n=1 averages the current and previous weights,
  • n=k averages the current and last k weights.

Parameters:

Name Type Description Default
n int

Non-negative integer specifying how many previous steps to include.

required

Returns:

Type Description
Portfolio

A new Portfolio with smoothed cash positions and the same prices/AUM.

Source code in src/jquantstats/portfolio.py
def smoothed_holding(self, n: int) -> "Portfolio":
    """Return a new Portfolio with cash positions smoothed by a rolling mean.

    Applies a trailing window average over the last ``n`` steps for each
    numeric asset column (excluding ``'date'``). The window length is
    ``n + 1`` so that:

    - n=0 returns the original weights (no smoothing),
    - n=1 averages the current and previous weights,
    - n=k averages the current and last k weights.

    Args:
        n: Non-negative integer specifying how many previous steps to
            include.

    Returns:
        A new Portfolio with smoothed cash positions and the same
        prices/AUM.
    """
    if not isinstance(n, int):
        raise TypeError
    if n < 0:
        raise ValueError
    if n == 0:
        return self

    assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
    window = n + 1
    cp_smoothed = self.cashposition.with_columns(
        pl.col(c).rolling_mean(window_size=window, min_samples=1).alias(c) for c in assets
    )
    return Portfolio(
        prices=self.prices,
        cashposition=cp_smoothed,
        aum=self.aum,
        cost_per_unit=self.cost_per_unit,
        cost_bps=self.cost_bps,
    )
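The combination of a window of `n + 1` and `min_samples=1` means the first rows are averaged over however many values exist so far, rather than being left empty. A plain-list sketch of that behaviour, assuming no missing positions; `trailing_mean` is a hypothetical helper.

```python
def trailing_mean(values, n):
    """Average each value with up to n predecessors (window = n + 1)."""
    if not isinstance(n, int):
        raise TypeError("n must be an int")
    if n < 0:
        raise ValueError("n must be non-negative")
    window = n + 1
    out = []
    for i in range(len(values)):
        # Early rows use a shorter window, matching min_samples=1.
        chunk = values[max(0, i - window + 1) : i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


positions = [10.0, 20.0, 30.0, 40.0]
smoothed_1 = trailing_mean(positions, 1)  # [10.0, 15.0, 25.0, 35.0]
smoothed_2 = trailing_mean(positions, 2)  # [10.0, 15.0, 20.0, 30.0]
```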

truncate(start=None, end=None)

Return a new Portfolio truncated to the inclusive [start, end] range.

When a 'date' column is present in both prices and cash positions, truncation is performed by comparing the 'date' column against start and end (which should be date/datetime values or strings parseable by Polars).

When the 'date' column is absent, integer-based row slicing is used instead. In this case start and end must be non-negative integers representing 0-based row indices. Passing non-integer bounds to an integer-indexed portfolio raises TypeError.

In all cases the aum value is preserved.

Parameters:

Name Type Description Default
start date | datetime | str | int | None

Optional lower bound (inclusive). A date/datetime or Polars-parseable string when a 'date' column exists; a non-negative int row index when the data has no 'date' column.

None
end date | datetime | str | int | None

Optional upper bound (inclusive). Same type rules as start.

None

Returns:

Type Description
Portfolio

A new Portfolio instance with prices and cash positions filtered to the specified range.

Raises:

Type Description
TypeError

When the portfolio has no 'date' column and a non-integer bound is supplied.

Source code in src/jquantstats/portfolio.py
def truncate(
    self,
    start: date | datetime | str | int | None = None,
    end: date | datetime | str | int | None = None,
) -> "Portfolio":
    """Return a new Portfolio truncated to the inclusive [start, end] range.

    When a ``'date'`` column is present in both prices and cash positions,
    truncation is performed by comparing the ``'date'`` column against
    ``start`` and ``end`` (which should be date/datetime values or strings
    parseable by Polars).

    When the ``'date'`` column is absent, integer-based row slicing is
    used instead.  In this case ``start`` and ``end`` must be non-negative
    integers representing 0-based row indices.  Passing non-integer bounds
    to an integer-indexed portfolio raises `TypeError`.

    In all cases the ``aum`` value is preserved.

    Args:
        start: Optional lower bound (inclusive). A date/datetime or
            Polars-parseable string when a ``'date'`` column exists; a
            non-negative int row index when the data has no ``'date'``
            column.
        end: Optional upper bound (inclusive). Same type rules as
            ``start``.

    Returns:
        A new Portfolio instance with prices and cash positions filtered
        to the specified range.

    Raises:
        TypeError: When the portfolio has no ``'date'`` column and a
            non-integer bound is supplied.
    """
    has_date = "date" in self.prices.columns
    if has_date:
        cond = pl.lit(True)
        if start is not None:
            cond = cond & (pl.col("date") >= pl.lit(start))
        if end is not None:
            cond = cond & (pl.col("date") <= pl.lit(end))
        pr = self.prices.filter(cond)
        cp = self.cashposition.filter(cond)
    else:
        if start is not None and not isinstance(start, int):
            raise IntegerIndexBoundError("start", type(start).__name__)
        if end is not None and not isinstance(end, int):
            raise IntegerIndexBoundError("end", type(end).__name__)
        row_start = int(start) if start is not None else 0
        row_end = int(end) + 1 if end is not None else self.prices.height
        length = max(0, row_end - row_start)
        pr = self.prices.slice(row_start, length)
        cp = self.cashposition.slice(row_start, length)
    return Portfolio(
        prices=pr,
        cashposition=cp,
        aum=self.aum,
        cost_per_unit=self.cost_per_unit,
        cost_bps=self.cost_bps,
    )
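For portfolios without a date column, the inclusive upper bound is the detail that differs from ordinary Python slicing. The sketch below mirrors the integer branch of the listing on a plain list; `truncate_rows` is a hypothetical helper, and it raises a plain `TypeError` where the library raises its own error type.

```python
def truncate_rows(rows, start=None, end=None):
    """Keep rows start..end, both inclusive, using 0-based indices."""
    for name, bound in (("start", start), ("end", end)):
        if bound is not None and not isinstance(bound, int):
            raise TypeError(f"{name} must be an int, got {type(bound).__name__}")
    row_start = start if start is not None else 0
    # The +1 turns the inclusive bound into an exclusive slice end.
    row_end = end + 1 if end is not None else len(rows)
    # An inverted range yields an empty result instead of an error.
    length = max(0, row_end - row_start)
    return rows[row_start : row_start + length]


rows = [0, 1, 2, 3, 4]
middle = truncate_rows(rows, start=1, end=3)   # [1, 2, 3]
head = truncate_rows(rows, end=1)              # [0, 1]
inverted = truncate_rows(rows, start=3, end=1) # []
```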

Result dataclass

Lightweight container for system outputs.

Attributes:

Name Type Description
portfolio Portfolio

The portfolio constructed by a system/experiment.

mu DataFrame | None

Optional per-asset expected-returns surface used by some systems.

Source code in src/jquantstats/result.py
@dataclass(frozen=True)
class Result:
    """Lightweight container for system outputs.

    Attributes:
        portfolio: The portfolio constructed by a system/experiment.
        mu: Optional per-asset expected-returns surface used by some systems.
    """

    portfolio: Portfolio
    mu: pl.DataFrame | None = None

    def create_reports(self, output_dir: Path) -> None:
        """Generate CSV exports and interactive HTML plots for this result.

        Args:
            output_dir: Destination directory where two subfolders will be created:
                - data/: CSV exports of prices, profit, returns, positions, and signal (if mu present).
                - plots/: Plotly HTML reports (snapshot, lead/lag IR, lagged performance,
                  smoothed holdings performance).
        """
        data = output_dir / "data"
        plots = output_dir / "plots"

        data.mkdir(parents=True, exist_ok=True)
        plots.mkdir(parents=True, exist_ok=True)

        self.portfolio.prices.write_csv(file=data / "prices.csv")
        self.portfolio.profit.write_csv(file=data / "profit.csv")
        self.portfolio.returns.write_csv(file=data / "returns.csv")
        self.portfolio.tilt_timing_decomp.write_csv(file=data / "tilt_timing_decomp.csv")

        if self.mu is not None:
            self.mu.write_csv(file=data / "signal.csv")

        self.portfolio.cashposition.write_csv(file=data / "position.csv")

        fig = self.portfolio.plots.snapshot()
        fig.write_html(file=plots / "snapshot.html", auto_open=False, include_plotlyjs="cdn")
        fig = self.portfolio.plots.lead_lag_ir_plot()
        fig.write_html(file=plots / "lag_ir.html", auto_open=False, include_plotlyjs="cdn")
        fig = self.portfolio.plots.lagged_performance_plot()
        fig.write_html(file=plots / "lagged_perf.html", auto_open=False, include_plotlyjs="cdn")
        fig = self.portfolio.plots.smoothed_holdings_performance_plot()
        fig.write_html(file=plots / "smooth_perf.html", auto_open=False, include_plotlyjs="cdn")

create_reports(output_dir)

Generate CSV exports and interactive HTML plots for this result.

Parameters:

Name Type Description Default
output_dir Path

Destination directory where two subfolders will be created:

  • data/: CSV exports of prices, profit, returns, the tilt/timing decomposition, positions, and signal (if mu present).
  • plots/: Plotly HTML reports (snapshot, lead/lag IR, lagged performance, smoothed holdings performance).

required
Source code in src/jquantstats/result.py
def create_reports(self, output_dir: Path) -> None:
    """Generate CSV exports and interactive HTML plots for this result.

    Args:
        output_dir: Destination directory where two subfolders will be created:
            - data/: CSV exports of prices, profit, returns, positions, and signal (if mu present).
            - plots/: Plotly HTML reports (snapshot, lead/lag IR, lagged performance,
              smoothed holdings performance).
    """
    data = output_dir / "data"
    plots = output_dir / "plots"

    data.mkdir(parents=True, exist_ok=True)
    plots.mkdir(parents=True, exist_ok=True)

    self.portfolio.prices.write_csv(file=data / "prices.csv")
    self.portfolio.profit.write_csv(file=data / "profit.csv")
    self.portfolio.returns.write_csv(file=data / "returns.csv")
    self.portfolio.tilt_timing_decomp.write_csv(file=data / "tilt_timing_decomp.csv")

    if self.mu is not None:
        self.mu.write_csv(file=data / "signal.csv")

    self.portfolio.cashposition.write_csv(file=data / "position.csv")

    fig = self.portfolio.plots.snapshot()
    fig.write_html(file=plots / "snapshot.html", auto_open=False, include_plotlyjs="cdn")
    fig = self.portfolio.plots.lead_lag_ir_plot()
    fig.write_html(file=plots / "lag_ir.html", auto_open=False, include_plotlyjs="cdn")
    fig = self.portfolio.plots.lagged_performance_plot()
    fig.write_html(file=plots / "lagged_perf.html", auto_open=False, include_plotlyjs="cdn")
    fig = self.portfolio.plots.smoothed_holdings_performance_plot()
    fig.write_html(file=plots / "smooth_perf.html", auto_open=False, include_plotlyjs="cdn")

interpolate(df)

Forward-fill numeric columns only between first and last non-null values.

For each numeric column, forward-fill is applied strictly within the span bounded by its first and last non-null samples. Values outside this span are left as-is (including leading/trailing nulls). Non-numeric columns are returned unchanged.

Parameters:

Name Type Description Default
df DataFrame

Input frame possibly containing nulls.

required

Returns:

Type Description
DataFrame

pl.DataFrame: Frame where numeric columns have been interior-forward-filled; schema and dtypes of the original columns are preserved.

Examples:

import polars as pl
from jquantstats import interpolate

df = pl.DataFrame({"a": [None, 1.0, None, 3.0, None], "b": ["x", "y", "z", "w", "v"]})
result = interpolate(df)
# a: [None, 1.0, 1.0, 3.0, None]  (leading/trailing nulls untouched)
# b: ["x", "y", "z", "w", "v"]    (non-numeric unchanged)
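
The interior-only rule can also be seen in isolation on a single list. The following is a plain-Python sketch of the semantics described above, not the library's Polars implementation; `interior_forward_fill` is a hypothetical name used only for illustration.

```python
from typing import Optional


def interior_forward_fill(values: list[Optional[float]]) -> list[Optional[float]]:
    """Forward-fill nulls only between the first and last non-null entries."""
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        # All-null input: nothing bounds a span, so return a copy unchanged.
        return list(values)
    first, last = valid[0], valid[-1]
    out = list(values)
    # Only positions strictly inside (first, last) can be interior nulls.
    for i in range(first + 1, last):
        if out[i] is None:
            out[i] = out[i - 1]
    return out


print(interior_forward_fill([None, 1.0, None, None, 3.0, None]))
# [None, 1.0, 1.0, 1.0, 3.0, None]
```

An unbounded forward fill would also turn the trailing null into 3.0. Stopping at the last non-null sample avoids extending a series past the point where its data ends.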
Source code in src/jquantstats/data.py
def interpolate(df: pl.DataFrame) -> pl.DataFrame:
    """Forward-fill numeric columns only between first and last non-null values.

    For each numeric column, forward-fill is applied strictly within the span
    bounded by its first and last non-null samples. Values outside this span
    are left as-is (including leading/trailing nulls). Non-numeric columns are
    returned unchanged.

    Args:
        df: Input frame possibly containing nulls.

    Returns:
        pl.DataFrame: Frame where numeric columns have been interior-forward-
        filled; schema and dtypes of the original columns are preserved.

    Examples:
        ```python
        import polars as pl
        from jquantstats import interpolate

        df = pl.DataFrame({"a": [None, 1.0, None, 3.0, None], "b": ["x", "y", "z", "w", "v"]})
        result = interpolate(df)
        # a: [None, 1.0, 1.0, 3.0, None]  (leading/trailing nulls untouched)
        # b: ["x", "y", "z", "w", "v"]    (non-numeric unchanged)
        ```

    """
    # Choose a temp column name guaranteed not to collide with any user column.
    tmp_col = "__row_idx__"
    while tmp_col in df.columns:
        tmp_col = f"_{tmp_col}_"

    out = []

    for col in df.columns:
        s = df[col]
        if s.dtype.is_numeric():
            non_null_mask = s.is_not_null()
            if non_null_mask.any():
                _fwd = non_null_mask.arg_max()
                _rev = non_null_mask.reverse().arg_max()
                if _fwd is None or _rev is None:  # pragma: no cover
                    out.append(pl.col(col))
                    continue
                first_valid_idx = _fwd
                last_valid_idx = len(s) - 1 - _rev
            else:
                out.append(pl.col(col))
                continue

            mask = (pl.col(tmp_col) >= pl.lit(first_valid_idx)) & (pl.col(tmp_col) <= pl.lit(last_valid_idx))
            filled_col = pl.when(mask).then(pl.col(col).fill_null(strategy="forward")).otherwise(pl.col(col)).alias(col)
            out.append(filled_col)
        else:
            out.append(pl.col(col))

    return df.with_columns(pl.int_range(0, df.height).alias(tmp_col)).select(out)