API Reference¶

This page documents the public API surface of jquantstats. All stable exports are importable directly from the top-level package:

from jquantstats import Portfolio, Data, CostModel

See API Stability for the versioning and deprecation policy.

`jquantstats` ¶

jQuantStats: Portfolio analytics for quants.

Two entry points¶

Entry point 1 — prices + positions (recommended for active portfolios):

Use Portfolio when you have price series and position sizes. Portfolio compiles the NAV curve from raw inputs and exposes the full analytics suite via .stats, .plots, and .report.

from jquantstats import Portfolio
import polars as pl

pf = Portfolio.from_cash_position(
    prices=prices_df,
    cash_position=positions_df,
    aum=1_000_000,
)
pf.stats.sharpe()
pf.plots.snapshot()

Entry point 2 — returns series (for arbitrary return streams):

Use Data when you already have a returns series (e.g. downloaded from a data vendor) and want benchmark comparison or factor analytics.

from jquantstats import Data
import polars as pl

data = Data.from_returns(returns=returns_df, benchmark=bench_df)
data.stats.sharpe()
data.plots.snapshot(title="Performance")

The two APIs are layered: portfolio.data returns a Data object so you can always drop into the returns-series API from a Portfolio.

For more information, see the jQuantStats documentation: https://jebel-quant.github.io/jquantstats/book

`CostModel` `dataclass` ¶

Unified representation of a portfolio transaction-cost model.

Eliminates the implicit "pick one" contract between the two independent cost parameters (cost_per_unit and cost_bps) on Portfolio. A CostModel instance encapsulates one model at a time and can be passed to any Portfolio factory method instead of specifying the raw float parameters.

Attributes:

Name	Type	Description
`cost_per_unit`	`float`	One-way cost per unit of position change (Model A). Defaults to 0.0.
`cost_bps`	`float`	One-way cost in basis points of AUM turnover (Model B). Defaults to 0.0.

Raises:

Type	Description
`ValueError`	If `cost_per_unit` or `cost_bps` is negative, or if both are non-zero (which would silently double-count costs).

Examples:

>>> CostModel.per_unit(0.01)
CostModel(cost_per_unit=0.01, cost_bps=0.0)
>>> CostModel.turnover_bps(5.0)
CostModel(cost_per_unit=0.0, cost_bps=5.0)
>>> CostModel.zero()
CostModel(cost_per_unit=0.0, cost_bps=0.0)
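
The validation rule in the Raises table can be illustrated with a stand-in dataclass. This is a minimal sketch: `CostModelSketch` and `try_build` are hypothetical names, and the class copies the checks from `__post_init__` in the source listing so that the example runs without jquantstats installed. It is not the library class itself.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class CostModelSketch:
    """Hypothetical stand-in that mirrors the checks in CostModel.__post_init__."""

    cost_per_unit: float = 0.0
    cost_bps: float = 0.0

    def __post_init__(self) -> None:
        # Each parameter on its own must be non-negative.
        if self.cost_per_unit < 0:
            raise ValueError(f"cost_per_unit must be non-negative, got {self.cost_per_unit}")
        if self.cost_bps < 0:
            raise ValueError(f"cost_bps must be non-negative, got {self.cost_bps}")
        # Setting both at once would charge costs twice, so it is rejected.
        if self.cost_per_unit > 0 and self.cost_bps > 0:
            raise ValueError("Only one cost model may be active at a time")


def try_build(**kwargs: float) -> str:
    """Return 'ok' if construction succeeds, otherwise the error message."""
    try:
        CostModelSketch(**kwargs)
    except ValueError as exc:
        return str(exc)
    return "ok"


print(try_build(cost_per_unit=0.01))                # valid: Model A only
print(try_build(cost_bps=5.0))                      # valid: Model B only
print(try_build(cost_per_unit=0.01, cost_bps=5.0))  # rejected: both active
print(try_build(cost_bps=-1.0))                     # rejected: negative value
```

Because the check uses strict `> 0` comparisons, a model with both parameters at zero is valid, which is what `CostModel.zero()` relies on.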

Source code in src/jquantstats/_cost_model.py

@dataclasses.dataclass(frozen=True)
class CostModel:
    """Unified representation of a portfolio transaction-cost model.

    Eliminates the implicit "pick one" contract between the two independent
    cost parameters (``cost_per_unit`` and ``cost_bps``) on
    `Portfolio`.  A ``CostModel``
    instance encapsulates one model at a time and can be passed to any
    Portfolio factory method instead of specifying the raw float parameters.

    Attributes:
        cost_per_unit: One-way cost per unit of position change (Model A).
            Defaults to 0.0.
        cost_bps: One-way cost in basis points of AUM turnover (Model B).
            Defaults to 0.0.

    Raises:
        ValueError: If ``cost_per_unit`` or ``cost_bps`` is negative, or if
            both are non-zero (which would silently double-count costs).

    Examples:
        >>> CostModel.per_unit(0.01)
        CostModel(cost_per_unit=0.01, cost_bps=0.0)
        >>> CostModel.turnover_bps(5.0)
        CostModel(cost_per_unit=0.0, cost_bps=5.0)
        >>> CostModel.zero()
        CostModel(cost_per_unit=0.0, cost_bps=0.0)
    """

    cost_per_unit: float = 0.0
    cost_bps: float = 0.0

    def __post_init__(self) -> None:
        if self.cost_per_unit < 0:
            raise ValueError(f"cost_per_unit must be non-negative, got {self.cost_per_unit}")  # noqa: TRY003
        if self.cost_bps < 0:
            raise ValueError(f"cost_bps must be non-negative, got {self.cost_bps}")  # noqa: TRY003
        if self.cost_per_unit > 0 and self.cost_bps > 0:
            raise ValueError(  # noqa: TRY003
                "Only one cost model may be active at a time: "
                f"got cost_per_unit={self.cost_per_unit} and cost_bps={self.cost_bps}. "
                "Use CostModel.per_unit() or CostModel.turnover_bps() to make intent explicit."
            )

    # ── Named constructors ────────────────────────────────────────────────────

    @classmethod
    def per_unit(cls, cost: float) -> CostModel:
        """Create a Model A (position-delta) cost model.

        Args:
            cost: One-way cost per unit of position change.  Must be
                non-negative.

        Returns:
            A `CostModel` with ``cost_per_unit=cost`` and
            ``cost_bps=0.0``.

        Examples:
            >>> CostModel.per_unit(0.01)
            CostModel(cost_per_unit=0.01, cost_bps=0.0)
        """
        return cls(cost_per_unit=cost, cost_bps=0.0)

    @classmethod
    def turnover_bps(cls, bps: float) -> CostModel:
        """Create a Model B (turnover-bps) cost model.

        Args:
            bps: One-way cost in basis points of AUM turnover.  Must be
                non-negative.

        Returns:
            A `CostModel` with ``cost_per_unit=0.0`` and
            ``cost_bps=bps``.

        Examples:
            >>> CostModel.turnover_bps(5.0)
            CostModel(cost_per_unit=0.0, cost_bps=5.0)
        """
        return cls(cost_per_unit=0.0, cost_bps=bps)

    @classmethod
    def zero(cls) -> CostModel:
        """Create a zero-cost model (no transaction costs).

        Returns:
            A `CostModel` with both parameters set to 0.0.

        Examples:
            >>> CostModel.zero()
            CostModel(cost_per_unit=0.0, cost_bps=0.0)
        """
        return cls(cost_per_unit=0.0, cost_bps=0.0)

`per_unit(cost)` `classmethod` ¶

Create a Model A (position-delta) cost model.

Parameters:

Name	Type	Description	Default
`cost`	`float`	One-way cost per unit of position change. Must be non-negative.	required

Returns:

Type	Description
`CostModel`	A `CostModel` with `cost_per_unit=cost` and `cost_bps=0.0`.

Examples:

>>> CostModel.per_unit(0.01)
CostModel(cost_per_unit=0.01, cost_bps=0.0)
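
As a rough illustration of what "cost per unit of position change" means, the sketch below charges a fixed amount for every unit bought or sold. The formula `cost * abs(delta)`, the helper name `per_unit_cost`, and the assumption that the book starts flat are all illustrative; the exact calculation inside Portfolio is not shown in this section.

```python
def per_unit_cost(positions: list[float], cost: float) -> list[float]:
    """Hypothetical helper: charge `cost` for each unit of position change."""
    charges = []
    previous = 0.0  # assumption: the book starts with no position
    for current in positions:
        charges.append(cost * abs(current - previous))
        previous = current
    return charges


# Units held at the end of each period (made-up numbers).
positions = [100.0, 150.0, 150.0, 50.0]
charges = per_unit_cost(positions, cost=0.01)
print(charges)
```

Under these assumptions a period with no change in position incurs no cost, and buying and selling the same number of units cost the same amount.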

Source code in src/jquantstats/_cost_model.py

@classmethod
def per_unit(cls, cost: float) -> CostModel:
    """Create a Model A (position-delta) cost model.

    Args:
        cost: One-way cost per unit of position change.  Must be
            non-negative.

    Returns:
        A `CostModel` with ``cost_per_unit=cost`` and
        ``cost_bps=0.0``.

    Examples:
        >>> CostModel.per_unit(0.01)
        CostModel(cost_per_unit=0.01, cost_bps=0.0)
    """
    return cls(cost_per_unit=cost, cost_bps=0.0)

`turnover_bps(bps)` `classmethod` ¶

Create a Model B (turnover-bps) cost model.

Parameters:

Name	Type	Description	Default
`bps`	`float`	One-way cost in basis points of AUM turnover. Must be non-negative.	required

Returns:

Type	Description
`CostModel`	A `CostModel` with `cost_per_unit=0.0` and `cost_bps=bps`.

Examples:

>>> CostModel.turnover_bps(5.0)
CostModel(cost_per_unit=0.0, cost_bps=5.0)
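
One basis point is one hundredth of a percent, so a rate of 5 bps corresponds to a fraction of 0.0005 of the traded notional. The sketch below works through that arithmetic. The helper name `turnover_cost` and the assumption that the cost equals the bps rate times the traded notional are illustrative; the exact calculation inside Portfolio is not shown in this section.

```python
BPS = 1e-4  # one basis point expressed as a fraction


def turnover_cost(traded_notional: float, bps: float) -> float:
    """Hypothetical helper: one-way cost for a given traded notional."""
    return traded_notional * bps * BPS


# Trading 200,000 of notional at 5 bps one-way costs about 100.
cost = turnover_cost(200_000.0, bps=5.0)
print(round(cost, 2))
```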

Source code in src/jquantstats/_cost_model.py

@classmethod
def turnover_bps(cls, bps: float) -> CostModel:
    """Create a Model B (turnover-bps) cost model.

    Args:
        bps: One-way cost in basis points of AUM turnover.  Must be
            non-negative.

    Returns:
        A `CostModel` with ``cost_per_unit=0.0`` and
        ``cost_bps=bps``.

    Examples:
        >>> CostModel.turnover_bps(5.0)
        CostModel(cost_per_unit=0.0, cost_bps=5.0)
    """
    return cls(cost_per_unit=0.0, cost_bps=bps)

`zero()` `classmethod` ¶

Create a zero-cost model (no transaction costs).

Returns:

Type	Description
`CostModel`	A `CostModel` with both parameters set to 0.0.

Examples:

>>> CostModel.zero()
CostModel(cost_per_unit=0.0, cost_bps=0.0)

Source code in src/jquantstats/_cost_model.py

@classmethod
def zero(cls) -> CostModel:
    """Create a zero-cost model (no transaction costs).

    Returns:
        A `CostModel` with both parameters set to 0.0.

    Examples:
        >>> CostModel.zero()
        CostModel(cost_per_unit=0.0, cost_bps=0.0)
    """
    return cls(cost_per_unit=0.0, cost_bps=0.0)

`Data` `dataclass` ¶

A container for financial returns data and an optional benchmark.

Provides methods for analyzing and manipulating financial returns data, including resampling, truncation, and access to statistical metrics and visualizations via the stats and plots properties.

Attributes:

Name	Type	Description
`returns`	`DataFrame`	DataFrame containing returns data with assets as columns.
`benchmark`	`DataFrame \| None`	Optional benchmark returns DataFrame. Defaults to None.
`index`	`DataFrame`	DataFrame containing the date index for the returns data.
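
Resampling aggregates by compounding rather than summing: the `resample` method in the source listing multiplies `1 + r` across each bucket and subtracts 1. A minimal pure-Python sketch of that aggregation follows, with `compound` as a hypothetical helper name and made-up daily returns.

```python
import math


def compound(returns: list[float]) -> float:
    """Compound simple returns: prod(1 + r) - 1."""
    return math.prod(1.0 + r for r in returns) - 1.0


daily = [0.01, -0.02, 0.03]  # made-up daily returns within one bucket
period_return = compound(daily)
naive_sum = sum(daily)

print(f"compounded: {period_return:.6f}")
print(f"summed:     {naive_sum:.6f}")
```

The two figures differ slightly because compounding accounts for each return being earned on the balance left by the previous one.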

Source code in src/jquantstats/data.py

@dataclasses.dataclass(frozen=True, slots=True)
class Data:
    """A container for financial returns data and an optional benchmark.

    Provides methods for analyzing and manipulating financial returns data,
    including resampling, truncation, and access to statistical metrics and
    visualizations via the ``stats`` and ``plots`` properties.

    Attributes:
        returns (pl.DataFrame): DataFrame containing returns data with assets
            as columns.
        benchmark (pl.DataFrame | None): Optional benchmark returns DataFrame.
            Defaults to None.
        index (pl.DataFrame): DataFrame containing the date index for the
            returns data.

    """

    returns: pl.DataFrame
    index: pl.DataFrame
    benchmark: pl.DataFrame | None = None

    def __post_init__(self) -> None:
        """Validate the Data object after initialization."""
        # You need at least two points
        if self.index.shape[0] < 2:
            raise ValueError("Index must contain at least two timestamps.")  # noqa: TRY003

        # Check index is monotonically increasing
        datetime_col = self.index[self.index.columns[0]]
        if not datetime_col.is_sorted():
            raise ValueError("Index must be monotonically increasing.")  # noqa: TRY003

        # Check row count matches returns
        if self.returns.shape[0] != self.index.shape[0]:
            raise ValueError("Returns and index must have the same number of rows.")  # noqa: TRY003

        # Check row count matches benchmark (if provided)
        if self.benchmark is not None and self.benchmark.shape[0] != self.index.shape[0]:
            raise ValueError("Benchmark and index must have the same number of rows.")  # noqa: TRY003

    @classmethod
    def from_returns(
        cls,
        returns: NativeFrame,
        rf: NativeFrameOrScalar = 0.0,
        benchmark: NativeFrame | None = None,
        date_col: str = "Date",
        null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
    ) -> Data:
        """Create a Data object from returns and optional benchmark.

        Args:
            returns (NativeFrame): Financial returns data. First column should
                be the date column, remaining columns are asset returns.
            rf (float | NativeFrame): Risk-free rate. Defaults to 0.0 (no
                risk-free rate adjustment).

                - If float: Constant risk-free rate applied to all dates.
                - If NativeFrame: Time-varying risk-free rate with dates
                  matching returns.

            benchmark (NativeFrame | None): Benchmark returns. Defaults to
                None (no benchmark). First column should be the date column,
                remaining columns are benchmark returns.
            date_col (str): Name of the date column in the DataFrames.
                Defaults to ``"Date"``.
            null_strategy ({"raise", "drop", "forward_fill"} | None): How to
                handle ``null`` (missing) values in *returns* and *benchmark*.
                Defaults to ``None`` (nulls propagate through calculations).

                - ``None`` — no null checking; nulls propagate through all
                  downstream calculations.
                - ``"raise"`` — raise `NullsInReturnsError` if any null is
                  found.
                - ``"drop"`` — silently drop every row that contains at least
                  one null.
                - ``"forward_fill"`` — fill each null with the most recent
                  non-null value in the same column.

                Note: Affects only Polars ``null`` values (i.e. ``None`` /
                missing entries). IEEE-754 ``NaN`` values are **not** affected
                and continue to propagate as per IEEE-754 semantics.

        Returns:
            Data: Object containing excess returns and benchmark (if any),
            with methods for analysis and visualization through the ``stats``
            and ``plots`` properties.

        Raises:
            NullsInReturnsError: If *null_strategy* is ``"raise"`` and the
                data contains null values.
            ValueError: If there are no overlapping dates between returns and
                benchmark.

        Examples:
            Basic usage:

            ```python
            from jquantstats import Data
            import polars as pl

            returns = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Asset1": [0.01, -0.02, 0.03]
            }).with_columns(pl.col("Date").str.to_date())

            data = Data.from_returns(returns=returns)
            ```

            With benchmark and risk-free rate:

            ```python
            benchmark = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Market": [0.005, -0.01, 0.02]
            }).with_columns(pl.col("Date").str.to_date())

            data = Data.from_returns(returns=returns, benchmark=benchmark, rf=0.0002)
            ```

            Handling nulls automatically:

            ```python
            returns_with_nulls = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Asset1": [0.01, None, 0.03]
            }).with_columns(pl.col("Date").str.to_date())

            # Drop rows with nulls (mirrors pandas/QuantStats behaviour)
            data = Data.from_returns(returns=returns_with_nulls, null_strategy="drop")

            # Or forward-fill nulls
            data = Data.from_returns(returns=returns_with_nulls, null_strategy="forward_fill")
            ```

        """
        returns_pl = _to_polars(returns)
        benchmark_pl = _to_polars(benchmark) if benchmark is not None else None
        rf_converted: float | pl.DataFrame
        if isinstance(rf, pl.DataFrame) or (not isinstance(rf, float) and not isinstance(rf, int)):
            rf_converted = _to_polars(rf)
        else:
            rf_converted = rf  # int is not float/DataFrame: _subtract_risk_free raises TypeError

        returns_pl = _apply_null_strategy(returns_pl, date_col, "returns", null_strategy)
        if benchmark_pl is not None:
            benchmark_pl = _apply_null_strategy(benchmark_pl, date_col, "benchmark", null_strategy)

        if benchmark_pl is not None:
            joined_dates = returns_pl.join(benchmark_pl, on=date_col, how="inner").select(date_col)
            if joined_dates.is_empty():
                raise ValueError("No overlapping dates between returns and benchmark.")  # noqa: TRY003
            returns_pl = returns_pl.join(joined_dates, on=date_col, how="inner")
            benchmark_pl = benchmark_pl.join(joined_dates, on=date_col, how="inner")

        index = returns_pl.select(date_col)
        excess_returns = _subtract_risk_free(returns_pl, rf_converted, date_col).drop(date_col)
        excess_benchmark = (
            _subtract_risk_free(benchmark_pl, rf_converted, date_col).drop(date_col)
            if benchmark_pl is not None
            else None
        )

        return cls(returns=excess_returns, benchmark=excess_benchmark, index=index)

    @classmethod
    def from_prices(
        cls,
        prices: NativeFrame,
        rf: NativeFrameOrScalar = 0.0,
        benchmark: NativeFrame | None = None,
        date_col: str = "Date",
        null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
    ) -> Data:
        """Create a Data object from prices and optional benchmark.

        Converts price levels to returns via percentage change and delegates
        to `from_returns`. The first row of each asset is dropped because no
        prior price is available to compute a return.

        Args:
            prices (NativeFrame): Price-level data. First column should be
                the date column; remaining columns are asset prices.
            rf (float | NativeFrame): Risk-free rate. Forwarded unchanged to
                `from_returns`. Defaults to 0.0 (no risk-free rate
                adjustment).
            benchmark (NativeFrame | None): Benchmark prices. Converted to
                returns in the same way as ``prices`` before being forwarded
                to `from_returns`. Defaults to None (no benchmark).
            date_col (str): Name of the date column in the DataFrames.
                Defaults to ``"Date"``.
            null_strategy ({"raise", "drop", "forward_fill"} | None): How to
                handle ``null`` (missing) values after converting prices to
                returns. Forwarded unchanged to `from_returns`. Defaults to
                ``None`` (nulls propagate through calculations).

                - ``None`` — no null checking; nulls propagate.
                - ``"raise"`` — raise `NullsInReturnsError` if any null is
                  found in the derived returns.
                - ``"drop"`` — silently drop every row that contains at least
                  one null.
                - ``"forward_fill"`` — fill each null with the most recent
                  non-null value.

                Note: Prices that contain nulls will produce null returns via
                ``pct_change()``. If you expect missing price entries, pass
                ``null_strategy="drop"`` or ``null_strategy="forward_fill"``.

        Returns:
            Data: Object containing excess returns derived from the supplied
            prices, with methods for analysis and visualization through the
            ``stats`` and ``plots`` properties.

        Examples:
            ```python
            from jquantstats import Data
            import polars as pl

            prices = pl.DataFrame({
                "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
                "Asset1": [100.0, 101.0, 99.0]
            }).with_columns(pl.col("Date").str.to_date())

            data = Data.from_prices(prices=prices)
            ```

        """
        prices_pl = _to_polars(prices)
        asset_cols = [c for c in prices_pl.columns if c != date_col]
        returns_pl = prices_pl.with_columns([pl.col(c).pct_change().alias(c) for c in asset_cols]).slice(1)

        benchmark_returns: NativeFrame | None = None
        if benchmark is not None:
            benchmark_pl = _to_polars(benchmark)
            bench_cols = [c for c in benchmark_pl.columns if c != date_col]
            benchmark_returns = benchmark_pl.with_columns([pl.col(c).pct_change().alias(c) for c in bench_cols]).slice(
                1
            )

        return cls.from_returns(
            returns=returns_pl,
            rf=rf,
            benchmark=benchmark_returns,
            date_col=date_col,
            null_strategy=null_strategy,
        )

    def __repr__(self) -> str:
        """Return a string representation of the Data object."""
        rows = len(self.index)
        date_cols = self.date_col
        if date_cols:
            date_column = date_cols[0]
            start = self.index[date_column].min()
            end = self.index[date_column].max()
            return f"Data(assets={self.assets}, rows={rows}, start={start}, end={end})"
        return f"Data(assets={self.assets}, rows={rows})"  # pragma: no cover  # __post_init__ requires ≥1 index column

    @property
    def plots(self) -> DataPlots:
        """Provides access to visualization methods for the financial data.

        Returns:
            DataPlots: An instance of the DataPlots class initialized with this data.

        """
        from ._plots import DataPlots

        return DataPlots(self)

    @property
    def stats(self) -> Stats:
        """Provides access to statistical analysis methods for the financial data.

        Returns:
            Stats: An instance of the Stats class initialized with this data.

        """
        from ._stats import Stats

        return Stats(self)

    @property
    def reports(self) -> Reports:
        """Provides access to reporting methods for the financial data.

        Returns:
            Reports: An instance of the Reports class initialized with this data.

        """
        from ._reports import Reports

        return Reports(self)

    @property
    def utils(self) -> DataUtils:
        """Provides access to utility transforms and conversions for the financial data.

        Returns:
            DataUtils: An instance of the DataUtils class initialized with this data.

        """
        from ._utils import DataUtils

        return DataUtils(self)

    @property
    def date_col(self) -> list[str]:
        """Return the column names of the index DataFrame.

        Returns:
            list[str]: List of column names in the index DataFrame, typically containing
                      the date column name.

        """
        return list(self.index.columns)

    @property
    def assets(self) -> list[str]:
        """Return the combined list of asset column names from returns and benchmark.

        Returns:
            list[str]: List of all asset column names from both returns and benchmark
                      (if available).

        """
        if self.benchmark is not None:
            return list(self.returns.columns) + list(self.benchmark.columns)
        return list(self.returns.columns)

    @property
    def all(self) -> pl.DataFrame:
        """Combine index, returns, and benchmark data into a single DataFrame.

        This property provides a convenient way to access all data in a single DataFrame,
        which is useful for analysis and visualization.

        Returns:
            pl.DataFrame: A DataFrame containing the index, all returns data, and benchmark data
                         (if available) combined horizontally.

        """
        if self.benchmark is None:
            return pl.concat([self.index, self.returns], how="horizontal")
        else:
            return pl.concat([self.index, self.returns, self.benchmark], how="horizontal")

    def resample(self, every: str = "1mo") -> Data:
        """Resample returns and benchmark to a different frequency.

        Args:
            every (str): Resampling frequency (e.g., ``'1mo'``, ``'1y'``).
                Defaults to ``'1mo'``.

        Returns:
            Data: Resampled data at the requested frequency.

        """

        def resample_frame(dframe: pl.DataFrame) -> pl.DataFrame:
            """Resample a single DataFrame to the target frequency using compound returns."""
            dframe = self.index.hstack(dframe)  # Add the date column for resampling

            return dframe.group_by_dynamic(
                index_column=self.index.columns[0], every=every, period=every, closed="right", label="right"
            ).agg(
                [
                    ((pl.col(col) + 1.0).product() - 1.0).alias(col)
                    for col in dframe.columns
                    if col != self.index.columns[0]
                ]
            )

        resampled_returns = resample_frame(self.returns)
        resampled_benchmark = resample_frame(self.benchmark) if self.benchmark is not None else None
        resampled_index = resampled_returns.select(self.index.columns[0])

        return Data(
            returns=resampled_returns.drop(self.index.columns[0]),
            benchmark=resampled_benchmark.drop(self.index.columns[0]) if resampled_benchmark is not None else None,
            index=resampled_index,
        )

    def describe(self) -> pl.DataFrame:
        """Return a tidy summary of shape, date range and asset names.

        Returns:
            pl.DataFrame: One row per asset with columns: asset, start, end,
            rows, has_benchmark.

        """
        date_column = self.date_col[0]
        start = self.index[date_column].min()
        end = self.index[date_column].max()
        rows = len(self.index)
        return pl.DataFrame(
            {
                "asset": self.returns.columns,
                "start": [start] * len(self.returns.columns),
                "end": [end] * len(self.returns.columns),
                "rows": [rows] * len(self.returns.columns),
                "has_benchmark": [self.benchmark is not None] * len(self.returns.columns),
            }
        )

    def copy(self) -> Data:
        """Create a deep copy of the Data object.

        Returns:
            Data: A new Data object with copies of the returns and benchmark.

        """
        if self.benchmark is not None:
            return Data(returns=self.returns.clone(), benchmark=self.benchmark.clone(), index=self.index.clone())
        return Data(returns=self.returns.clone(), index=self.index.clone())

    def head(self, n: int = 5) -> Data:
        """Return the first n rows of the combined returns and benchmark data.

        Args:
            n (int, optional): Number of rows to return. Defaults to 5.

        Returns:
            Data: A new Data object containing the first n rows of the combined data.

        """
        benchmark_head = self.benchmark.head(n) if self.benchmark is not None else None
        return Data(returns=self.returns.head(n), benchmark=benchmark_head, index=self.index.head(n))

    def tail(self, n: int = 5) -> Data:
        """Return the last n rows of the combined returns and benchmark data.

        Args:
            n (int, optional): Number of rows to return. Defaults to 5.

        Returns:
            Data: A new Data object containing the last n rows of the combined data.

        """
        benchmark_tail = self.benchmark.tail(n) if self.benchmark is not None else None
        return Data(returns=self.returns.tail(n), benchmark=benchmark_tail, index=self.index.tail(n))

    def truncate(
        self,
        start: date | datetime | str | int | None = None,
        end: date | datetime | str | int | None = None,
    ) -> Data:
        """Return a new Data object truncated to the inclusive [start, end] range.

        When the index is temporal (Date/Datetime), truncation is performed by
        comparing the date column against ``start`` and ``end`` values.

        When the index is integer-based, row slicing is used instead, and
        ``start`` and ``end`` must be non-negative integers.  Passing
        non-integer bounds to an integer-indexed Data raises `TypeError`.

        Args:
            start: Optional lower bound (inclusive).  A date/datetime value
                when the index is temporal; a non-negative `int` row
                index when the data has no temporal index.
            end: Optional upper bound (inclusive).  Same type rules as
                ``start``.

        Returns:
            Data: A new Data object filtered to the specified range.

        Raises:
            TypeError: When the index is not temporal and a non-integer bound
                is supplied.

        """
        date_column = self.index.columns[0]
        is_temporal = self.index[date_column].dtype.is_temporal()

        if is_temporal:
            cond = pl.lit(True)
            if start is not None:
                cond = cond & (pl.col(date_column) >= pl.lit(start))
            if end is not None:
                cond = cond & (pl.col(date_column) <= pl.lit(end))
            mask = self.index.select(cond.alias("mask"))["mask"]
            new_index = self.index.filter(mask)
            new_returns = self.returns.filter(mask)
            new_benchmark = self.benchmark.filter(mask) if self.benchmark is not None else None
        else:
            if start is not None and not isinstance(start, int):
                raise TypeError(f"start must be an integer, got {type(start).__name__}.")  # noqa: TRY003
            if end is not None and not isinstance(end, int):
                raise TypeError(f"end must be an integer, got {type(end).__name__}.")  # noqa: TRY003
            row_start = start if start is not None else 0
            row_end = end + 1 if end is not None else self.index.height
            length = max(0, row_end - row_start)
            new_index = self.index.slice(row_start, length)
            new_returns = self.returns.slice(row_start, length)
            new_benchmark = self.benchmark.slice(row_start, length) if self.benchmark is not None else None

        return Data(returns=new_returns, benchmark=new_benchmark, index=new_index)

    @property
    def _periods_per_year(self) -> float:
        """Estimate the number of periods per year based on average frequency in the index.

        For temporal (Date/Datetime) indices, computes the mean gap between observations
        and converts to an annualised period count (e.g. ~252 for daily, ~52 for weekly).

        For integer indices (date-free portfolios), falls back to 252 trading days per year
        because integer diffs have no time meaning.
        """
        datetime_col = self.index[self.index.columns[0]]

        if not datetime_col.dtype.is_temporal():
            return 252.0

        sorted_dt = datetime_col.sort()
        diffs = sorted_dt.diff().drop_nulls()
        mean_diff = diffs.mean()

        if isinstance(mean_diff, timedelta):
            seconds = mean_diff.total_seconds()
        else:  # pragma: no cover  # Polars always returns timedelta for temporal diff
            seconds = cast(float, mean_diff) if mean_diff is not None else 1.0

        return (365 * 24 * 60 * 60) / seconds

    def items(self) -> Iterator[tuple[str, pl.Series]]:
        """Iterate over all assets and their corresponding data series.

        This method provides a convenient way to iterate over all assets in the data,
        yielding each asset name and its corresponding data series.

        Yields:
            tuple[str, pl.Series]: A tuple containing the asset name and its data series.

        """
        matrix = self.all

        for col in self.assets:
            yield col, matrix.get_column(col)
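
The annualisation factor comes from `_periods_per_year` in the listing above: the number of seconds in a 365-day year divided by the mean gap between observations. The sketch below reproduces that arithmetic with the standard library only. `periods_per_year` is a hypothetical helper and the dates are made up.

```python
from datetime import date, timedelta


def periods_per_year(dates: list[date]) -> float:
    """Seconds in a 365-day year divided by the mean gap between observations."""
    ordered = sorted(dates)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps, timedelta()) / len(gaps)
    return (365 * 24 * 60 * 60) / mean_gap.total_seconds()


start = date(2023, 1, 2)  # a Monday
all_days = [start + timedelta(days=d) for d in range(364)]

# Monday-to-Friday observations with no holidays: weekends create 3-day gaps.
weekdays = [d for d in all_days if d.weekday() < 5]
# One observation per week.
weekly = [start + timedelta(weeks=w) for w in range(52)]

print(round(periods_per_year(weekdays), 1))
print(round(periods_per_year(weekly), 1))
```

Because the estimate is driven by calendar gaps, a weekday series without holidays lands a little above 252; real trading calendars, which skip holidays, have a larger mean gap and therefore a lower figure.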

`all` `property` ¶

Combine index, returns, and benchmark data into a single DataFrame.

This property provides a convenient way to access all data in a single DataFrame, which is useful for analysis and visualization.

Returns:

Type	Description
`DataFrame`	A DataFrame containing the index, all returns data, and benchmark data (if available) combined horizontally.

`assets` `property` ¶

Return the combined list of asset column names from returns and benchmark.

Returns:

Type	Description
`list[str]`	List of all asset column names from both returns and benchmark (if available).

`date_col` `property` ¶

Return the column names of the index DataFrame.

Returns:

Type	Description
`list[str]`	List of column names in the index DataFrame, typically containing the date column name.

`plots` `property` ¶

Provides access to visualization methods for the financial data.

Returns:

Name	Type	Description
`DataPlots`	`DataPlots`	An instance of the DataPlots class initialized with this data.

`reports` `property` ¶

Provides access to reporting methods for the financial data.

Returns:

Name	Type	Description
`Reports`	`Reports`	An instance of the Reports class initialized with this data.

`stats` `property` ¶

Provides access to statistical analysis methods for the financial data.

Returns:

Name	Type	Description
`Stats`	`Stats`	An instance of the Stats class initialized with this data.

`utils` `property` ¶

Provides access to utility transforms and conversions for the financial data.

Returns:

Name	Type	Description
`DataUtils`	`DataUtils`	An instance of the DataUtils class initialized with this data.

`__post_init__()` ¶

Validate the Data object after initialization.

Source code in src/jquantstats/data.py

def __post_init__(self) -> None:
    """Validate the Data object after initialization."""
    # You need at least two points
    if self.index.shape[0] < 2:
        raise ValueError("Index must contain at least two timestamps.")  # noqa: TRY003

    # Check index is monotonically increasing
    datetime_col = self.index[self.index.columns[0]]
    if not datetime_col.is_sorted():
        raise ValueError("Index must be monotonically increasing.")  # noqa: TRY003

    # Check row count matches returns
    if self.returns.shape[0] != self.index.shape[0]:
        raise ValueError("Returns and index must have the same number of rows.")  # noqa: TRY003

    # Check row count matches benchmark (if provided)
    if self.benchmark is not None and self.benchmark.shape[0] != self.index.shape[0]:
        raise ValueError("Benchmark and index must have the same number of rows.")  # noqa: TRY003

`__repr__()` ¶

Return a string representation of the Data object.

Source code in src/jquantstats/data.py

def __repr__(self) -> str:
    """Return a string representation of the Data object."""
    rows = len(self.index)
    date_cols = self.date_col
    if date_cols:
        date_column = date_cols[0]
        start = self.index[date_column].min()
        end = self.index[date_column].max()
        return f"Data(assets={self.assets}, rows={rows}, start={start}, end={end})"
    return f"Data(assets={self.assets}, rows={rows})"  # pragma: no cover  # __post_init__ requires ≥1 index column

`copy()` ¶

Create a deep copy of the Data object.

Returns:

Name	Type	Description
`Data`	`Data`	A new Data object with copies of the returns, the benchmark (if any), and the index.

Source code in src/jquantstats/data.py

def copy(self) -> Data:
    """Create a deep copy of the Data object.

    Returns:
        Data: A new Data object with copies of the returns, the benchmark (if any), and the index.

    """
    if self.benchmark is not None:
        return Data(returns=self.returns.clone(), benchmark=self.benchmark.clone(), index=self.index.clone())
    return Data(returns=self.returns.clone(), index=self.index.clone())

`describe()` ¶

Return a tidy summary of shape, date range and asset names.

Returns:

Type	Description
`DataFrame`	One row per asset with columns: asset, start, end, rows, has_benchmark.

Source code in src/jquantstats/data.py

def describe(self) -> pl.DataFrame:
    """Return a tidy summary of shape, date range and asset names.

    Returns:
        pl.DataFrame: One row per asset with columns: asset, start, end,
        rows, has_benchmark.

    """
    date_column = self.date_col[0]
    start = self.index[date_column].min()
    end = self.index[date_column].max()
    rows = len(self.index)
    return pl.DataFrame(
        {
            "asset": self.returns.columns,
            "start": [start] * len(self.returns.columns),
            "end": [end] * len(self.returns.columns),
            "rows": [rows] * len(self.returns.columns),
            "has_benchmark": [self.benchmark is not None] * len(self.returns.columns),
        }
    )

`from_prices(prices, rf=0.0, benchmark=None, date_col='Date', null_strategy=None)` `classmethod` ¶

Create a Data object from prices and optional benchmark.

Converts price levels to returns via percentage change and delegates to from_returns. The first row of each asset is dropped because no prior price is available to compute a return.

Parameters:

Name	Type	Description	Default
`prices`	`NativeFrame`	Price-level data. First column should be the date column; remaining columns are asset prices.	required
`rf`	`float \| NativeFrame`	Risk-free rate. Forwarded unchanged to `from_returns`. Defaults to 0.0 (no risk-free rate adjustment).	`0.0`
`benchmark`	`NativeFrame \| None`	Benchmark prices. Converted to returns in the same way as `prices` before being forwarded to `from_returns`. Defaults to None (no benchmark).	`None`
`date_col`	`str`	Name of the date column in the DataFrames. Defaults to `"Date"`.	`'Date'`
`null_strategy`	`{'raise', 'drop', 'forward_fill'} \| None`	How to handle `null` (missing) values after converting prices to returns. Forwarded unchanged to `from_returns`. Defaults to `None` (nulls propagate through calculations). `None` — no null checking; nulls propagate. `"raise"` — raise `NullsInReturnsError` if any null is found in the derived returns. `"drop"` — silently drop every row that contains at least one null. `"forward_fill"` — fill each null with the most recent non-null value. Note: Prices that contain nulls will produce null returns via `pct_change()`. If you expect missing price entries, pass `null_strategy="drop"` or `null_strategy="forward_fill"`.	`None`

Returns:

Name	Type	Description
`Data`	`Data`	Object containing excess returns derived from the supplied prices, with methods for analysis and visualization through the `stats` and `plots` properties.

Examples:

from jquantstats import Data
import polars as pl

prices = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Asset1": [100.0, 101.0, 99.0]
}).with_columns(pl.col("Date").str.to_date())

data = Data.from_prices(prices=prices)

Source code in src/jquantstats/data.py

@classmethod
def from_prices(
    cls,
    prices: NativeFrame,
    rf: NativeFrameOrScalar = 0.0,
    benchmark: NativeFrame | None = None,
    date_col: str = "Date",
    null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
) -> Data:
    """Create a Data object from prices and optional benchmark.

    Converts price levels to returns via percentage change and delegates
    to `from_returns`. The first row of each asset is dropped because no
    prior price is available to compute a return.

    Args:
        prices (NativeFrame): Price-level data. First column should be
            the date column; remaining columns are asset prices.
        rf (float | NativeFrame): Risk-free rate. Forwarded unchanged to
            `from_returns`. Defaults to 0.0 (no risk-free rate
            adjustment).
        benchmark (NativeFrame | None): Benchmark prices. Converted to
            returns in the same way as ``prices`` before being forwarded
            to `from_returns`. Defaults to None (no benchmark).
        date_col (str): Name of the date column in the DataFrames.
            Defaults to ``"Date"``.
        null_strategy ({"raise", "drop", "forward_fill"} | None): How to
            handle ``null`` (missing) values after converting prices to
            returns. Forwarded unchanged to `from_returns`. Defaults to
            ``None`` (nulls propagate through calculations).

            - ``None`` — no null checking; nulls propagate.
            - ``"raise"`` — raise `NullsInReturnsError` if any null is
              found in the derived returns.
            - ``"drop"`` — silently drop every row that contains at least
              one null.
            - ``"forward_fill"`` — fill each null with the most recent
              non-null value.

            Note: Prices that contain nulls will produce null returns via
            ``pct_change()``. If you expect missing price entries, pass
            ``null_strategy="drop"`` or ``null_strategy="forward_fill"``.

    Returns:
        Data: Object containing excess returns derived from the supplied
        prices, with methods for analysis and visualization through the
        ``stats`` and ``plots`` properties.

    Examples:
        ```python
        from jquantstats import Data
        import polars as pl

        prices = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Asset1": [100.0, 101.0, 99.0]
        }).with_columns(pl.col("Date").str.to_date())

        data = Data.from_prices(prices=prices)
        ```

    """
    prices_pl = _to_polars(prices)
    asset_cols = [c for c in prices_pl.columns if c != date_col]
    returns_pl = prices_pl.with_columns([pl.col(c).pct_change().alias(c) for c in asset_cols]).slice(1)

    benchmark_returns: NativeFrame | None = None
    if benchmark is not None:
        benchmark_pl = _to_polars(benchmark)
        bench_cols = [c for c in benchmark_pl.columns if c != date_col]
        benchmark_returns = benchmark_pl.with_columns([pl.col(c).pct_change().alias(c) for c in bench_cols]).slice(
            1
        )

    return cls.from_returns(
        returns=returns_pl,
        rf=rf,
        benchmark=benchmark_returns,
        date_col=date_col,
        null_strategy=null_strategy,
    )

`from_returns(returns, rf=0.0, benchmark=None, date_col='Date', null_strategy=None)` `classmethod` ¶

Create a Data object from returns and optional benchmark.

Parameters:

Name	Type	Description	Default
`returns`	`NativeFrame`	Financial returns data. First column should be the date column, remaining columns are asset returns.	required
`rf`	`float \| NativeFrame`	Risk-free rate. Defaults to 0.0 (no risk-free rate adjustment). If float: Constant risk-free rate applied to all dates. If NativeFrame: Time-varying risk-free rate with dates matching returns.	`0.0`
`benchmark`	`NativeFrame \| None`	Benchmark returns. Defaults to None (no benchmark). First column should be the date column, remaining columns are benchmark returns.	`None`
`date_col`	`str`	Name of the date column in the DataFrames. Defaults to `"Date"`.	`'Date'`
`null_strategy`	`{'raise', 'drop', 'forward_fill'} \| None`	How to handle `null` (missing) values in returns and benchmark. Defaults to `None` (nulls propagate through calculations). `None` — no null checking; nulls propagate through all downstream calculations. `"raise"` — raise `NullsInReturnsError` if any null is found. `"drop"` — silently drop every row that contains at least one null. `"forward_fill"` — fill each null with the most recent non-null value in the same column. Note: Affects only Polars `null` values (i.e. `None` / missing entries). IEEE-754 `NaN` values are not affected and continue to propagate as per IEEE-754 semantics.	`None`

Returns:

Name	Type	Description
`Data`	`Data`	Object containing excess returns and benchmark (if any), with methods for analysis and visualization through the `stats` and `plots` properties.

Raises:

Type	Description
`NullsInReturnsError`	If null_strategy is `"raise"` and the data contains null values.
`ValueError`	If there are no overlapping dates between returns and benchmark.

Examples:

Basic usage:

from jquantstats import Data
import polars as pl

returns = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Asset1": [0.01, -0.02, 0.03]
}).with_columns(pl.col("Date").str.to_date())

data = Data.from_returns(returns=returns)

With benchmark and risk-free rate:

benchmark = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Market": [0.005, -0.01, 0.02]
}).with_columns(pl.col("Date").str.to_date())

data = Data.from_returns(returns=returns, benchmark=benchmark, rf=0.0002)

Handling nulls automatically:

returns_with_nulls = pl.DataFrame({
    "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "Asset1": [0.01, None, 0.03]
}).with_columns(pl.col("Date").str.to_date())

# Drop rows with nulls (mirrors pandas/QuantStats behaviour)
data = Data.from_returns(returns=returns_with_nulls, null_strategy="drop")

# Or forward-fill nulls
data = Data.from_returns(returns=returns_with_nulls, null_strategy="forward_fill")

Source code in src/jquantstats/data.py

@classmethod
def from_returns(
    cls,
    returns: NativeFrame,
    rf: NativeFrameOrScalar = 0.0,
    benchmark: NativeFrame | None = None,
    date_col: str = "Date",
    null_strategy: Literal["raise", "drop", "forward_fill"] | None = None,
) -> Data:
    """Create a Data object from returns and optional benchmark.

    Args:
        returns (NativeFrame): Financial returns data. First column should
            be the date column, remaining columns are asset returns.
        rf (float | NativeFrame): Risk-free rate. Defaults to 0.0 (no
            risk-free rate adjustment).

            - If float: Constant risk-free rate applied to all dates.
            - If NativeFrame: Time-varying risk-free rate with dates
              matching returns.

        benchmark (NativeFrame | None): Benchmark returns. Defaults to
            None (no benchmark). First column should be the date column,
            remaining columns are benchmark returns.
        date_col (str): Name of the date column in the DataFrames.
            Defaults to ``"Date"``.
        null_strategy ({"raise", "drop", "forward_fill"} | None): How to
            handle ``null`` (missing) values in *returns* and *benchmark*.
            Defaults to ``None`` (nulls propagate through calculations).

            - ``None`` — no null checking; nulls propagate through all
              downstream calculations.
            - ``"raise"`` — raise `NullsInReturnsError` if any null is
              found.
            - ``"drop"`` — silently drop every row that contains at least
              one null.
            - ``"forward_fill"`` — fill each null with the most recent
              non-null value in the same column.

            Note: Affects only Polars ``null`` values (i.e. ``None`` /
            missing entries). IEEE-754 ``NaN`` values are **not** affected
            and continue to propagate as per IEEE-754 semantics.

    Returns:
        Data: Object containing excess returns and benchmark (if any),
        with methods for analysis and visualization through the ``stats``
        and ``plots`` properties.

    Raises:
        NullsInReturnsError: If *null_strategy* is ``"raise"`` and the
            data contains null values.
        ValueError: If there are no overlapping dates between returns and
            benchmark.

    Examples:
        Basic usage:

        ```python
        from jquantstats import Data
        import polars as pl

        returns = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Asset1": [0.01, -0.02, 0.03]
        }).with_columns(pl.col("Date").str.to_date())

        data = Data.from_returns(returns=returns)
        ```

        With benchmark and risk-free rate:

        ```python
        benchmark = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Market": [0.005, -0.01, 0.02]
        }).with_columns(pl.col("Date").str.to_date())

        data = Data.from_returns(returns=returns, benchmark=benchmark, rf=0.0002)
        ```

        Handling nulls automatically:

        ```python
        returns_with_nulls = pl.DataFrame({
            "Date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "Asset1": [0.01, None, 0.03]
        }).with_columns(pl.col("Date").str.to_date())

        # Drop rows with nulls (mirrors pandas/QuantStats behaviour)
        data = Data.from_returns(returns=returns_with_nulls, null_strategy="drop")

        # Or forward-fill nulls
        data = Data.from_returns(returns=returns_with_nulls, null_strategy="forward_fill")
        ```

    """
    returns_pl = _to_polars(returns)
    benchmark_pl = _to_polars(benchmark) if benchmark is not None else None
    rf_converted: float | pl.DataFrame
    if isinstance(rf, pl.DataFrame) or (not isinstance(rf, float) and not isinstance(rf, int)):
        rf_converted = _to_polars(rf)
    else:
        rf_converted = rf  # int is not float/DataFrame: _subtract_risk_free raises TypeError

    returns_pl = _apply_null_strategy(returns_pl, date_col, "returns", null_strategy)
    if benchmark_pl is not None:
        benchmark_pl = _apply_null_strategy(benchmark_pl, date_col, "benchmark", null_strategy)

    if benchmark_pl is not None:
        joined_dates = returns_pl.join(benchmark_pl, on=date_col, how="inner").select(date_col)
        if joined_dates.is_empty():
            raise ValueError("No overlapping dates between returns and benchmark.")  # noqa: TRY003
        returns_pl = returns_pl.join(joined_dates, on=date_col, how="inner")
        benchmark_pl = benchmark_pl.join(joined_dates, on=date_col, how="inner")

    index = returns_pl.select(date_col)
    excess_returns = _subtract_risk_free(returns_pl, rf_converted, date_col).drop(date_col)
    excess_benchmark = (
        _subtract_risk_free(benchmark_pl, rf_converted, date_col).drop(date_col)
        if benchmark_pl is not None
        else None
    )

    return cls(returns=excess_returns, benchmark=excess_benchmark, index=index)

`head(n=5)` ¶

Return the first n rows of the combined returns and benchmark data.

Parameters:

Name	Type	Description	Default
`n`	`int`	Number of rows to return. Defaults to 5.	`5`

Returns:

Name	Type	Description
`Data`	`Data`	A new Data object containing the first n rows of the combined data.

Source code in src/jquantstats/data.py

def head(self, n: int = 5) -> Data:
    """Return the first n rows of the combined returns and benchmark data.

    Args:
        n (int, optional): Number of rows to return. Defaults to 5.

    Returns:
        Data: A new Data object containing the first n rows of the combined data.

    """
    benchmark_head = self.benchmark.head(n) if self.benchmark is not None else None
    return Data(returns=self.returns.head(n), benchmark=benchmark_head, index=self.index.head(n))

`items()` ¶

Iterate over all assets and their corresponding data series.

This method provides a convenient way to iterate over all assets in the data, yielding each asset name and its corresponding data series.

Yields:

Type	Description
`tuple[str, Series]`	A tuple containing the asset name and its data series.

Source code in src/jquantstats/data.py

def items(self) -> Iterator[tuple[str, pl.Series]]:
    """Iterate over all assets and their corresponding data series.

    This method provides a convenient way to iterate over all assets in the data,
    yielding each asset name and its corresponding data series.

    Yields:
        tuple[str, pl.Series]: A tuple containing the asset name and its data series.

    """
    matrix = self.all

    for col in self.assets:
        yield col, matrix.get_column(col)

`resample(every='1mo')` ¶

Resample returns and benchmark to a different frequency.

Parameters:

Name	Type	Description	Default
`every`	`str`	Resampling frequency (e.g., `'1mo'`, `'1y'`). Defaults to `'1mo'`.	`'1mo'`

Returns:

Name	Type	Description
`Data`	`Data`	Resampled data at the requested frequency.
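
Within each resampling window, the source below aggregates returns by compounding, `(1 + r).product() - 1`, rather than summing them. The arithmetic for a single window, sketched with the standard library and made-up daily returns:

```python
import math

# Made-up daily returns that fall into one resampling window.
daily_returns = [0.01, -0.02, 0.03]

# Compound: grow 1.0 by each return in turn, then subtract the starting 1.0.
compounded = math.prod(1.0 + r for r in daily_returns) - 1.0

# A plain sum ignores the compounding effect and gives a different number.
simple_sum = sum(daily_returns)

print(f"compounded={compounded:.6f}, simple_sum={simple_sum:.6f}")
```

For these inputs the compounded figure is about 0.019494, slightly below the simple sum of 0.02.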

Source code in src/jquantstats/data.py

def resample(self, every: str = "1mo") -> Data:
    """Resample returns and benchmark to a different frequency.

    Args:
        every (str): Resampling frequency (e.g., ``'1mo'``, ``'1y'``).
            Defaults to ``'1mo'``.

    Returns:
        Data: Resampled data at the requested frequency.

    """

    def resample_frame(dframe: pl.DataFrame) -> pl.DataFrame:
        """Resample a single DataFrame to the target frequency using compound returns."""
        dframe = self.index.hstack(dframe)  # Add the date column for resampling

        return dframe.group_by_dynamic(
            index_column=self.index.columns[0], every=every, period=every, closed="right", label="right"
        ).agg(
            [
                ((pl.col(col) + 1.0).product() - 1.0).alias(col)
                for col in dframe.columns
                if col != self.index.columns[0]
            ]
        )

    resampled_returns = resample_frame(self.returns)
    resampled_benchmark = resample_frame(self.benchmark) if self.benchmark is not None else None
    resampled_index = resampled_returns.select(self.index.columns[0])

    return Data(
        returns=resampled_returns.drop(self.index.columns[0]),
        benchmark=resampled_benchmark.drop(self.index.columns[0]) if resampled_benchmark is not None else None,
        index=resampled_index,
    )

`tail(n=5)` ¶

Return the last n rows of the combined returns and benchmark data.

Parameters:

Name	Type	Description	Default
`n`	`int`	Number of rows to return. Defaults to 5.	`5`

Returns:

Name	Type	Description
`Data`	`Data`	A new Data object containing the last n rows of the combined data.

Source code in src/jquantstats/data.py

def tail(self, n: int = 5) -> Data:
    """Return the last n rows of the combined returns and benchmark data.

    Args:
        n (int, optional): Number of rows to return. Defaults to 5.

    Returns:
        Data: A new Data object containing the last n rows of the combined data.

    """
    benchmark_tail = self.benchmark.tail(n) if self.benchmark is not None else None
    return Data(returns=self.returns.tail(n), benchmark=benchmark_tail, index=self.index.tail(n))

`truncate(start=None, end=None)` ¶

Return a new Data object truncated to the inclusive [start, end] range.

When the index is temporal (Date/Datetime), truncation is performed by comparing the date column against start and end values.

When the index is integer-based, row slicing is used instead, and start and end must be non-negative integers. Passing non-integer bounds to an integer-indexed Data raises TypeError.

Parameters:

Name	Type	Description	Default
`start`	`date \| datetime \| str \| int \| None`	Optional lower bound (inclusive). A date/datetime value when the index is temporal; a non-negative `int` row index when the data has no temporal index.	`None`
`end`	`date \| datetime \| str \| int \| None`	Optional upper bound (inclusive). Same type rules as `start`.	`None`

Returns:

Name	Type	Description
`Data`	`Data`	A new Data object filtered to the specified range.

Raises:

Type	Description
`TypeError`	When the index is not temporal and a non-integer bound is supplied.

Source code in src/jquantstats/data.py

def truncate(
    self,
    start: date | datetime | str | int | None = None,
    end: date | datetime | str | int | None = None,
) -> Data:
    """Return a new Data object truncated to the inclusive [start, end] range.

    When the index is temporal (Date/Datetime), truncation is performed by
    comparing the date column against ``start`` and ``end`` values.

    When the index is integer-based, row slicing is used instead, and
    ``start`` and ``end`` must be non-negative integers.  Passing
    non-integer bounds to an integer-indexed Data raises `TypeError`.

    Args:
        start: Optional lower bound (inclusive).  A date/datetime value
            when the index is temporal; a non-negative `int` row
            index when the data has no temporal index.
        end: Optional upper bound (inclusive).  Same type rules as
            ``start``.

    Returns:
        Data: A new Data object filtered to the specified range.

    Raises:
        TypeError: When the index is not temporal and a non-integer bound
            is supplied.

    """
    date_column = self.index.columns[0]
    is_temporal = self.index[date_column].dtype.is_temporal()

    if is_temporal:
        cond = pl.lit(True)
        if start is not None:
            cond = cond & (pl.col(date_column) >= pl.lit(start))
        if end is not None:
            cond = cond & (pl.col(date_column) <= pl.lit(end))
        mask = self.index.select(cond.alias("mask"))["mask"]
        new_index = self.index.filter(mask)
        new_returns = self.returns.filter(mask)
        new_benchmark = self.benchmark.filter(mask) if self.benchmark is not None else None
    else:
        if start is not None and not isinstance(start, int):
            raise TypeError(f"start must be an integer, got {type(start).__name__}.")  # noqa: TRY003
        if end is not None and not isinstance(end, int):
            raise TypeError(f"end must be an integer, got {type(end).__name__}.")  # noqa: TRY003
        row_start = start if start is not None else 0
        row_end = end + 1 if end is not None else self.index.height
        length = max(0, row_end - row_start)
        new_index = self.index.slice(row_start, length)
        new_returns = self.returns.slice(row_start, length)
        new_benchmark = self.benchmark.slice(row_start, length) if self.benchmark is not None else None

    return Data(returns=new_returns, benchmark=new_benchmark, index=new_index)

`Portfolio` `dataclass` ¶

Bases: PortfolioNavMixin, PortfolioAttributionMixin, PortfolioTurnoverMixin, PortfolioCostMixin

Portfolio analytics class for quant finance.

Stores the three raw inputs — cash positions, prices, and AUM — and exposes the standard derived data series, analytics facades, transforms, and attribution tools.

Derived data series:

profits — per-asset daily cash P&L
profit — aggregate daily portfolio profit
nav_accumulated — cumulative additive NAV
nav_compounded — compounded NAV
returns — daily returns (profit / AUM)
monthly — monthly compounded returns
highwater — running high-water mark
drawdown — drawdown from high-water mark
all — merged view of all derived series
Lazy composition accessors: stats, plots, report
Portfolio transforms: truncate, lag, smoothed_holding
Attribution: tilt, timing, tilt_timing_decomp
Turnover: turnover, turnover_weekly, turnover_summary
Cost analysis: cost_adjusted_returns, trading_cost_impact
Utility: correlation

Attributes:

Name	Type	Description
`cashposition`	`DataFrame`	Polars DataFrame of positions per asset over time (includes date column if present).
`prices`	`DataFrame`	Polars DataFrame of prices per asset over time (includes date column if present).
`aum`	`float`	Assets under management used as base NAV offset.

Analytics facades¶

.stats : delegates to the legacy Stats pipeline via .data; all 50+ metrics available.
.plots : portfolio-specific Plots; NAV overlays, lead-lag IR, rolling Sharpe/vol, heatmaps.
.report : HTML Report; self-contained portfolio performance report.
.data : bridge to the legacy Data / Stats / DataPlots pipeline.

.plots and .report are intentionally not delegated to the legacy path: the legacy path operates on a bare returns series, while the analytics path has access to raw prices, positions, and AUM for richer portfolio-specific visualisations.

Cost models¶

Two independent cost models are provided. They are not interchangeable:

Model A — position-delta (stateful, set at construction): cost_per_unit: float — one-way cost per unit of position change (e.g. 0.01 per share). Used by .position_delta_costs and .net_cost_nav. Best for: equity portfolios where cost scales with shares traded.

Model B — turnover-bps (stateless, passed at call time): cost_bps: float — one-way cost in basis points of AUM turnover (e.g. 5 bps). Used by .cost_adjusted_returns(cost_bps) and .trading_cost_impact(max_bps). Best for: macro / fund-of-funds portfolios where cost scales with notional traded.

To sweep a range of cost assumptions use trading_cost_impact(max_bps=20) (Model B). To compute a net-NAV curve set cost_per_unit at construction and read .net_cost_nav (Model A).
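
The difference between the two models can be made concrete with a little arithmetic. The formulas below are one plausible reading of the descriptions above and are an assumption, not the library's implementation, which is not shown in this section. All input numbers are made up.

```python
# Hypothetical inputs for a single rebalance of one asset.
aum = 1_000_000.0
position_before = 1_000.0  # units held before the trade
position_after = 1_400.0   # units held after the trade
price = 50.0

units_traded = abs(position_after - position_before)

# Model A (assumed reading): cost scales with the number of units traded.
cost_per_unit = 0.01
model_a_cost = cost_per_unit * units_traded  # in currency

# Model B (assumed reading): cost scales with notional traded relative to AUM.
cost_bps = 5.0
turnover = units_traded * price / aum               # fraction of AUM traded
model_b_return_drag = turnover * cost_bps / 10_000  # as a return

print(model_a_cost, model_b_return_drag)
```

Under these assumptions Model A yields a cash amount that depends on units traded, while Model B yields a return drag that depends on notional turnover, which is why the two parameters are not interchangeable.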

Date column requirement¶

Most analytics work with or without a date column. The following features require a temporal date column (pl.Date or pl.Datetime):

portfolio.plots.correlation_heatmap()
portfolio.plots.lead_lag_ir_plot()
stats.monthly_win_rate() — returns NaN per column when no date is present
stats.annual_breakdown() — raises ValueError when no date is present
stats.max_drawdown_duration() — returns period count (int) instead of days

Portfolios without a date column (integer-indexed) are fully supported for NAV, returns, Sharpe, drawdown, cost analytics, and most rolling metrics.

Examples:

>>> import polars as pl
>>> from datetime import date
>>> from jquantstats import Portfolio
>>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
>>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
>>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
>>> pf.assets
['A']

Source code in src/jquantstats/portfolio.py

@dataclasses.dataclass(frozen=True, slots=True)
class Portfolio(
    PortfolioNavMixin,
    PortfolioAttributionMixin,
    PortfolioTurnoverMixin,
    PortfolioCostMixin,
):
    """Portfolio analytics class for quant finance.

    Stores the three raw inputs — cash positions, prices, and AUM — and
    exposes the standard derived data series, analytics facades, transforms,
    and attribution tools.

    Derived data series:

    - `profits` — per-asset daily cash P&L
    - `profit` — aggregate daily portfolio profit
    - `nav_accumulated` — cumulative additive NAV
    - `nav_compounded` — compounded NAV
    - `returns` — daily returns (profit / AUM)
    - `monthly` — monthly compounded returns
    - `highwater` — running high-water mark
    - `drawdown` — drawdown from high-water mark
    - `all` — merged view of all derived series

    - Lazy composition accessors: `stats`, `plots`, `report`
    - Portfolio transforms: `truncate`, `lag`,
      `smoothed_holding`
    - Attribution: `tilt`, `timing`, `tilt_timing_decomp`
    - Turnover: `turnover`, `turnover_weekly`,
      `turnover_summary`
    - Cost analysis: `cost_adjusted_returns`,
      `trading_cost_impact`
    - Utility: `correlation`

    Attributes:
        cashposition: Polars DataFrame of positions per asset over time
            (includes date column if present).
        prices: Polars DataFrame of prices per asset over time (includes date
            column if present).
        aum: Assets under management used as base NAV offset.

    Analytics facades
    -----------------
    - ``.stats``   : delegates to the legacy ``Stats`` pipeline via ``.data``; all 50+ metrics available.
    - ``.plots``   : portfolio-specific ``Plots``; NAV overlays, lead-lag IR, rolling Sharpe/vol, heatmaps.
    - ``.report``  : HTML ``Report``; self-contained portfolio performance report.
    - ``.data``    : bridge to the legacy ``Data`` / ``Stats`` / ``DataPlots`` pipeline.

    ``.plots`` and ``.report`` are intentionally *not* delegated to the legacy path: the legacy
    path operates on a bare returns series, while the analytics path has access to raw prices,
    positions, and AUM for richer portfolio-specific visualisations.

    Cost models
    -----------
    Two independent cost models are provided. They are not interchangeable:

    **Model A — position-delta (stateful, set at construction):**
        ``cost_per_unit: float``  — one-way cost per unit of position change (e.g. 0.01 per share).
        Used by ``.position_delta_costs`` and ``.net_cost_nav``.
        Best for: equity portfolios where cost scales with shares traded.

    **Model B — turnover-bps (stateless, passed at call time):**
        ``cost_bps: float``  — one-way cost in basis points of AUM turnover (e.g. 5 bps).
        Used by ``.cost_adjusted_returns(cost_bps)`` and ``.trading_cost_impact(max_bps)``.
        Best for: macro / fund-of-funds portfolios where cost scales with notional traded.

    To sweep a range of cost assumptions use ``trading_cost_impact(max_bps=20)`` (Model B).
    To compute a net-NAV curve set ``cost_per_unit`` at construction and read ``.net_cost_nav`` (Model A).

    Date column requirement
    -----------------------
    Most analytics work with or without a ``date`` column. The following features require a
    temporal ``date`` column (``pl.Date`` or ``pl.Datetime``):

    - ``portfolio.plots.correlation_heatmap()``
    - ``portfolio.plots.lead_lag_ir_plot()``
    - ``stats.monthly_win_rate()``      — returns NaN per column when no date is present
    - ``stats.annual_breakdown()``      — raises ``ValueError`` when no date is present
    - ``stats.max_drawdown_duration()`` — returns period count (int) instead of days

    Portfolios without a ``date`` column (integer-indexed) are fully supported for
    NAV, returns, Sharpe, drawdown, cost analytics, and most rolling metrics.

    Examples:
        >>> import polars as pl
        >>> from datetime import date
        >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
        >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
        >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
        >>> pf.assets
        ['A']
    """

    cashposition: pl.DataFrame
    prices: pl.DataFrame
    aum: float
    cost_per_unit: float = 0.0
    cost_bps: float = 0.0

    # ── Internal cache fields ─────────────────────────────────────────────────
    # All cache fields are initialised to ``None`` in ``__post_init__`` via
    # ``object.__setattr__`` (required for frozen dataclasses) and populated
    # lazily on first property access.
    #
    # Lifecycle:
    #   - Initialised: ``__post_init__`` sets every field to ``None``.
    #   - Populated: each property computes its value on the first call and
    #     writes it back via ``object.__setattr__``.
    #   - Invalidation: not required — ``Portfolio`` is a *frozen* dataclass,
    #     so its inputs never change and all derived values remain valid for the
    #     lifetime of the instance.
    _data_bridge: "Data | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _stats_cache: "Stats | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _plots_cache: "PortfolioPlots | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _report_cache: "Report | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _utils_cache: "PortfolioUtils | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _profits_cache: "pl.DataFrame | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _returns_cache: "pl.DataFrame | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _tilt_cache: "Portfolio | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _turnover_cache: "pl.DataFrame | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)

    @staticmethod
    def _build_data_bridge(ret: pl.DataFrame) -> "Data":
        """Build a `Data` bridge from a returns frame.

        Splits out the ``'date'`` column (if present) into an index and passes
        the ``'returns'`` column as returns.  Used internally to populate
        ``_data_bridge`` on first access to the ``data`` property, so that
        subsequent accesses are O(1).

        Args:
            ret: Returns DataFrame, optionally with a leading ``'date'`` column.

        Returns:
            A `Data` instance backed by *ret*.
        """
        from .data import Data

        returns_only = ret.select("returns")
        if "date" in ret.columns:
            return Data(returns=returns_only, index=ret.select("date"))
        return Data(returns=returns_only, index=pl.DataFrame({"index": list(range(ret.height))}))

    def __post_init__(self) -> None:
        """Validate input types, shapes, and parameters post-initialization."""
        if not isinstance(self.prices, pl.DataFrame):
            raise InvalidPricesTypeError(type(self.prices).__name__)
        if not isinstance(self.cashposition, pl.DataFrame):
            raise InvalidCashPositionTypeError(type(self.cashposition).__name__)
        if self.cashposition.shape[0] != self.prices.shape[0]:
            raise RowCountMismatchError(self.prices.shape[0], self.cashposition.shape[0])
        if self.aum <= 0.0:
            raise NonPositiveAumError(self.aum)
        object.__setattr__(self, "_data_bridge", None)
        object.__setattr__(self, "_stats_cache", None)
        object.__setattr__(self, "_plots_cache", None)
        object.__setattr__(self, "_report_cache", None)
        object.__setattr__(self, "_utils_cache", None)
        object.__setattr__(self, "_profits_cache", None)
        object.__setattr__(self, "_returns_cache", None)
        object.__setattr__(self, "_tilt_cache", None)
        object.__setattr__(self, "_turnover_cache", None)

    def _date_range(self) -> tuple[int, date | datetime | None, date | datetime | None]:
        """Return (rows, start, end) for the portfolio's returns series.

        ``start`` and ``end`` are ``None`` when there is no ``'date'`` column.
        """
        ret = self.returns
        rows = ret.height
        if "date" in ret.columns:
            return rows, cast(date | None, ret["date"].min()), cast(date | None, ret["date"].max())
        return rows, None, None

    @property
    def cost_model(self) -> CostModel:
        """Return the active cost model as a `CostModel` instance.

        Returns:
            A `CostModel` whose ``cost_per_unit`` and ``cost_bps`` fields
            reflect the values stored on this portfolio.
        """
        return CostModel(cost_per_unit=self.cost_per_unit, cost_bps=self.cost_bps)

    def __repr__(self) -> str:
        """Return a string representation of the Portfolio object."""
        rows, start, end = self._date_range()
        if start is not None:
            return f"Portfolio(assets={self.assets}, rows={rows}, start={start}, end={end})"
        return f"Portfolio(assets={self.assets}, rows={rows})"

    def describe(self) -> pl.DataFrame:
        """Return a tidy summary of shape, date range and asset names.

        Returns:
            pl.DataFrame: One row per asset with columns: asset, start, end, rows.

        Examples:
            >>> import polars as pl
            >>> from datetime import date
            >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
            >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
            >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
            >>> df = pf.describe()
            >>> list(df.columns)
            ['asset', 'start', 'end', 'rows']
        """
        rows, start, end = self._date_range()
        return pl.DataFrame(
            {
                "asset": self.assets,
                "start": [start] * len(self.assets),
                "end": [end] * len(self.assets),
                "rows": [rows] * len(self.assets),
            }
        )

    # ── Factory classmethods ──────────────────────────────────────────────────

    @classmethod
    def from_risk_position(
        cls,
        prices: pl.DataFrame,
        risk_position: pl.DataFrame | pl.Expr,
        aum: float,
        vola: int | dict[str, int] = 32,
        vol_cap: float | None = None,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio from per-asset risk positions.

        Scales each risk position by the inverse of an EWMA volatility
        estimate derived from the corresponding price series.

        Args:
            prices: Price levels per asset over time (may include a date column).
            risk_position: Risk units per asset aligned with prices.
            vola: EWMA lookback (span-equivalent) used to estimate volatility.
                Pass an ``int`` to apply the same span to every asset, or a
                ``dict[str, int]`` to set a per-asset span (assets absent from
                the dict default to ``32``).  Every span value must be a
                positive integer; a ``ValueError`` is raised otherwise.  Dict
                keys that do not correspond to any numeric column in *prices*
                also raise a ``ValueError``.
            vol_cap: Optional lower bound for the EWMA volatility estimate.
                When provided, the vol series is clipped from below at this
                value before dividing the risk position, preventing
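                position blow-up in calm, low-volatility regimes.  For
                example, ``vol_cap=0.05`` ensures the vol estimate never
                falls below 5%.  The estimate is computed from per-period
                percentage changes, so the floor is expressed in the same
                per-period units.  Must be positive when not ``None``.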
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_model: Optional `CostModel`
                instance.  When supplied, its ``cost_per_unit`` and
                ``cost_bps`` values take precedence over the individual
                parameters above.

        Returns:
            A Portfolio instance whose cash positions are risk_position
            divided by EWMA volatility.

        Raises:
            ValueError: If any span value in *vola* is ≤ 0, or if a key in a
                *vola* dict does not match any numeric column in *prices*, or
                if *vol_cap* is provided but is not positive.
        """
        if isinstance(risk_position, pl.Expr):
            risk_position = prices.with_columns(risk_position)
        if cost_model is not None:
            cost_per_unit = cost_model.cost_per_unit
            cost_bps = cost_model.cost_bps
        assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]

        # ── Validate vol_cap ──────────────────────────────────────────────────
        if vol_cap is not None and vol_cap <= 0:
            raise ValueError(f"vol_cap must be a positive number when provided, got {vol_cap!r}")  # noqa: TRY003

        # ── Validate vola ─────────────────────────────────────────────────────
        if isinstance(vola, dict):
            unknown = set(vola.keys()) - set(assets)
            if unknown:
                raise ValueError(  # noqa: TRY003
                    f"vola dict contains keys that do not match any numeric column in prices: {sorted(unknown)}"
                )
            for asset, span in vola.items():
                if int(span) <= 0:
                    raise ValueError(f"vola span for '{asset}' must be a positive integer, got {span!r}")  # noqa: TRY003
        else:
            if int(vola) <= 0:
                raise ValueError(f"vola span must be a positive integer, got {vola!r}")  # noqa: TRY003

        def _span(asset: str) -> int:
            """Return the EWMA span for *asset*, falling back to 32 if not specified."""
            if isinstance(vola, dict):
                return int(vola.get(asset, 32))
            return int(vola)

        def _vol(asset: str) -> pl.Series:
            """Return the EWMA volatility series for *asset*, optionally clipped from below."""
            vol = prices[asset].pct_change().ewm_std(com=_span(asset) - 1, adjust=True, min_samples=_span(asset))
            if vol_cap is not None:
                vol = vol.clip(lower_bound=vol_cap)
            return vol

        cash_position = risk_position.with_columns((pl.col(asset) / _vol(asset)).alias(asset) for asset in assets)
        return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

    @classmethod
    def from_position(
        cls,
        prices: pl.DataFrame,
        position: pl.DataFrame | pl.Expr,
        aum: float,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio from share/unit positions.

        Converts *position* (number of units held per asset) to cash exposure
        by multiplying element-wise with *prices*, then delegates to
        `from_cash_position`.

        Args:
            prices: Price levels per asset over time (may include a date column).
            position: Number of units held per asset over time, aligned with
                *prices*.  Non-numeric columns (e.g. ``'date'``) are passed
                through unchanged.
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_model: Optional `CostModel` instance.
                When supplied, its ``cost_per_unit`` and ``cost_bps`` values
                take precedence over the individual parameters above.

        Returns:
            A Portfolio instance whose cash positions equal *position* × *prices*.

        Examples:
            >>> import polars as pl
            >>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
            >>> pos = pl.DataFrame({"A": [10.0, 10.0, 10.0]})
            >>> pf = Portfolio.from_position(prices=prices, position=pos, aum=1e6)
            >>> pf.cashposition["A"].to_list()
            [1000.0, 1100.0, 1050.0]
        """
        if isinstance(position, pl.Expr):
            position = prices.with_columns(position)
        assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]
        cash_position = position.with_columns((pl.col(asset) * prices[asset]).alias(asset) for asset in assets)
        return cls.from_cash_position(
            prices=prices,
            cash_position=cash_position,
            aum=aum,
            cost_per_unit=cost_per_unit,
            cost_bps=cost_bps,
            cost_model=cost_model,
        )

    @classmethod
    def from_cash_position(
        cls,
        prices: pl.DataFrame,
        cash_position: pl.DataFrame | pl.Expr,
        aum: float,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio directly from cash positions aligned with prices.

        Args:
            prices: Price levels per asset over time (may include a date column).
            cash_position: Cash exposure per asset over time, either as a
                DataFrame or as a Polars expression evaluated against *prices*.
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
            cost_model: Optional `CostModel`
                instance.  When supplied, its ``cost_per_unit`` and
                ``cost_bps`` values take precedence over the individual
                parameters above.

        Returns:
            A Portfolio instance with the provided cash positions.
        """
        if isinstance(cash_position, pl.Expr):
            cash_position = prices.with_columns(cash_position)
        if cost_model is not None:
            cost_per_unit = cost_model.cost_per_unit
            cost_bps = cost_model.cost_bps
        return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

    # ── Internal helpers ───────────────────────────────────────────────────────

    @staticmethod
    def _assert_clean_series(series: pl.Series, name: str = "") -> None:
        """Raise ValueError if *series* contains nulls or non-finite values."""
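        if series.null_count() != 0:
            raise ValueError(f"series {name!r} contains null values")  # noqa: TRY003
        if not series.is_finite().all():
            raise ValueError(f"series {name!r} contains non-finite values")  # noqa: TRY003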

    # ── Core data properties ───────────────────────────────────────────────────

    @property
    def assets(self) -> list[str]:
        """List the asset column names from prices (numeric columns).

        Returns:
            list[str]: Names of numeric columns in prices; typically excludes
            ``'date'``.
        """
        return [c for c in self.prices.columns if self.prices[c].dtype.is_numeric()]

    # ── Lazy composition accessors ─────────────────────────────────────────────

    @property
    def data(self) -> "Data":
        """Build a legacy `Data` object from this portfolio's returns.

        This bridges the two entry points: ``Portfolio`` compiles the NAV curve from
        prices and positions; the returned `Data` object
        gives access to the full legacy analytics pipeline (``data.stats``,
        ``data.plots``, ``data.reports``).

        Returns:
            `Data`: A Data object whose ``returns`` column
            is the portfolio's daily return series and whose ``index`` holds the date
            column (or a synthetic integer index for date-free portfolios).

        Examples:
            >>> import polars as pl
            >>> from datetime import date
            >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
            >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
            >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
            >>> d = pf.data
            >>> "returns" in d.returns.columns
            True
        """
        if self._data_bridge is not None:
            return self._data_bridge
        bridge = Portfolio._build_data_bridge(self.returns)
        object.__setattr__(self, "_data_bridge", bridge)
        return bridge

    @property
    def stats(self) -> "Stats":
        """Return a Stats object built from the portfolio's daily returns.

        Delegates to the legacy `Stats` pipeline via
        `data`, so all analytics (Sharpe, drawdown, summary, etc.) are
        available through the shared implementation.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._stats_cache is None:
            object.__setattr__(self, "_stats_cache", self.data.stats)
        return self._stats_cache  # type: ignore[return-value]

    @property
    def plots(self) -> PortfolioPlots:
        """Convenience accessor returning a PortfolioPlots facade for this portfolio.

        Use this to create Plotly visualizations such as snapshots, lagged
        performance curves, and lead/lag IR charts.

        Returns:
            `PortfolioPlots`: Helper object with
            plotting methods.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._plots_cache is None:
            object.__setattr__(self, "_plots_cache", PortfolioPlots(self))
        return self._plots_cache  # type: ignore[return-value]

    @property
    def report(self) -> Report:
        """Convenience accessor returning a Report facade for this portfolio.

        Use this to generate a self-contained HTML performance report
        containing statistics tables and interactive charts.

        Returns:
            `Report`: Helper object with
            report methods.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._report_cache is None:
            object.__setattr__(self, "_report_cache", Report(self))
        return self._report_cache  # type: ignore[return-value]

    @property
    def utils(self) -> "PortfolioUtils":
        """Convenience accessor returning a PortfolioUtils facade for this portfolio.

        Use this for common data transformations such as converting returns to
        prices, computing log returns, rebasing, aggregating by period, and
        computing exponential standard deviation.

        Returns:
            `PortfolioUtils`: Helper object with
            utility transform methods.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._utils_cache is None:
            from ._utils import PortfolioUtils

            object.__setattr__(self, "_utils_cache", PortfolioUtils(self))
        return self._utils_cache  # type: ignore[return-value]

    # ── Portfolio transforms ───────────────────────────────────────────────────

    def truncate(
        self,
        start: date | datetime | str | int | None = None,
        end: date | datetime | str | int | None = None,
    ) -> "Portfolio":
        """Return a new Portfolio truncated to the inclusive [start, end] range.

        When a ``'date'`` column is present in both prices and cash positions,
        truncation is performed by comparing the ``'date'`` column against
        ``start`` and ``end`` (which should be date/datetime values or strings
        parseable by Polars).

        When the ``'date'`` column is absent, integer-based row slicing is
        used instead.  In this case ``start`` and ``end`` must be non-negative
        integers representing 0-based row indices.  Passing non-integer bounds
        to an integer-indexed portfolio raises `TypeError`.

        In all cases the ``aum`` value is preserved.

        Args:
            start: Optional lower bound (inclusive). A date/datetime or
                Polars-parseable string when a ``'date'`` column exists; a
                non-negative int row index when the data has no ``'date'``
                column.
            end: Optional upper bound (inclusive). Same type rules as
                ``start``.

        Returns:
            A new Portfolio instance with prices and cash positions filtered
            to the specified range.

        Raises:
            TypeError: When the portfolio has no ``'date'`` column and a
                non-integer bound is supplied.
        """
        has_date = "date" in self.prices.columns
        if has_date:
            cond = pl.lit(True)
            if start is not None:
                cond = cond & (pl.col("date") >= pl.lit(start))
            if end is not None:
                cond = cond & (pl.col("date") <= pl.lit(end))
            pr = self.prices.filter(cond)
            cp = self.cashposition.filter(cond)
        else:
            if start is not None and not isinstance(start, int):
                raise IntegerIndexBoundError("start", type(start).__name__)
            if end is not None and not isinstance(end, int):
                raise IntegerIndexBoundError("end", type(end).__name__)
            row_start = int(start) if start is not None else 0
            row_end = int(end) + 1 if end is not None else self.prices.height
            length = max(0, row_end - row_start)
            pr = self.prices.slice(row_start, length)
            cp = self.cashposition.slice(row_start, length)
        return Portfolio(
            prices=pr,
            cashposition=cp,
            aum=self.aum,
            cost_per_unit=self.cost_per_unit,
            cost_bps=self.cost_bps,
        )

    def lag(self, n: int) -> "Portfolio":
        """Return a new Portfolio with cash positions lagged by ``n`` steps.

        This method shifts the numeric asset columns in the cashposition
        DataFrame by ``n`` rows, preserving the ``'date'`` column and any
        non-numeric columns unchanged.  Positive ``n`` delays weights (moves
        them down); negative ``n`` leads them (moves them up); ``n == 0``
        returns the current portfolio unchanged.

        Notes:
            Missing values introduced by the shift are left as nulls;
            downstream profit computation already guards and treats nulls as
            zero when multiplying by returns.

        Args:
            n: Number of rows to shift (can be negative, zero, or positive).

        Returns:
            A new Portfolio instance with lagged cash positions and the same
            prices/AUM as the original.
        """
        if not isinstance(n, int):
            raise TypeError
        if n == 0:
            return self

        assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
        cp_lagged = self.cashposition.with_columns(pl.col(c).shift(n) for c in assets)
        return Portfolio(
            prices=self.prices,
            cashposition=cp_lagged,
            aum=self.aum,
            cost_per_unit=self.cost_per_unit,
            cost_bps=self.cost_bps,
        )

    def smoothed_holding(self, n: int) -> "Portfolio":
        """Return a new Portfolio with cash positions smoothed by a rolling mean.

        Applies a trailing rolling mean over the current step and the previous
        ``n`` steps for each numeric asset column (excluding ``'date'``). The
        window length is therefore ``n + 1``, and early rows average over
        however many values are available, so that:

        - n=0 returns the original weights (no smoothing),
        - n=1 averages the current and previous weights,
        - n=k averages the current and last k weights.

        Args:
            n: Non-negative integer specifying how many previous steps to
                include.

        Returns:
            A new Portfolio with smoothed cash positions and the same
            prices/AUM.
        """
        if not isinstance(n, int):
            raise TypeError
        if n < 0:
            raise ValueError
        if n == 0:
            return self

        assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
        window = n + 1
        cp_smoothed = self.cashposition.with_columns(
            pl.col(c).rolling_mean(window_size=window, min_samples=1).alias(c) for c in assets
        )
        return Portfolio(
            prices=self.prices,
            cashposition=cp_smoothed,
            aum=self.aum,
            cost_per_unit=self.cost_per_unit,
            cost_bps=self.cost_bps,
        )

    # ── Utility ────────────────────────────────────────────────────────────────

    def correlation(self, frame: pl.DataFrame, name: str = "portfolio") -> pl.DataFrame:
        """Compute a correlation matrix of asset returns plus the portfolio.

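        Computes percentage changes for all floating-point columns in
        ``frame``, appends the portfolio profit series under the provided
        ``name``, and returns the Pearson correlation matrix across all
        numeric columns.  Nulls (such as the first row of each
        percentage-change series) are filled with 0.0 before the correlation
        is computed.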

        Args:
            frame: A Polars DataFrame containing at least the asset price
                columns (and a date column which will be ignored if
                non-numeric).
            name: The column name to use when adding the portfolio profit
                series to the input frame.

        Returns:
            A square Polars DataFrame where each cell is the correlation
            between a pair of series (values in [-1, 1]).
        """
        p = frame.with_columns(cs.by_dtype(pl.Float32, pl.Float64).pct_change())
        p = p.with_columns(pl.Series(name, self.profit["profit"]))
        corr_matrix = p.select(cs.numeric()).fill_null(0.0).corr()
        return corr_matrix
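
The row-level semantics of `lag`, `smoothed_holding`, and integer-indexed `truncate` in the source above can be illustrated without Polars. The following is a minimal pure-Python sketch of those semantics, not the library's implementation: the helper names `shift`, `trailing_mean`, and `inclusive_slice` are hypothetical, plain lists stand in for DataFrame columns, and `None` stands in for a Polars null.

```python
def shift(values, n):
    """Shift a list by n rows, padding the vacated rows with None.

    Positive n delays the values (moves them down); negative n leads them.
    """
    size = len(values)
    if n == 0:
        return list(values)
    if abs(n) >= size:
        return [None] * size
    if n > 0:
        return [None] * n + list(values[: size - n])
    return list(values[-n:]) + [None] * (-n)


def trailing_mean(values, n):
    """Average each value with up to n previous values (window length n + 1).

    Early rows average over however many values are available, which is the
    behaviour requested by min_samples=1 in the source above.
    """
    out = []
    for i in range(len(values)):
        window = values[max(0, i - n) : i + 1]
        out.append(sum(window) / len(window))
    return out


def inclusive_slice(values, start=None, end=None):
    """Keep rows start..end with both bounds inclusive (0-based indices)."""
    row_start = 0 if start is None else start
    row_end = len(values) if end is None else end + 1
    length = max(0, row_end - row_start)
    return values[row_start : row_start + length]


weights = [1.0, 3.0, 5.0]
print(shift(weights, 1))
print(trailing_mean(weights, 1))
print(inclusive_slice([10, 20, 30, 40], start=1, end=2))
```

With `n=1` the trailing mean averages each weight with its predecessor, and the first row is left as-is because only one value is available.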

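In `from_risk_position`, the volatility estimate is computed with `ewm_std(com=span - 1, ...)`. In Polars' parameterisation the smoothing factor is alpha = 1 / (1 + com), so a `vola` value of `span` corresponds to alpha = 1 / span. The sketch below works through that relation and the effect of the `vol_cap` floor on a single position. It is an illustration under stated assumptions only: `alpha_from_vola` and `scaled_position` are hypothetical helper names, and the input numbers are made up.

```python
def alpha_from_vola(span: int) -> float:
    """Return the EWMA smoothing factor implied by com = span - 1."""
    if span <= 0:
        raise ValueError(f"vola span must be a positive integer, got {span!r}")
    com = span - 1
    return 1.0 / (1.0 + com)


def scaled_position(risk_position: float, vol: float, vol_cap=None) -> float:
    """Divide a risk position by a volatility estimate, optionally floored."""
    if vol_cap is not None:
        # vol_cap acts as a lower bound: the estimate is clipped from below.
        vol = max(vol, vol_cap)
    return risk_position / vol


print(alpha_from_vola(32))
print(scaled_position(1.0, 1 / 128))
print(scaled_position(1.0, 1 / 128, vol_cap=1 / 16))
```

The last two calls show why the floor matters: a very small volatility estimate inflates the position, while the floored estimate keeps it bounded.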
`assets` `property` ¶

List the asset column names from prices (numeric columns).

Returns:

Type	Description
`list[str]`	Names of numeric columns in prices; typically excludes `'date'`.

`cost_model` `property` ¶

Return the active cost model as a CostModel instance.

Returns:

Type	Description
`CostModel`	A `CostModel` whose `cost_per_unit` and `cost_bps` fields reflect the values stored on this portfolio.
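
The factory methods accept either the raw `cost_per_unit` / `cost_bps` floats or a `cost_model`. When a `cost_model` is supplied, its values take precedence over the raw floats. The sketch below mimics that precedence rule with a stand-in dataclass. `CostModelSketch` and `resolve_costs` are hypothetical names, not part of jquantstats, and the stand-in performs none of the validation the real `CostModel` may apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModelSketch:
    """Stand-in for CostModel holding the two cost parameters."""

    cost_per_unit: float = 0.0
    cost_bps: float = 0.0


def resolve_costs(cost_per_unit=0.0, cost_bps=0.0, cost_model=None):
    """Return (cost_per_unit, cost_bps), preferring cost_model when given."""
    if cost_model is not None:
        return cost_model.cost_per_unit, cost_model.cost_bps
    return cost_per_unit, cost_bps


# The raw cost_per_unit is ignored because a cost model is supplied.
print(resolve_costs(cost_per_unit=0.01, cost_model=CostModelSketch(cost_bps=5.0)))
# Without a cost model the raw floats are used as given.
print(resolve_costs(cost_per_unit=0.01))
```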

`data` `property` ¶

Build a legacy Data object from this portfolio's returns.

This bridges the two entry points: Portfolio compiles the NAV curve from prices and positions; the returned Data object gives access to the full legacy analytics pipeline (data.stats, data.plots, data.reports).

Returns:

Type	Description
`Data`	A `Data` object whose `returns` column is the portfolio's daily return series and whose `index` holds the date column (or a synthetic integer index for date-free portfolios).

Examples:

>>> import polars as pl
>>> from datetime import date
>>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
>>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
>>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
>>> d = pf.data
>>> "returns" in d.returns.columns
True

`plots` `property` ¶

Convenience accessor returning a PortfolioPlots facade for this portfolio.

Use this to create Plotly visualizations such as snapshots, lagged performance curves, and lead/lag IR charts.

Returns:

Type	Description
`PortfolioPlots`	Helper object with plotting methods.

The result is cached after first access so repeated calls are O(1).

`report` `property` ¶

Convenience accessor returning a Report facade for this portfolio.

Use this to generate a self-contained HTML performance report containing statistics tables and interactive charts.

Returns:

Type	Description
`Report`	Helper object with report methods.

The result is cached after first access so repeated calls are O(1).

`stats` `property` ¶

Return a Stats object built from the portfolio's daily returns.

Delegates to the legacy Stats pipeline via data, so all analytics (Sharpe, drawdown, summary, etc.) are available through the shared implementation.

The result is cached after first access so repeated calls are O(1).
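
The cached accessors on this class store their result on a frozen dataclass, which normally rejects attribute assignment. The pattern used in the source is to declare the cache as a non-init field, set it to `None` in `__post_init__`, and write to it through `object.__setattr__`. A minimal sketch of that pattern follows. `LazyBox` and its `total` property are hypothetical and exist only to show the mechanism.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LazyBox:
    """Frozen dataclass with one lazily computed, cached property."""

    values: tuple
    # Excluded from __init__, repr, comparison and hashing, like the
    # cache fields in the source above.
    _total_cache: "float | None" = field(init=False, repr=False, compare=False, hash=False)

    def __post_init__(self) -> None:
        # Plain assignment would raise FrozenInstanceError, so bypass it.
        object.__setattr__(self, "_total_cache", None)

    @property
    def total(self) -> float:
        if self._total_cache is None:
            object.__setattr__(self, "_total_cache", float(sum(self.values)))
        return self._total_cache


box = LazyBox(values=(1, 2, 3))
print(box.total)  # computed on first access
print(box.total)  # served from the cache
```

Because the public fields never change after construction, the cached value never needs to be invalidated.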

`utils` `property` ¶

Convenience accessor returning a PortfolioUtils facade for this portfolio.

Use this for common data transformations such as converting returns to prices, computing log returns, rebasing, aggregating by period, and computing exponential standard deviation.

Returns:

Type	Description
`PortfolioUtils`	Helper object with utility transform methods.

The result is cached after first access so repeated calls are O(1).

`__post_init__()` ¶

Validate input types, shapes, and parameters post-initialization.

Source code in src/jquantstats/portfolio.py

def __post_init__(self) -> None:
    """Validate input types, shapes, and parameters post-initialization."""
    if not isinstance(self.prices, pl.DataFrame):
        raise InvalidPricesTypeError(type(self.prices).__name__)
    if not isinstance(self.cashposition, pl.DataFrame):
        raise InvalidCashPositionTypeError(type(self.cashposition).__name__)
    if self.cashposition.shape[0] != self.prices.shape[0]:
        raise RowCountMismatchError(self.prices.shape[0], self.cashposition.shape[0])
    if self.aum <= 0.0:
        raise NonPositiveAumError(self.aum)
    object.__setattr__(self, "_data_bridge", None)
    object.__setattr__(self, "_stats_cache", None)
    object.__setattr__(self, "_plots_cache", None)
    object.__setattr__(self, "_report_cache", None)
    object.__setattr__(self, "_utils_cache", None)
    object.__setattr__(self, "_profits_cache", None)
    object.__setattr__(self, "_returns_cache", None)
    object.__setattr__(self, "_tilt_cache", None)
    object.__setattr__(self, "_turnover_cache", None)

`__repr__()` ¶

Return a string representation of the Portfolio object.

Source code in src/jquantstats/portfolio.py

def __repr__(self) -> str:
    """Return a string representation of the Portfolio object."""
    rows, start, end = self._date_range()
    if start is not None:
        return f"Portfolio(assets={self.assets}, rows={rows}, start={start}, end={end})"
    return f"Portfolio(assets={self.assets}, rows={rows})"
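
The two output shapes can be reproduced with a standalone function that uses the same format strings as the source above. This is a sketch only: `portfolio_repr` is a hypothetical helper, and the argument values are made up rather than taken from a real portfolio.

```python
from datetime import date


def portfolio_repr(assets, rows, start=None, end=None):
    """Format a Portfolio-style repr, omitting dates when there are none."""
    if start is not None:
        return f"Portfolio(assets={assets}, rows={rows}, start={start}, end={end})"
    return f"Portfolio(assets={assets}, rows={rows})"


# With a date column the range is included.
print(portfolio_repr(["A"], 2, date(2020, 1, 1), date(2020, 1, 2)))
# Portfolio(assets=['A'], rows=2, start=2020-01-01, end=2020-01-02)

# Integer-indexed portfolios show only assets and row count.
print(portfolio_repr(["A", "B"], 3))
# Portfolio(assets=['A', 'B'], rows=3)
```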

`correlation(frame, name='portfolio')` ¶

Compute a correlation matrix of asset returns plus the portfolio.

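Computes percentage changes for all floating-point columns in frame, appends the portfolio profit series under the provided name, and returns the Pearson correlation matrix across all numeric columns. Nulls (such as the first row of each percentage-change series) are filled with 0.0 before the correlation is computed.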

Parameters:

Name	Type	Description	Default
`frame`	`DataFrame`	A Polars DataFrame containing at least the asset price columns (and a date column which will be ignored if non-numeric).	required
`name`	`str`	The column name to use when adding the portfolio profit series to the input frame.	`'portfolio'`

Returns:

Type	Description
`DataFrame`	A square Polars DataFrame where each cell is the correlation between a pair of series (values in [-1, 1]).

Source code in src/jquantstats/portfolio.py

def correlation(self, frame: pl.DataFrame, name: str = "portfolio") -> pl.DataFrame:
    """Compute a correlation matrix of asset returns plus the portfolio.

    Computes percentage changes for all numeric columns in ``frame``,
    appends the portfolio profit series under the provided ``name``, and
    returns the Pearson correlation matrix across all numeric columns.

    Args:
        frame: A Polars DataFrame containing at least the asset price
            columns (and a date column which will be ignored if
            non-numeric).
        name: The column name to use when adding the portfolio profit
            series to the input frame.

    Returns:
        A square Polars DataFrame where each cell is the correlation
        between a pair of series (values in [-1, 1]).
    """
    p = frame.with_columns(cs.by_dtype(pl.Float32, pl.Float64).pct_change())
    p = p.with_columns(pl.Series(name, self.profit["profit"]))
    corr_matrix = p.select(cs.numeric()).fill_null(0.0).corr()
    return corr_matrix

`describe()` ¶

Return a tidy summary of shape, date range and asset names.

Returns:

Type	Description
`DataFrame`	One row per asset with columns: asset, start, end, rows.

Examples:

>>> import polars as pl
>>> from datetime import date
>>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
>>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
>>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
>>> df = pf.describe()
>>> list(df.columns)
['asset', 'start', 'end', 'rows']

Source code in src/jquantstats/portfolio.py

def describe(self) -> pl.DataFrame:
    """Return a tidy summary of shape, date range and asset names.

    Returns:
    -------
    pl.DataFrame
        One row per asset with columns: asset, start, end, rows.

    Examples:
        >>> import polars as pl
        >>> from datetime import date
        >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
        >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
        >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
        >>> df = pf.describe()
        >>> list(df.columns)
        ['asset', 'start', 'end', 'rows']
    """
    rows, start, end = self._date_range()
    return pl.DataFrame(
        {
            "asset": self.assets,
            "start": [start] * len(self.assets),
            "end": [end] * len(self.assets),
            "rows": [rows] * len(self.assets),
        }
    )

`from_cash_position(prices, cash_position, aum, cost_per_unit=0.0, cost_bps=0.0, cost_model=None)` `classmethod` ¶

Create a Portfolio directly from cash positions aligned with prices.

Parameters:

Name	Type	Description	Default
`prices`	`DataFrame`	Price levels per asset over time (may include a date column).	required
`cash_position`	`DataFrame \| Expr`	Cash exposure per asset over time, either as a DataFrame or as a Polars expression evaluated against prices.	required
`aum`	`float`	Assets under management used as the base NAV offset.	required
`cost_per_unit`	`float`	One-way trading cost per unit of position change. Defaults to 0.0 (no cost). Ignored when cost_model is given.	`0.0`
`cost_bps`	`float`	One-way trading cost in basis points of AUM turnover. Defaults to 0.0 (no cost). Ignored when cost_model is given.	`0.0`
`cost_model`	`CostModel \| None`	Optional `CostModel` instance. When supplied, its `cost_per_unit` and `cost_bps` values take precedence over the individual parameters above.	`None`

Returns:

Type	Description
`Self`	A Portfolio instance with the provided cash positions.

Source code in src/jquantstats/portfolio.py

@classmethod
def from_cash_position(
    cls,
    prices: pl.DataFrame,
    cash_position: pl.DataFrame | pl.Expr,
    aum: float,
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: CostModel | None = None,
) -> Self:
    """Create a Portfolio directly from cash positions aligned with prices.

    Args:
        prices: Price levels per asset over time (may include a date column).
        cash_position: Cash exposure per asset over time, either as a
            DataFrame or as a Polars expression evaluated against *prices*.
        aum: Assets under management used as the base NAV offset.
        cost_per_unit: One-way trading cost per unit of position change.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_bps: One-way trading cost in basis points of AUM turnover.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_model: Optional `CostModel`
            instance.  When supplied, its ``cost_per_unit`` and
            ``cost_bps`` values take precedence over the individual
            parameters above.

    Returns:
        A Portfolio instance with the provided cash positions.
    """
    if isinstance(cash_position, pl.Expr):
        cash_position = prices.with_columns(cash_position)
    if cost_model is not None:
        cost_per_unit = cost_model.cost_per_unit
        cost_bps = cost_model.cost_bps
    return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

`from_position(prices, position, aum, cost_per_unit=0.0, cost_bps=0.0, cost_model=None)` `classmethod` ¶

Create a Portfolio from share/unit positions.

Converts position (number of units held per asset) to cash exposure by multiplying element-wise with prices, then delegates to `from_cash_position`.

Parameters:

Name	Type	Description	Default
`prices`	`DataFrame`	Price levels per asset over time (may include a date column).	required
`position`	`DataFrame \| Expr`	Number of units held per asset over time, aligned with prices. Non-numeric columns (e.g. `'date'`) are passed through unchanged.	required
`aum`	`float`	Assets under management used as the base NAV offset.	required
`cost_per_unit`	`float`	One-way trading cost per unit of position change. Defaults to 0.0 (no cost). Ignored when cost_model is given.	`0.0`
`cost_bps`	`float`	One-way trading cost in basis points of AUM turnover. Defaults to 0.0 (no cost). Ignored when cost_model is given.	`0.0`
`cost_model`	`CostModel \| None`	Optional `CostModel` instance. When supplied, its `cost_per_unit` and `cost_bps` values take precedence over the individual parameters above.	`None`

Returns:

Type	Description
`Self`	A Portfolio instance whose cash positions equal position x prices.

Examples:

>>> import polars as pl
>>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
>>> pos = pl.DataFrame({"A": [10.0, 10.0, 10.0]})
>>> pf = Portfolio.from_position(prices=prices, position=pos, aum=1e6)
>>> pf.cashposition["A"].to_list()
[1000.0, 1100.0, 1050.0]

Source code in src/jquantstats/portfolio.py

@classmethod
def from_position(
    cls,
    prices: pl.DataFrame,
    position: pl.DataFrame | pl.Expr,
    aum: float,
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: CostModel | None = None,
) -> Self:
    """Create a Portfolio from share/unit positions.

    Converts *position* (number of units held per asset) to cash exposure
    by multiplying element-wise with *prices*, then delegates to
    :py:meth:`from_cash_position`.

    Args:
        prices: Price levels per asset over time (may include a date column).
        position: Number of units held per asset over time, aligned with
            *prices*.  Non-numeric columns (e.g. ``'date'``) are passed
            through unchanged.
        aum: Assets under management used as the base NAV offset.
        cost_per_unit: One-way trading cost per unit of position change.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_bps: One-way trading cost in basis points of AUM turnover.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_model: Optional `CostModel` instance.
            When supplied, its ``cost_per_unit`` and ``cost_bps`` values
            take precedence over the individual parameters above.

    Returns:
        A Portfolio instance whose cash positions equal *position* x *prices*.

    Examples:
        >>> import polars as pl
        >>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
        >>> pos = pl.DataFrame({"A": [10.0, 10.0, 10.0]})
        >>> pf = Portfolio.from_position(prices=prices, position=pos, aum=1e6)
        >>> pf.cashposition["A"].to_list()
        [1000.0, 1100.0, 1050.0]
    """
    if isinstance(position, pl.Expr):
        position = prices.with_columns(position)
    assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]
    cash_position = position.with_columns((pl.col(asset) * prices[asset]).alias(asset) for asset in assets)
    return cls.from_cash_position(
        prices=prices,
        cash_position=cash_position,
        aum=aum,
        cost_per_unit=cost_per_unit,
        cost_bps=cost_bps,
        cost_model=cost_model,
    )

`from_risk_position(prices, risk_position, aum, vola=32, vol_cap=None, cost_per_unit=0.0, cost_bps=0.0, cost_model=None)` `classmethod` ¶

Create a Portfolio from per-asset risk positions.

De-volatizes each risk position using an EWMA volatility estimate derived from the corresponding price series.

Parameters:

Name	Type	Description	Default
`prices`	`DataFrame`	Price levels per asset over time (may include a date column).	required
`risk_position`	`DataFrame \| Expr`	Risk units per asset aligned with prices.	required
`vola`	`int \| dict[str, int]`	EWMA lookback (span-equivalent) used to estimate volatility. Pass an `int` to apply the same span to every asset, or a `dict[str, int]` to set a per-asset span (assets absent from the dict default to `32`). Every span value must be a positive integer; a `ValueError` is raised otherwise. Dict keys that do not correspond to any numeric column in prices also raise a `ValueError`.	`32`
`vol_cap`	`float \| None`	Optional lower bound (a floor, despite the name) for the EWMA volatility estimate. When provided, the vol series is clipped from below at this value before dividing the risk position, preventing position blow-up in calm, low-volatility regimes. The estimate is computed from per-period percentage changes and is not annualised in the source shown below, so `vol_cap=0.05` floors the per-period volatility at 0.05. Must be positive when not `None`.	`None`
`aum`	`float`	Assets under management used as the base NAV offset.	required
`cost_per_unit`	`float`	One-way trading cost per unit of position change. Defaults to 0.0 (no cost). Ignored when cost_model is given.	`0.0`
`cost_bps`	`float`	One-way trading cost in basis points of AUM turnover. Defaults to 0.0 (no cost). Ignored when cost_model is given.	`0.0`
`cost_model`	`CostModel \| None`	Optional `CostModel` instance. When supplied, its `cost_per_unit` and `cost_bps` values take precedence over the individual parameters above.	`None`

Returns:

Type	Description
`Self`	A Portfolio instance whose cash positions are risk_position divided by EWMA volatility.

Raises:

Type	Description
`ValueError`	If any span value in vola is ≤ 0, or if a key in a vola dict does not match any numeric column in prices, or if vol_cap is provided but is not positive.

Source code in src/jquantstats/portfolio.py

@classmethod
def from_risk_position(
    cls,
    prices: pl.DataFrame,
    risk_position: pl.DataFrame | pl.Expr,
    aum: float,
    vola: int | dict[str, int] = 32,
    vol_cap: float | None = None,
    cost_per_unit: float = 0.0,
    cost_bps: float = 0.0,
    cost_model: CostModel | None = None,
) -> Self:
    """Create a Portfolio from per-asset risk positions.

    De-volatizes each risk position using an EWMA volatility estimate
    derived from the corresponding price series.

    Args:
        prices: Price levels per asset over time (may include a date column).
        risk_position: Risk units per asset aligned with prices.
        vola: EWMA lookback (span-equivalent) used to estimate volatility.
            Pass an ``int`` to apply the same span to every asset, or a
            ``dict[str, int]`` to set a per-asset span (assets absent from
            the dict default to ``32``).  Every span value must be a
            positive integer; a ``ValueError`` is raised otherwise.  Dict
            keys that do not correspond to any numeric column in *prices*
            also raise a ``ValueError``.
        vol_cap: Optional lower bound for the EWMA volatility estimate.
            When provided, the vol series is clipped from below at this
            value before dividing the risk position, preventing
            position blow-up in calm, low-volatility regimes.  For
            example, ``vol_cap=0.05`` ensures the per-period vol
            estimate never falls below 0.05 (the estimate is not
            annualised).  Must be positive when not ``None``.
        aum: Assets under management used as the base NAV offset.
        cost_per_unit: One-way trading cost per unit of position change.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_bps: One-way trading cost in basis points of AUM turnover.
            Defaults to 0.0 (no cost).  Ignored when *cost_model* is given.
        cost_model: Optional `CostModel`
            instance.  When supplied, its ``cost_per_unit`` and
            ``cost_bps`` values take precedence over the individual
            parameters above.

    Returns:
        A Portfolio instance whose cash positions are risk_position
        divided by EWMA volatility.

    Raises:
        ValueError: If any span value in *vola* is ≤ 0, or if a key in a
            *vola* dict does not match any numeric column in *prices*, or
            if *vol_cap* is provided but is not positive.
    """
    if isinstance(risk_position, pl.Expr):
        risk_position = prices.with_columns(risk_position)
    if cost_model is not None:
        cost_per_unit = cost_model.cost_per_unit
        cost_bps = cost_model.cost_bps
    assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]

    # ── Validate vol_cap ──────────────────────────────────────────────────
    if vol_cap is not None and vol_cap <= 0:
        raise ValueError(f"vol_cap must be a positive number when provided, got {vol_cap!r}")  # noqa: TRY003

    # ── Validate vola ─────────────────────────────────────────────────────
    if isinstance(vola, dict):
        unknown = set(vola.keys()) - set(assets)
        if unknown:
            raise ValueError(  # noqa: TRY003
                f"vola dict contains keys that do not match any numeric column in prices: {sorted(unknown)}"
            )
        for asset, span in vola.items():
            if int(span) <= 0:
                raise ValueError(f"vola span for '{asset}' must be a positive integer, got {span!r}")  # noqa: TRY003
    else:
        if int(vola) <= 0:
            raise ValueError(f"vola span must be a positive integer, got {vola!r}")  # noqa: TRY003

    def _span(asset: str) -> int:
        """Return the EWMA span for *asset*, falling back to 32 if not specified."""
        if isinstance(vola, dict):
            return int(vola.get(asset, 32))
        return int(vola)

    def _vol(asset: str) -> pl.Series:
        """Return the EWMA volatility series for *asset*, optionally clipped from below."""
        vol = prices[asset].pct_change().ewm_std(com=_span(asset) - 1, adjust=True, min_samples=_span(asset))
        if vol_cap is not None:
            vol = vol.clip(lower_bound=vol_cap)
        return vol

    cash_position = risk_position.with_columns((pl.col(asset) / _vol(asset)).alias(asset) for asset in assets)
    return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

`lag(n)` ¶

Return a new Portfolio with cash positions lagged by n steps.

This method shifts the numeric asset columns in the cashposition DataFrame by n rows, preserving the 'date' column and any non-numeric columns unchanged. Positive n delays weights (moves them down); negative n leads them (moves them up); n == 0 returns the current portfolio unchanged.

Notes

Missing values introduced by the shift are left as nulls; downstream profit computation already guards and treats nulls as zero when multiplying by returns.

Parameters:

Name	Type	Description	Default
`n`	`int`	Number of rows to shift (can be negative, zero, or positive).	required

Returns:

Type	Description
`Portfolio`	A new Portfolio instance with lagged cash positions and the same prices/AUM as the original.

Source code in src/jquantstats/portfolio.py

def lag(self, n: int) -> "Portfolio":
    """Return a new Portfolio with cash positions lagged by ``n`` steps.

    This method shifts the numeric asset columns in the cashposition
    DataFrame by ``n`` rows, preserving the ``'date'`` column and any
    non-numeric columns unchanged.  Positive ``n`` delays weights (moves
    them down); negative ``n`` leads them (moves them up); ``n == 0``
    returns the current portfolio unchanged.

    Notes:
        Missing values introduced by the shift are left as nulls;
        downstream profit computation already guards and treats nulls as
        zero when multiplying by returns.

    Args:
        n: Number of rows to shift (can be negative, zero, or positive).

    Returns:
        A new Portfolio instance with lagged cash positions and the same
        prices/AUM as the original.
    """
    if not isinstance(n, int):
        raise TypeError
    if n == 0:
        return self

    assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
    cp_lagged = self.cashposition.with_columns(pl.col(c).shift(n) for c in assets)
    return Portfolio(
        prices=self.prices,
        cashposition=cp_lagged,
        aum=self.aum,
        cost_per_unit=self.cost_per_unit,
        cost_bps=self.cost_bps,
    )

`smoothed_holding(n)` ¶

Return a new Portfolio with cash positions smoothed by a rolling mean.

Applies a trailing window average over the last n steps for each numeric asset column (excluding 'date'). The window length is n + 1 so that:

- n=0 returns the original weights (no smoothing),
- n=1 averages the current and previous weights,
- n=k averages the current and last k weights.

Parameters:

Name	Type	Description	Default
`n`	`int`	Non-negative integer specifying how many previous steps to include.	required

Returns:

Type	Description
`Portfolio`	A new Portfolio with smoothed cash positions and the same prices/AUM.

Source code in src/jquantstats/portfolio.py

def smoothed_holding(self, n: int) -> "Portfolio":
    """Return a new Portfolio with cash positions smoothed by a rolling mean.

    Applies a trailing window average over the last ``n`` steps for each
    numeric asset column (excluding ``'date'``). The window length is
    ``n + 1`` so that:

    - n=0 returns the original weights (no smoothing),
    - n=1 averages the current and previous weights,
    - n=k averages the current and last k weights.

    Args:
        n: Non-negative integer specifying how many previous steps to
            include.

    Returns:
        A new Portfolio with smoothed cash positions and the same
        prices/AUM.
    """
    if not isinstance(n, int):
        raise TypeError
    if n < 0:
        raise ValueError
    if n == 0:
        return self

    assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()]
    window = n + 1
    cp_smoothed = self.cashposition.with_columns(
        pl.col(c).rolling_mean(window_size=window, min_samples=1).alias(c) for c in assets
    )
    return Portfolio(
        prices=self.prices,
        cashposition=cp_smoothed,
        aum=self.aum,
        cost_per_unit=self.cost_per_unit,
        cost_bps=self.cost_bps,
    )
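
The window arithmetic (length `n + 1`, with shorter windows allowed at the start because `min_samples=1`) can be checked with a small pure-Python sketch. `trailing_mean` is a hypothetical helper and assumes the column contains no nulls.

```python
def trailing_mean(values: list[float], n: int) -> list[float]:
    """Average each value with up to n previous values (window length n + 1)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    out = []
    for i in range(len(values)):
        window = values[max(0, i - n) : i + 1]  # shorter at the start of the series
        out.append(sum(window) / len(window))
    return out


positions = [10.0, 20.0, 30.0, 40.0]
smoothed = trailing_mean(positions, 1)  # [10.0, 15.0, 25.0, 35.0]
```

The first row is returned unchanged because its window holds a single value, so smoothing never introduces leading nulls.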

`truncate(start=None, end=None)` ¶

Return a new Portfolio truncated to the inclusive [start, end] range.

When a 'date' column is present in both prices and cash positions, truncation is performed by comparing the 'date' column against start and end (which should be date/datetime values or strings parseable by Polars).

When the 'date' column is absent, integer-based row slicing is used instead. In this case start and end must be non-negative integers representing 0-based row indices. Passing non-integer bounds to an integer-indexed portfolio raises TypeError.

In all cases the aum value is preserved.

Parameters:

Name	Type	Description	Default
`start`	`date \| datetime \| str \| int \| None`	Optional lower bound (inclusive). A date/datetime or Polars-parseable string when a `'date'` column exists; a non-negative int row index when the data has no `'date'` column.	`None`
`end`	`date \| datetime \| str \| int \| None`	Optional upper bound (inclusive). Same type rules as `start`.	`None`

Returns:

Type	Description
`Portfolio`	A new Portfolio instance with prices and cash positions filtered to the specified range.

Raises:

Type	Description
`TypeError`	When the portfolio has no `'date'` column and a non-integer bound is supplied.

Source code in src/jquantstats/portfolio.py

def truncate(
    self,
    start: date | datetime | str | int | None = None,
    end: date | datetime | str | int | None = None,
) -> "Portfolio":
    """Return a new Portfolio truncated to the inclusive [start, end] range.

    When a ``'date'`` column is present in both prices and cash positions,
    truncation is performed by comparing the ``'date'`` column against
    ``start`` and ``end`` (which should be date/datetime values or strings
    parseable by Polars).

    When the ``'date'`` column is absent, integer-based row slicing is
    used instead.  In this case ``start`` and ``end`` must be non-negative
    integers representing 0-based row indices.  Passing non-integer bounds
    to an integer-indexed portfolio raises `TypeError`.

    In all cases the ``aum`` value is preserved.

    Args:
        start: Optional lower bound (inclusive). A date/datetime or
            Polars-parseable string when a ``'date'`` column exists; a
            non-negative int row index when the data has no ``'date'``
            column.
        end: Optional upper bound (inclusive). Same type rules as
            ``start``.

    Returns:
        A new Portfolio instance with prices and cash positions filtered
        to the specified range.

    Raises:
        TypeError: When the portfolio has no ``'date'`` column and a
            non-integer bound is supplied.
    """
    has_date = "date" in self.prices.columns
    if has_date:
        cond = pl.lit(True)
        if start is not None:
            cond = cond & (pl.col("date") >= pl.lit(start))
        if end is not None:
            cond = cond & (pl.col("date") <= pl.lit(end))
        pr = self.prices.filter(cond)
        cp = self.cashposition.filter(cond)
    else:
        if start is not None and not isinstance(start, int):
            raise IntegerIndexBoundError("start", type(start).__name__)
        if end is not None and not isinstance(end, int):
            raise IntegerIndexBoundError("end", type(end).__name__)
        row_start = int(start) if start is not None else 0
        row_end = int(end) + 1 if end is not None else self.prices.height
        length = max(0, row_end - row_start)
        pr = self.prices.slice(row_start, length)
        cp = self.cashposition.slice(row_start, length)
    return Portfolio(
        prices=pr,
        cashposition=cp,
        aum=self.aum,
        cost_per_unit=self.cost_per_unit,
        cost_bps=self.cost_bps,
    )

`Result` `dataclass` ¶

Lightweight container for system outputs.

Attributes:

Name	Type	Description
`portfolio`	`Portfolio`	The portfolio constructed by a system/experiment.
`mu`	`DataFrame \| None`	Optional per-asset expected-returns surface used by some systems.

Source code in src/jquantstats/result.py

@dataclass(frozen=True)
class Result:
    """Lightweight container for system outputs.

    Attributes:
        portfolio: The portfolio constructed by a system/experiment.
        mu: Optional per-asset expected-returns surface used by some systems.
    """

    portfolio: Portfolio
    mu: pl.DataFrame | None = None

    def create_reports(self, output_dir: Path) -> None:
        """Generate CSV exports and interactive HTML plots for this result.

        Args:
            output_dir: Destination directory where two subfolders will be created:
                - data/: CSV exports of prices, profit, returns, positions, and signal (if mu present).
                - plots/: Plotly HTML reports (snapshot, lead/lag IR, lagged performance,
                  smoothed holdings performance).
        """
        data = output_dir / "data"
        plots = output_dir / "plots"

        data.mkdir(parents=True, exist_ok=True)
        plots.mkdir(parents=True, exist_ok=True)

        self.portfolio.prices.write_csv(file=data / "prices.csv")
        self.portfolio.profit.write_csv(file=data / "profit.csv")
        self.portfolio.returns.write_csv(file=data / "returns.csv")
        self.portfolio.tilt_timing_decomp.write_csv(file=data / "tilt_timing_decomp.csv")

        if self.mu is not None:
            self.mu.write_csv(file=data / "signal.csv")

        self.portfolio.cashposition.write_csv(file=data / "position.csv")

        fig = self.portfolio.plots.snapshot()
        fig.write_html(file=plots / "snapshot.html", auto_open=False, include_plotlyjs="cdn")
        fig = self.portfolio.plots.lead_lag_ir_plot()
        fig.write_html(file=plots / "lag_ir.html", auto_open=False, include_plotlyjs="cdn")
        fig = self.portfolio.plots.lagged_performance_plot()
        fig.write_html(file=plots / "lagged_perf.html", auto_open=False, include_plotlyjs="cdn")
        fig = self.portfolio.plots.smoothed_holdings_performance_plot()
        fig.write_html(file=plots / "smooth_perf.html", auto_open=False, include_plotlyjs="cdn")

`create_reports(output_dir)` ¶

Generate CSV exports and interactive HTML plots for this result.

Parameters:

Name	Type	Description	Default
`output_dir`	`Path`	Destination directory in which two subfolders are created: `data/`, holding CSV exports of prices, profit, returns, positions, and signal (if `mu` is present); and `plots/`, holding Plotly HTML reports (snapshot, lead/lag IR, lagged performance, smoothed holdings performance).	required

Source code in src/jquantstats/result.py

def create_reports(self, output_dir: Path) -> None:
    """Generate CSV exports and interactive HTML plots for this result.

    Args:
        output_dir: Destination directory where two subfolders will be created:
            - data/: CSV exports of prices, profit, returns, positions, and signal (if mu present).
            - plots/: Plotly HTML reports (snapshot, lead/lag IR, lagged performance,
              smoothed holdings performance).
    """
    data = output_dir / "data"
    plots = output_dir / "plots"

    data.mkdir(parents=True, exist_ok=True)
    plots.mkdir(parents=True, exist_ok=True)

    self.portfolio.prices.write_csv(file=data / "prices.csv")
    self.portfolio.profit.write_csv(file=data / "profit.csv")
    self.portfolio.returns.write_csv(file=data / "returns.csv")
    self.portfolio.tilt_timing_decomp.write_csv(file=data / "tilt_timing_decomp.csv")

    if self.mu is not None:
        self.mu.write_csv(file=data / "signal.csv")

    self.portfolio.cashposition.write_csv(file=data / "position.csv")

    fig = self.portfolio.plots.snapshot()
    fig.write_html(file=plots / "snapshot.html", auto_open=False, include_plotlyjs="cdn")
    fig = self.portfolio.plots.lead_lag_ir_plot()
    fig.write_html(file=plots / "lag_ir.html", auto_open=False, include_plotlyjs="cdn")
    fig = self.portfolio.plots.lagged_performance_plot()
    fig.write_html(file=plots / "lagged_perf.html", auto_open=False, include_plotlyjs="cdn")
    fig = self.portfolio.plots.smoothed_holdings_performance_plot()
    fig.write_html(file=plots / "smooth_perf.html", auto_open=False, include_plotlyjs="cdn")
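
As a quick reference, the file layout produced by the source above can be listed without running any analytics. `expected_outputs` is a hypothetical helper that only builds paths from the filenames visible in that source; it writes nothing to disk.

```python
from pathlib import Path


def expected_outputs(output_dir: Path, has_mu: bool) -> list[Path]:
    """List the files the source above writes, relative to output_dir."""
    data = ["prices.csv", "profit.csv", "returns.csv", "tilt_timing_decomp.csv", "position.csv"]
    if has_mu:
        data.append("signal.csv")  # only written when Result.mu is not None
    plots = ["snapshot.html", "lag_ir.html", "lagged_perf.html", "smooth_perf.html"]
    return [output_dir / "data" / name for name in data] + [output_dir / "plots" / name for name in plots]


paths = expected_outputs(Path("out"), has_mu=False)
```

A listing like this is handy in tests that assert a report run produced every expected artifact.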

`interpolate(df)` ¶

Forward-fill numeric columns only between first and last non-null values.

For each numeric column, forward-fill is applied strictly within the span bounded by its first and last non-null samples. Values outside this span are left as-is (including leading/trailing nulls). Non-numeric columns are returned unchanged.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Input frame possibly containing nulls.	required

Returns:

Type	Description
`DataFrame`	Frame where numeric columns have been interior-forward-filled; schema and dtypes of the original columns are preserved.

Examples:

import polars as pl
from jquantstats import interpolate

df = pl.DataFrame({"a": [None, 1.0, None, 3.0, None], "b": ["x", "y", "z", "w", "v"]})
result = interpolate(df)
# a: [None, 1.0, 1.0, 3.0, None]  (leading/trailing nulls untouched)
# b: ["x", "y", "z", "w", "v"]    (non-numeric unchanged)

Source code in src/jquantstats/data.py

def interpolate(df: pl.DataFrame) -> pl.DataFrame:
    """Forward-fill numeric columns only between first and last non-null values.

    For each numeric column, forward-fill is applied strictly within the span
    bounded by its first and last non-null samples. Values outside this span
    are left as-is (including leading/trailing nulls). Non-numeric columns are
    returned unchanged.

    Args:
        df: Input frame possibly containing nulls.

    Returns:
        pl.DataFrame: Frame where numeric columns have been interior-forward-
        filled; schema and dtypes of the original columns are preserved.

    Examples:
        ```python
        import polars as pl
        from jquantstats import interpolate

        df = pl.DataFrame({"a": [None, 1.0, None, 3.0, None], "b": ["x", "y", "z", "w", "v"]})
        result = interpolate(df)
        # a: [None, 1.0, 1.0, 3.0, None]  (leading/trailing nulls untouched)
        # b: ["x", "y", "z", "w", "v"]    (non-numeric unchanged)
        ```

    """
    # Choose a temp column name guaranteed not to collide with any user column.
    tmp_col = "__row_idx__"
    while tmp_col in df.columns:
        tmp_col = f"_{tmp_col}_"

    out = []

    for col in df.columns:
        s = df[col]
        if s.dtype.is_numeric():
            non_null_mask = s.is_not_null()
            if non_null_mask.any():
                _fwd = non_null_mask.arg_max()
                _rev = non_null_mask.reverse().arg_max()
                if _fwd is None or _rev is None:  # pragma: no cover
                    out.append(pl.col(col))
                    continue
                first_valid_idx = _fwd
                last_valid_idx = len(s) - 1 - _rev
            else:
                out.append(pl.col(col))
                continue

            mask = (pl.col(tmp_col) >= pl.lit(first_valid_idx)) & (pl.col(tmp_col) <= pl.lit(last_valid_idx))
            filled_col = pl.when(mask).then(pl.col(col).fill_null(strategy="forward")).otherwise(pl.col(col)).alias(col)
            out.append(filled_col)
        else:
            out.append(pl.col(col))

    return df.with_columns(pl.int_range(0, df.height).alias(tmp_col)).select(out)


`lag(n)` ¶

`smoothed_holding(n)` ¶

`truncate(start=None, end=None)` ¶

`Result` `dataclass` ¶

`create_reports(output_dir)` ¶

`interpolate(df)` ¶