BaseDataProvider total relative return #136

@gadamc

What is the expected enhancement?

Adds a new function to BaseDataProvider that returns the total relative change in an asset.

The Problem

The get_period_return_mean_vector method returns, for each asset, the average relative change in value per time step (daily, in this case) taken over the entire data set. https://github.com/Qiskit/qiskit-finance/blob/main/qiskit_finance/data_providers/_base_data_provider.py#L113
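As a rough illustration of that calculation, here is a minimal NumPy sketch. It assumes the prices are stored as a 2-D array with one row per asset and one column per time step; the helper name `mean_period_return` and the sample prices are made up for this example and are not part of qiskit-finance.

```python
import numpy as np


def mean_period_return(prices):
    """Average per-step relative change for each asset (hypothetical helper)."""
    prices = np.asarray(prices, dtype=float)
    # Relative change from each time step to the next, per asset.
    period_returns = prices[:, 1:] / prices[:, :-1] - 1
    # Arithmetic mean over all time steps.
    return period_returns.mean(axis=1)


# Made-up prices: the first asset gains 10% per step, the second is flat.
prices = np.array([[100.0, 110.0, 121.0], [50.0, 50.0, 50.0]])
print(mean_period_return(prices))  # approximately [0.1, 0.0]
```

Note that this is an arithmetic mean of the per-step changes, so it does not account for compounding.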

However, I'm not sure if this is what is really desired.

During the Fall 2021 Quantum Challenge, the Qiskit Finance package was demonstrated in challenge notebook 1. That demonstration used four randomly generated stock price series spanning a 30-year period.

[Screenshot: chart of the four stock prices over the 30-year period]

The values returned by get_period_return_mean_vector for these four stocks are

[1.59702144e-04 4.76518943e-04 2.39123234e-04 9.85029012e-05]

As you can see, STOCK1 has a larger mean return value than STOCK0: 4.77e-4 > 1.60e-4

However, looking at the chart of the stock values over the course of 30 years, STOCK0 has increased in relative value far more than STOCK1. I would think that an investor would prefer STOCK0 over STOCK1. But using the output of get_period_return_mean_vector as the expected returns in the PortfolioOptimization class will cause the optimization to prefer STOCK1, even though it results in a smaller return.

The values from get_period_return_mean_vector are affected by the fact that STOCK1 has a larger number of very small negative daily changes in value and a handful of large daily increases, which skews the returned average.
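The effect can be reproduced with a toy example. The sketch below uses two made-up price series (not the challenge data): a steady asset that gains 1% every step, and a volatile one that alternates between +50% and -40%. The volatile asset has the larger mean per-step return yet ends the period lower than it started.

```python
import numpy as np

# Made-up price series, one row per asset, both starting at 100.
steady = 100.0 * np.cumprod([1.0, 1.01, 1.01, 1.01, 1.01])
volatile = 100.0 * np.cumprod([1.0, 1.5, 0.6, 1.5, 0.6])
prices = np.vstack([steady, volatile])

# Arithmetic mean of the per-step relative changes.
period_returns = prices[:, 1:] / prices[:, :-1] - 1
mean_returns = period_returns.mean(axis=1)

# Relative change between the first and last price.
total_returns = prices[:, -1] / prices[:, 0] - 1

print("mean per-step return:", mean_returns)   # roughly [0.01, 0.05]
print("total relative return:", total_returns)  # roughly [0.0406, -0.19]
```

Ranking by the mean would favor the volatile asset, while ranking by the total relative return favors the steady one.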

The Solution

Instead of get_period_return_mean_vector the BaseDataProvider object could have a method get_period_return_total_vector, which will simply be the relative increase in the value of the asset between the start and end of the period.

# Proposed method of BaseDataProvider; it relies on names already used in that module
# (np, cast, QiskitFinanceError).
def get_period_return_total_vector(self) -> np.ndarray:
    """
    Returns a vector containing the total relative return of each asset over the entire period.
    Returns:
        a per-asset vector.
    Raises:
        QiskitFinanceError: no data loaded
    """
    try:
        if not self._data:
            raise QiskitFinanceError(
                "No data loaded, yet. Please run the method run() first to load the data."
            )
    except AttributeError as ex:
        raise QiskitFinanceError(
            "No data loaded, yet. Please run the method run() first to load the data."
        ) from ex
    # Relative change between the first and last value of each asset (one row per asset).
    _div_func = np.vectorize(BaseDataProvider._divide)
    data = np.array(self._data)
    period_total_return = _div_func(data[:, -1], data[:, 0]) - 1
    self.period_total_return = cast(np.ndarray, period_total_return)
    return self.period_total_return

The returns from this function for the data above are

[3.39820122, 0.16965773, 1.84632666, 0.02657591]

One would then use these as the expected returns in the PortfolioOptimization class.
