
Commit 5c87883

Merge pull request #70 from DynamicsAndNeuralSystems/jmoo2880-python-upgrade

Update pyspi dependencies for Python 3.10+ compatibility

2 parents (0a53795 + 526c7eb)

File tree: 13 files changed (+129, -64 lines)


.github/SECURITY.md (1 addition & 1 deletion)

@@ -13,6 +13,6 @@ currently being supported with security updates.

 | Version | Supported          |
 | ------- | ------------------ |
-| 0.4     | :white_check_mark: |
+| 1.1.0   | :white_check_mark: |

.github/workflows/run_unit_tests.yaml (2 additions & 1 deletion)

@@ -8,7 +8,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.8", "3.9"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
     steps:
       - uses: actions/checkout@v4
       - name: Setup python ${{ matrix.python-version }}
@@ -23,6 +23,7 @@ jobs:
       - name: Install pyspi dependencies
         run: |
           python -m pip install --upgrade pip
+          pip install setuptools
           pip install -r requirements.txt
           pip install .
       - name: Run pyspi calculator/utils unit tests
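The explicit `pip install setuptools` step is presumably there for the newly added Python 3.12 runners: 3.12 removed the stdlib `distutils` module and no longer preinstalls setuptools in fresh environments, while setuptools provides a `distutils` compatibility shim. A quick check of that assumption (illustrative, not part of the workflow):

import sys

if sys.version_info >= (3, 12):
    try:
        import distutils  # resolved via setuptools' compatibility shim on 3.12+
    except ImportError:
        raise SystemExit("distutils unavailable: run `pip install setuptools`")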

README.md (1 addition & 17 deletions)

@@ -14,7 +14,7 @@
   <a href="https://www.gnu.org/licenses/gpl-3.0"><img src="https://img.shields.io/badge/License-GPLv3-blue.svg" height="20"/></a>
   <a href="https://github.com/DynamicsAndNeuralSystems/pyspi/actions/workflows/run_unit_tests.yaml"><img src="https://github.com/DynamicsAndNeuralSystems/pyspi/actions/workflows/run_unit_tests.yaml/badge.svg" height="20"/></a>
   <a href="https://twitter.com/compTimeSeries"><img src="https://img.shields.io/twitter/url/https/twitter.com/compTimeSeries.svg?style=social&label=Follow%20%40compTimeSeries" height="20"/></a><br>
-  <a href="https://www.python.org"><img src="https://img.shields.io/badge/Python-3.8%20|%203.9-3776AB.svg?style=flat&logo=python&logoColor=white" alt="Python 3.8 | 3.9"></a>
+  <a href="https://www.python.org"><img src="https://img.shields.io/badge/Python-3.8%20|%203.9%20|%203.10%20|%203.11%20|%203.12-3776AB.svg?style=flat&logo=python&logoColor=white" alt="Python 3.8 | 3.9 | 3.10 | 3.11 | 3.12"></a>
 </p>

 _pyspi_ is a comprehensive python library for computing statistics of pairwise interactions (SPIs) from multivariate time-series (MTS) data.

@@ -74,21 +74,6 @@ Once you have installed _pyspi_, you can learn how to apply the package by check

 ### Advanced Usage
 For advanced users, we offer several additional guides in the [full documentation](https://time-series-features.gitbook.io/pyspi/usage/advanced-usage) on how you can distribute your _pyspi_ jobs across PBS clusters, as well as how you can construct your own subsets of SPIs.
-Click one of the following dropdowns for more information:
-
-<details closed>
-<summary>Distributing pyspi calculations</summary>
-<p>If you have access to a PBS cluster and are processing MTS with many processes (or are analyzing many MTS), then you may find the <a href="https://github.com/DynamicsAndNeuralSystems/pyspi-distribute"><em>pyspi distribute</em></a> repository helpful.
-In the full <a href="https://time-series-features.gitbook.io/pyspi/usage/advanced-usage/distributing-calculations-on-a-cluster">documentation</a>, we provide a comprehensive guide on how you can distribute <em>pyspi</em> calculations on a PBS cluster, along with the necessary scripts and commands to get started!</p>
-</details>
-
-<details closed>
-<summary>Reduced subsets</summary>
-<p>If your dataset is large (containing many processes and/or observations), you can use a pre-configured set of reduced statistics or create your own subsets.
-Follow the guide in the <a href="https://time-series-features.gitbook.io/pyspi/usage/advanced-usage/using-a-reduced-spi-set">documentation</a> to learn how you can create your own reduced subsets.</p>
-</details>
-

 ## SPI Descriptions 📋
 To access a table with a high-level overview of the _pyspi_ library of SPIs, including their associated identifiers, see the [table of SPIs](https://time-series-features.gitbook.io/pyspi/spis/table-of-spis) in the full documentation.

@@ -167,4 +152,3 @@ Below are some of the leading contributors to _pyspi_:

 ## License 🧾
 _pyspi_ is released under the [GNU General Public License](https://www.gnu.org/licenses/gpl-3.0).
-

pyproject.toml (2 additions & 2 deletions)

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "pyspi"
-version = "1.0.3"
+version = "1.1.0"
 authors = [
     { name ="Oliver M. Cliff", email="[email protected]"},
 ]
@@ -15,7 +15,7 @@ maintainers = [
 description = "Library for pairwise analysis of time series data."
 readme = "README.md"
 license = {text = "GNU General Public License v3 (GPLv3)"}
-requires-python = ">=3.8,<3.10"
+requires-python = ">=3.8"
 classifiers = [
     "Programming Language :: Python",
     "Programming Language :: Python :: 3",

pyspi/calculator.py (10 additions & 4 deletions)

@@ -8,7 +8,7 @@

 # From this package
 from .data import Data
-from .utils import convert_mdf_to_ddf, check_optional_deps
+from .utils import convert_mdf_to_ddf, check_optional_deps, inspect_calc_results


 class Calculator:
@@ -34,14 +34,18 @@ class Calculator:
         A pre-configured subset of SPIs to use. Options are "all", "fast", "sonnet", or "fabfour", defaults to "all".
     configfile (str, optional):
         The location of the YAML configuration file for a user-defined subset. See :ref:`Using a reduced SPI set`, defaults to :code:`'</path/to/pyspi>/pyspi/config.yaml'`
+    normalise (bool, optional):
+        Normalise the dataset along the time axis before computing SPIs, defaults to True.
     """

     _optional_dependencies = None

     def __init__(
-        self, dataset=None, name=None, labels=None, subset="all", configfile=None
+        self, dataset=None, name=None, labels=None, subset="all", configfile=None,
+        normalise=True
     ):
         self._spis = {}
         self._excluded_spis = list()
+        self._normalise = normalise

         # Define configfile by subset if it was not specified
         if configfile is None:
@@ -252,7 +256,7 @@ def load_dataset(self, dataset):
             New dataset to attach to calculator.
         """
         if not isinstance(dataset, Data):
-            self._dataset = Data(Data.convert_to_numpy(dataset))
+            self._dataset = Data(Data.convert_to_numpy(dataset), normalise=self._normalise)
         else:
             self._dataset = dataset

@@ -293,7 +297,9 @@ def compute(self):
                 warnings.warn(f'Caught {type(err)} for SPI "{spi}": {err}')
                 self._table[spi] = np.NaN
         pbar.close()
-
+        print(f"\nCalculation complete. Time taken: {pbar.format_dict['elapsed']:.4f}s")
+        inspect_calc_results(self)
+
     def _rmmin(self):
         """Iterate through all spis and remove the minimum (fixes absolute value errors when correlating)"""
         for spi in self.spis:
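A minimal usage sketch of the new keyword (the random dataset and variable names below are illustrative, not part of this commit):

import numpy as np
from pyspi.calculator import Calculator

dataset = np.random.randn(3, 100)   # 3 processes, 100 observations

# Default behaviour is unchanged: the data are normalised before computing SPIs.
calc = Calculator(dataset=dataset)

# New in 1.1.0: skip normalisation, e.g. for data that are already pre-processed.
calc_raw = Calculator(dataset=dataset, normalise=False)
calc_raw.compute()   # now also prints elapsed time and an SPI results summary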

pyspi/data.py (3 additions & 0 deletions)

@@ -177,11 +177,14 @@ def set_data(
             data = data[:, :n_observations]

         if self.normalise:
+            print("Normalising the dataset...\n")
             data = zscore(data, axis=1, nan_policy="omit", ddof=1)
             try:
                 data = detrend(data, axis=1)
             except ValueError as err:
                 print(f"Could not detrend data: {err}")
+        else:
+            print("Skipping normalisation of the dataset...\n")

         nans = np.isnan(data)
         if nans.any():
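For reference, the normalisation this branch applies is a per-process z-score followed by a linear detrend, both along the time axis. A standalone sketch using the same SciPy calls (the toy array is illustrative):

import numpy as np
from scipy.stats import zscore
from scipy.signal import detrend

data = np.random.randn(3, 100)                       # (processes, observations)
z = zscore(data, axis=1, nan_policy="omit", ddof=1)  # zero mean, unit variance per process
z = detrend(z, axis=1)                               # remove any linear trend per process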

pyspi/statistics/causal.py (11 additions & 0 deletions)

@@ -2,6 +2,8 @@
 import pandas as pd
 from cdt.causality.pairwise import ANM, CDS, IGCI, RECI
 import pyEDM
+from sklearn.gaussian_process import GaussianProcessRegressor
+from cdt.causality.pairwise.ANM import normalized_hsic

 from pyspi.base import Directed, Unsigned, Signed, parse_bivariate, parse_multivariate

@@ -11,6 +13,15 @@ class AdditiveNoiseModel(Directed, Unsigned):
     name = "Additive noise model"
     identifier = "anm"
     labels = ["unsigned", "causal", "unordered", "linear", "directed"]
+
+    # monkey-patch the anm_score function, see cdt PR #155
+    def corrected_anm_score(self, x, y):
+        gp = GaussianProcessRegressor(random_state=42).fit(x, y)
+        y_predict = gp.predict(x).reshape(-1, 1)
+        indepscore = normalized_hsic(y_predict - y, x)
+        return indepscore
+
+    ANM.anm_score = corrected_anm_score

     @parse_bivariate
     def bivariate(self, data, i=None, j=None):
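The assignment `ANM.anm_score = corrected_anm_score` rebinds the method on the imported cdt class at import time, so every `ANM` instance created afterwards dispatches to the corrected scorer. A minimal standalone sketch of the same class-level patching pattern (the `Base` class here is hypothetical, not part of cdt):

class Base:
    def score(self, x):
        return x          # original behaviour, to be replaced

def corrected_score(self, x):
    return x + 1          # patched behaviour

Base.score = corrected_score   # rebinding at the class level affects all instances

assert Base().score(1) == 2    # instances now use the patched method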

pyspi/statistics/misc.py (6 additions & 1 deletion)

@@ -1,5 +1,6 @@
 import warnings
 import numpy as np
+import inspect

 from statsmodels.tsa import stattools
 from statsmodels.tsa.vector_ar.vecm import coint_johansen
@@ -115,7 +116,11 @@ def bivariate(self, data, i=None, j=None):
         z = data.to_numpy()
         with warnings.catch_warnings():
             warnings.simplefilter("ignore")
-            mdl = self._model().fit(z[i], np.ravel(z[j]))
+            model_params = inspect.signature(self._model).parameters
+            if "random_state" in model_params:
+                mdl = self._model(random_state=42).fit(z[i], np.ravel(z[j]))
+            else:
+                mdl = self._model().fit(z[i], np.ravel(z[j]))
         y_predict = mdl.predict(z[i])
         return mean_squared_error(y_predict, np.ravel(z[j]))
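The `inspect.signature` check lets one code path serve estimators that do and do not accept a seed: the seed is passed only when the constructor advertises it. A standalone sketch of the same pattern (both toy classes are illustrative):

import inspect

class SeededModel:
    def __init__(self, random_state=None):
        self.random_state = random_state

class UnseededModel:
    def __init__(self):
        pass

def build(model_cls):
    # Pass random_state only if the constructor accepts it
    if "random_state" in inspect.signature(model_cls).parameters:
        return model_cls(random_state=42)
    return model_cls()

assert build(SeededModel).random_state == 42
assert isinstance(build(UnseededModel), UnseededModel)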

pyspi/utils.py (44 additions & 0 deletions)

@@ -228,3 +228,47 @@ def filter_spis(keywords, output_name = None, configfile= None):
     - Next Steps: To utilise the filtered set of SPIs, please initialise a new Calculator instance with the following command:
     `Calculator(configfile='{output_file}')`
     """)
+
+def inspect_calc_results(calc):
+    total_num_spis = calc.n_spis
+    num_procs = calc.dataset.n_processes
+    spi_results = dict({'Successful': list(), 'NaNs': list(), 'Partial NaNs': list()})
+    for key in calc.spis.keys():
+        if calc.table[key].isna().all().all():
+            spi_results['NaNs'].append(key)
+        elif calc.table[key].isnull().values.sum() > num_procs:
+            # off-diagonal NaNs
+            spi_results['Partial NaNs'].append(key)
+        else:
+            # returned numeric values (i.e., not NaN)
+            spi_results['Successful'].append(key)
+
+    # print summary
+    double_line_60 = "="*60
+    single_line_60 = "-"*60
+    print("\nSPI Computation Results Summary")
+    print(double_line_60)
+    print(f"\nTotal number of SPIs attempted: {total_num_spis}")
+    print(f"Number of SPIs successfully computed: {len(spi_results['Successful'])} ({len(spi_results['Successful']) / total_num_spis * 100:.2f}%)")
+    print(single_line_60)
+    print("Category       | Count | Percentage")
+    print(single_line_60)
+    for category, spis in spi_results.items():
+        count = len(spis)
+        percentage = (count / total_num_spis) * 100
+        print(f"{category:14} | {count:5} | {percentage:6.2f}%")
+    print(single_line_60)
+
+    if spi_results['NaNs']:
+        print(f"\n[{len(spi_results['NaNs'])}] SPI(s) produced NaN outputs:")
+        print(single_line_60)
+        for i, spi in enumerate(spi_results['NaNs']):
+            print(f"{i+1}. {spi}")
+        print(single_line_60 + "\n")
+    if spi_results['Partial NaNs']:
+        print(f"\n[{len(spi_results['Partial NaNs'])}] SPIs which produced partial NaN outputs:")
+        print(single_line_60)
+        for i, spi in enumerate(spi_results['Partial NaNs']):
+            print(f"{i+1}. {spi}")
+        print(single_line_60 + "\n")
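The classification hinges on the NaN count: the source comment suggests each SPI table carries `n_processes` expected NaNs (the diagonal), so a count above that threshold signals off-diagonal failures. A toy check of the same rule on a 3x3 table (the values are illustrative):

import numpy as np
import pandas as pd

num_procs = 3
table = pd.DataFrame(np.ones((num_procs, num_procs)))
np.fill_diagonal(table.values, np.nan)   # diagonal NaNs are expected
table.iloc[0, 1] = np.nan                # one off-diagonal failure

if table.isna().all().all():
    category = "NaNs"                    # every entry failed
elif table.isnull().values.sum() > num_procs:
    category = "Partial NaNs"            # 4 NaNs > 3 expected diagonal NaNs
else:
    category = "Successful"

print(category)                          # -> Partial NaNs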

requirements.txt (19 additions & 18 deletions)

@@ -1,21 +1,22 @@
 pytest
-scikit-learn==1.0.1
-scipy==1.7.3
-numpy>=1.21.1
-pandas==1.5.0
-statsmodels==0.12.1
-pyyaml==5.4
-tqdm==4.50.2
-nitime==0.9
-hyppo==0.2.1
-pyEDM==1.9.3
-jpype1==1.2.0
-sktime==0.8.0
-dill==0.3.2
-spectral-connectivity==0.2.4.dev0
-torch==1.13.1
+h5py
+scikit-learn
+scipy
+numpy
+pandas
+statsmodels
+pyyaml
+tqdm
+nitime
+hyppo
+pyEDM==1.15.2.0
+jpype1
+sktime
+dill
+spectral-connectivity
+torch
 cdt==0.5.23
-oct2py==5.2.0
-tslearn==0.5.2
+oct2py
+tslearn
 mne==0.23.0
-seaborn==0.11.0
+seaborn
