07 May 13:36

schroedk

0b8fec2

v0.9.2 Latest

0.9.2 - 🏗 Bug fixes, logging improvement

Added

Add progress bars to the computation of LazyChunkSequence and NestedLazyChunkSequence PR #567
Add a device fixture for pytest which, depending on availability and user input (pytest --with-cuda), resolves to the CUDA device PR #574
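A minimal sketch of how such an opt-in device fixture could look in a conftest.py. This is not pyDVL's actual fixture: the helper resolve_device_name and the fallback to "cpu" when CUDA is requested but unavailable are assumptions made for illustration. Only the flag name --with-cuda comes from the note above.

```python
# conftest.py (illustrative sketch, not pyDVL's actual fixture)
import pytest


def pytest_addoption(parser):
    # Register the opt-in command line flag mentioned in the release note.
    parser.addoption(
        "--with-cuda",
        action="store_true",
        default=False,
        help="Run tests on a CUDA device when one is available.",
    )


def resolve_device_name(with_cuda: bool, cuda_available: bool) -> str:
    """Return "cuda" only if it was requested and is available.

    Falling back to "cpu" otherwise is an assumption of this sketch.
    """
    return "cuda" if (with_cuda and cuda_available) else "cpu"


@pytest.fixture
def device(request):
    import torch  # imported lazily so the helper above stays torch-free

    name = resolve_device_name(
        request.config.getoption("--with-cuda"), torch.cuda.is_available()
    )
    return torch.device(name)
```

A test would then accept device as an argument and move its tensors and models to it, and the suite would be run with pytest --with-cuda to exercise the GPU path.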

Fixed

Fixed logging issue in decorator log_duration PR #567
Fixed missing move of tensors to model device in EkfacInfluence implementation PR #570
Fixed missing move of preconditioner to device in CgInfluence implementation PR #572
Raise a more specific error message when a RuntimeError occurs in torch.linalg.eigh, so the user can check whether it is related to a known issue PR #578
Fix an edge case (empty train data) in the test test_classwise_scorer_accuracies_manual_derivation, which resulted in undefined behavior (np.nan to int conversion with different results depending on OS) PR #579
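The more specific error message for torch.linalg.eigh follows a common pattern: catch the low-level exception and re-raise it with a hint while keeping the original as the cause. The sketch below shows that pattern only. The names call_with_hint and failing_decomposition are hypothetical, and the stand-in function replaces the real torch call so the example has no dependencies.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_hint(fn: Callable[[], T], hint: str) -> T:
    """Run fn and re-raise any RuntimeError with an extra hint appended.

    The original exception is kept as __cause__ so no traceback is lost.
    """
    try:
        return fn()
    except RuntimeError as e:
        raise RuntimeError(f"{e}\n{hint}") from e


def failing_decomposition():
    # Hypothetical stand-in for a call such as torch.linalg.eigh(matrix)
    # that fails at runtime.
    raise RuntimeError("linalg.eigh: the algorithm failed to converge")


try:
    call_with_hint(
        failing_decomposition,
        hint="This may be related to a known issue; check the library's issue tracker.",
    )
except RuntimeError as err:
    print(err)
```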

Changed

Changed logging behavior of the iterative methods LissaInfluence and CgInfluence: they now warn when the desired tolerance is not achieved within maxiter. Added the parameter warn_on_max_iteration, which can be used to lower the level of this message to logging.DEBUG PR #567
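The behavior described above can be illustrated with a toy fixed-point solver. This is a sketch of the logging pattern only, not the LissaInfluence or CgInfluence code: the function fixed_point_solve, its signature, and the logger name are assumptions, and only the parameter name warn_on_max_iteration is taken from the note.

```python
import logging

logger = logging.getLogger("iterative_solver_sketch")


def fixed_point_solve(step, x0, rtol=1e-8, maxiter=100, warn_on_max_iteration=True):
    """Iterate x <- step(x) until the update is below rtol or maxiter is hit.

    Returns the last iterate and a flag telling whether the tolerance was met.
    """
    x = x0
    for _ in range(maxiter):
        x_new = step(x)
        converged = abs(x_new - x) <= rtol * max(1.0, abs(x_new))
        x = x_new
        if converged:
            return x, True
    # Tolerance not reached: report at WARNING by default, or at DEBUG
    # if the caller opted out of the warning.
    level = logging.WARNING if warn_on_max_iteration else logging.DEBUG
    logger.log(level, "Reached maxiter=%d without achieving rtol=%g", maxiter, rtol)
    return x, False
```

Callers who expect to hit maxiter regularly, for example in a coarse hyperparameter sweep, could pass warn_on_max_iteration=False to keep their logs quiet without losing the information at debug level.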

22 Apr 09:33

schroedk

v0.9.1

123d01f

0.9.1

Fixed

Fixed FutureWarning for ParallelConfig being raised constantly, even without instantiating the object PR #562
Modify log level for implementations of TorchInfluenceFunctionModel
Add duration logging to output of SequentialCalculator

12 Apr 18:11

mdbenito

v0.9.0

786458c

🆕 New methods, better docs and bugfixes 📚🐞

Added

New method MSR Banzhaf with accompanying notebook, and new stopping
criterion RankCorrelation PR #520
New method: NystroemSketchInfluence PR #504
New preconditioned block variant of conjugate gradient PR #507
Improvements to documentation: fixes, links, text, example gallery, LFS and more PR #532, PR #543
Glossary of data valuation and influence terms in the documentation PR #537
Documentation about writing notes for new features, changes or deprecations PR #557
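To give an idea of the maximum sample reuse (MSR) estimator behind the new Banzhaf method listed above, here is a dependency-free sketch. It is not pyDVL's implementation: the function msr_banzhaf and its signature are assumptions. Each point is included in a sampled subset with probability 1/2, and every utility evaluation is reused for all points: the value of a point is estimated as the mean utility of the subsets containing it minus the mean utility of those not containing it.

```python
import random
from typing import Callable, FrozenSet, List


def msr_banzhaf(
    utility: Callable[[FrozenSet[int]], float],
    n: int,
    n_samples: int,
    seed: int = 0,
) -> List[float]:
    """Estimate Banzhaf values of n players with maximum sample reuse."""
    rng = random.Random(seed)
    sum_in, sum_out = [0.0] * n, [0.0] * n
    cnt_in, cnt_out = [0] * n, [0] * n
    for _ in range(n_samples):
        # Uniform sampling over all subsets: include each player w.p. 1/2.
        subset = frozenset(i for i in range(n) if rng.random() < 0.5)
        u = utility(subset)
        # Reuse the same evaluation for every player.
        for i in range(n):
            if i in subset:
                sum_in[i] += u
                cnt_in[i] += 1
            else:
                sum_out[i] += u
                cnt_out[i] += 1
    values = []
    for i in range(n):
        if cnt_in[i] == 0 or cnt_out[i] == 0:
            values.append(float("nan"))  # not enough samples for this player
        else:
            values.append(sum_in[i] / cnt_in[i] - sum_out[i] / cnt_out[i])
    return values


# Toy additive game: the exact Banzhaf value of player i is weights[i].
weights = [1.0, 2.0, 3.0]
print(msr_banzhaf(lambda s: sum(weights[i] for i in s), n=3, n_samples=20000))
```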

Fixed

Bug in LissaInfluence, when not using CPU device PR #495
Memory issue with CgInfluence and ArnoldiInfluence PR #498
Raising specific error message with install instruction when trying to load pydvl.utils.cache.memcached without pymemcache installed. If pymemcache is available, all symbols from pydvl.utils.cache.memcached are available through pydvl.utils.cache PR #509
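The optional-dependency fix above follows a familiar pattern: attempt the import and, if it fails, raise an error that tells the user what to install. A minimal sketch of that pattern follows. The helper require_optional and the install string passed to it are hypothetical and not part of pyDVL.

```python
import importlib


def require_optional(module_name: str, install_target: str):
    """Import an optional dependency or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as e:
        raise ModuleNotFoundError(
            f"Optional dependency '{module_name}' is not installed. "
            f"Install it with: pip install {install_target}"
        ) from e


# Succeeds for a module that is present (here one from the standard library).
json_module = require_optional("json", install_target="(bundled with Python)")
```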

Changed

Add property model_dtype to instances of type TorchInfluenceFunctionModel
Bump versions of CI actions to avoid warnings PR #502
Add Python Version 3.11 to supported versions PR #510
Documentation improvements and cleanup PR #521, PR #522
Simplified parallel backend configuration PR #549

New Contributors

@jakobkruse1 made their first contribution in #510

Full Changelog: v0.8.1...v0.9.0

Contributors

jakobkruse1

26 Jan 09:46

AnesBenmerzoug

v0.8.1

ac4ac7f

🆕 New method and notebook, games with exact Shapley values, bug fixes and cleanup 🏗

Added

Implement new method: EkfacInfluence #451
New notebook to showcase ekfac for LLMs #483
Implemented exact games in Castro et al. 2009 and 2017 #341

Fixed

Bug in using DaskInfluenceCalculator with TorchNumpyConverter for single dimensional arrays #485
Fix the to() methods of TorchInfluenceFunctionModel implementations #487
Fixed bug with checking for converged values in semivalues #341

Docs

Add applications of data valuation section, display examples more prominently, make all sections visible in table of contents, use mkdocs material cards in the home page #492

New Contributors

@opcode81 made their first contribution in #481
@dependabot made their first contribution in #455

Full Changelog: v0.8.0...v0.8.1

Contributors

opcode81 and dependabot

21 Dec 11:35

schroedk

v0.8.0

70df031

🆕 New interfaces, scaling computation, bug fixes and improvements 🎁

Added

New cache backends: InMemoryCacheBackend and DiskCacheBackend PR #458
New influence function interface InfluenceFunctionModel
Data parallel computation with DaskInfluenceCalculator PR #26
Sequential batch-wise computation and write to disk with SequentialInfluenceCalculator PR #377
Adapt notebooks to new influence abstractions PR #430

Changed

Refactor and simplify caching implementation PR #458
Simplify display of computation progress PR #466
Improve readme and explain better the examples PR #465
Simplify and improve tests, add CodeCov code coverage PR #429
Breaking Changes
- Removed compute_influences and all related code.
  Replaced by new InfluenceFunctionModel interface. Removed modules:
  - influence.general
  - influence.inversion
  - influence.twice_differentiable
  - influence.torch.torch_differentiable

Fixed

Import bug in README PR #457

Full Changelog: v0.7.1...v0.8.0

14 Oct 15:15

mdbenito

v0.7.1

61df499

🆕 New methods, bug fixes and improvements for local tests 🐞🧪

Added

New method: Class-wise Shapley values PR #338
New method: Data-OOB by @BastienZim PR #426, PR #431
Added AntitheticPermutationSampler PR #439
Faster semi-value computation with per-index check of stopping criteria (optional) PR #437
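The idea behind an antithetic permutation sampler, as added in this release, can be sketched in a few lines: each random permutation is followed by its reverse, so that a player who appears early in one sample appears late in the next, which tends to reduce the variance of permutation-based estimates. The generator below is an assumption made for illustration and does not reproduce the interface of AntitheticPermutationSampler.

```python
import random
from typing import Iterator, Sequence, Tuple


def antithetic_permutations(
    indices: Sequence[int], n_pairs: int, seed: int = 0
) -> Iterator[Tuple[int, ...]]:
    """Yield n_pairs random permutations, each followed by its reverse."""
    rng = random.Random(seed)
    for _ in range(n_pairs):
        perm = list(indices)
        rng.shuffle(perm)
        yield tuple(perm)
        yield tuple(reversed(perm))


for p in antithetic_permutations(range(4), n_pairs=2):
    print(p)
```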

Changed

No longer using docker within tests to start a memcached server PR #444
Using pytest-xdist for faster local tests PR #440
Improvements and fixes to notebooks PR #436
Refactoring of parallel module. Old imports will stop working in v0.9.0 PR #421

Fixed

Fix initialization of data_names in ValuationResult.zeros() PR #443

Contributors

BastienZim

02 Sep 16:20

mdbenito

v0.7.0

8739d18

📚🆕 Documentation and IF overhaul, new methods and bug fixes 💥🐞

This is our first β release! We have worked hard to deliver improvements across
the board, with a focus on documentation and usability. We have also reworked
the internals of the influence module, improved parallelism and handling of
randomness.

Added

Implemented solving the Hessian equation via spectral low-rank approximation PR #365
Enabled parallel computation for Leave-One-Out values PR #406
Added more abbreviations to documentation PR #415
Added seed to functions from pydvl.utils.numeric, pydvl.value.shapley and pydvl.value.semivalues. Introduced new type Seed and conversion function ensure_seed_sequence. PR #396

Changed

Replaced sphinx with mkdocs for documentation. Major overhaul of documentation PR #352
Made ray an optional dependency, relying on joblib as default parallel backend PR #408
Decoupled ray.init from ParallelConfig PR #373
Breaking Changes
- Signature change: return information about Hessian inversion from compute_influence_factors PR #375
- Major changes to IF interface and functionality. Foundation for a framework abstraction for IF computation. PR #278, PR #394
- Renamed semivalues to compute_generic_semivalues PR #413
- New joblib backend as default instead of ray. Simplify MapReduceJob. PR #355
- Bump torch dependency for influence package to 2.0. PR #365

Fixed

Fixes to parallel computation of generic semi-values: properly handle all samplers and stopping criteria, irrespective of parallel backend. PR #372
Optimize memory usage in IF calculation PR #375
Fix adding valuation results with overlapping indices and different lengths PR #370
Fixed bugs in conjugate gradient and linear_solve PR #358
Fix installation of dev requirements for Python 3.10 PR #382
Improvements to IF documentation PR #371

New Contributors

@schroedk made their first contribution in #378

Full Changelog: v0.6.1...v0.7.0

Contributors

schroedk

13 Apr 12:18

AnesBenmerzoug

v0.6.1

0e929ae

🏗 Bug fixes and minor improvements

Fix parsing keyword arguments of compute_semivalues dispatch function by @kosmitive in #333
Create new RayExecutor class based on the concurrent.futures API, use the new class to fix an issue with Truncated Monte Carlo Shapley (TMCS) starting too many processes and dying, plus other small changes by @AnesBenmerzoug in #329
Fix creation of GroupedDataset objects using the from_arrays and from_sklearn class methods by @AnesBenmerzoug in #334
Fix release job not triggering on CI when a new tag is pushed by @AnesBenmerzoug in #331
Added alias ApproShapley from Castro et al. 2009 for permutation Shapley by @mdbenito in #332
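For readers unfamiliar with permutation Shapley, for which ApproShapley is the alias mentioned above, the following is a minimal sketch of the permutation sampling estimator on a toy game. It is written from the general description of the method and is not pyDVL's code: the function permutation_shapley and its signature are assumptions.

```python
import random
from typing import Callable, FrozenSet, List


def permutation_shapley(
    utility: Callable[[FrozenSet[int]], float],
    n: int,
    n_permutations: int,
    seed: int = 0,
) -> List[float]:
    """Estimate Shapley values by averaging marginal contributions
    along uniformly sampled permutations of the players."""
    rng = random.Random(seed)
    totals = [0.0] * n
    players = list(range(n))
    for _ in range(n_permutations):
        rng.shuffle(players)
        coalition = set()
        prev = utility(frozenset())
        for p in players:
            # Marginal contribution of p to the players preceding it.
            coalition.add(p)
            cur = utility(frozenset(coalition))
            totals[p] += cur - prev
            prev = cur
    return [t / n_permutations for t in totals]


# Toy additive game: the Shapley value of player i is exactly weights[i].
weights = [1.0, 2.0, 3.0]
print(permutation_shapley(lambda s: sum(weights[i] for i in s), 3, 100))
```

Because the marginal contributions along one permutation telescope, the estimates always sum to the utility of the full set minus that of the empty set, whatever the number of permutations.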

Full Changelog: v0.6.0...v0.6.1

Contributors

mdbenito, kosmitive, and AnesBenmerzoug

16 Mar 11:06

mdbenito

v0.6.0

f8e07cc

🆕 New algorithms, cleanup and bug fixes 🏗

Fix/stopping checks by @mdbenito in #283
Fix Monte Carlo Least Core error when n_iterations < len(dataset) by @AnesBenmerzoug in #281
Hide parallel backend in tmcs main function by @mdbenito in #293
Cosmetic changes to Dataset by @mdbenito in #290
Refactor/nicer imports by @mdbenito in #284
Fix StandardError stopping criterion by @mdbenito in #300
Remove unpackable decorator, use asdict() by @mdbenito in #233
Add burn-in param to AbsoluteStandardError by @mdbenito in #305
Remove default non-negativity constraint on least core subsidy by @AnesBenmerzoug in #304
Close #280: Add py.typed by @mdbenito in #307
Minor docstring and cosmetic changes by @mdbenito in #317
Allow passing additional kwargs to Dataset class' classmethods by @AnesBenmerzoug in #316
Semi-values and samplers by @mdbenito in #319
Remove bogus iter method. by @kosmitive in #326
Improvements to ValuationResult by @mdbenito in #327

Full Changelog: v0.5.0...v0.6.0

Contributors

mdbenito, kosmitive, and AnesBenmerzoug

21 Feb 07:57

mdbenito

v0.5.0

e1d28ef

🛠️ Fixes, nicer interfaces and... more breaking changes 💥😒

Slow and steady does it

What’s changed

Fixed parallel and antithetic Owen sampling for Shapley values. Simplified and extended tests. #267
Added Scorer class for a cleaner interface. Fixed minor bugs around Group-Testing Shapley, added more tests and switched to cvxpy for the solver. #264
Generalised stopping criteria for valuation algorithms. Improved classes ValuationResult and Status with more operations. Some minor issues fixed. #250
Fixed a bug whereby compute_shapley_values would only spawn one process when using n_jobs=-1 and Monte Carlo methods. #270
Bugfix in RayParallelBackend: wrong semantics for kwargs. #268
Splitting of problem preparation and solution in Least-Core computation. Umbrella function for LC methods. #257
Operations on ValuationResult and Status and some cleanup #248
Bug fix and minor improvements: Fixes bug in TMCS with remote Ray cluster, raises an error for dummy sequential parallel backend with TMCS, clones model inside Utility before fitting by default, with flag clone_before_fit to disable it, catches all warnings in Utility when show_warnings is False. Adds Miner and Gloves toy games utilities #247

Full Changelog: v0.4.0...v0.5.0

Releases: aai-institute/pyDVL
